Complex Search Engine Design Case Study

Who needs another search engine?

A start-up approached us to design a recipe search engine that would return accurate results for complex nutritional requirements and could be customized for various food restrictions. Traditional text-based search engines produce many incorrect results when faced with ambiguous terms. Likewise, tag-based search algorithms are inherently limited when searching for unusual compound requirements.

False results have become accepted in most non-scientific applications, and the average user takes the need to weed out irrelevant results for granted. With the rise in allergy awareness, the need for a better system became apparent.

Hybrid approach

We decided on a hybrid database design in which ingredients are the focal point and are linked to an alias table that allows for pseudo text-based search. In addition, a hierarchical database of categories allows for both algorithmic and human tagging. These ‘tags’ can be nested, which allows for the inclusion and exclusion of nested sets, such as excluding ‘dairy’ but including ‘blue cheese’.
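
As a rough illustration of how an alias table and nested tags could interact, the sketch below uses plain Python dictionaries in place of database tables. All names and sample data (`ALIASES`, `PARENT`, `RECIPES`, `search`) are hypothetical, and the rule that the most specific listed tag decides is an assumption about how inclusion and exclusion might be resolved, not the production logic.

```python
# Hypothetical sample data; none of it comes from the original system.

# Alias table: free-text spelling -> canonical ingredient name.
ALIASES = {
    "gorgonzola": "blue cheese",
    "bleu cheese": "blue cheese",
}

# Nested tags: child -> parent (None marks a top-level category).
# Ingredients are treated as the leaves of the same tree.
PARENT = {
    "dairy": None,
    "cheese": "dairy",
    "blue cheese": "cheese",
    "cheddar": "cheese",
    "milk": "dairy",
}

RECIPES = {
    "pear and gorgonzola salad": ["pear", "Gorgonzola", "walnut"],
    "cheddar toast": ["bread", "cheddar"],
    "tomato soup": ["tomato", "onion"],
}


def canonical(name):
    """Normalize a free-text ingredient name via the alias table."""
    key = name.strip().lower()
    return ALIASES.get(key, key)


def ancestors(tag):
    """Return the tag followed by its parents, most specific first."""
    chain = []
    while tag is not None:
        chain.append(tag)
        tag = PARENT.get(tag)  # unknown tags have no parent
    return chain


def ingredient_allowed(name, exclude, include):
    """Assumed rule: the most specific tag that is listed decides."""
    for tag in ancestors(canonical(name)):
        if tag in include:
            return True
        if tag in exclude:
            return False
    return True  # no listed tag applies to this ingredient


def search(recipes, exclude=frozenset(), include=frozenset()):
    """Return titles of recipes whose ingredients are all allowed."""
    return [
        title
        for title, ingredients in recipes.items()
        if all(ingredient_allowed(i, exclude, include) for i in ingredients)
    ]
```

With this sample data, `search(RECIPES, exclude={"dairy"}, include={"blue cheese"})` keeps the gorgonzola salad (matched through its alias) and the tomato soup while dropping the cheddar toast; without the `include` override, only the tomato soup remains.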

Limiting the need for human interaction

One of the first proof-of-concept (POC) efforts made it clear that the smaller tasks, such as assigning ingredients unknown to the system to the right categories or identifying them as aliases of known ingredients, were manageable with the human capital available, but that the assignment of allergy- and intolerance-specific tags had to be automated. For that, a group of domain experts created specific rule sets that allow both for tagging recipes at submission and for creating complex search patterns on the fly for combined allergies.
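
The two uses of a rule set mentioned above, tagging at submission and building a combined search pattern, might look roughly like the following sketch. The `Rule` class, the tag names, and the tiny ingredient table are all hypothetical and purely illustrative; the actual rule sets written by the domain experts are not described in this case study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    tag: str             # tag attached to a recipe when the rule fires
    triggers: frozenset  # ingredient categories that fire the rule


# Hypothetical rule set, keyed by the condition a searcher selects.
RULES = {
    "milk allergy": Rule("contains-milk", frozenset({"dairy"})),
    "tree nut allergy": Rule("contains-tree-nut", frozenset({"tree nut"})),
    "celiac disease": Rule("contains-gluten", frozenset({"wheat", "barley", "rye"})),
}

# Hypothetical ingredient -> categories lookup (a stand-in for the database).
INGREDIENT_CATEGORIES = {
    "butter": {"dairy"},
    "walnut": {"tree nut"},
    "wheat flour": {"wheat"},
    "rice": set(),
    "tomato": set(),
}


def tag_recipe(ingredients):
    """Run every rule once at submission and return the resulting tags."""
    categories = set()
    for ingredient in ingredients:
        # Unknown ingredients contribute nothing here; in the workflow
        # described above they would go to a human for categorization.
        categories |= INGREDIENT_CATEGORIES.get(ingredient, set())
    return {rule.tag for rule in RULES.values() if rule.triggers & categories}


def combined_pattern(conditions):
    """Build a search pattern for several conditions: all tags to avoid."""
    return {RULES[condition].tag for condition in conditions}


def matches(recipe_tags, pattern):
    """A recipe matches when it carries none of the tags to avoid."""
    return recipe_tags.isdisjoint(pattern)
```

Because the tags are computed once at submission, a combined search such as `combined_pattern(["milk allergy", "tree nut allergy"])` reduces to a set comparison against stored tags rather than a fresh scan of every ingredient list.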

Making the application profitable

Once the application was launched, it became important to create a reporting mechanism that would allow for data-driven market segmentation as an integral part of the application's marketability. With the data structured in a very granular fashion, it was relatively easy to create highly detailed data sets that could be pushed out to a visualization platform, where we were able to clearly identify the product's marketing potential.

