Hi,

We are using ElasticSearch for navigating through our product catalog. We 
have fairly simple documents like:

        {
            "_index": "catalog",
            "_type": "product",
            "_id": "476",
            "_score": 1,
            "_source": {
               "id": 476,
               "description": "Product description",
               "a8": "100 mm",
               "a12": "250 g",
               "categories": [
                  8,
                  4213
               ]
            }
         }

where every product has the following attributes:

   - id, unique identifier;
   - description, a short description;
   - a*, custom defined attributes;
   - categories, the categories the product is linked to.

We've added queries (including autocomplete), filters and facets so far, and 
it all works really well.

Lately we've added a new feature where users can add RDF-triple-like 
relations between products using custom predicates, e.g.:

   1. <product x> is an alternative for <product y>;
   2. <product x> is a dispenser for <product y>;
   3. etc.
   
My question is about the second example where products are dispensers for 
other products.

We want the user to be able to find disposables using both the attributes 
of the disposable product and the attributes of the linked dispenser 
products. Example:

For every printer there are different toners available (e.g. different 
capacities, different brands, etc.) and several printers can use the same 
toner. When trying to find a toner, we want the user to be able to select 
both attributes of the toners and attributes of the printers linked to 
those toners. So when the user selects "Brother" in the toner brand facet, 
only "Brother" toners are shown. But when the user selects "Brother" in the 
printer brand facet, all toners that are suited for "Brother" printers are 
shown, regardless of the toner brand.

So how would this translate into a document design in ES? As both the 
dispenser and disposable products are documents within ES, we could store 
only references on each document, grouped by custom predicate, like:

        {
            "_index": "catalog",
            "_type": "product",
            "_id": "476",
            "_score": 1,
            "_source": {
               "id": 476,
               "description": "Product description",
               "a8": "100 mm",
               "a12": "250 g",
               "categories": [
                  8,
                  4213
               ],
               "<predicate_p>": [
                  <product_id_x>,
                  <product_id_y>
               ]
            }
         }

However, we also want facet counts that make sense for both dispenser and 
disposable attributes, meaning the counts for both types of product are 
based on the resulting disposables. With references alone this would 
probably not work: we would first need to filter on the dispensers and then 
on the disposables, which gives counts over different result sets for the 
dispenser attributes and the disposable attributes.
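
To make the problem concrete, here is a rough sketch of that two-step 
approach in plain Python over in-memory dicts, not real ES queries. The 
field names `kind`, `brand` and `dispenser_for` and the sample products are 
made up for illustration; they are not our real attribute keys.

```python
# Hypothetical stand-ins for the ES documents. A printer stores references
# to the toners it is a dispenser for, as in the reference-only design.
PRODUCTS = {
    1: {"id": 1, "kind": "printer", "brand": "Brother", "dispenser_for": [10, 11]},
    2: {"id": 2, "kind": "printer", "brand": "HP", "dispenser_for": [11, 12]},
    10: {"id": 10, "kind": "toner", "brand": "Brother"},
    11: {"id": 11, "kind": "toner", "brand": "Generic"},
    12: {"id": 12, "kind": "toner", "brand": "HP"},
}


def find_toners(printer_brand=None, toner_brand=None):
    """Step 1: resolve matching printers to toner ids. Step 2: filter toners."""
    allowed = None  # None means "no printer filter selected"
    if printer_brand is not None:
        allowed = set()
        for p in PRODUCTS.values():
            if p["kind"] == "printer" and p["brand"] == printer_brand:
                allowed.update(p["dispenser_for"])

    result = []
    for t in PRODUCTS.values():
        if t["kind"] != "toner":
            continue
        if allowed is not None and t["id"] not in allowed:
            continue
        if toner_brand is not None and t["brand"] != toner_brand:
            continue
        result.append(t["id"])
    return sorted(result)


def printer_brand_facet(toner_ids):
    """Per printer brand, count the distinct resulting toners linked to it.

    This needs a second pass over the printers, because the toner
    documents themselves carry no printer attributes.
    """
    wanted = set(toner_ids)
    per_brand = {}
    for p in PRODUCTS.values():
        if p["kind"] == "printer":
            linked = wanted & set(p["dispenser_for"])
            per_brand.setdefault(p["brand"], set()).update(linked)
    return {brand: len(ids) for brand, ids in per_brand.items()}
```

With this sample data, selecting printer brand "Brother" gives toners 10 
and 11 regardless of toner brand, but the printer brand facet has to be 
computed in a separate pass over the dispensers.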

Another option would be storing the whole related document(s) under the 
predicate defined for every document. This means a huge expansion of the 
index and a lot of duplicated data, which would make maintaining the 
documents a lot more complex.
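
A minimal sketch of what that denormalization could look like, again in 
plain Python and with made-up names (`dispenser_for`, `dispensers`, and 
`a1` standing in for a brand attribute). It copies each dispenser into the 
documents it points to and shows how many documents an update to a single 
dispenser would touch.

```python
import copy

# Hypothetical catalog keyed by product id.
CATALOG = {
    1: {"id": 1, "a1": "Brother", "dispenser_for": [10, 11]},
    2: {"id": 2, "a1": "HP", "dispenser_for": [11]},
    10: {"id": 10, "a1": "Brother"},
    11: {"id": 11, "a1": "Generic"},
}


def denormalize(catalog, predicate="dispenser_for", embedded_key="dispensers"):
    """Return new documents where each target embeds copies of its dispensers."""
    docs = {pid: copy.deepcopy(doc) for pid, doc in catalog.items()}
    for dispenser in catalog.values():
        # Embed the dispenser without its own reference list.
        summary = {k: v for k, v in dispenser.items() if k != predicate}
        for target_id in dispenser.get(predicate, []):
            docs[target_id].setdefault(embedded_key, []).append(copy.deepcopy(summary))
    return docs


def docs_touched_by_update(catalog, product_id, predicate="dispenser_for"):
    """The product itself plus every document that embeds a copy of it."""
    return [product_id] + list(catalog[product_id].get(predicate, []))
```

Here toner 11 ends up carrying full copies of both printers, and changing 
an attribute of printer 1 means rewriting documents 1, 10 and 11, which is 
the maintenance cost mentioned above.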

So what would be a best-practice solution for this scenario? Or are we 
looking at the wrong type of storage (a document store) for this kind of 
problem, and should we consider a graph database instead?

Any idea on this would be very welcome. Thank you in advance!

Cheers,

Peter
