I am not sure highlighting will work, as I suspect it will run into the same 
obstacle; see:
https://github.com/elasticsearch/elasticsearch/issues/5245

As for suggestion #2, it would break our current schema and require a 
significant model change (we store the data in MongoDB as well), so I 
wonder whether we are better off waiting until #3022 is resolved. In the 
meantime, any workaround would be appreciated...
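
To make suggestion #2 concrete: one possibility, if the MongoDB model has to 
stay as it is, might be to flatten each book into per-page documents only at 
indexing time. Below is a minimal sketch, assuming the book structure quoted 
further down. The function name book_to_page_docs and the bookId field are 
hypothetical names made up for illustration, and nothing here talks to 
Elasticsearch; it only builds the dictionaries that would be indexed.

```python
from typing import Any, Dict, List


def book_to_page_docs(book_id: str, book: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Flatten one nested book into a list of standalone page documents.

    Each page keeps its own fields and gains a reference to its book
    (bookId, a hypothetical field name) plus a few denormalized values.
    """
    docs = []
    for section in book.get("sections", []):
        for chapter in section.get("chapters", []):
            for page in chapter.get("pages", []):
                doc = dict(page)  # shallow copy of the page's own fields
                doc["bookId"] = book_id
                doc["sectionName"] = section.get("sectionName")
                doc["chapterName"] = chapter.get("chapterName")
                doc["author"] = book.get("author")
                docs.append(doc)
    return docs
```

With documents shaped like this, a search hit would be a page rather than a 
whole book, at the cost of keeping two representations in sync.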

Can we do some in-memory searching again (using native Lucene somehow)?
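
To illustrate what I mean by searching again in memory, here is a rough 
sketch of a second pass over a book that ES has already returned. It assumes 
the book structure quoted below and the example criteria from that message 
(a phrase in the page text and a minimum word count). The function names are 
made up for illustration, and the plain substring test is only a stand-in: it 
does not reproduce Lucene's analysis or scoring.

```python
from typing import Any, Dict, Iterator, List


def iter_pages(book: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Walk sections -> chapters -> pages of one book document."""
    for section in book.get("sections", []):
        for chapter in section.get("chapters", []):
            for page in chapter.get("pages", []):
                yield page


def matching_pages(book: Dict[str, Any], phrase: str, min_words: int) -> List[Dict[str, Any]]:
    """Return the pages whose text contains phrase and has more than min_words words."""
    needle = phrase.lower()
    return [
        page
        for page in iter_pages(book)
        if needle in page.get("text", "").lower()
        and page.get("numofWords", 0) > min_words
    ]


# A cut-down version of the example book from this thread.
sample_book = {
    "sections": [{
        "sectionName": "prologue",
        "chapters": [
            {"chapterName": "the start", "pages": [
                {"pageNum": 1, "text": "let my people...", "numofWords": 125},
                {"pageNum": 2, "text": "let my people go...", "numofWords": 150},
            ]},
            {"chapterName": "the end", "pages": [
                {"pageNum": 3, "text": "will do...", "numofWords": 125},
            ]},
        ],
    }],
}

hits = matching_pages(sample_book, "let my people", 100)
print([page["pageNum"] for page in hits])  # prints [1, 2]
```

This duplicates the query logic on the client side, so it would have to be 
kept in step with whatever the ES query actually does.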

On Friday, June 20, 2014 1:13:42 AM UTC+3, Itamar Syn-Hershko wrote:
>
> It is very hard to give you concrete advice without knowing more about 
> your domain and use cases, but here are 2 points that came to mind:
>
> 1. You can make use of the highlighting features to show the content that 
> matched. Highlighters can return whole blocks of text, and by using 
> positionIncrements correctly you can get this right.
>
> 2. Yes, Elasticsearch is a document-oriented store, but is it really 
> necessary for you to index entire books as one document? I'd most certainly 
> look at indexing sections or chapters, maybe even pages, as single documents 
> that reference the book ID as a string. The exception is when you use data 
> from the book level along with full-text searches on the texts, and even 
> then, in some scenarios, I would consider denormalization.
>
> --
>
> Itamar Syn-Hershko
> http://code972.com | @synhershko <https://twitter.com/synhershko>
> Freelance Developer & Consultant
> Author of RavenDB in Action <http://manning.com/synhershko/>
>
>
> On Thu, Jun 19, 2014 at 10:13 PM, liorg <lior...@gmail.com> wrote:
>
>> Well, assume we have a book type. The book holds a lot of metadata, 
>> let's say something like the following:
>> {
>>   "author": {
>>     "name": "Jose",
>>     "lastName": "Martin"
>>   },
>>   "sections": [{
>>     "chapters": [{
>>       "pages": [{
>>         "pageNum": 1,
>>         "numOfChars": 1000,
>>         "text": "let my people...",
>>         "numofWords": 125
>>       },
>>       {
>>         "pageNum": 2,
>>         "numOfChars": 1005,
>>         "text": "let my people go...",
>>         "numofWords": 150
>>       }],
>>       "chapterName": "the start"
>>     },
>>     {
>>       "pages": [{
>>         "pageNum": 3,
>>         "numOfChars": 1000,
>>         "text": "will do...",
>>         "numofWords": 125
>>       },
>>       {
>>         "pageNum": 4,
>>         "numOfChars": 1005,
>>         "text": "will do later on...",
>>         "numofWords": 150
>>       }],
>>       "chapterName": "the end"
>>     }],
>>     "sectionName": "prologue"
>>   }]
>> }
>>
>> We want to search for all the pages that have "let my people" in their 
>> text and more than 100 words.
>> With ES we can use nested objects and query on the nested page object, 
>> but the hits that come back are the books (parents) that contain the 
>> matching pages.
>> So if we want to show the user the pages he was looking for, we cannot: 
>> we get the whole book returned with all its metadata, not just the 
>> nested objects that matched the criteria. We would need to search again 
>> (maybe in memory?) for the matching pages in order to display the 
>> search results, because ES does not yet support returning only the 
>> nested objects that matched.
>>
>> I hope it is clearer now.
>>
>> On Thursday, June 19, 2014 7:22:13 PM UTC+3, Itamar Syn-Hershko wrote:
>>
>>> This is usually something that's being solved using parent-child, but 
>>> the question here really is what do you mean by needing to retrieve both 
>>> books & pages.
>>>
>>> Can you describe the actual scenario and what you are trying to achieve?
>>>
>>> --
>>>
>>> Itamar Syn-Hershko
>>> http://code972.com | @synhershko <https://twitter.com/synhershko>
>>> Freelance Developer & Consultant
>>> Author of RavenDB in Action <http://manning.com/synhershko/>
>>>
>>>
>>> On Thu, Jun 19, 2014 at 7:12 PM, liorg <lior...@gmail.com> wrote:
>>>
>>>>  Hi,
>>>>
>>>> We have a somewhat complex type holding nested docs with arrays 
>>>> (let's assume a hierarchy of books, where each book has an array of 
>>>> pages containing its metadata).
>>>>
>>>> We want to search on the nested docs (find all the books that have 
>>>> the term "XYZ" in one of their pages), but we want to get back not 
>>>> only the book but also the matching pages themselves.
>>>>
>>>> We understand that this is problematic to achieve with ES (see 
>>>> https://github.com/elasticsearch/elasticsearch/issues/3022).
>>>>
>>>> A parent-child model is difficult for us because the data model 
>>>> comes from our existing MongoDB model (and besides, we are not sure 
>>>> a parent-child model fits here).
>>>>
>>>> So...
>>>>
>>>> 1. Is there a workaround that would let us get the nested docs (the 
>>>> actual pages) back in the results?
>>>> 2. If not, is there a recommended way to search the data again in 
>>>> memory after it has been narrowed down by the ES server?
>>>> 3. Any advice will be appreciated, as this is quite a big obstacle in 
>>>> our way to implementing a solution using ES.
>>>>
>>>> thanks,
>>>>
>>>> Lior
>>>>
>>>> -- 
>>>> You received this message because you are subscribed to the Google 
>>>> Groups "elasticsearch" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send 
>>>> an email to elasticsearc...@googlegroups.com.
>>>>
>>>> To view this discussion on the web visit 
>>>> https://groups.google.com/d/msgid/elasticsearch/7602d608-5730-472e-8259-763ff29614ea%40googlegroups.com.
>>>> For more options, visit https://groups.google.com/d/optout.
>>>>
>>>
>
>
