Re: searching on nested docs - getting back the nested docs as a response

2014-06-20 Thread liorg
I am not sure highlighting will work, as I suspect it will run into the same 
obstacle; see:
https://github.com/elasticsearch/elasticsearch/issues/5245

As for suggestion #2, it would break our current schema and require a 
significant model change (we store the data in MongoDB as well), so I 
wonder whether we are better off waiting until #3022 is solved. In the 
meantime, any workaround would be appreciated...
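
For reference, here is a rough sketch of what suggestion #2 (one document per page, with a reference back to the book) could involve on the indexing side. It is plain Python with no ES client calls; the function name, the "bookId" field and the sample book are assumptions made for illustration only, not part of our actual model.

```python
from typing import Any, Dict, Iterator


def book_to_page_docs(book_id: str, book: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield one flat document per page, each pointing back to its book."""
    for section in book.get("sections", []):
        for chapter in section.get("chapters", []):
            for page in chapter.get("pages", []):
                doc = dict(page)  # copy pageNum, text, numofWords, ...
                doc["bookId"] = book_id  # hypothetical reference field
                doc["sectionName"] = section.get("sectionName")
                doc["chapterName"] = chapter.get("chapterName")
                yield doc


# Small sample shaped like the book type discussed in this thread.
book = {
    "author": {"name": "Jose", "lastName": "Martin"},
    "sections": [{
        "sectionName": "prologue",
        "chapters": [{
            "chapterName": "the start",
            "pages": [
                {"pageNum": 1, "text": "let my people...", "numofWords": 125},
                {"pageNum": 2, "text": "let my people go...", "numofWords": 150},
            ],
        }],
    }],
}

page_docs = list(book_to_page_docs("book-1", book))
for d in page_docs:
    print(d["bookId"], d["chapterName"], d["pageNum"])
```

Each yielded dict could then be indexed as its own document, so a search hit would be a page rather than a whole book. The cost is the one described above: the index no longer mirrors the MongoDB model.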

Can we search again in memory on the returned books (using native Lucene somehow?...)
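
A minimal sketch of the in-memory idea, assuming the book comes back as a plain dict (for example the _source of a hit). The helper name and the sample data are made up for illustration. Note that it uses a simple case-insensitive substring test, which is only an approximation of an analyzed phrase match in ES and may disagree with it on some inputs.

```python
from typing import Any, Dict, List


def matching_pages(book: Dict[str, Any], phrase: str, min_words: int) -> List[Dict[str, Any]]:
    """Return the pages whose text contains `phrase` and that have more than `min_words` words."""
    needle = phrase.lower()
    hits = []
    for section in book.get("sections", []):
        for chapter in section.get("chapters", []):
            for page in chapter.get("pages", []):
                text = page.get("text", "").lower()
                if needle in text and page.get("numofWords", 0) > min_words:
                    hits.append(page)
    return hits


# Sample book shaped like the one discussed in this thread.
book = {
    "sections": [{
        "sectionName": "prologue",
        "chapters": [
            {"chapterName": "the start", "pages": [
                {"pageNum": 1, "text": "let my people...", "numofWords": 125},
                {"pageNum": 2, "text": "let my people go...", "numofWords": 150},
            ]},
            {"chapterName": "the end", "pages": [
                {"pageNum": 3, "text": "will do...", "numofWords": 125},
            ]},
        ],
    }],
}

pages = matching_pages(book, "let my people", 100)
print([p["pageNum"] for p in pages])
```

This re-applies the criteria on the client after ES has narrowed the set down to the matching books, so the two sets of criteria have to be kept in sync by hand.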

On Friday, June 20, 2014 1:13:42 AM UTC+3, Itamar Syn-Hershko wrote:
>
> It is very hard to give you concrete advice without knowing more about 
> your domain and use cases, but here are 2 points that came to mind:
>
> 1. You can make use of the highlighting features to show the content that 
> matched. Highlighters can return whole blocks of text, and by using 
> positionIncrements correctly you can get this right.
>
> 2. Yes, Elasticsearch is a document-oriented store, but is it really 
> necessary for you to index entire books as one document? I'd most certainly 
> look at indexing sections, chapters, or maybe even pages as single documents, 
> with string references to the book ID. The exception is when you use data 
> from the book level along with full-text searches on the texts, and even 
> then in some scenarios I would consider denormalization.
>
> --
>
> Itamar Syn-Hershko
> http://code972.com | @synhershko <https://twitter.com/synhershko>
> Freelance Developer & Consultant
> Author of RavenDB in Action <http://manning.com/synhershko/>

Re: searching on nested docs - getting back the nested docs as a response

2014-06-19 Thread liorg
Well, assume we have a book type. The book holds a lot of metadata, let's 
say something like the following:
{
  "author": {
    "name": "Jose",
    "lastName": "Martin"
  },
  "sections": [{
    "chapters": [{
      "pages": [{
        "pageNum": 1,
        "numOfChars": 1000,
        "text": "let my people...",
        "numofWords": 125
      },
      {
        "pageNum": 2,
        "numOfChars": 1005,
        "text": "let my people go...",
        "numofWords": 150
      }],
      "chapterName": "the start"
    },
    {
      "pages": [{
        "pageNum": 3,
        "numOfChars": 1000,
        "text": "will do...",
        "numofWords": 125
      },
      {
        "pageNum": 4,
        "numOfChars": 1005,
        "text": "will do later on...",
        "numofWords": 150
      }],
      "chapterName": "the end"
    }],
    "sectionName": "prologue"
  }]
}

We want to search for all the pages that have "let my people" in their text 
and more than 100 words.
With ES we can use nested objects and query on the nested page 
object, but what is actually returned are the books (parents) that contain 
the matching pages.
So if we want to show the user the pages he was looking for, we cannot 
do that directly: we get the whole book back with all its metadata, 
not just the nested objects that matched the criteria. We would need to 
search again (maybe in memory?) for the pages that matched in 
order to display the search results, because ES does not yet support 
returning only the nested objects that matched the 
criteria.
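
To make the query concrete, here is a sketch of the request body, built as a plain Python dict. It assumes the pages are mapped as nested objects reachable under the path "sections.chapters.pages"; the mapping is not shown here, and depending on the mapping and the ES version the query may need adjusting. The function name is just for illustration.

```python
import json
from typing import Any, Dict

# Assumed nested path; it depends on how the book type is actually mapped.
PAGES_PATH = "sections.chapters.pages"


def build_page_query(phrase: str, min_words: int) -> Dict[str, Any]:
    """Build a nested query: pages containing `phrase` with more than `min_words` words."""
    return {
        "query": {
            "nested": {
                "path": PAGES_PATH,
                "query": {
                    "bool": {
                        "must": [
                            {"match_phrase": {PAGES_PATH + ".text": phrase}},
                            {"range": {PAGES_PATH + ".numofWords": {"gt": min_words}}},
                        ]
                    }
                },
            }
        }
    }


body = build_page_query("let my people", 100)
print(json.dumps(body, indent=2))
```

Both conditions sit inside the same nested query, so they have to hold on the same page. The hits for this body are still whole books, which is exactly the problem described above.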

I hope it is clearer now.

On Thursday, June 19, 2014 7:22:13 PM UTC+3, Itamar Syn-Hershko wrote:
>
> This is usually solved using parent-child, but the real question here 
> is what you mean by needing to retrieve both books 
> & pages.
>
> Can you describe the actual scenario and what you are trying to achieve?

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/6c3034e7-34d9-4b4d-802a-5110330b31a4%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


searching on nested docs - getting back the nested docs as a response

2014-06-19 Thread liorg
Hi,

We have a somewhat complex type holding nested docs with arrays (let's 
assume a hierarchy of books, where each book has an array of pages 
containing its metadata).

We want to search on the nested docs, e.g. find all the books that have 
the term "XYZ" in one of their pages, but we want to get back not only the 
book but also the matching pages themselves.

We understand that this is problematic to achieve with ES 
(see https://github.com/elasticsearch/elasticsearch/issues/3022).

Switching to a parent-child model is a problem for us, as the data model 
comes from our existing MongoDB model (and besides, we are not sure a 
parent-child model fits here).

So...

1. Is there any workaround to get back the matching nested 
docs (the actual pages)?
2. If not, is there a recommended way to search the data again in 
memory after the ES server has narrowed it down?
3. Any advice will be appreciated, as this is quite a big obstacle to 
implementing a solution using ES.

thanks,

Lior
