Re: SIREn plugin for nested documents

2014-07-24 Thread Ivan Brusic
Thanks for chiming in, Renaud. Hopefully I will have a chance to test out
the plugin soon. My use case for nested documents is fairly simple.

-- 
Ivan


-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CALY%3DcQDUw_WNwDo6VcFzTBP%3Dwk8R2A5Xa3n40_By0QeyafZPBg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: SIREn plugin for nested documents

2014-07-24 Thread renaud
Hi Brian, 

Our apologies for the issues with the web site; we had some problems on our 
web server yesterday.

What you have described is very close to the indexing model in SIREn. SIREn 
provides an optimised Lucene codec for such a data structure, and provides 
query operators on top of it.

Kind Regards
-- 
Renaud Delbru


Re: SIREn plugin for nested documents

2014-07-24 Thread renaud
Hi Ivan,

(P.S.: I am one of the developers of the SIREn plugin.)

It would be possible for SIREn to support such functionality (it is not 
yet implemented), as each element / node in the tree has a unique identifier 
that is retrieved at search time. Therefore, one could use this identifier 
to fetch and filter the relevant element from the original JSON document. 
In both the stock Elasticsearch and the SIREn case, the main problem, from 
what I understand, is that this would require a refactoring of the fetch 
phase in Elasticsearch.
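
As a rough illustration of the "fetch and filter" step only, here is a small Python sketch. It is not SIREn or Elasticsearch code: the identifier format (a tuple of child positions from the root), the sample document, and the name fetch_node are all assumptions made for the example.

```python
# Sketch only. Assumes a node identifier is a tuple of child positions,
# e.g. (1, 1) = second child of the root's second child. This format is
# hypothetical and is not SIREn's actual node encoding.

def fetch_node(doc, node_id):
    """Walk a parsed JSON document along node_id and return the element found."""
    current = doc
    for position in node_id:
        if isinstance(current, dict):
            # Children of an object are its values, in document order.
            current = list(current.values())[position]
        elif isinstance(current, list):
            current = current[position]
        else:
            raise LookupError("node id descends below a leaf value")
    return current


# Made-up source document with two nested elements.
source = {
    "title": "post",
    "comments": [{"author": "alice"}, {"author": "bob"}],
}

# Suppose the search phase reported node (1, 1) as the match.
print(fetch_node(source, (1, 1)))  # {'author': 'bob'}
```

Only the matching nested element is returned here, rather than the whole source document.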

Kind Regards
-- 
Renaud Delbru


Re: SIREn plugin for nested documents

2014-07-23 Thread Brian
Thanks for the link. Unfortunately, Chrome on Mac OS (latest versions of 
each) causes this web page to blank and redisplay continually. Can't read 
it; hope you can.

In a previous life, I created a search engine that handled parent/child 
relationships with blindingly fast performance. One trick was that the 
index didn't just contain the document ID, but it contained the entire 
hierarchy of IDs. For example (for brevity, the IDs are single letters):

Document ID and
relationship      Fully qualified and indexed ID
---------------   ------------------------------
A                 A
   B              A.B
      C           A.B.C
   D              A.D
      E           A.D.E
      F           A.D.F

So, for example, it was nearly instantaneous to determine the following 
just by comparing the fully qualified IDs:

A and F are in the same parent-child hierarchy, with F being a child of D 
and a grandchild of A.

E and F are siblings under the same parent.

And so on.

Not sure how this would mesh with Lucene though. But complex parent-child 
relationships could be intersected just by the fully qualified IDs that 
came out of the inverted index. Documents did not need to be fetched or 
cached to perform this operation, and the result was blindingly fast 
performance.
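
A minimal Python sketch of the idea, not the original engine's code: it assumes dot-separated fully qualified IDs exactly as in the table above, and the function names (parent, ancestors, is_ancestor, are_siblings, parents_of_matches) are hypothetical.

```python
# Sketch only: assumes fully qualified IDs are dot-separated paths such as
# "A.D.F". All function names are hypothetical, chosen for illustration.

def parent(fq_id):
    """Return the parent's fully qualified ID, or None for a root."""
    head, sep, _ = fq_id.rpartition(".")
    return head if sep else None


def ancestors(fq_id):
    """Return all ancestor IDs encoded in fq_id, root first."""
    parts = fq_id.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts))]


def is_ancestor(a, b):
    """True if a is a proper ancestor of b; a plain prefix comparison."""
    return b.startswith(a + ".")


def are_siblings(a, b):
    """True if a and b are distinct nodes under the same parent."""
    return a != b and parent(a) is not None and parent(a) == parent(b)


def parents_of_matches(parent_hits, child_hits):
    """Keep the IDs in parent_hits that have at least one descendant in child_hits."""
    wanted = set(parent_hits)
    return sorted({anc for hit in child_hits for anc in ancestors(hit) if anc in wanted})


print(is_ancestor("A", "A.D.F"))        # True: F is a descendant of A
print(parent("A.D.F"))                  # A.D
print(are_siblings("A.D.E", "A.D.F"))   # True
print(parents_of_matches(["A.B", "A.D"], ["A.D.E", "A.D.F"]))  # ['A.D']
```

Every answer comes from the ID strings alone, which is why no document has to be fetched.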

Just FYI. I can discuss off-line if anyone wishes.

Brian


Re: SIREn plugin for nested documents

2014-07-23 Thread joergpra...@gmail.com
I noticed Siren has an example of 1000 library catalog records from the
British Library prepared in JSON:

https://github.com/sindicetech/siren/blob/master/siren-elasticsearch-demo/src/example/datasets/bnb/

From what it seems, Siren can index a tree (semi-structured data), using
positional nodes, then you can express a tree node DSL query in JSON, and
the result is something like a list of found node ids.
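
To make the idea of positional nodes concrete, here is a rough Python sketch. It is not Siren's encoding or its query DSL: the node ID format (a tuple of child positions), the sample record, and the names label_nodes and find_nodes are all assumptions made for illustration.

```python
# Sketch only. Every node of a parsed JSON document gets an ID made of the
# child positions on the path from the root. This labelling scheme is an
# assumption for illustration, not Siren's actual index format.

def label_nodes(value, node_id=(), name=None):
    """Yield (node_id, field_name, value) for every node in the JSON tree."""
    yield node_id, name, value
    if isinstance(value, dict):
        children = list(value.items())
    elif isinstance(value, list):
        # Array items inherit the field name of the array they belong to.
        children = [(name, item) for item in value]
    else:
        return
    for position, (child_name, child) in enumerate(children):
        yield from label_nodes(child, node_id + (position,), child_name)


def find_nodes(doc, name, value):
    """Return the IDs of all nodes whose field name and value both match."""
    return [node_id for node_id, node_name, node_value in label_nodes(doc)
            if node_name == name and node_value == value]


# Made-up record, not taken from the dataset linked above.
record = {
    "title": "Example title",
    "authors": [{"name": "First Author"}, {"name": "Second Author"}],
}

print(find_nodes(record, "name", "Second Author"))  # [(1, 1, 0)]
```

The result is a list of node IDs rather than whole documents, which is the property that matters for the "inner hits" discussion.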

Regarding the "inner hits" challenge, this seems to get very close, because
a JSON doc is always semi-structured. The question is how to embed Siren
documents into Elasticsearch documents (or vice versa), i.e. can they
co-exist and be queried by a single query, combining the power of both.

While this is interesting for nested hierarchical data models, I am
studying JSON-LD and graph search in ES, for being able to follow links
between docs (or even between ES docs and web resources, local or remote).

Jörg


SIREn plugin for nested documents

2014-07-23 Thread Ivan Brusic
Has anyone else seen this plugin? http://siren.solutions/siren/overview/

There was some discussion between one of the developers and Jorg a while
back, so I guess this is the outcome. Have not tried it yet, but I will
give it a shot this weekend. I am hoping that it can fix a longstanding
issue in Elasticsearch (and my biggest roadblock):
https://github.com/elasticsearch/elasticsearch/issues/3022

Cheers,

Ivan
