Querying nested data is very difficult in any modern db that I have seen.

If it works as you suggest, then it would be cool if the feature were 
eventually maintained inside Solr.

> On Jul 23, 2014, at 7:13 AM, Renaud Delbru <renaud@siren.solutions> wrote:
> 
> One of the coolest features of Lucene/Solr is its ability to index nested 
> documents using a Blockjoin approach.
> 
> While this works well for small documents and document collections, it 
> becomes unsustainable for larger ones: Blockjoin works by splitting the 
> original document into many documents, one per nested record.
> 
> For example, a single USPTO patent (XML format converted to JSON) will end up 
> being over 1500 documents in the index. This has massive implications for 
> performance and scalability.
> 
> Introducing SIREn
> 
> SIREn is an open source plugin for Solr for indexing and searching rich 
> nested JSON data.
> 
> SIREn uses a sophisticated "tree indexing" design which ensures that the 
> index is not artificially inflated. As a result, many types of nested 
> queries can be up to 3x faster. Further, depending on the data, memory 
> requirements for faceting can be up to 10x lower. As such, SIREn 
> allows you to use Solr for larger and more complex datasets, especially so 
> for sophisticated analytics. (You can read our whitepaper to find out more 
> [1])
> 
> SIREn is also truly schemaless - it even allows you to change the type of a 
> property between documents without being restricted by a defined mapping. 
> This can be very useful for data integration scenarios where data is 
> described in different ways in different sources.
> 
> You only need a few minutes to download and try SIREn [2]. It comes with a 
> detailed manual [3] and you have access to the code on GitHub [4].
> 
> We look forward to hearing your feedback.
> 
> [1] http://siren.solutions/siren/resources/whitepapers/comparing-siren-1-2-and-lucenes-blockjoin-performance-a-uspto-patent-search-scenario/
> [2] http://siren.solutions/siren/downloads/
> [3] http://siren.solutions/manual/preface.html
> [4] https://github.com/sindicetech/siren
> -- 
> Renaud Delbru
> CTO
> SIREn Solutions
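
To make the "one document per nested record" point in the quoted message concrete, here is a rough Python sketch of how a block-join style split inflates the document count. It is only a conceptual illustration, not Solr's or SIREn's API: the function name split_into_block, the _path field, and the sample patent record are all made up for this example, and nested lists inside lists are simply kept as plain values.

```python
def split_into_block(doc, path="root"):
    """Flatten one nested record into a list of flat documents.

    Every nested dict becomes its own document. Children are emitted
    before their parent, so the top-level record is last in the block.
    (Hypothetical helper, for illustration only.)
    """
    block = []
    flat = {"_path": path}  # made-up field recording where the record came from
    for key, value in doc.items():
        if isinstance(value, dict):
            # A nested object turns into a separate child document.
            block.extend(split_into_block(value, f"{path}.{key}"))
        elif isinstance(value, list):
            scalars = []
            for i, item in enumerate(value):
                if isinstance(item, dict):
                    # Each object in a list is also a separate child document.
                    block.extend(split_into_block(item, f"{path}.{key}[{i}]"))
                else:
                    scalars.append(item)
            if scalars:
                flat[key] = scalars
        else:
            flat[key] = value
    block.append(flat)
    return block


# Invented sample data, loosely shaped like a patent record.
patent = {
    "title": "Widget",
    "inventors": [{"name": "A"}, {"name": "B"}],
    "claims": [{"text": "c1", "refs": [{"id": 1}]}],
}

block = split_into_block(patent)
print(len(block))
for d in block:
    print(d)
```

Even this tiny record becomes five flat documents (two inventors, one reference, one claim, and the patent itself), so it is easy to see how a real patent with many claims and citations could reach the 1500 mentioned in the quoted message.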
