Re: Stored vs non-stored very large text fields

2014-05-05 Thread Jochen Barth
I found out that "storing" documents as separate docs+id does not help either. You must have a completely separate collection/core to get things to work fast. Kind regards, Jochen Quoting Jochen Barth: Ok, https://wiki.apache.org/solr/SolrPerformanceFactors states that: "Retrieving the
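The "completely separate collection/core" approach can be pictured as a two-step lookup: search a core that holds only small documents, then fetch the large OCR text by id from a second core. The sketch below is only an illustration under stated assumptions: the two cores are replaced by in-memory Python structures, and the names `search_core`, `ocr_core`, `search` and `fetch_ocr` are hypothetical, not part of Solr.

```python
# Stand-in for the search core: small documents, no stored OCR text.
# (Hypothetical data; a real setup would query Solr here.)
search_core = [
    {"id": "vol1", "title": "Volume 1", "terms": {"codex", "heidelberg"}},
    {"id": "vol2", "title": "Volume 2", "terms": {"codex", "palatina"}},
    {"id": "vol3", "title": "Volume 3", "terms": {"atlas"}},
]

# Stand-in for the separate core that stores only id -> full OCR text.
ocr_core = {
    "vol1": "full OCR text of volume 1 ...",
    "vol2": "full OCR text of volume 2 ...",
    "vol3": "full OCR text of volume 3 ...",
}


def search(term: str) -> list[str]:
    """Step 1: return the ids of matching documents from the small core."""
    return [doc["id"] for doc in search_core if term in doc["terms"]]


def fetch_ocr(ids: list[str]) -> dict[str, str]:
    """Step 2: fetch the large text only for the ids that are actually needed."""
    return {doc_id: ocr_core[doc_id] for doc_id in ids}


hits = search("codex")
texts = fetch_ocr(hits[:1])  # e.g. only the hit the user opens
```

The point of the split is that sorting and listing hits never touches the large stored text; it is read only for the few ids requested in the second step.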

Re: Stored vs non-stored very large text fields

2014-04-29 Thread Jochen Barth
Ok, https://wiki.apache.org/solr/SolrPerformanceFactors states that: "Retrieving the stored fields of a query result can be a significant expense. This cost is affected largely by the number of bytes stored per document--the higher byte count, the sparser the documents will be distributed o
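The effect the wiki describes (more stored bytes per document means fewer documents' stored fields fit in a given amount of memory) can be made concrete with a little arithmetic. This is a rough sketch only: the cache size and per-document sizes are made-up assumptions, not measurements from this thread, and compression is ignored.

```python
def docs_per_cache(cache_bytes: int, stored_bytes_per_doc: int) -> int:
    """How many documents' stored fields fit into the cache at once."""
    return cache_bytes // stored_bytes_per_doc


# All numbers below are hypothetical, chosen only for illustration.
CACHE_BYTES = 4 * 1024**3             # 4 GiB available as OS page cache
METADATA_ONLY = 2 * 1024              # 2 KiB of stored metadata per document
WITH_OCR = 2 * 1024 + 5 * 1024**2     # same metadata plus 5 MiB of stored OCR text

small = docs_per_cache(CACHE_BYTES, METADATA_ONLY)
large = docs_per_cache(CACHE_BYTES, WITH_OCR)

print(f"metadata only: {small} documents fit in cache")
print(f"with OCR text: {large} documents fit in cache")
```

Under these assumptions the stored data for millions of small documents stays cached, while only a few hundred documents with stored OCR text fit, so most retrievals would go to disk.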

Re: Stored vs non-stored very large text fields

2014-04-29 Thread Jochen Barth
Something is really strange here: even when configuring fields id + sort_... to docValues="true" -- so there's nothing to get from "stored documents file" -- performance is still terrible with ocr stored=true _even_ with my patch which stores uncompressed like solr4.0.0 (checked with string

Re: Stored vs non-stored very large text fields

2014-04-29 Thread Jochen Barth
Dear Shawn, see attachment for my first "brute force" no-compression attempt. Kind regards, Jochen Quoting Shawn Heisey: On 4/29/2014 4:20 AM, Jochen Barth wrote: BTW: stored field compression: are all "stored fields" within a document put into one compressed chunk, or by per-field b

Re: Stored vs non-stored very large text fields

2014-04-29 Thread Shawn Heisey
On 4/29/2014 4:20 AM, Jochen Barth wrote: > BTW: stored field compression: > are all "stored fields" within a document put into one compressed chunk, > or on a per-field basis? Here's the issue that added the compression to Lucene: https://issues.apache.org/jira/browse/LUCENE-4226 It was made

Re: Stored vs non-stored very large text fields

2014-04-29 Thread Jochen Barth
BTW: stored field compression: are all "stored fields" within a document put into one compressed chunk, or on a per-field basis? Kind regards, J. Barth > > Regards, >Alex. > Personal website: http://www.outerthoughts.com/ > Current project: http://www.solr-start.com/ - Accelerating your

Re: Stored vs non-stored very large text fields

2014-04-29 Thread Jochen Barth
On 29.04.2014 11:19, Alexandre Rafalovitch wrote: > Couple of random thoughts: > 1) The latest (4.8) Solr has support for nested documents, as well as > for expand components. Maybe that will let you have a more efficient > architecture: http://heliosearch.org/expand-block-join/ Yes, I've seen thi

Re: Stored vs non-stored very large text fields

2014-04-29 Thread Alexandre Rafalovitch
Couple of random thoughts: 1) The latest (4.8) Solr has support for nested documents, as well as for expand components. Maybe that will let you have a more efficient architecture: http://heliosearch.org/expand-block-join/ 2) Do you return OCR text to the client? Or just search it? If just search it,

Stored vs non-stored very large text fields

2014-04-29 Thread Jochen Barth
Dear reader, I'm trying to use Solr for a hierarchical search: metadata from the higher-level elements is copied to the lower ones, and each element has the complete OCR text that belongs to it. At volume level, of course, we will have the complete OCR text in one and we need to store it for