Re: Stored vs non-stored very large text fields

Jochen Barth Mon, 05 May 2014 11:09:28 -0700

I found out that "storing" the documents as separate docs + id does not help either.

You must have a completely separate collection/core to get things to work fast.


Kind regards,
Jochen
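
A minimal sketch of how a client might talk to such a two-core layout, assuming one core for searching and sorting and a second core that holds only the stored OCR text. The base URL, the core names `search` and `fulltext`, the field names `sort_title` and `ocr`, and both helper functions are hypothetical and not taken from the thread; only the standard Solr request parameters `q`, `fl`, `sort`, `rows` and `wt` are assumed.

```python
from urllib.parse import urlencode

# Assumed base URL of a local Solr instance (hypothetical).
BASE = "http://localhost:8983/solr"


def search_url(query, sort="sort_title asc", rows=10):
    """Build a select URL for the (hypothetical) 'search' core.

    Only the id is requested, so the response does not need the
    large OCR field at all.
    """
    params = {"q": query, "fl": "id", "sort": sort, "rows": rows, "wt": "json"}
    return f"{BASE}/search/select?{urlencode(params)}"


def fulltext_url(ids):
    """Build a select URL for the (hypothetical) 'fulltext' core.

    Fetches the stored OCR text for the ids returned by the first
    query. The ids are assumed to contain no double quotes.
    """
    q = "id:(" + " OR ".join(f'"{i}"' for i in ids) + ")"
    params = {"q": q, "fl": "id,ocr", "rows": len(ids), "wt": "json"}
    return f"{BASE}/fulltext/select?{urlencode(params)}"


if __name__ == "__main__":
    print(search_url("ocr:heidelberg"))
    print(fulltext_url(["a1", "b2"]))
```

The two-step lookup keeps the large stored field out of the core that does the matching and sorting, which is the separation described above; whether it helps in a given setup would still have to be measured.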


Quoting Jochen Barth <ba...@ub.uni-heidelberg.de>:

OK, https://wiki.apache.org/solr/SolrPerformanceFactors
states that: "Retrieving the stored fields of a query result can be a significant expense. This cost is affected largely by the number of bytes stored per document--the higher byte count, the sparser the documents will be distributed on disk and more I/O is necessary to retrieve the fields (usually this is a concern when storing large fields, like the entire contents of a document)."
But in my case (with docValues=true) there should be no reason to access *.fdt.
Kind regards,
Jochen

Quoting Jochen Barth <ba...@ub.uni-heidelberg.de>:
Something is really strange here:
even when configuring the fields id + sort_... with docValues="true" (so there is nothing to get from the "stored documents file"), performance is still terrible with ocr stored=true, _even_ with my patch, which stores uncompressed like Solr 4.0.0 (checked with strings -a on *.fdt).
Just reading http://lucene.472066.n3.nabble.com/Can-Solr-handle-large-text-files-td3439504.html ... perhaps things will clear up soon (I will check whether splitting into indexed+non-stored and non-indexed+stored could help here).
Kind regards,
J. Barth


Quoting Shawn Heisey <s...@elyograg.org>:
On 4/29/2014 4:20 AM, Jochen Barth wrote:
BTW: stored field compression:
are all "stored fields" within a document put into one compressed chunk,
or is this done on a per-field basis?

Here's the issue that added the compression to Lucene:

https://issues.apache.org/jira/browse/LUCENE-4226

It was made the default stored field format for Lucene, which also made
it the default for Solr.  At this time, there is no way to remove
compression on Solr without writing custom code.  I filed an issue to
make it configurable, but I don't know how to do it.  Nobody else has
offered a solution either.  One day I might find some time to take a
look at the issue and see if I can solve it myself.

https://issues.apache.org/jira/browse/SOLR-4375

Here's the author's blog post that goes into more detail than the LUCENE
issue:

http://blog.jpountz.net/post/33247161884/efficient-compressed-stored-fields-with-lucene

Thanks,
Shawn
