I found out that "storing" documents as separate docs+id does not help either.
You need a completely separate collection/core to get things to work fast.

Kind regards,
Jochen


Quoting Jochen Barth <ba...@ub.uni-heidelberg.de>:

Ok, https://wiki.apache.org/solr/SolrPerformanceFactors

states that: "Retrieving the stored fields of a query result can be a significant expense. This cost is affected largely by the number of bytes stored per document--the higher byte count, the sparser the documents will be distributed on disk and more I/O is necessary to retrieve the fields (usually this is a concern when storing large fields, like the entire contents of a document)."

But in my case (with docValues=true) there should be no reason to access *.fdt.

Kind regards,
Jochen
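
As a rough illustration of the wiki statement quoted above, here is a minimal Python sketch. It assumes a simplified layout in which stored documents sit back to back in one file and are read in 4096-byte blocks; the function name, the block size, and the document sizes are made up for the example and are not taken from Lucene.

```python
def blocks_touched(doc_sizes, wanted, block_size=4096):
    """Count the distinct disk blocks needed to read the wanted documents.

    Assumes documents are laid out contiguously in doc-id order
    (a simplification, not Lucene's actual file format).
    """
    offsets = []
    pos = 0
    for size in doc_sizes:
        offsets.append(pos)
        pos += size

    blocks = set()
    for doc_id in wanted:
        start = offsets[doc_id]
        end = start + doc_sizes[doc_id] - 1
        blocks.update(range(start // block_size, end // block_size + 1))
    return len(blocks)


# Hypothetical numbers: 1000 documents, 10 of them in the result list.
hits = range(0, 1000, 100)
small_docs = [200] * 1000        # a few short stored fields per document
large_docs = [200_000] * 1000    # e.g. a full OCR text stored per document

print(blocks_touched(small_docs, hits))
print(blocks_touched(large_docs, hits))
```

In this model the same ten hits need far more block reads once each document carries a large stored field, which is the effect the wiki describes.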

Quoting Jochen Barth <ba...@ub.uni-heidelberg.de>:

Something is really strange here:

Even with the fields id + sort_... configured as docValues="true" (so there is nothing to fetch from the "stored documents file"), performance is still terrible with ocr stored=true, _even_ with my patch, which stores fields uncompressed as in Solr 4.0.0 (checked with strings -a on *.fdt).

I am just reading http://lucene.472066.n3.nabble.com/Can-Solr-handle-large-text-files-td3439504.html; perhaps things will clear up soon (I will check whether splitting into indexed+non-stored and non-indexed+stored could help here).


Kind regards,
J. Barth


Quoting Shawn Heisey <s...@elyograg.org>:

On 4/29/2014 4:20 AM, Jochen Barth wrote:
> BTW: stored field compression:
> are all "stored fields" within a document put into one compressed chunk,
> or is this done on a per-field basis?

Here's the issue that added the compression to Lucene:

https://issues.apache.org/jira/browse/LUCENE-4226

It was made the default stored field format for Lucene, which also made
it the default for Solr.  At this time, there is no way to remove
compression on Solr without writing custom code.  I filed an issue to
make it configurable, but I don't know how to do it.  Nobody else has
offered a solution either.  One day I might find some time to take a
look at the issue and see if I can solve it myself.

https://issues.apache.org/jira/browse/SOLR-4375

Here's the author's blog post that goes into more detail than the LUCENE
issue:

http://blog.jpountz.net/post/33247161884/efficient-compressed-stored-fields-with-lucene

Thanks,
Shawn
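
To make the chunk question above concrete, here is a minimal Python sketch of chunk-based stored-field compression. It is a simplified model, not Lucene's implementation: zlib stands in for the actual codec, the 16 KB threshold is an assumption, and all function names are hypothetical. It assumes that all stored fields of a document are serialized together and that whole documents are buffered into a chunk before compression.

```python
import zlib

CHUNK_SIZE = 16 * 1024  # assumed flush threshold, for illustration only


def serialize(doc):
    # Concatenate all stored fields of one document into one byte string.
    return b"".join(
        name.encode() + b"=" + value + b"\n" for name, value in doc.items()
    )


def build_chunks(docs, chunk_size=CHUNK_SIZE):
    """Buffer whole documents and compress the buffer once it is large enough.

    Returns a list of (first_doc_id, doc_count, compressed_bytes).
    """
    chunks = []
    buffer, first, count = b"", 0, 0
    for doc_id, doc in enumerate(docs):
        if count == 0:
            first = doc_id
        buffer += serialize(doc)
        count += 1
        if len(buffer) >= chunk_size:
            chunks.append((first, count, zlib.compress(buffer)))
            buffer, count = b"", 0
    if count:
        chunks.append((first, count, zlib.compress(buffer)))
    return chunks


def bytes_decompressed_for(chunks, doc_id):
    # In this model, reading any field means decompressing the whole chunk.
    for first, count, blob in chunks:
        if first <= doc_id < first + count:
            return len(zlib.decompress(blob))
    raise KeyError(doc_id)


# Hypothetical data: the same ids, once without and once with a large OCR field.
id_only = [{"id": f"doc{i}".encode()} for i in range(100)]
with_ocr = [{"id": f"doc{i}".encode(), "ocr": b"x" * 500_000} for i in range(100)]

print(bytes_decompressed_for(build_chunks(id_only), 42))
print(bytes_decompressed_for(build_chunks(with_ocr), 42))
```

Under these assumptions, fetching even a small field such as id drags the large OCR field of the same document through decompression, because both live in the same chunk. A real implementation may be able to stop decompressing early, so treat the numbers as an upper bound for the model only.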

