On 28-Jul-08, at 11:16 PM, Britske wrote:
> That sounds interesting. Let me explain my situation, which may be a
> variant of what you are proposing. My documents contain more than
> 10,000 fields, but these fields are divided like this:
>
> 1. About 20 general-purpose fields, of which more than one can be
> selected in a query.
> 2. About 10,000 fields, of which each query selects exactly one,
> based on some criteria.
>
> Obviously 2. is killing me here, but given the above, perhaps it
> would be possible to make 10,000 vertical slices/indices and, based
> on the field to be selected (from point 2), select the slice/index to
> search in. The 10,000 indices would run on the same box, and the 20
> general-purpose fields would have to be copied to all slices (which
> means some increase in overall index size, but manageable). This
> would give me far more reasonably sized and compact documents, which
> would mean documents are far more likely to be in the same cache slot
> and be accessed in the same disk seek.
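Britske's scheme could be sketched roughly as follows. This is a hypothetical in-memory illustration of the routing idea only, not a Solr feature or API; all names (`SliceRegistry`, `field_00042`, the stand-in general fields) are invented for the example.

```python
# Hypothetical sketch of the vertical-slice idea: each of the ~10,000
# sparse fields gets its own slice, each slice also carrying copies of
# the ~20 general-purpose fields. A query that selects exactly one
# sparse field is routed to the single slice that holds it.

class SliceRegistry:
    """Maps each sparse field name to the slice that stores it."""

    def __init__(self):
        self.slices = {}  # field_name -> {doc_id: {field: value, ...}}

    def add_document(self, doc_id, general, sparse):
        # Copy the general-purpose fields into every slice the doc touches.
        for field, value in sparse.items():
            slice_docs = self.slices.setdefault(field, {})
            doc = dict(general)
            doc[field] = value
            slice_docs[doc_id] = doc

    def search(self, sparse_field, predicate):
        # Route the query to the one slice holding the requested field,
        # so only small, compact documents are read.
        slice_docs = self.slices.get(sparse_field, {})
        return [doc for doc in slice_docs.values() if predicate(doc)]

registry = SliceRegistry()
registry.add_document(
    "doc1",
    general={"id": "doc1", "title": "widget", "price": 9.95},
    sparse={"field_00042": 7, "field_00099": 3},
)
hits = registry.search("field_00042", lambda d: d["field_00042"] > 5)
```

Each document lands in only as many slices as it has sparse fields set, which is what keeps the per-slice documents small.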
Are all 10k values equally likely to be retrieved?
> Does this make sense?
Well, I would probably split into two indices, one containing the 20
fields and one containing the 10k. However, if the 10k fields are
equally likely to be chosen, this will not help in the long term,
since the working set of disk blocks is still going to be all of them.
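Mike's working-set point can be checked with a back-of-envelope calculation: if each query picks one of the 10,000 fields uniformly at random, the probability a given slice stays untouched after Q queries is (1 - 1/F)^Q, so after a modest number of queries essentially every slice (and its disk blocks) has been hit. The numbers below are illustrative only.

```python
# If queries choose one of F = 10,000 sparse fields uniformly at random,
# P(a given slice is never touched after Q queries) = (1 - 1/F) ** Q.
# Expected number of cold slices = F * that probability.
F = 10_000
for Q in (10_000, 50_000, 100_000):
    p_cold = (1 - 1 / F) ** Q
    expected_cold = F * p_cold
    print(f"{Q:>7} queries -> ~{expected_cold:.0f} slices never touched")
```

By 100,000 queries fewer than one slice is expected to remain cold, so under uniform access the working set really is the whole index, regardless of how it is partitioned.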
> Am I correct that this has nothing to do with distributed search,
> since that really is all about horizontal splitting/sharding of the
> index, and what I'm suggesting is splitting vertically? Is there some
> other part of Solr that I can use for this, or would it be all
> home-grown?
There is some stuff coming down the pipeline in Lucene, but nothing is
there currently. Honestly, it sounds like these extra fields should
just be stored in a separate file/database. I also wonder whether
solving the underlying problem really requires storing 10k values per
doc (you haven't given us many clues in this regard).
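The separate-file/database suggestion could look something like this minimal sketch: keep only the ~20 searchable fields in the index and fetch the one needed sparse value by (doc_id, field) after the search. SQLite stands in for "file/database" here; the schema, table name, and `fetch_value` helper are all assumptions for illustration, not anything Solr or Lucene provides.

```python
# Side store for the 10,000 per-document values, keyed by (doc_id, field).
# The search index would return matching doc ids; the one sparse value a
# query needs is then a single point lookup here.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sparse (doc_id TEXT, field TEXT, value REAL, "
    "PRIMARY KEY (doc_id, field))"
)
conn.executemany(
    "INSERT INTO sparse VALUES (?, ?, ?)",
    [("doc1", "field_00042", 7.0), ("doc1", "field_00099", 3.0)],
)

def fetch_value(doc_id, field):
    # Point lookup replacing a 10,000-field stored document in the index.
    row = conn.execute(
        "SELECT value FROM sparse WHERE doc_id = ? AND field = ?",
        (doc_id, field),
    ).fetchone()
    return row[0] if row else None
```

The trade-off is one extra lookup per hit at retrieval time, in exchange for an index whose stored documents stay small and cache-friendly.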
-Mike