bq: Are there any such structures?

Well, I thought there were, but I've got to admit I can't call any to
mind immediately.
bq: 2b is just the hard limit

Yeah, I'm always a little nervous as to when Moore's Law will make
everything I know about current systems' performance obsolete. At any
rate, I _can_ say with certainty that I have no interest at this point
in exceeding this limit. Of course that may change with compelling
use-cases ;)....

Best,
Erick

On Wed, Feb 11, 2015 at 4:14 AM, Toke Eskildsen <t...@statsbiblioteket.dk> wrote:
> Erick Erickson [erickerick...@gmail.com] wrote:
>
>> I guess my $0.02 is that you'd have to have strong evidence that extending
>> Lucene to 64 bit is even useful. Or more generally, useful enough to pay the
>> penalty. All the structures that allocate maxDoc id arrays would suddenly
>> require twice the memory for instance,
>
> Are there any such structures? It was my impression that ID-structures in
> Solr were either bitmaps, hashmaps or queues. Anyway, if the number of places
> with full-size ID-arrays is low, there could be dual implementations selected
> by maxDoc.
>
>> plus all the coding effort that could be spent doing other things.
>
> Very true. I agree that at the current stage, > 2b/shard is still a bit too
> special to spend a lot of effort on it.
>
> However, 2b is just the hard limit. As has been discussed before, single
> shards work best in the lower end of the hundreds of millions of documents.
> One reason is that many parts of Lucene work single-threaded on structures
> that scale linearly with document count. Having some hundreds of millions of
> documents (log analysis being the typical case) is not uncommon these days. A
> gradual shift to more multi-thread oriented processing would fit well with
> current trends in hardware as well as use cases. As opposed to the int->long
> switch, there would be little to no penalty for setups with low maxDocs (they
> would just use 1 thread).
>
> - Toke Eskildsen
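To put the memory concern in concrete numbers, here is a back-of-envelope sketch (illustrative only, not Lucene code; the class name is made up). It computes the heap cost of a full-size per-document ID array at the current hard limit of Integer.MAX_VALUE (~2.1 billion) documents per shard, for int vs. long elements, and contrasts it with a 1-bit-per-document bitmap of the kind Solr's filter structures use:

```java
// Hypothetical sketch: memory cost of per-document structures at Lucene's
// hard limit of Integer.MAX_VALUE documents per shard.
public class DocIdMemorySketch {
    public static void main(String[] args) {
        long maxDoc = Integer.MAX_VALUE;              // hard per-shard doc limit

        long intArrayBytes  = maxDoc * Integer.BYTES; // 4 bytes per doc -> ~8.6 GB
        long longArrayBytes = maxDoc * Long.BYTES;    // 8 bytes per doc -> ~17.2 GB
        long bitmapBytes    = maxDoc / 8;             // 1 bit per doc   -> ~0.27 GB

        System.out.printf("int[]  at maxDoc: %.1f GB%n", intArrayBytes  / 1e9);
        System.out.printf("long[] at maxDoc: %.1f GB%n", longArrayBytes / 1e9);
        System.out.printf("bitmap at maxDoc: %.1f GB%n", bitmapBytes    / 1e9);
    }
}
```

This illustrates both sides of the exchange above: a full int->long switch doubles the cost wherever a full-size ID array exists, but if most ID structures are really bitmaps or hashmaps, the doubling only bites in a few places.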