Let's make sure we're talking about the same thing. Solr happily
indexes and stores 64-bit long values, no problem. What it doesn't
do is assign _internal_ document IDs as longs; those are ints.

On admin/statistics, look at maxDocs and numDocs. maxDocs + 1 will be the
next _internal_ Lucene doc ID assigned, so if that number is wonky or > 2B,
this is where the rub happens. BTW, the difference between maxDocs and
numDocs is the number of documents deleted from your index. If your number
of current documents is much smaller than 2B, you can get maxDocs
to equal numDocs by optimizing, and buy yourself some more headroom.
Whether your index will be OK after that, I'm not prepared to guarantee, though...
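To make the arithmetic concrete, here's a quick sketch (the maxDocs/numDocs
figures are made-up examples, not from any real index):

```python
# Rough arithmetic for internal Lucene doc-ID headroom.
# The max_docs/num_docs values below are hypothetical, for illustration only.
MAX_INTERNAL_ID = 2**31 - 1  # internal doc IDs are signed 32-bit ints

max_docs = 2_100_000_000   # "maxDocs" from admin/statistics (hypothetical)
num_docs = 1_400_000_000   # "numDocs" from admin/statistics (hypothetical)

deleted = max_docs - num_docs           # deleted docs still holding internal IDs
headroom_now = MAX_INTERNAL_ID - max_docs
# After an optimize, maxDocs collapses down to numDocs:
headroom_after_optimize = MAX_INTERNAL_ID - num_docs

print(deleted)                  # 700000000
print(headroom_now)             # 47483647  (~47M IDs left)
print(headroom_after_optimize)  # 747483647 (~747M IDs left)
```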

But if I'm reading your notes correctly, the "85% holes" applies to a value in
your document, and has nothing to do with the internal Lucene ID issue...

But internally, the int limit isn't robustly enforced, so I'm not surprised
that it pops up (if, indeed, this is your problem) in odd places.
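For what it's worth, this is the classic signed 32-bit overflow pattern. I'm
not claiming this is literally what Solr's code does, but it shows why a
counter crossing 2^31 - 1 surfaces as weird negative values or NPEs in odd
places rather than a clean error:

```python
import ctypes

# A signed 32-bit counter crossing its limit silently wraps negative.
limit = 2**31 - 1                        # 2,147,483,647
wrapped = ctypes.c_int32(limit + 1).value
print(limit)    # 2147483647
print(wrapped)  # -2147483648
```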

Best
Erick

On Wed, Jun 20, 2012 at 10:02 AM, avenka <ave...@gmail.com> wrote:
> Erick, thanks for pointing that out. I was going to say in my original post
> that it is almost like some limit on max documents got violated all of a
> sudden, but the rest of the symptoms didn't seem to quite match. But now
> that I think about it, the problem probably happened at 2B (corresponding
> exactly to the size of the signed int space) as my ID space in the database
> has roughly 85% holes and the problem probably happened when the ID hit
> around 2.4B.
>
> It is still odd that indexing appears to proceed normally and the select
> queries "know" which IDs are used because the error happens only for queries
> with non-empty results, e.g., searching for an ID that doesn't exist gives a
> valid "0 numResponses" response. Is this because solr uses 'long' or more
> for indexing (given that the schema supports long) but not in the querying
> modules?
>
> I hadn't used solr sharding because I really needed "rolling" partitions,
> where I keep a small index of recent documents and throw the rest into a
> slow "archive" index. So maintaining the smaller instance2 (usually < 50M)
> and replicating it if needed was my homebrewed sharding approach. But I
> guess it is time to shard the archive after all.
>
> AV
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/solr-java-lang-NullPointerException-on-select-queries-tp3989974p3990534.html
