[ https://issues.apache.org/jira/browse/SOLR-11211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16118722#comment-16118722 ]

Erick Erickson commented on SOLR-11211:
---------------------------------------

bq: I wonder how SOLR allowed me to add more documents than what a single shard 
can take.

One possible scenario (and the Lucene guys please step in if this is off the 
wall)...

_segments_ use a base+offset scheme for internal document IDs: each segment 
numbers its own documents from 0, and a per-segment base is added when the 
segments are viewed together. So segment 1 might have
base: 1,000,000
docs: 0-1,000

So as long as you're adding documents to Solr (actually Lucene) and _not_ 
opening searchers, you can keep creating segments indefinitely.

Composite IndexReaders look at all the segments and assemble a (conceptual) 
list of all the docs in the index. So the segment above will hold docs 
1,000,000-1,001,000 in the composite view.
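
You can see those bases directly through the Lucene API. A minimal sketch 
(assuming recent Lucene 5.x/6.x signatures; the index path argument is 
hypothetical, and it only works on an index that still opens):

    import org.apache.lucene.index.DirectoryReader;
    import org.apache.lucene.index.LeafReaderContext;
    import org.apache.lucene.store.FSDirectory;
    import java.nio.file.Paths;

    public class SegmentBases {
      public static void main(String[] args) throws Exception {
        // args[0]: path to the index directory, e.g. .../data/index
        try (DirectoryReader reader = DirectoryReader.open(
            FSDirectory.open(Paths.get(args[0])))) {
          for (LeafReaderContext leaf : reader.leaves()) {
            // docBase is this segment's offset into the composite doc-ID space
            System.out.println("segment " + leaf.ord
                + ": base=" + leaf.docBase
                + ", local docs=0-" + (leaf.reader().maxDoc() - 1));
          }
        }
      }
    }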

Also note that numDocs isn't the total that matters here; maxDoc is the one 
that counts against the limit, since it includes deleted documents.
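
A quick way to check which count you're up against, again just a sketch with 
a hypothetical path argument; IndexWriter.MAX_DOCS is the hard per-index 
ceiling in Lucene 5+:

    import org.apache.lucene.index.DirectoryReader;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.store.FSDirectory;
    import java.nio.file.Paths;

    public class DocCounts {
      public static void main(String[] args) throws Exception {
        try (DirectoryReader reader = DirectoryReader.open(
            FSDirectory.open(Paths.get(args[0])))) {
          // numDocs excludes deleted documents; maxDoc includes them
          System.out.println("numDocs = " + reader.numDocs());
          System.out.println("maxDoc  = " + reader.maxDoc());
          System.out.println("limit   = " + IndexWriter.MAX_DOCS); // 2147483519
        }
      }
    }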

As far as recovering your data, this occurred to me, but I have not tested 
whether it will work:
1> Copy half of the segments to each of two new cores.
2> Run CheckIndex with -fix on each core (see the sketch after this list). 
This will drop any "bad" segments; in this case I believe it will rewrite 
your segments file to include only the segments actually present.
3> Examine both cores to verify they contain what you expect.
4> Run MERGEINDEXES 
(https://cwiki.apache.org/confluence/display/solr/Merging+Indexes) to bring 
them back together (example call below).
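
For step 2>, something along these lines should work programmatically; note 
that in Lucene 5+ the command-line flag was renamed from -fix to -exorcise, 
and the API equivalent is exorciseIndex. A sketch only; run it against a 
copy, since documents in dropped segments are gone for good:

    import org.apache.lucene.index.CheckIndex;
    import org.apache.lucene.store.FSDirectory;
    import java.nio.file.Paths;

    public class FixCore {
      public static void main(String[] args) throws Exception {
        // args[0]: path to the copied core's index directory
        try (FSDirectory dir = FSDirectory.open(Paths.get(args[0]));
             CheckIndex checker = new CheckIndex(dir)) {
          CheckIndex.Status status = checker.checkIndex();
          if (!status.clean) {
            // Rewrites the segments file, dropping any segment that failed
            // its checks; the documents in those segments are lost.
            checker.exorciseIndex(status);
          }
        }
      }
    }

And for step 4>, the Core Admin call looks roughly like this (host, core 
name, and index paths are placeholders):

    http://localhost:8983/solr/admin/cores?action=mergeindexes&core=mergedCore&indexDir=/path/core1/data/index&indexDir=/path/core2/data/index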

It's worth a shot anyway. It's a band-aid, though; longer term you'll want to 
split this shard for a variety of reasons.

This is actually a Lucene-level limitation, and it is unlikely to change any 
time soon, as lifting it would be a very large undertaking.

> Too many documents, composite IndexReaders cannot exceed 2147483519
> -------------------------------------------------------------------
>
>                 Key: SOLR-11211
>                 URL: https://issues.apache.org/jira/browse/SOLR-11211
>             Project: Solr
>          Issue Type: Task
>      Security Level: Public(Default Security Level. Issues are Public) 
>         Environment: Hadoop Centos6
>            Reporter: Wael
>
> I am running a single node Hadoop SOLR machine with 64 GB of ram.
> The issue is that I was using the machine successfully until yesterday, when 
> I restarted and one of the indexes I am working on wouldn't start, giving 
> the error: "Too many documents, composite IndexReaders cannot exceed 
> 2147483519". 
> I wonder how SOLR allowed me to add more documents than what a single shard 
> can take. I need a solution to start up the index, and I don't want to lose 
> all the data, as I only have a 2-week-old backup. 


