Andre, thanks for the info! Unfortunately, my Solr is on version 3.6, and it looks like those options are not available. :(
Ming-

On Mon, May 6, 2013 at 5:32 AM, Andre Bois-Crettez <andre.b...@kelkoo.com> wrote:

> On 05/06/2013 06:03 AM, Michael Sokolov wrote:
>
>> On 5/5/13 7:48 PM, Mingfeng Yang wrote:
>>
>>> Dear Solr Users,
>>>
>>> Does anyone know what is the best way to iterate through each document
>>> in a Solr index with billion entries?
>>>
>>> I tried to use select?q=*:*&start=xx&rows=500 to get 500 docs each time
>>> and then change start value, but it got very slow after getting through
>>> about 10 million docs.
>>>
>>> Thanks,
>>> Ming-
>>
>> You need to use a unique and stable sort key and get documents > sortkey.
>> For example, if you have a unique key, retrieve documents ordered by the
>> unique key, and for each batch get documents > max (key) from the
>> previous batch
>>
>> -Mike
>
> There is more details on the wiki :
> http://wiki.apache.org/solr/CommonQueryParameters#pageDoc_and_pageScore
>
> --
> André Bois-Crettez
>
> Search technology, Kelkoo
> http://www.kelkoo.com/
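
For reference, Mike's sort-key suggestion only needs a sort and a range query, so it may still be usable on 3.6 even without pageDoc/pageScore. Below is a rough Python sketch of that idea, not a tested recipe. Everything named here is an assumption: the endpoint URL, a unique key field called `id` that is stored and sortable, the JSON response writer (`wt=json`), and the exclusive open-ended range syntax `id:{"last" TO *}` with a quoted endpoint, which is worth verifying against your own version first.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical endpoint; replace with your own core's select handler.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def quote_endpoint(value):
    """Wrap a key in double quotes so it can be used as a range endpoint."""
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def batch_params(last_key, rows=500, key_field="id"):
    """Build request parameters for the batch that follows last_key."""
    if last_key is None:
        query = "*:*"  # first batch: no lower bound yet
    else:
        # Exclusive lower bound: only keys strictly greater than last_key.
        query = "%s:{%s TO *}" % (key_field, quote_endpoint(last_key))
    # start stays at 0, so Solr never has to skip over earlier documents.
    return {"q": query, "sort": key_field + " asc",
            "rows": rows, "start": 0, "wt": "json"}


def http_fetch(params):
    """Run one query and return the list of documents (assumes wt=json)."""
    url = SOLR_SELECT_URL + "?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["response"]["docs"]


def iterate_all(fetch=http_fetch, rows=500, key_field="id"):
    """Yield every document, one batch at a time, ordered by the unique key."""
    last_key = None
    while True:
        docs = fetch(batch_params(last_key, rows, key_field))
        if not docs:
            return  # an empty batch means the index is exhausted
        for doc in docs:
            yield doc
        # Largest key in this batch becomes the lower bound of the next one.
        last_key = docs[-1][key_field]
```

Usage would be something like `for doc in iterate_all(): process(doc)`, where `process` is whatever you do with each document. Because every request uses `start=0`, the cost per batch should not grow with the position in the index the way a large `start` does, although sorting on the key field has its own memory cost on an index this size.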