For loop over 500 million documents.

Valentin Popov Thu, 12 Nov 2015 08:40:04 -0800

Hello everyone. 

We have ~10 indexes for 500M documents. Each document has an «archive date» 
and a «to» address, and one of our tasks is to calculate statistics on «to» 
for the last year. Right now we search for archive_date:(current_date - 1 year) 
and paginate the results at 50k records per page. The bottleneck of this 
approach is pagination: it takes too long, and even on a powerful server the 
whole run takes ~20 days, which is far too slow.


I ran an experiment with a CSV file: I put 200M records in it and parsed it 
with the same algorithm we use for the statistics. It took a few hours to 
execute.
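
To make the comparison concrete, here is a minimal sketch of what such a 
single-pass count could look like. It is an illustration only, not the actual 
code: the two-column line layout (ISO archive date, comma, «to» address), the 
class name ToStats and the method name countRecipients are all assumptions.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.time.LocalDate;
import java.util.HashMap;
import java.util.Map;

class ToStats {
    // Streams the input once and counts «to» addresses archived on or after cutoff.
    // Assumed (hypothetical) line layout: ISO archive date, comma, «to» address.
    static Map<String, Long> countRecipients(BufferedReader in, LocalDate cutoff)
            throws IOException {
        Map<String, Long> counts = new HashMap<>();
        String line;
        while ((line = in.readLine()) != null) {
            int comma = line.indexOf(',');
            if (comma < 0) {
                continue; // skip lines without a separator
            }
            LocalDate archived = LocalDate.parse(line.substring(0, comma).trim());
            if (archived.isBefore(cutoff)) {
                continue; // older than the reporting window
            }
            counts.merge(line.substring(comma + 1).trim(), 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // Tiny in-memory sample standing in for the real CSV file.
        String sample = "2015-06-01,a@example.com\n"
                + "2013-01-01,b@example.com\n"
                + "2015-07-01,a@example.com\n";
        LocalDate cutoff = LocalDate.of(2015, 11, 12).minusYears(1);
        System.out.println(
                countRecipients(new BufferedReader(new StringReader(sample)), cutoff));
    }
}
```

Each record is read exactly once and only the per-address counters stay in 
memory, which is why a sequential pass like this can finish in hours.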

Is it possible to somehow iterate quickly through Lucene documents without 
search and pagination? Or to somehow increase the speed of the traversal? 

Thanks

Regards,
Valentin.




