Hi, Neil.

I'm not very familiar with GSearch yet, but this sounds like a very
common Lucene/Solr problem (which one are you using?). The solr-users
list is full of good suggestions for speeding up indexing. The
biggest one is to be careful about when you commit: commits on large
indexes get increasingly expensive. One good strategy is to maintain
two Lucene indexes, one for indexing into and one for searching. That
way your users aren't bothered by indexing performance hits, and you
can simply replicate your Lucene index when the indexing job is done.
Solr has really nice replication options available.
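
To make the commit advice concrete, here is a rough sketch of the relevant solrconfig.xml settings on the indexing (master) side, assuming Solr 1.4's built-in Java replication. The element names and the 10000/300000 thresholds are illustrative, not prescriptive; double-check them against the config reference for your Solr version:

```xml
<!-- solrconfig.xml on the indexing (master) instance -->
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Batch commits instead of committing per document:
       commit after 10k docs, or after 5 minutes, whichever comes first -->
  <autoCommit>
    <maxDocs>10000</maxDocs>
    <maxTime>300000</maxTime>
  </autoCommit>
</updateHandler>

<!-- Publish each commit so the search (slave) instance can pull it -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="replicateAfter">commit</str>
  </lst>
</requestHandler>
```

The search-side instance would then point its own /replication handler's masterUrl at this instance and poll on an interval, so index-time I/O never touches the index your users are querying.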

Bess


On 17-Jun-09, at 11:06 AM, Guo, Xinjian wrote:

> Hi Gert and all GSearch users,
>
> I am indexing over 130,000 Fedora objects (DC + single-page Arabic
> text DS per object). It was fast at the beginning, but got pretty
> slow after half of the objects were indexed. The Lucene index
> directory is about 700MB now.
>
> I am using the GSearch index script "runRESTClient.sh" to loop
> through a long PID list. I did not use the rest?
> operation=updateIndex updateIndexfromFoxmlFiles call, as it would
> re-index all objects.
>
> Have any of you had the same experience? Any suggestion to speed up  
> the indexing process?
>
> Thanks.
>
> Neil

_______________________________________________
Fedora-commons-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/fedora-commons-users