30Gb isn't that big by lucene standards.  Have you considered or tried
just having one large index?  If necessary you could restrict searches
to particular "indexes", or groups thereof, by a field in the combined
index, preferably used as a filter.  If the slow searches have to
search across 63 separate indexes it is perhaps not surprising that
they are slow.  What do you do about sharing or caching
searcher/reader instances?  There are lots of useful tips on
http://wiki.apache.org/lucene-java/ImproveSearchingSpeed.

40 fields isn't that many - should be fine.

On sharding/scaling/etc,
http://www.lucidimagination.com/Community/Hear-from-the-Experts/Articles/Scaling-Lucene-and-Solr
looks well worth a read.


--
Ian.

On Mon, May 9, 2011 at 12:56 PM, Samarendra Pratap <samarz...@gmail.com> wrote:
> Hi list,
>  We have an index directory of 30 GB which is divided into 3 subdirectories
> (idx1, idx2, idx3) which are again divided into 21 sub-subdirectories
> (idx1-1, idx1-2, ...., idx2-1, ...., idx3-1, ...., idx3-21).
>
> We are running with java 1.6, lucene 2.9 (going to upgrade to 3.1 very
> soon), linux (fedora core - kernel 2.6.17-13.1), reiserfs.
>
> We have almost 40 fields in each index (is it a bad to have so many
> fields?). most of them are id based fields.
> We are using 8 servers for search, and each of which receives approximately
> 3000/hour queries in peak hour and search time of more than 1 second is
> considered bad (is it really bad?) as per the business requirement.
>
> Since past few months we are experiencing issues (load and search time) on
> our search servers, due to which I am looking for sharding techniques. Can
> someone guide or give me pointers where i can read more and test?
>
> Keeping parts of indexes on different servers search on all of them and then
> merging the results - what could be the best approach?
>
> Let me tell you that most queries use only 6-7 indexes and 4 - 5 fields (to
> search for) but some queries (searching all the data) require all the
> indexes and are primary cause of the performance degradation.
>
> Any suggestions/ideas are greatly appreciated. And further more will
> sharding (or similar thing) really reduce search time? (load is a less
> severe issue when compared to search time)
>
>
> --
> Regards,
> Samar
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org

Reply via email to