RE: noobie question

2006-05-24 Thread George Aroush
Aroush -Original Message- From: Pamela Foxcroft [mailto:[EMAIL PROTECTED] Sent: Wednesday, May 24, 2006 12:11 PM To: lucene-net-dev@incubator.apache.org Subject: Re: noobie question Hi Jeff & George OK, I guess we are storing a lot of data in our index. Basically we are storing 10 metags

RE: noobie question

2006-05-24 Thread George Aroush
5:44 PM To: lucene-net-dev@incubator.apache.org Subject: Re: noobie question You could certainly load a 7 GB index into memory, given sufficient hardware running 64-bit Windows. That said, I wouldn't suggest trying to carry a single 7 GB index in a single server's memory. Keeping an in

Re: noobie question

2006-05-24 Thread Pamela Foxcroft
> > Monitor your CPU and see whether it is being maxed out or not. Chances are > > that it is, and if queries are still taking long to run then your focus > > should > > be on disk I/O. > > > > Regards, > > > > -- George Aroush > > > >

Re: noobie question

2006-05-23 Thread Jeff Rodenburg
> be on disk I/O. > > Regards, > > -- George Aroush > > > -Original Message- > From: Jeff Rodenburg [mailto:[EMAIL PROTECTED] > Sent: Saturday, May 20, 2006 11:18 AM > To: lucene-net-dev@incubator.apache.org > Subject: Re: noobie question > > - Our index is

Re: noobie question

2006-05-23 Thread Pamela Foxcroft
enburg [mailto:[EMAIL PROTECTED] Sent: Saturday, May 20, 2006 11:18 AM To: lucene-net-dev@incubator.apache.org Subject: Re: noobie question - Our index is currently 7 Gigs. I take it we should have more than 7 Gigs of RAM on our machine? Can we get any other hardware specs? I.e., 2 or 4 procs? You can

Re: noobie question

2006-05-22 Thread Jeff Rodenburg
e that it is, and if queries are still taking long to run then your focus should be on disk I/O. Regards, -- George Aroush -Original Message- From: Jeff Rodenburg [mailto:[EMAIL PROTECTED] Sent: Saturday, May 20, 2006 11:18 AM To: lucene-net-dev@incubator.apache.org Subject: Re: noobie que

RE: noobie question

2006-05-22 Thread George Aroush
be on disk I/O. Regards, -- George Aroush -Original Message- From: Jeff Rodenburg [mailto:[EMAIL PROTECTED] Sent: Saturday, May 20, 2006 11:18 AM To: lucene-net-dev@incubator.apache.org Subject: Re: noobie question - Our index is currently 7 Gigs. I take it we should have more than 7 G

Re: noobie question

2006-05-20 Thread Jeff Rodenburg
, consider not doing it. The goal is to keep the index size as small > as possible to reduce I/O. > > Good luck. > > -- George Aroush > > -Original Message- > From: Jeff Rodenburg [mailto:[EMAIL PROTECTED] > Sent: Friday, May 19, 2006 4:28 PM > To: lucene-net-de

Re: noobie question

2006-05-20 Thread Jeff Rodenburg
Correct on our configuration, give or take a few 100 MB. :-) And we have three servers accessed simultaneously for each search. For our index, we're dealing with information that's geographically defined, so our indexes are broken up along those lines. We still monitor each index for size, but

Re: noobie question

2006-05-19 Thread Pamela Foxcroft
iginal Message- From: Jeff Rodenburg [mailto:[EMAIL PROTECTED] Sent: Friday, May 19, 2006 4:28 PM To: lucene-net-dev@incubator.apache.org Subject: Re: noobie question Yes, the merge parameters do affect indexing performance, but compactness also affects search performance as your index gets

Re: noobie question

2006-05-19 Thread Pamela Foxcroft
OK, I'm very confused here, Jeff. It sounds like what you are suggesting is that you have multiple indexes per machine, each around 300 MB, which means about 2.5/0.3 ≈ 8 indexes per machine, and you have 7.5/2.5 = 3 machines in the mix. Is this correct? On what criteria do you partition your index

RE: noobie question

2006-05-19 Thread George Aroush
so, consider not doing it. The goal is to keep the index size as small as possible to reduce I/O. Good luck. -- George Aroush -Original Message- From: Jeff Rodenburg [mailto:[EMAIL PROTECTED] Sent: Friday, May 19, 2006 4:28 PM To: lucene-net-dev@incubator.apache.org Subject: Re: noobie

Re: noobie question

2006-05-19 Thread Jeff Rodenburg
Yes, the merge parameters do affect indexing performance, but compactness also affects search performance as your index gets larger. As you incrementally update the index, the fragmentation effect (which the merge properties will dictate) causes performance degradation at search time. As for i

Re: noobie question

2006-05-19 Thread Pamela Foxcroft
Hi Jeff, a couple more questions. Don't the merge parameters determine how aggressively the index is compacted? And if so, doesn't this affect only indexing performance and not search performance? Secondly, how large should each index be? Should I be partitioning the indexes, i.e., by date range? So

Re: noobie question

2006-05-19 Thread Jeff Rodenburg
The Compound file format is the default file format for the index that you create (at least in v1.4.x). When creating an index, you can specify true/false in a constructor that indicates if you wish the index file to be compacted or not. Check out http://lucene.apache.org/java/docs/fileformats.h

Re: noobie question

2006-05-19 Thread Pamela Foxcroft
Thanks Jeff, I am a little confused by the compound vs loose file format you speak of. We are indexing html docs and indexing 10 metatags. By indexing I mean we index the body, but we also query the properties. I am not sure what the correct definition is. Are you saying that if we were merely i

Re: noobie question

2006-05-19 Thread Jeff Rodenburg
Hi Pamela - Performance certainly changes as your index grows, and it's not even necessarily a linear progression. How you indexed your data, compression factors, compound vs. loose file format, number of indexes, etc. all play a part in affecting search performance at runtime. There are a lot