RE: Lucene's scalability
In my experience, query time depends much more on the complexity of the query than on the number of results returned. When I run a query with 50-60 terms, I usually get a couple thousand results or fewer.

Dan

-----Original Message-----
From: CNew [mailto:[EMAIL PROTECTED]]
Sent: Monday, May 20, 2002 8:46 AM
To: Lucene Users List
Subject: Re: Lucene's scalability

You didn't mention the hit count on the query with 50-60 terms. Just wondering if the time was linear.
Re: Lucene's scalability
You didn't mention the hit count on the query with 50-60 terms. Just wondering if the time was linear.

----- Original Message -----
From: Armbrust, Daniel C. <[EMAIL PROTECTED]>
To: 'Lucene Users List' <[EMAIL PROTECTED]>
Sent: Friday, May 17, 2002 6:57 AM
Subject: RE: Lucene's scalability

> Currently, we are using an Ultra-80 SPARC Solaris machine with 4 processors and 4 GB of RAM.
>
> However, we are only making use of one of those processors with the index. Our biggest speed restriction is the fact that our entire index resides on a single disk drive. We have a RAID array coming soon.
>
> The performance has been very impressive, but as you can imagine, the speed depends highly on the complexity of the query. If you run a query with half a dozen terms and fields, which returns ~30,000 results, it usually takes on the order of a second or two. If you run a query with 50-60 terms, it may take 5-6 seconds.
>
> I don't have any better performance stats than this currently.
>
> Dan
RE: Lucene's scalability
Currently, we are using an Ultra-80 SPARC Solaris machine with 4 processors and 4 GB of RAM.

However, we are only making use of one of those processors with the index. Our biggest speed restriction is the fact that our entire index resides on a single disk drive. We have a RAID array coming soon.

The performance has been very impressive, but as you can imagine, the speed depends highly on the complexity of the query. If you run a query with half a dozen terms and fields, which returns ~30,000 results, it usually takes on the order of a second or two. If you run a query with 50-60 terms, it may take 5-6 seconds.

I don't have any better performance stats than this currently.

Dan

-----Original Message-----
From: Harpreet S Walia [mailto:[EMAIL PROTECTED]]
Sent: Friday, May 17, 2002 7:23 AM
To: Lucene Users List
Subject: Re: Lucene's scalability

Hi,

I am also trying to do a similar thing. I am very eager to know what kind of hardware you are using to maintain such a big index. In my case it is very important that the search happens very fast. Does such a big index of 10 GB pose any problems in this direction?

TIA

Regards
Harpreet
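On the question of whether query time is linear in the number of terms, the two rough figures in the message above can be compared directly. The following is only a sketch under stated assumptions: it uses midpoints of the quoted ranges (6 terms at about 1.5 s, 55 terms at about 5.5 s), the two queries returned different hit counts, and two data points cannot establish a scaling law. The function name is illustrative.

```python
def seconds_per_term(terms, seconds):
    """Average wall-clock cost per query term for one observation."""
    return seconds / terms


# Midpoints of the ranges quoted in the thread (assumed, not measured here).
SMALL_QUERY = (6, 1.5)    # ~half a dozen terms, "a second or two"
LARGE_QUERY = (55, 5.5)   # 50-60 terms, "5-6 seconds"

# How much the term count grew versus how much the time grew.
term_ratio = LARGE_QUERY[0] / SMALL_QUERY[0]
time_ratio = LARGE_QUERY[1] / SMALL_QUERY[1]

if __name__ == "__main__":
    print(f"small query: {seconds_per_term(*SMALL_QUERY):.2f} s/term")
    print(f"large query: {seconds_per_term(*LARGE_QUERY):.2f} s/term")
    print(f"terms grew {term_ratio:.1f}x, time grew {time_ratio:.1f}x")
```

On these midpoints the time grows more slowly than the term count, which would be consistent with a fixed per-query overhead plus a smaller per-term cost, but the disk-bound setup described above could easily dominate either number.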
Re: Lucene's scalability
Hi,

I am also trying to do a similar thing. I am very eager to know what kind of hardware you are using to maintain such a big index. In my case it is very important that the search happens very fast. Does such a big index of 10 GB pose any problems in this direction?

TIA

Regards
Harpreet

----- Original Message -----
From: "Armbrust, Daniel C." <[EMAIL PROTECTED]>
To: "'Lucene Users List'" <[EMAIL PROTECTED]>
Sent: Tuesday, April 30, 2002 12:07 AM
Subject: RE: Lucene's scalability

> I currently have an index of ~12 million documents, which are each about that size (but in XML form).
>
> When they are transformed for Lucene to index, there are upwards of 50 searchable fields.
>
> The index is about 10 GB right now.
>
> I have not yet had any problems with "pushing the limits" of Lucene.
>
> In the next few weeks, I will be pushing my number of indexed documents up into the 15-20 million range. I can let you know if any problems arise.
>
> Dan
Re: Lucene's scalability
Great, thanks for the quick response.

I am very interested in hearing how Lucene handles itself in the 15-20 million doc range. I will be doing some testing this week with Lucene and will report my findings as well. I am also testing FAST and AltaVista, and I will post some comparison details. I would be very happy to find that we did not need to buy a commercial engine because Lucene could do the job.

Joel

----- Original Message -----
From: "Armbrust, Daniel C." <[EMAIL PROTECTED]>
To: "'Lucene Users List'" <[EMAIL PROTECTED]>
Sent: Monday, April 29, 2002 2:37 PM
Subject: RE: Lucene's scalability

> I currently have an index of ~12 million documents, which are each about that size (but in XML form).
>
> When they are transformed for Lucene to index, there are upwards of 50 searchable fields.
>
> The index is about 10 GB right now.
>
> I have not yet had any problems with "pushing the limits" of Lucene.
>
> In the next few weeks, I will be pushing my number of indexed documents up into the 15-20 million range. I can let you know if any problems arise.
>
> Dan
RE: Lucene's scalability
I currently have an index of ~12 million documents, which are each about that size (but in XML form).

When they are transformed for Lucene to index, there are upwards of 50 searchable fields.

The index is about 10 GB right now.

I have not yet had any problems with "pushing the limits" of Lucene.

In the next few weeks, I will be pushing my number of indexed documents up into the 15-20 million range. I can let you know if any problems arise.

Dan

-----Original Message-----
From: Joel Bernstein [mailto:[EMAIL PROTECTED]]
Sent: Monday, April 29, 2002 1:32 PM
To: [EMAIL PROTECTED]
Subject: Lucene's scalability

Is there a known limit to the number of documents that Lucene can handle efficiently? I'm looking to index around 15 million 2K docs, which contain 7-10 searchable fields. Should I be attempting this with Lucene?

Thanks,

Joel
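As a rough back-of-envelope estimate rather than a measurement, the figures in the message above (about 12 million documents in about 10 GB) can be extrapolated to the planned 15-20 million range. This sketch assumes index size grows linearly with document count, which nothing in the thread confirms; the function name is hypothetical.

```python
def projected_index_gb(current_docs, current_gb, target_docs):
    """Linearly extrapolate index size from a single known data point.

    Assumption: size per document stays constant as the index grows.
    """
    gb_per_doc = current_gb / current_docs
    return gb_per_doc * target_docs


if __name__ == "__main__":
    # Known point from the thread: ~12 million docs, ~10 GB.
    for target in (15_000_000, 20_000_000):
        size = projected_index_gb(12_000_000, 10.0, target)
        print(f"{target:,} docs -> ~{size:.1f} GB")
```

Under that assumption the index would land at roughly 12.5 GB to 16.7 GB. An index with fewer searchable fields per document than the 50 or so described above would likely be smaller per document, so this is at best an upper-end guide for the 7-10 field case.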
Lucene's scalability
Is there a known limit to the number of documents that Lucene can handle efficiently? I'm looking to index around 15 million 2K docs, which contain 7-10 searchable fields. Should I be attempting this with Lucene?

Thanks,

Joel