Lucene Performance and usage alternatives

2008-08-05 Thread ezer

I just wrote a program using the Java API of Lucene. It is working fine for
my current index size, but I am worried about performance with a bigger
index and simultaneous user access.

1) I am worried about having to write the program in Java. I searched for
alternatives such as the C port, but I saw that the Lucene version it is
based on is a little old and not many people seem to use it.

2) I am also thinking of compiling the code with gcj to generate native code
instead of using the JVM. Has anybody tried it? Could that bring the
performance close to that of a C program?

3) I won't use an application server; I will call the program directly from a
PHP page. Is there a suggested architecture model for doing that? I mean,
to anticipate many users accessing the program. Won't starting one
instance and opening the index each time someone runs a query
degrade performance?
-- 
View this message in context: 
http://www.nabble.com/Lucene-Performance-and-usage-alternatives-tp18832162p18832162.html
Sent from the Lucene - General mailing list archive at Nabble.com.



Re: Lucene Performance and usage alternatives

2008-08-05 Thread Grant Ingersoll
Before we go solving a problem that isn't necessarily there, can you  
share a bit about what sizes you are at currently?  Num docs, index  
size, query rate?


Have you looked at http://wiki.apache.org/lucene-java/BasicsOfPerformance?


-Grant

On Aug 5, 2008, at 10:21 AM, ezer wrote:




3) I won't use an application server; I will call the program directly from a
PHP page. Is there a suggested architecture model for doing that? I mean,
to anticipate many users accessing the program. Won't starting one
instance and opening the index each time someone runs a query
degrade performance?


You shouldn't be instantiating a Reader/Searcher for each query.  See  
the link above.
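
A minimal sketch of the "open once, reuse for every query" idea, assuming the Java program is a long-running process rather than one started per request. The class name SharedResource and the opener callback are hypothetical and not part of Lucene; the generic type stands in for the searcher so the sketch does not depend on any particular Lucene version's API. Lucene documents IndexSearcher as safe to share between threads, which is what makes a single shared instance reasonable.

```java
import java.util.function.Supplier;

/**
 * Opens an expensive resource once and hands the same instance to every caller.
 * In a Lucene program the resource would be the searcher (assumption: the
 * caller supplies the code that opens the index).
 */
final class SharedResource<T> {
    private final Supplier<T> opener;   // hypothetical callback that opens the index
    private volatile T instance;        // the single shared instance, once opened

    SharedResource(Supplier<T> opener) {
        this.opener = opener;
    }

    /** Returns the shared instance, opening it on first use only. */
    T get() {
        T local = instance;
        if (local == null) {
            synchronized (this) {
                local = instance;
                if (local == null) {
                    local = opener.get();   // the expensive open happens exactly once
                    instance = local;
                }
            }
        }
        return local;
    }
}
```

Every query thread would call get() and search with the returned instance. This only pays off if the process outlives a single query; a program launched fresh from PHP for each request would still reopen the index every time.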










Re: Lucene Performance and usage alternatives

2008-08-05 Thread ezer

Yes, I saw that. It talks about performance, but not about the alternatives I
mentioned before.
So far I have tested indexing a database of about 200,000 records. As I
mentioned, it works fine, with response times of less than a second. But this
database can grow to millions of records, and I am not sure I am choosing
the best architecture for that stage to allow simultaneous access.

Thanks for the help





Re: Lucene Performance and usage alternatives

2008-08-05 Thread ezer

Grant, which other information can I provide to clarify my questions?






Re: Lucene Performance and usage alternatives

2008-08-05 Thread Stefan Groschupf
An alternative is always to distribute the index to a set of servers.
If you need to scale, I guess this is the only long-term perspective.
You can build your own home-grown Lucene distribution or look into an
existing one.
I'm currently working on Katta (http://katta.wiki.sourceforge.net/);
there is no release yet, but we are in the QA and test cycles.
But there are others as well: Solr, for example, provides distribution
as well.
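
As a rough illustration of what a home-grown distribution has to do on the query side, the sketch below merges the hit lists returned by several index servers into one top-k list. It is not how Katta or Solr implement this; the ShardMerge and Hit names are hypothetical, and it assumes each server already returned its own best hits with scores.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class ShardMerge {
    /** One search hit: a document identifier and its relevance score. */
    static final class Hit {
        final String docId;
        final double score;

        Hit(String docId, double score) {
            this.docId = docId;
            this.score = score;
        }
    }

    /** Merges per-shard hit lists and returns the k best hits, highest score first. */
    static List<Hit> mergeTopK(List<List<Hit>> perShard, int k) {
        List<Hit> all = new ArrayList<>();
        for (List<Hit> shardHits : perShard) {
            all.addAll(shardHits);
        }
        // Sort descending by score, then keep at most k hits.
        all.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
        return new ArrayList<>(all.subList(0, Math.min(k, all.size())));
    }
}
```

One caveat with this simple approach: scores computed on separate indexes are only roughly comparable unless the servers share term statistics, so the merged order is an approximation.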


Stefan






~~~
101tec Inc.
Menlo Park, California, USA
http://www.101tec.com




Re: Lucene Performance and usage alternatives

2008-08-05 Thread Grant Ingersoll
My point is more that you don't necessarily need to go looking for
variants.  I've seen Lucene Java scale to millions with no problem.  I
talked w/ a guy using Solr this past week who had ~80 million records
in a single 80 GB index on one machine.


If I had a PHP front end, I would most likely start with Solr and its
PHP client.  No sense in reinventing the wheel, IMO.
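
Whatever client is used, a Solr search comes down to an HTTP request against a server that stays running and keeps the index open, so nothing is started per query. The sketch below only builds such a request URL. The q, rows and wt parameters are standard Solr query parameters; the SolrQueryUrl class name is hypothetical, and the base URL in the usage note is an assumption that depends on how Solr is deployed.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class SolrQueryUrl {
    /** Builds a select URL from a base URL, the user's query text, and a row limit. */
    static String build(String baseUrl, String userQuery, int rows) {
        try {
            // URL-encode the query so spaces and field separators survive transport.
            return baseUrl + "/select?q=" + URLEncoder.encode(userQuery, "UTF-8")
                    + "&rows=" + rows + "&wt=json";
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }
}
```

For example, build("http://localhost:8983/solr", "title:lucene performance", 10) yields http://localhost:8983/solr/select?q=title%3Alucene+performance&rows=10&wt=json, which a PHP page could fetch with any HTTP client.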

