[IMPORTANT] Fieldable and LUCENE-1349

2008-08-05 Thread Grant Ingersoll
Per https://issues.apache.org/jira/browse/LUCENE-1349, we have made an
exception to Lucene's backward compatibility rules and marked
Fieldable as changeable. That is, we will allow changes to the interface
on a case-by-case basis, so anyone who implements their own Fieldable
(which we suspect is very, very few people) may have to make code changes
when upgrading within a minor version.  More than likely, Fieldable will be
deprecated and changed for 3.0 (when we get there).


This is noted prominently in CHANGES.txt and on the interface.  Sorry  
for the inconvenience.


Thanks,
Grant


Lucene Performance and usage alternatives

2008-08-05 Thread ezer

I just wrote a program using the Java API of Lucene. It is working fine for
my current index size, but I am worried about performance with a bigger
index and simultaneous user access.

1) I am worried about having to write the program in Java. I searched for
alternatives like the C port, but I saw that the version it uses is a
little old and not many people seem to use it.

2) I am also thinking of compiling the code with gcj to generate native code
and not use the JVM. Has anybody tried it? Could that bring performance
close to that of a C program?

3) I won't use an application server; I will call the program directly from a
PHP page. Is there a suggested architecture for doing that? I mean, to
anticipate many users accessing the program. Won't starting one instance and
opening the index each time someone runs a query degrade performance?
-- 
View this message in context: 
http://www.nabble.com/Lucene-Performance-and-usage-alternatives-tp18832162p18832162.html
Sent from the Lucene - General mailing list archive at Nabble.com.



Re: Lucene Performance and usage alternatives

2008-08-05 Thread Grant Ingersoll
Before we go solving a problem that isn't necessarily there, can you  
share a bit about what sizes you are at currently?  Num docs, index  
size, query rate?


Have you looked at http://wiki.apache.org/lucene-java/BasicsOfPerformance ?


-Grant

On Aug 5, 2008, at 10:21 AM, ezer wrote:

> 3) I won't use an application server; I will call the program directly
> from a PHP page. Is there a suggested architecture for doing that? I
> mean, to anticipate many users accessing the program. Won't starting one
> instance and opening the index each time someone runs a query degrade
> performance?


You shouldn't be instantiating a Reader/Searcher for each query.  See  
the link above.
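
To illustrate the advice above, here is a minimal sketch of opening a searcher once and reusing it for every query. It deliberately does not call the Lucene API: Searcher, SharedSearcher, and the opener passed to it are hypothetical stand-ins (in a real program the opener would open the Lucene index), so read it as an illustration of the pattern, not as Lucene code.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical stand-in for an expensive-to-open index searcher. */
interface Searcher {
    String search(String query);
}

/** Opens the searcher lazily, once, and hands the same instance to every caller. */
final class SharedSearcher {
    private final Supplier<Searcher> opener;
    private volatile Searcher instance;

    SharedSearcher(Supplier<Searcher> opener) {
        this.opener = opener;
    }

    Searcher get() {
        Searcher s = instance;
        if (s == null) {
            synchronized (this) {
                s = instance;
                if (s == null) {
                    s = opener.get(); // the costly open happens only here
                    instance = s;
                }
            }
        }
        return s;
    }
}

class SharedSearcherDemo {
    /** Runs the given number of queries through one holder; returns how often the index was opened. */
    static int opensFor(int queries) {
        AtomicInteger opens = new AtomicInteger();
        SharedSearcher shared = new SharedSearcher(() -> {
            opens.incrementAndGet(); // stands in for opening the index
            return q -> "results for " + q;
        });
        for (int i = 0; i < queries; i++) {
            shared.get().search("query " + i);
        }
        return opens.get();
    }

    public static void main(String[] args) {
        System.out.println("index opened " + opensFor(1000) + " time(s) for 1000 queries");
    }
}
```

For the PHP setup in the question, this would presumably mean a long-running Java process that the PHP page talks to, rather than a new JVM started per request.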










Re: Lucene Performance and usage alternatives

2008-08-05 Thread ezer

Yes, I saw that. It talks about performance, but not about the variants I
mentioned before.
Actually, I tested indexing a database of about 200,000 records. As I
mentioned, it works fine, with response times of less than a second. But this
database can grow to millions of records, and I am not sure I am choosing
the best architecture for that step to allow simultaneous access.

Thanks for the help






Re: Lucene Performance and usage alternatives

2008-08-05 Thread ezer

Grant, which other information can I provide in order to clarify my questions?







Re: Lucene Performance and usage alternatives

2008-08-05 Thread Stefan Groschupf
An alternative is always to distribute the index across a set of servers.
If you need to scale, I guess this is the only long-term option.
You can build your own home-grown Lucene distribution or look into
existing ones.
I'm currently working on Katta (http://katta.wiki.sourceforge.net/);
there is no release yet, but we are in the QA and test cycles.
There are others as well; Solr, for example, also provides distribution.
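
As a rough sketch of what a home-grown distribution involves: each server holds one slice (shard) of the index, the query goes to every shard, and the per-shard top hits are merged by score. The Shard interface, Hit class, and canned in-memory shards below are hypothetical stand-ins for remote Lucene indexes. This is not how Katta or Solr implement it, and it assumes scores from different shards are directly comparable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** One scored result; docId and score are illustrative fields. */
final class Hit {
    final String docId;
    final double score;

    Hit(String docId, double score) {
        this.docId = docId;
        this.score = score;
    }
}

/** Hypothetical stand-in for a server holding one slice of the index. */
interface Shard {
    List<Hit> search(String query, int topN);
}

final class DistributedSearch {
    /** Asks every shard for its topN hits, then keeps the topN best overall. */
    static List<Hit> search(List<Shard> shards, String query, int topN) {
        List<Hit> merged = new ArrayList<>();
        for (Shard shard : shards) {
            // Sequential here for brevity; a real system would query shards in parallel.
            merged.addAll(shard.search(query, topN));
        }
        merged.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
        return new ArrayList<>(merged.subList(0, Math.min(topN, merged.size())));
    }
}

class DistributedSearchDemo {
    /** Returns the doc ids of the merged top 3 hits from two canned shards. */
    static List<String> demo() {
        // Canned shards: they ignore the query and always return the same hits.
        Shard a = (q, n) -> Arrays.asList(new Hit("a1", 0.9), new Hit("a2", 0.4));
        Shard b = (q, n) -> Arrays.asList(new Hit("b1", 0.7), new Hit("b2", 0.1));
        List<String> ids = new ArrayList<>();
        for (Hit h : DistributedSearch.search(Arrays.asList(a, b), "lucene", 3)) {
            ids.add(h.docId);
        }
        return ids;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [a1, b1, a2]
    }
}
```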


Stefan






~~~
101tec Inc.
Menlo Park, California, USA
http://www.101tec.com




Re: Lucene Performance and usage alternatives

2008-08-05 Thread Grant Ingersoll
My point is more that you don't necessarily need to go looking for
variants.  I've seen Lucene Java scale to millions of records with no
problem.  I talked w/ a guy using Solr this past week who had ~80 million
records in a single 80 GB index on one machine.


If I had a PHP front end, I would most likely start with Solr and its
PHP client.  No sense in reinventing the wheel, IMO.

