Re: Disk I/O in Lucene

2002-04-09 Thread Peter Carlson
Hi, Lucene does not use the new NIO APIs in JDK 1.4 (in fact it is compatible back to JDK 1.1.8). What kind of benchmarks are you looking for? I am currently searching over 100K documents in about 0.015 seconds for a simple query on a Sun Netra T1 (450 MHz). If you have a real need for speed, Lucene a

Getting Terms sequentially without using TermEnumeration

2002-04-09 Thread Slavisa Radic
Hi, I have to build a term-document matrix to be able to do some matrix operations on it. The matrix should look like this (I hope this will be displayed correctly):

        term1  term2  term3  term4  ...
        -----------------------------
Doc1    freq   freq   freq   freq
Doc2    freq   ..
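
The matrix described in this message can be sketched without any Lucene calls. The example below is only an illustration under stated assumptions: documents are plain strings, tokens are split on non-word characters and lowercased, and the class name `TermDocumentMatrix` and its `build` method are hypothetical, not part of Lucene.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

public class TermDocumentMatrix {

    /**
     * Builds a documents-by-terms frequency matrix.
     * Rows follow the order of docs; columns follow the sorted term order,
     * which is written into termsOut so the caller can label the columns.
     */
    public static int[][] build(List<String> docs, List<String> termsOut) {
        List<Map<String, Integer>> counts = new ArrayList<>();
        TreeSet<String> vocabulary = new TreeSet<>();

        // First pass: count term frequencies per document and collect the vocabulary.
        for (String doc : docs) {
            Map<String, Integer> freq = new HashMap<>();
            for (String token : doc.toLowerCase().split("\\W+")) {
                if (token.isEmpty()) {
                    continue; // split() can yield a leading empty string
                }
                freq.merge(token, 1, Integer::sum);
                vocabulary.add(token);
            }
            counts.add(freq);
        }

        termsOut.clear();
        termsOut.addAll(vocabulary);

        // Second pass: fill the dense matrix, using 0 for absent terms.
        int[][] matrix = new int[docs.size()][termsOut.size()];
        for (int d = 0; d < docs.size(); d++) {
            for (int t = 0; t < termsOut.size(); t++) {
                matrix[d][t] = counts.get(d).getOrDefault(termsOut.get(t), 0);
            }
        }
        return matrix;
    }

    public static void main(String[] args) {
        List<String> terms = new ArrayList<>();
        int[][] matrix = build(Arrays.asList("apple banana apple", "banana cherry"), terms);
        System.out.println(terms);
        for (int[] row : matrix) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

A dense `int[][]` is chosen here only for readability; with a large vocabulary most cells are zero, so a sparse representation may be a better fit.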

RE: too many open files in system

2002-04-09 Thread Britton, Colin
I have worked with the Cocoon indexer, and it creates a field for each XML element and XML attribute. With complex XML the number of segment files grows out of control. There are two ways I see to change this. 1) change the Cocoon indexer (I looked at this and decided against it) 2) add a styleshee

RE: too many open files in system

2002-04-09 Thread Nader S. Henein
That might be the case. I'm indexing 200 000 files, each one has about 30 XML fields, and each field has a set of attributes. Could that be it? -Original Message- From: Karl Øie [mailto:[EMAIL PROTECTED]] Sent: Tuesday, April 09, 2002 7:03 PM To: Lucene Users List Subject: Re: too many open fil

Re: too many open files in system

2002-04-09 Thread Karl Øie
I have worked a little with the Cocoon indexer and it indexes each XML attribute in a Field. I have done some indexing on both plain-text and XML sources, and I think the "Too many open files" problem is directly related to the number of fields stored in a document in an index. the reason for this is

RE: too many open files in system

2002-04-09 Thread Otis Gospodnetic
I've indexed 250,000 items (database rows, not files) with Lucene on a system like this:

[otis@kyle blink]$ ulimit
unlimited
[otis@kyle blink]$ tcsh
> limit
cputime       unlimited
filesize      unlimited
datasize      unlimited
stacksize     8192 kbytes
coredumpsize  100 kbytes
mem

RE: too many open files in system

2002-04-09 Thread Nader S. Henein
The issue is the same with Lucene when you index: if you're indexing 200 000 files, the number of files created by the index causes the system to run out of file handles. Is there an equation to find out how many files will be created by the indexer based on the number of files we want indexed
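
A rough way to reason about the question above is to count files per segment rather than per source document. The sketch below assumes the multi-file segment layout of early Lucene releases: seven fixed files per segment (`.fnm`, `.fdx`, `.fdt`, `.tii`, `.tis`, `.frq`, `.prx`), one norms file per indexed field, and two index-level files (`segments`, `deletable`). The segment count is an input, not something this code derives, and the class and method names are hypothetical.

```java
public class IndexFileEstimate {

    // Assumption: fixed per-segment files in the early multi-file layout
    // (.fnm, .fdx, .fdt, .tii, .tis, .frq, .prx).
    static final int FIXED_FILES_PER_SEGMENT = 7;

    // Assumption: index-level files ("segments" and "deletable").
    static final int INDEX_LEVEL_FILES = 2;

    /**
     * Estimates how many files an index directory holds, given the number
     * of segments currently on disk and the number of distinct indexed fields
     * (one norms file per indexed field per segment is assumed).
     */
    public static long estimate(int segments, int indexedFields) {
        if (segments < 0 || indexedFields < 0) {
            throw new IllegalArgumentException("counts must be non-negative");
        }
        return (long) segments * (FIXED_FILES_PER_SEGMENT + indexedFields) + INDEX_LEVEL_FILES;
    }

    public static void main(String[] args) {
        // 10 unmerged segments, 100 distinct indexed fields.
        System.out.println(estimate(10, 100)); // prints 1072
        // The same index after merging down to a single segment.
        System.out.println(estimate(1, 100));  // prints 109
    }
}
```

Under these assumptions the file count depends on the number of segments and distinct field names, not directly on the 200 000 source files, which is consistent with the suggestions elsewhere in this thread to reduce the number of fields or to optimize the index.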

Re: too many open files in system

2002-04-09 Thread Otis Gospodnetic
Judging from other messages in this thread it seems that the cause of your problem could be an unoptimized index (somebody said that lots of files need to be opened for searches). Try optimizing your index. Optimizing an index will reduce the number of files comprising your index. Otis --- roo

RE: too many open files in system

2002-04-09 Thread Otis Gospodnetic
This sounds like a question for Cocoon people, as what you are asking about seems to be related to Cocoon's usage of Lucene, not the core Lucene API. Otis --- Ian Forsyth <[EMAIL PROTECTED]> wrote: > I'm calling in response to the LuceneCocoonIndexer, is this class an > XML > file indexer? (excu

RE: too many open files in system

2002-04-09 Thread Ian Forsyth
I'm calling in response to the LuceneCocoonIndexer. Is this class an XML file indexer? (Excuse my ignorance, I am just stepping into this whole thing.) I do a lot of development with PHP on different platforms (WIN, *NIXES) and I want to get into indexing data... I am wondering if there are clas

RE: HTML Parser

2002-04-09 Thread Mark Ayad
Neal, that's because HTMLParser.jj is NOT a Java file; it contains the grammar for JavaCC. Have a look at http://www.quiotix.com/downloads/html-parser/ Regards Mark -Original Message- From: Neal Weinstein [mailto:[EMAIL PROTECTED]] Sent: 09 April 2002 16:21 To: '[EMAIL PROTECTE

HTML Parser

2002-04-09 Thread Neal Weinstein
Hi, I am working with the Lucene demo and would like to compile it so that I may eventually modify it for my own use. I am using the source from lucene-demos-1.2-rc4.jar.zip. However, the HTMLParser class has the filename HTMLParser.jj and won't compile. I changed the name to HTMLParser.ja

RE: too many open files in system

2002-04-09 Thread Nader S. Henein
That depends on how many files you're indexing. I also still have to figure out what logic the LuceneCocoonIndexer follows when it creates the index files -Original Message- From: root [mailto:[EMAIL PROTECTED]] Sent: Tuesday, April 09, 2002 4:50 PM To: Lucene Users List Subject

Re: too many open files in system

2002-04-09 Thread root
On Tuesday, 9. April 2002 14:08, you wrote: > root wrote: > > Doesn't Lucene release the file handles? > > > > because I get "too many open files in system" after running Lucene a > > while! > > Are you closing the readers and writers after you've finished using them? > > cheers, > > Chris Yes

RE: too many open files in system

2002-04-09 Thread Nader S. Henein
It's not a matter of releasing the handles; it needs to keep them open. This tricked me as well: I thought it kept the file handles of the source XML files open, but if you look at the code it actually reads the contents of the files from an HTTP request. The file handles are consumed by the files

Re: too many open files in system

2002-04-09 Thread Chris Withers
root wrote: > > Doesn't Lucene release the file handles? > > because I get "too many open files in system" after running Lucene a while! Are you closing the readers and writers after you've finished using them? cheers, Chris -- To unsubscribe, e-mail: For additi
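
The question about closing readers and writers can be illustrated with a small stand-in that uses no Lucene classes. `TrackedHandle` below is a hypothetical placeholder for any object that holds a file handle; the sketch only shows how a handle that is never closed stays counted as open, while one wrapped in try-with-resources does not.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class HandleDemo {

    /** Hypothetical stand-in for a reader or writer that holds one file handle. */
    static class TrackedHandle implements AutoCloseable {
        private final AtomicInteger openCount;
        private boolean closed;

        TrackedHandle(AtomicInteger openCount) {
            this.openCount = openCount;
            openCount.incrementAndGet(); // "opening" consumes a handle
        }

        @Override
        public void close() {
            if (!closed) {
                closed = true;
                openCount.decrementAndGet(); // "closing" gives it back
            }
        }
    }

    /** Opens n handles and closes each one; returns how many remain open. */
    public static int runWithClose(int n) {
        AtomicInteger open = new AtomicInteger();
        for (int i = 0; i < n; i++) {
            try (TrackedHandle handle = new TrackedHandle(open)) {
                // work with the handle here; close() runs even if this throws
            }
        }
        return open.get();
    }

    /** Opens n handles and never closes them; returns how many remain open. */
    public static int runWithoutClose(int n) {
        AtomicInteger open = new AtomicInteger();
        for (int i = 0; i < n; i++) {
            new TrackedHandle(open); // leaked: nothing ever calls close()
        }
        return open.get();
    }

    public static void main(String[] args) {
        System.out.println("still open after closing:  " + runWithClose(1000));
        System.out.println("still open without closing: " + runWithoutClose(1000));
    }
}
```

On JDKs older than 7, which lack try-with-resources, the same effect is achieved by calling `close()` in a `finally` block.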

too many open files in system

2002-04-09 Thread root
Hi List! Doesn't Lucene release the file handles? I ask because I get "too many open files in system" after running Lucene for a while! I use version 1.2 rc 4! regards