Re: hits.length() changes during delete process.

2004-12-06 Thread Morus Walter
David Townsend writes: So the short question is, should the hits object be changing, and what is the best way to delete all the results of a search (it's a range query, so I can't use delete(Term term))? The hits object caches only part of the hits (initially the first 100 (?)). This cache

Re: restricting search result

2004-12-06 Thread Sergiu Gordea
Paul wrote: Hi, how would you restrict the search results for a certain user? I'm indexing all the existing data in my application, but there are certain access levels, so some users should see more results than others. Each lucene document has a field with an internal id and I want to restrict on

reoot site query results

2004-12-06 Thread Chris Fraschetti
My lucene implementation works great; it's basically an index of many web crawls. The main thing my users complain about is, say, a search for slashdot will return the http://www.slashdot.org/soem_dir/somepage.asp as the top result because the factors I have scoring it determine it as so... but

addIndexes() Size

2004-12-06 Thread Garrett Heaver
Hi. It's probably really simple to explain this, but since I'm not up to speed on the way Lucene stores the data I'm a little confused. I'm building an Index, which resides on Server A, with the Lucene Service running on Server B. Now not to bore you with the details but because of the

Re: LUCENE + 1.4.2

2004-12-06 Thread Erik Hatcher
On Dec 6, 2004, at 1:22 AM, Karthik N S wrote: I am not able to find the final Lucene 1.4.2 source anywhere on http://jakarta.apache.org/lucene/docs/index.html Could somebody please reply with the URL? Actually Lucene 1.4.3 is now available and I recommend you use it instead, through the

Re: reoot site query results

2004-12-06 Thread Erik Hatcher
On Dec 6, 2004, at 4:53 AM, Chris Fraschetti wrote: My lucene implementation works great; it's basically an index of many web crawls. The main thing my users complain about is, say, a search for slashdot will return the http://www.slashdot.org/soem_dir/somepage.asp as the top result because the

RE: LUCENE + 1.4.2

2004-12-06 Thread Karthik N S
Hi Erik, apologies... This means that issues with 1.4.2 and 1.4.1 are fixed in 1.4.3 as of now. 1) So you say we can retrospectively move our under-development code from 1.4.1 to 1.4.3 safely? 2) Do we need to reindex all of our code done via 1.4.1, or continue with

Re: reoot site query results

2004-12-06 Thread Chris Fraschetti
I do this to some extent... currently I apply a boost if it's, as best I can tell, a root page. But I am more asking how to determine root pages... content obviously isn't easy to use... the url is the main key... but that can be tricky as well... Basically the pages are from a crawl.. so their
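
[Editor's sketch] One way to read "the url is the main key" is a purely syntactic check: treat a URL as a root page when it has a host, no path beyond "/", and no query string. The class and method names below are hypothetical, and this is only a heuristic under those assumptions, not the poster's actual boost logic; it does not recognise variants such as /index.html.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper: a syntactic guess at whether a crawled URL is a site root.
class RootPageHeuristic {

    /** Returns true when the URL has a host, no path beyond "/", and no query string. */
    static boolean looksLikeRootPage(String url) {
        try {
            URI uri = new URI(url);
            String path = uri.getPath();
            boolean emptyPath = path == null || path.isEmpty() || path.equals("/");
            return uri.getHost() != null && emptyPath && uri.getQuery() == null;
        } catch (URISyntaxException e) {
            // Unparseable URLs are never treated as root pages.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(looksLikeRootPage("http://www.slashdot.org/"));                      // true
        System.out.println(looksLikeRootPage("http://www.slashdot.org/soem_dir/somepage.asp")); // false
    }
}
```

A flag computed this way could be stored with the document at index time and used to decide whether to apply a boost.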

Help on Phrase Prefix query

2004-12-06 Thread Mahendra
Hi, Presently I am working on a requirement in my application to do the search using lucene as follows. The user enters phrase prefix query text. The query should be constructed as follows: - a PhrasePrefixQuery based on the user-entered text, e.g. for FieldA - a TermQuery based on another field,

Size Search

2004-12-06 Thread Natarajan.T
Hi All, I have indexed file sizes (like 2, 4, 5, 10, 20, etc.). I want the search results to include only sizes in the range 5 - 20. How can I handle this? Natarajan.

Re: LUCENE + 1.4.2

2004-12-06 Thread Erik Hatcher
On Dec 6, 2004, at 5:48 AM, Karthik N S wrote: This means that issues with 1.4.2 and 1.4.1 are fixed in 1.4.3 as of now, See the CHANGES.txt file for details on what changes took place, but yes some issues were fixed. 1) So you say we can retrospectively move our under-development code

Re: Size Search

2004-12-06 Thread Erik Hatcher
On Dec 6, 2004, at 8:08 AM, Natarajan.T wrote: I have indexed file sizes (like 2, 4, 5, 10, 20, etc.) How did you index them? I want the search results to include only sizes in the range 5 - 20. How can I handle this? See here: http://wiki.apache.org/jakarta-lucene/SearchNumericalFields Erik

Re: Help on Phrase Prefix query

2004-12-06 Thread Erik Hatcher
Mahendra, Could you provide a concrete, and simple, example of what you're trying to achieve? It would help me understand what you're after. Any Query implementation works fine as a clause within a BooleanQuery, there is nothing special to do for a PhrasePrefixQuery in this regard.

API suggestion

2004-12-06 Thread Nestel, Frank IZ/HZA-IOL
Hello, I'm currently investigating improving the Highlighter currently supplied in the lucene sandbox. Especially we'd like to parse more different styles of Querys. Most important WildcardQuery. As it turns out, this shouldn't be too difficult, but the problem is that the API

Re: reoot site query results

2004-12-06 Thread Doug Cutting
In web search, link information helps greatly. (This was Google's big discovery.) There are lots more links that point to http://www.slashdot.org/ than to http://www.slashdot.org/xxx/yyy, and many (if not most) of these links have the term slashdot, while links to

Re: addIndexes() Size

2004-12-06 Thread Otis Gospodnetic
If I were you, I would first use Luke to peek at the index. You may find something obvious there, like multiple copies of the same Document. Does your temp index 'overlap' with A index in terms of Documents? If so, you will end up with multiple copies, as addIndexes method doesn't detect and

RE: addIndexes() Size

2004-12-06 Thread Garrett Heaver
No there are no duplicate copies - I've the correct number when I view through luke and I don't overlap - the temporary index is destroyed after it is added to the main index - I'm currently at index version 159 and it seems that all of my .prx files come in at around 1435 megs (ouch) Thanks

Re: addIndexes() Size

2004-12-06 Thread Erik Hatcher
There was a bug in 1.4 (and maybe 1.4.1?) that kept some index files around that were not used. Are you using Lucene 1.4.3? If not, try that and see if it helps. Erik On Dec 6, 2004, at 12:17 PM, Garrett Heaver wrote: No there are no duplicate copies - I've the correct number when I

Re: Recommended values for mergeFactor, minMergeDocs, maxMergeDocs

2004-12-06 Thread Doug Cutting
Chuck Williams wrote: I've got about 30k documents and have 3 indexing scenarios: 1. Full indexing and optimize 2. Incremental indexing and optimize 3. Parallel incremental indexing without optimize Search performance is critical. For both cases 1 and 2, I'd like the fastest

RE: addIndexes() Size

2004-12-06 Thread Garrett Heaver
Cheers for that Erik - believe it or not I'm still back at v1.3 (doh!!!) Will try 1.4.3 tomorrow Thanks Garrett -Original Message- From: Erik Hatcher [mailto:[EMAIL PROTECTED] Sent: 06 December 2004 17:27 To: Lucene Users List Subject: Re: addIndexes() Size There was a bug in 1.4 (and

Re: indexReader close method

2004-12-06 Thread Helen Warren
Hi Otis, Thanks for the reply. I'm using lucene 1.4.2. I believe that the IndexSearcher in this version will close the IndexReader if the IndexReader was supplied implicitly but I'm constructing the IndexReader first and passing it to the IndexSearcher constructor as I want to use the reader

[Fwd: API suggestion]

2004-12-06 Thread markharw00d
The documentation for the highlighter already covers how to handle wildcard queries. See the javadoc notes on query.rewrite. Cheers Mark ---BeginMessage--- Hello, I'm currently investigating improving the Highlighter currently supplied in the lucene sandbox. Especially we'd like to parse more

Re: indexReader close method

2004-12-06 Thread Chris Hostetter
: Do you know why I can't close the IndexReader explicitly under some : circumstances and why, when I do manage to close it I can still call : methods on the reader? 1) I tried to create a test case that demonstrated your bug based on the code outline you provided, and i couldn't (see below).

Index delete failing

2004-12-06 Thread Ravi
Hi, We need to delete a lucene index from our application using java.io.File.delete(). We are closing the IndexWriter and even all the index searchers on that folder. But a call to delete returns false. There is no lock on the index directory. Interesting thing is that the deletable and segments

Re: Index delete failing

2004-12-06 Thread Otis Gospodnetic
This smells like a Windows issue. It is possible that something in your JVM is still holding onto the index directory (for example, FSDirectory), and Winblows is not letting you remove the directory. I bet this will work if you exit the JVM and run java.io.File.delete() without calling Lucene.

RE: Index delete failing

2004-12-06 Thread Ravi
Yep, It works if I exit the JVM and run file.delete() from a different class without using Lucene. -Original Message- From: Otis Gospodnetic [mailto:[EMAIL PROTECTED] Sent: Monday, December 06, 2004 4:48 PM To: Lucene Users List Subject: Re: Index delete failing This smells like a

Problem with indexing/merging indices - documents not indexed.

2004-12-06 Thread [EMAIL PROTECTED]
Hello all After reading the list for more than a year, I've finally decided (got courage) to post my first question. I'm not an expert in Lucene or Java, but I can find my way around it and right now I'm having a problem that I hope this list could help me out with. I'm using MySQL to store

Re: Problem with indexing/merging indices - documents not indexed.

2004-12-06 Thread Chris Hostetter
: I would appreciate any feedback on my code and whether I'm doing : something in a wrong way, because I'm at a total loss right now : as to why documents are not being indexed at all. I didn't try running your code (because i don't have a DB to test it with) but a quick read gives me a good

Single Digit Indexing

2004-12-06 Thread Bill von Ofenheim (LaRC)
How can I get Lucene to index single digits (e.g. 8 as in Gemini 8)? I am able to index numbers with two or more digits (e.g. 11 as in Apollo 11). Thanks, Bill von Ofenheim

Re: Single Digit Indexing

2004-12-06 Thread Otis Gospodnetic
Hm, if you can index 11, you should be able to index 8 as well. In any case, you most likely want to make sure that your Analyzer is not just throwing your numbers out. This may still be up to date: http://www.jguru.com/faq/view.jsp?EID=538308 See also:

Re: Problem with indexing/merging indices - documents not indexed.

2004-12-06 Thread [EMAIL PROTECTED]
Hi Chris, actually for merging indices that's how Otis did it in the article I quoted:
    // if -r argument was specified, use RAMDirectory
    RAMDirectory ramDir = new RAMDirectory();
    IndexWriter ramWriter = new IndexWriter(ramDir, analyzer, true);

dotLucene 1.4.3 (port of Jakarta Lucene to C#)

2004-12-06 Thread George Aroush
Hi Folks, I am pleased to announce the availability of dotLucene 1.4.3 beta build-001. This is the first beta release of version 1.4.3 of Jakarta Lucene ported to C# and is intended to be a release candidate. Please visit http://www.sourceforge.net/projects/dotlucene/ to learn more about

RE: dotLucene (port of Jakarta Lucene to C#)

2004-12-06 Thread George Aroush
Hi Pasha, If you don't use any tools, are you saying that you did the conversion by hand?!! I see a lot of code resemblances in 1.3 and what JLCA generates for me in 1.4 and 1.4.3, I mean a lot. Regards, -- George -Original Message- From: Pasha Bizhan [mailto:[EMAIL PROTECTED] Sent:

Re: Single Digit Indexing

2004-12-06 Thread David Spencer
Otis Gospodnetic wrote: Hm, if you can index 11, you should be able to index 8 as well. In any case, you most likely want to make sure that your Analyzer is not just In theory you could have a length filter tossing out tokens that are too short or too long, and maybe you're getting rid of all
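
[Editor's sketch] The length-filter theory above can be illustrated without Lucene at all. The class below is a hypothetical stand-in for an analyzer chain that splits on whitespace and drops tokens shorter than a minimum length; it is not any real Lucene analyzer, only a way to see why "8" could vanish while "11" survives.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an analyzer with a minimum-token-length filter.
class MinLengthTokenDemo {

    /** Splits text on whitespace and keeps only tokens of at least minLength characters. */
    static List<String> keepTokens(String text, int minLength) {
        List<String> kept = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            if (token.length() >= minLength) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        System.out.println(keepTokens("Gemini 8", 2));  // [Gemini]
        System.out.println(keepTokens("Apollo 11", 2)); // [Apollo, 11]
    }
}
```

If the real analyzer behaves like this with a minimum length of 2, the single digit never reaches the index, which matches the symptom described in the thread.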

Is this a bug or a feature with addIndexes?

2004-12-06 Thread [EMAIL PROTECTED]
Greetings, Ok, so maybe this is common knowledge to most of you but I'm a layman when it comes to Lucene and I couldn't find any details about this after some searching. When you merge two indexes via addIndexes, does it only work in batches (10 or more documents)? Because I've been banging my

Re: Is this a bug or a feature with addIndexes?

2004-12-06 Thread Otis Gospodnetic
Hello, Try changing IndexWriter's mergeFactor variable. It's 10 by default. Change it to 1, for instance. Otis --- [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: Greetings, Ok, so maybe this is common knowledge to most of you but I'm a layman when it comes to Lucene and I couldn't find any

Re: Is this a bug or a feature with addIndexes?

2004-12-06 Thread [EMAIL PROTECTED]
Hi Otis, I did try, here's what I get: [EMAIL PROTECTED] tmp]# time java MemoryVsDisk 1 1 10 -r Docs in the RAM index: 1 Docs in the FS index: 0 Total time: 142 ms real 0m0.322s user 0m0.268s sys 0m0.033s I tried other combinations but they don't seem to affect the outcome either

Re: Is this a bug or a feature with addIndexes?

2004-12-06 Thread Chris Hostetter
: [EMAIL PROTECTED] tmp]# time java MemoryVsDisk 1 1 10 -r : Docs in the RAM index: 1 : Docs in the FS index: 0 : Total time: 142 ms I looked at the code from the article you mentioned and added the print statements i'm guessing you added for ramWriter/fsWriter.docCount() before and after

.NET Version of Lucene

2004-12-06 Thread Ben Litchfield
I know there has been talk about a .NET version of lucene. I have been looking into doing something similar for PDFBox and came across a project called IKVM http://www.ikvm.net/ I don't believe it has been mentioned on this list. It is a little different approach than what people have been

RE: Size Search

2004-12-06 Thread Natarajan.T
Thanks for your response. I have indexed like Field.Text(size,docSize) -Original Message- From: Erik Hatcher [mailto:[EMAIL PROTECTED] Sent: Monday, December 06, 2004 6:48 PM To: Lucene Users List Subject: Re: Size Search On Dec 6, 2004, at 8:08 AM, Natarajan.T wrote: I have

Re: Help on Phrase Prefix query

2004-12-06 Thread Mahendra
Hi Erik, Thanks for responding. I have attached a sample java file for the sample implementation. -regards, mahendra On Mon, 6 Dec 2004 08:19:54 -0500, Erik Hatcher [EMAIL PROTECTED] wrote: Mahendra, Could you provide a concrete, and simple, example of what you're trying to achieve? It

finalize delete without optimize

2004-12-06 Thread John Wang
Hi: Is there a way to finalize deletes, i.e. actually remove them from the segments and make sure the docIDs are contiguous again? The only explicit way to do this is by calling IndexWriter.optimize(). But this call does a lot more (also merges all the segments), hence is very expensive. Is

Filter !!!

2004-12-06 Thread Natarajan.T
Hi All, I want to pass multiple filter (QueryFilter, DateFilter) objects to the search method. See below: Hits hits = indexSearcher.search(searchQuery, filter) // here I want to pass multiple filter... (DateFilter,QueryFilter) How can I handle this? Regards, Natarajan.

Re: Filter !!!

2004-12-06 Thread Chris Hostetter
: Hits hits = indexSearcher.search(searchQuery, filter) // here I want : to pass multiple filter... (DateFilter,QueryFilter) You can write a Filter that takes in multiple filters and ANDs them together (or ORs them, it's not clear what you want) Hits h = s.search(q,new
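
[Editor's sketch] In Lucene 1.4 a Filter reports the documents it allows as a java.util.BitSet (via bits(IndexReader)), so "ANDing filters together" comes down to intersecting bit sets. The sketch below shows only that intersection step on plain BitSets; the class name and the maxDoc parameter are assumptions for illustration, and it is not the code from the reply above, which is cut off in this archive.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

// Hypothetical helper: intersects the per-document bit sets produced by several filters.
class AndBits {

    /** Returns a new BitSet containing only the document numbers set in every input. */
    static BitSet andAll(List<BitSet> sets, int maxDoc) {
        BitSet result = new BitSet(maxDoc);
        result.set(0, maxDoc);      // start with every document allowed
        for (BitSet s : sets) {
            result.and(s);          // keep only documents this filter also allows
        }
        return result;
    }

    public static void main(String[] args) {
        BitSet dateBits = new BitSet();
        dateBits.set(1, 4);         // documents 1, 2, 3
        BitSet queryBits = new BitSet();
        queryBits.set(2, 5);        // documents 2, 3, 4
        System.out.println(andAll(Arrays.asList(dateBits, queryBits), 10)); // {2, 3}
    }
}
```

An OR variant would start from an empty BitSet and call or() instead of and(). The inputs are left unmodified, which matters if the underlying filters cache their bit sets.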

RE: Filter !!!

2004-12-06 Thread Natarajan.T
Thanks for your response.. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Chris Hostetter Sent: Tuesday, December 07, 2004 11:26 AM To: Lucene Users List Subject: Re: Filter !!! : Hits hits = indexSearcher.search(searchQuery, filter) //

Re: Help on Phrase Prefix query

2004-12-06 Thread Erik Hatcher
On Dec 6, 2004, at 11:05 PM, Mahendra wrote: Thanks for responding. I have attached a sample java file for the sample implementation. Please convert this program to building the index using RAMDirectory also. I cannot run it as it is because it relies on an external index using a Windows path

Re: Filter !!!

2004-12-06 Thread Erik Hatcher
On Dec 7, 2004, at 12:55 AM, Chris Hostetter wrote: : Hits hits = indexSearcher.search(searchQuery, filter) // here I want : to pass multiple filter... (DateFilter,QueryFilter) You can write a Filter that takes in multiple filters and ANDs them together (or ORs them, it's not clear what

Re: indexReader close method

2004-12-06 Thread Morus Walter
Helen Warren writes: //close the IndexReader object myReader.close(); //return results return hits; The myReader.close() line causes the IOException to be thrown. To try Are you sure it's the myReader.close() that fails? I'd suspect that to fail as soon as you want to do anything

Re: Size Search

2004-12-06 Thread Erik Hatcher
On Dec 6, 2004, at 10:41 PM, Natarajan.T wrote: Thanks for your response. I have indexed like Field.Text(size,docSize) Field.Text runs the text through the analyzer, which may or may not strip numbers. In this case, you probably want to use Field.Keyword instead. Be sure, like the wiki
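
[Editor's sketch] Range queries over indexed terms compare strings, not numbers, so "10" sorts before "5". A common remedy, and the editor's assumption about what the truncated advice above goes on to say, is to zero-pad values to a fixed width before indexing them as keywords. The class and method names below are hypothetical, and the range check is a plain string comparison standing in for what a term range does.

```java
// Hypothetical helper: zero-pads non-negative numbers so string order matches numeric order.
class PaddedNumbers {

    /** Pads value with leading zeros to the given width, e.g. pad(5, 4) gives "0005". */
    static String pad(int value, int width) {
        if (value < 0) {
            throw new IllegalArgumentException("negative values are not handled in this sketch");
        }
        return String.format("%0" + width + "d", value);
    }

    /** Inclusive lexicographic range check, mimicking a comparison over indexed terms. */
    static boolean inRange(String term, String lower, String upper) {
        return term.compareTo(lower) >= 0 && term.compareTo(upper) <= 0;
    }

    public static void main(String[] args) {
        System.out.println(inRange("10", "5", "20"));                      // false
        System.out.println(inRange(pad(10, 4), pad(5, 4), pad(20, 4)));    // true
        System.out.println(inRange(pad(2, 4), pad(5, 4), pad(20, 4)));     // false
    }
}
```

The width has to be chosen large enough for the biggest value ever indexed, and the same padding must be applied to the bounds of the query.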