Re: Index and Field.Text

2003-12-05 Thread Tatu Saloranta
On Friday 05 December 2003 10:45, Doug Cutting wrote: > Tatu Saloranta wrote: > > Also, shouldn't there be at least 3 methods that take Readers; one for > > Text-like handling, another for UnStored, and last for UnIndexed. > > How do you store the contents of a Reader? You'd have to double-buffer

write.lock

2003-12-05 Thread Aaron Galea
Hi, I am starting to get an error about a write.lock in Lucene when creating an index in an empty directory. It used to work fine before, but now the error has started to occur and as far as I know I didn't touch anything. Printing out the stack trace from the exception thrown, I get the following: java.io

Re: Returning one result

2003-12-05 Thread Dror Matalon
On Fri, Dec 05, 2003 at 05:28:54PM -0500, Erik Hatcher wrote: > On Friday, December 5, 2003, at 04:28 PM, Dror Matalon wrote: > >Then I'm out of ideas. The next thing is for you to post your search > >code so we can see why it's not searching the field. > > Giving up so easily, Dror?! :)) You'

Re: Returning one result

2003-12-05 Thread Erik Hatcher
On Friday, December 5, 2003, at 04:28 PM, Dror Matalon wrote: Then I'm out of ideas. The next thing is for you to post your search code so we can see why it's not searching the field. Giving up so easily, Dror?! :)) The problem is, when using any type of QueryParser with a Keyword field, you h

Re: Returning one result

2003-12-05 Thread Dror Matalon
Then I'm out of ideas. The next thing is for you to post your search code so we can see why it's not searching the field. On Fri, Dec 05, 2003 at 03:34:38PM -0500, Pleasant, Tracy wrote: > Yes it is in the list of arrays that I want searched. > > -Original Message- > From: Dror Matalon [

RE: Probabilistic Model in Lucene - possible?

2003-12-05 Thread ambiesense
I guess you mean "Modern Information Retrieval" ... I would be a little bit careful since this book has theoretical glasses on. It might look more difficult than expected. However I would like to discuss this further. How could it be achieved to get the values you are writing about? Any first idea

RE: Returning one result

2003-12-05 Thread Pleasant, Tracy
Yes it is in the list of arrays that I want searched. -Original Message- From: Dror Matalon [mailto:[EMAIL PROTECTED] Sent: Friday, December 05, 2003 3:32 PM To: Lucene Users List Subject: Re: Returning one result On Fri, Dec 05, 2003 at 03:14:08PM -0500, Pleasant, Tracy wrote: > What d

Re: Returning one result

2003-12-05 Thread Dror Matalon
Mike, Boy, I said it so badly and yet you understood :-). Dror On Fri, Dec 05, 2003 at 03:31:15PM -0500, Michael Giles wrote: > Tracy, > > I believe what Dror was referring to was the call to > MultiFieldQueryParser.parse(). The second argument to that call is a > String[] of field names on

Re: Returning one result

2003-12-05 Thread Dror Matalon
On Fri, Dec 05, 2003 at 03:14:08PM -0500, Pleasant, Tracy wrote: > What do you mean 'add' in MultiFieldQueryParser? I am using all the > fields Sorry, that was wrong. What I meant to say is are you adding the field to the array of fields that need to be searched? You need to use a MultiFieldQ

RE: Returning one result

2003-12-05 Thread Michael Giles
Tracy, I believe what Dror was referring to was the call to MultiFieldQueryParser.parse(). The second argument to that call is a String[] of field names on which to execute the query. If the field that contains "AR345" isn't listed in that array, you will not get any results. -Mike At 03:14

RE: Returning one result

2003-12-05 Thread Pleasant, Tracy
What do you mean 'add' in MultiFieldQueryParser? I am using all the fields. When I index, it does add(Field.Keyword(..,..)). But I don't want the user to have to type ID: It would be nice to just type the ID number. On your site if you just put: 11183 in the search box there are no results. wel

Re: Returning one result

2003-12-05 Thread Dror Matalon
On Fri, Dec 05, 2003 at 02:45:34PM -0500, Pleasant, Tracy wrote: > Maybe we are having some communication issues. > > At any rate, I did index it as a KEYWORD and when displaying used the > TermQuery. > > The only problem with this though is by storing the ID (i.e. AR345) as a > Keyword, if I se

RE: Returning one result

2003-12-05 Thread Pleasant, Tracy
Thanks, but using it as a Keyword, it will not get returned with my search results when I use MultiFieldQueryParser. If I could I would use just parse(query) but that is not a static method, only parse(query,field,analyzer) is... So when I do that and use an analyzer, the keyword field isn't searc

RE: Returning one result

2003-12-05 Thread Pleasant, Tracy
Maybe we are having some communication issues. At any rate, I did index it as a KEYWORD and when displaying used the TermQuery. The only problem with this though is by storing the ID (i.e. AR345) as a Keyword, if I search for AR345 no results are returned when I use the MultiFieldQueryParser .

Re: Returning one result

2003-12-05 Thread Dror Matalon
On Fri, Dec 05, 2003 at 01:25:23PM -0500, Pleasant, Tracy wrote: > What I meant is. > > Say ID is Ar3453 .. well the user may want to search for Ar3453, so in > order for it to be searchable then it would have to be indexed and not a > keyword. No. You should store it as a keyword. From the ja

Re: Returning one result

2003-12-05 Thread Erik Hatcher
On Friday, December 5, 2003, at 01:25 PM, Pleasant, Tracy wrote: Say ID is Ar3453 .. well the user may want to search for Ar3453, so in order for it to be searchable then it would have to be indexed and not a keyword. *arg* - we're having a serious communication issue here. My advice to you is

RE: How would you delete an entry that was indexed like this

2003-12-05 Thread Aviran
This is kind of a problem: in order to delete documents using terms you need to have a keyword field which contains a unique value, otherwise you might end up deleting more than you want. -Original Message- From: Mike Hogan [mailto:[EMAIL PROTECTED] Sent: Friday, December 05, 2003 1:06 PM
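The point about unique keyword values can be illustrated with a toy model in plain Java. This is only a sketch and does not use Lucene: the class `DeleteByTermSketch` and its methods are hypothetical names, and the "index" is just a list of term sets. It assumes a tokenized field contributes one term per word, while a keyword-style field contributes the whole value as a single term.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy stand-in for an index (hypothetical, not Lucene): each document is the set of its terms.
public class DeleteByTermSketch {
    private final List<Set<String>> docs = new ArrayList<>();

    /** Adds a document whose value is kept as one untokenized term (keyword-style). */
    public void addKeyword(String value) {
        docs.add(Collections.singleton(value));
    }

    /** Adds a document whose value is lower-cased and split on whitespace (tokenized-style). */
    public void addTokenized(String value) {
        docs.add(new HashSet<>(Arrays.asList(value.toLowerCase().split("\\s+"))));
    }

    /** Removes every document that contains the term and returns how many were removed. */
    public int deleteByTerm(String term) {
        int before = docs.size();
        docs.removeIf(terms -> terms.contains(term));
        return before - docs.size();
    }

    public static void main(String[] args) {
        DeleteByTermSketch tokenized = new DeleteByTermSketch();
        tokenized.addTokenized("login component");
        tokenized.addTokenized("search component");
        // The shared word matches both documents, so both are removed.
        System.out.println("tokenized ids, deleted: " + tokenized.deleteByTerm("component"));

        DeleteByTermSketch keyword = new DeleteByTermSketch();
        keyword.addKeyword("login component");
        keyword.addKeyword("search component");
        // The whole value is the term, so only the one matching document is removed.
        System.out.println("keyword ids, deleted: " + keyword.deleteByTerm("login component"));
    }
}
```

In this model the delete is only as precise as the term it is given, which is why a unique, untokenized identifier is the safer thing to delete by.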

Re: implementing a TokenFilter for aliases

2003-12-05 Thread Doug Cutting
Position increments are for relative token positions. A position increment of zero means that a token is logically at the same position as the previous token. A position increment of one means that a token immediately follows the preceding token in the stream, it's the next token to the right
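The description of position increments above can be sketched in plain Java. This is a minimal illustration, not Lucene code: the class and method names are hypothetical, and numbering positions from 0 is an assumption made only for the example. The token values are borrowed from the alias discussion in this thread, with "splitline" given an increment of zero so that it shares a position with "spitline".

```java
// Hypothetical helper (not part of Lucene): turns relative position
// increments into absolute token positions.
public class PositionIncrementSketch {

    /** Returns the absolute position of each token, given its position increment. */
    public static int[] absolutePositions(int[] increments) {
        int[] positions = new int[increments.length];
        int position = -1; // assumption: a first increment of 1 lands on position 0
        for (int i = 0; i < increments.length; i++) {
            position += increments[i]; // an increment of 0 keeps the previous position
            positions[i] = position;
        }
        return positions;
    }

    public static void main(String[] args) {
        String[] terms = {"missing", "hardware", "spitline", "splitline"};
        int[] increments = {1, 1, 1, 0}; // the alias is stacked on the token before it
        int[] positions = absolutePositions(increments);
        for (int i = 0; i < terms.length; i++) {
            System.out.println(terms[i] + " -> position " + positions[i]);
        }
    }
}
```

With these increments the alias and the original token end up at the same position, so a phrase search would treat either spelling as occupying that slot.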

RE: Returning one result

2003-12-05 Thread Pleasant, Tracy
Also what I am indexing is not a bunch of separate documents - otherwise it would be easy to simply have a field called "url" and then the link would go directly to that document. However, there is a text URL with many records. During indexing, a function parses each record and puts each into a docu

RE: implementing a TokenFilter for aliases

2003-12-05 Thread Allen Atamer
173 is the ID field from a database (which we use as a primary key). For Lucene's purpose, it only stores the field, and does not index it. The place where I put the print statements is before the actual filtering. The goal of the AliasFilter is to replace spitline. The debug line is in the Tokeni

RE: Returning one result

2003-12-05 Thread Pleasant, Tracy
What I meant is. Say ID is Ar3453 .. well the user may want to search for Ar3453, so in order for it to be searchable then it would have to be indexed and not a keyword. So after using TermQuery query = new TermQuery(new Term("id", term)); How would I return the other fields in the document? Fo

Re: Testing for Optimization

2003-12-05 Thread Doug Cutting
jt oob wrote: Can I safely delete those files which do not have the prefix listed in the segments file? Have a look at the index file format documentation: http://jakarta.apache.org/lucene/docs/fileformats.html The only file besides segments that should exist is the "deleteable" file, and the

How would you delete an entry that was indexed like this

2003-12-05 Thread Mike Hogan
Hi, If I index a document like this: IndexWriter writer = createWriter(); Document document = new Document(); document.add(Field.Text(ID_FIELD_NAME, componentId)); document.add(Field.Text(CONTENTS_FIELD_NAME, componentDescription)); writer.addDocument(document); writer.optimize(); writer.close();

Re: Index and Field.Text

2003-12-05 Thread Doug Cutting
Tatu Saloranta wrote: Also, shouldn't there be at least 3 methods that take Readers; one for Text-like handling, another for UnStored, and last for UnIndexed. How do you store the contents of a Reader? You'd have to double-buffer it, first reading it into a String to store, and then tokenizing t
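The double-buffering described above can be sketched in plain Java. This is an illustration only, with hypothetical class and method names and no Lucene classes: the Reader is first drained into a String (the copy that could be stored), and a second Reader is then opened over that String for whatever tokenizer comes next.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical sketch of "double-buffering" a Reader: a Reader can only be
// consumed once, so storing its text means copying it into memory first.
public class DoubleBufferSketch {

    /** Drains the reader into a String so the full text is available for storage. */
    public static String readFully(Reader reader) {
        StringBuilder buffer = new StringBuilder();
        char[] chunk = new char[4096];
        try {
            int n;
            while ((n = reader.read(chunk)) != -1) {
                buffer.append(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        Reader original = new StringReader("some document text");
        String stored = readFully(original);            // first buffer: the text to store
        Reader forTokenizer = new StringReader(stored); // second pass: input for tokenizing
        System.out.println("buffered " + stored.length() + " characters");
        System.out.println("first char re-read: " + readFully(forTokenizer).charAt(0));
    }
}
```

The cost is that the whole content sits in memory at once, which is the overhead a Reader-based field would otherwise avoid.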

Re: implementing a TokenFilter for aliases

2003-12-05 Thread Erik Hatcher
On Friday, December 5, 2003, at 11:59 AM, Allen Atamer wrote: Below are the results of a debug run on the piece of text that I want aliased. The token "spitline" must be recognized as "splitline" i.e. when I do a search for "splitline", this record will come up. 1: [173] , start:1, end:2 1: [mis

RE: implementing a TokenFilter for aliases

2003-12-05 Thread Allen Atamer
Erik, Below are the results of a debug run on the piece of text that I want aliased. The token "spitline" must be recognized as "splitline" i.e. when I do a search for "splitline", this record will come up. 1: [173] , start:1, end:2 1: [missing] , start:1, end:6 2: [hardware] , start:9, end:7 3:

Re: Returning one result

2003-12-05 Thread Erik Hatcher
On Friday, December 5, 2003, at 10:41 AM, Pleasant, Tracy wrote: Maybe I should have been more clear. static Field Keyword(String name, String value) Constructs a String-valued Field that is not tokenized, but is indexed and stored. I need to have it tokenized because people will sear

Re: Index and Field.Text

2003-12-05 Thread Tatu Saloranta
On Friday 05 December 2003 08:22, Erik Hatcher wrote: > On Friday, December 5, 2003, at 09:48 AM, Grant Ingersoll wrote: ... > > Field.Text(String, String) instead of the Field.Text(String, Reader) > > version, which means I am storing the contents in the index. > > So use Field.UnStored(String, S

Lucene 1.2 "Hit Highlighting"

2003-12-05 Thread Kenneth Campbell
Can someone point me in the right direction with regards to "Hit Highlighting"? I have seen what Mark Harwood has done and I like it, however I am using Lucene 1.2. Are there any compatibility issues? If not, any suggestions about implementation would be helpful. If yes, are there any suggestions fo

Re: Returning one result

2003-12-05 Thread Erik Hatcher
On Friday, December 5, 2003, at 10:31 AM, Pleasant, Tracy wrote: Ok thanks, but still I can't use the Simple analyzer since it won't even index that whole thing. I'll give TermQuery a try. Thanks. Yes, certainly the analyzer is important for "analyzed" fields, but it is not used for Field.Key

RE: Returning one result

2003-12-05 Thread Pleasant, Tracy
Maybe I should have been more clear. static Field Keyword(String name, String value) Constructs a String-valued Field that is not tokenized, but is indexed and stored. I need to have it tokenized because people will search for that also and it needs to be searchable. Should I have t

RE: Returning one result

2003-12-05 Thread Pleasant, Tracy
Actually Erik, no I'm using Field.Text When I used Field.Keyword and tried to get the word for return with search results it would not display correctly... -Original Message- From: Erik Hatcher [mailto:[EMAIL PROTECTED] Sent: Thursday, December 04, 2003 6:18 PM To: Lucene Users List Subje

RE: Returning one result

2003-12-05 Thread Pleasant, Tracy
Ok thanks, but still I can't use the Simple analyzer since it won't even index that whole thing. I'll give TermQuery a try. Thanks. -Original Message- From: Erik Hatcher [mailto:[EMAIL PROTECTED] Sent: Thursday, December 04, 2003 6:18 PM To: Lucene Users List Subject: Re: Returning one re

Re: Index and Field.Text

2003-12-05 Thread Erik Hatcher
On Friday, December 5, 2003, at 09:48 AM, Grant Ingersoll wrote: I have seen the example SAX based XML processing in the Lucene sandbox (thanks to the authors for contributing!) and have successfully adapted this approach for my application. The one thing that does not sit well with me is the

class definition used in Lucene

2003-12-05 Thread Shengli.Wu
Hi, I have problems understanding some class definitions in Lucene (see the end of this e-mail for the source code). A class "FilterIndexReader" is defined at 1. Then "FilterTermDocs" is defined as a nested static class at 2. At 3, public FilterTermDocs(TermDocs in) is a constructo

RE: Index and Field.Text

2003-12-05 Thread Chong, Herb
you are storing the same information both ways. the string gets analyzed and discarded, just like with the Reader. Herb... -Original Message- From: Grant Ingersoll [mailto:[EMAIL PROTECTED] Sent: Friday, December 05, 2003 9:49 AM To: [EMAIL PROTECTED] Subject: Index and Field.Text Hi,

Index and Field.Text

2003-12-05 Thread Grant Ingersoll
Hi, I have seen the example SAX based XML processing in the Lucene sandbox (thanks to the authors for contributing!) and have successfully adapted this approach for my application. The one thing that does not sit well with me is the fact that I am using the method Field.Text(String, String) in

Re: Range Query

2003-12-05 Thread Erik Hatcher
On Friday, December 5, 2003, at 06:54 AM, Ramrakhiani, Vikas wrote: Hi, When I do range query like id:[0* to 9*] the result set excludes documents having id 0, 90 ... i.e. boundary values are excluded. Is it expected or am I going wrong somewhere. It is expected. You're thinking that wildcards wo
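One way to see why ids 0 and 90 fall outside the range is to assume that the endpoints "0*" and "9*" are taken as literal strings and compared with ordinary string ordering, with no wildcard expansion. That reading is an assumption here (the reply above is cut off), and the class below is a hypothetical plain-Java sketch rather than Lucene code.

```java
// Hypothetical sketch: inclusive range check using plain String ordering,
// treating "*" as an ordinary character rather than a wildcard.
public class RangeBoundsSketch {

    /** True if lower <= term <= upper under String.compareTo ordering. */
    public static boolean inRange(String term, String lower, String upper) {
        return term.compareTo(lower) >= 0 && term.compareTo(upper) <= 0;
    }

    public static void main(String[] args) {
        String[] ids = {"0", "5", "89", "9", "90"};
        for (String id : ids) {
            System.out.println(id + " in [0* TO 9*]: " + inRange(id, "0*", "9*"));
        }
    }
}
```

Under this ordering "0" sorts before "0*" because it is a shorter prefix of it, and "90" sorts after "9*" because the character '0' has a higher code than '*', so both land outside the bounds while values such as "5" and "89" stay inside.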

RE: Probabilistic Model in Lucene - possible?

2003-12-05 Thread Chong, Herb
anyone interested, contact me offline. whoever contacts me by the end of next week, i'll email an outline of the derivation and we can discuss it in private emails. i guarantee, you will learn something interesting about search engines. Herb -Original Message- From: Adam Saltiel [ma

RE: Probabilistic Model in Lucene - possible?

2003-12-05 Thread Shengli.Wu
Dear all, I am interested in implementing a probabilistic model in Lucene as well. I checked the book titled "model information retrieval" authored by Ricardo Baeza-Yates and Berthier Ribeiro-Neto, it seems to me that the implementation is not very complicated when we use Lucene's IndexReader clas

Range Query

2003-12-05 Thread Ramrakhiani, Vikas
Hi, When I do range query like id:[0* to 9*] the result set excludes documents having id 0, 90 ... i.e. boundary values are excluded. Is it expected or am I going wrong somewhere. thanks, vikas.

Re: Testing for Optimization

2003-12-05 Thread jt oob
--- Dror Matalon <[EMAIL PROTECTED]> wrote: > I believe that indexes that are optimized have only one segment. So > in > theory you could check and see that you only have one file with a > ".fdt", ".fdx", etc. If I run `cat /index_dir/segments` on an optimized index there is only one string in t

RE: Probabilistic Model in Lucene - possible?

2003-12-05 Thread Adam Saltiel
Herb, Anyone game ... ? No takers? I would be very interested, but maybe beyond what can be posted in a mail list. I'd be equally interested in any references you may have. As we are on this subject, how do LSI and the similar CNG (context network graph) fit into the model used by Lucene. Could l