RE: Query question
Otis,

Are you referring to this: "How do I retrieve all the values of a particular field that exists within an index, across all documents?" I need a query to do it; the only way clients access the index is via queries, so they cannot write the code in the FAQ entry above.

Thanks,
Rob

-Original Message-
From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]
Sent: Wednesday, September 10, 2003 5:05 PM
To: Lucene Users List
Subject: Re: Query question

Go to the Lucene FAQ at jGuru.com and search for the word 'all'.

Otis

--- Rob Outar [EMAIL PROTECTED] wrote:

> Hi all, I have a field called "echelon" that is assigned to certain files.
> Is there a query I can write that will give me all files that have this
> field? I have tried things like echelon:.+* and echelon:*; some give a
> query parser exception while others return nothing.
>
> Let me know,
> Rob

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
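For anyone who can run code against the index (which, as noted above, the clients here cannot), the documents carrying a given field can be collected by walking the term dictionary rather than by query syntax. A minimal sketch against the Lucene 1.x-era `IndexReader` API, with the field name taken from the question (class and variable names are illustrative):

```java
import java.io.IOException;
import java.util.BitSet;

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.TermDocs;
import org.apache.lucene.index.TermEnum;

// Collect the ids of all documents that have at least one term in the
// given field, by enumerating that field's terms and their postings.
public class FieldScan {
    public static BitSet docsWithField(IndexReader reader, String field)
            throws IOException {
        BitSet result = new BitSet(reader.maxDoc());
        // terms(Term) positions the enum at the first term >= (field, "")
        TermEnum terms = reader.terms(new Term(field, ""));
        try {
            while (terms.term() != null && terms.term().field().equals(field)) {
                TermDocs docs = reader.termDocs(terms.term());
                while (docs.next()) {
                    result.set(docs.doc());
                }
                docs.close();
                if (!terms.next()) break;   // leave the loop at end of dictionary
            }
        } finally {
            terms.close();
        }
        return result;
    }
}
```

This only helps on the server/indexing side; it does not answer the question of doing the same thing purely through query syntax.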
Query question
Hi all,

I have a field called "echelon" that is assigned to certain files. Is there a query I can write that will give me all files that have this field? I have tried things like echelon:.+* and echelon:*; some give a query parser exception while others return nothing.

Let me know,
Rob
Checkpointable Index
Hi all,

We have a sandboxed file system which Lucene indexes. Periodically we dump the file system to disk (checkpoint it); can a Lucene index be checkpointed, then restored and used? Currently we simply rebuild the index, since that only takes a few minutes, but we would like the user to be able to take a snapshot of that file system and restore and use it without rebuilding the index.

Let me know. Thanks,
Rob
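For what it's worth, a Lucene index is just a flat directory of files ("segments", "deletable", "_xxx.*"), so a checkpoint can be taken by copying that directory while no IndexWriter has it open, and restored by copying it back. A minimal sketch of the copy step (plain java.io, not Lucene API; class name is illustrative):

```java
import java.io.*;

// Copy every regular file from one directory to another. Applied to a
// quiescent Lucene index directory, this captures a restorable snapshot.
public class IndexSnapshot {
    public static void copyDir(File src, File dst) throws IOException {
        if (!dst.exists() && !dst.mkdirs()) {
            throw new IOException("cannot create " + dst);
        }
        File[] files = src.listFiles();
        if (files == null) throw new IOException("not a directory: " + src);
        for (int i = 0; i < files.length; i++) {
            if (!files[i].isFile()) continue; // a flat index dir has no subdirs
            copyFile(files[i], new File(dst, files[i].getName()));
        }
    }

    private static void copyFile(File src, File dst) throws IOException {
        InputStream in = new FileInputStream(src);
        try {
            OutputStream out = new FileOutputStream(dst);
            try {
                byte[] buf = new byte[8192];
                int n;
                while ((n = in.read(buf)) > 0) out.write(buf, 0, n);
            } finally {
                out.close();
            }
        } finally {
            in.close();
        }
    }
}
```

The important caveat is that the copy must happen while the index is quiescent; copying while a writer is merging segments can capture an inconsistent set of files.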
RE: Luke - Lucene Index Browser
Luke looks pretty slick. I was wondering how difficult it would be to add code to add fields, update fields, etc. I have written something similar to Luke, but next month I need to add graphical support to update, remove, and add fields.

Thanks,
Rob

-Original Message-
From: Andrzej Bialecki [mailto:[EMAIL PROTECTED]
Sent: Monday, July 14, 2003 11:48 AM
To: Lucene Users List
Subject: Luke - Lucene Index Browser

Dear Lucene Users,

Luke is a diagnostic tool for Lucene (http://jakarta.apache.org/lucene) indexes. It enables you to browse documents in existing indexes, perform queries, navigate through terms, optimize indexes and more. Please go to http://www.getopt.org/luke and give it a try. A Java WebStart version will be available soon.

--
Best regards,
Andrzej Bialecki

Software Architect, System Integration Specialist
CEN/ISSS EC Workshop, ECIMF project chair
EU FP6 E-Commerce Expert/Evaluator
FreeBSD developer (http://www.freebsd.org)
java.io.IOException: Cannot delete deletetable
Hi all,

I am intermittently getting the above exception while building an index. I have been trying for an hour to reproduce it, but can't as of yet. In any case, I was wondering if anyone knew anything about the above error and, if so, how to stop it from occurring. In the stack trace I printed out, it looked like the exception occurred in the rename method of FSDirectory. As soon as I can replicate it, I will post the exception and any additional information requested.

Thanks as always,
Rob
RE: java.io.IOException: Cannot delete deletetable
We use Windows and Linux, but I have only seen this error on Windows so far. I will check the jar file I am using to make sure it is the most recent; I am assuming the most recent is Lucene 1.3 RC1?

Thanks,
Rob

-Original Message-
From: Matt Tucker [mailto:[EMAIL PROTECTED]
Sent: Thursday, June 19, 2003 2:03 PM
To: Lucene Users List
Subject: Re: java.io.IOException: Cannot delete deletetable

Rob,

Are you using the very latest Lucene code? The standard File.renameTo operation fails every once in a while, especially on Windows. I sent in a patch that was put in somewhat recently. It fixed all the errors we were seeing with renames.

Regards,
Matt
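For readers hitting the same symptom on an older jar, the shape of the workaround is to retry the rename briefly before giving up, since File.renameTo can fail transiently on Windows while another thread or the OS still holds a handle. A standalone sketch of that idea (illustrative only, not the actual Lucene patch):

```java
import java.io.File;

// Retry a File.renameTo a few times with a short pause between attempts,
// to ride out transient failures seen on Windows file systems.
public class RetryRename {
    public static boolean renameWithRetry(File from, File to, int attempts) {
        for (int i = 0; i < attempts; i++) {
            if (from.renameTo(to)) return true;
            try {
                Thread.sleep(100); // give the OS a moment to release handles
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }
}
```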
RE: java.lang.IllegalArgumentException: attempt to access a deleted document
I added the following code:

    for (int i = 0; i < numOfDocs; i++) {
        if (!reader.isDeleted(i)) {
            doc = reader.document(i);
            docs[i] = doc.get(SearchEngineConstants.REPOSITORY_PATH);
        }
    }
    return docs;

but it never goes into the if statement; for every value of i, isDeleted(i) is returning true?!? Am I doing something wrong? I was trying to do what Doug outlined below.

Thanks,
Rob

-Original Message-
From: Doug Cutting [mailto:[EMAIL PROTECTED]
Sent: Wednesday, June 04, 2003 12:34 PM
To: Lucene Users List
Subject: Re: java.lang.IllegalArgumentException: attempt to access a deleted document

Rob Outar wrote:

    public synchronized String[] getDocuments() throws IOException {
        IndexReader reader = null;
        try {
            reader = IndexReader.open(this.indexLocation);
            int numOfDocs = reader.numDocs();
            String[] docs = new String[numOfDocs];
            Document doc = null;
            for (int i = 0; i < numOfDocs; i++) {
                doc = reader.document(i);
                docs[i] = doc.get(SearchEngineConstants.REPOSITORY_PATH);
            }
            return docs;
        } finally {
            if (reader != null) {
                reader.close();
            }
        }
    }

The limit of your iteration should be IndexReader.maxDoc(), not IndexReader.numDocs():
http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/index/IndexReader.html#maxDoc()

Also, you should first check that each document is not deleted before calling IndexReader.document(int):
http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/index/IndexReader.html#isDeleted(int)

Doug
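Putting Doug's two corrections together, the loop should bound on maxDoc() and skip deleted slots; collecting into a list also avoids the null gaps a fixed-size array would have. A sketch, assuming the same reader and constants as the code above:

```java
// Iterate all live documents: maxDoc() is the upper bound on document
// numbers, and slots belonging to deleted documents must be skipped.
int maxDoc = reader.maxDoc();
List paths = new ArrayList();
for (int i = 0; i < maxDoc; i++) {
    if (reader.isDeleted(i)) {
        continue; // deleted slot; document(i) would throw here
    }
    Document doc = reader.document(i);
    paths.add(doc.get(SearchEngineConstants.REPOSITORY_PATH));
}
String[] docs = (String[]) paths.toArray(new String[paths.size()]);
```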
Analyzer Incorrect?
Hi all,

Sorry for the flood of questions this week; clients finally started using the search engine I wrote, which uses Lucene. When I first started developing with Lucene, the Analyzers it came with did some odd things, so I decided to implement my own, but it is not working the way I expect it to. First and foremost, I would like to have case-insensitive searches, and I do not want to tokenize the fields. No field will ever have a space in it, so there is no need to tokenize. I came up with this Analyzer, but case still seems to be an issue:

    public TokenStream tokenStream(String field, final Reader reader) {
        // do not tokenize any field
        TokenStream t = new CharTokenizer(reader) {
            protected boolean isTokenChar(char c) {
                return true;
            }
        };
        // case-insensitive search
        t = new LowerCaseFilter(t);
        return t;
    }

Is there anything I am doing wrong in the Analyzer I have written?

Thanks,
Rob
RE: Analyzer Incorrect?
Yeah, it has been a bad week. I don't think QueryParser is lowercasing my fields; maybe it is something I am doing wrong:

    public synchronized String[] queryIndex(String query)
            throws ParseException, IOException {
        checkForIndexChange();
        QueryParser p = new QueryParser("", new RepositoryIndexAnalyzer());
        this.query = p.parse(query);
        Hits hits = this.searcher.search(this.query);
        return buildReturnArray(hits);
    }

When I create the QueryParser I do not want it to have a default field, since clients can query on whatever field they want. I use my Analyzer, which I do not think is lowercasing the fields, because I have tested querying with all lowercase (got results) and with mixed case (no results), so I think my code or my analyzer is hosed.

Thanks,
Rob

-Original Message-
From: Tatu Saloranta [mailto:[EMAIL PROTECTED]
Sent: Friday, April 04, 2003 9:09 AM
To: Lucene Users List
Subject: Re: Analyzer Incorrect?

On Friday 04 April 2003 05:24, Rob Outar wrote:

> Hi all, Sorry for the flood of questions this week, clients finally
> started using the search engine I wrote which uses Lucene. When I first
> started

Yup... that's the root of all evil. :-) (I'm in a similar situation, going through user acceptance test as we speak... and getting ready to do a second version that'll have more advanced metadata-based search using Lucene.)

> developing with Lucene the Analyzers it came with did some odd things so
> I decided to implement my own but it is not working the way I expect it
> to. First and foremost I would like to have case insensitive searches and
> I do not want to tokenize the fields. No field will ever have a space

If you don't need to tokenize a field, you don't need an analyzer either. However, to get case-insensitive search, you should lower-case field contents before adding them to the document. QueryParser will do lower-casing for search terms automatically (if you are using it), so matching should work fine then.
-+ Tatu +-
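Tatu's last point is the practical fix: for untokenized fields, skip analysis and lower-case the value once, at index time. A sketch against the Lucene 1.x API (field name, value, and writer are illustrative):

```java
// Keyword fields are stored and indexed without tokenization, so the
// only normalization needed is lower-casing the value ourselves; the
// query side then matches because QueryParser lower-cases terms.
Document doc = new Document();
doc.add(Field.Keyword("echelon", value.toLowerCase()));
writer.addDocument(doc);
```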
RE: Indexing Growth
Would there be any abnormal effects if, after adding a document, you called optimize()? I am still seeing large growth from setting a field. When I set a field I:

1. Get the document.
2. Remove the field.
3. Write the document to the index.
4. Get the document again.
5. Add the new field object.
6. Write the document to the index.
7. Call optimize().

From writing out my steps, it looks like I should write a set method instead of treating set as removeField() plus addField(); I thought combining those two would equal set, which it does, but it seems horribly inefficient. In any case, would the above cause the index to grow from, say, 10.5 megs to 31 megs? Is there an efficient way to implement a set? For example, if there were a field/value pair of book/hamlet, but now we wanted to set book = none? Please keep in mind there could be multiple fields named book, so it is not simply a matter of removing the field book and re-adding it. Anyhow, let me know your thoughts.

Thanks,
Rob

-Original Message-
From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]
Sent: Wednesday, April 02, 2003 11:35 AM
To: Lucene Users List
Subject: RE: Indexing Growth

Funny how this is the outcome of 90% of the problems people have with software - their own mistakes :) Regarding reindexing - no need for any explicit calls. When you add a document to the index it is indexed right away. You will have to detect index change (methods for that are there) and re-open the IndexSearcher in order to see newly added/indexed documents.

Otis

--- Rob Outar [EMAIL PROTECTED] wrote:

> I found the freakin' problem; I am going to kill my co-worker when he
> gets in. He was removing a field and adding the same field back for each
> document in the index, in a piece of code I did not notice until now.
> He is so dead. I commented out that piece of code, queried to my heart's
> content, and the index has not changed. Heck, the tool is like super
> fast now. One last concern is about the re-indexing thing: when does
> that occur? optimize()?
I am curious what method would cause a reindex. I want to thank all of you for your help; it was truly appreciated!

Thanks,
Rob
RE: Indexing Growth
I took out the optimize() after the write and the index is growing, but at like a 1 KB rate; now there are tons of 1 KB files. I assume optimize() would fix this? What is a good rule of thumb for calling optimize()? Will Lucene ever invoke an optimize() on its own?

Thanks,

Rob Outar
OneSAF AI -- SAIC
Software/Data Engineer
321-235-7660
[EMAIL PROTECTED]
Querying Question
Hi all,

I am a little fuzzy on complex querying using AND, OR, etc. For example, I have the following name/value pairs:

file 1 = name = checkpoint, value = filename_1
file 2 = name = checkpoint, value = filename_2
file 3 = name = checkpoint, value = filename_3
file 4 = name = checkpoint, value = filename_4

I ran the following query:

    name:"checkpoint" AND value:"filenane_1"

Instead of getting back file 1, I got back all four files? Then, after trying different things, I did:

    +(name:"checkpoint") AND +(value:"filenane_1")

and it then returned file 1. Our project queries solely on name/value pairs, and we need the ability to query using AND, OR, NOT, etc. What is the correct syntax for such queries? The code I use is:

    QueryParser p = new QueryParser("", new RepositoryIndexAnalyzer());
    this.query = p.parse(query.toLowerCase());
    Hits hits = this.searcher.search(this.query);

Thanks as always,
Rob
RE: Querying Question
RepositoryIndexAnalyzer:

    /**
     * Creates a TokenStream which tokenizes all the text in the provided Reader.
     * Default implementation forwards to tokenStream(Reader) for compatibility
     * with older versions. Override to allow the Analyzer to choose a strategy
     * based on document and/or field.
     * @param field is the name of the field
     * @param reader is the data
     * @return a token stream
     * @build 10
     */
    public TokenStream tokenStream(String field, final Reader reader) {
        // do not tokenize any field
        TokenStream t = new CharTokenizer(reader) {
            protected boolean isTokenChar(char c) {
                return true;
            }
        };
        // case-insensitive search
        t = new LowerCaseFilter(t);
        return t;
    }

but earlier when I did a query, case became an issue. I am not sure why, as the analyzer should have lowercased the token, but it did not.

Thanks,
Rob

-Original Message-
From: Eric Isakson [mailto:[EMAIL PROTECTED]
Sent: Thursday, April 03, 2003 5:23 PM
To: Lucene Users List
Subject: RE: Querying Question

Your query.toLowerCase() lowercased your query to become:

    name:"checkpoint" and value:"filenane_1"

The keyword AND must be uppercase when the query parser gets hold of it. If your RepositoryIndexAnalyzer lowercases its tokens, you don't need to do query.toLowerCase(). If it doesn't lowercase its tokens, you may want to modify it so that it does.

Eric
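Since the parser's operators are case-sensitive, one way to sidestep toLowerCase() on the whole query string is to build the boolean query programmatically and lower-case only the terms. A sketch against the Lucene 1.x BooleanQuery API (the field names match the example above; the required/prohibited flags correspond to + and -):

```java
// add(query, required, prohibited): both clauses required,
// i.e. the same as +name:checkpoint +value:filename_1
BooleanQuery q = new BooleanQuery();
q.add(new TermQuery(new Term("name", "checkpoint".toLowerCase())), true, false);
q.add(new TermQuery(new Term("value", "filename_1".toLowerCase())), true, false);
Hits hits = searcher.search(q);
```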
RE: Indexing Growth
Hi all,

This is too odd and I do not even know where to start. We built a Windows Explorer-type tool that indexes all files in a sandboxed file system. Each Lucene document contains attributes like path, parent directory, last modified date, file_lock, etc. When we display the files in a given directory through the tool, we query the index about 5 times for each file in the repository, so we can display all the attributes in the index for that file. So, for example, if there are 5 files in the directory and each file has 6 attributes, about 30 term queries are executed. The initial index when built is about 10.4 megs; after accessing about 3 or 4 directories, the index size increased to over 100 megs, and we did not add anything!! All we are doing is querying!! Yesterday, after querying became ungodly slow, we looked at the index size: it had grown from 10 megs to 1.5 GB (granted, we tested the tool all morning). But I have no idea why the index is growing like this. ANY help would be greatly appreciated.

Thanks,
Rob

-Original Message-
From: Rob Outar [mailto:[EMAIL PROTECTED]
Sent: Tuesday, April 01, 2003 3:32 PM
To: Lucene Users List; [EMAIL PROTECTED]
Subject: RE: Indexing Growth

I reuse the same searcher, analyzer and Query object; I don't think that should cause the problem.

Thanks,
Rob

-Original Message-
From: Alex Murzaku [mailto:[EMAIL PROTECTED]
Sent: Tuesday, April 01, 2003 3:22 PM
To: 'Lucene Users List'
Subject: RE: Indexing Growth

I don't know if I remember this correctly: I think a file is created for every query (term), but the file should disappear after the query is completed.

-Original Message-
From: Rob Outar [mailto:[EMAIL PROTECTED]
Sent: Tuesday, April 01, 2003 3:13 PM
To: Lucene Users List
Subject: RE: Indexing Growth

Dang, I must be doing something crazy, because all my client app does is search and the index size increases. I do not add anything.

Thanks,
Rob

-Original Message-
From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]
Sent: Tuesday, April 01, 2003 3:07 PM
To: Lucene Users List
Subject: Re: Indexing Growth

Only when you add new documents to it.

Otis

--- Rob Outar [EMAIL PROTECTED] wrote:

> Hi all, Will the index grow based on queries alone? I build my index,
> then run several queries against it, and afterwards I check the size of
> the index; in some cases it has grown quite a bit although I did not add
> anything??? Anyhow, please let me know the cases when the index will
> grow.
>
> Thanks,
> Rob
RE: Indexing Growth
Additional info on the problem: the index contains several 1 KB files, and several files that have different names but the same file size. It looks like the files that comprise the index are being duplicated, causing the index to become huge.

Thanks,
Rob
RE: Indexing Growth
Just about everything calls getValue:

    public synchronized String getValue(String key, File file)
            throws ParseException, IOException {
        Document doc = getDocument(file);
        return doc.get(key.toLowerCase());
    }

which calls getDocument:

    private synchronized Document getDocument(File file)
            throws MalformedURLException, IOException {
        checkForIndexChange();
        Term t = new Term(PATH, file.toURI().toString().toLowerCase());
        TermQuery tQ = new TermQuery(t);
        Hits hits = this.searcher.search(tQ);
        if (hits.length() == 1) {
            return hits.doc(0);
        }
        // this should never happen; a URL cannot return 2 hits --
        // that would mean the same file has been indexed twice
        else {
            return null;
        }
    }

Thanks,
Rob

-Original Message-
From: Michael Barry [mailto:[EMAIL PROTECTED]
Sent: Wednesday, April 02, 2003 9:20 AM
To: Lucene Users List
Subject: Re: Indexing Growth

Sounds like you either have an indexer that's run amok (maybe a background process that's continually re-indexing your sandbox, or expanding outside your sandbox) or your query code is doing more than querying. It's not behaviour I've seen. Without a snippet of query code, it's going to be hard to help.
RE: Indexing Growth
After building the index for the first time:

_l1d.f1 _l1d.f3 _l1d.f5 _l1d.f7 _l1d.f9 _l1d.fdx _l1d.frq _l1d.tii deletable _l1d.f2 _l1d.f4 _l1d.f6 _l1d.f8 _l1d.fdt _l1d.fnm _l1d.prx _l1d.tis segments

After running the first query to get all attributes from all files in the given directory (there were 17 files, each file has 5 attributes, so 85 queries were run):

_l1j.f1 _l1p.f9 _l21.f3 _l27.fdx _l2j.f5 _l2p.prx _l31.f7 _l3j.f1 _l3p.f9 _l41.f3 _l44.fdx _l1j.f2 _l1p.fdt _l21.f4 _l27.frq _l2j.f6 _l2p.tis _l31.f8 _l3j.f2 _l3p.fdt _l41.f4 _l44.frq _l1j.f3 _l1p.fdx _l21.f5 _l27.prx _l2j.f7 _l2v.f1 _l31.f9 _l3j.f3 _l3p.fdx _l41.f5 _l44.prx _l1j.f4 _l1p.frq _l21.f6 _l27.tis _l2j.f8 _l2v.f2 _l31.fdt _l3j.f4 _l3p.frq _l41.f6 _l44.tis _l1j.f5 _l1p.prx _l21.f7 _l2d.f1 _l2j.f9 _l2v.f3 _l31.fdx _l3j.f5 _l3p.prx _l41.f7 _l47.f1 _l1j.f6 _l1p.tis _l21.f8 _l2d.f2 _l2j.fdt _l2v.f4 _l31.frq _l3j.f6 _l3p.tis _l41.f8 _l47.f2 _l1j.f7 _l1v.f1 _l21.f9 _l2d.f3 _l2j.fdx _l2v.f5 _l31.prx _l3j.f7 _l3v.f1 _l41.f9 _l47.f3 _l1j.f8 _l1v.f2 _l21.fdt _l2d.f4 _l2j.frq _l2v.f6 _l31.tis _l3j.f8 _l3v.f2 _l41.fdt _l47.f4 _l1j.f9 _l1v.f3 _l21.fdx _l2d.f5 _l2j.prx _l2v.f7 _l37.f1 _l3j.f9 _l3v.f3 _l41.fdx _l47.f5 _l1j.fdt _l1v.f4 _l21.frq _l2d.f6 _l2j.tis _l2v.f8 _l37.f2 _l3j.fdt _l3v.f4 _l41.frq _l47.f6 _l1j.fdx _l1v.f5 _l21.prx _l2d.f7 _l2p.f1 _l2v.f9 _l37.f3 _l3j.fdx _l3v.f5 _l41.prx _l47.f7 _l1j.frq _l1v.f6 _l21.tis _l2d.f8 _l2p.f2 _l2v.fdt _l37.f4 _l3j.frq _l3v.f6 _l41.tis _l47.f8 _l1j.prx _l1v.f7 _l27.f1 _l2d.f9 _l2p.f3 _l2v.fdx _l37.f5 _l3j.prx _l3v.f7 _l44.f1 _l47.f9 _l1j.tis _l1v.f8 _l27.f2 _l2d.fdt _l2p.f4 _l2v.frq _l37.f6 _l3j.tis _l3v.f8 _l44.f2 _l47.fdt _l1p.f1 _l1v.f9 _l27.f3 _l2d.fdx _l2p.f5 _l2v.prx _l37.f7 _l3p.f1 _l3v.f9 _l44.f3 _l47.fdx _l1p.f2 _l1v.fdt _l27.f4 _l2d.frq _l2p.f6 _l2v.tis _l37.f8 _l3p.f2 _l3v.fdt _l44.f4 _l47.fnm _l1p.f3 _l1v.fdx _l27.f5 _l2d.prx _l2p.f7 _l31.f1 _l37.f9 _l3p.f3 _l3v.fdx _l44.f5 _l47.frq _l1p.f4 _l1v.frq _l27.f6 _l2d.tis _l2p.f8 _l31.f2 _l37.fdt _l3p.f4 _l3v.frq _l44.f6 _l47.prx _l1p.f5 _l1v.prx _l27.f7 _l2j.f1 _l2p.f9 _l31.f3 _l37.fdx _l3p.f5 _l3v.prx _l44.f7 _l47.tii _l1p.f6 _l1v.tis _l27.f8 _l2j.f2 _l2p.fdt _l31.f4 _l37.frq _l3p.f6 _l3v.tis _l44.f8 _l47.tis _l1p.f7 _l21.f1 _l27.f9 _l2j.f3 _l2p.fdx _l31.f5 _l37.prx _l3p.f7 _l41.f1 _l44.f9 deletable _l1p.f8 _l21.f2 _l27.fdt _l2j.f4 _l2p.frq _l31.f6 _l37.tis _l3p.f8 _l41.f2 _l44.fdt segments

I have no reason to add anything to the index; all I want to do is fetch the attributes for the list of files in that directory.

Thanks,
Rob

-Original Message-
From: Ian Lea [mailto:[EMAIL PROTECTED]
Sent: Wednesday, April 02, 2003 9:24 AM
To: Rob Outar
Cc: Lucene Users List
Subject: RE: Indexing Growth

What does the index directory look like before and after running queries? Are files growing or being added? Which files? How many documents are there in the index before and after? Are you absolutely 100% positive there is no way that your application is adding entries to the index? That still has to be the most likely explanation, I think.

--
Ian.
[EMAIL PROTECTED]
Yesterday, after querying became ungodly slow, we looked at the index size: it had grown from 10 megs to 1.5 GB (granted, we tested the tool all morning). But I have no idea why the index is growing like this. ANY help would be greatly appreciated. Thanks, Rob -Original Message- From: Rob Outar [mailto:[EMAIL PROTECTED] Sent: Tuesday, April 01, 2003 3:32 PM To: Lucene Users List; [EMAIL PROTECTED] Subject: RE: Indexing Growth I reuse the same searcher, analyzer and Query object; I don't think that should cause the problem. Thanks, Rob -Original Message- From: Alex Murzaku [mailto:[EMAIL PROTECTED] Sent: Tuesday, April 01, 2003 3:22 PM To: 'Lucene Users List' Subject: RE
RE: Indexing Growth
/**
 * Returns true if the index has changed.
 * @return true iff the index has been changed since the IndexSearcher
 * class was created.
 * @build 10
 */
private synchronized boolean hasIndexChanged() {
    try {
        long temp = IndexReader.lastModified(this.indexLocation);
        return temp > this.lastModified;
    }
    //assume it has changed
    catch (IOException e) {
        return true;
    }
}

/**
 * Checks whether the index has changed since the IndexSearcher was
 * created; if it has, the IndexSearcher is reinitialized.
 * @build 10
 */
private synchronized void checkForIndexChange() {
    try {
        if (hasIndexChanged()) {
            this.searcher = new IndexSearcher(this.indexLocation);
        }
    }
    catch (IOException e) {
    }
}

Thanks, Rob -Original Message- From: Ian Lea [mailto:[EMAIL PROTECTED] Sent: Wednesday, April 02, 2003 10:32 AM To: Rob Outar Cc: Lucene Users List Subject: RE: Indexing Growth They look like the type of file name that would be created when documents were added to the index. So I still think something is adding stuff to your index. Could it be an external process as someone suggested? Does the index grow even if you don't search? In the code you posted, what does checkForIndexChange() do? Yes, I can guess what it is supposed to do, but is it perhaps doing something else as well or instead, directly or indirectly? -- Ian. [EMAIL PROTECTED] (Rob Outar) wrote After building the index for the first time: _l1d.f1 _l1d.f3 _l1d.f5 _l1d.f7 _l1d.f9 _l1d.fdx _l1d.frq _l1d.tii deletable _l1d.f2 _l1d.f4 _l1d.f6 _l1d.f8 _l1d.fdt _l1d.fnm _l1d.prx _l1d.tis segments After running the first query to get all attributes from all files in the given directory (there were 17 files, each file has 5 attributes, so 85 queries were run): _l1j.f1 _l1p.f9 _l21.f3 _l27.fdx _l2j.f5 _l2p.prx _l31.f7 _l3j.f1 _l3p.f9 _l41.f3 _l44.fdx _l1j.f2 _l1p.fdt _l21.f4 _l27.frq _l2j.f6 _l2p.tis _l31.f8 _l3j.f2 _l3p.fdt _l41.f4 _l44.frq ... 
-- Searchable personal storage and archiving from http://www.digimem.net/ - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
RE: Indexing Growth
I found the freakin problem, I am going to kill my co-worker when he gets in. He was removing a field and adding the same field back for each document in the index, in a piece of code I did not notice until now. He is so dead. I commented out that piece of code, queried to my heart's content, and the index has not changed. Heck, the tool is like super fast now. One last concern is about the re-indexing thing: when does that occur? optimize()? I am curious what method would cause a reindex. I want to thank all of you for your help, it was truly appreciated! Thanks, Rob - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Indexing Growth
Hi all, Will the index grow based on queries alone? I build my index, then run several queries against it and afterwards I check the size of the index and in some cases it has grown quite a bit although I did not add anything??? Anyhow please let me know the cases when the index will grow. Thanks, Rob - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
RE: Indexing Growth
Dang I must be doing something crazy cause all my client app does is search and the index size increases. I do not add anything. Thanks, Rob -Original Message- From: Otis Gospodnetic [mailto:[EMAIL PROTECTED] Sent: Tuesday, April 01, 2003 3:07 PM To: Lucene Users List Subject: Re: Indexing Growth Only when you add new documents to it. Otis --- Rob Outar [EMAIL PROTECTED] wrote: Hi all, Will the index grow based on queries alone? I build my index, then run several queries against it and afterwards I check the size of the index and in some cases it has grown quite a bit although I did not add anything??? Anyhow please let me know the cases when the index will grow. Thanks, Rob - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] __ Do you Yahoo!? Yahoo! Tax Center - File online, calculators, forms, and more http://platinum.yahoo.com - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
RE: Indexing Growth
I reuse the same searcher, analyzer and Query object I don't think that should cause the problem. Thanks, Rob -Original Message- From: Alex Murzaku [mailto:[EMAIL PROTECTED] Sent: Tuesday, April 01, 2003 3:22 PM To: 'Lucene Users List' Subject: RE: Indexing Growth I don't know if I remember this correctly: I think for every query (term) is created a file but the file should disappear after the query is completed. -Original Message- From: Rob Outar [mailto:[EMAIL PROTECTED] Sent: Tuesday, April 01, 2003 3:13 PM To: Lucene Users List Subject: RE: Indexing Growth Dang I must be doing something crazy cause all my client app does is search and the index size increases. I do not add anything. Thanks, Rob -Original Message- From: Otis Gospodnetic [mailto:[EMAIL PROTECTED] Sent: Tuesday, April 01, 2003 3:07 PM To: Lucene Users List Subject: Re: Indexing Growth Only when you add new documents to it. Otis --- Rob Outar [EMAIL PROTECTED] wrote: Hi all, Will the index grow based on queries alone? I build my index, then run several queries against it and afterwards I check the size of the index and in some cases it has grown quite a bit although I did not add anything??? Anyhow please let me know the cases when the index will grow. Thanks, Rob - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] __ Do you Yahoo!? Yahoo! Tax Center - File online, calculators, forms, and more http://platinum.yahoo.com - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
write.lock
Hi all, I am experiencing an odd problem where sometimes the write.lock file gets left behind. I have looked over all my code and I close IndexWriter after I use it. I do a lot of batch processing where I write tons of files to the index. Has anyone run across this before? Is IndexWriter the only class that creates the write.lock file? When is that write.lock file deleted? Let me know. Thanks, Rob - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
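[Editor's note] The usual cause of a leftover write.lock is a writer that is not closed on every code path, e.g. when an exception escapes mid-batch; wrapping close() in a finally block guarantees the lock is released. (Note also that in Lucene, IndexWriter is not the only holder of this lock: an IndexReader used to delete documents acquires it too.) The sketch below is a toy stand-in for the lock lifecycle, not Lucene's actual IndexWriter — ToyWriter is a hypothetical class, using only the JDK:

```java
import java.io.File;
import java.io.IOException;

// Toy stand-in for the write.lock lifecycle described in the thread: the
// lock file is created when the writer opens and must be deleted on close.
// If close() is skipped on an exception path, the lock is left behind.
public class WriteLockSketch {
    static class ToyWriter implements AutoCloseable {
        private final File lock;
        ToyWriter(File indexDir) throws IOException {
            lock = new File(indexDir, "write.lock");
            if (!lock.createNewFile()) throw new IOException("index locked");
        }
        void addDocument(String doc) { /* index it */ }
        @Override public void close() { lock.delete(); } // releases the lock
    }

    public static void main(String[] args) throws IOException {
        File dir = new File(System.getProperty("java.io.tmpdir"));
        new File(dir, "write.lock").delete(); // clear any stale lock first
        ToyWriter w = new ToyWriter(dir);
        try {
            w.addDocument("batch item");
        } finally {
            w.close(); // runs even if addDocument throws mid-batch
        }
    }
}
```

The try/finally (or try-with-resources in later Java) is the whole fix: no matter which document in the batch throws, the lock file is removed.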
Quick Question On Adding Fields
What happens if I add the same name/value pair to a Lucene Document? Does it override it? Does it append it so you have duplicates? Let me know, Rob - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
RE: Quick Question On Adding Fields
I ran a little test where I did: doc.add(new Field(name,value)); doc.add(new Field(name,value)); Then got a list of the fields for that doc and sure enough it is in there twice. So it appends the value to the field, even if the value already exists. Thanks, Rob -Original Message- From: David Spencer [mailto:[EMAIL PROTECTED] Sent: Thursday, March 20, 2003 8:53 AM To: Lucene Users List Subject: Re: Quick Question On Adding Fields Rob Outar wrote: What happens if I add the same name/value pair to a Lucene Document? Does it override it? Does it append it so you have duplicates? I believe it 'appends' in the sense that if you add 2 fields with the same name then the Document has the union of the content of both fields added, and then you can search on anything in either or both of the field values you added. One use case is if you're indexing html and you want a field for the title, a field for the body, and an easy way for users to refer to both the title and the body in a query. So when you add a Field for the title named title, you also add one with a name like contents, and then you add a field for the body named body, and then you pass the same data and add another field named contents. Then, voila, a search on contents:foo returns matches against the title and the body. Let me know, Rob - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
RE: Searching for hyphenated terms
I had similar problems that were solved with this Analyzer: public TokenStream tokenStream(String field, final Reader reader) { // do not tokenize any field TokenStream t = new CharTokenizer(reader) { protected boolean isTokenChar(char c) { return true; } }; //case insensitive search t = new LowerCaseFilter(t); return t; } Thanks, Rob -Original Message- From: Otis Gospodnetic [mailto:[EMAIL PROTECTED] Sent: Thursday, March 13, 2003 11:22 AM To: Lucene Users List Subject: Re: Searching for hyphenated terms Make a custom Analyzer. They are super simple to write. Take pieces of WhitespaceAnalyzer and the Standard one. Otis --- Sieretzki, Dionne R, SOLGV [EMAIL PROTECTED] wrote: I have seen some previous postings about Escape woes and Hyphens not matching, but I haven't seen any resolutions to an issue I've been trying to work out. I don't want my search field to be case sensitive, so I used StandardAnalyzer. The search field also has corresponding entries that may or may not contain hyphens or other special characters. If the field is not tokenized, very few search terms result in matches. It appears that terms are only matched if a wildcard is used, such as: Entered: ADOG / Actual Query is: adog / No match on an exact term Entered: ADOG* / Actual Query is: ADOG* / Match found Entered: AAA-ADOG / Actual Query is: aaa -adog / No match Entered: AAA-ADOG / Actual Query is: aaa adog / No match Entered: AAA?ADOG / Actual Query is: aaa?adog / Match found Entered: DOG.2 / Actual Query is: dog.2 / No match Entered: DOG?2 / Actual Query is: DOG?2 / Match found If the field is tokenized, then even more mixed results are produced. 
Entered: ADOG / Actual Query is: adog / Match found for exact term Entered: ADOG* / Acutal Query is: ADOG* / No match Entered: AAA-ADOG / Actual Query is: aaa -adog / Match found Entered: AAA-ADOG / Actual Query is: aaa adog / Match found Entered: DOG.2 / Actual Query is: adog.2 / Match found Entered: AAA-DOG-BBB / Actual Query is: aaa -dog -bbb / No match Entered: AAA-DOG-BBB / Actual Query is: aaa dog bbb / No match Entered: ADOG-I40 / Actual Query is: adog -i40 / Incorrect matches Entered: ADOG-I40 / Actual Query is: adog-i40 / Match found for exact term Can anyone recommend the right Analyzer to use that isn't case sensitive and matches on both hyphenated and non-hyphenated terms? - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] __ Do you Yahoo!? Yahoo! Web Hosting - establish your business online http://webhosting.yahoo.com - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
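[Editor's note] Rob's keep-everything, lowercase-only analyzer can be modeled without Lucene at all. The toy below (plain Java, hypothetical class and method names) shows why hyphenated terms match under that scheme: the entire field value becomes a single lowercased token, so hyphens and dots survive intact and only case is folded:

```java
import java.util.Locale;

// Toy model of the "do not tokenize, just lowercase" analyzer from the
// thread: the whole field value becomes a single lowercased token, so
// hyphens and dots survive intact and only case is folded.
public class WholeStringAnalyzerSketch {
    // Hypothetical helper, not a Lucene API: emulates a CharTokenizer whose
    // isTokenChar() always returns true, followed by a LowerCaseFilter.
    static String analyze(String fieldValue) {
        return fieldValue.toLowerCase(Locale.ROOT);
    }

    // A term query under this scheme is an exact match on the single token.
    static boolean matches(String indexedValue, String queryTerm) {
        return analyze(indexedValue).equals(analyze(queryTerm));
    }

    public static void main(String[] args) {
        // "AAA-ADOG" is indexed as one token "aaa-adog", so the hyphenated
        // query matches regardless of case...
        System.out.println(matches("AAA-ADOG", "aaa-adog")); // true
        // ...and StandardAnalyzer-style splitting on '-' never happens.
        System.out.println(matches("DOG.2", "dog.2"));       // true
        System.out.println(matches("AAA-ADOG", "adog"));     // false
    }
}
```

The trade-off, implied by the thread, is that partial-word matches are gone: only whole-value terms (or wildcards) hit, because there is exactly one token per field.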
RE: OutOfMemoryException while Indexing an XML file
We are aware of DOM limitations/memory problems, but I am using SAX to parse the file and index elements and attributes in my content handler. Thanks, Rob -Original Message- From: Tatu Saloranta [mailto:[EMAIL PROTECTED]] Sent: Friday, February 14, 2003 8:18 PM To: Lucene Users List Subject: Re: OutOfMemoryException while Indexing an XML file On Friday 14 February 2003 07:27, Aaron Galea wrote: I had this problem when using xerces to parse xml documents. The problem I think lies in the Java garbage collector. The way I solved it was to create It's unlikely that GC is the culprit. Current ones are good at purging objects that are unreachable, and only throw OutOfMem exception when they really have no other choice. Usually it's the app that has some dangling references to objects that prevent GC from collecting objects not useful any more. However, it's good to note that Xerces (and DOM parsers in general) generally use more memory than the input XML files they process; this because they usually have to keep the whole document struct in memory, and there is overhead on top of text segments. So it's likely to be at least 2 * input file size (files usually use UTF-8 which most of the time uses 1 byte per char; in memory 16-bit unicode-2 chars are used for performance), plus some additional overhead for storing element structure information and all that. And since default max. java heap size is 64 megs, big XML files can cause problems. More likely however is that references to already processed DOM trees are not nulled in a loop or something like that? Especially if doing one JVM process for item solves the problem. a shell script that invokes a java program for each xml file that adds it to the index. -+ Tatu +- - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
RE: OutOfMemoryException while Indexing an XML file
So to the best of your knowledge the Lucene Document object should not cause the exception even though the XML file is huge and 1000's of fields are being added to the Lucene Document object? Thanks, Rob -Original Message- From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]] Sent: Friday, February 14, 2003 8:21 AM To: Lucene Users List Subject: Re: OutOfMemoryException while Indexing an XML file Nothing in the code snippet you sent would cause that exception. If I were you I'd run it under a profiler to quickly see where the leak is. You can even use something free like JMP. Otis --- Rob Outar [EMAIL PROTECTED] wrote: Hi all, I was using the sample code provided I believe by Doug Cutting to index an XML file; the XML file was 2 megs (kinda large), but while adding fields to the Document object I got an OutOfMemoryException. I work with XML files a lot; I can easily parse that 2 meg file into a DOM tree, and I can't imagine a Lucene document being larger than a DOM tree. Pasted below is the SAX handler.

public class XMLDocumentBuilder extends DefaultHandler {

    /** A buffer for each XML element */
    private StringBuffer elementBuffer = new StringBuffer();
    private Document mDocument;

    public void buildDocument(Document doc, String xmlFile) throws IOException, SAXException {
        this.mDocument = doc;
        SAXReader.parse(xmlFile, this);
    }

    public void startElement(String uri, String localName, String qName, Attributes atts) {
        elementBuffer.setLength(0);
        if (atts != null) {
            for (int i = 0; i < atts.getLength(); i++) {
                String attname = atts.getLocalName(i);
                mDocument.add(new Field(attname, atts.getValue(i), true, true, true));
            }
        }
    }

    // called when cdata found
    public void characters(char[] text, int start, int length) {
        elementBuffer.append(text, start, length);
    }

    public void endElement(String uri, String localName, String qName) {
        mDocument.add(Field.Text(localName, elementBuffer.toString()));
    }

    public Document getDocument() {
        return mDocument;
    }
}

Any help would be appreciated. 
Thanks, Rob - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] __ Do you Yahoo!? Yahoo! Shopping - Send Flowers for Valentine's Day http://shopping.yahoo.com - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
write.lock file
Hello all, This is the first time I have encountered this in 3 months of testing, the above file got created, not sure how or when, but every time I try to write to the index I get an IOException about the indexing being locked. It is obviously due to that file but what would cause that lock to get created and not removed? Let me know. Thanks, Rob -- To unsubscribe, e-mail: mailto:[EMAIL PROTECTED] For additional commands, e-mail: mailto:[EMAIL PROTECTED]
RE: Searches are not case insensitive
From briefly looking at the code it looks like the field does not get touched it seems like the only part that gets converted to lower case is the value, so I am assuming that the field name is case sensitive but the value is not? Thanks, Rob -Original Message- From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]] Sent: Monday, November 25, 2002 8:25 AM To: Lucene Users List Subject: Re: Searches are not case insensitive Why not add print statements to your analyzer to ensure that what you think is happening really is happening? Token has an attribute called 'text' that you could print, I believe. Otis --- Rob Outar [EMAIL PROTECTED] wrote: Hello all, I created the following analyzer so that clients could pose case insensitive searches but queries are still case sensitive: // do not tokenize any field TokenStream t = new CharTokenizer(reader) { protected boolean isTokenChar(char c) { return true; } }; //case insensitive search t = new LowerCaseFilter(t); return t; } I use that index when I create a new instance of IndexWriter and when I use QueryPaser, I am not sure why my searches are still case dependent. Any help would be appreciated. Thanks, Rob -- To unsubscribe, e-mail: mailto:[EMAIL PROTECTED] For additional commands, e-mail: mailto:[EMAIL PROTECTED] __ Do you Yahoo!? Yahoo! Mail Plus Powerful. Affordable. Sign up now. http://mailplus.yahoo.com -- To unsubscribe, e-mail: mailto:[EMAIL PROTECTED] For additional commands, e-mail: mailto:[EMAIL PROTECTED] -- To unsubscribe, e-mail: mailto:[EMAIL PROTECTED] For additional commands, e-mail: mailto:[EMAIL PROTECTED]
RE: Slash Problem
I don't know if this helps but I had exact same problem, I then stored the URI instead of the path, I was then able to search on the URI. Thanks, Rob -Original Message- From: Terry Steichen [mailto:[EMAIL PROTECTED]] Sent: Monday, November 25, 2002 11:53 AM To: Lucene Users Group Subject: Slash Problem I've got a Text field (tokenized, indexed, stored) called 'path' which contains a string in the form of '1102\A3345-12RT.XML'. When I submit a query like path:1102* it works fine. But, when I try to be more specific (such as path:1102\a* or path:1102*a*) it fails. I've tried escaping the slash (path:1102\\a*) but that also fails. I'm using the StandardAnalyzer and the default QueryParser. Could anyone suggest what's going wrong here? Regards, Terry -- To unsubscribe, e-mail: mailto:[EMAIL PROTECTED] For additional commands, e-mail: mailto:[EMAIL PROTECTED]
How does delete work?
Hello all, I used the delete(Term) method, then I looked at the index files; only one file changed: _1tx.del. I found references to the file still in some of the index files, so my question is: how does Lucene handle deletes? Thanks, Rob -- To unsubscribe, e-mail: mailto:[EMAIL PROTECTED] For additional commands, e-mail: mailto:[EMAIL PROTECTED]
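[Editor's note] What Rob observed is how Lucene deletes work: delete(Term) only records tombstone bits in a per-segment .del bit vector (hence the lone _1tx.del change), and the deleted documents' data stays on disk until segments are merged or optimize() rewrites them. The toy below models that two-phase scheme with only the JDK (TombstoneDeleteSketch is a hypothetical class, not Lucene code):

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Toy model of tombstone-style deletion: a delete only flips a bit in a
// side structure (playing the role of the .del file); the document's data
// stays in place until a compaction (Lucene: merge/optimize) rewrites the
// storage without the tombstoned docs.
public class TombstoneDeleteSketch {
    private final List<String> docs = new ArrayList<>();
    private final BitSet deleted = new BitSet(); // stands in for the .del bit vector

    int add(String doc) { docs.add(doc); return docs.size() - 1; }

    void delete(int docId) { deleted.set(docId); } // cheap: no data rewritten

    boolean isLive(int docId) { return !deleted.get(docId); }

    // Searches must skip tombstoned docs even though their bytes are present.
    List<String> liveDocs() {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < docs.size(); i++)
            if (isLive(i)) out.add(docs.get(i));
        return out;
    }

    // Analogue of optimize(): rewrite storage without the deleted docs.
    void compact() {
        List<String> live = liveDocs();
        docs.clear();
        docs.addAll(live);
        deleted.clear();
    }

    public static void main(String[] args) {
        TombstoneDeleteSketch idx = new TombstoneDeleteSketch();
        int a = idx.add("doc-a");
        idx.add("doc-b");
        idx.delete(a); // only a tombstone bit flips; "doc-a" bytes still stored
        System.out.println(idx.liveDocs()); // [doc-b]
    }
}
```

This is why references to a deleted document can still be found in the other index files: they are dead postings that every search skips, reclaimed only at merge time.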
Updating documents
I have something odd going on: I have code that updates documents in the index, so I have to delete the document and then re-add it. When I re-add the document I immediately do a search on the newly added field, which fails. However, if I rerun the query a second time it works?? I have the Searcher class as an attribute of my search class; does it not see the new changes? Seems like when it is reinitialized with the changed index it is then able to search on the newly added field?? Let me know if anyone has encountered this. Thanks, Rob -- To unsubscribe, e-mail: mailto:[EMAIL PROTECTED] For additional commands, e-mail: mailto:[EMAIL PROTECTED]
RE: Updating documents
There is a reloading issue but I do not think lastModified is it: static long lastModified(Directory directory) Returns the time the index in this directory was last modified. static long lastModified(File directory) Returns the time the index in the named directory was last modified. static long lastModified(String directory) Returns the time the index in the named directory was last modified. Do I need to create a new instance of IndexSearcher each time I search? Thanks, Rob -Original Message- From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]] Sent: Friday, November 22, 2002 12:20 PM To: Lucene Users List Subject: Re: Updating documents Don't you have to make use of lastModified method (I think in IndexSearcher), to 'reload' your instance of IndexSearcher? I'm pulling this from some old, not very fresh memory Otis --- Rob Outar [EMAIL PROTECTED] wrote: I have something odd going on, I have code that updates documents in the index so I have to delete it and then re add it. When I re-add the document I immediately do a search on the newly added field which fails. However, if I rerun the query a second time it works?? I have the Searcher class as an attribute of my search class, does it not see the new changes? Seems like when it is reinitialized with the changed index it is then able to search on the newly added field?? Let me know if anyone has encountered this. Thanks, Rob -- To unsubscribe, e-mail: mailto:[EMAIL PROTECTED] For additional commands, e-mail: mailto:[EMAIL PROTECTED] __ Do you Yahoo!? Yahoo! Mail Plus Powerful. Affordable. Sign up now. http://mailplus.yahoo.com -- To unsubscribe, e-mail: mailto:[EMAIL PROTECTED] For additional commands, e-mail: mailto:[EMAIL PROTECTED] -- To unsubscribe, e-mail: mailto:[EMAIL PROTECTED] For additional commands, e-mail: mailto:[EMAIL PROTECTED]
A little date help
Hello all, I am indexing the date using the java.io.file.lastModified() method doc.add(new Field(MODIFIED_DT, DateField.timeToString(f.lastModified()), true, true, true)); I am trying to search on this field, but I am having a hard time formatting the date correctly. I am not sure what date format lastModified() uses so trying to come up with a query in milliseconds for the above date field is difficult. Has anyone run into this problem? Is there an easier way to do this? Let me know, Rob -- To unsubscribe, e-mail: mailto:[EMAIL PROTECTED] For additional commands, e-mail: mailto:[EMAIL PROTECTED]
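[Editor's note] File.lastModified() returns plain milliseconds since the epoch, and DateField.timeToString() encodes that number as a fixed-width string whose lexicographic order matches chronological order — so the easiest fix is to run the query-side date through the very same DateField.timeToString() conversion instead of formatting it by hand. The standalone sketch below illustrates the encoding idea only (base-36 with zero padding; the width here is an assumption, not DateField's exact format):

```java
// Sketch of the sortable-timestamp encoding idea behind DateField: render
// the millisecond value in a fixed width so plain string comparison of the
// indexed terms matches time order. Not Lucene's exact format.
public class SortableTimeSketch {
    static final int WIDTH = 9; // hypothetical fixed width, enough for ms timestamps

    static String timeToString(long millis) {
        String s = Long.toString(millis, Character.MAX_RADIX); // base 36
        StringBuilder b = new StringBuilder();
        for (int i = s.length(); i < WIDTH; i++) b.append('0'); // zero-pad
        return b.append(s).toString();
    }

    public static void main(String[] args) {
        long earlier = 1_000_000_000L;
        long later = 2_000_000_000L;
        // Lexicographic order of the encoded terms matches time order,
        // which is what makes term/range comparisons on the field work.
        System.out.println(timeToString(earlier).compareTo(timeToString(later)) < 0); // true
    }
}
```

The practical takeaway for the thread: never hand-format the query value; encode the query bound with the same function used at index time and the two strings become directly comparable.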
RE: Searching with Multiple Queries
Has anyone gotten a chance to review the below to make sure I am not doing something crazy. Thanks, Rob -Original Message- From: Rob Outar [mailto:[EMAIL PROTECTED]] Sent: Friday, November 15, 2002 12:59 PM To: Lucene Users List Subject: RE: Searching with Multiple Queries I did this and it works now I need you guys, the experts :-) to let me know if I am doing something terribly wrong: Analyzer: public TokenStream tokenStream(String field, final Reader reader) { // do not tokenize any field return new CharTokenizer(reader) { protected boolean isTokenChar(char c) { return true; } }; } Query: releaseability:US Gov only the above returns hits. Let me know. Thanks, Rob -Original Message- From: Aaron Galea [mailto:[EMAIL PROTECTED]] Sent: Friday, November 15, 2002 10:53 AM To: Lucene Users List Subject: Re: Searching with Multiple Queries Rob I was reading again the mail and I think I didn't reply exactly to your question. In the code sent you can remove completely the StandardTokenizer() or else modify the code from JGuru itself. However I can't really tell you myself the effect this will have on your searches or indexing. Perhaps someone else might... Aaron - Original Message - From: Aaron Galea [EMAIL PROTECTED] To: Lucene Users List [EMAIL PROTECTED] Sent: Friday, November 15, 2002 4:35 PM Subject: Re: Searching with Multiple Queries Hi Rob Here is how I think in my case I will do it but the code is not tested so it might not work: 1. 
Create a filter class class SearcherFilter extends Filter { protected String Directory; public SearcherFilter(String dir) { Directory = dir; } public BitSet bits(IndexReader reader) throws IOException { BitSet bits = new BitSet(reader.maxDoc()); TermDocs termDocs = reader.termDocs(); while (termDocs.next()) { int iDoc = termDocs.doc(); org.apache.lucene.document.Document doc = reader.document(iDoc); Field fldDirectory = doc.getField(Directory); String str = fldDirectory.stringValue(); if (str.startsWith(Directory)){ bits.set(iDoc); } } return bits; } } 2. Create an Anlayzer class class SearcherAnalyzer extends Analyzer { /* * An array containing some common words that * are not usually useful for searching. */ private static final String[] STOP_WORDS = { a , and , are , as , at , be , but , by , for , if , in , into, is , it , no , not , of , on , or , s , such, t , that, the , their , then, there , these , they, this, to , was , will, with }; /* * Stop table */ final static private Hashtable stopTable = StopFilter.makeStopTable(STOP_WORDS); /* * create a token stream for this analyser */ public final TokenStream tokenStream(final Reader reader) { try { TokenStream result = new StandardTokenizer(reader); result = new StandardFilter(result); result = new LowerCaseFilter(result); result = new StopFilter(result,stopTable); result = new PorterStemFilter(result); return result; } catch (Exception e) { return null; } } } 3. In the main code use it this way: IndexSearcher searcher =new IndexSearcher(indexLocation); Query qry = QueryParser.parse(question, body, new SearcherAnalyzer()); Hits hits = searcher.search(qry, new SearcherFilter(directory)); In your case if you do not want for example to use the LetterTokenizer() do not included in the tokenStream method of the Anlayzer. 
Hope this helps, Aaron - Original Message - From: Rob Outar [EMAIL PROTECTED] To: Lucene Users List [EMAIL PROTECTED] Sent: Friday, November 15, 2002 4:13 PM Subject: RE: Searching with Multiple Queries For example JGuru has this: public class MyAnalyzer extends Analyzer { private static final Analyzer STANDARD = new StandardAnalyzer(); public TokenStream tokenStream(String field, final Reader reader) { // do not tokenize field called 'element' if (element.equals(field)) { return new CharTokenizer(reader) { protected boolean isTokenChar(char c) { return true; } }; } else { // use standard analyzer return STANDARD.tokenStream(field, reader); } } } I do not want any of my fields toekenized for now
RE: Not getting any results from query
I did not see where it said that I saw this: 'AND', 'OR', 'NOT', and FieldNames are case sensitive. Terms are case sensitive unless the lower case token filter is used during indexing and search. Field names are case sensitive. Even if it is the query: releaseability:Test R* should be valid. Thanks, Rob -Original Message- From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]] Sent: Monday, November 18, 2002 1:53 PM To: Lucene Users List Subject: RE: Not getting any results from query Aren't wildcards case sensitive? Check the FAQ. Otis --- Rob Outar [EMAIL PROTECTED] wrote: Thanks for all the good information/advice everyone, have one more little thing, below is my analyzer: public TokenStream tokenStream(String field, final Reader reader) { // do not tokenize any field TokenStream t = new CharTokenizer(reader) { protected boolean isTokenChar(char c) { return true; } }; //case insensitive search t = new LowerCaseFilter(t); return t; } Field name = releaseability Value = Test Releaseability; How the field is set up: doc.add(new Field(releaseability, Test Releaseability, true, true, true)); This query works: releaseability:Test* however this one does not: releaseability:Test R* Any ideas why? Thanks, Rob -- To unsubscribe, e-mail: mailto:[EMAIL PROTECTED] For additional commands, e-mail: mailto:[EMAIL PROTECTED] __ Do you Yahoo!? Yahoo! Web Hosting - Let the expert host your site http://webhosting.yahoo.com -- To unsubscribe, e-mail: mailto:[EMAIL PROTECTED] For additional commands, e-mail: mailto:[EMAIL PROTECTED] -- To unsubscribe, e-mail: mailto:[EMAIL PROTECTED] For additional commands, e-mail: mailto:[EMAIL PROTECTED]
RE: Not getting any results from query
Does not work either, I think it has something to do with the space between the two words. This fails test r* but test*r* works. Understanding how the internal of Lucene work is one difficult task but this group does help a lot. Thanks, Rob -Original Message- From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]] Sent: Monday, November 18, 2002 2:52 PM To: Lucene Users List Subject: RE: Not getting any results from query How does releaseability:test r* work? Returns anything? http://www.jguru.com/faq/view.jsp?EID=538312 Otis --- Rob Outar [EMAIL PROTECTED] wrote: I did not see where it said that I saw this: 'AND', 'OR', 'NOT', and FieldNames are case sensitive. Terms are case sensitive unless the lower case token filter is used during indexing and search. Field names are case sensitive. Even if it is the query: releaseability:Test R* should be valid. Thanks, Rob -Original Message- From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]] Sent: Monday, November 18, 2002 1:53 PM To: Lucene Users List Subject: RE: Not getting any results from query Aren't wildcards case sensitive? Check the FAQ. Otis --- Rob Outar [EMAIL PROTECTED] wrote: Thanks for all the good information/advice everyone, have one more little thing, below is my analyzer: public TokenStream tokenStream(String field, final Reader reader) { // do not tokenize any field TokenStream t = new CharTokenizer(reader) { protected boolean isTokenChar(char c) { return true; } }; //case insensitive search t = new LowerCaseFilter(t); return t; } Field name = releaseability Value = Test Releaseability; How the field is set up: doc.add(new Field(releaseability, Test Releaseability, true, true, true)); This query works: releaseability:Test* however this one does not: releaseability:Test R* Any ideas why? Thanks, Rob -- To unsubscribe, e-mail: mailto:[EMAIL PROTECTED] For additional commands, e-mail: mailto:[EMAIL PROTECTED] __ Do you Yahoo!? Yahoo! 
Web Hosting - Let the expert host your site http://webhosting.yahoo.com -- To unsubscribe, e-mail: mailto:[EMAIL PROTECTED] For additional commands, e-mail: mailto:[EMAIL PROTECTED]
RE: Not getting any results from query
I am using the QueryParser class and for that query test r* it is forming a boolean query, not a prefix query. The problem is I allow clients to search on whatever they define, if I knew the fields they were searching on ahead of time then I could use classes that extend Query, but since I do not know I am forced to use QueryParser class. Thanks, Rob -Original Message- From: Rob Outar [mailto:[EMAIL PROTECTED]] Sent: Monday, November 18, 2002 3:03 PM To: Lucene Users List Subject: RE: Not getting any results from query Does not work either, I think it has something to do with the space between the two words. This fails test r* but test*r* works. Understanding how the internal of Lucene work is one difficult task but this group does help a lot. Thanks, Rob -Original Message- From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]] Sent: Monday, November 18, 2002 2:52 PM To: Lucene Users List Subject: RE: Not getting any results from query How does releaseability:test r* work? Returns anything? http://www.jguru.com/faq/view.jsp?EID=538312 Otis --- Rob Outar [EMAIL PROTECTED] wrote: I did not see where it said that I saw this: 'AND', 'OR', 'NOT', and FieldNames are case sensitive. Terms are case sensitive unless the lower case token filter is used during indexing and search. Field names are case sensitive. Even if it is the query: releaseability:Test R* should be valid. Thanks, Rob -Original Message- From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]] Sent: Monday, November 18, 2002 1:53 PM To: Lucene Users List Subject: RE: Not getting any results from query Aren't wildcards case sensitive? Check the FAQ. 
Otis --- Rob Outar [EMAIL PROTECTED] wrote: Thanks for all the good information/advice everyone; I have one more little thing. Below is my analyzer: public TokenStream tokenStream(String field, final Reader reader) { // do not tokenize any field TokenStream t = new CharTokenizer(reader) { protected boolean isTokenChar(char c) { return true; } }; // case-insensitive search t = new LowerCaseFilter(t); return t; } Field name = "releaseability", value = "Test Releaseability". How the field is set up: doc.add(new Field("releaseability", "Test Releaseability", true, true, true)); This query works: releaseability:Test* however this one does not: releaseability:Test R* Any ideas why? Thanks, Rob
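A note on why the space breaks the query: QueryParser's grammar splits Test R* on whitespace into two clauses before any analyzer runs, but the untokenized (and lowercased) field holds the whole value as the single term "test releaseability", which neither clause matches; Test* works because a prefix query can still match that single term. A minimal sketch of a workaround, assuming the 1.2-era Lucene API and the field name from the thread (the Lucene jar must be on the classpath):

```java
import org.apache.lucene.index.Term;
import org.apache.lucene.search.PrefixQuery;
import org.apache.lucene.search.Query;

public class UntokenizedPrefix {
    // Build the PrefixQuery directly so the whitespace never reaches
    // QueryParser's grammar; lowercase the prefix to match the
    // LowerCaseFilter applied at index time.
    public static Query forPrefix(String field, String userPrefix) {
        return new PrefixQuery(new Term(field, userPrefix.toLowerCase()));
    }
}
```

With this, forPrefix("releaseability", "Test R") matches the stored term "test releaseability", since prefix matching is applied to the whole untokenized value.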
RE: Searching with Multiple Queries
I thought this was my problem :-), anyhow can I just write an analyzer that does not tokenize the search string and use it with QueryParser? Thanks, Rob -Original Message- From: Aaron Galea [mailto:agale;nextgen.net.mt] Sent: Friday, November 15, 2002 9:44 AM To: Lucene Users List Subject: Re: Searching with Multiple Queries Ok, I will let you know the result. Thanks, Aaron - Original Message - From: Otis Gospodnetic [EMAIL PROTECTED] To: Lucene Users List [EMAIL PROTECTED] Sent: Friday, November 15, 2002 3:37 PM Subject: Re: Searching with Multiple Queries I say: try it :) Otis --- Aaron Galea [EMAIL PROTECTED] wrote: I am not sure, but I was going to do it by using a QueryParser and creating a filter that iterates over the documents. For each document I check the directory field and use the String.startsWith() function to make it work like a prefix query. The Query and the Filter are then used in the IndexSearcher. I have not tried it yet but I think it will work; what do you say? Thanks Aaron - Original Message - From: Otis Gospodnetic [EMAIL PROTECTED] To: Lucene Users List [EMAIL PROTECTED] Sent: Friday, November 15, 2002 3:06 PM Subject: Re: Searching with Multiple Queries Sounds like 2 queries to me. You could do a prefix AND phrase, but that won't be exactly the same as doing a phrase query on a subset of the results of the prefix query. Otis --- Aaron Galea [EMAIL PROTECTED] wrote: Hi everyone, I have indexed my documents hierarchically by adding a directory field that is indexable but non-tokenized, as suggested in the FAQ. Now I want to do a search first using a prefix query and then apply a phrase query to the returned results. Is this possible? Can it be applied in one go? I am not sure whether MultiFieldQueryParser can be used this way. Any suggestions??? Thanks Aaron
RE: Searching with Multiple Queries
I did this and it works; now I need you guys, the experts :-), to let me know if I am doing something terribly wrong. Analyzer: public TokenStream tokenStream(String field, final Reader reader) { // do not tokenize any field return new CharTokenizer(reader) { protected boolean isTokenChar(char c) { return true; } }; } Query: releaseability:"US Gov" - only the above returns hits. Let me know. Thanks, Rob -Original Message- From: Aaron Galea [mailto:agale;nextgen.net.mt] Sent: Friday, November 15, 2002 10:53 AM To: Lucene Users List Subject: Re: Searching with Multiple Queries Rob, I was reading the mail again and I think I didn't reply exactly to your question. In the code I sent you can remove the StandardTokenizer() completely, or else modify the code from JGuru itself. However, I can't really tell you myself what effect this will have on your searches or indexing. Perhaps someone else might... Aaron - Original Message - From: Aaron Galea [EMAIL PROTECTED] To: Lucene Users List [EMAIL PROTECTED] Sent: Friday, November 15, 2002 4:35 PM Subject: Re: Searching with Multiple Queries Hi Rob, Here is how I think I will do it in my case, but the code is not tested so it might not work: 1. Create a filter class: class SearcherFilter extends Filter { protected String Directory; public SearcherFilter(String dir) { Directory = dir; } public BitSet bits(IndexReader reader) throws IOException { BitSet bits = new BitSet(reader.maxDoc()); TermDocs termDocs = reader.termDocs(); while (termDocs.next()) { int iDoc = termDocs.doc(); org.apache.lucene.document.Document doc = reader.document(iDoc); Field fldDirectory = doc.getField("directory"); String str = fldDirectory.stringValue(); if (str.startsWith(Directory)) { bits.set(iDoc); } } return bits; } } 2. Create an Analyzer class: class SearcherAnalyzer extends Analyzer { /* * An array containing some common words that * are not usually useful for searching. 
*/ private static final String[] STOP_WORDS = { "a", "and", "are", "as", "at", "be", "but", "by", "for", "if", "in", "into", "is", "it", "no", "not", "of", "on", "or", "s", "such", "t", "that", "the", "their", "then", "there", "these", "they", "this", "to", "was", "will", "with" }; /* * Stop table */ final static private Hashtable stopTable = StopFilter.makeStopTable(STOP_WORDS); /* * create a token stream for this analyser */ public final TokenStream tokenStream(final Reader reader) { try { TokenStream result = new StandardTokenizer(reader); result = new StandardFilter(result); result = new LowerCaseFilter(result); result = new StopFilter(result, stopTable); result = new PorterStemFilter(result); return result; } catch (Exception e) { return null; } } } 3. In the main code use it this way: IndexSearcher searcher = new IndexSearcher(indexLocation); Query qry = QueryParser.parse(question, "body", new SearcherAnalyzer()); Hits hits = searcher.search(qry, new SearcherFilter(directory)); In your case, if you do not want to use the LetterTokenizer(), for example, do not include it in the tokenStream method of the Analyzer. Hope this helps, Aaron - Original Message - From: Rob Outar [EMAIL PROTECTED] To: Lucene Users List [EMAIL PROTECTED] Sent: Friday, November 15, 2002 4:13 PM Subject: RE: Searching with Multiple Queries For example, JGuru has this: public class MyAnalyzer extends Analyzer { private static final Analyzer STANDARD = new StandardAnalyzer(); public TokenStream tokenStream(String field, final Reader reader) { // do not tokenize field called 'element' if ("element".equals(field)) { return new CharTokenizer(reader) { protected boolean isTokenChar(char c) { return true; } }; } else { // use standard analyzer return STANDARD.tokenStream(field, reader); } } } I do not want any of my fields tokenized for now, so I was thinking about using the above code with a few slight modifications... 
Thanks, Rob -Original Message- From: Rob Outar [mailto:routar;ideorlando.org] Sent: Friday, November 15, 2002 10:10 AM To: Lucene Users List Subject: RE: Searching
Not getting any results from query
Hello all, I am storing the field in this fashion: doc.add(new Field("releaseability", "Test Releaseability", true, true, false)); so it is indexed and stored but not tokenized. The value is "Test Releaseability"; I am using the query releaseability:"test releaseability" but I am not getting any results. Is my query wrong? Let me know. Thanks, Rob
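The mismatch here: an indexed-but-untokenized field is stored in the index as one exact, case-preserved term ("Test Releaseability"), while QueryParser runs the query text through the analyzer (lowercasing and possibly splitting it), so the two never line up. A sketch of the direct route, assuming the 1.2-era Lucene API and the field/value from this message (requires the Lucene jar):

```java
import org.apache.lucene.index.Term;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TermQuery;

public class ExactMatch {
    // Match an untokenized field with a TermQuery built from the exact,
    // case-sensitive stored value; no analyzer is involved at all.
    public static Hits find(IndexSearcher searcher) throws java.io.IOException {
        Term t = new Term("releaseability", "Test Releaseability");
        return searcher.search(new TermQuery(t));
    }
}
```

The alternative, used later in this thread, is to give QueryParser an analyzer that neither tokenizes nor changes case differently from indexing, so parsed and indexed terms agree.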
RE: Multiple field searches using AND and OR's
Looked at that already; the format is this: public static Query parse(String query, String[] fields, Analyzer analyzer) throws ParseException. Parses a query which searches on the fields specified. If x fields are specified, this effectively constructs: (field1:query) (field2:query) (field3:query)...(fieldx:query) My query value will not be the same. This lets you query multiple fields with the same query; my query string will be different, e.g. f_name = rob and l_name = outar or address = some value, stuff like that. Plus there is no way of specifying ORs and ANDs. Thanks, Rob O -Original Message- From: Kelvin Tan [mailto:kelvin-lists;relevanz.com] Sent: Wednesday, November 13, 2002 9:42 AM To: Lucene Users List Subject: Re: Multiple field searches using AND and OR's Rob, I believe MultiFieldQueryParser will do the job for you... Regards, Kelvin On Wed, 13 Nov 2002 08:58:36 -0500, Rob Outar said: Hello all, I am wondering how I would do multiple field searches of the form: field1 = value and field2 = value2 or field2 = value3 I am thinking that each one of the above would be a term query, but how would I string them together with ANDs and ORs? Any help would be appreciated. Thanks, Rob PS I found this in the FAQ, but I was wondering if there was any other way to do it: My documents have multiple fields, do I have to replicate a query for each of them? Not necessarily. A simple solution is to index the documents using a general field that contains a concatenation of the content of all the searchable fields ('author', 'title', 'body' etc). This way, a simple query will search the entire document content. The disadvantage of this method is that you cannot boost certain fields relative to others. Note also that matches in longer documents result in lower ranking. 
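The AND/OR combination Rob describes can be assembled programmatically instead of through a parser: one TermQuery per field/value pair, nested inside BooleanQuery objects. A sketch assuming the 1.2-era Lucene API, where add() takes (query, required, prohibited) flags; the field names are the examples from the thread (Lucene jar required):

```java
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;

public class MultiFieldBoolean {
    // Builds: (f_name:rob AND l_name:outar) OR address:"some value"
    public static Query build() {
        BooleanQuery and = new BooleanQuery();
        // add(query, required, prohibited): required=true gives AND semantics
        and.add(new TermQuery(new Term("f_name", "rob")), true, false);
        and.add(new TermQuery(new Term("l_name", "outar")), true, false);

        BooleanQuery or = new BooleanQuery();
        // required=false on every clause gives OR semantics
        or.add(and, false, false);
        or.add(new TermQuery(new Term("address", "some value")), false, false);
        return or;
    }
}
```

Each leaf TermQuery must use the term exactly as it was indexed (lowercased here if the indexing analyzer lowercases), since no analyzer runs on programmatically built queries.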
RE: Several fields with the same name
Would the solution be to call Document.fields(), iterate through that enumeration, and get my data? Thanks, Rob -Original Message- From: Rob Outar [mailto:routar;ideorlando.org] Sent: Wednesday, November 06, 2002 2:46 PM To: Lucene Users List Subject: Several fields with the same name Hello all, I have a relationship where one key has many values, basically a 1-to-many relationship. For example, key = name, values = bob, jim, etc. When a client wants all the values that have been associated with the field name, how would I get that? The javadoc for Document.get(String name) states: Returns the string value of the field with the given name if any exist in this document, or null. If multiple fields may exist with this name, this method returns the last one added. I don't need the last field's value, I need all values associated with that field. Any help would be appreciated. Thanks, Rob
RE: Several fields with the same name
Cool, so it will keep getting the last value excluding the one it just fetched? Thanks, Rob -Original Message- From: Otis Gospodnetic [mailto:otis_gospodnetic;yahoo.com] Sent: Wednesday, November 06, 2002 2:57 PM To: Lucene Users List Subject: Re: Several fields with the same name Looking at the source, it looks like you can just call it multiple times until it returns null. Otis --- Rob Outar [EMAIL PROTECTED] wrote: Hello all, I have a relationship where one key has many values, basically a 1-to-many relationship. For example, key = name, values = bob, jim, etc. When a client wants all the values that have been associated with the field name, how would I get that? The javadoc for Document.get(String name) states: Returns the string value of the field with the given name if any exist in this document, or null. If multiple fields may exist with this name, this method returns the last one added. I don't need the last field's value, I need all values associated with that field. Any help would be appreciated. Thanks, Rob
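To be precise about Document.get(String): it returns the last value added for that name every time it is called, so repeated calls do not walk the list. The reliable route is the one Rob suggested, iterating the document's full field enumeration. A sketch assuming the 1.2-era Lucene API, where fields() returns a java.util.Enumeration (later Lucene versions add a Document.getValues(String) convenience for exactly this):

```java
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;

public class MultiValuedField {
    // Collect every value stored under the given field name: the document
    // keeps one Field instance per add() call, so walking fields() and
    // filtering by name recovers all of them, in insertion order.
    public static List getValues(Document doc, String name) {
        List values = new ArrayList();
        for (Enumeration e = doc.fields(); e.hasMoreElements(); ) {
            Field f = (Field) e.nextElement();
            if (f.name().equals(name)) {
                values.add(f.stringValue());
            }
        }
        return values;
    }
}
```

For key = name with values bob and jim added as two fields, getValues(doc, "name") returns both, not just the last one added.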
Lucene and XML
Hello all, I did not know there were packages like ISOGEN that used Lucene to build a searchable index based on XML files. From visiting ISOGEN's website it looks like commercial software; are there any open-source extensions to Lucene that allow XML indexing and searching? Please let me know. Thanks again, Rob
RE: User Base
Is Lucene GPL or Apache? Thanks, Rob -Original Message- From: Craig Walls [mailto:wallsc;michaels.com] Sent: Friday, October 25, 2002 10:32 AM To: Lucene Users List Subject: Re: User Base Absolutely--I'm very aware of how the various OS licenses work and we avoid GPL like the plague. In fact, doing a quick mental inventory of the OS stuff we've used, I believe that all of it has been under the Apache license. We tinkered with an LGPL project once, but never actually used it in production code. Robert A. Decker wrote: Your boss should be very worried about the software being brought into your projects - not because of security but because of viral licenses. The GPL is particularly heinous. The Apache and FreeBSD licenses are excellent. Take a look at: http://www.oreillynet.com/lpt/a//policy/2001/12/12/transition.html http://www.apache.org/foundation/licence-FAQ.html thanks, rob http://www.robdecker.com/ http://www.planetside.com/ On Fri, 25 Oct 2002, Craig Walls wrote: Unofficially, my company is using Lucene for searching for products and projects on our web-site. By unofficially I mean that while my boss knows we're using Lucene, my boss' boss doesn't know, because he's very reluctant to buy into this open-source thing. (We've used other OS projects in our own projects as well... it's been a don't-ask-don't-tell kinda thing.) We launched our new search about 2 weeks ago and it rocks! In the end, we've fully met and in many cases exceeded expectations with Lucene, but they just don't know that we're using Lucene. Rob Outar wrote: All, I am trying to sell my lead on using this awesome search/indexing engine, but he wants to know the user base for this product. He wants to be assured that if we choose this product it will not go away, and that we will have some form of support. Worst case, of course, we do have the source. 
Anyhow, if anyone can let me know what the user base is, or anything that would lead to some assurance for him, I would greatly appreciate it. Thanks, Rob
RE: Error when trying to match file path
Thanks for the reply, but I am already using a standard analyzer: Analyzer analyzer = new StandardAnalyzer(); The path is being stored as a string, so I do not know why I cannot find a match when I use the path as a query. Do I need to phrase the query differently? QueryParser.parse(file.getAbsolutePath(), "path", this.analyzer); Let me know. Thanks, Rob -Original Message- From: Otis Gospodnetic [mailto:otis_gospodnetic;yahoo.com] Sent: Wednesday, October 23, 2002 5:54 PM To: Lucene Users List Subject: Re: Error when trying to match file path http://www.jguru.com/faq/view.jsp?EID=538308 --- Rob Outar [EMAIL PROTECTED] wrote: Hi all, I am indexing the file path with the below: Document doc = new Document(); doc.add(Field.UnIndexed("path", f.getAbsolutePath())); I then try to run the following after building the index: this.query = QueryParser.parse(file.getAbsolutePath(), "path", this.analyzer); Hits hits = this.searcher.search(this.query); It returns zero hits?!? What am I doing wrong? Any help would be appreciated. Thanks, Rob
RE: Error when trying to match file path
Some more information; with the following: this.query = QueryParser.parse(file.getAbsolutePath(), "path", this.analyzer); System.out.println(this.query.toString("path")); I got: F:onesaf dev block b pair dev unittestdatafiles tools unitcomposer.xml So it looks like the QueryParser is stripping out all the \ characters and doing something with the F:\. Would anyone happen to know why this is happening? Do I need to use a different query to get the information I need? Thanks, Rob -Original Message- From: Rob Outar [mailto:routar;ideorlando.org] Sent: Wednesday, October 23, 2002 5:48 PM To: [EMAIL PROTECTED] Subject: Error when trying to match file path Hi all, I am indexing the file path with the below: Document doc = new Document(); doc.add(Field.UnIndexed("path", f.getAbsolutePath())); I then try to run the following after building the index: this.query = QueryParser.parse(file.getAbsolutePath(), "path", this.analyzer); Hits hits = this.searcher.search(this.query); It returns zero hits?!? What am I doing wrong? Any help would be appreciated. Thanks, Rob
RE: Error when trying to match file path
I cannot get this to work for the life of me. I am using a StandardAnalyzer now. Question 1: What field type should the path be? doc.add(Field.UnIndexed("path", f.getAbsolutePath())); where f is a File object. Question 2: What should the query be to retrieve that one file? Term: Term t = new Term("path", file.getAbsolutePath()); TermQuery tQ = new TermQuery(t); System.out.println(tQ.toString("path")); Hits hits = this.searcher.search(tQ); System.out.println(hits.length() + " total matching documents"); or: /* QueryParser parser = new QueryParser(); System.out.println(file.getAbsolutePath()); this.query = parser.parse(file.getAbsolutePath()); System.out.println(this.query.toString("path")); Hits hits = this.searcher.search(this.query); System.out.println(hits.length() + " total matching documents"); Document doc = hits.doc(0); System.out.println("class = " + doc.get("classification")); */ If I can get past this hurdle I will so be on my way. Thanks, Rob -Original Message- From: Otis Gospodnetic [mailto:otis_gospodnetic;yahoo.com] Sent: Thursday, October 24, 2002 10:37 AM To: Lucene Users List Subject: RE: Error when trying to match file path The Analyzer is stripping your \ characters. Query Parser doesn't do that... Otis --- Rob Outar [EMAIL PROTECTED] wrote: Some more information; with the following: this.query = QueryParser.parse(file.getAbsolutePath(), "path", this.analyzer); System.out.println(this.query.toString("path")); I got: F:onesaf dev block b pair dev unittestdatafiles tools unitcomposer.xml So it looks like the QueryParser is stripping out all the \ characters and doing something with the F:\. Would anyone happen to know why this is happening? Do I need to use a different query to get the information I need? 
Thanks, Rob -Original Message- From: Rob Outar [mailto:routar;ideorlando.org] Sent: Wednesday, October 23, 2002 5:48 PM To: [EMAIL PROTECTED] Subject: Error when trying to match file path Hi all, I am indexing the file path with the below: Document doc = new Document(); doc.add(Field.UnIndexed("path", f.getAbsolutePath())); I then try to run the following after building the index: this.query = QueryParser.parse(file.getAbsolutePath(), "path", this.analyzer); Hits hits = this.searcher.search(this.query); It returns zero hits?!? What am I doing wrong? Any help would be appreciated. Thanks, Rob
User Base
All, I am trying to sell my lead on using this awesome search/indexing engine, but he wants to know the user base for this product. He wants to be assured that if we choose this product it will not go away, and that we will have some form of support. Worst case, of course, we do have the source. Anyhow, if anyone can let me know what the user base is, or anything that would lead to some assurance for him, I would greatly appreciate it. Thanks, Rob
Setting fields
Hello, Is there a way to set a field once it has been associated with a document? For example, if I have a field named filename and the file is renamed, I now need to update the filename field with the new name of the file. I did not see any setter methods on Field. The only solution that comes to mind is to fetch the document based on its URL, remove it from the index, then re-add it with the new value. Let me know. Thanks, Rob
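There is indeed no setter: once indexed, a Lucene document is immutable, so an "update" is a delete by a unique key term followed by re-adding the rebuilt document. A sketch assuming the 1.2-era Lucene API and a keyword field named "url" as the unique key (the field names are illustrative; the Lucene jar is required):

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;

public class UpdateFilename {
    public static void rename(String indexDir, String url, String newName)
            throws java.io.IOException {
        // 1. Delete the stale document, located by its unique key term.
        //    Deletes go through IndexReader in this era of the API.
        IndexReader reader = IndexReader.open(indexDir);
        reader.delete(new Term("url", url));
        reader.close();

        // 2. Re-add a rebuilt document (create=false appends to the index).
        IndexWriter writer = new IndexWriter(indexDir, new StandardAnalyzer(),
                                             false);
        Document doc = new Document();
        doc.add(Field.Keyword("url", url));
        doc.add(Field.Text("filename", newName));
        // ... re-add the document's other fields here ...
        writer.addDocument(doc);
        writer.close();
    }
}
```

Note that every field of the document must be re-added, not just the changed one, which is why storing the fields you need to reconstruct the document pays off.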
RE: Error when trying to match file path
I finally got it to work, but I do not understand the solution. Instead of storing the file path (F:\blah\blah\blah.xml), I stored the URL of the file in the field called path; I was then able to use a TermQuery on ("path", URL) to retrieve that one document from the index. Thanks, Rob -Original Message- From: Otis Gospodnetic [mailto:otis_gospodnetic;yahoo.com] Sent: Thursday, October 24, 2002 10:37 AM To: Lucene Users List Subject: RE: Error when trying to match file path The Analyzer is stripping your \ characters. Query Parser doesn't do that... Otis --- Rob Outar [EMAIL PROTECTED] wrote: Some more information; with the following: this.query = QueryParser.parse(file.getAbsolutePath(), "path", this.analyzer); System.out.println(this.query.toString("path")); I got: F:onesaf dev block b pair dev unittestdatafiles tools unitcomposer.xml So it looks like the QueryParser is stripping out all the \ characters and doing something with the F:\. Would anyone happen to know why this is happening? Do I need to use a different query to get the information I need? Thanks, Rob -Original Message- From: Rob Outar [mailto:routar;ideorlando.org] Sent: Wednesday, October 23, 2002 5:48 PM To: [EMAIL PROTECTED] Subject: Error when trying to match file path Hi all, I am indexing the file path with the below: Document doc = new Document(); doc.add(Field.UnIndexed("path", f.getAbsolutePath())); I then try to run the following after building the index: this.query = QueryParser.parse(file.getAbsolutePath(), "path", this.analyzer); Hits hits = this.searcher.search(this.query); It returns zero hits?!? What am I doing wrong? Any help would be appreciated. Thanks, Rob
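The likely explanation for why this works: Field.UnIndexed, used earlier in the thread, stores a value but never indexes it, so no query can ever match it, and running a raw Windows path through QueryParser lets the analyzer eat the backslashes. A field indexed with Field.Keyword is stored and indexed as one unanalyzed term, and a TermQuery built from the same exact string bypasses both the parser and the analyzer; the URL form simply sidesteps backslash escaping entirely. A sketch assuming the 1.2-era Lucene API (Lucene jar required):

```java
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TermQuery;

public class PathLookup {
    // Index time: Field.Keyword stores the value AND indexes it as a
    // single, unanalyzed term (unlike Field.UnIndexed, which is unsearchable).
    public static void addPath(Document doc, java.io.File f)
            throws java.io.IOException {
        doc.add(Field.Keyword("path", f.toURL().toString()));
    }

    // Search time: a TermQuery built from the same exact string; no analyzer
    // runs, so separators and the drive-letter colon survive untouched.
    public static Hits find(IndexSearcher searcher, java.io.File f)
            throws java.io.IOException {
        Term t = new Term("path", f.toURL().toString());
        return searcher.search(new TermQuery(t));
    }
}
```

The same Keyword-plus-TermQuery pattern would have worked with the raw F:\ path as well, as long as the identical string is used at index and search time.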
Error when trying to match file path
Hi all, I am indexing the file path with the below: Document doc = new Document(); doc.add(Field.UnIndexed("path", f.getAbsolutePath())); I then try to run the following after building the index: this.query = QueryParser.parse(file.getAbsolutePath(), "path", this.analyzer); Hits hits = this.searcher.search(this.query); It returns zero hits?!? What am I doing wrong? Any help would be appreciated. Thanks, Rob