xml parsing examples

2002-04-26 Thread Aruna Raghavan

Hi,
I have a couple of examples of parsing .xml files using SAX/DOM, taken from my
code that uses Lucene for indexing. Can I submit these somewhere? Please let me
know.
Aruna.

--
To unsubscribe, e-mail:   mailto:[EMAIL PROTECTED]
For additional commands, e-mail: mailto:[EMAIL PROTECTED]




RE: Removing a write.lock file

2002-04-18 Thread Aruna Raghavan


Hi,
The write.lock file won't be there if you close the index using a lock
mechanism. I use my own RWLock to access the index dir and unlock it after I
close the index. Basically, the access to the index is synchronized. I have
never had any problems with this approach.
Aruna.
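
A minimal sketch of the kind of lock wrapper described above, assuming all index access happens inside one JVM. IndexAccessGuard and its methods are hypothetical names, not part of Lucene, and the JDK's ReentrantReadWriteLock stands in for the custom RWLock mentioned in the message:

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Supplier;

// Hypothetical helper (not part of Lucene): lets readers share the index
// directory and gives each writer exclusive access within a single JVM.
class IndexAccessGuard {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    // Run a write task (open the writer, add or delete, close it) exclusively.
    <T> T write(Supplier<T> task) {
        lock.writeLock().lock();      // acquire before try, so finally only releases a held lock
        try {
            return task.get();
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Run a read task (a search); several readers may run at the same time.
    <T> T read(Supplier<T> task) {
        lock.readLock().lock();
        try {
            return task.get();
        } finally {
            lock.readLock().unlock();
        }
    }

    boolean isWriteLocked() {
        return lock.isWriteLocked();
    }
}
```

The task passed to write() would open the index, make its changes and close the index before returning, so the write.lock file is gone by the time the next writer gets in. This does not protect against a second process touching the same directory.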
-Original Message-
From: suneethad [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, April 17, 2002 11:47 PM
To: Lucene Users List
Subject: Removing a write.lock file


Hi,
I'm currently indexing with multiple access allowed, and I find that a
write.lock file has been created.
I know this is to prevent multiple writers, but how do I continue now?
I do not want to reindex, as I work on a very large database and it takes
a really long time. How do I remove this lock file?

Thanx 4 ur help,
Suneetha.







RE: Removing a write.lock file

2002-04-18 Thread Aruna Raghavan

I don't think it is a good approach to delete the write.lock file by hand.
It is there for a reason. You may want to dig into some of the older
dialogs/e-mails on this topic.

-Original Message-
From: Biswas, Goutam_Kumar [mailto:[EMAIL PROTECTED]]
Sent: Thursday, April 18, 2002 9:53 AM
To: 'Lucene Users List'
Subject: RE: Removing a write.lock file


well suneetha,

   before I write to the index I check whether a write.lock file exists! If
it does I delete it before opening the index. It works fine
for me. 

-Goutam





RE: Removing a write.lock file

2002-04-18 Thread Aruna Raghavan

Sorry, but I would say the same thing. I don't think you are supposed to do it
even programmatically. It is a lock internal to Lucene.

-Original Message-
From: Biswas, Goutam_Kumar [mailto:[EMAIL PROTECTED]]
Sent: Thursday, April 18, 2002 10:25 AM
To: 'Lucene Users List'
Subject: RE: Removing a write.lock file


I'm not removing the write.lock file by hand. I'm doing it inside the code
before opening the index
-Goutam






Search question

2002-04-17 Thread Aruna Raghavan

Hi,
I am looking for ways to cancel a search in response to a cancel from a user
interface. I don't see any thing like a timeout on the Searcher.search()
method. Is there a way to terminate a search request?
Aruna Raghavan
Senior Software Engineer
OPIN Systems SPC
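
One way to approach this outside Lucene, sketched under assumptions: run the search on a worker thread and stop waiting for it when a timeout expires. CancellableSearch and runWithTimeout are hypothetical names, and the task stands in for whatever code calls the searcher. Interrupting the worker only ends the search early if the search code reacts to interruption, which is not confirmed anywhere in this thread.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper: the caller stops waiting after timeoutMillis.
class CancellableSearch {
    // Returns the task's result, or null if it timed out, failed or was interrupted.
    static <T> T runWithTimeout(Callable<T> task, long timeoutMillis) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Future<T> future = pool.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true);                 // interrupt the worker thread
            return null;
        } catch (InterruptedException e) {
            future.cancel(true);
            Thread.currentThread().interrupt();  // preserve the caller's interrupt status
            return null;
        } catch (ExecutionException e) {
            return null;                         // the task itself threw
        } finally {
            pool.shutdownNow();
        }
    }
}
```

For a cancel button rather than a timeout, the user interface would keep the Future and call cancel(true) on it when the user clicks cancel.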





RE: how to decide when the index needs to be optimized ?

2002-04-11 Thread Aruna Raghavan

Hi,
I was using the following settings to analyze our document management system,
which uses Lucene:
- Optimization counter: how often optimize() should be called. This seems to
  help clean up the deletable files even if you are not interested in
  speeding up searches.
- Merge factor: decides how often segments should be merged.
- Max merge docs: upper limit on the number of documents that can be merged.
- JVM heap size: determines how much heap should be given to the Java process
  that uses Lucene (-Xmx520m).

If there are any others, I would like to know.
Aruna.
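
The optimization counter in the list above can be sketched as a small class. OptimizeCounter is a hypothetical name, and the interval is something to tune per application, not a recommended value:

```java
// Hypothetical helper: says when optimize() is due, once every `interval` added documents.
class OptimizeCounter {
    private final int interval;
    private int added;

    OptimizeCounter(int interval) {
        if (interval < 1) {
            throw new IllegalArgumentException("interval must be >= 1");
        }
        this.interval = interval;
    }

    // Call once per added document; returns true when the caller should optimize now.
    boolean documentAdded() {
        added++;
        if (added >= interval) {
            added = 0;          // start counting towards the next optimize
            return true;
        }
        return false;
    }
}
```

The indexing loop would call documentAdded() after each addDocument() and call optimize() on the writer whenever it returns true.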

-Original Message-
From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]]
Sent: Thursday, April 11, 2002 11:35 AM
To: Lucene Users List
Subject: Re: how to decide when the index needs to be optimized ?


My understanding is that you don't even have to optimize the index,
unless you want your searches to be faster.
I don't think Lucene has any internal limitation to the number of files
that comprise an unoptimized index, so you'll hit the wall with Java or
OS first, but even that limit is pretty high.
You could just optimize every X documents or at the end of indexing.

Otis



--- Biswas, Goutam_Kumar [EMAIL PROTECTED] wrote:
 Hello!
 
 We're building a Document Management System and we're using Lucene to
 index the document contents. Initially, when we're populating our database,
 we're adding the documents to the index also. We're also optimizing the
 index after adding the documents to it. Now, over a period of time, more
 documents will be added to the index, so it's understandable that after a
 while the index will be unoptimized. Is there some way we can detect that
 the index needs optimization? Or will we just have to keep optimizing the
 index, say for every n documents added, and if so, how do we figure out how
 many documents we can add before optimizing the index?
 
 Can anyone throw some light on this?
 
 Regards
 -goutam-
 
 
 
 






RE: Match All Words Query

2002-04-08 Thread Aruna Raghavan

Hi,
I haven't tried two levels of boolean queries but I did use the following
and it works fine for me.

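BooleanQuery bool_query = new BooleanQuery();
// fields: a String[] holding the names of the fields to search (illustrative name)
for (int i = 0; i < fields.length; i++)
{
   Query q = QueryParser.parse(term, fields[i], analyzer);
   bool_query.add(q, false, false);   // neither required nor prohibited: OR across fields
}
Hits hits = searcher.search(bool_query);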
Aruna.
-Original Message-
From: Melissa Mifsud [mailto:[EMAIL PROTECTED]]
Sent: Saturday, April 06, 2002 10:17 AM
To: Lucene User
Subject: Match All Words Query


Hi!

I've been going round in circles trying to come up with a query that will
return documents which contain ALL the query terms. This should be easy;
however, I would like the words to span ANY of the fields of the documents.

If the BooleanQuery(ies) do actually follow boolean logic, then I should be
able to form this query:

BooleanQuery b = new BooleanQuery();

for each term in the query {
    BooleanQuery sub_query = new BooleanQuery();
    for each field {
        Query q = QueryParser.parse(term,field,analyzer);
        sub_query.add(q,false,false);   // disjunction of fields
    }

    b.add(sub_query,true,false);   // conjunction of terms
}

And then b *should* be the query.

However, the query does not give the desired results!

Probably almost all users of Lucene have needed such a query... I feel I'm
complicating things here!

Help would be greatly appreciated.

Melissa.





RE: Querying multiple fields of a index

2002-04-04 Thread Aruna Raghavan

Hi,
I use a boolean query and add individual queries to it.

-Original Message-
From: Harpreet S Walia [mailto:[EMAIL PROTECTED]]
Sent: Thursday, April 04, 2002 10:13 AM
To: Lucene Users List
Subject: Querying multiple fields of a index


Hi,

Is it possible to query multiple fields of a given index and get the result
based on this combined query?
For example, I want to search for the word lucene in the title field
and the word engine in the summary field, and want the results based on
these words.

How can i achieve this ?

TIA

Regards
Harpreet







RE: Case Sensitivity

2002-04-03 Thread Aruna Raghavan

Hi,
I am using StandardAnalyzer - the problem was with wildcard queries being
case sensitive. Even with Standard Analyzer, you have to worry about case
sensitivity in this case. Thanks for the tip on example Analyzer, I will
take a peek.
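
The lowercase workaround discussed in this thread can be sketched roughly as follows. QueryCase is a hypothetical name; the sketch assumes operators are typed in uppercase and separated by whitespace, and it also lowercases any field-name prefix such as Title:, which may not be wanted if field names are mixed case.

```java
import java.util.Locale;

// Hypothetical helper: lowercases query text so that wildcard terms match
// lowercased index terms, while leaving uppercase boolean operators alone.
class QueryCase {
    static String normalize(String query) {
        StringBuilder out = new StringBuilder();
        for (String token : query.trim().split("\\s+")) {
            if (out.length() > 0) {
                out.append(' ');
            }
            boolean operator = token.equals("AND") || token.equals("OR") || token.equals("NOT");
            out.append(operator ? token : token.toLowerCase(Locale.ROOT));
        }
        return out.toString();
    }
}
```

Working token by token avoids turning a lowercase word "and" typed by the user into the AND operator, which a lowercase-then-uppercase pattern replacement would do.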

-Original Message-
From: Joshua O'Madadhain [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, April 03, 2002 1:40 PM
To: Lucene Users List
Subject: RE: Case Sensitivity


Alan, Aruna:

The built-in solution is to use LowerCaseFilter in your Analyzer.  (The
SimpleAnalyzer, StopAnalyzer, and StandardAnalyzer classes already do
this; see the Lucene API docs to see which filters each uses.)  The FAQ
includes an example implementation of an Analyzer if you want to build
your own.

Joshua

 [EMAIL PROTECTED] Per Obscurius...www.ics.uci.edu/~jmadden
Joshua Madden: Information Scientist, Musician, Philosopher-At-Tall
 It's that moment of dawning comprehension that I live for--Bill Watterson
My opinions are too rational and insightful to be those of any organization.

On Wed, 3 Apr 2002, Aruna Raghavan wrote:

 Hi,
 I worked around the problem by converting everything to lowercase in my code
 prior to indexing into lucene and also prior to searching for a string.
 Of course, I also had to use pattern matching to change bool operators such
 as ANDs and ORs to uppercase again because lucene expects those to be
 uppercase.
 
 -Original Message-
 From: Alan Weissman [mailto:[EMAIL PROTECTED]]
 Sent: Wednesday, April 03, 2002 1:26 PM
 To: Lucene Users List
 Subject: Case Sensitivity
 
 
 What can I do to configure Lucene to make it case insensitive?
 
 Thanks,
 Alan
 
 




RE: Lucene with Number+Text

2002-03-27 Thread Aruna Raghavan

Hi,
I am indexing as a Text field. Searches for 05qzFebqz01 and 05q* do not work. I
am using a StandardAnalyzer. A search for 05* works.
Searches on another word, cq6r, work fine.
Any idea why this is happening?
Thanks!
Aruna.

-Original Message-
From: Ian Lea [mailto:[EMAIL PROTECTED]]
Sent: Monday, March 25, 2002 3:56 PM
To: Lucene Users List
Subject: Re: Lucene with Number+Text


Good thinking.  In my test, using a Text field, searches
for 1727a and 1727* both return a hit, but if I switch to
Keyword they don't.


--
Ian.

 [EMAIL PROTECTED] (Shannon Booher) wrote 

 I think I have seen a similar problem.
 
 Are you guys using Keyword or Text fields?

--
Searchable personal storage and archiving from http://www.digimem.net/






RE: Term

2002-03-27 Thread Aruna Raghavan

Hi All,
I just tried this again, and it seems to work fine. Not sure what I did wrong
the first time. Just a follow-up.

-Original Message-
From: Aruna Raghavan [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, March 27, 2002 12:45 PM
To: Lucene Users List
Subject: Term


Hi,
While adding documents using something like the following:
document.add(Field.Text("object number", m_strObjectNumber));
I used the string "object number" as the field name, as you can see. I cannot
find the values for "object number" when I do a search. I am using a
StandardAnalyzer.
Any idea why this is happening?
Thanks,
Aruna.





RE: Term

2002-03-27 Thread Aruna Raghavan

Ype,
Thanks for the response. I think the reason my search worked was because
"object number" got indexed as "object" and the searcher searched for
"object" as well.

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, March 27, 2002 1:31 PM
To: [EMAIL PROTECTED]
Subject: Re: Term


Aruna,

 Hi,
 While adding documents using something like the following:
 document.add(Field.Text("object number", m_strObjectNumber));
 I used the string "object number" as the field name, as you can see. I cannot
 find the values for "object number" when I do a search. I am using a
 StandardAnalyzer. Any idea why this is happening?

You would need to pose a query like this

object number:54321

However this is parsed by the standard analyzer  as a query looking
for the term 'object' in the default field and looking
for the term '54321' in the field named 'number'.

There are three workarounds:
- change your fieldname to eg. objectnumber, and query by:
  objectnumber:54321
- use 'object number' as the default field for searching.
- construct the query without using the standard analyzer.

I think the best solution would be to change the fieldname
into something shorter like 'onr' which allows for easy querying.
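
The first workaround can be sketched as a small helper that derives a field name without whitespace and builds the matching query text. FieldNames and its methods are hypothetical names; the same derived name would have to be used when the document is indexed.

```java
// Hypothetical helper: makes a field name usable in "field:value" query syntax.
class FieldNames {
    // "object number" -> "objectnumber"
    static String toQueryable(String name) {
        return name.trim().replaceAll("\\s+", "");
    }

    // Builds the query text for a single field/value pair.
    static String fieldQuery(String name, String value) {
        return toQueryable(name) + ":" + value;
    }
}
```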


Regards,
Ype







Limit on search results?

2002-03-25 Thread Aruna Raghavan

Hi,
Is there any way to limit the number of search results being returned?
Aruna





RE: Multiple field searching

2002-03-20 Thread Aruna Raghavan

I use a BooleanQuery and add individual queries to it, it is working for me.

-Original Message-
From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, March 20, 2002 1:59 AM
To: Lucene Users List; [EMAIL PROTECTED]
Subject: Re: Multiple field searching


I'm using MultiTermQueryParser and it works for me.

Otis

--- Tate Jones [EMAIL PROTECTED] wrote:
 hi,
 
 I am trying to search across multiple fields using the following
 query
 
 +keyword:computers +subject:News content:xml
 or
 +(keyword:{computers}) +(subject:{News}) content:xml
 
 i have added the fields to the document correctly. 
 
 Have also tried using the MutipleFieldQueryParser without success.
 
 The only query that works is, which is not correct as they are OR's
 keyword:computers subject:IT content:xml
 
 Is anyone having the same problems
 
 Thanks in advance
 Tate
 
 




RE: Phone number Searches

2002-03-14 Thread Aruna Raghavan

Hi,
I have just noticed that "1-954-612-1276" (a phrase query) works, but a search
for 1-954-612-1276 without quotes returns all the documents I have, probably
because in the latter case the Lucene searcher is treating the - as exclusion.
Is this correct?
Thanks,
Aruna.
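
If the quoted form is the one that behaves as expected, user input can be wrapped in quotes before it reaches the query parser. A minimal sketch, with PhraseQuoting as a hypothetical helper name:

```java
// Hypothetical helper: turns hyphenated input such as a phone number into a
// quoted phrase so the hyphens are not read as exclusion operators.
class PhraseQuoting {
    static String quoteIfHyphenated(String term) {
        boolean alreadyQuoted = term.length() >= 2
                && term.startsWith("\"") && term.endsWith("\"");
        if (term.indexOf('-') >= 0 && !alreadyQuoted) {
            return "\"" + term + "\"";
        }
        return term;
    }
}
```

This only suits single-term input. A full query where the user really does mean exclusion, such as +(dog) -(cat), should not be passed through it.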

-Original Message-
From: Aruna Raghavan [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, March 13, 2002 10:48 AM
To: Lucene Users List
Subject: Phone number Searches


Hello All,
I tried doing a search for a phone number 1-954-612-1276. It worked fine. I
am using a StandardAnalyzer for both indexing and searching. From looking at
StandardTokenizer.jj and StandardAnalyzer, - is a valid character. So, how
is this differentiated from - that we use for exclusion such as
+(dog)-(cat) i.e, all dogs but no cats?
Thanks!
Aruna.





RE: Phone number Searches

2002-03-14 Thread Aruna Raghavan

Thanks, I am trying to do that. But the JBuilder IDE I am using does not
recognize the .jj files. How do I link these in?


-Original Message-
From: Norbert Pabis [mailto:[EMAIL PROTECTED]]
Sent: Thursday, March 14, 2002 8:41 AM
To: Lucene Users List
Subject: Re: Phone number Searches


Recompile Lucene with debug on, then you will see exactly what it does.


-- 
Norbert Pabi





Phone number Searches

2002-03-13 Thread Aruna Raghavan

Hello All,
I tried doing a search for a phone number 1-954-612-1276. It worked fine. I
am using a StandardAnalyzer for both indexing and searching. From looking at
StandardTokenizer.jj and StandardAnalyzer, - is a valid character. So, how
is this differentiated from - that we use for exclusion such as
+(dog)-(cat) i.e, all dogs but no cats?
Thanks!
Aruna.





RE: special character handling

2002-03-12 Thread Aruna Raghavan

Otis,
I am using StandardAnalyzer.

-Original Message-
From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, March 12, 2002 3:37 PM
To: Lucene Users List
Subject: Re: special character handling


It depends on the Analyzer used.

Otis

--- Aruna Raghavan [EMAIL PROTECTED] wrote:
 Hi,
 Does lucene replace all special characters with spaces when it adds
 the
 document to the index?
 Thanks!
 




RE: special character handling

2002-03-12 Thread Aruna Raghavan


Hi,
I guess my question is really about characters like %, $, #, - etc. (- is
used for exclusion, for example). I remember testing these with a standard
analyzer and finding that it didn't quite work. Is there any reason these
characters won't work with a standard analyzer? The stop table for
StandardAnalyzer does not include these characters. Does that mean they are
supported?
Thanks!
-Original Message-
From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, March 12, 2002 4:39 PM
To: Lucene Users List
Subject: RE: special character handling


This is answered in FAQA:
http://jguru.com/faq/view.jsp?EID=538308





RE: optimize(), delete() calls on IndexWriter

2002-03-08 Thread Aruna Raghavan

Yes, thanks.

-Original Message-
From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]]
Sent: Friday, March 08, 2002 11:46 AM
To: Lucene Users List
Subject: Re: optimize(), delete() calls on IndexWriter


No they don't. Note that delete() is in IndexReader.

Otis

--- Aruna Raghavan [EMAIL PROTECTED] wrote:
 Hi,
 Do calls like optimize() and delete() on the Indexwriter cause a
 separate
 thread to be kicked off?
 Thanks!
 Aruna.
 




Deleting documents

2002-03-08 Thread Aruna Raghavan

Hi,
Is there anything wrong with the following code?
  try {
    m_lock.write(); // obtain a write lock on a RWLock
    IndexReader indexReader = IndexReader.open(mypath);
    IndexSearcher indexSearcher = new IndexSearcher(mypath);
    // use the searcher to search for documents to be deleted
    // use the reader to do the deletes.
    indexReader.close();
  }
  catch(Throwable e)
  {
    e.printStackTrace();
  }
  finally
  {
    m_lock.unlock();
  }

Sometimes I am getting the following exception:
java.io.IOException: Index locked for write:
Lock@D:\RevealCS\Search\Data\reports\write.lock
at org.apache.lucene.index.IndexReader.delete(Unknown Source)
at org.apache.lucene.index.IndexReader.delete(Unknown Source)
at
revsearch.RevSearch$DeleteWatcherThread.checkAction(RevSearch.java:1455)
at revsearch.RevSearch$WatcherThread.run(RevSearch.java:250)

This exception was not happening every time the code was run, it was
intermittent.

I suspect it is because both indexSearcher and indexReader open the
mypath dir. I changed the code so that indexSearcher takes the indexReader in
its constructor.

I am hoping that some one can shed some light on what went wrong, thanks.
Aruna.







Optimization and deletes

2002-02-28 Thread Aruna Raghavan

Hi,
I have noticed that unless I optimize the index while adding documents to
it, the deleted documents are not physically removed right away, even
though they seem to have been flagged as deleted (the searcher could not
find them once they were deleted). If I decide not to optimize the
index, when would the deleted documents actually get removed?





mergefactor and mergemaxdocs

2002-02-28 Thread Aruna Raghavan

Hello,
The Lucene javadoc defines maxMergeDocs, mergeFactor and optimize() as follows:
- int maxMergeDocs: Determines the largest number of documents ever merged by
  addDocument().
- int mergeFactor: Determines how often segment indexes are merged by
  addDocument().
- void optimize(): Merges all segments together into a single segment,
  optimizing an index for search.
Using the above three, combined with the JVM heap size (-Xmx) I am trying to
nail down a configuration for my application that uses Lucene for searches.
A few questions regarding these -
If mergeFactor determines how often segment indexes are merged, and I set it
to a value > maxMergeDocs, what value gets used? I assume it is limited by
maxMergeDocs. So is maxMergeDocs an upper limit for mergeFactor?
If no explicit optimize() calls are used, will the segments still be merged
according to the values set for maxMergeDocs and mergeFactor? Or do
mergeFactor and maxMergeDocs only get used when optimize() is called?
Thanks for all the help!
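
To reason about these questions, here is a toy model of incremental merging, written only from the javadoc wording quoted above. MergeModel is a hypothetical class and the policy it implements is an assumption, not Lucene's actual code: each added document starts as a one-document segment, trailing segments are merged once they add up to the current target size, the target grows by mergeFactor each round, and no merge target may exceed maxMergeDocs.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of incremental segment merging; an assumption for illustration,
// not Lucene's implementation.
class MergeModel {
    // Returns the segment sizes (in documents) left after adding `docs` documents one by one.
    static List<Integer> simulate(int docs, int mergeFactor, int maxMergeDocs) {
        if (mergeFactor < 2) {
            throw new IllegalArgumentException("mergeFactor must be at least 2");
        }
        List<Integer> segments = new ArrayList<>();
        for (int d = 0; d < docs; d++) {
            segments.add(1);                        // a new document is its own segment
            long target = mergeFactor;
            while (target <= maxMergeDocs) {
                int start = segments.size();
                long sum = 0;
                // collect trailing segments smaller than the current target size
                while (start > 0 && segments.get(start - 1) < target) {
                    start--;
                    sum += segments.get(start);
                }
                if (sum < target) {
                    break;                          // not enough small segments to merge yet
                }
                segments.subList(start, segments.size()).clear();
                segments.add((int) sum);            // replace them with one merged segment
                target *= mergeFactor;              // then try the next, larger level
            }
        }
        return segments;
    }
}
```

In this model merging happens during document addition with no optimize() call at all, and maxMergeDocs limits how large a merged segment can become and is not a limit on mergeFactor. Whether the real writer behaves the same way would need to be confirmed against the Lucene source.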







RE: Boolean AND query

2002-02-21 Thread Aruna Raghavan

Daniel,
Thanks for the response but I am going by the definition of the 
Syntax in Lucene FAQ:
Query       ::= Clause ( [ Conjunction ] Clause ) *

Where:
Clause      ::= [ Modifier ] [ FieldName ':' ] BasicClause
Modifier    ::= '-' | '+' | '!' | 'NOT'
BasicClause ::= ( Term | Phrase | PrefixQuery | '(' Query ')' )
PrefixQuery ::= Term '*'
Term        ::= a-word-or-token-to-match
Phrase      ::= '"' Term * '"'

Conjunction ::= 'AND' | 'OR' | '||'

According to the above, AND and OR should work too, right?

-Original Message-
From: Daniel Calvo [mailto:[EMAIL PROTECTED]]
Sent: Thursday, February 21, 2002 11:12 AM
To: Lucene Users List
Subject: RE: Boolean AND query


Hi,

To achieve what you want, you need to use the required operand (+)

--Daniel 

 -Original Message-
 From: Aruna Raghavan [mailto:[EMAIL PROTECTED]]
 Sent: quinta-feira, 21 de fevereiro de 2002 13:44
 To: 'Lucene Users List'
 Subject: Boolean AND query
 
 
 Hello,
 Has anyone run into problems with boolean AND query? Basically, I am using
 the following code to do the query to look for 
 10060 AND 10040
 
 BooleanQuery bq = new BooleanQuery();
 Analyzer analyzer = new StandardAnalyzer();
 Query query = QueryParser.parse(m_strKeyword, pageText, analyzer);
  bq.add(query, true, false);
 
 In this case, I am just using one query to add to bq but there can be more.
 
 I am getting correct results when 10060 AND 10040 exists in the document.
 But when one of them does not exist, I am still getting the same results. In
 other words, AND seems to be acting like an OR. I noticed this in the latest
 RC4 as well as an older lucene build from before lucene joined jakarta.
 
 Thanks!
 




RC3 release

2002-02-14 Thread Aruna Raghavan

Hi,
I have been using an older release from back when Lucene was not under
Jakarta. I just tried the released RC3 version of the apache.lucene libs, and I
was getting errors while indexing documents. Usually, there is a write.lock file
left in the index dir. I did see some e-mails on a related subject (RE:
problems with last patch (obtain write.lock while deleting documents)).
I think Doug fixed this on Feb 11th. I am at a point in my development
of a search engine using Lucene where I need to put the new apache.lucene
libs in. Are there any release notes on RC3? Also, how soon will the write.lock
fix be released officially?
Thanks!





RE: Limit on number of characters before wildcard?

2002-01-11 Thread Aruna Raghavan

Yes, I am clear on how a prefix query is defined. But Dave says that somehow a
search for *ogleash would work with a PrefixQuery. dog*eash or dog* would work,
not *ogleash. That's where the confusion came from. Just to clarify...
Thanks again.
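
The distinction being drawn here can be written down as a small classifier. WildcardKind is a hypothetical helper, not Lucene code; it only labels the shape of a term and does not decide which query class the parser would build.

```java
// Hypothetical helper: labels the shape of a search term.
class WildcardKind {
    enum Kind { PLAIN, PREFIX, WILDCARD, LEADING_WILDCARD }

    static Kind classify(String term) {
        if (term.isEmpty()) {
            return Kind.PLAIN;
        }
        char first = term.charAt(0);
        if (first == '*' || first == '?') {
            return Kind.LEADING_WILDCARD;   // e.g. *ogleash
        }
        int star = term.indexOf('*');
        int question = term.indexOf('?');
        if (star < 0 && question < 0) {
            return Kind.PLAIN;              // e.g. dogleash
        }
        // Prefix shape: a single '*' as the last character and no '?'
        if (question < 0 && star == term.length() - 1) {
            return Kind.PREFIX;             // e.g. dog*
        }
        return Kind.WILDCARD;               // e.g. dog*eash
    }
}
```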

-Original Message-
From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]]
Sent: Friday, January 11, 2002 12:49 PM
To: Lucene Users List
Subject: RE: Limit on number of characters before wildcard?


Just so that nobody is confused in the future, PrefixQuery that Dave is
mentioning is actually a query that lets you make searches such as
'Consult*'.

See http://jguru.com/faq/view.jsp?EID=480194

Otis

--- Dave Kor [EMAIL PROTECTED] wrote:
 First character asterisk (eg, *ogleash) is performed by PrefixQuery, which
 executes much faster than WildcardQuery.
 
 
 Dave Kor Kian Wei
 Consultant
 Product Engineering
 NexusEdge Technologies Pte. Ltd.
 6 Aljunied Ave 3, #01-02 (Level 4)
 Singapore 389932
 Tel : (+65)848-2552
 Fax : (+65)747-4536
 Web : www.nexusedge.com
 
  -Original Message-
  From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]]
  Sent: Friday, January 11, 2002 11:40 AM
  To: Lucene Users List
  Subject: Re: Limit on number of characters before wildcard?
 
 
  Hello,
 
  I haven't tested this like you did, but from looking at the query
  parser (QueryParser.jj file in the Lucene distribution)
  it seems that only a single character is required before '*' or '?':
 
  ...
  | <WILDTERM: <_TERM_START_CHAR>
               (<_TERM_CHAR> | ( [ "*", "?" ] ))* >
  ...
 
  _TERM_START_CHAR is defined as:
  [ "a"-"z", "A"-"Z", "_", "\u0080"-"\uFFFE" ]
 
  and as you can see from the first definition above this character can
  be followed by zero or more _TERM_CHAR, * or ?.
 
  This also answers your question about using an asterisk as the very
  first character in the query.
 
  It would be great if Doug or Brian Goetz could confirm or dispute this,
  so that I can add it to the Lucene FAQ at jGuru.com.
 
  Otis
 
 
 
 
 
  --- Aruna Raghavan [EMAIL PROTECTED] wrote:
   Hi,
   From some testing that I have done, it appears that there is a limit of 3
   characters before the wildcard for wildcard queries. In other words, if the
   word is dogleash and I search using do*, it returns wrong results
   (usually only a subset), whereas if I use dog*, I get correct results.
  
   Also, a wildcard at the beginning of the keyword does not seem to be
   supported (*ogleash).
   Can someone confirm this? Is this documented anywhere?
  