XML parsing examples
Hi,

I have a couple of examples of parsing .xml files using SAX/DOM from my code that uses Lucene for indexing. Can I submit these somewhere? Please let me know.

Aruna.

--
To unsubscribe, e-mail: mailto:[EMAIL PROTECTED]
For additional commands, e-mail: mailto:[EMAIL PROTECTED]
RE: Removing a write.lock file
Hi,

The write.lock file won't be left behind if access to the index is guarded by a lock mechanism and the index is always closed. I use my own RWLock to access the index dir and unlock it after I close the index. Basically, the access to the index is synchronized. I have never had any problems with this approach.

Aruna.

-----Original Message-----
From: suneethad [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, April 17, 2002 11:47 PM
To: Lucene Users List
Subject: Removing a write.lock file

Hi,

I'm currently indexing while allowing multiple access, and I find that a write.lock file has been created. I know this is to prevent multiple writers, but how do I continue now? I do not want to reindex, as I work on a very large database and it takes a really long time. How do I remove this lock file?

Thanx 4 ur help,
Suneetha.
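The RWLock approach described above can be sketched roughly as follows. This is only an illustration, not Aruna's actual code: the class name `IndexAccessGuard` and its methods are hypothetical, it uses the JDK's `ReentrantReadWriteLock` rather than a hand-written RWLock, and it only coordinates threads inside a single JVM.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Supplier;

/** Hypothetical helper: funnels every access to one index directory through one lock. */
class IndexAccessGuard {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    /** Runs a modifying task (add, delete, optimize) with exclusive access. */
    <T> T write(Supplier<T> task) {
        lock.writeLock().lock();
        try {
            // The task is expected to open the index, change it, and close it again.
            return task.get();
        } finally {
            lock.writeLock().unlock(); // released even if the task throws
        }
    }

    /** Runs a read-only task (search); several readers may run at the same time. */
    <T> T read(Supplier<T> task) {
        lock.readLock().lock();
        try {
            return task.get();
        } finally {
            lock.readLock().unlock();
        }
    }

    /** True while some thread holds the write lock. */
    boolean isWriteLocked() {
        return lock.isWriteLocked();
    }
}
```

The point of the design is that the index is opened and closed entirely inside the guarded task, so no writer is ever left open when the lock is released.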
RE: Removing a write.lock file
I don't think it is a good approach to delete the write.lock file by hand. It is there for a reason. You may want to dig into some of the older dialogs/e-mails on this topic.

-----Original Message-----
From: Biswas, Goutam_Kumar [mailto:[EMAIL PROTECTED]]
Sent: Thursday, April 18, 2002 9:53 AM
To: 'Lucene Users List'
Subject: RE: Removing a write.lock file

well suneetha, before I write to the index I check whether a write.lock file exists! If it does I delete it before opening the index. It works fine for me.

-Goutam
RE: Removing a write.lock file
Sorry, but I would say the same thing. I don't think you are supposed to do it even programmatically. It is a lock internal to Lucene.

-----Original Message-----
From: Biswas, Goutam_Kumar [mailto:[EMAIL PROTECTED]]
Sent: Thursday, April 18, 2002 10:25 AM
To: 'Lucene Users List'
Subject: RE: Removing a write.lock file

I'm not removing the write.lock file by hand. I'm doing it inside the code before opening the index.

-Goutam
Search question
Hi,

I am looking for ways to cancel a search in response to a cancel request from a user interface. I don't see anything like a timeout on the Searcher.search() method. Is there a way to terminate a search request?

Aruna Raghavan
Senior Software Engineer
OPIN Systems SPC
RE: how to decide when the index needs to be optimized ?
Hi,

I was using the following to do analysis on our document management system that uses Lucene:

- Optimization counter: how often optimize() should be called. This seems to help clean up the deletable files even if you are not interested in speeding up the searches.
- Merge factor: decides how often segments should be merged.
- Max merge docs: upper limit on the number of documents that can be merged.
- JVM heap size: determines how much heap should be given to the Java process that uses Lucene (-Xmx520m).

If there are any others, I would like to know.

Aruna.

-----Original Message-----
From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]]
Sent: Thursday, April 11, 2002 11:35 AM
To: Lucene Users List
Subject: Re: how to decide when the index needs to be optimized ?

My understanding is that you don't even have to optimize the index, unless you want your searches to be faster. I don't think Lucene has any internal limitation on the number of files that comprise an unoptimized index, so you'll hit the wall with Java or the OS first, but even that limit is pretty high. You could just optimize every X documents or at the end of indexing.

Otis

--- Biswas, Goutam_Kumar [EMAIL PROTECTED] wrote:
> Hello!
>
> We're building a Document Management System and we're using Lucene to index the document contents. Initially, when we're populating our database, we're adding the documents to the index as well. We're also optimizing the index after adding the documents to it. Over a period of time more documents will be added to the index, so it's understandable that after a while the index will be unoptimized. Is there some way we can detect that the index needs optimization? Or will we just have to keep optimizing the index, say for every n documents added, and if so, how do we figure out how many documents we can add before optimizing the index? Can anyone throw some light on this?
>
> Regards
> -goutam-
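The "optimize every X documents" idea from this thread could be tracked with a small counter such as the one below. It is a sketch only: `OptimizeSchedule`, `documentAdded()` and the interval value are hypothetical names and numbers, not part of Lucene, and the right interval would have to be found by measurement.

```java
/** Hypothetical helper: counts added documents and reports when an optimize pass is due. */
class OptimizeSchedule {
    private final int interval;          // assumed tuning knob: documents between optimize calls
    private int addedSinceOptimize = 0;

    OptimizeSchedule(int interval) {
        if (interval < 1) {
            throw new IllegalArgumentException("interval must be >= 1");
        }
        this.interval = interval;
    }

    /** Call once per added document; returns true when optimize should run now. */
    boolean documentAdded() {
        addedSinceOptimize++;
        if (addedSinceOptimize >= interval) {
            addedSinceOptimize = 0;      // start counting towards the next pass
            return true;
        }
        return false;
    }
}
```

In the indexing loop, the application would call optimize() on its writer whenever `documentAdded()` returns true, and once more at the end of indexing if desired.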
RE: Match All Words Query
Hi,

I haven't tried two levels of boolean queries, but I did use the following and it works fine for me.

    BooleanQuery bool_query = new BooleanQuery();
    for each field {
        Query q = QueryParser.parse(term, field, analyzer);
        bool_query.add(q, false, false);
    }
    searcher.search(bool_query);

Aruna.

-----Original Message-----
From: Melissa Mifsud [mailto:[EMAIL PROTECTED]]
Sent: Saturday, April 06, 2002 10:17 AM
To: Lucene User
Subject: Match All Words Query

Hi!

I've been going round in circles trying to come up with a query that will return documents which contain ALL the query terms. This should be easy; however, I would like the words to span ANY of the fields of the documents. If the BooleanQuery(ies) do actually follow boolean logic, then I should be able to form this query:

    BooleanQuery b = new BooleanQuery();
    for each term in the query {
        BooleanQuery sub_query = new BooleanQuery();
        for each field {
            Query q = QueryParser.parse(term, field, analyzer);
            sub_query.add(q, false, false);   // disjunction of fields
        }
        b.add(sub_query, true, false);        // conjunction of terms
    }

And then b *should* be the query. However, the query does not give the desired results! Probably almost all users of Lucene have needed such a query... I feel I'm complicating things here! Help would be greatly appreciated.

Melissa.
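The semantics Melissa is after (every term required, each term allowed to match in any field) can be written down without Lucene as a plain predicate, which is handy as a reference when checking what a nested query returns. This is only a sketch of the intended logic: the class `AllTermsAnyField`, the map-of-fields document model and the crude word splitting are assumptions made for illustration, not how Lucene analyzes or scores text.

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical reference check: does a document contain ALL terms, each in ANY field? */
class AllTermsAnyField {

    /** True when every term occurs in at least one field of the document. */
    static boolean matches(Map<String, String> fields, List<String> terms) {
        for (String term : terms) {                     // conjunction over terms
            boolean foundInSomeField = false;
            for (String text : fields.values()) {       // disjunction over fields
                if (containsWord(text, term)) {
                    foundInSomeField = true;
                    break;
                }
            }
            if (!foundInSomeField) {
                return false;                           // one missing term fails the document
            }
        }
        return true;
    }

    /** Case-insensitive whole-word test using a simple non-word-character split. */
    static boolean containsWord(String text, String term) {
        String wanted = term.toLowerCase(Locale.ROOT);
        for (String word : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (word.equals(wanted)) {
                return true;
            }
        }
        return false;
    }
}
```

Running the same documents and terms through this predicate and through the real query gives a quick way to see which documents the query should have returned.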
RE: Querying multiple fields of a index
Hi,

I use a boolean query and add individual queries to it.

-----Original Message-----
From: Harpreet S Walia [mailto:[EMAIL PROTECTED]]
Sent: Thursday, April 04, 2002 10:13 AM
To: Lucene Users List
Subject: Querying multiple fields of a index

Hi,

Is it possible to query multiple fields of a given index and get the result based on this combined query? For example, I want to search for the word lucene in the title field and the word engine in the summary field, and want the results based on both words. How can I achieve this?

TIA
Regards
Harpreet
RE: Case Sensitivity
Hi,

I am using StandardAnalyzer. The problem was with wildcard queries being case sensitive. Even with StandardAnalyzer, you have to worry about case sensitivity in this case. Thanks for the tip on the example Analyzer, I will take a peek.

-----Original Message-----
From: Joshua O'Madadhain [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, April 03, 2002 1:40 PM
To: Lucene Users List
Subject: RE: Case Sensitivity

Alan, Aruna:

The built-in solution is to use LowerCaseFilter in your Analyzer. (The SimpleAnalyzer, StopAnalyzer, and StandardAnalyzer classes already do this; see the Lucene API docs to see which filters each uses.) The FAQ includes an example implementation of an Analyzer if you want to build your own.

Joshua

On Wed, 3 Apr 2002, Aruna Raghavan wrote:
> Hi,
> I worked around the problem by converting everything to lowercase in my code prior to indexing into Lucene and also prior to searching for a string. Of course, I also had to use pattern matching to change boolean operators such as ANDs and ORs to uppercase again, because Lucene expects those to be uppercase.
>
> -----Original Message-----
> From: Alan Weissman [mailto:[EMAIL PROTECTED]]
> Sent: Wednesday, April 03, 2002 1:26 PM
> To: Lucene Users List
> Subject: Case Sensitivity
>
> What can I do to configure Lucene to make it case insensitive?
>
> Thanks,
> Alan
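A variation on the lowercase workaround from this thread is sketched below. Instead of lowercasing everything and then restoring the operators, it lowercases each whitespace-separated token unless it is already an uppercase AND, OR or NOT. The class `QueryCase` and its method are hypothetical, and the sketch has a known limitation: it also lowercases field names in front of a colon, which matters if field names contain capitals.

```java
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper: lowercases query text but leaves uppercase boolean operators alone. */
class QueryCase {
    private static final Set<String> OPERATORS = Set.of("AND", "OR", "NOT");

    static String normalize(String query) {
        StringBuilder out = new StringBuilder();
        for (String token : query.trim().split("\\s+")) {
            if (out.length() > 0) {
                out.append(' ');
            }
            // Operators stay as typed; every other token is lowercased.
            out.append(OPERATORS.contains(token) ? token : token.toLowerCase(Locale.ROOT));
        }
        return out.toString();
    }
}
```

The normalized string would then be handed to the query parser as usual; the same lowercasing has to be applied to text at indexing time for wildcard searches to line up.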
RE: Lucene with Number+Text
Hi,

I am indexing it as a Text field. Searches for 05qzFebqz01 and 05q* do not work. I am using a StandardAnalyzer. A search for 05* works. Searches on another word, cq6r, work fine. Any idea why this is happening?

Thanks!
Aruna.

-----Original Message-----
From: Ian Lea [mailto:[EMAIL PROTECTED]]
Sent: Monday, March 25, 2002 3:56 PM
To: Lucene Users List
Subject: Re: Lucene with Number+Text

Good thinking. In my test, using a Text field, searches for 1727a and 1727* both return a hit, but if I switch to Keyword they don't.

--
Ian.

[EMAIL PROTECTED] (Shannon Booher) wrote:
> I think I have seen a similar problem. Are you guys using Keyword or Text fields?
RE: Term
Hi All,

I just tried this again and it seems to work fine. Not sure what I did wrong the first time. Just a follow-up.

-----Original Message-----
From: Aruna Raghavan [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, March 27, 2002 12:45 PM
To: Lucene Users List
Subject: Term

Hi,

While adding documents using something like the following:

    document.add(Field.Text("object number", m_strObjectNumber));

I used the string "object number" as the field name, as you can see. I cannot find the values for "object number" when I do a search. I am using a StandardAnalyzer. Any idea why this is happening?

Thanks,
Aruna.
RE: Term
Ype,

Thanks for the response. I think the reason my search worked was that "object number" got indexed as "object" and the searcher searched for "object" as well.

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, March 27, 2002 1:31 PM
To: [EMAIL PROTECTED]
Subject: Re: Term

Aruna,

> Hi,
> While adding documents using something like the following:
>     document.add(Field.Text("object number", m_strObjectNumber));
> I used the string "object number" as the field name, as you can see. I cannot find the values for "object number" when I do a search. I am using a StandardAnalyzer. Any idea why this is happening?

You would need to pose a query like this:

    object number:54321

However, this is parsed by the standard analyzer as a query looking for the term 'object' in the default field and looking for the term '54321' in the field named 'number'. There are three workarounds:

- change your field name to e.g. objectnumber, and query by: objectnumber:54321
- use 'object number' as the default field for searching.
- construct the query without using the standard analyzer.

I think the best solution would be to change the field name into something shorter like 'onr', which allows for easy querying.

Regards,
Ype
Limit on search results?
Hi, Is there any way to limit the number of search results being returned? Aruna
RE: Multiple field searching
I use a BooleanQuery and add individual queries to it; it is working for me.

-----Original Message-----
From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, March 20, 2002 1:59 AM
To: Lucene Users List; [EMAIL PROTECTED]
Subject: Re: Multiple field searching

I'm using MultiTermQueryParser and it works for me.

Otis

--- Tate Jones [EMAIL PROTECTED] wrote:
> hi,
> I am trying to search across multiple fields using the following query:
>
>     +keyword:computers +subject:News content:xml
>
> or
>
>     +(keyword:{computers}) +(subject:{News}) content:xml
>
> I have added the fields to the document correctly. I have also tried using the MutipleFieldQueryParser without success. The only query that works is the following, which is not correct as the clauses are ORs:
>
>     keyword:computers subject:IT content:xml
>
> Is anyone having the same problems?
>
> Thanks in advance
> Tate
RE: Phone number Searches
Hi,

I have just noticed that "1-954-612-1276" (phrase query) works, but a search for 1-954-612-1276 is returning all the documents I have, probably because in the latter case the Lucene searcher is treating the - as exclusion. Is this correct?

Thanks,
Aruna.

-----Original Message-----
From: Aruna Raghavan [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, March 13, 2002 10:48 AM
To: Lucene Users List
Subject: Phone number Searches

Hello All,

I tried doing a search for a phone number, 1-954-612-1276. It worked fine. I am using a StandardAnalyzer for both indexing and searching. From looking at StandardTokenizer.jj and StandardAnalyzer, - is a valid character. So, how is this differentiated from the - that we use for exclusion, such as +(dog)-(cat), i.e., all dogs but no cats?

Thanks!
Aruna.
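Since the quoted form behaves as a phrase query while the bare form does not, one option is to quote values such as phone numbers before they reach the query parser. The helper below is a sketch under that assumption; `PhraseQuoting` and `quoteIfHyphenated` are hypothetical names, and it is meant for single literal values typed by a user, not for full query strings with operators.

```java
/** Hypothetical helper: wraps hyphenated literals in double quotes so they parse as phrases. */
class PhraseQuoting {

    static String quoteIfHyphenated(String term) {
        String t = term.trim();
        // Leave explicit modifiers such as -cat or +dog untouched.
        if (t.startsWith("-") || t.startsWith("+")) {
            return t;
        }
        boolean alreadyQuoted = t.length() >= 2 && t.startsWith("\"") && t.endsWith("\"");
        if (t.contains("-") && !alreadyQuoted) {
            return "\"" + t + "\"";
        }
        return t;
    }
}
```

A value that itself contains double quotes would need escaping first; that case is left out of the sketch.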
RE: Phone number Searches
Thanks, I am trying to do that. But the JBuilder IDE I am using does not recognize the .jj files. How do I link these in?

-----Original Message-----
From: Norbert Pabis [mailto:[EMAIL PROTECTED]]
Sent: Thursday, March 14, 2002 8:41 AM
To: Lucene Users List
Subject: Re: Phone number Searches

Recompile Lucene with debug on, then you will see exactly what it does.

--
Norbert Pabi
Phone number Searches
Hello All,

I tried doing a search for a phone number, 1-954-612-1276. It worked fine. I am using a StandardAnalyzer for both indexing and searching. From looking at StandardTokenizer.jj and StandardAnalyzer, - is a valid character. So, how is this differentiated from the - that we use for exclusion, such as +(dog)-(cat), i.e., all dogs but no cats?

Thanks!
Aruna.
RE: special character handling
Otis, I am using StandardAnalyzer.

-----Original Message-----
From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, March 12, 2002 3:37 PM
To: Lucene Users List
Subject: Re: special character handling

It depends on the Analyzer used.

Otis

--- Aruna Raghavan [EMAIL PROTECTED] wrote:
> Hi, Does lucene replace all special characters with spaces when it adds the document to the index? Thanks!
RE: special character handling
Hi,

I guess my question is really about characters like %, $, #, - etc. (- is used for exclusion, for example). I remember testing with a standard analyzer and finding that it didn't quite work. Is there any reason these characters won't work with a standard analyzer? The stop table for StandardAnalyzer does not include these characters. Does that mean they are supported?

Thanks!

-----Original Message-----
From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, March 12, 2002 4:39 PM
To: Lucene Users List
Subject: RE: special character handling

This is answered in the FAQ: http://jguru.com/faq/view.jsp?EID=538308
RE: optimize(), delete() calls on IndexWriter
Yes, thanks.

-----Original Message-----
From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]]
Sent: Friday, March 08, 2002 11:46 AM
To: Lucene Users List
Subject: Re: optimize(), delete() calls on IndexWriter

No, they don't. Note that delete() is in IndexReader.

Otis

--- Aruna Raghavan [EMAIL PROTECTED] wrote:
> Hi,
> Do calls like optimize() and delete() on the IndexWriter cause a separate thread to be kicked off?
> Thanks!
> Aruna.
Deleting documents
Hi,

Is there anything wrong with the following code?

    try {
        m_lock.write(); // obtain a write lock on a RWLock
        IndexReader indexReader = IndexReader.open(mypath);
        IndexSearcher indexSearcher = new IndexSearcher(mypath);
        // use the searcher to search for documents to be deleted
        // use the reader to do the deletes.
        indexReader.close();
    } catch(Throwable e) {
        e.printStackTrace();
    } finally {
        m_lock.unlock();
    }

Sometimes I am getting the following exception:

    java.io.IOException: Index locked for write: Lock@D:\RevealCS\Search\Data\reports\write.lock
        at org.apache.lucene.index.IndexReader.delete(Unknown Source)
        at org.apache.lucene.index.IndexReader.delete(Unknown Source)
        at revsearch.RevSearch$DeleteWatcherThread.checkAction(RevSearch.java:1455)
        at revsearch.RevSearch$WatcherThread.run(RevSearch.java:250)

This exception was not happening every time the code was run; it was intermittent. I suspect it is because I am using indexSearcher and indexWriter to open the myPath dir. I changed it such that indexSearcher uses the indexReader in the constructor. I am hoping that someone can shed some light on what went wrong, thanks.

Aruna.
Optimization and deletes
Hi,

I have noticed that unless I optimize the index while adding documents to it, the deleted documents are not physically deleted right away (even though they seem to have been flagged as deleted: the searcher could not find them once they were deleted). If I decide not to optimize the index, when would the deleted documents actually get deleted?
mergefactor and mergemaxdocs
Hello,

The Lucene javadoc defines the merge factor and maxMergeDocs as follows:

    int maxMergeDocs   Determines the largest number of documents ever merged by addDocument().
    int mergeFactor    Determines how often segment indexes are merged by addDocument().
    void optimize      Merges all segments together into a single segment, optimizing an index for search.

Using the above three, combined with the JVM heap size (-Xmx), I am trying to nail down a configuration for my application that uses Lucene for searches. A few questions regarding these:

- If mergeFactor determines how often segment indexes are merged, and I set it to a value greater than maxMergeDocs, what value gets used? I assume it is limited by maxMergeDocs. So is maxMergeDocs an upper limit for mergeFactor?
- If no explicit optimize() calls are used, will the segments still be merged according to the values set for maxMergeDocs and mergeFactor? Or do mergeFactor and maxMergeDocs only get used when optimize() is called?

Thanks for all the help!
RE: Boolean AND query
Daniel,

Thanks for the response, but I am going by the definition of the syntax in the Lucene FAQ:

    Query       ::= Clause ( [ Conjunction ] Clause ) *

Where:

    Clause      ::= [ Modifier ] [ FieldName ':' ] BasicClause
    Modifier    ::= '-' | '+' | '!' | 'NOT'
    BasicClause ::= ( Term | Phrase | | PrefixQuery '(' Query ')'
    PrefixQuery ::= Term '*'
    Term        ::= a-word-or-token-to-match
    Phrase      ::= '"' Term * '"'
    Conjunction ::= 'AND' | 'OR' | '||'

According to the above, AND and OR should work too, right?

-----Original Message-----
From: Daniel Calvo [mailto:[EMAIL PROTECTED]]
Sent: Thursday, February 21, 2002 11:12 AM
To: Lucene Users List
Subject: RE: Boolean AND query

Hi,

To achieve what you want, you need to use the required operand (+).

--Daniel

-----Original Message-----
From: Aruna Raghavan [mailto:[EMAIL PROTECTED]]
Sent: quinta-feira, 21 de fevereiro de 2002 13:44
To: 'Lucene Users List'
Subject: Boolean AND query

Hello,

Has anyone run into problems with a boolean AND query? Basically, I am using the following code to do the query to look for 10060 AND 10040:

    BooleanQuery bq = new BooleanQuery();
    Analyzer analyzer = new StandardAnalyzer();
    Query query = QueryParser.parse(m_strKeyword, pageText, analyzer);
    bq.add(query, true, false);

In this case, I am just using one query to add to bq, but there can be more. I am getting correct results when both 10060 and 10040 exist in the document. But when one of them does not exist, I am still getting the same results. In other words, AND seems to be acting like an OR. I noticed this in the latest RC4 as well as in an older Lucene build from before Lucene joined Jakarta.

Thanks!
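Daniel's suggestion to use the required operand can be applied mechanically to simple queries before they are parsed. The helper below is a sketch for one narrow case only: a flat list of single terms joined by uppercase AND. The class `RequiredTerms` is hypothetical, and anything with OR, parentheses, phrases or field prefixes would need a real parser.

```java
/** Hypothetical helper: rewrites a flat "a AND b AND c" list as "+a +b +c". */
class RequiredTerms {

    static String toRequired(String query) {
        StringBuilder out = new StringBuilder();
        for (String term : query.trim().split("\\s+AND\\s+")) {
            if (term.isEmpty()) {
                continue;                       // empty input yields an empty result
            }
            if (out.length() > 0) {
                out.append(' ');
            }
            // Do not double up the modifier if the term is already marked required.
            out.append(term.startsWith("+") ? term : "+" + term);
        }
        return out.toString();
    }
}
```

For the query in this thread, the rewritten string marks both 10060 and 10040 as required, which is the behaviour the AND was expected to give.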
RC3 release
Hi,

I have been using an older release from back when Lucene was not under Jakarta. I just tried the released RC3 version of the apache.lucene libs, and I was getting errors while indexing documents. Usually, there is a write.lock file left in the index dir. I did see some e-mails on a related subject (RE: problems with last patch (obtain write.lock while deleting documents)). I think Doug fixed this on Feb 11th. I am at a point in my development of a search engine using Lucene where I need to put the new apache.lucene libs in. Are there any release notes on RC3? Also, how soon will the write.lock fix be released officially?

Thanks!
RE: Limit on number of characters before wildcard?
Yes, I am clear on how a prefix query is defined. But Dave says that a search for *ogleash would somehow work with a PrefixQuery. dog*eash or dog* would work, not *ogleash. That's where the confusion came from. Just to clarify... Thanks again.

-----Original Message-----
From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]]
Sent: Friday, January 11, 2002 12:49 PM
To: Lucene Users List
Subject: RE: Limit on number of characters before wildcard?

Just so that nobody is confused in the future, the PrefixQuery that Dave is mentioning is actually a query that lets you make searches such as 'Consult*'. See http://jguru.com/faq/view.jsp?EID=480194

Otis

--- Dave Kor [EMAIL PROTECTED] wrote:
> First character asterisk (e.g., *ogleash) is performed by PrefixQuery, which executes much faster than WildcardQuery.
>
> Dave Kor Kian Wei
>
> -----Original Message-----
> From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]]
> Sent: Friday, January 11, 2002 11:40 AM
> To: Lucene Users List
> Subject: Re: Limit on number of characters before wildcard?
>
> Hello,
>
> I haven't tested this like you did, but from looking at the query parser (the QueryParser.jj file in the Lucene distribution) it seems that only a single character is required before '*' or '?':
>
>     ...
>     | <WILDTERM: <_TERM_START_CHAR> (<_TERM_CHAR> | ( [ "*", "?" ] ))* >
>     ...
>
> _TERM_START_CHAR is defined as:
>
>     [ "a"-"z", "A"-"Z", "_", "\u0080"-"\uFFFE" ]
>
> and as you can see from the first definition above, this character can be followed by zero or more _TERM_CHAR, * or ?. This also answers your question about using an asterisk as the very first character in the query. It would be great if Doug or Brian Goetz could confirm or dispute this, so that I can add it to the Lucene FAQ at jGuru.com.
>
> Otis
>
> --- Aruna Raghavan [EMAIL PROTECTED] wrote:
> > Hi,
> > From some testing that I have done, it appears that there is a limit of 3 characters before the wildcard for wildcard queries. In other words, if the word is dogleash and I search using do*, it returns wrong results (usually only a subset), whereas if I use dog*, I get correct results. Also, a wildcard at the beginning of the keyword does not seem to be supported (*ogleash). Can someone confirm this? Is this documented anywhere?
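The consequence of the _TERM_START_CHAR definition quoted in this thread can be checked with a few lines of plain Java: a wildcard term needs one leading character from that set, so a single letter before the * is enough and a leading * is not a valid start. The class `WildcardTermCheck` is hypothetical and mirrors only the start-character set as quoted here; it does not model _TERM_CHAR or the rest of the grammar, and later versions of the parser may define the set differently.

```java
/** Hypothetical check mirroring the _TERM_START_CHAR set as quoted from QueryParser.jj. */
class WildcardTermCheck {

    /** True for a-z, A-Z, underscore, and the range U+0080 to U+FFFE. */
    static boolean isTermStartChar(char c) {
        return (c >= 'a' && c <= 'z')
                || (c >= 'A' && c <= 'Z')
                || c == '_'
                || (c >= '\u0080' && c <= '\uFFFE');
    }

    /** True when the first character of the term is a valid term start character. */
    static boolean canStartWildcardTerm(String term) {
        return !term.isEmpty() && isTermStartChar(term.charAt(0));
    }
}
```

By this check d* and dog*eash start validly, while *ogleash does not, which matches the reading of the grammar given in the quoted message.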