Re: Multiuser environments
On Monday 14 July 2003 08:52, Guilherme Barile wrote:
> Hi
> I'm writing a web application which will index files using
> textmining to extract text and lucene to store it. I do have the
> following implementation questions:
>
> 1) Only one user can write to an index at each time. How are you people
> dealing with this ? Maybe some kind of connection pooling ?

Two obvious candidates are locking the bottleneck methods so that index writing happens in a critical section, or having a background thread that does the reindexing while other threads add requests to a queue. In the CMS I'm working on we are doing the latter, so as not to block the actual request threads (which could happen with the first approach); adding/deleting documents is done as post-processing when documents are created/edited/deleted.

In either case you usually have a singleton instance that represents the search engine functionality (assuming a single index), and from there on it's reasonably easy to reuse IndexReader as necessary.

-+ Tatu +-

To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
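The queue-based approach described in this reply could look roughly like the following. This is only a minimal sketch in plain Java, not code from the CMS mentioned above: the class name `IndexQueue` and the `IndexTask` interface are hypothetical, and the task body is where a real application would call Lucene to add or delete a document.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical single-writer service: request threads enqueue, one worker applies changes. */
final class IndexQueue {
    /** One pending index change; a real task would call into Lucene here (assumption). */
    public interface IndexTask {
        void apply() throws Exception;
    }

    // Sentinel task that tells the worker thread to stop.
    private static final IndexTask STOP = () -> { };

    private final BlockingQueue<IndexTask> queue = new LinkedBlockingQueue<>();
    private final Thread worker;

    IndexQueue() {
        worker = new Thread(() -> {
            try {
                while (true) {
                    IndexTask task = queue.take();   // blocks until a request arrives
                    if (task == STOP) {
                        return;
                    }
                    try {
                        task.apply();                // only this thread ever touches the index
                    } catch (Exception e) {
                        System.err.println("index task failed: " + e);
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "index-writer");
        worker.start();
    }

    /** Called from request threads; returns immediately without waiting for the index. */
    void submit(IndexTask task) {
        queue.add(task);
    }

    /** Processes everything queued so far, then stops the worker. */
    void shutdown() {
        queue.add(STOP);
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

A web application would typically hold one such instance per index (the singleton mentioned above), so that all writes are serialized through the same worker thread and request threads never wait on the index lock.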
Re: Delete files
On Mon, 14 Jul 2003 07:11 pm, [EMAIL PROTECTED] wrote:
> Hi, I am new in Lucene. I have a problem with my code. Somebody can help me
> why I can't delete some files.Maybe I missing something. Thanks in advance.

You have an IndexWriter open while trying to delete a document. Use IndexReader to delete the document and then open an IndexWriter to optimize.

> Regards,
>
> Michel

victor
RE: Files getting deleted when optimize is killed?
Upon further examination, what I found is this:

- Killing the process while optimize() is still working does NOT cause the index files to be deleted, HOWEVER --
- Once the index is opened again by a new process (now apparently in an unstable state due to the incomplete optimize()), at that time all existing files are deleted and only a file called "segments" remains.

-----Original Message-----
From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]
Sent: Saturday, July 12, 2003 7:06 AM
To: Lucene Users List
Subject: Re: Files getting deleted when optimize is killed?

--- Steve Rajavuori <[EMAIL PROTECTED]> wrote:
> I've had a problem on several occasions where my entire index is deleted --
> that is, EVERY file (except 'segments') is gone. There were many users on
> the system each time, so its a little hard to tell for sure what was going
> on, but my theory is this:
>
> My code will automatically call optimize( ) periodically. Because the index
> is very large, it can take a long time. It looks like an administrator may
> have killed my process, and its possible that it was killed while an
> optimize( ) was in progress.
>
> I have two questions:
>
> 1) Does anyone know if killing an optimize( ) in progress could wipe out all
> files like this? (New index created in temporary files that were not saved
> properly, while old index files were already deleted???)

I highly doubt it.

> 2) Does anyone know of any other way all files in an index could be
> inadvertently deleted (e.g. through killing a process)? For example, if you
> kill the process during an 'add' would that cause all files to be deleted?

Same as above.
You can create an artificial, large index for testing purposes. Call optimize once in a while, and then kill the process. I don't think Lucene will remove your files.

Otis
Re: Luke - Lucene Index Browser
Scott Ganyo wrote:

> Nifty cool! I'm gonna like this, I can tell already!
>
> I'm having a really hard time actually using Luke, though, as all the window panes and table columns are apparently of fixed size. Do you think you could throw in the ability to resize the various window panes and table columns? This would make the tool truly useful. Pretty please? :)

Well, you can resize the main window.. :-)

Regarding column sizes: that's a limitation of the GUI toolkit (soon to be fixed) - if you can wait patiently a couple of weeks for the new release of that toolkit, I can add this as well...

In any case, if you're referring to the "Search" panel, then you can always double-click on one of the search results, and it will be displayed in the "Documents" panel, where you can not only see all the fields, but also copy them to the clipboard...

--
Best regards,
Andrzej Bialecki

Software Architect, System Integration Specialist
CEN/ISSS EC Workshop, ECIMF project chair
EU FP6 E-Commerce Expert/Evaluator
FreeBSD developer (http://www.freebsd.org)
Re: Luke - Lucene Index Browser
Nifty cool! I'm gonna like this, I can tell already!

I'm having a really hard time actually using Luke, though, as all the window panes and table columns are apparently of fixed size. Do you think you could throw in the ability to resize the various window panes and table columns? This would make the tool truly useful. Pretty please? :)

Thanks,
Scott

Andrzej Bialecki wrote:

> Dear Lucene Users,
>
> Luke is a diagnostic tool for Lucene (http://jakarta.apache.org/lucene) indexes. It enables you to browse documents in existing indexes, perform queries, navigate through terms, optimize indexes and more.
>
> Please go to http://www.getopt.org/luke and give it a try. A Java WebStart version will be available soon.
Luke - Lucene Index Browser
Dear Lucene Users,

Luke is a diagnostic tool for Lucene (http://jakarta.apache.org/lucene) indexes. It enables you to browse documents in existing indexes, perform queries, navigate through terms, optimize indexes and more.

Please go to http://www.getopt.org/luke and give it a try. A Java WebStart version will be available soon.

--
Best regards,
Andrzej Bialecki

Software Architect, System Integration Specialist
CEN/ISSS EC Workshop, ECIMF project chair
EU FP6 E-Commerce Expert/Evaluator
FreeBSD developer (http://www.freebsd.org)
Multiuser environments
Hi

I'm writing a web application which will index files using textmining to extract text and lucene to store it. I have the following implementation questions:

1) Only one user can write to an index at a time. How are you people dealing with this? Maybe some kind of connection pooling?
2) OutOfMemory issues - I've read about many on the list; which workarounds are you using?

Thanks in advance
gui
Field.Text(string, string) and Queryparser
Hello

When I use QueryParser.parse(String query, String field, Analyzer analyzer) and have added the field with the type Field.Text(string, string), I can't search in a specific field with a query like

    component:call

It can't find the word call in the field component. But if I use Field.Text(string, Reader) instead, the query works.

Another problem is that I have to use Field.Text(string, string) if I want to know whether the hits contain a field with the name component. For that I use this code:

    Vector component = new Vector();
    if(doc.getField("component")!=null){
        component.addElement(doc);
    }

To use getField(string) I have to use Field.Text(string, string).

When I display the hits I want to divide them into categories according to the field name. If I have fields called component, interface and datatypes, I want to display the hits like:

    component
        hit1
        hit2
        hit3
    interface
        hit4
    data type
        hit5

But I also want to search in a specific field at the same time.

Help me, please. I hope that I have explained my problem so you can understand it.

Thanks
/Michelle
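The grouping step asked about in this message can be sketched independently of Lucene. The example below is only an illustration under stated assumptions: each hit is modelled as a plain map from stored field name to value (a stand-in for a Lucene `Document`), and `HitGrouper` and `groupByField` are hypothetical names. In Lucene 1.x only stored fields come back with a hit, which is why `getField` finds nothing for fields added with the Reader variant of `Field.Text` (indexed and tokenized, but not stored).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: sorts hits into categories by the field they carry. */
final class HitGrouper {
    /**
     * Puts each hit under the first category field it contains.
     * A hit here is a map of stored field name to value (stand-in for a Lucene Document).
     * Hits that carry none of the category fields are left out.
     */
    static Map<String, List<Map<String, String>>> groupByField(
            List<Map<String, String>> hits, List<String> categories) {
        // LinkedHashMap keeps the categories in display order.
        Map<String, List<Map<String, String>>> groups = new LinkedHashMap<>();
        for (String category : categories) {
            groups.put(category, new ArrayList<>());
        }
        for (Map<String, String> hit : hits) {
            for (String category : categories) {
                if (hit.containsKey(category)) {
                    groups.get(category).add(hit);
                    break;
                }
            }
        }
        return groups;
    }
}
```

With real Lucene hits the `containsKey` test would correspond to the `doc.getField("component") != null` check from the message, applied once per category name.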
RE: HELP in QueyParsing !!
Thanks Victor. I'll look into your earlier postings for the solution. But I was wondering: there might be many more scenarios that a user might search for.

-----Original Message-----
From: Victor Hadianto [mailto:[EMAIL PROTECTED]
Sent: Monday, July 14, 2003 2:02 PM
To: Lucene Users List
Subject: Re: HELP in QueyParsing !!

> Input: QueryCreated Remarks
> c\+\+ c (Escape character not working)

The StandardTokenizer and QueryParser will drop the ++ sign. This problem is similar to the recent thread. Search the archive for the following string:

'-' characer not interpreted correctly in field names

You may be able to implement a similar solution to the one that I've posted. Actually your query got me interested: I've tried my solution for c-- and the -- signs are dropped. This is because I define DASHESWORD as

| ("-" )+ >

This will search for t-shirt, but not tshirt-. Yet another QueryParser peculiarity :)

If you absolutely have to search for c++ then I suggest you define another token which encompasses all alphanumeric words and the plus sign. For example (modify StandardTokenizer.jj):

|"+")+ >

add the line:

token =

in the next() method. This may work.

> c++- (Parser throws an exception) [NOTE-1]

As expected.

> *c - (throws an exception - [NOTE-2]

There have been a number of discussions on this subject; search the mailing list for more information.

> Does that mean that the program should taken care of validating the
> User input and then pass the query string to QueryParser?

Depends on how you look at it. QueryParser will throw ParseException if it has parsing issues; you can in some way treat this as the validation.

HTH,
victor
Delete files
Hi,

I am new to Lucene. I have a problem with my code. Can somebody help me understand why I can't delete some files? Maybe I'm missing something. Thanks in advance.

Regards,
Michel

Here are the two programs that I use for indexing files and for deleting files. The error message is the following:

java.io.IOException: Index locked for write: [EMAIL PROTECTED]:\yo\write.lock
    at org.apache.lucene.index.IndexReader.delete(Unknown Source)
    at indice.deletefile.main(deletefile.java:15)
Exception in thread "main"

package indice;

import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.analysis.SimpleAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import java.io.*;

public class IndexarFile {
    public static void main(String[] args) throws Exception{
        String indexPath = "c:\\yo";
        IndexWriter writer;
        String f = "c:\\PROCEDURES\\BCNDMPRD.xls";
        writer = new IndexWriter(indexPath, new SimpleAnalyzer(), false);
        //for (int i=0; i
Re: HELP in QueyParsing !!
> Input: QueryCreated Remarks
> c\+\+ c (Escape character not working)

The StandardTokenizer and QueryParser will drop the ++ sign. This problem is similar to the recent thread. Search the archive for the following string:

'-' characer not interpreted correctly in field names

You may be able to implement a similar solution to the one that I've posted. Actually your query got me interested: I've tried my solution for c-- and the -- signs are dropped. This is because I define DASHESWORD as

| ("-" )+ >

This will search for t-shirt, but not tshirt-. Yet another QueryParser peculiarity :)

If you absolutely have to search for c++ then I suggest you define another token which encompasses all alphanumeric words and the plus sign. For example (modify StandardTokenizer.jj):

|"+")+ >

add the line:

token =

in the next() method. This may work.

> c++- (Parser throws an exception) [NOTE-1]

As expected.

> *c - (throws an exception - [NOTE-2]

There have been a number of discussions on this subject; search the mailing list for more information.

> Does that mean that the program should taken care of validating the
> User input and then pass the query string to QueryParser?

Depends on how you look at it. QueryParser will throw ParseException if it has parsing issues; you can in some way treat this as the validation.

HTH,
victor
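For anyone who prefers to reject obviously bad input before it reaches QueryParser, a pre-check along these lines might help. It is only a sketch covering the cases reported in this thread (a leading wildcard, an unbalanced double quote, an empty phrase); `QueryPrecheck` and `problem` are hypothetical names, the list of checks is not exhaustive, and the parse call should still be wrapped in a handler for ParseException.

```java
/** Hypothetical pre-check for raw user input, run before handing it to QueryParser. */
final class QueryPrecheck {
    /** Returns null if the input passes these checks, otherwise a short reason. */
    static String problem(String input) {
        if (input == null || input.trim().isEmpty()) {
            return "empty query";
        }
        String q = input.trim();

        // Observed in the thread: a wildcard as the first character makes the parser throw.
        char first = q.charAt(0);
        if (first == '*' || first == '?') {
            return "wildcard cannot be the first character";
        }

        boolean inPhrase = false;
        int phraseStart = -1;
        for (int i = 0; i < q.length(); i++) {
            char ch = q.charAt(i);
            if (ch == '\\') {
                i++;            // skip the escaped character
                continue;
            }
            if (ch == '"') {
                if (!inPhrase) {
                    inPhrase = true;
                    phraseStart = i;
                } else {
                    // Closing quote: reject phrases with nothing between the quotes.
                    if (q.substring(phraseStart + 1, i).trim().isEmpty()) {
                        return "empty phrase";
                    }
                    inPhrase = false;
                }
            }
        }
        if (inPhrase) {
            return "unbalanced double quote";
        }
        return null;
    }
}
```

Note that passing this check does not make `c++` searchable: as explained above, the tokenizer still drops the plus signs unless the grammar is changed.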
HELP in QueyParsing !!
Hi

I need some help with query parsing. There are a few things that don't seem to work as I expected. Have a look at the code at the end before reading my observations.

The documents contain the following information:

D1 = c++ hello bharat
D2 = c hello sharat
D3 = hello bharat

Observations

Input: QueryCreated Remarks
c\+\+ c (Escape character not working)
c++- (Parser throws an exception) [NOTE-1]
c* c* (Wild card works perfectly fine)
*c - (throws an exception - [NOTE-2]
(org.apache.lucene.queryParser.TokenMgrError:)
"c - (throws an exception - [NOTE-3]
Hello "" - (throws an exception)

[NOTE-1] : - ( ) { } ! [ ] etc characters behave in the same manner as "+" shown above.
[NOTE-2] : Looks like a wildcard cannot be the first character of the query.
[NOTE-3] : I guess this validation can be done after accepting user input.

My Comments/Questions

Does that mean that the program should take care of validating the user input and then pass the query string to QueryParser? If yes, I guess there might be some more validations that should be done that I have missed. Can anyone throw some light on the validations that the program should take care of?
Code

import java.io.IOException;
import java.io.BufferedReader;
import java.io.InputStreamReader;

import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.analysis.*;
import org.apache.lucene.document.*;
import org.apache.lucene.index.*;
import org.apache.lucene.search.*;

public class TestQueryParser {
    public static void main(String[] argv) {
        try {
            IndexWriter writer = new IndexWriter("indexbbs", new StandardAnalyzer(), true);

            Document d1 = new Document();
            d1.add(Field.Text("f1", "c++ hello bharat"));
            writer.addDocument(d1);

            Document d2 = new Document();
            d2.add(Field.Text("f1", "c hello sharat"));
            writer.addDocument(d2);

            Document d3 = new Document();
            d3.add(Field.Text("f1", "hello bharat"));
            writer.addDocument(d3);

            writer.optimize();
            writer.close();

            String qString = "";
            try {
                BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
                System.out.print("Input for f1: ");
                qString = in.readLine();
            } catch(Exception e) {
                System.out.println("Exiting..." + e.getMessage());
                return;
            }
            System.out.println("");

            Searcher searcher = new IndexSearcher("indexbbs");
            Analyzer analyzer = new StandardAnalyzer();
            QueryParser qp = new QueryParser("f1", analyzer);
            Query query = qp.parse(qString);
            System.out.println("QueryInput:" + qString);
            System.out.println("QueryCreated:" + query.toString("f1"));

            Hits hits = searcher.search(query);
            for (int i=0; i