Re: Wild card and multiple keyword search

2005-07-13 Thread Erik Hatcher
On Jul 13, 2005, at 10:40 AM, Rahul D Thakare wrote: que: - The question back to you is do you want searches for simply "MAIN" to find both "MAIN LOGIC" and "MAIN PARTS"? Or should it return no documents since it's not an exact match? Ans: It should return no documents since it is not an exac

Re[2]: SIMPLE Lucene / MySQL Indexer

2005-07-13 Thread Sven Duzont
. I thought it was a Lucene user list, not a DBSight one --- sven On Wednesday, July 13, 2005 at 17:47:14, you wrote: CL> Hi, Klaus, thanks. CL> You can simply use DBSight to create the index. It's in Lucene's CL> standard format. CL> And you ca

Re: "docMap" array in SegmentMergeInfo

2005-07-13 Thread Doug Cutting
Lokesh Bajaj wrote: For a very large index where we might want to delete/replace some documents, this would require a lot of memory (for 100 million documents, this would need 381 MB of memory). Is there any reason why this was implemented this way? In practice this has not been an issue. A
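The 381 MB figure quoted above can be checked with a little arithmetic, assuming the array holds one 4-byte int per document and ignoring the small array header. The class and method names below are made up for illustration and are not part of Lucene.

```java
public class DocMapMemory {
    // Bytes needed for an int[] with one entry per document.
    // Assumption: 4 bytes per int, array object header ignored.
    static long docMapBytes(long maxDoc) {
        return maxDoc * Integer.BYTES;
    }

    public static void main(String[] args) {
        long bytes = docMapBytes(100_000_000L);
        // Convert to binary megabytes (1 MB = 1024 * 1024 bytes here).
        double megabytes = bytes / (1024.0 * 1024.0);
        System.out.println(bytes + " bytes, roughly " + (long) megabytes + " MB");
    }
}
```

100 million entries come to 400,000,000 bytes, which is about 381 MB when a megabyte is counted as 1024 * 1024 bytes.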

RE: SIMPLE Lucene / MySQL Indexer

2005-07-13 Thread Klaus Hubert
Hi Chris, I've not thought about that. I'm almost done with my program and I will give yours also a try as suggested. I have the latest (recommended) JDBC 3.1.10. But I still have to download and install Tomcat or similar to run your .war file. I think 5-24h is not that bad, since you can update t

RE: SIMPLE Lucene / MySQL Indexer

2005-07-13 Thread Klaus Hubert
Yes, it works with breakpoints and so on, but the current line is never highlighted. All I see of where it is, is the line number in the debug window. But you are right, this is no Java forum and I apologize for the beginner's questions. -Original Message- From: Karthik N S [mailto:[EMAIL PROTECTED]

RE: SIMPLE Lucene / MySQL Indexer

2005-07-13 Thread Klaus Hubert
Hi, Thank you all so much for the crash course in Java for Beginners. Indeed the last time I used Java was 1996... Lol. But I'm now getting very close. It is all about the right declarations of classes and includes at the correct location. I have almost done it. I will publish my code to the commu

"docMap" array in SegmentMergeInfo

2005-07-13 Thread Lokesh Bajaj
I noticed the following code that builds the "docMap" array in SegmentMergeInfo.java for the case where some documents might be deleted from an index: // build array which maps document numbers around deletions if (reader.hasDeletions()) { int maxDoc = reader.maxDoc(); docM
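The quoted code is cut off above, so the following is only a sketch of the general idea of mapping document numbers around deletions, not the actual SegmentMergeInfo code. It assumes deleted documents are flagged in a `java.util.BitSet` and marked with -1 in the map; the class name, method name, and the -1 convention are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.BitSet;

public class DocMapSketch {
    // Map each old document number to its new number once deleted
    // documents are squeezed out. Deleted slots get -1 (an assumed marker).
    static int[] buildDocMap(BitSet deleted, int maxDoc) {
        int[] docMap = new int[maxDoc];
        int next = 0; // next free document number after compaction
        for (int i = 0; i < maxDoc; i++) {
            docMap[i] = deleted.get(i) ? -1 : next++;
        }
        return docMap;
    }

    public static void main(String[] args) {
        BitSet deleted = new BitSet();
        deleted.set(1);
        deleted.set(3);
        // prints [0, -1, 1, -1, 2]
        System.out.println(Arrays.toString(buildDocMap(deleted, 5)));
    }
}
```

The array has one entry per document up to maxDoc, which is where the memory cost discussed in this thread comes from.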

Re: SIMPLE Lucene / MySQL Indexer

2005-07-13 Thread Chris Lu
Hi, Klaus, thanks. You can simply use DBSight to create the index. It's in Lucene's standard format. And you can control index field type, analyzers, how to select data from database, number of java threads, etc, just by web UI. No coding is needed. We have a user who didn't know Lucene at all, an

Re: OutOfMemoryError

2005-07-13 Thread Ian Lea
Might be interesting to know if it crashed on 2 docs if you ran it with a heap size of 512 MB. I guess you've already tried with default merge values. Shouldn't need to optimize after every 100 docs. JDK 1.3 is pretty ancient - can you use 1.5? I'd try it with a larger heap size, and then look

OutOfMemoryError

2005-07-13 Thread Lasse L
Hi, I can see that this has been up before, but I still hope to get some advice based on my specific environment. I index some documents with 26 fields in them. The size 1 indexed documents is 4 MB, so it shouldn't be overwhelming amounts of data compared to what I have heard Lucene can do. N

Re: Re: Wild card and multiple keyword search

2005-07-13 Thread Rahul D Thakare
Hi Erik, Thanks for the reply. Here are the answers to your queries que: - The question back to you is do you want searches for simply "MAIN" to find both "MAIN LOGIC" and "MAIN PARTS"? Or should it return no documents since it's not an exact match? Ans: It should return no documents

RE: SIMPLE Lucene / MySQL Indexer

2005-07-13 Thread Karthik N S
Hi, Apologies. Interesting, but this is not the forum to discuss how to debug with Eclipse, so I suggest you use the Help tab in the Eclipse IDE. Hint: first set the breakpoint on the code and then use the Debug tab under Run. This is a Lucene forum, guys. Karthik -Or

Re: Wild card and multiple keyword search

2005-07-13 Thread Erik Hatcher
On Jul 13, 2005, at 8:18 AM, Rahul D Thakare wrote: We are using doc.add(Field.Text("keywords",keywords)); to add the keywords to the document, where keywords is comma separated keywords string. If the text is already comma separated and that is the level at which you want things tokenized, t
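The reply is cut off above, so what follows is only one possible reading of it: split the comma-separated string yourself and treat each phrase as a single exact term, so that "MAIN" alone does not match "MAIN BOARD". The sketch uses only the Java standard library; the class and method names are made up, and a plain set lookup stands in for an index of untokenized terms.

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class KeywordSplitSketch {
    // Split on commas and trim, keeping multi-word phrases intact.
    static Set<String> splitKeywords(String keywords) {
        Set<String> terms = new LinkedHashSet<>();
        for (String part : keywords.split(",")) {
            String term = part.trim();
            if (!term.isEmpty()) {
                terms.add(term);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        Set<String> terms = splitKeywords("MAIN BOARD, MAIN LOGIC ,MAIN PARTS");
        System.out.println(terms.contains("MAIN BOARD")); // prints true
        System.out.println(terms.contains("MAIN"));       // prints false
    }
}
```

Each resulting phrase could then be indexed as its own untokenized term rather than passed through a word-splitting analyzer.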

Re: Wild card and multiple keyword search

2005-07-13 Thread Ian Lea
Sounds to me that all you need is to AND rather than OR your search terms. QueryParser qp = new QueryParser("keywords", analyzer); qp.setOperator(QueryParser.DEFAULT_OPERATOR_AND); Query q = qp.parse(words); where analyzer is just the standard one. Or search for +MAIN +BO
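The difference between AND and OR matching can be illustrated without Lucene. This is a standard-library sketch under the assumption that keywords were split into whitespace-separated tokens; the class and method names are hypothetical and it does not use QueryParser.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class AndOrSketch {
    // Split text into whitespace-separated tokens.
    static Set<String> tokens(String text) {
        return new HashSet<>(Arrays.asList(text.trim().split("\\s+")));
    }

    // AND semantics: every query token must appear in the document.
    static boolean matchesAll(String doc, String query) {
        return tokens(doc).containsAll(tokens(query));
    }

    // OR semantics: at least one query token must appear in the document.
    static boolean matchesAny(String doc, String query) {
        Set<String> docTokens = tokens(doc);
        for (String term : tokens(query)) {
            if (docTokens.contains(term)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(matchesAny("MAIN LOGIC", "MAIN BOARD")); // prints true
        System.out.println(matchesAll("MAIN LOGIC", "MAIN BOARD")); // prints false
        System.out.println(matchesAll("MAIN BOARD", "MAIN BOARD")); // prints true
    }
}
```

Note that in this sketch a query for "MAIN" alone still matches "MAIN BOARD" under AND semantics, so AND-ing the terms narrows results but is not the same as an exact phrase match.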

RE: SIMPLE Lucene / MySQL Indexer

2005-07-13 Thread Xing Li
Klaus, Just a few days ago I couldn't even remember how to compile Java code. Last time I touched Java was like 2001. Don't worry, Lucene is extremely easy, once you know a bit of fundamental Java. It's no different than any other language. Just syntax. I recommend Java from Deitel & Deitel. Fell in l

RE: SIMPLE Lucene / MySQL Indexer

2005-07-13 Thread Klaus Hubert
Hi Xing, I have the book and as I wrote in my initial message I managed to create the sample index as well as to read MySQL. But I seem to be unable to combine those programs :-( I'm very new to Java and I haven't found a nice debugger so far to go step by step through my code. I will try t

Wild card and multiple keyword search

2005-07-13 Thread Rahul D Thakare
Hi, We are using doc.add(Field.Text("keywords",keywords)); to add the keywords to the document, where keywords is comma separated keywords string. Lucene seems to tokenize the keywords with multiple words like (MAIN BOARD) as different keywords (i.e. as MAIN and BOARD). Tokenization is based on

RE: SIMPLE Lucene / MySQL Indexer

2005-07-13 Thread Xing Li
Don't make the mistake of complicating the task. Just read straight from MySQL into Lucene via Java. There is no benefit in exporting data to XML just to regrab the data back into Lucene. Get the Lucene in Action book if you haven't, because all the samples there are real-world practical. Are yo

RE: SIMPLE Lucene / MySQL Indexer

2005-07-13 Thread Klaus Hubert
Hi Ian, That's something I'm looking for. Right, a simple source code which reads a database and adds the fields to the index. What I've found also so far is another solution at http://www-128.ibm.com/developerworks/java/library/j-lucene/. First step is to export my MySQL database in simple XML an

RE: SIMPLE Lucene / MySQL Indexer

2005-07-13 Thread Klaus Hubert
Hi Nader, I downloaded Eclipse and also the Hibernate plugin and I really like this IDE. It seems to have lots of power. What I haven't found so far is a debugger where I can go line by line through the code to eventually see errors. It runs and I get error messages at the line where the problem ar

RE: SIMPLE Lucene / MySQL Indexer

2005-07-13 Thread Klaus Hubert
Hi Chris, this is indeed a cool application, but I just need to create the index. I definitely will look into your file and see if it makes my life easier. Can you give any details on how long it took to create such a huge index? What experience do you have with the slowest search? Does it go over 1 se

Re: SIMPLE Lucene / MySQL Indexer

2005-07-13 Thread Nader Henein
Also Hibernate: you can use Eclipse as an IDE, with the Hibernator plugin to create objects cleanly from your MySQL database, and then a few lines will fetch an object which could then be passed to Lucene for indexing. Nader Henein Klaus Hubert wrote: Hi, I played with several search en

Re: SIMPLE Lucene / MySQL Indexer

2005-07-13 Thread Ian Lea
Something like this? IndexWriter iw = whatever ResultSet rs = whatever while (rs.next()) { Document ldoc = new Document(); ldoc.add(Field.Text("f1", rs.getString("f1"))); ldoc.add(Field.UnStored("f2", rs.getString("f2"))); ldoc.add(Field.Keyword("f3", rs.getString("f3"))); ... iw.a