Re: ways to minimize index size?

2007-06-19 Thread Sebastin
When I use the StandardAnalyzer, storage size increases. How can I minimize the index store? Sebastin wrote: > > > String outgoingNumber="9198408365809"; > String incomingNumber="9840861114"; > String datesc="070601"; > String imsiNumber="444021365987"; > String callType="1"; >

RE: Lucene index performance

2007-06-19 Thread Fang_Li
Hi Andreas, I am very interested in the multiple index file index/search. Can you kindly help me with the following questions? 1) Why do you use multiple index files? How much is the performance gain for both indexing and searching? Someone reported that there is no big performance difference except the n

Re: Lucene 2.2.0 release available

2007-06-19 Thread Yonik Seeley
On 6/19/07, DM Smith <[EMAIL PROTECTED]> wrote: FYI, The announcement has not made it to the http://lucene.apache.org/ page. I just committed this. It should be viewable in about an hour. Note: I had to change the syntax slightly... I'm using forrest-0.8 now, and apparently it doesn't allow

Highlighter that works with phrase and span queries

2007-06-19 Thread Mark Miller
I have been working on extending the Highlighter with a new Scorer that correctly scores phrase and span queries. The highlighter is working great for me, but could really use some more banging on. If you have a need or an interest in a more accurate Highlighter, please give it a whirl and let

Lucene 2.2.0 release available

2007-06-19 Thread Michael Busch
Release 2.2.0 of Lucene is now available! Many new features, optimizations, and bug fixes have been added since 2.1, including "point-in-time" searching, payloads, function queries and new APIs for pre-analyzed fields. The detailed change log is at: http://svn.apache.org/repos/asf/lucene/java/

Re: Content Summarization

2007-06-19 Thread Bob Carpenter
>> Any one knows of a content summarization library. Take a look at LingPipe (http://alias-i.com/lingpipe/). I'm afraid LingPipe doesn't do content summarization. Basically, it's an AI-hard problem as you need fairly deep understanding in order not to produce word salad. Here are two relati

RE: MultiSearcher holds on to index - optimization not one segment

2007-06-19 Thread Beard, Brian
That works, thanks. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Yonik Seeley Sent: Tuesday, June 19, 2007 9:57 AM To: java-user@lucene.apache.org Subject: Re: MultiSearcher holds on to index - optimization not one segment On 6/19/07, Beard, Brian <[EM

Re: Position of matches to affect scoring

2007-06-19 Thread Steven Rowe
Hi Jes, Jesse Prabawa wrote: > The Lucene FAQ at http://wiki.apache.org/lucene-java/LuceneFAQ > mentions that the position of the matches in the text does not affect > scoring. So is there any way that I can make the position of the > matches affect scoring? For example, I want matches that occur a

Re: ways to minimize index size?

2007-06-19 Thread Sebastin
String outgoingNumber="9198408365809"; String incomingNumber="9840861114"; String datesc="070601"; String imsiNumber="444021365987"; String callType="1"; //Search Fields String contents=(outgoingNumber+" "+incomingNumber+" "+datesc+" "+imsiNumber+" "+callType ); //Displa

Re: how to search the fields in SimpleAnalyzer

2007-06-19 Thread Steven Rowe
Hi Sebastin, Sebastin wrote: > I index my document using SimpleAnalyzer(). When I search the indexed > field in the searcher class, it doesn't give me the results. Help me to sort > out this issue. > > My Code: > > test="9840836598" > test1="bch01" > > testRecords=(test+" "+test1); > > docum

Re: how to search the fields in SimpleAnalyzer

2007-06-19 Thread Sebastin
Hi Erick, thanks for your reply. Here is the searcher class to search the document: Directory fsDir = FSDirectory.getDirectory(indexDir, false); IndexSearcher is = new IndexSearcher(fsDir); QueryParser parser=new QueryParser("testRecords",new SimpleAnalyzer()); Qu

Re: ways to minimize index size?

2007-06-19 Thread Erick Erickson
Show us the code you use to index. Are you storing the fields? Omitting norms? Throwing out stop words? Best Erick On 6/19/07, Sebastin <[EMAIL PROTECTED]> wrote: Hi, can anyone give me an idea for reducing the index size? I am now getting 42% compression in my index store. I want to reduc

Re: Writing a document using two different Analyzers

2007-06-19 Thread Erick Erickson
Assuming you've got two analyzers, just see the javadoc for PerFieldAnalyzerWrapper. It's pretty easy once you have the magic class name... Best Erick On 6/19/07, Sebastin <[EMAIL PROTECTED]> wrote: Could you briefly tell me how to write two analyzers for the two fields? Paulo Silveira-
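
The idea behind the class named above is a per-field lookup with a default fallback: each field name may be mapped to its own analyzer, and any field without a mapping gets the default. The following is a minimal stand-alone sketch of that dispatch pattern only. It is not Lucene's class, it does not use Lucene's API, and every name in it (PerFieldSketch, addAnalyzer, analyze, the "id" and "body" fields) is hypothetical; an "analyzer" here is simply assumed to be a function from text to a list of tokens.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Stand-alone illustration of per-field analyzer selection; NOT Lucene's class.
public class PerFieldSketch {
    // Analyzer used for any field that has no explicit mapping.
    private final Function<String, List<String>> defaultAnalyzer;
    // Field name -> analyzer overrides.
    private final Map<String, Function<String, List<String>>> perField = new HashMap<>();

    public PerFieldSketch(Function<String, List<String>> defaultAnalyzer) {
        this.defaultAnalyzer = defaultAnalyzer;
    }

    // Register a dedicated analyzer for one field.
    public void addAnalyzer(String fieldName, Function<String, List<String>> analyzer) {
        perField.put(fieldName, analyzer);
    }

    // Pick the analyzer registered for the field, or fall back to the default.
    public List<String> analyze(String fieldName, String text) {
        return perField.getOrDefault(fieldName, defaultAnalyzer).apply(text);
    }

    public static void main(String[] args) {
        // Default: lowercase and split on whitespace.
        PerFieldSketch wrapper = new PerFieldSketch(
            text -> Arrays.asList(text.toLowerCase(Locale.ROOT).split("\\s+")));
        // Hypothetical "id" field: keep the whole value as a single token.
        wrapper.addAnalyzer("id", text -> List.of(text));

        System.out.println(wrapper.analyze("body", "Hello World"));
        System.out.println(wrapper.analyze("id", "AB 12"));
    }
}
```

With the real class, the same wrapper instance would normally be handed to both the index writer and the query parser so that each field is analyzed the same way at index time and at search time.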

Re: FW: Lucene indexing vs RDBMS insertion.

2007-06-19 Thread Erick Erickson
You still haven't described how often you need to index and why. That's critical. If you have an index that only needs to be updated every month, many of your concerns disappear. If it needs to be updated every 5 seconds, it's another matter entirely. So which is it? Best Erick On 6/18/07, Chew

RE: ways to minimize index size?

2007-06-19 Thread Sebastin
Hi, can anyone give me an idea for reducing the index size? I am now getting 42% compression in my index store; I want to reduce up to 70%. I use StandardAnalyzer to write the document. When I use SimpleAnalyzer it reduces up to 58%, but I couldn't search the document. Please help me to achieve this. Tha

Re: how to search the fields in SimpleAnalyzer

2007-06-19 Thread Erick Erickson
I recommend you get a copy of Luke (google Lucene, luke) and examine your index. Luke will also allow you to see how various queries parse. Since you haven't shown how you parse your query, anything anyone says would be a guess. But at a guess, you may be having troubles with capitalization in yo

Re: MultiSearcher holds on to index - optimization not one segment

2007-06-19 Thread Yonik Seeley
On 6/19/07, Beard, Brian <[EMAIL PROTECTED]> wrote: This may seem like a naïve question - since the garbage collection is not enforceable, is it possible to send a flag to the IndexReader to give this up once the reader is no longer needed? You call close() on the IndexReader (or the IndexSear

RE: MultiSearcher holds on to index - optimization not one segment

2007-06-19 Thread Beard, Brian
This may seem like a naïve question - since the garbage collection is not enforceable, is it possible to send a flag to the IndexReader to give this up once the reader is no longer needed? -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Yonik Seeley Sent:

Re: Writing a document using two different Analyzers

2007-06-19 Thread Sebastin
Could you briefly tell me how to write two analyzers for the two fields? Paulo Silveira-3 wrote: > > On 5/25/07, karl wettin <[EMAIL PROTECTED]> wrote: >> >> PerFieldAnalyzerWrapper >> > > that was fast! thanks! > > >> http://lucene.zones.apache.org:8080/hudson/job/Lucene-Nightly/javadoc/ >> or

Re: MultiSearcher holds on to index - optimization not one segment

2007-06-19 Thread Yonik Seeley
On 6/19/07, Beard, Brian <[EMAIL PROTECTED]> wrote: The problem I'm having is once the MultiSearcher is open, it holds on to the index file An IndexReader holds open the files... this is a feature. Not holding the file open would mean that the index would actively change while being searched.

MultiSearcher holds on to index - optimization not one segment

2007-06-19 Thread Beard, Brian
We're using a MultiSearcher to search against multiple Lucene indexes which runs inside of a web application in JBoss 4.0.4. We're also using a standalone app running in a different JBoss server which gets periodic updates from an Oracle database and updates the Lucene index. Both the searcher a

RE: Lucene for chinese search

2007-06-19 Thread Lee Li Bin
Hi, thanks guys for helping me. I forgot to use the same analyzer for searching, that's why I couldn't search for Chinese words.. :) -Original Message- From: Chris Lu [mailto:[EMAIL PROTECTED] Sent: Tuesday, June 19, 2007 4:37 AM To: java-user@lucene.apache.org Subject: Re: Lucene

Position of matches to affect scoring

2007-06-19 Thread Jesse Prabawa
Hi, The Lucene FAQ at http://wiki.apache.org/lucene-java/LuceneFAQ mentions that the position of the matches in the text does not affect scoring. So is there any way that I can make the position of the matches affect scoring? For example, I want matches that occur at the beginning to weigh more th

Re: FW: Lucene indexing vs RDBMS insertion.

2007-06-19 Thread Chris Lu
Optimized index vs un-optimized actually is very much like searching on one optimized index vs MultiSearcher on multiple optimized indexes. Each segment is like a small index. If you just add them together, they just behave like multiple indexes. If the number of segments is small, like 3, there won't b

how to search the fields in SimpleAnalyzer

2007-06-19 Thread Sebastin
Hi All, I index my document using SimpleAnalyzer(). When I search the indexed field in the searcher class, it doesn't give me the results. Help me to sort out this issue. My Code: test="9840836598" test1="bch01" testRecords=(test+" "+test1); document.add("testRecords",testRecords,Field.Store
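
One thing worth checking with the values shown above: as far as I understand SimpleAnalyzer, it splits text at every non-letter character and lowercases what remains, so digits never become part of a token. The sketch below imitates that behaviour in plain Java to show what would survive from the sample values. It is an approximation written for illustration, not Lucene's code, and the class and method names (LetterTokenSketch, tokenize) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java imitation of letter-only, lowercasing tokenization; NOT Lucene's code.
public class LetterTokenSketch {
    // Split at every non-letter character and lowercase the letters that remain.
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetter(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                // A non-letter ends the current token.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Sample value from the post: test + " " + test1
        System.out.println(tokenize("9840836598 bch01")); // prints [bch]
    }
}
```

Under this assumption the phone number produces no token at all and "bch01" is reduced to "bch", so a query for either original value, analyzed the same way, has nothing numeric left to match. That would also be consistent with a smaller index when switching to SimpleAnalyzer, since the numeric terms are simply not indexed. Inspecting the index with a tool such as Luke would confirm which terms were actually written.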