RE: Pagination

2007-07-02 Thread Lee Li Bin
Hi, I still have no idea how to get this done. Can you give me some details? The web application is in JSP, by the way. Thanks a lot. Regards, Lee Li Bin -Original Message- From: Chris Lu [mailto:[EMAIL PROTECTED] Sent: Saturday, June 30, 2007 2:21 AM To: java-user@lucene.apache.org Subject:

Re: Pagination

2007-07-02 Thread mark harwood
The Hits class is OK but can be inefficient due to re-running the query unnecessarily. The class below illustrates how to efficiently retrieve a particular page of results and lends itself to webapps where you don't want to retain server side state (i.e. a Hits object) for each client. It
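The idea of fetching one page without keeping server-side state can be sketched roughly as follows. This is not Mark's class, only a minimal illustration in plain Java: the names PageSketch, Hit and page are hypothetical, Hit merely stands in for a (document id, score) pair such as Lucene's ScoreDoc, and the list of hits is assumed to come from whatever collector runs the query. For a zero-based page p of size n it keeps only the best (p + 1) * n hits in a bounded heap and returns the last n of them.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical helper, not part of Lucene.
final class PageSketch {

    /** A (docId, score) pair; a stand-in for a collected search hit. */
    static final class Hit {
        final int docId;
        final float score;

        Hit(int docId, float score) {
            this.docId = docId;
            this.score = score;
        }
    }

    /** Returns the hits on the given zero-based page, best score first. */
    static List<Hit> page(List<Hit> allHits, int pageIndex, int pageSize) {
        if (pageIndex < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("pageIndex must be >= 0 and pageSize > 0");
        }
        int needed = (pageIndex + 1) * pageSize;

        // Min-heap on score: the weakest retained hit is always at the head,
        // so it can be evicted as soon as the heap grows past 'needed'.
        PriorityQueue<Hit> heap =
                new PriorityQueue<>(Comparator.comparingDouble((Hit h) -> h.score));
        for (Hit h : allHits) {
            heap.add(h);
            if (heap.size() > needed) {
                heap.poll();
            }
        }

        // Drain weakest-first, inserting at the front to get best-first order.
        List<Hit> ranked = new ArrayList<>();
        while (!heap.isEmpty()) {
            ranked.add(0, heap.poll());
        }

        // Slice out the requested page; pages past the end come back empty.
        int from = Math.min(pageIndex * pageSize, ranked.size());
        int to = Math.min(needed, ranked.size());
        return new ArrayList<>(ranked.subList(from, to));
    }
}
```

Because nothing is cached between requests, a web page only needs to pass the page index and page size back to the server; the trade-off is that the query is re-run for each page, with memory bounded by (p + 1) * n hits.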

Re: Exchange/PST/Mail parsing

2007-07-02 Thread Nick Burch
On Sun, 1 Jul 2007, Grant Ingersoll wrote: Anyone have any recommendations on a decent, open (doesn't have to be Apache license, but would prefer non-GPL if possible), extractor for MS Exchange and/or PST files? There has been an offer to contribute a PST parser to Apache POI. We're hoping

RE: Geneology, nicknames, levenstein, soundex/metaphone, etc

2007-07-02 Thread Darren Hartford
Thank you for the link to the previous thread, lot of information there! *Synonym use of nicknames - that sounds quite feasible. Do you specifically mean the WordNet module in the Sandbox, or something different? -Original Message- From: Grant Ingersoll [mailto:[EMAIL PROTECTED]

Re: Geneology, nicknames, levenstein, soundex/metaphone, etc

2007-07-02 Thread Grant Ingersoll
On Jul 2, 2007, at 8:07 AM, Darren Hartford wrote: Thank you for the link to the previous thread, lot of information there! *Synonym use of nicknames - that sounds quite feasible. Do you specifically mean the WordNet module in the Sandbox, or something different? No, I think I was

Re: Exchange/PST/Mail parsing

2007-07-02 Thread Christiaan Fluit
Hello Grant (cc-ing aperture-devel), I am one of the Aperture admins, I can tell you a bit more about Aperture's mail facilities. Short intro: Aperture is a framework for crawling and full-text and metadata extraction of a growing number of sources and file formats. We try to select the

Auto Slop

2007-07-02 Thread Walt Stoneburner
I just ran into an interesting problem today, and wanted to know if it was my understanding or Lucene that was out of whack -- right now I'm leaning toward a fault between the chair and the keyboard. I attempted to do a simple phrase query using the StandardAnalyzer: United States Against my

Re: Auto Slop

2007-07-02 Thread Mark Miller
Examine your indexes and analyzers. The default slop is 0, which means allow 0 terms between the terms in the phrase. That would be an exact match. A slop of 1 is not the default and would allow a term movement of one position to match the phrase. - Mark Walt Stoneburner wrote: I just ran
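To make the slop rule concrete, here is a simplified sketch in plain Java. It is not Lucene's matcher: SlopSketch and matchesInOrder are hypothetical names, the check covers only a two-term phrase whose terms appear in order, and it assumes every token occupies exactly one position (no gaps left by removed stop words). Lucene's real sloppy matching also accounts for reordered terms, which this sketch ignores.

```java
import java.util.List;

// Hypothetical illustration, not part of Lucene.
final class SlopSketch {

    /**
     * Returns true if 'first' is followed by 'second' with at most
     * 'slop' other tokens between them. Slop 0 means strictly adjacent.
     */
    static boolean matchesInOrder(List<String> tokens, String first, String second, int slop) {
        for (int i = 0; i < tokens.size(); i++) {
            if (!tokens.get(i).equals(first)) {
                continue;
            }
            for (int j = i + 1; j < tokens.size(); j++) {
                // j - i - 1 is the number of tokens sitting between the two terms.
                if (tokens.get(j).equals(second) && (j - i - 1) <= slop) {
                    return true;
                }
            }
        }
        return false;
    }
}
```

Under these assumptions, the tokens "united", "states" match at slop 0, while "united", "nations", "states" need a slop of at least 1.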

RE: Auto Slop

2007-07-02 Thread Ard Schrijvers
I just ran into an interesting problem today, and wanted to know if it was my understanding or Lucene that was out of whack -- right now I'm leaning toward a fault between the chair and the keyboard. I attempted to do a simple phrase query using the StandardAnalyzer: United States And you

highlighting phrase query

2007-07-02 Thread sandeep chawla
Hi All, I am developing a search tool using Lucene 2.1, and I have a requirement to highlight query words in the results. Lucene-highlighter 2.1 doesn't work well when highlighting a phrase query. For example, if I have the query string lucene Java, it highlights not only occurrences

Re: Pagination

2007-07-02 Thread Alixandre Santana
Mark, The ScoreDoc[] contains only the IDs of each Lucene document. What would be the best way of getting the entire (Lucene) document? Should I do a new search with the ID retrieved by hpc.getScores() - (searcher.doc(idDoc))? Thanks. Alixandre On 7/2/07, mark harwood [EMAIL PROTECTED]

Modify search results

2007-07-02 Thread Robert Mullin
I have managed to download and install Lucene. In addition, I have reached the point at which I am able to generate an index and run a search. The search returns a 'raw' list of the HTML pages in which my search term occurs. . . . chapter17, chapter18, etc. Question: how do I go about

Lucene index in memcache

2007-07-02 Thread Cathy Murphy
Is there a way to store a Lucene index in memcache? During high traffic, search becomes very slow. :( -- Cathy www.nachofoto.com

Re: Lucene index in memcache

2007-07-02 Thread Erick Erickson
You can always read the current index into a RAMDirectory, but I really wonder if that will make much of a difference, as your operating system should be taking care of this kind of thing for you. How big is your index? What kind of performance are you seeing? What else is running on that box? I'd do some

Re: highlighting phrase query

2007-07-02 Thread Mark Miller
There has been a lot of Highlighter discussion lately, but just to try and sum up the state of Highlighting in the Lucene world: There are four Highlighter implementations that I know of. From what I can tell, only the original Contrib Highlighter has received sustained active development by

Re: Lucene index in memcache

2007-07-02 Thread Chris Hostetter
: Is there a way to store lucene index in memcache. During high traffic search : becomes very slow. :( http://people.apache.org/~hossman/#xyproblem Your question appears to be an XY Problem ... that is: you are dealing with X, you are assuming Y will help you, and you are asking about Y without

Reusing Document Objects (was Auto Slop)

2007-07-02 Thread Walt Stoneburner
If I create a Document object, can I pass it to multiple index writers without harm? Or does the process of being handed to an IndexWriter somehow mutate the state of the Document object, say during tokenizing, in a way that would cause its reuse with a totally separate index to cause problems

RE: highlighting phrase query

2007-07-02 Thread Renaud Waldura
Mark: Thanks a million for this comprehensive analysis. This is going straight to my manager. :) --Renaud -Original Message- From: Mark Miller [mailto:[EMAIL PROTECTED] Sent: Monday, July 02, 2007 2:11 PM To: java-user@lucene.apache.org Subject: Re: highlighting phrase query There

multi-term query weighting

2007-07-02 Thread Tim Sturge
I have an index with two different sources of information, one small but of high quality (call it title), and one large, but of lower quality (call it body). I give boosts to certain documents related to their popularity (this is very similar to what one would do indexing the web). The

RE: Pagination

2007-07-02 Thread Lee Li Bin
Hi, Thanks Mark! I do have the same question as Alixandre. How do I get the content of the document instead of the document id? Thanks. Regards, Lee Li Bin -Original Message- From: Alixandre Santana [mailto:[EMAIL PROTECTED] Sent: Tuesday, July 03, 2007 12:55 AM To:

RE: Pagination

2007-07-02 Thread Lee Li Bin
Hi Mark, How do I display results on the second page? I managed to display results on one page using your code. Regards, Lee Li Bin -Original Message- From: Alixandre Santana [mailto:[EMAIL PROTECTED] Sent: Tuesday, July 03, 2007 12:55 AM To: java-user@lucene.apache.org Subject: Re:

Re: highlighting phrase query

2007-07-02 Thread sandeep chawla
Thanks a lot, Mark. Has anyone used LUCENE-794? How stable is it? Is it widely used in industry? These are some of my questions :) Thanks Sandeep On 03/07/07, Renaud Waldura [EMAIL PROTECTED] wrote: Mark: Thanks a million for this comprehensive analysis. This is going straight to my