Re: Keyphrase Extraction

2007-05-06 Thread Dawid Weiss
You could also try splitting the document into paragraphs and running Carrot2's Lingo algorithm (www.carrot2.org) at the paragraph level to extract clusters. The labelling routine in Lingo should extract 'key' phrases; this analysis is heavily frequency-based, but... you know, you may want to try it.
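
To make the "paragraph-level, frequency-based" idea concrete, here is a rough sketch in plain Java. It is not Lingo and does not use the Carrot2 API; it only splits a document on blank lines and counts in how many paragraphs each two-word phrase occurs. The class and method names (ParagraphPhraseCounter, splitParagraphs, paragraphFrequency) are made up for illustration, and the tokenizer is a deliberate simplification that treats anything outside a-z and 0-9 as a separator.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Carrot2 or Lucene.
public class ParagraphPhraseCounter {

    /** Splits a document into paragraphs on blank lines. */
    static List<String> splitParagraphs(String document) {
        List<String> paragraphs = new ArrayList<>();
        for (String block : document.split("\\n\\s*\\n")) {
            String trimmed = block.trim();
            if (!trimmed.isEmpty()) {
                paragraphs.add(trimmed);
            }
        }
        return paragraphs;
    }

    /** Counts in how many paragraphs each two-word phrase occurs. */
    static Map<String, Integer> paragraphFrequency(List<String> paragraphs) {
        Map<String, Integer> counts = new HashMap<>();
        for (String paragraph : paragraphs) {
            // Simplistic tokenizer: lowercase, split on anything but a-z and 0-9.
            String[] words = paragraph.toLowerCase().split("[^a-z0-9]+");
            // Collect each phrase once per paragraph.
            Set<String> seen = new HashSet<>();
            for (int i = 0; i + 1 < words.length; i++) {
                if (words[i].isEmpty()) {
                    continue; // leading separator produces one empty token
                }
                seen.add(words[i] + " " + words[i + 1]);
            }
            for (String phrase : seen) {
                counts.merge(phrase, 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

Phrases with the highest paragraph counts would be the candidate key phrases; a real labelling routine would also filter stop words and consider longer phrases.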

Search Result cutoff

2007-05-06 Thread Ram Peters
How do you specify a cutoff on search results? If I sort the search results by something other than relevancy, I don't want unrelated stuff showing up at the top. Is there a way to set a cutoff, so that only results that fall within a certain range are displayed? Thanks.
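
One possible reading of this question, sketched in plain Java without any Lucene classes: keep only hits whose score is at least some fraction of the best score for the same query, then sort the survivors by another key. Everything here is an assumption made for illustration: the Hit class, its date field, and the fractionOfTop parameter are hypothetical, and a relative cutoff like this is only a heuristic.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch; Hit is a stand-in, not a Lucene class.
public class ScoreCutoff {

    static final class Hit {
        final String id;
        final float score; // relevancy score from the search
        final long date;   // the other sort key, assumed for the example

        Hit(String id, float score, long date) {
            this.id = id;
            this.score = score;
            this.date = date;
        }
    }

    /** Drops hits scoring below fractionOfTop * best score, then sorts newest first. */
    static List<Hit> cutoffThenSortByDate(List<Hit> hits, float fractionOfTop) {
        float top = 0f;
        for (Hit h : hits) {
            top = Math.max(top, h.score);
        }
        float threshold = top * fractionOfTop;

        List<Hit> kept = new ArrayList<>();
        for (Hit h : hits) {
            if (h.score >= threshold) {
                kept.add(h);
            }
        }
        // Sort the remaining hits by date, descending.
        kept.sort(Comparator.comparingLong((Hit h) -> h.date).reversed());
        return kept;
    }
}
```

The threshold is relative to the top hit of one result set, so it avoids comparing raw scores between different queries.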

Possible bug in SpanNearQuery

2007-05-06 Thread Moti Nisenson
Looking over the implementation of SpanNearQuery I came upon what looked like a bug. Below is a test which fails due to it. SpanNearQuery doesn't return all matching spans; once it has found a span it always increments the span of the clause appearing first in that span (i.e. in the example below

Re: Search Result cutoff

2007-05-06 Thread Erick Erickson
Well, falls between a certain range is problematical. There's nothing hard and fast about scoring. That is, scores between, say, two different queries are not comparable. But I really don't understand the question. You won't get unrelated stuff in your result set as far as I know. Everything has

Re: Possible bug in SpanNearQuery

2007-05-06 Thread Paul Elschot
Moti, I tried your test and it fails in the way you describe; however, I don't think the test shows a bug. Below is the javadoc comment for the package private class NearSpansOrdered. Would that be sufficient documentation for the ordered case? /** A Spans that is formed from the ordered

Finding out which field matched for a query

2007-05-06 Thread makkhar
Here's what is going wrong for me: I have 10 documents, each with 10 fields with parameterName and parameterValue. Now, when I search for some term and I get 5 hits, how do I find out which paramName-Value pair matched? A very simple problem, but I could find no information on the forum for

Scoring question - Get Score of Best Query in BooleanQuery

2007-05-06 Thread Thomas Thomas
Hello everyone, Whenever I search for a word in my web application, I search in some default fields, e.g. if I search for the word hello, I generate these queries: title:hello headlines:hello summary:hello content:hello, which I add to a BooleanQuery (BooleanClause.Occur.SHOULD). What I want to achieve
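
The message is cut off here, so the following is only a guess at the goal based on the subject line: scoring a document by its best-matching field rather than by the sum over all SHOULD clauses. The plain-Java sketch below shows the arithmetic only. The per-field scores are made-up inputs, and BestClauseScore, bestField and disMaxStyleScore are hypothetical names. Lucene's DisjunctionMaxQuery combines clause scores in this max-plus-tie-breaker style, but this code does not call it.

```java
import java.util.Map;

// Hypothetical sketch of "best clause wins" scoring; no Lucene classes involved.
public class BestClauseScore {

    /** Returns the field with the highest score, or null if the map is empty. */
    static String bestField(Map<String, Float> scoresByField) {
        String best = null;
        float bestScore = Float.NEGATIVE_INFINITY;
        for (Map.Entry<String, Float> e : scoresByField.entrySet()) {
            if (e.getValue() > bestScore) {
                bestScore = e.getValue();
                best = e.getKey();
            }
        }
        return best;
    }

    /**
     * Best score plus tieBreaker times the sum of the remaining scores.
     * With tieBreaker = 0 only the best field counts.
     */
    static float disMaxStyleScore(Map<String, Float> scoresByField, float tieBreaker) {
        if (scoresByField.isEmpty()) {
            return 0f;
        }
        float max = Float.NEGATIVE_INFINITY;
        float sum = 0f;
        for (float score : scoresByField.values()) {
            max = Math.max(max, score);
            sum += score;
        }
        return max + tieBreaker * (sum - max);
    }
}
```

For example, with assumed scores title=2.0, headlines=0.5, summary=0.0 and content=1.0, a tie-breaker of 0 gives 2.0, while summing all clauses would give 3.5.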

Re: Questions regarding Lucene query syntax

2007-05-06 Thread Daniel Einspanjer
On 5/6/07, Erick Erickson [EMAIL PROTECTED] wrote: On 5/5/07, Daniel Einspanjer [EMAIL PROTECTED] wrote: The query syntax reference page talks about the NOT and the - operators, but it wasn't clear to me what exactly the difference is between them. Could someone tell me briefly what that

Re: Finding out which field matched for a query

2007-05-06 Thread Daniel Noll
On Monday 07 May 2007 06:19:47 makkhar wrote: Here's what is going wrong for me: I have 10 documents, each with 10 fields with parameterName and parameterValue. Now, when I search for some term and I get 5 hits, how do I find out which paramName-Value pair matched? A very simple problem,

Re: Keyphrase Extraction

2007-05-06 Thread [EMAIL PROTECTED]
Hi Mark, Do you know of a good paid product that does this? Thanks, Arsen - Original Message From: Mark Miller [EMAIL PROTECTED] To: java-user@lucene.apache.org Sent: Wednesday, May 2, 2007 7:52:36 AM Subject: Re: Keyphrase Extraction From what I know you generally have to pay if

Re: QueryParser, PrefixQuery, and case sensitivity

2007-05-06 Thread Bill Au
Erick, Thanks for the advice. I will take a look at PerFieldAnalyzerWrapper to see if I want to take this on. For my case, I have to use mixed case for a couple of fields since case really does matter for them (i.e. apple is not the same as Apple), and I actually don't want users to find the
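
The per-field idea can be sketched in plain Java, without Lucene: lowercase terms for most fields but leave a few fields case-sensitive, and apply the same rule to the indexed value and to the user's prefix. This only mirrors what a PerFieldAnalyzerWrapper setup is meant to achieve; it does not use that class. PerFieldCase and the field names in the usage note are hypothetical.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch of per-field case handling; not Lucene's PerFieldAnalyzerWrapper.
public class PerFieldCase {

    private final Set<String> caseSensitiveFields;

    PerFieldCase(Set<String> caseSensitiveFields) {
        this.caseSensitiveFields = new HashSet<>(caseSensitiveFields);
    }

    /** Leaves terms of case-sensitive fields untouched, lowercases all others. */
    String normalize(String field, String term) {
        return caseSensitiveFields.contains(field) ? term : term.toLowerCase(Locale.ROOT);
    }

    /** Prefix match with the same normalization on the indexed value and the user input. */
    boolean prefixMatches(String field, String indexedValue, String userPrefix) {
        return normalize(field, indexedValue).startsWith(normalize(field, userPrefix));
    }
}
```

With "brand" configured as case-sensitive, the prefix app matches the value Apple Pie in a lowercased field such as "title" but does not match Apple in "brand".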

Re: Keyphrase Extraction

2007-05-06 Thread Otis Gospodnetic
Arsen, I already mentioned it (see below) - LingPipe - http://alias-i.com . Otis -- Simpy -- http://www.simpy.com/ - Tag - Search - Share - Original Message From: [EMAIL PROTECTED] [EMAIL PROTECTED] To:

Re: Finding out which field matched for a query

2007-05-06 Thread makkhar
Well, the approach you suggested is what we use now. We use regex pattern matching to find the search term. However, because of this we cannot use some of the very sophisticated queries which Lucene supports (like boolean queries etc.). We sure can use highlighting to find out this information. But