On Friday 15 February 2008 02:47:14, Cedric Ho wrote:
> Sorry that I didn't make myself clear.
>
> [10/5/2] means for terms found in the 1st paragraph, give it score*10,
> for terms in the 2nd, give it score*5, etc.
>
> So I don't know how to do this scoring if the position (paragraph)
> informa
I haven't really been following this thread that closely, but...
: Why not just use ? Check to ensure that it makes
: it through whatever analyzer you choose though. For instance,
: LetterTokenizer will remove it...
1) i'm 99% sure you can do something like this...
Document doc = new
: I was wondering if anyone has a more efficient method for achieving this.
: Would changing QueryParser.jj and developing a custom PhraseQuery class be a
: good idea? Any comments would be appreciated.
extending QueryParser and overriding the getPhraseQuery function to return
your own SpanNear
it's not possible with the query syntax out of the box, but you could
write a custom subclass of SpanQuery to make it possible.
if your numbers are simple enough (ie: just 0-9) you could probably make
the SpanRegexQuery work for you.
: I am trying to perform a query that will enable me to run
Hi all,
This is the same problem I am trying to solve as in the thread:
"How to pass additional information into Similarity.scorePayload(...)"
However since these questions are somewhat different, I figure I'd
start a new thread.
After diving into the Lucene source code for a while now I have
Sorry that I didn't make myself clear.
[10/5/2] means for terms found in the 1st paragraph, give it score*10,
for terms in the 2nd, give it score*5, etc.
So I don't know how to do this scoring if the position (paragraph)
information is in a separate field.
Cedric
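
A minimal sketch of the [10/5/2] idea described above, in plain Java with no Lucene calls. The class and method names are hypothetical, and the multiplier of 1.0 for paragraphs past the third is an assumption, since the message only says "etc.". How the paragraph index reaches the scorer (for example through a payload) is left open here.

```java
final class ParagraphBoost {
    // Multipliers taken from the [10/5/2] example: 1st, 2nd and 3rd paragraph.
    private static final float[] BOOSTS = {10f, 5f, 2f};

    /** Returns the multiplier for a zero-based paragraph index. */
    static float boostFor(int paragraphIndex) {
        if (paragraphIndex < 0) {
            throw new IllegalArgumentException("paragraph index must be >= 0");
        }
        // Assumption: paragraphs beyond the listed ones are left unboosted.
        return paragraphIndex < BOOSTS.length ? BOOSTS[paragraphIndex] : 1f;
    }

    /** Applies the paragraph multiplier to a raw term score. */
    static float boostedScore(float rawScore, int paragraphIndex) {
        return rawScore * boostFor(paragraphIndex);
    }

    public static void main(String[] args) {
        // A term scoring 0.5 in the 1st, 2nd and 4th paragraph.
        System.out.println(boostedScore(0.5f, 0));
        System.out.println(boostedScore(0.5f, 1));
        System.out.println(boostedScore(0.5f, 3));
    }
}
```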
On Fri, Feb 15, 2008 at 7:15 A
I would like to be able to handle the following:
"/\d\d\d{4} \\d\\d/ office"
Where / indicates a regex expression phrase.
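
A rough sketch of that syntax, using only java.util.regex and no Lucene classes: terms between the slashes are treated as one regular expression per term, other terms are matched literally, and the terms must match consecutive tokens in order. The class name, method names, and the simplified example pattern are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

final class RegexPhrase {
    /** Splits a phrase such as "/\\d{3} \\d\\d/ office" into one pattern per term. */
    static List<Pattern> parse(String phrase) {
        List<Pattern> terms = new ArrayList<>();
        boolean inRegex = false;
        for (String raw : phrase.trim().split("\\s+")) {
            String term = raw;
            if (!inRegex && term.startsWith("/")) {
                term = term.substring(1);   // opening slash starts the regex section
                inRegex = true;
            }
            boolean endsRegex = inRegex && term.endsWith("/");
            if (endsRegex) {
                term = term.substring(0, term.length() - 1);
            }
            // Terms outside the slashes are quoted so they match literally.
            terms.add(Pattern.compile(inRegex ? term : Pattern.quote(term)));
            if (endsRegex) {
                inRegex = false;            // closing slash ends the regex section
            }
        }
        return terms;
    }

    /** Returns the first token index where all terms match consecutive tokens, or -1. */
    static int firstMatch(List<Pattern> terms, List<String> tokens) {
        for (int start = 0; start + terms.size() <= tokens.size(); start++) {
            boolean all = true;
            for (int i = 0; i < terms.size() && all; i++) {
                all = terms.get(i).matcher(tokens.get(start + i)).matches();
            }
            if (all) {
                return start;
            }
        }
        return -1;
    }
}
```

Inside a query parser, each parsed term would presumably become one clause of an ordered span query rather than being matched against a token list as done here.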
One option is extending MultiFieldQueryParser and catching the phrase within
getFieldQuery evaluating whether /, the regex identifier, is present and
then returning a SpanNe
Why not just use ? Check to ensure that it makes
it through whatever analyzer you choose though. For instance,
LetterTokenizer will remove it...
Erick
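
To illustrate the warning about analyzers, here is a small sketch in plain Java. letterTokens only approximates a letter-only tokenizer by splitting at every non-letter character; it is not Lucene's LetterTokenizer. The marker "$" is the example token mentioned elsewhere in this thread, and all class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class MarkerSurvival {
    /** Approximates a letter-only tokenizer: tokens are maximal runs of letters. */
    static List<String> letterTokens(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetter(c)) {
                current.append(c);
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    /** Splits on whitespace only, so punctuation-only tokens are kept. */
    static List<String> whitespaceTokens(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    /** One-based page of the token at tokenIndex: one plus the markers seen before it. */
    static int pageOf(List<String> tokens, int tokenIndex, String marker) {
        int page = 1;
        for (int i = 0; i < tokenIndex && i < tokens.size(); i++) {
            if (tokens.get(i).equals(marker)) {
                page++;
            }
        }
        return page;
    }
}
```

With the text "end of page $ start of next", the whitespace split keeps "$" as a token, while the letter-only split drops it, so the page boundary would be lost at index time.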
On Thu, Feb 14, 2008 at 4:41 PM, <[EMAIL PROTECTED]> wrote:
> > Rather than index one doc per page, you could index a special
> > token b
I have no idea what the [10/5/2] means, so I can't comment on that.
In case I have missed it previously I'm sorry.
My point was that payloads need not be used for different position info.
It's possible to do that, and it may be good for performance in some cases,
but one can revert to using anothe
> Rather than index one doc per page, you could index a special
> token between pages. Say you index $ as the special
> token.
I have decided to use this version, but...
What token can I use? It must be a token that never gets removed by an
analyzer or altered in a way that it not uniqu
There was a thread on this exact issue. If you are using 2.3, the payload API
would help with that.
-John
On Thu, Feb 14, 2008 at 3:59 AM, Gauri Shankar <[EMAIL PROTECTED]>
wrote:
> Thanks a lot to both of you.
>
> yes, I am talking about internally assigned document id.
>
> Erick : I am already
I'm a little confused about what's happening here
Are you inserting a pre-existing primary key from the DB into your
Lucene documents?
Or are you using the Lucene doc ID as your primary key in the database?
If the latter, you're going to have many, many problems since the Lucene
ID can, and will
Another possibility is to do it backwards; it depends on how expensive the
SQL query is, I suppose. The idea would be to go ahead and do your
SQL query *first*, then construct a Lucene Filter to use with your query
using TermDocs/TermEnum.
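
A sketch of that idea in plain Java, assuming the filter ends up as a BitSet of allowed document numbers. The keyToDocId map is a hypothetical stand-in for the per-key lookup that TermDocs/TermEnum would perform against the index; the class and method names are illustrative as well.

```java
import java.util.BitSet;
import java.util.Collection;
import java.util.Map;

final class KeyFilter {
    /**
     * Builds a bit set with one bit per document, set for every document
     * whose primary key was returned by the SQL query.
     */
    static BitSet allowedDocs(Collection<String> keysFromSql,
                              Map<String, Integer> keyToDocId,
                              int maxDoc) {
        BitSet bits = new BitSet(maxDoc);
        for (String key : keysFromSql) {
            Integer docId = keyToDocId.get(key);   // stand-in for the term lookup
            if (docId != null && docId >= 0 && docId < maxDoc) {
                bits.set(docId);
            }
            // Keys with no matching document are simply skipped.
        }
        return bits;
    }
}
```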
I'd guess (without knowing much about your problem space) t
Hi Mitesh:
Lucene-OJVM integration has not been tested against Lucene 2.3.0 yet.
I'll do it ASAP.
Best regards, Marcelo.
On Thu, Feb 14, 2008 at 10:01 AM, Mitesh Soni <[EMAIL PROTECTED]> wrote:
>
>
>
>
> I have run the build file in the lucene-2.3.0\contrib\ojvm successfully. But
> I cannot cr
I have run the build file in the lucene-2.3.0\contrib\ojvm successfully. But
I cannot create index with the use of .
create index it1 on t1(f2) indextype is lucene.LuceneIndex
parameters('Analyzer:org.apache.lucene.analysis.SimpleAnalyzer');
ERROR at line 1:
ORA-29855: error occurred in
Thanks a lot to both of you.
yes, I am talking about internally assigned document id.
Erick: I already store a unique id in the index, mapped to one of our
DB's primary keys, to uniquely identify the docs in the index. Now to get the
value of this unique field I need to call getDocumet(). Bu
If it adds the clauses as Occur.SHOULD, it means they should appear, but
do not have to appear.
Looking at suggestSimilar, it looks like it computes the edit_distance
values of the requested word and the suggestions. If the score is lower than
the minimum score, it may skip the word.
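
The filtering described above can be sketched in plain Java as follows. The normalization (one minus the distance divided by the shorter word's length) is an assumption made for illustration, not a statement about what suggestSimilar does internally; the class and method names are hypothetical.

```java
final class SuggestionFilter {
    /** Classic Levenshtein edit distance using two rolling rows. */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Similarity score; 1.0 means identical (assumed normalization). */
    static float similarity(String word, String candidate) {
        int shorter = Math.min(word.length(), candidate.length());
        if (shorter == 0) {
            return word.equals(candidate) ? 1f : 0f;
        }
        return 1f - (float) editDistance(word, candidate) / shorter;
    }

    /** A candidate is kept only if its score reaches the minimum score. */
    static boolean keep(String word, String candidate, float minScore) {
        return similarity(word, candidate) >= minScore;
    }
}
```

Under this scoring, a high minimum score skips every candidate that differs by more than a character or two, which is one way a search can return no suggestions.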
Could you tr
Hello Shai,
That's right, Speller is in contrib. It is named spellchecker. Basically
it is a special index that stores the words as ngrams.
I looked at the code to see how it is querying the index and basically it
makes ngrams and adds each ngram to a boolean query.
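
For readers unfamiliar with ngrams, a minimal sketch of splitting a word into character ngrams in plain Java. The single fixed size n and the names are assumptions for illustration; this is not the spellchecker's own code, and each resulting gram would correspond to one clause of the boolean query.

```java
import java.util.ArrayList;
import java.util.List;

final class NGrams {
    /** Returns all contiguous substrings of length n, in order of position. */
    static List<String> of(String word, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        List<String> grams = new ArrayList<>();
        for (int i = 0; i + n <= word.length(); i++) {
            grams.add(word.substring(i, i + n));
        }
        return grams;   // empty when the word is shorter than n
    }

    public static void main(String[] args) {
        System.out.println(of("lucene", 3));
    }
}
```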
Here is how it adds to the b
Hi Paul,
Sorry, I am not sure I understand your solution.
Because I would need to apply this scoring logic to all the different
types of Queries. A search may consist of something like:
+(term1 phrase2 wildcard*) +spanNear(term3 term4) [10/5/2]
And this [10/5/2] ratio has to be applied to the
19 matches