Another approach may be to use n-grams. Index each word as follows:
2 gram field
in nf fo or rm ma at
3 gram field
inf nfo for orm rma mat
4 gram field
info nfor form orma rmat
and so on.
To search for the term "form", simply search the 4 gram field.
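A minimal sketch of how the n-gram fields listed above could be generated, in plain Java with no Lucene dependency. The class and method names (`NGrams`, `ngrams`) are made up for illustration; in a real index each list would be written to its own field (2 gram, 3 gram, 4 gram, and so on).

```java
import java.util.ArrayList;
import java.util.List;

final class NGrams {
    /** Returns every contiguous substring of length n, in left-to-right order. */
    static List<String> ngrams(String word, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        List<String> grams = new ArrayList<>();
        // Slide a window of width n across the word, one character at a time.
        for (int i = 0; i + n <= word.length(); i++) {
            grams.add(word.substring(i, i + n));
        }
        return grams;
    }

    public static void main(String[] args) {
        // Print the 2, 3 and 4 gram fields for the example word from the message.
        for (int n = 2; n <= 4; n++) {
            System.out.println(n + " gram field: " + String.join(" ", ngrams("informat", n)));
        }
    }
}
```

A query term of length n is then looked up in the field that holds grams of that same length.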
The prefix query approach may suffer
27 Apr 2006 at 10:05, [EMAIL PROTECTED] wrote:
Another approach may be to use n-grams.
The spell checker in contrib could probably be used as a code base
for that.
Is it possible to search sentences, more than one word at a time, or phrases
with fuzzy search?
I have implemented fuzzy search, if I only search one single word it works
fine, but if I start searching more than one word or a sentence it does not
find anything...strange, when I set the relevance
27 Apr 2006 at 10:16, Fisheye wrote:
Is it possible to search sentences, more than one word at a time, or phrases
with fuzzy search?
I have implemented fuzzy search, if I only search one single word it works
fine, but if I start searching more than one word or a sentence it does not
fi
OK, thanks for the link. I will have a look and see...but if this is really
as slow as you describe it, I probably have to accept it as it is and leave
it.
Hello,
What I'm trying to do is to index a database with Lucene. Each row returned by
an SQL query is represented as a document, and the document contains fields (values of
columns). I'm adding those fields to the document by doing the following:
Field fld = new Field("COLUMN_NAME", column.value());
Now when
Hello,
We are upgrading from 1.3 to 1.9.
We planned to use the Highlight package for highlighting, replacing our
in-house highlight classes.
From what I can read, the Highlight package requires the
TermFreqVector to be added to the index. I will get into the Highlight
package later, but
On Apr 26, 2006, at 6:20 PM, anton feldmann wrote:
Are the names of a field in a document unique or can I make a field
with the name "sentence" for each sentence in a text document?
The names of a field in a document are unique, but you can add
multiple instances of the same field name. Y
Hi again,
Upgrading from Lucene 1.3 to 1.9.
We need to order the results by number of occurrences (score of a doc = sum of
occurrences of all Query).
In Lucene 1.3 we rewrote all the Query classes (BooleanQuery,
PhraseQuery, etc.) to reach our goals, but is there an easier way to do it
I know it is possible to query against multiple indexes, but is it
possible to create a composite query in which part of the query is
against one index and part is against another (similar to querying
against a default and a second field)?
for example index1:query1 AND index2:query2
I tho
Anton,
Please don't cross post "How do I..." questions to the dev list, it
doesn't get you anywhere and just annoys those most likely to help you.
See below.
-Grant
Anton Feldmann wrote:
Hi
I wrote an Indexer which is indexing all the contents of a text and the
sentences are separated in an o
Anton Feldmann wrote:
3) How do I display the sentence before and after the sentence the hit
is in?
You could:
1. Make your Lucene Document be a set of three sentences (before,
searchable, after), which you store, but write a custom Analyzer which
only returns tokens for the "searchable" cen
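The data preparation for option 1 above could be sketched as follows. This is plain Java and only groups each sentence with its neighbours; it does not show the Lucene Document or the custom Analyzer. `SentenceWindows` and `Window` are hypothetical names introduced here, not part of any library.

```java
import java.util.ArrayList;
import java.util.List;

final class SentenceWindows {
    /** Hypothetical holder: one searchable sentence plus the neighbours to store with it. */
    static final class Window {
        final String before;
        final String searchable;
        final String after;

        Window(String before, String searchable, String after) {
            this.before = before;
            this.searchable = searchable;
            this.after = after;
        }
    }

    /** Builds one window per sentence; missing neighbours at the edges become "". */
    static List<Window> windows(List<String> sentences) {
        List<Window> out = new ArrayList<>();
        for (int i = 0; i < sentences.size(); i++) {
            String before = i > 0 ? sentences.get(i - 1) : "";
            String after = i + 1 < sentences.size() ? sentences.get(i + 1) : "";
            out.add(new Window(before, sentences.get(i), after));
        }
        return out;
    }
}
```

Each `Window` would then become one document: all three parts stored for display, with only `searchable` contributing tokens to the index.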
Hi
I wrote an Indexer which is indexing all the contents of a text and the
sentences are separated in another Document.
"Document document = new Document(new Field ("contents", reader ));
StringTokenizer token = new StringTokenizer(contents.replaceAll(". ",
"\\.x\\") , "\\.x\
You need to use MultiFieldQueryParser
http://lucene.apache.org/java/docs/api/org/apache/lucene/queryParser/MultiFieldQueryParser.html
Sincerely,
Chris Lu
On 4/27/06, Audrius Peseckis <[EMAIL PROTECTED]> wrote:
> Hel
27 Apr 2006 at 12:45, Fisheye wrote:
OK, thanks for the link. I will have a look and see...but if this is really
as slow as you describe it, I probably have to accept it as it is and leave
it.
You might find this thread interesting:
http://www.nabble.com/Contextual-suggestions-t1372611
: You need to use MultiFieldQueryParser
: http://lucene.apache.org/java/docs/api/org/apache/lucene/queryParser/MultiFieldQueryParser.html
or put the text from all of your fields into one uber catchall field and
make that the default...
foreach (column) {
Field f = new Field(column.name
Can you provide a little more clarification as to what it is you are
trying to achieve (not just the way you hope to achieve it)?
Specifically: I can't make sense of what you would expect to get back with a
query like this...
index1:query1 AND index2:query2
...traditionally, a query for "A and
: Upgrading from Lucene 1.3 to 1.9.
: We need to order the results by number of occurrences (score of a doc = sum of
: occurrences of all Query).
: I am just starting to read on Similarity, weights etc.
You are definitely on the right track with Similarity. What you want is a
Similarity implimen
Hi,
Our application presents search results in a paginated form.
We were unable to find Searcher methods that would return, say, 'n'
(typically, 10) hits after a start offset 'k'.
So we're currently using the Hits collection returned by Searcher.search,
and using its Hits.doc(i) method to get th
27 Apr 2006 at 20:44, Jean Sini wrote:
Our application presents search results in a paginated form.
We were unable to find Searcher methods that would return, say, 'n'
(typically, 10) hits after a start offset 'k'.
So we're currently using the Hits collection returned by Searcher.search,
and
On 4/27/06, Jean Sini <[EMAIL PROTECTED]> wrote:
> We were unable to find Searcher methods that would return, say, 'n'
> (typically, 10) hits after a start offset 'k'.
Yes, that's because to find results k through k+n, Lucene must first
find results 0 through k+n.
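The point can be illustrated with a small plain-Java sketch that does not use Lucene at all: to return the hits ranked k through k+n, the code still has to keep track of the best k+n scores seen so far, and only then drops the first k. `Paging` and `page` are hypothetical names, and bare `double` scores stand in for real hits.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

final class Paging {
    /** Returns the scores at ranks k .. k+n-1 (0-based, best first). */
    static List<Double> page(double[] scores, int k, int n) {
        if (k < 0 || n < 0) {
            throw new IllegalArgumentException("k and n must be non-negative");
        }
        int limit = k + n;
        // Min-heap: the weakest score we are still keeping sits on top.
        PriorityQueue<Double> best = new PriorityQueue<>();
        for (double s : scores) {
            best.add(s);
            if (best.size() > limit) {
                best.poll(); // evict the weakest, so only the top k+n remain
            }
        }
        List<Double> ranked = new ArrayList<>(best);
        ranked.sort(Collections.reverseOrder());
        if (k >= ranked.size()) {
            return new ArrayList<>();
        }
        // The first k entries were needed to establish the ranking, then skipped.
        return new ArrayList<>(ranked.subList(k, Math.min(limit, ranked.size())));
    }
}
```

The cost therefore grows with k+n rather than with n, which is why deep pages are more expensive than the first one.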
> So we're currently using the H
On Thursday 27 April 2006 14:32, Philippe Deslauriers (Beetext) wrote:
> What are the OFFSETS and POSITIONS used for? Do I need it for
> Highlighting?
No, you can provide an analyzer to Highlighter.getBestFragment() and it will
re-analyze your text without the need for term vectors.
Regards
Da
Thanks.
One of the trade-offs we are considering, along the lines of what you
mentioned, has to do with whether or not to cache the Hits. The benefit
being that we'd avoid re-running the search if requests for hits past the
first page do come in, the cost being that we'd have to keep around all the
Sunil Kumar PK wrote:
I want to know whether there is any possibility or method to merge the weight
calculation of index 1 and its search in a single RPC instead of doing
both functions in separate steps.
To score correctly, weights from all indexes must be created before any
can be searched. This
27 Apr 2006 at 23:39, Jean Sini wrote:
27 Apr 2006 at 20:44, Jean Sini wrote:
Our application presents search results in a paginated form.
We were unable to find Searcher methods that would return, say, 'n'
(typically, 10) hits after a start offset 'k'.
So we're currently using the Hits colle
Hi,
After reading the code, I found the similarity measure in Lucene is not the
same as the cosine coefficient measure commonly used. I don't know whether that is
correct. And I wonder whether I can use the cosine coefficient measure in
lucene or maybe the Dice's coefficient, Jaccard's coefficient and overla
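For reference, the four coefficients named in the question can be written down in their textbook binary (set-based) form. This is a plain-Java sketch over sets of terms, under the assumption that term weights are ignored; it says nothing about how Lucene itself scores, and `SetSimilarity` is a made-up class name.

```java
import java.util.HashSet;
import java.util.Set;

final class SetSimilarity {
    /** Number of terms the two sets have in common. */
    private static int common(Set<String> a, Set<String> b) {
        Set<String> shared = new HashSet<>(a);
        shared.retainAll(b);
        return shared.size();
    }

    /** Cosine (binary form): |A and B| / sqrt(|A| * |B|). */
    static double cosine(Set<String> a, Set<String> b) {
        if (a.isEmpty() || b.isEmpty()) return 0.0;
        return common(a, b) / Math.sqrt((double) a.size() * b.size());
    }

    /** Dice: 2 * |A and B| / (|A| + |B|). */
    static double dice(Set<String> a, Set<String> b) {
        if (a.isEmpty() || b.isEmpty()) return 0.0;
        return 2.0 * common(a, b) / (a.size() + b.size());
    }

    /** Jaccard: |A and B| / |A or B|. */
    static double jaccard(Set<String> a, Set<String> b) {
        if (a.isEmpty() || b.isEmpty()) return 0.0;
        int shared = common(a, b);
        return (double) shared / (a.size() + b.size() - shared);
    }

    /** Overlap: |A and B| / min(|A|, |B|). */
    static double overlap(Set<String> a, Set<String> b) {
        if (a.isEmpty() || b.isEmpty()) return 0.0;
        return (double) common(a, b) / Math.min(a.size(), b.size());
    }
}
```

All four share the same numerator (the number of common terms) and differ only in how they normalise by the sizes of the two sets.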