one thing you may not have thought about yet that may affect your
decision: sorting in lucene requires the field be indexed but
untokenized.
so if you want to support sorting on the conceptual "title",
you'll still
need a version of your title field that's untokenized, which can
then be
u
: Thanks Chris, I had thought of that one, but unfortunately the title
: could be quite long, and there are literally millions of documents.
: Isn't each title going to be included as one "term" in the index
: dictionary? If so, won't the index get ridiculously large and slow?
It depends on your
That's deprecated now of course... so you want MultiPhraseQuery.
-Yonik
On 1/5/06, Yonik Seeley <[EMAIL PROTECTED]> wrote:
> Check out PhrasePrefixQuery.
>
> -Yonik
>
>
> On 1/5/06, Paul Smith <[EMAIL PROTECTED]> wrote:
> > First off, a response to my own post: I meant PhraseQuery instead.
> >
> > B
Check out PhrasePrefixQuery.
-Yonik
On 1/5/06, Paul Smith <[EMAIL PROTECTED]> wrote:
> First off, a response to my own post: I meant PhraseQuery instead.
>
> But, since we're only tokenizing this field, and not storing the
> entire contents of the field, I'm not sure this is ever going to
> work, i
First off, a response to my own post: I meant PhraseQuery instead.
But, since we're only tokenizing this field, and not storing the
entire contents of the field, I'm not sure this is ever going to
work, is it?
I notice that if I have a title "auto update", then the phrase query
trick works i
2) index a magic token at the start of the title and include that in a
phrase query:
"_START_ the quick"
Ok, I've gone and chosen "0start0" as my start token, because our
analyzer is stripping _.
Now, second dumb question of the day, given the search for starts with
"The qui*", that has t
1) also index the field untokenized and use a straight prefix query
See my reply to Chris, not sure I can afford the index size increment.
2) index a magic token at the start of the title and include that in a
phrase query:
"_START_ the quick"
Ah, that's clever.
3) use a SpanFirst quer
On 06/01/2006, at 9:33 AM, Chris Hostetter wrote:
: Think SQL of " where title like 'The quick%' ".
I solved this problem by having a variation of my field that was not
tokenized, and did PrefixQueries on that field (so in your case, leave
your title field alone for generic matches, and
Off the top of my head:
1) also index the field untokenized and use a straight prefix query
2) index a magic token at the start of the title and include that in a
phrase query:
"_START_ the quick"
3) use a SpanFirst query (but you have to make the Java Query object yourself)
-Yonik
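To make option 2 concrete, here is a minimal, library-free sketch of why the start-token trick works. It is not Lucene code: `StartTokenSketch`, `indexTokens`, `phraseMatches` and `titleStartsWith` are hypothetical names, the tokenizer is a crude stand-in for a real analyzer, and the sentinel value is only an example. It handles whole words only, so a trailing partial word such as "qui*" is outside its scope.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

/** Models option 2: a sentinel token at position 0 anchors a phrase to the start of the title. */
final class StartTokenSketch {
    // Example sentinel only; pick one that your analyzer leaves intact.
    static final String START = "0start0";

    /** Stand-in analyzer: lowercase, split on non-alphanumerics, prepend the sentinel. */
    static List<String> indexTokens(String text) {
        List<String> tokens = new ArrayList<>();
        tokens.add(START);
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    /** True if the phrase tokens occur contiguously, in order, anywhere in the indexed tokens. */
    static boolean phraseMatches(List<String> indexed, List<String> phrase) {
        return Collections.indexOfSubList(indexed, phrase) >= 0;
    }

    /** "Title starts with these words" becomes a phrase match on sentinel + words. */
    static boolean titleStartsWith(String title, String words) {
        // Analyzing the query text the same way also prepends the sentinel to the phrase.
        return phraseMatches(indexTokens(title), indexTokens(words));
    }
}
```

With this model, `titleStartsWith("The quick brown fox", "the quick")` is true, while the same words in "Not the quick brown fox" do not match, because the sentinel only ever sits directly before the first word of a title.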
On 1/5/06,
: Think SQL of " where title like 'The quick%' ".
I solved this problem by having a variation of my field that was not
tokenized, and did PrefixQueries on that field (so in your case, leave
your title field alone for generic matches, and have a titleUntokenized
field for PrefixMatches).
Ano
I'm throwing myself at the mercy of the lucene community, I'm a bit
brain dead today after looking after a screaming 3 month old baby for
4 hours last night...
We have a 'title' field indexed as Field.Text(...), which works
nicely, and has lots of good searching.
However, this application
Hi All,
Any pointers on how to handle range queries if the data type is double
or float?
Best,
Urvashi
-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
On Jan 5, 2006, at 2:03 PM, Colin Young wrote:
Just curious, but when one is desiring to make use of stuff in the
contrib is there any particular reason to compile it into the
Lucene jar, or include it in the project that's making use of it,
or is it really just up to the preferences of the
On Donnerstag 05 Januar 2006 03:31, Bill Janssen wrote:
> I've got some code developed for Lucene 1.4.1, that works around the
> problem of having both (1) multiple default fields, and (2) the AND
> operator for query elements. In 1.4.1, MultiFieldQueryParser
> effectively only allowed the OR o
On Donnerstag 05 Januar 2006 16:09, Yilmazel, Sibel wrote:
> Are there any WordNet dictionary alternatives that anyone had a chance
> to look into for Lucene?
Maybe you could specify your question: what exactly are you trying to do
with WordNet and why are you looking for alternatives?
Regards
I did manage to get things built with ant this morning (I had some time and
space on the train) including the DBDJE support, although I wasn't aware any of
it had been checked into contrib. I'll have to grab that tonight.
I had tried building a normal Eclipse project, but was having some proble
On Thursday 05 January 2006 19:33, Marc Hadfield wrote:
>
> Thanks Erik, Hoss -
>
> I will try MultiPhraseQuery and report back.
>
> Back in an email thread with Doug he mentioned SpanQuery would work, and
> in a fashion it does, but I can't differentiate between terms at the
> same position a
As far as I know, the best information about various Lucene concurrency rules,
locks and locking is in the Lucene book (e.g.
http://www.lucenebook.com/search?query=concurrency+rules ).
The short story is: use 1 IndexSearcher. Get a new one when you detect index
change. Use 1 IndexWriter. If
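The "one searcher, get a new one when the index changes" rule can be sketched without any Lucene types. This is a hedged illustration only: `SearcherHolder`, the version supplier and the opener function are hypothetical stand-ins for however your application detects an index change and opens a searcher.

```java
import java.util.function.LongFunction;
import java.util.function.LongSupplier;

/** Hands out one shared searcher and replaces it only when the index version changes. */
final class SearcherHolder<S> {
    private final LongSupplier indexVersion; // assumption: reports the current index version
    private final LongFunction<S> opener;    // assumption: opens a searcher for that version
    private S current;
    private long openedAt;

    SearcherHolder(LongSupplier indexVersion, LongFunction<S> opener) {
        this.indexVersion = indexVersion;
        this.opener = opener;
    }

    /** Returns the shared searcher, reopening it first if the index has changed since it was opened. */
    synchronized S get() {
        long version = indexVersion.getAsLong();
        if (current == null || version != openedAt) {
            // The previous searcher is simply dropped here; real code would also
            // need to close it once in-flight searches have finished.
            current = opener.apply(version);
            openedAt = version;
        }
        return current;
    }
}
```

Every caller goes through `get()`, so all searches share a single instance until a change is detected.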
Thanks Erik, Hoss -
I will try MultiPhraseQuery and report back.
Back in an email thread with Doug he mentioned SpanQuery would work, and
in a fashion it does, but I can't differentiate between terms at the
same position and contiguous positions. The problem gets worse if I
want to test fo
A better place to ask this may be some kind of a computational linguistics
mailing list.
I'm not aware of WordNet alternatives for the English language, but I know
there are WordNet cousins for other languages (e.g. BalkaNet for several
languages spoken on the Balkan peninsula).
Otis
- Ori
Nice contribution by Luc.
Thanks,
Jelda
-Original Message-
From: Vanlerberghe, Luc [mailto:[EMAIL PROTECTED]
Sent: Thursday, January 05, 2006 5:11 PM
To: java-user@lucene.apache.org
Subject: RE: searching and indexing simultaneously...
One reader/searcher per server.
My configuration
One reader/searcher per server.
My configuration uses
- one Lucene index in a shared location,
- one server that uses either a single IndexReader or a single
IndexWriter to delete or add documents
- several servers that read/search the index.
The 'search' servers each have a single IndexReader op
On 1/5/06, Erik Hatcher <[EMAIL PROTECTED]> wrote:
> my recommendation is to simply set up
> a normal Eclipse project.
I use IntelliJ, and that's how I do it (a normal IntelliJ project).
-Yonik
But it's best to only have one reader/searcher, correct?
-Original Message-
From: Ramana Jelda [mailto:[EMAIL PROTECTED]
Sent: Thursday, January 05, 2006 9:08 AM
To: java-user@lucene.apache.org
Subject: RE: searching and indexing simultaneously...
Hi,
You are right. There can be multipl
Hello,
Are there any WordNet dictionary alternatives that anyone had a chance
to look into for Lucene?
Thanks,
Sibel
Hi,
You are right. There can be multiple indexreaders but only one indexwriter
is advised.
No, we can not use two indexwriters simultaneously.
Jelda
-Original Message-
From: K.A.Hussain Ali [mailto:[EMAIL PROTECTED]
Sent: Thursday, April 06, 2006 5:01 PM
To: java-user@lucene.apache.o
Hi all,
I am a newbie to Lucene.
Does Lucene provide any way to do indexing, searching and deleting
simultaneously?
I hope we could do searching and indexing, which means there can be multiple
indexreaders and only one indexwriter accessing the index.
Could we have two indexwriter working simult
Colin,
I've not used Eclipse in a long while, and I've not used the new Ant
project support it has. It sounds like Eclipse has issues with Ant
imports as you've surmised, and my recommendation is to simply set up
a normal Eclipse project. The source tree to Lucene is extremely
basic...
Marc,
SpanNearQuery isn't capable of performing the proximity to within
only a single position in the manner you've described. A slop of 0
means the terms must be contiguous with no gaps, which also allows
for matches in the same position as in your first example.
I think MultiPhraseQuer
Dmitry,
RegexQuery is similar in behavior to Lucene's built-in WildcardQuery,
except rather than accepting only ? and * as wildcard characters it
leverages the full expression capability of whatever underlying
regular expression engine is selected.
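As a rough illustration of that difference, the sketch below uses plain `java.util.regex` rather than the Lucene classes. `WildcardVsRegexSketch` and its methods are hypothetical names, and the sketch matches whole strings; how the Lucene queries anchor their matches against index terms is not modelled here.

```java
import java.util.regex.Pattern;

/** Compares what a ?/* wildcard pattern can express with what a full regular expression can. */
final class WildcardVsRegexSketch {
    /** Translates a wildcard pattern (? = any one char, * = any run of chars) into a regex. */
    static Pattern wildcardToRegex(String wildcard) {
        StringBuilder regex = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '?') {
                regex.append('.');
            } else if (c == '*') {
                regex.append(".*");
            } else {
                // Quote everything else so it is matched literally.
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString());
    }

    /** True if the whole term matches the pattern. */
    static boolean matchesTerm(Pattern pattern, String term) {
        return pattern.matcher(term).matches();
    }
}
```

The wildcard `qu?ck*` accepts "quickly" and "quack" but also "quock", since `?` stands for any character. A regular expression such as `qu[ai]ck(ly)?` can restrict that position to "a" or "i", which the two wildcard characters alone cannot express.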
SpanRegexQuery is a "span" version of Re
Gui
How about using the OpenOffice API?
http://weblogs.java.net/blog/tchangu/archive/2005/12/open_office_jav_1.html
http://api.openoffice.org/DevelopersGuide/DevelopersGuide.html
Patrick
On 24/11/05, Guilherme Barile <[EMAIL PROTECTED]> wrote:
> I have some issues with textmining extracting tex