I am a developer on the JIRA Issue tracker, and we are considering
upgrading our Lucene version from v2.2.0 to v2.3.2.
I have been charged with doing the risk analysis, and project work.
I have read the change lists, and the bugs reported on the Lucene Issue
Tracker (JIRA of course ;), and
AUTOMATIC REPLY
Tom Roberts is out of the office till 2nd September 2008.
LUX reopens on 1st September 2008
-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
Hi,
I just discovered some strange behaviour with deleted documents. I do a
search for documents with a certain query and delete one using
IndexWriter.deleteDocuments(Term) using a key for the term. Then I repeat
the search and the document is still there because I use a custom
HitCollector.
It sounds like TermVectors may apply here?
The TermVectors for a doc are like a miniature inverted index just for
that one document. It lets you retrieve all terms and their
frequencies, plus optionally offset and positions information for each
term occurrence.
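A minimal sketch against the Lucene 2.3 API (the index path, document number, and field name "contents" are illustrative assumptions; the field must have been indexed with term vectors enabled):

```java
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.TermFreqVector;

// Sketch: dump the "miniature inverted index" (term vector) of one document.
// Requires the field to have been indexed with Field.TermVector.YES.
public class DumpTermVector {
    public static void main(String[] args) throws Exception {
        IndexReader reader = IndexReader.open("/path/to/index");
        int docNumber = 0; // any non-deleted document number
        TermFreqVector tfv = reader.getTermFreqVector(docNumber, "contents");
        if (tfv != null) { // null when no term vector was stored for this field
            String[] terms = tfv.getTerms();
            int[] freqs = tfv.getTermFrequencies();
            for (int i = 0; i < terms.length; i++)
                System.out.println(terms[i] + ": " + freqs[i]);
        }
        reader.close();
    }
}
```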
Mike
Beijing2008 wrote:
Before 2.3, each doc was in fact a separate segment in memory, and
then these segments were merged together to flush a single segment in
the Directory.
As of 2.3, IndexWriter now writes directly into RAM the data
structures that are needed to create the segment, and then flushing
the
John Patterson wrote:
I just discovered some strange behaviour with deleted documents. I
do a
search for documents with a certain query and delete one using
IndexWriter.deleteDocuments(Term) using a key for the term. Then I
repeat
the search and the document is still there because I use a custom HitCollector.
Mark Lassau wrote:
I am a developer on the JIRA Issue tracker, and we are considering
upgrading our Lucene version from v2.2.0 to v2.3.2.
I have been charged with doing the risk analysis, and project work.
I have read the change lists, and the bugs reported on the Lucene
Issue Tracker
On Tuesday, 26 August 2008, Andre Rubin wrote:
For some reason, the TermQuery is not returning any results, even when
querying for a single word (like on*).
Sorry, I meant PrefixQuery. Also, do not add the * to the search string
when creating the PrefixQuery.
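For example (the field name "label" is an assumption; note the bare prefix with no trailing *):

```java
import org.apache.lucene.index.Term;
import org.apache.lucene.search.PrefixQuery;

// Sketch: matches every term in "label" that starts with "on"
// (e.g. "on", "one", "only") -- do NOT append the * yourself.
PrefixQuery query = new PrefixQuery(new Term("label", "on"));
```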
Regards
Daniel
--
Hi John,
Are you sure you made the id tokenized while indexing? I could overcome
this issue by having a tokenized field, which was used for the deletion as
below.
document.add(new Field(id, id, Field.Store.YES, Field.Index.TOKENIZED));
Thanks
On Tue, Aug 26, 2008 at 2:15 PM, Michael
That was the problem - the id was not tokenized. Thanks for your help.
Kalani Ruwanpathirana wrote:
Hi John,
Are you sure you made the id tokenized while indexing? I could overcome
this issue by having a tokenized field, which was used for the deletion as
below.
document.add(new
Normally an ID should be indexed as Field.Index.UN_TOKENIZED.
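A sketch of that convention (the field name "id" and the sample value are assumptions from this thread):

```java
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.Term;

// Sketch: store the ID as one unanalyzed term, so that
// deleteDocuments(new Term("id", value)) matches it exactly.
Document doc = new Document();
doc.add(new Field("id", "doc-42", Field.Store.YES, Field.Index.UN_TOKENIZED));
// later: writer.deleteDocuments(new Term("id", "doc-42"));
```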
Mike
John Patterson wrote:
That was the problem - the id was not tokenized. Thanks for your
help.
Kalani Ruwanpathirana wrote:
Hi John,
Are you sure you made the id tokenized while indexing? I could
overcome
this issue
A few more case-sensitivity questions.
Based on the discussion on http://markmail.org/message/q7dqr4r7o6t6dgo5 and
on this thread, is it right to say that a field, if it is either UN_TOKENIZED
or NO_NORMS, doesn't get analyzed while indexing? Which means we need
to case-normalize (down-case)
Is there any way to know what commit points are available in an index?
This would be helpful for providing a rollback feature to roll back to a
commit point.
--Noble
Yes -- there's now (in trunk) a static
IndexReader.listCommits(Directory) method.
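A sketch of that API (the index path is an assumption; note that older commits survive only if your IndexDeletionPolicy keeps them, and the default policy deletes all but the latest commit):

```java
import java.util.Collection;
import java.util.Iterator;
import org.apache.lucene.index.IndexCommit;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.store.FSDirectory;

// Sketch (trunk API at the time): enumerate the commit points of an index.
Collection commits = IndexReader.listCommits(FSDirectory.getDirectory("/path/to/index"));
for (Iterator it = commits.iterator(); it.hasNext(); ) {
    IndexCommit commit = (IndexCommit) it.next();
    System.out.println(commit.getSegmentsFileName());
}
```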
Mike
Noble Paul നോബിള് नोब्ळ् wrote:
Is there any way to know what commit points are available in an index?
This would be helpful for providing a rollback feature to roll back to a
commit point.
Hi Chris,
I asked exactly the same question a little while ago and got a pretty good
answer from Paul Elschot.
Try searching the archives for 'Filtering a SpanQuery'. It was around the
13/5/08.
Hope it helps,
Eran.
On Mon, Aug 25, 2008 at 8:18 PM, Christopher M Collins
[EMAIL PROTECTED] wrote:
On Aug 25, 2008, at 7:29 PM, Teruhiko Kurosaka wrote:
Thank you, Grant and (Koji) Sekiguchi-san.
but I don't
understand how the input from reader1 and reader2 are mixed
together.
Will sink1 first return the reader1 text, and then the reader2 text?
It depends on the order the fields are added. If
I think I should rephrase my question.
[ Context: Using out of the box StandardAnalyzer for indexing and searching.
]
Is it right to say that a field, if it is either UN_TOKENIZED or NO_NORMS
(field.setOmitNorms(true)), doesn't get analyzed while indexing?
Which means that when we search, it
The lucene FAQ says:
What wildcard search support is available from Lucene?
Lucene supports wild card queries which allow you to perform searches
such as book*, which will find documents containing terms such as book,
bookstore, booklet, etc. Lucene refers to this type of a query as a
'prefix
Hi there,
I use Luke v0.8.1, which is built on Lucene 2.3.0. First, I ran
lucene/demo/IndexFiles to build an index successfully. Then I used Luke to
open the index, but Luke reports "Unknown format version: -6". I checked the
Lucene documentation, which says Lucene 2.3.2 does not contain any
new
I think you need to triple-check your CLASSPATH; it seems like you
are somehow getting an older version of Luke.
The file format definitely did not change from 2.3.0 to 2.3.2.
Mike
Jiao, Jason (NSN - CN/Cheng Du) wrote:
Hi there,
I use Luke v0.8.1, which is built on Lucene
Can you combine these two queries somehow so that they behave like a
PhraseQuery?
I have a custom query parser which takes a phrase like *at sat and
produces a BooleanQuery consisting of a WildcardQuery ('*at') and a
TermQuery ('sat'). This works, but matches more widely than expected
(by
That worked great! Thanks Daniel.
I just have one more use case. I want the same prefix search as before, plus
another match in another field.
I was using MultiFieldQueryParser.parse(), but then I have the same problem
with the One Tw* query, because MultiFieldQueryParser.parse() returns a
On Tuesday, 26 August 2008, Chris Bamford wrote:
Can you combine these two queries somehow so that they behave like a
PhraseQuery?
You can use MultiPhraseQuery, see
http://lucene.apache.org/java/2_3_2/api/core/org/apache/lucene/search/MultiPhraseQuery.html
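A sketch for the "*at sat" case (the field name "body" and the expansions are illustrative; in practice you would enumerate the terms matching the wildcard from the index first):

```java
import org.apache.lucene.index.Term;
import org.apache.lucene.search.MultiPhraseQuery;

// Sketch: position 0 may be any of the terms the wildcard expands to,
// position 1 must be exactly "sat" -- so "cat sat" and "mat sat" match,
// but "cat is sat" does not.
MultiPhraseQuery query = new MultiPhraseQuery();
query.add(new Term[] { new Term("body", "cat"), new Term("body", "mat") });
query.add(new Term("body", "sat"));
```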
Regards
Daniel
--
On Tuesday, 26 August 2008, Andre Rubin wrote:
I just have one more use case. I want the same prefix search as before,
plus another match in another field.
Not sure if I'm following you, but you can create your own BooleanQuery
programmatically, and then add the original PrefixQuery and any
Now I was the one who didn't follow: How do I add a query to an existing
query?
Let me be more clear on my use case:
I have two documents:
1)
label:One Two Three
type:sequence
2)
label:One Two FOUR
type:other
I want to be able to make the same kind of search as you described earlier
using
Daniel,
That sounds like what I'm after - but how do I get hold of the
IndexReader so I can call IndexReader.terms(Term) ?
The code where I am doing this work is getFieldQuery(String field,
String queryText) of my custom query parser ...
Thanks,
- Chris
Daniel Naber wrote:
On Tuesday,
On Tuesday, 26 August 2008, Chris Bamford wrote:
That sounds like what I'm after - but how do I get hold of the
IndexReader so I can call IndexReader.terms(Term) ?
The code where I am doing this work is getFieldQuery(String field,
String queryText) of my custom query parser ...
QueryParser
On Tuesday, 26 August 2008, Andre Rubin wrote:
Now I was the one who didn't follow: How do I add a query to an existing
query?
Something like this should work (field and term values taken from your
label/type example):
BooleanQuery bq = new BooleanQuery();
PrefixQuery pq = new PrefixQuery(new Term("label", "Tw"));
bq.add(pq, BooleanClause.Occur.MUST);
TermQuery tq = new TermQuery(new Term("type", "sequence"));
bq.add(tq, BooleanClause.Occur.MUST);
Thanks again Daniel,
It's working now. But for some reason, TermQuery is not working for me (I
think because I have special characters in the query). I replaced the
TermQuery with the query below and I got the results I was expecting.
Thanks
String escapedType = QueryParser.escape(type);
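A sketch of that approach (the "type" field is from the earlier example; the sample value and the use of StandardAnalyzer are assumptions):

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Query;

// Sketch: escape Lucene's special query characters before parsing,
// so that values like "some+special:value" are treated as plain text.
String type = "some+special:value"; // illustrative value
String escapedType = QueryParser.escape(type);
QueryParser parser = new QueryParser("type", new StandardAnalyzer());
Query typeQuery = parser.parse(escapedType);
```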
Hi,
Sorry if I missed this somewhere, or maybe it's not released yet, but I
was anxiously curious about Lucene 3.0's expected features/improvements.
Is there a list yet?
thanks!
Darren
On 27 Aug 2008, at 00:52, Darren Govoni wrote:
Hi,
Sorry if I missed this somewhere, or maybe it's not released yet, but I
was anxiously curious about Lucene 3.0's expected features/improvements.
Is there a list yet?
If everything goes as planned then Lucene 3.0 will be the same as
Lucene
I've been wandering for a while through this list and other Lucene
resources on the web trying to figure out the possible outlines of a
search solution which could fit my case. But as a Lucene newbie I
decided to ask for your help.
Now this is the scenario. I am building a webmail application
Mike,
Thanks for the prompt response.
Michael McCandless wrote:
Mark Lassau wrote:
I am a developer on the JIRA Issue tracker, and we are considering
upgrading our Lucene version from v2.2.0 to v2.3.2.
I have been charged with doing the risk analysis, and project work.
I have read the
Giovanni,
You could try the approach you described - one index per user. When I built
Simpy (see http://simpy.com ) a few years ago I chose the same approach and I
never regretted it. The hardware behind Simpy is very modest, usage is high,
and I never had problems with too many indices open
Dino, you lost me half-way through your email :(
NO_NORMS does not mean the field is not tokenized.
UN_TOKENIZED does mean the field is not tokenized.
Otis--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
From: Dino Korah [EMAIL PROTECTED]
To:
Dino,
If a field is not tokenized then it is indexed as is.
For example: Dino Korah would get indexed just like that. It would not get
split into multiple tokens, it would not be lowercased, it would not have any
stop words removed from it, etc.
Otis
--
Sematext -- http://sematext.com/ --