Actually, Field.NO_NORMS means Field.UN_TOKENIZED plus
Field.setOmitNorms(true).
Mike
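Mike's equivalence can be sketched against the Lucene 2.x API discussed in this thread (the field name "sku" and its value are invented; this won't compile without the lucene-core jar on the classpath):

```java
import org.apache.lucene.document.Field;

public class NoNormsSketch {
    public static void main(String[] args) {
        // Shorthand: indexed as a single term, with norms omitted.
        Field a = new Field("sku", "X-42", Field.Store.YES, Field.Index.NO_NORMS);

        // The same thing spelled out: UN_TOKENIZED plus setOmitNorms(true).
        Field b = new Field("sku", "X-42", Field.Store.YES, Field.Index.UN_TOKENIZED);
        b.setOmitNorms(true);
    }
}
```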
John Griffin wrote:
Dimitri,
Field.TOKENIZED and Field.NO_NORMS send their field's contents
through a tokenizer and make their contents indexed and therefore
searchable. Field.UN_TOKENIZED does not send its field's contents through a
tokenizer, but it still indexes its contents.
So from what I understand, is it true that if mergeFactor is 10, then when I
index my first 9 documents, I have 9 separate segments, each containing 1
document? And when searching, will it search through every segment?
Thanks!
David
Dimitri,
Field.TOKENIZED and Field.NO_NORMS send their field's contents
through a tokenizer and make their contents indexed and therefore
searchable. Field.UN_TOKENIZED does not send its field's contents through a
tokenizer, but it still indexes its contents. Only Field.NO does not index
its contents.
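John's summary maps onto the Field constructor like this (a sketch against the Lucene 2.x API; the field names and values are invented, and the lucene-core jar is assumed):

```java
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;

public class IndexOptionsSketch {
    public static void main(String[] args) {
        Document doc = new Document();
        // TOKENIZED: run through the analyzer; every resulting term is searchable.
        doc.add(new Field("body", "quick brown fox", Field.Store.NO, Field.Index.TOKENIZED));
        // UN_TOKENIZED: indexed as one exact term, no analysis applied.
        doc.add(new Field("id", "DOC-001", Field.Store.YES, Field.Index.UN_TOKENIZED));
        // NO: stored only; retrievable from a hit but never searchable.
        doc.add(new Field("raw", "opaque payload", Field.Store.YES, Field.Index.NO));
    }
}
```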
On Aug 22, 2008, at 3:47 PM, Teruhiko Kurosaka wrote:
Hello,
I'm interested in knowing how these tokenizers work together.
The API doc for TeeTokenFilter
http://lucene.apache.org/java/2_3_1/api/org/apache/lucene/analysis/TeeTokenFilter.html
has this sample code:
SinkTokenizer sink1 = new SinkTokenizer(null);
Try Hibernate Search - http://www.hibernate.org/410.html
John G.
-----Original Message-----
From: ??? [mailto:[EMAIL PROTECTED]
Sent: Friday, August 22, 2008 3:27 AM
To: java-user@lucene.apache.org
Subject: Lucene Indexing DB records?
Guess I don't quite understand why there are so few posts about Lucene
indexing DB records.
Hello,
I'm interested in knowing how these tokenizers work together.
The API doc for TeeTokenFilter
http://lucene.apache.org/java/2_3_1/api/org/apache/lucene/analysis/TeeTokenFilter.html
has this sample code:
SinkTokenizer sink1 = new SinkTokenizer(null);
SinkTokenizer sink2 = new SinkTokenizer(null);
Normalization is done on a field-by-field basis, as is most scoring.
It doesn't factor all fields in, because someone might not be querying all
fields. The field it does use is based on the query.
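One place this per-field behavior is visible is Similarity's length norm, which receives only a field name and that field's term count, never the whole document. A hedged sketch against the Lucene 2.x API (the "title" special case is hypothetical):

```java
import org.apache.lucene.search.DefaultSimilarity;

public class PerFieldNormSketch extends DefaultSimilarity {
    // Called once per field per document at index time; numTerms counts the
    // terms of that one field, not of the whole document.
    public float lengthNorm(String fieldName, int numTerms) {
        if ("title".equals(fieldName)) {
            return 1.0f; // hypothetical: neutralize length normalization for titles
        }
        return super.lengthNorm(fieldName, numTerms); // default: 1/sqrt(numTerms)
    }
}
```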
On Aug 18, 2008, at 10:44 PM, blazingwolf7 wrote:
Hi,
I am currently working on the calculatio
: I am using the SnowballAnalyzer because of its multi-language stemming
: capabilities - and am very happy with that.
: There is one small glitch which I'm hoping to overcome - can I get it to split
: up internet domain names in the same way that StopAnalyzer does?
90% of the Lucene Analyzers
Actually there are many projects for Lucene + Database. Here is a list I
know:
* Hibernate Search
* Compass, (also Hibernate + Lucene)
* Solr + DataImportHandler (Searching + Crawler)
* DBSight, (Specific for database, closed source, but very customizable,
easy to setup)
* Browse Engine
--
Chris
I am new to Lucene. Here is my question. The document has fields. When I add
a field to the document I can specify that the field is Indexed, Tokenized,
etc. So the same field can be Tokenized in one document and
not-tokenized in another document. However, there is a method
IndexReader.getFieldNames(
Not that I know of. But if you're storing Lucene doc IDs as part of
existing search results, you're playing with fire anyway. Unless there's
a compelling reason to avoid it, you're usually better off storing your
own unique doc ID in a different field and using that because you can
guarantee that it won't change when segments are merged.
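The suggestion above might look like this (a sketch against the Lucene 2.x API; the field name "myId" and the key values are invented, and the lucene-core jar is assumed):

```java
import java.io.IOException;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TermQuery;

public class StableIdSketch {
    // Index time: store the application's own key as a single untokenized term.
    static Document withKey(String key) {
        Document doc = new Document();
        doc.add(new Field("myId", key, Field.Store.YES, Field.Index.UN_TOKENIZED));
        return doc;
    }

    // Search time: resolve documents by the stable key, never by a raw
    // Lucene doc ID, which can shift when segments merge.
    static Document lookup(IndexSearcher searcher, String key) throws IOException {
        Hits hits = searcher.search(new TermQuery(new Term("myId", key)));
        return hits.length() > 0 ? hits.doc(0) : null;
    }
}
```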
AUTOMATIC REPLY
Tom Roberts is out of the office till 2nd September 2008.
LUX reopens on 1st September 2008
-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
Hello,
I'd like to modify a Field in an already indexed Document. The only way so
far that I have found is to delete the document through an IndexReader and
add a new one with an IndexWriter. This has the undesirable property that it
alters existing search results for a given keyword. Is there a better way?
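Since Lucene 2.1 the delete-then-add pair is also available as a single call, IndexWriter.updateDocument(Term, Document). A sketch (the unique-key field "myId" is invented; the lucene-core jar is assumed):

```java
import java.io.IOException;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;

public class UpdateSketch {
    // Deletes every document containing the key term, then adds the new one.
    // Internally this is still delete + add, so index statistics that feed
    // scoring can shift just as with the manual two-step approach.
    static void replace(IndexWriter writer, String key, Document newDoc) throws IOException {
        writer.updateDocument(new Term("myId", key), newDoc);
    }
}
```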
That is very clever. With that, the text we index will go through the
analyser but will not get tokenized, and it will hit the analyser the same
way when we search, again untokenized.
Brilliant!!
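The original trick being praised isn't quoted here, but one common way to get a field through the analysis chain untokenized, on both the index and the query side, is a PerFieldAnalyzerWrapper with KeywordAnalyzer. A guess at the approach (the field name "sku" is invented; the lucene-core jar is assumed):

```java
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.KeywordAnalyzer;
import org.apache.lucene.analysis.PerFieldAnalyzerWrapper;
import org.apache.lucene.analysis.standard.StandardAnalyzer;

public class UntokenizedFieldSketch {
    static Analyzer build() {
        // KeywordAnalyzer emits the whole field value as one token, so "sku"
        // stays untokenized while every other field gets StandardAnalyzer.
        PerFieldAnalyzerWrapper wrapper = new PerFieldAnalyzerWrapper(new StandardAnalyzer());
        wrapper.addAnalyzer("sku", new KeywordAnalyzer());
        return wrapper;
    }
}
```

Passing the same wrapper to both IndexWriter and the query parser keeps index-time and search-time analysis consistent.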
-----Original Message-----
From: Andre Rubin [mailto:[EMAIL PROTECTED]
Sent: 21 August 2008 08:21
To: java-user@lucene.apache.org
You might also want to look at Solr and DataImportHandler.
http://lucene.apache.org/solr
http://wiki.apache.org/solr/DataImportHandler
On Fri, Aug 22, 2008 at 2:56 PM, ??? <[EMAIL PROTECTED]> wrote:
> Guess I don't quite understand why there are so few posts about Lucene
> indexing DB records. Searched Markmail, but most of the Lucene+DB posts
> have to do with lucene index management.
Guess I don't quite understand why there are so few posts about Lucene indexing
DB records. Searched Markmail, but most of the Lucene+DB posts have to do with
lucene index management.
The only thing I found so far is the following, if you have a minute or two:
http://kalanir.blogspot.com/2008/06