Hi,
I am using StandardAnalyzer when creating the Lucene index. It indexes the
word work as it is, but it does not index the word wo*rk that way.
Can I index such words (including * and ?) as they are? Otherwise I have no way
to index and search for words like wo*rk, you?, etc.
Thanks
--
Kalani
AUTOMATIC REPLY
Tom Roberts is out of the office till 2nd September 2008.
LUX reopens on 1st September 2008
-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
On 25 Aug 2008, at 09:19, Kalani Ruwanpathirana wrote:
> Hi,
> I am using StandardAnalyzer when creating the Lucene index. It indexes the
> word work as it is but does not index the word wo*rk in that manner.
> Can I index such words (including * and ?) as it is? Otherwise I have no way
> to index
Hi,
Thanks, I tried WhitespaceAnalyzer too, but it seems case sensitive.
If I need to search for words like correct? or html (it escapes these and
a few other characters too), I need to index those kinds of words.
On Mon, Aug 25, 2008 at 1:15 PM, Karl Wettin [EMAIL PROTECTED] wrote:
25 aug 2008 kl.
I think you meant Field.Index.NO and Field.Index.TOKENIZED, for those
two docs.
The answer is yes -- Lucene considers the field indexed if any doc,
even a single one, ever set Index.TOKENIZED or Index.UN_TOKENIZED
for that field.
However, your document A still will not have been
On 25 Aug 2008, at 11:14, Kalani Ruwanpathirana wrote:
> Hi,
> Thanks, I tried WhitespaceAnalyzer too, but it seems case sensitive.
Then you simply add a LowerCaseFilter to the chain in the Analyzer, roughly (Lucene 2.x API):
public final class WhitespaceAnalyzer extends Analyzer {
  public TokenStream tokenStream(String fieldName, Reader reader) {
    return new LowerCaseFilter(new WhitespaceTokenizer(reader));
  }
}
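To illustrate the effect of that chain without any Lucene classes (plain Java sketch; the method name is made up): whitespace splitting plus lowercasing makes matching case-insensitive while leaving characters like * and ? intact.

```java
import java.util.ArrayList;
import java.util.List;

public class WhitespaceLowercaseDemo {
    // Mimics WhitespaceTokenizer followed by LowerCaseFilter:
    // split on whitespace, lowercase each token, leave * and ? alone.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<String>();
        for (String t : text.trim().split("\\s+")) {
            tokens.add(t.toLowerCase());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("Work wo*rk You?"));
        // [work, wo*rk, you?]
    }
}
```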
Hi All,
I am new to Lucene, and I am using it for indexing and searching. Is it
possible to search substrings with it? For example, if a field holds the
value LuceneIndex and I give the query Index, I want to get this field
also. Is there any way to do this?
Thanks in Advance,
Hi,
You could use wildcard queries in that case (in case I got you right).
Though because of the way the indexed terms are stored, it would not be
advisable to have a *word-style query, but a word*-style query would be
doable in a real-world environment.
Hope this answers your question.
--
Anshum Gupta
Hi Anshum Gupta,
Thanks for your reply, but when I went through the query syntax document for
Lucene, I read that Lucene does not allow queries like *findthis, i.e. I
think it does not allow wildcards at the beginning of the query.
Is that right?
Thanks,
Venkata Subbarayudu.
Anshum-2 wrote:
Hi ,
On 25 Aug 2008, at 13:54, Venkata Subbarayudu wrote:
> Hi All,
> I am new to this Lucene, and I am using this for indexing and
> searching. Is it possible to search substrings using this, for example if a
> field holds the value LuceneIndex and if a give the query as Index, I want
> to get this
Yes, and that is the reason why I said, "it would not be advisable to have
a *word-like query, but a word*-like query would be doable".
*word : a prefix wildcard, which can be done, but it's not all that
straightforward, and it would still be highly against what I would advise.
word* : doable and OK.
Else if
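The reason can be sketched without Lucene (plain Java; the term list is made up): a segment's terms are stored in sorted order, so a word* query can seek straight to its prefix range, while a *word query has to scan every term in the dictionary.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class WildcardScanDemo {
    // Stand-in for a segment's term dictionary, which is kept sorted.
    static final String[] TERMS = { "index", "lucene", "luceneindex", "search", "wildcard" };

    // word* : binary-search to the first term >= prefix, then walk the contiguous range.
    static List<String> prefixMatch(String prefix) {
        List<String> hits = new ArrayList<String>();
        int pos = Arrays.binarySearch(TERMS, prefix);
        if (pos < 0) pos = -pos - 1;  // insertion point: first term >= prefix
        for (int i = pos; i < TERMS.length && TERMS[i].startsWith(prefix); i++) {
            hits.add(TERMS[i]);
        }
        return hits;
    }

    // *word : the sort order does not help; every term must be inspected.
    static List<String> suffixMatch(String suffix) {
        List<String> hits = new ArrayList<String>();
        for (String t : TERMS) {
            if (t.endsWith(suffix)) hits.add(t);
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println(prefixMatch("lucene"));  // [lucene, luceneindex]
        System.out.println(suffixMatch("index"));   // [index, luceneindex]
    }
}
```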
On Mon, Aug 25, 2008 at 5:37 PM, Karl Wettin [EMAIL PROTECTED] wrote:
Is this the specific use case, that you want to handle composite words as
in javaFieldAndClassNames? There is no native support for that in Lucene to
my knowledge, but it should not be too hard to implement a TokenStream
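The core splitting logic such a TokenStream would need might look like this (plain Java sketch; the Lucene plumbing around it is omitted, and the class name is made up):

```java
import java.util.ArrayList;
import java.util.List;

public class CamelCaseSplitter {
    // Break a composite identifier at uppercase boundaries and lowercase the
    // parts, so javaFieldAndClassNames becomes searchable by its components.
    static List<String> split(String word) {
        List<String> parts = new ArrayList<String>();
        StringBuilder current = new StringBuilder();
        for (char c : word.toCharArray()) {
            if (Character.isUpperCase(c) && current.length() > 0) {
                parts.add(current.toString().toLowerCase());
                current.setLength(0);
            }
            current.append(c);
        }
        if (current.length() > 0) {
            parts.add(current.toString().toLowerCase());
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(split("javaFieldAndClassNames"));
        // [java, field, and, class, names]
    }
}
```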
Hi All,
I would like to use the FilteredQuery to filter my search results by
the occurrence or absence of certain ids.
Example A:
query - text:albert einstein
filterQuery - doctype:letter
That's OK, I am getting the expected results. But I get no results if
I filter by the absence of an
Hi all,
Let's say that I have in my index the value One Two Three for field 'A'.
I'm using a custom analyzer that is described in the forwarded message.
My Search query is built like this:
QueryParser parser = new QueryParser(LABEL_FIELD, ANALYZER);
Query query =
Hello,
Can anyone tell me if it's possible to apply a filter to a SpanQuery and
still use query.getSpans(indexReader)? I'm using getSpans to get back the
original positions in the text but I would like to filter the results
returned by getSpans. I have a Filter I can apply if I just search
Heiko,
It's most likely because that B case has a purely negative query. Perhaps you
can combine it with a MatchAllDocsQuery?
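Why the purely negative case returns nothing can be sketched with plain sets (made-up doc ids, no Lucene classes): a BooleanQuery draws its candidates only from positive clauses, so a MUST_NOT-only query has nothing to subtract from; MatchAllDocsQuery as a MUST clause supplies the full set.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

public class NegativeQueryDemo {
    // A BooleanQuery's result, idealized: candidates from the MUST clause
    // minus everything matched by the MUST_NOT clause.
    static Set<Integer> booleanQuery(Set<Integer> must, Set<Integer> mustNot) {
        Set<Integer> result = new TreeSet<Integer>(must);
        result.removeAll(mustNot);
        return result;
    }

    public static void main(String[] args) {
        Set<Integer> allDocs = new TreeSet<Integer>(Arrays.asList(1, 2, 3, 4));
        Set<Integer> letters = new TreeSet<Integer>(Arrays.asList(2, 4)); // doctype:letter

        // Purely negative: no positive clause, so the candidate set is empty.
        System.out.println(booleanQuery(new TreeSet<Integer>(), letters)); // []

        // MatchAllDocsQuery as MUST + doctype:letter as MUST_NOT.
        System.out.println(booleanQuery(allDocs, letters)); // [1, 3]
    }
}
```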
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
ok, thanks. I knew that the documents were buffered in memory until they
were flushed, but I thought that in memory, they were still separate
documents/segments until they were merged together at the appropriate time
(dependent on the mergeFactor).
Do you mean that when the IndexWriter flushes
On Monday, 25 August 2008, Andre Rubin wrote:
I tried it out but with no luck (I think I did it wrong). In any
case, is MultiPhraseQuery what I'm looking for? If it is, how should I
use the MultiPhraseQuery class?
No, you won't need it. If you know that the field is not really tokenized
As a test, I tried to compare a few documents on various topics (a few on
linux, and another on the U.S. constitution) to a source document on linux
using a query formed by MoreLikeThis.
1. Looking at the hits, they have the same score. I'd expect them to be
different, based on their relevance to
Exactly as Otis says, you should use MatchAllDocsQuery as the query, but it
has a performance drawback: it checks every single document's deletion state.
I've solved the issue by making my own EnhancedMatchAllDocs query that is
optimized not to check this document state.
Perhaps the SegmentReader
Mike just committed a read-only IndexReader recently. If you pull Lucene out
of the svn trunk, you'll be able to make use of that. The r-o IR doesn't have
a synchronized isDeleted, I believe.
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
Thank you, Grant and (Koji) Sekiguchi-san.
But I don't understand how the input from reader1 and reader2 is mixed
together. Will sink1 first return the reader1 text, and then reader2?
It depends on the order the fields are added. If source1 is
used first, then reader1 will be first.
For some reason, the TermQuery is not returning any results, even when
querying for a single word (like on*).
query = new TermQuery(new Term(LABEL_FIELD, searchString));
On 8/25/08, Daniel Naber [EMAIL PROTECTED] wrote:
On Monday, 25 August 2008, Andre Rubin wrote:
I tried it out but with
When a document is added to the index, its field data is split into many
terms and saved in the index. Now, how can I get the terms for a specific
field of a specific document from the index?
Venkata Subbarayudu wrote:
> Hi Anshum Gupta,
> Thanks for your replay, but when I gone through querySyntax-Document for
> Lucene, I read that Lucene does not allow queries like *findthis i.e. I
> think it doesnot allow wildcards in the beginning of the query.
It has supported this for some time
I like your nickname.
For the question, I think you must iterate over all the terms in the index
with TermEnum and check whether each term satisfies your criteria.
Best
2008/8/26 Beijing2008 [EMAIL PROTECTED]:
> When a document add to index, fields data will split to many terms and
> saved into index.
Thanks very much, but I'm sorry, I cannot quite catch your meaning.
A sentence goes through the Analyzer.tokenStream method and yields a
TokenStream result; this TokenStream is saved into the index in some way.
Now I just want to get all the tokens for this input sentence back from
the index.
My English is very poor, maybe