StandardAnalyzer should work fine. Mark the field as indexed; there is no
need to store it unless you want to retrieve it for display.
Query via QueryParser using "tagname: updateC*" or programmatically via
PrefixQuery.
I'm not sure exactly what you mean by "strict prefix", though. If
you mean that the
I'd also add that the Document keeps a pointer to the spot in storage where
that value can be loaded from. It can result in a performance saving in the
typical search use case where one is displaying just "metadata" fields on a
page, but not the full content. In this case, the full content pag
On May 20, 2010, at 5:15 AM, manjula wijewickrema wrote:
> Hi,
>
> I wrote a program to get the frequencies and terms of an indexed document.
> The output is as follows:
>
>
> If I print : +tfv[0]
>
> Output:
>
> array terms are:{title: capabl/1, code/2, frequenc/1, lucen/4, over/1,
> samp
Hi guys!
Is there a way to define a threshold on the terms I want to store
in the index (before they are indexed)? I need to store the terms with the
highest frequencies. I did it with term vectors and a cutoff ratio that
cuts off the least frequent terms, but all this is, of course, work
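For what it's worth, a cutoff like this can be applied to plain term counts before anything reaches the index. Below is a minimal sketch in plain Java (no Lucene calls), assuming "cutoff ratio" means a fraction of the highest frequency in the document. The class and method names are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Lucene.
final class TermCutoff {
    // Keeps the terms whose frequency is at least cutoffRatio * (highest frequency).
    // The meaning of the ratio is an assumption; the original post does not define it.
    static Map<String, Integer> keepFrequent(Map<String, Integer> freqs, double cutoffRatio) {
        int max = 0;
        for (int f : freqs.values()) {
            max = Math.max(max, f);
        }
        double threshold = max * cutoffRatio;
        Map<String, Integer> kept = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : freqs.entrySet()) {
            if (e.getValue() >= threshold) {
                kept.put(e.getKey(), e.getValue());
            }
        }
        return kept;
    }
}
```

Only the surviving terms would then be passed on to indexing. How the counts are produced (for example by running the analyzer over the text first) is left open here.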
Hi,
Thanks. By "strict prefix", I meant a prefix of the name
(case-insensitive). What you suggest ("tagname: updateC*") was the
first thing I tried, but it happens to work only partially. In my
case, I have a lot of names beginning with "m_sz", e.g.
"m_szComment", "m_szName". Trying a query like "
I bet it's that underscore in m_sz. Different analyzers do different
things with different punctuation characters. I can never remember
which does exactly what - it'll be in the javadocs or Lucene In Action
or somewhere on the web.
You can check what exactly has been indexed by using Luke - alwa
Hi,
Thanks for the help.
BTW, if anyone is interested: I tried the same with KeywordAnalyzer -
added a field with value "m_szName", and tried to find it using
"FieldName:m_szN*" but failed. Someone in the Lucene IRC channel
showed me why - QueryParser, by default, lowercases all expanded terms
(e.
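The effect is easy to reproduce without Lucene. Here is a minimal sketch in plain Java, assuming the term was indexed verbatim (as KeywordAnalyzer does) and that prefix matching boils down to a startsWith check. `PrefixCaseDemo` and `matches` are hypothetical names, not Lucene API.

```java
import java.util.Locale;

// Hypothetical demo class, not part of Lucene.
final class PrefixCaseDemo {
    // Simulates a prefix lookup against a term that was indexed unchanged.
    // lowercasePrefix mimics a parser that lowercases the prefix before searching.
    static boolean matches(String indexedTerm, String prefix, boolean lowercasePrefix) {
        String p = lowercasePrefix ? prefix.toLowerCase(Locale.ROOT) : prefix;
        return indexedTerm.startsWith(p);
    }

    public static void main(String[] args) {
        // Lowercased prefix "m_szn" does not match the mixed-case term.
        System.out.println(matches("m_szName", "m_szN", true));
        // Prefix left as typed does match.
        System.out.println(matches("m_szName", "m_szN", false));
    }
}
```

If I remember the API of that era correctly, calling `setLowercaseExpandedTerms(false)` on the QueryParser switches the lowercasing off; the other route is to lowercase the values at index time so both sides agree.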
I should add that talks on Mahout, Tika, Nutch, etc. are also encouraged.
-Grant
On May 17, 2010, at 8:43 AM, Grant Ingersoll wrote:
> Lucene Revolution Call For Participation - Boston, Massachusetts October 7 &
> 8, 2010
>
> The first US conference dedicated to Apache Lucene and Solr is comi
Does anyone know of any classes available that allow you to define and use your
own synonyms when searching with Lucene? I read some about WordNet but it
seems those synonyms are predefined English words. The application I am working
with searches for the names of contacts and companies. I wou
Larry, you should look at the SynonymFilter in Lucene Contrib Analysis.
simon
On Mon, May 24, 2010 at 9:40 PM, Larry Hendrix wrote:
> Does anyone know of any classes available that allow you to define and use
> your own synonyms when searching with Lucene? I read some about WordNet but
> it
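To make the idea of user-defined synonyms concrete, here is a minimal sketch in plain Java of query-time expansion from a custom table. It does not use Lucene at all; `NameSynonyms`, `add` and `expand` are hypothetical names, and the Bob/Robert pair is an invented example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of Lucene.
final class NameSynonyms {
    private final Map<String, List<String>> synonyms = new HashMap<>();

    // Registers a one-way mapping: searching for `term` also searches for `alternative`.
    void add(String term, String alternative) {
        synonyms.computeIfAbsent(term.toLowerCase(Locale.ROOT), k -> new ArrayList<>())
                .add(alternative.toLowerCase(Locale.ROOT));
    }

    // Returns the lowercased term followed by any registered alternatives.
    List<String> expand(String term) {
        String key = term.toLowerCase(Locale.ROOT);
        List<String> out = new ArrayList<>();
        out.add(key);
        out.addAll(synonyms.getOrDefault(key, Collections.<String>emptyList()));
        return out;
    }
}
```

The expanded terms could then be OR-ed together when building the query. A token filter in the analysis chain does the same job at the token level, which is presumably what the suggested filter is for.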
Why do you want to calculate this? This is done for
you by the indexing process and taken into account
when searching.
You're asking for a solution before defining the problem,
which makes it very hard to help.
See: http://people.apache.org/~hossman/#xyproblem
Best
Erick
On Mon, May 24, 2010 at