Hi,
I am using the Lucene search engine on my website for document search.
Though it is working fine and finding the keywords in the documents
properly, I am facing a problem during the search.
When I search for some keywords whose occurrence is very low in
the document and
This could be the maxFieldLength default in IndexWriter? By default
IndexWriter only indexes the first 10,000 tokens of a document.
Mike
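A rough illustration of the effect described above, as a toy model in plain Java rather than Lucene's actual indexing code (the class and method names are made up for this sketch): a keyword that first appears after the token cap never makes it into the set of indexed terms, so a search for it finds nothing.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Toy model only: mimics a per-field token cap, not Lucene's real indexing code.
class TokenLimitDemo {

    // Returns the distinct terms that would be indexed if only the first
    // maxFieldLength whitespace-separated tokens of the text were kept.
    static Set<String> indexedTerms(String text, int maxFieldLength) {
        Set<String> terms = new HashSet<>();
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return terms;
        }
        String[] tokens = trimmed.split("\\s+");
        int limit = Math.min(tokens.length, maxFieldLength);
        for (int i = 0; i < limit; i++) {
            terms.add(tokens[i].toLowerCase(Locale.ROOT));
        }
        return terms;
    }

    // Builds a document of fillerCount filler tokens followed by one rare keyword.
    static String documentWithLateKeyword(int fillerCount, String keyword) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fillerCount; i++) {
            sb.append("filler ");
        }
        sb.append(keyword);
        return sb.toString();
    }

    public static void main(String[] args) {
        String doc = documentWithLateKeyword(10_000, "rareword");
        // With a cap of 10,000 tokens the keyword (token number 10,001) is dropped.
        System.out.println(indexedTerms(doc, 10_000).contains("rareword"));
        // With an effectively unlimited cap the keyword is kept.
        System.out.println(indexedTerms(doc, Integer.MAX_VALUE).contains("rareword"));
    }
}
```

If this is the cause, raising the limit on the writer before indexing should help; in the Lucene 2.x API this is, as far as I know, done with IndexWriter.setMaxFieldLength(int), and the documents have to be re-indexed afterwards.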
shrish garg wrote:
Hi,
I am using Lucene search engine in my website for document
search .
though it is working fine and searching the
OK, I opened LUCENE-1254 and committed the fix to trunk (upcoming)
2.3.2.
Mike
Yonik Seeley wrote:
On Mon, Mar 31, 2008 at 5:19 AM, Michael McCandless
[EMAIL PROTECTED] wrote:
I think we should remove those checks and allow addIndexesNoOptimize
to import and index even if it has
Maybe you can index the set of documents in a temporary index. This index
needs only one field (tag).
Then you can browse the term collection of the index and get each
term/frequency pair
IndexReader reader = IndexReader.open(temp_index);
TermEnum terms = reader.terms();
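
For comparison, here is a minimal in-memory sketch of the same term/frequency idea in plain Java. It does not use a temporary Lucene index or TermEnum at all; it simply counts term occurrences across a list of titles and picks the most frequent ones. The class name, method names, and tokenization rule are assumptions made for this illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for a tag cloud; a stand-in for walking an index's terms.
class TagCloudDemo {

    // Counts how often each lower-cased term occurs across all titles.
    static Map<String, Integer> termFrequencies(List<String> titles) {
        Map<String, Integer> freq = new HashMap<>();
        for (String title : titles) {
            // Split on anything that is not a letter or digit (an assumed, naive tokenizer).
            for (String token : title.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
                if (!token.isEmpty()) {
                    freq.merge(token, 1, Integer::sum);
                }
            }
        }
        return freq;
    }

    // Returns up to n terms, most frequent first; ties are broken alphabetically.
    static List<String> topTerms(Map<String, Integer> freq, int n) {
        List<String> terms = new ArrayList<>(freq.keySet());
        terms.sort((a, b) -> {
            int byCount = Integer.compare(freq.get(b), freq.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        });
        return new ArrayList<>(terms.subList(0, Math.min(n, terms.size())));
    }

    public static void main(String[] args) {
        List<String> titles = new ArrayList<>();
        titles.add("Lucene lucene LUCENE search");
        titles.add("Search index");
        Map<String, Integer> freq = termFrequencies(titles);
        System.out.println(topTerms(freq, 2));
    }
}
```

Whether this is fast enough for tens of thousands of results per query would have to be measured; an index-based approach trades the counting work for indexing time.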
Hi all,
Snowball stemmers are part of Lucene, but only for a few languages. We
have documents in various languages and so need stemmers for many
languages (in particular Polish). One of the ideas is to use ispell
dictionaries. There are ispell dicts for many languages and so this
solution is good
So build an index for the dynamically generated document set, and then try
to find the frequency of each term in this index... not sure it's fast enough, but
it's worth a try...
Thank you Dominique!
- Original Message -
From: Dominique Béjean [EMAIL PROTECTED]
To:
On www.crossfeeds.com, I use this method to update a tag cloud hourly,
based on the titles of 20,000 RSS articles (RSS published during the
last 24 hours). It takes 1 minute.
-Original Message-
From: wuqi [mailto:[EMAIL PROTECTED]
Sent: Tuesday, April 1, 2008 14:10
To:
See Chris's reply, but for this So I will not
want to return higher PositionIncrement for each instance of a field, just
those which I'm interested in (title/headers)
I think you want PerFieldAnalyzerWrapper.
Erick
On Mon, Mar 31, 2008 at 10:56 AM, Itamar Syn-Hershko [EMAIL PROTECTED]
wrote:
I registered just now; it is an interesting website.
It seems crossfeeds generates its tag cloud offline, hourly? But I have a
stricter time requirement. Users submit a query on my website, and they may get
tens of thousands of search results. I need to generate a tag cloud for all
these
We use Lucene to create simple data stores that we deploy with our
application. Our application also supports auto-updating and we refresh
these data stores monthly. Since Lucene computes the names for the index we
end up deploying new files each time while leaving the old files to continue
I have two slightly different queries, and am filtering to return only a
single unique document. The scores are very slightly different, but in the
opposite way from what my (naive) reasoning would have expected.
In the first case the query is
text:j2ee^2.0, text:soa^2.0, text:webservic,
Wojtek H wrote:
Snowball stemmers are part of Lucene, but for few languages only. We
org.apache.lucene.analysis contains a few more stemmers.
have documents in various languages and so need stemmers for many
languages (in particular polish).
Have you seen Stempel?
Donna L Gresh wrote:
I have two slightly different queries,
Hi Donna,
I can't help you, but perhaps I would understand everything better if you
also pasted in the explanations.
karl
Sure; here are the two explanations (below). Your question made me go look
at the explanation more carefully again and (no) surprise, I discovered
that I
misspoke (miswrote) earlier; the two found terms are j2ee and soa,
which then makes my concern much less of one, since in both cases, the
On Tuesday 01 April 2008 18:51:55 Dominique Béjean wrote:
IndexReader reader = IndexReader.open(temp_index);
TermEnum terms = reader.terms();
while (terms.next()) {
String field = terms.term().field();
Gotcha: after calling terms() it's already pointing at
Hi All,
Is there any way to store strings of the following type in compressed
form in a Lucene index?
String str = II0264.D05|00022745|ABCDE|03/01/2008 00:23:12|00035|
9840836588| 129382152520| 04F4243B600408|04F4243B600408|
|11919898456123|354943011025810L| CPTBS2I|
17 matches