Hi Kumaran,
Below is part of the code and the .alg file.
Here is the function from DocMaker.java in the package
"org.apache.lucene.benchmark.byTask.feeds":
/** Set the configuration parameters of this doc maker. */
public void setConfig(Config config, ContentSource source) {
Lucene 4.9 gives much the same result.
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.ja.JapaneseAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import
Thanks so much, that works.
Jin
On Tue, Aug 19, 2014 at 4:13 PM, Uwe Schindler wrote:
> Hi,
> Look at his docs. He has only 2 docs, the second one has 3 keywords.
>
> I would use a simple phrase query with a slop value < Analyzer's
> positionIncrementGap. This is the gap between fields with same na
On Tue, Aug 19, 2014 at 5:27 PM, Uwe Schindler wrote:
> Hi,
>
> You forgot to close (or commit) IndexWriter before opening the reader.
Huh? The code I posted is closing it:
try (IndexWriter writer = new IndexWriter(directory,
new IndexWriterConfig(Version.LUCENE_36, analyser))) {
Oh sorry guys, ignore what I said. I am going to get myself a coffee. Uwe is
absolutely correct here.
On Aug 19, 2014, at 01:13 PM, Uwe Schindler wrote:
Hi,
Look at his docs. He has only 2 docs, the second one has 3 keywords.
I would use a simple phrase query with a slop value < Analyzer's
positi
Hi,
Look at his docs. He has only 2 docs, the second one has 3 keywords.
I would use a simple phrase query with a slop value < Analyzer's
positionIncrementGap. This is the gap between fields with the same name. Span or
phrase queries cannot cross the gap if the slop is small enough, but large enough to find
the te
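
The advice above can be illustrated with a small toy model in plain Java. It is not Lucene code: the class PositionGapSketch, its methods, the gap of 100 and the window of 10 are all made up for illustration, and the window check is a simplification of how slop is really computed. It only shows why a gap larger than the allowed distance keeps a match from spanning separate values of the same field.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model (NOT Lucene code): term positions in a multi-valued field.
class PositionGapSketch {

    // Assigns a position to every term. Each additional value of the
    // field starts `gap` positions further on (hypothetical helper).
    static Map<String, List<Integer>> positions(List<String> values, int gap) {
        Map<String, List<Integer>> result = new HashMap<>();
        int next = 0;
        for (int v = 0; v < values.size(); v++) {
            if (v > 0) {
                next += gap;
            }
            for (String term : values.get(v).toLowerCase().split("\\s+")) {
                result.computeIfAbsent(term, k -> new ArrayList<>()).add(next++);
            }
        }
        return result;
    }

    // Simplified stand-in for a sloppy phrase: true if one position per
    // query term can be picked so that they all lie within `window`.
    static boolean withinWindow(Map<String, List<Integer>> pos, List<String> terms, int window) {
        return search(pos, terms, 0, Integer.MAX_VALUE, Integer.MIN_VALUE, window);
    }

    private static boolean search(Map<String, List<Integer>> pos, List<String> terms,
                                  int i, int min, int max, int window) {
        if (i == terms.size()) {
            return true;
        }
        List<Integer> candidates = pos.getOrDefault(terms.get(i), Collections.<Integer>emptyList());
        for (int p : candidates) {
            int lo = Math.min(min, p);
            int hi = Math.max(max, p);
            if (hi - lo <= window && search(pos, terms, i + 1, lo, hi, window)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> query = List.of("states", "united", "america");
        // doc 1 has one value, doc 2 has three separate values of "label".
        Map<String, List<Integer>> doc1 = positions(List.of("United States of America"), 100);
        Map<String, List<Integer>> doc2 = positions(List.of("United", "America", "States"), 100);
        System.out.println("doc 1 matches: " + withinWindow(doc1, query, 10)); // true
        System.out.println("doc 2 matches: " + withinWindow(doc2, query, 10)); // false
    }
}
```

In this model, a gap of 0 would let doc 2 pass the same check, which is why the window has to stay below the gap.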
Whoops, the constraint should be MUST to force all terms to be present:
http://lucene.apache.org/core/4_6_0/core/org/apache/lucene/search/BooleanClause.Occur.html#MUST
On Aug 19, 2014, at 01:05 PM, "Tri Cao" wrote:
The OR operator does that; AND only returns docs with ALL terms present.
Note that you h
The OR operator does that; AND only returns docs with ALL terms present.
Note that you have two options here:
1. Create a BooleanQuery object (see the Java doc I linked below) and
programmatically add the term queries with the following constraint:
http://lucene.apache.org/core/4_6_0/core/org/apache/l
Thanks for the reply, but won't BooleanQuery return both doc1 and doc2 with
query:
label:States AND label:America AND label:United
Best,
Jin
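
The concern can be checked with a toy model in plain Java. This is not Lucene code: the class BooleanAndSketch and its helpers are hypothetical, and a document is reduced to the set of lowercased terms found in all values of its label field. Under that assumption, a pure AND of the three terms cannot tell the two documents apart.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model (NOT Lucene code): AND semantics over a document's term set.
class BooleanAndSketch {

    // Collects the lowercased terms of every value of one field (hypothetical helper).
    static Set<String> terms(String... values) {
        Set<String> out = new HashSet<>();
        for (String value : values) {
            out.addAll(Arrays.asList(value.toLowerCase().split("\\s+")));
        }
        return out;
    }

    // A document matches when every required term is present, wherever it occurs.
    static boolean matchesAll(Set<String> docTerms, List<String> required) {
        return docTerms.containsAll(required);
    }

    public static void main(String[] args) {
        List<String> query = Arrays.asList("states", "america", "united");
        Set<String> doc1 = terms("United States of America");
        Set<String> doc2 = terms("United", "America", "States");
        System.out.println("doc 1 matches: " + matchesAll(doc1, query)); // true
        System.out.println("doc 2 matches: " + matchesAll(doc2, query)); // true
    }
}
```

Both documents match here because the check ignores where in the field each term occurs.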
On Tue, Aug 19, 2014 at 2:07 PM, Tri Cao wrote:
> Given that example, the easy way is a boolean AND query of all the terms:
>
>
> http://lucene.apache.org/c
Hi Kumaran,
I am using the benchmark utility from Lucene and doing the indexing via an
.alg file.
Would you like to see the alg file instead?
Thank you.
Regards,
Sachin
On Tue, Aug 19, 2014 at 9:42 AM, Kumaran Ramasubramanian wrote:
> Hi Sachin
>
> i want to look into ur indexing cod
Given that example, the easy way is a boolean AND query of all the terms:
http://lucene.apache.org/core/4_6_0/core/org/apache/lucene/search/BooleanQuery.html
However, if your corpus is more sophisticated you'll find that relevance
ranking is not always that trivial :)
On Aug 19, 2014, at 11:00
Hi,
I am wondering if someone can help me on this:
I have an index:
doc 1 -- label: United States of America
doc 2 -- label: United
doc 2 -- label: America
doc 2 -- label: States
I am wondering how to generate a query with the terms: states united america
so that only doc 1 is returned.
I was thinking Spa
Erick, Solr's termfreq implementation also uses DocsEnum, on the assumption
that freq is called on ascending doc IDs, which is valid when scoring from
the hit list. If freq is requested for an out-of-order doc, a new DocsEnum
has to be created.
Bianca, can you explain your use case in more
Have you looked into term vectors? I think they should fit the bill
pretty neatly. Here's a nice blog post with helpful background info:
http://blog.jpountz.net/post/41301889664/putting-term-vectors-on-a-diet
-Mike
On 8/19/2014 10:04 AM, Bianca Pereira wrote:
Hi everybody,
I would like
Hmmm, I'm not at all an expert here, but Solr has a function
query "termfreq" that does what you're doing I think? I wonder
if the code for that function query would be a good place to
copy (or even make use of)? See TermFreqValueSource...
Maybe not helpful at all, but...
Erick
On Tue, Aug 19, 20
Hi Sachin
I want to look into your indexing code. Please share it.
-
Kumaran R
On Tue, Aug 19, 2014 at 7:18 PM, Sachin Kulkarni
wrote:
> Hi,
>
> Sorry for all the code; it got sent out accidentally.
>
> The following code is part of the Benchmark utility in Lucene, specifically
> Subm
Hi everybody,
I would like to know your suggestions to calculate Term Frequency in a
Lucene document. Currently I am using MultiFields.getTermDocsEnum,
iterating through the DocsEnum 'de' returned and getting the frequency with
de.freq() for the desired document.
My solution gives me the resu
Hi,
Sorry for all the code; it got sent out accidentally.
The following code is part of the Benchmark utility in Lucene, specifically
SubmissionReport.java
// Here reader is the IndexReader.
Iterator itr = docMap.entrySet().iterator();
int totalNumDocuments = reader.numDocs();
Hi Kumaran,
The following code is part of the Benchmark utility in Lucene, specifically
SubmissionReport.java
Iterator itr = docMap.entrySet().iterator();
int totalNumDocuments = reader.numDocs();
ScoreDoc sd[] = td.scoreDocs;
String sep = " \t ";
DocNameExtractor docext = new DocNameExtracto
Hi,
You forgot to close (or commit) IndexWriter before opening the reader.
Uwe
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
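
The point about closing (or committing) before opening the reader can be sketched with a toy model in plain Java. None of this is Lucene code: ToyDirectory and ToyWriter are hypothetical stand-ins, and the model simply assumes that added documents become visible to a newly opened reader only after commit() or close(). It also shows that a try-with-resources block calls close() only when the block ends, so a reader opened inside the block still sees nothing.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model (NOT Lucene code): documents become visible only after commit/close.
class CommitVisibilitySketch {

    // Hypothetical stand-in for an index directory.
    static class ToyDirectory {
        final List<String> committed = new ArrayList<>();

        // What a reader opened right now would report as its document count.
        int openReaderNumDocs() {
            return committed.size();
        }
    }

    // Hypothetical stand-in for a writer that buffers until commit() or close().
    static class ToyWriter implements AutoCloseable {
        private final ToyDirectory dir;
        private final List<String> pending = new ArrayList<>();

        ToyWriter(ToyDirectory dir) {
            this.dir = dir;
        }

        void addDocument(String doc) {
            pending.add(doc);
        }

        void commit() {
            dir.committed.addAll(pending);
            pending.clear();
        }

        @Override
        public void close() {
            commit(); // closing also makes pending documents visible
        }
    }

    public static void main(String[] args) {
        ToyDirectory dir = new ToyDirectory();
        int seenInside = -1;
        try (ToyWriter writer = new ToyWriter(dir)) {
            writer.addDocument("doc 1");
            // The writer is still open here, so nothing has been committed yet.
            seenInside = dir.openReaderNumDocs();
        }
        // close() has run at the end of the try block.
        int seenAfter = dir.openReaderNumDocs();
        System.out.println("inside try: " + seenInside + ", after try: " + seenAfter);
    }
}
```

In this model, calling commit() inside the block, or opening the reader after the block, both make the document visible.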
> -Original Message-
> From: Trejkaz [mailto:trej...@trypticon.org]
> Sent: Tuesday, August 19, 2014 6:
Hi Sachin Kulkarni,
If possible, please share your code.
-
Kumaran R
On Tue, Aug 19, 2014 at 9:07 AM, Sachin Kulkarni
wrote:
> Hi,
>
> I am using Lucene 4.6.0.
>
> I have been storing 5 fields for my documents in the index, namely body,
> title, docname, docdate and docid.
>
> But whe