.
Regards,
Sengly
On Mon, Jun 16, 2008 at 9:14 PM, Grant Ingersoll [EMAIL PROTECTED]
wrote:
What do your documents look like? Can you share more about the problem?
Is there some kind of structure that lets you count this information?
-Grant
On Jun 15, 2008, at 5:08 AM, Sengly Heng wrote
://wunderwood.org/most_casual_observer/2007/04/progressive_reranking.html
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
From: Sengly Heng [EMAIL PROTECTED]
To: java-user@lucene.apache.org
Sent: Friday, June 13, 2008 11:47:26 AM
Subject: Seeking
Hello all,
I am facing a problem when dealing with a query such as "Find all the
documents that write about at least 5 animals". How can I handle it?
Do you have any idea?
Thank you.
Best regards,
Sengly
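
One way to read the "at least 5 animals" question is as a counting problem: given a known list of animal names, count how many distinct ones occur in each document and keep the documents that reach the threshold. The sketch below is plain Java and only an illustration under that assumption; the class name, method names, and the idea of a fixed animal list are hypothetical and do not come from the thread. Inside Lucene, a BooleanQuery of optional animal terms combined with setMinimumNumberShouldMatch(5) may give a similar effect, though that was not verified against the poster's setup.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: counts distinct category terms in a tokenized document.
class CategoryCountSketch {

    // Number of distinct terms from `category` that occur in `tokens`.
    static int distinctMatches(List<String> tokens, Set<String> category) {
        Set<String> seen = new HashSet<>(tokens); // drop repeated tokens
        seen.retainAll(category);                 // keep only category members
        return seen.size();
    }

    // True if the document mentions at least `min` distinct category terms.
    static boolean mentionsAtLeast(List<String> tokens, Set<String> category, int min) {
        return distinctMatches(tokens, category) >= min;
    }
}
```

Repeated mentions of the same animal count once here, which seems to match the wording of the question.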
Dear all,
I would like to ask for your suggestions on a re-ranking methodology. I
have a set of documents returned for a query, each with a matching
score, and also a list of relatedness scores between each pair of them.
I would like to re-rank my resulting
documents by
Hello all,
I would like to extract the term frequency vector from the hit results as a
single total vector, not per document.
I have searched the mailing list and found that many people have talked about
this issue, but I still could not find the right solution. Everyone just
suggested to look at
have any. Your help is highly appreciated.
Best,
Sengly
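
A "total" term frequency vector over the hits can be sketched as a sum of the per-document vectors. The example below is plain Java and assumes each hit's terms and frequencies are already available as two parallel arrays (in Lucene of that era these would presumably come from the stored term vector of each hit, which requires the field to be indexed with term vectors). The class and method names are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: merges per-document term frequencies into one total vector.
class TotalTermVectorSketch {

    // Adds one document's (term, frequency) pairs to the running totals.
    // `terms` and `freqs` are parallel arrays of equal length.
    static void addDocument(Map<String, Integer> total, String[] terms, int[] freqs) {
        for (int i = 0; i < terms.length; i++) {
            total.merge(terms[i], freqs[i], Integer::sum);
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> total = new TreeMap<>();
        addDocument(total, new String[] {"blue", "sky"}, new int[] {2, 1});
        addDocument(total, new String[] {"blue", "green"}, new int[] {3, 4});
        System.out.println(total);
    }
}
```

Calling addDocument once per hit yields the combined vector; a TreeMap is used only so the terms come out sorted.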
Sengly Heng wrote:
Hello all,
I would like to extract the term frequency vector from the hit results as a
single total vector, not per document.
I have searched the mailing list and found that many people have talked about
this issue, but I still could not find
Dear Karl,
Thank you for taking the time to look at my problem.
We don't really know what your problem is. Explaining that rather
than the solution you have thought of might render a couple of
alternate solutions. Perhaps something could be precalculated and
stored in the documents. Perhaps
Once again, thank you for your help.
We don't really know what your problem is. Explaining that rather
than the solution you have thought of might render a couple of
alternate solutions. Perhaps something could be precalculated and
stored in the documents. Perhaps feature selection
Dear all,
My problem is a little bit strange. Instead of passing the content of the
document to the indexer, I am adding the terms one by one. Here is a piece
of my code:
Document doc = new Document();
doc.add(Field.Text("Features", "blue"));
doc.add(Field.Text("Features", "beautiful"));
fields without knowing in advance
which tokens we have.
Once again, thank you very much for your reply.
Best regards,
Sengly
On 4/4/07, Erick Erickson [EMAIL PROTECTED] wrote:
See below
On 4/4/07, Sengly Heng [EMAIL PROTECTED] wrote:
Dear all,
My problem is a little bit strange
);
TermEnum te = ISer.terms(new Term("Features", "blue"));
Term te1 = te.term();
System.out.println("Frequency of blue " + ISer.docFreq(te1));
regards,
-LM
On 4/4/07, Sengly Heng [EMAIL PROTECTED] wrote:
Dear all,
My problem is a little bit strange. Instead of parsing the content
Hello Luceners,
I have a collection of term (token) vectors that I extracted from files.
I am looking for ways to calculate the TF/IDF of each term.
I wanted to use Lucene to do this, but Lucene is made for collections of
files, and in my case I have already extracted those files into vectors of
.
For the calculation of the idf, you can use the provided formula from
the DefaultSimilarity.
To get the document frequency, which is necessary to calculate the idf,
you can call:
reader.docFreq(term)
Hope this helps...
Thomas
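
For token vectors that already exist outside an index, the same calculation can be sketched in plain Java. The example below assumes the classic DefaultSimilarity formulas as commonly documented, tf = sqrt(freq) and idf = 1 + ln(numDocs / (docFreq + 1)); treat these as an assumption and check them against the Lucene version in use. The class and method names are hypothetical, and the document frequency is counted directly from the vectors rather than obtained from reader.docFreq(term).

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;

// Hypothetical helper: TF/IDF over in-memory token vectors, no index required.
class TfIdfSketch {

    // Document frequency: in how many token vectors each term occurs at least once.
    static Map<String, Integer> docFreq(List<List<String>> docs) {
        Map<String, Integer> df = new HashMap<>();
        for (List<String> doc : docs) {
            for (String term : new HashSet<>(doc)) { // count each term once per document
                df.merge(term, 1, Integer::sum);
            }
        }
        return df;
    }

    // Assumed DefaultSimilarity-style idf: 1 + ln(numDocs / (docFreq + 1)).
    static double idf(int docFreq, int numDocs) {
        return Math.log(numDocs / (double) (docFreq + 1)) + 1.0;
    }

    // Assumed DefaultSimilarity-style tf: square root of the raw frequency.
    static double tf(int freq) {
        return Math.sqrt(freq);
    }

    // TF/IDF weight of every term in one token vector.
    static Map<String, Double> tfIdf(List<String> doc, Map<String, Integer> df, int numDocs) {
        Map<String, Integer> counts = new HashMap<>();
        for (String term : doc) {
            counts.merge(term, 1, Integer::sum);
        }
        Map<String, Double> weights = new HashMap<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            int termDocFreq = df.getOrDefault(e.getKey(), 0);
            weights.put(e.getKey(), tf(e.getValue()) * idf(termDocFreq, numDocs));
        }
        return weights;
    }
}
```

A typical use is to call docFreq once over all vectors and then tfIdf for each vector, passing the total number of vectors as numDocs.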
Sengly Heng wrote:
Hello Luceners,
I have a collections of vector
fit-in this case.
Thanks once again everyone.
Best regards,
Sengly
On 3/28/07, karl wettin [EMAIL PROTECTED] wrote:
On 28 Mar 2007, at 10:36, Sengly Heng wrote:
Does any of you know a Java API that directly handles this
problem,
or do I have to implement it from scratch?
You can also try