least 5x compared to old code.
> Are there any thoughts on why term frequency calls on PostingsEnum are that
> slow?
>
>
>
> *Thanks and Regards,*
> *Vimal Jain*
>
>
> On Wed, Jun 21, 2023 at 1:43 PM Adrien Grand wrote:
>
> > As far as your performance problem i
rovide more details on what do you mean by dynamic
> > > pruning
> > > > in context of custom term query ?
> > > >
> > > > On Tue, 20 Jun, 2023, 9:45 pm Adrien Grand,
> wrote:
> > > >
> > > > > Intuitively replac
ng
> > > in context of custom term query ?
> > >
> > > On Tue, 20 Jun, 2023, 9:45 pm Adrien Grand, wrote:
> > >
> > > > Intuitively replacing a disjunction across multiple fields with a
> > single
> > > > term query should alw
n, 2023, 9:45 pm Adrien Grand, wrote:
> >
> > > Intuitively replacing a disjunction across multiple fields with a
> single
> > > term query should always be faster.
> > >
> > > You're saying that you're storing the type of token as part of the term
> >
> > term query should always be faster.
> >
> > You're saying that you're storing the type of token as part of the term
> > frequency. This doesn't sound like something that would play well with
> > dynamic pruning, so I wonder if this is the reason why you are seeing
> >
:
> Intuitively replacing a disjunction across multiple fields with a single
> term query should always be faster.
>
> You're saying that you're storing the type of token as part of the term
> frequency. This doesn't sound like something that would play well with
> dynamic pr
Intuitively replacing a disjunction across multiple fields with a single
term query should always be faster.
You're saying that you're storing the type of token as part of the term
frequency. This doesn't sound like something that would play well with
dynamic pruning, so I wonder
and instead of creating
multiple term queries , we create only 1 term query for the merged field
and the scorer of this term query ( on merged field ) makes use of custom
term frequency info to deduce type of token ( during indexing we store this
info ) and hence the score that we were using earlier.
So
> > Hi,
> > I want to understand if fetching the term frequency of a term during
> > scoring is relatively cpu bound operation ?
> > Context - I am storing custom term frequency during indexing and later
> > using it for scoring during query execution time ( in Scorer's sc
Note - i am using lucene 7.7.3
*Thanks and Regards,*
*Vimal Jain*
On Tue, Jun 20, 2023 at 12:26 PM Vimal Jain wrote:
> Hi,
> I want to understand if fetching the term frequency of a term during
> scoring is relatively cpu bound operation ?
> Context - I am storing custom term freq
Hi,
I want to understand if fetching the term frequency of a term during
scoring is relatively cpu bound operation ?
Context - I am storing custom term frequency during indexing and later
using it for scoring during query execution time ( in Scorer's score()
method ). I noticed a performance drop
nuary 2017 at 18:25, Ahmet Arslan <iori...@yahoo.com.invalid> wrote:
> Hi,
>
> I think you are missing the main query parameter? q=*:*
>
> By the way, you may get more responses on the solr-user mailing list.
>
> Ahmet
>
>
> On Wednesday, January 4, 2017 4:59 PM, hud
;
>
> On Wednesday, January 4, 2017 4:59 PM, huda barakat <
> eng.huda.bara...@gmail.com> wrote:
> Please help me with this:
>
>
> I have this code which returns term frequency from the techproducts example:
>
> //
Hi,
I think you are missing the main query parameter? q=*:*
By the way, you may get more responses on the solr-user mailing list.
Ahmet
On Wednesday, January 4, 2017 4:59 PM, huda barakat
<eng.huda.bara...@gmail.com> wrote:
Please help me with this:
I have this code which retur
Please help me with this:
I have this code which returns term frequency from the techproducts example:
/
import java.util.List;
import org.apache.solr.client.solrj.SolrClient;
import
onyms). TTF is also handled by the same class.
>
> Now, I want to handle the term frequency. As far as I can tell, raw TF is
> given to the similarity class by score(int doc, float freq). Which class
> does provide that freq? Or what can I change to provide a different freq
> value, practi
Hi,
I'm using Lucene 6.3.0, and trying to handle synonyms at query time.
I think I've handled DF correctly with BlendedTermQuery (by returning the
max DF of the synonyms). TTF is also handled by the same class.
Now, I want to handle the term frequency. As far as I can tell, raw TF is
given
uery = new SolrQuery();
query.setQuery("*:*");
SolrRequest req = new QueryRequest(query);
QueryResponse rsp = req.process(solr);
System.out.println("numFound: " +
rsp.getResults().getNumFound());
I get results but the problem I want to get term frequency in
the exception line does not match the code you pasted, but do make
sure your object is actually not null before calling its methods.
On Thu, Nov 24, 2016 at 5:42 PM, huda barakat
<eng.huda.bara...@gmail.com> wrote:
> I'm using SOLRJ to find term frequency for each term in a field
I'm using SOLRJ to find term frequency for each term in a field, I wrote
this code but it is not working:
String urlString = "http://localhost:8983/solr/huda";
SolrClient solr = new HttpSolrClient.Builder(urlString).build();

SolrQuery query
On Tue, Aug 19, 2014 at 7:04 AM, Bianca Pereira aivykar...@gmail.com
wrote:
Hi everybody,
I would like to know your suggestions to calculate Term Frequency
in a
Lucene document. Currently I am using MultiFields.getTermDocsEnum,
iterating through
Hi everybody,
I would like to know your suggestions to calculate Term Frequency in a
Lucene document. Currently I am using MultiFields.getTermDocsEnum,
iterating through the DocsEnum 'de' returned and getting the frequency with
de.freq() for the desired document.
My solution gives me
, 2014 at 7:04 AM, Bianca Pereira aivykar...@gmail.com wrote:
Hi everybody,
I would like to know your suggestions to calculate Term Frequency in a
Lucene document. Currently I am using MultiFields.getTermDocsEnum,
iterating through the DocsEnum 'de' returned and getting the frequency with
de.freq
to know your suggestions to calculate Term Frequency in a
Lucene document. Currently I am using MultiFields.getTermDocsEnum,
iterating through the DocsEnum 'de' returned and getting the frequency with
de.freq() for the desired document.
My solution gives me the result I want but I am having
...@gmail.com
wrote:
Hi everybody,
I would like to know your suggestions to calculate Term Frequency in a
Lucene document. Currently I am using MultiFields.getTermDocsEnum,
iterating through the DocsEnum 'de' returned and getting the frequency
Hi,
I am new to the list and I have been working on a problem for some time
already. I would like to know if someone has any idea of how I can solve it.
Given a term, I want to get the term frequency in a lucene document. When
I use the WhiteSpaceAnalyzer my code works properly but when I use
need to manually filter your query terms. Sounds like
maybe a term got stemmed.
-- Jack Krupansky
-Original Message-
From: Bianca Pereira
Sent: Thursday, August 7, 2014 7:28 AM
To: java-user@lucene.apache.org
Subject: EnglishAnalyzer vs WhiteSpaceAnalyzer in getting Term Frequency
Hi
Message- From: Bianca Pereira
Sent: Thursday, August 7, 2014 7:28 AM
To: java-user@lucene.apache.org
Subject: EnglishAnalyzer vs WhiteSpaceAnalyzer in getting Term Frequency
Hi,
I am new to the list and I have been working on a problem for some time
already. I would like to know
9:00 AM
To: java-user@lucene.apache.org
Subject: RE: EnglishAnalyzer vs WhiteSpaceAnalyzer in getting Term Frequency
Hi,
if you create the term yourself, it is not going through the analyzer:
public int getTermFrequency(String term, String id)
(you create a BytesRef out of it). So you have
the analyzer
yourself. The stemming is very likely the culprit here.
-- Jack Krupansky
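The point made in this thread (a term built by hand bypasses the analyzer, so it may not match the stemmed form stored in the index) can be illustrated with a small, Lucene-free sketch. Everything here is an assumption for illustration: `toyStem` is a hypothetical stand-in that only lowercases and strips one trailing "s", and it is not the algorithm used by EnglishAnalyzer.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

final class AnalyzedLookupSketch {
    // Hypothetical stand-in for a stemmer: lowercases and strips one trailing "s".
    static String toyStem(String token) {
        String lower = token.toLowerCase(Locale.ROOT);
        return lower.length() > 1 && lower.endsWith("s")
                ? lower.substring(0, lower.length() - 1)
                : lower;
    }

    // "Indexes" whitespace-separated text by counting the stemmed tokens.
    static Map<String, Integer> index(String text) {
        Map<String, Integer> freqs = new HashMap<>();
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                freqs.merge(toyStem(token), 1, Integer::sum);
            }
        }
        return freqs;
    }

    // Raw lookup: the term is used exactly as typed, bypassing the analysis step.
    static int rawFrequency(Map<String, Integer> index, String term) {
        return index.getOrDefault(term, 0);
    }

    // Analyzed lookup: the term goes through the same normalization as the text.
    static int analyzedFrequency(Map<String, Integer> index, String term) {
        return index.getOrDefault(toyStem(term), 0);
    }
}
```

With text such as "Dogs chase dogs", the raw lookup for "dogs" finds nothing because the stored term is "dog", while the analyzed lookup finds both occurrences.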
-Original Message- From: Uwe Schindler
Sent: Thursday, August 7, 2014 9:00 AM
To: java-user@lucene.apache.org
Subject: RE: EnglishAnalyzer vs WhiteSpaceAnalyzer in getting Term
Frequency
Hi
Hi all,
We have over 6 million documents in our index, and would like to construct a
term frequency matrix over all 6 million documents as quickly as possible.
Each document has a numeric date field, so we would like to build a time series
which contains values which are the sum of all
Is there a way to add a document to the index by supplying terms and
term frequencies directly, rather than via Analysis and/or TokenStream?
I ask because I want to model some data where I know the term
frequencies, but there is no underlying text document to be analyzed. I
could create one
I am a student and studying the functionality of Lucene for my project work.
If I have to add a new user-generated document in lucene with a term having
a particular frequency just like any text file, how do I do it?
For eg, say I have to add the following documents analyzed from an image
doc1 =
frequency counter
so that it uses my term frequencies. I think term frequency counts are
calculated during indexing, so I don't think I can just write my own
Similarity class?
This is correct, frequencies are computed at indexing time. I just
wanted to mention that you can influence scores based
, it generates counts for a
term by counting how many times the term appears in a particular
document.
Instead of having Lucene do the counting, I want to do my own counting
and
feed a term-frequency vector representation of a document directly into
the
indexer which will take my counts
On Tue, Apr 2, 2013 at 4:10 PM, Sharon W Tam s...@mit.edu wrote:
Are there any other ideas?
Since scoring seems to be what you are interested in, you could have a
look at payloads: they can store arbitrary data and can be used to
score matches.
--
Adrien
I believe that when Lucene indexes documents, it generates counts for a
term by counting how many times the term appears in a particular document.
Instead of having Lucene do the counting, I want to do my own counting and
feed a term-frequency vector representation of a document directly
and
feed a term-frequency vector representation of a document directly into the
indexer which will take my counts and proceed to do the other processing
such as generating inverse document frequency. My term-frequencies may not
all be integers. Is there a way to do this?
You could provide
Hi,
I have generated my own term-frequency vector representations of documents
and would like to be able to query these with term-frequency vector queries
instead of a text-string query. Is there any way to bypass the Lucene
preprocessing that occurs in the indexing of documents and queryparsing
Store the term value as payload, and score with it.
On Mon, Mar 4, 2013 at 10:10 AM, Sharon Tam sharon...@gmail.com wrote:
Hi,
I have generated my own term-frequency vector representations of documents
and would like to be able to query these with term-frequency vector queries
instead
:33 PM
To: java-user@lucene.apache.org
Subject: filter by term frequency
I imagine this is a question that comes up from time to time, but I
haven't been able to find a definitive answer anywhere, so...
I'm wondering whether there is some type of Lucene query that filters by
term frequency
I imagine this is a question that comes up from time to time, but I
haven't been able to find a definitive answer anywhere, so...
I'm wondering whether there is some type of Lucene query that filters by
term frequency. For example, suppose I want to find all documents that
have exactly 2
frequency
I imagine this is a question that comes up from time to time, but I
haven't been able to find a definitive answer anywhere, so...
I'm wondering whether there is some type of Lucene query that filters by
term frequency. For example, suppose I want to find all documents that
have exactly
I am currently using Lucene to index a dump of Wikipedia.
I'm using the demo's IndexFiles function for the most part, but I also
want to store the term frequency of a document in the index as well, is
this possible?
Right now, the index just stores the (term - document pathname)
mappings
lucene keeps track of the term frequency etc. why would you want to do
this at search time?
simon
On Mon, Oct 24, 2011 at 1:05 PM, Simon Willnauer
simon.willna...@googlemail.com wrote:
so you are saying you got (uniqueTerm, freq) tuples and you want to
make lucene use this directly? I think
that at search time.
hu? I don't understand, if you provide the terms at indexing time
lucene keeps track of the term frequency etc. why would you want to do
this at search time?
During search time I get the following input ( only for 1 field ) =
solr:3 rocks:2 apache:1 . For this I have to create
Use term boosts? solr^3 rocks^2 apache
http://lucene.apache.org/java/3_4_0/queryparsersyntax.html#Boosting%20a%20Term
Am 25.10.2011 11:19, schrieb prasenjit mukherjee:
During search time I get the following input ( only for 1 field ) =
solr:3 rocks:2 apache:1 . For this I have to create the
Thanks, this is helpful. Is the effect (in ranking) going to be the
same as passing multiple terms? I will definitely try it out.
On Tue, Oct 25, 2011 at 3:21 PM, Rene Hackl-Sommer rene.a.ha...@gmx.de wrote:
Use term boosts? solr^3 rocks^2 apache
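As a rough illustration of the suggestion above, the sketch below turns the "term:weight" input from the question into the `term^weight` boost syntax described on the linked query parser page. The class and method names are hypothetical, the input format is assumed to be whitespace-separated `term:integer` pairs, and no escaping of special query characters is attempted. Whether boosts rank the same way as repeated terms is the open question in the thread and is not settled by this sketch.

```java
final class BoostQuerySketch {
    // Converts "solr:3 rocks:2 apache:1" into "solr^3 rocks^2 apache".
    // A weight of 1 is left implicit, matching the example in the thread.
    static String toBoostedQuery(String tuples) {
        StringBuilder query = new StringBuilder();
        for (String tuple : tuples.trim().split("\\s+")) {
            if (tuple.isEmpty()) {
                continue;
            }
            int colon = tuple.lastIndexOf(':');
            String term = colon < 0 ? tuple : tuple.substring(0, colon);
            int weight = colon < 0 ? 1 : Integer.parseInt(tuple.substring(colon + 1));
            if (query.length() > 0) {
                query.append(' ');
            }
            query.append(term);
            if (weight != 1) {
                query.append('^').append(weight);
            }
        }
        return query.toString();
    }
}
```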
so you are saying you got (uniqueTerm, freq) tuples and you want to
make lucene use this directly? I think the easiest way is to write a
simple tokenFilter that emits the term X times where X is the term
frequency. There is no easy way to pass these tuples to lucene
directly.
simon
On Mon, Oct 24
to
make lucene use this directly? I think the easiest way is to write a
simple tokenFilter that emits the term X times where X is the term
frequency. There is no easy way to pass these tuples to lucene
directly.
simon
On Mon, Oct 24, 2011 at 3:28 AM, prasenjit mukherjee
prasen@gmail.com
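The idea of emitting each term X times can be sketched without any Lucene classes. This is only an illustration of the expansion step, not an actual TokenFilter implementation; the class and method names are hypothetical, and the resulting text is assumed to be fed to a whitespace-style tokenizer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class RepeatTokensSketch {
    // Emits each term as many times as its frequency, in the map's iteration order.
    static List<String> expand(Map<String, Integer> termFreqs) {
        List<String> tokens = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : termFreqs.entrySet()) {
            for (int i = 0; i < entry.getValue(); i++) {
                tokens.add(entry.getKey());
            }
        }
        return tokens;
    }

    // Joins the expanded tokens into text that a whitespace tokenizer could re-split.
    static String toText(Map<String, Integer> termFreqs) {
        return String.join(" ", expand(termFreqs));
    }
}
```

Note that this only works for integer frequencies, and that the expanded text grows linearly with the total count.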
I already have the term-frequency-count for all the terms in a
document. Is there a way I can re-use that info while indexing. I
would like to use solr for this.
-
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
Of course, it can be reused.
But from my point of view, it's meaningless, since the analysis process has
to be performed anyway to collect things such as proximity (position) data, offsets, synonyms, payloads and so on.
On Sun, Oct 23, 2011 at 11:22 PM, prasenjit mukherjee
prasen@gmail.com wrote:
I already have the term-frequency
Can you tell me how I can feed the lucene index by using the term
frequency directly ?
Actually I am getting the documents along with their term-frequency
and don't want to write any additional code to expand them.
On 10/23/11, ppp c peter.c.e...@gmail.com wrote:
Of course, it can be reused
, response: 2, word: bike}
etc.
I would like to get the word which is the most used for question 1.
I learned something about term frequency, but all the code samples I
found on the internet deal with the entire index (with indexReader.terms).
Any idea ?
Thank you
before they are stored, but i guess there could be some way to work it
around???
All help appreciated!
Thank you!
--
View this message in context:
http://lucene.472066.n3.nabble.com/Applying-term-frequency-thresholds-on-indexing-time-tp839449p839449.html
Sent from the Lucene - Java Users
Hi .
I have phrases like brain natriuretic peptide indexed as a single token
using Lucene.
When I calculate the term frequency for the same the count is 0 since the
tokens from the text are indexed separately i.e. brain , natriuretic ,
peptide.
Is there a way to solve this problem and get
indexed as a single token
using Lucene.
When I calculate the term frequency for the same the count is 0 since the
tokens from the text are indexed separately i.e. brain , natriuretic ,
peptide.
Is there a way to solve this problem and get the term frequency for the
entire phrase ?
Regards
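To make the question concrete: when the text is indexed as separate tokens, the frequency of a phrase has to be derived from consecutive token positions rather than from a single term. The sketch below counts such occurrences over a plain token list. It is an assumption-laden illustration (whitespace tokenization, lowercasing, hypothetical class and method names), not how Lucene evaluates phrase queries internally.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

final class PhraseFrequencySketch {
    // Lowercases and splits on whitespace; returns an empty list for blank text.
    static List<String> tokenize(String text) {
        String normalized = text.trim().toLowerCase(Locale.ROOT);
        if (normalized.isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(normalized.split("\\s+"));
    }

    // Counts positions where the phrase tokens appear consecutively in the document.
    // Overlapping occurrences are each counted.
    static int phraseFrequency(List<String> docTokens, List<String> phraseTokens) {
        int n = phraseTokens.size();
        if (n == 0 || n > docTokens.size()) {
            return 0;
        }
        int count = 0;
        for (int start = 0; start + n <= docTokens.size(); start++) {
            if (docTokens.subList(start, start + n).equals(phraseTokens)) {
                count++;
            }
        }
        return count;
    }
}
```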
On a quick read, your statements are contradictory
I have phrases like brain natriuretic peptide indexed as a single
token
When I calculate the term frequency for the same the count is 0 since
the
tokens from the text are indexed separately i.e. brain , natriuretic ,
peptide.
Either brain
When do you detect that they are phrases? During indexing or during search?
On Jan 8, 2010, at 5:16 AM, hrishim wrote:
Hi .
I have phrases like brain natriuretic peptide indexed as a single token
using Lucene.
When I calculate the term frequency for the same the count is 0 since
, hrishim wrote:
Hi .
I have phrases like brain natriuretic peptide indexed as a single token
using Lucene.
When I calculate the term frequency for the same the count is 0 since
the
tokens from the text are indexed separately i.e. brain , natriuretic ,
peptide.
Is there a way to solve
that they are phrases? During indexing or during
search?
On Jan 8, 2010, at 5:16 AM, hrishim wrote:
Hi .
I have phrases like brain natriuretic peptide indexed as a single token
using Lucene.
When I calculate the term frequency for the same the count is 0 since
the
tokens from the text are indexed
the term frequency for the same the count is 0 since
the
tokens from the text are indexed separately i.e. brain , natriuretic ,
peptide.
Is there a way to solve this problem and get the term frequency for the
entire phrase ?
Regards,
Hrishi
--
View this message in context
than using a HashMap with a TermVectorMapper to store the
counts of the terms and calling getTermFreqVector().
I do not require the term frequency within a document.
I think that is as fast as it's going to get unless you have some
other restrictions that would allow you to use a FieldCache
a TermVectorMapper.
I was wondering if anyone knew if there was a faster way to do this rather
than using a HashMap with a TermVectorMapper to store the counts of the
terms and calling getTermFreqVector().
I do not require the term frequency within a document.
I think that is as fast as it's
to store the
counts of the terms and calling getTermFreqVector().
I do not require the term frequency within a document.
I think that is as fast as it's going to get unless you have some other
restrictions that would allow you to use a FieldCache. Can you
describe the bigger problem you
getTermFreqVector().
I do not require the term frequency within a document.
Thanks,
Thomas
HashMap termDocCount = new HashMap();
TermQuery tagQuery = new TermQuery(tagTerm);
TopDocs docs = searcher.search(tagQuery, numDocs);
for (int i = 0; i < docs.scoreDocs.length; ++i) {
ScoreDoc sdoc
...@apache.org
To: java-user@lucene.apache.org
Sent: Tuesday, June 30, 2009 9:48 PM
Subject: Re: Term Frequency vector consumes memory
In Lucene, a Term Vector is a specific thing that is stored on disk
when creating a Document and Field. It is optional and off by
default. It is separate from being
At the end of the day, I used to build the stats of top indexed terms. I
enabled term frequency for the single field. It is working fine. I am able
to get the top terms and their frequencies. It consumes a huge amount of RAM. My
index size is 5 GB and has 8 million records. If I didn't enable
not clear on your question.
Cheers,
Grant
On Jun 30, 2009, at 3:37 AM, Ganesh wrote:
At the end of the day, I used to build the stats of top indexed
terms. I enabled term frequency for the single field. It is working
fine. I am able to get the top terms and their frequencies. It
consumes
to load term vector. I want to switch off
this feature? Is that possible without re-indexing?
Regards
Ganesh
- Original Message -
From: Grant Ingersoll gsing...@apache.org
To: java-user@lucene.apache.org
Sent: Tuesday, June 30, 2009 9:48 PM
Subject: Re: Term Frequency vector consumes memory
: The easiest way to change the tf calculation would be overwriting
: tf in an own implementation of Similarity like it's done in
: SweetSpotSimilarity. But the average term frequency of the
: document is missing. Is there a simple way to get or calc this
: number?
there was quite a bit
Hi,
I'd like to use the term frequency normalization described in
http://wiki.apache.org/lucene-java/TREC%202007%20Million%20Queries%20Track%20-%20IBM%20Haifa%20Team
so that the term frequency tf becomes
tf(t, d) = log(1 + freq(t, d)) / log(1 + avgFreq(d))
The easiest way to change the tf
Hi,
I'd like to use the term frequency normalization described in
http://wiki.apache.org/lucene-java/TREC%202007%20Million%20Queries%20Track%20-%20IBM%20Haifa%20Team
so that the term frequency tf becomes
tf(t, d) = log(1 + freq(t, d)) / log(1 + avgFreq(d))
The easiest way to change the tf
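The normalization quoted above can be worked through numerically with a few lines of plain Java. This is a sketch only: it assumes avgFreq(d) means the mean frequency over the distinct terms of the document, and the class and method names are hypothetical rather than part of any Similarity API.

```java
import java.util.Map;

final class AvgTfNormalizationSketch {
    // Assumed definition: total term occurrences divided by the number of distinct terms.
    static double avgFreq(Map<String, Integer> termFreqs) {
        if (termFreqs.isEmpty()) {
            return 0.0;
        }
        long total = 0;
        for (int freq : termFreqs.values()) {
            total += freq;
        }
        return (double) total / termFreqs.size();
    }

    // tf = log(1 + freq) / log(1 + avgFreq); returns 0 when avgFreq is not positive,
    // which avoids dividing by log(1) = 0.
    static double tf(int freq, double avgFreq) {
        if (avgFreq <= 0.0) {
            return 0.0;
        }
        return Math.log(1.0 + freq) / Math.log(1.0 + avgFreq);
    }
}
```

Under this definition a term whose frequency equals the document average gets tf = 1, terms above the average get more than 1, and terms below it get less.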
: References:
: offfa5f4d3.751e9148-on8525753f.003e1216-8525753f.003e6...@us.ibm.com
: 1998.130.159.185.12.1232021837.squir...@webmail.cis.strath.ac.uk
: Date: Thu, 15 Jan 2009 04:49:49 -0800 (PST)
: Subject: Term Frequency and IndexSearcher
http://people.apache.org/~hossman
Hi,
I know it is very easy to get the frequency of a given term using the
indexReader but I am looking to perform an index search and would like to get
the frequency of the given term in the result set. Is this possible?
Thanks in advance,
Paul
Hi Paul,
I am tempted to suggest the following ( I am assuming here that the
document and the particular fields are TFVed when indexing):
For every doc in the result set:
- get the doc id
- using the doc id, get the TermFreqVector of this document from the
index reader
I have a quick question, could someone point me towards where in the API
I'll have to investigate in order to figure out the term frequencies of
more complex terms?
For example I want to know the tf of kit ligand treated as a phrase.
I see that luke has access to this information in its
docs by clicking on Index at
the top of the docs. They're all there.
-Original Message-
From: Matthew Hall [mailto:[EMAIL PROTECTED]
Sent: Thursday, July 03, 2008 10:20 AM
To: lucene
Subject: Term Frequency for more complex terms
I have a quick question, could someone point me towards
On 5/25/07, Walt Stoneburner [EMAIL PROTECTED] wrote:
In reading the math for scoring at the bottom of:
http://lucene.zones.apache.org:8080/hudson/job/Lucene-Nightly/javadoc/org/apache/lucene/search/Similarity.html
It appears that if I can make tf() and idf(), term frequency and
inverse
Hi,
I'm trying to figure what I need to do with Lucene to score a
document higher when it has a larger number of unique search terms
that are hit, rather than term frequency counts.
A quick example.
If I'm searching for BIRD CAT DOG (all should clauses), then I want
...a document
of unique search terms
that are hit, rather than term frequency counts.
A quick example.
If I'm searching for BIRD CAT DOG (all should clauses), then I want
...a document with BIRD, CAT, and DOG terms, each only appearing
once, in it to score higher than
...a document with BIRD, CAT, CAT
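The ranking wish in this question can be stated as a tiny function: score a document by how many distinct query terms it contains, ignoring how often each one occurs. The sketch below is only a statement of the desired behaviour in plain Java, with hypothetical names; it is not a Lucene Similarity and says nothing about how to obtain this inside Lucene's scoring.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class DistinctTermMatchSketch {
    // Counts how many distinct query terms appear at least once in the document.
    // Repeated occurrences of the same term do not increase the result.
    static int distinctMatches(Collection<String> queryTerms, List<String> docTokens) {
        Set<String> matched = new HashSet<>(queryTerms);
        matched.retainAll(new HashSet<>(docTokens));
        return matched.size();
    }
}
```

For the query BIRD CAT DOG, a document containing each term once yields 3, while an illustrative document with BIRD once and CAT several times yields 2, which is the ordering the question asks for.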
values. We are forming a RangeQuery for time and normal query for other
field values.
Now I am able to find Term Frequency per index i.e for the whole 24
hours. But I want to find the Term Frequency for 1 hour i.e between
01:00:00 to 02:00:00. Will it be possible? Is there any API to find Term
29 apr 2007 kl. 18.33 skrev saikrishna venkata pendyala:
Where does the lucene compute term frequency vector ?
{filename,function
name}
DocumentWriter.java
private final void invertDocument(Document doc)
Actually, the task is to replace all the term frequencies with some
constant number
Hi,
Where does Lucene compute the term frequency vector? {filename, function
name}
Actually, the task is to replace all the term frequencies with some
constant number (integer). How can this be done?
Any kind of help is appreciated.
Thanks in advance.
Hi,
How can I get the term frequency of multiple terms in a particular document? Is
there any API method other than TermVector that may help?
Also, how can I calculate the term frequency over a time range? I.e., if my index has a
field TIME with values in millis (like 1176281188000), and I want to
calculate term freq
Hi,
Thanks for replying. In my scenario I'm not going to index any of my docs.
So is there a way to find out term frequencies of the terms in a doc
without doing the indexing part?
Thanx in advance,
Hari
On 4/12/07, Grant Ingersoll [EMAIL PROTECTED] wrote:
Add Term Vectors to your Field during
12 apr 2007 kl. 09.12 skrev sai hariharan:
Thanx for replying. In my scenario i'm not going to index any of my
docs.
So is there a way to find out term frequencies of the terms in a doc
without doing the indexing part?
Using an analyzer (TokenStream) and a Map<String, Integer>?
while ((t =
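The loop in this reply is cut off in the archive, so the following is not a reconstruction of it. It is a minimal sketch of the same idea (count tokens into a map without building an index), assuming a simple lowercase, split-on-non-alphanumerics tokenization in place of a real Lucene analyzer. The class and method names are hypothetical.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

final class TermFrequencyCounterSketch {
    // Counts term frequencies for one document's text, with no index involved.
    // Tokenization here is a stand-in: lowercase, then split on runs of
    // characters that are neither letters nor digits.
    static Map<String, Integer> count(String text) {
        Map<String, Integer> freqs = new TreeMap<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                freqs.merge(token, 1, Integer::sum);
            }
        }
        return freqs;
    }
}
```

A real analyzer would also apply stop words, stemming, and so on, so counts from this sketch will generally differ from what an analyzed index would report.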
karl wettin [EMAIL PROTECTED] wrote on 12/04/2007 00:25:47:
12 apr 2007 kl. 09.12 skrev sai hariharan:
Thanx for replying. In my scenario i'm not going to index any of my
docs.
So is there a way to find out term frequencies of the terms in a doc
without doing the indexing part?
Using
11 apr 2007 kl. 04.21 skrev Grant Ingersoll:
Would some sort of caching strategy work? How big is your overall
collection?
Also, lately there have been a few threads on TV (term vector)
performance. I don't recall anyone having actively profiled or
examined it for improvements, so
On Apr 11, 2007, at 9:07 AM, karl wettin wrote:
11 apr 2007 kl. 04.21 skrev Grant Ingersoll:
Would some sort of caching strategy work? How big is your overall
collection?
Also, lately there have been a few threads on TV (term vector)
performance. I don't recall anyone having actively
Hi,
I've just started using Lucene. Can anybody assist me in calculating
the term frequencies of the terms (words) that occur in a document (*.txt)
when a particular doc is submitted?
Say when I submit sample.txt, I should first analyze the document
with a standard analyzer, then the term
Add Term Vectors to your Field during indexing. See the Field
constructors. To get a Term Vector out, see
IndexReader.getTermFreqVector method.
-Grant
On Apr 11, 2007, at 3:23 PM, sai hariharan wrote:
Hi,
I've just started using Lucene. Can anybody assist me in calculating
the term
Hello all,
I would like to extract the term freq vector from the hit results as a total
vector not by document.
I have searched the mailing list and I found many have talked about this issue
but I still could not find the right solution to this matter. Everyone just
suggested to look at
.
Here is an example:
for (int i = 0; i < 10; i++) {
int docNumber = hits.id(i);
TermFreqVector[] termsV =
ir.getTermFreqVectors(docNumber); //return an array of term frequency
vectors for the specified document.
for (int xy = 0; xy
the document vector space
model is not available in any other fashion than the term frequency
vectors, or building them from scratch by enumerating the whole
index. The latter of course being horribly slow in most cases.
--
karl
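Once per-document term frequency vectors have been read for each hit, combining them into one total vector is a plain map merge. The sketch below assumes each document's vector has already been copied into a `Map<String, Integer>`; how that map is obtained from the index is outside this example, and the class and method names are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class TotalTermVectorSketch {
    // Sums per-document term frequencies into one total vector over all hits.
    static Map<String, Integer> sum(List<Map<String, Integer>> perDocument) {
        Map<String, Integer> total = new TreeMap<>();
        for (Map<String, Integer> doc : perDocument) {
            for (Map.Entry<String, Integer> entry : doc.entrySet()) {
                total.merge(entry.getKey(), entry.getValue(), Integer::sum);
            }
        }
        return total;
    }
}
```

The cost is proportional to the total number of (term, document) pairs across the hits, which is why the thread's concern about reading many term vectors still applies.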
the hits object you can iterate over the
first results.
Here is an example:
for (int i = 0; i < 10; i++) {
int docNumber = hits.id(i);
TermFreqVector[] termsV =
ir.getTermFreqVectors(docNumber); //return an array of term frequency
vectors for the specified
Dear Karl,
Thank you for taking your time in my problem.
We don't really know what your problem is. Explaining that rather
than the solution you have thought of might render a couple of
alternate solutions. Perhaps something could be precalculated and
stored in the documents. Perhaps
10 apr 2007 kl. 17.48 skrev Sengly Heng:
We don't really know what your problem is. Explaining that rather
than the solution you have thought of might render a couple of
alternate solutions. Perhaps something could be precalculated and
stored in the documents. Perhaps feature selection