[...] 5x compared to old code.
> Are there any thoughts on why term frequency calls on PostingsEnum are that
> slow?
>
> *Thanks and Regards,*
> *Vimal Jain*
>
> On Wed, Jun 21, 2023 at 1:43 PM Adrien Grand wrote:
>
> > As far as your performance problem i[...]
> > > Can you please provide more details on what you mean by dynamic pruning
> > > in the context of a custom term query?
> > >
> > > On Tue, 20 Jun, 2023, 9:45 pm Adrien Grand, wrote:
Intuitively replacing a disjunction across multiple fields with a single
term query should always be faster.
You're saying that you're storing the type of token as part of the term
frequency. This doesn't sound like something that would play well with
dynamic pruning, so I wonder if this is the reason why you [...]
[...] instead of creating
multiple term queries, we create only one term query for the merged field,
and the scorer of this term query (on the merged field) makes use of the
custom term frequency info to deduce the type of token (we store this info
during indexing) and hence compute the score that we were using earlier.
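A sketch of how such a type-in-frequency encoding might work. The bit layout and names here are hypothetical, not the actual scheme from the thread; the point is only that a custom term frequency can carry both a count and a small token-type id:

```java
// Hypothetical encoding: the low 2 bits carry a token-type id, the remaining
// bits carry the real frequency. Lucene is not involved here; this only
// illustrates the packing a custom TokenStream could write as a term frequency.
public class PackedTermFreq {
    static final int TYPE_BITS = 2;

    static int pack(int freq, int tokenType) {
        return (freq << TYPE_BITS) | tokenType;
    }

    static int freq(int packed)      { return packed >>> TYPE_BITS; }
    static int tokenType(int packed) { return packed & ((1 << TYPE_BITS) - 1); }

    public static void main(String[] args) {
        int packed = pack(7, 3);               // frequency 7, token type 3
        System.out.println(freq(packed));      // 7
        System.out.println(tokenType(packed)); // 3
    }
}
```

The scorer would then unpack both pieces from the single stored frequency at score time.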
Hi,
Note - I am using Lucene 7.7.3.
*Thanks and Regards,*
*Vimal Jain*
On Tue, Jun 20, 2023 at 12:26 PM Vimal Jain wrote:
Hi,
I want to understand if fetching the term frequency of a term during
scoring is relatively cpu bound operation ?
Context - I am storing custom term frequency during indexing and later
using it for scoring during query execution time ( in Scorer's score()
method). I noticed a performance drop of about 5x compared to the old code.
[...] Ahmet Arslan wrote:
> Hi,
>
> I think you are missing the main query parameter? q=*:*
>
> By the way, you may get more responses on the solr-user mailing list.
>
> Ahmet
>
>
> On Wednesday, January 4, 2017 4:59 PM, huda barakat <
> eng.huda.bara...@gmail.com> wrote:
> Please help me with this:
>
> I have this code which returns term frequency from the techproducts example:
import java.util.List;
import org.apache.solr.client.solrj.SolrClient;
import [...]
Hi,
I'm using Lucene 6.3.0, and trying to handle synonyms at query time.
I think I've handled DF correctly with BlendedTermQuery (by returning the
max DF of the synonyms). TTF is also handled by the same class.
Now, I want to handle the term frequency. As far as I can tell, raw TF is
given to the similarity class by score(int doc, float freq). Which class
does provide that freq? Or what can I change to provide a different freq
value, practically changing the document re[...]
SolrQuery query = new SolrQuery();
query.setQuery("*:*");
SolrRequest req = new QueryRequest(query);
QueryResponse rsp = req.process(solr);
System.out.println("numFound: " + rsp.getResults().getNumFound());
I get results, but the problem is I want to get term frequencies [...]
The exception line does not match the code you pasted, but do make
sure your object is actually not null before accessing its methods.
On Thu, Nov 24, 2016 at 5:42 PM, huda barakat
wrote:
I'm using SOLRJ to find term frequency for each term in a field, I wrote
this code but it is not working:
1. String urlString = "http://localhost:8983/solr/huda";
2. SolrClient solr = new HttpSolrClient.Builder(urlString).build();
3.
4. SolrQuery query = new SolrQuery();
[...] TermFreqValueSource...
>
> Maybe not helpful at all, but...
> Erick
>
> On Tue, Aug 19, 2014 at 7:04 AM, Bianca Pereira wrote:
Erick
On Tue, Aug 19, 2014 at 7:04 AM, Bianca Pereira wrote:
Hi everybody,
I would like to know your suggestions to calculate Term Frequency in a
Lucene document. Currently I am using MultiFields.getTermDocsEnum,
iterating through the DocsEnum 'de' returned and getting the frequency with
de.freq() for the desired document.
My solution gives me the result I want [...]
[...] the analyzer
> yourself. The stemming is very likely the culprit here.
>
> -- Jack Krupansky
>
> -Original Message- From: Uwe Schindler
> Sent: Thursday, August 7, 2014 9:00 AM
> To: java-user@lucene.apache.org
> Subject: RE: EnglishAnalyzer vs WhiteSpaceAnalyzer in getting Term Frequency
Sent: Thursday, August 7, 2014 9:00 AM
To: java-user@lucene.apache.org
Subject: RE: EnglishAnalyzer vs WhiteSpaceAnalyzer in getting Term Frequency
Hi,
if you create the term yourself, it is not going through the analyzer:
public int getTermFrequency(String term, String id)
(you create a BytesRef out of it).
[...] without also stemming the term before you look it up.
StandardAnalyzer does not do stemming, so terms (mostly) stay as they are. But
also for this analyzer, you theoretically have to pass the term through the
analyzer before you can do a term frequency lookup. Just think of the case
where the term was not lowercased [...]
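Uwe's point, that a lookup term must go through the same analysis chain as indexing did, can be sketched without Lucene. The normalize function here (lowercasing only) stands in for whatever the index-time analyzer actually did:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: the index normalized terms at write time (here: lowercasing),
// so a raw query term must be normalized the same way before lookup.
public class NormalizedLookup {
    static String normalize(String term) {
        return term.toLowerCase(); // stands in for the full analyzer chain
    }

    public static void main(String[] args) {
        Map<String, Integer> termFreq = new HashMap<>();
        termFreq.put(normalize("Peptide"), 4); // index side stores "peptide"

        System.out.println(termFreq.getOrDefault("Peptide", 0));            // 0: raw term misses
        System.out.println(termFreq.getOrDefault(normalize("Peptide"), 0)); // 4: normalized term hits
    }
}
```

With a stemming analyzer the mismatch is the same, only harder to see, which is why the unanalyzed lookup returns a frequency of 0.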
[...] you need to manually filter your query terms. Sounds like
maybe a term got stemmed.
-- Jack Krupansky
-Original Message-
From: Bianca Pereira
Sent: Thursday, August 7, 2014 7:28 AM
To: java-user@lucene.apache.org
Subject: EnglishAnalyzer vs WhiteSpaceAnalyzer in getting Term Frequency
Hi,
I am new in the list and I have been working on a problem for some time
already. I would like to know if someone has any idea of how I can solve it.
Given a term, I want to get the term frequency in a lucene document. When
I use the WhiteSpaceAnalyzer my code works properly, but when I use the
EnglishAnalyzer [...]
Hi all,
We have over 6 million documents in our index, and would like to construct a
term frequency matrix over all 6 million documents as quickly as possible.
Each document has a numeric date field, so we would like to build a time
series whose values are the sum of all freq[...]
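Once per-document (date, frequency) pairs are extracted from the index, the time series itself is just a grouped sum. A Lucene-free sketch of that bucketing step, with made-up data:

```java
import java.util.Map;
import java.util.TreeMap;

// Sketch: sum per-document term frequencies into date buckets to form a
// time series. Dates are pre-truncated to the day here for simplicity.
public class FreqTimeSeries {
    static Map<String, Integer> buildSeries(String[] dates, int[] freqs) {
        Map<String, Integer> series = new TreeMap<>(); // sorted by date
        for (int i = 0; i < dates.length; i++) {
            series.merge(dates[i], freqs[i], Integer::sum);
        }
        return series;
    }

    public static void main(String[] args) {
        String[] dates = {"2023-06-20", "2023-06-20", "2023-06-21"};
        int[] freqs = {3, 2, 5};
        System.out.println(buildSeries(dates, freqs)); // {2023-06-20=5, 2023-06-21=5}
    }
}
```

For 6 million documents the expensive part is reading the per-document frequencies, not this aggregation, so iterating postings once per term and bucketing as above is a reasonable shape for the job.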
Is there a way to add a document to the index by supplying terms and
term frequencies directly, rather than via Analysis and/or TokenStream?
I ask because I want to model some data where I know the term
frequencies, but there is no underlying text document to be analyzed. I
could create one, but [...]
I am a student and studying the functionality of Lucene for my project work.
If I have to add a new user-generated document in lucene with a term having
a particular frequency just like any text file, how do I do it?
For eg, say I have to add the following documents analyzed from an image
doc1 =
Hi,
On Tue, Apr 9, 2013 at 5:24 PM, Sharon Tam wrote:
> I tried following this payloads tutorial to attach the term
> frequencies as payloads:
> http://searchhub.org/2009/08/05/getting-started-with-payloads/
>
> But I'm confused as to where I need to override the te[...]
On Tue, Apr 2, 2013 at 4:10 PM, Sharon W Tam wrote:
> Are there any other ideas?
Since scoring seems to be what you are interested in, you could have a
look at payloads: they can store arbitrary data and can be used to
score matches.
--
Adrien
Hi,
On Thu, Mar 28, 2013 at 8:25 PM, Sharon Tam wrote:
I believe that when Lucene indexes documents, it generates counts for a
term by counting how many times the term appears in a particular document.
Instead of having Lucene do the counting, I want to do my own counting and
feed a term-frequency vector representation of a document directly into the
index [...]
Store the term value as payload, and score with it.
On Mon, Mar 4, 2013 at 10:10 AM, Sharon Tam wrote:
Hi,
I have generated my own term-frequency vector representations of documents
and would like to be able to query these with term-frequency vector queries
instead of a text-string query. Is there any way to bypass the Lucene
preprocessing that occurs in the indexing of documents and queryparsing [...]
-Original Message- From: Mike Sokolov
Sent: Saturday, June 16, 2012 2:33 PM
To: java-user@lucene.apache.org
Subject: filter by term frequency
I imagine this is a question that comes up from time to time, but I
haven't been able to find a definitive answer anywhere, so...
I'm wondering whether there is some type of Lucene query that filters by
term frequency. For example, suppose I want to find all documents that
have [...]
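No such built-in query type is identified in the thread. As a Lucene-free sketch, a term-frequency filter is just a predicate over per-document counts; the data here is made up:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Sketch: keep only documents in which a term occurs at least minFreq times.
public class TermFreqFilter {
    static List<Integer> filterByFreq(Map<Integer, Integer> docFreqs, int minFreq) {
        List<Integer> hits = new ArrayList<>();
        for (Map.Entry<Integer, Integer> e : docFreqs.entrySet()) {
            if (e.getValue() >= minFreq) hits.add(e.getKey());
        }
        Collections.sort(hits); // stable output; Map.of has no iteration order
        return hits;
    }

    public static void main(String[] args) {
        // docId -> frequency of the term in that doc
        Map<Integer, Integer> freqs = Map.of(1, 5, 2, 1, 3, 8);
        System.out.println(filterByFreq(freqs, 3)); // [1, 3]
    }
}
```

In Lucene terms, the per-document frequency would come from iterating a term's postings; the thresholding itself is the trivial part shown here.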
I am currently using Lucene to index a dump of Wikipedia.
I'm using the demo's IndexFiles function for the most part, but I also
want to store the term frequency of a document in the index as well, is
this possible?
Right now, the index just stores the (term -> document pathname) [...]
Thanks, this is helpful. Is the effect (in ranking) going to be the
same as passing multiple terms? I will definitely try it out.
On Tue, Oct 25, 2011 at 3:21 PM, Rene Hackl-Sommer wrote:
> Use term boosts? "solr^3 rocks^2 apache"
>
> http://lucene.apache.org/java/3_4_0/queryparsersyntax.html#Boosting%20a%20Term
Use term boosts? "solr^3 rocks^2 apache"
http://lucene.apache.org/java/3_4_0/queryparsersyntax.html#Boosting%20a%20Term
On 25.10.2011 11:19, prasenjit mukherjee wrote:
During search time I get the following input (only for 1 field):
"solr:3 rocks:2 apache:1". For this I have to create the [...]
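Turning that input into per-term boosts, as Rene suggests, is mostly string handling. A sketch that reuses each weight as the boost value:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: parse "term:weight" pairs into a map, then assemble a boosted
// query string like "solr^3 rocks^2 apache^1" for the query parser.
public class TermWeightParser {
    static Map<String, Integer> parse(String input) {
        Map<String, Integer> weights = new LinkedHashMap<>(); // keep input order
        for (String pair : input.trim().split("\\s+")) {
            String[] kv = pair.split(":");
            weights.put(kv[0], Integer.parseInt(kv[1]));
        }
        return weights;
    }

    public static void main(String[] args) {
        Map<String, Integer> w = parse("solr:3 rocks:2 apache:1");
        StringBuilder boosted = new StringBuilder();
        w.forEach((t, b) -> boosted.append(t).append('^').append(b).append(' '));
        System.out.println(boosted.toString().trim()); // solr^3 rocks^2 apache^1
    }
}
```

Whether a boost of 3 ranks identically to repeating the term three times depends on the similarity's tf curve, so the two are close but not generally equivalent.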
[...] at search time.
>
> hu? I don't understand. If you provide the terms at indexing time,
> lucene keeps track of the term frequency etc. Why would you want to do
> this at search time?
So you are saying you got (uniqueTerm, freq) tuples and you want to
make lucene use this directly? I think the easiest way is to write a
simple tokenFilter that emits the term X times, where X is the term
frequency. There is no easy way to pass these tuples to lucene
directly.
simon
On Mon, Oct 24, 2011 at 3:28 AM, prasenjit mukherjee wrote:
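Simon's suggestion, emit each term X times so the indexer's own counting reproduces the supplied frequency, can be sketched without Lucene's TokenFilter machinery; the expansion itself is just this:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch: expand (term, freq) tuples into a token sequence in which each
// term appears freq times; feeding these tokens to an indexer makes its
// counted term frequencies match the supplied ones.
public class FreqExpander {
    static List<String> expand(Map<String, Integer> termFreqs) {
        List<String> tokens = new ArrayList<>();
        termFreqs.forEach((term, freq) -> {
            for (int i = 0; i < freq; i++) tokens.add(term);
        });
        return tokens;
    }

    public static void main(String[] args) {
        Map<String, Integer> tf = new LinkedHashMap<>();
        tf.put("solr", 3);
        tf.put("rocks", 2);
        System.out.println(expand(tf)); // [solr, solr, solr, rocks, rocks]
    }
}
```

A real TokenFilter would do the same thing incrementally instead of materializing the list, but the frequencies that end up in the index are identical.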
Can you tell me how I can feed the lucene index by using the term
frequency directly ?
Actually I am getting the documents along with their term-frequency
and don't want to write any additional code to expand them.
On 10/23/11, ppp c wrote:
Of course, it can be reused.
But from my point of view, it's meaningless, since the analysis process has
to be performed anyway to collect things such as prox, offsets, synonyms,
payloads and so on.
On Sun, Oct 23, 2011 at 11:22 PM, prasenjit mukherjee
wrote:
I already have the term-frequency-count for all the terms in a
document. Is there a way I can re-use that info while indexing? I
would like to use solr for this.
-
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
[...] word: great }
doc { question 1, response: 2, word: bad }
doc { question 1, response: 2, word: excellent }
doc { question 2, response: 1, word: car }
doc { question 2, response: 2, word: bike }
etc.
I would like to get the word which is the most used for question 1.
I learned something about term frequency, but all the code samples I
found on the internet deal with the entire index (with indexReader.terms).
Any idea ?
Thank you
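One way to get the most-used word for a question, sketched over in-memory records rather than an index (the data and method names are made up for illustration):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: count words per question and return the most frequent one.
public class TopWordPerQuestion {
    static String topWord(String[][] docs, String question) {
        Map<String, Integer> counts = new HashMap<>();
        for (String[] doc : docs) { // doc = {question, word}
            if (doc[0].equals(question)) counts.merge(doc[1], 1, Integer::sum);
        }
        String best = null;
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (best == null || e.getValue() > counts.get(best)) best = e.getKey();
        }
        return best;
    }

    public static void main(String[] args) {
        String[][] docs = {{"1", "great"}, {"1", "bad"}, {"1", "great"}, {"2", "car"}};
        System.out.println(topWord(docs, "1")); // great
    }
}
```

Against a real index the same shape applies: restrict to documents matching question 1, tally the word field's terms, take the maximum.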
[...] frequencies of the terms
> before they are stored, but I guess there could be some way to work
> around it?
>
> All help appreciated!
>
> Thank you!
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Applying-term-frequency-thresholds-on-indexing-time-tp839449p839449.html
> [...]
> double tf = termDocs.freq();
>
> Regards,
> Hrishi
>
>
> Grant Ingersoll-6 wrote:
When do you detect that they are phrases? During indexing or during search?
On Jan 8, 2010, at 5:16 AM, hrishim wrote:
Hi,
I have phrases like brain natriuretic peptide indexed as a single token
using Lucene.
When I calculate the term frequency for the same, the count is 0 since the
tokens from the text are indexed separately, i.e. brain, natriuretic,
peptide.
Is there a way to solve this problem and get the term frequency for the
entire phrase?
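If the individual tokens are available, the phrase count can be recovered by scanning for consecutive runs. A Lucene-free sketch of what indexing the phrase as one token would have counted:

```java
import java.util.Arrays;
import java.util.List;

// Sketch: count occurrences of a multi-token phrase in a token sequence.
public class PhraseFreq {
    static int phraseFreq(List<String> tokens, List<String> phrase) {
        int count = 0;
        for (int i = 0; i + phrase.size() <= tokens.size(); i++) {
            if (tokens.subList(i, i + phrase.size()).equals(phrase)) count++;
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList(
            "brain", "natriuretic", "peptide", "levels", "of",
            "brain", "natriuretic", "peptide");
        List<String> phrase = Arrays.asList("brain", "natriuretic", "peptide");
        System.out.println(phraseFreq(tokens, phrase)); // 2
    }
}
```

Inside Lucene the equivalent information lives in term positions, which is what a phrase query consults; alternatively, emitting the phrase as an extra single token at index time makes the plain term-frequency lookup work directly.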
[...] rather than using a HashMap with a TermVectorMapper to store the
counts of the terms and calling getTermFreqVector().
I do not require the term frequency within a document.
I think that is as fast as it's going to get unless you have some other
restrictions that would allow you to use a FieldCache. Can you
describe the bigger proble[...]
Thanks,
Thomas
HashMap termDocCount = new HashMap();
TermQuery tagQuery = new TermQuery(tagTerm);
TopDocs docs = searcher.search(tagQuery, numDocs);
for (int i = 0; i < [...])
public void map(String term, int frequency, [...]
- Original Message -
From: "Grant Ingersoll"
To:
Sent: Tuesday, June 30, 2009 9:48 PM
Subject: Re: Term Frequency vector consumes memory
In Lucene, a Term Vector is a specific thing that is stored on disk
when creating a Document and Field. It is optional and off by
default. It is separate from being able to get th[...]
[...] in order to load the term vector. I want to switch off
this feature. Is that possible without re-indexing?
Regards
Ganesh
I am not clear on your question.
Cheers,
Grant
On Jun 30, 2009, at 3:37 AM, Ganesh wrote:
At the end of the day, I build stats of the top indexed terms. I enabled
term frequency for the single field. It is working fine, and I am able to
get the top terms and their frequencies, but it consumes a huge amount of
RAM. My index size is 5 GB and has 8 million records. If I didn't e[...]
: The easiest way to change the tf calculation would be overwriting
: tf in an own implementation of Similarity like it's done in
: SweetSpotSimilarity. But the average term frequency of the
: document is missing. Is there a simple way to get or calc this
: number?
there was quite a b[...]
Hi,
i'd like to use the term frequency normalization described in
http://wiki.apache.org/lucene-java/TREC%202007%20Million%20Queries%20Track%20-%20IBM%20Haifa%20Team
so that the term frequency tf becomes
tf(t, d) = log(1 + freq(t, d)) / log(1 + avgFreq(d))
The easiest way to change t[...]
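The normalized tf above is straightforward to compute directly once the document's average term frequency is known. A sketch of the formula in plain Java:

```java
// Sketch: tf(t, d) = log(1 + freq(t, d)) / log(1 + avgFreq(d)),
// the average-term-frequency normalization referenced from the
// TREC 2007 Million Queries Track (IBM Haifa) wiki page.
public class NormalizedTf {
    static double tf(double freq, double avgFreq) {
        return Math.log(1 + freq) / Math.log(1 + avgFreq);
    }

    public static void main(String[] args) {
        // A term occurring exactly at the document's average frequency
        // normalizes to 1.0; rarer terms fall below it.
        System.out.println(tf(4, 4));       // 1.0
        System.out.println(tf(1, 4) < 1.0); // true
    }
}
```

The missing ingredient the poster asks about, avgFreq(d), is total token count divided by unique term count for the document, which is indeed not directly available from a standard Similarity callback.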
: Subject: Term Frequency and IndexSearcher
http://people.apache.org/~hossman/#threadhijack
Thread Hijacking on Mailing Lists
When starting
Hi Paul,
I am tempted to suggest the following ( I am assuming here that the
document and the particular fields are TFVed when indexing):
For every doc in the result set:
- get the doc id
- using the doc id, get the TermFreqVector of this document from the
index reader (tfv = ireader.getTermFreqVector(...)) [...]
Hi,
I know it is very easy to get the frequency of a given term using the
indexReader but I am looking to perform an index search and would like to get
the frequency of the given term in the result set. Is this possible?
Thanks in advance,
Paul
---
[...] Look for explain in the API docs by clicking on Index at
the top of the docs. They're all there.
-Original Message-
From: Matthew Hall [mailto:[EMAIL PROTECTED]
Sent: Thursday, July 03, 2008 10:20 AM
To: lucene
Subject: Term Frequency for more complex terms
I have a quick question, could someone point me towards where in the API
I'll have to investigate in order to figure out the term frequencies of
more complex terms?
For example I want to know the tf of "kit ligand" treated as a phrase.
I see that luke has access to this information in its exp[...]
I know you have a solution already that I agree with, but I do think
the DisjunctionMaxQuery could serve as the start for writing your own
Query that did what you want. Why would you want to? Well, maybe
you have other ways you want to search as well and don't want to mess
with a custom Similarity [...]
On 5/25/07, Walt Stoneburner <[EMAIL PROTECTED]> wrote:
In reading the math for scoring at the bottom of:
http://lucene.zones.apache.org:8080/hudson/job/Lucene-Nightly/javadoc/org/apache/lucene/search/Similarity.html
It appears that if I can make tf() and idf(), term frequency and
inverse document frequency respectively, both return 1, then coord[...]
Grant writes:
Have a look at the DisjunctionMaxQuery, I think it might help,
although I am not sure it will fully cover your case.
The definition for DisjunctionMaxQuery is provided at this URL:
http://incubator.apache.org/lucene.net/docs/2.1/Lucene.Net.Search.DisjunctionMaxQuery.html,
Grossly
Hi,
I'm trying to figure out what I need to do with Lucene to score a
document higher when it has a larger number of unique search terms
that are hit, rather than term frequency counts.
A quick example.
If I'm searching for "BIRD CAT DOG" (all should clauses), then I want
...a document with BIRD, CAT, and DOG terms, each only appearing
once, in it to score higher than
...a document [...]
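The ranking Walt describes, distinct matched query terms outranking raw repetition, can be sketched as a simple count of distinct hits (Lucene's coord factor captured part of this idea):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch: score a document by how many distinct query terms it contains,
// ignoring how often each one repeats.
public class UniqueTermScore {
    static int score(List<String> docTokens, Set<String> queryTerms) {
        Set<String> hit = new HashSet<>(docTokens);
        hit.retainAll(queryTerms);
        return hit.size();
    }

    public static void main(String[] args) {
        Set<String> query = new HashSet<>(Arrays.asList("BIRD", "CAT", "DOG"));
        List<String> doc1 = Arrays.asList("BIRD", "CAT", "DOG");           // 3 unique hits
        List<String> doc2 = Arrays.asList("BIRD", "BIRD", "BIRD", "BIRD"); // 1 unique hit
        System.out.println(score(doc1, query) > score(doc2, query)); // true
    }
}
```

Making tf() and idf() return 1 in a custom Similarity, as discussed above, achieves much the same effect inside Lucene by leaving the coordination term as the dominant factor.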
[...] field
values. We are forming a RangeQuery for time and a normal query for other
field values.
Now I am able to find the term frequency per index, i.e. for the whole 24
hours. But I want to find the term frequency for 1 hour, i.e. between
01:00:00 and 02:00:00. Will it be possible? Is there any API to find [...]
29 apr 2007 kl. 18.33, saikrishna venkata pendyala wrote:
Where does the lucene compute term frequency vector ?
{filename,function
name}
DocumentWriter.java
private final void invertDocument(Document doc)
Hi,
Where does the lucene compute term frequency vector ? {filename,function
name}
Actually the task is to replace all the term frequencies with some
constant number (integer). How do I do this?
Any kind of help is appreciated .
Thanks in advance.
Hi,
How do I get the term frequency of multiple terms in a particular document?
Is there any API method other than using TermVector that may help?
Also, how do I calculate the term frequency for a time range? I.e., if my
index has a field "TIME" with values in millis (like 1176281188000), and I
want to calculate ter[...]
12 apr 2007 kl. 09.12, sai hariharan wrote:
Thanx for replying. In my scenario i'm not going to index any of my
docs.
So is there a way to find out term frequencies of the terms in a doc
without doing the indexing part?
Using an analyzer (TokenStream) and a Map?
while ((t = ts.next()) != null) [...]
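karl's one-liner idea, tokenize and tally into a Map with no index involved, might look like this (simple whitespace splitting stands in for a real analyzer here):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: term frequencies for a document's text without indexing it,
// by tokenizing and tallying counts in a Map.
public class TermFreqMap {
    static Map<String, Integer> count(String text) {
        Map<String, Integer> freqs = new LinkedHashMap<>();
        for (String token : text.toLowerCase().split("\\s+")) {
            freqs.merge(token, 1, Integer::sum);
        }
        return freqs;
    }

    public static void main(String[] args) {
        System.out.println(count("To be or not to be"));
        // {to=2, be=2, or=1, not=1}
    }
}
```

Swapping the split for a real analyzer's TokenStream keeps the tallying loop identical while adding stopword removal, stemming, and so on.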
Hi,
Thanx for replying. In my scenario i'm not going to index any of my docs.
So is there a way to find out term frequencies of the terms in a doc
without doing the indexing part?
Thanx in advance,
Hari
On 4/12/07, Grant Ingersoll <[EMAIL PROTECTED]> wrote:
Add Term Vectors to your Field during indexing. See the Field
constructors. To get a Term Vector out, see
IndexReader.getTermFreqVector method.
-Grant
On Apr 11, 2007, at 3:23 PM, sai hariharan wrote:
Hi,
I've just started using Lucene. Can anybody assist me in calculating
the term frequencies of the terms(words) that occur in a document(*.txt),
when a particular doc is submitted.
Say when I submit sample.txt, I should first analyze the document
with a standard analyzer, then the term frequenc[...]
On Apr 11, 2007, at 9:07 AM, karl wettin wrote:
11 apr 2007 kl. 04.21, Grant Ingersoll wrote:
Would some sort of caching strategy work? How big is your overall
collection?
Also, lately there have been a few threads on TV (term vector)
performance. I don't recall anyone having actively profiled or
examined it for improvements, so perh[...]