It seems that tokens are sorted by frequency:
...
Collections.sort(profile, new TokenComparator());
...
and
private static class TokenComparator implements Comparator<Token> {
  public int compare(Token t1, Token t2) {
    return t2.cnt - t1.cnt;
  }
}
and cnt is the token count.
order comes from the way they are inserted into the HashMap 'tokens',
not from the order in which the tokens appear in the original text.
Frederico
-Original Message-
From: lboutros [mailto:boutr...@gmail.com]
Sent: Friday, 8 April 2011 09:49
To: solr-user@lucene.apache.org
Subject: Re: Using MLT
Subject: Re: Using MLT feature
10:11
To: solr-user@lucene.apache.org
Subject: Re: Using MLT feature
Couldn't you extend TextProfileSignature and modify the TokenComparator
class to use lexical order when tokens have the same frequency?
Ludovic.
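A minimal sketch of the tie-breaking idea suggested above, assuming a token type with a count field `cnt` and a string value field `val`. The `Token` class here is a hypothetical stand-in written for this example, not the class inside TextProfileSignature, and `TieBreakingTokenComparator` is an illustrative name:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

final class TieBreakSketch {

    // Hypothetical stand-in for the token type; only cnt appears in the thread,
    // the val field is an assumption made for this example.
    static final class Token {
        final String val;
        final int cnt;

        Token(String val, int cnt) {
            this.val = val;
            this.cnt = cnt;
        }
    }

    // Sorts by descending count, then by lexical order of the token text,
    // so the result no longer depends on HashMap iteration order.
    static final class TieBreakingTokenComparator implements Comparator<Token> {
        @Override
        public int compare(Token t1, Token t2) {
            int byCount = Integer.compare(t2.cnt, t1.cnt);
            if (byCount != 0) {
                return byCount;
            }
            return t1.val.compareTo(t2.val);
        }
    }

    // Returns the token texts in sorted order without modifying the input list.
    static List<String> sortedValues(List<Token> profile) {
        List<Token> copy = new ArrayList<>(profile);
        Collections.sort(copy, new TieBreakingTokenComparator());
        List<String> out = new ArrayList<>();
        for (Token t : copy) {
            out.add(t.val);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Token> profile = new ArrayList<>();
        profile.add(new Token("solr", 2));
        profile.add(new Token("apache", 2));
        profile.add(new Token("lucene", 3));
        System.out.println(sortedValues(profile)); // [lucene, apache, solr]
    }
}
```

With a deterministic tie-break, two documents containing the same tokens with the same counts produce the same ordering regardless of insertion order.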
2011/4/8 Frederico Azeiteiro [via Lucene]
ml-node+2794604-1683988626-383...@n3
...@openindex.io]
Sent: Tuesday, 5 April 2011 15:20
To: solr-user@lucene.apache.org
Cc: Frederico Azeiteiro
Subject: Re: Using MLT feature
If you check the code for TextProfileSignature [1], you'll notice the init
method reading params. You can set those params as you did. Reading the Javadoc
[2] might help as well. But what's not documented in the Javadoc is how QUANT
is computed; it rounds
using the TextProfileSignature with success?
Thank you,
Frederico
-Original Message-
From: Markus Jelsma [mailto:markus.jel...@openindex.io]
Sent: Monday, 4 April 2011 16:47
To: solr-user@lucene.apache.org
Cc: Frederico Azeiteiro
Subject: Re: Using MLT feature
Hi again,
I
[mailto:frederico.azeite...@cision.com]
Sent: Monday, 4 April 2011 11:59
To: solr-user@lucene.apache.org
Subject: RE: Using MLT feature
Thank you Markus it looks great.
But the wiki is not very detailed on this.
Do you mean if I:
1. Create
<str name="minTokenLen">5</str>
On the processor tag.
Best regards,
Frederico
-Original Message-
From: Markus Jelsma [mailto:markus.jel...@openindex.io]
Sent: Tuesday, 5 April 2011 12:01
To: solr-user@lucene.apache.org
Cc: Frederico Azeiteiro
Subject: Re: Using MLT feature
On Tuesday 05 April 2011 12:19:33 Frederico
Do you want to skip indexing when something similar exists, or only when the
document is an exact duplicate? For exact duplicates, I would look into a hash
code of the document. Similarity, though, I think has to be measured against a
document already in the index.
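A minimal sketch of the hash-code approach for exact duplicates, assuming the document text is available as a String before indexing. The class name `ExactDuplicateSketch`, the in-memory set, and the choice of SHA-256 are assumptions for illustration and are not taken from the thread or from Solr:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashSet;
import java.util.Set;

final class ExactDuplicateSketch {

    // Fingerprints of every document text accepted so far.
    private final Set<String> seen = new HashSet<>();

    // Returns the SHA-256 digest of the text as a lowercase hex string.
    static String fingerprint(String text) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this is unexpected.
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }

    // Returns true if the text has not been seen before and should be indexed.
    boolean addIfNew(String text) {
        return seen.add(fingerprint(text));
    }

    public static void main(String[] args) {
        ExactDuplicateSketch dedupe = new ExactDuplicateSketch();
        System.out.println(dedupe.addIfNew("same text"));  // true
        System.out.println(dedupe.addIfNew("same text"));  // false
        System.out.println(dedupe.addIfNew("other text")); // true
    }
}
```

A cryptographic hash only catches byte-for-byte identical text; a single changed character gives a different fingerprint, which is why near-duplicate detection needs a different approach.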
On Apr 4, 2011, at 5:16, Frederico Azeiteiro
-
From: Chris Fauerbach [mailto:chris.fauerb...@gmail.com]
Sent: Monday, 4 April 2011 10:22
To: solr-user@lucene.apache.org
Subject: Re: Using MLT feature
Do you want to not index if something similar? Or don't index if exact.
I would look into a hash code of the document if you
the MLT feature to find similar docs before adding to final
index?
Thanks,
Frederico
-Original Message-
From: Chris Fauerbach [mailto:chris.fauerb...@gmail.com]
Sent: Monday, 4 April 2011 10:22
To: solr-user@lucene.apache.org
Subject: Re: Using MLT feature
Do you
-
From: Markus Jelsma [mailto:markus.jel...@openindex.io]
Sent: Monday, 4 April 2011 10:48
To: solr-user@lucene.apache.org
Subject: Re: Using MLT feature
http://wiki.apache.org/solr/Deduplication
On Monday 04 April 2011 11:34:52 Frederico Azeiteiro wrote:
Hi,
The idea is to not
[mailto:frederico.azeite...@cision.com]
Sent: Monday, 4 April 2011 11:59
To: solr-user@lucene.apache.org
Subject: RE: Using MLT feature
Thank you Markus it looks great.
But the wiki is not very detailed on this.
Do you mean if I:
1. Create:
<updateRequestProcessorChain name="dedupe">
<processor class