I am starting to feel that it is not that easy to contribute improvements or
small fixes to Solr (if they are not of broad interest).
I think this one could be a good improvement to MLT, but I would love to
discuss it with a committer.
The patch is attached; it has been there for months.
Any feedback would be appreciated. I want to contribute, but I need some
second opinions.
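
To make the proposal in the thread below concrete, here is a minimal sketch
in plain Java. It is not the code in the patch and it does not use Lucene:
the class MltFieldSketch, its two methods and the sample document are
hypothetical names made up for illustration only. It contrasts a single
term -> frequency map, which forgets the field, with a
field -> term -> frequency map, which keeps it.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, Lucene-free illustration; none of these names exist in Lucene. */
class MltFieldSketch {

  /** One map for all fields: the field each term came from is lost. */
  static Map<String, Integer> mergedTermFreqs(Map<String, List<String>> doc) {
    Map<String, Integer> freqs = new HashMap<>();
    for (List<String> terms : doc.values()) {
      for (String term : terms) {
        freqs.merge(term, 1, Integer::sum);
      }
    }
    return freqs;
  }

  /** One map per field: the field -> term relation is preserved. */
  static Map<String, Map<String, Integer>> perFieldTermFreqs(Map<String, List<String>> doc) {
    Map<String, Map<String, Integer>> freqs = new HashMap<>();
    for (Map.Entry<String, List<String>> entry : doc.entrySet()) {
      Map<String, Integer> fieldFreqs =
          freqs.computeIfAbsent(entry.getKey(), k -> new HashMap<>());
      for (String term : entry.getValue()) {
        fieldFreqs.merge(term, 1, Integer::sum);
      }
    }
    return freqs;
  }

  public static void main(String[] args) {
    // Sample document (made up): field name -> already analyzed terms.
    Map<String, List<String>> doc = Map.of(
        "description", List.of("quiet", "hotel", "pool"),
        "facilities", List.of("pool", "pool", "gym"));

    System.out.println("merged:    " + mergedTermFreqs(doc));
    System.out.println("per field: " + perFieldTermFreqs(doc));
  }
}
```

With this sample document, "pool" is counted 3 times in the merged map, but
once in description and twice in facilities in the per-field map, so a query
built from the second structure could match each term against the field it
actually came from.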

Cheers

On 11 February 2016 at 13:48, Alessandro Benedetti <abenede...@apache.org>
wrote:

> Hi guys,
> is it possible to get any feedback?
> Is there any process to speed up bug resolution / discussions?
> I just want to understand whether the patch is not good enough, whether I
> need to improve it, or whether simply no one has taken a look.
>
> https://issues.apache.org/jira/browse/LUCENE-6954
>
> Cheers
>
> On 11 January 2016 at 15:25, Alessandro Benedetti <abenede...@apache.org>
> wrote:
>
>> Hi guys,
>> the patch seems fine to me.
>> I didn't spend much more time on the code but I checked the tests and the
>> pre-commit checks.
>> It seems fine to me.
>> Let me know,
>>
>> Cheers
>>
>> On 31 December 2015 at 18:40, Alessandro Benedetti <abenede...@apache.org>
>> wrote:
>>
>>> https://issues.apache.org/jira/browse/LUCENE-6954
>>>
>>> First draft patch available; I will check the tests more carefully in the
>>> new year!
>>>
>>> On 29 December 2015 at 13:43, Alessandro Benedetti <abenede...@apache.org>
>>> wrote:
>>>
>>>> Sure, I will proceed tomorrow with the Jira and the simple patch +
>>>> tests.
>>>>
>>>> In the meantime let's try to collect some additional feedback.
>>>>
>>>> Cheers
>>>>
>>>> On 29 December 2015 at 12:43, Anshum Gupta <ans...@anshumgupta.net>
>>>> wrote:
>>>>
>>>>> Feel free to create a JIRA and put up a patch if you can.
>>>>>
>>>>> On Tue, Dec 29, 2015 at 4:26 PM, Alessandro Benedetti
>>>>> <abenede...@apache.org> wrote:
>>>>>
>>>>> > Hi guys,
>>>>> > While I was exploring the way we build the More Like This query, I
>>>>> > found a part I am not convinced by:
>>>>> >
>>>>> >
>>>>> >
>>>>> > Let's see how we build the query :
>>>>> > org.apache.lucene.queries.mlt.MoreLikeThis#retrieveTerms(int)
>>>>> >
>>>>> > 1) We extract the terms from the interesting fields, adding them to a
>>>>> > map:
>>>>> >
>>>>> > Map<String, Int> termFreqMap = new HashMap<>();
>>>>> >
>>>>> > *(we lose the field -> term relation; we no longer know which field
>>>>> > the term came from!)*
>>>>> >
>>>>> > org.apache.lucene.queries.mlt.MoreLikeThis#createQueue
>>>>> >
>>>>> > 2) We build the queue that will contain the query terms. At this point
>>>>> > we connect these terms to a field again, but:
>>>>> >
>>>>> > ...
>>>>> >> // go through all the fields and find the largest document frequency
>>>>> >> String topField = fieldNames[0];
>>>>> >> int docFreq = 0;
>>>>> >> for (String fieldName : fieldNames) {
>>>>> >>   int freq = ir.docFreq(new Term(fieldName, word));
>>>>> >>   topField = (freq > docFreq) ? fieldName : topField;
>>>>> >>   docFreq = (freq > docFreq) ? freq : docFreq;
>>>>> >> }
>>>>> >> ...
>>>>> >
>>>>> >
>>>>> > We identify the topField as the field with the highest document
>>>>> > frequency for the term t.
>>>>> > Then we build the termQuery:
>>>>> >
>>>>> > queue.add(new ScoreTerm(word, *topField*, score, idf, docFreq, tf));
>>>>> >
>>>>> > In this way we lose a lot of precision.
>>>>> > I am not sure why we do that.
>>>>> > I would prefer to keep the relation between terms and fields;
>>>>> > the quality of the MLT query could improve a lot.
>>>>> > If I run MLT on 2 fields, *description* and *facilities* for example,
>>>>> > it is likely I want to find documents with similar terms in the
>>>>> > description and similar terms in the facilities, without mixing
>>>>> > things up and losing the semantics of the terms.
>>>>> >
>>>>> > Let me know your opinion,
>>>>> >
>>>>> > Cheers
>>>>> >
>>>>> >
>>>>> > --
>>>>> > --------------------------
>>>>> >
>>>>> > Benedetti Alessandro
>>>>> > Visiting card : http://about.me/alessandro_benedetti
>>>>> >
>>>>> > "Tyger, tyger burning bright
>>>>> > In the forests of the night,
>>>>> > What immortal hand or eye
>>>>> > Could frame thy fearful symmetry?"
>>>>> >
>>>>> > William Blake - Songs of Experience -1794 England
>>>>> >
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Anshum Gupta
>>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>>
>>
>
>
>



-- 
--------------------------

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England
