To clarify further: we use StandardTokenizer and StandardFilter in front
of the WDF. Already after StandardTokenizer's transformations, e-tail is
split into two consecutive tokens.
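
A toy Python sketch of what I mean by "consecutive" (this is not Lucene code: the regex tokenizer, the lowercasing step and the helper names are my own stand-ins, used only to model positions). It splits on non-alphanumerics, gives each token an increasing position, and checks an in-order, slop-0 pair the way we expect the inner spanNear to behave:

```python
import re


def tokenize(text):
    """Toy stand-in for the analysis chain (assumption, not Lucene):
    split on non-alphanumerics, lowercase, assign increasing positions."""
    words = re.findall(r"[A-Za-z0-9]+", text)
    return [(pos, word.lower()) for pos, word in enumerate(words)]


def ordered_adjacent(tokens, first, second):
    """True if `first` is immediately followed by `second`
    (in order, slop 0), judged purely by token positions."""
    positions = {}
    for pos, tok in tokens:
        positions.setdefault(tok, set()).add(pos)
    second_positions = positions.get(second, set())
    return any(p + 1 in second_positions for p in positions.get(first, set()))


tokens = tokenize("From E-Tail")
print(tokens)                                     # [(0, 'from'), (1, 'e'), (2, 'tail')]
print(ordered_adjacent(tokens, "e", "commerce"))  # False
print(ordered_adjacent(tokens, "e", "tail"))      # True
```

Under this simplified model "e" and "tail" sit on separate, adjacent positions, and "e commerce" with slop 0 should not match "From E-Tail".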

On Mon, Jun 15, 2015 at 10:08 AM, Dmitry Kan <solrexp...@gmail.com> wrote:

> Thanks, Erick. The analysis page shows that the positions are growing,
> so there are no "glued" words on the same position.
>
> On Sun, Jun 14, 2015 at 6:10 PM, Erick Erickson <erickerick...@gmail.com>
> wrote:
>
>> My guess is that you have WordDelimiterFilterFactory in your analysis
>> chain with parameters that break up E-Tail into both "e" and "tail"
>> _and_ put them in the same position. This assumes that the result
>> fragment you pasted is incomplete and "commerce" is in it:
>>
>> From <em>E</em>-Tail <em>commerce</em>
>>
>> or some such. Try the admin/analysis screen with the "verbose" box
>> checked and you'll see the position of each token after analysis to
>> see if my guess is accurate.
>>
>> Best,
>> Erick
>>
>> On Sun, Jun 14, 2015 at 4:34 AM, Dmitry Kan <solrexp...@gmail.com> wrote:
>> > Hi guys,
>> >
>> > We observe a strange bug in Solr 4.10.2, whereby a sloppy query hits
>> > words it should not:
>> >
>> > <lst name="debug"><str name="rawquerystring">the "e commerce"</str><str
>> > name="querystring">the "e commerce"</str><str
>> > name="parsedquery">SpanNearQuery(spanNear([Contents:the,
>> > spanNear([Contents:eä, Contents:commerceä], 0, true)], 300,
>> > false))</str><str name="parsedquery_toString">spanNear([Contents:the,
>> > spanNear([Contents:eä, Contents:commerceä], 0, true)], 300, false)</str>
>> >
>> >
>> > This query produces hits on words such as:
>> >
>> > From <em>E</em>-Tail
>> >
>> > In the inner spanNear query we expect that e and commerce will occur
>> > within 0 slop in that order.
>> >
>> > Can somebody shed light on what is going on?
>> >
>> > --
>> > Dmitry Kan
>> > Luke Toolbox: http://github.com/DmitryKey/luke
>> > Blog: http://dmitrykan.blogspot.com
>> > Twitter: http://twitter.com/dmitrykan
>> > SemanticAnalyzer: www.semanticanalyzer.info
>>
>


-- 
Dmitry Kan
Luke Toolbox: http://github.com/DmitryKey/luke
Blog: http://dmitrykan.blogspot.com
Twitter: http://twitter.com/dmitrykan
SemanticAnalyzer: www.semanticanalyzer.info
