[ https://issues.apache.org/jira/browse/LUCENE-7708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15928145#comment-15928145 ]
Shawn Heisey commented on LUCENE-7708:
--------------------------------------

There's no fix version here. CHANGES.txt says it's in 6.5.0.

(I'm looking for possible causes of a shingle filter problem confirmed in Solr 6.3 and 6.4; this couldn't be the cause.)

> Track PositionLengthAttribute abuse
> -----------------------------------
>
>                 Key: LUCENE-7708
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7708
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/queryparser, modules/analysis
>            Reporter: Jim Ferenczi
>         Attachments: LUCENE-7708.patch, LUCENE-7708.patch
>
>
> Some token filters use the position length attribute of the token stream to
> encode the number of terms they put in a single token.
> This breaks query parsing because it creates a disconnected graph.
> I've tracked down the abusive case to 2 candidates:
> * ShingleFilter, which sets the position length attribute to the length of the
> shingle.
> * CJKBigramFilter, which always sets the position length attribute to 2.
> I don't think these filters should set the position length at all, so the best
> fix would be to remove the attribute from these token filters, but this could
> break BWC.
> Still, this is a serious bug, since shingles and CJK bigrams now produce
> invalid queries.
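
For illustration only, here is a minimal Python sketch of why a position length equal to the shingle size gives a disconnected graph. It is not Lucene code. It assumes a simplified model in which each token is an edge from its start position to start + position length. The helper names `token_edges` and `is_connected` and the example token values are hypothetical, built from the description in the issue rather than captured from a real analyzer.

```python
from collections import defaultdict


def token_edges(tokens):
    """Turn (term, pos_inc, pos_len) triples into (start, end, term) edges."""
    edges = []
    pos = -1  # so that a first position increment of 1 lands on node 0
    for term, pos_inc, pos_len in tokens:
        pos += pos_inc
        edges.append((pos, pos + pos_len, term))
    return edges


def is_connected(tokens):
    """True if every token starts on a node reachable from the first node
    and the last node of the stream is reachable as well."""
    edges = token_edges(tokens)
    if not edges:
        return True
    outgoing = defaultdict(list)
    for start, end, _ in edges:
        outgoing[start].append(end)
    first = min(start for start, _, _ in edges)
    last = max(end for _, end, _ in edges)
    seen = {first}
    stack = [first]
    while stack:  # depth-first walk over token edges
        node = stack.pop()
        for nxt in outgoing[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return last in seen and all(start in seen for start, _, _ in edges)


# Assumed bigram shingles of "a b c", position length set to the shingle size.
abusive = [("a b", 1, 2), ("b c", 1, 2)]
# The same shingles with a position length of 1.
plain = [("a b", 1, 1), ("b c", 1, 1)]

print(is_connected(abusive))  # False
print(is_connected(plain))    # True
```

In the first stream the edges are 0 -> 2 and 1 -> 3, so nothing links node 0 to node 1 or node 2 to node 3. With a position length of 1 the edges are 0 -> 1 and 1 -> 2, which form a single path.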