[ https://issues.apache.org/jira/browse/LUCENE-1380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12630885#action_12630885 ]
michaelsembwever edited comment on LUCENE-1380 at 9/14/08 5:58 AM:
---------------------------------------------------------------------

> All this patch does is to set all position increment of the tokens produced
> by the ShingleFilter to 0, right?
> I'm going to remove this for 2.4 fix and recommend you to use the filter
> strategy mentioned.

Adding the new TokenFilter isn't as easy as ABC: Lucene needs the filter class on its classpath, and Solr needs a matching TokenFilterFactory before it can read the filter from its configuration files. That is a lot of work when we're (almost) agreed that removing positional information from all tokens makes sense when using the ShingleFilter.

If it were just the one installation I wouldn't have a problem with adding the custom TokenFilter, but because our use case is an open-source and documented system (read http://sesat.no/howto-solr-query-evaluation.html) I'd like to make it as easy as possible for third parties. I would also think that, because this is a way to replace commercial and competing technology from FAST, the community would be behind such an enhancement...

> Patch for ShingleFilter.enablePositions
> ---------------------------------------
>
>                 Key: LUCENE-1380
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1380
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: contrib/analyzers
>            Reporter: Michael Semb Wever
>            Assignee: Karl Wettin
>            Priority: Trivial
>         Attachments: LUCENE-1380.patch, LUCENE-1380.patch
>
>
> Make it possible for *all* words and shingles to be placed at the same
> position.
> Default is to place each shingle at the same position as the unigram (or
> first shingle if outputUnigrams=false). That is, each coterminal token has
> positionIncrement=1 and every other token a positionIncrement=0.
> This leads to a MultiPhraseQuery where at least one word/shingle must be
> matched from each word/token. This is not always desired.
> See http://comments.gmane.org/gmane.comp.jakarta.lucene.user/34746 for
> mailing list thread.
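
To make the two position layouts concrete, here is a minimal sketch in plain Java. It deliberately does not use the Lucene API: `Tok`, `shingles`, `flatten` and `positions` are hypothetical names invented for this illustration, and `positions` is just a running sum of the increments for comparing the layouts, not Lucene's own position numbering. It assumes the default behaviour is as the issue description states it (the first token emitted at each word gets positionIncrement=1, every other token 0), and that the proposed change amounts to setting every increment to 0.

```java
import java.util.ArrayList;
import java.util.List;

class ShinglePositions {
    /** Hypothetical stand-in for a Lucene token: term text plus position increment. */
    static final class Tok {
        final String term;
        final int posInc;
        Tok(String term, int posInc) { this.term = term; this.posInc = posInc; }
    }

    /** Models the default layout: first token at each word gets increment 1, the rest 0. */
    static List<Tok> shingles(String[] words, int maxSize, boolean outputUnigrams) {
        List<Tok> out = new ArrayList<>();
        for (int i = 0; i < words.length; i++) {
            boolean first = true;
            StringBuilder sb = new StringBuilder();
            for (int n = 1; n <= maxSize && i + n <= words.length; n++) {
                if (n > 1) sb.append(' ');
                sb.append(words[i + n - 1]);
                if (n == 1 && !outputUnigrams) continue; // skip the bare word
                out.add(new Tok(sb.toString(), first ? 1 : 0));
                first = false;
            }
        }
        return out;
    }

    /** Models the proposed behaviour: every token gets increment 0. */
    static List<Tok> flatten(List<Tok> in) {
        List<Tok> out = new ArrayList<>();
        for (Tok t : in) out.add(new Tok(t.term, 0));
        return out;
    }

    /** Running sum of increments, used only to compare the two layouts. */
    static List<Integer> positions(List<Tok> toks) {
        List<Integer> out = new ArrayList<>();
        int pos = 0;
        for (Tok t : toks) { pos += t.posInc; out.add(pos); }
        return out;
    }

    public static void main(String[] args) {
        List<Tok> d = shingles(new String[] {"please", "divide", "this"}, 2, true);
        System.out.println(positions(d));          // prints [1, 1, 2, 2, 3]
        System.out.println(positions(flatten(d))); // prints [0, 0, 0, 0, 0]
    }
}
```

In the default layout the five tokens occupy three distinct positions, which is what leads to a MultiPhraseQuery requiring a match at each of them; in the flattened layout all tokens share one position.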