[ https://issues.apache.org/jira/browse/LUCENE-6664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14651158#comment-14651158 ]
Robert Muir commented on LUCENE-6664: ------------------------------------- OK, if you say its a generalization, then I am ok. But you are saying current code in queryparsers won't do the wrong thing?: I know its a tough one, since it already is somewhat wrong today!? I'm just asking because we dont want to make it worse or more confusing. > Replace SynonymFilter with SynonymGraphFilter > --------------------------------------------- > > Key: LUCENE-6664 > URL: https://issues.apache.org/jira/browse/LUCENE-6664 > Project: Lucene - Core > Issue Type: New Feature > Reporter: Michael McCandless > Assignee: Michael McCandless > Fix For: 5.3, Trunk > > Attachments: LUCENE-6664.patch, LUCENE-6664.patch, LUCENE-6664.patch, > LUCENE-6664.patch, usa.png, usa_flat.png > > > Spinoff from LUCENE-6582. > I created a new SynonymGraphFilter (to replace the current buggy > SynonymFilter), that produces correct graphs (does no "graph > flattening" itself). I think this makes it simpler. > This means you must add the FlattenGraphFilter yourself, if you are > applying synonyms during indexing. > Index-time syn expansion is a necessarily "lossy" graph transformation > when multi-token (input or output) synonyms are applied, because the > index does not store {{posLength}}, so there will always be phrase > queries that should match but do not, and then phrase queries that > should not match but do. > http://blog.mikemccandless.com/2012/04/lucenes-tokenstreams-are-actually.html > goes into detail about this. > However, with this new SynonymGraphFilter, if instead you do synonym > expansion at query time (and don't do the flattening), and you use > TermAutomatonQuery (future: somehow integrated into a query parser), > or maybe just "enumerate all paths and make union of PhraseQuery", you > should get 100% correct matches (not sure about "proper" scoring > though...). > This new syn filter still cannot consume an arbitrary graph. -- This message was sent by Atlassian JIRA (v6.3.4#6332) --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org