[jira] [Commented] (SOLR-6878) solr.ManagedSynonymFilterFactory all-to-all synonym switch (aka. expand)
[ https://issues.apache.org/jira/browse/SOLR-6878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14529942#comment-14529942 ]

ASF subversion and git services commented on SOLR-6878:
---
Commit 1677923 from [~thelabdude] in branch 'dev/trunk'
[ https://svn.apache.org/r1677923 ]

SOLR-6878: support adding symmetric synonym lists using the managed synonym API

solr.ManagedSynonymFilterFactory all-to-all synonym switch (aka. expand)

Key: SOLR-6878
URL: https://issues.apache.org/jira/browse/SOLR-6878
Project: Solr
Issue Type: Improvement
Components: Schema and Analysis
Affects Versions: 4.10.2
Reporter: Tomasz Sulkowski
Assignee: Timothy Potter
Labels: ManagedSynonymFilterFactory, REST, SOLR
Attachments: SOLR-6878.patch, SOLR-6878.patch

Hi,

After switching from SynonymFilterFactory to ManagedSynonymFilterFactory I have found out that there is no way to set an all-to-all synonym relation. Basically (judging from a Google search) there is a need for an expand functionality switch (known from SynonymFilterFactory) which will treat all synonyms and their keyword as equal.

For example: if we define a car:[wagen,ride] relation, it would translate a query that includes the keyword or one of the synonyms to car or wagen or ride, independently of which of those three words was used.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
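The difference between a one-way mapping and the all-to-all relation requested in the issue can be sketched as follows. This is a simplified model for illustration only: `expand_term` and the two dictionaries are hypothetical names, and the lookup is not how Solr's synonym filter works internally.

```python
def expand_term(term, mappings):
    """Return the query term plus any synonyms mapped from it, sorted.

    Hypothetical helper: `mappings` is a plain dict of term -> list of synonyms.
    """
    return sorted({term, *mappings.get(term, [])})


# One-way mapping as in the issue's example: only the key "car" is expanded.
one_way = {"car": ["wagen", "ride"]}

# All-to-all relation: every term maps to every other term in the group.
all_to_all = {
    "car": ["wagen", "ride"],
    "wagen": ["car", "ride"],
    "ride": ["car", "wagen"],
}

print(expand_term("wagen", one_way))     # ['wagen']
print(expand_term("wagen", all_to_all))  # ['car', 'ride', 'wagen']
```

With the one-way mapping a query for "wagen" is left alone, while the all-to-all relation yields the same three terms regardless of which one was used.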
[jira] [Commented] (SOLR-6878) solr.ManagedSynonymFilterFactory all-to-all synonym switch (aka. expand)
[ https://issues.apache.org/jira/browse/SOLR-6878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14529959#comment-14529959 ]

ASF subversion and git services commented on SOLR-6878:
---
Commit 1677924 from [~thelabdude] in branch 'dev/branches/branch_5x'
[ https://svn.apache.org/r1677924 ]

SOLR-6878: support adding symmetric synonym lists using the managed synonym API
[jira] [Commented] (SOLR-6878) solr.ManagedSynonymFilterFactory all-to-all synonym switch (aka. expand)
[ https://issues.apache.org/jira/browse/SOLR-6878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14524285#comment-14524285 ]

Hoss Man commented on SOLR-6878:

the expand option in the original SynonymFilterFactory was/is really just about allowing brevity for symmetric synonyms in the data file -- the best approach for the API is to tackle the same problem.

Instead of thinking about expand as a stateful option in ManagedSynonymFilterFactory (or worse, an _immutable_ stateful option), i would suggest that instead it should just be a (transient) property of the request to add to / create the synonym mappings -- one that doesn't even need to be explicit, since the list syntax already makes it clear.

today, if someone sends a map of KEY = LIST-OF(VALUES) to the API, we interpret that as: for each KEY, for each VALUE in LIST-OF(VALUES), add a synonym mapping of KEY=VALUE. later, if the user asks to GET mappings or delete mappings, they do so by KEY.

why not let the new expand feature just be syntactic sugar on adding symmetric sets of KEY=VALUE mappings via lists of lists? if a user is creating or adding to a synonym mapping with a LIST-OF(LIST-OF(VALUES)) then let the logic be: for each LIST-OF(VALUES) in the outer LIST, loop over the inner LIST and add a mapping from every VALUE to every other VALUE in the same inner LIST.

it should be purely syntactic sugar -- GET requests should make it clear how the data is internally modeled.

bq. What should the API do if the user then decides to update the specific mappings for funny by sending in a request such as ...

we update that exact mapping, and no other mappings are changed -- update/delete requests should operate on individual keys, regardless of what type of request added those keys.

The (more complex) alternative is to create a much more general abstraction of synonym dictionary entries where each entry is either a one way mapping or a multi directional mapping ... so that we internally track and remember that the user gave us some set of one way mappings like {'mad': ['angry']} and also gave us a set of multi directional mappings as lists like ['funny','jocular','whimsical'], and support some new syntax for saying "i want to edit the list i previously gave you which contains 'jocular' such that it no longer contains 'whimsical' but now contains 'happy'", and also have sanity checks in place to prevent people from trying to mix the two.

but i think (as you alluded to above) that way leads to madness.
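The "syntactic sugar" proposal above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from the attached patch: `add_synonyms` is a hypothetical helper, the store is modeled as a plain dict of term to set of synonyms, and map payloads are assumed to carry lists of values.

```python
def add_synonyms(mappings, payload):
    """Merge a request payload into `mappings` (term -> set of synonyms).

    Hypothetical helper, not Solr code. A dict payload is treated as
    KEY -> LIST-OF(VALUES); any other payload is treated as a
    LIST-OF(LIST-OF(VALUES)) of symmetric groups.
    """
    if isinstance(payload, dict):
        # One-way mappings: add KEY -> VALUE for each value in the list.
        for key, values in payload.items():
            mappings.setdefault(key, set()).update(values)
    else:
        # Symmetric groups: within each inner list, map every value
        # to every other value in the same inner list.
        for group in payload:
            for term in group:
                others = {t for t in group if t != term}
                if others:
                    mappings.setdefault(term, set()).update(others)
    return mappings


mappings = {}
add_synonyms(mappings, [["funny", "jocular", "whimsical"]])
add_synonyms(mappings, {"mad": ["angry"]})
# A later keyed update changes only that key; the other entries that came
# from the symmetric list are left alone.
add_synonyms(mappings, {"funny": ["hilarious"]})
print(sorted(mappings["funny"]))    # ['hilarious', 'jocular', 'whimsical']
print(sorted(mappings["jocular"]))  # ['funny', 'whimsical']
```

Because the symmetric list is expanded at write time, nothing about the original list needs to be remembered, and a GET by key simply returns what is stored for that key.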
[jira] [Commented] (SOLR-6878) solr.ManagedSynonymFilterFactory all-to-all synonym switch (aka. expand)
[ https://issues.apache.org/jira/browse/SOLR-6878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14524298#comment-14524298 ]

Timothy Potter commented on SOLR-6878:
--
bq. why not let the new expand feature just be syntactic sugar on adding symmetric sets of KEY=VALUE mappings via lists of lists?

Good idea! I'll start down that path, as it seems pretty straightforward to implement without all the state management issues you mentioned. Thanks Hoss.
[jira] [Commented] (SOLR-6878) solr.ManagedSynonymFilterFactory all-to-all synonym switch (aka. expand)
[ https://issues.apache.org/jira/browse/SOLR-6878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14523482#comment-14523482 ]

Timothy Potter commented on SOLR-6878:
--
I started going through this patch and I have some questions about how to support the equivalent synonyms feature for managed synonyms. NOTE: I'm using the term "equivalent synonyms" based on the doc here: https://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory

Specifically, here are a couple of issues I see with supporting equivalent synonym lists at the managed API level:

1) The default value for expand is true (in the patch), but what if the user changes it to false after already having added equivalent synonym lists? Or vice versa? What do we do about existing equivalent mappings?

We could store the equivalent lists in a separate data structure and then apply the correct behavior depending on the expand flag when the managed data is viewed, i.e. either on a GET request from the API or when updating the data used to initialize the underlying SynonymMap. This is similar to what we do with ignoreCase; however, ignoreCase was easily handled, whereas I think allowing expand to be changed by the API is much more complicated.

Of course we could punt on this issue altogether and just make the expand flag immutable, i.e. you can set it initially to true or false, but cannot change it with the API. If we make it immutable, then we can apply the mapping on update and not have to maintain any additional data structures to remember the raw state of equivalent lists.
2) Let's say we allow users to send in equivalent synonym lists to the API, such as:

{code}
curl -v -X PUT \
  -H 'Content-type:application/json' \
  --data-binary '["funny","entertaining","whimsical","jocular"]' \
  'http://localhost:8983/solr/techproducts/schema/analysis/synonyms/english'
{code}

If expand is true, then you end up with the following mappings (pardon the Java code syntax as I didn't want to clean that up for this example):

{code}
assertJQ(endpoint + "/funny", "/funny==['entertaining','jocular','whimsical']");
assertJQ(endpoint + "/entertaining", "/entertaining==['funny','jocular','whimsical']");
assertJQ(endpoint + "/jocular", "/jocular==['entertaining','funny','whimsical']");
assertJQ(endpoint + "/whimsical", "/whimsical==['entertaining','funny','jocular']");
{code}

What should the API do if the user then decides to update the specific mappings for funny by sending in a request such as:

{code}
curl -v -X PUT \
  -H 'Content-type:application/json' \
  --data-binary '{"funny":["hilarious"]}' \
  'http://localhost:8983/solr/techproducts/schema/analysis/synonyms/english'
{code}

Does the API treat explicit mappings as having precedence over equivalent lists? Or does it fail with some weird error most users won't understand? Seems to get complicated pretty fast ... I didn't go too far down the path of implementing this, so there are probably more questions that will come up.

To reiterate my original design assumption for managed synonyms: the API was not intended for humans to interact with directly; rather, there should be some sort of UI layer on top of this API that translates synonym mappings into low-level API calls. For me, it's much clearer to send in explicit mappings for each synonym than it is to send some flat list and then interpret that list differently based on some flag. The only advantage I can see is that if the synonym list is huge, expanding it out in the request makes the request larger.
Other than that, are there other use cases that require this expand functionality and cannot be achieved with the current implementation? If so, we need to decide whether expand should be immutable and what the API should do if an explicit mapping is received for a term that is already used in an equivalent synonym list. [~Soolek] your thoughts on this?
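The "UI layer" idea in the comment above, where a client turns a flat equivalent list into explicit per-term mappings before calling the API, might look roughly like this. It is only a sketch: `explicit_mappings` is a hypothetical client-side helper, and the code builds the JSON request body without sending it anywhere.

```python
import json


def explicit_mappings(terms):
    """Expand a flat list of equivalent terms into explicit mappings.

    Hypothetical client-side helper: each term maps to the sorted list
    of all the other terms in the group.
    """
    unique = sorted(set(terms))
    return {term: [t for t in unique if t != term] for term in unique}


terms = ["funny", "entertaining", "whimsical", "jocular"]
payload = explicit_mappings(terms)

# JSON body a client could PUT to the managed synonyms endpoint.
body = json.dumps(payload)
print(body)

# The expanded form carries n * (n - 1) synonym entries for n terms.
entry_count = sum(len(values) for values in payload.values())
print(len(terms), entry_count)  # 4 12
```

For the four-term example the flat list has 4 entries while the explicit form has 12, which is the request-size trade-off mentioned in the comment.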
[jira] [Commented] (SOLR-6878) solr.ManagedSynonymFilterFactory all-to-all synonym switch (aka. expand)
[ https://issues.apache.org/jira/browse/SOLR-6878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500859#comment-14500859 ]

Timothy Potter commented on SOLR-6878:
--
Thanks for the patch Vitaliy, I'll get this into 5.2