[jira] [Commented] (SOLR-6878) solr.ManagedSynonymFilterFactory all-to-all synonym switch (aka. expand)

2015-05-05 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14529942#comment-14529942
 ] 

ASF subversion and git services commented on SOLR-6878:
---

Commit 1677923 from [~thelabdude] in branch 'dev/trunk'
[ https://svn.apache.org/r1677923 ]

SOLR-6878: support adding symmetric synonym lists using the managed synonym API

 solr.ManagedSynonymFilterFactory all-to-all synonym switch (aka. expand)
 

 Key: SOLR-6878
 URL: https://issues.apache.org/jira/browse/SOLR-6878
 Project: Solr
  Issue Type: Improvement
  Components: Schema and Analysis
Affects Versions: 4.10.2
Reporter: Tomasz Sulkowski
Assignee: Timothy Potter
  Labels: ManagedSynonymFilterFactory, REST, SOLR
 Attachments: SOLR-6878.patch, SOLR-6878.patch


 Hi,
 After switching from SynonymFilterFactory to ManagedSynonymFilterFactory I 
 found that there is no way to define an all-to-all synonym relation. 
 Basically (judging from a Google search) there is a need for an expand 
 switch (known from SynonymFilterFactory) that treats a keyword and all of 
 its synonyms as equal.
 For example: if we define a car:[wagen,ride] relation, a query that 
 includes the keyword or one of its synonyms would be translated to car or 
 wagen or ride, regardless of which of those three words was used.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6878) solr.ManagedSynonymFilterFactory all-to-all synonym switch (aka. expand)

2015-05-05 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14529959#comment-14529959
 ] 

ASF subversion and git services commented on SOLR-6878:
---

Commit 1677924 from [~thelabdude] in branch 'dev/branches/branch_5x'
[ https://svn.apache.org/r1677924 ]

SOLR-6878: support adding symmetric synonym lists using the managed synonym API




[jira] [Commented] (SOLR-6878) solr.ManagedSynonymFilterFactory all-to-all synonym switch (aka. expand)

2015-05-01 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14524285#comment-14524285
 ] 

Hoss Man commented on SOLR-6878:


the expand option in the original SynonymFilterFactory was/is really just 
about allowing brevity for symmetric synonyms in the data file -- the best 
approach for the API is to tackle the same problem.

Instead of thinking about expand as a stateful option in 
ManagedSynonymFilterFactory (or worse, an _immutable_ stateful option), i would 
suggest that it should just be a (transient) property of the request to 
add to / create the synonym mappings -- one that doesn't even need to be 
explicit, since the list syntax already makes it clear.

today, if someone sends a map of KEY => LIST-OF(VALUES) to the API, we 
interpret that as "for each KEY, for each VALUE in LIST-OF(VALUES), add a 
synonym mapping of KEY=>VALUE", and later if the user asks to GET mappings or 
delete mappings they do so by KEY.

why not let the new expand feature just be syntactic sugar on adding symmetric 
sets of KEY=>VALUE mappings via lists of lists?

if a user is creating or adding to a synonym mapping with a 
LIST-OF(LIST-OF(VALUES)) then let the logic be: for each LIST-OF(VALUES) in 
the outer LIST, loop over the inner LIST and add a mapping from every VALUE => 
every other VALUE in the same inner LIST

it should be purely syntactic sugar -- GET requests should make it clear how 
the data is internally modeled.
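
To make the list-of-lists idea concrete, here is a minimal Java sketch of the expansion loop described above. It is only an illustration: SymmetricSynonymSketch and expand() are made-up names, not code from the patch or from Solr, and the sorted TreeMap/TreeSet containers are an assumption chosen so the output is easy to read.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Illustrative sketch only; the class and method names are hypothetical
// and are not part of Solr.
final class SymmetricSynonymSketch {

  // For each inner list, add a mapping from every term to every other
  // term in the same inner list.
  static Map<String, Set<String>> expand(List<List<String>> groups) {
    Map<String, Set<String>> mappings = new TreeMap<>();
    for (List<String> group : groups) {
      for (String key : group) {
        Set<String> values = mappings.computeIfAbsent(key, k -> new TreeSet<>());
        for (String value : group) {
          if (!value.equals(key)) {
            values.add(value);
          }
        }
      }
    }
    return mappings;
  }

  public static void main(String[] args) {
    List<List<String>> request =
        Arrays.asList(Arrays.asList("funny", "jocular", "whimsical"));
    // Prints one KEY => VALUES line per term in the request.
    expand(request).forEach((key, values) -> System.out.println(key + " => " + values));
  }
}
```

Because the result is an ordinary KEY => VALUES map, nothing about the request shape needs to be remembered afterwards, which is the point of treating expand as sugar.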

bq. What should the API do if the user then decides to update the specific 
mappings for funny by sending in a request such as ...

we update that exact mapping, and no other mappings are changed -- 
update/delete requests should operate on individual keys, regardless of what 
type of request added those keys.
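
A rough Java sketch of that per-key behaviour follows. PerKeyUpdateSketch and its methods are hypothetical names used only for illustration, and the choice to merge new values into an existing key (rather than replace them) is an assumption, not something settled in this thread.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Illustrative sketch only; the class and method names are hypothetical.
final class PerKeyUpdateSketch {
  private final Map<String, Set<String>> mappings = new TreeMap<>();

  // Sugar: a symmetric list is expanded into plain per-key mappings on
  // arrival, so no record of the original list is kept.
  void addSymmetric(List<String> group) {
    for (String key : group) {
      for (String value : group) {
        if (!value.equals(key)) {
          addMapping(key, Arrays.asList(value));
        }
      }
    }
  }

  // An explicit KEY => VALUES request touches only that key.
  // Assumption: new values are merged into the existing set.
  void addMapping(String key, List<String> values) {
    mappings.computeIfAbsent(key, k -> new TreeSet<>()).addAll(values);
  }

  // Deleting is per key as well; other keys still list the deleted term.
  void delete(String key) {
    mappings.remove(key);
  }

  // Returns the values for a key, or null if the key has no mapping.
  Set<String> get(String key) {
    return mappings.get(key);
  }

  public static void main(String[] args) {
    PerKeyUpdateSketch store = new PerKeyUpdateSketch();
    store.addSymmetric(Arrays.asList("funny", "jocular", "whimsical"));
    store.addMapping("funny", Arrays.asList("hilarious"));
    System.out.println("funny => " + store.get("funny"));
    System.out.println("jocular => " + store.get("jocular"));
  }
}
```

In this sketch the later request for funny changes only the funny entry; jocular and whimsical keep the mappings they received from the symmetric list.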



The (more complex) alternative is to create a much more general abstraction of 
synonym dictionary entries where each entry is either a one way mapping or 
a multi directional mapping ... so that we internally track & remember that 
the user gave us some set of one way mappings like {'mad': ['angry']} and 
also gave us a set of multi directional mappings as lists like 
['funny','jocular','whimsical'], and support some new syntax for saying "i 
want to edit the list i previously gave you which contains 'jocular' such that 
it no longer contains 'whimsical' but now contains 'happy'", and also have 
sanity checks in place to prevent people from trying to mix the two.

but i think (as you alluded to above) that way leads to madness.





[jira] [Commented] (SOLR-6878) solr.ManagedSynonymFilterFactory all-to-all synonym switch (aka. expand)

2015-05-01 Thread Timothy Potter (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14524298#comment-14524298
 ] 

Timothy Potter commented on SOLR-6878:
--

bq. why not let the new expand feature just be syntactic sugar on adding 
symmetric sets of KEY=>VALUE mappings via lists of lists?

Good idea! I'll start down that path as it seems pretty straight-forward to 
implement w/o all the state management issues as you mentioned. Thanks Hoss.





[jira] [Commented] (SOLR-6878) solr.ManagedSynonymFilterFactory all-to-all synonym switch (aka. expand)

2015-05-01 Thread Timothy Potter (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14523482#comment-14523482
 ] 

Timothy Potter commented on SOLR-6878:
--

I started going through this patch and I have some questions about how to 
support the equivalent synonyms feature for managed synonyms.

NOTE: I'm using the term equivalent synonyms based on the doc here:
https://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory

Specifically, here are a couple of issues I see with supporting equivalent 
synonyms lists at the managed API level:

1) The default value for expand is true (in the patch), but what if the user 
changes it to false after already having added equivalent synonym lists? Or 
vice-versa. What do we do about existing equivalent mappings? We could store 
the equivalent lists in a separate data structure and then apply the correct 
behavior depending on the expand flag when the managed data is viewed, i.e. 
either a GET request from the API or when updating the data used to initialize 
the underlying SynonymMap. This is similar to what we do with ignoreCase; 
however, ignoreCase was easily handled, whereas I think allowing expand to be 
changed by the API is much more complicated.

Of course we could punt on this issue altogether and just make the expand flag 
immutable, i.e. you can set it initially to true or false, but cannot change it 
with the API. If we make it immutable, then we can apply the mapping on update 
and not have to maintain any additional data structures to remember the raw 
state of equiv lists.

2) Let's say we allow users to send in equivalent synonym lists to the API, 
such as:

{code}
curl -v -X PUT \
  -H 'Content-type:application/json' \
  --data-binary '["funny","entertaining","whimsical","jocular"]' \
  'http://localhost:8983/solr/techproducts/schema/analysis/synonyms/english'
{code}

If expand is true, then you end up with the following mappings (pardon the Java 
code syntax as I didn't want to clean that up for this example):
{code}
assertJQ(endpoint + "/funny",
    "/funny==['entertaining','jocular','whimsical']");
assertJQ(endpoint + "/entertaining",
    "/entertaining==['funny','jocular','whimsical']");
assertJQ(endpoint + "/jocular",
    "/jocular==['entertaining','funny','whimsical']");
assertJQ(endpoint + "/whimsical",
    "/whimsical==['entertaining','funny','jocular']");
{code}

What should the API do if the user then decides to update the specific mappings 
for funny by sending in a request such as:

{code}
curl -v -X PUT \
  -H 'Content-type:application/json' \
  --data-binary '{"funny":["hilarious"]}' \
  'http://localhost:8983/solr/techproducts/schema/analysis/synonyms/english'
{code}

Does the API treat explicit mappings as having precedence over equivalent 
lists? Or does it fail with some weird error most users won't understand? Seems 
to get complicated pretty fast ...
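
For comparison, the "separate data structure" option from point 1 could look roughly like the Java sketch below, where the raw equivalent lists are stored and the mappings are derived only when the data is viewed. RawEquivListSketch and its methods are hypothetical names. The expand=false branch assumes the convention documented for SynonymFilterFactory (every term in a list is reduced to the first term); the expand=true branch follows the assertions above and leaves each term out of its own value list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Illustrative sketch only; the class and method names are hypothetical.
final class RawEquivListSketch {
  // Raw state: the equivalent lists exactly as the user sent them.
  private final List<List<String>> equivLists = new ArrayList<>();

  void addEquivalentList(List<String> terms) {
    equivLists.add(new ArrayList<>(terms));
  }

  // Derives the KEY => VALUES view at read time from the current flag,
  // so flipping expand later never loses the original lists.
  Map<String, Set<String>> view(boolean expand) {
    Map<String, Set<String>> out = new TreeMap<>();
    for (List<String> terms : equivLists) {
      for (String key : terms) {
        Set<String> values = out.computeIfAbsent(key, k -> new TreeSet<>());
        if (expand) {
          // Every term maps to every other term in its list.
          for (String term : terms) {
            if (!term.equals(key)) {
              values.add(term);
            }
          }
        } else {
          // Assumed convention: every term is reduced to the first term.
          values.add(terms.get(0));
        }
      }
    }
    return out;
  }

  public static void main(String[] args) {
    RawEquivListSketch store = new RawEquivListSketch();
    store.addEquivalentList(Arrays.asList("funny", "entertaining", "whimsical", "jocular"));
    System.out.println("expand=true:  " + store.view(true));
    System.out.println("expand=false: " + store.view(false));
  }
}
```

The cost of this approach is visible even in a sketch this small: the stored state no longer matches what a GET returns, and an explicit mapping for funny would need its own precedence rule against the list that already contains it.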

I didn't go too far down the path of implementing this so there are probably 
more questions that will come up. To reiterate my original design assumption 
for managed synonyms, the API was not intended for humans to interact with 
directly, rather there should be some sort of UI layer on top of this API that 
translates synonym mappings into low-level API calls. For me, it's much 
clearer to send in explicit mappings for each synonym than it is to send some 
flat list and then interpret that list differently based on some flag.

The only advantage I can see is for huge synonym lists, where expanding 
everything out in the request makes the request larger. Other than that, are 
there other use cases that require this expand functionality that cannot be 
achieved with the current implementation? If so, we need to decide if expand should be 
immutable and what the API should do if an explicit mapping is received for a 
term that is already used in an equivalent synonym list. [~Soolek] your 
thoughts on this?


[jira] [Commented] (SOLR-6878) solr.ManagedSynonymFilterFactory all-to-all synonym switch (aka. expand)

2015-04-17 Thread Timothy Potter (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500859#comment-14500859
 ] 

Timothy Potter commented on SOLR-6878:
--

Thanks for the patch Vitaliy, I'll get this into 5.2

