[ https://issues.apache.org/jira/browse/SOLR-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13954701#comment-13954701 ]
Alaknantha commented on SOLR-3862:
----------------------------------

Yonik,

Are you looking at the older patch? https://issues.apache.org/jira/secure/attachment/12637687/SOLR-3862.patch is my latest patch, in which I got rid of the regular expression usage.

**1) Do we want a way to specify the removal of multiple values? Perhaps "remove" : [ "A","B","C" ]

***Yes, this patch supports removal of multiple values.

**2) What are the downsides to using regex? Someone may not realize that the values being used are regular expressions until they are in production and values that accidentally have wildcards in them are used? Or they may simply forget to do wildcard escaping code since everything would "just work" until they did encounter them?

***Yes, I ran into exactly this kind of regular expression problem when I tried to use the "remove" field modifier provided by the older patch https://issues.apache.org/jira/secure/attachment/12589605/SOLR-3862-3.patch in my project. That is why I dropped the regular expression and now use plain value comparisons.

Here is the issue I ran into. When the update goes through ZooKeeper, the input field value arrives at the patch code that handles the atomic "remove" update as a list, whereas when I bypass ZK and hit Solr directly, it arrives as a String.

The original patch, SOLR-3862-3.patch, builds a regular expression pattern from the incoming field value to be removed. The pattern is used to create a matcher, which is then applied to each value in the original list. If the incoming field value is a list, the matcher does not match correctly because of the additional square brackets.

In the example below, "CA" is sent in as the input field value to be removed. Because the patch code calls toString() on the incoming value, a list is rendered with enclosing square brackets, like [CA]. As a regular expression, [CA] is a character class, so the pattern can match only "C" or "A" and never "CA".
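For illustration only (this is not code from either patch), the difference between the two approaches can be sketched in plain Java. The class RemoveSketch and the methods removeByRegex and removeByValue are hypothetical names, and the sketch assumes the stored values are Strings and that the incoming value is either a single value or a java.util.Collection.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration; not taken from any SOLR-3862 patch.
class RemoveSketch {

    // Regex-style removal: the incoming value is stringified and compiled
    // as a pattern, so a list such as [CA] becomes a character class.
    static List<String> removeByRegex(List<String> existing, Object toRemove) {
        Pattern p = Pattern.compile(toRemove.toString());
        List<String> kept = new ArrayList<>();
        for (String v : existing) {
            if (!p.matcher(v).matches()) {
                kept.add(v);
            }
        }
        return kept;
    }

    // Value-comparison removal: accept a single value or a collection of
    // values and compare with equals(), so no regex escaping is involved.
    static List<String> removeByValue(List<String> existing, Object toRemove) {
        Collection<?> values = (toRemove instanceof Collection)
                ? (Collection<?>) toRemove
                : Collections.singletonList(toRemove);
        List<String> kept = new ArrayList<>(existing);
        kept.removeAll(values);
        return kept;
    }

    public static void main(String[] args) {
        List<String> states = Arrays.asList("CA", "NY", "C", "A");
        // Value arrives as a String: the regex happens to work.
        System.out.println(removeByRegex(states, "CA"));                // [NY, C, A]
        // Value arrives as a list: "[CA]" removes "C" and "A", keeps "CA".
        System.out.println(removeByRegex(states, Arrays.asList("CA"))); // [CA, NY]
        // Value comparison behaves the same for both input shapes.
        System.out.println(removeByValue(states, "CA"));                // [NY, C, A]
        System.out.println(removeByValue(states, Arrays.asList("CA", "NY"))); // [C, A]
    }
}
```

With value comparison, a String and a one-element list give the same result, and removing several values at once falls out of the same code path.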
I had to step into the Solr code to troubleshoot this issue.

Hitting Solr through ZooKeeper:

    Pattern p = Pattern.compile("[CA]");
    Matcher m = p.matcher("CA");
    boolean b = m.matches();

This returns false, so the remove does not work when the incoming field value arrives as a list.

Hitting Solr directly:

    Pattern p = Pattern.compile("CA");
    Matcher m = p.matcher("CA");
    boolean b = m.matches();

This returns true, so the remove works when the incoming field value arrives as a String.

**3) Perhaps we want a separate way to specify "value" vs "regex". I assume "value" will be a much more common usecase than regex (although I do like the power that regex brings).

***I agree with you that "value" is the most common use case, and that is the reason I got rid of the "regex". Please review the patch https://issues.apache.org/jira/secure/attachment/12637687/SOLR-3862.patch, which has my fix.

> add "remove" as update option for atomically removing a value from a multivalued field
> --------------------------------------------------------------------------------------
>
>                 Key: SOLR-3862
>                 URL: https://issues.apache.org/jira/browse/SOLR-3862
>             Project: Solr
>          Issue Type: Improvement
>          Components: SolrCloud
>    Affects Versions: 4.0-BETA
>            Reporter: Jim Musil
>            Assignee: Erick Erickson
>         Attachments: SOLR-3862-2.patch, SOLR-3862-3.patch, SOLR-3862-4.patch, SOLR-3862.patch, SOLR-3862.patch
>
>
> Currently you can atomically "add" a value to a multivalued field. It would be useful to be able to "remove" a value from a multivalued field.
> When you "set" a multivalued field to null, it destroys all values.

--
This message was sent by Atlassian JIRA (v6.2#6252)