[jira] [Updated] (SOLR-8113) Accept replacement strings in CloneFieldUpdateProcessorFactory
[ https://issues.apache.org/jira/browse/SOLR-8113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-8113: --- Attachment: SOLR-8113.patch Patch Updates... * Fixed javadocs to refer to replaceAll (code change was already previous patch) * added testCloneFieldRegexReplaceAll to prove replaceAll usecases would work. ** test initially failed due to the Matcher.replaceAll being conditional on a succesful "Mather.match()" call. ** Replaced "Mather.match()" with a "Matcher.find()" call to ensure replaceAll would work as intended * Updated the javadocs example config (and corrisponding test config) refering to "ending in {{_price}}" to actually use string ending bounds. * Updated the test configs to include the start/end boundary rules for ({{^feat(.*)s$}}) so it matches the javadocs ...i plan to commit & start backporting as soon as my full test + precommit finishes > Accept replacement strings in CloneFieldUpdateProcessorFactory > -- > > Key: SOLR-8113 > URL: https://issues.apache.org/jira/browse/SOLR-8113 > Project: Solr > Issue Type: Improvement > Components: update >Affects Versions: 5.3 >Reporter: Gus Heck >Assignee: Hoss Man > Attachments: SOLR-8113.patch, SOLR-8113.patch, SOLR-8113.patch, > SOLR-8113.patch, SOLR-8113.patch > > > Presently CloneFieldUpdateProcessorFactory accepts regular expressions to > select source fields, which mirrors wildcards in the source for copyField in > the schema. This patch adds a counterpart to copyField's wildcards in the > dest attribute by interpreting the dest parameter as a regex replacement > string. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-8113) Accept replacement strings in CloneFieldUpdateProcessorFactory
[ https://issues.apache.org/jira/browse/SOLR-8113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gus Heck updated SOLR-8113: --- Attachment: SOLR-8113.patch All your updates look fine. I notice that most of my tests disappeared, but probably there was a lot of overlap with your original tests (which also verified boost maintenance whereas mine did not). I'm trusting that you are happy with the coverage there. I also added 2 more validations of config parameters to give nicer messages if someone uses 3 or 4 of the possible parameters (src/dest/pattern/replacement). This would have been caught by the "extra params" check at the end but is less clear. I also moved the extra params check to the main method as a safety net for all cases. I had been on the fence WRT replace vs replaceAll, (simplicity vs flexibility) but in looking around I notice FieldNameMutatingUpdateProcessor uses replace all. Consistency seems like a good idea. > Accept replacement strings in CloneFieldUpdateProcessorFactory > -- > > Key: SOLR-8113 > URL: https://issues.apache.org/jira/browse/SOLR-8113 > Project: Solr > Issue Type: Improvement > Components: update >Affects Versions: 5.3 >Reporter: Gus Heck >Assignee: Hoss Man > Attachments: SOLR-8113.patch, SOLR-8113.patch, SOLR-8113.patch, > SOLR-8113.patch > > > Presently CloneFieldUpdateProcessorFactory accepts regular expressions to > select source fields, which mirrors wildcards in the source for copyField in > the schema. This patch adds a counterpart to copyField's wildcards in the > dest attribute by interpreting the dest parameter as a regex replacement > string. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-8113) Accept replacement strings in CloneFieldUpdateProcessorFactory
[ https://issues.apache.org/jira/browse/SOLR-8113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-8113: --- Attachment: SOLR-8113.patch Gus: This is definitely along the lines of what I had in mind -- and i like your test refactoring / additions. I've updated the patch as I reviewed -- mainly in the ariea of documentation and additional error handling/messages when parsing the config... {panel} * removed stray addition to FirstFieldValueUpdateProcessorFactory javadocs ... left over from old patch? * CloneFieldUpdateProcessorFactory javadocs: ** fixed to (and likewise for replacement ** clarify what the the literals & types were for the new config in the description ** reworded example description to be bulleted list instead of run on sentence(s) ** moved/reworded "common case" explanation to the end, after all the major functionality is explained, to clarify it's syntactic sugar and put next to it's example. ** replaced the one off comment about FirstValueUpdateProcessor with a more general comment about cloning into multivalue fields and various FieldValueSubsetUpdateProcessorFactory options * CloneFieldUpdateProcessorFactoryTest ** standardized indenting ** updated testCloneFieldExample to include the additions made to the javadoc example ** updated testCloneCombinations to include a first value clone test *** added corrisponding clone-first to solrconfig-update-processor-chains.xml ** updated testCloneField with more equivilence tests -- this helps ensure we've got good coverage for the case where multi-selector + pattern + replacement that results in single dest field getting values from multiple source fields. *** added corrisponding new clone-single-regex, clone-multi-regex, clone-array-regex, clone-selector-regex and clone-simple-regex-syntax to solrconfig-update-processor-chains.xml as needed * CloneFieldUpdateProcessorFactory code: ** fixed getInstance to use getSourceSelector for safe error handling (eliminates the unused warning you asked about) ** refactored init method into 2 helper methods specific to the two syntax styles (readability) ** tweaked config error messages & added additional error handling of bad config combos ** improved error reporting when pattern is invalid and can't be compile {panel} My one remaining concern with this patch is the use of Matcher.replaceFirst ... i feel like we should probably be using Matcher.replaceAll since it would provide a feature superset of replaceFirst (ie: using replaceAll can still support all the current patch behavior via start/end bound constraints + capture groups, but replaceFirst can't support everything possible with replaceAll). Gus: What do you think? > Accept replacement strings in CloneFieldUpdateProcessorFactory > -- > > Key: SOLR-8113 > URL: https://issues.apache.org/jira/browse/SOLR-8113 > Project: Solr > Issue Type: Improvement > Components: update >Affects Versions: 5.3 >Reporter: Gus Heck >Assignee: Hoss Man > Attachments: SOLR-8113.patch, SOLR-8113.patch, SOLR-8113.patch > > > Presently CloneFieldUpdateProcessorFactory accepts regular expressions to > select source fields, which mirrors wildcards in the source for copyField in > the schema. This patch adds a counterpart to copyField's wildcards in the > dest attribute by interpreting the dest parameter as a regex replacement > string. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-8113) Accept replacement strings in CloneFieldUpdateProcessorFactory
[ https://issues.apache.org/jira/browse/SOLR-8113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gus Heck updated SOLR-8113: --- Attachment: SOLR-8113.patch After chatting briefly with [~hossman] at Lucene/Solr Revolution it became clear that the key point of his suggestion was the complete separation of the selection phase and the replacement phase. The attached patch provides his suggested configuration options. This does introduce the possibility that a field could match the selector and not match the replacement pattern. I have handled this case by ignoring such fields (as if they were not selected) and logging a debug message. I also wound up creating my own entire unit test before I found the existing tests in FieldMutatingProcessorFactoryTest. I have moved the tests from there into my test class (CloneFieldUpdateFactoryTest) as well so that they are easier to find. Both tests read the same config file. > Accept replacement strings in CloneFieldUpdateProcessorFactory > -- > > Key: SOLR-8113 > URL: https://issues.apache.org/jira/browse/SOLR-8113 > Project: Solr > Issue Type: Improvement > Components: update >Affects Versions: 5.3 >Reporter: Gus Heck > Attachments: SOLR-8113.patch, SOLR-8113.patch > > > Presently CloneFieldUpdateProcessorFactory accepts regular expressions to > select source fields, which mirrors wildcards in the source for copyField in > the schema. This patch adds a counterpart to copyField's wildcards in the > dest attribute by interpreting the dest parameter as a regex replacement > string. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-8113) Accept replacement strings in CloneFieldUpdateProcessorFactory
[ https://issues.apache.org/jira/browse/SOLR-8113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gus Heck updated SOLR-8113: --- Attachment: SOLR-8113.patch Patch vs 5x > Accept replacement strings in CloneFieldUpdateProcessorFactory > -- > > Key: SOLR-8113 > URL: https://issues.apache.org/jira/browse/SOLR-8113 > Project: Solr > Issue Type: Improvement > Components: update >Affects Versions: 5.3 >Reporter: Gus Heck > Attachments: SOLR-8113.patch > > > Presently CloneFieldUpdateProcessorFactory accepts regular expressions to > select source fields, which mirrors wildcards in the source for copyField in > the schema. This patch adds a counterpart to copyField's wildcards in the > dest attribute by interpreting the dest parameter as a regex replacement > string. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org