[ 
https://issues.apache.org/jira/browse/SOLR-8113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14951357#comment-14951357
 ] 

Hoss Man commented on SOLR-8113:
--------------------------------

Gus, just read through your patch.

My chief concerns are:

# You've redefined the semantics of how the {{dest}} string is interpreted when 
a {{fieldRegex}} is used to identify the source (so there's a back compat break 
there depending on the value of {{dest}})
# You've designed the "config syntax" for this new feature around the 
requirement that it can _only_ be used if at least one {{fieldRegex}} is used 
to identify the source fields ...

The original purpose of the {{FieldSelector}} API was to provide more general 
approaches for configuring which fields an {{UpdateProcessor}} should care 
about, beyond simple string field name glob/pattern matching.  I think that 
pattern replacements for _destination_ field naming should (in general) be 
independent of the original selection criteria, so that a user could say 
something like...

bq. I want to make a copy of _any_ {{StrField}} in my documents such that the 
copy has the same name as the original but with {{_t}} appended.

...and that should be possible with this feature, regardless of whether the user 
is using a specific naming convention (e.g. "*_s") for all StrFields in their 
index, using some syntax that might look like this...

{code}
<processor class="solr.CloneFieldUpdateProcessorFactory">
  <!-- existing source selector syntax -->
  <lst name="source">
    <str name="typeClass">solr.StrField</str>
  </lst>
  <!-- hypothetical new destination pattern syntax -->
  <lst name="dest">
    <str name="pattern">.*</str>
    <str name="replacement">$0_t</str>
  </lst>
</processor>
{code}

...while prefix\->prefix and suffix\->suffix styles of cloning, similar to what 
{{copyField}} supports, could also be specified.  Example: the equivalent of 
{{<copyField src="\*_s" dest="\*_t" />}} would be...

{code}
<processor class="solr.CloneFieldUpdateProcessorFactory">
  <!-- existing source selector syntax -->
  <lst name="source">
    <str name="fieldRegex">^(.*)_s$</str>
  </lst>
  <!-- hypothetical new destination pattern syntax -->
  <lst name="dest">
    <str name="pattern">^(.*)_s$</str>
    <str name="replacement">$1_t</str>
  </lst>
</processor>
{code}


That's fairly verbose, but if we get the nuts & bolts of the general case 
implemented, then it should be trivial to add syntactic sugar to simplify the 
configuration...

{code}
<processor class="solr.CloneFieldUpdateProcessorFactory">
  <!-- hypothetical syntactic sugar equivalent to the above example -->
  <!-- since no other source selector args are specified, assume pattern based 
cloning -->
  <str name="pattern">^(.*)_s$</str>
  <str name="replacement">$1_t</str>
</processor>
{code}
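
For reference, here is a minimal sketch of how a {{pattern}}/{{replacement}} pair like the ones above could be evaluated against field names. It is plain {{java.util.regex}} code, not anything from the patch: the class and method names are hypothetical, and treating a non-matching field name as "no destination" is an assumption. It requires the pattern to match the whole field name, because {{String.replaceAll}} with {{.*}} also matches the empty string at the end of the input and so applies the replacement twice.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of Solr or SOLR-8113.patch.
class DestPatternSketch {

  // Returns the destination name for one field, or null if the pattern
  // does not match the entire field name (assumed to mean "no clone").
  static String destName(String fieldName, Pattern pattern, String replacement) {
    Matcher m = pattern.matcher(fieldName);
    if (!m.matches()) {
      return null;
    }
    // Expand $0, $1, ... exactly once against the single full match.
    StringBuffer sb = new StringBuffer();
    m.appendReplacement(sb, replacement);
    return sb.toString();
  }

  // Maps each matching source field name to its destination name,
  // preserving the input order.
  static Map<String, String> destNames(List<String> fieldNames, String regex, String replacement) {
    Pattern pattern = Pattern.compile(regex);
    Map<String, String> result = new LinkedHashMap<>();
    for (String name : fieldNames) {
      String dest = destName(name, pattern, replacement);
      if (dest != null) {
        result.put(name, dest);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<String> fields = Arrays.asList("title_s", "author_s", "price_f");

    // Suffix-to-suffix cloning, like <copyField src="*_s" dest="*_t" />
    System.out.println(destNames(fields, "^(.*)_s$", "$1_t"));
    // prints {title_s=title_t, author_s=author_t}

    // Append _t to any name (source selection is assumed to happen elsewhere)
    System.out.println(destName("title", Pattern.compile(".*"), "$0_t"));
    // prints title_t

    // The naive alternative applies the replacement twice
    System.out.println("title".replaceAll(".*", "$0_t"));
    // prints title_t_t
  }
}
```

Whether a real implementation should insist on a full match or allow partial matches is a design choice; the sketch only shows that the two behave differently for a pattern like {{.*}}.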

What do you think?

> Accept replacement strings in CloneFieldUpdateProcessorFactory
> --------------------------------------------------------------
>
>                 Key: SOLR-8113
>                 URL: https://issues.apache.org/jira/browse/SOLR-8113
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>    Affects Versions: 5.3
>            Reporter: Gus Heck
>         Attachments: SOLR-8113.patch
>
>
> Presently CloneFieldUpdateProcessorFactory accepts regular expressions to 
> select source fields, which mirrors wildcards in the source for copyField in 
> the schema. This patch adds a counterpart to copyField's wildcards in the 
> dest attribute by interpreting the dest parameter as a regex replacement 
> string.



