[ 
https://issues.apache.org/jira/browse/SOLR-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-2802:
---------------------------

    Attachment: SOLR-2802_update_processor_toolkit.patch

Patch containing a rough start for the type of thing I had in mind.

This patch implements 2 usable UpdateProcessors...

* TrimFieldUpdateProcessorFactory
* ConcatFieldUpdateProcessorFactory

...using these abstract subclasses...

* FieldMutatingUpdateProcessorFactory - handles configuration for what fields 
the processor should act on (by name, type, name regex, or type class)
* FieldMutatingUpdateProcessor - handles the rote work of dealing with 
AddUpdateCommands and checking which fields the configuration indicates should 
be modified, so subclasses can focus solely on the relevant SolrInputFields
* FieldValueMutatingUpdateProcessor - handles the rote work of dealing with 
SolrInputFields when subclasses just want to modify all individual values of a 
field in place
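
To illustrate the division of labor described above, here is a minimal, 
self-contained sketch. It is not code from the patch and does not use Solr 
classes: a "document" is modeled as a plain map of field name to values, and 
the names FieldValueMutator, TrimMutator, and mutateValue are hypothetical 
stand-ins chosen for this example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical sketch: none of these types are Solr classes.
class ValueMutatorSketch {

    /** Base class: walks the selected fields and rewrites each value in place. */
    static abstract class FieldValueMutator {
        private final Predicate<String> fieldSelector;

        FieldValueMutator(Predicate<String> fieldSelector) {
            this.fieldSelector = fieldSelector;
        }

        /** The only method a subclass implements: map one value to its replacement. */
        protected abstract Object mutateValue(Object value);

        final void process(Map<String, List<Object>> doc) {
            for (Map.Entry<String, List<Object>> field : doc.entrySet()) {
                if (!fieldSelector.test(field.getKey())) {
                    continue; // field not selected by the configuration
                }
                field.getValue().replaceAll(this::mutateValue);
            }
        }
    }

    /** Trims String values and leaves every other value type untouched. */
    static final class TrimMutator extends FieldValueMutator {
        TrimMutator(Predicate<String> fieldSelector) {
            super(fieldSelector);
        }

        @Override
        protected Object mutateValue(Object value) {
            return (value instanceof String) ? ((String) value).trim() : value;
        }
    }

    public static void main(String[] args) {
        Map<String, List<Object>> doc = new LinkedHashMap<>();
        doc.put("title", new ArrayList<>(List.of("  hello  ", " world")));
        doc.put("price", new ArrayList<>(List.of(42)));
        new TrimMutator(name -> true).process(doc); // select all fields
        System.out.println(doc);
    }
}
```

The point of the layering is that a concrete processor such as the trim 
example only has to say what happens to a single value; iteration and field 
selection live in the base class.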

Additional subclasses that seem like they would be useful, easy to implement, 
and fit easily into this framework would be...

  * RemoveBlankFieldUpdateProcessorFactory - ie: toss ""
  * HTMLStripFieldUpdateProcessorFactory
  * FirstFieldValueUpdateProcessorFactory
  * LastFieldValueUpdateProcessorFactory
  * ParseNumericFieldUpdateProcessorFactory - preconfigured formats
  * ParseDateFieldUpdateProcessorFactory - preconfigured formats, tz from field
  * ParseBooleanFieldUpdateProcessorFactory - configured lists of values to map 
to true/false
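
As a rough idea of how small such a subclass could be, the following sketch 
shows the value-mapping logic a boolean parser might use. It is an 
illustration only: the method name parseBoolean, the trimming, the 
case-insensitive matching, and the choice to leave unrecognized values 
untouched are all assumptions, not behavior taken from the patch.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch of the per-value logic; not a Solr class.
class ParseBooleanSketch {

    /**
     * Maps a String value to a Boolean using configured value lists.
     * Non-String values and unrecognized Strings are returned unchanged.
     */
    static Object parseBoolean(Object value, Set<String> trueValues, Set<String> falseValues) {
        if (!(value instanceof String)) {
            return value;
        }
        // Assumption: configured values are lowercase; input is normalized to match.
        String normalized = ((String) value).trim().toLowerCase(Locale.ROOT);
        if (trueValues.contains(normalized)) {
            return Boolean.TRUE;
        }
        if (falseValues.contains(normalized)) {
            return Boolean.FALSE;
        }
        return value;
    }

    public static void main(String[] args) {
        Set<String> trueValues = Set.of("true", "yes", "on");
        Set<String> falseValues = Set.of("false", "no", "off");
        System.out.println(parseBoolean(" Yes ", trueValues, falseValues));
        System.out.println(parseBoolean("maybe", trueValues, falseValues));
    }
}
```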

It would be helpful to get feedback, in particular on the field config 
strategy.  My thinking is that in general processors should default to 
mutating all fields but ignore values that don't match expectations (ie: Trim 
doesn't touch values that aren't Strings), though some subclasses could pick 
different defaults based on the implementing class.  It also seems like it 
would be helpful to support "excluding" fields (by name, regex, type, etc...)
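
One possible shape for that selection logic is sketched below. It is a 
hypothetical illustration, not the configuration code in the patch: the 
selector method and its parameters are made up for this example, and only 
name and name-regex matching are modeled, since matching by field type would 
need access to the schema.

```java
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;
import java.util.regex.Pattern;

// Hypothetical sketch of include/exclude field selection; not a Solr class.
class FieldSelectorSketch {

    /**
     * Builds a field-name selector. If no include rules are given, every
     * field is selected by default. Exclude rules always win over includes.
     */
    static Predicate<String> selector(Set<String> names, List<Pattern> regexes,
                                      Set<String> excludedNames, List<Pattern> excludedRegexes) {
        Predicate<String> included = field ->
            (names.isEmpty() && regexes.isEmpty())
                || names.contains(field)
                || regexes.stream().anyMatch(p -> p.matcher(field).matches());
        Predicate<String> excluded = field ->
            excludedNames.contains(field)
                || excludedRegexes.stream().anyMatch(p -> p.matcher(field).matches());
        return included.and(excluded.negate());
    }

    public static void main(String[] args) {
        // No include rules: all fields are selected except the excluded "id".
        Predicate<String> allButId = selector(Set.of(), List.of(), Set.of("id"), List.of());
        System.out.println(allButId.test("title"));
        System.out.println(allButId.test("id"));
    }
}
```

Letting excludes override includes keeps the "mutate everything by default" 
behavior usable: a config can leave the include rules empty and only list the 
few fields that must never be touched.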
                
> Toolkit of UpdateProcessors for modifying document values
> ---------------------------------------------------------
>
>                 Key: SOLR-2802
>                 URL: https://issues.apache.org/jira/browse/SOLR-2802
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: Hoss Man
>         Attachments: SOLR-2802_update_processor_toolkit.patch
>
>
> Users frequently ask questions where the answer is "you 
> could do it with an UpdateProcessor", but the selection of out of the box 
> UpdateProcessors is generally lacking and there aren't even very good base 
> classes for the common case of manipulating field values when adding documents
