[
https://issues.apache.org/jira/browse/SOLR-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hoss Man updated SOLR-2802:
---------------------------
Attachment: SOLR-2802_update_processor_toolkit.patch
Patch containing a rough start for the type of thing I had in mind.
This patch implements 2 usable UpdateProcessors...
* TrimFieldUpdateProcessorFactory
* ConcatFieldUpdateProcessorFactory
...using these abstract subclasses...
* FieldMutatingUpdateProcessorFactory - handles configuration for what fields
the processor should act on (by name, type, name regex, or type class)
* FieldMutatingUpdateProcessor - handles the rote work of dealing with
AddUpdateCommands and checking which fields the configuration indicates should
be modified, so subclasses can focus solely on the relevant SolrInputFields
* FieldValueMutatingUpdateProcessor - handles the rote work of dealing with
SolrInputFields when subclasses just want to modify all individual values of a
field in place
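
To make the layering described above concrete, here is a minimal sketch of how the three levels could fit together. It is not the code from the patch: the class names (FieldMutatorSketch, ValueMutatorSketch, TrimSketch) are hypothetical, a document is modeled as a plain Map of field name to values rather than a SolrInputDocument, and the field selection is assumed to be a simple Predicate on the field name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical stand-in for the field-level base class: it walks the
// document and hands each selected field to the subclass.
abstract class FieldMutatorSketch {
    private final Predicate<String> selector;

    FieldMutatorSketch(Predicate<String> selector) {
        this.selector = selector;
    }

    // Subclasses return the replacement values for one selected field.
    abstract List<Object> mutate(String fieldName, List<Object> values);

    // Assumption: a document is a mutable map of field name -> values.
    final void process(Map<String, List<Object>> doc) {
        for (Map.Entry<String, List<Object>> e : doc.entrySet()) {
            if (selector.test(e.getKey())) {
                e.setValue(mutate(e.getKey(), e.getValue()));
            }
        }
    }
}

// Hypothetical stand-in for the value-level base class: subclasses only
// see one value at a time.
abstract class ValueMutatorSketch extends FieldMutatorSketch {
    ValueMutatorSketch(Predicate<String> selector) {
        super(selector);
    }

    abstract Object mutateValue(Object value);

    @Override
    final List<Object> mutate(String fieldName, List<Object> values) {
        List<Object> out = new ArrayList<>(values.size());
        for (Object v : values) {
            out.add(mutateValue(v));
        }
        return out;
    }
}

// Trim only touches String values; anything else passes through unchanged.
final class TrimSketch extends ValueMutatorSketch {
    TrimSketch(Predicate<String> selector) {
        super(selector);
    }

    @Override
    Object mutateValue(Object value) {
        return (value instanceof String) ? ((String) value).trim() : value;
    }
}
```

With this shape, a concrete processor such as the trim case only has to implement one small method, which is the point of the abstract subclasses listed above.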
Additional subclasses that seem like they would be useful, easy to implement,
and fit easily into this framework would be...
* RemoveBlankFieldUpdateProcessorFactory - ie: toss ""
* HTMLStripFieldUpdateProcessorFactory
* FirstFieldValueUpdateProcessorFactory
* LastFieldValueUpdateProcessorFactory
* ParseNumericFieldUpdateProcessorFactory - preconfigured formats
* ParseDateFieldUpdateProcessorFactory - preconfigured formats, tz from field
* ParseBooleanFieldUpdateProcessorFactory - configured lists of values to map
to true/false
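
As an illustration of the last idea in the list above, a value mutator driven by configured lists of true/false strings might look roughly like the sketch below. The class name ParseBooleanSketch, the constructor arguments, the case-insensitive matching, and the choice to leave unrecognized values untouched are all assumptions made for the example, not behavior taken from the patch.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch: maps configured strings to Boolean values.
final class ParseBooleanSketch {
    private final Set<String> trueValues;
    private final Set<String> falseValues;

    ParseBooleanSketch(String[] trueValues, String[] falseValues) {
        this.trueValues = normalize(trueValues);
        this.falseValues = normalize(falseValues);
    }

    // Assumption: matching ignores case and surrounding whitespace.
    private static Set<String> normalize(String[] values) {
        Set<String> out = new HashSet<>();
        for (String v : values) {
            out.add(v.trim().toLowerCase(Locale.ROOT));
        }
        return out;
    }

    Object mutateValue(Object value) {
        if (!(value instanceof String)) {
            return value; // non-String values are not touched
        }
        String key = ((String) value).trim().toLowerCase(Locale.ROOT);
        if (trueValues.contains(key)) {
            return Boolean.TRUE;
        }
        if (falseValues.contains(key)) {
            return Boolean.FALSE;
        }
        return value; // assumption: unrecognized strings are left as-is
    }
}
```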
Would be helpful to get feedback in particular on the field config strategy.
My thinking is that in general they should default to mutating all fields, but
ignore things that don't match expectations (ie: Trim doesn't mess with things
that aren't Strings); but some subclasses could default to things based on the
implementing class. Also seems like it would be helpful to support "excluding"
fields (by name, regex, type, etc...)
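
One possible reading of this config strategy, sketched in plain Java for discussion: no include rules means "all fields", include rules are OR'ed together, and any exclude rule wins over any include. The class FieldSelectorSketch and its method names are hypothetical, the OR/exclude-wins semantics are an assumption, and selection by field type is left out because it would need the schema.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.Predicate;
import java.util.regex.Pattern;

// Hypothetical sketch of a field selector with includes and excludes.
final class FieldSelectorSketch implements Predicate<String> {
    private final Set<String> includeNames = new HashSet<>();
    private final Set<String> excludeNames = new HashSet<>();
    private Pattern includeRegex; // null means no include regex configured
    private Pattern excludeRegex; // null means no exclude regex configured

    FieldSelectorSketch includeName(String name) {
        includeNames.add(name);
        return this;
    }

    FieldSelectorSketch includeRegex(String regex) {
        includeRegex = Pattern.compile(regex);
        return this;
    }

    FieldSelectorSketch excludeName(String name) {
        excludeNames.add(name);
        return this;
    }

    FieldSelectorSketch excludeRegex(String regex) {
        excludeRegex = Pattern.compile(regex);
        return this;
    }

    @Override
    public boolean test(String fieldName) {
        // Assumption: excludes always win over includes.
        if (excludeNames.contains(fieldName)) {
            return false;
        }
        if (excludeRegex != null && excludeRegex.matcher(fieldName).matches()) {
            return false;
        }
        // With no include rules at all, default to every field.
        if (includeNames.isEmpty() && includeRegex == null) {
            return true;
        }
        return includeNames.contains(fieldName)
                || (includeRegex != null && includeRegex.matcher(fieldName).matches());
    }
}
```

For example, new FieldSelectorSketch().includeRegex(".*_txt").excludeName("raw_txt") would select body_txt but skip raw_txt and price.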
> Toolkit of UpdateProcessors for modifying document values
> ---------------------------------------------------------
>
> Key: SOLR-2802
> URL: https://issues.apache.org/jira/browse/SOLR-2802
> Project: Solr
> Issue Type: New Feature
> Reporter: Hoss Man
> Attachments: SOLR-2802_update_processor_toolkit.patch
>
>
> Users frequently ask questions about things where the answer is "you
> could do it with an UpdateProcessor" but the number of out of the box
> UpdateProcessors is generally lacking and there aren't even very good base
> classes for the common case of manipulating field values when adding documents.