[ https://issues.apache.org/jira/browse/SOLR-2823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13126235#comment-13126235 ]
Jan Høydahl commented on SOLR-2823:
-----------------------------------
Hey guys, you're moving fast here :)
Erik, you must have peeked in my ideas book, because what you propose is
exactly what I planned to introduce later, but using Groovy as the DSL :) -
much like Gradle does. I think this could be achieved by making
UpdateProcessorChains pluggable and definable in solrconfig.xml. The
DefaultUpdateProcessorChain could be the simple linear array of processors.
The ScriptedUpdateProcessorChain would be the powerhouse, where you could
define both simple linear chains and chains with complex logic. You could even
do simple document manipulation inline without calling a processor, such as
doc.deleteField("title")...
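To make the split between the two chain flavors concrete, here is a minimal Java sketch. It is only an illustration under assumptions: Doc, Processor, Chain, LinearChain and ScriptedChain are hypothetical stand-ins invented for this example, not Solr's actual classes, and the "langid" step is a placeholder that does no real language detection.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical sketch; none of these types are Solr's actual API.
class ChainSketch {
    /** Stand-in for a Solr input document: field name -> value. */
    static class Doc {
        final Map<String, Object> fields = new LinkedHashMap<>();
    }

    /** One processing step that mutates the document in place. */
    interface Processor { void process(Doc doc); }

    /** A chain is anything that can run a document through some steps. */
    interface Chain { void run(Doc doc); }

    /** "Default" flavor: a fixed, linear list of processors. */
    static class LinearChain implements Chain {
        private final List<Processor> steps;
        LinearChain(List<Processor> steps) { this.steps = new ArrayList<>(steps); }
        public void run(Doc doc) {
            for (Processor p : steps) p.process(doc);
        }
    }

    /** "Scripted" flavor: arbitrary logic decides what happens to the document. */
    static class ScriptedChain implements Chain {
        private final Consumer<Doc> script;
        ScriptedChain(Consumer<Doc> script) { this.script = script; }
        public void run(Doc doc) { script.accept(doc); }
    }

    public static void main(String[] args) {
        // Placeholder processors (assumptions for illustration only).
        Processor langid = d -> d.fields.put("language", "en");
        Processor copyfield = d -> d.fields.put("title_en", d.fields.get("title"));

        Chain linear = new LinearChain(Arrays.asList(langid, copyfield));
        Chain scripted = new ScriptedChain(d -> {
            langid.process(d);
            d.fields.remove("title"); // inline manipulation, no processor involved
        });

        Doc doc = new Doc();
        doc.fields.put("title", "Hello");
        linear.run(doc);
        scripted.run(doc);
        System.out.println(doc.fields);
    }
}
```

Both flavors sit behind the same Chain interface, which is what would let the chain implementation be chosen per configuration.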
This approach also grants another wish of mine, namely being able to define
chains outside of solrconfig.xml. Logically, configuring the schema and document
processing is done by the "content" guys, while configuring solrconfig.xml is
done by the "hardware/operations" guys. Imagine a solr/conf/pipeline.groovy
referenced from solrconfig.xml:
{code:xml}
<updateProcessorChain class="solr.ScriptedUpdateProcessorChainFactory"
                      file="pipeline.groovy" />
{code}
pipeline.groovy:
{code}
chain simple {
  process(langid)
  process(copyfield)
  chain(logAndRun)
}

chain moreComplex {
  process(langid)
  if(doc.getFieldValue("employees") > 10)
    process(copyfield)
  else
    chain(myOtherProcesses)
  doc.deleteField("title")
  chain(logAndRun)
}

chain logAndRun {
  process(log)
  process(run)
}

processor langid {
  class = "solr.LanguageIdentifierUpdateProcessorFactory"
  config("langid.fl", "title,body")
  config("langid.langField", "language")
  config("map", true)
}

processor copyfield {
  script = "copyfield.groovy"
  config("from", "title")
  config("to", "title_en")
}
{code}
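For anyone wondering how the name-based lookups in pipeline.groovy might behave at runtime, here is a rough Java sketch of the moreComplex chain. It is a guess at the semantics, not an implementation: NamedPipelineSketch and its methods are hypothetical, the processor bodies are placeholders, myOtherProcesses is stubbed as an empty chain because it is not defined above, and doc.deleteField("title") is assumed to run on both branches (as it would in brace-less Groovy).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical sketch of name-based processor and chain reuse; not Solr's API.
class NamedPipelineSketch {
    // A document is modeled as a plain mutable field map.
    final Map<String, Consumer<Map<String, Object>>> processors = new HashMap<>();
    final Map<String, Consumer<Map<String, Object>>> chains = new HashMap<>();
    // Names of the processors that ran, in order, so the flow can be inspected.
    final List<String> trace = new ArrayList<>();

    void processor(String name, Consumer<Map<String, Object>> body) { processors.put(name, body); }
    void chain(String name, Consumer<Map<String, Object>> body) { chains.put(name, body); }

    /** Looks up a processor by name and applies it to the document. */
    void process(String name, Map<String, Object> doc) {
        Consumer<Map<String, Object>> p = processors.get(name);
        if (p == null) throw new IllegalArgumentException("unknown processor: " + name);
        trace.add(name);
        p.accept(doc);
    }

    /** Looks up a chain by name and runs it; chains may call other chains. */
    void runChain(String name, Map<String, Object> doc) {
        Consumer<Map<String, Object>> c = chains.get(name);
        if (c == null) throw new IllegalArgumentException("unknown chain: " + name);
        c.accept(doc);
    }

    /** Builds a pipeline mirroring the "moreComplex" chain from the message. */
    static NamedPipelineSketch example() {
        NamedPipelineSketch p = new NamedPipelineSketch();
        p.processor("langid", doc -> doc.put("language", "en")); // placeholder, no detection
        p.processor("copyfield", doc -> doc.put("title_en", doc.get("title")));
        p.processor("log", doc -> {});  // placeholder
        p.processor("run", doc -> {});  // placeholder
        p.chain("logAndRun", doc -> { p.process("log", doc); p.process("run", doc); });
        p.chain("myOtherProcesses", doc -> {}); // stub: not defined in the message
        p.chain("moreComplex", doc -> {
            p.process("langid", doc);
            Object employees = doc.get("employees");
            if (employees instanceof Number && ((Number) employees).intValue() > 10) {
                p.process("copyfield", doc);
            } else {
                p.runChain("myOtherProcesses", doc);
            }
            doc.remove("title");          // inline manipulation, assumed unconditional
            p.runChain("logAndRun", doc); // reuse of a named chain
        });
        return p;
    }

    public static void main(String[] args) {
        NamedPipelineSketch p = example();
        Map<String, Object> doc = new HashMap<>();
        doc.put("title", "Acme");
        doc.put("employees", 50);
        p.runChain("moreComplex", doc);
        System.out.println(p.trace + " " + doc);
    }
}
```

The point of the registry is that logAndRun and each processor are configured once and referenced by name wherever they are needed, which is the re-use this issue asks for.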
I don't know what it takes to code such a thing, but if we had it, I'd never go
back to defining pipelines in XML :)
> Re-use of UpdateProcessor configurations in multiple UpdateChains
> -----------------------------------------------------------------
>
> Key: SOLR-2823
> URL: https://issues.apache.org/jira/browse/SOLR-2823
> Project: Solr
> Issue Type: Improvement
> Components: update
> Reporter: Jan Høydahl
> Priority: Minor
>
> When dealing with multiple UpdateChains and Processors, you frequently need
> to re-use configuration. Two chains may be equal except for one config
> setting in one <processor>.
> I propose to allow named processor configs, which can be referenced by name
> in the chains.