[
https://issues.apache.org/jira/browse/SOLR-9479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andrey Kudryavtsev updated SOLR-9479:
-------------------------------------
Description:
I use DIH to index nested documents.
I have a couple of use cases where field values on the parent document depend on
values in its child documents. The simplest one: I need to "propagate" values from
all child documents into a new field on the parent document. This is a little
tricky with the current DIH architecture, where transformations are applied to
"plain" documents that are treated as plain maps; see
_org.apache.solr.handler.dataimport.Transformer_. (You have to use some kind of
"transient fields" in your data-config; these fields are populated by child
sources, so sometimes you have to read the child sources twice or more for
this. Maybe I am doing it wrong?)
I decided that it might make sense to be able to apply a transformation after
the nested documents have been converted from a collection of maps into a good old
SolrInputDocument.
So this initial patch was created:
* It introduces the concept of a _DocumentTransformer_ with the Java interface
{code:java}SolrInputDocument transform(SolrInputDocument solrDoc, Context context){code}
* This interface should be implemented by client transformers. One simple
example of such a transformer, _PropagationDocumentTransformer_, is implemented.
It is parameterised by two field names, child and parent, and copies values from
the child documents to the parent document.
* A transformer of this kind should be added in data-config.xml to the
corresponding parent _entry_:
{code:xml}documentTransformer="org.apache.solr.handler.dataimport.PropagationDocumentTransformer"{code}
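To illustrate the propagation idea described above, here is a minimal, self-contained sketch. It does not use the Solr classes and is not the code from the attached patch: Doc is a hypothetical stand-in for SolrInputDocument, DocTransformer mirrors the shape of the proposed DocumentTransformer but omits the DIH Context argument, and PropagationSketch and the field names are made up for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for SolrInputDocument: multi-valued fields plus child documents.
final class Doc {
    final Map<String, List<Object>> fields = new LinkedHashMap<>();
    final List<Doc> children = new ArrayList<>();

    // Appends a value to a (possibly new) multi-valued field.
    Doc add(String name, Object value) {
        fields.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
        return this;
    }

    // Attaches a nested child document.
    Doc child(Doc c) {
        children.add(c);
        return this;
    }

    // Returns all values of a field, or an empty list if the field is absent.
    List<Object> values(String name) {
        List<Object> v = fields.get(name);
        return v == null ? Collections.<Object>emptyList() : v;
    }
}

// Assumed shape of a document-level transformer (Context argument left out for brevity).
interface DocTransformer {
    Doc transform(Doc doc);
}

// Copies every value of childField found on the direct children into parentField on the parent.
final class PropagationSketch implements DocTransformer {
    private final String childField;
    private final String parentField;

    PropagationSketch(String childField, String parentField) {
        this.childField = childField;
        this.parentField = parentField;
    }

    @Override
    public Doc transform(Doc parent) {
        for (Doc child : parent.children) {
            for (Object value : child.values(childField)) {
                parent.add(parentField, value);
            }
        }
        return parent;
    }
}
```

Because the transformer runs on the assembled document, the children are already attached to the parent when it is called, so no transient fields or second read of the child sources should be needed in this sketch.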
was:
I use DIH to index nested documents.
I have a couple of use cases where field values on the parent document depend on
values in its child documents. The simplest one: I need to "propagate" values from
all child documents into a new field on the parent document. This is a little
tricky with the current DIH architecture, where transformations are applied to
"plain" documents that are treated as plain maps; see
_org.apache.solr.handler.dataimport.Transformer_. (You have to use some kind of
"transparent fields" in your config; these fields are populated by child
sources, so sometimes you have to read the child sources twice or more for
this. Maybe I am doing it wrong?)
I decided that it might make sense to be able to apply a transformation after
the nested documents have been converted from a collection of maps into a good old
SolrInputDocument.
So this initial patch was created:
* It introduces the concept of a _DocumentTransformer_ with the Java interface
{code:java}SolrInputDocument transform(SolrInputDocument solrDoc, Context context){code}
* This interface should be implemented by client transformers. One simple
example of such a transformer, _PropagationDocumentTransformer_, is implemented.
It is parameterised by two field names, child and parent, and copies values from
the child documents to the parent document.
* A transformer of this kind should be added in data-config.xml to the
corresponding parent _entry_:
{code:xml}documentTransformer="org.apache.solr.handler.dataimport.PropagationDocumentTransformer"{code}
> Extends DIH with transformation on Solr document level
> -------------------------------------------------------
>
> Key: SOLR-9479
> URL: https://issues.apache.org/jira/browse/SOLR-9479
> Project: Solr
> Issue Type: New Feature
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: Andrey Kudryavtsev
> Attachments: SOLR-9479.patch