Shawn, I was going to say the same thing, but... then I was thinking about
SolrCloud and the fact that update processors are invoked before the
document is sent to its target node, so there wouldn't be a reliable way to
tell if the input document field value exists on the target node rather than
the current node.

Or does the update processing only occur on the leader node after being
forwarded from the originating node? Is the doc clear on this detail?

My understanding was that the distributed update processor is near the end
of the chain, so user update processors run before the distribution step.
But is that distribution to the leader, or distribution from the leader to
the replicas for a shard?
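
For concreteness, here is a rough sketch of the kind of script Shawn is describing for StatelessScriptUpdateProcessorFactory. The field name "external_id" and the "seen" map are my own placeholders, not anything from Bram's setup. As far as I understand it, the script engine is set up per request, so this would only catch duplicates among the documents the script has already seen, not values already in the index. Checking the index itself runs into exactly the which-node question above.

```javascript
// Sketch of an update script for StatelessScriptUpdateProcessorFactory.
// "external_id" is a hypothetical field name used only for illustration.
var UNIQUE_FIELD = "external_id";

// Values seen so far. This lives only as long as the script engine does,
// so it does NOT detect duplicates that are already in the index.
var seen = {};

function processAdd(cmd) {
  var doc = cmd.solrDoc; // the SolrInputDocument being added
  var value = doc.getFieldValue(UNIQUE_FIELD);
  if (value === null || value === undefined) {
    return true; // nothing to check, let the document through
  }
  var key = String(value);
  if (Object.prototype.hasOwnProperty.call(seen, key)) {
    return false; // returning false discards the document
  }
  seen[key] = true;
  return true;
}

// The remaining hooks are defined as no-ops, as in the example script
// that ships with Solr.
function processDelete(cmd) {}
function processMergeIndexes(cmd) {}
function processCommit(cmd) {}
function processRollback(cmd) {}
function finish() {}
```

Note that returning false drops the document quietly. If the client needs to see an error for the rejected document, the script would presumably have to throw instead.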


-- Jack Krupansky

On Tue, May 19, 2015 at 9:01 AM, Shawn Heisey <apa...@elyograg.org> wrote:

> On 5/19/2015 3:02 AM, Bram Van Dam wrote:
> > I'm looking for a way to have Solr reject documents if a certain field
> > value is duplicated (reject, not overwrite). There doesn't seem to be
> > any kind of unique option in schema fields.
> >
> > The de-duplication feature seems to make this (somewhat) possible, but I
> > would like to provide the unique value myself, without having the
> > deduplicator create a hash of field values.
> >
> > Am I missing an obvious (or less obvious) way of accomplishing this?
>
> Write a custom update processor and include it in your update chain.
> You will then have the ability to do anything you want with the entire
> input document before it hits the code to actually do the indexing.
>
> A script update processor included with Solr allows you to write your
> processor in a language other than Java, such as JavaScript.
>
>
> https://lucene.apache.org/solr/5_1_0/solr-core/org/apache/solr/update/processor/StatelessScriptUpdateProcessorFactory.html
>
> Here's how to discard a document in an update processor written in Java:
>
>
> http://stackoverflow.com/questions/27108200/how-to-cancel-indexing-of-a-solr-document-using-update-request-processor
>
> The javadoc that I linked above describes the ability to return "false"
> in other languages to discard the document.
>
> Thanks,
> Shawn
>
>
