Actually, you are right. It would be executed on every node if you put
LangDetect after a deliberately inserted
DistributedUpdateProcessorFactory entry.

Not optimal, but would work.
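
For illustration, the reordered chain might look roughly like the sketch below. It is untested and reuses the chain name and langid settings from the original chain quoted further down. The only change is the explicit solr.DistributedUpdateProcessorFactory entry placed ahead of the language detector. Whether the detector then sees the fully expanded document still depends on the other fields being stored, as Alexandre notes.

```xml
<!-- Sketch only: same settings as the original aa_chain, with the distributed
     processor declared explicitly so that language detection runs after it. -->
<updateRequestProcessorChain name="aa_chain">
  <!-- Atomic updates are expanded into full documents at this step -->
  <processor class="solr.DistributedUpdateProcessorFactory" />
  <!-- Everything from here on runs on each node that receives the document -->
  <processor class="org.apache.solr.update.processor.LangDetectLanguageIdentifierUpdateProcessorFactory">
    <str name="langid.fl">title,content,text</str>
    <str name="langid.langField">language_t</str>
    <str name="langid.langsField">language_all_t</str>
    <str name="langid.fallback">generic</str>
    <str name="langid.overwrite">false</str>
    <str name="langid.threshold">0.8</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
```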

Upayavira

On Tue, Nov 3, 2015, at 12:26 PM, Alexandre Rafalovitch wrote:
> I wonder what would happen if the DistributedUpdateProcessorFactory is
> manually added into the chain and the LangDetect definition is moved
> AFTER it. As per
> https://wiki.apache.org/solr/UpdateRequestProcessor#Distributed_Updates
> 
> This would mean that the detection code would be executed on each
> node, but with the record expanded to include those other fields
> (assuming they were stored). This may do the trick, though a custom
> URP would probably be a better solution anyway.
> ----
> Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
> http://www.solr-start.com/
> 
> 
> On 3 November 2015 at 05:13, Upayavira <u...@odoko.co.uk> wrote:
> > Looking at the code, this is not going to work without modifications to
> > Solr (or at least a custom component).
> >
> > The atomic update code is closely embedded into the Solr
> > DistributedUpdateProcessor, which expands the atomic update into a full
> > document and then posts it to the shards.
> >
> > You need to do the update expansion before your lang detect processor,
> > but there is no gap between them.
> >
> > From my reading of the code, you could create an AtomicUpdateProcessor
> > that simply expands updates, and insert that before the
> > LangDetectUpdateProcessor.
> >
> > Upayavira
> >
> > On Tue, Nov 3, 2015, at 06:38 AM, Chaushu, Shani wrote:
> >> Hi
> >> When I make an atomic update (set field), whether on the content field
> >> or on another field, the language field becomes generic. That is,
> >> detection doesn’t work on a set-field update, only on the first insert.
> >> Even if the language was detected the first time, it becomes generic
> >> after the update. Any idea?
> >>
> >> The chain is
> >>
> >> <updateRequestProcessorChain name="aa_chain">
> >>   <processor
> >>       class="org.apache.solr.update.processor.LangDetectLanguageIdentifierUpdateProcessorFactory">
> >>     <str name="langid.fl">title,content,text</str>
> >>     <str name="langid.langField">language_t</str>
> >>     <str name="langid.langsField">language_all_t</str>
> >>     <str name="langid.fallback">generic</str>
> >>     <str name="langid.overwrite">false</str>
> >>     <str name="langid.threshold">0.8</str>
> >>   </processor>
> >>   <processor class="solr.LogUpdateProcessorFactory" />
> >>   <processor class="solr.RunUpdateProcessorFactory" />
> >> </updateRequestProcessorChain>
> >>
> >>
> >> Thanks,
> >> Shani
> >>
> >>
> >>
> >>
> >> -----Original Message-----
> >> From: Jack Krupansky [mailto:jack.krupan...@gmail.com]
> >> Sent: Thursday, October 29, 2015 17:04
> >> To: solr-user@lucene.apache.org
> >> Subject: Re: language plugin
> >>
> >> Are you trying to do an atomic update without the content field? If so,
> >> it sounds like Solr needs an enhancement (bug fix?) so that language
> >> detection would be skipped if the input field is not present. Or maybe
> >> that could be an option.
> >>
> >>
> >> -- Jack Krupansky
> >>
> >> On Thu, Oct 29, 2015 at 3:25 AM, Chaushu, Shani <shani.chau...@intel.com>
> >> wrote:
> >>
> >> > Hi,
> >> > I'm using the Solr language detection plugin on a field named "content"
> >> > (Solr 4.10, plugin LangDetectLanguageIdentifierUpdateProcessorFactory).
> >> > When I index a document for the first time it works fine, but if I
> >> > set one field again (regardless of whether it's the content field or
> >> > not), the language goes to its default. If I'm setting another field,
> >> > I would like the language to stay the way it was before, and I don't
> >> > want to insert all the content again. Is there an option to configure
> >> > the plugin so that it won't recalculate the language? (Setting
> >> > langid.overwrite to false didn't work.)
> >> >
> >> > Thanks,
> >> > Shani
> >> >
> >> >
> >> > ---------------------------------------------------------------------
> >> > Intel Electronics Ltd.
> >> >
> >> > This e-mail and any attachments may contain confidential material for
> >> > the sole use of the intended recipient(s). Any review or distribution
> >> > by others is strictly prohibited. If you are not the intended
> >> > recipient, please contact the sender and delete all copies.
> >> >
