I ended up in that part of the code while debugging a crawling job that had 
stopped because of an exception. One document had a null value for a specific 
metadata field, and another had a value that triggered a request parsing issue 
on the Solr side. 
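
To make these two cases concrete, here is a rough Java sketch of the behavior 
I am hoping for. It is not the HttpPoster code: the class, the two exception 
types and the fake indexer are all hypothetical stand-ins, and it assumes that 
"Solr down" conditions can be told apart from document-specific rejections.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these types exist in ManifoldCF or SolrJ.
final class SkipOnDocumentErrorSketch {

    /** Stand-in for a "Solr is down" condition, which should still interrupt the job. */
    static final class ServiceDownException extends Exception {
        ServiceDownException(String message) { super(message); }
    }

    /** Stand-in for a rejection caused by one specific document. */
    static final class DocumentRejectedException extends Exception {
        DocumentRejectedException(String message) { super(message); }
    }

    /** Stand-in for the component that sends one document to the index. */
    interface Indexer {
        void index(String documentId, String metadataValue)
                throws ServiceDownException, DocumentRejectedException;
    }

    /**
     * Indexes every document and returns the ids that were skipped.
     * A document-specific rejection skips that document only;
     * a service-down condition is not caught and propagates to the caller.
     */
    static List<String> indexAll(Indexer indexer, Map<String, String> documents)
            throws ServiceDownException {
        List<String> skipped = new ArrayList<>();
        for (Map.Entry<String, String> doc : documents.entrySet()) {
            try {
                indexer.index(doc.getKey(), doc.getValue());
            } catch (DocumentRejectedException e) {
                // Record the document and carry on instead of failing the job.
                skipped.add(doc.getKey());
            }
        }
        return skipped;
    }

    /** Runs the sketch against a fake indexer that mimics the two failures I hit. */
    static List<String> demo() {
        Indexer fake = (id, value) -> {
            if (value == null) {
                throw new DocumentRejectedException(id + ": null metadata value");
            }
            if (value.startsWith("BAD")) {
                throw new DocumentRejectedException(id + ": request parsing issue");
            }
        };
        Map<String, String> documents = new LinkedHashMap<>();
        documents.put("doc-ok", "fine");
        documents.put("doc-null", null);
        documents.put("doc-bad", "BAD value");
        try {
            return indexAll(fake, documents);
        } catch (ServiceDownException e) {
            throw new IllegalStateException("the fake indexer never reports an outage", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Skipped: " + demo());
    }
}
```

The point of the split is that only the document-specific exception is caught 
inside the loop, so a real outage would still stop or pause the job as today.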

Julien

-----Original Message-----
From: Karl Wright <daddy...@gmail.com> 
Sent: Tuesday, July 13, 2021 15:48
To: dev <dev@manifoldcf.apache.org>
Subject: Re: Solr output connector - behavior on some exceptions

If the "solr is down" exceptions are indeed caught upstream, I'm tentatively in 
agreement that this fallback logic can be changed.  But I would like to 
understand what specifically you are seeing this happen for.
What cases are you hoping to improve?

Karl


On Tue, Jul 13, 2021 at 9:39 AM <julien.massi...@francelabs.com> wrote:

> Hi,
>
>
>
> I would like to change the behavior of the Solr output connector 
> concerning two exception handling cases:
>
>
>
>    1. In the current « handleIOException » method of the HttpPoster
>    class, the « unknown » case looks like this :
>
>
>
>    As the comment says, we don’t know the type of IOException, so it is
>    not necessary to make the ServiceInterruption fail after a period,
>    especially since all « Solr down » exceptions have been handled
>    upstream.
>
>    2. The current « handleSolrServerException » method of the HttpPoster
>    class. Same as above, this method is called for an unknown exception that
>    cannot be related to a « Solr down » issue; it can only be related to some
>    misconfiguration or document-specific issue. It is therefore not necessary
>    to throw a ManifoldCFException that will stop the job with a failure
>    state.
>
>
>
>
>
> What do you think? If you agree with me, I can create a ticket for 
> that and submit a patch. This would allow the job to keep running 
> gracefully while properly skipping the documents that trigger these 
> exceptions.
>
>
>
>
>
> Regards,
> Julien
>
>
>
>
