I ended up in that part of the code while debugging a crawling job that had stopped on an exception. One document had a null value for a specific metadata field, and another had a value that triggered a request parsing error on the Solr side.
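To illustrate the skip-instead-of-abort behavior under discussion, here is a minimal sketch. It is not the actual HttpPoster code: `DocumentRejectedException`, `DocumentSender` and `SkippingIndexer` are hypothetical stand-ins invented for this example, and the assumption is that document-specific failures can be told apart from connectivity failures.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a document-specific rejection (for example a
// null metadata value, or a value the server cannot parse). This is not a
// ManifoldCF or SolrJ class.
class DocumentRejectedException extends Exception {
    DocumentRejectedException(String message) {
        super(message);
    }
}

// Hypothetical stand-in for the component that sends one document to the index.
interface DocumentSender {
    void send(String docId, String metadataValue)
        throws DocumentRejectedException, IOException;
}

class SkippingIndexer {
    // Sends every document. Document-specific rejections are recorded and
    // skipped, while an IOException (e.g. the server being unreachable)
    // still propagates so the caller can retry or stop.
    static List<String> indexAll(Map<String, String> docs, DocumentSender sender)
            throws IOException {
        List<String> skipped = new ArrayList<>();
        for (Map.Entry<String, String> doc : docs.entrySet()) {
            try {
                sender.send(doc.getKey(), doc.getValue());
            } catch (DocumentRejectedException e) {
                // One bad document should not fail the whole job.
                skipped.add(doc.getKey());
            }
        }
        return skipped;
    }
}
```

With this shape, a document carrying a null metadata value ends up in the returned list of skipped ids and the remaining documents are still processed.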
Julien

-----Original Message-----
From: Karl Wright <daddy...@gmail.com>
Sent: Tuesday, July 13, 2021 15:48
To: dev <dev@manifoldcf.apache.org>
Subject: Re: Solr output connector - behavior on some exceptions

If the "solr is down" exceptions are indeed caught upstream, I'm tentatively in agreement that this fallback logic can be changed. But I would like to understand what specifically you are seeing this happen for. What cases are you hoping to improve?

Karl

On Tue, Jul 13, 2021 at 9:39 AM <julien.massi...@francelabs.com> wrote:

> Hi,
>
> I would like to change the behavior of the Solr output connector
> in two exception handling cases:
>
> 1. In the current "handleIOException" method of the HttpPoster
> class, the "unknown" case looks like this:
>
> As the comment says, we don't know the type of IOException, so it is
> not necessary to make the ServiceInterruption fail after a period,
> especially since all "Solr down" exceptions have been handled
> upstream.
>
> 2. The current "handleSolrServerException" method of the HttpPoster
> class. Same as above, this method is called for an unknown exception that
> cannot be related to a "Solr down" issue; it can only be related to some
> misconfiguration or document-specific issue. It is therefore not necessary
> to throw a ManifoldCFException that will stop the job with a
> failure state.
>
> What do you think? If you agree with me, I can create a ticket for
> that and submit a patch. This would allow the job to keep running
> gracefully while properly skipping identified exceptions.
>
> Regards,
> Julien