Maunel Bach created CONNECTORS-1720:
---------------------------------------

             Summary: Fatal Error due to NPE in RepriorizationTracker
                 Key: CONNECTORS-1720
                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1720
             Project: ManifoldCF
          Issue Type: Bug
    Affects Versions: ManifoldCF 2.22
            Reporter: Maunel Bach


A couple of times a week, our ManifoldCF instance logged a stack trace like this:
{code:bash}
2022-07-06T14:04:39,297 FATAL [Set priority thread] org.apache.manifoldcf.crawlerthreads: Error tossed: null
java.lang.NullPointerException: null
        at org.apache.manifoldcf.crawler.reprioritizationtracker.ReprioritizationTracker$PreloadKey.hashCode(ReprioritizationTracker.java:460) ~[mcf-pull-agent.jar:?]
        at java.util.HashMap.hash(HashMap.java:339) ~[?:?]
        at java.util.HashMap.get(HashMap.java:552) ~[?:?]
        at org.apache.manifoldcf.crawler.reprioritizationtracker.ReprioritizationTracker.addPreloadRequest(ReprioritizationTracker.java:234) ~[mcf-pull-agent.jar:?]
        at org.apache.manifoldcf.crawler.system.PriorityCalculator.makePreloadRequest(PriorityCalculator.java:123) ~[mcf-pull-agent.jar:?]
        at org.apache.manifoldcf.crawler.system.ManifoldCF.writeDocumentPriorities(ManifoldCF.java:1247) ~[mcf-pull-agent.jar:?]
        at org.apache.manifoldcf.crawler.system.SetPriorityThread.run(SetPriorityThread.java:141) ~[mcf-pull-agent.jar:?]{code}

We tracked it down to a {{hashCode()}} implementation that is not null-safe. 
It belongs to the private {{PreloadKey}} class (the class named in the stack 
trace) and consumes the first entry of the {{binNames}} array. The contents of 
that array are an implementation detail of a connector base class. The array 
might contain {{null}} values, at least for the {{JiraRepositoryConnector}} 
and, more critically, for the {{WebcrawlerConnector}}.
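
To illustrate the failure mode, here is a minimal sketch of how a key class 
with an unguarded {{hashCode()}} makes {{HashMap.get}} throw. It is not the 
ManifoldCF source: {{UnsafeKey}}, {{UnsafeKeyDemo}}, the field names and the 
connector string are all hypothetical stand-ins.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the key class; names are assumptions for illustration.
final class UnsafeKey {
    private final String connectorClass;
    private final String binName;

    UnsafeKey(String connectorClass, String binName) {
        this.connectorClass = connectorClass;
        this.binName = binName;
    }

    @Override
    public int hashCode() {
        // Dereferences both fields unconditionally, so a null field throws here.
        return connectorClass.hashCode() + binName.hashCode();
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof UnsafeKey)) {
            return false;
        }
        UnsafeKey other = (UnsafeKey) o;
        return connectorClass.equals(other.connectorClass)
            && binName.equals(other.binName);
    }
}

final class UnsafeKeyDemo {
    // Returns true if a map lookup with the given bin name throws an NPE.
    static boolean lookupThrows(String binName) {
        Map<UnsafeKey, Double> requests = new HashMap<>();
        try {
            // HashMap.get hashes the key first, which calls UnsafeKey.hashCode().
            requests.get(new UnsafeKey("SomeConnector", binName));
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }
}
```

In this sketch the exception is raised inside {{hashCode()}} while the map 
hashes the key, which matches the order of frames in the stack trace above.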

As a workaround, we fixed the {{hashCode()}} implementation in our own code 
base, which did the trick. We should consider fixing this bug in the Apache 
repository as well. Of course, this fix does nothing to improve how the 
{{binName}} itself is extracted and used.
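
For reference, one way a null-safe key could look is sketched below. This is 
not our actual patch and not the ManifoldCF source: the class name 
{{NullSafeKey}} and its fields are hypothetical. It only shows the general 
technique of delegating to {{java.util.Objects}}, which tolerates {{null}}.

```java
import java.util.Objects;

// Hypothetical null-safe key; class and field names are assumptions.
final class NullSafeKey {
    private final String connectorClass;
    private final String binName;

    NullSafeKey(String connectorClass, String binName) {
        this.connectorClass = connectorClass;
        this.binName = binName;
    }

    @Override
    public int hashCode() {
        // Objects.hash treats a null element as 0 instead of throwing.
        return Objects.hash(connectorClass, binName);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof NullSafeKey)) {
            return false;
        }
        NullSafeKey other = (NullSafeKey) o;
        // Objects.equals returns true when both arguments are null.
        return Objects.equals(connectorClass, other.connectorClass)
            && Objects.equals(binName, other.binName);
    }
}
```

With a key like this, two requests that both carry a {{null}} bin name are 
treated as equal and land in the same map entry. Whether that grouping is the 
desired behaviour is a separate question from avoiding the exception.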

We stumbled over this bug because the jobs on our ManifoldCF instance got 
stuck. We cannot be sure that this larger incident is related to the fatal 
error described here, because it was not the only fatal bug we fixed to get 
ManifoldCF up and running again. The "reprioritization" topic does hint at a 
connection, though.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
