[ https://issues.apache.org/jira/browse/NUTCH-2585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16832568#comment-16832568 ]
Sebastian Nagel commented on NUTCH-2585:
----------------------------------------

A PR including the fix is open: [#452|https://github.com/apache/nutch/pull/452]

I've decided to move the unsafe code block into a synchronized method. Because TrieStringMatcher allows matching and adding strings to be mixed, the lazy conversion of nodes is mandatory. The impact on matching performance should be negligible because the synchronized method is only called on demand, when the node has not already been prepared for matching.

> NPE in TrieStringMatcher
> ------------------------
>
>                 Key: NUTCH-2585
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2585
>             Project: Nutch
>          Issue Type: Bug
>    Affects Versions: 1.14
>            Reporter: Markus Jelsma
>            Priority: Major
>             Fix For: 1.16
>
>
> Stumbled on this one just now:
> {code}
> 2018-05-25 14:29:31,844 INFO [FetcherThread] org.apache.nutch.fetcher.FetcherThread: FetcherThread 42 fetch of http://www.ndcmediagroep.nl/wp-content/uploads/2017/03/Leaflet-Noflik-Wenje.pdf failed with: java.lang.NullPointerException
> 	at org.apache.nutch.util.TrieStringMatcher$TrieNode.getChild(TrieStringMatcher.java:107)
> 	at org.apache.nutch.util.SuffixStringMatcher.shortestMatch(SuffixStringMatcher.java:74)
> 	at org.apache.nutch.urlfilter.suffix.SuffixURLFilter.filter(SuffixURLFilter.java:164)
> 	at org.apache.nutch.net.URLFilters.filter(URLFilters.java:43)
> 	at org.apache.nutch.fetcher.FetcherThread.handleRedirect(FetcherThread.java:487)
> 	at org.apache.nutch.fetcher.FetcherThread.run(FetcherThread.java:404)
> {code}
> Edit - added on 1 May 2019: I got a slightly different stack trace, this time using PrefixURLFilter:
> {code}
> 2019-05-01 08:50:07,282 INFO [FetcherThread] org.apache.nutch.fetcher.FetcherThread: FetcherThread 38 fetch of https://kanaalstreek.nl/fzh/2018/06/04/vijf-maal-goud-voor-pegasus-op-nk failed with: java.lang.NullPointerException
> 	at org.apache.nutch.util.TrieStringMatcher$TrieNode.getChild(TrieStringMatcher.java:107)
> 	at org.apache.nutch.util.PrefixStringMatcher.shortestMatch(PrefixStringMatcher.java:79)
> 	at org.apache.nutch.urlfilter.prefix.PrefixURLFilter.filter(PrefixURLFilter.java:73)
> 	at org.apache.nutch.net.URLFilters.filter(URLFilters.java:43)
> 	at org.apache.nutch.fetcher.FetcherThread.handleRedirect(FetcherThread.java:487)
> 	at org.apache.nutch.fetcher.FetcherThread.run(FetcherThread.java:404)
> {code}
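
For readers following the comment at the top of this message, the approach it describes (lazy conversion of a node's children, guarded by a synchronized method that runs only when the node is not yet prepared for matching) might look roughly like the sketch below. This is an illustration under assumptions, not the code from the PR: the class `LazyTrieNode`, its fields, and the methods `addChild` and `prepareChildren` are hypothetical names, and the `volatile` field is a choice made for this sketch only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified node type for illustration only;
// it is not the TrieNode class from org.apache.nutch.util.TrieStringMatcher.
class LazyTrieNode implements Comparable<LazyTrieNode> {
  final char nodeChar;

  // While strings are being added, children live in an unsorted list.
  private List<LazyTrieNode> childrenList = new ArrayList<>();
  // For matching, children are converted once into a sorted array.
  // volatile (an assumption of this sketch): a reader that sees a
  // non-null array also sees its fully sorted contents.
  private volatile LazyTrieNode[] children;

  LazyTrieNode(char nodeChar) {
    this.nodeChar = nodeChar;
  }

  // Adding switches the node back to "list mode" if it was already converted.
  synchronized LazyTrieNode addChild(char c) {
    if (childrenList == null) {
      childrenList = new ArrayList<>(Arrays.asList(children));
      children = null;
    }
    for (LazyTrieNode n : childrenList) {
      if (n.nodeChar == c) {
        return n;
      }
    }
    LazyTrieNode n = new LazyTrieNode(c);
    childrenList.add(n);
    return n;
  }

  // The check-then-convert block, guarded by the node's lock.
  private synchronized LazyTrieNode[] prepareChildren() {
    if (children == null) { // re-check under the lock
      LazyTrieNode[] sorted = childrenList.toArray(new LazyTrieNode[0]);
      Arrays.sort(sorted);
      children = sorted; // publish only after sorting
      childrenList = null;
    }
    return children;
  }

  // Fast path takes no lock; the synchronized method runs only on demand.
  LazyTrieNode getChild(char c) {
    LazyTrieNode[] snapshot = children;
    if (snapshot == null) {
      snapshot = prepareChildren();
    }
    // Binary search over the sorted snapshot.
    int lo = 0;
    int hi = snapshot.length - 1;
    while (lo <= hi) {
      int mid = (lo + hi) >>> 1;
      char m = snapshot[mid].nodeChar;
      if (m == c) {
        return snapshot[mid];
      }
      if (m < c) {
        lo = mid + 1;
      } else {
        hi = mid - 1;
      }
    }
    return null;
  }

  @Override
  public int compareTo(LazyTrieNode other) {
    return Character.compare(nodeChar, other.nodeChar);
  }
}
```

In this sketch, an unguarded version of the conversion could let one thread set the array and null the list while a second thread, having just seen a null array, dereferences the list. That would be one plausible way to get an NPE in `getChild` like the one reported, though the sketch does not claim to reproduce the exact Nutch code path. Working on a local snapshot of the array keeps lookups lock-free once a node has been prepared.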