NullPointerException in piggybank.evaluation.util.apachelogparser.SearchTermExtractor
-------------------------------------------------------------------------------------

Key: PIG-2110
URL: https://issues.apache.org/jira/browse/PIG-2110
Project: Pig
Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Michael Brauwerman

When processing a large log file, I get an exception in SearchTermExtractor.exec.

I don't have a specific log line that reproduces it yet, but I assume the error occurs when the input URL is null, or perhaps when it just has no query string.

I think a fix would be to add a guard after creating queryString:

    String queryString = urlObject.getQuery();
    if (queryString == null) {
        return null;
    }

Stack Trace:
<code>
Caused by: java.io.IOException: Caught exception processing input row
    at org.apache.pig.piggybank.evaluation.util.apachelogparser.SearchTermExtractor.exec(SearchTermExtractor.java:195)
    at org.apache.pig.piggybank.evaluation.util.apachelogparser.SearchTermExtractor.exec(SearchTermExtractor.java:64)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:229)
Caused by: java.lang.NullPointerException
    at java.util.regex.Matcher.getTextLength(Matcher.java:1140)
    at java.util.regex.Matcher.reset(Matcher.java:291)
    at java.util.regex.Matcher.reset(Matcher.java:311)
    at org.apache.pig.piggybank.evaluation.util.apachelogparser.SearchTermExtractor.exec(SearchTermExtractor.java:170)
</code>

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
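
For illustration, the guard proposed in the description might look like the following standalone sketch. It is not the actual SearchTermExtractor code: the class name QueryGuardSketch, the method extractSearchTerm, and the "q" parameter pattern are all hypothetical, and it assumes urlObject is a java.net.URL, whose getQuery() returns null when the URL has no query part. Passing that null to a Matcher would be consistent with the NullPointerException in the stack trace.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for the UDF logic; not the real piggybank class.
final class QueryGuardSketch {

    // Assumed pattern for illustration only: captures the value of a "q" parameter.
    private static final Pattern Q_PARAM = Pattern.compile("(?:^|&)q=([^&]*)");

    // Returns the raw value of "q", or null when the input cannot yield one.
    static String extractSearchTerm(String url) {
        if (url == null) {
            return null; // null input row
        }
        final URL urlObject;
        try {
            urlObject = new URL(url);
        } catch (MalformedURLException e) {
            return null; // unparseable URL
        }
        String queryString = urlObject.getQuery();
        if (queryString == null) {
            return null; // the proposed guard: URL has no query string
        }
        Matcher matcher = Q_PARAM.matcher(queryString);
        return matcher.find() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(extractSearchTerm("http://example.com/search?q=pig"));
        System.out.println(extractSearchTerm("http://example.com/path"));
        System.out.println(extractSearchTerm(null));
    }
}
```

With the guard in place, a URL without a query string yields null instead of reaching the Matcher. Whether returning null is the right result for the real UDF is a design decision for the maintainers.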