[
https://issues.apache.org/jira/browse/HIVE-3301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13422787#comment-13422787
]
Zhenxiao Luo commented on HIVE-3301:
------------------------------------
The problem is:
In hadoop23, TaskLogServlet.java uses a new utility, HtmlQuoting.java, to
print the task log.
In TaskLogServlet.java, printTaskLog() function:
    result = taskLogReader.read(b);
    if (result > 0) {
      if (plainText) {
        out.write(b, 0, result);
      } else {
        HtmlQuoting.quoteHtmlChars(out, b, 0, result);
      }
    } else {
      break;
    }
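To illustrate what HTML quoting does to a task log line, here is a minimal sketch. The class and method names are hypothetical, and the method is a simplified string-based stand-in, not Hadoop's HtmlQuoting (which writes bytes to an output stream). The set of escaped characters is an assumption based on the usual HTML entities.

```java
public class HtmlQuoteSketch {
    // Hypothetical stand-in: escapes the characters that are special in HTML.
    public static String quote(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '&':  sb.append("&amp;");  break;
                case '<':  sb.append("&lt;");   break;
                case '>':  sb.append("&gt;");   break;
                case '\'': sb.append("&apos;"); break;
                case '"':  sb.append("&quot;"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // A made-up log line containing double quotes.
        String line = "processing row {\"key\":\"238\"}";
        // prints: processing row {&quot;key&quot;:&quot;238&quot;}
        System.out.println(quote(line));
    }
}
```

Any consumer that reads the quoted output as if it were raw text will see the entities rather than the original characters.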
In hadoop20, by contrast, TaskLogServlet.java uses its own utility (there is no
HtmlQuoting.java at all) to print the task log.
In TaskLogServlet.java, printTaskLog() function:
    result = taskLogReader.read(b);
    if (result > 0) {
      if (plainText) {
        out.write(b, 0, result);
      } else {
        quotedWrite(out, b, 0, result);
      }
    } else {
      break;
    }
And in Hive, TaskLogProcessor.java generates stack traces by reading the raw
taskAttemptLog.
In ql/src/java/org/apache/hadoop/hive/ql/exec/errors/TaskLogProcessor.java,
getStackTraces() function:
    List<String> stackTrace = null;
    // Patterns that match the middle/end of stack traces
    Pattern stackTracePattern = Pattern.compile("^\tat .*",
        Pattern.CASE_INSENSITIVE);
    Pattern endStackTracePattern =
        Pattern.compile("^\t... [0-9]+ more.*", Pattern.CASE_INSENSITIVE);
    while ((inputLine = in.readLine()) != null) {
      if (stackTracePattern.matcher(inputLine).matches() ||
          endStackTracePattern.matcher(inputLine).matches()) {
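As a self-contained illustration of the line matching in the excerpt above, the following sketch applies the same two patterns to a made-up log. The class name, method name, and sample log are hypothetical; only the regular expressions are taken from the excerpt.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class StackTraceLineDemo {
    // Same regular expressions as in the excerpt above.
    static final Pattern STACK_TRACE =
        Pattern.compile("^\tat .*", Pattern.CASE_INSENSITIVE);
    static final Pattern END_STACK_TRACE =
        Pattern.compile("^\t... [0-9]+ more.*", Pattern.CASE_INSENSITIVE);

    // Returns only the lines that look like stack frames or a "... N more" trailer.
    public static List<String> stackTraceLines(String log) {
        List<String> matched = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(new StringReader(log))) {
            String inputLine;
            while ((inputLine = in.readLine()) != null) {
                if (STACK_TRACE.matcher(inputLine).matches()
                        || END_STACK_TRACE.matcher(inputLine).matches()) {
                    matched.add(inputLine);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return matched;
    }

    public static void main(String[] args) {
        // Made-up log: one exception header, one frame, one trailer, one unrelated line.
        String log = "java.lang.RuntimeException: made-up failure\n"
            + "\tat com.example.Foo.bar(Foo.java:42)\n"
            + "\t... 3 more\n"
            + "some unrelated log line\n";
        for (String line : stackTraceLines(log)) {
            System.out.println(line);
        }
    }
}
```

Both patterns are anchored on a leading tab, so the exception header and the unrelated line are skipped. The matched lines are kept verbatim, which means any quoting present in the log text is carried into the collected stack trace.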
For Hive to work with both hadoop20 and hadoop23, TaskLogProcessor should use a
different mechanism for each version when parsing the TaskAttemptLog.
My plan is to create a shim that has different implementations for hadoop20
and hadoop23.
The hadoop23 implementation would use HtmlQuoting.unquoteHtmlChars() to parse
the TaskAttemptLog.
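A rough sketch of how such a shim could be shaped is given below. The interface, class, and method names are hypothetical and are not Hive's actual shim API. Treating the hadoop20 variant as a pass-through is an assumption, and the hadoop23 variant uses a simple string-replace stand-in where a real implementation would presumably delegate to HtmlQuoting.unquoteHtmlChars().

```java
public class TaskLogShimSketch {
    /** Hypothetical shim interface; not Hive's actual API. */
    public interface TaskLogLineShim {
        String unquote(String line);
    }

    /** hadoop20 flavor: assumed here to use the log line unchanged. */
    public static final class Hadoop20Shim implements TaskLogLineShim {
        @Override
        public String unquote(String line) {
            return line;
        }
    }

    /**
     * hadoop23 flavor: reverses HTML quoting. This is a stand-in; a real
     * implementation would presumably call HtmlQuoting.unquoteHtmlChars(line).
     */
    public static final class Hadoop23Shim implements TaskLogLineShim {
        @Override
        public String unquote(String line) {
            return line.replace("&lt;", "<")
                       .replace("&gt;", ">")
                       .replace("&quot;", "\"")
                       .replace("&apos;", "'")
                       // "&amp;" goes last so that "&amp;lt;" becomes "&lt;", not "<".
                       .replace("&amp;", "&");
        }
    }

    public static void main(String[] args) {
        TaskLogLineShim shim = new Hadoop23Shim();
        // Made-up quoted log line.
        System.out.println(shim.unquote("processing row {&quot;key&quot;:&quot;238&quot;}"));
    }
}
```

With this shape, TaskLogProcessor could pass each line through the shim before applying its stack trace patterns, so the parsing code itself would stay the same for both Hadoop versions.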
> Fix quote printing bug in mapreduce_stack_trace.q testcase failure when
> running hive on hadoop23
> ------------------------------------------------------------------------------------------------
>
> Key: HIVE-3301
> URL: https://issues.apache.org/jira/browse/HIVE-3301
> Project: Hive
> Issue Type: Bug
> Reporter: Zhenxiao Luo
> Assignee: Zhenxiao Luo
>
> When running Hive on hadoop0.23, mapreduce_stack_trace.q fails due to a
> quote printing bug:
> a quote is printed as '&quot;' instead of '"'.
> This is hard to state clearly in HTML:
> the quote is printed as ampersand + 'quot' + semicolon,
> not as the expected quote character.