[
https://issues.apache.org/jira/browse/HIVE-2156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038842#comment-13038842
]
[email protected] commented on HIVE-2156:
-----------------------------------------------------
bq. On 2011-05-24 20:49:24, Ning Zhang wrote:
bq. > ql/src/java/org/apache/hadoop/hive/ql/exec/JobDebugger.java, line 110
bq. > <https://reviews.apache.org/r/777/diff/2/?file=19557#file19557line110>
bq. >
bq. > Do you have some numbers on how long it takes to get all the
bq. > TaskCompletionEvents? There are cases where a job may have more than 10k
bq. > tasks and all of them fail with the same error.
bq. >
bq. > If it takes too long, you may want to consider adding a threshold on
bq. > the time spent getting all the TaskCompletionEvents.
I have only tested it on some of the queries in the NegativeCliDriver tests,
where it usually takes less than 10s in miniMR cluster mode. HadoopJobExecHelper
enforces a coarse timeout (default 5 minutes, configurable in
HiveConf.ConfVars.JOB_DEBUG_TIMEOUT) on getting all TaskCompletionEvents before
we stop. It would make sense to time out the fetching of TaskCompletionEvents
specifically and then print out the information obtained so far. What this
patch does instead is throw away the TaskCompletionEvents gathered so far and
return the "could not obtain debugging info" message. Does that sound
reasonable, or do you think the coarse timeout would be sufficient?
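
To make the narrower timeout concrete, here is a minimal sketch of the idea. It is not the code in this patch. The class PartialEventFetch, its Result type and the batchSource parameter are hypothetical names, and events are plain strings rather than Hadoop objects. batchSource stands in for whatever call returns the next batch of completion events from a given offset.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntFunction;

// Hypothetical sketch: fetch events in batches under a deadline and keep
// whatever was gathered if the deadline passes. Not taken from JobDebugger.java.
class PartialEventFetch {

  /** Events gathered so far, plus a flag saying whether the deadline cut the fetch short. */
  static final class Result {
    final List<String> events;
    final boolean timedOut;

    Result(List<String> events, boolean timedOut) {
      this.events = events;
      this.timedOut = timedOut;
    }
  }

  /**
   * Asks batchSource for the batch starting at the current offset until it
   * returns an empty batch (no more events) or the time budget is spent.
   */
  static Result fetch(IntFunction<List<String>> batchSource, long timeoutMillis) {
    long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
    List<String> events = new ArrayList<>();
    while (true) {
      // The deadline is only checked between batches.
      if (System.nanoTime() - deadline >= 0) {
        return new Result(events, true); // partial results are kept, not discarded
      }
      List<String> batch = batchSource.apply(events.size());
      if (batch == null || batch.isEmpty()) {
        return new Result(events, false); // all events were fetched in time
      }
      events.addAll(batch);
    }
  }
}
```

A caller could print debugging info from result.events in both cases and add a note that the list may be incomplete when result.timedOut is true. Because the deadline is checked only between batches, a single slow call to batchSource can still overrun the budget, so the coarse timeout would remain useful as a backstop.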
bq. On 2011-05-24 20:49:24, Ning Zhang wrote:
bq. > ql/src/java/org/apache/hadoop/hive/ql/exec/HadoopJobExecHelper.java, line 571
bq. > <https://reviews.apache.org/r/777/diff/2/?file=19556#file19556line571>
bq. >
bq. > Error code -101 is also used in TaskRunner.java to indicate an OOM
bq. > exception. We should define all these error codes in a centralized place.
This was just used as a value to initialize exitVal. That specific value
should never be returned unless the call to runningJob.waitFor() returns the
same value. I can change it to something else to avoid the collision, but
should the consolidation of exit codes and the change to showJobDebugInfo go
in the same patch? They seem like different changes, and consolidating the
exit codes would require touching several other parts of MapredLocalTask,
MapRedTask and ExecDriver. Would these changes fit better in a separate patch?
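
For reference, a centralized definition could look roughly like the sketch below. It is only an illustration of the reviewer's suggestion, not part of this patch. The enum TaskExitCode and its constants are hypothetical names. Only the value -101 for OOM comes from the review comment. SUCCESS(0) and EXIT_VAL_UNSET(-102) are assumptions made for the example.

```java
// Hypothetical sketch of a single home for task exit codes, so that two
// callers cannot silently reuse the same number for different meanings.
enum TaskExitCode {
  SUCCESS(0),            // assumed conventional success value
  OUT_OF_MEMORY(-101),   // value the review says TaskRunner.java uses for OOM
  EXIT_VAL_UNSET(-102);  // hypothetical placeholder for an uninitialized exitVal

  private final int code;

  TaskExitCode(int code) {
    this.code = code;
  }

  int code() {
    return code;
  }

  /** Returns the constant registered for a raw code, or null if there is none. */
  static TaskExitCode fromCode(int code) {
    for (TaskExitCode candidate : values()) {
      if (candidate.code == code) {
        return candidate;
      }
    }
    return null;
  }
}
```

With something like this, the initial exitVal would use its own constant rather than a literal that collides with the OOM code.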
- Syed
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/777/#review711
-----------------------------------------------------------
On 2011-05-24 04:29:32, Syed Albiz wrote:
bq.
bq. -----------------------------------------------------------
bq. This is an automatically generated e-mail. To reply, visit:
bq. https://reviews.apache.org/r/777/
bq. -----------------------------------------------------------
bq.
bq. (Updated 2011-05-24 04:29:32)
bq.
bq.
bq. Review request for hive and John Sichi.
bq.
bq.
bq. Summary
bq. -------
bq.
bq. - Add local error messages to point to job logs and provide TaskIDs
bq. - Add a timeout to the fetching of task logs and errors
bq.
bq.
bq. This addresses bug HIVE-2156.
bq. https://issues.apache.org/jira/browse/HIVE-2156
bq.
bq.
bq. Diffs
bq. -----
bq.
bq. build-common.xml 00c3680
bq. common/src/java/org/apache/hadoop/hive/conf/HiveConf.java dc96a1f
bq. conf/hive-default.xml 159d825
bq. ql/build.xml 449b47a
bq. ql/src/java/org/apache/hadoop/hive/ql/exec/HadoopJobExecHelper.java 4717c25
bq. ql/src/java/org/apache/hadoop/hive/ql/exec/JobDebugger.java PRE-CREATION
bq. ql/src/java/org/apache/hadoop/hive/ql/exec/MapRedTask.java 53769a0
bq. ql/src/java/org/apache/hadoop/hive/ql/exec/MapredLocalTask.java 691f038
bq. ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 9cb407c
bq. ql/src/test/queries/clientnegative/minimr_broken_pipe.q PRE-CREATION
bq. ql/src/test/results/clientnegative/dyn_part3.q.out 5f4df65
bq. ql/src/test/results/clientnegative/minimr_broken_pipe.q.out PRE-CREATION
bq. ql/src/test/results/clientnegative/script_broken_pipe1.q.out d33d2cc
bq. ql/src/test/results/clientnegative/script_broken_pipe2.q.out afbaa44
bq. ql/src/test/results/clientnegative/script_broken_pipe3.q.out fe8f757
bq. ql/src/test/results/clientnegative/script_error.q.out c72d780
bq. ql/src/test/results/clientnegative/udf_reflect_neg.q.out f2082a3
bq. ql/src/test/results/clientnegative/udf_test_error.q.out 5fd9a00
bq. ql/src/test/results/clientnegative/udf_test_error_reduce.q.out ddc5e5b
bq. ql/src/test/templates/TestNegativeCliDriver.vm ec13f79
bq.
bq. Diff: https://reviews.apache.org/r/777/diff
bq.
bq.
bq. Testing
bq. -------
bq.
bq. Tested TestNegativeCliDriver in both local and miniMR mode
bq.
bq.
bq. Thanks,
bq.
bq. Syed
bq.
bq.
> Improve error messages emitted during task execution
> ----------------------------------------------------
>
> Key: HIVE-2156
> URL: https://issues.apache.org/jira/browse/HIVE-2156
> Project: Hive
> Issue Type: Improvement
> Reporter: Syed S. Albiz
> Assignee: Syed S. Albiz
> Attachments: HIVE-2156.1.patch, HIVE-2156.2.patch
>
>
> Follow-up to HIVE-1731
> A number of issues were related to reporting errors from task execution and
> surfacing these in a more useful form.
> Currently a cryptic message with "Execution Error" and a return code and
> class name of the task is emitted.
> The most useful log messages here are emitted to the local logs, which can be
> found through jobtracker. Having either a pointer to these logs as part of
> the error message or the actual content would improve the usefulness
> substantially. It may also warrant looking into how the underlying error
> reporting through Hadoop is done and if more information can be propagated up
> from there.
> Specific issues raised in HIVE-1731:
> FAILED: Execution Error, return code 2 from
> org.apache.hadoop.hive.ql.exec.MapRedTask
> * issue was in regexp_extract syntax
> FAILED: Execution Error, return code 1 from
> org.apache.hadoop.hive.ql.exec.DDLTask
> * tried: desc table_does_not_exist;