[jira] [Commented] (IMPALA-5765) Flaky tpc-ds data loading

Lars Volker (JIRA) Tue, 12 Jun 2018 09:57:22 -0700


    [ https://issues.apache.org/jira/browse/IMPALA-5765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16509874#comment-16509874 ]


Lars Volker commented on IMPALA-5765:
-------------------------------------

I saw this again today.

> Flaky tpc-ds data loading
> -------------------------
>
>                 Key: IMPALA-5765
>                 URL: https://issues.apache.org/jira/browse/IMPALA-5765
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Infrastructure
>    Affects Versions: Impala 2.10.0
>            Reporter: Matthew Jacobs
>            Assignee: Philip Zeyliger
>            Priority: Critical
>              Labels: flaky
>
> Saw this on a number of gerrit-verify-dryrun jobs:
> {code}
> 23:49:37 Loading TPC-DS data (logging to /home/ubuntu/Impala/logs/data_loading/load-tpcds.log)... 
> 23:55:39     FAILED (Took: 6 min 2 sec)
> 23:55:39     'load-data tpcds core' failed. Tail of log:
> 23:55:39 ss_net_profit,
> 23:55:39 ss_sold_date_sk
> 23:55:39 from store_sales_unpartitioned
> 23:55:39 WHERE ss_sold_date_sk < 2451272
> 23:55:39 distribute by ss_sold_date_sk
> 23:55:39 INFO  : Query ID = ubuntu_20170731235555_26963c6a-a58b-4cad-b0c7-c3790f9b22dc
> 23:55:39 INFO  : Total jobs = 1
> 23:55:39 INFO  : Launching Job 1 out of 1
> 23:55:39 INFO  : Starting task [Stage-1:MAPRED] in serial mode
> 23:55:39 INFO  : Number of reduce tasks not specified. Estimated from input data size: 2
> 23:55:39 INFO  : In order to change the average load for a reducer (in bytes):
> 23:55:39 INFO  :   set hive.exec.reducers.bytes.per.reducer=<number>
> 23:55:39 INFO  : In order to limit the maximum number of reducers:
> 23:55:39 INFO  :   set hive.exec.reducers.max=<number>
> 23:55:39 INFO  : In order to set a constant number of reducers:
> 23:55:39 INFO  :   set mapreduce.job.reduces=<number>
> 23:55:39 INFO  : number of splits:2
> 23:55:39 INFO  : Submitting tokens for job: job_local1252085428_0826
> 23:55:39 INFO  : The url to track the job: http://localhost:8080/
> 23:55:39 INFO  : Job running in-process (local Hadoop)
> 23:55:39 INFO  : 2017-07-31 23:55:06,606 Stage-1 map = 0%,  reduce = 0%
> 23:55:39 INFO  : 2017-07-31 23:55:13,609 Stage-1 map = 100%,  reduce = 0%
> 23:55:39 INFO  : 2017-07-31 23:55:28,621 Stage-1 map = 100%,  reduce = 33%
> 23:55:39 ERROR : Ended Job = job_local1252085428_0826 with errors
> 23:55:39 ERROR : FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
> 23:55:39 INFO  : MapReduce Jobs Launched: 
> 23:55:39 INFO  : Stage-Stage-1:  HDFS Read: 26483258512 HDFS Write: 19378762131 FAIL
> 23:55:39 INFO  : Total MapReduce CPU Time Spent: 0 msec
> 23:55:39 INFO  : Completed executing command(queryId=ubuntu_20170731235555_26963c6a-a58b-4cad-b0c7-c3790f9b22dc); Time taken: 33.276 seconds
> 23:55:39 Error: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=2)
> 23:55:39 java.sql.SQLException: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
> 23:55:39      at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:292)
> 23:55:39      at org.apache.hive.beeline.Commands.executeInternal(Commands.java:989)
> 23:55:39      at org.apache.hive.beeline.Commands.execute(Commands.java:1203)
> 23:55:39      at org.apache.hive.beeline.Commands.sql(Commands.java:1117)
> 23:55:39      at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:1176)
> 23:55:39      at org.apache.hive.beeline.BeeLine.execute(BeeLine.java:1010)
> 23:55:39      at org.apache.hive.beeline.BeeLine.executeFile(BeeLine.java:987)
> 23:55:39      at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:914)
> 23:55:39      at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:518)
> 23:55:39      at org.apache.hive.beeline.BeeLine.main(BeeLine.java:501)
> 23:55:39      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 23:55:39      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> 23:55:39      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 23:55:39      at java.lang.reflect.Method.invoke(Method.java:606)
> 23:55:39      at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> 23:55:39      at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> 23:55:39 
> 23:55:39 Closing: 0: jdbc:hive2://localhost:11050/default;auth=none
> 23:55:39 Error executing file from Hive: load-tpcds-core-hive-generated.sql
> 23:55:39 Error in /home/ubuntu/Impala/testdata/bin/create-load-data.sh at line 48: LOAD_DATA_ARGS=""
> {code}
> https://jenkins.impala.io/job/ubuntu-14.04-from-scratch/1827/
> It's been reported a few times in the last week. Here's another failed job 
> reported on dev@:
> https://jenkins.impala.io/job/ubuntu-14.04-from-scratch/1807/



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org
