[jira] [Updated] (HIVE-2880) Fail to create temporary directory when execute bucket map join

makiet (Updated) (JIRA) Mon, 19 Mar 2012 18:40:03 -0700

     [ https://issues.apache.org/jira/browse/HIVE-2880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


makiet updated HIVE-2880:
-------------------------

    Description: 
I have two tables:

CREATE TABLE data(calling STRING  COMMENT 'Calling number', 
volumn_download BIGINT COMMENT 'Volume download',
volumn_upload BIGINT COMMENT 'Volume upload')
PARTITIONED BY(ds STRING)
CLUSTERED BY (calling) INTO 100 BUCKETS;

CREATE TABLE sub(isdn STRING, sub_id STRING)
CLUSTERED BY (isdn) INTO 100 BUCKETS;

The DATA table has 15 million records, while the SUB table has only 600k records.

The following query executed successfully:
select /*+ MAPJOIN(b) */ a.calling, b.sub_id from data a join sub b on a.calling=b.isdn;

But when I enabled bucket map join by setting hive.optimize.bucketmapjoin = true, the same query failed:
select /*+ MAPJOIN(b) */ a.calling, b.sub_id from data a join sub b on a.calling=b.isdn;

hive> set hive.optimize.bucketmapjoin = true;
hive> select /*+ MAPJOIN(b) */ a.calling, b.sub_id from ggsn_bucket a join sub_bucket b on a.calling=b.isdn;
Total MapReduce jobs = 1
WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files.
Execution log at: /tmp/hduser/hduser_20120320080909_8e6a3419-4d2c-4148-a0c9-166d051c8274.log
2012-03-20 08:09:34     Starting to launch local task to process map join;      maximum memory = 932118528
2012-03-20 08:09:34     End of local task; Time Taken: 0.072 sec.
Execution completed successfully
Mapred Local Task Succeeded . Convert the Join into MapJoin
Mapred Local Task Succeeded . Convert the Join into MapJoin
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
org.apache.hadoop.util.Shell$ExitCodeException: bash: line 0: cd: /u01/app/hduser/hadoop-0.20.203.0/tempdir/hduser/hive_2012-03-20_08-09-27_810_1393729636696443501/-local-10002/HashTable-Stage-1: No such file or directory
tar: Cowardly refusing to create an empty archive
Try `tar --help' or `tar --usage' for more information.

        at org.apache.hadoop.util.Shell.runCommand(Shell.java:255)
        at org.apache.hadoop.util.Shell.run(Shell.java:182)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:375)
        at org.apache.hadoop.hive.common.FileUtils.tar(FileUtils.java:260)
        at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:407)
        at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:136)
        at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:133)
        at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
        at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1332)
        at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1123)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:931)
        at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:255)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:212)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:671)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Job Submission failed with exception 'org.apache.hadoop.util.Shell$ExitCodeException(bash: line 0: cd: /u01/app/hduser/hadoop-0.20.203.0/tempdir/hduser/hive_2012-03-20_08-09-27_810_1393729636696443501/-local-10002/HashTable-Stage-1: No such file or directory
tar: Cowardly refusing to create an empty archive
Try `tar --help' or `tar --usage' for more information.
)'
java.lang.IllegalArgumentException: Can not create a Path from an empty string
        at org.apache.hadoop.fs.Path.checkPathArg(Path.java:82)
        at org.apache.hadoop.fs.Path.<init>(Path.java:90)
        at org.apache.hadoop.hive.ql.exec.Utilities.getHiveJobID(Utilities.java:379)
        at org.apache.hadoop.hive.ql.exec.Utilities.clearMapRedWork(Utilities.java:192)
        at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:476)
        at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:136)
        at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:133)
        at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
        at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1332)
        at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1123)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:931)
        at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:255)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:212)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:671)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MapRedTask

In hadoop-env.sh, I set:
export HADOOP_OPTS="-Djava.net.preferIPv4Stack=true -Djava.io.tmpdir=/u01/app/hduser/hadoop-0.20.203.0/tempdir"

It looks like Hive could not create the temporary directory.
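
To narrow this down, the directories named in the error can be inspected directly. The following is only a minimal diagnostic sketch, assuming bash; `check_dir` is a hypothetical helper written for this report and is not part of Hive or Hadoop. It reports whether a directory exists, whether the current user can write to it, and how many entries it contains. It does not fix anything.

```shell
#!/usr/bin/env bash
# Diagnostic sketch only: check_dir is a hypothetical helper,
# not a Hive or Hadoop command.
# Returns 0 if the directory exists and is writable by the current user,
# 1 if it is missing, 2 if it exists but is not writable.
check_dir() {
  local dir="$1"
  if [ ! -d "$dir" ]; then
    echo "missing: $dir"
    return 1
  fi
  if [ ! -w "$dir" ]; then
    echo "not writable: $dir"
    return 2
  fi
  # Count direct children (including dotfiles); an empty directory gives 0.
  local count
  count=$(find "$dir" -mindepth 1 -maxdepth 1 | wc -l | tr -d ' ')
  echo "ok: $dir ($count entries)"
  return 0
}

# Example (run as hduser on the node where the Hive CLI was started):
#   check_dir /u01/app/hduser/hadoop-0.20.203.0/tempdir
#   check_dir /u01/app/hduser/hadoop-0.20.203.0/tempdir/hduser
```

Running it against the `java.io.tmpdir` path and its `hduser` subdirectory would show whether the base directories are present and writable, or whether only the `HashTable-Stage-1` subdirectory from the error message was never created.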


  was:
I have two tables:

CREATE TABLE data(calling STRING  COMMENT 'Calling number', 
volumn_download BIGINT COMMENT 'Volume download',
volumn_upload BIGINT COMMENT 'Volume upload')
PARTITIONED BY(ds STRING)
CLUSTERED BY (calling) INTO 100 BUCKETS;

CREATE TABLE sub(isdn STRING, sub_id STRING)
CLUSTERED BY (isdn) INTO 100 BUCKETS;

The DATA table has 15 million records, while the SUB table has only 600k records.

The following query executed successfully:
select /*+ MAPJOIN(b) */ a.calling, b.sub_id from ggsn_bucket a join sub_bucket b on a.calling=b.isdn;

But when I enabled bucket map join by setting hive.optimize.bucketmapjoin = true, the same query failed:
select /*+ MAPJOIN(b) */ a.calling, b.sub_id from ggsn_bucket a join sub_bucket b on a.calling=b.isdn;

hive> set hive.optimize.bucketmapjoin = true;
hive> select /*+ MAPJOIN(b) */ a.calling, b.sub_id from ggsn_bucket a join sub_bucket b on a.calling=b.isdn;
Total MapReduce jobs = 1
WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files.
Execution log at: /tmp/hduser/hduser_20120320080909_8e6a3419-4d2c-4148-a0c9-166d051c8274.log
2012-03-20 08:09:34     Starting to launch local task to process map join;      maximum memory = 932118528
2012-03-20 08:09:34     End of local task; Time Taken: 0.072 sec.
Execution completed successfully
Mapred Local Task Succeeded . Convert the Join into MapJoin
Mapred Local Task Succeeded . Convert the Join into MapJoin
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
org.apache.hadoop.util.Shell$ExitCodeException: bash: line 0: cd: /u01/app/hduser/hadoop-0.20.203.0/tempdir/hduser/hive_2012-03-20_08-09-27_810_1393729636696443501/-local-10002/HashTable-Stage-1: No such file or directory
tar: Cowardly refusing to create an empty archive
Try `tar --help' or `tar --usage' for more information.

        at org.apache.hadoop.util.Shell.runCommand(Shell.java:255)
        at org.apache.hadoop.util.Shell.run(Shell.java:182)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:375)
        at org.apache.hadoop.hive.common.FileUtils.tar(FileUtils.java:260)
        at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:407)
        at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:136)
        at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:133)
        at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
        at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1332)
        at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1123)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:931)
        at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:255)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:212)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:671)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Job Submission failed with exception 'org.apache.hadoop.util.Shell$ExitCodeException(bash: line 0: cd: /u01/app/hduser/hadoop-0.20.203.0/tempdir/hduser/hive_2012-03-20_08-09-27_810_1393729636696443501/-local-10002/HashTable-Stage-1: No such file or directory
tar: Cowardly refusing to create an empty archive
Try `tar --help' or `tar --usage' for more information.
)'
java.lang.IllegalArgumentException: Can not create a Path from an empty string
        at org.apache.hadoop.fs.Path.checkPathArg(Path.java:82)
        at org.apache.hadoop.fs.Path.<init>(Path.java:90)
        at org.apache.hadoop.hive.ql.exec.Utilities.getHiveJobID(Utilities.java:379)
        at org.apache.hadoop.hive.ql.exec.Utilities.clearMapRedWork(Utilities.java:192)
        at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:476)
        at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:136)
        at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:133)
        at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
        at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1332)
        at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1123)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:931)
        at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:255)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:212)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:671)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MapRedTask

In hadoop-env.sh, I set:
export HADOOP_OPTS="-Djava.net.preferIPv4Stack=true -Djava.io.tmpdir=/u01/app/hduser/hadoop-0.20.203.0/tempdir"

It looks like Hive could not create the temporary directory.


    
> Fail to create temporary directory when execute bucket map join
> ---------------------------------------------------------------
>
>                 Key: HIVE-2880
>                 URL: https://issues.apache.org/jira/browse/HIVE-2880
>             Project: Hive
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 0.8.1
>         Environment: Cluster of 4 PCs, each with an Intel dual-core 2.2 GHz CPU, 2 GB RAM, 
> 80 GB HDD, CentOS 5.0, and a 100 Mbps NIC.
> 1 of them: NameNode + secondary NameNode + JobTracker + Hive server (also DataNode 
> and TaskTracker)
> 3 of them: DataNode + TaskTracker only
> All of them can ssh to each other without a password and use the same user 
> account: hduser
>            Reporter: makiet
>

