[ https://issues.apache.org/jira/browse/HIVE-2880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
makiet updated HIVE-2880:
-------------------------

    Description:
I have 2 tables:

CREATE TABLE data(calling STRING COMMENT 'Calling number', volumn_download BIGINT COMMENT 'Volume download', volumn_upload BIGINT COMMENT 'Volume upload')
PARTITIONED BY(ds STRING)
CLUSTERED BY (calling) INTO 100 BUCKETS;

CREATE TABLE sub(isdn STRING, sub_id STRING)
CLUSTERED BY (isdn) INTO 100 BUCKETS;

The DATA table has 15m records while the SUB table has only 600k records.

The following SQL script was executed successfully:

select /*+ MAPJOIN(b) */ a.calling, b.sub_id from data a join sub b on a.calling=b.isdn;

But when I enabled bucket map join by setting

set hive.optimize.bucketmapjoin = true

the same SQL script failed:

select /*+ MAPJOIN(b) */ a.calling, b.sub_id from data a join sub b on a.calling=b.isdn;

hive> set hive.optimize.bucketmapjoin = true;
hive> select /*+ MAPJOIN(b) */ a.calling, b.sub_id from ggsn_bucket a join sub_bucket b on a.calling=b.isdn;
Total MapReduce jobs = 1
WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files.
Execution log at: /tmp/hduser/hduser_20120320080909_8e6a3419-4d2c-4148-a0c9-166d051c8274.log
2012-03-20 08:09:34     Starting to launch local task to process map join;      maximum memory = 932118528
2012-03-20 08:09:34     End of local task; Time Taken: 0.072 sec.
Execution completed successfully
Mapred Local Task Succeeded . Convert the Join into MapJoin
Mapred Local Task Succeeded . Convert the Join into MapJoin
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
org.apache.hadoop.util.Shell$ExitCodeException: bash: line 0: cd: /u01/app/hduser/hadoop-0.20.203.0/tempdir/hduser/hive_2012-03-20_08-09-27_810_1393729636696443501/-local-10002/HashTable-Stage-1: No such file or directory
tar: Cowardly refusing to create an empty archive
Try `tar --help' or `tar --usage' for more information.

        at org.apache.hadoop.util.Shell.runCommand(Shell.java:255)
        at org.apache.hadoop.util.Shell.run(Shell.java:182)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:375)
        at org.apache.hadoop.hive.common.FileUtils.tar(FileUtils.java:260)
        at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:407)
        at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:136)
        at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:133)
        at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
        at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1332)
        at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1123)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:931)
        at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:255)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:212)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:671)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Job Submission failed with exception 'org.apache.hadoop.util.Shell$ExitCodeException(bash: line 0: cd: /u01/app/hduser/hadoop-0.20.203.0/tempdir/hduser/hive_2012-03-20_08-09-27_810_1393729636696443501/-local-10002/HashTable-Stage-1: No such file or directory
tar: Cowardly refusing to create an empty archive
Try `tar --help' or `tar --usage' for more information.
)'
java.lang.IllegalArgumentException: Can not create a Path from an empty string
        at org.apache.hadoop.fs.Path.checkPathArg(Path.java:82)
        at org.apache.hadoop.fs.Path.<init>(Path.java:90)
        at org.apache.hadoop.hive.ql.exec.Utilities.getHiveJobID(Utilities.java:379)
        at org.apache.hadoop.hive.ql.exec.Utilities.clearMapRedWork(Utilities.java:192)
        at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:476)
        at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:136)
        at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:133)
        at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
        at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1332)
        at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1123)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:931)
        at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:255)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:212)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:671)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MapRedTask

In hadoop-env.sh, I set:

export HADOOP_OPTS="-Djava.net.preferIPv4Stack=true -Djava.io.tmpdir=/u01/app/hduser/hadoop-0.20.203.0/tempdir"

It looked like Hive could not create the temporary directory.

  was:
I have 2 tables:

CREATE TABLE data(calling STRING COMMENT 'Calling number', volumn_download BIGINT COMMENT 'Volume download', volumn_upload BIGINT COMMENT 'Volume upload')
PARTITIONED BY(ds STRING)
CLUSTERED BY (calling) INTO 100 BUCKETS;

CREATE TABLE sub(isdn STRING, sub_id STRING)
CLUSTERED BY (isdn) INTO 100 BUCKETS;

The DATA table has 15m records while the SUB table has only 600k records.

The following SQL script was executed successfully:

select /*+ MAPJOIN(b) */ a.calling, b.sub_id from ggsn_bucket a join sub_bucket b on a.calling=b.isdn;

But when I enabled bucket map join by setting

set hive.optimize.bucketmapjoin = true

the same SQL script failed:

select /*+ MAPJOIN(b) */ a.calling, b.sub_id from ggsn_bucket a join sub_bucket b on a.calling=b.isdn;

hive> set hive.optimize.bucketmapjoin = true;
hive> select /*+ MAPJOIN(b) */ a.calling, b.sub_id from ggsn_bucket a join sub_bucket b on a.calling=b.isdn;
Total MapReduce jobs = 1
WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files.
Execution log at: /tmp/hduser/hduser_20120320080909_8e6a3419-4d2c-4148-a0c9-166d051c8274.log
2012-03-20 08:09:34     Starting to launch local task to process map join;      maximum memory = 932118528
2012-03-20 08:09:34     End of local task; Time Taken: 0.072 sec.
Execution completed successfully
Mapred Local Task Succeeded . Convert the Join into MapJoin
Mapred Local Task Succeeded . Convert the Join into MapJoin
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
org.apache.hadoop.util.Shell$ExitCodeException: bash: line 0: cd: /u01/app/hduser/hadoop-0.20.203.0/tempdir/hduser/hive_2012-03-20_08-09-27_810_1393729636696443501/-local-10002/HashTable-Stage-1: No such file or directory
tar: Cowardly refusing to create an empty archive
Try `tar --help' or `tar --usage' for more information.

        at org.apache.hadoop.util.Shell.runCommand(Shell.java:255)
        at org.apache.hadoop.util.Shell.run(Shell.java:182)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:375)
        at org.apache.hadoop.hive.common.FileUtils.tar(FileUtils.java:260)
        at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:407)
        at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:136)
        at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:133)
        at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
        at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1332)
        at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1123)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:931)
        at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:255)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:212)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:671)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Job Submission failed with exception 'org.apache.hadoop.util.Shell$ExitCodeException(bash: line 0: cd: /u01/app/hduser/hadoop-0.20.203.0/tempdir/hduser/hive_2012-03-20_08-09-27_810_1393729636696443501/-local-10002/HashTable-Stage-1: No such file or directory
tar: Cowardly refusing to create an empty archive
Try `tar --help' or `tar --usage' for more information.
)'
java.lang.IllegalArgumentException: Can not create a Path from an empty string
        at org.apache.hadoop.fs.Path.checkPathArg(Path.java:82)
        at org.apache.hadoop.fs.Path.<init>(Path.java:90)
        at org.apache.hadoop.hive.ql.exec.Utilities.getHiveJobID(Utilities.java:379)
        at org.apache.hadoop.hive.ql.exec.Utilities.clearMapRedWork(Utilities.java:192)
        at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:476)
        at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:136)
        at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:133)
        at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
        at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1332)
        at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1123)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:931)
        at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:255)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:212)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:671)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MapRedTask

In hadoop-env.sh, I set:

export HADOOP_OPTS="-Djava.net.preferIPv4Stack=true -Djava.io.tmpdir=/u01/app/hduser/hadoop-0.20.203.0/tempdir"

It looked like Hive could not create the temporary directory.

> Fail to create temporary directory when execute bucket map join
> ---------------------------------------------------------------
>
>                 Key: HIVE-2880
>                 URL: https://issues.apache.org/jira/browse/HIVE-2880
>             Project: Hive
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 0.8.1
>         Environment: Cluster with 4 PCs: CPU Intel dual core 2.2 GHz, 2 GB RAM, 80 GB HDD, CentOS 5.0, NIC 100 Mbps
> 1 of them: Namenode + 2nd namenode + job tracker + hive server (also Datanode and Task tracker)
> 3 of them: only datanode + task tracker
> All of them could ssh password-less to each other and used the same user account: hduser
>            Reporter: makiet
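
One way to test the reporter's guess is to confirm that the directory passed as -Djava.io.tmpdir in hadoop-env.sh exists and is writable by the user running the Hive CLI (hduser in this report). The following is only a minimal sketch, assuming a POSIX shell; check_tmpdir is a hypothetical helper and is not part of Hive or Hadoop. A passing check would not rule out other causes, for example the local task never writing the HashTable-Stage-1 directory at all.

```shell
# Hypothetical helper (not part of Hive or Hadoop): reports whether a
# directory exists and can be written to by the current user.
check_tmpdir() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 1
    fi
    # Creating files in a directory needs both write and search permission.
    if [ ! -w "$dir" ] || [ ! -x "$dir" ]; then
        echo "not writable: $dir"
        return 2
    fi
    echo "ok: $dir"
}
```

It could be run as hduser on the machine hosting the Hive CLI, e.g. check_tmpdir /u01/app/hduser/hadoop-0.20.203.0/tempdir, using the path from the HADOOP_OPTS line in the report.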