[ https://issues.apache.org/jira/browse/HIVE-718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739197#action_12739197 ]
Zheng Shao commented on HIVE-718:
---------------------------------
bq. Zheng, aren't buckets separate subdirs? They work, so sub-dirs should be fine.
Buckets are files, not directories.
I tried adding a directory to a table and then ran a select. Apparently
Hadoop's FileInputFormat does not accept the subdirectory:
{code}
> select * from zshao_tt;
OK
Failed with exception java.io.IOException:Not a file: hdfs://dfs1.data.facebook.com:9000/user/facebook/warehouse/zshao_tt/a
09/08/04 14:49:38 ERROR exec.FetchTask: Failed with exception java.io.IOException:Not a file: hdfs://dfs1.data.facebook.com:9000/user/facebook/warehouse/zshao_tt/a
java.io.IOException: Not a file: hdfs://dfs1.data.facebook.com:9000/user/facebook/warehouse/zshao_tt/a
        at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:231)
        at org.apache.hadoop.hive.ql.exec.FetchTask.getRecordReader(FetchTask.java:236)
        at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:291)
        at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:368)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:216)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:306)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:166)
        at org.apache.hadoop.mapred.JobShell.run(JobShell.java:194)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
        at org.apache.hadoop.mapred.JobShell.main(JobShell.java:220)
{code}
I discussed this with Ashish offline. I think we still want inserts to be
atomic. As a result, we may need to manually expand each input directory into
its files and feed those files (instead of the directories) into the
map/reduce jobs. The relevant code is in ExecDriver.java and MapRedTask.java,
where we set up the JobConf.
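To make the idea concrete, here is a rough sketch of the directory expansion. It is only an illustration: it uses java.nio on the local filesystem as a stand-in, and the class name InputExpander and method expandToFiles are made up for this example. The real change would presumably go through Hadoop's FileSystem API (e.g. listStatus) before the input paths are set on the JobConf.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not Hive code: flattens a directory tree into a list of files.
final class InputExpander {

    /** Returns every non-directory path under root; directories themselves are never returned. */
    static List<Path> expandToFiles(Path root) throws IOException {
        List<Path> files = new ArrayList<>();
        collect(root, files);
        return files;
    }

    private static void collect(Path path, List<Path> out) throws IOException {
        if (Files.isDirectory(path)) {
            List<Path> children;
            // Files.list holds an open directory handle, so close the stream promptly.
            try (Stream<Path> stream = Files.list(path)) {
                children = stream.sorted().collect(Collectors.toList());
            }
            // Descend into sub-directories such as zshao_tt/a instead of passing them on.
            for (Path child : children) {
                collect(child, out);
            }
        } else {
            out.add(path);
        }
    }
}
```

With something like this, the job would be given only plain files, so a sub-directory under the table location would no longer trigger the "Not a file" error above.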
What do you think?
> Load data inpath into a new partition without overwrite does not move the file
> ------------------------------------------------------------------------------
>
> Key: HIVE-718
> URL: https://issues.apache.org/jira/browse/HIVE-718
> Project: Hadoop Hive
> Issue Type: Bug
> Reporter: Zheng Shao
> Attachments: HIVE-718.1.patch, HIVE-718.2.patch, hive-718.txt
>
>
> The bug can be reproduced as follows. Note that it only happens for
> partitioned tables. The select after the first load returns nothing, while
> the select after the second load returns the data correctly.
> insert.txt in the current local directory contains 3 lines: "a", "b" and "c".
> {code}
> > create table tmp_insert_test (value string) stored as textfile;
> > load data local inpath 'insert.txt' into table tmp_insert_test;
> > select * from tmp_insert_test;
> a
> b
> c
> > create table tmp_insert_test_p (value string) partitioned by (ds string) stored as textfile;
> > load data local inpath 'insert.txt' into table tmp_insert_test_p partition (ds = '2009-08-01');
> > select * from tmp_insert_test_p where ds = '2009-08-01';
> > load data local inpath 'insert.txt' into table tmp_insert_test_p partition (ds = '2009-08-01');
> > select * from tmp_insert_test_p where ds = '2009-08-01';
> a 2009-08-01
> b 2009-08-01
> c 2009-08-01
> {code}