[jira] [Updated] (HIVE-18623) Hive throws an exception "Renames across Mount points not supported" when running in a federated cluster
[ https://issues.apache.org/jira/browse/HIVE-18623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yangfang updated HIVE-18623:
Affects Version/s: (was: 2.3.1) (was: 2.3.0) 2.3.3, 3.0.0

> Hive throws an exception "Renames across Mount points not supported" when running in a federated cluster
>
> Key: HIVE-18623
> URL: https://issues.apache.org/jira/browse/HIVE-18623
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Affects Versions: 2.2.0, 3.0.0, 2.3.2, 2.3.3
> Environment: hadoop 2.7.5, HDFS Federation enabled; hive 3.0.0
> Reporter: yangfang
> Assignee: yangfang
> Priority: Major
> Attachments: HIVE-18623.1.patch
>
> I ran a SQL query in a federated cluster with two namespaces, nameservice and nameservice1. I set hive.exec.stagingdir=/nameservice1/hive_tmp in hive-site.xml, and my data tables are located under the nameservice directory. I then got the exception below:
>
> hive> create external table test_par6(id int, name string) partitioned by(p int);
> OK
> Time taken: 1.527 seconds
> hive> insert into table test_par6 partition(p = 1) values(1,'Jack');
> Moving data to directory viewfs://nsX/nameservice1/hive_tmp_hive_2018-02-05_14-09-36_416_3075179128063595297-1/-ext-1
> Loading data to table default.test_par6 partition (p=1)
> Failed with exception java.io.IOException: Renames across Mount points not supported
> FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask. java.io.IOException: Renames across Mount points not supported
> MapReduce Jobs Launched:
> Stage-Stage-1: Map: 1 Cumulative CPU: 2.08 sec HDFS Read: 3930 HDFS Write: 7 SUCCESS
> Total MapReduce CPU Time Spent: 2 seconds 80 msec

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-15911) Create a view based on another view throws an exception “FAILED: NullPointerException null”
[ https://issues.apache.org/jira/browse/HIVE-15911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yangfang updated HIVE-15911:
Affects Version/s: (was: 3.0.0) 2.1.1

> Create a view based on another view throws an exception “FAILED: NullPointerException null”
>
> Key: HIVE-15911
> URL: https://issues.apache.org/jira/browse/HIVE-15911
> Project: Hive
> Issue Type: Bug
> Components: Views
> Affects Versions: 2.1.1
> Environment: hive2.1.0
> Reporter: yangfang
> Assignee: yangfang
> Priority: Major
> Attachments: 0001-Hive-15911.patch
>
> When I create a new view based on another view, I get an exception “FAILED: NullPointerException null”:
>
> hive> create view view2(a,b) as select a, b from view1; //view1 is another view
> FAILED: NullPointerException null
> hive>
>
> The Hive log shows the error stack:
> 2017-02-15T15:40:25,816 ERROR ql.Driver (SessionState.java:printError(1116)) - FAILED: NullPointerException null
> java.lang.NullPointerException
>     at org.apache.hadoop.hive.ql.Driver.doAuthorization(Driver.java:863)
>     at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:552)
>     at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1319)
>     at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1459)
>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1239)
>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1229)
>     at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)
>     at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184)
>     at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
>     at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821)
>     at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
>     at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:136)

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-15911) Create a view based on another view throws an exception “FAILED: NullPointerException null”
[ https://issues.apache.org/jira/browse/HIVE-15911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yangfang updated HIVE-15911:
Affects Version/s: (was: 2.1.1) (was: 2.1.0) 3.0.0

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (HIVE-18623) Hive throws an exception "Renames across Mount points not supported" when running in a federated cluster
[ https://issues.apache.org/jira/browse/HIVE-18623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16352229#comment-16352229 ] yangfang edited comment on HIVE-18623 at 2/5/18 10:47 AM:
--
At the end of execution, Hive should move the query result from the source path to the destination path. In my test, the source path is viewfs://nsX/nameservice1/hive_tmp_hive_2018-02-05_14-09-36_416_3075179128063595297-1/-ext-1 and the destination path is viewfs://nsX/nameservice/hive/test_par6/p=1/, so moving files from source to destination crosses mount points. HDFS uses rename to move files, but it does not support renames across mount points, so it throws an exception. The exception comes from the following code:

    private static Path mvFile(HiveConf conf, FileSystem sourceFs, Path sourcePath,
        FileSystem destFs, Path destDirPath, boolean isSrcLocal,
        boolean isOverwrite, boolean isRenameAllowed) throws IOException {
      // name, type, and destFilePath are initialized from sourcePath earlier in the method
      for (int counter = 1; destFs.exists(destFilePath); counter++) {
        if (isOverwrite) {
          destFs.delete(destFilePath, false);
          break;
        }
        destFilePath = new Path(destDirPath, name + (Utilities.COPY_KEYWORD + counter)
            + (!type.isEmpty() ? "." + type : ""));
      }
      if (isRenameAllowed) {
        destFs.rename(sourcePath, destFilePath);
      } else if (isSrcLocal) {
        destFs.copyFromLocalFile(sourcePath, destFilePath);
      } else {
        FileUtils.copy(sourceFs, sourcePath, destFs, destFilePath,
            true,  // delete source
            false, // overwrite destination
            conf);
      }
      return destFilePath;
    }

I think we should make isRenameAllowed false, because a rename cannot be used across mount points; we should use copy in this case. So I changed the needToCopy() method: if the source path and destination path cross a mount point, needToCopy() returns false. Now isRenameAllowed is false, and Hive uses the FileUtils.copy() method to move the files.
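The mount-point check described in this comment can be illustrated with a small standalone sketch. This is not the actual Hive patch; the class and helper names (MountPointCheck, crossesMountPoints, firstComponent) are hypothetical, and it only mimics the fact that a ViewFs mount table keys on the first path component:

```java
import java.net.URI;

// Hedged sketch: decide whether a move between two viewfs:// paths would
// cross mount points by comparing the first path component, which names
// the mount point in a ViewFs client-side mount table.
public class MountPointCheck {

    /** True when the two paths live under different ViewFs mount points. */
    static boolean crossesMountPoints(URI src, URI dst) {
        if (!"viewfs".equals(src.getScheme()) || !"viewfs".equals(dst.getScheme())) {
            return false; // only ViewFs has client-side mount points
        }
        return !firstComponent(src.getPath()).equals(firstComponent(dst.getPath()));
    }

    /** First path component of an absolute path, e.g. "/a/b" -> "a". */
    static String firstComponent(String path) {
        String[] parts = path.split("/");
        return parts.length > 1 ? parts[1] : "";
    }

    public static void main(String[] args) {
        URI staging = URI.create("viewfs://nsX/nameservice1/hive_tmp/-ext-1");
        URI table = URI.create("viewfs://nsX/nameservice/hive/test_par6/p=1");
        // Different mount points: rename would fail, so a copy is required.
        System.out.println(crossesMountPoints(staging, table)); // true
    }
}
```

When this check returns true, a copy (delete-source) is the only safe move strategy, which matches the FileUtils.copy() fallback in mvFile above.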
[jira] [Updated] (HIVE-18623) Hive throws an exception "Renames across Mount points not supported" when running in a federated cluster
[ https://issues.apache.org/jira/browse/HIVE-18623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yangfang updated HIVE-18623:
Status: Patch Available (was: Open)
[jira] [Updated] (HIVE-18623) Hive throws an exception "Renames across Mount points not supported" when running in a federated cluster
[ https://issues.apache.org/jira/browse/HIVE-18623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yangfang updated HIVE-18623:
Attachment: HIVE-18623.1.patch

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (HIVE-18623) Hive throws an exception "Renames across Mount points not supported" when running in a federated cluster
[ https://issues.apache.org/jira/browse/HIVE-18623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yangfang reassigned HIVE-18623:
---

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-15911) Create a view based on another view throws an exception “FAILED: NullPointerException null”
[ https://issues.apache.org/jira/browse/HIVE-15911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16256370#comment-16256370 ] yangfang commented on HIVE-15911:
-
[~ashutoshc] I've tested it, and this issue has been resolved in Hive 3.0.

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (HIVE-16666) Setting hive.exec.stagingdir to a relative directory or a subdirectory of the destination data directory will cause Hive to delete the intermediate query results
[ https://issues.apache.org/jira/browse/HIVE-16666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16019097#comment-16019097 ] yangfang edited comment on HIVE-16666 at 5/22/17 3:03 AM:
--
[~aihuaxu], [~pvary], thanks for your advice. In my opinion, the staging directory is just a temporary directory; users may not care where it is, they only care that the query succeeds and produces the final result. For users, any staging directory name may be allowed, so throwing an exception may be a little rough. Even if we add validation against the configuration (for example, suppose /tmp/hive/.hive-staging is a valid directory because it is an empty directory that no one has used), someone may still create a table like this:

create table test(a int, b string) location '/tmp'

Now the staging directory is a subdirectory of the table data directory, and this will still delete the intermediate query results during execution. Looking forward to your comments.
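The subdirectory hazard discussed here, and the dot-prefix workaround proposed later in this thread, can be sketched in a few lines. This is an illustration only; the helper names (StagingDirFix, isUnderTableDir, hideStagingDir) are hypothetical and not Hive's API:

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hedged sketch: detect a staging dir that resolves to a subdirectory of
// the table's data directory, and prefix its name with "." so a cleanup
// pass that skips dot-names leaves it alone.
public class StagingDirFix {

    /** True when the resolved staging dir sits strictly under the table directory. */
    static boolean isUnderTableDir(String tableDir, String stagingDir) {
        Path table = Paths.get(tableDir).normalize();
        Path staging = table.resolve(stagingDir).normalize();
        return staging.startsWith(table) && !staging.equals(table);
    }

    /** Prepend "." to the staging dir's last component so it survives cleanup. */
    static String hideStagingDir(String stagingDir) {
        Path p = Paths.get(stagingDir);
        return p.resolveSibling("." + p.getFileName()).toString();
    }

    public static void main(String[] args) {
        String tableDir = "/hive/test2";
        String stagingDir = "./opq8"; // relative, as in hive.exec.stagingdir=./opq8
        if (isUnderTableDir(tableDir, stagingDir)) {
            System.out.println(hideStagingDir(stagingDir)); // "./.opq8" on POSIX paths
        }
    }
}
```

The same resolve-then-startsWith pattern could back a configuration validation, if throwing an error were preferred over silently renaming the staging dir.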
> Setting hive.exec.stagingdir to a relative directory or a subdirectory of the destination data directory will cause Hive to delete the intermediate query results
>
> Key: HIVE-16666
> URL: https://issues.apache.org/jira/browse/HIVE-16666
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Affects Versions: 3.0.0
> Reporter: yangfang
> Assignee: yangfang
> Priority: Critical
> Attachments: HIVE-16666.1.patch
>
> Set hive.exec.stagingdir=./*, for example set hive.exec.stagingdir=./opq8, then execute a query like this:
> insert overwrite table test2 select * from test3;
> You will get an error like this:
>
> hive> set hive.exec.stagingdir=./opq8;
> hive> insert overwrite table test2 select * from test3;
> Query ID = mr_20170515134831_28ee392d-0d5a-4e47-b80c-dfcd31691b02
> Total jobs = 3
> Launching Job 1 out of 3
> Number of reduce tasks is set to 0 since there's no reduce operator
> Starting Job = job_1494818119523_0008, Tracking URL = http://zdh77:8088/proxy/application_1494818119523_0008/
> Kill Command = /opt/ZDH/parcels/lib/hadoop/bin/hadoop job -kill job_1494818119523_0008
> Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 0
> 2017-05-15 13:48:51,487 Stage-1 map = 0%, reduce = 0%
> Ended Job = job_1494818119523_0008
> Stage-3 is selected by condition resolver.
> Stage-2 is filtered out by condition resolver.
> Stage-4 is filtered out by condition resolver.
> Moving data to directory hdfs://nameservice/hive/test2/opqt8_hive_2017-05-15_13-48-31_558_6151032330134038151-1/-ext-1
> Loading data to table default.test2
> Moved: 'hdfs://nameservice/hive/test2/opqt8_hive_2017-05-15_13-48-31_558_6151032330134038151-1' to trash at: hdfs://nameservice/user/mr/.Trash/Current
> Failed with exception Unable to move source hdfs://nameservice/hive/test2/opqt8_hive_2017-05-15_13-48-31_558_6151032330134038151-1/-ext-1 to destination hdfs://nameservice/hive/test2
> FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask. Unable to move source hdfs://nameservice/hive/test2/opqt8_hive_2017-05-15_13-48-31_558_6151032330134038151-1/-ext-1 to destination hdfs://nameservice/hive/test2
> MapReduce Jobs Launched:
> Stage-Stage-1: HDFS Read: 0 HDFS Write: 0 SUCCESS
> Total MapReduce CPU Time Spent: 0 msec
> hive>
>
> hive.exec.stagingdir=./opq8 is a relative path under the destination write directory /hive/test2. Hive will create a temporary directory /hive/test2/opq8_hive* for intermediate query results. Later, in the move stage, Hive will delete or trash any subdirectory under /hive/test2 whose name does not begin with "_" or "." in order to move data into this directory. You can see the processing logic in org.apache.hadoop.hive.ql.metadata.trashFilesUnderDir.
> My modification method is: if stagingdir is a sub directory
[jira] [Commented] (HIVE-16666) Setting hive.exec.stagingdir to a relative directory or a subdirectory of the destination data directory will cause Hive to delete the intermediate query results
[ https://issues.apache.org/jira/browse/HIVE-16666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16019097#comment-16019097 ] yangfang commented on HIVE-16666:
[jira] [Commented] (HIVE-16666) Setting hive.exec.stagingdir to a relative directory or a subdirectory of the destination data directory will cause Hive to delete the intermediate query results
[ https://issues.apache.org/jira/browse/HIVE-16666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16017080#comment-16017080 ]

yangfang commented on HIVE-16666:
---------------------------------

The failing tests are not related to the patch. [~aihuaxu], [~pvary], could you please review? Thanks!

> Setting hive.exec.stagingdir to a relative directory or a subdirectory of the destination data directory causes Hive to delete the intermediate query results
> ---
>
>                 Key: HIVE-16666
>                 URL: https://issues.apache.org/jira/browse/HIVE-16666
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 3.0.0
>            Reporter: yangfang
>            Assignee: yangfang
>            Priority: Critical
>         Attachments: HIVE-16666.1.patch
>
>
> Set hive.exec.stagingdir to a relative path (./*), for example set hive.exec.stagingdir=./opq8, then execute a query such as:
> insert overwrite table test2 select * from test3;
> You will get an error like this:
> hive> set hive.exec.stagingdir=./opq8;
> hive> insert overwrite table test2 select * from test3;
> Query ID = mr_20170515134831_28ee392d-0d5a-4e47-b80c-dfcd31691b02
> Total jobs = 3
> Launching Job 1 out of 3
> Number of reduce tasks is set to 0 since there's no reduce operator
> Starting Job = job_1494818119523_0008, Tracking URL = http://zdh77:8088/proxy/application_1494818119523_0008/
> Kill Command = /opt/ZDH/parcels/lib/hadoop/bin/hadoop job -kill job_1494818119523_0008
> Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 0
> 2017-05-15 13:48:51,487 Stage-1 map = 0%, reduce = 0%
> Ended Job = job_1494818119523_0008
> Stage-3 is selected by condition resolver.
> Stage-2 is filtered out by condition resolver.
> Stage-4 is filtered out by condition resolver.
> Moving data to directory hdfs://nameservice/hive/test2/opqt8_hive_2017-05-15_13-48-31_558_6151032330134038151-1/-ext-1
> Loading data to table default.test2
> Moved: 'hdfs://nameservice/hive/test2/opqt8_hive_2017-05-15_13-48-31_558_6151032330134038151-1' to trash at: hdfs://nameservice/user/mr/.Trash/Current
> Failed with exception Unable to move source hdfs://nameservice/hive/test2/opqt8_hive_2017-05-15_13-48-31_558_6151032330134038151-1/-ext-1 to destination hdfs://nameservice/hive/test2
> FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask. Unable to move source hdfs://nameservice/hive/test2/opqt8_hive_2017-05-15_13-48-31_558_6151032330134038151-1/-ext-1 to destination hdfs://nameservice/hive/test2
> MapReduce Jobs Launched:
> Stage-Stage-1: HDFS Read: 0 HDFS Write: 0 SUCCESS
> Total MapReduce CPU Time Spent: 0 msec
> hive>
>
> hive.exec.stagingdir=./opq8 is a path relative to the destination write directory /hive/test2, so Hive creates a temporary directory /hive/test2/opq8_hive* for the intermediate query results. Later, in the move stage, Hive deletes or trashes every subdirectory of /hive/test2 whose name does not begin with "_" or "." in order to move data into that directory; see the logic in org.apache.hadoop.hive.ql.metadata.Hive#trashFilesUnderDir.
> My fix: if the staging directory is a subdirectory of the destination write directory, prepend a "." to its name. The temporary directory then becomes /hive/test2/.opq8_hive*, and because .opq8_hive* starts with ".", Hive no longer deletes it:
>
> hive> set hive.exec.stagingdir=./opq8;
> hive> insert overwrite table test2 select * from test3;
> Query ID = mr_20170515143940_ae48a65e-42be-4f50-b974-b713ca902867
> Total jobs = 3
> Launching Job 1 out of 3
> Number of reduce tasks is set to 0 since there's no reduce operator
> Starting Job = job_1494818119523_0012, Tracking URL = http://zdh77:8088/proxy/application_1494818119523_0012/
> Kill Command = /opt/ZDH/parcels/lib/hadoop/bin/hadoop job -kill job_1494818119523_0012
> Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 0
> 2017-05-15 14:40:04,547 Stage-1 map = 0%, reduce = 0%
> Ended Job = job_1494818119523_0012
> Stage-3 is selected by condition resolver.
> Stage-2 is filtered out by condition resolver.
> Stage-4 is filtered out by condition resolver.
> Moving data to directory hdfs://nameservice/hive/test2/.opqt8_hive_2017-05-15_14-39-40_751_1221840798987515724-1/-ext-1
> Loading data to table default.test2
> MapReduce Jobs Launched:
> Stage-Stage-1: HDFS Read: 0 HDFS Write: 0 SUCCESS
> Total MapReduce CPU Time Spent: 0 msec
> OK
> Time taken: 26.751 seconds
> hive>

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
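The deletion rule and the proposed fix described in the comment above can be sketched as follows. This is a hypothetical Python sketch, not Hive's actual Java API; the function names `should_preserve` and `resolve_staging_dir` are illustrative.

```python
def should_preserve(name: str) -> bool:
    # Per the comment above, the cleanup pass keeps only entries whose
    # names start with "_" or "." and trashes everything else under the
    # destination directory.
    return name.startswith("_") or name.startswith(".")


def resolve_staging_dir(staging_dir: str, dest_dir: str) -> str:
    # Sketch of the proposed fix: when hive.exec.stagingdir is relative
    # (i.e. the staging dir lands inside the destination directory),
    # prefix its name with "." so should_preserve() spares it.
    if staging_dir.startswith("./"):
        name = staging_dir[2:]
        if not name.startswith("."):
            name = "." + name
        return dest_dir.rstrip("/") + "/" + name
    # Absolute staging dirs are left untouched.
    return staging_dir
```

With hive.exec.stagingdir=./opq8 and destination /hive/test2, the staging directory resolves to /hive/test2/.opq8, which the cleanup pass skips.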
[jira] [Commented] (HIVE-16666) Setting hive.exec.stagingdir to a relative directory or a subdirectory of the destination data directory causes Hive to delete the intermediate query results
[ https://issues.apache.org/jira/browse/HIVE-16666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16015098#comment-16015098 ]

yangfang commented on HIVE-16666:
---------------------------------

Hello [~pvary], [~aihuaxu], could you please review? Thanks!
[jira] [Updated] (HIVE-16666) Setting hive.exec.stagingdir to a relative directory or a subdirectory of the destination data directory causes Hive to delete the intermediate query results
[ https://issues.apache.org/jira/browse/HIVE-16666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

yangfang updated HIVE-16666:
----------------------------
    Affects Version/s: (was: 2.1.1)
                       3.0.0
[jira] [Updated] (HIVE-16666) Setting hive.exec.stagingdir to a relative directory or a subdirectory of the destination data directory causes Hive to delete the intermediate query results
[ https://issues.apache.org/jira/browse/HIVE-16666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

yangfang updated HIVE-16666:
----------------------------
    Status: Open  (was: Patch Available)
[jira] [Updated] (HIVE-16666) Setting hive.exec.stagingdir to a relative directory or a subdirectory of the destination data directory causes Hive to delete the intermediate query results
[ https://issues.apache.org/jira/browse/HIVE-16666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

yangfang updated HIVE-16666:
----------------------------
    Status: Patch Available  (was: Open)
[jira] [Updated] (HIVE-16666) Setting hive.exec.stagingdir to a relative directory or a subdirectory of the destination data directory causes Hive to delete the intermediate query results
[ https://issues.apache.org/jira/browse/HIVE-16666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

yangfang updated HIVE-16666:
----------------------------
    Attachment: HIVE-16666.1.patch
[jira] [Updated] (HIVE-16666) Setting hive.exec.stagingdir to a relative directory or a subdirectory of the destination data directory causes Hive to delete the intermediate query results
[ https://issues.apache.org/jira/browse/HIVE-16666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

yangfang updated HIVE-16666:
----------------------------
    Attachment: (was: HIVE-16666.1.patch)
[jira] [Updated] (HIVE-16666) Setting hive.exec.stagingdir to a relative directory or a subdirectory of the destination data directory causes Hive to delete the intermediate query results
[ https://issues.apache.org/jira/browse/HIVE-1?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

yangfang updated HIVE-1:
Description:

Set hive.exec.stagingdir=./*, for example set hive.exec.stagingdir=./opq8, then execute a query like this:

insert overwrite table test2 select * from test3;

You will get an error like this:

hive> set hive.exec.stagingdir=./opq8;
hive> insert overwrite table test2 select * from test3;
Query ID = mr_20170515134831_28ee392d-0d5a-4e47-b80c-dfcd31691b02
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1494818119523_0008, Tracking URL = http://zdh77:8088/proxy/application_1494818119523_0008/
Kill Command = /opt/ZDH/parcels/lib/hadoop/bin/hadoop job -kill job_1494818119523_0008
Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 0
2017-05-15 13:48:51,487 Stage-1 map = 0%, reduce = 0%
Ended Job = job_1494818119523_0008
Stage-3 is selected by condition resolver.
Stage-2 is filtered out by condition resolver.
Stage-4 is filtered out by condition resolver.
Moving data to directory hdfs://nameservice/hive/test2/opqt8_hive_2017-05-15_13-48-31_558_6151032330134038151-1/-ext-1
Loading data to table default.test2
Moved: 'hdfs://nameservice/hive/test2/opqt8_hive_2017-05-15_13-48-31_558_6151032330134038151-1' to trash at: hdfs://nameservice/user/mr/.Trash/Current
Failed with exception Unable to move source hdfs://nameservice/hive/test2/opqt8_hive_2017-05-15_13-48-31_558_6151032330134038151-1/-ext-1 to destination hdfs://nameservice/hive/test2
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask. Unable to move source hdfs://nameservice/hive/test2/opqt8_hive_2017-05-15_13-48-31_558_6151032330134038151-1/-ext-1 to destination hdfs://nameservice/hive/test2
MapReduce Jobs Launched:
Stage-Stage-1: HDFS Read: 0 HDFS Write: 0 SUCCESS
Total MapReduce CPU Time Spent: 0 msec
hive>

hive.exec.stagingdir=./opq8 is a path relative to the destination write directory /hive/test2, so Hive creates a temporary directory /hive/test2/opq8_hive* for intermediate query results. Later, in the move stage, Hive deletes or trashes every subdirectory under /hive/test2 whose name does not begin with "_" or "." in order to move data into this directory. You can see this logic in org.apache.hadoop.hive.ql.metadata.trashFilesUnderDir.

My fix: if stagingdir is a subdirectory of the destination write directory, I add a "." in front of stagingdir, so the temporary directory becomes /hive/test2/.opq8_hive*. Because the subdirectory .opq8_hive* starts with ".", Hive will not delete it.

hive> set hive.exec.stagingdir=./opq8;
hive> insert overwrite table test2 select * from test3;
Query ID = mr_20170515143940_ae48a65e-42be-4f50-b974-b713ca902867
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1494818119523_0012, Tracking URL = http://zdh77:8088/proxy/application_1494818119523_0012/
Kill Command = /opt/ZDH/parcels/lib/hadoop/bin/hadoop job -kill job_1494818119523_0012
Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 0
2017-05-15 14:40:04,547 Stage-1 map = 0%, reduce = 0%
Ended Job = job_1494818119523_0012
Stage-3 is selected by condition resolver.
Stage-2 is filtered out by condition resolver.
Stage-4 is filtered out by condition resolver.
Moving data to directory hdfs://nameservice/hive/test2/.opqt8_hive_2017-05-15_14-39-40_751_1221840798987515724-1/-ext-1
Loading data to table default.test2
MapReduce Jobs Launched:
Stage-Stage-1: HDFS Read: 0 HDFS Write: 0 SUCCESS
Total MapReduce CPU Time Spent: 0 msec
OK
Time taken: 26.751 seconds
hive>
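The cleanup rule described in the report (subdirectories whose names do not begin with "_" or "." are trashed before the final move) can be sketched as a standalone snippet. This is an illustrative sketch of the behavior attributed to trashFilesUnderDir, not Hive's actual implementation; the class and method names here are made up:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the cleanup rule this issue describes: before moving query
// results into the destination directory, Hive trashes every
// subdirectory whose name does not start with "_" or ".". Not real
// Hive code; names are hypothetical.
public class StagingCleanupSketch {

    // true if a directory with this name would be deleted/trashed
    public static boolean wouldBeTrashed(String name) {
        return !name.startsWith("_") && !name.startsWith(".");
    }

    // returns the directory names that survive the cleanup pass
    public static List<String> survivors(List<String> names) {
        List<String> kept = new ArrayList<>();
        for (String n : names) {
            if (!wouldBeTrashed(n)) {
                kept.add(n);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // the staging dir from the report has no "." prefix, so the
        // cleanup pass trashes it -- along with the query results
        System.out.println(wouldBeTrashed("opq8_hive_2017-05-15"));  // true
        System.out.println(wouldBeTrashed(".opq8_hive_2017-05-15")); // false
    }
}
```

This makes the failure mode concrete: the unprefixed staging directory matches the "trash it" branch, which is why the intermediate results disappear before the move completes.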
[jira] [Updated] (HIVE-16666) Set hive.exec.stagingdir to a relative directory or a subdirectory of the destination data directory will cause Hive to delete the intermediate query results
[ https://issues.apache.org/jira/browse/HIVE-1?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

yangfang updated HIVE-1:
Summary: Set hive.exec.stagingdir to a relative directory or a subdirectory of the destination data directory will cause Hive to delete the intermediate query results (was: Set hive.exec.stagingdir=./* will cause Hive to delete the intermediate query results)

> Set hive.exec.stagingdir to a relative directory or a subdirectory of the
> destination data directory will cause Hive to delete the intermediate query
> results
> ---
>
> Key: HIVE-1
> URL: https://issues.apache.org/jira/browse/HIVE-1
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Affects Versions: 2.1.1
> Reporter: yangfang
> Assignee: yangfang
> Priority: Critical
> Attachments: HIVE-1.1.patch
>
> Set hive.exec.stagingdir=./*, for example set hive.exec.stagingdir=./opq8,
> then execute a query like this:
> insert overwrite table test2 select * from test3;
> You will get an error like this:
> hive> set hive.exec.stagingdir=./opq8;
> hive> insert overwrite table test2 select * from test3;
> Query ID = mr_20170515134831_28ee392d-0d5a-4e47-b80c-dfcd31691b02
> Total jobs = 3
> Launching Job 1 out of 3
> Number of reduce tasks is set to 0 since there's no reduce operator
> Starting Job = job_1494818119523_0008, Tracking URL =
> http://zdh77:8088/proxy/application_1494818119523_0008/
> Kill Command = /opt/ZDH/parcels/lib/hadoop/bin/hadoop job -kill
> job_1494818119523_0008
> Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 0
> 2017-05-15 13:48:51,487 Stage-1 map = 0%, reduce = 0%
> Ended Job = job_1494818119523_0008
> Stage-3 is selected by condition resolver.
> Stage-2 is filtered out by condition resolver.
> Stage-4 is filtered out by condition resolver.
> Moving data to directory
> hdfs://nameservice/hive/test2/opqt8_hive_2017-05-15_13-48-31_558_6151032330134038151-1/-ext-1
> Loading data to table default.test2
> Moved:
> 'hdfs://nameservice/hive/test2/opqt8_hive_2017-05-15_13-48-31_558_6151032330134038151-1'
> to trash at: hdfs://nameservice/user/mr/.Trash/Current
> Failed with exception Unable to move source
> hdfs://nameservice/hive/test2/opqt8_hive_2017-05-15_13-48-31_558_6151032330134038151-1/-ext-1
> to destination hdfs://nameservice/hive/test2
> FAILED: Execution Error, return code 1 from
> org.apache.hadoop.hive.ql.exec.MoveTask. Unable to move source
> hdfs://nameservice/hive/test2/opqt8_hive_2017-05-15_13-48-31_558_6151032330134038151-1/-ext-1
> to destination hdfs://nameservice/hive/test2
> MapReduce Jobs Launched:
> Stage-Stage-1: HDFS Read: 0 HDFS Write: 0 SUCCESS
> Total MapReduce CPU Time Spent: 0 msec
> hive>
>
> I set hive.exec.stagingdir=./opq8, which is a path relative to the destination
> write directory /hive/test2. Hive will create a temporary directory
> /hive/test2/opq8_hive* for intermediate query results. Later, in the move
> stage, Hive will delete or trash any subdirectory under /hive/test2 whose name
> does not begin with "_" or "." in order to move data into this directory. You
> can see this logic in org.apache.hadoop.hive.ql.metadata.trashFilesUnderDir.
> My fix: if stagingdir is a subdirectory of the destination write directory, I
> add a "." in front of stagingdir; the temporary directory then becomes
> /hive/test2/.opq8_hive*. Because the subdirectory .opq8_hive* starts with ".",
> Hive will not delete it.
> hive> set hive.exec.stagingdir=./opq8; > hive> insert overwrite table test2 select * from test3; > Query ID = mr_20170515143940_ae48a65e-42be-4f50-b974-b713ca902867 > Total jobs = 3 > Launching Job 1 out of 3 > Number of reduce tasks is set to 0 since there's no reduce operator > Starting Job = job_1494818119523_0012, Tracking URL = > http://zdh77:8088/proxy/application_1494818119523_0012/ > Kill Command = /opt/ZDH/parcels/lib/hadoop/bin/hadoop job -kill > job_1494818119523_0012 > Hadoop job information for Stage-1: number of mappers: 0; number of reducers: > 0 > 2017-05-15 14:40:04,547 Stage-1 map = 0%, reduce = 0% > Ended Job = job_1494818119523_0012 > Stage-3 is selected by condition resolver. > Stage-2 is filtered out by condition resolver. > Stage-4 is filtered out by condition resolver. > Moving data to directory > hdfs://nameservice/hive/test2/.opqt8_hive_2017-05-15_14-39-40_751_1221840798987515724-1/-ext-1 > Loading data to table default.test2 > MapReduce Jobs Launched: > Stage-Stage-1: HDFS Read: 0 HDFS Write: 0 SUCCESS > Total MapReduce CPU Time Spent: 0 msec > OK > Time taken: 26.751 seconds > hive> -- This message
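The proposed fix amounts to a small path transformation: prefix the staging directory's final path component with "." so the cleanup pass skips it. A minimal sketch of that idea follows; the class and method names are hypothetical, and this is not the actual HIVE-16666 patch:

```java
// Sketch of the fix described in this issue: when the configured
// stagingdir is a subdirectory of the destination directory, prefix
// its last path component with "." so the cleanup pass skips it.
// Names are hypothetical, not the real Hive patch.
public class StagingPrefixSketch {

    public static String hideStagingDir(String stagingDir) {
        int slash = stagingDir.lastIndexOf('/');
        String name = stagingDir.substring(slash + 1);
        // already exempt from the cleanup pass: leave it alone
        if (name.startsWith(".") || name.startsWith("_")) {
            return stagingDir;
        }
        return stagingDir.substring(0, slash + 1) + "." + name;
    }

    public static void main(String[] args) {
        System.out.println(hideStagingDir("./opq8"));  // ./.opq8
        System.out.println(hideStagingDir("./.opq8")); // ./.opq8 (unchanged)
    }
}
```

With this transformation, /hive/test2/opq8_hive* becomes /hive/test2/.opq8_hive*, matching the "Moving data to directory .../.opqt8_hive_*" line in the successful run above.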
[jira] [Updated] (HIVE-16666) Set hive.exec.stagingdir=./* will cause Hive to delete the intermediate query results
[ https://issues.apache.org/jira/browse/HIVE-1?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

yangfang updated HIVE-1:
Status: Patch Available (was: Open)

I have tested it locally, please review it.

> Set hive.exec.stagingdir=./* will cause Hive to delete the intermediate query
> results
> -
>
> Key: HIVE-1
> URL: https://issues.apache.org/jira/browse/HIVE-1
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Affects Versions: 2.1.1
> Reporter: yangfang
> Assignee: yangfang
> Priority: Critical
> Attachments: HIVE-1.1.patch

-- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16666) Set hive.exec.stagingdir=./* will cause Hive to delete the intermediate query results
[ https://issues.apache.org/jira/browse/HIVE-1?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

yangfang updated HIVE-1:
Attachment: HIVE-1.1.patch

> Set hive.exec.stagingdir=./* will cause Hive to delete the intermediate query
> results
> -
>
> Key: HIVE-1
> URL: https://issues.apache.org/jira/browse/HIVE-1
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Affects Versions: 2.1.1
> Reporter: yangfang
> Assignee: yangfang
> Priority: Critical
> Attachments: HIVE-1.1.patch

-- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (HIVE-16666) Set hive.exec.stagingdir=./* will cause Hive to delete the intermediate query results
[ https://issues.apache.org/jira/browse/HIVE-16666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yangfang reassigned HIVE-16666: --- > Set hive.exec.stagingdir=./* will cause Hive to delete the intermediate query > results > - > > Key: HIVE-16666 > URL: https://issues.apache.org/jira/browse/HIVE-16666 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 2.1.1 >Reporter: yangfang >Assignee: yangfang >Priority: Critical > > Set hive.exec.stagingdir=./*, for example set hive.exec.stagingdir=./opq8. > Then execute a query like this: > insert overwrite table test2 select * from test3; > You will get an error like this: > hive> set hive.exec.stagingdir=./opq8; > hive> insert overwrite table test2 select * from test3; > Query ID = mr_20170515134831_28ee392d-0d5a-4e47-b80c-dfcd31691b02 > Total jobs = 3 > Launching Job 1 out of 3 > Number of reduce tasks is set to 0 since there's no reduce operator > Starting Job = job_1494818119523_0008, Tracking URL = > http://zdh77:8088/proxy/application_1494818119523_0008/ > Kill Command = /opt/ZDH/parcels/lib/hadoop/bin/hadoop job -kill > job_1494818119523_0008 > Hadoop job information for Stage-1: number of mappers: 0; number of reducers: > 0 > 2017-05-15 13:48:51,487 Stage-1 map = 0%, reduce = 0% > Ended Job = job_1494818119523_0008 > Stage-3 is selected by condition resolver. > Stage-2 is filtered out by condition resolver. > Stage-4 is filtered out by condition resolver. 
> Moving data to directory > hdfs://nameservice/hive/test2/opqt8_hive_2017-05-15_13-48-31_558_6151032330134038151-1/-ext-1 > Loading data to table default.test2 > Moved: > 'hdfs://nameservice/hive/test2/opqt8_hive_2017-05-15_13-48-31_558_6151032330134038151-1' > to trash at: hdfs://nameservice/user/mr/.Trash/Current > Failed with exception Unable to move source > hdfs://nameservice/hive/test2/opqt8_hive_2017-05-15_13-48-31_558_6151032330134038151-1/-ext-1 > to destination hdfs://nameservice/hive/test2 > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.MoveTask. Unable to move source > hdfs://nameservice/hive/test2/opqt8_hive_2017-05-15_13-48-31_558_6151032330134038151-1/-ext-1 > to destination hdfs://nameservice/hive/test2 > MapReduce Jobs Launched: > Stage-Stage-1: HDFS Read: 0 HDFS Write: 0 SUCCESS > Total MapReduce CPU Time Spent: 0 msec > hive> > I set hive.exec.stagingdir=./opq8, which is a path relative to the destination write directory /hive/test2. Hive creates a temporary directory /hive/test2/opq8_hive* for the intermediate query results. Later, in the move stage, Hive deletes or trashes any subdirectory under /hive/test2 whose name does not begin with "_" or "." in order to move data into this directory. You can see this logic in org.apache.hadoop.hive.ql.metadata.trashFilesUnderDir. > My fix: if stagingdir is a subdirectory of the destination write directory, prepend a "." to stagingdir. The temporary directory then becomes /hive/test2/.opq8_hive*; because .opq8_hive* starts with ".", Hive will not delete it. 
> hive> set hive.exec.stagingdir=./opq8; > hive> insert overwrite table test2 select * from test3; > Query ID = mr_20170515143940_ae48a65e-42be-4f50-b974-b713ca902867 > Total jobs = 3 > Launching Job 1 out of 3 > Number of reduce tasks is set to 0 since there's no reduce operator > Starting Job = job_1494818119523_0012, Tracking URL = > http://zdh77:8088/proxy/application_1494818119523_0012/ > Kill Command = /opt/ZDH/parcels/lib/hadoop/bin/hadoop job -kill > job_1494818119523_0012 > Hadoop job information for Stage-1: number of mappers: 0; number of reducers: > 0 > 2017-05-15 14:40:04,547 Stage-1 map = 0%, reduce = 0% > Ended Job = job_1494818119523_0012 > Stage-3 is selected by condition resolver. > Stage-2 is filtered out by condition resolver. > Stage-4 is filtered out by condition resolver. > Moving data to directory > hdfs://nameservice/hive/test2/.opqt8_hive_2017-05-15_14-39-40_751_1221840798987515724-1/-ext-1 > Loading data to table default.test2 > MapReduce Jobs Launched: > Stage-Stage-1: HDFS Read: 0 HDFS Write: 0 SUCCESS > Total MapReduce CPU Time Spent: 0 msec > OK > Time taken: 26.751 seconds > hive> -- This message was sent by Atlassian JIRA (v6.3.15#6346)
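The cleanup behavior and the proposed fix described above can be sketched as follows. This is a minimal Python model, not Hive's actual Java implementation; the function and directory names are illustrative.

```python
# Names beginning with these prefixes are treated as hidden/system entries
# and are spared by the cleanup pass that precedes the final move.
HIDDEN_PREFIXES = ("_", ".")

def trash_files_under_dir(dest_entries):
    """Model of the cleanup pass: return only the entries that survive,
    i.e. those whose names start with "_" or "."."""
    return [name for name in dest_entries if name.startswith(HIDDEN_PREFIXES)]

def safe_staging_name(staging_dir):
    """Model of the proposed fix: prefix a staging directory that sits
    under the destination with "." so the cleanup pass skips it."""
    if not staging_dir.startswith(HIDDEN_PREFIXES):
        return "." + staging_dir
    return staging_dir

entries = ["opq8_hive_2017-05-15", "_tmp", ".hidden", "part-00000"]
print(trash_files_under_dir(entries))             # only "_tmp" and ".hidden" survive
print(safe_staging_name("opq8_hive_2017-05-15"))  # ".opq8_hive_2017-05-15"
```

With the original setting, the staging directory ("opq8_hive_*") is one of the non-hidden entries and gets trashed before its contents can be moved; after the prefix fix it survives the pass.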
[jira] [Updated] (HIVE-15911) Create a view based on another view throws an exception “FAILED: NullPointerException null”
[ https://issues.apache.org/jira/browse/HIVE-15911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yangfang updated HIVE-15911: Attachment: (was: HIVE-15911.patch) > Create a view based on another view throws an exception “FAILED: > NullPointerException null” > --- > > Key: HIVE-15911 > URL: https://issues.apache.org/jira/browse/HIVE-15911 > Project: Hive > Issue Type: Bug > Components: Views >Affects Versions: 2.1.0, 2.1.1 > Environment: hive2.1.0 >Reporter: yangfang >Assignee: yangfang > Attachments: 0001-Hive-15911.patch > > > When I create a new view based on another view I get an exception “FAILED: > NullPointerException null”: > hive> create view view2(a,b) as select a, b from view1; //view1 is another > view > FAILED: NullPointerException null > hive> > The hive log shows the error stack: > 2017-02-15T15:40:25,816 ERROR ql.Driver (SessionState.java:printError(1116)) > - FAILED: NullPointerException null > java.lang.NullPointerException > at org.apache.hadoop.hive.ql.Driver.doAuthorization(Driver.java:863) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:552) > at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1319) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1459) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1239) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1229) > at > org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184) > at > org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403) > at > org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at org.apache.hadoop.util.RunJar.run(RunJar.java:221) > at org.apache.hadoop.util.RunJar.main(RunJar.java:136) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15911) Create a view based on another view throws an exception “FAILED: NullPointerException null”
[ https://issues.apache.org/jira/browse/HIVE-15911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869636#comment-15869636 ] yangfang commented on HIVE-15911: - add Query Unit Test alter_view_as_select_view.q -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15911) Create a view based on another view throws an exception “FAILED: NullPointerException null”
[ https://issues.apache.org/jira/browse/HIVE-15911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yangfang updated HIVE-15911: Status: Patch Available (was: Open) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15911) Create a view based on another view throws an exception “FAILED: NullPointerException null”
[ https://issues.apache.org/jira/browse/HIVE-15911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yangfang updated HIVE-15911: Attachment: 0001-Hive-15911.patch -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15911) Create a view based on another view throws an exception “FAILED: NullPointerException null”
[ https://issues.apache.org/jira/browse/HIVE-15911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yangfang updated HIVE-15911: Status: Open (was: Patch Available) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15911) Create a view based on another view throws an exception “FAILED: NullPointerException null”
[ https://issues.apache.org/jira/browse/HIVE-15911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yangfang updated HIVE-15911: Summary: Create a view based on another view throws an exception “FAILED: NullPointerException null” (was: Creating a view based on another view throws an exception “FAILED: NullPointerException null”) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15911) Creating a view based on another view throws an exception “FAILED: NullPointerException null”
[ https://issues.apache.org/jira/browse/HIVE-15911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867504#comment-15867504 ] yangfang commented on HIVE-15911: - update the patch -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15911) Creating a view based on another view throws an exception “FAILED: NullPointerException null”
[ https://issues.apache.org/jira/browse/HIVE-15911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yangfang updated HIVE-15911: Description: when I create a new view based on another view I an exception “FAILED: NullPointerException null”: hive> create view view2(a,b) as select a, b from view1; //view1 is another view FAILED: NullPointerException null hive> The hive log show error stack: 2017-02-15T15:40:25,816 ERROR ql.Driver (SessionState.java:printError(1116)) - FAILED: NullPointerException null java.lang.NullPointerException at org.apache.hadoop.hive.ql.Driver.doAuthorization(Driver.java:863) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:552) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1319) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1459) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1239) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1229) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.run(RunJar.java:221) at org.apache.hadoop.util.RunJar.main(RunJar.java:136) was: when I create a new view based on another view I an exception “FAILED: NullPointerException null”: hive> create view view2(a,b) as select a, b from view1; //view1 is another view FAILED: NullPointerException null hive> The hive log show 
error stack: 2017-02-13T20:54:29,288 ERROR ql.Driver (:()) - FAILED: NullPointerException null java.lang.NullPointerException at org.apache.hadoop.hive.ql.Driver.doAuthorization(Driver.java:710) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:474) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:331) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1170) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1265) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1092) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1080) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.run(RunJar.java:221) at org.apache.hadoop.util.RunJar.main(RunJar.java:136) > Creating a view based on another view throws an exception “FAILED: > NullPointerException null” > - > > Key: HIVE-15911 > URL: https://issues.apache.org/jira/browse/HIVE-15911 > Project: Hive > Issue Type: Bug > Components: Views >Affects Versions: 2.1.0, 2.1.1 > Environment: hive2.1.0 >Reporter: yangfang >Assignee: yangfang > Attachments: HIVE-15911.patch > > > when I create a new view based on another view I an exception “FAILED: > NullPointerException null”: > hive> create view view2(a,b) as select a, b from view1; //view1 is another > view > FAILED: NullPointerException null > hive> > The hive log 
show error stack: > 2017-02-15T15:40:25,816 ERROR ql.Driver (SessionState.java:printError(1116)) > - FAILED: NullPointerException null > java.lang.NullPointerException > at org.apache.hadoop.hive.ql.Driver.doAuthorization(Driver.java:863) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:552) > at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1319) > at
[jira] [Updated] (HIVE-15911) Creating a view based on another view throws an exception “FAILED: NullPointerException null”
[ https://issues.apache.org/jira/browse/HIVE-15911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yangfang updated HIVE-15911: Attachment: HIVE-15911.patch -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15911) Creating a view based on another view throws an exception “FAILED: NullPointerException null”
[ https://issues.apache.org/jira/browse/HIVE-15911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yangfang updated HIVE-15911: Attachment: (was: HIVE-15911.1.patch) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Comment Edited] (HIVE-15911) Creating a view based on another view throws an exception “FAILED: NullPointerException null”
[ https://issues.apache.org/jira/browse/HIVE-15911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15865643#comment-15865643 ] yangfang edited comment on HIVE-15911 at 2/14/17 12:10 PM: --- In the doAuthorization method of the Driver: if (tbl.isView() && sem instanceof SemanticAnalyzer) { tab2Cols.put(tbl, sem.getColumnAccessInfo().getTableToColumnAccessMap().get(tbl.getCompleteName())); } sem.getColumnAccessInfo returns null, so the code throws the exception "NullPointerException null". The setColumnAccessInfo method needs to be called before this code runs. I call setColumnAccessInfo in the analyzeInternal method when processing the view creation. was (Author: yangfang): In the doAuthorization method of the Driver: if (tbl.isView() && sem instanceof SemanticAnalyzer) { tab2Cols.put(tbl, sem.getColumnAccessInfo().getTableToColumnAccessMap().get(tbl.getCompleteName())); } sem.getColumnAccessInfo returns null, so the code throw exception "NullPointerException null". It needs to call setColumnAccessInfo in the above method. 
-- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Comment Edited] (HIVE-15911) Creating a view based on another view throws an exception “FAILED: NullPointerException null”
[ https://issues.apache.org/jira/browse/HIVE-15911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15865643#comment-15865643 ] yangfang edited comment on HIVE-15911 at 2/14/17 11:51 AM:
---
In the doAuthorization method of the Driver:
if (tbl.isView() && sem instanceof SemanticAnalyzer) {
  tab2Cols.put(tbl, sem.getColumnAccessInfo().getTableToColumnAccessMap().get(tbl.getCompleteName()));
}
sem.getColumnAccessInfo() returns null here, so the chained call throws a NullPointerException. setColumnAccessInfo needs to be called before this method runs.
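The comment above describes a chained call, `sem.getColumnAccessInfo().getTableToColumnAccessMap()`, that dereferences a possibly-null intermediate result. As a minimal sketch of that failure mode and a null guard (the class names below are simplified stand-ins, not Hive's real types; the attached patch instead ensures setColumnAccessInfo is called during analysis so the accessor never returns null at authorization time):

```java
public class NullChainDemo {
    // Simplified stand-in for Hive's ColumnAccessInfo.
    static class ColumnAccessInfo {
        private final java.util.Map<String, String> tableToColumns = new java.util.HashMap<>();
        java.util.Map<String, String> getTableToColumnAccessMap() { return tableToColumns; }
    }

    // Simplified stand-in for the semantic analyzer: getColumnAccessInfo()
    // returns null unless setColumnAccessInfo was called during analysis.
    static class Analyzer {
        private ColumnAccessInfo info;
        ColumnAccessInfo getColumnAccessInfo() { return info; }
        void setColumnAccessInfo(ColumnAccessInfo i) { this.info = i; }
    }

    // Guarded version of the chained lookup: check the intermediate
    // result before dereferencing it.
    static String safeLookup(Analyzer sem, String completeName) {
        ColumnAccessInfo cai = sem.getColumnAccessInfo();
        return (cai == null) ? null : cai.getTableToColumnAccessMap().get(completeName);
    }

    public static void main(String[] args) {
        Analyzer sem = new Analyzer(); // setColumnAccessInfo never called
        try {
            // Unguarded chain, mirroring the bug: throws NullPointerException.
            sem.getColumnAccessInfo().getTableToColumnAccessMap().get("default@view1");
        } catch (NullPointerException e) {
            System.out.println("NPE from unguarded chain");
        }
        System.out.println("guarded lookup: " + safeLookup(sem, "default@view1"));
    }
}
```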
[jira] [Updated] (HIVE-15911) Creating a view based on another view throws an exception “FAILED: NullPointerException null”
[ https://issues.apache.org/jira/browse/HIVE-15911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yangfang updated HIVE-15911: Attachment: HIVE-15911.1.patch
[jira] [Updated] (HIVE-15911) Creating a view based on another view throws an exception “FAILED: NullPointerException null”
[ https://issues.apache.org/jira/browse/HIVE-15911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yangfang updated HIVE-15911: Assignee: yangfang Release Note: In the analyzeInternal method of the SemanticAnalyzer class, call the setColumnAccessInfo method to get column information. Status: Patch Available (was: Open)
[jira] [Updated] (HIVE-15862) beeline always have a warning "Hive does not support autoCommit=false"
[ https://issues.apache.org/jira/browse/HIVE-15862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yangfang updated HIVE-15862: Description: When we use Beeline, there is always a warning: "Request to set autoCommit to false; Hive does not support autoCommit=false". For example, when we use Beeline to execute a SQL statement: beeline -u "jdbc:hive2://localhost:1/default" -n mr -e "show tables" --hivevar taskNO=10111, we get this warning:
Connecting to jdbc:hive2://localhost:1/default
Connected to: Apache Hive (version 2.1.0-zdh2.7.3)
Driver: Hive JDBC (version 2.1.0-zdh2.7.3)
17/02/10 15:10:10 [main]: WARN jdbc.HiveConnection: Request to set autoCommit to false; Hive does not support autoCommit=false.
Looking at the Hive source code, I found that BeeLineOpts sets the autoCommit default value to false, which always triggers this warning when Beeline connects to HiveServer2 in DatabaseConnection.connect: getConnection().setAutoCommit(beeLine.getOpts().getAutoCommit());

> beeline always have a warning "Hive does not support autoCommit=false"
> --
>
> Key: HIVE-15862
> URL: https://issues.apache.org/jira/browse/HIVE-15862
> Project: Hive
> Issue Type: Bug
> Components: Beeline
> Affects Versions: 2.1.0
> Environment: hive2.1.0
> Reporter: yangfang
> Assignee: yangfang
> Attachments: HIVE-15862.1.patch
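The description pins the warning on an unconditional `setAutoCommit(false)` call at connect time, driven by a default of false in BeeLineOpts. A minimal sketch of the two fixes discussed on the issue, defaulting the option to true and only calling the setter when the requested value differs from the connection's current one (the class names are illustrative stand-ins, not Beeline's actual classes):

```java
public class AutoCommitDemo {
    // Stand-in for a JDBC connection that, like Hive's, only supports autoCommit=true.
    static class HiveLikeConnection {
        private boolean autoCommit = true;
        private int warnings = 0;
        void setAutoCommit(boolean value) {
            if (!value) {
                warnings++; // Hive logs "does not support autoCommit=false" here
                return;     // and keeps autoCommit=true
            }
            autoCommit = value;
        }
        boolean getAutoCommit() { return autoCommit; }
        int warningCount() { return warnings; }
    }

    // Guarded connect logic: skip the setter when nothing would change,
    // so a default of true never triggers the warning.
    static void connect(HiveLikeConnection conn, boolean requestedAutoCommit) {
        if (conn.getAutoCommit() != requestedAutoCommit) {
            conn.setAutoCommit(requestedAutoCommit);
        }
    }

    public static void main(String[] args) {
        HiveLikeConnection unguarded = new HiveLikeConnection();
        unguarded.setAutoCommit(false); // old behavior: default false, always warns

        HiveLikeConnection guarded = new HiveLikeConnection();
        connect(guarded, true);         // new behavior: default true, no warning

        System.out.println(unguarded.warningCount() + " " + guarded.warningCount());
    }
}
```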
[jira] [Updated] (HIVE-15862) beeline always have a warning "Hive does not support autoCommit=false"
[ https://issues.apache.org/jira/browse/HIVE-15862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yangfang updated HIVE-15862: Attachment: HIVE-15862.1.patch This change eliminates the warning.
[jira] [Updated] (HIVE-15862) beeline always have a warning "Hive does not support autoCommit=false"
[ https://issues.apache.org/jira/browse/HIVE-15862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yangfang updated HIVE-15862: Assignee: yangfang Release Note: set the autoCommit default value to true in BeeLineOpts Target Version/s: 2.1.1, 2.1.0 Status: Patch Available (was: Open)
[jira] [Updated] (HIVE-12877) Hive use index for queries will lose some data if the Query file is compressed.
[ https://issues.apache.org/jira/browse/HIVE-12877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yangfang updated HIVE-12877: Attachment: 19-index_compressed_file.gz index_query_compressed_file_failure.q HIVE-12877.1.patch

> Hive use index for queries will lose some data if the Query file is compressed.
> ---
>
> Key: HIVE-12877
> URL: https://issues.apache.org/jira/browse/HIVE-12877
> Project: Hive
> Issue Type: Bug
> Components: Indexing
> Affects Versions: 1.2.1
> Environment: This problem exists in all Hive versions, no matter what platform
> Reporter: yangfang
> Attachments: 19-index_compressed_file.gz, HIVE-12877.1.patch, HIVE-12877.patch, index_query_compressed_file_failure.q
>
> Hive creates the index using the extracted (uncompressed) file length when the file is compressed, but when MapReduce divides the data into splits, Hive compares the on-disk file length with the extracted file length. If it finds that the two lengths do not match, it filters out the file, so the query loses some data.
> I modified the source code to make the Hive index usable when the files are compressed; please test it.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12877) Hive use index for queries will lose some data if the Query file is compressed.
[ https://issues.apache.org/jira/browse/HIVE-12877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15125875#comment-15125875 ] yangfang commented on HIVE-12877:
-
I have added a .q test in index_query_compressed_file_failure.q; 19-index_compressed_file.gz is the compressed file used by the test.
[jira] [Commented] (HIVE-12877) Hive use index for queries will lose some data if the Query file is compressed.
[ https://issues.apache.org/jira/browse/HIVE-12877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15106046#comment-15106046 ] yangfang commented on HIVE-12877:
-
None of the test failures here are related or regressions. Could you please take a quick look at this patch?
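The HIVE-12877 description says the index records the extracted (uncompressed) length while the splitter compares it against the on-disk length, so compressed files never match and are silently dropped. A hedged sketch of that filter and a compressed-aware relaxation; the field names and the suffix-based codec check are illustrative assumptions, not Hive's actual split logic:

```java
public class IndexSplitFilterDemo {
    static class IndexedFile {
        final String path;
        final long onDiskLength;    // file size as stored in HDFS (compressed)
        final long extractedLength; // uncompressed size recorded in the index
        IndexedFile(String path, long onDisk, long extracted) {
            this.path = path; this.onDiskLength = onDisk; this.extractedLength = extracted;
        }
        // Illustrative codec detection; real code would consult a compression codec factory.
        boolean isCompressed() { return path.endsWith(".gz"); }
    }

    // Buggy filter: keeps a file only when the two lengths match exactly,
    // which drops every compressed file from the query result.
    static java.util.List<String> filterStrict(java.util.List<IndexedFile> files) {
        java.util.List<String> kept = new java.util.ArrayList<>();
        for (IndexedFile f : files)
            if (f.onDiskLength == f.extractedLength) kept.add(f.path);
        return kept;
    }

    // Relaxed filter: skip the length comparison for compressed files.
    static java.util.List<String> filterCompressedAware(java.util.List<IndexedFile> files) {
        java.util.List<String> kept = new java.util.ArrayList<>();
        for (IndexedFile f : files)
            if (f.isCompressed() || f.onDiskLength == f.extractedLength) kept.add(f.path);
        return kept;
    }

    public static void main(String[] args) {
        java.util.List<IndexedFile> files = new java.util.ArrayList<>();
        files.add(new IndexedFile("part-0.txt", 4096, 4096));
        files.add(new IndexedFile("part-1.gz", 1024, 4096)); // compressed: lengths differ
        System.out.println(filterStrict(files));          // loses part-1.gz
        System.out.println(filterCompressedAware(files)); // keeps both files
    }
}
```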
[jira] [Updated] (HIVE-12653) The property "serialization.encoding" in the class "org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe" does not work
[ https://issues.apache.org/jira/browse/HIVE-12653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yangfang updated HIVE-12653: Attachment: HIVE-12653.3.patch

> The property "serialization.encoding" in the class "org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe" does not work
> ---
>
> Key: HIVE-12653
> URL: https://issues.apache.org/jira/browse/HIVE-12653
> Project: Hive
> Issue Type: Improvement
> Components: Contrib
> Affects Versions: 1.2.1
> Reporter: yangfang
> Assignee: yangfang
> Attachments: HIVE-12653.2.patch, HIVE-12653.3.patch, HIVE-12653.patch, HIVE-12653.patch
>
> When I create a table with ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' and load some files containing Chinese text encoded in GBK:
> create table PersonInfo (cod_fn_ent string, num_seq_trc_form string, date_tr string,
> num_jrn_no string, cod_trc_form_typ string, id_intl_ip string, name string)
> ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe'
> WITH SERDEPROPERTIES ("field.delim"="|!", "serialization.encoding"='GBK');
> load data local inpath '/home/mr/hive/99-BoEing-IF_PMT_NOTE-2G-20151019-0' overwrite into table PersonInfo;
> I found garbled Chinese characters in the table: 'serialization.encoding' does not take effect. The garbled data is listed below:
> |
>
> 9999�ϴ���
> 0624624002��ʱ��
>
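The garbled output above is what decoding GBK bytes with the wrong charset looks like. As a self-contained illustration of the failure and the fix (this is not Hive's actual SerDe code; the method and property-lookup shape are assumptions, though the property key `serialization.encoding` is the one from the issue):

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

public class SerdeEncodingDemo {
    // Decode a raw row using the charset named by the table property,
    // falling back to UTF-8 when the property is absent.
    static String decodeRow(byte[] raw, Properties tableProps) {
        String name = tableProps.getProperty("serialization.encoding", "UTF-8");
        return new String(raw, Charset.forName(name));
    }

    public static void main(String[] args) {
        // "中文" ("Chinese") encoded in GBK, as in the reporter's data files.
        byte[] gbkBytes = "中文".getBytes(Charset.forName("GBK"));

        // Bug: decoding GBK bytes as UTF-8 yields replacement characters (mojibake).
        System.out.println(new String(gbkBytes, StandardCharsets.UTF_8));

        // Fix: honor the declared encoding from the SERDEPROPERTIES.
        Properties props = new Properties();
        props.setProperty("serialization.encoding", "GBK");
        System.out.println(decodeRow(gbkBytes, props)); // prints 中文
    }
}
```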
[jira] [Commented] (HIVE-12653) The property "serialization.encoding" in the class "org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe" does not work
[ https://issues.apache.org/jira/browse/HIVE-12653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15055851#comment-15055851 ] yangfang commented on HIVE-12653: - Thanks very much; I have repackaged the patch. > The property "serialization.encoding" in the class > "org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe" does not work > --- > > Key: HIVE-12653 > URL: https://issues.apache.org/jira/browse/HIVE-12653 > Project: Hive > Issue Type: Improvement > Components: Contrib > Affects Versions: 1.2.1 > Reporter: yangfang > Assignee: yangfang > Attachments: HIVE-12653.2.patch, HIVE-12653.3.patch, HIVE-12653.patch, HIVE-12653.patch > > > When I create a table with ROW FORMAT SERDE > 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' and load files containing Chinese text encoded in GBK: > create table PersonInfo (cod_fn_ent string, num_seq_trc_form string, date_tr string, > num_jrn_no string, cod_trc_form_typ string, id_intl_ip string, name string) > ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' > WITH SERDEPROPERTIES ("field.delim"="|!", "serialization.encoding"='GBK'); > load data local inpath > '/home/mr/hive/99-BoEing-IF_PMT_NOTE-2G-20151019-0' overwrite into table > PersonInfo; > I found garbled Chinese characters in the table; 'serialization.encoding' > does not take effect. The garbled data is listed below: > | > > 9999�ϴ��� > 0624624002��ʱ�� > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12653) The property "serialization.encoding" in the class "org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe" does not work
[ https://issues.apache.org/jira/browse/HIVE-12653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yangfang updated HIVE-12653: Attachment: HIVE-12653.2.patch > The property "serialization.encoding" in the class > "org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe" does not work > --- > > Key: HIVE-12653 > URL: https://issues.apache.org/jira/browse/HIVE-12653 > Project: Hive > Issue Type: Improvement > Components: Contrib > Affects Versions: 1.2.1 > Reporter: yangfang > Assignee: yangfang > Attachments: HIVE-12653.2.patch, HIVE-12653.patch, HIVE-12653.patch > > > When I create a table with ROW FORMAT SERDE > 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' and load files containing Chinese text encoded in GBK: > create table PersonInfo (cod_fn_ent string, num_seq_trc_form string, date_tr string, > num_jrn_no string, cod_trc_form_typ string, id_intl_ip string, name string) > ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' > WITH SERDEPROPERTIES ("field.delim"="|!", "serialization.encoding"='GBK'); > load data local inpath > '/home/mr/hive/99-BoEing-IF_PMT_NOTE-2G-20151019-0' overwrite into table > PersonInfo; > I found garbled Chinese characters in the table; 'serialization.encoding' > does not take effect. The garbled data is listed below: > | > > 9999�ϴ��� > 0624624002��ʱ�� > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12653) The property "serialization.encoding" in the class "org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe" does not work
[ https://issues.apache.org/jira/browse/HIVE-12653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15055657#comment-15055657 ] yangfang commented on HIVE-12653: - OK, thank you for the guidance; I have modified the code and tested it. > The property "serialization.encoding" in the class > "org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe" does not work > --- > > Key: HIVE-12653 > URL: https://issues.apache.org/jira/browse/HIVE-12653 > Project: Hive > Issue Type: Improvement > Components: Contrib > Affects Versions: 1.2.1 > Reporter: yangfang > Assignee: yangfang > Attachments: HIVE-12653.2.patch, HIVE-12653.patch, HIVE-12653.patch > > > When I create a table with ROW FORMAT SERDE > 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' and load files containing Chinese text encoded in GBK: > create table PersonInfo (cod_fn_ent string, num_seq_trc_form string, date_tr string, > num_jrn_no string, cod_trc_form_typ string, id_intl_ip string, name string) > ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' > WITH SERDEPROPERTIES ("field.delim"="|!", "serialization.encoding"='GBK'); > load data local inpath > '/home/mr/hive/99-BoEing-IF_PMT_NOTE-2G-20151019-0' overwrite into table > PersonInfo; > I found garbled Chinese characters in the table; 'serialization.encoding' > does not take effect. The garbled data is listed below: > | > > 9999�ϴ��� > 0624624002��ʱ�� > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12653) The property "serialization.encoding" in the class "org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe" does not work
[ https://issues.apache.org/jira/browse/HIVE-12653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15055390#comment-15055390 ] yangfang commented on HIVE-12653: - The method deserialize(Writable) in AbstractEncodingAwareSerDe is final, so it cannot be overridden. > The property "serialization.encoding" in the class > "org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe" does not work > --- > > Key: HIVE-12653 > URL: https://issues.apache.org/jira/browse/HIVE-12653 > Project: Hive > Issue Type: Improvement > Components: Contrib > Affects Versions: 1.2.1 > Reporter: yangfang > Assignee: yangfang > Attachments: HIVE-12653.patch, HIVE-12653.patch > > > When I create a table with ROW FORMAT SERDE > 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' and load files containing Chinese text encoded in GBK: > create table PersonInfo (cod_fn_ent string, num_seq_trc_form string, date_tr string, > num_jrn_no string, cod_trc_form_typ string, id_intl_ip string, name string) > ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' > WITH SERDEPROPERTIES ("field.delim"="|!", "serialization.encoding"='GBK'); > load data local inpath > '/home/mr/hive/99-BoEing-IF_PMT_NOTE-2G-20151019-0' overwrite into table > PersonInfo; > I found garbled Chinese characters in the table; 'serialization.encoding' > does not take effect. The garbled data is listed below: > | > > 9999�ϴ��� > 0624624002��ʱ�� > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
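The constraint mentioned in the comment is the template-method pattern: the base class pins down the public `deserialize` and subclasses plug into a protected hook instead, so the base class alone owns the charset transform. A Python analogy (not Hive's actual code; class and method names are simplified stand-ins):

```python
# Template-method analogy for an encoding-aware SerDe. The public
# deserialize() is treated as "final": it decodes the raw bytes using the
# declared encoding, then delegates to the do_deserialize() hook that a
# concrete SerDe overrides.

class EncodingAwareSerDe:
    def __init__(self, encoding="UTF-8"):
        self.encoding = encoding

    def deserialize(self, blob: bytes):       # "final": never overridden
        text = blob.decode(self.encoding)     # transform from the declared
        return self.do_deserialize(text)      # encoding happens here, once

    def do_deserialize(self, text: str):      # the intended extension point
        raise NotImplementedError

class MultiDelimitSerDe(EncodingAwareSerDe):
    """Splits a row on a multi-character delimiter after decoding."""
    def __init__(self, delim, encoding="UTF-8"):
        super().__init__(encoding)
        self.delim = delim

    def do_deserialize(self, text):
        return text.split(self.delim)

row = "9999|!上次的".encode("gbk")
serde = MultiDelimitSerDe("|!", encoding="gbk")
print(serde.deserialize(row))  # -> ['9999', '上次的']
```

Because only the hook is overridable, a subclass cannot bypass or forget the encoding step; conversely, if a SerDe does not extend the encoding-aware base class at all (the situation the issue describes), the declared encoding is simply never applied.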
[jira] [Updated] (HIVE-12653) The property "serialization.encoding" in the class "org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe" does not work
[ https://issues.apache.org/jira/browse/HIVE-12653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yangfang updated HIVE-12653: Attachment: HIVE-12653.patch > The property "serialization.encoding" in the class > "org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe" does not work > --- > > Key: HIVE-12653 > URL: https://issues.apache.org/jira/browse/HIVE-12653 > Project: Hive > Issue Type: Improvement > Components: Contrib > Affects Versions: 1.2.1 > Reporter: yangfang > Assignee: yangfang > Attachments: HIVE-12653.patch, HIVE-12653.patch > > > When I create a table with ROW FORMAT SERDE > 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' and load files containing Chinese text encoded in GBK: > create table PersonInfo (cod_fn_ent string, num_seq_trc_form string, date_tr string, > num_jrn_no string, cod_trc_form_typ string, id_intl_ip string, name string) > ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' > WITH SERDEPROPERTIES ("field.delim"="|!", "serialization.encoding"='GBK'); > load data local inpath > '/home/mr/hive/99-BoEing-IF_PMT_NOTE-2G-20151019-0' overwrite into table > PersonInfo; > I found garbled Chinese characters in the table; 'serialization.encoding' > does not take effect. The garbled data is listed below: > | > > 9999�ϴ��� > 0624624002��ʱ�� > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12653) The property "serialization.encoding" in the class "org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe" does not work
[ https://issues.apache.org/jira/browse/HIVE-12653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yangfang updated HIVE-12653: Description: When I create a table with ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' and load files containing Chinese text encoded in GBK: create table PersonInfo (cod_fn_ent string, num_seq_trc_form string, date_tr string, num_jrn_no string, cod_trc_form_typ string, id_intl_ip string, name string) ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' WITH SERDEPROPERTIES ("field.delim"="|!", "serialization.encoding"='GBK'); load data local inpath '/home/mr/hive/99-BoEing-IF_PMT_NOTE-2G-20151019-0' overwrite into table PersonInfo; I found garbled Chinese characters in the table; 'serialization.encoding' does not take effect, as listed below: | 9999�ϴ��� 0624624002��ʱ�� > The property "serialization.encoding" in the class > "org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe" does not work > --- > > Key: HIVE-12653 > URL: https://issues.apache.org/jira/browse/HIVE-12653 > Project: Hive > Issue Type: Improvement > Components: Contrib > Affects Versions: 1.2.1 > Reporter: yangfang > > When I create a table with ROW FORMAT SERDE > 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' and load files containing Chinese text encoded in GBK: > create table PersonInfo (cod_fn_ent string, num_seq_trc_form string, date_tr string, > num_jrn_no string, cod_trc_form_typ string, id_intl_ip string, name string) > ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' > WITH SERDEPROPERTIES ("field.delim"="|!", "serialization.encoding"='GBK'); > load data local inpath > '/home/mr/hive/99-BoEing-IF_PMT_NOTE-2G-20151019-0' overwrite into table > PersonInfo; > I found garbled Chinese characters in the table; 'serialization.encoding' > does not take effect, as listed below: > | > > 9999�ϴ��� > 0624624002��ʱ�� > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-12653) The property "serialization.encoding" in the class "org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe" does not work
[ https://issues.apache.org/jira/browse/HIVE-12653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yangfang reassigned HIVE-12653: --- Assignee: yangfang > The property "serialization.encoding" in the class > "org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe" does not work > --- > > Key: HIVE-12653 > URL: https://issues.apache.org/jira/browse/HIVE-12653 > Project: Hive > Issue Type: Improvement > Components: Contrib > Affects Versions: 1.2.1 > Reporter: yangfang > Assignee: yangfang > > When I create a table with ROW FORMAT SERDE > 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' and load files containing Chinese text encoded in GBK: > create table PersonInfo (cod_fn_ent string, num_seq_trc_form string, date_tr string, > num_jrn_no string, cod_trc_form_typ string, id_intl_ip string, name string) > ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' > WITH SERDEPROPERTIES ("field.delim"="|!", "serialization.encoding"='GBK'); > load data local inpath > '/home/mr/hive/99-BoEing-IF_PMT_NOTE-2G-20151019-0' overwrite into table > PersonInfo; > I found garbled Chinese characters in the table; 'serialization.encoding' > does not take effect, as listed below: > | > > 9999�ϴ��� > 0624624002��ʱ�� > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12653) The property "serialization.encoding" in the class "org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe" does not work
[ https://issues.apache.org/jira/browse/HIVE-12653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yangfang updated HIVE-12653: Attachment: HIVE-12653.patch > The property "serialization.encoding" in the class > "org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe" does not work > --- > > Key: HIVE-12653 > URL: https://issues.apache.org/jira/browse/HIVE-12653 > Project: Hive > Issue Type: Improvement > Components: Contrib > Affects Versions: 1.2.1 > Reporter: yangfang > Assignee: yangfang > Attachments: HIVE-12653.patch > > > When I create a table with ROW FORMAT SERDE > 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' and load files containing Chinese text encoded in GBK: > create table PersonInfo (cod_fn_ent string, num_seq_trc_form string, date_tr string, > num_jrn_no string, cod_trc_form_typ string, id_intl_ip string, name string) > ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' > WITH SERDEPROPERTIES ("field.delim"="|!", "serialization.encoding"='GBK'); > load data local inpath > '/home/mr/hive/99-BoEing-IF_PMT_NOTE-2G-20151019-0' overwrite into table > PersonInfo; > I found garbled Chinese characters in the table; 'serialization.encoding' > does not take effect, as listed below: > | > > 9999�ϴ��� > 0624624002��ʱ�� > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12653) The property "serialization.encoding" in the class "org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe" does not work
[ https://issues.apache.org/jira/browse/HIVE-12653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yangfang updated HIVE-12653: Description: When I create a table with ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' and load files containing Chinese text encoded in GBK: create table PersonInfo (cod_fn_ent string, num_seq_trc_form string, date_tr string, num_jrn_no string, cod_trc_form_typ string, id_intl_ip string, name string) ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' WITH SERDEPROPERTIES ("field.delim"="|!", "serialization.encoding"='GBK'); load data local inpath '/home/mr/hive/99-BoEing-IF_PMT_NOTE-2G-20151019-0' overwrite into table PersonInfo; I found garbled Chinese characters in the table; 'serialization.encoding' does not take effect. The garbled data is listed below: | 9999�ϴ��� 0624624002��ʱ�� was: When I create a table with ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' and load files containing Chinese text encoded in GBK: create table PersonInfo (cod_fn_ent string, num_seq_trc_form string, date_tr string, num_jrn_no string, cod_trc_form_typ string, id_intl_ip string, name string) ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' WITH SERDEPROPERTIES ("field.delim"="|!", "serialization.encoding"='GBK'); load data local inpath '/home/mr/hive/99-BoEing-IF_PMT_NOTE-2G-20151019-0' overwrite into table PersonInfo; I found garbled Chinese characters in the table; 'serialization.encoding' does not take effect. The errors are listed below: | 9999�ϴ��� 0624624002��ʱ�� > The property "serialization.encoding" in the class > "org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe" does not work > --- > > Key: HIVE-12653 > URL: https://issues.apache.org/jira/browse/HIVE-12653 > Project: Hive > Issue Type: Improvement > Components: Contrib > Affects Versions: 1.2.1 > Reporter: yangfang > Assignee: yangfang > Attachments: HIVE-12653.patch > > > When I create a table with ROW FORMAT SERDE > 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' and load files containing Chinese text encoded in GBK: > create table PersonInfo (cod_fn_ent string, num_seq_trc_form string, date_tr string, > num_jrn_no string, cod_trc_form_typ string, id_intl_ip string, name string) > ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' > WITH SERDEPROPERTIES ("field.delim"="|!", "serialization.encoding"='GBK'); > load data local inpath > '/home/mr/hive/99-BoEing-IF_PMT_NOTE-2G-20151019-0' overwrite into table > PersonInfo; > I found garbled Chinese characters in the table; 'serialization.encoding' > does not take effect. The garbled data is listed below: > | > > 9999�ϴ��� > 0624624002��ʱ�� > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12653) The property "serialization.encoding" in the class "org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe" does not work
[ https://issues.apache.org/jira/browse/HIVE-12653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yangfang updated HIVE-12653: Description: When I create a table with ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' and load files containing Chinese text encoded in GBK: create table PersonInfo (cod_fn_ent string, num_seq_trc_form string, date_tr string, num_jrn_no string, cod_trc_form_typ string, id_intl_ip string, name string) ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' WITH SERDEPROPERTIES ("field.delim"="|!", "serialization.encoding"='GBK'); load data local inpath '/home/mr/hive/99-BoEing-IF_PMT_NOTE-2G-20151019-0' overwrite into table PersonInfo; I found garbled Chinese characters in the table; 'serialization.encoding' does not take effect. The errors are listed below: | 9999�ϴ��� 0624624002��ʱ�� was: When I create a table with ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' and load files containing Chinese text encoded in GBK: create table PersonInfo (cod_fn_ent string, num_seq_trc_form string, date_tr string, num_jrn_no string, cod_trc_form_typ string, id_intl_ip string, name string) ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' WITH SERDEPROPERTIES ("field.delim"="|!", "serialization.encoding"='GBK'); load data local inpath '/home/mr/hive/99-BoEing-IF_PMT_NOTE-2G-20151019-0' overwrite into table PersonInfo; I found garbled Chinese characters in the table; 'serialization.encoding' does not take effect, as listed below: | 9999�ϴ��� 0624624002��ʱ�� > The property "serialization.encoding" in the class > "org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe" does not work > --- > > Key: HIVE-12653 > URL: https://issues.apache.org/jira/browse/HIVE-12653 > Project: Hive > Issue Type: Improvement > Components: Contrib > Affects Versions: 1.2.1 > Reporter: yangfang > Assignee: yangfang > Attachments: HIVE-12653.patch > > > When I create a table with ROW FORMAT SERDE > 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' and load files containing Chinese text encoded in GBK: > create table PersonInfo (cod_fn_ent string, num_seq_trc_form string, date_tr string, > num_jrn_no string, cod_trc_form_typ string, id_intl_ip string, name string) > ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' > WITH SERDEPROPERTIES ("field.delim"="|!", "serialization.encoding"='GBK'); > load data local inpath > '/home/mr/hive/99-BoEing-IF_PMT_NOTE-2G-20151019-0' overwrite into table > PersonInfo; > I found garbled Chinese characters in the table; 'serialization.encoding' > does not take effect. The errors are listed below: > | > > 9999�ϴ��� > 0624624002��ʱ�� > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)