[jira] [Updated] (HIVE-13860) Fix more json related JDK8 test failures
[ https://issues.apache.org/jira/browse/HIVE-13860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mohit Sabharwal updated HIVE-13860:
-----------------------------------
    Attachment: HIVE-13860-java8.patch

> Fix more json related JDK8 test failures
> ----------------------------------------
>
>                 Key: HIVE-13860
>                 URL: https://issues.apache.org/jira/browse/HIVE-13860
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Test
>            Reporter: Mohit Sabharwal
>            Assignee: Mohit Sabharwal
>         Attachments: HIVE-13860-java8.patch, HIVE-13860-java8.patch
>

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HIVE-13737) incorrect count when multiple inserts with union all
[ https://issues.apache.org/jira/browse/HIVE-13737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302324#comment-15302324 ]

frank luo commented on HIVE-13737:
----------------------------------
Turning stats off gives the correct result. So is it a bug in StatsOptimizer?

> incorrect count when multiple inserts with union all
> ----------------------------------------------------
>
>                 Key: HIVE-13737
>                 URL: https://issues.apache.org/jira/browse/HIVE-13737
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>    Affects Versions: 1.2.1
>         Environment: hdp 2.3.4.7 on Red Hat 6
>            Reporter: Frank Luo
>            Priority: Critical
>
> Here is a test case to illustrate the issue. It seems MR works fine but Tez
> is having the problem.
> CREATE TABLE test(col1 STRING);
> CREATE TABLE src (col1 string);
> insert into table src values ('a');
> INSERT into TABLE test
> select * from (
>    SELECT * from src
>    UNION ALL
>    SELECT * from src) x;
> -- do it one more time
> INSERT INTO TABLE test
>    SELECT * from src
>    UNION ALL
>    SELECT * from src;
> --below gives correct result
> SELECT * FROM TEST;
> --count is incorrect. It might give either '1' or '2', but I am expecting '4'
> SELECT count (*) FROM test;
[jira] [Commented] (HIVE-6683) Beeline does not accept comments at end of line
[ https://issues.apache.org/jira/browse/HIVE-6683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302312#comment-15302312 ]

Sergio Peña commented on HIVE-6683:
-----------------------------------
Thanks [~Tauruzzz]. Could you create a JIRA and include the information needed to reproduce, along with the affected versions?

> Beeline does not accept comments at end of line
> -----------------------------------------------
>
>                 Key: HIVE-6683
>                 URL: https://issues.apache.org/jira/browse/HIVE-6683
>             Project: Hive
>          Issue Type: Bug
>          Components: CLI
>    Affects Versions: 0.10.0
>            Reporter: Jeremy Beard
>            Assignee: Sergio Peña
>              Labels: TODOC15
>             Fix For: 1.1.0
>
>         Attachments: HIVE-6683.1.patch, HIVE-6683.1.patch
>
> Beeline fails to read queries where lines have comments at the end. This
> works in the embedded Hive CLI.
> Example:
> SELECT
> 1 -- this is a comment about this value
> FROM
> table;
> Error: Error while processing statement: FAILED: ParseException line 1:36
> mismatched input '' expecting FROM near '1' in from clause
> (state=42000,code=4)
[jira] [Commented] (HIVE-13737) incorrect count when multiple inserts with union all
[ https://issues.apache.org/jira/browse/HIVE-13737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302310#comment-15302310 ]

Ashutosh Chauhan commented on HIVE-13737:
-----------------------------------------
StatsOptimizer is throwing the query off. Try turning it off:
{code}
set hive.compute.query.using.stats=false;
{code}
[jira] [Commented] (HIVE-13826) Make VectorUDFAdaptor work for GenericUDFBetween when used as FILTER
[ https://issues.apache.org/jira/browse/HIVE-13826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302301#comment-15302301 ]

Ashutosh Chauhan commented on HIVE-13826:
-----------------------------------------
branch-2.1 has already been cut; master is on 2.2 now. If you want this in the 2.1 release, it needs to be committed to branch-2.1 specifically.

> Make VectorUDFAdaptor work for GenericUDFBetween when used as FILTER
> --------------------------------------------------------------------
>
>                 Key: HIVE-13826
>                 URL: https://issues.apache.org/jira/browse/HIVE-13826
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>            Reporter: Matt McCline
>            Assignee: Matt McCline
>            Priority: Critical
>             Fix For: 1.3.0, 2.1.0
>
>         Attachments: HIVE-13826.01.patch, HIVE-13826.02.patch
>
> GenericUDFBetween doesn't vectorize with VectorUDFAdaptor when used as FILTER
> (i.e. as single item for WHERE).
[jira] [Commented] (HIVE-13737) incorrect count when multiple inserts with union all
[ https://issues.apache.org/jira/browse/HIVE-13737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302294#comment-15302294 ]

frank luo commented on HIVE-13737:
----------------------------------
hive> explain INSERT INTO TABLE test SELECT * from src UNION ALL SELECT * from src;
OK
Plan not optimized by CBO.

Vertex dependency in root stage
Map 1 <- Union 2 (CONTAINS)
Map 3 <- Union 2 (CONTAINS)

Stage-4
  Stats-Aggr Operator
    Stage-0
      Move Operator
        table:{"serde:":"org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe","name:":"jluo.test","input format:":"org.apache.hadoop.mapred.TextInputFormat","output format:":"org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat"}
        Stage-2
          Dependency Collection{}
            Stage-1
              Union 2
              |<-Map 1 [CONTAINS]
              |  File Output Operator [FS_6]
              |    compressed:false
              |    Statistics:Num rows: 2 Data size: 2 Basic stats: COMPLETE Column stats: NONE
              |    table:{"serde:":"org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe","name:":"jluo.test","input format:":"org.apache.hadoop.mapred.TextInputFormat","output format:":"org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat"}
              |    Select Operator [SEL_1]
              |      outputColumnNames:["_col0"]
              |      Statistics:Num rows: 1 Data size: 1 Basic stats: COMPLETE Column stats: NONE
              |      TableScan [TS_0]
              |        alias:src
              |        Statistics:Num rows: 1 Data size: 1 Basic stats: COMPLETE Column stats: NONE
              |<-Map 3 [CONTAINS]
                 File Output Operator [FS_6]
                   compressed:false
                   Statistics:Num rows: 2 Data size: 2 Basic stats: COMPLETE Column stats: NONE
                   table:{"serde:":"org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe","name:":"jluo.test","input format:":"org.apache.hadoop.mapred.TextInputFormat","output format:":"org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat"}
                   Select Operator [SEL_3]
                     outputColumnNames:["_col0"]
                     Statistics:Num rows: 1 Data size: 1 Basic stats: COMPLETE Column stats: NONE
                     TableScan [TS_2]
                       alias:src
                       Statistics:Num rows: 1 Data size: 1 Basic stats: COMPLETE Column stats: NONE
Stage-3
  Stats-Aggr Operator
    Please refer to the previous Stage-0
Time taken: 0.088 seconds, Fetched: 41 row(s)

hive> explain SELECT count(*) FROM test;
OK
Plan not optimized by CBO.

Stage-0
  Fetch Operator
    limit:1
Time taken: 0.037 seconds, Fetched: 6 row(s)
[jira] [Commented] (HIVE-13737) incorrect count when multiple inserts with union all
[ https://issues.apache.org/jira/browse/HIVE-13737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302296#comment-15302296 ]

frank luo commented on HIVE-13737:
----------------------------------
Forgot to mention: I was using Tez on Hive.
[jira] [Commented] (HIVE-13826) Make VectorUDFAdaptor work for GenericUDFBetween when used as FILTER
[ https://issues.apache.org/jira/browse/HIVE-13826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302286#comment-15302286 ]

Matt McCline commented on HIVE-13826:
-------------------------------------
Also committed to branch-1.
[jira] [Updated] (HIVE-13826) Make VectorUDFAdaptor work for GenericUDFBetween when used as FILTER
[ https://issues.apache.org/jira/browse/HIVE-13826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matt McCline updated HIVE-13826:
--------------------------------
    Fix Version/s: 1.3.0
[jira] [Commented] (HIVE-13826) Make VectorUDFAdaptor work for GenericUDFBetween when used as FILTER
[ https://issues.apache.org/jira/browse/HIVE-13826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302264#comment-15302264 ]

Ashutosh Chauhan commented on HIVE-13826:
-----------------------------------------
[~mmccline] Would you also like to commit this to branch-2.1?
[jira] [Commented] (HIVE-13737) incorrect count when multiple inserts with union all
[ https://issues.apache.org/jira/browse/HIVE-13737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302261#comment-15302261 ]

Ashutosh Chauhan commented on HIVE-13737:
-----------------------------------------
Can you add explain output?
[jira] [Commented] (HIVE-13850) File name conflict when have multiple INSERT INTO queries running in parallel
[ https://issues.apache.org/jira/browse/HIVE-13850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302260#comment-15302260 ]

Ashutosh Chauhan commented on HIVE-13850:
-----------------------------------------
Whatever name you choose, you will always be susceptible to a [TOCTTOU issue | https://en.wikipedia.org/wiki/Time_of_check_to_time_of_use], since the name is chosen by a different process (the Hive CLI) than the one doing the renames (the NameNode). Until HDFS adds a merge API (HDFS-9763), the best way to handle this scenario is to turn on locking: https://cwiki.apache.org/confluence/display/Hive/Locking

> File name conflict when have multiple INSERT INTO queries running in parallel
> -----------------------------------------------------------------------------
>
>                 Key: HIVE-13850
>                 URL: https://issues.apache.org/jira/browse/HIVE-13850
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 1.2.1
>            Reporter: Bing Li
>            Assignee: Bing Li
>         Attachments: HIVE-13850-1.2.1.patch
>
> We have an application which connects to HiveServer2 via JDBC. In the
> application, it executes "INSERT INTO" queries against the same table. If
> many users run the application at the same time, some of the INSERTs can fail.
> The root cause is that Hive.checkPaths() uses the following loop to check
> whether the file exists, but when multiple inserts run in parallel this leads
> to a conflict:
> for (int counter = 1; fs.exists(itemDest) || destExists(result, itemDest); counter++) {
>     itemDest = new Path(destf, name + ("_copy_" + counter) + filetype);
> }
> The Error Message
> ===
> In the hive log:
> org.apache.hadoop.hive.ql.metadata.HiveException: copyFiles: error while moving files!!!
> Cannot move hdfs://node:8020/apps/hive/warehouse/metadata.db/scalding_stats/.hive-staging_hive_2016-05-10_18-46-23_642_2056172497900766879-3321/-ext-1/00_0
> to hdfs://node:8020/apps/hive/warehouse/metadata.db/scalding_stats/00_0_copy_9014
>     at org.apache.hadoop.hive.ql.metadata.Hive.copyFiles(Hive.java:2719)
>     at org.apache.hadoop.hive.ql.metadata.Hive.loadTable(Hive.java:1645)
> In the hadoop log:
> WARN hdfs.StateChange (FSDirRenameOp.java:unprotectedRenameTo(174)) - DIR* FSDirectory.unprotectedRenameTo:
> failed to rename /apps/hive/warehouse/metadata.db/scalding_stats/.hive-staging_hive_2016-05-10_18-46-23_642_2056172497900766879-3321/-ext-1/00_0
> to /apps/hive/warehouse/metadata.db/scalding_stats/00_0_copy_9014 because destination exists
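The check-then-rename gap Ashutosh describes can be illustrated with a small standalone sketch. All names here are illustrative, and an in-memory set stands in for the HDFS namespace, so this is only a model of the probing loop quoted from Hive.checkPaths(), not the real code path:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Model of the _copy_N probing loop in Hive.checkPaths(). The Set stands in
// for the HDFS namespace; names are hypothetical, not Hive's actual API.
public class CopyNameRace {
    static final Set<String> namespace = ConcurrentHashMap.newKeySet();

    // Probe base, base_copy_1, base_copy_2, ... until an unused name is found.
    static String chooseName(String base) {
        String candidate = base;
        for (int counter = 1; namespace.contains(candidate); counter++) {
            candidate = base + "_copy_" + counter;
        }
        return candidate;
    }

    public static void main(String[] args) {
        namespace.add("000000_0"); // file left by an earlier INSERT

        // Two concurrent inserts both run the probe BEFORE either commits
        // its rename -- the time-of-check/time-of-use window:
        String a = chooseName("000000_0");
        String b = chooseName("000000_0");

        namespace.add(a);              // first rename succeeds
        boolean ok = namespace.add(b); // second collides: destination exists
        // -> 000000_0_copy_1 vs 000000_0_copy_1; second rename ok? false
        System.out.println(a + " vs " + b + "; second rename ok? " + ok);
    }
}
```

Turning on Hive locking serializes the check and the rename for each writer, which closes the window the comment describes.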
[jira] [Comment Edited] (HIVE-13301) Hive on spark local mode broken
[ https://issues.apache.org/jira/browse/HIVE-13301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302255#comment-15302255 ]

Teng Qiu edited comment on HIVE-13301 at 5/26/16 3:31 PM:
----------------------------------------------------------
Hi, I ran into the same issue; simply removing the calcite-avatica jar is not enough. I replaced the calcite-avatica jar with
https://repo1.maven.org/maven2/org/apache/calcite/avatica/avatica/1.7.1/avatica-1.7.1.jar
and then it works, but I am not sure what the difference is between the avatica jar and the calcite-avatica jar... a very ugly hack.

was (Author: chutium):
Hi, I ran into the same issue; simply removing the calcite-avatica jar is not enough. I replaced the calcite-avatica jar with
https://repo1.maven.org/maven2/org/apache/calcite/avatica/avatica/1.7.1/avatica-1.7.1.jar
and then it works.

> Hive on spark local mode broken
> -------------------------------
>
>                 Key: HIVE-13301
>                 URL: https://issues.apache.org/jira/browse/HIVE-13301
>             Project: Hive
>          Issue Type: Bug
>          Components: Spark
>            Reporter: Szehon Ho
>
> Was trying to run hive-on-spark local mode (set spark.master=local), and
> found it is not working due to jackson-databind conflict with spark's version.
> {noformat}
> 16/03/17 13:55:43 [f2e832af-82fc-426b-b0b1-ad201210cef4 main]: INFO exec.SerializationUtilities: Serializing MapWork using kryo
> java.lang.NoSuchMethodError: com.fasterxml.jackson.module.scala.deser.BigDecimalDeserializer$.handledType()Ljava/lang/Class;
>     at com.fasterxml.jackson.module.scala.deser.NumberDeserializers$.(ScalaNumberDeserializersModule.scala:49)
>     at com.fasterxml.jackson.module.scala.deser.NumberDeserializers$.(ScalaNumberDeserializersModule.scala)
>     at com.fasterxml.jackson.module.scala.deser.ScalaNumberDeserializersModule$class.$init$(ScalaNumberDeserializersModule.scala:61)
>     at com.fasterxml.jackson.module.scala.DefaultScalaModule.(DefaultScalaModule.scala:19)
>     at com.fasterxml.jackson.module.scala.DefaultScalaModule$.(DefaultScalaModule.scala:35)
>     at com.fasterxml.jackson.module.scala.DefaultScalaModule$.(DefaultScalaModule.scala)
>     at org.apache.spark.rdd.RDDOperationScope$.(RDDOperationScope.scala:81)
>     at org.apache.spark.rdd.RDDOperationScope$.(RDDOperationScope.scala)
>     at org.apache.spark.SparkContext.withScope(SparkContext.scala:714)
>     at org.apache.spark.SparkContext.hadoopRDD(SparkContext.scala:991)
>     at org.apache.spark.api.java.JavaSparkContext.hadoopRDD(JavaSparkContext.scala:419)
>     at org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generateMapInput(SparkPlanGenerator.java:205)
>     at org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generateParentTran(SparkPlanGenerator.java:145)
>     at org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generate(SparkPlanGenerator.java:117)
>     at org.apache.hadoop.hive.ql.exec.spark.LocalHiveSparkClient.execute(LocalHiveSparkClient.java:130)
>     at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.submit(SparkSessionImpl.java:71)
>     at org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:94)
>     at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:172)
>     at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:101)
>     at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1838)
>     at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1579)
>     at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1353)
>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1124)
>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1112)
>     at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)
>     at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184)
>     at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:400)
>     at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:779)
>     at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:718)
>     at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:645)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> {noformat}
> Seems conflicting version of this jackson-databind class is being brought in
> via calcite-avatica.jar.
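When chasing a jar conflict like this, one generic JVM diagnostic (not something from this thread, and not Hive-specific) is to ask the classloader which jar actually supplied the conflicting class. The class name probed in main() is just an example taken from the stack trace:

```java
import java.security.CodeSource;

// Prints the classpath location that supplied a given class, or a note when
// the class comes from the bootstrap classpath or is absent entirely.
// Generic diagnostic; not part of Hive or Spark.
public class WhichJar {
    static String locate(String className) {
        try {
            Class<?> c = Class.forName(className);
            CodeSource src = c.getProtectionDomain().getCodeSource();
            return src == null ? "bootstrap classpath" : src.getLocation().toString();
        } catch (ClassNotFoundException e) {
            return "not on classpath";
        }
    }

    public static void main(String[] args) {
        // On a Hive node one would probe the class from the NoSuchMethodError,
        // to see whether it is served from calcite-avatica.jar or from
        // Spark's own jackson jars.
        System.out.println(locate("com.fasterxml.jackson.databind.ObjectMapper"));
    }
}
```

Running this inside the Hive classpath would show whether the shaded copy in calcite-avatica.jar wins over Spark's jackson-module-scala, which is what the hack above works around.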
[jira] [Commented] (HIVE-13737) incorrect count when multiple inserts with union all
[ https://issues.apache.org/jira/browse/HIVE-13737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302252#comment-15302252 ]

frank luo commented on HIVE-13737:
----------------------------------
It always gives the wrong result.
[jira] [Updated] (HIVE-13826) Make VectorUDFAdaptor work for GenericUDFBetween when used as FILTER
[ https://issues.apache.org/jira/browse/HIVE-13826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matt McCline updated HIVE-13826:
--------------------------------
    Resolution: Fixed
        Status: Resolved  (was: Patch Available)
[jira] [Commented] (HIVE-13826) Make VectorUDFAdaptor work for GenericUDFBetween when used as FILTER
[ https://issues.apache.org/jira/browse/HIVE-13826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302248#comment-15302248 ]

Matt McCline commented on HIVE-13826:
-------------------------------------
Committed to master.
[jira] [Commented] (HIVE-13826) Make VectorUDFAdaptor work for GenericUDFBetween when used as FILTER
[ https://issues.apache.org/jira/browse/HIVE-13826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302236#comment-15302236 ]

Matt McCline commented on HIVE-13826:
-------------------------------------
None of the test failures are related.
[jira] [Updated] (HIVE-13862) org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getNumPartitionsViaSqlFilter falls back to ORM
[ https://issues.apache.org/jira/browse/HIVE-13862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rajat Khandelwal updated HIVE-13862:
------------------------------------
    Attachment: HIVE-13862.patch

Simple enough patch.

> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getNumPartitionsViaSqlFilter
> falls back to ORM
> --------------------------------------------------------------------------------
>
>                 Key: HIVE-13862
>                 URL: https://issues.apache.org/jira/browse/HIVE-13862
>             Project: Hive
>          Issue Type: Bug
>          Components: Metastore
>            Reporter: Amareshwari Sriramadasu
>            Assignee: Rajat Khandelwal
>             Fix For: 2.1.0
>
>         Attachments: HIVE-13862.patch
>
> We are seeing the following exception, and calls fall back to ORM, which
> makes them costly:
> {noformat}
> WARN org.apache.hadoop.hive.metastore.ObjectStore - Direct SQL failed, falling back to ORM
> java.lang.ClassCastException: org.datanucleus.store.rdbms.query.ForwardQueryResult cannot be cast to java.lang.Number
>     at org.apache.hadoop.hive.metastore.MetaStoreDirectSql.extractSqlInt(MetaStoreDirectSql.java:892) ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT]
>     at org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getNumPartitionsViaSqlFilterInternal(MetaStoreDirectSql.java:855) ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT]
>     at org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getNumPartitionsViaSqlFilter(MetaStoreDirectSql.java:405) ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT]
>     at org.apache.hadoop.hive.metastore.ObjectStore$5.getSqlResult(ObjectStore.java:2763) ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT]
>     at org.apache.hadoop.hive.metastore.ObjectStore$5.getSqlResult(ObjectStore.java:2755) ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT]
>     at org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:2606) ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT]
>     at org.apache.hadoop.hive.metastore.ObjectStore.getNumPartitionsByFilterInternal(ObjectStore.java:2770) [hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT]
>     at org.apache.hadoop.hive.metastore.ObjectStore.getNumPartitionsByFilter(ObjectStore.java:2746) [hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT]
> {noformat}
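A DataNucleus ForwardQueryResult is itself a List of result rows, so the ClassCastException above suggests the single aggregate value needs unwrapping before the numeric cast. Below is a standalone sketch of that idea only; the names are hypothetical and the actual fix is whatever the attached patch does in MetaStoreDirectSql.extractSqlInt:

```java
import java.util.List;

// Sketch of defensively extracting one numeric value from a direct-SQL
// result that may arrive either as a bare Number or wrapped in a List
// (as with DataNucleus' ForwardQueryResult). Illustrative only.
public class SqlIntExtract {
    static int extractSqlInt(Object obj) {
        if (obj instanceof List) {
            obj = ((List<?>) obj).get(0); // unwrap the single row/value
        }
        if (!(obj instanceof Number)) {
            throw new ClassCastException(
                obj.getClass().getName() + " cannot be cast to java.lang.Number");
        }
        return ((Number) obj).intValue();
    }

    public static void main(String[] args) {
        System.out.println(extractSqlInt(42L));        // bare value
        System.out.println(extractSqlInt(List.of(7))); // wrapped value
    }
}
```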
[jira] [Updated] (HIVE-13850) File name conflict when have multiple INSERT INTO queries running in parallel
[ https://issues.apache.org/jira/browse/HIVE-13850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bing Li updated HIVE-13850: --- Description: We have an application which connects to HiveServer2 via JDBC. The application executes "INSERT INTO" queries against the same table. When many users run the application at the same time, some of the INSERTs can fail. The root cause is that Hive.checkPaths() uses the following loop to check for the existence of the destination file; because the existence check and the later rename are not atomic, multiple inserts running in parallel can pick the same name and conflict. for (int counter = 1; fs.exists(itemDest) || destExists(result, itemDest); counter++) { itemDest = new Path(destf, name + ("_copy_" + counter) + filetype); } The Error Message === In the Hive log: org.apache.hadoop.hive.ql.metadata.HiveException: copyFiles: error while moving files!!!
Cannot move hdfs://node:8020/apps/hive/warehouse/metadata.db/scalding_stats/.hive-staging_hive_2016-05-10_18-46-23_642_2056172497900766879-3321/-ext-1/00_0 to hdfs://node:8020/apps/hive/warehouse/metadata.db/scalding_stats/00_0_copy_9014 at org.apache.hadoop.hive.ql.metadata.Hive.copyFiles(Hive.java:2719) at org.apache.hadoop.hive.ql.metadata.Hive.loadTable(Hive.java:1645) In the Hadoop log: WARN hdfs.StateChange (FSDirRenameOp.java:unprotectedRenameTo(174)) - DIR* FSDirectory.unprotectedRenameTo: failed to rename /apps/hive/warehouse/metadata.db/scalding_stats/.hive-staging_hive_2016-05-10_18-46-23_642_2056172497900766879-3321/-ext-1/00_0 to /apps/hive/warehouse/metadata.db/scalding_stats/00_0_copy_9014 because destination exists
was: We have an application which connects to HiveServer2 via JDBC. The application executes "INSERT INTO" queries against the same table. When many users run the application at the same time, some of the INSERTs can fail. In the Hive log: org.apache.hadoop.hive.ql.metadata.HiveException: copyFiles: error while moving files!!! Cannot move hdfs://node:8020/apps/hive/warehouse/metadata.db/scalding_stats/.hive-staging_hive_2016-05-10_18-46-23_642_2056172497900766879-3321/-ext-1/00_0 to hdfs://node:8020/apps/hive/warehouse/metadata.db/scalding_stats/00_0_copy_9014 at org.apache.hadoop.hive.ql.metadata.Hive.copyFiles(Hive.java:2719) at org.apache.hadoop.hive.ql.metadata.Hive.loadTable(Hive.java:1645) In the Hadoop log: WARN hdfs.StateChange (FSDirRenameOp.java:unprotectedRenameTo(174)) - DIR* FSDirectory.unprotectedRenameTo: failed to rename /apps/hive/warehouse/metadata.db/scalding_stats/.hive-staging_hive_2016-05-10_18-46-23_642_2056172497900766879-3321/-ext-1/00_0 to /apps/hive/warehouse/metadata.db/scalding_stats/00_0_copy_9014 because destination exists
> File name conflict when have multiple INSERT INTO queries running in parallel > - > > Key: HIVE-13850 > URL: https://issues.apache.org/jira/browse/HIVE-13850 > Project: Hive > Issue Type: Bug >Affects Versions: 1.2.1 >Reporter: Bing Li >Assignee: Bing Li > > We have an application which connects to HiveServer2 via JDBC. > The application executes "INSERT INTO" queries against the same table. > When many users run the application at the same time, some of the INSERTs can fail. > The root cause is that Hive.checkPaths() uses the following loop to check for the > existence of the destination file; multiple inserts running in parallel can pick > the same name and conflict. > for (int counter = 1; fs.exists(itemDest) || destExists(result, itemDest); > counter++) { > itemDest = new Path(destf, name + ("_copy_" + counter) + > filetype); > } > The Error Message > === > In the Hive log: > org.apache.hadoop.hive.ql.metadata.HiveException: copyFiles: error > while moving files!!! Cannot move > hdfs://node:8020/apps/hive/warehouse/metadata.db/scalding_stats/.hive-staging_hive_2016-05-10_18-46-23_642_2056172497900766879-3321/-ext-1/00_0 to > hdfs://node:8020/apps/hive >
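The check-then-rename race in the description can be modeled in a few lines. This is a simplified, illustrative sketch (a plain Java set standing in for the filesystem snapshot), not Hive's actual Hive.checkPaths() implementation: each writer scans for the first free "_copy_N" name, and because the existence check and the later rename are not atomic, two parallel writers that see the same snapshot pick the same destination, so the second rename fails with "destination exists".

```java
import java.util.Set;

// Simplified model of the name-selection loop quoted above. The Set stands in
// for fs.exists() against a snapshot of the destination directory; Hive's
// real code checks HDFS directly.
class CopyNameRace {
    static String pickName(Set<String> existing, String name, String filetype) {
        String itemDest = name + filetype;
        // Same shape as the loop in the ticket: advance the counter until a
        // free name is found in this writer's snapshot.
        for (int counter = 1; existing.contains(itemDest); counter++) {
            itemDest = name + "_copy_" + counter + filetype;
        }
        return itemDest;
    }
}
```

Two writers that call pickName against the same snapshot both get "00_0_copy_1", which is exactly the collision the error message shows for 00_0_copy_9014.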
[jira] [Commented] (HIVE-13808) Use constant expressions to backtrack when we create ReduceSink
[ https://issues.apache.org/jira/browse/HIVE-13808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302152#comment-15302152 ] Hive QA commented on HIVE-13808: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12805948/HIVE-13808.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 130 failed/errored test(s), 10039 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file TestMiniTezCliDriver-schema_evol_text_nonvec_mapwork_table.q-vector_decimal_trailing.q-subquery_in.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-smb_cache.q-transform_ppr2.q-vector_outer_join0.q-and-5-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-tez_union_group_by.q-vector_auto_smb_mapjoin_14.q-union_fast_stats.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-vector_distinct_2.q-tez_joins_explain.q-cte_mat_1.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-vector_grouping_sets.q-update_all_partitioned.q-cte_5.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-skewjoin_union_remove_2.q-timestamp_null.q-union32.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-smb_mapjoin_4.q-groupby8_map.q-groupby4_map.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket_groupby org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cp_sel org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_decimal_stats org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynpart_sort_optimization_acid org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_resolution org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_pcr org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_pointlookup2 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_pointlookup3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_udf_case org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ptf_matchpath org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_quotedid_basic org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_reducesink_dedup org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_18 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_view org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_date_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_round_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_interval_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_interval_arithmetic org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_windowing_gby2 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket4 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket5 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket6 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_constprog_partitioner org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_disable_merge_for_bucketing org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_map_operators org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_num_buckets org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_reducers_power_two org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_list_bucket_dml_10 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_orc_merge1 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_orc_merge2 
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_orc_merge9 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_orc_merge_diff_fs org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_reduce_deduplicate org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join1 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join2 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join3 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join4 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join5 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_ptf_matchpath org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_coalesce org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_date_1
[jira] [Updated] (HIVE-13816) Infer constants directly when we create semijoin
[ https://issues.apache.org/jira/browse/HIVE-13816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-13816: --- Target Version/s: (was: 2.1.0) > Infer constants directly when we create semijoin > > > Key: HIVE-13816 > URL: https://issues.apache.org/jira/browse/HIVE-13816 > Project: Hive > Issue Type: Sub-task > Components: Parser >Affects Versions: 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > > Follow-up on HIVE-13068. > When we create a left semijoin, we could infer the constants from the SEL > below when we create the GB to remove duplicates on the right hand side. > Ex. ql/src/test/results/clientpositive/constprog_semijoin.q.out > {noformat} > explain select table1.id, table1.val, table1.val1 from table1 left semi join > table3 on table1.dimid = table3.id and table3.id = 100 where table1.dimid = > 100; > {noformat} > Plan: > {noformat} > STAGE DEPENDENCIES: > Stage-1 is a root stage > Stage-0 depends on stages: Stage-1 > STAGE PLANS: > Stage: Stage-1 > Map Reduce > Map Operator Tree: > TableScan > alias: table1 > Statistics: Num rows: 10 Data size: 200 Basic stats: COMPLETE > Column stats: NONE > Filter Operator > predicate: (((dimid = 100) = true) and (dimid = 100)) (type: > boolean) > Statistics: Num rows: 2 Data size: 40 Basic stats: COMPLETE > Column stats: NONE > Select Operator > expressions: id (type: int), val (type: string), val1 (type: > string) > outputColumnNames: _col0, _col1, _col2 > Statistics: Num rows: 2 Data size: 40 Basic stats: COMPLETE > Column stats: NONE > Reduce Output Operator > key expressions: 100 (type: int), true (type: boolean) > sort order: ++ > Map-reduce partition columns: 100 (type: int), true (type: > boolean) > Statistics: Num rows: 2 Data size: 40 Basic stats: COMPLETE > Column stats: NONE > value expressions: _col0 (type: int), _col1 (type: string), > _col2 (type: string) > TableScan > alias: table3 > Statistics: Num rows: 5 Data size: 15 
Basic stats: COMPLETE > Column stats: NONE > Filter Operator > predicate: (((id = 100) = true) and (id = 100)) (type: boolean) > Statistics: Num rows: 1 Data size: 3 Basic stats: COMPLETE > Column stats: NONE > Select Operator > expressions: 100 (type: int), true (type: boolean) > outputColumnNames: _col0, _col1 > Statistics: Num rows: 1 Data size: 3 Basic stats: COMPLETE > Column stats: NONE > Group By Operator > keys: _col0 (type: int), _col1 (type: boolean) > mode: hash > outputColumnNames: _col0, _col1 > Statistics: Num rows: 1 Data size: 3 Basic stats: COMPLETE > Column stats: NONE > Reduce Output Operator > key expressions: _col0 (type: int), _col1 (type: boolean) > sort order: ++ > Map-reduce partition columns: _col0 (type: int), _col1 > (type: boolean) > Statistics: Num rows: 1 Data size: 3 Basic stats: > COMPLETE Column stats: NONE > Reduce Operator Tree: > Join Operator > condition map: >Left Semi Join 0 to 1 > keys: > 0 100 (type: int), true (type: boolean) > 1 _col0 (type: int), _col1 (type: boolean) > outputColumnNames: _col0, _col1, _col2 > Statistics: Num rows: 2 Data size: 44 Basic stats: COMPLETE Column > stats: NONE > File Output Operator > compressed: false > Statistics: Num rows: 2 Data size: 44 Basic stats: COMPLETE > Column stats: NONE > table: > input format: org.apache.hadoop.mapred.SequenceFileInputFormat > output format: > org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat > serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe > Stage: Stage-0 > Fetch Operator > limit: -1 > Processor Tree: > ListSink > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
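The inference the ticket asks for amounts to substituting a filter-pinned constant into downstream key expressions, as the plan's Reduce Output Operator already shows ({{key expressions: 100 (type: int)}}). A minimal, hypothetical sketch of that rewrite; the names are illustrative and this is not Hive's optimizer API:

```java
import java.util.Map;

// Sketch of the constant inference described above: given a map of columns
// pinned to constants by the filter below the select (e.g. dimid -> 100),
// rewrite any key expression that is such a column into the literal itself.
class ConstantInference {
    static String rewriteKey(String keyExpr, Map<String, String> pinnedConstants) {
        // If the expression is a pinned column, emit the constant; otherwise
        // leave the expression untouched.
        return pinnedConstants.getOrDefault(keyExpr, keyExpr);
    }
}
```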
[jira] [Updated] (HIVE-13811) Constant not removed in index_auto_unused.q.out
[ https://issues.apache.org/jira/browse/HIVE-13811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-13811: --- Target Version/s: (was: 2.1.0) > Constant not removed in index_auto_unused.q.out > --- > > Key: HIVE-13811 > URL: https://issues.apache.org/jira/browse/HIVE-13811 > Project: Hive > Issue Type: Sub-task > Components: CBO >Affects Versions: 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > > Follow-up on HIVE-13068. > In test file ql/src/test/results/clientpositive/index_auto_unused.q.out. > After HIVE-13068 goes in, the following filter is not folded after > PartitionPruning is done: > {{filterExpr: ((ds = '2008-04-09') and (12.0 = 12.0) and (UDFToDouble(key) < > 10.0)) (type: boolean)}} > Further, SimpleFetchOptimizer got disabled. > All this needs further investigation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13815) Improve logic to infer false predicates
[ https://issues.apache.org/jira/browse/HIVE-13815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-13815: --- Target Version/s: (was: 2.1.0) > Improve logic to infer false predicates > --- > > Key: HIVE-13815 > URL: https://issues.apache.org/jira/browse/HIVE-13815 > Project: Hive > Issue Type: Sub-task > Components: CBO >Affects Versions: 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > > Follow-up/extension of the work done in HIVE-13068. > Ex. > ql/src/test/results/clientpositive/annotate_stats_filter.q.out > {{predicate: ((year = 2001) and (state = 'OH') and (state = 'FL')) (type: > boolean)}} -> {{false}} > ql/src/test/results/clientpositive/cbo_rp_join1.q.out > {{predicate: ((_col0 = _col1) and (_col1 = 40) and (_col0 = 40)) (type: > boolean)}} -> {{predicate: ((_col1 = 40) and (_col0 = 40)) (type: boolean)}} > ql/src/test/results/clientpositive/constprog_semijoin.q.out > {{predicate: (((id = 100) = true) and (id <> 100)) (type: boolean)}} -> > {{false}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
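The folding described in the examples above can be sketched as a satisfiability check over equality conjuncts: if the same column is pinned to two different constants, the whole conjunction folds to false. A simplified sketch under that assumption, not Hive's actual ConstantPropagate code:

```java
import java.util.HashMap;
import java.util.Map;

// Each equality is a {column, constant} pair. The conjunction is
// unsatisfiable (folds to FALSE) when one column is pinned to two
// different constants, e.g. (state = 'OH') and (state = 'FL').
class FalsePredicate {
    static boolean satisfiable(String[][] equalities) {
        Map<String, String> pinned = new HashMap<>();
        for (String[] eq : equalities) {
            String prev = pinned.putIfAbsent(eq[0], eq[1]);
            if (prev != null && !prev.equals(eq[1])) {
                return false; // contradictory constants for the same column
            }
        }
        return true;
    }
}
```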
[jira] [Updated] (HIVE-13804) Propagate constant expressions through insert
[ https://issues.apache.org/jira/browse/HIVE-13804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-13804: --- Target Version/s: (was: 2.1.0) > Propagate constant expressions through insert > - > > Key: HIVE-13804 > URL: https://issues.apache.org/jira/browse/HIVE-13804 > Project: Hive > Issue Type: Sub-task > Components: CBO >Affects Versions: 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > > Follow-up of HIVE-13068. > The problem is that CBO optimizes the select query and then the insert part > of the query is attached; after HIVE-13068, ConstantPropagate in Hive does > not kick in anymore because CBO optimized the plan, thus we may miss > opportunity to propagate constant till the top of the plan. > Ex. ql/src/test/results/clientpositive/cp_sel.q.out > {noformat} > insert overwrite table testpartbucket partition(ds,hr) select > key,value,'hello' as ds, 'world' as hr from srcpart where hr=11; > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13805) Extend HiveSortLimitPullUpConstantsRule to pull up constants even when SortLimit is the root of the plan
[ https://issues.apache.org/jira/browse/HIVE-13805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-13805: --- Target Version/s: (was: 2.1.0) > Extend HiveSortLimitPullUpConstantsRule to pull up constants even when > SortLimit is the root of the plan > > > Key: HIVE-13805 > URL: https://issues.apache.org/jira/browse/HIVE-13805 > Project: Hive > Issue Type: Sub-task > Components: CBO >Affects Versions: 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > > Follow-up of HIVE-13068. > Limitation in the original HiveSortLimitPullUpConstantsRule rule. > Currently Calcite rule does not pull-up constants when the Sort/Limit > operator is on top of the operator tree, as this was causing Hive limit > related optimizations to not kick in. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13863) Improve AnnotateWithStatistics with support for cartesian product
[ https://issues.apache.org/jira/browse/HIVE-13863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-13863: --- Status: Patch Available (was: In Progress) > Improve AnnotateWithStatistics with support for cartesian product > - > > Key: HIVE-13863 > URL: https://issues.apache.org/jira/browse/HIVE-13863 > Project: Hive > Issue Type: Bug > Components: Statistics >Affects Versions: 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-13863.patch > > > Currently cartesian product stats based on cardinality of inputs are not > inferred correctly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
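For a cartesian product, the output cardinality is simply the product of the input cardinalities, which is the estimate the annotator should derive from its inputs. A minimal sketch of that rule with illustrative names; this is not Hive's AnnotateWithStatistics code:

```java
// Cross-join cardinality: every left row pairs with every right row,
// so the output row count is |left| * |right|.
class CrossJoinStats {
    static long estimateRows(long leftRows, long rightRows) {
        return leftRows * rightRows;
    }
}
```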
[jira] [Commented] (HIVE-13149) Remove some unnecessary HMS connections from HS2
[ https://issues.apache.org/jira/browse/HIVE-13149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302103#comment-15302103 ] Aihua Xu commented on HIVE-13149: - Target version is removed. It's an improvement, so it is not necessary for 2.1.0. > Remove some unnecessary HMS connections from HS2 > - > > Key: HIVE-13149 > URL: https://issues.apache.org/jira/browse/HIVE-13149 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Affects Versions: 2.0.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Fix For: 2.2.0 > > Attachments: HIVE-13149.1.patch, HIVE-13149.2.patch, > HIVE-13149.3.patch, HIVE-13149.4.patch, HIVE-13149.5.patch, > HIVE-13149.6.patch, HIVE-13149.7.patch, HIVE-13149.8.patch > > > In the SessionState class, we currently always try to get a HMS connection in > {{start(SessionState startSs, boolean isAsync, LogHelper console)}} > regardless of whether the connection will be used later. > When SessionState is accessed by the tasks in TaskRunner.java, most of the > tasks (other than a few like StatsTask) don't need to access HMS. > Currently a new HMS connection will be established for each Task thread. If > HiveServer2 is configured to run in parallel and the query involves many > tasks, then the connections are created but unused. > {noformat} > @Override > public void run() { > runner = Thread.currentThread(); > try { > OperationLog.setCurrentOperationLog(operationLog); > SessionState.start(ss); > runSequential(); > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
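The direction of this ticket is to make the metastore connection lazy: connect on first use instead of eagerly in SessionState.start(), so task threads that never touch HMS never open a connection. A minimal sketch of that pattern with illustrative names; this is not Hive's actual SessionState code:

```java
// Lazy initialization: the expensive client is created only when a caller
// actually asks for it, and reused afterwards. (A real implementation would
// also need thread-safety; omitted here for brevity.)
class LazyMetastoreClient {
    private Object client;  // stands in for the real HMS client handle
    private int connects;   // counts how many real connections were opened

    Object get() {
        if (client == null) {   // connect only on first use
            connects++;
            client = new Object();
        }
        return client;
    }

    int connectCount() { return connects; }
}
```

With this shape, a task thread that never calls get() costs zero connections, and repeated calls reuse the one connection.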
[jira] [Updated] (HIVE-13149) Remove some unnecessary HMS connections from HS2
[ https://issues.apache.org/jira/browse/HIVE-13149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-13149: Target Version/s: (was: 2.1.0) > Remove some unnecessary HMS connections from HS2 > - > > Key: HIVE-13149 > URL: https://issues.apache.org/jira/browse/HIVE-13149 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Affects Versions: 2.0.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Fix For: 2.2.0 > > Attachments: HIVE-13149.1.patch, HIVE-13149.2.patch, > HIVE-13149.3.patch, HIVE-13149.4.patch, HIVE-13149.5.patch, > HIVE-13149.6.patch, HIVE-13149.7.patch, HIVE-13149.8.patch > > > In SessionState class, currently we will always try to get a HMS connection > in {{start(SessionState startSs, boolean isAsync, LogHelper console)}} > regardless of if the connection will be used later or not. > When SessionState is accessed by the tasks in TaskRunner.java, although most > of the tasks other than some like StatsTask, don't need to access HMS. > Currently a new HMS connection will be established for each Task thread. If > HiveServer2 is configured to run in parallel and the query involves many > tasks, then the connections are created but unused. > {noformat} > @Override > public void run() { > runner = Thread.currentThread(); > try { > OperationLog.setCurrentOperationLog(operationLog); > SessionState.start(ss); > runSequential(); > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13746) Data duplication when insert overwrite
[ https://issues.apache.org/jira/browse/HIVE-13746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302091#comment-15302091 ] Chinna Rao Lalam commented on HIVE-13746: - Hi Bill Wailliam, Can you give the exact scenario? I have tried the below queries on master and they work fine. {code} drop table sample1; create table sample1(a STRING, b int) partitioned by (partitionid string) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE; LOAD DATA LOCAL INPATH '/home/chinna/install/data/file1.txt' INTO TABLE sample1 partition (partitionid = "one"); drop table sample2; create table sample2(a STRING, b int) partitioned by (partitionid string) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE; LOAD DATA LOCAL INPATH '/home/chinna/install/data/file2.txt' INTO TABLE sample2 partition (partitionid = "one"); INSERT OVERWRITE TABLE sample2 PARTITION (partitionid = 'one') select a,b from sample1; {code} > Data duplication when insert overwrite > --- > > Key: HIVE-13746 > URL: https://issues.apache.org/jira/browse/HIVE-13746 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Bill Wailliam >Priority: Critical > > Data duplication when insert overwrite. The old data cannot be deleted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13149) Remove some unnecessary HMS connections from HS2
[ https://issues.apache.org/jira/browse/HIVE-13149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-13149: Resolution: Fixed Fix Version/s: 2.2.0 Status: Resolved (was: Patch Available) Pushed to master. Thanks Jimmy for reviewing. > Remove some unnecessary HMS connections from HS2 > - > > Key: HIVE-13149 > URL: https://issues.apache.org/jira/browse/HIVE-13149 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Affects Versions: 2.0.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Fix For: 2.2.0 > > Attachments: HIVE-13149.1.patch, HIVE-13149.2.patch, > HIVE-13149.3.patch, HIVE-13149.4.patch, HIVE-13149.5.patch, > HIVE-13149.6.patch, HIVE-13149.7.patch, HIVE-13149.8.patch > > > In SessionState class, currently we will always try to get a HMS connection > in {{start(SessionState startSs, boolean isAsync, LogHelper console)}} > regardless of if the connection will be used later or not. > When SessionState is accessed by the tasks in TaskRunner.java, although most > of the tasks other than some like StatsTask, don't need to access HMS. > Currently a new HMS connection will be established for each Task thread. If > HiveServer2 is configured to run in parallel and the query involves many > tasks, then the connections are created but unused. > {noformat} > @Override > public void run() { > runner = Thread.currentThread(); > try { > OperationLog.setCurrentOperationLog(operationLog); > SessionState.start(ss); > runSequential(); > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13792) Show create table should not show stats info in the table properties
[ https://issues.apache.org/jira/browse/HIVE-13792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302063#comment-15302063 ] Aihua Xu commented on HIVE-13792: - The tests are not related. [~ctang.ma] Can you help review the code? > Show create table should not show stats info in the table properties > > > Key: HIVE-13792 > URL: https://issues.apache.org/jira/browse/HIVE-13792 > Project: Hive > Issue Type: Sub-task > Components: Query Planning >Affects Versions: 2.1.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Attachments: HIVE-13792.1.patch, HIVE-13792.2.patch, > HIVE-13792.3.patch > > > From the test > org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_queries > failure, we are printing table stats in show create table parameters. This > info should be skipped since it would be incorrect when you just copy them to > create a table. > {noformat} > PREHOOK: query: SHOW CREATE TABLE hbase_table_1_like > PREHOOK: type: SHOW_CREATETABLE > PREHOOK: Input: default@hbase_table_1_like > POSTHOOK: query: SHOW CREATE TABLE hbase_table_1_like > POSTHOOK: type: SHOW_CREATETABLE > POSTHOOK: Input: default@hbase_table_1_like > CREATE EXTERNAL TABLE `hbase_table_1_like`( > `key` int COMMENT 'It is a column key', > `value` string COMMENT 'It is the column string value') > ROW FORMAT SERDE > 'org.apache.hadoop.hive.hbase.HBaseSerDe' > STORED BY > 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' > WITH SERDEPROPERTIES ( > 'hbase.columns.mapping'='cf:string', > 'serialization.format'='1') > TBLPROPERTIES ( > 'COLUMN_STATS_ACCURATE'='{\"BASIC_STATS\":\"true\"}', > 'hbase.table.name'='hbase_table_0', > 'numFiles'='0', > 'numRows'='0', > 'rawDataSize'='0', > 'totalSize'='0', > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
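The requested behavior amounts to stripping the stats-only keys from the TBLPROPERTIES that SHOW CREATE TABLE prints. A hedged sketch, with the key list taken from the output above and an illustrative helper name; this is not Hive's actual DDL code:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Drop stats-only keys before printing TBLPROPERTIES; the key names come
// from the SHOW CREATE TABLE output quoted in the ticket.
class StatsFilter {
    static final Set<String> STATS_KEYS = new HashSet<>(Arrays.asList(
        "COLUMN_STATS_ACCURATE", "numFiles", "numRows", "rawDataSize", "totalSize"));

    static Map<String, String> withoutStats(Map<String, String> props) {
        Map<String, String> out = new LinkedHashMap<>(props); // keep print order
        out.keySet().removeAll(STATS_KEYS);
        return out;
    }
}
```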
[jira] [Commented] (HIVE-13862) org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getNumPartitionsViaSqlFilter falls back to ORM
[ https://issues.apache.org/jira/browse/HIVE-13862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302046#comment-15302046 ] Jesus Camacho Rodriguez commented on HIVE-13862: [~amareshwari], thanks for letting me know. I plan to create the first RC beginning next week; there should be time to get it in. > org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getNumPartitionsViaSqlFilter > falls back to ORM > --- > > Key: HIVE-13862 > URL: https://issues.apache.org/jira/browse/HIVE-13862 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Amareshwari Sriramadasu >Assignee: Rajat Khandelwal > Fix For: 2.1.0 > > > We are seeing following exception and calls fall back to ORM which make it > costly : > {noformat} > WARN org.apache.hadoop.hive.metastore.ObjectStore - Direct SQL failed, > falling back to ORM > java.lang.ClassCastException: > org.datanucleus.store.rdbms.query.ForwardQueryResult cannot be cast to > java.lang.Number > at > org.apache.hadoop.hive.metastore.MetaStoreDirectSql.extractSqlInt(MetaStoreDirectSql.java:892) > ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getNumPartitionsViaSqlFilterInternal(MetaStoreDirectSql.java:855) > ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getNumPartitionsViaSqlFilter(MetaStoreDirectSql.java:405) > ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.ObjectStore$5.getSqlResult(ObjectStore.java:2763) > ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.ObjectStore$5.getSqlResult(ObjectStore.java:2755) > ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:2606) > ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > at > 
org.apache.hadoop.hive.metastore.ObjectStore.getNumPartitionsByFilterInternal(ObjectStore.java:2770) > [hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.ObjectStore.getNumPartitionsByFilter(ObjectStore.java:2746) > [hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
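The ClassCastException in the stack trace suggests the JDO query returned a result collection (ForwardQueryResult) and the collection itself was cast to Number, rather than its single element. A simplified sketch of the unwrap-then-cast idea; this is illustrative only, not Hive's actual patch:

```java
import java.util.Collection;

// If the query result arrives wrapped in a collection, take its single
// element before casting to Number; a bare scalar passes straight through.
class ExtractSqlInt {
    static int extract(Object queryResult) {
        Object v = queryResult;
        if (v instanceof Collection) {
            v = ((Collection<?>) v).iterator().next(); // unwrap the single row
        }
        return ((Number) v).intValue();
    }
}
```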
[jira] [Commented] (HIVE-13862) org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getNumPartitionsViaSqlFilter falls back to ORM
[ https://issues.apache.org/jira/browse/HIVE-13862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302044#comment-15302044 ] Amareshwari Sriramadasu commented on HIVE-13862: [~jcamachorodriguez], We would like to get this in 2.1.0 release. > org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getNumPartitionsViaSqlFilter > falls back to ORM > --- > > Key: HIVE-13862 > URL: https://issues.apache.org/jira/browse/HIVE-13862 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Amareshwari Sriramadasu >Assignee: Rajat Khandelwal > Fix For: 2.1.0 > > > We are seeing following exception and calls fall back to ORM which make it > costly : > {noformat} > WARN org.apache.hadoop.hive.metastore.ObjectStore - Direct SQL failed, > falling back to ORM > java.lang.ClassCastException: > org.datanucleus.store.rdbms.query.ForwardQueryResult cannot be cast to > java.lang.Number > at > org.apache.hadoop.hive.metastore.MetaStoreDirectSql.extractSqlInt(MetaStoreDirectSql.java:892) > ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getNumPartitionsViaSqlFilterInternal(MetaStoreDirectSql.java:855) > ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getNumPartitionsViaSqlFilter(MetaStoreDirectSql.java:405) > ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.ObjectStore$5.getSqlResult(ObjectStore.java:2763) > ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.ObjectStore$5.getSqlResult(ObjectStore.java:2755) > ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:2606) > ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.ObjectStore.getNumPartitionsByFilterInternal(ObjectStore.java:2770) > 
[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.ObjectStore.getNumPartitionsByFilter(ObjectStore.java:2746) > [hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13860) Fix more json related JDK8 test failures
[ https://issues.apache.org/jira/browse/HIVE-13860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15301970#comment-15301970 ] Hive QA commented on HIVE-13860: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12806318/HIVE-13860-java8.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: http://ec2-54-177-240-2.us-west-1.compute.amazonaws.com/job/PreCommit-HIVE-MASTER-Build-JAVA8/10/testReport Console output: http://ec2-54-177-240-2.us-west-1.compute.amazonaws.com/job/PreCommit-HIVE-MASTER-Build-JAVA8/10/console Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-JAVA8-10/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n /usr/java/jdk1.8.0_25 ]] + export JAVA_HOME=/usr/java/jdk1.8.0_25 + JAVA_HOME=/usr/java/jdk1.8.0_25 + export PATH=/usr/java/jdk1.8.0_25/bin/:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-maven-3.0.5/bin:/usr/lib64/qt-3.3/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + PATH=/usr/java/jdk1.8.0_25/bin/:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-maven-3.0.5/bin:/usr/lib64/qt-3.3/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-MASTER-Build-JAVA8-10/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ -z java8 ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! -d apache-github-source-source ]] + cd apache-github-source-source + git fetch origin + git reset --hard HEAD HEAD is now at 76130a9 HIVE-13269: Simplify comparison expressions using column stats (Jesus Camacho Rodriguez, reviewed by Ashutosh Chauhan) + git clean -f -d + git checkout java8 Switched to branch 'java8' Your branch is behind 'origin/java8' by 1 commit, and can be fast-forwarded. + git reset --hard origin/java8 HEAD is now at 4cbc10e HIVE-13409: Fix JDK8 test failures related to COLUMN_STATS_ACCURATE (Mohit Sabharwal, reviewed by Sergio Pena) + git merge --ff-only origin/java8 Already up-to-date. + git gc + patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hive-ptest/working/scratch/build.patch + [[ -f /data/hive-ptest/working/scratch/build.patch ]] + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh + /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch The patch does not appear to apply with p0, p1, or p2 ' {noformat} This message is automatically generated. ATTACHMENT ID: 12806318 - PreCommit-HIVE-MASTER-Build-JAVA8 > Fix more json related JDK8 test failures > > > Key: HIVE-13860 > URL: https://issues.apache.org/jira/browse/HIVE-13860 > Project: Hive > Issue Type: Sub-task > Components: Test >Reporter: Mohit Sabharwal >Assignee: Mohit Sabharwal > Attachments: HIVE-13860-java8.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13826) Make VectorUDFAdaptor work for GenericUDFBetween when used as FILTER
[ https://issues.apache.org/jira/browse/HIVE-13826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15301969#comment-15301969 ] Hive QA commented on HIVE-13826: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12806242/HIVE-13826.02.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 76 failed/errored test(s), 10016 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file TestMiniTezCliDriver-auto_join1.q-schema_evol_text_vec_mapwork_part_all_complex.q-vector_complex_join.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-auto_join30.q-vector_decimal_10_0.q-acid_globallimit.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-constprog_dpp.q-dynamic_partition_pruning.q-vectorization_10.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-enforce_order.q-vector_partition_diff_num_cols.q-unionDistinct_1.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-explainuser_4.q-update_after_multiple_inserts.q-mapreduce2.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-groupby2.q-tez_dynpart_hashjoin_1.q-custom_input_output_format.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-parallel_join1.q-escape_distributeby1.q-auto_sortmerge_join_7.q-and-12-more - did not produce a TEST-*.xml file TestSparkClient - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket4 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket5 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket6 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_constprog_partitioner 
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_disable_merge_for_bucketing org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_map_operators org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_num_buckets org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_reducers_power_two org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_list_bucket_dml_10 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_orc_merge1 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_orc_merge2 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_orc_merge9 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_orc_merge_diff_fs org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_reduce_deduplicate org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join1 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join2 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join3 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join4 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join5 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join0 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join27 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_cbo_simple_select org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby5_noskew org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby_complex_types org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby_map_ppr_multi_distinct org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_input13 
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_input18 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_multi_insert_gby3 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_multi_insert_mixed org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_ppd_outer_join4 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_stats9 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union13 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union19 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_ppr org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorization_16 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorization_17 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorized_math_funcs org.apache.hadoop.hive.llap.daemon.impl.TestTaskExecutorService.testPreemptionQueueComparator org.apache.hadoop.hive.llap.tez.TestConverters.testFragmentSpecToTaskSpec
[jira] [Commented] (HIVE-13831) Error pushing predicates to HBase storage handler
[ https://issues.apache.org/jira/browse/HIVE-13831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15301941#comment-15301941 ] Jesus Camacho Rodriguez commented on HIVE-13831: Failures are not related to this patch. > Error pushing predicates to HBase storage handler > - > > Key: HIVE-13831 > URL: https://issues.apache.org/jira/browse/HIVE-13831 > Project: Hive > Issue Type: Bug > Components: HBase Handler >Affects Versions: 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-13831.patch > > > Discovered while working on HIVE-13693. > There is an error in the predicates that we can push to HBaseStorageHandler. > In particular, range predicates of the shape {{(bounded, open)}} and {{(open, > bounded)}} over long or int columns get pushed and return wrong results. > The problem has to do with the storage order for keys in HBase. Keys are > sorted lexicographically. Since the byte representation of negative values > comes after the positive values, open range predicates need special handling > that we do not have right now. > Thus, for instance, when we push the predicate {{key > 2}}, we return all > records with column _key_ greater than 2, plus the records with negative > values for the column _key_. This problem does not get exposed if a filter is > kept in the Hive operator tree, but we should not assume the latter. > This fix avoids pushing this kind of predicate to the storage handler, > returning it in the _residual_ part of the predicate that cannot be pushed. > In the future, special handling might be added to support this kind of > predicate. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
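The byte-ordering pitfall described in HIVE-13831 can be reproduced with a few lines of plain Java. This is a standalone sketch, not Hive or HBase code: `enc` stands in for a big-endian integer key encoding and `cmp` stands in for HBase's unsigned lexicographic key comparison, showing that the encoding of -1 sorts after the encoding of 2.

```java
import java.nio.ByteBuffer;

public class SignedKeyOrder {
    // Big-endian encoding of a signed int, as a key serializer would produce.
    static byte[] enc(int v) {
        return ByteBuffer.allocate(4).putInt(v).array();
    }

    // Unsigned lexicographic comparison, mirroring how HBase orders row keys.
    static int cmp(byte[] a, byte[] b) {
        for (int i = 0; i < a.length && i < b.length; i++) {
            int x = a[i] & 0xFF, y = b[i] & 0xFF;
            if (x != y) return Integer.compare(x, y);
        }
        return Integer.compare(a.length, b.length);
    }

    public static void main(String[] args) {
        // Numerically -1 < 2, but enc(-1) is 0xFFFFFFFF and enc(2) is
        // 0x00000002, so lexicographically enc(2) sorts BEFORE enc(-1):
        System.out.println(cmp(enc(2), enc(-1)) < 0); // true
        // Hence a pushed-down scan for "key > 2" that starts at enc(2)
        // also sweeps up every negative key.
    }
}
```

This is why an open-ended range over a signed column cannot be translated directly into an HBase start/stop row without extra handling for the negative half of the key space.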
[jira] [Commented] (HIVE-13844) Invalid index handler in org.apache.hadoop.hive.ql.index.HiveIndex class
[ https://issues.apache.org/jira/browse/HIVE-13844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15301930#comment-15301930 ] Jesus Camacho Rodriguez commented on HIVE-13844: +1 > Invalid index handler in org.apache.hadoop.hive.ql.index.HiveIndex class > > > Key: HIVE-13844 > URL: https://issues.apache.org/jira/browse/HIVE-13844 > Project: Hive > Issue Type: Bug > Components: Indexing >Affects Versions: 2.0.0 >Reporter: Svetozar Ivanov >Priority: Minor > Attachments: HIVE-13844.patch > > > Class org.apache.hadoop.hive.ql.index.HiveIndex has an invalid handler name > 'org.apache.hadoop.hive.ql.AggregateIndexHandler'. The actual FQ class name > is 'org.apache.hadoop.hive.ql.index.AggregateIndexHandler'. > {code} > public static enum IndexType { > AGGREGATE_TABLE("aggregate", > "org.apache.hadoop.hive.ql.AggregateIndexHandler"), > COMPACT_SUMMARY_TABLE("compact", > "org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler"), > > BITMAP_TABLE("bitmap","org.apache.hadoop.hive.ql.index.bitmap.BitmapIndexHandler"); > private IndexType(String indexType, String className) { > indexTypeName = indexType; > this.handlerClsName = className; > } > private final String indexTypeName; > private final String handlerClsName; > public String getName() { > return indexTypeName; > } > public String getHandlerClsName() { > return handlerClsName; > } > } > > {code} > Because of the above, statements like 'SHOW INDEXES ON MY_TABLE' don't work; > we get a java.lang.NullPointerException. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
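The failure mode in HIVE-13844 is easy to see in isolation. The standalone sketch below is not Hive code: `loadHandler` is a hypothetical helper standing in for the reflective lookup that the enum's handler-class name eventually feeds into. A misspelled fully qualified class name yields no class, and the resulting null later surfaces as the NullPointerException reported against `SHOW INDEXES`.

```java
public class HandlerLookup {
    // Hypothetical stand-in for a reflective handler lookup: returns null
    // (rather than throwing) when the class name cannot be resolved.
    static Class<?> loadHandler(String className) {
        try {
            return Class.forName(className);
        } catch (ClassNotFoundException e) {
            return null; // a caller that dereferences this gets an NPE
        }
    }

    public static void main(String[] args) {
        // The enum's broken name is missing the ".index" package segment;
        // on any classpath without that (nonexistent) class, resolution fails:
        System.out.println(loadHandler("org.apache.hadoop.hive.ql.AggregateIndexHandler"));
        // A correct fully qualified name resolves normally:
        System.out.println(loadHandler("java.lang.String"));
    }
}
```

The one-line fix in the attached patch is simply to correct the string in the enum to the actual FQ class name.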
[jira] [Updated] (HIVE-13861) Fix up nullability issue that might be created by pull up constants rules
[ https://issues.apache.org/jira/browse/HIVE-13861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-13861: --- Attachment: HIVE-13861.patch > Fix up nullability issue that might be created by pull up constants rules > - > > Key: HIVE-13861 > URL: https://issues.apache.org/jira/browse/HIVE-13861 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-13861.patch > > > When we pull up constants through Union or Sort operators, we might end up > rewriting the original expression into an expression whose schema has > different nullability properties for some of its columns. > This results in AssertionError of the following kind: > {noformat} > ... > org.apache.hive.service.cli.HiveSQLException: Error running query: > java.lang.AssertionError: Internal error: Cannot add expression of different > type to set: > ... > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13840) Orc split generation is reading file footers twice
[ https://issues.apache.org/jira/browse/HIVE-13840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-13840: - Attachment: HIVE-13840.2.patch In the updated patch 1) Another file system call in split generation is avoided by specifying max length in reader. If max length is not specified ORC reader will issue fs.getFileStatus(path) to find the length of the file. 2) Added file system stats to MockFS which is used in the newly added test case fyi.. [~rajesh.balamohan],[~ashutoshc] [~owen.omalley] Can you please review the patch? > Orc split generation is reading file footers twice > -- > > Key: HIVE-13840 > URL: https://issues.apache.org/jira/browse/HIVE-13840 > Project: Hive > Issue Type: Bug > Components: ORC >Affects Versions: 2.1.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran >Priority: Critical > Attachments: HIVE-13840.1.patch, HIVE-13840.2.patch > > > Recent refactorings to move orc out introduced a regression in split > generation. This leads to reading the orc file footers twice during split > generation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Work started] (HIVE-13861) Fix up nullability issue that might be created by pull up constants rules
[ https://issues.apache.org/jira/browse/HIVE-13861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-13861 started by Jesus Camacho Rodriguez. -- > Fix up nullability issue that might be created by pull up constants rules > - > > Key: HIVE-13861 > URL: https://issues.apache.org/jira/browse/HIVE-13861 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > > When we pull up constants through Union or Sort operators, we might end up > rewriting the original expression into an expression whose schema has > different nullability properties for some of its columns. > This results in AssertionError of the following kind: > {noformat} > ... > org.apache.hive.service.cli.HiveSQLException: Error running query: > java.lang.AssertionError: Internal error: Cannot add expression of different > type to set: > ... > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13861) Fix up nullability issue that might be created by pull up constants rules
[ https://issues.apache.org/jira/browse/HIVE-13861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-13861: --- Status: Patch Available (was: In Progress) > Fix up nullability issue that might be created by pull up constants rules > - > > Key: HIVE-13861 > URL: https://issues.apache.org/jira/browse/HIVE-13861 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > > When we pull up constants through Union or Sort operators, we might end up > rewriting the original expression into an expression whose schema has > different nullability properties for some of its columns. > This results in AssertionError of the following kind: > {noformat} > ... > org.apache.hive.service.cli.HiveSQLException: Error running query: > java.lang.AssertionError: Internal error: Cannot add expression of different > type to set: > ... > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13721) HPL/SQL COPY FROM FTP Statement: lack of DIR option leads to NPE
[ https://issues.apache.org/jira/browse/HIVE-13721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15301886#comment-15301886 ] Dmitry Tolpeko commented on HIVE-13721: --- Small fix, I included it to patch for HIVE-13540 > HPL/SQL COPY FROM FTP Statement: lack of DIR option leads to NPE > > > Key: HIVE-13721 > URL: https://issues.apache.org/jira/browse/HIVE-13721 > Project: Hive > Issue Type: Bug > Components: hpl/sql >Reporter: Carter Shanklin > > The docs (http://www.hplsql.org/copy-from-ftp) suggest DIR is optional. When > I left it out in: > {code} > copy from ftp hdp250.example.com user 'vagrant' pwd 'vagrant' files > 'sampledata.csv' to /tmp overwrite > {code} > I got: > {code} > Ln:2 Connected to ftp: hdp250.example.com (29 ms) > Ln:2 Retrieving directory listing > Listing the current working FTP directory > Ln:2 Files to copy: 45 bytes, 1 file, 0 subdirectories scanned (27 ms) > Exception in thread "main" java.lang.NullPointerException > at org.apache.hive.hplsql.Ftp.getTargetFileName(Ftp.java:342) > at org.apache.hive.hplsql.Ftp.run(Ftp.java:149) > at org.apache.hive.hplsql.Ftp.copyFiles(Ftp.java:121) > at org.apache.hive.hplsql.Ftp.run(Ftp.java:91) > at org.apache.hive.hplsql.Exec.visitCopy_from_ftp_stmt(Exec.java:1292) > at org.apache.hive.hplsql.Exec.visitCopy_from_ftp_stmt(Exec.java:52) > at > org.apache.hive.hplsql.HplsqlParser$Copy_from_ftp_stmtContext.accept(HplsqlParser.java:11956) > at > org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visitChildren(AbstractParseTreeVisitor.java:70) > at org.apache.hive.hplsql.Exec.visitStmt(Exec.java:994) > at org.apache.hive.hplsql.Exec.visitStmt(Exec.java:52) > at > org.apache.hive.hplsql.HplsqlParser$StmtContext.accept(HplsqlParser.java:1012) > at > org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visitChildren(AbstractParseTreeVisitor.java:70) > at > org.apache.hive.hplsql.HplsqlBaseVisitor.visitBlock(HplsqlBaseVisitor.java:28) > at > 
org.apache.hive.hplsql.HplsqlParser$BlockContext.accept(HplsqlParser.java:446) > at > org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visitChildren(AbstractParseTreeVisitor.java:70) > at org.apache.hive.hplsql.Exec.visitProgram(Exec.java:901) > at org.apache.hive.hplsql.Exec.visitProgram(Exec.java:52) > at > org.apache.hive.hplsql.HplsqlParser$ProgramContext.accept(HplsqlParser.java:389) > at > org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visit(AbstractParseTreeVisitor.java:42) > at org.apache.hive.hplsql.Exec.run(Exec.java:760) > at org.apache.hive.hplsql.Exec.run(Exec.java:736) > at org.apache.hive.hplsql.Hplsql.main(Hplsql.java:23) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at org.apache.hadoop.util.RunJar.run(RunJar.java:221) > at org.apache.hadoop.util.RunJar.main(RunJar.java:136) > {code} > Traceback leads to: > {code} > /** >* Get the target file relative path and name >*/ > String getTargetFileName(String file) { > int len = dir.length(); > return targetDir + file.substring(len); > } > {code} > in Ftp.java > When I added DIR '/' this worked. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
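Given the traceback above, a null-safe variant of `getTargetFileName` would avoid the NPE when the DIR option is omitted. This is a standalone sketch: the `dir` and `targetDir` fields and their defaults are assumed from the snippet in this report, not copied from the real Ftp.java, and treating a missing DIR as an empty prefix is one possible fix among others.

```java
public class FtpPathFix {
    String dir;                  // null when the DIR option is omitted
    String targetDir = "/tmp/";  // assumed target, matching the report's example

    // Null-safe rewrite of the getTargetFileName from the traceback:
    // guard against a missing DIR instead of dereferencing null.
    String getTargetFileName(String file) {
        int len = (dir == null) ? 0 : dir.length();
        return targetDir + file.substring(len);
    }

    public static void main(String[] args) {
        FtpPathFix ftp = new FtpPathFix();
        // No DIR set: the whole remote name is kept, no NPE is thrown.
        System.out.println(ftp.getTargetFileName("sampledata.csv")); // /tmp/sampledata.csv
        ftp.dir = "data/";
        // With DIR set, the directory prefix is stripped as before.
        System.out.println(ftp.getTargetFileName("data/sampledata.csv")); // /tmp/sampledata.csv
    }
}
```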
[jira] [Updated] (HIVE-13540) Casts to numeric types don't seem to work in hplsql
[ https://issues.apache.org/jira/browse/HIVE-13540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmitry Tolpeko updated HIVE-13540: -- Affects Version/s: 2.2.0 Status: Patch Available (was: Open) Patch submitted. > Casts to numeric types don't seem to work in hplsql > --- > > Key: HIVE-13540 > URL: https://issues.apache.org/jira/browse/HIVE-13540 > Project: Hive > Issue Type: Bug > Components: hpl/sql >Affects Versions: 2.2.0 >Reporter: Carter Shanklin >Assignee: Dmitry Tolpeko > Attachments: HIVE-13540.1.patch > > > Maybe I'm doing this wrong? But it seems to be broken. > Casts to string types seem to work fine, but not numbers. > This code: > {code} > temp_int = CAST('1' AS int); > print temp_int > temp_float = CAST('1.2' AS float); > print temp_float > temp_double = CAST('1.2' AS double); > print temp_double > temp_decimal = CAST('1.2' AS decimal(10, 4)); > print temp_decimal > temp_string = CAST('1.2' AS string); > print temp_string > {code} > Produces this output: > {code} > [vagrant@hdp250 hplsql]$ hplsql -f temp2.hplsql > which: no hbase in > (/usr/lib64/qt-3.3/bin:/usr/lib/jvm/java/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/opt/puppetlabs/bin:/usr/local/share/jmeter/bin:/home/vagrant/bin) > WARNING: Use "yarn jar" to launch YARN applications. > null > null > null > null > 1.2 > {code} > The software I'm using is not anything released but is pretty close to the > trunk, 2 weeks old at most. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13540) Casts to numeric types don't seem to work in hplsql
[ https://issues.apache.org/jira/browse/HIVE-13540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmitry Tolpeko updated HIVE-13540: -- Attachment: HIVE-13540.1.patch > Casts to numeric types don't seem to work in hplsql > --- > > Key: HIVE-13540 > URL: https://issues.apache.org/jira/browse/HIVE-13540 > Project: Hive > Issue Type: Bug > Components: hpl/sql >Reporter: Carter Shanklin >Assignee: Dmitry Tolpeko > Attachments: HIVE-13540.1.patch > > > Maybe I'm doing this wrong? But it seems to be broken. > Casts to string types seem to work fine, but not numbers. > This code: > {code} > temp_int = CAST('1' AS int); > print temp_int > temp_float = CAST('1.2' AS float); > print temp_float > temp_double = CAST('1.2' AS double); > print temp_double > temp_decimal = CAST('1.2' AS decimal(10, 4)); > print temp_decimal > temp_string = CAST('1.2' AS string); > print temp_string > {code} > Produces this output: > {code} > [vagrant@hdp250 hplsql]$ hplsql -f temp2.hplsql > which: no hbase in > (/usr/lib64/qt-3.3/bin:/usr/lib/jvm/java/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/opt/puppetlabs/bin:/usr/local/share/jmeter/bin:/home/vagrant/bin) > WARNING: Use "yarn jar" to launch YARN applications. > null > null > null > null > 1.2 > {code} > The software I'm using is not anything released but is pretty close to the > trunk, 2 weeks old at most. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13792) Show create table should not show stats info in the table properties
[ https://issues.apache.org/jira/browse/HIVE-13792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15301789#comment-15301789 ] Hive QA commented on HIVE-13792: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12805910/HIVE-13792.3.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 79 failed/errored test(s), 10047 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file TestMiniTezCliDriver-auto_join1.q-schema_evol_text_vec_mapwork_part_all_complex.q-vector_complex_join.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-groupby2.q-tez_dynpart_hashjoin_1.q-custom_input_output_format.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-update_orig_table.q-union2.q-bucket4.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-vector_grouping_sets.q-update_all_partitioned.q-cte_5.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-vectorized_parquet.q-insert_values_non_partitioned.q-schema_evol_orc_nonvec_mapwork_part.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_queries org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket4 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket5 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket6 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_constprog_partitioner org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_disable_merge_for_bucketing org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_map_operators 
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_num_buckets org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_reducers_power_two org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_list_bucket_dml_10 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_orc_merge1 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_orc_merge2 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_orc_merge9 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_orc_merge_diff_fs org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_reduce_deduplicate org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join1 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join2 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join3 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join4 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join5 org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testNegativeCliDriver_minimr_broken_pipe org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join1 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join26 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby5 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join11 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join18 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join_cond_pushdown_4 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_load_dyn_part9 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union15 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union7 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_top_level 
org.apache.hadoop.hive.llap.daemon.impl.comparator.TestFirstInFirstOutComparator.testWaitQueueComparatorParallelism org.apache.hadoop.hive.llap.daemon.impl.comparator.TestFirstInFirstOutComparator.testWaitQueueComparatorWithinDagPriority org.apache.hadoop.hive.llap.daemon.impl.comparator.TestShortestJobFirstComparator.testWaitQueueComparatorWithinDagPriority org.apache.hadoop.hive.llap.tez.TestConverters.testFragmentSpecToTaskSpec org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskCommunicator.testFinishableStateUpdateFailure org.apache.hadoop.hive.metastore.TestAuthzApiEmbedAuthorizerInRemote.org.apache.hadoop.hive.metastore.TestAuthzApiEmbedAuthorizerInRemote org.apache.hadoop.hive.metastore.TestFilterHooks.org.apache.hadoop.hive.metastore.TestFilterHooks org.apache.hadoop.hive.metastore.TestHiveMetaStoreGetMetaConf.org.apache.hadoop.hive.metastore.TestHiveMetaStoreGetMetaConf
[jira] [Updated] (HIVE-13269) Simplify comparison expressions using column stats
[ https://issues.apache.org/jira/browse/HIVE-13269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-13269: --- Resolution: Fixed Fix Version/s: 2.1.0 Status: Resolved (was: Patch Available) Pushed to master, branch 2.1. Thanks for reviewing [~ashutoshc]! > Simplify comparison expressions using column stats > -- > > Key: HIVE-13269 > URL: https://issues.apache.org/jira/browse/HIVE-13269 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Fix For: 2.1.0 > > Attachments: HIVE-13269.01.patch, HIVE-13269.02.patch, > HIVE-13269.03.patch, HIVE-13269.04.patch, HIVE-13269.patch, HIVE-13269.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13818) Fast Vector MapJoin not enhanced to use sortOrder when handling BinarySortable keys for Small Table?
[ https://issues.apache.org/jira/browse/HIVE-13818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-13818: Attachment: vector_bug.q.out > Fast Vector MapJoin not enhanced to use sortOrder when handling > BinarySortable keys for Small Table? > > > Key: HIVE-13818 > URL: https://issues.apache.org/jira/browse/HIVE-13818 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-13818.01.patch, HIVE-13818.02.patch, vector_bug.q, > vector_bug.q.out > > > Changes for HIVE-13682 did fix a bug in Fast Hash Tables, but evidently not > this issue according to Gopal/Rajesh/Nita. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13818) Fast Vector MapJoin not enhanced to use sortOrder when handling BinarySortable keys for Small Table?
[ https://issues.apache.org/jira/browse/HIVE-13818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-13818: Attachment: vector_bug.q > Fast Vector MapJoin not enhanced to use sortOrder when handling > BinarySortable keys for Small Table? > > > Key: HIVE-13818 > URL: https://issues.apache.org/jira/browse/HIVE-13818 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-13818.01.patch, HIVE-13818.02.patch, vector_bug.q > > > Changes for HIVE-13682 did fix a bug in Fast Hash Tables, but evidently not > this issue according to Gopal/Rajesh/Nita. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13818) Fast Vector MapJoin not enhanced to use sortOrder when handling BinarySortable keys for Small Table?
[ https://issues.apache.org/jira/browse/HIVE-13818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15301770#comment-15301770 ] Matt McCline commented on HIVE-13818: - [~gopalv] Thank you very much for working on a repro. I've attached vector_bug.q and its Tez output. The bug isn't triggered. Can you see what I did wrong? Thanks > Fast Vector MapJoin not enhanced to use sortOrder when handling > BinarySortable keys for Small Table? > > > Key: HIVE-13818 > URL: https://issues.apache.org/jira/browse/HIVE-13818 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-13818.01.patch, HIVE-13818.02.patch > > > Changes for HIVE-13682 did fix a bug in Fast Hash Tables, but evidently not > this issue according to Gopal/Rajesh/Nita. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13860) Fix more json related JDK8 test failures
[ https://issues.apache.org/jira/browse/HIVE-13860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohit Sabharwal updated HIVE-13860: --- Status: Patch Available (was: Open) > Fix more json related JDK8 test failures > > > Key: HIVE-13860 > URL: https://issues.apache.org/jira/browse/HIVE-13860 > Project: Hive > Issue Type: Sub-task > Components: Test >Reporter: Mohit Sabharwal >Assignee: Mohit Sabharwal > Attachments: HIVE-13860-java8.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13860) Fix more json related JDK8 test failures
[ https://issues.apache.org/jira/browse/HIVE-13860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohit Sabharwal updated HIVE-13860: --- Attachment: HIVE-13860-java8.patch > Fix more json related JDK8 test failures > > > Key: HIVE-13860 > URL: https://issues.apache.org/jira/browse/HIVE-13860 > Project: Hive > Issue Type: Sub-task > Components: Test >Reporter: Mohit Sabharwal >Assignee: Mohit Sabharwal > Attachments: HIVE-13860-java8.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13729) FileSystem$Cache leaks in FileUtils.checkFileAccessWithImpersonation
[ https://issues.apache.org/jira/browse/HIVE-13729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-13729: --- Fix Version/s: (was: 2.1.0) 2.2.0 > FileSystem$Cache leaks in FileUtils.checkFileAccessWithImpersonation > > > Key: HIVE-13729 > URL: https://issues.apache.org/jira/browse/HIVE-13729 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Daniel Dai >Assignee: Daniel Dai > Fix For: 1.3.0, 2.2.0 > > Attachments: HIVE-13729.1.patch, HIVE-13729.2.patch > > > Didn't invoke FileSystem.closeAllForUGI after checkFileAccess. This results > leak in FileSystem$Cache and eventually OOM for HS2. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12279) Testcase to verify session temporary files are removed after HIVE-11768
[ https://issues.apache.org/jira/browse/HIVE-12279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-12279: --- Fix Version/s: (was: 2.1.0) 2.2.0 > Testcase to verify session temporary files are removed after HIVE-11768 > --- > > Key: HIVE-12279 > URL: https://issues.apache.org/jira/browse/HIVE-12279 > Project: Hive > Issue Type: Test > Components: HiveServer2, Test >Reporter: Daniel Dai >Assignee: Daniel Dai > Fix For: 1.3.0, 2.2.0 > > Attachments: HIVE-12279.1.patch > > > We need to make sure HS2 session temporary files are removed after session > ends. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12279) Testcase to verify session temporary files are removed after HIVE-11768
[ https://issues.apache.org/jira/browse/HIVE-12279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15301690#comment-15301690 ] Jesus Camacho Rodriguez commented on HIVE-12279: [~daijy], could you push to branch-2.1 too? Master is version 2.2.0 now (I updated the fix version accordingly). Thanks > Testcase to verify session temporary files are removed after HIVE-11768 > --- > > Key: HIVE-12279 > URL: https://issues.apache.org/jira/browse/HIVE-12279 > Project: Hive > Issue Type: Test > Components: HiveServer2, Test >Reporter: Daniel Dai >Assignee: Daniel Dai > Fix For: 1.3.0, 2.1.0 > > Attachments: HIVE-12279.1.patch > > > We need to make sure HS2 session temporary files are removed after session > ends. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13513) cleardanglingscratchdir does not work in some version of HDFS
[ https://issues.apache.org/jira/browse/HIVE-13513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-13513: --- Fix Version/s: (was: 2.1.0) 2.2.0 > cleardanglingscratchdir does not work in some version of HDFS > - > > Key: HIVE-13513 > URL: https://issues.apache.org/jira/browse/HIVE-13513 > Project: Hive > Issue Type: Bug >Reporter: Daniel Dai >Assignee: Daniel Dai > Fix For: 1.3.0, 2.2.0 > > Attachments: HIVE-13513.1.patch, HIVE-13513.2.patch > > > On some Hadoop version, we keep getting "lease recovery" message at the time > we check for scratchdir by opening for appending: > {code} > Failed to APPEND_FILE xxx for DFSClient_NONMAPREDUCE_785768631_1 on 10.0.0.18 > because lease recovery is in progress. Try again later. > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:2917) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFileInternal(FSNamesystem.java:2677) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFileInt(FSNamesystem.java:2984) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:2953) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:655) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.append(ClientNamenodeProtocolServerSideTranslatorPB.java:421) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2137) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2133) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > 
at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2131) > {code} > and > {code} > 16/04/14 04:51:56 ERROR hdfs.DFSClient: Failed to close inode 18963 > java.io.IOException: Failed to replace a bad datanode on the existing > pipeline due to no more good datanodes being available to try. (Nodes: > current=[DatanodeInfoWithStorage[10.0.0.12:30010,DS-b355ac2a-a23a-418a-af9b-4c1b4e26afe8,DISK]], > > original=[DatanodeInfoWithStorage[10.0.0.12:30010,DS-b355ac2a-a23a-418a-af9b-4c1b4e26afe8,DISK]]). > The current failed datanode replacement policy is DEFAULT, and a client may > configure this via > 'dfs.client.block.write.replace-datanode-on-failure.policy' in its > configuration. > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:951) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:1017) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1165) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:470) > {code} > The reason is not clear. However, if we remove hsync from SessionState, > everything works as expected. Attach patch to remove hsync call for now. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13513) cleardanglingscratchdir does not work in some version of HDFS
[ https://issues.apache.org/jira/browse/HIVE-13513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15301686#comment-15301686 ] Jesus Camacho Rodriguez commented on HIVE-13513: [~daijy], could you push to branch-2.1 too? Master is version 2.2.0 now (I updated the fix version accordingly). Thanks > cleardanglingscratchdir does not work in some version of HDFS > - > > Key: HIVE-13513 > URL: https://issues.apache.org/jira/browse/HIVE-13513 > Project: Hive > Issue Type: Bug >Reporter: Daniel Dai >Assignee: Daniel Dai > Fix For: 1.3.0, 2.2.0 > > Attachments: HIVE-13513.1.patch, HIVE-13513.2.patch > > > On some Hadoop version, we keep getting "lease recovery" message at the time > we check for scratchdir by opening for appending: > {code} > Failed to APPEND_FILE xxx for DFSClient_NONMAPREDUCE_785768631_1 on 10.0.0.18 > because lease recovery is in progress. Try again later. > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:2917) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFileInternal(FSNamesystem.java:2677) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFileInt(FSNamesystem.java:2984) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:2953) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:655) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.append(ClientNamenodeProtocolServerSideTranslatorPB.java:421) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2137) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2133) > at 
java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2131) > {code} > and > {code} > 16/04/14 04:51:56 ERROR hdfs.DFSClient: Failed to close inode 18963 > java.io.IOException: Failed to replace a bad datanode on the existing > pipeline due to no more good datanodes being available to try. (Nodes: > current=[DatanodeInfoWithStorage[10.0.0.12:30010,DS-b355ac2a-a23a-418a-af9b-4c1b4e26afe8,DISK]], > > original=[DatanodeInfoWithStorage[10.0.0.12:30010,DS-b355ac2a-a23a-418a-af9b-4c1b4e26afe8,DISK]]). > The current failed datanode replacement policy is DEFAULT, and a client may > configure this via > 'dfs.client.block.write.replace-datanode-on-failure.policy' in its > configuration. > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:951) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:1017) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1165) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:470) > {code} > The reason is not clear. However, if we remove hsync from SessionState, > everything works as expected. Attach patch to remove hsync call for now. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13551) Make cleardanglingscratchdir work on Windows
[ https://issues.apache.org/jira/browse/HIVE-13551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15301684#comment-15301684 ] Jesus Camacho Rodriguez commented on HIVE-13551: [~daijy], could you push to branch-2.1 too? Master is version 2.2.0 now (I updated the fix version accordingly). Thanks > Make cleardanglingscratchdir work on Windows > > > Key: HIVE-13551 > URL: https://issues.apache.org/jira/browse/HIVE-13551 > Project: Hive > Issue Type: Bug >Reporter: Daniel Dai >Assignee: Daniel Dai > Fix For: 1.3.0, 2.2.0 > > Attachments: HIVE-13551.1.patch, HIVE-13551.2.patch > > > See a couple of issues when running cleardanglingscratchdir on Windows, > includes: > 1. dfs.support.append is set to false in Azure cluster, need an alternative > way when append is disabled > 2. fix for cmd scripts > 3. fix UT on Windows -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13551) Make cleardanglingscratchdir work on Windows
[ https://issues.apache.org/jira/browse/HIVE-13551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-13551: --- Fix Version/s: (was: 2.1.0) 2.2.0 > Make cleardanglingscratchdir work on Windows > > > Key: HIVE-13551 > URL: https://issues.apache.org/jira/browse/HIVE-13551 > Project: Hive > Issue Type: Bug >Reporter: Daniel Dai >Assignee: Daniel Dai > Fix For: 1.3.0, 2.2.0 > > Attachments: HIVE-13551.1.patch, HIVE-13551.2.patch > > > See a couple of issues when running cleardanglingscratchdir on Windows, > includes: > 1. dfs.support.append is set to false in Azure cluster, need an alternative > way when append is disabled > 2. fix for cmd scripts > 3. fix UT on Windows -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13354) Add ability to specify Compaction options per table and per request
[ https://issues.apache.org/jira/browse/HIVE-13354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Zheng updated HIVE-13354: - Attachment: HIVE-13354.3.patch Thanks Eugene. 1. I split the first round compaction worker job for ttp2 and ttp1 into the explicit way, so that we can compare the value we get from tblproperties (2048) vs the default value (1024). 2. Added symbolic constants in Initiator and CompactorMR. > Add ability to specify Compaction options per table and per request > --- > > Key: HIVE-13354 > URL: https://issues.apache.org/jira/browse/HIVE-13354 > Project: Hive > Issue Type: Improvement >Affects Versions: 1.3.0, 2.0.0 >Reporter: Eugene Koifman >Assignee: Wei Zheng > Labels: TODOC2.1 > Attachments: HIVE-13354.1.patch, > HIVE-13354.1.withoutSchemaChange.patch, HIVE-13354.2.patch, HIVE-13354.3.patch > > > Currently there are a few options that determine when automatic compaction is > triggered. They are specified once for the warehouse. > This doesn't make sense - some tables may be more important and need to be > compacted more often. > We should allow specifying these on a per-table basis. > Also, compaction is an MR job launched from within the metastore. There is > currently no way to control job parameters (like memory, for example) except > to specify them in hive-site.xml for the metastore, which means they are site-wide. > Should add a way to specify these per table (perhaps even per compaction if > launched via ALTER TABLE) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
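The resolution rule the comment is testing (2048 from tblproperties vs the 1024 warehouse default) can be sketched as a small helper. This is a hedged illustration: the `"compactor."` property prefix follows the convention this patch discusses, but the helper name and exact key are assumptions, not Hive's actual code.

```java
import java.util.HashMap;
import java.util.Map;

// Hedged sketch of per-table compaction overrides: a table-level
// "compactor."-prefixed property, when present, wins over the site default.
public class CompactorProps {

    // Hypothetical helper: look up "compactor.<key>" in the table's
    // properties and fall back to the warehouse-wide default.
    static long resolve(Map<String, String> tblProps, String key, long siteDefault) {
        String v = tblProps.get("compactor." + key);
        return (v != null) ? Long.parseLong(v) : siteDefault;
    }

    public static void main(String[] args) {
        // ttp2 sets an override via TBLPROPERTIES; ttp1 does not.
        Map<String, String> ttp2 = new HashMap<>();
        ttp2.put("compactor.mapreduce.map.memory.mb", "2048");
        Map<String, String> ttp1 = new HashMap<>();

        // Mirrors the comparison in the comment: 2048 vs the 1024 default.
        System.out.println(resolve(ttp2, "mapreduce.map.memory.mb", 1024));
        System.out.println(resolve(ttp1, "mapreduce.map.memory.mb", 1024));
    }
}
```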
[jira] [Commented] (HIVE-13818) Fast Vector MapJoin not enhanced to use sortOrder when handling BinarySortable keys for Small Table?
[ https://issues.apache.org/jira/browse/HIVE-13818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15301635#comment-15301635 ] Gopal V commented on HIVE-13818: Here's the smallest scenario which triggers the issue right now.
{code}
create temporary table x (a int) stored as orc;
create temporary table y (b int) stored as orc;
insert into x values(1);
insert into y values(1);
select count(1) from x, y where a = b;

Caused by: java.io.EOFException
  at org.apache.hadoop.hive.serde2.binarysortable.InputByteBuffer.read(InputByteBuffer.java:54)
  at org.apache.hadoop.hive.serde2.binarysortable.fast.BinarySortableDeserializeRead.readCheckNull(BinarySortableDeserializeRead.java:182)
  at org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastLongHashTable.putRow(VectorMapJoinFastLongHashTable.java:81)
  at org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastTableContainer.putRow(VectorMapJoinFastTableContainer.java:181)
  at org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTableLoader.load(VectorMapJoinFastHashTableLoader.java:98)
{code}
To test the theory, I tried with
{code}
create temporary table x1 (a bigint) stored as orc;
create temporary table y1 (b bigint) stored as orc;
insert into x1 values(1);
insert into y1 values(1);
select count(1) from x1, y1 where a = b;

OK
1
Time taken: 1.532 seconds, Fetched: 1 row(s)
{code}
> Fast Vector MapJoin not enhanced to use sortOrder when handling > BinarySortable keys for Small Table? > > > Key: HIVE-13818 > URL: https://issues.apache.org/jira/browse/HIVE-13818 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-13818.01.patch, HIVE-13818.02.patch > > > Changes for HIVE-13682 did fix a bug in Fast Hash Tables, but evidently not > this issue according to Gopal/Rajesh/Nita. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13376) HoS emits too many logs with application state
[ https://issues.apache.org/jira/browse/HIVE-13376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15301613#comment-15301613 ] Rui Li commented on HIVE-13376: --- Hi [~xuefuz], I just briefly looked at the code. Although there are switches to control whether to log the app state, they are not exposed to the user via configuration. So I think that in order to disable the logging, we either need a log level higher than INFO, or we can disable {{spark.yarn.submit.waitAppCompletion}} (which only works for yarn-cluster). Otherwise we need the interval to avoid the verbose state logs. Let me know if there is another way to achieve it. Related code in {{Client.scala}}:
{code}
def monitorApplication(
    appId: ApplicationId,
    returnOnRunning: Boolean = false,
    logApplicationReport: Boolean = true): (YarnApplicationState, FinalApplicationStatus) = {
  val interval = sparkConf.getLong("spark.yarn.report.interval", 1000)
  var lastState: YarnApplicationState = null
  while (true) {
    Thread.sleep(interval)
    val report: ApplicationReport =
      try {
        getApplicationReport(appId)
      } catch {
        case e: ApplicationNotFoundException =>
          logError(s"Application $appId not found.")
          return (YarnApplicationState.KILLED, FinalApplicationStatus.KILLED)
        case NonFatal(e) =>
          logError(s"Failed to contact YARN for application $appId.", e)
          return (YarnApplicationState.FAILED, FinalApplicationStatus.FAILED)
      }
    val state = report.getYarnApplicationState

    if (logApplicationReport) {
      logInfo(s"Application report for $appId (state: $state)")

      // If DEBUG is enabled, log report details every iteration
      // Otherwise, log them every time the application changes state
      if (log.isDebugEnabled) {
        logDebug(formatReportDetails(report))
      } else if (lastState != state) {
        logInfo(formatReportDetails(report))
      }
    }

    if (lastState != state) {
      state match {
        case YarnApplicationState.RUNNING =>
          reportLauncherState(SparkAppHandle.State.RUNNING)
        case YarnApplicationState.FINISHED =>
          reportLauncherState(SparkAppHandle.State.FINISHED)
        case YarnApplicationState.FAILED =>
          reportLauncherState(SparkAppHandle.State.FAILED)
        case YarnApplicationState.KILLED =>
          reportLauncherState(SparkAppHandle.State.KILLED)
        case _ =>
      }
    }

    if (state == YarnApplicationState.FINISHED ||
        state == YarnApplicationState.FAILED ||
        state == YarnApplicationState.KILLED) {
      cleanupStagingDir(appId)
      return (state, report.getFinalApplicationStatus)
    }

    if (returnOnRunning && state == YarnApplicationState.RUNNING) {
      return (state, report.getFinalApplicationStatus)
    }

    lastState = state
  }

  // Never reached, but keeps compiler happy
  throw new SparkException("While loop is depleted! This should never happen...")
}
{code}
> HoS emits too many logs with application state > -- > > Key: HIVE-13376 > URL: https://issues.apache.org/jira/browse/HIVE-13376 > Project: Hive > Issue Type: Improvement > Components: Spark >Reporter: Szehon Ho >Assignee: Szehon Ho > Fix For: 2.1.0 > > Attachments: HIVE-13376.2.patch, HIVE-13376.patch > > > The logs get flooded with something like: > > Mar 28, 3:12:21.851 PM INFO > > org.apache.hive.spark.client.SparkClientImpl > > [stderr-redir-1]: 16/03/28 15:12:21 INFO yarn.Client: Application report > > for application_1458679386200_0161 (state: RUNNING) > > Mar 28, 3:12:21.912 PM INFO > > org.apache.hive.spark.client.SparkClientImpl > > [stderr-redir-1]: 16/03/28 15:12:21 INFO yarn.Client: Application report > > for application_1458679386200_0149 (state: RUNNING) > > Mar 28, 3:12:22.853 PM INFO > > org.apache.hive.spark.client.SparkClientImpl > > [stderr-redir-1]: 16/03/28 15:12:22 INFO yarn.Client: Application report > > for application_1458679386200_0161 (state: RUNNING) > > Mar 28, 3:12:22.913 PM INFO > > org.apache.hive.spark.client.SparkClientImpl > > [stderr-redir-1]: 16/03/28 15:12:22 INFO yarn.Client: Application report > > for application_1458679386200_0149 (state: RUNNING) > > Mar 28, 3:12:23.855 PM INFO > > org.apache.hive.spark.client.SparkClientImpl > > [stderr-redir-1]: 16/03/28 15:12:23 INFO yarn.Client: Application report > > for application_1458679386200_0161 (state: RUNNING) > While this is good information, it is a bit much. > Seems like SparkJobMonitor hard-codes its interval to 1 second. It should be > higher and perhaps made configurable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
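The fix direction in the comment above (read the interval from configuration instead of hard-coding 1 second) can be sketched in a few lines. The conf key mirrors {{spark.yarn.report.interval}} (milliseconds, default 1000) from the {{Client.scala}} excerpt; the clamping rule and class name are assumptions for illustration, not Hive's or Spark's actual code.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of a configurable polling interval for application-state logging.
public class ReportInterval {

    // Read the interval from conf, defaulting to the old hard-coded 1000 ms,
    // and never poll faster than that (so misconfiguration can't flood logs).
    static long intervalMs(Map<String, String> conf) {
        long v = Long.parseLong(conf.getOrDefault("spark.yarn.report.interval", "1000"));
        return Math.max(v, 1000L);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println(intervalMs(conf));             // default: 1000 ms

        // Raising the interval yields one state log line per minute
        // instead of one per second.
        conf.put("spark.yarn.report.interval", "60000");
        System.out.println(intervalMs(conf));
    }
}
```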
[jira] [Updated] (HIVE-13518) Hive on Tez: Shuffle joins do not choose the right 'big' table.
[ https://issues.apache.org/jira/browse/HIVE-13518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-13518: -- Attachment: HIVE-13518.3.patch > Hive on Tez: Shuffle joins do not choose the right 'big' table. > --- > > Key: HIVE-13518 > URL: https://issues.apache.org/jira/browse/HIVE-13518 > Project: Hive > Issue Type: Bug >Reporter: Vikram Dixit K >Assignee: Vikram Dixit K > Attachments: HIVE-13518.1.patch, HIVE-13518.2.patch, > HIVE-13518.3.patch > > > Currently the big table is always assumed to be at position 0 but this isn't > efficient for some queries as the big table at position 1 could have a lot > more keys/skew. We already have a mechanism of choosing the big table that > can be leveraged to make the right choice. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13729) FileSystem$Cache leaks in FileUtils.checkFileAccessWithImpersonation
[ https://issues.apache.org/jira/browse/HIVE-13729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15301601#comment-15301601 ] Lefty Leverenz commented on HIVE-13729: --- [~daijy], branch-2.1 was cut this morning so now master is for 2.2.0. If you want this patch to go in release 2.1.0, you'll have to commit it to branch-2.1. The same goes for your other commits today: HIVE-12279, HIVE-13513, and HIVE-13551. > FileSystem$Cache leaks in FileUtils.checkFileAccessWithImpersonation > > > Key: HIVE-13729 > URL: https://issues.apache.org/jira/browse/HIVE-13729 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Daniel Dai >Assignee: Daniel Dai > Fix For: 1.3.0, 2.1.0 > > Attachments: HIVE-13729.1.patch, HIVE-13729.2.patch > > > Didn't invoke FileSystem.closeAllForUGI after checkFileAccess. This results in a > leak in FileSystem$Cache and eventually an OOM in HS2. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13818) Fast Vector MapJoin not enhanced to use sortOrder when handling BinarySortable keys for Small Table?
[ https://issues.apache.org/jira/browse/HIVE-13818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15301587#comment-15301587 ] Gopal V commented on HIVE-13818: TPC-DS 55 is the one that failed - I will try to get a smaller repro for this, but it looks like it should fail in a two-row test case (or at least produce incorrect results). The larger-range join keys in hive-testbench were upgraded to BigInt sometime during the 100Tb testing, so you might need to undo some schema changes there to repro this - otherwise you might not be testing the Integer:Integer join scenario. I'm repeating this from memory from debugging this a couple of nights ago (might build a repro tomorrow).
{code}
if (keyBinarySortableDeserializeRead.readCheckNull()) {
  return;
}

long key = VectorMapJoinFastLongHashUtil.deserializeLongKey(
    keyBinarySortableDeserializeRead, hashTableKeyType);
{code}
As explained above, it looks like BinarySortableSerDe handles Long and Integer differently, so just because the Join op says LongLongInner, the deserializer for Long cannot be used for joins involving integers. This is *not* an issue with LazyBinary; only BinarySortable encodes Longs and Integers differently. In all the runs I could manage, the join worked whenever I cast up to bigint. The problem seems to be that readCheckNull() does not know what the actual hashTableKeyType is here, reads a Long out of an encoded Int, and runs out of bytes to read (i.e. not 8 bytes). From readCheckNull(), this is where it goes into the deep end:
{code}
/*
 * We have a field and are positioned to it. Read it.
 */
...
case INT:
  {
    final boolean invert = columnSortOrderIsDesc[fieldIndex];
    int v = inputByteBuffer.read(invert) ^ 0x80;
    for (int i = 0; i < 3; i++) {
      v = (v << 8) + (inputByteBuffer.read(invert) & 0xff);
    }
    currentInt = v;
  }
  break;
case LONG:
  {
    final boolean invert = columnSortOrderIsDesc[fieldIndex];
    long v = inputByteBuffer.read(invert) ^ 0x80;
    for (int i = 0; i < 7; i++) {
      v = (v << 8) + (inputByteBuffer.read(invert) & 0xff);
    }
    currentLong = v;
  }
  break;
{code}
The Integer:Integer join case hits the second case expression there and throws an EOF. Changing all joins to Long:Long allows me to run queries successfully. > Fast Vector MapJoin not enhanced to use sortOrder when handling > BinarySortable keys for Small Table? > > > Key: HIVE-13818 > URL: https://issues.apache.org/jira/browse/HIVE-13818 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-13818.01.patch, HIVE-13818.02.patch > > > Changes for HIVE-13682 did fix a bug in Fast Hash Tables, but evidently not > this issue according to Gopal/Rajesh/Nita. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
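The EOF mechanism Gopal describes can be reproduced with a minimal re-implementation of the encoding: BinarySortable writes an INT as 4 bytes and a BIGINT as 8 bytes (sign bit flipped on the leading byte so byte order sorts correctly), so decoding a serialized INT key with the LONG code path reads past the end of the buffer. This is a hedged sketch of the format, not Hive's actual classes; the class and method names are illustrative.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;

// Minimal model of BinarySortable's int/long encoding, showing why the
// LONG deserialization path throws EOFException on a 4-byte INT key.
public class BinarySortableDemo {

    // Encode an int as 4 bytes, flipping the sign bit of the leading byte
    // (so values compare correctly as unsigned bytes, as BinarySortable does).
    static byte[] writeInt(int v) {
        return new byte[] {
            (byte) ((v >> 24) ^ 0x80),
            (byte) (v >> 16), (byte) (v >> 8), (byte) v
        };
    }

    // The LONG path from readCheckNull(): consumes 8 bytes total.
    static long readLong(ByteArrayInputStream in) throws IOException {
        long v = read(in) ^ 0x80;
        for (int i = 0; i < 7; i++) {
            v = (v << 8) + (read(in) & 0xff);
        }
        return v;
    }

    static int read(ByteArrayInputStream in) throws IOException {
        int b = in.read();
        if (b < 0) {
            throw new EOFException(); // the failure surfacing in putRow()
        }
        return b;
    }

    public static void main(String[] args) throws IOException {
        byte[] intKey = writeInt(1); // 4 bytes: an INT join key
        try {
            readLong(new ByteArrayInputStream(intKey)); // LONG path on INT data
        } catch (EOFException e) {
            System.out.println("EOF: LONG reader ran past the 4-byte INT key");
        }
    }
}
```

Casting both sides of the join to bigint makes the key an 8-byte encoding, which is why the LONG path then succeeds, matching Gopal's observation.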