[jira] [Updated] (HIVE-17192) Add InterfaceAudience and InterfaceStability annotations for Stats Collection APIs

2017-08-25 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-17192:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Merged to master.
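For context, a minimal sketch of what such annotations look like, assuming Hive's own classification annotations in org.apache.hadoop.hive.common.classification; the interface and method below are illustrative placeholders, not the actual stats-collection APIs annotated by the patch:

{code:java}
import org.apache.hadoop.hive.common.classification.InterfaceAudience;
import org.apache.hadoop.hive.common.classification.InterfaceStability;

// Declares the API public but still allowed to change between releases.
@InterfaceAudience.Public
@InterfaceStability.Evolving
public interface ExampleStatsCollector {
  // Illustrative method; the real stats-collection interfaces live in ql/.
  void publishStat(String key, String value);
}
{code}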

> Add InterfaceAudience and InterfaceStability annotations for Stats Collection 
> APIs
> --
>
> Key: HIVE-17192
> URL: https://issues.apache.org/jira/browse/HIVE-17192
> Project: Hive
>  Issue Type: Sub-task
>  Components: Statistics
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-17192.1.patch
>
>






[jira] [Updated] (HIVE-17225) HoS DPP pruning sink ops can target parallel work objects

2017-08-25 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-17225:

Attachment: HIVE-17225.2.patch

> HoS DPP pruning sink ops can target parallel work objects
> -
>
> Key: HIVE-17225
> URL: https://issues.apache.org/jira/browse/HIVE-17225
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: 3.0.0
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE17225.1.patch, HIVE-17225.2.patch
>
>
> Setup:
> {code:sql}
> SET hive.spark.dynamic.partition.pruning=true;
> SET hive.strict.checks.cartesian.product=false;
> SET hive.auto.convert.join=true;
> CREATE TABLE partitioned_table1 (col int) PARTITIONED BY (part_col int);
> CREATE TABLE regular_table1 (col int);
> CREATE TABLE regular_table2 (col int);
> ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 1);
> ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 2);
> ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 3);
> INSERT INTO table regular_table1 VALUES (1), (2), (3), (4), (5), (6);
> INSERT INTO table regular_table2 VALUES (1), (2), (3), (4), (5), (6);
> INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 1) VALUES (1);
> INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 2) VALUES (2);
> INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 3) VALUES (3);
> SELECT *
> FROM   partitioned_table1,
>regular_table1 rt1,
>regular_table2 rt2
> WHERE  rt1.col = partitioned_table1.part_col
>AND rt2.col = partitioned_table1.part_col;
> {code}
> Exception:
> {code}
> 2017-08-01T13:27:47,483 ERROR [b0d354a8-4cdb-4ba9-acec-27d14926aaf4 main] 
> ql.Driver: FAILED: Execution Error, return code 3 from 
> org.apache.hadoop.hive.ql.exec.spark.SparkTask. java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.io.FileNotFoundException: File 
> file:/Users/stakiar/Documents/idea/apache-hive/itests/qtest-spark/target/tmp/scratchdir/stakiar/b0d354a8-4cdb-4ba9-acec-27d14926aaf4/hive_2017-08-01_13-27-45_553_1088589686371686526-1/-mr-10004/3/5
>  does not exist
>   at 
> org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:408)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:498)
>   at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:200)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
>   at scala.Option.getOrElse(Option.scala:121)
>   at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
>   at scala.Option.getOrElse(Option.scala:121)
>   at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
>   at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82)
>   at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>   at scala.collection.immutable.List.foreach(List.scala:381)
>   at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
>   at scala.collection.immutable.List.map(List.scala:285)
>   at org.apache.spark.rdd.UnionRDD.getPartitions(UnionRDD.scala:82)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
>   at scala.Option.getOrElse(Option.scala:121)
>   at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
>   at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82)
>   at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>   at scala.collection.immutable.List.foreach(List.scala:381)
>   at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
>   at scala.collection.immutable.List.map(List.scala:285)
>   at org.apache.spark.rdd.UnionRDD.getPartitions(UnionRDD.scala:82)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
>   at scala.Option.getOrElse(Option.scala:121)
>   

[jira] [Assigned] (HIVE-17225) HoS DPP pruning sink ops can target parallel work objects

2017-08-25 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar reassigned HIVE-17225:
---

Assignee: Sahil Takiar  (was: Janaki Lahorani)

> HoS DPP pruning sink ops can target parallel work objects
> -
>
> Key: HIVE-17225
> URL: https://issues.apache.org/jira/browse/HIVE-17225
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: 3.0.0
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE17225.1.patch
>
>
> Setup:
> {code:sql}
> SET hive.spark.dynamic.partition.pruning=true;
> SET hive.strict.checks.cartesian.product=false;
> SET hive.auto.convert.join=true;
> CREATE TABLE partitioned_table1 (col int) PARTITIONED BY (part_col int);
> CREATE TABLE regular_table1 (col int);
> CREATE TABLE regular_table2 (col int);
> ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 1);
> ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 2);
> ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 3);
> INSERT INTO table regular_table1 VALUES (1), (2), (3), (4), (5), (6);
> INSERT INTO table regular_table2 VALUES (1), (2), (3), (4), (5), (6);
> INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 1) VALUES (1);
> INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 2) VALUES (2);
> INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 3) VALUES (3);
> SELECT *
> FROM   partitioned_table1,
>regular_table1 rt1,
>regular_table2 rt2
> WHERE  rt1.col = partitioned_table1.part_col
>AND rt2.col = partitioned_table1.part_col;
> {code}
> Exception:
> {code}
> 2017-08-01T13:27:47,483 ERROR [b0d354a8-4cdb-4ba9-acec-27d14926aaf4 main] 
> ql.Driver: FAILED: Execution Error, return code 3 from 
> org.apache.hadoop.hive.ql.exec.spark.SparkTask. java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.io.FileNotFoundException: File 
> file:/Users/stakiar/Documents/idea/apache-hive/itests/qtest-spark/target/tmp/scratchdir/stakiar/b0d354a8-4cdb-4ba9-acec-27d14926aaf4/hive_2017-08-01_13-27-45_553_1088589686371686526-1/-mr-10004/3/5
>  does not exist
>   at 
> org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:408)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:498)
>   at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:200)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
>   at scala.Option.getOrElse(Option.scala:121)
>   at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
>   at scala.Option.getOrElse(Option.scala:121)
>   at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
>   at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82)
>   at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>   at scala.collection.immutable.List.foreach(List.scala:381)
>   at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
>   at scala.collection.immutable.List.map(List.scala:285)
>   at org.apache.spark.rdd.UnionRDD.getPartitions(UnionRDD.scala:82)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
>   at scala.Option.getOrElse(Option.scala:121)
>   at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
>   at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82)
>   at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>   at scala.collection.immutable.List.foreach(List.scala:381)
>   at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
>   at scala.collection.immutable.List.map(List.scala:285)
>   at org.apache.spark.rdd.UnionRDD.getPartitions(UnionRDD.scala:82)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
>   at scala.Option.getOrElse(Option.scala:121)
>

[jira] [Commented] (HIVE-17368) DBTokenStore fails to connect in Kerberos enabled remote HMS environment

2017-08-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16142622#comment-16142622
 ] 

Hive QA commented on HIVE-17368:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12883845/HIVE-17368.02-branch-2.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 10603 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[comments] (batchId=35)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[explaindenpendencydiffengs]
 (batchId=38)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] 
(batchId=142)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] 
(batchId=139)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[explaindenpendencydiffengs]
 (batchId=115)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorized_ptf] 
(batchId=125)
org.apache.hadoop.hive.ql.security.TestExtendedAcls.testPartition (batchId=228)
org.apache.hadoop.hive.ql.security.TestFolderPermissions.testPartition 
(batchId=217)
org.apache.hive.hcatalog.api.TestHCatClient.testTransportFailure (batchId=176)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6553/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6553/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6553/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12883845 - PreCommit-HIVE-Build

> DBTokenStore fails to connect in Kerberos enabled remote HMS environment
> 
>
> Key: HIVE-17368
> URL: https://issues.apache.org/jira/browse/HIVE-17368
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.1.0, 2.0.0, 2.1.0, 2.2.0
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-17368.01-branch-2.patch, HIVE-17368.01.patch, 
> HIVE-17368.02-branch-2.patch
>
>
> In setups where HMS is running as a remote process secured using Kerberos, 
> and when {{DBTokenStore}} is configured as the token store, HS2 Thrift API 
> calls like {{GetDelegationToken}}, {{CancelDelegationToken}} and 
> {{RenewDelegationToken}} fail with the exception trace seen below. HS2 is 
> not able to invoke the HMS APIs needed to add/remove/renew tokens from the 
> DB, since the user issuing the {{GetDelegationToken}} may not be Kerberos 
> enabled.
> E.g., Oozie submits a job on behalf of user "Joe". When Oozie opens a session 
> with HS2 it uses Oozie's principal and creates a proxy UGI with Hive. This 
> principal can establish a transport authenticated using Kerberos. It stores 
> the HMS delegation token string in the sessionConf and sessionToken. Now, 
> let's say Oozie issues a {{GetDelegationToken}} which has {{Joe}} as the 
> owner and {{oozie}} as the renewer in {{GetDelegationTokenReq}}. This API 
> call cannot instantiate an HMSClient and open a transport to HMS using the 
> HMSToken string available in the sessionConf, since DBTokenStore uses the 
> server HiveConf instead of the sessionConf. It tries to establish the 
> transport using Kerberos, and it fails since user Joe is not Kerberos enabled.
> I see the following exception trace in HS2 logs.
> {noformat}
> 2017-08-21T18:07:19,644 ERROR [HiveServer2-Handler-Pool: Thread-61] 
> transport.TSaslTransport: SASL negotiation failure
> javax.security.sasl.SaslException: GSS initiate failed
> at 
> com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
>  ~[?:1.8.0_121]
> at 
> org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94)
>  ~[libthrift-0.9.3.jar:0.9.3]
> at 
> org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271) 
> [libthrift-0.9.3.jar:0.9.3]
> at 
> org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
>  [libthrift-0.9.3.jar:0.9.3]
> at 
> org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52)
>  [hive-shims-common-2.3.0-SNAPSHOT.jar:2.3.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49)
>  [hive-shims-common-2.3.0-SNAPSHOT.jar:2.3.0-SNAPSHOT]
> at java.security.AccessController.doPrivileged(Native Method) 
> ~[?:1.8.0_121]
> 

[jira] [Commented] (HIVE-17100) Improve HS2 operation logs for REPL commands.

2017-08-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16142609#comment-16142609
 ] 

Hive QA commented on HIVE-17100:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12883838/HIVE-17100.08.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 11005 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=61)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=235)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=235)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6552/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6552/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6552/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12883838 - PreCommit-HIVE-Build

> Improve HS2 operation logs for REPL commands.
> -
>
> Key: HIVE-17100
> URL: https://issues.apache.org/jira/browse/HIVE-17100
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17100.01.patch, HIVE-17100.02.patch, 
> HIVE-17100.03.patch, HIVE-17100.04.patch, HIVE-17100.05.patch, 
> HIVE-17100.06.patch, HIVE-17100.07.patch, HIVE-17100.08.patch
>
>
> It is necessary to log the progress of the replication tasks in a structured 
> manner, as follows (a hypothetical example log line is sketched after this 
> list).
> *+Bootstrap Dump:+*
> * At the start of bootstrap dump, will add one log with below details.
> {color:#59afe1}* Database Name
> * Dump Type (BOOTSTRAP)
> * (Estimated) Total number of tables/views to dump
> * (Estimated) Total number of functions to dump.
> * Dump Start Time{color}
> * After each table dump, will add a log as follows
> {color:#59afe1}* Table/View Name
> * Type (TABLE/VIEW/MATERIALIZED_VIEW)
> * Table dump end time
> * Table dump progress. Format is Table sequence no/(Estimated) Total number 
> of tables and views.{color}
> * After each function dump, will add a log as follows
> {color:#59afe1}* Function Name
> * Function dump end time
> * Function dump progress. Format is Function sequence no/(Estimated) Total 
> number of functions.{color}
> * After completion of all dumps, will add a log as follows to consolidate the 
> dump.
> {color:#59afe1}* Database Name.
> * Dump Type (BOOTSTRAP).
> * Dump End Time.
> * (Actual) Total number of tables/views dumped.
> * (Actual) Total number of functions dumped.
> * Dump Directory.
> * Last Repl ID of the dump.{color}
> *Note:* The actual and estimated numbers of tables/functions may not match if 
> any table/function is dropped while the dump is in progress.
> *+Bootstrap Load:+*
> * At the start of bootstrap load, will add one log with below details.
> {color:#59afe1}* Database Name
> * Dump directory
> * Load Type (BOOTSTRAP)
> * Total number of tables/views to load
> * Total number of functions to load.
> * Load Start Time{color}
> * After each table load, will add a log as follows
> {color:#59afe1}* Table/View Name
> * Type (TABLE/VIEW/MATERIALIZED_VIEW)
> * Table load completion time
> * Table load progress. Format is Table sequence no/Total number of tables and 
> views.{color}
> * After each function load, will add a log as follows
> {color:#59afe1}* Function Name
> * Function load completion time
> * Function load progress. Format is Function sequence no/Total number of 
> functions.{color}
> * After completion of all loads, will add a log as follows to consolidate the 
> load.
> {color:#59afe1}* Database Name.
> * Load Type (BOOTSTRAP).
> * Load End Time.
> * Total number of tables/views loaded.
> * Total number of functions loaded.
> * Last Repl ID of the loaded database.{color}
> *+Incremental Dump:+*
> * At the start of database dump, will add one log with below details.
> {color:#59afe1}* Database Name
> * Dump Type (INCREMENTAL)
> * (Estimated) Total number of events to dump.
> * Dump Start Time{color}
> * After each event dump, will add a log as follows
> {color:#59afe1}* Event ID
> * Event Type (CREATE_TABLE, 
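A hypothetical rendering of the bootstrap-dump start log described above, composed only from the fields listed; the issue does not fix an exact wire format, so treat this as illustrative:

{noformat}
REPL_DUMP_START: dbName=default, dumpType=BOOTSTRAP,
estimatedNumTables=150, estimatedNumFunctions=12,
dumpStartTime=2017-08-25T10:00:00Z
{noformat}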

[jira] [Updated] (HIVE-17205) add functional support for unbucketed tables

2017-08-25 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17205:
--
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

> add functional support for unbucketed tables
> 
>
> Key: HIVE-17205
> URL: https://issues.apache.org/jira/browse/HIVE-17205
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Fix For: 3.0.0
>
> Attachments: HIVE-17205.01.patch, HIVE-17205.02.patch, 
> HIVE-17205.03.patch, HIVE-17205.09.patch, HIVE-17205.10.patch, 
> HIVE-17205.11.patch, HIVE-17205.12.patch, HIVE-17205.13.patch, 
> HIVE-17205.14.patch, HIVE-17205.15.patch, HIVE-17205.16.patch
>
>
> make sure unbucketed tables can be marked transactional=true
> make insert/update/delete/compaction work





[jira] [Commented] (HIVE-17205) add functional support for unbucketed tables

2017-08-25 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16142587#comment-16142587
 ] 

Eugene Koifman commented on HIVE-17205:
---

Patch 16 committed to master.
Thanks Wei for the review.

> add functional support for unbucketed tables
> 
>
> Key: HIVE-17205
> URL: https://issues.apache.org/jira/browse/HIVE-17205
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-17205.01.patch, HIVE-17205.02.patch, 
> HIVE-17205.03.patch, HIVE-17205.09.patch, HIVE-17205.10.patch, 
> HIVE-17205.11.patch, HIVE-17205.12.patch, HIVE-17205.13.patch, 
> HIVE-17205.14.patch, HIVE-17205.15.patch, HIVE-17205.16.patch
>
>
> make sure unbucketed tables can be marked transactional=true
> make insert/update/delete/compaction work





[jira] [Updated] (HIVE-17205) add functional support for unbucketed tables

2017-08-25 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17205:
--
Summary: add functional support for unbucketed tables  (was: add functional 
support)

> add functional support for unbucketed tables
> 
>
> Key: HIVE-17205
> URL: https://issues.apache.org/jira/browse/HIVE-17205
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-17205.01.patch, HIVE-17205.02.patch, 
> HIVE-17205.03.patch, HIVE-17205.09.patch, HIVE-17205.10.patch, 
> HIVE-17205.11.patch, HIVE-17205.12.patch, HIVE-17205.13.patch, 
> HIVE-17205.14.patch, HIVE-17205.15.patch, HIVE-17205.16.patch
>
>
> make sure unbucketed tables can be marked transactional=true
> make insert/update/delete/compaction work





[jira] [Commented] (HIVE-16886) HMS log notifications may have duplicated event IDs if multiple HMS are running concurrently

2017-08-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16142585#comment-16142585
 ] 

Hive QA commented on HIVE-16886:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12883836/HIVE-16886.2.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 15 failed/errored test(s), 11005 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=61)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=100)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=235)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=235)
org.apache.hadoop.hive.ql.parse.TestExport.shouldExportImportATemporaryTable 
(batchId=218)
org.apache.hive.beeline.TestSchemaTool.testHiveMetastoreDbPropertiesTable 
(batchId=222)
org.apache.hive.beeline.TestSchemaTool.testMetastoreDbPropertiesAfterUpgrade 
(batchId=222)
org.apache.hive.beeline.TestSchemaTool.testSchemaInit (batchId=222)
org.apache.hive.beeline.TestSchemaTool.testSchemaUpgrade (batchId=222)
org.apache.hive.beeline.TestSchemaTool.testValidateLocations (batchId=222)
org.apache.hive.beeline.TestSchemaTool.testValidateNullValues (batchId=222)
org.apache.hive.beeline.TestSchemaTool.testValidateSchemaTables (batchId=222)
org.apache.hive.beeline.TestSchemaTool.testValidateSchemaVersions (batchId=222)
org.apache.hive.beeline.TestSchemaTool.testValidateSequences (batchId=222)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6550/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6550/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6550/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 15 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12883836 - PreCommit-HIVE-Build

> HMS log notifications may have duplicated event IDs if multiple HMS are 
> running concurrently
> 
>
> Key: HIVE-16886
> URL: https://issues.apache.org/jira/browse/HIVE-16886
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Metastore
>Reporter: Sergio Peña
>Assignee: anishek
> Attachments: datastore-identity-holes.diff, HIVE-16886.1.patch, 
> HIVE-16886.2.patch
>
>
> When running multiple Hive Metastore servers with DB notifications enabled, 
> I could see that notifications can be persisted with a duplicated event ID.
> This does not happen when running multiple threads in a single HMS node, due 
> to the locking acquired on the DbNotificationsLog class, but multiple HMS 
> instances can cause conflicts.
> The issue is in the ObjectStore#addNotificationEvent() method. The event ID 
> fetched from the datastore is used for the new notification, incremented in 
> the server itself, then persisted or updated back to the datastore. If two 
> servers read the same ID, then both servers write a new notification with 
> the same ID.
> The event ID is neither unique nor a primary key.
> Here's a test case using the TestObjectStore class that confirms this issue 
> (a plain-Java sketch of the race itself follows the snippet):
> {noformat}
> @Test
>   public void testConcurrentAddNotifications() throws ExecutionException, 
> InterruptedException {
> final int NUM_THREADS = 2;
> CountDownLatch countIn = new CountDownLatch(NUM_THREADS);
> CountDownLatch countOut = new CountDownLatch(1);
> HiveConf conf = new HiveConf();
> conf.setVar(HiveConf.ConfVars.METASTORE_EXPRESSION_PROXY_CLASS, 
> MockPartitionExpressionProxy.class.getName());
> ExecutorService executorService = 
> Executors.newFixedThreadPool(NUM_THREADS);
> FutureTask<Void> tasks[] = new FutureTask[NUM_THREADS];
> for (int i = 0; i < NUM_THREADS; i++) {
>   final int n = i;
>   tasks[i] = new FutureTask<Void>(new Callable<Void>() {
> @Override
> public Void call() throws Exception {
>   ObjectStore store = new ObjectStore();
>   store.setConf(conf);
>   NotificationEvent dbEvent =
>   new NotificationEvent(0, 0, 
> EventMessage.EventType.CREATE_DATABASE.toString(), "CREATE DATABASE DB" + n);
>   System.out.println("ADDING NOTIFICATION");
>   countIn.countDown();
>   countOut.await();
>   
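The (truncated) test above drives two ObjectStore instances concurrently; the underlying read-increment-write race can also be shown with a self-contained sketch in plain Java (illustrative names, not Hive code):

{code:java}
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class DuplicateEventIdDemo {
  static long nextEventId = 0; // stands in for the event-ID row in the datastore

  static long allocateId() {
    long id = nextEventId;  // both "servers" may read the same value
    Thread.yield();         // widen the race window for the demo
    nextEventId = id + 1;   // lost update: both write id + 1
    return id + 1;
  }

  public static void main(String[] args) throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(2);
    List<Callable<Long>> servers = Arrays.asList(
        DuplicateEventIdDemo::allocateId,
        DuplicateEventIdDemo::allocateId);
    for (Future<Long> f : pool.invokeAll(servers)) {
      System.out.println("event id = " + f.get()); // may print the same ID twice
    }
    pool.shutdown();
  }
}
{code}

The fix direction implied by the description is to make the allocation atomic at the datastore level (for example, a select-for-update or a uniqueness constraint with retry) instead of incrementing the ID in the server.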

[jira] [Commented] (HIVE-16811) Estimate statistics in absence of stats

2017-08-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16142586#comment-16142586
 ] 

Hive QA commented on HIVE-16811:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12883837/HIVE-16811.8.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6551/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6551/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6551/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2017-08-26 03:14:48.551
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-6551/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2017-08-26 03:14:48.554
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 262d8f9 HIVE-17392: SharedWorkOptimizer might merge TS operators 
filtered by not equivalent semijoin operators (Jesus Camacho Rodriguez, 
reviewed by Ashutosh Chauhan)
+ git clean -f -d
Removing metastore/scripts/upgrade/derby/045-HIVE-16886.derby.sql
Removing metastore/scripts/upgrade/mssql/030-HIVE-16886.mssql.sql
Removing metastore/scripts/upgrade/mysql/045-HIVE-16886.mysql.sql
Removing metastore/scripts/upgrade/oracle/045-HIVE-16886.oracle.sql
Removing metastore/scripts/upgrade/postgres/044-HIVE-16886.postgres.sql
Removing 
metastore/src/java/org/apache/hadoop/hive/metastore/tools/SQLGenerator.java
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 262d8f9 HIVE-17392: SharedWorkOptimizer might merge TS operators 
filtered by not equivalent semijoin operators (Jesus Camacho Rodriguez, 
reviewed by Ashutosh Chauhan)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2017-08-26 03:14:52.425
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
error: a/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java: No such 
file or directory
error: a/itests/src/test/resources/testconfiguration.properties: No such file 
or directory
error: 
a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/RelOptHiveTable.java: 
No such file or directory
error: 
a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java:
 No such file or directory
error: a/ql/src/java/org/apache/hadoop/hive/ql/plan/ColStatistics.java: No such 
file or directory
error: a/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java: No such 
file or directory
error: a/ql/src/test/results/clientpositive/annotate_stats_filter.q.out: No 
such file or directory
error: a/ql/src/test/results/clientpositive/annotate_stats_groupby.q.out: No 
such file or directory
error: a/ql/src/test/results/clientpositive/annotate_stats_part.q.out: No such 
file or directory
error: a/ql/src/test/results/clientpositive/annotate_stats_select.q.out: No 
such file or directory
error: a/ql/src/test/results/clientpositive/annotate_stats_table.q.out: No such 
file or directory
error: a/ql/src/test/results/clientpositive/auto_join_reordering_values.q.out: 
No such file or directory
error: a/ql/src/test/results/clientpositive/auto_join_stats.q.out: No such file 
or directory
error: a/ql/src/test/results/clientpositive/auto_join_stats2.q.out: No such 
file or directory
error: a/ql/src/test/results/clientpositive/auto_sortmerge_join_12.q.out: No 
such file or directory
error: 
a/ql/src/test/results/clientpositive/cbo_rp_annotate_stats_groupby.q.out: No 
such file or directory
error: 

[jira] [Commented] (HIVE-16949) Leak of threads from Get-Input-Paths thread pool when more than 1 used in query

2017-08-25 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16142581#comment-16142581
 ] 

Sahil Takiar commented on HIVE-16949:
-

[~vihangk1] addressed your comments.

> Leak of threads from Get-Input-Paths thread pool when more than 1 used in 
> query
> ---
>
> Key: HIVE-16949
> URL: https://issues.apache.org/jira/browse/HIVE-16949
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Birger Brunswiek
>Assignee: Sahil Takiar
> Attachments: HIVE-16949.1.patch
>
>
> The commit 
> [20210de|https://github.com/apache/hive/commit/20210dec94148c9b529132b1545df3dd7be083c3]
>  which was part of HIVE-15546 [introduced a thread 
> pool|https://github.com/apache/hive/blob/824b9c80b443dc4e2b9ad35214a23ac756e75234/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java#L3109]
>  which is not shut down upon completion of its threads. This leads to a leak 
> of threads for each query which uses more than one partition; they are not 
> removed automatically. When queries spanning multiple partitions are made, 
> the number of threads increases and is never reduced. On my machine, 
> hiveserver2 starts to get slower and slower once 10k threads are reached.
> Thread pools only shut down automatically in special circumstances (see 
> [documentation section 
> _Finalization_|https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ThreadPoolExecutor.html]).
>  This is not currently the case for the Get-Input-Paths thread pool. I would 
> add a _pool.shutdown()_ in a finally block just before returning the result 
> to make sure the threads are really shut down (a sketch follows below).
> My current workaround is to set {{hive.exec.input.listing.max.threads = 1}}. 
> This prevents the thread pool from being spawned 
> [\[1\]|https://github.com/apache/hive/blob/824b9c80b443dc4e2b9ad35214a23ac756e75234/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java#L2118]
>  
> [\[2\]|https://github.com/apache/hive/blob/824b9c80b443dc4e2b9ad35214a23ac756e75234/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java#L3107].
> The same issue probably also applies to the [Get-Input-Summary thread 
> pool|https://github.com/apache/hive/blob/824b9c80b443dc4e2b9ad35214a23ac756e75234/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java#L2193].
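A minimal sketch of the suggested fix, assuming an executor is created per listing call as described above (class and method names are illustrative, not the actual Utilities.java code):

{code:java}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class InputPathListing {
  static void getInputPaths(int maxThreads) {
    ExecutorService pool = Executors.newFixedThreadPool(maxThreads);
    try {
      // ... submit one listing task per partition and collect the results ...
    } finally {
      // Without this, the pool's worker threads outlive the query and leak.
      pool.shutdown();
    }
  }
}
{code}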





[jira] [Updated] (HIVE-17386) support LLAP workload management in HS2 (low level only)

2017-08-25 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17386:

Attachment: HIVE-17386.patch

Adding tests, rebasing on top of the committed refactoring changes, and fixing 
various scenarios.
While writing the tests, I realized that the sync to avoid parallel requests 
for the same session with different numbers is not so good; I would need to 
update that.
The non-"only" patch includes HIVE-17297.

> support LLAP workload management in HS2 (low level only)
> 
>
> Key: HIVE-17386
> URL: https://issues.apache.org/jira/browse/HIVE-17386
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17386.only.patch, HIVE-17386.patch
>
>
> This makes use of HIVE-17297 and creates building blocks for workload 
> management policies, etc.
> For now, there are no policies - a single yarn queue is designated for all 
> LLAP query AMs, and the capacity is distributed equally.





[jira] [Updated] (HIVE-17386) support LLAP workload management in HS2 (low level only)

2017-08-25 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17386:

Attachment: (was: HIVE-17386.patch)

> support LLAP workload management in HS2 (low level only)
> 
>
> Key: HIVE-17386
> URL: https://issues.apache.org/jira/browse/HIVE-17386
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17386.only.patch
>
>
> This makes use of HIVE-17297 and creates building blocks for workload 
> management policies, etc.
> For now, there are no policies - a single yarn queue is designated for all 
> LLAP query AMs, and the capacity is distributed equally.





[jira] [Updated] (HIVE-17386) support LLAP workload management in HS2 (low level only)

2017-08-25 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17386:

Attachment: HIVE-17386.only.patch

> support LLAP workload management in HS2 (low level only)
> 
>
> Key: HIVE-17386
> URL: https://issues.apache.org/jira/browse/HIVE-17386
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17386.only.patch
>
>
> This makes use of HIVE-17297 and creates building blocks for workload 
> management policies, etc.
> For now, there are no policies - a single yarn queue is designated for all 
> LLAP query AMs, and the capacity is distributed equally.





[jira] [Updated] (HIVE-17139) Conditional expressions optimization: skip the expression evaluation if the condition is not satisfied for vectorization engine.

2017-08-25 Thread Ke Jia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ke Jia updated HIVE-17139:
--
Attachment: HIVE-17139.5.patch

> Conditional expressions optimization: skip the expression evaluation if the 
> condition is not satisfied for vectorization engine.
> 
>
> Key: HIVE-17139
> URL: https://issues.apache.org/jira/browse/HIVE-17139
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ke Jia
>Assignee: Ke Jia
> Attachments: HIVE-17139.1.patch, HIVE-17139.2.patch, 
> HIVE-17139.3.patch, HIVE-17139.4.patch, HIVE-17139.5.patch
>
>
> The CASE WHEN and IF statement execution for Hive vectorization is not 
> optimal: in the current implementation, all the conditional and else 
> expressions are evaluated. The optimized approach is to update the selected 
> array of the batch parameter after the conditional expression is executed, 
> so the else expression is evaluated only for the selected rows instead of 
> all of them.
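A hedged sketch of the described approach: it uses the real VectorizedRowBatch fields ({{selected}}, {{selectedInUse}}, {{size}}) but is not the actual patch, and it omits null/isRepeating handling for brevity:

{code:java}
import org.apache.hadoop.hive.ql.exec.vector.LongColumnVector;
import org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch;

public class ConditionalNarrowing {
  // Narrow the batch to the rows where the 0/1 condition column is false,
  // so a following ELSE expression evaluates only those rows.
  static void selectWhereFalse(VectorizedRowBatch batch, LongColumnVector cond) {
    int newSize = 0;
    if (batch.selectedInUse) {
      for (int j = 0; j < batch.size; j++) {
        int i = batch.selected[j];
        if (cond.vector[i] == 0) {
          batch.selected[newSize++] = i;
        }
      }
    } else {
      for (int i = 0; i < batch.size; i++) {
        if (cond.vector[i] == 0) {
          batch.selected[newSize++] = i;
        }
      }
      batch.selectedInUse = true;
    }
    batch.size = newSize;
  }
}
{code}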





[jira] [Updated] (HIVE-17368) DBTokenStore fails to connect in Kerberos enabled remote HMS environment

2017-08-25 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-17368:
---
Attachment: HIVE-17368.02-branch-2.patch

Attaching the second version of the patch. In the current patch, when the 
session is closed, {{HiveSessionImplWithUGI.close()}} calls {{super.close()}}, 
which calls SessionState.close(). One of the steps of SessionState.close() is 
to {{unCacheDataNucleusClassLoaders}}. This code tries to create an HMS client 
to check if it is a local metastore. Since the HMS delegation token has 
already been cancelled by this time and the UGI might not be able to open a 
transport to HMS, the connection will fail and it will log an {{INFO}}-level 
error. I think this check can be simplified by just using the 
{{HiveConfUtil.isEmbeddedMetaStore}} method, which doesn't need to instantiate 
an HMS client.

If HMS is remote, this method will return false and the previous behaviour is 
maintained. If HMS is embedded, it will return true and there is no need to 
open the transport; {{ObjectStore.unCacheDataNucleusClassLoaders}} will 
execute in the same process.
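A minimal sketch of that simplified check, assuming the usual convention that HMS is embedded exactly when {{hive.metastore.uris}} is empty (illustrative helper, not the actual {{HiveConfUtil}} code):

{code:java}
import org.apache.hadoop.hive.conf.HiveConf;

public class EmbeddedMetaStoreCheck {
  // HMS is embedded when hive.metastore.uris is unset or empty; no client
  // (and therefore no transport) is needed to make this decision.
  static boolean isEmbeddedMetaStore(HiveConf conf) {
    String uris = conf.getVar(HiveConf.ConfVars.METASTOREURIS);
    return uris == null || uris.trim().isEmpty();
  }
}
{code}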

> DBTokenStore fails to connect in Kerberos enabled remote HMS environment
> 
>
> Key: HIVE-17368
> URL: https://issues.apache.org/jira/browse/HIVE-17368
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.1.0, 2.0.0, 2.1.0, 2.2.0
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-17368.01-branch-2.patch, HIVE-17368.01.patch, 
> HIVE-17368.02-branch-2.patch
>
>
> In setups where HMS is running as a remote process secured using Kerberos, 
> and when {{DBTokenStore}} is configured as the token store, HS2 Thrift API 
> calls like {{GetDelegationToken}}, {{CancelDelegationToken}} and 
> {{RenewDelegationToken}} fail with the exception trace seen below. HS2 is 
> not able to invoke the HMS APIs needed to add/remove/renew tokens from the 
> DB, since the user issuing the {{GetDelegationToken}} may not be Kerberos 
> enabled.
> E.g., Oozie submits a job on behalf of user "Joe". When Oozie opens a session 
> with HS2 it uses Oozie's principal and creates a proxy UGI with Hive. This 
> principal can establish a transport authenticated using Kerberos. It stores 
> the HMS delegation token string in the sessionConf and sessionToken. Now, 
> let's say Oozie issues a {{GetDelegationToken}} which has {{Joe}} as the 
> owner and {{oozie}} as the renewer in {{GetDelegationTokenReq}}. This API 
> call cannot instantiate an HMSClient and open a transport to HMS using the 
> HMSToken string available in the sessionConf, since DBTokenStore uses the 
> server HiveConf instead of the sessionConf. It tries to establish the 
> transport using Kerberos, and it fails since user Joe is not Kerberos enabled.
> I see the following exception trace in HS2 logs.
> {noformat}
> 2017-08-21T18:07:19,644 ERROR [HiveServer2-Handler-Pool: Thread-61] 
> transport.TSaslTransport: SASL negotiation failure
> javax.security.sasl.SaslException: GSS initiate failed
> at 
> com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
>  ~[?:1.8.0_121]
> at 
> org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94)
>  ~[libthrift-0.9.3.jar:0.9.3]
> at 
> org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271) 
> [libthrift-0.9.3.jar:0.9.3]
> at 
> org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
>  [libthrift-0.9.3.jar:0.9.3]
> at 
> org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52)
>  [hive-shims-common-2.3.0-SNAPSHOT.jar:2.3.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49)
>  [hive-shims-common-2.3.0-SNAPSHOT.jar:2.3.0-SNAPSHOT]
> at java.security.AccessController.doPrivileged(Native Method) 
> ~[?:1.8.0_121]
> at javax.security.auth.Subject.doAs(Subject.java:422) [?:1.8.0_121]
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>  [hadoop-common-2.7.2.jar:?]
> at 
> org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49)
>  [hive-shims-common-2.3.0-SNAPSHOT.jar:2.3.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:488)
>  [hive-metastore-2.3.0-SNAPSHOT.jar:2.3.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.(HiveMetaStoreClient.java:255)
>  [hive-metastore-2.3.0-SNAPSHOT.jar:2.3.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.(SessionHiveMetaStoreClient.java:70)
>  

[jira] [Updated] (HIVE-17368) DBTokenStore fails to connect in Kerberos enabled remote HMS environment

2017-08-25 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-17368:
---
Attachment: (was: HIVE-17368-branch-2.01.patch)

> DBTokenStore fails to connect in Kerberos enabled remote HMS environment
> 
>
> Key: HIVE-17368
> URL: https://issues.apache.org/jira/browse/HIVE-17368
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.1.0, 2.0.0, 2.1.0, 2.2.0
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-17368.01-branch-2.patch, HIVE-17368.01.patch, 
> HIVE-17368.02-branch-2.patch
>
>
> In setups where HMS is running as a remote process secured using Kerberos, 
> and when {{DBTokenStore}} is configured as the token store, HS2 Thrift API 
> calls like {{GetDelegationToken}}, {{CancelDelegationToken}} and 
> {{RenewDelegationToken}} fail with the exception trace seen below. HS2 is 
> not able to invoke the HMS APIs needed to add/remove/renew tokens from the 
> DB, since the user issuing the {{GetDelegationToken}} may not be Kerberos 
> enabled.
> E.g., Oozie submits a job on behalf of user "Joe". When Oozie opens a session 
> with HS2 it uses Oozie's principal and creates a proxy UGI with Hive. This 
> principal can establish a transport authenticated using Kerberos. It stores 
> the HMS delegation token string in the sessionConf and sessionToken. Now, 
> let's say Oozie issues a {{GetDelegationToken}} which has {{Joe}} as the 
> owner and {{oozie}} as the renewer in {{GetDelegationTokenReq}}. This API 
> call cannot instantiate an HMSClient and open a transport to HMS using the 
> HMSToken string available in the sessionConf, since DBTokenStore uses the 
> server HiveConf instead of the sessionConf. It tries to establish the 
> transport using Kerberos, and it fails since user Joe is not Kerberos enabled.
> I see the following exception trace in HS2 logs.
> {noformat}
> 2017-08-21T18:07:19,644 ERROR [HiveServer2-Handler-Pool: Thread-61] 
> transport.TSaslTransport: SASL negotiation failure
> javax.security.sasl.SaslException: GSS initiate failed
> at 
> com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
>  ~[?:1.8.0_121]
> at 
> org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94)
>  ~[libthrift-0.9.3.jar:0.9.3]
> at 
> org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271) 
> [libthrift-0.9.3.jar:0.9.3]
> at 
> org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
>  [libthrift-0.9.3.jar:0.9.3]
> at 
> org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52)
>  [hive-shims-common-2.3.0-SNAPSHOT.jar:2.3.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49)
>  [hive-shims-common-2.3.0-SNAPSHOT.jar:2.3.0-SNAPSHOT]
> at java.security.AccessController.doPrivileged(Native Method) 
> ~[?:1.8.0_121]
> at javax.security.auth.Subject.doAs(Subject.java:422) [?:1.8.0_121]
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>  [hadoop-common-2.7.2.jar:?]
> at 
> org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49)
>  [hive-shims-common-2.3.0-SNAPSHOT.jar:2.3.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:488)
>  [hive-metastore-2.3.0-SNAPSHOT.jar:2.3.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.(HiveMetaStoreClient.java:255)
>  [hive-metastore-2.3.0-SNAPSHOT.jar:2.3.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.(SessionHiveMetaStoreClient.java:70)
>  [hive-exec-2.3.0-SNAPSHOT.jar:2.3.0-SNAPSHOT]
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method) ~[?:1.8.0_121]
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>  [?:1.8.0_121]
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>  [?:1.8.0_121]
> at java.lang.reflect.Constructor.newInstance(Constructor.java:423) 
> [?:1.8.0_121]
> at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1699)
>  [hive-metastore-2.3.0-SNAPSHOT.jar:2.3.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.(RetryingMetaStoreClient.java:83)
>  [hive-metastore-2.3.0-SNAPSHOT.jar:2.3.0-SNAPSHOT]
> at 
> 

[jira] [Commented] (HIVE-17393) AMReporter need hearbeat every external 'AM'

2017-08-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16142546#comment-16142546
 ] 

Hive QA commented on HIVE-17393:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12883831/HIVE-17393.1.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6549/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6549/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6549/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2017-08-26 01:59:44.358
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-6549/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2017-08-26 01:59:44.361
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 262d8f9 HIVE-17392: SharedWorkOptimizer might merge TS operators 
filtered by not equivalent semijoin operators (Jesus Camacho Rodriguez, 
reviewed by Ashutosh Chauhan)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 262d8f9 HIVE-17392: SharedWorkOptimizer might merge TS operators 
filtered by not equivalent semijoin operators (Jesus Camacho Rodriguez, 
reviewed by Ashutosh Chauhan)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2017-08-26 01:59:44.849
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
Going to apply patch with: patch -p0
patching file 
llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/AMReporter.java
patching file 
llap-server/src/test/org/apache/hadoop/hive/llap/daemon/impl/comparator/TestAMReporter.java
+ [[ maven == \m\a\v\e\n ]]
+ rm -rf /data/hiveptest/working/maven/org/apache/hive
+ mvn -B clean install -DskipTests -T 4 -q 
-Dmaven.repo.local=/data/hiveptest/working/maven
DataNucleus Enhancer (version 4.1.17) for API "JDO"
DataNucleus Enhancer : Classpath
>>  /usr/share/maven/boot/plexus-classworlds-2.x.jar
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MDatabase
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MFieldSchema
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MType
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MTable
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MConstraint
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MSerDeInfo
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MOrder
ENHANCED (Persistable) : 
org.apache.hadoop.hive.metastore.model.MColumnDescriptor
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MStringList
ENHANCED (Persistable) : 
org.apache.hadoop.hive.metastore.model.MStorageDescriptor
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MPartition
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MIndex
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MRole
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MRoleMap
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MGlobalPrivilege
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MDBPrivilege
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MTablePrivilege
ENHANCED (Persistable) : 
org.apache.hadoop.hive.metastore.model.MPartitionPrivilege
ENHANCED (Persistable) : 
org.apache.hadoop.hive.metastore.model.MTableColumnPrivilege
ENHANCED (Persistable) : 

[jira] [Commented] (HIVE-17392) SharedWorkOptimizer might merge TS operators filtered by not equivalent semijoin operators

2017-08-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16142540#comment-16142540
 ] 

Hive QA commented on HIVE-17392:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12883811/HIVE-17392.02.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 11005 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=61)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=235)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=235)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6547/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6547/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6547/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12883811 - PreCommit-HIVE-Build

> SharedWorkOptimizer might merge TS operators filtered by not equivalent 
> semijoin operators
> --
>
> Key: HIVE-17392
> URL: https://issues.apache.org/jira/browse/HIVE-17392
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Fix For: 3.0.0
>
> Attachments: HIVE-17392.02.patch
>
>






[jira] [Commented] (HIVE-17307) Change the metastore to not use the metrics code in hive/common

2017-08-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16142542#comment-16142542
 ] 

Hive QA commented on HIVE-17307:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12883818/HIVE-17307.3.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6548/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6548/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6548/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2017-08-26 01:58:48.078
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-6548/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2017-08-26 01:58:48.081
+ cd apache-github-source-source
+ git fetch origin
From https://github.com/apache/hive
   7567119..262d8f9  master -> origin/master
+ git reset --hard HEAD
HEAD is now at 7567119 HIVE-17340 TxnHandler.checkLock() - reduce number of SQL 
statements (Eugene Koifman, reviewed by Wei Zheng)
+ git clean -f -d
Removing ql/src/test/queries/clientpositive/dynamic_semijoin_reduction_sw.q
Removing 
ql/src/test/results/clientpositive/llap/dynamic_semijoin_reduction_sw.q.out
+ git checkout master
Already on 'master'
Your branch is behind 'origin/master' by 1 commit, and can be fast-forwarded.
  (use "git pull" to update your local branch)
+ git reset --hard origin/master
HEAD is now at 262d8f9 HIVE-17392: SharedWorkOptimizer might merge TS operators 
filtered by not equivalent semijoin operators (Jesus Camacho Rodriguez, 
reviewed by Ashutosh Chauhan)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2017-08-26 01:58:51.716
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
error: patch failed: 
metastore/src/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java:17
error: metastore/src/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java: 
patch does not apply
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12883818 - PreCommit-HIVE-Build

> Change the metastore to not use the metrics code in hive/common
> ---
>
> Key: HIVE-17307
> URL: https://issues.apache.org/jira/browse/HIVE-17307
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
> Attachments: HIVE-17307.2.patch, HIVE-17307.3.patch, HIVE-17307.patch
>
>
> As we move code into the standalone metastore module, it cannot use the 
> metrics in hive-common.  We could copy the current Metrics interface or we 
> could change the metastore code to directly use codahale metrics.
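
To make the second option concrete: below is a minimal sketch, not the actual 
patch, of what directly using Codahale (Dropwizard) metrics could look like in 
the standalone metastore. The class name and metric names are illustrative 
assumptions.
{code:java}
import com.codahale.metrics.ConsoleReporter;
import com.codahale.metrics.MetricRegistry;
import com.codahale.metrics.Timer;

import java.util.concurrent.TimeUnit;

public class MetastoreMetricsSketch {
  private static final MetricRegistry REGISTRY = new MetricRegistry();

  public static void main(String[] args) throws Exception {
    // Report metrics once per second; JMX or file reporters plug in the same way.
    ConsoleReporter reporter = ConsoleReporter.forRegistry(REGISTRY).build();
    reporter.start(1, TimeUnit.SECONDS);

    // Count an API call and time its body, as a metastore method might.
    REGISTRY.counter("api.get_table.calls").inc();
    try (Timer.Context ignored = REGISTRY.timer("api.get_table.duration").time()) {
      Thread.sleep(50);  // stand-in for the real get_table work
    }

    Thread.sleep(1500);  // let the reporter flush once before stopping
    reporter.stop();
  }
}
{code}
Going directly to codahale would avoid the dependency on hive-common, at the 
cost of diverging from the Metrics interface used elsewhere in Hive.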



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17392) SharedWorkOptimizer might merge TS operators filtered by not equivalent semijoin operators

2017-08-25 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-17392:
---
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master, thanks for reviewing [~ashutoshc]!

> SharedWorkOptimizer might merge TS operators filtered by not equivalent 
> semijoin operators
> --
>
> Key: HIVE-17392
> URL: https://issues.apache.org/jira/browse/HIVE-17392
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Fix For: 3.0.0
>
> Attachments: HIVE-17392.02.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (HIVE-17367) IMPORT table doesn't load from data dump if a metadata-only dump was already imported.

2017-08-25 Thread Sankar Hariappan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16142035#comment-16142035
 ] 

Sankar Hariappan edited comment on HIVE-17367 at 8/26/17 12:39 AM:
---

Added 02.patch with additional handling to support retrying the import command 
after a failure.

Requesting [~thejas], [~anishek], [~daijy] to review.


was (Author: sankarh):
Added 02.patch with additional handling to support retry after failure of 
import command.

> IMPORT table doesn't load from data dump if a metadata-only dump was already 
> imported.
> --
>
> Key: HIVE-17367
> URL: https://issues.apache.org/jira/browse/HIVE-17367
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Import/Export, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17367.01.patch, HIVE-17367.02.patch
>
>
> Repl v1 creates a set of EXPORT/IMPORT commands to replicate modified data 
> (as per events) across clusters.
> For instance, let's say an insert generates 2 events such as
> ALTER_TABLE (ID: 10)
> INSERT (ID: 11)
> Each event generates a set of EXPORT and IMPORT commands.
> The ALTER_TABLE event generates a metadata-only export/import.
> The INSERT event generates a metadata+data export/import.
> As Hive always dumps the latest copy of the table during export, it sets the 
> latest notification event ID as the table's current state. So, in this 
> example, the import of metadata by the ALTER_TABLE event sets the current 
> state of the table to 11.
> Now, when we try to import the data dumped by the INSERT event, it is a noop 
> because the table's current state (11) equals the dump state (11), which in 
> turn means the data never gets replicated to the target cluster.
> So, it is necessary to allow overwrite of a table/partition if its current 
> state equals the dump state.
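
As a minimal sketch of the proposed rule (the names here are illustrative, not 
Hive's actual ReplicationSpec API): the import should proceed when the current 
state is less than *or equal to* the dump state, rather than strictly less than.
{code:java}
public final class ReplStateCheck {
  private ReplStateCheck() {}

  /** Returns true if an import carrying the given dump state should overwrite. */
  static boolean allowReplacement(long currentReplId, long dumpReplId) {
    // A strict currentReplId < dumpReplId check turns the data import into a
    // no-op once the metadata-only import has advanced the state to the same ID.
    return currentReplId <= dumpReplId;
  }

  public static void main(String[] args) {
    System.out.println(allowReplacement(11, 11)); // true: data import proceeds
    System.out.println(allowReplacement(12, 11)); // false: table is newer than the dump
  }
}
{code}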



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17100) Improve HS2 operation logs for REPL commands.

2017-08-25 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-17100:

Attachment: HIVE-17100.08.patch

Added 08.patch with the below changes.
- Changed the tag name to REPL:::
- Moved repl log task generation from ReplLoadTask into the Load classes.
- Fixed the bug where the incremental load log displayed the wrong last repl ID.

Requesting [~anishek] to review.

> Improve HS2 operation logs for REPL commands.
> -
>
> Key: HIVE-17100
> URL: https://issues.apache.org/jira/browse/HIVE-17100
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17100.01.patch, HIVE-17100.02.patch, 
> HIVE-17100.03.patch, HIVE-17100.04.patch, HIVE-17100.05.patch, 
> HIVE-17100.06.patch, HIVE-17100.07.patch, HIVE-17100.08.patch
>
>
> It is necessary to log the progress of the replication tasks in a structured 
> manner as follows.
> *+Bootstrap Dump:+*
> * At the start of bootstrap dump, will add one log with below details.
> {color:#59afe1}* Database Name
> * Dump Type (BOOTSTRAP)
> * (Estimated) Total number of tables/views to dump
> * (Estimated) Total number of functions to dump.
> * Dump Start Time{color}
> * After each table dump, will add a log as follows
> {color:#59afe1}* Table/View Name
> * Type (TABLE/VIEW/MATERIALIZED_VIEW)
> * Table dump end time
> * Table dump progress. Format is Table sequence no/(Estimated) Total number 
> of tables and views.{color}
> * After each function dump, will add a log as follows
> {color:#59afe1}* Function Name
> * Function dump end time
> * Function dump progress. Format is Function sequence no/(Estimated) Total 
> number of functions.{color}
> * After completion of all dumps, will add a log as follows to consolidate the 
> dump.
> {color:#59afe1}* Database Name.
> * Dump Type (BOOTSTRAP).
> * Dump End Time.
> * (Actual) Total number of tables/views dumped.
> * (Actual) Total number of functions dumped.
> * Dump Directory.
> * Last Repl ID of the dump.{color}
> *Note:* The actual and estimated number of tables/functions may not match if 
> any table/function is dropped while the dump is in progress.
> *+Bootstrap Load:+*
> * At the start of bootstrap load, will add one log with below details.
> {color:#59afe1}* Database Name
> * Dump directory
> * Load Type (BOOTSTRAP)
> * Total number of tables/views to load
> * Total number of functions to load.
> * Load Start Time{color}
> * After each table load, will add a log as follows
> {color:#59afe1}* Table/View Name
> * Type (TABLE/VIEW/MATERIALIZED_VIEW)
> * Table load completion time
> * Table load progress. Format is Table sequence no/Total number of tables and 
> views.{color}
> * After each function load, will add a log as follows
> {color:#59afe1}* Function Name
> * Function load completion time
> * Function load progress. Format is Function sequence no/Total number of 
> functions.{color}
> * After completion of all dumps, will add a log as follows to consolidate the 
> load.
> {color:#59afe1}* Database Name.
> * Load Type (BOOTSTRAP).
> * Load End Time.
> * Total number of tables/views loaded.
> * Total number of functions loaded.
> * Last Repl ID of the loaded database.{color}
> *+Incremental Dump:+*
> * At the start of database dump, will add one log with below details.
> {color:#59afe1}* Database Name
> * Dump Type (INCREMENTAL)
> * (Estimated) Total number of events to dump.
> * Dump Start Time{color}
> * After each event dump, will add a log as follows
> {color:#59afe1}* Event ID
> * Event Type (CREATE_TABLE, DROP_TABLE, ALTER_TABLE, INSERT etc)
> * Event dump end time
> * Event dump progress. Format is Event sequence no/ (Estimated) Total number 
> of events.{color}
> * After completion of all event dumps, will add a log as follows.
> {color:#59afe1}* Database Name.
> * Dump Type (INCREMENTAL).
> * Dump End Time.
> * (Actual) Total number of events dumped.
> * Dump Directory.
> * Last Repl ID of the dump.{color}
> *Note:* The estimated number of events can differ significantly from the 
> actual number, as we don't know the number of events upfront until we read 
> from the metastore NotificationEvents table.
> *+Incremental Load:+*
> * At the start of incremental load, will add one log with below details.
> {color:#59afe1}* Target Database Name 
> * Dump directory
> * Load Type (INCREMENTAL)
> * Total number of events to load
> * Load Start Time{color}
> * After each event load, will add a log as follows
> {color:#59afe1}* Event ID
> * Event Type (CREATE_TABLE, DROP_TABLE, ALTER_TABLE, INSERT etc)
> * Event load end time
> * Event load progress. Format is Event sequence no/ Total number of 
> events.{color}
> * After completion of all event loads, will add a log as follows to 
> consolidate the load.
> {color:#59afe1}* Target Database Name.
> * Load Type (INCREMENTAL).
> * Load End Time.
> * Total number of events loaded.
> * Last Repl ID of the loaded database.{color}
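
As an illustration of the intent (ReplLoggerSketch and the exact REPL:: line 
format below are invented for this example, not Hive's actual classes), one 
such structured per-table log line could be produced like this:
{code:java}
import java.time.Instant;

public class ReplLoggerSketch {
  static void logTableDump(String table, String type, int seqNo, int estimatedTotal) {
    // One structured line per table dump: name, type, end time, and progress.
    System.out.printf("REPL::TABLE_DUMP: {table: %s, type: %s, dumpEndTime: %s, progress: %d/%d}%n",
        table, type, Instant.now(), seqNo, estimatedTotal);
  }

  public static void main(String[] args) {
    logTableDump("default.sales", "TABLE", 3, 120);
  }
}
{code}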

[jira] [Updated] (HIVE-17100) Improve HS2 operation logs for REPL commands.

2017-08-25 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-17100:

Status: Patch Available  (was: Open)

> Improve HS2 operation logs for REPL commands.
> -
>
> Key: HIVE-17100
> URL: https://issues.apache.org/jira/browse/HIVE-17100
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17100.01.patch, HIVE-17100.02.patch, 
> HIVE-17100.03.patch, HIVE-17100.04.patch, HIVE-17100.05.patch, 
> HIVE-17100.06.patch, HIVE-17100.07.patch, HIVE-17100.08.patch
>
>
> It is necessary to log the progress of the replication tasks in a structured 
> manner as follows.
> *+Bootstrap Dump:+*
> * At the start of bootstrap dump, will add one log with below details.
> {color:#59afe1}* Database Name
> * Dump Type (BOOTSTRAP)
> * (Estimated) Total number of tables/views to dump
> * (Estimated) Total number of functions to dump.
> * Dump Start Time{color}
> * After each table dump, will add a log as follows
> {color:#59afe1}* Table/View Name
> * Type (TABLE/VIEW/MATERIALIZED_VIEW)
> * Table dump end time
> * Table dump progress. Format is Table sequence no/(Estimated) Total number 
> of tables and views.{color}
> * After each function dump, will add a log as follows
> {color:#59afe1}* Function Name
> * Function dump end time
> * Function dump progress. Format is Function sequence no/(Estimated) Total 
> number of functions.{color}
> * After completion of all dumps, will add a log as follows to consolidate the 
> dump.
> {color:#59afe1}* Database Name.
> * Dump Type (BOOTSTRAP).
> * Dump End Time.
> * (Actual) Total number of tables/views dumped.
> * (Actual) Total number of functions dumped.
> * Dump Directory.
> * Last Repl ID of the dump.{color}
> *Note:* The actual and estimated number of tables/functions may not match if 
> any table/function is dropped while the dump is in progress.
> *+Bootstrap Load:+*
> * At the start of bootstrap load, will add one log with below details.
> {color:#59afe1}* Database Name
> * Dump directory
> * Load Type (BOOTSTRAP)
> * Total number of tables/views to load
> * Total number of functions to load.
> * Load Start Time{color}
> * After each table load, will add a log as follows
> {color:#59afe1}* Table/View Name
> * Type (TABLE/VIEW/MATERIALIZED_VIEW)
> * Table load completion time
> * Table load progress. Format is Table sequence no/Total number of tables and 
> views.{color}
> * After each function load, will add a log as follows
> {color:#59afe1}* Function Name
> * Function load completion time
> * Function load progress. Format is Function sequence no/Total number of 
> functions.{color}
> * After completion of all dumps, will add a log as follows to consolidate the 
> load.
> {color:#59afe1}* Database Name.
> * Load Type (BOOTSTRAP).
> * Load End Time.
> * Total number of tables/views loaded.
> * Total number of functions loaded.
> * Last Repl ID of the loaded database.{color}
> *+Incremental Dump:+*
> * At the start of database dump, will add one log with below details.
> {color:#59afe1}* Database Name
> * Dump Type (INCREMENTAL)
> * (Estimated) Total number of events to dump.
> * Dump Start Time{color}
> * After each event dump, will add a log as follows
> {color:#59afe1}* Event ID
> * Event Type (CREATE_TABLE, DROP_TABLE, ALTER_TABLE, INSERT etc)
> * Event dump end time
> * Event dump progress. Format is Event sequence no/ (Estimated) Total number 
> of events.{color}
> * After completion of all event dumps, will add a log as follows.
> {color:#59afe1}* Database Name.
> * Dump Type (INCREMENTAL).
> * Dump End Time.
> * (Actual) Total number of events dumped.
> * Dump Directory.
> * Last Repl ID of the dump.{color}
> *Note:* The estimated number of events can differ significantly from the 
> actual number, as we don't know the number of events upfront until we read 
> from the metastore NotificationEvents table.
> *+Incremental Load:+*
> * At the start of incremental load, will add one log with below details.
> {color:#59afe1}* Target Database Name 
> * Dump directory
> * Load Type (INCREMENTAL)
> * Total number of events to load
> * Load Start Time{color}
> * After each event load, will add a log as follows
> {color:#59afe1}* Event ID
> * Event Type (CREATE_TABLE, DROP_TABLE, ALTER_TABLE, INSERT etc)
> * Event load end time
> * Event load progress. Format is Event sequence no/ Total number of 
> events.{color}
> * After completion of all event loads, will add a log as follows to 
> consolidate the load.
> {color:#59afe1}* Target Database Name.
> * Load Type (INCREMENTAL).
> * Load End Time.
> * Total number of events loaded.
> * Last Repl ID of the loaded database.{color}

[jira] [Updated] (HIVE-16811) Estimate statistics in absence of stats

2017-08-25 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-16811:
---
Status: Patch Available  (was: Open)

> Estimate statistics in absence of stats
> ---
>
> Key: HIVE-16811
> URL: https://issues.apache.org/jira/browse/HIVE-16811
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-16811.1.patch, HIVE-16811.2.patch, 
> HIVE-16811.3.patch, HIVE-16811.4.patch, HIVE-16811.5.patch, 
> HIVE-16811.6.patch, HIVE-16811.7.patch, HIVE-16811.8.patch
>
>
> Currently, join ordering completely bails out in the absence of statistics, 
> and this can lead to bad plans such as cross joins.
> E.g., the following select query will produce a cross join.
> {code:sql}
> create table supplier (S_SUPPKEY INT, S_NAME STRING, S_ADDRESS STRING, 
> S_NATIONKEY INT, 
> S_PHONE STRING, S_ACCTBAL DOUBLE, S_COMMENT STRING);
> CREATE TABLE lineitem (L_ORDERKEY INT,
> L_PARTKEY INT,
> L_SUPPKEY INT,
> L_LINENUMBER INT,
> L_QUANTITY DOUBLE,
> L_EXTENDEDPRICE DOUBLE,
> L_DISCOUNT DOUBLE,
> L_TAX DOUBLE,
> L_RETURNFLAG STRING,
> L_LINESTATUS STRING,
> l_shipdate STRING,
> L_COMMITDATE STRING,
> L_RECEIPTDATE STRING,
> L_SHIPINSTRUCT STRING,
> L_SHIPMODE STRING,
> L_COMMENT STRING) partitioned by (dl int)
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY '|';
> CREATE TABLE part(
> p_partkey INT,
> p_name STRING,
> p_mfgr STRING,
> p_brand STRING,
> p_type STRING,
> p_size INT,
> p_container STRING,
> p_retailprice DOUBLE,
> p_comment STRING
> );
> explain select count(1) from part,supplier,lineitem where p_partkey = 
> l_partkey and s_suppkey = l_suppkey;
> {code}
> Estimating stats will prevent the join ordering algorithm from bailing out 
> and help it come up with a join order that is at least better than a cross 
> join.
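
A minimal sketch of the kind of fallback the issue argues for (the constants 
and names are illustrative assumptions, not the actual implementation): when 
row counts are missing, derive a crude estimate from the raw data size so the 
join-ordering algorithm has something other than zero to work with.
{code:java}
public final class BasicStatsEstimator {
  private static final long DEFAULT_AVG_ROW_SIZE_BYTES = 100;

  private BasicStatsEstimator() {}

  static long estimateRowCount(long totalFileSizeBytes, long avgRowSizeBytes) {
    if (totalFileSizeBytes <= 0) {
      return 1; // never estimate zero rows, which makes every join look free
    }
    long rowSize = avgRowSizeBytes > 0 ? avgRowSizeBytes : DEFAULT_AVG_ROW_SIZE_BYTES;
    return Math.max(1, totalFileSizeBytes / rowSize);
  }

  public static void main(String[] args) {
    // 1 GiB of data with unknown row width -> ~10.7M rows under the default.
    System.out.println(estimateRowCount(1L << 30, 0));
  }
}
{code}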



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17392) SharedWorkOptimizer might merge TS operators filtered by not equivalent semijoin operators

2017-08-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16142495#comment-16142495
 ] 

Hive QA commented on HIVE-17392:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12883811/HIVE-17392.02.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 11005 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=61)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=235)
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testParallelCompilation (batchId=228)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6546/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6546/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6546/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12883811 - PreCommit-HIVE-Build

> SharedWorkOptimizer might merge TS operators filtered by not equivalent 
> semijoin operators
> --
>
> Key: HIVE-17392
> URL: https://issues.apache.org/jira/browse/HIVE-17392
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-17392.02.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (HIVE-16811) Estimate statistics in absence of stats

2017-08-25 Thread Vineet Garg (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16142496#comment-16142496
 ] 

Vineet Garg edited comment on HIVE-16811 at 8/26/17 12:36 AM:
--

Latest patch addresses review comments


was (Author: vgarg):
Addresses review comments

> Estimate statistics in absence of stats
> ---
>
> Key: HIVE-16811
> URL: https://issues.apache.org/jira/browse/HIVE-16811
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-16811.1.patch, HIVE-16811.2.patch, 
> HIVE-16811.3.patch, HIVE-16811.4.patch, HIVE-16811.5.patch, 
> HIVE-16811.6.patch, HIVE-16811.7.patch, HIVE-16811.8.patch
>
>
> Currently, join ordering completely bails out in the absence of statistics, 
> and this can lead to bad plans such as cross joins.
> E.g., the following select query will produce a cross join.
> {code:sql}
> create table supplier (S_SUPPKEY INT, S_NAME STRING, S_ADDRESS STRING, 
> S_NATIONKEY INT, 
> S_PHONE STRING, S_ACCTBAL DOUBLE, S_COMMENT STRING);
> CREATE TABLE lineitem (L_ORDERKEY INT,
> L_PARTKEY INT,
> L_SUPPKEY INT,
> L_LINENUMBER INT,
> L_QUANTITY DOUBLE,
> L_EXTENDEDPRICE DOUBLE,
> L_DISCOUNT DOUBLE,
> L_TAX DOUBLE,
> L_RETURNFLAG STRING,
> L_LINESTATUS STRING,
> l_shipdate STRING,
> L_COMMITDATE STRING,
> L_RECEIPTDATE STRING,
> L_SHIPINSTRUCT STRING,
> L_SHIPMODE STRING,
> L_COMMENT STRING) partitioned by (dl int)
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY '|';
> CREATE TABLE part(
> p_partkey INT,
> p_name STRING,
> p_mfgr STRING,
> p_brand STRING,
> p_type STRING,
> p_size INT,
> p_container STRING,
> p_retailprice DOUBLE,
> p_comment STRING
> );
> explain select count(1) from part,supplier,lineitem where p_partkey = 
> l_partkey and s_suppkey = l_suppkey;
> {code}
> Estimating stats will prevent the join ordering algorithm from bailing out 
> and help it come up with a join order that is at least better than a cross 
> join.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16811) Estimate statistics in absence of stats

2017-08-25 Thread Vineet Garg (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16142496#comment-16142496
 ] 

Vineet Garg commented on HIVE-16811:


Addresses review comments

> Estimate statistics in absence of stats
> ---
>
> Key: HIVE-16811
> URL: https://issues.apache.org/jira/browse/HIVE-16811
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-16811.1.patch, HIVE-16811.2.patch, 
> HIVE-16811.3.patch, HIVE-16811.4.patch, HIVE-16811.5.patch, 
> HIVE-16811.6.patch, HIVE-16811.7.patch, HIVE-16811.8.patch
>
>
> Currently, join ordering completely bails out in the absence of statistics, 
> and this can lead to bad plans such as cross joins.
> E.g., the following select query will produce a cross join.
> {code:sql}
> create table supplier (S_SUPPKEY INT, S_NAME STRING, S_ADDRESS STRING, 
> S_NATIONKEY INT, 
> S_PHONE STRING, S_ACCTBAL DOUBLE, S_COMMENT STRING);
> CREATE TABLE lineitem (L_ORDERKEY INT,
> L_PARTKEY INT,
> L_SUPPKEY INT,
> L_LINENUMBER INT,
> L_QUANTITY DOUBLE,
> L_EXTENDEDPRICE DOUBLE,
> L_DISCOUNT DOUBLE,
> L_TAX DOUBLE,
> L_RETURNFLAG STRING,
> L_LINESTATUS STRING,
> l_shipdate STRING,
> L_COMMITDATE STRING,
> L_RECEIPTDATE STRING,
> L_SHIPINSTRUCT STRING,
> L_SHIPMODE STRING,
> L_COMMENT STRING) partitioned by (dl int)
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY '|';
> CREATE TABLE part(
> p_partkey INT,
> p_name STRING,
> p_mfgr STRING,
> p_brand STRING,
> p_type STRING,
> p_size INT,
> p_container STRING,
> p_retailprice DOUBLE,
> p_comment STRING
> );
> explain select count(1) from part,supplier,lineitem where p_partkey = 
> l_partkey and s_suppkey = l_suppkey;
> {code}
> Estimating stats will prevent the join ordering algorithm from bailing out 
> and help it come up with a join order that is at least better than a cross 
> join.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16811) Estimate statistics in absence of stats

2017-08-25 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-16811:
---
Status: Open  (was: Patch Available)

> Estimate statistics in absence of stats
> ---
>
> Key: HIVE-16811
> URL: https://issues.apache.org/jira/browse/HIVE-16811
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-16811.1.patch, HIVE-16811.2.patch, 
> HIVE-16811.3.patch, HIVE-16811.4.patch, HIVE-16811.5.patch, 
> HIVE-16811.6.patch, HIVE-16811.7.patch, HIVE-16811.8.patch
>
>
> Currently, join ordering completely bails out in the absence of statistics, 
> and this can lead to bad plans such as cross joins.
> E.g., the following select query will produce a cross join.
> {code:sql}
> create table supplier (S_SUPPKEY INT, S_NAME STRING, S_ADDRESS STRING, 
> S_NATIONKEY INT, 
> S_PHONE STRING, S_ACCTBAL DOUBLE, S_COMMENT STRING);
> CREATE TABLE lineitem (L_ORDERKEY INT,
> L_PARTKEY INT,
> L_SUPPKEY INT,
> L_LINENUMBER INT,
> L_QUANTITY DOUBLE,
> L_EXTENDEDPRICE DOUBLE,
> L_DISCOUNT DOUBLE,
> L_TAX DOUBLE,
> L_RETURNFLAG STRING,
> L_LINESTATUS STRING,
> l_shipdate STRING,
> L_COMMITDATE STRING,
> L_RECEIPTDATE STRING,
> L_SHIPINSTRUCT STRING,
> L_SHIPMODE STRING,
> L_COMMENT STRING) partitioned by (dl int)
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY '|';
> CREATE TABLE part(
> p_partkey INT,
> p_name STRING,
> p_mfgr STRING,
> p_brand STRING,
> p_type STRING,
> p_size INT,
> p_container STRING,
> p_retailprice DOUBLE,
> p_comment STRING
> );
> explain select count(1) from part,supplier,lineitem where p_partkey = 
> l_partkey and s_suppkey = l_suppkey;
> {code}
> Estimating stats will prevent the join ordering algorithm from bailing out 
> and help it come up with a join order that is at least better than a cross 
> join.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16811) Estimate statistics in absence of stats

2017-08-25 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-16811:
---
Attachment: HIVE-16811.8.patch

> Estimate statistics in absence of stats
> ---
>
> Key: HIVE-16811
> URL: https://issues.apache.org/jira/browse/HIVE-16811
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-16811.1.patch, HIVE-16811.2.patch, 
> HIVE-16811.3.patch, HIVE-16811.4.patch, HIVE-16811.5.patch, 
> HIVE-16811.6.patch, HIVE-16811.7.patch, HIVE-16811.8.patch
>
>
> Currently, join ordering completely bails out in the absence of statistics, 
> and this can lead to bad plans such as cross joins.
> E.g., the following select query will produce a cross join.
> {code:sql}
> create table supplier (S_SUPPKEY INT, S_NAME STRING, S_ADDRESS STRING, 
> S_NATIONKEY INT, 
> S_PHONE STRING, S_ACCTBAL DOUBLE, S_COMMENT STRING);
> CREATE TABLE lineitem (L_ORDERKEY INT,
> L_PARTKEY INT,
> L_SUPPKEY INT,
> L_LINENUMBER INT,
> L_QUANTITY DOUBLE,
> L_EXTENDEDPRICE DOUBLE,
> L_DISCOUNT DOUBLE,
> L_TAX DOUBLE,
> L_RETURNFLAG STRING,
> L_LINESTATUS STRING,
> l_shipdate STRING,
> L_COMMITDATE STRING,
> L_RECEIPTDATE STRING,
> L_SHIPINSTRUCT STRING,
> L_SHIPMODE STRING,
> L_COMMENT STRING) partitioned by (dl int)
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY '|';
> CREATE TABLE part(
> p_partkey INT,
> p_name STRING,
> p_mfgr STRING,
> p_brand STRING,
> p_type STRING,
> p_size INT,
> p_container STRING,
> p_retailprice DOUBLE,
> p_comment STRING
> );
> explain select count(1) from part,supplier,lineitem where p_partkey = 
> l_partkey and s_suppkey = l_suppkey;
> {code}
> Estimating stats will prevent the join ordering algorithm from bailing out 
> and help it come up with a join order that is at least better than a cross 
> join.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16886) HMS log notifications may have duplicated event IDs if multiple HMS are running concurrently

2017-08-25 Thread anishek (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

anishek updated HIVE-16886:
---
Attachment: HIVE-16886.2.patch

> HMS log notifications may have duplicated event IDs if multiple HMS are 
> running concurrently
> 
>
> Key: HIVE-16886
> URL: https://issues.apache.org/jira/browse/HIVE-16886
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Metastore
>Reporter: Sergio Peña
>Assignee: anishek
> Attachments: datastore-identity-holes.diff, HIVE-16886.1.patch, 
> HIVE-16886.2.patch
>
>
> When running multiple Hive Metastore servers with DB notifications enabled, I 
> could see that notifications can be persisted with a duplicated event ID.
> This does not happen when running multiple threads in a single HMS node, due 
> to the locking acquired on the DbNotificationsLog class, but multiple HMS 
> instances could cause conflicts.
> The issue is in the ObjectStore#addNotificationEvent() method. The event ID 
> fetched from the datastore is used for the new notification, incremented in 
> the server itself, then persisted or updated back to the datastore. If 2 
> servers read the same ID, then these 2 servers write a new notification with 
> the same ID.
> The event ID is neither unique nor a primary key.
> Here's a test case using the TestObjectStore class that confirms this issue:
> {noformat}
> @Test
>   public void testConcurrentAddNotifications() throws ExecutionException, 
> InterruptedException {
> final int NUM_THREADS = 2;
> CountDownLatch countIn = new CountDownLatch(NUM_THREADS);
> CountDownLatch countOut = new CountDownLatch(1);
> HiveConf conf = new HiveConf();
> conf.setVar(HiveConf.ConfVars.METASTORE_EXPRESSION_PROXY_CLASS, 
> MockPartitionExpressionProxy.class.getName());
> ExecutorService executorService = 
> Executors.newFixedThreadPool(NUM_THREADS);
> FutureTask tasks[] = new FutureTask[NUM_THREADS];
> for (int i = 0; i < NUM_THREADS; ++i) {
>   final int n = i;
>   tasks[i] = new FutureTask(new Callable() {
> @Override
> public Void call() throws Exception {
>   ObjectStore store = new ObjectStore();
>   store.setConf(conf);
>   NotificationEvent dbEvent =
>   new NotificationEvent(0, 0, 
> EventMessage.EventType.CREATE_DATABASE.toString(), "CREATE DATABASE DB" + n);
>   System.out.println("ADDING NOTIFICATION");
>   countIn.countDown();
>   countOut.await();
>   store.addNotificationEvent(dbEvent);
>   System.out.println("FINISH NOTIFICATION");
>   return null;
> }
>   });
>   executorService.execute(tasks[i]);
> }
> countIn.await();
> countOut.countDown();
> for (int i = 0; i < NUM_THREADS; ++i) {
>   tasks[i].get();
> }
> NotificationEventResponse eventResponse = 
> objectStore.getNextNotification(new NotificationEventRequest());
> Assert.assertEquals(2, eventResponse.getEventsSize());
> Assert.assertEquals(1, eventResponse.getEvents().get(0).getEventId());
> // This fails because the next notification has an event ID = 1
> Assert.assertEquals(2, eventResponse.getEvents().get(1).getEventId());
>   }
> {noformat}
> The last assertion fails because it finds an event ID of 1 instead of the 
> expected 2.
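
One possible remedy, sketched with plain JDBC (the table and column names 
follow the general shape of the metastore's notification sequence but are 
assumptions here, and error handling is elided): lock the sequence row with 
SELECT ... FOR UPDATE so the read-increment-write becomes atomic across HMS 
instances.
{code:java}
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public final class NotificationIdAllocator {
  private NotificationIdAllocator() {}

  static long nextEventId(Connection conn) throws Exception {
    conn.setAutoCommit(false);
    try (PreparedStatement select = conn.prepareStatement(
             "SELECT NEXT_EVENT_ID FROM NOTIFICATION_SEQUENCE FOR UPDATE");
         ResultSet rs = select.executeQuery()) {
      rs.next();
      long id = rs.getLong(1);
      try (PreparedStatement update = conn.prepareStatement(
               "UPDATE NOTIFICATION_SEQUENCE SET NEXT_EVENT_ID = ?")) {
        update.setLong(1, id + 1);
        update.executeUpdate();
      }
      conn.commit(); // releases the row lock; a concurrent HMS now reads id + 1
      return id;
    }
  }
}
{code}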



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Work stopped] (HIVE-17100) Improve HS2 operation logs for REPL commands.

2017-08-25 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-17100 stopped by Sankar Hariappan.
---
> Improve HS2 operation logs for REPL commands.
> -
>
> Key: HIVE-17100
> URL: https://issues.apache.org/jira/browse/HIVE-17100
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17100.01.patch, HIVE-17100.02.patch, 
> HIVE-17100.03.patch, HIVE-17100.04.patch, HIVE-17100.05.patch, 
> HIVE-17100.06.patch, HIVE-17100.07.patch
>
>
> It is necessary to log the progress of the replication tasks in a structured 
> manner as follows.
> *+Bootstrap Dump:+*
> * At the start of bootstrap dump, will add one log with below details.
> {color:#59afe1}* Database Name
> * Dump Type (BOOTSTRAP)
> * (Estimated) Total number of tables/views to dump
> * (Estimated) Total number of functions to dump.
> * Dump Start Time{color}
> * After each table dump, will add a log as follows
> {color:#59afe1}* Table/View Name
> * Type (TABLE/VIEW/MATERIALIZED_VIEW)
> * Table dump end time
> * Table dump progress. Format is Table sequence no/(Estimated) Total number 
> of tables and views.{color}
> * After each function dump, will add a log as follows
> {color:#59afe1}* Function Name
> * Function dump end time
> * Function dump progress. Format is Function sequence no/(Estimated) Total 
> number of functions.{color}
> * After completion of all dumps, will add a log as follows to consolidate the 
> dump.
> {color:#59afe1}* Database Name.
> * Dump Type (BOOTSTRAP).
> * Dump End Time.
> * (Actual) Total number of tables/views dumped.
> * (Actual) Total number of functions dumped.
> * Dump Directory.
> * Last Repl ID of the dump.{color}
> *Note:* The actual and estimated number of tables/functions may not match if 
> any table/function is dropped while the dump is in progress.
> *+Bootstrap Load:+*
> * At the start of bootstrap load, will add one log with below details.
> {color:#59afe1}* Database Name
> * Dump directory
> * Load Type (BOOTSTRAP)
> * Total number of tables/views to load
> * Total number of functions to load.
> * Load Start Time{color}
> * After each table load, will add a log as follows
> {color:#59afe1}* Table/View Name
> * Type (TABLE/VIEW/MATERIALIZED_VIEW)
> * Table load completion time
> * Table load progress. Format is Table sequence no/Total number of tables and 
> views.{color}
> * After each function load, will add a log as follows
> {color:#59afe1}* Function Name
> * Function load completion time
> * Function load progress. Format is Function sequence no/Total number of 
> functions.{color}
> * After completion of all dumps, will add a log as follows to consolidate the 
> load.
> {color:#59afe1}* Database Name.
> * Load Type (BOOTSTRAP).
> * Load End Time.
> * Total number of tables/views loaded.
> * Total number of functions loaded.
> * Last Repl ID of the loaded database.{color}
> *+Incremental Dump:+*
> * At the start of database dump, will add one log with below details.
> {color:#59afe1}* Database Name
> * Dump Type (INCREMENTAL)
> * (Estimated) Total number of events to dump.
> * Dump Start Time{color}
> * After each event dump, will add a log as follows
> {color:#59afe1}* Event ID
> * Event Type (CREATE_TABLE, DROP_TABLE, ALTER_TABLE, INSERT etc)
> * Event dump end time
> * Event dump progress. Format is Event sequence no/ (Estimated) Total number 
> of events.{color}
> * After completion of all event dumps, will add a log as follows.
> {color:#59afe1}* Database Name.
> * Dump Type (INCREMENTAL).
> * Dump End Time.
> * (Actual) Total number of events dumped.
> * Dump Directory.
> * Last Repl ID of the dump.{color}
> *Note:* The estimated number of events can differ significantly from the 
> actual number, as we don't know the number of events upfront until we read 
> from the metastore NotificationEvents table.
> *+Incremental Load:+*
> * At the start of incremental load, will add one log with below details.
> {color:#59afe1}* Target Database Name 
> * Dump directory
> * Load Type (INCREMENTAL)
> * Total number of events to load
> * Load Start Time{color}
> * After each event load, will add a log as follows
> {color:#59afe1}* Event ID
> * Event Type (CREATE_TABLE, DROP_TABLE, ALTER_TABLE, INSERT etc)
> * Event load end time
> * Event load progress. Format is Event sequence no/ Total number of 
> events.{color}
> * After completion of all event loads, will add a log as follows to 
> consolidate the load.
> {color:#59afe1}* Target Database Name.
> * Load Type (INCREMENTAL).
> * Load End Time.
> * Total number of events loaded.
> * Last Repl ID of the loaded database.{color}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

[jira] [Commented] (HIVE-17205) add functional support

2017-08-25 Thread Wei Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16142453#comment-16142453
 ] 

Wei Zheng commented on HIVE-17205:
--

The patch looks good. +1
Thanks Eugene for relaxing the bucketing restriction!

> add functional support
> --
>
> Key: HIVE-17205
> URL: https://issues.apache.org/jira/browse/HIVE-17205
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-17205.01.patch, HIVE-17205.02.patch, 
> HIVE-17205.03.patch, HIVE-17205.09.patch, HIVE-17205.10.patch, 
> HIVE-17205.11.patch, HIVE-17205.12.patch, HIVE-17205.13.patch, 
> HIVE-17205.14.patch, HIVE-17205.15.patch, HIVE-17205.16.patch
>
>
> Make sure unbucketed tables can be marked transactional=true.
> Make insert/update/delete/compaction work.
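
A small end-to-end check of the behavior this sub-task enables might look as 
follows (the JDBC URL is a placeholder for a real HiveServer2 endpoint):
{code:java}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class UnbucketedAcidExample {
  public static void main(String[] args) throws Exception {
    try (Connection conn = DriverManager.getConnection("jdbc:hive2://localhost:10000/default");
         Statement stmt = conn.createStatement()) {
      // No CLUSTERED BY clause: the table is unbucketed.
      stmt.execute("CREATE TABLE acid_nobuckets (id INT, msg STRING) "
          + "STORED AS ORC TBLPROPERTIES ('transactional'='true')");
      stmt.execute("INSERT INTO acid_nobuckets VALUES (1, 'a'), (2, 'b')");
      stmt.execute("UPDATE acid_nobuckets SET msg = 'c' WHERE id = 2");
      stmt.execute("DELETE FROM acid_nobuckets WHERE id = 1");
    }
  }
}
{code}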



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17393) AMReporter needs to heartbeat every external 'AM'

2017-08-25 Thread Zhiyuan Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhiyuan Yang updated HIVE-17393:

Status: Patch Available  (was: Open)

> AMReporter needs to heartbeat every external 'AM'
> 
>
> Key: HIVE-17393
> URL: https://issues.apache.org/jira/browse/HIVE-17393
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Reporter: Zhiyuan Yang
>Assignee: Zhiyuan Yang
> Fix For: 3.0.0
>
> Attachments: HIVE-17393.1.patch
>
>
> AMReporter only remembers the first AM that submitted the query and 
> heartbeats to it. In the case of external clients, there might be multiple 
> 'AM's, and every one of them needs node heartbeats.
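
A minimal sketch of the direction this implies (all names invented for 
illustration, not the actual AMReporter code): keep a registry of every AM 
endpoint that has submitted work, and heartbeat to each of them.
{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class MultiAmHeartbeater {
  private final Map<String, Runnable> amHeartbeats = new ConcurrentHashMap<>();
  private final ScheduledExecutorService scheduler =
      Executors.newSingleThreadScheduledExecutor();

  void registerAm(String amAddress) {
    // Every distinct AM endpoint gets its own node heartbeat, not just the first.
    amHeartbeats.putIfAbsent(amAddress,
        () -> System.out.println("node heartbeat -> " + amAddress));
  }

  void start(long intervalMs) {
    scheduler.scheduleAtFixedRate(
        () -> amHeartbeats.values().forEach(Runnable::run),
        0, intervalMs, TimeUnit.MILLISECONDS);
  }

  public static void main(String[] args) throws InterruptedException {
    MultiAmHeartbeater reporter = new MultiAmHeartbeater();
    reporter.registerAm("am-1:30002");
    reporter.registerAm("am-2:30002"); // an external client's second 'AM'
    reporter.start(500);
    Thread.sleep(1200);
    reporter.scheduler.shutdownNow();
  }
}
{code}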



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17393) AMReporter needs to heartbeat every external 'AM'

2017-08-25 Thread Zhiyuan Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhiyuan Yang updated HIVE-17393:

Attachment: HIVE-17393.1.patch

> AMReporter needs to heartbeat every external 'AM'
> 
>
> Key: HIVE-17393
> URL: https://issues.apache.org/jira/browse/HIVE-17393
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Reporter: Zhiyuan Yang
>Assignee: Zhiyuan Yang
> Fix For: 3.0.0
>
> Attachments: HIVE-17393.1.patch
>
>
> AMReporter only remembers the first AM that submitted the query and 
> heartbeats to it. In the case of external clients, there might be multiple 
> 'AM's, and every one of them needs node heartbeats.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17297) allow AM to use LLAP guaranteed tasks

2017-08-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16142396#comment-16142396
 ] 

Hive QA commented on HIVE-17297:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12883788/HIVE-17297.02.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 11012 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[input13] (batchId=73)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mapjoin1] (batchId=5)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnstats_part_coltype]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[mergejoin] 
(batchId=158)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=235)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6545/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6545/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6545/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12883788 - PreCommit-HIVE-Build

> allow AM to use LLAP guaranteed tasks
> -
>
> Key: HIVE-17297
> URL: https://issues.apache.org/jira/browse/HIVE-17297
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17297.01.nogen.patch, HIVE-17297.01.patch, 
> HIVE-17297.02.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17393) AMReporter needs to heartbeat every external 'AM'

2017-08-25 Thread Zhiyuan Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhiyuan Yang reassigned HIVE-17393:
---


> AMReporter needs to heartbeat every external 'AM'
> 
>
> Key: HIVE-17393
> URL: https://issues.apache.org/jira/browse/HIVE-17393
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Reporter: Zhiyuan Yang
>Assignee: Zhiyuan Yang
>
> AMReporter only remembers the first AM that submitted the query and 
> heartbeats to it. In the case of external clients, there might be multiple 
> 'AM's, and every one of them needs node heartbeats.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17393) AMReporter needs to heartbeat every external 'AM'

2017-08-25 Thread Zhiyuan Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhiyuan Yang updated HIVE-17393:

Fix Version/s: 3.0.0

> AMReporter needs to heartbeat every external 'AM'
> 
>
> Key: HIVE-17393
> URL: https://issues.apache.org/jira/browse/HIVE-17393
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Reporter: Zhiyuan Yang
>Assignee: Zhiyuan Yang
> Fix For: 3.0.0
>
>
> AMReporter only remembers the first AM that submitted the query and 
> heartbeats to it. In the case of external clients, there might be multiple 
> 'AM's, and every one of them needs node heartbeats.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17307) Change the metastore to not use the metrics code in hive/common

2017-08-25 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16142362#comment-16142362
 ] 

Alan Gates commented on HIVE-17307:
---

This doesn't seem to have been posted to the JIRA, but a build was done:

https://builds.apache.org/job/PreCommit-HIVE-Build/6529/

with results:

Test Result (3 failures / ±0)
org.apache.hive.hcatalog.pig.TestHCatLoaderComplexSchema.testSyntheticComplexSchema[0]
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14]
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]


> Change the metastore to not use the metrics code in hive/common
> ---
>
> Key: HIVE-17307
> URL: https://issues.apache.org/jira/browse/HIVE-17307
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
> Attachments: HIVE-17307.2.patch, HIVE-17307.3.patch, HIVE-17307.patch
>
>
> As we move code into the standalone metastore module, it cannot use the 
> metrics in hive-common.  We could copy the current Metrics interface or we 
> could change the metastore code to directly use codahale metrics.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17307) Change the metastore to not use the metrics code in hive/common

2017-08-25 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-17307:
--
Attachment: (was: HIVE-17307.3.patch)

> Change the metastore to not use the metrics code in hive/common
> ---
>
> Key: HIVE-17307
> URL: https://issues.apache.org/jira/browse/HIVE-17307
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
> Attachments: HIVE-17307.2.patch, HIVE-17307.3.patch, HIVE-17307.patch
>
>
> As we move code into the standalone metastore module, it cannot use the 
> metrics in hive-common.  We could copy the current Metrics interface or we 
> could change the metastore code to directly use codahale metrics.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17307) Change the metastore to not use the metrics code in hive/common

2017-08-25 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-17307:
--
Attachment: HIVE-17307.3.patch

> Change the metastore to not use the metrics code in hive/common
> ---
>
> Key: HIVE-17307
> URL: https://issues.apache.org/jira/browse/HIVE-17307
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
> Attachments: HIVE-17307.2.patch, HIVE-17307.3.patch, HIVE-17307.patch
>
>
> As we move code into the standalone metastore module, it cannot use the 
> metrics in hive-common.  We could copy the current Metrics interface or we 
> could change the metastore code to directly use codahale metrics.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17392) SharedWorkOptimizer might merge TS operators filtered by not equivalent semijoin operators

2017-08-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16142281#comment-16142281
 ] 

Hive QA commented on HIVE-17392:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12883797/HIVE-17392.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 11004 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnstats_part_coltype]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=235)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=235)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6544/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6544/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6544/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12883797 - PreCommit-HIVE-Build

> SharedWorkOptimizer might merge TS operators filtered by not equivalent 
> semijoin operators
> --
>
> Key: HIVE-17392
> URL: https://issues.apache.org/jira/browse/HIVE-17392
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-17392.02.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17392) SharedWorkOptimizer might merge TS operators filtered by not equivalent semijoin operators

2017-08-25 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16142245#comment-16142245
 ] 

Ashutosh Chauhan commented on HIVE-17392:
-

+1

> SharedWorkOptimizer might merge TS operators filtered by not equivalent 
> semijoin operators
> --
>
> Key: HIVE-17392
> URL: https://issues.apache.org/jira/browse/HIVE-17392
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-17392.02.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17392) SharedWorkOptimizer might merge TS operators filtered by not equivalent semijoin operators

2017-08-25 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-17392:
---
Attachment: HIVE-17392.02.patch

> SharedWorkOptimizer might merge TS operators filtered by not equivalent 
> semijoin operators
> --
>
> Key: HIVE-17392
> URL: https://issues.apache.org/jira/browse/HIVE-17392
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-17392.02.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17392) SharedWorkOptimizer might merge TS operators filtered by not equivalent semijoin operators

2017-08-25 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-17392:
---
Attachment: (was: HIVE-17392.01.patch)

> SharedWorkOptimizer might merge TS operators filtered by not equivalent 
> semijoin operators
> --
>
> Key: HIVE-17392
> URL: https://issues.apache.org/jira/browse/HIVE-17392
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-17392.02.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17375) stddev_samp,var_samp standard compliance

2017-08-25 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16142211#comment-16142211
 ] 

Ashutosh Chauhan commented on HIVE-17375:
-

+1

> stddev_samp,var_samp standard compliance
> 
>
> Key: HIVE-17375
> URL: https://issues.apache.org/jira/browse/HIVE-17375
> Project: Hive
>  Issue Type: Sub-task
>  Components: SQL
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Minor
> Attachments: HIVE-17375.1.patch, HIVE-17375.2.patch
>
>
> These two UDAFs return 0 in the case of only one element; however, the 
> standard requires NULL to be returned.
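
The rule falls out of the definition: sample variance divides by (n - 1), so 
it is undefined for a single element. A plain-Java sketch mirroring the SQL 
semantics (not Hive's actual GenericUDAF code):
{code:java}
public final class VarSamp {
  private VarSamp() {}

  /** Returns the sample variance, or null where it is undefined (n < 2). */
  static Double varSamp(double[] values) {
    int n = values.length;
    if (n < 2) {
      return null; // the behavior this issue fixes: previously 0 was returned
    }
    double mean = 0;
    for (double v : values) mean += v;
    mean /= n;
    double sumSq = 0;
    for (double v : values) sumSq += (v - mean) * (v - mean);
    return sumSq / (n - 1);
  }

  public static void main(String[] args) {
    System.out.println(varSamp(new double[] {42.0}));     // null
    System.out.println(varSamp(new double[] {1.0, 3.0})); // 2.0
  }
}
{code}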



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17367) IMPORT table doesn't load from data dump if a metadata-only dump was already imported.

2017-08-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16142152#comment-16142152
 ] 

Hive QA commented on HIVE-17367:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12883786/HIVE-17367.02.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 11005 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries]
 (batchId=231)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=100)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=235)
org.apache.hadoop.hive.metastore.TestHiveMetaStoreWithEnvironmentContext.testEnvironmentContext
 (batchId=209)
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testHttpRetryOnServerIdleTimeout 
(batchId=228)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6543/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6543/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6543/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12883786 - PreCommit-HIVE-Build

> IMPORT table doesn't load from data dump if a metadata-only dump was already 
> imported.
> --
>
> Key: HIVE-17367
> URL: https://issues.apache.org/jira/browse/HIVE-17367
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Import/Export, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17367.01.patch, HIVE-17367.02.patch
>
>
> Repl v1 creates a set of EXPORT/IMPORT commands to replicate modified data 
> (as per events) across clusters.
> For instance, let's say an insert generates 2 events such as
> ALTER_TABLE (ID: 10)
> INSERT (ID: 11)
> Each event generates a set of EXPORT and IMPORT commands.
> The ALTER_TABLE event generates a metadata-only export/import.
> The INSERT event generates a metadata+data export/import.
> As Hive always dumps the latest copy of the table during export, it sets the 
> latest notification event ID as the table's current state. So, in this 
> example, the import of metadata by the ALTER_TABLE event sets the current 
> state of the table to 11.
> Now, when we try to import the data dumped by the INSERT event, it is a noop 
> because the table's current state (11) equals the dump state (11), which in 
> turn means the data never gets replicated to the target cluster.
> So, it is necessary to allow overwrite of a table/partition if its current 
> state equals the dump state.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17392) SharedWorkOptimizer might merge TS operators filtered by not equivalent semijoin operators

2017-08-25 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-17392:
---
Attachment: HIVE-17392.01.patch

> SharedWorkOptimizer might merge TS operators filtered by not equivalent 
> semijoin operators
> --
>
> Key: HIVE-17392
> URL: https://issues.apache.org/jira/browse/HIVE-17392
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-17392.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17392) SharedWorkOptimizer might merge TS operators filtered by not equivalent semijoin operators

2017-08-25 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-17392:
---
Attachment: (was: HIVE-17392.patch)

> SharedWorkOptimizer might merge TS operators filtered by not equivalent 
> semijoin operators
> --
>
> Key: HIVE-17392
> URL: https://issues.apache.org/jira/browse/HIVE-17392
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-17392.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17340) TxnHandler.checkLock() - reduce number of SQL statements

2017-08-25 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17340:
--
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

> TxnHandler.checkLock() - reduce number of SQL statements
> 
>
> Key: HIVE-17340
> URL: https://issues.apache.org/jira/browse/HIVE-17340
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Fix For: 3.0.0
>
> Attachments: HIVE-17340.03.patch
>
>
> This calls acquire(Connection dbConn, Statement stmt, long extLockId, 
> LockInfo lockInfo)
> for each lock in the same DB transaction - 1 UPDATE statement per acquire().
> There is no reason they cannot all be sent in 1 statement if all the locks
> are granted.
> With a lot of partitions this can be a performance issue.
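For illustration, the batching idea amounts to one UPDATE with an IN clause over all of the transaction's external lock IDs, instead of one UPDATE per acquire(). A hedged sketch (the table and column names follow the metastore's HIVE_LOCKS schema, but acquireAll itself is invented and is not the committed patch):

{code}
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.List;

class LockBatchSketch {
  // Grant all locks of the transaction in one round trip
  // instead of one UPDATE per lock.
  static void acquireAll(Connection dbConn, List<Long> extLockIds)
      throws SQLException {
    if (extLockIds.isEmpty()) {
      return;
    }
    StringBuilder sql = new StringBuilder(
        "UPDATE HIVE_LOCKS SET HL_LOCK_STATE = 'a' WHERE HL_LOCK_EXT_ID IN (");
    for (int i = 0; i < extLockIds.size(); i++) {
      sql.append(i == 0 ? "" : ", ").append(extLockIds.get(i));
    }
    sql.append(")");
    try (Statement stmt = dbConn.createStatement()) {
      stmt.executeUpdate(sql.toString()); // single statement for all locks
    }
  }
}
{code}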



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17340) TxnHandler.checkLock() - reduce number of SQL statements

2017-08-25 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16142101#comment-16142101
 ] 

Eugene Koifman commented on HIVE-17340:
---

I cleaned up the imports.  I think keeping commit() in the "main" methods makes 
the flow clearer.

committed to master
thanks Wei for the review

cc [~gopalv]

> TxnHandler.checkLock() - reduce number of SQL statements
> 
>
> Key: HIVE-17340
> URL: https://issues.apache.org/jira/browse/HIVE-17340
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Fix For: 3.0.0
>
> Attachments: HIVE-17340.03.patch
>
>
> This calls acquire(Connection dbConn, Statement stmt, long extLockId, 
> LockInfo lockInfo)
> for each lock in the same DB transaction - 1 Update stmt per acquire().
> There is no reason all of them cannot be sent in 1 statement if all the locks 
> are granted
> With a lot of partitions this can be a perf issue



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17341) DbTxnManger.startHeartbeat() - randomize initial delay

2017-08-25 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17341:
--
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

committed to master
thanks Wei for the review

cc [~gopalv]

> DbTxnManger.startHeartbeat() - randomize initial delay
> --
>
> Key: HIVE-17341
> URL: https://issues.apache.org/jira/browse/HIVE-17341
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Fix For: 3.0.0
>
> Attachments: HIVE-17341.01.patch
>
>
> This sets up a fixed delay for all heartbeats. If many queries land on the
> server at the same time, they will all wake up and start heartbeating at the
> same time, causing a bottleneck.
> Add some random element to the heartbeat delay.
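A minimal sketch of the randomization idea using plain java.util.concurrent (not the actual DbTxnManager code; the heartbeat body and pool size are placeholders):

{code}
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;

class HeartbeatSketch {
  private final ScheduledExecutorService pool =
      Executors.newScheduledThreadPool(4);

  // Start heartbeating at a random offset within one interval, so queries
  // that arrive together do not all fire their first heartbeat together.
  void startHeartbeat(Runnable heartbeat, long intervalMs) {
    long initialDelayMs = ThreadLocalRandom.current().nextLong(intervalMs);
    pool.scheduleAtFixedRate(heartbeat, initialDelayMs, intervalMs,
        TimeUnit.MILLISECONDS);
  }
}
{code}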



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17297) allow AM to use LLAP guaranteed tasks

2017-08-25 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17297:

Attachment: (was: HIVE-17297.01.patch)

> allow AM to use LLAP guaranteed tasks
> -
>
> Key: HIVE-17297
> URL: https://issues.apache.org/jira/browse/HIVE-17297
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17297.01.nogen.patch, HIVE-17297.01.patch, 
> HIVE-17297.02.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17297) allow AM to use LLAP guaranteed tasks

2017-08-25 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17297:

Attachment: HIVE-17297.02.patch

Rebased on top of recent commits. No extra changes.

> allow AM to use LLAP guaranteed tasks
> -
>
> Key: HIVE-17297
> URL: https://issues.apache.org/jira/browse/HIVE-17297
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17297.01.nogen.patch, HIVE-17297.01.patch, 
> HIVE-17297.02.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17392) SharedWorkOptimizer might merge TS operators filtered by not equivalent semijoin operators

2017-08-25 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16142042#comment-16142042
 ] 

Jesus Camacho Rodriguez commented on HIVE-17392:


[~sershe], I just want to get a full QA run. Certainly they should be removed 
in the final patch.

> SharedWorkOptimizer might merge TS operators filtered by not equivalent 
> semijoin operators
> --
>
> Key: HIVE-17392
> URL: https://issues.apache.org/jira/browse/HIVE-17392
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-17392.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17392) SharedWorkOptimizer might merge TS operators filtered by not equivalent semijoin operators

2017-08-25 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16142038#comment-16142038
 ] 

Sergey Shelukhin commented on HIVE-17392:
-

What are the commented-out lines for - should they rather be removed?

> SharedWorkOptimizer might merge TS operators filtered by not equivalent 
> semijoin operators
> --
>
> Key: HIVE-17392
> URL: https://issues.apache.org/jira/browse/HIVE-17392
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-17392.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17367) IMPORT table doesn't load from data dump if a metadata-only dump was already imported.

2017-08-25 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-17367:

Status: Patch Available  (was: Open)

> IMPORT table doesn't load from data dump if a metadata-only dump was already 
> imported.
> --
>
> Key: HIVE-17367
> URL: https://issues.apache.org/jira/browse/HIVE-17367
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Import/Export, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17367.01.patch, HIVE-17367.02.patch
>
>
> Repl v1 creates a set of EXPORT/IMPORT commands to replicate modified data 
> (as per events) across clusters.
> For instance, let's say an insert generates 2 events:
> ALTER_TABLE (ID: 10)
> INSERT (ID: 11)
> Each event generates a set of EXPORT and IMPORT commands.
> The ALTER_TABLE event generates a metadata-only export/import;
> INSERT generates a metadata+data export/import.
> As Hive always dumps the latest copy of the table during export, it sets the
> latest notification event ID as the table's current state. So, in this
> example, the metadata import for the ALTER_TABLE event sets the current state
> of the table to 11.
> Now, when we try to import the data dumped by the INSERT event, it is a no-op
> because the table's current state (11) equals the dump state (11), which in
> turn means the data never gets replicated to the target cluster.
> So, it is necessary to allow overwrite of table/partition if their current 
> state equals the dump state.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17367) IMPORT table doesn't load from data dump if a metadata-only dump was already imported.

2017-08-25 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-17367:

Attachment: HIVE-17367.02.patch

Added 02.patch with additional handling to support retry after failure of 
import command.

> IMPORT table doesn't load from data dump if a metadata-only dump was already 
> imported.
> --
>
> Key: HIVE-17367
> URL: https://issues.apache.org/jira/browse/HIVE-17367
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Import/Export, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17367.01.patch, HIVE-17367.02.patch
>
>
> Repl v1 creates a set of EXPORT/IMPORT commands to replicate modified data 
> (as per events) across clusters.
> For instance, let's say an insert generates 2 events:
> ALTER_TABLE (ID: 10)
> INSERT (ID: 11)
> Each event generates a set of EXPORT and IMPORT commands.
> The ALTER_TABLE event generates a metadata-only export/import;
> INSERT generates a metadata+data export/import.
> As Hive always dumps the latest copy of the table during export, it sets the
> latest notification event ID as the table's current state. So, in this
> example, the metadata import for the ALTER_TABLE event sets the current state
> of the table to 11.
> Now, when we try to import the data dumped by the INSERT event, it is a no-op
> because the table's current state (11) equals the dump state (11), which in
> turn means the data never gets replicated to the target cluster.
> So, it is necessary to allow overwrite of table/partition if their current 
> state equals the dump state.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-12631) LLAP: support ORC ACID tables

2017-08-25 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16142033#comment-16142033
 ] 

Sergey Shelukhin commented on HIVE-12631:
-

Any update here?

> LLAP: support ORC ACID tables
> -
>
> Key: HIVE-12631
> URL: https://issues.apache.org/jira/browse/HIVE-12631
> Project: Hive
>  Issue Type: Bug
>  Components: llap, Transactions
>Reporter: Sergey Shelukhin
>Assignee: Teddy Choi
> Attachments: HIVE-12631.10.patch, HIVE-12631.10.patch, 
> HIVE-12631.11.patch, HIVE-12631.11.patch, HIVE-12631.12.patch, 
> HIVE-12631.13.patch, HIVE-12631.15.patch, HIVE-12631.16.patch, 
> HIVE-12631.17.patch, HIVE-12631.18.patch, HIVE-12631.19.patch, 
> HIVE-12631.1.patch, HIVE-12631.20.patch, HIVE-12631.21.patch, 
> HIVE-12631.22.patch, HIVE-12631.23.patch, HIVE-12631.24.patch, 
> HIVE-12631.25.patch, HIVE-12631.26.patch, HIVE-12631.2.patch, 
> HIVE-12631.3.patch, HIVE-12631.4.patch, HIVE-12631.5.patch, 
> HIVE-12631.6.patch, HIVE-12631.7.patch, HIVE-12631.8.patch, 
> HIVE-12631.8.patch, HIVE-12631.9.patch
>
>
> LLAP uses a completely separate read path in ORC to allow for caching and 
> parallelization of reads and processing. This path does not support ACID. As 
> far as I remember, the ACID logic is embedded inside the ORC format; we need
> to refactor it to sit on top of some interface, if practical, or just port it
> to the LLAP read path.
> Another consideration is how the logic will work with the cache. The cache is
> currently low-level (CB-level in ORC), so we could just use it to read bases
> and deltas (deltas should be cached with higher priority) and merge as usual.
> We could also cache the merged representation in the future.
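Purely as a sketch of the "on top of some interface" idea (every name below is invented for illustration; nothing here is from the Hive codebase): the base/delta merge logic would depend only on a small row-source abstraction, so the rows could come from either the standard ORC reader or the LLAP cached read path.

{code}
import java.io.IOException;

// Hypothetical abstraction: the ACID merge consumes ordered rows through
// this interface regardless of which read path produced them.
interface AcidRowSource {
  boolean next() throws IOException; // advance to the next row; false at end
  long originalTransaction();        // row-id components used for merge order
  int bucket();
  long rowId();
  Object row();                      // the row's data (from base or delta)
}
{code}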



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Work stopped] (HIVE-17367) IMPORT table doesn't load from data dump if a metadata-only dump was already imported.

2017-08-25 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-17367 stopped by Sankar Hariappan.
---
> IMPORT table doesn't load from data dump if a metadata-only dump was already 
> imported.
> --
>
> Key: HIVE-17367
> URL: https://issues.apache.org/jira/browse/HIVE-17367
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Import/Export, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17367.01.patch
>
>
> Repl v1 creates a set of EXPORT/IMPORT commands to replicate modified data 
> (as per events) across clusters.
> For instance, let's say an insert generates 2 events:
> ALTER_TABLE (ID: 10)
> INSERT (ID: 11)
> Each event generates a set of EXPORT and IMPORT commands.
> The ALTER_TABLE event generates a metadata-only export/import;
> INSERT generates a metadata+data export/import.
> As Hive always dumps the latest copy of the table during export, it sets the
> latest notification event ID as the table's current state. So, in this
> example, the metadata import for the ALTER_TABLE event sets the current state
> of the table to 11.
> Now, when we try to import the data dumped by the INSERT event, it is a no-op
> because the table's current state (11) equals the dump state (11), which in
> turn means the data never gets replicated to the target cluster.
> So, it is necessary to allow overwrite of table/partition if their current 
> state equals the dump state.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17330) refactor TezSessionPoolManager to separate its multiple functions

2017-08-25 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17330:

   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Committed to master after fixing a trivial NPE in tests. Thanks for the review!

> refactor TezSessionPoolManager to separate its multiple functions
> -
>
> Key: HIVE-17330
> URL: https://issues.apache.org/jira/browse/HIVE-17330
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 3.0.0
>
> Attachments: HIVE-17330.01.patch, HIVE-17330.02.patch, 
> HIVE-17330.patch
>
>
> TezSessionPoolManager would retain things specific to current Hive session 
> management. 
> The session pool itself, as well as expiration tracking, the pool session 
> implementation, and some config validation can be separated out and made 
> independent from the pool.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17392) SharedWorkOptimizer might merge TS operators filtered by not equivalent semijoin operators

2017-08-25 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-17392:
---
Attachment: HIVE-17392.patch

> SharedWorkOptimizer might merge TS operators filtered by not equivalent 
> semijoin operators
> --
>
> Key: HIVE-17392
> URL: https://issues.apache.org/jira/browse/HIVE-17392
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-17392.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17392) SharedWorkOptimizer might merge TS operators filtered by not equivalent semijoin operators

2017-08-25 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-17392:
---
Status: Patch Available  (was: In Progress)

> SharedWorkOptimizer might merge TS operators filtered by not equivalent 
> semijoin operators
> --
>
> Key: HIVE-17392
> URL: https://issues.apache.org/jira/browse/HIVE-17392
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-17392.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Work started] (HIVE-17392) SharedWorkOptimizer might merge TS operators filtered by not equivalent semijoin operators

2017-08-25 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-17392 started by Jesus Camacho Rodriguez.
--
> SharedWorkOptimizer might merge TS operators filtered by not equivalent 
> semijoin operators
> --
>
> Key: HIVE-17392
> URL: https://issues.apache.org/jira/browse/HIVE-17392
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17392) SharedWorkOptimizer might merge TS operators filtered by not equivalent semijoin operators

2017-08-25 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez reassigned HIVE-17392:
--


> SharedWorkOptimizer might merge TS operators filtered by not equivalent 
> semijoin operators
> --
>
> Key: HIVE-17392
> URL: https://issues.apache.org/jira/browse/HIVE-17392
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17340) TxnHandler.checkLock() - reduce number of SQL statements

2017-08-25 Thread Wei Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16142023#comment-16142023
 ] 

Wei Zheng commented on HIVE-17340:
--

Maybe it's better to move {code}dbConn.commit(){code} into the acquire() method 
from checkLock().

nit: there are unused imports.

+1 otherwise.

> TxnHandler.checkLock() - reduce number of SQL statements
> 
>
> Key: HIVE-17340
> URL: https://issues.apache.org/jira/browse/HIVE-17340
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-17340.03.patch
>
>
> This calls acquire(Connection dbConn, Statement stmt, long extLockId, 
> LockInfo lockInfo)
> for each lock in the same DB transaction - 1 UPDATE statement per acquire().
> There is no reason they cannot all be sent in 1 statement if all the locks
> are granted.
> With a lot of partitions this can be a performance issue.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17024) HPL/SQL: CLI fails to exit after NPE using embedded connection

2017-08-25 Thread Dmitry Tolpeko (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16142003#comment-16142003
 ] 

Dmitry Tolpeko commented on HIVE-17024:
---

For HiveServer2 embedded mode (jdbc:hive2://) I am going to remove the default 
initialization that sets the mr engine, as it causes the program to terminate 
for some reason:

{code}
<property>
  <name>hplsql.conn.init.hiveconn</name>
  <value>
    set hive.execution.engine=mr;
    use default;
  </value>
</property>
{code}

> HPL/SQL: CLI fails to exit after NPE using embedded connection
> --
>
> Key: HIVE-17024
> URL: https://issues.apache.org/jira/browse/HIVE-17024
> Project: Hive
>  Issue Type: Bug
>  Components: hpl/sql
>Reporter: Carter Shanklin
>Assignee: Dmitry Tolpeko
>Priority: Critical
>
> This bug is part of a series of issues and surprising behavior I encountered 
> writing a reporting script that would aggregate values and give rows 
> different classifications based on the aggregate. Addressing some or all 
> of these issues would make HPL/SQL more accessible to newcomers.
> This happened during the error reported in XXX (bug TBD)
> Script is this:
> create table if not exists test1(col1 integer);
> create table if not exists test2(col1 double);
> create table if not exists test3(col1 decimal(10, 4));
> create table if not exists test4(col1 string);
> create table if not exists test5(col1 varchar(20));
> Output is this:
> [vagrant@trunk hplsql]$ hplsql -f temp3.sql
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/home/vagrant/hivedist/apache-hive-3.0.0-SNAPSHOT-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/2.6.1.0-128/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
> Open connection: jdbc:hive2:// (5.83 sec)
> Starting query
> OK
> Query executed successfully (2.31 sec)
> Exception in thread "main" java.lang.NullPointerException
> at org.apache.hive.hplsql.Exec.evalPop(Exec.java:2398)
> at org.apache.hive.hplsql.Stmt.createTableDefinition(Stmt.java:169)
> at org.apache.hive.hplsql.Stmt.createTable(Stmt.java:142)
> at org.apache.hive.hplsql.Exec.visitCreate_table_stmt(Exec.java:1366)
> at org.apache.hive.hplsql.Exec.visitCreate_table_stmt(Exec.java:52)
> at 
> org.apache.hive.hplsql.HplsqlParser$Create_table_stmtContext.accept(HplsqlParser.java:4198)
> at 
> org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visitChildren(AbstractParseTreeVisitor.java:70)
> at org.apache.hive.hplsql.Exec.visitStmt(Exec.java:1013)
> at org.apache.hive.hplsql.Exec.visitStmt(Exec.java:52)
> at 
> org.apache.hive.hplsql.HplsqlParser$StmtContext.accept(HplsqlParser.java:1018)
> at 
> org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visitChildren(AbstractParseTreeVisitor.java:70)
> at 
> org.apache.hive.hplsql.HplsqlBaseVisitor.visitBlock(HplsqlBaseVisitor.java:28)
> at 
> org.apache.hive.hplsql.HplsqlParser$BlockContext.accept(HplsqlParser.java:452)
> at 
> org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visitChildren(AbstractParseTreeVisitor.java:70)
> at org.apache.hive.hplsql.Exec.visitProgram(Exec.java:920)
> at org.apache.hive.hplsql.Exec.visitProgram(Exec.java:52)
> at 
> org.apache.hive.hplsql.HplsqlParser$ProgramContext.accept(HplsqlParser.java:393)
> at 
> org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visit(AbstractParseTreeVisitor.java:42)
> at org.apache.hive.hplsql.Exec.run(Exec.java:775)
> at org.apache.hive.hplsql.Exec.run(Exec.java:751)
> at org.apache.hive.hplsql.Hplsql.main(Hplsql.java:23)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
> I attached a jstack.
> When I use a Hiveserver2 connection instead, I get an NPE but it doesn't hang 
> (at least not on the client side)
> Version = 3.0.0-SNAPSHOT r71f52d8ad512904b3f2c4f04fe39a33f2834f1f2



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17367) IMPORT table doesn't load from data dump if a metadata-only dump was already imported.

2017-08-25 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-17367:

Status: Open  (was: Patch Available)

> IMPORT table doesn't load from data dump if a metadata-only dump was already 
> imported.
> --
>
> Key: HIVE-17367
> URL: https://issues.apache.org/jira/browse/HIVE-17367
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Import/Export, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17367.01.patch
>
>
> Repl v1 creates a set of EXPORT/IMPORT commands to replicate modified data 
> (as per events) across clusters.
> For instance, let's say an insert generates 2 events:
> ALTER_TABLE (ID: 10)
> INSERT (ID: 11)
> Each event generates a set of EXPORT and IMPORT commands.
> The ALTER_TABLE event generates a metadata-only export/import;
> INSERT generates a metadata+data export/import.
> As Hive always dumps the latest copy of the table during export, it sets the
> latest notification event ID as the table's current state. So, in this
> example, the metadata import for the ALTER_TABLE event sets the current state
> of the table to 11.
> Now, when we try to import the data dumped by the INSERT event, it is a no-op
> because the table's current state (11) equals the dump state (11), which in
> turn means the data never gets replicated to the target cluster.
> So, it is necessary to allow overwrite of table/partition if their current 
> state equals the dump state.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Work started] (HIVE-17367) IMPORT table doesn't load from data dump if a metadata-only dump was already imported.

2017-08-25 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-17367 started by Sankar Hariappan.
---
> IMPORT table doesn't load from data dump if a metadata-only dump was already 
> imported.
> --
>
> Key: HIVE-17367
> URL: https://issues.apache.org/jira/browse/HIVE-17367
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Import/Export, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17367.01.patch
>
>
> Repl v1 creates a set of EXPORT/IMPORT commands to replicate modified data 
> (as per events) across clusters.
> For instance, let's say an insert generates 2 events:
> ALTER_TABLE (ID: 10)
> INSERT (ID: 11)
> Each event generates a set of EXPORT and IMPORT commands.
> The ALTER_TABLE event generates a metadata-only export/import;
> INSERT generates a metadata+data export/import.
> As Hive always dumps the latest copy of the table during export, it sets the
> latest notification event ID as the table's current state. So, in this
> example, the metadata import for the ALTER_TABLE event sets the current state
> of the table to 11.
> Now, when we try to import the data dumped by the INSERT event, it is a no-op
> because the table's current state (11) equals the dump state (11), which in
> turn means the data never gets replicated to the target cluster.
> So, it is necessary to allow overwrite of table/partition if their current 
> state equals the dump state.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17391) Compaction fails if there is an empty value in tblproperties

2017-08-25 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16141973#comment-16141973
 ] 

Ashutosh Chauhan commented on HIVE-17391:
-

Stacktrace
{code}
Exception running child : java.lang.NullPointerException
  at java.util.Hashtable.put(Hashtable.java:459)
  at java.util.Hashtable.putAll(Hashtable.java:523)
  at 
org.apache.hadoop.hive.common.StringableMap.toProperties(StringableMap.java:77)
  at 
org.apache.hadoop.hive.ql.txn.compactor.CompactorMR$CompactorMap.getWriter(CompactorMR.java:660)
  at 
org.apache.hadoop.hive.ql.txn.compactor.CompactorMR$CompactorMap.map(CompactorMR.java:636)
  at 
org.apache.hadoop.hive.ql.txn.compactor.CompactorMR$CompactorMap.map(CompactorMR.java:610)
{code}
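The trace comes down to a JDK behavior that is easy to reproduce standalone: java.util.Properties extends Hashtable, and Hashtable rejects null values. A minimal reproduction, assuming the empty tblproperties value reaches toProperties() as a null (the class below is illustrative, not Hive code):

{code}
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

public class NullValueNpe {
  public static void main(String[] args) {
    Map<String, String> props = new HashMap<>();
    props.put("serialization.null.format", null); // empty value read back as null
    new Properties().putAll(props); // NullPointerException at Hashtable.put
  }
}
{code}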


> Compaction fails if there is an empty value in tblproperties
> 
>
> Key: HIVE-17391
> URL: https://issues.apache.org/jira/browse/HIVE-17391
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore, Transactions
>Affects Versions: 2.2.0, 2.3.0
>Reporter: Ashutosh Chauhan
>
> create table t1 (a int) tblproperties ('serialization.null.format'='');
> alter table t1 compact 'major';
> fails



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17341) DbTxnManger.startHeartbeat() - randomize initial delay

2017-08-25 Thread Wei Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16141949#comment-16141949
 ] 

Wei Zheng commented on HIVE-17341:
--

+1

> DbTxnManger.startHeartbeat() - randomize initial delay
> --
>
> Key: HIVE-17341
> URL: https://issues.apache.org/jira/browse/HIVE-17341
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-17341.01.patch
>
>
> This sets up a fixed delay for all heartbeats. If many queries land on the
> server at the same time, they will all wake up and start heartbeating at the
> same time, causing a bottleneck.
> Add some random element to the heartbeat delay.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (HIVE-17329) ensure acid side file is not overwritten

2017-08-25 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman resolved HIVE-17329.
---
Resolution: Done

fixed in HIVE-17205

> ensure acid side file is not overwritten
> 
>
> Key: HIVE-17329
> URL: https://issues.apache.org/jira/browse/HIVE-17329
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Minor
> Fix For: 3.0.0
>
>
> OrcRecordUpdater() has 
> {noformat}
>   flushLengths = fs.create(OrcAcidUtils.getSideFile(this.path), true, 8,
>   options.getReporter());
> {noformat}
> this should be the only place where the side file is created but to be safe 
> we should set "overwrite" parameter to false.  If this file already exists 
> that means there are 2 OrcRecordUpdates trying to write the same (primary) 
> file - never ok.
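The proposed change is a one-argument flip of the snippet quoted above; with overwrite=false, FileSystem.create() fails if the side file already exists instead of silently truncating it (a sketch of the intent, not a committed patch):

{code}
// overwrite=false: create() now throws if the side file already exists,
// surfacing the case of two OrcRecordUpdaters writing the same primary file.
flushLengths = fs.create(OrcAcidUtils.getSideFile(this.path),
    false /* overwrite */, 8, options.getReporter());
{code}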



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17375) stddev_samp,var_samp standard compliance

2017-08-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16141903#comment-16141903
 ] 

Hive QA commented on HIVE-17375:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12883746/HIVE-17375.2.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 11004 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_llap_counters]
 (batchId=145)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=235)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6542/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6542/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6542/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12883746 - PreCommit-HIVE-Build

> stddev_samp,var_samp standard compliance
> 
>
> Key: HIVE-17375
> URL: https://issues.apache.org/jira/browse/HIVE-17375
> Project: Hive
>  Issue Type: Sub-task
>  Components: SQL
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Minor
> Attachments: HIVE-17375.1.patch, HIVE-17375.2.patch
>
>
> these two UDAFs return 0 when there is only one element - however, the
> standard requires NULL to be returned
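For illustration, a standard-compliant sample variance in plain Java (not the Hive UDAF implementation): with a single value the n-1 denominator is zero, so the result must be NULL rather than 0; stddev_samp is simply its square root.

{code}
class VarSampSketch {
  // Returns null (NULL) for fewer than two values, per the SQL standard.
  static Double varSamp(double[] values) {
    int n = values.length;
    if (n < 2) {
      return null;
    }
    double mean = 0;
    for (double v : values) {
      mean += v;
    }
    mean /= n;
    double ss = 0;
    for (double v : values) {
      ss += (v - mean) * (v - mean);
    }
    return ss / (n - 1); // stddev_samp would be Math.sqrt of this
  }
}
{code}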



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17340) TxnHandler.checkLock() - reduce number of SQL statements

2017-08-25 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16141852#comment-16141852
 ] 

Eugene Koifman commented on HIVE-17340:
---

no related failures

> TxnHandler.checkLock() - reduce number of SQL statements
> 
>
> Key: HIVE-17340
> URL: https://issues.apache.org/jira/browse/HIVE-17340
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-17340.03.patch
>
>
> This calls acquire(Connection dbConn, Statement stmt, long extLockId, 
> LockInfo lockInfo)
> for each lock in the same DB transaction - 1 UPDATE statement per acquire().
> There is no reason they cannot all be sent in 1 statement if all the locks
> are granted.
> With a lot of partitions this can be a performance issue.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17025) HPL/SQL: hplsql.conn.convert.hiveconn seems to default to false, contrary to docs

2017-08-25 Thread Dmitry Tolpeko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitry Tolpeko reassigned HIVE-17025:
-

Assignee: Dmitry Tolpeko

> HPL/SQL: hplsql.conn.convert.hiveconn seems to default to false, contrary to 
> docs
> -
>
> Key: HIVE-17025
> URL: https://issues.apache.org/jira/browse/HIVE-17025
> Project: Hive
>  Issue Type: Bug
>  Components: hpl/sql
>Reporter: Carter Shanklin
>Assignee: Dmitry Tolpeko
>
> This bug is part of a series of issues and surprising behavior I encountered 
> writing a reporting script that would aggregate values and give rows 
> different classifications based on the aggregate. Addressing some or all 
> of these issues would make HPL/SQL more accessible to newcomers.
> Example from the docs is as follows:
> CREATE TABLE dept (
>   deptno NUMBER(2,0),
>   dname  NUMBER(14),
>   loc    VARCHAR2(13),
>   CONSTRAINT pk_dept PRIMARY KEY (deptno)
> );
> With this config:
> <configuration>
>   <property>
>     <name>hplsql.conn.default</name>
>     <value>hiveconn</value>
>   </property>
>   <property>
>     <name>hplsql.conn.hiveconn</name>
>     <value>org.apache.hive.jdbc.HiveDriver;jdbc:hive2://</value>
>   </property>
> </configuration>
> I get this error:
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.parse.ParseException:line 2:9 cannot recognize 
> input near 'NUMBER' '(' '2' in column type
> With this config:
> <configuration>
>   <property>
>     <name>hplsql.conn.default</name>
>     <value>hiveconn</value>
>   </property>
>   <property>
>     <name>hplsql.conn.hiveconn</name>
>     <value>org.apache.hive.jdbc.HiveDriver;jdbc:hive2://</value>
>   </property>
>   <property>
>     <name>hplsql.conn.convert.hiveconn</name>
>     <value>true</value>
>   </property>
> </configuration>
> the example works.
> Version = 3.0.0-SNAPSHOT r71f52d8ad512904b3f2c4f04fe39a33f2834f1f2



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17389) Yetus is always failing on rat checks

2017-08-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16141810#comment-16141810
 ] 

Hive QA commented on HIVE-17389:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12883742/HIVE-17389.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 11004 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2] 
(batchId=46)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=235)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=235)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6541/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6541/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6541/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12883742 - PreCommit-HIVE-Build

> Yetus is always failing on rat checks
> -
>
> Key: HIVE-17389
> URL: https://issues.apache.org/jira/browse/HIVE-17389
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Barna Zsombor Klara
>Assignee: Barna Zsombor Klara
>Priority: Trivial
> Attachments: HIVE-17389.01.patch
>
>
> Rat checks are failing on metastore_db/dblock and files under patchprocess 
> created by Yetus itself.
> Both directories should be excluded from rat checks.
> CC: [~pvary] [~kgyrtkirk]



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17024) HPL/SQL: CLI fails to exit after NPE using embedded connection

2017-08-25 Thread Dmitry Tolpeko (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16141797#comment-16141797
 ] 

Dmitry Tolpeko commented on HIVE-17024:
---

Asked Carter to add -trace and send me the results back as I cannot reproduce 
this. Here is my output:

{code}
Ln:1 CREATE TABLE
Ln:1 create table if not exists crtab_test1(col1 int)
17/08/25 15:31:05 INFO jdbc.Utils: Supplied authorities: localhost:1
17/08/25 15:31:05 INFO jdbc.Utils: Resolved authority: localhost:1
Open connection: jdbc:hive2://localhost:1 (568 ms)
Starting SQL statement
SQL statement executed successfully (395 ms)
Ln:2 CREATE TABLE
Ln:2 create table if not exists crtab_test2(col1 double)
Starting SQL statement
SQL statement executed successfully (505 ms)
Ln:3 CREATE TABLE
Ln:3 create table if not exists crtab_test3(col1 decimal(10, 4))
Starting SQL statement
SQL statement executed successfully (357 ms)
Ln:4 CREATE TABLE
Ln:4 create table if not exists crtab_test4(col1 string)
Starting SQL statement
SQL statement executed successfully (314 ms)
Ln:5 CREATE TABLE
Ln:5 create table if not exists crtab_test5(col1 varchar(20))
Starting SQL statement
SQL statement executed successfully (348 ms)
{code}

> HPL/SQL: CLI fails to exit after NPE using embedded connection
> --
>
> Key: HIVE-17024
> URL: https://issues.apache.org/jira/browse/HIVE-17024
> Project: Hive
>  Issue Type: Bug
>  Components: hpl/sql
>Reporter: Carter Shanklin
>Assignee: Dmitry Tolpeko
>Priority: Critical
>
> This bug is part of a series of issues and surprising behavior I encountered 
> writing a reporting script that would aggregate values and give rows 
> different classifications based on the aggregate. Addressing some or all 
> of these issues would make HPL/SQL more accessible to newcomers.
> This happened during the error reported in XXX (bug TBD)
> Script is this:
> create table if not exists test1(col1 integer);
> create table if not exists test2(col1 double);
> create table if not exists test3(col1 decimal(10, 4));
> create table if not exists test4(col1 string);
> create table if not exists test5(col1 varchar(20));
> Output is this:
> [vagrant@trunk hplsql]$ hplsql -f temp3.sql
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/home/vagrant/hivedist/apache-hive-3.0.0-SNAPSHOT-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/2.6.1.0-128/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
> Open connection: jdbc:hive2:// (5.83 sec)
> Starting query
> OK
> Query executed successfully (2.31 sec)
> Exception in thread "main" java.lang.NullPointerException
> at org.apache.hive.hplsql.Exec.evalPop(Exec.java:2398)
> at org.apache.hive.hplsql.Stmt.createTableDefinition(Stmt.java:169)
> at org.apache.hive.hplsql.Stmt.createTable(Stmt.java:142)
> at org.apache.hive.hplsql.Exec.visitCreate_table_stmt(Exec.java:1366)
> at org.apache.hive.hplsql.Exec.visitCreate_table_stmt(Exec.java:52)
> at 
> org.apache.hive.hplsql.HplsqlParser$Create_table_stmtContext.accept(HplsqlParser.java:4198)
> at 
> org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visitChildren(AbstractParseTreeVisitor.java:70)
> at org.apache.hive.hplsql.Exec.visitStmt(Exec.java:1013)
> at org.apache.hive.hplsql.Exec.visitStmt(Exec.java:52)
> at 
> org.apache.hive.hplsql.HplsqlParser$StmtContext.accept(HplsqlParser.java:1018)
> at 
> org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visitChildren(AbstractParseTreeVisitor.java:70)
> at 
> org.apache.hive.hplsql.HplsqlBaseVisitor.visitBlock(HplsqlBaseVisitor.java:28)
> at 
> org.apache.hive.hplsql.HplsqlParser$BlockContext.accept(HplsqlParser.java:452)
> at 
> org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visitChildren(AbstractParseTreeVisitor.java:70)
> at org.apache.hive.hplsql.Exec.visitProgram(Exec.java:920)
> at org.apache.hive.hplsql.Exec.visitProgram(Exec.java:52)
> at 
> org.apache.hive.hplsql.HplsqlParser$ProgramContext.accept(HplsqlParser.java:393)
> at 
> org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visit(AbstractParseTreeVisitor.java:42)
> at org.apache.hive.hplsql.Exec.run(Exec.java:775)
> at org.apache.hive.hplsql.Exec.run(Exec.java:751)
> at org.apache.hive.hplsql.Hplsql.main(Hplsql.java:23)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> 

[jira] [Commented] (HIVE-17341) DbTxnManger.startHeartbeat() - randomize initial delay

2017-08-25 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16141741#comment-16141741
 ] 

Eugene Koifman commented on HIVE-17341:
---

no related failures

> DbTxnManger.startHeartbeat() - randomize initial delay
> --
>
> Key: HIVE-17341
> URL: https://issues.apache.org/jira/browse/HIVE-17341
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-17341.01.patch
>
>
> This sets up a fixed delay for all heartbeats. If many queries land on the
> server at the same time, they will all wake up and start heartbeating at the
> same time, causing a bottleneck.
> Add some random element to the heartbeat delay.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17107) Upgrade Yetus to 0.5.0

2017-08-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16141725#comment-16141725
 ] 

Hive QA commented on HIVE-17107:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12883740/HIVE-17107.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 11000 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=100)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=235)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6540/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6540/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6540/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12883740 - PreCommit-HIVE-Build

> Upgrade Yetus to 0.5.0
> --
>
> Key: HIVE-17107
> URL: https://issues.apache.org/jira/browse/HIVE-17107
> Project: Hive
>  Issue Type: Task
>  Components: Testing Infrastructure
>Reporter: Peter Vary
>Assignee: Barna Zsombor Klara
> Attachments: HIVE-17107.01.patch
>
>
> [Yetus 0.5.0|https://yetus.apache.org/documentation/0.5.0/RELEASENOTES/] is 
> released, and it contains our fixes.
> We should upgrade and remove our extra patched files.
> CC: [~kgyrtkirk]



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17390) Select count(distinct) returns incorrect results using tez

2017-08-25 Thread Khaja Hussain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16141708#comment-16141708
 ] 

Khaja Hussain commented on HIVE-17390:
--

Thanks Brian for filing the bug.

> Select count(distinct) returns incorrect results using tez
> --
>
> Key: HIVE-17390
> URL: https://issues.apache.org/jira/browse/HIVE-17390
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 1.2.1
>Reporter: Brian Goerlitz
>
> With the following combination of settings, select count(distinct) will 
> return the results of select sum(distinct).
> hive.execution.engine=tez
> hive.optimize.reducededuplication=true
> hive.optimize.reducededuplication.min.reducer=1
> hive.optimize.distinct.rewrite=true
> hive.groupby.skewindata=false
> hive.vectorized.execution.reduce.enabled=true
> STEPS TO REPRODUCE:
> {quote}CREATE TABLE `simple_data`(ppmonth int, sale double);
> INSERT INTO simple_data VALUES 
> (501,25000.0),(502,6.0),(501,4.0),(502,7.0),(501,35000.0),(502,6.0);
> set hive.execution.engine=tez;
> set hive.optimize.reducededuplication=true;
> set hive.optimize.reducededuplication.min.reducer=1;
> set hive.optimize.distinct.rewrite=true;
> set hive.groupby.skewindata=false;
> set hive.vectorized.execution.reduce.enabled=true;
> select count(distinct ppmonth) from simple_data;{quote}
> Returns 1003 (= 501 + 502, i.e. the result of sum(distinct ppmonth)) rather 
> than 2



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17375) stddev_samp,var_samp standard compliance

2017-08-25 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-17375:

Attachment: HIVE-17375.2.patch

#2) update vectorization related q.outs

> stddev_samp,var_samp standard compliance
> 
>
> Key: HIVE-17375
> URL: https://issues.apache.org/jira/browse/HIVE-17375
> Project: Hive
>  Issue Type: Sub-task
>  Components: SQL
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Minor
> Attachments: HIVE-17375.1.patch, HIVE-17375.2.patch
>
>
> these two UDAFs return 0 when there is only one element - however, the
> standard requires NULL to be returned



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Issue Comment Deleted] (HIVE-17024) HPL/SQL: CLI fails to exit after NPE using embedded connection

2017-08-25 Thread Dmitry Tolpeko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitry Tolpeko updated HIVE-17024:
--
Comment: was deleted

(was: After HIVE-16595 there is another error:
{code}
java.sql.SQLException: Error while compiling statement: FAILED: ParseException 
line 1:38 cannot recognize input near 'integer' ')' '' in column type
at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:121)
{code}

Looks like the EOF token ended up in the SQL string. )

> HPL/SQL: CLI fails to exit after NPE using embedded connection
> --
>
> Key: HIVE-17024
> URL: https://issues.apache.org/jira/browse/HIVE-17024
> Project: Hive
>  Issue Type: Bug
>  Components: hpl/sql
>Reporter: Carter Shanklin
>Assignee: Dmitry Tolpeko
>Priority: Critical
>
> This bug is part of a series of issues and surprising behavior I encountered 
> writing a reporting script that would aggregate values and give rows 
> different classifications based on the aggregate. Addressing some or all 
> of these issues would make HPL/SQL more accessible to newcomers.
> This happened during the error reported in XXX (bug TBD)
> Script is this:
> create table if not exists test1(col1 integer);
> create table if not exists test2(col1 double);
> create table if not exists test3(col1 decimal(10, 4));
> create table if not exists test4(col1 string);
> create table if not exists test5(col1 varchar(20));
> Output is this:
> [vagrant@trunk hplsql]$ hplsql -f temp3.sql
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/home/vagrant/hivedist/apache-hive-3.0.0-SNAPSHOT-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/2.6.1.0-128/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
> Open connection: jdbc:hive2:// (5.83 sec)
> Starting query
> OK
> Query executed successfully (2.31 sec)
> Exception in thread "main" java.lang.NullPointerException
> at org.apache.hive.hplsql.Exec.evalPop(Exec.java:2398)
> at org.apache.hive.hplsql.Stmt.createTableDefinition(Stmt.java:169)
> at org.apache.hive.hplsql.Stmt.createTable(Stmt.java:142)
> at org.apache.hive.hplsql.Exec.visitCreate_table_stmt(Exec.java:1366)
> at org.apache.hive.hplsql.Exec.visitCreate_table_stmt(Exec.java:52)
> at 
> org.apache.hive.hplsql.HplsqlParser$Create_table_stmtContext.accept(HplsqlParser.java:4198)
> at 
> org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visitChildren(AbstractParseTreeVisitor.java:70)
> at org.apache.hive.hplsql.Exec.visitStmt(Exec.java:1013)
> at org.apache.hive.hplsql.Exec.visitStmt(Exec.java:52)
> at 
> org.apache.hive.hplsql.HplsqlParser$StmtContext.accept(HplsqlParser.java:1018)
> at 
> org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visitChildren(AbstractParseTreeVisitor.java:70)
> at 
> org.apache.hive.hplsql.HplsqlBaseVisitor.visitBlock(HplsqlBaseVisitor.java:28)
> at 
> org.apache.hive.hplsql.HplsqlParser$BlockContext.accept(HplsqlParser.java:452)
> at 
> org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visitChildren(AbstractParseTreeVisitor.java:70)
> at org.apache.hive.hplsql.Exec.visitProgram(Exec.java:920)
> at org.apache.hive.hplsql.Exec.visitProgram(Exec.java:52)
> at 
> org.apache.hive.hplsql.HplsqlParser$ProgramContext.accept(HplsqlParser.java:393)
> at 
> org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visit(AbstractParseTreeVisitor.java:42)
> at org.apache.hive.hplsql.Exec.run(Exec.java:775)
> at org.apache.hive.hplsql.Exec.run(Exec.java:751)
> at org.apache.hive.hplsql.Hplsql.main(Hplsql.java:23)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
> I attached a jstack.
> When I use a Hiveserver2 connection instead, I get an NPE but it doesn't hang 
> (at least not on the client side)
> Version = 3.0.0-SNAPSHOT r71f52d8ad512904b3f2c4f04fe39a33f2834f1f2



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17024) HPL/SQL: CLI fails to exit after NPE using embedded connection

2017-08-25 Thread Dmitry Tolpeko (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16141682#comment-16141682
 ] 

Dmitry Tolpeko commented on HIVE-17024:
---

After HIVE-16595 there is another error:
{code}
java.sql.SQLException: Error while compiling statement: FAILED: ParseException 
line 1:38 cannot recognize input near 'integer' ')' '' in column type
at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:121)
{code}

Looks like the EOF token ended up in the SQL string. 

> HPL/SQL: CLI fails to exit after NPE using embedded connection
> --
>
> Key: HIVE-17024
> URL: https://issues.apache.org/jira/browse/HIVE-17024
> Project: Hive
>  Issue Type: Bug
>  Components: hpl/sql
>Reporter: Carter Shanklin
>Assignee: Dmitry Tolpeko
>Priority: Critical
>
> This bug is part of a series of issues and surprising behavior I encountered 
> writing a reporting script that would aggregate values and give rows 
> different classifications based on the aggregate. Addressing some or all 
> of these issues would make HPL/SQL more accessible to newcomers.
> This happened during the error reported in XXX (bug TBD)
> Script is this:
> create table if not exists test1(col1 integer);
> create table if not exists test2(col1 double);
> create table if not exists test3(col1 decimal(10, 4));
> create table if not exists test4(col1 string);
> create table if not exists test5(col1 varchar(20));
> Output is this:
> [vagrant@trunk hplsql]$ hplsql -f temp3.sql
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/home/vagrant/hivedist/apache-hive-3.0.0-SNAPSHOT-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/2.6.1.0-128/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
> Open connection: jdbc:hive2:// (5.83 sec)
> Starting query
> OK
> Query executed successfully (2.31 sec)
> Exception in thread "main" java.lang.NullPointerException
> at org.apache.hive.hplsql.Exec.evalPop(Exec.java:2398)
> at org.apache.hive.hplsql.Stmt.createTableDefinition(Stmt.java:169)
> at org.apache.hive.hplsql.Stmt.createTable(Stmt.java:142)
> at org.apache.hive.hplsql.Exec.visitCreate_table_stmt(Exec.java:1366)
> at org.apache.hive.hplsql.Exec.visitCreate_table_stmt(Exec.java:52)
> at 
> org.apache.hive.hplsql.HplsqlParser$Create_table_stmtContext.accept(HplsqlParser.java:4198)
> at 
> org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visitChildren(AbstractParseTreeVisitor.java:70)
> at org.apache.hive.hplsql.Exec.visitStmt(Exec.java:1013)
> at org.apache.hive.hplsql.Exec.visitStmt(Exec.java:52)
> at 
> org.apache.hive.hplsql.HplsqlParser$StmtContext.accept(HplsqlParser.java:1018)
> at 
> org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visitChildren(AbstractParseTreeVisitor.java:70)
> at 
> org.apache.hive.hplsql.HplsqlBaseVisitor.visitBlock(HplsqlBaseVisitor.java:28)
> at 
> org.apache.hive.hplsql.HplsqlParser$BlockContext.accept(HplsqlParser.java:452)
> at 
> org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visitChildren(AbstractParseTreeVisitor.java:70)
> at org.apache.hive.hplsql.Exec.visitProgram(Exec.java:920)
> at org.apache.hive.hplsql.Exec.visitProgram(Exec.java:52)
> at 
> org.apache.hive.hplsql.HplsqlParser$ProgramContext.accept(HplsqlParser.java:393)
> at 
> org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visit(AbstractParseTreeVisitor.java:42)
> at org.apache.hive.hplsql.Exec.run(Exec.java:775)
> at org.apache.hive.hplsql.Exec.run(Exec.java:751)
> at org.apache.hive.hplsql.Hplsql.main(Hplsql.java:23)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
> I attached a jstack.
> When I use a HiveServer2 connection instead, I get an NPE but it doesn't hang 
> (at least not on the client side).
> Version = 3.0.0-SNAPSHOT r71f52d8ad512904b3f2c4f04fe39a33f2834f1f2
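
A plausible reading of the hang itself (speculative, not confirmed by any patch): the embedded jdbc:hive2:// connection starts HiveServer2 and metastore threads inside the CLI JVM, and those non-daemon threads keep the process alive after the uncaught NPE kills main. A coarse sketch of the usual workaround:

{code:java}
// Illustrative only; Exec here is a stand-in, not org.apache.hive.hplsql.Exec.
public class CliExitSketch {
  public static void main(String[] args) {
    int rc = 0;
    try {
      new Exec().run(args);           // may throw and leave threads behind
    } catch (Throwable t) {
      t.printStackTrace();
      rc = 1;
    }
    System.exit(rc);                  // terminates lingering non-daemon threads
  }

  static class Exec {                 // stand-in so the sketch compiles
    void run(String[] args) { }
  }
}
{code}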



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17024) HPL/SQL: CLI fails to exit after NPE using embedded connection

2017-08-25 Thread Dmitry Tolpeko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitry Tolpeko reassigned HIVE-17024:
-

Assignee: Dmitry Tolpeko

> HPL/SQL: CLI fails to exit after NPE using embedded connection
> --
>
> Key: HIVE-17024
> URL: https://issues.apache.org/jira/browse/HIVE-17024
> Project: Hive
>  Issue Type: Bug
>  Components: hpl/sql
>Reporter: Carter Shanklin
>Assignee: Dmitry Tolpeko
>Priority: Critical
>
> This bug is part of a series of issues and surprising behavior I encountered 
> writing a reporting script that would aggregate values and give rows 
> different classifications based on the aggregate. Addressing some or all 
> of these issues would make HPL/SQL more accessible to newcomers.
> This happened during the error reported in XXX (bug TBD)
> Script is this:
> create table if not exists test1(col1 integer);
> create table if not exists test2(col1 double);
> create table if not exists test3(col1 decimal(10, 4));
> create table if not exists test4(col1 string);
> create table if not exists test5(col1 varchar(20));
> Output is this:
> [vagrant@trunk hplsql]$ hplsql -f temp3.sql
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/home/vagrant/hivedist/apache-hive-3.0.0-SNAPSHOT-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/2.6.1.0-128/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
> Open connection: jdbc:hive2:// (5.83 sec)
> Starting query
> OK
> Query executed successfully (2.31 sec)
> Exception in thread "main" java.lang.NullPointerException
> at org.apache.hive.hplsql.Exec.evalPop(Exec.java:2398)
> at org.apache.hive.hplsql.Stmt.createTableDefinition(Stmt.java:169)
> at org.apache.hive.hplsql.Stmt.createTable(Stmt.java:142)
> at org.apache.hive.hplsql.Exec.visitCreate_table_stmt(Exec.java:1366)
> at org.apache.hive.hplsql.Exec.visitCreate_table_stmt(Exec.java:52)
> at 
> org.apache.hive.hplsql.HplsqlParser$Create_table_stmtContext.accept(HplsqlParser.java:4198)
> at 
> org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visitChildren(AbstractParseTreeVisitor.java:70)
> at org.apache.hive.hplsql.Exec.visitStmt(Exec.java:1013)
> at org.apache.hive.hplsql.Exec.visitStmt(Exec.java:52)
> at 
> org.apache.hive.hplsql.HplsqlParser$StmtContext.accept(HplsqlParser.java:1018)
> at 
> org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visitChildren(AbstractParseTreeVisitor.java:70)
> at 
> org.apache.hive.hplsql.HplsqlBaseVisitor.visitBlock(HplsqlBaseVisitor.java:28)
> at 
> org.apache.hive.hplsql.HplsqlParser$BlockContext.accept(HplsqlParser.java:452)
> at 
> org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visitChildren(AbstractParseTreeVisitor.java:70)
> at org.apache.hive.hplsql.Exec.visitProgram(Exec.java:920)
> at org.apache.hive.hplsql.Exec.visitProgram(Exec.java:52)
> at 
> org.apache.hive.hplsql.HplsqlParser$ProgramContext.accept(HplsqlParser.java:393)
> at 
> org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visit(AbstractParseTreeVisitor.java:42)
> at org.apache.hive.hplsql.Exec.run(Exec.java:775)
> at org.apache.hive.hplsql.Exec.run(Exec.java:751)
> at org.apache.hive.hplsql.Hplsql.main(Hplsql.java:23)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
> I attached a jstack.
> When I use a HiveServer2 connection instead, I get an NPE but it doesn't hang 
> (at least not on the client side).
> Version = 3.0.0-SNAPSHOT r71f52d8ad512904b3f2c4f04fe39a33f2834f1f2



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17389) Yetus is always failing on rat checks

2017-08-25 Thread Peter Vary (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16141664#comment-16141664
 ] 

Peter Vary commented on HIVE-17389:
---

+1

> Yetus is always failing on rat checks
> -
>
> Key: HIVE-17389
> URL: https://issues.apache.org/jira/browse/HIVE-17389
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Barna Zsombor Klara
>Assignee: Barna Zsombor Klara
>Priority: Trivial
> Attachments: HIVE-17389.01.patch
>
>
> Rat checks are failing on metastore_db/dblock and files under patchprocess 
> created by Yetus itself.
> Both directories should be excluded from rat checks.
> CC: [~pvary] [~kgyrtkirk]



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17318) Make Hikari CP configurable using hive properties in hive-site.xml

2017-08-25 Thread Peter Vary (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-17318:
--
Resolution: Fixed
Fix Version/s: 3.0.0
Status: Resolved  (was: Patch Available)

Pushed to master.
Thanks for the patch [~zsombor.klara]!

May I ask you to update the documentation too?

Thanks,
Peter
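
For anyone looking for the shape of the change, a minimal sketch of this kind of property forwarding, assuming a Hive-side prefix is stripped before handing keys to HikariCP (the prefix handling below is hypothetical; see the committed patch for the actual property names):

{code:java}
import java.util.Properties;
import com.zaxxer.hikari.HikariConfig;

public class HikariFromHiveProps {
  // Copy every property under the (hypothetical) prefix into a HikariCP
  // config, stripping the prefix so HikariCP sees its native keys such as
  // connectionTimeout or maximumPoolSize.
  static HikariConfig fromHiveProps(Properties hiveProps, String prefix) {
    Properties hikariProps = new Properties();
    for (String name : hiveProps.stringPropertyNames()) {
      if (name.startsWith(prefix)) {
        hikariProps.setProperty(name.substring(prefix.length()),
            hiveProps.getProperty(name));
      }
    }
    return new HikariConfig(hikariProps); // HikariConfig accepts Properties
  }
}
{code}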

> Make Hikari CP configurable using hive properties in hive-site.xml
> --
>
> Key: HIVE-17318
> URL: https://issues.apache.org/jira/browse/HIVE-17318
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Barna Zsombor Klara
>Assignee: Barna Zsombor Klara
> Fix For: 3.0.0
>
> Attachments: HIVE-17318.01.patch, HIVE-17318.02.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17389) Yetus is always failing on rat checks

2017-08-25 Thread Barna Zsombor Klara (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Barna Zsombor Klara updated HIVE-17389:
---
Status: Patch Available  (was: Open)

> Yetus is always failing on rat checks
> -
>
> Key: HIVE-17389
> URL: https://issues.apache.org/jira/browse/HIVE-17389
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Barna Zsombor Klara
>Assignee: Barna Zsombor Klara
>Priority: Trivial
> Attachments: HIVE-17389.01.patch
>
>
> Rat checks are failing on metastore_db/dblock and files under patchprocess 
> created by Yetus itself.
> Both directories should be excluded from rat checks.
> CC: [~pvary] [~kgyrtkirk]



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17389) Yetus is always failing on rat checks

2017-08-25 Thread Barna Zsombor Klara (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Barna Zsombor Klara updated HIVE-17389:
---
Attachment: HIVE-17389.01.patch

> Yetus is always failing on rat checks
> -
>
> Key: HIVE-17389
> URL: https://issues.apache.org/jira/browse/HIVE-17389
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Barna Zsombor Klara
>Assignee: Barna Zsombor Klara
>Priority: Trivial
> Attachments: HIVE-17389.01.patch
>
>
> Rat checks are failing on metastore_db/dblock and files under patchprocess 
> created by Yetus itself.
> Both directories should be excluded from rat checks.
> CC: [~pvary] [~kgyrtkirk]



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17389) Yetus is always failing on rat checks

2017-08-25 Thread Barna Zsombor Klara (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Barna Zsombor Klara reassigned HIVE-17389:
--


> Yetus is always failing on rat checks
> -
>
> Key: HIVE-17389
> URL: https://issues.apache.org/jira/browse/HIVE-17389
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Barna Zsombor Klara
>Assignee: Barna Zsombor Klara
>Priority: Trivial
>
> Rat checks are failing on metastore_db/dblock and files under patchprocess 
> created by Yetus itself.
> Both directories should be excluded from rat checks.
> CC: [~pvary] [~kgyrtkirk]



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17388) Spark Stats for the WebUI Query Plan

2017-08-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16141651#comment-16141651
 ] 

Hive QA commented on HIVE-17388:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12883732/running_1.png

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6539/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6539/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6539/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2017-08-25 13:53:41.423
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-6539/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2017-08-25 13:53:41.426
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 835c066 HIVE-17380 : refactor LlapProtocolClientProxy to be 
usable with other protocols (Sergey Shelukhin, reviewed by Siddharth Seth)
+ git clean -f -d
Removing 
ql/src/java/org/apache/hadoop/hive/ql/exec/tez/RestrictedConfigChecker.java
Removing 
ql/src/java/org/apache/hadoop/hive/ql/exec/tez/SessionExpirationTracker.java
Removing ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPool.java
Removing 
ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolSession.java
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 835c066 HIVE-17380 : refactor LlapProtocolClientProxy to be 
usable with other protocols (Sergey Shelukhin, reviewed by Siddharth Seth)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2017-08-25 13:53:45.011
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
patch:  Only garbage was found in the patch input.
patch:  Only garbage was found in the patch input.
patch:  Only garbage was found in the patch input.
fatal: unrecognized input
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12883732 - PreCommit-HIVE-Build

> Spark Stats for the WebUI Query Plan
> 
>
> Key: HIVE-17388
> URL: https://issues.apache.org/jira/browse/HIVE-17388
> Project: Hive
>  Issue Type: Improvement
>  Components: Web UI
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Minor
>  Labels: features, patch
> Attachments: HIVE-17388.patch, running_1.png, running_2.png, 
> success_1.png, success_2.png, success_3.png
>
>
> Click on a Spark stage in the WebUI/Drilldown/Query Plan graph, and Spark 
> task progress as well as the log file path will be displayed if 
> hive.server2.webui.show.stats=true. If the task is successful, 
> SparkStatistics will also be shown.
> Screenshots attached are from a run on a CDH cluster.
> Issues:
> * SparkStatistics aren't shown if task fails or is running.
> * Will need rebasing after HIVE-17300 is committed (current patch includes 
> HIVE-17300 changes)
> * Will need testing upstream. 
> Suggestion
> * It would be really easy to incorporate a progress bar to follow Spark 
> progress, with only a few tweaks to the JavaScript in:
> service/src/resources/hive-webapps/static/js/query-plan-graph.js



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17107) Upgrade Yetus to 0.5.0

2017-08-25 Thread Peter Vary (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16141637#comment-16141637
 ] 

Peter Vary commented on HIVE-17107:
---

+1 LGTM

> Upgrade Yetus to 0.5.0
> --
>
> Key: HIVE-17107
> URL: https://issues.apache.org/jira/browse/HIVE-17107
> Project: Hive
>  Issue Type: Task
>  Components: Testing Infrastructure
>Reporter: Peter Vary
>Assignee: Barna Zsombor Klara
> Attachments: HIVE-17107.01.patch
>
>
> [Yetus 0.5.0|https://yetus.apache.org/documentation/0.5.0/RELEASENOTES/] is 
> released, and it contains our fixes.
> We should upgrade and remove our extra patched files.
> CC: [~kgyrtkirk]



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17107) Upgrade Yetus to 0.5.0

2017-08-25 Thread Barna Zsombor Klara (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Barna Zsombor Klara updated HIVE-17107:
---
Status: Patch Available  (was: Open)

> Upgrade Yetus to 0.5.0
> --
>
> Key: HIVE-17107
> URL: https://issues.apache.org/jira/browse/HIVE-17107
> Project: Hive
>  Issue Type: Task
>  Components: Testing Infrastructure
>Reporter: Peter Vary
>Assignee: Barna Zsombor Klara
> Attachments: HIVE-17107.01.patch
>
>
> [Yetus 0.5.0|https://yetus.apache.org/documentation/0.5.0/RELEASENOTES/] is 
> released, and it contains our fixes.
> We should upgrade and remove our extra patched files.
> CC: [~kgyrtkirk]



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17107) Upgrade Yetus to 0.5.0

2017-08-25 Thread Barna Zsombor Klara (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Barna Zsombor Klara reassigned HIVE-17107:
--

Assignee: Barna Zsombor Klara  (was: Peter Vary)

> Upgrade Yetus to 0.5.0
> --
>
> Key: HIVE-17107
> URL: https://issues.apache.org/jira/browse/HIVE-17107
> Project: Hive
>  Issue Type: Task
>  Components: Testing Infrastructure
>Reporter: Peter Vary
>Assignee: Barna Zsombor Klara
> Attachments: HIVE-17107.01.patch
>
>
> [Yetus 0.5.0|https://yetus.apache.org/documentation/0.5.0/RELEASENOTES/] is 
> released, and it contains our fixes.
> We should upgrade and remove our extra patched files.
> CC: [~kgyrtkirk]



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17107) Upgrade Yetus to 0.5.0

2017-08-25 Thread Barna Zsombor Klara (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Barna Zsombor Klara updated HIVE-17107:
---
Attachment: HIVE-17107.01.patch

Uploading patch.

> Upgrade Yetus to 0.5.0
> --
>
> Key: HIVE-17107
> URL: https://issues.apache.org/jira/browse/HIVE-17107
> Project: Hive
>  Issue Type: Task
>  Components: Testing Infrastructure
>Reporter: Peter Vary
>Assignee: Barna Zsombor Klara
> Attachments: HIVE-17107.01.patch
>
>
> [Yetus 0.5.0|https://yetus.apache.org/documentation/0.5.0/RELEASENOTES/] is 
> released, and it contains our fixes.
> We should upgrade and remove our extra patched files.
> CC: [~kgyrtkirk]



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17388) Spark Stats for the WebUI Query Plan

2017-08-25 Thread Karen Coppage (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karen Coppage updated HIVE-17388:
-
Assignee: Karen Coppage
Status: Patch Available  (was: Open)

> Spark Stats for the WebUI Query Plan
> 
>
> Key: HIVE-17388
> URL: https://issues.apache.org/jira/browse/HIVE-17388
> Project: Hive
>  Issue Type: Improvement
>  Components: Web UI
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Minor
>  Labels: features, patch
> Attachments: HIVE-17388.patch, running_1.png, running_2.png, 
> success_1.png, success_2.png, success_3.png
>
>
> Click on a Spark stage in the WebUI/Drilldown/Query Plan graph, and Spark 
> task progress as well as the log file path will be displayed if 
> hive.server2.webui.show.stats=true. If the task is successful, 
> SparkStatistics will also be shown.
> Screenshots attached are from a run on a CDH cluster.
> Issues:
> * SparkStatistics aren't shown if task fails or is running.
> * Will need rebasing after HIVE-17300 is committed (current patch includes 
> HIVE-17300 changes)
> * Will need testing upstream. 
> Suggestion
> * It would be really easy to incorporate a progress bar to follow Spark 
> progress, with only a few tweaks to the JavaScript in:
> service/src/resources/hive-webapps/static/js/query-plan-graph.js



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17388) Spark Stats for the WebUI Query Plan

2017-08-25 Thread Karen Coppage (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16141617#comment-16141617
 ] 

Karen Coppage commented on HIVE-17388:
--

[~xuefuz] Would you mind taking a look at this addition to the Query Plan 
visualizations?

> Spark Stats for the WebUI Query Plan
> 
>
> Key: HIVE-17388
> URL: https://issues.apache.org/jira/browse/HIVE-17388
> Project: Hive
>  Issue Type: Improvement
>  Components: Web UI
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Minor
>  Labels: features, patch
> Attachments: HIVE-17388.patch, running_1.png, running_2.png, 
> success_1.png, success_2.png, success_3.png
>
>
> Click on a Spark stage in the WebUI/Drilldown/Query Plan graph, and Spark 
> task progress as well as the log file path will be displayed if 
> hive.server2.webui.show.stats=true. If the task is successful, 
> SparkStatistics will also be shown.
> Screenshots attached are from a run on a CDH cluster.
> Issues:
> * SparkStatistics aren't shown if task fails or is running.
> * Will need rebasing after HIVE-17300 is committed (current patch includes 
> HIVE-17300 changes)
> * Will need testing upstream. 
> Suggestion
> * It would be really easy to incorporate a progress bar to follow Spark 
> progress, with only a few tweaks to the JavaScript in:
> service/src/resources/hive-webapps/static/js/query-plan-graph.js



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17388) Spark Stats for the WebUI Query Plan

2017-08-25 Thread Karen Coppage (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karen Coppage updated HIVE-17388:
-
Description: 
Click on a Spark stage in the WebUI/Drilldown/Query Plan graph, and Spark task 
progress as well as the log file path will be displayed if 
hive.server2.webui.show.stats=true. If the task is successful, SparkStatistics 
will also be shown.
Screenshots attached are from a run on a CDH cluster.

Issues:
* SparkStatistics aren't shown if task fails or is running.
* Will need rebasing after HIVE-17300 is committed (current patch includes 
HIVE-17300 changes)
* Will need testing upstream. 

Suggestion
* It would be really easy to incorporate a progress bar to follow Spark 
progress, with only a few tweaks to the JavaScript in:
service/src/resources/hive-webapps/static/js/query-plan-graph.js

  was:
Click on a Spark stage in the WebUI/Drilldown/Query Plan graph, and Spark task 
progress as well as the log file path will be displayed if 
hive.server2.webui.show.stats=true.
If the task is successful, SparkStatistics will also be shown.

Issues:
* SparkStatistics aren't shown if task fails or is running.
* Will need rebasing after HIVE-17300 is committed (current patch includes 
HIVE-17300 changes)
* Will need testing upstream. 

Suggestion
* It would be really easy to incorporate a progress bar to follow Spark 
progress, with only a few tweaks to the JavaScript in:
service/src/resources/hive-webapps/static/js/query-plan-graph.js
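
As a rough illustration of the gating the description above implies (hypothetical code; the actual patch reads hive.server2.webui.show.stats through HiveConf and renders the result in the WebUI):

{code:java}
import java.util.Properties;

public class ShowStatsGateSketch {
  // Emit Spark statistics only when the WebUI stats flag is on and the task
  // succeeded; while running or after a failure only progress and the log
  // path are available (matching the limitation listed above).
  static String renderStats(Properties conf, boolean succeeded, String sparkStats) {
    boolean show = Boolean.parseBoolean(
        conf.getProperty("hive.server2.webui.show.stats", "false"));
    if (!show) {
      return "";
    }
    return succeeded ? sparkStats : "stats unavailable until the task succeeds";
  }
}
{code}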


> Spark Stats for the WebUI Query Plan
> 
>
> Key: HIVE-17388
> URL: https://issues.apache.org/jira/browse/HIVE-17388
> Project: Hive
>  Issue Type: Improvement
>  Components: Web UI
>Reporter: Karen Coppage
>Priority: Minor
>  Labels: features, patch
> Attachments: HIVE-17388.patch, running_1.png, running_2.png, 
> success_1.png, success_2.png, success_3.png
>
>
> Click on a Spark stage in the WebUI/Drilldown/Query Plan graph, and Spark 
> task progress as well as the log file path will be displayed if 
> hive.server2.webui.show.stats=true. If the task is successful, 
> SparkStatistics will also be shown.
> Screenshots attached are from a run on a CDH cluster.
> Issues:
> * SparkStatistics aren't shown if task fails or is running.
> * Will need rebasing after HIVE-17300 is committed (current patch includes 
> HIVE-17300 changes)
> * Will need testing upstream. 
> Suggestion
> * It would be really easy to incorporate a progress bar to follow Spark 
> progress, with only a few tweaks to the JavaScript in:
> service/src/resources/hive-webapps/static/js/query-plan-graph.js



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17388) Spark Stats for the WebUI Query Plan

2017-08-25 Thread Karen Coppage (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karen Coppage updated HIVE-17388:
-
Attachment: running_1.png
running_2.png
success_1.png
success_2.png
success_3.png

> Spark Stats for the WebUI Query Plan
> 
>
> Key: HIVE-17388
> URL: https://issues.apache.org/jira/browse/HIVE-17388
> Project: Hive
>  Issue Type: Improvement
>  Components: Web UI
>Reporter: Karen Coppage
>Priority: Minor
>  Labels: features, patch
> Attachments: HIVE-17388.patch, running_1.png, running_2.png, 
> success_1.png, success_2.png, success_3.png
>
>
> Click on a Spark stage in the WebUI/Drilldown/Query Plan graph, and Spark 
> task progress as well as the log file path will be displayed if 
> hive.server2.webui.show.stats=true.
> If the task is successful, SparkStatistics will also be shown.
> Issues:
> * SparkStatistics aren't shown if task fails or is running.
> * Will need rebasing after HIVE-17300 is committed (current patch includes 
> HIVE-17300 changes)
> * Will need testing upstream. 
> Suggestion
> * It would be really easy to incorporate a progress bar to follow Spark 
> progress, with only a few tweaks to the JavaScript in:
> service/src/resources/hive-webapps/static/js/query-plan-graph.js



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17388) Spark Stats for the WebUI Query Plan

2017-08-25 Thread Karen Coppage (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karen Coppage updated HIVE-17388:
-
Attachment: HIVE-17388.patch

> Spark Stats for the WebUI Query Plan
> 
>
> Key: HIVE-17388
> URL: https://issues.apache.org/jira/browse/HIVE-17388
> Project: Hive
>  Issue Type: Improvement
>  Components: Web UI
>Reporter: Karen Coppage
>Priority: Minor
>  Labels: features, patch
> Attachments: HIVE-17388.patch
>
>
> Click on a Spark stage in the WebUI/Drilldown/Query Plan graph, and Spark 
> task progress as well as the log file path will be displayed if 
> hive.server2.webui.show.stats=true.
> If the task is successful, SparkStatistics will also be shown.
> Issues:
> * SparkStatistics aren't shown if task fails or is running.
> * Will need rebasing after HIVE-17300 is committed (current patch includes 
> HIVE-17300 changes)
> * Will need testing upstream. 
> Suggestion
> * It would be really easy to incorporate a progress bar to follow Spark 
> progress, with only a few tweaks to the JavaScript in:
> service/src/resources/hive-webapps/static/js/query-plan-graph.js



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

