[jira] [Updated] (HIVE-16071) Spark remote driver misuses the timeout in RPC handshake

2017-03-17 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-16071:
---
Attachment: HIVE-16071.patch

> Spark remote driver misuses the timeout in RPC handshake
> 
>
> Key: HIVE-16071
> URL: https://issues.apache.org/jira/browse/HIVE-16071
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-16071.patch, HIVE-16071.patch, HIVE-16071.patch
>
>
> Based on its property description in HiveConf and the comments in HIVE-12650 
> (https://issues.apache.org/jira/browse/HIVE-12650?focusedCommentId=15128979&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15128979),
> hive.spark.client.connect.timeout is the timeout for the Spark remote driver 
> to make a socket connection (channel) to the RPC server. But currently it is 
> also used by the remote driver for the RPC client/server handshake, which is 
> not right. Instead, hive.spark.client.server.connect.timeout should be used, 
> as it is already used by the RPCServer in the handshake.
> An error like the following is usually caused by this issue, since the 
> default hive.spark.client.connect.timeout value (1000 ms) used by the remote 
> driver for the handshake is a little too short.
> {code}
> 17/02/20 08:46:08 ERROR yarn.ApplicationMaster: User class threw exception: 
> java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: 
> Client closed before SASL negotiation finished.
> java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: 
> Client closed before SASL negotiation finished.
> at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
> at 
> org.apache.hive.spark.client.RemoteDriver.<init>(RemoteDriver.java:156)
> at 
> org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:556)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:542)
> Caused by: javax.security.sasl.SaslException: Client closed before SASL 
> negotiation finished.
> at 
> org.apache.hive.spark.client.rpc.Rpc$SaslClientHandler.dispose(Rpc.java:453)
> at 
> org.apache.hive.spark.client.rpc.SaslHandler.channelInactive(SaslHandler.java:90)
> {code}
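For context, here is a minimal, self-contained sketch of how the two properties are meant to divide the work. It is illustrative only: the class and variable names are invented, the 1000 ms default comes from the description above, and the 90000 ms server-connect default is an assumption about the usual HiveConf value.

{code}
// Illustrative sketch only -- not the actual RemoteDriver code.
import java.util.Properties;

public class TimeoutSketch {
  public static void main(String[] args) {
    Properties conf = new Properties();
    conf.setProperty("hive.spark.client.connect.timeout", "1000");         // ms
    conf.setProperty("hive.spark.client.server.connect.timeout", "90000"); // ms (assumed default)

    // The short timeout should bound only the socket (channel) connection.
    long connectTimeoutMs =
        Long.parseLong(conf.getProperty("hive.spark.client.connect.timeout"));

    // The longer timeout should bound the SASL client/server handshake, on
    // the remote-driver side as well as on the RPCServer side.
    long handshakeTimeoutMs =
        Long.parseLong(conf.getProperty("hive.spark.client.server.connect.timeout"));

    System.out.println("socket connect bounded by " + connectTimeoutMs + " ms");
    System.out.println("SASL handshake bounded by " + handshakeTimeoutMs + " ms");
  }
}
{code}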



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16189) Table column stats might be invalidated in a failed table rename

2017-03-17 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15929995#comment-15929995
 ] 

Chaoyu Tang commented on HIVE-16189:


The precommit build was run, but its result was not published and linked to 
this JIRA (https://builds.apache.org/job/PreCommit-HIVE-Build/4200/). Two 
tests failed, but neither of them is related to this patch.

> Table column stats might be invalidated in a failed table rename
> 
>
> Key: HIVE-16189
> URL: https://issues.apache.org/jira/browse/HIVE-16189
> Project: Hive
>  Issue Type: Bug
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-16189.1.patch, HIVE-16189.2.patch, 
> HIVE-16189.2.patch, HIVE-16189.patch
>
>
> If a table rename does not succeed because it fails to move the data to the 
> renamed table folder, the changes in TAB_COL_STATS are not rolled back, 
> which leads to invalid column stats.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16189) Table column stats might be invalidated in a failed table rename

2017-03-17 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-16189:
---
   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Committed to 2.2.0. Thanks [~pxiong] for review.

> Table column stats might be invalidated in a failed table rename
> 
>
> Key: HIVE-16189
> URL: https://issues.apache.org/jira/browse/HIVE-16189
> Project: Hive
>  Issue Type: Bug
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Fix For: 2.2.0
>
> Attachments: HIVE-16189.1.patch, HIVE-16189.2.patch, 
> HIVE-16189.2.patch, HIVE-16189.patch
>
>
> If a table rename does not succeed because it fails to move the data to the 
> renamed table folder, the changes in TAB_COL_STATS are not rolled back, 
> which leads to invalid column stats.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (HIVE-16189) Table column stats might be invalidated in a failed table rename

2017-03-14 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15925409#comment-15925409
 ] 

Chaoyu Tang edited comment on HIVE-16189 at 3/15/17 1:57 AM:
-

1. Fixed the failed tests.
2. Added a test based on [~pxiong]'s suggestion; the test scenario is as 
follows (see encryption_move_tbl.q):
When renaming a table fails to move the table data from one encryption zone to 
another due to EZ incompatibility, the table rename fails but its column stats 
are invalidated. When we describe the formatted table column, we find that all 
the column stats are gone.

[~pxiong], could you review it to see if it makes sense? Thanks.



was (Author: ctang.ma):
1. Fixed the failed tests.
2. Added a test based on [~pxiong]'s suggestion; the test scenario is as 
follows (see encryption_move_tbl.q):
When renaming a table fails to move the table data from one encryption zone to 
another due to EZ incompatibility, the table rename fails but its column stats 
are invalidated. When we describe the formatted table column, we find that all 
the column stats are gone.


> Table column stats might be invalidated in a failed table rename
> 
>
> Key: HIVE-16189
> URL: https://issues.apache.org/jira/browse/HIVE-16189
> Project: Hive
>  Issue Type: Bug
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-16189.1.patch, HIVE-16189.patch
>
>
> If a table rename does not succeed because it fails to move the data to the 
> renamed table folder, the changes in TAB_COL_STATS are not rolled back, 
> which leads to invalid column stats.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16189) Table column stats might be invalidated in a failed table rename

2017-03-14 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-16189:
---
Attachment: HIVE-16189.1.patch

1. Fixed the failed tests.
2. Added a test based on [~pxiong]'s suggestion; the test scenario is as 
follows (see encryption_move_tbl.q):
When renaming a table fails to move the table data from one encryption zone to 
another due to EZ incompatibility, the table rename fails but its column stats 
are invalidated. When we describe the formatted table column, we find that all 
the column stats are gone.


> Table column stats might be invalidated in a failed table rename
> 
>
> Key: HIVE-16189
> URL: https://issues.apache.org/jira/browse/HIVE-16189
> Project: Hive
>  Issue Type: Bug
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-16189.1.patch, HIVE-16189.patch
>
>
> If a table rename does not succeed because it fails to move the data to the 
> renamed table folder, the changes in TAB_COL_STATS are not rolled back, 
> which leads to invalid column stats.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16394) HoS does not support queue name change in middle of session

2017-04-11 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15964399#comment-15964399
 ] 

Chaoyu Tang commented on HIVE-16394:


Thanks [~leftylev]. This property is not HoS specific and already works in 
HoMR, so I don't think it needs to be documented separately.

> HoS does not support queue name change in middle of session
> ---
>
> Key: HIVE-16394
> URL: https://issues.apache.org/jira/browse/HIVE-16394
> Project: Hive
>  Issue Type: Bug
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Fix For: 3.0.0
>
> Attachments: HIVE-16394.patch
>
>
> The mapreduce.job.queuename setting only takes effect when HoS executes its 
> first query. After that, changing mapreduce.job.queuename won't change the 
> YARN scheduler queue name for the query.
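A minimal sketch of one way to address this, with hypothetical method names (this is not the actual SparkSessionManager code): compare the requested queue against the queue the active Spark session was opened with, and re-create the session when it differs.

{code}
// Illustrative sketch only; closeSparkSession/openSparkSession are made up.
import java.util.Map;
import java.util.Objects;

public class QueueChangeSketch {
  static String activeQueue; // queue the running Spark session was opened with

  static void beforeQuery(Map<String, String> conf) {
    String requested = conf.get("mapreduce.job.queuename");
    if (activeQueue != null && !Objects.equals(activeQueue, requested)) {
      closeSparkSession();         // tear down the old yarn application
      openSparkSession(requested); // the next query starts in the new queue
    }
  }

  static void closeSparkSession() { /* ... */ }

  static void openSparkSession(String queue) {
    activeQueue = queue;
  }
}
{code}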



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16061) When hive.async.log.enabled is set to true, some output is not printed to the beeline console

2017-04-02 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15952898#comment-15952898
 ] 

Chaoyu Tang commented on HIVE-16061:


LGTM, +1

> When hive.async.log.enabled is set to true, some output is not printed to the 
> beeline console
> -
>
> Key: HIVE-16061
> URL: https://issues.apache.org/jira/browse/HIVE-16061
> Project: Hive
>  Issue Type: Bug
>  Components: Logging
>Affects Versions: 2.1.1
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-16061.1.patch, HIVE-16061.2.patch, 
> HIVE-16061.3.patch, HIVE-16061.4.patch
>
>
> Run a HiveServer2 instance: "hive --service hiveserver2".
> Then, from another console, connect to HiveServer2: "beeline -u 
> jdbc:hive2://localhost:10000".
> When you run an MR job like "select t1.key from src t1 join src t2 on 
> t1.key=t2.key", some of the console logs, such as MR job info, are not 
> printed to the beeline console; they are only printed to the HiveServer2 
> console.
> When hive.async.log.enabled is set to false and HiveServer2 is restarted, 
> the output is printed to the beeline console.
> The OperationLog implementation uses a ThreadLocal variable to store the 
> associated log file. When hive.async.log.enabled is set to true, the logs 
> are processed by a thread pool, and the actual pool threads that print the 
> messages cannot access the log file stored in the original thread.
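A self-contained demo of the ThreadLocal behavior described above (this is not the OperationLog code itself): a value set on one thread is invisible to another thread, such as a thread from the async logger's pool.

{code}
// Runs standalone; prints "null" from the pool thread.
public class ThreadLocalDemo {
  static final ThreadLocal<String> OPERATION_LOG_FILE = new ThreadLocal<>();

  public static void main(String[] args) throws InterruptedException {
    OPERATION_LOG_FILE.set("/tmp/operation.log"); // set on the "query" thread

    Thread poolThread = new Thread(
        // The pool thread has its own, empty ThreadLocal slot, so it cannot
        // find the log file associated with the query thread.
        () -> System.out.println(OPERATION_LOG_FILE.get()));
    poolThread.start();
    poolThread.join();
  }
}
{code}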



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16394) HoS does not support queue name change in middle of session

2017-04-06 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15959049#comment-15959049
 ] 

Chaoyu Tang commented on HIVE-16394:


The test failures are not related to this patch.

> HoS does not support queue name change in middle of session
> ---
>
> Key: HIVE-16394
> URL: https://issues.apache.org/jira/browse/HIVE-16394
> Project: Hive
>  Issue Type: Bug
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-16394.patch
>
>
> The mapreduce.job.queuename setting only takes effect when HoS executes its 
> first query. After that, changing mapreduce.job.queuename won't change the 
> YARN scheduler queue name for the query.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16334) Query lock contains the query string, which can cause OOM on ZooKeeper

2017-04-06 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15959261#comment-15959261
 ] 

Chaoyu Tang commented on HIVE-16334:


LGTM, +1

> Query lock contains the query string, which can cause OOM on ZooKeeper
> --
>
> Key: HIVE-16334
> URL: https://issues.apache.org/jira/browse/HIVE-16334
> Project: Hive
>  Issue Type: Improvement
>  Components: Locking
>Reporter: Peter Vary
>Assignee: Peter Vary
> Attachments: HIVE-16334.2.patch, HIVE-16334.3.patch, 
> HIVE-16334.4.patch, HIVE-16334.patch
>
>
> When there is a big number of partitions in a query, this will result in a 
> huge number of locks on ZooKeeper. Since each lock object contains the whole 
> query string, this might cause serious memory pressure on the ZooKeeper 
> services.
> It would be good to have the possibility to truncate the query strings that 
> are written into the locks.
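A minimal sketch of the truncation idea; the helper name and the way the limit would be configured are hypothetical, not necessarily what the patch does:

{code}
// Cap the query string before it is embedded into each per-partition lock.
public final class LockDataSketch {
  static String truncateQueryForLock(String query, int maxLength) {
    if (query == null || query.length() <= maxLength) {
      return query;
    }
    return query.substring(0, maxLength);
  }

  public static void main(String[] args) {
    String longQuery = "select * from t where p in (...)"; // imagine a huge IN list
    // With thousands of partition locks, storing only a prefix keeps the
    // total payload held by ZooKeeper bounded.
    System.out.println(truncateQueryForLock(longQuery, 16));
  }
}
{code}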



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16334) Query lock contains the query string, which can cause OOM on ZooKeeper

2017-04-07 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-16334:
---
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Committed to 3.0.0. Thanks [~pvary] for the patch. I think you may need to 
update the documentation for the new property.

> Query lock contains the query string, which can cause OOM on ZooKeeper
> --
>
> Key: HIVE-16334
> URL: https://issues.apache.org/jira/browse/HIVE-16334
> Project: Hive
>  Issue Type: Improvement
>  Components: Locking
>Reporter: Peter Vary
>Assignee: Peter Vary
> Fix For: 3.0.0
>
> Attachments: HIVE-16334.2.patch, HIVE-16334.3.patch, 
> HIVE-16334.4.patch, HIVE-16334.patch
>
>
> When there is a big number of partitions in a query, this will result in a 
> huge number of locks on ZooKeeper. Since each lock object contains the whole 
> query string, this might cause serious memory pressure on the ZooKeeper 
> services.
> It would be good to have the possibility to truncate the query strings that 
> are written into the locks.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15538) Test HIVE-13884 with more complex query predicates

2017-04-07 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-15538:
---
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Committed to 3.0.0. Thanks [~kuczoram] for the patch.

> Test HIVE-13884 with more complex query predicates
> --
>
> Key: HIVE-15538
> URL: https://issues.apache.org/jira/browse/HIVE-15538
> Project: Hive
>  Issue Type: Test
>  Components: Test
>Affects Versions: 2.2.0
>Reporter: Marta Kuczora
>Assignee: Marta Kuczora
> Fix For: 3.0.0
>
> Attachments: HIVE-15538.2.patch, HIVE-15538.3.patch, HIVE-15538.patch
>
>
> HIVE-13884 introduced a new property, hive.metastore.limit.partition.request. 
> It would be good to have more tests covering the cases where the query 
> predicates (such as LIKE, IN) cannot be pushed down, to see whether the 
> fallback from direct SQL to ORM works properly when 
> hive.metastore.try.direct.sql is enabled.
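For context, an illustrative sketch of the fallback path these tests exercise; the method names are hypothetical, and the real logic lives in the metastore's ObjectStore:

{code}
// When a predicate cannot be evaluated via direct SQL, fall back to ORM.
import java.util.Collections;
import java.util.List;

public class DirectSqlFallbackSketch {
  List<String> listPartitionsByFilter(String filter) {
    try {
      return listViaDirectSql(filter); // fast path
    } catch (RuntimeException e) {
      // e.g. a LIKE/IN predicate the direct-SQL generator cannot handle
      return listViaOrm(filter);       // slower, but complete, fallback
    }
  }

  List<String> listViaDirectSql(String filter) {
    throw new RuntimeException("unsupported predicate: " + filter);
  }

  List<String> listViaOrm(String filter) {
    return Collections.emptyList();
  }
}
{code}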



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16394) HoS does not support queue name change in middle of session

2017-04-07 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-16394:
---
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Committed to 3.0.0. Thanks [~xuefuz], [~lirui] for review.

> HoS does not support queue name change in middle of session
> ---
>
> Key: HIVE-16394
> URL: https://issues.apache.org/jira/browse/HIVE-16394
> Project: Hive
>  Issue Type: Bug
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Fix For: 3.0.0
>
> Attachments: HIVE-16394.patch
>
>
> The mapreduce.job.queuename setting only takes effect when HoS executes its 
> first query. After that, changing mapreduce.job.queuename won't change the 
> YARN scheduler queue name for the query.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16394) HoS does not support queue name change in middle of session

2017-04-05 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-16394:
---
Attachment: HIVE-16394.patch

> HoS does not support queue name change in middle of session
> ---
>
> Key: HIVE-16394
> URL: https://issues.apache.org/jira/browse/HIVE-16394
> Project: Hive
>  Issue Type: Bug
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-16394.patch
>
>
> The mapreduce.job.queuename setting only takes effect when HoS executes its 
> first query. After that, changing mapreduce.job.queuename won't change the 
> YARN scheduler queue name for the query.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HIVE-16394) HoS does not support queue name change in middle of session

2017-04-05 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang reassigned HIVE-16394:
--


> HoS does not support queue name change in middle of session
> ---
>
> Key: HIVE-16394
> URL: https://issues.apache.org/jira/browse/HIVE-16394
> Project: Hive
>  Issue Type: Bug
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
>
> The mapreduce.job.queuename setting only takes effect when HoS executes its 
> first query. After that, changing mapreduce.job.queuename won't change the 
> YARN scheduler queue name for the query.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16394) HoS does not support queue name change in middle of session

2017-04-05 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-16394:
---
Status: Patch Available  (was: Open)

[~xuefuz], [~lirui], could you review the patch to see if it makes sense? Thanks.

> HoS does not support queue name change in middle of session
> ---
>
> Key: HIVE-16394
> URL: https://issues.apache.org/jira/browse/HIVE-16394
> Project: Hive
>  Issue Type: Bug
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-16394.patch
>
>
> The mapreduce.job.queuename setting only takes effect when HoS executes its 
> first query. After that, changing mapreduce.job.queuename won't change the 
> YARN scheduler queue name for the query.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15538) Test HIVE-13884 with more complex query predicates

2017-04-05 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15957060#comment-15957060
 ] 

Chaoyu Tang commented on HIVE-15538:


LGTM, +1

> Test HIVE-13884 with more complex query predicates
> --
>
> Key: HIVE-15538
> URL: https://issues.apache.org/jira/browse/HIVE-15538
> Project: Hive
>  Issue Type: Test
>  Components: Test
>Affects Versions: 2.2.0
>Reporter: Marta Kuczora
>Assignee: Marta Kuczora
> Attachments: HIVE-15538.2.patch, HIVE-15538.3.patch, HIVE-15538.patch
>
>
> HIVE-13884 introduced a new property, hive.metastore.limit.partition.request. 
> It would be good to have more tests covering the cases where the query 
> predicates (such as LIKE, IN) cannot be pushed down, to see whether the 
> fallback from direct SQL to ORM works properly when 
> hive.metastore.try.direct.sql is enabled.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16487) Serious Zookeeper exception is logged when a race condition happens

2017-04-20 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15976624#comment-15976624
 ] 

Chaoyu Tang commented on HIVE-16487:


[~pvary], your analysis makes sense. Thanks.

> Serious Zookeeper exception is logged when a race condition happens
> ---
>
> Key: HIVE-16487
> URL: https://issues.apache.org/jira/browse/HIVE-16487
> Project: Hive
>  Issue Type: Bug
>  Components: Locking
>Affects Versions: 3.0.0
>Reporter: Peter Vary
>Assignee: Peter Vary
>
> A customer started to see this in the logs, but happily everything was 
> working as intended:
> {code}
> 2017-03-30 12:01:59,446 ERROR ZooKeeperHiveLockManager: 
> [HiveServer2-Background-Pool: Thread-620]: Serious Zookeeper exception: 
> org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = 
> NoNode for /hive_zookeeper_namespace//LOCK-SHARED-
> {code}
> This was happening because of a race condition between the lock release and 
> the lock acquisition. The thread releasing the lock removes the parent ZK 
> node just after the thread acquiring the lock has made sure that the parent 
> node exists.
> Since this can happen without causing any real problem, I plan to add 
> NODEEXISTS and NONODE as transient ZooKeeper exceptions, so that users are 
> not confused.
> Also, the original author of ZooKeeperHiveLockManager may have planned to 
> handle different ZooKeeperExceptions differently, and the code is hard to 
> understand. See the {{continue}} and the {{break}}. The {{break}} only breaks 
> the switch, not the loop, which IMHO is not intuitive:
> {code}
> do {
>   try {
> [..]
> ret = lockPrimitive(key, mode, keepAlive, parentCreated, 
>   } catch (Exception e1) {
> if (e1 instanceof KeeperException) {
>   KeeperException e = (KeeperException) e1;
>   switch (e.code()) {
>   case CONNECTIONLOSS:
>   case OPERATIONTIMEOUT:
> LOG.debug("Possibly transient ZooKeeper exception: ", e);
> continue;
>   default:
> LOG.error("Serious Zookeeper exception: ", e);
> break;
>   }
> }
> [..]
>   }
> } while (tryNum < numRetriesForLock);
> {code}
> If we do not want to try again in case of a "Serious Zookeeper exception:", 
> then we should add a label to the do loop and break out of it in the switch.
> If we do want to retry regardless of the type of the ZK exception, then we 
> should just change the {{continue;}} to {{break;}} and move the part of the 
> code that did not run in case of {{continue}} into the {{default}} case, so 
> the code is easier to understand.
> Any suggestions or ideas [~ctang.ma] or [~szehon]?
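For what it's worth, a self-contained sketch of the labeled-loop variant discussed above, assuming the loop shape quoted in the description; lockPrimitive here is a stand-in for the real lock attempt, and this is not the actual ZooKeeperHiveLockManager code:

{code}
import org.apache.zookeeper.KeeperException;

public class LockRetrySketch {
  Object lockPrimitive() throws KeeperException {
    return null; // stand-in for the real lock attempt
  }

  Object lock(int numRetriesForLock) {
    Object ret = null;
    int tryNum = 0;
    retry:
    do {
      tryNum++;
      try {
        ret = lockPrimitive();
        if (ret != null) {
          break; // lock acquired
        }
      } catch (KeeperException e) {
        switch (e.code()) {
        case CONNECTIONLOSS:
        case OPERATIONTIMEOUT:
        case NONODE:     // proposed: treat as transient, retry
        case NODEEXISTS: // proposed: treat as transient, retry
          continue;
        default:
          break retry;   // leaves the do/while, not just the switch
        }
      }
    } while (tryNum < numRetriesForLock);
    return ret;
  }
}
{code}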



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (HIVE-16071) Spark remote driver misuses the timeout in RPC handshake

2017-03-09 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15903422#comment-15903422
 ] 

Chaoyu Tang edited comment on HIVE-16071 at 3/9/17 5:24 PM:


So we reached the consensus that hive.spark.client.server.connect.timeout 
should not be used for the cancelTask on the RPCServer side. The value 
proposed could be hive.spark.client.connect.timeout.
[~xuefuz] The reason I previously suggested considering another timeout for 
the cancelTask (a little longer than hive.spark.client.connect.timeout) is to 
give the RemoteDriver a little more time than the RPCServer to time out the 
handshake. If the timeouts on both sides are set to exactly the same value, we 
might quite often see situations where the termination of the SASL handshake 
is initiated by the cancelTask on the RpcServer side, because the timeout on 
the RemoteDriver side might fire slightly later for whatever reason. During 
this short window, the handshake could still have a chance to succeed if it is 
not terminated by the cancelTask.
To my understanding, shortening the cancelTask timeout is mainly so that the 
RpcServer detects the handshake timeout (fired by the RemoteDriver) sooner; we 
still want the RemoteDriver to mainly control the SASL handshake timeout, and 
most handshake timeouts should be fired from the RemoteDriver, right?


was (Author: ctang.ma):
So we reached the consensus that hive.spark.client.server.connect.timeout 
should not be used for the cancelTask on the RPCServer side. The value 
proposed could be hive.spark.client.connect.timeout.
[~xuefuz] The reason I previously suggested considering another timeout for 
the cancelTask (a little longer than hive.spark.client.connect.timeout) is to 
give the RemoteDriver a little more time than the RPCServer to time out the 
handshake. If the timeouts on both sides are set to exactly the same value, we 
might quite often see situations where the termination of the SASL handshake 
is initiated by the cancelTask on the RpcServer side, because the timeout on 
the RemoteDriver side might fire slightly later for whatever reason. During 
this short window, the handshake could still have a chance to succeed if it is 
not terminated by the cancelTask.
To my understanding, shortening the cancelTask timeout is mainly so that the 
RpcServer detects the handshake timeout (fired by the RemoteDriver) sooner; we 
still want the RemoteDriver to mainly control the SASL handshake timeout, and 
most handshake timeouts should be fired from the RemoteDriver, right?
In addition, I think we should

> Spark remote driver misuses the timeout in RPC handshake
> 
>
> Key: HIVE-16071
> URL: https://issues.apache.org/jira/browse/HIVE-16071
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-16071.patch
>
>
> Based on its property description in HiveConf and the comments in HIVE-12650 
> (https://issues.apache.org/jira/browse/HIVE-12650?focusedCommentId=15128979&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15128979),
> hive.spark.client.connect.timeout is the timeout for the Spark remote driver 
> to make a socket connection (channel) to the RPC server. But currently it is 
> also used by the remote driver for the RPC client/server handshake, which is 
> not right. Instead, hive.spark.client.server.connect.timeout should be used, 
> as it is already used by the RPCServer in the handshake.
> An error like the following is usually caused by this issue, since the 
> default hive.spark.client.connect.timeout value (1000 ms) used by the remote 
> driver for the handshake is a little too short.
> {code}
> 17/02/20 08:46:08 ERROR yarn.ApplicationMaster: User class threw exception: 
> java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: 
> Client closed before SASL negotiation finished.
> java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: 
> Client closed before SASL negotiation finished.
> at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
> at 
> org.apache.hive.spark.client.RemoteDriver.<init>(RemoteDriver.java:156)
> at 
> org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:556)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:542)
> Caused by: javax.security.sasl.SaslException: Client closed before SASL 
> negotiation finished.
> at 
> 

[jira] [Comment Edited] (HIVE-16071) Spark remote driver misuses the timeout in RPC handshake

2017-03-09 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15903422#comment-15903422
 ] 

Chaoyu Tang edited comment on HIVE-16071 at 3/9/17 5:24 PM:


So we reached the consensus that hive.spark.client.server.connect.timeout 
should not be used for the cancelTask on the RPCServer side. The value 
proposed could be hive.spark.client.connect.timeout.
[~xuefuz] The reason I previously suggested considering another timeout for 
the cancelTask (a little longer than hive.spark.client.connect.timeout) is to 
give the RemoteDriver a little more time than the RPCServer to time out the 
handshake. If the timeouts on both sides are set to exactly the same value, we 
might quite often see situations where the termination of the SASL handshake 
is initiated by the cancelTask on the RpcServer side, because the timeout on 
the RemoteDriver side might fire slightly later for whatever reason. During 
this short window, the handshake could still have a chance to succeed if it is 
not terminated by the cancelTask.
To my understanding, shortening the cancelTask timeout is mainly so that the 
RpcServer detects the handshake timeout (fired by the RemoteDriver) sooner; we 
still want the RemoteDriver to mainly control the SASL handshake timeout, and 
most handshake timeouts should be fired from the RemoteDriver, right?
In addition, I think we should


was (Author: ctang.ma):
So we reached the consensus that hive.spark.client.server.connect.timeout 
should not be used for the cancelTask on the RPCServer side. The value 
proposed could be hive.spark.client.connect.timeout.
[~xuefuz] The reason I previously suggested considering another timeout for 
the cancelTask (a little longer than hive.spark.client.connect.timeout) is to 
give the RemoteDriver a little more time than the RPCServer to time out the 
handshake. If the timeouts on both sides are set to exactly the same value, we 
might quite often see situations where the termination of the SASL handshake 
is initiated by the cancelTask on the RpcServer side, as the timeout on the 
RemoteDriver side might fire slightly later for whatever reason. During this 
short window, the handshake could still succeed if it is not terminated by the 
cancelTask.
To my understanding, we still want the RemoteDriver to mainly control the SASL 
handshake timeout; shortening the cancelTask timeout is mainly so that the 
RpcServer detects the timeout (fired by the RemoteDriver) sooner, right?

> Spark remote driver misuses the timeout in RPC handshake
> 
>
> Key: HIVE-16071
> URL: https://issues.apache.org/jira/browse/HIVE-16071
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-16071.patch
>
>
> Based on its property description in HiveConf and the comments in HIVE-12650 
> (https://issues.apache.org/jira/browse/HIVE-12650?focusedCommentId=15128979&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15128979),
> hive.spark.client.connect.timeout is the timeout for the Spark remote driver 
> to make a socket connection (channel) to the RPC server. But currently it is 
> also used by the remote driver for the RPC client/server handshake, which is 
> not right. Instead, hive.spark.client.server.connect.timeout should be used, 
> as it is already used by the RPCServer in the handshake.
> An error like the following is usually caused by this issue, since the 
> default hive.spark.client.connect.timeout value (1000 ms) used by the remote 
> driver for the handshake is a little too short.
> {code}
> 17/02/20 08:46:08 ERROR yarn.ApplicationMaster: User class threw exception: 
> java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: 
> Client closed before SASL negotiation finished.
> java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: 
> Client closed before SASL negotiation finished.
> at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
> at 
> org.apache.hive.spark.client.RemoteDriver.<init>(RemoteDriver.java:156)
> at 
> org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:556)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:542)
> Caused by: javax.security.sasl.SaslException: Client closed before SASL 
> negotiation finished.
> at 
> org.apache.hive.spark.client.rpc.Rpc$SaslClientHandler.dispose(Rpc.java:453)
> at 
> 

[jira] [Commented] (HIVE-16071) Spark remote driver misuses the timeout in RPC handshake

2017-03-09 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15903422#comment-15903422
 ] 

Chaoyu Tang commented on HIVE-16071:


So we reached the consensus that hive.spark.client.server.connect.timeout 
should not be used for the cancelTask on the RPCServer side. The value 
proposed could be hive.spark.client.connect.timeout.
[~xuefuz] The reason I previously suggested considering another timeout for 
the cancelTask (a little longer than hive.spark.client.connect.timeout) is to 
give the RemoteDriver a little more time than the RPCServer to time out the 
handshake. If the timeouts on both sides are set to exactly the same value, we 
might quite often see situations where the termination of the SASL handshake 
is initiated by the cancelTask on the RpcServer side, as the timeout on the 
RemoteDriver side might fire slightly later for whatever reason. During this 
short window, the handshake could still succeed if it is not terminated by the 
cancelTask.
To my understanding, we still want the RemoteDriver to mainly control the SASL 
handshake timeout; shortening the cancelTask timeout is mainly so that the 
RpcServer detects the timeout (fired by the RemoteDriver) sooner, right?

> Spark remote driver misuses the timeout in RPC handshake
> 
>
> Key: HIVE-16071
> URL: https://issues.apache.org/jira/browse/HIVE-16071
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-16071.patch
>
>
> Based on its property description in HiveConf and the comments in HIVE-12650 
> (https://issues.apache.org/jira/browse/HIVE-12650?focusedCommentId=15128979&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15128979),
> hive.spark.client.connect.timeout is the timeout for the Spark remote driver 
> to make a socket connection (channel) to the RPC server. But currently it is 
> also used by the remote driver for the RPC client/server handshake, which is 
> not right. Instead, hive.spark.client.server.connect.timeout should be used, 
> as it is already used by the RPCServer in the handshake.
> An error like the following is usually caused by this issue, since the 
> default hive.spark.client.connect.timeout value (1000 ms) used by the remote 
> driver for the handshake is a little too short.
> {code}
> 17/02/20 08:46:08 ERROR yarn.ApplicationMaster: User class threw exception: 
> java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: 
> Client closed before SASL negotiation finished.
> java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: 
> Client closed before SASL negotiation finished.
> at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
> at 
> org.apache.hive.spark.client.RemoteDriver.<init>(RemoteDriver.java:156)
> at 
> org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:556)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:542)
> Caused by: javax.security.sasl.SaslException: Client closed before SASL 
> negotiation finished.
> at 
> org.apache.hive.spark.client.rpc.Rpc$SaslClientHandler.dispose(Rpc.java:453)
> at 
> org.apache.hive.spark.client.rpc.SaslHandler.channelInactive(SaslHandler.java:90)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16071) Spark remote driver misuses the timeout in RPC handshake

2017-03-03 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15894672#comment-15894672
 ] 

Chaoyu Tang commented on HIVE-16071:


Yes, [~lirui]. Increasing hive.spark.client.server.connect.timeout (instead of 
hive.spark.client.connect.timeout) could help in my case.
The cancelTask can take effect and close the channel only when its timeout is 
set to a value shorter than the current 
hive.spark.client.server.connect.timeout. So for this cancelTask, we can:
1. remove it to make the code more understandable; or
2. leave it as is, since it will not be executed anyway; or
3. use a different HoS timeout configuration (either 
hive.spark.client.connect.timeout or a new one) so that we have finer control 
over the waiting time on the HS2 side. Adding a new timeout config may not be 
desirable, since we already have many such configurations.
[~xuefuz], [~lirui], [~vanzin], what do you think?
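An illustrative sketch of option 3, with hypothetical names: give the server-side cancelTask its own, shorter timeout (read, say, from hive.spark.client.connect.timeout) instead of reusing hive.spark.client.server.connect.timeout, and drop it once the handshake completes.

{code}
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

public class CancelTaskSketch {
  private final ScheduledExecutorService timer =
      Executors.newSingleThreadScheduledExecutor();

  ScheduledFuture<?> scheduleCancel(long cancelTimeoutMs, Runnable closeChannel) {
    // Fires only if the handshake has not finished within cancelTimeoutMs.
    return timer.schedule(closeChannel, cancelTimeoutMs, TimeUnit.MILLISECONDS);
  }

  void onHandshakeSuccess(ScheduledFuture<?> cancelTask) {
    cancelTask.cancel(false); // the handshake won the race; keep the channel
  }
}
{code}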

> Spark remote driver misuses the timeout in RPC handshake
> 
>
> Key: HIVE-16071
> URL: https://issues.apache.org/jira/browse/HIVE-16071
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-16071.patch
>
>
> Based on its property description in HiveConf and the comments in HIVE-12650 
> (https://issues.apache.org/jira/browse/HIVE-12650?focusedCommentId=15128979&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15128979),
> hive.spark.client.connect.timeout is the timeout for the Spark remote driver 
> to make a socket connection (channel) to the RPC server. But currently it is 
> also used by the remote driver for the RPC client/server handshake, which is 
> not right. Instead, hive.spark.client.server.connect.timeout should be used, 
> as it is already used by the RPCServer in the handshake.
> An error like the following is usually caused by this issue, since the 
> default hive.spark.client.connect.timeout value (1000 ms) used by the remote 
> driver for the handshake is a little too short.
> {code}
> 17/02/20 08:46:08 ERROR yarn.ApplicationMaster: User class threw exception: 
> java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: 
> Client closed before SASL negotiation finished.
> java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: 
> Client closed before SASL negotiation finished.
> at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
> at 
> org.apache.hive.spark.client.RemoteDriver.<init>(RemoteDriver.java:156)
> at 
> org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:556)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:542)
> Caused by: javax.security.sasl.SaslException: Client closed before SASL 
> negotiation finished.
> at 
> org.apache.hive.spark.client.rpc.Rpc$SaslClientHandler.dispose(Rpc.java:453)
> at 
> org.apache.hive.spark.client.rpc.SaslHandler.channelInactive(SaslHandler.java:90)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HIVE-16189) Table column stats might be invalidated in a failed table rename

2017-03-13 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang reassigned HIVE-16189:
--


> Table column stats might be invalidated in a failed table rename
> 
>
> Key: HIVE-16189
> URL: https://issues.apache.org/jira/browse/HIVE-16189
> Project: Hive
>  Issue Type: Bug
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
>
> If a table rename does not succeed because it fails to move the data to the 
> renamed table folder, the changes in TAB_COL_STATS are not rolled back, 
> which leads to invalid column stats.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16189) Table column stats might be invalidated in a failed table rename

2017-03-13 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-16189:
---
Attachment: HIVE-16189.patch

This patch changes the order of the metadata update and the data move in the 
alter table rename operation, which makes it easier to roll back the metadata 
changes when moving the data fails during a table rename.
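A fragment-style sketch of the ordering idea (simplified, and not the actual HiveAlterHandler code): it assumes a RawStore-like {{ms}} with open/commit/rollbackTransaction, the srcFs/srcPath/destPath variables from the snippet quoted later in this thread, and a hypothetical updateMetadataForRename helper.

{code}
boolean success = false;
try {
  ms.openTransaction();
  updateMetadataForRename(oldTable, newTable); // includes the column stats rows
  if (srcFs.exists(srcPath) && !srcFs.rename(srcPath, destPath)) {
    throw new IOException("Renaming " + srcPath + " to " + destPath + " failed");
  }
  success = ms.commitTransaction();
} finally {
  if (!success) {
    ms.rollbackTransaction(); // the metadata, including stats, is restored
  }
}
{code}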

> Table column stats might be invalidated in a failed table rename
> 
>
> Key: HIVE-16189
> URL: https://issues.apache.org/jira/browse/HIVE-16189
> Project: Hive
>  Issue Type: Bug
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-16189.patch
>
>
> If a table rename does not succeed because it fails to move the data to the 
> renamed table folder, the changes in TAB_COL_STATS are not rolled back, 
> which leads to invalid column stats.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16189) Table column stats might be invalidated in a failed table rename

2017-03-13 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-16189:
---
Status: Patch Available  (was: Open)

[~pxiong], [~aihuaxu], [~ychena], could you review the code?

> Table column stats might be invalidated in a failed table rename
> 
>
> Key: HIVE-16189
> URL: https://issues.apache.org/jira/browse/HIVE-16189
> Project: Hive
>  Issue Type: Bug
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-16189.patch
>
>
> If a table rename does not succeed because it fails to move the data to the 
> renamed table folder, the changes in TAB_COL_STATS are not rolled back, 
> which leads to invalid column stats.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HIVE-16147) Rename a partitioned table should not drop its partition columns stats

2017-03-08 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang reassigned HIVE-16147:
--


> Rename a partitioned table should not drop its partition columns stats
> --
>
> Key: HIVE-16147
> URL: https://issues.apache.org/jira/browse/HIVE-16147
> Project: Hive
>  Issue Type: Bug
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
>
> When a partitioned table (e.g. sample_pt) is renamed (e.g. to 
> sample_pt_rename), describing its partition shows that the partition column 
> stats are still accurate, but actually they have all been dropped.
> It can be reproduced as follows:
> 1. analyze table sample_pt compute statistics for columns;
> 2. describe formatted default.sample_pt partition (dummy = 3): COLUMN_STATS 
> for all columns are true
> {code}
> ...
> # Detailed Partition Information   
> Partition Value:  [3]  
> Database: default  
> Table:sample_pt
> CreateTime:   Fri Jan 20 15:42:30 EST 2017 
> LastAccessTime:   UNKNOWN  
> Location: file:/user/hive/warehouse/apache/sample_pt/dummy=3
> Partition Parameters:  
>   COLUMN_STATS_ACCURATE   
> {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}}
>   last_modified_byctang   
>   last_modified_time  1485217063  
>   numFiles1   
>   numRows 100 
>   rawDataSize 5143
>   totalSize   5243
>   transient_lastDdlTime   1488842358
> ... 
> {code}
> 3. describe formatted default.sample_pt partition (dummy = 3) salary: column 
> stats exist
> {code}
> # col_name  data_type  min  max  num_nulls  distinct_count  avg_col_len  
> max_col_len  num_trues  num_falses  comment
> 
> salary      int        1    151370  0       94  
> from deserializer
> {code}
> 4. alter table sample_pt rename to sample_pt_rename;
> 5. describe formatted default.sample_pt_rename partition (dummy = 3): 
> describing the renamed table's partition (dummy = 3) shows that COLUMN_STATS 
> for the columns are still true.
> {code}
> # Detailed Partition Information   
> Partition Value:  [3]  
> Database: default  
> Table:sample_pt_rename 
> CreateTime:   Fri Jan 20 15:42:30 EST 2017 
> LastAccessTime:   UNKNOWN  
> Location: 
> file:/user/hive/warehouse/apache/sample_pt_rename/dummy=3
> Partition Parameters:  
>   COLUMN_STATS_ACCURATE   
> {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}}
>   last_modified_byctang   
>   last_modified_time  1485217063  
>   numFiles1   
>   numRows 100 
>   rawDataSize 5143
>   totalSize   5243
>   transient_lastDdlTime   1488842358  
> {code}
> describe formatted default.sample_pt_rename partition (dummy = 3) salary: the 
> column stats have been dropped.
> {code}
> # col_name  data_type  comment
> 
> salary      int        from deserializer
> 
> Time taken: 0.131 seconds, Fetched: 3 row(s)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16189) Table column stats might be invalidated in a failed table rename

2017-03-13 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15922770#comment-15922770
 ] 

Chaoyu Tang commented on HIVE-16189:


I thought about that, but it seems a little difficult to create such a test 
case where the HDFS rename (data move) fails in
{code}
if (srcFs.exists(srcPath) && !srcFs.rename(srcPath, destPath)) {
  throw new IOException("Renaming " + srcPath + " to " + destPath + " failed");
}
{code}
after the metadata change has been successfully committed.
Do you have any suggestions on that? I was able to manually reproduce this 
issue and verify the patch by using a debug breakpoint and simulating the file 
move failure.
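One hypothetical way to simulate the failure in a test, for what it's worth: a FilterFileSystem whose rename() always fails, injected in place of the real file system. This is a test-helper sketch, not part of the patch.

{code}
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FilterFileSystem;
import org.apache.hadoop.fs.Path;

public class FailingRenameFileSystem extends FilterFileSystem {
  public FailingRenameFileSystem(FileSystem fs) {
    super(fs);
  }

  @Override
  public boolean rename(Path src, Path dst) {
    return false; // forces the "Renaming ... failed" IOException path
  }
}
{code}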

> Table column stats might be invalidated in a failed table rename
> 
>
> Key: HIVE-16189
> URL: https://issues.apache.org/jira/browse/HIVE-16189
> Project: Hive
>  Issue Type: Bug
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-16189.patch
>
>
> If a table rename does not succeed because it fails to move the data to the 
> renamed table folder, the changes in TAB_COL_STATS are not rolled back, 
> which leads to invalid column stats.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16189) Table column stats might be invalidated in a failed table rename

2017-03-13 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15922785#comment-15922785
 ] 

Chaoyu Tang commented on HIVE-16189:


It looks like the precommit build infrastructure has some issues; I will 
re-trigger the tests when it is fixed.

> Table column stats might be invalidated in a failed table rename
> 
>
> Key: HIVE-16189
> URL: https://issues.apache.org/jira/browse/HIVE-16189
> Project: Hive
>  Issue Type: Bug
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-16189.patch
>
>
> If a table rename does not succeed because it fails to move the data to the 
> renamed table folder, the changes in TAB_COL_STATS are not rolled back, 
> which leads to invalid column stats.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16189) Table column stats might be invalidated in a failed table rename

2017-03-13 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-16189:
---
Attachment: HIVE-16189.patch

> Table column stats might be invalidated in a failed table rename
> 
>
> Key: HIVE-16189
> URL: https://issues.apache.org/jira/browse/HIVE-16189
> Project: Hive
>  Issue Type: Bug
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-16189.patch
>
>
> If a table rename does not succeed because it fails to move the data to the 
> renamed table folder, the changes in TAB_COL_STATS are not rolled back, 
> which leads to invalid column stats.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16189) Table column stats might be invalidated in a failed table rename

2017-03-13 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-16189:
---
Attachment: (was: HIVE-16189.patch)

> Table column stats might be invalidated in a failed table rename
> 
>
> Key: HIVE-16189
> URL: https://issues.apache.org/jira/browse/HIVE-16189
> Project: Hive
>  Issue Type: Bug
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-16189.patch
>
>
> If a table rename does not succeed because it fails to move the data to the 
> renamed table folder, the changes in TAB_COL_STATS are not rolled back, 
> which leads to invalid column stats.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16189) Table column stats might be invalidated in a failed table rename

2017-03-13 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-16189:
---
Attachment: (was: HIVE-16189.patch)

> Table column stats might be invalidated in a failed table rename
> 
>
> Key: HIVE-16189
> URL: https://issues.apache.org/jira/browse/HIVE-16189
> Project: Hive
>  Issue Type: Bug
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-16189.patch
>
>
> If a table rename does not succeed because it fails to move the data to the 
> renamed table folder, the changes in TAB_COL_STATS are not rolled back, 
> which leads to invalid column stats.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16189) Table column stats might be invalidated in a failed table rename

2017-03-13 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-16189:
---
Attachment: HIVE-16189.patch

> Table column stats might be invalidated in a failed table rename
> 
>
> Key: HIVE-16189
> URL: https://issues.apache.org/jira/browse/HIVE-16189
> Project: Hive
>  Issue Type: Bug
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-16189.patch
>
>
> If a table rename does not succeed because it fails to move the data to the 
> renamed table folder, the changes in TAB_COL_STATS are not rolled back, 
> which leads to invalid column stats.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (HIVE-15997) Resource leaks when query is cancelled

2017-03-06 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15897742#comment-15897742
 ] 

Chaoyu Tang edited comment on HIVE-15997 at 3/6/17 6:30 PM:


Will TezTask be affected as well? Also, I am not quite sure about this: for 
code like the following,
{code}
try {
  curatorFramework.delete().forPath(zLock.getPath());
} catch (InterruptedException ie) {
  curatorFramework.delete().forPath(zLock.getPath());
}
{code}
is catching InterruptedException guaranteed to clear the interrupted flag in 
the thread, and is calling the method a second time guaranteed to succeed?
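For comparison, an illustrative, defensive variant (not a proposed patch): retry the cleanup once, then restore the thread's interrupt status, since throwing InterruptedException has already cleared the flag; note the second call is not guaranteed to succeed either.

{code}
public class CleanupSketch {
  interface Cleanup {
    void run() throws Exception;
  }

  static void deleteQuietly(Cleanup cleanup) {
    try {
      cleanup.run();
    } catch (InterruptedException ie) {
      try {
        cleanup.run();                      // a single best-effort retry
      } catch (Exception ignored) {
        // give up; resources may leak, but shutdown continues
      } finally {
        Thread.currentThread().interrupt(); // preserve the interrupt status
      }
    } catch (Exception e) {
      // log and continue
    }
  }
}
{code}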



was (Author: ctang.ma):
Will TezTask be affected as well?

> Resource leaks when query is cancelled 
> ---
>
> Key: HIVE-15997
> URL: https://issues.apache.org/jira/browse/HIVE-15997
> Project: Hive
>  Issue Type: Bug
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Attachments: HIVE-15997.1.patch
>
>
> There may be some resource leaks when a query is cancelled.
> We see the following stacks in the log:
> Possible file and folder leak:
> {noformat}
> 2017-02-02 06:23:25,410 WARN  hive.ql.Context: [HiveServer2-Background-Pool: 
> Thread-61]: Error Removing Scratch: java.io.IOException: Failed on local 
> exception: java.nio.channels.ClosedByInterruptException; Host Details : local 
> host is: "ychencdh511t-1.vpc.cloudera.com/172.26.11.50"; destination host is: 
> "ychencdh511t-1.vpc.cloudera.com":8020; 
>   at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1476)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1409)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
>   at com.sun.proxy.$Proxy25.delete(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.delete(ClientNamenodeProtocolTranslatorPB.java:535)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
>   at com.sun.proxy.$Proxy26.delete(Unknown Source)
>   at org.apache.hadoop.hdfs.DFSClient.delete(DFSClient.java:2059)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$13.doCall(DistributedFileSystem.java:675)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$13.doCall(DistributedFileSystem.java:671)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:671)
>   at org.apache.hadoop.hive.ql.Context.removeScratchDir(Context.java:405)
>   at org.apache.hadoop.hive.ql.Context.clear(Context.java:541)
>   at org.apache.hadoop.hive.ql.Driver.releaseContext(Driver.java:2109)
>   at org.apache.hadoop.hive.ql.Driver.closeInProcess(Driver.java:2150)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1472)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1212)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1207)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:237)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:88)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation$3$1.run(SQLOperation.java:293)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1796)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation$3.run(SQLOperation.java:306)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.nio.channels.ClosedByInterruptException
>   at 
> java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
>   at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:681)
>   at 
> 

[jira] [Commented] (HIVE-15997) Resource leaks when query is cancelled

2017-03-06 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15897742#comment-15897742
 ] 

Chaoyu Tang commented on HIVE-15997:


Will TezTask be affected as well?

> Resource leaks when query is cancelled 
> ---
>
> Key: HIVE-15997
> URL: https://issues.apache.org/jira/browse/HIVE-15997
> Project: Hive
>  Issue Type: Bug
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Attachments: HIVE-15997.1.patch
>
>
> There may be some resource leaks when a query is cancelled.
> We see the following stacks in the log:
> Possible file and folder leak:
> {noformat}
> 2017-02-02 06:23:25,410 WARN  hive.ql.Context: [HiveServer2-Background-Pool: 
> Thread-61]: Error Removing Scratch: java.io.IOException: Failed on local 
> exception: java.nio.channels.ClosedByInterruptException; Host Details : local 
> host is: "ychencdh511t-1.vpc.cloudera.com/172.26.11.50"; destination host is: 
> "ychencdh511t-1.vpc.cloudera.com":8020; 
>   at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1476)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1409)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
>   at com.sun.proxy.$Proxy25.delete(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.delete(ClientNamenodeProtocolTranslatorPB.java:535)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
>   at com.sun.proxy.$Proxy26.delete(Unknown Source)
>   at org.apache.hadoop.hdfs.DFSClient.delete(DFSClient.java:2059)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$13.doCall(DistributedFileSystem.java:675)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$13.doCall(DistributedFileSystem.java:671)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:671)
>   at org.apache.hadoop.hive.ql.Context.removeScratchDir(Context.java:405)
>   at org.apache.hadoop.hive.ql.Context.clear(Context.java:541)
>   at org.apache.hadoop.hive.ql.Driver.releaseContext(Driver.java:2109)
>   at org.apache.hadoop.hive.ql.Driver.closeInProcess(Driver.java:2150)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1472)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1212)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1207)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:237)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:88)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation$3$1.run(SQLOperation.java:293)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1796)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation$3.run(SQLOperation.java:306)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.nio.channels.ClosedByInterruptException
>   at 
> java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
>   at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:681)
>   at 
> org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192)
>   at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
>   at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
>   at 
> org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:615)
>   at 
> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:714)
>   at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:376)
>   at 

[jira] [Commented] (HIVE-16071) Spark remote driver misuses the timeout in RPC handshake

2017-03-01 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15890961#comment-15890961
 ] 

Chaoyu Tang commented on HIVE-16071:


[~xuefuz], [~lirui], and [~vanzin], thanks for the clarification. I had not 
noticed HIVE-15671 before.
If so, would it be more reasonable for the timeout used in constructing the 
RpcServer to match the one used by the RemoteDriver for its RPC handshake 
(hive.spark.client.connect.timeout)? Or might we not need it at all, since the 
handshake timeout is already controlled by the RemoteDriver's setting?

> Spark remote driver misuses the timeout in RPC handshake
> 
>
> Key: HIVE-16071
> URL: https://issues.apache.org/jira/browse/HIVE-16071
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-16071.patch
>
>
> Based on its property description in HiveConf and the comments in HIVE-12650 
> (https://issues.apache.org/jira/browse/HIVE-12650?focusedCommentId=15128979=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15128979),
>  hive.spark.client.connect.timeout is the timeout when the spark remote 
> driver makes a socket connection (channel) to RPC server. But currently it is 
> also used by the remote driver for RPC client/server handshaking, which is 
> not right. Instead, hive.spark.client.server.connect.timeout should be used 
> and it has already been used by the RPCServer in the handshaking.
> The error like following is usually caused by this issue, since the default 
> hive.spark.client.connect.timeout value (1000ms) used by remote driver for 
> handshaking is a little too short.
> {code}
> 17/02/20 08:46:08 ERROR yarn.ApplicationMaster: User class threw exception: 
> java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: 
> Client closed before SASL negotiation finished.
> java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: 
> Client closed before SASL negotiation finished.
> at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
> at 
> org.apache.hive.spark.client.RemoteDriver.(RemoteDriver.java:156)
> at 
> org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:556)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:542)
> Caused by: javax.security.sasl.SaslException: Client closed before SASL 
> negotiation finished.
> at 
> org.apache.hive.spark.client.rpc.Rpc$SaslClientHandler.dispose(Rpc.java:453)
> at 
> org.apache.hive.spark.client.rpc.SaslHandler.channelInactive(SaslHandler.java:90)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16071) Spark remote driver misuses the timeout in RPC handshake

2017-03-02 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893684#comment-15893684
 ] 

Chaoyu Tang commented on HIVE-16071:


The error I mentioned in this JIRA description was caused by the 
[timeout|https://github.com/apache/hive/blob/master/spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcServer.java#L176]
 set in HS2 registerClient. It happened while SaslServerHandler and 
SaslClientHandler were handshaking. The registerClient in SparkClientImpl 
threw an error which interrupted the process calling spark-submit, thereby 
closing the channel between HS2 and the RemoteDriver. The channel termination 
was detected by SaslClientHandler.channelInactive on the RemoteDriver side, 
which in turn invoked SaslClientHandler.dispose(), so we saw the SASLException 
on the RemoteDriver side with the message "SaslException: Client closed before 
SASL negotiation finished." I have managed to reproduce this error by 
adjusting the hive.spark.client.server.connect.timeout value so that the HS2 
timeout fires during SASL negotiation, before the RemoteDriver reaches its own 
timeout.
Looking further into the code, I think that the 
[cancelTask|https://github.com/apache/hive/blob/master/spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcServer.java#L100]
 never has a chance to take effect. It has the same timeout value as that used 
for 
[registerClient|https://github.com/apache/hive/blob/master/spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcServer.java#L170],
 and the latter always kicks in first.
A timeout on the RemoteDriver side can happen in two places. If it happens 
when the driver 
[connects|https://github.com/apache/hive/blob/master/spark-client/src/main/java/org/apache/hive/spark/client/rpc/Rpc.java#L110]
 back to HS2, HS2 cannot detect that timeout and has to wait until its own 
timeout, set in 
[registerClient|https://github.com/apache/hive/blob/master/spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcServer.java#L170],
 takes effect. If the 
[timeout|https://github.com/apache/hive/blob/master/spark-client/src/main/java/org/apache/hive/spark/client/rpc/Rpc.java#L122]
 happens during the SASL handshake, the RemoteDriver main exits abnormally. 
SaslServerHandler.channelInactive on the HS2 side detects this channel 
termination and invokes SaslServerHandler.dispose, which in turn cancels the 
cancelTask. Depending on the stage HS2 is at (see the following code snippet),
{code}
protected void onError(Throwable error) {
  cancelTask.cancel(true);
  if (client != null) {
client.timeoutFuture.cancel(true);
if (!client.promise.isDone()) {
  client.promise.setFailure(error);
}
  }
}
{code}
HS2 has to wait until its hive.spark.client.server.connect.timeout if the 
clientInfo is null; otherwise the process can terminate immediately.
So the 
[cancelTask|https://github.com/apache/hive/blob/master/spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcServer.java#L100]
 is never exercised unless its timeout is set to a value shorter than the one 
used for registerClient (hive.spark.client.server.connect.timeout).
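
To make the race concrete, here is a minimal, standalone JDK sketch; it is 
illustrative only, not Hive code, and the printed labels just stand in for the 
two RpcServer tasks. With equal delays on one scheduler thread, the task 
submitted first always fires first, so the other can never preempt it, which 
is exactly the situation cancelTask is in.
{code}
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Standalone JDK illustration, not Hive code: two tasks scheduled with the
// same delay on a single scheduler thread. The task submitted first always
// fires first, so the second one never gets a chance to preempt it.
public class EqualTimeoutRace {
  public static void main(String[] args) throws InterruptedException {
    ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    long timeoutMs = 1000;  // stands in for hive.spark.client.server.connect.timeout

    scheduler.schedule(
        () -> System.out.println("registerClient timeout fired"),
        timeoutMs, TimeUnit.MILLISECONDS);
    scheduler.schedule(
        () -> System.out.println("cancelTask fired (always too late to matter)"),
        timeoutMs, TimeUnit.MILLISECONDS);

    scheduler.shutdown();
    scheduler.awaitTermination(5, TimeUnit.SECONDS);
  }
}
{code}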


> Spark remote driver misuses the timeout in RPC handshake
> 
>
> Key: HIVE-16071
> URL: https://issues.apache.org/jira/browse/HIVE-16071
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-16071.patch
>
>
> Based on its property description in HiveConf and the comments in HIVE-12650 
> (https://issues.apache.org/jira/browse/HIVE-12650?focusedCommentId=15128979=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15128979),
>  hive.spark.client.connect.timeout is the timeout when the spark remote 
> driver makes a socket connection (channel) to RPC server. But currently it is 
> also used by the remote driver for RPC client/server handshaking, which is 
> not right. Instead, hive.spark.client.server.connect.timeout should be used 
> and it has already been used by the RPCServer in the handshaking.
> The error like following is usually caused by this issue, since the default 
> hive.spark.client.connect.timeout value (1000ms) used by remote driver for 
> handshaking is a little too short.
> {code}
> 17/02/20 08:46:08 ERROR yarn.ApplicationMaster: User class threw exception: 
> java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: 
> Client closed before SASL negotiation finished.
> java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: 
> Client closed before SASL negotiation finished.
> at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
> at 
> 

[jira] [Comment Edited] (HIVE-16071) Spark remote driver misuses the timeout in RPC handshake

2017-03-02 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893684#comment-15893684
 ] 

Chaoyu Tang edited comment on HIVE-16071 at 3/3/17 4:04 AM:


The error I mentioned in this JIRA description was caused by the 
[timeout|https://github.com/apache/hive/blob/master/spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcServer.java#L176]
 set in HS2 registerClient. It happened while SaslServerHandler and 
SaslClientHandler were handshaking. The timeout caused the registerClient in 
SparkClientImpl to throw an error which interrupted the process calling 
spark-submit, thereby closing the channel between HS2 and the RemoteDriver. 
The channel termination was detected by SaslClientHandler.channelInactive on 
the RemoteDriver side, which in turn invoked SaslClientHandler.dispose(). That 
is why we saw the SASLException with the message "SaslException: Client closed 
before SASL negotiation finished." I have managed to reproduce this error by 
adjusting the hive.spark.client.server.connect.timeout value so that the HS2 
timeout fires during SASL negotiation, before the RemoteDriver reaches its own 
timeout.

Looking further into the code, I think that the 
[cancelTask|https://github.com/apache/hive/blob/master/spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcServer.java#L100]
 is not used at all. It never has a chance to take effect because it has the 
same timeout value as that used for 
[registerClient|https://github.com/apache/hive/blob/master/spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcServer.java#L170],
 and the latter always kicks in first.
A timeout on the RemoteDriver side can happen in two places. If it happens 
when the driver 
[connects|https://github.com/apache/hive/blob/master/spark-client/src/main/java/org/apache/hive/spark/client/rpc/Rpc.java#L110]
 back to HS2, HS2 cannot detect that timeout on the driver side and has to 
wait until its own timeout, set in 
[registerClient|https://github.com/apache/hive/blob/master/spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcServer.java#L170],
 takes effect. If the 
[timeout|https://github.com/apache/hive/blob/master/spark-client/src/main/java/org/apache/hive/spark/client/rpc/Rpc.java#L122]
 happens during the SASL handshake, the RemoteDriver main exits abnormally. 
SaslServerHandler.channelInactive on the HS2 side detects this channel 
termination and invokes SaslServerHandler.dispose, which in turn cancels the 
cancelTask (so it is never used). Depending on the stage HS2 is at (see the 
following code snippet),
{code}
protected void onError(Throwable error) {
  cancelTask.cancel(true);
  if (client != null) {
client.timeoutFuture.cancel(true);
if (!client.promise.isDone()) {
  client.promise.setFailure(error);
}
  }
}
{code}
HS2 either waits until its hive.spark.client.server.connect.timeout (when the 
clientInfo is null) or terminates the process immediately.
So the 
[cancelTask|https://github.com/apache/hive/blob/master/spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcServer.java#L100]
 is currently useless unless its timeout can be set to a different (shorter) 
value than the one used for registerClient 
(hive.spark.client.server.connect.timeout). Alternatively, we could consider 
removing it, though keeping it in the current code does no harm either.



was (Author: ctang.ma):
The error I mentioned in this JIRA description was caused by the 
[timeout|https://github.com/apache/hive/blob/master/spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcServer.java#L176]
 set in HS2 registerClient. It happened while SaslServerHandler and 
SaslClientHandler were handshaking. The registerClient in SparkClientImpl 
threw an error which interrupted the process calling spark-submit, thereby 
closing the channel between HS2 and the RemoteDriver. The channel termination 
was detected by SaslClientHandler.channelInactive on the RemoteDriver side, 
which in turn invoked SaslClientHandler.dispose(), so we saw the SASLException 
on the RemoteDriver side with the message "SaslException: Client closed before 
SASL negotiation finished." I have managed to reproduce this error by 
adjusting the hive.spark.client.server.connect.timeout value so that the HS2 
timeout fires during SASL negotiation, before the RemoteDriver reaches its own 
timeout.
Looking further into the code, I think that the 
[cancelTask|https://github.com/apache/hive/blob/master/spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcServer.java#L100]
 never has a chance to take effect. It has the same timeout value as that used 
for 
[registerClient|https://github.com/apache/hive/blob/master/spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcServer.java#L170],
 and the latter always kicks in first.

[jira] [Commented] (HIVE-16071) Spark remote driver misuses the timeout in RPC handshake

2017-03-07 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900607#comment-15900607
 ] 

Chaoyu Tang commented on HIVE-16071:


I agree with [~xuefuz] that we need a timeout for the SASL handshake on the 
RPC server side for the case he raised. This timeout should be shorter than 
the client.server.connect.timeout used by registerClient, but ideally a 
little longer than the client.connect.timeout used by the RemoteDriver 
handshake, so that we avoid server-initiated handshake timeouts, given that 
starting a RemoteDriver is quite expensive. If so, I would suggest we 
introduce a new configuration like hive.spark.rpc.handshake.server.timeout, 
and rename hive.spark.client.connect.timeout to 
hive.spark.rpc.handshake.client.timeout (though it would still also serve as 
the socket connect timeout on the RemoteDriver side, as it does now). The 
hive.spark.client.server.connect.timeout could also be renamed to something 
like hive.spark.register.remote.driver.timeout if necessary. What do you guys 
think about it?
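
To illustrate the intended ordering under this proposal: the values and 
property names below are only the suggestion above, not existing HiveConf 
entries.
{code}
// Illustrative values only; these property names follow the proposal above
// and do not exist in HiveConf today.
public class ProposedTimeoutOrdering {
  public static void main(String[] args) {
    long clientHandshakeMs = 1_000;   // hive.spark.rpc.handshake.client.timeout
    long serverHandshakeMs = 5_000;   // hive.spark.rpc.handshake.server.timeout
    long registerDriverMs  = 90_000;  // hive.spark.register.remote.driver.timeout

    // Intended invariant: the driver-side handshake times out first, the
    // server-side handshake guard second, and the expensive RemoteDriver
    // registration last, so the server rarely initiates the handshake abort.
    if (!(clientHandshakeMs < serverHandshakeMs
        && serverHandshakeMs < registerDriverMs)) {
      throw new IllegalStateException("timeouts are mis-ordered");
    }
  }
}
{code}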

> Spark remote driver misuses the timeout in RPC handshake
> 
>
> Key: HIVE-16071
> URL: https://issues.apache.org/jira/browse/HIVE-16071
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-16071.patch
>
>
> Based on its property description in HiveConf and the comments in HIVE-12650 
> (https://issues.apache.org/jira/browse/HIVE-12650?focusedCommentId=15128979=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15128979),
>  hive.spark.client.connect.timeout is the timeout when the spark remote 
> driver makes a socket connection (channel) to RPC server. But currently it is 
> also used by the remote driver for RPC client/server handshaking, which is 
> not right. Instead, hive.spark.client.server.connect.timeout should be used 
> and it has already been used by the RPCServer in the handshaking.
> The error like following is usually caused by this issue, since the default 
> hive.spark.client.connect.timeout value (1000ms) used by remote driver for 
> handshaking is a little too short.
> {code}
> 17/02/20 08:46:08 ERROR yarn.ApplicationMaster: User class threw exception: 
> java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: 
> Client closed before SASL negotiation finished.
> java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: 
> Client closed before SASL negotiation finished.
> at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
> at 
> org.apache.hive.spark.client.RemoteDriver.(RemoteDriver.java:156)
> at 
> org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:556)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:542)
> Caused by: javax.security.sasl.SaslException: Client closed before SASL 
> negotiation finished.
> at 
> org.apache.hive.spark.client.rpc.Rpc$SaslClientHandler.dispose(Rpc.java:453)
> at 
> org.apache.hive.spark.client.rpc.SaslHandler.channelInactive(SaslHandler.java:90)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15997) Resource leaks when query is cancelled

2017-03-07 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900049#comment-15900049
 ] 

Chaoyu Tang commented on HIVE-15997:


LGTM, +1

> Resource leaks when query is cancelled 
> ---
>
> Key: HIVE-15997
> URL: https://issues.apache.org/jira/browse/HIVE-15997
> Project: Hive
>  Issue Type: Bug
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Attachments: HIVE-15997.1.patch
>
>
> There may be some resource leaks when a query is cancelled.
> We see the following stacks in the log:
> Possible file and folder leak:
> {noformat}
> 2017-02-02 06:23:25,410 WARN  hive.ql.Context: [HiveServer2-Background-Pool: 
> Thread-61]: Error Removing Scratch: java.io.IOException: Failed on local 
> exception: java.nio.channels.ClosedByInterruptException; Host Details : local 
> host is: "ychencdh511t-1.vpc.cloudera.com/172.26.11.50"; destination host is: 
> "ychencdh511t-1.vpc.cloudera.com":8020; 
>   at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1476)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1409)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
>   at com.sun.proxy.$Proxy25.delete(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.delete(ClientNamenodeProtocolTranslatorPB.java:535)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
>   at com.sun.proxy.$Proxy26.delete(Unknown Source)
>   at org.apache.hadoop.hdfs.DFSClient.delete(DFSClient.java:2059)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$13.doCall(DistributedFileSystem.java:675)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$13.doCall(DistributedFileSystem.java:671)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:671)
>   at org.apache.hadoop.hive.ql.Context.removeScratchDir(Context.java:405)
>   at org.apache.hadoop.hive.ql.Context.clear(Context.java:541)
>   at org.apache.hadoop.hive.ql.Driver.releaseContext(Driver.java:2109)
>   at org.apache.hadoop.hive.ql.Driver.closeInProcess(Driver.java:2150)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1472)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1212)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1207)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:237)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:88)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation$3$1.run(SQLOperation.java:293)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1796)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation$3.run(SQLOperation.java:306)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.nio.channels.ClosedByInterruptException
>   at 
> java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
>   at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:681)
>   at 
> org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192)
>   at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
>   at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
>   at 
> org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:615)
>   at 
> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:714)
>   at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:376)
>   at org.apache.hadoop.ipc.Client.getConnection(Client.java:1525)
>   at 

[jira] [Updated] (HIVE-16308) PreExecutePrinter and PostExecutePrinter should log to INFO level instead of ERROR

2017-04-01 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-16308:
---
   Resolution: Fixed
Fix Version/s: 3.0.0
   2.3.0
   Status: Resolved  (was: Patch Available)

Committed to 2.3.0 & 3.0.0. Thanks [~stakiar].

> PreExecutePrinter and PostExecutePrinter should log to INFO level instead of 
> ERROR
> --
>
> Key: HIVE-16308
> URL: https://issues.apache.org/jira/browse/HIVE-16308
> Project: Hive
>  Issue Type: Bug
>  Components: Test
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Fix For: 2.3.0, 3.0.0
>
> Attachments: HIVE-16308.1.patch
>
>
> Many of the pre and post hook printers log info at the ERROR level, which is 
> confusing since they aren't errors. They should log to the INFO level.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15880) Allow insert overwrite and truncate table query to use auto.purge table property

2017-04-01 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-15880:
---
   Resolution: Fixed
Fix Version/s: 3.0.0
   2.3.0
   Status: Resolved  (was: Patch Available)

Committed to 2.3.0 & 3.0.0. Thanks [~vihangk1] for the patch.

> Allow insert overwrite and truncate table query to use auto.purge table 
> property
> 
>
> Key: HIVE-15880
> URL: https://issues.apache.org/jira/browse/HIVE-15880
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Fix For: 2.3.0, 3.0.0
>
> Attachments: HIVE-15880.01.patch, HIVE-15880.02.patch, 
> HIVE-15880.03.patch, HIVE-15880.04.patch, HIVE-15880.05.patch, 
> HIVE-15880.06.patch
>
>
> It seems inconsistent that the auto.purge property is not considered when we 
> do an INSERT OVERWRITE, while it is when we do a DROP TABLE.
> Drop table doesn't move table data to Trash when auto.purge is set to true
> {noformat}
> > create table temp(col1 string, col2 string);
> No rows affected (0.064 seconds)
> > alter table temp set tblproperties('auto.purge'='true');
> No rows affected (0.083 seconds)
> > insert into temp values ('test', 'test'), ('test2', 'test2');
> No rows affected (25.473 seconds)
> # hdfs dfs -ls /user/hive/warehouse/temp
> Found 1 items
> -rwxrwxrwt   3 hive hive 22 2017-02-09 13:03 
> /user/hive/warehouse/temp/00_0
> #
> > drop table temp;
> No rows affected (0.242 seconds)
> # hdfs dfs -ls /user/hive/warehouse/temp
> ls: `/user/hive/warehouse/temp': No such file or directory
> #
> # sudo -u hive hdfs dfs -ls /user/hive/.Trash/Current/user/hive/warehouse
> #
> {noformat}
> INSERT OVERWRITE query moves the table data to Trash even when auto.purge is 
> set to true
> {noformat}
> > create table temp(col1 string, col2 string);
> > alter table temp set tblproperties('auto.purge'='true');
> > insert into temp values ('test', 'test'), ('test2', 'test2');
> # hdfs dfs -ls /user/hive/warehouse/temp
> Found 1 items
> -rwxrwxrwt   3 hive hive 22 2017-02-09 13:07 
> /user/hive/warehouse/temp/00_0
> #
> > insert overwrite table temp select * from dummy;
> # hdfs dfs -ls /user/hive/warehouse/temp
> Found 1 items
> -rwxrwxrwt   3 hive hive 26 2017-02-09 13:08 
> /user/hive/warehouse/temp/00_0
> # sudo -u hive hdfs dfs -ls /user/hive/.Trash/Current/user/hive/warehouse
> Found 1 items
> drwx--   - hive hive  0 2017-02-09 13:08 
> /user/hive/.Trash/Current/user/hive/warehouse/temp
> #
> {noformat}
> While move operations are not very costly on HDFS, they can be a significant 
> overhead on slow FileSystems like S3. This change could improve the 
> performance of {{INSERT OVERWRITE TABLE}} queries, especially when there are 
> a large number of partitions on tables located on S3, should the user wish to 
> set the auto.purge property to true.
> Similarly, a {{TRUNCATE TABLE}} query on a table with the {{auto.purge}} 
> property set to true should not move the data to Trash.
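
The gist of the change can be sketched as follows. This is a rough 
illustration of the idea only, not the committed patch; removeOldData and 
autoPurgeValue are made-up names.
{code}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.Trash;

// Rough sketch of the idea only, not the committed patch: the overwrite/
// truncate path consults the auto.purge table property and deletes the old
// data directly instead of moving it to Trash when the property is true.
public class AutoPurgeSketch {
  static void removeOldData(FileSystem fs, Path oldPath, Configuration conf,
                            String autoPurgeValue) throws IOException {
    if ("true".equalsIgnoreCase(autoPurgeValue)) {      // auto.purge=true
      fs.delete(oldPath, true);                         // skip Trash entirely
    } else {
      Trash.moveToAppropriateTrash(fs, oldPath, conf);  // default behavior
    }
  }
}
{code}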



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-10307) Support to use number literals in partition column

2017-04-05 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15956800#comment-15956800
 ] 

Chaoyu Tang commented on HIVE-10307:


[~pxiong] The property hive.typecheck.on.insert was initially introduced in 
HIVE-5297, and this JIRA (HIVE-10307) only added its description comment. I 
did not remove the property when working on this JIRA, for the sake of 
backward compatibility, though over the years I have not seen a case that 
needs this property set to false.

> Support to use number literals in partition column
> --
>
> Key: HIVE-10307
> URL: https://issues.apache.org/jira/browse/HIVE-10307
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 1.0.0
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Fix For: 1.2.0
>
> Attachments: HIVE-10307.1.patch, HIVE-10307.2.patch, 
> HIVE-10307.3.patch, HIVE-10307.4.patch, HIVE-10307.5.patch, 
> HIVE-10307.6.patch, HIVE-10307.patch
>
>
> Data types like TinyInt, SmallInt, BigInt or Decimal can be expressed as 
> literals with postfix like Y, S, L, or BD appended to the number. These 
> literals work in most Hive queries, but do not when they are used as 
> partition column values. For a partitioned table like:
> create table partcoltypenum (key int, value string) partitioned by (tint 
> tinyint, sint smallint, bint bigint);
> insert into partcoltypenum partition (tint=100Y, sint=1S, 
> bint=1000L) select key, value from src limit 30;
> Queries like select, describe and drop partition do not work. For example,
> select * from partcoltypenum where tint=100Y and sint=1S and 
> bint=1000L;
> does not return any rows.
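
As a toy illustration of the normalization such literals need: this is not 
the actual Hive code path, and normalizePartitionLiteral is a made-up helper; 
the real fix lives in Hive's type checking of partition specs.
{code}
// Toy illustration only, not the actual Hive code path: strip the literal
// postfix and parse the value according to the partition column's type, so
// that tint=100Y and tint=100 address the same partition.
static Object normalizePartitionLiteral(String literal, String colType) {
  String v = literal.replaceAll("(?i)(BD|Y|S|L)$", "");  // drop the postfix
  switch (colType) {
    case "tinyint":  return Byte.valueOf(v);
    case "smallint": return Short.valueOf(v);
    case "bigint":   return Long.valueOf(v);
    default:         return v;  // other types left as-is in this toy
  }
}
{code}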



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15880) Allow insert overwrite and truncate table query to use auto.purge table property

2017-03-30 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15950182#comment-15950182
 ] 

Chaoyu Tang commented on HIVE-15880:


The patch looks good to me, +1.

> Allow insert overwrite and truncate table query to use auto.purge table 
> property
> 
>
> Key: HIVE-15880
> URL: https://issues.apache.org/jira/browse/HIVE-15880
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-15880.01.patch, HIVE-15880.02.patch, 
> HIVE-15880.03.patch, HIVE-15880.04.patch, HIVE-15880.05.patch, 
> HIVE-15880.06.patch
>
>
> It seems inconsistent that the auto.purge property is not considered when we 
> do an INSERT OVERWRITE, while it is when we do a DROP TABLE.
> Drop table doesn't move table data to Trash when auto.purge is set to true
> {noformat}
> > create table temp(col1 string, col2 string);
> No rows affected (0.064 seconds)
> > alter table temp set tblproperties('auto.purge'='true');
> No rows affected (0.083 seconds)
> > insert into temp values ('test', 'test'), ('test2', 'test2');
> No rows affected (25.473 seconds)
> # hdfs dfs -ls /user/hive/warehouse/temp
> Found 1 items
> -rwxrwxrwt   3 hive hive 22 2017-02-09 13:03 
> /user/hive/warehouse/temp/00_0
> #
> > drop table temp;
> No rows affected (0.242 seconds)
> # hdfs dfs -ls /user/hive/warehouse/temp
> ls: `/user/hive/warehouse/temp': No such file or directory
> #
> # sudo -u hive hdfs dfs -ls /user/hive/.Trash/Current/user/hive/warehouse
> #
> {noformat}
> INSERT OVERWRITE query moves the table data to Trash even when auto.purge is 
> set to true
> {noformat}
> > create table temp(col1 string, col2 string);
> > alter table temp set tblproperties('auto.purge'='true');
> > insert into temp values ('test', 'test'), ('test2', 'test2');
> # hdfs dfs -ls /user/hive/warehouse/temp
> Found 1 items
> -rwxrwxrwt   3 hive hive 22 2017-02-09 13:07 
> /user/hive/warehouse/temp/00_0
> #
> > insert overwrite table temp select * from dummy;
> # hdfs dfs -ls /user/hive/warehouse/temp
> Found 1 items
> -rwxrwxrwt   3 hive hive 26 2017-02-09 13:08 
> /user/hive/warehouse/temp/00_0
> # sudo -u hive hdfs dfs -ls /user/hive/.Trash/Current/user/hive/warehouse
> Found 1 items
> drwx--   - hive hive  0 2017-02-09 13:08 
> /user/hive/.Trash/Current/user/hive/warehouse/temp
> #
> {noformat}
> While move operations are not very costly on HDFS, they can be a significant 
> overhead on slow FileSystems like S3. This change could improve the 
> performance of {{INSERT OVERWRITE TABLE}} queries, especially when there are 
> a large number of partitions on tables located on S3, should the user wish to 
> set the auto.purge property to true.
> Similarly, a {{TRUNCATE TABLE}} query on a table with the {{auto.purge}} 
> property set to true should not move the data to Trash.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16147) Rename a partitioned table should not drop its partition columns stats

2017-04-24 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15982021#comment-15982021
 ] 

Chaoyu Tang commented on HIVE-16147:


The patch has been uploaded to RB. [~pxiong], could you help review it? Thanks.

> Rename a partitioned table should not drop its partition columns stats
> --
>
> Key: HIVE-16147
> URL: https://issues.apache.org/jira/browse/HIVE-16147
> Project: Hive
>  Issue Type: Bug
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-16147.patch
>
>
> When a partitioned table (e.g. sample_pt) is renamed (e.g. to 
> sample_pt_rename), describing its partition shows that the partition column 
> stats are still accurate, but actually they have all been dropped.
> It can be reproduced as follows:
> 1. analyze table sample_pt compute statistics for columns;
> 2. describe formatted default.sample_pt partition (dummy = 3):  COLUMN_STATS 
> for all columns are true
> {code}
> ...
> # Detailed Partition Information   
> Partition Value:  [3]  
> Database: default  
> Table:sample_pt
> CreateTime:   Fri Jan 20 15:42:30 EST 2017 
> LastAccessTime:   UNKNOWN  
> Location: file:/user/hive/warehouse/apache/sample_pt/dummy=3
> Partition Parameters:  
>   COLUMN_STATS_ACCURATE   
> {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}}
>   last_modified_byctang   
>   last_modified_time  1485217063  
>   numFiles1   
>   numRows 100 
>   rawDataSize 5143
>   totalSize   5243
>   transient_lastDdlTime   1488842358
> ... 
> {code}
> 3: describe formatted default.sample_pt partition (dummy = 3) salary: column 
> stats exist
> {code}
> # col_namedata_type   min 
> max num_nulls   distinct_count  
> avg_col_len max_col_len num_trues   
> num_falses  comment 
>   
>  
> salaryint 1   151370  
> 0   94
>   
> from deserializer 
> {code}
> 4. alter table sample_pt rename to sample_pt_rename;
> 5. describe formatted default.sample_pt_rename partition (dummy = 3): 
> describing the renamed table partition (dummy=3) shows that COLUMN_STATS for 
> columns are still true.
> {code}
> # Detailed Partition Information   
> Partition Value:  [3]  
> Database: default  
> Table:sample_pt_rename 
> CreateTime:   Fri Jan 20 15:42:30 EST 2017 
> LastAccessTime:   UNKNOWN  
> Location: 
> file:/user/hive/warehouse/apache/sample_pt_rename/dummy=3
> Partition Parameters:  
>   COLUMN_STATS_ACCURATE   
> {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}}
>   last_modified_byctang   
>   last_modified_time  1485217063  
>   numFiles1   
>   numRows 100 
>   rawDataSize 5143
>   totalSize   5243
>   transient_lastDdlTime   1488842358  
> {code}
> describe formatted default.sample_pt_rename partition (dummy = 3) salary: the 
> column stats have been dropped.
> {code}
> # col_namedata_type   comment 
>  
>   
>  
> salaryint from deserializer   
>  
> Time taken: 0.131 seconds, Fetched: 3 row(s)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16147) Rename a partitioned table should not drop its partition columns stats

2017-04-26 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-16147:
---
Attachment: HIVE-16147.patch

Reattached the patch to kick off the precommit test.

> Rename a partitioned table should not drop its partition columns stats
> --
>
> Key: HIVE-16147
> URL: https://issues.apache.org/jira/browse/HIVE-16147
> Project: Hive
>  Issue Type: Bug
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-16147.patch, HIVE-16147.patch
>
>
> When a partitioned table (e.g. sample_pt) is renamed (e.g. to 
> sample_pt_rename), describing its partition shows that the partition column 
> stats are still accurate, but actually they have all been dropped.
> It can be reproduced as follows:
> 1. analyze table sample_pt compute statistics for columns;
> 2. describe formatted default.sample_pt partition (dummy = 3):  COLUMN_STATS 
> for all columns are true
> {code}
> ...
> # Detailed Partition Information   
> Partition Value:  [3]  
> Database: default  
> Table:sample_pt
> CreateTime:   Fri Jan 20 15:42:30 EST 2017 
> LastAccessTime:   UNKNOWN  
> Location: file:/user/hive/warehouse/apache/sample_pt/dummy=3
> Partition Parameters:  
>   COLUMN_STATS_ACCURATE   
> {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}}
>   last_modified_byctang   
>   last_modified_time  1485217063  
>   numFiles1   
>   numRows 100 
>   rawDataSize 5143
>   totalSize   5243
>   transient_lastDdlTime   1488842358
> ... 
> {code}
> 3: describe formatted default.sample_pt partition (dummy = 3) salary: column 
> stats exist
> {code}
> # col_namedata_type   min 
> max num_nulls   distinct_count  
> avg_col_len max_col_len num_trues   
> num_falses  comment 
>   
>  
> salaryint 1   151370  
> 0   94
>   
> from deserializer 
> {code}
> 4. alter table sample_pt rename to sample_pt_rename;
> 5. describe formatted default.sample_pt_rename partition (dummy = 3): 
> describing the renamed table partition (dummy=3) shows that COLUMN_STATS for 
> columns are still true.
> {code}
> # Detailed Partition Information   
> Partition Value:  [3]  
> Database: default  
> Table:sample_pt_rename 
> CreateTime:   Fri Jan 20 15:42:30 EST 2017 
> LastAccessTime:   UNKNOWN  
> Location: 
> file:/user/hive/warehouse/apache/sample_pt_rename/dummy=3
> Partition Parameters:  
>   COLUMN_STATS_ACCURATE   
> {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}}
>   last_modified_byctang   
>   last_modified_time  1485217063  
>   numFiles1   
>   numRows 100 
>   rawDataSize 5143
>   totalSize   5243
>   transient_lastDdlTime   1488842358  
> {code}
> describe formatted default.sample_pt_rename partition (dummy = 3) salary: the 
> column stats have been dropped.
> {code}
> # col_namedata_type   comment 
>  
>   
>  
> salaryint from deserializer   
>  
> Time taken: 0.131 seconds, Fetched: 3 row(s)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16147) Rename a partitioned table should not drop its partition columns stats

2017-04-26 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-16147:
---
Attachment: (was: HIVE-16147.patch)

> Rename a partitioned table should not drop its partition columns stats
> --
>
> Key: HIVE-16147
> URL: https://issues.apache.org/jira/browse/HIVE-16147
> Project: Hive
>  Issue Type: Bug
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-16147.patch, HIVE-16147.patch
>
>
> When a partitioned table (e.g. sample_pt) is renamed (e.g. to 
> sample_pt_rename), describing its partition shows that the partition column 
> stats are still accurate, but actually they have all been dropped.
> It can be reproduced as follows:
> 1. analyze table sample_pt compute statistics for columns;
> 2. describe formatted default.sample_pt partition (dummy = 3):  COLUMN_STATS 
> for all columns are true
> {code}
> ...
> # Detailed Partition Information   
> Partition Value:  [3]  
> Database: default  
> Table:sample_pt
> CreateTime:   Fri Jan 20 15:42:30 EST 2017 
> LastAccessTime:   UNKNOWN  
> Location: file:/user/hive/warehouse/apache/sample_pt/dummy=3
> Partition Parameters:  
>   COLUMN_STATS_ACCURATE   
> {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}}
>   last_modified_byctang   
>   last_modified_time  1485217063  
>   numFiles1   
>   numRows 100 
>   rawDataSize 5143
>   totalSize   5243
>   transient_lastDdlTime   1488842358
> ... 
> {code}
> 3: describe formatted default.sample_pt partition (dummy = 3) salary: column 
> stats exist
> {code}
> # col_namedata_type   min 
> max num_nulls   distinct_count  
> avg_col_len max_col_len num_trues   
> num_falses  comment 
>   
>  
> salaryint 1   151370  
> 0   94
>   
> from deserializer 
> {code}
> 4. alter table sample_pt rename to sample_pt_rename;
> 5. describe formatted default.sample_pt_rename partition (dummy = 3): 
> describing the renamed table partition (dummy=3) shows that COLUMN_STATS for 
> columns are still true.
> {code}
> # Detailed Partition Information   
> Partition Value:  [3]  
> Database: default  
> Table:sample_pt_rename 
> CreateTime:   Fri Jan 20 15:42:30 EST 2017 
> LastAccessTime:   UNKNOWN  
> Location: 
> file:/user/hive/warehouse/apache/sample_pt_rename/dummy=3
> Partition Parameters:  
>   COLUMN_STATS_ACCURATE   
> {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}}
>   last_modified_byctang   
>   last_modified_time  1485217063  
>   numFiles1   
>   numRows 100 
>   rawDataSize 5143
>   totalSize   5243
>   transient_lastDdlTime   1488842358  
> {code}
> describe formatted default.sample_pt_rename partition (dummy = 3) salary: the 
> column stats have been dropped.
> {code}
> # col_namedata_type   comment 
>  
>   
>  
> salaryint from deserializer   
>  
> Time taken: 0.131 seconds, Fetched: 3 row(s)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16147) Rename a partitioned table should not drop its partition columns stats

2017-04-26 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-16147:
---
Attachment: HIVE-16147.patch

> Rename a partitioned table should not drop its partition columns stats
> --
>
> Key: HIVE-16147
> URL: https://issues.apache.org/jira/browse/HIVE-16147
> Project: Hive
>  Issue Type: Bug
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-16147.patch, HIVE-16147.patch
>
>
> When a partitioned table (e.g. sample_pt) is renamed (e.g. to 
> sample_pt_rename), describing its partition shows that the partition column 
> stats are still accurate, but actually they have all been dropped.
> It can be reproduced as follows:
> 1. analyze table sample_pt compute statistics for columns;
> 2. describe formatted default.sample_pt partition (dummy = 3):  COLUMN_STATS 
> for all columns are true
> {code}
> ...
> # Detailed Partition Information   
> Partition Value:  [3]  
> Database: default  
> Table:sample_pt
> CreateTime:   Fri Jan 20 15:42:30 EST 2017 
> LastAccessTime:   UNKNOWN  
> Location: file:/user/hive/warehouse/apache/sample_pt/dummy=3
> Partition Parameters:  
>   COLUMN_STATS_ACCURATE   
> {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}}
>   last_modified_byctang   
>   last_modified_time  1485217063  
>   numFiles1   
>   numRows 100 
>   rawDataSize 5143
>   totalSize   5243
>   transient_lastDdlTime   1488842358
> ... 
> {code}
> 3: describe formatted default.sample_pt partition (dummy = 3) salary: column 
> stats exist
> {code}
> # col_namedata_type   min 
> max num_nulls   distinct_count  
> avg_col_len max_col_len num_trues   
> num_falses  comment 
>   
>  
> salaryint 1   151370  
> 0   94
>   
> from deserializer 
> {code}
> 4. alter table sample_pt rename to sample_pt_rename;
> 5. describe formatted default.sample_pt_rename partition (dummy = 3): 
> describing the renamed table partition (dummy=3) shows that COLUMN_STATS for 
> columns are still true.
> {code}
> # Detailed Partition Information   
> Partition Value:  [3]  
> Database: default  
> Table:sample_pt_rename 
> CreateTime:   Fri Jan 20 15:42:30 EST 2017 
> LastAccessTime:   UNKNOWN  
> Location: 
> file:/user/hive/warehouse/apache/sample_pt_rename/dummy=3
> Partition Parameters:  
>   COLUMN_STATS_ACCURATE   
> {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}}
>   last_modified_byctang   
>   last_modified_time  1485217063  
>   numFiles1   
>   numRows 100 
>   rawDataSize 5143
>   totalSize   5243
>   transient_lastDdlTime   1488842358  
> {code}
> describe formatted default.sample_pt_rename partition (dummy = 3) salary: the 
> column stats have been dropped.
> {code}
> # col_namedata_type   comment 
>  
>   
>  
> salaryint from deserializer   
>  
> Time taken: 0.131 seconds, Fetched: 3 row(s)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16147) Rename a partitioned table should not drop its partition columns stats

2017-04-27 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-16147:
---
Attachment: HIVE-16147.1.patch

Fixed the failed tests.

> Rename a partitioned table should not drop its partition columns stats
> --
>
> Key: HIVE-16147
> URL: https://issues.apache.org/jira/browse/HIVE-16147
> Project: Hive
>  Issue Type: Bug
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-16147.1.patch, HIVE-16147.patch, HIVE-16147.patch
>
>
> When a partitioned table (e.g. sample_pt) is renamed (e.g. to 
> sample_pt_rename), describing its partition shows that the partition column 
> stats are still accurate, but actually they have all been dropped.
> It can be reproduced as follows:
> 1. analyze table sample_pt compute statistics for columns;
> 2. describe formatted default.sample_pt partition (dummy = 3):  COLUMN_STATS 
> for all columns are true
> {code}
> ...
> # Detailed Partition Information   
> Partition Value:  [3]  
> Database: default  
> Table:sample_pt
> CreateTime:   Fri Jan 20 15:42:30 EST 2017 
> LastAccessTime:   UNKNOWN  
> Location: file:/user/hive/warehouse/apache/sample_pt/dummy=3
> Partition Parameters:  
>   COLUMN_STATS_ACCURATE   
> {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}}
>   last_modified_byctang   
>   last_modified_time  1485217063  
>   numFiles1   
>   numRows 100 
>   rawDataSize 5143
>   totalSize   5243
>   transient_lastDdlTime   1488842358
> ... 
> {code}
> 3: describe formatted default.sample_pt partition (dummy = 3) salary: column 
> stats exist
> {code}
> # col_namedata_type   min 
> max num_nulls   distinct_count  
> avg_col_len max_col_len num_trues   
> num_falses  comment 
>   
>  
> salaryint 1   151370  
> 0   94
>   
> from deserializer 
> {code}
> 4. alter table sample_pt rename to sample_pt_rename;
> 5. describe formatted default.sample_pt_rename partition (dummy = 3): 
> describing the renamed table partition (dummy=3) shows that COLUMN_STATS for 
> columns are still true.
> {code}
> # Detailed Partition Information   
> Partition Value:  [3]  
> Database: default  
> Table:sample_pt_rename 
> CreateTime:   Fri Jan 20 15:42:30 EST 2017 
> LastAccessTime:   UNKNOWN  
> Location: 
> file:/user/hive/warehouse/apache/sample_pt_rename/dummy=3
> Partition Parameters:  
>   COLUMN_STATS_ACCURATE   
> {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}}
>   last_modified_byctang   
>   last_modified_time  1485217063  
>   numFiles1   
>   numRows 100 
>   rawDataSize 5143
>   totalSize   5243
>   transient_lastDdlTime   1488842358  
> {code}
> describe formatted default.sample_pt_rename partition (dummy = 3) salary: the 
> column stats have been dropped.
> {code}
> # col_namedata_type   comment 
>  
>   
>  
> salaryint from deserializer   
>  
> Time taken: 0.131 seconds, Fetched: 3 row(s)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16147) Rename a partitioned table should not drop its partition columns stats

2017-04-24 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-16147:
---
Status: Patch Available  (was: Open)

> Rename a partitioned table should not drop its partition columns stats
> --
>
> Key: HIVE-16147
> URL: https://issues.apache.org/jira/browse/HIVE-16147
> Project: Hive
>  Issue Type: Bug
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-16147.patch
>
>
> When a partitioned table (e.g. sample_pt) is renamed (e.g. to 
> sample_pt_rename), describing its partition shows that the partition column 
> stats are still accurate, but actually they have all been dropped.
> It can be reproduced as follows:
> 1. analyze table sample_pt compute statistics for columns;
> 2. describe formatted default.sample_pt partition (dummy = 3):  COLUMN_STATS 
> for all columns are true
> {code}
> ...
> # Detailed Partition Information   
> Partition Value:  [3]  
> Database: default  
> Table:sample_pt
> CreateTime:   Fri Jan 20 15:42:30 EST 2017 
> LastAccessTime:   UNKNOWN  
> Location: file:/user/hive/warehouse/apache/sample_pt/dummy=3
> Partition Parameters:  
>   COLUMN_STATS_ACCURATE   
> {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}}
>   last_modified_byctang   
>   last_modified_time  1485217063  
>   numFiles1   
>   numRows 100 
>   rawDataSize 5143
>   totalSize   5243
>   transient_lastDdlTime   1488842358
> ... 
> {code}
> 3: describe formatted default.sample_pt partition (dummy = 3) salary: column 
> stats exist
> {code}
> # col_namedata_type   min 
> max num_nulls   distinct_count  
> avg_col_len max_col_len num_trues   
> num_falses  comment 
>   
>  
> salaryint 1   151370  
> 0   94
>   
> from deserializer 
> {code}
> 4. alter table sample_pt rename to sample_pt_rename;
> 5. describe formatted default.sample_pt_rename partition (dummy = 3): 
> describing the renamed table partition (dummy=3) shows that COLUMN_STATS for 
> columns are still true.
> {code}
> # Detailed Partition Information   
> Partition Value:  [3]  
> Database: default  
> Table:sample_pt_rename 
> CreateTime:   Fri Jan 20 15:42:30 EST 2017 
> LastAccessTime:   UNKNOWN  
> Location: 
> file:/user/hive/warehouse/apache/sample_pt_rename/dummy=3
> Partition Parameters:  
>   COLUMN_STATS_ACCURATE   
> {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}}
>   last_modified_byctang   
>   last_modified_time  1485217063  
>   numFiles1   
>   numRows 100 
>   rawDataSize 5143
>   totalSize   5243
>   transient_lastDdlTime   1488842358  
> {code}
> describe formatted default.sample_pt_rename partition (dummy = 3) salary: the 
> column stats have been dropped.
> {code}
> # col_namedata_type   comment 
>  
>   
>  
> salaryint from deserializer   
>  
> Time taken: 0.131 seconds, Fetched: 3 row(s)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16147) Rename a partitioned table should not drop its partition columns stats

2017-04-24 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-16147:
---
Attachment: HIVE-16147.patch

The patch is to:
1. preserve the column stats in a partitioned table rename
2. since the column stats are no longer invalidated during a table rename, I 
renamed alter_table_invalidate_column_stats.q to alter_table_column_stats.q
A rough sketch of the idea follows below.
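
This is a rough sketch of the intent only, not the committed patch; the 
metastore calls shown are one plausible shape, and the actual patch may 
differ.
{code}
import java.util.List;
import org.apache.hadoop.hive.metastore.RawStore;
import org.apache.hadoop.hive.metastore.api.ColumnStatistics;
import org.apache.hadoop.hive.metastore.api.ColumnStatisticsDesc;

// Rough sketch of the intent only; the committed patch may differ. On a
// table rename, point the existing partition column statistics at the new
// table name instead of dropping them.
class RenameStatsSketch {
  static void carryOverStats(RawStore msdb, List<ColumnStatistics> existing,
      String newDbName, String newTableName, List<String> partVals)
      throws Exception {
    for (ColumnStatistics stats : existing) {
      ColumnStatisticsDesc desc = stats.getStatsDesc();
      desc.setDbName(newDbName);        // stats follow the renamed table
      desc.setTableName(newTableName);
      msdb.updatePartitionColumnStatistics(stats, partVals);
    }
  }
}
{code}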

> Rename a partitioned table should not drop its partition columns stats
> --
>
> Key: HIVE-16147
> URL: https://issues.apache.org/jira/browse/HIVE-16147
> Project: Hive
>  Issue Type: Bug
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-16147.patch
>
>
> When a partitioned table (e.g. sample_pt) is renamed (e.g. to 
> sample_pt_rename), describing its partition shows that the partition column 
> stats are still accurate, but actually they have all been dropped.
> It can be reproduced as follows:
> 1. analyze table sample_pt compute statistics for columns;
> 2. describe formatted default.sample_pt partition (dummy = 3):  COLUMN_STATS 
> for all columns are true
> {code}
> ...
> # Detailed Partition Information   
> Partition Value:  [3]  
> Database: default  
> Table:sample_pt
> CreateTime:   Fri Jan 20 15:42:30 EST 2017 
> LastAccessTime:   UNKNOWN  
> Location: file:/user/hive/warehouse/apache/sample_pt/dummy=3
> Partition Parameters:  
>   COLUMN_STATS_ACCURATE   
> {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}}
>   last_modified_byctang   
>   last_modified_time  1485217063  
>   numFiles1   
>   numRows 100 
>   rawDataSize 5143
>   totalSize   5243
>   transient_lastDdlTime   1488842358
> ... 
> {code}
> 3: describe formatted default.sample_pt partition (dummy = 3) salary: column 
> stats exist
> {code}
> # col_namedata_type   min 
> max num_nulls   distinct_count  
> avg_col_len max_col_len num_trues   
> num_falses  comment 
>   
>  
> salaryint 1   151370  
> 0   94
>   
> from deserializer 
> {code}
> 4. alter table sample_pt rename to sample_pt_rename;
> 5. describe formatted default.sample_pt_rename partition (dummy = 3): 
> describing the renamed table partition (dummy=3) shows that COLUMN_STATS for 
> columns are still true.
> {code}
> # Detailed Partition Information   
> Partition Value:  [3]  
> Database: default  
> Table:sample_pt_rename 
> CreateTime:   Fri Jan 20 15:42:30 EST 2017 
> LastAccessTime:   UNKNOWN  
> Location: 
> file:/user/hive/warehouse/apache/sample_pt_rename/dummy=3
> Partition Parameters:  
>   COLUMN_STATS_ACCURATE   
> {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}}
>   last_modified_byctang   
>   last_modified_time  1485217063  
>   numFiles1   
>   numRows 100 
>   rawDataSize 5143
>   totalSize   5243
>   transient_lastDdlTime   1488842358  
> {code}
> describe formatted default.sample_pt_rename partition (dummy = 3) salary: the 
> column stats have been dropped.
> {code}
> # col_namedata_type   comment 
>  
>   
>  
> salaryint from deserializer   
>  
> Time taken: 0.131 seconds, Fetched: 3 row(s)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16487) Serious Zookeeper exception is logged when a race condition happens

2017-04-25 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983128#comment-15983128
 ] 

Chaoyu Tang commented on HIVE-16487:


[~pvary] With this patch, I think the Exception e1 will be swallowed after 
numRetriesForLock retries if it is not an instance of KeeperException.
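
A minimal sketch of the concern and one possible shape of the fix, assuming a 
labeled do/while (this is illustrative, not the actual ZooKeeperHiveLockManager 
code; {{lockPrimitive}} here is a hypothetical stand-in for the real locking call):

{code}
import org.apache.zookeeper.KeeperException;

public class LockRetrySketch {
  private Object lockPrimitive() throws Exception {
    throw new UnsupportedOperationException("hypothetical stand-in");
  }

  public Object lockWithRetries(int numRetriesForLock) throws Exception {
    int tryNum = 0;
    Exception lastError = null;
    retryLoop:
    do {
      tryNum++;
      try {
        return lockPrimitive();
      } catch (Exception e1) {
        lastError = e1;
        if (e1 instanceof KeeperException) {
          switch (((KeeperException) e1).code()) {
          case CONNECTIONLOSS:
          case OPERATIONTIMEOUT:
            continue;          // transient: retry the do/while
          default:
            break retryLoop;   // serious: leave the whole loop, not just the switch
          }
        }
        throw e1;              // non-KeeperException failures are rethrown, not eaten
      }
    } while (tryNum < numRetriesForLock);
    throw lastError;           // surface the last ZK error once retries are exhausted
  }
}
{code}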

> Serious Zookeeper exception is logged when a race condition happens
> ---
>
> Key: HIVE-16487
> URL: https://issues.apache.org/jira/browse/HIVE-16487
> Project: Hive
>  Issue Type: Bug
>  Components: Locking
>Affects Versions: 3.0.0
>Reporter: Peter Vary
>Assignee: Peter Vary
> Attachments: HIVE-16487.patch
>
>
> A customer started to see this in the logs, but happily everything was 
> working as intended:
> {code}
> 2017-03-30 12:01:59,446 ERROR ZooKeeperHiveLockManager: 
> [HiveServer2-Background-Pool: Thread-620]: Serious Zookeeper exception: 
> org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = 
> NoNode for /hive_zookeeper_namespace//LOCK-SHARED-
> {code}
> This was happening because of a race condition between lock releasing and 
> lock acquiring: the thread releasing the lock removes the parent ZK node just 
> after the thread acquiring the lock has made sure that the parent node exists.
> Since this can happen without causing any real problem, I plan to add NODEEXISTS and 
> NONODE as transient ZooKeeper exceptions, so users are not confused.
> Also, the original author of ZooKeeperHiveLockManager may have planned to handle 
> different ZooKeeperExceptions differently, and the code is hard to 
> understand. See the {{continue}} and the {{break}}. The {{break}} breaks only 
> the switch, not the loop, which IMHO is not intuitive:
> {code}
> do {
>   try {
> [..]
> ret = lockPrimitive(key, mode, keepAlive, parentCreated, 
>   } catch (Exception e1) {
> if (e1 instanceof KeeperException) {
>   KeeperException e = (KeeperException) e1;
>   switch (e.code()) {
>   case CONNECTIONLOSS:
>   case OPERATIONTIMEOUT:
> LOG.debug("Possibly transient ZooKeeper exception: ", e);
> continue;
>   default:
> LOG.error("Serious Zookeeper exception: ", e);
> break;
>   }
> }
> [..]
>   }
> } while (tryNum < numRetriesForLock);
> {code}
> If we do not want to try again in case of a "Serious Zookeeper exception:", 
> then we should add a label to the do loop and break it in the switch.
> If we do want to retry regardless of the type of the ZK exception, then we 
> should just change the {{continue;}} to {{break;}} and move the part of the 
> code which did not run in the {{continue}} case into the {{default}} branch of 
> the switch, so the code is easier to understand.
> Any suggestions or ideas [~ctang.ma] or [~szehon]?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16147) Rename a partitioned table should not drop its partition columns stats

2017-04-28 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15989269#comment-15989269
 ] 

Chaoyu Tang commented on HIVE-16147:


[~pxiong] Thanks for looking into this. Yeah, I made some changes to fix the 
test failures and also optimized the code a little. I have uploaded the 2nd 
patch to RB and am requesting review.

> Rename a partitioned table should not drop its partition columns stats
> --
>
> Key: HIVE-16147
> URL: https://issues.apache.org/jira/browse/HIVE-16147
> Project: Hive
>  Issue Type: Bug
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-16147.1.patch, HIVE-16147.patch, HIVE-16147.patch
>
>
> When a partitioned table (e.g. sample_pt) is renamed (e.g. to 
> sample_pt_rename), describing its partition shows that the partition column 
> stats are still accurate, but they have actually all been dropped.
> It can be reproduced as follows:
> 1. analyze table sample_pt compute statistics for columns;
> 2. describe formatted default.sample_pt partition (dummy = 3):  COLUMN_STATS 
> for all columns are true
> {code}
> ...
> # Detailed Partition Information   
> Partition Value:  [3]  
> Database: default  
> Table:sample_pt
> CreateTime:   Fri Jan 20 15:42:30 EST 2017 
> LastAccessTime:   UNKNOWN  
> Location: file:/user/hive/warehouse/apache/sample_pt/dummy=3
> Partition Parameters:  
>   COLUMN_STATS_ACCURATE   
> {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}}
>   last_modified_byctang   
>   last_modified_time  1485217063  
>   numFiles1   
>   numRows 100 
>   rawDataSize 5143
>   totalSize   5243
>   transient_lastDdlTime   1488842358
> ... 
> {code}
> 3: describe formatted default.sample_pt partition (dummy = 3) salary: column 
> stats exist
> {code}
> # col_namedata_type   min 
> max num_nulls   distinct_count  
> avg_col_len max_col_len num_trues   
> num_falses  comment 
>   
>  
> salaryint 1   151370  
> 0   94
>   
> from deserializer 
> {code}
> 4. alter table sample_pt rename to sample_pt_rename;
> 5. describe formatted default.sample_pt_rename partition (dummy = 3): 
> describing the renamed table's partition (dummy = 3) shows that COLUMN_STATS for 
> columns are still true.
> {code}
> # Detailed Partition Information   
> Partition Value:  [3]  
> Database: default  
> Table:sample_pt_rename 
> CreateTime:   Fri Jan 20 15:42:30 EST 2017 
> LastAccessTime:   UNKNOWN  
> Location: 
> file:/user/hive/warehouse/apache/sample_pt_rename/dummy=3
> Partition Parameters:  
>   COLUMN_STATS_ACCURATE   
> {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}}
>   last_modified_byctang   
>   last_modified_time  1485217063  
>   numFiles1   
>   numRows 100 
>   rawDataSize 5143
>   totalSize   5243
>   transient_lastDdlTime   1488842358  
> {code}
> describe formatted default.sample_pt_rename partition (dummy = 3) salary: the 
> column stats have been dropped.
> {code}
> # col_namedata_type   comment 
>  
>   
>  
> salaryint from deserializer   
>  
> Time taken: 0.131 seconds, Fetched: 3 row(s)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16147) Rename a partitioned table should not drop its partition columns stats

2017-04-28 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15989168#comment-15989168
 ] 

Chaoyu Tang commented on HIVE-16147:


The only test failure is not related to this patch. [~pxiong], could you 
review the patch? Thanks.

> Rename a partitioned table should not drop its partition columns stats
> --
>
> Key: HIVE-16147
> URL: https://issues.apache.org/jira/browse/HIVE-16147
> Project: Hive
>  Issue Type: Bug
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-16147.1.patch, HIVE-16147.patch, HIVE-16147.patch
>
>
> When a partitioned table (e.g. sample_pt) is renamed (e.g. to 
> sample_pt_rename), describing its partition shows that the partition column 
> stats are still accurate, but they have actually all been dropped.
> It can be reproduced as follows:
> 1. analyze table sample_pt compute statistics for columns;
> 2. describe formatted default.sample_pt partition (dummy = 3):  COLUMN_STATS 
> for all columns are true
> {code}
> ...
> # Detailed Partition Information   
> Partition Value:  [3]  
> Database: default  
> Table:sample_pt
> CreateTime:   Fri Jan 20 15:42:30 EST 2017 
> LastAccessTime:   UNKNOWN  
> Location: file:/user/hive/warehouse/apache/sample_pt/dummy=3
> Partition Parameters:  
>   COLUMN_STATS_ACCURATE   
> {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}}
>   last_modified_byctang   
>   last_modified_time  1485217063  
>   numFiles1   
>   numRows 100 
>   rawDataSize 5143
>   totalSize   5243
>   transient_lastDdlTime   1488842358
> ... 
> {code}
> 3: describe formatted default.sample_pt partition (dummy = 3) salary: column 
> stats exist
> {code}
> # col_namedata_type   min 
> max num_nulls   distinct_count  
> avg_col_len max_col_len num_trues   
> num_falses  comment 
>   
>  
> salaryint 1   151370  
> 0   94
>   
> from deserializer 
> {code}
> 4. alter table sample_pt rename to sample_pt_rename;
> 5. describe formatted default.sample_pt_rename partition (dummy = 3): 
> describing the renamed table's partition (dummy = 3) shows that COLUMN_STATS for 
> columns are still true.
> {code}
> # Detailed Partition Information   
> Partition Value:  [3]  
> Database: default  
> Table:sample_pt_rename 
> CreateTime:   Fri Jan 20 15:42:30 EST 2017 
> LastAccessTime:   UNKNOWN  
> Location: 
> file:/user/hive/warehouse/apache/sample_pt_rename/dummy=3
> Partition Parameters:  
>   COLUMN_STATS_ACCURATE   
> {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}}
>   last_modified_byctang   
>   last_modified_time  1485217063  
>   numFiles1   
>   numRows 100 
>   rawDataSize 5143
>   totalSize   5243
>   transient_lastDdlTime   1488842358  
> {code}
> describe formatted default.sample_pt_rename partition (dummy = 3) salary: the 
> column stats have been dropped.
> {code}
> # col_namedata_type   comment 
>  
>   
>  
> salaryint from deserializer   
>  
> Time taken: 0.131 seconds, Fetched: 3 row(s)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16147) Rename a partitioned table should not drop its partition columns stats

2017-04-27 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15987783#comment-15987783
 ] 

Chaoyu Tang commented on HIVE-16147:


The test failures are not related to the patch. [~pxiong], could you help 
review it again? Thanks.

> Rename a partitioned table should not drop its partition columns stats
> --
>
> Key: HIVE-16147
> URL: https://issues.apache.org/jira/browse/HIVE-16147
> Project: Hive
>  Issue Type: Bug
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-16147.1.patch, HIVE-16147.patch, HIVE-16147.patch
>
>
> When a partitioned table (e.g. sample_pt) is renamed (e.g. to 
> sample_pt_rename), describing its partition shows that the partition column 
> stats are still accurate, but they have actually all been dropped.
> It can be reproduced as follows:
> 1. analyze table sample_pt compute statistics for columns;
> 2. describe formatted default.sample_pt partition (dummy = 3):  COLUMN_STATS 
> for all columns are true
> {code}
> ...
> # Detailed Partition Information   
> Partition Value:  [3]  
> Database: default  
> Table:sample_pt
> CreateTime:   Fri Jan 20 15:42:30 EST 2017 
> LastAccessTime:   UNKNOWN  
> Location: file:/user/hive/warehouse/apache/sample_pt/dummy=3
> Partition Parameters:  
>   COLUMN_STATS_ACCURATE   
> {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}}
>   last_modified_byctang   
>   last_modified_time  1485217063  
>   numFiles1   
>   numRows 100 
>   rawDataSize 5143
>   totalSize   5243
>   transient_lastDdlTime   1488842358
> ... 
> {code}
> 3: describe formatted default.sample_pt partition (dummy = 3) salary: column 
> stats exist
> {code}
> # col_namedata_type   min 
> max num_nulls   distinct_count  
> avg_col_len max_col_len num_trues   
> num_falses  comment 
>   
>  
> salaryint 1   151370  
> 0   94
>   
> from deserializer 
> {code}
> 4. alter table sample_pt rename to sample_pt_rename;
> 5. describe formatted default.sample_pt_rename partition (dummy = 3): 
> describing the renamed table's partition (dummy = 3) shows that COLUMN_STATS for 
> columns are still true.
> {code}
> # Detailed Partition Information   
> Partition Value:  [3]  
> Database: default  
> Table:sample_pt_rename 
> CreateTime:   Fri Jan 20 15:42:30 EST 2017 
> LastAccessTime:   UNKNOWN  
> Location: 
> file:/user/hive/warehouse/apache/sample_pt_rename/dummy=3
> Partition Parameters:  
>   COLUMN_STATS_ACCURATE   
> {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}}
>   last_modified_byctang   
>   last_modified_time  1485217063  
>   numFiles1   
>   numRows 100 
>   rawDataSize 5143
>   totalSize   5243
>   transient_lastDdlTime   1488842358  
> {code}
> describe formatted default.sample_pt_rename partition (dummy = 3) salary: the 
> column stats have been dropped.
> {code}
> # col_namedata_type   comment 
>  
>   
>  
> salaryint from deserializer   
>  
> Time taken: 0.131 seconds, Fetched: 3 row(s)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16487) Serious Zookeeper exception is logged when a race condition happens

2017-04-28 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15988954#comment-15988954
 ] 

Chaoyu Tang commented on HIVE-16487:


LGTM, +1 pending tests.

> Serious Zookeeper exception is logged when a race condition happens
> ---
>
> Key: HIVE-16487
> URL: https://issues.apache.org/jira/browse/HIVE-16487
> Project: Hive
>  Issue Type: Bug
>  Components: Locking
>Affects Versions: 3.0.0
>Reporter: Peter Vary
>Assignee: Peter Vary
> Attachments: HIVE-16487.02.patch, HIVE-16487.patch
>
>
> A customer started to see this in the logs, but happily everything was 
> working as intended:
> {code}
> 2017-03-30 12:01:59,446 ERROR ZooKeeperHiveLockManager: 
> [HiveServer2-Background-Pool: Thread-620]: Serious Zookeeper exception: 
> org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = 
> NoNode for /hive_zookeeper_namespace//LOCK-SHARED-
> {code}
> This was happening because of a race condition between lock releasing and 
> lock acquiring: the thread releasing the lock removes the parent ZK node just 
> after the thread acquiring the lock has made sure that the parent node exists.
> Since this can happen without causing any real problem, I plan to add NODEEXISTS and 
> NONODE as transient ZooKeeper exceptions, so users are not confused.
> Also, the original author of ZooKeeperHiveLockManager may have planned to handle 
> different ZooKeeperExceptions differently, and the code is hard to 
> understand. See the {{continue}} and the {{break}}. The {{break}} breaks only 
> the switch, not the loop, which IMHO is not intuitive:
> {code}
> do {
>   try {
> [..]
> ret = lockPrimitive(key, mode, keepAlive, parentCreated, 
>   } catch (Exception e1) {
> if (e1 instanceof KeeperException) {
>   KeeperException e = (KeeperException) e1;
>   switch (e.code()) {
>   case CONNECTIONLOSS:
>   case OPERATIONTIMEOUT:
> LOG.debug("Possibly transient ZooKeeper exception: ", e);
> continue;
>   default:
> LOG.error("Serious Zookeeper exception: ", e);
> break;
>   }
> }
> [..]
>   }
> } while (tryNum < numRetriesForLock);
> {code}
> If we do not want to try again in case of a "Serious Zookeeper exception:", 
> then we should add a label to the do loop and break it in the switch.
> If we do want to retry regardless of the type of the ZK exception, then we 
> should just change the {{continue;}} to {{break;}} and move the part of the 
> code which did not run in the {{continue}} case into the {{default}} branch of 
> the switch, so the code is easier to understand.
> Any suggestions or ideas [~ctang.ma] or [~szehon]?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16394) HoS does not support queue name change in middle of session

2017-05-02 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-16394:
---
Component/s: Spark

> HoS does not support queue name change in middle of session
> ---
>
> Key: HIVE-16394
> URL: https://issues.apache.org/jira/browse/HIVE-16394
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Fix For: 3.0.0
>
> Attachments: HIVE-16394.patch
>
>
> The mapreduce.job.queuename setting only takes effect the first time HoS 
> executes a query. After that, changing mapreduce.job.queuename does not change 
> the YARN scheduler queue that the query runs in.
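
A hedged repro sketch over Hive JDBC (the host, credentials, queue names, and 
table are made up for illustration): with this bug, the second query still runs 
in queueA even though the session property has been changed to queueB.

{code}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class QueueSwitchRepro {
  public static void main(String[] args) throws Exception {
    Class.forName("org.apache.hive.jdbc.HiveDriver");
    try (Connection conn = DriverManager.getConnection(
             "jdbc:hive2://hs2-host:10000/default", "user", "");
         Statement stmt = conn.createStatement()) {
      stmt.execute("set hive.execution.engine=spark");
      stmt.execute("set mapreduce.job.queuename=queueA");
      stmt.execute("select count(*) from sample_pt");  // Spark session starts in queueA
      stmt.execute("set mapreduce.job.queuename=queueB");
      stmt.execute("select count(*) from sample_pt");  // still runs in queueA (the bug)
    }
  }
}
{code}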



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16147) Rename a partitioned table should not drop its partition columns stats

2017-05-02 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-16147:
---
Component/s: Statistics

> Rename a partitioned table should not drop its partition columns stats
> --
>
> Key: HIVE-16147
> URL: https://issues.apache.org/jira/browse/HIVE-16147
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Fix For: 3.0.0, 2.4.0
>
> Attachments: HIVE-16147.1.patch, HIVE-16147.patch, HIVE-16147.patch
>
>
> When a partitioned table (e.g. sample_pt) is renamed (e.g. to 
> sample_pt_rename), describing its partition shows that the partition column 
> stats are still accurate, but they have actually all been dropped.
> It can be reproduced as follows:
> 1. analyze table sample_pt compute statistics for columns;
> 2. describe formatted default.sample_pt partition (dummy = 3):  COLUMN_STATS 
> for all columns are true
> {code}
> ...
> # Detailed Partition Information   
> Partition Value:  [3]  
> Database: default  
> Table:sample_pt
> CreateTime:   Fri Jan 20 15:42:30 EST 2017 
> LastAccessTime:   UNKNOWN  
> Location: file:/user/hive/warehouse/apache/sample_pt/dummy=3
> Partition Parameters:  
>   COLUMN_STATS_ACCURATE   
> {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}}
>   last_modified_byctang   
>   last_modified_time  1485217063  
>   numFiles1   
>   numRows 100 
>   rawDataSize 5143
>   totalSize   5243
>   transient_lastDdlTime   1488842358
> ... 
> {code}
> 3: describe formatted default.sample_pt partition (dummy = 3) salary: column 
> stats exist
> {code}
> # col_namedata_type   min 
> max num_nulls   distinct_count  
> avg_col_len max_col_len num_trues   
> num_falses  comment 
>   
>  
> salaryint 1   151370  
> 0   94
>   
> from deserializer 
> {code}
> 4. alter table sample_pt rename to sample_pt_rename;
> 5. describe formatted default.sample_pt_rename partition (dummy = 3): 
> describing the renamed table's partition (dummy = 3) shows that COLUMN_STATS for 
> columns are still true.
> {code}
> # Detailed Partition Information   
> Partition Value:  [3]  
> Database: default  
> Table:sample_pt_rename 
> CreateTime:   Fri Jan 20 15:42:30 EST 2017 
> LastAccessTime:   UNKNOWN  
> Location: 
> file:/user/hive/warehouse/apache/sample_pt_rename/dummy=3
> Partition Parameters:  
>   COLUMN_STATS_ACCURATE   
> {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}}
>   last_modified_byctang   
>   last_modified_time  1485217063  
>   numFiles1   
>   numRows 100 
>   rawDataSize 5143
>   totalSize   5243
>   transient_lastDdlTime   1488842358  
> {code}
> describe formatted default.sample_pt_rename partition (dummy = 3) salary: the 
> column stats have been dropped.
> {code}
> # col_namedata_type   comment 
>  
>   
>  
> salaryint from deserializer   
>  
> Time taken: 0.131 seconds, Fetched: 3 row(s)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16189) Table column stats might be invalidated in a failed table rename

2017-05-02 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-16189:
---
Component/s: Statistics

> Table column stats might be invalidated in a failed table rename
> 
>
> Key: HIVE-16189
> URL: https://issues.apache.org/jira/browse/HIVE-16189
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Fix For: 2.2.0
>
> Attachments: HIVE-16189.1.patch, HIVE-16189.2.patch, 
> HIVE-16189.2.patch, HIVE-16189.patch
>
>
> If the table rename does not succeed because it fails to move the data to 
> the renamed table's folder, the changes in TAB_COL_STATS are not rolled 
> back, which leads to invalid column stats.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15485) Investigate the DoAs failure in HoS

2017-05-02 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-15485:
---
Component/s: Spark

> Investigate the DoAs failure in HoS
> ---
>
> Key: HIVE-15485
> URL: https://issues.apache.org/jira/browse/HIVE-15485
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Fix For: 2.2.0
>
> Attachments: HIVE-15485.1.patch, HIVE-15485.2.patch, HIVE-15485.patch
>
>
> With DoAs enabled, HoS failed with the following errors:
> {code}
> Exception in thread "main" org.apache.hadoop.security.AccessControlException: 
> systest tries to renew a token with renewer hive
>   at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.renewToken(AbstractDelegationTokenSecretManager.java:484)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.renewDelegationToken(FSNamesystem.java:7543)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.renewDelegationToken(NameNodeRpcServer.java:555)
>   at 
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.renewDelegationToken(AuthorizationProviderProxyClientProtocol.java:674)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.renewDelegationToken(ClientNamenodeProtocolServerSideTranslatorPB.java:999)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2141)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2137)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1783)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2135)
> {code}
> It is related to the change from HIVE-14383. It looks like SparkSubmit 
> logs in to Kerberos with the passed-in hive principal/keytab and then tries to 
> create an HDFS delegation token for user systest with renewer hive.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15653) Some ALTER TABLE commands drop table stats

2017-05-02 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-15653:
---
Component/s: Statistics

> Some ALTER TABLE commands drop table stats
> --
>
> Key: HIVE-15653
> URL: https://issues.apache.org/jira/browse/HIVE-15653
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore, Statistics
>Affects Versions: 1.1.0
>Reporter: Alexander Behm
>Assignee: Chaoyu Tang
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-15653.1.patch, HIVE-15653.2.patch, 
> HIVE-15653.3.patch, HIVE-15653.4.patch, HIVE-15653.5.patch, 
> HIVE-15653.6.patch, HIVE-15653.patch
>
>
> Some ALTER TABLE commands drop the table stats. That may make sense for some 
> ALTER TABLE operations, but certainly not for others. Personally, I think 
> ALTER TABLE should only change what was requested by the user, without any 
> side effects that may be unclear to users. In particular, collecting stats 
> can be an expensive operation, so it is rather inconvenient for users if the 
> stats get wiped accidentally.
> Repro:
> {code}
> create table t (i int);
> insert into t values(1);
> analyze table t compute statistics;
> alter table t set tblproperties('test'='test');
> hive> describe formatted t;
> OK
> # col_namedata_type   comment 
>
> i int 
>
> # Detailed Table Information   
> Database: default  
> Owner:abehm
> CreateTime:   Tue Jan 17 18:13:34 PST 2017 
> LastAccessTime:   UNKNOWN  
> Protect Mode: None 
> Retention:0
> Location: hdfs://localhost:20500/test-warehouse/t  
> Table Type:   MANAGED_TABLE
> Table Parameters:  
>   COLUMN_STATS_ACCURATE   false   
>   last_modified_byabehm   
>   last_modified_time  1484705748  
>   numFiles1   
>   numRows -1  
>   rawDataSize -1  
>   testtest
>   totalSize   2   
>   transient_lastDdlTime   1484705748  
>
> # Storage Information  
> SerDe Library:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe  
>  
> InputFormat:  org.apache.hadoop.mapred.TextInputFormat 
> OutputFormat: 
> org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat   
> Compressed:   No   
> Num Buckets:  -1   
> Bucket Columns:   []   
> Sort Columns: []   
> Storage Desc Params:   
>   serialization.format1   
> Time taken: 0.169 seconds, Fetched: 34 row(s)
> {code}
> The same behavior can be observed with several other ALTER TABLE commands.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-14359) Hive on Spark might fail in HS2 with LDAP authentication in a kerberized cluster

2017-05-02 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-14359:
---
Component/s: Spark

> Hive on Spark might fail in HS2 with LDAP authentication in a kerberized 
> cluster
> 
>
> Key: HIVE-14359
> URL: https://issues.apache.org/jira/browse/HIVE-14359
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Fix For: 2.1.1, 2.2.0
>
> Attachments: HIVE-14359.patch
>
>
> When HS2 is used as a gateway for LDAP users to access and run queries in a 
> kerberized cluster, its authentication mode is configured as LDAP, and HoS 
> might then fail for the same reason as in HIVE-10594. 
> hive.server2.authentication is not the proper property to determine whether a 
> cluster is kerberized; hadoop.security.authentication should be used instead.
> The failure is in the Spark client communicating with the rest of Hadoop, as 
> it assumes Kerberos does not need to be used.
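
A minimal sketch of the check implied above (illustrative, not the actual patch): 
derive "is the cluster kerberized" from hadoop.security.authentication rather 
than from hive.server2.authentication.

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;

public class KerberosCheck {
  // hive.server2.authentication only describes HS2's front-end auth (e.g. LDAP);
  // whether the cluster itself uses Kerberos comes from the Hadoop setting below.
  public static boolean clusterIsKerberized(Configuration hadoopConf) {
    return "kerberos".equalsIgnoreCase(
        hadoopConf.get("hadoop.security.authentication", "simple"));
  }

  public static void main(String[] args) {
    Configuration conf = new Configuration();
    System.out.println("kerberized: " + clusterIsKerberized(conf));
    // UGI reaches the same answer once the configuration is applied:
    UserGroupInformation.setConfiguration(conf);
    System.out.println("security enabled: " + UserGroupInformation.isSecurityEnabled());
  }
}
{code}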



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-14697) Can not access kerberized HS2 Web UI

2017-05-02 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-14697:
---
Component/s: Security

> Can not access kerberized HS2 Web UI
> 
>
> Key: HIVE-14697
> URL: https://issues.apache.org/jira/browse/HIVE-14697
> Project: Hive
>  Issue Type: Bug
>  Components: Security, Web UI
>Affects Versions: 2.1.0
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Fix For: 2.1.1, 2.2.0
>
> Attachments: HIVE-14697.patch
>
>
> Failed to access the kerberized HS2 WebUI, with the following error msg:
> {code}
> curl -v -u : --negotiate http://util185.phx2.cbsig.net:10002/ 
> > GET / HTTP/1.1 
> > Host: util185.phx2.cbsig.net:10002 
> > Authorization: Negotiate YIIU7...[redacted]... 
> > User-Agent: curl/7.42.1 
> > Accept: */* 
> > 
> < HTTP/1.1 413 FULL head 
> < Content-Length: 0 
> < Connection: close 
> < Server: Jetty(7.6.0.v20120127) 
> {code}
> This is because the Jetty default request header size (4K) is too small in 
> some Kerberos cases.
> So this patch increases the request header size to 64K.
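
A hedged sketch of the idea using Jetty 9 style APIs (the Jetty 7.6 embedded in 
HS2 at the time exposes this setting through a different API, so treat this as 
illustrative only):

{code}
import org.eclipse.jetty.server.HttpConfiguration;
import org.eclipse.jetty.server.HttpConnectionFactory;
import org.eclipse.jetty.server.Server;
import org.eclipse.jetty.server.ServerConnector;

public class WebUiServerSketch {
  public static Server create(int port) {
    Server server = new Server();
    HttpConfiguration httpConfig = new HttpConfiguration();
    // 64K instead of the 4K default, so a large SPNEGO "Authorization: Negotiate"
    // header no longer triggers "413 FULL head".
    httpConfig.setRequestHeaderSize(65536);
    ServerConnector connector =
        new ServerConnector(server, new HttpConnectionFactory(httpConfig));
    connector.setPort(port);
    server.addConnector(connector);
    return server;
  }
}
{code}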



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-12270) Add DBTokenStore support to HS2 delegation token

2017-05-02 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-12270:
---
Component/s: Security
 Authentication

> Add DBTokenStore support to HS2 delegation token
> 
>
> Key: HIVE-12270
> URL: https://issues.apache.org/jira/browse/HIVE-12270
> Project: Hive
>  Issue Type: New Feature
>  Components: Authentication, Security
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Fix For: 2.1.0
>
> Attachments: HIVE-12270.1.nothrift.patch, HIVE-12270.1.patch, 
> HIVE-12270.2.patch, HIVE-12270.3.nothrift.patch, HIVE-12270.3.patch, 
> HIVE-12270.nothrift.patch
>
>
> DBTokenStore was initially introduced by HIVE-3255 in Hive 0.12, mainly for 
> the HMS delegation token. Later, in Hive 0.13, HS2 delegation token support 
> was introduced by HIVE-5155, but it used MemoryTokenStore as the token store. 
> HIVE-9622's use of the shared RawStore (or HMSHandler) to access the 
> token/key information in the HMS DB directly from HS2 does not seem to be the 
> right approach to support DBTokenStore in HS2. I think we should use 
> HiveMetaStoreClient in HS2 instead.
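
A hedged sketch of the suggested direction: HS2 obtains HMS delegation tokens 
through HiveMetaStoreClient instead of reaching into the shared RawStore. The 
token-API signature here is my recollection of the IMetaStoreClient API and 
should be treated as an assumption.

{code}
import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;

public class TokenViaMetaStoreClient {
  public static String fetchToken(String owner, String renewer) throws Exception {
    // Talk to HMS over its Thrift client rather than via a shared RawStore handle.
    HiveMetaStoreClient msc = new HiveMetaStoreClient(new HiveConf());
    try {
      return msc.getDelegationToken(owner, renewer); // token string for the owner
    } finally {
      msc.close();
    }
  }
}
{code}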



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-13401) Kerberized HS2 with LDAP auth enabled fails kerberos/delegation token authentication

2017-05-02 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-13401:
---
Component/s: Security

> Kerberized HS2 with LDAP auth enabled fails kerberos/delegation token 
> authentication
> 
>
> Key: HIVE-13401
> URL: https://issues.apache.org/jira/browse/HIVE-13401
> Project: Hive
>  Issue Type: Bug
>  Components: Authentication, Security
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Fix For: 2.1.0
>
> Attachments: HIVE-13401-branch2.0.1.patch, HIVE-13401.patch
>
>
> When HS2 is running in a kerberized cluster but with another SASL 
> authentication (e.g. LDAP) enabled, it fails in Kerberos/delegation token 
> authentication. This is because the HS2 server uses TSetIpAddressProcessor 
> when other authentication is enabled. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-12965) Insert overwrite local directory should preserve the overwritten directory permission

2017-05-02 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-12965:
---
Component/s: Security

> Insert overwrite local directory should preserve the overwritten directory 
> permission
> -
>
> Key: HIVE-12965
> URL: https://issues.apache.org/jira/browse/HIVE-12965
> Project: Hive
>  Issue Type: Bug
>  Components: Security
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Fix For: 1.3.0, 2.1.0
>
> Attachments: HIVE-12965.1.patch, HIVE-12965.2.patch, 
> HIVE-12965.3.patch, HIVE-12965.patch
>
>
> In Hive, "insert overwrite local directory" first deletes the overwritten 
> directory if exists, recreate a new one, then copy the files from src 
> directory to the new local directory. This process sometimes changes the 
> permissions of the to-be-overwritten local directory, therefore causing some 
> applications no more to be able to access its content.
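
A hedged sketch (not Hive's actual implementation) of the fix's intent: remember 
the old directory permission, recreate the directory, and set the permission back 
before copying in the query results.

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;

public class OverwriteLocalDirSketch {
  public static void overwritePreservingPermission(Path dir) throws Exception {
    FileSystem localFs = FileSystem.getLocal(new Configuration());
    FsPermission oldPerm = null;
    if (localFs.exists(dir)) {
      oldPerm = localFs.getFileStatus(dir).getPermission(); // remember the old mode
      localFs.delete(dir, true);
    }
    localFs.mkdirs(dir);
    if (oldPerm != null) {
      localFs.setPermission(dir, oldPerm);                  // restore it on the new dir
    }
    // ... then copy the query results into dir ...
  }
}
{code}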



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-12188) DoAs does not work properly in non-kerberos secured HS2

2017-05-02 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-12188:
---
Component/s: Security

> DoAs does not work properly in non-kerberos secured HS2
> ---
>
> Key: HIVE-12188
> URL: https://issues.apache.org/jira/browse/HIVE-12188
> Project: Hive
>  Issue Type: Bug
>  Components: Security
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-12188.patch
>
>
> The case with the following settings is valid, but it still does not seem to 
> work correctly in the current HS2:
> ==
> hive.server2.authentication=NONE (or LDAP)
> hive.server2.enable.doAs=true
> hive.metastore.sasl.enabled=true (with HMS Kerberos enabled)
> ==
> Currently HS2 is able to fetch a delegation token from a Kerberos-secured HMS 
> only when it is itself also Kerberos-secured.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-11064) ALTER TABLE CASCADE ERROR unbalanced calls to openTransaction/commitTransaction

2017-05-02 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993588#comment-15993588
 ] 

Chaoyu Tang commented on HIVE-11064:


I was not able to reproduce this issue in HIVE-3.0.0; it was probably fixed as 
a side effect of HIVE-16147. I could reproduce this issue in CDH5.4.0, but 
only when hive.metastore.try.direct.sql is disabled.



> ALTER TABLE CASCADE ERROR unbalanced calls to 
> openTransaction/commitTransaction
> ---
>
> Key: HIVE-11064
> URL: https://issues.apache.org/jira/browse/HIVE-11064
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.1.0
> Environment: CDH5.4.0
>Reporter: fatkun
>Assignee: Chaoyu Tang
>
> My Hive version is hive-1.1.0-cdh5.4.0.
> Follow these steps and the exception is thrown.
>  
> Use the hive client:
> {code}
> CREATE TABLE test1 (name string) PARTITIONED BY (pt string);
> ALTER TABLE test1 ADD PARTITION (pt='1');
> ALTER TABLE test1 CHANGE name name1 string;
> ALTER TABLE test1 CHANGE name1 name string cascade;
> {code}
> then this exception is thrown:
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. Unable to alter table. 
> java.lang.RuntimeException: commitTransaction was called but 
> openTransactionCalls = 0. This probably indicates that there are unbalanced 
> calls to openTransaction/commitTransaction
>  
> metastore log
> {quote}
> MetaException(message:java.lang.RuntimeException: commitTransaction was 
> called but openTransactionCalls = 0. This probably indicates that there are 
> unbalanced calls to openTransaction/commitTransaction)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:5257)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_core(HiveMetaStore.java:3338)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_with_cascade(HiveMetaStore.java:3290)
>   at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:102)
>   at com.sun.proxy.$Proxy5.alter_table_with_cascade(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_table_with_cascade.getResult(ThriftHiveMetastore.java:9131)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_table_with_cascade.getResult(ThriftHiveMetastore.java:9115)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: commitTransaction was called but 
> openTransactionCalls = 0. This probably indicates that there are unbalanced 
> calls to openTransaction/commitTransaction
>   at 
> org.apache.hadoop.hive.metastore.ObjectStore.commitTransaction(ObjectStore.java:448)
>   at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:98)
>   at com.sun.proxy.$Proxy0.commitTransaction(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.HiveAlterHandler.alterTable(HiveAlterHandler.java:242)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_core(HiveMetaStore.java:3318)
>   ... 19 more
> {quote}
> I debugged the code; the function "private void 
> updatePartColumnStatsForAlterColumns" may be wrong. Some transaction is 
> rolled back, but I don't know the exact error.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (HIVE-11064) ALTER TABLE CASCADE ERROR unbalanced calls to openTransaction/commitTransaction

2017-05-02 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993588#comment-15993588
 ] 

Chaoyu Tang edited comment on HIVE-11064 at 5/2/17 7:42 PM:


I was not able to reproduce this issue in Hive 3.0.0; it was probably fixed as 
a side effect of HIVE-16147. I could reproduce this issue in CDH5.4.0, but 
only when hive.metastore.try.direct.sql is disabled.




was (Author: ctang.ma):
I was not able to reproduce this issue in HIVE-3.0.0; it was probably fixed as 
a side effect of HIVE-16147. I could reproduce this issue in CDH5.4.0, but 
only when hive.metastore.try.direct.sql is disabled.



> ALTER TABLE CASCADE ERROR unbalanced calls to 
> openTransaction/commitTransaction
> ---
>
> Key: HIVE-11064
> URL: https://issues.apache.org/jira/browse/HIVE-11064
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.1.0
> Environment: CDH5.4.0
>Reporter: fatkun
>Assignee: Chaoyu Tang
>
> My Hive version is hive-1.1.0-cdh5.4.0.
> Follow these steps and the exception is thrown.
>  
> Use the hive client:
> {code}
> CREATE TABLE test1 (name string) PARTITIONED BY (pt string);
> ALTER TABLE test1 ADD PARTITION (pt='1');
> ALTER TABLE test1 CHANGE name name1 string;
> ALTER TABLE test1 CHANGE name1 name string cascade;
> {code}
> then this exception is thrown:
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. Unable to alter table. 
> java.lang.RuntimeException: commitTransaction was called but 
> openTransactionCalls = 0. This probably indicates that there are unbalanced 
> calls to openTransaction/commitTransaction
>  
> metastore log
> {quote}
> MetaException(message:java.lang.RuntimeException: commitTransaction was 
> called but openTransactionCalls = 0. This probably indicates that there are 
> unbalanced calls to openTransaction/commitTransaction)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:5257)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_core(HiveMetaStore.java:3338)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_with_cascade(HiveMetaStore.java:3290)
>   at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:102)
>   at com.sun.proxy.$Proxy5.alter_table_with_cascade(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_table_with_cascade.getResult(ThriftHiveMetastore.java:9131)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_table_with_cascade.getResult(ThriftHiveMetastore.java:9115)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: commitTransaction was called but 
> openTransactionCalls = 0. This probably indicates that there are unbalanced 
> calls to openTransaction/commitTransaction
>   at 
> org.apache.hadoop.hive.metastore.ObjectStore.commitTransaction(ObjectStore.java:448)
>   at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:98)
>   at com.sun.proxy.$Proxy0.commitTransaction(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.HiveAlterHandler.alterTable(HiveAlterHandler.java:242)
>   at 
> 

[jira] [Comment Edited] (HIVE-11064) ALTER TABLE CASCADE ERROR unbalanced calls to openTransaction/commitTransaction

2017-05-02 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993588#comment-15993588
 ] 

Chaoyu Tang edited comment on HIVE-11064 at 5/2/17 7:44 PM:


I was not able to reproduce this issue in Hive 3.0.0, though I could reproduce 
it in CDH5.4.0, but only when hive.metastore.try.direct.sql is disabled.




was (Author: ctang.ma):
I was not able to reproduce this issue in Hive 3.0.0; it was probably fixed as 
a side effect of HIVE-16147. I could reproduce this issue in CDH5.4.0, but 
only when hive.metastore.try.direct.sql is disabled.



> ALTER TABLE CASCADE ERROR unbalanced calls to 
> openTransaction/commitTransaction
> ---
>
> Key: HIVE-11064
> URL: https://issues.apache.org/jira/browse/HIVE-11064
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.1.0
> Environment: CDH5.4.0
>Reporter: fatkun
>Assignee: Chaoyu Tang
>
> My Hive version is hive-1.1.0-cdh5.4.0.
> Follow these steps and the exception is thrown.
>  
> Use the hive client:
> {code}
> CREATE TABLE test1 (name string) PARTITIONED BY (pt string);
> ALTER TABLE test1 ADD PARTITION (pt='1');
> ALTER TABLE test1 CHANGE name name1 string;
> ALTER TABLE test1 CHANGE name1 name string cascade;
> {code}
> then this exception is thrown:
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. Unable to alter table. 
> java.lang.RuntimeException: commitTransaction was called but 
> openTransactionCalls = 0. This probably indicates that there are unbalanced 
> calls to openTransaction/commitTransaction
>  
> metastore log
> {quote}
> MetaException(message:java.lang.RuntimeException: commitTransaction was 
> called but openTransactionCalls = 0. This probably indicates that there are 
> unbalanced calls to openTransaction/commitTransaction)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:5257)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_core(HiveMetaStore.java:3338)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_with_cascade(HiveMetaStore.java:3290)
>   at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:102)
>   at com.sun.proxy.$Proxy5.alter_table_with_cascade(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_table_with_cascade.getResult(ThriftHiveMetastore.java:9131)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_table_with_cascade.getResult(ThriftHiveMetastore.java:9115)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: commitTransaction was called but 
> openTransactionCalls = 0. This probably indicates that there are unbalanced 
> calls to openTransaction/commitTransaction
>   at 
> org.apache.hadoop.hive.metastore.ObjectStore.commitTransaction(ObjectStore.java:448)
>   at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:98)
>   at com.sun.proxy.$Proxy0.commitTransaction(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.HiveAlterHandler.alterTable(HiveAlterHandler.java:242)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_core(HiveMetaStore.java:3318)
>   ... 19 more

[jira] [Updated] (HIVE-16572) Rename a partition should not drop its column stats

2017-05-03 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-16572:
---
Attachment: HIVE-16572.patch

The patch does the following:
1. keeps the partition column stats when a partition is renamed
2. refactors the partition renaming logic: we now move the partition directory 
before committing the HMS transaction, since that makes it easier to revert the 
data move if the rename fails. A sketch of this ordering is shown below.
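
A hedged sketch of that ordering (updatePartitionAndColumnStats is a hypothetical 
stand-in for the metadata updates done inside the transaction; this is not the 
actual HiveAlterHandler code):

{code}
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hive.metastore.RawStore;
import org.apache.hadoop.hive.metastore.api.MetaException;

public class RenameOrderingSketch {
  static void updatePartitionAndColumnStats(RawStore ms) { /* keep the stats */ }

  static void renamePartition(RawStore ms, FileSystem fs,
                              Path oldDir, Path newDir) throws Exception {
    ms.openTransaction();
    boolean moved = false;
    try {
      updatePartitionAndColumnStats(ms);   // update, rather than drop, column stats
      moved = fs.rename(oldDir, newDir);   // move the data BEFORE committing
      if (!moved || !ms.commitTransaction()) {
        throw new MetaException("partition rename failed");
      }
    } catch (Exception e) {
      ms.rollbackTransaction();
      if (moved) {
        fs.rename(newDir, oldDir);         // reverting the data move is now trivial
      }
      throw e;
    }
  }
}
{code}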

> Rename a partition should not drop its column stats
> ---
>
> Key: HIVE-16572
> URL: https://issues.apache.org/jira/browse/HIVE-16572
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-16572.patch
>
>
> The column stats for the table sample_pt partition (dummy=1) are as follows:
> {code}
> hive> describe formatted sample_pt partition (dummy=1) code;
> OK
> # col_namedata_type   min 
> max num_nulls   distinct_count  
> avg_col_len max_col_len num_trues   
> num_falses  comment 
>   
>  
> code  string  
> 0   303 6.985 
>   7   
> from deserializer   
> Time taken: 0.259 seconds, Fetched: 3 row(s)
> {code}
> But when this partition is renamed, say
> alter table sample_pt partition (dummy=1) rename to partition (dummy=11);
> the COLUMN_STATS flags in the partition description remain true, but the 
> column stats have actually all been deleted.
> {code}
> hive> describe formatted sample_pt partition (dummy=11);
> OK
> # col_namedata_type   comment 
>
> code  string  
> description   string  
> salaryint 
> total_emp int 
>
> # Partition Information
> # col_namedata_type   comment 
>
> dummy int 
>
> # Detailed Partition Information   
> Partition Value:  [11] 
> Database: default  
> Table:sample_pt
> CreateTime:   Thu Mar 30 23:03:59 EDT 2017 
> LastAccessTime:   UNKNOWN  
> Location: file:/user/hive/warehouse/apache/sample_pt/dummy=11 
>  
> Partition Parameters:  
>   COLUMN_STATS_ACCURATE   
> {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}}
>   numFiles1   
>   numRows 200 
>   rawDataSize 10228   
>   totalSize   10428   
>   transient_lastDdlTime   1490929439  
>
> # Storage Information  
> SerDe Library:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe  
>  
> InputFormat:  org.apache.hadoop.mapred.TextInputFormat 
> OutputFormat: 
> org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat   
> Compressed:   No   
> Num Buckets:  -1   
> Bucket Columns:   []   
> Sort Columns: []   
> Storage Desc Params:   
>   serialization.format1   
> Time taken: 6.783 seconds, Fetched: 37 row(s)
> ===
> hive> describe formatted sample_pt partition (dummy=11) code;
> OK
> # col_namedata_type   comment 
>  
>   
>  
> code  string  from deserializer   
>  
> Time taken: 9.429 seconds, Fetched: 3 row(s)
> {code}
> The column stats should not be dropped when a partition is renamed.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16572) Rename a partition should not drop its column stats

2017-05-03 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-16572:
---
Status: Patch Available  (was: Open)

> Rename a partition should not drop its column stats
> ---
>
> Key: HIVE-16572
> URL: https://issues.apache.org/jira/browse/HIVE-16572
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-16572.patch
>
>
> The column stats for the table sample_pt partition (dummy=1) are as follows:
> {code}
> hive> describe formatted sample_pt partition (dummy=1) code;
> OK
> # col_namedata_type   min 
> max num_nulls   distinct_count  
> avg_col_len max_col_len num_trues   
> num_falses  comment 
>   
>  
> code  string  
> 0   303 6.985 
>   7   
> from deserializer   
> Time taken: 0.259 seconds, Fetched: 3 row(s)
> {code}
> But when this partition is renamed, say
> alter table sample_pt partition (dummy=1) rename to partition (dummy=11);
> the COLUMN_STATS flags in the partition description remain true, but the 
> column stats have actually all been deleted.
> {code}
> hive> describe formatted sample_pt partition (dummy=11);
> OK
> # col_namedata_type   comment 
>
> code  string  
> description   string  
> salaryint 
> total_emp int 
>
> # Partition Information
> # col_namedata_type   comment 
>
> dummy int 
>
> # Detailed Partition Information   
> Partition Value:  [11] 
> Database: default  
> Table:sample_pt
> CreateTime:   Thu Mar 30 23:03:59 EDT 2017 
> LastAccessTime:   UNKNOWN  
> Location: file:/user/hive/warehouse/apache/sample_pt/dummy=11 
>  
> Partition Parameters:  
>   COLUMN_STATS_ACCURATE   
> {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}}
>   numFiles1   
>   numRows 200 
>   rawDataSize 10228   
>   totalSize   10428   
>   transient_lastDdlTime   1490929439  
>
> # Storage Information  
> SerDe Library:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe  
>  
> InputFormat:  org.apache.hadoop.mapred.TextInputFormat 
> OutputFormat: 
> org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat   
> Compressed:   No   
> Num Buckets:  -1   
> Bucket Columns:   []   
> Sort Columns: []   
> Storage Desc Params:   
>   serialization.format1   
> Time taken: 6.783 seconds, Fetched: 37 row(s)
> ===
> hive> describe formatted sample_pt partition (dummy=11) code;
> OK
> # col_namedata_type   comment 
>  
>   
>  
> code  string  from deserializer   
>  
> Time taken: 9.429 seconds, Fetched: 3 row(s)
> {code}
> The column stats should not be dropped when a partition is renamed.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16572) Rename a partition should not drop its column stats

2017-05-09 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-16572:
---
   Resolution: Fixed
Fix Version/s: 2.4.0
   3.0.0
   Status: Resolved  (was: Patch Available)

Committed to 3.0.0 and 2.4.0. Thanks [~ychena] for the review.

> Rename a partition should not drop its column stats
> ---
>
> Key: HIVE-16572
> URL: https://issues.apache.org/jira/browse/HIVE-16572
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Fix For: 3.0.0, 2.4.0
>
> Attachments: HIVE-16572.1.patch, HIVE-16572.patch
>
>
> The column stats for the table sample_pt partition (dummy=1) are as follows:
> {code}
> hive> describe formatted sample_pt partition (dummy=1) code;
> OK
> # col_namedata_type   min 
> max num_nulls   distinct_count  
> avg_col_len max_col_len num_trues   
> num_falses  comment 
>   
>  
> code  string  
> 0   303 6.985 
>   7   
> from deserializer   
> Time taken: 0.259 seconds, Fetched: 3 row(s)
> {code}
> But when this partition is renamed, say
> alter table sample_pt partition (dummy=1) rename to partition (dummy=11);
> the COLUMN_STATS flags in the partition description remain true, but the 
> column stats have actually all been deleted.
> {code}
> hive> describe formatted sample_pt partition (dummy=11);
> OK
> # col_namedata_type   comment 
>
> code  string  
> description   string  
> salaryint 
> total_emp int 
>
> # Partition Information
> # col_namedata_type   comment 
>
> dummy int 
>
> # Detailed Partition Information   
> Partition Value:  [11] 
> Database: default  
> Table:sample_pt
> CreateTime:   Thu Mar 30 23:03:59 EDT 2017 
> LastAccessTime:   UNKNOWN  
> Location: file:/user/hive/warehouse/apache/sample_pt/dummy=11 
>  
> Partition Parameters:  
>   COLUMN_STATS_ACCURATE   
> {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}}
>   numFiles1   
>   numRows 200 
>   rawDataSize 10228   
>   totalSize   10428   
>   transient_lastDdlTime   1490929439  
>
> # Storage Information  
> SerDe Library:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe  
>  
> InputFormat:  org.apache.hadoop.mapred.TextInputFormat 
> OutputFormat: 
> org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat   
> Compressed:   No   
> Num Buckets:  -1   
> Bucket Columns:   []   
> Sort Columns: []   
> Storage Desc Params:   
>   serialization.format1   
> Time taken: 6.783 seconds, Fetched: 37 row(s)
> ===
> hive> describe formatted sample_pt partition (dummy=11) code;
> OK
> # col_namedata_type   comment 
>  
>   
>  
> code  string  from deserializer   
>  
> Time taken: 9.429 seconds, Fetched: 3 row(s)
> {code}
> The column stats should not be dropped when a partition is renamed.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Reopened] (HIVE-11064) ALTER TABLE CASCADE ERROR unbalanced calls to openTransaction/commitTransaction

2017-05-09 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang reopened HIVE-11064:


> ALTER TABLE CASCADE ERROR unbalanced calls to 
> openTransaction/commitTransaction
> ---
>
> Key: HIVE-11064
> URL: https://issues.apache.org/jira/browse/HIVE-11064
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.1.0
> Environment: CDH5.4.0
>Reporter: fatkun
>Assignee: Chaoyu Tang
>
> My Hive version is hive-1.1.0-cdh5.4.0.
> Follow these steps and the exception is thrown.
>  
> Use the hive client:
> {code}
> CREATE TABLE test1 (name string) PARTITIONED BY (pt string);
> ALTER TABLE test1 ADD PARTITION (pt='1');
> ALTER TABLE test1 CHANGE name name1 string;
> ALTER TABLE test1 CHANGE name1 name string cascade;
> {code}
> then this exception is thrown:
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. Unable to alter table. 
> java.lang.RuntimeException: commitTransaction was called but 
> openTransactionCalls = 0. This probably indicates that there are unbalanced 
> calls to openTransaction/commitTransaction
>  
> Metastore log:
> {quote}
> MetaException(message:java.lang.RuntimeException: commitTransaction was 
> called but openTransactionCalls = 0. This probably indicates that there are 
> unbalanced calls to openTransaction/commitTransaction)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:5257)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_core(HiveMetaStore.java:3338)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_with_cascade(HiveMetaStore.java:3290)
>   at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:102)
>   at com.sun.proxy.$Proxy5.alter_table_with_cascade(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_table_with_cascade.getResult(ThriftHiveMetastore.java:9131)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_table_with_cascade.getResult(ThriftHiveMetastore.java:9115)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: commitTransaction was called but 
> openTransactionCalls = 0. This probably indicates that there are unbalanced 
> calls to openTransaction/commitTransaction
>   at 
> org.apache.hadoop.hive.metastore.ObjectStore.commitTransaction(ObjectStore.java:448)
>   at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:98)
>   at com.sun.proxy.$Proxy0.commitTransaction(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.HiveAlterHandler.alterTable(HiveAlterHandler.java:242)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_core(HiveMetaStore.java:3318)
>   ... 19 more
> {quote}
> I debugged the code; the function "private void 
> updatePartColumnStatsForAlterColumns" may be wrong. Some transaction is rolled 
> back, but I don't know the exact error.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-11064) ALTER TABLE CASCADE ERROR unbalanced calls to openTransaction/commitTransaction

2017-05-09 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16002678#comment-16002678
 ] 

Chaoyu Tang commented on HIVE-11064:


The issue was caused by a discrepancy in column definitions between a table 
and its partition. The first command "ALTER TABLE test1 CHANGE name name1 
string;" changed the table's column "name" to "name1"; the second command 
"ALTER TABLE test1 CHANGE name1 name string cascade;" with the "cascade" clause 
then attempted to change the partition column "name1" to "name", which did not 
actually exist. When executing the second command, Hive failed in 
validateTableCols (validating partition columns against the table) in 
getMPartitionColumnStatistics. That is the root cause of the issue seen in 
this JIRA, though the thrown exception and its message are not very informative.
The issue has been fixed as a side effect of HIVE-16147. 
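
For reference, an illustrative, simplified form (assumed, not the exact source) 
of the check that fails here: every partition column must exist in the table's 
column list.

{code}
// Simplified sketch of a validateTableCols-style check:
private void validateTableCols(Table table, List<String> colNames) throws MetaException {
  List<FieldSchema> colList = table.getSd().getCols();
  for (String colName : colNames) {
    boolean found = false;
    for (FieldSchema fs : colList) {
      if (fs.getName().equalsIgnoreCase(colName)) {
        found = true;
        break;
      }
    }
    if (!found) {
      // After the first CHANGE, the partition still carries column "name",
      // which no longer matches the table's "name1"; the cascade fails here.
      throw new MetaException("Column " + colName + " doesn't exist in table "
          + table.getTableName());
    }
  }
}
{code}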

> ALTER TABLE CASCADE ERROR unbalanced calls to 
> openTransaction/commitTransaction
> ---
>
> Key: HIVE-11064
> URL: https://issues.apache.org/jira/browse/HIVE-11064
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.1.0
> Environment: CDH5.4.0
>Reporter: fatkun
>Assignee: Chaoyu Tang
>
> My Hive version is hive-1.1.0-cdh5.4.0.
> Follow these steps and the exception is thrown.
>  
> Using the Hive client:
> {code}
> CREATE TABLE test1 (name string) PARTITIONED BY (pt string);
> ALTER TABLE test1 ADD PARTITION (pt='1');
> ALTER TABLE test1 CHANGE name name1 string;
> ALTER TABLE test1 CHANGE name1 name string cascade;
> {code}
> Then this exception is thrown:
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. Unable to alter table. 
> java.lang.RuntimeException: commitTransaction was called but 
> openTransactionCalls = 0. This probably indicates that there are unbalanced 
> calls to openTransaction/commitTransaction
>  
> Metastore log:
> {quote}
> MetaException(message:java.lang.RuntimeException: commitTransaction was 
> called but openTransactionCalls = 0. This probably indicates that there are 
> unbalanced calls to openTransaction/commitTransaction)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:5257)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_core(HiveMetaStore.java:3338)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_with_cascade(HiveMetaStore.java:3290)
>   at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:102)
>   at com.sun.proxy.$Proxy5.alter_table_with_cascade(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_table_with_cascade.getResult(ThriftHiveMetastore.java:9131)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_table_with_cascade.getResult(ThriftHiveMetastore.java:9115)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: commitTransaction was called but 
> openTransactionCalls = 0. This probably indicates that there are unbalanced 
> calls to openTransaction/commitTransaction
>   at 
> org.apache.hadoop.hive.metastore.ObjectStore.commitTransaction(ObjectStore.java:448)
>   at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:98)
>   at 

[jira] [Resolved] (HIVE-11064) ALTER TABLE CASCADE ERROR unbalanced calls to openTransaction/commitTransaction

2017-05-09 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang resolved HIVE-11064.

   Resolution: Fixed
Fix Version/s: 2.4.0
   3.0.0

It has been fixed in HIVE-16147.

> ALTER TABLE CASCADE ERROR unbalanced calls to 
> openTransaction/commitTransaction
> ---
>
> Key: HIVE-11064
> URL: https://issues.apache.org/jira/browse/HIVE-11064
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.1.0
> Environment: CDH5.4.0
>Reporter: fatkun
>Assignee: Chaoyu Tang
> Fix For: 3.0.0, 2.4.0
>
>
> My Hive version is hive-1.1.0-cdh5.4.0.
> Follow these steps and the exception is thrown.
>  
> Using the Hive client:
> {code}
> CREATE TABLE test1 (name string) PARTITIONED BY (pt string);
> ALTER TABLE test1 ADD PARTITION (pt='1');
> ALTER TABLE test1 CHANGE name name1 string;
> ALTER TABLE test1 CHANGE name1 name string cascade;
> {code}
> Then this exception is thrown:
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. Unable to alter table. 
> java.lang.RuntimeException: commitTransaction was called but 
> openTransactionCalls = 0. This probably indicates that there are unbalanced 
> calls to openTransaction/commitTransaction
>  
> Metastore log:
> {quote}
> MetaException(message:java.lang.RuntimeException: commitTransaction was 
> called but openTransactionCalls = 0. This probably indicates that there are 
> unbalanced calls to openTransaction/commitTransaction)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:5257)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_core(HiveMetaStore.java:3338)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_with_cascade(HiveMetaStore.java:3290)
>   at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:102)
>   at com.sun.proxy.$Proxy5.alter_table_with_cascade(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_table_with_cascade.getResult(ThriftHiveMetastore.java:9131)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_table_with_cascade.getResult(ThriftHiveMetastore.java:9115)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: commitTransaction was called but 
> openTransactionCalls = 0. This probably indicates that there are unbalanced 
> calls to openTransaction/commitTransaction
>   at 
> org.apache.hadoop.hive.metastore.ObjectStore.commitTransaction(ObjectStore.java:448)
>   at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:98)
>   at com.sun.proxy.$Proxy0.commitTransaction(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.HiveAlterHandler.alterTable(HiveAlterHandler.java:242)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_core(HiveMetaStore.java:3318)
>   ... 19 more
> {quote}
> I debugged the code; the function "private void 
> updatePartColumnStatsForAlterColumns" may be wrong. Some transaction is rolled 
> back, but I don't know the exact error.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16930) HoS should verify the value of Kerberos principal and keytab file before adding them to spark-submit command parameters

2017-06-21 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16057735#comment-16057735
 ] 

Chaoyu Tang commented on HIVE-16930:


+1

> HoS should verify the value of Kerberos principal and keytab file before 
> adding them to spark-submit command parameters
> ---
>
> Key: HIVE-16930
> URL: https://issues.apache.org/jira/browse/HIVE-16930
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Yibing Shi
>Assignee: Yibing Shi
> Attachments: HIVE-16930.1.patch
>
>
> When Kerberos is enabled, Hive CLI fails to run Hive on Spark queries:
> {noformat}
> >hive -e "set hive.execution.engine=spark; create table if not exists test(a 
> >int); select count(*) from test" --hiveconf hive.root.logger=INFO,console > 
> >/var/tmp/hive_log.txt > /var/tmp/hive_log_2.txt 
> 17/06/16 16:13:13 [main]: ERROR client.SparkClientImpl: Error while waiting 
> for client to connect. 
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: Cancel 
> client 'a5de85d1-6933-43e7-986f-5f8e5c001b5f'. Error: Child process exited 
> before connecting back with error log Error: Cannot load main class from JAR 
> file:/tmp/spark-submit.7196051517706529285.properties 
> Run with --help for usage help or --verbose for debug output 
> at 
> io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37) 
> at 
> org.apache.hive.spark.client.SparkClientImpl.<init>(SparkClientImpl.java:107) 
> at 
> org.apache.hive.spark.client.SparkClientFactory.createClient(SparkClientFactory.java:80)
>  
> at 
> org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.createRemoteClient(RemoteHiveSparkClient.java:100)
>  
> at 
> org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.<init>(RemoteHiveSparkClient.java:96)
>  
> at 
> org.apache.hadoop.hive.ql.exec.spark.HiveSparkClientFactory.createHiveSparkClient(HiveSparkClientFactory.java:66)
>  
> at 
> org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:62)
>  
> at 
> org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionManagerImpl.getSession(SparkSessionManagerImpl.java:114)
>  
> at 
> org.apache.hadoop.hive.ql.exec.spark.SparkUtilities.getSparkSession(SparkUtilities.java:111)
>  
> at 
> org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:97) 
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160) 
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100) 
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1972) 
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1685) 
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1421) 
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1205) 
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1195) 
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:220) 
> at 
> org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:172) 
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:383) 
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:318) 
> at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:720) 
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:693) 
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:628) 
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  
> at java.lang.reflect.Method.invoke(Method.java:606) 
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221) 
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136) 
> Caused by: java.lang.RuntimeException: Cancel client 
> 'a5de85d1-6933-43e7-986f-5f8e5c001b5f'. Error: Child process exited before 
> connecting back with error log Error: Cannot load main class from JAR 
> file:/tmp/spark-submit.7196051517706529285.properties 
> Run with --help for usage help or --verbose for debug output 
> at 
> org.apache.hive.spark.client.rpc.RpcServer.cancelClient(RpcServer.java:179) 
> at 
> org.apache.hive.spark.client.SparkClientImpl$3.run(SparkClientImpl.java:490) 
> at java.lang.Thread.run(Thread.java:745) 
> 17/06/16 16:13:13 [Driver]: WARN client.SparkClientImpl: Child process exited 
> with code 1 
> {noformat} 
> In the log, the message below shows up:
> {noformat}
> 17/06/16 16:13:12 

[jira] [Updated] (HIVE-16930) HoS should verify the value of Kerberos principal and keytab file before adding them to spark-submit command parameters

2017-06-22 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-16930:
---
   Resolution: Fixed
Fix Version/s: 2.4.0
   3.0.0
   Status: Resolved  (was: Patch Available)

Committed to 3.0.0 and 2.4.0. Thanks [~Yibing] for the patch.
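
For reference, a hedged sketch (variable names assumed) of the kind of guard 
this patch adds around the spark-submit arguments: pass --principal/--keytab 
only when both values are actually set.

{code}
// Sketch only; the committed patch may differ.
String principal = hiveConf.getVar(HiveConf.ConfVars.HIVE_SERVER2_KERBEROS_PRINCIPAL);
String keyTabFile = hiveConf.getVar(HiveConf.ConfVars.HIVE_SERVER2_KERBEROS_KEYTAB);
if (StringUtils.isNotBlank(principal) && StringUtils.isNotBlank(keyTabFile)) {
  argv.add("--principal");
  argv.add(principal);
  argv.add("--keytab");
  argv.add(keyTabFile);
}
// Without such a guard, empty values can produce a malformed spark-submit
// command line, consistent with the "Cannot load main class from JAR
// ...spark-submit...properties" error in the description.
{code}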

> HoS should verify the value of Kerberos principal and keytab file before 
> adding them to spark-submit command parameters
> ---
>
> Key: HIVE-16930
> URL: https://issues.apache.org/jira/browse/HIVE-16930
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Yibing Shi
>Assignee: Yibing Shi
> Fix For: 3.0.0, 2.4.0
>
> Attachments: HIVE-16930.1.patch
>
>
> When Kerberos is enabled, Hive CLI fails to run Hive on Spark queries:
> {noformat}
> >hive -e "set hive.execution.engine=spark; create table if not exists test(a 
> >int); select count(*) from test" --hiveconf hive.root.logger=INFO,console > 
> >/var/tmp/hive_log.txt > /var/tmp/hive_log_2.txt 
> 17/06/16 16:13:13 [main]: ERROR client.SparkClientImpl: Error while waiting 
> for client to connect. 
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: Cancel 
> client 'a5de85d1-6933-43e7-986f-5f8e5c001b5f'. Error: Child process exited 
> before connecting back with error log Error: Cannot load main class from JAR 
> file:/tmp/spark-submit.7196051517706529285.properties 
> Run with --help for usage help or --verbose for debug output 
> at 
> io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37) 
> at 
> org.apache.hive.spark.client.SparkClientImpl.<init>(SparkClientImpl.java:107) 
> at 
> org.apache.hive.spark.client.SparkClientFactory.createClient(SparkClientFactory.java:80)
>  
> at 
> org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.createRemoteClient(RemoteHiveSparkClient.java:100)
>  
> at 
> org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.<init>(RemoteHiveSparkClient.java:96)
>  
> at 
> org.apache.hadoop.hive.ql.exec.spark.HiveSparkClientFactory.createHiveSparkClient(HiveSparkClientFactory.java:66)
>  
> at 
> org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:62)
>  
> at 
> org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionManagerImpl.getSession(SparkSessionManagerImpl.java:114)
>  
> at 
> org.apache.hadoop.hive.ql.exec.spark.SparkUtilities.getSparkSession(SparkUtilities.java:111)
>  
> at 
> org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:97) 
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160) 
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100) 
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1972) 
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1685) 
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1421) 
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1205) 
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1195) 
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:220) 
> at 
> org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:172) 
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:383) 
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:318) 
> at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:720) 
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:693) 
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:628) 
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  
> at java.lang.reflect.Method.invoke(Method.java:606) 
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221) 
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136) 
> Caused by: java.lang.RuntimeException: Cancel client 
> 'a5de85d1-6933-43e7-986f-5f8e5c001b5f'. Error: Child process exited before 
> connecting back with error log Error: Cannot load main class from JAR 
> file:/tmp/spark-submit.7196051517706529285.properties 
> Run with --help for usage help or --verbose for debug output 
> at 
> org.apache.hive.spark.client.rpc.RpcServer.cancelClient(RpcServer.java:179) 
> at 
> org.apache.hive.spark.client.SparkClientImpl$3.run(SparkClientImpl.java:490) 
> at 

[jira] [Commented] (HIVE-14615) Temp table leaves behind insert command

2017-06-19 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16054938#comment-16054938
 ] 

Chaoyu Tang commented on HIVE-14615:


[~stakiar] I have not started to look into this, so please feel free to assign 
the JIRA to Andrew. 

> Temp table leaves behind insert command
> ---
>
> Key: HIVE-14615
> URL: https://issues.apache.org/jira/browse/HIVE-14615
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
>
> {code}
> create table test (key int, value string);
> insert into test values (1, 'val1');
> show tables;
> test
> values__tmp__table__1
> {code}
> The temp table values__tmp__table__1 resulted from "insert into ... values"
> and exists until the session is logged out.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16803) Alter table change column comment should not try to get column stats for update

2017-06-02 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-16803:
---
   Resolution: Fixed
Fix Version/s: 2.4.0
   3.0.0
   Status: Resolved  (was: Patch Available)

Committed to 3.0.0 and 2.4.0. Thanks [~pxiong] for reviewing the patch.

> Alter table change column comment should not try to get column stats for 
> update
> ---
>
> Key: HIVE-16803
> URL: https://issues.apache.org/jira/browse/HIVE-16803
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
>Priority: Minor
> Fix For: 3.0.0, 2.4.0
>
> Attachments: HIVE-16803.patch
>
>
> When running a command like "alter table .. change .." (e.g. ALTER TABLE 
> testtbl CHANGE col col string COMMENT 'change column comment';) to change a 
> column's comment, Hive should not fetch the column stats for update, since 
> the comment change does not affect table/partition column stats.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HIVE-16803) Alter table change column comment should not try to get column stats for update

2017-06-01 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang reassigned HIVE-16803:
--


> Alter table change column comment should not try to get column stats for 
> update
> ---
>
> Key: HIVE-16803
> URL: https://issues.apache.org/jira/browse/HIVE-16803
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
>Priority: Minor
>
> When running a command like "alter table .. change .." (e.g. ALTER TABLE 
> testtbl CHANGE col col string COMMENT 'change column comment';) to change a 
> column's comment, Hive should not fetch the column stats for update, since 
> the comment change does not affect table/partition column stats.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16803) Alter table change column comment should not try to get column stats for update

2017-06-01 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-16803:
---
Attachment: HIVE-16803.patch

Whether column stats need to be fetched and/or updated during an alter table 
should be based only on the column names/types of the new and old columns, not 
on the column comments.
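
A hedged sketch of that comparison (hypothetical method name, not the exact 
patch):

{code}
// Only a name or type difference should trigger fetching/updating column
// stats; a comment-only change returns false and the stats fetch is skipped.
private boolean statsRelevantColumnChange(List<FieldSchema> oldCols,
    List<FieldSchema> newCols) {
  if (oldCols.size() != newCols.size()) {
    return true;
  }
  for (int i = 0; i < oldCols.size(); i++) {
    if (!oldCols.get(i).getName().equalsIgnoreCase(newCols.get(i).getName())
        || !oldCols.get(i).getType().equalsIgnoreCase(newCols.get(i).getType())) {
      return true;
    }
  }
  return false;
}
{code}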

> Alter table change column comment should not try to get column stats for 
> update
> ---
>
> Key: HIVE-16803
> URL: https://issues.apache.org/jira/browse/HIVE-16803
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
>Priority: Minor
> Attachments: HIVE-16803.patch
>
>
> When running a command like "alter table .. change .." (e.g. ALTER TABLE 
> testtbl CHANGE col col string COMMENT 'change column comment';) to change a 
> column's comment, Hive should not fetch the column stats for update, since 
> the comment change does not affect table/partition column stats.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16803) Alter table change column comment should not try to get column stats for update

2017-06-01 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033590#comment-16033590
 ] 

Chaoyu Tang commented on HIVE-16803:


I was not able to reproduce the failures of 
TestCliDriver[stats_aggregator_error_1], 
TestMiniLlapLocalCliDriver[columnstats_part_coltype.q], and 
TestPerfCliDriver[query14.q] on my local machine. They seem to be flaky tests 
and are not related to this patch.
[~pxiong], could you help to review this patch? Thanks.

> Alter table change column comment should not try to get column stats for 
> update
> ---
>
> Key: HIVE-16803
> URL: https://issues.apache.org/jira/browse/HIVE-16803
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
>Priority: Minor
> Attachments: HIVE-16803.patch
>
>
> When running a command like "alter table .. change .." (e.g. ALTER TABLE 
> testtbl CHANGE col col string COMMENT 'change column comment';) to change a 
> column's comment, Hive should not fetch the column stats for update, since 
> the comment change does not affect table/partition column stats.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16487) Serious Zookeeper exception is logged when a race condition happens

2017-05-01 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-16487:
---
   Resolution: Fixed
Fix Version/s: 2.4.0
   3.0.0
   Status: Resolved  (was: Patch Available)

Committed to 3.0.0 and 2.4.0. Thanks [~pvary] for the patch.

> Serious Zookeeper exception is logged when a race condition happens
> ---
>
> Key: HIVE-16487
> URL: https://issues.apache.org/jira/browse/HIVE-16487
> Project: Hive
>  Issue Type: Bug
>  Components: Locking
>Affects Versions: 3.0.0
>Reporter: Peter Vary
>Assignee: Peter Vary
> Fix For: 3.0.0, 2.4.0
>
> Attachments: HIVE-16487.02.patch, HIVE-16487.patch
>
>
> A customer started to see this in the logs, but happily everything was 
> working as intended:
> {code}
> 2017-03-30 12:01:59,446 ERROR ZooKeeperHiveLockManager: 
> [HiveServer2-Background-Pool: Thread-620]: Serious Zookeeper exception: 
> org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = 
> NoNode for /hive_zookeeper_namespace//LOCK-SHARED-
> {code}
> This was happening because of a race condition between lock release and lock 
> acquisition: the thread releasing the lock removes the parent ZK node just 
> after the thread acquiring the lock has made sure that the parent node exists.
> Since this can happen without any real problem, I plan to add NODEEXISTS and 
> NONODE as transient ZooKeeper exceptions, so that users are not confused.
> Also, the original author of ZooKeeperHiveLockManager may have planned to 
> handle different ZooKeeperExceptions differently, and the code is hard to 
> understand. See the {{continue}} and the {{break}}: the {{break}} only breaks 
> the switch, not the loop, which IMHO is not intuitive:
> {code}
> do {
>   try {
> [..]
> ret = lockPrimitive(key, mode, keepAlive, parentCreated, 
>   } catch (Exception e1) {
> if (e1 instanceof KeeperException) {
>   KeeperException e = (KeeperException) e1;
>   switch (e.code()) {
>   case CONNECTIONLOSS:
>   case OPERATIONTIMEOUT:
> LOG.debug("Possibly transient ZooKeeper exception: ", e);
> continue;
>   default:
> LOG.error("Serious Zookeeper exception: ", e);
> break;
>   }
> }
> [..]
>   }
> } while (tryNum < numRetriesForLock);
> {code}
> If we do not want to retry in case of a "Serious Zookeeper exception:", then 
> we should add a label to the do loop and break it in the switch (as sketched 
> below).
> If we do want to retry regardless of the type of the ZK exception, then we 
> should just change the {{continue;}} to {{break;}} and move the lines of code 
> that did not run in the {{continue}} case into the {{default}} branch, so the 
> code is easier to understand.
> Any suggestions or ideas [~ctang.ma] or [~szehon]?
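
For the first option, a sketch of the labeled-loop variant (illustrative only; 
the [..] elisions are kept from the snippet above):

{code}
retryLoop:
do {
  try {
    [..]
    ret = lockPrimitive(key, mode, keepAlive, parentCreated, [..]);
  } catch (Exception e1) {
    if (e1 instanceof KeeperException) {
      KeeperException e = (KeeperException) e1;
      switch (e.code()) {
      case CONNECTIONLOSS:
      case OPERATIONTIMEOUT:
        LOG.debug("Possibly transient ZooKeeper exception: ", e);
        continue;          // retry: jumps to the while condition
      default:
        LOG.error("Serious Zookeeper exception: ", e);
        break retryLoop;   // exits the do/while, not just the switch
      }
    }
    [..]
  }
} while (tryNum < numRetriesForLock);
{code}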



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16147) Rename a partitioned table should not drop its partition columns stats

2017-05-01 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-16147:
---
   Resolution: Fixed
Fix Version/s: 2.4.0
   3.0.0
   Status: Resolved  (was: Patch Available)

Committed to 2.4.0 and 3.0.0. Thanks [~pxiong] for the review.

> Rename a partitioned table should not drop its partition columns stats
> --
>
> Key: HIVE-16147
> URL: https://issues.apache.org/jira/browse/HIVE-16147
> Project: Hive
>  Issue Type: Bug
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Fix For: 3.0.0, 2.4.0
>
> Attachments: HIVE-16147.1.patch, HIVE-16147.patch, HIVE-16147.patch
>
>
> When a partitioned table (e.g. sample_pt) is renamed (e.g. to 
> sample_pt_rename), describing its partitions shows that the partition column 
> stats are still accurate, but they have actually all been dropped.
> It can be reproduced as follows:
> 1. analyze table sample_pt compute statistics for columns;
> 2. describe formatted default.sample_pt partition (dummy = 3):  COLUMN_STATS 
> for all columns are true
> {code}
> ...
> # Detailed Partition Information   
> Partition Value:  [3]  
> Database: default  
> Table:sample_pt
> CreateTime:   Fri Jan 20 15:42:30 EST 2017 
> LastAccessTime:   UNKNOWN  
> Location: file:/user/hive/warehouse/apache/sample_pt/dummy=3
> Partition Parameters:  
>   COLUMN_STATS_ACCURATE   
> {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}}
>   last_modified_byctang   
>   last_modified_time  1485217063  
>   numFiles1   
>   numRows 100 
>   rawDataSize 5143
>   totalSize   5243
>   transient_lastDdlTime   1488842358
> ... 
> {code}
> 3. describe formatted default.sample_pt partition (dummy = 3) salary: column 
> stats exist
> {code}
> # col_namedata_type   min 
> max num_nulls   distinct_count  
> avg_col_len max_col_len num_trues   
> num_falses  comment 
>   
>  
> salaryint 1   151370  
> 0   94
>   
> from deserializer 
> {code}
> 4. alter table sample_pt rename to sample_pt_rename;
> 5. describe formatted default.sample_pt_rename partition (dummy = 3): 
> describing the renamed table's partition (dummy = 3) shows that COLUMN_STATS 
> for the columns is still true.
> {code}
> # Detailed Partition Information   
> Partition Value:  [3]  
> Database: default  
> Table:sample_pt_rename 
> CreateTime:   Fri Jan 20 15:42:30 EST 2017 
> LastAccessTime:   UNKNOWN  
> Location: 
> file:/user/hive/warehouse/apache/sample_pt_rename/dummy=3
> Partition Parameters:  
>   COLUMN_STATS_ACCURATE   
> {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}}
>   last_modified_byctang   
>   last_modified_time  1485217063  
>   numFiles1   
>   numRows 100 
>   rawDataSize 5143
>   totalSize   5243
>   transient_lastDdlTime   1488842358  
> {code}
> describe formatted default.sample_pt_rename partition (dummy = 3) salary: the 
> column stats have been dropped.
> {code}
> # col_namedata_type   comment 
>  
>   
>  
> salaryint from deserializer   
>  
> Time taken: 0.131 seconds, Fetched: 3 row(s)
> {code}
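
One quick way to confirm the behavior described above is to look at the 
metastore backing database directly; a sketch, assuming the standard metastore 
schema where partition column statistics live in PART_COL_STATS:

{code}
SELECT DB_NAME, TABLE_NAME, PARTITION_NAME, COLUMN_NAME
FROM PART_COL_STATS
WHERE TABLE_NAME IN ('sample_pt', 'sample_pt_rename');
-- Before the fix, these rows disappear after the rename even though
-- COLUMN_STATS_ACCURATE in the partition parameters still claims they exist.
{code}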



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HIVE-11064) ALTER TABLE CASCADE ERROR unbalanced calls to openTransaction/commitTransaction

2017-05-02 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang reassigned HIVE-11064:
--

Assignee: Chaoyu Tang

> ALTER TABLE CASCADE ERROR unbalanced calls to 
> openTransaction/commitTransaction
> ---
>
> Key: HIVE-11064
> URL: https://issues.apache.org/jira/browse/HIVE-11064
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.1.0
> Environment: CDH5.4.0
>Reporter: fatkun
>Assignee: Chaoyu Tang
>
> My Hive version is hive-1.1.0-cdh5.4.0.
> Follow these steps and the exception is thrown.
>  
> Using the Hive client:
> {code}
> CREATE TABLE test1 (name string) PARTITIONED BY (pt string);
> ALTER TABLE test1 ADD PARTITION (pt='1');
> ALTER TABLE test1 CHANGE name name1 string;
> ALTER TABLE test1 CHANGE name1 name string cascade;
> {code}
> Then this exception is thrown:
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. Unable to alter table. 
> java.lang.RuntimeException: commitTransaction was called but 
> openTransactionCalls = 0. This probably indicates that there are unbalanced 
> calls to openTransaction/commitTransaction
>  
> Metastore log:
> {quote}
> MetaException(message:java.lang.RuntimeException: commitTransaction was 
> called but openTransactionCalls = 0. This probably indicates that there are 
> unbalanced calls to openTransaction/commitTransaction)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:5257)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_core(HiveMetaStore.java:3338)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_with_cascade(HiveMetaStore.java:3290)
>   at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:102)
>   at com.sun.proxy.$Proxy5.alter_table_with_cascade(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_table_with_cascade.getResult(ThriftHiveMetastore.java:9131)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_table_with_cascade.getResult(ThriftHiveMetastore.java:9115)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: commitTransaction was called but 
> openTransactionCalls = 0. This probably indicates that there are unbalanced 
> calls to openTransaction/commitTransaction
>   at 
> org.apache.hadoop.hive.metastore.ObjectStore.commitTransaction(ObjectStore.java:448)
>   at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:98)
>   at com.sun.proxy.$Proxy0.commitTransaction(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.HiveAlterHandler.alterTable(HiveAlterHandler.java:242)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_core(HiveMetaStore.java:3318)
>   ... 19 more
> {quote}
> I debugged the code; the function "private void 
> updatePartColumnStatsForAlterColumns" may be wrong. Some transaction is rolled 
> back, but I don't know the exact error.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HIVE-16572) Rename a partition should not drop its column stats

2017-05-02 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang reassigned HIVE-16572:
--


> Rename a partition should not drop its column stats
> ---
>
> Key: HIVE-16572
> URL: https://issues.apache.org/jira/browse/HIVE-16572
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
>
> The column stats for the table sample_pt partition (dummy=1) are as follows:
> {code}
> hive> describe formatted sample_pt partition (dummy=1) code;
> OK
> # col_namedata_type   min 
> max num_nulls   distinct_count  
> avg_col_len max_col_len num_trues   
> num_falses  comment 
>   
>  
> code  string  
> 0   303 6.985 
>   7   
> from deserializer   
> Time taken: 0.259 seconds, Fetched: 3 row(s)
> {code}
> But when this partition is renamed, say
> alter table sample_pt partition (dummy=1) rename to partition (dummy=11);
> The COLUMN_STATS flags in the partition description are still true, but the 
> column stats have actually all been deleted.
> {code}
> hive> describe formatted sample_pt partition (dummy=11);
> OK
> # col_namedata_type   comment 
>
> code  string  
> description   string  
> salaryint 
> total_emp int 
>
> # Partition Information
> # col_namedata_type   comment 
>
> dummy int 
>
> # Detailed Partition Information   
> Partition Value:  [11] 
> Database: default  
> Table:sample_pt
> CreateTime:   Thu Mar 30 23:03:59 EDT 2017 
> LastAccessTime:   UNKNOWN  
> Location: file:/user/hive/warehouse/apache/sample_pt/dummy=11 
>  
> Partition Parameters:  
>   COLUMN_STATS_ACCURATE   
> {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}}
>   numFiles1   
>   numRows 200 
>   rawDataSize 10228   
>   totalSize   10428   
>   transient_lastDdlTime   1490929439  
>
> # Storage Information  
> SerDe Library:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe  
>  
> InputFormat:  org.apache.hadoop.mapred.TextInputFormat 
> OutputFormat: 
> org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat   
> Compressed:   No   
> Num Buckets:  -1   
> Bucket Columns:   []   
> Sort Columns: []   
> Storage Desc Params:   
>   serialization.format1   
> Time taken: 6.783 seconds, Fetched: 37 row(s)
> ===
> hive> describe formatted sample_pt partition (dummy=11) code;
> OK
> # col_namedata_type   comment 
>  
>   
>  
> code  string  from deserializer   
>  
> Time taken: 9.429 seconds, Fetched: 3 row(s)
> {code}
> The column stats should not be dropped when a partition is renamed.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (HIVE-11064) ALTER TABLE CASCADE ERROR unbalanced calls to openTransaction/commitTransaction

2017-05-05 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang resolved HIVE-11064.

Resolution: Cannot Reproduce

> ALTER TABLE CASCADE ERROR unbalanced calls to 
> openTransaction/commitTransaction
> ---
>
> Key: HIVE-11064
> URL: https://issues.apache.org/jira/browse/HIVE-11064
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.1.0
> Environment: CDH5.4.0
>Reporter: fatkun
>Assignee: Chaoyu Tang
>
> My Hive version is hive-1.1.0-cdh5.4.0.
> Follow these steps and the exception is thrown.
>  
> Using the Hive client:
> {code}
> CREATE TABLE test1 (name string) PARTITIONED BY (pt string);
> ALTER TABLE test1 ADD PARTITION (pt='1');
> ALTER TABLE test1 CHANGE name name1 string;
> ALTER TABLE test1 CHANGE name1 name string cascade;
> {code}
> Then this exception is thrown:
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. Unable to alter table. 
> java.lang.RuntimeException: commitTransaction was called but 
> openTransactionCalls = 0. This probably indicates that there are unbalanced 
> calls to openTransaction/commitTransaction
>  
> Metastore log:
> {quote}
> MetaException(message:java.lang.RuntimeException: commitTransaction was 
> called but openTransactionCalls = 0. This probably indicates that there are 
> unbalanced calls to openTransaction/commitTransaction)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:5257)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_core(HiveMetaStore.java:3338)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_with_cascade(HiveMetaStore.java:3290)
>   at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:102)
>   at com.sun.proxy.$Proxy5.alter_table_with_cascade(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_table_with_cascade.getResult(ThriftHiveMetastore.java:9131)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_table_with_cascade.getResult(ThriftHiveMetastore.java:9115)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: commitTransaction was called but 
> openTransactionCalls = 0. This probably indicates that there are unbalanced 
> calls to openTransaction/commitTransaction
>   at 
> org.apache.hadoop.hive.metastore.ObjectStore.commitTransaction(ObjectStore.java:448)
>   at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:98)
>   at com.sun.proxy.$Proxy0.commitTransaction(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.HiveAlterHandler.alterTable(HiveAlterHandler.java:242)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_core(HiveMetaStore.java:3318)
>   ... 19 more
> {quote}
> I debugged the code; the function "private void 
> updatePartColumnStatsForAlterColumns" may be wrong. Some transaction is rolled 
> back, but I don't know the exact error.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16572) Rename a partition should not drop its column stats

2017-05-06 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15999650#comment-15999650
 ] 

Chaoyu Tang commented on HIVE-16572:


The test failure is not related to the patch.

> Rename a partition should not drop its column stats
> ---
>
> Key: HIVE-16572
> URL: https://issues.apache.org/jira/browse/HIVE-16572
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-16572.1.patch, HIVE-16572.patch
>
>
> The column stats for the table sample_pt partition (dummy=1) are as follows:
> {code}
> hive> describe formatted sample_pt partition (dummy=1) code;
> OK
> # col_namedata_type   min 
> max num_nulls   distinct_count  
> avg_col_len max_col_len num_trues   
> num_falses  comment 
>   
>  
> code  string  
> 0   303 6.985 
>   7   
> from deserializer   
> Time taken: 0.259 seconds, Fetched: 3 row(s)
> {code}
> But when this partition is renamed, say
> alter table sample_pt partition (dummy=1) rename to partition (dummy=11);
> The COLUMN_STATS flags in the partition description are still true, but the 
> column stats have actually all been deleted.
> {code}
> hive> describe formatted sample_pt partition (dummy=11);
> OK
> # col_namedata_type   comment 
>
> code  string  
> description   string  
> salaryint 
> total_emp int 
>
> # Partition Information
> # col_namedata_type   comment 
>
> dummy int 
>
> # Detailed Partition Information   
> Partition Value:  [11] 
> Database: default  
> Table:sample_pt
> CreateTime:   Thu Mar 30 23:03:59 EDT 2017 
> LastAccessTime:   UNKNOWN  
> Location: file:/user/hive/warehouse/apache/sample_pt/dummy=11 
>  
> Partition Parameters:  
>   COLUMN_STATS_ACCURATE   
> {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}}
>   numFiles1   
>   numRows 200 
>   rawDataSize 10228   
>   totalSize   10428   
>   transient_lastDdlTime   1490929439  
>
> # Storage Information  
> SerDe Library:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe  
>  
> InputFormat:  org.apache.hadoop.mapred.TextInputFormat 
> OutputFormat: 
> org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat   
> Compressed:   No   
> Num Buckets:  -1   
> Bucket Columns:   []   
> Sort Columns: []   
> Storage Desc Params:   
>   serialization.format1   
> Time taken: 6.783 seconds, Fetched: 37 row(s)
> ===
> hive> describe formatted sample_pt partition (dummy=11) code;
> OK
> # col_namedata_type   comment 
>  
>   
>  
> code  string  from deserializer   
>  
> Time taken: 9.429 seconds, Fetched: 3 row(s)
> {code}
> The column stats should not be dropped when a partition is renamed.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16572) Rename a partition should not drop its column stats

2017-05-04 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-16572:
---
Attachment: HIVE-16572.1.patch

Fixed the failure for test rename_external_partition_location.q, and added more 
tests for renaming a partition in an external table.
The other two test failures are not related to this patch; I was not able to 
reproduce them on my local machine.
[~pxiong], could you help to review the patch? Thanks.

> Rename a partition should not drop its column stats
> ---
>
> Key: HIVE-16572
> URL: https://issues.apache.org/jira/browse/HIVE-16572
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-16572.1.patch, HIVE-16572.patch
>
>
> The column stats for the table sample_pt partition (dummy=1) are as follows:
> {code}
> hive> describe formatted sample_pt partition (dummy=1) code;
> OK
> # col_namedata_type   min 
> max num_nulls   distinct_count  
> avg_col_len max_col_len num_trues   
> num_falses  comment 
>   
>  
> code  string  
> 0   303 6.985 
>   7   
> from deserializer   
> Time taken: 0.259 seconds, Fetched: 3 row(s)
> {code}
> But when this partition is renamed, say
> alter table sample_pt partition (dummy=1) rename to partition (dummy=11);
> The COLUMN_STATS flags in the partition description are still true, but the 
> column stats have actually all been deleted.
> {code}
> hive> describe formatted sample_pt partition (dummy=11);
> OK
> # col_namedata_type   comment 
>
> code  string  
> description   string  
> salaryint 
> total_emp int 
>
> # Partition Information
> # col_namedata_type   comment 
>
> dummy int 
>
> # Detailed Partition Information   
> Partition Value:  [11] 
> Database: default  
> Table:sample_pt
> CreateTime:   Thu Mar 30 23:03:59 EDT 2017 
> LastAccessTime:   UNKNOWN  
> Location: file:/user/hive/warehouse/apache/sample_pt/dummy=11 
>  
> Partition Parameters:  
>   COLUMN_STATS_ACCURATE   
> {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}}
>   numFiles1   
>   numRows 200 
>   rawDataSize 10228   
>   totalSize   10428   
>   transient_lastDdlTime   1490929439  
>
> # Storage Information  
> SerDe Library:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe  
>  
> InputFormat:  org.apache.hadoop.mapred.TextInputFormat 
> OutputFormat: 
> org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat   
> Compressed:   No   
> Num Buckets:  -1   
> Bucket Columns:   []   
> Sort Columns: []   
> Storage Desc Params:   
>   serialization.format1   
> Time taken: 6.783 seconds, Fetched: 37 row(s)
> ===
> hive> describe formatted sample_pt partition (dummy=11) code;
> OK
> # col_namedata_type   comment 
>  
>   
>  
> code  string  from deserializer   
>  
> Time taken: 9.429 seconds, Fetched: 3 row(s)
> {code}
> The column stats should not be dropped when a partition is renamed.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15485) Investigate the DoAs failure in HoS

2017-11-25 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16265873#comment-16265873
 ] 

Chaoyu Tang commented on HIVE-15485:


[~linzhangbing] I assume that you used beeline with HoS. Please try the Spark 
property spark.yarn.security.tokens.hive.enabled=true to see if it helps.
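
For example, one way to try it from the beeline session (a sketch; spark.* 
properties set in the Hive session are picked up when the Spark client is 
created, and the property can also go into spark-defaults.conf):

{code}
set spark.yarn.security.tokens.hive.enabled=true;
set hive.execution.engine=spark;
select count(*) from test;  -- "test" is a placeholder table
{code}

The setting only takes effect for Spark sessions created after it is set.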

> Investigate the DoAs failure in HoS
> ---
>
> Key: HIVE-15485
> URL: https://issues.apache.org/jira/browse/HIVE-15485
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Fix For: 2.3.0
>
> Attachments: HIVE-15485.1.patch, HIVE-15485.2.patch, HIVE-15485.patch
>
>
> With DoAs enabled, HoS failed with following errors:
> {code}
> Exception in thread "main" org.apache.hadoop.security.AccessControlException: 
> systest tries to renew a token with renewer hive
>   at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.renewToken(AbstractDelegationTokenSecretManager.java:484)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.renewDelegationToken(FSNamesystem.java:7543)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.renewDelegationToken(NameNodeRpcServer.java:555)
>   at 
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.renewDelegationToken(AuthorizationProviderProxyClientProtocol.java:674)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.renewDelegationToken(ClientNamenodeProtocolServerSideTranslatorPB.java:999)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2141)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2137)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1783)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2135)
> {code}
> It is related to the change from HIVE-14383. It looks like SparkSubmit 
> logs in to Kerberos with the passed-in hive principal/keytab and then tries 
> to create an HDFS delegation token for user systest with renewer hive.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-12812) Enable mapred.input.dir.recursive by default to support union with aggregate function

2018-09-21 Thread Chaoyu Tang (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16624059#comment-16624059
 ] 

Chaoyu Tang commented on HIVE-12812:


I cannot remember the exact reason why this patch was not committed years ago. 
It was probably that we were going to decommission MR soon and there was a 
regression in one test case. But as a workaround, you can always set the 
property to true in the session (it is not necessary in hive-site.xml):
set mapred.input.dir.recursive=true
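
For example (a sketch; src stands in for any populated table):

{code}
set hive.optimize.union.remove=true;
-- the workaround from the comment above:
set mapred.input.dir.recursive=true;
select t.key, count(*)
from (
  select key, count(1) as cnt from src group by key
  union all
  select key, count(1) as cnt from src group by key
) t
group by t.key;
{code}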

> Enable mapred.input.dir.recursive by default to support union with aggregate 
> function
> -
>
> Key: HIVE-12812
> URL: https://issues.apache.org/jira/browse/HIVE-12812
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1, 2.1.0
>Reporter: Chaoyu Tang
>Priority: Major
> Attachments: HIVE-12812.patch, HIVE-12812.patch, HIVE-12812.patch
>
>
> When union remove optimization is enabled, a union query with an aggregate 
> function writes its subquery intermediate results to subdirectories, which 
> requires mapred.input.dir.recursive to be enabled for them to be fetched. 
> This property is not defined by default in Hive and is often overlooked by 
> users, which causes the query to fail and is hard to debug.
> So we need to set mapred.input.dir.recursive to true whenever union remove 
> optimization is enabled.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HIVE-19117) hiveserver2 org.apache.thrift.transport.TTransportException error when running 2nd query after minute of inactivity

2019-02-21 Thread Chaoyu Tang (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16774340#comment-16774340
 ] 

Chaoyu Tang edited comment on HIVE-19117 at 2/21/19 5:32 PM:
-

Could it be that the HS2 session timed out? Double-check the property 
hive.server2.idle.session.timeout; it defaults to 7 days, so it should not 
be the problem. In the meantime, check the HS2 log file to see what happened 
during that period. 


was (Author: ctang.ma):
Could it be the HS2 session was timed out, double check property 
hive.server2.idle.session.timeout, it is default set to 7 days and should not 
be the problem. In the mean time, check hs2 log file to see what happened 
during that period. 
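
To make that check concrete, a minimal diagnostic sketch (hedged; the class 
name, host, port, and plain authentication are assumptions, and it requires 
the Hive JDBC driver on the classpath) that prints the timeout HS2 is 
actually using:

{code:java}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class CheckIdleSessionTimeoutSketch {
  public static void main(String[] args) throws Exception {
    try (Connection conn = DriverManager.getConnection(
             "jdbc:hive2://remotehost:10000/default", "anonymous", "");
         Statement stmt = conn.createStatement();
         // SET <property> returns the effective value as a single-column row.
         ResultSet rs = stmt.executeQuery("SET hive.server2.idle.session.timeout")) {
      while (rs.next()) {
        System.out.println(rs.getString(1)); // e.g. hive.server2.idle.session.timeout=7d
      }
    }
  }
}
{code}

If the printed value is the 7-day default, the session timeout is unlikely to 
be the culprit, and the HS2 log around the disconnect is the next place to 
look.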

> hiveserver2 org.apache.thrift.transport.TTransportException error when 
> running 2nd query after minute of inactivity
> ---
>
> Key: HIVE-19117
> URL: https://issues.apache.org/jira/browse/HIVE-19117
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, HiveServer2, Metastore, Thrift API
>Affects Versions: 2.1.1
> Environment: * Hive 2.1.1 with hive.server2.transport.mode set to 
> binary (sample JDBC string is jdbc:hive2://remotehost:1/default)
>  * Hadoop 2.8.3
>  * Metastore using MySQL
>  * Java 8
>Reporter: t oo
>Priority: Blocker
>
> I make a JDBC connection from my SQL tool (e.g. Squirrel SQL, Oracle SQL 
> Developer) to HiveServer2 (running on a remote server) with port 1.
> I am able to run some queries successfully. I then do something else (not in 
> the SQL tool) for 1-2 minutes, and then return to my SQL tool and attempt to 
> run a query, but I get this error: 
> {code:java}
> org.apache.thrift.transport.TTransportException: java.net.SocketException: 
> Software caused connection abort: socket write error{code}
> If I now disconnect and reconnect in my SQL tool, I can run queries again. 
> But does anyone know what HiveServer2 settings I should change to prevent 
> the error? I assume it is something in hive-site.xml.
> From the hiveserver2 logs below, I can see an exact 1-minute gap from the 
> 30th min to the 31st min where the disconnect happens.
> {code:java}
> 2018-04-05T03:30:41,706 INFO [HiveServer2-Handler-Pool: Thread-36] 
> session.SessionState: Resetting thread name to HiveServer2-Handler-Pool: 
> Thread-36
>  2018-04-05T03:30:41,712 INFO [HiveServer2-Handler-Pool: Thread-36] 
> session.SessionState: Updating thread name to 
> c81ec0f9-7a9d-46b6-9708-e7d78520a48a HiveServer2-Handler-Pool: Thread-36
>  2018-04-05T03:30:41,712 INFO [HiveServer2-Handler-Pool: Thread-36] 
> session.SessionState: Resetting thread name to HiveServer2-Handler-Pool: 
> Thread-36
>  2018-04-05T03:30:41,718 INFO [HiveServer2-Handler-Pool: Thread-36] 
> session.SessionState: Updating thread name to 
> c81ec0f9-7a9d-46b6-9708-e7d78520a48a HiveServer2-Handler-Pool: Thread-36
>  2018-04-05T03:30:41,719 INFO [HiveServer2-Handler-Pool: Thread-36] 
> session.SessionState: Resetting thread name to HiveServer2-Handler-Pool: 
> Thread-36
>  2018-04-05T03:31:41,232 INFO [HiveServer2-Handler-Pool: Thread-36] 
> thrift.ThriftCLIService: Session disconnected without closing properly.
>  2018-04-05T03:31:41,233 INFO [HiveServer2-Handler-Pool: Thread-36] 
> thrift.ThriftCLIService: Closing the session: SessionHandle 
> [c81ec0f9-7a9d-46b6-9708-e7d78520a48a]
>  2018-04-05T03:31:41,233 INFO [HiveServer2-Handler-Pool: Thread-36] 
> service.CompositeService: Session closed, SessionHandle 
> [c81ec0f9-7a9d-46b6-9708-e7d78520a48a], current sessions:0
>  2018-04-05T03:31:41,233 INFO [HiveServer2-Handler-Pool: Thread-36] 
> session.SessionState: Updating thread name to 
> c81ec0f9-7a9d-46b6-9708-e7d78520a48a HiveServer2-Handler-Pool: Thread-36
>  2018-04-05T03:31:41,233 INFO [HiveServer2-Handler-Pool: Thread-36] 
> session.SessionState: Resetting thread name to HiveServer2-Handler-Pool: 
> Thread-36
>  2018-04-05T03:31:41,233 INFO [HiveServer2-Handler-Pool: Thread-36] 
> session.SessionState: Updating thread name to 
> c81ec0f9-7a9d-46b6-9708-e7d78520a48a HiveServer2-Handler-Pool: Thread-36
>  2018-04-05T03:31:41,233 INFO [HiveServer2-Handler-Pool: Thread-36] 
> session.HiveSessionImpl: Operation log session directory is deleted: 
> /var/hive/hs2log/tmp/c81ec0f9-7a9d-46b6-9708-e7d78520a48a
>  2018-04-05T03:31:41,233 INFO [HiveServer2-Handler-Pool: Thread-36] 
> session.SessionState: Resetting thread name to HiveServer2-Handler-Pool: 
> Thread-36
>  2018-04-05T03:31:41,236 INFO [HiveServer2-Handler-Pool: Thread-36] 
> session.SessionState: Deleted directory: 
> /var/hive/scratch/tmp/anonymous/c81ec0f9-7a9d-46b6-9708-e7d78520a48a on fs 
> with scheme file
>  2018-04-05T03:31:41,236 INFO [HiveServer2-Handler-Pool: Thread-36] 
> session.SessionState: Deleted directory: 
> /var/hive/ec2-user/c81ec0f9-7a9d-46b6-9708-e7d78520a48a on fs with scheme file
>  2018-04-05T03:31:41,236 INFO [HiveServer2-Handler-Pool: Thread-36] 
> hive.metastore: Closed a connection to metastore, current connections: 1{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
