[jira] [Commented] (HIVE-19117) hiveserver2 org.apache.thrift.transport.TTransportException error when running 2nd query after minute of inactivity
[ https://issues.apache.org/jira/browse/HIVE-19117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16774340#comment-16774340 ] Chaoyu Tang commented on HIVE-19117: Could it be that the HS2 session timed out? Double-check the property hive.server2.idle.session.timeout; it defaults to 7 days, so it should not be the problem. In the meantime, check the HS2 log file to see what happened during that period. > hiveserver2 org.apache.thrift.transport.TTransportException error when > running 2nd query after minute of inactivity > --- > > Key: HIVE-19117 > URL: https://issues.apache.org/jira/browse/HIVE-19117 > Project: Hive > Issue Type: Bug > Components: Hive, HiveServer2, Metastore, Thrift API >Affects Versions: 2.1.1 > Environment: * Hive 2.1.1 with hive.server2.transport.mode set to > binary (sample JDBC string is jdbc:hive2://remotehost:1/default) > * Hadoop 2.8.3 > * Metastore using MySQL > * Java 8 >Reporter: t oo >Priority: Blocker > > I make a JDBC connection from my SQL tool (e.g. Squirrel SQL, Oracle SQL > Developer) to HiveServer2 (running on a remote server) with port 1. > I am able to run some queries successfully. I then do something else (not in > the SQL tool) for 1-2 minutes, then return to my SQL tool and attempt to > run a query, but I get this error: > {code:java} > org.apache.thrift.transport.TTransportException: java.net.SocketException: > Software caused connection abort: socket write error{code} > If I now disconnect and reconnect in my SQL tool I can run queries again. But > does anyone know what HiveServer2 settings I should change to prevent the > error? I assume something in hive-site.xml. > From the HiveServer2 logs below, you can see an exact 1-minute gap from the 30th minute > to the 31st minute, where the disconnect happens. > {code:java} > 2018-04-05T03:30:41,706 INFO [HiveServer2-Handler-Pool: Thread-36] > session.SessionState: Resetting thread name to HiveServer2-Handler-Pool: > Thread-36 > 2018-04-05T03:30:41,712 INFO [HiveServer2-Handler-Pool: Thread-36] > session.SessionState: Updating thread name to > c81ec0f9-7a9d-46b6-9708-e7d78520a48a HiveServer2-Handler-Pool: Thread-36 > 2018-04-05T03:30:41,712 INFO [HiveServer2-Handler-Pool: Thread-36] > session.SessionState: Resetting thread name to HiveServer2-Handler-Pool: > Thread-36 > 2018-04-05T03:30:41,718 INFO [HiveServer2-Handler-Pool: Thread-36] > session.SessionState: Updating thread name to > c81ec0f9-7a9d-46b6-9708-e7d78520a48a HiveServer2-Handler-Pool: Thread-36 > 2018-04-05T03:30:41,719 INFO [HiveServer2-Handler-Pool: Thread-36] > session.SessionState: Resetting thread name to HiveServer2-Handler-Pool: > Thread-36 > 2018-04-05T03:31:41,232 INFO [HiveServer2-Handler-Pool: Thread-36] > thrift.ThriftCLIService: Session disconnected without closing properly.
> 2018-04-05T03:31:41,233 INFO [HiveServer2-Handler-Pool: Thread-36] > thrift.ThriftCLIService: Closing the session: SessionHandle > [c81ec0f9-7a9d-46b6-9708-e7d78520a48a] > 2018-04-05T03:31:41,233 INFO [HiveServer2-Handler-Pool: Thread-36] > service.CompositeService: Session closed, SessionHandle > [c81ec0f9-7a9d-46b6-9708-e7d78520a48a], current sessions:0 > 2018-04-05T03:31:41,233 INFO [HiveServer2-Handler-Pool: Thread-36] > session.SessionState: Updating thread name to > c81ec0f9-7a9d-46b6-9708-e7d78520a48a HiveServer2-Handler-Pool: Thread-36 > 2018-04-05T03:31:41,233 INFO [HiveServer2-Handler-Pool: Thread-36] > session.SessionState: Resetting thread name to HiveServer2-Handler-Pool: > Thread-36 > 2018-04-05T03:31:41,233 INFO [HiveServer2-Handler-Pool: Thread-36] > session.SessionState: Updating thread name to > c81ec0f9-7a9d-46b6-9708-e7d78520a48a HiveServer2-Handler-Pool: Thread-36 > 2018-04-05T03:31:41,233 INFO [HiveServer2-Handler-Pool: Thread-36] > session.HiveSessionImpl: Operation log session directory is deleted: > /var/hive/hs2log/tmp/c81ec0f9-7a9d-46b6-9708-e7d78520a48a > 2018-04-05T03:31:41,233 INFO [HiveServer2-Handler-Pool: Thread-36] > session.SessionState: Resetting thread name to HiveServer2-Handler-Pool: > Thread-36 > 2018-04-05T03:31:41,236 INFO [HiveServer2-Handler-Pool: Thread-36] > session.SessionState: Deleted directory: > /var/hive/scratch/tmp/anonymous/c81ec0f9-7a9d-46b6-9708-e7d78520a48a on fs > with scheme file > 2018-04-05T03:31:41,236 INFO [HiveServer2-Handler-Pool: Thread-36] > session.SessionState: Deleted directory: > /var/hive/ec2-user/c81ec0f9-7a9d-46b6-9708-e7d78520a48a on fs with scheme file > 2018-04-05T03:31:41,236 INFO [HiveServer2-Handler-Pool: Thread-36] > hive.metastore: Closed a connection to metastore, current connections: 1{code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
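As a starting point for the diagnostics suggested above, the relevant timeouts can be inspected from the same beeline/JDBC session before it goes idle. This is only a diagnostic sketch; hive.server2.idle.operation.timeout and hive.server2.session.check.interval are related properties worth checking as assumptions, not settings cited in the comment:
{code:sql}
-- In beeline, a bare SET <property>; echoes the effective value without changing it.
SET hive.server2.idle.session.timeout;    -- per the comment above, defaults to 7 days
SET hive.server2.idle.operation.timeout;  -- related operation-level timeout (assumption: worth checking too)
SET hive.server2.session.check.interval;  -- how often HS2 scans for idle sessions to close
{code}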
[jira] [Commented] (HIVE-12812) Enable mapred.input.dir.recursive by default to support union with aggregate function
[ https://issues.apache.org/jira/browse/HIVE-12812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16624059#comment-16624059 ] Chaoyu Tang commented on HIVE-12812: I cannot remember the exact reason why this patch was not committed years ago. It was probably because MR was going to be decommissioned soon and there was a regression in one test case. But as a workaround, you can always set the property to true on the command line (it does not need to be in hive-site.xml): set mapred.input.dir.recursive=true > Enable mapred.input.dir.recursive by default to support union with aggregate > function > - > > Key: HIVE-12812 > URL: https://issues.apache.org/jira/browse/HIVE-12812 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 1.2.1, 2.1.0 >Reporter: Chaoyu Tang >Priority: Major > Attachments: HIVE-12812.patch, HIVE-12812.patch, HIVE-12812.patch > > > When union remove optimization is enabled, a union query with an aggregate > function writes its subquery intermediate results to subdirs, which requires > mapred.input.dir.recursive to be enabled in order for them to be fetched. This > property is not defined by default in Hive and is often ignored by users, which > causes the query failure and is hard to debug. > So we need to set mapred.input.dir.recursive to true whenever union remove > optimization is enabled. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
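For reference, a minimal sketch of the session-level workaround described above. The table name src and the query are hypothetical; hive.optimize.union.remove is shown only because the issue applies when that optimization is enabled:
{code:sql}
-- Workaround sketch: set per session instead of in hive-site.xml.
SET hive.optimize.union.remove=true;   -- the optimization that writes subquery results into subdirs
SET mapred.input.dir.recursive=true;   -- lets the follow-up job fetch those subdirectories
SELECT key, count(*) FROM src GROUP BY key
UNION ALL
SELECT key, count(*) FROM src GROUP BY key;
{code}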
[jira] [Commented] (HIVE-15485) Investigate the DoAs failure in HoS
[ https://issues.apache.org/jira/browse/HIVE-15485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16265873#comment-16265873 ] Chaoyu Tang commented on HIVE-15485: [~linzhangbing] I assume that you used beeline via HoS. Please try this Spark property spark.yarn.security.tokens.hive.enabled=true to see if it helps. > Investigate the DoAs failure in HoS > --- > > Key: HIVE-15485 > URL: https://issues.apache.org/jira/browse/HIVE-15485 > Project: Hive > Issue Type: Bug > Components: Spark >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Fix For: 2.3.0 > > Attachments: HIVE-15485.1.patch, HIVE-15485.2.patch, HIVE-15485.patch > > > With DoAs enabled, HoS failed with the following errors: > {code} > Exception in thread "main" org.apache.hadoop.security.AccessControlException: > systest tries to renew a token with renewer hive > at > org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.renewToken(AbstractDelegationTokenSecretManager.java:484) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.renewDelegationToken(FSNamesystem.java:7543) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.renewDelegationToken(NameNodeRpcServer.java:555) > at > org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.renewDelegationToken(AuthorizationProviderProxyClientProtocol.java:674) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.renewDelegationToken(ClientNamenodeProtocolServerSideTranslatorPB.java:999) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2141) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2137) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1783) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2135) > {code} > It is related to the change from HIVE-14383. It looks like SparkSubmit > logs in to Kerberos with the passed-in hive principal/keytab and then tries to > create an HDFS delegation token for user systest with renewer hive. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
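A sketch of how the suggested property could be tried from a beeline session. Whether a spark.* property can be set at session level depends on your deployment's restricted/whitelisted parameter configuration, and the table name and query are hypothetical:
{code:sql}
SET hive.execution.engine=spark;                   -- make sure the session runs HoS
SET spark.yarn.security.tokens.hive.enabled=true;  -- property suggested in the comment above
SELECT count(*) FROM some_table;                   -- hypothetical query to launch a HoS job
{code}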
[jira] [Updated] (HIVE-16930) HoS should verify the value of Kerberos principal and keytab file before adding them to spark-submit command parameters
[ https://issues.apache.org/jira/browse/HIVE-16930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-16930: --- Resolution: Fixed Fix Version/s: 2.4.0 3.0.0 Status: Resolved (was: Patch Available) Committed to 3.0.0 and 2.4.0. Thanks [~Yibing] for the patch. > HoS should verify the value of Kerberos principal and keytab file before > adding them to spark-submit command parameters > --- > > Key: HIVE-16930 > URL: https://issues.apache.org/jira/browse/HIVE-16930 > Project: Hive > Issue Type: Bug > Components: Spark >Reporter: Yibing Shi >Assignee: Yibing Shi > Fix For: 3.0.0, 2.4.0 > > Attachments: HIVE-16930.1.patch > > > When Kerberos is enabled, Hive CLI fails to run Hive on Spark queries: > {noformat} > >hive -e "set hive.execution.engine=spark; create table if not exists test(a > >int); select count(*) from test" --hiveconf hive.root.logger=INFO,console > > >/var/tmp/hive_log.txt > /var/tmp/hive_log_2.txt > 17/06/16 16:13:13 [main]: ERROR client.SparkClientImpl: Error while waiting > for client to connect. > java.util.concurrent.ExecutionException: java.lang.RuntimeException: Cancel > client 'a5de85d1-6933-43e7-986f-5f8e5c001b5f'. Error: Child process exited > before connecting back with error log Error: Cannot load main class from JAR > file:/tmp/spark-submit.7196051517706529285.properties > Run with --help for usage help or --verbose for debug output > at > io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37) > at > org.apache.hive.spark.client.SparkClientImpl.<init>(SparkClientImpl.java:107) > at > org.apache.hive.spark.client.SparkClientFactory.createClient(SparkClientFactory.java:80) > > at > org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.createRemoteClient(RemoteHiveSparkClient.java:100) > > at > org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.<init>(RemoteHiveSparkClient.java:96) > > at > org.apache.hadoop.hive.ql.exec.spark.HiveSparkClientFactory.createHiveSparkClient(HiveSparkClientFactory.java:66) > > at > org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:62) > > at > org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionManagerImpl.getSession(SparkSessionManagerImpl.java:114) > > at > org.apache.hadoop.hive.ql.exec.spark.SparkUtilities.getSparkSession(SparkUtilities.java:111) > > at > org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:97) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100) > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1972) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1685) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1421) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1205) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1195) > at > org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:220) > at > org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:172) > at > org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:383) > at > org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:318) > at > org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:720) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:693) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:628) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at >
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > > at java.lang.reflect.Method.invoke(Method.java:606) > at org.apache.hadoop.util.RunJar.run(RunJar.java:221) > at org.apache.hadoop.util.RunJar.main(RunJar.java:136) > Caused by: java.lang.RuntimeException: Cancel client > 'a5de85d1-6933-43e7-986f-5f8e5c001b5f'. Error: Child process exited before > connecting back with error log Error: Cannot load main class from JAR > file:/tmp/spark-submit.7196051517706529285.properties > Run with --help for usage help or --verbose for debug output > at > org.apache.hive.spark.client.rpc.RpcServer.cancelClient(RpcServer.java:179) > at > org.apache.hive.spark.client.SparkClientImpl$3.run(SparkClientImpl.java:490) > at java.lang.Thread.run(Thread.java:745)
[jira] [Commented] (HIVE-16930) HoS should verify the value of Kerberos principal and keytab file before adding them to spark-submit command parameters
[ https://issues.apache.org/jira/browse/HIVE-16930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16057735#comment-16057735 ] Chaoyu Tang commented on HIVE-16930: +1 > HoS should verify the value of Kerberos principal and keytab file before > adding them to spark-submit command parameters > --- > > Key: HIVE-16930 > URL: https://issues.apache.org/jira/browse/HIVE-16930 > Project: Hive > Issue Type: Bug > Components: Spark >Reporter: Yibing Shi >Assignee: Yibing Shi > Attachments: HIVE-16930.1.patch > > > When Kerberos is enabled, Hive CLI fails to run Hive on Spark queries: > {noformat} > >hive -e "set hive.execution.engine=spark; create table if not exists test(a > >int); select count(*) from test" --hiveconf hive.root.logger=INFO,console > > >/var/tmp/hive_log.txt > /var/tmp/hive_log_2.txt > 17/06/16 16:13:13 [main]: ERROR client.SparkClientImpl: Error while waiting > for client to connect. > java.util.concurrent.ExecutionException: java.lang.RuntimeException: Cancel > client 'a5de85d1-6933-43e7-986f-5f8e5c001b5f'. Error: Child process exited > before connecting back with error log Error: Cannot load main class from JAR > file:/tmp/spark-submit.7196051517706529285.properties > Run with --help for usage help or --verbose for debug output > at > io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37) > at > org.apache.hive.spark.client.SparkClientImpl.<init>(SparkClientImpl.java:107) > at > org.apache.hive.spark.client.SparkClientFactory.createClient(SparkClientFactory.java:80) > > at > org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.createRemoteClient(RemoteHiveSparkClient.java:100) > > at > org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.<init>(RemoteHiveSparkClient.java:96) > > at > org.apache.hadoop.hive.ql.exec.spark.HiveSparkClientFactory.createHiveSparkClient(HiveSparkClientFactory.java:66) > > at > org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:62) > > at > org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionManagerImpl.getSession(SparkSessionManagerImpl.java:114) > > at > org.apache.hadoop.hive.ql.exec.spark.SparkUtilities.getSparkSession(SparkUtilities.java:111) > > at > org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:97) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100) > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1972) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1685) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1421) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1205) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1195) > at > org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:220) > at > org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:172) > at > org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:383) > at > org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:318) > at > org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:720) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:693) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:628) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > > at
java.lang.reflect.Method.invoke(Method.java:606) > at org.apache.hadoop.util.RunJar.run(RunJar.java:221) > at org.apache.hadoop.util.RunJar.main(RunJar.java:136) > Caused by: java.lang.RuntimeException: Cancel client > 'a5de85d1-6933-43e7-986f-5f8e5c001b5f'. Error: Child process exited before > connecting back with error log Error: Cannot load main class from JAR > file:/tmp/spark-submit.7196051517706529285.properties > Run with --help for usage help or --verbose for debug output > at > org.apache.hive.spark.client.rpc.RpcServer.cancelClient(RpcServer.java:179) > at > org.apache.hive.spark.client.SparkClientImpl$3.run(SparkClientImpl.java:490) > at java.lang.Thread.run(Thread.java:745) > 17/06/16 16:13:13 [Driver]: WARN client.SparkClientImpl: Child process exited > with code 1 > {noformat} > In the log, the below message shows up: > {noformat} > 17
[jira] [Commented] (HIVE-14615) Temp table leaves behind insert command
[ https://issues.apache.org/jira/browse/HIVE-14615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16054938#comment-16054938 ] Chaoyu Tang commented on HIVE-14615: [~stakiar] I have not started to look into this, so please feel free to assign the JIRA to Andrew. > Temp table leaves behind insert command > --- > > Key: HIVE-14615 > URL: https://issues.apache.org/jira/browse/HIVE-14615 > Project: Hive > Issue Type: Bug > Components: Query Processor >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > > {code} > create table test (key int, value string); > insert into test values (1, 'val1'); > show tables; > test > values__tmp__table__1 > {code} > the temp table values__tmp__table__1 resulted from insert into ... values > and exists until the session is logged out. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
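Until a fix lands, a possible manual cleanup within the session, shown below as a sketch. It assumes the leftover temp table can be dropped by name like any other temp table; this is not verified in the JIRA:
{code:sql}
SHOW TABLES;                       -- values__tmp__table__1 shows up after the INSERT ... VALUES
DROP TABLE values__tmp__table__1;  -- assumption: the leftover temp table is droppable by name
{code}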
[jira] [Updated] (HIVE-16803) Alter table change column comment should not try to get column stats for update
[ https://issues.apache.org/jira/browse/HIVE-16803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-16803: --- Resolution: Fixed Fix Version/s: 2.4.0 3.0.0 Status: Resolved (was: Patch Available) Committed to 3.0.0 and 2.4.0. Thanks [~pxiong] for reviewing the patch. > Alter table change column comment should not try to get column stats for > update > --- > > Key: HIVE-16803 > URL: https://issues.apache.org/jira/browse/HIVE-16803 > Project: Hive > Issue Type: Bug > Components: Statistics >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang >Priority: Minor > Fix For: 3.0.0, 2.4.0 > > Attachments: HIVE-16803.patch > > > When running a command like "alter table .. change .." (e.g. ALTER TABLE > testtbl CHANGE col col string COMMENT 'change column comment';) to change a > column's comment, Hive should not fetch the column stats for update, > since the comment change does not affect table/partition column stats. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16803) Alter table change column comment should not try to get column stats for update
[ https://issues.apache.org/jira/browse/HIVE-16803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16033590#comment-16033590 ] Chaoyu Tang commented on HIVE-16803: I was not able to reproduce the failures of TestCliDriver[stats_aggregator_error_1], TestMiniLlapLocalCliDriver[columnstats_part_coltype.q], TestPerfCliDriver[query14.q] on my local machine. They seem to be flaky tests, not related to this patch. [~pxiong], could you help review this patch? Thanks. > Alter table change column comment should not try to get column stats for > update > --- > > Key: HIVE-16803 > URL: https://issues.apache.org/jira/browse/HIVE-16803 > Project: Hive > Issue Type: Bug > Components: Statistics >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang >Priority: Minor > Attachments: HIVE-16803.patch > > > When running a command like "alter table .. change .." (e.g. ALTER TABLE > testtbl CHANGE col col string COMMENT 'change column comment';) to change a > column's comment, Hive should not fetch the column stats for update, > since the comment change does not affect table/partition column stats. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16803) Alter table change column comment should not try to get column stats for update
[ https://issues.apache.org/jira/browse/HIVE-16803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-16803: --- Status: Patch Available (was: Open) > Alter table change column comment should not try to get column stats for > update > --- > > Key: HIVE-16803 > URL: https://issues.apache.org/jira/browse/HIVE-16803 > Project: Hive > Issue Type: Bug > Components: Statistics >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang >Priority: Minor > Attachments: HIVE-16803.patch > > > When running a command like "alter table .. change .." (e.g. ALTER TABLE > testtbl CHANGE col col string COMMENT 'change column comment';) to change a > column's comment, Hive should not fetch the column stats for update, > since the comment change does not affect table/partition column stats. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16803) Alter table change column comment should not try to get column stats for update
[ https://issues.apache.org/jira/browse/HIVE-16803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-16803: --- Attachment: HIVE-16803.patch Whether column stats need to be fetched and/or updated in alter table is determined only by comparing the column name/type between the new and old columns, not the column comment; see the sketch below. > Alter table change column comment should not try to get column stats for > update > --- > > Key: HIVE-16803 > URL: https://issues.apache.org/jira/browse/HIVE-16803 > Project: Hive > Issue Type: Bug > Components: Statistics >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang >Priority: Minor > Attachments: HIVE-16803.patch > > > When running a command like "alter table .. change .." (e.g. ALTER TABLE > testtbl CHANGE col col string COMMENT 'change column comment';) to change a > column's comment, Hive should not fetch the column stats for update, > since the comment change does not affect table/partition column stats. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
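To make the comparison rule above concrete, a sketch contrasting the two cases (table and column names are taken from the issue description or hypothetical):
{code:sql}
-- Name and type unchanged, only the comment differs: no column-stats fetch/update is needed.
ALTER TABLE testtbl CHANGE col col string COMMENT 'change column comment';
-- Name (or type) changed: fetching and updating the column stats is warranted.
ALTER TABLE testtbl CHANGE col col2 string;
{code}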
[jira] [Assigned] (HIVE-16803) Alter table change column comment should not try to get column stats for update
[ https://issues.apache.org/jira/browse/HIVE-16803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang reassigned HIVE-16803: -- > Alter table change column comment should not try to get column stats for > update > --- > > Key: HIVE-16803 > URL: https://issues.apache.org/jira/browse/HIVE-16803 > Project: Hive > Issue Type: Bug > Components: Statistics >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang >Priority: Minor > > When running a command like "alter table .. change .." (e.g. ALTER TABLE > testtbl CHANGE col col string COMMENT 'change column comment';) to change a > column's comment, Hive should not fetch the column stats for update, > since the comment change does not affect table/partition column stats. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16572) Rename a partition should not drop its column stats
[ https://issues.apache.org/jira/browse/HIVE-16572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-16572: --- Resolution: Fixed Fix Version/s: 2.4.0 3.0.0 Status: Resolved (was: Patch Available) Committed to 3.0.0 and 2.4.0. Thanks [~ychena] for the review. > Rename a partition should not drop its column stats > --- > > Key: HIVE-16572 > URL: https://issues.apache.org/jira/browse/HIVE-16572 > Project: Hive > Issue Type: Bug > Components: Statistics >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Fix For: 3.0.0, 2.4.0 > > Attachments: HIVE-16572.1.patch, HIVE-16572.patch > > > The column stats for the table sample_pt partition (dummy=1) are as follows: > {code} > hive> describe formatted sample_pt partition (dummy=1) code; > OK > # col_name data_type min > max num_nulls distinct_count > avg_col_len max_col_len num_trues > num_falses comment > > > code string > 0 303 6.985 > 7 > from deserializer > Time taken: 0.259 seconds, Fetched: 3 row(s) > {code} > But when this partition is renamed, say > alter table sample_pt partition (dummy=1) rename to partition (dummy=11); > the COLUMN_STATS flags in the partition description are still true, but the column stats are > actually all deleted. > {code} > hive> describe formatted sample_pt partition (dummy=11); > OK > # col_name data_type comment > > code string > description string > salary int > total_emp int > > # Partition Information > # col_name data_type comment > > dummy int > > # Detailed Partition Information > Partition Value: [11] > Database: default > Table: sample_pt > CreateTime: Thu Mar 30 23:03:59 EDT 2017 > LastAccessTime: UNKNOWN > Location: file:/user/hive/warehouse/apache/sample_pt/dummy=11 > > Partition Parameters: > COLUMN_STATS_ACCURATE > {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}} > numFiles 1 > numRows 200 > rawDataSize 10228 > totalSize 10428 > transient_lastDdlTime 1490929439 > > # Storage Information > SerDe Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe > > InputFormat: org.apache.hadoop.mapred.TextInputFormat > OutputFormat: > org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat > Compressed: No > Num Buckets: -1 > Bucket Columns: [] > Sort Columns: [] > Storage Desc Params: > serialization.format 1 > Time taken: 6.783 seconds, Fetched: 37 row(s) > === > hive> describe formatted sample_pt partition (dummy=11) code; > OK > # col_name data_type comment > > > > code string from deserializer > > Time taken: 9.429 seconds, Fetched: 3 row(s) > {code} > The column stats should not be dropped when a partition is renamed. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (HIVE-11064) ALTER TABLE CASCADE ERROR unbalanced calls to openTransaction/commitTransaction
[ https://issues.apache.org/jira/browse/HIVE-11064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang resolved HIVE-11064. Resolution: Fixed Fix Version/s: 2.4.0 3.0.0 It has been fixed in HIVE-16147. > ALTER TABLE CASCADE ERROR unbalanced calls to > openTransaction/commitTransaction > --- > > Key: HIVE-11064 > URL: https://issues.apache.org/jira/browse/HIVE-11064 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 1.1.0 > Environment: CDH5.4.0 >Reporter: fatkun >Assignee: Chaoyu Tang > Fix For: 3.0.0, 2.4.0 > > > my hive version is hive-1.1.0-cdh5.4.0 > follow these steps and the exception is thrown > > use hive client > {code} > CREATE TABLE test1 (name string) PARTITIONED BY (pt string); > ALTER TABLE test1 ADD PARTITION (pt='1'); > ALTER TABLE test1 CHANGE name name1 string; > ALTER TABLE test1 CHANGE name1 name string cascade; > {code} > then this exception is thrown, > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask. Unable to alter table. > java.lang.RuntimeException: commitTransaction was called but > openTransactionCalls = 0. This probably indicates that there are unbalanced > calls to openTransaction/commitTransaction > > metastore log > {quote} > MetaException(message:java.lang.RuntimeException: commitTransaction was > called but openTransactionCalls = 0. This probably indicates that there are > unbalanced calls to openTransaction/commitTransaction) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:5257) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_core(HiveMetaStore.java:3338) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_with_cascade(HiveMetaStore.java:3290) > at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:102) > at com.sun.proxy.$Proxy5.alter_table_with_cascade(Unknown Source) > at > org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_table_with_cascade.getResult(ThriftHiveMetastore.java:9131) > at > org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_table_with_cascade.getResult(ThriftHiveMetastore.java:9115) > at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) > at > org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110) > at > org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) > at > org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118) > at > org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.RuntimeException: commitTransaction was called but > openTransactionCalls = 0.
This probably indicates that there are unbalanced > calls to openTransaction/commitTransaction > at > org.apache.hadoop.hive.metastore.ObjectStore.commitTransaction(ObjectStore.java:448) > at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:98) > at com.sun.proxy.$Proxy0.commitTransaction(Unknown Source) > at > org.apache.hadoop.hive.metastore.HiveAlterHandler.alterTable(HiveAlterHandler.java:242) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_core(HiveMetaStore.java:3318) > ... 19 more > {quote} > I debugged the code; this function "private void > updatePartColumnStatsForAlterColumns" may be wrong. Some transaction rolled back, but I > don't know the exact error. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Reopened] (HIVE-11064) ALTER TABLE CASCADE ERROR unbalanced calls to openTransaction/commitTransaction
[ https://issues.apache.org/jira/browse/HIVE-11064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang reopened HIVE-11064: > ALTER TABLE CASCADE ERROR unbalanced calls to > openTransaction/commitTransaction > --- > > Key: HIVE-11064 > URL: https://issues.apache.org/jira/browse/HIVE-11064 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 1.1.0 > Environment: CDH5.4.0 >Reporter: fatkun >Assignee: Chaoyu Tang > > my hive version is hive-1.1.0-cdh5.4.0 > follow these steps and the exception is thrown > > use hive client > {code} > CREATE TABLE test1 (name string) PARTITIONED BY (pt string); > ALTER TABLE test1 ADD PARTITION (pt='1'); > ALTER TABLE test1 CHANGE name name1 string; > ALTER TABLE test1 CHANGE name1 name string cascade; > {code} > then this exception is thrown, > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask. Unable to alter table. > java.lang.RuntimeException: commitTransaction was called but > openTransactionCalls = 0. This probably indicates that there are unbalanced > calls to openTransaction/commitTransaction > > metastore log > {quote} > MetaException(message:java.lang.RuntimeException: commitTransaction was > called but openTransactionCalls = 0. This probably indicates that there are > unbalanced calls to openTransaction/commitTransaction) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:5257) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_core(HiveMetaStore.java:3338) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_with_cascade(HiveMetaStore.java:3290) > at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:102) > at com.sun.proxy.$Proxy5.alter_table_with_cascade(Unknown Source) > at > org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_table_with_cascade.getResult(ThriftHiveMetastore.java:9131) > at > org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_table_with_cascade.getResult(ThriftHiveMetastore.java:9115) > at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) > at > org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110) > at > org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) > at > org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118) > at > org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.RuntimeException: commitTransaction was called but > openTransactionCalls = 0.
This probably indicates that there are unbalanced > calls to openTransaction/commitTransaction > at > org.apache.hadoop.hive.metastore.ObjectStore.commitTransaction(ObjectStore.java:448) > at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:98) > at com.sun.proxy.$Proxy0.commitTransaction(Unknown Source) > at > org.apache.hadoop.hive.metastore.HiveAlterHandler.alterTable(HiveAlterHandler.java:242) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_core(HiveMetaStore.java:3318) > ... 19 more > {quote} > I debugged the code; this function "private void > updatePartColumnStatsForAlterColumns" may be wrong. Some transaction rolled back, but I > don't know the exact error. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-11064) ALTER TABLE CASCADE ERROR unbalanced calls to openTransaction/commitTransaction
[ https://issues.apache.org/jira/browse/HIVE-11064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16002678#comment-16002678 ] Chaoyu Tang commented on HIVE-11064: The issue was caused by a discrepancy in column definitions between a table and its partition. The first command, "ALTER TABLE test1 CHANGE name name1 string;", changed the table's column "name" to "name1"; the second command, "ALTER TABLE test1 CHANGE name1 name string cascade;", with its "cascade" clause, attempted to change the partition column "name1" to "name", but "name1" did not actually exist in the partition. When executing the second command, Hive failed in validateTableCols (which validates partition columns against the table) in getMPartitionColumnStatistics. That is the root cause of the issue seen in this JIRA, though the thrown exception and its message are not very informative; see the annotated reproduction below. The issue has been fixed as a side effect of HIVE-16147. > ALTER TABLE CASCADE ERROR unbalanced calls to > openTransaction/commitTransaction > --- > > Key: HIVE-11064 > URL: https://issues.apache.org/jira/browse/HIVE-11064 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 1.1.0 > Environment: CDH5.4.0 >Reporter: fatkun >Assignee: Chaoyu Tang > > my hive version is hive-1.1.0-cdh5.4.0 > follow these steps and the exception is thrown > > use hive client > {code} > CREATE TABLE test1 (name string) PARTITIONED BY (pt string); > ALTER TABLE test1 ADD PARTITION (pt='1'); > ALTER TABLE test1 CHANGE name name1 string; > ALTER TABLE test1 CHANGE name1 name string cascade; > {code} > then this exception is thrown, > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask. Unable to alter table. > java.lang.RuntimeException: commitTransaction was called but > openTransactionCalls = 0. This probably indicates that there are unbalanced > calls to openTransaction/commitTransaction > > metastore log > {quote} > MetaException(message:java.lang.RuntimeException: commitTransaction was > called but openTransactionCalls = 0.
This probably indicates that there are > unbalanced calls to openTransaction/commitTransaction) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:5257) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_core(HiveMetaStore.java:3338) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_with_cascade(HiveMetaStore.java:3290) > at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:102) > at com.sun.proxy.$Proxy5.alter_table_with_cascade(Unknown Source) > at > org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_table_with_cascade.getResult(ThriftHiveMetastore.java:9131) > at > org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_table_with_cascade.getResult(ThriftHiveMetastore.java:9115) > at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) > at > org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110) > at > org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) > at > org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118) > at > org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.RuntimeException: commitTransaction was called but > openTransactionCalls = 0. This probably indicates that there are unbalanced > calls to openTransaction/commitTransaction > at > org.apache.hadoop.hive.metastore.ObjectStore.commitTransaction(ObjectStore.java:448) > at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:98) > at com.sun
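Annotating the reproduction from the description against the root-cause analysis above (a sketch; the comments are interpretation, not output from Hive):
{code:sql}
CREATE TABLE test1 (name string) PARTITIONED BY (pt string);
ALTER TABLE test1 ADD PARTITION (pt='1');            -- partition pt=1 is created with column "name"
ALTER TABLE test1 CHANGE name name1 string;          -- no CASCADE: only the table column is renamed
ALTER TABLE test1 CHANGE name1 name string CASCADE;  -- CASCADE looks for column "name1" in the partition,
                                                     -- which still has "name", so validateTableCols fails
{code}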
[jira] [Commented] (HIVE-16572) Rename a partition should not drop its column stats
[ https://issues.apache.org/jira/browse/HIVE-16572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15999650#comment-15999650 ] Chaoyu Tang commented on HIVE-16572: The test failure is not related to the patch. > Rename a partition should not drop its column stats > --- > > Key: HIVE-16572 > URL: https://issues.apache.org/jira/browse/HIVE-16572 > Project: Hive > Issue Type: Bug > Components: Statistics >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-16572.1.patch, HIVE-16572.patch > > > The column stats for the table sample_pt partition (dummy=1) are as follows: > {code} > hive> describe formatted sample_pt partition (dummy=1) code; > OK > # col_name data_type min > max num_nulls distinct_count > avg_col_len max_col_len num_trues > num_falses comment > > > code string > 0 303 6.985 > 7 > from deserializer > Time taken: 0.259 seconds, Fetched: 3 row(s) > {code} > But when this partition is renamed, say > alter table sample_pt partition (dummy=1) rename to partition (dummy=11); > the COLUMN_STATS flags in the partition description are still true, but the column stats are > actually all deleted. > {code} > hive> describe formatted sample_pt partition (dummy=11); > OK > # col_name data_type comment > > code string > description string > salary int > total_emp int > > # Partition Information > # col_name data_type comment > > dummy int > > # Detailed Partition Information > Partition Value: [11] > Database: default > Table: sample_pt > CreateTime: Thu Mar 30 23:03:59 EDT 2017 > LastAccessTime: UNKNOWN > Location: file:/user/hive/warehouse/apache/sample_pt/dummy=11 > > Partition Parameters: > COLUMN_STATS_ACCURATE > {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}} > numFiles 1 > numRows 200 > rawDataSize 10228 > totalSize 10428 > transient_lastDdlTime 1490929439 > > # Storage Information > SerDe Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe > > InputFormat: org.apache.hadoop.mapred.TextInputFormat > OutputFormat: > org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat > Compressed: No > Num Buckets: -1 > Bucket Columns: [] > Sort Columns: [] > Storage Desc Params: > serialization.format 1 > Time taken: 6.783 seconds, Fetched: 37 row(s) > === > hive> describe formatted sample_pt partition (dummy=11) code; > OK > # col_name data_type comment > > > > code string from deserializer > > Time taken: 9.429 seconds, Fetched: 3 row(s) > {code} > The column stats should not be dropped when a partition is renamed. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (HIVE-11064) ALTER TABLE CASCADE ERROR unbalanced calls to openTransaction/commitTransaction
[ https://issues.apache.org/jira/browse/HIVE-11064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang resolved HIVE-11064. Resolution: Cannot Reproduce > ALTER TABLE CASCADE ERROR unbalanced calls to > openTransaction/commitTransaction > --- > > Key: HIVE-11064 > URL: https://issues.apache.org/jira/browse/HIVE-11064 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 1.1.0 > Environment: CDH5.4.0 >Reporter: fatkun >Assignee: Chaoyu Tang > > my hive version is hive-1.1.0-cdh5.4.0 > follow these steps and the exception is thrown > > use hive client > {code} > CREATE TABLE test1 (name string) PARTITIONED BY (pt string); > ALTER TABLE test1 ADD PARTITION (pt='1'); > ALTER TABLE test1 CHANGE name name1 string; > ALTER TABLE test1 CHANGE name1 name string cascade; > {code} > then this exception is thrown, > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask. Unable to alter table. > java.lang.RuntimeException: commitTransaction was called but > openTransactionCalls = 0. This probably indicates that there are unbalanced > calls to openTransaction/commitTransaction > > metastore log > {quote} > MetaException(message:java.lang.RuntimeException: commitTransaction was > called but openTransactionCalls = 0. This probably indicates that there are > unbalanced calls to openTransaction/commitTransaction) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:5257) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_core(HiveMetaStore.java:3338) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_with_cascade(HiveMetaStore.java:3290) > at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:102) > at com.sun.proxy.$Proxy5.alter_table_with_cascade(Unknown Source) > at > org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_table_with_cascade.getResult(ThriftHiveMetastore.java:9131) > at > org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_table_with_cascade.getResult(ThriftHiveMetastore.java:9115) > at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) > at > org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110) > at > org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) > at > org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118) > at > org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.RuntimeException: commitTransaction was called but > openTransactionCalls = 0.
This probably indicates that there are unbalanced > calls to openTransaction/commitTransaction > at > org.apache.hadoop.hive.metastore.ObjectStore.commitTransaction(ObjectStore.java:448) > at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:98) > at com.sun.proxy.$Proxy0.commitTransaction(Unknown Source) > at > org.apache.hadoop.hive.metastore.HiveAlterHandler.alterTable(HiveAlterHandler.java:242) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_core(HiveMetaStore.java:3318) > ... 19 more > {quote} > I debugged the code; this function "private void > updatePartColumnStatsForAlterColumns" may be wrong. Some transaction rolled back, but I > don't know the exact error. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16572) Rename a partition should not drop its column stats
[ https://issues.apache.org/jira/browse/HIVE-16572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-16572: --- Attachment: HIVE-16572.1.patch Fixed the failure for test rename_external_partition_location.q and added more tests for renaming a partition in an external table. The other two test failures are not related to this patch; I was not able to reproduce them on my local machine. [~pxiong], could you help review the patch? Thanks > Rename a partition should not drop its column stats > --- > > Key: HIVE-16572 > URL: https://issues.apache.org/jira/browse/HIVE-16572 > Project: Hive > Issue Type: Bug > Components: Statistics >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-16572.1.patch, HIVE-16572.patch > > > The column stats for the table sample_pt partition (dummy=1) are as follows: > {code} > hive> describe formatted sample_pt partition (dummy=1) code; > OK > # col_name data_type min > max num_nulls distinct_count > avg_col_len max_col_len num_trues > num_falses comment > > > code string > 0 303 6.985 > 7 > from deserializer > Time taken: 0.259 seconds, Fetched: 3 row(s) > {code} > But when this partition is renamed, say > alter table sample_pt partition (dummy=1) rename to partition (dummy=11); > the COLUMN_STATS flags in the partition description are still true, but the column stats are > actually all deleted. > {code} > hive> describe formatted sample_pt partition (dummy=11); > OK > # col_name data_type comment > > code string > description string > salary int > total_emp int > > # Partition Information > # col_name data_type comment > > dummy int > > # Detailed Partition Information > Partition Value: [11] > Database: default > Table: sample_pt > CreateTime: Thu Mar 30 23:03:59 EDT 2017 > LastAccessTime: UNKNOWN > Location: file:/user/hive/warehouse/apache/sample_pt/dummy=11 > > Partition Parameters: > COLUMN_STATS_ACCURATE > {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}} > numFiles 1 > numRows 200 > rawDataSize 10228 > totalSize 10428 > transient_lastDdlTime 1490929439 > > # Storage Information > SerDe Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe > > InputFormat: org.apache.hadoop.mapred.TextInputFormat > OutputFormat: > org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat > Compressed: No > Num Buckets: -1 > Bucket Columns: [] > Sort Columns: [] > Storage Desc Params: > serialization.format 1 > Time taken: 6.783 seconds, Fetched: 37 row(s) > === > hive> describe formatted sample_pt partition (dummy=11) code; > OK > # col_name data_type comment > > > > code string from deserializer > > Time taken: 9.429 seconds, Fetched: 3 row(s) > {code} > The column stats should not be dropped when a partition is renamed. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16572) Rename a partition should not drop its column stats
[ https://issues.apache.org/jira/browse/HIVE-16572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-16572: --- Status: Patch Available (was: Open) > Rename a partition should not drop its column stats > --- > > Key: HIVE-16572 > URL: https://issues.apache.org/jira/browse/HIVE-16572 > Project: Hive > Issue Type: Bug > Components: Statistics >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-16572.patch > > > The column stats for the table sample_pt partition (dummy=1) are as follows: > {code} > hive> describe formatted sample_pt partition (dummy=1) code; > OK > # col_name data_type min > max num_nulls distinct_count > avg_col_len max_col_len num_trues > num_falses comment > > > code string > 0 303 6.985 > 7 > from deserializer > Time taken: 0.259 seconds, Fetched: 3 row(s) > {code} > But when this partition is renamed, say > alter table sample_pt partition (dummy=1) rename to partition (dummy=11); > the COLUMN_STATS flags in the partition description are still true, but the column stats are > actually all deleted. > {code} > hive> describe formatted sample_pt partition (dummy=11); > OK > # col_name data_type comment > > code string > description string > salary int > total_emp int > > # Partition Information > # col_name data_type comment > > dummy int > > # Detailed Partition Information > Partition Value: [11] > Database: default > Table: sample_pt > CreateTime: Thu Mar 30 23:03:59 EDT 2017 > LastAccessTime: UNKNOWN > Location: file:/user/hive/warehouse/apache/sample_pt/dummy=11 > > Partition Parameters: > COLUMN_STATS_ACCURATE > {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}} > numFiles 1 > numRows 200 > rawDataSize 10228 > totalSize 10428 > transient_lastDdlTime 1490929439 > > # Storage Information > SerDe Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe > > InputFormat: org.apache.hadoop.mapred.TextInputFormat > OutputFormat: > org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat > Compressed: No > Num Buckets: -1 > Bucket Columns: [] > Sort Columns: [] > Storage Desc Params: > serialization.format 1 > Time taken: 6.783 seconds, Fetched: 37 row(s) > === > hive> describe formatted sample_pt partition (dummy=11) code; > OK > # col_name data_type comment > > > > code string from deserializer > > Time taken: 9.429 seconds, Fetched: 3 row(s) > {code} > The column stats should not be dropped when a partition is renamed. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16572) Rename a partition should not drop its column stats
[ https://issues.apache.org/jira/browse/HIVE-16572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-16572: --- Attachment: HIVE-16572.patch The patch does the following: 1. keeps the partition column stats when a partition is renamed; 2. refactors the partition-renaming logic: we now move the partition directory before committing the HMS transaction, since that makes it easier to revert the data move if the rename fails (see the sketch after this message). > Rename a partition should not drop its column stats > --- > > Key: HIVE-16572 > URL: https://issues.apache.org/jira/browse/HIVE-16572 > Attachments: HIVE-16572.patch > > > [remainder of the issue description is identical to the copy quoted in full earlier in this digest] -- This message was sent by Atlassian JIRA (v6.3.15#6346)
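The ordering described in the comment above can be illustrated with a minimal sketch. This is not the actual HiveAlterHandler change from HIVE-16572.patch; the RawStore transaction methods and FileSystem.rename calls are real APIs, while the method itself and its error handling are a hypothetical simplification:

{code:java}
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hive.metastore.RawStore;

public class RenamePartitionSketch {
  // Move the data first, commit the metastore change last; on any failure,
  // roll back the transaction and move the data back.
  void renamePartitionData(FileSystem fs, Path oldDir, Path newDir, RawStore ms) throws Exception {
    boolean dataMoved = false;
    ms.openTransaction();
    try {
      if (!fs.rename(oldDir, newDir)) {
        throw new RuntimeException("Unable to move " + oldDir + " to " + newDir);
      }
      dataMoved = true;
      // ... update the partition metadata here, carrying over its column stats ...
      if (!ms.commitTransaction()) {
        throw new RuntimeException("Metastore commit failed");
      }
    } catch (Exception e) {
      ms.rollbackTransaction();
      if (dataMoved) {
        fs.rename(newDir, oldDir); // revert the data move
      }
      throw e;
    }
  }
}
{code}

The point of the ordering is that a filesystem move is cheap to undo, whereas a committed metastore transaction is not, so the move happens first and the commit last.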
[jira] [Assigned] (HIVE-16572) Rename a partition should not drop its column stats
[ https://issues.apache.org/jira/browse/HIVE-16572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang reassigned HIVE-16572: -- > Rename a partition should not drop its column stats > --- > > Key: HIVE-16572 > URL: https://issues.apache.org/jira/browse/HIVE-16572 > > > [remainder of the issue description is identical to the copy quoted in full earlier in this digest] -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-12188) DoAs does not work properly in non-kerberos secured HS2
[ https://issues.apache.org/jira/browse/HIVE-12188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-12188: --- Component/s: Security > DoAs does not work properly in non-kerberos secured HS2 > --- > > Key: HIVE-12188 > URL: https://issues.apache.org/jira/browse/HIVE-12188 > Project: Hive > Issue Type: Bug > Components: Security > Reporter: Chaoyu Tang > Assignee: Chaoyu Tang > Fix For: 1.3.0, 2.0.0 > > Attachments: HIVE-12188.patch > > > The case with the following settings is valid, but it still does not work correctly in the current HS2: > == > hive.server2.authentication=NONE (or LDAP) > hive.server2.enable.doAs=true > hive.metastore.sasl.enabled=true (with HMS Kerberos enabled) > == > Currently HS2 is able to fetch a delegation token from a Kerberos-secured HMS only when it is itself also Kerberos secured. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
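A minimal sketch of the decision the description implies. The property names are the real ones listed above; the helper itself is hypothetical. The idea is that whether HS2 needs an HMS delegation token for the end user should follow from doAs plus a SASL-enabled HMS, not from HS2's own front-end authentication being KERBEROS:

{code:java}
import org.apache.hadoop.conf.Configuration;

public class DoAsTokenCheckSketch {
  // Hypothetical helper: fetch an HMS delegation token for the end user
  // whenever doAs is on and the metastore requires SASL, regardless of
  // whether hive.server2.authentication is NONE, LDAP, or KERBEROS.
  static boolean needMetastoreDelegationToken(Configuration conf) {
    boolean doAsEnabled = conf.getBoolean("hive.server2.enable.doAs", true);
    boolean hmsSaslEnabled = conf.getBoolean("hive.metastore.sasl.enabled", false);
    return doAsEnabled && hmsSaslEnabled;
  }
}
{code}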
[jira] [Updated] (HIVE-12965) Insert overwrite local directory should preserve the overwritten directory permission
[ https://issues.apache.org/jira/browse/HIVE-12965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-12965: --- Component/s: Security > Insert overwrite local directory should preserve the overwritten directory > permission > - > > Key: HIVE-12965 > URL: https://issues.apache.org/jira/browse/HIVE-12965 > Project: Hive > Issue Type: Bug > Components: Security > Reporter: Chaoyu Tang > Assignee: Chaoyu Tang > Fix For: 1.3.0, 2.1.0 > > Attachments: HIVE-12965.1.patch, HIVE-12965.2.patch, HIVE-12965.3.patch, HIVE-12965.patch > > > In Hive, "insert overwrite local directory" first deletes the overwritten directory if it exists, recreates a new one, and then copies the files from the src directory to the new local directory. This process sometimes changes the permissions of the to-be-overwritten local directory, leaving some applications no longer able to access its content. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
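A sketch of one way to preserve the permission through the delete-and-recreate flow described above. The Hadoop FileSystem and FsPermission calls are real; the helper itself is hypothetical and not the actual patch:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;

public class OverwriteLocalDirSketch {
  // Remember the permission of the existing directory, recreate it, and
  // restore the permission before copying the query results in.
  static void prepareTargetDir(Path dir, Configuration conf) throws Exception {
    FileSystem fs = dir.getFileSystem(conf);
    FsPermission originalPerm = null;
    if (fs.exists(dir)) {
      originalPerm = fs.getFileStatus(dir).getPermission();
      fs.delete(dir, true);
    }
    fs.mkdirs(dir);
    if (originalPerm != null) {
      fs.setPermission(dir, originalPerm);
    }
    // ... copy result files into dir ...
  }
}
{code}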
[jira] [Updated] (HIVE-13401) Kerberized HS2 with LDAP auth enabled fails kerberos/delegation token authentication
[ https://issues.apache.org/jira/browse/HIVE-13401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-13401: --- Component/s: Security > Kerberized HS2 with LDAP auth enabled fails kerberos/delegation token > authentication > > > Key: HIVE-13401 > URL: https://issues.apache.org/jira/browse/HIVE-13401 > Project: Hive > Issue Type: Bug > Components: Authentication, Security > Reporter: Chaoyu Tang > Assignee: Chaoyu Tang > Fix For: 2.1.0 > > Attachments: HIVE-13401-branch2.0.1.patch, HIVE-13401.patch > > > When HS2 is running in a Kerberos cluster but with another SASL authentication (e.g. LDAP) enabled, it fails Kerberos/delegation token authentication. This is because the HS2 server uses the TSetIpAddressProcessor when other authentication is enabled. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-12270) Add DBTokenStore support to HS2 delegation token
[ https://issues.apache.org/jira/browse/HIVE-12270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-12270: --- Component/s: Security Authentication > Add DBTokenStore support to HS2 delegation token > > > Key: HIVE-12270 > URL: https://issues.apache.org/jira/browse/HIVE-12270 > Project: Hive > Issue Type: New Feature > Components: Authentication, Security > Reporter: Chaoyu Tang > Assignee: Chaoyu Tang > Fix For: 2.1.0 > > Attachments: HIVE-12270.1.nothrift.patch, HIVE-12270.1.patch, HIVE-12270.2.patch, HIVE-12270.3.nothrift.patch, HIVE-12270.3.patch, HIVE-12270.nothrift.patch > > > DBTokenStore was initially introduced by HIVE-3255 in Hive 0.12, mainly for the HMS delegation token. Later, in Hive 0.13, HS2 delegation token support was introduced by HIVE-5155, but it used MemoryTokenStore as the token store. HIVE-9622's approach of using the shared RawStore (or HMSHandler) to access the token/key information in the HMS DB directly from HS2 does not seem to be the right way to support DBTokenStore in HS2; I think we should use HiveMetaStoreClient in HS2 instead. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
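For reference, a hedged configuration sketch: to the best of my knowledge, hive.cluster.delegation.token.store.class is the real switch for the delegation token store and org.apache.hadoop.hive.thrift.DBTokenStore is its DB-backed implementation in the Hive 1.x/2.x lines; it is shown here through the plain Configuration API for illustration only, not as the exact setup this feature requires:

{code:java}
import org.apache.hadoop.conf.Configuration;

public class TokenStoreConfigSketch {
  static void useDbTokenStore(Configuration conf) {
    // Persist delegation tokens in the backing database instead of memory,
    // so they survive restarts and can be shared across HA instances.
    conf.set("hive.cluster.delegation.token.store.class",
        "org.apache.hadoop.hive.thrift.DBTokenStore");
  }
}
{code}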
[jira] [Updated] (HIVE-14697) Can not access kerberized HS2 Web UI
[ https://issues.apache.org/jira/browse/HIVE-14697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-14697: --- Component/s: Security > Can not access kerberized HS2 Web UI > > > Key: HIVE-14697 > URL: https://issues.apache.org/jira/browse/HIVE-14697 > Project: Hive > Issue Type: Bug > Components: Security, Web UI > Affects Versions: 2.1.0 > Reporter: Chaoyu Tang > Assignee: Chaoyu Tang > Fix For: 2.1.1, 2.2.0 > > Attachments: HIVE-14697.patch > > > Failed to access the kerberized HS2 Web UI with the following error message: > {code} > curl -v -u : --negotiate http://util185.phx2.cbsig.net:10002/ > > GET / HTTP/1.1 > > Host: util185.phx2.cbsig.net:10002 > > Authorization: Negotiate YIIU7...[redacted]... > > User-Agent: curl/7.42.1 > > Accept: */* > > > < HTTP/1.1 413 FULL head > < Content-Length: 0 > < Connection: close > < Server: Jetty(7.6.0.v20120127) > {code} > This is because Jetty's default request header size (4K) is too small in some Kerberos cases, so this patch increases the request header size to 64K. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
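A hedged sketch of the kind of change described, written against the Jetty 7 API suggested by the log output above (Jetty 7.6.0); the actual patch wires this into the HS2 web server, and the connector class used here is an assumption:

{code:java}
import org.eclipse.jetty.server.Server;
import org.eclipse.jetty.server.nio.SelectChannelConnector;

public class WebUiHeaderSizeSketch {
  public static void main(String[] args) throws Exception {
    Server server = new Server();
    SelectChannelConnector connector = new SelectChannelConnector();
    connector.setPort(10002);
    // Jetty's 4K default is too small for large SPNEGO "Authorization:
    // Negotiate ..." headers; raise it to 64K per the patch description.
    connector.setRequestHeaderSize(64 * 1024);
    server.addConnector(connector);
    server.start();
    server.join();
  }
}
{code}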
[jira] [Updated] (HIVE-14359) Hive on Spark might fail in HS2 with LDAP authentication in a kerberized cluster
[ https://issues.apache.org/jira/browse/HIVE-14359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-14359: --- Component/s: Spark > Hive on Spark might fail in HS2 with LDAP authentication in a kerberized > cluster > > > Key: HIVE-14359 > URL: https://issues.apache.org/jira/browse/HIVE-14359 > Project: Hive > Issue Type: Bug > Components: Spark > Reporter: Chaoyu Tang > Assignee: Chaoyu Tang > Fix For: 2.1.1, 2.2.0 > > Attachments: HIVE-14359.patch > > > When HS2 is used as a gateway for LDAP users to access and run queries in a kerberized cluster, its authentication mode is configured as LDAP, and in this case HoS might fail for the same reason as HIVE-10594. hive.server2.authentication is not a proper property for determining whether a cluster is kerberized; hadoop.security.authentication should be used instead. The failure occurs when the Spark client communicates with the rest of Hadoop, since it assumes Kerberos does not need to be used. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
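A small sketch contrasting the two checks. UserGroupInformation.isSecurityEnabled() is the real Hadoop call that reads hadoop.security.authentication; the helper names are hypothetical:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;

public class KerberosCheckSketch {
  // Right question: is the Hadoop cluster itself kerberized?
  // UserGroupInformation derives this from hadoop.security.authentication.
  static boolean clusterIsKerberized() {
    return UserGroupInformation.isSecurityEnabled();
  }

  // Misleading question: HS2 may authenticate its own clients with LDAP
  // while the cluster underneath is still kerberized.
  static boolean hs2FrontEndIsKerberos(Configuration conf) {
    return "KERBEROS".equalsIgnoreCase(conf.get("hive.server2.authentication", "NONE"));
  }
}
{code}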
[jira] [Updated] (HIVE-15653) Some ALTER TABLE commands drop table stats
[ https://issues.apache.org/jira/browse/HIVE-15653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-15653: --- Component/s: Statistics > Some ALTER TABLE commands drop table stats > -- > > Key: HIVE-15653 > URL: https://issues.apache.org/jira/browse/HIVE-15653 > Project: Hive > Issue Type: Bug > Components: Metastore, Statistics > Affects Versions: 1.1.0 > Reporter: Alexander Behm > Assignee: Chaoyu Tang > Priority: Critical > Fix For: 2.2.0 > > Attachments: HIVE-15653.1.patch, HIVE-15653.2.patch, HIVE-15653.3.patch, HIVE-15653.4.patch, HIVE-15653.5.patch, HIVE-15653.6.patch, HIVE-15653.patch > > > Some ALTER TABLE commands drop the table stats. That may make sense for some ALTER TABLE operations, but certainly not for others. Personally, I think ALTER TABLE should only change what was requested by the user, without any side effects that may be unclear to users. In particular, collecting stats can be an expensive operation, so it is rather inconvenient for users if stats get wiped accidentally. > Repro: > {code} > create table t (i int); > insert into t values(1); > analyze table t compute statistics; > alter table t set tblproperties('test'='test'); > hive> describe formatted t; > OK > # col_name  data_type  comment > i  int > > # Detailed Table Information > Database: default > Owner: abehm > CreateTime: Tue Jan 17 18:13:34 PST 2017 > LastAccessTime: UNKNOWN > Protect Mode: None > Retention: 0 > Location: hdfs://localhost:20500/test-warehouse/t > Table Type: MANAGED_TABLE > Table Parameters: > COLUMN_STATS_ACCURATE false > last_modified_by abehm > last_modified_time 1484705748 > numFiles 1 > numRows -1 > rawDataSize -1 > test test > totalSize 2 > transient_lastDdlTime 1484705748 > > # Storage Information > SerDe Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe > InputFormat: org.apache.hadoop.mapred.TextInputFormat > OutputFormat: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat > Compressed: No > Num Buckets: -1 > Bucket Columns: [] > Sort Columns: [] > Storage Desc Params: > serialization.format 1 > Time taken: 0.169 seconds, Fetched: 34 row(s) > {code} > The same behavior can be observed with several other ALTER TABLE commands. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15485) Investigate the DoAs failure in HoS
[ https://issues.apache.org/jira/browse/HIVE-15485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-15485: --- Component/s: Spark > Investigate the DoAs failure in HoS > --- > > Key: HIVE-15485 > URL: https://issues.apache.org/jira/browse/HIVE-15485 > Project: Hive > Issue Type: Bug > Components: Spark > Reporter: Chaoyu Tang > Assignee: Chaoyu Tang > Fix For: 2.2.0 > > Attachments: HIVE-15485.1.patch, HIVE-15485.2.patch, HIVE-15485.patch > > > With DoAs enabled, HoS failed with the following errors: > {code} > Exception in thread "main" org.apache.hadoop.security.AccessControlException: systest tries to renew a token with renewer hive > at org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.renewToken(AbstractDelegationTokenSecretManager.java:484) > at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.renewDelegationToken(FSNamesystem.java:7543) > at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.renewDelegationToken(NameNodeRpcServer.java:555) > at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.renewDelegationToken(AuthorizationProviderProxyClientProtocol.java:674) > at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.renewDelegationToken(ClientNamenodeProtocolServerSideTranslatorPB.java:999) > at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2141) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2137) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1783) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2135) > {code} > It is related to the change from HIVE-14383. It looks like SparkSubmit logs in to Kerberos with the passed-in hive principal/keytab and then tries to create an HDFS delegation token for user systest with renewer hive. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
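A hedged sketch of the renewer mechanics behind the stack trace above. FileSystem.addDelegationTokens(renewer, credentials) is the real Hadoop API; the wrapper is hypothetical, and the user names come from the report:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.security.Credentials;

public class DelegationTokenSketch {
  // An HDFS delegation token names a renewer when it is obtained; only that
  // principal may renew it later. Obtaining tokens with renewer "hive" while
  // the renewal is later attempted as "systest" yields the
  // AccessControlException shown in the description.
  static Credentials obtainTokens(Configuration conf, String renewer) throws Exception {
    FileSystem fs = FileSystem.get(conf);
    Credentials creds = new Credentials();
    fs.addDelegationTokens(renewer, creds);
    return creds;
  }
}
{code}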
[jira] [Updated] (HIVE-16189) Table column stats might be invalidated in a failed table rename
[ https://issues.apache.org/jira/browse/HIVE-16189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-16189: --- Component/s: Statistics > Table column stats might be invalidated in a failed table rename > > > Key: HIVE-16189 > URL: https://issues.apache.org/jira/browse/HIVE-16189 > Project: Hive > Issue Type: Bug > Components: Statistics > Reporter: Chaoyu Tang > Assignee: Chaoyu Tang > Fix For: 2.2.0 > > Attachments: HIVE-16189.1.patch, HIVE-16189.2.patch, HIVE-16189.2.patch, HIVE-16189.patch > > > If a table rename does not succeed because it fails to move the data to the renamed table's folder, the changes in TAB_COL_STATS are not rolled back, which leads to invalid column stats. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16147) Rename a partitioned table should not drop its partition columns stats
[ https://issues.apache.org/jira/browse/HIVE-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-16147: --- Component/s: Statistics > Rename a partitioned table should not drop its partition columns stats > -- > > Key: HIVE-16147 > URL: https://issues.apache.org/jira/browse/HIVE-16147 > Project: Hive > Issue Type: Bug > Components: Statistics > Reporter: Chaoyu Tang > Assignee: Chaoyu Tang > Fix For: 3.0.0, 2.4.0 > > Attachments: HIVE-16147.1.patch, HIVE-16147.patch, HIVE-16147.patch > > > When a partitioned table (e.g. sample_pt) is renamed (e.g. to sample_pt_rename), describing its partition shows that the partition column stats are still accurate, but actually they have all been dropped. > It can be reproduced as follows: > 1. analyze table sample_pt compute statistics for columns; > 2. describe formatted default.sample_pt partition (dummy = 3): COLUMN_STATS for all columns are true > {code} > ... > # Detailed Partition Information > Partition Value: [3] > Database: default > Table: sample_pt > CreateTime: Fri Jan 20 15:42:30 EST 2017 > LastAccessTime: UNKNOWN > Location: file:/user/hive/warehouse/apache/sample_pt/dummy=3 > Partition Parameters: > COLUMN_STATS_ACCURATE {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}} > last_modified_by ctang > last_modified_time 1485217063 > numFiles 1 > numRows 100 > rawDataSize 5143 > totalSize 5243 > transient_lastDdlTime 1488842358 > ... > {code} > 3. describe formatted default.sample_pt partition (dummy = 3) salary: column stats exist > {code} > # col_name  data_type  min  max  num_nulls  distinct_count  avg_col_len  max_col_len  num_trues  num_falses  comment > salary  int  1  151370  0  94      from deserializer > {code} > 4. alter table sample_pt rename to sample_pt_rename; > 5. describe formatted default.sample_pt_rename partition (dummy = 3): describing the renamed table's partition (dummy = 3) shows that COLUMN_STATS for the columns are still true. > {code} > # Detailed Partition Information > Partition Value: [3] > Database: default > Table: sample_pt_rename > CreateTime: Fri Jan 20 15:42:30 EST 2017 > LastAccessTime: UNKNOWN > Location: file:/user/hive/warehouse/apache/sample_pt_rename/dummy=3 > Partition Parameters: > COLUMN_STATS_ACCURATE {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}} > last_modified_by ctang > last_modified_time 1485217063 > numFiles 1 > numRows 100 > rawDataSize 5143 > totalSize 5243 > transient_lastDdlTime 1488842358 > {code} > describe formatted default.sample_pt_rename partition (dummy = 3) salary: the column stats have been dropped. > {code} > # col_name  data_type  comment > salary  int  from deserializer > Time taken: 0.131 seconds, Fetched: 3 row(s) > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16394) HoS does not support queue name change in middle of session
[ https://issues.apache.org/jira/browse/HIVE-16394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-16394: --- Component/s: Spark > HoS does not support queue name change in middle of session > --- > > Key: HIVE-16394 > URL: https://issues.apache.org/jira/browse/HIVE-16394 > Project: Hive > Issue Type: Bug > Components: Spark > Reporter: Chaoyu Tang > Assignee: Chaoyu Tang > Fix For: 3.0.0 > > Attachments: HIVE-16394.patch > > > mapreduce.job.queuename only takes effect when HoS executes its first query. After that, changing mapreduce.job.queuename does not change the YARN scheduler queue that the queries run in. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Comment Edited] (HIVE-11064) ALTER TABLE CASCADE ERROR unbalanced calls to openTransaction/commitTransaction
[ https://issues.apache.org/jira/browse/HIVE-11064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15993588#comment-15993588 ] Chaoyu Tang edited comment on HIVE-11064 at 5/2/17 7:44 PM: I was not able to reproduce this issue in Hive 3.0.0, though I could reproduce it in CDH5.4.0, but only when hive.metastore.try.direct.sql is disabled. was (Author: ctang.ma): I was not able to reproduce this issue in Hive 3.0.0, it was probably fixed as the side effect of HIVE-16147. But I could reproduce this issue in CDH5.4.0 but only when hive.metastore.try.direct.sql is disabled. > ALTER TABLE CASCADE ERROR unbalanced calls to > openTransaction/commitTransaction > --- > > Key: HIVE-11064 > URL: https://issues.apache.org/jira/browse/HIVE-11064 > Project: Hive > Issue Type: Bug > Components: Metastore > Affects Versions: 1.1.0 > Environment: CDH5.4.0 > Reporter: fatkun > Assignee: Chaoyu Tang > > My Hive version is hive-1.1.0-cdh5.4.0. > Follow these steps in the Hive client and the exception is thrown: > {code} > CREATE TABLE test1 (name string) PARTITIONED BY (pt string); > ALTER TABLE test1 ADD PARTITION (pt='1'); > ALTER TABLE test1 CHANGE name name1 string; > ALTER TABLE test1 CHANGE name1 name string cascade; > {code} > FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Unable to alter table. java.lang.RuntimeException: commitTransaction was called but openTransactionCalls = 0. This probably indicates that there are unbalanced calls to openTransaction/commitTransaction > > Metastore log: > {quote} > MetaException(message:java.lang.RuntimeException: commitTransaction was called but openTransactionCalls = 0. This probably indicates that there are unbalanced calls to openTransaction/commitTransaction) > at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:5257) > at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_core(HiveMetaStore.java:3338) > at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_with_cascade(HiveMetaStore.java:3290) > at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source) > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:102) > at com.sun.proxy.$Proxy5.alter_table_with_cascade(Unknown Source) > at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_table_with_cascade.getResult(ThriftHiveMetastore.java:9131) > at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_table_with_cascade.getResult(ThriftHiveMetastore.java:9115) > at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) > at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110) > at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) > at org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118) > at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285) > at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.RuntimeException: commitTransaction was called but openTransactionCalls = 0. This probably indicates that there are unbalanced calls to openTransaction/commitTransaction > at org.apache.hadoop.hive.metastore.ObjectStore.commitTransaction(ObjectStore.java:448) > at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source) > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:98) > at com.sun.proxy.$Proxy0.commitTransaction(Unknown Source) > at org.apache.hadoop.hive.metastore.HiveAlterHandler.alterTable(HiveAlterHandler.java:242) > at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_core(HiveMetaStore.java:3318) > ... 19 more > {quote} > I debugged the code; the method "private void updatePartColumnStatsForAlterColumns" may be wrong. Some transaction is rolled back, but I don't know the exact error. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
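The RuntimeException above is about pairing metastore transaction calls. A minimal sketch of the balanced pattern, using the real RawStore transaction methods with a hypothetical body; this illustrates the invariant, not the actual fix:

{code:java}
import org.apache.hadoop.hive.metastore.RawStore;

public class BalancedTxnSketch {
  // Every openTransaction() must be paired with exactly one commitTransaction()
  // or rollbackTransaction(), including on the error path; otherwise the
  // metastore's open-transaction counter drifts and a later commit throws the
  // "unbalanced calls to openTransaction/commitTransaction" RuntimeException.
  static void alterWithBalancedTxn(RawStore ms) {
    boolean committed = false;
    try {
      ms.openTransaction();
      // ... update the table, its partitions, and their column stats ...
      committed = ms.commitTransaction();
    } finally {
      if (!committed) {
        ms.rollbackTransaction();
      }
    }
  }
}
{code}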
[jira] [Comment Edited] (HIVE-11064) ALTER TABLE CASCADE ERROR unbalanced calls to openTransaction/commitTransaction
[ https://issues.apache.org/jira/browse/HIVE-11064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15993588#comment-15993588 ] Chaoyu Tang edited comment on HIVE-11064 at 5/2/17 7:42 PM: I was not able to reproduce this issue in Hive 3.0.0, it was probably fixed as the side effect of HIVE-16147. But I could reproduce this issue in CDH5.4.0 but only when hive.metastore.try.direct.sql is disabled. was (Author: ctang.ma): I was not able to reproduce this issue in HIVE-3.0.0, it was probably fixed as the side effect of HIVE-16147. But I could reproduce this issue in CDH5.4.0 but only when hive.metastore.try.direct.sql is disabled. > ALTER TABLE CASCADE ERROR unbalanced calls to > openTransaction/commitTransaction > --- > > Key: HIVE-11064 > URL: https://issues.apache.org/jira/browse/HIVE-11064 > > > [remainder of the issue description and stack trace is identical to the copy quoted in full earlier in this digest] -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-11064) ALTER TABLE CASCADE ERROR unbalanced calls to openTransaction/commitTransaction
[ https://issues.apache.org/jira/browse/HIVE-11064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15993588#comment-15993588 ] Chaoyu Tang commented on HIVE-11064: I was not able to reproduce this issue in Hive 3.0.0; it was probably fixed as a side effect of HIVE-16147. But I could reproduce this issue in CDH5.4.0, though only when hive.metastore.try.direct.sql is disabled. > ALTER TABLE CASCADE ERROR unbalanced calls to > openTransaction/commitTransaction > --- > > Key: HIVE-11064 > URL: https://issues.apache.org/jira/browse/HIVE-11064 > > > [remainder of the issue description and stack trace is identical to the copy quoted in full earlier in this digest] -- This message was sent by Atlassian JIRA
[jira] [Assigned] (HIVE-11064) ALTER TABLE CASCADE ERROR unbalanced calls to openTransaction/commitTransaction
[ https://issues.apache.org/jira/browse/HIVE-11064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang reassigned HIVE-11064: -- Assignee: Chaoyu Tang > ALTER TABLE CASCADE ERROR unbalanced calls to > openTransaction/commitTransaction > --- > > Key: HIVE-11064 > URL: https://issues.apache.org/jira/browse/HIVE-11064 > > > [remainder of the issue description and stack trace is identical to the copy quoted in full earlier in this digest] -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16147) Rename a partitioned table should not drop its partition columns stats
[ https://issues.apache.org/jira/browse/HIVE-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-16147: --- Resolution: Fixed Fix Version/s: 2.4.0 3.0.0 Status: Resolved (was: Patch Available) Committed to 2.4.0 and 3.0.0. Thanks [~pxiong] for the review. > Rename a partitioned table should not drop its partition columns stats > -- > > Key: HIVE-16147 > URL: https://issues.apache.org/jira/browse/HIVE-16147 > Attachments: HIVE-16147.1.patch, HIVE-16147.patch, HIVE-16147.patch > > > [remainder of the issue description is identical to the copy quoted in full earlier in this digest] -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16487) Serious Zookeeper exception is logged when a race condition happens
[ https://issues.apache.org/jira/browse/HIVE-16487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-16487: --- Resolution: Fixed Fix Version/s: 2.4.0 3.0.0 Status: Resolved (was: Patch Available) Committed to 3.0.0 and 2.4.0. Thanks [~pvary] for the patch. > Serious Zookeeper exception is logged when a race condition happens > --- > > Key: HIVE-16487 > URL: https://issues.apache.org/jira/browse/HIVE-16487 > Project: Hive > Issue Type: Bug > Components: Locking > Affects Versions: 3.0.0 > Reporter: Peter Vary > Assignee: Peter Vary > Fix For: 3.0.0, 2.4.0 > > Attachments: HIVE-16487.02.patch, HIVE-16487.patch > > > A customer started to see this in the logs, though happily everything was working as intended: > {code} > 2017-03-30 12:01:59,446 ERROR ZooKeeperHiveLockManager: [HiveServer2-Background-Pool: Thread-620]: Serious Zookeeper exception: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /hive_zookeeper_namespace//LOCK-SHARED- > {code} > This was happening because of a race condition between lock release and lock acquisition: the thread releasing the lock removes the parent ZK node just after the thread acquiring the lock has made sure that the parent node exists. > Since this can happen without any real problem, I plan to add NODEEXISTS and NONODE as transient ZooKeeper exceptions, so that users are not confused. > Also, the original author of ZooKeeperHiveLockManager may have planned to handle different ZooKeeperExceptions differently, and the code is hard to understand. See the {{continue}} and the {{break}}: the {{break}} only breaks the switch, not the loop, which IMHO is not intuitive: > {code} > do { > try { > [..] > ret = lockPrimitive(key, mode, keepAlive, parentCreated, > } catch (Exception e1) { > if (e1 instanceof KeeperException) { > KeeperException e = (KeeperException) e1; > switch (e.code()) { > case CONNECTIONLOSS: > case OPERATIONTIMEOUT: > LOG.debug("Possibly transient ZooKeeper exception: ", e); > continue; > default: > LOG.error("Serious Zookeeper exception: ", e); > break; > } > } > [..] > } > } while (tryNum < numRetriesForLock); > {code} > If we do not want to try again in case of a "Serious Zookeeper exception:", then we should add a label to the do loop and break it in the switch. > If we do want to retry regardless of the type of the ZK exception, then we should just change the {{continue;}} to {{break;}} and move the lines which did not run in case of {{continue}} into the {{default}} case, so the code is easier to understand. > Any suggestions or ideas [~ctang.ma] or [~szehon]? -- This message was sent by Atlassian JIRA (v6.3.15#6346)
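To make the control-flow discussion concrete, here is a sketch of the labeled-loop variant proposed in the description, with NONODE and NODEEXISTS treated as transient; this illustrates the idea and is not the committed HIVE-16487 patch, and tryLockOnce() is a hypothetical stand-in for the real lock primitive:

{code:java}
import org.apache.zookeeper.KeeperException;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class LockRetrySketch {
  private static final Logger LOG = LoggerFactory.getLogger(LockRetrySketch.class);

  Object lockWithRetries(int numRetriesForLock) throws Exception {
    int tryNum = 0;
    retry:
    do {
      tryNum++;
      try {
        return tryLockOnce();
      } catch (KeeperException e) {
        switch (e.code()) {
          case CONNECTIONLOSS:
          case OPERATIONTIMEOUT:
          case NONODE:      // transient per the proposal: parent deleted by a releasing thread
          case NODEEXISTS:  // transient: parent re-created concurrently
            LOG.debug("Possibly transient ZooKeeper exception: ", e);
            continue;       // retry
          default:
            LOG.error("Serious Zookeeper exception: ", e);
            break retry;    // labeled break leaves the loop, not just the switch
        }
      }
    } while (tryNum < numRetriesForLock);
    return null;
  }

  private Object tryLockOnce() throws KeeperException {
    // ... ensure the parent znode exists, then create the sequential lock node ...
    return new Object();
  }
}
{code}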
[jira] [Commented] (HIVE-16147) Rename a partitioned table should not drop its partition columns stats
[ https://issues.apache.org/jira/browse/HIVE-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15989269#comment-15989269 ] Chaoyu Tang commented on HIVE-16147: [~pxiong] Thanks for looking into this. Yeah, I made some changes to fix the test failures and also optimized the code a little. I have uploaded the second patch to RB and am requesting review. > Rename a partitioned table should not drop its partition columns stats > -- > > Key: HIVE-16147 > URL: https://issues.apache.org/jira/browse/HIVE-16147 > Attachments: HIVE-16147.1.patch, HIVE-16147.patch, HIVE-16147.patch > > > [remainder of the issue description is identical to the copy quoted in full earlier in this digest] -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16147) Rename a partitioned table should not drop its partition columns stats
[ https://issues.apache.org/jira/browse/HIVE-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15989168#comment-15989168 ] Chaoyu Tang commented on HIVE-16147: The single test failure is not related to this patch. [~pxiong], could you review the patch? Thanks > Rename a partitioned table should not drop its partition columns stats > -- > > Key: HIVE-16147 > URL: https://issues.apache.org/jira/browse/HIVE-16147 > Attachments: HIVE-16147.1.patch, HIVE-16147.patch, HIVE-16147.patch > > > [remainder of the issue description is identical to the copy quoted in full earlier in this digest] -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16487) Serious Zookeeper exception is logged when a race condition happens
[ https://issues.apache.org/jira/browse/HIVE-16487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15988954#comment-15988954 ] Chaoyu Tang commented on HIVE-16487: LGTM, +1 pending tests. > Serious Zookeeper exception is logged when a race condition happens > --- > > Key: HIVE-16487 > URL: https://issues.apache.org/jira/browse/HIVE-16487 > Attachments: HIVE-16487.02.patch, HIVE-16487.patch > > > [remainder of the issue description is identical to the copy quoted in full earlier in this digest] -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16147) Rename a partitioned table should not drop its partition columns stats
[ https://issues.apache.org/jira/browse/HIVE-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15987783#comment-15987783 ] Chaoyu Tang commented on HIVE-16147: The test failures are not related to the patch. [~pxiong], could you help review it again? Thanks > Rename a partitioned table should not drop its partition columns stats > -- > > Key: HIVE-16147 > URL: https://issues.apache.org/jira/browse/HIVE-16147 > Attachments: HIVE-16147.1.patch, HIVE-16147.patch, HIVE-16147.patch > > > [remainder of the issue description is identical to the copy quoted in full earlier in this digest] -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16147) Rename a partitioned table should not drop its partition columns stats
[ https://issues.apache.org/jira/browse/HIVE-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-16147: --- Attachment: HIVE-16147.1.patch Fixed the failed tests. > Rename a partitioned table should not drop its partition columns stats > -- > > Key: HIVE-16147 > URL: https://issues.apache.org/jira/browse/HIVE-16147 > Project: Hive > Issue Type: Bug >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-16147.1.patch, HIVE-16147.patch, HIVE-16147.patch > > > When a partitioned table (e.g. sample_pt) is renamed (e.g to > sample_pt_rename), describing its partition shows that the partition column > stats are still accurate, but actually they all have been dropped. > It could be reproduce as following: > 1. analyze table sample_pt compute statistics for columns; > 2. describe formatted default.sample_pt partition (dummy = 3): COLUMN_STATS > for all columns are true > {code} > ... > # Detailed Partition Information > Partition Value: [3] > Database: default > Table:sample_pt > CreateTime: Fri Jan 20 15:42:30 EST 2017 > LastAccessTime: UNKNOWN > Location: file:/user/hive/warehouse/apache/sample_pt/dummy=3 > Partition Parameters: > COLUMN_STATS_ACCURATE > {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}} > last_modified_byctang > last_modified_time 1485217063 > numFiles1 > numRows 100 > rawDataSize 5143 > totalSize 5243 > transient_lastDdlTime 1488842358 > ... > {code} > 3: describe formatted default.sample_pt partition (dummy = 3) salary: column > stats exists > {code} > # col_namedata_type min > max num_nulls distinct_count > avg_col_len max_col_len num_trues > num_falses comment > > > salaryint 1 151370 > 0 94 > > from deserializer > {code} > 4. alter table sample_pt rename to sample_pt_rename; > 5. describe formatted default.sample_pt_rename partition (dummy = 3): > describe the rename table partition (dummy =3) shows that COLUMN_STATS for > columns are still true. > {code} > # Detailed Partition Information > Partition Value: [3] > Database: default > Table:sample_pt_rename > CreateTime: Fri Jan 20 15:42:30 EST 2017 > LastAccessTime: UNKNOWN > Location: > file:/user/hive/warehouse/apache/sample_pt_rename/dummy=3 > Partition Parameters: > COLUMN_STATS_ACCURATE > {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}} > last_modified_byctang > last_modified_time 1485217063 > numFiles1 > numRows 100 > rawDataSize 5143 > totalSize 5243 > transient_lastDdlTime 1488842358 > {code} > describe formatted default.sample_pt_rename partition (dummy = 3) salary: the > column stats have been dropped. > {code} > # col_namedata_type comment > > > > salaryint from deserializer > > Time taken: 0.131 seconds, Fetched: 3 row(s) > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16147) Rename a partitioned table should not drop its partition columns stats
[ https://issues.apache.org/jira/browse/HIVE-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-16147: --- Attachment: HIVE-16147.patch > Rename a partitioned table should not drop its partition columns stats > -- > > Key: HIVE-16147 > URL: https://issues.apache.org/jira/browse/HIVE-16147 > Project: Hive > Issue Type: Bug >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-16147.patch, HIVE-16147.patch > > > When a partitioned table (e.g. sample_pt) is renamed (e.g to > sample_pt_rename), describing its partition shows that the partition column > stats are still accurate, but actually they all have been dropped. > It could be reproduce as following: > 1. analyze table sample_pt compute statistics for columns; > 2. describe formatted default.sample_pt partition (dummy = 3): COLUMN_STATS > for all columns are true > {code} > ... > # Detailed Partition Information > Partition Value: [3] > Database: default > Table:sample_pt > CreateTime: Fri Jan 20 15:42:30 EST 2017 > LastAccessTime: UNKNOWN > Location: file:/user/hive/warehouse/apache/sample_pt/dummy=3 > Partition Parameters: > COLUMN_STATS_ACCURATE > {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}} > last_modified_byctang > last_modified_time 1485217063 > numFiles1 > numRows 100 > rawDataSize 5143 > totalSize 5243 > transient_lastDdlTime 1488842358 > ... > {code} > 3: describe formatted default.sample_pt partition (dummy = 3) salary: column > stats exists > {code} > # col_namedata_type min > max num_nulls distinct_count > avg_col_len max_col_len num_trues > num_falses comment > > > salaryint 1 151370 > 0 94 > > from deserializer > {code} > 4. alter table sample_pt rename to sample_pt_rename; > 5. describe formatted default.sample_pt_rename partition (dummy = 3): > describe the rename table partition (dummy =3) shows that COLUMN_STATS for > columns are still true. > {code} > # Detailed Partition Information > Partition Value: [3] > Database: default > Table:sample_pt_rename > CreateTime: Fri Jan 20 15:42:30 EST 2017 > LastAccessTime: UNKNOWN > Location: > file:/user/hive/warehouse/apache/sample_pt_rename/dummy=3 > Partition Parameters: > COLUMN_STATS_ACCURATE > {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}} > last_modified_byctang > last_modified_time 1485217063 > numFiles1 > numRows 100 > rawDataSize 5143 > totalSize 5243 > transient_lastDdlTime 1488842358 > {code} > describe formatted default.sample_pt_rename partition (dummy = 3) salary: the > column stats have been dropped. > {code} > # col_namedata_type comment > > > > salaryint from deserializer > > Time taken: 0.131 seconds, Fetched: 3 row(s) > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16147) Rename a partitioned table should not drop its partition columns stats
[ https://issues.apache.org/jira/browse/HIVE-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-16147: --- Attachment: (was: HIVE-16147.patch) > Rename a partitioned table should not drop its partition columns stats > -- > > Key: HIVE-16147 > URL: https://issues.apache.org/jira/browse/HIVE-16147 > Project: Hive > Issue Type: Bug >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-16147.patch, HIVE-16147.patch > > > When a partitioned table (e.g. sample_pt) is renamed (e.g to > sample_pt_rename), describing its partition shows that the partition column > stats are still accurate, but actually they all have been dropped. > It could be reproduce as following: > 1. analyze table sample_pt compute statistics for columns; > 2. describe formatted default.sample_pt partition (dummy = 3): COLUMN_STATS > for all columns are true > {code} > ... > # Detailed Partition Information > Partition Value: [3] > Database: default > Table:sample_pt > CreateTime: Fri Jan 20 15:42:30 EST 2017 > LastAccessTime: UNKNOWN > Location: file:/user/hive/warehouse/apache/sample_pt/dummy=3 > Partition Parameters: > COLUMN_STATS_ACCURATE > {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}} > last_modified_byctang > last_modified_time 1485217063 > numFiles1 > numRows 100 > rawDataSize 5143 > totalSize 5243 > transient_lastDdlTime 1488842358 > ... > {code} > 3: describe formatted default.sample_pt partition (dummy = 3) salary: column > stats exists > {code} > # col_namedata_type min > max num_nulls distinct_count > avg_col_len max_col_len num_trues > num_falses comment > > > salaryint 1 151370 > 0 94 > > from deserializer > {code} > 4. alter table sample_pt rename to sample_pt_rename; > 5. describe formatted default.sample_pt_rename partition (dummy = 3): > describe the rename table partition (dummy =3) shows that COLUMN_STATS for > columns are still true. > {code} > # Detailed Partition Information > Partition Value: [3] > Database: default > Table:sample_pt_rename > CreateTime: Fri Jan 20 15:42:30 EST 2017 > LastAccessTime: UNKNOWN > Location: > file:/user/hive/warehouse/apache/sample_pt_rename/dummy=3 > Partition Parameters: > COLUMN_STATS_ACCURATE > {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}} > last_modified_byctang > last_modified_time 1485217063 > numFiles1 > numRows 100 > rawDataSize 5143 > totalSize 5243 > transient_lastDdlTime 1488842358 > {code} > describe formatted default.sample_pt_rename partition (dummy = 3) salary: the > column stats have been dropped. > {code} > # col_namedata_type comment > > > > salaryint from deserializer > > Time taken: 0.131 seconds, Fetched: 3 row(s) > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16147) Rename a partitioned table should not drop its partition columns stats
[ https://issues.apache.org/jira/browse/HIVE-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-16147: --- Attachment: HIVE-16147.patch Reattach the patch to kick off precommit test. > Rename a partitioned table should not drop its partition columns stats > -- > > Key: HIVE-16147 > URL: https://issues.apache.org/jira/browse/HIVE-16147 > Project: Hive > Issue Type: Bug >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-16147.patch, HIVE-16147.patch > > > When a partitioned table (e.g. sample_pt) is renamed (e.g to > sample_pt_rename), describing its partition shows that the partition column > stats are still accurate, but actually they all have been dropped. > It could be reproduce as following: > 1. analyze table sample_pt compute statistics for columns; > 2. describe formatted default.sample_pt partition (dummy = 3): COLUMN_STATS > for all columns are true > {code} > ... > # Detailed Partition Information > Partition Value: [3] > Database: default > Table:sample_pt > CreateTime: Fri Jan 20 15:42:30 EST 2017 > LastAccessTime: UNKNOWN > Location: file:/user/hive/warehouse/apache/sample_pt/dummy=3 > Partition Parameters: > COLUMN_STATS_ACCURATE > {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}} > last_modified_byctang > last_modified_time 1485217063 > numFiles1 > numRows 100 > rawDataSize 5143 > totalSize 5243 > transient_lastDdlTime 1488842358 > ... > {code} > 3: describe formatted default.sample_pt partition (dummy = 3) salary: column > stats exists > {code} > # col_namedata_type min > max num_nulls distinct_count > avg_col_len max_col_len num_trues > num_falses comment > > > salaryint 1 151370 > 0 94 > > from deserializer > {code} > 4. alter table sample_pt rename to sample_pt_rename; > 5. describe formatted default.sample_pt_rename partition (dummy = 3): > describe the rename table partition (dummy =3) shows that COLUMN_STATS for > columns are still true. > {code} > # Detailed Partition Information > Partition Value: [3] > Database: default > Table:sample_pt_rename > CreateTime: Fri Jan 20 15:42:30 EST 2017 > LastAccessTime: UNKNOWN > Location: > file:/user/hive/warehouse/apache/sample_pt_rename/dummy=3 > Partition Parameters: > COLUMN_STATS_ACCURATE > {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}} > last_modified_byctang > last_modified_time 1485217063 > numFiles1 > numRows 100 > rawDataSize 5143 > totalSize 5243 > transient_lastDdlTime 1488842358 > {code} > describe formatted default.sample_pt_rename partition (dummy = 3) salary: the > column stats have been dropped. > {code} > # col_namedata_type comment > > > > salaryint from deserializer > > Time taken: 0.131 seconds, Fetched: 3 row(s) > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16487) Serious Zookeeper exception is logged when a race condition happens
[ https://issues.apache.org/jira/browse/HIVE-16487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15983128#comment-15983128 ] Chaoyu Tang commented on HIVE-16487: [~pvary] I think that with this patch, if the Exception e1 is not an instance of KeeperException, it will be swallowed after numRetriesForLock retries. > Serious Zookeeper exception is logged when a race condition happens > --- > > Key: HIVE-16487 > URL: https://issues.apache.org/jira/browse/HIVE-16487 > Project: Hive > Issue Type: Bug > Components: Locking >Affects Versions: 3.0.0 >Reporter: Peter Vary >Assignee: Peter Vary > Attachments: HIVE-16487.patch > > > A customer started to see this in the logs, but happily everything was > working as intended: > {code} > 2017-03-30 12:01:59,446 ERROR ZooKeeperHiveLockManager: > [HiveServer2-Background-Pool: Thread-620]: Serious Zookeeper exception: > org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = > NoNode for /hive_zookeeper_namespace//LOCK-SHARED- > {code} > This was happening, because a race condition between the lock releasing, and > lock acquiring. The thread releasing the lock removes the parent ZK node just > after the thread acquiring the lock made sure, that the parent node exists. > Since this can happen without any real problem, I plan to add NODEEXISTS, and > NONODE as a transient ZooKeeper exception, so the users are not confused. > Also, the original author of ZooKeeperHiveLockManager maybe planned to handle > different ZooKeeperExceptions differently, and the code is hard to > understand. See the {{continue}} and the {{break}}. The {{break}} only breaks > the switch, and not the loop which IMHO is not intuitive: > {code} > do { > try { > [..] > ret = lockPrimitive(key, mode, keepAlive, parentCreated, > } catch (Exception e1) { > if (e1 instanceof KeeperException) { > KeeperException e = (KeeperException) e1; > switch (e.code()) { > case CONNECTIONLOSS: > case OPERATIONTIMEOUT: > LOG.debug("Possibly transient ZooKeeper exception: ", e); > continue; > default: > LOG.error("Serious Zookeeper exception: ", e); > break; > } > } > [..] > } > } while (tryNum < numRetriesForLock); > {code} > If we do not want to try again in case of a "Serious Zookeeper exception:", > then we should add a label to the do loop, and break it in the switch. > If we do want to try regardless of the type of the ZK exception, then we > should just change the {{continue;}} to {{break;}} and move the lines part of > the code which did not run in case of {{continue}} to the {{default}} switch, > so it is easier to understand the code. > Any suggestions or ideas [~ctang.ma] or [~szehon]? -- This message was sent by Atlassian JIRA (v6.3.15#6346)
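The swallowed-exception concern above can be avoided by remembering the last failure and rethrowing it once the retries are exhausted. A hypothetical sketch of that idea (not the committed code):
{code:java}
public class RetainFailureSketch {
  Object lockWithRetries(int numRetriesForLock) throws Exception {
    Exception lastException = null;
    int tryNum = 0;
    do {
      tryNum++;
      try {
        return lockPrimitive();  // success ends the loop
      } catch (Exception e1) {
        lastException = e1;      // remembered, so a non-KeeperException cannot vanish
        // ...classify transient vs. serious KeeperExceptions here...
      }
    } while (tryNum < numRetriesForLock);
    throw lastException;         // surface the final failure to the caller
  }

  private Object lockPrimitive() throws Exception {
    return new Object();         // placeholder for the real ZooKeeper call
  }
}
{code}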
[jira] [Commented] (HIVE-16147) Rename a partitioned table should not drop its partition columns stats
[ https://issues.apache.org/jira/browse/HIVE-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15982021#comment-15982021 ] Chaoyu Tang commented on HIVE-16147: Patch has been uploaded to RB. [~pxiong], could you help to review it. Thanks. > Rename a partitioned table should not drop its partition columns stats > -- > > Key: HIVE-16147 > URL: https://issues.apache.org/jira/browse/HIVE-16147 > Project: Hive > Issue Type: Bug >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-16147.patch > > > When a partitioned table (e.g. sample_pt) is renamed (e.g to > sample_pt_rename), describing its partition shows that the partition column > stats are still accurate, but actually they all have been dropped. > It could be reproduce as following: > 1. analyze table sample_pt compute statistics for columns; > 2. describe formatted default.sample_pt partition (dummy = 3): COLUMN_STATS > for all columns are true > {code} > ... > # Detailed Partition Information > Partition Value: [3] > Database: default > Table:sample_pt > CreateTime: Fri Jan 20 15:42:30 EST 2017 > LastAccessTime: UNKNOWN > Location: file:/user/hive/warehouse/apache/sample_pt/dummy=3 > Partition Parameters: > COLUMN_STATS_ACCURATE > {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}} > last_modified_byctang > last_modified_time 1485217063 > numFiles1 > numRows 100 > rawDataSize 5143 > totalSize 5243 > transient_lastDdlTime 1488842358 > ... > {code} > 3: describe formatted default.sample_pt partition (dummy = 3) salary: column > stats exists > {code} > # col_namedata_type min > max num_nulls distinct_count > avg_col_len max_col_len num_trues > num_falses comment > > > salaryint 1 151370 > 0 94 > > from deserializer > {code} > 4. alter table sample_pt rename to sample_pt_rename; > 5. describe formatted default.sample_pt_rename partition (dummy = 3): > describe the rename table partition (dummy =3) shows that COLUMN_STATS for > columns are still true. > {code} > # Detailed Partition Information > Partition Value: [3] > Database: default > Table:sample_pt_rename > CreateTime: Fri Jan 20 15:42:30 EST 2017 > LastAccessTime: UNKNOWN > Location: > file:/user/hive/warehouse/apache/sample_pt_rename/dummy=3 > Partition Parameters: > COLUMN_STATS_ACCURATE > {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}} > last_modified_byctang > last_modified_time 1485217063 > numFiles1 > numRows 100 > rawDataSize 5143 > totalSize 5243 > transient_lastDdlTime 1488842358 > {code} > describe formatted default.sample_pt_rename partition (dummy = 3) salary: the > column stats have been dropped. > {code} > # col_namedata_type comment > > > > salaryint from deserializer > > Time taken: 0.131 seconds, Fetched: 3 row(s) > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16147) Rename a partitioned table should not drop its partition columns stats
[ https://issues.apache.org/jira/browse/HIVE-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-16147: --- Attachment: HIVE-16147.patch The patch: 1. preserves the column stats when a partitioned table is renamed; 2. renames alter_table_invalidate_column_stats.q to alter_table_column_stats.q, since the column stats are no longer invalidated during a table rename. > Rename a partitioned table should not drop its partition columns stats > -- > > Key: HIVE-16147 > URL: https://issues.apache.org/jira/browse/HIVE-16147 > Project: Hive > Issue Type: Bug >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-16147.patch > > > When a partitioned table (e.g. sample_pt) is renamed (e.g to > sample_pt_rename), describing its partition shows that the partition column > stats are still accurate, but actually they all have been dropped. > It could be reproduce as following: > 1. analyze table sample_pt compute statistics for columns; > 2. describe formatted default.sample_pt partition (dummy = 3): COLUMN_STATS > for all columns are true > {code} > ... > # Detailed Partition Information > Partition Value: [3] > Database: default > Table:sample_pt > CreateTime: Fri Jan 20 15:42:30 EST 2017 > LastAccessTime: UNKNOWN > Location: file:/user/hive/warehouse/apache/sample_pt/dummy=3 > Partition Parameters: > COLUMN_STATS_ACCURATE > {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}} > last_modified_byctang > last_modified_time 1485217063 > numFiles1 > numRows 100 > rawDataSize 5143 > totalSize 5243 > transient_lastDdlTime 1488842358 > ... > {code} > 3: describe formatted default.sample_pt partition (dummy = 3) salary: column > stats exists > {code} > # col_namedata_type min > max num_nulls distinct_count > avg_col_len max_col_len num_trues > num_falses comment > > > salaryint 1 151370 > 0 94 > > from deserializer > {code} > 4. alter table sample_pt rename to sample_pt_rename; > 5. describe formatted default.sample_pt_rename partition (dummy = 3): > describe the rename table partition (dummy =3) shows that COLUMN_STATS for > columns are still true. > {code} > # Detailed Partition Information > Partition Value: [3] > Database: default > Table:sample_pt_rename > CreateTime: Fri Jan 20 15:42:30 EST 2017 > LastAccessTime: UNKNOWN > Location: > file:/user/hive/warehouse/apache/sample_pt_rename/dummy=3 > Partition Parameters: > COLUMN_STATS_ACCURATE > {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}} > last_modified_byctang > last_modified_time 1485217063 > numFiles1 > numRows 100 > rawDataSize 5143 > totalSize 5243 > transient_lastDdlTime 1488842358 > {code} > describe formatted default.sample_pt_rename partition (dummy = 3) salary: the > column stats have been dropped. > {code} > # col_namedata_type comment > > > > salaryint from deserializer > > Time taken: 0.131 seconds, Fetched: 3 row(s) > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
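Conceptually, preserving the stats means re-pointing the statistics rows at the renamed table instead of deleting them. The sketch below illustrates that idea as a direct SQL update against the metastore backing database (PART_COL_STATS, DB_NAME and TABLE_NAME follow the usual metastore schema, but this is only an illustration; the actual patch works through the metastore's ObjectStore layer):
{code:java}
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class RenameStatsSketch {
  /** Re-point partition column stats at the renamed table instead of dropping them. */
  static void retargetPartitionColumnStats(Connection metastoreDb, String dbName,
      String oldTableName, String newTableName) throws SQLException {
    String sql = "UPDATE PART_COL_STATS SET TABLE_NAME = ? "
               + "WHERE DB_NAME = ? AND TABLE_NAME = ?";
    try (PreparedStatement ps = metastoreDb.prepareStatement(sql)) {
      ps.setString(1, newTableName);
      ps.setString(2, dbName);
      ps.setString(3, oldTableName);
      ps.executeUpdate();        // stats survive the rename under the new table name
    }
  }
}
{code}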
[jira] [Updated] (HIVE-16147) Rename a partitioned table should not drop its partition columns stats
[ https://issues.apache.org/jira/browse/HIVE-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-16147: --- Status: Patch Available (was: Open) > Rename a partitioned table should not drop its partition columns stats > -- > > Key: HIVE-16147 > URL: https://issues.apache.org/jira/browse/HIVE-16147 > Project: Hive > Issue Type: Bug >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-16147.patch > > > When a partitioned table (e.g. sample_pt) is renamed (e.g to > sample_pt_rename), describing its partition shows that the partition column > stats are still accurate, but actually they all have been dropped. > It could be reproduce as following: > 1. analyze table sample_pt compute statistics for columns; > 2. describe formatted default.sample_pt partition (dummy = 3): COLUMN_STATS > for all columns are true > {code} > ... > # Detailed Partition Information > Partition Value: [3] > Database: default > Table:sample_pt > CreateTime: Fri Jan 20 15:42:30 EST 2017 > LastAccessTime: UNKNOWN > Location: file:/user/hive/warehouse/apache/sample_pt/dummy=3 > Partition Parameters: > COLUMN_STATS_ACCURATE > {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}} > last_modified_byctang > last_modified_time 1485217063 > numFiles1 > numRows 100 > rawDataSize 5143 > totalSize 5243 > transient_lastDdlTime 1488842358 > ... > {code} > 3: describe formatted default.sample_pt partition (dummy = 3) salary: column > stats exists > {code} > # col_namedata_type min > max num_nulls distinct_count > avg_col_len max_col_len num_trues > num_falses comment > > > salaryint 1 151370 > 0 94 > > from deserializer > {code} > 4. alter table sample_pt rename to sample_pt_rename; > 5. describe formatted default.sample_pt_rename partition (dummy = 3): > describe the rename table partition (dummy =3) shows that COLUMN_STATS for > columns are still true. > {code} > # Detailed Partition Information > Partition Value: [3] > Database: default > Table:sample_pt_rename > CreateTime: Fri Jan 20 15:42:30 EST 2017 > LastAccessTime: UNKNOWN > Location: > file:/user/hive/warehouse/apache/sample_pt_rename/dummy=3 > Partition Parameters: > COLUMN_STATS_ACCURATE > {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}} > last_modified_byctang > last_modified_time 1485217063 > numFiles1 > numRows 100 > rawDataSize 5143 > totalSize 5243 > transient_lastDdlTime 1488842358 > {code} > describe formatted default.sample_pt_rename partition (dummy = 3) salary: the > column stats have been dropped. > {code} > # col_namedata_type comment > > > > salaryint from deserializer > > Time taken: 0.131 seconds, Fetched: 3 row(s) > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16487) Serious Zookeeper exception is logged when a race condition happens
[ https://issues.apache.org/jira/browse/HIVE-16487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15976624#comment-15976624 ] Chaoyu Tang commented on HIVE-16487: [~pvary], your analysis makes sense. Thanks. > Serious Zookeeper exception is logged when a race condition happens > --- > > Key: HIVE-16487 > URL: https://issues.apache.org/jira/browse/HIVE-16487 > Project: Hive > Issue Type: Bug > Components: Locking >Affects Versions: 3.0.0 >Reporter: Peter Vary >Assignee: Peter Vary > > A customer started to see this in the logs, but happily everything was > working as intended: > {code} > 2017-03-30 12:01:59,446 ERROR ZooKeeperHiveLockManager: > [HiveServer2-Background-Pool: Thread-620]: Serious Zookeeper exception: > org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = > NoNode for /hive_zookeeper_namespace//LOCK-SHARED- > {code} > This was happening, because a race condition between the lock releasing, and > lock acquiring. The thread releasing the lock removes the parent ZK node just > after the thread acquiring the lock made sure, that the parent node exists. > Since this can happen without any real problem, I plan to add NODEEXISTS, and > NONODE as a transient ZooKeeper exception, so the users are not confused. > Also, the original author of ZooKeeperHiveLockManager maybe planned to handle > different ZooKeeperExceptions differently, and the code is hard to > understand. See the {{continue}} and the {{break}}. The {{break}} only breaks > the switch, and not the loop which IMHO is not intuitive: > {code} > do { > try { > [..] > ret = lockPrimitive(key, mode, keepAlive, parentCreated, > } catch (Exception e1) { > if (e1 instanceof KeeperException) { > KeeperException e = (KeeperException) e1; > switch (e.code()) { > case CONNECTIONLOSS: > case OPERATIONTIMEOUT: > LOG.debug("Possibly transient ZooKeeper exception: ", e); > continue; > default: > LOG.error("Serious Zookeeper exception: ", e); > break; > } > } > [..] > } > } while (tryNum < numRetriesForLock); > {code} > If we do not want to try again in case of a "Serious Zookeeper exception:", > then we should add a label to the do loop, and break it in the switch. > If we do want to try regardless of the type of the ZK exception, then we > should just change the {{continue;}} to {{break;}} and move the lines part of > the code which did not run in case of {{continue}} to the {{default}} switch, > so it is easier to understand the code. > Any suggestions or ideas [~ctang.ma] or [~szehon]? -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16394) HoS does not support queue name change in middle of session
[ https://issues.apache.org/jira/browse/HIVE-16394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15964399#comment-15964399 ] Chaoyu Tang commented on HIVE-16394: Thanks [~leftylev]. This property is not HoS specific and already works in HoMR, so I think it does not need to be documented separately. > HoS does not support queue name change in middle of session > --- > > Key: HIVE-16394 > URL: https://issues.apache.org/jira/browse/HIVE-16394 > Project: Hive > Issue Type: Bug >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Fix For: 3.0.0 > > Attachments: HIVE-16394.patch > > > The mapreduce.job.queuename only effects when HoS executes its query first > time. After that, changing mapreduce.job.queuename won't change the query > yarn scheduler queue name. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
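As background on why a mid-session queue change needs special handling in HoS: the YARN queue is fixed when the Spark remote driver (the YARN application) starts, so honoring a new mapreduce.job.queuename requires tearing the session down and starting a new one. A hypothetical sketch of that check (the names here are illustrative stand-ins, not Hive's actual session API):
{code:java}
public class QueueChangeSketch {
  private String activeQueue;          // queue the current Spark session started on
  private SparkSessionHandle session;  // stand-in for the remote-driver handle

  void ensureSessionMatchesQueue(String requestedQueue) {
    if (session != null && !requestedQueue.equals(activeQueue)) {
      session.close();                 // the running YARN app keeps its original queue
      session = null;
    }
    if (session == null) {
      session = SparkSessionHandle.open(requestedQueue); // new app lands on the new queue
      activeQueue = requestedQueue;
    }
  }

  /** Hypothetical stand-in for Hive's Spark session management. */
  interface SparkSessionHandle {
    void close();
    static SparkSessionHandle open(String queue) {
      return () -> { };                // placeholder: would launch a driver on the queue
    }
  }
}
{code}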
[jira] [Updated] (HIVE-16334) Query lock contains the query string, which can cause OOM on ZooKeeper
[ https://issues.apache.org/jira/browse/HIVE-16334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-16334: --- Resolution: Fixed Fix Version/s: 3.0.0 Status: Resolved (was: Patch Available) Committed to 3.0.0. Thanks [~pvary] for the patch. I think you may need to update the documentation for the new property. > Query lock contains the query string, which can cause OOM on ZooKeeper > -- > > Key: HIVE-16334 > URL: https://issues.apache.org/jira/browse/HIVE-16334 > Project: Hive > Issue Type: Improvement > Components: Locking >Reporter: Peter Vary >Assignee: Peter Vary > Fix For: 3.0.0 > > Attachments: HIVE-16334.2.patch, HIVE-16334.3.patch, > HIVE-16334.4.patch, HIVE-16334.patch > > > When there are big number of partitions in a query this will result in a huge > number of locks on ZooKeeper. Since the query object contains the whole query > string this might cause serious memory pressure on the ZooKeeper services. > It would be good to have the possibility to truncate the query strings that > are written into the locks -- This message was sent by Atlassian JIRA (v6.3.15#6346)
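The improvement boils down to capping the query string before it is embedded in each ZooKeeper lock node. A minimal sketch of that idea (the constant below is illustrative; the committed patch reads the limit from a new configuration property):
{code:java}
public class LockDataSketch {
  /** Illustrative cap; the real limit comes from configuration. */
  static final int MAX_QUERY_LENGTH_IN_LOCK = 1_000_000;

  static String queryForLockData(String queryStr) {
    if (queryStr == null || queryStr.length() <= MAX_QUERY_LENGTH_IN_LOCK) {
      return queryStr;
    }
    // Keep only a prefix: still useful for debugging, small enough for ZooKeeper
    // even when thousands of partition locks embed it.
    return queryStr.substring(0, MAX_QUERY_LENGTH_IN_LOCK);
  }
}
{code}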
[jira] [Updated] (HIVE-16394) HoS does not support queue name change in middle of session
[ https://issues.apache.org/jira/browse/HIVE-16394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-16394: --- Resolution: Fixed Fix Version/s: 3.0.0 Status: Resolved (was: Patch Available) Committed to 3.0.0. Thanks [~xuefuz], [~lirui] for review. > HoS does not support queue name change in middle of session > --- > > Key: HIVE-16394 > URL: https://issues.apache.org/jira/browse/HIVE-16394 > Project: Hive > Issue Type: Bug >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Fix For: 3.0.0 > > Attachments: HIVE-16394.patch > > > The mapreduce.job.queuename only effects when HoS executes its query first > time. After that, changing mapreduce.job.queuename won't change the query > yarn scheduler queue name. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15538) Test HIVE-13884 with more complex query predicates
[ https://issues.apache.org/jira/browse/HIVE-15538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-15538: --- Resolution: Fixed Fix Version/s: 3.0.0 Status: Resolved (was: Patch Available) Committed to 3.0.0. Thanks [~kuczoram] for the patch. > Test HIVE-13884 with more complex query predicates > -- > > Key: HIVE-15538 > URL: https://issues.apache.org/jira/browse/HIVE-15538 > Project: Hive > Issue Type: Test > Components: Test >Affects Versions: 2.2.0 >Reporter: Marta Kuczora >Assignee: Marta Kuczora > Fix For: 3.0.0 > > Attachments: HIVE-15538.2.patch, HIVE-15538.3.patch, HIVE-15538.patch > > > HIVE-13884 introduced a new property hive.metastore.limit.partition.request. > It would be good to have more tests to cover the cases where the query > predicates (such as like, in) could not be pushed down to see if the fail > back from directsql to ORM works properly if hive.metastore.try.direct.sql is > enabled. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16334) Query lock contains the query string, which can cause OOM on ZooKeeper
[ https://issues.apache.org/jira/browse/HIVE-16334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15959261#comment-15959261 ] Chaoyu Tang commented on HIVE-16334: LGTM, +1 > Query lock contains the query string, which can cause OOM on ZooKeeper > -- > > Key: HIVE-16334 > URL: https://issues.apache.org/jira/browse/HIVE-16334 > Project: Hive > Issue Type: Improvement > Components: Locking >Reporter: Peter Vary >Assignee: Peter Vary > Attachments: HIVE-16334.2.patch, HIVE-16334.3.patch, > HIVE-16334.4.patch, HIVE-16334.patch > > > When there are big number of partitions in a query this will result in a huge > number of locks on ZooKeeper. Since the query object contains the whole query > string this might cause serious memory pressure on the ZooKeeper services. > It would be good to have the possibility to truncate the query strings that > are written into the locks -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16394) HoS does not support queue name change in middle of session
[ https://issues.apache.org/jira/browse/HIVE-16394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15959049#comment-15959049 ] Chaoyu Tang commented on HIVE-16394: The test failures are not related to this patch. > HoS does not support queue name change in middle of session > --- > > Key: HIVE-16394 > URL: https://issues.apache.org/jira/browse/HIVE-16394 > Project: Hive > Issue Type: Bug >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-16394.patch > > > The mapreduce.job.queuename only effects when HoS executes its query first > time. After that, changing mapreduce.job.queuename won't change the query > yarn scheduler queue name. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16394) HoS does not support queue name change in middle of session
[ https://issues.apache.org/jira/browse/HIVE-16394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-16394: --- Status: Patch Available (was: Open) [~xuefuz], [~lirui], could you review the patch to see if it makes sense? Thanks > HoS does not support queue name change in middle of session > --- > > Key: HIVE-16394 > URL: https://issues.apache.org/jira/browse/HIVE-16394 > Project: Hive > Issue Type: Bug >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-16394.patch > > > The mapreduce.job.queuename only effects when HoS executes its query first > time. After that, changing mapreduce.job.queuename won't change the query > yarn scheduler queue name. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16394) HoS does not support queue name change in middle of session
[ https://issues.apache.org/jira/browse/HIVE-16394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-16394: --- Attachment: HIVE-16394.patch > HoS does not support queue name change in middle of session > --- > > Key: HIVE-16394 > URL: https://issues.apache.org/jira/browse/HIVE-16394 > Project: Hive > Issue Type: Bug >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-16394.patch > > > The mapreduce.job.queuename only effects when HoS executes its query first > time. After that, changing mapreduce.job.queuename won't change the query > yarn scheduler queue name. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (HIVE-16394) HoS does not support queue name change in middle of session
[ https://issues.apache.org/jira/browse/HIVE-16394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang reassigned HIVE-16394: -- > HoS does not support queue name change in middle of session > --- > > Key: HIVE-16394 > URL: https://issues.apache.org/jira/browse/HIVE-16394 > Project: Hive > Issue Type: Bug >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > > The mapreduce.job.queuename only effects when HoS executes its query first > time. After that, changing mapreduce.job.queuename won't change the query > yarn scheduler queue name. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15538) Test HIVE-13884 with more complex query predicates
[ https://issues.apache.org/jira/browse/HIVE-15538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15957060#comment-15957060 ] Chaoyu Tang commented on HIVE-15538: LGTM, +1 > Test HIVE-13884 with more complex query predicates > -- > > Key: HIVE-15538 > URL: https://issues.apache.org/jira/browse/HIVE-15538 > Project: Hive > Issue Type: Test > Components: Test >Affects Versions: 2.2.0 >Reporter: Marta Kuczora >Assignee: Marta Kuczora > Attachments: HIVE-15538.2.patch, HIVE-15538.3.patch, HIVE-15538.patch > > > HIVE-13884 introduced a new property hive.metastore.limit.partition.request. > It would be good to have more tests to cover the cases where the query > predicates (such as like, in) could not be pushed down to see if the fail > back from directsql to ORM works properly if hive.metastore.try.direct.sql is > enabled. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-10307) Support to use number literals in partition column
[ https://issues.apache.org/jira/browse/HIVE-10307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15956800#comment-15956800 ] Chaoyu Tang commented on HIVE-10307: [~pxiong] The property hive.typecheck.on.insert was initially introduced in HIVE-5297; this JIRA (HIVE-10307) only added its description. I kept the property when working on this JIRA for the sake of backward compatibility, though over the years I have not seen a case that needs it set to false. > Support to use number literals in partition column > -- > > Key: HIVE-10307 > URL: https://issues.apache.org/jira/browse/HIVE-10307 > Project: Hive > Issue Type: Improvement > Components: Query Processor >Affects Versions: 1.0.0 >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Fix For: 1.2.0 > > Attachments: HIVE-10307.1.patch, HIVE-10307.2.patch, > HIVE-10307.3.patch, HIVE-10307.4.patch, HIVE-10307.5.patch, > HIVE-10307.6.patch, HIVE-10307.patch > > > Data types like TinyInt, SmallInt, BigInt or Decimal can be expressed as > literals with postfix like Y, S, L, or BD appended to the number. These > literals work in most Hive queries, but do not when they are used as > partition column value. For a partitioned table like: > create table partcoltypenum (key int, value string) partitioned by (tint > tinyint, sint smallint, bint bigint); > insert into partcoltypenum partition (tint=100Y, sint=1S, > bint=1000L) select key, value from src limit 30; > Queries like select, describe and drop partition do not work. For an example > select * from partcoltypenum where tint=100Y and sint=1S and > bint=1000L; > does not return any rows. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16061) When hive.async.log.enabled is set to true, some output is not printed to the beeline console
[ https://issues.apache.org/jira/browse/HIVE-16061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15952898#comment-15952898 ] Chaoyu Tang commented on HIVE-16061: LGTM, +1 > When hive.async.log.enabled is set to true, some output is not printed to the > beeline console > - > > Key: HIVE-16061 > URL: https://issues.apache.org/jira/browse/HIVE-16061 > Project: Hive > Issue Type: Bug > Components: Logging >Affects Versions: 2.1.1 >Reporter: Aihua Xu >Assignee: Aihua Xu > Attachments: HIVE-16061.1.patch, HIVE-16061.2.patch, > HIVE-16061.3.patch, HIVE-16061.4.patch > > > Run a hiveserver2 instance "hive --service hiveserver2". > Then from another console, connect to hiveserver2 "beeline -u > "jdbc:hive2://localhost:1" > When you run a MR job like "select t1.key from src t1 join src t2 on > t1.key=t2.key", some of the console logs like MR job info are not printed to > the console while it just print to the hiveserver2 console. > When hive.async.log.enabled is set to false and restarts the HiveServer2, > then the output will be printed to the beeline console. > OperationLog implementation uses the ThreadLocal variable to store associated > the log file. When the hive.async.log.enabled is set to true, the logs will > be processed by a ThreadPool and the actual threads from the pool which > prints the message won't be able to access the log file stored in the > original thread. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
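The root cause described above is easy to reproduce in isolation: a ThreadLocal set on the handler thread is simply absent on the pool thread that later processes the log event. A self-contained illustration:
{code:java}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreadLocalSketch {
  static final ThreadLocal<String> OPERATION_LOG_FILE = new ThreadLocal<>();

  public static void main(String[] args) throws Exception {
    // Set on the "handler" thread, as OperationLog does per session/operation.
    OPERATION_LOG_FILE.set("/var/hive/operation_logs/session-1");

    ExecutorService asyncLogger = Executors.newSingleThreadExecutor();
    asyncLogger.submit(() ->
        // Prints "log file = null": the pool thread never saw the handler
        // thread's value, which is why async appenders cannot find the log.
        System.out.println("log file = " + OPERATION_LOG_FILE.get())
    ).get();
    asyncLogger.shutdown();

    System.out.println("handler thread = " + OPERATION_LOG_FILE.get()); // still set here
  }
}
{code}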
[jira] [Updated] (HIVE-15880) Allow insert overwrite and truncate table query to use auto.purge table property
[ https://issues.apache.org/jira/browse/HIVE-15880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-15880: --- Resolution: Fixed Fix Version/s: 3.0.0 2.3.0 Status: Resolved (was: Patch Available) Committed to 2.3.0 & 3.0.0. Thanks [~vihangk1] for the patch. > Allow insert overwrite and truncate table query to use auto.purge table > property > > > Key: HIVE-15880 > URL: https://issues.apache.org/jira/browse/HIVE-15880 > Project: Hive > Issue Type: Improvement >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar > Fix For: 2.3.0, 3.0.0 > > Attachments: HIVE-15880.01.patch, HIVE-15880.02.patch, > HIVE-15880.03.patch, HIVE-15880.04.patch, HIVE-15880.05.patch, > HIVE-15880.06.patch > > > It seems inconsistent that auto.purge property is not considered when we do a > INSERT OVERWRITE while it is when we do a DROP TABLE > Drop table doesn't move table data to Trash when auto.purge is set to true > {noformat} > > create table temp(col1 string, col2 string); > No rows affected (0.064 seconds) > > alter table temp set tblproperties('auto.purge'='true'); > No rows affected (0.083 seconds) > > insert into temp values ('test', 'test'), ('test2', 'test2'); > No rows affected (25.473 seconds) > # hdfs dfs -ls /user/hive/warehouse/temp > Found 1 items > -rwxrwxrwt 3 hive hive 22 2017-02-09 13:03 > /user/hive/warehouse/temp/00_0 > # > > drop table temp; > No rows affected (0.242 seconds) > # hdfs dfs -ls /user/hive/warehouse/temp > ls: `/user/hive/warehouse/temp': No such file or directory > # > # sudo -u hive hdfs dfs -ls /user/hive/.Trash/Current/user/hive/warehouse > # > {noformat} > INSERT OVERWRITE query moves the table data to Trash even when auto.purge is > set to true > {noformat} > > create table temp(col1 string, col2 string); > > alter table temp set tblproperties('auto.purge'='true'); > > insert into temp values ('test', 'test'), ('test2', 'test2'); > # hdfs dfs -ls /user/hive/warehouse/temp > Found 1 items > -rwxrwxrwt 3 hive hive 22 2017-02-09 13:07 > /user/hive/warehouse/temp/00_0 > # > > insert overwrite table temp select * from dummy; > # hdfs dfs -ls /user/hive/warehouse/temp > Found 1 items > -rwxrwxrwt 3 hive hive 26 2017-02-09 13:08 > /user/hive/warehouse/temp/00_0 > # sudo -u hive hdfs dfs -ls /user/hive/.Trash/Current/user/hive/warehouse > Found 1 items > drwx-- - hive hive 0 2017-02-09 13:08 > /user/hive/.Trash/Current/user/hive/warehouse/temp > # > {noformat} > While move operations are not very costly on HDFS it could be significant > overhead on slow FileSystems like S3. This could improve the performance of > {{INSERT OVERWRITE TABLE}} queries especially when there are large number of > partitions on tables located on S3 should the user wish to set auto.purge > property to true > Similarly {{TRUNCATE TABLE}} query on a table with {{auto.purge}} property > set true should not move the data to Trash -- This message was sent by Atlassian JIRA (v6.3.15#6346)
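A sketch of the decision this change introduces: consult the table's auto.purge property before discarding old data, and bypass the Trash when it is true (FileSystem and Trash are the real Hadoop APIs; the surrounding method is illustrative, not Hive's actual code path):
{code:java}
import java.util.Map;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.Trash;

public class AutoPurgeSketch {
  /** Delete replaced table data, skipping the Trash when auto.purge=true. */
  static void deleteOldData(Configuration conf, Path oldData,
      Map<String, String> tableParams) throws Exception {
    boolean autoPurge = "true".equalsIgnoreCase(tableParams.get("auto.purge"));
    FileSystem fs = oldData.getFileSystem(conf);
    if (autoPurge) {
      fs.delete(oldData, true);  // permanent delete: no copy into .Trash, cheap on S3
    } else {
      Trash.moveToAppropriateTrash(fs, oldData, conf); // default: recoverable delete
    }
  }
}
{code}
This is also why the review comment further down asks for an encryption-zone qtest: a move into a Trash directory outside the zone fails, while a direct purge delete does not.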
[jira] [Updated] (HIVE-16308) PreExecutePrinter and PostExecutePrinter should log to INFO level instead of ERROR
[ https://issues.apache.org/jira/browse/HIVE-16308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-16308: --- Resolution: Fixed Fix Version/s: 3.0.0 2.3.0 Status: Resolved (was: Patch Available) Committed to 2.3.0 & 3.0.0. Thanks [~stakiar]. > PreExecutePrinter and PostExecutePrinter should log to INFO level instead of > ERROR > -- > > Key: HIVE-16308 > URL: https://issues.apache.org/jira/browse/HIVE-16308 > Project: Hive > Issue Type: Bug > Components: Test >Reporter: Sahil Takiar >Assignee: Sahil Takiar > Fix For: 2.3.0, 3.0.0 > > Attachments: HIVE-16308.1.patch > > > Many of the pre and post hook printers log info at the ERROR level, which is > confusing since they aren't errors. They should log to the INFO level. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15880) Allow insert overwrite and truncate table query to use auto.purge table property
[ https://issues.apache.org/jira/browse/HIVE-15880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15950182#comment-15950182 ] Chaoyu Tang commented on HIVE-15880: The patch looks good to me, +1. > Allow insert overwrite and truncate table query to use auto.purge table > property > > > Key: HIVE-15880 > URL: https://issues.apache.org/jira/browse/HIVE-15880 > Project: Hive > Issue Type: Improvement >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar > Attachments: HIVE-15880.01.patch, HIVE-15880.02.patch, > HIVE-15880.03.patch, HIVE-15880.04.patch, HIVE-15880.05.patch, > HIVE-15880.06.patch > > > It seems inconsistent that auto.purge property is not considered when we do a > INSERT OVERWRITE while it is when we do a DROP TABLE > Drop table doesn't move table data to Trash when auto.purge is set to true > {noformat} > > create table temp(col1 string, col2 string); > No rows affected (0.064 seconds) > > alter table temp set tblproperties('auto.purge'='true'); > No rows affected (0.083 seconds) > > insert into temp values ('test', 'test'), ('test2', 'test2'); > No rows affected (25.473 seconds) > # hdfs dfs -ls /user/hive/warehouse/temp > Found 1 items > -rwxrwxrwt 3 hive hive 22 2017-02-09 13:03 > /user/hive/warehouse/temp/00_0 > # > > drop table temp; > No rows affected (0.242 seconds) > # hdfs dfs -ls /user/hive/warehouse/temp > ls: `/user/hive/warehouse/temp': No such file or directory > # > # sudo -u hive hdfs dfs -ls /user/hive/.Trash/Current/user/hive/warehouse > # > {noformat} > INSERT OVERWRITE query moves the table data to Trash even when auto.purge is > set to true > {noformat} > > create table temp(col1 string, col2 string); > > alter table temp set tblproperties('auto.purge'='true'); > > insert into temp values ('test', 'test'), ('test2', 'test2'); > # hdfs dfs -ls /user/hive/warehouse/temp > Found 1 items > -rwxrwxrwt 3 hive hive 22 2017-02-09 13:07 > /user/hive/warehouse/temp/00_0 > # > > insert overwrite table temp select * from dummy; > # hdfs dfs -ls /user/hive/warehouse/temp > Found 1 items > -rwxrwxrwt 3 hive hive 26 2017-02-09 13:08 > /user/hive/warehouse/temp/00_0 > # sudo -u hive hdfs dfs -ls /user/hive/.Trash/Current/user/hive/warehouse > Found 1 items > drwx-- - hive hive 0 2017-02-09 13:08 > /user/hive/.Trash/Current/user/hive/warehouse/temp > # > {noformat} > While move operations are not very costly on HDFS it could be significant > overhead on slow FileSystems like S3. This could improve the performance of > {{INSERT OVERWRITE TABLE}} queries especially when there are large number of > partitions on tables located on S3 should the user wish to set auto.purge > property to true > Similarly {{TRUNCATE TABLE}} query on a table with {{auto.purge}} property > set true should not move the data to Trash -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16308) PreExecutePrinter and PostExecutePrinter should log to INFO level instead of ERROR
[ https://issues.apache.org/jira/browse/HIVE-16308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15947673#comment-15947673 ] Chaoyu Tang commented on HIVE-16308: I do not know if there is any particular reason that console.printError instead of printInfo was used in HIVE-14936. Otherwise, the patch looks good to me. +1. > PreExecutePrinter and PostExecutePrinter should log to INFO level instead of > ERROR > -- > > Key: HIVE-16308 > URL: https://issues.apache.org/jira/browse/HIVE-16308 > Project: Hive > Issue Type: Bug > Components: Test >Reporter: Sahil Takiar >Assignee: Sahil Takiar > Attachments: HIVE-16308.1.patch > > > Many of the pre and post hook printers log info at the ERROR level, which is > confusing since they aren't errors. They should log to the INFO level. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
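For reference, the fix is essentially a matter of which LogHelper method the execute hooks call. A minimal sketch (SessionState.LogHelper is the real Hive class; the hook body here is illustrative):
{code:java}
import org.apache.hadoop.hive.ql.session.SessionState;

public class HookLoggingSketch {
  static void logHookOutput(String message) {
    SessionState.LogHelper console = SessionState.getConsole();
    // printInfo logs at INFO; the hooks previously used printError for the
    // same text, which made ordinary hook output look like failures.
    console.printInfo(message);
  }
}
{code}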
[jira] [Commented] (HIVE-15880) Allow insert overwrite and truncate table query to use auto.purge table property
[ https://issues.apache.org/jira/browse/HIVE-15880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15942414#comment-15942414 ] Chaoyu Tang commented on HIVE-15880: [~vihangk1] Thanks for the patch. I think the method HiveMetaStoreFsImpl.deleteDir has logic similar to what you would like to change in FileUtils.moveToTrash; could we combine or reuse these methods? Also, could you add a qtest for insert overwrite on a table in an encryption zone with trash enabled? I think that test should fail without this patch. > Allow insert overwrite and truncate table query to use auto.purge table > property > > > Key: HIVE-15880 > URL: https://issues.apache.org/jira/browse/HIVE-15880 > Project: Hive > Issue Type: Improvement >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar > Attachments: HIVE-15880.01.patch, HIVE-15880.02.patch, > HIVE-15880.03.patch, HIVE-15880.04.patch > > > It seems inconsistent that auto.purge property is not considered when we do a > INSERT OVERWRITE while it is when we do a DROP TABLE > Drop table doesn't move table data to Trash when auto.purge is set to true > {noformat} > > create table temp(col1 string, col2 string); > No rows affected (0.064 seconds) > > alter table temp set tblproperties('auto.purge'='true'); > No rows affected (0.083 seconds) > > insert into temp values ('test', 'test'), ('test2', 'test2'); > No rows affected (25.473 seconds) > # hdfs dfs -ls /user/hive/warehouse/temp > Found 1 items > -rwxrwxrwt 3 hive hive 22 2017-02-09 13:03 > /user/hive/warehouse/temp/00_0 > # > > drop table temp; > No rows affected (0.242 seconds) > # hdfs dfs -ls /user/hive/warehouse/temp > ls: `/user/hive/warehouse/temp': No such file or directory > # > # sudo -u hive hdfs dfs -ls /user/hive/.Trash/Current/user/hive/warehouse > # > {noformat} > INSERT OVERWRITE query moves the table data to Trash even when auto.purge is > set to true > {noformat} > > create table temp(col1 string, col2 string); > > alter table temp set tblproperties('auto.purge'='true'); > > insert into temp values ('test', 'test'), ('test2', 'test2'); > # hdfs dfs -ls /user/hive/warehouse/temp > Found 1 items > -rwxrwxrwt 3 hive hive 22 2017-02-09 13:07 > /user/hive/warehouse/temp/00_0 > # > > insert overwrite table temp select * from dummy; > # hdfs dfs -ls /user/hive/warehouse/temp > Found 1 items > -rwxrwxrwt 3 hive hive 26 2017-02-09 13:08 > /user/hive/warehouse/temp/00_0 > # sudo -u hive hdfs dfs -ls /user/hive/.Trash/Current/user/hive/warehouse > Found 1 items > drwx-- - hive hive 0 2017-02-09 13:08 > /user/hive/.Trash/Current/user/hive/warehouse/temp > # > {noformat} > While move operations are not very costly on HDFS it could be significant > overhead on slow FileSystems like S3. This could improve the performance of > {{INSERT OVERWRITE TABLE}} queries especially when there are large number of > partitions on tables located on S3 should the user wish to set auto.purge > property to true > Similarly {{TRUNCATE TABLE}} query on a table with {{auto.purge}} property > set true should not move the data to Trash -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16071) HoS RPCServer misuses the timeout in its RPC handshake
[ https://issues.apache.org/jira/browse/HIVE-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-16071: --- Resolution: Fixed Fix Version/s: 2.2.0 Status: Resolved (was: Patch Available) Committed to 2.2.0. Thanks [~xuefuz], [~lirui] for review. > HoS RPCServer misuses the timeout in its RPC handshake > -- > > Key: HIVE-16071 > URL: https://issues.apache.org/jira/browse/HIVE-16071 > Project: Hive > Issue Type: Bug > Components: Spark >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Fix For: 2.2.0 > > Attachments: HIVE-16071.patch, HIVE-16071.patch, HIVE-16071.patch > > > Based on its property description in HiveConf and the comments in HIVE-12650 > (https://issues.apache.org/jira/browse/HIVE-12650?focusedCommentId=15128979&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15128979), > hive.spark.client.connect.timeout is the timeout when the spark remote > driver makes a socket connection (channel) to RPC server. But currently it is > also used by the remote driver for RPC client/server handshaking, which is > not right. Instead, hive.spark.client.server.connect.timeout should be used > and it has already been used by the RPCServer in the handshaking. > The error like following is usually caused by this issue, since the default > hive.spark.client.connect.timeout value (1000ms) used by remote driver for > handshaking is a little too short. > {code} > 17/02/20 08:46:08 ERROR yarn.ApplicationMaster: User class threw exception: > java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: > Client closed before SASL negotiation finished. > java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: > Client closed before SASL negotiation finished. > at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37) > at > org.apache.hive.spark.client.RemoteDriver.(RemoteDriver.java:156) > at > org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:556) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:542) > Caused by: javax.security.sasl.SaslException: Client closed before SASL > negotiation finished. > at > org.apache.hive.spark.client.rpc.Rpc$SaslClientHandler.dispose(Rpc.java:453) > at > org.apache.hive.spark.client.rpc.SaslHandler.channelInactive(SaslHandler.java:90) > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
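To keep the two settings straight: hive.spark.client.connect.timeout (default 1000ms) should bound only the socket connect from the remote driver back to the RPC server, while hive.spark.client.server.connect.timeout (default 90000ms) bounds the SASL handshake. A sketch of the distinction (the property names and defaults are the real HiveConf ones; the surrounding method is illustrative):
{code:java}
import java.util.Map;

public class RpcTimeoutSketch {
  // Bounds only the TCP connect from the remote driver to the RPC server.
  static final String CONNECT_TIMEOUT = "hive.spark.client.connect.timeout";

  // Bounds the SASL handshake; much longer by default.
  static final String SERVER_CONNECT_TIMEOUT = "hive.spark.client.server.connect.timeout";

  static long handshakeTimeoutMs(Map<String, String> conf) {
    // Before the fix the driver applied CONNECT_TIMEOUT (1000ms) here, so a
    // slow negotiation died with "Client closed before SASL negotiation finished".
    return Long.parseLong(conf.getOrDefault(SERVER_CONNECT_TIMEOUT, "90000"));
  }
}
{code}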
[jira] [Updated] (HIVE-16071) HoS RPCServer misuses the timeout in its RPC handshake
[ https://issues.apache.org/jira/browse/HIVE-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-16071: --- Summary: HoS RPCServer misuses the timeout in its RPC handshake (was: Spark remote driver misuses the timeout in RPC handshake) > HoS RPCServer misuses the timeout in its RPC handshake > -- > > Key: HIVE-16071 > URL: https://issues.apache.org/jira/browse/HIVE-16071 > Project: Hive > Issue Type: Bug > Components: Spark >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-16071.patch, HIVE-16071.patch, HIVE-16071.patch > > > Based on its property description in HiveConf and the comments in HIVE-12650 > (https://issues.apache.org/jira/browse/HIVE-12650?focusedCommentId=15128979&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15128979), > hive.spark.client.connect.timeout is the timeout when the spark remote > driver makes a socket connection (channel) to RPC server. But currently it is > also used by the remote driver for RPC client/server handshaking, which is > not right. Instead, hive.spark.client.server.connect.timeout should be used > and it has already been used by the RPCServer in the handshaking. > The error like following is usually caused by this issue, since the default > hive.spark.client.connect.timeout value (1000ms) used by remote driver for > handshaking is a little too short. > {code} > 17/02/20 08:46:08 ERROR yarn.ApplicationMaster: User class threw exception: > java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: > Client closed before SASL negotiation finished. > java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: > Client closed before SASL negotiation finished. > at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37) > at > org.apache.hive.spark.client.RemoteDriver.(RemoteDriver.java:156) > at > org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:556) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:542) > Caused by: javax.security.sasl.SaslException: Client closed before SASL > negotiation finished. > at > org.apache.hive.spark.client.rpc.Rpc$SaslClientHandler.dispose(Rpc.java:453) > at > org.apache.hive.spark.client.rpc.SaslHandler.channelInactive(SaslHandler.java:90) > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16071) Spark remote driver misuses the timeout in RPC handshake
[ https://issues.apache.org/jira/browse/HIVE-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15930841#comment-15930841 ] Chaoyu Tang commented on HIVE-16071: Two tests failed but none of them are related to this patch (See https://builds.apache.org/job/PreCommit-HIVE-Build/4215/testReport/) > Spark remote driver misuses the timeout in RPC handshake > > > Key: HIVE-16071 > URL: https://issues.apache.org/jira/browse/HIVE-16071 > Project: Hive > Issue Type: Bug > Components: Spark >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-16071.patch, HIVE-16071.patch, HIVE-16071.patch > > > Based on its property description in HiveConf and the comments in HIVE-12650 > (https://issues.apache.org/jira/browse/HIVE-12650?focusedCommentId=15128979&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15128979), > hive.spark.client.connect.timeout is the timeout when the spark remote > driver makes a socket connection (channel) to RPC server. But currently it is > also used by the remote driver for RPC client/server handshaking, which is > not right. Instead, hive.spark.client.server.connect.timeout should be used > and it has already been used by the RPCServer in the handshaking. > The error like following is usually caused by this issue, since the default > hive.spark.client.connect.timeout value (1000ms) used by remote driver for > handshaking is a little too short. > {code} > 17/02/20 08:46:08 ERROR yarn.ApplicationMaster: User class threw exception: > java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: > Client closed before SASL negotiation finished. > java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: > Client closed before SASL negotiation finished. > at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37) > at > org.apache.hive.spark.client.RemoteDriver.(RemoteDriver.java:156) > at > org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:556) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:542) > Caused by: javax.security.sasl.SaslException: Client closed before SASL > negotiation finished. > at > org.apache.hive.spark.client.rpc.Rpc$SaslClientHandler.dispose(Rpc.java:453) > at > org.apache.hive.spark.client.rpc.SaslHandler.channelInactive(SaslHandler.java:90) > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16189) Table column stats might be invalidated in a failed table rename
[ https://issues.apache.org/jira/browse/HIVE-16189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-16189: --- Resolution: Fixed Fix Version/s: 2.2.0 Status: Resolved (was: Patch Available) Committed to 2.2.0. Thanks [~pxiong] for review. > Table column stats might be invalidated in a failed table rename > > > Key: HIVE-16189 > URL: https://issues.apache.org/jira/browse/HIVE-16189 > Project: Hive > Issue Type: Bug >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Fix For: 2.2.0 > > Attachments: HIVE-16189.1.patch, HIVE-16189.2.patch, > HIVE-16189.2.patch, HIVE-16189.patch > > > If the table rename does not succeed due to its failure in moving the data to > the new renamed table folder, the changes in TAB_COL_STATS are not rolled > back which leads to invalid column stats. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16189) Table column stats might be invalidated in a failed table rename
[ https://issues.apache.org/jira/browse/HIVE-16189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15929995#comment-15929995 ] Chaoyu Tang commented on HIVE-16189: Precommit build was run but its result was not published and linked to this JIRA (https://builds.apache.org/job/PreCommit-HIVE-Build/4200/). Two tests failed but none of them are related to this patch. > Table column stats might be invalidated in a failed table rename > > > Key: HIVE-16189 > URL: https://issues.apache.org/jira/browse/HIVE-16189 > Project: Hive > Issue Type: Bug >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-16189.1.patch, HIVE-16189.2.patch, > HIVE-16189.2.patch, HIVE-16189.patch > > > If the table rename does not succeed due to its failure in moving the data to > the new renamed table folder, the changes in TAB_COL_STATS are not rolled > back which leads to invalid column stats. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16071) Spark remote driver misuses the timeout in RPC handshake
[ https://issues.apache.org/jira/browse/HIVE-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-16071: --- Attachment: HIVE-16071.patch > Spark remote driver misuses the timeout in RPC handshake > > > Key: HIVE-16071 > URL: https://issues.apache.org/jira/browse/HIVE-16071 > Project: Hive > Issue Type: Bug > Components: Spark >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-16071.patch, HIVE-16071.patch, HIVE-16071.patch > > > Based on its property description in HiveConf and the comments in HIVE-12650 > (https://issues.apache.org/jira/browse/HIVE-12650?focusedCommentId=15128979&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15128979), > hive.spark.client.connect.timeout is the timeout when the spark remote > driver makes a socket connection (channel) to RPC server. But currently it is > also used by the remote driver for RPC client/server handshaking, which is > not right. Instead, hive.spark.client.server.connect.timeout should be used > and it has already been used by the RPCServer in the handshaking. > The error like following is usually caused by this issue, since the default > hive.spark.client.connect.timeout value (1000ms) used by remote driver for > handshaking is a little too short. > {code} > 17/02/20 08:46:08 ERROR yarn.ApplicationMaster: User class threw exception: > java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: > Client closed before SASL negotiation finished. > java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: > Client closed before SASL negotiation finished. > at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37) > at > org.apache.hive.spark.client.RemoteDriver.(RemoteDriver.java:156) > at > org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:556) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:542) > Caused by: javax.security.sasl.SaslException: Client closed before SASL > negotiation finished. > at > org.apache.hive.spark.client.rpc.Rpc$SaslClientHandler.dispose(Rpc.java:453) > at > org.apache.hive.spark.client.rpc.SaslHandler.channelInactive(SaslHandler.java:90) > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16071) Spark remote driver misuses the timeout in RPC handshake
[ https://issues.apache.org/jira/browse/HIVE-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15929988#comment-15929988 ] Chaoyu Tang commented on HIVE-16071: Precommit tests were run but for some reasons the result was not reported (https://builds.apache.org/job/PreCommit-HIVE-Build/4203/testReport/). There is a spark failure but I do not think it is related to this patch. Reattach the patch to kick off tests to see if it could be reproducible. > Spark remote driver misuses the timeout in RPC handshake > > > Key: HIVE-16071 > URL: https://issues.apache.org/jira/browse/HIVE-16071 > Project: Hive > Issue Type: Bug > Components: Spark >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-16071.patch, HIVE-16071.patch, HIVE-16071.patch > > > Based on its property description in HiveConf and the comments in HIVE-12650 > (https://issues.apache.org/jira/browse/HIVE-12650?focusedCommentId=15128979&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15128979), > hive.spark.client.connect.timeout is the timeout when the spark remote > driver makes a socket connection (channel) to RPC server. But currently it is > also used by the remote driver for RPC client/server handshaking, which is > not right. Instead, hive.spark.client.server.connect.timeout should be used > and it has already been used by the RPCServer in the handshaking. > The error like following is usually caused by this issue, since the default > hive.spark.client.connect.timeout value (1000ms) used by remote driver for > handshaking is a little too short. > {code} > 17/02/20 08:46:08 ERROR yarn.ApplicationMaster: User class threw exception: > java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: > Client closed before SASL negotiation finished. > java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: > Client closed before SASL negotiation finished. > at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37) > at > org.apache.hive.spark.client.RemoteDriver.(RemoteDriver.java:156) > at > org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:556) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:542) > Caused by: javax.security.sasl.SaslException: Client closed before SASL > negotiation finished. > at > org.apache.hive.spark.client.rpc.Rpc$SaslClientHandler.dispose(Rpc.java:453) > at > org.apache.hive.spark.client.rpc.SaslHandler.channelInactive(SaslHandler.java:90) > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16071) Spark remote driver misuses the timeout in RPC handshake
[ https://issues.apache.org/jira/browse/HIVE-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-16071: --- Attachment: HIVE-16071.patch Changed to use the hive.spark.client.connect.timeout value for the SASL handshake timeout (cancelTask) on the RpcServer side. [~xuefuz], [~lirui], could you review it? Thanks. In addition, I am thinking about renaming the properties hive.spark.client.connect.timeout and hive.spark.client.server.connect.timeout to hive.spark.rpc.sasl.handshake.timeout and hive.spark.remote.driver.register.timeout respectively. The rename should be tracked in a separate JIRA since we need to consider property-name backward compatibility. Comments? > Spark remote driver misuses the timeout in RPC handshake > > > Key: HIVE-16071 > URL: https://issues.apache.org/jira/browse/HIVE-16071 > Project: Hive > Issue Type: Bug > Components: Spark >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-16071.patch, HIVE-16071.patch > > > Based on its property description in HiveConf and the comments in HIVE-12650 > (https://issues.apache.org/jira/browse/HIVE-12650?focusedCommentId=15128979&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15128979), > hive.spark.client.connect.timeout is the timeout when the spark remote > driver makes a socket connection (channel) to RPC server. But currently it is > also used by the remote driver for RPC client/server handshaking, which is > not right. Instead, hive.spark.client.server.connect.timeout should be used > and it has already been used by the RPCServer in the handshaking. > The error like following is usually caused by this issue, since the default > hive.spark.client.connect.timeout value (1000ms) used by remote driver for > handshaking is a little too short. > {code} > 17/02/20 08:46:08 ERROR yarn.ApplicationMaster: User class threw exception: > java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: > Client closed before SASL negotiation finished. > java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: > Client closed before SASL negotiation finished. > at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37) > at > org.apache.hive.spark.client.RemoteDriver.(RemoteDriver.java:156) > at > org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:556) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:542) > Caused by: javax.security.sasl.SaslException: Client closed before SASL > negotiation finished. > at > org.apache.hive.spark.client.rpc.Rpc$SaslClientHandler.dispose(Rpc.java:453) > at > org.apache.hive.spark.client.rpc.SaslHandler.channelInactive(SaslHandler.java:90) > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
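To make the cancelTask mechanics above concrete, here is a minimal Netty-style sketch of the pattern under discussion. It is illustrative only, not Hive's actual RpcServer code; the eventLoopGroup, channel, saslPromise, LOG, and handshakeTimeoutMs names are assumptions:
{code:java}
import java.util.concurrent.TimeUnit;
import io.netty.util.concurrent.ScheduledFuture;

// The server arms a timeout task when the handshake starts; the task closes
// the channel if SASL has not finished within handshakeTimeoutMs -- the value
// this patch switches from hive.spark.client.server.connect.timeout to
// hive.spark.client.connect.timeout.
final ScheduledFuture<?> cancelTask = eventLoopGroup.schedule(() -> {
    LOG.warn("SASL handshake timed out; closing channel.");
    channel.close();
}, handshakeTimeoutMs, TimeUnit.MILLISECONDS);

// If the handshake completes first, the pending timeout task is cancelled.
saslPromise.addListener(f -> cancelTask.cancel(true));
{code}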
[jira] [Updated] (HIVE-16189) Table column stats might be invalidated in a failed table rename
[ https://issues.apache.org/jira/browse/HIVE-16189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-16189: --- Attachment: HIVE-16189.2.patch For unknown reasons, the precommit build tests were not run for the new patch. Reattach patch to trigger the tests. > Table column stats might be invalidated in a failed table rename > > > Key: HIVE-16189 > URL: https://issues.apache.org/jira/browse/HIVE-16189 > Project: Hive > Issue Type: Bug >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-16189.1.patch, HIVE-16189.2.patch, > HIVE-16189.2.patch, HIVE-16189.patch > > > If the table rename does not succeed due to its failure in moving the data to > the new renamed table folder, the changes in TAB_COL_STATS are not rolled > back which leads to invalid column stats. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16189) Table column stats might be invalidated in a failed table rename
[ https://issues.apache.org/jira/browse/HIVE-16189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15929059#comment-15929059 ] Chaoyu Tang commented on HIVE-16189: Currently the partition column stats are dropped no matter whether the table rename succeeds or fails (see HIVE-16147). I am going to address that issue, together with some other issues particularly related to the partition table stats, in HIVE-16147. Would that make sense? Thanks. > Table column stats might be invalidated in a failed table rename > > > Key: HIVE-16189 > URL: https://issues.apache.org/jira/browse/HIVE-16189 > Project: Hive > Issue Type: Bug >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-16189.1.patch, HIVE-16189.2.patch, HIVE-16189.patch > > > If the table rename does not succeed due to its failure in moving the data to > the new renamed table folder, the changes in TAB_COL_STATS are not rolled > back which leads to invalid column stats. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16189) Table column stats might be invalidated in a failed table rename
[ https://issues.apache.org/jira/browse/HIVE-16189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-16189: --- Attachment: HIVE-16189.2.patch Fixed the failed new test -- rebased and recreated the test output file. > Table column stats might be invalidated in a failed table rename > > > Key: HIVE-16189 > URL: https://issues.apache.org/jira/browse/HIVE-16189 > Project: Hive > Issue Type: Bug >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-16189.1.patch, HIVE-16189.2.patch, HIVE-16189.patch > > > If the table rename does not succeed due to its failure in moving the data to > the new renamed table folder, the changes in TAB_COL_STATS are not rolled > back which leads to invalid column stats. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16189) Table column stats might be invalidated in a failed table rename
[ https://issues.apache.org/jira/browse/HIVE-16189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-16189: --- Attachment: HIVE-16189.1.patch 1. Fixed the failed tests. 2. Added a test based on [~pxiong]'s suggestion; the test scenario is as follows (see encryption_move_tbl.q): when a table rename fails to move the table data from one encryption zone to another due to EZ incompatibility, the rename fails but its column stats are nevertheless invalidated. When we run describe formatted on a table column, we find that all the column stats are gone. > Table column stats might be invalidated in a failed table rename > > > Key: HIVE-16189 > URL: https://issues.apache.org/jira/browse/HIVE-16189 > Project: Hive > Issue Type: Bug >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-16189.1.patch, HIVE-16189.patch > > > If the table rename does not succeed due to its failure in moving the data to > the new renamed table folder, the changes in TAB_COL_STATS are not rolled > back which leads to invalid column stats. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Comment Edited] (HIVE-16189) Table column stats might be invalidated in a failed table rename
[ https://issues.apache.org/jira/browse/HIVE-16189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15925409#comment-15925409 ] Chaoyu Tang edited comment on HIVE-16189 at 3/15/17 1:57 AM: - 1. Fixed the failed tests. 2. Add a test based on [~pxiong]'s suggestion, the test scenario is as following (see encryption_move_tbl.q): When renaming a table fails to move its table data from one encryption zone to another due to EZ incompatibility, table rename fails but its column stats are invalidated. When we describe formatted table column, we found that all column stats have gone. [~pxiong] could you review it to see if it makes sense? Thanks. was (Author: ctang.ma): 1. Fixed the failed tests. 2. Add a test based on [~pxiong]'s suggestion, the test scenario is as following (see encryption_move_tbl.q): When renaming a table fails to move its table data from one encryption zone to another due to EZ incompatibility, table rename fails but its column stats are invalidated. When we describe formatted table column, we found that all column stats have gone. > Table column stats might be invalidated in a failed table rename > > > Key: HIVE-16189 > URL: https://issues.apache.org/jira/browse/HIVE-16189 > Project: Hive > Issue Type: Bug >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-16189.1.patch, HIVE-16189.patch > > > If the table rename does not succeed due to its failure in moving the data to > the new renamed table folder, the changes in TAB_COL_STATS are not rolled > back which leads to invalid column stats. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16189) Table column stats might be invalidated in a failed table rename
[ https://issues.apache.org/jira/browse/HIVE-16189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-16189: --- Attachment: HIVE-16189.patch > Table column stats might be invalidated in a failed table rename > > > Key: HIVE-16189 > URL: https://issues.apache.org/jira/browse/HIVE-16189 > Project: Hive > Issue Type: Bug >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-16189.patch > > > If the table rename does not succeed due to its failure in moving the data to > the new renamed table folder, the changes in TAB_COL_STATS are not rolled > back which leads to invalid column stats. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16189) Table column stats might be invalidated in a failed table rename
[ https://issues.apache.org/jira/browse/HIVE-16189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-16189: --- Attachment: (was: HIVE-16189.patch) > Table column stats might be invalidated in a failed table rename > > > Key: HIVE-16189 > URL: https://issues.apache.org/jira/browse/HIVE-16189 > Project: Hive > Issue Type: Bug >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-16189.patch > > > If the table rename does not succeed due to its failure in moving the data to > the new renamed table folder, the changes in TAB_COL_STATS are not rolled > back which leads to invalid column stats. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16189) Table column stats might be invalidated in a failed table rename
[ https://issues.apache.org/jira/browse/HIVE-16189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-16189: --- Attachment: HIVE-16189.patch > Table column stats might be invalidated in a failed table rename > > > Key: HIVE-16189 > URL: https://issues.apache.org/jira/browse/HIVE-16189 > Project: Hive > Issue Type: Bug >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-16189.patch > > > If the table rename does not succeed due to its failure in moving the data to > the new renamed table folder, the changes in TAB_COL_STATS are not rolled > back which leads to invalid column stats. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16189) Table column stats might be invalidated in a failed table rename
[ https://issues.apache.org/jira/browse/HIVE-16189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-16189: --- Attachment: (was: HIVE-16189.patch) > Table column stats might be invalidated in a failed table rename > > > Key: HIVE-16189 > URL: https://issues.apache.org/jira/browse/HIVE-16189 > Project: Hive > Issue Type: Bug >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-16189.patch > > > If the table rename does not succeed due to its failure in moving the data to > the new renamed table folder, the changes in TAB_COL_STATS are not rolled > back which leads to invalid column stats. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16189) Table column stats might be invalidated in a failed table rename
[ https://issues.apache.org/jira/browse/HIVE-16189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15922785#comment-15922785 ] Chaoyu Tang commented on HIVE-16189: Looks like the precommit build infrastructure has some issues, will re-trigger the tests when it is fixed. > Table column stats might be invalidated in a failed table rename > > > Key: HIVE-16189 > URL: https://issues.apache.org/jira/browse/HIVE-16189 > Project: Hive > Issue Type: Bug >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-16189.patch > > > If the table rename does not succeed due to its failure in moving the data to > the new renamed table folder, the changes in TAB_COL_STATS are not rolled > back which leads to invalid column stats. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16189) Table column stats might be invalidated in a failed table rename
[ https://issues.apache.org/jira/browse/HIVE-16189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15922770#comment-15922770 ] Chaoyu Tang commented on HIVE-16189: I thought about that, but it seems a little difficult to create such a test case where the HDFS rename (data move) fails in {code} if (srcFs.exists(srcPath) && !srcFs.rename(srcPath, destPath)) { throw new IOException("Renaming " + srcPath + " to " + destPath + " failed"); } {code} after the metadata change has been successfully committed. Do you have any suggestions on that? I was able to manually reproduce this issue and verify the patch by using a debug breakpoint and simulating the file move failure. > Table column stats might be invalidated in a failed table rename > > > Key: HIVE-16189 > URL: https://issues.apache.org/jira/browse/HIVE-16189 > Project: Hive > Issue Type: Bug >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-16189.patch > > > If the table rename does not succeed due to its failure in moving the data to > the new renamed table folder, the changes in TAB_COL_STATS are not rolled > back which leads to invalid column stats. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
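For what it is worth, one way to force that failure branch in a unit test would be to spy the source FileSystem so that rename() reports failure. The sketch below is hypothetical (Mockito statements inside a JUnit test method that declares throws Exception); the test that was eventually committed takes the encryption-zone route (encryption_move_tbl.q) instead:
{code:java}
import static org.mockito.Mockito.any;
import static org.mockito.Mockito.doReturn;
import static org.mockito.Mockito.spy;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Make srcFs.rename() return false so the "Renaming ... failed" branch runs
// after the metastore change, and the rollback of TAB_COL_STATS can be asserted.
FileSystem srcFs = spy(FileSystem.getLocal(new Configuration()));
doReturn(false).when(srcFs).rename(any(Path.class), any(Path.class));
{code}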
[jira] [Updated] (HIVE-16189) Table column stats might be invalidated in a failed table rename
[ https://issues.apache.org/jira/browse/HIVE-16189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-16189: --- Status: Patch Available (was: Open) [~pxiong], [~aihuaxu], [~ychena], could you review the code? > Table column stats might be invalidated in a failed table rename > > > Key: HIVE-16189 > URL: https://issues.apache.org/jira/browse/HIVE-16189 > Project: Hive > Issue Type: Bug >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-16189.patch > > > If the table rename does not succeed due to its failure in moving the data to > the new renamed table folder, the changes in TAB_COL_STATS are not rolled > back which leads to invalid column stats. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16189) Table column stats might be invalidated in a failed table rename
[ https://issues.apache.org/jira/browse/HIVE-16189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-16189: --- Attachment: HIVE-16189.patch This patch changes the order of the metadata update and the data move in the alter-table-rename operation, which makes it easier to roll back the metadata changes when the data move fails during a table rename. > Table column stats might be invalidated in a failed table rename > > > Key: HIVE-16189 > URL: https://issues.apache.org/jira/browse/HIVE-16189 > Project: Hive > Issue Type: Bug >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-16189.patch > > > If the table rename does not succeed due to its failure in moving the data to > the new renamed table folder, the changes in TAB_COL_STATS are not rolled > back which leads to invalid column stats. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
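A rough sketch of the reordering idea (a simplification, not the patch itself; msdb follows Hive's RawStore method names, and dbname, name, newTable, srcFs, srcPath, and destPath are assumed identifiers): the metastore transaction stays open across the data move and is committed only after the move succeeds, so a failed move rolls everything back, TAB_COL_STATS included.
{code:java}
boolean success = false;
try {
  msdb.openTransaction();
  // Metadata (including column stats) is updated first, but not yet committed.
  msdb.alterTable(dbname, name, newTable);
  if (srcFs.exists(srcPath) && !srcFs.rename(srcPath, destPath)) {
    throw new IOException("Renaming " + srcPath + " to " + destPath + " failed");
  }
  // Commit only after the data move succeeds.
  success = msdb.commitTransaction();
} finally {
  if (!success) {
    // A failed move rolls back the metadata changes, TAB_COL_STATS included.
    msdb.rollbackTransaction();
  }
}
{code}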
[jira] [Assigned] (HIVE-16189) Table column stats might be invalidated in a failed table rename
[ https://issues.apache.org/jira/browse/HIVE-16189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang reassigned HIVE-16189: -- > Table column stats might be invalidated in a failed table rename > > > Key: HIVE-16189 > URL: https://issues.apache.org/jira/browse/HIVE-16189 > Project: Hive > Issue Type: Bug >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > > If the table rename does not succeed due to its failure in moving the data to > the new renamed table folder, the changes in TAB_COL_STATS are not rolled > back which leads to invalid column stats. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Comment Edited] (HIVE-16071) Spark remote driver misuses the timeout in RPC handshake
[ https://issues.apache.org/jira/browse/HIVE-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15903422#comment-15903422 ] Chaoyu Tang edited comment on HIVE-16071 at 3/9/17 5:24 PM: So we reached the consensus that hive.spark.client.server.connect.timeout should not be used for cancelTask at RPCServer side. The value proposed could be hive.spark.client.connect.timeout. [~xuefuz] The reason that I previously suggested we could consider another timeout for cancelTask (a little longer than hive.spark.client.connect.timeout.) is to give RemoteDriver a little more time to timeout the handshaking than RPCServer. If the timeout at both sides are set to exactly same value, we might see the situations quite often where the terminations of SASL handshaking are initiated by cancelTask at RpcServer side because the timeout at RemoteDriver side might be slightly later for whatever reasons. During this short window, the handshake could still have a chance to succeed if it is not terminated by cancelTask. To my understanding, to shorten cancelTask timeout is mainly for RpcServer to detect the handshake timeout (fired by RemoteDriver) sooner, we still want RemoteDriver to mainly control the SASL handshake timeout, and most handshake timeout should be fired from remoteDriver, right? In addition, I think we should was (Author: ctang.ma): So we reached the consensus that hive.spark.client.server.connect.timeout should not be used for cancelTask at RPCServer side. The value proposed could be hive.spark.client.connect.timeout. [~xuefuz] The reason that I previously suggested we could consider another timeout for cancelTask (a little longer than hive.spark.client.connect.timeout.) is to give RemoteDriver a little more time to timeout the handshaking than RPCServer. If the timeout at both sides are set to exactly same value, we might see the situations quite often where the terminations of SASL handshaking are initiated by cancelTask at RpcServer side for the timeout at RemoteDriver side might be slightly later for whatever reasons. During this short window, the handshake could still succeed if it is not terminated by cancelTask. To my understanding, we still want RemoteDriver to mainly control the SASL handshake timeout, to shorten the cancelTask timeout is mainly for RpcServer to detect the timeout (fired by RemoteDriver) sooner, right? > Spark remote driver misuses the timeout in RPC handshake > > > Key: HIVE-16071 > URL: https://issues.apache.org/jira/browse/HIVE-16071 > Project: Hive > Issue Type: Bug > Components: Spark >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-16071.patch > > > Based on its property description in HiveConf and the comments in HIVE-12650 > (https://issues.apache.org/jira/browse/HIVE-12650?focusedCommentId=15128979&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15128979), > hive.spark.client.connect.timeout is the timeout when the spark remote > driver makes a socket connection (channel) to RPC server. But currently it is > also used by the remote driver for RPC client/server handshaking, which is > not right. Instead, hive.spark.client.server.connect.timeout should be used > and it has already been used by the RPCServer in the handshaking. > The error like following is usually caused by this issue, since the default > hive.spark.client.connect.timeout value (1000ms) used by remote driver for > handshaking is a little too short. 
> {code} > 17/02/20 08:46:08 ERROR yarn.ApplicationMaster: User class threw exception: > java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: > Client closed before SASL negotiation finished. > java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: > Client closed before SASL negotiation finished. > at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37) > at > org.apache.hive.spark.client.RemoteDriver.(RemoteDriver.java:156) > at > org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:556) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:542) > Caused by: javax.security.sasl.SaslException: Client closed before SASL > negotiation finished. > at > org.apache.hive.spark.client.rpc.Rpc$SaslClientHandler.dispose(Rpc.java:453) > at > org.apache.hi
[jira] [Comment Edited] (HIVE-16071) Spark remote driver misuses the timeout in RPC handshake
[ https://issues.apache.org/jira/browse/HIVE-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15903422#comment-15903422 ] Chaoyu Tang edited comment on HIVE-16071 at 3/9/17 5:24 PM: So we reached the consensus that hive.spark.client.server.connect.timeout should not be used for cancelTask at RPCServer side. The value proposed could be hive.spark.client.connect.timeout. [~xuefuz] The reason that I previously suggested we could consider another timeout for cancelTask (a little longer than hive.spark.client.connect.timeout.) is to give RemoteDriver a little more time to timeout the handshaking than RPCServer. If the timeout at both sides are set to exactly same value, we might see the situations quite often where the terminations of SASL handshaking are initiated by cancelTask at RpcServer side because the timeout at RemoteDriver side might be slightly later for whatever reasons. During this short window, the handshake could still have a chance to succeed if it is not terminated by cancelTask. To my understanding, to shorten cancelTask timeout is mainly for RpcServer to detect the handshake timeout (fired by RemoteDriver) sooner, we still want RemoteDriver to mainly control the SASL handshake timeout, and most handshake timeout should be fired from remoteDriver, right? was (Author: ctang.ma): So we reached the consensus that hive.spark.client.server.connect.timeout should not be used for cancelTask at RPCServer side. The value proposed could be hive.spark.client.connect.timeout. [~xuefuz] The reason that I previously suggested we could consider another timeout for cancelTask (a little longer than hive.spark.client.connect.timeout.) is to give RemoteDriver a little more time to timeout the handshaking than RPCServer. If the timeout at both sides are set to exactly same value, we might see the situations quite often where the terminations of SASL handshaking are initiated by cancelTask at RpcServer side because the timeout at RemoteDriver side might be slightly later for whatever reasons. During this short window, the handshake could still have a chance to succeed if it is not terminated by cancelTask. To my understanding, to shorten cancelTask timeout is mainly for RpcServer to detect the handshake timeout (fired by RemoteDriver) sooner, we still want RemoteDriver to mainly control the SASL handshake timeout, and most handshake timeout should be fired from remoteDriver, right? In addition, I think we should > Spark remote driver misuses the timeout in RPC handshake > > > Key: HIVE-16071 > URL: https://issues.apache.org/jira/browse/HIVE-16071 > Project: Hive > Issue Type: Bug > Components: Spark >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-16071.patch > > > Based on its property description in HiveConf and the comments in HIVE-12650 > (https://issues.apache.org/jira/browse/HIVE-12650?focusedCommentId=15128979&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15128979), > hive.spark.client.connect.timeout is the timeout when the spark remote > driver makes a socket connection (channel) to RPC server. But currently it is > also used by the remote driver for RPC client/server handshaking, which is > not right. Instead, hive.spark.client.server.connect.timeout should be used > and it has already been used by the RPCServer in the handshaking. 
> The error like following is usually caused by this issue, since the default > hive.spark.client.connect.timeout value (1000ms) used by remote driver for > handshaking is a little too short. > {code} > 17/02/20 08:46:08 ERROR yarn.ApplicationMaster: User class threw exception: > java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: > Client closed before SASL negotiation finished. > java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: > Client closed before SASL negotiation finished. > at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37) > at > org.apache.hive.spark.client.RemoteDriver.(RemoteDriver.java:156) > at > org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:556) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:542) > Caused by: javax.security.sasl.SaslException: Client closed before SASL > negotiation finished. > at > org.apache.hive.s
[jira] [Commented] (HIVE-16071) Spark remote driver misuses the timeout in RPC handshake
[ https://issues.apache.org/jira/browse/HIVE-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15903422#comment-15903422 ] Chaoyu Tang commented on HIVE-16071: So we reached the consensus that hive.spark.client.server.connect.timeout should not be used for cancelTask at RPCServer side. The value proposed could be hive.spark.client.connect.timeout. [~xuefuz] The reason that I previously suggested we could consider another timeout for cancelTask (a little longer than hive.spark.client.connect.timeout.) is to give RemoteDriver a little more time to timeout the handshaking than RPCServer. If the timeout at both sides are set to exactly same value, we might see the situations quite often where the terminations of SASL handshaking are initiated by cancelTask at RpcServer side for the timeout at RemoteDriver side might be slightly later for whatever reasons. During this short window, the handshake could still succeed if it is not terminated by cancelTask. To my understanding, we still want RemoteDriver to mainly control the SASL handshake timeout, to shorten the cancelTask timeout is mainly for RpcServer to detect the timeout (fired by RemoteDriver) sooner, right? > Spark remote driver misuses the timeout in RPC handshake > > > Key: HIVE-16071 > URL: https://issues.apache.org/jira/browse/HIVE-16071 > Project: Hive > Issue Type: Bug > Components: Spark >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-16071.patch > > > Based on its property description in HiveConf and the comments in HIVE-12650 > (https://issues.apache.org/jira/browse/HIVE-12650?focusedCommentId=15128979&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15128979), > hive.spark.client.connect.timeout is the timeout when the spark remote > driver makes a socket connection (channel) to RPC server. But currently it is > also used by the remote driver for RPC client/server handshaking, which is > not right. Instead, hive.spark.client.server.connect.timeout should be used > and it has already been used by the RPCServer in the handshaking. > The error like following is usually caused by this issue, since the default > hive.spark.client.connect.timeout value (1000ms) used by remote driver for > handshaking is a little too short. > {code} > 17/02/20 08:46:08 ERROR yarn.ApplicationMaster: User class threw exception: > java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: > Client closed before SASL negotiation finished. > java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: > Client closed before SASL negotiation finished. > at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37) > at > org.apache.hive.spark.client.RemoteDriver.(RemoteDriver.java:156) > at > org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:556) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:542) > Caused by: javax.security.sasl.SaslException: Client closed before SASL > negotiation finished. 
> at > org.apache.hive.spark.client.rpc.Rpc$SaslClientHandler.dispose(Rpc.java:453) > at > org.apache.hive.spark.client.rpc.SaslHandler.channelInactive(SaslHandler.java:90) > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (HIVE-16147) Rename a partitioned table should not drop its partition columns stats
[ https://issues.apache.org/jira/browse/HIVE-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang reassigned HIVE-16147: -- > Rename a partitioned table should not drop its partition columns stats > -- > > Key: HIVE-16147 > URL: https://issues.apache.org/jira/browse/HIVE-16147 > Project: Hive > Issue Type: Bug >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > > When a partitioned table (e.g. sample_pt) is renamed (e.g to > sample_pt_rename), describing its partition shows that the partition column > stats are still accurate, but actually they all have been dropped. > It could be reproduce as following: > 1. analyze table sample_pt compute statistics for columns; > 2. describe formatted default.sample_pt partition (dummy = 3): COLUMN_STATS > for all columns are true > {code} > ... > # Detailed Partition Information > Partition Value: [3] > Database: default > Table:sample_pt > CreateTime: Fri Jan 20 15:42:30 EST 2017 > LastAccessTime: UNKNOWN > Location: file:/user/hive/warehouse/apache/sample_pt/dummy=3 > Partition Parameters: > COLUMN_STATS_ACCURATE > {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}} > last_modified_byctang > last_modified_time 1485217063 > numFiles1 > numRows 100 > rawDataSize 5143 > totalSize 5243 > transient_lastDdlTime 1488842358 > ... > {code} > 3: describe formatted default.sample_pt partition (dummy = 3) salary: column > stats exists > {code} > # col_namedata_type min > max num_nulls distinct_count > avg_col_len max_col_len num_trues > num_falses comment > > > salaryint 1 151370 > 0 94 > > from deserializer > {code} > 4. alter table sample_pt rename to sample_pt_rename; > 5. describe formatted default.sample_pt_rename partition (dummy = 3): > describe the rename table partition (dummy =3) shows that COLUMN_STATS for > columns are still true. > {code} > # Detailed Partition Information > Partition Value: [3] > Database: default > Table:sample_pt_rename > CreateTime: Fri Jan 20 15:42:30 EST 2017 > LastAccessTime: UNKNOWN > Location: > file:/user/hive/warehouse/apache/sample_pt_rename/dummy=3 > Partition Parameters: > COLUMN_STATS_ACCURATE > {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}} > last_modified_byctang > last_modified_time 1485217063 > numFiles1 > numRows 100 > rawDataSize 5143 > totalSize 5243 > transient_lastDdlTime 1488842358 > {code} > describe formatted default.sample_pt_rename partition (dummy = 3) salary: the > column stats have been dropped. > {code} > # col_namedata_type comment > > > > salaryint from deserializer > > Time taken: 0.131 seconds, Fetched: 3 row(s) > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16071) Spark remote driver misuses the timeout in RPC handshake
[ https://issues.apache.org/jira/browse/HIVE-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900607#comment-15900607 ] Chaoyu Tang commented on HIVE-16071: I agree with [~xuefuz] that we need a timeout for SASL handshaking at RPC server site for the case he raised. This timeout should be shorter than client.server.connect.timeout used by RegisterClient, but ideally I think it should be a little longer than the client.connect.timeout used by RemoteDriver handshaking so that we can try to avoid the handshaking timeout initiated by the server given that starting a remoteDriver is quite expensive. If so, I would suggest we can introduce a new configuration like hive.spark.rpc.handshake.server.timeout, and rename hive.spark.client.connect.timeout to hive.spark.rpc.handshake.client.timeout (though it is also used as the socket connect timeout at RemoteDriver side like now). Also the hive.spark.client.server.connect.timeout could be renamed to something like hive.spark.register.remote.driver.timeout if necessary. What do you guys think about it? > Spark remote driver misuses the timeout in RPC handshake > > > Key: HIVE-16071 > URL: https://issues.apache.org/jira/browse/HIVE-16071 > Project: Hive > Issue Type: Bug > Components: Spark >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-16071.patch > > > Based on its property description in HiveConf and the comments in HIVE-12650 > (https://issues.apache.org/jira/browse/HIVE-12650?focusedCommentId=15128979&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15128979), > hive.spark.client.connect.timeout is the timeout when the spark remote > driver makes a socket connection (channel) to RPC server. But currently it is > also used by the remote driver for RPC client/server handshaking, which is > not right. Instead, hive.spark.client.server.connect.timeout should be used > and it has already been used by the RPCServer in the handshaking. > The error like following is usually caused by this issue, since the default > hive.spark.client.connect.timeout value (1000ms) used by remote driver for > handshaking is a little too short. > {code} > 17/02/20 08:46:08 ERROR yarn.ApplicationMaster: User class threw exception: > java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: > Client closed before SASL negotiation finished. > java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: > Client closed before SASL negotiation finished. > at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37) > at > org.apache.hive.spark.client.RemoteDriver.(RemoteDriver.java:156) > at > org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:556) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:542) > Caused by: javax.security.sasl.SaslException: Client closed before SASL > negotiation finished. > at > org.apache.hive.spark.client.rpc.Rpc$SaslClientHandler.dispose(Rpc.java:453) > at > org.apache.hive.spark.client.rpc.SaslHandler.channelInactive(SaslHandler.java:90) > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15997) Resource leaks when query is cancelled
[ https://issues.apache.org/jira/browse/HIVE-15997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900049#comment-15900049 ] Chaoyu Tang commented on HIVE-15997: LGTM, +1 > Resource leaks when query is cancelled > --- > > Key: HIVE-15997 > URL: https://issues.apache.org/jira/browse/HIVE-15997 > Project: Hive > Issue Type: Bug >Reporter: Yongzhi Chen >Assignee: Yongzhi Chen > Attachments: HIVE-15997.1.patch > > > There may some resource leaks when query is cancelled. > We see following stacks in the log: > Possible files and folder leak: > {noformat} > 2017-02-02 06:23:25,410 WARN hive.ql.Context: [HiveServer2-Background-Pool: > Thread-61]: Error Removing Scratch: java.io.IOException: Failed on local > exception: java.nio.channels.ClosedByInterruptException; Host Details : local > host is: "ychencdh511t-1.vpc.cloudera.com/172.26.11.50"; destination host is: > "ychencdh511t-1.vpc.cloudera.com":8020; > at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772) > at org.apache.hadoop.ipc.Client.call(Client.java:1476) > at org.apache.hadoop.ipc.Client.call(Client.java:1409) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230) > at com.sun.proxy.$Proxy25.delete(Unknown Source) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.delete(ClientNamenodeProtocolTranslatorPB.java:535) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104) > at com.sun.proxy.$Proxy26.delete(Unknown Source) > at org.apache.hadoop.hdfs.DFSClient.delete(DFSClient.java:2059) > at > org.apache.hadoop.hdfs.DistributedFileSystem$13.doCall(DistributedFileSystem.java:675) > at > org.apache.hadoop.hdfs.DistributedFileSystem$13.doCall(DistributedFileSystem.java:671) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:671) > at org.apache.hadoop.hive.ql.Context.removeScratchDir(Context.java:405) > at org.apache.hadoop.hive.ql.Context.clear(Context.java:541) > at org.apache.hadoop.hive.ql.Driver.releaseContext(Driver.java:2109) > at org.apache.hadoop.hive.ql.Driver.closeInProcess(Driver.java:2150) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1472) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1212) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1207) > at > org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:237) > at > org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:88) > at > org.apache.hive.service.cli.operation.SQLOperation$3$1.run(SQLOperation.java:293) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1796) > at > org.apache.hive.service.cli.operation.SQLOperation$3.run(SQLOperation.java:306) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at 
java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.nio.channels.ClosedByInterruptException > at > java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202) > at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:681) > at > org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192) > at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530) > at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494) > at > org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:615) > at > org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:714) > at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:376) > at org.apache.hadoop.ipc.Client.getConnection(Client.java:1525) >
[jira] [Comment Edited] (HIVE-15997) Resource leaks when query is cancelled
[ https://issues.apache.org/jira/browse/HIVE-15997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15897742#comment-15897742 ] Chaoyu Tang edited comment on HIVE-15997 at 3/6/17 6:30 PM: Will TezTask be affected as well? Also I am not quite sure about this, for the code like this: {code} try { curatorFramework.delete().forPath(zLock.getPath()); } catch (InterruptedException ie) { curatorFramework.delete().forPath(zLock.getPath()); } {code} catching InterruptedException will guarantee to clear the interrupted flag in the thread and calling the method second time will guarantee to succeed? was (Author: ctang.ma): Will TezTask be affected as well? > Resource leaks when query is cancelled > --- > > Key: HIVE-15997 > URL: https://issues.apache.org/jira/browse/HIVE-15997 > Project: Hive > Issue Type: Bug >Reporter: Yongzhi Chen >Assignee: Yongzhi Chen > Attachments: HIVE-15997.1.patch > > > There may some resource leaks when query is cancelled. > We see following stacks in the log: > Possible files and folder leak: > {noformat} > 2017-02-02 06:23:25,410 WARN hive.ql.Context: [HiveServer2-Background-Pool: > Thread-61]: Error Removing Scratch: java.io.IOException: Failed on local > exception: java.nio.channels.ClosedByInterruptException; Host Details : local > host is: "ychencdh511t-1.vpc.cloudera.com/172.26.11.50"; destination host is: > "ychencdh511t-1.vpc.cloudera.com":8020; > at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772) > at org.apache.hadoop.ipc.Client.call(Client.java:1476) > at org.apache.hadoop.ipc.Client.call(Client.java:1409) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230) > at com.sun.proxy.$Proxy25.delete(Unknown Source) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.delete(ClientNamenodeProtocolTranslatorPB.java:535) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104) > at com.sun.proxy.$Proxy26.delete(Unknown Source) > at org.apache.hadoop.hdfs.DFSClient.delete(DFSClient.java:2059) > at > org.apache.hadoop.hdfs.DistributedFileSystem$13.doCall(DistributedFileSystem.java:675) > at > org.apache.hadoop.hdfs.DistributedFileSystem$13.doCall(DistributedFileSystem.java:671) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:671) > at org.apache.hadoop.hive.ql.Context.removeScratchDir(Context.java:405) > at org.apache.hadoop.hive.ql.Context.clear(Context.java:541) > at org.apache.hadoop.hive.ql.Driver.releaseContext(Driver.java:2109) > at org.apache.hadoop.hive.ql.Driver.closeInProcess(Driver.java:2150) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1472) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1212) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1207) > at > org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:237) > at > org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:88) > at > 
org.apache.hive.service.cli.operation.SQLOperation$3$1.run(SQLOperation.java:293) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1796) > at > org.apache.hive.service.cli.operation.SQLOperation$3.run(SQLOperation.java:306) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.nio.channels.ClosedByInterruptException > at > java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202) > at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:681) > at > or
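On the interrupted-flag question above: when a blocking call throws InterruptedException, the thread's interrupt status is cleared by contract, so a plain retry can work; but the ClosedByInterruptException seen in these stack traces is an IOException thrown with the interrupt status still set, so a bare retry through NIO can fail the same way. A hedged sketch of a retry that accounts for both cases (hypothetical helper, reusing the names from the quoted snippet):
{code:java}
import java.nio.channels.ClosedByInterruptException;
import org.apache.curator.framework.CuratorFramework;

// Hypothetical helper, not the actual patch.
static void deleteWithRetry(CuratorFramework curatorFramework, String path) throws Exception {
  try {
    curatorFramework.delete().forPath(path);
  } catch (InterruptedException ie) {
    // InterruptedException clears the interrupt status, so a plain retry
    // has a chance to succeed.
    curatorFramework.delete().forPath(path);
  } catch (ClosedByInterruptException cie) {
    // The NIO path leaves the interrupt status SET; clear it first, or the
    // retry will be interrupted in the same way.
    Thread.interrupted();
    curatorFramework.delete().forPath(path);
  }
}
{code}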
[jira] [Commented] (HIVE-15997) Resource leaks when query is cancelled
[ https://issues.apache.org/jira/browse/HIVE-15997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15897742#comment-15897742 ] Chaoyu Tang commented on HIVE-15997: Will TezTask be affected as well?
> Resource leaks when query is cancelled
> ---
>
> Key: HIVE-15997
> URL: https://issues.apache.org/jira/browse/HIVE-15997
> Project: Hive
> Issue Type: Bug
> Reporter: Yongzhi Chen
> Assignee: Yongzhi Chen
> Attachments: HIVE-15997.1.patch
>
>
> There may be some resource leaks when a query is cancelled.
> We see the following stacks in the log:
> Possible file and folder leaks:
> {noformat}
> 2017-02-02 06:23:25,410 WARN hive.ql.Context: [HiveServer2-Background-Pool: Thread-61]: Error Removing Scratch: java.io.IOException: Failed on local exception: java.nio.channels.ClosedByInterruptException; Host Details : local host is: "ychencdh511t-1.vpc.cloudera.com/172.26.11.50"; destination host is: "ychencdh511t-1.vpc.cloudera.com":8020;
> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
> at org.apache.hadoop.ipc.Client.call(Client.java:1476)
> at org.apache.hadoop.ipc.Client.call(Client.java:1409)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
> at com.sun.proxy.$Proxy25.delete(Unknown Source)
> at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.delete(ClientNamenodeProtocolTranslatorPB.java:535)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
> at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
> at com.sun.proxy.$Proxy26.delete(Unknown Source)
> at org.apache.hadoop.hdfs.DFSClient.delete(DFSClient.java:2059)
> at org.apache.hadoop.hdfs.DistributedFileSystem$13.doCall(DistributedFileSystem.java:675)
> at org.apache.hadoop.hdfs.DistributedFileSystem$13.doCall(DistributedFileSystem.java:671)
> at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:671)
> at org.apache.hadoop.hive.ql.Context.removeScratchDir(Context.java:405)
> at org.apache.hadoop.hive.ql.Context.clear(Context.java:541)
> at org.apache.hadoop.hive.ql.Driver.releaseContext(Driver.java:2109)
> at org.apache.hadoop.hive.ql.Driver.closeInProcess(Driver.java:2150)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1472)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1212)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1207)
> at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:237)
> at org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:88)
> at org.apache.hive.service.cli.operation.SQLOperation$3$1.run(SQLOperation.java:293)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1796)
> at org.apache.hive.service.cli.operation.SQLOperation$3.run(SQLOperation.java:306)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.nio.channels.ClosedByInterruptException
> at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
> at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:681)
> at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192)
> at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
> at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
> at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:615)
> at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:714)
> at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:376)
> at org.apache.hadoop.ipc.Client.getConnec
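The failure pattern in the stack above is worth spelling out: cancelling a query interrupts the HiveServer2 background thread, and any HDFS call the thread then makes during cleanup goes through interruptible NIO channels, so it dies with ClosedByInterruptException and the scratch directory is left behind. Below is a minimal sketch of one possible mitigation, using a hypothetical helper name (this is not the actual HIVE-15997.1.patch): clear the thread's interrupt status for the duration of the cleanup I/O and restore it afterwards.
{code:java}
import java.io.IOException;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public final class ScratchDirCleanup {

  // Hypothetical helper, not the actual HIVE-15997 patch. If the interrupt
  // flag is still set when the thread enters an interruptible NIO call, the
  // channel is closed and the delete fails with ClosedByInterruptException.
  public static void removeScratchDirSafely(FileSystem fs, Path scratchDir) {
    boolean wasInterrupted = Thread.interrupted(); // reads AND clears the flag
    try {
      fs.delete(scratchDir, true); // recursive delete of the scratch dir
    } catch (IOException e) {
      System.err.println("Error removing scratch dir " + scratchDir + ": " + e);
    } finally {
      if (wasInterrupted) {
        Thread.currentThread().interrupt(); // preserve cancellation semantics
      }
    }
  }
}
{code}
Restoring the flag in the finally block matters because callers further up the cancel path may still check the interrupt. Note this only helps when the underlying IPC connection has not already been torn down by an earlier interrupt.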
[jira] [Commented] (HIVE-16071) Spark remote driver misuses the timeout in RPC handshake
[ https://issues.apache.org/jira/browse/HIVE-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15894672#comment-15894672 ] Chaoyu Tang commented on HIVE-16071: Yes, [~lirui]. Increasing hive.spark.client.server.connect.timeout (instead of hive.spark.client.connect.timeout) could help in my case. The cancelTask could take effect and close the channel only when its timeout is set to a value shorter than the current hive.spark.client.server.connect.timeout. So for this cancelTask, we can do: 1. remove it to make the code more understandable; or 2. leave it as is since it will not be executed anyway; or 3. use a different HoS timeout configuration (either hive.spark.client.connect.timeout or a new one) so that we have finer control over the waiting time on the HS2 side. Adding a new timeout config may not be desirable since we already have many such configurations. [~xuefuz], [~lirui], [~vanzin], what do you think?
> Spark remote driver misuses the timeout in RPC handshake
>
>
> Key: HIVE-16071
> URL: https://issues.apache.org/jira/browse/HIVE-16071
> Project: Hive
> Issue Type: Bug
> Components: Spark
> Reporter: Chaoyu Tang
> Assignee: Chaoyu Tang
> Attachments: HIVE-16071.patch
>
>
> Based on its property description in HiveConf and the comments in HIVE-12650 (https://issues.apache.org/jira/browse/HIVE-12650?focusedCommentId=15128979&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15128979), hive.spark.client.connect.timeout is the timeout for the spark remote driver making a socket connection (channel) to the RPC server. But currently it is also used by the remote driver for RPC client/server handshaking, which is not right. Instead, hive.spark.client.server.connect.timeout should be used, as it is already used by the RPCServer in the handshaking.
> An error like the following is usually caused by this issue, since the default hive.spark.client.connect.timeout value (1000ms) used by the remote driver for handshaking is a little too short.
> {code}
> 17/02/20 08:46:08 ERROR yarn.ApplicationMaster: User class threw exception: java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: Client closed before SASL negotiation finished.
> java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: Client closed before SASL negotiation finished.
> at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
> at org.apache.hive.spark.client.RemoteDriver.<init>(RemoteDriver.java:156)
> at org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:556)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:542)
> Caused by: javax.security.sasl.SaslException: Client closed before SASL negotiation finished.
> at org.apache.hive.spark.client.rpc.Rpc$SaslClientHandler.dispose(Rpc.java:453)
> at org.apache.hive.spark.client.rpc.SaslHandler.channelInactive(SaslHandler.java:90)
> {code}
-- This message was sent by Atlassian JIRA (v6.3.15#6346)
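For anyone hitting the SaslException above before the fix, the practical workaround discussed in this thread is to raise hive.spark.client.server.connect.timeout in hive-site.xml so SASL negotiation has enough time to finish. The sketch below illustrates the fix direction in plain java.util.concurrent terms; it is a simplification, not the actual Netty-based org.apache.hive.spark.client.rpc code, and the 90000ms value is an assumed default that should be verified against HiveConf for your release. The idea: arm the handshake watchdog with the server connect timeout rather than the 1000ms socket-connect timeout, and cancel it once the handshake completes.
{code:java}
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class HandshakeWatchdogSketch {
  public static void main(String[] args) {
    ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
    CompletableFuture<Void> handshake = new CompletableFuture<>();

    // The bug: arming this watchdog with hive.spark.client.connect.timeout
    // (default 1000 ms, meant only for the socket connect) kills slow but
    // otherwise healthy SASL negotiations. The fix: use the server-side
    // value, hive.spark.client.server.connect.timeout.
    long handshakeTimeoutMs = 90_000L; // assumed default; verify in HiveConf

    ScheduledFuture<?> cancelTask = timer.schedule(
        () -> handshake.completeExceptionally(
            new TimeoutException("SASL handshake timed out")),
        handshakeTimeoutMs, TimeUnit.MILLISECONDS);

    // Once the handshake completes either way, the watchdog is moot.
    handshake.whenComplete((v, t) -> cancelTask.cancel(false));

    handshake.complete(null); // simulate a successful SASL negotiation
    System.out.println("watchdog cancelled: " + cancelTask.isCancelled());
    timer.shutdown();
  }
}
{code}
This also makes option 2 in the comment above concrete: once the watchdog's delay is at least hive.spark.client.server.connect.timeout, the server's own handshake timeout always fires first and the cancelTask is effectively dead code.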