[jira] [Updated] (HIVE-14631) HiveServer2 regularly fails to connect to metastore

2016-08-25 Thread Alexandre Linte (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexandre Linte updated HIVE-14631:
---
Description: 
I have a cluster secured with Kerberos, and Hive is configured to work with Tez 
by default. Everything works well through hive-cli and beeline; however, I'm 
facing strange behavior through Hue.
I can have a lot of client connections (up to 600), and after a day, client 
connections start to fail. However, not all client connection attempts fail.

When a connection fails, I see the following logs on HiveServer2:
{noformat}
Aug  3 09:28:04 hiveserver2.bigdata.fr Executing 
command(queryId=hiveserver2_20160803092803_a216edf1-bb51-43a7-81a6-f40f1574b112):
 INSERT INTO TABLE shfs3453.camille_test VALUES ('coucou')
Aug  3 09:28:04 hiveserver2.bigdata.fr Query ID = 
hiveserver2_20160803092803_a216edf1-bb51-43a7-81a6-f40f1574b112
Aug  3 09:28:04 hiveserver2.bigdata.fr Total jobs = 1
Aug  3 09:28:04 hiveserver2.bigdata.fr Launching Job 1 out of 1
Aug  3 09:28:04 hiveserver2.bigdata.fr Starting task [Stage-1:MAPRED] in 
parallel
Aug  3 09:28:04 hiveserver2.bigdata.fr Trying to connect to metastore with URI 
thrift://metastore01.bigdata.fr:9083
Aug  3 09:28:04 hiveserver2.bigdata.fr Failed to connect to the MetaStore 
Server...
Aug  3 09:28:04 hiveserver2.bigdata.fr Waiting 1 seconds before next connection 
attempt.
Aug  3 09:28:05 hiveserver2.bigdata.fr Trying to connect to metastore with URI 
thrift://metastore01.bigdata.fr:9083
Aug  3 09:28:05 hiveserver2.bigdata.fr Failed to connect to the MetaStore 
Server...
Aug  3 09:28:05 hiveserver2.bigdata.fr Waiting 1 seconds before next connection 
attempt.
Aug  3 09:28:06 hiveserver2.bigdata.fr Trying to connect to metastore with URI 
thrift://metastore01.bigdata.fr:9083
Aug  3 09:28:06 hiveserver2.bigdata.fr Failed to connect to the MetaStore 
Server...
Aug  3 09:28:06 hiveserver2.bigdata.fr Waiting 1 seconds before next connection 
attempt.
Aug  3 09:28:08 hiveserver2.bigdata.fr FAILED: Execution Error, return code -1 
from org.apache.hadoop.hive.ql.exec.tez.TezTask
Aug  3 09:28:08 hiveserver2.bigdata.fr Completed executing 
command(queryId=hiveserver2_20160803092803_a216edf1-bb51-43a7-81a6-f40f1574b112);
 Time taken: 4.002 seconds
{noformat}

At the same time, I see the following logs on the Metastore:
{noformat}
Aug  3 09:28:03 metastore01.bigdata.fr 180: get_table : db=shfs3453 
tbl=camille_test
Aug  3 09:28:03 metastore01.bigdata.fr 
ugi=shfs3453#011ip=10.77.64.228#011cmd=get_table : db=shfs3453 
tbl=camille_test#011
Aug  3 09:28:04 metastore01.bigdata.fr 180: get_table : db=shfs3453 
tbl=camille_test
Aug  3 09:28:04 metastore01.bigdata.fr 
ugi=shfs3453#011ip=10.77.64.228#011cmd=get_table : db=shfs3453 
tbl=camille_test#011
Aug  3 09:28:04 metastore01.bigdata.fr 180: get_table : db=shfs3453 
tbl=camille_test
Aug  3 09:28:04 metastore01.bigdata.fr 
ugi=shfs3453#011ip=10.77.64.228#011cmd=get_table : db=shfs3453 
tbl=camille_test#011
Aug  3 09:28:04 metastore01.bigdata.fr SASL negotiation failure
Aug  3 09:28:04 metastore01.bigdata.fr Error occurred during processing of 
message.
Aug  3 09:28:05 metastore01.bigdata.fr SASL negotiation failure
Aug  3 09:28:05 metastore01.bigdata.fr Error occurred during processing of 
message.
Aug  3 09:28:06 metastore01.bigdata.fr SASL negotiation failure
Aug  3 09:28:06 metastore01.bigdata.fr Error occurred during processing of 
message.
{noformat}

Note: I also created a JIRA for Hue: https://issues.cloudera.org/browse/HUE-4748


[jira] [Updated] (HIVE-14631) HiveServer2 regularly fails to connect to metastore

2016-08-25 Thread Alexandre Linte (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexandre Linte updated HIVE-14631:
---
Description: 
I have a cluster secured with Kerberos, and Hive is configured to work with Tez 
by default. Everything works well through hive-cli and beeline; however, I'm 
facing strange behavior through Hue.
I can have a lot of client connections (up to 600), and after a day, client 
connections start to fail. However, not all client connection attempts fail.

When a connection fails, I see the following logs on HiveServer2:
{noformat}
Aug  3 09:28:04 hiveserver2.bigdata.fr Executing 
command(queryId=hiveserver2_20160803092803_a216edf1-bb51-43a7-81a6-f40f1574b112):
 INSERT INTO TABLE shfs3453.camille_test VALUES ('coucou')
Aug  3 09:28:04 hiveserver2.bigdata.fr Query ID = 
hiveserver2_20160803092803_a216edf1-bb51-43a7-81a6-f40f1574b112
Aug  3 09:28:04 hiveserver2.bigdata.fr Total jobs = 1
Aug  3 09:28:04 hiveserver2.bigdata.fr Launching Job 1 out of 1
Aug  3 09:28:04 hiveserver2.bigdata.fr Starting task [Stage-1:MAPRED] in 
parallel
Aug  3 09:28:04 hiveserver2.bigdata.fr Trying to connect to metastore with URI 
thrift://metastore01.bigdata.fr:9083
Aug  3 09:28:04 hiveserver2.bigdata.fr Failed to connect to the MetaStore 
Server...
Aug  3 09:28:04 hiveserver2.bigdata.fr Waiting 1 seconds before next connection 
attempt.
Aug  3 09:28:05 hiveserver2.bigdata.fr Trying to connect to metastore with URI 
thrift://metastore01.bigdata.fr:9083
Aug  3 09:28:05 hiveserver2.bigdata.fr Failed to connect to the MetaStore 
Server...
Aug  3 09:28:05 hiveserver2.bigdata.fr Waiting 1 seconds before next connection 
attempt.
Aug  3 09:28:06 hiveserver2.bigdata.fr Trying to connect to metastore with URI 
thrift://metastore01.bigdata.fr:9083
Aug  3 09:28:06 hiveserver2.bigdata.fr Failed to connect to the MetaStore 
Server...
Aug  3 09:28:06 hiveserver2.bigdata.fr Waiting 1 seconds before next connection 
attempt.
Aug  3 09:28:08 hiveserver2.bigdata.fr FAILED: Execution Error, return code -1 
from org.apache.hadoop.hive.ql.exec.tez.TezTask
Aug  3 09:28:08 hiveserver2.bigdata.fr Completed executing 
command(queryId=hiveserver2_20160803092803_a216edf1-bb51-43a7-81a6-f40f1574b112);
 Time taken: 4.002 seconds
{noformat}

At the same time, I see the following logs on the Metastore:
{noformat}
Aug  3 09:28:03 metastore01.bigdata.fr 180: get_table : db=shfs3453 
tbl=camille_test
Aug  3 09:28:03 metastore01.bigdata.fr 
ugi=shfs3453#011ip=10.77.64.228#011cmd=get_table : db=shfs3453 
tbl=camille_test#011
Aug  3 09:28:04 metastore01.bigdata.fr 180: get_table : db=shfs3453 
tbl=camille_test
Aug  3 09:28:04 metastore01.bigdata.fr 
ugi=shfs3453#011ip=10.77.64.228#011cmd=get_table : db=shfs3453 
tbl=camille_test#011
Aug  3 09:28:04 metastore01.bigdata.fr 180: get_table : db=shfs3453 
tbl=camille_test
Aug  3 09:28:04 metastore01.bigdata.fr 
ugi=shfs3453#011ip=10.77.64.228#011cmd=get_table : db=shfs3453 
tbl=camille_test#011
Aug  3 09:28:04 metastore01.bigdata.fr SASL negotiation failure
Aug  3 09:28:04 metastore01.bigdata.fr Error occurred during processing of 
message.
Aug  3 09:28:05 metastore01.bigdata.fr SASL negotiation failure
Aug  3 09:28:05 metastore01.bigdata.fr Error occurred during processing of 
message.
Aug  3 09:28:06 metastore01.bigdata.fr SASL negotiation failure
Aug  3 09:28:06 metastore01.bigdata.fr Error occurred during processing of 
message.
{noformat}
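
In the two excerpts above, each "Failed to connect to the MetaStore" line on 
HiveServer2 appears to coincide with a "SASL negotiation failure" line on the 
Metastore. The following is only a minimal sketch for checking that 
correlation on larger log files; the names LINE_RE and failure_times are 
hypothetical helpers introduced for illustration, and the sample lines are 
copied (unwrapped) from the excerpts. It assumes the syslog-style prefix shown 
in those excerpts.

```python
import re
from collections import Counter

# Syslog-style prefix seen in the excerpts: "Aug  3 09:28:04 <host> <message>"
# (assumption: every log line of interest starts with this prefix).
LINE_RE = re.compile(r"^(\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(\S+)\s+(.*)$")


def failure_times(lines, marker):
    """Count the lines whose message contains `marker`, keyed by timestamp."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match and marker in match.group(3):
            counts[match.group(1)] += 1
    return counts


# Sample lines taken from the HiveServer2 excerpt.
hs2_lines = [
    "Aug  3 09:28:04 hiveserver2.bigdata.fr Launching Job 1 out of 1",
    "Aug  3 09:28:04 hiveserver2.bigdata.fr Failed to connect to the MetaStore Server...",
    "Aug  3 09:28:05 hiveserver2.bigdata.fr Failed to connect to the MetaStore Server...",
    "Aug  3 09:28:06 hiveserver2.bigdata.fr Failed to connect to the MetaStore Server...",
]

# Sample lines taken from the Metastore excerpt.
metastore_lines = [
    "Aug  3 09:28:04 metastore01.bigdata.fr 180: get_table : db=shfs3453 tbl=camille_test",
    "Aug  3 09:28:04 metastore01.bigdata.fr SASL negotiation failure",
    "Aug  3 09:28:05 metastore01.bigdata.fr SASL negotiation failure",
    "Aug  3 09:28:06 metastore01.bigdata.fr SASL negotiation failure",
]

hs2_failures = failure_times(hs2_lines, "Failed to connect to the MetaStore")
sasl_failures = failure_times(metastore_lines, "SASL negotiation failure")

# Timestamps at which both sides logged a failure.
shared = sorted(set(hs2_failures) & set(sasl_failures))
for ts in shared:
    print(ts, hs2_failures[ts], sasl_failures[ts])
```

Matching on the timestamp alone is a simplification: it only shows that the 
failures happen within the same second, not that they belong to the same 
connection.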

To resolve the connection issue, I have to restart HiveServer2.

Note: I also created a JIRA for Hue: https://issues.cloudera.org/browse/HUE-4748
