[ 
https://issues.apache.org/jira/browse/HIVE-14631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexandre Linte updated HIVE-14631:
-----------------------------------
    Description: 
I have a cluster secured with Kerberos, and Hive is configured to work with Tez 
by default. Everything works well through hive-cli and beeline; however, I'm 
seeing strange behavior through Hue.
There can be a lot of client connections (up to 600), and after a day, 
client connections start to fail, although not every client connection 
attempt fails.

When it fails, I have the following logs on the HiveServer2:
{noformat}
Aug  3 09:28:04 hiveserver2.bigdata.fr Executing command(queryId=hiveserver2_20160803092803_a216edf1-bb51-43a7-81a6-f40f1574b112): INSERT INTO TABLE shfs3453.camille_test VALUES ('coucou')
Aug  3 09:28:04 hiveserver2.bigdata.fr Query ID = hiveserver2_20160803092803_a216edf1-bb51-43a7-81a6-f40f1574b112
Aug  3 09:28:04 hiveserver2.bigdata.fr Total jobs = 1
Aug  3 09:28:04 hiveserver2.bigdata.fr Launching Job 1 out of 1
Aug  3 09:28:04 hiveserver2.bigdata.fr Starting task [Stage-1:MAPRED] in parallel
Aug  3 09:28:04 hiveserver2.bigdata.fr Trying to connect to metastore with URI thrift://metastore01.bigdata.fr:9083
Aug  3 09:28:04 hiveserver2.bigdata.fr Failed to connect to the MetaStore Server...
Aug  3 09:28:04 hiveserver2.bigdata.fr Waiting 1 seconds before next connection attempt.
Aug  3 09:28:05 hiveserver2.bigdata.fr Trying to connect to metastore with URI thrift://metastore01.bigdata.fr:9083
Aug  3 09:28:05 hiveserver2.bigdata.fr Failed to connect to the MetaStore Server...
Aug  3 09:28:05 hiveserver2.bigdata.fr Waiting 1 seconds before next connection attempt.
Aug  3 09:28:06 hiveserver2.bigdata.fr Trying to connect to metastore with URI thrift://metastore01.bigdata.fr:9083
Aug  3 09:28:06 hiveserver2.bigdata.fr Failed to connect to the MetaStore Server...
Aug  3 09:28:06 hiveserver2.bigdata.fr Waiting 1 seconds before next connection attempt.
Aug  3 09:28:08 hiveserver2.bigdata.fr FAILED: Execution Error, return code -1 from org.apache.hadoop.hive.ql.exec.tez.TezTask
Aug  3 09:28:08 hiveserver2.bigdata.fr Completed executing command(queryId=hiveserver2_20160803092803_a216edf1-bb51-43a7-81a6-f40f1574b112); Time taken: 4.002 seconds
{noformat}
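
For triage, the retry pattern in the HiveServer2 log can be tallied with a small script. This is only a sketch: it assumes syslog-style lines shaped like the ones above, and the function name, regular expression, and sample text are illustrative, not part of Hive or Hue. Lines that do not start with a syslog timestamp are skipped.

```python
import re
from collections import Counter

# Syslog-style prefix as seen in the log excerpt: "Aug  3 09:28:04 host message"
LINE_RE = re.compile(r"^(\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) (\S+) (.*)$")


def summarize_metastore_retries(log_text):
    """Count metastore connection attempts and failures per logging host."""
    attempts = Counter()
    failures = Counter()
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # not a timestamped entry; ignore it
        _timestamp, host, message = match.groups()
        if message.startswith("Trying to connect to metastore"):
            attempts[host] += 1
        elif message.startswith("Failed to connect to the MetaStore"):
            failures[host] += 1
    return attempts, failures


# Hypothetical sample, shortened from the excerpt above.
SAMPLE = """\
Aug  3 09:28:04 hiveserver2.bigdata.fr Trying to connect to metastore with URI
thrift://metastore01.bigdata.fr:9083
Aug  3 09:28:04 hiveserver2.bigdata.fr Failed to connect to the MetaStore
Server...
Aug  3 09:28:05 hiveserver2.bigdata.fr Trying to connect to metastore with URI
thrift://metastore01.bigdata.fr:9083
Aug  3 09:28:05 hiveserver2.bigdata.fr Failed to connect to the MetaStore
Server...
"""

if __name__ == "__main__":
    attempts, failures = summarize_metastore_retries(SAMPLE)
    for host in attempts:
        print(host, "attempts:", attempts[host], "failures:", failures[host])
```

Run against a full day of HiveServer2 logs, a tally like this would show whether every attempt fails once the problem starts or only a fraction of them.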

At the same time, I have the following logs on the Metastore:
{noformat}
Aug  3 09:28:03 metastore01.bigdata.fr 180: get_table : db=shfs3453 tbl=camille_test
Aug  3 09:28:03 metastore01.bigdata.fr ugi=shfs3453#011ip=10.77.64.228#011cmd=get_table : db=shfs3453 tbl=camille_test#011
Aug  3 09:28:04 metastore01.bigdata.fr 180: get_table : db=shfs3453 tbl=camille_test
Aug  3 09:28:04 metastore01.bigdata.fr ugi=shfs3453#011ip=10.77.64.228#011cmd=get_table : db=shfs3453 tbl=camille_test#011
Aug  3 09:28:04 metastore01.bigdata.fr 180: get_table : db=shfs3453 tbl=camille_test
Aug  3 09:28:04 metastore01.bigdata.fr ugi=shfs3453#011ip=10.77.64.228#011cmd=get_table : db=shfs3453 tbl=camille_test#011
Aug  3 09:28:04 metastore01.bigdata.fr SASL negotiation failure
Aug  3 09:28:04 metastore01.bigdata.fr Error occurred during processing of message.
Aug  3 09:28:05 metastore01.bigdata.fr SASL negotiation failure
Aug  3 09:28:05 metastore01.bigdata.fr Error occurred during processing of message.
Aug  3 09:28:06 metastore01.bigdata.fr SASL negotiation failure
Aug  3 09:28:06 metastore01.bigdata.fr Error occurred during processing of message.
{noformat}

To resolve the connection issue, I have to restart HiveServer2.

Note: I also created a JIRA for Hue: https://issues.cloudera.org/browse/HUE-4748


> HiveServer2 regularly fails to connect to metastore
> ---------------------------------------------------
>
>                 Key: HIVE-14631
>                 URL: https://issues.apache.org/jira/browse/HIVE-14631
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>    Affects Versions: 1.2.1, 2.0.0, 2.1.0
>         Environment: Hive 2.1.0, Hue 3.10.0, Hadoop 2.7.2, Tez 0.8.3
>            Reporter: Alexandre Linte
>
> I have a cluster secured with Kerberos, and Hive is configured to work with 
> Tez by default. Everything works well through hive-cli and beeline; however, 
> I'm seeing strange behavior through Hue.
> There can be a lot of client connections (up to 600), and after a day, 
> client connections start to fail, although not every client connection 
> attempt fails.
> When it fails, I have the following logs on the HiveServer2:
> {noformat}
> Aug  3 09:28:04 hiveserver2.bigdata.fr Executing command(queryId=hiveserver2_20160803092803_a216edf1-bb51-43a7-81a6-f40f1574b112): INSERT INTO TABLE shfs3453.camille_test VALUES ('coucou')
> Aug  3 09:28:04 hiveserver2.bigdata.fr Query ID = hiveserver2_20160803092803_a216edf1-bb51-43a7-81a6-f40f1574b112
> Aug  3 09:28:04 hiveserver2.bigdata.fr Total jobs = 1
> Aug  3 09:28:04 hiveserver2.bigdata.fr Launching Job 1 out of 1
> Aug  3 09:28:04 hiveserver2.bigdata.fr Starting task [Stage-1:MAPRED] in parallel
> Aug  3 09:28:04 hiveserver2.bigdata.fr Trying to connect to metastore with URI thrift://metastore01.bigdata.fr:9083
> Aug  3 09:28:04 hiveserver2.bigdata.fr Failed to connect to the MetaStore Server...
> Aug  3 09:28:04 hiveserver2.bigdata.fr Waiting 1 seconds before next connection attempt.
> Aug  3 09:28:05 hiveserver2.bigdata.fr Trying to connect to metastore with URI thrift://metastore01.bigdata.fr:9083
> Aug  3 09:28:05 hiveserver2.bigdata.fr Failed to connect to the MetaStore Server...
> Aug  3 09:28:05 hiveserver2.bigdata.fr Waiting 1 seconds before next connection attempt.
> Aug  3 09:28:06 hiveserver2.bigdata.fr Trying to connect to metastore with URI thrift://metastore01.bigdata.fr:9083
> Aug  3 09:28:06 hiveserver2.bigdata.fr Failed to connect to the MetaStore Server...
> Aug  3 09:28:06 hiveserver2.bigdata.fr Waiting 1 seconds before next connection attempt.
> Aug  3 09:28:08 hiveserver2.bigdata.fr FAILED: Execution Error, return code -1 from org.apache.hadoop.hive.ql.exec.tez.TezTask
> Aug  3 09:28:08 hiveserver2.bigdata.fr Completed executing command(queryId=hiveserver2_20160803092803_a216edf1-bb51-43a7-81a6-f40f1574b112); Time taken: 4.002 seconds
> {noformat}
> At the same time, I have the following logs on the Metastore:
> {noformat}
> Aug  3 09:28:03 metastore01.bigdata.fr 180: get_table : db=shfs3453 tbl=camille_test
> Aug  3 09:28:03 metastore01.bigdata.fr ugi=shfs3453#011ip=10.77.64.228#011cmd=get_table : db=shfs3453 tbl=camille_test#011
> Aug  3 09:28:04 metastore01.bigdata.fr 180: get_table : db=shfs3453 tbl=camille_test
> Aug  3 09:28:04 metastore01.bigdata.fr ugi=shfs3453#011ip=10.77.64.228#011cmd=get_table : db=shfs3453 tbl=camille_test#011
> Aug  3 09:28:04 metastore01.bigdata.fr 180: get_table : db=shfs3453 tbl=camille_test
> Aug  3 09:28:04 metastore01.bigdata.fr ugi=shfs3453#011ip=10.77.64.228#011cmd=get_table : db=shfs3453 tbl=camille_test#011
> Aug  3 09:28:04 metastore01.bigdata.fr SASL negotiation failure
> Aug  3 09:28:04 metastore01.bigdata.fr Error occurred during processing of message.
> Aug  3 09:28:05 metastore01.bigdata.fr SASL negotiation failure
> Aug  3 09:28:05 metastore01.bigdata.fr Error occurred during processing of message.
> Aug  3 09:28:06 metastore01.bigdata.fr SASL negotiation failure
> Aug  3 09:28:06 metastore01.bigdata.fr Error occurred during processing of message.
> {noformat}
> To resolve the connection issue, I have to restart HiveServer2.
> Note: I also created a JIRA for Hue: 
> https://issues.cloudera.org/browse/HUE-4748



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
