[ https://issues.apache.org/jira/browse/HIVE-11368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15237833#comment-15237833 ]
Poojan Khanpara commented on HIVE-11368:
----------------------------------------

I am getting the same alerts, especially when I try to drop a big schema with cascade. I have a six-node cluster, and it was working perfectly before I upgraded to the latest version.

> Hive Metastore process always shows alert in Ambari UI on machines with 64 CPU cores
> ------------------------------------------------------------------------------------
>
>                 Key: HIVE-11368
>                 URL: https://issues.apache.org/jira/browse/HIVE-11368
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>    Affects Versions: 1.2.0
>         Environment: 64 CPU Core.
>            Reporter: Ayappan
>
> I am running Ambari with the full Hadoop stack installed on a cluster of
> machines with 64 CPU cores. All the services are up and running, but the
> Hive Metastore process always shows an alert. According to the alert
> definition, the Hive command was killed due to a timeout after 30 seconds.
> This is the command:
> /var/lib/ambari-agent/ambari-sudo.sh su ambari-qa -l -s /bin/bash -c export
> PATH='/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/lib64/qt-3.3/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin:/var/lib/ambari-agent:/var/lib/ambari-agent:/bin/:/usr/bin/:/usr/sbin/:/usr/iop/current/hive-metastore/bin'
> ; ulimit -s 10240 ; export
> HIVE_CONF_DIR='/usr/iop/current/hive-metastore/conf/conf.server' ; hive
> --hiveconf hive.metastore.uris=thrift://birhel17.rtp.raleigh.ibm.com:9083
> --hiveconf hive.metastore.client.connect.retry.delay=1s
> --hiveconf hive.metastore.failure.retries=1 --hiveconf
> hive.metastore.connect.retries=1 --hiveconf
> hive.metastore.client.socket.timeout=14s --hiveconf
> hive.execution.engine=mr -e 'show databases;'
> The alert-metastore Python script has a timeout of 30 seconds, but the
> above Hive command takes more than 30 seconds on a 64-core machine, so it
> always shows the alert.
> Even running the command manually from the command line takes much longer
> on a 64-core machine (around 27 secs) than on an 8-core machine (only 3 secs).
> Do we need to change some Hive parameters (like worker.threads) for 64-core
> machines?
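
To compare timings across machines, it may help to run the check command under the same 30-second limit and record how long it actually takes. The following is only a minimal sketch, not the Ambari alert script: the helper name `time_command`, the constant `ALERT_TIMEOUT_S`, and the placeholder command are assumptions made for illustration, and the real `hive` invocation quoted above would need to be substituted.

```python
import subprocess
import sys
import time

# 30 seconds is the alert timeout reported in the issue; the name is illustrative.
ALERT_TIMEOUT_S = 30


def time_command(cmd, timeout_s=ALERT_TIMEOUT_S):
    """Run cmd and return (status, elapsed_seconds).

    status is "ok" (exit code 0), "failed" (non-zero exit code)
    or "timeout" (the command was killed after timeout_s seconds).
    """
    start = time.monotonic()
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
        status = "ok" if result.returncode == 0 else "failed"
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising this.
        status = "timeout"
    return status, time.monotonic() - start


if __name__ == "__main__":
    # Placeholder command; pass the real check command as arguments instead.
    cmd = sys.argv[1:] or [sys.executable, "-c", "pass"]
    status, elapsed = time_command(cmd)
    print(f"{status} after {elapsed:.1f}s (limit {ALERT_TIMEOUT_S}s)")
```

Running this on both the 8-core and the 64-core machine would show how much headroom each has against the timeout.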