[ https://issues.apache.org/jira/browse/HIVE-16674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16015781#comment-16015781 ]
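The payload blow-up discussed in the comment below comes from column comments being serialized once per FieldSchema, i.e. once per partition, rather than once per table. A back-of-envelope sketch of the effect (the column count and comment size below are hypothetical; only the "4K+" partition count comes from the report):

```java
// Rough sketch of why get_partitions payloads balloon: each partition's
// StorageDescriptor carries its own copy of every column's comment, so the
// same comment bytes are serialized once per partition.
public class PayloadSketch {
    public static void main(String[] args) {
        long partitions = 4000;   // "4K+" partitions, as in the report
        long columns = 50;        // hypothetical column count
        long commentBytes = 200;  // hypothetical average comment length

        // Bytes spent on comments when duplicated per partition...
        long duplicated = partitions * columns * commentBytes; // 40,000,000 (~38 MB)
        // ...versus a single table-level copy.
        long single = columns * commentBytes;                  // 10,000

        System.out.println("duplicated comment bytes: " + duplicated);
        System.out.println("single-copy comment bytes: " + single);
    }
}
```

Under these assumed sizes, comments alone account for tens of megabytes of redundant payload, which is consistent with the >256 MB responses described below once the rest of the per-partition metadata is included.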
Vlad Gudikov edited comment on HIVE-16674 at 5/18/17 1:57 PM:
--------------------------------------------------------------
Most of the RPC calls in the MetaStore carry fairly small payloads, but in this case the get_partitions call returns more than 256 MB of data. That is because it fetches all partition information, including column-level comments, which are duplicated for every partition. Here is the code where we fetch the column comments. Do we actually need them when getting partition information?
{code}
// Get FieldSchema stuff if any.
if (!colss.isEmpty()) {
  // We are skipping the CDS table here, as it seems to be totally useless.
  queryText = "select \"CD_ID\", \"COMMENT\", \"COLUMN_NAME\", \"TYPE_NAME\""
      + " from \"COLUMNS_V2\" where \"CD_ID\" in (" + colIds + ") and \"INTEGER_IDX\" >= 0"
      + " order by \"CD_ID\" asc, \"INTEGER_IDX\" asc";
  loopJoinOrderedResult(colss, queryText, 0, new ApplyFunc<List<FieldSchema>>() {
    @Override
    public void apply(List<FieldSchema> t, Object[] fields) {
      t.add(new FieldSchema((String) fields[2], (String) fields[3], (String) fields[1]));
    }});
}
{code}

> Hive metastore JVM dumps core
> -----------------------------
>
> Key: HIVE-16674
> URL: https://issues.apache.org/jira/browse/HIVE-16674
> Project: Hive
> Issue Type: Bug
> Affects Versions: 1.2.1
> Environment: Hive-1.2.1
> Kerberos enabled cluster
> Reporter: Vlad Gudikov
> Priority: Blocker
> Fix For: 1.2.1, 2.3.0
>
> While trying to run a Hive query on 24 partitions of an external table with a large number of partitions (4K+), I get an error:
> {code}
> - org.apache.thrift.transport.TSaslTransport$SaslParticipant.wrap(byte[], int, int) @bci=27, line=568 (Compiled frame)
> - org.apache.thrift.transport.TSaslTransport.flush() @bci=52, line=492 (Compiled frame)
> - org.apache.thrift.transport.TSaslServerTransport.flush() @bci=1, line=41 (Compiled frame)
> - org.apache.thrift.ProcessFunction.process(int, org.apache.thrift.protocol.TProtocol, org.apache.thrift.protocol.TProtocol, java.lang.Object) @bci=236, line=55 (Compiled frame)
> - org.apache.thrift.TBaseProcessor.process(org.apache.thrift.protocol.TProtocol, org.apache.thrift.protocol.TProtocol) @bci=126, line=39 (Compiled frame)
> - org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run() @bci=15, line=690 (Compiled frame)
> - org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run() @bci=1, line=685 (Compiled frame)
> - java.security.AccessController.doPrivileged(java.security.PrivilegedExceptionAction, java.security.AccessControlContext)
> @bci=0 (Compiled frame)
> - javax.security.auth.Subject.doAs(javax.security.auth.Subject, java.security.PrivilegedExceptionAction) @bci=42, line=422 (Compiled frame)
> - org.apache.hadoop.security.UserGroupInformation.doAs(java.security.PrivilegedExceptionAction) @bci=14, line=1595 (Compiled frame)
> - org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(org.apache.thrift.protocol.TProtocol, org.apache.thrift.protocol.TProtocol) @bci=273, line=685 (Compiled frame)
> - org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run() @bci=151, line=285 (Interpreted frame)
> - java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) @bci=95, line=1142 (Interpreted frame)
> - java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5, line=617 (Interpreted frame)
> - java.lang.Thread.run() @bci=11, line=745 (Interpreted frame)
> {code}

-- This message was sent by Atlassian JIRA (v6.3.15#6346)