[ https://issues.apache.org/jira/browse/HIVE-16674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16015781#comment-16015781 ]
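The payload blow-up discussed in the comment below comes from column comments being serialized once per FieldSchema, i.e. once per partition, rather than once per table. A back-of-envelope sketch of the effect (the column count and comment size below are hypothetical; only the "4K+" partition count comes from the report):

```java
// Rough sketch of why get_partitions payloads balloon: each partition's
// StorageDescriptor carries its own copy of every column's comment, so the
// same comment bytes are serialized once per partition.
public class PayloadSketch {
    public static void main(String[] args) {
        long partitions = 4000;   // "4K+" partitions, as in the report
        long columns = 50;        // hypothetical column count
        long commentBytes = 200;  // hypothetical average comment length

        // Bytes spent on comments when duplicated per partition...
        long duplicated = partitions * columns * commentBytes; // 40,000,000 (~38 MB)
        // ...versus a single table-level copy.
        long single = columns * commentBytes;                  // 10,000

        System.out.println("duplicated comment bytes: " + duplicated);
        System.out.println("single-copy comment bytes: " + single);
    }
}
```

Under these assumed sizes, comments alone account for tens of megabytes of redundant payload, which is consistent with the >256 MB responses described below once the rest of the per-partition metadata is included.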
Vlad Gudikov edited comment on HIVE-16674 at 5/18/17 1:57 PM:
--------------------------------------------------------------
Most of the RPC calls in the MetaStore carry fairly small payloads, but in this case the get_partitions call returns more than 256 MB of data. That is because it fetches all partition information, including column-level comments, which are duplicated for every partition. Here is the code where we fetch the column comments. Do we actually need them when getting partition information?
{code}
// Get FieldSchema stuff if any.
if (!colss.isEmpty()) {
  // We are skipping the CDS table here, as it seems to be totally useless.
  queryText = "select \"CD_ID\", \"COMMENT\", \"COLUMN_NAME\", \"TYPE_NAME\""
      + " from \"COLUMNS_V2\" where \"CD_ID\" in (" + colIds + ") and \"INTEGER_IDX\" >= 0"
      + " order by \"CD_ID\" asc, \"INTEGER_IDX\" asc";
  loopJoinOrderedResult(colss, queryText, 0, new ApplyFunc<List<FieldSchema>>() {
    @Override
    public void apply(List<FieldSchema> t, Object[] fields) {
      t.add(new FieldSchema((String) fields[2], (String) fields[3], (String) fields[1]));
    }});
}
{code}

> Hive metastore JVM dumps core
> -----------------------------
>
> Key: HIVE-16674
> URL: https://issues.apache.org/jira/browse/HIVE-16674
> Project: Hive
> Issue Type: Bug
> Affects Versions: 1.2.1
> Environment: Hive-1.2.1
> Kerberos enabled cluster
> Reporter: Vlad Gudikov
> Priority: Blocker
> Fix For: 1.2.1, 2.3.0
>
> While trying to run a Hive query on 24 partitions of an external table with a large number of partitions (4K+), I get an error:
> {code}
> - org.apache.thrift.transport.TSaslTransport$SaslParticipant.wrap(byte[], int, int) @bci=27, line=568 (Compiled frame)
> - org.apache.thrift.transport.TSaslTransport.flush() @bci=52, line=492 (Compiled frame)
> - org.apache.thrift.transport.TSaslServerTransport.flush() @bci=1, line=41 (Compiled frame)
> - org.apache.thrift.ProcessFunction.process(int, org.apache.thrift.protocol.TProtocol, org.apache.thrift.protocol.TProtocol, java.lang.Object) @bci=236, line=55 (Compiled frame)
> - org.apache.thrift.TBaseProcessor.process(org.apache.thrift.protocol.TProtocol, org.apache.thrift.protocol.TProtocol) @bci=126, line=39 (Compiled frame)
> - org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run() @bci=15, line=690 (Compiled frame)
> - org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run() @bci=1, line=685 (Compiled frame)
> - java.security.AccessController.doPrivileged(java.security.PrivilegedExceptionAction, java.security.AccessControlContext)
> @bci=0 (Compiled frame)
> - javax.security.auth.Subject.doAs(javax.security.auth.Subject, java.security.PrivilegedExceptionAction) @bci=42, line=422 (Compiled frame)
> - org.apache.hadoop.security.UserGroupInformation.doAs(java.security.PrivilegedExceptionAction) @bci=14, line=1595 (Compiled frame)
> - org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(org.apache.thrift.protocol.TProtocol, org.apache.thrift.protocol.TProtocol) @bci=273, line=685 (Compiled frame)
> - org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run() @bci=151, line=285 (Interpreted frame)
> - java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) @bci=95, line=1142 (Interpreted frame)
> - java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5, line=617 (Interpreted frame)
> - java.lang.Thread.run() @bci=11, line=745 (Interpreted frame)
> {code}

-- This message was sent by Atlassian JIRA (v6.3.15#6346)