[jira] [Commented] (HIVE-2941) Hive should expand nested structs when setting the table schema from thrift structs
[ https://issues.apache.org/jira/browse/HIVE-2941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252649#comment-13252649 ]

Travis Crawford commented on HIVE-2941:
---------------------------------------

Here are some additional details about the issue. Consider the following create table statement. Columns will be discovered for the table by reflecting on the {{Person}} object (instead of explicitly specifying them).

{code}
hive> create external table travis_test.person_test
  partitioned by (part_dt string)
  row format serde "com.twitter.elephantbird.hive.serde.ThriftSerDe"
  with serdeproperties ("serialization.class"="com.twitter.elephantbird.examples.thrift.Person")
  stored as
    inputformat "com.twitter.elephantbird.mapred.input.HiveMultiInputFormat"
    outputformat "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat";
{code}

Current behavior does not expand nested structures, listing the class name of nested structs as the field type. Users browsing the schema do not get a full definition of the table schema.

{code}
hive> describe extended person_test;
OK
name     com.twitter.elephantbird.examples.thrift.Name                 from deserializer
id       int                                                           from deserializer
email    string                                                        from deserializer
phones   array<com.twitter.elephantbird.examples.thrift.PhoneNumber>   from deserializer
part_dt  string
{code}

This patch expands nested structures, showing the full table schema. Here's an example of what the table looks like with the patch:

{code}
hive> describe extended person_test;
OK
name     struct<first_name:string,last_name:string>             from deserializer
id       int                                                    from deserializer
email    string                                                 from deserializer
phones   array<struct<number:string,type:struct<value:int>>>    from deserializer
part_dt  string
{code}

In both cases, the table storage descriptor is unchanged - both list the columns as {{cols:[]}}. I believe the reflected table schema should be copied into the partition storage descriptor when adding a new partition, but that could be a separate change.
Hive should expand nested structs when setting the table schema from thrift structs
-----------------------------------------------------------------------------------

Key: HIVE-2941
URL: https://issues.apache.org/jira/browse/HIVE-2941
Project: Hive
Issue Type: Bug
Reporter: Travis Crawford
Assignee: Travis Crawford
Attachments: HIVE-2941.D2721.1.patch

When setting a table serde, the deserializer is queried for its schema, which is used to set the metastore table schema. The current implementation uses the class name stored in the field as the field type.

By storing the class name as the field type, users cannot see the contents of a struct with {{describe tblname}}. Applications that query HiveMetaStore for the table schema (specifically HCatalog in this case) see an unknown field type, rather than a struct containing known field types.

Hive should store the expanded schema in the metastore so users browsing the schema see expanded fields, and applications querying metastore see familiar types.

DETAILS

Set the table serde to something like this. This serde uses the built-in {{ThriftStructObjectInspector}}.

{code}
alter table foo_test
  set serde "com.twitter.elephantbird.hive.serde.ThriftSerDe"
  with serdeproperties ("serialization.class"="com.foo.Foo");
{code}

This causes a call to {{MetaStoreUtils.getFieldsFromDeserializer}} which returns a list of fields and their schemas. However, currently it does not handle nested structs, and if {{com.foo.Foo}} above contains a field {{com.foo.Bar}}, the class name {{com.foo.Bar}} would appear as the field type. Instead, nested structs should be expanded.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
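To make the requested behavior concrete, here is a minimal sketch of recursive type expansion. It assumes a tiny, hypothetical type model ({{SchemaSketch}} and its nested classes); it is not Hive's ObjectInspector API and not the code in the attached patch. It only illustrates how a nested struct can be rendered as a full type string instead of a class name.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified type model for illustration only.
// These classes are NOT Hive's ObjectInspector classes.
final class SchemaSketch {

    abstract static class FieldType {
        // Returns the Hive-style type string for this type.
        abstract String typeName();
    }

    static final class PrimitiveType extends FieldType {
        private final String name;
        PrimitiveType(String name) { this.name = name; }
        @Override String typeName() { return name; }
    }

    static final class ListType extends FieldType {
        private final FieldType element;
        ListType(FieldType element) { this.element = element; }
        // Recurse into the element type rather than printing its class name.
        @Override String typeName() { return "array<" + element.typeName() + ">"; }
    }

    static final class StructType extends FieldType {
        // LinkedHashMap keeps fields in declaration order.
        private final Map<String, FieldType> fields = new LinkedHashMap<>();

        StructType field(String name, FieldType type) {
            fields.put(name, type);
            return this;
        }

        // Recurse into every field so nested structs are fully expanded.
        @Override String typeName() {
            List<String> parts = new ArrayList<>();
            for (Map.Entry<String, FieldType> e : fields.entrySet()) {
                parts.add(e.getKey() + ":" + e.getValue().typeName());
            }
            return "struct<" + String.join(",", parts) + ">";
        }
    }
}
```

Building a {{StructType}} with {{first_name}} and {{last_name}} string fields and calling {{typeName()}} produces the same shape of type string as the {{name}} column in the patched describe output.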
[jira] [Commented] (HIVE-2767) Optionally use framed transport with metastore
[ https://issues.apache.org/jira/browse/HIVE-2767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250707#comment-13250707 ]

Travis Crawford commented on HIVE-2767:
---------------------------------------

@ashutosh - Yeah, this test does seem problematic. It works fine in IntelliJ, but I can't get it to pass with the command you gave, even when doing a clean trunk build. Looking at Jenkins the test works fine, so perhaps it's something to do with my machine (OS X 10.7.3).

I'm looking into why the test doesn't work on trunk, then will see if this change affects it.

Optionally use framed transport with metastore
----------------------------------------------

Key: HIVE-2767
URL: https://issues.apache.org/jira/browse/HIVE-2767
Project: Hive
Issue Type: New Feature
Components: Metastore
Reporter: Travis Crawford
Assignee: Travis Crawford
Attachments: HIVE-2767.D2661.1.patch, HIVE-2767.D2661.2.patch, HIVE-2767.D2661.3.patch, HIVE-2767.patch.txt, HIVE-2767_a.patch.txt

Users may want/need to use thrift's framed transport when communicating with the Hive MetaStore. This patch adds a new property {{hive.metastore.thrift.framed.transport.enabled}} that enables the framed transport (defaults to off, i.e. no change from before the patch). This property must be set for both clients and the HMS server.

It wasn't immediately clear how to use the framed transport with SASL, so as written an exception is thrown if you try starting the server with both options. If SASL and the framed transport will indeed work together I can update the patch (although I don't have a secured environment to test in).
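The startup check described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in the attached patches: {{MetastoreTransportConfig}} and {{useFramedTransport}} are hypothetical names, configuration is modeled as a plain map, and the SASL property name {{hive.metastore.sasl.enabled}} is assumed rather than taken from this issue.

```java
import java.util.Map;

// Hypothetical helper for illustration; not part of Hive or the patch.
final class MetastoreTransportConfig {
    // Property name taken from the issue description.
    static final String FRAMED = "hive.metastore.thrift.framed.transport.enabled";
    // Assumed name of the SASL switch; verify against your Hive version.
    static final String SASL = "hive.metastore.sasl.enabled";

    // Returns true when the framed transport should be used.
    // Throws if framed transport and SASL are both requested, mirroring
    // the behavior described in the issue.
    static boolean useFramedTransport(Map<String, String> conf) {
        // Both switches default to off, so existing setups are unaffected.
        boolean framed = Boolean.parseBoolean(conf.getOrDefault(FRAMED, "false"));
        boolean sasl = Boolean.parseBoolean(conf.getOrDefault(SASL, "false"));
        if (framed && sasl) {
            throw new IllegalArgumentException(
                "Framed transport cannot be combined with SASL: unset " + FRAMED + " or " + SASL);
        }
        return framed;
    }
}
```

Because the property must match on both sides, a client and the server would each run the same check against their own configuration before choosing a transport.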
[jira] [Commented] (HIVE-2767) Optionally use framed transport with metastore
[ https://issues.apache.org/jira/browse/HIVE-2767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250718#comment-13250718 ]

Travis Crawford commented on HIVE-2767:
---------------------------------------

Cool - thanks for the pointer! I'll watch that issue and afterwards rebase if necessary and update.

Optionally use framed transport with metastore
----------------------------------------------

Key: HIVE-2767
URL: https://issues.apache.org/jira/browse/HIVE-2767
Project: Hive
Issue Type: New Feature
Components: Metastore
Reporter: Travis Crawford
Assignee: Travis Crawford
Attachments: HIVE-2767.D2661.1.patch, HIVE-2767.D2661.2.patch, HIVE-2767.D2661.3.patch, HIVE-2767.patch.txt, HIVE-2767_a.patch.txt
[jira] [Commented] (HIVE-2883) Metastore client doesnt close connection properly
[ https://issues.apache.org/jira/browse/HIVE-2883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250953#comment-13250953 ]

Travis Crawford commented on HIVE-2883:
---------------------------------------

I patched this in and ran a Pig query through HCatalog. The "Unable to shutdown local metastore client" error went away, and the query produced a correct result. LGTM.

Metastore client doesnt close connection properly
-------------------------------------------------

Key: HIVE-2883
URL: https://issues.apache.org/jira/browse/HIVE-2883
Project: Hive
Issue Type: Bug
Components: Metastore
Affects Versions: 0.9.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
Fix For: 0.9.0
Attachments: HIVE-2883.D2613.1.patch

While closing the connection, it always fails with the following trace. Seemingly, it doesn't have any harmful effects.

{code}
12/03/20 10:55:02 ERROR hive.metastore: Unable to shutdown local metastore client
org.apache.thrift.transport.TTransportException: Cannot write to null outputStream
    at org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:142)
    at org.apache.thrift.protocol.TBinaryProtocol.writeI32(TBinaryProtocol.java:163)
    at org.apache.thrift.protocol.TBinaryProtocol.writeMessageBegin(TBinaryProtocol.java:91)
    at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:62)
    at com.facebook.fb303.FacebookService$Client.send_shutdown(FacebookService.java:421)
    at com.facebook.fb303.FacebookService$Client.shutdown(FacebookService.java:415)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.close(HiveMetaStoreClient.java:310)
{code}
[jira] [Commented] (HIVE-2883) Metastore client doesnt close connection properly
[ https://issues.apache.org/jira/browse/HIVE-2883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13246636#comment-13246636 ]

Travis Crawford commented on HIVE-2883:
---------------------------------------

@ashutosh - fb303 is released as a thrift contrib project: http://svn.apache.org/viewvc/thrift/trunk/contrib/fb303/

Metastore client doesnt close connection properly
-------------------------------------------------

Key: HIVE-2883
URL: https://issues.apache.org/jira/browse/HIVE-2883
Project: Hive
Issue Type: Bug
Components: Metastore
Affects Versions: 0.9.0
Reporter: Ashutosh Chauhan
Fix For: 0.9.0
[jira] [Commented] (HIVE-2609) NPE when pruning partitions by thrift method get_partitions_by_filter
[ https://issues.apache.org/jira/browse/HIVE-2609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228911#comment-13228911 ]

Travis Crawford commented on HIVE-2609:
---------------------------------------

I ran into this today too and, in addition to updating the two jars Thomas mentioned, also had to update:

https://github.com/apache/hive/blob/trunk/metastore/src/model/package.jdo#L49

In our hive tables the column is named COMMENT - not FCOMMENT. Without updating datanucleus things work fine, but this change is required when updating jars. I don't yet understand the reason for the change in behavior, though.

NPE when pruning partitions by thrift method get_partitions_by_filter
---------------------------------------------------------------------

Key: HIVE-2609
URL: https://issues.apache.org/jira/browse/HIVE-2609
Project: Hive
Issue Type: Bug
Components: Metastore
Affects Versions: 0.7.1
Reporter: Min Zhou

It's indeed a datanucleus bug. Try this code:

{code}
// Try to open the thrift transport up to five times, one second apart.
boolean open = false;
for (int i = 0; i < 5 && !open; ++i) {
  try {
    transport.open();
    open = true;
  } catch (TTransportException e) {
    System.out.println("failed to connect to MetaStore, re-trying...");
    try {
      Thread.sleep(1000);
    } catch (InterruptedException ignore) {}
  }
}

// Fetch the partitions matching the filter; this call triggers the NPE on the server.
try {
  List<Partition> parts =
      client.get_partitions_by_filter("default", "partitioned_nation", "pt < '2'", (short) -1);
  for (Partition part : parts) {
    System.out.println(part.getSd().getLocation());
  }
} catch (Exception te) {
  te.printStackTrace();
}
{code}

An NPE would be thrown on the thrift server side:

{noformat}
11/11/25 13:11:55 ERROR api.ThriftHiveMetastore$Processor: Internal error processing get_partitions_by_filter
java.lang.NullPointerException
    at org.datanucleus.store.mapped.mapping.MappingHelper.getMappingIndices(MappingHelper.java:35)
    at org.datanucleus.store.mapped.expression.StatementText.applyParametersToStatement(StatementText.java:194)
    at org.datanucleus.store.rdbms.query.RDBMSQueryUtils.getPreparedStatementForQuery(RDBMSQueryUtils.java:233)
    at org.datanucleus.store.rdbms.query.legacy.SQLEvaluator.evaluate(SQLEvaluator.java:115)
    at org.datanucleus.store.rdbms.query.legacy.JDOQLQuery.performExecute(JDOQLQuery.java:288)
    at org.datanucleus.store.query.Query.executeQuery(Query.java:1657)
    at org.datanucleus.store.rdbms.query.legacy.JDOQLQuery.executeQuery(JDOQLQuery.java:245)
    at org.datanucleus.store.query.Query.executeWithMap(Query.java:1526)
    at org.datanucleus.jdo.JDOQuery.executeWithMap(JDOQuery.java:334)
    at org.apache.hadoop.hive.metastore.ObjectStore.listMPartitionsByFilter(ObjectStore.java:1329)
    at org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilter(ObjectStore.java:1241)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler$40.run(HiveMetaStore.java:2369)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler$40.run(HiveMetaStore.java:2366)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.executeWithRetry(HiveMetaStore.java:307)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions_by_filter(HiveMetaStore.java:2366)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_partitions_by_filter.process(ThriftHiveMetastore.java:6099)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor.process(ThriftHiveMetastore.java:4789)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$TLoggingProcessor.process(HiveMetaStore.java:3167)
    at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:253)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
{noformat}

A null JavaTypeMapping was passed into org.datanucleus.store.mapped.mapping.MappingHelper.getMappingIndices(int initialPosition, JavaTypeMapping mapping), which caused the NPE.

After digging into the datanucleus source, I found that the null value originates in the constructor of org.datanucleus.store.mapped.expression.SubstringExpression. See:

{code}
/**
 * Constructs the substring
 * @param str the String Expression
 * @param begin The start position
 * @param end The end position expression
 **/
public SubstringExpression(StringExpression str, NumericExpression begin, NumericExpression end) {
  super(str.getQueryExpression());
  st.append("SUBSTRING(").append(str).append(" FROM ")
    .append(begin.add(new IntegerLiteral(qs, mapping,
[jira] [Commented] (HIVE-2767) Optionally use framed transport with metastore
[ https://issues.apache.org/jira/browse/HIVE-2767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13217694#comment-13217694 ]

Travis Crawford commented on HIVE-2767:
---------------------------------------

Thanks for the feedback Ashutosh - if this is something you'd consider adding, I can update the patch against current trunk if things have changed.

While there may be perf gains, this is needed to integrate with our compute grid. Basically a wrapper registers the metastore host:port in ZooKeeper, and a thrift framed-transport-only proxy routes metastore requests based on its ZK registration. This lets us launch the metastore on our Mesos cluster and still give clients the hard-coded host:port they expect.

Optionally use framed transport with metastore
----------------------------------------------

Key: HIVE-2767
URL: https://issues.apache.org/jira/browse/HIVE-2767
Project: Hive
Issue Type: New Feature
Components: Metastore
Reporter: Travis Crawford
Assignee: Travis Crawford
Attachments: HIVE-2767.patch.txt
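For readers unfamiliar with what "framed" means here: Thrift's framed transport prefixes each message with a 4-byte big-endian length, which lets an intermediary find message boundaries without decoding the payload. The sketch below illustrates only that wire layout. {{FrameSketch}} is a hypothetical class written for this note, not Thrift's {{TFramedTransport}} and not the proxy mentioned in the comment.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration of length-prefixed framing; not Thrift code.
final class FrameSketch {

    // Prefixes the payload with its length as a 4-byte big-endian integer.
    // ByteBuffer uses big-endian byte order by default.
    static byte[] frame(byte[] payload) {
        return ByteBuffer.allocate(4 + payload.length)
                .putInt(payload.length)
                .put(payload)
                .array();
    }

    // Reads the 4-byte length prefix and returns exactly that many payload bytes.
    static byte[] unframe(byte[] framed) {
        ByteBuffer buf = ByteBuffer.wrap(framed);
        int length = buf.getInt();
        if (length < 0 || length > buf.remaining()) {
            throw new IllegalArgumentException("Invalid frame length: " + length);
        }
        byte[] payload = new byte[length];
        buf.get(payload);
        return payload;
    }
}
```

A client that speaks the unframed transport omits the length prefix, which is presumably why the property has to match on both ends of the connection.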