[jira] [Commented] (HIVE-2941) Hive should expand nested structs when setting the table schema from thrift structs

2012-04-12 Thread Travis Crawford (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252649#comment-13252649
 ] 

Travis Crawford commented on HIVE-2941:
---

Here are some additional details about the issue. Consider the following create 
table statement. Columns will be discovered for the table by reflecting on the 
{{Person}} object (instead of explicitly specifying them).

{code}
hive> create external table travis_test.person_test
  partitioned by (part_dt string)
  row format serde 'com.twitter.elephantbird.hive.serde.ThriftSerDe'
  with serdeproperties ('serialization.class'='com.twitter.elephantbird.examples.thrift.Person')
  stored as
    inputformat 'com.twitter.elephantbird.mapred.input.HiveMultiInputFormat'
    outputformat 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat';
{code}

The current behavior does not expand nested structures; it lists the class name 
of each nested struct as the field type. As a result, users browsing the schema 
do not get a full definition of the table schema.

{code}
hive> describe extended person_test;
OK
name    com.twitter.elephantbird.examples.thrift.Name   from deserializer
id      int     from deserializer
email   string  from deserializer
phones  array<com.twitter.elephantbird.examples.thrift.PhoneNumber>     from deserializer
part_dt string
{code}

This patch expands nested structures, showing the full table schema. Here's an 
example of what the table looks like with the patch:

{code}
hive> describe extended person_test;
OK
name    struct<first_name:string,last_name:string>      from deserializer
id      int     from deserializer
email   string  from deserializer
phones  array<struct<number:string,type:struct<value:int>>>     from deserializer
part_dt string
{code}

In both cases, the table storage descriptor is unchanged - both list the 
columns as {{cols:[]}}.

I believe the reflected table schema should be copied into the partition 
storage descriptor when adding a new partition, but that could be a separate 
change.

 Hive should expand nested structs when setting the table schema from thrift 
 structs
 ---

 Key: HIVE-2941
 URL: https://issues.apache.org/jira/browse/HIVE-2941
 Project: Hive
  Issue Type: Bug
Reporter: Travis Crawford
Assignee: Travis Crawford
 Attachments: HIVE-2941.D2721.1.patch


 When setting a table serde, the deserializer is queried for its schema, which 
 is used to set the metastore table schema. The current implementation uses 
 the class name stored in the field as the field type.
 By storing the class name as the field type, users cannot see the contents of 
 a struct with describe tblname. Applications that query HiveMetaStore for 
 the table schema (specifically HCatalog in this case) see an unknown field 
 type, rather than a struct containing known field types.
 Hive should store the expanded schema in the metastore so users browsing the 
 schema see expanded fields, and applications querying metastore see familiar 
 types.
 DETAILS
 Set the table serde to something like this. This serde uses the built-in 
 {{ThriftStructObjectInspector}}.
 {code}
 alter table foo_test
   set serde 'com.twitter.elephantbird.hive.serde.ThriftSerDe'
   with serdeproperties ('serialization.class'='com.foo.Foo');
 {code}
 This causes a call to {{MetaStoreUtils.getFieldsFromDeserializer}} which 
 returns a list of fields and their schemas. However, currently it does not 
 handle nested structs, and if {{com.foo.Foo}} above contains a field 
 {{com.foo.Bar}}, the class name {{com.foo.Bar}} would appear as the field 
 type. Instead, nested structs should be expanded.
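
 To illustrate the kind of expansion being asked for, here is a minimal, 
 self-contained sketch. It is not the code in the attached patch and does not 
 use Hive's ObjectInspector API; it only uses plain Java reflection. The class 
 {{NestedTypeSketch}}, the method {{typeName}}, and the {{Name}} and {{Person}} 
 classes are hypothetical stand-ins for thrift-generated classes. The point is 
 that a nested class is rendered as {{struct<...>}} by recursing into its 
 fields rather than by printing its class name.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.List;
import java.util.StringJoiner;

public class NestedTypeSketch {
    // Hypothetical stand-ins for thrift-generated classes (assumption).
    public static class Name {
        public String first_name;
        public String last_name;
    }

    public static class Person {
        public Name name;
        public int id;
        public List<Name> aliases;
    }

    /** Returns a Hive-style type string, recursing into nested structs and lists. */
    public static String typeName(Type type) {
        if (type == int.class || type == Integer.class) {
            return "int";
        }
        if (type == String.class) {
            return "string";
        }
        if (type instanceof ParameterizedType) {
            ParameterizedType p = (ParameterizedType) type;
            if (p.getRawType() == List.class) {
                // Expand the element type instead of printing its class name.
                return "array<" + typeName(p.getActualTypeArguments()[0]) + ">";
            }
        }
        if (type instanceof Class) {
            // Treat any other class as a struct and expand its public instance fields.
            StringJoiner fields = new StringJoiner(",", "struct<", ">");
            for (Field f : ((Class<?>) type).getFields()) {
                if (!Modifier.isStatic(f.getModifiers())) {
                    fields.add(f.getName() + ":" + typeName(f.getGenericType()));
                }
            }
            return fields.toString();
        }
        throw new IllegalArgumentException("unsupported type: " + type);
    }

    public static void main(String[] args) {
        System.out.println(typeName(Person.class));
    }
}
```

 Two caveats about the sketch: the order of fields follows whatever order Java 
 reflection returns, which the JVM does not guarantee, and a self-referential 
 type would recurse forever, so a real implementation would need to guard 
 against cycles.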

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-2767) Optionally use framed transport with metastore

2012-04-10 Thread Travis Crawford (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250707#comment-13250707
 ] 

Travis Crawford commented on HIVE-2767:
---

@ashutosh - Yeah, this test does seem problematic. It works fine in IntelliJ but 
I can't get it to pass with the command you gave, even when doing a clean trunk 
build. Looking at Jenkins the test works fine, so perhaps it's something to do 
with my machine (OS X 10.7.3).

I'm looking into why the test doesn't work on trunk, and will then see whether 
this change affects it.

 Optionally use framed transport with metastore
 --

 Key: HIVE-2767
 URL: https://issues.apache.org/jira/browse/HIVE-2767
 Project: Hive
  Issue Type: New Feature
  Components: Metastore
Reporter: Travis Crawford
Assignee: Travis Crawford
 Attachments: HIVE-2767.D2661.1.patch, HIVE-2767.D2661.2.patch, 
 HIVE-2767.D2661.3.patch, HIVE-2767.patch.txt, HIVE-2767_a.patch.txt


 Users may want/need to use thrift's framed transport when communicating with 
 the Hive MetaStore. This patch adds a new property 
 {{hive.metastore.thrift.framed.transport.enabled}} that enables the framed 
 transport (defaults to off, aka no change from before the patch). This 
 property must be set for both clients and the HMS server.
 It wasn't immediately clear how to use the framed transport with SASL, so as 
 written an exception is thrown if you try starting the server with both 
 options. If SASL and the framed transport will indeed work together I can 
 update the patch (although I don't have a secured environment to test in).
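
 For readers unfamiliar with the framed transport: it sends each message as a 
 4-byte big-endian length followed by the payload, which is why both the 
 client and the server must agree on whether it is enabled. The following is a 
 minimal sketch of that framing in plain Java. It does not use the Thrift 
 library or anything from the patch; the class {{FramingSketch}} and its 
 {{frame}} and {{unframe}} methods are hypothetical names chosen for 
 illustration.

```java
import java.nio.ByteBuffer;

public class FramingSketch {
    /** Prefixes the payload with its length as a 4-byte big-endian integer. */
    public static byte[] frame(byte[] payload) {
        // ByteBuffer writes multi-byte values big-endian by default.
        return ByteBuffer.allocate(4 + payload.length)
                .putInt(payload.length)
                .put(payload)
                .array();
    }

    /** Reads the 4-byte length prefix and returns exactly that many payload bytes. */
    public static byte[] unframe(byte[] framed) {
        ByteBuffer buf = ByteBuffer.wrap(framed);
        int length = buf.getInt();
        if (length < 0 || length > buf.remaining()) {
            throw new IllegalArgumentException("bad frame length: " + length);
        }
        byte[] payload = new byte[length];
        buf.get(payload);
        return payload;
    }

    public static void main(String[] args) {
        byte[] framed = frame(new byte[] {1, 2, 3});
        System.out.println(framed.length + " bytes on the wire");
    }
}
```

 A peer that is not expecting frames would read the length prefix as the start 
 of the message itself, so a mismatch between client and server settings 
 fails rather than degrading gracefully.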





[jira] [Commented] (HIVE-2767) Optionally use framed transport with metastore

2012-04-10 Thread Travis Crawford (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250718#comment-13250718
 ] 

Travis Crawford commented on HIVE-2767:
---

Cool - thanks for the pointer! I'll watch that issue and afterwards rebase if 
necessary and update.





[jira] [Commented] (HIVE-2883) Metastore client doesnt close connection properly

2012-04-10 Thread Travis Crawford (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250953#comment-13250953
 ] 

Travis Crawford commented on HIVE-2883:
---

I patched this in and ran a Pig query through HCatalog. The "Unable to 
shutdown local metastore client" error went away, and the query produced a 
correct result. LGTM.

 Metastore client doesnt close connection properly
 -

 Key: HIVE-2883
 URL: https://issues.apache.org/jira/browse/HIVE-2883
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.9.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Fix For: 0.9.0

 Attachments: HIVE-2883.D2613.1.patch


 While closing the connection, it always fails with the following trace. 
 Seemingly, it doesn't have any harmful effects.
 {code}
 12/03/20 10:55:02 ERROR hive.metastore: Unable to shutdown local metastore client
 org.apache.thrift.transport.TTransportException: Cannot write to null outputStream
   at org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:142)
   at org.apache.thrift.protocol.TBinaryProtocol.writeI32(TBinaryProtocol.java:163)
   at org.apache.thrift.protocol.TBinaryProtocol.writeMessageBegin(TBinaryProtocol.java:91)
   at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:62)
   at com.facebook.fb303.FacebookService$Client.send_shutdown(FacebookService.java:421)
   at com.facebook.fb303.FacebookService$Client.shutdown(FacebookService.java:415)
   at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.close(HiveMetaStoreClient.java:310)
 {code}





[jira] [Commented] (HIVE-2883) Metastore client doesnt close connection properly

2012-04-04 Thread Travis Crawford (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13246636#comment-13246636
 ] 

Travis Crawford commented on HIVE-2883:
---

@ashutosh - fb303 is released as a thrift contrib project:

http://svn.apache.org/viewvc/thrift/trunk/contrib/fb303/

 Metastore client doesnt close connection properly
 -

 Key: HIVE-2883
 URL: https://issues.apache.org/jira/browse/HIVE-2883
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.9.0
Reporter: Ashutosh Chauhan
 Fix For: 0.9.0


 While closing the connection, it always fails with the following trace. 
 Seemingly, it doesn't have any harmful effects.
 {code}
 12/03/20 10:55:02 ERROR hive.metastore: Unable to shutdown local metastore client
 org.apache.thrift.transport.TTransportException: Cannot write to null outputStream
   at org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:142)
   at org.apache.thrift.protocol.TBinaryProtocol.writeI32(TBinaryProtocol.java:163)
   at org.apache.thrift.protocol.TBinaryProtocol.writeMessageBegin(TBinaryProtocol.java:91)
   at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:62)
   at com.facebook.fb303.FacebookService$Client.send_shutdown(FacebookService.java:421)
   at com.facebook.fb303.FacebookService$Client.shutdown(FacebookService.java:415)
   at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.close(HiveMetaStoreClient.java:310)
 {code}





[jira] [Commented] (HIVE-2609) NPE when pruning partitions by thrift method get_partitions_by_filter

2012-03-13 Thread Travis Crawford (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228911#comment-13228911
 ] 

Travis Crawford commented on HIVE-2609:
---

I ran into this today too and, in addition to updating the two jars Thomas 
mentioned, also had to update:

https://github.com/apache/hive/blob/trunk/metastore/src/model/package.jdo#L49

In our Hive tables the column is named COMMENT, not FCOMMENT. Without 
updating DataNucleus things work fine, but this change is required when 
updating the jars. I don't yet understand why the behavior changed, though.

 NPE when pruning partitions by thrift method get_partitions_by_filter
 -

 Key: HIVE-2609
 URL: https://issues.apache.org/jira/browse/HIVE-2609
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.7.1
Reporter: Min Zhou

 It's a DataNucleus bug indeed. 
 Try this code:
 {code}
 boolean open = false;
 // Try to connect up to five times, pausing one second between attempts.
 for (int i = 0; i < 5 && !open; ++i) {
   try {
     transport.open();
     open = true;
   } catch (TTransportException e) {
     System.out.println("failed to connect to MetaStore, re-trying...");
     try {
       Thread.sleep(1000);
     } catch (InterruptedException ignore) {}
   }
 }
 try {
   // The comparison operator in the filter string was lost from the
   // archived text; "<" is assumed here.
   List<Partition> parts =
       client.get_partitions_by_filter("default", "partitioned_nation",
           "pt < '2'", (short) -1);
   for (Partition part : parts) {
     System.out.println(part.getSd().getLocation());
   }
 } catch (Exception te) {
   te.printStackTrace();
 }
 {code}
 An NPE would be thrown on the Thrift server side:
 {noformat}
 11/11/25 13:11:55 ERROR api.ThriftHiveMetastore$Processor: Internal error processing get_partitions_by_filter
 java.lang.NullPointerException
   at org.datanucleus.store.mapped.mapping.MappingHelper.getMappingIndices(MappingHelper.java:35)
   at org.datanucleus.store.mapped.expression.StatementText.applyParametersToStatement(StatementText.java:194)
   at org.datanucleus.store.rdbms.query.RDBMSQueryUtils.getPreparedStatementForQuery(RDBMSQueryUtils.java:233)
   at org.datanucleus.store.rdbms.query.legacy.SQLEvaluator.evaluate(SQLEvaluator.java:115)
   at org.datanucleus.store.rdbms.query.legacy.JDOQLQuery.performExecute(JDOQLQuery.java:288)
   at org.datanucleus.store.query.Query.executeQuery(Query.java:1657)
   at org.datanucleus.store.rdbms.query.legacy.JDOQLQuery.executeQuery(JDOQLQuery.java:245)
   at org.datanucleus.store.query.Query.executeWithMap(Query.java:1526)
   at org.datanucleus.jdo.JDOQuery.executeWithMap(JDOQuery.java:334)
   at org.apache.hadoop.hive.metastore.ObjectStore.listMPartitionsByFilter(ObjectStore.java:1329)
   at org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilter(ObjectStore.java:1241)
   at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler$40.run(HiveMetaStore.java:2369)
   at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler$40.run(HiveMetaStore.java:2366)
   at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.executeWithRetry(HiveMetaStore.java:307)
   at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions_by_filter(HiveMetaStore.java:2366)
   at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_partitions_by_filter.process(ThriftHiveMetastore.java:6099)
   at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor.process(ThriftHiveMetastore.java:4789)
   at org.apache.hadoop.hive.metastore.HiveMetaStore$TLoggingProcessor.process(HiveMetaStore.java:3167)
   at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:253)
   at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:662)
 {noformat}
 A null JavaTypeMapping was passed into 
 org.datanucleus.store.mapped.mapping.MappingHelper.getMappingIndices(int initialPosition, 
 JavaTypeMapping mapping), which caused the NPE.
 After digging into the DataNucleus source, I found that the null value 
 originates in the constructor of 
 org.datanucleus.store.mapped.expression.SubstringExpression. See:
 {code}
 /**
  * Constructs the substring
  * @param str the String Expression
  * @param begin The start position
  * @param end The end position expression
  **/   
 public SubstringExpression(StringExpression str, NumericExpression begin, NumericExpression end)
 {
     super(str.getQueryExpression());
     st.append("SUBSTRING(").append(str).append(" FROM ")
 .append(begin.add(new IntegerLiteral(qs, mapping, 
 

[jira] [Commented] (HIVE-2767) Optionally use framed transport with metastore

2012-02-27 Thread Travis Crawford (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13217694#comment-13217694
 ] 

Travis Crawford commented on HIVE-2767:
---

Thanks for the feedback Ashutosh - if this is something you'd consider adding I 
can update based on current trunk if things have changed.

While there may be perf gains, this is needed to integrate with our compute 
grid. Basically a wrapper registers the metastore host:port in zookeeper, and a 
thrift framed-transport-only proxy proxies metastore requests based on its ZK 
registration. This lets us launch the metastore on our mesos cluster and still 
give clients the hard-coded host:port they expect.

 Optionally use framed transport with metastore
 --

 Key: HIVE-2767
 URL: https://issues.apache.org/jira/browse/HIVE-2767
 Project: Hive
  Issue Type: New Feature
  Components: Metastore
Reporter: Travis Crawford
Assignee: Travis Crawford
 Attachments: HIVE-2767.patch.txt


 Users may want/need to use thrift's framed transport when communicating with 
 the Hive MetaStore. This patch adds a new property 
 {{hive.metastore.thrift.framed.transport.enabled}} that enables the framed 
 transport (defaults to off, aka no change from before the patch). This 
 property must be set for both clients and the HMS server.
 It wasn't immediately clear how to use the framed transport with SASL, so as 
 written an exception is thrown if you try starting the server with both 
 options. If SASL and the framed transport will indeed work together I can 
 update the patch (although I don't have a secured environment to test in).
