[jira] [Commented] (HIVE-3798) Can't escape reserved keywords used as table names

2015-01-21 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14286335#comment-14286335
 ] 

Jakob Homan commented on HIVE-3798:
---

I don't currently have a Hive 0.14 install to test this, but I'm happy it's fixed.
Thanks.

> Can't escape reserved keywords used as table names
> --
>
> Key: HIVE-3798
> URL: https://issues.apache.org/jira/browse/HIVE-3798
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Reporter: Jakob Homan
>Assignee: Jakob Homan
> Fix For: 0.14.0
>
>
> {noformat}hive (some_table)> show tables;
> OK
> ...
> comment
> ...
> Time taken: 0.076 seconds
> hive (some_table)> describe comment;
> FAILED: Parse Error: line 1:0 cannot recognize input near 'describe' 
> 'comment' '' in describe statement
> hive (some_table)> describe `comment`; 
> OK
> Table `comment` does not exist 
> Time taken: 0.042 seconds
> {noformat}
> Describe should honor character escaping.
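
For readers who generate HiveQL programmatically, the escaping the reporter expects can be sketched roughly as follows. This is only an illustration in Python, not Hive's parser: the `RESERVED` set is a small hypothetical sample, and `quote_identifier` and `describe_statement` are made-up helper names.

```python
# Hypothetical sample of keywords; Hive's real reserved-word list is longer
# and differs between versions.
RESERVED = {"comment", "table", "select", "describe"}


def quote_identifier(name: str) -> str:
    """Wrap the identifier in backticks if it collides with a sample keyword."""
    if name.lower() in RESERVED:
        return "`" + name + "`"
    return name


def describe_statement(table: str) -> str:
    """Build a DESCRIBE statement with the table name escaped when needed."""
    return "DESCRIBE " + quote_identifier(table) + ";"


print(describe_statement("comment"))   # DESCRIBE `comment`;
print(describe_statement("ml_items"))  # DESCRIBE ml_items;
```

Whether the backtick form is accepted depends on the Hive version; per the issue above, the fix is recorded against 0.14.0.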



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9299) Reuse Configuration in AvroSerdeUtils

2015-01-07 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268442#comment-14268442
 ] 

Jakob Homan commented on HIVE-9299:
---

+1. This is a safe change since the modified APIs are for the AvroSerde's use 
only.

> Reuse Configuration in AvroSerdeUtils
> -
>
> Key: HIVE-9299
> URL: https://issues.apache.org/jira/browse/HIVE-9299
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 0.14.0, 0.13.1, 0.15.0
>Reporter: Nitay Joffe
>Assignee: Nitay Joffe
> Fix For: 0.15.0
>
> Attachments: HIVE-9299.patch
>
>
> I am getting an issue where the original Configuration has some parameters 
> needed to read the remote Avro schema (specifically S3 keys).
> Doing new Configuration doesn't pick it up because the keys are not on the 
> classpath.
> We should reuse the Configuration already present in callers.
> I'm using Hive/Avro from Spark so it'd be nice if we could put this into Hive 
> 0.13 since that's what Spark's built against.
> See also https://github.com/jghoman/haivvreo/pull/30





[jira] [Commented] (HIVE-5224) When creating table with AVRO serde, the "avro.schema.url" should be about to load serde schema from file system beside HDFS

2013-11-20 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13828051#comment-13828051
 ] 

Jakob Homan commented on HIVE-5224:
---

So the goal is to use o.a.h.FileSystem to read other derivative file systems? 
Sounds reasonable.  But doesn't this lead to a similar situation when trying to 
open a URI that's not http or file?  Not sure that's important though.

> When creating table with AVRO serde, the "avro.schema.url" should be about to 
> load serde schema from file system beside HDFS
> 
>
> Key: HIVE-5224
> URL: https://issues.apache.org/jira/browse/HIVE-5224
> Project: Hive
>  Issue Type: Bug
>Reporter: Shuaishuai Nie
>Assignee: Shuaishuai Nie
> Attachments: HIVE-5224.1.patch, HIVE-5224.2.patch, Hive-5224.3.patch
>
>
> Currently, when loading the schema for a table with the Avro SerDe, the file 
> system is hard coded to HDFS in AvroSerdeUtils.java. It should be possible to 
> load the schema from file systems other than HDFS.
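
One way to picture the request is scheme-based dispatch on the schema URL. The sketch below is only an illustration of that idea in Python, not the code in AvroSerdeUtils.java; `schema_source` and its two return labels are hypothetical names.

```python
from urllib.parse import urlparse


def schema_source(url: str) -> str:
    """Classify a schema URL by its scheme (illustrative only)."""
    scheme = urlparse(url).scheme.lower()
    if scheme in ("http", "https"):
        return "http"
    # Any other scheme (hdfs, file, s3n, ...) or a bare path is handed to a
    # pluggable file-system layer instead of a hard-coded HDFS client.
    return "filesystem"


print(schema_source("hdfs:///user/hadoop/tpc-h/schema/part.avsc"))  # filesystem
print(schema_source("http://example.org/schema/part.avsc"))         # http
```

The open question raised in the comment above (what happens for a scheme that no handler understands) is left unanswered here; in this sketch every non-HTTP URL falls through to the same branch.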





[jira] [Commented] (HIVE-5850) Multiple table join error for avro

2013-11-20 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13828055#comment-13828055
 ] 

Jakob Homan commented on HIVE-5850:
---

This has been a recurring problem.  The code to figure out what schema goes 
where has been problematic and the information passed to the mapper has changed 
from Hive version to Hive version.  Using the parent may not always get the 
latest schema, yes?

> Multiple table join error for avro 
> ---
>
> Key: HIVE-5850
> URL: https://issues.apache.org/jira/browse/HIVE-5850
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.11.0
>Reporter: Shengjun Xin
> Attachments: part.tar.gz, partsupp.tar.gz, schema.tar.gz
>
>
> Steps to reproduce:
> {code}
> -- Create table Part.
> CREATE EXTERNAL TABLE part
> ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
> STORED AS
> INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
> OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
> LOCATION 'hdfs:///user/hadoop/tpc-h/data/part'
> TBLPROPERTIES 
> ('avro.schema.url'='hdfs:///user/hadoop/tpc-h/schema/part.avsc');
> -- Create table Part Supplier.
> CREATE EXTERNAL TABLE partsupp
> ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
> STORED AS
> INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
> OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
> LOCATION 'hdfs:///user/hadoop/tpc-h/data/partsupp'
> TBLPROPERTIES 
> ('avro.schema.url'='hdfs:///user/hadoop/tpc-h/schema/partsupp.avsc');
> --- Query
> select * from partsupp ps join part p on ps.ps_partkey = p.p_partkey where 
> p.p_partkey=1;
> {code}
> {code}
> Error message is:
> Error: java.io.IOException: java.io.IOException: 
> org.apache.avro.AvroTypeException: Found {
>   "type" : "record",
>   "name" : "partsupp",
>   "namespace" : "com.gs.sdst.pl.avro.tpch",
>   "fields" : [ {
> "name" : "ps_partkey",
> "type" : "long"
>   }, {
> "name" : "ps_suppkey",
> "type" : "long"
>   }, {
> "name" : "ps_availqty",
> "type" : "long"
>   }, {
> "name" : "ps_supplycost",
> "type" : "double"
>   }, {
> "name" : "ps_comment",
> "type" : "string"
>   }, {
> "name" : "systimestamp",
> "type" : "long"
>   } ]
> }, expecting {
>   "type" : "record",
>   "name" : "part",
>   "namespace" : "com.gs.sdst.pl.avro.tpch",
>   "fields" : [ {
> "name" : "p_partkey",
> "type" : "long"
>   }, {
> "name" : "p_name",
> "type" : "string"
>   }, {
> "name" : "p_mfgr",
> "type" : "string"
>   }, {
> "name" : "p_brand",
> "type" : "string"
>   }, {
> "name" : "p_type",
> "type" : "string"
>   }, {
> "name" : "p_size",
> "type" : "int"
>   }, {
> "name" : "p_container",
> "type" : "string"
>   }, {
> "name" : "p_retailprice",
> "type" : "double"
>   }, {
> "name" : "p_comment",
> "type" : "string"
>   }, {
> "name" : "systimestamp",
> "type" : "long"
>   } ]
> }
> at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
> at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:302)
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:218)
> at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:197)
> at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:183)
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:158)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1478)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:153)
> {code}





[jira] [Commented] (HIVE-4734) Use custom ObjectInspectors for AvroSerde

2013-08-25 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13749836#comment-13749836
 ] 

Jakob Homan commented on HIVE-4734:
---

Reviewed last patch on RB.  Everything looks good except for a change in the 
handling of [T1,Tn,NULL] types.

> Use custom ObjectInspectors for AvroSerde
> -
>
> Key: HIVE-4734
> URL: https://issues.apache.org/jira/browse/HIVE-4734
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Reporter: Mark Wagner
>Assignee: Mark Wagner
> Fix For: 0.12.0
>
> Attachments: HIVE-4734.1.patch, HIVE-4734.2.patch, HIVE-4734.3.patch
>
>
> Currently, the AvroSerde recursively copies all fields of a record from the 
> GenericRecord to a List row object and provides the standard 
> ObjectInspectors. Performance can be improved by providing ObjectInspectors 
> to the Avro record itself.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4734) Use custom ObjectInspectors for AvroSerde

2013-07-29 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13722675#comment-13722675
 ] 

Jakob Homan commented on HIVE-4734:
---

Added comments into the rb.

> Use custom ObjectInspectors for AvroSerde
> -
>
> Key: HIVE-4734
> URL: https://issues.apache.org/jira/browse/HIVE-4734
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Reporter: Mark Wagner
>Assignee: Mark Wagner
> Fix For: 0.12.0
>
> Attachments: HIVE-4734.1.patch, HIVE-4734.2.patch
>
>
> Currently, the AvroSerde recursively copies all fields of a record from the 
> GenericRecord to a List row object and provides the standard 
> ObjectInspectors. Performance can be improved by providing ObjectInspectors 
> to the Avro record itself.



[jira] [Commented] (HIVE-3264) Add support for binary dataype to AvroSerde

2013-07-29 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13722674#comment-13722674
 ] 

Jakob Homan commented on HIVE-3264:
---

btw, Mark, once this goes through, please update the AvroSerde info (with the 
version number): https://cwiki.apache.org/confluence/display/Hive/AvroSerDe  
Ping the user list if you don't have write access.  Thanks.

> Add support for binary dataype to AvroSerde
> ---
>
> Key: HIVE-3264
> URL: https://issues.apache.org/jira/browse/HIVE-3264
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 0.9.0
>Reporter: Jakob Homan
>  Labels: patch
> Fix For: 0.12.0
>
> Attachments: HIVE-3264-1.patch, HIVE-3264-2.patch, HIVE-3264-3.patch, 
> HIVE-3264-4.patch, HIVE-3264-5.patch, HIVE-3264.6.patch, HIVE-3264.7.patch
>
>
> When the AvroSerde was written, Hive didn't have a binary type, so Avro's 
> byte array type is converted to an array of small ints.  Now that HIVE-2380 is 
> in, this step isn't necessary and we can convert both Avro's bytes type and 
> probably fixed type to Hive's binary type.
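
To make the difference concrete, here is a small Python sketch of the two representations. It assumes "small ints" means signed 8-bit values (Hive's TINYINT), which is an interpretation of the description rather than something it states; both function names are hypothetical.

```python
from typing import List


def bytes_as_tinyint_array(payload: bytes) -> List[int]:
    """Old-style representation: one signed 8-bit integer per byte."""
    return [b - 256 if b > 127 else b for b in payload]


def tinyint_array_as_bytes(values: List[int]) -> bytes:
    """Recover the raw bytes from the signed 8-bit integers."""
    return bytes(v & 0xFF for v in values)


raw = b"\x00\x7f\x80\xff"
as_ints = bytes_as_tinyint_array(raw)
print(as_ints)                                 # [0, 127, -128, -1]
print(tinyint_array_as_bytes(as_ints) == raw)  # True
```

With a native binary type the round trip through an integer array is unnecessary; the payload can be handed over as-is.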



[jira] [Commented] (HIVE-3442) AvroSerDe WITH SERDEPROPERTIES 'schema.url' is not working when creating external table

2013-07-29 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13722670#comment-13722670
 ] 

Jakob Homan commented on HIVE-3442:
---

The best place for this info would be in the Hive wiki, which is what passes 
for official project documentation: 
https://cwiki.apache.org/confluence/display/Hive/AvroSerDe  Please ping the 
user list if you don't have write access to the page.  Thanks for finding this 
out and sharing.

> AvroSerDe WITH SERDEPROPERTIES 'schema.url' is not working when creating 
> external table
> ---
>
> Key: HIVE-3442
> URL: https://issues.apache.org/jira/browse/HIVE-3442
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.10.0
>Reporter: Zhenxiao Luo
>Assignee: Zhenxiao Luo
> Fix For: 0.10.0
>
>
> After creating a table and loading data into it, I could check that the table 
> was created successfully and that the data is inside:
> DROP TABLE IF EXISTS ml_items;
> CREATE TABLE ml_items(id INT,
>   title STRING,
>   release_date STRING,
>   video_release_date STRING,
>   imdb_url STRING,
>   unknown_genre TINYINT,
>   action TINYINT,
>   adventure TINYINT,
>   animation TINYINT,
>   children TINYINT,
>   comedy TINYINT,
>   crime TINYINT,
>   documentary TINYINT,
>   drama TINYINT,
>   fantasy TINYINT,
>   film_noir TINYINT,
>   horror TINYINT,
>   musical TINYINT,
>   mystery TINYINT,
>   romance TINYINT,
>   sci_fi TINYINT,
>   thriller TINYINT,
>   war TINYINT,
>   western TINYINT)
>   ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n'
>   STORED AS TEXTFILE;
> LOAD DATA LOCAL INPATH '../data/files/avro_items' INTO TABLE ml_items;
> select * from ml_items ORDER BY id ASC;
> However, the following CREATE EXTERNAL TABLE with AvroSerDe does not work:
> DROP TABLE IF EXISTS ml_items_as_avro;
> CREATE EXTERNAL TABLE ml_items_as_avro
>   ROW FORMAT SERDE
>   'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
>   WITH SERDEPROPERTIES (
> 'schema.url'='${system:test.src.data.dir}/files/avro_items_schema.avsc')
>   STORED as INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
>   OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
>   LOCATION 'file:${system:test.tmp.dir}/hive-ml-items';
> describe ml_items_as_avro;
> INSERT OVERWRITE TABLE ml_items_as_avro
>   SELECT id, title,
> imdb_url, unknown_genre, action, adventure, animation, children, comedy, 
> crime,
> documentary, drama, fantasy, film_noir, horror, musical, mystery, romance,
> sci_fi, thriller, war, western
>   FROM ml_items;
> ml_items_as_avro is not created with the expected schema, as shown in the 
> "describe ml_items_as_avro" output. The output is below:
> PREHOOK: query: DROP TABLE IF EXISTS ml_items_as_avro
> PREHOOK: type: DROPTABLE
> POSTHOOK: query: DROP TABLE IF EXISTS ml_items_as_avro
> POSTHOOK: type: DROPTABLE
> PREHOOK: query: CREATE EXTERNAL TABLE ml_items_as_avro
>   ROW FORMAT SERDE
>   'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
>   WITH SERDEPROPERTIES (
> 'schema.url'='/home/cloudera/Code/hive/data/files/avro_items_schema.avsc')
>   STORED as INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
>   OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
>   LOCATION 'file:/home/cloudera/Code/hive/build/ql/tmp/hive-ml-items'
> PREHOOK: type: CREATETABLE
> POSTHOOK: query: CREATE EXTERNAL TABLE ml_items_as_avro
>   ROW FORMAT SERDE
>   'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
>   WITH SERDEPROPERTIES (
> 'schema.url'='/home/cloudera/Code/hive/data/files/avro_items_schema.avsc')
>   STORED as INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
>   OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
>   LOCATION 'file:/home/cloudera/Code/hive/build/ql/tmp/hive-ml-items'
> POSTHOOK: type: CREATETABLE
> POSTHOOK: Output: default@ml_items_as_avro
> PREHOOK: query: describe ml_items_as_avro
> PREHOOK: type: DESCTABLE
> POSTHOOK: query: describe ml_items_as_avro
> POSTHOOK: type: DESCTABLE
> error_error_error_error_error_error_error   string  from deserializer
> cannot_determine_schema string  from deserializer
> check   string  from deserializer
> schema  string  from deserializer
> ur

[jira] [Commented] (HIVE-4734) Use custom ObjectInspectors for AvroSerde

2013-07-29 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13722637#comment-13722637
 ] 

Jakob Homan commented on HIVE-4734:
---

Sorry, above comment was for HIVE-3264.  Still reviewing this one.

> Use custom ObjectInspectors for AvroSerde
> -
>
> Key: HIVE-4734
> URL: https://issues.apache.org/jira/browse/HIVE-4734
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Reporter: Mark Wagner
>Assignee: Mark Wagner
> Fix For: 0.12.0
>
> Attachments: HIVE-4734.1.patch, HIVE-4734.2.patch
>
>
> Currently, the AvroSerde recursively copies all fields of a record from the 
> GenericRecord to a List row object and provides the standard 
> ObjectInspectors. Performance can be improved by providing ObjectInspectors 
> to the Avro record itself.



[jira] [Commented] (HIVE-3264) Add support for binary dataype to AvroSerde

2013-07-29 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13722636#comment-13722636
 ] 

Jakob Homan commented on HIVE-3264:
---

+1.  Looks good.

> Add support for binary dataype to AvroSerde
> ---
>
> Key: HIVE-3264
> URL: https://issues.apache.org/jira/browse/HIVE-3264
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 0.9.0
>Reporter: Jakob Homan
>  Labels: patch
> Fix For: 0.12.0
>
> Attachments: HIVE-3264-1.patch, HIVE-3264-2.patch, HIVE-3264-3.patch, 
> HIVE-3264-4.patch, HIVE-3264-5.patch, HIVE-3264.6.patch, HIVE-3264.7.patch
>
>
> When the AvroSerde was written, Hive didn't have a binary type, so Avro's 
> byte array type is converted to an array of small ints.  Now that HIVE-2380 is 
> in, this step isn't necessary and we can convert both Avro's bytes type and 
> probably fixed type to Hive's binary type.



[jira] [Commented] (HIVE-4734) Use custom ObjectInspectors for AvroSerde

2013-07-29 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13722635#comment-13722635
 ] 

Jakob Homan commented on HIVE-4734:
---

+1.  Looks good.

> Use custom ObjectInspectors for AvroSerde
> -
>
> Key: HIVE-4734
> URL: https://issues.apache.org/jira/browse/HIVE-4734
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Reporter: Mark Wagner
>Assignee: Mark Wagner
> Fix For: 0.12.0
>
> Attachments: HIVE-4734.1.patch, HIVE-4734.2.patch
>
>
> Currently, the AvroSerde recursively copies all fields of a record from the 
> GenericRecord to a List row object and provides the standard 
> ObjectInspectors. Performance can be improved by providing ObjectInspectors 
> to the Avro record itself.



[jira] [Assigned] (HIVE-2482) Convenience UDFs for binary data type

2013-07-15 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan reassigned HIVE-2482:
-

Assignee: Mark Wagner

> Convenience UDFs for binary data type
> -
>
> Key: HIVE-2482
> URL: https://issues.apache.org/jira/browse/HIVE-2482
> Project: Hive
>  Issue Type: New Feature
>Affects Versions: 0.9.0
>Reporter: Ashutosh Chauhan
>Assignee: Mark Wagner
>
> HIVE-2380 introduced binary data type in Hive. It will be good to have 
> following udfs to make it more useful:
> * UDF's to convert to/from hex string
> * UDF's to convert to/from string using a specific encoding
> * UDF's to convert to/from base64 string
> * UDF's to convert to/from non-string types using a particular serde
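
The first three conversions in the list above can be sketched with the Python standard library. This is not Hive UDF code, and the function names are hypothetical; it only shows the round trips such UDFs would be expected to provide.

```python
import base64
import binascii


def to_hex(data: bytes) -> str:
    """Binary to hexadecimal string."""
    return binascii.hexlify(data).decode("ascii")


def from_hex(text: str) -> bytes:
    """Hexadecimal string back to binary."""
    return binascii.unhexlify(text)


def to_base64(data: bytes) -> str:
    """Binary to base64 string."""
    return base64.b64encode(data).decode("ascii")


def from_base64(text: str) -> bytes:
    """Base64 string back to binary."""
    return base64.b64decode(text)


def decode_with(data: bytes, encoding: str) -> str:
    """Binary to string using a caller-specified encoding."""
    return data.decode(encoding)


def encode_with(text: str, encoding: str) -> bytes:
    """String to binary using a caller-specified encoding."""
    return text.encode(encoding)


print(to_hex(b"hive"))                        # 68697665
print(from_base64(to_base64(b"hive")))        # b'hive'
print(encode_with("hive", "utf-16-le").hex())  # 6800690076006500
```

The fourth item (conversion through a particular serde) has no standard-library analogue and is left out of the sketch.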



[jira] [Assigned] (HIVE-2302) Allow grant privileges on granting privileges.

2013-06-27 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan reassigned HIVE-2302:
-

Assignee: Mohammad Kamrul Islam  (was: Jakob Homan)

> Allow grant privileges on granting privileges.
> --
>
> Key: HIVE-2302
> URL: https://issues.apache.org/jira/browse/HIVE-2302
> Project: Hive
>  Issue Type: Improvement
>  Components: Authorization, Security
>Affects Versions: 0.9.0, 0.10.0, 0.11.0
>Reporter: Guy Doulberg
>Assignee: Mohammad Kamrul Islam
>
> Today any user can grant himself and any other user privileges on schemas and 
> tables.
> This way the administrator cannot be sure that the rules he has applied are 
> enforced.
>   



[jira] [Assigned] (HIVE-3095) Self-referencing Avro schema creates infinite loop on table creation

2013-06-21 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan reassigned HIVE-3095:
-

Assignee: Mohammad Kamrul Islam  (was: Jakob Homan)

> Self-referencing Avro schema creates infinite loop on table creation
> 
>
> Key: HIVE-3095
> URL: https://issues.apache.org/jira/browse/HIVE-3095
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 0.9.1
>Reporter: Keegan Mosley
>Assignee: Mohammad Kamrul Islam
>Priority: Minor
>  Labels: avro
>
> An Avro schema which has a field reference to itself will create an infinite 
> loop which eventually throws a StackOverflowError.
> To reproduce using the linked-list example from 
> http://avro.apache.org/docs/1.6.1/spec.html
> create table linkedListTest row format serde 
> 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
> with serdeproperties ('avro.schema.literal'='
> {
>"type": "record", 
>"name": "LongList",
>"aliases": ["LinkedLongs"],  // old name for this
>"fields" : [
>   {"name": "value", "type": "long"}, // each element has a 
> long
>   {"name": "next", "type": ["LongList", "null"]} // optional next element
>]
> }
> ')
> stored as inputformat 
> 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
> outputformat 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';
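
The failure mode can be reproduced outside Hive: any converter that resolves the name `LongList` back to the record definition and recurses will never terminate. The Python sketch below shows one common guard, tracking the records currently being expanded. It is not the AvroSerde code; `describe` is a hypothetical helper, the output notation (`struct<...>`, `union<...>`, `ref<...>`) is made up for the example, and only records, unions and named or primitive types are handled.

```python
LONG_LIST = {
    "type": "record",
    "name": "LongList",
    "fields": [
        {"name": "value", "type": "long"},
        {"name": "next", "type": ["LongList", "null"]},
    ],
}


def describe(schema, named=None, in_progress=None):
    """Render a schema as a type string, stopping at self-references."""
    named = {} if named is None else named
    in_progress = set() if in_progress is None else in_progress

    if isinstance(schema, list):  # union: describe every branch
        parts = [describe(branch, named, in_progress) for branch in schema]
        return "union<" + ",".join(parts) + ">"

    if isinstance(schema, str):  # primitive name or reference to a named type
        if schema in in_progress:
            return "ref<" + schema + ">"  # cycle guard: do not expand again
        if schema in named:
            return describe(named[schema], named, in_progress)
        return schema

    if schema.get("type") != "record":  # other complex types are not expanded
        return str(schema.get("type"))

    name = schema["name"]
    if name in in_progress:
        return "ref<" + name + ">"
    named[name] = schema
    in_progress.add(name)
    fields = ",".join(
        f["name"] + ":" + describe(f["type"], named, in_progress)
        for f in schema["fields"]
    )
    in_progress.discard(name)
    return "struct<" + fields + ">"


print(describe(LONG_LIST))  # struct<value:long,next:union<ref<LongList>,null>>
```

Removing the two `in_progress` checks turns this into the unbounded recursion described in the report.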



[jira] [Assigned] (HIVE-2390) Expand support for union types

2013-05-23 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan reassigned HIVE-2390:
-

Assignee: (was: Jakob Homan)

> Expand support for union types
> --
>
> Key: HIVE-2390
> URL: https://issues.apache.org/jira/browse/HIVE-2390
> Project: Hive
>  Issue Type: Bug
>Reporter: Jakob Homan
>  Labels: uniontype
>
> When the union type was introduced, full support for it wasn't provided.  For 
> instance, when working with a union that gets passed to LazyBinarySerde: 
> {noformat}Caused by: java.lang.RuntimeException: Unrecognized type: UNION
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serialize(LazyBinarySerDe.java:468)
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serializeStruct(LazyBinarySerDe.java:230)
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serialize(LazyBinarySerDe.java:184)
> {noformat}



[jira] [Assigned] (HIVE-3159) Update AvroSerde to determine schema of new tables

2013-05-22 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan reassigned HIVE-3159:
-

Assignee: (was: Jakob Homan)

> Update AvroSerde to determine schema of new tables
> --
>
> Key: HIVE-3159
> URL: https://issues.apache.org/jira/browse/HIVE-3159
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Reporter: Jakob Homan
>
> Currently when writing tables to Avro one must manually provide an Avro 
> schema that matches what is being delivered by Hive. It'd be better to have 
> the serde infer this schema by converting the table's TypeInfo into an 
> appropriate AvroSchema.
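
A rough Python sketch of what such inference could look like for primitive columns follows. The mapping table, the choice to make every column a nullable union, and the name `infer_avro_schema` are all assumptions made for illustration; the issue does not specify any of them, and complex types are not covered.

```python
import json

# Assumed mapping from Hive primitive type names to Avro primitive types.
HIVE_TO_AVRO = {
    "boolean": "boolean",
    "int": "int",
    "bigint": "long",
    "float": "float",
    "double": "double",
    "string": "string",
    "binary": "bytes",
}


def infer_avro_schema(table_name, columns):
    """Build an Avro record schema from (column name, Hive type) pairs."""
    fields = []
    for col_name, hive_type in columns:
        key = hive_type.lower()
        if key not in HIVE_TO_AVRO:
            raise ValueError("no mapping in this sketch for type: " + hive_type)
        # Assumption: every column may hold NULL, so use a ["null", T] union.
        fields.append(
            {"name": col_name, "type": ["null", HIVE_TO_AVRO[key]], "default": None}
        )
    return {"type": "record", "name": table_name, "fields": fields}


schema = infer_avro_schema("ml_items", [("id", "INT"), ("title", "STRING")])
print(json.dumps(schema, indent=2))
```

The resulting dictionary serializes to the same JSON shape as a hand-written `.avsc` file, which is the manual step the issue proposes to remove.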



[jira] [Resolved] (HIVE-4309) Hive server2 for Hive HA?

2013-05-14 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan resolved HIVE-4309.
---

   Resolution: Invalid
Fix Version/s: (was: 0.10.0)

> Hive server2 for Hive HA?
> -
>
> Key: HIVE-4309
> URL: https://issues.apache.org/jira/browse/HIVE-4309
> Project: Hive
>  Issue Type: New Feature
>  Components: HiveServer2
>Affects Versions: 0.10.0
>Reporter: zengyongping
>  Labels: features
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> My team leader tells me Hive server2 was developed for Hive HA. I told him Hive 
> server2 is for concurrent requests. We plan to deploy Hive server HA. So 
> can somebody tell me what Hive server2 is, and how to deploy a Hive server HA?
> Thank you!



[jira] [Updated] (HIVE-3953) Reading of partitioned Avro data fails because of missing properties

2013-04-30 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HIVE-3953:
--

 Priority: Blocker  (was: Major)
Affects Version/s: 0.11.0
Fix Version/s: 0.11.0

Changing status to blocker since this, if not fixed before release, will cause 
all kinds of havoc with Avro.

> Reading of partitioned Avro data fails because of missing properties
> 
>
> Key: HIVE-3953
> URL: https://issues.apache.org/jira/browse/HIVE-3953
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.11.0
>Reporter: Mark Wagner
>Priority: Blocker
> Fix For: 0.11.0
>
> Attachments: avro_partition_test.q
>
>
> After HIVE-3833, reading partitioned Avro data fails due to missing 
> properties. The "avro.schema.(url|literal)" properties are not making it all 
> the way to the SerDe. Non-partitioned data can still be read.



[jira] [Commented] (HIVE-3308) Mixing avro and snappy gives null values

2013-02-03 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13569879#comment-13569879
 ] 

Jakob Homan commented on HIVE-3308:
---

Will do.

> Mixing avro and snappy gives null values
> 
>
> Key: HIVE-3308
> URL: https://issues.apache.org/jira/browse/HIVE-3308
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.10.0
>Reporter: Bennie Schut
>Assignee: Bennie Schut
> Attachments: HIVE-3308.patch1.txt, HIVE-3308.patch2.txt
>
>
> By default Hive uses LazySimpleSerDe for output.
> When I enable compression and run "select count(*) from avrotable", the output 
> is a file with the .avro extension, but it displays null values 
> because the file is in reality not an Avro file but a file created by 
> LazySimpleSerDe using compression, so it should be a .snappy file.
> This causes any job (the exception is "select * from avrotable", since that is 
> not truly a job) to show null values.
> If you use any serde other than Avro, you can temporarily fix this by setting 
> "set hive.output.file.extension=.snappy" and it will work correctly again, but 
> this won't work with Avro since it overwrites hive.output.file.extension 
> during initialization.
> When you dump the query result into a table with "create table bla as", you 
> can rename the .avro file to .snappy and "select from bla" will also 
> magically work again.
> Input and output serdes don't always match, so when I use Avro as an input 
> format it should not set hive.output.file.extension.
> Once it's set, all queries will use it and fail, making the connection useless 
> to reuse.



[jira] [Updated] (HIVE-1694) Accelerate GROUP BY execution using indexes

2013-01-31 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HIVE-1694:
--

Description: 
The index building patch (Hive-417) is checked into trunk, this JIRA issue 
tracks supporting indexes in Hive compiler & execution engine for SELECT 
queries.

This is in ref. to John's comment at


https://issues.apache.org/jira/browse/HIVE-417?focusedCommentId=12884869&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12884869

on creating separate JIRA issue for tracking index usage in optimizer & query 
execution.

The aim of this effort is to use indexes to accelerate query execution (for 
certain class of queries). E.g.
- Filters and range scans (already being worked on by He Yongqiang as part of 
HIVE-417?)
- Joins (index based joins)
- Group By, Order By and other misc cases

The proposal is multi-step:
1. Building index based operators, compiler and execution engine changes
2. Optimizer enhancements (e.g. cost-based optimizer to compare and choose 
between index scans, full table scans etc.)

This JIRA initially focuses on the first step. This JIRA is expected to hold 
the information about index based plans & operator implementations for above 
mentioned cases.  

  was:
The index building patch (Hive-417) is checked into trunk, this JIRA issue 
tracks supporting indexes in Hive compiler & execution engine for SELECT 
queries.

This is in ref. to John's comment at


https://issues.apache.org/jira/browse/HIVE-417?focusedCommentId=12884869&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12884869

on creating separate JIRA issue for tracking index usage in optimizer & query 
execution.

The aim of this effort is to use indexes to accelerate query execution (for 
certain class of queries). E.g.
- Filters and range scans (already being worked on by He Yongqiang as part of 
HIVE-417?)
- Joins (index based joins)
- Group By, Order By and other misc cases

The proposal is multi-step:
1. Building index based operators, compiler and execution engine changes
2. Optimizer enhancements (e.g. cost-based optimizer to compare and choose 
between index scans, full table scans etc.)

This JIRA initially focuses on the first step. This JIRA is expected to hold 
the information about index based plans & operator implementations for above 
mentioned cases. 


> Accelerate GROUP BY execution using indexes
> ---
>
> Key: HIVE-1694
> URL: https://issues.apache.org/jira/browse/HIVE-1694
> Project: Hive
>  Issue Type: New Feature
>  Components: Indexing, Query Processor
>Affects Versions: 0.7.0
>Reporter: Nikhil Deshpande
>Assignee: Prajakta Kalmegh
> Fix For: 0.8.0
>
> Attachments: demo_q1.hql, demo_q2.hql, HIVE-1694.1.patch.txt, 
> HIVE-1694_2010-10-28.diff, HIVE-1694.2.patch.txt, HIVE-1694.3.patch.txt, 
> HIVE-1694.4.patch, HIVE-1694.5.patch, HIVE-1694.6.patch, HIVE-1694.7.patch, 
> HIVE-1694.7.patch
>
>
> The index building patch (Hive-417) is checked into trunk, this JIRA issue 
> tracks supporting indexes in Hive compiler & execution engine for SELECT 
> queries.
> This is in ref. to John's comment at
> https://issues.apache.org/jira/browse/HIVE-417?focusedCommentId=12884869&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12884869
> on creating separate JIRA issue for tracking index usage in optimizer & query 
> execution.
> The aim of this effort is to use indexes to accelerate query execution (for 
> certain class of queries). E.g.
> - Filters and range scans (already being worked on by He Yongqiang as part of 
> HIVE-417?)
> - Joins (index based joins)
> - Group By, Order By and other misc cases
> The proposal is multi-step:
> 1. Building index based operators, compiler and execution engine changes
> 2. Optimizer enhancements (e.g. cost-based optimizer to compare and choose 
> between index scans, full table scans etc.)
> This JIRA initially focuses on the first step. This JIRA is expected to hold 
> the information about index based plans & operator implementations for above 
> mentioned cases.  
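
The Group By case listed above can be illustrated with a small sketch. This is not Hive's index format or operator code: the class name, the in-memory "index" (a map from key to row offsets), and the method names are all hypothetical, chosen only to show why an index-based plan can answer a COUNT/GROUP BY without scanning the base table.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch only: not Hive's index layout or operator implementation.
class IndexGroupBySketch {

    // Baseline plan: scan every row and count occurrences of each key.
    static Map<String, Long> countByScan(List<String> keyColumn) {
        Map<String, Long> counts = new TreeMap<>();
        for (String key : keyColumn) {
            counts.merge(key, 1L, Long::sum);
        }
        return counts;
    }

    // Toy index: key -> offsets of the rows that hold that key.
    static Map<String, List<Integer>> buildIndex(List<String> keyColumn) {
        Map<String, List<Integer>> index = new TreeMap<>();
        for (int offset = 0; offset < keyColumn.size(); offset++) {
            index.computeIfAbsent(keyColumn.get(offset), k -> new ArrayList<>()).add(offset);
        }
        return index;
    }

    // Index-based plan: one entry per distinct key, no pass over the base rows.
    static Map<String, Long> countByIndex(Map<String, List<Integer>> index) {
        Map<String, Long> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : index.entrySet()) {
            counts.put(entry.getKey(), (long) entry.getValue().size());
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> column = Arrays.asList("a", "b", "a", "c", "a");
        System.out.println(countByScan(column));
        System.out.println(countByIndex(buildIndex(column)));
    }
}
```

Both plans return the same counts; the index-based one touches one entry per distinct key rather than one per row, which is the kind of saving the proposal is after.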



[jira] [Commented] (HIVE-3833) object inspectors should be initialized based on partition metadata

2013-01-30 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13566913#comment-13566913
 ] 

Jakob Homan commented on HIVE-3833:
---

This patch has broken Avro (and probably HBase and Cassandra) for partitioned 
tables since it no longer passes the table properties down to the serde:
{noformat}+Properties partProps =
+(pd.getPartSpec() == null || pd.getPartSpec().isEmpty()) ?
+pd.getTableDesc().getProperties() : pd.getProperties();{noformat}
Was this intentional?  If so, it's a breaking change and should be marked as 
such.  If not, since it's not been in a release yet, can we revert the patch?  
See HIVE-3953.
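
To make the concern concrete, here is a minimal sketch of the effect, assuming the serde reads a table-level key such as avro.schema.url. The class and method names are hypothetical, the property values are made up, and the overlay shown is only one possible way to keep table-level keys visible, not what the patch or any fix actually does.

```java
import java.util.Properties;

// Hypothetical sketch: shows how a table-only key disappears when a serde
// is handed partition properties alone, and one way to overlay the two.
class SerdePropertiesSketch {

    // Copy table properties first, then let partition properties override them.
    static Properties overlay(Properties tableProps, Properties partProps) {
        Properties merged = new Properties();
        merged.putAll(tableProps);
        merged.putAll(partProps); // partition-level values win on conflict
        return merged;
    }

    public static void main(String[] args) {
        Properties table = new Properties();
        table.setProperty("avro.schema.url", "hdfs:///schemas/example.avsc"); // made-up value
        table.setProperty("columns", "a,b");

        Properties partition = new Properties();
        partition.setProperty("columns", "a,b,c");

        // Partition properties alone have no schema URL for the serde to read.
        System.out.println(partition.getProperty("avro.schema.url"));
        // The overlay keeps the table-level key and the partition-level columns.
        System.out.println(overlay(table, partition));
    }
}
```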

> object inspectors should be initialized based on partition metadata
> ---
>
> Key: HIVE-3833
> URL: https://issues.apache.org/jira/browse/HIVE-3833
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Fix For: 0.11.0
>
> Attachments: hive.3833.10.patch, hive.3833.11.patch, 
> hive.3833.12.patch, hive.3833.13.patch, hive.3833.14.patch, 
> hive.3833.16.path, hive.3833.17.patch, hive.3833.18.patch, 
> hive.3833.19.patch, hive.3833.1.patch, hive.3833.20.patch, 
> hive.3833.21.patch, hive.3833.22.patch, hive.3833.23.patch, 
> hive.3833.2.patch, hive.3833.3.patch, hive.3833.4.patch, hive.3833.5.patch, 
> hive.3833.6.patch, hive.3833.7.patch, hive.3833.8.patch, hive.3833.9.patch
>
>
> Currently, different partitions can be picked up for the same input split 
> based on the
> serdes' etc. And, we don't allow changing the schema for 
> LazyColumnarBinarySerDe.
> Instead of that, different partitions should be part of the same split, only 
> if the
> partition schemas exactly match. The operator tree object inspectors should 
> be based
> on the partition schema. That would give greater flexibility and also help 
> using binary serde with rcfile



[jira] [Commented] (HIVE-3528) Avro SerDe doesn't handle serializing Nullable types that require access to a Schema

2013-01-25 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562871#comment-13562871
 ] 

Jakob Homan commented on HIVE-3528:
---

If there's a 0.10.1 release, it'd be good to get this in there.  Can it be 
committed to the 0.10 branch?

> Avro SerDe doesn't handle serializing Nullable types that require access to a 
> Schema
> 
>
> Key: HIVE-3528
> URL: https://issues.apache.org/jira/browse/HIVE-3528
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: Sean Busbey
>Assignee: Sean Busbey
>  Labels: avro
> Fix For: 0.11.0
>
> Attachments: HIVE-3528.1.patch.txt, HIVE-3528.2.patch.txt
>
>
> Deserialization properly handles hiding Nullable Avro types, including 
> complex types like record, map, array, etc. However, when Serialization 
> attempts to write out these types it erroneously makes use of the UNION 
> schema that contains NULL and the other type.
> This results in Schema mis-match errors for Record, Array, Enum, Fixed, and 
> Bytes.
> Here's a [review board of unit tests that express the 
> problem|https://reviews.apache.org/r/7431/], as well as one that supports the 
> case that it's only when the schema is needed.
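
The description above amounts to "unwrap the nullable union before using the schema". A rough sketch of that step follows, using a toy Schema class rather than the real Avro API so that it stays dependency-free; every name here is hypothetical and this is not the code in the attached patch.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, dependency-free model of Avro schemas; not the Avro API.
class NullableUnionSketch {
    enum Type { NULL, RECORD, ARRAY, ENUM, FIXED, BYTES, STRING, UNION }

    // Toy schema: a type tag plus branches when the type is UNION.
    static final class Schema {
        final Type type;
        final List<Schema> branches;

        Schema(Type type, Schema... branches) {
            this.type = type;
            this.branches = Arrays.asList(branches);
        }
    }

    // True for a two-branch union in which at least one branch is NULL.
    static boolean isNullable(Schema schema) {
        return schema.type == Type.UNION
            && schema.branches.size() == 2
            && (schema.branches.get(0).type == Type.NULL
                || schema.branches.get(1).type == Type.NULL);
    }

    // Return the non-null branch of a nullable union; otherwise the schema itself.
    static Schema unwrapNullable(Schema schema) {
        if (!isNullable(schema)) {
            return schema;
        }
        Schema first = schema.branches.get(0);
        return first.type == Type.NULL ? schema.branches.get(1) : first;
    }

    public static void main(String[] args) {
        Schema nullableRecord =
            new Schema(Type.UNION, new Schema(Type.NULL), new Schema(Type.RECORD));
        // A serializer would build the datum against this schema, not the union.
        System.out.println(unwrapNullable(nullableRecord).type);
    }
}
```

A serializer that needs a schema to construct a Record, Array, Enum, Fixed, or Bytes datum would call something like unwrapNullable first and only then build the value.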



[jira] [Commented] (HIVE-3528) Avro SerDe doesn't handle serializing Nullable types that require access to a Schema

2013-01-09 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548953#comment-13548953
 ] 

Jakob Homan commented on HIVE-3528:
---

Sean, can you attach the RB patch to the JIRA?

> Avro SerDe doesn't handle serializing Nullable types that require access to a 
> Schema
> 
>
> Key: HIVE-3528
> URL: https://issues.apache.org/jira/browse/HIVE-3528
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: Sean Busbey
>  Labels: avro
> Attachments: HIVE-3528.1.patch.txt
>
>
> Deserialization properly handles hiding Nullable Avro types, including 
> complex types like record, map, array, etc. However, when Serialization 
> attempts to write out these types it erroneously makes use of the UNION 
> schema that contains NULL and the other type.
> This results in Schema mis-match errors for Record, Array, Enum, Fixed, and 
> Bytes.
> Here's a [review board of unit tests that express the 
> problem|https://reviews.apache.org/r/7431/], as well as one that supports the 
> case that it's only when the schema is needed.



[jira] [Commented] (HIVE-3528) Avro SerDe doesn't handle serializing Nullable types that require access to a Schema

2013-01-08 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547431#comment-13547431
 ] 

Jakob Homan commented on HIVE-3528:
---

Did a review on rb, but it doesn't seem to be showing up.  +1.  Great tests.

> Avro SerDe doesn't handle serializing Nullable types that require access to a 
> Schema
> 
>
> Key: HIVE-3528
> URL: https://issues.apache.org/jira/browse/HIVE-3528
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: Sean Busbey
>  Labels: avro
> Attachments: HIVE-3528.1.patch.txt
>
>
> Deserialization properly handles hiding Nullable Avro types, including 
> complex types like record, map, array, etc. However, when Serialization 
> attempts to write out these types it erroneously makes use of the UNION 
> schema that contains NULL and the other type.
> This results in Schema mis-match errors for Record, Array, Enum, Fixed, and 
> Bytes.
> Here's a [review board of unit tests that express the 
> problem|https://reviews.apache.org/r/7431/], as well as one that supports the 
> case that it's only when the schema is needed.



[jira] [Commented] (HIVE-3585) Integrate Trevni as another columnar oriented file format

2013-01-08 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547233#comment-13547233
 ] 

Jakob Homan commented on HIVE-3585:
---

Expanding on what Sean said.  Yes, there will be a TrevniSerde that doesn't 
write Avro but happens to require the Avro libraries.  And there will be an 
extension to the AvroSerde that writes Trevni-encoded Avro.  

This patch is going to share 90% of its small code with the existing AvroSerde 
that was never shunted into contrib.  Why should this variation be?  There are 
active users and developers of this code.  Again, I'm not seeing any technical 
reasons to block progress.  Is anyone planning on exercising a -1?

> Integrate Trevni as another columnar oriented file format
> -
>
> Key: HIVE-3585
> URL: https://issues.apache.org/jira/browse/HIVE-3585
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 0.10.0
>Reporter: alex gemini
>Assignee: Mark Wagner
>Priority: Minor
>
> Add the new Avro module Trevni as another columnar format. A new columnar 
> format needs a columnar SerDe, and fastutil seems a good choice. The Shark 
> project uses the fastutil library as its columnar SerDe library, but it seems 
> too large (almost 15m) for just a few primitive array collections.



[jira] [Created] (HIVE-3860) Alter table location shouldn't required fully qualified URI

2013-01-04 Thread Jakob Homan (JIRA)
Jakob Homan created HIVE-3860:
-

 Summary: Alter table location shouldn't required fully qualified 
URI
 Key: HIVE-3860
 URL: https://issues.apache.org/jira/browse/HIVE-3860
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.8.1
Reporter: Jakob Homan
Assignee: Jakob Homan


Currently one can create an external table by specifying a regular path, e.g. 
'/my/cool/table', but to update the location one must specify the whole NN 
address and path, e.g. 'hdfs://mycoolnn:9000/my/cooler/table'. Alter table 
should assume the default fs if not specified.
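
A minimal sketch of the requested behaviour, using only java.net.URI. The class and method names are hypothetical, the default filesystem URI is passed in by hand rather than read from any Hadoop configuration, and the location is assumed to be an absolute path when it has no scheme.

```java
import java.net.URI;

// Hypothetical sketch: qualify a table location against a default filesystem.
class DefaultFsSketch {

    // Leave fully qualified locations alone; otherwise borrow the scheme and
    // authority of the default filesystem. Assumes an absolute path.
    static URI qualify(String location, URI defaultFs) {
        URI uri = URI.create(location);
        if (uri.getScheme() != null) {
            return uri; // already fully qualified
        }
        return URI.create(
            defaultFs.getScheme() + "://" + defaultFs.getAuthority() + uri.getPath());
    }

    public static void main(String[] args) {
        URI defaultFs = URI.create("hdfs://mycoolnn:9000");
        System.out.println(qualify("/my/cooler/table", defaultFs));
        System.out.println(qualify("hdfs://mycoolnn:9000/my/cooler/table", defaultFs));
    }
}
```

With this kind of normalisation, both forms of the location from the example above resolve to the same fully qualified URI.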



[jira] [Commented] (HIVE-3585) Integrate Trevni as another columnar oriented file format

2013-01-04 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544237#comment-13544237
 ] 

Jakob Homan commented on HIVE-3585:
---

bq. Will Trevni be just another Serde/inputFormat combination or will it 
involve large scale changes to hive?
Just another serde/inputformat combination.  In fact, for Avro-wrapped Trevni, 
it'll just be AvroSerde + Trevni{I|O}Format.  For raw Trevni data it'll just be 
TrevniSerde + (probably) (?Raw)Trevni{I|O}Format.  This is not a big patch and 
doesn't require any more changes to Hive than AvroSerde did - none at all.  As 
I mentioned above, a veto on this can't be sustained on technical grounds, so 
I'm happy to re-assure He as to his concerns, but I don't see any reason not to 
proceed.

bq. I did not get why it does not work with partition schema update.
I didn't want to try to mix Avro-style schemas and Trevni-style schemas, but 
Mark has a way around that.

> Integrate Trevni as another columnar oriented file format
> -
>
> Key: HIVE-3585
> URL: https://issues.apache.org/jira/browse/HIVE-3585
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 0.10.0
>Reporter: alex gemini
>Assignee: Mark Wagner
>Priority: Minor
>
> Add the new Avro module Trevni as another columnar format. A new columnar 
> format needs a columnar SerDe, and fastutil seems a good choice. The Shark 
> project uses the fastutil library as its columnar SerDe library, but it seems 
> too large (almost 15m) for just a few primitive array collections.



[jira] [Commented] (HIVE-3585) Integrate Trevni as another columnar oriented file format

2013-01-03 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543401#comment-13543401
 ] 

Jakob Homan commented on HIVE-3585:
---

yeah.
bq. If it weren't for some subtle problems with updating partition schemas, I'd 
probably have just gone ahead and made read/write from trevni a table property 
of tables using the AvroSerde rather than have a separate TrevniSerde

> Integrate Trevni as another columnar oriented file format
> -
>
> Key: HIVE-3585
> URL: https://issues.apache.org/jira/browse/HIVE-3585
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 0.10.0
>Reporter: alex gemini
>Assignee: Mark Wagner
>Priority: Minor
>
> Add the new Avro module Trevni as another columnar format. A new columnar 
> format needs a columnar SerDe, and fastutil seems a good choice. The Shark 
> project uses the fastutil library as its columnar SerDe library, but it seems 
> too large (almost 15m) for just a few primitive array collections.



[jira] [Commented] (HIVE-3585) Integrate Trevni as another columnar oriented file format

2013-01-03 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543378#comment-13543378
 ] 

Jakob Homan commented on HIVE-3585:
---

But the Avro Serde wasn't added to a contrib package 
(org.apache.hadoop.hive.serde2.avro.AvroSerDe) - so why should its Trevni 
variant be?  They share a *lot* of code.  If it weren't for some subtle 
problems with updating partition schemas, I'd probably have just gone ahead and 
made read/write from trevni a table property of tables using the AvroSerde 
rather than have a separate TrevniSerde.

> Integrate Trevni as another columnar oriented file format
> -
>
> Key: HIVE-3585
> URL: https://issues.apache.org/jira/browse/HIVE-3585
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 0.10.0
>Reporter: alex gemini
>Assignee: Mark Wagner
>Priority: Minor
>
> Add the new Avro module Trevni as another columnar format. A new columnar 
> format needs a columnar SerDe, and fastutil seems a good choice. The Shark 
> project uses the fastutil library as its columnar SerDe library, but it seems 
> too large (almost 15m) for just a few primitive array collections.



[jira] [Commented] (HIVE-3585) Integrate Trevni as another columnar oriented file format

2013-01-03 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543314#comment-13543314
 ] 

Jakob Homan commented on HIVE-3585:
---

And we have active users and contributors to this code (myself, Mark, Sean, 
etc.).  There's essentially no chance this will be orphaned on arrival.

> Integrate Trevni as another columnar oriented file format
> -
>
> Key: HIVE-3585
> URL: https://issues.apache.org/jira/browse/HIVE-3585
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 0.10.0
>Reporter: alex gemini
>Assignee: Mark Wagner
>Priority: Minor
>
> Add the new Avro module Trevni as another columnar format. A new columnar 
> format needs a columnar SerDe, and fastutil seems a good choice. The Shark 
> project uses the fastutil library as its columnar SerDe library, but it seems 
> too large (almost 15m) for just a few primitive array collections.



[jira] [Commented] (HIVE-3585) Integrate Trevni as another columnar oriented file format

2013-01-02 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13542442#comment-13542442
 ] 

Jakob Homan commented on HIVE-3585:
---

Also, He, I'm assuming your -1 is not intended to be a veto? I don't believe it 
would hold up technically.  Trevni is essentially a variation on Avro.  Not 
letting people read their Trevni-encoded data in Hive just because there's 
already another columnar format doesn't seem like a good way forward.

> Integrate Trevni as another columnar oriented file format
> -
>
> Key: HIVE-3585
> URL: https://issues.apache.org/jira/browse/HIVE-3585
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 0.10.0
>Reporter: alex gemini
>Assignee: Jakob Homan
>Priority: Minor
>
> Add the new Avro module Trevni as another columnar format. A new columnar 
> format needs a columnar SerDe, and fastutil seems a good choice. The Shark 
> project uses the fastutil library as its columnar SerDe library, but it seems 
> too large (almost 15m) for just a few primitive array collections.



[jira] [Created] (HIVE-3798) Can't escape reserved keywords used as table names

2012-12-13 Thread Jakob Homan (JIRA)
Jakob Homan created HIVE-3798:
-

 Summary: Can't escape reserved keywords used as table names
 Key: HIVE-3798
 URL: https://issues.apache.org/jira/browse/HIVE-3798
 Project: Hive
  Issue Type: Bug
Reporter: Jakob Homan
Assignee: Jakob Homan


{noformat}hive (some_table)> show tables;
OK
...
comment
...
Time taken: 0.076 seconds
hive (some_table)> describe comment;
FAILED: Parse Error: line 1:0 cannot recognize input near 'describe' 'comment' 
'' in describe statement
hive (some_table)> describe `comment`; 
OK
Table `comment` does not exist   
Time taken: 0.042 seconds
{noformat}

Describe should honor character escaping.
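
For illustration, the expected handling can be sketched as stripping the surrounding backticks before the table lookup. The class and method names below are hypothetical and this is not Hive's parser code; it only shows the behaviour being asked for, where `comment` and comment name the same table.

```java
// Hypothetical sketch: treat a backtick-quoted token as the bare identifier.
class IdentifierEscapeSketch {

    // Remove one pair of surrounding backticks, if present.
    static String stripBackticks(String token) {
        if (token.length() >= 2 && token.startsWith("`") && token.endsWith("`")) {
            return token.substring(1, token.length() - 1);
        }
        return token;
    }

    public static void main(String[] args) {
        // Both spellings should resolve to the same table name.
        System.out.println(stripBackticks("`comment`"));
        System.out.println(stripBackticks("comment"));
    }
}
```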



[jira] [Assigned] (HIVE-3585) Integrate Trevni as another columnar oriented file format

2012-11-07 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan reassigned HIVE-3585:
-

Assignee: Jakob Homan

> Integrate Trevni as another columnar oriented file format
> -
>
> Key: HIVE-3585
> URL: https://issues.apache.org/jira/browse/HIVE-3585
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 0.10.0
>Reporter: alex gemini
>Assignee: Jakob Homan
>Priority: Minor
>
> Add the new Avro module Trevni as another columnar format. A new columnar 
> format needs a columnar SerDe, and fastutil seems a good choice. The Shark 
> project uses the fastutil library as its columnar SerDe library, but it seems 
> too large (almost 15m) for just a few primitive array collections.



[jira] [Commented] (HIVE-895) Add SerDe for Avro serialized data

2012-11-05 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490998#comment-13490998
 ] 

Jakob Homan commented on HIVE-895:
--

The fixed version is 9.1, but at least in the JIRA, I don't see it having been 
committed to anywhere else.  You can check that branch for the 895 commit, but 
I don't think it was. 

I don't plan to do any porting.  You're welcome to try it or just go ahead and 
use haivvreo until 0.10 comes out.

> Add SerDe for Avro serialized data
> --
>
> Key: HIVE-895
> URL: https://issues.apache.org/jira/browse/HIVE-895
> Project: Hive
>  Issue Type: New Feature
>  Components: Serializers/Deserializers
>Affects Versions: 0.9.0
>Reporter: Jeff Hammerbacher
>Assignee: Jakob Homan
> Fix For: 0.10.0, 0.9.1
>
> Attachments: doctors.avro, episodes.avro, HIVE-895-draft.patch, 
> HIVE-895.patch, hive-895.patch.1.txt
>
>
> As Avro continues to mature, having a SerDe to allow HiveQL queries over Avro 
> data seems like a solid win.



[jira] [Commented] (HIVE-3528) Avro SerDe doesn't handle serializing Nullable types that require access to a Schema

2012-10-19 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480115#comment-13480115
 ] 

Jakob Homan commented on HIVE-3528:
---

Great.  Happy to help either here or email me.

> Avro SerDe doesn't handle serializing Nullable types that require access to a 
> Schema
> 
>
> Key: HIVE-3528
> URL: https://issues.apache.org/jira/browse/HIVE-3528
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: Sean Busbey
>  Labels: avro
> Attachments: HIVE-3528.1.patch.txt
>
>
> Deserialization properly handles hiding Nullable Avro types, including 
> complex types like record, map, array, etc. However, when Serialization 
> attempts to write out these types it erroneously makes use of the UNION 
> schema that contains NULL and the other type.
> This results in Schema mis-match errors for Record, Array, Enum, Fixed, and 
> Bytes.
> Here's a [review board of unit tests that express the 
> problem|https://reviews.apache.org/r/7431/], as well as one that supports the 
> case that it's only when the schema is needed.



[jira] [Commented] (HIVE-3528) Avro SerDe doesn't handle serializing Nullable types that require access to a Schema

2012-10-08 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13472023#comment-13472023
 ] 

Jakob Homan commented on HIVE-3528:
---

Looks good.  Added a couple comments to reviewboard.  Also, it would be great 
to add a .q test that the Avro Serde can handle null records (with a proper 
nullable Avro schema) from other formats from Hive.  Had meant to do that 
during the Apachification.  If you'd like to add that to this patch, that'd be 
great.  If not, I'll spin up a quick test after this gets committed.  

> Avro SerDe doesn't handle serializing Nullable types that require access to a 
> Schema
> 
>
> Key: HIVE-3528
> URL: https://issues.apache.org/jira/browse/HIVE-3528
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: Sean Busbey
>  Labels: avro
> Attachments: HIVE-3528.1.patch.txt
>
>
> Deserialization properly handles hiding Nullable Avro types, including 
> complex types like record, map, array, etc. However, when Serialization 
> attempts to write out these types it erroneously makes use of the UNION 
> schema that contains NULL and the other type.
> This results in Schema mis-match errors for Record, Array, Enum, Fixed, and 
> Bytes.
> Here's a [review board of unit tests that express the 
> problem|https://reviews.apache.org/r/7431/], as well as one that supports the 
> case that it's only when the schema is needed.



[jira] [Commented] (HIVE-3525) Avro Maps with Nullable Values fail with NPE

2012-10-08 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13471953#comment-13471953
 ] 

Jakob Homan commented on HIVE-3525:
---

+1.  Correct approach, good tests.  verified.

> Avro Maps with Nullable Values fail with NPE
> 
>
> Key: HIVE-3525
> URL: https://issues.apache.org/jira/browse/HIVE-3525
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: Sean Busbey
> Attachments: HIVE-3525.1.patch.txt, HIVE-3525.2.patch.txt
>
>
> When working against current trunk@1393794, using a backing Avro schema that 
> has a Map field with nullable values causes a NPE on deserialization when the 
> map contains a null value.
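
The report does not say where in the deserializer the NPE is thrown. As a rough illustration only, assuming it comes from converting every map value unconditionally, a null-safe conversion could look like the sketch below. The class name, method name, and converter are hypothetical and are not taken from the attached patches.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: convert map values while passing nulls through.
class NullableMapValuesSketch {

    // Apply the converter to non-null values only; keep null values as null.
    static <V, R> Map<String, R> convertValues(Map<String, V> avroMap, Function<V, R> convert) {
        Map<String, R> out = new HashMap<>();
        for (Map.Entry<String, V> entry : avroMap.entrySet()) {
            V value = entry.getValue();
            out.put(entry.getKey(), value == null ? null : convert.apply(value));
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> datum = new HashMap<>();
        datum.put("present", "x");
        datum.put("missing", null); // a nullable map value
        // String::length would throw on null if applied unconditionally.
        Map<String, Integer> converted = convertValues(datum, String::length);
        System.out.println(converted);
    }
}
```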



[jira] [Commented] (HIVE-3525) Avro Maps with Nullable Values fail with NPE

2012-10-05 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13470640#comment-13470640
 ] 

Jakob Homan commented on HIVE-3525:
---

Reviewing...

> Avro Maps with Nullable Values fail with NPE
> 
>
> Key: HIVE-3525
> URL: https://issues.apache.org/jira/browse/HIVE-3525
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: Sean Busbey
> Attachments: HIVE-3525.1.patch.txt, HIVE-3525.2.patch.txt
>
>
> When working against current trunk@1393794, using a backing Avro schema that 
> has a Map field with nullable values causes a NPE on deserialization when the 
> map contains a null value.



[jira] [Commented] (HIVE-3538) Avro SerDe can't handle Nullable Enums

2012-10-05 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13470639#comment-13470639
 ] 

Jakob Homan commented on HIVE-3538:
---

Good catch.  Let me take a look.

> Avro SerDe can't handle Nullable Enums
> --
>
> Key: HIVE-3538
> URL: https://issues.apache.org/jira/browse/HIVE-3538
> Project: Hive
>  Issue Type: Bug
>Reporter: Sean Busbey
> Attachments: HIVE-3538.tests.txt
>
>
> If a field has a schema that unions NULL with an enum, Avro fails to resolve 
> the union because Avro SerDe doesn't restore "enumness".
> Since the enum datum is a String, avro internals check the union for a string 
> schema, which is not present.



[jira] [Commented] (HIVE-3528) Avro SerDe doesn't handle serializing Nullable types that require access to a Schema

2012-10-05 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13470635#comment-13470635
 ] 

Jakob Homan commented on HIVE-3528:
---

This looks good.  Let me test the patch.

> Avro SerDe doesn't handle serializing Nullable types that require access to a 
> Schema
> 
>
> Key: HIVE-3528
> URL: https://issues.apache.org/jira/browse/HIVE-3528
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: Sean Busbey
>  Labels: avro
> Attachments: HIVE-3528.1.patch.txt
>
>
> Deserialization properly handles hiding Nullable Avro types, including 
> complex types like record, map, array, etc. However, when Serialization 
> attempts to write out these types it erroneously makes use of the UNION 
> schema that contains NULL and the other type.
> This results in Schema mis-match errors for Record, Array, Enum, Fixed, and 
> Bytes.
> Here's a [review board of unit tests that express the 
> problem|https://reviews.apache.org/r/7431/], as well as one that supports the 
> case that it's only when the schema is needed.



[jira] [Commented] (HIVE-3264) Add support for binary dataype to AvroSerde

2012-09-23 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13461542#comment-13461542
 ] 

Jakob Homan commented on HIVE-3264:
---

Actually, can we add to the .q file a describe on the table to verify that Hive 
sees the new type correctly? Also, there should be an equivalent unit test 
added to TestAvroDeserializer.  Also, does this support serializing Hive binary 
to bytes? Sorry for this falling off my radar...

> Add support for binary dataype to AvroSerde
> ---
>
> Key: HIVE-3264
> URL: https://issues.apache.org/jira/browse/HIVE-3264
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 0.9.0
>Reporter: Jakob Homan
>  Labels: patch
> Attachments: HIVE-3264-1.patch, HIVE-3264-2.patch, HIVE-3264-3.patch, 
> HIVE-3264-4.patch, HIVE-3264-5.patch
>
>
> When the AvroSerde was written, Hive didn't have a binary type, so Avro's 
> byte array type is converted to an array of small ints.  Now that HIVE-2380 is 
> in, this step isn't necessary and we can convert both Avro's bytes type and 
> probably fixed type to Hive's binary type.



[jira] [Commented] (HIVE-3443) Hive Metatool should take serde_param_key from the user to allow for changes to avro serde's schema url key

2012-09-10 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452367#comment-13452367
 ] 

Jakob Homan commented on HIVE-3443:
---

{code}+if (key != null) {
+  LOG.info("Looking for location in the value field of "+ key + " key in SERDES table...");
+} else {
+  LOG.info("Looking for location in the value field of schema.url/avro.schema.url key in " +
+  "SERDES table...");
+}{code}
Seems pretty Haivvreo or AvroSerDe specific...

> Hive Metatool should take serde_param_key from the user to allow for changes 
> to avro serde's schema url key
> ---
>
> Key: HIVE-3443
> URL: https://issues.apache.org/jira/browse/HIVE-3443
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.10.0
>Reporter: Shreepadma Venugopalan
>Assignee: Shreepadma Venugopalan
>Priority: Critical
> Attachments: HIVE-3443.1.patch.txt
>
>
> Hive Metatool should take serde_param_key from the user to allow for changes 
> to avro serde's schema url key. In the past, the "avro.schema.url" key was 
> called "schema.url". 



[jira] [Commented] (HIVE-3443) Hive Metatool should take serde_param_key from the user to allow for changes to avro serde's schema url key

2012-09-10 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452358#comment-13452358
 ] 

Jakob Homan commented on HIVE-3443:
---

You're attempting to introduce code designed to support the change from 
Haivvreo to AvroSerde.  Once again, it is not Hive's responsibility to support 
non-ASF code like Haivvreo.  If you want this functionality, it should be added 
to Haivvreo (and I'd be happy to review any pull requests there), not Hive.  If 
users chose to use Haivvreo (or had it via vendor-supplied packages), it's 
their (or their vendor's) responsibility to update as necessary.  There is no 
need to add code or complexity to Hive to support non-ASF code.



[jira] [Updated] (HIVE-3443) Hive Metatool should take serde_param_key from the user to allow for changes to avro serde's schema url key

2012-09-10 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HIVE-3443:
--

Priority: Minor  (was: Critical)

This is in no way a critical bug.  

> Hive Metatool should take serde_param_key from the user to allow for changes 
> to avro serde's schema url key
> ---
>
> Key: HIVE-3443
> URL: https://issues.apache.org/jira/browse/HIVE-3443
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.10.0
>Reporter: Shreepadma Venugopalan
>Assignee: Shreepadma Venugopalan
>Priority: Minor
> Attachments: HIVE-3443.1.patch.txt
>
>
> Hive Metatool should take serde_param_key from the user to allow for changes 
> to avro serde's schema url key. In the past, the "avro.schema.url" key was 
> called "schema.url". 



[jira] [Commented] (HIVE-3443) Hive Metatool should take serde_param_key from the user to allow for changes to avro serde's schema url key

2012-09-10 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452326#comment-13452326
 ] 

Jakob Homan commented on HIVE-3443:
---

I don't believe Apache Hive is obligated to support automating this relatively 
simple task.  Are you also going to change the Serde class, since its name 
changed? Are you going to update the input and output formats at the same time? 
People writing tools on github (including me) that eventually get merged in 
some form into the main Hive branch do not obligate Hive to support the old 
github version in any way.



[jira] [Commented] (HIVE-3447) Provide backward compatibility for AvroSerDe properties

2012-09-10 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452322#comment-13452322
 ] 

Jakob Homan commented on HIVE-3447:
---

bq. @Shreepadma: The AvroSerDe was added in HIVE-895. This patch has not yet 
appeared in a release. The Haivvreo SerDe (which was never part of Hive) 
supported the schema.url property. The AvroSerDe (added in HIVE-895) has always 
supported avro.schema.url.
Correct.

I'm of the position that Apache projects have no responsibility to support 
external solutions, such as Haivvreo, even if they eventually get merged into 
Apache.  Users already using Haivvreo are welcome to continue to do so as part 
of Hive 0.10 with no changes.  Those who wish to switch will only need to update 
the serde class, in/outputformats (to accommodate the new packaging regime) and 
replace their schema.url or schema.literal serdeproperty with a corresponding 
avro.schema.url or avro.schema.literal table property.
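
As a rough, untested sketch of that metadata-only switch: the table name 
my_haivvreo_table and the schema location below are hypothetical, while the 
serde class name is the one used elsewhere in this thread.  The in/outputformat 
classes need the same treatment; the ALTER TABLE syntax for that differs 
between Hive versions, so it is left out here.

{code}
-- Hypothetical table name and schema location, for illustration only.
ALTER TABLE my_haivvreo_table
  SET SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe';

-- The schema pointer moves from the old schema.url serde property
-- to the avro.schema.url table property.
ALTER TABLE my_haivvreo_table
  SET TBLPROPERTIES ('avro.schema.url'='hdfs:///path/to/schema.avsc');
{code}

The old schema.url serde property can stay in the metadata; under the new 
naming it should simply go unread.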

> Provide backward compatibility for AvroSerDe properties
> ---
>
> Key: HIVE-3447
> URL: https://issues.apache.org/jira/browse/HIVE-3447
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 0.10.0
>Reporter: Zhenxiao Luo
>Assignee: Zhenxiao Luo
> Fix For: 0.10.0
>
> Attachments: HIVE-3447.1.patch.txt
>
>
> haivvreo has been merged into Hive as AvroSerDe.
> It has been so popular that many Hive users/customers are using it now.
> A number of Hive users/customers were using haivvreo before the merge, and 
> their old applications/scripts no longer work because of some of the property 
> changes (e.g. schema.url -> avro.schema.url, schema.literal -> 
> avro.schema.literal).
> It could be a good idea to provide backward compatibility for AvroSerDe 
> properties, say, if "avro.schema.url" is not provided, take "schema.url", so 
> that existing haivvreo applications/scripts keep working 
> smoothly.



[jira] [Commented] (HIVE-3447) Provide backward compatibility for AvroSerDe properties

2012-09-07 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13450986#comment-13450986
 ] 

Jakob Homan commented on HIVE-3447:
---

This is a bad idea for two reasons: (a) schema.url and schema.literal are too 
generic to be used for this case, which is why they were changed and (b) as 
part of the Apacheification of Haivvreo, these parameters are no longer serde 
properties, but are table properties (in order to avoid a circular dependency 
on the ql and serde packages).  So even if this backwards compatible patch went 
in, it would have a very complicated use pattern.  

Since the necessary change is just a metadata change, and since users already 
have to change the packaging, there's no reason to add this extra complication.



[jira] [Commented] (HIVE-3442) AvroSerDe WITH SERDEPROPERTIES 'schema.url' is not working when creating external table

2012-09-06 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13450217#comment-13450217
 ] 

Jakob Homan commented on HIVE-3442:
---

Sounds good.  

> AvroSerDe WITH SERDEPROPERTIES 'schema.url' is not working when creating 
> external table
> ---
>
> Key: HIVE-3442
> URL: https://issues.apache.org/jira/browse/HIVE-3442
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.10.0
>Reporter: Zhenxiao Luo
>Assignee: Zhenxiao Luo
> Fix For: 0.10.0
>
>
> After creating a table and loading data into it, I can check that the table is 
> created successfully and that the data is inside:
> DROP TABLE IF EXISTS ml_items;
> CREATE TABLE ml_items(id INT,
>   title STRING,
>   release_date STRING,
>   video_release_date STRING,
>   imdb_url STRING,
>   unknown_genre TINYINT,
>   action TINYINT,
>   adventure TINYINT,
>   animation TINYINT,
>   children TINYINT,
>   comedy TINYINT,
>   crime TINYINT,
>   documentary TINYINT,
>   drama TINYINT,
>   fantasy TINYINT,
>   film_noir TINYINT,
>   horror TINYINT,
>   musical TINYINT,
>   mystery TINYINT,
>   romance TINYINT,
>   sci_fi TINYINT,
>   thriller TINYINT,
>   war TINYINT,
>   western TINYINT)
>   ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n'
>   STORED AS TEXTFILE;
> LOAD DATA LOCAL INPATH '../data/files/avro_items' INTO TABLE ml_items;
> select * from ml_items ORDER BY id ASC;
> However, the following CREATE EXTERNAL TABLE with AvroSerDe does not work:
> DROP TABLE IF EXISTS ml_items_as_avro;
> CREATE EXTERNAL TABLE ml_items_as_avro
>   ROW FORMAT SERDE
>   'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
>   WITH SERDEPROPERTIES (
> 'schema.url'='${system:test.src.data.dir}/files/avro_items_schema.avsc')
>   STORED as INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
>   OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
>   LOCATION 'file:${system:test.tmp.dir}/hive-ml-items';
> describe ml_items_as_avro;
> INSERT OVERWRITE TABLE ml_items_as_avro
>   SELECT id, title,
> imdb_url, unknown_genre, action, adventure, animation, children, comedy, 
> crime,
> documentary, drama, fantasy, film_noir, horror, musical, mystery, romance,
> sci_fi, thriller, war, western
>   FROM ml_items;
> ml_items_as_avro is not created with expected schema, as shown in the 
> "describe ml_items_as_avro" output. The output is below:
> PREHOOK: query: DROP TABLE IF EXISTS ml_items_as_avro
> PREHOOK: type: DROPTABLE
> POSTHOOK: query: DROP TABLE IF EXISTS ml_items_as_avro
> POSTHOOK: type: DROPTABLE
> PREHOOK: query: CREATE EXTERNAL TABLE ml_items_as_avro
>   ROW FORMAT SERDE
>   'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
>   WITH SERDEPROPERTIES (
> 'schema.url'='/home/cloudera/Code/hive/data/files/avro_items_schema.avsc')
>   STORED as INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
>   OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
>   LOCATION 'file:/home/cloudera/Code/hive/build/ql/tmp/hive-ml-items'
> PREHOOK: type: CREATETABLE
> POSTHOOK: query: CREATE EXTERNAL TABLE ml_items_as_avro
>   ROW FORMAT SERDE
>   'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
>   WITH SERDEPROPERTIES (
> 'schema.url'='/home/cloudera/Code/hive/data/files/avro_items_schema.avsc')
>   STORED as INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
>   OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
>   LOCATION 'file:/home/cloudera/Code/hive/build/ql/tmp/hive-ml-items'
> POSTHOOK: type: CREATETABLE
> POSTHOOK: Output: default@ml_items_as_avro
> PREHOOK: query: describe ml_items_as_avro
> PREHOOK: type: DESCTABLE
> POSTHOOK: query: describe ml_items_as_avro
> POSTHOOK: type: DESCTABLE
> error_error_error_error_error_error_error   string  from deserializer
> cannot_determine_schema string  from deserializer
> check   string  from deserializer
> schema  string  from deserializer
> url string  from deserializer
> and string  from deserializer
> literal string  from deserializer
> FAILED: SemanticException [Error 10044]: Line 3:23 Cannot insert into target 
> table because column number/types are different 'ml_items_as_avro': Table 
> insclause-0 has 7 c

[jira] [Commented] (HIVE-3442) AvroSerDe WITH SERDEPROPERTIES 'schema.url' is not working when creating external table

2012-09-06 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13450209#comment-13450209
 ] 

Jakob Homan commented on HIVE-3442:
---

bq. 
'avro.schema.literal'='/home/cloudera/Code/hive/data/files/avro_items_schema.avsc'
Is this a valid URL? Is it accessible from the metastore?
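
For comparison, a sketch of how the two properties are meant to differ, on the 
understanding that avro.schema.url names a location the schema is read from 
while avro.schema.literal carries the schema JSON itself.  The explicit file: 
scheme and the abbreviated one-field schema are assumptions made up for 
illustration, not taken from the report.

{code}
-- Use one or the other, not both.

-- avro.schema.url: points at a schema file (path from the quoted statement,
-- with a file: scheme added as an assumption).
ALTER TABLE ml_items_as_avro SET TBLPROPERTIES (
  'avro.schema.url'='file:///home/cloudera/Code/hive/data/files/avro_items_schema.avsc');

-- avro.schema.literal: holds the schema text itself (hypothetical, abbreviated).
ALTER TABLE ml_items_as_avro SET TBLPROPERTIES (
  'avro.schema.literal'='{"type":"record","name":"ml_item","fields":[{"name":"id","type":"int"}]}');
{code}

A file path handed to avro.schema.literal would be parsed as schema text, which 
is presumably not what was intended.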

[jira] [Commented] (HIVE-3442) AvroSerDe WITH SERDEPROPERTIES 'schema.url' is not working when creating external table

2012-09-06 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13450202#comment-13450202
 ] 

Jakob Homan commented on HIVE-3442:
---

Updated the wiki.

[jira] [Commented] (HIVE-3442) AvroSerDe WITH SERDEPROPERTIES 'schema.url' is not working when creating external table

2012-09-06 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13450198#comment-13450198
 ] 

Jakob Homan commented on HIVE-3442:
---

The docs are out of date (my fault).  schema.url and schema.literal got changed 
to avro.schema.url and avro.schema.literal during the move to Apache, to be 
more specific to Avro.  Try with those.  I'll update the wiki.
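
Concretely, the CREATE EXTERNAL TABLE statement from the issue description 
would become something like the following.  This is an untested sketch; the 
only change from the report is the property key, and the schema path still has 
to resolve to something Hive can read.

{code}
CREATE EXTERNAL TABLE ml_items_as_avro
  ROW FORMAT SERDE
  'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
  WITH SERDEPROPERTIES (
    -- was 'schema.url' in the report
    'avro.schema.url'='${system:test.src.data.dir}/files/avro_items_schema.avsc')
  STORED as INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
  OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
  LOCATION 'file:${system:test.tmp.dir}/hive-ml-items';
{code}

If the schema is picked up, describe ml_items_as_avro should list the real 
columns rather than the error_error_... placeholder columns.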

[jira] [Commented] (HIVE-3323) ThriftSerde: Enable enum to string conversions

2012-09-04 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448155#comment-13448155
 ] 

Jakob Homan commented on HIVE-3323:
---

bq. MegaStruct
eh?

> ThriftSerde: Enable enum to string conversions
> --
>
> Key: HIVE-3323
> URL: https://issues.apache.org/jira/browse/HIVE-3323
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 0.10.0
>Reporter: Travis Crawford
>Assignee: Travis Crawford
> Attachments: HIVE-3323_enum_to_string.1.patch, 
> HIVE-3323_enum_to_string.2.patch, HIVE-3323_enum_to_string.3.patch, 
> HIVE-3323_enum_to_string.4.patch, HIVE-3323_enum_to_string.5.patch
>
>
> When using serde-reported schemas with the ThriftDeserializer, Enum fields 
> are presented as {{struct}}
> Many users expect to work with the string values, which is both easier and 
> more meaningful as the string value communicates what is represented.
> Hive should provide a mechanism to optionally convert enum values to strings.



[jira] [Commented] (HIVE-3323) ThriftSerde: Enable enum to string conversions

2012-08-23 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13440821#comment-13440821
 ] 

Jakob Homan commented on HIVE-3323:
---

OK, but that sounds like a less frequently useful use case than converting to 
strings.  Should we make the default behavior convert-to-string and add 
convert-to-struct as an option for thrift and avro?



[jira] [Commented] (HIVE-3323) ThriftSerde: Enable enum to string conversions

2012-08-23 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13440741#comment-13440741
 ] 

Jakob Homan commented on HIVE-3323:
---

Is the current Thrift behavior worth keeping around? Maybe just convert it to 
do the string conversion? I can't come up with a use case where I would 
want the struct Thrift provides.

> ThriftSerde: Enable enum to string conversions
> --
>
> Key: HIVE-3323
> URL: https://issues.apache.org/jira/browse/HIVE-3323
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 0.10.0
>Reporter: Travis Crawford
>Assignee: Travis Crawford
> Attachments: HIVE-3323_enum_to_string.1.patch, 
> HIVE-3323_enum_to_string.2.patch, HIVE-3323_enum_to_string.3.patch, 
> HIVE-3323_enum_to_string.4.patch, HIVE-3323_enum_to_string.5.patch
>
>
> When using serde-reported schemas with the ThriftDeserializer, Enum fields 
> are presented as {{struct}}
> Many users expect to work with the string values, which is both easier and 
> more meaningful as the string value communicates what is represented.
> Hive should provide a mechanism to optionally convert enum values to strings.





[jira] [Commented] (HIVE-3323) ThriftSerde: Enable enum to string conversions

2012-08-23 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13440724#comment-13440724
 ] 

Jakob Homan commented on HIVE-3323:
---

Right, but what I'm saying is that AvroSerde already does this conversion.  
There's never been an option not to do the conversion.

> ThriftSerde: Enable enum to string conversions
> --
>
> Key: HIVE-3323
> URL: https://issues.apache.org/jira/browse/HIVE-3323
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 0.10.0
>Reporter: Travis Crawford
>Assignee: Travis Crawford
> Attachments: HIVE-3323_enum_to_string.1.patch, 
> HIVE-3323_enum_to_string.2.patch, HIVE-3323_enum_to_string.3.patch, 
> HIVE-3323_enum_to_string.4.patch, HIVE-3323_enum_to_string.5.patch
>
>
> When using serde-reported schemas with the ThriftDeserializer, Enum fields 
> are presented as {{struct}}
> Many users expect to work with the string values, which is both easier and 
> more meaningful as the string value communicates what is represented.
> Hive should provide a mechanism to optionally convert enum values to strings.





[jira] [Updated] (HIVE-3323) ThriftSerde: Enable enum to string conversions

2012-08-23 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HIVE-3323:
--

Summary: ThriftSerde: Enable enum to string conversions  (was: Enable enum 
to string conversions)

> ThriftSerde: Enable enum to string conversions
> --
>
> Key: HIVE-3323
> URL: https://issues.apache.org/jira/browse/HIVE-3323
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 0.10.0
>Reporter: Travis Crawford
>Assignee: Travis Crawford
> Attachments: HIVE-3323_enum_to_string.1.patch, 
> HIVE-3323_enum_to_string.2.patch, HIVE-3323_enum_to_string.3.patch, 
> HIVE-3323_enum_to_string.4.patch, HIVE-3323_enum_to_string.5.patch
>
>
> When using serde-reported schemas with the ThriftDeserializer, Enum fields 
> are presented as {{struct}}
> Many users expect to work with the string values, which is both easier and 
> more meaningful as the string value communicates what is represented.
> Hive should provide a mechanism to optionally convert enum values to strings.





[jira] [Commented] (HIVE-3323) ThriftSerde: Enable enum to string conversions

2012-08-23 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13440713#comment-13440713
 ] 

Jakob Homan commented on HIVE-3323:
---

One comment I have:
{noformat}
+CONVERT_ENUM_TO_STRING("hive.data.convert.enum.to.string", false),
{noformat}
Since AvroSerde already does this and doesn't provide an option not to, can we 
change the option name to be Thrift-specific?

> ThriftSerde: Enable enum to string conversions
> --
>
> Key: HIVE-3323
> URL: https://issues.apache.org/jira/browse/HIVE-3323
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 0.10.0
>Reporter: Travis Crawford
>Assignee: Travis Crawford
> Attachments: HIVE-3323_enum_to_string.1.patch, 
> HIVE-3323_enum_to_string.2.patch, HIVE-3323_enum_to_string.3.patch, 
> HIVE-3323_enum_to_string.4.patch, HIVE-3323_enum_to_string.5.patch
>
>
> When using serde-reported schemas with the ThriftDeserializer, Enum fields 
> are presented as {{struct}}
> Many users expect to work with the string values, which is both easier and 
> more meaningful as the string value communicates what is represented.
> Hive should provide a mechanism to optionally convert enum values to strings.





[jira] [Commented] (HIVE-2702) listPartitionsByFilter only supports string partitions

2012-08-20 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13438357#comment-13438357
 ] 

Jakob Homan commented on HIVE-2702:
---

This is very limiting for HCat as it means HCat clients cannot access 
non-string partitions from Pig or MR, whereas they are available via Hive 
directly.  The JDOQL that's used in generateJDOFilter() to generate the query 
is not powerful enough to support stripping out the value, casting it to a 
number, and then sorting it accordingly.  The best solution is probably to rewrite 
ObjectStore.listPartitionNamesByFilter() to not use the Partitions table (which 
stores the values as "PARTITION=FOO") but rather take advantage of 
PARTITION_KEY_VALS, which stores the partition values by themselves.  Any 
objections to this approach?
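
A small plain-Java sketch of why treating everything as a string is limiting. It assumes partition names of the form key=value as described above; `PartitionNameDemo`, its methods and the sample names are hypothetical, and nothing here touches the metastore or JDOQL.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class PartitionNameDemo {
    // Strip the "key=" prefix from a name such as "hour=10" (assumed format).
    static String valueOf(String partitionName) {
        return partitionName.substring(partitionName.indexOf('=') + 1);
    }

    // Order names by comparing the raw strings, as string-only handling would.
    static List<String> sortAsStrings(List<String> names) {
        List<String> sorted = new ArrayList<>(names);
        sorted.sort(Comparator.naturalOrder());
        return sorted;
    }

    // Order names by the numeric value once it is available on its own.
    static List<String> sortAsNumbers(List<String> names) {
        List<String> sorted = new ArrayList<>(names);
        sorted.sort(Comparator.comparingLong((String n) -> Long.parseLong(valueOf(n))));
        return sorted;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("hour=9", "hour=10", "hour=2");
        System.out.println(sortAsStrings(names)); // [hour=10, hour=2, hour=9]
        System.out.println(sortAsNumbers(names)); // [hour=2, hour=9, hour=10]
    }
}
```

The string ordering puts hour=10 before hour=2, which is why a range filter over a numeric partition needs the value by itself rather than the combined name.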

> listPartitionsByFilter only supports string partitions
> --
>
> Key: HIVE-2702
> URL: https://issues.apache.org/jira/browse/HIVE-2702
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.8.1
>Reporter: Aniket Mokashi
>Assignee: Aniket Mokashi
> Attachments: HIVE-2702.1.patch, HIVE-2702.D2043.1.patch
>
>
> listPartitionsByFilter supports only string partitions. This is because 
> it's explicitly specified in generateJDOFilterOverPartitions in 
> ExpressionTree.java. 
> //Can only support partitions whose types are string
>   if( ! table.getPartitionKeys().get(partitionColumnIndex).
>       getType().equals(org.apache.hadoop.hive.serde.Constants.STRING_TYPE_NAME) ) {
>     throw new MetaException
>     ("Filtering is supported only on partition keys of type string");
>   }





[jira] [Commented] (HIVE-2965) Revert HIVE-2612

2012-08-17 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13437109#comment-13437109
 ] 

Jakob Homan commented on HIVE-2965:
---

bq. In 4/19 contrib meeting it was decided to revert HIVE-2612.
Any documentation as to why?

> Revert HIVE-2612
> 
>
> Key: HIVE-2965
> URL: https://issues.apache.org/jira/browse/HIVE-2965
> Project: Hive
>  Issue Type: Task
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Blocker
> Fix For: 0.9.0
>
> Attachments: hive-2765.patch
>
>
> In 4/19 contrib meeting it was decided to revert HIVE-2612.





[jira] [Commented] (HIVE-3323) Enable enum to string conversions

2012-08-01 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13426868#comment-13426868
 ] 

Jakob Homan commented on HIVE-3323:
---

Just as another data point, the AvroSerde already converts Avro enums to 
strings, so this will make the behavior more consistent.

> Enable enum to string conversions
> -
>
> Key: HIVE-3323
> URL: https://issues.apache.org/jira/browse/HIVE-3323
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 0.10.0
>Reporter: Travis Crawford
>Assignee: Travis Crawford
> Attachments: HIVE-3323_enum_to_string.1.patch, 
> HIVE-3323_enum_to_string.2.patch, HIVE-3323_enum_to_string.3.patch
>
>
> When using serde-reported schemas with the ThriftDeserializer, Enum fields 
> are presented as {{struct}}
> Many users expect to work with the string values, which is both easier and 
> more meaningful as the string value communicates what is represented.
> Hive should provide a mechanism to optionally convert enum values to strings.





[jira] [Created] (HIVE-3264) Add support for binary dataype to AvroSerde

2012-07-17 Thread Jakob Homan (JIRA)
Jakob Homan created HIVE-3264:
-

 Summary: Add support for binary dataype to AvroSerde
 Key: HIVE-3264
 URL: https://issues.apache.org/jira/browse/HIVE-3264
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Reporter: Jakob Homan


When the AvroSerde was written, Hive didn't have a binary type, so Avro's byte 
array type is converted to an array of small ints.  Now that HIVE-2380 is in, this 
step isn't necessary and we can convert both Avro's bytes type and probably 
fixed type to Hive's binary type.





[jira] [Commented] (HIVE-3017) HIVE-2646 added Serde classes to hive-exec jar, duplicating hive-serde

2012-07-05 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13407529#comment-13407529
 ] 

Jakob Homan commented on HIVE-3017:
---

Something this bad should be fixed. It's particularly vexing when trying to 
develop against Hive and needing to update two jars for one class.  Fat-jarring 
(which this essentially is) is evil. If the intention is to provide a 
convenient package for deployment, can't that be done in Maven by declaring a 
meta project with the other jars?

> HIVE-2646 added Serde classes to hive-exec jar, duplicating hive-serde
> --
>
> Key: HIVE-3017
> URL: https://issues.apache.org/jira/browse/HIVE-3017
> Project: Hive
>  Issue Type: Bug
>Reporter: Jakob Homan
>
> HIVE-2646 added the classes from hive-serde to the hive-exec jar:
> {noformat}
> ...
>      0 Wed May 09 20:56:30 PDT 2012 org/apache/hadoop/hive/serde2/typeinfo/
>   1971 Wed May 09 20:56:28 PDT 2012 org/apache/hadoop/hive/serde2/typeinfo/ListTypeInfo.class
>   2396 Wed May 09 20:56:28 PDT 2012 org/apache/hadoop/hive/serde2/typeinfo/MapTypeInfo.class
>   2788 Wed May 09 20:56:28 PDT 2012 org/apache/hadoop/hive/serde2/typeinfo/PrimitiveTypeInfo.class
>   4408 Wed May 09 20:56:28 PDT 2012 org/apache/hadoop/hive/serde2/typeinfo/StructTypeInfo.class
>    900 Wed May 09 20:56:28 PDT 2012 org/apache/hadoop/hive/serde2/typeinfo/TypeInfo.class
>   6576 Wed May 09 20:56:28 PDT 2012 org/apache/hadoop/hive/serde2/typeinfo/TypeInfoFactory.class
>   1231 Wed May 09 20:56:28 PDT 2012 org/apache/hadoop/hive/serde2/typeinfo/TypeInfoUtils$1.class
>   1239 Wed May 09 20:56:28 PDT 2012 org/apache/hadoop/hive/serde2/typeinfo/TypeInfoUtils$TypeInfoParser$Token.class
>   7145 Wed May 09 20:56:28 PDT 2012 org/apache/hadoop/hive/serde2/typeinfo/TypeInfoUtils$TypeInfoParser.class
>  14482 Wed May 09 20:56:28 PDT 2012 org/apache/hadoop/hive/serde2/typeinfo/TypeInfoUtils.class
>   2594 Wed May 09 20:56:28 PDT 2012 org/apache/hadoop/hive/serde2/typeinfo/UnionTypeInfo.class
>    144 Wed May 09 20:56:30 PDT 2012 org/apache/hadoop/hive/serde2/typeinfo/package-info.class
> ...{noformat}
> Was this intentional? If so, the serde jar should be deprecated. If not, the 
> serde classes should be removed since this creates two sources of truth for 
> them and can cause other problems (see HCATALOG-407).





[jira] [Commented] (HIVE-895) Add SerDe for Avro serialized data

2012-06-26 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401669#comment-13401669
 ] 

Jakob Homan commented on HIVE-895:
--

I've moved all the relevant info from the github page to the Hive Wiki and 
linked to it from the main page: 
https://cwiki.apache.org/confluence/display/Hive/AvroSerDe+-+working+with+Avro+from+Hive

> Add SerDe for Avro serialized data
> --
>
> Key: HIVE-895
> URL: https://issues.apache.org/jira/browse/HIVE-895
> Project: Hive
>  Issue Type: New Feature
>  Components: Serializers/Deserializers
>Affects Versions: 0.9.0
>Reporter: Jeff Hammerbacher
>Assignee: Jakob Homan
> Fix For: 0.9.1
>
> Attachments: HIVE-895-draft.patch, HIVE-895.patch, doctors.avro, 
> episodes.avro, hive-895.patch.1.txt
>
>
> As Avro continues to mature, having a SerDe to allow HiveQL queries over Avro 
> data seems like a solid win.





[jira] [Created] (HIVE-3159) Update AvroSerde to determine schema of new tables

2012-06-19 Thread Jakob Homan (JIRA)
Jakob Homan created HIVE-3159:
-

 Summary: Update AvroSerde to determine schema of new tables
 Key: HIVE-3159
 URL: https://issues.apache.org/jira/browse/HIVE-3159
 Project: Hive
  Issue Type: Improvement
  Components: Serializers/Deserializers
Reporter: Jakob Homan
Assignee: Jakob Homan


Currently when writing tables to Avro one must manually provide an Avro schema 
that matches what is being delivered by Hive. It'd be better to have the serde 
infer this schema by converting the table's TypeInfo into an appropriate 
AvroSchema.
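
As a rough illustration of the inference being proposed, here is a plain-Java sketch that turns column names and Hive type names into Avro record schema text. It deliberately avoids the real TypeInfo and Avro Schema classes: `HiveToAvroSketch`, `avroTypeFor` and `recordSchema` are hypothetical names, only a handful of primitive types are mapped, and wrapping every field in a union with "null" is an assumption made because Hive columns may hold NULL.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class HiveToAvroSketch {
    // Assumed mapping for a handful of primitive Hive type names only.
    static String avroTypeFor(String hiveType) {
        switch (hiveType) {
            case "int": return "int";
            case "bigint": return "long";
            case "float": return "float";
            case "double": return "double";
            case "boolean": return "boolean";
            case "string": return "string";
            default:
                throw new IllegalArgumentException("Unmapped Hive type: " + hiveType);
        }
    }

    // Build an Avro record schema (as JSON text) from column name -> Hive type.
    static String recordSchema(String recordName, Map<String, String> columns) {
        StringBuilder json = new StringBuilder();
        json.append("{\"type\":\"record\",\"name\":\"").append(recordName).append("\",\"fields\":[");
        boolean first = true;
        for (Map.Entry<String, String> col : columns.entrySet()) {
            if (!first) {
                json.append(',');
            }
            first = false;
            json.append("{\"name\":\"").append(col.getKey())
                .append("\",\"type\":[\"null\",\"").append(avroTypeFor(col.getValue())).append("\"]}");
        }
        return json.append("]}").toString();
    }

    public static void main(String[] args) {
        Map<String, String> columns = new LinkedHashMap<>();
        columns.put("id", "bigint");
        columns.put("title", "string");
        System.out.println(recordSchema("episodes", columns));
    }
}
```

A real implementation would also have to cover the complex types (lists, maps, structs, unions), which this sketch leaves out.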





[jira] [Created] (HIVE-3158) Update AvroSerde to take advantage of HIVE-2171

2012-06-19 Thread Jakob Homan (JIRA)
Jakob Homan created HIVE-3158:
-

 Summary: Update AvroSerde to take advantage of HIVE-2171
 Key: HIVE-3158
 URL: https://issues.apache.org/jira/browse/HIVE-3158
 Project: Hive
  Issue Type: Improvement
  Components: Serializers/Deserializers
Reporter: Jakob Homan
Assignee: Jakob Homan


HIVE-2171 added the ability for custom serdes to set the schema field comments. 
 The Avro Serde should be updated to use this to set the comments to what has 
been provided by the Avro schema, if any.





[jira] [Assigned] (HIVE-3095) Self-referencing Avro schema creates infinite loop on table creation

2012-06-06 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan reassigned HIVE-3095:
-

Assignee: Jakob Homan

> Self-referencing Avro schema creates infinite loop on table creation
> 
>
> Key: HIVE-3095
> URL: https://issues.apache.org/jira/browse/HIVE-3095
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 0.9.1
>Reporter: Keegan Mosley
>Assignee: Jakob Homan
>Priority: Minor
>  Labels: avro
>
> An Avro schema which has a field reference to itself will create an infinite 
> loop which eventually throws a StackOverflowError.
> To reproduce using the linked-list example from 
> http://avro.apache.org/docs/1.6.1/spec.html
> create table linkedListTest row format serde 
> 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
> with serdeproperties ('avro.schema.literal'='
> {
>"type": "record", 
>"name": "LongList",
>"aliases": ["LinkedLongs"],  // old name for this
>"fields" : [
>       {"name": "value", "type": "long"}, // each element has a long
>   {"name": "next", "type": ["LongList", "null"]} // optional next element
>]
> }
> ')
> stored as inputformat 
> 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
> outputformat 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';
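
The general way to keep a recursive walk over a self-referencing schema from overflowing the stack is to remember which named records have already been expanded. The sketch below shows that technique in plain Java; it is not the AvroSerDe code and not necessarily how this issue was or will be fixed. `SchemaWalkSketch` and the map-based schema representation are hypothetical simplifications.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class SchemaWalkSketch {
    // Walk record types depth-first, remembering names already seen so that a
    // record referring back to itself is expanded once instead of forever.
    static void walk(String type, Map<String, List<String>> fieldTypes, Set<String> seen) {
        if (!fieldTypes.containsKey(type)) {
            return; // primitive such as "long" or "null": nothing to expand
        }
        if (!seen.add(type)) {
            return; // this record was already expanded: break the cycle
        }
        for (String fieldType : fieldTypes.get(type)) {
            walk(fieldType, fieldTypes, seen);
        }
    }

    static Set<String> recordsReachableFrom(String root, Map<String, List<String>> fieldTypes) {
        Set<String> seen = new LinkedHashSet<>();
        walk(root, fieldTypes, seen);
        return seen;
    }

    public static void main(String[] args) {
        // LongList has a long value and an optional next element of type LongList.
        Map<String, List<String>> fieldTypes = new HashMap<>();
        fieldTypes.put("LongList", Arrays.asList("long", "LongList", "null"));
        System.out.println(recordsReachableFrom("LongList", fieldTypes));
    }
}
```

Without the `seen` check, the walk over LongList would call itself on the `next` field indefinitely, which matches the StackOverflowError described above.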





[jira] [Commented] (HIVE-895) Add SerDe for Avro serialized data

2012-06-06 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13290073#comment-13290073
 ] 

Jakob Homan commented on HIVE-895:
--

bq. Could I please ask you to document this feature on the Wiki?
Sure thing.  I'll transfer all the text from the github account shortly.  I'm 
traveling so it may take a few days.

> Add SerDe for Avro serialized data
> --
>
> Key: HIVE-895
> URL: https://issues.apache.org/jira/browse/HIVE-895
> Project: Hive
>  Issue Type: New Feature
>  Components: Serializers/Deserializers
>Affects Versions: 0.9.0
>Reporter: Jeff Hammerbacher
>Assignee: Jakob Homan
> Fix For: 0.9.1
>
> Attachments: HIVE-895-draft.patch, HIVE-895.patch, doctors.avro, 
> episodes.avro, hive-895.patch.1.txt
>
>
> As Avro continues to mature, having a SerDe to allow HiveQL queries over Avro 
> data seems like a solid win.





[jira] [Created] (HIVE-3092) Hive tests should load Hive classes from build directory, not Ivy cache

2012-06-06 Thread Jakob Homan (JIRA)
Jakob Homan created HIVE-3092:
-

 Summary: Hive tests should load Hive classes from build directory, 
not Ivy cache
 Key: HIVE-3092
 URL: https://issues.apache.org/jira/browse/HIVE-3092
 Project: Hive
  Issue Type: Bug
  Components: Testing Infrastructure, Tests
Reporter: Jakob Homan


As discussed in HIVE-895, currently the tests pull in jars for other components 
from Ivy rather than using the built classes and jars in the build 
directory (bit.ly/LzndQU).  This means that absent a very-clean, one is testing 
against a previous version of the code and cross-component tests are invalid.
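
One way to check which copy of a class a test actually picked up is to ask the JVM where the class was loaded from. The sketch below uses only the standard `ProtectionDomain` and `CodeSource` APIs; `WhereLoadedFrom` and `locationOf` are hypothetical names, and this is a diagnostic aid rather than a fix for the build.

```java
import java.security.CodeSource;

class WhereLoadedFrom {
    // Report the jar or directory a class was loaded from. For a Hive class this
    // would show whether it came from the build directory or from a cached jar.
    static String locationOf(Class<?> c) {
        CodeSource source = c.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "(bootstrap or unknown)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) {
        System.out.println(locationOf(WhereLoadedFrom.class));
        System.out.println(locationOf(String.class));
    }
}
```

Calling `locationOf` on a serde class from inside a failing ql test would show directly whether the stale jar is the one in use.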





[jira] [Commented] (HIVE-895) Add SerDe for Avro serialized data

2012-06-01 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287818#comment-13287818
 ] 

Jakob Homan commented on HIVE-895:
--

Yeah, that should get fixed, but the bigger problem is that tests shouldn't be 
relying on Ivy artifacts at all (for any of the Hive artifacts).  The 
classes-under-test should be loaded directly from build/ either as classes or 
jars.  Currently, all new patches that go between components and aren't 
very-clean'ed first are not getting tested correctly.

> Add SerDe for Avro serialized data
> --
>
> Key: HIVE-895
> URL: https://issues.apache.org/jira/browse/HIVE-895
> Project: Hive
>  Issue Type: New Feature
>  Components: Serializers/Deserializers
>Reporter: Jeff Hammerbacher
>Assignee: Jakob Homan
> Attachments: HIVE-895-draft.patch, HIVE-895.patch, doctors.avro, 
> episodes.avro
>
>
> As Avro continues to mature, having a SerDe to allow HiveQL queries over Avro 
> data seems like a solid win.





[jira] [Resolved] (HIVE-2172) Hive CLI should let you specify database on the command line

2012-06-01 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan resolved HIVE-2172.
---

Resolution: Duplicate

> Hive CLI should let you specify database on the command line
> 
>
> Key: HIVE-2172
> URL: https://issues.apache.org/jira/browse/HIVE-2172
> Project: Hive
>  Issue Type: New Feature
>  Components: CLI
>Reporter: Carl Steinbach
>Assignee: Jakob Homan
>Priority: Minor
> Attachments: HIVE-2172.D1269.1.patch
>
>
> I'd like to be able to do the following:
> {noformat}
> % hive --dbname=mydb
> hive> ...
> {noformat}
> instead of having to do:
> {noformat}
> % hive
> hive> use mydb;
> hive> ...
> {noformat}





[jira] [Commented] (HIVE-895) Add SerDe for Avro serialized data

2012-06-01 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287775#comment-13287775
 ] 

Jakob Homan commented on HIVE-895:
--

The problem is that the tests in ql load the serde package from the local 
Ivy cache rather than from the build path, unless you do a full very-clean.  
Those cached jars don't have the new classes, and hence the tests fail.  I 
could reproduce this by running a test without the patch, applying the patch, 
and running the test again: it then fails against the cached jars.  Running 
very-clean, applying the patch and then running the test passes:
{noformat}[junit] Running org.apache.hadoop.hive.cli.TestCliDriver
[junit] Begin query: avro_joins.q
[junit] Copying file: file:/private/tmp/tp895/git/data/files/doctors.avro
[junit] Copying file: file:/private/tmp/tp895/git/data/files/episodes.avro
[junit] diff -a /private/tmp/tp895/git/build/ql/test/logs/clientpositive/avro_joins.q.out /private/tmp/tp895/git/ql/src/test/results/clientpositive/avro_joins.q.out
[junit] Done query: avro_joins.q elapsedTime=16s
[junit] Cleaning up TestCliDriver
[junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 24.91 sec
{noformat}
I reproduced this on both my Mac and RHEL boxes and verified that if you go and 
blow away the {{~./cache/org.apache.hive/hive-serde/jars/}} directory and leave 
everything else constant, the test passes. This is a problem with how the test 
infrastructure loads classes, not with this patch itself...

> Add SerDe for Avro serialized data
> --
>
> Key: HIVE-895
> URL: https://issues.apache.org/jira/browse/HIVE-895
> Project: Hive
>  Issue Type: New Feature
>  Components: Serializers/Deserializers
>Reporter: Jeff Hammerbacher
>Assignee: Jakob Homan
> Attachments: HIVE-895-draft.patch, HIVE-895.patch, doctors.avro, 
> episodes.avro
>
>
> As Avro continues to mature, having a SerDe to allow HiveQL queries over Avro 
> data seems like a solid win.





[jira] [Commented] (HIVE-895) Add SerDe for Avro serialized data

2012-05-31 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13286866#comment-13286866
 ] 

Jakob Homan commented on HIVE-895:
--

Did the tests pass? Anything else I can do to help?

> Add SerDe for Avro serialized data
> --
>
> Key: HIVE-895
> URL: https://issues.apache.org/jira/browse/HIVE-895
> Project: Hive
>  Issue Type: New Feature
>  Components: Serializers/Deserializers
>Reporter: Jeff Hammerbacher
>Assignee: Jakob Homan
> Attachments: HIVE-895-draft.patch, HIVE-895.patch, doctors.avro, 
> episodes.avro
>
>
> As Avro continues to mature, having a SerDe to allow HiveQL queries over Avro 
> data seems like a solid win.





[jira] [Updated] (HIVE-895) Add SerDe for Avro serialized data

2012-05-25 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HIVE-895:
-

Status: Patch Available  (was: In Progress)

> Add SerDe for Avro serialized data
> --
>
> Key: HIVE-895
> URL: https://issues.apache.org/jira/browse/HIVE-895
> Project: Hive
>  Issue Type: New Feature
>  Components: Serializers/Deserializers
>Reporter: Jeff Hammerbacher
>Assignee: Jakob Homan
> Attachments: HIVE-895-draft.patch, HIVE-895.patch, doctors.avro, 
> episodes.avro
>
>
> As Avro continues to mature, having a SerDe to allow HiveQL queries over Avro 
> data seems like a solid win.





[jira] [Updated] (HIVE-895) Add SerDe for Avro serialized data

2012-05-21 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HIVE-895:
-

Attachment: (was: HIVE-895.patch)

> Add SerDe for Avro serialized data
> --
>
> Key: HIVE-895
> URL: https://issues.apache.org/jira/browse/HIVE-895
> Project: Hive
>  Issue Type: New Feature
>  Components: Serializers/Deserializers
>Reporter: Jeff Hammerbacher
>Assignee: Jakob Homan
> Attachments: HIVE-895-draft.patch, HIVE-895.patch, doctors.avro, 
> episodes.avro
>
>
> As Avro continues to mature, having a SerDe to allow HiveQL queries over Avro 
> data seems like a solid win.





[jira] [Updated] (HIVE-895) Add SerDe for Avro serialized data

2012-05-21 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HIVE-895:
-

Attachment: (was: doctors.avro)

> Add SerDe for Avro serialized data
> --
>
> Key: HIVE-895
> URL: https://issues.apache.org/jira/browse/HIVE-895
> Project: Hive
>  Issue Type: New Feature
>  Components: Serializers/Deserializers
>Reporter: Jeff Hammerbacher
>Assignee: Jakob Homan
> Attachments: HIVE-895-draft.patch, HIVE-895.patch, doctors.avro, 
> episodes.avro
>
>
> As Avro continues to mature, having a SerDe to allow HiveQL queries over Avro 
> data seems like a solid win.





[jira] [Updated] (HIVE-895) Add SerDe for Avro serialized data

2012-05-21 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HIVE-895:
-

Attachment: (was: episodes.avro)

> Add SerDe for Avro serialized data
> --
>
> Key: HIVE-895
> URL: https://issues.apache.org/jira/browse/HIVE-895
> Project: Hive
>  Issue Type: New Feature
>  Components: Serializers/Deserializers
>Reporter: Jeff Hammerbacher
>Assignee: Jakob Homan
> Attachments: HIVE-895-draft.patch, HIVE-895.patch, doctors.avro, 
> episodes.avro
>
>
> As Avro continues to mature, having a SerDe to allow HiveQL queries over Avro 
> data seems like a solid win.





[jira] [Updated] (HIVE-895) Add SerDe for Avro serialized data

2012-05-21 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HIVE-895:
-

Attachment: HIVE-895.patch
episodes.avro
doctors.avro

Forgot to grant ASF license.  Re-upping.

> Add SerDe for Avro serialized data
> --
>
> Key: HIVE-895
> URL: https://issues.apache.org/jira/browse/HIVE-895
> Project: Hive
>  Issue Type: New Feature
>  Components: Serializers/Deserializers
>Reporter: Jeff Hammerbacher
>Assignee: Jakob Homan
> Attachments: HIVE-895-draft.patch, HIVE-895.patch, HIVE-895.patch, 
> doctors.avro, doctors.avro, episodes.avro, episodes.avro
>
>
> As Avro continues to mature, having a SerDe to allow HiveQL queries over Avro 
> data seems like a solid win.





[jira] [Updated] (HIVE-895) Add SerDe for Avro serialized data

2012-05-21 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HIVE-895:
-

Attachment: doctors.avro
episodes.avro
HIVE-895.patch

Final patch.  Switching to TBLPROPERTIES rather than SERDEPROPERTIES obviated 
the need for the ql calls used previously.  Spent a lot of frustrating time 
trying to get the phabricator to work (quite surprised that this FB-specific 
framework is kosher in the ASF).  It's up at https://reviews.facebook.net/D3321

> Add SerDe for Avro serialized data
> --
>
> Key: HIVE-895
> URL: https://issues.apache.org/jira/browse/HIVE-895
> Project: Hive
>  Issue Type: New Feature
>  Components: Serializers/Deserializers
>Reporter: Jeff Hammerbacher
>Assignee: Jakob Homan
> Attachments: HIVE-895-draft.patch, HIVE-895.patch, doctors.avro, 
> episodes.avro
>
>
> As Avro continues to mature, having a SerDe to allow HiveQL queries over Avro 
> data seems like a solid win.





[jira] [Commented] (HIVE-895) Add SerDe for Avro serialized data

2012-05-15 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13276107#comment-13276107
 ] 

Jakob Homan commented on HIVE-895:
--

btw, moving stuff to common doesn't work, so I'm doing a bit of refactoring.  
New patch shortly... (schedule permitting)

> Add SerDe for Avro serialized data
> --
>
> Key: HIVE-895
> URL: https://issues.apache.org/jira/browse/HIVE-895
> Project: Hive
>  Issue Type: New Feature
>  Components: Serializers/Deserializers
>Reporter: Jeff Hammerbacher
>Assignee: Jakob Homan
> Attachments: HIVE-895-draft.patch
>
>
> As Avro continues to mature, having a SerDe to allow HiveQL queries over Avro 
> data seems like a solid win.





[jira] [Created] (HIVE-3017) HIVE-2646 added Serde classes to hive-exec jar, duplicating hive-serde

2012-05-11 Thread Jakob Homan (JIRA)
Jakob Homan created HIVE-3017:
-

 Summary: HIVE-2646 added Serde classes to hive-exec jar, 
duplicating hive-serde
 Key: HIVE-3017
 URL: https://issues.apache.org/jira/browse/HIVE-3017
 Project: Hive
  Issue Type: Bug
Reporter: Jakob Homan


HIVE-2646 added the classes from hive-serde to the hive-exec jar:
{noformat}
...
     0 Wed May 09 20:56:30 PDT 2012 org/apache/hadoop/hive/serde2/typeinfo/
  1971 Wed May 09 20:56:28 PDT 2012 org/apache/hadoop/hive/serde2/typeinfo/ListTypeInfo.class
  2396 Wed May 09 20:56:28 PDT 2012 org/apache/hadoop/hive/serde2/typeinfo/MapTypeInfo.class
  2788 Wed May 09 20:56:28 PDT 2012 org/apache/hadoop/hive/serde2/typeinfo/PrimitiveTypeInfo.class
  4408 Wed May 09 20:56:28 PDT 2012 org/apache/hadoop/hive/serde2/typeinfo/StructTypeInfo.class
   900 Wed May 09 20:56:28 PDT 2012 org/apache/hadoop/hive/serde2/typeinfo/TypeInfo.class
  6576 Wed May 09 20:56:28 PDT 2012 org/apache/hadoop/hive/serde2/typeinfo/TypeInfoFactory.class
  1231 Wed May 09 20:56:28 PDT 2012 org/apache/hadoop/hive/serde2/typeinfo/TypeInfoUtils$1.class
  1239 Wed May 09 20:56:28 PDT 2012 org/apache/hadoop/hive/serde2/typeinfo/TypeInfoUtils$TypeInfoParser$Token.class
  7145 Wed May 09 20:56:28 PDT 2012 org/apache/hadoop/hive/serde2/typeinfo/TypeInfoUtils$TypeInfoParser.class
 14482 Wed May 09 20:56:28 PDT 2012 org/apache/hadoop/hive/serde2/typeinfo/TypeInfoUtils.class
  2594 Wed May 09 20:56:28 PDT 2012 org/apache/hadoop/hive/serde2/typeinfo/UnionTypeInfo.class
   144 Wed May 09 20:56:30 PDT 2012 org/apache/hadoop/hive/serde2/typeinfo/package-info.class
...{noformat}
Was this intentional? If so, the serde jar should be deprecated. If not, the 
serde classes should be removed since this creates two sources of truth for 
them and can cause other problems (see HCATALOG-407).
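
One way to spot this kind of overlap is to compare the class entries of the 
two jars.  The sketch below is illustrative only and is not part of this issue: 
the helper names are hypothetical, and it relies only on Python's standard 
zipfile module (a jar is a zip archive).

```python
import zipfile


def class_entries(jar_path):
    """Return the set of .class entry names stored in a jar (zip) file."""
    with zipfile.ZipFile(jar_path) as jar:
        return {name for name in jar.namelist() if name.endswith(".class")}


def duplicated_classes(jar_a, jar_b):
    """Return the sorted list of .class entries present in both jars."""
    return sorted(class_entries(jar_a) & class_entries(jar_b))
```

Calling duplicated_classes() with the paths of the built hive-exec and 
hive-serde jars (wherever they land in a given build) would be expected to 
list entries like the ones shown above.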





[jira] [Commented] (HIVE-895) Add SerDe for Avro serialized data

2012-05-07 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269914#comment-13269914
 ] 

Jakob Homan commented on HIVE-895:
--

bq. Do you think it's possible to move the ql code that the avro serde depends 
on to common?
Should be fine.  Will do this week.

bq. Also, I apologize in advance for this request, but would you mind posting a 
review for this on phabricator? 
Due to its reliance on facebook.com, this site still doesn't display correctly 
for me, but I'll use a different browser just for this request.

> Add SerDe for Avro serialized data
> --
>
> Key: HIVE-895
> URL: https://issues.apache.org/jira/browse/HIVE-895
> Project: Hive
>  Issue Type: New Feature
>  Components: Serializers/Deserializers
>Reporter: Jeff Hammerbacher
>Assignee: Jakob Homan
> Attachments: HIVE-895-draft.patch
>
>
> As Avro continues to mature, having a SerDe to allow HiveQL queries over Avro 
> data seems like a solid win.





[jira] [Updated] (HIVE-895) Add SerDe for Avro serialized data

2012-05-02 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HIVE-895:
-

Attachment: HIVE-895-draft.patch

Here's a first draft of the port to ASF.  It corresponds to the 
mergeHive8ToMaster branch on github, which has all the latest fixes and is 
compatible with Hive 8.  Need to re-format to Hive style and run full unit 
tests.  

One thing of concern is that the avroserde relies on the ql package, which 
required a change to the build script to build serde afterwards.  Is there a 
defined dependency for Hive's modules, and if so does this break that?  If so, 
the other option would be to move this to the contrib package, but to me 
contrib is a dirty word and I'd like to avoid that.  

Also, this bundles the avro serde into the serde jar.  It'd be nice for those 
not using Avro to not require it, but Avro is already a build-time dependency 
so it's not a new problem.  Eventually it'd be nice to have a separate jar with 
just the serde in it to make the code more modular.

I'll finish the port in the next couple of days, but take a glance and comment 
if you'd like.

> Add SerDe for Avro serialized data
> --
>
> Key: HIVE-895
> URL: https://issues.apache.org/jira/browse/HIVE-895
> Project: Hive
>  Issue Type: New Feature
>  Components: Serializers/Deserializers
>Reporter: Jeff Hammerbacher
>Assignee: Jakob Homan
> Attachments: HIVE-895-draft.patch
>
>
> As Avro continues to mature, having a SerDe to allow HiveQL queries over Avro 
> data seems like a solid win.





[jira] [Commented] (HIVE-2233) Show current database in hive prompt

2011-08-30 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13094181#comment-13094181
 ] 

Jakob Homan commented on HIVE-2233:
---

Thanks for committing this, Carl. Any chance it can go to 8? We heavily rely on 
users using separate databases and this would be very helpful.  If not, I can 
apply it to our internal version, but this would save a step.

> Show current database in hive prompt
> 
>
> Key: HIVE-2233
> URL: https://issues.apache.org/jira/browse/HIVE-2233
> Project: Hive
>  Issue Type: Improvement
>  Components: CLI
>Affects Versions: 0.7.0
>Reporter: Jakob Homan
>Assignee: Jakob Homan
> Fix For: 0.9.0
>
> Attachments: HIVE-2233.patch
>
>
> Currently the hive prompt doesn't show which database the user is in.  It 
> would be nice if it were something along the lines of {noformat}hive 
> (prod_tracking)>{noformat} or such.





[jira] [Assigned] (HIVE-2390) Expand support for union types

2011-08-18 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan reassigned HIVE-2390:
-

Assignee: Jakob Homan

> Expand support for union types
> --
>
> Key: HIVE-2390
> URL: https://issues.apache.org/jira/browse/HIVE-2390
> Project: Hive
>  Issue Type: Bug
>Reporter: Jakob Homan
>Assignee: Jakob Homan
>
> When the union type was introduced, full support for it wasn't provided.  For 
> instance, when working with a union that gets passed to LazyBinarySerde: 
> {noformat}Caused by: java.lang.RuntimeException: Unrecognized type: UNION
>   at org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serialize(LazyBinarySerDe.java:468)
>   at org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serializeStruct(LazyBinarySerDe.java:230)
>   at org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serialize(LazyBinarySerDe.java:184)
> {noformat}





[jira] [Updated] (HIVE-2390) Expand support for union types

2011-08-17 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HIVE-2390:
--

Summary: Expand support for union types  (was: LazyBinarySerde (and others) 
don't support unions)

Changing name of JIRA to be more representative of what needs to be done.  If 
reaction is positive, will open subtasks for individual items.

> Expand support for union types
> --
>
> Key: HIVE-2390
> URL: https://issues.apache.org/jira/browse/HIVE-2390
> Project: Hive
>  Issue Type: Bug
>Reporter: Jakob Homan
>
> When the union type was introduced, full support for it wasn't provided.  For 
> instance, when working with a union that gets passed to LazyBinarySerde: 
> {noformat}Caused by: java.lang.RuntimeException: Unrecognized type: UNION
>   at org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serialize(LazyBinarySerDe.java:468)
>   at org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serializeStruct(LazyBinarySerDe.java:230)
>   at org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serialize(LazyBinarySerDe.java:184)
> {noformat}





[jira] [Commented] (HIVE-2390) LazyBinarySerde (and others) don't support unions

2011-08-17 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13086714#comment-13086714
 ] 

Jakob Homan commented on HIVE-2390:
---

Part of the problem is that the term union has been overloaded.  In SQL it 
means the actual set union of two compatible data types, whereas in Avro and 
programming languages it means one value that can be at any one time an 
instance of one of two or more different types.  Union was added as a full-on, 
first-class type by its inclusion in ObjectInspector's Category enum.  Is there 
any reason not to expand this use to be more along the lines of programming 
languages' take on unions?  If so, it should be marked as not really being a 
first-class type.  If not, support for unions in all the serdes, in the grammar 
and in the documentation should be provided.

I would lobby for expanding its support as it's an important type in Avro and 
we're quite hobbled by the inability to manipulate unioned values. (Avro 
handles nullable values by unioning their type T with null, but Haivvreo 
transparently converts these to just the type T and returns null where 
appropriate. The problem lies in actual unions of non-null types, which are 
less frequent but still valid.)
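
To illustrate the nullable-union handling described above, here is a minimal 
Python sketch.  It is not Haivvreo's actual code: the function name is 
hypothetical, and it assumes the schema has already been parsed from Avro's 
JSON form, in which a union is a list of branch schemas.

```python
def resolve_nullable_union(schema):
    """Classify an Avro schema given in its parsed-JSON (Python) form.

    Returns (True, T) when the schema is a union of "null" and exactly one
    other type T, i.e. Avro's idiom for a nullable T.  Returns
    (False, schema) for everything else, including genuine unions of
    several non-null types, which are the hard case for Hive.
    """
    if isinstance(schema, list):  # Avro encodes a union as a JSON array
        non_null = [branch for branch in schema if branch != "null"]
        has_null = len(non_null) < len(schema)
        if has_null and len(non_null) == 1:
            return True, non_null[0]
    return False, schema
```

Under these assumptions, ["null", "string"] is reported as a nullable string, 
while ["int", "string"] is passed through unchanged as a true union that the 
serde still has to handle.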



> LazyBinarySerde (and others) don't support unions
> -
>
> Key: HIVE-2390
> URL: https://issues.apache.org/jira/browse/HIVE-2390
> Project: Hive
>  Issue Type: Bug
>Reporter: Jakob Homan
>
> When the union type was introduced, full support for it wasn't provided.  For 
> instance, when working with a union that gets passed to LazyBinarySerde: 
> {noformat}Caused by: java.lang.RuntimeException: Unrecognized type: UNION
>   at org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serialize(LazyBinarySerDe.java:468)
>   at org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serializeStruct(LazyBinarySerDe.java:230)
>   at org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serialize(LazyBinarySerDe.java:184)
> {noformat}





[jira] [Created] (HIVE-2390) LazyBinarySerde (and others) don't support unions

2011-08-17 Thread Jakob Homan (JIRA)
LazyBinarySerde (and others) don't support unions
-

 Key: HIVE-2390
 URL: https://issues.apache.org/jira/browse/HIVE-2390
 Project: Hive
  Issue Type: Bug
Reporter: Jakob Homan


When the union type was introduced, full support for it wasn't provided.  For 
instance, when working with a union that gets passed to LazyBinarySerde: 
{noformat}Caused by: java.lang.RuntimeException: Unrecognized type: UNION
at org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serialize(LazyBinarySerDe.java:468)
at org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serializeStruct(LazyBinarySerDe.java:230)
at org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serialize(LazyBinarySerDe.java:184)
{noformat}





[jira] [Commented] (HIVE-2387) Facing issues while executing commands on hive shell. The system throws following error

2011-08-17 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13086646#comment-13086646
 ] 

Jakob Homan commented on HIVE-2387:
---

http://hive.apache.org/mailing_lists.html

> Facing issues while executing commands on hive shell. The system throws 
> following error
> ---
>
> Key: HIVE-2387
> URL: https://issues.apache.org/jira/browse/HIVE-2387
> Project: Hive
>  Issue Type: Bug
>Reporter: Siddharth tiwari
>






[jira] [Resolved] (HIVE-2387) Facing issues while executing commands on hive shell. The system throws following error

2011-08-17 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan resolved HIVE-2387.
---

Resolution: Invalid

> Facing issues while executing commands on hive shell. The system throws 
> following error
> ---
>
> Key: HIVE-2387
> URL: https://issues.apache.org/jira/browse/HIVE-2387
> Project: Hive
>  Issue Type: Bug
>Reporter: Siddharth tiwari
>






[jira] [Commented] (HIVE-2387) Facing issues while executing commands on hive shell. The system throws following error

2011-08-17 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13086620#comment-13086620
 ] 

Jakob Homan commented on HIVE-2387:
---

Hey Siddharth, I'm going to close this and redirect you to the hive-users list. 
 JIRA is reserved for tracking patches and improvements.  You'll be much more 
likely to get help on the users list.  Thanks.

> Facing issues while executing commands on hive shell. The system throws 
> following error
> ---
>
> Key: HIVE-2387
> URL: https://issues.apache.org/jira/browse/HIVE-2387
> Project: Hive
>  Issue Type: Bug
>Reporter: Siddharth tiwari
>






[jira] [Updated] (HIVE-2386) Add Mockito to LICENSE file

2011-08-17 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HIVE-2386:
--

Attachment: mockito-license.patch

Quick patch to add mockito's license text to LICENSE.  

> Add Mockito to LICENSE file
> ---
>
> Key: HIVE-2386
> URL: https://issues.apache.org/jira/browse/HIVE-2386
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.7.1, 0.8.0
>Reporter: Jakob Homan
>Assignee: Jakob Homan
> Fix For: 0.8.0
>
> Attachments: mockito-license.patch
>
>
> Mockito was added in HIVE-2171, but not added to the license file.  





[jira] [Created] (HIVE-2386) Add Mockito to LICENSE file

2011-08-17 Thread Jakob Homan (JIRA)
Add Mockito to LICENSE file
---

 Key: HIVE-2386
 URL: https://issues.apache.org/jira/browse/HIVE-2386
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.7.1, 0.8.0
Reporter: Jakob Homan
Assignee: Jakob Homan
 Fix For: 0.8.0
 Attachments: mockito-license.patch

Mockito was added in HIVE-2171, but not added to the license file.  





[jira] [Updated] (HIVE-2386) Add Mockito to LICENSE file

2011-08-17 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HIVE-2386:
--

Status: Patch Available  (was: Open)

No tests, etc. as just some text.

> Add Mockito to LICENSE file
> ---
>
> Key: HIVE-2386
> URL: https://issues.apache.org/jira/browse/HIVE-2386
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.7.1, 0.8.0
>Reporter: Jakob Homan
>Assignee: Jakob Homan
> Fix For: 0.8.0
>
> Attachments: mockito-license.patch
>
>
> Mockito was added in HIVE-2171, but not added to the license file.  





[jira] [Commented] (HIVE-2377) USE database doesn't work when it's first command in session

2011-08-15 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13085427#comment-13085427
 ] 

Jakob Homan commented on HIVE-2377:
---

@Marcin what version of Hive are you running against?  On 7, use as the initial 
command works fine for me:
{noformat}hive> [jhoman@gw ~]$ hive
Hive history file=/hive_query_logs/jhoman/hive_job_log_jhoman_201108152322_635559337.txt
hive> use u_jhoman;
OK
Time taken: 1.968 seconds
hive> show tables;
OK
foo
Time taken: 0.297 seconds{noformat}

> USE database doesn't work when it's first command in session
> 
>
> Key: HIVE-2377
> URL: https://issues.apache.org/jira/browse/HIVE-2377
> Project: Hive
>  Issue Type: Bug
>Reporter: Marcin Kurczych
>Priority: Minor
>
> When USE database is run as a first command it has no effect:
> USE database;
> SHOW TABLES;
> // wrong - default database tables
> When run twice it works:
> USE database;
> USE database;
> SHOW TABLES;
> // ok
> When SHOW DATABASES is used before it, it works:
> SHOW DATABASES;
> USE database;
> SHOW TABLES;
> // ok





[jira] [Updated] (HIVE-2334) DESCRIBE TABLE causes NPE when hive.cli.print.header=true

2011-08-15 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HIVE-2334:
--

Attachment: HIVE-2334-3.patch

New patch that merges cleanly after HIVE-2171.  Only change is to remove the 
inclusion of mockito into ivy, which was also done in the now-committed 2171.  
Otherwise patch is the same.

> DESCRIBE TABLE causes NPE when hive.cli.print.header=true
> -
>
> Key: HIVE-2334
> URL: https://issues.apache.org/jira/browse/HIVE-2334
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 0.7.1
>Reporter: Carl Steinbach
>Assignee: Jakob Homan
> Attachments: HIVE-2334-2.patch, HIVE-2334-3.patch, h2334.patch
>
>






[jira] [Updated] (HIVE-2334) DESCRIBE TABLE causes NPE when hive.cli.print.header=true

2011-08-15 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HIVE-2334:
--

Status: Patch Available  (was: Open)

re-submitting for review.

> DESCRIBE TABLE causes NPE when hive.cli.print.header=true
> -
>
> Key: HIVE-2334
> URL: https://issues.apache.org/jira/browse/HIVE-2334
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 0.7.1
>Reporter: Carl Steinbach
>Assignee: Jakob Homan
> Attachments: HIVE-2334-2.patch, HIVE-2334-3.patch, h2334.patch
>
>






[jira] [Updated] (HIVE-2334) DESCRIBE TABLE causes NPE when hive.cli.print.header=true

2011-08-15 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HIVE-2334:
--

Status: Open  (was: Patch Available)

canceling patch due to trunk drift.

> DESCRIBE TABLE causes NPE when hive.cli.print.header=true
> -
>
> Key: HIVE-2334
> URL: https://issues.apache.org/jira/browse/HIVE-2334
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 0.7.1
>Reporter: Carl Steinbach
>Assignee: Jakob Homan
> Attachments: HIVE-2334-2.patch, h2334.patch
>
>






[jira] [Commented] (HIVE-2171) Allow custom serdes to set field comments

2011-08-12 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084354#comment-13084354
 ] 

Jakob Homan commented on HIVE-2171:
---

Quite likely.  I'll investigate and open a JIRA. Thanks.

> Allow custom serdes to set field comments
> -
>
> Key: HIVE-2171
> URL: https://issues.apache.org/jira/browse/HIVE-2171
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 0.7.0
>Reporter: Jakob Homan
>Assignee: Jakob Homan
> Fix For: 0.8.0
>
> Attachments: HIVE-2171-2.patch, HIVE-2171.patch
>
>
> Currently, while serde implementations can set a field's name, they can't set 
> its comment.  These are set in the metastore utils to {{(from 
> deserializer)}}.  For those serdes that can provide meaningful comments for a 
> field, they should be propagated to the table description.  These 
> serde-provided comments could be prepended to "(from deserializer)" if others 
> feel that's a meaningful distinction.  This change involves updating 
> {{StructField}} to support a (possibly null) comment field and then 
> propagating this change out to the myriad places {{StructField}} is thrown 
> around.





[jira] [Commented] (HIVE-2369) Minor typo in error message in HiveConnection.java (JDBC)

2011-08-11 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13083310#comment-13083310
 ] 

Jakob Homan commented on HIVE-2369:
---

Thanks for doing this.  Apache keeps all of its software in a SVN repository, 
so we need all the patches uploaded to JIRA in diff format (with --no-prefix if 
generated from git, to be compatible).  

> Minor typo in error message in HiveConnection.java (JDBC)
> -
>
> Key: HIVE-2369
> URL: https://issues.apache.org/jira/browse/HIVE-2369
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 0.7.1, 0.8.0
> Environment: Linux
>Reporter: Clément Notin
>Priority: Trivial
>   Original Estimate: 2m
>  Remaining Estimate: 2m
>
> There is a minor typo issue in HiveConnection.java (jdbc) :
> {code}throw new SQLException("Could not establish connecton to "
> + uri + ": " + e.getMessage(), "08S01");{code}
> It seems like there's a "i" missing.
> I know it's a very minor typo but I report it anyway. I won't attach a patch 
> because it would be too long for me to SVN checkout just for 1 letter.





[jira] [Assigned] (HIVE-2362) HiveConf properties not appearing in the output of 'set' or 'set -v'

2011-08-09 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan reassigned HIVE-2362:
-

Assignee: Jakob Homan

> HiveConf properties not appearing in the output of 'set' or 'set -v'
> 
>
> Key: HIVE-2362
> URL: https://issues.apache.org/jira/browse/HIVE-2362
> Project: Hive
>  Issue Type: Bug
>  Components: CLI, Configuration
>Reporter: Carl Steinbach
>Assignee: Jakob Homan
>Priority: Blocker
> Fix For: 0.8.0
>
>





