from:"Anthony Hsu"

[jira] [Created] (HIVE-18802) Incorrect results when referencing same Accumulo table multiple times in one query

2018-02-25 Thread Anthony Hsu (JIRA)

Anthony Hsu created HIVE-18802:
--

 Summary: Incorrect results when referencing same Accumulo table 
multiple times in one query
 Key: HIVE-18802
 URL: https://issues.apache.org/jira/browse/HIVE-18802
 Project: Hive
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Anthony Hsu


While investigating HIVE-18695, I noticed incorrect results returned by the 
following Accumulo query:
{code:java}
DROP TABLE accumulo_test;
CREATE TABLE accumulo_test(key int, value int)
STORED BY 'org.apache.hadoop.hive.accumulo.AccumuloStorageHandler'
WITH SERDEPROPERTIES ("accumulo.columns.mapping" = ":rowID,cf:string")
TBLPROPERTIES ("accumulo.table.name" = "accumulo_table_0");

INSERT OVERWRITE TABLE accumulo_test VALUES (0,0), (1,1), (2,2), (3,3);

SELECT * FROM accumulo_test WHERE key == 1
UNION ALL
SELECT * FROM accumulo_test WHERE key == 2;
{code}
The expected output is
{code:java}
1 1
2 2{code}
but the actual output is
{code:java}
1  0
1  1
1  2
1  3
2  0
2  1
2  2
2  3
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Re: Review Request 54341: HIVE-15353: Metastore throws NPE if StorageDescriptor.cols is null

2018-02-05 Thread Anthony Hsu via Review Board


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54341/
---

(Updated Feb. 6, 2018, 3:01 a.m.)


Review request for hive, Carl Steinbach and Ratandeep Ratti.


Changes
---

Rebased on HEAD.


Bugs: HIVE-15353
https://issues.apache.org/jira/browse/HIVE-15353


Repository: hive-git


Description (updated)
---

Updated HiveAlterHandler.updateOrGetPartitionColumnStats to handle null 
`oldCols`.
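
As a rough illustration of the kind of guard described above (not the actual HiveAlterHandler code), a null column list can be treated as an empty list before it is compared. The class and method names below are hypothetical and exist only for this sketch.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Hive. It shows one way to treat a null
// column list as empty so that later comparisons cannot throw an NPE.
public class NullSafeCols {
    // Returns the list itself, or an immutable empty list when it is null.
    static <T> List<T> orEmpty(List<T> cols) {
        return cols == null ? Collections.<T>emptyList() : cols;
    }

    // Returns true when the two column lists differ, treating null as empty.
    static boolean colsChanged(List<String> oldCols, List<String> newCols) {
        return !orEmpty(oldCols).equals(orEmpty(newCols));
    }

    public static void main(String[] args) {
        System.out.println(colsChanged(null, Collections.singletonList("a")));
        System.out.println(colsChanged(null, null));
    }
}
```

Normalizing once at the boundary keeps the rest of the comparison logic free of repeated null checks.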


Diffs (updated)
-

  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java
 89354a2d34249903a9ff13c4ed913a68de93057e 


Diff: https://reviews.apache.org/r/54341/diff/4/

Changes: https://reviews.apache.org/r/54341/diff/3-4/


Testing
---

After making these changes, I no longer encounter NullPointerExceptions when 
setting cols to null in create_table, alter_table, and alter_partition calls.


Thanks,

Anthony Hsu

Re: Review Request 62321: HIVE-17530: ClassCastException when converting uniontype

2017-09-14 Thread Anthony Hsu via Review Board


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62321/
---

(Updated Sept. 15, 2017, 1:52 a.m.)


Review request for hive, Carl Steinbach and Ratandeep Ratti.


Changes
---

* Fixed test TestObjectInspectorConverters.testObjectInspectorConverters()
* Renamed SettableUnionObjectInspector.addField() to setFieldAndTag().


Bugs: HIVE-17530
https://issues.apache.org/jira/browse/HIVE-17530


Repository: hive-git


Description
---

Previously, StandardUnionObjectInspector was creating an ArrayList instead of a 
StandardUnion, causing the exception

```
java.lang.ClassCastException: java.util.ArrayList cannot be cast to 
org.apache.hadoop.hive.serde2.objectinspector.UnionObject
```

This patch fixes this.
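
The failure mode can be reproduced in isolation. The sketch below is a simplified stand-in, assuming a hypothetical `TaggedUnion` interface in place of Hive's UnionObject; it only shows why returning a list from a factory that callers cast to a union type fails at runtime.

```java
import java.util.ArrayList;

public class UnionCastDemo {
    // Hypothetical stand-in for a tagged-union interface (not Hive's UnionObject).
    interface TaggedUnion {
        byte getTag();
        Object getObject();
    }

    static final class SimpleUnion implements TaggedUnion {
        private final byte tag;
        private final Object value;

        SimpleUnion(byte tag, Object value) {
            this.tag = tag;
            this.value = value;
        }

        public byte getTag() { return tag; }
        public Object getObject() { return value; }
    }

    // Wrong: a list is not a TaggedUnion, so any caller that casts the result fails.
    static Object createWrong() {
        return new ArrayList<Object>();
    }

    // Right: the factory returns an object that implements the expected interface.
    static Object createRight(byte tag, Object value) {
        return new SimpleUnion(tag, value);
    }

    // Reports whether the object can be cast to TaggedUnion without an exception.
    static boolean castSucceeds(Object o) {
        try {
            TaggedUnion u = (TaggedUnion) o;
            return u != null;
        } catch (ClassCastException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(castSucceeds(createWrong()));
        System.out.println(castSucceeds(createRight((byte) 1, "x")));
    }
}
```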


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorDeserializeRow.java 
2ad06fc12869e74e14aae7b7a36685482c4a1ade 
  ql/src/test/queries/clientpositive/orc_avro_partition_uniontype.q 
PRE-CREATION 
  ql/src/test/results/clientpositive/orc_avro_partition_uniontype.q.out 
PRE-CREATION 
  
serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorConverters.java
 7921de8d9c4a56af715de5498954794aaba32fff 
  
serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/SettableUnionObjectInspector.java
 564d8d60451d9756eca1f1edcc84248e4f559828 
  
serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/StandardUnionObjectInspector.java
 7b2868233f127899c7dca07d4f899b24ae2cbc1b 
  
serde/src/test/org/apache/hadoop/hive/serde2/objectinspector/TestObjectInspectorConverters.java
 2e1bb22cea715501749ee5e169ce34f7dc789e64 


Diff: https://reviews.apache.org/r/62321/diff/2/

Changes: https://reviews.apache.org/r/62321/diff/1-2/


Testing
---

Added qtest.


Thanks,

Anthony Hsu

Review Request 62321: HIVE-17530: ClassCastException when converting uniontype

2017-09-13 Thread Anthony Hsu via Review Board


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62321/
---

Review request for hive, Carl Steinbach and Ratandeep Ratti.


Bugs: HIVE-17530
https://issues.apache.org/jira/browse/HIVE-17530


Repository: hive-git


Description
---

Previously, StandardUnionObjectInspector was creating an ArrayList instead of a 
StandardUnion, causing the exception

```
java.lang.ClassCastException: java.util.ArrayList cannot be cast to 
org.apache.hadoop.hive.serde2.objectinspector.UnionObject
```

This patch fixes this.


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorDeserializeRow.java 
2ad06fc12869e74e14aae7b7a36685482c4a1ade 
  ql/src/test/queries/clientpositive/orc_avro_partition_uniontype.q 
PRE-CREATION 
  ql/src/test/results/clientpositive/orc_avro_partition_uniontype.q.out 
PRE-CREATION 
  
serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorConverters.java
 7921de8d9c4a56af715de5498954794aaba32fff 
  
serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/SettableUnionObjectInspector.java
 564d8d60451d9756eca1f1edcc84248e4f559828 
  
serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/StandardUnionObjectInspector.java
 7b2868233f127899c7dca07d4f899b24ae2cbc1b 


Diff: https://reviews.apache.org/r/62321/diff/1/


Testing
---

Added qtest.


Thanks,

Anthony Hsu

[jira] [Created] (HIVE-17530) ClassCastException when converting uniontype

2017-09-13 Thread Anthony Hsu (JIRA)

Anthony Hsu created HIVE-17530:
--

 Summary: ClassCastException when converting uniontype
 Key: HIVE-17530
 URL: https://issues.apache.org/jira/browse/HIVE-17530
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.1.0, 3.0.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu


To repro:
{noformat}
SET hive.exec.schema.evolution = false;

CREATE TABLE avro_orc_partitioned_uniontype (a uniontype<boolean, string>) 
PARTITIONED BY (b int) STORED AS ORC;

INSERT INTO avro_orc_partitioned_uniontype PARTITION (b=1) SELECT 
create_union(1, true, value) FROM src LIMIT 5;

ALTER TABLE avro_orc_partitioned_uniontype SET FILEFORMAT AVRO;

SELECT * FROM avro_orc_partitioned_uniontype;
{noformat}

The exception you get is:
{code}
java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: 
java.lang.ClassCastException: java.util.ArrayList cannot be cast to 
org.apache.hadoop.hive.serde2.objectinspector.UnionObject
{code}




Re: Review Request 62247: HIVE-17394: AvroSerde is regenerating TypeInfo objects for each nullable Avro field for every row

2017-09-12 Thread Anthony Hsu via Review Board



> On Sept. 12, 2017, 5:02 p.m., Ratandeep Ratti wrote:
> > serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroDeserializer.java
> > Line 305 (original), 305 (patched)
> > <https://reviews.apache.org/r/62247/diff/1/?file=1820197#file1820197line305>
> >
> > This comment is misleading now and can be removed.

Carl fixed this before committing. Thanks, Carl!


- Anthony


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62247/#review185212
-------


On Sept. 12, 2017, 10:43 p.m., Anthony Hsu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/62247/
> ---
> 
> (Updated Sept. 12, 2017, 10:43 p.m.)
> 
> 
> Review request for hive, Carl Steinbach and Ratandeep Ratti.
> 
> 
> Bugs: HIVE-17394
> https://issues.apache.org/jira/browse/HIVE-17394
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Previously, when Avro found a nullable union in the reader schema, it would 
> regenerate the TypeInfo for the field for every record. This patch reuses the 
> same TypeInfo that only needs to be calculated once.
> 
> In our testing, we found this improved count() queries by 2x.
> 
> 
> Diffs
> -
> 
>   serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroDeserializer.java 
> ecfe15f59dac04bda3f8f1275babebf736608a6b 
> 
> 
> Diff: https://reviews.apache.org/r/62247/diff/2/
> 
> 
> Testing
> ---
> 
> `mvn clean package -DskipTests -Dmaven.javadoc.skip=true` succeeded.
> 
> 
> Thanks,
> 
> Anthony Hsu
> 
>

Re: Review Request 62247: HIVE-17394: AvroSerde is regenerating TypeInfo objects for each nullable Avro field for every row

2017-09-12 Thread Anthony Hsu via Review Board


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62247/
---

(Updated Sept. 12, 2017, 10:43 p.m.)


Review request for hive, Carl Steinbach and Ratandeep Ratti.


Changes
---

Addressed Ratandeep's comment.


Bugs: HIVE-17394
https://issues.apache.org/jira/browse/HIVE-17394


Repository: hive-git


Description
---

Previously, when Avro found a nullable union in the reader schema, it would 
regenerate the TypeInfo for the field for every record. This patch reuses the 
same TypeInfo that only needs to be calculated once.

In our testing, we found this improved count() queries by 2x.
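
The idea of computing a derived value once per schema rather than once per record can be sketched as follows. This is a minimal illustration, not the AvroDeserializer change itself: the string-based "schema" and "type info" and all names here are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class TypeInfoCacheDemo {
    private final Map<String, String> cache = new HashMap<>();
    private int computations = 0;

    // Hypothetical stand-in for an expensive schema-to-type conversion.
    private String expensiveConvert(String schema) {
        computations++;
        return "typeinfo(" + schema + ")";
    }

    // Looks up the converted value, computing it only on the first request.
    String typeInfoFor(String schema) {
        return cache.computeIfAbsent(schema, this::expensiveConvert);
    }

    int computations() {
        return computations;
    }

    public static void main(String[] args) {
        TypeInfoCacheDemo demo = new TypeInfoCacheDemo();
        // Simulate deserializing many records that share one nullable field schema.
        for (int record = 0; record < 1000; record++) {
            demo.typeInfoFor("[\"null\",\"string\"]");
        }
        System.out.println(demo.computations());
    }
}
```

With the cache in place the conversion runs once for the 1000 simulated records instead of 1000 times.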


Diffs (updated)
-

  serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroDeserializer.java 
ecfe15f59dac04bda3f8f1275babebf736608a6b 


Diff: https://reviews.apache.org/r/62247/diff/2/

Changes: https://reviews.apache.org/r/62247/diff/1-2/


Testing
---

`mvn clean package -DskipTests -Dmaven.javadoc.skip=true` succeeded.


Thanks,

Anthony Hsu

Review Request 62247: HIVE-17394: AvroSerde is regenerating TypeInfo objects for each nullable Avro field for every row

2017-09-12 Thread Anthony Hsu via Review Board


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62247/
---

Review request for hive, Carl Steinbach and Ratandeep Ratti.


Bugs: HIVE-17394
https://issues.apache.org/jira/browse/HIVE-17394


Repository: hive-git


Description
---

Previously, when Avro found a nullable union in the reader schema, it would 
regenerate the TypeInfo for the field for every record. This patch reuses the 
same TypeInfo that only needs to be calculated once.

In our testing, we found this improved count() queries by 2x.


Diffs
-

  serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroDeserializer.java 
ecfe15f59dac04bda3f8f1275babebf736608a6b 


Diff: https://reviews.apache.org/r/62247/diff/1/


Testing
---

`mvn clean package -DskipTests -Dmaven.javadoc.skip=true` succeeded.


Thanks,

Anthony Hsu

Re: Review Request 60303: HIVE-16908: Update table and partition replication tests to not use 2nd HCat instance

2017-06-22 Thread Anthony Hsu via Review Board


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60303/#review178710
---




hcatalog/webhcat/java-client/src/test/java/org/apache/hive/hcatalog/api/TestHCatClient.java
Lines 796-807 (original)
<https://reviews.apache.org/r/60303/#comment252859>

Instead of deleting this, what about just starting the second metastore in 
a separate process? Then we can preserve the end-to-end integration-esque 
nature of the tests.



hcatalog/webhcat/java-client/src/test/java/org/apache/hive/hcatalog/api/TestHCatClient.java
Line 1011 (original), 996 (patched)
<https://reviews.apache.org/r/60303/#comment252858>

its -> it's


- Anthony Hsu


On June 22, 2017, 12:59 a.m., Sunitha Beeram wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60303/
> ---
> 
> (Updated June 22, 2017, 12:59 a.m.)
> 
> 
> Review request for hive, Carl Steinbach, Anthony Hsu, and Ratandeep Ratti.
> 
> 
> Bugs: HIVE-16908
> https://issues.apache.org/jira/browse/HIVE-16908
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-16908: Update table and partition replication tests to not use 2nd HCat 
> instance
> 
> HIVE-16844 fixed a connection leak issue which subsequently exposed failures 
> in TestHCatClient. The connection leak gets triggered if a metastore instance 
> is updated with a different JDO configuration. TestHCatClient uses 2 
> metastore instances to test replication related methods. Unfortunately, it 
> does so by providing a different derby db name for the second instance. Since 
> the 2 metastores run in the same JVM, the path fixed in HIVE-16844 gets 
> triggered, resulting in "sourceMetastore"'s connection being closed and thus 
> resulting in failures.
> 
> It appears to me that running 2 metastore instances within the same JVM is 
> error prone as there could be unintentional side-effects due to statics in 
> the code (as was exposed by fixing HIVE-16844). This patch provides a way to 
> test the replication related methods without involving a second instance. The 
> changes mainly validate the serialize/deserialize methods. One of the tests, 
> testPartitionRegistrationWithCustomSchema, uses addPartitions method to 
> verify propagation of changes and it appeared that addPartitions wasn't 
> covered by other tests in TestHCatClient and there wasn't a better way to 
> verify the intended path, so I used an approach where the original database 
> and table are dropped and recreated using the serialized-string and captured 
> partition spec.
> 
> 
> Diffs
> -
> 
>   
> hcatalog/webhcat/java-client/src/test/java/org/apache/hive/hcatalog/api/TestHCatClient.java
>  86d3acbcb462d244fa2dc2f48923aab1e3ccee66 
> 
> 
> Diff: https://reviews.apache.org/r/60303/diff/2/
> 
> 
> Testing
> ---
> 
> mvn test -DTest=TestHCatClient now passes.
> 
> 
> Thanks,
> 
> Sunitha Beeram
> 
>

Re: Review Request 59885: HIVE-16844: Fix Connection leak in ObjectStore when new Conf object is used

2017-06-08 Thread Anthony Hsu via Review Board


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59885/#review177432
---


Ship it!




Looks good to me.

- Anthony Hsu


On June 7, 2017, 4:29 p.m., Sunitha Beeram wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/59885/
> ---
> 
> (Updated June 7, 2017, 4:29 p.m.)
> 
> 
> Review request for hive, Carl Steinbach, Anthony Hsu, and Ratandeep Ratti.
> 
> 
> Bugs: HIVE-16844
> https://issues.apache.org/jira/browse/HIVE-16844
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-16844: Fix Connection leak in ObjectStore when new Conf object is used
> 
> 
> Diffs
> -
> 
>   metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
> 4676e15942d72b0db56bedf0ff30aa60964c28d8 
> 
> 
> Diff: https://reviews.apache.org/r/59885/diff/1/
> 
> 
> Testing
> ---
> 
> Can't provide unit tests to test the functionality, but problem is 
> reproducible and one way to simulate it is by setting pmf=null in 
> ObjectStore::setConf - you will notice leaked connections. With the fix the 
> same does not happen.
> 
> 
> Thanks,
> 
> Sunitha Beeram
> 
>

Re: Review Request 59885: HIVE-16844: Fix Connection leak in ObjectStore when new Conf object is used

2017-06-08 Thread Anthony Hsu via Review Board



> On June 7, 2017, 8:45 p.m., Anthony Hsu wrote:
> > metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java
> > Line 302 (original), 304 (patched)
> > <https://reviews.apache.org/r/59885/diff/1/?file=1743915#file1743915line304>
> >
> > Do we need to close the PersistenceManager as well?
> 
> Sunitha Beeram wrote:
> Good point, but the call to shutdown() on line 301 closes pm.

Ah, yes, thanks for pointing that out.


- Anthony


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59885/#review177222
---


On June 7, 2017, 4:29 p.m., Sunitha Beeram wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/59885/
> ---
> 
> (Updated June 7, 2017, 4:29 p.m.)
> 
> 
> Review request for hive, Carl Steinbach, Anthony Hsu, and Ratandeep Ratti.
> 
> 
> Bugs: HIVE-16844
> https://issues.apache.org/jira/browse/HIVE-16844
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-16844: Fix Connection leak in ObjectStore when new Conf object is used
> 
> 
> Diffs
> -
> 
>   metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
> 4676e15942d72b0db56bedf0ff30aa60964c28d8 
> 
> 
> Diff: https://reviews.apache.org/r/59885/diff/1/
> 
> 
> Testing
> ---
> 
> Can't provide unit tests to test the functionality, but problem is 
> reproducible and one way to simulate it is by setting pmf=null in 
> ObjectStore::setConf - you will notice leaked connections. With the fix the 
> same does not happen.
> 
> 
> Thanks,
> 
> Sunitha Beeram
> 
>

Re: Review Request 59867: HIVE-16831: Add unit tests for NPE fixes in HIVE-12054

2017-06-07 Thread Anthony Hsu via Review Board


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59867/#review177223
---


Ship it!




Looks good to me!

- Anthony Hsu


On June 6, 2017, 11:20 p.m., Sunitha Beeram wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/59867/
> ---
> 
> (Updated June 6, 2017, 11:20 p.m.)
> 
> 
> Review request for hive, Carl Steinbach, Anthony Hsu, and Ratandeep Ratti.
> 
> 
> Bugs: HIVE-16831
> https://issues.apache.org/jira/browse/HIVE-16831
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-16831: Add unit tests for NPE fixes in HIVE-12054
> 
> 
> Diffs
> -
> 
>   ql/src/test/queries/clientpositive/orc_empty_table.q PRE-CREATION 
>   ql/src/test/results/clientpositive/orc_empty_table.q.out PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/59867/diff/1/
> 
> 
> Testing
> ---
> 
> qtests pass.
> 
> 
> Thanks,
> 
> Sunitha Beeram
> 
>

Re: Review Request 59885: HIVE-16844: Fix Connection leak in ObjectStore when new Conf object is used

2017-06-07 Thread Anthony Hsu via Review Board


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59885/#review177222
---




metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java
Line 302 (original), 304 (patched)
<https://reviews.apache.org/r/59885/#comment250766>

Do we need to close the PersistenceManager as well?


- Anthony Hsu


On June 7, 2017, 4:29 p.m., Sunitha Beeram wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/59885/
> ---
> 
> (Updated June 7, 2017, 4:29 p.m.)
> 
> 
> Review request for hive, Carl Steinbach, Anthony Hsu, and Ratandeep Ratti.
> 
> 
> Bugs: HIVE-16844
> https://issues.apache.org/jira/browse/HIVE-16844
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-16844: Fix Connection leak in ObjectStore when new Conf object is used
> 
> 
> Diffs
> -
> 
>   metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
> 4676e15942d72b0db56bedf0ff30aa60964c28d8 
> 
> 
> Diff: https://reviews.apache.org/r/59885/diff/1/
> 
> 
> Testing
> ---
> 
> Can't provide unit tests to test the functionality, but problem is 
> reproducible and one way to simulate it is by setting pmf=null in 
> ObjectStore::setConf - you will notice leaked connections. With the fix the 
> same does not happen.
> 
> 
> Thanks,
> 
> Sunitha Beeram
> 
>

Review Request 59303: HIVE-16670: Hive should automatically clean up hive.downloaded.resources.dir

2017-05-15 Thread Anthony Hsu via Review Board


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59303/
---

Review request for hive.


Bugs: HIVE-16670
https://issues.apache.org/jira/browse/HIVE-16670


Repository: hive-git


Description
---

HIVE-16670: Hive should automatically clean up hive.downloaded.resources.dir


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 
ffce1d1aec8728840bb8ef726db1b600a9aeef38 


Diff: https://reviews.apache.org/r/59303/diff/1/


Testing
---


Thanks,

Anthony Hsu

[jira] [Created] (HIVE-16670) Hive should automatically clean up hive.downloaded.resources.dir

2017-05-15 Thread Anthony Hsu (JIRA)

Anthony Hsu created HIVE-16670:
--

 Summary: Hive should automatically clean up 
hive.downloaded.resources.dir
 Key: HIVE-16670
 URL: https://issues.apache.org/jira/browse/HIVE-16670
 Project: Hive
  Issue Type: Improvement
Reporter: Anthony Hsu
Assignee: Anthony Hsu


Currently, Hive does not automatically clean up the 
hive.downloaded.resources.dir, so resources and resource directories can 
accumulate over time. Ideally, Hive should automatically clean up the resources 
dir when the session ends.

Ref: 
https://github.com/apache/hive/blob/0ce98b3a7527f72216e9e41f7e610b44ee524758/ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java#L677-L678
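
One way such a cleanup could look is sketched below, assuming the session knows the path of its own resources directory. This is not the SessionState implementation; `ResourceDirCleanup` and its methods are hypothetical names used only for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class ResourceDirCleanup {
    // Deletes dir and everything beneath it; does nothing if dir does not exist.
    static void deleteRecursively(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return;
        }
        List<Path> deepestFirst;
        try (Stream<Path> paths = Files.walk(dir)) {
            // Reverse order so that children are deleted before their parents.
            deepestFirst = paths.sorted(Comparator.reverseOrder())
                                .collect(Collectors.toList());
        }
        for (Path p : deepestFirst) {
            Files.delete(p);
        }
    }

    // Creates a throwaway directory with one file, removes it, and reports
    // whether the directory is gone afterwards.
    static boolean demo() {
        try {
            Path dir = Files.createTempDirectory("downloaded_resources_demo");
            Files.createFile(dir.resolve("udf.jar"));
            deleteRecursively(dir);
            return !Files.exists(dir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In a real session, a call like `deleteRecursively` would presumably run from whatever hook closes the session, so the directory disappears together with the session that created it.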




Re: Review Request 55816: HIVE-15680: Incorrect results when hive.optimize.index.filter=true and same ORC table is referenced twice in query

2017-01-30 Thread Anthony Hsu



> On Jan. 30, 2017, 4:53 p.m., Peter Vary wrote:
> > Hi Anthony,
> > 
> > I am not too familiar with ORC tables, but I am currently working on 
> > enabling Yetus on Hive.
> > 
> > Yetus runs several checks which might help the work of the reviewers. Here 
> > is what Yetus found with the checkstyle plugin:
> > 
> > ./ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java:3679:
> > addTableSchemaToConf(conf, tableScanOp.getSchemaEvolutionColumns(), 
> > tableScanOp.getSchemaEvolutionColumnsTypes());: warning: Line is longer 
> > than 100 characters (found 118).
> > ./ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java:3688:  
> > LOG.info(IOConstants.SCHEMA_EVOLUTION_COLUMNS + " and " + 
> > IOConstants.SCHEMA_EVOLUTION_COLUMNS_TYPES +: warning: Line is longer than 
> > 100 characters (found 108).
> > ./ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java:610:
> > pushFilters(jobConf, filterExpr, filterObj, serializedFilterObj, 
> > serializedFilterExpr, tableScan.getSchema(),: warning: Line is longer than 
> > 100 characters (found 113).
> > ./ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java:614:  
> > public static void pushFilters(JobConf jobConf, ExprNodeGenericFuncDesc 
> > filterExpr, Serializable filterObject,: warning: Line is longer than 100 
> > characters (found 112).
> > ./ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java:614:  
> > public static void pushFilters(JobConf jobConf, ExprNodeGenericFuncDesc 
> > filterExpr, Serializable filterObject,:22: warning: More than 7 parameters 
> > (found 8).
> > ./ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java:615:  
> > String serializedFilterObj, String serializedFilterExpr, RowSchema 
> > rowSchema, String schemaEvolutionColumns,: warning: Line is longer than 100 
> > characters (found 114).
> > ./ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java:743:
> > pushFilters(jobConf, tableFilterExpr, filterObject, serializedFilterObj, 
> > serializedFilterExpr, rowSchema,: warning: Line is longer than 100 
> > characters (found 109).
> > ./ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java:747:  
> > private Set getAliasesForPath(Path splitPath, boolean nonNative, 
> > Path splitPathWithNoSchema) {: warning: Line is longer than 100 characters 
> > (found 104).
> > ./ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java:791:  
> > private ExprNodeGenericFuncDesc buildTableFilterExpr(boolean noFilters, 
> > List filterExprs) {: warning: Line is longer than 
> > 100 characters (found 118).
> > ./ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java:796:
> >   if (tableFilterExpr == null ) {:38: warning: ')' is preceded with 
> > whitespace.
> > 
> > Running Findbugs, the ASF header check, etc. did not find any new problems.
> > 
> > Thanks for the patch!
> > 
> > Peter

Thanks for running Yetus on my patch, Peter! I addressed most of the warnings 
(except the "More than 7 parameters" one) in my revision.


- Anthony


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55816/#review163528
---


On Jan. 31, 2017, 2:43 a.m., Anthony Hsu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55816/
> ---
> 
> (Updated Jan. 31, 2017, 2:43 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Bugs: HIVE-15680
> https://issues.apache.org/jira/browse/HIVE-15680
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-15680: Incorrect results when hive.optimize.index.filter=true and same 
> ORC table is referenced twice in query
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 
> 68dd5e7247415dec1e353010ea34481c4f2fc6cd 
>   ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 
> 51530ac16c92cc75d501bfcb573557754ba0c964 
>   ql/src/test/queries/clientpositive/orc_ppd_same_table_multiple_aliases.q 
> PRE-CREATION 
>   
> ql/src/test/results/clientpositive/orc_ppd_same_table_multiple_aliases.q.out 
> PRE-CREATION 
>   serde/src/java/org/apache/hadoop/hive/serde2/ColumnProjectionUtils.java 
> 1354680584305bc7ea928526160f08fc9cbfd73e 
> 
> Diff: https://reviews.apache.org/r/55816/diff/
> 
> 
> Testing
> ---
> 
> Added qtest.
> 
> 
> Thanks,
> 
> Anthony Hsu
> 
>

Re: Review Request 55816: HIVE-15680: Incorrect results when hive.optimize.index.filter=true and same ORC table is referenced twice in query

2017-01-30 Thread Anthony Hsu


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55816/
---

(Updated Jan. 31, 2017, 2:43 a.m.)


Review request for hive.


Changes
---

In HiveInputFormat.java, changed

```
  ColumnProjectionUtils.setReadAllColumns(jobConf);
```

to

```
  ColumnProjectionUtils.appendReadColumns(jobConf, new ArrayList(),
  new ArrayList(), new ArrayList());
```

Also fixed most of the warnings reported by Peter (all except the "More than 7 
parameters" one).


Bugs: HIVE-15680
https://issues.apache.org/jira/browse/HIVE-15680


Repository: hive-git


Description
---

HIVE-15680: Incorrect results when hive.optimize.index.filter=true and same ORC 
table is referenced twice in query


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 
68dd5e7247415dec1e353010ea34481c4f2fc6cd 
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 
51530ac16c92cc75d501bfcb573557754ba0c964 
  ql/src/test/queries/clientpositive/orc_ppd_same_table_multiple_aliases.q 
PRE-CREATION 
  ql/src/test/results/clientpositive/orc_ppd_same_table_multiple_aliases.q.out 
PRE-CREATION 
  serde/src/java/org/apache/hadoop/hive/serde2/ColumnProjectionUtils.java 
1354680584305bc7ea928526160f08fc9cbfd73e 

Diff: https://reviews.apache.org/r/55816/diff/


Testing
---

Added qtest.


Thanks,

Anthony Hsu

Re: Review Request 55816: HIVE-15680: Incorrect results when hive.optimize.index.filter=true and same ORC table is referenced twice in query

2017-01-28 Thread Anthony Hsu


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55816/
---

(Updated Jan. 28, 2017, 9:41 p.m.)


Review request for hive.


Changes
---

Add back `!neededColumnIDs.isEmpty()` check, add `newConfStr.isEmpty()` check 
in ColumnProjectionUtils.appendReadColumns().
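
To show why an emptiness check matters when appending to a comma-separated configuration value, here is a small sketch. It is an assumption about the general shape of the problem, not the ColumnProjectionUtils code; the class and method names are hypothetical.

```java
public class ReadColumnsMergeDemo {
    // Appends newIds to an existing comma-separated value. Empty or null
    // pieces are skipped so the result never gains a stray leading or
    // trailing comma.
    static String append(String existing, String newIds) {
        boolean noExisting = existing == null || existing.isEmpty();
        boolean noNew = newIds == null || newIds.isEmpty();
        if (noNew) {
            return noExisting ? "" : existing;
        }
        if (noExisting) {
            return newIds;
        }
        return existing + "," + newIds;
    }

    public static void main(String[] args) {
        System.out.println(append("", "1,2"));
        System.out.println(append("0", ""));
        System.out.println(append("0", "1"));
    }
}
```

Without the checks, appending an empty string to "0" would yield "0," and a later parser would have to cope with an empty column ID.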


Bugs: HIVE-15680
https://issues.apache.org/jira/browse/HIVE-15680


Repository: hive-git


Description
---

HIVE-15680: Incorrect results when hive.optimize.index.filter=true and same ORC 
table is referenced twice in query


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 
1cf24b41c047b9bc43e42a2940ff54a3e331190c 
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 
3ee8fdc24aa115710d2c42f5c44c7f28e0544589 
  ql/src/test/queries/clientpositive/orc_ppd_same_table_multiple_aliases.q 
PRE-CREATION 
  ql/src/test/results/clientpositive/orc_ppd_same_table_multiple_aliases.q.out 
PRE-CREATION 
  serde/src/java/org/apache/hadoop/hive/serde2/ColumnProjectionUtils.java 
1354680584305bc7ea928526160f08fc9cbfd73e 

Diff: https://reviews.apache.org/r/55816/diff/


Testing
---

Added qtest.


Thanks,

Anthony Hsu

Re: Review Request 55816: HIVE-15680: Incorrect results when hive.optimize.index.filter=true and same ORC table is referenced twice in query

2017-01-26 Thread Anthony Hsu


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55816/
---

(Updated Jan. 26, 2017, 10:45 p.m.)


Review request for hive.


Changes
---

Fix NPEs in LLAP tests.


Bugs: HIVE-15680
https://issues.apache.org/jira/browse/HIVE-15680


Repository: hive-git


Description
---

HIVE-15680: Incorrect results when hive.optimize.index.filter=true and same ORC 
table is referenced twice in query


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 
1cf24b41c047b9bc43e42a2940ff54a3e331190c 
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 
3ee8fdc24aa115710d2c42f5c44c7f28e0544589 
  ql/src/test/queries/clientpositive/orc_ppd_same_table_multiple_aliases.q 
PRE-CREATION 
  ql/src/test/results/clientpositive/orc_ppd_same_table_multiple_aliases.q.out 
PRE-CREATION 

Diff: https://reviews.apache.org/r/55816/diff/


Testing
---

Added qtest.


Thanks,

Anthony Hsu

Re: Review Request 55816: HIVE-15680: Incorrect results when hive.optimize.index.filter=true and same ORC table is referenced twice in query

2017-01-26 Thread Anthony Hsu


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55816/
---

(Updated Jan. 26, 2017, 5:42 p.m.)


Review request for hive.


Changes
---

Added some missing null checks.


Bugs: HIVE-15680
https://issues.apache.org/jira/browse/HIVE-15680


Repository: hive-git


Description
---

HIVE-15680: Incorrect results when hive.optimize.index.filter=true and same ORC 
table is referenced twice in query


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 
1cf24b41c047b9bc43e42a2940ff54a3e331190c 
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 
3ee8fdc24aa115710d2c42f5c44c7f28e0544589 
  ql/src/test/queries/clientpositive/orc_ppd_same_table_multiple_aliases.q 
PRE-CREATION 
  ql/src/test/results/clientpositive/orc_ppd_same_table_multiple_aliases.q.out 
PRE-CREATION 

Diff: https://reviews.apache.org/r/55816/diff/


Testing
---

Added qtest.


Thanks,

Anthony Hsu

Re: Review Request 55816: HIVE-15680: Incorrect results when hive.optimize.index.filter=true and same ORC table is referenced twice in query

2017-01-23 Thread Anthony Hsu


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55816/
---

(Updated Jan. 24, 2017, 12:54 a.m.)


Review request for hive.


Changes
---

Added back the code for setting the needed column names and paths. Updated the 
code to merge the names and paths.


Bugs: HIVE-15680
https://issues.apache.org/jira/browse/HIVE-15680


Repository: hive-git


Description
---

HIVE-15680: Incorrect results when hive.optimize.index.filter=true and same ORC 
table is referenced twice in query


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 
1cf24b41c047b9bc43e42a2940ff54a3e331190c 
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 
3ee8fdc24aa115710d2c42f5c44c7f28e0544589 
  ql/src/test/queries/clientpositive/orc_ppd_same_table_multiple_aliases.q 
PRE-CREATION 
  ql/src/test/results/clientpositive/orc_ppd_same_table_multiple_aliases.q.out 
PRE-CREATION 

Diff: https://reviews.apache.org/r/55816/diff/


Testing
---

Added qtest.


Thanks,

Anthony Hsu

Review Request 55816: HIVE-15680: Incorrect results when hive.optimize.index.filter=true and same ORC table is referenced twice in query

2017-01-20 Thread Anthony Hsu


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55816/
---

Review request for hive.


Bugs: HIVE-15680
https://issues.apache.org/jira/browse/HIVE-15680


Repository: hive-git


Description
---

HIVE-15680: Incorrect results when hive.optimize.index.filter=true and same ORC 
table is referenced twice in query


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 
1cf24b41c047b9bc43e42a2940ff54a3e331190c 
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 
3ee8fdc24aa115710d2c42f5c44c7f28e0544589 
  ql/src/test/queries/clientpositive/orc_ppd_same_table_multiple_aliases.q 
PRE-CREATION 
  ql/src/test/results/clientpositive/orc_ppd_same_table_multiple_aliases.q.out 
PRE-CREATION 

Diff: https://reviews.apache.org/r/55816/diff/


Testing
---

Added qtest.


Thanks,

Anthony Hsu

[jira] [Created] (HIVE-15680) Incorrect results when hive.optimize.index.filter=true and same ORC table is referenced twice in query

2017-01-20 Thread Anthony Hsu (JIRA)

Anthony Hsu created HIVE-15680:
--

 Summary: Incorrect results when hive.optimize.index.filter=true 
and same ORC table is referenced twice in query
 Key: HIVE-15680
 URL: https://issues.apache.org/jira/browse/HIVE-15680
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.1.0, 2.2.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu


To repro:

{noformat}
set hive.optimize.index.filter=true;

create table test_table(number int) stored as ORC;

-- Two insertions will create two files, with one stripe each
insert into table test_table VALUES (1);
insert into table test_table VALUES (2);

-- This should and does return 2 records
select * from test_table;

-- These should and do each return 1 record
select * from test_table where number = 1;
select * from test_table where number = 2;

-- This should return 2 records but only returns 1 record
select * from test_table where number = 1
union all
select * from test_table where number = 2;
{noformat}

What's happening is only the last predicate is being pushed down.
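
As a rough illustration of how "only the last predicate is pushed down" can arise, here is a toy model. It is not Hive's code; the class and method names are hypothetical, and the assumption is simply that per-scan filters end up in a shared map keyed by table rather than by scan alias.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only: names are hypothetical and do not mirror Hive's classes.
final class PushdownModel {
    private PushdownModel() {}

    /** Keys filters by table name, so a second scan of the same table replaces the first. */
    static Map<String, String> keyedByTable(String[][] scans) {
        Map<String, String> filters = new LinkedHashMap<>();
        for (String[] scan : scans) {
            // scan = {alias, table, predicate}
            filters.put(scan[1], scan[2]);
        }
        return filters;
    }

    /** Keys filters by scan alias, so each scan keeps its own predicate. */
    static Map<String, String> keyedByAlias(String[][] scans) {
        Map<String, String> filters = new LinkedHashMap<>();
        for (String[] scan : scans) {
            filters.put(scan[0], scan[2]);
        }
        return filters;
    }
}
```

With two scans of {{test_table}}, the table-keyed map retains one predicate while the alias-keyed map retains both.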



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Created] (HIVE-15438) avrocountemptytbl.q should use SORT_QUERY_RESULTS

2016-12-15 Thread Anthony Hsu (JIRA)

Anthony Hsu created HIVE-15438:
--

 Summary: avrocountemptytbl.q should use SORT_QUERY_RESULTS
 Key: HIVE-15438
 URL: https://issues.apache.org/jira/browse/HIVE-15438
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.1.0, 2.2.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu


In Hive 1.1.0, when building and testing using Java 1.8, I've noticed that 
avrocountemptytbl.q fails due to ordering issues:

{noformat}
57d56
< 100
58a58
> 100
{noformat}

This can be fixed by adding {{-- SORT_QUERY_RESULTS}} to the qtest.
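
The idea behind an order-insensitive comparison can be sketched as follows. This is a minimal illustration, not the qtest framework's implementation; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: compares two result sets after sorting their lines.
final class SortedResultCompare {
    private SortedResultCompare() {}

    /** Returns true when both lists hold the same lines, regardless of order. */
    static boolean sameIgnoringOrder(List<String> expected, List<String> actual) {
        List<String> e = new ArrayList<>(expected);
        List<String> a = new ArrayList<>(actual);
        Collections.sort(e);
        Collections.sort(a);
        return e.equals(a);
    }
}
```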




Review Request 54765: HIVE-15411: ADD PARTITION should support setting FILEFORMAT and SERDEPROPERTIES

2016-12-14 Thread Anthony Hsu


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54765/
---

Review request for hive.


Bugs: HIVE-15411
https://issues.apache.org/jira/browse/HIVE-15411


Repository: hive-git


Description
---

HIVE-15411: ADD PARTITION should support setting FILEFORMAT and SERDEPROPERTIES


Diffs
-

  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/cli/SemanticAnalysis/HCatSemanticAnalyzer.java
 18bf172116828439751ca4d0e99c83912f2b3915 
  ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
3f5813018b9305734e66dcff76064d6e3e6061f1 
  ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g 
55915a63be916b79dae022d76a4252ab1a18c64b 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java 
ce952c5ee4d54b4c2a092f9ee15197ec0337fb4c 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
79e55b2de07983c7b799ff382b9c71ef14d25b43 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzerFactory.java 
520d3de9a7cc07b728b5d3ad3845622ddbec22fb 
  
ql/src/test/queries/clientpositive/add_part_fileformat_serdeproperties_location.q
 PRE-CREATION 
  
ql/src/test/results/clientpositive/add_part_fileformat_serdeproperties_location.q.out
 PRE-CREATION 

Diff: https://reviews.apache.org/r/54765/diff/


Testing
---

Added qtest.


Thanks,

Anthony Hsu

[jira] [Created] (HIVE-15411) ADD PARTITION should support setting FILEFORMAT and SERDEPROPERTIES

2016-12-09 Thread Anthony Hsu (JIRA)

Anthony Hsu created HIVE-15411:
--

 Summary: ADD PARTITION should support setting FILEFORMAT and 
SERDEPROPERTIES
 Key: HIVE-15411
 URL: https://issues.apache.org/jira/browse/HIVE-15411
 Project: Hive
  Issue Type: Improvement
Reporter: Anthony Hsu
Assignee: Anthony Hsu


Currently, {{ALTER TABLE ... ADD PARTITION}} only lets you set the partition's 
LOCATION but not its FILEFORMAT or SERDEPROPERTIES. In order to change the 
FILEFORMAT or SERDEPROPERTIES, you have to issue two additional calls to 
{{ALTER TABLE ... PARTITION ... SET FILEFORMAT}} and {{ALTER TABLE ... 
PARTITION ... SET SERDEPROPERTIES}}. This is not atomic, and queries that 
interleave the ALTER TABLE commands may fail.

We should extend the grammar to support setting FILEFORMAT and SERDEPROPERTIES 
atomically as part of the ADD PARTITION command.




[jira] [Created] (HIVE-15400) EXCHANGE PARTITION should honor partition locations

2016-12-08 Thread Anthony Hsu (JIRA)

Anthony Hsu created HIVE-15400:
--

 Summary: EXCHANGE PARTITION should honor partition locations
 Key: HIVE-15400
 URL: https://issues.apache.org/jira/browse/HIVE-15400
 Project: Hive
  Issue Type: Bug
Reporter: Anthony Hsu


Currently, if you add a partition with a custom location, EXCHANGE PARTITION 
will fail with a "File ... does not exist" error:
{noformat}
drop table if exists text_partitioned;
drop table if exists text_partitioned2;

create table text_partitioned (b string) partitioned by (a int) stored as 
textfile;
create table text_partitioned2 (b string) partitioned by (a int) stored as 
textfile;

alter table text_partitioned add partition (a=1) location '/tmp/text/1';

alter table text_partitioned2 exchange partition (a=1) with table 
text_partitioned;
{noformat}

The last command fails with
{code}
FAILED: Execution Error, return code 1 from 
org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Got exception: 
java.io.FileNotFoundException File 
file:/path/to/warehouse_dir/text_partitioned/a=1 does not exist)
{code}

EXCHANGE PARTITION should honor the location that has been set for the 
partition.




[jira] [Created] (HIVE-15394) HiveMetaStoreClient add_partition API should not allow partitions with a null StorageDescriptor.cols to be added

2016-12-08 Thread Anthony Hsu (JIRA)

Anthony Hsu created HIVE-15394:
--

 Summary: HiveMetaStoreClient add_partition API should not allow 
partitions with a null StorageDescriptor.cols to be added
 Key: HIVE-15394
 URL: https://issues.apache.org/jira/browse/HIVE-15394
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.1.0, 2.2.0
Reporter: Anthony Hsu


Follow up to HIVE-15353.




Re: Review Request 54341: HIVE-15353: Metastore throws NPE if StorageDescriptor.cols is null

2016-12-08 Thread Anthony Hsu


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54341/
---

(Updated Dec. 8, 2016, 9:23 p.m.)


Review request for hive.


Changes
---

New version no longer updates the Thrift definition but just fixes the NPEs in 
the alter_partition code path.


Bugs: HIVE-15353
https://issues.apache.org/jira/browse/HIVE-15353


Repository: hive-git


Description (updated)
---

Update alter_partition() code path to fix NPEs.


Diffs (updated)
-

  metastore/src/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java 
86565a4198d5daced5e230a41d8ada577a656268 
  metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 
9ea6ac40d6f0eb9081c5cfad982ffc435f15f6fd 

Diff: https://reviews.apache.org/r/54341/diff/


Testing
---

After making these changes, I no longer encounter NullPointerExceptions when 
setting cols to null in create_table, alter_table, and alter_partition calls.


Thanks,

Anthony Hsu

Re: Review Request 54341: HIVE-15353: Metastore throws NPE if StorageDescriptor.cols is null

2016-12-05 Thread Anthony Hsu


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54341/
---

(Updated Dec. 5, 2016, 5:30 p.m.)


Review request for hive.


Changes
---

Fixed HiveMetaStore unit tests.


Bugs: HIVE-15353
https://issues.apache.org/jira/browse/HIVE-15353


Repository: hive-git


Description
---

Set a default value for StorageDescriptor.cols of empty list to avoid having to 
do null checks everywhere. However, null checks are still needed to guard 
against existing null values in the database (add_partition previously allowed 
you to store nulls in the database).


Diffs (updated)
-

  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java
 21d1b46fcbd4f8f10ee447dce9d40dd6b43a2793 
  metastore/if/hive_metastore.thrift baab31bb0f44361847224843f905c0417b1670be 
  metastore/src/gen/thrift/gen-cpp/hive_metastore_types.h 
6838133083684ee3b93a93129bb492ab29a4842e 
  
metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/StorageDescriptor.java
 938f06bbce7a2b213e901f153e1da4606339c0cf 
  metastore/src/gen/thrift/gen-php/metastore/Types.php 
b9af4efc5f8b7cdf19236db7d68865bdec8382a5 
  metastore/src/gen/thrift/gen-py/hive_metastore/ttypes.py 
21c039006fc05bc603fda0eeedc92174583f8403 
  metastore/src/gen/thrift/gen-rb/hive_metastore_types.rb 
c73593298bbddb46e0926b01ccb9c6fb5d880452 
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java 
86565a4198d5daced5e230a41d8ada577a656268 
  metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 
9ea6ac40d6f0eb9081c5cfad982ffc435f15f6fd 

Diff: https://reviews.apache.org/r/54341/diff/


Testing
---

After making these changes, I no longer encounter NullPointerExceptions when 
setting cols to null in create_table, alter_table, and alter_partition calls.


Thanks,

Anthony Hsu

Review Request 54341: HIVE-15353: Metastore throws NPE if StorageDescriptor.cols is null

2016-12-03 Thread Anthony Hsu


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54341/
---

Review request for hive.


Bugs: HIVE-15353
https://issues.apache.org/jira/browse/HIVE-15353


Repository: hive-git


Description
---

Set a default value for StorageDescriptor.cols of empty list to avoid having to 
do null checks everywhere. However, null checks are still needed to guard 
against existing null values in the database (add_partition previously allowed 
you to store nulls in the database).


Diffs
-

  metastore/if/hive_metastore.thrift baab31bb0f44361847224843f905c0417b1670be 
  metastore/src/gen/thrift/gen-cpp/hive_metastore_types.h 
6838133083684ee3b93a93129bb492ab29a4842e 
  
metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/StorageDescriptor.java
 938f06bbce7a2b213e901f153e1da4606339c0cf 
  metastore/src/gen/thrift/gen-php/metastore/Types.php 
b9af4efc5f8b7cdf19236db7d68865bdec8382a5 
  metastore/src/gen/thrift/gen-py/hive_metastore/ttypes.py 
21c039006fc05bc603fda0eeedc92174583f8403 
  metastore/src/gen/thrift/gen-rb/hive_metastore_types.rb 
c73593298bbddb46e0926b01ccb9c6fb5d880452 
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java 
86565a4198d5daced5e230a41d8ada577a656268 
  metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 
9ea6ac40d6f0eb9081c5cfad982ffc435f15f6fd 

Diff: https://reviews.apache.org/r/54341/diff/


Testing
---

After making these changes, I no longer encounter NullPointerExceptions when 
setting cols to null in create_table, alter_table, and alter_partition calls.


Thanks,

Anthony Hsu

[jira] [Created] (HIVE-15353) Metastore throws NPE if StorageDescriptor.cols is null

2016-12-03 Thread Anthony Hsu (JIRA)

Anthony Hsu created HIVE-15353:
--

 Summary: Metastore throws NPE if StorageDescriptor.cols is null
 Key: HIVE-15353
 URL: https://issues.apache.org/jira/browse/HIVE-15353
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.1.0, 2.2.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu


When using the HiveMetaStoreClient API directly to talk to the metastore, you 
get NullPointerExceptions when StorageDescriptor.cols is null in the 
Table/Partition object in the following calls:

* create_table
* alter_table
* alter_partition

Calling add_partition with StorageDescriptor.cols set to null causes null to be 
stored in the metastore database and subsequent calls to alter_partition for 
that partition to fail with an NPE.

The simplest way to fix these NPEs seems to be to update the 
StorageDescriptor.cols Thrift definition and set a default value of empty list. 
Some null checks will also have to be added to handle existing nulls in the 
metastore database.
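
The null checks mentioned above could look roughly like the sketch below. It is a generic helper under the assumption that a null column list should be treated as an empty one; the class and method names are hypothetical and are not part of the metastore code.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Hive: normalizes a possibly-null column list.
final class NullSafeCols {
    private NullSafeCols() {}

    /** Returns the given list, or an immutable empty list when it is null. */
    static <T> List<T> orEmpty(List<T> cols) {
        return cols == null ? Collections.<T>emptyList() : cols;
    }

    /** Compares two column lists, treating null and empty as equal. */
    static <T> boolean sameCols(List<T> oldCols, List<T> newCols) {
        return orEmpty(oldCols).equals(orEmpty(newCols));
    }
}
```

Routing reads of cols through a helper like this keeps existing nulls in the database from surfacing as NPEs at each call site.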




[jira] [Created] (HIVE-15289) Flaky test: TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver (setup)

2016-11-27 Thread Anthony Hsu (JIRA)

Anthony Hsu created HIVE-15289:
--

 Summary: Flaky test: 
TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver (setup)
 Key: HIVE-15289
 URL: https://issues.apache.org/jira/browse/HIVE-15289
 Project: Hive
  Issue Type: Sub-task
Reporter: Anthony Hsu


In recent PreCommit builds, TestSparkCliDriver has failed during setup with 
errors like the following:

From https://builds.apache.org/job/PreCommit-HIVE-Build/2292/testReport/:
{noformat}
Failed during createSources processLine with code=3
...
Job failed with java.io.IOException: Failed to create local dir in 
/tmp/blockmgr-be4539eb-7896-4903-89c9-7ae1c48faa24/01.
at 
org.apache.spark.storage.DiskBlockManager.getFile(DiskBlockManager.scala:70)
at org.apache.spark.storage.DiskStore.contains(DiskStore.scala:124)
at 
org.apache.spark.storage.BlockManager.org$apache$spark$storage$BlockManager$$getCurrentBlockStatus(BlockManager.scala:379)
at 
org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:959)
at 
org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:910)
at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:866)
at 
org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:910)
at 
org.apache.spark.storage.BlockManager.putIterator(BlockManager.scala:700)
at 
org.apache.spark.storage.BlockManager.putSingle(BlockManager.scala:1213)
at 
org.apache.spark.broadcast.TorrentBroadcast.writeBlocks(TorrentBroadcast.scala:103)
at 
org.apache.spark.broadcast.TorrentBroadcast.<init>(TorrentBroadcast.scala:86)
at 
org.apache.spark.broadcast.TorrentBroadcastFactory.newBroadcast(TorrentBroadcastFactory.scala:34)
at 
org.apache.spark.broadcast.BroadcastManager.newBroadcast(BroadcastManager.scala:56)
at org.apache.spark.SparkContext.broadcast(SparkContext.scala:1370)
at org.apache.spark.rdd.HadoopRDD.<init>(HadoopRDD.scala:125)
at 
org.apache.spark.SparkContext$$anonfun$hadoopRDD$1.apply(SparkContext.scala:965)
at 
org.apache.spark.SparkContext$$anonfun$hadoopRDD$1.apply(SparkContext.scala:961)
at 
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at 
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.SparkContext.withScope(SparkContext.scala:682)
at org.apache.spark.SparkContext.hadoopRDD(SparkContext.scala:961)
at 
org.apache.spark.api.java.JavaSparkContext.hadoopRDD(JavaSparkContext.scala:412)
at 
org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generateMapInput(SparkPlanGenerator.java:205)
at 
org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generateParentTran(SparkPlanGenerator.java:145)
at 
org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generate(SparkPlanGenerator.java:117)
at 
org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient$JobStatusJob.call(RemoteHiveSparkClient.java:339)
at 
org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:358)
at 
org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:323)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
{noformat}

From https://builds.apache.org/job/PreCommit-HIVE-Build/2291/testReport/:
{noformat}
Failed during createSources processLine with code=1
...
Failed to monitor Job[ 11] with exception 
'org.apache.hadoop.hive.ql.metadata.HiveException(java.util.concurrent.TimeoutException)'
{noformat}




[jira] [Created] (HIVE-15288) Flaky test: TestMiniTezCliDriver.testCliDriver[explainuser_3]

2016-11-27 Thread Anthony Hsu (JIRA)

Anthony Hsu created HIVE-15288:
--

 Summary: Flaky test: 
TestMiniTezCliDriver.testCliDriver[explainuser_3]
 Key: HIVE-15288
 URL: https://issues.apache.org/jira/browse/HIVE-15288
 Project: Hive
  Issue Type: Sub-task
Reporter: Anthony Hsu


explainuser_3.q sometimes fails with the following diff:
{noformat}
34c34
< Select Operator [SEL_7] (rows=16 width=106)
---
> Select Operator [SEL_7] (rows=16 width=107)
38c38
< Select Operator [SEL_5] (rows=16 width=106)
---
> Select Operator [SEL_5] (rows=16 width=107)
40c40
<   TableScan [TS_0] (rows=16 width=106)
---
>   TableScan [TS_0] (rows=16 width=107)
{noformat}

It was also previously reported as flaky in HIVE-14689.




[jira] [Created] (HIVE-15287) Flaky test: TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata]

2016-11-27 Thread Anthony Hsu (JIRA)

Anthony Hsu created HIVE-15287:
--

 Summary: Flaky test: 
TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 Key: HIVE-15287
 URL: https://issues.apache.org/jira/browse/HIVE-15287
 Project: Hive
  Issue Type: Sub-task
Reporter: Anthony Hsu


insert_values_orig_table_use_metadata.q sometimes fails with the following 
differences:
{noformat}
315c315
<   totalSize 1545
---
>   totalSize 1508
343c343
< Statistics: Num rows: 1 Data size: 1545 Basic stats: COMPLETE Column stats: COMPLETE
---
> Statistics: Num rows: 1 Data size: 1508 Basic stats: COMPLETE Column stats: COMPLETE
345c345
<   Statistics: Num rows: 1 Data size: 1545 Basic stats: COMPLETE Column stats: COMPLETE
---
>   Statistics: Num rows: 1 Data size: 1508 Basic stats: COMPLETE Column stats: COMPLETE
439c439
<   totalSize 3091
---
>   totalSize 3016
467c467
< Statistics: Num rows: 1 Data size: 3091 Basic stats: COMPLETE Column stats: COMPLETE
---
> Statistics: Num rows: 1 Data size: 3016 Basic stats: COMPLETE Column stats: COMPLETE
469c469
<   Statistics: Num rows: 1 Data size: 3091 Basic stats: COMPLETE Column stats: COMPLETE
---
>   Statistics: Num rows: 1 Data size: 3016 Basic stats: COMPLETE Column stats: COMPLETE
547c547
<   totalSize 380328
---
>   totalSize 380253
575c575
< Statistics: Num rows: 1 Data size: 380328 Basic stats: COMPLETE Column stats: COMPLETE
---
> Statistics: Num rows: 1 Data size: 380253 Basic stats: COMPLETE Column stats: COMPLETE
577c577
<   Statistics: Num rows: 1 Data size: 380328 Basic stats: COMPLETE Column stats: COMPLETE
---
>   Statistics: Num rows: 1 Data size: 380253 Basic stats: COMPLETE Column stats: COMPLETE
{noformat}




[jira] [Created] (HIVE-15286) Flaky test: TestCliDriver.testCliDriver[autoColumnStats_4]

2016-11-27 Thread Anthony Hsu (JIRA)

Anthony Hsu created HIVE-15286:
--

 Summary: Flaky test: TestCliDriver.testCliDriver[autoColumnStats_4]
 Key: HIVE-15286
 URL: https://issues.apache.org/jira/browse/HIVE-15286
 Project: Hive
  Issue Type: Sub-task
Reporter: Anthony Hsu


autoColumnStats_4.q sometimes fails with the following differences:
{noformat}
203c203
<   totalSize 1707
---
>   totalSize 1714
246c246
<   totalSize 2920
---
>   totalSize 2719
{noformat}




Review Request 54094: HIVE-15190: Field names are not preserved in ORC files written with ACID

2016-11-26 Thread Anthony Hsu


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54094/
---

Review request for hive.


Bugs: HIVE-15190
https://issues.apache.org/jira/browse/HIVE-15190


Repository: hive-git


Description
---

Previously, when writing to an ACID ORC table, the file written to disk would 
have a schema of `struct<...(acid 
columns)...,row:struct<_col0:int,_col1:string,...>>`, using virtual column 
names `_col0`, `_col1`, etc., instead of the actual table column names. This 
patch fixes this issue.

Having the actual table column names in the ORC file itself is needed when 
doing schema evolution based on field names: 
https://issues.apache.org/jira/browse/ORC-54


Diffs
-

  orc/src/java/org/apache/orc/impl/SchemaEvolution.java 
7379de93a7f39d734ef7695c197bd9f24bc84321 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcFile.java 
53660206e3f59c37be261b1a9796f04721a244f3 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRawRecordMerger.java 
efde2db482367f1037c486df9c5cabd67b1368ed 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRecordUpdater.java 
492c64c29e8d4f38d857381bc375074e06868f7c 
  
ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java
 75c7680e267ab44e426d0b21c6fd6dce6a352bbd 
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java 
49ba6675bae5b3e6d8bf1fa2e9ed8d2a27b7f83a 

Diff: https://reviews.apache.org/r/54094/diff/


Testing
---

Added unit test. Also ran some of the existing ACID tests and they still passed.


Thanks,

Anthony Hsu

[jira] [Created] (HIVE-15190) Field names are not preserved in ORC files written with ACID

2016-11-13 Thread Anthony Hsu (JIRA)

Anthony Hsu created HIVE-15190:
--

 Summary: Field names are not preserved in ORC files written with 
ACID
 Key: HIVE-15190
 URL: https://issues.apache.org/jira/browse/HIVE-15190
 Project: Hive
  Issue Type: Bug
Affects Versions: 2.1.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu


To repro:
{noformat}
drop table if exists orc_nonacid;
drop table if exists orc_acid;

create table orc_nonacid (a int) clustered by (a) into 2 buckets stored as orc;
create table orc_acid (a int) clustered by (a) into 2 buckets stored as orc 
TBLPROPERTIES('transactional'='true');

insert into table orc_nonacid values(1), (2);
insert into table orc_acid values(1), (2);
{noformat}

Running {{hive --service orcfiledump }} on the files created by the 
{{insert}} statements above, you'll see that for {{orc_nonacid}}, the files 
have schema {{struct<a:int>}} whereas for {{orc_acid}}, the files have schema 
{{struct<operation:int,originalTransaction:bigint,bucket:int,rowId:bigint,currentTransaction:bigint,row:struct<_col0:int>>}}.
The last field {{row}} should have schema {{struct<a:int>}}.




[jira] [Created] (HIVE-13993) Hive should provide built-in UDF that can apply another UDF to each element of an array

2016-06-10 Thread Anthony Hsu (JIRA)

Anthony Hsu created HIVE-13993:
--

 Summary: Hive should provide built-in UDF that can apply another 
UDF to each element of an array
 Key: HIVE-13993
 URL: https://issues.apache.org/jira/browse/HIVE-13993
 Project: Hive
  Issue Type: New Feature
Reporter: Anthony Hsu


There is currently no simple way to take an array field and apply a UDF on each 
element of the array, returning a new array. This is a basic use case that Hive 
should provide a built-in UDF for. More motivation: 
http://stackoverflow.com/questions/27722493/how-to-invoke-udf-for-each-element-in-an-array-in-hive
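
To make the requested semantics concrete, here is a small sketch in plain Java. It is not a Hive UDF and uses no Hive APIs; the class and method names are hypothetical, and returning null for a null input array is an assumption about the desired behavior.

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical illustration of "apply a function to each array element".
final class ArrayMap {
    private ArrayMap() {}

    /** Applies udf to every element and returns a new list; a null input yields null (assumed). */
    static <T, R> List<R> mapElements(List<T> array, Function<? super T, ? extends R> udf) {
        if (array == null) {
            return null;
        }
        return array.stream().<R>map(udf).collect(Collectors.toList());
    }
}
```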




Re: Review Request 45348: HIVE-13363: Add hive.metastore.token.signature property to HiveConf

2016-05-03 Thread Anthony Hsu


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45348/
---

(Updated May 4, 2016, 1:30 a.m.)


Review request for hive, Carl Steinbach and Ratandeep Ratti.


Changes
---

Fixed bug in original revision that caused build to fail.


Bugs: HIVE-13363
https://issues.apache.org/jira/browse/HIVE-13363


Repository: hive-git


Description
---

No logic changes, just added METASTORE_TOKEN_SIGNATURE property to HiveConf and 
replaced all instances of `hive.metastore.token.signature` with references to 
`HiveConf.ConfVars.METASTORE_TOKEN_SIGNATURE`.


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 
06a6906ef1f5e0b7d941c042c74d257089f46f96 
  hcatalog/core/src/main/java/org/apache/hive/hcatalog/common/HCatUtil.java 
3ee30edef50940b4d9da21230177d6fb2a796819 
  
hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/SecureProxySupport.java
 13f3c9bd5e523e770dd8ccfd75a442bbbf93b680 
  
itests/hive-unit-hadoop2/src/test/java/org/apache/hadoop/hive/thrift/TestHadoopAuthBridge23.java
 d07162bd46f8bea88d8c856552a2b4a2d83caf8d 
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java 
7d37d0706d5f0269b89c4c6486adecf4bb3d85b8 
  
service/src/java/org/apache/hive/service/cli/session/HiveSessionImplwithUGI.java
 025b0b810b040ba6ea72b900ccd0802e207033a8 

Diff: https://reviews.apache.org/r/45348/diff/


Testing
---

Ran `grep -r hive.metastore.token.signature --include=*.java *` and saw that 
the only occurrences of this string are in HiveConf.java and a comment in 
Security.java.


Thanks,

Anthony Hsu

Review Request 46790: HIVE-13644: Remove hardcoded groovy.grape.report.downloads=true from DependencyResolver

2016-04-28 Thread Anthony Hsu


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/46790/
---

Review request for hive, Carl Steinbach, Mark Wagner, and Ratandeep Ratti.


Bugs: HIVE-13644
https://issues.apache.org/jira/browse/HIVE-13644


Repository: hive-git


Description
---

HIVE-13644: Remove hardcoded groovy.grape.report.downloads=true from 
DependencyResolver


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/util/DependencyResolver.java 
3891e59a274e6449c5f50eea51e4f23762efcbc0 

Diff: https://reviews.apache.org/r/46790/diff/


Testing
---

Tested manually.


Thanks,

Anthony Hsu

[jira] [Created] (HIVE-13644) Remove hardcoded groovy.grape.report.downloads=true from DependencyResolver

2016-04-28 Thread Anthony Hsu (JIRA)

Anthony Hsu created HIVE-13644:
--

 Summary: Remove hardcoded groovy.grape.report.downloads=true from 
DependencyResolver
 Key: HIVE-13644
 URL: https://issues.apache.org/jira/browse/HIVE-13644
 Project: Hive
  Issue Type: Improvement
Reporter: Anthony Hsu
Assignee: Anthony Hsu


Currently, in Hive's 
[DependencyResolver.java|https://github.com/apache/hive/blob/8dd1d1966f2f0b86604b4e991ebc865224f42b41/ql/src/java/org/apache/hadoop/hive/ql/util/DependencyResolver.java#L176],
 the system property {{groovy.grape.report.downloads}} is hardcoded to {{true}} 
and there is no way to override it and disable the logging. We should remove 
this hardcoded value and allow users to configure it as they see fit.
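
One possible shape for this, sketched under the assumption that an existing user-supplied value should win over any default, is shown below. The class and method names are hypothetical; only the property name comes from the description above.

```java
// Hypothetical sketch: apply a default only when the user has not set the property.
final class GrapeReportDefaults {
    static final String KEY = "groovy.grape.report.downloads";

    private GrapeReportDefaults() {}

    /** Sets KEY to defaultValue only if it is unset, then returns the effective value. */
    static String applyDefault(String defaultValue) {
        if (System.getProperty(KEY) == null) {
            System.setProperty(KEY, defaultValue);
        }
        return System.getProperty(KEY);
    }
}
```

Under this approach, starting the JVM with -Dgroovy.grape.report.downloads=false would leave the user's setting in place.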




Review Request 45348: HIVE-13363: Add hive.metastore.token.signature property to HiveConf

2016-03-25 Thread Anthony Hsu


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45348/
---

Review request for hive, Carl Steinbach and Ratandeep Ratti.


Bugs: HIVE-13363
https://issues.apache.org/jira/browse/HIVE-13363


Repository: hive-git


Description
---

No logic changes, just added METASTORE_TOKEN_SIGNATURE property to HiveConf and 
replaced all instances of `hive.metastore.token.signature` with references to 
`HiveConf.ConfVars.METASTORE_TOKEN_SIGNATURE`.


Diffs
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 
b8870f2ef78884f23e65d9432415e49d89f8ee35 
  hcatalog/core/src/main/java/org/apache/hive/hcatalog/common/HCatUtil.java 
3ee30edef50940b4d9da21230177d6fb2a796819 
  
hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/SecureProxySupport.java
 13f3c9bd5e523e770dd8ccfd75a442bbbf93b680 
  
itests/hive-unit-hadoop2/src/test/java/org/apache/hadoop/hive/thrift/TestHadoopAuthBridge23.java
 d07162bd46f8bea88d8c856552a2b4a2d83caf8d 
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java 
cdd12aba9fb4284bbb9989d7fcbe3c591ef17d98 
  
service/src/java/org/apache/hive/service/cli/session/HiveSessionImplwithUGI.java
 025b0b810b040ba6ea72b900ccd0802e207033a8 

Diff: https://reviews.apache.org/r/45348/diff/


Testing
---

Ran `grep -r hive.metastore.token.signature --include=*.java *` and saw that 
the only occurrences of this string are in HiveConf.java and a comment in 
Security.java.


Thanks,

Anthony Hsu

[jira] [Created] (HIVE-13363) Add hive.metastore.token.signature property to HiveConf

2016-03-25 Thread Anthony Hsu (JIRA)

Anthony Hsu created HIVE-13363:
--

 Summary: Add hive.metastore.token.signature property to HiveConf
 Key: HIVE-13363
 URL: https://issues.apache.org/jira/browse/HIVE-13363
 Project: Hive
  Issue Type: Improvement
Reporter: Anthony Hsu
Assignee: Anthony Hsu


I noticed that the {{hive.metastore.token.signature}} property is not defined 
in HiveConf.java, but hardcoded everywhere it's used in the Hive codebase.

[HIVE-2963] fixes this but was never committed due to being resolved as a 
duplicate ticket.

We should add {{hive.metastore.token.signature}} to HiveConf.java to centralize 
its definition and make the property more discoverable (it's useful to set it 
when talking to multiple metastores).
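
The benefit of centralizing the key can be sketched with a stand-alone enum. This is not HiveConf; the enum name and the empty-string default are assumptions made for illustration, and only the property name is taken from the text above.

```java
import java.util.Properties;

// Hypothetical stand-in for a central config registry; not HiveConf itself.
enum ExampleConfVars {
    METASTORE_TOKEN_SIGNATURE("hive.metastore.token.signature", "");

    final String varname;
    final String defaultValue; // the empty default is an assumption

    ExampleConfVars(String varname, String defaultValue) {
        this.varname = varname;
        this.defaultValue = defaultValue;
    }

    /** Looks the property up in conf, falling back to the declared default. */
    String get(Properties conf) {
        return conf.getProperty(varname, defaultValue);
    }
}
```

Callers then refer to the enum constant instead of repeating the string literal, so the key is defined in one place.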




[jira] [Created] (HIVE-13132) Hive should lazily load and cache metastore (permanent) functions

2016-02-23 Thread Anthony Hsu (JIRA)

Anthony Hsu created HIVE-13132:
--

 Summary: Hive should lazily load and cache metastore (permanent) 
functions
 Key: HIVE-13132
 URL: https://issues.apache.org/jira/browse/HIVE-13132
 Project: Hive
  Issue Type: Improvement
Affects Versions: 0.13.1
Reporter: Anthony Hsu
Assignee: Anthony Hsu


In Hive 0.13.1, we have noticed that as the number of databases increases, the 
start-up time of the Hive interactive shell increases. This is because during 
start-up, all databases are iterated over to fetch the permanent functions to 
display in the {{SHOW FUNCTIONS}} output.

{noformat:title=FunctionRegistry.java}
  private static Set<String> getFunctionNames(boolean searchMetastore) {
    Set<String> functionNames = mFunctions.keySet();
    if (searchMetastore) {
      functionNames = new HashSet<String>(functionNames);
      try {
        Hive db = getHive();
        List<String> dbNames = db.getAllDatabases();

        for (String dbName : dbNames) {
          List<String> funcNames = db.getFunctions(dbName, "*");
          for (String funcName : funcNames) {
            functionNames.add(FunctionUtils.qualifyFunctionName(funcName, dbName));
          }
        }
      } catch (Exception e) {
        LOG.error(e);
        // Continue on, we can still return the functions we've gotten to this point.
      }
    }
    return functionNames;
  }
{noformat}

Instead of eagerly loading all metastore functions, we should only load them 
the first time {{SHOW FUNCTIONS}} is invoked. We should also cache the results.

Note that this issue may have been fixed by HIVE-2573, though I haven't 
verified this.
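
The lazy-load-and-cache idea can be sketched independently of Hive as follows. The class name and the loader callback are hypothetical; in practice the loader would stand in for the per-database metastore lookup shown above.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.function.Supplier;

// Hypothetical sketch: defers an expensive lookup until first use, then caches it.
final class LazyFunctionNames {
    private final Supplier<Set<String>> loader;
    private Set<String> cached; // null until the first call to get()

    LazyFunctionNames(Supplier<Set<String>> loader) {
        this.loader = loader;
    }

    /** Loads the names on the first call only; later calls reuse the cached set. */
    synchronized Set<String> get() {
        if (cached == null) {
            cached = Collections.unmodifiableSet(new HashSet<>(loader.get()));
        }
        return cached;
    }

    /** Drops the cached set so that the next get() reloads it. */
    synchronized void invalidate() {
        cached = null;
    }
}
```

A cache like this needs an invalidation path, since functions created or dropped after the first load would otherwise not show up.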




[jira] [Created] (HIVE-13046) DependencyResolver should not lowercase the dependency URI's authority

2016-02-11 Thread Anthony Hsu (JIRA)

Anthony Hsu created HIVE-13046:
--

 Summary: DependencyResolver should not lowercase the dependency 
URI's authority
 Key: HIVE-13046
 URL: https://issues.apache.org/jira/browse/HIVE-13046
 Project: Hive
  Issue Type: Bug
Reporter: Anthony Hsu
Assignee: Anthony Hsu


When using {{ADD JAR ivy://...}} to add a jar version {{1.2.3-SNAPSHOT}}, Hive 
will lowercase it to {{1.2.3-snapshot}} due to:

{code:title=DependencyResolver.java}
String[] authorityTokens = authority.toLowerCase().split(":");
{code}

We should not call {{.toLowerCase()}} on the authority, since that also lowercases the version.
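
A small sketch of the difference, assuming the authority has the form {{org:module:version}}. The class name, method names, and the example coordinates are made up for illustration and are not taken from {{DependencyResolver}}.

```java
/** Hypothetical demo of splitting an ivy authority with and without lowercasing. */
final class AuthorityTokens {
  /** Current behavior: lowercases the whole authority before splitting. */
  static String[] splitLowercased(String authority) {
    return authority.toLowerCase().split(":");
  }

  /** Proposed behavior: split only, so the version keeps its original case. */
  static String[] splitPreservingCase(String authority) {
    return authority.split(":");
  }

  public static void main(String[] args) {
    String authority = "org.example:my-udfs:1.2.3-SNAPSHOT"; // assumed example value
    System.out.println(splitLowercased(authority)[2]);       // 1.2.3-snapshot
    System.out.println(splitPreservingCase(authority)[2]);   // 1.2.3-SNAPSHOT
  }
}
```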




Review Request 43513: HIVE-13046: DependencyResolver should not lowercase the dependency URI's authority

2016-02-11 Thread Anthony Hsu


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/43513/
---

Review request for hive, Carl Steinbach, Mark Wagner, and Ratandeep Ratti.


Bugs: HIVE-13046
https://issues.apache.org/jira/browse/HIVE-13046


Repository: hive-git


Description
---

HIVE-13046: DependencyResolver should not lowercase the dependency URI's 
authority


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/util/DependencyResolver.java 
3891e59a274e6449c5f50eea51e4f23762efcbc0 

Diff: https://reviews.apache.org/r/43513/diff/


Testing
---

Tested manually.


Thanks,

Anthony Hsu

[jira] [Created] (HIVE-12978) Hive Metastore should have a config for starting background thread services

2016-02-01 Thread Anthony Hsu (JIRA)

Anthony Hsu created HIVE-12978:
--

 Summary: Hive Metastore should have a config for starting 
background thread services
 Key: HIVE-12978
 URL: https://issues.apache.org/jira/browse/HIVE-12978
 Project: Hive
  Issue Type: New Feature
Reporter: Anthony Hsu
Assignee: Anthony Hsu


It would be convenient to have a configuration for setting custom background 
threads to run in the Hive Metastore. This could be useful for running custom 
monitoring, logging, or table/partition registration services.

I propose adding a {{hive.metastore.thread.services}} config that takes a 
comma-separated list of classes that implement the {{MetaStoreThread}} 
interface, which the metastore would launch during start-up.
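
The following is a rough sketch of how such a config value could be consumed, not the patch itself. {{BackgroundService}} is a hypothetical stand-in for {{MetaStoreThread}} (whose real methods are not reproduced here), and {{ThreadServiceLoader}}, {{parseClassNames}}, and {{startAll}} are names invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical loader for a comma-separated list of service class names. */
final class ThreadServiceLoader {
  /** Stand-in for the MetaStoreThread interface; the real interface differs. */
  interface BackgroundService {
    void start();
  }

  /** Splits the config value on commas, trimming whitespace and skipping empty entries. */
  static List<String> parseClassNames(String configValue) {
    List<String> names = new ArrayList<>();
    if (configValue == null) {
      return names;
    }
    for (String token : configValue.split(",")) {
      String name = token.trim();
      if (!name.isEmpty()) {
        names.add(name);
      }
    }
    return names;
  }

  /** Instantiates each configured class via its no-arg constructor and starts it. */
  static List<BackgroundService> startAll(String configValue)
      throws ReflectiveOperationException {
    List<BackgroundService> started = new ArrayList<>();
    for (String className : parseClassNames(configValue)) {
      Class<? extends BackgroundService> cls =
          Class.forName(className).asSubclass(BackgroundService.class);
      BackgroundService service = cls.getDeclaredConstructor().newInstance();
      service.start();
      started.add(service);
    }
    return started;
  }
}
```

Using {{asSubclass}} means a misconfigured class name that does not implement the interface fails fast with a {{ClassCastException}} at start-up rather than later.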




Review Request 43073: HIVE-12978: Hive Metastore should have a config for starting background thread services

2016-02-01 Thread Anthony Hsu


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/43073/
---

Review request for hive, Carl Steinbach and Ratandeep Ratti.


Bugs: HIVE-12978
https://issues.apache.org/jira/browse/HIVE-12978


Repository: hive-git


Description
---

Added a new property `hive.metastore.thread.services` for configuring 
background metastore threads.


Diffs
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 
6678de6c488e838b82caf62186ff6518295b7e98 
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
dde253a9c7a19527620f9d516265971529fa838d 

Diff: https://reviews.apache.org/r/43073/diff/


Testing
---

Tested manually.

Wrote an 
[ExampleMetaStoreThreadService.java](https://github.com/erwa/test/blob/master/metastore-thread-service-example-hive-2.x/src/main/java/ExampleMetaStoreThreadService.java).
 Configured the following in my hive-site.xml:
```
<property>
  <name>hive.metastore.thread.services</name>
  <value>ExampleMetaStoreThreadService</value>
</property>
```

When starting the Hive Metastore, I saw the following log output:
```
16/02/01 15:09:54 [Thread-3]: INFO metastore.HiveMetaStore: Starting background 
metastore service ExampleMetaStoreThreadService
16/02/01 15:09:54 [Thread-3]: INFO metastore.HiveMetaStore: Starting metastore 
thread of type ExampleMetaStoreThreadService
16/02/01 15:09:54 [Thread-3]: INFO ExampleMetaStoreThreadService: Setting 
HiveConf in ExampleMetaStoreThreadService
16/02/01 15:09:54 [Thread-3]: INFO ExampleMetaStoreThreadService: Setting 
thread id in ExampleMetaStoreThreadService
16/02/01 15:09:54 [Thread-3]: INFO ExampleMetaStoreThreadService: Initing 
ExampleMetaStoreThreadService
16/02/01 15:09:54 [Thread-3]: INFO ExampleMetaStoreThreadService: Starting 
ExampleMetaStoreThreadService
```


Thanks,

Anthony Hsu

Re: Review Request 38663: HIVE-11878: ClassNotFoundException can possibly occur if multiple jars are registered one at a time in Hive

2015-12-10 Thread Anthony Hsu


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38663/#review109782
---

Ship it!


Revision looks good to me.


itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/security/authorization/plugin/TestHiveAuthorizerShowFilters.java
 (line 92)
<https://reviews.apache.org/r/38663/#comment169418>

Nit: trailing whitespace


- Anthony Hsu


On December 9, 2015, 8:33 a.m., Ratandeep Ratti wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/38663/
> ---
> 
> (Updated December 9, 2015, 8:33 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Bugs: HIVE-11878
> https://issues.apache.org/jira/browse/HIVE-11878
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-11878: ClassNotFoundException can possibly occur if multiple jars are 
> registered one at a time in Hive
> 
> 
> Diffs
> -
> 
>   conf/ivysettings.xml bda842a 
>   itests/custom-udfs/pom.xml PRE-CREATION 
>   itests/custom-udfs/udf-classloader-udf1/pom.xml PRE-CREATION 
>   
> itests/custom-udfs/udf-classloader-udf1/src/main/java/hive/it/custom/udfs/UDF1.java
>  PRE-CREATION 
>   itests/custom-udfs/udf-classloader-udf2/pom.xml PRE-CREATION 
>   
> itests/custom-udfs/udf-classloader-udf2/src/main/java/hive/it/custom/udfs/UDF2.java
>  PRE-CREATION 
>   itests/custom-udfs/udf-classloader-util/pom.xml PRE-CREATION 
>   
> itests/custom-udfs/udf-classloader-util/src/main/java/hive/it/custom/udfs/Util.java
>  PRE-CREATION 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/security/authorization/plugin/TestHiveAuthorizerShowFilters.java
>  0c03a00 
>   itests/pom.xml 5d8249f 
>   itests/qtest/pom.xml 8f6807a 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/UDFClassLoader.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java c01994f 
>   ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 5c69fb6 
>   ql/src/test/queries/clientpositive/udf_classloader.q PRE-CREATION 
>   
> ql/src/test/queries/clientpositive/udf_classloader_dynamic_dependency_resolution.q
>  PRE-CREATION 
>   ql/src/test/results/clientpositive/udf_classloader.q.out PRE-CREATION 
>   
> ql/src/test/results/clientpositive/udf_classloader_dynamic_dependency_resolution.q.out
>  PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/38663/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Ratandeep Ratti
> 
>

Re: Review Request 38663: HIVE-11878: ClassNotFoundException can possibly occur if multiple jars are registered one at a time in Hive

2015-11-18 Thread Anthony Hsu


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38663/#review107063
---

Ship it!


Revision looks good to me.


ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java (lines 369 - 
370)
<https://reviews.apache.org/r/38663/#comment165963>

You could also use `new String[0]`.


- Anthony Hsu


On November 18, 2015, 5:21 a.m., Ratandeep Ratti wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/38663/
> ---
> 
> (Updated November 18, 2015, 5:21 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Bugs: HIVE-11878
> https://issues.apache.org/jira/browse/HIVE-11878
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-11878: ClassNotFoundException can possibly occur if multiple jars are 
> registered one at a time in Hive
> 
> 
> Diffs
> -
> 
>   conf/ivysettings.xml bda842a89bb07710fdcd7180a00833a7388ada8f 
>   itests/custom-udfs/pom.xml PRE-CREATION 
>   itests/custom-udfs/udf-classloader-udf1/pom.xml PRE-CREATION 
>   
> itests/custom-udfs/udf-classloader-udf1/src/main/java/hive/it/custom/udfs/UDF1.java
>  PRE-CREATION 
>   itests/custom-udfs/udf-classloader-udf2/pom.xml PRE-CREATION 
>   
> itests/custom-udfs/udf-classloader-udf2/src/main/java/hive/it/custom/udfs/UDF2.java
>  PRE-CREATION 
>   itests/custom-udfs/udf-classloader-util/pom.xml PRE-CREATION 
>   
> itests/custom-udfs/udf-classloader-util/src/main/java/hive/it/custom/udfs/Util.java
>  PRE-CREATION 
>   itests/pom.xml 0686f1fd58c2be26b2ee645c4e244159aec565e5 
>   itests/qtest/pom.xml 8db6fb04d0a5d4600bc23543a0215d31c1cd0648 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/UDFClassLoader.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 
> de2eb984159526048e8dacf71d3ff8b0647394a3 
>   ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 
> ff875df98e1dd64a8af3ad22f4b38dbc1d6a1923 
>   ql/src/test/queries/clientpositive/udf_classloader.q PRE-CREATION 
>   
> ql/src/test/queries/clientpositive/udf_classloader_dynamic_dependency_resolution.q
>  PRE-CREATION 
>   ql/src/test/results/clientpositive/udf_classloader.q.out PRE-CREATION 
>   
> ql/src/test/results/clientpositive/udf_classloader_dynamic_dependency_resolution.q.out
>  PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/38663/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Ratandeep Ratti
> 
>

[jira] [Created] (HIVE-11951) DESCRIBE DATABASE EXTENDED does not show DBPROPERTIES

2015-09-24 Thread Anthony Hsu (JIRA)

Anthony Hsu created HIVE-11951:
--

 Summary: DESCRIBE DATABASE EXTENDED does not show DBPROPERTIES
 Key: HIVE-11951
 URL: https://issues.apache.org/jira/browse/HIVE-11951
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.1
Reporter: Anthony Hsu


Using Hive 0.13.1, I do not see database properties when running {{DESCRIBE 
DATABASE EXTENDED}}. To reproduce:
{code}
create database test with dbproperties('foo'='bar');
desc database extended test;
{code}

The output I see is
{code}
> desc database extended test;
OK
test    hdfs://:/path/to/test.db    ahsu
Time taken: 0.019 seconds, Fetched: 1 row(s)
{code}
I do not see the {{foo=bar}} property.




Re: Review Request 38663: HIVE-11878: ClassNotFoundException can possibly occur if multiple jars are registered one at a time in Hive

2015-09-23 Thread Anthony Hsu


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38663/#review100274
---



contrib/src/java/org/apache/hadoop/hive/contrib/classloader/UDF2.java (line 32)
<https://reviews.apache.org/r/38663/#comment157424>

Should UDF1 be replaced with UDF2?


- Anthony Hsu


On September 23, 2015, 5:38 a.m., Ratandeep Ratti wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/38663/
> ---
> 
> (Updated September 23, 2015, 5:38 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Bugs: HIVE-11878
> https://issues.apache.org/jira/browse/HIVE-11878
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-11878: ClassNotFoundException can possibly occur if multiple jars are 
> registered one at a time in Hive
> 
> 
> Diffs
> -
> 
>   contrib/src/java/org/apache/hadoop/hive/contrib/classloader/ClassA.java 
> PRE-CREATION 
>   contrib/src/java/org/apache/hadoop/hive/contrib/classloader/UDF1.java 
> PRE-CREATION 
>   contrib/src/java/org/apache/hadoop/hive/contrib/classloader/UDF2.java 
> PRE-CREATION 
>   itests/pom.xml acce7131948edd5aeab34af6879d781daa12ba30 
>   itests/qtest/pom.xml 0588233b250f7c78f594bb36554a80990e907550 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/UDFClassLoader.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 
> ca863019f3347c94852dcad2a21c43758aed30a7 
>   ql/src/test/queries/clientpositive/test_classloader.q PRE-CREATION 
>   ql/src/test/results/clientpositive/test_classloader.q.out PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/38663/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Ratandeep Ratti
> 
>

[jira] [Commented] (HIVE-9022) When creating external tables, Hive needs to verify whether the user has read permissions to the data

2015-02-20 Thread Anthony Hsu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-9022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329317#comment-14329317
 ] 

Anthony Hsu commented on HIVE-9022:
---

Could you please upload this patch to the [Apache Review 
Board|https://reviews.apache.org]? It makes it easier to review and add 
comments. One suggestion is we should add a test case for this.

 When creating external tables, Hive needs to verify whether the user has read 
 permissions to the data
 -

 Key: HIVE-9022
 URL: https://issues.apache.org/jira/browse/HIVE-9022
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.1
Reporter: Anant Nag
  Labels: patch
 Attachments: createExternal.patch


 Hive doesn't verify whether user has read permissions on the data before 
 creating external table referring to the data. This needs to be fixed. 




[jira] [Commented] (HIVE-9021) Hive should not allow any user to create tables in other hive DB's that user doesn't own

2015-02-20 Thread Anthony Hsu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-9021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329339#comment-14329339
 ] 

Anthony Hsu commented on HIVE-9021:
---

I don't think this feature is necessary. Hive has a [Storage-Based 
Authorization 
Model|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Authorization#LanguageManualAuthorization-1StorageBasedAuthorizationintheMetastoreServer]
 that uses HDFS permissions for authorization. If a user does not want other 
users to be able to create tables in his database, he should set the 
permissions for his database's directory on HDFS accordingly (such as to 
rwxr-xr-x).

 Hive should not allow any user to create tables in other hive DB's that user 
 doesn't own
 

 Key: HIVE-9021
 URL: https://issues.apache.org/jira/browse/HIVE-9021
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.1
Reporter: Anant Nag
  Labels: patch
 Attachments: db.patch


 Hive allows users to create tables in other users' databases. This should not 
 be allowed. 




[jira] [Commented] (HIVE-9020) When dropping external tables, Hive should not verify whether user has access to the data.

2015-02-20 Thread Anthony Hsu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-9020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329397#comment-14329397
 ] 

Anthony Hsu commented on HIVE-9020:
---

Patch looks fine to me, apart from some formatting issues (indentation and 
spaces around {{}}). I agree with Thejas that we should add a unit test for 
this.

 When dropping external tables, Hive should not verify whether user has access 
 to the data. 
 ---

 Key: HIVE-9020
 URL: https://issues.apache.org/jira/browse/HIVE-9020
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.1
Reporter: Anant Nag
 Attachments: dropExternal.patch


 When dropping tables, Hive verifies whether the user has access to the data 
 on HDFS, and the drop fails if the user doesn't have access. This makes sense 
 for internal tables, since their data has to be deleted when they are dropped, 
 but for external tables, Hive should not check for data access. 




[jira] [Commented] (HIVE-3779) An empty value to hive.logquery.location can't disable the creation of hive history log files

2014-11-06 Thread Anthony Hsu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-3779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14200521#comment-14200521
 ] 

Anthony Hsu commented on HIVE-3779:
---

In case you're still using an older version of Hive that doesn't let you 
disable the history log files, one workaround that you can use is to run
{code}
!rm -r /path/to/hive.querylog.location;
{code}
as your first shell command before running your queries.

 An empty value to hive.logquery.location can't disable the creation of hive 
 history log files
 -

 Key: HIVE-3779
 URL: https://issues.apache.org/jira/browse/HIVE-3779
 Project: Hive
  Issue Type: Bug
  Components: Documentation
Affects Versions: 0.9.0
Reporter: Bing Li
Priority: Minor

 In AdminManual Configuration 
 (https://cwiki.apache.org/Hive/adminmanual-configuration.html), the 
 description of hive.querylog.location says that if the variable is set to an 
 empty string, the structured log will not be created.
 But this does not work with the following setting:
 <property>
   <name>hive.querylog.location</name>
   <value></value>
 </property>
 It seems that an empty value cannot be read from 
 HiveConf.ConfVars.HIVEHISTORYFILELOC; the default value is returned instead.




[jira] [Commented] (HIVE-8560) SerDes that do not inherit AbstractSerDe do not get table properties during initialize()

2014-10-25 Thread Anthony Hsu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-8560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14184000#comment-14184000
 ] 

Anthony Hsu commented on HIVE-8560:
---

I'm late to the party, but change looks good to me, too.  This won't affect the 
behavior of AbstractSerDes like AvroSerDes, so I'm cool with it :-).

 SerDes that do not inherit AbstractSerDe do not get table properties during 
 initialize()
 

 Key: HIVE-8560
 URL: https://issues.apache.org/jira/browse/HIVE-8560
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Reporter: Jason Dere
Assignee: Jason Dere
 Fix For: 0.14.0

 Attachments: HIVE-8560.1.patch


 Looks like this may have been introduced during HIVE-6835.  During 
 initialize(), 3rd party SerDes which do not inherit AbstractSerDe end up 
 getting a Properties object created by 
 SerDeUtils.createOverlayedProperties().  This properties object receives the 
 table properties as defaults.  So looking up a key by name will yield the 
 default value, but a call like getKeys() will not show any of the table 
 properties.
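
 The behavior described above can be reproduced with plain java.util.Properties. 
 This is only an illustrative sketch: the overlay() helper and the property 
 names are made up, and it is not the body of 
 SerDeUtils.createOverlayedProperties().

```java
import java.util.Properties;

/** Hypothetical demo: values passed as Properties defaults are hidden from key enumeration. */
final class OverlayedPropertiesDemo {
  /** Builds a Properties whose defaults are the table properties (assumed overlay shape). */
  static Properties overlay(Properties tableProps, Properties partProps) {
    Properties overlaid = new Properties(tableProps); // tableProps become defaults
    overlaid.putAll(partProps);                       // partProps become real entries
    return overlaid;
  }

  public static void main(String[] args) {
    Properties table = new Properties();
    table.setProperty("columns", "a,b");
    Properties part = new Properties();
    part.setProperty("partition.key", "y");

    Properties p = overlay(table, part);
    System.out.println(p.getProperty("columns"));   // a,b (resolved through defaults)
    System.out.println(p.containsKey("columns"));   // false (defaults are not Hashtable entries)
    System.out.println(p.keySet());                 // [partition.key]
    System.out.println(p.stringPropertyNames().contains("columns")); // true
  }
}
```

 Code that enumerates keys through the Hashtable view (keys(), keySet(), 
 entrySet()) misses the defaults, while getProperty() and 
 stringPropertyNames() do consult them.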




[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-25 Thread Anthony Hsu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13981278#comment-13981278
 ] 

Anthony Hsu commented on HIVE-6835:
---

I will do some local testing soon and let you know.

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch, 
 HIVE-6835.4.patch, HIVE-6835.5.patch


 To reproduce:
 {code}
 create table testarray (a array<string>);
 load data local inpath '/home/ahsu/test/array.txt' into table testarray;
 # create partitioned Avro table with one array column
 create table avroarray partitioned by (y string) row format serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties 
 ('avro.schema.literal'='{"namespace":"test","name":"avroarray","type": 
 "record", "fields": [ { "name":"a", "type":{"type":"array","items":"string"} 
 } ] }')  STORED as INPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'  OUTPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';
 insert into table avroarray partition(y=1) select * from testarray;
 # add an int column with a default value of 0
 alter table avroarray set serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with 
 serdeproperties('avro.schema.literal'='{"namespace":"test","name":"avroarray","type":
  "record", "fields": [ {"name":"intfield","type":"int","default":0},{ 
 "name":"a", "type":{"type":"array","items":"string"} } ] }');
 # fails with ClassCastException
 select * from avroarray;
 {code}
 The select * fails with:
 {code}
 Failed with exception java.io.IOException:java.lang.ClassCastException: 
 org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector 
 cannot be cast to 
 org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-25 Thread Anthony Hsu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13981425#comment-13981425
 ] 

Anthony Hsu commented on HIVE-6835:
---

I tried all the failed union_remove TestCliDriver tests locally and they all 
passed.  Looking at some of the previous precommit builds, several of them also 
have the same test failures, so I believe these test failures are unrelated to 
my changes.

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch, 
 HIVE-6835.4.patch, HIVE-6835.5.patch



[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-25 Thread Anthony Hsu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13981431#comment-13981431
 ] 

Anthony Hsu commented on HIVE-6835:
---

BTW, I have been doing all my development and testing against Hadoop 1.2.1 
(-Phadoop-1).

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch, 
 HIVE-6835.4.patch, HIVE-6835.5.patch



[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-25 Thread Anthony Hsu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13981677#comment-13981677
 ] 

Anthony Hsu commented on HIVE-6835:
---

Thanks, [~xuefuz], for all your help and guidance.

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Fix For: 0.14.0

 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch, 
 HIVE-6835.4.patch, HIVE-6835.5.patch



Re: Review Request 20096: HIVE-6835: Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-24 Thread Anthony Hsu

/TestAvroSerde.java a5d494f 
  
serde/src/test/org/apache/hadoop/hive/serde2/binarysortable/TestBinarySortableSerDe.java
 e512f42 
  
serde/src/test/org/apache/hadoop/hive/serde2/columnar/TestLazyBinaryColumnarSerDe.java
 e8639ff 
  serde/src/test/org/apache/hadoop/hive/serde2/lazy/TestLazyArrayMapStruct.java 
714045b 
  serde/src/test/org/apache/hadoop/hive/serde2/lazy/TestLazySimpleSerDe.java 
28eb868 
  
serde/src/test/org/apache/hadoop/hive/serde2/lazybinary/TestLazyBinarySerDe.java
 69c891d 
  
serde/src/test/org/apache/hadoop/hive/serde2/objectinspector/TestCrossMapEqualComparer.java
 a69fcb7 
  
serde/src/test/org/apache/hadoop/hive/serde2/objectinspector/TestSimpleMapEqualComparer.java
 dd9610e 
  service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java 
2a113d5 

Diff: https://reviews.apache.org/r/20096/diff/


Testing
---

Added test cases


Thanks,

Anthony Hsu

[jira] [Updated] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-24 Thread Anthony Hsu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anthony Hsu updated HIVE-6835:
--

Attachment: HIVE-6835.5.patch

Uploaded a new patch addressing [~xuefuz]'s comments.  I removed the 
getOverlayedProperties() from PartitionDesc and added a new 
createOverlayedProperties() method in SerDeUtils.

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch, 
 HIVE-6835.4.patch, HIVE-6835.5.patch



[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-24 Thread Anthony Hsu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13980100#comment-13980100
 ] 

Anthony Hsu commented on HIVE-6835:
---

Also updated [the RB|https://reviews.apache.org/r/20096/].

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch, 
 HIVE-6835.4.patch, HIVE-6835.5.patch



[jira] [Updated] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-24 Thread Anthony Hsu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anthony Hsu updated HIVE-6835:
--

Status: Patch Available  (was: Open)

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch, 
 HIVE-6835.4.patch, HIVE-6835.5.patch



[jira] [Updated] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-23 Thread Anthony Hsu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anthony Hsu updated HIVE-6835:
--

Status: Open  (was: Patch Available)

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch, 
 HIVE-6835.4.patch



[jira] [Updated] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-23 Thread Anthony Hsu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anthony Hsu updated HIVE-6835:
--

Attachment: HIVE-6835.4.patch

Thanks for the suggestions and clarification, [~xuefuz].  I have uploaded a new 
patch (HIVE-6835.4.patch) using the new approach.

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch, 
 HIVE-6835.4.patch



Re: Review Request 20096: HIVE-6835: Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-23 Thread Anthony Hsu

/TestLazyBinaryColumnarSerDe.java
 e8639ff 
  serde/src/test/org/apache/hadoop/hive/serde2/lazy/TestLazyArrayMapStruct.java 
714045b 
  serde/src/test/org/apache/hadoop/hive/serde2/lazy/TestLazySimpleSerDe.java 
28eb868 
  
serde/src/test/org/apache/hadoop/hive/serde2/lazybinary/TestLazyBinarySerDe.java
 69c891d 
  
serde/src/test/org/apache/hadoop/hive/serde2/objectinspector/TestCrossMapEqualComparer.java
 a69fcb7 
  
serde/src/test/org/apache/hadoop/hive/serde2/objectinspector/TestSimpleMapEqualComparer.java
 dd9610e 
  service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java 
2a113d5 

Diff: https://reviews.apache.org/r/20096/diff/


Testing
---

Added test cases


Thanks,

Anthony Hsu

[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-23 Thread Anthony Hsu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13979175#comment-13979175
 ] 

Anthony Hsu commented on HIVE-6835:
---

P.S.: I also updated my Review Board request: 
https://reviews.apache.org/r/20096/

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch, 
 HIVE-6835.4.patch



[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-22 Thread Anthony Hsu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13977337#comment-13977337
 ] 

Anthony Hsu commented on HIVE-6835:
---

I started looking into this alternative and encountered an issue.  Most calls 
to serde.initialize() treat the serde as a Deserializer (interface).  I 
would either have to change the interface (and all of its implementations) 
or cast the Deserializer to an AbstractSerDe (whenever I want to use the new 
initialize() method), neither of which seems like a great solution. So I am 
back to supporting my original "table." prefix approach. Any thoughts on this?

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch



[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-22 Thread Anthony Hsu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13977710#comment-13977710
 ] 

Anthony Hsu commented on HIVE-6835:
---

Yes, this is possible, but I would have to add instanceof AbstractSerDe 
checks and then cast the Deserializer to an AbstractSerDe before I can use the 
new initialize() method.  There are dozens of usages of .initialize(), and 
adding all this type checking/casting code in so many places just for this new 
method doesn't seem very clean to me.

Also, if we add the new initialize() method, what should we do for table-level 
serde initialization?  When dealing with the table, there are no partition 
properties, so are we supposed to pass the table properties for both the 
tblProps and partProps arguments? If we leave partProps null, then the default 
new initialize() method implementation will just pass null to the old 
initialize() method.

There doesn't seem to be a very clean way of adding a new initialize() method 
without creating a lot of redundant boilerplate code and creating confusion 
about which initialize() method to use and what values to pass in.  Given these 
concerns, I feel that prepending "table." might be a cleaner and less confusing 
approach.  What are your thoughts on this?
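
To make the call-site concern concrete, here is a minimal sketch. The interface and classes are simplified, hypothetical stand-ins (no Configuration argument, no SerDeException), not the real Hive types; the point is only the check-and-cast each caller would need and what happens when partProps is null:

```java
import java.util.Properties;

// Simplified stand-in for Hive's Deserializer interface (the real one also takes a Configuration).
interface DeserializerSketch {
    void initialize(Properties props);
}

// Simplified stand-in for AbstractSerDe with the proposed two-properties overload.
abstract class TwoLevelSerDeSketch implements DeserializerSketch {
    // Default: ignore the table properties and fall back to the old method.
    void initialize(Properties tblProps, Properties partProps) {
        initialize(partProps);
    }
}

// A serde that does not override the new overload; it records what it was given.
final class RecordingSerDe extends TwoLevelSerDeSketch {
    Properties seen;
    boolean initialized;

    @Override
    public void initialize(Properties props) {
        seen = props;
        initialized = true;
    }
}

final class CallSiteSketch {
    // The check-and-cast that every call site holding a plain Deserializer would need.
    static void init(DeserializerSketch serde, Properties tblProps, Properties partProps) {
        if (serde instanceof TwoLevelSerDeSketch) {
            ((TwoLevelSerDeSketch) serde).initialize(tblProps, partProps);
        } else {
            serde.initialize(partProps);
        }
    }

    public static void main(String[] args) {
        Properties tblProps = new Properties();
        tblProps.setProperty("avro.schema.literal", "table-level schema");

        RecordingSerDe serde = new RecordingSerDe();
        // Table-level initialization: there are no partition properties to pass.
        init(serde, tblProps, null);
        System.out.println(serde.seen); // prints "null": the default overload forwarded partProps
    }
}
```

In this sketch the table-level call ends up handing null to the old initialize(), which is the ambiguity described above.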

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch



[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-21 Thread Anthony Hsu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976144#comment-13976144
 ] 

Anthony Hsu commented on HIVE-6835:
---

Great, sounds like we're on the same page. I'll implement this new approach and 
upload a new patch soon.

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch



[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-18 Thread Anthony Hsu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13974192#comment-13974192
 ] 

Anthony Hsu commented on HIVE-6835:
---

I'm guessing the schema was specified in the SERDEPROPERTIES to work around 
HIVE-3953.  However, one issue with storing the schema in TBLPROPERTIES instead 
is that for partitioned tables, when you do a {{describe \[extended] 
table_name partition(...);}}, you get
{code}
error_error_error_error_error_error_error   string  from deserializer
cannot_determine_schema                     string  from deserializer
check                                       string  from deserializer
schema                                      string  from deserializer
url                                         string  from deserializer
and                                         string  from deserializer
literal                                     string  from deserializer
{code}
because the AvroSerDe cannot find avro.schema.literal or avro.schema.url.  
If you store the schema in SERDEPROPERTIES, you don't get this issue, since the 
SERDEPROPERTIES get copied to the partition when it is created.

I do think it is useful to make both the table-level properties and the 
partition-level properties available separately to the SerDe when it's doing 
its .initialize().  The SerDe should be able to decide which set of properties 
it wants to use. From this point of view, I think my change is still useful and 
valid.

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch



[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-18 Thread Anthony Hsu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13974342#comment-13974342
 ] 

Anthony Hsu commented on HIVE-6835:
---

If TBLPROPERTIES were copied to the partition, then you still might have the 
problem of the table-level Avro schema and the partition-level Avro schema 
getting out of sync, which might lead to ClassCastExceptions.  The AvroSerDe 
should always use the latest table-level schema, whether it is stored in 
TBLPROPERTIES or SERDEPROPERTIES.

The root of the problem is that if an Avro schema somehow ends up in the 
partition properties, it can get out of sync with the table-level properties.  
The AvroSerDe should always use the table-level schema, and that's why my 
change was to (1) make the table-level properties available to the serde, and 
(2) change the Avro SerDe to use the table-level properties when present.
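
As a rough illustration of points (1) and (2), here is a minimal sketch that assumes the table-level properties are exposed to the serde under a "table." key prefix. The class, the method names, and the exact prefix handling are hypothetical and are not taken from the patch:

```java
import java.util.Properties;

final class TablePrefixSketch {
    static final String TABLE_PREFIX = "table.";            // assumed prefix for table-level keys
    static final String SCHEMA_KEY = "avro.schema.literal";

    // (1) Copy the partition properties and add every table property under the "table." prefix.
    static Properties withTableProps(Properties tblProps, Properties partProps) {
        Properties merged = new Properties();
        merged.putAll(partProps);
        for (String name : tblProps.stringPropertyNames()) {
            merged.setProperty(TABLE_PREFIX + name, tblProps.getProperty(name));
        }
        return merged;
    }

    // (2) Prefer the table-level schema when present, otherwise use the partition-level one.
    static String resolveSchema(Properties props) {
        String tableLevel = props.getProperty(TABLE_PREFIX + SCHEMA_KEY);
        return tableLevel != null ? tableLevel : props.getProperty(SCHEMA_KEY);
    }

    public static void main(String[] args) {
        Properties tblProps = new Properties();
        tblProps.setProperty(SCHEMA_KEY, "new schema");
        Properties partProps = new Properties();
        partProps.setProperty(SCHEMA_KEY, "old schema");

        System.out.println(resolveSchema(withTableProps(tblProps, partProps))); // prints "new schema"
        System.out.println(resolveSchema(partProps));                           // prints "old schema"
    }
}
```

With this shape, a stale partition-level schema is still carried along but no longer wins over the table-level one.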

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch



[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-18 Thread Anthony Hsu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13974670#comment-13974670
 ] 

Anthony Hsu commented on HIVE-6835:
---

[~xuefuz] and [~ashutoshc], just to clarify, is this the alternative solution 
you're proposing?
# Add
{code}
public void initialize(Configuration configuration, Properties tableProperties, 
Properties partitionProperties) throws SerDeException;
{code}
to AbstractSerDe and provide a default implementation that just calls 
{{initialize(configuration, partitionProperties)}}
# Change all calls of {{partitionSerde.initialize(conf, partProps)}} to 
{{partitionSerde.initialize(conf, tblProps, partProps)}}
# Add
{code}
@Override
public void initialize(Configuration configuration, Properties tableProperties, 
Properties partitionProperties) throws SerDeException;
{code}
to AvroSerDe and provide an implementation that just uses the tableProperties

I am okay with taking this approach, though it involves a lot more code changes 
and will change the public AbstractSerDe API.  Let me know what your thoughts 
on this approach are.
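
The three steps above can be sketched roughly as follows. These are simplified, hypothetical stand-ins rather than the real Hive classes: the Configuration argument and SerDeException are left out, and the class names are invented for illustration:

```java
import java.util.Properties;

// Stand-in for AbstractSerDe.
abstract class SketchAbstractSerDe {
    abstract void initialize(Properties props);

    // Step 1: new overload whose default just calls the old method with the partition properties.
    void initialize(Properties tableProperties, Properties partitionProperties) {
        initialize(partitionProperties);
    }
}

// Stand-in for AvroSerDe.
final class SketchAvroSerDe extends SketchAbstractSerDe {
    String schemaLiteral;

    @Override
    void initialize(Properties props) {
        schemaLiteral = props.getProperty("avro.schema.literal");
    }

    // Step 3: the Avro serde overrides the overload and uses only the table properties.
    @Override
    void initialize(Properties tableProperties, Properties partitionProperties) {
        initialize(tableProperties);
    }
}

// A serde that keeps the default behaviour and therefore still sees partition properties.
final class SketchOtherSerDe extends SketchAbstractSerDe {
    String value;

    @Override
    void initialize(Properties props) {
        value = props.getProperty("some.key");
    }
}

final class InitializeOverloadDemo {
    public static void main(String[] args) {
        Properties tblProps = new Properties();
        tblProps.setProperty("avro.schema.literal", "table schema");
        tblProps.setProperty("some.key", "from table");
        Properties partProps = new Properties();
        partProps.setProperty("avro.schema.literal", "partition schema");
        partProps.setProperty("some.key", "from partition");

        // Step 2: call sites pass both sets of properties.
        SketchAvroSerDe avro = new SketchAvroSerDe();
        avro.initialize(tblProps, partProps);
        SketchOtherSerDe other = new SketchOtherSerDe();
        other.initialize(tblProps, partProps);

        System.out.println(avro.schemaLiteral); // prints "table schema"
        System.out.println(other.value);        // prints "from partition"
    }
}
```

Serdes that do not override the new overload keep their current behaviour, which is why the default implementation delegates to the partition properties.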

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch



[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-17 Thread Anthony Hsu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13973279#comment-13973279
 ] 

Anthony Hsu commented on HIVE-6835:
---

The AvroSerDe handles schema evolution as described in 
http://avro.apache.org/docs/current/spec.html#Schema+Resolution.  However, in 
the Hive code, the AvroSerDe needs to always be initialized with the latest 
schema so that ObjectInspectorConverters.getConvertedOI() (in 
FetchOperator:getRecordReader()) will work.  When the AvroSerDe actually reads 
the Avro file, it will then compare the latest schema to the actual schema 
stored in the Avro file and do schema resolution/evolution.
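
For readers unfamiliar with Avro schema resolution, here is a small self-contained sketch of the record rules that matter here: fields are matched by name, a reader field missing from the writer's data takes its default, and writer fields unknown to the reader are ignored. It does not use the Avro library; the class and method names are hypothetical:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SchemaResolutionSketch {
    // A field of the reader (latest) schema: a name plus an optional default (null = no default).
    static final class Field {
        final String name;
        final Object defaultValue;

        Field(String name, Object defaultValue) {
            this.name = name;
            this.defaultValue = defaultValue;
        }
    }

    // Resolve one record written with an older schema against the reader schema.
    static Map<String, Object> resolve(List<Field> readerFields, Map<String, Object> writerRecord) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Field f : readerFields) {
            if (writerRecord.containsKey(f.name)) {
                out.put(f.name, writerRecord.get(f.name));   // matched by name, not position
            } else if (f.defaultValue != null) {
                out.put(f.name, f.defaultValue);             // missing in the file: use the default
            } else {
                throw new IllegalStateException("no value or default for field " + f.name);
            }
        }
        return out;                                          // writer-only fields are dropped
    }

    public static void main(String[] args) {
        // Old file: only the array column "a".
        Map<String, Object> written = new HashMap<>();
        written.put("a", Arrays.asList("x", "y"));

        // Latest schema: intfield (default 0) inserted before "a".
        List<Field> latest = Arrays.asList(new Field("intfield", 0), new Field("a", null));

        System.out.println(resolve(latest, written)); // prints "{intfield=0, a=[x, y]}"
    }
}
```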

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch



[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-17 Thread Anthony Hsu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13973615#comment-13973615
 ] 

Anthony Hsu commented on HIVE-6835:
---

What happens is that Hive tries to build ObjectInspectorConverters from the 
partition schema to the table schema.  If the partition schema is different 
from the table schema, you may get a ClassCastException like the one above.

When you add new columns at the end, this is not a problem because these new 
columns are chopped off.  See ObjectInspectorConverters:StructConverter:
{code}
int minFields = Math.min(inputFields.size(), outputFields.size());
fieldConverters = new ArrayList<Converter>(minFields);
{code}
It's only when you insert new columns at the beginning or in the middle that 
you might run into ClassCastExceptions.

For the AvroSerDe, if it always uses the latest schema (which should be the 
table-level schema), Hive will not get confused when constructing its 
ObjectInspectorConverters.  Then, later, when the AvroSerDe actually goes to 
read the Avro files, it can compare the latest schema with the (possibly old) 
schemas stored in the Avro data files themselves, and do the proper schema 
resolution, omitting fields or substituting default values, following the 
[schema resolution 
rules|http://avro.apache.org/docs/current/spec.html#Schema+Resolution].
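
To illustrate why only columns inserted at the beginning or in the middle cause trouble, here is a minimal sketch that pairs fields by position up to minFields, as in the snippet above. It works on type names instead of real ObjectInspectors, and the class and method names are hypothetical:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class PositionalConverterSketch {
    // Pair input (partition) and output (table) fields by position and report type mismatches.
    static List<String> mismatches(List<String> inputTypes, List<String> outputTypes) {
        int minFields = Math.min(inputTypes.size(), outputTypes.size());
        List<String> problems = new ArrayList<>();
        for (int i = 0; i < minFields; i++) {
            if (!inputTypes.get(i).equals(outputTypes.get(i))) {
                problems.add("field " + i + ": " + inputTypes.get(i) + " -> " + outputTypes.get(i));
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        List<String> partition = Arrays.asList("array<string>");

        // New column appended at the end: the extra column is beyond minFields, so nothing clashes.
        System.out.println(mismatches(partition, Arrays.asList("array<string>", "int")));
        // prints "[]"

        // New column inserted at the beginning: position 0 now pairs a list with a primitive.
        System.out.println(mismatches(partition, Arrays.asList("int", "array<string>")));
        // prints "[field 0: array<string> -> int]"
    }
}
```

The second case corresponds to the reproduction in this issue, where a list inspector ends up being treated as a primitive one.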

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch


 To reproduce:
 {code}
 create table testarray (a array<string>);
 load data local inpath '/home/ahsu/test/array.txt' into table testarray;
 # create partitioned Avro table with one array column
 create table avroarray partitioned by (y string) row format serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties 
 ('avro.schema.literal'='{"namespace":"test","name":"avroarray","type": 
 "record", "fields": [ { "name":"a", "type":{"type":"array","items":"string"} 
 } ] }')  STORED as INPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'  OUTPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';
 insert into table avroarray partition(y=1) select * from testarray;
 # add an int column with a default value of 0
 alter table avroarray set serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with 
 serdeproperties('avro.schema.literal'='{"namespace":"test","name":"avroarray","type":
  "record", "fields": [ {"name":"intfield","type":"int","default":0},{ 
 "name":"a", "type":{"type":"array","items":"string"} } ] }');
 # fails with ClassCastException
 select * from avroarray;
 {code}
 The select * fails with:
 {code}
 Failed with exception java.io.IOException:java.lang.ClassCastException: 
 org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector 
 cannot be cast to 
 org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
 {code}




[jira] [Updated] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-17 Thread Anthony Hsu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anthony Hsu updated HIVE-6835:
--

Status: Patch Available  (was: Open)

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch



[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-17 Thread Anthony Hsu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13973654#comment-13973654
 ] 

Anthony Hsu commented on HIVE-6835:
---

On a side note: If you create an Avro table and store the schema in the 
TBLPROPERTIES -
{code}
CREATE TABLE ... TBLPROPERTIES ('avro.schema.literal'='...');
{code}
- everything works fine with partitions because TBLPROPERTIES are NOT copied 
to the partition, so the partition will end up using the TBLPROPERTIES for 
initializing the Avro SerDe.

It's only when you store the schema in the SERDEPROPERTIES -
{code}
CREATE TABLE ... WITH SERDEPROPERTIES ('avro.schema.literal'='...');
{code}
- that problems arise.  SERDEPROPERTIES DO get copied to the partitions, so if 
you then change the SERDEPROPERTIES stored at the table level, the 
SERDEPROPERTIES in the table and the partitions get out of sync, and this 
sometimes leads to ClassCastExceptions with the AvroSerDe.
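
The difference between the two cases can be modeled with a small sketch.  This 
is only a toy model, not metastore code: PropertyCopySketch, Table, Partition, 
and schema() are hypothetical names, and the lookup order (partition's own 
SERDEPROPERTIES first, then the table's TBLPROPERTIES) is an assumption made 
for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of how the two property kinds reach a partition; not Hive's API.
public class PropertyCopySketch {

    static class Table {
        final Map<String, String> tblProperties = new HashMap<>();
        final Map<String, String> serdeProperties = new HashMap<>();
    }

    static class Partition {
        final Table table;
        final Map<String, String> serdeProperties;

        Partition(Table table) {
            this.table = table;
            // SERDEPROPERTIES are copied when the partition is created.
            this.serdeProperties = new HashMap<>(table.serdeProperties);
        }

        /** Uses the partition's own copy if present, else the live TBLPROPERTIES. */
        String schema() {
            String own = serdeProperties.get("avro.schema.literal");
            return own != null ? own : table.tblProperties.get("avro.schema.literal");
        }
    }

    public static void main(String[] args) {
        Table viaSerde = new Table();
        viaSerde.serdeProperties.put("avro.schema.literal", "v1");
        Partition stale = new Partition(viaSerde);
        viaSerde.serdeProperties.put("avro.schema.literal", "v2"); // alter table later
        System.out.println(stale.schema()); // still the copied schema

        Table viaTbl = new Table();
        viaTbl.tblProperties.put("avro.schema.literal", "v1");
        Partition fresh = new Partition(viaTbl);
        viaTbl.tblProperties.put("avro.schema.literal", "v2"); // alter table later
        System.out.println(fresh.schema()); // follows the table
    }
}
```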

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch



[jira] [Updated] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-16 Thread Anthony Hsu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anthony Hsu updated HIVE-6835:
--

Attachment: (was: HIVE-6835.2.patch)

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch



[jira] [Updated] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-16 Thread Anthony Hsu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anthony Hsu updated HIVE-6835:
--

Attachment: HIVE-6835.2.patch

Reuploading patch version 2 to trigger the tests again.  I ran the tests that 
failed in the last pre-commit build locally, and both passed for me.

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch



Re: Review Request 20096: HIVE-6835: Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-16 Thread Anthony Hsu


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20096/
---

(Updated April 17, 2014, 1:14 a.m.)


Review request for hive.


Changes
---

Addressed Ashutosh's comments in HIVE-6835. Added the constant to serde.thrift 
and used the Thrift compiler to generate all the language-specific bindings.


Repository: hive-git


Description
---

The problem occurs when you store the avro.schema.(literal|url) in the 
SERDEPROPERTIES instead of the TBLPROPERTIES, add a partition, change the 
table's schema, and then try reading from the old partition.

I fixed this problem by passing the table properties to the partition with a 
"table." prefix, and changing the Avro SerDe to always use the table properties 
when available.


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/plan/PartitionDesc.java 43cef5c 
  ql/src/test/queries/clientpositive/avro_partitioned.q 6fe5117 
  ql/src/test/results/clientpositive/avro_partitioned.q.out 644716d 
  serde/if/serde.thrift 31c87ee 
  serde/src/gen/thrift/gen-cpp/serde_constants.h d56c917 
  serde/src/gen/thrift/gen-cpp/serde_constants.cpp 54503e3 
  
serde/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/serde/serdeConstants.java
 515cf25 
  serde/src/gen/thrift/gen-php/org/apache/hadoop/hive/serde/Types.php 837dd11 
  serde/src/gen/thrift/gen-py/org_apache_hadoop_hive_serde/constants.py 8eac87d 
  serde/src/gen/thrift/gen-rb/serde_constants.rb ed86522 
  serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerdeUtils.java 9d58d13 
  serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroSerdeUtils.java 
67d5570 

Diff: https://reviews.apache.org/r/20096/diff/


Testing
---

Added test cases


Thanks,

Anthony Hsu

[jira] [Updated] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-16 Thread Anthony Hsu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anthony Hsu updated HIVE-6835:
--

Attachment: HIVE-6835.3.patch

Thanks for catching this, Ashutosh.  My bad for not noticing I was modifying a 
generated file.  I have updated my [Review Board 
request|https://reviews.apache.org/r/20096/] and also uploaded a new patch.

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch



Re: Review Request 20096: HIVE-6835: Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-14 Thread Anthony Hsu


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20096/
---

(Updated April 14, 2014, 6:49 p.m.)


Review request for hive.


Changes
---

Addressed Carl's comments. Changes:
- Reverted whitespace changes.
- Moved the TABLE_PROP_PREFIX (table.) to serdeConstants.
- Removed code that mutated the Properties passed to the AvroSerDe
- Added/improved comments
- Synced with latest


Repository: hive-git


Description
---

The problem occurs when you store the avro.schema.(literal|url) in the 
SERDEPROPERTIES instead of the TBLPROPERTIES, add a partition, change the 
table's schema, and then try reading from the old partition.

I fixed this problem by passing the table properties to the partition with a 
"table." prefix, and changing the Avro SerDe to always use the table properties 
when available.


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/plan/PartitionDesc.java 43cef5c 
  ql/src/test/queries/clientpositive/avro_partitioned.q 6fe5117 
  ql/src/test/results/clientpositive/avro_partitioned.q.out 644716d 
  
serde/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/serde/serdeConstants.java
 515cf25 
  serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerdeUtils.java 9d58d13 
  serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroSerdeUtils.java 
67d5570 

Diff: https://reviews.apache.org/r/20096/diff/


Testing
---

Added test cases


Thanks,

Anthony Hsu

[jira] [Updated] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-14 Thread Anthony Hsu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anthony Hsu updated HIVE-6835:
--

Attachment: HIVE-6835.2.patch

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch



[jira] [Updated] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-14 Thread Anthony Hsu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anthony Hsu updated HIVE-6835:
--

Status: Patch Available  (was: Open)

Thanks for the very thorough code review, [~cwsteinbach].  I've uploaded a new 
patch that addresses your comments and also updated the Review Board request.

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch



Review Request 20096: HIVE-6835: Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-07 Thread Anthony Hsu


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20096/
---

Review request for hive.


Repository: hive-git


Description
---

The problem occurs when you store the avro.schema.(literal|url) in the 
SERDEPROPERTIES instead of the TBLPROPERTIES, add a partition, change the 
table's schema, and then try reading from the old partition.

I fixed this problem by passing the table properties to the partition with a 
"table." prefix, and changing the Avro SerDe to always use the table properties 
when available.


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/plan/PartitionDesc.java 43cef5c 
  ql/src/test/queries/clientpositive/avro_partitioned.q 068a13c 
  ql/src/test/results/clientpositive/avro_partitioned.q.out 352ec0d 
  serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerdeUtils.java 9d58d13 
  serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroSerdeUtils.java 
67d5570 

Diff: https://reviews.apache.org/r/20096/diff/


Testing
---

Added test cases


Thanks,

Anthony Hsu

[jira] [Updated] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-07 Thread Anthony Hsu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anthony Hsu updated HIVE-6835:
--

Attachment: HIVE-6835.1.patch

Uploaded a patch with a fix.  Review Board link: 
https://reviews.apache.org/r/20096/

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch



[jira] [Assigned] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-07 Thread Anthony Hsu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anthony Hsu reassigned HIVE-6835:
-

Assignee: Anthony Hsu

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch



[jira] [Updated] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-07 Thread Anthony Hsu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anthony Hsu updated HIVE-6835:
--

Assignee: (was: Anthony Hsu)
  Status: Patch Available  (was: Open)

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
 Attachments: HIVE-6835.1.patch



[jira] [Created] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-03 Thread Anthony Hsu (JIRA)

Anthony Hsu created HIVE-6835:
-

 Summary: Reading of partitioned Avro data fails if partition 
schema does not match table schema
 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu


To reproduce:
{code}
create table testarray (a array<string>);

load data local inpath '/home/ahsu/test/array.txt' into table testarray;

# create partitioned Avro table with one array column
create table avroarray (a array<string>) partitioned by (y string) row format 
serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties 
('avro.schema.literal'='{"namespace":"test","name":"avroarray","type": 
"record", "fields": [ { "name":"a", "type":{"type":"array","items":"string"} } 
] }')  STORED as INPUTFORMAT  
'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'  OUTPUTFORMAT  
'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';

insert into table avroarray partition(y=1) select * from testarray;

# add an int column with a default value of 0
alter table avroarray set serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' 
with 
serdeproperties('avro.schema.literal'='{"namespace":"test","name":"avroarray","type":
 "record", "fields": [ {"name":"intfield","type":"int","default":0},{ 
"name":"a", "type":{"type":"array","items":"string"} } ] }');

# fails with ClassCastException
select * from avroarray;
{code}
The select * fails with:
{code}
Failed with exception java.io.IOException:java.lang.ClassCastException: 
org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector 
cannot be cast to 
org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
{code}




[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-03 Thread Anthony Hsu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13958989#comment-13958989
 ] 

Anthony Hsu commented on HIVE-6835:
---

Right now, when AvroSerDe.initialize() is called, the Properties it is passed 
include both table and partition properties, with the partition properties 
*overriding* the table properties.  The AvroSerDe needs the *latest* schema 
(which should be stored in the table properties) for proper initialization and 
to prevent the ClassCastException.  My proposal is to pass both the table and 
partition properties to SerDe.initialize() by prefixing the table property 
keys with "table.", and let the SerDe decide which set of properties to use.
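
A minimal sketch of that proposal, using only java.util.Properties, might look 
like the following.  It is not the patch itself: TablePrefixSketch, merge, 
schemaFor, and the constant names are hypothetical, and only the "table." 
prefix and the avro.schema.literal key come from this issue.

```java
import java.util.Properties;

// Illustration of the "table." prefix idea; names here are hypothetical.
public class TablePrefixSketch {

    static final String TABLE_PREFIX = "table.";
    static final String SCHEMA_LITERAL = "avro.schema.literal";

    /** Keeps partition properties as-is and adds table properties under the prefix. */
    static Properties merge(Properties tableProps, Properties partitionProps) {
        Properties merged = new Properties();
        for (String key : partitionProps.stringPropertyNames()) {
            merged.setProperty(key, partitionProps.getProperty(key));
        }
        for (String key : tableProps.stringPropertyNames()) {
            merged.setProperty(TABLE_PREFIX + key, tableProps.getProperty(key));
        }
        return merged;
    }

    /** Prefers the table-level schema when present, else the partition's. */
    static String schemaFor(Properties merged) {
        String tableSchema = merged.getProperty(TABLE_PREFIX + SCHEMA_LITERAL);
        return tableSchema != null ? tableSchema : merged.getProperty(SCHEMA_LITERAL);
    }

    public static void main(String[] args) {
        Properties table = new Properties();
        table.setProperty(SCHEMA_LITERAL, "latest-schema");
        Properties partition = new Properties();
        partition.setProperty(SCHEMA_LITERAL, "old-schema");

        // The SerDe sees both values and can pick the table-level one.
        System.out.println(schemaFor(merge(table, partition)));
    }
}
```

Because the partition's own keys are left untouched, a SerDe that does not know 
about the prefix would keep its current behavior under this sketch.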

BTW, here's the full stack trace when you do the select *:
{code}
Failed with exception java.io.IOException:java.lang.ClassCastException: org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
14/04/03 10:11:02 ERROR CliDriver: Failed with exception java.io.IOException:java.lang.ClassCastException: org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
java.io.IOException: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
    at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:551)
    at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:489)
    at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:136)
    at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1471)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:272)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:217)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:414)
    at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:782)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:676)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:615)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
    at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters.getConverter(ObjectInspectorConverters.java:148)
    at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters$StructConverter.init(ObjectInspectorConverters.java:304)
    at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters.getConverter(ObjectInspectorConverters.java:150)
    at org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:407)
    at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:515)
    ... 14 more
{code}

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu

 To reproduce:
 {code}
 create table testarray (a array<string>);
 load data local inpath '/home/ahsu/test/array.txt' into table testarray;
 # create partitioned Avro table with one array column
 create table avroarray (a array<string>) partitioned by (y string) row format 
 serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties 
 ('avro.schema.literal'='{"namespace":"test","name":"avroarray","type": 
 "record", "fields": [ { "name":"a", "type":{"type":"array","items":"string"} 
 } ] }')  STORED as INPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'  OUTPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';
 insert into table avroarray partition(y=1) select * from testarray;
 # add an int column with a default value of 0
 alter table avroarray set serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with 
 serdeproperties('avro.schema.literal'='{"namespace":"test","name":"avroarray","type":
  "record", "fields": [ {"name":"intfield

[jira] [Updated] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-03 Thread Anthony Hsu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anthony Hsu updated HIVE-6835:
--

Description: 
To reproduce:
{code}
create table testarray (a array<string>);

load data local inpath '/home/ahsu/test/array.txt' into table testarray;

# create partitioned Avro table with one array column
create table avroarray partitioned by (y string) row format serde 
'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties 
('avro.schema.literal'='{"namespace":"test","name":"avroarray","type": 
"record", "fields": [ { "name":"a", "type":{"type":"array","items":"string"} } 
] }')  STORED as INPUTFORMAT  
'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'  OUTPUTFORMAT  
'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';

insert into table avroarray partition(y=1) select * from testarray;

# add an int column with a default value of 0
alter table avroarray set serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' 
with 
serdeproperties('avro.schema.literal'='{"namespace":"test","name":"avroarray","type":
 "record", "fields": [ {"name":"intfield","type":"int","default":0},{
"name":"a", "type":{"type":"array","items":"string"} } ] }');

# fails with ClassCastException
select * from avroarray;
{code}
The select * fails with:
{code}
Failed with exception java.io.IOException:java.lang.ClassCastException: 
org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector 
cannot be cast to 
org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
{code}

  was:
To reproduce:
{code}
create table testarray (a array<string>);

load data local inpath '/home/ahsu/test/array.txt' into table testarray;

# create partitioned Avro table with one array column
create table avroarray (a array<string>) partitioned by (y string) row format 
serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties 
('avro.schema.literal'='{"namespace":"test","name":"avroarray","type": 
"record", "fields": [ { "name":"a", "type":{"type":"array","items":"string"} } 
] }')  STORED as INPUTFORMAT  
'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'  OUTPUTFORMAT  
'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';

insert into table avroarray partition(y=1) select * from testarray;

# add an int column with a default value of 0
alter table avroarray set serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' 
with 
serdeproperties('avro.schema.literal'='{"namespace":"test","name":"avroarray","type":
 "record", "fields": [ {"name":"intfield","type":"int","default":0},{
"name":"a", "type":{"type":"array","items":"string"} } ] }');

# fails with ClassCastException
select * from avroarray;
{code}
The select * fails with:
{code}
Failed with exception java.io.IOException:java.lang.ClassCastException: 
org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector 
cannot be cast to 
org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
{code}



[jira] [Commented] (HIVE-6570) Hive variable substitution does not work with the source command

2014-03-31 Thread Anthony Hsu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-6570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13955374#comment-13955374
 ] 

Anthony Hsu commented on HIVE-6570:
---

[~leftylev] - Thanks for the instructions!
[~xuefuz] - Thanks for committing this!

 Hive variable substitution does not work with the source command
 --

 Key: HIVE-6570
 URL: https://issues.apache.org/jira/browse/HIVE-6570
 Project: Hive
  Issue Type: Bug
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Fix For: 0.14.0

 Attachments: HIVE-6570.1.patch


 The following does not work:
 {code}
 source ${hivevar:test-dir}/test.q;
 {code}




[jira] [Commented] (HIVE-6570) Hive variable substitution does not work with the source command

2014-03-30 Thread Anthony Hsu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-6570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13954616#comment-13954616
 ] 

Anthony Hsu commented on HIVE-6570:
---

It would probably be nice to add an example to the Variable Substitution page 
that uses variable substitution with the source command.

On a side note, how does one get edit privileges for the wiki?


[jira] [Commented] (HIVE-6570) Hive variable substitution does not work with the source command

2014-03-29 Thread Anthony Hsu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-6570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13954321#comment-13954321
 ] 

Anthony Hsu commented on HIVE-6570:
---

Thanks.  Could one of you guys commit the patch for me please?


[jira] [Commented] (HIVE-6570) Hive variable substitution does not work with the source command

2014-03-28 Thread Anthony Hsu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-6570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13951347#comment-13951347
 ] 

Anthony Hsu commented on HIVE-6570:
---

What concerns does [~appodictic] have?


[jira] [Commented] (HIVE-6570) Hive variable substitution does not work with the source command

2014-03-21 Thread Anthony Hsu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-6570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13943740#comment-13943740
 ] 

Anthony Hsu commented on HIVE-6570:
---

Ping


[jira] [Updated] (HIVE-6570) Hive variable substitution does not work with the source command

2014-03-17 Thread Anthony Hsu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-6570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anthony Hsu updated HIVE-6570:
--

Release Note: 
This patch adds Hive variable substitution support to the source command.  
For example, you will now be able to use a statement such as:
source ${hivevar:test-dir}/test.q; 

Added a Release Note explaining the changes in this patch.
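
As a rough illustration of what the substitution does to a source statement, here is a small standalone sketch. It is not the code from the patch: the class name HivevarSubstitutionSketch, the regular expression, and the choice to leave unknown variables untouched are all assumptions made for this example.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical illustration; not Hive's actual substitution code. */
final class HivevarSubstitutionSketch {
    // Matches ${hivevar:NAME}, where NAME is any run of characters other than '}'.
    private static final Pattern HIVEVAR = Pattern.compile("\\$\\{hivevar:([^}]+)\\}");

    /** Replaces each ${hivevar:NAME} with its value; unknown names are left as-is (an assumption). */
    static String substitute(String command, Map<String, String> hivevars) {
        Matcher m = HIVEVAR.matcher(command);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = hivevars.get(m.group(1));
            String replacement = value != null ? value : m.group();
            // quoteReplacement keeps '$' and '\' in the value from being read as group references.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

With a made-up variable test-dir set to /tmp/q, calling substitute on "source ${hivevar:test-dir}/test.q;" would yield "source /tmp/q/test.q;", which is the path the source command then needs to open.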

