[jira] [Created] (HIVE-18802) Incorrect results when referencing same Accumulo table multiple times in one query
Anthony Hsu created HIVE-18802:
--

Summary: Incorrect results when referencing same Accumulo table multiple times in one query
Key: HIVE-18802
URL: https://issues.apache.org/jira/browse/HIVE-18802
Project: Hive
Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Anthony Hsu

While investigating HIVE-18695, I noticed incorrect results returned by the following Accumulo query:

{code:java}
DROP TABLE accumulo_test;
CREATE TABLE accumulo_test(key int, value int)
STORED BY 'org.apache.hadoop.hive.accumulo.AccumuloStorageHandler'
WITH SERDEPROPERTIES ("accumulo.columns.mapping" = ":rowID,cf:string")
TBLPROPERTIES ("accumulo.table.name" = "accumulo_table_0");
INSERT OVERWRITE TABLE accumulo_test VALUES (0,0), (1,1), (2,2), (3,3);
SELECT * from accumulo_test where key == 1 union all select * from accumulo_test where key == 2;
{code}

The expected output is

{code:java}
1 1
2 2
{code}

but the actual output is

{code:java}
1 0
1 1
1 2
1 3
2 0
2 1
2 2
2 3
{code}

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: Review Request 54341: HIVE-15353: Metastore throws NPE if StorageDescriptor.cols is null
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/54341/ --- (Updated Feb. 6, 2018, 3:01 a.m.) Review request for hive, Carl Steinbach and Ratandeep Ratti. Changes --- Rebased on HEAD. Bugs: HIVE-15353 https://issues.apache.org/jira/browse/HIVE-15353 Repository: hive-git Description (updated) --- Updated HiveAlterHandler.updateOrGetPartitionColumnStats to handle null `oldCols`. Diffs (updated) - standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java 89354a2d34249903a9ff13c4ed913a68de93057e Diff: https://reviews.apache.org/r/54341/diff/4/ Changes: https://reviews.apache.org/r/54341/diff/3-4/ Testing --- After making these changes, I no longer encounter NullPointerExceptions when setting cols to null in create_table, alter_table, and alter_partition calls. Thanks, Anthony Hsu
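The null handling described in this update can be illustrated with a minimal sketch. This is not the HiveAlterHandler code: the class `NullSafeCols` and the method `removedColumns` are hypothetical names, and columns are reduced to plain strings. It only shows the general pattern of treating a null column list as empty before comparing old and new columns, so that no dereference of null can occur.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; not the actual HiveAlterHandler logic.
class NullSafeCols {
    // Returns the names present in oldCols but missing from newCols.
    // A null list is treated as an empty list instead of being dereferenced.
    static List<String> removedColumns(List<String> oldCols, List<String> newCols) {
        List<String> safeOld = (oldCols == null) ? Collections.<String>emptyList() : oldCols;
        List<String> safeNew = (newCols == null) ? Collections.<String>emptyList() : newCols;
        List<String> removed = new ArrayList<>();
        for (String col : safeOld) {
            if (!safeNew.contains(col)) {
                removed.add(col);
            }
        }
        return removed;
    }
}
```

With this shape, a call such as `NullSafeCols.removedColumns(null, newCols)` returns an empty list rather than throwing.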
Re: Review Request 62321: HIVE-17530: ClassCastException when converting uniontype
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/62321/ --- (Updated Sept. 15, 2017, 1:52 a.m.) Review request for hive, Carl Steinbach and Ratandeep Ratti. Changes --- * Fixed test TestObjectInspectorConverters.testObjectInspectorConverters() * Renamed SettableUnionObjectInspector.addField() to setFieldAndTag(). Bugs: HIVE-17530 https://issues.apache.org/jira/browse/HIVE-17530 Repository: hive-git Description --- Previously, StandardUnionObjectInspector was creating an ArrayList instead of a StandardUnion, causing the exception ``` java.lang.ClassCastException: java.util.ArrayList cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.UnionObject ``` This patch fixes this. Diffs (updated) - ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorDeserializeRow.java 2ad06fc12869e74e14aae7b7a36685482c4a1ade ql/src/test/queries/clientpositive/orc_avro_partition_uniontype.q PRE-CREATION ql/src/test/results/clientpositive/orc_avro_partition_uniontype.q.out PRE-CREATION serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorConverters.java 7921de8d9c4a56af715de5498954794aaba32fff serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/SettableUnionObjectInspector.java 564d8d60451d9756eca1f1edcc84248e4f559828 serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/StandardUnionObjectInspector.java 7b2868233f127899c7dca07d4f899b24ae2cbc1b serde/src/test/org/apache/hadoop/hive/serde2/objectinspector/TestObjectInspectorConverters.java 2e1bb22cea715501749ee5e169ce34f7dc789e64 Diff: https://reviews.apache.org/r/62321/diff/2/ Changes: https://reviews.apache.org/r/62321/diff/1-2/ Testing --- Added qtest. Thanks, Anthony Hsu
Review Request 62321: HIVE-17530: ClassCastException when converting uniontype
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/62321/ --- Review request for hive, Carl Steinbach and Ratandeep Ratti. Bugs: HIVE-17530 https://issues.apache.org/jira/browse/HIVE-17530 Repository: hive-git Description --- Previously, StandardUnionObjectInspector was creating an ArrayList instead of a StandardUnion, causing the exception ``` java.lang.ClassCastException: java.util.ArrayList cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.UnionObject ``` This patch fixes this. Diffs - ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorDeserializeRow.java 2ad06fc12869e74e14aae7b7a36685482c4a1ade ql/src/test/queries/clientpositive/orc_avro_partition_uniontype.q PRE-CREATION ql/src/test/results/clientpositive/orc_avro_partition_uniontype.q.out PRE-CREATION serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorConverters.java 7921de8d9c4a56af715de5498954794aaba32fff serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/SettableUnionObjectInspector.java 564d8d60451d9756eca1f1edcc84248e4f559828 serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/StandardUnionObjectInspector.java 7b2868233f127899c7dca07d4f899b24ae2cbc1b Diff: https://reviews.apache.org/r/62321/diff/1/ Testing --- Added qtest. Thanks, Anthony Hsu
[jira] [Created] (HIVE-17530) ClassCastException when converting uniontype
Anthony Hsu created HIVE-17530:
--

Summary: ClassCastException when converting uniontype
Key: HIVE-17530
URL: https://issues.apache.org/jira/browse/HIVE-17530
Project: Hive
Issue Type: Bug
Affects Versions: 1.1.0, 3.0.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu

To repro:

{noformat}
SET hive.exec.schema.evolution = false;
CREATE TABLE avro_orc_partitioned_uniontype (a uniontype<boolean, string>) PARTITIONED BY (b int) STORED AS ORC;
INSERT INTO avro_orc_partitioned_uniontype PARTITION (b=1) SELECT create_union(1, true, value) FROM src LIMIT 5;
ALTER TABLE avro_orc_partitioned_uniontype SET FILEFORMAT AVRO;
SELECT * FROM avro_orc_partitioned_uniontype;
{noformat}

The exception you get is:

{code}
java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassCastException: java.util.ArrayList cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.UnionObject
{code}

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
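The exception reported here says that an ArrayList was cast to a union type. The following is a minimal sketch of how that class of failure arises in general, when a factory produces one container type and the reader casts to another. All names (`UnionCastSketch`, `TaggedUnion`, `SimpleUnion`) are hypothetical stand-ins and are not Hive's `UnionObject` or `StandardUnion` classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins; these are not Hive classes.
class UnionCastSketch {
    interface TaggedUnion {
        byte getTag();
        Object getObject();
    }

    static final class SimpleUnion implements TaggedUnion {
        private final byte tag;
        private final Object value;

        SimpleUnion(byte tag, Object value) {
            this.tag = tag;
            this.value = value;
        }

        public byte getTag() { return tag; }
        public Object getObject() { return value; }
    }

    // Mimics a factory that builds the wrong container type for a union value.
    static Object createWrong() {
        List<Object> list = new ArrayList<>();
        list.add((byte) 1);
        list.add("value");
        return list;
    }

    // Builds the type that the reading side expects.
    static Object createRight() {
        return new SimpleUnion((byte) 1, "value");
    }

    // Returns true if the object can be read as a TaggedUnion,
    // false if the cast fails with a ClassCastException.
    static boolean canReadAsUnion(Object o) {
        try {
            TaggedUnion u = (TaggedUnion) o;
            return u.getObject() != null;
        } catch (ClassCastException e) {
            return false;
        }
    }
}
```

In this sketch the cast compiles in both cases because the static type is `Object`; the mismatch only surfaces at run time, which is consistent with the failure showing up during query execution rather than at build time.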
Re: Review Request 62247: HIVE-17394: AvroSerde is regenerating TypeInfo objects for each nullable Avro field for every row
> On Sept. 12, 2017, 5:02 p.m., Ratandeep Ratti wrote: > > serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroDeserializer.java > > Line 305 (original), 305 (patched) > > <https://reviews.apache.org/r/62247/diff/1/?file=1820197#file1820197line305> > > > > This comment is misleading now and can be removed. Carl fixed this before committing. Thanks, Carl! - Anthony --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/62247/#review185212 ------- On Sept. 12, 2017, 10:43 p.m., Anthony Hsu wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/62247/ > --- > > (Updated Sept. 12, 2017, 10:43 p.m.) > > > Review request for hive, Carl Steinbach and Ratandeep Ratti. > > > Bugs: HIVE-17394 > https://issues.apache.org/jira/browse/HIVE-17394 > > > Repository: hive-git > > > Description > --- > > Previously, when Avro found a nullable union in the reader schema, it would > regenerate the TypeInfo for the field for every record. This patch reuses the > same TypeInfo that only needs to be calculated once. > > In our testing, we found this improved count() queries by 2x. > > > Diffs > - > > serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroDeserializer.java > ecfe15f59dac04bda3f8f1275babebf736608a6b > > > Diff: https://reviews.apache.org/r/62247/diff/2/ > > > Testing > --- > > `mvn clean package -DskipTests -Dmaven.javadoc.skip=true` succeeded. > > > Thanks, > > Anthony Hsu > >
Re: Review Request 62247: HIVE-17394: AvroSerde is regenerating TypeInfo objects for each nullable Avro field for every row
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/62247/ --- (Updated Sept. 12, 2017, 10:43 p.m.) Review request for hive, Carl Steinbach and Ratandeep Ratti. Changes --- Addressed Ratandeep's comment. Bugs: HIVE-17394 https://issues.apache.org/jira/browse/HIVE-17394 Repository: hive-git Description --- Previously, when Avro found a nullable union in the reader schema, it would regenerate the TypeInfo for the field for every record. This patch reuses the same TypeInfo that only needs to be calculated once. In our testing, we found this improved count() queries by 2x. Diffs (updated) - serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroDeserializer.java ecfe15f59dac04bda3f8f1275babebf736608a6b Diff: https://reviews.apache.org/r/62247/diff/2/ Changes: https://reviews.apache.org/r/62247/diff/1-2/ Testing --- `mvn clean package -DskipTests -Dmaven.javadoc.skip=true` succeeded. Thanks, Anthony Hsu
Review Request 62247: HIVE-17394: AvroSerde is regenerating TypeInfo objects for each nullable Avro field for every row
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/62247/ --- Review request for hive, Carl Steinbach and Ratandeep Ratti. Bugs: HIVE-17394 https://issues.apache.org/jira/browse/HIVE-17394 Repository: hive-git Description --- Previously, when Avro found a nullable union in the reader schema, it would regenerate the TypeInfo for the field for every record. This patch reuses the same TypeInfo that only needs to be calculated once. In our testing, we found this improved count() queries by 2x. Diffs - serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroDeserializer.java ecfe15f59dac04bda3f8f1275babebf736608a6b Diff: https://reviews.apache.org/r/62247/diff/1/ Testing --- `mvn clean package -DskipTests -Dmaven.javadoc.skip=true` succeeded. Thanks, Anthony Hsu
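The idea of computing the TypeInfo once and reusing it for every record can be sketched as a simple memoization. This is an illustration under assumptions, not the AvroDeserializer change itself: the class `TypeInfoCacheSketch`, the string-based schema key, and the `computeTypeInfo` stand-in are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; not the AvroDeserializer code.
class TypeInfoCacheSketch {
    private final Map<String, String> cache = new HashMap<>();
    private int computations = 0;

    // Stand-in for an expensive schema-to-type conversion.
    private String computeTypeInfo(String fieldSchema) {
        computations++;
        return "typeinfo(" + fieldSchema + ")";
    }

    // Computes the type info the first time a schema is seen
    // and returns the cached value on every later call.
    String typeInfoFor(String fieldSchema) {
        String cached = cache.get(fieldSchema);
        if (cached == null) {
            cached = computeTypeInfo(fieldSchema);
            cache.put(fieldSchema, cached);
        }
        return cached;
    }

    // Number of times the expensive conversion actually ran.
    int computations() {
        return computations;
    }
}
```

Calling `typeInfoFor` once per row with the same schema runs the conversion a single time, so the per-row cost drops to a map lookup. Whether that accounts for the reported 2x improvement depends on the workload and is not something this sketch measures.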
Re: Review Request 60303: HIVE-16908: Update table and partition replication tests to not use 2nd HCat instance
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/60303/#review178710 --- hcatalog/webhcat/java-client/src/test/java/org/apache/hive/hcatalog/api/TestHCatClient.java Lines 796-807 (original) <https://reviews.apache.org/r/60303/#comment252859> Instead of deleting this, what about just starting the second metastore in a separate process? Then we can preserve the end-to-end integration-esque nature of the tests. hcatalog/webhcat/java-client/src/test/java/org/apache/hive/hcatalog/api/TestHCatClient.java Line 1011 (original), 996 (patched) <https://reviews.apache.org/r/60303/#comment252858> its -> it's - Anthony Hsu On June 22, 2017, 12:59 a.m., Sunitha Beeram wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/60303/ > --- > > (Updated June 22, 2017, 12:59 a.m.) > > > Review request for hive, Carl Steinbach, Anthony Hsu, and Ratandeep Ratti. > > > Bugs: HIVE-16908 > https://issues.apache.org/jira/browse/HIVE-16908 > > > Repository: hive-git > > > Description > --- > > HIVE-16908: Update table and partition replication tests to not use 2nd HCat > instance > > HIVE-16844 fixed a connection leak issue which subsequently exposed failures > in TestHCatClient. The connection leak gets triggered if a metastore instance > is updated with a different JDO configuration. TestHCatClient uses 2 > metastore instances to test replication related methods. Unfortunately, it > does so by providing a different derby db name for the second instance. Since > the 2 metastores run in the same JVM, the path fixed in HIVE-16844 gets > triggered, resulting in "sourceMetastore"'s connection being closed and thus > resulting in failures. > > It appears to me that running 2 metastore instances within the same JVM is > error prone as there could be unintentional side-effects due to statics in > the code (as was exposed by fixing HIVE-16844). 
This patch provides a way to > test the replication related methods without involving a second instance. The > changes mainly validate the serialize/deserialize methods. One of the tests, > testPartitionRegistrationWithCustomSchema, uses addPartitions method to > verify propagation of changes and it appeared that addPartitions wasn't > covered by other tests in TestHCatClient and there wasn't a better way to > verify the intended path, so I used an approach where the original database > and table are dropped and recreated using the serialized-string and captured > partition spec. > > > Diffs > - > > > hcatalog/webhcat/java-client/src/test/java/org/apache/hive/hcatalog/api/TestHCatClient.java > 86d3acbcb462d244fa2dc2f48923aab1e3ccee66 > > > Diff: https://reviews.apache.org/r/60303/diff/2/ > > > Testing > --- > > mvn test -DTest=TestHCatClient now passes. > > > Thanks, > > Sunitha Beeram > >
Re: Review Request 59885: HIVE-16844: Fix Connection leak in ObjectStore when new Conf object is used
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/59885/#review177432 --- Ship it! Looks good to me. - Anthony Hsu On June 7, 2017, 4:29 p.m., Sunitha Beeram wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/59885/ > --- > > (Updated June 7, 2017, 4:29 p.m.) > > > Review request for hive, Carl Steinbach, Anthony Hsu, and Ratandeep Ratti. > > > Bugs: HIVE-16844 > https://issues.apache.org/jira/browse/HIVE-16844 > > > Repository: hive-git > > > Description > --- > > HIVE-16844: Fix Connection leak in ObjectStore when new Conf object is used > > > Diffs > - > > metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java > 4676e15942d72b0db56bedf0ff30aa60964c28d8 > > > Diff: https://reviews.apache.org/r/59885/diff/1/ > > > Testing > --- > > Can't provide unit tests to test the functionality, but problem is > reproducible and one way to simulate it is by setting pmf=null in > ObjectStore::setConf - you will notice leaked connections. With the fix the > same does not happen. > > > Thanks, > > Sunitha Beeram > >
Re: Review Request 59885: HIVE-16844: Fix Connection leak in ObjectStore when new Conf object is used
> On June 7, 2017, 8:45 p.m., Anthony Hsu wrote: > > metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java > > Line 302 (original), 304 (patched) > > <https://reviews.apache.org/r/59885/diff/1/?file=1743915#file1743915line304> > > > > Do we need to close the PersistenceManager as well? > > Sunitha Beeram wrote: > Good point, but the call to shutdown() on line 301 closes pm. Ah, yes, thanks for pointing that out. - Anthony --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/59885/#review177222 --- On June 7, 2017, 4:29 p.m., Sunitha Beeram wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/59885/ > --- > > (Updated June 7, 2017, 4:29 p.m.) > > > Review request for hive, Carl Steinbach, Anthony Hsu, and Ratandeep Ratti. > > > Bugs: HIVE-16844 > https://issues.apache.org/jira/browse/HIVE-16844 > > > Repository: hive-git > > > Description > --- > > HIVE-16844: Fix Connection leak in ObjectStore when new Conf object is used > > > Diffs > - > > metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java > 4676e15942d72b0db56bedf0ff30aa60964c28d8 > > > Diff: https://reviews.apache.org/r/59885/diff/1/ > > > Testing > --- > > Can't provide unit tests to test the functionality, but problem is > reproducible and one way to simulate it is by setting pmf=null in > ObjectStore::setConf - you will notice leaked connections. With the fix the > same does not happen. > > > Thanks, > > Sunitha Beeram > >
Re: Review Request 59867: HIVE-16831: Add unit tests for NPE fixes in HIVE-12054
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/59867/#review177223 --- Ship it! Looks good to me! - Anthony Hsu On June 6, 2017, 11:20 p.m., Sunitha Beeram wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/59867/ > --- > > (Updated June 6, 2017, 11:20 p.m.) > > > Review request for hive, Carl Steinbach, Anthony Hsu, and Ratandeep Ratti. > > > Bugs: HIVE-16831 > https://issues.apache.org/jira/browse/HIVE-16831 > > > Repository: hive-git > > > Description > --- > > HIVE-16831: Add unit tests for NPE fixes in HIVE-12054 > > > Diffs > - > > ql/src/test/queries/clientpositive/orc_empty_table.q PRE-CREATION > ql/src/test/results/clientpositive/orc_empty_table.q.out PRE-CREATION > > > Diff: https://reviews.apache.org/r/59867/diff/1/ > > > Testing > --- > > qtests pass. > > > Thanks, > > Sunitha Beeram > >
Re: Review Request 59885: HIVE-16844: Fix Connection leak in ObjectStore when new Conf object is used
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/59885/#review177222 --- metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java Line 302 (original), 304 (patched) <https://reviews.apache.org/r/59885/#comment250766> Do we need to close the PersistenceManager as well? - Anthony Hsu On June 7, 2017, 4:29 p.m., Sunitha Beeram wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/59885/ > --- > > (Updated June 7, 2017, 4:29 p.m.) > > > Review request for hive, Carl Steinbach, Anthony Hsu, and Ratandeep Ratti. > > > Bugs: HIVE-16844 > https://issues.apache.org/jira/browse/HIVE-16844 > > > Repository: hive-git > > > Description > --- > > HIVE-16844: Fix Connection leak in ObjectStore when new Conf object is used > > > Diffs > - > > metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java > 4676e15942d72b0db56bedf0ff30aa60964c28d8 > > > Diff: https://reviews.apache.org/r/59885/diff/1/ > > > Testing > --- > > Can't provide unit tests to test the functionality, but problem is > reproducible and one way to simulate it is by setting pmf=null in > ObjectStore::setConf - you will notice leaked connections. With the fix the > same does not happen. > > > Thanks, > > Sunitha Beeram > >
Review Request 59303: HIVE-16670: Hive should automatically clean up hive.downloaded.resources.dir
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/59303/ --- Review request for hive. Bugs: HIVE-16670 https://issues.apache.org/jira/browse/HIVE-16670 Repository: hive-git Description --- HIVE-16670: Hive should automatically clean up hive.downloaded.resources.dir Diffs - ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java ffce1d1aec8728840bb8ef726db1b600a9aeef38 Diff: https://reviews.apache.org/r/59303/diff/1/ Testing --- Thanks, Anthony Hsu
[jira] [Created] (HIVE-16670) Hive should automatically clean up hive.downloaded.resources.dir
Anthony Hsu created HIVE-16670: -- Summary: Hive should automatically clean up hive.downloaded.resources.dir Key: HIVE-16670 URL: https://issues.apache.org/jira/browse/HIVE-16670 Project: Hive Issue Type: Improvement Reporter: Anthony Hsu Assignee: Anthony Hsu Currently, Hive does not automatically clean up the hive.downloaded.resources.dir, so resources and resource directories can accumulate over time. Ideally, Hive should automatically clean up the resources dir when the session ends. Ref: https://github.com/apache/hive/blob/0ce98b3a7527f72216e9e41f7e610b44ee524758/ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java#L677-L678 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
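The behavior requested in this issue, removing the resources directory when the session ends, can be sketched with the standard `java.nio.file` API. This is only an illustration of the general technique and not the SessionState implementation: the class `ResourceDirCleanup`, the use of `AutoCloseable` to model the end of a session, and the `demo` helper are all assumptions made for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch; not the SessionState implementation.
class ResourceDirCleanup implements AutoCloseable {
    private final Path resourceDir;

    ResourceDirCleanup(Path resourceDir) {
        this.resourceDir = resourceDir;
    }

    // Called when the (hypothetical) session ends: removes the directory tree.
    @Override
    public void close() {
        if (!Files.exists(resourceDir)) {
            return;
        }
        try {
            List<Path> toDelete;
            try (Stream<Path> paths = Files.walk(resourceDir)) {
                // Reverse order puts children before their parent directories.
                toDelete = paths.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
            }
            for (Path p : toDelete) {
                Files.delete(p);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Creates a temporary directory with one file, closes the session,
    // and reports whether the directory is gone afterwards.
    static boolean demo() {
        try {
            Path dir = Files.createTempDirectory("downloaded_resources_sketch");
            Files.createFile(dir.resolve("example.jar"));
            try (ResourceDirCleanup session = new ResourceDirCleanup(dir)) {
                // Session work would happen here.
            }
            return !Files.exists(dir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Tying the deletion to `close()` means the cleanup runs even if the session body throws, which is the property the issue asks for. A real implementation would also need to decide what to do when several sessions share a directory, which this sketch does not address.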
Re: Review Request 55816: HIVE-15680: Incorrect results when hive.optimize.index.filter=true and same ORC table is referenced twice in query
> On Jan. 30, 2017, 4:53 p.m., Peter Vary wrote: > > Hi Anthony, > > > > I am not too familiar with the ORC tables, but currently working on > > enabling yetus on Hive. > > > > Yetus runs several checks which might help the work of the reviewers. Here > > is what Yetus found with the checkstyle plugin: > > > > ./ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java:3679: > > addTableSchemaToConf(conf, tableScanOp.getSchemaEvolutionColumns(), > > tableScanOp.getSchemaEvolutionColumnsTypes());: warning: Line is longer > > than 100 characters (found 118). > > ./ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java:3688: > > LOG.info(IOConstants.SCHEMA_EVOLUTION_COLUMNS + " and " + > > IOConstants.SCHEMA_EVOLUTION_COLUMNS_TYPES +: warning: Line is longer than > > 100 characters (found 108). > > ./ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java:610: > > pushFilters(jobConf, filterExpr, filterObj, serializedFilterObj, > > serializedFilterExpr, tableScan.getSchema(),: warning: Line is longer than > > 100 characters (found 113). > > ./ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java:614: > > public static void pushFilters(JobConf jobConf, ExprNodeGenericFuncDesc > > filterExpr, Serializable filterObject,: warning: Line is longer than 100 > > characters (found 112). > > ./ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java:614: > > public static void pushFilters(JobConf jobConf, ExprNodeGenericFuncDesc > > filterExpr, Serializable filterObject,:22: warning: More than 7 parameters > > (found 8). > > ./ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java:615: > > String serializedFilterObj, String serializedFilterExpr, RowSchema > > rowSchema, String schemaEvolutionColumns,: warning: Line is longer than 100 > > characters (found 114). 
> > ./ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java:743: > > pushFilters(jobConf, tableFilterExpr, filterObject, serializedFilterObj, > > serializedFilterExpr, rowSchema,: warning: Line is longer than 100 > > characters (found 109). > > ./ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java:747: > > private Set getAliasesForPath(Path splitPath, boolean nonNative, > > Path splitPathWithNoSchema) {: warning: Line is longer than 100 characters > > (found 104). > > ./ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java:791: > > private ExprNodeGenericFuncDesc buildTableFilterExpr(boolean noFilters, > > List filterExprs) {: warning: Line is longer than > > 100 characters (found 118). > > ./ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java:796: > > if (tableFilterExpr == null ) {:38: warning: ')' is preceded with > > whitespace. > > > > Running Findbugs, ASF header check, etc did not find any new problems. > > > > Thanks for the patch! > > > > Peter Thanks for running Yetus on my patch, Peter! I addressed most of the warnings (except the "More than 7 parameters" one) in my revision. - Anthony --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/55816/#review163528 --- On Jan. 31, 2017, 2:43 a.m., Anthony Hsu wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/55816/ > --- > > (Updated Jan. 31, 2017, 2:43 a.m.) > > > Review request for hive. 
> > > Bugs: HIVE-15680 > https://issues.apache.org/jira/browse/HIVE-15680 > > > Repository: hive-git > > > Description > --- > > HIVE-15680: Incorrect results when hive.optimize.index.filter=true and same > ORC table is referenced twice in query > > > Diffs > - > > ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java > 68dd5e7247415dec1e353010ea34481c4f2fc6cd > ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java > 51530ac16c92cc75d501bfcb573557754ba0c964 > ql/src/test/queries/clientpositive/orc_ppd_same_table_multiple_aliases.q > PRE-CREATION > > ql/src/test/results/clientpositive/orc_ppd_same_table_multiple_aliases.q.out > PRE-CREATION > serde/src/java/org/apache/hadoop/hive/serde2/ColumnProjectionUtils.java > 1354680584305bc7ea928526160f08fc9cbfd73e > > Diff: https://reviews.apache.org/r/55816/diff/ > > > Testing > --- > > Added qtest. > > > Thanks, > > Anthony Hsu > >
Re: Review Request 55816: HIVE-15680: Incorrect results when hive.optimize.index.filter=true and same ORC table is referenced twice in query
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/55816/ --- (Updated Jan. 31, 2017, 2:43 a.m.) Review request for hive. Changes --- In HiveInputFormat.java, changed ``` ColumnProjectionUtils.setReadAllColumns(jobConf); ``` to ``` ColumnProjectionUtils.appendReadColumns(jobConf, new ArrayList(), new ArrayList(), new ArrayList()); ``` Also fixed most of the warnings reported by Peter (all except the "More than 7 parameters" one). Bugs: HIVE-15680 https://issues.apache.org/jira/browse/HIVE-15680 Repository: hive-git Description --- HIVE-15680: Incorrect results when hive.optimize.index.filter=true and same ORC table is referenced twice in query Diffs (updated) - ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 68dd5e7247415dec1e353010ea34481c4f2fc6cd ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 51530ac16c92cc75d501bfcb573557754ba0c964 ql/src/test/queries/clientpositive/orc_ppd_same_table_multiple_aliases.q PRE-CREATION ql/src/test/results/clientpositive/orc_ppd_same_table_multiple_aliases.q.out PRE-CREATION serde/src/java/org/apache/hadoop/hive/serde2/ColumnProjectionUtils.java 1354680584305bc7ea928526160f08fc9cbfd73e Diff: https://reviews.apache.org/r/55816/diff/ Testing --- Added qtest. Thanks, Anthony Hsu
Re: Review Request 55816: HIVE-15680: Incorrect results when hive.optimize.index.filter=true and same ORC table is referenced twice in query
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/55816/ --- (Updated Jan. 28, 2017, 9:41 p.m.) Review request for hive. Changes --- Add back `!neededColumnIDs.isEmpty()` check, add `newConfStr.isEmpty()` check in ColumnProjectionUtils.appendReadColumns(). Bugs: HIVE-15680 https://issues.apache.org/jira/browse/HIVE-15680 Repository: hive-git Description --- HIVE-15680: Incorrect results when hive.optimize.index.filter=true and same ORC table is referenced twice in query Diffs (updated) - ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 1cf24b41c047b9bc43e42a2940ff54a3e331190c ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 3ee8fdc24aa115710d2c42f5c44c7f28e0544589 ql/src/test/queries/clientpositive/orc_ppd_same_table_multiple_aliases.q PRE-CREATION ql/src/test/results/clientpositive/orc_ppd_same_table_multiple_aliases.q.out PRE-CREATION serde/src/java/org/apache/hadoop/hive/serde2/ColumnProjectionUtils.java 1354680584305bc7ea928526160f08fc9cbfd73e Diff: https://reviews.apache.org/r/55816/diff/ Testing --- Added qtest. Thanks, Anthony Hsu
Re: Review Request 55816: HIVE-15680: Incorrect results when hive.optimize.index.filter=true and same ORC table is referenced twice in query
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/55816/ --- (Updated Jan. 26, 2017, 10:45 p.m.) Review request for hive. Changes --- Fix NPEs in LLAP tests. Bugs: HIVE-15680 https://issues.apache.org/jira/browse/HIVE-15680 Repository: hive-git Description --- HIVE-15680: Incorrect results when hive.optimize.index.filter=true and same ORC table is referenced twice in query Diffs (updated) - ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 1cf24b41c047b9bc43e42a2940ff54a3e331190c ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 3ee8fdc24aa115710d2c42f5c44c7f28e0544589 ql/src/test/queries/clientpositive/orc_ppd_same_table_multiple_aliases.q PRE-CREATION ql/src/test/results/clientpositive/orc_ppd_same_table_multiple_aliases.q.out PRE-CREATION Diff: https://reviews.apache.org/r/55816/diff/ Testing --- Added qtest. Thanks, Anthony Hsu
Re: Review Request 55816: HIVE-15680: Incorrect results when hive.optimize.index.filter=true and same ORC table is referenced twice in query
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/55816/ --- (Updated Jan. 26, 2017, 5:42 p.m.) Review request for hive. Changes --- Added some missing null checks. Bugs: HIVE-15680 https://issues.apache.org/jira/browse/HIVE-15680 Repository: hive-git Description --- HIVE-15680: Incorrect results when hive.optimize.index.filter=true and same ORC table is referenced twice in query Diffs (updated) - ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 1cf24b41c047b9bc43e42a2940ff54a3e331190c ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 3ee8fdc24aa115710d2c42f5c44c7f28e0544589 ql/src/test/queries/clientpositive/orc_ppd_same_table_multiple_aliases.q PRE-CREATION ql/src/test/results/clientpositive/orc_ppd_same_table_multiple_aliases.q.out PRE-CREATION Diff: https://reviews.apache.org/r/55816/diff/ Testing --- Added qtest. Thanks, Anthony Hsu
Re: Review Request 55816: HIVE-15680: Incorrect results when hive.optimize.index.filter=true and same ORC table is referenced twice in query
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/55816/ --- (Updated Jan. 24, 2017, 12:54 a.m.) Review request for hive. Changes --- Added back code for setting needed column names and paths. Updated code to merge the names and paths. Bugs: HIVE-15680 https://issues.apache.org/jira/browse/HIVE-15680 Repository: hive-git Description --- HIVE-15680: Incorrect results when hive.optimize.index.filter=true and same ORC table is referenced twice in query Diffs (updated) - ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 1cf24b41c047b9bc43e42a2940ff54a3e331190c ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 3ee8fdc24aa115710d2c42f5c44c7f28e0544589 ql/src/test/queries/clientpositive/orc_ppd_same_table_multiple_aliases.q PRE-CREATION ql/src/test/results/clientpositive/orc_ppd_same_table_multiple_aliases.q.out PRE-CREATION Diff: https://reviews.apache.org/r/55816/diff/ Testing --- Added qtest. Thanks, Anthony Hsu
Review Request 55816: HIVE-15680: Incorrect results when hive.optimize.index.filter=true and same ORC table is referenced twice in query
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/55816/ --- Review request for hive. Bugs: HIVE-15680 https://issues.apache.org/jira/browse/HIVE-15680 Repository: hive-git Description --- HIVE-15680: Incorrect results when hive.optimize.index.filter=true and same ORC table is referenced twice in query Diffs - ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 1cf24b41c047b9bc43e42a2940ff54a3e331190c ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 3ee8fdc24aa115710d2c42f5c44c7f28e0544589 ql/src/test/queries/clientpositive/orc_ppd_same_table_multiple_aliases.q PRE-CREATION ql/src/test/results/clientpositive/orc_ppd_same_table_multiple_aliases.q.out PRE-CREATION Diff: https://reviews.apache.org/r/55816/diff/ Testing --- Added qtest. Thanks, Anthony Hsu
[jira] [Created] (HIVE-15680) Incorrect results when hive.optimize.index.filter=true and same ORC table is referenced twice in query
Anthony Hsu created HIVE-15680:
--

Summary: Incorrect results when hive.optimize.index.filter=true and same ORC table is referenced twice in query
Key: HIVE-15680
URL: https://issues.apache.org/jira/browse/HIVE-15680
Project: Hive
Issue Type: Bug
Affects Versions: 1.1.0, 2.2.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu

To repro:

{noformat}
set hive.optimize.index.filter=true;
create table test_table(number int) stored as ORC;

-- Two insertions will create two files, with one stripe each
insert into table test_table VALUES (1);
insert into table test_table VALUES (2);

-- This should and does return 2 records
select * from test_table;

-- These should and do each return 1 record
select * from test_table where number = 1;
select * from test_table where number = 2;

-- This should return 2 records but only returns 1 record
select * from test_table where number = 1
union all
select * from test_table where number = 2;
{noformat}

What's happening is only the last predicate is being pushed down.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
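One way a "only the last predicate is pushed down" symptom can arise is when every table scan writes its filter into the same configuration key, so later writes replace earlier ones. Whether that is the actual cause here is an assumption, not something the report states. The sketch below uses hypothetical names throughout (`SharedFilterConfSketch`, the key `filter.expr`, the aliases `t1` and `t2`) and a plain map in place of a job configuration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; the key names and the map-based "conf" are assumptions.
class SharedFilterConfSketch {
    // Every scan writes to the same key, so later writes replace earlier ones.
    static Map<String, String> pushToSingleKey(Map<String, String> filtersByAlias) {
        Map<String, String> conf = new LinkedHashMap<>();
        for (String filter : filtersByAlias.values()) {
            conf.put("filter.expr", filter);
        }
        return conf;
    }

    // Each scan writes to its own key, so all filters survive.
    static Map<String, String> pushPerAlias(Map<String, String> filtersByAlias) {
        Map<String, String> conf = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : filtersByAlias.entrySet()) {
            conf.put("filter.expr." + e.getKey(), e.getValue());
        }
        return conf;
    }
}
```

With filters `number = 1` for alias `t1` and `number = 2` for alias `t2`, the single-key variant ends up holding only `number = 2`, which would match the one-record result described in the repro. Keeping filters separate per alias is one possible remedy; merging them is another.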
[jira] [Created] (HIVE-15438) avrocountemptytbl.q should use SORT_QUERY_RESULTS
Anthony Hsu created HIVE-15438: -- Summary: avrocountemptytbl.q should use SORT_QUERY_RESULTS Key: HIVE-15438 URL: https://issues.apache.org/jira/browse/HIVE-15438 Project: Hive Issue Type: Bug Affects Versions: 1.1.0, 2.2.0 Reporter: Anthony Hsu Assignee: Anthony Hsu In Hive 1.1.0, when building and testing using Java 1.8, I've noticed that avrocountemptytbl.q fails due to ordering issues: {noformat} 57d56 < 100 58a58 > 100 {noformat} This can be fixed by adding {{-- SORT_QUERY_RESULTS}} to the qtest. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
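The idea behind sorting query results before comparison can be sketched in a few lines. This is only an illustration of order-insensitive comparison, not how the qtest framework implements {{SORT_QUERY_RESULTS}}; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class SortedResultComparison {
    /** Returns true when both outputs contain the same lines, regardless of order. */
    static boolean sameRowsIgnoringOrder(List<String> expected, List<String> actual) {
        // Copy first so the caller's lists are left untouched.
        List<String> e = new ArrayList<>(expected);
        List<String> a = new ArrayList<>(actual);
        Collections.sort(e);
        Collections.sort(a);
        return e.equals(a);
    }

    public static void main(String[] args) {
        List<String> expected = Arrays.asList("0", "100");
        List<String> actual = Arrays.asList("100", "0");
        System.out.println(sameRowsIgnoringOrder(expected, actual));
    }
}
```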
Review Request 54765: HIVE-15411: ADD PARTITION should support setting FILEFORMAT and SERDEPROPERTIES
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/54765/ --- Review request for hive. Bugs: HIVE-15411 https://issues.apache.org/jira/browse/HIVE-15411 Repository: hive-git Description --- HIVE-15411: ADD PARTITION should support setting FILEFORMAT and SERDEPROPERTIES Diffs - hcatalog/core/src/main/java/org/apache/hive/hcatalog/cli/SemanticAnalysis/HCatSemanticAnalyzer.java 18bf172116828439751ca4d0e99c83912f2b3915 ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 3f5813018b9305734e66dcff76064d6e3e6061f1 ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g 55915a63be916b79dae022d76a4252ab1a18c64b ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java ce952c5ee4d54b4c2a092f9ee15197ec0337fb4c ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 79e55b2de07983c7b799ff382b9c71ef14d25b43 ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzerFactory.java 520d3de9a7cc07b728b5d3ad3845622ddbec22fb ql/src/test/queries/clientpositive/add_part_fileformat_serdeproperties_location.q PRE-CREATION ql/src/test/results/clientpositive/add_part_fileformat_serdeproperties_location.q.out PRE-CREATION Diff: https://reviews.apache.org/r/54765/diff/ Testing --- Added qtest. Thanks, Anthony Hsu
[jira] [Created] (HIVE-15411) ADD PARTITION should support setting FILEFORMAT and SERDEPROPERTIES
Anthony Hsu created HIVE-15411: -- Summary: ADD PARTITION should support setting FILEFORMAT and SERDEPROPERTIES Key: HIVE-15411 URL: https://issues.apache.org/jira/browse/HIVE-15411 Project: Hive Issue Type: Improvement Reporter: Anthony Hsu Assignee: Anthony Hsu Currently, {{ALTER TABLE ... ADD PARTITION}} only lets you set the partition's LOCATION but not its FILEFORMAT or SERDEPROPERTIES. In order to change the FILEFORMAT or SERDEPROPERTIES, you have to issue two additional calls to {{ALTER TABLE ... PARTITION ... SET FILEFORMAT}} and {{ALTER TABLE ... PARTITION ... SET SERDEPROPERTIES}}. This is not atomic, and queries that interleave the ALTER TABLE commands may fail. We should extend the grammar to support setting FILEFORMAT and SERDEPROPERTIES atomically as part of the ADD PARTITION command. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-15400) EXCHANGE PARTITION should honor partition locations
Anthony Hsu created HIVE-15400: -- Summary: EXCHANGE PARTITION should honor partition locations Key: HIVE-15400 URL: https://issues.apache.org/jira/browse/HIVE-15400 Project: Hive Issue Type: Bug Reporter: Anthony Hsu Currently, if you add a partition with a custom location, EXCHANGE PARTITION will fail with a "File ... does not exist" error: {noformat} drop table if exists text_partitioned; drop table if exists text_partitioned2; create table text_partitioned (b string) partitioned by (a int) stored as textfile; create table text_partitioned2 (b string) partitioned by (a int) stored as textfile; alter table text_partitioned add partition (a=1) location '/tmp/text/1'; alter table text_partitioned2 exchange partition (a=1) with table text_partitioned; {noformat} The last command fails with {code} FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Got exception: java.io.FileNotFoundException File file:/path/to/warehouse_dir/text_partitioned/a=1 does not exist) {code} EXCHANGE PARTITION should honor the location that has been set for the partition. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-15394) HiveMetaStoreClient add_partition API should not allow partitions with a null StorageDescriptor.cols to be added
Anthony Hsu created HIVE-15394: -- Summary: HiveMetaStoreClient add_partition API should not allow partitions with a null StorageDescriptor.cols to be added Key: HIVE-15394 URL: https://issues.apache.org/jira/browse/HIVE-15394 Project: Hive Issue Type: Bug Affects Versions: 1.1.0, 2.2.0 Reporter: Anthony Hsu Follow up to HIVE-15353. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 54341: HIVE-15353: Metastore throws NPE if StorageDescriptor.cols is null
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/54341/ --- (Updated December 8, 2016, 9:23 p.m.) Review request for hive. Changes --- New version no longer updates the Thrift definition but just fixes the NPEs in the alter_partition code path. Bugs: HIVE-15353 https://issues.apache.org/jira/browse/HIVE-15353 Repository: hive-git Description (updated) --- Update alter_partition() code path to fix NPEs. Diffs (updated) - metastore/src/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java 86565a4198d5daced5e230a41d8ada577a656268 metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 9ea6ac40d6f0eb9081c5cfad982ffc435f15f6fd Diff: https://reviews.apache.org/r/54341/diff/ Testing --- After making these changes, I no longer encounter NullPointerExceptions when setting cols to null in create_table, alter_table, and alter_partition calls. Thanks, Anthony Hsu
Re: Review Request 54341: HIVE-15353: Metastore throws NPE if StorageDescriptor.cols is null
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/54341/ --- (Updated December 5, 2016, 5:30 p.m.) Review request for hive. Changes --- Fixed HiveMetaStore unit tests. Bugs: HIVE-15353 https://issues.apache.org/jira/browse/HIVE-15353 Repository: hive-git Description --- Set a default value for StorageDescriptor.cols of empty list to avoid having to do null checks everywhere. However, null checks are still needed to guard against existing null values in the database (add_partition previously allowed you to store nulls in the database). Diffs (updated) - itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java 21d1b46fcbd4f8f10ee447dce9d40dd6b43a2793 metastore/if/hive_metastore.thrift baab31bb0f44361847224843f905c0417b1670be metastore/src/gen/thrift/gen-cpp/hive_metastore_types.h 6838133083684ee3b93a93129bb492ab29a4842e metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/StorageDescriptor.java 938f06bbce7a2b213e901f153e1da4606339c0cf metastore/src/gen/thrift/gen-php/metastore/Types.php b9af4efc5f8b7cdf19236db7d68865bdec8382a5 metastore/src/gen/thrift/gen-py/hive_metastore/ttypes.py 21c039006fc05bc603fda0eeedc92174583f8403 metastore/src/gen/thrift/gen-rb/hive_metastore_types.rb c73593298bbddb46e0926b01ccb9c6fb5d880452 metastore/src/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java 86565a4198d5daced5e230a41d8ada577a656268 metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 9ea6ac40d6f0eb9081c5cfad982ffc435f15f6fd Diff: https://reviews.apache.org/r/54341/diff/ Testing --- After making these changes, I no longer encounter NullPointerExceptions when setting cols to null in create_table, alter_table, and alter_partition calls. Thanks, Anthony Hsu
Review Request 54341: HIVE-15353: Metastore throws NPE if StorageDescriptor.cols is null
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/54341/ --- Review request for hive. Bugs: HIVE-15353 https://issues.apache.org/jira/browse/HIVE-15353 Repository: hive-git Description --- Set a default value for StorageDescriptor.cols of empty list to avoid having to do null checks everywhere. However, null checks are still needed to guard against existing null values in the database (add_partition previously allowed you to store nulls in the database). Diffs - metastore/if/hive_metastore.thrift baab31bb0f44361847224843f905c0417b1670be metastore/src/gen/thrift/gen-cpp/hive_metastore_types.h 6838133083684ee3b93a93129bb492ab29a4842e metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/StorageDescriptor.java 938f06bbce7a2b213e901f153e1da4606339c0cf metastore/src/gen/thrift/gen-php/metastore/Types.php b9af4efc5f8b7cdf19236db7d68865bdec8382a5 metastore/src/gen/thrift/gen-py/hive_metastore/ttypes.py 21c039006fc05bc603fda0eeedc92174583f8403 metastore/src/gen/thrift/gen-rb/hive_metastore_types.rb c73593298bbddb46e0926b01ccb9c6fb5d880452 metastore/src/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java 86565a4198d5daced5e230a41d8ada577a656268 metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 9ea6ac40d6f0eb9081c5cfad982ffc435f15f6fd Diff: https://reviews.apache.org/r/54341/diff/ Testing --- After making these changes, I no longer encounter NullPointerExceptions when setting cols to null in create_table, alter_table, and alter_partition calls. Thanks, Anthony Hsu
[jira] [Created] (HIVE-15353) Metastore throws NPE if StorageDescriptor.cols is null
Anthony Hsu created HIVE-15353: -- Summary: Metastore throws NPE if StorageDescriptor.cols is null Key: HIVE-15353 URL: https://issues.apache.org/jira/browse/HIVE-15353 Project: Hive Issue Type: Bug Affects Versions: 1.1.0, 2.2.0 Reporter: Anthony Hsu Assignee: Anthony Hsu When using the HiveMetaStoreClient API directly to talk to the metastore, you get NullPointerExceptions when StorageDescriptor.cols is null in the Table/Partition object in the following calls: * create_table * alter_table * alter_partition Calling add_partition with StorageDescriptor.cols set to null causes null to be stored in the metastore database and subsequent calls to alter_partition for that partition to fail with an NPE. The simplest way to fix these NPEs seems to be to update the StorageDescriptor.cols Thrift definition and set a default value of empty list. Some null checks will also have to be added to handle existing nulls in the metastore database. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
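The null checks mentioned above could take roughly the following shape. This is a minimal sketch assuming a small helper that normalizes a possibly-null column list before use; `NullSafeCols` and `orEmpty` are hypothetical names and are not part of the metastore code.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class NullSafeCols {
    /** Treats a null column list (for example, one read back from an old database row) as empty. */
    static <T> List<T> orEmpty(List<T> cols) {
        return cols == null ? Collections.<T>emptyList() : cols;
    }

    public static void main(String[] args) {
        List<String> fromDatabase = null; // stands in for a StorageDescriptor whose cols is null
        // Iterating the normalized list is safe even though the input was null.
        for (String col : orEmpty(fromDatabase)) {
            System.out.println(col);
        }
        System.out.println(orEmpty(Arrays.asList("key", "value")).size());
    }
}
```

Normalizing at the point of use covers both cases in the description: new objects that never set cols and existing rows that already contain null.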
[jira] [Created] (HIVE-15289) Flaky test: TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver (setup)
Anthony Hsu created HIVE-15289: -- Summary: Flaky test: TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver (setup) Key: HIVE-15289 URL: https://issues.apache.org/jira/browse/HIVE-15289 Project: Hive Issue Type: Sub-task Reporter: Anthony Hsu In recent PreCommit builds, TestSparkCliDriver has failed during setup with errors like the following: From https://builds.apache.org/job/PreCommit-HIVE-Build/2292/testReport/: {noformat} Failed during createSources processLine with code=3 ... Job failed with java.io.IOException: Failed to create local dir in /tmp/blockmgr-be4539eb-7896-4903-89c9-7ae1c48faa24/01. at org.apache.spark.storage.DiskBlockManager.getFile(DiskBlockManager.scala:70) at org.apache.spark.storage.DiskStore.contains(DiskStore.scala:124) at org.apache.spark.storage.BlockManager.org$apache$spark$storage$BlockManager$$getCurrentBlockStatus(BlockManager.scala:379) at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:959) at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:910) at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:866) at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:910) at org.apache.spark.storage.BlockManager.putIterator(BlockManager.scala:700) at org.apache.spark.storage.BlockManager.putSingle(BlockManager.scala:1213) at org.apache.spark.broadcast.TorrentBroadcast.writeBlocks(TorrentBroadcast.scala:103) at org.apache.spark.broadcast.TorrentBroadcast.<init>(TorrentBroadcast.scala:86) at org.apache.spark.broadcast.TorrentBroadcastFactory.newBroadcast(TorrentBroadcastFactory.scala:34) at org.apache.spark.broadcast.BroadcastManager.newBroadcast(BroadcastManager.scala:56) at org.apache.spark.SparkContext.broadcast(SparkContext.scala:1370) at org.apache.spark.rdd.HadoopRDD.<init>(HadoopRDD.scala:125) at org.apache.spark.SparkContext$$anonfun$hadoopRDD$1.apply(SparkContext.scala:965) at org.apache.spark.SparkContext$$anonfun$hadoopRDD$1.apply(SparkContext.scala:961) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112) at org.apache.spark.SparkContext.withScope(SparkContext.scala:682) at org.apache.spark.SparkContext.hadoopRDD(SparkContext.scala:961) at org.apache.spark.api.java.JavaSparkContext.hadoopRDD(JavaSparkContext.scala:412) at org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generateMapInput(SparkPlanGenerator.java:205) at org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generateParentTran(SparkPlanGenerator.java:145) at org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generate(SparkPlanGenerator.java:117) at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient$JobStatusJob.call(RemoteHiveSparkClient.java:339) at org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:358) at org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:323) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) {noformat} From https://builds.apache.org/job/PreCommit-HIVE-Build/2291/testReport/: {noformat} Failed during createSources processLine with code=1 ... Failed to monitor Job[ 11] with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(java.util.concurrent.TimeoutException)' {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-15288) Flaky test: TestMiniTezCliDriver.testCliDriver[explainuser_3]
Anthony Hsu created HIVE-15288: -- Summary: Flaky test: TestMiniTezCliDriver.testCliDriver[explainuser_3] Key: HIVE-15288 URL: https://issues.apache.org/jira/browse/HIVE-15288 Project: Hive Issue Type: Sub-task Reporter: Anthony Hsu explainuser_3.q sometimes fails with the following diff: {noformat} 34c34 < Select Operator [SEL_7] (rows=16 width=106) --- > Select Operator [SEL_7] (rows=16 width=107) 38c38 < Select Operator [SEL_5] (rows=16 width=106) --- > Select Operator [SEL_5] (rows=16 width=107) 40c40 < TableScan [TS_0] (rows=16 width=106) --- > TableScan [TS_0] (rows=16 width=107) {noformat} It was also previously reported as flaky in HIVE-14689. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-15287) Flaky test: TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
Anthony Hsu created HIVE-15287: -- Summary: Flaky test: TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata] Key: HIVE-15287 URL: https://issues.apache.org/jira/browse/HIVE-15287 Project: Hive Issue Type: Sub-task Reporter: Anthony Hsu insert_values_orig_table_use_metadata.q sometimes fails with the following diff: {noformat} 315c315 < totalSize 1545 --- > totalSize 1508 343c343 < Statistics: Num rows: 1 Data size: 1545 Basic stats: COMPLETE Column stats: COMPLETE --- > Statistics: Num rows: 1 Data size: 1508 Basic stats: COMPLETE Column stats: COMPLETE 345c345 < Statistics: Num rows: 1 Data size: 1545 Basic stats: COMPLETE Column stats: COMPLETE --- > Statistics: Num rows: 1 Data size: 1508 Basic stats: COMPLETE Column stats: COMPLETE 439c439 < totalSize 3091 --- > totalSize 3016 467c467 < Statistics: Num rows: 1 Data size: 3091 Basic stats: COMPLETE Column stats: COMPLETE --- > Statistics: Num rows: 1 Data size: 3016 Basic stats: COMPLETE Column stats: COMPLETE 469c469 < Statistics: Num rows: 1 Data size: 3091 Basic stats: COMPLETE Column stats: COMPLETE --- > Statistics: Num rows: 1 Data size: 3016 Basic stats: COMPLETE Column stats: COMPLETE 547c547 < totalSize 380328 --- > totalSize 380253 575c575 < Statistics: Num rows: 1 Data size: 380328 Basic stats: COMPLETE Column stats: COMPLETE --- > Statistics: Num rows: 1 Data size: 380253 Basic stats: COMPLETE Column stats: COMPLETE 577c577 < Statistics: Num rows: 1 Data size: 380328 Basic stats: COMPLETE Column stats: COMPLETE --- > Statistics: Num rows: 1 Data size: 380253 Basic stats: COMPLETE Column stats: COMPLETE {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-15286) Flaky test: TestCliDriver.testCliDriver[autoColumnStats_4]
Anthony Hsu created HIVE-15286: -- Summary: Flaky test: TestCliDriver.testCliDriver[autoColumnStats_4] Key: HIVE-15286 URL: https://issues.apache.org/jira/browse/HIVE-15286 Project: Hive Issue Type: Sub-task Reporter: Anthony Hsu autoColumnStats_4.q sometimes fails with the following diff: {noformat} 203c203 < totalSize 1707 --- > totalSize 1714 246c246 < totalSize 2920 --- > totalSize 2719 {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Review Request 54094: HIVE-15190: Field names are not preserved in ORC files written with ACID
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/54094/ --- Review request for hive. Bugs: HIVE-15190 https://issues.apache.org/jira/browse/HIVE-15190 Repository: hive-git Description --- Previously, when writing to an ACID ORC table, the file written to disk would have a schema of `struct<...(acid columns)...,row:struct<_col0:int,_col1:string,...>>`, using virtual column names `_col0`, `_col1`, etc., instead of the actual table column names. This patch fixes this issue. Having the actual table column names in the ORC file itself is needed when doing schema evolution based on field names: https://issues.apache.org/jira/browse/ORC-54 Diffs - orc/src/java/org/apache/orc/impl/SchemaEvolution.java 7379de93a7f39d734ef7695c197bd9f24bc84321 ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcFile.java 53660206e3f59c37be261b1a9796f04721a244f3 ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRawRecordMerger.java efde2db482367f1037c486df9c5cabd67b1368ed ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRecordUpdater.java 492c64c29e8d4f38d857381bc375074e06868f7c ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java 75c7680e267ab44e426d0b21c6fd6dce6a352bbd ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java 49ba6675bae5b3e6d8bf1fa2e9ed8d2a27b7f83a Diff: https://reviews.apache.org/r/54094/diff/ Testing --- Added unit test. Also ran some of the existing ACID tests and they still passed. Thanks, Anthony Hsu
[jira] [Created] (HIVE-15190) Field names are not preserved in ORC files written with ACID
Anthony Hsu created HIVE-15190: -- Summary: Field names are not preserved in ORC files written with ACID Key: HIVE-15190 URL: https://issues.apache.org/jira/browse/HIVE-15190 Project: Hive Issue Type: Bug Affects Versions: 2.1.0 Reporter: Anthony Hsu Assignee: Anthony Hsu To repro: {noformat} drop table if exists orc_nonacid; drop table if exists orc_acid; create table orc_nonacid (a int) clustered by (a) into 2 buckets stored as orc; create table orc_acid (a int) clustered by (a) into 2 buckets stored as orc TBLPROPERTIES('transactional'='true'); insert into table orc_nonacid values(1), (2); insert into table orc_acid values(1), (2); {noformat} Running {{hive --service orcfiledump }} on the files created by the {{insert}} statements above, you'll see that for {{orc_nonacid}}, the files have schema {{struct<a:int>}} whereas for {{orc_acid}}, the files have schema {{struct<operation:int,originalTransaction:bigint,bucket:int,rowId:bigint,currentTransaction:bigint,row:struct<_col0:int>>}}. The last field {{row}} should have schema {{struct<a:int>}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-13993) Hive should provide built-in UDF that can apply another UDF to each element of an array
Anthony Hsu created HIVE-13993: -- Summary: Hive should provide built-in UDF that can apply another UDF to each element of an array Key: HIVE-13993 URL: https://issues.apache.org/jira/browse/HIVE-13993 Project: Hive Issue Type: New Feature Reporter: Anthony Hsu There is currently no simple way to take an array field and apply a UDF on each element of the array, returning a new array. This is a basic use case that Hive should provide a built-in UDF for. More motivation: http://stackoverflow.com/questions/27722493/how-to-invoke-udf-for-each-element-in-an-array-in-hive -- This message was sent by Atlassian JIRA (v6.3.4#6332)
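To make the requested semantics concrete, here is a plain-Java sketch of "apply a function to each element of an array, returning a new array". It is not a Hive UDF and does not use any Hive API; `ArrayMapSketch` and `mapEach` are hypothetical names, and returning null for a null input is an assumption about how such a UDF might treat NULL.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

final class ArrayMapSketch {
    /** Applies fn to every element and collects the results in a new list, preserving order. */
    static <A, B> List<B> mapEach(List<A> array, Function<? super A, ? extends B> fn) {
        if (array == null) {
            return null; // assumption: NULL array in, NULL out
        }
        List<B> out = new ArrayList<>(array.size());
        for (A element : array) {
            out.add(fn.apply(element));
        }
        return out;
    }

    public static void main(String[] args) {
        // The lambda stands in for "another UDF" applied per element.
        System.out.println(mapEach(Arrays.asList("a", "b"), s -> s.toUpperCase()));
    }
}
```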
Re: Review Request 45348: HIVE-13363: Add hive.metastore.token.signature property to HiveConf
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/45348/ --- (Updated May 4, 2016, 1:30 a.m.) Review request for hive, Carl Steinbach and Ratandeep Ratti. Changes --- Fixed bug in original revision that caused build to fail. Bugs: HIVE-13363 https://issues.apache.org/jira/browse/HIVE-13363 Repository: hive-git Description --- No logic changes, just added METASTORE_TOKEN_SIGNATURE property to HiveConf and replaced all instances of `hive.metastore.token.signature` with references to `HiveConf.ConfVars.METASTORE_TOKEN_SIGNATURE`. Diffs (updated) - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 06a6906ef1f5e0b7d941c042c74d257089f46f96 hcatalog/core/src/main/java/org/apache/hive/hcatalog/common/HCatUtil.java 3ee30edef50940b4d9da21230177d6fb2a796819 hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/SecureProxySupport.java 13f3c9bd5e523e770dd8ccfd75a442bbbf93b680 itests/hive-unit-hadoop2/src/test/java/org/apache/hadoop/hive/thrift/TestHadoopAuthBridge23.java d07162bd46f8bea88d8c856552a2b4a2d83caf8d metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java 7d37d0706d5f0269b89c4c6486adecf4bb3d85b8 service/src/java/org/apache/hive/service/cli/session/HiveSessionImplwithUGI.java 025b0b810b040ba6ea72b900ccd0802e207033a8 Diff: https://reviews.apache.org/r/45348/diff/ Testing --- Ran `grep -r hive.metastore.token.signature --include=*.java *` and saw that the only occurrences of this string are in HiveConf.java and a comment in Security.java. Thanks, Anthony Hsu
Review Request 46790: HIVE-13644: Remove hardcoded groovy.grape.report.downloads=true from DependencyResolver
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/46790/ --- Review request for hive, Carl Steinbach, Mark Wagner, and Ratandeep Ratti. Bugs: HIVE-13644 https://issues.apache.org/jira/browse/HIVE-13644 Repository: hive-git Description --- HIVE-13644: Remove hardcoded groovy.grape.report.downloads=true from DependencyResolver Diffs - ql/src/java/org/apache/hadoop/hive/ql/util/DependencyResolver.java 3891e59a274e6449c5f50eea51e4f23762efcbc0 Diff: https://reviews.apache.org/r/46790/diff/ Testing --- Tested manually. Thanks, Anthony Hsu
[jira] [Created] (HIVE-13644) Remove hardcoded groovy.grape.report.downloads=true from DependencyResolver
Anthony Hsu created HIVE-13644: -- Summary: Remove hardcoded groovy.grape.report.downloads=true from DependencyResolver Key: HIVE-13644 URL: https://issues.apache.org/jira/browse/HIVE-13644 Project: Hive Issue Type: Improvement Reporter: Anthony Hsu Assignee: Anthony Hsu Currently, in Hive's [DependencyResolver.java|https://github.com/apache/hive/blob/8dd1d1966f2f0b86604b4e991ebc865224f42b41/ql/src/java/org/apache/hadoop/hive/ql/util/DependencyResolver.java#L176], the system property {{groovy.grape.report.downloads}} is hardcoded to {{true}} and there is no way to override it and disable the logging. We should remove this hardcoded value and allow users to configure it as they see fit. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
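One possible way to let users configure the property is to apply a default only when nothing has been set. The sketch below shows that pattern with the standard `System.getProperty` and `System.setProperty` calls; it is not the committed patch, and `GrapeReportSketch` and `applyDefault` are hypothetical names.

```java
final class GrapeReportSketch {
    static final String KEY = "groovy.grape.report.downloads";

    /** Sets the property to defaultValue only if the user has not already set it. */
    static String applyDefault(String defaultValue) {
        if (System.getProperty(KEY) == null) {
            System.setProperty(KEY, defaultValue);
        }
        return System.getProperty(KEY);
    }

    public static void main(String[] args) {
        // Run with -Dgroovy.grape.report.downloads=false to see the user value win.
        System.out.println(KEY + "=" + applyDefault("true"));
    }
}
```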
Review Request 45348: HIVE-13363: Add hive.metastore.token.signature property to HiveConf
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/45348/ --- Review request for hive, Carl Steinbach and Ratandeep Ratti. Bugs: HIVE-13363 https://issues.apache.org/jira/browse/HIVE-13363 Repository: hive-git Description --- No logic changes, just added METASTORE_TOKEN_SIGNATURE property to HiveConf and replaced all instances of `hive.metastore.token.signature` with references to `HiveConf.ConfVars.METASTORE_TOKEN_SIGNATURE`. Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java b8870f2ef78884f23e65d9432415e49d89f8ee35 hcatalog/core/src/main/java/org/apache/hive/hcatalog/common/HCatUtil.java 3ee30edef50940b4d9da21230177d6fb2a796819 hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/SecureProxySupport.java 13f3c9bd5e523e770dd8ccfd75a442bbbf93b680 itests/hive-unit-hadoop2/src/test/java/org/apache/hadoop/hive/thrift/TestHadoopAuthBridge23.java d07162bd46f8bea88d8c856552a2b4a2d83caf8d metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java cdd12aba9fb4284bbb9989d7fcbe3c591ef17d98 service/src/java/org/apache/hive/service/cli/session/HiveSessionImplwithUGI.java 025b0b810b040ba6ea72b900ccd0802e207033a8 Diff: https://reviews.apache.org/r/45348/diff/ Testing --- Ran `grep -r hive.metastore.token.signature --include=*.java *` and saw that the only occurrences of this string are in HiveConf.java and a comment in Security.java. Thanks, Anthony Hsu
[jira] [Created] (HIVE-13363) Add hive.metastore.token.signature property to HiveConf
Anthony Hsu created HIVE-13363: -- Summary: Add hive.metastore.token.signature property to HiveConf Key: HIVE-13363 URL: https://issues.apache.org/jira/browse/HIVE-13363 Project: Hive Issue Type: Improvement Reporter: Anthony Hsu Assignee: Anthony Hsu I noticed that the {{hive.metastore.token.signature}} property is not defined in HiveConf.java, but hardcoded everywhere it's used in the Hive codebase. [HIVE-2963] fixes this but was never committed due to being resolved as a duplicate ticket. We should add {{hive.metastore.token.signature}} to HiveConf.java to centralize its definition and make the property more discoverable (it's useful to set it when talking to multiple metastores). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-13132) Hive should lazily load and cache metastore (permanent) functions
Anthony Hsu created HIVE-13132: -- Summary: Hive should lazily load and cache metastore (permanent) functions Key: HIVE-13132 URL: https://issues.apache.org/jira/browse/HIVE-13132 Project: Hive Issue Type: Improvement Affects Versions: 0.13.1 Reporter: Anthony Hsu Assignee: Anthony Hsu In Hive 0.13.1, we have noticed that as the number of databases increases, the start-up time of the Hive interactive shell increases. This is because during start-up, all databases are iterated over to fetch the permanent functions to display in the {{SHOW FUNCTIONS}} output.
{noformat:title=FunctionRegistry.java}
private static Set<String> getFunctionNames(boolean searchMetastore) {
  Set<String> functionNames = mFunctions.keySet();
  if (searchMetastore) {
    functionNames = new HashSet<String>(functionNames);
    try {
      Hive db = getHive();
      List<String> dbNames = db.getAllDatabases();
      for (String dbName : dbNames) {
        List<String> funcNames = db.getFunctions(dbName, "*");
        for (String funcName : funcNames) {
          functionNames.add(FunctionUtils.qualifyFunctionName(funcName, dbName));
        }
      }
    } catch (Exception e) {
      LOG.error(e);
      // Continue on, we can still return the functions we've gotten to this point.
    }
  }
  return functionNames;
}
{noformat}
Instead of eagerly loading all metastore functions, we should only load them the first time {{SHOW FUNCTIONS}} is invoked. We should also cache the results. Note that this issue may have been fixed by HIVE-2573, though I haven't verified this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
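The "load on first use, then cache" behaviour proposed above can be sketched as a small memoizing holder. This is an illustration only: `LazyFunctionNames` is a hypothetical class, and the `Supplier` stands in for whatever call actually walks the databases in the metastore.

```java
import java.util.Collections;
import java.util.Set;
import java.util.function.Supplier;

final class LazyFunctionNames {
    private final Supplier<Set<String>> loader;
    private Set<String> cached; // stays null until the first call to get()

    LazyFunctionNames(Supplier<Set<String>> loader) {
        this.loader = loader;
    }

    /** Runs the (expensive) loader once, on first use, and reuses its result afterwards. */
    synchronized Set<String> get() {
        if (cached == null) {
            cached = loader.get();
        }
        return cached;
    }

    public static void main(String[] args) {
        LazyFunctionNames names = new LazyFunctionNames(() -> {
            System.out.println("loading from metastore (simulated)");
            return Collections.singleton("db1.my_func");
        });
        // Nothing is loaded at construction time; the second call hits the cache.
        System.out.println(names.get());
        System.out.println(names.get());
    }
}
```

A real implementation would also need some way to invalidate the cache when functions are created or dropped, which this sketch leaves out.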
[jira] [Created] (HIVE-13046) DependencyResolver should not lowercase the dependency URI's authority
Anthony Hsu created HIVE-13046: -- Summary: DependencyResolver should not lowercase the dependency URI's authority Key: HIVE-13046 URL: https://issues.apache.org/jira/browse/HIVE-13046 Project: Hive Issue Type: Bug Reporter: Anthony Hsu Assignee: Anthony Hsu When using {{ADD JAR ivy://...}} to add a jar version {{1.2.3-SNAPSHOT}}, Hive will lowercase it to {{1.2.3-snapshot}} due to: {code:title=DependencyResolver.java} String[] authorityTokens = authority.toLowerCase().split(":"); {code} We should not call {{.toLowerCase()}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
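The effect of the lowercasing can be reproduced in isolation. The sketch below splits an authority string of the form org:module:version with and without lowercasing; `IvyAuthoritySketch`, `tokens`, and the `org.example:my-lib` coordinates are hypothetical, and `Locale.ROOT` is used here only to keep the demo locale-independent.

```java
import java.util.Arrays;
import java.util.Locale;

final class IvyAuthoritySketch {
    /** Splits "org:module:version" on ':'; lowercases first only when asked to. */
    static String[] tokens(String authority, boolean lowercase) {
        String input = lowercase ? authority.toLowerCase(Locale.ROOT) : authority;
        return input.split(":");
    }

    public static void main(String[] args) {
        String authority = "org.example:my-lib:1.2.3-SNAPSHOT";
        System.out.println(Arrays.toString(tokens(authority, true)));
        System.out.println(Arrays.toString(tokens(authority, false)));
    }
}
```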
Review Request 43513: HIVE-13046: DependencyResolver should not lowercase the dependency URI's authority
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/43513/ --- Review request for hive, Carl Steinbach, Mark Wagner, and Ratandeep Ratti. Bugs: HIVE-13046 https://issues.apache.org/jira/browse/HIVE-13046 Repository: hive-git Description --- HIVE-13046: DependencyResolver should not lowercase the dependency URI's authority Diffs - ql/src/java/org/apache/hadoop/hive/ql/util/DependencyResolver.java 3891e59a274e6449c5f50eea51e4f23762efcbc0 Diff: https://reviews.apache.org/r/43513/diff/ Testing --- Tested manually. Thanks, Anthony Hsu
[jira] [Created] (HIVE-12978) Hive Metastore should have a config for starting background thread services
Anthony Hsu created HIVE-12978: -- Summary: Hive Metastore should have a config for starting background thread services Key: HIVE-12978 URL: https://issues.apache.org/jira/browse/HIVE-12978 Project: Hive Issue Type: New Feature Reporter: Anthony Hsu Assignee: Anthony Hsu It would be convenient to have a configuration for setting custom background threads to run in the Hive Metastore. This could be useful for running custom monitoring, logging, or table/partition registration services. I propose adding a {{hive.metastore.thread.services}} config that takes a comma-separated list of classes that implement the {{MetaStoreThread}} interface, which the metastore would launch during start-up. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
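As a rough illustration of the proposal, the sketch below parses a comma-separated list of class names and instantiates each one by reflection. It deliberately does not use the real {{MetaStoreThread}} interface: `BackgroundService`, `NoOpService`, and `ThreadServiceLoaderSketch` are hypothetical stand-ins, and the assumption that each class has a no-argument constructor is mine.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the interface that configured classes would implement. */
interface BackgroundService {
    void start();
}

/** Trivial implementation used only to exercise the loader. */
final class NoOpService implements BackgroundService {
    @Override
    public void start() {
        System.out.println("Starting " + getClass().getName());
    }
}

final class ThreadServiceLoaderSketch {
    /** Instantiates every class named in a comma-separated config value. */
    static List<BackgroundService> load(String configValue) throws ReflectiveOperationException {
        List<BackgroundService> services = new ArrayList<>();
        for (String name : configValue.split(",")) {
            String className = name.trim();
            if (className.isEmpty()) {
                continue; // tolerate stray commas and whitespace
            }
            // asSubclass fails fast if the class does not implement the interface.
            Class<? extends BackgroundService> cls =
                Class.forName(className).asSubclass(BackgroundService.class);
            services.add(cls.getDeclaredConstructor().newInstance());
        }
        return services;
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        for (BackgroundService service : load(NoOpService.class.getName())) {
            service.start();
        }
    }
}
```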
Review Request 43073: HIVE-12978: Hive Metastore should have a config for starting background thread services
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/43073/ --- Review request for hive, Carl Steinbach and Ratandeep Ratti. Bugs: HIVE-12978 https://issues.apache.org/jira/browse/HIVE-12978 Repository: hive-git Description --- Added a new property `hive.metastore.thread.services` for configuring background metastore threads. Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 6678de6c488e838b82caf62186ff6518295b7e98 metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java dde253a9c7a19527620f9d516265971529fa838d Diff: https://reviews.apache.org/r/43073/diff/ Testing --- Tested manually. Wrote an [ExampleMetaStoreThreadService.java](https://github.com/erwa/test/blob/master/metastore-thread-service-example-hive-2.x/src/main/java/ExampleMetaStoreThreadService.java). Configured the following in my hive-site.xml:
```
<property>
  <name>hive.metastore.thread.services</name>
  <value>ExampleMetaStoreThreadService</value>
</property>
```
When starting the Hive Metastore, I saw the following log output:
```
16/02/01 15:09:54 [Thread-3]: INFO metastore.HiveMetaStore: Starting background metastore service ExampleMetaStoreThreadService
16/02/01 15:09:54 [Thread-3]: INFO metastore.HiveMetaStore: Starting metastore thread of type ExampleMetaStoreThreadService
16/02/01 15:09:54 [Thread-3]: INFO ExampleMetaStoreThreadService: Setting HiveConf in ExampleMetaStoreThreadService
16/02/01 15:09:54 [Thread-3]: INFO ExampleMetaStoreThreadService: Setting thread id in ExampleMetaStoreThreadService
16/02/01 15:09:54 [Thread-3]: INFO ExampleMetaStoreThreadService: Initing ExampleMetaStoreThreadService
16/02/01 15:09:54 [Thread-3]: INFO ExampleMetaStoreThreadService: Starting ExampleMetaStoreThreadService
```
Thanks, Anthony Hsu
Re: Review Request 38663: HIVE-11878: ClassNotFoundException can possibly occur if multiple jars are registered one at a time in Hive
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/38663/#review109782 --- Ship it! Revision looks good to me. itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/security/authorization/plugin/TestHiveAuthorizerShowFilters.java (line 92) <https://reviews.apache.org/r/38663/#comment169418> Nit: trailing whitespace - Anthony Hsu On Dec. 9, 2015, 8:33 a.m., Ratandeep Ratti wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/38663/ > --- > > (Updated Dec. 9, 2015, 8:33 a.m.) > > > Review request for hive. > > > Bugs: HIVE-11878 > https://issues.apache.org/jira/browse/HIVE-11878 > > > Repository: hive-git > > > Description > --- > > HIVE-11878: ClassNotFoundException can possibly occur if multiple jars are > registered one at a time in Hive > > > Diffs > - > > conf/ivysettings.xml bda842a > itests/custom-udfs/pom.xml PRE-CREATION > itests/custom-udfs/udf-classloader-udf1/pom.xml PRE-CREATION > > itests/custom-udfs/udf-classloader-udf1/src/main/java/hive/it/custom/udfs/UDF1.java > PRE-CREATION > itests/custom-udfs/udf-classloader-udf2/pom.xml PRE-CREATION > > itests/custom-udfs/udf-classloader-udf2/src/main/java/hive/it/custom/udfs/UDF2.java > PRE-CREATION > itests/custom-udfs/udf-classloader-util/pom.xml PRE-CREATION > > itests/custom-udfs/udf-classloader-util/src/main/java/hive/it/custom/udfs/Util.java > PRE-CREATION > > itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/security/authorization/plugin/TestHiveAuthorizerShowFilters.java > 0c03a00 > itests/pom.xml 5d8249f > itests/qtest/pom.xml 8f6807a > ql/src/java/org/apache/hadoop/hive/ql/exec/UDFClassLoader.java PRE-CREATION > ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java c01994f > ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 5c69fb6 > ql/src/test/queries/clientpositive/udf_classloader.q PRE-CREATION > > 
ql/src/test/queries/clientpositive/udf_classloader_dynamic_dependency_resolution.q > PRE-CREATION > ql/src/test/results/clientpositive/udf_classloader.q.out PRE-CREATION > > ql/src/test/results/clientpositive/udf_classloader_dynamic_dependency_resolution.q.out > PRE-CREATION > > Diff: https://reviews.apache.org/r/38663/diff/ > > > Testing > --- > > > Thanks, > > Ratandeep Ratti > >
Re: Review Request 38663: HIVE-11878: ClassNotFoundException can possibly occur if multiple jars are registered one at a time in Hive
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/38663/#review107063 --- Ship it! Revision looks good to me. ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java (lines 369 - 370) <https://reviews.apache.org/r/38663/#comment165963> You could also use `new String[0]`. - Anthony Hsu On Nov. 18, 2015, 5:21 a.m., Ratandeep Ratti wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/38663/ > --- > > (Updated Nov. 18, 2015, 5:21 a.m.) > > > Review request for hive. > > > Bugs: HIVE-11878 > https://issues.apache.org/jira/browse/HIVE-11878 > > > Repository: hive-git > > > Description > --- > > HIVE-11878: ClassNotFoundException can possibly occur if multiple jars are > registered one at a time in Hive > > > Diffs > - > > conf/ivysettings.xml bda842a89bb07710fdcd7180a00833a7388ada8f > itests/custom-udfs/pom.xml PRE-CREATION > itests/custom-udfs/udf-classloader-udf1/pom.xml PRE-CREATION > > itests/custom-udfs/udf-classloader-udf1/src/main/java/hive/it/custom/udfs/UDF1.java > PRE-CREATION > itests/custom-udfs/udf-classloader-udf2/pom.xml PRE-CREATION > > itests/custom-udfs/udf-classloader-udf2/src/main/java/hive/it/custom/udfs/UDF2.java > PRE-CREATION > itests/custom-udfs/udf-classloader-util/pom.xml PRE-CREATION > > itests/custom-udfs/udf-classloader-util/src/main/java/hive/it/custom/udfs/Util.java > PRE-CREATION > itests/pom.xml 0686f1fd58c2be26b2ee645c4e244159aec565e5 > itests/qtest/pom.xml 8db6fb04d0a5d4600bc23543a0215d31c1cd0648 > ql/src/java/org/apache/hadoop/hive/ql/exec/UDFClassLoader.java PRE-CREATION > ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java > de2eb984159526048e8dacf71d3ff8b0647394a3 > ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java > ff875df98e1dd64a8af3ad22f4b38dbc1d6a1923 > ql/src/test/queries/clientpositive/udf_classloader.q PRE-CREATION > > ql/src/test/queries/clientpositive/udf_classloader_dynamic_dependency_resolution.q 
> PRE-CREATION > ql/src/test/results/clientpositive/udf_classloader.q.out PRE-CREATION > > ql/src/test/results/clientpositive/udf_classloader_dynamic_dependency_resolution.q.out > PRE-CREATION > > Diff: https://reviews.apache.org/r/38663/diff/ > > > Testing > --- > > > Thanks, > > Ratandeep Ratti > >
[jira] [Created] (HIVE-11951) DESCRIBE DATABASE EXTENDED does not show DBPROPERTIES
Anthony Hsu created HIVE-11951: -- Summary: DESCRIBE DATABASE EXTENDED does not show DBPROPERTIES Key: HIVE-11951 URL: https://issues.apache.org/jira/browse/HIVE-11951 Project: Hive Issue Type: Bug Affects Versions: 0.13.1 Reporter: Anthony Hsu Using Hive 0.13.1, I do not see database properties when running {{DESCRIBE DATABASE EXTENDED}}. To reproduce: {code} create database test with dbproperties('foo'='bar'); desc database extended test; {code} The output I see is {code} > desc database extended test; OK test hdfs://:/path/to/test.db ahsu Time taken: 0.019 seconds, Fetched: 1 row(s) {code} I do not see the {{foo=bar}} property. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 38663: HIVE-11878: ClassNotFoundException can possibly occur if multiple jars are registered one at a time in Hive
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/38663/#review100274 --- contrib/src/java/org/apache/hadoop/hive/contrib/classloader/UDF2.java (line 32) <https://reviews.apache.org/r/38663/#comment157424> Should UDF1 be replaced with UDF2? - Anthony Hsu On Sept. 23, 2015, 5:38 a.m., Ratandeep Ratti wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/38663/ > --- > > (Updated Sept. 23, 2015, 5:38 a.m.) > > > Review request for hive. > > > Bugs: HIVE-11878 > https://issues.apache.org/jira/browse/HIVE-11878 > > > Repository: hive-git > > > Description > --- > > HIVE-11878: ClassNotFoundException can possibly occur if multiple jars are > registered one at a time in Hive > > > Diffs > - > > contrib/src/java/org/apache/hadoop/hive/contrib/classloader/ClassA.java > PRE-CREATION > contrib/src/java/org/apache/hadoop/hive/contrib/classloader/UDF1.java > PRE-CREATION > contrib/src/java/org/apache/hadoop/hive/contrib/classloader/UDF2.java > PRE-CREATION > itests/pom.xml acce7131948edd5aeab34af6879d781daa12ba30 > itests/qtest/pom.xml 0588233b250f7c78f594bb36554a80990e907550 > ql/src/java/org/apache/hadoop/hive/ql/exec/UDFClassLoader.java PRE-CREATION > ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java > ca863019f3347c94852dcad2a21c43758aed30a7 > ql/src/test/queries/clientpositive/test_classloader.q PRE-CREATION > ql/src/test/results/clientpositive/test_classloader.q.out PRE-CREATION > > Diff: https://reviews.apache.org/r/38663/diff/ > > > Testing > --- > > > Thanks, > > Ratandeep Ratti > >
[jira] [Commented] (HIVE-9022) When creating external tables, Hive needs to verify whether the user has read permissions to the data
[ https://issues.apache.org/jira/browse/HIVE-9022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14329317#comment-14329317 ] Anthony Hsu commented on HIVE-9022: --- Could you please upload this patch to the [Apache Review Board|https://reviews.apache.org]? It makes it easier to review and add comments. One suggestion is we should add a test case for this. When creating external tables, Hive needs to verify whether the user has read permissions to the data - Key: HIVE-9022 URL: https://issues.apache.org/jira/browse/HIVE-9022 Project: Hive Issue Type: Bug Affects Versions: 0.13.1 Reporter: Anant Nag Labels: patch Attachments: createExternal.patch Hive doesn't verify whether user has read permissions on the data before creating external table referring to the data. This needs to be fixed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9021) Hive should not allow any user to create tables in other hive DB's that user doesn't own
[ https://issues.apache.org/jira/browse/HIVE-9021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14329339#comment-14329339 ] Anthony Hsu commented on HIVE-9021: --- I don't think this feature is necessary. Hive has a [Storage-Based Authorization Model|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Authorization#LanguageManualAuthorization-1StorageBasedAuthorizationintheMetastoreServer] that uses HDFS permissions for authorization. If a user does not want other users to be able to create tables in his database, he should set the permissions for his database's directory on HDFS accordingly (such as to rwxr-xr-x). Hive should not allow any user to create tables in other hive DB's that user doesn't own Key: HIVE-9021 URL: https://issues.apache.org/jira/browse/HIVE-9021 Project: Hive Issue Type: Bug Affects Versions: 0.13.1 Reporter: Anant Nag Labels: patch Attachments: db.patch Hive allows users to create tables in other users db. This should not be allowed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
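To make the suggested mode concrete: under a POSIX-style permission model, which is what the comment above relies on, `rwxr-xr-x` leaves write access with the owner only. The sketch below uses Java's standard `PosixFilePermissions` class purely to parse the mode string; it does not call any HDFS or Hive API, and `nonOwnersCanWrite` is a hypothetical helper name introduced for illustration.

```java
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

public class DbDirPermissionSketch {

    /** Returns true if the group or other users have write permission under the given mode string. */
    public static boolean nonOwnersCanWrite(String mode) {
        Set<PosixFilePermission> perms = PosixFilePermissions.fromString(mode);
        return perms.contains(PosixFilePermission.GROUP_WRITE)
                || perms.contains(PosixFilePermission.OTHERS_WRITE);
    }

    public static void main(String[] args) {
        // Mode suggested in the comment: only the owner may write to the directory.
        System.out.println(nonOwnersCanWrite("rwxr-xr-x")); // prints false
        // A fully open directory, shown for contrast.
        System.out.println(nonOwnersCanWrite("rwxrwxrwx")); // prints true
    }
}
```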
[jira] [Commented] (HIVE-9020) When dropping external tables, Hive should not verify whether user has access to the data.
[ https://issues.apache.org/jira/browse/HIVE-9020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14329397#comment-14329397 ] Anthony Hsu commented on HIVE-9020: --- Patch looks fine to me, apart from some formatting issues (indentation and spaces around {{}}). I agree with Thejas that we should add a unit test for this. When dropping external tables, Hive should not verify whether user has access to the data. --- Key: HIVE-9020 URL: https://issues.apache.org/jira/browse/HIVE-9020 Project: Hive Issue Type: Bug Affects Versions: 0.13.1 Reporter: Anant Nag Attachments: dropExternal.patch When dropping tables, hive verifies whether the user has access to the data on hdfs. It fails, if user doesn't have access. It makes sense for internal tables since the data has to be deleted when dropping internal tables but for external tables, Hive should not check for data access. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-3779) An empty value to hive.logquery.location can't disable the creation of hive history log files
[ https://issues.apache.org/jira/browse/HIVE-3779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14200521#comment-14200521 ] Anthony Hsu commented on HIVE-3779: --- In case you're still using an older version of Hive that doesn't let you disable the history log files, one workaround that you can use is to run {code} !rm -r /path/to/hive.querylog.location; {code} as your first shell command before running your queries. An empty value to hive.logquery.location can't disable the creation of hive history log files - Key: HIVE-3779 URL: https://issues.apache.org/jira/browse/HIVE-3779 Project: Hive Issue Type: Bug Components: Documentation Affects Versions: 0.9.0 Reporter: Bing Li Priority: Minor In AdminManual Configuration (https://cwiki.apache.org/Hive/adminmanual-configuration.html), the description of hive.querylog.location mentions that if the variable is set to an empty string, the structured log will not be created. But this fails with the following setting: <property> <name>hive.querylog.location</name> <value></value> </property> It seems that an empty value can NOT be read from HiveConf.ConfVars.HIVEHISTORYFILELOC; the default value is returned instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8560) SerDes that do not inherit AbstractSerDe do not get table properties during initialize()
[ https://issues.apache.org/jira/browse/HIVE-8560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184000#comment-14184000 ] Anthony Hsu commented on HIVE-8560: --- I'm late to the party, but change looks good to me, too. This won't affect the behavior of AbstractSerDes like AvroSerDes, so I'm cool with it :-). SerDes that do not inherit AbstractSerDe do not get table properties during initialize() Key: HIVE-8560 URL: https://issues.apache.org/jira/browse/HIVE-8560 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Reporter: Jason Dere Assignee: Jason Dere Fix For: 0.14.0 Attachments: HIVE-8560.1.patch Looks like this may have been introduced during HIVE-6835. During initialize(), 3rd party SerDes which do not inherit AbstractSerDe end up getting a Properties object created by SerDeUtils.createOverlayedProperties(). This properties object receives the table properties as defaults. So looking up a key by name will yield the default value, but a call like getKeys() will not show any of the table properties. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
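The behaviour described in HIVE-8560 matches how `java.util.Properties` treats defaults in general, and can be reproduced with the standard library alone. The following is a minimal sketch, not Hive's `SerDeUtils` code: the property names `columns` and `partition.only` and the helper `buildOverlay` are made up for illustration.

```java
import java.util.Properties;

public class OverlayedPropertiesSketch {

    /** Builds a Properties object whose "table properties" are supplied only as defaults. */
    public static Properties buildOverlay() {
        Properties tableProps = new Properties();
        tableProps.setProperty("columns", "a,b"); // hypothetical table property

        // The table properties become the *defaults* of the new object.
        Properties overlay = new Properties(tableProps);
        overlay.setProperty("partition.only", "x"); // hypothetical overriding entry
        return overlay;
    }

    public static void main(String[] args) {
        Properties overlay = buildOverlay();

        // Lookup by name falls back to the defaults.
        System.out.println(overlay.getProperty("columns"));                     // prints a,b

        // Enumerating the object's own keys does not include the defaults.
        System.out.println(overlay.keySet().contains("columns"));               // prints false

        // stringPropertyNames() does walk the defaults.
        System.out.println(overlay.stringPropertyNames().contains("columns"));  // prints true
    }
}
```

A SerDe that iterates over `keySet()` or `keys()` would therefore miss every table property, while one that uses `getProperty` or `stringPropertyNames()` would see them.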
[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema
[ https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13981278#comment-13981278 ] Anthony Hsu commented on HIVE-6835: --- I will do some local testing soon and let you know. Reading of partitioned Avro data fails if partition schema does not match table schema -- Key: HIVE-6835 URL: https://issues.apache.org/jira/browse/HIVE-6835 Project: Hive Issue Type: Bug Affects Versions: 0.12.0 Reporter: Anthony Hsu Assignee: Anthony Hsu Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch, HIVE-6835.4.patch, HIVE-6835.5.patch To reproduce: {code} create table testarray (a array<string>); load data local inpath '/home/ahsu/test/array.txt' into table testarray; # create partitioned Avro table with one array column create table avroarray partitioned by (y string) row format serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties ('avro.schema.literal'='{"namespace":"test","name":"avroarray","type": "record", "fields": [ { "name":"a", "type":{"type":"array","items":"string"} } ] }') STORED as INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'; insert into table avroarray partition(y=1) select * from testarray; # add an int column with a default value of 0 alter table avroarray set serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties('avro.schema.literal'='{"namespace":"test","name":"avroarray","type": "record", "fields": [ {"name":"intfield","type":"int","default":0},{ "name":"a", "type":{"type":"array","items":"string"} } ] }'); # fails with ClassCastException select * from avroarray; {code} The select * fails with: {code} Failed with exception java.io.IOException:java.lang.ClassCastException: org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema
[ https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13981425#comment-13981425 ] Anthony Hsu commented on HIVE-6835: --- I tried all the failed union_remove TestCliDriver tests locally and they all passed. Looking at some of the previous precommit builds, several of them also have the same test failures, so I believe these test failures are unrelated to my changes. Reading of partitioned Avro data fails if partition schema does not match table schema -- Key: HIVE-6835 URL: https://issues.apache.org/jira/browse/HIVE-6835 Project: Hive Issue Type: Bug Affects Versions: 0.12.0 Reporter: Anthony Hsu Assignee: Anthony Hsu Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch, HIVE-6835.4.patch, HIVE-6835.5.patch To reproduce: {code} create table testarray (a array<string>); load data local inpath '/home/ahsu/test/array.txt' into table testarray; # create partitioned Avro table with one array column create table avroarray partitioned by (y string) row format serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties ('avro.schema.literal'='{"namespace":"test","name":"avroarray","type": "record", "fields": [ { "name":"a", "type":{"type":"array","items":"string"} } ] }') STORED as INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'; insert into table avroarray partition(y=1) select * from testarray; # add an int column with a default value of 0 alter table avroarray set serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties('avro.schema.literal'='{"namespace":"test","name":"avroarray","type": "record", "fields": [ {"name":"intfield","type":"int","default":0},{ "name":"a", "type":{"type":"array","items":"string"} } ] }'); # fails with ClassCastException select * from avroarray; {code} The select * fails with: {code} Failed with exception java.io.IOException:java.lang.ClassCastException: org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema
[ https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13981431#comment-13981431 ] Anthony Hsu commented on HIVE-6835: --- BTW, I have been doing all my development and testing against Hadoop 1.2.1 (-Phadoop-1). Reading of partitioned Avro data fails if partition schema does not match table schema -- Key: HIVE-6835 URL: https://issues.apache.org/jira/browse/HIVE-6835 Project: Hive Issue Type: Bug Affects Versions: 0.12.0 Reporter: Anthony Hsu Assignee: Anthony Hsu Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch, HIVE-6835.4.patch, HIVE-6835.5.patch To reproduce: {code} create table testarray (a array<string>); load data local inpath '/home/ahsu/test/array.txt' into table testarray; # create partitioned Avro table with one array column create table avroarray partitioned by (y string) row format serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties ('avro.schema.literal'='{"namespace":"test","name":"avroarray","type": "record", "fields": [ { "name":"a", "type":{"type":"array","items":"string"} } ] }') STORED as INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'; insert into table avroarray partition(y=1) select * from testarray; # add an int column with a default value of 0 alter table avroarray set serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties('avro.schema.literal'='{"namespace":"test","name":"avroarray","type": "record", "fields": [ {"name":"intfield","type":"int","default":0},{ "name":"a", "type":{"type":"array","items":"string"} } ] }'); # fails with ClassCastException select * from avroarray; {code} The select * fails with: {code} Failed with exception java.io.IOException:java.lang.ClassCastException: org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema
[ https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13981677#comment-13981677 ] Anthony Hsu commented on HIVE-6835: --- Thanks, [~xuefuz], for all your help and guidance. Reading of partitioned Avro data fails if partition schema does not match table schema -- Key: HIVE-6835 URL: https://issues.apache.org/jira/browse/HIVE-6835 Project: Hive Issue Type: Bug Affects Versions: 0.12.0 Reporter: Anthony Hsu Assignee: Anthony Hsu Fix For: 0.14.0 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch, HIVE-6835.4.patch, HIVE-6835.5.patch To reproduce: {code} create table testarray (a array<string>); load data local inpath '/home/ahsu/test/array.txt' into table testarray; # create partitioned Avro table with one array column create table avroarray partitioned by (y string) row format serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties ('avro.schema.literal'='{"namespace":"test","name":"avroarray","type": "record", "fields": [ { "name":"a", "type":{"type":"array","items":"string"} } ] }') STORED as INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'; insert into table avroarray partition(y=1) select * from testarray; # add an int column with a default value of 0 alter table avroarray set serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties('avro.schema.literal'='{"namespace":"test","name":"avroarray","type": "record", "fields": [ {"name":"intfield","type":"int","default":0},{ "name":"a", "type":{"type":"array","items":"string"} } ] }'); # fails with ClassCastException select * from avroarray; {code} The select * fails with: {code} Failed with exception java.io.IOException:java.lang.ClassCastException: org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Review Request 20096: HIVE-6835: Reading of partitioned Avro data fails if partition schema does not match table schema
/TestAvroSerde.java a5d494f serde/src/test/org/apache/hadoop/hive/serde2/binarysortable/TestBinarySortableSerDe.java e512f42 serde/src/test/org/apache/hadoop/hive/serde2/columnar/TestLazyBinaryColumnarSerDe.java e8639ff serde/src/test/org/apache/hadoop/hive/serde2/lazy/TestLazyArrayMapStruct.java 714045b serde/src/test/org/apache/hadoop/hive/serde2/lazy/TestLazySimpleSerDe.java 28eb868 serde/src/test/org/apache/hadoop/hive/serde2/lazybinary/TestLazyBinarySerDe.java 69c891d serde/src/test/org/apache/hadoop/hive/serde2/objectinspector/TestCrossMapEqualComparer.java a69fcb7 serde/src/test/org/apache/hadoop/hive/serde2/objectinspector/TestSimpleMapEqualComparer.java dd9610e service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java 2a113d5 Diff: https://reviews.apache.org/r/20096/diff/ Testing --- Added test cases Thanks, Anthony Hsu
[jira] [Updated] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema
[ https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anthony Hsu updated HIVE-6835: -- Attachment: HIVE-6835.5.patch Uploaded a new patch addressing [~xuefuz]'s comments. I removed the getOverlayedProperties() from PartitionDesc and added a new createOverlayedProperties() method in SerDeUtils. Reading of partitioned Avro data fails if partition schema does not match table schema -- Key: HIVE-6835 URL: https://issues.apache.org/jira/browse/HIVE-6835 Project: Hive Issue Type: Bug Affects Versions: 0.12.0 Reporter: Anthony Hsu Assignee: Anthony Hsu Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch, HIVE-6835.4.patch, HIVE-6835.5.patch To reproduce: {code} create table testarray (a array<string>); load data local inpath '/home/ahsu/test/array.txt' into table testarray; # create partitioned Avro table with one array column create table avroarray partitioned by (y string) row format serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties ('avro.schema.literal'='{"namespace":"test","name":"avroarray","type": "record", "fields": [ { "name":"a", "type":{"type":"array","items":"string"} } ] }') STORED as INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'; insert into table avroarray partition(y=1) select * from testarray; # add an int column with a default value of 0 alter table avroarray set serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties('avro.schema.literal'='{"namespace":"test","name":"avroarray","type": "record", "fields": [ {"name":"intfield","type":"int","default":0},{ "name":"a", "type":{"type":"array","items":"string"} } ] }'); # fails with ClassCastException select * from avroarray; {code} The select * fails with: {code} Failed with exception java.io.IOException:java.lang.ClassCastException: org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema
[ https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13980100#comment-13980100 ] Anthony Hsu commented on HIVE-6835: --- Also updated [the RB|https://reviews.apache.org/r/20096/]. Reading of partitioned Avro data fails if partition schema does not match table schema -- Key: HIVE-6835 URL: https://issues.apache.org/jira/browse/HIVE-6835 Project: Hive Issue Type: Bug Affects Versions: 0.12.0 Reporter: Anthony Hsu Assignee: Anthony Hsu Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch, HIVE-6835.4.patch, HIVE-6835.5.patch To reproduce: {code} create table testarray (a array<string>); load data local inpath '/home/ahsu/test/array.txt' into table testarray; # create partitioned Avro table with one array column create table avroarray partitioned by (y string) row format serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties ('avro.schema.literal'='{"namespace":"test","name":"avroarray","type": "record", "fields": [ { "name":"a", "type":{"type":"array","items":"string"} } ] }') STORED as INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'; insert into table avroarray partition(y=1) select * from testarray; # add an int column with a default value of 0 alter table avroarray set serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties('avro.schema.literal'='{"namespace":"test","name":"avroarray","type": "record", "fields": [ {"name":"intfield","type":"int","default":0},{ "name":"a", "type":{"type":"array","items":"string"} } ] }'); # fails with ClassCastException select * from avroarray; {code} The select * fails with: {code} Failed with exception java.io.IOException:java.lang.ClassCastException: org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema
[ https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anthony Hsu updated HIVE-6835: -- Status: Patch Available (was: Open) Reading of partitioned Avro data fails if partition schema does not match table schema -- Key: HIVE-6835 URL: https://issues.apache.org/jira/browse/HIVE-6835 Project: Hive Issue Type: Bug Affects Versions: 0.12.0 Reporter: Anthony Hsu Assignee: Anthony Hsu Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch, HIVE-6835.4.patch, HIVE-6835.5.patch To reproduce: {code} create table testarray (a array<string>); load data local inpath '/home/ahsu/test/array.txt' into table testarray; # create partitioned Avro table with one array column create table avroarray partitioned by (y string) row format serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties ('avro.schema.literal'='{"namespace":"test","name":"avroarray","type": "record", "fields": [ { "name":"a", "type":{"type":"array","items":"string"} } ] }') STORED as INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'; insert into table avroarray partition(y=1) select * from testarray; # add an int column with a default value of 0 alter table avroarray set serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties('avro.schema.literal'='{"namespace":"test","name":"avroarray","type": "record", "fields": [ {"name":"intfield","type":"int","default":0},{ "name":"a", "type":{"type":"array","items":"string"} } ] }'); # fails with ClassCastException select * from avroarray; {code} The select * fails with: {code} Failed with exception java.io.IOException:java.lang.ClassCastException: org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema
[ https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anthony Hsu updated HIVE-6835: -- Status: Open (was: Patch Available) Reading of partitioned Avro data fails if partition schema does not match table schema -- Key: HIVE-6835 URL: https://issues.apache.org/jira/browse/HIVE-6835 Project: Hive Issue Type: Bug Affects Versions: 0.12.0 Reporter: Anthony Hsu Assignee: Anthony Hsu Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch, HIVE-6835.4.patch To reproduce: {code} create table testarray (a array<string>); load data local inpath '/home/ahsu/test/array.txt' into table testarray; # create partitioned Avro table with one array column create table avroarray partitioned by (y string) row format serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties ('avro.schema.literal'='{"namespace":"test","name":"avroarray","type": "record", "fields": [ { "name":"a", "type":{"type":"array","items":"string"} } ] }') STORED as INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'; insert into table avroarray partition(y=1) select * from testarray; # add an int column with a default value of 0 alter table avroarray set serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties('avro.schema.literal'='{"namespace":"test","name":"avroarray","type": "record", "fields": [ {"name":"intfield","type":"int","default":0},{ "name":"a", "type":{"type":"array","items":"string"} } ] }'); # fails with ClassCastException select * from avroarray; {code} The select * fails with: {code} Failed with exception java.io.IOException:java.lang.ClassCastException: org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema
[ https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anthony Hsu updated HIVE-6835: -- Attachment: HIVE-6835.4.patch Thanks for the suggestions and clarification, [~xuefuz]. I have uploaded a new patch (HIVE-6835.4.patch) using the new approach. Reading of partitioned Avro data fails if partition schema does not match table schema -- Key: HIVE-6835 URL: https://issues.apache.org/jira/browse/HIVE-6835 Project: Hive Issue Type: Bug Affects Versions: 0.12.0 Reporter: Anthony Hsu Assignee: Anthony Hsu Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch, HIVE-6835.4.patch To reproduce: {code} create table testarray (a array<string>); load data local inpath '/home/ahsu/test/array.txt' into table testarray; # create partitioned Avro table with one array column create table avroarray partitioned by (y string) row format serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties ('avro.schema.literal'='{"namespace":"test","name":"avroarray","type": "record", "fields": [ { "name":"a", "type":{"type":"array","items":"string"} } ] }') STORED as INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'; insert into table avroarray partition(y=1) select * from testarray; # add an int column with a default value of 0 alter table avroarray set serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties('avro.schema.literal'='{"namespace":"test","name":"avroarray","type": "record", "fields": [ {"name":"intfield","type":"int","default":0},{ "name":"a", "type":{"type":"array","items":"string"} } ] }'); # fails with ClassCastException select * from avroarray; {code} The select * fails with: {code} Failed with exception java.io.IOException:java.lang.ClassCastException: org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Review Request 20096: HIVE-6835: Reading of partitioned Avro data fails if partition schema does not match table schema
/TestLazyBinaryColumnarSerDe.java e8639ff
serde/src/test/org/apache/hadoop/hive/serde2/lazy/TestLazyArrayMapStruct.java 714045b
serde/src/test/org/apache/hadoop/hive/serde2/lazy/TestLazySimpleSerDe.java 28eb868
serde/src/test/org/apache/hadoop/hive/serde2/lazybinary/TestLazyBinarySerDe.java 69c891d
serde/src/test/org/apache/hadoop/hive/serde2/objectinspector/TestCrossMapEqualComparer.java a69fcb7
serde/src/test/org/apache/hadoop/hive/serde2/objectinspector/TestSimpleMapEqualComparer.java dd9610e
service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java 2a113d5

Diff: https://reviews.apache.org/r/20096/diff/

Testing
---
Added test cases

Thanks,
Anthony Hsu
[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema
[ https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13979175#comment-13979175 ]

Anthony Hsu commented on HIVE-6835:
---
P.S.: I also updated my Review Board request: https://reviews.apache.org/r/20096/
[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema
[ https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13977337#comment-13977337 ]

Anthony Hsu commented on HIVE-6835:
---
I started looking into this alternative and encountered an issue. Most calls to serde.initialize() treat the serde as a Deserializer (the interface). I would have to either change the interface (and all of its implementations) or cast the Deserializer to an AbstractSerDe wherever I want to use the new initialize() method. Neither seems like a great solution, so I am back to supporting my original {{table.}} prefix approach. Any thoughts on this?
[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema
[ https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13977710#comment-13977710 ]

Anthony Hsu commented on HIVE-6835:
---
Yes, this is possible, but I would have to add {{instanceof AbstractSerDe}} checks and then cast the Deserializer to an AbstractSerDe before I can use the new initialize() method. There are dozens of usages of .initialize(), and adding all this type-checking and casting code in so many places just for this new method doesn't seem very clean to me.

Also, if we add the new initialize() method, what should we do for table-level serde initialization? When dealing with the table, there are no partition properties, so are we supposed to pass the table properties for both the tblProps and partProps arguments? If we leave partProps null, then the default implementation of the new initialize() method will just pass null to the old initialize() method.

There doesn't seem to be a clean way of adding a new initialize() method without creating a lot of redundant boilerplate code and creating confusion about which initialize() method to use and what values to pass in. Given these concerns, I feel that prepending {{table.}} might be a cleaner and less confusing approach. What are your thoughts on this?
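The call-site concern raised in this comment can be illustrated with a small, Hive-free sketch. Everything here is a hypothetical stand-in rather than Hive's actual API: `LegacyDeserializer` plays the role of the Deserializer interface, `TwoLevelSerDe` the role of an AbstractSerDe with the extra overload, and the Hadoop `Configuration` argument is left out to keep the example self-contained. The point is only that every caller holding the interface type needs the same check and cast.

```java
import java.util.Properties;

/** Hypothetical stand-in for the Deserializer interface that most callers hold. */
interface LegacyDeserializer {
    void initialize(Properties props);
}

/** Hypothetical stand-in for AbstractSerDe with the extra two-properties overload. */
abstract class TwoLevelSerDe implements LegacyDeserializer {
    /** Default behavior: ignore the table properties and delegate to the old method. */
    void initialize(Properties tableProps, Properties partProps) {
        initialize(partProps);
    }
}

/** A plain deserializer that only knows the old method. */
class RecordingDeserializer implements LegacyDeserializer {
    String seen;

    @Override
    public void initialize(Properties props) {
        seen = props.getProperty("source");
    }
}

/** A serde that overrides the new overload to prefer the table properties. */
class RecordingSerDe extends TwoLevelSerDe {
    String seen;

    @Override
    public void initialize(Properties props) {
        seen = props.getProperty("source");
    }

    @Override
    void initialize(Properties tableProps, Properties partProps) {
        initialize(tableProps);
    }
}

class CallSiteSketch {
    /** The check-and-cast that each call site would have to repeat. */
    static void initializeEither(LegacyDeserializer d, Properties tableProps, Properties partProps) {
        if (d instanceof TwoLevelSerDe) {
            ((TwoLevelSerDe) d).initialize(tableProps, partProps);
        } else {
            d.initialize(partProps);
        }
    }

    /** Builds a Properties object tagged with where it came from. */
    static Properties tagged(String source) {
        Properties p = new Properties();
        p.setProperty("source", source);
        return p;
    }

    public static void main(String[] args) {
        RecordingDeserializer plain = new RecordingDeserializer();
        RecordingSerDe serde = new RecordingSerDe();
        initializeEither(plain, tagged("table"), tagged("partition"));
        initializeEither(serde, tagged("table"), tagged("partition"));
        System.out.println(plain.seen + " / " + serde.seen);
    }
}
```

Centralizing the check in one helper such as `initializeEither` would reduce the repetition, but callers would still need to be changed to go through it.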
[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema
[ https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976144#comment-13976144 ]

Anthony Hsu commented on HIVE-6835:
---
Great, sounds like we're on the same page. I'll implement this new approach and upload a new patch soon.
[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema
[ https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13974192#comment-13974192 ]

Anthony Hsu commented on HIVE-6835:
---
I'm guessing the schema was specified in the SERDEPROPERTIES to work around HIVE-3953. However, one issue with storing the schema in TBLPROPERTIES instead is that for partitioned tables, when you do a {{describe [extended] table_name partition(...);}}, you get
{code}
error_error_error_error_error_error_error    string    from deserializer
cannot_determine_schema                      string    from deserializer
check                                        string    from deserializer
schema                                       string    from deserializer
url                                          string    from deserializer
and                                          string    from deserializer
literal                                      string    from deserializer
{code}
because the AvroSerDe cannot find avro.schema.literal or avro.schema.url. If you store the schema in SERDEPROPERTIES, you don't get this issue, since the SERDEPROPERTIES get copied to the partition when it is created.

I do think it is useful to make both the table-level properties and the partition-level properties available separately to the SerDe during its .initialize(). The SerDe should be able to decide which set of properties it wants to use. From this point of view, I think my change is still useful and valid.
[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema
[ https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13974342#comment-13974342 ]

Anthony Hsu commented on HIVE-6835:
---
If TBLPROPERTIES were copied to the partition, you might still have the problem of the table-level Avro schema and the partition-level Avro schema getting out of sync, which might lead to ClassCastExceptions. The Avro SerDe should always use the latest table-level schema, whether it is stored in TBLPROPERTIES or SERDEPROPERTIES.

The root of the problem is that if an Avro schema somehow ends up in the partition properties, it can get out of sync with the table-level properties. That's why my change was to (1) make the table-level properties available to the serde, and (2) change the Avro SerDe to use the table-level properties when present.
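The two-part change described in this comment (expose the table-level properties, then prefer them when present) could look roughly like the sketch below. It is an illustration under assumptions, not the contents of the patch: the thread only mentions a {{table.}} prefix, so the exact way the prefix is applied, and the names `TablePrefixSketch`, `withTableProps`, and `resolveSchema`, are hypothetical.

```java
import java.util.Properties;

class TablePrefixSketch {
    // Prefix mentioned in the thread; how the patch applies it is an assumption here.
    static final String TABLE_PREFIX = "table.";
    static final String SCHEMA_KEY = "avro.schema.literal";

    /** Copies the partition properties and adds each table property under the "table." prefix. */
    static Properties withTableProps(Properties tableProps, Properties partProps) {
        Properties merged = new Properties();
        for (String name : partProps.stringPropertyNames()) {
            merged.setProperty(name, partProps.getProperty(name));
        }
        for (String name : tableProps.stringPropertyNames()) {
            merged.setProperty(TABLE_PREFIX + name, tableProps.getProperty(name));
        }
        return merged;
    }

    /** Prefers the table-level schema when present, otherwise falls back to the unprefixed key. */
    static String resolveSchema(Properties props) {
        String tableLevel = props.getProperty(TABLE_PREFIX + SCHEMA_KEY);
        return tableLevel != null ? tableLevel : props.getProperty(SCHEMA_KEY);
    }

    public static void main(String[] args) {
        Properties table = new Properties();
        table.setProperty(SCHEMA_KEY, "latest-table-schema");
        Properties partition = new Properties();
        partition.setProperty(SCHEMA_KEY, "stale-partition-schema");

        System.out.println(resolveSchema(withTableProps(table, partition)));
        System.out.println(resolveSchema(partition));
    }
}
```

Because the SerDe still receives a single Properties object, this variant leaves the existing initialize() signature untouched, which is the trade-off weighed elsewhere in the thread.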
[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema
[ https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13974670#comment-13974670 ]

Anthony Hsu commented on HIVE-6835:
---
[~xuefuz] and [~ashutoshc], just to clarify, is this the alternative solution you're proposing?

# Add
{code}
public void initialize(Configuration configuration, Properties tableProperties, Properties partitionProperties) throws SerDeException;
{code}
to AbstractSerDe and provide a default implementation that just calls {{initialize(configuration, partitionProperties)}}
# Change all calls of {{partitionSerde.initialize(conf, partProps)}} to {{partitionSerde.initialize(conf, tblProps, partProps)}}
# Add
{code}
@Override
public void initialize(Configuration configuration, Properties tableProperties, Properties partitionProperties) throws SerDeException;
{code}
to AvroSerDe and provide an implementation that just uses the tableProperties

I am okay with taking this approach, though it involves a lot more code changes and will change the public AbstractSerDe API. Let me know what your thoughts on this approach are.
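The three steps listed above can be sketched without Hive on the classpath. This is a minimal illustration, not the patch itself: `SketchSerDe`, `PartitionSchemaSerDe`, `TableSchemaSerDe`, and `InitializeSketch` are hypothetical stand-ins, and the Hadoop `Configuration` argument and `SerDeException` are omitted so the example compiles on its own.

```java
import java.util.Properties;

/** Hypothetical stand-in for AbstractSerDe; not the real Hive class. */
abstract class SketchSerDe {
    protected String schema;

    /** Existing form: the serde receives a single set of properties. */
    abstract void initialize(Properties props);

    /** Step 1: new overload whose default implementation delegates with the partition properties. */
    void initialize(Properties tableProperties, Properties partitionProperties) {
        initialize(partitionProperties);
    }

    String schemaInUse() {
        return schema;
    }
}

/** A serde that does not override the new overload keeps its old behavior. */
class PartitionSchemaSerDe extends SketchSerDe {
    @Override
    void initialize(Properties props) {
        schema = props.getProperty("avro.schema.literal");
    }
}

/** Step 3: an Avro-like serde overrides the overload and uses only the table properties. */
class TableSchemaSerDe extends SketchSerDe {
    @Override
    void initialize(Properties props) {
        schema = props.getProperty("avro.schema.literal");
    }

    @Override
    void initialize(Properties tableProperties, Properties partitionProperties) {
        initialize(tableProperties);
    }
}

class InitializeSketch {
    /** Step 2: the call site passes both property sets and lets the serde choose. */
    static String schemaSeenBy(SketchSerDe serde) {
        Properties table = new Properties();
        table.setProperty("avro.schema.literal", "new-schema");
        Properties partition = new Properties();
        partition.setProperty("avro.schema.literal", "old-schema");
        serde.initialize(table, partition);
        return serde.schemaInUse();
    }

    public static void main(String[] args) {
        System.out.println(schemaSeenBy(new PartitionSchemaSerDe()));
        System.out.println(schemaSeenBy(new TableSchemaSerDe()));
    }
}
```

The default delegation is what keeps existing SerDe implementations source-compatible; only serdes that care about table-level properties need to override the overload.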
[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema
[ https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13973279#comment-13973279 ]

Anthony Hsu commented on HIVE-6835:
---
The AvroSerDe handles schema evolution as described in http://avro.apache.org/docs/current/spec.html#Schema+Resolution. However, in the Hive code, the AvroSerDe must always be initialized with the latest schema so that ObjectInspectorConverters.getConvertedOI() (in FetchOperator:getRecordReader()) will work. When the AvroSerDe actually reads the Avro file, it then compares the latest schema to the actual schema stored in the Avro file and does schema resolution/evolution.
[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema
[ https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13973615#comment-13973615 ]

Anthony Hsu commented on HIVE-6835:
---
What happens is that Hive tries to build ObjectInspectorConverters from the partition schema to the table schema. If the partition schema is different from the table schema, you may get a ClassCastException like the one above. When you add new columns at the end, this is not a problem because these new columns are chopped off. See ObjectInspectorConverters:StructConverter:
{code}
int minFields = Math.min(inputFields.size(), outputFields.size());
fieldConverters = new ArrayList<Converter>(minFields);
{code}
It's only when you insert new columns at the beginning or in the middle that you might run into ClassCastExceptions.

For the AvroSerDe, if it always uses the latest schema (which should be the table-level schema), Hive will not get confused when constructing its ObjectInspectorConverters. Then, later, when the AvroSerDe actually goes to read the Avro files, it can compare the latest schema with the (possibly old) schemas stored in the Avro data files themselves, and do the proper schema resolution, omitting fields or substituting default values, following the [schema resolution rules|http://avro.apache.org/docs/current/spec.html#Schema+Resolution].
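The effect of pairing fields by position up to the shorter list can be shown with a small standalone sketch. It does not use Hive's ObjectInspector classes; `FieldPairingSketch` and `mismatchedPositions` are hypothetical names, and field types are represented as plain strings purely for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FieldPairingSketch {
    /**
     * Pairs input and output field types by position, up to the shorter list,
     * and returns the positions whose types differ.
     */
    static List<Integer> mismatchedPositions(List<String> inputTypes, List<String> outputTypes) {
        int minFields = Math.min(inputTypes.size(), outputTypes.size());
        List<Integer> mismatches = new ArrayList<>(minFields);
        for (int i = 0; i < minFields; i++) {
            if (!inputTypes.get(i).equals(outputTypes.get(i))) {
                mismatches.add(i);
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        // Partition schema from the reproduction: a single array column.
        List<String> partition = Arrays.asList("array<string>");

        // New column appended at the end: the extra field is never paired.
        List<String> appended = Arrays.asList("array<string>", "int");
        System.out.println(mismatchedPositions(partition, appended));

        // New column inserted at the beginning: position 0 pairs a list with a primitive.
        List<String> prepended = Arrays.asList("int", "array<string>");
        System.out.println(mismatchedPositions(partition, prepended));
    }
}
```

In the prepended case, position 0 pairs the partition's array column with the table's int column, which corresponds to the list-to-primitive cast failure reported in the issue.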
[jira] [Updated] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema
[ https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anthony Hsu updated HIVE-6835:
--
Status: Patch Available (was: Open)
[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema
[ https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13973654#comment-13973654 ]

Anthony Hsu commented on HIVE-6835:
---
On a side note: if you create an Avro table and store the schema in the TBLPROPERTIES
{code}
CREATE TABLE ... TBLPROPERTIES ('avro.schema.literal'='...');
{code}
everything works fine with partitions because TBLPROPERTIES are NOT copied to the partition, so the partition will end up using the TBLPROPERTIES for initializing the Avro SerDe.

It's only when you store the schema in the SERDEPROPERTIES
{code}
CREATE TABLE ... WITH SERDEPROPERTIES ('avro.schema.literal'='...');
{code}
that problems arise. SERDEPROPERTIES DO get copied to the partitions, so if you later change the SERDEPROPERTIES stored at the table level, the SERDEPROPERTIES in the table and the partitions get out of sync, and this sometimes leads to ClassCastExceptions with the AvroSerDe.
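The copy-on-create behavior described in this comment can be modeled in a few lines. This is only a toy simulation of the described behavior, assuming the partition receives an independent copy of the table's SERDEPROPERTIES at creation time; `StalePartitionSketch`, `snapshot`, and `inSync` are hypothetical names and no metastore code is involved.

```java
import java.util.Properties;

class StalePartitionSketch {
    /** Simulates partition creation: the partition gets its own copy of the SERDEPROPERTIES. */
    static Properties snapshot(Properties serdeProps) {
        Properties copy = new Properties();
        copy.putAll(serdeProps);
        return copy;
    }

    /** True when the table and partition hold identical properties. */
    static boolean inSync(Properties table, Properties partition) {
        return table.equals(partition);
    }

    public static void main(String[] args) {
        Properties tableSerdeProps = new Properties();
        tableSerdeProps.setProperty("avro.schema.literal", "schema-v1");

        // Creating the partition copies the current table-level SERDEPROPERTIES.
        Properties partitionSerdeProps = snapshot(tableSerdeProps);
        System.out.println(inSync(tableSerdeProps, partitionSerdeProps));

        // Altering the table afterwards does not touch the partition's copy.
        tableSerdeProps.setProperty("avro.schema.literal", "schema-v2");
        System.out.println(inSync(tableSerdeProps, partitionSerdeProps));
    }
}
```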
[jira] [Updated] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema
[ https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anthony Hsu updated HIVE-6835:
--
Attachment: (was: HIVE-6835.2.patch)
[jira] [Updated] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema
[ https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Anthony Hsu updated HIVE-6835:
--
Attachment: HIVE-6835.2.patch
Reuploading patch version 2 to trigger the tests again. I ran locally the tests that failed in the last pre-commit build run, and they both passed for me.
Re: Review Request 20096: HIVE-6835: Reading of partitioned Avro data fails if partition schema does not match table schema
---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/20096/
---
(Updated April 17, 2014, 1:14 a.m.)
Review request for hive.
Changes
---
Addressed Ashutosh's comments in HIVE-6835. Added the constant to serde.thrift and used the Thrift compiler to generate all the language-specific bindings.
Repository: hive-git
Description
---
The problem occurs when you store the avro.schema.(literal|url) in the SERDEPROPERTIES instead of the TBLPROPERTIES, add a partition, change the table's schema, and then try reading from the old partition. I fixed this problem by passing the table properties to the partition with a "table." prefix, and changing the Avro SerDe to always use the table properties when available.
Diffs (updated)
-
ql/src/java/org/apache/hadoop/hive/ql/plan/PartitionDesc.java 43cef5c
ql/src/test/queries/clientpositive/avro_partitioned.q 6fe5117
ql/src/test/results/clientpositive/avro_partitioned.q.out 644716d
serde/if/serde.thrift 31c87ee
serde/src/gen/thrift/gen-cpp/serde_constants.h d56c917
serde/src/gen/thrift/gen-cpp/serde_constants.cpp 54503e3
serde/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/serde/serdeConstants.java 515cf25
serde/src/gen/thrift/gen-php/org/apache/hadoop/hive/serde/Types.php 837dd11
serde/src/gen/thrift/gen-py/org_apache_hadoop_hive_serde/constants.py 8eac87d
serde/src/gen/thrift/gen-rb/serde_constants.rb ed86522
serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerdeUtils.java 9d58d13
serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroSerdeUtils.java 67d5570
Diff: https://reviews.apache.org/r/20096/diff/
Testing
---
Added test cases
Thanks,
Anthony Hsu
[jira] [Updated] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema
[ https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Anthony Hsu updated HIVE-6835:
--
Attachment: HIVE-6835.3.patch
Thanks for catching this, Ashutosh. My bad for not noticing I was modifying a generated file. I have updated my [Review Board request|https://reviews.apache.org/r/20096/] and also uploaded a new patch.
Re: Review Request 20096: HIVE-6835: Reading of partitioned Avro data fails if partition schema does not match table schema
---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/20096/
---
(Updated April 14, 2014, 6:49 p.m.)
Review request for hive.
Changes
---
Addressed Carl's comments. Changes:
- Reverted whitespace changes.
- Moved the TABLE_PROP_PREFIX ("table.") to serdeConstants.
- Removed code that mutated the Properties passed to the AvroSerDe
- Added/improved comments
- Synced with latest
Repository: hive-git
Description
---
The problem occurs when you store the avro.schema.(literal|url) in the SERDEPROPERTIES instead of the TBLPROPERTIES, add a partition, change the table's schema, and then try reading from the old partition. I fixed this problem by passing the table properties to the partition with a "table." prefix, and changing the Avro SerDe to always use the table properties when available.
Diffs (updated)
-
ql/src/java/org/apache/hadoop/hive/ql/plan/PartitionDesc.java 43cef5c
ql/src/test/queries/clientpositive/avro_partitioned.q 6fe5117
ql/src/test/results/clientpositive/avro_partitioned.q.out 644716d
serde/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/serde/serdeConstants.java 515cf25
serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerdeUtils.java 9d58d13
serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroSerdeUtils.java 67d5570
Diff: https://reviews.apache.org/r/20096/diff/
Testing
---
Added test cases
Thanks,
Anthony Hsu
[jira] [Updated] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema
[ https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Anthony Hsu updated HIVE-6835:
--
Attachment: HIVE-6835.2.patch
[jira] [Updated] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema
[ https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Anthony Hsu updated HIVE-6835:
--
Status: Patch Available (was: Open)
Thanks for the very thorough code review, [~cwsteinbach]. I've uploaded a new patch that addresses your comments and also updated the Review Board request.
Review Request 20096: HIVE-6835: Reading of partitioned Avro data fails if partition schema does not match table schema
---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/20096/
---
Review request for hive.
Repository: hive-git
Description
---
The problem occurs when you store the avro.schema.(literal|url) in the SERDEPROPERTIES instead of the TBLPROPERTIES, add a partition, change the table's schema, and then try reading from the old partition. I fixed this problem by passing the table properties to the partition with a "table." prefix, and changing the Avro SerDe to always use the table properties when available.
Diffs
-
ql/src/java/org/apache/hadoop/hive/ql/plan/PartitionDesc.java 43cef5c
ql/src/test/queries/clientpositive/avro_partitioned.q 068a13c
ql/src/test/results/clientpositive/avro_partitioned.q.out 352ec0d
serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerdeUtils.java 9d58d13
serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroSerdeUtils.java 67d5570
Diff: https://reviews.apache.org/r/20096/diff/
Testing
---
Added test cases
Thanks,
Anthony Hsu
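The approach in the description above (table properties passed alongside the partition properties under a "table." prefix, with the SerDe preferring the table-level value) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the patch: the class TablePrefixSketch, the methods merge and resolveSchemaLiteral, and the constant names are hypothetical. Only the "table." prefix and the avro.schema.literal key come from the messages in this thread.

```java
import java.util.Properties;

// Hypothetical sketch; not the actual PartitionDesc/AvroSerdeUtils code.
class TablePrefixSketch {
    // The thread states the prefix is "table."; the constant name here is assumed.
    static final String TABLE_PROP_PREFIX = "table.";
    static final String SCHEMA_LITERAL = "avro.schema.literal";

    // Builds a new Properties object holding the partition properties as-is
    // plus every table property under the "table." prefix.
    // Neither input is mutated.
    static Properties merge(Properties tableProps, Properties partitionProps) {
        Properties merged = new Properties();
        for (String key : partitionProps.stringPropertyNames()) {
            merged.setProperty(key, partitionProps.getProperty(key));
        }
        for (String key : tableProps.stringPropertyNames()) {
            merged.setProperty(TABLE_PROP_PREFIX + key, tableProps.getProperty(key));
        }
        return merged;
    }

    // Prefers the table-level schema when present, otherwise falls back
    // to the unprefixed (partition-level) value.
    static String resolveSchemaLiteral(Properties merged) {
        String fromTable = merged.getProperty(TABLE_PROP_PREFIX + SCHEMA_LITERAL);
        return fromTable != null ? fromTable : merged.getProperty(SCHEMA_LITERAL);
    }

    public static void main(String[] args) {
        Properties table = new Properties();
        table.setProperty(SCHEMA_LITERAL, "latest-schema");
        Properties partition = new Properties();
        partition.setProperty(SCHEMA_LITERAL, "old-schema");

        System.out.println(resolveSchemaLiteral(merge(table, partition)));
    }
}
```

Keeping both values in the merged set, rather than letting one silently override the other, is what leaves the choice to the SerDe.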
[jira] [Updated] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema
[ https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Anthony Hsu updated HIVE-6835:
--
Attachment: HIVE-6835.1.patch
Uploaded a patch with a fix. Review Board link: https://reviews.apache.org/r/20096/
[jira] [Assigned] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema
[ https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Anthony Hsu reassigned HIVE-6835:
-
Assignee: Anthony Hsu
[jira] [Updated] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema
[ https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Anthony Hsu updated HIVE-6835:
--
Assignee: (was: Anthony Hsu)
Status: Patch Available (was: Open)
[jira] [Created] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema
Anthony Hsu created HIVE-6835:
-
Summary: Reading of partitioned Avro data fails if partition schema does not match table schema
Key: HIVE-6835
URL: https://issues.apache.org/jira/browse/HIVE-6835
Project: Hive
Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
To reproduce:
{code}
create table testarray (a array<string>);
load data local inpath '/home/ahsu/test/array.txt' into table testarray;
# create partitioned Avro table with one array column
create table avroarray (a array<string>) partitioned by (y string) row format serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties ('avro.schema.literal'='{"namespace":"test","name":"avroarray","type": "record", "fields": [ { "name":"a", "type":{"type":"array","items":"string"} } ] }') STORED as INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';
insert into table avroarray partition(y=1) select * from testarray;
# add an int column with a default value of 0
alter table avroarray set serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties('avro.schema.literal'='{"namespace":"test","name":"avroarray","type": "record", "fields": [ {"name":"intfield","type":"int","default":0},{ "name":"a", "type":{"type":"array","items":"string"} } ] }');
# fails with ClassCastException
select * from avroarray;
{code}
The select * fails with:
{code}
Failed with exception java.io.IOException:java.lang.ClassCastException: org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
{code}
--
This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema
[ https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13958989#comment-13958989 ]
Anthony Hsu commented on HIVE-6835:
---
Right now, when AvroSerDe.initialize() is called, the Properties it is passed include both table and partition properties, with the partition properties *overriding* the table properties. The AvroSerDe needs the *latest* schema (which should be stored in the table properties) for proper initialization and to prevent the ClassCastException.
My proposal is to pass both the table and partition properties to SerDe.initialize() by prefixing the table property keys with "table.", and let the SerDe decide which set of properties to use.
BTW, here's the full stack trace when you do the select *:
{code}
Failed with exception java.io.IOException:java.lang.ClassCastException: org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
14/04/03 10:11:02 ERROR CliDriver: Failed with exception java.io.IOException:java.lang.ClassCastException: org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
java.io.IOException: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
	at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:551)
	at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:489)
	at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:136)
	at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1471)
	at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:272)
	at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:217)
	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:414)
	at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:782)
	at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:676)
	at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:615)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
	at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters.getConverter(ObjectInspectorConverters.java:148)
	at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters$StructConverter.<init>(ObjectInspectorConverters.java:304)
	at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters.getConverter(ObjectInspectorConverters.java:150)
	at org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:407)
	at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:515)
	... 14 more
{code}
[jira] [Updated] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema
[ https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Anthony Hsu updated HIVE-6835:
--
Description:
To reproduce:
{code}
create table testarray (a array<string>);
load data local inpath '/home/ahsu/test/array.txt' into table testarray;
# create partitioned Avro table with one array column
create table avroarray partitioned by (y string) row format serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties ('avro.schema.literal'='{"namespace":"test","name":"avroarray","type": "record", "fields": [ { "name":"a", "type":{"type":"array","items":"string"} } ] }') STORED as INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';
insert into table avroarray partition(y=1) select * from testarray;
# add an int column with a default value of 0
alter table avroarray set serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties('avro.schema.literal'='{"namespace":"test","name":"avroarray","type": "record", "fields": [ {"name":"intfield","type":"int","default":0},{ "name":"a", "type":{"type":"array","items":"string"} } ] }');
# fails with ClassCastException
select * from avroarray;
{code}
The select * fails with:
{code}
Failed with exception java.io.IOException:java.lang.ClassCastException: org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
{code}
was:
To reproduce:
{code}
create table testarray (a array<string>);
load data local inpath '/home/ahsu/test/array.txt' into table testarray;
# create partitioned Avro table with one array column
create table avroarray (a array<string>) partitioned by (y string) row format serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties ('avro.schema.literal'='{"namespace":"test","name":"avroarray","type": "record", "fields": [ { "name":"a", "type":{"type":"array","items":"string"} } ] }') STORED as INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';
insert into table avroarray partition(y=1) select * from testarray;
# add an int column with a default value of 0
alter table avroarray set serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties('avro.schema.literal'='{"namespace":"test","name":"avroarray","type": "record", "fields": [ {"name":"intfield","type":"int","default":0},{ "name":"a", "type":{"type":"array","items":"string"} } ] }');
# fails with ClassCastException
select * from avroarray;
{code}
The select * fails with:
{code}
Failed with exception java.io.IOException:java.lang.ClassCastException: org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
{code}
[jira] [Commented] (HIVE-6570) Hive variable substitution does not work with the source command
[ https://issues.apache.org/jira/browse/HIVE-6570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13955374#comment-13955374 ]
Anthony Hsu commented on HIVE-6570:
---
[~leftylev] - Thanks for the instructions!
[~xuefuz] - Thanks for committing this!
Hive variable substitution does not work with the source command
--
Key: HIVE-6570
URL: https://issues.apache.org/jira/browse/HIVE-6570
Project: Hive
Issue Type: Bug
Reporter: Anthony Hsu
Assignee: Anthony Hsu
Fix For: 0.14.0
Attachments: HIVE-6570.1.patch
The following does not work:
{code}
source ${hivevar:test-dir}/test.q;
{code}
--
This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6570) Hive variable substitution does not work with the source command
[ https://issues.apache.org/jira/browse/HIVE-6570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13954616#comment-13954616 ]
Anthony Hsu commented on HIVE-6570:
---
It would probably be nice to add an example to the Variable Substitution page that uses variable substitution with the source command. On a side note, how does one get edit privileges for the wiki?
[jira] [Commented] (HIVE-6570) Hive variable substitution does not work with the source command
[ https://issues.apache.org/jira/browse/HIVE-6570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13954321#comment-13954321 ]
Anthony Hsu commented on HIVE-6570:
---
Thanks. Could one of you guys commit the patch for me please?
[jira] [Commented] (HIVE-6570) Hive variable substitution does not work with the source command
[ https://issues.apache.org/jira/browse/HIVE-6570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13951347#comment-13951347 ] Anthony Hsu commented on HIVE-6570: --- What concerns does [~appodictic] have? Hive variable substitution does not work with the source command -- Key: HIVE-6570 URL: https://issues.apache.org/jira/browse/HIVE-6570 Project: Hive Issue Type: Bug Reporter: Anthony Hsu Assignee: Anthony Hsu Attachments: HIVE-6570.1.patch The following does not work: {code} source ${hivevar:test-dir}/test.q; {code}
[jira] [Commented] (HIVE-6570) Hive variable substitution does not work with the source command
[ https://issues.apache.org/jira/browse/HIVE-6570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13943740#comment-13943740 ] Anthony Hsu commented on HIVE-6570: --- Ping Hive variable substitution does not work with the source command -- Key: HIVE-6570 URL: https://issues.apache.org/jira/browse/HIVE-6570 Project: Hive Issue Type: Bug Reporter: Anthony Hsu Assignee: Anthony Hsu Attachments: HIVE-6570.1.patch The following does not work: {code} source ${hivevar:test-dir}/test.q; {code}
[jira] [Updated] (HIVE-6570) Hive variable substitution does not work with the source command
[ https://issues.apache.org/jira/browse/HIVE-6570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anthony Hsu updated HIVE-6570: -- Release Note: This patch adds Hive variable substitution support to the source command. For example, you will now be able to use a statement such as: source ${hivevar:test-dir}/test.q; Added a Release Note explaining the changes in this patch. Hive variable substitution does not work with the source command -- Key: HIVE-6570 URL: https://issues.apache.org/jira/browse/HIVE-6570 Project: Hive Issue Type: Bug Reporter: Anthony Hsu Assignee: Anthony Hsu Attachments: HIVE-6570.1.patch The following does not work: {code} source ${hivevar:test-dir}/test.q; {code}
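For illustration, the behaviour described in the release note might be exercised from the Hive CLI roughly as follows. This is a minimal sketch, not taken from the patch: it assumes a Hive build that includes the fix (the issue lists Fix For 0.14.0), and the directory /tmp/hive-tests and the contents of test.q are hypothetical placeholders.

```sql
-- Hypothetical setup: /tmp/hive-tests/test.q is assumed to exist and
-- to contain ordinary HiveQL statements.

-- Define the Hive variable used in the issue description.
set hivevar:test-dir=/tmp/hive-tests;

-- With the patch applied, the variable should be substituted before the
-- source command resolves the file path, so this is expected to run
-- /tmp/hive-tests/test.q.
source ${hivevar:test-dir}/test.q;
```

The variable could presumably also be supplied at launch time with the CLI's --hivevar option instead of the set statement; either way, the source line itself is the one quoted in the issue.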