[ https://issues.apache.org/jira/browse/HIVE-6131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16139468#comment-16139468 ]
Hive QA commented on HIVE-6131:
-------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12637927/HIVE-6131.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 20 failed/errored test(s), 11000 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_partition_change_col] (batchId=24)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_cascade] (batchId=84)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[partition_data_after_schema_update] (batchId=17)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[partition_wise_fileformat11] (batchId=7)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[partition_wise_fileformat12] (batchId=79)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[partition_wise_fileformat13] (batchId=61)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[partition_wise_fileformat14] (batchId=73)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_nonvec_part] (batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_nonvec_part_all_complex] (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_nonvec_part_all_primitive] (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_part] (batchId=157)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_part_all_complex] (batchId=153)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_part_all_primitive] (batchId=158)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vecrow_part] (batchId=162)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vecrow_part_all_complex] (batchId=162)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vecrow_part_all_primitive] (batchId=158)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning] (batchId=169)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=180)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6504/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6504/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6504/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 20 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12637927 - PreCommit-HIVE-Build

> New columns after table alter result in null values despite data
> ----------------------------------------------------------------
>
>                 Key: HIVE-6131
>                 URL: https://issues.apache.org/jira/browse/HIVE-6131
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.11.0, 0.12.0, 0.13.0, 1.2.1
>            Reporter: James Vaughan
>            Priority: Critical
>         Attachments: HIVE-6131.1.patch
>
>
> Hi folks,
> I found and verified a bug on our CDH 4.0.3 install of Hive when adding
> columns to tables with Partitions using 'REPLACE COLUMNS'. I dug through the
> Jira a little bit and didn't see anything for it so hopefully this isn't just
> noise on the radar.
> Basically, when you alter a table with partitions and then reupload data to
> that partition, it doesn't seem to recognize the extra data that actually
> exists in HDFS; that is, it returns NULL values for the new column despite
> having the data and recognizing the new column in the metadata.
> Here are some steps to reproduce using a basic table:
> 1. Run this hive command: CREATE TABLE jvaughan_test (col1 string)
> partitioned by (day string);
> 2. Create a simple file on the system with a couple of entries, something
> like "hi" and "hi2" separated by newlines.
> 3. Run this hive command, pointing it at the file: LOAD DATA LOCAL INPATH
> '<FILEDIR>' OVERWRITE INTO TABLE jvaughan_test PARTITION (day = '2014-01-02');
> 4. Confirm the data with: SELECT * FROM jvaughan_test WHERE day =
> '2014-01-02';
> 5. Alter the column definitions: ALTER TABLE jvaughan_test REPLACE COLUMNS
> (col1 string, col2 string);
> 6. Edit your file and add a second column using the default separator
> (ctrl+v, then ctrl+a in Vim) and add two more entries, such as "hi3" on the
> first row and "hi4" on the second.
> 7. Run step 3 again.
> 8. Check the data again as in step 4.
> For me, these are the results that get returned:
> hive> select * from jvaughan_test where day = '2014-01-01';
> OK
> hi NULL 2014-01-02
> hi2 NULL 2014-01-02
> This is despite the fact that there is data in the file stored by the
> partition in HDFS.
> Let me know if you need any other information. The only workaround for me
> currently is to drop any partition I'm replacing data in and THEN
> reupload the new data file.
> Thanks,
> -James



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
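
For anyone who wants to script the reproduction steps quoted above rather than edit the file by hand, here is a minimal sketch in Python. It is not part of the original report. The helper names (`write_rows`, `repro_statements`) and the use of two separate data files instead of editing one file in place are assumptions made for illustration. The script only writes Ctrl-A (`\x01`) separated files and prints the HiveQL statements from the report; it does not talk to Hive itself.

```python
import os
import tempfile

# Ctrl-A, the default separator the report enters in Vim with ctrl+v, ctrl+a.
FIELD_SEP = "\x01"


def write_rows(path, rows):
    """Write each row as Ctrl-A separated fields, one row per line."""
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        for row in rows:
            f.write(FIELD_SEP.join(row) + "\n")


def repro_statements(one_col_path, two_col_path,
                     table="jvaughan_test", day="2014-01-02"):
    """Return the HiveQL statements from steps 1, 3, 4, 5, 7 and 8, in order."""
    def load(path):
        return (f"LOAD DATA LOCAL INPATH '{path}' OVERWRITE INTO TABLE {table} "
                f"PARTITION (day = '{day}');")

    select = f"SELECT * FROM {table} WHERE day = '{day}';"
    return [
        f"CREATE TABLE {table} (col1 string) partitioned by (day string);",
        load(one_col_path),   # step 3: load the one-column file
        select,               # step 4: confirm the data
        f"ALTER TABLE {table} REPLACE COLUMNS (col1 string, col2 string);",
        load(two_col_path),   # step 7: reload with the extra column
        select,               # step 8: col2 is reported to come back NULL here
    ]


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    one_col = os.path.join(workdir, "one_col.txt")   # hypothetical file names
    two_col = os.path.join(workdir, "two_col.txt")
    write_rows(one_col, [("hi",), ("hi2",)])                  # step 2
    write_rows(two_col, [("hi", "hi3"), ("hi2", "hi4")])      # step 6
    for statement in repro_statements(one_col, two_col):
        print(statement)
```

The printed statements can be pasted into the Hive CLI in order; the final SELECT is where the report sees NULL in col2.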