[jira] [Commented] (HIVE-6147) Support avro data stored in HBase columns
[ https://issues.apache.org/jira/browse/HIVE-6147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14128218#comment-14128218 ]

Lefty Leverenz commented on HIVE-6147:
--------------------------------------

Doc question: Will this be documented in the HBase Integration design doc or the Avro SerDe doc, or a new doc? (The HBase doc has a list of open issues, but this one isn't on the list.)

* [HBase Integration -- Open Issues (JIRA) | https://cwiki.apache.org/confluence/display/Hive/HBaseIntegration#HBaseIntegration-OpenIssues(JIRA)]
* [Avro SerDe | https://cwiki.apache.org/confluence/display/Hive/AvroSerDe]

Support avro data stored in HBase columns
-----------------------------------------

Key: HIVE-6147
URL: https://issues.apache.org/jira/browse/HIVE-6147
Project: Hive
Issue Type: Improvement
Components: HBase Handler
Affects Versions: 0.12.0, 0.13.0
Reporter: Swarnim Kulkarni
Assignee: Swarnim Kulkarni
Labels: TODOC14
Fix For: 0.14.0
Attachments: HIVE-6147.1.patch.txt, HIVE-6147.2.patch.txt, HIVE-6147.3.patch.txt, HIVE-6147.3.patch.txt, HIVE-6147.4.patch.txt, HIVE-6147.5.patch.txt, HIVE-6147.6.patch.txt

Presently, the HBase Hive integration supports querying only primitive data types in columns. It would be nice to be able to store and query Avro objects in HBase columns by making them visible as structs to Hive. This will allow Hive to perform ad hoc analysis of HBase data which can be deeply structured.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
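Editor's illustration of the issue description: to picture what "making them visible as structs to Hive" could mean, here is a minimal sketch that turns a parsed Avro schema into a Hive type string. It is not code from the patch. The helper name avro_to_hive_type, the sample Employee/Address fields, and the primitive mapping table are all assumptions made for illustration; the mapping follows the conversions commonly described for the Avro SerDe and should be checked against that documentation. References to previously defined named types are not handled.

```python
import json

# Assumed Avro-to-Hive primitive mapping (illustrative only).
PRIMITIVES = {
    "string": "string",
    "int": "int",
    "long": "bigint",
    "float": "float",
    "double": "double",
    "boolean": "boolean",
    "bytes": "binary",
}


def avro_to_hive_type(schema):
    """Return a Hive type string for a parsed Avro schema (hypothetical helper)."""
    if isinstance(schema, str):
        # Primitive type name; named-type references are not supported here.
        return PRIMITIVES[schema]
    if isinstance(schema, list):
        # Union: treat ["null", T] as a nullable T, otherwise emit a uniontype.
        branches = [s for s in schema if s != "null"]
        if len(branches) == 1:
            return avro_to_hive_type(branches[0])
        return "uniontype<" + ",".join(avro_to_hive_type(b) for b in branches) + ">"
    kind = schema["type"]
    if kind == "record":
        # An Avro record becomes a Hive struct with one member per field.
        fields = ",".join(
            field["name"] + ":" + avro_to_hive_type(field["type"])
            for field in schema["fields"]
        )
        return "struct<" + fields + ">"
    if kind == "array":
        return "array<" + avro_to_hive_type(schema["items"]) + ">"
    if kind == "map":
        # Avro map keys are always strings.
        return "map<string," + avro_to_hive_type(schema["values"]) + ">"
    if kind == "enum":
        return "string"
    if kind == "fixed":
        return "binary"
    # Wrapper form such as {"type": "string"}.
    return avro_to_hive_type(kind)


# Hypothetical schema; the field names are invented for this example.
EMPLOYEE_SCHEMA = json.loads("""
{"type": "record", "name": "Employee", "fields": [
  {"name": "name", "type": "string"},
  {"name": "age", "type": ["null", "int"]},
  {"name": "phones", "type": {"type": "array", "items": "string"}},
  {"name": "address", "type": {"type": "record", "name": "Address", "fields": [
    {"name": "city", "type": "string"},
    {"name": "zip", "type": "long"}]}}
]}
""")

print(avro_to_hive_type(EMPLOYEE_SCHEMA))
```

With a type like this, a Hive query could address nested members with dot notation (for example address.city), which is the kind of ad hoc analysis of deeply structured HBase data the description has in mind.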
[jira] [Commented] (HIVE-6147) Support avro data stored in HBase columns
[ https://issues.apache.org/jira/browse/HIVE-6147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14125501#comment-14125501 ]

Swarnim Kulkarni commented on HIVE-6147:
----------------------------------------

[~brocknoland][~xuefuz] Updated RB for the patch: https://reviews.apache.org/r/17566/
[jira] [Commented] (HIVE-6147) Support avro data stored in HBase columns
[ https://issues.apache.org/jira/browse/HIVE-6147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14125791#comment-14125791 ]

Brock Noland commented on HIVE-6147:
------------------------------------

[~swarnim] for some reason I cannot change the JIRA to Patch Available so tests can run. Do you have the button?
[jira] [Commented] (HIVE-6147) Support avro data stored in HBase columns
[ https://issues.apache.org/jira/browse/HIVE-6147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14125870#comment-14125870 ]

Swarnim Kulkarni commented on HIVE-6147:
----------------------------------------

[~brocknoland] Just did that.
[jira] [Commented] (HIVE-6147) Support avro data stored in HBase columns
[ https://issues.apache.org/jira/browse/HIVE-6147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14126008#comment-14126008 ]

Hive QA commented on HIVE-6147:
-------------------------------

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12667104/HIVE-6147.6.patch.txt

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 6192 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.pig.TestOrcHCatLoader.testReadDataPrimitiveTypes
org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
org.apache.hive.service.TestHS2ImpersonationWithRemoteMS.testImpersonation
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/695/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/695/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-695/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12667104
[jira] [Commented] (HIVE-6147) Support avro data stored in HBase columns
[ https://issues.apache.org/jira/browse/HIVE-6147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14126078#comment-14126078 ]

Brock Noland commented on HIVE-6147:
------------------------------------

+1
[jira] [Commented] (HIVE-6147) Support avro data stored in HBase columns
[ https://issues.apache.org/jira/browse/HIVE-6147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14126086#comment-14126086 ]

Swarnim Kulkarni commented on HIVE-6147:
----------------------------------------

One thing to note here is that this doesn't support serializing Avro data into HBase yet. It should be pretty straightforward to add that on top of this patch. Logged HIVE-8020 for that.
[jira] [Commented] (HIVE-6147) Support avro data stored in HBase columns
[ https://issues.apache.org/jira/browse/HIVE-6147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14109862#comment-14109862 ]

Swarnim Kulkarni commented on HIVE-6147:
----------------------------------------

Thanks for your reply [~brocknoland]. Actually, there is a little more cleanup work to be done on this patch, along with rebasing. I should have a chance to get to this sometime this week. Thanks again.
[jira] [Commented] (HIVE-6147) Support avro data stored in HBase columns
[ https://issues.apache.org/jira/browse/HIVE-6147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14100172#comment-14100172 ]

Brock Noland commented on HIVE-6147:
------------------------------------

[~swarnim] very sorry this review has taken so long. Can you rebase the current patch? I also see there are many extra newlines in several methods like getTestAvroBytesFromClass2. Can you remove those as well? I will review the updated patch promptly.
[jira] [Commented] (HIVE-6147) Support avro data stored in HBase columns
[ https://issues.apache.org/jira/browse/HIVE-6147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13930726#comment-13930726 ]

Swarnim Kulkarni commented on HIVE-6147:
----------------------------------------

Thanks [~xuefuz] for reviewing. I agree it makes a lot of sense for HIVE-6411 to go in first, and then I can refactor this on the basis of that. Also, on the point of reusing AvroSerDe code, I have tried to write AvroLazyObjectInspector simply as a wrapper on top of AvroSerDe, still delegating most of the operations to the serde. Is there a specific instance you want me to look deeper into?
[jira] [Commented] (HIVE-6147) Support avro data stored in HBase columns
[ https://issues.apache.org/jira/browse/HIVE-6147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13930813#comment-13930813 ]

Xuefu Zhang commented on HIVE-6147:
-----------------------------------

[~swarnim] I'm glad that you have the principle of code reuse in mind. I only browsed the patch and spotted HiveSerdeHelper.getSchemaFromFS(), which seemingly serves the same purpose as AvroSerdeUtils.getSchemaFromFS(). This might be coincidental. No big deal.
[jira] [Commented] (HIVE-6147) Support avro data stored in HBase columns
[ https://issues.apache.org/jira/browse/HIVE-6147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13925658#comment-13925658 ]

Hive QA commented on HIVE-6147:
-------------------------------

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12633602/HIVE-6147.5.patch.txt

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 5381 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testNegativeCliDriver_mapreduce_stack_trace_hadoop20
{noformat}

Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1691/testReport
Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1691/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12633602
[jira] [Commented] (HIVE-6147) Support avro data stored in HBase columns
[ https://issues.apache.org/jira/browse/HIVE-6147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13925754#comment-13925754 ]

Swarnim Kulkarni commented on HIVE-6147:
----------------------------------------

[~xuefuz] As the previously failing tests now pass, I have updated the RB with the latest patch for review.
[jira] [Commented] (HIVE-6147) Support avro data stored in HBase columns
[ https://issues.apache.org/jira/browse/HIVE-6147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13925165#comment-13925165 ]

Hive QA commented on HIVE-6147:
-------------------------------

{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12633416/HIVE-6147.4.patch.txt

Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1672/testReport
Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1672/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n '' ]]
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-Build-1672/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ svn = \s\v\n ]]
+ [[ -n '' ]]
+ [[ -d apache-svn-trunk-source ]]
+ [[ ! -d apache-svn-trunk-source/.svn ]]
+ [[ ! -d apache-svn-trunk-source ]]
+ cd apache-svn-trunk-source
+ svn revert -R .
Reverted 'ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ArrayWritableObjectInspector.java'
++ awk '{print $2}'
++ egrep -v '^X|^Performing status on external'
++ svn status --no-ignore
+ rm -rf target datanucleus.log ant/target shims/target shims/0.20/target shims/0.20S/target shims/0.23/target shims/aggregator/target shims/common/target shims/common-secure/target packaging/target hbase-handler/target testutils/target jdbc/target metastore/target itests/target itests/hcatalog-unit/target itests/test-serde/target itests/qtest/target itests/hive-unit/target itests/custom-serde/target itests/util/target hcatalog/target hcatalog/storage-handlers/hbase/target hcatalog/server-extensions/target hcatalog/core/target hcatalog/webhcat/svr/target hcatalog/webhcat/java-client/target hcatalog/hcatalog-pig-adapter/target hwi/target common/target common/src/gen service/target contrib/target serde/target beeline/target odbc/target cli/target ql/dependency-reduced-pom.xml ql/target ql/src/test/results/clientnegative/parquet_timestamp.q.out ql/src/test/results/clientnegative/parquet_char.q.out ql/src/test/results/clientnegative/parquet_date.q.out ql/src/test/results/clientnegative/parquet_decimal.q.out ql/src/test/results/clientnegative/parquet_varchar.q.out ql/src/test/queries/clientnegative/parquet_char.q ql/src/test/queries/clientnegative/parquet_timestamp.q ql/src/test/queries/clientnegative/parquet_decimal.q ql/src/test/queries/clientnegative/parquet_date.q ql/src/test/queries/clientnegative/parquet_varchar.q
+ svn update

Fetching external item into 'hcatalog/src/test/e2e/harness'
External at revision 1575684.

At revision 1575684.
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12633416
[jira] [Commented] (HIVE-6147) Support avro data stored in HBase columns
[ https://issues.apache.org/jira/browse/HIVE-6147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13923121#comment-13923121 ]

Xuefu Zhang commented on HIVE-6147:
-----------------------------------

[~swarnim] I'm not totally convinced that these tests are unrelated, as they consistently appeared in the test result. In addition, I manually ran TestHCatLoader, and got errors as the following:

{code}
testProjectionsBasic(org.apache.hive.hcatalog.pig.TestHCatLoader) Time elapsed: 0.184 sec ERROR!
java.io.IOException: Failed to execute create table junit_unparted_complex(name string, studentid int, contact struct<phno:string,email:string>, currently_registered_courses array<string>, current_grades map<string,string>, phnos array<struct<phno:string,type:string>>) stored as RCFILE tblproperties('hcat.isd'='org.apache.hive.hcatalog.rcfile.RCFileInputDriver','hcat.osd'='org.apache.hive.hcatalog.rcfile.RCFileOutputDriver'). Driver returned 1 Error: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.NullPointerException
	at org.apache.hive.hcatalog.pig.TestHCatLoader.executeStatementOnDriver(TestHCatLoader.java:125)
	at org.apache.hive.hcatalog.pig.TestHCatLoader.createTable(TestHCatLoader.java:111)
	at org.apache.hive.hcatalog.pig.TestHCatLoader.createTable(TestHCatLoader.java:101)
	at org.apache.hive.hcatalog.pig.TestHCatLoader.createTable(TestHCatLoader.java:115)
	at org.apache.hive.hcatalog.pig.TestHCatLoader.setup(TestHCatLoader.java:154)
{code}

Please further investigate.
[jira] [Commented] (HIVE-6147) Support avro data stored in HBase columns
[ https://issues.apache.org/jira/browse/HIVE-6147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13921619#comment-13921619 ]

Swarnim Kulkarni commented on HIVE-6147:
----------------------------------------

To circle back on this one, I re-ran the tests locally again and they all seem to pass for me (see the screenshot above).
[jira] [Commented] (HIVE-6147) Support avro data stored in HBase columns
[ https://issues.apache.org/jira/browse/HIVE-6147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13911420#comment-13911420 ]

Hive QA commented on HIVE-6147:
-------------------------------

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12630595/HIVE-6147.3.patch.txt

{color:red}ERROR:{color} -1 due to 25 failed/errored test(s), 5186 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucket_num_reducers
org.apache.hcatalog.pig.TestHCatLoader.testConvertBooleanToInt
org.apache.hcatalog.pig.TestHCatLoader.testGetInputBytes
org.apache.hcatalog.pig.TestHCatLoader.testProjectionsBasic
org.apache.hcatalog.pig.TestHCatLoader.testReadDataBasic
org.apache.hcatalog.pig.TestHCatLoader.testReadPartitionedBasic
org.apache.hcatalog.pig.TestHCatLoader.testSchemaLoadComplex
org.apache.hcatalog.pig.TestHCatLoaderComplexSchema.testMapWithComplexData
org.apache.hcatalog.pig.TestHCatLoaderComplexSchema.testSyntheticComplexSchema
org.apache.hcatalog.pig.TestHCatLoaderComplexSchema.testTupleInBagInTupleInBag
org.apache.hcatalog.pig.TestHCatStorer.testBagNStruct
org.apache.hive.hcatalog.pig.TestHCatLoader.testConvertBooleanToInt
org.apache.hive.hcatalog.pig.TestHCatLoader.testGetInputBytes
org.apache.hive.hcatalog.pig.TestHCatLoader.testProjectionsBasic
org.apache.hive.hcatalog.pig.TestHCatLoader.testReadDataBasic
org.apache.hive.hcatalog.pig.TestHCatLoader.testReadDataPrimitiveTypes
org.apache.hive.hcatalog.pig.TestHCatLoader.testReadPartitionedBasic
org.apache.hive.hcatalog.pig.TestHCatLoader.testSchemaLoadBasic
org.apache.hive.hcatalog.pig.TestHCatLoader.testSchemaLoadComplex
org.apache.hive.hcatalog.pig.TestHCatLoader.testSchemaLoadPrimitiveTypes
org.apache.hive.hcatalog.pig.TestHCatLoaderComplexSchema.testMapWithComplexData
org.apache.hive.hcatalog.pig.TestHCatLoaderComplexSchema.testSyntheticComplexSchema
org.apache.hive.hcatalog.pig.TestHCatLoaderComplexSchema.testTupleInBagInTupleInBag
org.apache.hive.hcatalog.pig.TestHCatStorer.testBagNStruct
org.apache.hive.service.cli.TestEmbeddedThriftBinaryCLIService.testExecuteStatementAsync
{noformat}

Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1485/testReport
Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1485/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 25 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12630595
[jira] [Commented] (HIVE-6147) Support avro data stored in HBase columns
[ https://issues.apache.org/jira/browse/HIVE-6147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13910438#comment-13910438 ]

Yong Zhang commented on HIVE-6147:
----------------------------------

I was looking for this feature. HBase by itself makes it hard to handle one-to-many relationships or nested data if we use it only as data warehouse storage. If we can store the Avro bytes in one column family of an HBase table and expose that Avro schema to Hive, we get random updates and inserts of Avro data in HBase, and can use that data in Hive or in MR jobs generated from Hive.
[jira] [Commented] (HIVE-6147) Support avro data stored in HBase columns
[ https://issues.apache.org/jira/browse/HIVE-6147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13910781#comment-13910781 ]

Swarnim Kulkarni commented on HIVE-6147:
----------------------------------------

[~xuefuz] Any way we can get the pre-commit tests running on this guy? Seems like with the latest attached patch, the test did not run.
[jira] [Commented] (HIVE-6147) Support avro data stored in HBase columns
[ https://issues.apache.org/jira/browse/HIVE-6147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13910802#comment-13910802 ]

Xuefu Zhang commented on HIVE-6147:
-----------------------------------

[~swarnim] Right now the queue for test runs is long, and it may take a day or two before you see the result. I have just manually created a test job in the queue.
[jira] [Commented] (HIVE-6147) Support avro data stored in HBase columns
[ https://issues.apache.org/jira/browse/HIVE-6147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13909967#comment-13909967 ] Swarnim Kulkarni commented on HIVE-6147: I am not sure what's going on here. I checked out the latest trunk, applied the patch and re-ran the failing tests locally and they all passed!
{noformat}
mac-swarnim:hive swarnim$ git pull --rebase
Current branch trunk is up to date.
mac-swarnim:hive swarnim$ wget https://issues.apache.org/jira/secure/attachment/12629556/HIVE-6147.3.patch.txt
--2014-02-23 18:40:27-- https://issues.apache.org/jira/secure/attachment/12629556/HIVE-6147.3.patch.txt
Resolving issues.apache.org... 140.211.11.121
Connecting to issues.apache.org|140.211.11.121|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 212743 (208K) [text/plain]
Saving to: `HIVE-6147.3.patch.txt'
100%[=] 212,743 29.5K/s in 7.0s
2014-02-23 18:40:37 (29.5 KB/s) - `HIVE-6147.3.patch.txt' saved [212743/212743]
mac-swarnim:hive swarnim$ git status
# On branch trunk
# Untracked files:
# (use "git add <file>..." to include in what will be committed)
#
# HIVE-6147.3.patch.txt
nothing added to commit but untracked files present (use "git add" to track)
mac-swarnim:hive swarnim$ patch -p0 < HIVE-6147.3.patch.txt
patching file hbase-handler/pom.xml
patching file hbase-handler/src/gen/avro/gen-java/org/apache/hadoop/hive/hbase/avro/Address.java
patching file hbase-handler/src/gen/avro/gen-java/org/apache/hadoop/hive/hbase/avro/ContactInfo.java
patching file hbase-handler/src/gen/avro/gen-java/org/apache/hadoop/hive/hbase/avro/Employee.java
patching file hbase-handler/src/gen/avro/gen-java/org/apache/hadoop/hive/hbase/avro/EmployeeAvro.java
patching file hbase-handler/src/gen/avro/gen-java/org/apache/hadoop/hive/hbase/avro/Gender.java
patching file hbase-handler/src/gen/avro/gen-java/org/apache/hadoop/hive/hbase/avro/HomePhone.java
patching file hbase-handler/src/gen/avro/gen-java/org/apache/hadoop/hive/hbase/avro/Magic.java
patching file hbase-handler/src/gen/avro/gen-java/org/apache/hadoop/hive/hbase/avro/OfficePhone.java
patching file hbase-handler/src/if/avro/avro_test.avpr
patching file hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseCompositeKey.java
patching file hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseSerDe.java
patching file hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseSerDeHelper.java
patching file hbase-handler/src/java/org/apache/hadoop/hive/hbase/LazyHBaseCellMap.java
patching file hbase-handler/src/test/org/apache/hadoop/hive/hbase/HBaseTestAvroSchemaRetriever.java
patching file hbase-handler/src/test/org/apache/hadoop/hive/hbase/HBaseTestCompositeKey.java
patching file hbase-handler/src/test/org/apache/hadoop/hive/hbase/TestHBaseSerDe.java
patching file serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroGenericRecordWritable.java
patching file serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroLazyObjectInspector.java
patching file serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroObjectInspectorException.java
patching file serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroObjectInspectorGenerator.java
patching file serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSchemaRetriever.java
patching file serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerdeUtils.java
patching file serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyFactory.java
patching file serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyStruct.java
patching file serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyUnion.java
patching file serde/src/java/org/apache/hadoop/hive/serde2/lazy/objectinspector/LazyObjectInspectorFactory.java
patching file serde/src/java/org/apache/hadoop/hive/serde2/lazy/objectinspector/LazySimpleStructObjectInspector.java
patching file serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorFactory.java
mac-swarnim:hive swarnim$ git status
# On branch trunk
# Changes not staged for commit:
# (use "git add <file>..." to update what will be committed)
# (use "git checkout -- <file>..." to discard changes in working directory)
#
# modified: hbase-handler/pom.xml
# modified: hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseCompositeKey.java
# modified: hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseSerDe.java
# modified: hbase-handler/src/java/org/apache/hadoop/hive/hbase/LazyHBaseCellMap.java
# modified: hbase-handler/src/test/org/apache/hadoop/hive/hbase/HBaseTestCompositeKey.java
# modified: hbase-handler/src/test/org/apache/hadoop/hive/hbase/TestHBaseSerDe.java
# modified: serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroGenericRecordWritable.java
#
[jira] [Commented] (HIVE-6147) Support avro data stored in HBase columns
[ https://issues.apache.org/jira/browse/HIVE-6147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13904110#comment-13904110 ] Swarnim Kulkarni commented on HIVE-6147: [~brocknoland], [~xuefuz] If one of you gets a chance to do a quick review of this, I would really appreciate it. Thanks, Support avro data stored in HBase columns - Key: HIVE-6147 URL: https://issues.apache.org/jira/browse/HIVE-6147 Project: Hive Issue Type: Bug Components: HBase Handler Affects Versions: 0.12.0 Reporter: Swarnim Kulkarni Assignee: Swarnim Kulkarni Attachments: HIVE-6147.1.patch.txt, HIVE-6147.2.patch.txt
[jira] [Commented] (HIVE-6147) Support avro data stored in HBase columns
[ https://issues.apache.org/jira/browse/HIVE-6147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13904206#comment-13904206 ] Xuefu Zhang commented on HIVE-6147: --- [~swarnim] Thanks for submitting the patch. Have you investigated the test failures shown above? Support avro data stored in HBase columns - Key: HIVE-6147 URL: https://issues.apache.org/jira/browse/HIVE-6147 Project: Hive Issue Type: Bug Components: HBase Handler Affects Versions: 0.12.0 Reporter: Swarnim Kulkarni Assignee: Swarnim Kulkarni Attachments: HIVE-6147.1.patch.txt, HIVE-6147.2.patch.txt
[jira] [Commented] (HIVE-6147) Support avro data stored in HBase columns
[ https://issues.apache.org/jira/browse/HIVE-6147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13904859#comment-13904859 ] Hive QA commented on HIVE-6147: ---
{color:red}Overall{color}: -1 at least one tests failed
Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12629556/HIVE-6147.3.patch.txt
{color:red}ERROR:{color} -1 due to 24 failed/errored test(s), 5140 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_infer_bucket_sort_map_operators
org.apache.hcatalog.pig.TestHCatLoader.testConvertBooleanToInt
org.apache.hcatalog.pig.TestHCatLoader.testGetInputBytes
org.apache.hcatalog.pig.TestHCatLoader.testProjectionsBasic
org.apache.hcatalog.pig.TestHCatLoader.testReadDataBasic
org.apache.hcatalog.pig.TestHCatLoader.testReadPartitionedBasic
org.apache.hcatalog.pig.TestHCatLoader.testSchemaLoadComplex
org.apache.hcatalog.pig.TestHCatLoaderComplexSchema.testMapWithComplexData
org.apache.hcatalog.pig.TestHCatLoaderComplexSchema.testSyntheticComplexSchema
org.apache.hcatalog.pig.TestHCatLoaderComplexSchema.testTupleInBagInTupleInBag
org.apache.hcatalog.pig.TestHCatStorer.testBagNStruct
org.apache.hive.hcatalog.pig.TestHCatLoader.testConvertBooleanToInt
org.apache.hive.hcatalog.pig.TestHCatLoader.testGetInputBytes
org.apache.hive.hcatalog.pig.TestHCatLoader.testProjectionsBasic
org.apache.hive.hcatalog.pig.TestHCatLoader.testReadDataBasic
org.apache.hive.hcatalog.pig.TestHCatLoader.testReadDataPrimitiveTypes
org.apache.hive.hcatalog.pig.TestHCatLoader.testReadPartitionedBasic
org.apache.hive.hcatalog.pig.TestHCatLoader.testSchemaLoadBasic
org.apache.hive.hcatalog.pig.TestHCatLoader.testSchemaLoadComplex
org.apache.hive.hcatalog.pig.TestHCatLoader.testSchemaLoadPrimitiveTypes
org.apache.hive.hcatalog.pig.TestHCatLoaderComplexSchema.testMapWithComplexData
org.apache.hive.hcatalog.pig.TestHCatLoaderComplexSchema.testSyntheticComplexSchema
org.apache.hive.hcatalog.pig.TestHCatLoaderComplexSchema.testTupleInBagInTupleInBag
org.apache.hive.hcatalog.pig.TestHCatStorer.testBagNStruct
{noformat}
Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1390/testReport
Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1390/console
Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 24 tests failed
{noformat}
This message is automatically generated.
ATTACHMENT ID: 12629556
Support avro data stored in HBase columns - Key: HIVE-6147 URL: https://issues.apache.org/jira/browse/HIVE-6147 Project: Hive Issue Type: Bug Components: HBase Handler Affects Versions: 0.12.0 Reporter: Swarnim Kulkarni Assignee: Swarnim Kulkarni Attachments: HIVE-6147.1.patch.txt, HIVE-6147.2.patch.txt, HIVE-6147.3.patch.txt
[jira] [Commented] (HIVE-6147) Support avro data stored in HBase columns
[ https://issues.apache.org/jira/browse/HIVE-6147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13888012#comment-13888012 ] Hive QA commented on HIVE-6147: ---
{color:red}Overall{color}: -1 at least one tests failed
Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12626225/HIVE-6147.2.patch.txt
{color:red}ERROR:{color} -1 due to 21 failed/errored test(s), 4987 tests executed
*Failed tests:*
{noformat}
org.apache.hcatalog.pig.TestHCatLoader.testConvertBooleanToInt
org.apache.hcatalog.pig.TestHCatLoader.testGetInputBytes
org.apache.hcatalog.pig.TestHCatLoader.testProjectionsBasic
org.apache.hcatalog.pig.TestHCatLoader.testReadDataBasic
org.apache.hcatalog.pig.TestHCatLoader.testReadPartitionedBasic
org.apache.hcatalog.pig.TestHCatLoader.testSchemaLoadComplex
org.apache.hcatalog.pig.TestHCatLoaderComplexSchema.testMapWithComplexData
org.apache.hcatalog.pig.TestHCatLoaderComplexSchema.testSyntheticComplexSchema
org.apache.hcatalog.pig.TestHCatLoaderComplexSchema.testTupleInBagInTupleInBag
org.apache.hcatalog.pig.TestHCatStorer.testBagNStruct
org.apache.hive.hcatalog.pig.TestHCatLoader.testConvertBooleanToInt
org.apache.hive.hcatalog.pig.TestHCatLoader.testGetInputBytes
org.apache.hive.hcatalog.pig.TestHCatLoader.testProjectionsBasic
org.apache.hive.hcatalog.pig.TestHCatLoader.testReadDataBasic
org.apache.hive.hcatalog.pig.TestHCatLoader.testReadPartitionedBasic
org.apache.hive.hcatalog.pig.TestHCatLoader.testSchemaLoadBasic
org.apache.hive.hcatalog.pig.TestHCatLoader.testSchemaLoadComplex
org.apache.hive.hcatalog.pig.TestHCatLoaderComplexSchema.testMapWithComplexData
org.apache.hive.hcatalog.pig.TestHCatLoaderComplexSchema.testSyntheticComplexSchema
org.apache.hive.hcatalog.pig.TestHCatLoaderComplexSchema.testTupleInBagInTupleInBag
org.apache.hive.hcatalog.pig.TestHCatStorer.testBagNStruct
{noformat}
Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1131/testReport
Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1131/console
Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 21 tests failed
{noformat}
This message is automatically generated.
ATTACHMENT ID: 12626225
Support avro data stored in HBase columns - Key: HIVE-6147 URL: https://issues.apache.org/jira/browse/HIVE-6147 Project: Hive Issue Type: Bug Components: HBase Handler Affects Versions: 0.12.0 Reporter: Swarnim Kulkarni Assignee: Swarnim Kulkarni Attachments: HIVE-6147.1.patch.txt, HIVE-6147.2.patch.txt
[jira] [Commented] (HIVE-6147) Support avro data stored in HBase columns
[ https://issues.apache.org/jira/browse/HIVE-6147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13887215#comment-13887215 ] Hive QA commented on HIVE-6147: ---
{color:red}Overall{color}: -1 at least one tests failed
Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12626065/HIVE-6147.1.patch.txt
{color:red}ERROR:{color} -1 due to 22 failed/errored test(s), 4987 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_parallel_orderby
org.apache.hcatalog.pig.TestHCatLoader.testConvertBooleanToInt
org.apache.hcatalog.pig.TestHCatLoader.testGetInputBytes
org.apache.hcatalog.pig.TestHCatLoader.testProjectionsBasic
org.apache.hcatalog.pig.TestHCatLoader.testReadDataBasic
org.apache.hcatalog.pig.TestHCatLoader.testReadPartitionedBasic
org.apache.hcatalog.pig.TestHCatLoader.testSchemaLoadComplex
org.apache.hcatalog.pig.TestHCatLoaderComplexSchema.testMapWithComplexData
org.apache.hcatalog.pig.TestHCatLoaderComplexSchema.testSyntheticComplexSchema
org.apache.hcatalog.pig.TestHCatLoaderComplexSchema.testTupleInBagInTupleInBag
org.apache.hcatalog.pig.TestHCatStorer.testBagNStruct
org.apache.hive.hcatalog.pig.TestHCatLoader.testConvertBooleanToInt
org.apache.hive.hcatalog.pig.TestHCatLoader.testGetInputBytes
org.apache.hive.hcatalog.pig.TestHCatLoader.testProjectionsBasic
org.apache.hive.hcatalog.pig.TestHCatLoader.testReadDataBasic
org.apache.hive.hcatalog.pig.TestHCatLoader.testReadPartitionedBasic
org.apache.hive.hcatalog.pig.TestHCatLoader.testSchemaLoadBasic
org.apache.hive.hcatalog.pig.TestHCatLoader.testSchemaLoadComplex
org.apache.hive.hcatalog.pig.TestHCatLoaderComplexSchema.testMapWithComplexData
org.apache.hive.hcatalog.pig.TestHCatLoaderComplexSchema.testSyntheticComplexSchema
org.apache.hive.hcatalog.pig.TestHCatLoaderComplexSchema.testTupleInBagInTupleInBag
org.apache.hive.hcatalog.pig.TestHCatStorer.testBagNStruct
{noformat}
Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1119/testReport
Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1119/console
Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 22 tests failed
{noformat}
This message is automatically generated.
ATTACHMENT ID: 12626065
Support avro data stored in HBase columns - Key: HIVE-6147 URL: https://issues.apache.org/jira/browse/HIVE-6147 Project: Hive Issue Type: Bug Components: HBase Handler Affects Versions: 0.12.0 Reporter: Swarnim Kulkarni Assignee: Swarnim Kulkarni Attachments: HIVE-6147.1.patch.txt
[jira] [Commented] (HIVE-6147) Support avro data stored in HBase columns
[ https://issues.apache.org/jira/browse/HIVE-6147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13887238#comment-13887238 ] Xuefu Zhang commented on HIVE-6147: --- This looks good, but from the patch, it seems that the solution is only for HBase. I wonder if we have given any thought to generalizing the problem and providing a general solution. I can see the benefits of separating the storage (such as HBase) from the data format (Avro, Thrift, Protocol Buffers, Parquet, etc.). Then we solve M + N problems rather than M * N problems. What if the Avro data is coming from other storage, such as Accumulo, or Parquet data is coming from HBase? Support avro data stored in HBase columns - Key: HIVE-6147 URL: https://issues.apache.org/jira/browse/HIVE-6147 Project: Hive Issue Type: Bug Components: HBase Handler Affects Versions: 0.12.0 Reporter: Swarnim Kulkarni Assignee: Swarnim Kulkarni Attachments: HIVE-6147.1.patch.txt
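To make the M + N point concrete, here is a minimal Java sketch, assuming two hypothetical interfaces that are not part of Hive or of the patch: `CellSource` (one implementation per storage system) and `CellDecoder` (one implementation per data format). The in-memory source and the key=value "format" are toy stand-ins for HBase and Avro; the only point is that any source can be paired with any decoder without writing a class per combination.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class StorageFormatSketch {

    /** Hypothetical: one implementation per storage system (M of them). */
    interface CellSource {
        List<byte[]> readCells();
    }

    /** Hypothetical: one implementation per data format (N of them). */
    interface CellDecoder {
        Map<String, String> decode(byte[] cell);
    }

    /** Toy storage backend; a real one would scan HBase, Accumulo, and so on. */
    static CellSource inMemorySource(String... cells) {
        return () -> {
            List<byte[]> out = new ArrayList<>();
            for (String cell : cells) {
                out.add(cell.getBytes(StandardCharsets.UTF_8));
            }
            return out;
        };
    }

    /** Toy format: comma-separated key=value pairs, standing in for Avro etc. */
    static CellDecoder keyValueDecoder() {
        return cell -> {
            Map<String, String> fields = new LinkedHashMap<>();
            String text = new String(cell, StandardCharsets.UTF_8);
            for (String pair : text.split(",")) {
                String[] kv = pair.split("=", 2);
                fields.put(kv[0], kv.length > 1 ? kv[1] : "");
            }
            return fields;
        };
    }

    /** Works for any source/decoder pair, so M sources + N decoders cover M * N cases. */
    static List<Map<String, String>> scan(CellSource source, CellDecoder decoder) {
        List<Map<String, String>> rows = new ArrayList<>();
        for (byte[] cell : source.readCells()) {
            rows.add(decoder.decode(cell));
        }
        return rows;
    }

    public static void main(String[] args) {
        System.out.println(scan(inMemorySource("name=ann,city=kc"), keyValueDecoder()));
    }
}
```

Adding a new format under this arrangement means writing one more `CellDecoder`; no storage-specific code changes.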
[jira] [Commented] (HIVE-6147) Support avro data stored in HBase columns
[ https://issues.apache.org/jira/browse/HIVE-6147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13887299#comment-13887299 ] Swarnim Kulkarni commented on HIVE-6147: [~xuefuz] You are correct. This adds support to the existing HBaseSerDe, via ObjectInspectors, for querying Avro data stored in HBase. You also make a very good point about having a unified interface that allows querying any structured format stored in any storage, such as HBase or HDFS. Specifically, we can look at creating specific implementations of the HBase DataType[1] to provide such a layer on top of HBase. The idea here, though, is to solve this specific problem and get this support in for now, and then move towards a more generalized approach. [1] https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/types/DataType.html Support avro data stored in HBase columns - Key: HIVE-6147 URL: https://issues.apache.org/jira/browse/HIVE-6147 Project: Hive Issue Type: Bug Components: HBase Handler Affects Versions: 0.12.0 Reporter: Swarnim Kulkarni Assignee: Swarnim Kulkarni Attachments: HIVE-6147.1.patch.txt, HIVE-6147.2.patch.txt
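The inspector idea mentioned above can be sketched roughly as follows. This is not Hive's ObjectInspector API and not code from the patch: `StructInspector`, `PersonRecord`, and `PersonInspector` are hypothetical names invented for illustration. The sketch only shows the general pattern, in which the query side reads named fields through an inspector and never depends on the concrete record class.

```java
import java.util.Arrays;
import java.util.List;

public class StructInspectorSketch {

    /** Hypothetical, simplified analogue of an inspector for struct-like data. */
    interface StructInspector<T> {
        List<String> fieldNames();
        Object fieldValue(T record, String fieldName);
    }

    /** Hypothetical stand-in for a decoded record; a real one would come from an Avro reader. */
    static final class PersonRecord {
        final String name;
        final int age;

        PersonRecord(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    /** Knows how to expose PersonRecord fields by name. */
    static final class PersonInspector implements StructInspector<PersonRecord> {
        @Override
        public List<String> fieldNames() {
            return Arrays.asList("name", "age");
        }

        @Override
        public Object fieldValue(PersonRecord record, String fieldName) {
            switch (fieldName) {
                case "name":
                    return record.name;
                case "age":
                    return record.age;
                default:
                    throw new IllegalArgumentException("unknown field: " + fieldName);
            }
        }
    }

    /** Query-side code: sees only the inspector, never the concrete record type. */
    static <T> String project(T record, StructInspector<T> inspector) {
        StringBuilder sb = new StringBuilder();
        for (String field : inspector.fieldNames()) {
            if (sb.length() > 0) {
                sb.append(", ");
            }
            sb.append(field).append('=').append(inspector.fieldValue(record, field));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(project(new PersonRecord("ann", 30), new PersonInspector()));
    }
}
```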
[jira] [Commented] (HIVE-6147) Support avro data stored in HBase columns
[ https://issues.apache.org/jira/browse/HIVE-6147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13887475#comment-13887475 ] Swarnim Kulkarni commented on HIVE-6147: Review Request: https://reviews.apache.org/r/17566/ Support avro data stored in HBase columns - Key: HIVE-6147 URL: https://issues.apache.org/jira/browse/HIVE-6147 Project: Hive Issue Type: Bug Components: HBase Handler Affects Versions: 0.12.0 Reporter: Swarnim Kulkarni Assignee: Swarnim Kulkarni Attachments: HIVE-6147.1.patch.txt, HIVE-6147.2.patch.txt