[ https://issues.apache.org/jira/browse/HIVE-20771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16654834#comment-16654834 ]
ASF GitHub Bot commented on HIVE-20771:
---------------------------------------

GitHub user cvaliente opened a pull request:

    https://github.com/apache/hive/pull/450

    [HIVE-20771] LazyBinarySerDe fails on empty structs.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cvaliente/hive HIVE-20771

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/hive/pull/450.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #450

----
commit c4b839f610e4a57c497cfc108823d8da1b466fa7
Author: Clemens Valiente <clemens.valiente@...>
Date:   2018-10-18T08:08:08Z

    HIVE-20771 enable LazyBinarySerDe to read fields with empty structs

----

> LazyBinarySerDe fails on empty structs.
> ---------------------------------------
>
>                 Key: HIVE-20771
>                 URL: https://issues.apache.org/jira/browse/HIVE-20771
>             Project: Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>    Affects Versions: 1.2.2, 2.3.2, 3.1.0
>            Reporter: Clemens Valiente
>            Assignee: Clemens Valiente
>            Priority: Minor
>              Labels: pull-request-available
>         Attachments: HIVE-20771.patch
>
>
> {code:java}
> CREATE TABLE cvaliente.structtest AS
> SELECT named_struct();
> SHOW CREATE TABLE cvaliente.structtest;
> SELECT * FROM cvaliente.structtest ORDER BY rand();
> {code}
> The resulting schema is:
> {code:sql}
> CREATE TABLE `cvaliente.structtest`(
>   `_c0` struct<>)
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.mapred.TextInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
> LOCATION
>   'hdfs://nameservice1/user/cvaliente/cvaliente/structtest2'
> TBLPROPERTIES (
>   'COLUMN_STATS_ACCURATE'='true',
>   'numFiles'='1',
>   'numRows'='1',
>   'rawDataSize'='0',
>   'totalSize'='1',
>   'transient_lastDdlTime'='1539781607');
> {code}
> Between the MAP and REDUCE phases, Hive serializes the row to a
> LazyBinaryStruct. When it tries to read the same object back, the
> {{SELECT}} query above fails:
> {code}
> 2018-10-17 14:32:02,298 [FATAL] [TezChild] |tez.ReduceRecordSource|: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"reducesinkkey0":0.13508293503238622},"value":{"_col0":{}}}
>     at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:338)
>     at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:259)
>     at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:169)
>     at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:164)
>     at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
>     at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:370)
>     at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
>     at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
>     at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
>     at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
>     at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating VALUE._col0
>     at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:82)
>     at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:329)
>     ... 17 more
> Caused by: java.lang.RuntimeException: length should be positive!
>     at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryNonPrimitive.init(LazyBinaryNonPrimitive.java:54)
>     at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.init(LazyBinaryStruct.java:95)
>     at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.uncheckedGetField(LazyBinaryStruct.java:264)
>     at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.getField(LazyBinaryStruct.java:201)
>     at org.apache.hadoop.hive.serde2.lazybinary.objectinspector.LazyBinaryStructObjectInspector.getStructFieldData(LazyBinaryStructObjectInspector.java:64)
>     at org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator._evaluate(ExprNodeColumnEvaluator.java:98)
>     at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
>     at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
>     at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:77)
>     ... 18 more
> {code}
> This is because LazyBinaryNonPrimitive rejects any length that is not
> positive, which rules out empty structs; see
> https://github.com/apache/hive/blob/master/serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryNonPrimitive.java#L53

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
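
For readers following the root cause described in the issue, here is a minimal sketch of the kind of length check that would produce the "length should be positive!" error, next to one possible relaxation. It is not the Hive source and not necessarily what the pull request does: the class `EmptyStructLengthCheck`, its methods, and the relaxed error message are hypothetical, and it assumes that an empty `struct<>` reaches the check with a payload length of 0, as the exception message suggests.

```java
import java.util.function.IntConsumer;

// Hypothetical stand-in for illustration only; this is not a Hive class.
class EmptyStructLengthCheck {

    // Strict check, modeled on the exception message in the stack trace:
    // a length of 0 is rejected along with negative lengths.
    static void checkStrict(int length) {
        if (length <= 0) {
            throw new RuntimeException("length should be positive!");
        }
    }

    // Assumed relaxation: only a negative length is treated as invalid,
    // so a zero-length payload (an empty struct) is accepted.
    // The message text here is made up for this sketch.
    static void checkRelaxed(int length) {
        if (length < 0) {
            throw new RuntimeException("length should not be negative!");
        }
    }

    // Returns true if the given check accepts the length, false if it throws.
    static boolean accepts(IntConsumer check, int length) {
        try {
            check.accept(length);
            return true;
        } catch (RuntimeException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        int emptyStructLength = 0; // assumed payload length of struct<>
        System.out.println("strict accepts 0:  "
                + accepts(EmptyStructLengthCheck::checkStrict, emptyStructLength));
        System.out.println("relaxed accepts 0: "
                + accepts(EmptyStructLengthCheck::checkRelaxed, emptyStructLength));
    }
}
```

Under these assumptions the strict check rejects length 0 while the relaxed one accepts it; both still reject negative lengths, which would indicate corrupt input rather than an empty value.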