[jira] [Commented] (HIVE-11221) In Tez mode, alter table concatenate orc files can intermittently fail with NPE

2016-03-22 Thread Aaron Dossett (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207384#comment-15207384
 ] 

Aaron Dossett commented on HIVE-11221:
--

[~ashishen...@gmail.com] HDP 2.3.4 does include this fix backported to 1.2.1 
(https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.4/bk_HDP_RelNotes/content/patch_hive.html)

We recently upgraded to 2.3.4 and concatenation is working fine so far.


> In Tez mode, alter table concatenate orc files can intermittently fail with 
> NPE
> ---
>
> Key: HIVE-11221
> URL: https://issues.apache.org/jira/browse/HIVE-11221
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-11221.1.patch
>
>
> We are not waiting for input ready events which can trigger occasional NPE if 
> input is not actually ready.
> Stacktrace:
> {code}
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:186)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MergeFileTezProcessor.run(MergeFileTezProcessor.java:42)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
>   at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
>   at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>   at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
>   at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:265)
>   at 
> org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:478)
>   at 
> org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:471)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:648)
>   at 
> org.apache.tez.mapreduce.lib.MRReaderMapred.setupOldRecordReader(MRReaderMapred.java:146)
>   at 
> org.apache.tez.mapreduce.lib.MRReaderMapred.<init>(MRReaderMapred.java:73)
>   at 
> org.apache.tez.mapreduce.input.MRInput.initializeInternal(MRInput.java:483)
>   at 
> org.apache.tez.mapreduce.input.MRInputLegacy.init(MRInputLegacy.java:108)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MergeFileRecordProcessor.getMRInput(MergeFileRecordProcessor.java:220)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MergeFileRecordProcessor.init(MergeFileRecordProcessor.java:72)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:162)
>   ... 13 more
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-7181) Beginner User On Apache Jira

2015-12-23 Thread Aaron Dossett (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Dossett resolved HIVE-7181.
-
Resolution: Not A Problem

> Beginner User On Apache Jira
> 
>
> Key: HIVE-7181
> URL: https://issues.apache.org/jira/browse/HIVE-7181
> Project: Hive
>  Issue Type: Wish
>Reporter: Nishant Kelkar
>Priority: Minor
>  Labels: documentation, newbie
>
> Hi All! 
> I've just started to use Apache's Jira board (I registered today). I've used 
> Jira for my work before, so I know how to navigate within Jira. My main 
> question is how issues are handled in the open source community (to which 
> I want to contribute, but I'm a noob here too). So basically, a person 
> opens a ticket when they think that the issue they are facing is a bug or 
> an improvement. 
> Questions:
> 1. Whom am I supposed to assign the ticket to? (myself?)
> 2. Who would be the QA assignee? 
> 3. If addressing the issue requires looking at the code, how am I supposed to 
> change the code and bring into effect those changes? (At work, we maintain a 
> Git repo on our private server. So everyone always has access to the latest 
> code).
> 4. Where can I find a list of all the people who are active on this project 
> (Hive)? It would be nice if I could tag people by their names in my ticket 
> comments. 
> 5. Where can I find well formatted documentation about how to take issues 
> from discovery to fixture on Apache Jira? 
> I apologize in advance, if my questions are too simple.
> Thanks, and any/all help is appreciated! 





[jira] [Commented] (HIVE-11221) In Tez mode, alter table concatenate orc files can intermittently fail with NPE

2015-11-05 Thread Aaron Dossett (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14992094#comment-14992094
 ] 

Aaron Dossett commented on HIVE-11221:
--

I am seeing a different NPE when I concatenate a partition with Tez, but trying 
the same command in MR works fine.  This was the closest JIRA I could find to 
this issue.  Does anyone think this is a new issue?

alter table mytable partition (partition_date='2015-10-30') concatenate;

Status: Failed
Vertex failed, vertexName=File Merge, vertexId=vertex_1446684298977_0879_1_00, 
diagnostics=[Vertex vertex_1446684298977_0879_1_00 [File Merge] killed/failed 
due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: 
[hdfs://namenode/apps/hive/warehouse/mydb.db/estore/partition_date=2015-10-30] 
initializer failed, vertex=vertex_1446684298977_0879_1_00 [File Merge], 
java.lang.NullPointerException
at 
org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:265)
at 
org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:452)
at 
org.apache.tez.mapreduce.hadoop.MRInputHelpers.generateOldSplits(MRInputHelpers.java:441)
at 
org.apache.tez.mapreduce.hadoop.MRInputHelpers.generateInputSplitsToMem(MRInputHelpers.java:295)
at 
org.apache.tez.mapreduce.common.MRInputAMSplitGenerator.initialize(MRInputAMSplitGenerator.java:124)
at 
org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:245)
at 
org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:239)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at 
org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:239)
at 
org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:226)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)


[jira] [Commented] (HIVE-11221) In Tez mode, alter table concatenate orc files can intermittently fail with NPE

2015-11-05 Thread Aaron Dossett (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14992691#comment-14992691
 ] 

Aaron Dossett commented on HIVE-11221:
--

Just to confirm -- you think my error is a duplicate?  The stack traces were a 
little different.



[jira] [Commented] (HIVE-11221) In Tez mode, alter table concatenate orc files can intermittently fail with NPE

2015-11-05 Thread Aaron Dossett (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14992628#comment-14992628
 ] 

Aaron Dossett commented on HIVE-11221:
--

[~prasanth_j] Hive 0.14 on HDP 2.2.4



[jira] [Commented] (HIVE-11221) In Tez mode, alter table concatenate orc files can intermittently fail with NPE

2015-11-05 Thread Aaron Dossett (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14992869#comment-14992869
 ] 

Aaron Dossett commented on HIVE-11221:
--

Thank you [~prasanth_j]!



[jira] [Updated] (HIVE-12059) Clean up reference to deprecated constants in AvroSerdeUtils

2015-10-07 Thread Aaron Dossett (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Dossett updated HIVE-12059:
-
Attachment: HIVE-12059.patch

> Clean up reference to deprecated constants in AvroSerdeUtils
> 
>
> Key: HIVE-12059
> URL: https://issues.apache.org/jira/browse/HIVE-12059
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Reporter: Aaron Dossett
>Priority: Minor
> Attachments: HIVE-12059.patch
>
>
> AvroSerdeUtils contains several deprecated String constants that are used by 
> other Hive modules.  Those should be cleaned up.





[jira] [Commented] (HIVE-12059) Clean up reference to deprecated constants in AvroSerdeUtils

2015-10-07 Thread Aaron Dossett (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14947575#comment-14947575
 ] 

Aaron Dossett commented on HIVE-12059:
--

My patch gets all deprecated references out EXCEPT from the SerDeSpec 
annotation in AvroSerDe.  I don't have any experience developing annotations, 
so the fix for that isn't obvious to me.

One approach would be to add some redundant Strings to AvroSerdeUtils with a 
level of access below public that AvroSerDe could use.  Open to other 
suggestions if this is important enough.
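
A minimal sketch of that approach, not taken from the patch. The annotation 
`SchemaProps`, the classes `SerdeConstantsSketch` and `SerdeSketch`, and the 
constant names are hypothetical stand-ins for SerDeSpec, AvroSerdeUtils, and 
AvroSerDe; the property string is only illustrative. It relies on the fact 
that annotation arguments must be compile-time constants, which a 
package-private `static final String` satisfies.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

// Hypothetical stand-in for an annotation that lists schema property names.
@Retention(RetentionPolicy.RUNTIME)
@interface SchemaProps {
    String[] value();
}

// Hypothetical stand-in for the utility class that owns the constants.
final class SerdeConstantsSketch {
    /** Old public name, kept only so existing callers keep compiling. */
    @Deprecated
    public static final String SCHEMA_LITERAL = "avro.schema.literal";

    /**
     * Package-private duplicate. It is still a compile-time constant,
     * so it is legal as an annotation argument, but it is not deprecated
     * and not visible outside the package.
     */
    static final String SCHEMA_LITERAL_INTERNAL = "avro.schema.literal";

    private SerdeConstantsSketch() {
    }
}

// Lives in the same package, so it can reference the non-public constant.
@SchemaProps({SerdeConstantsSketch.SCHEMA_LITERAL_INTERNAL})
final class SerdeSketch {
}
```

Because the duplicate is not public, it adds nothing to the API that other 
modules see, while the deprecated public constant can stay until its callers 
are migrated.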

> Clean up reference to deprecated constants in AvroSerdeUtils
> 
>
> Key: HIVE-12059
> URL: https://issues.apache.org/jira/browse/HIVE-12059
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Reporter: Aaron Dossett
>Assignee: Aaron Dossett
>Priority: Minor
> Attachments: HIVE-12059.patch
>
>
> AvroSerdeUtils contains several deprecated String constants that are used by 
> other Hive modules.  Those should be cleaned up.





[jira] [Commented] (HIVE-11977) Hive should handle an external avro table with zero length files present

2015-10-06 Thread Aaron Dossett (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14945487#comment-14945487
 ] 

Aaron Dossett commented on HIVE-11977:
--

Thank you all around, [~ashutoshc]!  Yes, I have tested this on local clusters 
and we are planning to deploy it to production next week as well.  I will 
resubmit my patch; thank you for that pointer.

> Hive should handle an external avro table with zero length files present
> 
>
> Key: HIVE-11977
> URL: https://issues.apache.org/jira/browse/HIVE-11977
> Project: Hive
>  Issue Type: Bug
>Reporter: Aaron Dossett
>Assignee: Aaron Dossett
> Attachments: HIVE-11977-2.patch, HIVE-11977.patch
>
>
> If a zero length file is in the top level directory housing an external avro 
> table,  all hive queries on the table fail.
> The issue is that org.apache.hadoop.hive.ql.io.avro.AvroGenericRecordReader 
> creates a new org.apache.avro.file.DataFileReader and DataFileReader throws 
> an exception when trying to read an empty file (because the empty file lacks 
> the magic number marking it as avro).  
> AvroGenericRecordReader should detect an empty file and then behave 
> reasonably.
> Caused by: java.io.IOException: Not a data file.
> at org.apache.avro.file.DataFileStream.initialize(DataFileStream.java:102)
> at org.apache.avro.file.DataFileReader.<init>(DataFileReader.java:97)
> at 
> org.apache.hadoop.hive.ql.io.avro.AvroGenericRecordReader.<init>(AvroGenericRecordReader.java:81)
> at 
> org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat.getRecordReader(AvroContainerInputFormat.java:51)
> at 
> org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:246)
> ... 25 more





[jira] [Updated] (HIVE-11977) Hive should handle an external avro table with zero length files present

2015-10-06 Thread Aaron Dossett (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Dossett updated HIVE-11977:
-
Attachment: HIVE-11977.2.patch

> Hive should handle an external avro table with zero length files present
> 
>
> Key: HIVE-11977
> URL: https://issues.apache.org/jira/browse/HIVE-11977
> Project: Hive
>  Issue Type: Bug
>Reporter: Aaron Dossett
>Assignee: Aaron Dossett
> Attachments: HIVE-11977.2.patch, HIVE-11977.patch
>


[jira] [Updated] (HIVE-11977) Hive should handle an external avro table with zero length files present

2015-10-06 Thread Aaron Dossett (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Dossett updated HIVE-11977:
-
Attachment: (was: HIVE-11977-2.patch)

> Hive should handle an external avro table with zero length files present
> 
>
> Key: HIVE-11977
> URL: https://issues.apache.org/jira/browse/HIVE-11977
> Project: Hive
>  Issue Type: Bug
>Reporter: Aaron Dossett
>Assignee: Aaron Dossett
> Attachments: HIVE-11977.patch
>


[jira] [Resolved] (HIVE-7316) Hive fails on zero length files

2015-10-06 Thread Aaron Dossett (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Dossett resolved HIVE-7316.
-
Resolution: Duplicate

> Hive fails on zero length files
> ---
>
> Key: HIVE-7316
> URL: https://issues.apache.org/jira/browse/HIVE-7316
> Project: Hive
>  Issue Type: Bug
>Reporter: Brock Noland
>
> Flume will, at times, generate zero length files. This causes queries to fail 
> on Avro data and likely sequence file as well.





[jira] [Commented] (HIVE-11977) Hive should handle an external avro table with zero length files present

2015-10-02 Thread Aaron Dossett (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941665#comment-14941665
 ] 

Aaron Dossett commented on HIVE-11977:
--

Thanks, [~ashutoshc], I checked the Avro JIRA based on your suggestion.  The 
Avro project declined that option in AVRO-1530 and suggested clients ignore 
zero length files.  That also led me to HIVE-7316, which my issue duplicates.

[~brocknoland] Your thoughts, since you are on both of the above JIRAs?

> Hive should handle an external avro table with zero length files present
> 
>
> Key: HIVE-11977
> URL: https://issues.apache.org/jira/browse/HIVE-11977
> Project: Hive
>  Issue Type: Bug
>Reporter: Aaron Dossett
>Assignee: Aaron Dossett
> Attachments: HIVE-11977-2.patch, HIVE-11977.patch
>


[jira] [Commented] (HIVE-11977) Hive should handle an external avro table with zero length files present

2015-10-02 Thread Aaron Dossett (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941603#comment-14941603
 ] 

Aaron Dossett commented on HIVE-11977:
--

[~ashutoshc] Thank you for your response! My thought is that any process for 
generating this data could have failure scenarios that result in zero-length 
files; this was the case when I initially ran into this issue.  A file was 
opened on HDFS and "held" as a zero-length file before data was written to it, 
and the process crashed before any data could be written.  The consequence of 
these cases, that the entire table is unreadable (based on my experience), 
seems disproportionate to the actual problem.  Likewise, a process deleting 
empty files could expose small windows where the table was unusable.

Would adding a warning and/or adding an option like 
{{hive.exec.orc.skip.corrupt.data}} be more appropriate than silently ignoring 
the files?  This is my first foray into Hive internals, so perhaps that orc 
option is not an exact comparison to this situation, but as a user it seems 
similar.
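
To make the two behaviours concrete, here is a rough sketch, assuming a 
boolean switch in the spirit of that ORC option. `EmptyFilePolicy` and its 
flag are hypothetical names, not Hive classes or settings, and the file 
lengths are passed in as a plain map; in Hive they would presumably come from 
the file system (for example `FileStatus.getLen()`).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not Hive code: decides what to do with zero-length files.
final class EmptyFilePolicy {
    // Stands in for a hypothetical configuration flag.
    private final boolean skipEmptyFiles;
    // Collected instead of logged so the sketch has no logging dependency.
    private final List<String> warnings = new ArrayList<>();

    EmptyFilePolicy(boolean skipEmptyFiles) {
        this.skipEmptyFiles = skipEmptyFiles;
    }

    /**
     * Returns the paths that are safe to hand to a reader.
     * With the flag on, zero-length files are dropped and a warning is recorded.
     * With the flag off, the first zero-length file aborts the scan.
     */
    List<String> readable(Map<String, Long> lengthsByPath) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Long> entry : lengthsByPath.entrySet()) {
            if (entry.getValue() > 0) {
                result.add(entry.getKey());
            } else if (skipEmptyFiles) {
                warnings.add("Skipping zero-length file: " + entry.getKey());
            } else {
                throw new IllegalStateException("Zero-length file: " + entry.getKey());
            }
        }
        return result;
    }

    List<String> warnings() {
        return warnings;
    }
}
```

With the flag on, one empty file costs a warning rather than the whole table; 
with it off, the failure at least names the offending file.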

Thank you again for the response and your feedback!

> Hive should handle an external avro table with zero length files present
> 
>
> Key: HIVE-11977
> URL: https://issues.apache.org/jira/browse/HIVE-11977
> Project: Hive
>  Issue Type: Bug
>Reporter: Aaron Dossett
>Assignee: Aaron Dossett
> Attachments: HIVE-11977-2.patch, HIVE-11977.patch
>
>
> If a zero-length file is in the top-level directory housing an external Avro 
> table, all Hive queries on the table fail.
> The issue is that org.apache.hadoop.hive.ql.io.avro.AvroGenericRecordReader 
> creates a new org.apache.avro.file.DataFileReader, and DataFileReader throws 
> an exception when trying to read an empty file (because the empty file lacks 
> the magic number marking it as Avro).
> AvroGenericRecordReader should detect an empty file and then behave 
> reasonably.
> Caused by: java.io.IOException: Not a data file.
> at org.apache.avro.file.DataFileStream.initialize(DataFileStream.java:102)
> at org.apache.avro.file.DataFileReader.<init>(DataFileReader.java:97)
> at org.apache.hadoop.hive.ql.io.avro.AvroGenericRecordReader.<init>(AvroGenericRecordReader.java:81)
> at org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat.getRecordReader(AvroContainerInputFormat.java:51)
> at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:246)
> ... 25 more
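
The failure mode in the quoted description can be illustrated without Hive or 
Avro on the classpath. Avro object container files begin with the four bytes 
'O', 'b', 'j', 1, so a zero-length file can never pass that check. The sketch 
below separates "empty" from "not Avro" so a caller could treat the two cases 
differently. AvroMagicCheck, Kind, and classify are hypothetical names, and 
this is not the code in the attached patches.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class AvroMagicCheck {
    // Avro object container files begin with the bytes 'O', 'b', 'j', 1.
    static final byte[] AVRO_MAGIC = {'O', 'b', 'j', 1};

    // Hypothetical classification used only for this illustration.
    enum Kind { EMPTY, AVRO_CONTAINER, NOT_AVRO }

    // Classifies a file from its leading bytes (the first four or more, if present).
    static Kind classify(byte[] leadingBytes) {
        if (leadingBytes.length == 0) {
            // Nothing to read: a candidate for "no records" rather than an error.
            return Kind.EMPTY;
        }
        if (leadingBytes.length >= AVRO_MAGIC.length
                && Arrays.equals(Arrays.copyOf(leadingBytes, AVRO_MAGIC.length), AVRO_MAGIC)) {
            return Kind.AVRO_CONTAINER;
        }
        // Non-empty but missing the magic number.
        return Kind.NOT_AVRO;
    }

    public static void main(String[] args) {
        System.out.println(classify(new byte[0]));
        System.out.println(classify(new byte[] {'O', 'b', 'j', 1, 0}));
        System.out.println(classify("plain text".getBytes(StandardCharsets.UTF_8)));
    }
}
```

A reader that made this distinction could report zero records for the EMPTY 
case and still raise an error for NOT_AVRO, which is one reading of "behave 
reasonably" in the description.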





[jira] [Updated] (HIVE-11977) Hive should handle an external avro table with zero length files present

2015-09-29 Thread Aaron Dossett (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Dossett updated HIVE-11977:
-
Attachment: HIVE-11977-002.patch

> Hive should handle an external avro table with zero length files present
> 
>
> Key: HIVE-11977
> URL: https://issues.apache.org/jira/browse/HIVE-11977
> Project: Hive
>  Issue Type: Bug
> Reporter: Aaron Dossett
> Assignee: Aaron Dossett
> Attachments: HIVE-11977-002.patch, HIVE-11977.patch





[jira] [Commented] (HIVE-11977) Hive should handle an external avro table with zero length files present

2015-09-29 Thread Aaron Dossett (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14936249#comment-14936249
 ] 

Aaron Dossett commented on HIVE-11977:
--

Attached a second patch that includes a unit test and better patch formatting.

> Hive should handle an external avro table with zero length files present
> 
>
> Key: HIVE-11977
> URL: https://issues.apache.org/jira/browse/HIVE-11977
> Project: Hive
>  Issue Type: Bug
> Reporter: Aaron Dossett
> Assignee: Aaron Dossett
> Attachments: HIVE-11977-2.patch, HIVE-11977.patch





[jira] [Updated] (HIVE-11977) Hive should handle an external avro table with zero length files present

2015-09-29 Thread Aaron Dossett (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Dossett updated HIVE-11977:
-
Attachment: HIVE-11977-2.patch

> Hive should handle an external avro table with zero length files present
> 
>
> Key: HIVE-11977
> URL: https://issues.apache.org/jira/browse/HIVE-11977
> Project: Hive
>  Issue Type: Bug
> Reporter: Aaron Dossett
> Assignee: Aaron Dossett
> Attachments: HIVE-11977-2.patch, HIVE-11977.patch





[jira] [Updated] (HIVE-11977) Hive should handle an external avro table with zero length files present

2015-09-29 Thread Aaron Dossett (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Dossett updated HIVE-11977:
-
Attachment: (was: HIVE-11977-002.patch)

> Hive should handle an external avro table with zero length files present
> 
>
> Key: HIVE-11977
> URL: https://issues.apache.org/jira/browse/HIVE-11977
> Project: Hive
>  Issue Type: Bug
> Reporter: Aaron Dossett
> Assignee: Aaron Dossett
> Attachments: HIVE-11977.patch





[jira] [Updated] (HIVE-11977) Hive should handle an external avro table with zero length files present

2015-09-28 Thread Aaron Dossett (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Dossett updated HIVE-11977:
-
Attachment: HIVE-11977.patch

> Hive should handle an external avro table with zero length files present
> 
>
> Key: HIVE-11977
> URL: https://issues.apache.org/jira/browse/HIVE-11977
> Project: Hive
>  Issue Type: Bug
> Reporter: Aaron Dossett
> Assignee: Aaron Dossett
> Attachments: HIVE-11977.patch





[jira] [Assigned] (HIVE-11977) Hive should handle an external avro table with zero length files present

2015-09-28 Thread Aaron Dossett (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Dossett reassigned HIVE-11977:


Assignee: Aaron Dossett

> Hive should handle an external avro table with zero length files present
> 
>
> Key: HIVE-11977
> URL: https://issues.apache.org/jira/browse/HIVE-11977
> Project: Hive
>  Issue Type: Bug
> Reporter: Aaron Dossett
> Assignee: Aaron Dossett





[jira] [Commented] (HIVE-11977) Hive should handle an external avro table with zero length files present

2015-09-28 Thread Aaron Dossett (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14933731#comment-14933731
 ] 

Aaron Dossett commented on HIVE-11977:
--

Uploading my first take at a fix.  I am working on adding appropriate unit / 
integration tests, but any feedback would be welcome in the meantime.

> Hive should handle an external avro table with zero length files present
> 
>
> Key: HIVE-11977
> URL: https://issues.apache.org/jira/browse/HIVE-11977
> Project: Hive
>  Issue Type: Bug
> Reporter: Aaron Dossett
> Assignee: Aaron Dossett


