[jira] [Commented] (HIVE-16395) ConcurrentModificationException on config object in HoS

2017-10-16 Thread Andrew Sherman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16206090#comment-16206090
 ] 

Andrew Sherman commented on HIVE-16395:
---

Thanks for the commit [~stakiar]

> ConcurrentModificationException on config object in HoS
> ---
>
> Key: HIVE-16395
> URL: https://issues.apache.org/jira/browse/HIVE-16395
> Project: Hive
>  Issue Type: Task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Andrew Sherman
> Fix For: 3.0.0
>
> Attachments: HIVE-16395.1.patch, HIVE-16395.2.patch
>
>
> Looks like this is happening inside spark executors, looks to be some race 
> condition when modifying {{Configuration}} objects.
> Stack-Trace:
> {code}
> java.io.IOException: java.lang.reflect.InvocationTargetException
>   at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
>   at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:267)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.(HadoopShimsSecure.java:213)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getRecordReader(HadoopShimsSecure.java:334)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:682)
>   at org.apache.spark.rdd.HadoopRDD$$anon$1.(HadoopRDD.scala:240)
>   at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:211)
>   at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:101)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>   at org.apache.spark.rdd.UnionRDD.compute(UnionRDD.scala:87)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>   at org.apache.spark.scheduler.Task.run(Task.scala:89)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:242)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:253)
>   ... 21 more
> Caused by: java.util.ConcurrentModificationException
>   at java.util.Hashtable$Enumerator.next(Hashtable.java:1167)
>   at 
> org.apache.hadoop.conf.Configuration.iterator(Configuration.java:2455)
>   at 
> org.apache.hadoop.fs.s3a.S3AUtils.propagateBucketOptions(S3AUtils.java:716)
>   at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:181)
>   at 
> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2815)
>   at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:98)
>   at 
> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2852)
>   at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2834)
>   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:387)
>   at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
>   at 
> org.apache.hadoop.mapred.LineRecordReader.(LineRecordReader.java:108)
>   at 
> org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:67)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.(CombineHiveRecordReader.java:68)
>   ... 26 more
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16395) ConcurrentModificationException on config object in HoS

2017-10-13 Thread Andrew Sherman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204017#comment-16204017
 ] 

Andrew Sherman commented on HIVE-16395:
---

Test failures look unrelated, so this is ready to go IMHO [~stakiar]

> ConcurrentModificationException on config object in HoS
> ---
>
> Key: HIVE-16395
> URL: https://issues.apache.org/jira/browse/HIVE-16395
> Project: Hive
>  Issue Type: Task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Andrew Sherman
> Attachments: HIVE-16395.1.patch, HIVE-16395.2.patch
>
>
> Looks like this is happening inside spark executors, looks to be some race 
> condition when modifying {{Configuration}} objects.
> Stack-Trace:
> {code}
> java.io.IOException: java.lang.reflect.InvocationTargetException
>   at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
>   at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:267)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.(HadoopShimsSecure.java:213)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getRecordReader(HadoopShimsSecure.java:334)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:682)
>   at org.apache.spark.rdd.HadoopRDD$$anon$1.(HadoopRDD.scala:240)
>   at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:211)
>   at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:101)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>   at org.apache.spark.rdd.UnionRDD.compute(UnionRDD.scala:87)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>   at org.apache.spark.scheduler.Task.run(Task.scala:89)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:242)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:253)
>   ... 21 more
> Caused by: java.util.ConcurrentModificationException
>   at java.util.Hashtable$Enumerator.next(Hashtable.java:1167)
>   at 
> org.apache.hadoop.conf.Configuration.iterator(Configuration.java:2455)
>   at 
> org.apache.hadoop.fs.s3a.S3AUtils.propagateBucketOptions(S3AUtils.java:716)
>   at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:181)
>   at 
> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2815)
>   at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:98)
>   at 
> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2852)
>   at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2834)
>   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:387)
>   at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
>   at 
> org.apache.hadoop.mapred.LineRecordReader.(LineRecordReader.java:108)
>   at 
> org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:67)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.(CombineHiveRecordReader.java:68)
>   ... 26 more
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16395) ConcurrentModificationException on config object in HoS

2017-10-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16203100#comment-16203100
 ] 

Hive QA commented on HIVE-16395:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12891792/HIVE-16395.2.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 11223 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[optimize_nullscan]
 (batchId=162)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query16] 
(batchId=241)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query94] 
(batchId=241)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] 
(batchId=239)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query16] 
(batchId=239)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query94] 
(batchId=239)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=202)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7266/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7266/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7266/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12891792 - PreCommit-HIVE-Build

> ConcurrentModificationException on config object in HoS
> ---
>
> Key: HIVE-16395
> URL: https://issues.apache.org/jira/browse/HIVE-16395
> Project: Hive
>  Issue Type: Task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Andrew Sherman
> Attachments: HIVE-16395.1.patch, HIVE-16395.2.patch
>
>
> Looks like this is happening inside spark executors, looks to be some race 
> condition when modifying {{Configuration}} objects.
> Stack-Trace:
> {code}
> java.io.IOException: java.lang.reflect.InvocationTargetException
>   at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
>   at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:267)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.(HadoopShimsSecure.java:213)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getRecordReader(HadoopShimsSecure.java:334)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:682)
>   at org.apache.spark.rdd.HadoopRDD$$anon$1.(HadoopRDD.scala:240)
>   at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:211)
>   at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:101)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>   at org.apache.spark.rdd.UnionRDD.compute(UnionRDD.scala:87)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>   at org.apache.spark.scheduler.Task.run(Task.scala:89)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:242)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at 

[jira] [Commented] (HIVE-16395) ConcurrentModificationException on config object in HoS

2017-10-12 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16202528#comment-16202528
 ] 

Sahil Takiar commented on HIVE-16395:
-

LGTM +1 pending results of Hive QA

> ConcurrentModificationException on config object in HoS
> ---
>
> Key: HIVE-16395
> URL: https://issues.apache.org/jira/browse/HIVE-16395
> Project: Hive
>  Issue Type: Task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Andrew Sherman
> Attachments: HIVE-16395.1.patch, HIVE-16395.2.patch
>
>
> Looks like this is happening inside spark executors, looks to be some race 
> condition when modifying {{Configuration}} objects.
> Stack-Trace:
> {code}
> java.io.IOException: java.lang.reflect.InvocationTargetException
>   at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
>   at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:267)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.(HadoopShimsSecure.java:213)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getRecordReader(HadoopShimsSecure.java:334)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:682)
>   at org.apache.spark.rdd.HadoopRDD$$anon$1.(HadoopRDD.scala:240)
>   at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:211)
>   at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:101)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>   at org.apache.spark.rdd.UnionRDD.compute(UnionRDD.scala:87)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>   at org.apache.spark.scheduler.Task.run(Task.scala:89)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:242)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:253)
>   ... 21 more
> Caused by: java.util.ConcurrentModificationException
>   at java.util.Hashtable$Enumerator.next(Hashtable.java:1167)
>   at 
> org.apache.hadoop.conf.Configuration.iterator(Configuration.java:2455)
>   at 
> org.apache.hadoop.fs.s3a.S3AUtils.propagateBucketOptions(S3AUtils.java:716)
>   at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:181)
>   at 
> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2815)
>   at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:98)
>   at 
> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2852)
>   at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2834)
>   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:387)
>   at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
>   at 
> org.apache.hadoop.mapred.LineRecordReader.(LineRecordReader.java:108)
>   at 
> org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:67)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.(CombineHiveRecordReader.java:68)
>   ... 26 more
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16395) ConcurrentModificationException on config object in HoS

2017-10-12 Thread Andrew Sherman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16202487#comment-16202487
 ] 

Andrew Sherman commented on HIVE-16395:
---

Thanks [~stakiar] I will future-proof the test

> ConcurrentModificationException on config object in HoS
> ---
>
> Key: HIVE-16395
> URL: https://issues.apache.org/jira/browse/HIVE-16395
> Project: Hive
>  Issue Type: Task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Andrew Sherman
> Attachments: HIVE-16395.1.patch
>
>
> Looks like this is happening inside spark executors, looks to be some race 
> condition when modifying {{Configuration}} objects.
> Stack-Trace:
> {code}
> java.io.IOException: java.lang.reflect.InvocationTargetException
>   at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
>   at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:267)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.(HadoopShimsSecure.java:213)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getRecordReader(HadoopShimsSecure.java:334)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:682)
>   at org.apache.spark.rdd.HadoopRDD$$anon$1.(HadoopRDD.scala:240)
>   at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:211)
>   at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:101)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>   at org.apache.spark.rdd.UnionRDD.compute(UnionRDD.scala:87)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>   at org.apache.spark.scheduler.Task.run(Task.scala:89)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:242)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:253)
>   ... 21 more
> Caused by: java.util.ConcurrentModificationException
>   at java.util.Hashtable$Enumerator.next(Hashtable.java:1167)
>   at 
> org.apache.hadoop.conf.Configuration.iterator(Configuration.java:2455)
>   at 
> org.apache.hadoop.fs.s3a.S3AUtils.propagateBucketOptions(S3AUtils.java:716)
>   at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:181)
>   at 
> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2815)
>   at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:98)
>   at 
> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2852)
>   at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2834)
>   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:387)
>   at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
>   at 
> org.apache.hadoop.mapred.LineRecordReader.(LineRecordReader.java:108)
>   at 
> org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:67)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.(CombineHiveRecordReader.java:68)
>   ... 26 more
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16395) ConcurrentModificationException on config object in HoS

2017-10-12 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16202287#comment-16202287
 ] 

Sahil Takiar commented on HIVE-16395:
-

Overall patch LGTM, just one comment:

{quote} assertNull( "this test assumes the value of " + sparkCloneConfiguration 
+ " is not set in HiveConf", hiveSetting); {quote} Should we be assuming this? 
Would be better for the test to handle both scenarios.

> ConcurrentModificationException on config object in HoS
> ---
>
> Key: HIVE-16395
> URL: https://issues.apache.org/jira/browse/HIVE-16395
> Project: Hive
>  Issue Type: Task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Andrew Sherman
> Attachments: HIVE-16395.1.patch
>
>
> Looks like this is happening inside spark executors, looks to be some race 
> condition when modifying {{Configuration}} objects.
> Stack-Trace:
> {code}
> java.io.IOException: java.lang.reflect.InvocationTargetException
>   at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
>   at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:267)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.(HadoopShimsSecure.java:213)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getRecordReader(HadoopShimsSecure.java:334)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:682)
>   at org.apache.spark.rdd.HadoopRDD$$anon$1.(HadoopRDD.scala:240)
>   at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:211)
>   at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:101)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>   at org.apache.spark.rdd.UnionRDD.compute(UnionRDD.scala:87)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>   at org.apache.spark.scheduler.Task.run(Task.scala:89)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:242)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:253)
>   ... 21 more
> Caused by: java.util.ConcurrentModificationException
>   at java.util.Hashtable$Enumerator.next(Hashtable.java:1167)
>   at 
> org.apache.hadoop.conf.Configuration.iterator(Configuration.java:2455)
>   at 
> org.apache.hadoop.fs.s3a.S3AUtils.propagateBucketOptions(S3AUtils.java:716)
>   at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:181)
>   at 
> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2815)
>   at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:98)
>   at 
> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2852)
>   at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2834)
>   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:387)
>   at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
>   at 
> org.apache.hadoop.mapred.LineRecordReader.(LineRecordReader.java:108)
>   at 
> org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:67)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.(CombineHiveRecordReader.java:68)
>   ... 26 more
> {code}



--

[jira] [Commented] (HIVE-16395) ConcurrentModificationException on config object in HoS

2017-10-12 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16201801#comment-16201801
 ] 

Hive QA commented on HIVE-16395:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12891581/HIVE-16395.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 11212 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[optimize_nullscan]
 (batchId=162)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] 
(batchId=239)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7250/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7250/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7250/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12891581 - PreCommit-HIVE-Build

> ConcurrentModificationException on config object in HoS
> ---
>
> Key: HIVE-16395
> URL: https://issues.apache.org/jira/browse/HIVE-16395
> Project: Hive
>  Issue Type: Task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Andrew Sherman
> Attachments: HIVE-16395.1.patch
>
>
> Looks like this is happening inside spark executors, looks to be some race 
> condition when modifying {{Configuration}} objects.
> Stack-Trace:
> {code}
> java.io.IOException: java.lang.reflect.InvocationTargetException
>   at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
>   at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:267)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.(HadoopShimsSecure.java:213)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getRecordReader(HadoopShimsSecure.java:334)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:682)
>   at org.apache.spark.rdd.HadoopRDD$$anon$1.(HadoopRDD.scala:240)
>   at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:211)
>   at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:101)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>   at org.apache.spark.rdd.UnionRDD.compute(UnionRDD.scala:87)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>   at org.apache.spark.scheduler.Task.run(Task.scala:89)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:242)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:253)
>   ... 21 more
> Caused by: java.util.ConcurrentModificationException
>   at java.util.Hashtable$Enumerator.next(Hashtable.java:1167)
>   at 
> org.apache.hadoop.conf.Configuration.iterator(Configuration.java:2455)
>   at 
> 

[jira] [Commented] (HIVE-16395) ConcurrentModificationException on config object in HoS

2017-10-10 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16198558#comment-16198558
 ] 

Rui Li commented on HIVE-16395:
---

Hi [~asherman], sorry for the late response, just returned from a long holiday. 
Cloning the job conf sounds good to me.

> ConcurrentModificationException on config object in HoS
> ---
>
> Key: HIVE-16395
> URL: https://issues.apache.org/jira/browse/HIVE-16395
> Project: Hive
>  Issue Type: Task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>
> Looks like this is happening inside spark executors, looks to be some race 
> condition when modifying {{Configuration}} objects.
> Stack-Trace:
> {code}
> java.io.IOException: java.lang.reflect.InvocationTargetException
>   at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
>   at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:267)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.(HadoopShimsSecure.java:213)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getRecordReader(HadoopShimsSecure.java:334)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:682)
>   at org.apache.spark.rdd.HadoopRDD$$anon$1.(HadoopRDD.scala:240)
>   at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:211)
>   at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:101)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>   at org.apache.spark.rdd.UnionRDD.compute(UnionRDD.scala:87)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>   at org.apache.spark.scheduler.Task.run(Task.scala:89)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:242)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:253)
>   ... 21 more
> Caused by: java.util.ConcurrentModificationException
>   at java.util.Hashtable$Enumerator.next(Hashtable.java:1167)
>   at 
> org.apache.hadoop.conf.Configuration.iterator(Configuration.java:2455)
>   at 
> org.apache.hadoop.fs.s3a.S3AUtils.propagateBucketOptions(S3AUtils.java:716)
>   at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:181)
>   at 
> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2815)
>   at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:98)
>   at 
> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2852)
>   at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2834)
>   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:387)
>   at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
>   at 
> org.apache.hadoop.mapred.LineRecordReader.(LineRecordReader.java:108)
>   at 
> org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:67)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.(CombineHiveRecordReader.java:68)
>   ... 26 more
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16395) ConcurrentModificationException on config object in HoS

2017-10-01 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16187634#comment-16187634
 ] 

Sahil Takiar commented on HIVE-16395:
-

3.5 ms isn't very long. I say we set it to true by default in HoS. We'll have 
to add code thats enables it by default, but if any Hive configuration file 
({{hive-site.xml}}) explicitly sets it to {{false}} we'll have to honor that 
setting and disable it.

> ConcurrentModificationException on config object in HoS
> ---
>
> Key: HIVE-16395
> URL: https://issues.apache.org/jira/browse/HIVE-16395
> Project: Hive
>  Issue Type: Task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>
> Looks like this is happening inside spark executors, looks to be some race 
> condition when modifying {{Configuration}} objects.
> Stack-Trace:
> {code}
> java.io.IOException: java.lang.reflect.InvocationTargetException
>   at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
>   at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:267)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.(HadoopShimsSecure.java:213)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getRecordReader(HadoopShimsSecure.java:334)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:682)
>   at org.apache.spark.rdd.HadoopRDD$$anon$1.(HadoopRDD.scala:240)
>   at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:211)
>   at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:101)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>   at org.apache.spark.rdd.UnionRDD.compute(UnionRDD.scala:87)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>   at org.apache.spark.scheduler.Task.run(Task.scala:89)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:242)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:253)
>   ... 21 more
> Caused by: java.util.ConcurrentModificationException
>   at java.util.Hashtable$Enumerator.next(Hashtable.java:1167)
>   at 
> org.apache.hadoop.conf.Configuration.iterator(Configuration.java:2455)
>   at 
> org.apache.hadoop.fs.s3a.S3AUtils.propagateBucketOptions(S3AUtils.java:716)
>   at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:181)
>   at 
> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2815)
>   at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:98)
>   at 
> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2852)
>   at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2834)
>   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:387)
>   at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
>   at 
> org.apache.hadoop.mapred.LineRecordReader.(LineRecordReader.java:108)
>   at 
> org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:67)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.(CombineHiveRecordReader.java:68)
>   ... 26 more
> {code}



--
This message was sent by Atlassian JIRA

[jira] [Commented] (HIVE-16395) ConcurrentModificationException on config object in HoS

2017-09-29 Thread Andrew Sherman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16186112#comment-16186112
 ] 

Andrew Sherman commented on HIVE-16395:
---

[~lirui] [~stakiar] If I had to decide, I would say that setting 
spark.hadoop.cloneConf to true in HiveSparkClientFactory is better as it closes 
a whole category of problems. 


> ConcurrentModificationException on config object in HoS
> ---
>
> Key: HIVE-16395
> URL: https://issues.apache.org/jira/browse/HIVE-16395
> Project: Hive
>  Issue Type: Task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>
> Looks like this is happening inside spark executors, looks to be some race 
> condition when modifying {{Configuration}} objects.
> Stack-Trace:
> {code}
> java.io.IOException: java.lang.reflect.InvocationTargetException
>   at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
>   at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:267)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.(HadoopShimsSecure.java:213)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getRecordReader(HadoopShimsSecure.java:334)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:682)
>   at org.apache.spark.rdd.HadoopRDD$$anon$1.(HadoopRDD.scala:240)
>   at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:211)
>   at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:101)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>   at org.apache.spark.rdd.UnionRDD.compute(UnionRDD.scala:87)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>   at org.apache.spark.scheduler.Task.run(Task.scala:89)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:242)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:253)
>   ... 21 more
> Caused by: java.util.ConcurrentModificationException
>   at java.util.Hashtable$Enumerator.next(Hashtable.java:1167)
>   at 
> org.apache.hadoop.conf.Configuration.iterator(Configuration.java:2455)
>   at 
> org.apache.hadoop.fs.s3a.S3AUtils.propagateBucketOptions(S3AUtils.java:716)
>   at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:181)
>   at 
> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2815)
>   at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:98)
>   at 
> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2852)
>   at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2834)
>   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:387)
>   at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
>   at 
> org.apache.hadoop.mapred.LineRecordReader.(LineRecordReader.java:108)
>   at 
> org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:67)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.(CombineHiveRecordReader.java:68)
>   ... 26 more
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16395) ConcurrentModificationException on config object in HoS

2017-09-27 Thread Andrew Sherman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16183376#comment-16183376
 ] 

Andrew Sherman commented on HIVE-16395:
---

In the case of the reported bug S3AUtils.propagateBucketOptions() is cloning a 
Configuration, and then iterating over the properties in the source 
Configuration which is where the exception happend. So we could fix this 
particular bug (in Hadoop) by having S3AUtils.propagateBucketOptions() iterate 
over the clone it has just made, adding any new properties after the operation 
has finished. I have code that demonstrates the problem, and a fix.

The more general fix is to clone the JobConf. I think we would do this by 
setting in spark.hadoop.cloneConf to true in HiveSparkClientFactory.

I did some toy benchmarks on cloning a Configuration using 
{noformat}
Configuration clone = new Configuration(original);
{noformat}
The time it takes depends on the size of the Configuration.
* A Configuration with 1000 properties takes less than 1 ms.
* A Configuration with 1 properties takes ~ 3.5 ms.

What do you think is the best approach [~lirui] [~stakiar] ?

> ConcurrentModificationException on config object in HoS
> ---
>
> Key: HIVE-16395
> URL: https://issues.apache.org/jira/browse/HIVE-16395
> Project: Hive
>  Issue Type: Task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>
> Looks like this is happening inside spark executors, looks to be some race 
> condition when modifying {{Configuration}} objects.
> Stack-Trace:
> {code}
> java.io.IOException: java.lang.reflect.InvocationTargetException
>   at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
>   at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:267)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.(HadoopShimsSecure.java:213)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getRecordReader(HadoopShimsSecure.java:334)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:682)
>   at org.apache.spark.rdd.HadoopRDD$$anon$1.(HadoopRDD.scala:240)
>   at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:211)
>   at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:101)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>   at org.apache.spark.rdd.UnionRDD.compute(UnionRDD.scala:87)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>   at org.apache.spark.scheduler.Task.run(Task.scala:89)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:242)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:253)
>   ... 21 more
> Caused by: java.util.ConcurrentModificationException
>   at java.util.Hashtable$Enumerator.next(Hashtable.java:1167)
>   at 
> org.apache.hadoop.conf.Configuration.iterator(Configuration.java:2455)
>   at 
> org.apache.hadoop.fs.s3a.S3AUtils.propagateBucketOptions(S3AUtils.java:716)
>   at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:181)
>   at 
> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2815)
>   at 

[jira] [Commented] (HIVE-16395) ConcurrentModificationException on config object in HoS

2017-05-19 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16017220#comment-16017220
 ] 

Rui Li commented on HIVE-16395:
---

Thanks [~stakiar] for the good research. Looking at 
{{CombineFileRecordReader.initNextRecordReader}}, seems we update the JobConf 
every time we get the record reader. That means even w/o the 
ConcurrentModificationException, we may hit other issues in the future. 
Actually we ship a serialized JobConf to each task so that each task can 
instantiate its own JobConf and process data using it (see 
{{HivePairFlatMapFunction}}). I wonder whether we can somehow tell HadoopRDD to 
use this JobConf  instead of the broadcasted one. If not, I prefer clone the 
JobConf since we mutate it. 
Sahil, could you do some benchmark to quantify the perf impact?

> ConcurrentModificationException on config object in HoS
> ---
>
> Key: HIVE-16395
> URL: https://issues.apache.org/jira/browse/HIVE-16395
> Project: Hive
>  Issue Type: Task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>
> Looks like this is happening inside spark executors, looks to be some race 
> condition when modifying {{Configuration}} objects.
> Stack-Trace:
> {code}
> java.io.IOException: java.lang.reflect.InvocationTargetException
>   at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
>   at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:267)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.(HadoopShimsSecure.java:213)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getRecordReader(HadoopShimsSecure.java:334)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:682)
>   at org.apache.spark.rdd.HadoopRDD$$anon$1.(HadoopRDD.scala:240)
>   at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:211)
>   at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:101)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>   at org.apache.spark.rdd.UnionRDD.compute(UnionRDD.scala:87)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>   at org.apache.spark.scheduler.Task.run(Task.scala:89)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:242)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:253)
>   ... 21 more
> Caused by: java.util.ConcurrentModificationException
>   at java.util.Hashtable$Enumerator.next(Hashtable.java:1167)
>   at 
> org.apache.hadoop.conf.Configuration.iterator(Configuration.java:2455)
>   at 
> org.apache.hadoop.fs.s3a.S3AUtils.propagateBucketOptions(S3AUtils.java:716)
>   at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:181)
>   at 
> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2815)
>   at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:98)
>   at 
> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2852)
>   at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2834)
>   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:387)
>   at 

[jira] [Commented] (HIVE-16395) ConcurrentModificationException on config object in HoS

2017-05-18 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16016720#comment-16016720
 ] 

Sahil Takiar commented on HIVE-16395:
-

Think I found the issue. By default, Spark gives Tasks the same {{JobConf}} 
object to use (ref 
[HadoopRDD.scala|https://github.com/apache/spark/blob/branch-2.0/core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala#L143]).
 In this case, each {{CombineHiveInputFormat.getRecordReader}} inside an 
executor is given the same {{JobConf}} object. This can lead to a 
{{ConcurrentModificationException}} if one of the tasks mutates the {{JobConf}} 
while another task is iterating over it.

Similar issues have been reported in Spark too: SPARK-2546

I dug through the code a bit and Hive modifies the {{JobConf}} object in at 
least one place: {{ColumnProjectionUtils.appendReadColumns}}. Its probably 
modified in other places too, and SPARK-2546 suggests that some 
{{RecordReader}} implementations mutate it too.

The solution to SPARK-2546 was to add a config called 
{{spark.hadoop.cloneConf}} that is set to {{false}} by default. When set to 
{{true}} it clones a new {{JobConf}} object for each Spark Task, which avoids 
any thread safety issues (the [PR|https://github.com/apache/spark/pull/2684] 
for SPARK-2546 has a good explanation of the change). It's set to {{false}} by 
default for performance considerations.

So for this JIRA we could:

1: Close this and tell users if they hit this issue just set 
{{spark.hadoop.cloneConf}} to {{true}}
2: Given that we know Hive will mutate {{JobConf}} objects, maybe we should set 
{{spark.hadoop.cloneConf}} to {{true}} for HoS, we can profile the perf impact
3: ??

[~lirui], [~xuefuz] any thoughts on this?

> ConcurrentModificationException on config object in HoS
> ---
>
> Key: HIVE-16395
> URL: https://issues.apache.org/jira/browse/HIVE-16395
> Project: Hive
>  Issue Type: Task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>
> Looks like this is happening inside spark executors, looks to be some race 
> condition when modifying {{Configuration}} objects.
> Stack-Trace:
> {code}
> java.io.IOException: java.lang.reflect.InvocationTargetException
>   at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
>   at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:267)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.(HadoopShimsSecure.java:213)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getRecordReader(HadoopShimsSecure.java:334)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:682)
>   at org.apache.spark.rdd.HadoopRDD$$anon$1.(HadoopRDD.scala:240)
>   at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:211)
>   at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:101)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>   at org.apache.spark.rdd.UnionRDD.compute(UnionRDD.scala:87)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>   at org.apache.spark.scheduler.Task.run(Task.scala:89)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:242)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>   at 
>