[jira] [Commented] (HIVE-16395) ConcurrentModificationException on config object in HoS
[ https://issues.apache.org/jira/browse/HIVE-16395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16206090#comment-16206090 ]

Andrew Sherman commented on HIVE-16395:
---

Thanks for the commit [~stakiar]

> ConcurrentModificationException on config object in HoS
> ---
>
>          Key: HIVE-16395
>          URL: https://issues.apache.org/jira/browse/HIVE-16395
>      Project: Hive
>   Issue Type: Task
>   Components: Spark
>     Reporter: Sahil Takiar
>     Assignee: Andrew Sherman
>      Fix For: 3.0.0
>
>  Attachments: HIVE-16395.1.patch, HIVE-16395.2.patch
>
>
> Looks like this is happening inside Spark executors; it appears to be a race
> condition when modifying {{Configuration}} objects.
> Stack-Trace:
> {code}
> java.io.IOException: java.lang.reflect.InvocationTargetException
> 	at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
> 	at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
> 	at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:267)
> 	at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.<init>(HadoopShimsSecure.java:213)
> 	at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getRecordReader(HadoopShimsSecure.java:334)
> 	at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:682)
> 	at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:240)
> 	at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:211)
> 	at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:101)
> 	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
> 	at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
> 	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> 	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
> 	at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
> 	at org.apache.spark.rdd.UnionRDD.compute(UnionRDD.scala:87)
> 	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
> 	at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
> 	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
> 	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
> 	at org.apache.spark.scheduler.Task.run(Task.scala:89)
> 	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:242)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> 	at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.reflect.InvocationTargetException
> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> 	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> 	at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> 	at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:253)
> 	... 21 more
> Caused by: java.util.ConcurrentModificationException
> 	at java.util.Hashtable$Enumerator.next(Hashtable.java:1167)
> 	at org.apache.hadoop.conf.Configuration.iterator(Configuration.java:2455)
> 	at org.apache.hadoop.fs.s3a.S3AUtils.propagateBucketOptions(S3AUtils.java:716)
> 	at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:181)
> 	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2815)
> 	at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:98)
> 	at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2852)
> 	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2834)
> 	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:387)
> 	at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
> 	at org.apache.hadoop.mapred.LineRecordReader.<init>(LineRecordReader.java:108)
> 	at org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:67)
> 	at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.<init>(CombineHiveRecordReader.java:68)
> 	... 26 more
> {code}

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
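The root-cause frame ({{java.util.Hashtable$Enumerator.next}}) points at Hadoop's {{Configuration}}, which stores its properties in a {{java.util.Properties}}, a {{Hashtable}} subclass whose iterators are fail-fast. A minimal stand-in sketch of the failure mode follows. It is not the Hive code path: it uses a plain {{Hashtable}} instead of {{Configuration}}, and a single-threaded mutation instead of the cross-thread race, purely to show why adding keys while an iteration is in flight throws.

```java
import java.util.ConcurrentModificationException;
import java.util.Hashtable;
import java.util.Map;

public class CmeDemo {
    // Returns true if mutating the table mid-iteration triggers the
    // fail-fast check, which is the same mechanism behind the JIRA trace.
    public static boolean raisesCme() {
        Hashtable<String, String> conf = new Hashtable<>();
        conf.put("fs.s3a.access.key", "a");
        conf.put("fs.s3a.secret.key", "b");
        try {
            for (Map.Entry<String, String> e : conf.entrySet()) {
                // Stands in for another task writing to the shared conf
                // while this "reader" is still iterating it.
                conf.put("added." + e.getKey(), "v");
            }
            return false;
        } catch (ConcurrentModificationException cme) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(raisesCme());
    }
}
```

In the real bug the writer is a different executor thread, so the exception is intermittent rather than deterministic, which is what makes it a race.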
[jira] [Commented] (HIVE-16395) ConcurrentModificationException on config object in HoS
[ https://issues.apache.org/jira/browse/HIVE-16395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204017#comment-16204017 ]

Andrew Sherman commented on HIVE-16395:
---

Test failures look unrelated, so this is ready to go IMHO [~stakiar]
[jira] [Commented] (HIVE-16395) ConcurrentModificationException on config object in HoS
[ https://issues.apache.org/jira/browse/HIVE-16395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16203100#comment-16203100 ]

Hive QA commented on HIVE-16395:
---

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12891792/HIVE-16395.2.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 11223 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[optimize_nullscan] (batchId=162)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query16] (batchId=241)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query94] (batchId=241)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] (batchId=239)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query16] (batchId=239)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query94] (batchId=239)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=202)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7266/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7266/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7266/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12891792 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-16395) ConcurrentModificationException on config object in HoS
[ https://issues.apache.org/jira/browse/HIVE-16395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16202528#comment-16202528 ]

Sahil Takiar commented on HIVE-16395:
---

LGTM +1 pending results of Hive QA
[jira] [Commented] (HIVE-16395) ConcurrentModificationException on config object in HoS
[ https://issues.apache.org/jira/browse/HIVE-16395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16202487#comment-16202487 ]

Andrew Sherman commented on HIVE-16395:
---

Thanks [~stakiar], I will future-proof the test
[jira] [Commented] (HIVE-16395) ConcurrentModificationException on config object in HoS
[ https://issues.apache.org/jira/browse/HIVE-16395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16202287#comment-16202287 ]

Sahil Takiar commented on HIVE-16395:
---

Overall patch LGTM, just one comment:

{quote}
assertNull("this test assumes the value of " + sparkCloneConfiguration + " is not set in HiveConf", hiveSetting);
{quote}

Should we be assuming this? Would be better for the test to handle both scenarios.
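One way to make the test robust to both scenarios, as the review suggests, is to capture whatever value the environment already has, force the setting under test, and restore the original afterwards. The sketch below uses a plain {{Map}} standing in for HiveConf; {{runWithSetting}} and the key name are hypothetical illustrations, not code from the patch.

```java
import java.util.HashMap;
import java.util.Map;

public class CloneConfTestSketch {
    // Forces `key` to `forcedValue` for the duration of the test body,
    // then restores the pre-existing state (set or unset) either way.
    static String runWithSetting(Map<String, String> conf, String key, String forcedValue) {
        String saved = conf.get(key); // remember whatever the environment had
        try {
            conf.put(key, forcedValue); // force the scenario under test
            return conf.get(key);       // ...the real test would exercise the code path here...
        } finally {
            if (saved == null) {
                conf.remove(key);       // restore "unset"
            } else {
                conf.put(key, saved);   // restore the original value
            }
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("hive.spark.clone.configuration", "false"); // pre-existing value
        String seen = runWithSetting(conf, "hive.spark.clone.configuration", "true");
        // The test body saw the forced value; the ambient value survives.
        System.out.println(seen + " / " + conf.get("hive.spark.clone.configuration"));
    }
}
```

With this shape the assertNull precondition becomes unnecessary: the test passes whether or not the property was already set in the ambient HiveConf.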
[jira] [Commented] (HIVE-16395) ConcurrentModificationException on config object in HoS
[ https://issues.apache.org/jira/browse/HIVE-16395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16201801#comment-16201801 ]

Hive QA commented on HIVE-16395:
---

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12891581/HIVE-16395.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 11212 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[optimize_nullscan] (batchId=162)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] (batchId=239)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7250/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7250/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7250/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12891581 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-16395) ConcurrentModificationException on config object in HoS
[ https://issues.apache.org/jira/browse/HIVE-16395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16198558#comment-16198558 ] Rui Li commented on HIVE-16395: --- Hi [~asherman], sorry for the late response, just returned from a long holiday. Cloning the job conf sounds good to me.
[jira] [Commented] (HIVE-16395) ConcurrentModificationException on config object in HoS
[ https://issues.apache.org/jira/browse/HIVE-16395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16187634#comment-16187634 ] Sahil Takiar commented on HIVE-16395: - 3.5 ms isn't very long. I say we set it to true by default in HoS. We'll have to add code that enables it by default, but if any Hive configuration file ({{hive-site.xml}}) explicitly sets it to {{false}}, we'll have to honor that setting and disable it.
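The "default to true, but honor an explicit user setting" behavior described above could be sketched as follows. This is a hypothetical helper, not the actual patch; the real change would live in {{HiveSparkClientFactory}} and read the Hive configuration rather than a plain {{Properties}} object:

```java
import java.util.Properties;

public class CloneConfDefault {

    // Default spark.hadoop.cloneConf to "true" for HoS, but if the user
    // explicitly set a value (e.g. in hive-site.xml), honor it either way.
    static String effectiveCloneConf(Properties userConf) {
        String explicit = userConf.getProperty("spark.hadoop.cloneConf");
        return (explicit != null) ? explicit : "true";
    }

    public static void main(String[] args) {
        Properties unset = new Properties();
        Properties disabled = new Properties();
        disabled.setProperty("spark.hadoop.cloneConf", "false");

        System.out.println(effectiveCloneConf(unset));    // true (our default)
        System.out.println(effectiveCloneConf(disabled)); // false (user wins)
    }
}
```

The point is that the default is applied only when the key is absent, so an admin who deliberately disabled cloning keeps that choice.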
[jira] [Commented] (HIVE-16395) ConcurrentModificationException on config object in HoS
[ https://issues.apache.org/jira/browse/HIVE-16395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16186112#comment-16186112 ] Andrew Sherman commented on HIVE-16395: --- [~lirui] [~stakiar] If I had to decide, I would say that setting spark.hadoop.cloneConf to true in HiveSparkClientFactory is better as it closes a whole category of problems.
[jira] [Commented] (HIVE-16395) ConcurrentModificationException on config object in HoS
[ https://issues.apache.org/jira/browse/HIVE-16395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16183376#comment-16183376 ] Andrew Sherman commented on HIVE-16395: --- In the case of the reported bug, S3AUtils.propagateBucketOptions() clones a Configuration and then iterates over the properties in the source Configuration, which is where the exception happened. So we could fix this particular bug (in Hadoop) by having S3AUtils.propagateBucketOptions() iterate over the clone it has just made, adding any new properties after the operation has finished. I have code that demonstrates the problem, and a fix. The more general fix is to clone the JobConf; I think we would do this by setting spark.hadoop.cloneConf to true in HiveSparkClientFactory. I did some toy benchmarks on cloning a Configuration using {noformat} Configuration clone = new Configuration(original); {noformat} The time it takes depends on the size of the Configuration. * A Configuration with 1000 properties takes less than 1 ms. * A Configuration with 1 properties takes ~3.5 ms. What do you think is the best approach, [~lirui] [~stakiar]?
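The narrower fix sketched above, iterating over the copy you just made rather than the live source, can be illustrated with plain JDK collections as a stand-in for a Configuration's property table. This is illustrative only, not the actual S3A code:

```java
import java.util.Hashtable;
import java.util.Map;

public class IterateTheClone {
    public static void main(String[] args) {
        // Stand-in for the source Configuration's property table.
        Hashtable<String, String> source = new Hashtable<>();
        for (int i = 0; i < 100; i++) {
            source.put("key." + i, "value");
        }

        // Copy first, then iterate only the private copy.
        Hashtable<String, String> snapshot = new Hashtable<>(source);

        int seen = 0;
        for (Map.Entry<String, String> e : snapshot.entrySet()) {
            // Mutating the *source* mid-iteration is now harmless, because
            // the loop never touches the source's fail-fast iterator.
            source.put("added." + seen, "value");
            seen++;
        }
        System.out.println("iterated " + seen + " entries without CME");
    }
}
```

Had the loop iterated {{source}} directly, the first {{put}} would have invalidated the iterator and the next step would have thrown ConcurrentModificationException.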
[jira] [Commented] (HIVE-16395) ConcurrentModificationException on config object in HoS
[ https://issues.apache.org/jira/browse/HIVE-16395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16017220#comment-16017220 ] Rui Li commented on HIVE-16395: --- Thanks [~stakiar] for the good research. Looking at {{CombineFileRecordReader.initNextRecordReader}}, it seems we update the JobConf every time we get a record reader. That means even without the ConcurrentModificationException, we may hit other issues in the future. Actually we ship a serialized JobConf to each task so that each task can instantiate its own JobConf and process data using it (see {{HivePairFlatMapFunction}}). I wonder whether we can somehow tell HadoopRDD to use this JobConf instead of the broadcasted one. If not, I prefer cloning the JobConf since we mutate it. Sahil, could you do some benchmarks to quantify the perf impact?
[jira] [Commented] (HIVE-16395) ConcurrentModificationException on config object in HoS
[ https://issues.apache.org/jira/browse/HIVE-16395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16016720#comment-16016720 ] Sahil Takiar commented on HIVE-16395: - Think I found the issue. By default, Spark gives tasks the same {{JobConf}} object to use (ref [HadoopRDD.scala|https://github.com/apache/spark/blob/branch-2.0/core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala#L143]). In this case, each {{CombineHiveInputFormat.getRecordReader}} inside an executor is given the same {{JobConf}} object. This can lead to a {{ConcurrentModificationException}} if one task mutates the {{JobConf}} while another task is iterating over it. Similar issues have been reported in Spark too: SPARK-2546. I dug through the code a bit and Hive modifies the {{JobConf}} object in at least one place: {{ColumnProjectionUtils.appendReadColumns}}. It's probably modified in other places too, and SPARK-2546 suggests that some {{RecordReader}} implementations mutate it as well. The solution to SPARK-2546 was to add a config called {{spark.hadoop.cloneConf}} that is set to {{false}} by default. When set to {{true}} it clones a new {{JobConf}} object for each Spark task, which avoids any thread-safety issues (the [PR|https://github.com/apache/spark/pull/2684] for SPARK-2546 has a good explanation of the change). It's set to {{false}} by default for performance considerations. So for this JIRA we could: 1: Close this and tell users who hit this issue to just set {{spark.hadoop.cloneConf}} to {{true}} 2: Given that we know Hive will mutate {{JobConf}} objects, set {{spark.hadoop.cloneConf}} to {{true}} for HoS, and profile the perf impact 3: ?? [~lirui], [~xuefuz] any thoughts on this? 
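The shared-{{JobConf}} race described in this thread can be reproduced with plain JDK collections, since a Hadoop {{Configuration}}'s properties are stored in a {{java.util.Hashtable}} (hence the {{Hashtable$Enumerator}} frame in the stack trace). This is a minimal single-threaded stand-in, not Hive code; the property name in the comment is only an example of the kind of key {{appendReadColumns}} touches:

```java
import java.util.ConcurrentModificationException;
import java.util.Hashtable;
import java.util.Iterator;
import java.util.Map;

public class SharedConfRace {
    public static void main(String[] args) {
        // Stand-in for a Configuration's property table shared by two tasks.
        Hashtable<String, String> props = new Hashtable<>();
        for (int i = 0; i < 100; i++) {
            props.put("key." + i, "value");
        }

        Iterator<Map.Entry<String, String>> it = props.entrySet().iterator();
        boolean threw = false;
        try {
            it.next();
            // "Task B" mutates the shared table while "Task A" is iterating.
            props.put("hive.io.file.readcolumn.ids", "0,1");
            it.next(); // the fail-fast iterator detects the modification
        } catch (ConcurrentModificationException e) {
            threw = true;
        }
        System.out.println("ConcurrentModificationException: " + threw);
    }
}
```

In the real bug the mutation and the iteration happen on different executor threads, so the exception is intermittent rather than deterministic; cloning the {{JobConf}} per task removes the sharing entirely.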