Hi everyone,

When I run the following SQL in Beeline, Hive throws a 
ConcurrentModificationException. Does anyone know what's wrong with my Hive 
setup, or have any ideas on how to narrow down where the problem is?


INSERT OVERWRITE TABLE
  kylin_intermediate_prom_group_by_ws_name_cur_cube_19700101010000_20100101010000
SELECT TBL_HIS_UWIP_SCAN_PROM.ORDER_NAME
FROM TESTMES.TBL_HIS_UWIP_SCAN_PROM AS TBL_HIS_UWIP_SCAN_PROM
WHERE (TBL_HIS_UWIP_SCAN_PROM.START_TIME >= '1970-01-01 01:00:00'
   AND TBL_HIS_UWIP_SCAN_PROM.START_TIME <  '2010-01-01 01:00:00')
DISTRIBUTE BY RAND();


My environment:

A 12-node cluster with:

Hadoop 2.7.2

Spark 1.6.2

Zookeeper 3.4.6

HBase 1.2.2

Hive 2.1.0

Kylin 1.5.3


I'll also list some settings from hive-site.xml that may be helpful for 
analyzing the problem:

hive.support.concurrency=true

hive.lock.manager=org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager

hive.execution.engine=spark

hive.server2.transport.mode=http

hive.server2.authentication=NONE


This is actually one step of building a Kylin cube. The SELECT query returns 
about 3,000,000 rows. Here is what I got from hive.log:


2016-08-12T18:43:07,473 INFO  [HiveServer2-Background-Pool: Thread-83]: 
status.SparkJobMonitor (:()) - 2016-08-12 18:43:07,472  Stage-0_0: 58/58 
Finished       Stage-1_0: 13/13 Finished
2016-08-12T18:43:07,476 INFO  [HiveServer2-Background-Pool: Thread-83]: 
status.SparkJobMonitor (:()) - Status: Finished successfully in 264.96 seconds
2016-08-12T18:43:07,488 INFO  [HiveServer2-Background-Pool: Thread-83]: 
spark.SparkTask (:()) - =====Spark Job[85a00425-c044-4e22-b54a-f2c12feb4e82] 
statistics=====
2016-08-12T18:43:07,488 INFO  [HiveServer2-Background-Pool: Thread-83]: 
spark.SparkTask (:()) - Spark Job[85a00425-c044-4e22-b54a-f2c12feb4e82] Metrics
2016-08-12T18:43:07,488 INFO  [HiveServer2-Background-Pool: Thread-83]: 
spark.SparkTask (:()) -         ExecutorDeserializeTime: 157772
2016-08-12T18:43:07,488 INFO  [HiveServer2-Background-Pool: Thread-83]: 
spark.SparkTask (:()) -         ExecutorRunTime: 4102583
2016-08-12T18:43:07,488 INFO  [HiveServer2-Background-Pool: Thread-83]: 
spark.SparkTask (:()) -         ResultSize: 149069
2016-08-12T18:43:07,488 INFO  [HiveServer2-Background-Pool: Thread-83]: 
spark.SparkTask (:()) -         JvmGCTime: 234246
2016-08-12T18:43:07,488 INFO  [HiveServer2-Background-Pool: Thread-83]: 
spark.SparkTask (:()) -         ResultSerializationTime: 23
2016-08-12T18:43:07,488 INFO  [HiveServer2-Background-Pool: Thread-83]: 
spark.SparkTask (:()) -         MemoryBytesSpilled: 0
2016-08-12T18:43:07,488 INFO  [HiveServer2-Background-Pool: Thread-83]: 
spark.SparkTask (:()) -         DiskBytesSpilled: 0
2016-08-12T18:43:07,488 INFO  [HiveServer2-Background-Pool: Thread-83]: 
spark.SparkTask (:()) -         BytesRead: 6831052047
2016-08-12T18:43:07,488 INFO  [HiveServer2-Background-Pool: Thread-83]: 
spark.SparkTask (:()) -         RemoteBlocksFetched: 702
2016-08-12T18:43:07,489 INFO  [HiveServer2-Background-Pool: Thread-83]: 
spark.SparkTask (:()) -         LocalBlocksFetched: 52
2016-08-12T18:43:07,489 INFO  [HiveServer2-Background-Pool: Thread-83]: 
spark.SparkTask (:()) -         TotalBlocksFetched: 754
2016-08-12T18:43:07,489 INFO  [HiveServer2-Background-Pool: Thread-83]: 
spark.SparkTask (:()) -         FetchWaitTime: 12
2016-08-12T18:43:07,489 INFO  [HiveServer2-Background-Pool: Thread-83]: 
spark.SparkTask (:()) -         RemoteBytesRead: 2611264054
2016-08-12T18:43:07,489 INFO  [HiveServer2-Background-Pool: Thread-83]: 
spark.SparkTask (:()) -         ShuffleBytesWritten: 2804791500
2016-08-12T18:43:07,489 INFO  [HiveServer2-Background-Pool: Thread-83]: 
spark.SparkTask (:()) -         ShuffleWriteTime: 56641742751
2016-08-12T18:43:07,489 INFO  [HiveServer2-Background-Pool: Thread-83]: 
spark.SparkTask (:()) - HIVE
2016-08-12T18:43:07,489 INFO  [HiveServer2-Background-Pool: Thread-83]: 
spark.SparkTask (:()) -         CREATED_FILES: 13
2016-08-12T18:43:07,489 INFO  [HiveServer2-Background-Pool: Thread-83]: 
spark.SparkTask (:()) -         
RECORDS_OUT_1_default.kylin_intermediate_prom_group_by_ws_name_cur_cube_19700101010000_20100101010000:
 271942413
2016-08-12T18:43:07,489 INFO  [HiveServer2-Background-Pool: Thread-83]: 
spark.SparkTask (:()) -         RECORDS_IN: 1076808610
2016-08-12T18:43:07,489 INFO  [HiveServer2-Background-Pool: Thread-83]: 
spark.SparkTask (:()) -         RECORDS_OUT_INTERMEDIATE: 271942413
2016-08-12T18:43:07,489 INFO  [HiveServer2-Background-Pool: Thread-83]: 
spark.SparkTask (:()) -         DESERIALIZE_ERRORS: 0
2016-08-12T18:43:07,489 INFO  [HiveServer2-Background-Pool: Thread-83]: 
spark.SparkTask (:()) - Execution completed successfully
2016-08-12T18:43:07,521 INFO  [HiveServer2-Background-Pool: Thread-83]: 
exec.FileSinkOperator (:()) - Moving tmp dir: 
hdfs://bigdata/kylin/kylin_metadata/kylin-38250257-1649-4530-8ccb-975469aa6d22/kylin_intermediate_prom_group_by_ws_name_cur_cube_19700101010000_20100101010000/.hive-staging_hive_2016-08-12_18-37-50_610_2817977227856616745-2/_tmp.-ext-10000
 to: 
hdfs://bigdata/kylin/kylin_metadata/kylin-38250257-1649-4530-8ccb-975469aa6d22/kylin_intermediate_prom_group_by_ws_name_cur_cube_19700101010000_20100101010000/.hive-staging_hive_2016-08-12_18-37-50_610_2817977227856616745-2/-ext-10000
2016-08-12T18:43:07,740 INFO  [HiveServer2-Background-Pool: Thread-83]: 
ql.Driver (:()) - Starting task [Stage-0:MOVE] in serial mode
2016-08-12T18:43:07,741 INFO  [HiveServer2-Background-Pool: Thread-83]: 
hive.metastore (:()) - Closed a connection to metastore, current connections: 1
2016-08-12T18:43:07,742 INFO  [HiveServer2-Background-Pool: Thread-83]: 
exec.Task (:()) - Loading data to table 
default.kylin_intermediate_prom_group_by_ws_name_cur_cube_19700101010000_20100101010000
 from 
hdfs://bigdata/kylin/kylin_metadata/kylin-38250257-1649-4530-8ccb-975469aa6d22/kylin_intermediate_prom_group_by_ws_name_cur_cube_19700101010000_20100101010000/.hive-staging_hive_2016-08-12_18-37-50_610_2817977227856616745-2/-ext-10000
2016-08-12T18:43:07,743 INFO  [HiveServer2-Background-Pool: Thread-83]: 
hive.metastore (:()) - Trying to connect to metastore with URI 
thrift://bigdata-master:9083
2016-08-12T18:43:07,744 INFO  [HiveServer2-Background-Pool: Thread-83]: 
hive.metastore (:()) - Opened a connection to metastore, current connections: 2
2016-08-12T18:43:07,769 INFO  [HiveServer2-Background-Pool: Thread-83]: 
hive.metastore (:()) - Connected to metastore.
2016-08-12T18:43:08,110 INFO  [Delete-Thread-1]: fs.TrashPolicyDefault (:()) - 
Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 
0 minutes.
2016-08-12T18:43:08,110 INFO  [Delete-Thread-12]: fs.TrashPolicyDefault (:()) - 
Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 
0 minutes.
2016-08-12T18:43:08,110 INFO  [Delete-Thread-0]: fs.TrashPolicyDefault (:()) - 
Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 
0 minutes.
2016-08-12T18:43:08,110 INFO  [Delete-Thread-7]: fs.TrashPolicyDefault (:()) - 
Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 
0 minutes.
2016-08-12T18:43:08,110 INFO  [Delete-Thread-4]: fs.TrashPolicyDefault (:()) - 
Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 
0 minutes.
2016-08-12T18:43:08,110 INFO  [Delete-Thread-8]: fs.TrashPolicyDefault (:()) - 
Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 
0 minutes.
2016-08-12T18:43:08,111 INFO  [Delete-Thread-2]: fs.TrashPolicyDefault (:()) - 
Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 
0 minutes.
2016-08-12T18:43:08,111 INFO  [Delete-Thread-9]: fs.TrashPolicyDefault (:()) - 
Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 
0 minutes.
2016-08-12T18:43:08,112 INFO  [Delete-Thread-10]: fs.TrashPolicyDefault (:()) - 
Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 
0 minutes.
2016-08-12T18:43:08,112 INFO  [Delete-Thread-3]: fs.TrashPolicyDefault (:()) - 
Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 
0 minutes.
2016-08-12T18:43:08,112 INFO  [Delete-Thread-5]: fs.TrashPolicyDefault (:()) - 
Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 
0 minutes.
2016-08-12T18:43:08,113 INFO  [Delete-Thread-6]: fs.TrashPolicyDefault (:()) - 
Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 
0 minutes.
2016-08-12T18:43:08,113 INFO  [Delete-Thread-11]: fs.TrashPolicyDefault (:()) - 
Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 
0 minutes.
2016-08-12T18:43:08,164 INFO  [HiveServer2-Background-Pool: Thread-83]: 
common.FileUtils (:()) - Creating directory if it doesn't exist: 
hdfs://bigdata/kylin/kylin_metadata/kylin-38250257-1649-4530-8ccb-975469aa6d22/kylin_intermediate_prom_group_by_ws_name_cur_cube_19700101010000_20100101010000
2016-08-12T18:43:08,177 ERROR [HiveServer2-Background-Pool: Thread-83]: 
hdfs.KeyProviderCache (:()) - Could not find uri with key 
[dfs.encryption.key.provider.uri] to create a keyProvider !!
2016-08-12T18:43:08,285 ERROR [HiveServer2-Background-Pool: Thread-83]: 
exec.Task (:()) - Failed with exception 
java.util.ConcurrentModificationException
org.apache.hadoop.hive.ql.metadata.HiveException: 
java.util.ConcurrentModificationException
        at org.apache.hadoop.hive.ql.metadata.Hive.moveFile(Hive.java:2942)
        at org.apache.hadoop.hive.ql.metadata.Hive.replaceFiles(Hive.java:3198)
        at org.apache.hadoop.hive.ql.metadata.Hive.loadTable(Hive.java:1805)
        at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:355)
        at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197)
        at 
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
        at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1858)
        at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1562)
        at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1313)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1084)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1077)
        at 
org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:235)
        at 
org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:90)
        at 
org.apache.hive.service.cli.operation.SQLOperation$2$1.run(SQLOperation.java:299)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
        at 
org.apache.hive.service.cli.operation.SQLOperation$2.run(SQLOperation.java:312)
        at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.ConcurrentModificationException
        at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:859)
        at java.util.ArrayList$Itr.next(ArrayList.java:831)
        at 
org.apache.hadoop.hdfs.protocolPB.PBHelper.convertAclEntryProto(PBHelper.java:2325)
        at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.setAcl(ClientNamenodeProtocolTranslatorPB.java:1325)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
        at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
        at com.sun.proxy.$Proxy28.setAcl(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient.setAcl(DFSClient.java:3242)
        at 
org.apache.hadoop.hdfs.DistributedFileSystem$43.doCall(DistributedFileSystem.java:2052)
        at 
org.apache.hadoop.hdfs.DistributedFileSystem$43.doCall(DistributedFileSystem.java:2049)
        at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at 
org.apache.hadoop.hdfs.DistributedFileSystem.setAcl(DistributedFileSystem.java:2049)
        at 
org.apache.hadoop.hive.io.HdfsUtils.setFullFileStatus(HdfsUtils.java:126)
        at org.apache.hadoop.hive.ql.metadata.Hive$3.call(Hive.java:2919)
        at org.apache.hadoop.hive.ql.metadata.Hive$3.call(Hive.java:2911)
        ... 4 more

2016-08-12T18:43:08,286 ERROR [HiveServer2-Background-Pool: Thread-83]: 
ql.Driver (:()) - FAILED: Execution Error, return code 1 from 
org.apache.hadoop.hive.ql.exec.MoveTask. 
java.util.ConcurrentModificationException
2016-08-12T18:43:08,286 INFO  [HiveServer2-Background-Pool: Thread-83]: 
ql.Driver (:()) - Completed executing 
command(queryId=hadoop_20160812183750_2f4560e7-7a07-4443-8937-cd0ec03ee887); 
Time taken: 267.439 seconds
2016-08-12T18:43:08,664 ERROR [HiveServer2-Background-Pool: Thread-83]: 
operation.Operation (:()) - Error running hive query:
org.apache.hive.service.cli.HiveSQLException: Error while processing statement: 
FAILED: Execution Error, return code 1 from 
org.apache.hadoop.hive.ql.exec.MoveTask. 
java.util.ConcurrentModificationException
        at 
org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:387)
        at 
org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:237)
        at 
org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:90)
        at 
org.apache.hive.service.cli.operation.SQLOperation$2$1.run(SQLOperation.java:299)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
        at 
org.apache.hive.service.cli.operation.SQLOperation$2.run(SQLOperation.java:312)
        at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
java.util.ConcurrentModificationException
        at org.apache.hadoop.hive.ql.metadata.Hive.moveFile(Hive.java:2942)
        at org.apache.hadoop.hive.ql.metadata.Hive.replaceFiles(Hive.java:3198)
        at org.apache.hadoop.hive.ql.metadata.Hive.loadTable(Hive.java:1805)
        at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:355)
        at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197)
        at 
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
        at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1858)
        at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1562)
        at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1313)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1084)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1077)
        at 
org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:235)
        ... 11 more
Caused by: java.util.ConcurrentModificationException
        at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:859)
        at java.util.ArrayList$Itr.next(ArrayList.java:831)
        at 
org.apache.hadoop.hdfs.protocolPB.PBHelper.convertAclEntryProto(PBHelper.java:2325)
        at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.setAcl(ClientNamenodeProtocolTranslatorPB.java:1325)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
        at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
        at com.sun.proxy.$Proxy28.setAcl(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient.setAcl(DFSClient.java:3242)
        at 
org.apache.hadoop.hdfs.DistributedFileSystem$43.doCall(DistributedFileSystem.java:2052)
        at 
org.apache.hadoop.hdfs.DistributedFileSystem$43.doCall(DistributedFileSystem.java:2049)
        at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at 
org.apache.hadoop.hdfs.DistributedFileSystem.setAcl(DistributedFileSystem.java:2049)
        at 
org.apache.hadoop.hive.io.HdfsUtils.setFullFileStatus(HdfsUtils.java:126)
        at org.apache.hadoop.hive.ql.metadata.Hive$3.call(Hive.java:2919)
        at org.apache.hadoop.hive.ql.metadata.Hive$3.call(Hive.java:2911)
        ... 4 more
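From my reading of the trace, the exception is the JDK's fail-fast ArrayList iterator firing inside PBHelper.convertAclEntryProto, presumably while Hive's parallel move threads (the Hive$3.call frames) mutate the same ACL entry list. Just to illustrate the JDK behavior I mean (the names here are mine, not from Hive):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;

public class CmeDemo {
    // Mutating an ArrayList while iterating it trips the fail-fast
    // iterator's modification check and throws CME, even single-threaded.
    static String demo() {
        List<String> acls = new ArrayList<>(
                Arrays.asList("user::rwx", "group::r-x"));
        try {
            for (String entry : acls) {
                acls.add("other::r--"); // structural change mid-iteration
            }
            return "no exception";
        } catch (ConcurrentModificationException ex) {
            return "caught CME";
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "caught CME"
    }
}
```

So even though the exception surfaces in the HDFS client, the root cause would be whoever shares that list across the move threads, not the query itself.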

Interestingly, if I narrow the WHERE clause so that the SELECT returns only 
about 300,000 rows, the INSERT completes successfully.
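Since the smaller result set succeeds, I suspect a race among the MoveTask's parallel file-move threads. As an experiment (this is only my guess, not a verified fix) I plan to try forcing the move into a single thread:

```sql
-- Guessed workaround: disable multi-threaded file moves so the ACL
-- list is not touched by several move threads at once.
SET hive.mv.files.thread=0;
```

If the INSERT then succeeds on the full 3,000,000-row result, that would point at the parallel move path rather than the data itself.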

Thanks,
Mh F
