[ https://issues.apache.org/jira/browse/HIVE-11540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14970142#comment-14970142 ]
Hive QA commented on HIVE-11540:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12767962/HIVE-11540.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5741/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5741/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5741/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-maven-3.0.5/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-maven-3.0.5/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-5741/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
From https://github.com/apache/hive
   552bfbc..51a0c03  branch-1   -> origin/branch-1
   2fd619b..3f03d26  master     -> origin/master
+ git reset --hard HEAD
HEAD is now at 2fd619b HIVE-11895: CBO: Calcite Operator To Hive Operator (Calcite Return Path): fix udaf_percentile_approx_23.q (Pengcheng Xiong, reviewed by Ashutosh Chauhan)
+ git clean -f -d
Removing data/files/parquet_type_promotion.txt
Removing ql/src/test/queries/clientpositive/parquet_type_promotion.q
Removing ql/src/test/results/clientpositive/parquet_type_promotion.q.out
+ git checkout master
Already on 'master'
Your branch is behind 'origin/master' by 3 commits, and can be fast-forwarded.
+ git reset --hard origin/master
HEAD is now at 3f03d26 HIVE-11710: Beeline embedded mode doesn't output query progress after setting any session property (Aihua via Xuefu)
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.
ATTACHMENT ID: 12767962 - PreCommit-HIVE-TRUNK-Build

> HDP 2.3 and Flume 1.6: Hive Streaming – Too many delta files during Compaction
> ------------------------------------------------------------------------------
>
>                 Key: HIVE-11540
>                 URL: https://issues.apache.org/jira/browse/HIVE-11540
>             Project: Hive
>          Issue Type: Bug
>          Components: Transactions
>    Affects Versions: 1.0.0
>            Reporter: Nivin Mathew
>            Assignee: Eugene Koifman
>         Attachments: HIVE-11540.patch
>
> Hello,
> I am streaming weblogs to Kafka and then to Flume 1.6 using a Hive sink, with an average of 20 million records a day. I have 5 compactors running at various intervals (30m/5m/5s); no matter what interval I set, the compactors seem to run out of memory cleaning up a couple thousand delta files and ultimately fall behind compacting/cleaning delta files. Any suggestions on what I can do to improve performance? Or can Hive streaming not handle this kind of load?
> I used this post as reference: http://henning.kropponline.de/2015/05/19/hivesink-for-flume/
> {noformat}
> 2015-08-12 15:05:01,197 FATAL [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.OutOfMemoryError: Direct buffer memory
> Max block location exceeded for split: CompactorInputSplit{base: hdfs://Dev01HWNameService/user/hive/warehouse/weblogs.db/dt=15-08-12/base_1056406, bucket: 0, length: 6493042, deltas: [delta_1056407_1056408, delta_1056409_1056410, delta_1056411_1056412, delta_1056413_1056414, delta_1056415_1056416, delta_1056417_1056418, …, delta_1074039_1074040, delta_1074041_1074042, delta_1074043_1074044, delta_1074045_1074046, delta_1074047_1074048, delta_1074049_1074050, delta_1074051_1074052]} splitsize: 8772 maxsize: 10
> 2015-08-12 15:34:25,271 INFO [upladevhwd04v.researchnow.com-18]: mapreduce.JobSubmitter (JobSubmitter.java:submitJobInternal(198)) - number of splits:3
> 2015-08-12 15:34:25,367 INFO [upladevhwd04v.researchnow.com-18]: mapreduce.JobSubmitter (JobSubmitter.java:printTokens(287)) - Submitting tokens for job: job_1439397150426_0068
> 2015-08-12 15:34:25,603 INFO [upladevhwd04v.researchnow.com-18]: impl.YarnClientImpl (YarnClientImpl.java:submitApplication(274)) - Submitted application application_1439397150426_0068
> 2015-08-12 15:34:25,610 INFO [upladevhwd04v.researchnow.com-18]: mapreduce.Job (Job.java:submit(1294)) - The url to track the job: http://upladevhwd02v.researchnow.com:8088/proxy/application_1439397150426_0068/
> 2015-08-12 15:34:25,611 INFO [upladevhwd04v.researchnow.com-18]: mapreduce.Job (Job.java:monitorAndPrintJob(1339)) - Running job: job_1439397150426_0068
> 2015-08-12 15:34:30,170 INFO [Thread-7]: compactor.Initiator (Initiator.java:run(88)) - Checking to see if we should compact weblogs.vop_hs.dt=15-08-12
> 2015-08-12 15:34:33,756 INFO [upladevhwd04v.researchnow.com-18]: mapreduce.Job (Job.java:monitorAndPrintJob(1360)) - Job job_1439397150426_0068 running in uber mode : false
> 2015-08-12 15:34:33,757 INFO [upladevhwd04v.researchnow.com-18]: mapreduce.Job (Job.java:monitorAndPrintJob(1367)) - map 0% reduce 0%
> 2015-08-12 15:34:35,147 INFO [Thread-7]: compactor.Initiator (Initiator.java:run(88)) - Checking to see if we should compact weblogs.vop_hs.dt=15-08-12
> 2015-08-12 15:34:40,155 INFO [Thread-7]: compactor.Initiator (Initiator.java:run(88)) - Checking to see if we should compact weblogs.vop_hs.dt=15-08-12
> 2015-08-12 15:34:45,184 INFO [Thread-7]: compactor.Initiator (Initiator.java:run(88)) - Checking to see if we should compact weblogs.vop_hs.dt=15-08-12
> 2015-08-12 15:34:50,201 INFO [Thread-7]: compactor.Initiator (Initiator.java:run(88)) - Checking to see if we should compact weblogs.vop_hs.dt=15-08-12
> 2015-08-12 15:34:55,256 INFO [Thread-7]: compactor.Initiator (Initiator.java:run(88)) - Checking to see if we should compact weblogs.vop_hs.dt=15-08-12
> 2015-08-12 15:35:00,205 INFO [Thread-7]: compactor.Initiator (Initiator.java:run(88)) - Checking to see if we should compact weblogs.vop_hs.dt=15-08-12
> 2015-08-12 15:35:02,975 INFO [upladevhwd04v.researchnow.com-18]: mapreduce.Job (Job.java:monitorAndPrintJob(1367)) - map 33% reduce 0%
> 2015-08-12 15:35:02,982 INFO [upladevhwd04v.researchnow.com-18]: mapreduce.Job (Job.java:printTaskEvents(1406)) - Task Id : attempt_1439397150426_0068_m_000000_0, Status : FAILED
> 2015-08-12 15:35:03,000 INFO [upladevhwd04v.researchnow.com-18]: mapreduce.Job (Job.java:printTaskEvents(1406)) - Task Id : attempt_1439397150426_0068_m_000001_0, Status : FAILED
> 2015-08-12 15:35:04,008 INFO [upladevhwd04v.researchnow.com-18]: mapreduce.Job (Job.java:monitorAndPrintJob(1367)) - map 0% reduce 0%
> 2015-08-12 15:35:05,132 INFO [Thread-7]: compactor.Initiator (Initiator.java:run(88)) - Checking to see if we should compact weblogs.vop_hs.dt=15-08-12
> 2015-08-12 15:35:10,206 INFO [Thread-7]: compactor.Initiator (Initiator.java:run(88)) - Checking to see if we should compact weblogs.vop_hs.dt=15-08-12
> 2015-08-12 15:35:15,228 INFO [Thread-7]: compactor.Initiator (Initiator.java:run(88)) - Checking to see if we should compact weblogs.vop_hs.dt=15-08-12
> 2015-08-12 15:35:20,207 INFO [Thread-7]: compactor.Initiator (Initiator.java:run(88)) - Checking to see if we should compact weblogs.vop_hs.dt=15-08-12
> 2015-08-12 15:35:25,148 INFO [Thread-7]: compactor.Initiator (Initiator.java:run(88)) - Checking to see if we should compact weblogs.vop_hs.dt=15-08-12
> 2015-08-12 15:35:28,154 INFO [upladevhwd04v.researchnow.com-18]: mapreduce.Job (Job.java:printTaskEvents(1406)) - Task Id : attempt_1439397150426_0068_m_000000_1, Status : FAILED
> 2015-08-12 15:35:29,161 INFO [upladevhwd04v.researchnow.com-18]: mapreduce.Job (Job.java:printTaskEvents(1406)) - Task Id : attempt_1439397150426_0068_m_000001_1, Status : FAILED
> 2015-08-12 15:35:30,142 INFO [Thread-7]: compactor.Initiator (Initiator.java:run(88)) - Checking to see if we should compact weblogs.vop_hs.dt=15-08-12
> 2015-08-12 15:35:35,140 INFO [Thread-7]: compactor.Initiator (Initiator.java:run(88)) - Checking to see if we should compact weblogs.vop_hs.dt=15-08-12
> 2015-08-12 15:35:40,170 INFO [Thread-7]: compactor.Initiator (Initiator.java:run(88)) - Checking to see if we should compact weblogs.vop_hs.dt=15-08-12
> 2015-08-12 15:35:45,153 INFO [Thread-7]: compactor.Initiator (Initiator.java:run(88)) - Checking to see if we should compact weblogs.vop_hs.dt=15-08-12
> 2015-08-12 15:35:50,150 INFO [Thread-7]: compactor.Initiator (Initiator.java:run(88)) - Checking to see if we should compact weblogs.vop_hs.dt=15-08-12
> 2015-08-12 15:35:52,268 INFO [upladevhwd04v.researchnow.com-18]: mapreduce.Job (Job.java:printTaskEvents(1406)) - Task Id : attempt_1439397150426_0068_m_000000_2, Status : FAILED
> 2015-08-12 15:35:53,274 INFO [upladevhwd04v.researchnow.com-18]: mapreduce.Job (Job.java:printTaskEvents(1406)) - Task Id : attempt_1439397150426_0068_m_000001_2, Status : FAILED
> 2015-08-12 15:35:55,149 INFO [Thread-7]: compactor.Initiator (Initiator.java:run(88)) - Checking to see if we should compact weblogs.vop_hs.dt=15-08-12
> 2015-08-12 15:36:00,160 INFO [Thread-7]: compactor.Initiator (Initiator.java:run(88)) - Checking to see if we should compact weblogs.vop_hs.dt=15-08-12
> 2015-08-12 15:36:05,145 INFO [Thread-7]: compactor.Initiator (Initiator.java:run(88)) - Checking to see if we should compact weblogs.vop_hs.dt=15-08-12
> 2015-08-12 15:36:10,155 INFO [Thread-7]: compactor.Initiator (Initiator.java:run(88)) - Checking to see if we should compact weblogs.vop_hs.dt=15-08-12
> 2015-08-12 15:36:15,158 INFO [Thread-7]: compactor.Initiator (Initiator.java:run(88)) - Checking to see if we should compact weblogs.vop_hs.dt=15-08-12
> 2015-08-12 15:36:17,397 INFO [upladevhwd04v.researchnow.com-18]: mapreduce.Job (Job.java:monitorAndPrintJob(1367)) - map 100% reduce 0%
> 2015-08-12 15:36:18,409 INFO [upladevhwd04v.researchnow.com-18]: mapreduce.Job (Job.java:monitorAndPrintJob(1380)) - Job job_1439397150426_0068 failed with state FAILED due to: Task failed task_1439397150426_0068_m_000000
> Job failed as tasks failed. failedMaps:1 failedReduces:0
> 2015-08-12 15:36:18,443 INFO [upladevhwd04v.researchnow.com-18]: mapreduce.Job (Job.java:monitorAndPrintJob(1385)) - Counters: 10
> 	Job Counters
> 		Failed map tasks=7
> 		Killed map tasks=1
> 		Launched map tasks=8
> 		Other local map tasks=6
> 		Data-local map tasks=2
> 		Total time spent by all maps in occupied slots (ms)=191960
> 		Total time spent by all reduces in occupied slots (ms)=0
> 		Total time spent by all map tasks (ms)=191960
> 		Total vcore-seconds taken by all map tasks=191960
> 		Total megabyte-seconds taken by all map tasks=884551680
> 2015-08-12 15:36:18,443 ERROR [upladevhwd04v.researchnow.com-18]: compactor.Worker (Worker.java:run(176)) - Caught exception while trying to compact weblogs.vop_hs.dt=15-08-12. Marking clean to avoid repeated failures, java.io.IOException: Job failed!
> 	at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:865)
> 	at org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.run(CompactorMR.java:186)
> 	at org.apache.hadoop.hive.ql.txn.compactor.Worker$1.run(Worker.java:169)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:415)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> 	at org.apache.hadoop.hive.ql.txn.compactor.Worker.run(Worker.java:166)
> 2015-08-12 15:36:18,444 ERROR [upladevhwd04v.researchnow.com-18]: txn.CompactionTxnHandler (CompactionTxnHandler.java:markCleaned(327)) - Expected to remove at least one row from completed_txn_components when marking compaction entry as clean!
> ^C
> {noformat}
> [ngmathew@upladevhwd04v ~]$ tail -f /var/log/hive/hivemetastore.log
> 2015-08-12 15:36:18,443 ERROR [upladevhwd04v.researchnow.com-18]: compactor.Worker (Worker.java:run(176)) - Caught exception while trying to compact weblogs.vop_hs.dt=15-08-12. Marking clean to avoid repeated failures, java.io.IOException: Job failed!
> 	at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:865)
> 	at org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.run(CompactorMR.java:186)
> 	at org.apache.hadoop.hive.ql.txn.compactor.Worker$1.run(Worker.java:169)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:415)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> 	at org.apache.hadoop.hive.ql.txn.compactor.Worker.run(Worker.java:166)
> Settings:
> hive.txn.manager = org.apache.hadoop.hive.ql.lockmgr.DbTxnManager
> hive.compactor.initiator.on = true
> hive.compactor.worker.threads = 5
> Table stored as ORC
> hive.vectorized.execution.enabled = false
> hive.input.format = org.apache.hadoop.hive.ql.io.HiveInputFormat

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
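For reference, the reporter's settings above correspond to hive-site.xml entries roughly like the sketch below. This is only an illustration of the configuration described in the report, not a fix; the final comment about mapreduce.job.max.split.locations is my own assumption, noted because the "maxsize: 10" in the CompactorInputSplit error matches that Hadoop property's default value.

{noformat}
<!-- Sketch only: the ACID/compaction settings listed in this report, as hive-site.xml entries. -->
<property>
  <name>hive.txn.manager</name>
  <value>org.apache.hadoop.hive.ql.lockmgr.DbTxnManager</value>
</property>
<property>
  <name>hive.compactor.initiator.on</name>
  <value>true</value>
</property>
<property>
  <name>hive.compactor.worker.threads</name>
  <value>5</value>
</property>
<property>
  <name>hive.vectorized.execution.enabled</name>
  <value>false</value>
</property>
<property>
  <name>hive.input.format</name>
  <value>org.apache.hadoop.hive.ql.io.HiveInputFormat</value>
</property>
<!-- Assumption: the "maxsize: 10" in the split error appears to be Hadoop's
     mapreduce.job.max.split.locations default (mapred-site.xml). Raising it would
     suppress that particular failure, but would not address the accumulation of
     thousands of delta files that this issue is about. -->
{noformat}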