[ https://issues.apache.org/jira/browse/TEZ-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14387870#comment-14387870 ]
Siddharth Seth commented on TEZ-2237:
-------------------------------------

bq. that a less noisy flag-based approach would be nicer;
Agreed. It's possible to run into a similar case with SortedOutput as well - it uses the same mechanism to figure out that a buffer is too small.

On the latest run that was posted (application_1427324000018_1444.yarn-logs.red.txt), several things are happening:
- Tasks end up getting pre-empted because we've ended up scheduling some tasks out of order.
- This would typically cause tasks to be KILLED and re-run. It looks like you're using a branch / release which doesn't contain TEZ-1929, which fixes this.
- These pre-empted tasks are treated as FAILURES, which eventually causes the nodes orc1 and orc2 to be blacklisted. All tasks which previously ran there are re-run.

Despite this, the DAG should have continued - I'm not sure why there was a long pause afterwards; I haven't looked further. At the end, it looks like tasks from vertex_14 and vertex_80 were running. The dependencies for _14 seem to be done - but only 35 events were sent (instead of 70) for attempt_1427324000018_1444_1_15_000000_0. This needs more looking into, if anyone else wants to pick up from here.

Couple of things to try:
- Apply TEZ-1929.
- Set "tez.am.dag.scheduler.class" to "org.apache.tez.dag.app.dag.impl.DAGSchedulerNaturalOrderControlled" in tez-site.xml to avoid the out-of-order execution (see the sketch below).
Let's see if that makes progress.
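For reference, a minimal sketch of the tez-site.xml entry - the property name and value are exactly the ones above; the surrounding layout is just the standard Hadoop configuration file format:

{code:xml}
<?xml version="1.0"?>
<configuration>
  <!-- Illustrative snippet: use the controlled natural-order DAG
       scheduler to avoid out-of-order task scheduling. -->
  <property>
    <name>tez.am.dag.scheduler.class</name>
    <value>org.apache.tez.dag.app.dag.impl.DAGSchedulerNaturalOrderControlled</value>
  </property>
</configuration>
{code}

This is read at AM startup, so it should only take effect for newly launched AMs.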
> BufferTooSmallException raised in UnorderedPartitionedKVWriter then DAG lingers
> --------------------------------------------------------------------------------
>
>                 Key: TEZ-2237
>                 URL: https://issues.apache.org/jira/browse/TEZ-2237
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.6.0
>        Environment: Debian Linux "jessie"
>                     OpenJDK Runtime Environment (build 1.8.0_40-internal-b27)
>                     OpenJDK 64-Bit Server VM (build 25.40-b25, mixed mode)
>                     7 * Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz, 16/24 GB RAM per node, 1*system disk + 4*1 or 2 TiB HDD for HDFS & local (on-prem, dedicated hardware)
>                     Scalding 0.13.1 modified with https://github.com/twitter/scalding/pull/1220 to run Cascading 3.0.0-wip-90 with TEZ 0.6.0
>            Reporter: Cyrille Chépélov
>         Attachments: all_stacks.lst, alloc_mem.png, alloc_vcores.png, application_1427324000018_1444.yarn-logs.red.txt.gz, appmaster____syslog_dag_1427282048097_0215_1.red.txt.gz, appmaster____syslog_dag_1427282048097_0237_1.red.txt.gz, gc_count_MRAppMaster.png, mem_free.png, ordered-grouped-kv-input-traces.diff, start_containers.png, stop_containers.png, syslog_attempt_1427282048097_0215_1_21_000014_0.red.txt.gz, syslog_attempt_1427282048097_0237_1_70_000028_0.red.txt.gz, yarn_rm_flips.png
>
> On a specific DAG with many vertices (actually part of a larger meta-DAG), after about an hour of processing, several BufferTooSmallExceptions are raised in UnorderedPartitionedKVWriter (about one every two or three spills).
> Once these exceptions are raised, the DAG remains indefinitely "active", tying up memory and CPU resources as far as YARN is concerned, while little if any actual processing takes place.
> It seems two separate issues are at hand:
> 1. BufferTooSmallExceptions are raised even though, small as the actually allocated buffers seem to be (around a couple of megabytes were allotted whereas 100MiB were requested), the actual keys and values are never bigger than 24 and 1024 bytes respectively.
> 2. In the event BufferTooSmallExceptions are raised, the DAG fails to stop (stop requests appear to be sent 7 hours after the BTSE exceptions are raised, but 9 hours after these stop requests the DAG was still lingering on, with all containers present tying up memory and CPU allocations).
> The emergence of the BTSEs prevents the Cascade from completing, which in turn prevents validating the results against the traditional MR1-based results. As long as the DAG never concludes, the cluster queue remains unavailable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)