[ https://issues.apache.org/jira/browse/HADOOP-4595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12645379#action_12645379 ]
Devaraj Das commented on HADOOP-4595:
-------------------------------------
bq. Note that both existing JVMs have their 'busy' status as true. But
numFreeSlots was > 0;
I assume you mean that the number of slots is 2.
bq. the preceding log entry was: 2008-11-05 08:57:55,296 INFO
org.apache.hadoop.mapred.TaskTracker: In TaskLauncher, current free slots : 1
and trying to launch attempt_200811040110_0230_r_000011_1
Did you notice whether this attempt got launched at all? Did it get killed
immediately or something?
bq. The reduce output (part-XXXXX) gets lost when this attempt fails, even
though the other (earlier) attempt succeeded.
I didn't understand this part. Could you please explain?
> JVM Reuse triggers RuntimeException("Invalid state")
> ----------------------------------------------------
>
> Key: HADOOP-4595
> URL: https://issues.apache.org/jira/browse/HADOOP-4595
> Project: Hadoop Core
> Issue Type: Bug
> Components: mapred
> Affects Versions: 0.19.0
> Reporter: Aaron Kimball
>
> A Reducer triggers the following exception:
> 08/11/05 08:58:50 INFO mapred.JobClient: Task Id : attempt_200811040110_0230_r_000008_1, Status : FAILED
> java.lang.RuntimeException: Inconsistent state!!! JVM Manager reached an
> unstable state while reaping a JVM for task:
> attempt_200811040110_0230_r_000008_1 Number of active JVMs:2
> JVMId jvm_200811040110_0230_r_-735233075 #Tasks ran: 0 Currently busy? true
> Currently running: attempt_200811040110_0230_r_000012_0
> JVMId jvm_200811040110_0230_r_-1716942642 #Tasks ran: 0 Currently busy? true
> Currently running: attempt_200811040110_0230_r_000040_0
> at java.lang.Throwable.<init>(Throwable.java:67)
> at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.reapJvm(JvmManager.java:245)
> at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.access$000(JvmManager.java:113)
> at org.apache.hadoop.mapred.JvmManager.launchJvm(JvmManager.java:78)
> at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:410)
> Other clues:
> In all three reduce task attempts where this was observed, the failing attempt
> was _1. Attempt _0 had started and eventually switched to "SUCCEEDED", so I think
> this is happening only on speculatively executed reduce task attempts. The
> reduce output (part-XXXXX) gets lost when this attempt fails, even though the
> other (earlier) attempt succeeded.