[
https://issues.apache.org/jira/browse/HADOOP-4586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12655608#action_12655608
]
Francesco Salbaroli commented on HADOOP-4586:
---------------------------------------------
I will release a preliminary test version of fault-tolerant Hadoop before
17 December.
Features will include:
-Reliable multicast communication through the JGroups 2.6.7 toolkit, whose
highly configurable protocol stack adapts to different environments (I will
post documentation about it).
-A layer that wraps completely around the Hadoop source code, to minimize
modifications to the source tree.
-Dynamic JobTracker address resolution, using HDFS as the supporting store.
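To illustrate the last point, here is a minimal sketch of how address resolution through a shared store might work. It is not the code from the announced release. The class name, the file name `jobtracker.address`, and the host names are hypothetical, and a local directory stands in for HDFS so that the example is self-contained.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical sketch: the active JobTracker publishes its address to a
// well-known file in a shared store, and clients read it back on demand.
// A local directory is used here as a stand-in for HDFS (assumption).
public class JobTrackerAddressRegistry {
    private final Path addressFile;

    public JobTrackerAddressRegistry(Path sharedDir) {
        // Hypothetical well-known file name.
        this.addressFile = sharedDir.resolve("jobtracker.address");
    }

    // Called by whichever JobTracker replica becomes the active one.
    public void publish(String hostPort) throws IOException {
        // Write to a temporary file first, then rename it into place, so
        // that readers never observe a partially written address.
        Path tmp = Files.createTempFile(addressFile.getParent(), "jobtracker", ".tmp");
        Files.write(tmp, hostPort.getBytes(StandardCharsets.UTF_8));
        Files.move(tmp, addressFile,
                StandardCopyOption.REPLACE_EXISTING,
                StandardCopyOption.ATOMIC_MOVE);
    }

    // Called by TaskTrackers and clients before contacting the JobTracker.
    public String resolve() throws IOException {
        byte[] raw = Files.readAllBytes(addressFile);
        return new String(raw, StandardCharsets.UTF_8).trim();
    }

    public static void main(String[] args) throws IOException {
        Path shared = Files.createTempDirectory("shared-store");
        JobTrackerAddressRegistry registry = new JobTrackerAddressRegistry(shared);

        registry.publish("jt1.example.com:9001");   // primary starts
        System.out.println(registry.resolve());

        registry.publish("jt2.example.com:9001");   // replica takes over
        System.out.println(registry.resolve());
    }
}
```

After a failover the new active replica only has to call `publish`, and clients that call `resolve` again pick up the new address without any configuration change.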
Enhancements planned for future versions:
-Higher level of abstraction
-Better exception handling
I'll post the source code at the beginning of next week (hopefully).
Could I be added to the list of developers?
Best regards,
Francesco
> Fault tolerant Hadoop Job Tracker
> ---------------------------------
>
> Key: HADOOP-4586
> URL: https://issues.apache.org/jira/browse/HADOOP-4586
> Project: Hadoop Core
> Issue Type: New Feature
> Components: mapred
> Affects Versions: 0.18.0
> Environment: High availability enterprise system
> Reporter: Francesco Salbaroli
> Attachments: FaultTolerantHadoop.pdf
>
> Original Estimate: 2016h
> Remaining Estimate: 2016h
>
> The Hadoop framework has been designed, in an effort to enhance performance,
> with a single JobTracker (master node). Its responsibilities range from
> managing the job submission process and computing the input splits to
> scheduling tasks on the slave nodes (TaskTrackers) and monitoring their health.
> In some environments, such as IBM and Google's Internet-scale computing
> initiative, high availability is required and performance becomes a
> secondary issue. In these environments, having a system with a Single Point
> of Failure (such as Hadoop's single JobTracker) is a major concern.
> My proposal is to provide a redundant version of Hadoop by adding
> support for multiple replicated JobTrackers. This design can be approached
> in many different ways.
> In the document at:
> http://sites.google.com/site/hadoopthesis/Home/FaultTolerantHadoop.pdf?attredirects=0
> I wrote an overview of the problem and some approaches to solve it.
> I post this to the community to gather feedback on the best way to proceed in
> my work.
> Thank you!
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.