[
https://issues.apache.org/jira/browse/HADOOP-3999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12651962#action_12651962
]
Kai Mosebach commented on HADOOP-3999:
--------------------------------------
The basic capability plugin system is mostly done, but I have some structural
problems/questions that you might hopefully be able to help me out with:
Status:
- I currently hook the execution of the collector plugins into both the
DataNode startup and the TaskTracker startup.
- The results should be persisted locally with a timestamp so that expensive
plugins (like searching for a binary, hard-disk performance checks, etc.) are
not run too often (rough sketch below).
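To illustrate the second point, here is a minimal sketch of such a timestamped
local cache around a collector plugin. The names (CapabilityCollector,
CapabilityCache) are made up for this sketch and are not part of the actual
patch:

import java.io.*;
import java.util.Properties;

// Hypothetical contract for a collector plugin (names are assumptions).
interface CapabilityCollector {
  String name();                           // key under which results are stored
  Properties collect() throws IOException; // runs the (possibly expensive) probe
  long maxAgeMs();                         // how long a cached result stays valid
}

// Persists collector results locally with a timestamp so that expensive
// probes (disk benchmarks, binary lookups) are not re-run on every startup.
class CapabilityCache {
  private final File dir;

  CapabilityCache(File dir) {
    this.dir = dir;
    dir.mkdirs();
  }

  Properties getOrCollect(CapabilityCollector c) throws IOException {
    File f = new File(dir, c.name() + ".properties");
    if (f.exists()
        && System.currentTimeMillis() - f.lastModified() < c.maxAgeMs()) {
      Properties cached = new Properties();
      InputStream in = new FileInputStream(f);
      try { cached.load(in); } finally { in.close(); }
      return cached;                       // cached result is still fresh
    }
    Properties fresh = c.collect();        // expensive path
    OutputStream out = new FileOutputStream(f);
    try { fresh.store(out, "collected by " + c.name()); } finally { out.close(); }
    return fresh;                          // file mtime acts as the timestamp
  }
}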
Questions:
- Where should I put the configuration so that it is available throughout the
cluster (especially to the NameNode and the JobTracker)? Would DatanodeInfo be
a good place?
- Would it make sense to merge the capabilities into the generic conf
structure? (See the sketch below.)
- Plugins (.class files, shell and Perl scripts) currently reside in
$HADOOP_HOME/plugins. I am not quite happy with that and not yet sure where to
place them in the build stack. Any recommendations? Maybe
$HADOOP_HOME/bin/plugins?
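Regarding the second question, merging into the generic conf could look roughly
like this; the "node.capability." key prefix is only an assumption for the
sketch, not an agreed convention:

import org.apache.hadoop.conf.Configuration;

public class CapabilityConfExample {
  public static void main(String[] args) {
    Configuration conf = new Configuration();

    // values a collector plugin might have produced on this node
    conf.set("node.capability.os.name", System.getProperty("os.name"));
    conf.set("node.capability.os.arch", System.getProperty("os.arch"));
    conf.set("node.capability.software.r.path", "/usr/bin/R");

    // consumers (e.g. the JobTracker) would read them back the same way
    System.out.println(conf.get("node.capability.os.arch"));
  }
}

The downside is that the generic conf is fairly static per node, while some
capabilities (benchmark results, installed software) change over time, so a
separate timestamped store might still be needed alongside it.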
> Need to add host capabilities / abilities
> -----------------------------------------
>
> Key: HADOOP-3999
> URL: https://issues.apache.org/jira/browse/HADOOP-3999
> Project: Hadoop Core
> Issue Type: Improvement
> Components: metrics
> Environment: Any
> Reporter: Kai Mosebach
>
> The MapReduce paradigm is limited to running jobs against the lowest common
> denominator of all nodes in the cluster.
> On the one hand this is intended (cloud computing: throw simple jobs in,
> never mind which node runs them).
> On the other hand this limits the possibilities quite a lot; for instance,
> if you have data which could or needs to be fed to a third-party interface
> like MATLAB, R, or BioConductor, you could solve many more jobs via Hadoop.
> Furthermore it could be interesting to know the OS, the architecture, and
> the performance of a node in relation to the rest of the cluster
> (performance ranking).
> For example, if a sub-cluster of very compute-capable nodes or a sub-cluster
> of very fast disk-I/O nodes were known, the JobTracker could select such
> nodes according to a so-called job profile (e.g. "my job is compute-heavy" /
> "my job is disk-I/O-heavy"), which a developer can usually estimate
> beforehand.
> To achieve this, node capabilities could be introduced and stored in the
> DFS, giving you
> a1.) basic information about each node (OS, architecture)
> a2.) more sophisticated information (additional software, path to the
> software, version)
> a3.) KPIs collected about the node (disk I/O, CPU power, memory)
> a4.) network throughput to neighboring hosts, which might allow generating a
> network performance map of the cluster
> This would allow you to
> b1.) generate jobs that have a profile (compute-intensive, disk-I/O
> intensive, network-I/O intensive)
> b2.) generate jobs that have software dependencies (run on Linux only, run
> on nodes with MATLAB only)
> b3.) generate a performance map of the cluster (sub-clusters of fast disk
> nodes, sub-clusters of fast CPU nodes, a network-speed relation map between
> nodes)
> From step b3) you could then even acquire statistical information which
> could in turn be fed back to the DFS NameNode to decide whether data should
> be stored on fast-disk sub-clusters only (though that might need to be a
> tool outside of Hadoop core).
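To make b1)/b2) from the quoted description more concrete, here is a minimal
sketch of how a job profile could be matched against collected node
capabilities. The class names and capability keys are invented for this sketch
and are not part of Hadoop:

import java.util.HashMap;
import java.util.Map;

// Hypothetical capability store for a single node.
class NodeCapabilities {
  final Map<String, String> values = new HashMap<String, String>();

  // A node is eligible if it provides every required key with the
  // required value.
  boolean satisfies(Map<String, String> required) {
    for (Map.Entry<String, String> e : required.entrySet()) {
      if (!e.getValue().equals(values.get(e.getKey()))) {
        return false;
      }
    }
    return true;
  }
}

public class JobProfileMatch {
  public static void main(String[] args) {
    NodeCapabilities node = new NodeCapabilities();
    node.values.put("os.name", "Linux");
    node.values.put("software.r.installed", "true");

    // job profile: "run only on Linux nodes that have R installed"
    Map<String, String> jobProfile = new HashMap<String, String>();
    jobProfile.put("os.name", "Linux");
    jobProfile.put("software.r.installed", "true");

    System.out.println("node eligible: " + node.satisfies(jobProfile));
  }
}

A real scheduler would presumably rank the eligible nodes by the performance
figures from a3)/a4) rather than only filtering on exact key/value matches.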