Hi,
A JobClient is what validates your job configuration, ships the
necessary files to the cluster, and notifies the JobTracker of the
new job. Afterwards, its responsibility is merely to monitor progress
via reports from the JobTracker (MR1) or the ApplicationMaster (MR2).
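As a rough sketch of that submit-and-monitor flow (the class name,
job name and argument paths below are made up for illustration, not
from this thread):

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapred.FileInputFormat;
    import org.apache.hadoop.mapred.FileOutputFormat;
    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.RunningJob;

    public class SubmitExample {
      public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(SubmitExample.class);
        conf.setJobName("submit-example"); // illustrative name
        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));
        // runJob() validates the configuration, ships the job files
        // to the cluster, notifies the JobTracker, then polls it for
        // progress until the job finishes.
        RunningJob job = JobClient.runJob(conf);
        System.out.println("Succeeded: " + job.isSuccessful());
      }
    }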
Currently the DistributedCache is populated before the job runs,
hence both the Map and Reduce phases carry the same items. With MR2,
the approach Robert describes above should work better instead.
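For illustration only (the cached file URI here is invented), this is
what populating the cache before submission looks like, so both
phases see the same file:

    import java.net.URI;
    import org.apache.hadoop.filecache.DistributedCache;
    import org.apache.hadoop.mapred.JobConf;

    public class CacheSetup {
      public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(CacheSetup.class);
        // Registered before the job is submitted, so Map and Reduce
        // tasks both see the same cached item.
        DistributedCache.addCacheFile(
            new URI("/user/me/lookup.dat#lookup"), conf);
        // ... set input/output paths and submit as usual ...
      }
    }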
On Sat, Apr 21, 2012 at 5:21 AM, JAX jayunit...@gmail.com wrote:
No, reducers can't access mapper counters.
Context is what the new MR API offers; it wraps a Reporter object and
provides the other helpful functions and data you'd require within a
task (it lives up to its name).
Reporter was the raw object provided in the old MR API that lets one
report progress, set status, etc. In the new API, you
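A small sketch under the new API (the counter group and counter name
below are hypothetical):

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class CountingMapper
        extends Mapper<LongWritable, Text, Text, LongWritable> {
      @Override
      protected void map(LongWritable key, Text value, Context context)
          throws IOException, InterruptedException {
        // Context wraps the old Reporter: progress, status and
        // counters all go through it.
        context.getCounter("MyApp", "RecordsSeen").increment(1);
        context.setStatus("at offset " + key.get());
        context.progress();
      }
    }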
Hi,
in my former job:
production, Germany, web portal. Throughput 600 MB/minute. Logfiles
from Windows IIS and Apache. Used in the usual way, no custom
decorators or sinks. Simply syslog -> bucketing (1-minute rollover)
-> HDFS, split into minutes (MMDDHHMM).
Stable, with some issues (you'll find on
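For the curious, the MMDDHHMM bucketing above just keys the HDFS
directory by minute; a toy sketch of building such a path (the /logs
prefix is invented):

    import java.text.SimpleDateFormat;
    import java.util.Date;

    public class BucketPath {
      public static void main(String[] args) {
        // 1-minute rollover: each event lands in a directory named
        // after its month/day/hour/minute.
        SimpleDateFormat fmt = new SimpleDateFormat("MMddHHmm");
        System.out.println("/logs/" + fmt.format(new Date()));
      }
    }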
Karl,
since you asked for alternatives: people using MapR prefer to use the
NFS access to directly deposit data (or access it). It works
seamlessly from all Linuxes, Solaris, Windows, AIX and a myriad of
other legacy systems without having to load any agents on those
machines. And it is fully
Thanks J Harsh:
I have another question, though ---
You mentioned that:
"The client needs access to the DataNodes (for actually writing the
previous files to DFS for the JobTracker to pick up)"
What do you mean by previous files? It seems like, if designing
Hadoop from scratch, I wouldn't
By previous files I meant the job-related files there. DataNodes are
persistent members of HDFS; removal of a DN results in loss of
blocks. Usually replication handles DN failures flawlessly, but
consider a 1-replication cluster: a DN downtime is not acceptable in
that case.
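As an aside, if you are stuck at replication 1 you can raise it per
file; a minimal sketch (the file path is illustrative):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class RaiseReplication {
      public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        // With replication 1, losing the DataNode that holds a block
        // loses the block; a higher factor tolerates DN downtime.
        fs.setReplication(new Path("/user/me/important.dat"), (short) 3);
      }
    }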
We decided NO product and vendor advertising on Apache mailing lists!
I do not understand why you would put that closed-source stuff from
your employer in the room. It has nothing to do with Flume or the use
cases!
--
Alexander Lorenz
http://mapredit.blogspot.com
On Apr 21, 2012, at 4:06 PM, M. C.
It seems pretty relevant. If you can directly log via NFS, that is a
viable alternative.
On Sat, Apr 21, 2012 at 11:42 AM, alo alt wget.n...@googlemail.com wrote:
We decided NO product and vendor advertising on apache mailing lists!
I do not understand why you would put that closed-source stuff from
your employer in the room.
Hello,
I am fairly new to Hadoop and I am trying to figure out how to find
the full NameNode URI with port and the full JobTracker URI with
port, for usage with the new OceanSync Hadoop management software
that came out. The software is asking for two configuration
properties and I am trying to
Can the NFS become the bottleneck?
Chen
On Sat, Apr 21, 2012 at 5:23 PM, Edward Capriolo edlinuxg...@gmail.com wrote:
It seems pretty relevant. If you can directly log via NFS that is a
viable alternative.
On Sat, Apr 21, 2012 at 11:42 AM, alo alt wget.n...@googlemail.com
wrote:
We decided NO product and vendor advertising on Apache mailing lists!
Hi
Can you tell us how you started Hadoop? Those are the locations where
the Hadoop NameNode is running.
http://hadoop.apache.org/common/docs/current/single_node_setup.html
The link above has detailed info about these settings and the Hadoop
install.
If you are new to Hadoop then you should not
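If it helps, here is a minimal sketch that prints both addresses from
the configuration files on the classpath (property names are the
MR1-era ones):

    import org.apache.hadoop.mapred.JobConf;

    public class ShowAddresses {
      public static void main(String[] args) {
        // JobConf picks up core-site.xml and mapred-site.xml.
        JobConf conf = new JobConf();
        System.out.println("NameNode URI: " + conf.get("fs.default.name"));
        System.out.println("JobTracker:   " + conf.get("mapred.job.tracker"));
      }
    }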
No. That is the Flume open source mailing list, not a vendor list.
NFS logging has nothing to do with decentralized collectors like
Flume, JMS or Scribe.
sent via my mobile device
On Apr 22, 2012, at 12:23 AM, Edward Capriolo edlinuxg...@gmail.com wrote:
It seems pretty relevant. If you can directly log via NFS, that is a
viable alternative.