Hi all,
first, many thanks for the quality of the work you are doing: thanks a lot.
I am facing a bug in memory management at shuffle time; I regularly get
Map output copy failure : java.lang.OutOfMemoryError: Java heap space
at
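If this is Hadoop 1.x, a few configuration knobs usually govern memory use during the shuffle copy phase. A hedged sketch for mapred-site.xml (the values below are illustrative, not recommendations; tune them to your task heap):

```xml
<!-- Sketch for Hadoop 1.x; property names from mapred-default.xml -->
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx1024m</value> <!-- raise the per-task heap -->
</property>
<property>
  <name>mapred.job.shuffle.input.buffer.percent</name>
  <value>0.50</value> <!-- default 0.70; lower so in-memory map-output copies use less of the reducer heap -->
</property>
<property>
  <name>mapred.reduce.parallel.copies</name>
  <value>5</value> <!-- default 5; fewer concurrent fetches need less memory at once -->
</property>
```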
I'm curious about profiling, I see some documentation about it (1.0.3 on
AWS), but the references to JobConf seem to be for the old api and I've
got everything running on the new api.
I've got a job to handle processing of about 30GB of compressed CSVs and
it's taking over a day with 3
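On the profiling question: the JobConf profiling setters are thin wrappers over plain configuration properties, so with the new API the same keys can be set on the job's Configuration (via job.getConfiguration()) or passed as -D flags. A hedged sketch of the Hadoop 1.x property names:

```xml
<!-- Per-job profiling settings; also settable from Java via job.getConfiguration() or with -D on the command line -->
<property><name>mapred.task.profile</name><value>true</value></property>
<property><name>mapred.task.profile.maps</name><value>0-2</value></property>
<property><name>mapred.task.profile.reduces</name><value>0-2</value></property>
```

The hprof output lands in the task logs; mapred.task.profile.params controls the agent options if the defaults don't suit.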
Hi Peter,
Can you also track details such as which nodes your
mappers/reducers are running on in each execution?
As your data might be replicated across different nodes,
each time your job runs the JobTracker might schedule your tasks to run on
different nodes in the
Generally true for the framework config files, but some of the
supplementary features can be refreshed without a restart, e.g. scheduler
configuration and host files (for included/excluded nodes) ...
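As a sketch, in Hadoop 1.x the admin commands for those refreshable pieces look like this (assuming the include/exclude host files and queue settings are already referenced from your config files):

```
# Re-read the HDFS include/exclude host files without restarting the NameNode
hadoop dfsadmin -refreshNodes

# Re-read the MapReduce include/exclude host files on the JobTracker
hadoop mradmin -refreshNodes

# Reload queue configuration (e.g. queue ACLs) without a restart
hadoop mradmin -refreshQueues
```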
On Tue, Dec 4, 2012 at 5:33 AM, Cristian Cira
<cmc0...@tigermail.auburn.edu> wrote:
No. You will
I noticed that you are using JDK 1.7; personally I prefer 1.6.x.
If your firewall is OK, you can check your RPC service to see if it is also
OK, and test it with telnet 10.130.110.80 50020.
I suggested Hive because HQL (SQL-like) is familiar to most people, and the
learning curve is smooth.
Hi, I am new to Hadoop, trying to configure it in fully distributed mode.
But after the command bin/start-all.sh, bin/start-mapred.sh, or
bin/hadoop-daemon.sh start tasktracker, the TaskTracker goes down
immediately without any error in the log file.
Help please...
I'm using Hadoop version 1.0.3.
The entire content of the TaskTracker log is pasted here, with an attachment
also. Thank you for the reply.
2012-12-04 15:20:45,942 INFO org.apache.hadoop.mapred.TaskTracker: STARTUP_MSG:
/
STARTUP_MSG: Starting TaskTracker
STARTUP_MSG: host
Hey,
We are running MapReduce jobs against a 12-machine HBase cluster, and
for a long time they took approx. 30 minutes to return a result against ~95
million rows. Without any major changes to the data or any upgrade of
HBase/Hadoop, they now seem to be taking about 4 hours, and the logs are
full of
Hi Haitao,
To help isolate the issue, what happens if you run a different job? Also, if you
view the NameNode web UI, or the web UI of the specific DataNode having the
issue, are there any indicators of it being down?
Regards,
Robert
On Tue, Dec 4, 2012 at 12:49 AM, panfei cnwe...@gmail.com wrote:
I noticed that
What does jps return on your namenode and on one of the datanodes?
Cristian Cira
Graduate Research Assistant
Parallel Architecture and System Laboratory(PASL)
Shelby Center 2105
Auburn University, AL 36849
From: Robert Molina [rmol...@hortonworks.com]
The first rule to be wary of is your use of the combiner. The combiner *might*
be run, it *might not* be run, and it *might be run multiple times*. The
combiner is only for reducing the amount of data going to the reducer, and
it will only be run *if and when* it's deemed likely to be useful by
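That contract is why a combiner must be safe to apply zero, one, or many times. A plain-Java sketch (no Hadoop classes; the class and method names are illustrative, not Hadoop API) showing that a sum combiner gives the same final reduce result however many times it runs:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch of the combiner contract: the framework may apply the combiner
// zero, one, or many times, so the final reduce result must not change.
public class CombinerSketch {
    // A sum "combiner": collapses a list of partial counts into one partial count.
    static List<Integer> combine(List<Integer> values) {
        int sum = 0;
        for (int v : values) sum += v;
        return new ArrayList<>(Arrays.asList(sum));
    }

    // The "reducer" performs the same fold over whatever partials it receives.
    static int reduce(List<Integer> values) {
        int sum = 0;
        for (int v : values) sum += v;
        return sum;
    }

    public static void main(String[] args) {
        List<Integer> mapOutputs = Arrays.asList(1, 2, 3, 4);
        int noCombiner   = reduce(mapOutputs);                   // combiner skipped
        int oneCombiner  = reduce(combine(mapOutputs));          // combiner run once
        int twoCombiners = reduce(combine(combine(mapOutputs))); // combiner run twice
        System.out.println(noCombiner + " " + oneCombiner + " " + twoCombiners); // 10 10 10
    }
}
```

An operation that is not associative and commutative in this way (e.g. an average of values) breaks as soon as the framework decides to run, skip, or repeat the combiner.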