Hi,
I have lots of small jobs and would like to compute the aggregate
running time of all the mappers and reducers in my job history rather
than tally the numbers by hand through the web interface. I know that
the Reporter object can be used to output performance numbers for a
single job
e.org/core/docs/current/api/org/apache/hadoop/dfs/FSNamesystemMetrics.html
Hope this is helpful.
--Konstantin
Shirley Cohen wrote:
Hi,
I would like to measure the disk i/o performance of our hadoop
cluster. However, running iostat on 16 nodes is rather cumbersome.
Does dfs keep track of any stats like the number of blocks or bytes
read and written? From scanning the api, I found a class called
"org.apache.hadoop.fs.F
Hi,
I'm trying to figure out which log files are used by the job
tracker's web interface to display the following information:
Job Name: my job
Job File: hdfs://localhost:9000/tmp/hadoop-scohen/mapred/system/
job_200809260816_0001/job.xml
Status: Succeeded
Started at: Fri Sep 26 08:18:04 CD
Thanks Owen! I found the bug in my code: Doing collect twice does
work now :))
Shirley
On Sep 9, 2008, at 4:19 PM, Owen O'Malley wrote:
On Sep 9, 2008, at 12:20 PM, Shirley Cohen wrote:
I have a simple reducer that computes the average by doing a sum/
count. But I want to output both the average and the count for a
given key, not just the average. Is it possible to output both values
from the same invocation of the reducer? Or do I need two reducer
invocations? If I try to
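For what it's worth, the fix Owen pointed to is simply that a reducer may call collect more than once, or pack both numbers into one composite value. A toy pure-Python sketch of the composite-value flavor (names are illustrative, not Hadoop's API):

```python
def average_reducer(key, values):
    """Toy reducer: emit both the average and the count for one key."""
    total = 0
    count = 0
    for v in values:
        total += v
        count += 1
    # Emit a composite value so a single invocation yields both numbers.
    return key, (total / count, count)

# e.g. average_reducer("dept-a", [2, 4, 6]) -> ("dept-a", (4.0, 3))
```

In real Hadoop the composite value would be a custom Writable (or a formatted Text); the one-invocation shape is the same.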
Shirley
On Sep 7, 2008, at 8:38 AM, 叶双明 wrote:
Are you sure there isn't any error or exception in logs?
2008/9/5, Shirley Cohen <[EMAIL PROTECTED]>:
Hi Dmitry,
Thanks for your suggestion. I checked and the other systems on the
cluster
do seem to have java installed. I was also able
output from hadoop. If it helps - can you submit a bug request? :)
-Original Message-
From: Shirley Cohen [mailto:[EMAIL PROTECTED]
Sent: Thursday, September 04, 2008 10:07 AM
To: core-user@hadoop.apache.org
Subject: no output from job run on cluster
Hi,
I'm running on hadoop-0.18.0.
Thanks, Owen. This fixed my problem!
Shirley
On Sep 2, 2008, at 8:44 PM, Owen O'Malley wrote:
On Tue, Sep 2, 2008 at 10:24 AM, Shirley Cohen
<[EMAIL PROTECTED]> wrote:
Hi,
I'm trying to write the output of two different map-reduce jobs
into the
same output dire
Hi,
I'm running on hadoop-0.18.0. I have a m-r job that executes
correctly in standalone mode. However, when run on a cluster, the
same job produces zero output. It is very bizarre. I looked in the
logs and couldn't find anything unusual. All I see are the usual
deprecated filesystem name
Hi,
I'm trying to write the output of two different map-reduce jobs into
the same output directory. I'm using MultipleOutputFormats to set the
filename dynamically, so there is no filename collision between the
two jobs. However, I'm getting the error "output directory already
exists".
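One workaround people use for the "output directory already exists" check is to let each job write into its own directory and move the part files together afterwards. A toy pure-Python sketch of that idea (merge_job_outputs is a made-up helper, not a Hadoop API; the thread's MultipleOutputFormat trick is what keeps the file names from colliding):

```python
import os
import shutil
import tempfile

def merge_job_outputs(final_dir, job_dirs):
    """Move every output file from each job's private directory
    into one shared directory (assumes distinct file names)."""
    os.makedirs(final_dir, exist_ok=True)
    for job_dir in job_dirs:
        for name in os.listdir(job_dir):
            shutil.move(os.path.join(job_dir, name),
                        os.path.join(final_dir, name))

# Simulate two jobs that each wrote their own output directory.
base = tempfile.mkdtemp()
j1 = os.path.join(base, "job1"); os.makedirs(j1)
j2 = os.path.join(base, "job2"); os.makedirs(j2)
with open(os.path.join(j1, "part-a"), "w") as f:
    f.write("1\n")
with open(os.path.join(j2, "part-b"), "w") as f:
    f.write("2\n")
merge_job_outputs(os.path.join(base, "final"), [j1, j2])
```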
Thanks, Benjamin. Your example saved me a lot of time :))
Shirley
On Aug 28, 2008, at 8:03 AM, Benjamin Gufler wrote:
Hi Shirley,
On 2008-08-28 14:32, Shirley Cohen wrote:
Do you have an example that shows how to use MultipleOutputFormat?
using MultipleOutputFormat is actually pretty easy
. With MultipleOutputFormat you can't.
And, if I'm not mistaken, when using MultipleOutputFormat in a map you
can't have a reduce phase; with MultipleOutputs you can.
A
On Thu, Aug 28, 2008 at 3:36 AM, Shirley Cohen
<[EMAIL PROTECTED]> wrote:
Hi,
I would like the reducer to output to different files based upon the
value of the key. I understand that both MultipleOutputs and
MultipleOutputFormat can do this. Is that correct? However, I don't
understand the differences between these two classes. Can someone
explain the differenc
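Whatever the API differences, the effect both classes aim at is the same: fanning one output stream out to files named after the key. A toy pure-Python sketch of that effect (write_by_key is an illustrative name, not Hadoop code):

```python
import os
import tempfile

def write_by_key(records, out_dir):
    """Toy version of what MultipleOutputFormat achieves: route each
    (key, value) record to a file named after its key."""
    handles = {}
    try:
        for key, value in records:
            if key not in handles:
                path = os.path.join(out_dir, "part-%s" % key)
                handles[key] = open(path, "w")
            handles[key].write("%s\n" % value)
    finally:
        for h in handles.values():
            h.close()

out_dir = tempfile.mkdtemp()
write_by_key([("a", 1), ("b", 2), ("a", 3)], out_dir)
# out_dir now contains part-a (records 1, 3) and part-b (record 2)
```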
Hi,
What is the best way to do a distinct count in m-r? Is there any way
of doing it with one reduce instead of two?
Thanks,
Shirley
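One-pass flavor, as a toy sketch: if the map emits each value as a key, the shuffle sort groups duplicates together, so a single reduce pass only has to count distinct groups. Pure-Python simulation (sorted() stands in for the shuffle; none of this is Hadoop API):

```python
from itertools import groupby

def map_phase(records):
    # Emit each value as a key with a dummy payload.
    return [(v, None) for v in records]

def distinct_count(records):
    """One-pass distinct count: sort (the shuffle), then count groups."""
    pairs = sorted(map_phase(records))
    return sum(1 for _key, _grp in groupby(pairs, key=lambda kv: kv[0]))

# distinct_count([3, 1, 3, 2, 1]) -> 3
```

In a real job the "count groups" step still needs a single reducer (or a second, tiny aggregation), which is why the two-job form is common.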
stage,
because the partitioner only considers the size of blocks in bytes.
Instead you can output the intermediate key/value pair like this:
key: 1 if C=1,3,5,7, 0 otherwise
value: the tuple.
In the reduce phase you can then have a single reducer deal with all
the keys with C=1,3,5,7.
On Mon, Aug 4, 2008 at 3:29 PM, Sh
Hi,
I want to implement some data partitioning logic where a mapper is
assigned a specific range of values. Here is a concrete example of
what I have in mind:
Suppose I have attributes A, B, C and the following tuples:
(A, B, C)
(1, 3, 1)
(1, 2, 2)
(1, 2, 3)
(12, 3, 4)
(12, 2, 5)
(12, 8, 6
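The reply above boils down to deriving the intermediate key from C's membership in a chosen set. A toy pure-Python sketch of that keying rule (ODD_SET and partition_key are illustrative names; in Hadoop this logic would live in the mapper or a custom Partitioner):

```python
ODD_SET = {1, 3, 5, 7}

def partition_key(tup):
    """Derive the intermediate key from an (A, B, C) tuple:
    1 if C falls in the chosen set, 0 otherwise."""
    a, b, c = tup
    return 1 if c in ODD_SET else 0

rows = [(1, 3, 1), (1, 2, 2), (1, 2, 3), (12, 3, 4), (12, 2, 5)]
keyed = [(partition_key(t), t) for t in rows]
# keys come out as 1, 0, 1, 0, 1
```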
Hi,
Does anyone know what the following error means?
hadoop-0.16.4/logs/userlogs/task_200808021906_0002_m_14_2]$ cat
syslog
2008-08-02 20:28:00,443 INFO
org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics
with processName=MAP, sessionId=
2008-08-02 20:28:00,684 INFO org.
Hi,
We're getting the following error when starting up hadoop on the
cluster:
2008-08-01 14:42:37,334 INFO org.apache.hadoop.dfs.DataNode:
STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG: host = node5.cube.disc.cias.ut
write an iterative script.
On Tue, Jul 29, 2008 at 9:57 AM, Shirley Cohen
<[EMAIL PROTECTED]> wrote:
Hi,
I want to call a map-reduce program recursively until some
condition is
met. How do I do that?
Thanks,
Shirley
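The usual answer is a small driver loop outside Hadoop that resubmits the job until the condition holds, feeding each round's output directory in as the next round's input. Toy pure-Python sketch (run_job is a stand-in for a real job submission, not a Hadoop call):

```python
def run_job(value):
    """Stand-in for one MapReduce pass; here it just halves the input."""
    return value // 2

def iterate_until(value, done, max_rounds=50):
    """Re-run the 'job' until the condition holds, with a safety cap
    so a condition that never converges can't loop forever."""
    rounds = 0
    while not done(value) and rounds < max_rounds:
        value = run_job(value)
        rounds += 1
    return value, rounds

# iterate_until(100, lambda v: v < 5) -> (3, 5)
```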
How do I partition the inputs to the mapper, such that a mapper
processes an entire file or files? What is happening now is that each
mapper receives only portions of a file and I want them to receive an
entire file. Is there a way to do that within the scope of the
framework?
Thanks,
Sh
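In Hadoop this is typically done by subclassing the input format so that isSplitable() returns false, which yields one split (and so one mapper) per file; treat the exact method name as a pointer to verify against your release. The difference in split generation, as a toy pure-Python sketch:

```python
def chunk_splits(files, chunk=64):
    """Default-style splitting: each file is cut into fixed-size pieces,
    so one mapper may see only part of a file."""
    splits = []
    for name, size in files:
        for off in range(0, size, chunk):
            splits.append((name, off, min(chunk, size - off)))
    return splits

def whole_file_splits(files):
    """Unsplittable-style: one split per file, so a mapper sees it all."""
    return [(name, 0, size) for name, size in files]

files = [("a.txt", 100), ("b.txt", 40)]
# chunk_splits(files) yields 3 splits; whole_file_splits(files) yields 2
```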
Hi,
How does one do a join operation in map reduce? Is there more than
one way to do a join? Which way works better and why?
Thanks,
Shirley
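One common pattern is the reduce-side join: tag each record with its source relation, use the join key as the map output key, and pair the tagged records per key in the reducer. Toy pure-Python sketch (the dict stands in for the shuffle's grouping; map-side joins trade this generality for speed when one input fits in memory):

```python
from collections import defaultdict

def reduce_side_join(left, right):
    """Toy reduce-side join of two lists of (key, value) records:
    group both sides by key, then pair the values per key."""
    grouped = defaultdict(lambda: ([], []))
    for key, val in left:
        grouped[key][0].append(val)   # tag: came from the left side
    for key, val in right:
        grouped[key][1].append(val)   # tag: came from the right side
    out = []
    for key, (lvals, rvals) in grouped.items():
        for lv in lvals:
            for rv in rvals:
                out.append((key, lv, rv))
    return sorted(out)

# reduce_side_join([(1, "a"), (2, "b")], [(1, "x")]) -> [(1, "a", "x")]
```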
n with something like Pig, where you have a
good
representation for internal optimizations, it is probably going to be
difficult to convert the two MR steps into one pre-aggregation and
two final
aggregations.
On 4/20/08 7:39 AM, "Shirley Cohen" <[EMAIL PROTECTED]> wrote:
n can be used to produce multiple low
definition
aggregates. I would find it very surprising if you could detect
these
sorts of situations.
On 4/16/08 5:26 PM, "Shirley Cohen" <[EMAIL PROTECTED]> wrote:
Dear Hadoop Users,
I'm writing to find out what you think about being able to
incrementally re-execute a map reduce job. My understanding is that
the current framework doesn't support it and I'd like to know
whether, in your opinion, having this capability could help to speed
up developme