duplicate START_TIME, FINISH_TIME timestamps in history log

2009-09-10 Thread Vasilis Liaskovitis
Hi,

I am getting different values for START_TIME, FINISH_TIME regarding
the exact same task when looking at the history log of a sorter job.
E.g. grepping for a particular reduce task in the history log:

masternode:/home/hadoop-git # grep -r
"task_200909031613_0002_r_02"
/home/vliaskov/hadoop-git/logs/history/doomhead.amd.com_1252012387435_job_200909031613_0002_root_sorter

Task TASKID="task_200909031613_0002_r_02" TASK_TYPE="REDUCE"
START_TIME="1252014285633" SPLITS="" .
ReduceAttempt TASK_TYPE="REDUCE"
TASKID="task_200909031613_0002_r_02"
TASK_ATTEMPT_ID="attempt_200909031613_0002_r_02_0"
START_TIME="1252012959359"
TRACKER_NAME="tracker_compute-5\.doomcluster\.amd\.com:localhost/127\.0\.0\.1:37388"
HTTP_PORT="50060" .
ReduceAttempt TASK_TYPE="REDUCE"
TASKID="task_200909031613_0002_r_02"
TASK_ATTEMPT_ID="attempt_200909031613_0002_r_02_0"
TASK_STATUS="SUCCESS" SHUFFLE_FINISHED="1252014856244"
SORT_FINISHED="1252014856723" FINISH_TIME="1252015219858"
Task TASKID="task_200909031613_0002_r_02" TASK_TYPE="REDUCE"
TASK_STATUS="SUCCESS" FINISH_TIME="1252016563501" ...

From the above you can see that START_TIME, for example, is reported
twice with different values:
START_TIME=1252014285633
START_TIME=1252012959359

The same goes for FINISH_TIME and possibly other timestamped events.
Does the history log gather messages from several tasktrackers and
from the jobtracker for one particular task? That might explain the
duplicate entries.

Could this be a problem because the cluster nodes are not synchronized
to a master node using e.g. NTP?
On that note, do people use ntpd and ntpq/ntpdc to synchronize their
slave/tasktracker nodes' clocks to the master/jobtracker node's clock?
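
(For reference, a minimal sketch of what I have in mind, assuming the
master is reachable by the hostname "masternode"; commands and
init-script names vary by distro:)

# one-off step of the clock on each slave, then keep ntpd tracking the master
ntpdate masternode
echo "server masternode iburst" >> /etc/ntp.conf
/etc/init.d/ntp restart    # or "service ntpd restart", depending on the distro

# check the offset from the master afterwards
ntpq -p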

thanks for any help,

- Vasilis


filesystem counters HDFS_BYTES vs FILE_BYTES

2009-09-16 Thread Vasilis Liaskovitis
Hi,

in the filesystem counters for each job, what is the difference
between HDFS_BYTES_WRITTEN and FILE_BYTES_WRITTEN?

- Do they refer to disjoint data, perhaps hdfs-metadata and map/reduce
application data respectively?
- Another interpretation is that HDFS_BYTES refers to bytes
"virtually" written to HDFS, without taking the replicated blocks
into account, whereas FILE_BYTES shows the real I/O for all replicas
of the data blocks. With dfs.replication=3, I'd expect FILE_BYTES to
be 3 times HDFS_BYTES, but this is not the case on a sorting job:
FILE_BYTES_WRITTEN=56497924338
HDFS_BYTES_WRITTEN=29445895448
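
(One crude cross-check I can think of - a sketch, assuming the job
output directory is /sort-out and the default local dirs under
hadoop.tmp.dir:)

bin/hadoop fs -dus /sort-out              # job output size as seen through the HDFS shell
du -sb /tmp/hadoop-$USER/mapred/local     # local map-output/spill bytes on one node (default location)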

Is there a description for these counters on the wiki somewhere?
More generally is there a description for all the generic filesystem
and map/reduce counters?
thanks,

- Vasilis


default job scheduler behaviour

2009-09-26 Thread Vasilis Liaskovitis
Hi,

Given a single cluster running the default job scheduler: does only
one job execute on the cluster at a time, regardless of how many
map/reduce task slots it can keep busy?
In other words, if a job does not use all task slots, will the
default scheduler schedule map/reduce tasks from other jobs that have
already been submitted to the system?

I am using an 8-node cluster to run some test jobs based on gridmix
(the synthetic benchmark found in the hadoop distribution under
src/benchmarks/gridmix). The gridmix workload submits many different
jobs in parallel - 5 different kinds of jobs, with varying sizes for
each kind: small, medium, large. While running, I notice that at any
time only one job is making progress - at least according to the
jobtracker web UI. This seems to happen even for small jobs, which
don't take up all the slots of the cluster's tasktrackers/nodes.

If the default scheduler is not capable of scheduling tasks for
multiple jobs, would I have to use the capacity scheduler? Or
something else?

thanks for any help,

- Vasilis


build and use hadoop-git

2009-11-29 Thread Vasilis Liaskovitis
Hi,

how can I build and use hadoop-git?
The project has recently been split into 3 repositories hadoop-common,
hadoop-hdfs and hadoop-mapred. It's not clear to me how to build and
use the git tip of the whole framework. E.g. would building the jars
from the 3 subprojects (and copying them under the same directory)
work?
Is there a guide/wiki page out there for this? Or perhaps there is
another repository which still has a centralized trunk for all
subprojects?
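
(A rough sketch of what I was imagining - the repository URLs and ant
targets here are my guesses from the wiki, not something I have
verified:)

git clone git://git.apache.org/hadoop-common.git
git clone git://git.apache.org/hadoop-hdfs.git
git clone git://git.apache.org/hadoop-mapreduce.git

# build each subproject with ant; mvn-install is supposed to publish the
# common/hdfs artifacts into the local ivy/maven cache for the downstream builds
cd hadoop-common    && ant mvn-install && cd ..
cd hadoop-hdfs      && ant mvn-install && cd ..
cd hadoop-mapreduce && ant jar         && cd ..

# then collect the built jars from each build/ directory into one place
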
thanks in advance,

- Vasilis


hadoop idle time on terasort

2009-12-02 Thread Vasilis Liaskovitis
Hi,

I am using hadoop-0.20.1 to run terasort and randsort benchmarking
tests on a small 8-node linux cluster. Most runs consist of usually
low (<50%) core utilizations in the map and reduce phase, as well as
heavy I/O phases . There is usually a large fraction of runtime for
which cores are idling and i/o disk traffic is not heavy.

On average for the duration of a terasort run I get 20-30% cpu
utilization, 10-30% iowait times and the rest 40-70% is idle time.
This is data collected with mpstat for the duration of the run across
the cores of a specific node. This utilization behaviour is true and
symmetric for all tasktracker/data nodes (The namenode cores and I/O
are mostly idle, so there doesn’t seem to be a bottleneck in the
namenode).

I am looking for an explanation for the significant idle-time in the
runs. Could it have something to do with misconfigured network/RPC
latency hadoop parameters? For example, I have tried to increase
mapred.heartbeats.in.second to 1000 from 100 but that didn’t help. The
network bandwidth (1Gige card on each node) is not saturated during
the runs, according to my netstat results.

Have other people noticed significant cpu idle times that can’t be
explained by I/O traffic?

Is it reasonable to always expect decreasing idle times as the
terasort dataset scales on the same cluster? I ‘ve only tried 2 small
datasets of 40GB and 64GB each, but core utilizations didn’t increase
with the runs done so far.

Yahoo’s paper on terasort (http://sortbenchmark.org/Yahoo2009.pdf)
mentions several performance optimizations, some of which seem
relevant to idle times. I am wondering which, if any, of the yahoo
patches are part of the hadoop-0.20.1 distribution.

Would it be a good idea to try a development version of hadoop to
resolve this issue?

thanks,

- Vasilis


Re: hadoop idle time on terasort

2009-12-02 Thread Vasilis Liaskovitis
Hi Todd,

thanks for the reply.

>
> This is seen reasonably often, and could be partly due to missed
> configuration changes. A few things to check:
>
> - Did you increase the number of tasks per node from the default? If you
> have a reasonable number of disks/cores, you're going to want to run a lot
> more than 2 map and 2 reduce tasks on each node.

For all tests so far, I have increased
mapred.tasktracker.map.tasks.maximum and
mapred.tasktracker.reduce.tasks.maximum to the number of cores per
tasktracker/node (12 cores per node).
I've also set mapred.map.tasks and mapred.reduce.tasks to a prime
close to the number of nodes, i.e. 8 (though the recommendation for
mapred.map.tasks is "a prime several times greater than the number of
hosts").

> - Have you tuned any other settings? If you google around you can find some
> guides for configuration tuning that should help squeeze some performance
> out of your cluster.

I am reusing JVMs. I also enabled default codec compression (native
zlib I think) for intermediate map outputs. This decreased iowait
times for some datasets. But idle time is still significant even with
compression. I wonder if LZO compression would have better results -
less overall execution time and perhaps less idle time?

I also increased io.sort.mb (set to half the JVM heap size), though I
am not sure how that has affected performance yet. If other parameters
could be significant here, let me know. Would increasing the number of
I/O streams (io.sort.factor, I think) help with a not-so-beefy disk
system per node?
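
(The sort-related knobs I have been touching, as a sketch - the values
are only examples of what I am experimenting with:)

<property>
  <name>io.sort.mb</name>
  <value>350</value>        <!-- roughly half of the child JVM heap -->
</property>
<property>
  <name>io.sort.factor</name>
  <value>10</value>         <!-- the default; considering raising it -->
</property>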

If you can recommend a specific tutorial/guide/blog for performance
tuning, feel free to share (though I suspect there are many out
there).

> There are several patches that aren't in 0.20.1 but will be in 0.21 that
> help performance. These aren't eligible for backport into 0.20 since point
> releases are for bug fixes only. Some are eligible for backporting into
> Cloudera's distro (or Yahoo's) and may show up in our next release (CDH3)
> which should be available first in January for those who like to live on the
> edge.

ok thanks. I'll try to check out 0.21 or a cloudera distro at some
point. I wonder if there's a centralized svn/git somewhere if I want
to build from source. Or do I need to somehow combine all the
subprojects hadoop-common, hadoop-mapred and hadoop-hdfs?

thanks again,

- Vasilis

> Thanks,
> -Todd
>
> On Wed, Dec 2, 2009 at 12:22 PM, Vasilis Liaskovitis wrote:
>
>> Hi,
>>
>> I am using hadoop-0.20.1 to run terasort and randsort benchmarking
>> tests on a small 8-node linux cluster. Most runs consist of usually
>> low (<50%) core utilizations in the map and reduce phase, as well as
>> heavy I/O phases . There is usually a large fraction of runtime for
>> which cores are idling and i/o disk traffic is not heavy.
>>
>> On average for the duration of a terasort run I get 20-30% cpu
>> utilization, 10-30% iowait times and the rest 40-70% is idle time.
>> This is data collected with mpstat for the duration of the run across
>> the cores of a specific node. This utilization behaviour is true and
>> symmetric for all tasktracker/data nodes (The namenode cores and I/O
>> are mostly idle, so there doesn’t seem to be a bottleneck in the
>> namenode).
>>
>> I am looking for an explanation for the significant idle-time in the
>> runs. Could it have something to do with misconfigured network/RPC
>> latency hadoop parameters? For example, I have tried to increase
>> mapred.heartbeats.in.second to 1000 from 100 but that didn’t help. The
>> network bandwidth (1Gige card on each node) is not saturated during
>> the runs, according to my netstat results.
>>
>> Have other people noticed significant cpu idle times that can’t be
>> explained by I/O traffic?
>>
>> Is it reasonable to always expect decreasing idle times as the
>> terasort dataset scales on the same cluster? I ‘ve only tried 2 small
>> datasets of 40GB and 64GB each, but core utilizations didn’t increase
>> with the runs done so far.
>>
>> Yahoo’s paper on terasort (http://sortbenchmark.org/Yahoo2009.pdf)
>> mentions several performance optimizations, some of which seem
>> relevant to idle times. I am wondering which, if any, of the yahoo
>> patches are part of the hadoop-0.20.1 distribution.
>>
>> Would it be a good idea to try a development version of hadoop to
>> resolve this issue?
>>
>> thanks,
>>
>> - Vasilis
>>
>


Re: hadoop idle time on terasort

2009-12-08 Thread Vasilis Liaskovitis
Hi Scott,

thanks for the extra tips, these are very helpful.

On Mon, Dec 7, 2009 at 3:57 PM, Scott Carey  wrote:
>
>>
>> I am using hadoop-0.20.1 to run terasort and randsort benchmarking
>> tests on a small 8-node linux cluster. Most runs consist of usually
>> low (<50%) core utilizations in the map and reduce phase, as well as
>> heavy I/O phases . There is usually a large fraction of runtime for
>> which cores are idling and i/o disk traffic is not heavy.
>>
>> On average for the duration of a terasort run I get 20-30% cpu
>> utilization, 10-30% iowait times and the rest 40-70% is idle time.
>> This is data collected with mpstat for the duration of the run across
>> the cores of a specific node. This utilization behaviour is true and
>> symmetric for all tasktracker/data nodes (The namenode cores and I/O
>> are mostly idle, so there doesn't seem to be a bottleneck in the
>> namenode).
>>
>
> Look at individual task logs for map and reduce through the UI.  Also, look
> at the cluster utilization during a run -- are most map and reduce slots
> full most of the time, or is the slot utilization low?
>

I think the slots are being highly utilized, but I seem to have
forgotten which option in the web UI allows you to look at the slot
allocations during runtime on each tasktracker. Is this available on
the job details from the jobtracker's ui or somewhere else?
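
(A couple of places I plan to look again - treat the exact URL as my
guess about the 0.20 web UI:)

bin/hadoop job -list                                       # running jobs from the CLI
http://<jobtracker-host>:50030/machines.jsp?type=active    # per-tasktracker running task counts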

> Running the fair scheduler -- or any scheduler -- that can be configured to
> schedule more than one task per heartbeat can dramatically increase your
> slot utilization if it is low.

I've been running a single job - would the scheduler benefits show
up with multiple jobs only, by definition? I am now trying multiple
simultaneous sort jobs with smaller disjoint datasets, launched in
parallel by the same user. Do I need to set up any fairscheduler
parameters other than the below?


<property>
  <name>mapred.jobtracker.taskScheduler</name>
  <value>org.apache.hadoop.mapred.FairScheduler</value>
</property>

<property>
  <name>mapred.fairscheduler.assignmultiple</name>
  <value>true</value>
</property>

<property>
  <name>mapred.fairscheduler.allocation.file</name>
  <value>/home/user2/hadoop-0.20.1/conf/fair-scheduler.xml</value>
</property>


fairscheduler.xml is:

<?xml version="1.0"?>
<allocations>
  <pool name="...">
    <minMaps>12</minMaps>
    <minReduces>12</minReduces>
    <maxRunningJobs>16</maxRunningJobs>
    <weight>4</weight>
  </pool>
  <userMaxJobsDefault>16</userMaxJobsDefault>
</allocations>


With 4 parallel sort jobs, I am noticing that maps execute in parallel
across all jobs, but reduce tasks are only allocated/executed from a
single job at a time, until that job finishes. Is that expected, or am
I missing something in my fairscheduler (or other) settings?

> Next, if you find that your delay times correspond with the shuffle phase
> (look in the reduce logs), there are fixes in 0.21 for that on the way, but
> there is a quick win, one line change that cuts shuffle times down a lot on
> clusters that have a large ratio of map tasks per node if the map output is
> not too large.  For a pure sort test, the map outputs are medium sized (the
> same size as the input), so this might not help.  But the indicators of the
> issue are in the reduce task logs.
> See this ticket: http://issues.apache.org/jira/browse/MAPREDUCE-318 and my
> comment from June 10 2009.
>

For a single big sort job, I have ~2300 maps and 84 reduces on a
7-node cluster with 12-core nodes. The thread dumps for my unpatched
version also show sleeping threads at fetchOutputs() - I don't know
how often you've seen that in your own task dumps. From what I
understand, what we are looking for in the reduce logs, in terms of a
shuffle idle bottleneck, is the time elapsed between the "shuffling"
lines and the "in-memory merge complete" lines - does that sound
right? With the one-line change in fetchOutputs(), I've sometimes seen
the average shuffle time across tasks go down by ~5-10%, and so has
the execution time. But the results are variable across runs; I need
to look at the reduce logs and repeat the experiment.

From your system description in the MAPREDUCE-318 notes, I see you
have configured your cluster to do "10 concurrent shuffle copies".
Does this refer to the parameter mapred.reduce.parallel.copies
(default 5) or io.sort.factor (default 10)? I retried with double the
reduce.parallel.copies, from the default 5 to 10, but it didn't help.

> My summary is that the hadoop scheduling process has not been so far for
> servers that can run more than 6 or so tasks at once.  A server capable of
> running 12 maps is especially prone to running under-utilized.  Many changes
> in the 0.21 timeframe address some of this.
>

What levels of utilization have you achieved on servers capable of
running 10-12 map/reduce slots? I understand that this depends on the
type and number of jobs. I suspect you 've seen higher utilizations
when having more concurrent jobs. Were you using the fairscheduler
instead of the default one?

If you can suggest any public hadoop examples/benchmarks that exhibit
the "cascade-type" MR jobs you refer to, please share.

thanks again,

- Vasilis


verifying that lzo compression is being used

2010-01-27 Thread Vasilis Liaskovitis
I am trying to use lzo for intermediate map compression and gzip for
output compression in my hadoop-0.20.1 jobs. For lzo, I've compiled
the jar and JNI/native library from
http://code.google.com/p/hadoop-gpl-compression/ (version 0.1.0). I
am also using native lzo library v2.03.

Is there an easy way to verify that the lzo compression is indeed
being used? My hadoop job output mentions native zlib as well as lzo
loading:

10/01/27 22:53:24 INFO lzo.GPLNativeCodeLoader: Loaded native gpl library
10/01/27 22:53:24 INFO lzo.LzoCodec: Successfully loaded & initialized
native-lzo library
10/01/27 22:53:25 INFO util.NativeCodeLoader: Loaded the native-hadoop library
10/01/27 22:53:25 INFO zlib.ZlibFactory: Successfully loaded &
initialized native-zlib library
10/01/27 22:53:25 INFO compress.CodecPool: Got brand-new compressor

However I am a bit suspicious as to whether lzo is actually used.
I've been doing some CPU profiling (using oprofile) during jobs and I
don't see any cpu samples coming from lzo-related native or JNI
symbols, e.g. my system's liblzo.so or libgplcompression.so do not
appear. But I do see a lot of samples coming from libz.so and
libzip.so symbols. This profiling makes me think that I may not
actually be using the compressor I want to use.

Are there any other indications or hadoop/system logs that would prove
the use of lzo? Is it possible that lzo is overridden by zlib? I see
lzo is first loaded, and then zlib is loaded, judging from the job
output.
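
(One low-level check I am considering - a sketch, assuming the child
task JVMs run org.apache.hadoop.mapred.Child, the 0.20 child-task
entry point: look at which native libraries the running task JVMs
have actually mapped.)

for pid in $(pgrep -f org.apache.hadoop.mapred.Child); do
  echo "== $pid =="
  grep -E 'lzo|gplcompression|libz' /proc/$pid/maps | awk '{print $6}' | sort -u
done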

These are the relevant hadoop configs in my mapred-site.xml


<property>
  <name>io.compression.codecs</name>
  <value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec,org.apache.hadoop.io.compress.BZip2Codec</value>
</property>

<property>
  <name>io.compression.codec.lzo.class</name>
  <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>

<property>
  <name>mapreduce.map.output.compress.codec</name>
  <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>

<property>
  <name>mapred.output.compression.codec</name>
  <value>org.apache.hadoop.io.compress.GzipCodec</value>
</property>

<property>
  <name>mapred.output.compress</name>
  <value>true</value>
  <description>Should the job outputs be compressed?</description>
</property>

<property>
  <name>mapred.compress.map.output</name>
  <value>true</value>
  <description>Should the outputs of the maps be compressed before being
   sent across the network. Uses SequenceFile compression.</description>
</property>

<property>
  <name>mapred.output.compression.type</name>
  <value>BLOCK</value>
  <description>If the job outputs are to compressed as SequenceFiles, how should
   they be compressed? Should be one of NONE, RECORD or BLOCK.</description>
</property>


I am including the jni/native interface library libgplcompression.so
in the JAVA_LIBRARY_PATH of my child map/reduce tasks (through javaopt
-Djava.library.path). I.e. I am not using jira-2838 or jira-5981
patches.
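
(For reference, a sketch of how I am passing the library path through
the child JVM opts - the path is just an example of where
libgplcompression.so might be installed:)

<property>
  <name>mapred.child.java.opts</name>
  <!-- example path; adjust to wherever the native libs were built/installed -->
  <value>-Xmx700m -Djava.library.path=/opt/hadoop-gpl-compression/lib/native/Linux-amd64-64</value>
</property>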

Also, I am specifying gzip for final output compression and not
libzip or libz - shouldn't libgz.so show up in my profiling? (Btw my
oprofile and hadoop tasks are set to use a JVMTI agent suitable for
java profiling.)

The tests are being run on a SLES10 cluster.
thanks for any help,

- Vasilis


ClassCastException in lzo indexer

2010-02-02 Thread Vasilis Liaskovitis
Hi,

I am trying to use hadoop-0.20.1 and hadoop-lzo
(http://github.com/kevinweil/hadoop-lzo) to index an lzo file. I 've
followed the instructions and copied both jar and native libs in my
classpaths. I am getting this error in both local and distributed
indexer mode

bin/hadoop jar lib/hadoop-lzo-0.3.0.jar
com.hadoop.compression.lzo.LzoIndexer /data/userVisits.lzo

10/02/02 17:30:38 INFO lzo.GPLNativeCodeLoader: Loaded native gpl library
10/02/02 17:30:38 INFO lzo.LzoCodec: Successfully loaded & initialized
native-lzo library
10/02/02 17:30:38 INFO lzo.DistributedLzoIndexer: Adding LZO file
/data/UserVisits.lzo to indexing list (no index currently exists)
10/02/02 17:30:38 WARN mapred.JobClient: Use GenericOptionsParser for
parsing the arguments. Applications should implement Tool for the
same.
10/02/02 17:30:39 INFO input.FileInputFormat: Total input paths to process : 1
10/02/02 17:30:39 INFO mapred.JobClient: Running job: job_201002020748_0409
10/02/02 17:30:40 INFO mapred.JobClient:  map 0% reduce 0%
10/02/02 17:31:02 INFO mapred.JobClient: Task Id :
attempt_201002020748_0409_m_00_0, Status : FAILED
java.lang.ClassCastException:
com.hadoop.compression.lzo.LzopCodec$LzopDecompressor cannot be cast
to com.hadoop.compression.lzo.LzopDecompressor

same error for local indexer:

us...@amdqc08:~/hadoop-0.20.1-prof> bin/hadoop jar
lib/hadoop-lzo-0.3.0.jar com.hadoop.compression.lzo.LzoIndexer
/data/UserVisits.lzo
10/02/02 17:38:47 INFO lzo.GPLNativeCodeLoader: Loaded native gpl library
10/02/02 17:38:47 INFO lzo.LzoCodec: Successfully loaded & initialized
native-lzo library
10/02/02 17:38:47 INFO lzo.LzoIndexer: [INDEX] LZO Indexing file
/data/UserVisits.lzo, size 9.94 GB...
Exception in thread "main" java.lang.ClassCastException:
com.hadoop.compression.lzo.LzopCodec$LzopDecompressor cannot be cast
to com.hadoop.compression.lzo.LzopDecompressor

is hadoop-0.20.1 compatible with the git/master of hadoop-lzo? Or do I
need to use some older version of hadoop-lzo to be compatible with
hadoop-0.20.1?

- a different, but relevant question: In order to compress
intermediate map outputs with lzo and process them in an efficient
way, does the map/reduce job need to explicitly create index files for
the compressed intermediate files? I think that at this shuffle stage,
input files have already been split and we are not relying on indexing
the intermediate lzo files for parallel shuffling. Is that correct? Or
would a job need to index the intermediate files? If yes, can this be
handled in an automatic fashion by hadoop-lzo?

any suggestions are welcome.
thanks,

- Vasilis


Re: ClassCastException in lzo indexer

2010-02-02 Thread Vasilis Liaskovitis
Hi Todd,

On Tue, Feb 2, 2010 at 1:41 PM, Todd Lipcon  wrote:
> Hi Vasilis,
>
> Did you make sure to "ant clean" before rebuilding hadoop-lzo if you updated
> the code? Also, can you paste your configuration for io.compression.codecs ?
>

I rebuilt after doing "ant clean" but still get the same behaviour.
Here's my codecs config:


<property>
  <name>io.compression.codecs</name>
  <value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec,org.apache.hadoop.io.compress.BZip2Codec</value>
</property>

<property>
  <name>io.compression.codec.lzo.class</name>
  <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>


My error sounds lzop-specific - maybe my io.compression.codec.lzo.class
should point to something lzop-related?
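
(One thing I plan to check - just a guess on my part, since the failed
cast looks like two copies of the lzop classes being loaded from
different jars: whether an older hadoop-gpl-compression jar is still
on the classpath alongside hadoop-lzo.)

# assumes HADOOP_HOME points at the hadoop install
find "$HADOOP_HOME" -name '*lzo*.jar' -o -name '*gpl*compression*.jar'
echo $HADOOP_CLASSPATH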

thanks,

- Vasilis

> Thanks
> -Todd
>
> On Tue, Feb 2, 2010 at 9:09 AM, Vasilis Liaskovitis wrote:
>
>> Hi,
>>
>> I am trying to use hadoop-0.20.1 and hadoop-lzo
>> (http://github.com/kevinweil/hadoop-lzo) to index an lzo file. I 've
>> followed the instructions and copied both jar and native libs in my
>> classpaths. I am getting this error in both local and distributed
>> indexer mode
>>
>> bin/hadoop jar lib/hadoop-lzo-0.3.0.jar
>> com.hadoop.compression.lzo.LzoIndexer /data/userVisits.lzo
>>
>> 10/02/02 17:30:38 INFO lzo.GPLNativeCodeLoader: Loaded native gpl library
>> 10/02/02 17:30:38 INFO lzo.LzoCodec: Successfully loaded & initialized
>> native-lzo library
>> 10/02/02 17:30:38 INFO lzo.DistributedLzoIndexer: Adding LZO file
>> /data/UserVisits.lzo to indexing list (no index currently exists)
>> 10/02/02 17:30:38 WARN mapred.JobClient: Use GenericOptionsParser for
>> parsing the arguments. Applications should implement Tool for the
>> same.
>> 10/02/02 17:30:39 INFO input.FileInputFormat: Total input paths to process
>> : 1
>> 10/02/02 17:30:39 INFO mapred.JobClient: Running job: job_201002020748_0409
>> 10/02/02 17:30:40 INFO mapred.JobClient:  map 0% reduce 0%
>> 10/02/02 17:31:02 INFO mapred.JobClient: Task Id :
>> attempt_201002020748_0409_m_00_0, Status : FAILED
>> java.lang.ClassCastException:
>> com.hadoop.compression.lzo.LzopCodec$LzopDecompressor cannot be cast
>> to com.hadoop.compression.lzo.LzopDecompressor
>>
>> same error for local indexer:
>>
>> us...@amdqc08:~/hadoop-0.20.1-prof> bin/hadoop jar
>> lib/hadoop-lzo-0.3.0.jar com.hadoop.compression.lzo.LzoIndexer
>> /data/UserVisits.lzo
>> 10/02/02 17:38:47 INFO lzo.GPLNativeCodeLoader: Loaded native gpl library
>> 10/02/02 17:38:47 INFO lzo.LzoCodec: Successfully loaded & initialized
>> native-lzo library
>> 10/02/02 17:38:47 INFO lzo.LzoIndexer: [INDEX] LZO Indexing file
>> /data/UserVisits.lzo, size 9.94 GB...
>> Exception in thread "main" java.lang.ClassCastException:
>> com.hadoop.compression.lzo.LzopCodec$LzopDecompressor cannot be cast
>> to com.hadoop.compression.lzo.LzopDecompressor
>>
>> is hadoop-0.20.1 compatible with the git/master of hadoop-lzo? Or do I
>> need to use some older version of hadoop-lzo to be compatible with
>> hadoop-0.20.1?
>>
>> - a different, but relevant question: In order to compress
>> intermediate map outputs with lzo and process them in an efficient
>> way, does the map/reduce job need to explicitly create index files for
>> the compressed intermediate files? I think that at this shuffle stage,
>> input files have already been split and we are not relying on indexing
>> the intermediate lzo files for parallel shuffling. Is that correct? Or
>> would a job need to index the intermediate files? If yes, can this be
>> handled in an automatic fashion by hadoop-lzo?
>>
>> any suggestions are welcome.
>> thanks,
>>
>> - Vasilis
>>
>


maximum number of jobs

2010-02-08 Thread Vasilis Liaskovitis
Hi,

I am trying to submit many independent jobs in parallel (as the same
user). This works for up to 16 jobs, but beyond that I only get 16
jobs running in parallel no matter how many I try to submit. I am
using the fair scheduler with the following config:

<?xml version="1.0"?>
<allocations>
  <pool name="...">
    <minMaps>12</minMaps>
    <minReduces>12</minReduces>
    <maxRunningJobs>100</maxRunningJobs>
    <weight>4</weight>
  </pool>
  <userMaxJobsDefault>100</userMaxJobsDefault>
</allocations>


Judging by this config, I would expect my job limit to be 100, not 16
jobs. I am using hadoop-0.20.1. Am I missing some other config option?
any suggestions are welcome,

- Vasilis


using multiple disks for HDFS

2010-02-09 Thread Vasilis Liaskovitis
Hi,

I am trying to use 4 SATA disks per node in my hadoop cluster. This
is a JBOD configuration; no RAID is involved. There is a single xfs
partition per disk, mounted as /local, /local2, /local3 and /local4,
with sufficient privileges for running hadoop jobs. HDFS is set up
across the 4 disks for a single user (user2) with the following
comma-separated list in hadoop.tmp.dir:


<property>
  <name>dfs.data.dir</name>
  <value>${hadoop.tmp.dir}/dfs/data</value>
</property>

<property>
  <name>hadoop.tmp.dir</name>
  <value>/local/user2/hdfs/hadoop-${user.name},/local2/user2/hdfs/hadoop-${user.name},/local3/user2/hdfs/hadoop-${user.name,/local4/user2/hdfs/hadoop-${user.name}</value>
  <description>A base for other temporary directories.</description>
</property>

What I see is that most or all data is stored on disks /local and
/local4 across nodes. The directories on the other disks, /local2 and
/local3, are not used. I have verified that these disks can be
written to and have free space.

Isn't HDFS supposed to use all disks in a round-robin way (provided
there is free space on all of them)? Do I need to change another
config parameter for HDFS to spread I/O across all the provided mount
points?
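
(The alternative I am considering, as a sketch with the same example
mount points: list the data directories explicitly instead of
deriving everything from hadoop.tmp.dir.)

<property>
  <name>dfs.data.dir</name>
  <value>/local/user2/hdfs/data,/local2/user2/hdfs/data,/local3/user2/hdfs/data,/local4/user2/hdfs/data</value>
</property>
<property>
  <name>mapred.local.dir</name>
  <value>/local/user2/mapred,/local2/user2/mapred,/local3/user2/mapred,/local4/user2/mapred</value>
</property>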

- Vasilis


JVM heap and sort buffer size guidelines

2010-02-19 Thread Vasilis Liaskovitis
Hi,

For a node with M gigabytes of memory and N total child tasks (both
map + reduce) running on the node, what do people typically use for
the following parameters:

- Xmx (heap size per child task JVM)?
I.e. my question here is what percentage of the node's total memory
you use for the heaps of the tasks' JVMs. I am trying to reuse JVMs,
and there are roughly N task JVMs on one node at any time. I've tried
using a very large chunk of my node's memory for heaps (i.e. close
to M/N) and I have seen better execution times without experiencing
swapping; but I am wondering if this is a job-specific behaviour. When
I've used both -Xmx and -Xms set to the same heap size (i.e. maximum
and minimum heap size the same, to avoid contraction and expansion
overheads) I have run into some swapping; I guess Xms=Xmx should be
avoided if we are close to the physical memory limit.

- io.sort.mb and io.sort.factor. I understand that to answer this we
'd have to take the disk configuration into consideration. Do you
consider this only a function of disk or also a function of the heap
size? Obviously io.sort.mb < heapsize, but how much space do you leave
for non-sort buffer usage?
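
(For illustration, the kind of arithmetic I have in mind - all
numbers here are made up, not recommendations:)

<!-- Example: M = 32 GB per node, N = 12 map + 4 reduce = 16 child tasks.
     Leaving ~25% of RAM for the OS, daemons and page cache gives ~24 GB for
     heaps, i.e. roughly 24 GB / 16 = 1.5 GB per child; io.sort.mb is then
     kept well below the heap, say a third to a half of it. -->
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx1500m</value>
</property>
<property>
  <name>io.sort.mb</name>
  <value>512</value>
</property>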

I am interested in small cluster setups ( 8-16 nodes), and not large
clusters, if that makes any difference.

- Vasilis


reuse JVMs across multiple jobs

2010-02-19 Thread Vasilis Liaskovitis
Hi,

Is it possible (and does it make sense) to reuse JVMs across jobs?

The mapred.job.reuse.jvm.num.tasks config option is a job-specific
parameter, as its name implies. When running multiple independent jobs
simultaneously with job.reuse.jvm=-1 (this means always reuse), I see
a lot of different Java PIDs (far more than map.tasks.maximum +
reduce.tasks.maximum) over the duration of the job runs, instead of
the same Java processes persisting. The number of live JVMs on a given
node/tasktracker at any time never exceeds map.tasks.maximum +
reduce.tasks.maximum, as expected, but idle JVMs do get torn down and
new ones spawned quite often.

For example, here are the numbers of distinct Java PIDs when
submitting 1, 2, 4 and 32 copies of the same job in parallel:
jobs  distinct PIDs
1     28
2     39
4     106
32    740
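
(A sketch of one way to count the distinct child PIDs on a
tasktracker node - org.apache.hadoop.mapred.Child is the child-task
main class in 0.20:)

# sample the child JVM pids once a second for the duration of the runs
while true; do pgrep -f org.apache.hadoop.mapred.Child; sleep 1; done > pids.txt &
# ... after the jobs finish, stop the sampler and count distinct pids
sort -un pids.txt | wc -l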

The relevant killing and spawning logic should be in
src/mapred/org/apache/hadoop/mapred/JvmManager.java, particularly the
reapJvm() method, but I haven't dug deeper. I am wondering if it would
be possible and worthwhile, from a performance standpoint, to reuse
JVMs across jobs, i.e. have a common JVM pool for all hadoop jobs?
thanks,

- Vasilis


hadoop.log.dir

2010-03-29 Thread Vasilis Liaskovitis
Hi all,

Is there a config option that controls the placement of all hadoop
logs? I'd like to put all hadoop logs under a specific directory,
e.g. /tmp, on the namenode and all datanodes.

Is hadoop.log.dir the right config? Can I change this in the
log4j.properties file, or pass it e.g. in the JVM opts as
"-Dhadoop.log.dir=/tmp"?
I am using hadoop-0.20.1 or hadoop-0.20.2.
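
(The approach I am leaning towards, as a sketch - my understanding is
that the daemon scripts derive hadoop.log.dir from this environment
variable, so treat that as an assumption to verify:)

# conf/hadoop-env.sh on the namenode and all datanodes/tasktrackers
export HADOOP_LOG_DIR=/tmp/hadoop-logs    # example location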

thanks,

- Vasilis


swapping on hadoop

2010-03-30 Thread Vasilis Liaskovitis
Hi all,

I 've noticed swapping for a single terasort job on a small 8-node
cluster using hadoop-0.20.1. The swapping doesn't happen repeatably; I
can have back to back runs of the same job from the same hdfs input
data and get swapping only on 1 out of 4 identical runs. I 've noticed
this swapping behaviour on both terasort jobs and hive query jobs.

- Focusing on a single job config: is there a rule of thumb about how
much node memory should be left for use outside of the child JVMs?
I make sure that, per node, there is free memory:
(maxMapTasksPerTaskTracker + maxReduceTasksPerTaskTracker) *
JVMHeapSize < PhysicalMemoryOnNode
The total JVM heap size per node per job from the above inequality
currently accounts for 65%-75% of the node's memory. (I've tried
allocating a riskier 90% of the node's memory, with similar swapping
observations.)
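
(For concreteness, with made-up numbers: on a 16 GB node with 12 map +
4 reduce slots and -Xmx700m, the inequality gives (12 + 4) * 0.7 GB ≈
11.2 GB of heap, i.e. about 70% of physical memory, which is in the
65%-75% range above.)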

- Could there be an issue with HDFS data or metadata taking up memory?
I am not cleaning output or intermediate outputs from HDFS between
runs. Is this possible?

- Do people use any specific java flags (particularly garbage
collection flags) for production environments where one job runs (or
possibly more jobs run simultaneously) ?

- What are the memory requirements for the jobtracker,namenode and
tasktracker,datanode JVMs?

- I am setting io.sort.mb to about half of the JVM heap size (half of
-Xmx in javaopts). Should this be set to a different ratio? (this
setting doesn't sound like it should be causing swapping in the first
place).

- The buffer cache is cleaned before each run (sync and echo 3 >
/proc/sys/vm/drop_caches).

Any empirical advice and suggestions to solve this are appreciated.
thanks,

- Vasilis


Re: swapping on hadoop

2010-04-01 Thread Vasilis Liaskovitis
All,

thanks for your suggestions everyone, these are valuable.
Some comments:

On Wed, Mar 31, 2010 at 6:06 PM, Scott Carey  wrote:
> On Linux, check out the 'swappiness' OS tunable -- you can turn this down 
> from the default to reduce swapping at the expense of some system file cache.
> However, you want a decent chunk of RAM left for the system to cache files -- 
> if it is all allocated and used by Hadoop there will be extra I/O.

I have set /proc/sys/vm/swappiness to 1, though I haven't tried 0.

> For Java GC, if your -Xmx is above 600MB or so, try either changing 
> -XX:NewRatio to a smaller number (default is 2 for Sun JDK 6) or setting the 
> -XX:MaxNewSize parameter to around 150MB to 250MB.
> An example of Hadoop memory use scaling as -Xmx grows:
> Lets say you have a Hadoop job with a 100MB map side join, and 250MB of 
> hadoop sort space.

In this example, what hadoop config parameters do the above 2 buffers
refer to? io.sort.mb=250, but which parameter does the "map side join"
100MB refer to? Are you referring to the split size of the input data
handled by a single map task? Apart from that question, the example is
clear to me and useful, thanks.

> Now, perhaps due to some other jobs you want to set -Xmx1200M.  The above job 
> will end up using about 150MB more now, because the new space has grown, 
> although the footprint is the same.   A larger new space can improve 
> performance, but with most typical hadoop jobs it won't.   Making sure it 
> does not grow larger just because -Xmx is larger can help save a lot of 
> memory.  Additionally, a job that would have failed with an OOME at -Xmx1200M 
> might pass at -Xmx1000M if the young generation takes 150MB instead of 400MB 
> of the space.
>
Indeed, for my jobs I haven't noticed better performance going to
-Xmx900 or 1000. I normally use -Xmx700. I haven't tried
-XX:MaxNewSize or -XX:NewRatio but I will.
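
(What I plan to try, as a sketch - the heap size is my current value,
and the new-size cap follows the 150-250MB suggestion above:)

<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx700m -XX:MaxNewSize=200m -XX:+UseCompressedOops</value>
</property>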

> If you are using a 64 bit JRE, you can also save space with the 
> -XX:+UseCompressedOops option -- sometimes quite a bit of space.
>
I am using this already, thanks.

Quoting Allen: "Java takes more RAM than just the heap size.
Sometimes 2-3x as much."
Is there a clear indication that Java memory usage extends so far
beyond its allocated heap? E.g. would java thread stacks really
account for such a big (2x to 3x) increase? Tasks seem to be heavily
threaded. What are the relevant config options to control the number
of threads within a task?
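
(A sketch of how I could measure this on a tasktracker node -
resident size of the child task JVMs vs. their configured heap:)

ps -o pid,rss,vsz,args -C java | grep org.apache.hadoop.mapred.Child   # RSS/VSZ in KB
jstat -gc <pid>   # heap usage/capacity for one child, if the JDK tools are available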

"My general rule of thumb for general purpose grids is to plan on having
3-4gb of free VM (swap+physical) space for the OS, monitoring, datanode, and
task tracker processes.  After that, you can carve it up however you want."

I now have 4-6GB of free space, when taking into account the full heap
space of all child JVMs. Does that sound reasonable for all the other
node needs (file caching, datanode, task tracker)? Having ~1G for
tasktracker and ~1G for datanode leaves 2-4GB for file caching. Even
with this setup on a terasort run of 64GB data across 7 nodes
(separate node for namenode/jobtracker), I run low on memory, though
there is no swapping for the majority of cases. I assume I am running
low on memory mainly due to file/disk caching or thread stacks? Any
other possible reasons?

With this new setup, I don't normally get swapping for a single job
e.g. terasort or hive job. However, the problem in general is
exacerbated if one spawns multiple independent hadoop jobs
simultaneously. I 've noticed that JVMs are not re-used across jobs,
in an earlier post:
http://www.mail-archive.com/common-...@hadoop.apache.org/msg01174.html
This implies that Java memory usage would blow up when submitting
multiple independent jobs. So this multiple job scenario sounds more
susceptible to swapping

A relevant question is: in production environments, do people run jobs
in parallel? Or is it that the majority of jobs is a serial pipeline /
cascade of jobs being run back to back?

thanks,

- Vasilis


Re: swapping on hadoop

2010-04-01 Thread Vasilis Liaskovitis
Hi,

On Thu, Apr 1, 2010 at 2:02 PM, Scott Carey  wrote:
>> In this example, what hadoop config parameters do the above 2 buffers
>> refer to? io.sort.mb=250, but which parameter does the "map side join"
>> 100MB refer to? Are you referring to the split size of the input data
>> handled by a single map task?
>
> "Map side join" in just an example of one of many possible use cases where a 
> particular map implementation may hold on to some semi-permanent data for the 
> whole task.
> It could be anything that takes 100MB of heap and holds the data across 
> individual calls to map().
>

ok. Now, considering a map side space buffer and a sort buffer, do
both account for tenured space for both map and reduce JVMs? I 'd
think the map side buffer gets used and tenured for map tasks and the
sort space gets used and tenured for the reduce task during sort/merge
phase. Would both spaces really be used in both kinds of tasks?

>
> Java typically uses 5MB to 60MB for classloader data (statics, classes) and 
> some space for threads, etc.  The default thread stack on most OS's is about 
> 1MB, and the number of threads for a task process is on the order of a dozen.
> Getting 2-3x the space in a java process outside the heap would require 
> either a huge thread count, a large native library loaded, or perhaps a 
> non-java hadoop job using pipes.
> It would be rather obvious in 'top' if you sort by memory (shift-M on linux), 
> or vmstat, etc.  To get the current size of the heap of a process, you can 
> use jstat or 'kill -3' to create a stack dump and heap summary.
>
thanks, good to know.

>>
>> With this new setup, I don't normally get swapping for a single job
>> e.g. terasort or hive job. However, the problem in general is
>> exacerbated if one spawns multiple independent hadoop jobs
>> simultaneously. I 've noticed that JVMs are not re-used across jobs,
>> in an earlier post:
>> http://www.mail-archive.com/common-...@hadoop.apache.org/msg01174.html
>> This implies that Java memory usage would blow up when submitting
>> multiple independent jobs. So this multiple job scenario sounds more
>> susceptible to swapping
>>
> The maximum number of map and reduce tasks per node applies no matter how 
> many jobs are running.

Right. But depending on your job scheduler, isn't it possible that you
may be swapping the different jobs' JVM space in and out of physical
memory while scheduling all the parallel jobs? Especially if nodes
don't have huge amounts of memory, this scenario sounds likely.

>
>> A relevant question is: in production environments, do people run jobs
>> in parallel? Or is it that the majority of jobs is a serial pipeline /
>> cascade of jobs being run back to back?
>>
> Jobs are absolutely run in parallel.  I recommend using the fair scheduler 
> with no config parameters other than 'assignmultiple = true' as the 
> 'baseline' scheduler, and adjust from there accordingly.  The Capacity 
> Scheduler has more tuning knobs for dealing with memory constraints if jobs 
> have drastically different memory needs.  The out-of-the-box FIFO scheduler 
> tends to have a hard time keeping the cluster utilization high when there are 
> multiple jobs to run.

thanks, I 'll try this.

Back to a single job running and assuming all heap space is used:
what percentage of a node's memory would you leave for other
functions, esp. disk cache? I currently have only 25% of memory
(~4GB) for non-JVM-heap data; I guess there should be a sweet spot,
probably dependent on the job's I/O characteristics.

- Vasilis


separate JVM flags for map and reduce tasks

2010-04-22 Thread Vasilis Liaskovitis
Hi,

I'd like to pass different JVM options to map tasks and to reduce
tasks. I think it should be straightforward to add
mapred.mapchild.java.opts and mapred.reducechild.java.opts to my
conf/mapred-site.xml and process the new options accordingly in
src/mapred/org/apache/hadoop/mapred/TaskRunner.java. Let me know if
you think it's more involved than what I described.

My question is: if mapred.job.reuse.jvm.num.tasks is set to -1
(always reuse), can the same JVM be re-used for different types of
tasks? I.e. the same JVM being used first by a map task and then by a
reduce task. I am assuming this is possible, though I haven't
verified it in the code.
So, if one wants to pass different JVM options to map tasks and
reduce tasks, perhaps mapred.job.reuse.jvm.num.tasks should be set to
1 (never reuse)?
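
(For illustration, what the proposed - not currently existing -
properties would look like in mapred-site.xml; names and values are
just my proposal:)

<property>
  <name>mapred.mapchild.java.opts</name>
  <value>-Xmx512m</value>
</property>
<property>
  <name>mapred.reducechild.java.opts</name>
  <value>-Xmx1024m</value>
</property>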

thanks for your help,

- Vasilis


utilizing all cores on single-node hadoop

2009-08-17 Thread Vasilis Liaskovitis
Hi,

I am a beginner trying to setup a few simple hadoop tests on a single
node before moving on to a cluster. I am just using the simple
wordcount example for now. My question is what's the best way to
guarantee utilization of all cores on a single-node? So assuming a
single node with 16-cores what are the suggested values for:

mapred.map.tasks
mapred.reduce.tasks
mapred.tasktracker.map.tasks.maximum
mapred.tasktracker.reduce.tasks.maximum

I found an old similar thread
http://www.mail-archive.com/hadoop-u...@lucene.apache.org/msg00152.html
and I have followed similar settings for my 16-core system (e.g.
map.tasks=reduce.tasks=90 and map.tasks.maximum=100), however I always
see only 3-4 cores utilized using top.
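
(For reference, a sketch of the relevant conf/hadoop-site.xml entries
as I currently have them, per the settings above; the reduce-side
maximum is my assumption, since only the map-side one was spelled
out:)

<property>
  <name>mapred.map.tasks</name>
  <value>90</value>
</property>
<property>
  <name>mapred.reduce.tasks</name>
  <value>90</value>
</property>
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>100</value>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>100</value>
</property>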

- The description for mapred.map.tasks says "Ignored when
mapred.job.tracker is 'local'", and in my case
mapred.job.tracker=hdfs://localhost:54311. Is it possible that the
map.tasks and reduce.tasks I am setting are being ignored? How can I
verify this? Is there a way to enforce my values even in a localhost
scenario like this?

- Are there other config options/values that I need to set besides the
4 I mentioned above?

- Also is it possible that for short tasks, I won't see full
utilization of all cores anyway? Something along those lines is
mentioned in an issue a year ago:
http://issues.apache.org/jira/browse/HADOOP-3136
"If the individual tasks are very short i.e. run for less than the
heartbeat interval the TaskTracker serially runs one task at a time"

I am using hadoop-0.19.2

thanks for any guidance,

- Vasilis


Re: utilizing all cores on single-node hadoop

2009-08-23 Thread Vasilis Liaskovitis
Hi,

thanks to everyone for the valuable suggestions.

What would be the default number of map and reduce tasks for the
sort-rand example described at
http://wiki.apache.org/hadoop/Sort ?
This is one of the simplest possible examples and uses identity
mappers/reducers.

I am seeing 160 map tasks and 27 reduce tasks on my jobtracker web UI
for a single-node test. The number of map tasks seems particularly
odd, because my tasktracker.reduce.tasks.maximum=30 and
mapred.map.tasks=24 settings are well below 160.

In general, is the number of map/reduce tasks for a specific job set
by the Mapper/Reducer job-specific java classes or is it inferred
somehow by the framework?

Also, cores may be idle because the job is I/O-bound - what are the
config parameters related to memory/disk buffering of map outputs and
reduce merges? With the default io.sort.mb and io.sort.factor, would
you expect the sort example to be I/O-bound? Some profiling runs
should help investigate this soon, but at this point I am just asking
for any intuition from more experienced users.

I have switched to using hadoop-0.20.0. (I believe this version has
changed the site-specific overrides file from conf/hadoop-site.xml to
conf/core-site.xml, conf/hdfs-site.xml and conf/mapred-site.xml. Let
me know if the site overrides don't work or should be placed somewhere
else for this version.)
Does 0.20.0 have a different job scheduler or different default
settings than 0.19.2? I am getting higher core utilizations with
0.20.0 for some jobs, e.g. the wordcount examples.

thanks,

- Vasilis

On Wed, Aug 19, 2009 at 9:09 AM, Jason Venner wrote:
> Another reason you may not see full utilization of your map tasks per
> tracker is if the mean run time of a task is very short, All the slots are
> being used but the setup and teardown for each task is large enough in time
> compared to the run time of the task that it appears that not all the task
> slots are being used.
>
>
> On Mon, Aug 17, 2009 at 10:35 PM, Amogh Vasekar  wrote:
>
>> While setting mapred.tasktracker.map.tasks.maximum and
>> mapred.tasktracker.reduce.tasks.maximum, please consider the memory usage
>> your application might have since all tasks will be competing for the same
>> and might reduce overall performance.
>>
>> Thanks,
>> Amogh
>> -Original Message-
>> From: Harish Mallipeddi [mailto:harish.mallipe...@gmail.com]
>> Sent: Tuesday, August 18, 2009 10:37 AM
>> To: common-user@hadoop.apache.org
>> Subject: Re: utilizing all cores on single-node hadoop
>>
>> Hi Vasilis,
>>
>> Here's some info that I know:
>>
>> mapred.map.tasks - this is a job-specific setting. This is just a hint to
>> InputFormat as to how many InputSplits (and hence MapTasks) you want for
>> your job. The default InputFormat classes usually keep each split size to
>> the HDFS block size (64MB default). So if your input data is less than 64
>> MB, it will just result in only 1 split and hence 1 MapTask only.
>>
>> mapred.reduce.tasks - this is also a job-specific setting.
>>
>> mapred.tasktracker.map.tasks.maximum
>> mapred.tasktracker.reduce.tasks.maximum
>>
>> The above 2 are tasktracker-specific config options and determine how many
>> "simultaneous" MapTasks and ReduceTasks run on each TT. Ideally on a 8-core
>> box, you would want to set map.tasks.maximum to something like 6 and
>> reduce.tasks.maximum to 4 to utilize all the 8 cores to the maximum
>> (there's
>> a little bit of over-subscription to account for tasks idling while doing
>> I/O).
>>
>> In the web admin console, how many map-tasks and reduce-tasks are reported
>> to have been launched for your job?
>>
>> Cheers,
>> Harish
>>
>> On Tue, Aug 18, 2009 at 5:47 AM, Vasilis Liaskovitis wrote:
>>
>> > Hi,
>> >
>> > I am a beginner trying to setup a few simple hadoop tests on a single
>> > node before moving on to a cluster. I am just using the simple
>> > wordcount example for now. My question is what's the best way to
>> > guarantee utilization of all cores on a single-node? So assuming a
>> > single node with 16-cores what are the suggested values for:
>> >
>> > mapred.map.tasks
>> > mapred.reduce.tasks
>> >
>> > mapred.tasktracker.map.tasks.maximum
>> > mapred.tasktracker.reduce.tasks.maximum
>> >
>>
>> > I found an old similar thread
>> > http://www.mail-archive.com/hadoop-u...@lucene.apache.org/msg00152.html
>> > and I have followed similar settings for my 16-core system (e.g.
>> > map.tasks=reduce.tasks=90 and map.tasks.maximum=100), however I always
>> > see only

performance counters & vaidya diagnostics help

2009-08-28 Thread Vasilis Liaskovitis
Hi,

a) Is there a wiki page or other documentation explaining the exact
meaning of the job / filesystem / mapreduce counters reported after
every job run?

09/08/27 15:04:10 INFO mapred.JobClient: Job complete: job_200908271428_0002
09/08/27 15:04:10 INFO mapred.JobClient: Counters: 19
09/08/27 15:04:10 INFO mapred.JobClient:   Job Counters
09/08/27 15:04:10 INFO mapred.JobClient: Launched reduce tasks=72
09/08/27 15:04:10 INFO mapred.JobClient: Rack-local map tasks=9
09/08/27 15:04:10 INFO mapred.JobClient: Launched map tasks=480
09/08/27 15:04:10 INFO mapred.JobClient: Data-local map tasks=471
09/08/27 15:04:10 INFO mapred.JobClient:   FileSystemCounters
09/08/27 15:04:10 INFO mapred.JobClient: FILE_BYTES_READ=3225441
09/08/27 15:04:10 INFO mapred.JobClient: HDFS_BYTES_READ=32326069128
09/08/27 15:04:10 INFO mapred.JobClient: FILE_BYTES_WRITTEN=64510015810
09/08/27 15:04:10 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=32318798519
09/08/27 15:04:10 INFO mapred.JobClient:   Map-Reduce Framework
09/08/27 15:04:10 INFO mapred.JobClient: Reduce input groups=3064894
09/08/27 15:04:10 INFO mapred.JobClient: Combine output records=0
09/08/27 15:04:10 INFO mapred.JobClient: Map input records=3064894
09/08/27 15:04:10 INFO mapred.JobClient: Reduce shuffle bytes=32197844126
09/08/27 15:04:10 INFO mapred.JobClient: Reduce output records=3064894
09/08/27 15:04:10 INFO mapred.JobClient: Spilled Records=6129788
09/08/27 15:04:10 INFO mapred.JobClient: Map output bytes=32237256375
09/08/27 15:04:10 INFO mapred.JobClient: Map input bytes=32318791307
09/08/27 15:04:10 INFO mapred.JobClient: Combine input records=0
09/08/27 15:04:10 INFO mapred.JobClient: Map output records=3064894
09/08/27 15:04:10 INFO mapred.JobClient: Reduce input records=3064894

Do spilled records refer to the mapper output phase only, to the
merge/reduce phase only, or both?

Is it possible to enable more of these counters (in case not all of
them are on by default)?

b) I am trying to use vaidya in hadoop-0.20.0 to do post-run
performance analysis of a simple sort job. I am specifying my job
configuration and history log found under logs/history in my local
filesystem. However I get an exception while the log is being
processed.

headnode:~/hadoop-0.20.0 # java -classpath
hadoop-0.20.0-core.jar:contrib/vaidya/hadoop-0.20.0-vaidya.jar:lib/commons-logging-1.0.4.jar
org.apache.hadoop.vaidya.postexdiagnosis.PostExPerformanceDiagnoser
-jobconf 
file://localhost/home/vliaskov/hadoop-0.20.0/logs/history/headnode_1251401328090_job_200908271428_0004_conf.xml
-joblog 
file://localhost/home/vliaskov/hadoop-0.20.0/logs/history/headnode_1251401328090_job_200908271428_0004_root_sorter

Pattern:<{(org.apache.hadoop.mapred.JobInProgress$Counter)(Job
Counters )[(TOTAL_LAUNCHED_REDUCES)(Launched reduce
tasks)(72)][(RACK_LOCAL_MAPS)(Rack-local map
tasks)(2)][(TOTAL_LAUNCHED_MAPS)(Launched map
tasks)(480)][(DATA_LOCAL_MAPS)(Data-local map
tasks)(478)]}{(FileSystemCounters)(FileSystemCounters)[(FILE_BYTES_READ)(FILE_BYTES_READ)(3225441)][(HDFS_BYTES_READ)(HDFS_BYTES_READ)(32326069128)][(FILE_BYTES_WRITTEN)(FILE_BYTES_WRITTEN)(64510015810)][(HDFS_BYTES_WRITTEN)(HDFS_BYTES_WRITTEN)(32318798519)]}{(org.apache.hadoop.mapred.Task$Counter)(Map-Reduce
Framework)[(REDUCE_INPUT_GROUPS)(Reduce input
groups)(3064894)][(COMBINE_OUTPUT_RECORDS)(Combine output
records)(0)][(MAP_INPUT_RECORDS)(Map input
records)(3064894)][(REDUCE_SHUFFLE_BYTES)(Reduce shuffle
bytes)(32193406838)][(REDUCE_OUTPUT_RECORDS)(Reduce output
records)(3064894)][(SPILLED_RECORDS)(Spilled
Records)(6129788)][(MAP_OUTPUT_BYTES)(Map output
bytes)(32237256375)][(MAP_INPUT_BYTES)(Map input
bytes)(32318791307)][(MAP_OUTPUT_RECORDS)(Map output
records)(3064894)][(COMBINE_INPUT_RECORDS)(Combine input
records)(0)][(REDUCE_INPUT_RECORDS)(Reduce input records)(3064894)]}>
==> NOT INCLUDED IN PERFORMANCE ADVISOR
JobHistory.Keys.JOB_PRIORITY : NOT INCLUDED IN PERFORMANCE ADVISOR COUNTERS
JobHistory.Keys.STATE_STRING : NOT INCLUDED IN PERFORMANCE ADVISOR MAP COUNTERS
JobHistory.Keys.HTTP_PORT : NOT INCLUDED IN PERFORMANCE ADVISOR MAP COUNTERS
[snip lots of output...]
Pattern:<{(FileSystemCounters)(FileSystemCounters)[(FILE_BYTES_READ)(FILE_BYTES_READ)(447950837)][(FILE_BYTES_WRITTEN)(FILE_BYTES_WRITTEN)(447950837)][(HDFS_BYTES_WRITTEN)(HDFS_BYTES_WRITTEN)(448842224)]}{(org.apache.hadoop.mapred.Task$Counter)(Map-Reduce
Framework)[(REDUCE_INPUT_GROUPS)(Reduce input
groups)(42528)][(COMBINE_OUTPUT_RECORDS)(Combine output
records)(0)][(REDUCE_SHUFFLE_BYTES)(Reduce shuffle
bytes)(447117550)][(REDUCE_OUTPUT_RECORDS)(Reduce output
records)(42528)][(SPILLED_RECORDS)(Spilled
Records)(42528)][(COMBINE_INPUT_RECORDS)(Combine input
records)(0)][(REDUCE_INPUT_RECORDS)(Reduce input records)(42528)]}>
==> NOT INCLUDED IN PERFORMANCE ADVISOR MAP TASK
JobHistory.Keys.STATE_STRING : NOT INCLUDED IN PERFORMANCE ADVISOR
REDUCE COUNTERS
JobHistory.Keys.SPLITS : NOT