Hi,
You can look at the job metrics from your JobTracker web UI. The
"Spilled Records" counter under the "Map-Reduce Framework" group shows
the number of records spilled in both the map and reduce tasks.
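If you want the same number programmatically, here is a minimal sketch (not from the original message); it assumes the new-API Job handle on the client and the Hadoop 2 TaskCounter enum name, which may differ on older 1.x releases:

    import java.io.IOException;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.TaskCounter;

    // Prints the combined spill count (map + reduce) for a finished job.
    static void printSpilledRecords(Job job) throws IOException {
      long spilled = job.getCounters()
          .findCounter(TaskCounter.SPILLED_RECORDS)
          .getValue();
      System.out.println("Spilled Records = " + spilled);
    }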
Regards
Ravi Magham.
On Thu, Sep 5, 2013 at 12:23 PM, ch huang wrote:
> hi,all:
>
You can also look at
a) https://github.com/intel-hadoop/HiBench
Regards
Ravi Magham
On Mon, Sep 2, 2013 at 12:26 PM, ch huang wrote:
> hi, all:
> i want to evaluate my hadoop cluster performance, what tool can i
> use? (TestDFSIO, nnbench?)
>
Adeel,
To add to Yong's points (a quick sketch of these settings follows below):
a) Consider tuning the number of copier threads in the reduce tasks
(mapred.reduce.parallel.copies) and the number of serving threads in the
task tracker process.
b) See if the map output can be compressed to ensure there is less shuffle IO.
c) Increase io.sort.factor so the framework merges a larger number of spill
files in a single pass.
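Not from the original message; just a sketch of where those knobs live, using Hadoop 1.x property names and placeholder values you would tune for your own cluster:

    import org.apache.hadoop.conf.Configuration;

    Configuration conf = new Configuration();
    // a) more parallel copier threads in each reduce task
    conf.setInt("mapred.reduce.parallel.copies", 10);
    // b) compress the intermediate map output to cut shuffle IO
    conf.setBoolean("mapred.compress.map.output", true);
    // c) merge more spill files per pass during the sort/merge phase
    conf.setInt("io.sort.factor", 50);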
Hi,
Can you check whether you are able to ping or telnet to the IP address and
port of the Oracle database from your machine? I have a hunch that the
Oracle listener is stopped. If so, start it.
The commands to check the status, and to start the listener if it isn't running:
$ lsnrctl status
$ lsnrctl start
R
Also, if both are defined, the framework will use the RawComparator. I hope
you have registered the comparator in a static block as follows:
static {
  // Register the raw comparator so the framework uses it for PairOfInts keys.
  WritableComparator.define(PairOfInts.class, new Comparator());
}
Regards
Ravi Magham
On Sat, Aug 31, 2013 at 1:23 PM, Ravi Kiran wrote:
> Hi Ad
Hi Adeel,
The RawComparator is the faster of the two, as you avoid the cost of
deserializing the byte stream into Writable objects just to compare keys.
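Not from the original message, but a minimal sketch of what such a comparator could look like for a hypothetical PairOfInts key whose write() emits two ints back-to-back:

    import org.apache.hadoop.io.WritableComparator;

    public static class Comparator extends WritableComparator {
      public Comparator() {
        super(PairOfInts.class);
      }

      // Compare the serialized bytes directly; no PairOfInts objects are created.
      @Override
      public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        int left  = readInt(b1, s1);        // first int of the left key
        int right = readInt(b2, s2);        // first int of the right key
        if (left != right) {
          return left < right ? -1 : 1;
        }
        left  = readInt(b1, s1 + 4);        // second int of the left key
        right = readInt(b2, s2 + 4);        // second int of the right key
        return left < right ? -1 : (left == right ? 0 : 1);
      }
    }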
Regards
Ravi Magham
On Fri, Aug 30, 2013 at 11:16 PM, Adeel Qureshi wrote:
> For secondary sort I am implementing a RawComparator and providing t
Also to add, the default serialization libraries supported are specified in
core-default.xml under the io.serializations property as:
org.apache.hadoop.io.serializer.WritableSerialization,
org.apache.hadoop.io.serializer.avro.AvroSpecificSerialization,
org.apache.hadoop.io.serializer.avro.AvroReflectSerialization
I wrote a blog post on this a while ago, where I was writing to multiple
tables from my mapper class. You can look into it at
http://bigdatabuzz.wordpress.com/2012/04/24/how-to-write-to-multiple-hbase-tables-in-a-mapreduce-job/
Key things are:
a) job.setOutputFormatClass(MultiTableOutputFormat.class)
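Roughly what the mapper side could look like; this is a sketch rather than the code from the blog post, and the table, column family, and qualifier names are made up:

    import java.io.IOException;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // Routes each record to two HBase tables. The driver must set
    // job.setOutputFormatClass(MultiTableOutputFormat.class) and, typically,
    // job.setNumReduceTasks(0).
    public class MultiTableMapper
        extends Mapper<LongWritable, Text, ImmutableBytesWritable, Put> {

      private static final ImmutableBytesWritable TABLE_A =
          new ImmutableBytesWritable(Bytes.toBytes("table_a"));
      private static final ImmutableBytesWritable TABLE_B =
          new ImmutableBytesWritable(Bytes.toBytes("table_b"));

      @Override
      protected void map(LongWritable key, Text value, Context context)
          throws IOException, InterruptedException {
        Put put = new Put(Bytes.toBytes(value.toString()));
        put.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("v"));
        // The output key names the destination table; the value is the Put itself.
        context.write(TABLE_A, put);
        context.write(TABLE_B, put);
      }
    }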
Hi,
You can definitely run the Driver (ClassWithMain) against a remote Hadoop
cluster from, say, Eclipse by following these steps:
a) Have the jar (Some.jar) on the classpath of your project in Eclipse.
b) Ensure you have set both the NameNode and JobTracker information, either
in core-site.xml and mapred-site.xml on the classpath or directly on the
job Configuration, as in the sketch below.
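Not from the original message; a minimal sketch of point b) using Hadoop 1.x property names, with placeholder hostnames and ports:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    Configuration conf = new Configuration();
    conf.set("fs.default.name", "hdfs://namenode.example.com:8020");  // NameNode
    conf.set("mapred.job.tracker", "jobtracker.example.com:8021");    // JobTracker
    Job job = new Job(conf, "remote-run");
    job.setJarByClass(ClassWithMain.class);  // the class packaged inside Some.jar
    // ... set mapper/reducer/input/output as usual, then:
    job.waitForCompletion(true);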
> Configuration conf = new Configuration();
>
> conf.set("mapreduce.output.textoutputformat.separator", ",");
>
> Am I changing the field right?
>
> Thanks,
>
> Andrew
>
> *From:* Ravi
Hi Andrew,
You can change the default keyValueSeparator of the output format
from a "\t" to a "," by
setting the property *mapred.textoutputformat.separator* on the
Configuration of the job.
You will face difficulties if this output is the input to another job, as
you wouldn't know whether a comma is the separator or part of the data.
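Not from the original message; a minimal sketch of setting it, using the old-API property name mentioned above (in newer releases the equivalent is mapreduce.output.textoutputformat.separator):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    Configuration conf = new Configuration();
    conf.set("mapred.textoutputformat.separator", ",");  // default is "\t"
    Job job = new Job(conf, "comma-separated-output");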