Hi All,
I am new to Hadoop and was trying to generate random numbers using the Apache
Commons Math library.
I used NetBeans to build the jar file, and the manifest has the path to the
commons-math jar as lib/commons-math3.jar.
I have placed this jar file in the HADOOP_HOME/lib folder, but I am still
getting a ClassNotFoundException.
Hi,
Check the links below.
Read from HDFS:
https://sites.google.com/site/hadoopandhive/home/hadoop-how-to-read-a-file-from-hdfs
Write to HDFS:
https://sites.google.com/site/hadoopandhive/home/how-to-write-a-file-in-hdfs-using-hadoop
Hope they help!
Thanks & regards
Arko
On Tue, Apr 3, 2012 a
Hi Xin,
when you're running your MapReduce job, at some point you'll have to wire it
together, i.e., say what the mapper class is, what the reducer class is, etc.
There you can also configure the job to use your new OutputFormat class.
Something like this (a sketch):
--
Job job = new Job(conf, "my job");
job.setOutputFormatClass(OverwritingTextOutputFormat.class);
Hi Christoph,
Thank you for your reply.
I created such a class in the project, built an instance of it in
main, and tried to use the method it includes, but it didn't work.
Can you explain a little more about how to make this function work?
Thank you!
On Tue, Apr 3, 2012 at 6:39 PM, Christoph S
Hi Bejoy,
Could you kindly elaborate further? What should I insert, and where?
Thank you!
On Tue, Apr 3, 2012 at 7:36 PM, Bejoy Ks wrote:
> Hi Xin
> In a very simple way, just include the line of code in your Driver
> class to check whether the output dir exists in hdfs, if exists de
Hi,
The following code creates a cross product between two files. If you want the
cross product of a file with itself, specify the same file in both arguments.
package com.example.hadoopexamples.joinnew;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import ja
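The listing is cut off above; stripped of the Hadoop plumbing, the core idea is just to pair every line of one input with every line of the other. A minimal plain-Java sketch (the class and method names here are made up for illustration):

```java
import java.util.ArrayList;
import java.util.List;

public class CrossProduct {
    // Pair every element of the left input with every element of the right input.
    static List<String> cross(List<String> left, List<String> right) {
        List<String> out = new ArrayList<>();
        for (String l : left) {
            for (String r : right) {
                out.add(l + "\t" + r);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Passing the same list twice gives the cross product of a file with itself.
        List<String> lines = List.of("a", "b");
        System.out.println(cross(lines, lines)); // 4 pairs: a-a, a-b, b-a, b-b
    }
}
```

In a MapReduce version one input is typically loaded into memory on each mapper (e.g. via the distributed cache) while the other is streamed past it as map input.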
Hi Xin
In a very simple way, just include a line of code in your Driver
class to check whether the output dir exists in HDFS, and if it does, delete it.
Regards
Bejoy KS
On Tue, Apr 3, 2012 at 4:09 PM, Christoph Schmitz <
christoph.schm...@1und1.de> wrote:
> Hi Xin,
>
> you can derive your ow
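The check-and-delete step Bejoy describes is normally done with the Hadoop FileSystem API (roughly: get the FileSystem from the conf, test exists() on the output Path, and call delete(path, true) before submitting the job). As a runnable illustration of the same pattern, here is the local-filesystem equivalent using java.nio.file as a stand-in for the Hadoop API; the directory name is made up:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

public class OutputDirCleaner {
    // Delete the directory recursively if it exists; returns true if something was deleted.
    // This mirrors fs.exists(path) followed by fs.delete(path, true) in the Hadoop API.
    static boolean deleteIfExists(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return false;
        }
        try (Stream<Path> walk = Files.walk(dir)) {
            // Delete children before parents by sorting paths in reverse order.
            walk.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        Path out = Files.createTempDirectory("job-output");
        Files.writeString(out.resolve("part-r-00000"), "stale results\n");
        System.out.println(deleteIfExists(out)); // the stale dir existed, so this prints true
        System.out.println(Files.exists(out));   // prints false
    }
}
```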
YARN in hadoop 0.23.1 can do this.
Hi Xin,
you can derive your own output format class from one of the Hadoop
OutputFormats and make sure the "checkOutputSpecs" method, which usually does
the checking, is empty:
---
public final class OverwritingTextOutputFormat<K, V>
    extends TextOutputFormat<K, V> {
  @Override
  public void checkOutputSpecs(JobContext job) {
    // intentionally left empty: skips the "output directory already exists" check
  }
}
Hi, all
I'm writing my own MapReduce code using Eclipse with the Hadoop plug-in.
I've specified the input and output directories in the project properties
(two folders, namely input and output).
My problem is that each time I make a modification and try to
run it again, I have to manually delete the output directory.
Hi Xin
To add to the factors that you need to primarily consider
in deciding the slots:
- Memory
If each task needs 1 GB and you have 12 GB of available
memory, you can host 12 slots. Divide these between mapper and reducer
slots proportionally, based on the jobs in
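The memory arithmetic above can be made concrete; a toy sketch using the numbers from this thread (the 2:1 map/reduce split is just an illustrative choice):

```java
public class SlotMath {
    // How many task slots fit, given node memory and per-task memory (both in MB).
    static int maxSlots(long availableMb, long perTaskMb) {
        return (int) (availableMb / perTaskMb);
    }

    public static void main(String[] args) {
        long availableMb = 12 * 1024; // 12 GB available on the node
        long perTaskMb = 1024;        // each task needs 1 GB
        int slots = maxSlots(availableMb, perTaskMb);
        System.out.println(slots); // prints 12

        // Split the slots proportionally, e.g. 2:1 in favour of mappers.
        int mapSlots = slots * 2 / 3;
        int reduceSlots = slots - mapSlots;
        System.out.println(mapSlots + " map / " + reduceSlots + " reduce"); // prints 8 map / 4 reduce
    }
}
```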
Hi Xin
Yes, the number of worker nodes does count towards the map and reduce
capacity of the cluster. The map and reduce task capacity/slots is
set per node and depends, of course, on the requirements of the applications
that use the cluster. Based on the available memory, number of cores, etc.
you nee
Hi all,
of course it's sensible that the number of nodes in the cluster will
influence map/reduce task capacity, but what determines the average number
of tasks per node?
Can the number be set manually? Are there any hardware constraints on setting it?
Thank you!
Xin