Hi,
I am new to hadoop and I am trying to understand hadoop job submission.
We submit the job using:
hadoop jar some.jar name input output
this in turn invokes RunJar. But in RunJar I cannot find any
JobSubmit() or any call to JobClient.
Then, how does the job get submitted to the
Hi Arindam,
hadoop jar jarFileName MainClassName
The above command will not submit the job by itself. This command only executes the jar
file using the main class (the Main-Class from the manifest if available,
otherwise the class name, i.e. MainClassName in the above command, passed as an
argument). If
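To make that concrete, here is a minimal, self-contained sketch of what RunJar essentially does: resolve the main class (from the manifest's Main-Class, or the name passed on the command line) and invoke its main() by reflection. The real RunJar also unpacks the jar and sets up a classloader; Demo below is a stand-in driver class, not part of Hadoop.

```java
import java.lang.reflect.Method;

public class RunJarSketch {
    // Stand-in for the user's driver class inside the jar.
    public static class Demo {
        public static int received = -1;
        public static void main(String[] args) {
            received = args.length;
            System.out.println("driver got " + args.length + " args");
        }
    }

    public static void main(String[] args) throws Exception {
        // RunJar resolves the class by name and calls main() reflectively;
        // any job submission happens later, inside the driver's own code.
        Class<?> mainClass = Class.forName(RunJarSketch.Demo.class.getName());
        Method main = mainClass.getMethod("main", String[].class);
        main.invoke(null, (Object) new String[] {"input", "output"});
    }
}
```

So the JobClient call you are looking for lives in the user's driver code, not in RunJar itself.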
Yes, the job is submitted by the API calls in the MapReduce code.
On Wed, Apr 25, 2012 at 3:56 AM, Devaraj k devara...@huawei.com wrote:
Hi,
The code is:
public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
    if (otherArgs.length != 2) {
        System.err.println("Usage: wordcount <in> <out>");
You can submit the job using either of the ways below:
1. If you submit the job using JobClient, you need to create a JobConf and submit
the job using the JobClient.runJob(JobConf conf) API.
2. Also you can submit the job by creating an instance of Job, passing a
Configuration object, and submitting it (using
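As a hedged sketch of both submission paths (assuming a 1.x-era Hadoop on the classpath; "input", "output", and the class names are placeholders, not from the thread):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapreduce.Job;

public class SubmitSketch {
    // 1. Old API: build a JobConf and submit via JobClient.runJob (blocks
    //    until the job finishes).
    static void submitWithJobClient() throws Exception {
        JobConf conf = new JobConf(SubmitSketch.class);
        FileInputFormat.setInputPaths(conf, new Path("input"));
        FileOutputFormat.setOutputPath(conf, new Path("output"));
        JobClient.runJob(conf);
    }

    // 2. New API: build a Job from a Configuration and submit it.
    static void submitWithJob() throws Exception {
        Job job = new Job(new Configuration(), "example");
        job.setJarByClass(SubmitSketch.class);
        // Either job.submit() (returns immediately) or
        // job.waitForCompletion(true) (submits and blocks); use one, not both.
        job.waitForCompletion(true);
    }
}
```

Both paths end up talking to the JobTracker; the command line only gets your main() running.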
Hi,
I wanted to know if there are any existing APIs within Hadoop for us to do
some text analysis like sentiment analysis, etc., or are we to rely on tools
like R for this.
Regards,
Karanveer
For distribution of load you can start by reading some chapters on the different
types of Hadoop schedulers. I have not yet studied the other implementations,
however a very simplified version of the distribution concept is the
following:
a) The TaskTracker asks for work (the heartbeat consists of a status
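The pull model in (a) can be sketched with a toy scheduler (class and method names below are illustrative, not Hadoop's real API): on each heartbeat the tracker reports its free slots, and the scheduler hands back at most that many queued tasks.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

public class SchedulerSketch {
    public static class Scheduler {
        private final Queue<String> pending = new ArrayDeque<>();

        public void addTask(String task) { pending.add(task); }

        // Called on each heartbeat: the status message carries the number
        // of free slots, and the scheduler assigns at most that many tasks.
        public List<String> heartbeat(int freeSlots) {
            List<String> assigned = new ArrayList<>();
            while (freeSlots-- > 0 && !pending.isEmpty()) {
                assigned.add(pending.poll());
            }
            return assigned;
        }
    }

    public static void main(String[] args) {
        Scheduler s = new Scheduler();
        s.addTask("map-0");
        s.addTask("map-1");
        s.addTask("map-2");
        System.out.println(s.heartbeat(2)); // [map-0, map-1]
        System.out.println(s.heartbeat(2)); // [map-2]
    }
}
```

Real schedulers (FIFO, Fair, Capacity) differ in which tasks they pick here, not in the pull-based heartbeat loop itself.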
I just saw this line in the javadocs for OutputFormat:
Output files are stored in a FileSystem
(http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/fs/FileSystem.html).
Seems like an odd sentence. What is the implication here -- is this
implying anything other than the obvious?
Hadoop itself is the core Map/Reduce and HDFS functionality. The higher level
algorithms like sentiment analysis are often done by others. Cloudera has a
video from HadoopWorld 2010 about it
http://www.cloudera.com/resource/hw10_video_sentiment_analysis_powered_by_hadoop/
And there are
There are APIs which you can use; of course, they are third party.
---
Sent from Mobile, short and crisp.
On 25-Apr-2012 8:57 PM, Robert Evans ev...@yahoo-inc.com wrote:
I think what it means is that the output files can be stored in any of the
possible implementations of the FileSystem abstract class, depending on the
user's requirement. So it could be stored in DistributedFileSystem,
LocalFileSystem, etc.
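A hedged illustration (assumes the Hadoop jars on the classpath): FileSystem.get(conf) hands back whichever concrete implementation matches the configured default filesystem.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class FsSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // file:/// -> LocalFileSystem, hdfs://host:port -> DistributedFileSystem,
        // depending on fs.default.name in the loaded configuration.
        FileSystem fs = FileSystem.get(conf);
        System.out.println(fs.getClass().getSimpleName());
    }
}
```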
Regards,
John George
Strangely, I see the tmp folder has enough space. What else could be the
problem? How much should my tmp space be?
Error: java.io.IOException: No space left on device
at java.io.FileOutputStream.writeBytes(Native Method)
at
I do not know of an existing implementation of something as
specific as sentiment analysis, but if you're generally looking at
MapReduce for text processing I highly recommend visiting
http://cloud9lib.org
On Wed, Apr 25, 2012 at 6:20 PM, karanveer.si...@barclays.com wrote:
Looks like the Hadoop partition is full.
Sent via my mobile device
On Apr 25, 2012, at 9:13 PM, Nuthalapati, Ramesh
ramesh.nuthalap...@mtvstaff.com wrote:
This is from your mapred.local.dir (which by default may reuse hadoop.tmp.dir).
Do you see free space available when you do the following?:
df -h /opt/hadoop
On Thu, Apr 26, 2012 at 12:43 AM, Nuthalapati, Ramesh
ramesh.nuthalap...@mtvstaff.com wrote:
If you've got existing R code, you might want to look at this
http://www.quora.com/How-can-R-and-Hadoop-be-used-together.
Quora posting, also by Cloudera, or the rhipe R Hadoop package
https://github.com/saptarshiguha/RHIPE/wiki
Mahout and Lucene/Solr offer some level of text analysis, although
I have a lot of space available:
Filesystem             Size  Used Avail Use% Mounted on
/dev/mapper/sysvg-opt   14G  1.2G   12G    9% /opt
My input files are around 10G. Is there a requirement that the Hadoop tmp dir
should be at a certain % of the input size or something?
Ramesh,
That explains it then.
Going from Map to Reduce requires disk storage worth at least the
amount of data you're going to be sending between them. If you're running
your 'cluster' on a single machine, the answer to your question is
yes.
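If another partition has more room, pointing mapred.local.dir at it in mapred-site.xml keeps the intermediate map output off the default hadoop.tmp.dir; the path below is just an example, not a recommendation:

```xml
<property>
  <name>mapred.local.dir</name>
  <value>/opt/hadoop/mapred/local</value>
</property>
```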
On Thu, Apr 26, 2012 at 1:01 AM, Nuthalapati, Ramesh
The RHadoop package allows you to do statistical analysis. We were able to do a word
cloud on the text files using the rmr and rhdfs packages.
Installation details for these packages are available at the following link:
https://github.com/RevolutionAnalytics/RHadoop/wiki/rmr
Devi
Hi,
I am wondering what kind of Hadoop cluster I can get on Rackspace for
a $2k/month budget (number of servers, size of each server)? BTW, any pointer to
instructions on how to set up a Hadoop cluster or the Cloudera version on
Rackspace is really appreciated.
Regards,
Shawn
I had the same issue. My problem was the use of a VPN
connected to work while at the same time working
with M/R jobs on my Mac. It occurred to me that
maybe Hadoop was binding to the wrong IP (the IP
given to you after connecting through the VPN).
Bottom line: I disconnected from the VPN, and the M/R