Re: How to print values in console while running MapReduce application

2017-10-08 Thread Naganarasimha Garla
> wrote: > i did the same tutorial, i think the only way is doing it outside hadoop. in the command line: cat folder/* | python mapper.py | sort | python reducer > On Wednesday, October 4,

Re: How to print values in console while running MapReduce application

2017-10-08 Thread Harsh J
> i did the same tutorial, i think the only way is doing it outside hadoop. in the command line: cat folder/* | python mapper.py | sort | python reducer > On Wednesday, October 4, 2017 16:20:31, Tanvir Rahman < tanvir9982...@

Re: How to print values in console while running MapReduce application

2017-10-04 Thread Tanvir Rahman
command line: cat folder/* | python mapper.py | sort | python reducer > On Wednesday, October 4, 2017 16:20:31, Tanvir Rahman <tanvir9982...@gmail.com> wrote: > Hello, I have a small cluster and I am running MapReduce WordCount application in it.

Re: How to print values in console while running MapReduce application

2017-10-04 Thread Demian Kurejwowski
and I am running MapReduce WordCount application in it. I want to print some variable values in the console (where you can see the map and reduce job progress and other job information) for debugging purposes while running the MapReduce application. What is the easiest way to do that? Thanks in advance

Re: How to print values in console while running MapReduce application

2017-10-04 Thread Sultan Alamro
Hi, the easiest way is to open a new window and display the log file as follows: tail -f /path/to/log/file.log Best, Sultan > On Oct 4, 2017, at 5:20 PM, Tanvir Rahman <tanvir9982...@gmail.com> wrote: > Hello, I have a small cluster and I am running MapReduce Wor

How to print values in console while running MapReduce application

2017-10-04 Thread Tanvir Rahman
Hello, I have a small cluster and I am running the MapReduce WordCount application in it. I want to print some variable values in the console (where you can see the map and reduce job progress and other job information) for debugging purposes while running the MapReduce application. What is the easiest way to do that?
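The local simulation suggested earlier in the thread (piping files through the mapper, sort, and reducer by hand) can be sketched without Hadoop at all. In this hedged sketch the folder name is hypothetical and the Python mapper/reducer are stood in for by tr/uniq/awk one-liners; with a real streaming job you would substitute `python mapper.py` and `python reducer.py`, and any stderr debug prints then show up directly in the terminal:

```shell
# Simulate the streaming wordcount pipeline locally (no cluster needed).
# 'input' is a hypothetical folder of text files; swap the tr/uniq stages
# for `python mapper.py` / `python reducer.py` to debug real scripts.
mkdir -p input
printf 'hello world\nhello hadoop\n' > input/part-0

cat input/* \
  | tr -s ' ' '\n' \
  | sort \
  | uniq -c \
  | awk '{print $2"\t"$1}'
```

This prints each word with its count, which is exactly what the reducer of the real job would emit, so logic bugs can be found before ever submitting to the cluster.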

Re: running mapreduce on different filesystems as input and output locations

2017-03-31 Thread sudhakara st
It is not possible to write to S3 via context.write(), but it is possible if you open an S3 file in the reducer and write to it yourself. Create an output stream to an S3 file in the reducer's setup() method, e.g.: FSDataOutputStream fsStream = FileSystem.get(s3Uri, conf).create(outPath); PrintWriter writer = new PrintWriter(fsStream);

running mapreduce on different filesystems as input and output locations

2017-03-27 Thread Jae-Hyuck Kwak
Hi, I want to run mapreduce with different filesystems as the input and output locations: # hadoop jar examples.jar wordcount hdfs://input s3://output Is it possible? Any comments are welcome. Best regards, Jae-Hyuck
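Assuming the S3 filesystem connector is configured, mixing schemes in one job is mostly a matter of fully qualifying both paths. A hedged invocation sketch follows; the namenode host, bucket name, and credential property names are placeholders, and the exact property names depend on whether the s3, s3n, or s3a connector is in use (s3a shown here):

```shell
# Hypothetical invocation: input on HDFS, output on S3.
# Credential property names vary by connector version (s3a shown).
hadoop jar examples.jar wordcount \
  -D fs.s3a.access.key=YOUR_ACCESS_KEY \
  -D fs.s3a.secret.key=YOUR_SECRET_KEY \
  hdfs://namenode:8020/input \
  s3a://my-bucket/output
```

The -D options work because the wordcount example driver uses GenericOptionsParser; a driver that does not would need the credentials set in core-site.xml instead.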

OutOfMemory error with Java heap space when running mapreduce

2016-01-21 Thread Suresh V
We have a mapreduce job that processes text files that are inside a zip file. The program ran fine when we gave it zip files up to 40GB in size. When we gave a zip file of size 80MB as input (the zip file has a 1.2GB text file inside), the mapreduce job errored out with the error below: 2016-01-21 14:47:19,384

Running MapReduce jobs in batch mode on different data sets

2015-02-21 Thread tesm...@gmail.com
Hi, is it possible to run jobs on Hadoop in batch mode? I have 5 different datasets in HDFS and need to run the same MapReduce application on these datasets one after the other. Right now I am doing it manually. How can I automate this? How can I save the log of each execution in a text
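One way to automate the scenario above is a plain shell loop around the driver command, with one log file per run. This is a sketch under stated assumptions: the jar name, main class, and dataset paths are hypothetical, and the driver command sits in a variable so it can be swapped out or stubbed for testing:

```shell
#!/bin/sh
# Run the same MapReduce job over several datasets in sequence,
# saving each run's console output to its own log file.
# HADOOP_CMD, myapp.jar, com.example.MyJob and the paths are placeholders.
HADOOP_CMD="${HADOOP_CMD:-hadoop jar myapp.jar com.example.MyJob}"
mkdir -p logs
for ds in dataset1 dataset2 dataset3 dataset4 dataset5; do
    # one log file per execution, stdout and stderr together
    $HADOOP_CMD "/data/$ds" "/out/$ds" > "logs/$ds.log" 2>&1
done
```

Cron or a workflow scheduler such as Oozie can then invoke the script; the per-dataset logs answer the "save the log of each execution" part of the question.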

Re: RE: YarnChild didn't be killed after running mapreduce

2014-11-02 Thread dwld0...@gmail.com
K S Date: 2014-10-31 16:12 To: user@hadoop.apache.org Subject: RE: YarnChild didn't be killed after running mapreduce This is strange!! Can you get ps -aef | grep <pid> for this process? What is the application status in the RM UI? Thanks & Regards, Rohith Sharma K S

YarnChild didn't be killed after running mapreduce

2014-10-31 Thread dwld0...@gmail.com
/hsperfdata_yarn; it will be there after running mapreduce (YARN) again. I had modified many parameters in yarn-site.xml and mapred-site.xml. yarn-site.xml: <property> <name>yarn.nodemanager.resource.memory-mb</name> <value>4096</value> </property> <property> <name>yarn.nodemanager.resource.cpu-vcores</name>

RE: YarnChild didn't be killed after running mapreduce

2014-10-31 Thread Rohith Sharma K S
From: dwld0...@gmail.com [mailto:dwld0...@gmail.com] Sent: 31 October 2014 13:05 To: user@hadoop.apache.org Subject: YarnChild didn't be killed after running mapreduce All, I ran the mapreduce example successfully, but an invalid process always appeared on the nodemanager

Re: Re: running mapreduce

2014-05-27 Thread dwld0...@gmail.com
Hi, how can I solve this problem? I deleted all data and reformatted Hadoop. Everything works, but the unavailable process still appears after running mapreduce; I deleted the process, and it still appears again. dwld0...@gmail.com From: dwld0425@gmail.com Date: 2014-05-26 10:56 To: user@hadoop.apache.org Subject

running mapreduce

2014-05-25 Thread dwld0...@gmail.com
Hi, once mapreduce runs, an unavailable process appears. Each time it is like this: 3472 ThriftServer 3134 NodeManager 3322 HRegionServer 4383 -- process information unavailable 4595 Jps 2978 DataNode I deleted the process id in the directory /tmp/hsperfdata_yarn

Re: running mapreduce

2014-05-25 Thread Ted Yu
Can you provide a bit more information, such as the release of Hadoop you're running? BTW did you use the 'ps' command to see the command line for 4383? Cheers On Sun, May 25, 2014 at 7:30 AM, dwld0...@gmail.com wrote: Hi, once mapreduce runs, an unavailable process appears

Re: Re: running mapreduce

2014-05-25 Thread dwld0...@gmail.com
Hi, it is CDH 5.0.0, Hadoop 2.3.0. I found the unavailable process disappeared this morning, but it appears again on the map and reduce server after running mapreduce. # jps 15371 Jps 2269 QuorumPeerMain 15306 -- process information unavailable 11295 DataNode 11455 NodeManager # ps -ef | grep

Fwd: Trouble in running MapReduce application

2013-02-24 Thread Fatih Haltas
Hi Hemanth, thanks for your great help, I am really much obliged to you. I solved this problem by changing my Java compiler version, but now, though I changed every node's configuration, I am getting this error even when I try to run the wordcount example without making any changes. What may be the

Re: Trouble in running MapReduce application

2013-02-23 Thread Hemanth Yamijala
Can you try this? Pick a class like WordCount from your package and execute this command: javap -classpath <path to your jar> -verbose org.myorg.WordCount | grep version. E.g. here's what I get for my class: $ javap -verbose WCMapper | grep version minor version: 0 major version: 50
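If javap is not handy, the same check can be sketched by reading the class-file header directly, since its layout is fixed: 4 magic bytes (CA FE BA BE), then a 2-byte minor version and a 2-byte major version (major 50 = Java 6, 51 = Java 7). The file name below is hypothetical, and for demonstration the header bytes are fabricated with printf rather than produced by javac:

```shell
# Read the major class-file version straight from the header.
# A .class file starts CA FE BA BE, then 2-byte minor, 2-byte major.
# Fabricate a header for the demo (major 0x32 = 50 = Java 6).
printf '\312\376\272\276\000\000\000\062' > Demo.class

# Skip 6 bytes, read the 2-byte big-endian major version.
od -An -j6 -N2 -tu1 Demo.class | awk '{print $1 * 256 + $2}'
```

A major version higher than what the cluster's JVM supports is exactly the mismatch Hemanth's javap command is meant to reveal.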

Trouble in running MapReduce application

2013-02-19 Thread Fatih Haltas
Hi everyone, I know it is a common mistake not to specify the class name while trying to run a jar; however, although I specified it, I am still getting the ClassNotFound exception. What may be the reason for it? I have been struggling with this problem for more than 2 days. I just wrote

Re: Trouble in running MapReduce application

2013-02-19 Thread Harsh J
Your point (4) explains the problem. The packed jar structure should look like the below, and not how it is presently (one extra top-level dir is present): META-INF/ META-INF/MANIFEST.MF org/ org/myorg/ org/myorg/WordCount.class org/myorg/WordCount$TokenizerMapper.class

Re: Trouble in running MapReduce application

2013-02-19 Thread Harsh J
Oops, I just noticed Hemanth has been answering on a dupe thread as well. Let's drop this thread and carry on there :) On Tue, Feb 19, 2013 at 11:14 PM, Harsh J <ha...@cloudera.com> wrote: Hi, the new error usually happens if you compile using Java 7 and try to run via Java 6 (for example). That

Re: I am running MapReduce on a 30G data on 1master/2 slave, but failed.

2013-01-15 Thread yaotian
Regards, Bejoy KS. Sent from remote device, please excuse typos. From: yaotian <yaot...@gmail.com> Date: Fri, 11 Jan 2013 14:35:07 +0800 To: user@hadoop.apache.org Reply-To: user@hadoop.apache.org Subject: Re: I am running MapReduce on a 30G data on 1master/2

Re:Re: I am running MapReduce on a 30G data on 1master/2 slave, but failed.

2013-01-15 Thread Charlie A.
from remote device, please excuse typos. From: yaotian <yaot...@gmail.com> Date: Fri, 11 Jan 2013 14:35:07 +0800 To: user@hadoop.apache.org Reply-To: user@hadoop.apache.org Subject: Re: I am running MapReduce on a 30G data on 1master/2 slave, but failed. See inline. 2013/1/11 Harsh J ha

Re: I am running MapReduce on a 30G data on 1master/2 slave, but failed.

2013-01-13 Thread yaotian
From: yaotian <yaot...@gmail.com> Date: Fri, 11 Jan 2013 14:35:07 +0800 To: user@hadoop.apache.org Reply-To: user@hadoop.apache.org Subject: Re: I am running MapReduce on a 30G data on 1master/2 slave, but failed. See inline. 2013/1/11 Harsh J ha

Re: I am running MapReduce on a 30G data on 1master/2 slave, but failed.

2013-01-11 Thread Serge Blazhiyevskyy
Are you running this on a VM by any chance? On Jan 10, 2013, at 9:11 PM, Mahesh Balija <balijamahesh@gmail.com> wrote: Hi, 2 reducers completed successfully and 1498 have been killed. I assume that you have data issues. (Either the

Re: I am running MapReduce on a 30G data on 1master/2 slave, but failed.

2013-01-10 Thread Mahesh Balija
Hi, 2 reducers completed successfully and 1498 have been killed. I assume that you have data issues (either the data is huge or there are issues with the data you are trying to process). One possibility could be that you have many values associated with a single key, which can

Re: I am running MapReduce on a 30G data on 1master/2 slave, but failed.

2013-01-10 Thread yaotian
Yes, you are right. The data is GPS traces keyed by the corresponding uid. The reducer is doing this: sort by user to get this kind of result: uid, gps1, gps2, gps3. Yes, the GPS data is big because this is 30G of data. How to solve this? 2013/1/11 Mahesh Balija <balijamahesh@gmail.com> Hi,

Re: I am running MapReduce on a 30G data on 1master/2 slave, but failed.

2013-01-10 Thread Harsh J
If the per-record processing time is very high, you will need to periodically report a status. Without a status change report from the task to the tracker, it will be killed away as a dead task after a default timeout of 10 minutes (600s). Also, beware of holding too much memory in a reduce JVM -

Re: I am running MapReduce on a 30G data on 1master/2 slave, but failed.

2013-01-10 Thread yaotian
See inline. 2013/1/11 Harsh J ha...@cloudera.com If the per-record processing time is very high, you will need to periodically report a status. Without a status change report from the task to the tracker, it will be killed away as a dead task after a default timeout of 10 minutes (600s).

Re: I am running MapReduce on a 30G data on 1master/2 slave, but failed.

2013-01-10 Thread bejoy . hadoop
To: user@hadoop.apache.org Reply-To: user@hadoop.apache.org Subject: Re: I am running MapReduce on a 30G data on 1master/2 slave, but failed. See inline. 2013/1/11 Harsh J ha...@cloudera.com If the per-record processing time is very high, you will need to periodically report a status. Without

Re: Cygwin+Windows: Error running mapreduce unit tests

2012-09-23 Thread Sahana Bhat
Hi, I am encountering the error below when I try to run unit tests using MRUnit for mapreduce on Windows in my Eclipse environment. The version of Hadoop is 0.20.2. I already have Cygwin installed and set in my PATH variable. Error:

problem in running mapreduce task

2011-03-14 Thread vishalgoyal
give a solution to this -- View this message in context: http://hadoop-common.472056.n3.nabble.com/problem-in-running-mapreduce-task-tp2676765p2676765.html Sent from the Users mailing list archive at Nabble.com.

Re: problem in running mapreduce task

2011-03-14 Thread maha

problem in running mapreduce task

2011-03-14 Thread vishalgoyal
a solution to this -- View this message in context: http://hadoop-common.472056.n3.nabble.com/problem-in-running-mapreduce-task-tp2676753p2676753.html

Any projects to help with running MapReduce across physically distributed clusters?

2010-11-03 Thread Jason Smith
I am looking into the problem of running jobs to generate statistics across a large data set that would be split into different clusters geographically. Each cluster would have a unique piece of the overall data set, as the network overhead to collocate the data would be too much. I tried

Re: Any projects to help with running MapReduce across physically distributed clusters?

2010-11-03 Thread Chris K Wensel
You could easily write Cascading apps that could pull all the data into a single source and perform the processing. You could also use it to launch jobs in different clusters from a single application (each Flow can be given unique properties causing it to run mr jobs on arbitrary clusters).

weird exception when running mapreduce jobs with hadoop 0.21.0

2010-09-16 Thread Marc Sturlese
) at java.lang.reflect.Constructor.newInstance(Constructor.java:513) at org.apache.hadoop.util Any idea why this is happening? -- View this message in context: http://lucene.472066.n3.nabble.com/weird-exception-when-running-mapreduce-jobs-with-hadoop-0-21-0-tp1488154p1488154.html

Re: Running Mapreduce program apart from command prompt

2010-05-27 Thread Eric Sammer
Nishant: You can submit jobs from any Java program provided you have the Hadoop jars and configuration directory in your classpath. This is done with the normal JobClient class. On Thu, May 27, 2010 at 3:58 AM, Nishant Sonar <nisha...@synechron.com> wrote: Hello, I have a requirement where I

Re: Running MapReduce without setJar

2009-04-07 Thread Farhan Husain
Thanks Aaron for the explanation. On Tue, Apr 7, 2009 at 1:51 PM, Aaron Kimball aa...@cloudera.com wrote: All the nodes in your Hadoop cluster need access to the class files for your MapReduce job. The current mechanism that Hadoop has to distribute classes and attach them to the classpath

Re: Running MapReduce without setJar

2009-04-07 Thread Aaron Kimball
All the nodes in your Hadoop cluster need access to the class files for your MapReduce job. The current mechanism that Hadoop has to distribute classes and attach them to the classpath assumes they're in a JAR together. Thus, merely specifying the names of mapper/reducer classes with

Re: Running MapReduce without setJar

2009-04-02 Thread Rasit OZDAS
Yes; as an additional note, you can use this code to just start the job without waiting for it to finish: JobClient client = new JobClient(conf); client.submitJob(conf); (runJob() would block until completion.) 2009/4/1 javateck javateck <javat...@gmail.com> you can run from java program: JobConf conf = new

Re: Running MapReduce without setJar

2009-04-02 Thread Farhan Husain
Does this class need to have the mapper and reducer classes too? On Wed, Apr 1, 2009 at 1:52 PM, javateck javateck <javat...@gmail.com> wrote: you can run it from a java program: JobConf conf = new JobConf(MapReduceWork.class); // setting your params JobClient.runJob(conf);

Re: Running MapReduce without setJar

2009-04-02 Thread Farhan Husain
I did all of them, i.e. I used setMapperClass, setReducerClass and new JobConf(MapReduceWork.class), but it still cannot run the job without a jar file. I understand the reason, that it looks for those classes inside a jar, but I think there should be some better way to find those classes without using a

Running MapReduce without setJar

2009-04-01 Thread Farhan Husain
Hello, can anyone tell me if there is any way of running a map-reduce job from a Java program without specifying the jar file via the JobConf.setJar() method? Thanks, -- Mohammad Farhan Husain Research Assistant Department of Computer Science Erik Jonsson School of Engineering and Computer Science

Re: Running MapReduce without setJar

2009-04-01 Thread javateck javateck
I think you need to set a property (mapred.jar) inside hadoop-site.xml; then you don't need to hardcode it in your Java code, and it will be fine. But I don't know if there is any way we can set multiple jars, since a lot of the time our own mapreduce class needs to reference other jars. On Wed,

Re: Running MapReduce without setJar

2009-04-01 Thread Farhan Husain
Can I get rid of the whole jar thing? Is there any way to run map reduce programs without using a jar? I do not want to use hadoop jar ... either. On Wed, Apr 1, 2009 at 1:10 PM, javateck javateck <javat...@gmail.com> wrote: I think you need to set a property (mapred.jar) inside hadoop-site.xml,

Re: Running MapReduce without setJar

2009-04-01 Thread javateck javateck
you can run it from a java program: JobConf conf = new JobConf(MapReduceWork.class); // setting your params JobClient.runJob(conf); On Wed, Apr 1, 2009 at 11:42 AM, Farhan Husain <russ...@gmail.com> wrote: Can I get rid of the whole jar thing? Is there any way to run map