Re: Estimating Time required to compute M/Rjob

2011-04-18 Thread Matthew Foley
R.V., I was only suggesting one way to tackle the problem; I don't have a list of appropriate parameters. I think Ted has much more experience in this area, and he is encouraging you to stay with the generic approach. You should study the paper he recommended; the approach looks really

Re: Estimating Time required to compute M/Rjob

2011-04-18 Thread real great..
Sure, will do. :) On Mon, Apr 18, 2011 at 11:58 AM, Matthew Foley ma...@yahoo-inc.com wrote: R.V., I was only suggesting one way to tackle the problem; I don't have a list of appropriate parameters. I think Ted has much more experience in this area, and he is encouraging you to stay with the

Re: Problem starting up tasktracker - CDH3u0

2011-04-18 Thread Matthew Tovbin
Got it all working: /app/hadoop/tmp required 755 permissions. On Mon, Apr 18, 2011 at 11:46, Matthew Tovbin matt...@tovbin.com wrote: Hey, I got CDH3u0 installed. I am trying to set up a single-node configuration. I got the name node, data node and job tracker up and running. But when I try to start the task

java.net.ConnectException

2011-04-18 Thread RAGHAVENDRA PRASAD
I am a newbie to Hadoop. We are trying to set up Hadoop infrastructure at our company. I am not sure whether this is the right forum for this question. Our application server runs Windows. I was looking for a tutorial showing how we can connect from a Windows system to Hadoop (on Ubuntu) and have to run

Re: java.net.ConnectException

2011-04-18 Thread praveenesh kumar
Hi, have you checked the ports on which the map-reduce server and HDFS are running? I guess the plugin uses its own ports by default. You have to replace them with the ports on which you are running your map-reduce and HDFS. I guess that might help you! Thanks, Praveenesh On Mon, Apr 18, 2011 at

A problem about the network

2011-04-18 Thread ajing.wang
Hi all, I have encountered a hardware problem. I built a Hadoop cluster with four old computers. After starting the cluster and running jobs for a few days, I found that the computers' network became slow and the jobs' processing time became longer and longer. It becomes normal after

Re: Map Result Caching

2011-04-18 Thread Robert Evans
DoomUs, To me it seems like it should be something at the application level and less at the Hadoop level. I would think that if there really is very little delta between the runs, then the application would save the output of a map-only job, and the next time would do a union of that and the output

Re: A problem about the network

2011-04-18 Thread Juwei Shi
Did you look at the cluster-wide resource usage, such as CPU, memory and network I/O? You may check the resource utilization when the processing becomes slow. Then the bottleneck can be detected. On April 18, 2011 at 9:38 PM, ajing.wang ajing.w...@qq.com wrote: HI,all. I have encountered a problem about

How HDFS decides where to put the block

2011-04-18 Thread Nan Zhu
Hi all, I'm confused about how HDFS decides where to put data blocks. Say the user invokes a command like ./hadoop put ***, and we assume that this file consists of 3 blocks. How does HDFS decide where these 3 blocks are put? Most of the materials don't

A problem about the network

2011-04-18 Thread jbm3072
Hi all, I have encountered a hardware problem. I built a Hadoop cluster with four old computers. After starting the cluster and running jobs for a few days, I found that the computers' network became slow and the jobs' processing time became longer and longer. It becomes normal after

Problem starting up tasktracker - CDH3u0

2011-04-18 Thread Matthew Tovbin
Hey, I got CDH3u0 installed. I am trying to set up a single-node configuration. I got the name node, data node and job tracker up and running. But when I try to start the task tracker I get the following error message in the log: / STARTUP_MSG:

Having a terrible time compiling WordCount

2011-04-18 Thread modemide
I'm getting many "cannot find symbol" errors. I've been searching everywhere and have given up. There has to be a good (and very simple) reason why this is happening. My setup is as follows: Hadoop installed to - /usr/local/hadoop Test folder (to compile word count) -

Re: Having a terrible time compiling WordCount

2011-04-18 Thread Mark Kerzner
You may want to look here, https://github.com/markkerzner/HadoopInPracticeCode and here, http://hadoopinpractice.com/ Sincerely, Mark On Mon, Apr 18, 2011 at 1:34 PM, modemide modem...@gmail.com wrote: I'm getting many cannot find symbol errors. I've been searching

Finding logs for a failed job

2011-04-18 Thread Chris Curtin
Hi, I'm getting reports from users that a job has failed, but when I go to the console the 'failed' jobs section is clear, having been pushed out by other jobs running on the cluster. All of the users submit jobs as the same hadoop user, so searching by OS user doesn't help. Of course the user

How do I serialize a MapWritable value?

2011-04-18 Thread W.P. McNeill
I have a reducer that outputs pairs of the form <Text, MapWritable>. The MapWritable is a map from Text to Long. The key and output values of my reducer are declared like so: job.setOutputKeyClass(Text.class); job.setOutputValueClass(MapWritable.class); The last few lines of my

Any experience with HBQL

2011-04-18 Thread Steve Lewis
1) I get connection refused when trying to use it 2) I am looking for a Maven repository to add it to my build -- Steven M. Lewis PhD 4221 105th Ave NE Kirkland, WA 98033 206-384-1340 (cell) Skype lordjoe_com

Error while using distcp

2011-04-18 Thread sonia gehlot
Hi All, I am trying to copy files from one Hadoop cluster to another Hadoop cluster, but I am getting the following error: [phx1-rb-bi-dev50-metrics-qry1:]$ scripts/hadoop.sh distcp hftp://c17-dw-dev50-hdfs-dn-n1:50070/user/sgehlot/fact_lead.v0.txt.gz \

Re: Error while using distcp

2011-04-18 Thread James Seigel
Same versions of hadoop in each cluster? Sent from my mobile. Please excuse the typos. On 2011-04-18, at 6:31 PM, sonia gehlot sonia.geh...@gmail.com wrote: Hi All, I am trying to copy files from one hadoop cluster to another hadoop cluster but I am getting following error:

Re: Error while using distcp

2011-04-18 Thread sonia gehlot
Yes, the same version of Hadoop on both clusters. On Mon, Apr 18, 2011 at 5:42 PM, James Seigel ja...@tynt.com wrote: Same versions of hadoop in each cluster? Sent from my mobile. Please excuse the typos. On 2011-04-18, at 6:31 PM, sonia gehlot sonia.geh...@gmail.com wrote: Hi All,

Re: Error while using distcp

2011-04-18 Thread sonia gehlot
Sorry guys, it was just typos; it works now. Thanks, Sonia On Mon, Apr 18, 2011 at 5:45 PM, sonia gehlot sonia.geh...@gmail.com wrote: Yes same versions of hadoop on both the clusters. On Mon, Apr 18, 2011 at 5:42 PM, James Seigel ja...@tynt.com wrote: Same versions of hadoop in each cluster?

April SFHUG recap, May SFHUG meetup announcement

2011-04-18 Thread Aaron Kimball
Hello Hadoop fans, This last week we had a very successful meetup of the SF Hadoop User Group, hosted by Twitter. Breakout topics included: * Log analysis * Cluster resource management * FlumeBase * Reusable MapReduce scripts * Languages for MapReduce programming * Hadoop Roadmap * Oozie Best

Hadoop Speed Efficiency ??

2011-04-18 Thread praveenesh kumar
Hello everyone, I am new to Hadoop. I set up a Hadoop cluster of 4 Ubuntu systems (Hadoop 0.20.2), and I am running the well-known word count (Gutenberg) example to test how fast my Hadoop is working. But whenever I run the wordcount example, I am not able to see much processing time

Re: Hadoop Speed Efficiency ??

2011-04-18 Thread real great..
What's your input size? On Tue, Apr 19, 2011 at 10:21 AM, praveenesh kumar praveen...@gmail.com wrote: Hello everyone, I am new to hadoop... I set up a hadoop cluster of 4 ubuntu systems. ( Hadoop 0.20.2) and I am running the well known word count (gutenberg) example to test how fast my