Re: Three Questions

2011-05-03 Thread Geoffry Roberts
David and All, Thanks again for helping. I found my problem and I'll post it here. I use Eclipse as my IDE. When I set up my reduce class, I of course extended Reducer. Then I used the Eclipse Source / Override/Implement Methods... This brings up a dialog that lists the methods that are avail

Re: Three Questions

2011-05-03 Thread David Rosenstrauch
On 05/03/2011 05:49 PM, Geoffry Roberts wrote: David, Thanks for the response. Last thing first: I am using org.apache.hadoop.mapreduce.lib.output.MultipleOutputs, which differs from what your link points to, org.apache.hadoop.mapred.lib.MultipleOutputs. Using the class you propose requires

Re: Three Questions

2011-05-03 Thread Geoffry Roberts
David, Thanks for the response. Last thing first: I am using org.apache.hadoop.mapreduce.lib.output.MultipleOutputs, which differs from what your link points to, org.apache.hadoop.mapred.lib.MultipleOutputs. Using the class you propose requires me to use a number of other classes from the sam

Re: Three Questions

2011-05-03 Thread David Rosenstrauch
On 05/03/2011 01:21 PM, Geoffry Roberts wrote: All, I have three questions I would appreciate if anyone could weigh in on. I apologise in advance if I sound whiny. 1. The namenode logs, when I view them from a browser, are displayed with the lines wrapped upon each other as if there were no n

NPE during RunningJob.getCounters()

2011-05-03 Thread Aaron Baff
Cross post from common-users. I'm using v0.21.0, with the Old API, and I have a daemon that runs and monitors MR Jobs, allows us to fetch data from the JobTracker about the MR Jobs, etc. We're using Thrift as the API (so we can do PHP->Java). We're having an issue where some requests for MR Jo

Re: Configuration for small Cluster

2011-05-03 Thread hadoopman
I would dispute the assertion that Linux isn't secure. I'm an MCSE and AIX Unix certified. I can set up Windows servers that are very secure (and insecure). The same thing goes for Unix and Linux servers. Depends whose hands are on the keyboard imo :D If it was me, I would replace the Celerons

Three Questions

2011-05-03 Thread Geoffry Roberts
All, I have three questions I would appreciate if anyone could weigh in on. I apologise in advance if I sound whiny. 1. The namenode logs, when I view them from a browser, are displayed with the lines wrapped upon each other as if there were no new line characters ('\n') in the output. I acces

Re: Is there any way I could keep both the Mapper and Reducer output in hdfs?

2011-05-03 Thread Jason
It is actually trivial to do using MultipleOutputs. You just need to emit your key-values to both MO and the standard output context/collector in your mapper. Two things you should know about MO: 1. Early implementations have a serious (a couple of orders of magnitude) performance bug 2. Output files not

Re: Including external libraries in my job.

2011-05-03 Thread Harsh J
Niels, Am moving this to hbase-user, since it's more relevant to HBase here than MR's typical job submissions. My reply below: On Tue, May 3, 2011 at 7:12 PM, Niels Basjes wrote: > Hi, > > I've written my first very simple job that does something with hbase. > > Now when I try to submit my jar i

Including external libraries in my job.

2011-05-03 Thread Niels Basjes
Hi, I've written my first very simple job that does something with hbase. Now when I try to submit my jar in my cluster I get this: [nbasjes@master ~/src/catalogloader/run]$ hadoop jar catalogloader-1.0-SNAPSHOT.jar nl.basjes.catalogloader.Loader /user/nbasjes/Minicatalog.xml Exception in thread

TestDFSIO Benchmark

2011-05-03 Thread baran cakici
Hi, I want to know the I/O performance of my Hadoop cluster. Because of that I ran test.jar; here are my results: - TestDFSIO - : write Date & time: Mon May 02 14:38:29 CEST 2011 Number of files: 10 Total MBytes processed: 1 Throughput mb/sec: 12.809033955468113 Average IO rate mb/sec: 13.

Re: Configuration for small Cluster

2011-05-03 Thread baran cakici
Hi, I am building this system at work. For security reasons I can't use Linux at the company. They prefer to use Windows. thanks, Baran 2011/5/3 hadoopman > I'm curious if there is a compelling reason for running it under cygwin > instead of linux. > > I'm also concerned with the celeron and 2 gi

Re: Is there any way I could keep both the Mapper and Reducer output in hdfs?

2011-05-03 Thread Stanley Xu
But that would make us read the same data twice, which would waste I/O for large data. Thanks. Best wishes, Stanley Xu On Tue, May 3, 2011 at 4:09 PM, Bai, Gang wrote: > IMHO

Re: Is there any way I could keep both the Mapper and Reducer output in hdfs?

2011-05-03 Thread Bai, Gang
IMHO, it will be better if you separate your mapper and reducer into different jobs. Regards, BaiGang On Tue, May 3, 2011 at 2:09 PM, Stanley Xu wrote: > Dear all, > > We have a task to run a map-reduce job multiple times to do some machine > learning calculation. We will first use a mapper to