Re: InputFormat for fixed-width records?

2009-06-01 Thread Yabo-Arber Xu
I have a follow-up question on this thread: How do we make sure that at the getFileSplit phase, there are no records that cross the boundary between file splits? To explain my point better: for example, if each of my records is 100 bytes, would there be a case where some record
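To make the boundary question concrete, here is a minimal sketch of one way to keep fixed-width records from straddling splits: round the split size down to a whole number of records, so every split offset is a multiple of the record length. This is plain Python arithmetic, not Hadoop's InputFormat API. The function name `aligned_splits` and its parameters are hypothetical, and it assumes the file size is an exact multiple of the record length.

```python
def aligned_splits(file_size, record_len, target_split_size):
    """Return (offset, length) pairs whose boundaries fall on record boundaries.

    Hypothetical helper for illustration only; it does not call any Hadoop API.
    Assumes every record is exactly record_len bytes and the file holds only
    whole records.
    """
    if record_len <= 0:
        raise ValueError("record_len must be positive")
    if file_size % record_len != 0:
        raise ValueError("file_size is not a whole number of records")

    # Round the target split size down to a whole number of records,
    # but never below one record.
    records_per_split = max(1, target_split_size // record_len)
    split_size = records_per_split * record_len

    splits = []
    offset = 0
    while offset < file_size:
        # The last split may be shorter, but it is still a whole number
        # of records because file_size is a multiple of record_len.
        length = min(split_size, file_size - offset)
        splits.append((offset, length))
        offset += length
    return splits


if __name__ == "__main__":
    # 100-byte records, 1000-byte file, 250-byte target split size.
    for offset, length in aligned_splits(1000, 100, 250):
        print(offset, length)
```

With 100-byte records and a 250-byte target, each split becomes 200 bytes, so no record is cut in half. The alternative design is to leave the split boundaries alone and have each reader skip forward to the next record boundary, reading past the end of its split to finish the last record.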

Re: InputFormat for fixed-width records?

2009-06-01 Thread Yabo-Arber Xu
PM, Yabo-Arber Xu arber.resea...@gmail.com wrote: I have a follow-up question on this thread: How do we make sure that at the getFileSplit phase, there are no records that cross the boundary between file splits? To explain my point better: for example, if each of my records is 100

Re: A brief report of Second Hadoop in China Salon

2009-05-16 Thread Yabo-Arber Xu
Congratulations! Wish I had been there. :-) Best, Arber On Sat, May 16, 2009 at 9:09 AM, He Yongqiang heyongqi...@software.ict.ac.cn wrote: Hi, all On May 9, we held the second Hadoop In China salon. About 150 people attended, 46% of them engineers/managers from industry, and

Re: How to access data node without a passphrase?

2009-04-22 Thread Yabo-Arber Xu
. So anyway, give Todd a shout if you want to try DEBs out. Otherwise, if you're interested in going down the Redhat derivative route (Fedora, RHEL, CentOS), you can use the RPMs. Alex On Tue, Apr 21, 2009 at 10:04 PM, Yabo-Arber Xu arber.resea...@gmail.com wrote: Thanks for all your

How to access data node without a passphrase?

2009-04-21 Thread Yabo-Arber Xu
Hi there, I set up a small cluster for testing. When I start the cluster from my master node, I have to type the password for each datanode and tasktracker as it starts. That's pretty annoying and may be hard to handle when the cluster grows. Is there a graceful way to handle this? Best, Arber

Re: How to access data node without a passphrase?

2009-04-21 Thread Yabo-Arber Xu
Thanks for all your help, especially Asteem's detailed instructions. It works now! Alex: I did not use RPMs, but several of my existing nodes are installed with Ubuntu. Is there any difference in running Hadoop on Ubuntu? I am thinking of choosing one before I start scaling up the cluster, but not

Re: Typical hardware configurations

2009-03-28 Thread Yabo-Arber Xu
Hi Amandeep, I just did the same investigation not long ago, and I was advised to get Amazon EC2 X-Large equivalent (http://aws.amazon.com/ec2/#pricing) nodes: 8 EC2 Compute Units (4 virtual cores with 2 EC2