Re: Hadoop performance in PC cluster

2008-04-20 Thread Yingyuan Cheng
I implemented the Map/Reduce classes in Java, but delegated the map/reduce
functions to native C++ code. I think most of the time was spent decoding
Unicode strings to UTF-8 (the native charset) and on the extra JNI calls.
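
Roughly, the structure looks like the following sketch (the class and native
method names here are only illustrative, not my actual code):

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

// Java mapper that delegates the per-record work to native C++ via JNI.
public class NativeWordCountMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, IntWritable> {

    static { System.loadLibrary("wordcount"); }  // loads libwordcount.so

    // Implemented in C++; splits one input line into words.
    private native String[] tokenize(String line);

    private final IntWritable one = new IntWritable(1);
    private final Text word = new Text();

    public void map(LongWritable key, Text value,
                    OutputCollector<Text, IntWritable> output,
                    Reporter reporter) throws IOException {
        // The Text -> String -> native call path forces a charset conversion
        // on every record, which is where I suspect most of the overhead goes.
        for (String w : tokenize(value.toString())) {
            word.set(w);
            output.collect(word, one);
        }
    }
}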

--
yingyuan


Ted Dunning wrote:
> My quick impression is that this is a very slow network connection and not
> much memory.
>
>
>   



Re: Hadoop performance in PC cluster

2008-04-20 Thread Ted Dunning

My quick impression is that this is a very slow network connection and not
much memory.


On 4/20/08 8:42 PM, "Yingyuan Cheng" <[EMAIL PROTECTED]> wrote:

> 
> To be complete, I also ran the WordCount task implemented with JNI. The result is
> acceptable, though slower than pure Java: it finished in 32 mins, 30 secs.
> 
> --
> Yingyuan
> 
> 
> Yingyuan Cheng wrote:
>> Does anyone run Hadoop on a PC cluster?
>> 
>> I just tested WordCount on a PC cluster; my first impressions are as follows:
>> 
>> ***
>> 
>> Number of PCs: 7 (512 MB RAM, 2.8 GHz CPU, 100 Mbps NIC, CentOS 5.0, Hadoop
>> 0.16.1, Sun JRE 1.6)
>> Master(Namenode): 1
>> Master(Jobtracker): 1
>> Slaves(Datanode & Tasktracker): 5
>> 
>> ...
> 
>> 2. Map/Reduce with Java
>> --
>> 
>> Time elapsed: 19mins, 56sec
>> Bytes/time rate: 3,591,422 bytes/sec
>>   
> 



Re: Hadoop performance in PC cluster

2008-04-20 Thread Yingyuan Cheng

To be complete, I also ran the WordCount task implemented with JNI. The result is
acceptable, though slower than pure Java: it finished in 32 mins, 30 secs.

--
Yingyuan


Yingyuan Cheng wrote:
> Does anyone run Hadoop on a PC cluster?
>
> I just tested WordCount on a PC cluster; my first impressions are as follows:
>
> ***
>
> Number of PCs: 7 (512 MB RAM, 2.8 GHz CPU, 100 Mbps NIC, CentOS 5.0, Hadoop
> 0.16.1, Sun JRE 1.6)
> Master(Namenode): 1
> Master(Jobtracker): 1
> Slaves(Datanode & Tasktracker): 5
>
> ...

> 2. Map/Reduce with Java
> --
>
> Time elapsed: 19mins, 56sec
> Bytes/time rate: 3,591,422 bytes/sec
>   



Hadoop performance in PC cluster

2008-04-11 Thread Yingyuan Cheng

Does anyone run Hadoop on a PC cluster?

I just tested WordCount on a PC cluster; my first impressions are as follows:

***

Number of PCs: 7 (512 MB RAM, 2.8 GHz CPU, 100 Mbps NIC, CentOS 5.0, Hadoop
0.16.1, Sun JRE 1.6)
Master(Namenode): 1
Master(Jobtracker): 1
Slaves(Datanode & Tasktracker): 5

1. Writing to HDFS
--

File size: 4,295,341,065 bytes (4.1G)
Time elapsed putting file into HDFS: 7m57.757s
Average rate: 8,990,583 bytes/sec
Average bandwidth usage: 68.59%

I also tested libhdfs; it performs just as well as Java.
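
For anyone curious, a put through the Java API boils down to something like
the sketch below (paths are made up; this is roughly what bin/hadoop dfs -put
does, not my exact test script):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Copy a local file into HDFS through the Java FileSystem API.
public class PutFile {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();  // picks up hadoop-site.xml
        FileSystem fs = FileSystem.get(conf);      // the configured HDFS
        fs.copyFromLocalFile(new Path("/local/data/words.txt"),   // local source
                             new Path("/user/test/words.txt"));   // HDFS destination
        fs.close();
    }
}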


2. Map/Reduce with Java
--

Time elapsed: 19mins, 56sec
Bytes/time rate: 3,591,422 bytes/sec

Job Counters:
Launched map tasks 67
Launched reduce tasks 7
Data-local map tasks 64

Map-Reduce Framework:
Map input records 65,869,800
Map output records 697,923,360
Map input bytes 4,295,341,065
Map output bytes 6,504,944,565
Combine input records 697,923,360
Combine output records 2,330,048
Reduce input groups 5,201
Reduce input records 2,330,048
Reduce output records 5,201

It's acceptable. The main bottleneck was the CPU, which stayed at 100% usage.
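
For context, the stock WordCount reuses the reducer as the combiner, which is
what keeps the reduce input down to about 2.3M records instead of 698M. A
rough sketch against the 0.16-era mapred API (from memory, not pasted from my
job):

import java.io.IOException;
import java.util.Iterator;
import java.util.StringTokenizer;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

public class WordCount {

    // Emits (word, 1) for every token in the input line.
    public static class Map extends MapReduceBase
            implements Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable one = new IntWritable(1);
        private final Text word = new Text();

        public void map(LongWritable key, Text value,
                        OutputCollector<Text, IntWritable> output,
                        Reporter reporter) throws IOException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                output.collect(word, one);
            }
        }
    }

    // Sums the counts; used both as combiner and as reducer.
    public static class Reduce extends MapReduceBase
            implements Reducer<Text, IntWritable, Text, IntWritable> {
        public void reduce(Text key, Iterator<IntWritable> values,
                           OutputCollector<Text, IntWritable> output,
                           Reporter reporter) throws IOException {
            int sum = 0;
            while (values.hasNext()) {
                sum += values.next().get();
            }
            output.collect(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(WordCount.class);
        conf.setJobName("wordcount");
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);
        conf.setMapperClass(Map.class);
        conf.setCombinerClass(Reduce.class);   // reducer doubles as combiner
        conf.setReducerClass(Reduce.class);
        conf.setInputPath(new Path(args[0]));  // 0.16-style path setters
        conf.setOutputPath(new Path(args[1]));
        JobClient.runJob(conf);
    }
}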


3. Map/Reduce with C++ Pipes (no combiner)
--

Time elapsed: 1hrs, 2mins, 47sec
Bytes/time rate: 1,140,255 bytes/sec

Job Counters:
Launched map tasks 68
Launched reduce tasks 5
Data-local map tasks 64

Map-Reduce Framework:
Map input records 65,869,800
Map output records 697,452,105
Map input bytes 4,295,341,065
Map output bytes 5,107,053,975
Combine input records 0
Combine output records 0
Reduce input groups 5,191
Reduce input records 697,452,105
Reduce output records 5,191

My first impression is that the C++ Pipes interface is slower than Java. If I
add a C++ Pipes combiner, the result becomes even worse: the main bottleneck
is RAM, with a great deal of swap space used, processes blocked, and the CPU
stuck waiting...

Adding more RAM might improve performance, but it would still be slower than
Java, I think.


4. Map/Reduce with Python streaming (no combiner)
--

Time elapsed: 1hrs, 48mins, 53sec
Bytes/time rate: 657,483 bytes/sec

Job Counters:
Launched map tasks 68
Launched reduce tasks 5
Data-local map tasks 64

Map-Reduce Framework:
Map input records 65,869,800
Map output records 697,452,105
Map input bytes 4,295,341,065
Map output bytes 5,107,053,975
Combine input records 0
Combine output records 0
Reduce input groups 5,191
Reduce input records 697,452,105
Reduce output records 5,191

As you can see, the result is not as good as with the C++ Pipes interface.
Maybe Python is just slower; I didn't test other cases.

Are there any suggestions for improving this situation?



--
yingyuan