----- Original Message -----
From: donal0412 <donal0...@gmail.com>
Date: Tuesday, November 8, 2011 1:04 pm
Subject: dfs.write.packet.size set to 2G
To: hdfs-user@hadoop.apache.org

> Hi,
> I want to store lots of files in HDFS; each file is <= 2G.
> I don't want the files to be split into blocks, because I need the
> whole file while processing it, and I don't want to transfer blocks
> to one node when processing it.
> An easy way to do this would be to set dfs.write.packet.size to 2G.
> I wonder if someone has similar experience or knows whether this is
> practicable. Will there be performance problems when setting the
> packet size to such a big number?
Looks like you are looking at the wrong configuration for your case. If you don't
want to split the file, you need to increase dfs.blocksize.
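For example, you can raise the default block size in hdfs-site.xml (the property is
dfs.block.size in older releases, dfs.blocksize in newer ones) or set it in the client
Configuration before writing. Just a rough, untested sketch; the path is only an
example and I haven't tried a 2 GB block myself:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class BigBlockWrite {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // 2 GB block size, so a file of <= 2G stays in a single block
            conf.setLong("dfs.block.size", 2L * 1024 * 1024 * 1024);
            FileSystem fs = FileSystem.get(conf);
            FSDataOutputStream out = fs.create(new Path("/data/whole-file.bin"));
            // ... write the file contents ...
            out.close();
        }
    }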
In DFS, data transfer happens packet by packet, and dfs.write.packet.size is the size
of one such packet. The block is split into packets on the client side and kept in a
data queue; the DataStreamer thread picks up the packets and transfers them to the
DataNode until the block size is reached. Once it hits the block boundary, it closes
the block streams.
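If you only want the bigger blocks for these particular files rather than
cluster-wide, you can also pass the block size per file through the create()
overload. Again just a sketch, reusing the conf and fs objects from the snippet
above:

    // Per-file override: the blockSize argument decides where the block
    // boundary (and so the end of the packet stream for that block) is.
    long blockSize = 2L * 1024 * 1024 * 1024;   // 2 GB
    int bufferSize = conf.getInt("io.file.buffer.size", 4096);
    short replication = fs.getDefaultReplication();
    FSDataOutputStream out = fs.create(new Path("/data/whole-file.bin"),
                                       true,        // overwrite
                                       bufferSize,
                                       replication,
                                       blockSize);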

BTW, how are you going to process the data here? Are you not going to use MapReduce
for processing your data?
> 
> Thanks!
> donal
> 
Regards,
Uma
