----- Original Message ----- From: donal0412 <donal0...@gmail.com> Date: Tuesday, November 8, 2011 1:04 pm Subject: dfs.write.packet.size set to 2G To: hdfs-user@hadoop.apache.org
> Hi,
> I want to store lots of files in HDFS; each file is <= 2 GB.
> I don't want a file to be split into blocks, because I need the whole file
> while processing it, and I don't want to transfer blocks to one node
> when processing it.
> An easy way to do this would be to set dfs.write.packet.size to 2 GB. I wonder
> if someone has similar experience, or knows whether this is practicable.
> Will there be performance problems when the packet size is set to a big number?

It looks like you are looking at the wrong configuration for your case. If you don't want the file to be split, you need to increase dfs.blocksize.

In HDFS, data transfer happens packet by packet, and dfs.write.packet.size specifies the size of each such packet. On the client side, a block is split into packets and kept in a data queue. The DataStreamer thread picks packets from that queue and transfers them to the DataNode until the block size is reached. Once it reaches the block boundary, it closes the block streams.

BTW, how are you going to process the data? Are you not going to use MapReduce to process your data?

> Thanks!
> donal

Regards,
Uma
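
For reference, a minimal hdfs-site.xml sketch of the dfs.blocksize approach suggested above. The 2 GB value (2147483648 bytes) is just an illustration matching the file sizes in the question; note that older Hadoop releases name this property dfs.block.size instead of dfs.blocksize:

```xml
<!-- hdfs-site.xml: raise the default block size so that files up to 2 GB
     fit in a single block. The value is in bytes; 2147483648 = 2 GB. -->
<property>
  <name>dfs.blocksize</name>
  <value>2147483648</value>
</property>
```

The setting can also be overridden per file at write time via the generic -D option, e.g. `hadoop fs -D dfs.blocksize=2147483648 -put bigfile /data/`, so only the large files need the large block size.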