Re: max concurrent connection to HDFS name node

2015-02-17 Thread Colin P. McCabe
Hi Demai, Nearly all input and output stream operations will talk directly to the DN without involving the NN. The NameNode is involved in metadata operations such as renaming or opening files, not in reading data. Hope this helps. best, Colin On Thu, Feb 12, 2015 at 4:21 PM, Demai Ni wrote:

Re: max concurrent connection to HDFS name node

2015-02-12 Thread Demai Ni
Colin, Thanks. 30~50K is smaller than I thought, though I understand that I shouldn't add to the NN traffic unnecessarily. If I can put my client (Java/C) on a datanode and only read local HDFS files, that is, files that have a replica on that datanode, is there an API I can use to talk direc

Re: max concurrent connection to HDFS name node

2015-02-12 Thread Colin McCabe
The NN can do somewhere around 30,000 - 50,000 RPCs per second currently, depending on configuration. In general you do not want to have extremely high NN RPC traffic, because it will slow things down. You might consider re-architecting your application to do more DN traffic and less NN traffic, i
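
To put the 30,000 - 50,000 RPCs per second figure in perspective, here is a rough back-of-the-envelope sketch in Java. It assumes each client session (connect, open a few files, read a little, close) costs a fixed number of NameNode RPCs. The value of 5 RPCs per session and the names NnRpcBudget and sessionsPerSecond are hypothetical, chosen only for illustration, and the calculation ignores all other load on the NN.

```java
public class NnRpcBudget {

    // Upper bound on client sessions per second that fit within a given
    // NameNode RPC rate, if every session issues rpcsPerSession NN RPCs.
    static long sessionsPerSecond(long nnRpcsPerSecond, long rpcsPerSession) {
        if (rpcsPerSession <= 0) {
            throw new IllegalArgumentException("rpcsPerSession must be positive");
        }
        return nnRpcsPerSecond / rpcsPerSession;
    }

    public static void main(String[] args) {
        long rpcsPerSession = 5; // hypothetical: a few file opens plus other metadata calls

        // Low and high ends of the NN throughput range quoted in the message above.
        System.out.println(sessionsPerSecond(30_000, rpcsPerSession));
        System.out.println(sessionsPerSecond(50_000, rpcsPerSession));
    }
}
```

Under these assumptions the NN could absorb roughly 6,000 to 10,000 such sessions per second at best, so cutting the number of metadata calls per session raises that ceiling directly.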

max concurrent connection to HDFS name node

2015-02-10 Thread Demai Ni
Hi folks, is there a maximum number of concurrent connections to a name node, or is there a best practice? My scenario is simple. A client (Java/C++) program will open a connection through an HDFS API call, then open a few HDFS files, maybe read a bit of data, then close the connection. In some cas