Colin,

Thanks. 30~50K is smaller than I thought, though I understand that I shouldn't stress the NN with unnecessary traffic.
If I put my client (java/c) on a datanode and read only the local HDFS files, that is, the files that have a replica on that datanode, is there an API I can use to talk directly to the DN without stressing the NN?

Thanks

Demai

On Thu, Feb 12, 2015 at 2:05 PM, Colin McCabe <cmcc...@alumni.cmu.edu> wrote:

> The NN can do somewhere around 30,000 - 50,000 RPCs per second
> currently, depending on configuration. In general you do not want to
> have extremely high NN RPC traffic, because it will slow things down.
> You might consider re-architecting your application to do more DN
> traffic and less NN traffic, if possible. Hope that helps.
>
> best,
> Colin
>
> On Tue, Feb 10, 2015 at 4:29 PM, Demai Ni <nid...@gmail.com> wrote:
> > hi, folks,
> >
> > Is there a max limit of concurrent connection to a name node? or whether
> > there is a best practice?
> >
> > My scenario is simple. Client(java/c++) program will open a connection
> > through hdfs api call, and then open a few hdfs files, maybe read a bit
> > data, then close the connection. In some case, the number of clients may
> > be 50,000~100,000 concurrently. Is the number of connection acceptable?
> >
> > Thanks.
> >
> > Demai
>
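
For a rough sense of scale, the figures in this thread (30,000 - 50,000 NN RPCs per second, 50,000~100,000 concurrent clients) can be put into a back-of-envelope calculation. This is only a sketch: the number of NN RPCs each client issues (`rpcsPerClient`) is an assumed placeholder, not a measured value, and the class and method names are hypothetical. It also treats the quoted RPC rate as a flat budget and ignores all other traffic on the cluster.

```java
// Back-of-envelope estimate only; not a benchmark and not part of any Hadoop API.
final class NameNodeLoadEstimate {

    /**
     * Seconds of NameNode capacity needed to serve one burst of clients,
     * assuming the NN sustains a fixed number of RPCs per second.
     */
    static double secondsToServe(long clients, int rpcsPerClient, long nnRpcsPerSecond) {
        return (double) clients * rpcsPerClient / nnRpcsPerSecond;
    }

    public static void main(String[] args) {
        // ASSUMPTION: each client issues about 5 NN RPCs (opening a few files).
        int rpcsPerClient = 5;

        long[] clientCounts = {50_000L, 100_000L}; // client range from the thread
        long[] nnRates = {30_000L, 50_000L};       // NN RPC/s range from the thread

        for (long clients : clientCounts) {
            for (long rate : nnRates) {
                System.out.printf("%d clients at %d RPC/s: ~%.1f s of NN capacity%n",
                        clients, rate, secondsToServe(clients, rpcsPerClient, rate));
            }
        }
    }
}
```

Under the assumed 5 RPCs per client, 100,000 clients arriving together would take about 10 seconds of NN capacity at 50,000 RPC/s, which is why shifting work from the NN to the DNs matters at this client count.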