Colin,

Thanks. 30~50K is smaller than I thought, though I understand that I
shouldn't put unnecessary traffic on the NN.
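
As a rough sanity check on my side, here is a back-of-the-envelope sketch
of how long one burst of clients would keep the NN busy at the rates you
quoted. The figure of 5 NN RPCs per client is purely my assumption (a few
file opens plus some bookkeeping calls), and drain_seconds is a made-up
helper name, not anything from the hdfs api.

```python
def drain_seconds(clients, rpcs_per_client, nn_rpcs_per_sec):
    """Seconds the NN needs to serve one burst, assuming it runs at full rate."""
    return clients * rpcs_per_client / nn_rpcs_per_sec


# Assumption: each client issues about 5 NN RPCs per session.
RPCS_PER_CLIENT = 5

# Client counts from my scenario, NN rates from Colin's estimate.
for clients in (50_000, 100_000):
    for rate in (30_000, 50_000):
        secs = drain_seconds(clients, RPCS_PER_CLIENT, rate)
        print(f"{clients} clients at {rate} RPC/s: about {secs:.1f} s")
```

This ignores everything else hitting the NN, so it is a lower bound at
best, but it suggests a full burst would occupy the NN for several seconds.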

If I put my client (java/c) on a datanode and only read local hdfs
files, that is, files that have a replica on that datanode, is there an
API I can use to talk directly to the DN without stressing the NN?  Thanks

Demai

On Thu, Feb 12, 2015 at 2:05 PM, Colin McCabe <cmcc...@alumni.cmu.edu>
wrote:

> The NN can do somewhere around 30,000 - 50,000 RPCs per second
> currently, depending on configuration.  In general you do not want to
> have extremely high NN RPC traffic, because it will slow things down.
> You might consider re-architecting your application to do more DN
> traffic and less NN traffic, if possible.  Hope that helps.
>
> best,
> Colin
>
> On Tue, Feb 10, 2015 at 4:29 PM, Demai Ni <nid...@gmail.com> wrote:
> > hi, folks,
> >
> > Is there a max limit on concurrent connections to a name node? Or is
> > there a best practice?
> >
> > My scenario is simple. A client (java/c++) program will open a connection
> > through an hdfs api call, then open a few hdfs files, maybe read a bit
> > of data, then close the connection. In some cases, the number of clients
> > may be 50,000~100,000 concurrently. Is that number of connections acceptable?
> >
> > Thanks.
> >
> > Demai
>
