Thanks for sharing your experiences and glad you got the ball moved down field a little!
On Fri, Jun 23, 2017 at 9:29 PM, w yishigan <yishi...@gmail.com> wrote:

> I found the problem. AbstractNonblockingServer.maxReadBufferBytes was
> too small. Now tcp_mem is fine and the data is clogged in the JVM
> instead.
> Thanks.
>
> On Fri, Jun 23, 2017 at 9:46 AM, w yishigan <yishi...@gmail.com> wrote:
>
> > Hi Randy,
> > Thanks for your email. Yes, there are 4 GB in the kernel's buffer.
> > I'm using two 1Gig NICs, so the throughput is only ~200 MB. It's
> > Thrift 0.9.0. The puzzling part is that Scribe, which uses the C++
> > Thrift library, doesn't have this problem. The processor thread of
> > Scribe is busy, but there is only ~40MB of data in the kernel's TCP
> > buffer. Besides, I added some tracing points in TThreadedSelector
> > and found that it takes ~2ms and ~2000 handleRead invocations to
> > fully read a frame which is only ~1MB.
> >
> > On Fri, Jun 23, 2017 at 8:53 AM, Randy Abernethy <r...@apache.org> wrote:
> >
> >> Hello,
> >> If you truly mean simultaneously that's 4 GB of instantaneous data.
> >> If you are using 10Gig Ethernet it would take 4 seconds or more
> >> just to move the bytes in the perfect case. That sounds like a DDOS
> >> more than a use case. You might consider scaling your servers out.
> >> If you mean 4,000 simultaneous clients sending 1MB at random
> >> intervals it would be helpful to understand the actual bytes per
> >> second per client.
> >> -Randy
> >>
> >> On Thu, Jun 22, 2017 at 4:45 PM, w yishigan <yishi...@gmail.com> wrote:
> >>
> >> > Hi folks,
> >> > I have been working on Flume, which uses a Thrift server to
> >> > receive data, and recently encountered a problem. After removing
> >> > all the user logic and keeping only the Thrift server code, I
> >> > tried to use ~4000 Thrift clients to send data (~1MB per RPC)
> >> > simultaneously to one Thrift server. It seems the Thrift server
> >> > can't keep up, and there is a lot of data in the recv-q, which
> >> > consumes all the tcp_mem (/proc/sys/net/ipv4/tcp_mem) of the OS
> >> > (CentOS), thus taking the node's network down. I tried
> >> > THsHaServer, TThreadedSelectorServer and TThreadPoolServer, but
> >> > none worked well enough to solve the problem. I also tried
> >> > increasing the selector threads and other args. Any suggestions?
> >> > Thank you very much and have a good day.
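
For anyone reading this thread later, here is a rough sketch of why a small maxReadBufferBytes can leave data parked in the kernel's recv-q. As far as I understand the Java nonblocking servers, a frame is only read once its size fits within a shared read-buffer budget; otherwise the bytes stay in the socket buffer until earlier frames are released. The class below is a simplified model of that idea in plain Java, not Thrift's actual code. The names ReadBufferBudget, tryReserve, release and admitted, and the 16 MB and 8 GB limits, are all made up for illustration.

```java
import java.util.concurrent.atomic.AtomicLong;

// Simplified, hypothetical model of a shared read-buffer budget.
// It is NOT Thrift's implementation; it only illustrates the gating idea.
class ReadBufferBudget {
    private final long maxBytes;                       // plays the role of maxReadBufferBytes
    private final AtomicLong allocated = new AtomicLong();

    ReadBufferBudget(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    // Reserve room for one frame. Returns false when the frame would push
    // the total over the budget, in which case the caller leaves the bytes
    // in the kernel socket buffer and retries later.
    boolean tryReserve(long frameBytes) {
        while (true) {
            long current = allocated.get();
            if (current + frameBytes > maxBytes) {
                return false;
            }
            if (allocated.compareAndSet(current, current + frameBytes)) {
                return true;
            }
        }
    }

    // Give the space back once the frame has been handed off and processed.
    void release(long frameBytes) {
        allocated.addAndGet(-frameBytes);
    }

    // Count how many of `clients` equal-sized frames can be held in the JVM
    // at the same time under a given budget.
    static int admitted(long maxBytes, int clients, long frameBytes) {
        ReadBufferBudget budget = new ReadBufferBudget(maxBytes);
        int count = 0;
        for (int i = 0; i < clients; i++) {
            if (budget.tryReserve(frameBytes)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // Assumed small budget: only a handful of 1 MB frames fit at once.
        System.out.println(admitted(16 * mb, 4000, mb));
        // Assumed large budget: all 4000 frames fit, so the backlog moves
        // from the kernel into the JVM heap.
        System.out.println(admitted(8192 * mb, 4000, mb));
    }
}
```

With the small budget most of the ~4000 frames wait in the kernel; with the large one they are all pulled into the JVM, which matches the "data is clogged in the JVM" observation above. Raising the limit therefore trades tcp_mem pressure for heap pressure, so the heap size has to be planned accordingly.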