Hi Randy,

Thanks for your email. Yes, there is about 4 GB in the kernel's buffers. I'm using two 1 Gbit NICs, so the throughput is only ~200 MB/s. The Thrift version is 0.9.0.

What puzzles me is that Scribe, which uses the C++ Thrift library, doesn't have this problem. Scribe's processor thread is busy, but only ~40 MB of data sits in the kernel's TCP buffer.

Besides, I added some tracing points in TThreadedSelectorServer and found that it takes ~2 ms and ~2000 handleRead invocations to fully read a frame that is only ~1 MB.
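
To put the figures above in perspective, here is a rough back-of-the-envelope sketch in Java. It only does arithmetic on the numbers quoted in this mail; the class and method names are made up for illustration, and it assumes a frame of exactly 1 MiB and decimal units for the 4 GB and 200 MB/s figures.

```java
public class FrameReadMath {
    // Average number of payload bytes consumed per handleRead call for one frame.
    static long avgBytesPerRead(long frameBytes, long readCalls) {
        return frameBytes / readCalls;
    }

    // Seconds needed to move a backlog at a given link rate (bytes per second).
    static double drainSeconds(long backlogBytes, long bytesPerSecond) {
        return (double) backlogBytes / bytesPerSecond;
    }

    public static void main(String[] args) {
        long frameBytes = 1L << 20;          // ~1 MB frame (assumed exactly 1 MiB)
        long readCalls = 2000;               // ~2000 handleRead invocations per frame
        long backlogBytes = 4_000_000_000L;  // ~4 GB queued in the kernel (decimal assumed)
        long linkRate = 200_000_000L;        // ~200 MB/s over two 1 Gbit NICs

        System.out.println(avgBytesPerRead(frameBytes, readCalls) + " bytes per handleRead");
        System.out.println(drainSeconds(backlogBytes, linkRate) + " s to move the backlog");
    }
}
```

Under those assumptions each handleRead call picks up only about 524 bytes on average, and moving 4 GB at ~200 MB/s takes about 20 seconds even if the server reads as fast as the link allows.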

On Fri, Jun 23, 2017 at 8:53 AM, Randy Abernethy <r...@apache.org> wrote:

> Hello,
>
> If you truly mean simultaneously, that's 4 GB of instantaneous data. If you
> are using 10Gig Ethernet it would take 4 seconds or more just to move the
> bytes in the perfect case. That sounds like a DDOS more than a use case.
> You might consider scaling your servers out. If you mean 4,000 simultaneous
> clients sending 1MB at random intervals, it would be helpful to understand
> the actual bytes per second per client.
>
> -Randy
>
> On Thu, Jun 22, 2017 at 4:45 PM, w yishigan <yishi...@gmail.com> wrote:
>
> > Hi folks,
> >
> > I have recently been working on Flume, which uses a Thrift server to
> > receive data, and encountered a problem. After removing all the user
> > logic and keeping only the Thrift server code, I tried using ~4000 Thrift
> > clients to send data (~1MB per RPC) simultaneously to one Thrift server.
> > The Thrift server can't keep up, and a lot of data piles up in the
> > Recv-Q, which uses up all the tcp_mem (/proc/sys/net/ipv4/tcp_mem) of the
> > OS (CentOS) and takes the node's network down. I have tried THsHaServer,
> > TThreadedSelectorServer and TThreadPoolServer, but none worked well
> > enough to solve the problem. I have also tried increasing the number of
> > selector threads and other args. Any suggestions? Thank you very much and
> > have a good day.