Another thought: brokers replicate incoming data. So a record weighing 10 bytes will be written out once for replication and once more to a consumer, making it 20 bytes out. Makes sense?
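In back-of-the-envelope terms (a minimal sketch, assuming the broker's BytesOut meter counts both follower replication fetches and consumer fetches, as in 0.9-era metrics; the function name is just illustrative):

    # Rough model: BytesOut = follower replication fetches + consumer fetches,
    # while BytesIn counts only what producers write.
    def expected_bytes_out(bytes_in, replication_factor=2, consumers=1):
        # each record is fetched (replication_factor - 1) times by followers
        # and once per consumer reading the topic
        return bytes_in * ((replication_factor - 1) + consumers)

    print(expected_bytes_out(10))  # -> 20: 10 bytes to the follower, 10 to the consumer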
On Thu, 14 Apr 2016 at 02:46 Jorge Rodriguez <jo...@bloomreach.com> wrote:

> Thanks for your response Asaf. I have 4 brokers. These measurements
> are from the Kafka brokers.
>
> The measurement on this graph comes from Kafka. It is a sum across all
> 4 brokers of the metric
> kafka.server.BrokerTopicMetrics.BytesInPerSec.1MinuteRate.
>
> But I also have a system metric which I feed independently using the
> collectd "interface" plugin, and the bytes out and in match the ones
> reported by Kafka fairly well. There is also a corresponding increase
> in network packets sent.
>
> Also, on the Spark Streaming side, I can see that during these spikes
> the number of received packets and bytes also spikes.
>
> So during the spikes, I believe that some of the fetch requests are
> perhaps failing and we hit a retry. I am debugging that currently, and
> I think it's related to the stop-the-world GC which happens on Spark
> Streaming occasionally. Some GC tuning should alleviate this.
>
> However, even if this is the case, it would not explain why, under
> normal operations, the number of bytes out is 2x the number of bytes
> in. Since I only have 1 consumer for each topic, I would expect the
> numbers to be fairly close. Do you
>
> On Tue, Apr 12, 2016 at 8:31 PM, Asaf Mesika <asaf.mes...@gmail.com>
> wrote:
>
> > Where exactly do you get the measurement from? Your broker? Do you
> > have only one? Your producer? Your Spark job?
> > On Mon, 11 Apr 2016 at 23:54 Jorge Rodriguez <jo...@bloomreach.com>
> > wrote:
> >
> > > We are running a Kafka cluster for our real-time pixel processing
> > > pipeline. The data is produced from our pixel servers into Kafka,
> > > and then consumed by a Spark Streaming application. Based on this,
> > > I would expect that bytes in vs. bytes out should be roughly
> > > equal, as each message should be consumed once.
> > >
> > > Under normal operations, the bytes out is a little less than 2x
> > > the bytes in. Does anyone know why this is? We do use a
> > > replication factor of 2.
> > >
> > > Occasionally, we get a spike in bytes out, but bytes in remain the
> > > same (see image below). This correlates with a significant delay
> > > in processing time on the Spark Streaming side.
> > >
> > > Below is a chart of Kafka-reported bytes out vs. in. The
> > > system-level network metrics show the same information
> > > (transferred bytes spike).
> > >
> > > Could anyone provide some tips for debugging/getting to the bottom
> > > of this issue?
> > >
> > > Thanks,
> > > Jorge
> > >
> > > *Kafka reported Bytes in per topic and for all topics vs. Kafka
> > > bytes out:*
> > >
> > > [image: Inline image 1]
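If you want to cross-check the broker-side meters directly, the JmxTool class that ships with Kafka can poll them. A sketch (the object name matches the 0.9-era BrokerTopicMetrics naming; the JMX URL, host, and port are assumptions to adjust for your brokers):

    # assumes JMX is enabled on the broker; adjust host/port for your setup
    bin/kafka-run-class.sh kafka.tools.JmxTool \
      --object-name 'kafka.server:type=BrokerTopicMetrics,name=BytesOutPerSec' \
      --attributes OneMinuteRate \
      --jmx-url service:jmx:rmi:///jndi/rmi://broker-host:9999/jmxrmi

Comparing that per broker against BytesInPerSec should show the ~2x ratio (one replica fetch plus one consumer fetch) under normal operation, and should make retry-driven spikes stand out.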