Re: kafka latency for large message

2019-03-19 Thread Nan Xu
That's very good information from the slides, thanks. Our design uses Kafka
for two purposes: one is as a cache, which we cover with a KTable; the
second is as the message delivery mechanism to other systems. Because we
care very much about latency, a KTable over a compacted topic suits us
well, and finding another system to do the caching would be a big change.
The approach described in the slides, breaking the message into smaller
chunks and then reassembling them, looks like a viable solution (rough
sketch below).

Do you know why Kafka's latency is not linear in message size? For a 2 MB
message I see an average latency under 10 ms, so for 30 MB, roughly 15 times
the size, I was expecting something like 10 * 15 = 150 ms, or 200 ms at most.
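
For the chunking approach, roughly what I have in mind is shown below. This
is only a sketch: the helper class and the header names ("chunk-id",
"chunk-index", "chunk-count") are made up by me, not taken from the slide
deck, and there is no error handling.

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.UUID;

public class ChunkingSender {
    // ~1 MB per chunk keeps each record comfortably under the usual size limits.
    static final int CHUNK_SIZE = 1_000_000;

    // Split one large payload into chunks and send them all with the same key,
    // so they land on the same partition and a consumer can reassemble them in order.
    public static void send(KafkaProducer<String, byte[]> producer,
                            String topic, String key, byte[] payload) {
        String messageId = UUID.randomUUID().toString();
        int chunkCount = (payload.length + CHUNK_SIZE - 1) / CHUNK_SIZE;
        for (int i = 0; i < chunkCount; i++) {
            int from = i * CHUNK_SIZE;
            int to = Math.min(from + CHUNK_SIZE, payload.length);
            ProducerRecord<String, byte[]> record =
                new ProducerRecord<>(topic, key, Arrays.copyOfRange(payload, from, to));
            record.headers().add("chunk-id", messageId.getBytes(StandardCharsets.UTF_8));
            record.headers().add("chunk-index", String.valueOf(i).getBytes(StandardCharsets.UTF_8));
            record.headers().add("chunk-count", String.valueOf(chunkCount).getBytes(StandardCharsets.UTF_8));
            producer.send(record);
        }
    }
}

The consumer would buffer chunks by chunk-id and hand the reassembled value
downstream only once all chunk-count pieces have arrived; how well that plays
with a compacted topic / KTable is a separate question.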

On Mon, Mar 18, 2019 at 3:29 PM Bruce Markey  wrote:

> Hi Nan,
>
> Would you consider other approaches that may actually be a more efficient
> solution for you? There is a slide deck Handle Large Messages In Apache
> Kafka
> <https://www.slideshare.net/JiangjieQin/handle-large-messages-in-apache-kafka-58692297>.
> For messages this large, one of the approaches suggested is Reference Based
> Messaging where you write your large files to an external data store then
> produce a small Apache Kafka message with a reference for where to find the
> file. This would allow your consumer applications to find the file as
> needed rather than storing all that data in the event log.
>
> --  bjm
>
> On Thu, Mar 14, 2019 at 1:53 PM Xu, Nan  wrote:
>
> > Hi,
> >
> > We are using Kafka to send messages, and less than 1% of them are very
> > big, close to 30 MB. We understand Kafka is not ideal for sending big
> > messages, but because the large-message rate is very low we want to let
> > Kafka handle them anyway, while still getting a reasonable latency.
> >
> > To test, I set up a test topic on a single-broker local Kafka, with only
> > 1 partition and 1 replica, and ran the following command:
> >
> > ./kafka-producer-perf-test.sh --topic test --num-records 200
> > --throughput 1 --record-size 30000000 --producer.config
> > ../config/producer.properties
> >
> > Producer.config:
> >
> > # max 40 MB message
> > max.request.size=40000000
> > buffer.memory=40000000
> >
> > # 2 MB socket send buffer
> > send.buffer.bytes=2000000
> >
> > 6 records sent, 1.1 records/sec (31.00 MB/sec), 973.0 ms avg latency,
> > 1386.0 max latency.
> > 6 records sent, 1.0 records/sec (28.91 MB/sec), 787.2 ms avg latency,
> > 1313.0 max latency.
> > 5 records sent, 1.0 records/sec (27.92 MB/sec), 582.8 ms avg latency,
> > 643.0 max latency.
> > 6 records sent, 1.1 records/sec (30.16 MB/sec), 685.3 ms avg latency,
> > 1171.0 max latency.
> > 5 records sent, 1.0 records/sec (27.92 MB/sec), 629.4 ms avg latency,
> > 729.0 max latency.
> > 5 records sent, 1.0 records/sec (27.61 MB/sec), 635.6 ms avg latency,
> > 673.0 max latency.
> > 6 records sent, 1.1 records/sec (30.09 MB/sec), 736.2 ms avg latency,
> > 1255.0 max latency.
> > 5 records sent, 1.0 records/sec (27.62 MB/sec), 626.8 ms avg latency,
> > 685.0 max latency.
> > 5 records sent, 1.0 records/sec (28.38 MB/sec), 608.8 ms avg latency,
> > 685.0 max latency.
> >
> >
> > On the broker, I changed
> >
> > socket.send.buffer.bytes=2024000
> > # The receive buffer (SO_RCVBUF) used by the socket server
> > socket.receive.buffer.bytes=2224000
> >
> > and left everything else at the defaults.
> >
> > I am a little surprised to see about 1 s max latency and about 0.5 s
> > average. My understanding is that Kafka memory-maps the log file and lets
> > the OS flush it, and all writes are sequential, so the flush should not be
> > affected much by message size. Batching and the network will take longer,
> > but those are in memory and on the local machine, and my SSD should be far
> > better than 0.5 seconds. Where is the time being spent? Any suggestions?
> >
> > Thanks,
> > Nan
>


Re: kafka latency for large message

2019-03-18 Thread Bruce Markey
Hi Nan,

Would you consider other approaches that may actually be a more efficient
solution for you? There is a slide deck Handle Large Messages In Apache
Kafka
<https://www.slideshare.net/JiangjieQin/handle-large-messages-in-apache-kafka-58692297>.
For messages this large, one of the approaches suggested is Reference Based
Messaging where you write your large files to an external data store then
produce a small Apache Kafka message with a reference for where to find the
file. This would allow your consumer applications to find the file as
needed rather than storing all that data in the event log.
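
For what it's worth, a minimal sketch of that pattern is below. The
ObjectStore interface is only a placeholder for whatever external store you
already have (S3, NFS, a database, ...), not a real client API.

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import java.util.UUID;

public class ReferenceBasedSender {

    // Placeholder for your external store; not a real library interface.
    public interface ObjectStore {
        void put(String objectKey, byte[] payload);
        byte[] get(String objectKey);
    }

    public static void send(KafkaProducer<String, String> producer,
                            ObjectStore store, String topic,
                            String key, byte[] largePayload) {
        // 1. Write the ~30 MB payload to the external store.
        String objectKey = "payloads/" + UUID.randomUUID();
        store.put(objectKey, largePayload);

        // 2. Publish only a tiny Kafka record that carries the pointer.
        producer.send(new ProducerRecord<>(topic, key, objectKey));
    }
}

On the consumer side you would read objectKey from the record value and call
store.get(objectKey) only when the payload is actually needed, so the 30 MB
never passes through the brokers at all.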

--  bjm

On Thu, Mar 14, 2019 at 1:53 PM Xu, Nan  wrote:

> Hi,
>
> We are using Kafka to send messages, and less than 1% of them are very
> big, close to 30 MB. We understand Kafka is not ideal for sending big
> messages, but because the large-message rate is very low we want to let
> Kafka handle them anyway, while still getting a reasonable latency.
>
> To test, I set up a test topic on a single-broker local Kafka, with only
> 1 partition and 1 replica, and ran the following command:
>
> ./kafka-producer-perf-test.sh --topic test --num-records 200
> --throughput 1 --record-size 30000000 --producer.config
> ../config/producer.properties
>
> Producer.config:
>
> # max 40 MB message
> max.request.size=40000000
> buffer.memory=40000000
>
> # 2 MB socket send buffer
> send.buffer.bytes=2000000
>
> 6 records sent, 1.1 records/sec (31.00 MB/sec), 973.0 ms avg latency,
> 1386.0 max latency.
> 6 records sent, 1.0 records/sec (28.91 MB/sec), 787.2 ms avg latency,
> 1313.0 max latency.
> 5 records sent, 1.0 records/sec (27.92 MB/sec), 582.8 ms avg latency,
> 643.0 max latency.
> 6 records sent, 1.1 records/sec (30.16 MB/sec), 685.3 ms avg latency,
> 1171.0 max latency.
> 5 records sent, 1.0 records/sec (27.92 MB/sec), 629.4 ms avg latency,
> 729.0 max latency.
> 5 records sent, 1.0 records/sec (27.61 MB/sec), 635.6 ms avg latency,
> 673.0 max latency.
> 6 records sent, 1.1 records/sec (30.09 MB/sec), 736.2 ms avg latency,
> 1255.0 max latency.
> 5 records sent, 1.0 records/sec (27.62 MB/sec), 626.8 ms avg latency,
> 685.0 max latency.
> 5 records sent, 1.0 records/sec (28.38 MB/sec), 608.8 ms avg latency,
> 685.0 max latency.
>
>
> On the broker, I changed
>
> socket.send.buffer.bytes=2024000
> # The receive buffer (SO_RCVBUF) used by the socket server
> socket.receive.buffer.bytes=2224000
>
> and left everything else at the defaults.
>
> I am a little surprised to see about 1 s max latency and about 0.5 s
> average. My understanding is that Kafka memory-maps the log file and lets
> the OS flush it, and all writes are sequential, so the flush should not be
> affected much by message size. Batching and the network will take longer,
> but those are in memory and on the local machine, and my SSD should be far
> better than 0.5 seconds. Where is the time being spent? Any suggestions?
>
> Thanks,
> Nan
>


Re: kafka latency for large message

2019-03-18 Thread Mike Trienis
It takes time to send that much data over the network. Why would you expect
a smaller latency?
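
A rough back-of-envelope, purely for illustration (the rate below is inferred
from the test output quoted further down, nothing measured separately):

    ~0.6-0.9 s average latency for a 30 MB record
      => effective end-to-end produce rate of roughly 30-50 MB/s for the whole
         path (serialize, copy into the producer buffer, loopback socket in
         socket-buffer-sized pieces, broker read, log append, ack)
    2 MB record at the same rate
      => ~40-60 ms, and in practice small records tend to do even better,
         since they fit in a single socket buffer and in the producer's pooled
         batch.size buffers, while a 30 MB record is allocated and copied
         outside those pools

So the latency is unlikely to extrapolate linearly from the small-message
numbers.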

On Mon, Mar 18, 2019 at 8:05 AM Nan Xu  wrote:

> Can anyone give a suggestion, or an explanation of why Kafka shows such
> high latency for large payloads?
>
> Thanks,
> Nan
>
> On Thu, Mar 14, 2019 at 3:53 PM Xu, Nan  wrote:
>
> > Hi,
> >
> > We are using Kafka to send messages, and less than 1% of them are very
> > big, close to 30 MB. We understand Kafka is not ideal for sending big
> > messages, but because the large-message rate is very low we want to let
> > Kafka handle them anyway, while still getting a reasonable latency.
> >
> > To test, I set up a test topic on a single-broker local Kafka, with only
> > 1 partition and 1 replica, and ran the following command:
> >
> > ./kafka-producer-perf-test.sh --topic test --num-records 200
> > --throughput 1 --record-size 30000000 --producer.config
> > ../config/producer.properties
> >
> > Producer.config:
> >
> > # max 40 MB message
> > max.request.size=40000000
> > buffer.memory=40000000
> >
> > # 2 MB socket send buffer
> > send.buffer.bytes=2000000
> >
> > 6 records sent, 1.1 records/sec (31.00 MB/sec), 973.0 ms avg latency,
> > 1386.0 max latency.
> > 6 records sent, 1.0 records/sec (28.91 MB/sec), 787.2 ms avg latency,
> > 1313.0 max latency.
> > 5 records sent, 1.0 records/sec (27.92 MB/sec), 582.8 ms avg latency,
> > 643.0 max latency.
> > 6 records sent, 1.1 records/sec (30.16 MB/sec), 685.3 ms avg latency,
> > 1171.0 max latency.
> > 5 records sent, 1.0 records/sec (27.92 MB/sec), 629.4 ms avg latency,
> > 729.0 max latency.
> > 5 records sent, 1.0 records/sec (27.61 MB/sec), 635.6 ms avg latency,
> > 673.0 max latency.
> > 6 records sent, 1.1 records/sec (30.09 MB/sec), 736.2 ms avg latency,
> > 1255.0 max latency.
> > 5 records sent, 1.0 records/sec (27.62 MB/sec), 626.8 ms avg latency,
> > 685.0 max latency.
> > 5 records sent, 1.0 records/sec (28.38 MB/sec), 608.8 ms avg latency,
> > 685.0 max latency.
> >
> >
> > On the broker, I changed
> >
> > socket.send.buffer.bytes=2024000
> > # The receive buffer (SO_RCVBUF) used by the socket server
> > socket.receive.buffer.bytes=2224000
> >
> > and left everything else at the defaults.
> >
> > I am a little surprised to see about 1 s max latency and about 0.5 s
> > average. My understanding is that Kafka memory-maps the log file and lets
> > the OS flush it, and all writes are sequential, so the flush should not be
> > affected much by message size. Batching and the network will take longer,
> > but those are in memory and on the local machine, and my SSD should be far
> > better than 0.5 seconds. Where is the time being spent? Any suggestions?
> >
> > Thanks,
> > Nan
>


-- 
Thanks, Mike


Re: kafka latency for large message

2019-03-18 Thread Nan Xu
Can anyone give a suggestion, or an explanation of why Kafka shows such
high latency for large payloads?

Thanks,
Nan

On Thu, Mar 14, 2019 at 3:53 PM Xu, Nan  wrote:

> Hi,
>
> We are using Kafka to send messages, and less than 1% of them are very
> big, close to 30 MB. We understand Kafka is not ideal for sending big
> messages, but because the large-message rate is very low we want to let
> Kafka handle them anyway, while still getting a reasonable latency.
>
> To test, I set up a test topic on a single-broker local Kafka, with only
> 1 partition and 1 replica, and ran the following command:
>
> ./kafka-producer-perf-test.sh --topic test --num-records 200
> --throughput 1 --record-size 30000000 --producer.config
> ../config/producer.properties
>
> Producer.config:
>
> # max 40 MB message
> max.request.size=40000000
> buffer.memory=40000000
>
> # 2 MB socket send buffer
> send.buffer.bytes=2000000
>
> 6 records sent, 1.1 records/sec (31.00 MB/sec), 973.0 ms avg latency,
> 1386.0 max latency.
> 6 records sent, 1.0 records/sec (28.91 MB/sec), 787.2 ms avg latency,
> 1313.0 max latency.
> 5 records sent, 1.0 records/sec (27.92 MB/sec), 582.8 ms avg latency,
> 643.0 max latency.
> 6 records sent, 1.1 records/sec (30.16 MB/sec), 685.3 ms avg latency,
> 1171.0 max latency.
> 5 records sent, 1.0 records/sec (27.92 MB/sec), 629.4 ms avg latency,
> 729.0 max latency.
> 5 records sent, 1.0 records/sec (27.61 MB/sec), 635.6 ms avg latency,
> 673.0 max latency.
> 6 records sent, 1.1 records/sec (30.09 MB/sec), 736.2 ms avg latency,
> 1255.0 max latency.
> 5 records sent, 1.0 records/sec (27.62 MB/sec), 626.8 ms avg latency,
> 685.0 max latency.
> 5 records sent, 1.0 records/sec (28.38 MB/sec), 608.8 ms avg latency,
> 685.0 max latency.
>
>
> On the broker, I changed
>
> socket.send.buffer.bytes=2024000
> # The receive buffer (SO_RCVBUF) used by the socket server
> socket.receive.buffer.bytes=2224000
>
> and left everything else at the defaults.
>
> I am a little surprised to see about 1 s max latency and about 0.5 s
> average. My understanding is that Kafka memory-maps the log file and lets
> the OS flush it, and all writes are sequential, so the flush should not be
> affected much by message size. Batching and the network will take longer,
> but those are in memory and on the local machine, and my SSD should be far
> better than 0.5 seconds. Where is the time being spent? Any suggestions?
>
> Thanks,
> Nan
>