Re: Use one producer for both coordinator stream and users system?

2015-08-18 Thread Yan Fang
Hi Tao,

First, each Kafka producer has its own I/O thread (correct me if I am wrong).

Second, since Samza 0.10.0 we have a coordinator stream, which stores
checkpoints, config, and locality information for purposes such as
auto-scaling and dynamic configuration (see SAMZA-348,
https://issues.apache.org/jira/browse/SAMZA-348). So we have a producer
for this coordinator stream.

Therefore, each container will have at least two producers: one for the
coordinator stream and one for the user's system.

My question is, can we use a single producer for both the coordinator
stream and the user's system to get better performance? (The Kafka doc
suggests that sharing may perform better.)

Thanks,

Fang, Yan
yanfang...@gmail.com
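
The thread count described above can be pictured with a toy model. This is only a sketch: `ToyProducer` and `run` are hypothetical names invented for illustration, not Kafka or Samza APIs, and the model assumes (as the message does) that each producer instance owns exactly one background I/O thread.

```python
import queue
import threading


class ToyProducer:
    """Hypothetical stand-in for a producer: callers enqueue, one I/O thread drains."""

    def __init__(self):
        self._queue = queue.Queue()
        self.sent = []
        self._io_thread = threading.Thread(target=self._drain, daemon=True)
        self._io_thread.start()

    def send(self, system, message):
        # Safe to call from any thread; queue.Queue does the locking.
        self._queue.put((system, message))

    def _drain(self):
        # The single I/O thread: takes items off the queue in FIFO order.
        while True:
            item = self._queue.get()
            if item is None:  # sentinel placed by close()
                break
            self.sent.append(item)

    def close(self):
        self._queue.put(None)
        self._io_thread.join()


def run(producers_by_system):
    """Send one message per system; return how many I/O threads were needed."""
    for system, producer in producers_by_system.items():
        producer.send(system, "hello")
    unique = {id(p): p for p in producers_by_system.values()}
    for producer in unique.values():
        producer.close()
    return len(unique)


# Current layout: one producer (and one I/O thread) per system.
separate = run({"coordinator": ToyProducer(), "user": ToyProducer()})

# Proposed layout: both systems share one producer and its I/O thread.
shared_producer = ToyProducer()
shared = run({"coordinator": shared_producer, "user": shared_producer})
```

The model says nothing about throughput; it only shows that sharing halves the number of I/O threads per container while both systems' messages still get sent.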




Re: Use one producer for both coordinator stream and users system?

2015-08-18 Thread Tao Feng
Thanks Yan. I guess I was not very clear on the coordinatorStream concept
before.
-Tao




Re: Use one producer for both coordinator stream and users system?

2015-08-18 Thread Roger Hoover
Hi Yan,

My (uneducated) guess is that the performance gains come from batching.  I
don't know if the new producer ever batches by destination broker.  If not
and it only batches by (broker,topic,partition) then I doubt that one vs
two producers will affect performance as they send to different topics.

Cheers,

Roger
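
The batching argument above can be checked with a small counting model. This is a sketch under stated assumptions, not a description of how the Kafka producer really batches: the broker and topic names, the batch size of 4, and the `count_batches` helper are all made up for illustration.

```python
from collections import defaultdict


def count_batches(records, key_fn, batch_size):
    """Group records by key_fn and count the batches needed to send them."""
    per_key = defaultdict(int)
    for record in records:
        per_key[key_fn(record)] += 1
    # Ceiling division: a partially filled batch still costs one send.
    return sum(-(-n // batch_size) for n in per_key.values())


# Hypothetical records as (broker, topic, partition) tuples.
coordinator = [("broker-1", "coordinator-stream", 0)] * 3
user_output = [("broker-1", "user-output", 0)] * 5


def by_partition(record):
    return record  # key is the full (broker, topic, partition) tuple


def by_broker(record):
    return record[0]  # key is the destination broker only


batch_size = 4

# Two producers, each batching its own topic by (broker, topic, partition).
two_producers = (count_batches(coordinator, by_partition, batch_size)
                 + count_batches(user_output, by_partition, batch_size))

# One shared producer, still batching by (broker, topic, partition).
one_producer = count_batches(coordinator + user_output, by_partition, batch_size)

# One shared producer that batched by destination broker instead.
one_producer_by_broker = count_batches(coordinator + user_output, by_broker, batch_size)

print(two_producers, one_producer, one_producer_by_broker)
```

In this model the shared producer only saves a batch when records for different topics can be combined per broker, which is the condition the message is unsure about.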



Re: Use one producer for both coordinator stream and users system?

2015-08-17 Thread Tao Feng
Hi Yan,

Naive question: what do we need the coordinator stream's producer thread for?

Thanks,
-Tao

On Mon, Aug 17, 2015 at 2:09 PM, Yan Fang yanfang...@gmail.com wrote:

 Hi guys,

 I have this question because Kafka's doc

 http://kafka.apache.org/082/javadoc/org/apache/kafka/clients/producer/KafkaProducer.html

 seems to recommend having one producer shared by all threads (*The
 producer is thread safe and should generally be shared among all threads
 for best performance.*), while the coordinator stream currently uses a
 separate producer. (Usually there are two producers, and so two producer
 threads, in each container: one for the coordinator stream and one for
 the real job.)

 1. Will having one producer shared by all threads really improve
 performance? (I haven't done the perf test myself; I guess Kafka has some
 proof.)

 2. If yes, should we go this way?

 Thanks,

 Fang, Yan
 yanfang...@gmail.com