Felix, my regrets for confusing the matter. Please point me to a
primary source for the canonical use case you reference, unless that was
scoped to the Kafka community only. That sort of statement should be
clearly documented, IMHO.
I am considering the matter closed with respect to this list. I have 3
publish options, each with some degree of autonomy from the calling
code's designed behavior.
Regards
On 08/20/2012 02:39 PM, Felix GV wrote:
I think the difference is merely that async publishing is a non-blocking
call, whereas sync publishing is a blocking call, meaning that the code
that does a sync publish call could choose to have an alternate behavior if
the publish failed, whereas the code that does an async publish would never
know whether the publish succeeded or not.
But like I said, in both cases, you can configure the batching size at the
producer level, and a batching size greater than 1 will provide you with
better throughput capabilities... In fact, I think this is the canonical
use case Kafka was originally built for.
--
Felix
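The blocking versus non-blocking distinction described above can be sketched in plain Java. This is a minimal illustration only, not Kafka code: Transport, SyncPublisher, AsyncPublisher and the batchSize parameter are hypothetical names invented for the example, and flush() is called by hand where a real producer would presumably use a background thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical stand-in for a broker connection; not a Kafka class. */
interface Transport {
    void deliver(List<String> batch) throws Exception;
}

/** Blocking publish: the caller sees a failure and can react to it. */
final class SyncPublisher {
    private final Transport transport;

    SyncPublisher(Transport transport) { this.transport = transport; }

    /** Returns true if the whole batch was delivered, false otherwise. */
    boolean send(List<String> batch) {
        try {
            transport.deliver(batch);
            return true;
        } catch (Exception e) {
            return false;
        }
    }
}

/** Non-blocking publish: messages are queued and delivered later in batches. */
final class AsyncPublisher {
    private final Transport transport;
    private final int batchSize;
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();

    AsyncPublisher(Transport transport, int batchSize) {
        this.transport = transport;
        this.batchSize = batchSize;
    }

    /** Returns immediately; the caller never learns the delivery outcome. */
    void send(String message) { queue.add(message); }

    /** Drains the queue in chunks of at most batchSize; returns the number of batches attempted. */
    int flush() {
        int batches = 0;
        List<String> batch = new ArrayList<>();
        while (queue.drainTo(batch, batchSize) > 0) {
            try {
                transport.deliver(batch);
            } catch (Exception ignored) {
                // The failure stays invisible to whoever called send().
            }
            batches++;
            batch = new ArrayList<>();
        }
        return batches;
    }
}

class PublishSketch {
    public static void main(String[] args) {
        Transport printing = batch -> System.out.println("delivered " + batch);
        boolean ok = new SyncPublisher(printing).send(Arrays.asList("a", "b"));
        System.out.println("sync result: " + ok);

        AsyncPublisher async = new AsyncPublisher(printing, 2);
        for (String m : new String[] {"1", "2", "3"}) async.send(m);
        System.out.println("async batches: " + async.flush());
    }
}
```

In this sketch both publishers hand the transport more than one message at a time; the only difference is whether the caller gets the outcome back.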
On Mon, Aug 20, 2012 at 2:24 PM, will martin <[email protected]> wrote:
My understanding is that async is not meant to be an immediate send. As to
batching, I've not delved into the code differences.
But batching with the sync producer is not possible at the higher Producer
level; at least, I've tried it and had no success. The default and string
encoders cannot handle lists, although the documentation suggests they can.
I'd be glad to be wrong on this, but I've had no luck getting the serializer
deep in the Scala code tree to accept a composite of any type containing
either Message or String. I can batch myself, but I doubt that is what any
of us think is the design goal.
On Mon, Aug 20, 2012 at 1:06 PM, Felix GV <[email protected]> wrote:
This may not be entirely related to what you're talking about, but why
would an async producer not be able to meet your throughput needs, and a
sync producer be able to?
Both sync and async producers can be configured to batch more than one
message together, and that's pretty much the main thing that's required to
be able to achieve good throughput, AFAIK.
...?
--
Felix
On Mon, Aug 20, 2012 at 12:49 PM, will martin <[email protected]> wrote:
Thanks Neha. All my data is of 1 type. The serializer in place doesn't seem
to handle an array of String.
The ProducerData I use is a collection of same-typed data wrapped in a
single definition, as I read the spec. Am I to understand that having a
producer batch records itself is unsupported? The async producer can't meet
my throughput needs and, as I understand it, is targeted at implicit load
balancing among different client machines.
Additionally, the sync producer can meet my needs, but requires more use of
the lower-level design features. For maintenance, it'd be great if I could
create a list of Strings, create a ProducerData<String, List<String>>, and
have this be serialized.
It occurs to me that the described serialization may need my attention?
Thx
On Mon, Aug 20, 2012 at 12:06 PM, Neha Narkhede <[email protected]> wrote:
The producer takes in a "serializer.class" config that it uses to
serialize data sent by the Producer. A Producer instance is tied to
the type of data it is sending, so you won't be able to send data
belonging to diverse types using the same Producer object.
Thanks,
Neha
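The point about a Producer being tied to one data type can be illustrated with a small sketch. The names PayloadEncoder, Utf8StringEncoder and TypedProducer are hypothetical and are not the Kafka classes; the assumption made here is that a batch is a list of values of the producer's element type, each encoded individually.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical encoder contract: one value of type T becomes one payload. */
interface PayloadEncoder<T> {
    byte[] encode(T value);
}

/** Encodes a single String as UTF-8 bytes. */
final class Utf8StringEncoder implements PayloadEncoder<String> {
    @Override
    public byte[] encode(String value) {
        return value.getBytes(StandardCharsets.UTF_8);
    }
}

/** Hypothetical producer bound to a single value type through its encoder. */
final class TypedProducer<T> {
    private final PayloadEncoder<T> encoder;
    private final List<byte[]> sent = new ArrayList<>();

    TypedProducer(PayloadEncoder<T> encoder) { this.encoder = encoder; }

    /** A batch is a list of T values; each element is encoded on its own. */
    void send(List<T> values) {
        for (T value : values) {
            sent.add(encoder.encode(value));
        }
    }

    int sentCount() { return sent.size(); }
}

class TypedProducerSketch {
    public static void main(String[] args) {
        TypedProducer<String> producer = new TypedProducer<>(new Utf8StringEncoder());
        producer.send(Arrays.asList("first", "second", "third"));
        System.out.println("payloads sent: " + producer.sentCount());
    }
}
```

In this sketch the element type stays String and the list is the batch. Declaring the value type as List<String> instead would require an encoder written for lists, which a string encoder is not.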
On Mon, Aug 20, 2012 at 8:02 AM, will martin <[email protected]> wrote:
This use case is defined by the following snippet from the Design section
of the doc pages.
class Producer {
    public void send(ProducerData)
    public void send(List<ProducerData>)
    public void close()
}
I've tried various composites for the List<ProducerData> argument,
including strings and Messages. All of these throw serialization errors
deep in the engine.
Is the list form of send supported in 7.1?
Thanks in advance,
mmartin