Makes sense!

Ryanne

On Fri, May 14, 2021, 9:39 AM Nakamura <nny...@gmail.com> wrote:

> Hey Ryanne,
>
> I see what you're saying about serde blocking, but I think we should
> consider it out of scope for this patch.  Right now we've nailed down a
> couple of use cases where we can unambiguously say, "I can make progress
> now" or "I cannot make progress now", which makes it possible to offload to
> a different thread only if we are unable to make progress.  Extending this
> to CPU work like serde would mean always offloading, which would be a
> really big performance change.  It might be worth exploring anyway, but I'd
> rather keep this patch focused on improving ergonomics, rather than
> muddying the waters with evaluating performance very deeply.
>
> I think if we really do want to support serde or interceptors that do IO on
> the send path (which seems like an anti-pattern to me), we should consider
> making that a separate SIP, and probably also consider changing the API to
> use Futures (or CompletionStages).  But I would rather avoid scope creep,
> so that we have a better chance of fixing this part of the problem.
>
> Yes, I think some exceptions will move to being async instead of sync.
> They'll still be surfaced in the Future, so I'm not so confident that it
> would be that big a shock to users though.
>
> Best,
> Moses
>
> On Thu, May 13, 2021 at 7:44 PM Ryanne Dolan <ryannedo...@gmail.com>
> wrote:
>
> > re serialization, my concern is that serialization often accounts for a
> lot
> > of the cycles spent before returning the future. It's not blocking per
> se,
> > but it's the same effect from the caller's perspective.
> >
> > Moreover, serde impls often block themselves, e.g. when fetching schemas
> > from a registry. I suppose it's also possible to block in Interceptors
> > (e.g. writing audit events or metrics), which happens before serdes iiuc.
> > So any blocking in either of those plugins would block the send unless we
> > queue first.
> >
> > So I think we want to queue first and do everything off-thread when using
> > the new API, whatever that looks like. I just want to make sure we don't
> do
> > that for clients that wouldn't expect it.
> >
> > Another consideration is exception handling. If we queue right away,
> we'll
> > defer some exceptions that currently are thrown to the caller (before the
> > future is returned). In the new API, the send() wouldn't throw any
> > exceptions, and instead the future would fail. I think that might mean
> that
> > a new method signature is required.
> >
> > Ryanne
> >
> >
> > On Thu, May 13, 2021, 2:57 PM Nakamura <nakamura.mo...@gmail.com> wrote:
> >
> > > Hey Ryanne,
> > >
> > > I agree we should add an additional constructor (or else an additional
> > > overload in KafkaProducer#send, but the new constructor would be easier
> > to
> > > understand) if we're targeting the "user provides the thread" approach.
> > >
> > > From looking at the code, I think we can keep record serialization on
> the
> > > user thread, if we consider that an important part of the semantics of
> > this
> > > method.  It doesn't seem like serialization depends on knowing the
> > cluster,
> > > I think it's incidental that it comes after the first "blocking"
> segment
> > in
> > > the method.
> > >
> > > Best,
> > > Moses
> > >
> > > On Thu, May 13, 2021 at 2:38 PM Ryanne Dolan <ryannedo...@gmail.com>
> > > wrote:
> > >
> > > > Hey Moses, I like the direction here. My thinking is that a single
> > > > additional work queue, s.t. send() can enqueue and return, seems like
> > the
> > > > lightest touch. However, I don't think we can trivially process that
> > > queue
> > > > in an internal thread pool without subtly changing behavior for some
> > > users.
> > > > For example, users will often run send() in multiple threads in order
> > to
> > > > serialize faster, but that wouldn't work quite the same if there were
> > an
> > > > internal thread pool.
> > > >
> > > > For this reason I'm thinking we need to make sure any such changes
> are
> > > > opt-in. Maybe a new constructor with an additional ThreadFactory
> > > parameter.
> > > > That would at least clearly indicate that work will happen
> off-thread,
> > > and
> > > > would require opt-in for the new behavior.
> > > >
> > > > Under the hood, this ThreadFactory could be used to create the worker
> > > > thread that process queued sends, which could fan-out to
> per-partition
> > > > threads from there.
> > > >
> > > > So then you'd have two ways to send: the existing way, where serde
> and
> > > > interceptors and whatnot are executed on the calling thread, and the
> > new
> > > > way, which returns right away and uses an internal Executor. As you
> > point
> > > > out, the semantics would be identical in either case, and it would be
> > > very
> > > > easy for clients to switch.
> > > >
> > > > Ryanne
> > > >
> > > > On Thu, May 13, 2021, 9:00 AM Nakamura <nny...@gmail.com> wrote:
> > > >
> > > > > Hey Folks,
> > > > > I just posted a new proposal
> > > > > <
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=181306446
> > > > > >
> > > > > in the wiki.  I think we have an opportunity to improve the
> > > > > KafkaProducer#send user experience.  It would certainly make our
> > lives
> > > > > easier.  Please take a look!  There are two subproblems that I
> could
> > > use
> > > > > guidance on, so I would appreciate feedback on both of them.
> > > > > Best,
> > > > > Moses
> > > > >
> > > >
> > >
> >
>

Reply via email to