With this approach close(timeout > 0) will not actually close by the elapsed timeout right? This semantics mismatch is a bit concerning to me..
Guozhang On Fri, Mar 20, 2015 at 2:37 PM, Jun Rao <j...@confluent.io> wrote: > Yes, that probably is simpler. Basically, close() and close(timeout) will > always wait for the sender thread to complete. Close(0) will just initiate > the shutdown of the sender thread, but not waiting for the thread to > complete. > > Thanks, > > Jun > > On Fri, Mar 20, 2015 at 11:29 AM, Jiangjie Qin <j...@linkedin.com.invalid> > wrote: > > > That’s a very good point. > > Currently if timeout > 0, it will wait up to timeout for the sender > thread > > to complete. After that it will do a force close but not block. > > > > Maybe we should make it to be: > > If timeout > 0, wait up to timeout for the sender thread to complete. > > After that, make a force close but still try to join the sender thread. > > If timeout = 0, it is non-blocking. > > > > Jiangjie (Becket) Qin > > > > On 3/20/15, 9:16 AM, "Jun Rao" <j...@confluent.io> wrote: > > > > >Got it. Just to clarify, does close(timeout) always wait for the sender > > >thread to complete? > > > > > >Thanks, > > > > > >Jun > > > > > >On Fri, Mar 20, 2015 at 7:41 AM, Jiangjie Qin <j...@linkedin.com.invalid > > > > >wrote: > > > > > >> I think the we agreed that we are going to log an error and block. By > > >> doing this we can make sure the error log to be checked by user in all > > >> cases. > > >> If we silently replace close() to close(0) in sender thread, in some > > >>cases > > >> such as send error during a normal close(), user might not notice > > >> something went wrong. > > >> > > >> On 3/19/15, 10:34 PM, "Joel Koshy" <jjkosh...@gmail.com> wrote: > > >> > > >> >close can probably just getCurrentThread and check if == > senderThread. > > >> >I'm actually not sure from this thread if there was clear agreement > on > > >> >whether it should change close(timeout)/close() to close(0) or if it > > >> >should > > >> >log an error and block up to the timeout. > > >> > > > >> >On Thursday, March 19, 2015, Jun Rao <j...@confluent.io> wrote: > > >> > > > >> >> So in (1), if a close() or close(timeout) is called from a > callback, > > >>we > > >> >> will just turn that into a close(0)? Implementation wise, how do we > > >>know > > >> >> whether a close() call is made from the sender thread or not? > > >> >> > > >> >> Thanks, > > >> >> > > >> >> Jun > > >> >> > > >> >> On Wed, Mar 18, 2015 at 2:13 PM, Jiangjie Qin > > >> >><j...@linkedin.com.invalid> > > >> >> wrote: > > >> >> > > >> >> > It looks we have another option and are now deciding between the > > >> >> following > > >> >> > two interfaces: > > >> >> > > > >> >> > 1. Close() + close(timeout) > > >> >> > - timeout could be either positive or zero. > > >> >> > - only close(0) can be called from sender thread > > >> >> > > > >> >> > 2. Close() + abort() + close(timeout) > > >> >> > - timeout can either be positive or zero > > >> >> > - only abort() can be called from sender thread > > >> >> > > > >> >> > - abort() is equivalent to close(0) in 1) but does not join > > >>sender > > >> >> > thread and does not close metrics. > > >> >> > - Another thread has to call close() or close(timeout) in order > > >>to > > >> >>make > > >> >> > sure the resources in producer are gone. > > >> >> > > > >> >> > The tow approach provides the same function we need, the > > >>difference is > > >> >> > approach 2) follows convention of close() and abort(). On the > other > > >> >>hand, > > >> >> > approach 1) saves one interface compared with approach 2) but > does > > >>not > > >> >> > follow the convention. > > >> >> > > > >> >> > When the two approaches come to user code, it is probably > something > > >> >>like > > >> >> > this: > > >> >> > > > >> >> > Try { > > >> >> > While(!finished) > > >> >> > Producer.send(record, callback) > > >> >> > } catch (Exception e) { > > >> >> > Producer.close(5) > > >> >> > } > > >> >> > > > >> >> > Class CallbackImpl implements Callback { > > >> >> > onCompletion(RecordMetadata metadata Exception e) { > > >> >> > If (e != null) > > >> >> > Abort() / close() > > >> >> > } > > >> >> > } > > >> >> > > > >> >> > Because the two approach leads to almost the same user code, > > >>assuming > > >> >> > users are always calling producer.close() as a clean up step, > > >> >>personally > > >> >> I > > >> >> > prefer approach 2) as it follows convention. > > >> >> > > > >> >> > Any thoughts? > > >> >> > > > >> >> > Jiangjie (Becket) Qin > > >> >> > > > >> >> > > > >> >> > On 3/17/15, 10:25 AM, "Jiangjie Qin" <j...@linkedin.com > > >> >><javascript:;>> > > >> >> wrote: > > >> >> > > > >> >> > >Hi Jun, > > >> >> > > > > >> >> > >Yes, as Guozhang said, the main reason we set a flag is because > > >> >>close(0) > > >> >> > >is expected to be called by sender thread itself. > > >> >> > >If we want to maintain the semantic meaning of close(), one > > >> >>alternative > > >> >> is > > >> >> > >to have an abort() method does the same thing as close(0) except > > >> >> cleanup. > > >> >> > >And in close(timeout), after timeout we call abort() and join > the > > >> >>sender > > >> >> > >thread. This was one of the previous proposal. We merged abort > to > > >> >> close(0) > > >> >> > >because they are almost doing the same thing. But from what you > > >> >> mentioned, > > >> >> > >it might make sense to have two separate methods. > > >> >> > > > > >> >> > >Thanks. > > >> >> > > > > >> >> > >Jiangjie (Becket) Qin > > >> >> > > > > >> >> > >On 3/16/15, 10:31 PM, "Guozhang Wang" <wangg...@gmail.com > > >> >> <javascript:;>> wrote: > > >> >> > > > > >> >> > >>Yeah in this sense the sender thread will not exist immediately > > >>in > > >> >>the > > >> >> > >>close(0) call, but will only terminate after the current > response > > >> >>batch > > >> >> > >>has > > >> >> > >>been processed, as will the producer instance itself. > > >> >> > >> > > >> >> > >>There is a reason for this though: for a clean shutdown the > > >>caller > > >> >> thread > > >> >> > >>has to wait for the sender thread to join before closing the > > >> >>producer > > >> >> > >>instance, but this cannot be achieve if close(0) is called by > the > > >> >> sender > > >> >> > >>thread itself (for example in KAFKA-1659, there is a proposal > > >>from > > >> >> Andrew > > >> >> > >>Stein on using thread.interrupt and thread.stop, but if it is > > >> >>called by > > >> >> > >>the > > >> >> > >>ioThread itself the stop call will fail). Hence we came up with > > >>the > > >> >> flag > > >> >> > >>approach to let the sender thread to close as soon as it is at > > >>the > > >> >> > >>barrier > > >> >> > >>of the run loop. > > >> >> > >> > > >> >> > >>Guozhang > > >> >> > >> > > >> >> > >>On Mon, Mar 16, 2015 at 9:41 PM, Jun Rao <j...@confluent.io > > >> >> <javascript:;>> wrote: > > >> >> > >> > > >> >> > >>> Hmm, does that mean that after close(0), the sender thread is > > >>not > > >> >> > >>>necessary > > >> >> > >>> gone? Normally, after closing an entity, we expect all > internal > > >> >> threads > > >> >> > >>> associated with the entity are shut down completely. > > >> >> > >>> > > >> >> > >>> Thanks, > > >> >> > >>> > > >> >> > >>> Jun > > >> >> > >>> > > >> >> > >>> On Mon, Mar 16, 2015 at 3:18 PM, Jiangjie Qin > > >> >> > >>><j...@linkedin.com.invalid> > > >> >> > >>> wrote: > > >> >> > >>> > > >> >> > >>> > Hi Jun, > > >> >> > >>> > > > >> >> > >>> > Close(0) will set two flags in sender. Running=false and a > > >>newly > > >> >> > >>>added > > >> >> > >>> > forceClose=true. It will also set accumulator.closed=true > so > > >>no > > >> >> > >>>further > > >> >> > >>> > producer.send() will succeed. > > >> >> > >>> > The sender thread will finish executing all the callbacks > in > > >> >> current > > >> >> > >>> batch > > >> >> > >>> > of responses, then it will see the forceClose flag. It will > > >>just > > >> >> fail > > >> >> > >>>all > > >> >> > >>> > the incomplete batches in the producer and exit. > > >> >> > >>> > So close(0) is a non-blocking call and sender thread will > not > > >> >>try > > >> >> to > > >> >> > >>>join > > >> >> > >>> > itself in close(0). > > >> >> > >>> > > > >> >> > >>> > Thanks. > > >> >> > >>> > > > >> >> > >>> > Jiangjie (Becket) Qin > > >> >> > >>> > > > >> >> > >>> > On 3/16/15, 2:50 PM, "Jun Rao" <j...@confluent.io > > >> <javascript:;>> > > >> >> wrote: > > >> >> > >>> > > > >> >> > >>> > >How does close(0) work if it's called from the sender > > >>thread? > > >> >>If > > >> >> > >>> close(0) > > >> >> > >>> > >needs to wait for the sender thread to join, wouldn't this > > >> >>cause a > > >> >> > >>> > >deadlock? > > >> >> > >>> > > > > >> >> > >>> > >Thanks, > > >> >> > >>> > > > > >> >> > >>> > >Jun > > >> >> > >>> > > > > >> >> > >>> > >On Mon, Mar 16, 2015 at 2:26 PM, Jiangjie Qin > > >> >> > >>><j...@linkedin.com.invalid > > >> >> > >>> > > > >> >> > >>> > >wrote: > > >> >> > >>> > > > > >> >> > >>> > >> Thanks Guozhang. It wouldn’t be as thoroughly considered > > >> >>without > > >> >> > >>> > >> discussing with you :) > > >> >> > >>> > >> > > >> >> > >>> > >> Jiangjie (Becket) Qin > > >> >> > >>> > >> > > >> >> > >>> > >> On 3/16/15, 1:07 PM, "Guozhang Wang" < > wangg...@gmail.com > > >> >> <javascript:;>> wrote: > > >> >> > >>> > >> > > >> >> > >>> > >> >Thanks Jiangjie, > > >> >> > >>> > >> > > > >> >> > >>> > >> >After talking to you offline on this, I have been > > >>convinced > > >> >>and > > >> >> > >>> > >>changed my > > >> >> > >>> > >> >preference to blocking. The immediate shutdown approach > > >>does > > >> >> have > > >> >> > >>> some > > >> >> > >>> > >> >unsafeness in some cases. > > >> >> > >>> > >> > > > >> >> > >>> > >> >Guozhang > > >> >> > >>> > >> > > > >> >> > >>> > >> >On Mon, Mar 16, 2015 at 11:50 AM, Jiangjie Qin > > >> >> > >>> > >><j...@linkedin.com.invalid > > >> >> > >>> > >> > > > >> >> > >>> > >> >wrote: > > >> >> > >>> > >> > > > >> >> > >>> > >> >> It looks that the problem we want to solve and the > > >> >>purpose we > > >> >> > >>>want > > >> >> > >>> to > > >> >> > >>> > >> >> achieve is: > > >> >> > >>> > >> >> If user uses close() in callback, we want to let user > > >>be > > >> >> aware > > >> >> > >>>that > > >> >> > >>> > >>they > > >> >> > >>> > >> >> should use close(0) instead of close() in the > callback. > > >> >> > >>> > >> >> > > >> >> > >>> > >> >> We have agreed that we will have an error log to > inform > > >> >>user > > >> >> > >>>about > > >> >> > >>> > >>this > > >> >> > >>> > >> >> mis-usage. The options differ in the way how we can > > >>force > > >> >> user > > >> >> > >>>to > > >> >> > >>> > >>take a > > >> >> > >>> > >> >> look at that error log. > > >> >> > >>> > >> >> There are two scenarios: > > >> >> > >>> > >> >> 1. User does not expect the program to exit. > > >> >> > >>> > >> >> 2. User expect the program to exit. > > >> >> > >>> > >> >> > > >> >> > >>> > >> >> For scenario 1), blocking will probably delay the > > >> >>discovery > > >> >> of > > >> >> > >>>the > > >> >> > >>> > >> >> problem. Calling close(0) exposes the problem > quicker. > > >>In > > >> >> this > > >> >> > >>> > >>scenario > > >> >> > >>> > >> >> producer just encounter a send failure when running > > >> >>normally. > > >> >> > >>> > >> >> For scenario 2), blocking will expose the problem > > >>quick. > > >> >> > >>>Calling > > >> >> > >>> > >> >>close(-1) > > >> >> > >>> > >> >> might hide the problem. This scenario might include: > a) > > >> >>Unit > > >> >> > >>>test > > >> >> > >>> > >>for a > > >> >> > >>> > >> >> send failure. b) Message sending during a close() > call > > >> >>from a > > >> >> > >>>user > > >> >> > >>> > >> >>thread. > > >> >> > >>> > >> >> > > >> >> > >>> > >> >> So as a summary table: > > >> >> > >>> > >> >> > > >> >> > >>> > >> >> Scenario 1) > > >> >> Scenario > > >> >> > >>>2) > > >> >> > >>> > >> >> > > >> >> > >>> > >> >> Blocking Delay problem discovery > > >>Guaranteed > > >> >> > >>>problem > > >> >> > >>> > >> >>discovery > > >> >> > >>> > >> >> > > >> >> > >>> > >> >> Close(-1) Immediate problem discovery Problem > > >> >>might > > >> >> be > > >> >> > >>> hidden > > >> >> > >>> > >> >> > > >> >> > >>> > >> >> > > >> >> > >>> > >> >> Personally I prefer blocking because it seems > providing > > >> >>more > > >> >> > >>> > >>guarantees > > >> >> > >>> > >> >> and safer. > > >> >> > >>> > >> >> > > >> >> > >>> > >> >> Thanks. > > >> >> > >>> > >> >> > > >> >> > >>> > >> >> Jiangjie (Becket) Qin > > >> >> > >>> > >> >> > > >> >> > >>> > >> >> > > >> >> > >>> > >> >> On 3/16/15, 10:11 AM, "Guozhang Wang" > > >><wangg...@gmail.com > > >> >> <javascript:;>> > > >> >> > >>>wrote: > > >> >> > >>> > >> >> > > >> >> > >>> > >> >> >HI Jiangjie, > > >> >> > >>> > >> >> > > > >> >> > >>> > >> >> >As far as I understand calling close() in the > > >>ioThread is > > >> >> not > > >> >> > >>> > >>common, > > >> >> > >>> > >> >>as > > >> >> > >>> > >> >> >it > > >> >> > >>> > >> >> >may only trigger when we saw some non-retriable > error. > > >> >>Hence > > >> >> > >>>when > > >> >> > >>> > >>user > > >> >> > >>> > >> >>run > > >> >> > >>> > >> >> >their program it is unlikely that close() will be > > >> >>triggered > > >> >> > >>>and > > >> >> > >>> > >>problem > > >> >> > >>> > >> >> >will be detected. So it seems to me that from the > > >>error > > >> >> > >>>detection > > >> >> > >>> > >> >>aspect > > >> >> > >>> > >> >> >these two options seems to be the same as people > will > > >> >> usually > > >> >> > >>> > >>detect it > > >> >> > >>> > >> >> >from the producer metrics all dropping to 0. > > >> >> > >>> > >> >> > > > >> >> > >>> > >> >> >Guozhang > > >> >> > >>> > >> >> > > > >> >> > >>> > >> >> >On Mon, Mar 16, 2015 at 9:52 AM, Jiangjie Qin > > >> >> > >>> > >> >><j...@linkedin.com.invalid> > > >> >> > >>> > >> >> >wrote: > > >> >> > >>> > >> >> > > > >> >> > >>> > >> >> >> It seems there are two options we can choose from > > >>when > > >> >> > >>>close() > > >> >> > >>> is > > >> >> > >>> > >> >>called > > >> >> > >>> > >> >> >> from sender thread (callback): > > >> >> > >>> > >> >> >> 1. Log an error and close the producer using > > >>close(-1) > > >> >> > >>> > >> >> >> 2. Log an error and block. > > >> >> > >>> > >> >> >> (Throwing an exception will not work because we > > >>catch > > >> >>all > > >> >> > >>>the > > >> >> > >>> > >> >>exception > > >> >> > >>> > >> >> >> thrown from user callback. It will just lead to an > > >> >>error > > >> >> > >>>log.) > > >> >> > >>> > >> >> >> > > >> >> > >>> > >> >> >> My concern for the first option is that the > producer > > >> >>will > > >> >> be > > >> >> > >>> > >>closed > > >> >> > >>> > >> >>even > > >> >> > >>> > >> >> >> if we logged and error. I am wondering if some > user > > >> >>would > > >> >> > >>>not > > >> >> > >>> even > > >> >> > >>> > >> >>take > > >> >> > >>> > >> >> >>a > > >> >> > >>> > >> >> >> look at the log if producer is closed normally. > > >>Because > > >> >> from > > >> >> > >>>the > > >> >> > >>> > >> >> >>programs > > >> >> > >>> > >> >> >> behavior, everything looks good. If that is the > > >>case, > > >> >>the > > >> >> > >>>error > > >> >> > >>> > >> >>message > > >> >> > >>> > >> >> >>we > > >> >> > >>> > >> >> >> logged probably will just be ignored until some > day > > >> >>when > > >> >> > >>>people > > >> >> > >>> > >>check > > >> >> > >>> > >> >> >>the > > >> >> > >>> > >> >> >> log and see it. > > >> >> > >>> > >> >> >> > > >> >> > >>> > >> >> >> As for the second option, because producer does > not > > >> >>close > > >> >> > >>>but > > >> >> > >>> > >>blocks. > > >> >> > >>> > >> >> >>User > > >> >> > >>> > >> >> >> will notice this the first time they run the > > >>program. > > >> >>They > > >> >> > >>> > >>probably > > >> >> > >>> > >> >>will > > >> >> > >>> > >> >> >> look at the log to see why producer could not be > > >>closed > > >> >> and > > >> >> > >>>they > > >> >> > >>> > >>will > > >> >> > >>> > >> >> >>see > > >> >> > >>> > >> >> >> the error log we put there. So they will get > > >>informed > > >> >> about > > >> >> > >>>this > > >> >> > >>> > >> >> >>mis-usage > > >> >> > >>> > >> >> >> of close() in sender thread the first time they > run > > >>the > > >> >> code > > >> >> > >>> > >>instead > > >> >> > >>> > >> >>of > > >> >> > >>> > >> >> >> some time later. > > >> >> > >>> > >> >> >> > > >> >> > >>> > >> >> >> Personally I prefer the second one because it is > > >>more > > >> >> > >>>obvious > > >> >> > >>> that > > >> >> > >>> > >> >> >> something was wrong. > > >> >> > >>> > >> >> >> > > >> >> > >>> > >> >> >> Jiangjie (Becket) Qin > > >> >> > >>> > >> >> >> > > >> >> > >>> > >> >> >> On 3/15/15, 4:27 PM, "Guozhang Wang" > > >> >><wangg...@gmail.com > > >> >> <javascript:;>> > > >> >> > >>> wrote: > > >> >> > >>> > >> >> >> > > >> >> > >>> > >> >> >> >Yeah I agree we should not silently change the > > >> >>behavior > > >> >> of > > >> >> > >>>the > > >> >> > >>> > >> >>function > > >> >> > >>> > >> >> >> >with the given parameters; and I would prefer > > >> >> > >>> > >> >> >>error-logging-and-shutdown > > >> >> > >>> > >> >> >> >over blocking when close(>0) is used, since as > Neha > > >> >> > >>>suggested > > >> >> > >>> > >> >>blocking > > >> >> > >>> > >> >> >> >would also not proceed with sending any data, bu > > >>will > > >> >> just > > >> >> > >>>let > > >> >> > >>> > >> >>users to > > >> >> > >>> > >> >> >> >realize the issue later than sooner. > > >> >> > >>> > >> >> >> > > > >> >> > >>> > >> >> >> >On Sun, Mar 15, 2015 at 3:25 PM, Neha Narkhede > > >> >> > >>> > >><n...@confluent.io <javascript:;>> > > >> >> > >>> > >> >> >>wrote: > > >> >> > >>> > >> >> >> > > > >> >> > >>> > >> >> >> >> > > > >> >> > >>> > >> >> >> >> > And I also agree it is better if we can make > > >> >>producer > > >> >> > >>>block > > >> >> > >>> > >>when > > >> >> > >>> > >> >> >> >> > close() is called from sender thread so user > > >>will > > >> >> > >>>notice > > >> >> > >>> > >> >>something > > >> >> > >>> > >> >> >> >>went > > >> >> > >>> > >> >> >> >> > wrong. > > >> >> > >>> > >> >> >> >> > > >> >> > >>> > >> >> >> >> > > >> >> > >>> > >> >> >> >> This isn't a great experience either. Why can't > > >>we > > >> >>just > > >> >> > >>>throw > > >> >> > >>> > >>an > > >> >> > >>> > >> >> >> >>exception > > >> >> > >>> > >> >> >> >> for a behavior we know is incorrect and we'd > like > > >> >>the > > >> >> > >>>user to > > >> >> > >>> > >> >>know. > > >> >> > >>> > >> >> >> >> Blocking as a means of doing that seems wrong > and > > >> >> > >>>annoying. > > >> >> > >>> > >> >> >> >> > > >> >> > >>> > >> >> >> >> On Sun, Mar 15, 2015 at 11:56 AM, Jay Kreps > > >> >> > >>> > >><jay.kr...@gmail.com <javascript:;>> > > >> >> > >>> > >> >> >> wrote: > > >> >> > >>> > >> >> >> >> > > >> >> > >>> > >> >> >> >> > Cool. > > >> >> > >>> > >> >> >> >> > > > >> >> > >>> > >> >> >> >> > I think blocking is good or alternately > > >>throwing > > >> >>an > > >> >> > >>> exception > > >> >> > >>> > >> >> >>directly > > >> >> > >>> > >> >> >> >> from > > >> >> > >>> > >> >> >> >> > close(). Basically I would just worry about > > >>subtly > > >> >> > >>>doing > > >> >> > >>> > >> >>something > > >> >> > >>> > >> >> >> >> slightly > > >> >> > >>> > >> >> >> >> > different from what the user asked for as it > > >>will > > >> >>be > > >> >> > >>>hard > > >> >> > >>> to > > >> >> > >>> > >> >>notice > > >> >> > >>> > >> >> >> >>that > > >> >> > >>> > >> >> >> >> > behavior difference. > > >> >> > >>> > >> >> >> >> > > > >> >> > >>> > >> >> >> >> > -Jay > > >> >> > >>> > >> >> >> >> > > > >> >> > >>> > >> >> >> >> > On Sat, Mar 14, 2015 at 5:48 PM, Jiangjie Qin > > >> >> > >>> > >> >> >> >><j...@linkedin.com.invalid > > >> >> > >>> > >> >> >> >> > > > >> >> > >>> > >> >> >> >> > wrote: > > >> >> > >>> > >> >> >> >> > > > >> >> > >>> > >> >> >> >> > > Hi Jay, > > >> >> > >>> > >> >> >> >> > > > > >> >> > >>> > >> >> >> >> > > I have modified the KIP as you suggested. I > > >> >>thinks > > >> >> as > > >> >> > >>> long > > >> >> > >>> > >>as > > >> >> > >>> > >> >>we > > >> >> > >>> > >> >> >> >>have > > >> >> > >>> > >> >> >> >> > > consistent define for timeout across Kafka > > >> >> interface, > > >> >> > >>> there > > >> >> > >>> > >> >>would > > >> >> > >>> > >> >> >> >>be no > > >> >> > >>> > >> >> >> >> > > problem. And I also agree it is better if > we > > >>can > > >> >> make > > >> >> > >>> > >>producer > > >> >> > >>> > >> >> >>block > > >> >> > >>> > >> >> >> >> when > > >> >> > >>> > >> >> >> >> > > close() is called from sender thread so > user > > >> >>will > > >> >> > >>>notice > > >> >> > >>> > >> >> >>something > > >> >> > >>> > >> >> >> >>went > > >> >> > >>> > >> >> >> >> > > wrong. > > >> >> > >>> > >> >> >> >> > > > > >> >> > >>> > >> >> >> >> > > Thanks. > > >> >> > >>> > >> >> >> >> > > > > >> >> > >>> > >> >> >> >> > > Jiangjie (Becket) Qin > > >> >> > >>> > >> >> >> >> > > > > >> >> > >>> > >> >> >> >> > > On 3/14/15, 11:37 AM, "Jay Kreps" > > >> >> > >>><jay.kr...@gmail.com <javascript:;>> > > >> >> > >>> > >> wrote: > > >> >> > >>> > >> >> >> >> > > > > >> >> > >>> > >> >> >> >> > > >Hey Jiangjie, > > >> >> > >>> > >> >> >> >> > > > > > >> >> > >>> > >> >> >> >> > > >I think this is going to be very confusing > > >>that > > >> >> > >>> > >> >> >> >> > > > close(0) waits indefinitely and > > >> >> > >>> > >> >> >> >> > > > close(-1) waits for 0. > > >> >> > >>> > >> >> >> >> > > >I understand this appears in other apis, > > >>but it > > >> >> is a > > >> >> > >>> > >>constant > > >> >> > >>> > >> >> >> >>cause of > > >> >> > >>> > >> >> >> >> > > >bugs. Let's not repeat that mistake. > > >> >> > >>> > >> >> >> >> > > > > > >> >> > >>> > >> >> >> >> > > >Let's make close(0) wait for 0. We don't > > >>need a > > >> >> way > > >> >> > >>>to > > >> >> > >>> > >>wait > > >> >> > >>> > >> >> >> >> indefinitely > > >> >> > >>> > >> >> >> >> > > >as > > >> >> > >>> > >> >> >> >> > > >we already have close() so having a > magical > > >> >> constant > > >> >> > >>>for > > >> >> > >>> > >> >>that is > > >> >> > >>> > >> >> >> >> > > >redundant. > > >> >> > >>> > >> >> >> >> > > > > > >> >> > >>> > >> >> >> >> > > >Calling close() from the I/O thread was > > >>already > > >> >> > >>>possible > > >> >> > >>> > >>and > > >> >> > >>> > >> >> >>would > > >> >> > >>> > >> >> >> >> block > > >> >> > >>> > >> >> >> >> > > >indefinitely. I think trying to silently > > >>change > > >> >> the > > >> >> > >>> > >>behavior > > >> >> > >>> > >> >>is > > >> >> > >>> > >> >> >> >> probably > > >> >> > >>> > >> >> >> >> > > >not right. I.e. if the user calls close() > in > > >> >>the > > >> >> > >>> callback > > >> >> > >>> > >> >>there > > >> >> > >>> > >> >> >>is > > >> >> > >>> > >> >> >> >> > > >actually > > >> >> > >>> > >> >> >> >> > > >some misunderstanding and they need to > think > > >> >>more, > > >> >> > >>> > >>silently > > >> >> > >>> > >> >> >>making > > >> >> > >>> > >> >> >> >> this > > >> >> > >>> > >> >> >> >> > > >not > > >> >> > >>> > >> >> >> >> > > >block will hide the problem from them > which > > >>is > > >> >>the > > >> >> > >>> > >>opposite > > >> >> > >>> > >> >>of > > >> >> > >>> > >> >> >> >>what we > > >> >> > >>> > >> >> >> >> > > >want. > > >> >> > >>> > >> >> >> >> > > > > > >> >> > >>> > >> >> >> >> > > >-Jay > > >> >> > >>> > >> >> >> >> > > > > > >> >> > >>> > >> >> >> >> > > >On Thu, Mar 12, 2015 at 1:49 AM, Jiangjie > > >>Qin > > >> >> > >>> > >> >> >> >> <j...@linkedin.com.invalid > > >> >> > >>> > >> >> >> >> > > > > >> >> > >>> > >> >> >> >> > > >wrote: > > >> >> > >>> > >> >> >> >> > > > > > >> >> > >>> > >> >> >> >> > > >> Hey Joe & Jay, > > >> >> > >>> > >> >> >> >> > > >> > > >> >> > >>> > >> >> >> >> > > >> Thanks for the comments on the voting > > >>thread. > > >> >> > >>>Since it > > >> >> > >>> > >> >>seems > > >> >> > >>> > >> >> >>we > > >> >> > >>> > >> >> >> >> > probably > > >> >> > >>> > >> >> >> >> > > >> will have more discussion on this, I am > > >>just > > >> >> > >>>replying > > >> >> > >>> > >>from > > >> >> > >>> > >> >>the > > >> >> > >>> > >> >> >> >> > > >>discussion > > >> >> > >>> > >> >> >> >> > > >> thread here. > > >> >> > >>> > >> >> >> >> > > >> I’ve updated the KIP page to make it > less > > >> >>like > > >> >> > >>> > >>half-baked, > > >> >> > >>> > >> >> >> >>apologize > > >> >> > >>> > >> >> >> >> > for > > >> >> > >>> > >> >> >> >> > > >> the rush... > > >> >> > >>> > >> >> >> >> > > >> > > >> >> > >>> > >> >> >> >> > > >> The contract in current KIP is: > > >> >> > >>> > >> >> >> >> > > >> 1. close() - wait until all requests > > >>either > > >> >> are > > >> >> > >>>sent > > >> >> > >>> > >>or > > >> >> > >>> > >> >> >>reach > > >> >> > >>> > >> >> >> >> > request > > >> >> > >>> > >> >> >> >> > > >> timeout. > > >> >> > >>> > >> >> >> >> > > >> 2. close(-1, TimeUnit.MILLISECONDS) - > > >>close > > >> >> > >>> > >>immediately > > >> >> > >>> > >> >> >> >> > > >> 3. close(0, TimeUnit.MILLISECONDS) - > > >> >> equivalent > > >> >> > >>>to > > >> >> > >>> > >> >>close(), > > >> >> > >>> > >> >> >> >>i.e. > > >> >> > >>> > >> >> >> >> > Wait > > >> >> > >>> > >> >> >> >> > > >> until all requests are sent or reach > > >>request > > >> >> > >>>timeout > > >> >> > >>> > >> >> >> >> > > >> 4. close(5, TimeUnit.MILLISECONDS) - > try > > >> >>the > > >> >> > >>>best to > > >> >> > >>> > >> >>finish > > >> >> > >>> > >> >> >> >> sending > > >> >> > >>> > >> >> >> >> > > >>in 5 > > >> >> > >>> > >> >> >> >> > > >> milliseconds, if something went wrong, > > >>just > > >> >> > >>>shutdown > > >> >> > >>> the > > >> >> > >>> > >> >> >>producer > > >> >> > >>> > >> >> >> >> > > >>anyway, > > >> >> > >>> > >> >> >> >> > > >> my callback will handle the failures. > > >> >> > >>> > >> >> >> >> > > >> > > >> >> > >>> > >> >> >> >> > > >> About how we define what timeout value > > >>stands > > >> >> for, > > >> >> > >>>I > > >> >> > >>> > >> >>actually > > >> >> > >>> > >> >> >> >> > struggled > > >> >> > >>> > >> >> >> >> > > >>a > > >> >> > >>> > >> >> >> >> > > >> little bit when wrote the patch. > > >>Intuitively, > > >> >> > >>>close(0) > > >> >> > >>> > >> >>should > > >> >> > >>> > >> >> >> >>mean > > >> >> > >>> > >> >> >> >> > > >> immediately, however it seems that all > the > > >> >> > >>>existing > > >> >> > >>> java > > >> >> > >>> > >> >>class > > >> >> > >>> > >> >> >> >>have > > >> >> > >>> > >> >> >> >> > this > > >> >> > >>> > >> >> >> >> > > >> convention of timeout=0 means no timeout > > >>or > > >> >> never > > >> >> > >>> > >>timeout > > >> >> > >>> > >> >> >> >> > > >>(Thread.join(0), > > >> >> > >>> > >> >> >> >> > > >> Object.wait(0), etc.) So here the > dilemma > > >>is > > >> >> > >>>either we > > >> >> > >>> > >> >>follow > > >> >> > >>> > >> >> >>the > > >> >> > >>> > >> >> >> >> > > >> intuition or we follow the convention. > > >>What I > > >> >> > >>>chose is > > >> >> > >>> > >>to > > >> >> > >>> > >> >> >>follow > > >> >> > >>> > >> >> >> >>the > > >> >> > >>> > >> >> >> >> > > >> convention but document the interface to > > >>let > > >> >> user > > >> >> > >>>be > > >> >> > >>> > >>aware > > >> >> > >>> > >> >>of > > >> >> > >>> > >> >> >>the > > >> >> > >>> > >> >> >> >> > usage. > > >> >> > >>> > >> >> >> >> > > >> The reason is that I think > > >>producer.close() > > >> >>is a > > >> >> > >>> public > > >> >> > >>> > >> >> >> >>interface so > > >> >> > >>> > >> >> >> >> > it > > >> >> > >>> > >> >> >> >> > > >> might be better to follow java > convention. > > >> >> Whereas > > >> >> > >>> > >> >>selector is > > >> >> > >>> > >> >> >> >>not a > > >> >> > >>> > >> >> >> >> > > >> public interface that used by user, so > as > > >> >>long > > >> >> as > > >> >> > >>>it > > >> >> > >>> > >>makes > > >> >> > >>> > >> >> >>sense > > >> >> > >>> > >> >> >> >>to > > >> >> > >>> > >> >> >> >> > us, > > >> >> > >>> > >> >> >> >> > > >>it > > >> >> > >>> > >> >> >> >> > > >> is less a problem to be different from > > >>java > > >> >> > >>> convention. > > >> >> > >>> > >> >>That > > >> >> > >>> > >> >> >>said > > >> >> > >>> > >> >> >> >> > since > > >> >> > >>> > >> >> >> >> > > >> consumer.poll(timeout) is also a public > > >> >> interface, > > >> >> > >>>I > > >> >> > >>> > >>think > > >> >> > >>> > >> >>it > > >> >> > >>> > >> >> >> >>also > > >> >> > >>> > >> >> >> >> > makes > > >> >> > >>> > >> >> >> >> > > >> sense to make producer.close() to have > the > > >> >>same > > >> >> > >>> > >>definition > > >> >> > >>> > >> >>of > > >> >> > >>> > >> >> >> >> > > >> consumer.poll(timeout). > > >> >> > >>> > >> >> >> >> > > >> > > >> >> > >>> > >> >> >> >> > > >> The main argument for keeping a timeout > in > > >> >>close > > >> >> > >>>would > > >> >> > >>> > >>be > > >> >> > >>> > >> >> >> >>separating > > >> >> > >>> > >> >> >> >> > the > > >> >> > >>> > >> >> >> >> > > >> close timeout from request timeout, > which > > >> >> probably > > >> >> > >>> makes > > >> >> > >>> > >> >> >>sense. I > > >> >> > >>> > >> >> >> >> > would > > >> >> > >>> > >> >> >> >> > > >> guess typically the request timeout > would > > >>be > > >> >> long > > >> >> > >>> (e.g. > > >> >> > >>> > >>60 > > >> >> > >>> > >> >> >> >>seconds) > > >> >> > >>> > >> >> >> >> > > >> because we might want to consider > retries > > >> >>with > > >> >> > >>>back > > >> >> > >>> off > > >> >> > >>> > >> >>time. > > >> >> > >>> > >> >> >>If > > >> >> > >>> > >> >> >> >>we > > >> >> > >>> > >> >> >> >> > have > > >> >> > >>> > >> >> >> >> > > >> multiple batches in accumulator, in > worst > > >> >>case > > >> >> > >>>that > > >> >> > >>> > >>could > > >> >> > >>> > >> >>take > > >> >> > >>> > >> >> >> >>up to > > >> >> > >>> > >> >> >> >> > > >> several minutes to complete all the > > >>requests. > > >> >> But > > >> >> > >>>when > > >> >> > >>> > >>we > > >> >> > >>> > >> >> >>close a > > >> >> > >>> > >> >> >> >> > > >> producer, we might not want to wait for > > >>that > > >> >> long > > >> >> > >>>as > > >> >> > >>> it > > >> >> > >>> > >> >>might > > >> >> > >>> > >> >> >> >>cause > > >> >> > >>> > >> >> >> >> > some > > >> >> > >>> > >> >> >> >> > > >> other problem like deployment tool > > >>timeout. > > >> >> > >>> > >> >> >> >> > > >> > > >> >> > >>> > >> >> >> >> > > >> There is also a subtle difference > between > > >> >> > >>> close(timeout) > > >> >> > >>> > >> >>and > > >> >> > >>> > >> >> >> >> > > >> flush(timeout). The only purpose for > > >>flush() > > >> >>is > > >> >> to > > >> >> > >>> write > > >> >> > >>> > >> >>data > > >> >> > >>> > >> >> >>to > > >> >> > >>> > >> >> >> >>the > > >> >> > >>> > >> >> >> >> > > >> broker, so it makes perfect sense to > wait > > >> >>until > > >> >> > >>> request > > >> >> > >>> > >> >> >>timeout. > > >> >> > >>> > >> >> >> >>I > > >> >> > >>> > >> >> >> >> > think > > >> >> > >>> > >> >> >> >> > > >> that is why flush(timeout) looks > strange. > > >>On > > >> >>the > > >> >> > >>>other > > >> >> > >>> > >> >>hand, > > >> >> > >>> > >> >> >>the > > >> >> > >>> > >> >> >> >>top > > >> >> > >>> > >> >> >> >> > > >> priority for close() is to close the > > >>producer > > >> >> > >>>rather > > >> >> > >>> > >>than > > >> >> > >>> > >> >> >>flush() > > >> >> > >>> > >> >> >> >> > data, > > >> >> > >>> > >> >> >> >> > > >>so > > >> >> > >>> > >> >> >> >> > > >> close(timeout) gives guarantee on > bounded > > >> >> waiting > > >> >> > >>>for > > >> >> > >>> > >>its > > >> >> > >>> > >> >>main > > >> >> > >>> > >> >> >> >>job. > > >> >> > >>> > >> >> >> >> > > >> > > >> >> > >>> > >> >> >> >> > > >> Sorry for the confusion about forceClose > > >> >>flag. > > >> >> It > > >> >> > >>>is > > >> >> > >>> > >>not a > > >> >> > >>> > >> >> >>public > > >> >> > >>> > >> >> >> >> > > >> interface. I mentioned it in Proposed > > >>Changes > > >> >> > >>>section > > >> >> > >>> > >> >>which I > > >> >> > >>> > >> >> >> >> thought > > >> >> > >>> > >> >> >> >> > > >>was > > >> >> > >>> > >> >> >> >> > > >> supposed to provide implementation > > >>details. > > >> >> > >>> > >> >> >> >> > > >> > > >> >> > >>> > >> >> >> >> > > >> Thanks again for all the comments and > > >> >> suggestions! > > >> >> > >>> > >> >> >> >> > > >> > > >> >> > >>> > >> >> >> >> > > >> Jiangjie (Becket) Qin > > >> >> > >>> > >> >> >> >> > > >> > > >> >> > >>> > >> >> >> >> > > >> On 3/10/15, 8:57 PM, "Jiangjie Qin" < > > >> >> > >>> j...@linkedin.com <javascript:;>> > > >> >> > >>> > >> >> wrote: > > >> >> > >>> > >> >> >> >> > > >> > > >> >> > >>> > >> >> >> >> > > >> >The KIP page has been updated per Jay¹s > > >> >> comments. > > >> >> > >>> > >> >> >> >> > > >> >I¹d like to initiate the voting process > > >>if > > >> >>no > > >> >> > >>>further > > >> >> > >>> > >> >> >>comments > > >> >> > >>> > >> >> >> >>are > > >> >> > >>> > >> >> >> >> > > >> >received by tomorrow. > > >> >> > >>> > >> >> >> >> > > >> > > > >> >> > >>> > >> >> >> >> > > >> >Jiangjie (Becket) Qin > > >> >> > >>> > >> >> >> >> > > >> > > > >> >> > >>> > >> >> >> >> > > >> >On 3/8/15, 9:45 AM, "Jay Kreps" > > >> >> > >>><jay.kr...@gmail.com <javascript:;> > > >> >> > >>> > > > >> >> > >>> > >> >>wrote: > > >> >> > >>> > >> >> >> >> > > >> > > > >> >> > >>> > >> >> >> >> > > >> >>Hey Jiangjie, > > >> >> > >>> > >> >> >> >> > > >> >> > > >> >> > >>> > >> >> >> >> > > >> >>Can you capture the full motivation > and > > >>use > > >> >> > >>>cases > > >> >> > >>> for > > >> >> > >>> > >>the > > >> >> > >>> > >> >> >> >>feature? > > >> >> > >>> > >> >> >> >> > > >>This > > >> >> > >>> > >> >> >> >> > > >> >>mentions your interest in having a way > > >>of > > >> >> > >>>aborting > > >> >> > >>> > >>from > > >> >> > >>> > >> >> >>inside > > >> >> > >>> > >> >> >> >>the > > >> >> > >>> > >> >> >> >> > > >> >>Callback. But it doesn't really > explain > > >> >>that > > >> >> > >>>usage > > >> >> > >>> or > > >> >> > >>> > >>why > > >> >> > >>> > >> >> >>other > > >> >> > >>> > >> >> >> >> > people > > >> >> > >>> > >> >> >> >> > > >> >>would want to do that. It also doesn't > > >>list > > >> >> the > > >> >> > >>> > >>primary > > >> >> > >>> > >> >>use > > >> >> > >>> > >> >> >> >>case > > >> >> > >>> > >> >> >> >> for > > >> >> > >>> > >> >> >> >> > > >> >>having > > >> >> > >>> > >> >> >> >> > > >> >>close with a bounded timeout which was > > >>to > > >> >> avoid > > >> >> > >>> > >>blocking > > >> >> > >>> > >> >>too > > >> >> > >>> > >> >> >> >>long > > >> >> > >>> > >> >> >> >> on > > >> >> > >>> > >> >> >> >> > > >> >>shutdown. > > >> >> > >>> > >> >> >> >> > > >> >> > > >> >> > >>> > >> >> >> >> > > >> >>-Jay > > >> >> > >>> > >> >> >> >> > > >> >> > > >> >> > >>> > >> >> >> >> > > >> >> > > >> >> > >>> > >> >> >> >> > > >> >> > > >> >> > >>> > >> >> >> >> > > >> >>On Sat, Mar 7, 2015 at 12:25 PM, > > >>Jiangjie > > >> >>Qin > > >> >> > >>> > >> >> >> >> > > >><j...@linkedin.com.invalid > > >> >> > >>> > >> >> >> >> > > >> > > > >> >> > >>> > >> >> >> >> > > >> >>wrote: > > >> >> > >>> > >> >> >> >> > > >> >> > > >> >> > >>> > >> >> >> >> > > >> >>> Hi, > > >> >> > >>> > >> >> >> >> > > >> >>> > > >> >> > >>> > >> >> >> >> > > >> >>> I just created a KIP for adding a > > >> >> > >>>close(timeout) > > >> >> > >>> to > > >> >> > >>> > >>new > > >> >> > >>> > >> >> >> >> producer. > > >> >> > >>> > >> >> >> >> > > >>Most > > >> >> > >>> > >> >> >> >> > > >> >>>of > > >> >> > >>> > >> >> >> >> > > >> >>> the previous discussions are in > > >> >>KAFKA-1660 > > >> >> > >>>where > > >> >> > >>> > >>Parth > > >> >> > >>> > >> >> >> >> Brahmbhatt > > >> >> > >>> > >> >> >> >> > > >>has > > >> >> > >>> > >> >> >> >> > > >> >>> already done a lot of work. > > >> >> > >>> > >> >> >> >> > > >> >>> Since this is an interface change so > > >>we > > >> >>are > > >> >> > >>>going > > >> >> > >>> > >> >>through > > >> >> > >>> > >> >> >>the > > >> >> > >>> > >> >> >> >> KIP > > >> >> > >>> > >> >> >> >> > > >> >>>process. > > >> >> > >>> > >> >> >> >> > > >> >>> Here is the KIP link: > > >> >> > >>> > >> >> >> >> > > >> >>> > > >> >> > >>> > >> >> >> >> > > >> >>> > > >> >> > >>> > >> >> >> >> > > >> > > >> >> > >>> > >> >> >> >> > > > > >> >> > >>> > >> >> >> >> > > >> >> > >>> > >> >> >> > > >> >> > >>> > >> >> > > >> >> > >>> > >> > > >> >> > >>> > > > >> >> > >>> > > >> >> > > > >> > > >>>> > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=537397 > > >>>>8 > > >> >> > >>> > >> >> >> >> > > >> >>>2 > > >> >> > >>> > >> >> >> >> > > >> >>> > > >> >> > >>> > >> >> >> >> > > >> >>> Thanks. > > >> >> > >>> > >> >> >> >> > > >> >>> > > >> >> > >>> > >> >> >> >> > > >> >>> Jiangjie (Becket) Qin > > >> >> > >>> > >> >> >> >> > > >> >>> > > >> >> > >>> > >> >> >> >> > > >> > > > >> >> > >>> > >> >> >> >> > > >> > > >> >> > >>> > >> >> >> >> > > >> > > >> >> > >>> > >> >> >> >> > > > > >> >> > >>> > >> >> >> >> > > > > >> >> > >>> > >> >> >> >> > > > >> >> > >>> > >> >> >> >> > > >> >> > >>> > >> >> >> >> > > >> >> > >>> > >> >> >> >> > > >> >> > >>> > >> >> >> >> -- > > >> >> > >>> > >> >> >> >> Thanks, > > >> >> > >>> > >> >> >> >> Neha > > >> >> > >>> > >> >> >> >> > > >> >> > >>> > >> >> >> > > > >> >> > >>> > >> >> >> > > > >> >> > >>> > >> >> >> > > > >> >> > >>> > >> >> >> >-- > > >> >> > >>> > >> >> >> >-- Guozhang > > >> >> > >>> > >> >> >> > > >> >> > >>> > >> >> >> > > >> >> > >>> > >> >> > > > >> >> > >>> > >> >> > > > >> >> > >>> > >> >> >-- > > >> >> > >>> > >> >> >-- Guozhang > > >> >> > >>> > >> >> > > >> >> > >>> > >> >> > > >> >> > >>> > >> > > > >> >> > >>> > >> > > > >> >> > >>> > >> >-- > > >> >> > >>> > >> >-- Guozhang > > >> >> > >>> > >> > > >> >> > >>> > >> > > >> >> > >>> > > > >> >> > >>> > > > >> >> > >>> > > >> >> > >> > > >> >> > >> > > >> >> > >> > > >> >> > >>-- > > >> >> > >>-- Guozhang > > >> >> > > > > >> >> > > > >> >> > > > >> >> > > >> > > > >> > > > >> >-- > > >> >Sent from Gmail Mobile > > >> > > >> > > > > > -- -- Guozhang