Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-03-07 Thread Dong Lin
ler will still see the notification. > 2. In the notification znode we have Event field as an integer. Can we > document what is the value of LogDirFailure? And also are there any other > possible values? > > Thanks, > > Jiangjie (Becket) Qin > > On Tue, Mar 7, 2017 at 11:30 AM, Dong Lin

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-03-07 Thread Dong Lin
licas on the good log directories". > > 5. In the protocol definition, we have isNewReplica, but it should probably > be is_new_replica. > Good point. My bad. It is fixed now. > > Thanks, > Ismael > > > On Thu, Jan 12, 2017 at 6:46 PM, Dong Lin <lindon...@gmail.com>

Re: [VOTE] KIP-107: Add purgeDataBefore() API in AdminClient

2017-03-07 Thread Dong Lin
> Thanks, > > Jun > > On Fri, Mar 3, 2017 at 11:25 AM, Dong Lin <lindon...@gmail.com> wrote: > > > Hey Ismael, > > > > Thanks for the detailed explanation. Here is my thought: > > > > 1. purge vs. delete > > > > We have originally consi

Re: [DISCUSS] KIP-126 - Allow KafkaProducer to batch based on uncompressed size

2017-03-06 Thread Dong Lin
> Yes, there is a sensor in the patch about the split occurrence. > >> > >> Currently it is a count instead of rate. In practice, it seems count is > >> easier to use in this case. But I am open to change. > >> > >> Thanks, > >> > >

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-03-05 Thread Dong Lin
ake a compelling case for JBOD, > perhaps we should discuss KIP-113 before voting for both? I left some > comments in the other thread. > > Thanks, > > Jun > > On Wed, Mar 1, 2017 at 1:58 PM, Dong Lin <lindon...@gmail.com> wrote: > > > Hey Jun, > > >

[jira] [Updated] (KAFKA-4841) NetworkClient should only consider a connection to be fail after attempt to connect

2017-03-05 Thread Dong Lin (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-4841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dong Lin updated KAFKA-4841: Description: KAFKA-4820 allows new request to be enqueued to unsent by user thread while some other thread

[jira] [Updated] (KAFKA-4841) NetworkClient should only consider a connection to be fail after attempt to connect

2017-03-05 Thread Dong Lin (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-4841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dong Lin updated KAFKA-4841: Description: KAFKA-4820 allows new request to be enqueued to unsent by user thread while some other thread

[jira] [Updated] (KAFKA-4841) NetworkClient should only consider a connection to be fail after attempt to connect

2017-03-05 Thread Dong Lin (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-4841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dong Lin updated KAFKA-4841: Summary: NetworkClient should only consider a connection to be fail after attempt to connect

[jira] [Created] (KAFKA-4841) NetworkClient should only consider a connection to be fail after attempt to connct

2017-03-05 Thread Dong Lin (JIRA)
Dong Lin created KAFKA-4841: --- Summary: NetworkClient should only consider a connection to be fail after attempt to connct Key: KAFKA-4841 URL: https://issues.apache.org/jira/browse/KAFKA-4841 Project

Re: [DISCUSS] KIP-126 - Allow KafkaProducer to batch based on uncompressed size

2017-03-03 Thread Dong Lin
Hey Becket, I haven't looked at the patch yet. But since we are going to try the split-on-oversize solution, should the KIP also add a sensor that shows the rate of split per second and the probability of split? Thanks, Dong On Fri, Mar 3, 2017 at 6:39 PM, Becket Qin

Re: [VOTE] KIP-107: Add purgeDataBefore() API in AdminClient

2017-03-03 Thread Dong Lin
istency. Once we do that, we should also > consider > > if `Before` should be in the method name or should be in the parameter > > class. Just an example to describe what I mean, one could say > > `deleteRecords(DeleteRecordsParams.before(offsetsForPartition)`. That >

Re: [VOTE] KIP-119: Drop Support for Scala 2.10 in Kafka 0.11

2017-03-03 Thread Dong Lin
+1 (non-binding) On Thu, Mar 2, 2017 at 11:18 AM, Becket Qin wrote: > Thanks for the clarification, Ismael. In that case, it is reasonable to > drop support for Scala 2.10. LinkedIn is probably fine with this change. > > I did not notice we have recommended Scala version

Re: [VOTE] KIP-107: Add purgeDataBefore() API in AdminClient

2017-03-03 Thread Dong Lin
be in the method name or should be in the parameter > class. Just an example to describe what I mean, one could say > `deleteRecords(DeleteRecordsParams.before(offsetsForPartition)`. That way, > we could provide a different way of deleting by simply updating the > parameters class. > >

Re: [DISCUSS] KIP-113: Support replicas movement between log directories

2017-03-02 Thread Dong Lin
o be consistent on ChangeReplicaDirRequest vs > ChangeReplicaRequest. > I think ChangeReplicaRequest and ChangeReplicaResponse is my typo. Sorry, they are fixed now. > > Thanks, > > Jun > > > On Fri, Feb 3, 2017 at 6:19 PM, Dong Lin <lindon...@gmail.com> wrote: >

Re: [VOTE] KIP-107: Add purgeDataBefore() API in AdminClient

2017-03-02 Thread Dong Lin
<j...@confluent.io> wrote: > Hi, Dong, > > It seems that delete means removing everything while purge means removing a > portion. So, it seems that it's better to be able to distinguish the two? > > Thanks, > > Jun > > On Wed, Mar 1, 2017 at 1:57 PM, Dong Lin <l

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-03-01 Thread Dong Lin
Hey Jun, Do you think it is OK to keep the existing wire protocol in the KIP? I am wondering if we can initiate vote for this KIP. Thanks, Dong On Tue, Feb 28, 2017 at 2:41 PM, Dong Lin <lindon...@gmail.com> wrote: > Hey Jun, > > I just realized that StopReplicaRequest itself

Re: [VOTE] KIP-107: Add purgeDataBefore() API in AdminClient

2017-03-01 Thread Dong Lin
an option. Personally I don't have strong preference between "purge" and "delete". I am wondering if anyone objects to this change. Thanks, Dong On Wed, Mar 1, 2017 at 9:46 AM, Dong Lin <lindon...@gmail.com> wrote: > Hi Ismael, > > I actually mean log_start_offset.

Re: [VOTE] KIP-107: Add purgeDataBefore() API in AdminClient

2017-03-01 Thread Dong Lin
offset "? I could > only find the latter in the KIP. If so, would log_start_offset be a better > name? > > Ismael > > On Tue, Feb 28, 2017 at 4:26 AM, Dong Lin <lindon...@gmail.com> wrote: > > > Hi Jun and everyone, > > > > I would like to chang

[jira] [Updated] (KAFKA-4820) ConsumerNetworkClient.send() should not require global lock

2017-02-28 Thread Dong Lin (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-4820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dong Lin updated KAFKA-4820: Description: Currently `ConsumerNetworkClient.send()` needs to acquire global lock

[jira] [Created] (KAFKA-4820) ConsumerNetworkClient.send() should not require global lock

2017-02-28 Thread Dong Lin (JIRA)
Dong Lin created KAFKA-4820: --- Summary: ConsumerNetworkClient.send() should not require global lock Key: KAFKA-4820 URL: https://issues.apache.org/jira/browse/KAFKA-4820 Project: Kafka Issue Type

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-28 Thread Dong Lin
the isNewReplica for the broker that receives LeaderAndIsrRequest. Thanks, Dong On Tue, Feb 28, 2017 at 2:14 PM, Dong Lin <lindon...@gmail.com> wrote: > Hi Jun, > > Yeah there is tradeoff between controller's implementation complexity vs. > wire-protocol complexity. I personall

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-28 Thread Dong Lin
> Thanks for the feedback. That's very useful. > > Jun > > On Tue, Feb 28, 2017 at 10:25 AM, Dong Lin <lindon...@gmail.com> wrote: > > > Hey Jun, > > > > Certainly, I have added Todd to reply to the thread. And I have updated > the > > item to in the

Re: [VOTE] KIP-107: Add purgeDataBefore() API in AdminClient

2017-02-28 Thread Dong Lin
Thanks Jun. I have updated the KIP to reflect this change. On Tue, Feb 28, 2017 at 9:44 AM, Jun Rao <j...@confluent.io> wrote: > Hi, Dong, > > Yes, this change makes sense to me. > > Thanks, > > Jun > > On Mon, Feb 27, 2017 at 8:26 PM, Dong Lin <lind

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-28 Thread Dong Lin
nt to start a > separate discussion thread on KIP-113? I do have some comments there. > > Thanks for working on this! > > Jun > > > On Mon, Feb 27, 2017 at 5:51 PM, Dong Lin <lindon...@gmail.com> wrote: > > > Hi Jun, > > > > In addition to the E

Re: [VOTE] KIP-107: Add purgeDataBefore() API in AdminClient

2017-02-27 Thread Dong Lin
can allow purge operation to succeed when some replica is offline. Are you OK with this change? If so, I will go ahead to update the KIP and implement this behavior. Thanks, Dong On Tue, Jan 17, 2017 at 10:18 AM, Dong Lin <lindon...@gmail.com> wrote: > Hey Jun, > > Do you have

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-27 Thread Dong Lin
will shrink ISR, expand it and shrink it > again after the timeout. > > The KIP seems to still reference " > /broker/topics/[topic]/partitions/[partitionId]/controller_managed_state". > > Thanks, > > Jun > > On Sat, Feb 25, 2017 at 7:49 PM, Dong Lin <lindon

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-25 Thread Dong Lin
you can see, we will be adding some complexity to support > JBOD in Kafka one way or another. If we can tune the performance of RAID5 > to match that of RAID10, perhaps using RAID5 is a simpler solution. > > Thanks, > > Jun > > > On Fri, Feb 24, 2017 at 10:17 AM, Dong Lin <lind

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-24 Thread Dong Lin
Hey Jun, I don't think we should allow failed replicas to be re-created on the good disks. Say there are 2 disks and each of them is 51% loaded. If either disk fails and we allow its replicas to be re-created on the other disk, both disks will fail. Alternatively we can disable replica creation if
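
The 51% example above can be checked with a little arithmetic. What follows is only a sketch: it assumes equal-capacity disks and an even spread of the failed disk's data over the survivors, and the function name is made up for illustration (it is not part of the KIP or of Kafka).

```python
def survivors_overflow(load_pct, failed):
    """Return True if moving the failed disk's data evenly onto the
    remaining disks would push any of them past 100% of capacity.

    load_pct: per-disk utilisation in percent (equal-capacity disks assumed).
    failed:   index of the disk that failed.
    """
    survivors = [p for i, p in enumerate(load_pct) if i != failed]
    if not survivors:
        return True  # no disk left to take the data
    extra = load_pct[failed] / len(survivors)
    return any(p + extra > 100 for p in survivors)


# Two disks at 51% each: the survivor would need 51 + 51 = 102% of its capacity.
two_disks = survivors_overflow([51, 51], failed=0)

# Three disks at 51% each: each survivor would sit at 51 + 25.5 = 76.5%.
three_disks = survivors_overflow([51, 51, 51], failed=0)
```

With two disks the re-creation cannot fit, which is the failure case described in the message; with three disks at the same load it would fit.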

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-23 Thread Dong Lin
But controller still needs to learn about offline replicas from LeaderAndIsrResponse. I think this is better than the current design. Do you have any concern with this design? Thanks, Dong On Thu, Feb 23, 2017 at 7:12 PM, Dong Lin <lindon...@gmail.com> wrote: > Hey Jun, > > Sure, h

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-23 Thread Dong Lin
. Thanks, Dong On Thu, Feb 23, 2017 at 6:46 PM, Jun Rao <j...@confluent.io> wrote: > Hi, Dong, > > My replies are inlined below. > > On Thu, Feb 23, 2017 at 4:47 PM, Dong Lin <lindon...@gmail.com> wrote: > > > Hey Jun, > > > > Thanks for your reply! Le

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-23 Thread Dong Lin
nt. Well, if all log directories are available, the failed log > directory path will be cleared. In the rarer case that a log directory is > still offline and one of the replicas registered in the failed log > directory shows up in another available log directory, I am not quite sure. >

Re: [DISCUSS] KIP-125: ZookeeperConsumerConnector to KafkaConsumer Migration and Rollback

2017-02-23 Thread Dong Lin
ZKCC because MEZKCC has > > "dual.commit.enabled" set to true as well as "offsets.storage" set to > > kafka. The combination of these configs results in the consumer fetching > > offsets from both kafka and zookeeper and just picking the greater of the > > two. >
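
The "pick the greater of the two" behaviour quoted above can be sketched as follows. This is an illustration only, not the consumer's actual code: the function name is hypothetical, and treating a missing offset as None is an assumption made here.

```python
def pick_start_offset(kafka_offset, zk_offset):
    """Return the greater of the two committed offsets.

    Either argument may be None when nothing has been committed to that
    store (an assumption of this sketch); None is returned if both are.
    """
    candidates = [o for o in (kafka_offset, zk_offset) if o is not None]
    return max(candidates) if candidates else None


# Offset committed to Kafka is ahead of the one in ZooKeeper.
resumed_at = pick_start_offset(kafka_offset=120, zk_offset=95)
```

Taking the maximum means a consumer never resumes behind the most recent commit, whichever store received it.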

Re: [DISCUSS] KIP-124: Request rate quotas

2017-02-23 Thread Dong Lin
be an important change depending on the answer to 1) above. We probably need to document this more explicitly. Dong On Thu, Feb 23, 2017 at 10:56 AM, Dong Lin <lindon...@gmail.com> wrote: > Hey Jun, > > Yeah you are right. I thought it wasn't because at LinkedIn it will be too

Re: [VOTE] KIP-111 Kafka should preserve the Principal generated by the PrincipalBuilder while processing the request received on socket channel, on the broker.

2017-02-23 Thread Dong Lin
+1 (non-binding) On Wed, Feb 22, 2017 at 10:52 PM, Manikumar wrote: > +1 (non-binding) > > On Thu, Feb 23, 2017 at 3:27 AM, Mayuresh Gharat < > gharatmayures...@gmail.com > > wrote: > > > Hi Jun, > > > > Thanks a lot for the comments and reviews. > > I agree we should

Re: [DISCUSS] KIP-124: Request rate quotas

2017-02-23 Thread Dong Lin
> So there will be no reuse of existing metrics/sensors. The new ones > for > > > > request processing time based throttling will be completely > independent > > > of > > > > existing metrics/sensors, but will be consistent in format. > > > > > >

Re: [DISCUSS] KIP-124: Request rate quotas

2017-02-22 Thread Dong Lin
Hey Rajini, I think it makes a lot of sense to use io_thread_units as metric to quota user's traffic here. LGTM overall. I have some questions regarding sensors. - Can you be more specific in the KIP what sensors will be added? For example, it will be useful to specify the name and attributes of

Re: [DISCUSS] KIP-126 - Allow KafkaProducer to batch based on uncompressed size

2017-02-22 Thread Dong Lin
not sure that will be the case. This is at least a concern where MM is mirroring traffic for only a few partitions of high byte-in rate. Thus I am wondering if we should do the optimization proposed above. Thanks, Dong On Wed, Feb 22, 2017 at 6:39 PM, Dong Lin <lindon...@gmail.com> wrote: > H

Re: [DISCUSS] KIP-126 - Allow KafkaProducer to batch based on uncompressed size

2017-02-22 Thread Dong Lin
Hey Becket, Thanks for the KIP. I have one question here. Suppose producer's batch.size=100 KB, max.in.flight.requests.per.connection=1. Since each ProduceRequest contains one batch per partition, it means that 100 KB compressed data will be produced per partition per round-trip time as of
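
The bound described above (one 100 KB batch per partition per round trip when max.in.flight.requests.per.connection=1) can be put into numbers. A rough sketch only: the 50 ms round-trip time is a made-up value, and the helper name is hypothetical.

```python
def max_partition_throughput_kb_per_s(batch_size_kb, rtt_ms):
    """Upper bound on compressed KB/s for one partition when only one
    request is in flight and each request carries one batch per partition."""
    return batch_size_kb * 1000 / rtt_ms


# batch.size = 100 KB, assumed round-trip time of 50 ms.
bound = max_partition_throughput_kb_per_s(batch_size_kb=100, rtt_ms=50)
```

Under these assumptions the partition is capped at 2000 KB/s of compressed data, regardless of how fast the producer can generate records.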

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-22 Thread Dong Lin
caRequest > to any offline replica. > > Thanks, > > Jun > > > On Tue, Feb 21, 2017 at 2:37 PM, Dong Lin <lindon...@gmail.com> wrote: > > > Hey Jun, > > > > Thanks much for your comments. > > > > I actually proposed the design to store

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-21 Thread Dong Lin
Hey Jun, Motivated by your suggestion, I think we can also store the information of created replicas in per-broker znode at /brokers/created_replicas/ids/[id]. Does this sound good? Regards, Dong On Tue, Feb 21, 2017 at 2:37 PM, Dong Lin <lindon...@gmail.com> wrote: > Hey Jun, > >

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-21 Thread Dong Lin
the broker detects this during broker startup, it can probably > just log an error and exit. The admin can remove the redundant partitions > manually and then restart the broker. > > Thanks, > > Jun > > On Sat, Feb 18, 2017 at 9:31 PM, Dong Lin <lindon...@gmail.com> wrote: >

Re: [DISCUSS] KIP-125: ZookeeperConsumerConnector to KafkaConsumer Migration and Rollback

2017-02-20 Thread Dong Lin
Hey Onur, Thanks for the well-written KIP! I have two questions below. 1) In the process of migrating from OZKCCs and MDZKCCs to MEZKCCs, we may have a mix of OZKCCs, MDZKCCs and MEZKCCs. OZKCC and MDZKCC will only commit to zookeeper and MDZKCC will use kafka-based offset storage. Would we lose

Re: [DISCUSS] KIP-124: Request rate quotas

2017-02-20 Thread Dong Lin
> As of the request rate quota, while it seems easy to enforce and > intuitive, > > there are some caveats. > > 1. Users do not have direct control over the request rate, i.e. users do > > not know when a request will be sent by the clients. > > 2. Each request may requir

Re: [DISCUSS] KIP-124: Request rate quotas

2017-02-18 Thread Dong Lin
I couldn't find information in the KIP on > where this window would be configured. > > On Fri, Feb 17, 2017 at 9:45 AM, Dong Lin <lindon...@gmail.com> wrote: > > > To correct the typo above: It seems to me that determination of request > > rate is not any more diffic

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-18 Thread Dong Lin
Hey Jun, Could you please let me know if the solutions above could address your concern? I really want to move the discussion forward. Thanks, Dong On Tue, Feb 14, 2017 at 8:17 PM, Dong Lin <lindon...@gmail.com> wrote: > Hey Jun, > > Thanks for all your help and time to discuss

Re: [DISCUSS] KIP-124: Request rate quotas

2017-02-17 Thread Dong Lin
To correct the typo above: It seems to me that determination of request rate is not any more difficult than determination of *byte* rate as both metrics are commonly used to measure performance and provide guarantee to user. On Fri, Feb 17, 2017 at 9:40 AM, Dong Lin <lindon...@gmail.com>

Re: [DISCUSS] KIP-124: Request rate quotas

2017-02-17 Thread Dong Lin
Hey Rajini, Thanks for the KIP. I have some questions: - I am wondering why throttling based on request rate is listed as a rejected alternative. Can you provide more specific reason why it is difficult for administrators to decide request rates to allocate? It seems to me that determination of

Re: [DISCUSS] KIP-117: Add a public AdministrativeClient API for Kafka admin operations

2017-02-16 Thread Dong Lin
Hey Colin, Thanks for the update. I have two comments: - I actually think it is simpler and good enough to have per-topic API instead of batch-of-topic API. This is different from the argument for batch-of-partition API because, unlike operation on topic, people usually operate on multiple

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-14 Thread Dong Lin
ad time should be less than 10% of the existing total zk read time during controller failover. Thanks! Dong On Tue, Feb 14, 2017 at 7:30 AM, Dong Lin <lindon...@gmail.com> wrote: > Hey Jun, > > I just realized that you may be suggesting that a tool for listing offline > directorie

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-14 Thread Dong Lin
this script in KIP-113. Regardless, my hope is to finish both KIPs ASAP and make them in the same release since both KIPs are needed for the JBOD setup. Thanks, Dong On Mon, Feb 13, 2017 at 5:52 PM, Dong Lin <lindon...@gmail.com> wrote: > And the test plan has also been updated to simu

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-13 Thread Dong Lin
And the test plan has also been updated to simulate disk failure by changing log directory permission to 000. On Mon, Feb 13, 2017 at 5:50 PM, Dong Lin <lindon...@gmail.com> wrote: > Hi Jun, > > Thanks for the reply. These comments are very helpful. Let me answer them > inline.
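
The permission trick mentioned above can be sketched as a test-harness step. This is an assumption-laden illustration, not the KIP's test code: it uses a temporary directory in place of a real broker log directory and assumes a POSIX filesystem. Note that a process running as root ignores permission bits, so a broker under test would need to run as an unprivileged user.

```python
import os
import stat
import tempfile

# Stand-in for one of the broker's log directories (hypothetical path).
log_dir = tempfile.mkdtemp(prefix="kafka-logs-")
original_mode = stat.S_IMODE(os.stat(log_dir).st_mode)

# Simulate the disk failure: remove all permissions from the directory.
os.chmod(log_dir, 0o000)
mode_during_failure = stat.S_IMODE(os.stat(log_dir).st_mode)

# ... the test would observe the broker's behaviour here ...

# Restore the permissions and clean up.
os.chmod(log_dir, original_mode)
os.rmdir(log_dir)
```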

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-13 Thread Dong Lin
Hi Jun, Thanks for the reply. These comments are very helpful. Let me answer them inline. On Mon, Feb 13, 2017 at 3:25 PM, Jun Rao <j...@confluent.io> wrote: > Hi, Dong, > > Thanks for the reply. A few more replies and new comments below. > > On Fri, Feb 10, 2017 at 4:27

[jira] [Created] (KAFKA-4763) Handle disk failure for JBOD (KIP-112)

2017-02-13 Thread Dong Lin (JIRA)
Dong Lin created KAFKA-4763: --- Summary: Handle disk failure for JBOD (KIP-112) Key: KAFKA-4763 URL: https://issues.apache.org/jira/browse/KAFKA-4763 Project: Kafka Issue Type: Improvement

Re: [VOTE] KIP-48 Support for delegation tokens as an authentication mechanism

2017-02-13 Thread Dong Lin
+1 (non-binding) On Mon, Feb 13, 2017 at 10:21 AM, Harsha Chintalapani wrote: > +1. > -Harsha > > On Fri, Feb 10, 2017 at 11:12 PM Manikumar > wrote: > > > Yes, owners and the renewers can always describe their own tokens. > Updated > > the KIP. > > >

Re: [DISCUSS] KIP-117: Add a public AdministrativeClient API for Kafka admin operations

2017-02-11 Thread Dong Lin
. offsetsForTimes) which are typically used for operation on multiple partitions at a time. On Fri, Feb 10, 2017 at 5:05 PM, Dong Lin <lindon...@gmail.com> wrote: > Hi Jun, > > Currently KIP-107 uses this API: > > Future<Map<TopicPartition, PurgeDataResult>> > purgeD

Re: [DISCUSS] KIP-117: Add a public AdministrativeClient API for Kafka admin operations

2017-02-10 Thread Dong Lin
, we do batching in purgeDataBefore(). In Colin's > current proposal, there is no batching. > > Thanks, > > Jun > > On Thu, Feb 9, 2017 at 10:54 AM, Dong Lin <lindon...@gmail.com> wrote: > > > Thanks for the explanation. This makes sense. > > > >

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-10 Thread Dong Lin
may be easier to develop in the long term if we separate these two requests. I agree that ideally we want to create replicas in the right log directory in the first place. But I am not sure if there is any performance or correctness concern with the existing way of moving it after it is created. Besi

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-10 Thread Dong Lin
for the comments, Dong On Thu, Feb 9, 2017 at 4:45 PM, Dong Lin <lindon...@gmail.com> wrote: > > > On Thu, Feb 9, 2017 at 3:37 PM, Colin McCabe <cmcc...@apache.org> wrote: > >> On Thu, Feb 9, 2017, at 11:40, Dong Lin wrote: >> > Thanks for all the comments Co

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-09 Thread Dong Lin
On Thu, Feb 9, 2017 at 3:37 PM, Colin McCabe <cmcc...@apache.org> wrote: > On Thu, Feb 9, 2017, at 11:40, Dong Lin wrote: > > Thanks for all the comments Colin! > > > > To answer your questions: > > - Yes, a broker will shutdown if all its log directories are ba

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-09 Thread Dong Lin
iles. Clearly, log dirs that are completely > inaccessible will still be considered bad after a broker process bounce. > > best, > Colin > > > > > +1 (non-binding) aside from that > > > > > > > > On Wed, Feb 8, 2017, at 00:47, Dong Lin wrote: >

Re: [DISCUSS] KIP-117: Add a public AdministrativeClient API for Kafka admin operations

2017-02-09 Thread Dong Lin
Thanks for the explanation. This makes sense. Best, Dong On Thu, Feb 9, 2017 at 10:51 AM, Colin McCabe <cmcc...@apache.org> wrote: > On Wed, Feb 8, 2017, at 19:02, Dong Lin wrote: > > I am not aware of any semantics that will be caused by sharing > > NetworkClient betw

Re: [DISCUSS] KIP-117: Add a public AdministrativeClient API for Kafka admin operations

2017-02-08 Thread Dong Lin
another client. Also, the > NetworkClient is an internal class which is not really meant for users. Do > we really want to open that up? Is the only benefit saving the number of > connections? Seems not worth it in my opinion. > > -Jason > > On Wed, Feb 8, 2017 at 6:43 PM,

Re: [DISCUSS] KIP-117: Add a public AdministrativeClient API for Kafka admin operations

2017-02-08 Thread Dong Lin
BTW, the idea to share NetworkClient is suggested by Radai and I like this idea. On Wed, Feb 8, 2017 at 6:39 PM, Dong Lin <lindon...@gmail.com> wrote: > Hey Colin, > > Thanks for updating the KIP. I have two followup questions: > > - It seems that setCreationConfig(...) is

Re: [DISCUSS] KIP-117: Add a public AdministrativeClient API for Kafka admin operations

2017-02-08 Thread Dong Lin
Hey Colin, Thanks for updating the KIP. I have two followup questions: - It seems that setCreationConfig(...) is a bit redundant given that most arguments (e.g. topic name, partition num) are already passed to TopicsContext.create(...) when user creates topic. Should we pass the creationConfig

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-08 Thread Dong Lin
On Tue, Feb 7, 2017 at 5:23 PM, Dong Lin <lindon...@gmail.com> wrote: > Hey Eno, > > Thanks much for the comment! > > I still think the complexity added to Kafka is justified by its benefit. > Let me provide my reasons below. > > 1) The additional logic is easy t

Re: KIP-122: Add a tool to Reset Consumer Group Offsets

2017-02-07 Thread Dong Lin
Hey Jorge, Thanks for the KIP. I have some quick comments: - Should we allow user to use wildcard to reset offset of all groups for a given topic as well? - Should we allow user to specify timestamp per topic partition in the json file as well? - Should the script take some credential file to

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-07 Thread Dong Lin
her the complexity added to Kafka is justified. > Operationally it seems to me an admin will still have to do all the three > items above. > > Looking forward to the discussion > Thanks > Eno > > > > On 1 Feb 2017, at 17:21, Dong Lin <lindon...@gmail.com> wrote: >

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-07 Thread Dong Lin
inlined > below. > > On Mon, Feb 6, 2017 at 7:22 PM, Dong Lin <lindon...@gmail.com> wrote: > > > Hey Jun, > > > > Thanks for the review! Please see reply inline. > > > > On Mon, Feb 6, 2017 at 6:21 PM, Jun Rao <j...@confluent.io> wrote: > >

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-07 Thread Dong Lin
admin can pool a few > > disks together to create a volume/directory and give that to Kafka. > > > > > > The kernel of my question will be that the admin already has tools to 1) > > create volumes/directories from a JBOD and 2) start a broker on a desired > > machine and 3

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-06 Thread Dong Lin
ead of ZK. Should we have a separate KIP in the future to migrate all existing notification to using RPC? > > Jun > > > On Wed, Jan 25, 2017 at 1:50 PM, Dong Lin <lindon...@gmail.com> wrote: > > > Hey Colin, > > > > Good point! Yeah we have actually considere

[jira] [Created] (KAFKA-4735) Fix deadlock issue during MM shutdown

2017-02-05 Thread Dong Lin (JIRA)
Dong Lin created KAFKA-4735: --- Summary: Fix deadlock issue during MM shutdown Key: KAFKA-4735 URL: https://issues.apache.org/jira/browse/KAFKA-4735 Project: Kafka Issue Type: Bug

Re: [DISCUSS] KIP-117: Add a public AdministrativeClient API for Kafka admin operations

2017-02-03 Thread Dong Lin
org> wrote: > On Thu, Feb 2, 2017, at 17:54, Dong Lin wrote: > > Hey Colin, > > > > Thanks for the KIP. I have a few comments below: > > > > - I share similar view with Ismael that a Future-based API is better. > > PurgeDataFrom() is an example API that u

Re: [DISCUSS] KIP-117: Add a public AdministrativeClient API for Kafka admin operations

2017-02-02 Thread Dong Lin
Hey Colin, Thanks for the KIP. I have a few comments below: - I share similar view with Ismael that a Future-based API is better. PurgeDataFrom() is an example API that user may want to do it asynchronously even though there is only one request in flight at a time. In the future we may also have

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-02 Thread Dong Lin
> > > On 2 Feb 2017, at 02:53, Dong Lin <lindon...@gmail.com> wrote: > > > > Hey Eno, Colin, > > > > Would you have time next Tuesday morning to discuss the KIP? How about > 10 - > > 11 am? > > > > To make best use of our time, can you ple

Re: [DISCUSS] KIP-111 : Kafka should preserve the Principal generated by the PrincipalBuilder while processing the request received on socket channel, on the broker.

2017-02-01 Thread Dong Lin
seems > >> better to put this field in the Session class to avoid changing > interface > >> that needs to be implemented by custom principal. > >> -> Doing this might be backwards incompatible as we need to > >> preserve the existing behavior of kafka-

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-01 Thread Dong Lin
Sorry for the typo. I mean that before the KIP meeting, please feel free to provide comments in this email thread so that discussion in the KIP meeting can be more efficient. On Wed, Feb 1, 2017 at 6:53 PM, Dong Lin <lindon...@gmail.com> wrote: > Hey Eno, Colin, > > Would you

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-01 Thread Dong Lin
no concern the KIP after the KIP meeting. In the meeting time, please feel free to provide comment in the thread so that discussion in the KIP meeting can be more efficient. Thanks, Dong On Wed, Feb 1, 2017 at 5:43 PM, Dong Lin <lindon...@gmail.com> wrote: > Hey Colin, > > Thanks much

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-01 Thread Dong Lin
Hey Colin, Thanks much for the comment. Please see my reply inline. On Wed, Feb 1, 2017 at 1:54 PM, Colin McCabe <cmcc...@apache.org> wrote: > On Wed, Feb 1, 2017, at 11:35, Dong Lin wrote: > > Hey Grant, Colin, > > > > My bad, I misunderstood Grant's s

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-01 Thread Dong Lin
e code against disk errors. Formerly it was OK to just crash on a > > disk error; now it is not. It would be nice to see more in the test > > plan about injecting IOExceptions into disk handling code and verifying > > that we can handle it correctly. > > > > regards, >

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-01 Thread Dong Lin
w people using min.isr to keep > their data safe and the cluster operators would see a shrink in many ISRs > and hopefully an obvious log message leading to a quick fix. I haven't > thought through this idea in depth though. So there could be some > shortfalls. > > Thanks, >

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-01 Thread Dong Lin
ing (as I understand it). > > Disks are not the only resource on a machine, there are several instances > where multiple NICs are used for example. Do we want fine grained > management of all these resources? I'd argue that opens up the system to a > lot of complexity. > > Th

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-01-31 Thread Dong Lin
mplexity of administration with more > running instances. > > is anyone running kafka with anywhere near 100GB heaps? i thought the point > was to rely on kernel page cache to do the disk buffering > > On Thu, Jan 26, 2017 at 11:00 AM, Dong Lin <lindon...@gmail.com> wrot

Re: [VOTE] KIP-107 Add purgeDataBefore() API in AdminClient

2017-01-31 Thread Dong Lin
This thread was closed on Jan 18. We had more discussion after Guozhang's feedback on Jan 21. But no major change was made to the KIP after the discussion. On Tue, Jan 31, 2017 at 5:47 PM, Dong Lin <lindon...@gmail.com> wrote: > Hey Apurva, > > I think the KIP

Re: [VOTE] KIP-107 Add purgeDataBefore() API in AdminClient

2017-01-31 Thread Dong Lin
ion of > returning > > a future from purgeDataFrom(). We can keep it that way. > > > > Thanks, > > > > Jun > > > > On Mon, Jan 23, 2017 at 4:24 PM, Dong Lin <lindon...@gmail.com> wrote: > > > > > Hi all, > > > > >

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-01-26 Thread Dong Lin
Hey Colin, Thanks much for the comment. Please see my comment inline. On Thu, Jan 26, 2017 at 10:15 AM, Colin McCabe <cmcc...@apache.org> wrote: > On Wed, Jan 25, 2017, at 13:50, Dong Lin wrote: > > Hey Colin, > > > > Good point! Yeah we have actually consider

Re: [VOTE] KIP-115: Enforce offsets.topic.replication.factor

2017-01-25 Thread Dong Lin
+1 On Wed, Jan 25, 2017 at 4:37 PM, Ismael Juma wrote: > +1 (binding) > > Ismael > > On Thu, Jan 26, 2017 at 12:34 AM, Onur Karaman < > onurkaraman.apa...@gmail.com > > wrote: > > > I'd like to start the vote for KIP-115: Enforce > > offsets.topic.replication.factor > > > >

Re: [DISCUSS] KIP-115: Enforce offsets.topic.replication.factor

2017-01-25 Thread Dong Lin
+1 On Wed, Jan 25, 2017 at 4:22 PM, Ismael Juma wrote: > An important question is if this needs to wait for a major release or not. > > Ismael > > On Thu, Jan 26, 2017 at 12:19 AM, Ismael Juma wrote: > > > +1 from me too. > > > > Ismael > > > > On Thu, Jan

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-01-25 Thread Dong Lin
ed in the "alternate designs" design, even if you end > up deciding it's not the way to go. > > best, > Colin > > > On Thu, Jan 12, 2017, at 10:46, Dong Lin wrote: > > Hi all, > > > > We created KIP-112: Handle disk failure for JBOD. Please find the KIP

Re: [DISCUSS] KIP-111 : Kafka should preserve the Principal generated by the PrincipalBuilder while processing the request received on socket channel, on the broker.

2017-01-24 Thread Dong Lin
Hey Mayuresh, Thanks for the KIP. I actually like the suggestions by Ismael and Jun. Here are my comments: 1. I am not sure we need to add the method buildPrincipal(Map principalConfigs). It seems that user can simply do principalBuilder.configure(...).buildPrincipal(...) without

Re: [DISCUSS] KIP-113: Support replicas movement between log directories

2017-01-24 Thread Dong Lin
3:30 AM, Alexey Ozeritsky <aozerit...@yandex.ru> wrote: > > > 23.01.2017, 22:11, "Dong Lin" <lindon...@gmail.com>: > > Thanks. Please see my comment inline. > > > > On Mon, Jan 23, 2017 at 6:45 AM, Alexey Ozeritsky <aozerit...@yandex.ru> > > w

Re: [VOTE] KIP-107 Add purgeDataBefore() API in AdminClient

2017-01-23 Thread Dong Lin
rg/confluence/pages/diffpagesbyversion.action?pageId=67636826=13=14>. Please let me know if you have any concern with this change. Thanks, Dong On Mon, Jan 23, 2017 at 11:20 AM, Dong Lin <lindon...@gmail.com> wrote: > Thanks for the comment Jun. > > Yeah, I think there is us

Re: [VOTE] KIP-107 Add purgeDataBefore() API in AdminClient

2017-01-23 Thread Dong Lin
ly, it seems that it's simpler to just have a > blocking api and returns Map<TopicPartition, PurgeDataResult>? > > Thanks, > > Jun > > On Sun, Jan 22, 2017 at 3:56 PM, Dong Lin <lindon...@gmail.com> wrote: > > > Thanks for the comment Guozhang.

Re: [DISCUSS] KIP-113: Support replicas movement between log directories

2017-01-23 Thread Dong Lin
Thanks. Please see my comment inline. On Mon, Jan 23, 2017 at 6:45 AM, Alexey Ozeritsky <aozerit...@yandex.ru> wrote: > > > 13.01.2017, 22:29, "Dong Lin" <lindon...@gmail.com>: > > Hey Alexey, > > > > Thanks for your review and the alternative app

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-01-22 Thread Dong Lin
ir level. > > 4. When broker had one of the dir failed, it can modify its " > /brokers/ids/[brokerId]" registry and remove the dir id, controller already > listening on this path can then be notified and run the replica assignment > accordingly where replica id is computed as

Re: [VOTE] KIP-107 Add purgeDataBefore() API in AdminClient

2017-01-22 Thread Dong Lin
onding before the vote is called. Just wanted to point out > for the record that this approach may have some operational scenarios where > one of the replication files is missing and we need to treat them > specifically. > > > Guozhang > > > On Sun, Jan 22, 2017 at

Re: [VOTE] KIP-107 Add purgeDataBefore() API in AdminClient

2017-01-22 Thread Dong Lin
t have a partition that > only have one of the watermarks in case of a failure in between writing two > files. > > Guozhang > > On Sun, Jan 22, 2017 at 12:03 AM, Dong Lin <lindon...@gmail.com> wrote: > > > Hey Guozhang, > > > > Thanks for the revi

Re: [VOTE] KIP-107 Add purgeDataBefore() API in AdminClient

2017-01-22 Thread Dong Lin
will not be affected. > > > Guozhang > > > > On Wed, Jan 18, 2017 at 6:12 PM, Dong Lin <lindon...@gmail.com> wrote: > > > Thanks to everyone who voted and provided feedback! > > > > This KIP is now adopted with 3 binding +1s (Jun, Joel, Becket) an

Re: [VOTE] KIP-107 Add purgeDataBefore() API in AdminClient

2017-01-18 Thread Dong Lin
ate. +1 > > Jun > > On Wed, Jan 18, 2017 at 1:44 PM, Dong Lin <lindon...@gmail.com> wrote: > > > Hi Jun, > > > > After some more thinking, I agree with you that it is better to simply > > throw OffsetOutOfRangeException and not update low_watermark if > >

Re: [VOTE] KIP-107 Add purgeDataBefore() API in AdminClient

2017-01-18 Thread Dong Lin
than highWatermark? > > Thanks, > > Jun > > On Tue, Jan 17, 2017 at 9:54 PM, Dong Lin <lindon...@gmail.com> wrote: > > > Hi Jun, > > > > Thank you. Please see my answers below. The KIP is updated to answer > these > > questions (see here > > <

Re: [VOTE] KIP-107 Add purgeDataBefore() API in AdminClient

2017-01-18 Thread Dong Lin
6826=7=8> change in the KIP. On Tue, Jan 17, 2017 at 9:54 PM, Dong Lin <lindon...@gmail.com> wrote: > Hi Jun, > > Thank you. Please see my answers below. The KIP is updated to answer these > questions (see here > <https://cwiki.apache.org/confluence/pages/diffpagesbyv
