Permissions to create a KIP

2019-08-18 Thread W. D.
Hi, I would like permission to create a KIP with my user werner.daehn.

Thanks in advance

Werner


Re: [DISCUSS] KIP-482: The Kafka Protocol should Support Optional Fields

2019-08-18 Thread Satish Duggana
Please read struct type as a complex record type in my earlier mail.
The complex type seems to be defined as Schema[1] in the protocol
types.

1. 
https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/common/protocol/types/Schema.java#L27
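
For concreteness, a minimal sketch of how a complex record type nests as a
Schema in the protocol types (based on the Schema class linked above; the
field names and doc strings are illustrative only):

{code:java}
import org.apache.kafka.common.protocol.types.Field;
import org.apache.kafka.common.protocol.types.Schema;
import org.apache.kafka.common.protocol.types.Type;

public class StructSketch {
    // A "struct" is just a Schema used as a field's type: Schema extends Type,
    // so complex record types nest.
    static final Schema INNER = new Schema(new Field("id", Type.INT32, "Entity id"));
    static final Schema OUTER = new Schema(
        new Field("name", Type.STRING, "Entity name"),
        new Field("entity", INNER, "Nested struct field"));
}
{code}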


On Mon, Aug 19, 2019 at 9:46 AM Satish Duggana  wrote:
>
> Sorry! Colin, I may not have been clear in my earlier query about
> optional field type restriction. It is mentioned in one of your
> replies "optional fields are serialized starting with their total
> length". This brings the question of whether optional fields support
> struct types (with or without array values). It seems struct types are
> currently not serialized with total length. I may be missing something
> here.
>
> Thanks,
> Satish.
>
>
> On Wed, Aug 14, 2019 at 8:03 AM Satish Duggana  
> wrote:
> >
> > Hi Colin,
> > Thanks for the KIP. Optional fields and var length encoding support is a 
> > great
> > improvement for the protocol.
> >
> > >>Optional fields can have any type, except that they cannot be arrays.
> > Note that the restriction against having tagged arrays is just to simplify
> > serialization.  We can relax this restriction in the future without changing
> > the protocol on the wire.
> >
> > Can an Optional field have a struct type which internally contains an array
> > field at any level?
> >
> > Thanks,
> > Satish.
> >
> >
> >
> > On Tue, Aug 13, 2019 at 11:49 PM David Jacot  wrote:
> > >
> > > Hi Colin,
> > >
> > > Thank you for the KIP! Things are well explained! It is a huge improvement
> > > for the Kafka protocol. I have a few comments on the proposal:
> > >
> > > 1. The interleaved tag/length header sounds like a great optimisation as 
> > > it
> > > would be shorter on average. The downside, as
> > > you already pointed out, is that it makes the decoding and the specs more
> > > complex. Personally, I would also favour using two
> > > varints in this particular case to keep things simple.
> > >
> > > 2. As discussed, I wonder if it would make sense to extend the KIP to also
> > > support optional fields in the Record Header. I think
> > > that it could be interesting to have such capability for common fields
> > > across all the requests or responses (e.g. tracing id).
> > >
> > > Regards,
> > > David
> > >
> > >
> > >
> > > On Tue, Aug 13, 2019 at 10:00 AM Jason Gustafson  
> > > wrote:
> > >
> > > > > Right, I was planning on doing exactly that for all the auto-generated
> > > > RPCs. For the manual RPCs, it would be a lot of work. It’s probably a
> > > > better use of time to convert the manual ones to auto gen first (with 
> > > > the
> > > > possible exception of Fetch/Produce, where the ROI may be higher for the
> > > > manual work)
> > > >
> > > > Yeah, that makes sense. Maybe we can include the version bump for all 
> > > > RPCs
> > > > in this KIP, but we can implement it lazily as the protocols are 
> > > > converted.
> > > >
> > > > -Jason
> > > >
> > > > On Mon, Aug 12, 2019 at 7:16 PM Colin McCabe  wrote:
> > > >
> > > > > On Mon, Aug 12, 2019, at 11:22, Jason Gustafson wrote:
> > > > > > Hi Colin,
> > > > > >
> > > > > > Thanks for the KIP! This is a significant improvement. One of my
> > > > personal
> > > > > > interests in this proposal is solving the compatibility problems we
> > > > have
> > > > > > with the internal schemas used to define consumer offsets and
> > > > transaction
> > > > > > metadata. Currently we have to guard schema bumps with the 
> > > > > > inter-broker
> > > > > > protocol format. Once the format is bumped, there is no way to
> > > > downgrade.
> > > > > > By fixing this, we can potentially begin using the new schemas 
> > > > > > before
> > > > the
> > > > > > IBP is bumped while still allowing downgrade.
> > > > > >
> > > > > > There are a surprising number of other situations where we have
> > > > > > encountered this
> > > > > > sort of problem. We have hacked around it in special cases by 
> > > > > > allowing
> > > > > > nullable fields to the end of the schema, but this is not really an
> > > > > > extensible approach. I'm looking forward to having a better option.
> > > > >
> > > > > Yeah, this problem keeps coming up.
> > > > >
> > > > > >
> > > > > > With that said, I have a couple questions on the proposal:
> > > > > >
> > > > > > 1. For each request API, we need one version bump to begin support 
> > > > > > for
> > > > > > "flexible versions." Until then, we won't have the option of using
> > > > tagged
> > > > > > fields even if the broker knows how to handle them. Does it make 
> > > > > > sense
> > > > to
> > > > > > go ahead and do a universal bump of each request API now so that 
> > > > > > we'll
> > > > > have
> > > > > > this option going forward?
> > > > >
> > > > > Right, I was planning on doing exactly that for all the auto-generated
> > > > > RPCs. For the manual RPCs, it would be a lot of work. It’s probably a
> > > > > better use of time to convert the manual ones to auto gen first (with the
> > > > > possible exception of Fetch/Produce, where the ROI may be higher for the
> > > > > manual work)

Re: [DISCUSS] KIP-482: The Kafka Protocol should Support Optional Fields

2019-08-18 Thread Satish Duggana
Sorry! Colin, I may not have been clear in my earlier query about
optional field type restriction. It is mentioned in one of your
replies "optional fields are serialized starting with their total
length". This brings the question of whether optional fields support
struct types (with or without array values). It seems struct types are
currently not serialized with total length. I may be missing something
here.

Thanks,
Satish.


On Wed, Aug 14, 2019 at 8:03 AM Satish Duggana  wrote:
>
> Hi Colin,
> Thanks for the KIP. Optional fields and var length encoding support is a great
> improvement for the protocol.
>
> >>Optional fields can have any type, except that they cannot be arrays.
> Note that the restriction against having tagged arrays is just to simplify
> serialization.  We can relax this restriction in the future without changing
> the protocol on the wire.
>
> Can an Optional field have a struct type which internally contains an array
> field at any level?
>
> Thanks,
> Satish.
>
>
>
> On Tue, Aug 13, 2019 at 11:49 PM David Jacot  wrote:
> >
> > Hi Colin,
> >
> > Thank you for the KIP! Things are well explained! It is a huge improvement
> > for the Kafka protocol. I have a few comments on the proposal:
> >
> > 1. The interleaved tag/length header sounds like a great optimisation as it
> > would be shorter on average. The downside, as
> > you already pointed out, is that it makes the decoding and the specs more
> > complex. Personally, I would also favour using two
> > varints in this particular case to keep things simple.
> >
> > 2. As discussed, I wonder if it would make sense to extend the KIP to also
> > support optional fields in the Record Header. I think
> > that it could be interesting to have such capability for common fields
> > across all the requests or responses (e.g. tracing id).
> >
> > Regards,
> > David
> >
> >
> >
> > On Tue, Aug 13, 2019 at 10:00 AM Jason Gustafson  wrote:
> >
> > > > Right, I was planning on doing exactly that for all the auto-generated
> > > RPCs. For the manual RPCs, it would be a lot of work. It’s probably a
> > > better use of time to convert the manual ones to auto gen first (with the
> > > possible exception of Fetch/Produce, where the ROI may be higher for the
> > > manual work)
> > >
> > > Yeah, that makes sense. Maybe we can include the version bump for all RPCs
> > > in this KIP, but we can implement it lazily as the protocols are 
> > > converted.
> > >
> > > -Jason
> > >
> > > On Mon, Aug 12, 2019 at 7:16 PM Colin McCabe  wrote:
> > >
> > > > On Mon, Aug 12, 2019, at 11:22, Jason Gustafson wrote:
> > > > > Hi Colin,
> > > > >
> > > > > Thanks for the KIP! This is a significant improvement. One of my
> > > personal
> > > > > interests in this proposal is solving the compatibility problems we
> > > have
> > > > > with the internal schemas used to define consumer offsets and
> > > transaction
> > > > > metadata. Currently we have to guard schema bumps with the 
> > > > > inter-broker
> > > > > protocol format. Once the format is bumped, there is no way to
> > > downgrade.
> > > > > By fixing this, we can potentially begin using the new schemas before
> > > the
> > > > > IBP is bumped while still allowing downgrade.
> > > > >
> > > > > There are a surprising number of other situations where we have
> > > > > encountered this
> > > > > sort of problem. We have hacked around it in special cases by allowing
> > > > > nullable fields to the end of the schema, but this is not really an
> > > > > extensible approach. I'm looking forward to having a better option.
> > > >
> > > > Yeah, this problem keeps coming up.
> > > >
> > > > >
> > > > > With that said, I have a couple questions on the proposal:
> > > > >
> > > > > 1. For each request API, we need one version bump to begin support for
> > > > > "flexible versions." Until then, we won't have the option of using
> > > tagged
> > > > > fields even if the broker knows how to handle them. Does it make sense
> > > to
> > > > > go ahead and do a universal bump of each request API now so that we'll
> > > > have
> > > > > this option going forward?
> > > >
> > > > Right, I was planning on doing exactly that for all the auto-generated
> > > > RPCs. For the manual RPCs, it would be a lot of work. It’s probably a
> > > > better use of time to convert the manual ones to auto gen first (with 
> > > > the
> > > > possible exception of Fetch/Produce, where the ROI may be higher for the
> > > > manual work)
> > > >
> > > > > 2. The alternating length/tag header encoding lets us save a byte in
> > > the
> > > > > common case. The downside is that it's a bit more complex to specify.
> > > It
> > > > > also has some extra cost if the length exceeds the tag substantially.
> > > > > Basically we'd have to pad the tag, right? I think I'm wondering if we
> > > > > should just bite the bullet and use two varints instead.
> > > >
> > > > That’s a fair point. It would be shorter on average, but worse for some
> > > > exceptional cases. Also, the dec
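
For reference, a minimal sketch of the simpler "two varints" framing discussed
above (writeVarint is a plain unsigned LEB128 encoder written here for
illustration; it is not a Kafka API):

{code:java}
import java.io.ByteArrayOutputStream;

public class TaggedFieldSketch {
    // Unsigned LEB128 varint: 7 bits per byte, high bit set on continuation.
    static void writeVarint(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    // "Two varints" framing: <tag><total length><payload>. The explicit length
    // is what lets an old reader skip a tagged field it does not understand.
    static void writeTaggedField(ByteArrayOutputStream out, int tag, byte[] payload) {
        writeVarint(out, tag);
        writeVarint(out, payload.length);
        out.write(payload, 0, payload.length);
    }
}
{code}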

[jira] [Created] (KAFKA-8814) Consumer benchmark test for paused partitions

2019-08-18 Thread Sean Glover (JIRA)
Sean Glover created KAFKA-8814:
--

 Summary: Consumer benchmark test for paused partitions
 Key: KAFKA-8814
 URL: https://issues.apache.org/jira/browse/KAFKA-8814
 Project: Kafka
  Issue Type: New Feature
  Components: consumer, system tests, tools
Reporter: Sean Glover
Assignee: Sean Glover


A new performance benchmark and a corresponding addition to the 
{{ConsumerPerformance}} tool, supporting the paused-partition performance 
improvement implemented in KAFKA-7548.  Before the fix, when the user polled 
for completed fetched records for partitions that were paused, the consumer 
would throw away the data because it was no longer fetchable.  If the 
partition was later resumed, the data would have to be fetched all over 
again.  The fix caches completed fetched records for paused partitions 
indefinitely so they can potentially be returned once the partition is resumed.

In the Jira issue KAFKA-7548 there are several informal test results shown 
based on a number of different paused partition scenarios, but it was suggested 
that a test in the benchmarks testsuite would be ideal to demonstrate the 
performance improvement.  In order to implement this benchmark, we must add a 
new feature in {{ConsumerPerformance}} (used by the benchmark testsuite and 
the {{kafka-consumer-perf-test.sh}} bin script) that will pause 
partitions.  I added the following parameter:

{code:scala}
val pausedPartitionsOpt = parser.accepts("paused-partitions-percent",
    "The percentage [0-1] of subscribed partitions to pause each poll.")
  .withOptionalArg()
  .describedAs("percent")
  .withValuesConvertedBy(regex("^0(\\.\\d+)?|1\\.0$")) // matches [0-1] with decimals
  .ofType(classOf[Float])
  .defaultsTo(0F)
{code}

This allows the user to specify a percentage (represented as a floating-point 
value from {{0..1}}) of partitions to pause each poll interval.  When the 
value is greater than {{0}}, we take the next _n_ partitions to pause.  I ran 
the test on `trunk` and rebased onto the `2.3.0` tag for the following test 
summaries of 
{{kafkatest.benchmarks.core.benchmark_test.Benchmark.test_consumer_throughput}}.
The test rotates through pausing {{80%}} of assigned partitions (5/6) each 
poll interval.  I ran this on my laptop.

{{trunk}} ({{aa4ba8eee8e6f52a9d80a98fb2530b5bcc1b9a11}})

{code}

SESSION REPORT (ALL TESTS)
ducktape version: 0.7.5
session_id:   2019-08-18--010
run time: 2 minutes 29.104 seconds
tests run:1
passed:   1
failed:   0
ignored:  0

test_id:
kafkatest.benchmarks.core.benchmark_test.Benchmark.test_consumer_throughput.paused_partitions_percent=0.8
status: PASS
run time:   2 minutes 29.048 seconds
{"records_per_sec": 450207.0953, "mb_per_sec": 42.9351}

{code}

{{2.3.0}}

{code}

SESSION REPORT (ALL TESTS)
ducktape version: 0.7.5
session_id:   2019-08-18--011
run time: 2 minutes 41.228 seconds
tests run:1
passed:   1
failed:   0
ignored:  0

test_id:
kafkatest.benchmarks.core.benchmark_test.Benchmark.test_consumer_throughput.paused_partitions_percent=0.8
status: PASS
run time:   2 minutes 41.168 seconds
{"records_per_sec": 246574.6024, "mb_per_sec": 23.5152}

{code}

The increase in record and data throughput is significant.  Based on other 
consumer fetch metrics there are also improvements to fetch rate.  Depending on 
how often partitions are paused and resumed it's possible to save a lot of data 
transfer between the consumer and broker as well.
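
For context, the pause/resume pattern the benchmark exercises boils down to 
something like the following minimal sketch (the actual rotation logic in 
{{ConsumerPerformance}} is more involved; the class and method names here are 
illustrative):

{code:java}
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class PausedPollSketch {
    // Pause a fraction of the current assignment for one poll, then resume.
    // With the KAFKA-7548 fix, completed fetches for the paused partitions are
    // retained across the poll instead of being discarded and refetched.
    static void pollWithPaused(KafkaConsumer<byte[], byte[]> consumer, float pausedPercent) {
        List<TopicPartition> assigned = new ArrayList<>(consumer.assignment());
        int n = (int) (assigned.size() * pausedPercent);
        List<TopicPartition> paused = assigned.subList(0, n);
        consumer.pause(paused);
        consumer.poll(Duration.ofMillis(100));
        consumer.resume(paused);
    }
}
{code}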

Please see the pull request for the associated changes.  I was unsure whether 
I needed to create a KIP because, while technically I added a new public API 
to the {{ConsumerPerformance}} tool, it was only to enable this benchmark to 
run.  If you feel that a KIP is necessary, I'll create one.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (KAFKA-8813) Race condition when creating topics and changing their configuration

2019-08-18 Thread Gwen Shapira (JIRA)
Gwen Shapira created KAFKA-8813:
---

 Summary: Race condition when creating topics and changing their 
configuration
 Key: KAFKA-8813
 URL: https://issues.apache.org/jira/browse/KAFKA-8813
 Project: Kafka
  Issue Type: Bug
Reporter: Gwen Shapira


In Partition.createLog we do:

{code:scala}
val config = LogConfig.fromProps(logManager.currentDefaultConfig.originals, props)
val log = logManager.getOrCreateLog(topicPartition, config, isNew, isFutureReplica)
{code}

Config changes that arrive after the configs are loaded from ZK, but before 
LogManager has added the partition to `futureLogs` or `currentLogs` (where the 
dynamic config handler picks up topics whose configs need updating), will be 
lost.
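
A hypothetical reduction of that window, to make the lost update concrete 
(illustrative code only, not actual Kafka internals):

{code:java}
import java.util.Map;
import java.util.Properties;
import java.util.concurrent.ConcurrentHashMap;

public class RaceSketch {
    // Stand-in for LogManager.currentLogs: config updates only reach logs that
    // are already registered here.
    final Map<String, Properties> currentLogs = new ConcurrentHashMap<>();

    void onConfigChange(String topic, Properties updated) {
        // Silently a no-op if the log is not registered yet: the lost update.
        currentLogs.computeIfPresent(topic, (t, old) -> updated);
    }

    void createLog(String topic, Properties loadedFromZk) {
        // A change handled by onConfigChange() between the ZK load and this
        // put() is dropped, and the stale snapshot wins.
        currentLogs.put(topic, loadedFromZk);
    }
}
{code}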



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


Re: [VOTE] KIP-504 - Add new Java Authorizer Interface

2019-08-18 Thread Mickael Maison
+1 (non binding)
Thanks for the KIP!

On Sun, Aug 18, 2019 at 3:05 PM Ron Dagostino  wrote:
>
> +1 (non-binding)
>
> Thanks, Rajini.
>
> Ron
>
> On Sat, Aug 17, 2019 at 10:16 AM Harsha Chintalapani 
> wrote:
>
> > +1 (binding).
> >
> > Thanks,
> > Harsha
> >
> >
> > On Sat, Aug 17, 2019 at 2:53 AM, Manikumar 
> > wrote:
> >
> > > Hi,
> > >
> > > +1 (binding).
> > >
> > > Thanks for the KIP. LGTM.
> > >
> > > Regards,
> > > Manikumar
> > >
> > > On Sat, Aug 17, 2019 at 4:41 AM Colin McCabe  wrote:
> > >
> > > +1 (binding)
> > >
> > > Thanks, Rajini!
> > >
> > > best,
> > > Colin
> > >
> > > On Fri, Aug 16, 2019, at 09:52, Rajini Sivaram wrote:
> > >
> > > Hi all,
> > >
> > > I would like to start the vote for KIP-504:
> > >
> > > https://cwiki.apache.org/confluence/display/KAFKA/
> > > KIP-504+-+Add+new+Java+Authorizer+Interface
> > >
> > > This KIP replaces the Scala Authorizer API with a new Java API similar to
> > > other pluggable APIs in the broker and addresses known limitations in the
> > > existing API.
> > >
> > > Thanks for all the feedback!
> > >
> > > Regards,
> > >
> > > Rajini
> > >
> > >
> >


Re: [DISCUSS] KIP-504 - Add new Java Authorizer Interface

2019-08-18 Thread Don Bosco Durai
Hi Rajini

Thanks for clarifying. I am good for now.

Regards

Bosco


On 8/16/19, 11:30 AM, "Rajini Sivaram"  wrote:

Hi Don,

That should be fine. I guess Ranger loads policies from the database
synchronously when the authorizer is configured, similar to
SimpleAclAuthorizer loading from ZooKeeper. Ranger can continue to load
synchronously from `configure()` or `start()` and return an empty map from
`start()`. That would retain the existing behaviour. When the same
database stores policies for all listeners and the policies are not stored
in Kafka, there is no value in making the load asynchronous.
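
For illustration, a sketch of that synchronous-loading pattern against the
proposed API (kept abstract here so the remaining Authorizer methods can be
elided; the start() signature is as described in the KIP):

{code:java}
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.CompletionStage;
import org.apache.kafka.common.Endpoint;
import org.apache.kafka.server.authorizer.Authorizer;
import org.apache.kafka.server.authorizer.AuthorizerServerInfo;

public abstract class SyncLoadingAuthorizer implements Authorizer {
    @Override
    public void configure(Map<String, ?> configs) {
        loadPolicies(); // blocking load, like SimpleAclAuthorizer's ZK load
    }

    @Override
    public Map<Endpoint, ? extends CompletionStage<Void>> start(AuthorizerServerInfo serverInfo) {
        return Collections.emptyMap(); // nothing async: listeners can start immediately
    }

    // Synchronous policy load from the external store (e.g. Ranger's database).
    abstract void loadPolicies();
}
{code}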

Regards,

Rajini


On Fri, Aug 16, 2019 at 6:43 PM Don Bosco Durai  wrote:

> Hi Rajini
>
> Assuming this doesn't affect custom plugins, I don't have any concerns
> regarding this.
>
> I do have one question regarding:
>
> "For authorizers that don’t store metadata in ZooKeeper, ensure that
> authorizer metadata for each listener is available before starting up the
> listener. This enables different authorization metadata stores for
> different listeners."
>
> Since Ranger uses its own database for storing policies, do you anticipate
> any issues?
>
> Thanks
>
> Bosco
>
>
> On 8/16/19, 6:49 AM, "Rajini Sivaram"  wrote:
>
> Hi all,
>
> I made another change to the KIP. The KIP was originally proposing to
> extend SimpleAclAuthorizer to also implement the new API (in addition
> to
> the existing API). But since we use the new API when available, this
> can
> break custom authorizers that extend this class and override specific
> methods of the old API. To avoid breaking any existing custom
> implementations that extend this class, particularly because it is in
> the
> public package kafka.security.auth, the KIP now proposes to retain the
> old
> authorizer as-is.  SimpleAclAuthorizer will continue to implement the
> old
> API, but will be deprecated. A new authorizer implementation
> kafka.security.authorizer.AclAuthorizer will be added for the new API
> (this
> will not be in the public package).
>
> Please let me know if you have any concerns.
>
> Regards,
>
> Rajini
>
>
> On Fri, Aug 16, 2019 at 8:48 AM Rajini Sivaram <
> rajinisiva...@gmail.com>
> wrote:
>
> > Thanks Colin.
> >
> > If there are no other concerns, I will start vote later today. Many
> thanks
> > to every one for the feedback.
> >
> > Regards,
> >
> > Rajini
> >
> >
> > On Thu, Aug 15, 2019 at 11:57 PM Colin McCabe 
> wrote:
> >
> >> Thanks, Rajini.  It looks good to me.
> >>
> >> best,
> >> Colin
> >>
> >>
> >> On Thu, Aug 15, 2019, at 11:37, Rajini Sivaram wrote:
> >> > Hi Colin,
> >> >
> >> > Thanks for the review. I have updated the KIP to move the
> interfaces for
> >> > request context and server info to the authorizer package. These
> are now
> >> > called AuthorizableRequestContext and AuthorizerServerInfo.
> Endpoint is
> >> now
> >> > a class in org.apache.kafka.common to make it reusable since we
> already
> >> > have multiple implementations of it. I have removed requestName
> from the
> >> > request context interface since authorizers can distinguish
> follower
> >> fetch
> >> > and consumer fetch from the operation being authorized. So 16-bit
> >> request
> >> > type should be sufficient for audit logging.  Also replaced
> AuditFlag
> >> with
> >> > two booleans as you suggested.
> >> >
> >> > Can you take another look and see if the KIP is ready for voting?
> >> >
> >> > Thanks for all your help!
> >> >
> >> > Regards,
> >> >
> >> > Rajini
> >> >
> >> > On Wed, Aug 14, 2019 at 8:59 PM Colin McCabe 
> >> wrote:
> >> >
> >> > > Hi Rajini,
> >> > >
> >> > > I think it would be good to rename KafkaRequestContext to
> something
> >> like
> >> > > AuthorizableRequestContext, and put it in the
> >> > > org.apache.kafka.server.authorizer namespace.  If we put it in
> the
> >> > > org.apache.kafka.common namespace, then it's not really clear
> that
> >> it's
> >> > > part of the Authorizer API.  Since this class is purely an
> interface,
> >> with
> >> > > no concrete implementation of anything, there's nothing common
> to
> >> really
> >> > > reuse in any case.  We definitel

Re: [VOTE] KIP-503: deleted topics metric

2019-08-18 Thread Gwen Shapira
+1 (binding)
This will be most useful. Thank you.

On Tue, Aug 13, 2019 at 12:08 PM David Arthur  wrote:
>
> Hello all,
>
> I'd like to start the vote on KIP-503
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-503%3A+Add+metric+for+number+of+topics+marked+for+deletion
>
> Thanks!
> David



-- 
Gwen Shapira
Product Manager | Confluent
650.450.2760 | @gwenshap
Follow us: Twitter | blog


Re: [VOTE] KIP-503: deleted topics metric

2019-08-18 Thread Satish Duggana
+1(non-binding), very useful metrics for admin/ops.

Thanks,
Satish.


On Sat, Aug 17, 2019 at 4:08 PM Manikumar  wrote:

> +1 (binding).
>
> Thanks for the KIP. LGTM.
>
> Thanks,
> Manikumar
>
> On Wed, Aug 14, 2019 at 11:54 PM David Jacot  wrote:
>
> > +1 (non-binding)
> >
> > Thanks for the KIP! Simple yet very useful.
> >
> > Best,
> > David
> >
> > On Wed, Aug 14, 2019 at 9:24 AM Robert Barrett  >
> > wrote:
> >
> > > +1 (non-binding)
> > >
> > > This will be good to have, thanks David!
> > >
> > > Bob
> > >
> > > On Wed, Aug 14, 2019 at 6:08 AM Mickael Maison <
> mickael.mai...@gmail.com
> > >
> > > wrote:
> > >
> > > > +1 non binding
> > > > Thank you!
> > > >
> > > > On Tue, Aug 13, 2019 at 9:07 PM Stanislav Kozlovski
> > > >  wrote:
> > > > >
> > > > > +1 (non-binding)
> > > > >
> > > > > Thanks for the simple but very useful KIP!
> > > > > Best,
> > > > > Stanislav
> > > > >
> > > > > On Tue, Aug 13, 2019 at 8:32 PM Harsha Chintalapani <
> ka...@harsha.io
> > >
> > > > wrote:
> > > > >
> > > > > > +1 (binding)
> > > > > >
> > > > > > Thanks,
> > > > > > Harsha
> > > > > >
> > > > > >
> > > > > > On Tue, Aug 13, 2019 at 12:08 PM, David Arthur <
> > > davidart...@apache.org
> > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Hello all,
> > > > > > >
> > > > > > > I'd like to start the vote on KIP-503
> > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/
> > > > > > > KIP-503%3A+Add+metric+for+number+of+topics+marked+for+deletion
> > > > > > >
> > > > > > > Thanks!
> > > > > > > David
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Best,
> > > > > Stanislav
> > > >
> > >
> >
>


Re: all tests passed but still got ‘Some checks were not successful’

2019-08-18 Thread M. Manna
Could you trigger a retest? Add a “Retest this please” comment. It’ll kick off a new build.

On Sun, 18 Aug 2019 at 16:16, hong mao  wrote:

> Hi all,
> I submitted a PR and all test cases passed in Jenkins, but I still got a
> 'Some checks were not successful' tip. Could anybody give some advice?
> Here's the PR link https://github.com/apache/kafka/pull/7153
>
> Thanks!
>


all tests passed but still got ‘Some checks were not successful’

2019-08-18 Thread hong mao
Hi all,
I submitted a PR and all test cases passed in Jenkins, but I still got a
'Some checks were not successful' tip. Could anybody give some advice?
Here's the PR link https://github.com/apache/kafka/pull/7153

Thanks!


[jira] [Resolved] (KAFKA-8812) Rebalance Producers - yes, I mean it ;-)

2019-08-18 Thread Werner Daehn (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Werner Daehn resolved KAFKA-8812.
-
Resolution: Invalid

Will create a KIP.

> Rebalance Producers - yes, I mean it ;-)
> 
>
> Key: KAFKA-8812
> URL: https://issues.apache.org/jira/browse/KAFKA-8812
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 2.3.0
>Reporter: Werner Daehn
>Priority: Major
>
> Please bear with me. Initially this thought sounds stupid, but it has its 
> merits.
>  
> How do you build a distributed producer at the moment? You use Kafka Connect, 
> which in turn requires a cluster that decides which instance produces which 
> partitions.
> On the consumer side it is different. There Kafka itself does the data 
> distribution. If you have 10 Kafka partitions and 10 consumers, each will get 
> data for one partition. With 5 consumers, each will get data from two 
> partitions. And if there is only a single consumer active, it gets all data. 
> All is managed by Kafka, all you have to do is start as many consumers as you 
> want.
>  
> I'd like to suggest something similar for the producers. A producer would 
> tell Kafka that its source has 10 partitions. The Kafka server then responds 
> with a list of partitions this instance shall be responsible for. If it is 
> the only producer, the response would be all 10 partitions. If it is the 
> second instance starting up, the first instance would get the information it 
> should produce data for partitions 1-5 and the new one for partitions 6-10. If 
> a producer fails to respond with an alive packet, a rebalance happens, 
> informing the active producer to take more load, and the dead producer will 
> get an error when sending data again.
> For restarts, the producer rebalance also has to send the starting point from 
> which to resume producing data, of course. It would be best if this were a 
> user-generated pointer and not the topic offset; then it can be e.g. the 
> database system change number, a database transaction id, or something similar.
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


Re: [VOTE] KIP-504 - Add new Java Authorizer Interface

2019-08-18 Thread Ron Dagostino
+1 (non-binding)

Thanks, Rajini.

Ron

On Sat, Aug 17, 2019 at 10:16 AM Harsha Chintalapani 
wrote:

> +1 (binding).
>
> Thanks,
> Harsha
>
>
> On Sat, Aug 17, 2019 at 2:53 AM, Manikumar 
> wrote:
>
> > Hi,
> >
> > +1 (binding).
> >
> > Thanks for the KIP. LGTM.
> >
> > Regards,
> > Manikumar
> >
> > On Sat, Aug 17, 2019 at 4:41 AM Colin McCabe  wrote:
> >
> > +1 (binding)
> >
> > Thanks, Rajini!
> >
> > best,
> > Colin
> >
> > On Fri, Aug 16, 2019, at 09:52, Rajini Sivaram wrote:
> >
> > Hi all,
> >
> > I would like to start the vote for KIP-504:
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/
> > KIP-504+-+Add+new+Java+Authorizer+Interface
> >
> > This KIP replaces the Scala Authorizer API with a new Java API similar to
> > other pluggable APIs in the broker and addresses known limitations in the
> > existing API.
> >
> > Thanks for all the feedback!
> >
> > Regards,
> >
> > Rajini
> >
> >
>


[jira] [Created] (KAFKA-8812) Rebalance Producers - yes, I mean it ;-)

2019-08-18 Thread Werner Daehn (JIRA)
Werner Daehn created KAFKA-8812:
---

 Summary: Rebalance Producers - yes, I mean it ;-)
 Key: KAFKA-8812
 URL: https://issues.apache.org/jira/browse/KAFKA-8812
 Project: Kafka
  Issue Type: Improvement
  Components: core
Affects Versions: 2.3.0
Reporter: Werner Daehn


Please bear with me. Initially this thought sounds stupid, but it has its merits.

 

How do you build a distributed producer at the moment? You use Kafka Connect, 
which in turn requires a cluster that decides which instance produces which 
partitions.

On the consumer side it is different. There Kafka itself does the data 
distribution. If you have 10 Kafka partitions and 10 consumers, each will get 
data for one partition. With 5 consumers, each will get data from two 
partitions. And if there is only a single consumer active, it gets all data. 
All is managed by Kafka, all you have to do is start as many consumers as you 
want.

 

I'd like to suggest something similar for the producers. A producer would tell 
Kafka that its source has 10 partitions. The Kafka server then responds with a 
list of partitions this instance shall be responsible for. If it is the only 
producer, the response would be all 10 partitions. If it is the second instance 
starting up, the first instance would get the information it should produce 
data for partitions 1-5 and the new one for partitions 6-10. If a producer 
fails to respond with an alive packet, a rebalance happens, informing the 
active producer to take more load, and the dead producer will get an error 
when sending data again.

For restarts, the producer rebalance also has to send the starting point from 
which to resume producing data, of course. It would be best if this were a 
user-generated pointer and not the topic offset; then it can be e.g. the 
database system change number, a database transaction id, or something similar.
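
To make the shape of the proposal concrete, here is a purely hypothetical 
sketch of such a callback (no such API exists in Kafka today):

{code:java}
import java.util.Collection;
import java.util.Map;

// Hypothetical only: sketches the proposal above, not an existing Kafka API.
public interface SourcePartitionRebalanceListener {
    // The broker (re)assigns source partitions to this producer instance,
    // passing the user-generated restart pointer per partition (e.g. a
    // database system change number) so the producer knows where to resume.
    void onSourcePartitionsAssigned(Map<Integer, String> partitionToRestartPointer);

    // Partitions taken away from this instance during a rebalance.
    void onSourcePartitionsRevoked(Collection<Integer> partitions);
}
{code}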

 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)