[jira] [Created] (KAFKA-17120) Race condition in KafkaStreams.close can result in StreamsException "Failed to shut down while in state"

2024-07-11 Thread Andy Coates (Jira)
Andy Coates created KAFKA-17120:
---

 Summary: Race condition in KafkaStreams.close can result in 
StreamsException "Failed to shut down while in state"
 Key: KAFKA-17120
 URL: https://issues.apache.org/jira/browse/KAFKA-17120
 Project: Kafka
  Issue Type: Bug
  Components: streams
Affects Versions: 3.6.1
Reporter: Andy Coates


If `KafkaStreams.close` is called while the app is transitioning to an error 
state, i.e. `PENDING_ERROR` or `ERROR`, it seems there is a race condition 
that can result in the `close` method throwing a `StreamsException` with the 
message "Failed to shut down while in state PENDING_ERROR":

```

[2024-07-02T13:09:05.182Z] 13:09:04.921 
[mktx.com.cdc:cdc-processor-b586687d-e969-45eb-949c-09042b0d358b-b76cd6d2-0549-4cf9-b7e5-6f886d408495-StreamThread-1]
 INFO org.apache.kafka.streams.KafkaStreams - stream-client 
[mktx.com.cdc:cdc-processor-b586687d-e969-45eb-949c-09042b0d358b-b76cd6d2-0549-4cf9-b7e5-6f886d408495]
 State transition from RUNNING to PENDING_ERROR

[2024-07-02T13:09:05.182Z] 13:09:04.921 [Thread-30] ERROR 
org.apache.kafka.streams.KafkaStreams - stream-client 
[cdc:cdc-processor-b586687d-e969-45eb-949c-09042b0d358b-b76cd6d2-0549-4cf9-b7e5-6f886d408495]
 Failed to transition to PENDING_SHUTDOWN, current state is PENDING_ERROR

[2024-07-02T13:09:05.182Z] 13:09:04.922 
[cdc:cdc-processor-b586687d-e969-45eb-949c-09042b0d358b-b76cd6d2-0549-4cf9-b7e5-6f886d408495-StreamThread-1]
 INFO com.cdc.observability.LoggingApplicationLifecycleListener - 
{"state":"PENDING_ERROR","message":"Application-lifecycle","oldState":"RUNNING"}

[2024-07-02T13:09:05.182Z] 13:09:04.922 [Thread-30] ERROR 
com.cdc.observability.LoggingApplicationLifecycleListener - {"reason":"Failed 
to stop the Kafka Streams app as streams.close() threw an 
exception","cause":"org.apache.kafka.streams.errors.StreamsException: Failed to 
shut down while in state PENDING_ERROR\n\tat 
org.apache.kafka.streams.KafkaStreams.close(KafkaStreams.java:1447)\n\tat 
org.apache.kafka.streams.KafkaStreams.close(KafkaStreams.java:1497)\n\tat 
com.cdc.streams.KafkaStreamsExecutor.shutdownStreams(KafkaStreamsExecutor.java:90)\n\tat 
com.cdc.streams.KafkaStreamsExecutor.execute(KafkaStreamsExecutor.java:70)\n\tat 
com.cdc.Main.startKafkaStreams(Main.java:148)\n
```

Could this be caused by a lack of synchronisation in the `close` method around 
the state checks?
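For illustration, here is a minimal, self-contained sketch of the suspected check-then-act race and one tolerant alternative. This is hypothetical code, not the actual `KafkaStreams` implementation; the state names simply mirror those in the logs above.

```java
import java.util.concurrent.atomic.AtomicReference;

// Sketch only (NOT the real KafkaStreams code): if close() checks the state
// and transitions in separate steps, an error-handling thread can move the
// state to PENDING_ERROR in between, making close() fail.
public class StateMachine {
    enum State { RUNNING, PENDING_SHUTDOWN, PENDING_ERROR, ERROR, NOT_RUNNING }

    private final AtomicReference<State> state = new AtomicReference<>(State.RUNNING);

    // Simulates the error path racing with close().
    void errorOccurred() {
        state.set(State.PENDING_ERROR);
    }

    // Racy shape: fails if the state has already moved to PENDING_ERROR.
    boolean closeRacy() {
        if (state.get() == State.RUNNING) {
            return state.compareAndSet(State.RUNNING, State.PENDING_SHUTDOWN);
        }
        return false; // caller would throw "Failed to shut down while in state ..."
    }

    // One possible fix: treat PENDING_ERROR/ERROR as an acceptable outcome of
    // close() rather than a failure, decided atomically via CAS + retry.
    boolean closeTolerant() {
        State current = state.get();
        while (true) {
            if (current == State.PENDING_ERROR || current == State.ERROR) {
                return true; // already shutting down due to an error: not a failure
            }
            if (state.compareAndSet(current, State.PENDING_SHUTDOWN)) {
                return true;
            }
            current = state.get(); // lost a race: re-read and retry
        }
    }
}
```

With this shape, a concurrent error transition no longer causes `close` to report a failure.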



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-14660) Divide by zero security vulnerability (sonatype-2019-0422)

2023-02-04 Thread Andy Coates (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-14660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Coates resolved KAFKA-14660.
-
Resolution: Fixed

> Divide by zero security vulnerability (sonatype-2019-0422)
> --
>
> Key: KAFKA-14660
> URL: https://issues.apache.org/jira/browse/KAFKA-14660
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 3.3.2
>    Reporter: Andy Coates
>Assignee: Matthias J. Sax
>Priority: Minor
> Fix For: 3.5.0
>
>
> Looks like SonaType has picked up a "Divide by Zero" issue reported in a PR 
> and, because the PR was never merged, is now reporting it as a security 
> vulnerability in the latest Kafka Streams library.
>  
> See:
>  * [Vulnerability: 
> sonatype-2019-0422]([https://ossindex.sonatype.org/vulnerability/sonatype-2019-0422?component-type=maven&component-name=org.apache.kafka%2Fkafka-streams&utm_source=ossindex-client&utm_medium=integration&utm_content=1.7.0)]
>  * [Original PR]([https://github.com/apache/kafka/pull/7414])
>  
> While it looks from the comments made by [~mjsax] and [~bbejeck] that the 
> divide-by-zero is not really an issue, the fact that it's now being reported 
> as a vulnerability is a problem, especially with regulators.
> PITA, but we should consider either getting this vulnerability removed 
> (Google wasn't very helpful in providing info on how to do this), or fixed 
> (Again, not sure how to tag the fix as fixing this issue).  One option may 
> just be to reopen the PR and merge (and then fix forward by switching it to 
> throw an exception).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Reopened] (KAFKA-14660) Divide by zero security vulnerability (sonatype-2019-0422)

2023-02-02 Thread Andy Coates (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-14660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Coates reopened KAFKA-14660:
-

The issue here is more the SonaType security vulnerability report than any 
impossible to reach divide by zero issue. Unfortunately, I'm struggling to find 
information on _how_ to mark the vulnerability resolved in SonaType.  This was 
why I was suggesting opening and merging the PR, as it seems the PR is the 
cause of the report.

I realise the PR's solution wasn't ideal. Hence I was suggesting to merge and 
put in a second change after to fix the fix, so to speak.

If you've already submitted a fix for the DBZ, then I see two potential ways 
forward:
 # work out how to inform SonaType the issue is fixed:
 ## There is a [Report 
correction|https://ossindex.sonatype.org/doc/report-vulnerability] link on the 
bug report.  Maybe you, or I (if you let me know the PR you fixed the DBZ in), 
can use this to flag that it's been fixed?
 ## Maybe just tagging the [SonaType 
issue|https://ossindex.sonatype.org/vulnerability/sonatype-2019-0422?component-type=maven&component-name=org.apache.kafka%2Fkafka-streams&utm_source=ossindex-client&utm_medium=integration&utm_content=1.7.0]
 in your PR would be enough?
 ## Does someone in Confluent know about this stuff that you can talk to?
 # reopen, 'adjust' and merge the original PR... hopefully triggering SonaType 
to mark the issue resolved.

> Divide by zero security vulnerability (sonatype-2019-0422)
> --
>
> Key: KAFKA-14660
> URL: https://issues.apache.org/jira/browse/KAFKA-14660
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 3.3.2
>Reporter: Andy Coates
>Assignee: Matthias J. Sax
>Priority: Minor
> Fix For: 3.5.0
>
>
> Looks like SonaType has picked up a "Divide by Zero" issue reported in a PR 
> and, because the PR was never merged, is now reporting it as a security 
> vulnerability in the latest Kafka Streams library.
>  
> See:
>  * [Vulnerability: 
> sonatype-2019-0422]([https://ossindex.sonatype.org/vulnerability/sonatype-2019-0422?component-type=maven&component-name=org.apache.kafka%2Fkafka-streams&utm_source=ossindex-client&utm_medium=integration&utm_content=1.7.0)]
>  * [Original PR]([https://github.com/apache/kafka/pull/7414])
>  
> While it looks from the comments made by [~mjsax] and [~bbejeck] that the 
> divide-by-zero is not really an issue, the fact that it's now being reported 
> as a vulnerability is a problem, especially with regulators.
> PITA, but we should consider either getting this vulnerability removed 
> (Google wasn't very helpful in providing info on how to do this), or fixed 
> (Again, not sure how to tag the fix as fixing this issue).  One option may 
> just be to reopen the PR and merge (and then fix forward by switching it to 
> throw an exception).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-14660) Divide by zero security vulnerability

2023-01-30 Thread Andy Coates (Jira)
Andy Coates created KAFKA-14660:
---

 Summary: Divide by zero security vulnerability
 Key: KAFKA-14660
 URL: https://issues.apache.org/jira/browse/KAFKA-14660
 Project: Kafka
  Issue Type: Bug
  Components: streams
Affects Versions: 3.3.2
Reporter: Andy Coates


Looks like SonaType has picked up a "Divide by Zero" issue reported in a PR 
and, because the PR was never merged, is now reporting it as a security 
vulnerability in the latest Kafka Streams library.

 

See:
 * [Vulnerability: 
sonatype-2019-0422]([https://ossindex.sonatype.org/vulnerability/sonatype-2019-0422?component-type=maven&component-name=org.apache.kafka%2Fkafka-streams&utm_source=ossindex-client&utm_medium=integration&utm_content=1.7.0)]

 * [Original PR](https://github.com/apache/kafka/pull/7414)

 

While it looks from the comments made by [~mjsax] and [~bbejeck] that the 
divide-by-zero is not really an issue, the fact that it's now being reported as 
a vulnerability is a problem, especially with regulators.

PITA, but we should consider either getting this vulnerability removed (Google 
wasn't very helpful in providing info on how to do this), or fixed (Again, not 
sure how to tag the fix as fixing this issue).  One option may just be to 
reopen the PR and merge (and then fix forward by switching it to throw an 
exception).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-10494) Streams: enableSendingOldValues should not call parent if node is itself materialized

2020-09-17 Thread Andy Coates (Jira)
Andy Coates created KAFKA-10494:
---

 Summary: Streams: enableSendingOldValues should not call parent if 
node is itself materialized
 Key: KAFKA-10494
 URL: https://issues.apache.org/jira/browse/KAFKA-10494
 Project: Kafka
  Issue Type: Improvement
  Components: streams
Reporter: Andy Coates


Calling `enableSendingOldValues` on many nodes, e.g. `KTableFilter`, is 
unnecessarily calling `enableSendingOldValues` on the parent, even when the 
processor itself is materialized. This can force the parent table to be 
materialized unnecessarily.

 

For example:

```

StreamsBuilder builder = new StreamsBuilder();

builder
   .table("t1", Consumed.with(...))
   .filter(predicate, Materialized.as("t2"))
   // ... downStreamOps: further downstream operations

```

If `downStreamOps` result in a call to `enableSendingOldValues` on the table 
returned by the `filter` call, i.e. `t2`, then it will result in `t1` being 
materialized unnecessarily.
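A toy model of the proposed behaviour may help. The `Node` class below is entirely hypothetical (it is not the real Kafka Streams internals); the idea is simply that a node with its own store can serve old values itself and so need not escalate the request upstream.

```java
// Toy model of the proposed behaviour (hypothetical names, not the real
// Kafka Streams internals): a node that is itself materialized can serve
// old values from its own store, so it should not force its parent to
// materialize by propagating enableSendingOldValues upstream.
class Node {
    private final boolean materialized;
    private final Node parent;          // null for a source table
    boolean sendOldValues = false;

    Node(Node parent, boolean materialized) {
        this.parent = parent;
        this.materialized = materialized;
    }

    void enableSendingOldValues() {
        sendOldValues = true;
        if (!materialized && parent != null) {
            // Only escalate when there is no store of our own to read from.
            parent.enableSendingOldValues();
        }
    }
}
```

In the `t1`/`t2` example above, the filter node (materialized as "t2") would stop the propagation, leaving `t1` unmaterialized.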


This ticket was raised off the back of [comments in a 
PR](https://github.com/apache/kafka/pull/9156#discussion_r490152263) while 
working on https://issues.apache.org/jira/browse/KAFKA-10077.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-10077) Filter downstream of state-store results in spurious tombstones

2020-06-01 Thread Andy Coates (Jira)
Andy Coates created KAFKA-10077:
---

 Summary: Filter downstream of state-store results in spurious 
tombstones
 Key: KAFKA-10077
 URL: https://issues.apache.org/jira/browse/KAFKA-10077
 Project: Kafka
  Issue Type: Bug
  Components: streams
Affects Versions: 2.5.0
Reporter: Andy Coates


Adding a `filter` call downstream of anything that has a state store, e.g. a 
table source, results in spurious tombstones being emitted from the topology 
for any key where a new entry doesn't match the filter, _even when no previous 
value existed for the row_.

To put this another way: a filter downstream of a state-store will output a 
tombstone on an INSERT that doesn't match the filter, when it should only output 
a tombstone on an UPDATE.

 

This code shows the problem:


{code:java}
final StreamsBuilder builder = new StreamsBuilder();

builder
 .table("table", Materialized.with(Serdes.Long(), Serdes.Long()))
 .filter((k, v) -> v % 2 == 0)
 .toStream()
 .to("bob");

final Topology topology = builder.build();

final Properties props = new Properties();
props.put("application.id", "fred");
props.put("bootstrap.servers", "who cares");

final TopologyTestDriver driver = new TopologyTestDriver(topology, props);

final TestInputTopic<Long, Long> input = driver
 .createInputTopic("table", Serdes.Long().serializer(), Serdes.Long().serializer());

input.pipeInput(1L, 2L);
input.pipeInput(1L, 1L);
input.pipeInput(2L, 1L);

final TestOutputTopic<Long, Long> output = driver
 .createOutputTopic("bob", Serdes.Long().deserializer(), Serdes.Long().deserializer());

final List<KeyValue<Long, Long>> keyValues = output.readKeyValuesToList();

// keyValues contains:
// 1 -> 2
// 1 -> null <-- correct tombstone: deletes previous row.
// 2 -> null <-- spurious tombstone: no previous row.
{code}
 

These spurious tombstones can cause a LOT of noise when, for example, the 
filter is looking for a specific key.  In such a situation, _every input record 
that does not have that key results in a tombstone_, meaning there are many 
more tombstones than useful data.

 

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] KIP-598: Augment TopologyDescription with store and source / sink serde information

2020-05-11 Thread Andy Coates
Hi Guozhang,

Thanks for writing this up. I’m very interested to see this, so I hope you 
don’t mind me commenting.

I’ve only really one comment to make, and that’s on the text printed for the 
serde classes:

As I understand it, the name will either come from the passed in config, or may 
default to “unknown”, or may be obtained from the instances passed while 
building the topology. It’s this latter case that interests me.  Where you have 
an actual serde instance could we not output more information?

The examples use simple (de)serialization classes such as `LongDeserializer`, 
where the name alone imparts all the information the user is likely to need. 
However, users may provide their own custom serialisers, and such serialisers 
may contain state that is important, e.g. the serialiser may know the schema of 
the data being serialized.  Might there be benefit in taking the `toString()` 
representation of the serialiser?

Of course, this would require adding suitable `toString` impls to our own stock 
serialisers, but may ultimately prove more versatile in the future.  The 
downside is potential to corrupt the topology description, e.g. a toString that 
includes new lines etc.
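To illustrate the suggestion, here is a hypothetical serialiser (not the real Kafka `Serializer` interface, and all names are invented) whose `toString()` surfaces the schema it knows about, kept to a single line so it cannot corrupt the topology description:

```java
// Hypothetical sketch (not the real Kafka Serializer interface): a stateful
// serialiser exposes what it knows, e.g. its schema, via toString(), so a
// topology description could print more than just the class name.
class SchemaAwareSerializer {
    private final String schemaName;
    private final int schemaVersion;

    SchemaAwareSerializer(String schemaName, int schemaVersion) {
        this.schemaName = schemaName;
        this.schemaVersion = schemaVersion;
    }

    @Override
    public String toString() {
        // Single line, no newlines: safe to embed in the description output.
        return "SchemaAwareSerializer(schema=" + schemaName
                + ", version=" + schemaVersion + ")";
    }
}
```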

Thanks,

Andy



> On 4 May 2020, at 19:27, Bruno Cadonna  wrote:
> 
> Hi Guozhang,
> 
> Thank you for the KIP!
> 
> Exposing also the inner types of the wrapper serdes would be
> important. For debugging as Matthias has already mentioned and to see
> more easily changes that are applied to a topology.
> 
> I am also +1 on the `toJson()` method to easily access the topology
> description programmatically and to make the description backward
> compatible.
> 
> Regarding `List<String> serdeNames();`, I would be in favour of a more
> expressive return type, like a Map that assigns labels to Serde names.
> For example, for key and value serdes the label could be "key" and
> "value". Or something similar.
> 
> Best,
> Bruno
> 
> 
> 
> On Mon, May 4, 2020 at 2:25 AM Guozhang Wang  wrote:
>> 
>> Hello Matthias John, thanks for your comments!! Replied them inline.
>> 
>> I think there are a couple open questions that I'd like to hear your
>> opinions on with the context:
>> 
>> a. For stores's serdes, the reason I proposed to expose a set of serde
>> names instead of a pair of key / value serdes is for future possible store
>> types which may not be key-values. I admit it could just be over-killing
>> here so if you have a strong preference on the latter, I could be convinced
>> to change that part but I'd want to make the original motivation clear.
>> 
>> b. I think I'm convinced that I'd just augment the `toString` result
>> regardless of which func generated the Topology (and hence its
>> TopologyDescription), note this would mean that we break the compatibility
>> of the current `toString` function. As a remedy for that, we will also
>> expose a `toJson` function for programmatic purposes.
>> 
>> Guozhang
>> 
>> 
>>> (1) In the new TopologyDescription output, the line for the
>>> windowed-count processor is:
>>> 
 Processor: myname (stores: [(myname-store, serdes:
>> [SessionWindowedSerde, FullChangeSerde])])
>>> 
>>> For this case, both Serdes are wrappers and user would actually only
>>> specified wrapped Serdes for the key and value. Can we do anything about
>>> this? Otherwise, there might still be a runtime `ClassCastException`
>>> that a user cannot easily debug.
>>> 
>>> 
>>> (2) Nit: The JavaDocs of `Processor#storeSet()` seems to be incorrect
>>> (it says "The names of all connected stores." -- guess it's c&p error)?
>>> 
>> Yes! Fixed.
>> 
>>> 
>>> (3) The KIP mentioned to add `Store#changelogTopic()` method, but the
>>> output of `TopologyDescription#toString()` does not contain it. I think
>>> it might be good do add it, too?
>>> 
>> Yes, that's right. I'm going to add to the example as well.
>> 
>>> 
>>> (4) The KIP also list https://issues.apache.org/jira/browse/KAFKA-9913
>>> but it seems not to address it yet?
>>> 
>> I actually did intent to have it addressed; the proposal includes:
>> 
>> a. Return the set of source / sink nodes of a sub-topology, and their
>> corresponding source / sink topics could be accessed.
>> b. Return the set of stores of a sub-topology, and their corresponding
>> changelog topics could be accessed.
>> 
>> The reason I did not choose to just expose the set of all topics directly,
>> but indirectly, is stated in the wiki:
>> 
>> "the reason we did not expose APIs for topic names directly is that for
>> source nodes, it is possible to have Pattern and for sink nodes, it is
>> possible to have topic-extractors, and hence it's better to let users
>> leveraging on the lower-level APIs to construct the topic names
>> programmatically themselves."
>> 
>>> 
>>> (5) As John, I also noticed that `List<String> Store#serdeNames()` is
>>> not a great API. I am not sure if I understand your reply though.
>>> AFAIK, there is no existing API
>>> 
 List StoreBuilder#serdes()
>>> 
>>> (cf
>>> 
>> https://github.c

Re: [VOTE] KIP-594: Expose output topic names from TopologyTestDriver

2020-04-23 Thread Andy Coates
Vote is: +4 binding for, no against.

Thanks all!



On Thu, 23 Apr 2020 at 22:05, Guozhang Wang  wrote:

> +1. Thanks Andy!
>
>
> Guozhang
>
> On Thu, Apr 23, 2020 at 6:53 AM Bill Bejeck  wrote:
>
> > Hi Andy,
> >
> > Thanks for the KIP.  It's a + 1(binding) for me.
> >
> > -Bill
> >
> > On Thu, Apr 23, 2020 at 9:45 AM John Roesler 
> wrote:
> >
> > > Hi Andy,
> > >
> > > I just took another look, and it looks good to me! Thanks for the
> > > contribution.
> > >
> > > I’m +1 (binding)
> > >
> > > -John
> > >
> > > On Thu, Apr 23, 2020, at 06:49, Andy Coates wrote:
> > > > Bump
> > > >
> > > > On Thu, 16 Apr 2020 at 00:48, Andy Coates  wrote:
> > > >
> > > > > Hey all,
> > > > >
> > > > > I would like to start the vote for KIP-594:
> > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-594%3A+Expose+output+topic+names+from+TopologyTestDriver
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Andy
> > > > >
> > > > >
> > > > >
> > > >
> > >
> >
>
>
> --
> -- Guozhang
>


Re: [VOTE] KIP-594: Expose output topic names from TopologyTestDriver

2020-04-23 Thread Andy Coates
Bump

On Thu, 16 Apr 2020 at 00:48, Andy Coates  wrote:

> Hey all,
>
> I would like to start the vote for KIP-594:
> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-594%3A+Expose+output+topic+names+from+TopologyTestDriver>
>
> Thanks,
>
> Andy
>
>
>


Re: [DISCUSS] KIP-594: Expose output topic names from TopologyTestDriver

2020-04-20 Thread Andy Coates
speed you to a successful vote!
> > >
> > > Thanks,
> > > John
> > >
> > > On Tue, Apr 14, 2020, at 19:49, Matthias J. Sax wrote:
> > >> Andy,
> > >>
> > >> thanks for the KIP. The motivation is a little unclear to me:
> > >>
> > >>> This information allows all the outputs of a test run to be captured
> without prior knowledge of the output topics.
> > >>
> > >> Given that the TTD users writes the `Topology` themselves, they should
> > >> always have prior knowledge what output topics they use. So why would
> > >> this be useful?
> > >>
> > >> Also, there is `Topology#describe()` to get all topic names (even the
> > >> name of internal topics -- to be fair, changelog topic names are not
> > >> exposed, only store names, but those can we used to infer changelog
> > >> topic names, too).
> > >>
> > >> Can you elaborate about the motivation? So far, it's not convincing
> to me.
> > >>
> > >>
> > >>
> > >> -Matthias
> > >>
> > >>
> > >> On 4/14/20 8:09 AM, Andy Coates wrote:
> > >>> Hey all,
> > >>> I would like to start off the discussion for KIP-594:
> > >>>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-594%3A+Expose+output+topic+names+from+TopologyTestDriver
> > >>>
> > >>> This KIP proposes to expose the names of the topics to which a topology
> > >>> produces records during a test run from the TopologyTestDriver class.
> > >>>
> > >>> Let me know your thoughts!
> > >>> Andy
> > >>>
> > >>
> > >>
> > >> Attachments:
> > >> * signature.asc
> >
> >
> > Attachments:
> > * signature.asc
>


[VOTE] KIP-594: Expose output topic names from TopologyTestDriver

2020-04-15 Thread Andy Coates
Hey all,

I would like to start the vote for KIP-594:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-594%3A+Expose+output+topic+names+from+TopologyTestDriver

Thanks,

Andy


[DISCUSS] KIP-594: Expose output topic names from TopologyTestDriver

2020-04-14 Thread Andy Coates
Hey all,
I would like to start off the discussion for KIP-594:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-594%3A+Expose+output+topic+names+from+TopologyTestDriver

This KIP proposes to expose the names of the topics to which a topology produces
records during a test run from the TopologyTestDriver class.

Let me know your thoughts!
Andy


[jira] [Created] (KAFKA-9865) Expose output topic names from TopologyTestDriver

2020-04-14 Thread Andy Coates (Jira)
Andy Coates created KAFKA-9865:
--

 Summary: Expose output topic names from TopologyTestDriver
 Key: KAFKA-9865
 URL: https://issues.apache.org/jira/browse/KAFKA-9865
 Project: Kafka
  Issue Type: Bug
  Components: streams-test-utils
Affects Versions: 2.4.1
Reporter: Andy Coates


Expose the output topic names from TopologyTestDriver, i.e. 
`outputRecordsByTopic.keySet()`.

This is useful to users of the test driver, as they can use it to determine the 
names of all output topics, which can then be used to capture all the output of 
a topology without having to manually list the output topics.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-9668) Iterating over KafkaStreams.getAllMetadata() results in ConcurrentModificationException

2020-03-05 Thread Andy Coates (Jira)
Andy Coates created KAFKA-9668:
--

 Summary: Iterating over KafkaStreams.getAllMetadata() results in 
ConcurrentModificationException
 Key: KAFKA-9668
 URL: https://issues.apache.org/jira/browse/KAFKA-9668
 Project: Kafka
  Issue Type: Bug
  Components: streams
Affects Versions: 0.10.1.0
Reporter: Andy Coates
Assignee: Andy Coates


`KafkaStreams.getAllMetadata()` returns 
`StreamsMetadataState.getAllMetadata()`. While the latter method is 
`synchronized`, it returns a reference to internal mutable state.  Not only does 
this break encapsulation, but it means any thread iterating over the returned 
collection when the metadata gets rebuilt will encounter a 
`ConcurrentModificationException`.
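The failure mode is easy to reproduce in isolation. The sketch below is hypothetical (it is not the `StreamsMetadataState` code), but it shows why handing out the internal collection breaks iterating callers when a rebuild happens, and how a defensive snapshot avoids it.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.ConcurrentModificationException;
import java.util.List;

// Hypothetical sketch, not the real StreamsMetadataState: returning a
// reference to an internal mutable list means a rebuild invalidates
// iterators already handed out to callers.
class MetadataState {
    private final List<String> allMetadata =
            new ArrayList<>(List.of("host1", "host2"));

    // Leaky: exposes the internal collection directly.
    Collection<String> getAllMetadataLeaky() {
        return allMetadata;
    }

    // Safe: returns an unmodifiable snapshot, so later rebuilds cannot
    // affect collections already handed out.
    Collection<String> getAllMetadataSnapshot() {
        return Collections.unmodifiableList(new ArrayList<>(allMetadata));
    }

    // Simulates the metadata being rebuilt on another thread.
    void rebuild() {
        allMetadata.clear();
        allMetadata.add("host3");
        allMetadata.add("host4");
    }
}
```

Iterating the leaky collection while `rebuild()` runs throws `ConcurrentModificationException`; a snapshot taken before the rebuild remains safely iterable.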



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-9667) Connect JSON serde strip trailing zeros

2020-03-05 Thread Andy Coates (Jira)
Andy Coates created KAFKA-9667:
--

 Summary: Connect JSON serde strip trailing zeros
 Key: KAFKA-9667
 URL: https://issues.apache.org/jira/browse/KAFKA-9667
 Project: Kafka
  Issue Type: Bug
  Components: KafkaConnect
Reporter: Andy Coates
Assignee: Andy Coates


The Connect JSON serde was recently enhanced to support serializing decimals as 
standard JSON numbers, e.g. `1.23`.  However, there is a bug in the 
implementation: it's stripping trailing zeros!  `1.23` is _not_ the same as 
`1.230`.  Trailing zeros should not be dropped when (de)serializing decimals.
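Java's `BigDecimal` makes the distinction concrete: `equals` treats scale as part of the value, so stripping trailing zeros is lossy. A small self-contained demonstration:

```java
import java.math.BigDecimal;

public class TrailingZeros {
    public static void main(String[] args) {
        BigDecimal a = new BigDecimal("1.23");   // scale 2
        BigDecimal b = new BigDecimal("1.230");  // scale 3

        // equals() compares value AND scale, so these are distinct values.
        System.out.println(a.equals(b));         // false
        // compareTo() ignores scale: numerically they are equal.
        System.out.println(a.compareTo(b));      // 0

        // A serde that strips trailing zeros silently changes the scale,
        // so round-tripping a decimal is no longer lossless.
        System.out.println(b.stripTrailingZeros().scale()); // 2
    }
}
```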



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-9416) Streams get stuck in `PENDING_SHUTDOWN` if underlying topics are deleted

2020-01-13 Thread Andy Coates (Jira)
Andy Coates created KAFKA-9416:
--

 Summary: Streams get stuck in `PENDING_SHUTDOWN` if underlying 
topics are deleted.
 Key: KAFKA-9416
 URL: https://issues.apache.org/jira/browse/KAFKA-9416
 Project: Kafka
  Issue Type: Bug
  Components: streams
Affects Versions: 2.4.0
Reporter: Andy Coates


We've noticed that if topics are deleted from under a running topology, e.g. 
repartition, changelog or sink topics, then the stream threads transition from 
`RUNNING` to `PENDING_SHUTDOWN`, but then do not transition to `ERROR`.

Likewise, if a Kafka cluster has auto topic create disabled and a topology is 
started where its sink topic(s) do not exist, then the topology similarly gets 
stuck in `PENDING_SHUTDOWN`.

Once the topology is stuck in `PENDING_SHUTDOWN`, any call to close the topology 
blocks, as per https://issues.apache.org/jira/browse/KAFKA-9398.

We would like to see Kafka Streams handle this case and correctly transition to 
`ERROR` state.

The cause of this is covered in more detail here: 
https://github.com/confluentinc/ksql/issues/4268#issuecomment-573020281



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] KIP-476: Add Java AdminClient interface

2019-08-06 Thread Andy Coates
Hi all,

Just a quick note to let you all know that the KIP ran into a slight hiccup
along the way.  The original change saw the return value of
`KafkaClientSupplier.getAdminClient` changed from `AdminClient` to the new
`Admin`, thereby allowing implementers to return a proxy if they so
wanted.  However, changing the return value from the class to the interface
was a binary incompatible change - doh!  Hence the KIP has been updated to
instead leave `getAdminClient`'s signature as-is and just deprecate it in
favour of a new `Admin getAdmin(final Map<String, Object> config)` method.

PR here: https://github.com/apache/kafka/pull/7162

Let me know if anyone has any issues / suggestions, etc.

Andy

On Fri, 12 Jul 2019 at 20:35, Andy Coates  wrote:

> Awesome sauce - so I'd like to close the voting. Final vote was:
>
> 4 for binding, none against
> 3 non-binding, none against.
>
> I'll update the KIP to reflect the passing of the vote.
>
> Thanks you all for your time & brain power,
>
> Andy
>
> On Thu, 11 Jul 2019 at 14:51, Rajini Sivaram 
> wrote:
>
>> +1 (binding)
>>
>> Thanks for the KIP, Andy!
>>
>> Regards,
>>
>> Rajini
>>
>>
>> On Thu, Jul 11, 2019 at 1:18 PM Gwen Shapira  wrote:
>>
>> > +1 (binding)
>> >
>> > Thank you for the improvement.
>> >
>> > On Thu, Jul 11, 2019, 3:53 AM Andy Coates  wrote:
>> >
>> > > Hi All,
>> > >
>> > > So voting currently stands on:
>> > >
>> > > Binding:
>> > > +1 Matthias,
>> > > +1 Colin
>> > >
>> > > Non-binding:
>> > > +1  Thomas Becker
>> > > +1 Satish Guggana
>> > > +1 Ryan Dolan
>> > >
>> > > So we're still 1 binding vote short. :(
>> > >
>> > >
>> > > On Wed, 3 Jul 2019 at 23:08, Matthias J. Sax 
>> > > wrote:
>> > >
>> > > > Thanks for the details Colin and Andy.
>> > > >
>> > > > My intent was not to block the KIP, but it seems to be a fair
>> question
>> > > > to ask.
>> > > >
>> > > > I talked to Ismael offline about it and understand his reasoning
>> better
>> > > > now. If we don't deprecate `abstract AdminClient` class, it seems
>> > > > reasonable to not deprecate the corresponding factory methods
>> either.
>> > > >
>> > > >
>> > > > +1 (binding) on the current proposal
>> > > >
>> > > >
>> > > >
>> > > > -Matthias
>> > > >
>> > > > On 7/3/19 5:03 AM, Andy Coates wrote:
>> > > > > Matthias,
>> > > > >
>> > > > > I was referring to platforms such as spark or flink that support
>> > > multiple
>> > > > > versions of the Kafka clients. Ismael mentioned this higher up on
>> the
>> > > > > thread.
>> > > > >
>> > > > > I'd prefer this KIP didn't get held up over somewhat unrelated
>> > change,
>> > > > i.e.
>> > > > > should the factory method be on the interface or utility class.
>> > > Surely,
>> > > > > now would be a great time to change this if we wanted, but we can
>> > also
>> > > > > change this later if we need to.  In the interest of moving
>> forward,
>> > > can
>> > > > I
>> > > > > propose we leave the factory methods as they are in the KIP?
>> > > > >
>> > > > > Thanks,
>> > > > >
>> > > > > Andy
>> > > > >
>> > > > > On Tue, 2 Jul 2019 at 17:14, Colin McCabe 
>> > wrote:
>> > > > >
>> > > > >> On Tue, Jul 2, 2019, at 09:14, Colin McCabe wrote:
>> > > > >>> On Mon, Jul 1, 2019, at 23:30, Matthias J. Sax wrote:
>> > > > >>>> Not sure, if I understand the argument?
>> > > > >>>>
>> > > > >>>> Why would anyone need to support multiple client side versions?
>> > > > >>>> Clients/brokers are forward/backward compatible anyway.
>> > > > >>>
>> > > > >>> When you're using many different libraries, it is helpful if
>> they
>> > > don't
>> > > > >>> impose tight constraints on what versions their dependencies
>>

Re: [DISCUSS] KIP-481: SerDe Improvements for Connect Decimal type in JSON

2019-07-29 Thread Andy Coates
The way I see it, we need to control two separate things:

1. How do we _deserialize_ a decimal type if we encounter a text node in
the JSON? (We should _always_ be able to deserialize a standard JSON
number as a decimal).
2. How do we choose how we want decimals to be _serialized_.

This looks to fit well with your second suggestion of slightly different
config names for serialization vs deserialization.
a. For deserialization we only care about how to handle text nodes: 
`deserialization.decimal.*text*.format`, which should only have two valid
values BINARY | TEXT.
b. For serialization we need all three: `serialization.decimal.format`,
which should support all three options: BINARY | TEXT | NUMERIC.

Implementation wise, I think these should be two separate enums, rather
than one shared enum and throwing an error if the deserializer is set to
NUMERIC.  Mainly as this means the enums reflect the options available,
rather than this being hidden in config checking code.  But that's a minor
implementation detail.
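A sketch of the two-enum idea (all names hypothetical, loosely following the configs discussed above): each enum lists exactly the values its config accepts, so an invalid value fails up front at config-parse time rather than in hidden checking code.

```java
// Hypothetical names, not the actual Connect config classes: the
// deserializer's options are a strict subset, so each config gets its
// own enum listing exactly the values it accepts.
enum DecimalSerializationFormat { BINARY, TEXT, NUMERIC }

enum DecimalDeserializationTextFormat { BINARY, TEXT }

class DecimalFormatConfig {
    final DecimalSerializationFormat serialization;
    final DecimalDeserializationTextFormat deserialization;

    DecimalFormatConfig(String serProp, String deserProp) {
        // valueOf rejects invalid values immediately, e.g. NUMERIC is
        // simply not a member of the deserialization enum.
        this.serialization = DecimalSerializationFormat.valueOf(serProp);
        this.deserialization = DecimalDeserializationTextFormat.valueOf(deserProp);
    }
}
```

With this shape, setting the deserialization config to NUMERIC is a parse-time `IllegalArgumentException` rather than a special case buried in validation code.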

Personally, I'd be tempted to have the BINARY value named something like
`LEGACY` or `LEGACY_BINARY` as a way of encouraging users to move away from
it.

It's a real shame that both of these settings require a default of BINARY
for backwards compatibility, but I agree that discussions / plans around
switching the defaults should not block this KIP.

Andy


On Thu, 25 Jul 2019 at 18:26, Almog Gavra  wrote:

> Thanks for the replies Andy and Andrew (2x Andy?)!
>
> > Is the text decimal a base16 encoded number, or is it base16 encoded
> binary
> > form of the number?
>
> The conversion happens as decimal.unscaledValue().toByteArray() and then
> the byte array is converted to a hex string, so it's definitely the binary
> form of the number converted to base16. Whether or not that's the same as
> the base16 encoded number is a good question (toByteArray returns a byte
> array containing a signed, big-endian, two's complement representation of
> the big integer).
>
> > One suggestion I have is to change the proposed new config to only affect
> > decimals stored as text, i.e. to switch between the current base16 and
> the
> > more common base10.   Then add another config to the serializer only that
> > controls if decimals should be serialized as text or numeric.
>
> I think we need to be able to handle all mappings from serialization format
> to deserialization format (e.g. read in BINARY and output TEXT), which I
> think would be impossible with the alternative suggestion. I agree that
> automatically deserializing numerics is valuable. I see two other ways to
> get this, both keeping the serialization.format config the same:
>
> - have json.decimal.deserialization.format accept all three formats. if set
> to BINARY/TEXT, numerics would be automatically supported. If set to
> NUMERIC, then any string coming in would result in deserialization error
> (defaults to BINARY for backwards compatibility)
> - change json.decimal.deserialization.format to
> json.decimal.deserialization.string.format which accepts only BINARY/TEXT
> (defaults to BINARY for backwards compatibility)
>
> > would be a breaking change in that things that previously failed would
> > suddenly start deserializing.  This is a price I'm willing to pay.
>
> I agree. I'm willing to pay this price too.
>
> > IMHO, we should then plan to switch the default of decimal serialization
> to
> > numeric, and text serialization to base 10 in the next major release.
>
> I think that can be a separate discussion, I don't want to block this KIP
> on it.
>
> Thoughts?
>
> On Thu, Jul 25, 2019 at 6:35 AM Andrew Otto  wrote:
>
> > This is a bit orthogonal, but in JsonSchemaConverter I use JSONSchemas to
> > indicate whether a JSON number should be deserialized as an integer or a
> > decimal
> > <
> >
> https://github.com/ottomata/kafka-connect-jsonschema/blob/master/src/main/java/org/wikimedia/kafka/connect/jsonschema/JsonSchemaConverter.java#L251-L261
> > >.
> > Not everyone is going to have JSONSchemas available when converting, but
> if
> > you do, it is an easy way to support JSON numbers as decimals.
> >
> > Carry on! :)
> >
> > On Thu, Jul 25, 2019 at 9:12 AM Andy Coates  wrote:
> >
> > > Hi Almog,
> > >
> > > Like the KIP - I think being able to support decimals in JSON in the
> same
> > > way most other systems do is a great improvement.
> > >
> > > It's not 100% clear to me from the KIP what the current format is.  Is
> > the
> > > text decimal a base16 encoded number, or is it base16 encoded binary
> form
> > > of the number? (I've not tried

Re: [DISCUSS] KIP-481: SerDe Improvements for Connect Decimal type in JSON

2019-07-25 Thread Andy Coates
Hi Almog,

Like the KIP - I think being able to support decimals in JSON in the same
way most other systems do is a great improvement.

It's not 100% clear to me from the KIP what the current format is.  Is the
text decimal a base16 encoded number, or is it base16 encoded binary form
of the number? (I've not tried to get my head around if these two are even
different!)

One suggestion I have is to change the proposed new config to only affect
decimals stored as text, i.e. to switch between the current base16 and the
more common base10. Then add another config to the serializer only that
controls if decimals should be serialized as text or numeric.  The benefit
of this approach is it allows us to enhance the deserializer to
automatically handle numeric decimals even without any config having to be
set, i.e. default config in the deserializer would be able to handle
numeric decimals. Of course, this is a double-edged sword: it would make
the deserializer work out of the box with numeric decimals (yay!), but
would be a breaking change in that things that previously failed would
suddenly start deserializing.  This is a price I'm willing to pay.

IMHO, we should then plan to switch the default of decimal serialization to
numeric, and text serialization to base 10 in the next major release.
(With upgrade notes to match). Though I know this is more contentious, I
think it moves us forward in a much more standard way than the current
encoding of decimals.
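To make the base16 question above concrete, here is a minimal standalone sketch (plain Java, not Connect code; `HexFormat` needs Java 17+) of the current "hex of the unscaled bytes" form next to the more common base10 text form:

```java
import java.math.BigDecimal;
import java.util.HexFormat; // Java 17+

// Standalone sketch of the two text encodings in question: the current
// base16 dump of the unscaled bytes versus plain base10 text.
public class DecimalEncodings {
    public static void main(String[] args) {
        BigDecimal d = new BigDecimal("10.2345"); // scale 4, unscaled value 102345

        // Current "binary" form: signed, big-endian, two's complement bytes of
        // the unscaled value, rendered as hex. Note it encodes 102345, not 10.2345.
        byte[] unscaled = d.unscaledValue().toByteArray();
        String base16OfBytes = HexFormat.of().formatHex(unscaled);

        // The more common base10 text form:
        String base10 = d.toPlainString();

        System.out.println(base16OfBytes); // prints "018fc9"
        System.out.println(base10);        // prints "10.2345"
    }
}
```

Note the hex string is the encoding of the unscaled integer's byte array, so the scale has to be carried separately (e.g. in the schema) to recover the original value.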

On Tue, 25 Jun 2019 at 01:03, Almog Gavra  wrote:

> Hi Everyone!
>
> Kicking off discussion for a new KIP:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-481%3A+SerDe+Improvements+for+Connect+Decimal+type+in+JSON
>
> For those who are interested, I have a prototype implementation that helped
> guide my design: https://github.com/agavra/kafka/pull/1
>
> Cheers,
> Almog
>


Re: [VOTE] KIP-476: Add Java AdminClient interface

2019-07-12 Thread Andy Coates
Awesome sauce - so I'd like to close the voting. The final vote was:

4 binding in favour, none against.
3 non-binding in favour, none against.

I'll update the KIP to reflect the passing of the vote.

Thank you all for your time & brain power,

Andy

On Thu, 11 Jul 2019 at 14:51, Rajini Sivaram 
wrote:

> +1 (binding)
>
> Thanks for the KIP, Andy!
>
> Regards,
>
> Rajini
>
>
> On Thu, Jul 11, 2019 at 1:18 PM Gwen Shapira  wrote:
>
> > +1 (binding)
> >
> > Thank you for the improvement.
> >
> > On Thu, Jul 11, 2019, 3:53 AM Andy Coates  wrote:
> >
> > > Hi All,
> > >
> > > So voting currently stands on:
> > >
> > > Binding:
> > > +1 Matthias,
> > > +1 Colin
> > >
> > > Non-binding:
> > > +1  Thomas Becker
> > > +1 Satish Guggana
> > > +1 Ryan Dolan
> > >
> > > So we're still 1 binding vote short. :(
> > >
> > >
> > > On Wed, 3 Jul 2019 at 23:08, Matthias J. Sax 
> > > wrote:
> > >
> > > > Thanks for the details Colin and Andy.
> > > >
> > > > My intent was not to block the KIP, but it seems to be a fair
> question
> > > > to ask.
> > > >
> > > > I talked to Ismael offline about it and understand his reasoning
> better
> > > > now. If we don't deprecate `abstract AdminClient` class, it seems
> > > > reasonable to not deprecate the corresponding factory methods either.
> > > >
> > > >
> > > > +1 (binding) on the current proposal
> > > >
> > > >
> > > >
> > > > -Matthias
> > > >
> > > > On 7/3/19 5:03 AM, Andy Coates wrote:
> > > > > Matthias,
> > > > >
> > > > > I was referring to platforms such as spark or flink that support
> > > multiple
> > > > > versions of the Kafka clients. Ismael mentioned this higher up on
> the
> > > > > thread.
> > > > >
> > > > > I'd prefer this KIP didn't get held up over a somewhat unrelated
> > change,
> > > > i.e.
> > > > > should the factory method be on the interface or utility class.
> > > Surely,
> > > > > now would be a great time to change this if we wanted, but we can
> > also
> > > > > change this later if we need to.  In the interest of moving
> forward,
> > > can
> > > > I
> > > > > propose we leave the factory methods as they are in the KIP?
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Andy
> > > > >
> > > > > On Tue, 2 Jul 2019 at 17:14, Colin McCabe 
> > wrote:
> > > > >
> > > > >> On Tue, Jul 2, 2019, at 09:14, Colin McCabe wrote:
> > > > >>> On Mon, Jul 1, 2019, at 23:30, Matthias J. Sax wrote:
> > > > >>>> Not sure, if I understand the argument?
> > > > >>>>
> > > > >>>> Why would anyone need to support multiple client side versions?
> > > > >>>> Clients/brokers are forward/backward compatible anyway.
> > > > >>>
> > > > >>> When you're using many different libraries, it is helpful if they
> > > don't
> > > > >>> impose tight constraints on what versions their dependencies are.
> > > > >>> Otherwise you can easily get in a situation where the constraints
> > > can't
> > > > >>> be satisfied.
> > > > >>>
> > > > >>>>
> > > > >>>> Also, if one really supports multiple client side versions,
> won't
> > > they
> > > > >>>> use multiple shaded dependencies for different versions?
> > > > >>>
> > > > >>> Shading the Kafka client doesn't really work, because of how we
> use
> > > > >> reflection.
> > > > >>>
> > > > >>>>
> > > > >>>> Last, it's possible to suppress warnings (at least in Java).
> > > > >>>
> > > > >>> But not in Scala.  So that does not help (for example), Scala
> > users.
> > > > >>
> > > > >> I meant to write "Spark users" here.
> > > > >>
> > > > >> C.
> > > > >>

Re: [VOTE] KIP-476: Add Java AdminClient interface

2019-07-11 Thread Andy Coates
Hi All,

So voting currently stands on:

Binding:
+1 Matthias,
+1 Colin

Non-binding:
+1  Thomas Becker
+1 Satish Guggana
+1 Ryan Dolan

So we're still 1 binding vote short. :(


On Wed, 3 Jul 2019 at 23:08, Matthias J. Sax  wrote:

> Thanks for the details Colin and Andy.
>
> My intent was not to block the KIP, but it seems to be a fair question
> to ask.
>
> I talked to Ismael offline about it and understand his reasoning better
> now. If we don't deprecate `abstract AdminClient` class, it seems
> reasonable to not deprecate the corresponding factory methods either.
>
>
> +1 (binding) on the current proposal
>
>
>
> -Matthias
>
> On 7/3/19 5:03 AM, Andy Coates wrote:
> > Matthias,
> >
> > I was referring to platforms such as spark or flink that support multiple
> > versions of the Kafka clients. Ismael mentioned this higher up on the
> > thread.
> >
> > I'd prefer this KIP didn't get held up over a somewhat unrelated change,
> i.e.
> > should the factory method be on the interface or utility class.  Surely,
> > now would be a great time to change this if we wanted, but we can also
> > change this later if we need to.  In the interest of moving forward, can
> I
> > propose we leave the factory methods as they are in the KIP?
> >
> > Thanks,
> >
> > Andy
> >
> > On Tue, 2 Jul 2019 at 17:14, Colin McCabe  wrote:
> >
> >> On Tue, Jul 2, 2019, at 09:14, Colin McCabe wrote:
> >>> On Mon, Jul 1, 2019, at 23:30, Matthias J. Sax wrote:
> >>>> Not sure, if I understand the argument?
> >>>>
> >>>> Why would anyone need to support multiple client side versions?
> >>>> Clients/brokers are forward/backward compatible anyway.
> >>>
> >>> When you're using many different libraries, it is helpful if they don't
> >>> impose tight constraints on what versions their dependencies are.
> >>> Otherwise you can easily get in a situation where the constraints can't
> >>> be satisfied.
> >>>
> >>>>
> >>>> Also, if one really supports multiple client side versions, won't they
> >>>> use multiple shaded dependencies for different versions?
> >>>
> >>> Shading the Kafka client doesn't really work, because of how we use
> >> reflection.
> >>>
> >>>>
> >>>> Last, it's possible to suppress warnings (at least in Java).
> >>>
> >>> But not in Scala.  So that does not help (for example), Scala users.
> >>
> >> I meant to write "Spark users" here.
> >>
> >> C.
> >>
> >>>
> >>> I agree that in general we should be using deprecation when
> >>> appropriate, regardless of the potential annoyances to users.  But I'm
> >>> not sure deprecating this method is really worth it.
> >>>
> >>> best,
> >>> Colin
> >>>
> >>>
> >>>>
> >>>> Can you elaborate?
> >>>>
> >>>> IMHO, just adding a statement to JavaDocs is a little weak, and at
> some
> >>>> point, we need to deprecate those methods anyway if we ever want to
> >>>> remove them. The earlier we deprecate them, the earlier we can remove
> >> them.
> >>>>
> >>>>
> >>>> -Matthias
> >>>>
> >>>> On 7/1/19 4:22 AM, Andy Coates wrote:
> >>>>> Done. I've not deprecated the factory methods on the AdminClient for
> >> the
> >>>>> same reason the AdminClient itself is not deprecated, i.e. this
> >> would cause
> >>>>> unavoidable warnings for libraries / platforms that support multiple
> >>>>> versions of Kafka. However, I think we add a note to the Java docs of
> >>>>> `AdminClient` to indicate that its use, going forward, is
> >> discouraged in
> >>>>> favour of the new `Admin` interface and explain why it's not been
> >>>>> deprecated, but that it may/will be removed in a future version.
> >>>>>
> >>>>> Regarding factory methods on interfaces: there seems to be some
> >> difference
> >>>>> of opinion here. I'm not sure of the best approach to resolve this.
> >> At the
> >>>>> moment the KIP has factory methods on the new `Admin` interface,
> >> rather
> >>>>> than some utility cla

Re: [VOTE] KIP-476: Add Java AdminClient interface

2019-07-03 Thread Andy Coates
Matthias,

I was referring to platforms such as spark or flink that support multiple
versions of the Kafka clients. Ismael mentioned this higher up on the
thread.

I'd prefer this KIP didn't get held up over a somewhat unrelated change, i.e.
should the factory method be on the interface or utility class.  Surely,
now would be a great time to change this if we wanted, but we can also
change this later if we need to.  In the interest of moving forward, can I
propose we leave the factory methods as they are in the KIP?

Thanks,

Andy

On Tue, 2 Jul 2019 at 17:14, Colin McCabe  wrote:

> On Tue, Jul 2, 2019, at 09:14, Colin McCabe wrote:
> > On Mon, Jul 1, 2019, at 23:30, Matthias J. Sax wrote:
> > > Not sure, if I understand the argument?
> > >
> > > Why would anyone need to support multiple client side versions?
> > > Clients/brokers are forward/backward compatible anyway.
> >
> > When you're using many different libraries, it is helpful if they don't
> > impose tight constraints on what versions their dependencies are.
> > Otherwise you can easily get in a situation where the constraints can't
> > be satisfied.
> >
> > >
> > > Also, if one really supports multiple client side versions, won't they
> > > use multiple shaded dependencies for different versions?
> >
> > Shading the Kafka client doesn't really work, because of how we use
> reflection.
> >
> > >
> > > Last, it's possible to suppress warnings (at least in Java).
> >
> > But not in Scala.  So that does not help (for example), Scala users.
>
> I meant to write "Spark users" here.
>
> C.
>
> >
> > I agree that in general we should be using deprecation when
> > appropriate, regardless of the potential annoyances to users.  But I'm
> > not sure deprecating this method is really worth it.
> >
> > best,
> > Colin
> >
> >
> > >
> > > Can you elaborate?
> > >
> > > IMHO, just adding a statement to JavaDocs is a little weak, and at some
> > > point, we need to deprecate those methods anyway if we ever want to
> > > remove them. The earlier we deprecate them, the earlier we can remove
> them.
> > >
> > >
> > > -Matthias
> > >
> > > On 7/1/19 4:22 AM, Andy Coates wrote:
> > > > Done. I've not deprecated the factory methods on the AdminClient for
> the
> > > > same reason the AdminClient itself is not deprecated, i.e. this
> would cause
> > > > unavoidable warnings for libraries / platforms that support multiple
> > > > versions of Kafka. However, I think we add a note to the Java docs of
> > > > `AdminClient` to indicate that its use, going forward, is
> discouraged in
> > > > favour of the new `Admin` interface and explain why it's not been
> > > > deprecated, but that it may/will be removed in a future version.
> > > >
> > > > Regarding factory methods on interfaces: there seems to be some
> difference
> > > > of opinion here. I'm not sure of the best approach to resolve this.
> At the
> > > > moment the KIP has factory methods on the new `Admin` interface,
> rather
> > > > than some utility class. I prefer the utility class, but this isn't
> in line
> > > > with the patterns in the code base and some of the core team have
> expressed
> > > > they'd prefer to continue to have the factory methods on the
> interface.
> > > > I'm happy with this if others are.
> > > >
> > > > Thanks,
> > > >
> > > > Andy
> > > >
> > > > On Thu, 27 Jun 2019 at 23:21, Matthias J. Sax 
> wrote:
> > > >
> > > >> @Andy:
> > > >>
> > > >> What about the factory methods of `AdminClient` class? Should they
> be
> > > >> deprecated?
> > > >>
> > > >> One nit about the KIP: can you maybe insert "code blocks" to
> highlight
> > > >> the API changes? Code blocks would simplify to read the KIP a lot.
> > > >>
> > > >>
> > > >> -Matthias
> > > >>
> > > >> On 6/26/19 6:56 AM, Ryanne Dolan wrote:
> > > >>> +1 (non-binding)
> > > >>>
> > > >>> Thanks.
> > > >>> Ryanne
> > > >>>
> > > >>> On Tue, Jun 25, 2019 at 10:21 PM Satish Duggana <
> > > >> satish.dugg...@gmail.com>
> > > 

Re: [VOTE] KIP-476: Add Java AdminClient interface

2019-07-01 Thread Andy Coates
Done. I've not deprecated the factory methods on the AdminClient for the
same reason the AdminClient itself is not deprecated, i.e. this would cause
unavoidable warnings for libraries / platforms that support multiple
versions of Kafka. However, I think we should add a note to the Java docs of
`AdminClient` to indicate that its use, going forward, is discouraged in
favour of the new `Admin` interface, and explain why it's not been
deprecated but that it may/will be removed in a future version.

Regarding factory methods on interfaces: there seems to be some difference
of opinion here. I'm not sure of the best approach to resolve this. At the
moment the KIP has factory methods on the new `Admin` interface, rather
than some utility class. I prefer the utility class, but this isn't in line
with the patterns in the code base, and some of the core team have expressed
they'd prefer to continue to have the factory methods on the interface.
I'm happy with this if others are.
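For anyone skimming the thread, the two placements being debated can be sketched roughly as below. All class and method names here are illustrative stand-ins, not the real Kafka API:

```java
import java.util.Properties;

// Sketch of the two factory-method placements discussed; all names are
// illustrative, not the real Kafka API.
interface Admin extends AutoCloseable {
    // Option A: static factory on the interface itself (what the KIP currently has).
    static Admin create(Properties props) {
        return new BasicAdmin(props);
    }

    @Override
    void close(); // narrow AutoCloseable.close() so callers need no checked exception
}

// Trivial implementation so the sketch is self-contained.
final class BasicAdmin implements Admin {
    BasicAdmin(Properties props) { }
    @Override
    public void close() { }
}

// Option B: a separate utility class -- the Java analogue of a Scala companion object.
final class Admins {
    private Admins() { }
    static Admin create(Properties props) {
        return new BasicAdmin(props);
    }
}

public class FactoryPlacement {
    public static void main(String[] args) {
        try (Admin a = Admin.create(new Properties())) {
            System.out.println("created via interface factory");
        }
        try (Admin b = Admins.create(new Properties())) {
            System.out.println("created via utility class");
        }
    }
}
```

Option A keeps discovery simple (`Admin.create(...)`); Option B keeps the interface a pure contract, at the cost of a second type for callers to find.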

Thanks,

Andy

On Thu, 27 Jun 2019 at 23:21, Matthias J. Sax  wrote:

> @Andy:
>
> What about the factory methods of `AdminClient` class? Should they be
> deprecated?
>
> One nit about the KIP: can you maybe insert "code blocks" to highlight
> the API changes? Code blocks would simplify to read the KIP a lot.
>
>
> -Matthias
>
> On 6/26/19 6:56 AM, Ryanne Dolan wrote:
> > +1 (non-binding)
> >
> > Thanks.
> > Ryanne
> >
> > On Tue, Jun 25, 2019 at 10:21 PM Satish Duggana <
> satish.dugg...@gmail.com>
> > wrote:
> >
> >> +1 (non-binding)
> >>
> >> On Wed, Jun 26, 2019 at 8:37 AM Satish Duggana <
> satish.dugg...@gmail.com>
> >> wrote:
> >>>
> >>> +1 Matthias/Andy.
> >>> IMHO, interface is about the contract, it should not have/expose any
> >>> implementation. I am fine with either way as it is more of taste or
> >>> preference.
> >>>
> >>> Agree with Ismael/Colin/Ryanne on not deprecating for good reasons.
> >>>
> >>>
> >>> On Mon, Jun 24, 2019 at 8:33 PM Andy Coates  wrote:
> >>>>
> >>>> I agree Matthias.
> >>>>
> >>>> (In Scala, such factory methods are on a companion object. As Java
> >> doesn't
> >>>> have the concept of a companion object, an equivalent would be a
> >> utility
> >>>> class with a similar name...)
> >>>>
> >>>> However, I'll update the KIP to include the factory method on the
> >> interface.
> >>>>
> >>>> On Fri, 21 Jun 2019 at 23:40, Matthias J. Sax 
> >> wrote:
> >>>>
> >>>>> I still think, that an interface does not need to know anything about
> >>>>> its implementation. But I am also fine if we add a factory method to
> >> the
> >>>>> new interface if that is preferred by most people.
> >>>>>
> >>>>>
> >>>>> -Matthias
> >>>>>
> >>>>> On 6/21/19 7:10 AM, Ismael Juma wrote:
> >>>>>> This is even more reason not to deprecate immediately, there is
> >> very
> >>>>> little
> >>>>>> maintenance cost for us. We should be mindful that many of our
> >> users (eg
> >>>>>> Spark, Flink, etc.) typically allow users to specify the kafka
> >> clients
> >>>>>> version and hence avoid using new classes/interfaces for some
> >> time. They
> >>>>>> would get a bunch of warnings they cannot do anything about apart
> >> from
> >>>>>> suppressing.
> >>>>>>
> >>>>>> Ismael
> >>>>>>
> >>>>>> On Fri, Jun 21, 2019 at 4:00 AM Andy Coates 
> >> wrote:
> >>>>>>
> >>>>>>> Hi Ismael,
> >>>>>>>
> >>>>>>> I’m happy enough to not deprecate the existing `AdminClient`
> >> class as
> >>>>> part
> >>>>>>> of this change.
> >>>>>>>
> >>>>>>> However, note that, the class will likely be empty, i.e. all
> >> methods and
> >>>>>>> implementations will be inherited from the interface:
> >>>>>>>
> >>>>>>> public abstract class AdminClient implements Admin {
> >>>>>>> }
> >>>>>>>
> >>>>>>> Not marking it as d

Re: [VOTE] KIP-476: Add Java AdminClient interface

2019-07-01 Thread Andy Coates
Done.

On Thu, 27 Jun 2019 at 23:21, Matthias J. Sax  wrote:

> @Andy:
>
> What about the factory methods of `AdminClient` class? Should they be
> deprecated?
>
> One nit about the KIP: can you maybe insert "code blocks" to highlight
> the API changes? Code blocks would simplify to read the KIP a lot.
>
>
> -Matthias
>
> On 6/26/19 6:56 AM, Ryanne Dolan wrote:
> > +1 (non-binding)
> >
> > Thanks.
> > Ryanne
> >
> > On Tue, Jun 25, 2019 at 10:21 PM Satish Duggana <
> satish.dugg...@gmail.com>
> > wrote:
> >
> >> +1 (non-binding)
> >>
> >> On Wed, Jun 26, 2019 at 8:37 AM Satish Duggana <
> satish.dugg...@gmail.com>
> >> wrote:
> >>>
> >>> +1 Matthias/Andy.
> >>> IMHO, interface is about the contract, it should not have/expose any
> >>> implementation. I am fine with either way as it is more of taste or
> >>> preference.
> >>>
> >>> Agree with Ismael/Colin/Ryanne on not deprecating for good reasons.
> >>>
> >>>
> >>> On Mon, Jun 24, 2019 at 8:33 PM Andy Coates  wrote:
> >>>>
> >>>> I agree Matthias.
> >>>>
> >>>> (In Scala, such factory methods are on a companion object. As Java
> >> doesn't
> >>>> have the concept of a companion object, an equivalent would be a
> >> utility
> >>>> class with a similar name...)
> >>>>
> >>>> However, I'll update the KIP to include the factory method on the
> >> interface.
> >>>>
> >>>> On Fri, 21 Jun 2019 at 23:40, Matthias J. Sax 
> >> wrote:
> >>>>
> >>>>> I still think, that an interface does not need to know anything about
> >>>>> its implementation. But I am also fine if we add a factory method to
> >> the
> >>>>> new interface if that is preferred by most people.
> >>>>>
> >>>>>
> >>>>> -Matthias
> >>>>>
> >>>>> On 6/21/19 7:10 AM, Ismael Juma wrote:
> >>>>>> This is even more reason not to deprecate immediately, there is
> >> very
> >>>>> little
> >>>>>> maintenance cost for us. We should be mindful that many of our
> >> users (eg
> >>>>>> Spark, Flink, etc.) typically allow users to specify the kafka
> >> clients
> >>>>>> version and hence avoid using new classes/interfaces for some
> >> time. They
> >>>>>> would get a bunch of warnings they cannot do anything about apart
> >> from
> >>>>>> suppressing.
> >>>>>>
> >>>>>> Ismael
> >>>>>>
> >>>>>> On Fri, Jun 21, 2019 at 4:00 AM Andy Coates 
> >> wrote:
> >>>>>>
> >>>>>>> Hi Ismael,
> >>>>>>>
> >>>>>>> I’m happy enough to not deprecate the existing `AdminClient`
> >> class as
> >>>>> part
> >>>>>>> of this change.
> >>>>>>>
> >>>>>>> However, note that, the class will likely be empty, i.e. all
> >> methods and
> >>>>>>> implementations will be inherited from the interface:
> >>>>>>>
> >>>>>>> public abstract class AdminClient implements Admin {
> >>>>>>> }
> >>>>>>>
> >>>>>>> Not marking it as deprecated has the benefit that users won’t see
> >> any
> >>>>>>> deprecation warnings on the next release. Conversely, deprecating
> >> it
> >>>>> will
> >>>>>>> mean we can choose to remove this, now pointless class, in the
> >> future
> >>>>> if we
> >>>>>>> choose.
> >>>>>>>
> >>>>>>> That’s my thinking for deprecation, but as I’ve said I’m happy
> >> either
> >>>>> way.
> >>>>>>>
> >>>>>>> Andy
> >>>>>>>
> >>>>>>>> On 18 Jun 2019, at 16:09, Ismael Juma  wrote:
> >>>>>>>>
> >>>>>>>> I agree with Ryanne, I think we should avoid deprecating
> >> AdminClient
> >>>>> and

Re: [VOTE] KIP-476: Add Java AdminClient interface

2019-06-24 Thread Andy Coates
Hi all,

KIP updated:
- No deprecation
- Factory method back onto Admin interface

I'd like to kick off another round of voting please.

Thanks,

Andy

On Mon, 24 Jun 2019 at 16:03, Andy Coates  wrote:

> I agree Matthias.
>
> (In Scala, such factory methods are on a companion object. As Java doesn't
> have the concept of a companion object, an equivalent would be a utility
> class with a similar name...)
>
> However, I'll update the KIP to include the factory method on the
> interface.
>
> On Fri, 21 Jun 2019 at 23:40, Matthias J. Sax 
> wrote:
>
>> I still think, that an interface does not need to know anything about
>> its implementation. But I am also fine if we add a factory method to the
>> new interface if that is preferred by most people.
>>
>>
>> -Matthias
>>
>> On 6/21/19 7:10 AM, Ismael Juma wrote:
>> > This is even more reason not to deprecate immediately, there is very
>> little
>> > maintenance cost for us. We should be mindful that many of our users (eg
>> > Spark, Flink, etc.) typically allow users to specify the kafka clients
>> > version and hence avoid using new classes/interfaces for some time. They
>> > would get a bunch of warnings they cannot do anything about apart from
>> > suppressing.
>> >
>> > Ismael
>> >
>> > On Fri, Jun 21, 2019 at 4:00 AM Andy Coates  wrote:
>> >
>> >> Hi Ismael,
>> >>
>> >> I’m happy enough to not deprecate the existing `AdminClient` class as
>> part
>> >> of this change.
>> >>
>> >> However, note that, the class will likely be empty, i.e. all methods
>> and
>> >> implementations will be inherited from the interface:
>> >>
>> >> public abstract class AdminClient implements Admin {
>> >> }
>> >>
>> >> Not marking it as deprecated has the benefit that users won’t see any
>> >> deprecation warnings on the next release. Conversely, deprecating it
>> will
>> >> mean we can choose to remove this, now pointless class, in the future
>> if we
>> >> choose.
>> >>
>> >> That’s my thinking for deprecation, but as I’ve said I’m happy either
>> way.
>> >>
>> >> Andy
>> >>
>> >>> On 18 Jun 2019, at 16:09, Ismael Juma  wrote:
>> >>>
>> >>> I agree with Ryanne, I think we should avoid deprecating AdminClient
>> and
>> >>> causing so much churn for users who don't actually care about this
>> niche
>> >>> use case.
>> >>>
>> >>> Ismael
>> >>>
>> >>> On Tue, Jun 18, 2019 at 6:43 AM Andy Coates 
>> wrote:
>> >>>
>> >>>> Hi Ryanne,
>> >>>>
>> >>>> If we don't change the client code, then everywhere will still expect
>> >>>> subclasses of `AdminClient`, so the interface will be of no use,
>> i.e. I
>> >>>> can't write a class that implements the new interface and pass it to
>> the
>> >>>> client code.
>> >>>>
>> >>>> Thanks,
>> >>>>
>> >>>> Andy
>> >>>>
>> >>>> On Mon, 17 Jun 2019 at 19:01, Ryanne Dolan 
>> >> wrote:
>> >>>>
>> >>>>> Andy, while I agree that the new interface is useful, I'm not
>> convinced
>> >>>>> adding an interface requires deprecating AdminClient and changing so
>> >> much
>> >>>>> client code. Why not just add the Admin interface, have AdminClient
>> >>>>> implement it, and have done?
>> >>>>>
>> >>>>> Ryanne
>> >>>>>
>> >>>>> On Mon, Jun 17, 2019 at 12:09 PM Andy Coates 
>> >> wrote:
>> >>>>>
>> >>>>>> Hi all,
>> >>>>>>
>> >>>>>> I think I've addressed all concerns. Let me know if I've not.  Can
>> I
>> >>>> call
>> >>>>>> another round of votes please?
>> >>>>>>
>> >>>>>> Thanks,
>> >>>>>>
>> >>>>>> Andy
>> >>>>>>
>> >>>>>> On Fri, 14 Jun 2019 at 04:55, Satish Duggana <
>> >> satish.dugg...@gmail.com
>> >>>>>

Re: [VOTE] KIP-476: Add Java AdminClient interface

2019-06-24 Thread Andy Coates
I agree Matthias.

(In Scala, such factory methods are on a companion object. As Java doesn't
have the concept of a companion object, an equivalent would be a utility
class with a similar name...)

However, I'll update the KIP to include the factory method on the interface.

On Fri, 21 Jun 2019 at 23:40, Matthias J. Sax  wrote:

> I still think, that an interface does not need to know anything about
> its implementation. But I am also fine if we add a factory method to the
> new interface if that is preferred by most people.
>
>
> -Matthias
>
> On 6/21/19 7:10 AM, Ismael Juma wrote:
> > This is even more reason not to deprecate immediately, there is very
> little
> > maintenance cost for us. We should be mindful that many of our users (eg
> > Spark, Flink, etc.) typically allow users to specify the kafka clients
> > version and hence avoid using new classes/interfaces for some time. They
> > would get a bunch of warnings they cannot do anything about apart from
> > suppressing.
> >
> > Ismael
> >
> > On Fri, Jun 21, 2019 at 4:00 AM Andy Coates  wrote:
> >
> >> Hi Ismael,
> >>
> >> I’m happy enough to not deprecate the existing `AdminClient` class as
> part
> >> of this change.
> >>
> >> However, note that, the class will likely be empty, i.e. all methods and
> >> implementations will be inherited from the interface:
> >>
> >> public abstract class AdminClient implements Admin {
> >> }
> >>
> >> Not marking it as deprecated has the benefit that users won’t see any
> >> deprecation warnings on the next release. Conversely, deprecating it
> will
> >> mean we can choose to remove this, now pointless class, in the future
> if we
> >> choose.
> >>
> >> That’s my thinking for deprecation, but as I’ve said I’m happy either
> way.
> >>
> >> Andy
> >>
> >>> On 18 Jun 2019, at 16:09, Ismael Juma  wrote:
> >>>
> >>> I agree with Ryanne, I think we should avoid deprecating AdminClient
> and
> >>> causing so much churn for users who don't actually care about this
> niche
> >>> use case.
> >>>
> >>> Ismael
> >>>
> >>> On Tue, Jun 18, 2019 at 6:43 AM Andy Coates  wrote:
> >>>
> >>>> Hi Ryanne,
> >>>>
> >>>> If we don't change the client code, then everywhere will still expect
> >>>> subclasses of `AdminClient`, so the interface will be of no use, i.e.
> I
> >>>> can't write a class that implements the new interface and pass it to
> the
> >>>> client code.
> >>>>
> >>>> Thanks,
> >>>>
> >>>> Andy
> >>>>
> >>>> On Mon, 17 Jun 2019 at 19:01, Ryanne Dolan 
> >> wrote:
> >>>>
> >>>>> Andy, while I agree that the new interface is useful, I'm not
> convinced
> >>>>> adding an interface requires deprecating AdminClient and changing so
> >> much
> >>>>> client code. Why not just add the Admin interface, have AdminClient
> >>>>> implement it, and have done?
> >>>>>
> >>>>> Ryanne
> >>>>>
> >>>>> On Mon, Jun 17, 2019 at 12:09 PM Andy Coates 
> >> wrote:
> >>>>>
> >>>>>> Hi all,
> >>>>>>
> >>>>>> I think I've addressed all concerns. Let me know if I've not.  Can I
> >>>> call
> >>>>>> another round of votes please?
> >>>>>>
> >>>>>> Thanks,
> >>>>>>
> >>>>>> Andy
> >>>>>>
> >>>>>> On Fri, 14 Jun 2019 at 04:55, Satish Duggana <
> >> satish.dugg...@gmail.com
> >>>>>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> Hi Andy,
> >>>>>>> Thanks for the KIP. This is a good change and it gives the user a
> >>>>> better
> >>>>>>> handle on Admin client usage. I agree with the proposal except the
> >>>> new
> >>>>>>> `Admin` interface having all the methods from `AdminClient`
> abstract
> >>>>>> class.
> >>>>>>> It should be kept clean having only the admin operations as methods
> >>>>> from
> >>>>>>> KafkaClient abst

Re: [VOTE] KIP-476: Add Java AdminClient interface

2019-06-24 Thread Andy Coates
That makes a lot of sense.  OK, no deprecation.

On Fri, 21 Jun 2019 at 15:11, Ismael Juma  wrote:

> This is even more reason not to deprecate immediately, there is very little
> maintenance cost for us. We should be mindful that many of our users (eg
> Spark, Flink, etc.) typically allow users to specify the kafka clients
> version and hence avoid using new classes/interfaces for some time. They
> would get a bunch of warnings they cannot do anything about apart from
> suppressing.
>
> Ismael
>
> On Fri, Jun 21, 2019 at 4:00 AM Andy Coates  wrote:
>
> > Hi Ismael,
> >
> > I’m happy enough to not deprecate the existing `AdminClient` class as
> part
> > of this change.
> >
> > However, note that, the class will likely be empty, i.e. all methods and
> > implementations will be inherited from the interface:
> >
> > public abstract class AdminClient implements Admin {
> > }
> >
> > Not marking it as deprecated has the benefit that users won’t see any
> > deprecation warnings on the next release. Conversely, deprecating it will
> > mean we can choose to remove this, now pointless class, in the future if
> we
> > choose.
> >
> > That’s my thinking for deprecation, but as I’ve said I’m happy either
> way.
> >
> > Andy
> >
> > > On 18 Jun 2019, at 16:09, Ismael Juma  wrote:
> > >
> > > I agree with Ryanne, I think we should avoid deprecating AdminClient
> and
> > > causing so much churn for users who don't actually care about this
> niche
> > > use case.
> > >
> > > Ismael
> > >
> > > On Tue, Jun 18, 2019 at 6:43 AM Andy Coates  wrote:
> > >
> > >> Hi Ryanne,
> > >>
> > >> If we don't change the client code, then everywhere will still expect
> > >> subclasses of `AdminClient`, so the interface will be of no use, i.e.
> I
> > >> can't write a class that implements the new interface and pass it to
> the
> > >> client code.
> > >>
> > >> Thanks,
> > >>
> > >> Andy
> > >>
> > >> On Mon, 17 Jun 2019 at 19:01, Ryanne Dolan 
> > wrote:
> > >>
> > >>> Andy, while I agree that the new interface is useful, I'm not
> convinced
> > >>> adding an interface requires deprecating AdminClient and changing so
> > much
> > >>> client code. Why not just add the Admin interface, have AdminClient
> > >>> implement it, and have done?
> > >>>
> > >>> Ryanne
> > >>>
> > >>> On Mon, Jun 17, 2019 at 12:09 PM Andy Coates 
> > wrote:
> > >>>
> > >>>> Hi all,
> > >>>>
> > >>>> I think I've addressed all concerns. Let me know if I've not.  Can I
> > >> call
> > >>>> another round of votes please?
> > >>>>
> > >>>> Thanks,
> > >>>>
> > >>>> Andy
> > >>>>
> > >>>> On Fri, 14 Jun 2019 at 04:55, Satish Duggana <
> > satish.dugg...@gmail.com
> > >>>
> > >>>> wrote:
> > >>>>
> > >>>>> Hi Andy,
> > >>>>> Thanks for the KIP. This is a good change and it gives the user a
> > >>> better
> > >>>>> handle on Admin client usage. I agree with the proposal except the
> > >> new
> > >>>>> `Admin` interface having all the methods from `AdminClient`
> abstract
> > >>>> class.
> > >>>>> It should be kept clean having only the admin operations as methods
> > >>> from
> > >>>>> KafkaClient abstract class but not the factory methods as mentioned
> > >> in
> > >>>> the
> > >>>>> earlier mail.
> > >>>>>
> > >>>>> I know about dynamic proxies(which were widely used in RMI/EJB
> > >> world).
> > >>> I
> > >>>> am
> > >>>>> curious about the usecase using dynamic proxies with Admin client
> > >>>>> interface. Dynamic proxy can have performance penalty if it is used
> > >> in
> > >>>>> critical path. Is that the primary motivation for creating the KIP?
> > >>>>>
> > >>>>> Thanks,
> > >>>>> Satish.
> > >>>>>
> > >>>>> On Wed, Ju

Re: [VOTE] KIP-476: Add Java AdminClient interface

2019-06-21 Thread Andy Coates
Hi Ismael,

I’m happy enough to not deprecate the existing `AdminClient` class as part of 
this change.

However, note that the class will likely be empty, i.e. all methods and 
implementations will be inherited from the interface:

public abstract class AdminClient implements Admin {
}

Not marking it as deprecated has the benefit that users won’t see any 
deprecation warnings in the next release. Conversely, deprecating it means we 
can remove this now-pointless class in the future if we choose. 

That’s my thinking for deprecation, but as I’ve said I’m happy either way.

Andy

> On 18 Jun 2019, at 16:09, Ismael Juma  wrote:
> 
> I agree with Ryanne, I think we should avoid deprecating AdminClient and
> causing so much churn for users who don't actually care about this niche
> use case.
> 
> Ismael
> 
> On Tue, Jun 18, 2019 at 6:43 AM Andy Coates  wrote:
> 
>> Hi Ryanne,
>> 
>> If we don't change the client code, then everywhere will still expect
>> subclasses of `AdminClient`, so the interface will be of no use, i.e. I
>> can't write a class that implements the new interface and pass it to the
>> client code.
>> 
>> Thanks,
>> 
>> Andy
>> 
>> On Mon, 17 Jun 2019 at 19:01, Ryanne Dolan  wrote:
>> 
>>> Andy, while I agree that the new interface is useful, I'm not convinced
>>> adding an interface requires deprecating AdminClient and changing so much
>>> client code. Why not just add the Admin interface, have AdminClient
>>> implement it, and have done?
>>> 
>>> Ryanne
>>> 
>>> On Mon, Jun 17, 2019 at 12:09 PM Andy Coates  wrote:
>>> 
>>>> Hi all,
>>>> 
>>>> I think I've addressed all concerns. Let me know if I've not.  Can I
>> call
>>>> another round of votes please?
>>>> 
>>>> Thanks,
>>>> 
>>>> Andy
>>>> 
>>>> On Fri, 14 Jun 2019 at 04:55, Satish Duggana >> 
>>>> wrote:
>>>> 
>>>>> Hi Andy,
>>>>> Thanks for the KIP. This is a good change and it gives the user a
>>> better
>>>>> handle on Admin client usage. I agree with the proposal except the
>> new
>>>>> `Admin` interface having all the methods from `AdminClient` abstract
>>>> class.
>>>>> It should be kept clean having only the admin operations as methods
>>> from
>>>>> KafkaClient abstract class but not the factory methods as mentioned
>> in
>>>> the
>>>>> earlier mail.
>>>>> 
>>>>> I know about dynamic proxies(which were widely used in RMI/EJB
>> world).
>>> I
>>>> am
>>>>> curious about the usecase using dynamic proxies with Admin client
>>>>> interface. Dynamic proxy can have performance penalty if it is used
>> in
>>>>> critical path. Is that the primary motivation for creating the KIP?
>>>>> 
>>>>> Thanks,
>>>>> Satish.
>>>>> 
>>>>> On Wed, Jun 12, 2019 at 8:43 PM Andy Coates 
>> wrote:
>>>>> 
>>>>>> I'm not married to that part.  That was only done to keep it more
>> or
>>>> less
>>>>>> inline with what's already there, (an abstract class that has a
>>> factory
>>>>>> method that returns a subclass sounds like the same
>> anti-pattern
>>>> ;))
>>>>>> 
> >>>>>> An alternative would be to have an `AdminClients` utility class to
>>> create
>>>>> the
>>>>>> admin client.
>>>>>> 
>>>>>> On Mon, 10 Jun 2019 at 19:31, Matthias J. Sax <
>> matth...@confluent.io
>>>> 
>>>>>> wrote:
>>>>>> 
>>>>>>> Hmmm...
>>>>>>> 
>>>>>>> So the new interface, returns an instance of a class that
>>> implements
>>>>> the
>>>>>>> interface. This sounds a little bit like an anti-pattern?
>> Shouldn't
>>>>>>> interfaces actually not know anything about classes that
>> implement
>>>> the
>>>>>>> interface?
>>>>>>> 
>>>>>>> 
>>>>>>> -Matthias
>>>>>>> 
>>>>>>> On 6/10/19 11:22 AM, Andy Coates wrote:
>>>>>>>> `AdminClient` would be deprecated purely b

Re: [VOTE] KIP-476: Add Java AdminClient interface

2019-06-21 Thread Andy Coates
Hi Ismael,

Matthias thought having the interface also provide a factory method that 
returns a specific implementation was a bit of an anti-pattern, and I would 
tend to agree, though I’ve used this same pattern myself at times where the set 
of implementations is known.

Matthias may want to answer with his own thoughts, but from my own experience 
I’ve often found it cleaner to have a companion utility class to an interface 
in an API that has factory methods to create instances of the common set of 
implementations. In this case it would be `KafkaAdminClient` only at the 
moment, though if we were to support something like the `DelegatingAdminClient` 
that Colin mentioned in the future, this too could be on the utility class.

My preference is the separate classes. But in the interest of moving this 
forward I’m happy with either.

Andy



> On 18 Jun 2019, at 16:07, Ismael Juma  wrote:
> 
> I don't agree with this change. The idea that an interface cannot have a
> default implementation is outdated in my view. Can someone provide any
> benefit to introducing a separate class for the factory method?
> 
> Ismael
> 
> On Mon, Jun 17, 2019 at 10:07 AM Andy Coates  wrote:
> 
>> Hi All,
>> 
>> I've updated the KIP to move the `create` factory method implementation
>> into a new `AdminClients` utility class, rather than on the new `Admin`
>> interface.
>> 
>> Satish,
>> 
>> As above, the KIP has been updated to only have the operations on the
>> `Admin` api. As for the overhead of dynamic proxies... the use of dynamic
>> proxies is totally up to the users of the library. In KSQL we use dynamic
>> proxies because the overhead is acceptable and it decouples us from
>> additions to the client interfaces. Others may decide otherwise for their
>> project. By making the admin api an interface we're empowering users to
>> choose the right approach for them.
>> 
>> This is the primary motivation for the KIP from my point of view. However,
>> it also brings it in line with the other Kafka clients, and gives users the
>> freedom to do what they want, rather than requiring the use of an abstract
>> base class.
>> 
>> Thanks,
>> 
>> Andy
>> 
>> 
>> On Fri, 14 Jun 2019 at 04:55, Satish Duggana 
>> wrote:
>> 
>>> Hi Andy,
>>> Thanks for the KIP. This is a good change and it gives the user a better
>>> handle on Admin client usage. I agree with the proposal except the new
>>> `Admin` interface having all the methods from `AdminClient` abstract
>> class.
>>> It should be kept clean having only the admin operations as methods from
>>> KafkaClient abstract class but not the factory methods as mentioned in
>> the
>>> earlier mail.
>>> 
>>> I know about dynamic proxies(which were widely used in RMI/EJB world). I
>> am
>>> curious about the usecase using dynamic proxies with Admin client
>>> interface. Dynamic proxy can have performance penalty if it is used in
>>> critical path. Is that the primary motivation for creating the KIP?
>>> 
>>> Thanks,
>>> Satish.
>>> 
>>> On Wed, Jun 12, 2019 at 8:43 PM Andy Coates  wrote:
>>> 
>>>> I'm not married to that part.  That was only done to keep it more or
>> less
>>>> inline with what's already there, (an abstract class that has a factory
>>>> method that returns a subclass sounds like the same anti-pattern
>> ;))
>>>> 
>>>> An alternative would be to have an `AdminClients` utility class to create
>>> the
>>>> admin client.
>>>> 
>>>> On Mon, 10 Jun 2019 at 19:31, Matthias J. Sax 
>>>> wrote:
>>>> 
>>>>> Hmmm...
>>>>> 
>>>>> So the new interface, returns an instance of a class that implements
>>> the
>>>>> interface. This sounds a little bit like an anti-pattern? Shouldn't
>>>>> interfaces actually not know anything about classes that implement
>> the
>>>>> interface?
>>>>> 
>>>>> 
>>>>> -Matthias
>>>>> 
>>>>> On 6/10/19 11:22 AM, Andy Coates wrote:
>>>>>> `AdminClient` would be deprecated purely because it would no longer
>>>> serve
>>>>>> any purpose and would be virtually empty, getting all of its
>>>>> implementation
>>>>>> from the new interface. It would be nice to remove this from the API
>>> at
>>>>> the
>>>>>> next major version bu

Re: [VOTE] KIP-476: Add Java AdminClient interface

2019-06-18 Thread Andy Coates
Hi Ryanne,

If we don't change the client code, then everywhere will still expect
subclasses of `AdminClient`, so the interface will be of no use, i.e. I
can't write a class that implements the new interface and pass it to the
client code.

Thanks,

Andy

On Mon, 17 Jun 2019 at 19:01, Ryanne Dolan  wrote:

> Andy, while I agree that the new interface is useful, I'm not convinced
> adding an interface requires deprecating AdminClient and changing so much
> client code. Why not just add the Admin interface, have AdminClient
> implement it, and have done?
>
> Ryanne
>
> On Mon, Jun 17, 2019 at 12:09 PM Andy Coates  wrote:
>
> > Hi all,
> >
> > I think I've addressed all concerns. Let me know if I've not.  Can I call
> > another round of votes please?
> >
> > Thanks,
> >
> > Andy
> >
> > On Fri, 14 Jun 2019 at 04:55, Satish Duggana 
> > wrote:
> >
> > > Hi Andy,
> > > Thanks for the KIP. This is a good change and it gives the user a
> better
> > > handle on Admin client usage. I agree with the proposal except the new
> > > `Admin` interface having all the methods from `AdminClient` abstract
> > class.
> > > It should be kept clean having only the admin operations as methods
> from
> > > KafkaClient abstract class but not the factory methods as mentioned in
> > the
> > > earlier mail.
> > >
> > > I know about dynamic proxies(which were widely used in RMI/EJB world).
> I
> > am
> > > curious about the usecase using dynamic proxies with Admin client
> > > interface. Dynamic proxy can have performance penalty if it is used in
> > > critical path. Is that the primary motivation for creating the KIP?
> > >
> > > Thanks,
> > > Satish.
> > >
> > > On Wed, Jun 12, 2019 at 8:43 PM Andy Coates  wrote:
> > >
> > > > I'm not married to that part.  That was only done to keep it more or
> > less
> > > > inline with what's already there, (an abstract class that has a
> factory
> > > > method that returns a subclass sounds like the same anti-pattern
> > ;))
> > > >
> > > > An alternative would be to have an `AdminClients` utility class to
> create
> > > the
> > > > admin client.
> > > >
> > > > On Mon, 10 Jun 2019 at 19:31, Matthias J. Sax  >
> > > > wrote:
> > > >
> > > > > Hmmm...
> > > > >
> > > > > So the new interface, returns an instance of a class that
> implements
> > > the
> > > > > interface. This sounds a little bit like an anti-pattern? Shouldn't
> > > > > interfaces actually not know anything about classes that implement
> > the
> > > > > interface?
> > > > >
> > > > >
> > > > > -Matthias
> > > > >
> > > > > On 6/10/19 11:22 AM, Andy Coates wrote:
> > > > > > `AdminClient` would be deprecated purely because it would no
> longer
> > > > serve
> > > > > > any purpose and would be virtually empty, getting all of its
> > > > > implementation
> > > > > > from the new interface. It would be nice to remove this from the
> API
> > > at
> > > > > the
> > > > > > next major version bump, hence the need to deprecate.
> > > > > >
> > > > > > `AdminClient.create()` would return what it does today, (so not a
> > > > > breaking
> > > > > > change).
> > > > > >
> > > > > > On Tue, 4 Jun 2019 at 22:24, Ryanne Dolan  >
> > > > wrote:
> > > > > >
> > > > > >>> The existing `AdminClient` will be marked as deprecated.
> > > > > >>
> > > > > >> What's the reasoning behind this? I'm fine with the other
> changes,
> > > but
> > > > > >> would prefer to keep the existing public API intact if it's not
> > > > hurting
> > > > > >> anything.
> > > > > >>
> > > > > >> Also, what will AdminClient.create() return? Would it be a
> > breaking
> > > > > change?
> > > > > >>
> > > > > >> Ryanne
> > > > > >>
> > > > > >> On Tue, Jun 4, 2019, 11:17 AM Andy Coates 
> > > wrote:
> > > > > >>
> > > > > >>> Hi folks
> > > > > >>>
> > > > > >>> As there's been no chatter on this KIP I'm assuming it's
> > > > > non-contentious,
> > > > > >>> (or just boring), hence I'd like to call a vote for KIP-476:
> > > > > >>>
> > > > > >>>
> > > > > >>>
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-476%3A+Add+Java+AdminClient+Interface
> > > > > >>>
> > > > > >>> Thanks,
> > > > > >>>
> > > > > >>> Andy
> > > > > >>>
> > > > > >>
> > > > > >
> > > > >
> > > > >
> > > >
> > >
> >
>


Re: [VOTE] KIP-476: Add Java AdminClient interface

2019-06-17 Thread Andy Coates
Hi all,

I think I've addressed all concerns. Let me know if I've not.  Can I call
another round of votes please?

Thanks,

Andy

On Fri, 14 Jun 2019 at 04:55, Satish Duggana 
wrote:

> Hi Andy,
> Thanks for the KIP. This is a good change and it gives the user a better
> handle on Admin client usage. I agree with the proposal except the new
> `Admin` interface having all the methods from `AdminClient` abstract class.
> It should be kept clean having only the admin operations as methods from
> KafkaClient abstract class but not the factory methods as mentioned in the
> earlier mail.
>
> I know about dynamic proxies(which were widely used in RMI/EJB world). I am
> curious about the usecase using dynamic proxies with Admin client
> interface. Dynamic proxy can have performance penalty if it is used in
> critical path. Is that the primary motivation for creating the KIP?
>
> Thanks,
> Satish.
>
> On Wed, Jun 12, 2019 at 8:43 PM Andy Coates  wrote:
>
> > I'm not married to that part.  That was only done to keep it more or less
> > inline with what's already there, (an abstract class that has a factory
> > method that returns a subclass sounds like the same anti-pattern ;))
> >
> > An alternative would be to have an `AdminClients` utility class to create
> the
> > admin client.
> >
> > On Mon, 10 Jun 2019 at 19:31, Matthias J. Sax 
> > wrote:
> >
> > > Hmmm...
> > >
> > > So the new interface, returns an instance of a class that implements
> the
> > > interface. This sounds a little bit like an anti-pattern? Shouldn't
> > > interfaces actually not know anything about classes that implement the
> > > interface?
> > >
> > >
> > > -Matthias
> > >
> > > On 6/10/19 11:22 AM, Andy Coates wrote:
> > > > `AdminClient` would be deprecated purely because it would no longer
> > serve
> > > > any purpose and would be virtually empty, getting all of its
> > > implementation
> > > > from the new interface. It would be nice to remove this from the API
> at
> > > the
> > > > next major version bump, hence the need to deprecate.
> > > >
> > > > `AdminClient.create()` would return what it does today, (so not a
> > > breaking
> > > > change).
> > > >
> > > > On Tue, 4 Jun 2019 at 22:24, Ryanne Dolan 
> > wrote:
> > > >
> > > >>> The existing `AdminClient` will be marked as deprecated.
> > > >>
> > > >> What's the reasoning behind this? I'm fine with the other changes,
> but
> > > >> would prefer to keep the existing public API intact if it's not
> > hurting
> > > >> anything.
> > > >>
> > > >> Also, what will AdminClient.create() return? Would it be a breaking
> > > change?
> > > >>
> > > >> Ryanne
> > > >>
> > > >> On Tue, Jun 4, 2019, 11:17 AM Andy Coates 
> wrote:
> > > >>
> > > >>> Hi folks
> > > >>>
> > > >>> As there's been no chatter on this KIP I'm assuming it's
> > > non-contentious,
> > > >>> (or just boring), hence I'd like to call a vote for KIP-476:
> > > >>>
> > > >>>
> > > >>>
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-476%3A+Add+Java+AdminClient+Interface
> > > >>>
> > > >>> Thanks,
> > > >>>
> > > >>> Andy
> > > >>>
> > > >>
> > > >
> > >
> > >
> >
>


Re: [VOTE] KIP-476: Add Java AdminClient interface

2019-06-17 Thread Andy Coates
Hi All,

I've updated the KIP to move the `create` factory method implementation
into a new `AdminClients` utility class, rather than on the new `Admin`
interface.
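[Editor's note: the shape of the `AdminClients` utility described above can be sketched as below. This is a hypothetical, minimal sketch: it uses a stand-in `Admin` interface and an anonymous implementation in place of the real `KafkaAdminClient`, which the actual factory would construct.]

```java
import java.util.Properties;

// Stand-in for the proposed org.apache.kafka.clients.admin.Admin interface.
interface Admin extends AutoCloseable {
    String clientId();
    void close();
}

// Sketch of the companion utility class: final, with a private constructor,
// exposing static factory methods so the Admin interface itself stays free
// of any knowledge of the classes that implement it.
final class AdminClients {
    private AdminClients() {}

    static Admin create(Properties props) {
        String id = props.getProperty("client.id", "admin-1");
        // In Kafka proper this would construct and return a KafkaAdminClient.
        return new Admin() {
            public String clientId() { return id; }
            public void close() {}
        };
    }
}
```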

Satish,

As above, the KIP has been updated to only have the operations on the
`Admin` api. As for the overhead of dynamic proxies... the use of dynamic
proxies is totally up to the users of the library. In KSQL we use dynamic
proxies because the overhead is acceptable and it decouples us from
additions to the client interfaces. Others may decide otherwise for their
project. By making the admin api an interface we're empowering users to
choose the right approach for them.

This is the primary motivation for the KIP from my point of view. However,
it also brings it in line with the other Kafka clients, and gives users the
freedom to do what they want, rather than requiring the use of an abstract
base class.

Thanks,

Andy


On Fri, 14 Jun 2019 at 04:55, Satish Duggana 
wrote:

> Hi Andy,
> Thanks for the KIP. This is a good change and it gives the user a better
> handle on Admin client usage. I agree with the proposal except the new
> `Admin` interface having all the methods from `AdminClient` abstract class.
> It should be kept clean having only the admin operations as methods from
> KafkaClient abstract class but not the factory methods as mentioned in the
> earlier mail.
>
> I know about dynamic proxies(which were widely used in RMI/EJB world). I am
> curious about the usecase using dynamic proxies with Admin client
> interface. Dynamic proxy can have performance penalty if it is used in
> critical path. Is that the primary motivation for creating the KIP?
>
> Thanks,
> Satish.
>
> On Wed, Jun 12, 2019 at 8:43 PM Andy Coates  wrote:
>
> > I'm not married to that part.  That was only done to keep it more or less
> > inline with what's already there, (an abstract class that has a factory
> > method that returns a subclass sounds like the same anti-pattern ;))
> >
> > An alternative would be to have an `AdminClients` utility class to create
> the
> > admin client.
> >
> > On Mon, 10 Jun 2019 at 19:31, Matthias J. Sax 
> > wrote:
> >
> > > Hmmm...
> > >
> > > So the new interface, returns an instance of a class that implements
> the
> > > interface. This sounds a little bit like an anti-pattern? Shouldn't
> > > interfaces actually not know anything about classes that implement the
> > > interface?
> > >
> > >
> > > -Matthias
> > >
> > > On 6/10/19 11:22 AM, Andy Coates wrote:
> > > > `AdminClient` would be deprecated purely because it would no longer
> > serve
> > > > any purpose and would be virtually empty, getting all of its
> > > implementation
> > > > from the new interface. It would be nice to remove this from the API
> at
> > > the
> > > > next major version bump, hence the need to deprecate.
> > > >
> > > > `AdminClient.create()` would return what it does today, (so not a
> > > breaking
> > > > change).
> > > >
> > > > On Tue, 4 Jun 2019 at 22:24, Ryanne Dolan 
> > wrote:
> > > >
> > > >>> The existing `AdminClient` will be marked as deprecated.
> > > >>
> > > >> What's the reasoning behind this? I'm fine with the other changes,
> but
> > > >> would prefer to keep the existing public API intact if it's not
> > hurting
> > > >> anything.
> > > >>
> > > >> Also, what will AdminClient.create() return? Would it be a breaking
> > > change?
> > > >>
> > > >> Ryanne
> > > >>
> > > >> On Tue, Jun 4, 2019, 11:17 AM Andy Coates 
> wrote:
> > > >>
> > > >>> Hi folks
> > > >>>
> > > >>> As there's been no chatter on this KIP I'm assuming it's
> > > non-contentious,
> > > >>> (or just boring), hence I'd like to call a vote for KIP-476:
> > > >>>
> > > >>>
> > > >>>
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-476%3A+Add+Java+AdminClient+Interface
> > > >>>
> > > >>> Thanks,
> > > >>>
> > > >>> Andy
> > > >>>
> > > >>
> > > >
> > >
> > >
> >
>


Re: [VOTE] KIP-476: Add Java AdminClient interface

2019-06-12 Thread Andy Coates
I'm not married to that part.  That was only done to keep it more or less
in line with what's already there (an abstract class that has a factory
method that returns a subclass sounds like the same anti-pattern ;))

An alternative would be to have an `AdminClients` utility class to create the
admin client.

On Mon, 10 Jun 2019 at 19:31, Matthias J. Sax  wrote:

> Hmmm...
>
> So the new interface, returns an instance of a class that implements the
> interface. This sounds a little bit like an anti-pattern? Shouldn't
> interfaces actually not know anything about classes that implement the
> interface?
>
>
> -Matthias
>
> On 6/10/19 11:22 AM, Andy Coates wrote:
> > `AdminClient` would be deprecated purely because it would no longer serve
> > any purpose and would be virtually empty, getting all of its
> implementation
> > from the new interface. It would be nice to remove this from the API at
> the
> > next major version bump, hence the need to deprecate.
> >
> > `AdminClient.create()` would return what it does today, (so not a
> breaking
> > change).
> >
> > On Tue, 4 Jun 2019 at 22:24, Ryanne Dolan  wrote:
> >
> >>> The existing `AdminClient` will be marked as deprecated.
> >>
> >> What's the reasoning behind this? I'm fine with the other changes, but
> >> would prefer to keep the existing public API intact if it's not hurting
> >> anything.
> >>
> >> Also, what will AdminClient.create() return? Would it be a breaking
> change?
> >>
> >> Ryanne
> >>
> >> On Tue, Jun 4, 2019, 11:17 AM Andy Coates  wrote:
> >>
> >>> Hi folks
> >>>
> >>> As there's been no chatter on this KIP I'm assuming it's
> non-contentious,
> >>> (or just boring), hence I'd like to call a vote for KIP-476:
> >>>
> >>>
> >>>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-476%3A+Add+Java+AdminClient+Interface
> >>>
> >>> Thanks,
> >>>
> >>> Andy
> >>>
> >>
> >
>
>


Re: [VOTE] KIP-476: Add Java AdminClient interface

2019-06-12 Thread Andy Coates
With 3.0 not imminent, I would prefer to make this change sooner rather than
later.

On Tue, 11 Jun 2019 at 21:46, Colin McCabe  wrote:

> On Tue, Jun 11, 2019, at 12:12, Andy Coates wrote:
> > Thanks for the response Colin,
> >
> > > What specific benefits do we get from transitioning to using an
> interface
> > > rather than an abstract class?
> >
> > This is covered in the KIP: "An AdminClient interface has several
> > advantages over an abstract base class, most notably allowing
> > multi-inheritance and the use of dynamic proxies"
>
> Hi Andy,
>
> I was not that familiar with dynamic proxies, which is why the advantages
> weren't clear to me.  I had a conversation offline with Ismael and he
> explained some of the background here.
>
> The JDK includes a java.lang.reflect.Proxy class which can be used to
> create dynamic proxies, but only for interfaces -- not for abstract
> classes.  This is just a limitation in the library, though-- there are
> other ways of creating a dynamic proxy, like using cglib, that work with
> abstract classes as well as with interfaces.  But using something like
> cglib involves taking an external dependency, which could be annoying.
>
> >
> > > If we are serious about doing this, would it be cleaner to just change
> > > AdminClient from an abstract class to an interface in Kafka 3.0?  It
> would
> > > break binary compatibility, but you have to break binary compatibility
> in
> > > any case to get what you want here (no abstract class).  And it would
> have
> > > the advantage of not creating a lot of churn in the codebase as people
> > > replaced "AdminClient" with "Admin" all over the place.  What do you
> think?
> >
> > How far off is Kafka 3.0? This is causing us pain right now on a regular
> > basis and, from Ryanne's email above we're not alone. I'm not against
> > making this a change in Kafka 3.0, but only if it's imminent.
>
> I don't believe 3.0 is imminent.
>
> >
> > > On a related note, one problem I've seen is that people will subclass
> > > AdminClient for testing.  Then, every time Kafka adds a new API, we
> add a
> > > new abstract method to the base class, which breaks compilation for
> them.
> > > Their test classes would have been fine simply failing when the new
> API was
> > > called.  So perhaps one useful class would be a class that implements
> the
> > > AdminClient API, and throws UnimplementedException for every method.
> The
> > > test classes could subclass this method and never have to worry about
> new
> > > methods getting added again.
> >
> > This is similar to what KSQL does for the other Kafka client APIs, except
> > we use a dynamic proxy, rather than having to hand code the exception
> > throwing.
>
> There is some performance penalty for the dynamic proxy approach, but it
> certainly is simpler.
>
> With this context in mind, the change sounds reasonable to me.
>
> best,
> Colin
>
> >
> > > Another pattern I've seen is people wanting to implement a class which
> is
> > > similar to KafkaAdminClient in every way, except for the behavior of
> > > close().  Specifically, people want to implement reference counting in
> > > order to reuse AdminClient instances.  So they start by implementing
> > > essentially a delegating class, which just forwards every method to an
> > > underlying AdminClient instance.  But this has the same problem that it
> > > breaks every time the upstream project adds an API.  In order to
> support
> > > this, we could have an official DelegatingAdminClient base class that
> > > forwarded every method to an underlying AdminClient instance.  Then the
> > > client code could override the methods they needed, like close.
> >
> > Again, this would be trivial to implement using dynamic proxies. No need
> > for us to implement any `DelegatingAdminClient`. If this is an interface
> we
> > empower users to do this for themselves.
> >
> > best,
> > Colin
> >
> > On Mon, 10 Jun 2019 at 21:44, Colin McCabe  wrote:
> >
> > > Hi Andy,
> > >
> > > This is a big change, and I don't think there has been a lot of
> discussion
> > > about the pros and cons.  What specific benefits do we get from
> > > transitioning to using an interface rather than an abstract class?
> > >
> > > If we are serious about doing this, would it be cleaner to just change
> > > AdminClient from an abstract c

Re: [VOTE] KIP-476: Add Java AdminClient interface

2019-06-12 Thread Andy Coates
Great, thanks.

Does that mean you're a +1?

On Tue, 11 Jun 2019 at 21:46, Colin McCabe  wrote:

> On Tue, Jun 11, 2019, at 12:12, Andy Coates wrote:
> > Thanks for the response Colin,
> >
> > > What specific benefits do we get from transitioning to using an
> interface
> > > rather than an abstract class?
> >
> > This is covered in the KIP: "An AdminClient interface has several
> > advantages over an abstract base class, most notably allowing
> > multi-inheritance and the use of dynamic proxies"
>
> Hi Andy,
>
> I was not that familiar with dynamic proxies, which is why the advantages
> weren't clear to me.  I had a conversation offline with Ismael and he
> explained some of the background here.
>
> The JDK includes a java.lang.reflect.Proxy class which can be used to
> create dynamic proxies, but only for interfaces -- not for abstract
> classes.  This is just a limitation in the library, though-- there are
> other ways of creating a dynamic proxy, like using cglib, that work with
> abstract classes as well as with interfaces.  But using something like
> cglib involves taking an external dependency, which could be annoying.
>
> >
> > > If we are serious about doing this, would it be cleaner to just change
> > > AdminClient from an abstract class to an interface in Kafka 3.0?  It
> would
> > > break binary compatibility, but you have to break binary compatibility
> in
> > > any case to get what you want here (no abstract class).  And it would
> have
> > > the advantage of not creating a lot of churn in the codebase as people
> > > replaced "AdminClient" with "Admin" all over the place.  What do you
> think?
> >
> > How far off is Kafka 3.0? This is causing us pain right now on a regular
> > basis and, from Ryanne's email above we're not alone. I'm not against
> > making this a change in Kafka 3.0, but only if it's imminent.
>
> I don't believe 3.0 is imminent.
>
> >
> > > On a related note, one problem I've seen is that people will subclass
> > > AdminClient for testing.  Then, every time Kafka adds a new API, we
> add a
> > > new abstract method to the base class, which breaks compilation for
> them.
> > > Their test classes would have been fine simply failing when the new
> API was
> > > called.  So perhaps one useful class would be a class that implements
> the
> > > AdminClient API, and throws UnimplementedException for every method.
> The
> > > test classes could subclass this method and never have to worry about
> new
> > > methods getting added again.
> >
> > This is similar to what KSQL does for the other Kafka client APIs, except
> > we use a dynamic proxy, rather than having to hand code the exception
> > throwing.
>
> There is some performance penalty for the dynamic proxy approach, but it
> certainly is simpler.
>
> With this context in mind, the change sounds reasonable to me.
>
> best,
> Colin
>
> >
> > > Another pattern I've seen is people wanting to implement a class which
> is
> > > similar to KafkaAdminClient in every way, except for the behavior of
> > > close().  Specifically, people want to implement reference counting in
> > > order to reuse AdminClient instances.  So they start by implementing
> > > essentially a delegating class, which just forwards every method to an
> > > underlying AdminClient instance.  But this has the same problem that it
> > > breaks every time the upstream project adds an API.  In order to
> support
> > > this, we could have an official DelegatingAdminClient base class that
> > > forwarded every method to an underlying AdminClient instance.  Then the
> > > client code could override the methods they needed, like close.
> >
> > Again, this would be trivial to implement using dynamic proxies. No need
> > for us to implement any `DelegatingAdminClient`. If this is an interface
> we
> > empower users to do this for themselves.
> >
> > best,
> > Colin
> >
> > On Mon, 10 Jun 2019 at 21:44, Colin McCabe  wrote:
> >
> > > Hi Andy,
> > >
> > > This is a big change, and I don't think there has been a lot of
> discussion
> > > about the pros and cons.  What specific benefits do we get from
> > > transitioning to using an interface rather than an abstract class?
> > >
> > > If we are serious about doing this, would it be cleaner to just change
> > > AdminClient from an abstract class to an interface in Kafka 3.0

Re: [VOTE] KIP-476: Add Java AdminClient interface

2019-06-11 Thread Andy Coates
Thanks for the response Colin,

> What specific benefits do we get from transitioning to using an interface
rather than an abstract class?

This is covered in the KIP: "An AdminClient interface has several
advantages over an abstract base class, most notably allowing
multi-inheritance and the use of dynamic proxies"
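[Editor's note: a tiny illustration of the multi-inheritance point, using stand-in interfaces rather than the real Kafka API. A class can implement `Admin` alongside any other interface; with `AdminClient` as an abstract base class this is impossible, since Java permits only one superclass.]

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in interfaces; the real Admin interface has many more methods.
interface Admin { String describeCluster(); }
interface Audited { List<String> calls(); }

// A test double implementing Admin *and* a second interface. Were Admin an
// abstract class, RecordingAdmin could extend nothing else.
class RecordingAdmin implements Admin, Audited {
    private final List<String> calls = new ArrayList<>();
    public String describeCluster() {
        calls.add("describeCluster");  // record the call for later assertions
        return "test-cluster";
    }
    public List<String> calls() { return calls; }
}
```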

> If we are serious about doing this, would it be cleaner to just change
AdminClient from an abstract class to an interface in Kafka 3.0?  It would
break binary compatibility, but you have to break binary compatibility in
any case to get what you want here (no abstract class).  And it would have
the advantage of not creating a lot of churn in the codebase as people
replaced "AdminClient" with "Admin" all over the place.  What do you think?

How far off is Kafka 3.0? This is causing us pain right now on a regular
basis and, from Ryanne's email above, we're not alone. I'm not against
making this a change in Kafka 3.0, but only if it's imminent.

> On a related note, one problem I've seen is that people will subclass
AdminClient for testing.  Then, every time Kafka adds a new API, we add a
new abstract method to the base class, which breaks compilation for them.
Their test classes would have been fine simply failing when the new API was
called.  So perhaps one useful class would be a class that implements the
AdminClient API, and throws UnimplementedException for every method.  The
test classes could subclass this class and never have to worry about new
methods getting added again.

This is similar to what KSQL does for the other Kafka client APIs, except
we use a dynamic proxy, rather than having to hand code the exception
throwing.
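The dynamic-proxy approach described here can be sketched in a few lines. This is an illustration, not KSQL's actual code: `UnsupportedStub` is an invented name, and the same helper would work against any client interface (including a future `Admin` interface).

```java
import java.lang.reflect.Proxy;

// Sketch of the dynamic-proxy test stub discussed above. UnsupportedStub is
// an invented name; it works for any interface, so no hand-coded method
// bodies are needed and new interface methods never break compilation.
final class UnsupportedStub {
    @SuppressWarnings("unchecked")
    static <T> T of(Class<T> iface) {
        return (T) Proxy.newProxyInstance(
            iface.getClassLoader(),
            new Class<?>[] { iface },
            // Every invocation fails the same way.
            (proxy, method, args) -> {
                throw new UnsupportedOperationException(method.getName());
            });
    }
}
```

A test would then hold, say, an `Admin` built via `UnsupportedStub.of(Admin.class)` and only override what it actually exercises.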

> Another pattern I've seen is people wanting to implement a class which is
similar to KafkaAdminClient in every way, except for the behavior of
close().  Specifically, people want to implement reference counting in
order to reuse AdminClient instances.  So they start by implementing
essentially a delegating class, which just forwards every method to an
underlying AdminClient instance.  But this has the same problem that it
breaks every time the upstream project adds an API.  In order to support
this, we could have an official DelegatingAdminClient base class that
forwarded every method to an underlying AdminClient instance.  Then the
client code could override the methods they needed, like close.

Again, this would be trivial to implement using dynamic proxies. No need
for us to implement any `DelegatingAdminClient`. If this is an interface we
empower users to do this for themselves.
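To make the reference-counting idea concrete, here is a hedged sketch of how a user could build it with a dynamic proxy once the client is an interface; `RefCounted` is an illustrative name, not a proposed Kafka class, and real code would also need to handle the timed `close` overloads.

```java
import java.lang.reflect.Proxy;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of reference counting via a dynamic proxy instead of a hand-written
// DelegatingAdminClient. RefCounted is an illustrative name.
final class RefCounted {
    static <T extends AutoCloseable> T wrap(Class<T> iface, T shared, AtomicInteger refs) {
        refs.incrementAndGet();
        @SuppressWarnings("unchecked")
        T proxy = (T) Proxy.newProxyInstance(
            iface.getClassLoader(),
            new Class<?>[] { iface },
            (p, method, args) -> {
                // Intercept close(): only the last holder closes the delegate.
                if ("close".equals(method.getName()) && refs.decrementAndGet() > 0) {
                    return null;
                }
                return method.invoke(shared, args); // forward everything else
            });
        return proxy;
    }
}
```

Wrapping the same underlying client twice and closing one handle leaves the delegate open; closing the last handle closes it for real.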

best,
Colin

On Mon, 10 Jun 2019 at 21:44, Colin McCabe  wrote:

> Hi Andy,
>
> This is a big change, and I don't think there has been a lot of discussion
> about the pros and cons.  What specific benefits do we get from
> transitioning to using an interface rather than an abstract class?
>
> If we are serious about doing this, would it be cleaner to just change
> AdminClient from an abstract class to an interface in Kafka 3.0?  It would
> break binary compatibility, but you have to break binary compatibility in
> any case to get what you want here (no abstract class).  And it would have
> the advantage of not creating a lot of churn in the codebase as people
> replaced "AdminClient" with "Admin" all over the place.  What do you think?
>
> On a related note, one problem I've seen is that people will subclass
> AdminClient for testing.  Then, every time Kafka adds a new API, we add a
> new abstract method to the base class, which breaks compilation for them.
> Their test classes would have been fine simply failing when the new API was
> called.  So perhaps one useful class would be a class that implements the
> AdminClient API, and throws UnimplementedException for every method.  The
> test classes could subclass this class and never have to worry about new
> methods getting added again.
>
> Another pattern I've seen is people wanting to implement a class which is
> similar to KafkaAdminClient in every way, except for the behavior of
> close().  Specifically, people want to implement reference counting in
> order to reuse AdminClient instances.  So they start by implementing
> essentially a delegating class, which just forwards every method to an
> underlying AdminClient instance.  But this has the same problem that it
> breaks every time the upstream project adds an API.  In order to support
> this, we could have an official DelegatingAdminClient base class that
> forwarded every method to an underlying AdminClient instance.  Then the
> client code could override the methods they needed, like close.
>
> best,
> Colin
>
>
> On Tue, Jun 4, 2019, at 09:17, Andy Coates wrote:
> > Hi folks
> >
> > As there's been no chatter on this KIP I'm assuming it's non-contentious,
> > (or just boring), hence I'd like to call a vote for KIP-476:
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-476%3A+Add+Java+AdminClient+Interface
> >
> > Thanks,
> >
> > Andy
> >
>


Re: [VOTE] KIP-476: Add Java AdminClient interface

2019-06-10 Thread Andy Coates
`AdminClient` would be deprecated purely because it would no longer serve
any purpose and would be virtually empty, getting all of its implementation
from the new interface. It would be nice to remove this from the API at the
next major version bump, hence the need to deprecate.

`AdminClient.create()` would return what it does today, (so not a breaking
change).

On Tue, 4 Jun 2019 at 22:24, Ryanne Dolan  wrote:

> > The existing `AdminClient` will be marked as deprecated.
>
> What's the reasoning behind this? I'm fine with the other changes, but
> would prefer to keep the existing public API intact if it's not hurting
> anything.
>
> Also, what will AdminClient.create() return? Would it be a breaking change?
>
> Ryanne
>
> On Tue, Jun 4, 2019, 11:17 AM Andy Coates  wrote:
>
>> Hi folks
>>
>> As there's been no chatter on this KIP I'm assuming it's non-contentious,
>> (or just boring), hence I'd like to call a vote for KIP-476:
>>
>>
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-476%3A+Add+Java+AdminClient+Interface
>>
>> Thanks,
>>
>> Andy
>>
>


[VOTE] KIP-476: Add Java AdminClient interface

2019-06-04 Thread Andy Coates
Hi folks

As there's been no chatter on this KIP I'm assuming it's non-contentious,
(or just boring), hence I'd like to call a vote for KIP-476:

https://cwiki.apache.org/confluence/display/KAFKA/KIP-476%3A+Add+Java+AdminClient+Interface

Thanks,

Andy


[jira] [Created] (KAFKA-8454) Add Java AdminClient interface

2019-05-31 Thread Andy Coates (JIRA)
Andy Coates created KAFKA-8454:
--

 Summary: Add Java AdminClient interface
 Key: KAFKA-8454
 URL: https://issues.apache.org/jira/browse/KAFKA-8454
 Project: Kafka
  Issue Type: Bug
  Components: admin, clients, core, streams
Reporter: Andy Coates
Assignee: Andy Coates


Task to track the work of [KIP-476: Add Java AdminClient 
Interface|https://cwiki.apache.org/confluence/display/KAFKA/KIP-476%3A+Add+Java+AdminClient+Interface]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[DISCUSS] KIP-476: Add Java AdminClient Interface

2019-05-31 Thread Andy Coates
Hi folks,

I'd like to start a discussion thread for KIP-476:

https://cwiki.apache.org/confluence/display/KAFKA/KIP-476%3A+Add+Java+AdminClient+Interface

Thanks,

Andy


Re: [DISCUSS] KIP-464 Default Replication Factor for AdminClient#createTopic

2019-05-02 Thread Andy Coates
Rather than adding overloaded constructors, which can lead to API bloat, how 
about using a builder pattern?

I see it’s already got some constructor overloading, but we could add a single 
new constructor that takes just the name, and support everything else being set 
via builder methods.

This would result in a better long-term API as the number of options increases.
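The builder style being suggested can be sketched as follows. This is a hypothetical illustration: `NewTopicBuilder` and its methods are invented names, not part of the AdminClient API under discussion, and `-1` standing for "use the cluster default" is an assumption borrowed from the KIP-464 idea.

```java
// Hypothetical sketch of the builder style suggested above; NewTopicBuilder
// is an invented name, not Kafka code. Only the name is mandatory; every
// other option gets a default instead of another constructor overload.
final class NewTopicBuilder {
    private final String name;
    private int numPartitions = 1;        // default instead of an overload
    private short replicationFactor = -1; // -1 = use the cluster default (assumption)

    NewTopicBuilder(String name) { this.name = name; }

    NewTopicBuilder numPartitions(int numPartitions) {
        this.numPartitions = numPartitions;
        return this; // returning this is what makes the calls chainable
    }

    NewTopicBuilder replicationFactor(short replicationFactor) {
        this.replicationFactor = replicationFactor;
        return this;
    }

    String describe() {
        return name + ":" + numPartitions + ":" + replicationFactor;
    }
}
```

Adding a new option later is then just one new builder method, with no combinatorial growth in constructors.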

Sent from my iPhone

> On 30 Apr 2019, at 16:28, Almog Gavra  wrote:
> 
> Hello Everyone,
> 
> I'd like to start a discussion on KIP-464:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-464%3A+Default+Replication+Factor+for+AdminClient%23createTopic
> 
> It's about allowing users of the AdminClient to supply only a partition
> count and to use the default replication factor configured by the kafka
> cluster. Happy to receive any and all feedback!
> 
> Cheers,
> Almog


Re: [VOTE] KIP-421: Support resolving externalized secrets in AbstractConfig

2019-01-25 Thread Andy Coates
t how to make this work with dynamic configs (which would also be a
> > nice thing to extend to, e.g., Connect).
> >
> > As a practical suggestion, while it doesn't give you the update for free,
> > we could consider also deprecating the existing constructor to encourage
> > people to update. Further, if we're worried about confusion about how to
> > load the two files, we could have a constructor that does that default
> > pattern for you.
> >
> > -Ewen
> >
> > On Thu, Jan 24, 2019 at 11:36 AM Colin McCabe 
> wrote:
> >
> > > On Thu, Jan 24, 2019, at 11:25, TEJAL ADSUL wrote:
> > > >
> > > >
> > > > On 2019/01/24 17:26:02, Andy Coates  wrote:
> > > > > I'm wondering why we've rejected changing AbstractConfig to
> > > automatically
> > > > > resolve the variables?
> > > > >
> > > > > > 1. Change AbstractConfig to *automatically* resolve variables of
> > the
> > > form
> > > > > specified in KIP-297. This was rejected because it would change the
> > > > > behavior of existing code and might cause unexpected effects.
> > > > >
> > > > > Doing so seems to me to have two very large benefits:
> > > > >
> > > > > 1. It allows the config providers to be defined within the same
> file
> > > as the
> > > > > config that uses the providers, e.g.
> > > > >
> > > > > config.providers=file,vault
> > > > > config.providers.file.class=org.apache.kafka.connect.configs.FileConfigProvider
> > > > > config.providers.file.param.path=/mnt/secrets/passwords
> > > > >
> > > > > foo.baz=/usr/temp/
> > > > > foo.bar=${file:/path/to/variables.properties:foo.bar}
> > > > >
> > > > > Is this possible with what's currently being proposed? i.e could
> you
> > > load
> > > > > the file and pass the map first to `loadConfigProviders` and then
> > > again to
> > > > > the constructor?
> > > > >
> > > > > 2. It allows _all_ existing clients of the class, e.g. those in
> > Apache
> > > > > Kafka or in applications written by other people that use the
> class,
> > > to get
> > > > > this functionality for free, i.e. without any code changes.  (I
> > realize
> > > > > this is probably where the 'unexpected effects' comes from).
> > > > >
> > > > > I'm assuming the unexpected side effects come about if an existing
> > > > > properties file already contains compatible config.providers
> > > > > entries _and_ has other properties in the form ${xx:yy} or ${xx:yy:zz}.
> > > > > While possible, this seems fairly unlikely except for random client
> > > > > property files. So I'm assuming there's a specific instance where we
> > > > > think this is likely? Something to do with Connect config maybe?
> > > > >
> > > > > Personally, I think we should do our best to make this work
> > seamlessly
> > > /
> > > > > transparently, because we're likely going to have this
> functionality
> > > for a
> > > > > long time.
> > > > >
> > > > > Andy
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > On Tue, 22 Jan 2019 at 

Re: [VOTE] KIP-421: Support resolving externalized secrets in AbstractConfig

2019-01-24 Thread Andy Coates
I'm wondering why we've rejected changing AbstractConfig to automatically
resolve the variables?

> 1. Change AbstractConfig to *automatically* resolve variables of the form
specified in KIP-297. This was rejected because it would change the
behavior of existing code and might cause unexpected effects.

Doing so seems to me to have two very large benefits:

1. It allows the config providers to be defined within the same file as the
config that uses the providers, e.g.

config.providers=file,vault
config.providers.file.class=org.apache.kafka.connect.configs.FileConfigProvider
config.providers.file.param.path=/mnt/secrets/passwords

foo.baz=/usr/temp/
foo.bar=${file:/path/to/variables.properties:foo.bar}

Is this possible with what's currently being proposed? I.e. could you load
the file and pass the map first to `loadConfigProviders` and then again to
the constructor?
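A minimal sketch of the `${provider:path:key}` substitution being discussed (the KIP-297 syntax): `VariableResolver` and the nested-map lookup are assumptions for illustration only, not the proposed AbstractConfig API, and the path segment is ignored here for brevity.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch of KIP-297-style variable substitution; not the
// proposed AbstractConfig API. The path segment (group 2) is ignored.
final class VariableResolver {
    // Matches ${provider:path:key}; the path may itself contain colons.
    private static final Pattern VAR = Pattern.compile("\\$\\{([^:}]+):([^}]*):([^}]+)\\}");

    static String resolve(String value, Map<String, Map<String, String>> providers) {
        Matcher m = VAR.matcher(value);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String provider = m.group(1);
            String key = m.group(3);
            String resolved = providers.getOrDefault(provider, Map.of()).get(key);
            // Leave the token untouched if nothing resolves it.
            m.appendReplacement(out,
                Matcher.quoteReplacement(resolved != null ? resolved : m.group(0)));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

The point is simply that resolution can run as a plain string pass over the already-loaded map, before parsing and validation.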

2. It allows _all_ existing clients of the class, e.g. those in Apache
Kafka or in applications written by other people that use the class, to get
this functionality for free, i.e. without any code changes.  (I realize
this is probably where the 'unexpected effects' comes from).

I'm assuming the unexpected side effects come about if an existing
properties file already contains compatible config.providers entries _and_
has other properties in the form ${xx:yy} or ${xx:yy:zz}. While possible,
this seems fairly unlikely except for random client property files. So I'm
assuming there's a specific instance where we think this is likely?
Something to do with Connect config maybe?

Personally, I think we should do our best to make this work seamlessly /
transparently, because we're likely going to have this functionality for a
long time.

Andy




On Tue, 22 Jan 2019 at 17:38, te...@confluent.io  wrote:

> Hi all,
>
> We would like to start vote on KIP-421 to to enhance the AbstractConfig
> base class to support replacing variables in configurations just prior to
> parsing and validation.
>
> Link for the KIP:
>
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-421%3A+Support+resolving+externalized+secrets+in+AbstractConfig
>
>
> Thanks,
> Tejal
>


[jira] [Resolved] (KAFKA-7069) AclCommand does not allow 'create' operation on 'topic'

2018-06-18 Thread Andy Coates (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Coates resolved KAFKA-7069.

Resolution: Invalid

> AclCommand does not allow 'create'  operation on 'topic'
> 
>
> Key: KAFKA-7069
> URL: https://issues.apache.org/jira/browse/KAFKA-7069
> Project: Kafka
>  Issue Type: Bug
>  Components: core, security
>Affects Versions: 2.0.0
>    Reporter: Andy Coates
>Assignee: Andy Coates
>Priority: Major
>
> KAFKA-6726 saw 
> [KIP-277|https://cwiki.apache.org/confluence/display/KAFKA/KIP-277+-+Fine+Grained+ACL+for+CreateTopics+API]
>  implemented, which extended the set of operations allowed on the 'topic' 
> resource type to include 'create'.
> The AclCommand CLI class currently rejects this new operation, e.g. running:
> {{bin/kafka-acls --authorizer-properties zookeeper.connect=localhost:2181 
> --add --allow-principal User:KSQL --operation create --topic t1}}
> Fails with error:
> {{ResourceType Topic only supports operations 
> Read,All,AlterConfigs,DescribeConfigs,Delete,Write,Describe}}





[jira] [Created] (KAFKA-7069) AclCommand does not allow 'create' operation on 'topic'

2018-06-18 Thread Andy Coates (JIRA)
Andy Coates created KAFKA-7069:
--

 Summary: AclCommand does not allow 'create'  operation on 'topic'
 Key: KAFKA-7069
 URL: https://issues.apache.org/jira/browse/KAFKA-7069
 Project: Kafka
  Issue Type: Bug
  Components: core, security
Affects Versions: 2.0.0
    Reporter: Andy Coates
    Assignee: Andy Coates


KAFKA-6726 saw 
[KIP-277|https://cwiki.apache.org/confluence/display/KAFKA/KIP-277+-+Fine+Grained+ACL+for+CreateTopics+API]
 implemented, which extended the set of operations allowed on the 'topic' 
resource type to include 'create'.

The AclCommand CLI class currently rejects this new operation, e.g. running:

{{bin/kafka-acls --authorizer-properties zookeeper.connect=localhost:2181 --add 
--allow-principal User:KSQL --operation create --topic t1}}

Fails with error:

{{ResourceType Topic only supports operations 
Read,All,AlterConfigs,DescribeConfigs,Delete,Write,Describe}}





Re: [DISCUSS] KIP-290: Support for wildcard suffixed ACLs

2018-06-13 Thread Andy Coates
Hi All,

Just a note to say the KIP documentation has been updated inline with the
current implementation.

https://cwiki.apache.org/confluence/display/KAFKA/KIP-290%3A+Support+for+Prefixed+ACLs

Thanks,

Andy

On 25 May 2018 at 17:44, Andy Coates  wrote:

> > Since Resource is a concrete class now, we can't make it an interface
> without breaking API compatibility.
>
> Very true... hummm hacky, but we could sub-class Resource.
>
> > Even if it were possible to do compatibly, I would argue it's a bad
> idea.  If we need to add another bit of state like case insensitivity, we
> don't want to have LiteralCaseInsensistiveResource,
> WildcardSuffixedCaseInsensitiveResource, etc. etc.  You need 2^n
> subclasses to represent N bits of state.
>
> Not sure I agree - I would implement such dimensions using composition,
> not different implementations, e.g. new CaseInsensitiveResourceFilter(new
> PrefixedResourceFilter("/foo")) to get a case-insensitive prefixed filter.
>
>
>
> On 22 May 2018 at 05:15, Colin McCabe  wrote:
>
>> On Mon, May 21, 2018, at 04:53, Andy Coates wrote:
>> > Hey Piyush,
>> >
>> > Thanks for the updated KIP! Couple of minor points from me:
>> >
>> > When storing wildcard-suffixed Acls in ZK, drop the asterisk of the end
>> for
>> > the path, e.g. change "*/kafka-wildcard-acl/Topic/teamA*" * to "*/*
>> > *kafka-wildcard-acl**/Topic/**teamA"*. This reduces error conditions,
>> i.e.
>> > this is a place for storing wildcard-suffixed Acls, so it implicitly
>> ends
>> > in an asterisk. If you include the asterisk in the path then you need to
>> > validate each entry, when reading, ends with an asterisk, and do
>> something
>> > if they don't. If you don't include the trailing asterisk then there is
>> > nothing to validate and you can treat the prefix as a literal, (i.e. no
>> > escaping needed).  TBH I'd probably drop the asterisk from the in-memory
>> > representation as well, for the same reason.
>>
>> Hi Andy,
>>
>> I agree.  If everything in ZK under /kafka-wildcard-acl/ is a prefix ACL,
>> there is no need to include the star at the end.  And really, it should be
>> called something like /kafka-prefix-acl/, since it's only vaguely related
>> to the idea of wildcards.
>>
>> >
>> > Rather than creating an enum to indicate the type of a resource, you
>> could
>> > instead use polymorphism, e.g. make Resource an interface and have two
>> > implementations: LiteralResource and WildcardSuffixedResource.  This is
>> > also extensible, but may also allow for a cleaner implementation.
>>
>> Since Resource is a concrete class now, we can't make it an interface
>> without breaking API compatibility.
>>
>> Even if it were possible to do compatibly, I would argue it's a bad
>> idea.  If we need to add another bit of state like case insensitivity, we
>> don't want to have LiteralCaseInsensistiveResource,
>> WildcardSuffixedCaseInsensitiveResource, etc. etc.  You need 2^n
>> subclasses to represent N bits of state.
>>
>> I would argue that there should be a field in Resource like NameType
>> which can be LITERAL or PREFIX.  That leaves us in a good position when
>> someone inevitably comes up with a new NameType.
>>
>> Does this still have a chance to get in, or has the KIP window closed?  I
>> would argue with one or two minor changes it's ready to go.  Pretty much
>> all of the compatibility problems are solved with the separate ZK hierarchy.
>>
>> best,
>> Colin
>>
>> >
>> > Andy
>> >
>> > On 21 May 2018 at 01:58, Rajini Sivaram 
>> wrote:
>> >
>> > > Hi Piyush, Thanks for the KIP!
>> > >
>> > > +1 (binding)
>> > >
>> > > Regards,
>> > >
>> > > Rajini
>> > >
>> > > On Sun, May 20, 2018 at 2:53 PM, Andy Coates 
>> wrote:
>> > >
>> > > > Awesome last minute effort Piyush.
>> > > >
>> > > > Really appreciate your time and input,
>> > > >
>> > > > Andy
>> > > >
>> > > > Sent from my iPhone
>> > > >
>> > > > > On 19 May 2018, at 03:43, Piyush Vijay 
>> wrote:
>> > > > >
>> > > > > Updated the KIP.
>> > > > >
>> > > > > 1. New enum field 'ResourceNameType' in Resource a

[jira] [Resolved] (KAFKA-7008) Consider replacing the Resource field in AclBinding with a ResourceFilter or ResourceMatcher

2018-06-07 Thread Andy Coates (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Coates resolved KAFKA-7008.

Resolution: Won't Fix

> Consider replacing the Resource field in AclBinding with a ResourceFilter or 
> ResourceMatcher
> 
>
> Key: KAFKA-7008
> URL: https://issues.apache.org/jira/browse/KAFKA-7008
> Project: Kafka
>  Issue Type: Sub-task
>  Components: core, security
>Reporter: Andy Coates
>Assignee: Andy Coates
>Priority: Major
> Fix For: 2.0.0
>
>
> Relating to one of the outstanding work items in PR 
> [#5117|https://github.com/apache/kafka/pull/5117]
> The AclBinding class currently has a Resource field member. But this may make 
> more sense as a ResourceFilter, or some new ResourceMatcher / 
> ResourceSelector class.
> Investigate...
>  





[jira] [Created] (KAFKA-7011) Investigate if its possible to drop the ResourceNameType field from Java Resource class.

2018-06-06 Thread Andy Coates (JIRA)
Andy Coates created KAFKA-7011:
--

 Summary: Investigate if its possible to drop the ResourceNameType 
field from Java Resource class.
 Key: KAFKA-7011
 URL: https://issues.apache.org/jira/browse/KAFKA-7011
 Project: Kafka
  Issue Type: Sub-task
  Components: core, security
Reporter: Andy Coates
 Fix For: 2.0.0








[jira] [Created] (KAFKA-7010) Rename ResourceNameType.ANY to MATCH

2018-06-06 Thread Andy Coates (JIRA)
Andy Coates created KAFKA-7010:
--

 Summary: Rename ResourceNameType.ANY to MATCH
 Key: KAFKA-7010
 URL: https://issues.apache.org/jira/browse/KAFKA-7010
 Project: Kafka
  Issue Type: Sub-task
  Components: core, security
Reporter: Andy Coates
 Fix For: 2.0.0


Following on from PR [#5117|https://github.com/apache/kafka/pull/5117] 
and discussions with Colin McCabe...

The current ResourceNameType.ANY may be misleading as it performs pattern 
matching for wildcard and prefixed bindings. Whereas ResourceName.ANY just 
brings back any resource name.

Renaming to ResourceNameType.MATCH and adding more Java doc should clear this 
up.
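The distinction the rename is after could be sketched roughly as below; the javadoc wording is an assumption for illustration, not the final Kafka code, and only the MATCH semantics described in this ticket are taken as given.

```java
// Illustrative sketch of the renaming proposed above; the javadoc wording is
// an assumption, not final Kafka code.
enum ResourceNameType {
    /** Matches the literal resource name exactly; no pattern matching. */
    LITERAL,
    /** Matches any resource whose name starts with the binding's name. */
    PREFIXED,
    /**
     * Filter type that pattern-matches: selects the literal, wildcard and
     * prefixed bindings that affect the supplied resource name.
     */
    MATCH
}
```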

 





[jira] [Created] (KAFKA-7008) Consider replacing the Resource field in AclBinding with a ResourceFilter or ResourceMatcher.

2018-06-06 Thread Andy Coates (JIRA)
Andy Coates created KAFKA-7008:
--

 Summary: Consider replacing the Resource field in AclBinding with 
a ResourceFilter or ResourceMatcher.
 Key: KAFKA-7008
 URL: https://issues.apache.org/jira/browse/KAFKA-7008
 Project: Kafka
  Issue Type: Sub-task
  Components: core, security
Reporter: Andy Coates
Assignee: Andy Coates
 Fix For: 2.0.0


Relating to one of the outstanding work items in PR 
[#5117|https://github.com/apache/kafka/pull/5117]

The AclBinding class currently has a Resource field member. But this may make 
more sense as a ResourceFilter, or some new ResourceMatcher / ResourceSelector 
class.

Investigate...

 





[jira] [Created] (KAFKA-7007) All ACL changes should use single /kafka-acl-changes path

2018-06-06 Thread Andy Coates (JIRA)
Andy Coates created KAFKA-7007:
--

 Summary: All ACL changes should use single /kafka-acl-changes path 
 Key: KAFKA-7007
 URL: https://issues.apache.org/jira/browse/KAFKA-7007
 Project: Kafka
  Issue Type: Sub-task
  Components: core, security
Reporter: Andy Coates
Assignee: Andy Coates
 Fix For: 2.0.0


Relating to one of the outstanding work items in PR 
[#5117|https://github.com/apache/kafka/pull/5117]

 

The above PR sees ACL change notifications coming through two paths.  Change 
the code to use a single path, with a Json value that defines the 
resource-name-type of the changed binding.





[jira] [Created] (KAFKA-7006) Remove duplicate Scala ResourceNameType class.

2018-06-06 Thread Andy Coates (JIRA)
Andy Coates created KAFKA-7006:
--

 Summary: Remove duplicate Scala ResourceNameType class.
 Key: KAFKA-7006
 URL: https://issues.apache.org/jira/browse/KAFKA-7006
 Project: Kafka
  Issue Type: Sub-task
  Components: core, security
Reporter: Andy Coates
Assignee: Andy Coates
 Fix For: 2.0.0


Relating to one of the outstanding work items in PR 
[#5117|https://github.com/apache/kafka/pull/5117]

The kafka.security.auth.ResourceTypeName class should be dropped in favour of 
the Java one.

 





[jira] [Created] (KAFKA-7005) Remove duplicate Java Resource class.

2018-06-06 Thread Andy Coates (JIRA)
Andy Coates created KAFKA-7005:
--

 Summary: Remove duplicate Java Resource class.
 Key: KAFKA-7005
 URL: https://issues.apache.org/jira/browse/KAFKA-7005
 Project: Kafka
  Issue Type: Sub-task
  Components: core, security
Reporter: Andy Coates
Assignee: Andy Coates
 Fix For: 2.0.0


Relating to one of the outstanding work items in PR 
[#5117|https://github.com/apache/kafka/pull/5117]

The o.a.k.c.request.Resource class could be dropped in favour of 
o.a.k.c.config.ConfigResource.

This will remove the duplication of `Resource` classes.





Re: [DISCUSS] KIP-290: Support for wildcard suffixed ACLs

2018-05-25 Thread Andy Coates
> Since Resource is a concrete class now, we can't make it an interface
without breaking API compatibility.

Very true... hummm hacky, but we could sub-class Resource.

> Even if it were possible to do compatibly, I would argue it's a bad
idea.  If we need to add another bit of state like case insensitivity, we
don't want to have LiteralCaseInsensistiveResource,
WildcardSuffixedCaseInsensitiveResource, etc. etc.  You need 2^n subclasses
to represent N bits of state.

Not sure I agree - I would implement such dimensions using composition, not
different implementations, e.g. new CaseInsensitiveResourceFilter(new
PrefixedResourceFilter("/foo")) to get a case-insensitive prefixed filter.
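The composition approach can be sketched like this; `PrefixedResourceFilter` and `CaseInsensitiveResourceFilter` are illustrative names from this thread, not Kafka classes, and `Predicate<String>` stands in for whatever the real filter interface would be.

```java
import java.util.function.Predicate;

// Sketch of the composition idea above; these are illustrative names, not
// Kafka classes. Predicate<String> stands in for a real filter interface.
final class PrefixedResourceFilter implements Predicate<String> {
    private final String prefix;
    PrefixedResourceFilter(String prefix) { this.prefix = prefix; }
    @Override public boolean test(String name) { return name.startsWith(prefix); }
}

// Wraps any filter and normalises case before delegating; assumes the
// wrapped filter was built with a lower-case pattern.
final class CaseInsensitiveResourceFilter implements Predicate<String> {
    private final Predicate<String> delegate;
    CaseInsensitiveResourceFilter(Predicate<String> delegate) { this.delegate = delegate; }
    @Override public boolean test(String name) { return delegate.test(name.toLowerCase()); }
}
```

Each new dimension of behaviour is then one wrapper class rather than a doubling of the subclass count.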



On 22 May 2018 at 05:15, Colin McCabe  wrote:

> On Mon, May 21, 2018, at 04:53, Andy Coates wrote:
> > Hey Piyush,
> >
> > Thanks for the updated KIP! Couple of minor points from me:
> >
> > When storing wildcard-suffixed Acls in ZK, drop the asterisk of the end
> for
> > the path, e.g. change "*/kafka-wildcard-acl/Topic/teamA*" * to "*/*
> > *kafka-wildcard-acl**/Topic/**teamA"*. This reduces error conditions,
> i.e.
> > this is a place for storing wildcard-suffixed Acls, so it implicitly ends
> > in an asterisk. If you include the asterisk in the path then you need to
> > validate each entry, when reading, ends with an asterisk, and do
> something
> > if they don't. If you don't include the trailing asterisk then there is
> > nothing to validate and you can treat the prefix as a literal, (i.e. no
> > escaping needed).  TBH I'd probably drop the asterisk from the in-memory
> > representation as well, for the same reason.
>
> Hi Andy,
>
> I agree.  If everything in ZK under /kafka-wildcard-acl/ is a prefix ACL,
> there is no need to include the star at the end.  And really, it should be
> called something like /kafka-prefix-acl/, since it's only vaguely related
> to the idea of wildcards.
>
> >
> > Rather than creating an enum to indicate the type of a resource, you
> could
> > instead use polymorphism, e.g. make Resource an interface and have two
> > implementations: LiteralResource and WildcardSuffixedResource.  This is
> > also extensible, but may also allow for a cleaner implementation.
>
> Since Resource is a concrete class now, we can't make it an interface
> without breaking API compatibility.
>
> Even if it were possible to do compatibly, I would argue it's a bad idea.
> If we need to add another bit of state like case insensitivity, we don't
> want to have LiteralCaseInsensistiveResource,
> WildcardSuffixedCaseInsensitiveResource, etc. etc.  You need 2^n
> subclasses to represent N bits of state.
>
> I would argue that there should be a field in Resource like NameType which
> can be LITERAL or PREFIX.  That leaves us in a good position when someone
> inevitably comes up with a new NameType.
>
> Does this still have a chance to get in, or has the KIP window closed?  I
> would argue with one or two minor changes it's ready to go.  Pretty much
> all of the compatibility problems are solved with the separate ZK hierarchy.
>
> best,
> Colin
>
> >
> > Andy
> >
> > On 21 May 2018 at 01:58, Rajini Sivaram  wrote:
> >
> > > Hi Piyush, Thanks for the KIP!
> > >
> > > +1 (binding)
> > >
> > > Regards,
> > >
> > > Rajini
> > >
> > > On Sun, May 20, 2018 at 2:53 PM, Andy Coates 
> wrote:
> > >
> > > > Awesome last minute effort Piyush.
> > > >
> > > > Really appreciate your time and input,
> > > >
> > > > Andy
> > > >
> > > > Sent from my iPhone
> > > >
> > > > > On 19 May 2018, at 03:43, Piyush Vijay 
> wrote:
> > > > >
> > > > > Updated the KIP.
> > > > >
> > > > > 1. New enum field 'ResourceNameType' in Resource and ResourceFilter
> > > > classes.
> > > > > 2. modify getAcls() and rely on ResourceNameType' field in
> Resource to
> > > > > return either exact matches or all matches based on
> wildcard-suffix.
> > > > > 3. CLI changes to identify if resource name is literal or
> > > wildcard-suffix
> > > > > 4. Escaping doesn't work and isn't required if we're keeping a
> separate
> > > > > path on ZK (kafka-wildcard-acl) to store wildcard-suffix ACLs.
> > > > > 5. New API keys for Create / Delete / Describe Acls request with a
> new
> > > > &

Re: [DISCUSS] KIP-290: Support for wildcard suffixed ACLs

2018-05-21 Thread Andy Coates
Hey Piyush,

Thanks for the updated KIP! Couple of minor points from me:

When storing wildcard-suffixed Acls in ZK, drop the asterisk off the end of
the path, e.g. change "/kafka-wildcard-acl/Topic/teamA*" to
"/kafka-wildcard-acl/Topic/teamA". This reduces error conditions, i.e.
this is a place for storing wildcard-suffixed Acls, so it implicitly ends
in an asterisk. If you include the asterisk in the path then you need to
validate, when reading, that each entry ends with an asterisk, and do
something if they don't. If you don't include the trailing asterisk then
there is nothing to validate and you can treat the prefix as a literal,
(i.e. no escaping needed).  TBH I'd probably drop the asterisk from the
in-memory representation as well, for the same reason.
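A tiny sketch of why storing the value without the asterisk simplifies matters: the stored name is then a plain prefix, so matching is a bare startsWith with nothing to validate or escape. `matchesPrefixAcl` is an illustrative name, not Kafka code.

```java
// Sketch of the point above: with the trailing asterisk dropped, the stored
// value is a plain prefix, so matching needs no validation or escaping.
// matchesPrefixAcl is an illustrative name, not Kafka code.
final class PrefixMatch {
    static boolean matchesPrefixAcl(String storedPrefix, String resourceName) {
        return resourceName.startsWith(storedPrefix);
    }
}
```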

Rather than creating an enum to indicate the type of a resource, you could
instead use polymorphism, e.g. make Resource an interface and have two
implementations: LiteralResource and WildcardSuffixedResource.  This is
also extensible, but may also allow for a cleaner implementation.

Andy

On 21 May 2018 at 01:58, Rajini Sivaram  wrote:

> Hi Piyush, Thanks for the KIP!
>
> +1 (binding)
>
> Regards,
>
> Rajini
>
> On Sun, May 20, 2018 at 2:53 PM, Andy Coates  wrote:
>
> > Awesome last minute effort Piyush.
> >
> > Really appreciate your time and input,
> >
> > Andy
> >
> > Sent from my iPhone
> >
> > > On 19 May 2018, at 03:43, Piyush Vijay  wrote:
> > >
> > > Updated the KIP.
> > >
> > > 1. New enum field 'ResourceNameType' in Resource and ResourceFilter
> > classes.
> > > 2. modify getAcls() and rely on ResourceNameType' field in Resource to
> > > return either exact matches or all matches based on wildcard-suffix.
> > > 3. CLI changes to identify if resource name is literal or
> wildcard-suffix
> > > 4. Escaping doesn't work and isn't required if we're keeping a separate
> > > path on ZK (kafka-wildcard-acl) to store wildcard-suffix ACLs.
> > > 5. New API keys for Create / Delete / Describe Acls request with a new
> > > field in schemas for 'ResourceNameType'.
> > >
> > > Looks ready to me for the vote, will start voting thread now. Thanks
> > > everyone for the valuable feedback.
> > >
> > >
> > >
> > >
> > > Piyush Vijay
> > >
> > >
> > > Piyush Vijay
> > >
> > >> On Fri, May 18, 2018 at 6:07 PM, Andy Coates 
> wrote:
> > >>
> > >> Hi Piyush,
> > >>
> > >> We're fast approaching the KIP deadline. Are you actively working on
> > this?
> > >> If you're not I can take over.
> > >>
> > >> Thanks,
> > >>
> > >> Andy
> > >>
> > >>> On 18 May 2018 at 14:25, Andy Coates  wrote:
> > >>>
> > >>> OK I've read it now.
> > >>>
> > >>> 1. I see you have an example:
> > >>>> For example: If I want to fetch all ACLs that match ’topicA*’, it’s
> > not
> > >>> possible without introducing new API AND maintaining backwards
> > >>> compatibility.
> > >>> getAcls takes a Resource, right, which would be either a full
> resource
> > >>> name or 'ALL', i.e. '*', right?  The point of the call is to get all
> > ACLs
> > >>> relating to a specific resource, not a partial resource like
> 'topicA*'.
> > >>> Currently, I'm guessing / half-remembering that if you ask it for
> ACLs
> > >> for
> > >>> topic 'foo' it doesn't include global 'ALL' ACLs in the list - that
> > would
> > >>> be a different call.  With the introduction of partial wildcards I
> > think
> > >>> the _most_ backwards compatible change would be to have
> > >>> getAcls("topic:foo") return all the ACLs, including those that affect
> this
> > >>> topic. This could include any '*'/ALL Acls, (which would be a small
> > >>> backwards compatible change), or exclude them as it currently does.
> > >>> Excluding any matching partial wildcard acl, e.g. 'f*' would break
> > >>> compatibility IMHO.
> > >>>
> > >>> 2. Example command lines, showing how to add ACLs to specific
> resources
> > >>> that *end* with an asterisk char and adding wildcard-suffixed ACLs,
> > would
> > >>> really hel

Re: [DISCUSS] KIP-290: Support for wildcard suffixed ACLs

2018-05-20 Thread Andy Coates
Awesome last minute effort Piyush.

Really appreciate your time and input,

Andy

Sent from my iPhone

> On 19 May 2018, at 03:43, Piyush Vijay  wrote:
> 
> Updated the KIP.
> 
> 1. New enum field 'ResourceNameType' in Resource and ResourceFilter classes.
> 2. Modify getAcls() and rely on the 'ResourceNameType' field in Resource to
> return either exact matches or all matches based on wildcard-suffix.
> 3. CLI changes to identify if resource name is literal or wildcard-suffix
> 4. Escaping doesn't work and isn't required if we're keeping a separate
> path on ZK (kafka-wildcard-acl) to store wildcard-suffix ACLs.
> 5. New API keys for Create / Delete / Describe Acls request with a new
> field in schemas for 'ResourceNameType'.
> 
> Looks ready to me for the vote, will start voting thread now. Thanks
> everyone for the valuable feedback.
> 
> 
> 
> 
> Piyush Vijay
> 
> 
> Piyush Vijay
> 
>> On Fri, May 18, 2018 at 6:07 PM, Andy Coates  wrote:
>> 
>> Hi Piyush,
>> 
>> We're fast approaching the KIP deadline. Are you actively working on this?
>> If you're not I can take over.
>> 
>> Thanks,
>> 
>> Andy
>> 
>>> On 18 May 2018 at 14:25, Andy Coates  wrote:
>>> 
>>> OK I've read it now.
>>> 
>>> 1. I see you have an example:
>>>> For example: If I want to fetch all ACLs that match ’topicA*’, it’s not
>>> possible without introducing new API AND maintaining backwards
>>> compatibility.
>>> getAcls takes a Resource, right, which would be either a full resource
>>> name or 'ALL', i.e. '*', right?  The point of the call is to get all ACLs
>>> relating to a specific resource, not a partial resource like 'topicA*'.
>>> Currently, I'm guessing / half-remembering that if you ask it for ACLs
>> for
>>> topic 'foo' it doesn't include global 'ALL' ACLs in the list - that would
>>> be a different call.  With the introduction of partial wildcards I think
>>> the _most_ backwards compatible change would be to have
>>> getAcls("topic:foo") return all the ACLs, including those that affect this
>>> topic. This could include any '*'/ALL Acls, (which would be a small
>>> backwards compatible change), or exclude them as it currently does.
>>> Excluding any matching partial wildcard acl, e.g. 'f*' would break
>>> compatibility IMHO.
>>> 
>>> 2. Example command lines, showing how to add ACLs to specific resources
>>> that *end* with an asterisk char and adding wildcard-suffixed ACLs, would
>>> really help clarify the KIP. e.g.
>>> 
>>> bin/kafka-acls.sh --authorizer-properties zookeeper.connect=localhost:
>> 2181
>>> --add --allow-principal User:Bob --allow-principal User:Alice
>> --allow-host
>>> 198.51.100.0 --allow-host 198.51.100.1 --operation Read --group my-app-*
>>> 
>>> With the above command I can't see how the code can know if the user
>> means
>>> a literal group called 'my-app-*', or a wildcard suffix for any group
>>> starting with 'my-app-'. Escaping isn't enough as the escape char can
>> clash
>>> too, e.g. escaping a literal to 'my-app-\*' can still clash with someone
>>> wanting a wildcard suffix matching any group starting with 'my-app-\'.
>>> 
>>> So there needs to be a syntax change here, I think.  Maybe some new
>>> command line switch to either explicitly enable or disable
>>> 'wildcard-suffix' support?  Probably defaulting to wildcard-suffix being
>>> on, (better experience going forward), though off is more backwards
>>> compatible.
>>> 
>>> 
>>> 3. Again, examples of how to store ACLs for specific resources that
>> *end* with
>>> an asterisk and wildcard-suffix ACLs, with any escaping would really
>> help.
>>> 
>>> 
>>> 
>>>> On 18 May 2018 at 13:55, Andy Coates  wrote:
>>>> 
>>>> Hey Piyush,
>>>> 
>>>> Thanks for getting this in! :D
>>>> 
>>>> About to read now. But just quickly...
>>>> 
>>>> 1. I'll read up on the need for getMatchingAcls - but just playing
>> devils
>>>> advocate for a moment - if a current caller of getAcls() expects it to
>>>> return the full set of ACLs for a given resource, would post this change
>>>> only returning a sub set and requiri

Re: [DISCUSS] KIP-290: Support for wildcard suffixed ACLs

2018-05-18 Thread Andy Coates
Hi Piyush,

We're fast approaching the KIP deadline. Are you actively working on this?
If you're not I can take over.

Thanks,

Andy

On 18 May 2018 at 14:25, Andy Coates  wrote:

> OK I've read it now.
>
> 1. I see you have an example:
> > For example: If I want to fetch all ACLs that match ’topicA*’, it’s not
> possible without introducing new API AND maintaining backwards
> compatibility.
> getAcls takes a Resource, right, which would be either a full resource
> name or 'ALL', i.e. '*', right?  The point of the call is to get all ACLs
> relating to a specific resource, not a partial resource like 'topicA*'.
>  Currently, I'm guessing / half-remembering that if you ask it for ACLs for
> topic 'foo' it doesn't include global 'ALL' ACLs in the list - that would
> be a different call.  With the introduction of partial wildcards I think
> the _most_ backwards compatible change would be to have
> getAcls("topic:foo") return all the ACLs, including those that affect this
> topic. This could include any '*'/ALL Acls, (which would be a small
> backwards compatible change), or exclude them as it currently does.
> Excluding any matching partial wildcard acl, e.g. 'f*' would break
> compatibility IMHO.
>
> 2. Example command lines, showing how to add ACLs to specific resources
> that *end* with an asterisk char and adding wildcard-suffixed ACLs, would
> really help clarify the KIP. e.g.
>
> bin/kafka-acls.sh --authorizer-properties zookeeper.connect=localhost:2181
> --add --allow-principal User:Bob --allow-principal User:Alice --allow-host
> 198.51.100.0 --allow-host 198.51.100.1 --operation Read --group my-app-*
>
> With the above command I can't see how the code can know if the user means
> a literal group called 'my-app-*', or a wildcard suffix for any group
> starting with 'my-app-'. Escaping isn't enough as the escape char can clash
> too, e.g. escaping a literal to 'my-app-\*' can still clash with someone
> wanting a wildcard suffix matching any group starting with 'my-app-\'.
>
> So there needs to be a syntax change here, I think.  Maybe some new
> command line switch to either explicitly enable or disable
> 'wildcard-suffix' support?  Probably defaulting to wildcard-suffix being
> on, (better experience going forward), though off is more backwards
> compatible.
>
>
> 3. Again, examples of how to store ACLs for specific resources that *end* with
> an asterisk and wildcard-suffix ACLs, with any escaping would really help.
>
>
>
> On 18 May 2018 at 13:55, Andy Coates  wrote:
>
>> Hey Piyush,
>>
>> Thanks for getting this in! :D
>>
>> About to read now. But just quickly...
>>
>> 1. I'll read up on the need for getMatchingAcls - but just playing devils
>> advocate for a moment - if a current caller of getAcls() expects it to
>> return the full set of ACLs for a given resource, would, post this change,
>> returning only a subset and requiring them to call getMatchingAcls to
>> get the full set not itself be a break in compatibility? I'm thinking about
>> any tooling / UI / etc people may have built on top of this.  If I'm missing
>> the point, then maybe a concrete example, (if you've not already added one
>> to the doc), may help.
>>
>> 2. Something must change on the command line, surely? e.g. as command
>> line user how would the command differ if I wanted to add an ACL onto a
>> group called 'foo*' as opposed to all groups starting with 'foo'?
>>
>> 3. Thinking this through, I actually backtracked - I don't think it will
>> work due to path collisions, even with escaping - as the escaped version
>> can still collide.
>>
>> Off to read the doc now.
>>
>> On 18 May 2018 at 13:33, Piyush Vijay  wrote:
>>
>>> Ready to review. Let me know if something looks missing or not clear.
>>>
>>> Thanks
>>>
>>>
>>> Piyush Vijay
>>>
>>> On Fri, May 18, 2018 at 12:54 PM, Piyush Vijay 
>>> wrote:
>>>
>>> > Andy,
>>> >
>>> > 1. Updated the KIP about need for getMatchingAcls(). Basically, it's
>>> > required to add an inspection method without breaking compatibility.
>>> > getAcls() only looks at a single location.
>>> >
>>> > 2. Nothing will change from user's perspective. They will add / delete /
>>> > list ACLs as before.
>>> >
>>> > 3. Good point about moving everything to a v2 path. We might still
>>

Re: [DISCUSS] KIP-290: Support for wildcard suffixed ACLs

2018-05-18 Thread Andy Coates
OK I've read it now.

1. I see you have an example:
> For example: If I want to fetch all ACLs that match ’topicA*’, it’s not
possible without introducing new API AND maintaining backwards
compatibility.
getAcls takes a Resource, right, which would be either a full resource name
or 'ALL', i.e. '*', right?  The point of the call is to get all ACLs
relating to a specific resource, not a partial resource like 'topicA*'.
 Currently, I'm guessing / half-remembering that if you ask it for ACLs for
topic 'foo' it doesn't include global 'ALL' ACLs in the list - that would
be a different call.  With the introduction of partial wildcards I think
the _most_ backwards compatible change would be to have
getAcls("topic:foo") return all the ACLs, including those that affect this
topic. This could include any '*'/ALL Acls, (which would be a small
backwards compatible change), or exclude them as it currently does.
Excluding any matching partial wildcard acl, e.g. 'f*' would break
compatibility IMHO.
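
A toy sketch (not Kafka's Authorizer API) of the matching behaviour argued for here, where a lookup for topic 'foo' returns the literal ACL, any matching wildcard-suffix ACL, and the global '*' ACL:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only of the lookup semantics discussed above: the
// ACLs applying to resource "foo" would be the literal entry for "foo",
// any wildcard-suffix entry whose prefix matches (e.g. "f*"), and the
// global "*" entry.
class AclMatching {
    static List<String> matchingAclNames(String resourceName, List<String> storedNames) {
        List<String> matches = new ArrayList<>();
        for (String stored : storedNames) {
            boolean global = stored.equals("*");
            boolean literal = stored.equals(resourceName);
            boolean wildcardSuffix = stored.endsWith("*")
                    && resourceName.startsWith(stored.substring(0, stored.length() - 1));
            if (global || literal || wildcardSuffix) {
                matches.add(stored);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // "foo" is matched by the literal "foo", the suffix pattern "f*",
        // and the global "*", but not by "bar".
        System.out.println(matchingAclNames("foo", List.of("foo", "f*", "bar", "*")));
    }
}
```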

2. Example command lines, showing how to add ACLs to specific resources
that *end* with an asterisk char and adding wildcard-suffixed ACLs, would
really help clarify the KIP. e.g.

bin/kafka-acls.sh --authorizer-properties zookeeper.connect=localhost:2181
--add --allow-principal User:Bob --allow-principal User:Alice --allow-host
198.51.100.0 --allow-host 198.51.100.1 --operation Read --group my-app-*

With the above command I can't see how the code can know if the user means
a literal group called 'my-app-*', or a wildcard suffix for any group
starting with 'my-app-'. Escaping isn't enough as the escape char can clash
too, e.g. escaping a literal to 'my-app-\*' can still clash with someone
wanting a wildcard suffix matching any group starting with 'my-app-\'.
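
The clash can be shown concretely. This assumes a naive, hypothetical scheme that escapes only the asterisk; it illustrates the ambiguity and is not any actual Kafka code:

```java
// Demonstrates the collision described above: under an escape scheme that
// handles only '*', the stored form of a literal group name is
// byte-identical to the stored form of a wildcard-suffix pattern, so a
// reader cannot tell them apart. Names here are illustrative.
class EscapeCollision {
    // Naive escape: backslash before '*', but the escape char itself is
    // left alone -- which is exactly what makes the clash possible.
    static String escapeStar(String name) {
        return name.replace("*", "\\*");
    }

    public static void main(String[] args) {
        // A group literally named "my-app-*", escaped for storage:
        String literal = escapeStar("my-app-*");
        // A wildcard suffix over all groups starting with "my-app-\":
        String wildcard = escapeStar("my-app-\\") + "*";
        // Both store as my-app-\* -- indistinguishable.
        System.out.println(literal + " vs " + wildcard + " collide: "
                + literal.equals(wildcard));
    }
}
```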

So there needs to be a syntax change here, I think.  Maybe some new command
line switch to either explicitly enable or disable 'wildcard-suffix'
support?  Probably defaulting to wildcard-suffix being on, (better
experience going forward), though off is more backwards compatible.


3. Again, examples of how to store ACLs for specific resources that *end* with
an asterisk and wildcard-suffix ACLs, with any escaping would really help.



On 18 May 2018 at 13:55, Andy Coates  wrote:

> Hey Piyush,
>
> Thanks for getting this in! :D
>
> About to read now. But just quickly...
>
> 1. I'll read up on the need for getMatchingAcls - but just playing devils
> advocate for a moment - if a current caller of getAcls() expects it to
> return the full set of ACLs for a given resource, would, post this change,
> returning only a subset and requiring them to call getMatchingAcls to
> get the full set not itself be a break in compatibility? I'm thinking about
> any tooling / UI / etc people may have built on top of this.  If I'm missing
> the point, then maybe a concrete example, (if you've not already added one
> to the doc), may help.
>
> 2. Something must change on the command line, surely? e.g. as command line
> user how would the command differ if I wanted to add an ACL onto a group
> called 'foo*' as opposed to all groups starting with 'foo'?
>
> 3. Thinking this through, I actually backtracked - I don't think it will
> work due to path collisions, even with escaping - as the escaped version
> can still collide.
>
> Off to read the doc now.
>
> On 18 May 2018 at 13:33, Piyush Vijay  wrote:
>
>> Ready to review. Let me know if something looks missing or not clear.
>>
>> Thanks
>>
>>
>> Piyush Vijay
>>
>> On Fri, May 18, 2018 at 12:54 PM, Piyush Vijay 
>> wrote:
>>
>> > Andy,
>> >
>> > 1. Updated the KIP about need for getMatchingAcls(). Basically, it's
>> > required to add an inspection method without breaking compatibility.
>> > getAcls() only looks at a single location.
>> >
>> > 2. Nothing will change from user's perspective. They will add / delete /
>> > list ACLs as before.
>> >
>> > 3. Good point about moving everything to a v2 path. We might still have
>> to
>> > support snowflakes but I will think this through better.
>> >
>> > I'm giving it a final read. I'll update here once I think it's ready.
>> >
>> > Thanks
>> >
>> >
>> > Piyush Vijay
>> >
>> > On Fri, May 18, 2018 at 12:18 PM, Piyush Vijay 
>> > wrote:
>> >
>> >> On it, Andy. It should be out in next 30 mins.
>> >>
>> >> On Fri, May 18, 2018 at 12:17 PM Andy Coates 
>> wrote:
>> >>
>> >>> Hey Piyush,
>>

Re: [DISCUSS] KIP-290: Support for wildcard suffixed ACLs

2018-05-18 Thread Andy Coates
Hey Piyush,

Thanks for getting this in! :D

About to read now. But just quickly...

1. I'll read up on the need for getMatchingAcls - but just playing devils
advocate for a moment - if a current caller of getAcls() expects it to
return the full set of ACLs for a given resource, would, post this change,
returning only a subset and requiring them to call getMatchingAcls to
get the full set not itself be a break in compatibility? I'm thinking about
any tooling / UI / etc people may have built on top of this.  If I'm missing
the point, then maybe a concrete example, (if you've not already added one
to the doc), may help.

2. Something must change on the command line, surely? e.g. as command line
user how would the command differ if I wanted to add an ACL onto a group
called 'foo*' as opposed to all groups starting with 'foo'?

3. Thinking this through, I actually backtracked - I don't think it will
work due to path collisions, even with escaping - as the escaped version
can still collide.

Off to read the doc now.

On 18 May 2018 at 13:33, Piyush Vijay  wrote:

> Ready to review. Let me know if something looks missing or not clear.
>
> Thanks
>
>
> Piyush Vijay
>
> On Fri, May 18, 2018 at 12:54 PM, Piyush Vijay 
> wrote:
>
> > Andy,
> >
> > 1. Updated the KIP about need for getMatchingAcls(). Basically, it's
> > required to add an inspection method without breaking compatibility.
> > getAcls() only looks at a single location.
> >
> > 2. Nothing will change from user's perspective. They will add / delete /
> > list ACLs as before.
> >
> > 3. Good point about moving everything to a v2 path. We might still have
> to
> > support snowflakes but I will think this through better.
> >
> > I'm giving it a final read. I'll update here once I think it's ready.
> >
> > Thanks
> >
> >
> > Piyush Vijay
> >
> > On Fri, May 18, 2018 at 12:18 PM, Piyush Vijay 
> > wrote:
> >
> >> On it, Andy. It should be out in next 30 mins.
> >>
> >> On Fri, May 18, 2018 at 12:17 PM Andy Coates  wrote:
> >>
> >>> Hey Piyush,
> >>>
> >>> How are you getting on updating the KIP? Can we offer any support?
> We're
> >>> starting to fly really close to the required 72 hours cut off for KIPs.
> >>> This doesn't leave much room for resolving any issues any committers
> >>> find.
> >>> Also, we now require at least three committers to review this KIP today
> >>> _and_ find no issues if we're to get this KIP accepted.
> >>>
> >>> Thanks,
> >>>
> >>> Andy
> >>>
> >>> On 18 May 2018 at 01:21, Piyush Vijay  wrote:
> >>>
> >>> > Hi Andy,
> >>> >
> >>> > I still have some minor changes left to the KIP. I'll make them in
> the
> >>> > morning. I'm sorry I got caught up in some other things today. But
> that
> >>> > would still give us 72 hours before the deadline :)
> >>> >
> >>> > Thanks
> >>> >
> >>> >
> >>> > Piyush Vijay
> >>> >
> >>> > On Thu, May 17, 2018 at 1:27 PM, Andy Coates 
> >>> wrote:
> >>> >
> >>> > > Hey Piyush - my bad. Sorry.
> >>> > >
> >>> > > On 17 May 2018 at 13:23, Piyush Vijay 
> >>> wrote:
> >>> > >
> >>> > > > It's still not complete. I'll drop a message here when I'm done
> >>> with
> >>> > the
> >>> > > > updates.
> >>> > > >
> >>> > > > Thanks
> >>> > > >
> >>> > > >
> >>> > > > Piyush Vijay
> >>> > > >
> >>> > > > On Thu, May 17, 2018 at 12:04 PM, Andy Coates  >
> >>> > wrote:
> >>> > > >
> >>> > > > > Thanks for the update to the KIP Piyush!
> >>> > > > >
> >>> > > > > Reading it through again, I've a couple of questions:
> >>> > > > >
> >>> > > > > 1. Why is there a need for a new 'getMatchingAcls' method, over
> >>> the
> >>> > > > > existing getAcls method? They both take a Resource instance and
> >>> > return
> >>> > > a
> >>> > > > > set of Acls. What is the difference in

Re: [DISCUSS] KIP-290: Support for wildcard suffixed ACLs

2018-05-18 Thread Andy Coates
Hey Piyush,

How are you getting on updating the KIP? Can we offer any support? We're
starting to fly really close to the required 72 hours cut off for KIPs.
This doesn't leave much room for resolving any issues any committers find.
Also, we now require at least three committers to review this KIP today
_and_ find no issues if we're to get this KIP accepted.

Thanks,

Andy

On 18 May 2018 at 01:21, Piyush Vijay  wrote:

> Hi Andy,
>
> I still have some minor changes left to the KIP. I'll make them in the
> morning. I'm sorry I got caught up in some other things today. But that
> would still give us 72 hours before the deadline :)
>
> Thanks
>
>
> Piyush Vijay
>
> On Thu, May 17, 2018 at 1:27 PM, Andy Coates  wrote:
>
> > Hey Piyush - my bad. Sorry.
> >
> > On 17 May 2018 at 13:23, Piyush Vijay  wrote:
> >
> > > It's still not complete. I'll drop a message here when I'm done with
> the
> > > updates.
> > >
> > > Thanks
> > >
> > >
> > > Piyush Vijay
> > >
> > > On Thu, May 17, 2018 at 12:04 PM, Andy Coates 
> wrote:
> > >
> > > > Thanks for the update to the KIP Piyush!
> > > >
> > > > Reading it through again, I've a couple of questions:
> > > >
> > > > 1. Why is there a need for a new 'getMatchingAcls' method, over the
> > > > existing getAcls method? They both take a Resource instance and
> return
> > a
> > > > set of Acls. What is the difference in their behaviour?
> > > > 2. It's not clear to me from the KIP alone what will change, from a
> > user's
> > > > perspective, on how they add / list / delete ACLs.  I'm assuming this
> > > won't
> > > > change.
> > > > 3. Writing ACLs to a new location to get around the issues of
> embedded
> > > > wildcards in existing group ACLs makes sense to me - but just a
> > thought,
> > > > will we be writing all new ACLs under this new path, or just those
> that
> > > are
> > > > partial wildcards?  I'm assuming it's the latter, but it could just be
> > > 'all'
> > > > right? As we could escape illegal chars.  So we could just make this
> > new
> > > > path 'v2' rather than wildcard.
> > > >
> > > > Andy
> > > >
> > > > On 17 May 2018 at 09:32, Colin McCabe  wrote:
> > > >
> > > > > On Thu, May 17, 2018, at 09:28, Piyush Vijay wrote:
> > > > > > I was planning to do that.
> > > > > >
> > > > > > Another unrelated detail is the presence of the support for ‘*’
> ACL
> > > > > > currently. Looks like we’ll have to keep supporting this as a
> > special
> > > > > case,
> > > > > > even when using a different location for wildcard-suffix ACLs
> on
> > > Zk.
> > > > >
> > > > > +1.
> > > > >
> > > > > Thanks, Piyush.
> > > > >
> > > > > Colin
> > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Thu, May 17, 2018 at 9:15 AM Colin McCabe  >
> > > > wrote:
> > > > > >
> > > > > > > Thanks, Piyush.  +1 for starting the vote soon.
> > > > > > >
> > > > > > > Can you please also add a discussion about escaping?  For
> > example,
> > > > > earlier
> > > > > > > we discussed using backslashes to escape special characters.
> So
> > > that
> > > > > users
> > > > > > > can create an ACL referring to a literal "foo*" group by
> creating
> > > an
> > > > > ACL
> > > > > > > for "foo\*"  Similarly, you can get a literal backslash with
> > "\\".
> > > > > This is
> > > > > > > the standard UNIX escaping mechanism.
> > > > > > >
> > > > > > > Also, for the section that says "Changes to AdminClient (needs
> > > > > > > discussion)", we need a new method that will allow users to
> > escape
> > > > > consumer
> > > > > > > group names and other names.  So you can feed this method your
> > > > "foo\*"
> > > > > > > consumer group name, and it will gi

Re: [DISCUSS] KIP-290: Support for wildcard suffixed ACLs

2018-05-17 Thread Andy Coates
Hey Piyush - my bad. Sorry.

On 17 May 2018 at 13:23, Piyush Vijay  wrote:

> It's still not complete. I'll drop a message here when I'm done with the
> updates.
>
> Thanks
>
>
> Piyush Vijay
>
> On Thu, May 17, 2018 at 12:04 PM, Andy Coates  wrote:
>
> > Thanks for the update to the KIP Piyush!
> >
> > Reading it through again, I've a couple of questions:
> >
> > 1. Why is there a need for a new 'getMatchingAcls' method, over the
> > existing getAcls method? They both take a Resource instance and return a
> > set of Acls. What is the difference in their behaviour?
> > 2. It's not clear to me from the KIP alone what will change, from a user's
> > perspective, on how they add / list / delete ACLs.  I'm assuming this
> won't
> > change.
> > 3. Writing ACLs to a new location to get around the issues of embedded
> > wildcards in existing group ACLs makes sense to me - but just a thought,
> > will we be writing all new ACLs under this new path, or just those that
> are
> > partial wildcards?  I'm assuming it's the latter, but it could just be
> 'all'
> > right? As we could escape illegal chars.  So we could just make this new
> > path 'v2' rather than wildcard.
> >
> > Andy
> >
> > On 17 May 2018 at 09:32, Colin McCabe  wrote:
> >
> > > On Thu, May 17, 2018, at 09:28, Piyush Vijay wrote:
> > > > I was planning to do that.
> > > >
> > > > Another unrelated detail is the presence of the support for ‘*’ ACL
> > > > currently. Looks like we’ll have to keep supporting this as a special
> > > case,
> > > > even when using a different location for wildcard-suffix ACLs on
> Zk.
> > >
> > > +1.
> > >
> > > Thanks, Piyush.
> > >
> > > Colin
> > >
> > > >
> > > >
> > > >
> > > > On Thu, May 17, 2018 at 9:15 AM Colin McCabe 
> > wrote:
> > > >
> > > > > Thanks, Piyush.  +1 for starting the vote soon.
> > > > >
> > > > > Can you please also add a discussion about escaping?  For example,
> > > earlier
> > > > > we discussed using backslashes to escape special characters.  So
> that
> > > users
> > > > > can create an ACL referring to a literal "foo*" group by creating
> an
> > > ACL
> > > > > for "foo\*"  Similarly, you can get a literal backslash with "\\".
> > > This is
> > > > > the standard UNIX escaping mechanism.
> > > > >
> > > > > Also, for the section that says "Changes to AdminClient (needs
> > > > > discussion)", we need a new method that will allow users to escape
> > > consumer
> > > > > group names and other names.  So you can feed this method your
> > "foo\*"
> > > > > consumer group name, and it will give you "foo\\\*", which is what
> > you
> > > > > would need to use to create an ACL for this consumer group in
> > > AdminClient.
> > > > > I think that's the only change we need to admin client
> > > > >
> > > > > regards,
> > > > > Colin
> > > > >
> > > > >
> > > > > On Thu, May 17, 2018, at 08:55, Piyush Vijay wrote:
> > > > > > Hi Rajini/Colin,
> > > > > >
> > > > > > I will remove the wildcard principals from the scope for now,
> > > updating
> > > > > KIP
> > > > > > right now and will open it for vote.
> > > > > >
> > > > > > Thanks
> > > > > >
> > > > > >
> > > > > > Piyush Vijay
> > > > > >
> > > > > > On Thu, May 17, 2018 at 6:59 AM, Rajini Sivaram <
> > > rajinisiva...@gmail.com
> > > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Hi Piyush,
> > > > > > >
> > > > > > > I have added a PR (https://github.com/apache/kafka/pull/5030)
> > with
> > > > > tests
> > > > > > > to
> > > > > > > show how group principals can be used for authorization with
> > custom
> > > > > > > principal builders. One of the tests uses SASL. It is not quite
> > the
> > > > > same as
> > > > &g

Re: [DISCUSS] KIP-290: Support for wildcard suffixed ACLs

2018-05-17 Thread Andy Coates
> > support+for+SASL).
> > > > > > If
> > > > > > > > that satisfies your requirements, perhaps we can move
> wildcarded
> > > > > > principals
> > > > > > > > out of this KIP and focus on wildcarded resources?
> > > > > >
> > > > > > +1.
> > > > > >
> > > > > > We also need to determine which characters will be reserved for
> the
> > > > > > future.  I think previously we thought about @, #, $, %, ^, &, *.
> > > > > >
> > > > > > > > On Tue, May 15, 2018 at 7:15 PM, Piyush Vijay <
> > > > > piyushvij...@gmail.com>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > >> Hi Colin,
> > > > > > > >>
> > > > > > > >> Escaping at this level is making sense to me but let me
> think
> > > more
> > > > > > and get
> > > > > > > >> back to you.
> > > > > >
> > > > > > Thanks, Piyush.  What questions do you think are still open
> regarding
> > > > > > escape characters?
> > > > > > As Rajini mentioned, we have to get this in soon in order to make
> > > the KIP
> > > > > > freeze.
> > > > > >
> > > > > > > >>
> > > > > > > >> But should we not just get rid of one of AclBinding or
> > > > > > AclBindingFilter
> > > > > > > >> then? Is there a reason to keep both given that
> > > AclBindingFilter and
> > > > > > > >> AclBinding look exact copy of each other after this change?
> This
> > > > > will
> > > > > > be a
> > > > > > > >> one-time breaking change in APIs marked as "Evolving", but
> makes
> > > > > > sense in
> > > > > > > >> the long term? Am I missing something here?
> > > > > >
> > > > > > AclBinding represents an ACL.  AclBindingFilter is a filter which
> > > can be
> > > > > > used to locate AclBinding objects.  Similarly with Resource and
> > > > > > ResourceFilter.  There is no reason to combine them because they
> > > > > represent
> > > > > > different things.  Although they contain many of the same fields,
> > > they
> > > > > are
> > > > > > not exact copies.  Many fields can be null in AclBindingFilter--
> > > fields
> > > > > can
> > > > > > never be null in AclBinding.
> > > > > >
> > > > > > For example, you can have an AclBindingFilter that matches every
> > > > > > AclBinding.  There is more discussion of this on the original KIP
> > > that
> > > > > > added ACL support to AdminClient.
> > > > > >
> > > > > > best,
> > > > > > Colin
> > > > > >
> > > > > > > >>
> > > > > > > >>
> > > > > > > >>
> > > > > > > >> Piyush Vijay
> > > > > > > >>
> > > > > > > >> On Tue, May 15, 2018 at 9:01 AM, Colin McCabe <
> > > cmcc...@apache.org>
> > > > > > wrote:
> > > > > > > >>
> > > > > > > >> > Hi Piyush,
> > > > > > > >> >
> > > > > > > >> > I think AclBinding should operate the same way as
> > > > > AclBindingFilter.
> > > > > > > >> >
> > > > > > > >> > So you should be able to do something like this:
> > > > > > > >> > > AclBindingFilter filter = new AclBindingFiler(new
> > > > > > > >> > ResourceFilter(ResourceType.GROUP, "foo*"))
> > > > > > > >> > > AclBinding binding = new AclBinding(new
> > > > > > Resource(ResourceType.GROUP,
> > > > > > > >> > "foo*"))
> > > > > > > >> > > assertTrue(filter.matches(binding));
> > > > > > > >> >
> > > > > > > >> > Thinking about this more, it's starting to feel really
>

Re: [VOTE] KIP-292: Add KTable#transformValues() method in Kafka Streams DSL

2018-05-14 Thread Andy Coates
Hey all,

With 3 binding votes and 1 non-binding, this KIP has been accepted.

Thanks for everyones input and time.

Andy



On 14 May 2018 at 09:42, Guozhang Wang  wrote:

> Thanks Andy. That's fine, we can continue the vote thread.
>
> On Mon, May 14, 2018 at 9:25 AM, Andy Coates  wrote:
>
> > Hey All,
> >
> > I've updated the KIP to also cover the Scala API for completeness. (The
> PR
> > already covered this).
> >
> > Does this require another vote?
> >
> > On 11 May 2018 at 14:29, Matthias J. Sax  wrote:
> >
> > > +1 (binding)
> > >
> > >
> > > On 5/11/18 2:17 PM, Ted Yu wrote:
> > > > +1
> > > >
> > > > On Fri, May 11, 2018 at 2:16 PM, Guozhang Wang 
> > > wrote:
> > > >
> > > >> +1 (binding)
> > > >>
> > > >> Also slightly changed the title beyond changing [DISCUSS] to [VOTE]
> > > since
> > > >> Gmail will collapse these two threads into one, so people may not
> > > realize
> > > >> there is a voting started already.
> > > >>
> > > >>
> > > >> Guozhang
> > > >>
> > > >>
> > > >> On Fri, May 11, 2018 at 2:07 PM, Andy Coates 
> > wrote:
> > > >>
> > > >>> Hi all,
> > > >>>
> > > >>> I would like to start the vote on KIP-292: Add transformValues()
> > method
> > > >> to
> > > >>> KTable
> > > >>>
> > > >>> The link to this KIP is here:
> > > >>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > >>> 292%3A+Add+transformValues%28%29+method+to+KTable
> > > >>>
> > > >>> The discussion thread is here:
> > > >>> http://mail-archives.apache.org/mod_mbox/kafka-dev/201805.mbox/%
> > > >>> 3CCAO1RfpHyhEEwkwad5yxd0rLgPEK-PJ4y0%2B0y%3DwidHP0uOtToJQ%
> > > >>> 40mail.gmail.com%3E
> > > >>>
> > > >>> Thanks,
> > > >>>
> > > >>> Andy
> > > >>>
> > > >>
> > > >>
> > > >>
> > > >> --
> > > >> -- Guozhang
> > > >>
> > > >
> > >
> > >
> >
>
>
>
> --
> -- Guozhang
>


Re: [DISCUSS] KIP-290: Support for wildcard suffixed ACLs

2018-05-14 Thread Andy Coates
+1

On 11 May 2018 at 17:14, Colin McCabe  wrote:

> Hi Andy,
>
> I see what you mean.  I guess my thought here is that if the fields are
> private, we can change it later if we need to.
>
> I definitely agree that we should use the scheme you describe for sending
> ACLs over the wire (just the string + version number)
>
> cheers,
> Colin
>
>
> On Fri, May 11, 2018, at 09:39, Andy Coates wrote:
> > I think I'm agreeing with you. I was merely suggesting that having an
> > additional field that controls how the current field is interpreted is
> more
> > flexible / extensible in the future than using a 'union' style approach,
> > where only one of several possible fields should be populated. But it's a
> > minor thing.
> >
> >
> >
> >
> >
> >
> > On 10 May 2018 at 09:29, Colin McCabe  wrote:
> >
> > > Hi Andy,
> > >
> > > The issue that I was trying to solve here is the Java API.  Right now,
> > > someone can write "new ResourceFilter(ResourceType.TRANSACTIONAL_ID,
> > > "foo*") and have a ResourceFilter that applies to a Transactional ID
> named
> > > "foo*".  This has to continue to work, or else we have broken
> compatibility.
> > >
> > > I was proposing that there would be something like a new function like
> > > ResourceFilter.fromPattern(ResourceType.TRANSACTIONAL_ID, "foo*")
> which
> > > would create a ResourceFilter that applied to transactional IDs
> starting
> > > with "foo", rather than transactional IDs named "foo*" specifically.
> > >
> > > I don't think it's important whether the Java class has an integer, an
> > > enum, or two string fields.  The important thing is that there's a new
> > > static function, or new constructor overload, etc. that works for
> patterns
> > > rather than literal strings.
> > >
> > > On Thu, May 10, 2018, at 03:30, Andy Coates wrote:
> > > > Rather than having name and pattern fields on the ResourceFilter,
> where
> > > > it’s only valid for one to be set, and we want to restrict the
> character
> > > > set in case future enhancements need them, we could instead add a new
> > > > integer ‘nameType’ field, and use constants to indicate how the name
> > > > field should be interpreted, e.g. 0 = literal, 1 = wildcard. This
> would
> > > > be extendable, e.g we can later add 2 = regex, or what ever, and
> > > > wouldn’t require any escaping.
> > >
> > > This is very user-unfriendly, though.  Users don't want to have to
> > > explicitly supply a version number when using the API, which is what
> this
> > > would force them to do.  I don't think users are going to want to
> memorize
> > > that version 4 supported "+", whereas version 3 only supported "[0-9]",
> or
> > > whatever.
> > >
> > > Just as an example, do you remember which versions of FetchRequest
> added
> > > which features?  I don't.  I always have to look at the code to
> remember.
> > >
> > > Also, escaping is still required any time you overload a character to
> mean
> > > two things.  Escaping is required in the current proposal to be able to
> > > create a pattern that matches only "foo*".  You have to type "foo\*".
> It
> > > would be required if we forced users to specify a version, as well.
> > >
> > > best,
> > > Colin
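The literal-versus-prefix matching and escaping semantics discussed above can be illustrated with a minimal, hypothetical sketch. The class and method names below are not Kafka's API; the sketch just shows the scheme: a trailing unescaped `*` turns the filter into a prefix pattern, while an escaped `\*` matches a literal asterisk.

```java
// Hypothetical sketch of the matching semantics discussed above; not Kafka's API.
// A trailing unescaped '*' denotes a prefix pattern; "\*" is a literal '*'.
public final class AclNameMatcher {

    public static boolean matches(String pattern, String resourceName) {
        StringBuilder literal = new StringBuilder();
        for (int i = 0; i < pattern.length(); i++) {
            char c = pattern.charAt(i);
            if (c == '\\' && i + 1 < pattern.length()) {
                // Escaped character: always taken literally.
                literal.append(pattern.charAt(++i));
            } else if (c == '*' && i == pattern.length() - 1) {
                // Unescaped trailing '*': prefix match on everything before it.
                return resourceName.startsWith(literal.toString());
            } else {
                literal.append(c);
            }
        }
        return resourceName.equals(literal.toString());
    }

    public static void main(String[] args) {
        System.out.println(matches("foo*", "foobar"));   // true: prefix pattern
        System.out.println(matches("foo\\*", "foobar")); // false: literal "foo*"
        System.out.println(matches("foo\\*", "foo*"));   // true
    }
}
```

This is why escaping cannot be avoided: without `\*` there would be no way to write an ACL for a resource literally named `foo*`.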
> > >
> > > >
> > > > Sent from my iPhone
> > > >
> > > > > On 7 May 2018, at 05:16, Piyush Vijay 
> wrote:
> > > > >
> > > > > Makes sense. I'll update the KIP.
> > > > >
> > > > > Does anyone have any other comments? :)
> > > > >
> > > > > Thanks
> > > > >
> > > > >
> > > > > Piyush Vijay
> > > > >
> > > > >> On Thu, May 3, 2018 at 11:55 AM, Colin McCabe  >
> > > wrote:
> > > > >>
> > > > >> Yeah, I guess that's a good point.  It probably makes sense to
> > > support the
> > > > >> prefix scheme for consumer groups and transactional IDs as well as
> > > topics.
> > > > >>
> > > > >> I agree that the current situation where anything goes in consumer
> > > group
> > > > >> names and transactional ID names is 

Re: [VOTE] KIP-292: Add KTable#transformValues() method in Kafka Streams DSL

2018-05-14 Thread Andy Coates
Hey All,

I've updated the KIP to also cover the Scala API for completeness. (The PR
already covered this).

Does this require another vote?

On 11 May 2018 at 14:29, Matthias J. Sax  wrote:

> +1 (binding)
>
>
> On 5/11/18 2:17 PM, Ted Yu wrote:
> > +1
> >
> > On Fri, May 11, 2018 at 2:16 PM, Guozhang Wang 
> wrote:
> >
> >> +1 (binding)
> >>
> >> Also slightly changed the title beyond changing [DISCUSS] to [VOTE]
> since
> >> Gmail will collapse these two threads into one, so people may not
> realize
> >> there is a voting started already.
> >>
> >>
> >> Guozhang
> >>
> >>
> >> On Fri, May 11, 2018 at 2:07 PM, Andy Coates  wrote:
> >>
> >>> Hi all,
> >>>
> >>> I would like to start the vote on KIP-292: Add transformValues() method
> >> to
> >>> KTable
> >>>
> >>> The link to this KIP is here:
> >>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> >>> 292%3A+Add+transformValues%28%29+method+to+KTable
> >>>
> >>> The discussion thread is here:
> >>> http://mail-archives.apache.org/mod_mbox/kafka-dev/201805.mbox/%
> >>> 3CCAO1RfpHyhEEwkwad5yxd0rLgPEK-PJ4y0%2B0y%3DwidHP0uOtToJQ%
> >>> 40mail.gmail.com%3E
> >>>
> >>> Thanks,
> >>>
> >>> Andy
> >>>
> >>
> >>
> >>
> >> --
> >> -- Guozhang
> >>
> >
>
>


[VOTE] KIP-292: Add transformValues() method to KTable

2018-05-11 Thread Andy Coates
Hi all,

I would like to start the vote on KIP-292: Add transformValues() method to
KTable

The link to this KIP is here:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-292%3A+Add+transformValues%28%29+method+to+KTable

The discussion thread is here:
http://mail-archives.apache.org/mod_mbox/kafka-dev/201805.mbox/%3CCAO1RfpHyhEEwkwad5yxd0rLgPEK-PJ4y0%2B0y%3DwidHP0uOtToJQ%40mail.gmail.com%3E

Thanks,

Andy


Re: [DISCUSS] KIP-292: Add transformValues() method to KTable

2018-05-11 Thread Andy Coates
Awesome. Thanks to you both - KIP updated appropriately.

On 11 May 2018 at 11:52, Matthias J. Sax  wrote:

> Sounds good to me to do it in a different KIP (if at all---not convinced
> yet we need this restriction).
>
>
> -Matthias
>
>
>
> On 5/11/18 11:36 AM, Guozhang Wang wrote:
> > Matthias' point is valid: today we are not effectively preventing
> > users from programmatically accessing any materialized state store and
> > mutating it
> > on the fly anyway. This is a caveat that we should improve since, as Andy
> > asked, the implications of mutating a materialized store are "undefined"
> as
> > of today. But thinking about it a bit more I think we do not need to fix
> in
> > this KIP, but rather in a different PR, to consider the following:
> >
> > 1) ProcessorContext#getStateStore(): should we return the internal
> store,
> > even if user provided the right internal store name?
> >
> > 2) For stores used in materialization, should we allow users to write to
> > them?
> >
> >
> > Guozhang
> >
> >
> > On Fri, May 11, 2018 at 11:17 AM, Andy Coates  wrote:
> >
> >> I'm no expert, so happy to go with whatever is decided.
> >>
> >> Implementing it so that a call to getStateStore(queryableStoreName)
> throws
> >> an exception is trivial enough to do, though I can see your point that
> this
> >> might be better done in a more consistent / single place, rather than
> >> peppered through the code.  I'm not sure there are any other places in
> the
> >> API that take both a Materialized instance and a set of state store
> names,
> >> so maybe this is a special/first case?
> >>
> >> I guess my main question would be: what are the implications if a user
> did
> >> mutate the underlying state store within the transformer itself? Or to
> put
> >> it another way, would allowing access to the state store result in weird
> >> and wonderful bugs, and become a common source of bugs and another
> 'thing'
> >> developers need to know not to do, or would it 'just work'?  If it's the
> >> former, then I think that's a strong case to have the code explicitly
> stop
> >> users doing dumb stuff.
> >>
> >> On 11 May 2018 at 11:00, Matthias J. Sax  wrote:
> >>
> >>> I don't think it's worth complicating things to restrict access to
> >>> the store that is created by the library via `Materialized` parameter.
> >>> The reason is, that we don't prevent this from happening anyway atm --
> >>> users can always create an arbitrary processor and connect a KTable
> >>> state with the processor and mess around with the KTable state. Of
> >>> course, they should not do this, but it's possible after all and I
> think
> >>> it's a user error if they do.
> >>>
> >>> Thus, if we really want to restrict the access to internal
> >>> created/managed stores, we should do it consistently -- but it's
> unclear
> >>> to me how to do this: note that Processor API and DSL should be
> >>> separated (they are still not as clearly as they should, but we put
> some
> >>> effort into this the last year). Also, accessing the `context` is a
> PAPI
> >>> level operation. If we want to restrict the access to stores there, we
> >>> need to introduce two "categories" of stores -- PAPI stores and DSL
> >>> stores. This would be a step backwards IMHO and mangle DSL and PAPI
> >>> together in an undesired way.
> >>>
> >>> Instead, we should clearly communicate to the users, that if they use
> >>> `Materialized` they should never write to this store as it's managed by
> >>> the library (reading is ok).
> >>>
> >>> WDYT?
> >>>
> >>> -Matthias
> >>>
> >>> On 5/11/18 10:50 AM, Guozhang Wang wrote:
> >>>> Yes I was indeed thinking about not allowing `getStateStore()` with
> >>> the name
> >>>> specified in Materialized. I understand it is a bit too restrictive,
> >> but
> >>> I
> >>>> cannot think of an elegant way to work around the following:
> >>>>
> >>>> 1) to programmatically enforce that the resulting store is read-only.
> >>>>
> >>>> 2) With Materialized, the store name may not be specified by the user
> >> and
> >>>> hence it will be created inter

Re: [DISCUSS] KIP-292: Add transformValues() method to KTable

2018-05-11 Thread Andy Coates
I'm no expert, so happy to go with whatever is decided.

Implementing it so that a call to getStateStore(queryableStoreName) throws
an exception is trivial enough to do, though I can see your point that this
might be better done in a more consistent / single place, rather than
peppered through the code.  I'm not sure there are any other places in the
API that take both a Materialized instance and a set of state store names,
so maybe this is a special/first case?

I guess my main question would be: what are the implications if a user did
mutate the underlying state store within the transformer itself? Or to put
it another way, would allowing access to the state store result in weird
and wonderful bugs, and become a common source of bugs and another 'thing'
developers need to know not to do, or would it 'just work'?  If it's the
former, then I think that's a strong case to have the code explicitly stop
users doing dumb stuff.

On 11 May 2018 at 11:00, Matthias J. Sax  wrote:

> I don't think it's worth complicating things to restrict access to
> the store that is created by the library via the `Materialized` parameter.
> The reason is that we don't prevent this from happening anyway atm --
> users can always create an arbitrary processor and connect a KTable
> state with the processor and mess around with the KTable state. Of
> course, they should not do this, but it's possible after all and I think
> it's a user error if they do.
>
> Thus, if we really want to restrict access to internally
> created/managed stores, we should do it consistently -- but it's unclear
> to me how to do this: note that the Processor API and DSL should be
> separated (they are still not as clearly separated as they should be, but
> we put some effort into this over the last year). Also, accessing the
> `context` is a PAPI
> level operation. If we want to restrict the access to stores there, we
> need to introduce two "categories" of stores -- PAPI stores and DSL
> stores. This would be a step backwards IMHO and mangle DSL and PAPI
> together in an undesired way.
>
> Instead, we should clearly communicate to the users, that if they use
> `Materialized` they should never write to this store as it's managed by
> the library (reading is ok).
>
> WDYT?
>
> -Matthias
>
> On 5/11/18 10:50 AM, Guozhang Wang wrote:
> > Yes I was indeed thinking about not allowing `getStateStore()` with
> the name
> > specified in Materialized. I understand it is a bit too restrictive, but
> I
> > cannot think of an elegant way to work around the following:
> >
> > 1) to programmatically enforce that the resulting store is read-only.
> >
> > 2) With Materialized, the store name may not be specified by the user and
> > hence it will be created internally; what would happen if users call
> > `getStateStore()` with the correct internal store name? If not the
> > semantics is a bit complex, if yes we are breaking the protocol to not
> > expose  internal store.
> >
> >
> > Guozhang
> >
> > On Fri, May 11, 2018 at 10:42 AM, Andy Coates  wrote:
> >
> >> Just a thought - but on the subject of disallowing access to the
> >> materialized state store from within the transformer's init method...
> might
> >> this not be overly restrictive? Could there be valid uses where
> read-only
> >> access would be useful / valid.
> >>
> >> On 11 May 2018 at 10:35, Andy Coates  wrote:
> >>
> >>> OK, KIP updated:
> >>>  - added overloads taking `Materialized`
> >>>  - dropped overloads taking `ValueTransformerSupplier` in favour of the
> >>> `withKey` variants.
> >>>  - added more info around the limitations of the ProcessorContext
> passed
> >>> in to the transformer's init calls, i.e. no forward calls allowed or
> >> calls
> >>> to getStateStore where the store name matches the materialized result
> of
> >>> the call.
> >>>
> >>> I'll sort out the PR next.
> >>>
> >>> On 11 May 2018 at 10:26, Damian Guy  wrote:
> >>>
> >>>> I'm a +1 for Guozhang's suggestion
> >>>>
> >>>> On Fri, 11 May 2018 at 10:20 Andy Coates  wrote:
> >>>>
> >>>>> Makes sense to me.  What do others think?
> >>>>>
> >>>>> On 11 May 2018 at 10:13, Guozhang Wang  wrote:
> >>>>>
> >>>>>> Hi folks,
> >>>>>>
> >>>>>> While looking into the overloaded functions, I'm wondering if we can
> >>>> save
> >

Re: [DISCUSS] KIP-292: Add transformValues() method to KTable

2018-05-11 Thread Andy Coates
Thinking that through more - I guess if the user wanted the output to also
be fed back in as the input on future computations, they wouldn't use the
transformValues() overload with Materialized, but rather create a
materialized store and pass it in via stateStoreNames.

On 11 May 2018 at 10:42, Andy Coates  wrote:

> Just a thought - but on the subject of disallowing access to the
> materialized state store from within the transformer's init method... might
> this not be overly restrictive? Could there be valid uses where read-only
> access would be useful / valid.
>
> On 11 May 2018 at 10:35, Andy Coates  wrote:
>
>> OK, KIP updated:
>>  - added overloads taking `Materialized`
>>  - dropped overloads taking `ValueTransformerSupplier` in favour of the
>> `withKey` variants.
>>  - added more info around the limitations of the ProcessorContext passed
>> in to the transformer's init calls, i.e. no forward calls allowed or calls
>> to getStateStore where the store name matches the materialized result of
>> the call.
>>
>> I'll sort out the PR next.
>>
>> On 11 May 2018 at 10:26, Damian Guy  wrote:
>>
>>> I'm a +1 for Guozhang's suggestion
>>>
>>> On Fri, 11 May 2018 at 10:20 Andy Coates  wrote:
>>>
>>> > Makes sense to me.  What do others think?
>>> >
>>> > On 11 May 2018 at 10:13, Guozhang Wang  wrote:
>>> >
>>> > > Hi folks,
>>> > >
>>> > > While looking into the overloaded functions, I'm wondering if we can
>>> save
>>> > > the transformers without key, i.e. only add two overloaded functions:
>>> > >
>>> > > <VR> KTable<K, VR> transformValues(final ValueTransformerWithKeySupplier<?
>>> > > super K, ? super V, ? extends VR> valueTransformerSupplier,
>>> > >     final String... stateStoreNames);
>>> > >
>>> > > <VR> KTable<K, VR> transformValues(final ValueTransformerWithKeySupplier<?
>>> > > super K, ? super V, ? extends VR> valueTransformerSupplier,
>>> > >     final Materialized<K, VR, KeyValueStore<Bytes, byte[]>> materialized,
>>> > >     final String... stateStoreNames);
>>> > >
>>> > > The reason is that, in KIP-149 we've added the overloaded functions
>>> > > `withKey`, which should be covering the case without key already
>>> because
>>> > if
>>> > > users do not really need the key, they can just take it as a dummy
>>> > > parameter. We did not deprecate the old ones since some of them have
>>> just
>>> > > been added one version back. But if we agree that by the end of the
>>> day
>>> > we
>>> > > would only maintain the overloaded value functions "with key" only,
>>> then
>>> > we
>>> > > should not add the ones without keys any more in new KIPs.
>>> > >
>>> > > WDYT?
>>> > >
>>> > >
>>> > > Guozhang
>>> > >
>>> > >
>>> > > On Fri, May 11, 2018 at 9:42 AM, Andy Coates 
>>> wrote:
>>> > >
>>> > > > Sorry for my lack of response - I've been out of action with a bad
>>> back
>>> > > for
>>> > > > a few days!
>>> > > >
>>> > > > I originally had the `Materialized` overloads added to the API.
>>> I'll
>>> > > update
>>> > > > the KIP / PR with these again. In terms of semantics, as Matthias
>>> > > suggests,
>>> > > > these should be consistent with filter() and mapValues(), etc.
>>> > > >
>>> > > > On 8 May 2018 at 17:59, Guozhang Wang  wrote:
>>> > > >
>>> > > > > To follow on Matthias and Damian's comments here:
>>> > > > >
>>> > > > > If we are going to add the overload functions as
>>> > > > >
>>> > > > > ```
>>> > > > > <VR> KTable<K, VR> transformValues(final
>>> > > > > ValueTransformerSupplier<? super V, ? extends VR> valueTransformerSupplier,
>>> > > > >     final String... stateStoreNames,
>>> > > > > 

Re: [DISCUSS] KIP-292: Add transformValues() method to KTable

2018-05-11 Thread Andy Coates
Just a thought - but on the subject of disallowing access to the
materialized state store from within the transformer's init method... might
this not be overly restrictive? Could there be valid uses where read-only
access would be useful / valid.

On 11 May 2018 at 10:35, Andy Coates  wrote:

> OK, KIP updated:
>  - added overloads taking `Materialized`
>  - dropped overloads taking `ValueTransformerSupplier` in favour of the
> `withKey` variants.
>  - added more info around the limitations of the ProcessorContext passed
> in to the transformer's init calls, i.e. no forward calls allowed or calls
> to getStateStore where the store name matches the materialized result of
> the call.
>
> I'll sort out the PR next.
>
> On 11 May 2018 at 10:26, Damian Guy  wrote:
>
>> I'm a +1 for Guozhang's suggestion
>>
>> On Fri, 11 May 2018 at 10:20 Andy Coates  wrote:
>>
>> > Makes sense to me.  What do others think?
>> >
>> > On 11 May 2018 at 10:13, Guozhang Wang  wrote:
>> >
>> > > Hi folks,
>> > >
>> > > While looking into the overloaded functions, I'm wondering if we can
>> save
>> > > the transformers without key, i.e. only add two overloaded functions:
>> > >
>> > > <VR> KTable<K, VR> transformValues(final ValueTransformerWithKeySupplier<?
>> > > super K, ? super V, ? extends VR> valueTransformerSupplier,
>> > >     final String... stateStoreNames);
>> > >
>> > > <VR> KTable<K, VR> transformValues(final ValueTransformerWithKeySupplier<?
>> > > super K, ? super V, ? extends VR> valueTransformerSupplier,
>> > >     final Materialized<K, VR, KeyValueStore<Bytes, byte[]>> materialized,
>> > >     final String... stateStoreNames);
>> > >
>> > > The reason is that, in KIP-149 we've added the overloaded functions
>> > > `withKey`, which should be covering the case without key already
>> because
>> > if
>> > > users do not really need the key, they can just take it as a dummy
>> > > parameter. We did not deprecate the old ones since some of them have
>> just
>> > > been added one version back. But if we agree that by the end of the
>> day
>> > we
>> > > would only maintain the overloaded value functions "with key" only,
>> then
>> > we
>> > > should not add the ones without keys any more in new KIPs.
>> > >
>> > > WDYT?
>> > >
>> > >
>> > > Guozhang
>> > >
>> > >
>> > > On Fri, May 11, 2018 at 9:42 AM, Andy Coates 
>> wrote:
>> > >
>> > > > Sorry for my lack of response - I've been out of action with a bad
>> back
>> > > for
>> > > > a few days!
>> > > >
>> > > > I originally had the `Materialized` overloads added to the API. I'll
>> > > update
>> > > > the KIP / PR with these again. In terms of semantics, as Matthias
>> > > suggests,
>> > > > these should be consistent with filter() and mapValues(), etc.
>> > > >
>> > > > On 8 May 2018 at 17:59, Guozhang Wang  wrote:
>> > > >
>> > > > > To follow on Matthias and Damian's comments here:
>> > > > >
>> > > > > If we are going to add the overload functions as
>> > > > >
>> > > > > ```
>> > > > > <VR> KTable<K, VR> transformValues(final
>> > > > > ValueTransformerSupplier<? super V, ? extends VR> valueTransformerSupplier,
>> > > > >     final String... stateStoreNames,
>> > > > >     final Materialized<K, VR, KeyValueStore<Bytes, byte[]>> materialized);
>> > > > >
>> > > > > <VR> KTable<K, VR> transformValues(final
>> > > > > ValueTransformerWithKeySupplier<? super K, ? super V, ? extends VR> valueTransformerSupplier,
>> > > > >     final String... stateStoreNames,
>> > > > >     final Materialized<K, VR, KeyValueStore<Bytes, byte[]>> materialized);
>> > > > > ```
>> > > > >
>> > > > > Then are we going to still only allow the valueTra

Re: [DISCUSS] KIP-292: Add transformValues() method to KTable

2018-05-11 Thread Andy Coates
OK, KIP updated:
 - added overloads taking `Materialized`
 - dropped overloads taking `ValueTransformerSupplier` in favour of the
`withKey` variants.
 - added more info around the limitations of the ProcessorContext passed in
to the transformer's init calls, i.e. no forward calls allowed or calls to
getStateStore where the store name matches the materialized result of the
call.

I'll sort out the PR next.

On 11 May 2018 at 10:26, Damian Guy  wrote:

> I'm a +1 for Guozhang's suggestion
>
> On Fri, 11 May 2018 at 10:20 Andy Coates  wrote:
>
> > Makes sense to me.  What do others think?
> >
> > On 11 May 2018 at 10:13, Guozhang Wang  wrote:
> >
> > > Hi folks,
> > >
> > > While looking into the overloaded functions, I'm wondering if we can
> save
> > > the transformers without key, i.e. only add two overloaded functions:
> > >
> > > <VR> KTable<K, VR> transformValues(final
> > > ValueTransformerWithKeySupplier<? super K, ? super V, ? extends VR> valueTransformerSupplier,
> > >     final String... stateStoreNames);
> > >
> > > <VR> KTable<K, VR> transformValues(final
> > > ValueTransformerWithKeySupplier<? super K, ? super V, ? extends VR> valueTransformerSupplier,
> > >     final Materialized<K, VR, KeyValueStore<Bytes, byte[]>> materialized,
> > >     final String... stateStoreNames);
> > >
> > > The reason is that, in KIP-149 we've added the overloaded functions
> > > `withKey`, which should be covering the case without key already
> because
> > if
> > > users do not really need the key, they can just take it as a dummy
> > > parameter. We did not deprecate the old ones since some of them have
> just
> > > been added one version back. But if we agree that by the end of the day
> > we
> > > would only maintain the overloaded value functions "with key" only,
> then
> > we
> > > should not add the ones without keys any more in new KIPs.
> > >
> > > WDYT?
> > >
> > >
> > > Guozhang
> > >
> > >
> > > On Fri, May 11, 2018 at 9:42 AM, Andy Coates 
> wrote:
> > >
> > > > Sorry for my lack of response - I've been out of action with a bad
> back
> > > for
> > > > a few days!
> > > >
> > > > I originally had the `Materialized` overloads added to the API. I'll
> > > update
> > > > the KIP / PR with these again. In terms of semantics, as Matthias
> > > suggests,
> > > > these should be consistent with filter() and mapValues(), etc.
> > > >
> > > > On 8 May 2018 at 17:59, Guozhang Wang  wrote:
> > > >
> > > > > To follow on Matthias and Damian's comments here:
> > > > >
> > > > > If we are going to add the overload functions as
> > > > >
> > > > > ```
> > > > > <VR> KTable<K, VR> transformValues(final ValueTransformerSupplier<? super
> > > > > V, ? extends VR> valueTransformerSupplier,
> > > > >     final String... stateStoreNames,
> > > > >     final Materialized<K, VR, KeyValueStore<Bytes, byte[]>> materialized);
> > > > >
> > > > > <VR> KTable<K, VR> transformValues(final
> > > > > ValueTransformerWithKeySupplier<? super K, ? super V, ? extends VR> valueTransformerSupplier,
> > > > >     final String... stateStoreNames,
> > > > >     final Materialized<K, VR, KeyValueStore<Bytes, byte[]>> materialized);
> > > > > ```
> > > > >
> > > > > Then are we going to still only allow the valueTransformer.init() /
> > > > > process() to be able to access N stores, with N stores specified
> with
> > > the
> > > > > stateStoreNames, but not the one specified in materialized.name()?
> > > > > Personally I think it should be the case as the materialized store
> > > should
> > > > > be managed by the Streams library itself, but we should probably
> help
> > > > users
> > > > > to understand if they have some stores used for the same purpose
> > > (storing
> > > > > the value that are going to be sent to the downstream changelog
> > stream
> > > 

Re: [DISCUSS] KIP-292: Add transformValues() method to KTable

2018-05-11 Thread Andy Coates
Makes sense to me.  What do others think?

On 11 May 2018 at 10:13, Guozhang Wang  wrote:

> Hi folks,
>
> While looking into the overloaded functions, I'm wondering if we can save
> the transformers without key, i.e. only add two overloaded functions:
>
> <VR> KTable<K, VR> transformValues(final ValueTransformerWithKeySupplier<? super K, ? super V, ? extends VR> valueTransformerSupplier,
>     final String... stateStoreNames);
>
> <VR> KTable<K, VR> transformValues(final ValueTransformerWithKeySupplier<? super K, ? super V, ? extends VR> valueTransformerSupplier,
>     final Materialized<K, VR, KeyValueStore<Bytes, byte[]>> materialized,
>     final String... stateStoreNames);
>
> The reason is that, in KIP-149 we've added the overloaded functions
> `withKey`, which should be covering the case without key already because if
> users do not really need the key, they can just take it as a dummy
> parameter. We did not deprecate the old ones since some of them have just
> been added one version back. But if we agree that by the end of the day we
> would only maintain the overloaded value functions "with key" only, then we
> should not add the ones without keys any more in new KIPs.
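Guozhang's point above — that the `withKey` variants subsume the keyless ones because the key can always be treated as a dummy parameter — can be sketched with plain functional interfaces. These are illustrative stand-ins, not the Kafka Streams supplier types:

```java
import java.util.function.BiFunction;
import java.util.function.Function;

// Illustrative only: a value-only transformer (Function) can always be adapted
// to a "with key" transformer (BiFunction) by ignoring the key, so the
// with-key overloads cover the keyless use case.
final class WithKeyAdapter {
    static <K, V, VR> BiFunction<K, V, VR> ignoringKey(Function<? super V, ? extends VR> f) {
        return (key, value) -> f.apply(value); // the key is a dummy parameter
    }

    public static void main(String[] args) {
        BiFunction<String, String, Integer> length = ignoringKey(String::length);
        System.out.println(length.apply("some-key", "abc")); // prints 3
    }
}
```

The adaptation is lossless in one direction only, which is why only the with-key overloads need to be added to the API.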
>
> WDYT?
>
>
> Guozhang
>
>
> On Fri, May 11, 2018 at 9:42 AM, Andy Coates  wrote:
>
> > Sorry for my lack of response - I've been out of action with a bad back
> for
> > a few days!
> >
> > I originally had the `Materialized` overloads added to the API. I'll
> update
> > the KIP / PR with these again. In terms of semantics, as Matthias
> suggests,
> > these should be consistent with filter() and mapValues(), etc.
> >
> > On 8 May 2018 at 17:59, Guozhang Wang  wrote:
> >
> > > To follow on Matthias and Damian's comments here:
> > >
> > > If we are going to add the overload functions as
> > >
> > > ```
> > > <VR> KTable<K, VR> transformValues(final ValueTransformerSupplier<? super
> > > V, ? extends VR> valueTransformerSupplier,
> > >     final String... stateStoreNames,
> > >     final Materialized<K, VR, KeyValueStore<Bytes, byte[]>> materialized);
> > >
> > > <VR> KTable<K, VR> transformValues(final
> > > ValueTransformerWithKeySupplier<? super K, ? super V, ? extends VR> valueTransformerSupplier,
> > >     final String... stateStoreNames,
> > >     final Materialized<K, VR, KeyValueStore<Bytes, byte[]>> materialized);
> > > ```
> > >
> > > Then are we going to still only allow the valueTransformer.init() /
> > > process() to be able to access N stores, with N stores specified with
> the
> > > stateStoreNames, but not the one specified in materialized.name()?
> > > Personally I think it should be the case as the materialized store
> should
> > > be managed by the Streams library itself, but we should probably help
> > users
> > > to understand if they have some stores used for the same purpose
> (storing
> > > the value that are going to be sent to the downstream changelog stream
> of
> > > KTable), they should save that store and not creating by themselves as
> it
> > > will be auto created by the Streams library.
> > >
> > >
> > > Guozhang
> > >
> > >
> > >
> > >
> > > On Tue, May 8, 2018 at 7:45 AM, Damian Guy 
> wrote:
> > >
> > > > Initially i thought materializing a store would be overkill, but
> from a
> > > > consistency point of view it makes sense to add an overload that
> takes
> > a
> > > > `Materialized` and only create the store if that is supplied.
> > > >
> > > > On Sun, 6 May 2018 at 17:52 Matthias J. Sax 
> > > wrote:
> > > >
> > > > > Andy,
> > > > >
> > > > > thanks for the KIP. I don't have any further comments.
> > > > >
> > > > > My 2cents about Guozhang's questions: as I like consistent
> behavior,
> > I
> > > > > think transformValues() should behave the same way as filter() and
> > > > > mapValues().
> > > > >
> > > > >
> > > > > -Matthias
> > > > >
> > > > > On 5/2/18 2:24 PM, Guozhang Wang wrote:
> > > > > > Hello Andy,
> > > > > >
> > > > > > Thanks for the KIP. The motivation and the general proposal 

Re: [DISCUSS] KIP-292: Add transformValues() method to KTable

2018-05-11 Thread Andy Coates
With reference to Guozhang's comment:

Then are we going to still only allow the valueTransformer.init() /
process() to be able to access N stores, with N stores specified with the
stateStoreNames, but not the one specified in materialized.name()?
Personally I think it should be the case as the materialized store should
be managed by the Streams library itself, but we should probably help users
to understand if they have some stores used for the same purpose (storing
the value that are going to be sent to the downstream changelog stream of
KTable), they should use that store and not create it themselves, as it
will be auto created by the Streams library.

In the PR I've pulled out the inline `ProcessorContext` implementation that
was in KStreamTransformValues, which throws an exception if any forward()
method is called.  Sounds like you're suggesting KTableTransformValues
should also not allow calls to getStateStore() where the name is the
materialized.name(). If that's the case, then sure, that shouldn't be an
issue.
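The guard described above can be sketched as a delegating wrapper handed to the transformer's init() call. The interface and class names below are simplified stand-ins for illustration, not Kafka's real `ProcessorContext`:

```java
// Simplified stand-ins for illustration; not Kafka's actual ProcessorContext.
interface SimpleContext {
    void forward(Object key, Object value);
    Object getStateStore(String name);
}

// Wrapper given to the transformer's init(): rejects forward() entirely, and
// rejects access to the store backing the materialized result, while
// delegating access to any other registered state store.
final class GuardedContext implements SimpleContext {
    private final SimpleContext delegate;
    private final String materializedStoreName;

    GuardedContext(SimpleContext delegate, String materializedStoreName) {
        this.delegate = delegate;
        this.materializedStoreName = materializedStoreName;
    }

    @Override
    public void forward(Object key, Object value) {
        throw new UnsupportedOperationException(
            "forward() is not allowed in transformValues()");
    }

    @Override
    public Object getStateStore(String name) {
        if (name.equals(materializedStoreName)) {
            throw new UnsupportedOperationException(
                "the store backing the materialized result may not be accessed");
        }
        return delegate.getStateStore(name);
    }
}
```

Failing fast like this turns "undefined behaviour" (mutating the library-managed store) into an immediate, diagnosable error.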

On 11 May 2018 at 09:42, Andy Coates  wrote:

> Sorry for my lack of response - I've been out of action with a bad back
> for a few days!
>
> I originally had the `Materialized` overloads added to the API. I'll
> update the KIP / PR with these again. In terms of semantics, as Matthias
> suggests, these should be consistent with filter() and mapValues(), etc.
>
> On 8 May 2018 at 17:59, Guozhang Wang  wrote:
>
>> To follow on Matthias and Damian's comments here:
>>
>> If we are going to add the overload functions as
>>
>> ```
>> <VR> KTable<K, VR> transformValues(final ValueTransformerSupplier<? super V,
>> ? extends VR> valueTransformerSupplier,
>>     final String... stateStoreNames,
>>     final Materialized<K, VR, KeyValueStore<Bytes, byte[]>> materialized);
>>
>> <VR> KTable<K, VR> transformValues(final ValueTransformerWithKeySupplier<?
>> super K, ? super V, ? extends VR> valueTransformerSupplier,
>>     final String... stateStoreNames,
>>     final Materialized<K, VR, KeyValueStore<Bytes, byte[]>> materialized);
>> ```
>>
>> Then are we going to still only allow the valueTransformer.init() /
>> process() to be able to access N stores, with N stores specified with the
>> stateStoreNames, but not the one specified in materialized.name()?
>> Personally I think it should be the case as the materialized store should
>> be managed by the Streams library itself, but we should probably help
>> users
>> to understand if they have some stores used for the same purpose (storing
>> the value that are going to be sent to the downstream changelog stream of
>> KTable), they should save that store and not creating by themselves as it
>> will be auto created by the Streams library.
>>
>>
>> Guozhang
>>
>>
>>
>>
>> On Tue, May 8, 2018 at 7:45 AM, Damian Guy  wrote:
>>
>> > Initially i thought materializing a store would be overkill, but from a
>> > consistency point of view it makes sense to add an overload that takes a
>> > `Materialized` and only create the store if that is supplied.
>> >
>> > On Sun, 6 May 2018 at 17:52 Matthias J. Sax 
>> wrote:
>> >
>> > > Andy,
>> > >
>> > > thanks for the KIP. I don't have any further comments.
>> > >
>> > > My 2cents about Guozhang's questions: as I like consistent behavior, I
>> > > think transformValues() should behave the same way as filter() and
>> > > mapValues().
>> > >
>> > >
>> > > -Matthias
>> > >
>> > > On 5/2/18 2:24 PM, Guozhang Wang wrote:
>> > > > Hello Andy,
>> > > >
>> > > > Thanks for the KIP. The motivation and the general proposal look
>> > > > good to me. I think in KTable it is indeed valuable to add the
>> > > > functions that do not change the key, such as mapValues,
>> > > > transformValues, and filter.
>> > > >
>> > > > There are a few meta comments I have about the semantics of the
>> > > > newly added functions:
>> > > >
>> > > > 1) For the resulting KTable, what should its "queryableStoreName()"
>> > > > return?
>> > > >
>> > > > 2) More specifically, how do we decide if the resulting KTable is
>> > > > to be materialized or not? E.g. if no store names are provided t

Re: [DISCUSS] KIP-292: Add transformValues() method to KTable

2018-05-11 Thread Andy Coates
Sorry for my lack of response - I've been out of action with a bad back for
a few days!

I originally had the `Materialized` overloads added to the API. I'll update
the KIP / PR with these again. In terms of semantics, as Matthias suggests,
these should be consistent with filter() and mapValues(), etc.

On 8 May 2018 at 17:59, Guozhang Wang  wrote:

> To follow on Matthias and Damian's comments here:
>
> If we are going to add the overload functions as
>
> ```
> <VR> KTable<K, VR> transformValues(
>     final ValueTransformerSupplier<? super V, ? extends VR> valueTransformerSupplier,
>     final String... stateStoreNames,
>     final Materialized<K, VR, KeyValueStore<Bytes, byte[]>> materialized);
>
> <VR> KTable<K, VR> transformValues(
>     final ValueTransformerWithKeySupplier<? super K, ? super V, ? extends VR> valueTransformerSupplier,
>     final String... stateStoreNames,
>     final Materialized<K, VR, KeyValueStore<Bytes, byte[]>> materialized);
> ```
>
> Then are we going to still only allow the valueTransformer.init() /
> process() to access the N stores specified via stateStoreNames, but not
> the one specified in materialized.name()? Personally I think that should
> be the case, as the materialized store should be managed by the Streams
> library itself. But we should probably help users understand that if they
> have a store used for the same purpose (storing the values that are going
> to be sent to the downstream changelog stream of the KTable), they should
> use that store rather than creating their own, as it will be auto-created
> by the Streams library.
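As a dependency-free toy model of the semantics under discussion (plain Java collections, not the Kafka Streams API; all names are illustrative): a value transformer rewrites values only, keys pass through untouched, and state accumulates in a store the transformer can read and write, playing the role of one of the stateStoreNames stores.

```java
import java.util.Map;
import java.util.TreeMap;

public class TransformValuesToyModel {
    // Value-only transform: for each (key, value) the output keeps the key
    // and replaces the value with a running sum held in 'store', mimicking a
    // transformer that reads/writes one of the stateStoreNames stores.
    static Map<String, Integer> transform(Map<String, Integer> updates, Map<String, Integer> store) {
        Map<String, Integer> result = new TreeMap<>();
        updates.forEach((key, value) -> result.put(key, store.merge(key, value, Integer::sum)));
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> store = new TreeMap<>();
        System.out.println(transform(new TreeMap<>(Map.of("a", 1, "b", 2)), store)); // {a=1, b=2}
        // A second batch of updates accumulates through the shared state store:
        System.out.println(transform(new TreeMap<>(Map.of("a", 4)), store)); // {a=5}
    }
}
```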
>
>
> Guozhang
>
>
>
>
> On Tue, May 8, 2018 at 7:45 AM, Damian Guy  wrote:
>
> > Initially i thought materializing a store would be overkill, but from a
> > consistency point of view it makes sense to add an overload that takes a
> > `Materialized` and only create the store if that is supplied.
> >
> > On Sun, 6 May 2018 at 17:52 Matthias J. Sax 
> wrote:
> >
> > > Andy,
> > >
> > > thanks for the KIP. I don't have any further comments.
> > >
> > > My 2cents about Guozhang's questions: as I like consistent behavior, I
> > > think transformValues() should behave the same way as filter() and
> > > mapValues().
> > >
> > >
> > > -Matthias
> > >
> > > On 5/2/18 2:24 PM, Guozhang Wang wrote:
> > > > Hello Andy,
> > > >
> > > > Thanks for the KIP. The motivation and the general proposal look
> > > > good to me. I think in KTable it is indeed valuable to add the
> > > > functions that do not change the key, such as mapValues,
> > > > transformValues, and filter.
> > > >
> > > > There are a few meta comments I have about the semantics of the
> > > > newly added functions:
> > > >
> > > > 1) For the resulting KTable, what should its "queryableStoreName()"
> > > > return?
> > > >
> > > > 2) More specifically, how do we decide if the resulting KTable is
> > > > to be materialized or not? E.g. if no store names are provided, then
> > > > it is likely that the resulting KTable is not materialized, or at
> > > > least not logically materialized and not queryable. What if at least
> > > > one state store is provided? Will any of them be used as the
> > > > materialized store, or should we still add a Materialized parameter
> > > > for this purpose?
> > > >
> > > > 3) For its internal implementations, how should the key/value serde,
> > > > sendOldValues flag etc be inherited from its parent processor node?
> > > >
> > > >
> > > > Guozhang
> > > >
> > > >
> > > > On Wed, May 2, 2018 at 12:43 PM, Andy Coates 
> > wrote:
> > > >
> > > >> Hi everyone,
> > > >>
> > > >> I would like to start a discussion for KIP 292. I would appreciate
> it
> > if
> > > >> you could review and provide feedback.
> > > >>
> > > >> KIP: KIP-292: Add transformValues() method to KTable
> > > >> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-292%3A+Add+transformValues%28%29+method+to+KTable>
> > > >> Jira: KAFKA-6849 <https://issues.apache.org/jira/browse/KAFKA-6849>
> > > >>
> > > >>PR: #4959 <https://github.com/apache/kafka/pull/4959>
> > > >>
> > > >>
> > > >>
> > > >> Thanks,
> > > >>
> > > >> Andy
> > > >>
> > > >
> > > >
> > > >
> > >
> > >
> >
>
>
>
> --
> -- Guozhang
>


Re: [DISCUSS] KIP-290: Support for wildcard suffixed ACLs

2018-05-11 Thread Andy Coates
I think I'm agreeing with you. I was merely suggesting that having an
additional field that controls how the current field is interpreted is more
flexible / extensible in the future than using a 'union' style approach,
where only one of several possible fields should be populated. But it's a
minor thing.






On 10 May 2018 at 09:29, Colin McCabe  wrote:

> Hi Andy,
>
> The issue that I was trying to solve here is the Java API.  Right now,
> someone can write "new ResourceFilter(ResourceType.TRANSACTIONAL_ID,
> "foo*") and have a ResourceFilter that applies to a Transactional ID named
> "foo*".  This has to continue to work, or else we have broken compatibility.
>
> I was proposing that there would be something like a new function like
> ResourceFilter.fromPattern(ResourceType.TRANSACTIONAL_ID, "foo*") which
> would create a ResourceFilter that applied to transactional IDs starting
> with "foo", rather than transactional IDs named "foo*" specifically.
>
> I don't think it's important whether the Java class has an integer, an
> enum, or two string fields.  The important thing is that there's a new
> static function, or new constructor overload, etc. that works for patterns
> rather than literal strings.
>
> On Thu, May 10, 2018, at 03:30, Andy Coates wrote:
> > Rather than having name and pattern fields on the ResourceFilter, where
> > it’s only valid for one to be set, and we want to restrict the character
> > set in case future enhancements need them, we could instead add a new
> > integer ‘nameType’ field, and use constants to indicate how the name
> > field should be interpreted, e.g. 0 = literal, 1 = wildcard. This would
> > be extendable, e.g we can later add 2 = regex, or what ever, and
> > wouldn’t require any escaping.
>
> This is very user-unfriendly, though.  Users don't want to have to
> explicitly supply a version number when using the API, which is what this
> would force them to do.  I don't think users are going to want to memorize
> that version 4 supported "+", whereas version 3 only supported "[0-9]", or
> whatever.
>
> Just as an example, do you remember which versions of FetchRequest added
> which features?  I don't.  I always have to look at the code to remember.
>
> Also, escaping is still required any time you overload a character to mean
> two things.  Escaping is required in the current proposal to be able to
> create a pattern that matches only "foo*".  You have to type "foo\*"  It
> would be required if we forced users to specify a version, as well.
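To make the escaping point concrete, here is a minimal, self-contained sketch (illustrative only, not the KIP's actual matcher): an unescaped '*' acts as a wildcard, while "\*" matches a literal '*', so the pattern "foo\*" matches only the name "foo*".

```java
import java.util.regex.Pattern;

public class WildcardEscapeSketch {
    // Translates a pattern where an unescaped '*' is a wildcard and "\*" is
    // a literal '*' into a regex, then tests a name against it.
    static boolean matches(String pattern, String name) {
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < pattern.length(); i++) {
            char c = pattern.charAt(i);
            if (c == '\\' && i + 1 < pattern.length()) {
                regex.append(Pattern.quote(String.valueOf(pattern.charAt(++i)))); // escaped char: literal
            } else if (c == '*') {
                regex.append(".*"); // unescaped '*': wildcard
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return name.matches(regex.toString());
    }

    public static void main(String[] args) {
        System.out.println(matches("foo*", "foobar"));   // true  - '*' is a wildcard
        System.out.println(matches("foo\\*", "foobar")); // false - pattern means literal "foo*"
        System.out.println(matches("foo\\*", "foo*"));   // true
    }
}
```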
>
> best,
> Colin
>
> >
> > Sent from my iPhone
> >
> > > On 7 May 2018, at 05:16, Piyush Vijay  wrote:
> > >
> > > Makes sense. I'll update the KIP.
> > >
> > > Does anyone have any other comments? :)
> > >
> > > Thanks
> > >
> > >
> > > Piyush Vijay
> > >
> > >> On Thu, May 3, 2018 at 11:55 AM, Colin McCabe 
> wrote:
> > >>
> > >> Yeah, I guess that's a good point.  It probably makes sense to
> support the
> > >> prefix scheme for consumer groups and transactional IDs as well as
> topics.
> > >>
> > >> I agree that the current situation where anything goes in consumer
> group
> > >> names and transactional ID names is not ideal.  I wish we could
> rewind the
> > >> clock and impose restrictions on the names.  However, it doesn't seem
> > >> practical at the moment.  Adding new restrictions would break a lot of
> > >> existing users after an upgrade.  It would be a really bad upgrade
> > >> experience.
> > >>
> > >> However, I think we can support this in a compatible way.  From the
> > >> perspective of AdminClient, we just have to add a new field to
> > >> ResourceFilter.  Currently, it has two fields, resourceType and name:
> > >>
> > >>> /**
> > >>> * A filter which matches Resource objects.
> > >>> *
> > >>> * The API for this class is still evolving and we may break
> > >> compatibility in minor releases, if necessary.
> > >>> */
> > >>> @InterfaceStability.Evolving
> > >>> public class ResourceFilter {
> > >>>private final ResourceType resourceType;
> > >>>private final String name;
> > >>
> > >> We can add a third field, pattern.
> > >>
> > >> So the API will basically be, if I create a
> ResourceF

Re: [DISCUSS] KIP-290: Support for wildcard suffixed ACLs

2018-05-10 Thread Andy Coates
>>>  There is a lot of value in having a defined naming convention on
>> these
>>>  concepts.
>>>  - This will help us not run into more issues down the line.
>>>  - I’m not sure if people actually use ‘*’ in their consumer group
>>>  names anyway.
>>>  - Escaping ‘*’ isn’t trivial because ‘\’ is an allowed character
>> too.
>>> 
>>> 
>>> *Why introduce two new APIs?*
>>> 
>>> 
>>> 
>>>   - It’s possible to make this change without introducing new APIs but
>> new
>>>   APIs are required for inspection.
>>>   - For example: If I want to fetch all ACLs that match ’topicA*’, it’s
>>>   not possible without introducing new APIs AND maintaining backwards
>>>   compatibility.
>>> 
>>> 
>>> *Matching multiple hosts*
>>> 
>>> 
>>> 
>>>   - Rajini is right that wildcard ACLs aren’t the correct fit for
>>>   specifying range of hosts.
>>>   - We will rely on KIP-252 for the proper functionality (
>>> 
>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>> 252+-+Extend+ACLs+to+allow+filtering+based+on+ip+ranges+and+subnets
>>>   )
>>> 
>>> 
>>> *Implementation of matching algorithm and performance concerns*
>>> 
>>> 
>>> 
>>>   - Updated the KIP with an implementation.
>>>   - Andy, you’re right. The length doesn’t play a part. The request will
>>>   be authorized *iff* there is at least one matching ALLOW and no
>> matching
>>>   DENY irrespective of the prefix length. Included this detail in the
>> KIP.
>>>   - Since everything is stored in memory, the performance hit isn’t
>> really
>>>   significantly worse than the current behavior.
>>>   - Stephane’s performance improvement suggestion is a great idea but
>>>   orthogonal.
>>> 
>>> 
>>> *Why extend this for principal?*
>>> 
>>> 
>>> 
>>>   - Thanks for the amazing points Rajini.
>>>   - There is a use case where ALL principals from an org might want to
>>>   access fix set of topics.
>>>   - User group functionality is currently missing.
>>>   - IIRC SASL doesn’t use custom principal builder.
>>>   - However, prefixing is not the right choice in all cases. Agreed.
>>>   - Thoughts?
>>> 
>>> 
>>> *Changes to AdminClient to support wildcard ACLs*
>>> 
>>> 
>>> 
>>>   - Thanks Colin for the implementation. It’s good to have you and
>>> others
>>>   here for the expert opinions.
>>>   - The current implementation uses two classes: AclBinding and
>>>   AclBindingFilter. (
>>> 
>>> https://github.com/apache/kafka/blob/trunk/clients/src/
>> main/java/org/apache/kafka/common/acl/AclBinding.java
>>>   and
>>> 
>>> https://github.com/apache/kafka/blob/trunk/clients/src/
>> main/java/org/apache/kafka/common/acl/AclBindingFilter.java
>>>   )
>>>   - AclBinding is definition of an Acl. It’s used to create ACLs.
>>>   - AclBindingFilter is used to fetch or delete “matching’ ACLs.
>>>   - In the context of wildcard suffixed ACLs, a stored ACL may have ‘*’
>>> in
>>>   it. *It really removes the distinction between these two classes.*
>>>   - The current implementation uses ‘null’ to define wildcard ACL
>>> (‘*’). I
>>>   think it’s not a good pattern and we should use ‘*’ for the wildcard
>>> ACL (
>>> 
>>> https://github.com/apache/kafka/blob/trunk/clients/src/
>> main/java/org/apache/kafka/common/acl/AclBindingFilter.java#L39
>>>   and
>>> 
>>> https://github.com/apache/kafka/blob/trunk/clients/src/
>> main/java/org/apache/kafka/common/resource/ResourceFilter.java#L37
>>>   ).
>>>   - However, the above two changes are breaking change but it should be
>>>   acceptable because the API is marked with
>>> @InterfaceStability.Evolving.
>>>   - If everyone agrees to the above two changes (merging the two
>>> classes
>>>   and using non-null values for blanket access), the only other change
>>> is
>>>   using the matching algorithm from the KIP to match ACLs.
>>> 
>>> 
>>> Other comments:
>>> 
>>> 
>>> 
>>>   - > It may be worth excluding delegation token ACLs from using
>> prefixed
>>>   wildcards since it doesn't make much sense.
>>> 
>>> I want to ask fo

Re: [DISCUSS] KIP-290: Support for wildcard suffixed ACLs

2018-05-02 Thread Andy Coates
 different quotas are
> > > required for the full user name)? User principal strings take different
> > > formats for different security protocols (eg. CN=xxx,O=org,C=UK for
> SSL)
> > > and simple prefixing isn't probably the right grouping in many cases.
> > >
> > > 3. I am assuming we don't want to do wildcarded hosts in this KIP since
> > > wildcard suffix doesn't really work for hosts.
> > >
> > > 4. It may be worth excluding delegation token ACLs from using prefixed
> > > wildcards since it doesn't make much sense.
> > >
> > > Regards,
> > >
> > > Rajini
> > >
> > >
> > > On Wed, May 2, 2018 at 2:05 AM, Stephane Maarek <
> > > steph...@simplemachines.com.au> wrote:
> > >
> > > > Hi, thanks for this badly needed feature
> > > >
> > > > 1) Why introduce two new APIs in authorizer instead of replacing the
> > > > implementation for simple ACL authorizer with adding the wildcard
> > > > capability?
> > > >
> > > > 2) Is there an impact on performance, as we're now evaluating more
> > > > rules? A while back I evaluated the concept of a cached ACL result,
> > > > to swallow the cost of computing an authorisation once and then do
> > > > in-memory lookups. CF: https://issues.apache.org/jira/browse/KAFKA-5261
> > > >
> > > > 3) is there any need to also extend this to consumer group resources
> ?
> > > >
> > > > 4) create topics KIP as recently moved permissions out of Cluster
> into
> > > > Topic. Will wildcard be supported for this action too?
> > > >
> > > > Thanks a lot for this !
> > > >
> > > > On Wed., 2 May 2018, 1:37 am Ted Yu,  wrote:
> > > >
> > > > > w.r.t. naming, we can keep wildcard and drop 'prefixed' (or
> > 'suffixed')
> > > > > since the use of regex would always start with non-wildcard
> portion.
> > > > >
> > > > > Cheers
> > > > >
> > > > > On Tue, May 1, 2018 at 12:13 PM, Andy Coates 
> > wrote:
> > > > >
> > > > > > Hi Piyush,
> > > > > >
> > > > > > Can you also document in the Compatibility section what would
> > happen
> > > > > should
> > > > > > the cluster be upgraded, wildcard-suffixed ACLs are added, and
> > then the
> > > > > > cluster is rolled back to the previous version.  On downgrade the
> > > > partial
> > > > > > wildcard ACLs will be treated as literals and hence never match
> > > > anything.
> > > > > > This is fine for ALLOW ACLs, but some might consider this a
> > security
> > > > hole
> > > > > > if a DENY ACL is ignored, so this needs to be documented, both in
> > the
> > > > KIP
> > > > > > and the final docs.
> > > > > >
> > > > > > For some reason I find the term 'prefixed wildcard ACLs' easier
> to
> > grok
> > > > > > than 'wildcard suffixed ACLs'. Probably because in the former the
> > > > > > 'wildcard' term comes after the positional adjective, which
> > matches the
> > > > > > position of the wildcard char in the resource name, i.e. "some*".
> > It's
> > > > > > most likely a personal thing, but I thought I'd mention it, as
> > > > > > naming is important when it comes to making this intuitive for
> > > > > > users.
> > > > > >
> > > > > > On 1 May 2018 at 19:57, Andy Coates  wrote:
> > > > > >
> > > > > > > Hi Piyush,
> > > > > > >
> > > > > > > Thanks for raising this KIP - it's very much appreciated.
> > > > > > >
> > > > > > > I've not had chance to digest it yet, but...
> > > > > > >
> > > > > > > 1. you might want to add details of how the internals of the
> > > > > > > `getMatchingAcls` is implemented. We'd want to make sure the
> > > > complexity
> > > > > > of
> > > > > > > the operation isn't adversely affected.
> > > > > > > 2. You might want to be more explicit that the length of a
> pre

[DISCUSS] KIP-292: Add transformValues() method to KTable

2018-05-02 Thread Andy Coates
Hi everyone,

I would like to start a discussion for KIP 292. I would appreciate it if
you could review and provide feedback.

KIP: KIP-292: Add transformValues() method to KTable

Jira: KAFKA-6849 

   PR: #4959 



Thanks,

Andy


[jira] [Created] (KAFKA-6849) Add transformValues() method to KTable

2018-05-02 Thread Andy Coates (JIRA)
Andy Coates created KAFKA-6849:
--

 Summary:  Add transformValues() method to KTable
 Key: KAFKA-6849
 URL: https://issues.apache.org/jira/browse/KAFKA-6849
 Project: Kafka
  Issue Type: Improvement
  Components: streams
Reporter: Andy Coates
Assignee: Andy Coates


Add {{transformValues()}} methods to the {{KTable}} interface with the same 
semantics as the functions of the same name on the {{KStream}} interface.

 

More details in 
[KIP-292|https://cwiki.apache.org/confluence/display/KAFKA/KIP-292%3A+Add+transformValues%28%29+method+to+KTable].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [DISCUSS] KIP-290: Support for wildcard suffixed ACLs

2018-05-01 Thread Andy Coates
Hi Piyush,

Can you also document in the Compatibility section what would happen should
the cluster be upgraded, wildcard-suffixed ACLs are added, and then the
cluster is rolled back to the previous version.  On downgrade the partial
wildcard ACLs will be treated as literals and hence never match anything.
This is fine for ALLOW ACLs, but some might consider this a security hole
if a DENY ACL is ignored, so this needs to be documented, both in the KIP
and the final docs.

For some reason I find the term 'prefixed wildcard ACLs' easier to grok
than 'wildcard suffixed ACLs'. Probably because in the former the
'wildcard' term comes after the positional adjective, which matches the
position of the wildcard char in the resource name, i.e. "some*".  It's
most likely a personal thing, but I thought I'd mention it, as naming is
important when it comes to making this intuitive for users.

On 1 May 2018 at 19:57, Andy Coates  wrote:

> Hi Piyush,
>
> Thanks for raising this KIP - it's very much appreciated.
>
> I've not had chance to digest it yet, but...
>
> 1. you might want to add details of how the internals of the
> `getMatchingAcls` is implemented. We'd want to make sure the complexity of
> the operation isn't adversely affected.
> 2. You might want to be more explicit that the length of a prefix does not
> play a part in the `authorize` call, e.g. given ACLs {DENY, some.*}, {ALLOW,
> some.prefix.*}, the longer, i.e. more specific, allow ACL does _not_
> override the more general deny ACL.
>
> Thanks,
>
> Andy
>
> On 1 May 2018 at 16:59, Ron Dagostino  wrote:
>
>> Hi Piyush.  I appreciated your talk at Kafka Summit and appreciate the KIP
>> -- thanks.
>>
>> Could you explain these mismatching references?  Near the top of the KIP
>> you refer to these proposed new method signatures:
>>
>> def getMatchingAcls(resource: Resource): Set[Acl]
>> def getMatchingAcls(principal: KafkaPrincipal): Map[Resource, Set[Acl]]
>>
>> But near the bottom of the KIP you refer to different method
>> signatures that don't seem to match the above ones:
>>
>> getMatchingAcls(topicRegex)
>> getMatchingAcls(principalRegex)
>>
>> Ron
>>
>>
>> On Tue, May 1, 2018 at 11:48 AM, Ted Yu  wrote:
>>
>> > The KIP was well written. Minor comment on formatting:
>> >
>> > https://github.com/apache/kafka/blob/trunk/core/src/
>> > main/scala/kafka/admin/AclCommand.scala
>> > to
>> >
>> > Leave space between the URL and 'to'
>> >
>> > Can you describe changes for the AdminClient ?
>> >
>> > Thanks
>> >
>> > On Tue, May 1, 2018 at 8:12 AM, Piyush Vijay 
>> > wrote:
>> >
>> > > Hi all,
>> > >
>> > > I just opened a KIP to add support for wildcard suffixed ACLs. This
>> > > is one of the features I talked about in my Kafka Summit talk, and we
>> > > promised to upstream it :)
>> > >
>> > > The details are here -
>> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>> > > 290%3A+Support+for+wildcard+suffixed+ACLs
>> > >
>> > > There is an open question about the way to add the support in the
>> > > AdminClient, which I can discuss here in more detail once everyone has
>> > > taken a first look at the KIP.
>> > >
>> > > Looking forward to discussing the change.
>> > >
>> > > Best,
>> > > Piyush Vijay
>> > >
>> >
>>
>
>


Re: [DISCUSS] KIP-290: Support for wildcard suffixed ACLs

2018-05-01 Thread Andy Coates
Hi Piyush,

Thanks for raising this KIP - it's very much appreciated.

I've not had chance to digest it yet, but...

1. you might want to add details of how the internals of the
`getMatchingAcls` is implemented. We'd want to make sure the complexity of
the operation isn't adversely affected.
2. You might want to be more explicit that the length of a prefix does not
play a part in the `authorize` call, e.g. given ACLs {DENY, some.*}, {ALLOW,
some.prefix.*}, the longer, i.e. more specific, allow ACL does _not_
override the more general deny ACL.
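The precedence rule in point 2 can be sketched as a short, self-contained toy model (not the Kafka Authorizer API; class and method names are made up): a request is authorized iff at least one matching ALLOW ACL exists and no matching DENY ACL exists, with prefix length playing no part.

```java
import java.util.List;

public class PrefixAclSketch {
    record Acl(boolean allow, String prefix) {
        boolean matches(String resource) { return resource.startsWith(prefix); }
    }

    // Allowed iff some ALLOW matches and no DENY matches; prefix length is irrelevant.
    static boolean authorize(List<Acl> acls, String resource) {
        boolean anyAllow = acls.stream().anyMatch(a -> a.allow() && a.matches(resource));
        boolean anyDeny = acls.stream().anyMatch(a -> !a.allow() && a.matches(resource));
        return anyAllow && !anyDeny;
    }

    public static void main(String[] args) {
        List<Acl> acls = List.of(
                new Acl(false, "some."),        // DENY  some.*
                new Acl(true, "some.prefix.")   // ALLOW some.prefix.* (longer, more specific)
        );
        // The more specific ALLOW does NOT override the general DENY:
        System.out.println(authorize(acls, "some.prefix.topic")); // false
        System.out.println(authorize(List.of(new Acl(true, "some.")), "some.topic")); // true
    }
}
```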

Thanks,

Andy

On 1 May 2018 at 16:59, Ron Dagostino  wrote:

> Hi Piyush.  I appreciated your talk at Kafka Summit and appreciate the KIP
> -- thanks.
>
> Could you explain these mismatching references?  Near the top of the KIP
> you refer to these proposed new method signatures:
>
> def getMatchingAcls(resource: Resource): Set[Acl]
> def getMatchingAcls(principal: KafkaPrincipal): Map[Resource, Set[Acl]]
>
> But near the bottom of the KIP you refer to different method
> signatures that don't seem to match the above ones:
>
> getMatchingAcls(topicRegex)
> getMatchingAcls(principalRegex)
>
> Ron
>
>
> On Tue, May 1, 2018 at 11:48 AM, Ted Yu  wrote:
>
> > The KIP was well written. Minor comment on formatting:
> >
> > https://github.com/apache/kafka/blob/trunk/core/src/
> > main/scala/kafka/admin/AclCommand.scala
> > to
> >
> > Leave space between the URL and 'to'
> >
> > Can you describe changes for the AdminClient ?
> >
> > Thanks
> >
> > On Tue, May 1, 2018 at 8:12 AM, Piyush Vijay 
> > wrote:
> >
> > > Hi all,
> > >
> > > I just opened a KIP to add support for wildcard suffixed ACLs. This
> > > is one of the features I talked about in my Kafka Summit talk, and we
> > > promised to upstream it :)
> > >
> > > The details are here -
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > 290%3A+Support+for+wildcard+suffixed+ACLs
> > >
> > > There is an open question about the way to add the support in the
> > > AdminClient, which I can discuss here in more detail once everyone has
> > > taken a first look at the KIP.
> > >
> > > Looking forward to discussing the change.
> > >
> > > Best,
> > > Piyush Vijay
> > >
> >
>


Permission to create KIP

2018-05-01 Thread Andy Coates
Hi,

Can I get permission to add a KIP in Confluence please?

My Wiki Id is: bigandy

Thanks!

Andy


[jira] [Created] (KAFKA-6727) org.apache.kafka.clients.admin.Config has broken equals and hashCode method.

2018-03-29 Thread Andy Coates (JIRA)
Andy Coates created KAFKA-6727:
--

 Summary: org.apache.kafka.clients.admin.Config has broken equals 
and hashCode method.
 Key: KAFKA-6727
 URL: https://issues.apache.org/jira/browse/KAFKA-6727
 Project: Kafka
  Issue Type: Improvement
  Components: clients, tools
Affects Versions: 1.1.0
Reporter: Andy Coates
Assignee: Andy Coates


`Config` makes use of `Collections.unmodifiableCollection` to wrap the supplied 
entries to make it immutable. Unfortunately, this breaks `hashCode` and 
`equals`.

From the Java docs:

> The returned collection does _not_ pass the hashCode and equals operations 
>through to the backing collection, but relies on {{Object}}'s {{equals}} and 
>{{hashCode}} methods.

See: 
https://docs.oracle.com/javase/7/docs/api/java/util/Collections.html#unmodifiableCollection(java.util.Collection)
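A minimal, self-contained demonstration of the behaviour described (not the Kafka `Config` code; names and values are illustrative): two wrappers around equal collections are not equal, because the wrapper falls back to `Object`'s identity-based `equals`/`hashCode`.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.List;

public class UnmodifiableEqualityDemo {
    public static void main(String[] args) {
        List<String> a = List.of("retention.ms", "1000");
        List<String> b = List.of("retention.ms", "1000");

        // Lists compare element-wise, so these are equal:
        System.out.println(a.equals(b)); // true

        // Collections.unmodifiableCollection does NOT delegate equals/hashCode,
        // so two wrappers around equal collections compare by identity:
        Collection<String> wa = Collections.unmodifiableCollection(a);
        Collection<String> wb = Collections.unmodifiableCollection(b);
        System.out.println(wa.equals(wb)); // false
    }
}
```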

 

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-5246) Remove backdoor that allows any client to produce to internal topics

2017-06-08 Thread Andy Coates (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Coates updated KAFKA-5246:
---
Resolution: Won't Fix
Status: Resolved  (was: Patch Available)

Discussion on the PR concluded that we're closing this without fixing it. The 
workaround is to use ACLs to lock down the __consumer_offsets topic so that only 
required use-cases have direct access to produce to it.

>  Remove backdoor that allows any client to produce to internal topics
> -
>
> Key: KAFKA-5246
> URL: https://issues.apache.org/jira/browse/KAFKA-5246
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.0.0, 0.10.0.1, 0.10.1.0, 0.10.1.1, 0.10.2.0, 
> 0.10.2.1
>Reporter: Andy Coates
>Assignee: Andy Coates
>Priority: Minor
>
> kafka.admin.AdminUtils defines an ‘AdminClientId' val, which looks to be 
> unused in the code, with the exception of a single use in KafkaApis.scala in 
> handleProducerRequest, where it looks to allow any client, using the special 
> ‘__admin_client' client id, to append to internal topics.
> This looks like a security risk to me, as it would allow any client to 
> produce either rogue offsets or even a record containing something other than 
> group/offset info.
> Can we remove this please?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-5246) Remove backdoor that allows any client to produce to internal topics

2017-05-15 Thread Andy Coates (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Coates updated KAFKA-5246:
---
Status: Patch Available  (was: Open)

>  Remove backdoor that allows any client to produce to internal topics
> -
>
> Key: KAFKA-5246
> URL: https://issues.apache.org/jira/browse/KAFKA-5246
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.2.1, 0.10.2.0, 0.10.1.1, 0.10.1.0, 0.10.0.1, 
> 0.10.0.0
>Reporter: Andy Coates
>Assignee: Andy Coates
>Priority: Minor
>
> kafka.admin.AdminUtils defines an ‘AdminClientId' val, which looks to be 
> unused in the code, with the exception of a single use in KafkaApis.scala in 
> handleProducerRequest, where it looks to allow any client, using the special 
> ‘__admin_client' client id, to append to internal topics.
> This looks like a security risk to me, as it would allow any client to 
> produce either rogue offsets or even a record containing something other than 
> group/offset info.
> Can we remove this please?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (KAFKA-5246) Remove backdoor that allows any client to produce to internal topics

2017-05-15 Thread Andy Coates (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Coates reassigned KAFKA-5246:
--

Assignee: Andy Coates

>  Remove backdoor that allows any client to produce to internal topics
> -
>
> Key: KAFKA-5246
> URL: https://issues.apache.org/jira/browse/KAFKA-5246
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.0.0, 0.10.0.1, 0.10.1.0, 0.10.1.1, 0.10.2.0, 
> 0.10.2.1
>Reporter: Andy Coates
>Assignee: Andy Coates
>Priority: Minor
>
> kafka.admin.AdminUtils defines an ‘AdminClientId' val, which looks to be 
> unused in the code, with the exception of a single use in KafkaApis.scala in 
> handleProducerRequest, where it looks to allow any client, using the special 
> ‘__admin_client' client id, to append to internal topics.
> This looks like a security risk to me, as it would allow any client to 
> produce either rogue offsets or even a record containing something other than 
> group/offset info.
> Can we remove this please?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5246) Remove backdoor that allows any client to produce to internal topics

2017-05-15 Thread Andy Coates (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16010948#comment-16010948
 ] 

Andy Coates commented on KAFKA-5246:


I've got a patch waiting for this if someone can assign this Jira to me and 
give me whatever permissions are needed to allow me to raise a PR in GitHub.

Thanks,

Andy

>  Remove backdoor that allows any client to produce to internal topics
> -
>
> Key: KAFKA-5246
> URL: https://issues.apache.org/jira/browse/KAFKA-5246
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.0.0, 0.10.0.1, 0.10.1.0, 0.10.1.1, 0.10.2.0, 
> 0.10.2.1
>Reporter: Andy Coates
>Priority: Minor
>
> kafka.admin.AdminUtils defines an ‘AdminClientId' val, which looks to be 
> unused in the code, with the exception of a single use in KafkaApis.scala in 
> handleProducerRequest, where it looks to allow any client, using the special 
> ‘__admin_client' client id, to append to internal topics.
> This looks like a security risk to me, as it would allow any client to 
> produce either rogue offsets or even a record containing something other than 
> group/offset info.
> Can we remove this please?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (KAFKA-5246) Remove backdoor that allows any client to produce to internal topics

2017-05-15 Thread Andy Coates (JIRA)
Andy Coates created KAFKA-5246:
--

 Summary:  Remove backdoor that allows any client to produce to 
internal topics
 Key: KAFKA-5246
 URL: https://issues.apache.org/jira/browse/KAFKA-5246
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 0.10.2.1, 0.10.2.0, 0.10.1.1, 0.10.1.0, 0.10.0.1, 0.10.0.0
Reporter: Andy Coates
Priority: Minor


kafka.admin.AdminUtils defines an ‘AdminClientId' val, which looks to be unused 
in the code, with the exception of a single use in KafkaApis.scala in 
handleProducerRequest, where it looks to allow any client, using the special 
‘__admin_client' client id, to append to internal topics.

This looks like a security risk to me, as it would allow any client to produce 
either rogue offsets or even a record containing something other than 
group/offset info.

Can we remove this please?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-3919) Broker fails to start after ungraceful shutdown due to non-monotonically incrementing offsets in logs

2016-09-26 Thread Andy Coates (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15522746#comment-15522746
 ] 

Andy Coates commented on KAFKA-3919:


Thanks [~ijuma]. Given that the problems I'm experiencing are purely related 
to messages produced with acks=1, and KAFKA-2111 speaks explicitly about 
acks>1, I'd be interested to hear whether [~junrao] thinks that KAFKA-2111 
does indeed fix this issue too, given this new information on the problem.

Thanks all!

> Broker fails to start after ungraceful shutdown due to non-monotonically 
> incrementing offsets in logs
> --
>
> Key: KAFKA-3919
> URL: https://issues.apache.org/jira/browse/KAFKA-3919
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.9.0.1
>Reporter: Andy Coates
>
> Hi All,
> I encountered an issue with Kafka following a power outage that saw a 
> proportion of our cluster disappear. When the power came back on several 
> brokers halted on start up with the error:
> {noformat}
>   Fatal error during KafkaServerStartable startup. Prepare to shutdown
>   kafka.common.InvalidOffsetException: Attempt to append an offset 
> (1239742691) to position 35728 no larger than the last offset appended 
> (1239742822) to 
> /data3/kafka/mt_xp_its_music_main_itsevent-20/001239444214.index.
>   at 
> kafka.log.OffsetIndex$$anonfun$append$1.apply$mcV$sp(OffsetIndex.scala:207)
>   at kafka.log.OffsetIndex$$anonfun$append$1.apply(OffsetIndex.scala:197)
>   at kafka.log.OffsetIndex$$anonfun$append$1.apply(OffsetIndex.scala:197)
>   at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:262)
>   at kafka.log.OffsetIndex.append(OffsetIndex.scala:197)
>   at kafka.log.LogSegment.recover(LogSegment.scala:188)
>   at kafka.log.Log$$anonfun$loadSegments$4.apply(Log.scala:188)
>   at kafka.log.Log$$anonfun$loadSegments$4.apply(Log.scala:160)
>   at 
> scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772)
>   at 
> scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
>   at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
>   at 
> scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
>   at kafka.log.Log.loadSegments(Log.scala:160)
>   at kafka.log.Log.(Log.scala:90)
>   at 
> kafka.log.LogManager$$anonfun$loadLogs$2$$anonfun$3$$anonfun$apply$10$$anonfun$apply$1.apply$mcV$sp(LogManager.scala:150)
>   at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:60)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> {noformat}
> The only way to recover the brokers was to delete the log files that 
> contained non-monotonically incrementing offsets.
> We've spent some time digging through the logs and I feel I may have worked 
> out the sequence of events leading to this issue, (though this is based on 
> some assumptions I've made about the way Kafka is working, which may be 
> wrong).
> First off, we have unclean leadership elections disabled. (We did later 
> enable them to help get around some other issues we were having, but this 
> was several hours after this issue manifested), and we're producing to the 
> topic with gzip compression and acks=1.
> We looked through the data logs that were causing the brokers to not start. 
> What we found was that the initial part of the log has monotonically 
> increasing offsets, where each compressed batch normally contained one or 
> two records. Then there is a batch that contains many records, whose first 
> record has an offset below the previous batch and whose last record has an 
> offset above the previous batch. Following on from this there continues a 
> period of large batches, with monotonically increasing offsets, and then the 
> log returns to batches with one or two records.
> Our working assumption here is that the period before the offset dip, with 
> the small batches, is pre-outage normal operation. The period of larger 
> batches is from just after the outage, where producers have a backlog to 
> process when the partition becomes available, and then things return to 
> normal batch sizes again once the backlog clears.

[jira] [Commented] (KAFKA-3919) Broker fails to start after ungraceful shutdown due to non-monotonically incrementing offsets in logs

2016-09-26 Thread Andy Coates (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15522583#comment-15522583
 ] 

Andy Coates commented on KAFKA-3919:


We experienced a similar incident again recently. This time the cluster was 
destabilised when there were intermittent connectivity issues with ZK for a 
few minutes. This resulted in a quick succession of Controller elections and 
changes, and ISR shrinks and grows.  During the ISR shrinks and grows some 
brokers seem to have got themselves into the same inconsistent state as above 
and halted. To recover the brokers we needed to manually delete the corrupted 
log segments.

So it looks like the above issue is not related, or not just related, to 
ungraceful shutdown. This inconsistent state appears to be possible during ISR 
changes where producers are producing with acks=1. Though obviously the ZK 
connectivity issues will likely have played a role.



[jira] [Commented] (KAFKA-3919) Broker fails to start after ungraceful shutdown due to non-monotonically incrementing offsets in logs

2016-07-12 Thread Andy Coates (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15372565#comment-15372565
 ] 

Andy Coates commented on KAFKA-3919:


[~junrao]  Good stuff. Look forward to hearing from you and getting involved 
more =)


[jira] [Commented] (KAFKA-3919) Broker fails to start after ungraceful shutdown due to non-monotonically incrementing offsets in logs

2016-07-08 Thread Andy Coates (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15367489#comment-15367489
 ] 

Andy Coates commented on KAFKA-3919:


[~junrao] Yes, we lost a good number of brokers in a power outage.

The solution you're proposing in KAFKA-1211 looks fairly involved, i.e. a 
protocol change. Is this something you think I can pick up, (having never 
committed to Kafka, but being no newbie to distributed programming), or 
something I'd be best off leaving to the committers? (I'm conscious that it's 
been sat as a known issue for a long time, and internally this is viewed as a 
blocker...)




[jira] [Commented] (KAFKA-3919) Broker fails to start after ungraceful shutdown due to non-monotonically incrementing offsets in logs

2016-07-06 Thread Andy Coates (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15364141#comment-15364141
 ] 

Andy Coates commented on KAFKA-3919:


[~junrao] My understanding was that the offset index looks at the offset of the 
first record in the compressed set, not the last.  Having checked the code this 
does seem to be the case. (Code below from LogSegment.scala, recover()):

{code}
val startOffset =
  entry.message.compressionCodec match {
    case NoCompressionCodec =>
      entry.offset
    case _ =>
      ByteBufferMessageSet.deepIterator(entry.message).next().offset
  }
{code}
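For context on why recovery then halts: the index requires strictly 
increasing offsets on append, so a batch whose recorded start offset dips to 
or below the last offset already appended is fatal. A simplified sketch of 
that invariant, using a hypothetical class modelled on the error message 
rather than the real kafka.log.OffsetIndex:

```scala
// Simplified model of the invariant the offset index enforces during
// recovery. This class is hypothetical; it only mirrors the error text
// "Attempt to append an offset ... no larger than the last offset appended".
final class MonotonicIndexSketch {
  private var lastOffset: Long = -1L

  def append(offset: Long): Unit = {
    require(offset > lastOffset,
      s"Attempt to append an offset ($offset) no larger than the last offset appended ($lastOffset)")
    lastOffset = offset
  }
}
```

Appending 1239742822 and then 1239742691, the two offsets from the stack 
trace, trips this check in the same way the broker did on startup.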


[jira] [Updated] (KAFKA-3919) Broker fails to start after ungraceful shutdown due to non-monotonically incrementing offsets in logs

2016-07-05 Thread Andy Coates (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Coates updated KAFKA-3919:
---
Description: 
Hi All,

I encountered an issue with Kafka following a power outage that saw a 
proportion of our cluster disappear. When the power came back on several 
brokers halted on start up with the error:

{noformat}
Fatal error during KafkaServerStartable startup. Prepare to shutdown
kafka.common.InvalidOffsetException: Attempt to append an offset 
(1239742691) to position 35728 no larger than the last offset appended 
(1239742822) to 
/data3/kafka/mt_xp_its_music_main_itsevent-20/001239444214.index.
at 
kafka.log.OffsetIndex$$anonfun$append$1.apply$mcV$sp(OffsetIndex.scala:207)
at kafka.log.OffsetIndex$$anonfun$append$1.apply(OffsetIndex.scala:197)
at kafka.log.OffsetIndex$$anonfun$append$1.apply(OffsetIndex.scala:197)
at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:262)
at kafka.log.OffsetIndex.append(OffsetIndex.scala:197)
at kafka.log.LogSegment.recover(LogSegment.scala:188)
at kafka.log.Log$$anonfun$loadSegments$4.apply(Log.scala:188)
at kafka.log.Log$$anonfun$loadSegments$4.apply(Log.scala:160)
at 
scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772)
at 
scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
at 
scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
at kafka.log.Log.loadSegments(Log.scala:160)
at kafka.log.Log.(Log.scala:90)
at 
kafka.log.LogManager$$anonfun$loadLogs$2$$anonfun$3$$anonfun$apply$10$$anonfun$apply$1.apply$mcV$sp(LogManager.scala:150)
at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:60)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
{noformat}

The only way to recover the brokers was to delete the log files that contained 
non-monotonically incrementing offsets.

We've spent some time digging through the logs and I feel I may have worked out 
the sequence of events leading to this issue, (though this is based on some 
assumptions I've made about the way Kafka is working, which may be wrong).

First off, we have unclean leadership elections disabled. (We did later enable 
them to help get around some other issues we were having, but this was several 
hours after this issue manifested), and we're producing to the topic with gzip 
compression and acks=1.

We looked through the data logs that were causing the brokers to not start. 
What we found was that the initial part of the log has monotonically 
increasing offsets, where each compressed batch normally contained one or two 
records. Then there is a batch that contains many records, whose first record 
has an offset below the previous batch and whose last record has an offset 
above the previous batch. Following on from this there continues a period of 
large batches, with monotonically increasing offsets, and then the log returns 
to batches with one or two records.

Our working assumption here is that the period before the offset dip, with the 
small batches, is pre-outage normal operation. The period of larger batches is 
from just after the outage, where producers have a backlog to process when the 
partition becomes available, and then things return to normal batch sizes 
again once the backlog clears.
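The offset pattern described can be stated mechanically. A minimal sketch, 
using a hypothetical helper rather than Kafka code, that scans per-batch 
(firstOffset, lastOffset) pairs in log order and finds the first batch whose 
first offset dips to or below the previous batch's last offset:

```scala
// Hypothetical helper illustrating the "offset dip" seen in the data logs.
object OffsetDipSketch {
  // Batches as (firstOffset, lastOffset) pairs, in log order. Returns the
  // index of the first batch whose first offset is not strictly greater
  // than the previous batch's last offset, if any.
  def firstDip(batches: IndexedSeq[(Long, Long)]): Option[Int] =
    (1 until batches.length).find(i => batches(i)._1 <= batches(i - 1)._2)
}
```

On the pattern above, the small pre-outage batches scan clean, and the first 
large post-outage batch is the one this check flags.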

We did also look through Kafka's application logs to try and piece together 
the series of events leading up to this. Here's what we know happened, with 
regard to one partition that had issues, from the logs:

Prior to outage:
* Replicas for the partition are brokers 2011, 2012,  2024, with 2024 being the 
preferred leader.
* Producers using acks=1, compression=gzip
* Brokers configured with unclean.elections=false, zk.session-timeout=36s

Post outage:
* 2011 comes up first, (also as the Controller), recovers unflushed log segment 
1239444214, completes load with offset 1239740602, and becomes leader of the 
partition.
* 2012 comes up next, recovers its log, including unflushed log segment 
1239444214, truncates to offset 1239742830, (that's 2,228 records ahead of the 
recovered offset of the current leader), and starts following.
* 2024 comes up quickly after 2012, recovers unflushed log segment 1239444214, 
truncates to offset 1239742250, (that's 1,648 records ahead of the recovered 
offset of the current leader), and starts following.
* The Control
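The "2,228 records ahead" and "1,648 records ahead" figures follow directly 
from the recovered offsets quoted above; the only inputs are the three offsets 
from the timeline:

```scala
// Offsets taken from the post-outage timeline above.
object DivergenceSketch {
  val leaderRecovered = 1239740602L // broker 2011, offset after recovery
  val follower2012    = 1239742830L // broker 2012, offset after truncation
  val follower2024    = 1239742250L // broker 2024, offset after truncation

  // How far each follower's log end sits ahead of the recovered leader.
  val ahead2012 = follower2012 - leaderRecovered // 2,228 records
  val ahead2024 = follower2024 - leaderRecovered // 1,648 records
}
```

With acks=1 those extra records were acknowledged to producers but never 
replicated to the broker that became leader, which is the divergence at the 
heart of this report.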

[jira] [Updated] (KAFKA-3919) Broker faills to start after ungraceful shutdown due to non-monotonically incrementing offsets in logs

2016-07-05 Thread Andy Coates (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Coates updated KAFKA-3919:
---
Description: 
Hi All,

I encountered an issue with Kafka following a power outage that saw a 
proportion of our cluster disappear. When the power came back on several 
brokers halted on start up with the error:

{noformat}
Fatal error during KafkaServerStartable startup. Prepare to shutdown”
kafka.common.InvalidOffsetException: Attempt to append an offset 
(1239742691) to position 35728 no larger than the last offset appended 
(1239742822) to 
/data3/kafka/mt_xp_its_music_main_itsevent-20/001239444214.index.
at 
kafka.log.OffsetIndex$$anonfun$append$1.apply$mcV$sp(OffsetIndex.scala:207)
at kafka.log.OffsetIndex$$anonfun$append$1.apply(OffsetIndex.scala:197)
at kafka.log.OffsetIndex$$anonfun$append$1.apply(OffsetIndex.scala:197)
at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:262)
at kafka.log.OffsetIndex.append(OffsetIndex.scala:197)
at kafka.log.LogSegment.recover(LogSegment.scala:188)
at kafka.log.Log$$anonfun$loadSegments$4.apply(Log.scala:188)
at kafka.log.Log$$anonfun$loadSegments$4.apply(Log.scala:160)
at 
scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772)
at 
scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
at 
scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
at kafka.log.Log.loadSegments(Log.scala:160)
at kafka.log.Log.(Log.scala:90)
at 
kafka.log.LogManager$$anonfun$loadLogs$2$$anonfun$3$$anonfun$apply$10$$anonfun$apply$1.apply$mcV$sp(LogManager.scala:150)
at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:60)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
{noformat}

The only way to recover the brokers was to delete the log files that contained 
non monotonically incrementing offsets.

We've spent some time digging through the logs and I feel I may have worked out 
the sequence of events leading to this issue, (though this is based on some 
assumptions I've made about the way Kafka is working, which may be wrong).

First off, we have unclean leadership elections disable. (We did later enable 
them to help get around some other issues we were having, but this was several 
hours after this issue manifested), and we're producing to the topic with gzip 
compression and acks=1

We looked through the data logs that were causing the brokers to not start. 
What we found the initial part of the log has monotonically increasing offset, 
where each compressed batch normally contained one or two records. Then the is 
a batch that contains many records, whose first records have an offset below 
the previous batch and whose last record has an offset above the previous 
batch. Following on from this there continues a period of large batches, with 
monotonically increasing offsets, and then the log returns to batches with one 
or two records.

Our working assumption here is that the period before the offset dip, with the 
small batches, is pre-outage normal operation. The period of larger batches is 
from just after the outage, where producers have a back log to processes when 
the partition becomes available, and then things return to normal batch sizes 
again once the back log clears.

We did also look through the Kafka's application logs to try and piece together 
the series of events leading up to this. Here’s what we know happened, with 
regards to one partition that has issues, from the logs:

Prior to outage:
Replicas for the partition are brokers 2011, 2012,  2024, with 2024 being the 
preferred leader.
Producers using acks=1, compression=gzip
Brokers configured with unclean.elections=false, zk.session-timeout=36s

Post outage:
2011 comes up first, (also as the Controller), recovers unflushed log segment 
1239444214, completes load with offset 1239740602, and becomes leader of the 
partition.
2012 comes up next, recovers its log, recovers unflushed log segment 
1239444214, truncates to offset 1239742830 (that's 2,228 records ahead of the 
recovered offset of the current leader), and starts following.
2024 comes up quickly after 2012, recovers unflushed log segment 1239444214, 
truncates to offset 1239742250 (that's 1,648 records ahead of the recovered 
offset of the current leader), and starts following.
The Controller adds 202

[jira] [Commented] (KAFKA-3919) Broker faills to start after ungraceful shutdown due to non-monotonically incrementing offsets in logs

2016-07-05 Thread Andy Coates (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15362604#comment-15362604
 ] 

Andy Coates commented on KAFKA-3919:


Hi [~junrao], thanks for taking the time to look at this.

First off, we have unclean leadership elections disabled. (We did later enable 
them to help get around some other issues we were having, but that was several 
hours after this issue manifested.)

We did look through the data logs that were causing the brokers to not start. 
What we found before the incident was a monotonically increasing offset, where 
each compressed batch normally contained one or two records. Then there is a 
batch that contains many records, whose first record has an offset below the 
previous batch and whose last record has an offset above the previous batch. 
Following on from this there is a period of large batches, with monotonically 
increasing offsets, and then the log returns to batches with one or two 
records.

Our working assumption here is that the period before the offset dip is 
pre-outage normal operation. The period of larger batches is from just after 
the outage, when producers have a backlog to process once the partition 
becomes available; things return to normal batch sizes again once the backlog 
clears.

We also looked through Kafka's application logs to try to piece together 
the series of events leading up to this:

Here's what I know happened, with regard to one partition that had issues, 
from the logs:

Prior to outage:
Replicas for the partition are brokers 2011, 2012,  2024, with 2024 being the 
preferred leader.
Producers using acks=1, compression=gzip
Brokers configured with unclean.elections=false, zk.session-timeout=36s

Post outage:
2011 comes up first, (also as the Controller), recovers unflushed log segment 
1239444214, completes load with offset 1239740602, and becomes leader of the 
partition.
2012 comes up next, recovers its log, recovers unflushed log segment 
1239444214, truncates to offset 1239742830 (that's 2,228 records ahead of the 
recovered offset of the current leader), and starts following.
2024 comes up quickly after 2012, recovers unflushed log segment 1239444214, 
truncates to offset 1239742250 (that's 1,648 records ahead of the recovered 
offset of the current leader), and starts following.
The Controller adds 2024 to the replica set just before 2024 halts due to 
another partition having an offset greater than the leader.
The Controller adds 2012 to the replica set just before 2012 halts due to 
another partition having an offset greater than the leader.
When 2012 is next restarted, it fails to fully start as it's complaining of 
invalid offsets in the log.
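As a quick sanity check, the "records ahead" figures quoted in the timeline above are just the difference between each follower's truncation offset and the leader's recovered offset:

```python
# Offsets taken from the timeline above.
leader_offset = 1239740602          # broker 2011, becomes leader after recovery

delta_2012 = 1239742830 - leader_offset
delta_2024 = 1239742250 - leader_offset

print(delta_2012)  # → 2228 records ahead (broker 2012)
print(delta_2024)  # → 1648 records ahead (broker 2024)
```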

You'll notice that the offsets the brokers truncate to are different for each 
of the three brokers.

Given that I can write to the partition with only one broker available, and 
that I can then take this broker down and bring up a different one from the 
replica set and write to that one, how does Kafka currently look to reconcile 
these different histories when the first node is brought back online?  I know 
that if the first node has a greater offset it will halt when it tries to 
follow the second, but what happens if the first node has a lower offset?

Maybe the above scenario is correctly handled and I'm off down a tangent! (I'd 
appreciate any info to improve my understanding of Kafka and help me figure out 
what is happening here.) I'm just trying to reconcile the data I'm seeing in 
the logs with your response to my post.

I’m going to extract the pertinent entries from our app logs, obfuscate and add 
them in here.

(I'll also add some of what I've written here to the description above for the 
benefit of anyone new to the ticket.)

Thanks,

Andy

> Broker faills to start after ungraceful shutdown due to non-monotonically 
> incrementing offsets in logs
> --
>
> Key: KAFKA-3919
> URL: https://issues.apache.org/jira/browse/KAFKA-3919
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.9.0.1
>Reporter: Andy Coates
>
> Hi All,
> I encountered an issue with Kafka following a power outage that saw a 
> proportion of our cluster disappear. When the power came back on several 
> brokers halted on start up with the error:
> {noformat}
>   Fatal error during KafkaServerStartable startup. Prepare to shutdown”
>   kafka.common.InvalidOffsetException: Attempt to append an offset 
> (1239742691) to position 35728 no larger than the last offset appended 
> (1239742822) to 
> /data3/kafka/mt_xp_its_music_main_itsevent-20/001239444214.index.
>   at 
> kafka

[jira] [Updated] (KAFKA-3919) Broker faills to start after ungraceful shutdown due to non-monotonically incrementing offsets in logs

2016-07-01 Thread Andy Coates (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Coates updated KAFKA-3919:
---
Description: 
Hi All,

I encountered an issue with Kafka following a power outage that saw a 
proportion of our cluster disappear. When the power came back on several 
brokers halted on start up with the error:

{noformat}
Fatal error during KafkaServerStartable startup. Prepare to shutdown”
kafka.common.InvalidOffsetException: Attempt to append an offset 
(1239742691) to position 35728 no larger than the last offset appended 
(1239742822) to 
/data3/kafka/mt_xp_its_music_main_itsevent-20/001239444214.index.
at 
kafka.log.OffsetIndex$$anonfun$append$1.apply$mcV$sp(OffsetIndex.scala:207)
at kafka.log.OffsetIndex$$anonfun$append$1.apply(OffsetIndex.scala:197)
at kafka.log.OffsetIndex$$anonfun$append$1.apply(OffsetIndex.scala:197)
at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:262)
at kafka.log.OffsetIndex.append(OffsetIndex.scala:197)
at kafka.log.LogSegment.recover(LogSegment.scala:188)
at kafka.log.Log$$anonfun$loadSegments$4.apply(Log.scala:188)
at kafka.log.Log$$anonfun$loadSegments$4.apply(Log.scala:160)
at 
scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772)
at 
scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
at 
scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
at kafka.log.Log.loadSegments(Log.scala:160)
at kafka.log.Log.<init>(Log.scala:90)
at 
kafka.log.LogManager$$anonfun$loadLogs$2$$anonfun$3$$anonfun$apply$10$$anonfun$apply$1.apply$mcV$sp(LogManager.scala:150)
at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:60)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
{noformat}

The only way to recover the brokers was to delete the log files that contained 
non-monotonically incrementing offsets.

I've spent some time digging through the logs and I feel I may have worked out 
the sequence of events leading to this issue (though this is based on some 
assumptions I've made about the way Kafka works, which may be wrong).

Given:
* A topic that is produced to using acks = 1
* A topic that is produced to using gzip compression
* A topic that has min.isr set to less than the number of replicas, (i.e. 
min.isr=2, #replicas=3)
* Following ISRs are lagging behind the leader by some small number of 
messages, (which is normal with acks=1)
* brokers are configured with fairly large zk session timeout e.g. 30s.
* brokers are configured so that unclean leader elections are disabled.

Then:
When something like a power outage takes out all three replicas, it's possible 
to get into a state such that the indexes won't rebuild on a restart and a 
broker fails to start. This can happen when:
* Enough brokers, but not the pre-outage leader, come on-line for the partition 
to be writeable
* Producers produce enough records to the partition that the head offset is now 
greater than the pre-outage leader head offset.
* The pre-outage leader comes back online.

At this point the logs on the pre-outage leader have diverged from the other 
replicas.  It has some messages that are not in the other replicas, and the 
other replicas have some records not in the pre-outage leader's log - at the 
same offsets.

I'm assuming that because the current leader has a higher offset than the 
pre-outage leader, the pre-outage leader just starts following the leader and 
requesting the records it thinks it's missing.

I'm also assuming that, because the producers were using gzip, each record is 
actually a compressed message set. When the pre-outage leader requests 
records from the leader, the offset it requests could just happen to be in 
the middle of a compressed batch, but the leader returns the full batch. When 
the pre-outage leader appends this batch to its own log it thinks all is OK, 
but what has happened is that the offsets in the log are no longer 
monotonically incrementing. Instead they actually dip by the number of records 
in the compressed batch that were before the requested offset. If and when 
this broker restarts, this dip may be at the 4K boundary the indexer checks. If 
it is, the broker won't start.
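That 4K-boundary mechanism can be sketched as follows. This is an illustrative model, not Kafka's actual recovery code: assume that during segment recovery an index entry is appended roughly every index-interval bytes of log, and that each indexed offset must be strictly greater than the last one indexed, so an offset dip only trips the check if it happens to land where an entry is appended.

```python
INDEX_INTERVAL_BYTES = 4096  # illustrative; Kafka's default index interval


class InvalidOffsetException(Exception):
    pass


def rebuild_index(batches):
    """batches: list of (base_offset, size_bytes) pairs, in log order.
    Appends a (sparse) index entry once enough bytes have accumulated;
    raises if the offset being indexed is not larger than the last one."""
    last_indexed = -1
    bytes_since_entry = 0
    for base, size in batches:
        if bytes_since_entry >= INDEX_INTERVAL_BYTES:
            if base <= last_indexed:
                raise InvalidOffsetException(
                    f"Attempt to append an offset ({base}) no larger than "
                    f"the last offset appended ({last_indexed})")
            last_indexed = base
            bytes_since_entry = 0
        bytes_since_entry += size


# A dip that falls between index entries goes unnoticed:
rebuild_index([(100, 4096), (104, 2000), (98, 500), (150, 100)])

# ...but the same dip landing on an index boundary halts recovery,
# mirroring the InvalidOffsetException seen at broker startup:
try:
    rebuild_index([(100, 4096), (104, 4096), (98, 500)])
except InvalidOffsetException as e:
    print(e)
```

This matches the observation below that only some brokers hit the problem: whether a given dip breaks startup depends on where it falls relative to the sparse index entries.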

Several of our brokers were unlucky enough to hit that 4K boundary, causing a 
protracted outage.  We’ve written a little utility that shows se

[jira] [Issue Comment Deleted] (KAFKA-3919) Broker faills to start after ungraceful shutdown due to non-monotonically incrementing offsets in logs

2016-07-01 Thread Andy Coates (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Coates updated KAFKA-3919:
---
Comment: was deleted

(was: Hi [~junrao], thanks for taking the time to look at this.

Note: I've incorporated some of what I say below into the problem description 
above, so that it doesn't get lost in the comments.

First off, we have unclean leadership elections disabled. (We did later enable 
them to help get around some other issues we were having, but that was several 
hours after this issue manifested.)

We did look through the data logs that were causing the brokers to not start. 
What we found before the incident was a monotonically increasing offset, where 
each compressed batch normally contained one or two records. Then there is a 
batch that contains many records, whose first record has an offset below the 
previous batch and whose last record has an offset above the previous batch. 
Following on from this there is a period of large batches, with monotonically 
increasing offsets, and then the log returns to batches with one or two 
records.

Our working assumption here is that the period before the offset dip is 
pre-outage normal operation. The period of larger batches is from just after 
the outage, when producers have a backlog to process once the partition 
becomes available; things return to normal batch sizes again once the backlog 
clears.

We also looked through Kafka's application logs to try to piece together 
the series of events leading up to this:

Here's what I know happened, with regard to one partition that had issues, 
from the logs:

Prior to outage:
Replicas for the partition are brokers 2011, 2012,  2024, with 2024 being the 
preferred leader.
Producers using acks=1, compression=gzip
Brokers configured with unclean.elections=false, zk.session-timeout=36s

Post outage:
2011 comes up first, (also as the Controller), recovers unflushed log segment 
1239444214, completes load with offset 1239740602, and becomes leader of the 
partition.
2012 comes up next, recovers its log, recovers unflushed log segment 
1239444214, truncates to offset 1239742830 (that's 2,228 records ahead of the 
recovered offset of the current leader), and starts following.
2024 comes up quickly after 2012, recovers unflushed log segment 1239444214, 
truncates to offset 1239742250 (that's 1,648 records ahead of the recovered 
offset of the current leader), and starts following.
The Controller adds 2024 to the replica set just before 2024 halts due to 
another partition having an offset greater than the leader.
The Controller adds 2012 to the replica set just before 2012 halts due to 
another partition having an offset greater than the leader.
When 2012 is next restarted, it fails to fully start as it's complaining of 
invalid offsets in the log.

Our working hypothesis here is that the partition becomes writeable again, 
possibly as brokers 2012 & 2024 get added to the ISR set before halting, and 
maybe don't remove themselves when they halt? - hence they remain in the ISR 
set for 36 seconds. Meanwhile our producers are happily sending large 
compressed batches, as they have a backlog, to broker 2011, which is accepting 
them (as there are enough replicas in the ISR set) and appending them to its 
log - moving its offset beyond brokers 2012 and 2024.

Log entries:

(Interleaved log entries from the three brokers - the broker id is in the [id] 
brackets)

Just as the power was going out I see this in the broker that was the 
controller:

2016-04-11 12:01:42 - [2026] - "[Partition state machine on Controller 2026]: 
Invoking state change to OnlinePartition for partitions 
[mt_xp_its_music_main_itsevent,20]”

2016-04-11 12:01:56 - [2026] - "[Replica state machine on controller 2026]: 
Invoking state change to OfflineReplica for replicas
[Topic=mt_xp_its_music_main_itsevent,Partition=20,Replica=2024] 

2016-04-11 12:01:56 - [2026] - "[Controller 2026]: Cannot remove replica 2024 
from ISR of partition [mt_xp_its_music_main_itsevent,20] since it is not in the 
ISR. Leader = 2011 ; ISR = List(2011, 2012)”

2016-04-11 12:01:56 - [2026] - "[Channel manager on controller 2026]: Not 
sending request 
{controller_id=2026,controller_epoch=111,delete_partitions=0,partitions=[{topic=mt_xp_its_music_main_itsevent,partition=20}]}
 to broker 2024, since it is offline.”

2016-04-11 12:04:46 - [2026] - [Replica state machine on controller 2026]: 
Invoking state change to OnlineReplica for replicas
[Topic=mt_xp_its_music_main_itsevent,Partition=20,Replica=2024]

2016-04-11 12:04:58 - [2026] - "[Controller 2026]: Starting preferred replica 
leader election for partitions [mt_xp_its_music_main_itsevent,20]”

2016-04-11 12:04:58 - [2026] - "[Partition state machine on Controller 2026]: 
Invoking state change to OnlinePartition for partitions 
[mt_xp_its_musi

[jira] [Commented] (KAFKA-3919) Broker faills to start after ungraceful shutdown due to non-monotonically incrementing offsets in logs

2016-07-01 Thread Andy Coates (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15358704#comment-15358704
 ] 

Andy Coates commented on KAFKA-3919:


Hi [~junrao], thanks for taking the time to look at this.

Note: I've incorporated some of what I say below into the problem description 
above, so that it doesn't get lost in the comments.

First off, we have unclean leadership elections disabled. (We did later enable 
them to help get around some other issues we were having, but that was several 
hours after this issue manifested.)

We did look through the data logs that were causing the brokers to not start. 
What we found before the incident was a monotonically increasing offset, where 
each compressed batch normally contained one or two records. Then there is a 
batch that contains many records, whose first record has an offset below the 
previous batch and whose last record has an offset above the previous batch. 
Following on from this there is a period of large batches, with monotonically 
increasing offsets, and then the log returns to batches with one or two 
records.

Our working assumption here is that the period before the offset dip is 
pre-outage normal operation. The period of larger batches is from just after 
the outage, when producers have a backlog to process once the partition 
becomes available; things return to normal batch sizes again once the backlog 
clears.

We also looked through Kafka's application logs to try to piece together 
the series of events leading up to this:

Here's what I know happened, with regard to one partition that had issues, 
from the logs:

Prior to outage:
Replicas for the partition are brokers 2011, 2012,  2024, with 2024 being the 
preferred leader.
Producers using acks=1, compression=gzip
Brokers configured with unclean.elections=false, zk.session-timeout=36s

Post outage:
2011 comes up first, (also as the Controller), recovers unflushed log segment 
1239444214, completes load with offset 1239740602, and becomes leader of the 
partition.
2012 comes up next, recovers its log, recovers unflushed log segment 
1239444214, truncates to offset 1239742830 (that's 2,228 records ahead of the 
recovered offset of the current leader), and starts following.
2024 comes up quickly after 2012, recovers unflushed log segment 1239444214, 
truncates to offset 1239742250 (that's 1,648 records ahead of the recovered 
offset of the current leader), and starts following.
The Controller adds 2024 to the replica set just before 2024 halts due to 
another partition having an offset greater than the leader.
The Controller adds 2012 to the replica set just before 2012 halts due to 
another partition having an offset greater than the leader.
When 2012 is next restarted, it fails to fully start as it's complaining of 
invalid offsets in the log.

Our working hypothesis here is that the partition becomes writeable again, 
possibly as brokers 2012 & 2024 get added to the ISR set before halting, and 
maybe don't remove themselves when they halt? - hence they remain in the ISR 
set for 36 seconds. Meanwhile our producers are happily sending large 
compressed batches, as they have a backlog, to broker 2011, which is accepting 
them (as there are enough replicas in the ISR set) and appending them to its 
log - moving its offset beyond brokers 2012 and 2024.

Log entries:

(Interleaved log entries from the three brokers - the broker id is in the [id] 
brackets)

Just as the power was going out I see this in the broker that was the 
controller:

2016-04-11 12:01:42 - [2026] - "[Partition state machine on Controller 2026]: 
Invoking state change to OnlinePartition for partitions 
[mt_xp_its_music_main_itsevent,20]”

2016-04-11 12:01:56 - [2026] - "[Replica state machine on controller 2026]: 
Invoking state change to OfflineReplica for replicas
[Topic=mt_xp_its_music_main_itsevent,Partition=20,Replica=2024] 

2016-04-11 12:01:56 - [2026] - "[Controller 2026]: Cannot remove replica 2024 
from ISR of partition [mt_xp_its_music_main_itsevent,20] since it is not in the 
ISR. Leader = 2011 ; ISR = List(2011, 2012)”

2016-04-11 12:01:56 - [2026] - "[Channel manager on controller 2026]: Not 
sending request 
{controller_id=2026,controller_epoch=111,delete_partitions=0,partitions=[{topic=mt_xp_its_music_main_itsevent,partition=20}]}
 to broker 2024, since it is offline.”

2016-04-11 12:04:46 - [2026] - [Replica state machine on controller 2026]: 
Invoking state change to OnlineReplica for replicas
[Topic=mt_xp_its_music_main_itsevent,Partition=20,Replica=2024]

2016-04-11 12:04:58 - [2026] - "[Controller 2026]: Starting preferred replica 
leader election for partitions [mt_xp_its_music_main_itsevent,20]”

2016-04-11 12:04:58 - [2026] - "[Partition state machine on Controller 2026]: 
Invoking state change to OnlinePartit

[jira] [Updated] (KAFKA-3919) Broker faills to start after ungraceful shutdown due to non-monotonically incrementing offsets in logs

2016-07-01 Thread Andy Coates (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Coates updated KAFKA-3919:
---
Description: 
Hi All,

I encountered an issue with Kafka following a power outage that saw a 
proportion of our cluster disappear. When the power came back on several 
brokers halted on start up with the error:

{noformat}
Fatal error during KafkaServerStartable startup. Prepare to shutdown”
kafka.common.InvalidOffsetException: Attempt to append an offset 
(1239742691) to position 35728 no larger than the last offset appended 
(1239742822) to 
/data3/kafka/mt_xp_its_music_main_itsevent-20/001239444214.index.
at 
kafka.log.OffsetIndex$$anonfun$append$1.apply$mcV$sp(OffsetIndex.scala:207)
at kafka.log.OffsetIndex$$anonfun$append$1.apply(OffsetIndex.scala:197)
at kafka.log.OffsetIndex$$anonfun$append$1.apply(OffsetIndex.scala:197)
at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:262)
at kafka.log.OffsetIndex.append(OffsetIndex.scala:197)
at kafka.log.LogSegment.recover(LogSegment.scala:188)
at kafka.log.Log$$anonfun$loadSegments$4.apply(Log.scala:188)
at kafka.log.Log$$anonfun$loadSegments$4.apply(Log.scala:160)
at 
scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772)
at 
scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
at 
scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
at kafka.log.Log.loadSegments(Log.scala:160)
at kafka.log.Log.<init>(Log.scala:90)
at 
kafka.log.LogManager$$anonfun$loadLogs$2$$anonfun$3$$anonfun$apply$10$$anonfun$apply$1.apply$mcV$sp(LogManager.scala:150)
at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:60)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
{noformat}

The only way to recover the brokers was to delete the log files that contained 
non-monotonically incrementing offsets.

I've spent some time digging through the logs and I feel I may have worked out 
the sequence of events leading to this issue (though this is based on some 
assumptions I've made about the way Kafka works, which may be wrong):

Given:
* A topic that is produced to using acks = 1
* A topic that is produced to using gzip compression
* A topic that has min.isr set to less than the number of replicas, (i.e. 
min.isr=2, #replicas=3)
* Following ISRs are lagging behind the leader by some small number of 
messages, (which is normal with acks=1)
* brokers are configured with fairly large zk session timeout e.g. 30s.
* brokers are configured so that unclean leader elections are disabled.

Then:
When something like a power outage takes out all three replicas, it's possible 
to get into a state such that the indexes won't rebuild on a restart and a 
broker fails to start. This can happen when:
* Enough brokers, but not the pre-outage leader, come on-line for the partition 
to be writeable
* Producers produce enough records to the partition that the head offset is now 
greater than the pre-outage leader head offset.
* The pre-outage leader comes back online.

At this point the logs on the pre-outage leader have diverged from the other 
replicas.  It has some messages that are not in the other replicas, and the 
other replicas have some records not in the pre-outage leader's log - at the 
same offsets.

I'm assuming that because the current leader has a higher offset than the 
pre-outage leader, the pre-outage leader just starts following the leader and 
requesting the records it thinks it's missing.

I'm also assuming that, because the producers were using gzip, each record is 
actually a compressed message set. When the pre-outage leader requests 
records from the leader, the offset it requests could just happen to be in 
the middle of a compressed batch, but the leader returns the full batch. When 
the pre-outage leader appends this batch to its own log it thinks all is OK, 
but what has happened is that the offsets in the log are no longer 
monotonically incrementing. Instead they actually dip by the number of records 
in the compressed batch that were before the requested offset. If and when 
this broker restarts, this dip may be at the 4K boundary the indexer checks. If 
it is, the broker won't start.

Several of our brokers were unlucky enough to hit that 4K boundary, causing a 
protracted outage.  We’ve written a little utility that shows se
