Re: [DISCUSS] KIP-66 Kafka Connect Transformers for messages

2016-12-10 Thread Ewen Cheslack-Postava
If anyone has time to review here, it'd be great to get feedback. I'd
imagine that the proposal itself won't be too controversial -- keeps
transformations simple (by only allowing map/filter), doesn't affect the
rest of the framework much, and fits in with general config structure we've
used elsewhere (although ConfigDef could use some updates to make this
easier...).

I think the main open questions for me are:

a) Is TransformableRecord worth it just to avoid reimplementing small bits of
code? (It allows a single implementation of the interface to apply trivially
to both SourceRecords and SinkRecords; see the sketch below.) I think I prefer
this, but it does come with some commitment to another interface on top of
ConnectRecord. We could alternatively modify ConnectRecord, which would require
fewer changes.
b) How do folks feel about built-in transformations and the set that are
mentioned here? This brings us way back to the discussion of built-in
connectors. Transformations, especially when intended to be lightweight and
touch nothing besides the data already in the record, seem different from
connectors -- there might be quite a few, but hopefully limited. Since we
(hopefully) already factor out most serialization-specific stuff via
Converters, I think we can keep this pretty limited. That said, I have no
doubt some folks will (in my opinion) abuse this feature to do data
enrichment by querying external systems, so building a bunch of
transformations in could potentially open the floodgates, or at least make
decisions about what is included vs what should be 3rd party muddy.
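
To make (a) a bit more concrete, here's a rough sketch of the kind of
interface I have in mind; the names and signatures are purely illustrative
and not what the proposal specifies:

import java.util.Map;

// Illustrative sketch only. The idea: a transformation is written once against
// a narrow record abstraction, and the framework applies it to both
// SourceRecords and SinkRecords.
interface TransformableRecord {
    Object key();
    Object value();
    // Returns a copy of the same concrete type (SourceRecord or SinkRecord)
    // with the key/value replaced; other fields are carried over unchanged.
    TransformableRecord withKeyValue(Object newKey, Object newValue);
}

interface Transformation {
    void configure(Map<String, ?> configs);
    // map/filter semantics: return the (possibly modified) record, or null to drop it.
    TransformableRecord apply(TransformableRecord record);
}

// A trivial "built-in" transform in this style: drop records with a null value.
class DropNullValues implements Transformation {
    @Override public void configure(Map<String, ?> configs) { }
    @Override public TransformableRecord apply(TransformableRecord record) {
        return record.value() == null ? null : record;
    }
}

The framework would be responsible for producing the narrower
SourceRecord/SinkRecord copy, so a single transformation implementation works
on both sides.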

-Ewen


On Wed, Dec 7, 2016 at 11:46 AM, Shikhar Bhushan 
wrote:

> Hi all,
>
> I have another iteration at a proposal for this feature here:
> https://cwiki.apache.org/confluence/display/KAFKA/
> Connect+Transforms+-+Proposed+Design
>
> I'd welcome your feedback and comments.
>
> Thanks,
>
> Shikhar
>
> On Tue, Aug 2, 2016 at 7:21 PM Ewen Cheslack-Postava 
> wrote:
>
> On Thu, Jul 28, 2016 at 11:58 PM, Shikhar Bhushan 
> wrote:
>
> > >
> > >
> > > Hmm, operating on ConnectRecords probably doesn't work since you need
> to
> > > emit the right type of record, which might mean instantiating a new
> one.
> > I
> > > think that means we either need 2 methods, one for SourceRecord, one
> for
> > > SinkRecord, or we'd need to limit what parts of the message you can
> > modify
> > > (e.g. you can change the key/value via something like
> > > transformKey(ConnectRecord) and transformValue(ConnectRecord), but
> other
> > > fields would remain the same and the fmwk would handle allocating new
> > > Source/SinkRecords if needed)
> > >
> >
> > Good point, perhaps we could add an abstract method on ConnectRecord that
> > takes all the shared fields as parameters and the implementations return
> a
> > copy of the narrower SourceRecord/SinkRecord type as appropriate.
> > Transformers would only operate on ConnectRecord rather than caring about
> > SourceRecord or SinkRecord (in theory they could instanceof/cast, but the
> > API should discourage it)
> >
> >
> > > Is there a use case for hanging on to the original? I can't think of a
> > > transformation where you'd need to do that (or couldn't just order
> things
> > > differently so it isn't a problem).
> >
> >
> > Yeah maybe this isn't really necessary. No strong preference here.
> >
> > That said, I do worry a bit that farming too much stuff out to
> transformers
> > > can result in "programming via config", i.e. a lot of the simplicity
> you
> > > get from Connect disappears in long config files. Standardization would
> > be
> > > nice and might just avoid this (and doesn't cost that much implementing
> > it
> > > in each connector), and I'd personally prefer something a bit less
> > flexible
> > > but consistent and easy to configure.
> >
> >
> > Not sure what you're suggesting :-) Standardized config properties for
> > a small set of transformations, leaving it up to connectors to integrate?
> >
>
> I just mean that you get to the point where you're practically writing a
> Kafka Streams application, you're just doing it through either an
> incredibly convoluted set of transformers and configs, or a single
> transformer with an incredibly convoluted set of configs. You basically get
> to the point where your config is a mini DSL and you're not really saving
> that much.
>
> The real question is how much we want to venture into the "T" part of ETL.
> I tend to favor minimizing how much we take on since the rest of Connect
> isn't designed for it, it's designed around the E & L parts.
>
> -Ewen
>
>
> > Personally I'm skeptical of that level of flexibility in transformers --
> > > its getting awfully complex and certainly takes us pretty far from
> > "config
> > > only" realtime data integration. It's not clear to me what the use
> cases
> > > are that aren't covered by a small set of common transformations that
> can
> > > be chained together (e.g. rename/remove fields, mask values, and maybe
> a
> > > couple more).
> >

Build failed in Jenkins: kafka-trunk-jdk8 #1095

2016-12-10 Thread Apache Jenkins Server
See 

Changes:

[wangguoz] KAFKA-4476: Kafka Streams gets stuck if metadata is missing

[wangguoz] KAFKA-4510: StreamThread must finish rebalance in state 
PENDING_SHUTDOWN

--
[...truncated 14543 lines...]

org.apache.kafka.streams.KafkaStreamsTest > 
shouldNotGetAllTasksWithStoreWhenNotRunning STARTED

org.apache.kafka.streams.KafkaStreamsTest > 
shouldNotGetAllTasksWithStoreWhenNotRunning PASSED

org.apache.kafka.streams.KafkaStreamsTest > testCannotStartOnceClosed STARTED

org.apache.kafka.streams.KafkaStreamsTest > testCannotStartOnceClosed PASSED

org.apache.kafka.streams.KafkaStreamsTest > testCleanup STARTED

org.apache.kafka.streams.KafkaStreamsTest > testCleanup PASSED

org.apache.kafka.streams.KafkaStreamsTest > testStartAndClose STARTED

org.apache.kafka.streams.KafkaStreamsTest > testStartAndClose PASSED

org.apache.kafka.streams.KafkaStreamsTest > testCloseIsIdempotent STARTED

org.apache.kafka.streams.KafkaStreamsTest > testCloseIsIdempotent PASSED

org.apache.kafka.streams.KafkaStreamsTest > testCannotCleanupWhileRunning 
STARTED

org.apache.kafka.streams.KafkaStreamsTest > testCannotCleanupWhileRunning PASSED

org.apache.kafka.streams.KafkaStreamsTest > testCannotStartTwice STARTED

org.apache.kafka.streams.KafkaStreamsTest > testCannotStartTwice PASSED

org.apache.kafka.streams.integration.KStreamKTableJoinIntegrationTest > 
shouldCountClicksPerRegion[0] STARTED

org.apache.kafka.streams.integration.KStreamKTableJoinIntegrationTest > 
shouldCountClicksPerRegion[0] PASSED

org.apache.kafka.streams.integration.KStreamKTableJoinIntegrationTest > 
shouldCountClicksPerRegion[1] STARTED

org.apache.kafka.streams.integration.KStreamKTableJoinIntegrationTest > 
shouldCountClicksPerRegion[1] PASSED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
shouldBeAbleToQueryState[0] STARTED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
shouldBeAbleToQueryState[0] PASSED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
shouldNotMakeStoreAvailableUntilAllStoresAvailable[0] STARTED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
shouldNotMakeStoreAvailableUntilAllStoresAvailable[0] PASSED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
queryOnRebalance[0] STARTED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
queryOnRebalance[0] FAILED
java.lang.AssertionError: Condition not met within timeout 6. Did not 
receive 1 number of records
at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:279)
at 
org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinValuesRecordsReceived(IntegrationTestUtils.java:251)
at 
org.apache.kafka.streams.integration.QueryableStateIntegrationTest.waitUntilAtLeastNumRecordProcessed(QueryableStateIntegrationTest.java:658)
at 
org.apache.kafka.streams.integration.QueryableStateIntegrationTest.queryOnRebalance(QueryableStateIntegrationTest.java:341)

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
concurrentAccesses[0] STARTED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
concurrentAccesses[0] PASSED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
shouldBeAbleToQueryState[1] STARTED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
shouldBeAbleToQueryState[1] PASSED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
shouldNotMakeStoreAvailableUntilAllStoresAvailable[1] STARTED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
shouldNotMakeStoreAvailableUntilAllStoresAvailable[1] PASSED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
queryOnRebalance[1] STARTED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
queryOnRebalance[1] PASSED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
concurrentAccesses[1] STARTED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
concurrentAccesses[1] PASSED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testShouldReadFromRegexAndNamedTopics STARTED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testShouldReadFromRegexAndNamedTopics PASSED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testRegexMatchesTopicsAWhenCreated STARTED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testRegexMatchesTopicsAWhenCreated PASSED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testMultipleConsumersCanReadFromPartitionedTopic STARTED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testMultipleConsumersCanReadFromPartitionedTopic PASSED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testRegexMatchesTopicsAWhe

Re: [VOTE] KIP-88: OffsetFetch Protocol Update

2016-12-10 Thread Ewen Cheslack-Postava
+1 (binding)

The AdminClient piece will probably need follow-up, but given that it's
defined in the context of the incomplete KIP-4 design, I'm kind of assuming
it's still up for further discussion once we hit the implementation of that
class (where I think a holistic view of the API is needed).

Vahid, I think this gives you enough to close the vote given 3 binding
votes + >72hrs passed :)

-Ewen

On Wed, Dec 7, 2016 at 11:37 AM, Rajini Sivaram <
rajinisiva...@googlemail.com> wrote:

> +1 (non-binding)
>
> On Wed, Dec 7, 2016 at 7:22 PM, Apurva Mehta  wrote:
>
> > +1 (non-binding)
> >
> > On Wed, Dec 7, 2016 at 10:05 AM, Jason Gustafson 
> > wrote:
> >
> > > +1 Thanks for the KIP!
> > >
> > > On Wed, Dec 7, 2016 at 2:53 AM, Ismael Juma  wrote:
> > >
> > > > Thanks for the KIP, Vahid. +1 (binding)
> > > >
> > > > On Mon, Dec 5, 2016 at 6:16 PM, Vahid S Hashemian <
> > > > vahidhashem...@us.ibm.com
> > > > > wrote:
> > > >
> > > > > Happy Monday,
> > > > >
> > > > > I'd like to start voting on KIP-88 (
> > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > > > 88%3A+OffsetFetch+Protocol+Update
> > > > > ).
> > > > > The discussion thread can be found here:
> > > > > https://www.mail-archive.com/dev@kafka.apache.org/msg59608.html
> > > > >
> > > > > Thank you for your feedback.
> > > > >
> > > > > Regards,
> > > > > --Vahid
> > > > >
> > > > >
> > > >
> > >
> >
>
>
>
> --
> Regards,
>
> Rajini
>


[jira] [Commented] (KAFKA-4521) MirrorMaker should flush all messages before releasing partition ownership during rebalance

2016-12-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15739243#comment-15739243
 ] 

ASF GitHub Bot commented on KAFKA-4521:
---

GitHub user lindong28 opened a pull request:

https://github.com/apache/kafka/pull/2241

KAFKA-4521; MirrorMaker should flush all messages before releasing 
partition ownership during rebalance



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lindong28/kafka KAFKA-4521

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2241.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2241


commit cdbd79189f1cdb53f93d014c9b9758a5941b8168
Author: Dong Lin 
Date:   2016-12-11T07:03:46Z

KAFKA-4521; MirrorMaker should flush all messages before releasing 
partition ownership during rebalance




> MirrorMaker should flush all messages before releasing partition ownership 
> during rebalance
> ---
>
> Key: KAFKA-4521
> URL: https://issues.apache.org/jira/browse/KAFKA-4521
> Project: Kafka
>  Issue Type: Bug
>Reporter: Dong Lin
>Assignee: Dong Lin
>
> In order to ensure that messages from a given partition in the source cluster 
> are mirrored to the same partition in the destination cluster in the *same* 
> order, MirrorMaker needs to produce and flush all messages that its consumer 
> has received from source cluster before giving up partition in the cluster. 
> However, as of current implementation of Apache Kafka, this is not guaranteed 
> and will cause out-of-order message delivery in the following scenario.
> - mirror maker process 1 fetches messages 1, 2, 3, from source cluster in one 
> FetchRequest
> - mirror maker process 2 starts up and triggers rebalance.
> - `InternalRebalanceListenerForOldConsumer` is executed by zookeeper listener 
> thread, which does producer.flush() and commit offset of this consumer. 
> However, at this moment messages 1, 2, 3 haven't even been produced.
> - consumer of mirror maker process 1 releases ownership of this partition
> - consumer of mirror maker process 2 gets ownership of this partition.
> - mirror maker process 2 fetches messages 4, 5, 6 from source cluster.
> - messages 4, 5, 6 can be produced before messages 1, 2, 3. 
> To fix this problem, the rebalance listener callback function should signal 
> MirrorMakerThread to get all messages from consumer, produce these messages 
> to destination cluster, flush producer, and commit offset. Rebalance listener 
> callback function should wait for MirrorMakerThread to finish these steps 
> before it allows ownership of this partition to be released.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request #2241: KAFKA-4521; MirrorMaker should flush all messages ...

2016-12-10 Thread lindong28
GitHub user lindong28 opened a pull request:

https://github.com/apache/kafka/pull/2241

KAFKA-4521; MirrorMaker should flush all messages before releasing 
partition ownership during rebalance



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lindong28/kafka KAFKA-4521

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2241.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2241


commit cdbd79189f1cdb53f93d014c9b9758a5941b8168
Author: Dong Lin 
Date:   2016-12-11T07:03:46Z

KAFKA-4521; MirrorMaker should flush all messages before releasing 
partition ownership during rebalance




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (KAFKA-4521) MirrorMaker should flush all messages before releasing partition ownership during rebalance

2016-12-10 Thread Dong Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dong Lin updated KAFKA-4521:

Description: 
In order to ensure that messages from a given partition in the source cluster 
are mirrored to the same partition in the destination cluster in the *same* 
order, MirrorMaker needs to produce and flush all messages that its consumer 
has received from source cluster before giving up partition in the cluster. 

However, as of current implementation of Apache Kafka, this is not guaranteed 
and will cause out-of-order message delivery in the following scenario.

- mirror maker process 1 fetches messages 1, 2, 3, from source cluster in one 
FetchRequest
- mirror maker process 2 starts up and triggers rebalance.
- `InternalRebalanceListenerForOldConsumer` is executed by zookeeper listener 
thread, which does producer.flush() and commit offset of this consumer. 
However, at this moment messages 1, 2, 3 haven't even been produced.
- consumer of mirror maker process 1 releases ownership of this partition
- consumer of mirror maker process 2 gets ownership of this partition.
- mirror maker process 2 fetches messages 4, 5, 6 from source cluster.
- messages 4, 5, 6 can be produced before messages 1, 2, 3. 

To fix this problem, the rebalance listener callback function should signal 
MirrorMakerThread to get all messages from consumer, produce these messages to 
destination cluster, flush producer, and commit offset. Rebalance listener 
callback function should wait for MirrorMakerThread to finish these steps 
before it allows ownership of this partition to be released.



  was:
In order to ensure that messages from a given partition in the source cluster 
are mirrored to the same partition in the destination cluster in the *same* 
order, MirrorMaker needs to produce and flush all messages that its consumer 
has received from source cluster before giving up partition in the cluster. 

However, as of current implementation of Apache Kafka, this is not guaranteed 
and will cause out-of-order message delivery in the following scenario.

- mirror maker process 1 fetches messages 1, 2, 3, from source cluster in one 
FetchRequest
- mirror maker process 2 starts up and triggers rebalance.
- `InternalRebalanceListenerForOldConsumer` is executed by zookeeper listener 
thread, which does producer.flush() and commit offset of this consumer. 
However, at this moment messages 1, 2, 3 haven't even been produced.
- consumer of mirror maker process 1 releases ownership of this partition
- consumer of mirror maker process 2 gets ownership of this partition.
- mirror maker process 2 fetches messages 4, 5, 6 from source cluster.
- messages 4, 5, 6 can be produced before messages 1, 2, 3. 

To avoid this problem, the rebalance listener callback function should signal 
MirrorMakerThread to get all messages from consumer, produce these messages to 
destination cluster, flush producer, and commit offset. rebalance listener 
callback function waits for MirrorMakerThread to finish these steps before it 
allows ownership of this partition to be released.




> MirrorMaker should flush all messages before releasing partition ownership 
> during rebalance
> ---
>
> Key: KAFKA-4521
> URL: https://issues.apache.org/jira/browse/KAFKA-4521
> Project: Kafka
>  Issue Type: Bug
>Reporter: Dong Lin
>Assignee: Dong Lin
>
> In order to ensure that messages from a given partition in the source cluster 
> are mirrored to the same partition in the destination cluster in the *same* 
> order, MirrorMaker needs to produce and flush all messages that its consumer 
> has received from source cluster before giving up partition in the cluster. 
> However, as of current implementation of Apache Kafka, this is not guaranteed 
> and will cause out-of-order message delivery in the following scenario.
> - mirror maker process 1 fetches messages 1, 2, 3, from source cluster in one 
> FetchRequest
> - mirror maker process 2 starts up and triggers rebalance.
> - `InternalRebalanceListenerForOldConsumer` is executed by zookeeper listener 
> thread, which does producer.flush() and commit offset of this consumer. 
> However, at this moment messages 1, 2, 3 haven't even been produced.
> - consumer of mirror maker process 1 releases ownership of this partition
> - consumer of mirror maker process 2 gets ownership of this partition.
> - mirror maker process 2 fetches messages 4, 5, 6 from source cluster.
> - messages 4, 5, 6 can be produced before messages 1, 2, 3. 
> To fix this problem, the rebalance listener callback function should signal 
> MirrorMakerThread to get all messages from consumer, produce these messages 
> to destination cluster, flush producer, and commit offset. Rebalance listener 
> callback funct

[jira] [Created] (KAFKA-4521) MirrorMaker should flush all messages before releasing partition ownership during rebalance

2016-12-10 Thread Dong Lin (JIRA)
Dong Lin created KAFKA-4521:
---

 Summary: MirrorMaker should flush all messages before releasing 
partition ownership during rebalance
 Key: KAFKA-4521
 URL: https://issues.apache.org/jira/browse/KAFKA-4521
 Project: Kafka
  Issue Type: Bug
Reporter: Dong Lin
Assignee: Dong Lin


In order to ensure that messages from a given partition in the source cluster 
are mirrored to the same partition in the destination cluster in the *same* 
order, MirrorMaker needs to produce and flush all messages that its consumer 
has received from the source cluster before giving up ownership of the partition. 

However, as of the current implementation of Apache Kafka, this is not guaranteed, 
and out-of-order message delivery can occur in the following scenario.

- mirror maker process 1 fetches messages 1, 2, 3 from the source cluster in one 
FetchRequest
- mirror maker process 2 starts up and triggers a rebalance.
- `InternalRebalanceListenerForOldConsumer` is executed by the zookeeper listener 
thread, which calls producer.flush() and commits the offsets of this consumer. 
However, at this moment messages 1, 2, 3 haven't even been produced.
- the consumer of mirror maker process 1 releases ownership of this partition
- the consumer of mirror maker process 2 gets ownership of this partition.
- mirror maker process 2 fetches messages 4, 5, 6 from the source cluster.
- messages 4, 5, 6 can be produced before messages 1, 2, 3. 

To avoid this problem, the rebalance listener callback function should signal 
MirrorMakerThread to get all messages from the consumer, produce these messages to 
the destination cluster, flush the producer, and commit offsets. The rebalance listener 
callback function waits for MirrorMakerThread to finish these steps before it 
allows ownership of this partition to be released.
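
A schematic sketch of the proposed sequencing (illustrative only; the class and 
method names below are made up and this is not the actual MirrorMaker code):
{noformat}
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Sketch: the rebalance listener asks the mirror thread to drain, produce, flush
// and commit, and only returns once that has finished, so partition ownership
// cannot be released while unproduced messages are still buffered.
class FlushBeforeReleaseListener {
    private final MirrorThread mirrorThread;

    FlushBeforeReleaseListener(MirrorThread mirrorThread) {
        this.mirrorThread = mirrorThread;
    }

    // Invoked by the consumer before it gives up ownership of its partitions.
    void beforePartitionsRevoked() throws InterruptedException {
        CountDownLatch done = new CountDownLatch(1);
        mirrorThread.requestDrainAndCommit(done);   // signal the mirror thread
        if (!done.await(60, TimeUnit.SECONDS)) {
            throw new IllegalStateException("Timed out waiting for mirror thread to flush");
        }
    }

    interface MirrorThread {
        // Expected to: drain pending records from the consumer, produce them to the
        // destination cluster, call producer.flush(), commit offsets, then count down.
        void requestDrainAndCommit(CountDownLatch done);
    }
}
{noformat}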





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [DISCUSS] KIP-95: Incremental Batch Processing for Kafka Streams

2016-12-10 Thread Ewen Cheslack-Postava
A few thoughts:

> introduce a Streams metadata topic (single partitioned and log compacted;
one for each application-ID)

I would consider whether the single partition is a strict requirement. A
single partition increases the likelihood of outages. This is something I'd
like to see changed in Connect at some point -- the config topic is single
partition because it's already complicated enough given compaction (see
this complicated implementation explanation
https://github.com/apache/kafka/blob/trunk/connect/runtime/src/main/java/org/apache/kafka/connect/storage/KafkaConfigBackingStore.java#L58-L149).
But it also means that you can't make progress on anything if that topic
partition has some outage. When starting up from scratch, maybe you're just
stuck since a single topic partition *could* block progress. But I could
imagine the completed marker in your example simply containing the offsets
for completion in all the other topic partitions that were written to,
making it trivial to determine if you need to read a specific topic
partition in the metadata topic. In that type of design, more partitions
become a tradeoff: a single Streams app might be less likely to be blocked by
one missing partition, but if you spread more partitions across the same
number of brokers, the chance of any given partition being offline increases.

In any case, I think it is worth thinking carefully about that tradeoff
since it's hard to change later.
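
To make that alternative concrete, here's a rough sketch (purely illustrative,
not something the KIP defines) of what such a completion marker could carry:

import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: a "completed" marker that records, per metadata-topic
// partition, the offset up to which stop-offset records were written. A reader that
// sees this marker knows exactly which partitions it still needs to read and how far.
final class CompletionMarker {
    private final Map<Integer, Long> endOffsetsByPartition;

    CompletionMarker(Map<Integer, Long> endOffsetsByPartition) {
        this.endOffsetsByPartition = new HashMap<>(endOffsetsByPartition);
    }

    Map<Integer, Long> endOffsetsByPartition() {
        return Collections.unmodifiableMap(endOffsetsByPartition);
    }

    // True if the given metadata partition still needs to be read past the
    // offset this reader has already consumed.
    boolean needsRead(int partition, long consumedUpToOffset) {
        Long end = endOffsetsByPartition.get(partition);
        return end != null && consumedUpToOffset < end;
    }
}

With something like this, a partition that was never written to simply doesn't
appear in the map, so a reader never has to block on it.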

> Because there will be just one leader, this operation is single-threaded
and thus safe.

Another way to make this safe would be to be able to take advantage of the
generation ID from the consumer. This isn't exposed publicly atm, but I
think you can get at it in the streams impl and also helps protect against
zombie writers (the setup you're describing here and in Connect is
vulnerable to an old leader going through a GC pause or something and then
continuing to overwrite data). Again, this is something we didn't do for
Connect and I now regret it since I think it would also simplify some of
the recovery process when you've had partial writes and then a crash. (Now
that I think about it, adding some sort of version number and making the
log cleaner aware of it would be quite nice...)

> After the last instance did terminate itself, the metadata topic will
have one more record written.
> The writing order into the metadata topic ensures, that we shut down
cleanly.

Not sure I understand this. How does each instance know that it is the
last? Couldn't the last 2 or N complete within the same consumer session,
before a rebalance occurs, and both think they are not the last one? Even
if you can guarantee this, couldn't only some of the messages be written
and the deletion wouldn't be completed? After compaction you'd end up with
partial data that I don't think you can tell is partial. I think the
startup algorithm isn't actually complete as you need to handle 1) no
end-offsets 2) some end-offsets 3) all end-offsets with a completed marker.
(Checking if the completed marker was the last message read isn't
sufficient since we expect the following cleanup messages to be compacted).

Jay's suggestion of using the offset commit metadata was also what I was
thinking before reading the multiple sub-topology stuff, but even then a
large input set could probably run into issues with offset commit size,
since you'd want to write all of that from the leader in a single
message set.

-Ewen

On Fri, Dec 9, 2016 at 5:19 PM, Matthias J. Sax 
wrote:

> About using offset-topic metadata field:
>
> Even with KIP-98, I think this would not work (or maybe in a weird way).
> If we have the following topology:
>
> topic1 -> subTopologyA -> topic2 -> subTopologyB
>
> If the producer of subTopologyA commits, it will commit its input offsets
> from topic1. Thus, the stop-offsets of topic2 for subTopologyB would be
> committed with the metadata of the topic1 commits. But subTopologyB is not
> really related to topic1. I guess it would not be impossible to make
> this work; however, the design seems somewhat weird.
>
> But maybe, I do miss something.
>
> Furthermore, I am not sure if Streams will always use transactions. Will
> there be an option for the user to disable transactions and stick with
> at-least-once processing?
>
> Also, we would need to delay this KIP until KIP-98 and a Streams EOS KIP
> is in place... I would rather include this in next release 0.10.2.
>
>
> -Matthias
>
> On 12/9/16 10:47 AM, Guozhang Wang wrote:
> > I will read through the KIP doc once again to provide more detailed
> > feedback, but let me throw in my two cents just on the above email.
> >
> > There are a few motivations to have a "consistent" stop-point across tasks
> > belonging to different sub-topologies. One of them is for interactive
> > queries: say you have two state stores belonging to two sub-topologies; if
> > they stopped at different points, then when users query them they will
> > see inconsistent answer

[jira] [Updated] (KAFKA-4510) StreamThread must finish rebalance in state PENDING_SHUTDOWN

2016-12-10 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-4510:
-
   Resolution: Fixed
Fix Version/s: 0.10.2.0
   Status: Resolved  (was: Patch Available)

Issue resolved by pull request 2227
[https://github.com/apache/kafka/pull/2227]

> StreamThread must finish rebalance in state PENDING_SHUTDOWN
> 
>
> Key: KAFKA-4510
> URL: https://issues.apache.org/jira/browse/KAFKA-4510
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: Matthias J. Sax
> Fix For: 0.10.2.0
>
>
> If a Streams application runs with multiple threads within one JVM and the 
> application gets stopped, this can trigger a rebalance when the first 
> thread finishes processing, because not all threads shut down at the same time.
> Because we try to reuse tasks, on rebalance tasks are not closed immediately, 
> to allow a potential reuse (if a task gets assigned to its original 
> thread). However, if a thread is in state {{PENDING_SHUTDOWN}} it does not finish 
> the rebalance operation completely and thus does not release the suspended 
> task, and the application exits with not all locks released.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request #2227: KAFKA-4510: StreamThread must finish rebalance in ...

2016-12-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2227


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-4510) StreamThread must finish rebalance in state PENDING_SHUTDOWN

2016-12-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15739151#comment-15739151
 ] 

ASF GitHub Bot commented on KAFKA-4510:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2227


> StreamThread must finish rebalance in state PENDING_SHUTDOWN
> 
>
> Key: KAFKA-4510
> URL: https://issues.apache.org/jira/browse/KAFKA-4510
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: Matthias J. Sax
> Fix For: 0.10.2.0
>
>
> If a Streams application runs with multiple threads within one JVM and the 
> application gets stopped, this can trigger a rebalance when the first 
> thread finishes processing, because not all threads shut down at the same time.
> Because we try to reuse tasks, on rebalance tasks are not closed immediately, 
> to allow a potential reuse (if a task gets assigned to its original 
> thread). However, if a thread is in state {{PENDING_SHUTDOWN}} it does not finish 
> the rebalance operation completely and thus does not release the suspended 
> task, and the application exits with not all locks released.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-4476) Kafka Streams gets stuck if metadata is missing

2016-12-10 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-4476:
-
   Resolution: Fixed
Fix Version/s: 0.10.2.0
   Status: Resolved  (was: Patch Available)

Issue resolved by pull request 2209
[https://github.com/apache/kafka/pull/2209]

> Kafka Streams gets stuck if metadata is missing
> ---
>
> Key: KAFKA-4476
> URL: https://issues.apache.org/jira/browse/KAFKA-4476
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: Matthias J. Sax
>Priority: Critical
> Fix For: 0.10.2.0
>
>
> When a Kafka Streams application gets started for the first time, it can 
> happen that some topic metadata is missing when 
> {{StreamPartitionAssigner#assign()}} is called on the group leader instance. 
> This can result in an infinite loop within 
> {{StreamPartitionAssigner#assign()}}. This issue was detected by 
> {{ResetIntegrationTest}} that does have a transient timeout failure (c.f. 
> https://issues.apache.org/jira/browse/KAFKA-4058 -- this issue was re-opened 
> multiple times as the problem was expected to be in the test -- however, that 
> is not the case).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request #2209: KAFKA-4476: Kafka Streams gets stuck if metadata i...

2016-12-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2209


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-4476) Kafka Streams gets stuck if metadata is missing

2016-12-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15739141#comment-15739141
 ] 

ASF GitHub Bot commented on KAFKA-4476:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2209


> Kafka Streams gets stuck if metadata is missing
> ---
>
> Key: KAFKA-4476
> URL: https://issues.apache.org/jira/browse/KAFKA-4476
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: Matthias J. Sax
>Priority: Critical
> Fix For: 0.10.2.0
>
>
> When a Kafka Streams application gets started for the first time, it can 
> happen that some topic metadata is missing when 
> {{StreamPartitionAssigner#assign()}} is called on the group leader instance. 
> This can result in an infinite loop within 
> {{StreamPartitionAssigner#assign()}}. This issue was detected by 
> {{ResetIntegrationTest}} that does have a transient timeout failure (c.f. 
> https://issues.apache.org/jira/browse/KAFKA-4058 -- this issue was re-opened 
> multiple times as the problem was expected to be in the test -- however, that 
> is not the case).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [VOTE]: KIP-97: The client compatibility KIP

2016-12-10 Thread Ewen Cheslack-Postava
+1 (binding)

-Ewen

On Sat, Dec 10, 2016 at 2:41 PM, Jay Kreps  wrote:

> +1 (binding)
>
> -Jay
>
> On Wed, Dec 7, 2016 at 9:17 AM, Colin McCabe  wrote:
>
> > Hi all,
> >
> > I heard that the VOTE and DISCUSS threads for the KIP-97 discussion
> > appeared to be in the same email thread for some people using gmail.  So
> > I'm reposting in hopes of getting a separate email thread this time for
> > those viewers. :)
> >
> > I'd like to start voting on KIP-97
> > (https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > 97%3A+Improved+Kafka+Client+RPC+Compatibility+Policy
> > ).
> >
> > The discussion thread can be found here:
> > https://www.mail-archive.com/dev@kafka.apache.org/msg60955.html
> >
> > Thanks for your feedback.
> >
> > best,
> > Colin McCabe
> >
>


Re: [VOTE] KIP-87 - Add Compaction Tombstone Flag

2016-12-10 Thread Ewen Cheslack-Postava
It seemed like this was addressed in the migration section, wasn't it? The
V2 vs V3 requests, and the fact that the broker will downgrade the message
format (losing zero copy) if you issue a V2 request to a broker using the V3
format, handle compatibility, don't they? The existing consumers won't see
the extra metadata in the value, but they will get a null instead and treat
it as a tombstone. Certainly there is a performance impact, but it seemed
compatible.

I'm worried about this though:


   - *NOTE*: With the new version of the producer client using ProduceRequest
   V3 (magic byte = 2), a non-tombstone (tombstone bit not set) with a null value
   should not be allowed, as an older consumer using FetchRequest
   V2 will think of this as a tombstone when it's actually not.


Unless I'm misunderstanding, this ends up breaking binary compatibility for
the Java API. It sounds like this suggests that passing a null value to the
existing ProducerRecord constructors would generate an exception since you
didn't explicitly enable the tombstone (via whatever new constructor is
provided). But that means you can't swap in a newer client jar without
recompiling and get the same behavior if your app deletes keys using the
current approach because your app will start throwing exceptions. Maybe
this is a tradeoff we're ok with, but we've tried pretty hard to avoid
breaking compatibility like this so far -- it makes picking up bug fixes in
the clients harder for users of frameworks that have to pin to earlier
default versions for compatibility.
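
To make the concern concrete, this is the pattern existing apps use today to
delete a key on a compacted topic; the tombstone-aware constructor shown in the
comment is hypothetical and only illustrates the proposed new path:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class TombstoneCompatExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Existing behaviour: a null value on a compacted topic means "delete this key".
            // If this constructor started throwing for null values, swapping in a newer
            // client jar would break such apps at runtime without a recompile.
            producer.send(new ProducerRecord<>("compacted-topic", "some-key", null));

            // Hypothetical new constructor under the KIP (signature illustrative only):
            // producer.send(new ProducerRecord<>("compacted-topic", null /* partition */,
            //         "some-key", null /* value */, true /* tombstone */));
        }
    }
}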

But then later the KIP says:


   - The old constructors for ProducerRecord without this parameter will be
   kept, but updated so that if a null value is set, the tombstone will
   default to true, keeping the existing behaviour.


So maybe I am misinterpreting something.

And just a nit re: motivation section, item 6 would be clearer for a union
schema with null (or optional schemas in other formats), e.g. [null,
string], in which case losing the schema truly is losing information
(whereas null is already the only valid value for a pure null schema).

-Ewen


On Sat, Dec 10, 2016 at 9:24 PM, Michael Pearce 
wrote:

> Hi Jay,
>
> Good point, this detail was missing from the KIP write-up. I've added it now.
>
> Essentially, simply upgrading the clients should not require any client
> app code changes.
>
> Cheers
> Mike
> 
> From: Jay Kreps 
> Sent: Saturday, December 10, 2016 10:51 PM
> To: dev@kafka.apache.org
> Subject: Re: [VOTE] KIP-87 - Add Compaction Tombstone Flag
>
> Michael,
>
> The compatibility section goes through the migration path, but isn't the
> bigger compatibility issue with existing apps? There are many (probably
> thousands) of apps in production that use this feature and send null to
> mean delete. It seems like this would break compatibility with them, and
> they would have to be rewritten to right?
>
> -Jay
>
> On Thu, Dec 8, 2016 at 12:12 AM, Michael Pearce 
> wrote:
>
> > Hi Jun,
> >
> > 4) On v3 we honour the tombstone. As such we expect it to be set correctly
> > as per the KIP.
> >
> > 4.1) Why would we want to produce an error when it's v3? This is the exact
> > purpose: to support non-null tombstones.
> > 4.2) Again here I'm unclear on the question; on the v3 produce request we
> > expect the tombstone flag to be set correctly.
> >
> > 4.4) compaction only occurs on compacted topics, the bit makes no
> > difference and not looked at on un-compacted (time/size based eviction).
> >
> >
> > On 06/12/2016, 20:08, "Jun Rao"  wrote:
> >
> > Hi, Michael,
> >
> > 4. Then, I think I misunderstood this point. Could you document the
> > following points in the wiki?
> > 4.1 If producer V3 sets tombstone, but provides a non-null value,
> does
> > the
> > send() get an error or does the producer automatically set the value
> to
> > null?
> > 4.2 If producer V3 doesn't set tombstone, but provides a null value,
> > does
> > the send() get an error or does the producer automatically sets the
> > tombstone?
> > 4.3 Does the broker only expect messages that (a) have no tombstone
> and
> > non-null value; (b) have tombstone and null value and reject the
> > messages
> > with an error code otherwise?
> > 4.4 Do 4.1, 4.2,  4.3 depend on whether the topic is compacted on
> not?
> >
> > Thanks,
> >
> > Jun
> >
> > On Tue, Dec 6, 2016 at 10:35 AM, Michael Pearce <
> michael.pea...@ig.com
> > >
> > wrote:
> >
> > > Not at all.  This only acts on compacted topics just as what occurs
> > today
> > >
> > > Sent using OWA for iPhone
> > > 
> > > From: Jun Rao 
> > > Sent: Tuesday, December 6, 2016 6:25:28 PM
> > > To: dev@kafka.apache.org
> > > Subject: Re: [VOTE] KIP-87 - Add Compaction Tombstone Flag
> > >
> > > Hi, Michael,
> > >
> > > 4. Hmm, does that

Re: [VOTE] KIP-87 - Add Compaction Tombstone Flag

2016-12-10 Thread Michael Pearce
Hi Jay,

Good point, this detail was missing from the KIP write-up. I've added it now.

Essentially, simply upgrading the clients should not require any client app 
code changes.

Cheers
Mike

From: Jay Kreps 
Sent: Saturday, December 10, 2016 10:51 PM
To: dev@kafka.apache.org
Subject: Re: [VOTE] KIP-87 - Add Compaction Tombstone Flag

Michael,

The compatibility section goes through the migration path, but isn't the
bigger compatibility issue with existing apps? There are many (probably
thousands) of apps in production that use this feature and send null to
mean delete. It seems like this would break compatibility with them, and
they would have to be rewritten to right?

-Jay

On Thu, Dec 8, 2016 at 12:12 AM, Michael Pearce 
wrote:

> Hi Jun,
>
> 4) On v3 we honour the tombstone. As such we expect it to be set correctly
> as per the KIP.
>
> 4.1) Why would we want to produce an error when it's v3? This is the exact
> purpose: to support non-null tombstones.
> 4.2) Again here I'm unclear on the question; on the v3 produce request we
> expect the tombstone flag to be set correctly.
>
> 4.4) compaction only occurs on compacted topics, the bit makes no
> difference and not looked at on un-compacted (time/size based eviction).
>
>
> On 06/12/2016, 20:08, "Jun Rao"  wrote:
>
> Hi, Michael,
>
> 4. Then, I think I misunderstood this point. Could you document the
> following points in the wiki?
> 4.1 If producer V3 sets tombstone, but provides a non-null value, does
> the
> send() get an error or does the producer automatically set the value to
> null?
> 4.2 If producer V3 doesn't set tombstone, but provides a null value,
> does
> the send() get an error or does the producer automatically sets the
> tombstone?
> 4.3 Does the broker only expect messages that (a) have no tombstone and
> non-null value; (b) have tombstone and null value and reject the
> messages
> with an error code otherwise?
> 4.4 Do 4.1, 4.2,  4.3 depend on whether the topic is compacted on not?
>
> Thanks,
>
> Jun
>
> On Tue, Dec 6, 2016 at 10:35 AM, Michael Pearce  >
> wrote:
>
> > Not at all.  This only acts on compacted topics just as what occurs
> today
> >
> > Sent using OWA for iPhone
> > 
> > From: Jun Rao 
> > Sent: Tuesday, December 6, 2016 6:25:28 PM
> > To: dev@kafka.apache.org
> > Subject: Re: [VOTE] KIP-87 - Add Compaction Tombstone Flag
> >
> > Hi, Michael,
> >
> > 4. Hmm, does that mean the new client library can never send a null
> message
> > even to a regular topic? This seems like a change of the existing
> behavior.
> >
> > Thanks,
> >
> > Jun
> >
> > On Tue, Dec 6, 2016 at 9:51 AM, Michael Pearce <
> michael.pea...@ig.com>
> > wrote:
> >
> > > Hi Jun,
> > >
> > > Re 4) That's because we expect the tombstone value to be set correctly if
> > > the message magic byte is 2; as such, if an older client sends in an old
> > > message, the message is upcast and the bit is set correctly, and so we no
> > > longer need to check the value. Mayuresh, can you confirm my thinking and
> > > understanding of what we've discussed?
> > >
> > > The second point, I understand what you're getting at now, my apologies.
> > > Yes, this makes sense to save on touching the message, if we're the only
> > > KIP going in, in this release.
> > >
> > > Cheers
> > > Mike
> > >
> > > Sent using OWA for iPhone
> > > 
> > > From: Jun Rao 
> > > Sent: Tuesday, December 6, 2016 5:22:13 PM
> > > To: dev@kafka.apache.org
> > > Subject: Re: [VOTE] KIP-87 - Add Compaction Tombstone Flag
> > >
> > > Hi, Michael,
> > >
> > > 4. Is this updated in the wiki? The text "If the magic byte on
> message is
> > > 2, the broker should use the tombstone bit for log compaction."
> doesn't
> > > seem to have changed.
> > >
> > > 2. My point is that if we change the message format just for this
> KIP, we
> > > should consider whether it's worth optimizing the down conversion
> path
> > > (i.e., decide whether a conversion is needed by just looking at the
> > > tombstone bit in the wrapper message) since tombstone will be used
> > rarely.
> > > However, if the message format change here is combined with other
> KIPs,
> > > then this optimization likely won't be needed. The latter probably
> makes
> > > the code simpler. Jiangjie, Mayuresh, what do you think?
> > >
> > > Other than those, +1 from me,
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Tue, Dec 6, 2016 at 8:54 AM, Michael Pearce <
> michael.pea...@ig.com>
> > > wrote:
> > >
> > > > Hi Jun
> > > >
> > > > do we have your vote on this now?

Re: [DISCUSS] KIP-97: Improved Kafka Client RPC Compatibility Policy

2016-12-10 Thread Ewen Cheslack-Postava
On Sat, Dec 10, 2016 at 5:23 PM, Sven Ludwig  wrote:

> Hi,
>
> as a user I find this concept interesting, but not important enough. I
> just want to share my personal thoughts, perhaps going somewhat out on a
> limb, that came to my mind while reading:
>
> - I think it is of utmost importance for the Kafka project to continue the
> good work on easing and really securing the upgrading of the broker version
> via rolling upgrade. It is really cool for users when they can rely on
> that, allowing for continuous tech refresh, even on brokers with really
> large and/or really many partitions. I would always prefer this to stay in
> front.
>
> - Secondly, I, like most users, want a clear, easy-to-understand, yet
> powerful API contract together with producer/consumer configuration,
> including also aspects such as load-balancing and availability (consumer
> groups and alternative solutions via external partitioning and
> schedulers/supervisors respectively etc.).
>
> - Third, I find it important for the Kafka community to continue support
> for Scala-based APIs on the client side with the same devotion as seen in
> the support for the Java-based Kafka Streams API.
>
> - Fourth, I need production readiness features.
>
> Only after all these, I would look at a possibility to use a new client
> together with an old broker. Even though the concept seems to be sound on
> first read, I am somewhat afraid of the additional complexity, resulting
> complications and thus additional work for people. Think of database
> drivers. It can potentially become really complicated to support feature
> handshaking. Also, runtime exceptions that could have been avoided by a
> clear and simple policy would not be appealing. I really wonder if there
> are actually enough users who would need the ability to use a newer client
> with an older broker, to justify the increase in complexity.
>

This is something we see come up very regularly on the mailing list. It's
almost always a problem when you have one team that manages the broker
infrastructure and other teams that write apps against it -- the former
tend to be conservative with upgrades, the latter want new features even if
they are client-side only. There are a ton of folks looking for a) bug
fixes in producer/consumer that are newer than their broker (but unrelated
to broker-side fixes), b) support for connectors that are written against
newer versions of the Connect API than their broker (which rarely have any
coupled changes), and c) support using streams apps against older versions
(which also rarely have coupled changes).

This KIP covers the core clients, which actually solves that vast majority
of problems for Connect & Streams as well.

In terms of effort, the delta isn't exactly the same because librdkafka was
already set up for some compatibility, but KIP-35 support there was a
relatively small patch given what it was adding:
https://github.com/edenhill/librdkafka/commit/763b5220905246987c08fc86ea65c1df7d1cc9a9
There have been minimal follow ups required after that patch. Kafka also
has the benefit of substantial cross-version system testing already being
in place, so getting decent realistic validation of compatibility will
actually be pretty low cost (as described in the KIP) and will provide
ongoing validation.
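
For anyone not familiar with the mechanism being reused here: with KIP-35 the
broker advertises, per API key, the version range it supports, and the client
then picks the highest version both sides understand. A schematic sketch of
that selection logic (not actual client code):

import java.util.Map;
import java.util.Optional;

// Schematic sketch of KIP-35-style version selection: use the highest version both
// the client and the broker support, or report that the API is unusable if the
// ranges don't overlap.
final class ApiVersionNegotiation {
    static final class Range {
        final short min, max;
        Range(short min, short max) { this.min = min; this.max = max; }
    }

    static Optional<Short> pickVersion(short apiKey, short clientMin, short clientMax,
                                       Map<Short, Range> brokerRanges) {
        Range broker = brokerRanges.get(apiKey);
        if (broker == null) {
            return Optional.empty();            // broker doesn't support this API at all
        }
        short candidate = (short) Math.min(clientMax, broker.max);
        if (candidate < clientMin || candidate < broker.min) {
            return Optional.empty();            // no overlapping version
        }
        return Optional.of(candidate);
    }
}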


>
> I would agree however if someone said that it should better be possible to
> use two different client versions within one JVM, in order to talk to
> several brokers of different versions, which could become easier to achieve
> at the client side under Java 9 (Jigsaw).
>

With this KIP you don't need to support that because a single newer client
will be able to talk to a variety of broker versions anyway.

-Ewen


>
> Kind Regards
> Sven
>
>


[jira] [Commented] (KAFKA-4497) log cleaner breaks on timeindex

2016-12-10 Thread Jiangjie Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15739035#comment-15739035
 ] 

Jiangjie Qin commented on KAFKA-4497:
-

To be clearer, this part of the code has been changed in trunk (the bug 
remains).

I will provide two patches, one for trunk and the other for the 0.10.1 branch.

> log cleaner breaks on timeindex
> ---
>
> Key: KAFKA-4497
> URL: https://issues.apache.org/jira/browse/KAFKA-4497
> Project: Kafka
>  Issue Type: Bug
>  Components: log
>Affects Versions: 0.10.1.0
> Environment: Debian Jessie, Oracle Java 8u92, kafka_2.11-0.10.1.0
>Reporter: Robert Schumann
>Assignee: Jiangjie Qin
>Priority: Critical
>
> _created from ML entry by request of [~ijuma]_
> Hi all,
> we are facing an issue with latest kafka 0.10.1 and the log cleaner thread 
> with regards to the timeindex files. From the log of the log-cleaner we see 
> after startup that it tries to cleanup a topic called xdc_listing-status-v2 
> [1]. The topic is setup with log compaction [2] and the kafka cluster 
> configuration has log.cleaner enabled [3]. Looking at the log and the newly 
> created file [4], the cleaner seems to refer to tombstones prior to 
> epoch_time=0 - maybe because he finds messages, which don’t have a timestamp 
> at all (?). All producers and consumers are using 0.10.1 and the topics have 
> been created completely new, so I’m not sure, where this issue would come 
> from. The original timeindex file [5] seems to show only valid timestamps for 
> the mentioned offsets. I would also like to mention that the issue happened 
> in two independent datacenters at the same time, so I would rather expect an 
> application/producer issue instead of random disk failures. We didn’t have 
> the problem with 0.10.0 for around half a year, it appeared shortly after the 
> upgrade to 0.10.1.
> The puzzling message from the cleaner “cleaning prior to Fri Dec 02 16:35:50 
> CET 2016, discarding tombstones prior to Thu Jan 01 01:00:00 CET 1970” also 
> confuses me a bit. Does that mean, it does not find any log segments which 
> can be cleaned up or the last timestamp of the last log segment is somehow 
> broken/missing?
> I’m also a bit wondering, why the log cleaner thread stops completely after 
> an error with one topic. I would at least expect that it keeps on cleaning up 
> other topics, but apparently it doesn’t do that, e.g. it’s not even cleaning 
> the __consumer_offsets anymore.
> Does anybody have the same issues or can explain, what’s going on? Thanks for 
> any help or suggestions.
> Cheers
> Robert
> [1]
> {noformat}
> [2016-12-06 12:49:17,885] INFO Starting the log cleaner (kafka.log.LogCleaner)
> [2016-12-06 12:49:17,895] INFO [kafka-log-cleaner-thread-0], Starting  
> (kafka.log.LogCleaner)
> [2016-12-06 12:49:17,947] INFO Cleaner 0: Beginning cleaning of log 
> xdc_listing-status-v2-1. (kafka.log.LogCleaner)
> [2016-12-06 12:49:17,948] INFO Cleaner 0: Building offset map for 
> xdc_listing-status-v2-1... (kafka.log.LogCleaner)
> [2016-12-06 12:49:17,989] INFO Cleaner 0: Building offset map for log 
> xdc_listing-status-v2-1 for 1 segments in offset range [0, 194991). 
> (kafka.log.LogCleaner)
> [2016-12-06 12:49:24,572] INFO Cleaner 0: Offset map for log 
> xdc_listing-status-v2-1 complete. (kafka.log.LogCleaner)
> [2016-12-06 12:49:24,577] INFO Cleaner 0: Cleaning log 
> xdc_listing-status-v2-1 (cleaning prior to Fri Dec 02 16:35:50 CET 2016, 
> discarding tombstones prior to Thu Jan 01 01:00:00 CET 1970)... 
> (kafka.log.LogCleaner)
> [2016-12-06 12:49:24,580] INFO Cleaner 0: Cleaning segment 0 in log 
> xdc_listing-status-v2-1 (largest timestamp Fri Dec 02 16:35:50 CET 2016) into 
> 0, retaining deletes. (kafka.log.LogCleaner)
> [2016-12-06 12:49:24,968] ERROR [kafka-log-cleaner-thread-0], Error due to  
> (kafka.log.LogCleaner)
> kafka.common.InvalidOffsetException: Attempt to append an offset (-1) to slot 
> 9 no larger than the last offset appended (11832) to 
> /var/lib/kafka/xdc_listing-status-v2-1/.timeindex.cleaned.
> at 
> kafka.log.TimeIndex$$anonfun$maybeAppend$1.apply$mcV$sp(TimeIndex.scala:117)
> at 
> kafka.log.TimeIndex$$anonfun$maybeAppend$1.apply(TimeIndex.scala:107)
> at 
> kafka.log.TimeIndex$$anonfun$maybeAppend$1.apply(TimeIndex.scala:107)
> at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:234)
> at kafka.log.TimeIndex.maybeAppend(TimeIndex.scala:107)
> at kafka.log.LogSegment.append(LogSegment.scala:106)
> at kafka.log.Cleaner.cleanInto(LogCleaner.scala:518)
> at 
> kafka.log.Cleaner$$anonfun$cleanSegments$1.apply(LogCleaner.scala:404)
> at 
> kafka.log.Cleaner$$anonfun$cleanSegments$1.apply(LogCleaner.scala:400)
> at scala.c

[jira] [Commented] (KAFKA-4497) log cleaner breaks on timeindex

2016-12-10 Thread Jiangjie Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15739018#comment-15739018
 ] 

Jiangjie Qin commented on KAFKA-4497:
-

I looked at the log cleaner code in 0.10.1. There is a bug that could cause 
this problem. For each compressed message set, the time index should have the 
offset of the exact inner message with the largest timestamp, but in the log 
cleaner we are using the offset of the last message in the last message set that 
was read into the read buffer.

The current logic can generate a -1 offset for the compacted 
message sets even though the write buffer is not empty.

In the log cleaner, we may read a few compressed message sets into the read buffer 
at a time, and compact out some messages from each of the compressed message sets. 
Consider the following case:

The read buffer reads two message sets, MessageSet 1 and MessageSet 2, and 
after compaction, 
1. MessageSet 1 has retained messages [M1, M2]
2. MessageSet 2 has no retained messages.

What the log cleaner will do is append [M1, M2] to the log, but the 
{{OffsetOfLargestTimestamp}} will be -1 because the last message set is 
MessageSet 2 and it has no retained messages.

The fix is to update the {{OffsetOfLargestTimestamp}} together with the update 
of {{maxTimestamp}} (the same as we are doing in 
{{ByteBufferMessageSet.validateMessagesAndAssignOffsets()}}).

I'll provide a patch.
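
To illustrate the bookkeeping change (the real code is Scala in the log cleaner; 
this Java-style sketch is only meant to show the idea, not the actual patch):
{noformat}
import java.util.List;

// Track the offset of the message with the largest timestamp together with
// maxTimestamp as each *retained* message is appended, instead of taking it from
// the last message set read into the buffer (which may have retained nothing,
// yielding -1).
class RetainedBatchStats {
    long maxTimestamp = Long.MIN_VALUE;
    long offsetOfMaxTimestamp = -1L;

    static final class RetainedMessage {
        final long offset;
        final long timestamp;
        RetainedMessage(long offset, long timestamp) {
            this.offset = offset;
            this.timestamp = timestamp;
        }
    }

    void onRetained(RetainedMessage m) {
        if (m.timestamp > maxTimestamp) {
            maxTimestamp = m.timestamp;
            offsetOfMaxTimestamp = m.offset;   // kept in sync with maxTimestamp
        }
    }

    void onAllRetained(List<RetainedMessage> retained) {
        for (RetainedMessage m : retained) {
            onRetained(m);
        }
    }
}
{noformat}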



> log cleaner breaks on timeindex
> ---
>
> Key: KAFKA-4497
> URL: https://issues.apache.org/jira/browse/KAFKA-4497
> Project: Kafka
>  Issue Type: Bug
>  Components: log
>Affects Versions: 0.10.1.0
> Environment: Debian Jessie, Oracle Java 8u92, kafka_2.11-0.10.1.0
>Reporter: Robert Schumann
>Priority: Critical
>
> _created from ML entry by request of [~ijuma]_
> Hi all,
> we are facing an issue with latest kafka 0.10.1 and the log cleaner thread 
> with regards to the timeindex files. From the log of the log-cleaner we see 
> after startup that it tries to cleanup a topic called xdc_listing-status-v2 
> [1]. The topic is setup with log compaction [2] and the kafka cluster 
> configuration has log.cleaner enabled [3]. Looking at the log and the newly 
> created file [4], the cleaner seems to refer to tombstones prior to 
> epoch_time=0 - maybe because he finds messages, which don’t have a timestamp 
> at all (?). All producers and consumers are using 0.10.1 and the topics have 
> been created completely new, so I’m not sure, where this issue would come 
> from. The original timeindex file [5] seems to show only valid timestamps for 
> the mentioned offsets. I would also like to mention that the issue happened 
> in two independent datacenters at the same time, so I would rather expect an 
> application/producer issue instead of random disk failures. We didn’t have 
> the problem with 0.10.0 for around half a year, it appeared shortly after the 
> upgrade to 0.10.1.
> The puzzling message from the cleaner “cleaning prior to Fri Dec 02 16:35:50 
> CET 2016, discarding tombstones prior to Thu Jan 01 01:00:00 CET 1970” also 
> confuses me a bit. Does that mean it does not find any log segments which 
> can be cleaned up, or that the last timestamp of the last log segment is 
> somehow broken/missing?
> I’m also wondering why the log cleaner thread stops completely after 
> an error with one topic. I would at least expect that it keeps on cleaning up 
> other topics, but apparently it doesn’t do that; e.g. it’s not even cleaning 
> the __consumer_offsets topic anymore.
> Does anybody have the same issues, or can anyone explain what’s going on? Thanks for 
> any help or suggestions.
> Cheers
> Robert
> [1]
> {noformat}
> [2016-12-06 12:49:17,885] INFO Starting the log cleaner (kafka.log.LogCleaner)
> [2016-12-06 12:49:17,895] INFO [kafka-log-cleaner-thread-0], Starting  
> (kafka.log.LogCleaner)
> [2016-12-06 12:49:17,947] INFO Cleaner 0: Beginning cleaning of log 
> xdc_listing-status-v2-1. (kafka.log.LogCleaner)
> [2016-12-06 12:49:17,948] INFO Cleaner 0: Building offset map for 
> xdc_listing-status-v2-1... (kafka.log.LogCleaner)
> [2016-12-06 12:49:17,989] INFO Cleaner 0: Building offset map for log 
> xdc_listing-status-v2-1 for 1 segments in offset range [0, 194991). 
> (kafka.log.LogCleaner)
> [2016-12-06 12:49:24,572] INFO Cleaner 0: Offset map for log 
> xdc_listing-status-v2-1 complete. (kafka.log.LogCleaner)
> [2016-12-06 12:49:24,577] INFO Cleaner 0: Cleaning log 
> xdc_listing-status-v2-1 (cleaning prior to Fri Dec 02 16:35:50 CET 2016, 
> discarding tombstones prior to Thu Jan 01 01:00:00 CET 1970)... 
> (kafka.log.LogCleaner)
> [2016-12-06 12:49:24,580] INFO Cleaner 0: Cleaning segment 0 in log 
> xdc_listing-status-v2-1 (largest timestamp Fri Dec 02 16:35:50 CET 2016) into 
> 0, retaining deletes. (kafka

[jira] [Assigned] (KAFKA-4497) log cleaner breaks on timeindex

2016-12-10 Thread Jiangjie Qin (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiangjie Qin reassigned KAFKA-4497:
---

Assignee: Jiangjie Qin

> log cleaner breaks on timeindex
> ---
>
> Key: KAFKA-4497
> URL: https://issues.apache.org/jira/browse/KAFKA-4497
> Project: Kafka
>  Issue Type: Bug
>  Components: log
>Affects Versions: 0.10.1.0
> Environment: Debian Jessie, Oracle Java 8u92, kafka_2.11-0.10.1.0
>Reporter: Robert Schumann
>Assignee: Jiangjie Qin
>Priority: Critical
>
> _created from ML entry by request of [~ijuma]_
> Hi all,
> we are facing an issue with latest kafka 0.10.1 and the log cleaner thread 
> with regards to the timeindex files. From the log of the log-cleaner we see 
> after startup that it tries to cleanup a topic called xdc_listing-status-v2 
> [1]. The topic is setup with log compaction [2] and the kafka cluster 
> configuration has log.cleaner enabled [3]. Looking at the log and the newly 
> created file [4], the cleaner seems to refer to tombstones prior to 
> epoch_time=0 - maybe because it finds messages which don’t have a timestamp 
> at all (?). All producers and consumers are using 0.10.1 and the topics have 
> been created completely new, so I’m not sure where this issue would come 
> from. The original timeindex file [5] seems to show only valid timestamps for 
> the mentioned offsets. I would also like to mention that the issue happened 
> in two independent datacenters at the same time, so I would rather expect an 
> application/producer issue instead of random disk failures. We didn’t have 
> the problem with 0.10.0 for around half a year, it appeared shortly after the 
> upgrade to 0.10.1.
> The puzzling message from the cleaner “cleaning prior to Fri Dec 02 16:35:50 
> CET 2016, discarding tombstones prior to Thu Jan 01 01:00:00 CET 1970” also 
> confuses me a bit. Does that mean it does not find any log segments which 
> can be cleaned up, or that the last timestamp of the last log segment is 
> somehow broken/missing?
> I’m also wondering why the log cleaner thread stops completely after 
> an error with one topic. I would at least expect that it keeps on cleaning up 
> other topics, but apparently it doesn’t do that; e.g. it’s not even cleaning 
> the __consumer_offsets topic anymore.
> Does anybody have the same issues, or can anyone explain what’s going on? Thanks for 
> any help or suggestions.
> Cheers
> Robert
> [1]
> {noformat}
> [2016-12-06 12:49:17,885] INFO Starting the log cleaner (kafka.log.LogCleaner)
> [2016-12-06 12:49:17,895] INFO [kafka-log-cleaner-thread-0], Starting  
> (kafka.log.LogCleaner)
> [2016-12-06 12:49:17,947] INFO Cleaner 0: Beginning cleaning of log 
> xdc_listing-status-v2-1. (kafka.log.LogCleaner)
> [2016-12-06 12:49:17,948] INFO Cleaner 0: Building offset map for 
> xdc_listing-status-v2-1... (kafka.log.LogCleaner)
> [2016-12-06 12:49:17,989] INFO Cleaner 0: Building offset map for log 
> xdc_listing-status-v2-1 for 1 segments in offset range [0, 194991). 
> (kafka.log.LogCleaner)
> [2016-12-06 12:49:24,572] INFO Cleaner 0: Offset map for log 
> xdc_listing-status-v2-1 complete. (kafka.log.LogCleaner)
> [2016-12-06 12:49:24,577] INFO Cleaner 0: Cleaning log 
> xdc_listing-status-v2-1 (cleaning prior to Fri Dec 02 16:35:50 CET 2016, 
> discarding tombstones prior to Thu Jan 01 01:00:00 CET 1970)... 
> (kafka.log.LogCleaner)
> [2016-12-06 12:49:24,580] INFO Cleaner 0: Cleaning segment 0 in log 
> xdc_listing-status-v2-1 (largest timestamp Fri Dec 02 16:35:50 CET 2016) into 
> 0, retaining deletes. (kafka.log.LogCleaner)
> [2016-12-06 12:49:24,968] ERROR [kafka-log-cleaner-thread-0], Error due to  
> (kafka.log.LogCleaner)
> kafka.common.InvalidOffsetException: Attempt to append an offset (-1) to slot 
> 9 no larger than the last offset appended (11832) to 
> /var/lib/kafka/xdc_listing-status-v2-1/.timeindex.cleaned.
> at 
> kafka.log.TimeIndex$$anonfun$maybeAppend$1.apply$mcV$sp(TimeIndex.scala:117)
> at 
> kafka.log.TimeIndex$$anonfun$maybeAppend$1.apply(TimeIndex.scala:107)
> at 
> kafka.log.TimeIndex$$anonfun$maybeAppend$1.apply(TimeIndex.scala:107)
> at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:234)
> at kafka.log.TimeIndex.maybeAppend(TimeIndex.scala:107)
> at kafka.log.LogSegment.append(LogSegment.scala:106)
> at kafka.log.Cleaner.cleanInto(LogCleaner.scala:518)
> at 
> kafka.log.Cleaner$$anonfun$cleanSegments$1.apply(LogCleaner.scala:404)
> at 
> kafka.log.Cleaner$$anonfun$cleanSegments$1.apply(LogCleaner.scala:400)
> at scala.collection.immutable.List.foreach(List.scala:381)
> at kafka.log.Cleaner.cleanSegments(LogCleaner.scala:400)
> at kafka.log.Cleaner$$anonfun$clean$4.apply(LogCleaner.scala:364)

Re: [DISCUSS] KIP-97: Improved Kafka Client RPC Compatibility Policy

2016-12-10 Thread Sven Ludwig
Hi,
 
as a user I find this concept interesting, but not important enough. I just 
want to share the personal thoughts, perhaps going out on a limb a bit, that 
came to my mind while reading:
 
- I think it is of utmost importance for the Kafka project to continue the good 
work on easing, and really securing, broker upgrades via rolling upgrade. It is 
really valuable for users when they can rely on that, since it allows for 
continuous tech refresh, even on brokers with very large and/or very many 
partitions. I would always prefer this to remain the top priority.
 
- Secondly, like most users, I want a clear, easy-to-understand, yet powerful 
API contract together with producer/consumer configuration, covering aspects 
such as load balancing and availability (consumer groups, or alternative 
solutions via external partitioning and schedulers/supervisors, etc.).
 
- Third, I find it important for the Kafka community to continue support for 
Scala-based APIs on the client side with the same devotion as seen in the 
support for the Java-based Kafka Streams API.
 
- Fourth, I need production readiness features.
 
Only after all of these would I look at the possibility of using a new client 
together with an old broker. Even though the concept seems sound on first read, 
I am somewhat afraid of the additional complexity, the resulting complications, 
and thus the additional work for people. Think of database drivers: supporting 
feature handshaking can become really complicated. Also, runtime exceptions 
that could have been avoided by a clear and simple policy would not be 
appealing. I really wonder whether there are actually enough users who need the 
ability to use a newer client with an older broker to justify the increase in 
complexity.
 
I would agree, however, if someone said it would be better to make it possible 
to use two different client versions within one JVM in order to talk to several 
brokers of different versions, which could become easier to achieve on the 
client side under Java 9 (Jigsaw).
 
Kind Regards
Sven
 


Jenkins build is back to normal : kafka-trunk-jdk8 #1094

2016-12-10 Thread Apache Jenkins Server
See 



Re: [VOTE] KIP-87 - Add Compaction Tombstone Flag

2016-12-10 Thread Jay Kreps
Michael,

The compatibility section goes through the migration path, but isn't the
bigger compatibility issue with existing apps? There are many (probably
thousands) of apps in production that use this feature and send null to
mean delete. It seems like this would break compatibility with them, and
they would have to be rewritten, right?
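
For context, a minimal sketch of what such an app does today with the standard
Java producer (topic name and configs here are just placeholders):

    // Existing convention: a null value is the delete marker (tombstone) for a
    // key on a compacted topic.
    import java.util.Properties
    import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

    val props = new Properties()
    props.put("bootstrap.servers", "localhost:9092")
    props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")

    val producer = new KafkaProducer[String, String](props)
    // "Delete key k1" is expressed by sending a null value:
    producer.send(new ProducerRecord[String, String]("compacted-topic", "k1", null))
    producer.close()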

-Jay

On Thu, Dec 8, 2016 at 12:12 AM, Michael Pearce 
wrote:

> Hi Jun,
>
> 4) On v3 we honour the tombstone. As such we expect it to be set correctly
> as per the KIP.
>
> 4.1) Why would we want to produce an error when it's v3? The exact purpose is
> to support non-null tombstones.
> 4.2) Again, I'm unclear on the question here; on the v3 produce request we
> expect the tombstone flag to be set correctly.
>
> 4.4) Compaction only occurs on compacted topics; the bit makes no
> difference and is not looked at on un-compacted topics (time/size-based eviction).
>
>
> On 06/12/2016, 20:08, "Jun Rao"  wrote:
>
> Hi, Michael,
>
> 4. Then, I think I misunderstood this point. Could you document the
> following points in the wiki?
> 4.1 If producer V3 sets tombstone, but provides a non-null value, does
> the
> send() get an error or does the producer automatically set the value to
> null?
> 4.2 If producer V3 doesn't set tombstone, but provides a null value,
> does
> the send() get an error or does the producer automatically set the
> tombstone?
> 4.3 Does the broker only expect messages that (a) have no tombstone and
> non-null value; (b) have tombstone and null value and reject the
> messages
> with an error code otherwise?
> 4.4 Do 4.1, 4.2, 4.3 depend on whether the topic is compacted or not?
>
> Thanks,
>
> Jun
>
> On Tue, Dec 6, 2016 at 10:35 AM, Michael Pearce  >
> wrote:
>
> > Not at all.  This only acts on compacted topics just as what occurs
> today
> >
> > Sent using OWA for iPhone
> > 
> > From: Jun Rao 
> > Sent: Tuesday, December 6, 2016 6:25:28 PM
> > To: dev@kafka.apache.org
> > Subject: Re: [VOTE] KIP-87 - Add Compaction Tombstone Flag
> >
> > Hi, Michael,
> >
> > 4. Hmm, does that mean the new client library can never send a null
> message
> > even to a regular topic? This seems like a change of the existing
> behavior.
> >
> > Thanks,
> >
> > Jun
> >
> > On Tue, Dec 6, 2016 at 9:51 AM, Michael Pearce <
> michael.pea...@ig.com>
> > wrote:
> >
> > > Hi Jun,
> > >
> > > Re 4) That's because we expect the tombstone value to be set correctly if
> > > the message magic byte is 2; as such, if an older client sends in an old
> > > message, the message is upcast and the bit is set correctly, so we no
> > > longer need to check the value. Mayuresh, can you confirm my thinking and
> > > understanding of what we've discussed?
> > >
> > > On the second point, I understand what you're getting at now, my apologies.
> > > Yes, this makes sense to save on touching the message if we're the only KIP
> > > going in for this release.
> > >
> > > Cheers
> > > Mike
> > >
> > > Sent using OWA for iPhone
> > > 
> > > From: Jun Rao 
> > > Sent: Tuesday, December 6, 2016 5:22:13 PM
> > > To: dev@kafka.apache.org
> > > Subject: Re: [VOTE] KIP-87 - Add Compaction Tombstone Flag
> > >
> > > Hi, Michael,
> > >
> > > 4. Is this updated in the wiki? The text "If the magic byte on
> message is
> > > 2, the broker should use the tombstone bit for log compaction."
> doesn't
> > > seem to have changed.
> > >
> > > 2. My point is that if we change the message format just for this
> KIP, we
> > > should consider whether it's worth optimizing the down conversion
> path
> > > (i.e., decide whether a conversion is needed by just looking at the
> > > tombstone bit in the wrapper message) since tombstone will be used
> > rarely.
> > > However, if the message format change here is combined with other
> KIPs,
> > > then this optimization likely won't be needed. The latter probably
> makes
> > > the code simpler. Jiangjie, Mayuresh, what do you think?
> > >
> > > Other than those, +1 from me,
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Tue, Dec 6, 2016 at 8:54 AM, Michael Pearce <
> michael.pea...@ig.com>
> > > wrote:
> > >
> > > > Hi Jun
> > > >
> > > > do we have your vote on this now?
> > > >
> > > > Any other concerns?
> > > >
> > > > Cheers
> > > > Mike
> > > >
> > > > Sent using OWA for iPhone
> > > > 
> > > > From: Michael Pearce 
> > > > Sent: Saturday, December 3, 2016 1:37:45 AM
> > > > To: dev@kafka.apache.org
> > > > Subject: Re: [VOTE] KIP-87 - Add Compaction Tombs

Re: [VOTE]: KIP-97: The client compatibility KIP

2016-12-10 Thread Jay Kreps
+1 (binding)

-Jay

On Wed, Dec 7, 2016 at 9:17 AM, Colin McCabe  wrote:

> Hi all,
>
> I heard that the VOTE and DISCUSS threads for the KIP-97 discussion
> appeared to be in the same email thread for some people using gmail.  So
> I'm reposting in hopes of getting a separate email thread this time for
> those viewers. :)
>
> I'd like to start voting on KIP-97
> (https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 97%3A+Improved+Kafka+Client+RPC+Compatibility+Policy
> ).
>
> The discussion thread can be found here:
> https://www.mail-archive.com/dev@kafka.apache.org/msg60955.html
>
> Thanks for your feedback.
>
> best,
> Colin McCabe
>


Re: [DISCUSS] KIP-81: Max in-flight fetches

2016-12-10 Thread Jay Kreps
Jason, it's not just decompression but also the conversion from packed
bytes to java objects, right? That can be even larger than the
decompression blow up. I think this may be okay, the problem may just be
that the naming is a bit misleading. In the producer you are literally
allocating a buffer of that size, so the name buffer.memory makes sense. In
this case it is something more like max.bytes.read.per.poll.call (terrible
name, but maybe something like that?).

Mickael, I'd second Jason's request for the default and expand on it. We
currently have several consumer-related memory
settings--max.partition.fetch.bytes, fetch.max.bytes. I don't think it is
clear today how to set these. For example we mark max.partition.fetch.bytes
as high importance and fetch.max.bytes as medium, but it seems like it
would be the other way around. Can we think this through from the point of
view of a lazy user? I.e. I have 64MB of space to use for my consumer, in
an ideal world I'd say, "hey consumer here is 64MB go use that as
efficiently as possible" and not have to tune a bunch of individual things
with complex relationships. Maybe one or both of the existing settings can
either be eliminated or at least marked as low priority, and we can
infer a reasonable default from the new config you're introducing?
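
For concreteness, here is roughly what the two stories look like side by side
(buffer.memory below is the name proposed in this KIP, not an existing consumer
config, and the broker/group settings are placeholders):

    import java.util.Properties

    val props = new Properties()
    props.put("bootstrap.servers", "localhost:9092")
    props.put("group.id", "example-group")
    props.put("key.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer")
    props.put("value.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer")

    // What users have to reason about today:
    props.put("fetch.max.bytes", "52428800")           // 50 MB, the default mentioned in this thread
    props.put("max.partition.fetch.bytes", "1048576")  // 1 MB per partition

    // What the "hey consumer, here is 64MB" experience would look like:
    props.put("buffer.memory", (64 * 1024 * 1024).toString)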

-jay

On Fri, Dec 9, 2016 at 2:08 PM, Jason Gustafson  wrote:

> Hi Mickael,
>
> I think the approach looks good, just a few minor questions:
>
> 1. The KIP doesn't say what the default value of `buffer.memory` will be.
> Looks like we use 50MB as the default for `fetch.max.bytes`, so perhaps it
> makes sense to set the default based on that. Might also be worth
> mentioning somewhere the constraint between the two configs.
> 2. To clarify, this limit only affects the uncompressed size of the fetched
> data, right? The consumer may still exceed it in order to store the
> decompressed record data. We delay decompression until the records are
> returned to the user, but because of max.poll.records, we may end up
> holding onto the decompressed data from a single partition for a few
> iterations. I think this is fine, but probably worth noting in the KIP.
> 3. Is there any risk using the MemoryPool that, after we fill up the memory
> with fetch data, we can starve the coordinator's connection? Suppose, for
> example, that we send a bunch of pre-fetches right before returning to the
> user. These fetches might return before the next call to poll(), in which
> case we might not have enough memory to receive heartbeats, which would
> block us from sending additional heartbeats until the next call to poll().
> Not sure it's a big problem since heartbeats are tiny, but might be worth
> thinking about.
>
> Thanks,
> Jason
>
>
> On Fri, Dec 2, 2016 at 4:31 AM, Mickael Maison 
> wrote:
>
> > It's been a few days since the last comments. KIP-72 vote seems to
> > have passed so if I don't get any new comments I'll start the vote on
> > Monday.
> > Thanks
> >
> > On Mon, Nov 14, 2016 at 6:25 PM, radai 
> wrote:
> > > +1 - there's is a need for an effective way to control kafka memory
> > > consumption - both on the broker and on clients.
> > > i think we could even reuse the exact same param name -
> > *queued.max.bytes *-
> > > as it would serve the exact same purpose.
> > >
> > > also (and again its the same across the broker and clients) this bound
> > > should also cover decompression, at some point.
> > > the problem with that is that to the best of my knowledge the current
> > wire
> > > protocol does not declare the final, uncompressed size of anything up
> > front
> > > - all we know is the size of the compressed buffer. this may require a
> > > format change in the future to properly support?
> > >
> > > On Mon, Nov 14, 2016 at 10:03 AM, Mickael Maison <
> > mickael.mai...@gmail.com>
> > > wrote:
> > >
> > >> Thanks for all the replies.
> > >>
> > >> I've updated the KIP:
> > >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > >> 81%3A+Bound+Fetch+memory+usage+in+the+consumer
> > >> The main point is to selectively read from sockets instead of
> > >> throttling FetchRequests sends. I also mentioned it will be reusing
> > >> the MemoryPool implementation created in KIP-72 instead of adding
> > >> another memory tracking method.
> > >>
> > >> Please have another look. As always, comments are welcome !
> > >>
> > >> On Thu, Nov 10, 2016 at 2:47 AM, radai 
> > wrote:
> > >> > selectively reading from sockets achieves memory control (up to and
> > not
> > >> > including talk of (de)compression)
> > >> >
> > >> > this is exactly what i (also, even mostly) did for kip-72 - which i
> > hope
> > >> in
> > >> > itself should be a reason to think about both KIPs at the same time
> > >> because
> > >> > the changes will be similar (at least in intent) and might result in
> > >> > duplicated effort.
> > >> >
> > >> > a pool API is a way to "scale" all the way from just maintaining a
> > >> variable
> > >> > holding amount of av

[GitHub] kafka pull request #2240: MINOR: Update Log Compaction Details in design.htm...

2016-12-10 Thread curlup
GitHub user curlup opened a pull request:

https://github.com/apache/kafka/pull/2240

MINOR: Update Log Compaction Details in design.html



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/curlup/kafka trunk

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2240.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2240


commit 72251d195285030d1e998645fda80f019ee08f90
Author: Pavel Trukhanov 
Date:   2016-12-10T20:52:41Z

Update design.html




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-3008) Connect should parallelize task start/stop

2016-12-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15738466#comment-15738466
 ] 

ASF GitHub Bot commented on KAFKA-3008:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2232


> Connect should parallelize task start/stop
> --
>
> Key: KAFKA-3008
> URL: https://issues.apache.org/jira/browse/KAFKA-3008
> Project: Kafka
>  Issue Type: Improvement
>  Components: KafkaConnect
>Affects Versions: 0.9.0.0
>Reporter: Ewen Cheslack-Postava
>Assignee: Konstantine Karantasis
>Priority: Minor
> Fix For: 0.10.2.0
>
>
> The Herder implementations currently iterate over all connectors/tasks and 
> sequentially start/stop them. We should parallelize this. This is less 
> critical for {{StandaloneHerder}}, but pretty important for 
> {{DistributedHerder}} since it will generally be managing more tasks and any 
> delay starting/stopping a single task will impact every other task on the 
> node (and can ultimately result in incorrect behavior when a single 
> offset commit in one connector takes too long, preventing all of the rest 
> from committing offsets).
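
For illustration, a minimal sketch of the kind of parallel start-up being
proposed, using a plain fixed thread pool and stand-in Task/startTask names
(this is not the actual Herder code):

{code}
import java.util.concurrent.{Callable, Executors, TimeUnit}
import scala.collection.JavaConverters._

case class Task(id: String)

def startTask(t: Task): Unit = {
  // placeholder for the potentially slow per-task start-up work
  println(s"starting ${t.id}")
}

def startAllInParallel(tasks: Seq[Task], threads: Int = 8): Unit = {
  val pool = Executors.newFixedThreadPool(threads)
  try {
    val callables = tasks.map(t => new Callable[Unit] { def call(): Unit = startTask(t) })
    // invokeAll still waits for every start to finish, but one slow task no
    // longer delays when the others begin.
    pool.invokeAll(callables.asJava)
  } finally {
    pool.shutdown()
    pool.awaitTermination(1, TimeUnit.MINUTES)
  }
}
{code}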



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request #2232: HOTFIX: Fix HerderRequest.compareTo()

2016-12-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2232


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: Reg: ACLS

2016-12-10 Thread BigData dev
Hi,

bin/kafka-acls.sh --topic kafka-testtopic --add -allow-host 9.30.15.19
--operation Write --authorizer-properties
zookeeper.connect=hostname.abc.com:2181

Below is the message I am getting.
You must specify one of: --allow-principal, --deny-principal when
trying to add ACLs.

So, as Kerberos is not enabled, what should the --allow-principal value be?

Any information on this would be greatly helpful.
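
(If, as mentioned in the thread below, unauthenticated requests are treated as
the anonymous user, then I guess the command would look something like this --
assuming the 0.10 kafka-acls.sh flags and that host-based rules are what I want:

bin/kafka-acls.sh --authorizer-properties zookeeper.connect=hostname.abc.com:2181 \
  --add --allow-principal User:ANONYMOUS --allow-host 9.30.15.19 \
  --operation Write --topic kafka-testtopic)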



Thanks

On Sat, Dec 10, 2016 at 11:02 AM, BigData dev 
wrote:

> Hi Ashish, Ismael
> Thanks for the info.
> So on a Kafka cluster (without any security enabled) I can add ACLs with an IP
> address.
> Is that correct?
>
>
> Thanks,
> Bharat
>
>
> On Fri, Dec 9, 2016 at 11:14 AM, Ashish Singh  wrote:
>
>> Ismael, thanks for the correction. I assumed the question was targeted for
>> without any security enabled, but yea even then IP based auth is possible.
>>
>> On Fri, Dec 9, 2016 at 11:01 AM, Ismael Juma  wrote:
>>
>> > It is possible to use ACLs with IPs or other SASL mechanisms (PLAIN for
>> > example). So Kerberos and SSL are not required (although commonly used).
>> >
>> > Ismael
>> >
>> > On Fri, Dec 9, 2016 at 6:59 PM, Ashish Singh 
>> wrote:
>> >
>> > > Hey,
>> > >
>> > > No it does not. Without kerberos or ssl, all requests will appear to
>> come
>> > > from anonymous user, and as long as a user is not identified it is not
>> > > possible to do authorization on.
>> > >
>> > > On Fri, Dec 9, 2016 at 10:40 AM, BigData dev > >
>> > > wrote:
>> > >
>> > > > Hi All,
>> > > > I have a question here, Does Kafka support ACL's with out
>> kerberos/SSL?
>> > > >
>> > > > Any info on this would be greatly helpful.
>> > > >
>> > > >
>> > > > Thanks
>> > > >
>> > >
>> > >
>> > >
>> > > --
>> > >
>> > > Regards,
>> > > Ashish
>> > >
>> >
>>
>>
>>
>> --
>>
>> Regards,
>> Ashish
>>
>
>


Re: Reg: ACLS

2016-12-10 Thread BigData dev
Hi Ashish, Ismael
Thanks for the info.
So on a Kafka cluster (without any security enabled) I can add ACLs with an IP
address.
Is that correct?


Thanks,
Bharat


On Fri, Dec 9, 2016 at 11:14 AM, Ashish Singh  wrote:

> Ismael, thanks for the correction. I assumed the question was targeted for
> without any security enabled, but yea even then IP based auth is possible.
>
> On Fri, Dec 9, 2016 at 11:01 AM, Ismael Juma  wrote:
>
> > It is possible to use ACLs with IPs or other SASL mechanisms (PLAIN for
> > example). So Kerberos and SSL are not required (although commonly used).
> >
> > Ismael
> >
> > On Fri, Dec 9, 2016 at 6:59 PM, Ashish Singh 
> wrote:
> >
> > > Hey,
> > >
> > > No it does not. Without kerberos or ssl, all requests will appear to
> come
> > > from anonymous user, and as long as a user is not identified it is not
> > > possible to do authorization on.
> > >
> > > On Fri, Dec 9, 2016 at 10:40 AM, BigData dev 
> > > wrote:
> > >
> > > > Hi All,
> > > > I have a question here, Does Kafka support ACL's with out
> kerberos/SSL?
> > > >
> > > > Any info on this would be greatly helpful.
> > > >
> > > >
> > > > Thanks
> > > >
> > >
> > >
> > >
> > > --
> > >
> > > Regards,
> > > Ashish
> > >
> >
>
>
>
> --
>
> Regards,
> Ashish
>


[jira] [Comment Edited] (KAFKA-3355) GetOffsetShell command doesn't work with SASL enabled Kafka

2016-12-10 Thread Mohammed amine GARMES (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15735361#comment-15735361
 ] 

Mohammed amine GARMES edited comment on KAFKA-3355 at 12/10/16 10:20 AM:
-

Hello [~singhashish], [~gwenshap],

Is there any news regarding the GetOffsetShell problem?
I fixed this issue for my company; I would like to know whether I can push this 
fix so that it is included in the next version.

I would also like to fix the create/delete topic commands, because we have some 
security issues related to a PCI DSS audit. 


Best regards 


was (Author: magarmes):
Hello [~singhashish], [~gwenshap],

I want to know if there is news in relation to the problem of GetOffsetShell ?
I fixed this issue for my company, I want to know if I can pufh this fix to be 
provided with  0.10.0.1 version ??!

And I want to fix create/delete topic commands, because we have some security 
issue regarding the PCI DSS audit. 


Best regards 

> GetOffsetShell command doesn't work with SASL enabled Kafka
> ---
>
> Key: KAFKA-3355
> URL: https://issues.apache.org/jira/browse/KAFKA-3355
> Project: Kafka
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 0.9.0.1
> Environment: Kafka 0.9.0.1
>Reporter: TAO XIAO
>Assignee: Ashish K Singh
>
> I found that GetOffsetShell doesn't work with SASL enabled Kafka. I believe 
> this is due to old producer being used in GetOffsetShell.
> Kafka version 0.9.0.1
> Exception
> % bin/kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list 
> localhost:9092 --topic test --time -1
> [2016-03-04 21:43:56,597] INFO Verifying properties 
> (kafka.utils.VerifiableProperties)
> [2016-03-04 21:43:56,613] INFO Property client.id is overridden to 
> GetOffsetShell (kafka.utils.VerifiableProperties)
> [2016-03-04 21:43:56,613] INFO Property metadata.broker.list is overridden to 
> localhost:9092 (kafka.utils.VerifiableProperties)
> [2016-03-04 21:43:56,613] INFO Property request.timeout.ms is overridden to 
> 1000 (kafka.utils.VerifiableProperties)
> [2016-03-04 21:43:56,674] INFO Fetching metadata from broker 
> BrokerEndPoint(0,localhost,9092) with correlation id 0 for 1 topic(s) 
> Set(test) (kafka.client.ClientUtils$)
> [2016-03-04 21:43:56,689] INFO Connected to localhost:9092 for producing 
> (kafka.producer.SyncProducer)
> [2016-03-04 21:43:56,705] WARN Fetching topic metadata with correlation id 0 
> for topics [Set(test)] from broker [BrokerEndPoint(0,localhost,9092)] failed 
> (kafka.client.ClientUtils$)
> java.nio.BufferUnderflowException
>   at java.nio.Buffer.nextGetIndex(Buffer.java:498)
>   at java.nio.HeapByteBuffer.getShort(HeapByteBuffer.java:304)
>   at kafka.api.ApiUtils$.readShortString(ApiUtils.scala:36)
>   at kafka.cluster.BrokerEndPoint$.readFrom(BrokerEndPoint.scala:52)
>   at 
> kafka.api.TopicMetadataResponse$$anonfun$1.apply(TopicMetadataResponse.scala:28)
>   at 
> kafka.api.TopicMetadataResponse$$anonfun$1.apply(TopicMetadataResponse.scala:28)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
>   at scala.collection.immutable.Range.foreach(Range.scala:166)
>   at scala.collection.TraversableLike$class.map(TraversableLike.scala:245)
>   at scala.collection.AbstractTraversable.map(Traversable.scala:104)
>   at 
> kafka.api.TopicMetadataResponse$.readFrom(TopicMetadataResponse.scala:28)
>   at kafka.producer.SyncProducer.send(SyncProducer.scala:120)
>   at kafka.client.ClientUtils$.fetchTopicMetadata(ClientUtils.scala:59)
>   at kafka.client.ClientUtils$.fetchTopicMetadata(ClientUtils.scala:94)
>   at kafka.tools.GetOffsetShell$.main(GetOffsetShell.scala:78)
>   at kafka.tools.GetOffsetShell.main(GetOffsetShell.scala)
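
As a hedged illustration only (not the fix to GetOffsetShell itself): the new
Java consumer, which does support SASL, can be used to query log-end offsets
directly. The broker address, topic name and the exact security settings below
are assumptions.

{code}
import java.util.Properties
import scala.collection.JavaConverters._
import org.apache.kafka.clients.consumer.KafkaConsumer
import org.apache.kafka.common.TopicPartition

object EndOffsets {
  def main(args: Array[String]): Unit = {
    val props = new Properties()
    props.put("bootstrap.servers", "localhost:9092")
    props.put("group.id", "offset-check")
    props.put("key.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer")
    props.put("value.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer")
    props.put("security.protocol", "SASL_PLAINTEXT")  // plus the JAAS settings for your cluster

    val consumer = new KafkaConsumer[Array[Byte], Array[Byte]](props)
    try {
      val partitions = consumer.partitionsFor("test").asScala
        .map(p => new TopicPartition(p.topic, p.partition))
      // endOffsets is available from 0.10.1 and goes through the new network client
      consumer.endOffsets(partitions.asJava).asScala.foreach {
        case (tp, offset) => println(s"$tp -> $offset")
      }
    } finally consumer.close()
  }
}
{code}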



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)