Which container should you use when deploying on Docker?

2019-12-13 Thread Yu Watanabe
Hello.

I would like to ask a question related to Kafka on Docker Engine.
Which container image should you use for Kafka when deploying on Docker in
production?

When I look on Docker Hub, I do not see any Kafka container image tagged with
one of the below:

* Docker Certified
* Verified Publisher
* Official Images

Repository "confluent" seems be the closest one since its the creator of
kafka but it does not have above tag .
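
For reference, a minimal sketch of pulling and running the confluentinc/cp-kafka
image from that repository (the image name, tag, and settings below are just
illustrative candidates I found, not an official recommendation):

    # hypothetical single-broker sketch; assumes a reachable "zookeeper" host
    docker pull confluentinc/cp-kafka:5.3.1
    docker run -d --name kafka \
      -e KAFKA_ZOOKEEPER_CONNECT=zookeeper:2181 \
      -e KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://kafka:9092 \
      confluentinc/cp-kafka:5.3.1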

Thanks,
Yu Watanabe

-- 
Yu Watanabe
Weekend freelancer who loves the challenge of building data platforms
yu.w.ten...@gmail.com


[RESULTS] [VOTE] Release Kafka version 2.4.0

2019-12-13 Thread Manikumar
This vote passes with 6 +1 votes (3 binding) and no 0 or -1 votes.

+1 votes
PMC Members:
* Gwen Shapira
* Jun Rao
* Guozhang Wang

Committers:
* Mickael Maison

Community:
* Adam Bellemare
* Israel Ekpo

0 votes
* No votes

-1 votes
* No votes

Vote thread:
https://markmail.org/message/qlira627sqbmmzz4

I'll continue with the release process and the release announcement will
follow in the next few days.

Manikumar


Re: Reducing streams startup bandwidth usage

2019-12-13 Thread Alessandro Tagliapietra
Hi Sophie,

thanks for explaining that.
So yes, it seems that since I'm using the default grace period of 24 hours,
that might be what causes the records to be sent to the changelog after ~24
hours.

However, I'm switching from the regular windowing system to the custom one,
and while I now understand why the regular windows might send all the data to
the changelog, I don't see why the new one would.
In the regular-windows version, since new keys are added every minute
(because of the key/time-window combination), I think it's expected behavior;
I don't see why the cache would stop working in the new scenario, where the
number of keys is fixed.

Since I'm rewriting the same keys over and over and my changelog topic has a
compact cleanup policy, why should it send old values to the changelog?
What I mean is: if the cache can keep the values for, e.g., a few hours, and
we saw that the amount of data per key in each custom window doesn't change
(it goes up and down but never exceeds a certain value), why shouldn't it be
able to do the same for longer periods of time?
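
For reference, as far as I understand, the two settings that bound how long
the cache can hold a dirty entry before flushing it to the changelog are the
cache size and the commit interval, since the cache is flushed on every
commit. A sketch of the relevant Streams properties (values are illustrative,
not recommendations):

    # total record cache across all threads (default is 10485760 = 10 MB)
    cache.max.bytes.buffering=52428800
    # dirty cache entries are flushed to the changelog on every commit
    commit.interval.ms=30000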

Btw, there doesn't seem to be a metric exposing the amount of cache available
for each state store. Is it possible to find that out in some other way, just
to keep track of it?
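
The closest thing I can see is the record-cache hit ratio, which can be read
over JMX, e.g. with something like the below (a sketch, assuming JMX is
enabled on port 9999; the object-name pattern matches all record caches):

    bin/kafka-run-class.sh kafka.tools.JmxTool \
      --jmx-url service:jmx:rmi:///jndi/rmi://localhost:9999/jmxrmi \
      --object-name 'kafka.streams:type=stream-record-cache-metrics,*' \
      --attributes hit-ratio-avg \
      --reporting-interval 60000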

Regards

--
Alessandro Tagliapietra


On Thu, Dec 12, 2019 at 3:14 PM Sophie Blee-Goldman wrote:

> Thanks for collecting all these metrics. It might be that as the length of
> the lists
> increases over time, the cache is able to hold fewer unique keys and
> eventually has to
> start evicting things. This would explain why the cache hit rate starts to
> decrease, and
> likely why latency starts to go up. Whenever a dirty entry is
> evicted/flushed from the cache
> it gets sent to the changelog (and underlying state store), so these
> evictions might be the
> cause of the increased load.
>
> The fluctuations you're seeing (ie it starts and stops "working") could
> just be the window
> closing. After that, the list size would go back down to zero, and the
> cache would suddenly
> have free space again.
>
> Does that seem to make sense with what you're seeing?
>
> On Tue, Dec 10, 2019 at 7:04 PM Alessandro Tagliapietra <
> tagliapietra.alessan...@gmail.com> wrote:
>
> > Just an update, since it has been happening again and I have some more
> > metrics to show. The topology is this:
> >
> > Topologies:
> >Sub-topology: 0
> > Source: KSTREAM-SOURCE-00 (topics: [sensors])
> >   --> KSTREAM-TRANSFORMVALUES-01
> > Processor: KSTREAM-TRANSFORMVALUES-01 (stores:
> > [new-data-store])
> >   --> KSTREAM-FLATMAPVALUES-02
> >   <-- KSTREAM-SOURCE-00
> > Processor: KSTREAM-FLATMAPVALUES-02 (stores: [])
> >   --> KSTREAM-TRANSFORMVALUES-03
> >   <-- KSTREAM-TRANSFORMVALUES-01
> > Processor: KSTREAM-TRANSFORMVALUES-03 (stores:
> > [LastValueStore])
> >   --> KSTREAM-FILTER-04
> >   <-- KSTREAM-FLATMAPVALUES-02
> > Processor: KSTREAM-FILTER-04 (stores: [])
> >   --> KSTREAM-AGGREGATE-05
> >   <-- KSTREAM-TRANSFORMVALUES-03
> > Processor: KSTREAM-AGGREGATE-05 (stores: [aggregate-store])
> >   --> KTABLE-TOSTREAM-06
> >   <-- KSTREAM-FILTER-04
> > Processor: KTABLE-TOSTREAM-06 (stores: [])
> >   --> KSTREAM-TRANSFORM-07
> >   <-- KSTREAM-AGGREGATE-05
> > Processor: KSTREAM-TRANSFORM-07 (stores: [suppress-store])
> >   --> KSTREAM-MAP-08
> >   <-- KTABLE-TOSTREAM-06
> > Processor: KSTREAM-MAP-08 (stores: [])
> >   --> KSTREAM-PRINTER-09, KSTREAM-SINK-10
> >   <-- KSTREAM-TRANSFORM-07
> > Processor: KSTREAM-PRINTER-09 (stores: [])
> >   --> none
> >   <-- KSTREAM-MAP-08
> > Sink: KSTREAM-SINK-10 (topic: sensors-output)
> >   <-- KSTREAM-MAP-08
> >
> >  - https://imgur.com/R3Pqypo this shows that the input source topic has
> > the same rate of messages
> >  - https://imgur.com/BTwq09p this is the number of records processed by
> > each processor node; at first there are 3 processor nodes
> > (kstream-transformvalues-3, kstream-filter-4, kstream-aggregate-5)
> > processing 4-5k records/min, then ktable-tostream-6 and kstream-transform-7
> > ramp up and the previous ones slow down due to the higher load
> >  - https://imgur.com/5eXpf8l the state stores' cache hit rate starts to
> > decrease
> >  - https://imgur.com/dwFOb2g put and fetch operations of the window store
> > remain almost the same (maybe lower due to the higher load)
> >  - https://imgur.com/1XZmMW5 commit latency increases
> >  - https://imgur.com/UdBpOVU commit rate stays almost the same
> >  - https://imgur.com/UJ3JB4f process latency increases
> >  - https://imgur.com/55YVmy2 process rate stays the same
> >  - https://imgur.com/GMJ3eGV sent records increase because of aggregate
> > and
> > 

Re: Topics marked for deletion stuck as ineligible for deletion

2019-12-13 Thread Peter Bukowinski
If any brokers are offline, Kafka can't successfully delete a topic. What's
the state of broker 5?
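
One quick way to check which broker ids are currently registered is the
ZooKeeper shell, e.g. (a sketch; substitute your own zk host for zk1:2181):

    bin/zookeeper-shell.sh zk1:2181
    ls /brokers/ids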

-- Peter (from phone)

> On Dec 13, 2019, at 8:55 AM, Vincent Rischmann  wrote:
> 
> Hi,
> 
> I've deleted a bunch of topics yesterday on our cluster but some are now 
> stuck in "marked for deletion".
> 
> * I've looked in the data directory of every broker and there's no data left 
> for the topics; the directories don't exist anymore.
> * in zookeeper the znode `brokers/topics/mytopic` still exists
> * the znode `admin/delete_topics/mytopic` still exists
> 
> I've tried the following to no avail:
> 
> * restarting all brokers
> * removing the `admin/delete_topics/mytopic` node and re-running 
> `kafka-topics.sh --delete --topic mytopic`
> 
> In the kafka-controller.log of some brokers I see this, which seems relevant:
> 
>[2019-12-13 10:15:07,244] WARN [Channel manager on controller 6]: Not 
> sending request (type=StopReplicaRequest, controllerId=6, controllerEpoch=78, 
> deletePartitions=false, partitions=mytopic-17) to broker 5, since it is 
> offline. (kafka.controller.ControllerChannelManager)
>[2019-12-13 10:15:07,244] WARN [Channel manager on controller 6]: Not 
> sending request (type=StopReplicaRequest, controllerId=6, controllerEpoch=78, 
> deletePartitions=false, partitions=mytopic-24) to broker 5, since it is 
> offline. (kafka.controller.ControllerChannelManager)
> 
> and
> 
>12061:[2019-12-12 10:35:55,290] INFO [Topic Deletion Manager 1], Handling 
> deletion for topics mytopic (kafka.controller.TopicDeletionManager)
>12062:[2019-12-12 10:35:55,292] INFO [Topic Deletion Manager 1], Not 
> retrying deletion of topic mytopic at this time since it is marked ineligible 
> for deletion (kafka.controller.TopicDeletionManager)
> 
> Since the data directory is already deleted, I'm thinking of simply removing 
> the znode `brokers/topics/mytopic` from zookeeper manually.
> 
> Does anyone have another suggestion? Is it safe to remove the znode manually?
> 
> Thanks.


Topics marked for deletion stuck as ineligible for deletion

2019-12-13 Thread Vincent Rischmann
Hi,

I've deleted a bunch of topics yesterday on our cluster but some are now stuck 
in "marked for deletion".

* I've looked in the data directory of every broker and there's no data left 
for the topics; the directories don't exist anymore.
* in zookeeper the znode `brokers/topics/mytopic` still exists
* the znode `admin/delete_topics/mytopic` still exists

I've tried the following to no avail:

* restarting all brokers
* removing the `admin/delete_topics/mytopic` node and re-running 
`kafka-topics.sh --delete --topic mytopic`

In the kafka-controller.log of some brokers I see this, which seems relevant:

[2019-12-13 10:15:07,244] WARN [Channel manager on controller 6]: Not 
sending request (type=StopReplicaRequest, controllerId=6, controllerEpoch=78, 
deletePartitions=false, partitions=mytopic-17) to broker 5, since it is 
offline. (kafka.controller.ControllerChannelManager)
[2019-12-13 10:15:07,244] WARN [Channel manager on controller 6]: Not 
sending request (type=StopReplicaRequest, controllerId=6, controllerEpoch=78, 
deletePartitions=false, partitions=mytopic-24) to broker 5, since it is 
offline. (kafka.controller.ControllerChannelManager)

and

12061:[2019-12-12 10:35:55,290] INFO [Topic Deletion Manager 1], Handling 
deletion for topics mytopic (kafka.controller.TopicDeletionManager)
12062:[2019-12-12 10:35:55,292] INFO [Topic Deletion Manager 1], Not 
retrying deletion of topic mytopic at this time since it is marked ineligible 
for deletion (kafka.controller.TopicDeletionManager)

Since the data directory is already deleted, I'm thinking of simply removing the 
znode `brokers/topics/mytopic` from zookeeper manually.
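
i.e. something like the below (a sketch; it assumes a ZooKeeper 3.4-style CLI
where `rmr` is available, newer versions use `deleteall`, and zk1:2181 stands
in for our real zk host):

    bin/zookeeper-shell.sh zk1:2181
    ls /admin/delete_topics
    rmr /admin/delete_topics/mytopic
    rmr /brokers/topics/mytopic

I assume the controller would also need to pick up the change afterwards,
e.g. by forcing a re-election (deleting the `/controller` znode) or bouncing
it, but I'm not sure.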

Does anyone have another suggestion? Is it safe to remove the znode manually?

Thanks.