Re: [VOTE] 2.1.1 RC2

2019-02-13 Thread Gwen Shapira
+1
(Validated signatures, unit tests, quickstart)

Thanks, Colin, for managing the bugfix release.
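
(For anyone who wants to run the same checks, a rough sketch of the usual RC
verification steps -- the artifact file names below are assumptions, so check
the RC directory listing, and see the README in the source tree for running
the unit tests and quickstart:)

  # fetch the source artifact, its detached signature, and the committers' keys
  wget http://home.apache.org/~cmccabe/kafka-2.1.1-rc2/kafka-2.1.1-src.tgz
  wget http://home.apache.org/~cmccabe/kafka-2.1.1-rc2/kafka-2.1.1-src.tgz.asc
  curl https://kafka.apache.org/KEYS | gpg --import

  # verify the signature, then unpack and test per the README
  gpg --verify kafka-2.1.1-src.tgz.asc kafka-2.1.1-src.tgz
  tar xzf kafka-2.1.1-src.tgz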

On Fri, Feb 8, 2019 at 12:02 PM Colin McCabe  wrote:
>
> Hi all,
>
> This is the third candidate for release of Apache Kafka 2.1.1.  This release 
> includes many bug fixes for Apache Kafka 2.1.
>
> Compared to rc1, this release includes the following changes:
> * MINOR: release.py: fix some compatibility problems.
> * KAFKA-7897; Disable leader epoch cache when older message formats are used
> * KAFKA-7902: Replace original loginContext if SASL/OAUTHBEARER refresh login 
> fails
> * MINOR: Fix more places where the version should be bumped from 2.1.0 -> 
> 2.1.1
> * KAFKA-7890: Invalidate ClusterConnectionState cache for a broker if the 
> hostname of the broker changes.
> * KAFKA-7873; Always seek to beginning in KafkaBasedLog
> * MINOR: Correctly set dev version in version.py
>
> Check out the release notes here:
> http://home.apache.org/~cmccabe/kafka-2.1.1-rc2/RELEASE_NOTES.html
>
> The vote will go until Wednesday, February 13th.
>
> * Release artifacts to be voted upon (source and binary):
> http://home.apache.org/~cmccabe/kafka-2.1.1-rc2/
>
> * Maven artifacts to be voted upon:
> https://repository.apache.org/content/groups/staging/
>
> * Javadoc:
> http://home.apache.org/~cmccabe/kafka-2.1.1-rc2/javadoc/
>
> * Tag to be voted upon (off the 2.1 branch) is the 2.1.1-rc2 tag:
> https://github.com/apache/kafka/releases/tag/2.1.1-rc2
>
> * Jenkins builds for the 2.1 branch:
> Unit/integration tests: https://builds.apache.org/job/kafka-2.1-jdk8/
>
> Thanks to everyone who tested the earlier RCs.
>
> cheers,
> Colin



-- 
Gwen Shapira
Product Manager | Confluent
650.450.2760 | @gwenshap
Follow us: Twitter | blog


Re: [ANNOUNCE] New Committer: Bill Bejeck

2019-02-13 Thread Damian Guy
Congratulations Bill!

On Wed, 13 Feb 2019 at 16:51, Satish Duggana 
wrote:

> Congratulations Bill!
>
> On Thu, Feb 14, 2019 at 6:41 AM Marcelo Barbosa
>  wrote:
> >
> > Wow! Congrats Bill!
> > Cheers,
> > Barbosa
> > On Wednesday, February 13, 2019 at 23:03:54 BRST, Guozhang Wang wrote:
> >
> >  Hello all,
> >
> > The PMC of Apache Kafka is happy to announce that we've added Bill Bejeck
> > as our newest project committer.
> >
> > Bill has been active in the Kafka community since 2015. He has made
> > significant contributions to the Kafka Streams project with more than 100
> > PRs and 4 authored KIPs, including the streams topology optimization
> > framework. Bill is also very keen on tightening Kafka's unit test / system
> > test coverage, which is of great value to our project codebase.
> >
> > In addition, Bill has been very active in evangelizing Kafka for stream
> > processing in the community. He has given several Kafka meetup talks in the
> > past year, including a presentation at Kafka Summit SF. He has also authored
> > a book about Kafka Streams (
> > https://www.manning.com/books/kafka-streams-in-action), as well as various
> > posts in public venues like DZone and on his personal blog (
> > http://codingjunkie.net/).
> >
> > We really appreciate the contributions and are looking forward to seeing
> > more from him. Congratulations, Bill!
> >
> >
> > Guozhang, on behalf of the Apache Kafka PMC
> >
>


Re: [ANNOUNCE] New Committer: Bill Bejeck

2019-02-13 Thread Satish Duggana
Congratulations Bill!

On Thu, Feb 14, 2019 at 6:41 AM Marcelo Barbosa
 wrote:
>
> Wow! Congrats Bill!
> Cheers,
> Barbosa
> On Wednesday, February 13, 2019 at 23:03:54 BRST, Guozhang Wang wrote:
>
>  Hello all,
>
> The PMC of Apache Kafka is happy to announce that we've added Bill Bejeck
> as our newest project committer.
>
> Bill has been active in the Kafka community since 2015. He has made
> significant contributions to the Kafka Streams project with more than 100
> PRs and 4 authored KIPs, including the streams topology optimization
> framework. Bill is also very keen on tightening Kafka's unit test / system
> test coverage, which is of great value to our project codebase.
>
> In addition, Bill has been very active in evangelizing Kafka for stream
> processing in the community. He has given several Kafka meetup talks in the
> past year, including a presentation at Kafka Summit SF. He has also authored
> a book about Kafka Streams (
> https://www.manning.com/books/kafka-streams-in-action), as well as various
> posts in public venues like DZone and on his personal blog (
> http://codingjunkie.net/).
>
> We really appreciate the contributions and are looking forward to seeing
> more from him. Congratulations, Bill!
>
>
> Guozhang, on behalf of the Apache Kafka PMC
>


Re: [ANNOUNCE] New Committer: Bill Bejeck

2019-02-13 Thread Vahid Hashemian
Congratulations Bill!

On Wed, Feb 13, 2019 at 5:09 PM Matthias J. Sax 
wrote:

> Congrats! Well deserved!
>
> -Matthias
>
> On 2/13/19 4:56 PM, Guozhang Wang wrote:
> > Hello all,
> >
> > The PMC of Apache Kafka is happy to announce that we've added Bill Bejeck
> > as our newest project committer.
> >
> > Bill has been active in the Kafka community since 2015. He has made
> > significant contributions to the Kafka Streams project with more than 100
> > PRs and 4 authored KIPs, including the streams topology optimization
> > framework. Bill is also very keen on tightening Kafka's unit test / system
> > test coverage, which is of great value to our project codebase.
> >
> > In addition, Bill has been very active in evangelizing Kafka for stream
> > processing in the community. He has given several Kafka meetup talks in the
> > past year, including a presentation at Kafka Summit SF. He has also authored
> > a book about Kafka Streams (
> > https://www.manning.com/books/kafka-streams-in-action), as well as various
> > posts in public venues like DZone and on his personal blog (
> > http://codingjunkie.net/).
> >
> > We really appreciate the contributions and are looking forward to seeing
> > more from him. Congratulations, Bill!
> >
> >
> > Guozhang, on behalf of the Apache Kafka PMC
> >
>
>


Re: [ANNOUNCE] New Committer: Bill Bejeck

2019-02-13 Thread Marcelo Barbosa
Wow! Congrats Bill! 
Cheers,
Barbosa
On Wednesday, February 13, 2019 at 23:03:54 BRST, Guozhang Wang wrote:
 
 Hello all,

The PMC of Apache Kafka is happy to announce that we've added Bill Bejeck
as our newest project committer.

Bill has been active in the Kafka community since 2015. He has made
significant contributions to the Kafka Streams project with more than 100
PRs and 4 authored KIPs, including the streams topology optimization
framework. Bill is also very keen on tightening Kafka's unit test / system
test coverage, which is of great value to our project codebase.

In addition, Bill has been very active in evangelizing Kafka for stream
processing in the community. He has given several Kafka meetup talks in the
past year, including a presentation at Kafka Summit SF. He has also authored
a book about Kafka Streams (
https://www.manning.com/books/kafka-streams-in-action), as well as various
posts in public venues like DZone and on his personal blog (
http://codingjunkie.net/).

We really appreciate the contributions and are looking forward to seeing
more from him. Congratulations, Bill!


Guozhang, on behalf of the Apache Kafka PMC
  

Re: [ANNOUNCE] New Committer: Bill Bejeck

2019-02-13 Thread Matthias J. Sax
Congrats! Well deserved!

-Matthias

On 2/13/19 4:56 PM, Guozhang Wang wrote:
> Hello all,
> 
> The PMC of Apache Kafka is happy to announce that we've added Bill Bejeck
> as our newest project committer.
> 
> Bill has been active in the Kafka community since 2015. He has made
> significant contributions to the Kafka Streams project with more than 100
> PRs and 4 authored KIPs, including the streams topology optimization
> framework. Bill is also very keen on tightening Kafka's unit test / system
> test coverage, which is of great value to our project codebase.
>
> In addition, Bill has been very active in evangelizing Kafka for stream
> processing in the community. He has given several Kafka meetup talks in the
> past year, including a presentation at Kafka Summit SF. He has also authored
> a book about Kafka Streams (
> https://www.manning.com/books/kafka-streams-in-action), as well as various
> posts in public venues like DZone and on his personal blog (
> http://codingjunkie.net/).
>
> We really appreciate the contributions and are looking forward to seeing
> more from him. Congratulations, Bill!
> 
> 
> Guozhang, on behalf of the Apache Kafka PMC
> 





[ANNOUNCE] New Committer: Bill Bejeck

2019-02-13 Thread Guozhang Wang
Hello all,

The PMC of Apache Kafka is happy to announce that we've added Bill Bejeck
as our newest project committer.

Bill has been active in the Kafka community since 2015. He has made
significant contributions to the Kafka Streams project with more than 100
PRs and 4 authored KIPs, including the streams topology optimization
framework. Bill is also very keen on tightening Kafka's unit test / system
test coverage, which is of great value to our project codebase.

In addition, Bill has been very active in evangelizing Kafka for stream
processing in the community. He has given several Kafka meetup talks in the
past year, including a presentation at Kafka Summit SF. He has also authored
a book about Kafka Streams (
https://www.manning.com/books/kafka-streams-in-action), as well as various
posts in public venues like DZone and on his personal blog (
http://codingjunkie.net/).

We really appreciate the contributions and are looking forward to seeing
more from him. Congratulations, Bill!


Guozhang, on behalf of the Apache Kafka PMC


Re: Kafka Kerberos Login

2019-02-13 Thread nunoragoncalves
Anyone?


Re: Can I query ktable/stream/store with SQL like statment.

2019-02-13 Thread Matthias J. Sax
https://github.com/confluentinc/kafka-streams-examples

The "Music" example show how Interactive Queries work.

And of course you should read the docs.
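
For the range part specifically, here is a minimal sketch (assuming a running
KafkaStreams instance called 'streams', a queryable String-keyed store named
"my-store", and a made-up value type MyValue):

import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.state.KeyValueIterator;
import org.apache.kafka.streams.state.QueryableStoreTypes;
import org.apache.kafka.streams.state.ReadOnlyKeyValueStore;

ReadOnlyKeyValueStore<String, MyValue> store =
    streams.store("my-store", QueryableStoreTypes.keyValueStore());

// conservative range for the prefix "abc"; the bounds apply to the raw bytes,
// so filter the iterator to drop false positives such as the key "abd"
try (KeyValueIterator<String, MyValue> iter = store.range("abc", "abd")) {
    while (iter.hasNext()) {
        KeyValue<String, MyValue> entry = iter.next();
        if (entry.key.startsWith("abc")) {
            // use entry.key / entry.value here
        }
    }
}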


-Matthias

On 2/13/19 6:59 AM, Nan Xu wrote:
> Range query would be enough for me, is there an example? which api I can
> call for this?
> 
> Thanks,
> Nan
> 
> On Tue, Feb 12, 2019 at 6:17 PM Matthias J. Sax 
> wrote:
> 
>> You could do a range query from "abc" to "abd" for example (in this
>> case, you would need to make sure to check the results from the iterator
>> and drop "abd" though).
>>
>> Note that range queries are executed on the raw bytes. Thus, you need
>> to understand how the serializers you use work. If in doubt, you may want to
>> use conservative ranges and apply a filter on the iterator to ignore
>> false positives.
>>
>> Also, this only works for prefix queries, i.e., if you query with a known
>> prefix of the key.
>>
>> Hope this helps.
>>
>> -Matthias
>>
>> On 2/12/19 8:25 AM, Nan Xu wrote:
>>> Hi,
>>>
>>> Just wondering if there is a way to do a SQL-like "select key,
>>> value.field1 from ktable where key like 'abc%'".
>>> The purpose of this is to select some values from a KTable without a
>>> fully defined key.  Store.all() and then filtering on them would be very
>>> inefficient if the store is big.
>>>
>>> Thanks,
>>> Nan
>>>
>>
>>
> 





Re: kstream transform forward to different topics

2019-02-13 Thread Matthias J. Sax
The goal of KIP-307 is a different one. It's about providing names to
make debugging easier.

Thus, I don't think "307 is doing it the wrong way" -- the question is what
problem is being addressed, and KIP-307 addresses a different one than the
one discussed in this thread.
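
As an aside, if the underlying goal is just "forward transformed records to
different topics", a per-record TopicNameExtractor (available since 2.0 via
KIP-303) may already be enough today. A rough sketch with made-up topic names,
assuming String keys and values, where 'transformed' stands in for the KStream
returned by your transform() step:

import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.KStream;

StreamsBuilder builder = new StreamsBuilder();
KStream<String, String> transformed = builder.stream("input-topic");

// pick the sink topic record by record
transformed.to((key, value, recordContext) ->
        key.startsWith("a") ? "topic-a" : "topic-b");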


-Matthias

On 2/13/19 1:53 AM, Jan Filipiak wrote:
> For now, just use the name it gets automatically, or crack the
> AbstractStream open with reflection ;)
> 
> 307 is doing it the wrong way again, just make the name accessible instead
> of making users supply them :face_with_rolling_eyes:
> 
> On 08.02.2019 02:36, Guozhang Wang wrote:
>> Hi Nan,
>>
>> Glad it helps with your case. Just another note that in the next release
>> when KIP-307 is in place [1], you can actually combine the DSL with PAPI by
>> naming the last operator that creates your transformed KStream, and then
>> manually add the sink nodes like:
>>
>> stream2 = stream1.transform(Named.as("myName"));
>>
>> topology = builder.build();
>>
>> // continue adding to the built topology
>> topology.addSink(... "myName");
>>
>> -





Re: Kafka Streams & Distributed state question

2019-02-13 Thread Bill Bejeck
Hi Andrian,

There is an existing framework for Interactive Queries contributed by
Lightbend - https://github.com/lightbend/kafka-streams-query
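
Under the hood, that kind of lookup comes down to asking the KafkaStreams
instance which host owns a given key. A minimal sketch of that step with the
plain Streams API (store and key names are made up; every instance must set
the "application.server" config to its own host:port):

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.state.StreamsMetadata;

// 'streams' is the running KafkaStreams instance
StreamsMetadata metadata = streams.metadataForKey(
        "my-store", "some-key", Serdes.String().serializer());
String host = metadata.hostInfo().host();  // forward the REST call here ...
int port = metadata.hostInfo().port();     // ... if it is not the local instance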

HTH,
Bill



On Wed, Feb 13, 2019 at 10:14 AM Ryanne Dolan  wrote:

> Andrian, this looks useful:
>
>
> https://docs.confluent.io/current/streams/developer-guide/interactive-queries.html
>
> The gist is you'd need to build your own routing and discovery layers.
>
> Also consider materializing your data to an external data store, e.g.
> Couchbase or Hive, which provides this functionality already.
>
> Ryanne
>
> On Wed, Feb 13, 2019, 5:16 AM Andrian Jardan  wrote:
>
> > Hello dear Kafka community,
> >
> > We are planning to build a Kafka Streams application that will build a
> > pretty big state (~100Gb) in real time from various sources.
> >
> > We later on need to query this state as fast as possible, and the main
> > idea is to use the built-in in-memory RocksDB and get the data via
> > some sort of REST API we will build on top. The question is how do we
> > identify where the data we need is, since the state will obviously
> > not fit in the memory of a single instance and we need to scale somehow in
> > case this state keeps growing, and it will…
> >
> > Is there a way to identify where the data we care about resides, i.e. on
> > which Kafka Streams app instance?
> >
> > I tried to find the answer in the documentation, but was not able to
> > figure it out unfortunately.
> >
> > Thank you in advance !
> >
> > —
> > Andrian Jardan
> > Infrastructure and DevOps expert
> > cell: +49 174 2815994
> > Skype: macrosdnb
> >
> >
>


Re: Kafka Streams & Distributed state question

2019-02-13 Thread Ryanne Dolan
Andrian, this looks useful:

https://docs.confluent.io/current/streams/developer-guide/interactive-queries.html

The gist is you'd need to build your own routing and discovery layers.

Also consider materializing your data to an external data store, e.g.
Couchbase or Hive, which provides this functionality already.

Ryanne

On Wed, Feb 13, 2019, 5:16 AM Andrian Jardan wrote:

> Hello dear Kafka community,
>
> We are planning to build a Kafka Streams application that will build a
> pretty big state (~100Gb) in real time from various sources.
>
> We later on need to query this state as fast as possible, and the main
> idea is to use the built-in in-memory RocksDB and get the data via
> some sort of REST API we will build on top. The question is how do we
> identify where the data we need is, since the state will obviously
> not fit in the memory of a single instance and we need to scale somehow in
> case this state keeps growing, and it will…
>
> Is there a way to identify where the data we care about resides, i.e. on
> which Kafka Streams app instance?
>
> I tried to find the answer in the documentation, but was not able to
> figure it out unfortunately.
>
> Thank you in advance !
>
> —
> Andrian Jardan
> Infrastructure and DevOps expert
> cell: +49 174 2815994
> Skype: macrosdnb
>
>


Re: Can I query ktable/stream/store with SQL like statment.

2019-02-13 Thread Nan Xu
Range query would be enough for me, is there an example? which api I can
call for this?

Thanks,
Nan

On Tue, Feb 12, 2019 at 6:17 PM Matthias J. Sax 
wrote:

> You could do a range query from "abc" to "abd" for example (in this
> case, you would need to make sure to check the results from the iterator
> and drop "abd" though).
>
> Note that range queries are executed on the raw bytes. Thus, you need
> to understand how the serializers you use work. If in doubt, you may want to
> use conservative ranges and apply a filter on the iterator to ignore
> false positives.
>
> Also, this only works for prefix queries, i.e., if you query with a known
> prefix of the key.
>
> Hope this helps.
>
> -Matthias
>
> On 2/12/19 8:25 AM, Nan Xu wrote:
> > Hi,
> >
> > Just wondering if there is a way to do a SQL-like "select key,
> > value.field1 from ktable where key like 'abc%'".
> > The purpose of this is to select some values from a KTable without a
> > fully defined key.  Store.all() and then filtering on them would be very
> > inefficient if the store is big.
> >
> > Thanks,
> > Nan
> >
>
>


Re: Kafka Cluster Consumer - Failover not working

2019-02-13 Thread Jorit Hagedorn
Hi,

that did the trick, thanks!


On 2/13/19 12:09 PM, M. Manna wrote:
> In that case do let us know if it worked for you.
>
> On Wed, 13 Feb 2019 at 11:07, Jorit Hagedorn  wrote:
>
>> Hi,
>>
>> I did not know about this configuration file. Content:
>>
>> bootstrap.servers=localhost:9092
>> group.id=test-consumer-group
>>
>> Think I know what to do now.
>>
>> Kind Regards
>>
>> Jorit
>>
>> On 2/13/19 11:38 AM, M. Manna wrote:
>>> What is your consumer prop file ? How have you adjusted the properties?
>>>
>>>
>>>
>>> On Wed, 13 Feb 2019 at 09:41, Jorit Hagedorn 
>> wrote:
 Hello,

 I've set up a Kafka / ZooKeeper cluster with 3 VMs:

 101
 201
 102

 These 3 servers run as a cluster. Configs below:

 Kafka:

 broker.id=101  #  102 and 201
 num.network.threads=3
 num.io.threads=8
 socket.send.buffer.bytes=102400
 socket.receive.buffer.bytes=102400
 socket.request.max.bytes=104857600
 log.dirs=/opt/kafka/kafka-data
 num.partitions=1
 num.recovery.threads.per.data.dir=1
 offsets.topic.replication.factor=1
 transaction.state.log.replication.factor=1
 transaction.state.log.min.isr=1
 log.retention.hours=168
 log.segment.bytes=1073741824
 log.retention.check.interval.ms=30
 zookeeper.connect=10.1.221.13:2181,10.2.172.13:2181,10.1.221.16:2181
 zookeeper.connection.timeout.ms=6000
 group.initial.rebalance.delay.ms=0


 Zookeeper:

 dataDir=/opt/kafka/zookeeper-data
 clientPort=2181
 initLimit=10
 syncLimit=5
 maxClientCnxns=0

 server.101=10.1.221.13:2888:3888
 server.201=10.2.172.13:2888:3888
 server.102=10.1.221.16:2888:3888


 Full cluster is available:


 /opt/kafka/bin/zookeeper-shell.sh localhost:2181 <<< "ls /brokers/ids"
 Connecting to localhost:2181
 Welcome to ZooKeeper!
 JLine support is disabled

 WATCHER::

 WatchedEvent state:SyncConnected type:None path:null
 [101, 102, 201]

 After that, I've created a replicated topic:

 /opt/kafka/bin/kafka-topics.sh --create --zookeeper localhost:2181
 --replication-factor 3 --partitions 1 --topic myreplicatedtopic

 Verify:

 /opt/kafka/bin/kafka-topics.sh --describe --topic myreplicatedtopic
 --zookeeper localhost:2181
 Topic:myreplicatedtopic    PartitionCount:1    ReplicationFactor:3    Configs:
     Topic: myreplicatedtopic    Partition: 0    Leader: 102    Replicas: 102,101,201    Isr: 102,101,201


 The issue is that a consumer which connects to node 102 for example will
 stop working and not use any of the other 2 servers for failover.
 As a consumer we currently have logstash. The consumer seems to fail
 once the first server that started in the cluster becomes unavailable
 for some reason.

 Producing messages (with filebeat) is always working, as long as the
 majority of the servers are up, which is the expected behaviour.

 What am I missing here?

 Kind Regards

 Jorit









Kafka Streams & Distributed state question

2019-02-13 Thread Andrian Jardan
Hello dear Kafka community,

We are planning to build a Kafka Streams application that will build a pretty 
big state (~100Gb) in real time from various sources.

We later on need to query this state as fast as possible, and the main idea is
to use the built-in in-memory RocksDB and get the data via
some sort of REST API we will build on top. The question is how do we identify
where the data we need is, since the state will obviously
not fit in the memory of a single instance and we need to scale somehow in case
this state keeps growing, and it will…

Is there a way to identify where the data we care about resides, i.e. on which
Kafka Streams app instance?

I tried to find the answer in the documentation, but was not able to figure it
out unfortunately.

Thank you in advance !

—
Andrian Jardan
Infrastructure and DevOps expert
cell: +49 174 2815994
Skype: macrosdnb



Re: Kafka Cluster Consumer - Failover not working

2019-02-13 Thread M. Manna
In that case do let us know if it worked for you.

On Wed, 13 Feb 2019 at 11:07, Jorit Hagedorn  wrote:

> Hi,
>
> I did not know about this configuration file. Content:
>
> bootstrap.servers=localhost:9092
> group.id=test-consumer-group
>
> Think I know what to do now.
>
> Kind Regards
>
> Jorit
>
> On 2/13/19 11:38 AM, M. Manna wrote:
> > What is your consumer prop file ? How have you adjusted the properties?
> >
> >
> >
> > On Wed, 13 Feb 2019 at 09:41, Jorit Hagedorn 
> wrote:
> >
> >> Hello,
> >>
> >> I've set up a Kafka / ZooKeeper cluster with 3 VMs:
> >>
> >> 101
> >> 201
> >> 102
> >>
> >> These 3 servers run as a cluster. Configs below:
> >>
> >> Kafka:
> >>
> >> broker.id=101  #  102 and 201
> >> num.network.threads=3
> >> num.io.threads=8
> >> socket.send.buffer.bytes=102400
> >> socket.receive.buffer.bytes=102400
> >> socket.request.max.bytes=104857600
> >> log.dirs=/opt/kafka/kafka-data
> >> num.partitions=1
> >> num.recovery.threads.per.data.dir=1
> >> offsets.topic.replication.factor=1
> >> transaction.state.log.replication.factor=1
> >> transaction.state.log.min.isr=1
> >> log.retention.hours=168
> >> log.segment.bytes=1073741824
> >> log.retention.check.interval.ms=30
> >> zookeeper.connect=10.1.221.13:2181,10.2.172.13:2181,10.1.221.16:2181
> >> zookeeper.connection.timeout.ms=6000
> >> group.initial.rebalance.delay.ms=0
> >>
> >>
> >> Zookeeper:
> >>
> >> dataDir=/opt/kafka/zookeeper-data
> >> clientPort=2181
> >> initLimit=10
> >> syncLimit=5
> >> maxClientCnxns=0
> >>
> >> server.101=10.1.221.13:2888:3888
> >> server.201=10.2.172.13:2888:3888
> >> server.102=10.1.221.16:2888:3888
> >>
> >>
> >> Full cluster is available:
> >>
> >>
> >> /opt/kafka/bin/zookeeper-shell.sh localhost:2181 <<< "ls /brokers/ids"
> >> Connecting to localhost:2181
> >> Welcome to ZooKeeper!
> >> JLine support is disabled
> >>
> >> WATCHER::
> >>
> >> WatchedEvent state:SyncConnected type:None path:null
> >> [101, 102, 201]
> >>
> >> After that, I've created a replicated topic:
> >>
> >> /opt/kafka/bin/kafka-topics.sh --create --zookeeper localhost:2181
> >> --replication-factor 3 --partitions 1 --topic myreplicatedtopic
> >>
> >> Verify:
> >>
> >> /opt/kafka/bin/kafka-topics.sh --describe --topic myreplicatedtopic
> >> --zookeeper localhost:2181
> >> Topic:myreplicatedtopic    PartitionCount:1    ReplicationFactor:3    Configs:
> >>     Topic: myreplicatedtopic    Partition: 0    Leader: 102    Replicas: 102,101,201    Isr: 102,101,201
> >>
> >>
> >> The issue is that a consumer which connects to node 102 for example will
> >> stop working and not use any of the other 2 servers for failover.
> >> As a consumer we currently have logstash. The consumer seems to fail
> >> once the first server that started in the cluster becomes unavailable
> >> for some reason.
> >>
> >> Producing messages (with filebeat) is always working, as long as the
> >> majority of the servers are up, which is the expected behaviour.
> >>
> >> What am I missing here?
> >>
> >> Kind Regards
> >>
> >> Jorit
> >>
> >>
> >>
> --
> Jorit Hagedorn
> Linux System Engineer
>
> Plus.line AG
> Erlenstr. 2
> 60325 Frankfurt am Main
>
> Internet - Hosting - MPLS VPN
>
> Tel.:   +49 (0)69 758915-126
> Fax:+49 (0)69 758915-33
> E-Mail: jhaged...@plusline.de
> WWW:http://www.plusline.de
>
> Handelsregister: HRB 53629 Frankfurt am Main
> Vorstand: Richard Gresek
> Aufsichtsratsvorsitzender: Horst E. Eckhard
>
>
>


Re: Kafka Cluster Consumer - Failover not working

2019-02-13 Thread Jorit Hagedorn
Hi,

I did not know about this configuration file. Content:

bootstrap.servers=localhost:9092
group.id=test-consumer-group

Think I know what to do now.
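
Presumably something along these lines -- listing all three brokers so the
client can still bootstrap when one of them is down (port 9092 is an
assumption, since the broker configs use the default listener), and on the
broker side raising offsets.topic.replication.factor to 3 so committed
offsets also survive a broker outage:

bootstrap.servers=10.1.221.13:9092,10.2.172.13:9092,10.1.221.16:9092
group.id=test-consumer-group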

Kind Regards

Jorit

On 2/13/19 11:38 AM, M. Manna wrote:
> What is your consumer prop file ? How have you adjusted the properties?
>
>
>
> On Wed, 13 Feb 2019 at 09:41, Jorit Hagedorn  wrote:
>
>> Hello,
>>
>> I've set up a Kafka / ZooKeeper cluster with 3 VMs:
>>
>> 101
>> 201
>> 102
>>
>> These 3 servers run as a cluster. Configs below:
>>
>> Kafka:
>>
>> broker.id=101  #  102 and 201
>> num.network.threads=3
>> num.io.threads=8
>> socket.send.buffer.bytes=102400
>> socket.receive.buffer.bytes=102400
>> socket.request.max.bytes=104857600
>> log.dirs=/opt/kafka/kafka-data
>> num.partitions=1
>> num.recovery.threads.per.data.dir=1
>> offsets.topic.replication.factor=1
>> transaction.state.log.replication.factor=1
>> transaction.state.log.min.isr=1
>> log.retention.hours=168
>> log.segment.bytes=1073741824
>> log.retention.check.interval.ms=30
>> zookeeper.connect=10.1.221.13:2181,10.2.172.13:2181,10.1.221.16:2181
>> zookeeper.connection.timeout.ms=6000
>> group.initial.rebalance.delay.ms=0
>>
>>
>> Zookeeper:
>>
>> dataDir=/opt/kafka/zookeeper-data
>> clientPort=2181
>> initLimit=10
>> syncLimit=5
>> maxClientCnxns=0
>>
>> server.101=10.1.221.13:2888:3888
>> server.201=10.2.172.13:2888:3888
>> server.102=10.1.221.16:2888:3888
>>
>>
>> Full cluster is available:
>>
>>
>> /opt/kafka/bin/zookeeper-shell.sh localhost:2181 <<< "ls /brokers/ids"
>> Connecting to localhost:2181
>> Welcome to ZooKeeper!
>> JLine support is disabled
>>
>> WATCHER::
>>
>> WatchedEvent state:SyncConnected type:None path:null
>> [101, 102, 201]
>>
>> After that, I've created a replicated topic:
>>
>> /opt/kafka/bin/kafka-topics.sh --create --zookeeper localhost:2181
>> --replication-factor 3 --partitions 1 --topic myreplicatedtopic
>>
>> Verify:
>>
>> /opt/kafka/bin/kafka-topics.sh --describe --topic myreplicatedtopic
>> --zookeeper localhost:2181
>> Topic:myreplicatedtopic    PartitionCount:1    ReplicationFactor:3    Configs:
>>     Topic: myreplicatedtopic    Partition: 0    Leader: 102    Replicas: 102,101,201    Isr: 102,101,201
>>
>>
>> The issue is that a consumer which connects to node 102 for example will
>> stop working and not use any of the other 2 servers for failover.
>> As a consumer we currently have logstash. The consumer seems to fail
>> once the first server that started in the cluster becomes unavailable
>> for some reason.
>>
>> Producing messages (with filebeat) is always working, as long as the
>> majority of the servers are up, which is the expected behaviour.
>>
>> What am I missing here?
>>
>> Kind Regards
>>
>> Jorit
>>
>>
>>
-- 
Jorit Hagedorn
Linux System Engineer

Plus.line AG
Erlenstr. 2
60325 Frankfurt am Main

Internet - Hosting - MPLS VPN

Tel.:   +49 (0)69 758915-126
Fax:+49 (0)69 758915-33
E-Mail: jhaged...@plusline.de
WWW:http://www.plusline.de

Handelsregister: HRB 53629 Frankfurt am Main
Vorstand: Richard Gresek
Aufsichtsratsvorsitzender: Horst E. Eckhard 






Re: Kafka Cluster Consumer - Failover not working

2019-02-13 Thread M. Manna
What is your consumer prop file? How have you adjusted the properties?



On Wed, 13 Feb 2019 at 09:41, Jorit Hagedorn  wrote:

> Hello,
>
> I've set up a Kafka / ZooKeeper cluster with 3 VMs:
>
> 101
> 201
> 102
>
> These 3 servers run as a cluster. Configs below:
>
> Kafka:
>
> broker.id=101  #  102 and 201
> num.network.threads=3
> num.io.threads=8
> socket.send.buffer.bytes=102400
> socket.receive.buffer.bytes=102400
> socket.request.max.bytes=104857600
> log.dirs=/opt/kafka/kafka-data
> num.partitions=1
> num.recovery.threads.per.data.dir=1
> offsets.topic.replication.factor=1
> transaction.state.log.replication.factor=1
> transaction.state.log.min.isr=1
> log.retention.hours=168
> log.segment.bytes=1073741824
> log.retention.check.interval.ms=30
> zookeeper.connect=10.1.221.13:2181,10.2.172.13:2181,10.1.221.16:2181
> zookeeper.connection.timeout.ms=6000
> group.initial.rebalance.delay.ms=0
>
>
> Zookeeper:
>
> dataDir=/opt/kafka/zookeeper-data
> clientPort=2181
> initLimit=10
> syncLimit=5
> maxClientCnxns=0
>
> server.101=10.1.221.13:2888:3888
> server.201=10.2.172.13:2888:3888
> server.102=10.1.221.16:2888:3888
>
>
> Full cluster is available:
>
>
> /opt/kafka/bin/zookeeper-shell.sh localhost:2181 <<< "ls /brokers/ids"
> Connecting to localhost:2181
> Welcome to ZooKeeper!
> JLine support is disabled
>
> WATCHER::
>
> WatchedEvent state:SyncConnected type:None path:null
> [101, 102, 201]
>
> After that, I've created a replicated topic:
>
> /opt/kafka/bin/kafka-topics.sh --create --zookeeper localhost:2181
> --replication-factor 3 --partitions 1 --topic myreplicatedtopic
>
> Verify:
>
> /opt/kafka/bin/kafka-topics.sh --describe --topic myreplicatedtopic
> --zookeeper localhost:2181
> Topic:myreplicatedtopic    PartitionCount:1    ReplicationFactor:3    Configs:
>     Topic: myreplicatedtopic    Partition: 0    Leader: 102    Replicas: 102,101,201    Isr: 102,101,201
>
>
> The issue is that a consumer which connects to node 102 for example will
> stop working and not use any of the other 2 servers for failover.
> As a consumer we currently have logstash. The consumer seems to fail
> once the first server that started in the cluster becomes unavailable
> for some reason.
>
> Producing messages (with filebeat) is always working, as long as the
> majority of the servers are up, which is the expected behaviour.
>
> What am I missing here?
>
> Kind Regards
>
> Jorit
>
>
>


Re: kstream transform forward to different topics

2019-02-13 Thread Jan Filipiak
For now, just use the name it gets automatically, or crack the
AbstractStream open with reflection ;)

307 is doing it the wrong way again, just make the name accessible instead
of making users supply them :face_with_rolling_eyes:

On 08.02.2019 02:36, Guozhang Wang wrote:
> Hi Nan,
> 
> Glad it helps with your case. Just another note that in the next release
> when KIP-307 is in place [1], you can actually combine the DSL with PAPI by
> naming the last operator that creates your transformed KStream, and then
> manually add the sink nodes like:
> 
> stream2 = stream1.transform(Named.as("myName"));
> 
> topology = builder.build();
> 
> // continue adding to the built topology
> topology.addSink(... "myName");
> 
> -


Kafka Cluster Consumer - Failover not working

2019-02-13 Thread Jorit Hagedorn
Hello,

I've set up a Kafka / ZooKeeper cluster with 3 VMs:

101
201
102

These 3 servers run as a cluster. Configs below:

Kafka:

broker.id=101  #  102 and 201
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
log.dirs=/opt/kafka/kafka-data
num.partitions=1
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
log.retention.hours=168
log.segment.bytes=1073741824
log.retention.check.interval.ms=30
zookeeper.connect=10.1.221.13:2181,10.2.172.13:2181,10.1.221.16:2181
zookeeper.connection.timeout.ms=6000
group.initial.rebalance.delay.ms=0


Zookeeper:

dataDir=/opt/kafka/zookeeper-data
clientPort=2181
initLimit=10
syncLimit=5
maxClientCnxns=0

server.101=10.1.221.13:2888:3888
server.201=10.2.172.13:2888:3888
server.102=10.1.221.16:2888:3888


Full cluster is available:


/opt/kafka/bin/zookeeper-shell.sh localhost:2181 <<< "ls /brokers/ids"
Connecting to localhost:2181
Welcome to ZooKeeper!
JLine support is disabled

WATCHER::

WatchedEvent state:SyncConnected type:None path:null
[101, 102, 201]

After that, I've created a replicated topic:

/opt/kafka/bin/kafka-topics.sh --create --zookeeper localhost:2181
--replication-factor 3 --partitions 1 --topic myreplicatedtopic

Verify:

/opt/kafka/bin/kafka-topics.sh --describe --topic myreplicatedtopic
--zookeeper localhost:2181
Topic:myreplicatedtopic    PartitionCount:1    ReplicationFactor:3    Configs:
    Topic: myreplicatedtopic    Partition: 0    Leader: 102    Replicas: 102,101,201    Isr: 102,101,201


The issue is that a consumer which connects to node 102 for example will
stop working and not use any of the other 2 servers for failover.
As a consumer we currently have logstash. The consumer seems to fail
once the first server that started in the cluster becomes unavailable
for some reason.

Producing messages (with filebeat) is always working, as long as the
majority of the servers are up, which is the expected behaviour.

What am I missing here?

Kind Regards

Jorit



