Re: Partitions reassignment is failing in Kafka 1.1.0

2018-06-30 Thread Ted Yu
Code snippet from ReassignPartitionsCommand.scala:

  "log_dirs" -> replicas.map(r =>
    replicaLogDirAssignment.getOrElse(
      new TopicPartitionReplica(tp.topic, tp.partition, r), AnyLogDir)).asJava

We know that the appearance of "any" comes from the getOrElse fallback
above. Arguably there is a bug in this code: the number of AnyLogDir
entries should match the length of the replicas list, or "log_dirs" should
be omitted in that case.
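
If that reading is right, the fix could normalize each entry before
serializing it. A rough sketch (not the actual patch):

  // Sketch only: make "log_dirs" either match replicas in length or be omitted.
  def normalizeLogDirs(replicas: Seq[Int], logDirs: Seq[String]): Option[Seq[String]] =
    if (logDirs.isEmpty) None                             // drop "log_dirs" from the JSON entirely
    else if (logDirs.size == replicas.size) Some(logDirs)
    else Some(replicas.map(_ => "any"))                   // fall back to one "any" per replica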

On Sat, Jun 30, 2018 at 12:06 AM, Manikumar 
wrote:

> It will be taken as the "any" directory for each replica, which means the
> replica will be placed on any one of the configured directories on that
> broker.
>
> Since "log_dirs" is optional, you can remove it from the json.
>
> On Sat, Jun 30, 2018 at 12:02 PM Debraj Manna 
> wrote:
>
> > It is a problem on my side. The code was changing the replicas count but
> > not the log_dirs. Since I am migrating from 0.10, this part of the code
> > was not changed.
> >
> > I have a follow-up question: what is the default value of log_dirs if I
> > don't specify it in reassignment.json?
> >
> > On Sat, Jun 30, 2018 at 11:15 AM, Debraj Manna  >
> > wrote:
> >
> > > I am generating the reassignment.json like below
> > >
> > > /home/ubuntu/deploy/kafka/bin/kafka-reassign-partitions.sh --zookeeper
> > 127.0.0.1:2181 --generate --topics-to-move-json-file
> > /home/ubuntu/deploy/kafka/topics_to_move.json --broker-list '%s' | tail -1 >
> > /home/ubuntu/deploy/kafka/reassignment.json
> > >
> > > Then I am doing the reassignment using the generated file
> > >
> > > /home/ubuntu/deploy/kafka/bin/kafka-reassign-partitions.sh --zookeeper
> > 127.0.0.1:2181 --execute --reassignment-json-file
> > /home/ubuntu/deploy/kafka/reassignment.json
> > >
> > > The kafka-reassign-partitions.sh help states:
> > >
> > >> The JSON file with the partition reassignment configuration. The format
> > >> to use is -
> > >> {"partitions": [{"topic": "foo", "partition": 1, "replicas": [1,2,3],
> > >> "log_dirs": ["dir1","dir2","dir3"]}], "version":1}
> > >> Note that "log_dirs" is optional. When it is specified, its length must
> > >> equal the length of the replicas list. The value in this list can be
> > >> either "any" or the absolute path of the log directory on the broker.
> > >> If an absolute log directory path is specified, it is currently
> > >> required that the replica has not already been created on that broker.
> > >> The replica will then be created in the specified log directory on the
> > >> broker later.
> > >
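> > > If I read the help right, a fixed-up entry for my failing partition
> > > would presumably need one "any" per replica, i.e. something like the
> > > below (my guess, not verified):
> > >
> > > {"version": 1, "partitions": [{"topic": "Topic3", "partition": 7,
> > >  "replicas": [3, 0, 2], "log_dirs": ["any", "any", "any"]}]}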
> > >
> > > So it appears the reassignment json that is generated by
> > > kafka-reassign-partitions.sh is creating an issue with log_dirs. Is
> > > this an issue in kafka-reassign-partitions.sh or some misconfiguration
> > > on my side?
> > >
> > > On Sat, Jun 30, 2018 at 10:26 AM, Debraj Manna <
> subharaj.ma...@gmail.com
> > >
> > > wrote:
> > >
> > >> Please find the server.properties from one of the brokers.
> > >>
> > >> broker.id=0
> > >> port=9092
> > >> num.network.threads=3
> > >> num.io.threads=8
> > >> socket.send.buffer.bytes=102400
> > >> socket.receive.buffer.bytes=102400
> > >> socket.request.max.bytes=104857600
> > >> log.dirs=/var/lib/kafka/kafka-logs
> > >> num.recovery.threads.per.data.dir=1
> > >> log.retention.hours=36
> > >> log.retention.bytes=1073741824
> > >> log.segment.bytes=536870912
> > >> log.retention.check.interval.ms=30
> > >> log.cleaner.enable=false
> > >> zookeeper.connect=platform1:2181,platform2:2181,platform3:2181
> > >> message.max.bytes=1500
> > >> replica.fetch.max.bytes=1500
> > >> auto.create.topics.enable=true
> > >> zookeeper.connection.timeout.ms=6000
> > >> unclean.leader.election.enable=false
> > >> delete.topic.enable=false
> > >> offsets.topic.replication.factor=1
> > >> transaction.state.log.replication.factor=1
> > >> transaction.state.log.min.isr=1
> > >>
> > >> I have placed server.log from a broker at
> > >> https://gist.github.com/debraj-manna/4b4bdae8a1c15c36b313a04f37e8776d
> > >>
> > >> On Sat, Jun 30, 2018 at 8:16 AM, Ted Yu  wrote:
> > >>
> > >>> Seems to be related to KIP-113.
> > >>>
> > >>> server.properties didn't go through. Do you mind pastebin'ing its
> > >>> content?
> > >>>
> > >>> If you can pastebin logs from broker, that should help.
> > >>>
> > >>> Thanks
> > >>>
> > >>> On Fri, Jun 29, 2018 at 10:37 AM, Debraj Manna <
> > subharaj.ma...@gmail.com
> > >>> >
> > >>> wrote:
> > >>>
> > >>> > Hi
> > >>> >
> > >>> > I altered a topic like below in kafka 1.1.0
> > >>> >
> > >>> > /home/ubuntu/deploy/kafka/bin/kafka-topics.sh --zookeeper
> > >>> 127.0.0.1:2181
> > >>> > --alter --topic Topic3 --config min.insync.replicas=2
> > >>> >
> > >>> > But whenever I try to verify the reassignment, it shows the below
> > >>> > exception
> > >>> >
> > >>> > /home/ubuntu/deploy/kafka/bin/kafka-reassign-partitions.sh
> > >>> --zookeeper 127.0.0.1:2181 --reassignment-json-file
> > >>> /home/ubuntu/deploy/kafka/reassignment.json --verify
> > >>> >
> > >>> > Partitions

How to fix ISR/URP when there is no new FetchRequest made?

2018-06-30 Thread Henry Cai
We have URP reported for some empty topics, which is very annoying.

The listings below show that both the leader (1004) and the follower (1001)
have exactly the same content.

They probably got into this state when some brokers were prematurely
stopped and restarted.

Looking at the code, it seems the ISR information for a partition only gets
updated in ZK when the leader broker processes a new FetchRequest from the
follower. But there is no new traffic on that topic, so how do we trigger
the ISR update and fix the URP?
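
(Would producing one throwaway record to the affected topic force a new
round of fetches and an ISR update, e.g. via

  kafka-console-producer.sh --broker-list localhost:9092 --topic <affected-topic>

or do we need to bounce the follower broker? The broker address above is a
placeholder; this is just a guess on our side.)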

***
(.venv) [hcai@dev-hcai:~]$ ssh m10nrealtimedevkafka01004 "ls -l
/mnt/kafka/realtime_spend_reducer_dev-global_spend_segmented_window-repartition-10"
Warning: Permanently added the RSA host key for IP address '10.1.228.255'
to the list of known hosts.
total 0
-rw-rw-r-- 1 kafka kafka 52428800 Jun 29 06:48 .index
-rw-rw-r-- 1 kafka kafka        0 Jun 13 22:08 .log
-rw-rw-r-- 1 kafka kafka 52428792 Jun 29 06:48 .timeindex
-rw-rw-r-- 1 kafka kafka        0 Jun 13 22:08 leader-epoch-checkpoint
(.venv) [hcai@dev-hcai:~]$ ssh m10nrealtimedevkafka01001 "ls -l
/mnt/kafka/realtime_spend_reducer_dev-global_spend_segmented_window-repartition-10"
Warning: Permanently added the RSA host key for IP address '10.1.60.114' to
the list of known hosts.
total 0
-rw-rw-r-- 1 kafka kafka 52428800 Jun 29 06:36 .index
-rw-rw-r-- 1 kafka kafka        0 Jun 29 04:29 .log
-rw-rw-r-- 1 kafka kafka 52428792 Jun 29 06:36 .timeindex
-rw-rw-r-- 1 kafka kafka        0 Jun 29 04:29 leader-epoch-checkpoint


Kafka Streams - Shared Ktable State Store

2018-06-30 Thread ade brogan
Hi

We are using the kafka streams api. We have come across an issue which we
need a short-term fix for.

We host 2 independent kafka clusters in 2 different data centres, and these
clusters are mirrored using mirrormaker. The problem this brings is that we
cannot use KTables backed by RocksDB, because when traffic flips from one
datacentre to the other, the local state will not be populated with the
previous state from the other datacentre.

This would not be an issue if we had one logical kafka cluster. We are
moving to this setup, but in the short term need a solution.

The kafka docs say that you can plug in a custom database for ktables, so
we are exploring the possibility of using a shared database (e.g. mongo or
oracle) to store the state. This database is visible to both kafka clusters
and acts as a single source of truth.
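
For concreteness, the extension point we are looking at is implementing
KeyValueStore<Bytes, byte[]> plus a KeyValueBytesStoreSupplier and plugging
it into an aggregation via Materialized.as(supplier). Below is a rough,
untested sketch of that idea; the class names are made up and a
ConcurrentSkipListMap stands in for the shared-database client:

  import java.util
  import java.util.concurrent.ConcurrentSkipListMap
  import org.apache.kafka.common.utils.Bytes
  import org.apache.kafka.streams.KeyValue
  import org.apache.kafka.streams.processor.{ProcessorContext, StateStore}
  import org.apache.kafka.streams.state.{KeyValueBytesStoreSupplier, KeyValueIterator, KeyValueStore}

  // Hypothetical store; the map below stands in for a client to the shared DB.
  class SharedDbStore(storeName: String) extends KeyValueStore[Bytes, Array[Byte]] {
    private val db = new ConcurrentSkipListMap[Bytes, Array[Byte]]()
    @volatile private var open = false

    override def name(): String = storeName
    // A production store would also register itself with the context; omitted here.
    override def init(context: ProcessorContext, root: StateStore): Unit = { open = true }
    override def flush(): Unit = ()            // a real store would push pending writes to the DB
    override def close(): Unit = { open = false }
    override def persistent(): Boolean = true  // state survives restarts: it lives in the shared DB
    override def isOpen(): Boolean = open

    override def get(key: Bytes): Array[Byte] = db.get(key)
    override def put(key: Bytes, value: Array[Byte]): Unit = db.put(key, value)
    override def putIfAbsent(key: Bytes, value: Array[Byte]): Array[Byte] = db.putIfAbsent(key, value)
    override def putAll(entries: util.List[KeyValue[Bytes, Array[Byte]]]): Unit =
      entries.forEach(kv => db.put(kv.key, kv.value))
    override def delete(key: Bytes): Array[Byte] = db.remove(key)
    override def approximateNumEntries(): Long = db.size().toLong

    override def range(from: Bytes, to: Bytes): KeyValueIterator[Bytes, Array[Byte]] =
      iterator(db.subMap(from, true, to, true))
    override def all(): KeyValueIterator[Bytes, Array[Byte]] = iterator(db)

    private def iterator(view: util.NavigableMap[Bytes, Array[Byte]]) =
      new KeyValueIterator[Bytes, Array[Byte]] {
        private val it = view.entrySet().iterator()
        private var buffered: util.Map.Entry[Bytes, Array[Byte]] = null
        private def fill(): Unit = if (buffered == null && it.hasNext) buffered = it.next()
        override def hasNext: Boolean = { fill(); buffered != null }
        override def next(): KeyValue[Bytes, Array[Byte]] = {
          fill(); val e = buffered; buffered = null; new KeyValue(e.getKey, e.getValue)
        }
        override def peekNextKey(): Bytes = { fill(); buffered.getKey }
        override def close(): Unit = ()
      }
  }

  // Handed to Streams, e.g. Materialized.as(new SharedDbStoreSupplier("exposure-store")).
  class SharedDbStoreSupplier(storeName: String) extends KeyValueBytesStoreSupplier {
    override def name(): String = storeName
    override def get(): KeyValueStore[Bytes, Array[Byte]] = new SharedDbStore(storeName)
    override def metricsScope(): String = "shared-db"
  }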

I guess my question is: does anybody see any reason why this would not
work?

Thanks
Ade


Re: Partitions reassignment is failing in Kafka 1.1.0

2018-06-30 Thread Debraj Manna
Thanks

On Sat, Jun 30, 2018 at 12:36 PM, Manikumar 
wrote:

> It will be taken as the "any" directory for each replica, which means the
> replica will be placed on any one of the configured directories on that
> broker.
>
> Since "log_dirs" is optional, you can remove it from the json.
>
> On Sat, Jun 30, 2018 at 12:02 PM Debraj Manna 
> wrote:
>
> > It is a problem on my side. The code was changing the replicas count but
> > not the log_dirs. Since I am migrating from 0.10, this part of the code
> > was not changed.
> >
> > I have a follow-up question: what is the default value of log_dirs if I
> > don't specify it in reassignment.json?
> >
> > On Sat, Jun 30, 2018 at 11:15 AM, Debraj Manna  >
> > wrote:
> >
> > > I am generating the reassignment.json like below
> > >
> > > /home/ubuntu/deploy/kafka/bin/kafka-reassign-partitions.sh --zookeeper
> > 127.0.0.1:2181 --generate --topics-to-move-json-file
> > /home/ubuntu/deploy/kafka/topics_to_move.json --broker-list '%s' | tail -1 >
> > /home/ubuntu/deploy/kafka/reassignment.json
> > >
> > > Then I am doing the reassignment using the generated file
> > >
> > > /home/ubuntu/deploy/kafka/bin/kafka-reassign-partitions.sh --zookeeper
> > 127.0.0.1:2181 --execute --reassignment-json-file
> > /home/ubuntu/deploy/kafka/reassignment.json
> > >
> > > The kafka-reassign-partitions.sh help states:
> > >
> > >> The JSON file with the partition reassignment configuration. The format
> > >> to use is -
> > >> {"partitions": [{"topic": "foo", "partition": 1, "replicas": [1,2,3],
> > >> "log_dirs": ["dir1","dir2","dir3"]}], "version":1}
> > >> Note that "log_dirs" is optional. When it is specified, its length must
> > >> equal the length of the replicas list. The value in this list can be
> > >> either "any" or the absolute path of the log directory on the broker.
> > >> If an absolute log directory path is specified, it is currently
> > >> required that the replica has not already been created on that broker.
> > >> The replica will then be created in the specified log directory on the
> > >> broker later.
> > >
> > >
> > > So it appears the reassignment json that is generated by
> > > kafka-reassign-partitions.sh is creating an issue with log_dirs. Is
> > > this an issue in kafka-reassign-partitions.sh or some misconfiguration
> > > on my side?
> > >
> > > On Sat, Jun 30, 2018 at 10:26 AM, Debraj Manna <
> subharaj.ma...@gmail.com
> > >
> > > wrote:
> > >
> > >> Please find the server.properties from one of the brokers.
> > >>
> > >> broker.id=0
> > >> port=9092
> > >> num.network.threads=3
> > >> num.io.threads=8
> > >> socket.send.buffer.bytes=102400
> > >> socket.receive.buffer.bytes=102400
> > >> socket.request.max.bytes=104857600
> > >> log.dirs=/var/lib/kafka/kafka-logs
> > >> num.recovery.threads.per.data.dir=1
> > >> log.retention.hours=36
> > >> log.retention.bytes=1073741824
> > >> log.segment.bytes=536870912
> > >> log.retention.check.interval.ms=30
> > >> log.cleaner.enable=false
> > >> zookeeper.connect=platform1:2181,platform2:2181,platform3:2181
> > >> message.max.bytes=1500
> > >> replica.fetch.max.bytes=1500
> > >> auto.create.topics.enable=true
> > >> zookeeper.connection.timeout.ms=6000
> > >> unclean.leader.election.enable=false
> > >> delete.topic.enable=false
> > >> offsets.topic.replication.factor=1
> > >> transaction.state.log.replication.factor=1
> > >> transaction.state.log.min.isr=1
> > >>
> > >> I have placed server.log from a broker at
> > >> https://gist.github.com/debraj-manna/4b4bdae8a1c15c36b313a04f37e8776d
> > >>
> > >> On Sat, Jun 30, 2018 at 8:16 AM, Ted Yu  wrote:
> > >>
> > >>> Seems to be related to KIP-113.
> > >>>
> > >>> server.properties didn't go through. Do you mind pastebin'ing its
> > >>> content?
> > >>>
> > >>> If you can pastebin logs from broker, that should help.
> > >>>
> > >>> Thanks
> > >>>
> > >>> On Fri, Jun 29, 2018 at 10:37 AM, Debraj Manna <
> > subharaj.ma...@gmail.com
> > >>> >
> > >>> wrote:
> > >>>
> > >>> > Hi
> > >>> >
> > >>> > I altered a topic like below in kafka 1.1.0
> > >>> >
> > >>> > /home/ubuntu/deploy/kafka/bin/kafka-topics.sh --zookeeper
> > >>> 127.0.0.1:2181
> > >>> > --alter --topic Topic3 --config min.insync.replicas=2
> > >>> >
> > >>> > But whenever I try to verify the reassignment, it shows the below
> > >>> > exception
> > >>> >
> > >>> > /home/ubuntu/deploy/kafka/bin/kafka-reassign-partitions.sh
> > >>> --zookeeper 127.0.0.1:2181 --reassignment-json-file
> > >>> /home/ubuntu/deploy/kafka/reassignment.json --verify
> > >>> >
> > >>> > Partitions reassignment failed due to Size of replicas list
> > >>> > Vector(3, 0, 2) is different from size of log dirs list Vector(any)
> > >>> > for partition Topic3-7
> > >>> > kafka.common.AdminCommandFailedException: Size of replicas list
> > >>> > Vector(3, 0, 2) is different from size of log dirs list Vector(any)
> > >>> > for partition Topic3-7
> > >>> >   at kafka.admin.ReassignPartitionsCommand$$anonfun$parsePartitio

Re: [kafka-clients] [VOTE] 2.0.0 RC1

2018-06-30 Thread Rajini Sivaram
Hi Manikumar,

Thank you for pointing that out, I had forgotten to drop the old artifacts.
New artifacts should be there now.

Regards,

Rajini

On Sat, Jun 30, 2018 at 7:44 AM, Manikumar 
wrote:

> looks like maven artifacts are not updated in the staging repo. They are
> still at the old timestamp.
> https://repository.apache.org/content/groups/staging/org/apache/kafka/kafka_2.11/2.0.0/
>
> On Sat, Jun 30, 2018 at 12:06 AM Rajini Sivaram 
> wrote:
>
>> Hello Kafka users, developers and client-developers,
>>
>>
>> This is the second candidate for release of Apache Kafka 2.0.0.
>>
>>
>> This is a major version release of Apache Kafka. It includes 40 new KIPs
>> and several critical bug fixes. Please see the 2.0.0 release plan for
>> more details:
>>
>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=80448820
>>
>>
>> A few notable highlights:
>>
>>- Prefixed wildcard ACLs (KIP-290), Fine grained ACLs for
>>CreateTopics (KIP-277)
>>- SASL/OAUTHBEARER implementation (KIP-255)
>>- Improved quota communication and customization of quotas (KIP-219,
>>KIP-257)
>>- Efficient memory usage for down conversion (KIP-283)
>>- Fix log divergence between leader and follower during fast leader
>>failover (KIP-279)
>>- Drop support for Java 7 and remove deprecated code including old
>>scala clients
>>- Connect REST extension plugin, support for externalizing secrets
>>and improved error handling (KIP-285, KIP-297, KIP-298 etc.)
>>- Scala API for Kafka Streams and other Streams API improvements
>>(KIP-270, KIP-150, KIP-245, KIP-251 etc.)
>>
>> Release notes for the 2.0.0 release:
>>
>> http://home.apache.org/~rsivaram/kafka-2.0.0-rc1/RELEASE_NOTES.html
>>
>>
>>
>> *** Please download, test and vote by Tuesday, July 3rd, 4pm PT
>>
>>
>> Kafka's KEYS file containing PGP keys we use to sign the release:
>>
>> http://kafka.apache.org/KEYS
>>
>>
>> * Release artifacts to be voted upon (source and binary):
>>
>> http://home.apache.org/~rsivaram/kafka-2.0.0-rc1/
>>
>>
>> * Maven artifacts to be voted upon:
>>
>> https://repository.apache.org/content/groups/staging/
>>
>>
>> * Javadoc:
>>
>> http://home.apache.org/~rsivaram/kafka-2.0.0-rc1/javadoc/
>>
>>
>> * Tag to be voted upon (off 2.0 branch) is the 2.0.0 tag:
>>
>> https://github.com/apache/kafka/tree/2.0.0-rc1
>>
>>
>> * Documentation:
>>
>> http://kafka.apache.org/20/documentation.html
>>
>>
>> * Protocol:
>>
>> http://kafka.apache.org/20/protocol.html
>>
>>
>> * Successful Jenkins builds for the 2.0 branch:
>>
>> Unit/integration tests: https://builds.apache.org/job/kafka-2.0-jdk8/66/
>>
>> System tests: https://jenkins.confluent.io/job/system-test-kafka/job/2.0/15/
>>
>>
>>
>> Please test and verify the release artifacts and submit a vote for this RC
>> or report any issues so that we can fix them and roll out a new RC ASAP!
>>
>> Although this release vote requires PMC votes to pass, testing, votes,
>> and bug
>> reports are valuable and appreciated from everyone.
>>
>>
>> Thanks,
>>
>>
>> Rajini
>>
>


Re: [External] Re: Is order of data is not maintained in Kafka, Is Kafka not suitable to do manage State / Transactional Scenarios such as Updating a Bank Account scenarios etc

2018-06-30 Thread Nicolas Guignard
Hi Malik,

I think it depends on the strategy you want for your producers and
consumers.

Actually, with one topic and one partition you'll have ordering guaranteed
within the partition, and consequently for the topic too.
However, if you have several partitions for one topic, then the events
within each partition will be ordered, but ordering is not guaranteed at
the topic level. You typically have several replicas per partition for
fault tolerance (leader and replicas), and several partitions because you
want to be able to parallelize. In the latter case, your producers will
shard the data over a set of partitions/brokers; messages with the same key
from the same producer are delivered in order to your consumer.
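
As a minimal illustration of the keyed case (broker address, topic and keys
below are made up):

  import java.util.Properties
  import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

  val props = new Properties()
  props.put("bootstrap.servers", "localhost:9092")
  props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
  props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")

  val producer = new KafkaProducer[String, String](props)
  // Same key => same partition => these two events stay ordered for acct-42.
  producer.send(new ProducerRecord("bank-transactions", "acct-42", "deposit:100"))
  producer.send(new ProducerRecord("bank-transactions", "acct-42", "withdraw:30"))
  producer.close()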

Hope it helps.
Cheers

On Fri, Jun 29, 2018 at 05:52, Malik, Shibha (GE Renewable Energy,
consultant)  wrote:

> So say there is only one consumer in a consumer group (to guarantee
> ordering) and it reads from only one of the partitions at a time (say the
> topic it subscribes to is partitioned into 3 partitions); then the only
> use of putting data into the other two partitions is:
>
> 1. Other consumers can consume from it.
> 2. It will help in buffering the data the producer sends and will increase
> throughput.
>
> ??
>
> -Original Message-
> From: Virgil Palanciuc [mailto:virg...@adobe.com.INVALID]
> Sent: Thursday, June 28, 2018 10:59 AM
> To: users@kafka.apache.org
> Subject: EXT: Re: [External] Re: Is order of data is not maintained in
> Kafka, Is Kafka not suitable to do manage State / Transactional Scenarios
> such as Updating a Bank Account scenarios etc
>
> Yes, that's how Kafka works - all partitions are read in parallel, but
> only one consumer from the same consumer group reads a partition at one
> time (a consumer may consume multiple partitions, but no two consumers
> from the same group consume the same partition).
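>
> A minimal sketch of that behaviour (broker address, group id and topic
> name are made up; run two copies and the partitions get split between
> them):
>
> import java.util.{Collections, Properties}
> import org.apache.kafka.clients.consumer.KafkaConsumer
>
> val props = new Properties()
> props.put("bootstrap.servers", "localhost:9092")
> props.put("group.id", "registration-workers") // same group => partitions divided among members
> props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
> props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
>
> val consumer = new KafkaConsumer[String, String](props)
> consumer.subscribe(Collections.singletonList("customer-registrations"))
> while (true) {
>   // poll returns records only from the partitions assigned to this member
>   val records = consumer.poll(100)
>   records.forEach(r => println(s"partition=${r.partition()} key=${r.key()}"))
> }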
>
> Virgil.
>
> On 6/28/18, 7:45 PM, "Malik, Shibha (GE Renewable Energy, consultant)" <
> shibha.ma...@ge.com> wrote:
>
> You mean if we use multiple partitions for a topic, say the topic is the
> event "Customer Account Registration", we can have multiple consumers
> read from different partitions at a time (only one consumer per partition)
> and can perform registration of different customers in parallel?
>
>
> -Original Message-
> From: Virgil Palanciuc [mailto:virg...@adobe.com.INVALID]
> Sent: Thursday, June 28, 2018 1:05 AM
> To: users@kafka.apache.org
> Subject: EXT: Re: [External] Re: Is order of data is not maintained in
> Kafka, Is Kafka not suitable to do manage State / Transactional Scenarios
> such as Updating a Bank Account scenarios etc
>
> Messages within the same partition are ordered. You don't need to use
> only one partition (unless you need global ordering) - you just need to use
> keys. E.g. if your key is the account number, then all operations done on
> the same account are ordered; if your key is the customer ID, all
> operations done by the same customer are ordered; etc. Just find a suitable
> key.
>
> Virgil.
>
> On 6/28/18, 4:12 AM, "Malik, Shibha (GE Renewable Energy, consultant)"
>  wrote:
>
> But then restricting a consumer to use only one partition seems to be
> similar to traditional message queues (except the persistence feature)
> and does not offer parallelism, isn't it?
>
> -Original Message-
> From: Yi, Gene [mailto:gene...@sc.com.INVALID]
> Sent: Wednesday, June 27, 2018 5:43 PM
> To: users@kafka.apache.org
> Subject: EXT: RE: [External] Re: Is order of data is not
> maintained in Kafka, Is Kafka not suitable to do manage State /
> Transactional Scenarios such as Updating a Bank Account scenarios etc
>
> I think you may use one partition only?
>
>
> __
>
> Gene Huijian Yi
> TS - FM - App
>
> Standard Chartered Global Business Services Co., Ltd.
> Phone:  +862259806930
> Address:Standard Chartered Center,  3/F No.35 Xinhuanbei
> Road,
> TEDA,  Tianjin - 300457
> E-mail: gene...@sc.com
> __
> Please consider the environment before printing this email
>
> -Original Message-
> From: Tushar Madhukar [mailto:tushar.madhu...@gmail.com]
> Sent: Thursday, June 28, 2018 7:45 AM
> To: users@kafka.apache.org
> Subject: [External] Re: Is order of data is not maintained in
> Kafka, Is Kafka not suitable to do manage State / Transactional Scenarios
> such as Updating a Bank Account scenarios etc
>
> Hi,
>
> Kafka guarantees ordering only within a partition (not across partitions
> in a topic), i.e. consumers reading off a partition will always get
> records in the order they were *received into the partition*.
>
> Note that in some cases of retries by a non-idempotent producer, the
> order

Re: [VOTE] 1.0.2 RC1

2018-06-30 Thread Ted Yu
+1

Ran test suite.

On Fri, Jun 29, 2018 at 10:02 PM, Matthias J. Sax 
wrote:

> Hello Kafka users, developers and client-developers,
>
> This is the second candidate for release of Apache Kafka 1.0.2.
>
> This is a bug fix release addressing 27 tickets:
> https://cwiki.apache.org/confluence/display/KAFKA/Release+Plan+1.0.2
>
> Release notes for the 1.0.2 release:
> http://home.apache.org/~mjsax/kafka-1.0.2-rc1/RELEASE_NOTES.html
>
> *** Please download, test and vote by end of next week (7/6/18).
>
> Kafka's KEYS file containing PGP keys we use to sign the release:
> http://kafka.apache.org/KEYS
>
> * Release artifacts to be voted upon (source and binary):
> http://home.apache.org/~mjsax/kafka-1.0.2-rc1/
>
> * Maven artifacts to be voted upon:
> https://repository.apache.org/content/groups/staging/
>
> * Javadoc:
> http://home.apache.org/~mjsax/kafka-1.0.2-rc1/javadoc/
>
> * Tag to be voted upon (off 1.0 branch) is the 1.0.2 tag:
> https://github.com/apache/kafka/releases/tag/1.0.2-rc1
>
> * Documentation:
> http://kafka.apache.org/10/documentation.html
>
> * Protocol:
> http://kafka.apache.org/10/protocol.html
>
> * Successful Jenkins builds for the 1.0 branch:
> Unit/integration tests: https://builds.apache.org/job/kafka-1.0-jdk7/214/
> System tests:
> https://jenkins.confluent.io/job/system-test-kafka/job/1.0/225/
>
>
> Thanks,
>   -Matthias
>
>
>


Re: Visible intermediate state for KGroupedTable aggregation

2018-06-30 Thread Thilo-Alexander Ginkel
On Sat, Jun 30, 2018 at 1:48 AM Matthias J. Sax  wrote:
> I see what you are saying -- this is what I meant by eventual
> consistency guarantee.

Understood.

> What is unclear to me is why you need to re-group the KTable. The issue
> you describe only occurs for this operation.

The transactions I need to aggregate have a different primary key (tx
id) than the resulting exposure (account #, security id). Each
transaction also carries the (account #, security id) attributes, but
they aren't unique. If I understand correctly, I need to regroup if
I'd like to aggregate on (account #, security id). If there is an
alternative approach without re-grouping, I'm more than happy to try
it.
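
For what it's worth, the shape of what I'm doing, as an untested sketch
(the topic name is made up, and a string-encoded "account|security|qty"
value stands in for my transaction type; serdes and error handling cut to
the minimum):

  import org.apache.kafka.common.serialization.Serdes
  import org.apache.kafka.streams.{KeyValue, StreamsBuilder}
  import org.apache.kafka.streams.kstream.{KTable, Materialized, Serialized}

  val builder = new StreamsBuilder()
  val txTable: KTable[String, String] = builder.table("transactions") // keyed by tx id

  val exposure: KTable[String, java.lang.Long] = txTable
    .groupBy(
      (txId: String, tx: String) => {
        val Array(acct, sec, _) = tx.split('|')
        KeyValue.pair(s"$acct|$sec", tx)                 // re-key by (account, security)
      },
      Serialized.`with`(Serdes.String(), Serdes.String()))
    .aggregate(
      () => java.lang.Long.valueOf(0L),
      // adder: a new/updated transaction row arrives
      (k: String, tx: String, total: java.lang.Long) =>
        java.lang.Long.valueOf(total + tx.split('|')(2).toLong),
      // subtractor: the old row for an updated tx id is retracted
      (k: String, tx: String, total: java.lang.Long) =>
        java.lang.Long.valueOf(total - tx.split('|')(2).toLong),
      Materialized.`with`(Serdes.String(), Serdes.Long()))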

Thanks,
Thilo


Re: Partitions reassignment is failing in Kafka 1.1.0

2018-06-30 Thread Manikumar
It will be taken as the "any" directory for each replica, which means the
replica will be placed on any one of the configured directories on that
broker.

Since "log_dirs" is optional, you can remove it from the json.
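
For example, with "log_dirs" omitted entirely, an entry for the partition
from the error below is valid:

  {"version": 1,
   "partitions": [
     {"topic": "Topic3", "partition": 7, "replicas": [3, 0, 2]}
   ]}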

On Sat, Jun 30, 2018 at 12:02 PM Debraj Manna 
wrote:

> It is a problem on my side. The code was changing the replicas count but
> not the log_dirs. Since I am migrating from 0.10, this part of the code
> was not changed.
>
> I have a follow-up question: what is the default value of log_dirs if I
> don't specify it in reassignment.json?
>
> On Sat, Jun 30, 2018 at 11:15 AM, Debraj Manna 
> wrote:
>
> > I am generating the reassignment.json like below
> >
> > /home/ubuntu/deploy/kafka/bin/kafka-reassign-partitions.sh --zookeeper
> 127.0.0.1:2181 --generate --topics-to-move-json-file
> /home/ubuntu/deploy/kafka/topics_to_move.json --broker-list '%s' | tail -1 >
> /home/ubuntu/deploy/kafka/reassignment.json
> >
> > Then I am doing the reassignment using the generated file
> >
> > /home/ubuntu/deploy/kafka/bin/kafka-reassign-partitions.sh --zookeeper
> 127.0.0.1:2181 --execute --reassignment-json-file
> /home/ubuntu/deploy/kafka/reassignment.json
> >
> > The kafka-reassign-partitions.sh help states:
> >
> >> The JSON file with the partition reassignment configuration. The format
> >> to use is -
> >> {"partitions": [{"topic": "foo", "partition": 1, "replicas": [1,2,3],
> >> "log_dirs": ["dir1","dir2","dir3"]}], "version":1}
> >> Note that "log_dirs" is optional. When it is specified, its length must
> >> equal the length of the replicas list. The value in this list can be
> >> either "any" or the absolute path of the log directory on the broker.
> >> If an absolute log directory path is specified, it is currently
> >> required that the replica has not already been created on that broker.
> >> The replica will then be created in the specified log directory on the
> >> broker later.
> >
> >
> > So it appears the reassignment json that is generated by
> > kafka-reassign-partitions.sh is creating an issue with log_dirs. Is this
> > an issue in kafka-reassign-partitions.sh or some misconfiguration on my
> > side?
> >
> > On Sat, Jun 30, 2018 at 10:26 AM, Debraj Manna  >
> > wrote:
> >
> >> Please find the server.properties from one of the brokers.
> >>
> >> broker.id=0
> >> port=9092
> >> num.network.threads=3
> >> num.io.threads=8
> >> socket.send.buffer.bytes=102400
> >> socket.receive.buffer.bytes=102400
> >> socket.request.max.bytes=104857600
> >> log.dirs=/var/lib/kafka/kafka-logs
> >> num.recovery.threads.per.data.dir=1
> >> log.retention.hours=36
> >> log.retention.bytes=1073741824
> >> log.segment.bytes=536870912
> >> log.retention.check.interval.ms=30
> >> log.cleaner.enable=false
> >> zookeeper.connect=platform1:2181,platform2:2181,platform3:2181
> >> message.max.bytes=1500
> >> replica.fetch.max.bytes=1500
> >> auto.create.topics.enable=true
> >> zookeeper.connection.timeout.ms=6000
> >> unclean.leader.election.enable=false
> >> delete.topic.enable=false
> >> offsets.topic.replication.factor=1
> >> transaction.state.log.replication.factor=1
> >> transaction.state.log.min.isr=1
> >>
> >> I have placed server.log from a broker at
> >> https://gist.github.com/debraj-manna/4b4bdae8a1c15c36b313a04f37e8776d
> >>
> >> On Sat, Jun 30, 2018 at 8:16 AM, Ted Yu  wrote:
> >>
> >>> Seems to be related to KIP-113.
> >>>
> >>> server.properties didn't go through. Do you mind pastebin'ing its
> >>> content?
> >>>
> >>> If you can pastebin logs from broker, that should help.
> >>>
> >>> Thanks
> >>>
> >>> On Fri, Jun 29, 2018 at 10:37 AM, Debraj Manna <
> subharaj.ma...@gmail.com
> >>> >
> >>> wrote:
> >>>
> >>> > Hi
> >>> >
> >>> > I altered a topic like below in kafka 1.1.0
> >>> >
> >>> > /home/ubuntu/deploy/kafka/bin/kafka-topics.sh --zookeeper
> >>> 127.0.0.1:2181
> >>> > --alter --topic Topic3 --config min.insync.replicas=2
> >>> >
> >>> > But whenever I try to verify the reassignment, it shows the below
> >>> > exception
> >>> >
> >>> > /home/ubuntu/deploy/kafka/bin/kafka-reassign-partitions.sh
> >>> --zookeeper 127.0.0.1:2181 --reassignment-json-file
> >>> /home/ubuntu/deploy/kafka/reassignment.json --verify
> >>> >
> >>> > Partitions reassignment failed due to Size of replicas list
> >>> > Vector(3, 0, 2) is different from size of log dirs list Vector(any)
> >>> > for partition Topic3-7
> >>> > kafka.common.AdminCommandFailedException: Size of replicas list
> >>> > Vector(3, 0, 2) is different from size of log dirs list Vector(any)
> >>> > for partition Topic3-7
> >>> >   at kafka.admin.ReassignPartitionsCommand$$anonfun$parsePartitionReassignmentData$1$$anonfun$apply$4$$anonfun$apply$5.apply(ReassignPartitionsCommand.scala:262)
> >>> >   at kafka.admin.ReassignPartitionsCommand$$anonfun$parsePartitionReassignmentData$1$$anonfun$apply$4$$anonfun$apply$5.apply(ReassignPartitionsCommand.scala:251)
> >>> >   at scala.collection.Iterator$class.foreach(Iterator.scala:891)
>