[jira] [Created] (KAFKA-9339) Increased CPU utilization in brokers in 2.4.0

2019-12-27 Thread James Brown (Jira)
James Brown created KAFKA-9339:
--

 Summary: Increased CPU utilization in brokers in 2.4.0
 Key: KAFKA-9339
 URL: https://issues.apache.org/jira/browse/KAFKA-9339
 Project: Kafka
  Issue Type: Bug
Affects Versions: 2.4.0
 Environment: CentOS 6; Java 1.8.0_232 (OpenJDK)
Reporter: James Brown


I upgraded one of my company's test clusters from 2.3.1 to 2.4.0 and have 
noticed a significant (40%) increase in the CPU time consumed. This is a small 
cluster of three nodes (running on t2.large EC2 instances, all in the same AZ) 
pushing about 150 messages/s in aggregate spread across 208 topics (a total of 
266 partitions; most topics have only one partition). Leadership is reasonably 
well distributed: each node leads between 83 and 94 partitions. The CPU 
increase is visible symmetrically on all three nodes in the cluster (i.e., the 
controller isn't using more CPU than the other nodes).
 
The CPU consumption did not return to normal after I did the second restart to 
bump the log and inter-broker protocol versions to 2.4, so I don't think it has 
anything to do with down-converting to the 2.3 protocols.
 
No settings were changed, nor was anything about the JVM changed. There is 
nothing interesting being written to the logs, and there's no sign of any 
instability (partitions aren't being reassigned, etc.).
 
The best clue I have for the increased CPU usage is that the number of garbage 
collections increased by approximately 30%, suggesting that something inside 
Kafka is churning a lot more garbage. This is a small cluster, so each node 
only has a 3 GB heap allocated to Kafka; we're using G1GC with some light 
tuning and are on Java 8, if that helps.
 
We are only using OpenJDK, so I don't think I can produce a Flight Recorder 
profile.
 
The kafka-users mailing list suggested this was worth filing a Jira issue about.





[jira] [Created] (KAFKA-9338) Incremental fetch sessions do not maintain or use leader epoch for fencing purposes

2019-12-27 Thread Lucas Bradstreet (Jira)
Lucas Bradstreet created KAFKA-9338:
---

 Summary: Incremental fetch sessions do not maintain or use leader 
epoch for fencing purposes
 Key: KAFKA-9338
 URL: https://issues.apache.org/jira/browse/KAFKA-9338
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 2.4.0, 2.3.0, 2.2.0, 2.1.0
Reporter: Lucas Bradstreet


KIP-320 added the ability to fence replicas by detecting stale leader epochs 
from followers, and to help consumers handle unclean truncation.

Unfortunately, the incremental fetch session handling does not maintain or use 
the leader epoch in the fetch session cache. As a result, the leader epoch does 
not appear to be used for fencing most of the time. I'm not sure whether this 
only applies once an incremental fetch session is established - it may be that 
the first "full" fetch request is safe.

Optional.empty is returned for the FetchRequest.PartitionData here:

[https://github.com/apache/kafka/blob/a4cbdc6a7b3140ccbcd0e2339e28c048b434974e/core/src/main/scala/kafka/server/FetchSession.scala#L111]

I believe this affects brokers from 2.1.0, when fencing was improved on the 
replica fetcher side, and consumers from 2.3.0, when client-side truncation 
detection was added.





Re: Permission to create KIP

2019-12-27 Thread Matthias J. Sax
What is your wiki account id?


-Matthias

On 12/27/19 7:12 AM, Sagar wrote:
> Hi,
> 
> I have done some work on adding prefix scan for state store and would like
> to create a KIP for the same.
> 
> Thanks!
> Sagar.
> 





Re: [DISCUSS] KIP-515: Enable ZK client to use the new TLS supported authentication

2019-12-27 Thread Ron Dagostino
Hi everyone.  I would like to make the following changes to the KIP.

MOTIVATION:
Include a statement that it will be difficult in the short term to
deprecate direct Zookeeper communication in kafka-configs.{sh, bat} (which
invokes kafka.admin.ConfigCommand), because bootstrapping a Kafka cluster
with encrypted passwords in Zookeeper is an explicitly supported use case.
It is therefore in scope to securely configure the CLI tools that still
use non-deprecated direct Zookeeper communication for TLS (the other two
such tools are kafka-reassign-partitions.{sh, bat} and
zookeeper-security-migration.sh).

GOALS:
Support the secure configuration of TLS-encrypted communication between
Zookeeper and:
  a) Kafka brokers
  b) The three CLI tools mentioned above that still support direct,
non-deprecated communication to Zookeeper
It is explicitly out-of-scope to deprecate any direct Zookeeper
communication in CLI tools as part of this KIP; such work will occur in
future KIPs instead.

PUBLIC INTERFACES:
1) The following new broker configurations will be recognized.
  zookeeper.client.secure (default value = false, for backwards
compatibility)
  zookeeper.clientCnxnSocket
  zookeeper.ssl.keyStore.location
  zookeeper.ssl.keyStore.password
  zookeeper.ssl.trustStore.location
  zookeeper.ssl.trustStore.password
It will be an error for any of the last 5 values to be left unspecified if
zookeeper.client.secure is explicitly set to true.
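
To make this concrete, a broker server.properties fragment under this proposal
might look roughly like the following (keystore/truststore paths and passwords
are placeholders, and the Netty client socket class is shown for illustration):

  zookeeper.client.secure=true
  zookeeper.clientCnxnSocket=org.apache.zookeeper.ClientCnxnSocketNetty
  zookeeper.ssl.keyStore.location=/path/to/kafka-zk-keystore.jks
  zookeeper.ssl.keyStore.password=kafka-ks-passwd
  zookeeper.ssl.trustStore.location=/path/to/kafka-zk-truststore.jks
  zookeeper.ssl.trustStore.password=kafka-ts-passwd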

2) In addition, the kafka.security.authorizer.AclAuthorizer class supports
the ability to connect to a different Zookeeper instance than the one the
brokers use.  We therefore also add the following optional configs, which
override the corresponding ones from above when present:
  authorizer.zookeeper.client.secure
  authorizer.zookeeper.clientCnxnSocket
  authorizer.zookeeper.ssl.keyStore.location
  authorizer.zookeeper.ssl.keyStore.password
  authorizer.zookeeper.ssl.trustStore.location
  authorizer.zookeeper.ssl.trustStore.password

3) The three CLI tools mentioned above will support a new
--zk-tls-config-file option. The following properties will be recognized
in that file; unrecognized properties will be ignored, to allow pointing
--zk-tls-config-file at the broker's config file.
  zookeeper.client.secure (default value = false)
  zookeeper.clientCnxnSocket
  zookeeper.ssl.keyStore.location
  zookeeper.ssl.keyStore.password
  zookeeper.ssl.trustStore.location
  zookeeper.ssl.trustStore.password
It will be an error for any of the last 5 values to be left unspecified if
zookeeper.client.secure is explicitly set to true.
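
As an illustrative sketch (file name, paths, and credential values are
placeholders; the SCRAM entry simply mirrors the password-bootstrapping use
case from the motivation above):

  # zk-tls.properties
  zookeeper.client.secure=true
  zookeeper.clientCnxnSocket=org.apache.zookeeper.ClientCnxnSocketNetty
  zookeeper.ssl.keyStore.location=/path/to/admin-zk-keystore.jks
  zookeeper.ssl.keyStore.password=admin-ks-passwd
  zookeeper.ssl.trustStore.location=/path/to/admin-zk-truststore.jks
  zookeeper.ssl.trustStore.password=admin-ts-passwd

  bin/kafka-configs.sh --zookeeper zk1:2182 \
    --zk-tls-config-file zk-tls.properties \
    --entity-type users --entity-name alice --alter \
    --add-config 'SCRAM-SHA-256=[password=alice-secret]'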

Ron

On Mon, Dec 23, 2019 at 3:03 PM Ron Dagostino  wrote:

> Hi everyone.  Let's get this discussion going again now that Kafka 2.4
> with Zookeeper 3.5.6 is out.
>
> First, regarding the KIP number, the other KIP that was using this number
> moved to KIP 534, so KIP 515 remains the correct number for this
> discussion.  I've updated the Kafka Improvement Proposal page to list this
> KIP in the 515 slot, so we're all set there.
>
> Regarding the actual issues under discussion, I think there are some
> things we should clarify.
>
> 1) It is possible to use TLS connectivity to Zookeeper from Apache Kafka
> 2.4 -- the problem is that configuration information has to be passed via
> system properties as "-D" command line options on the java invocation of
> the broker, and those are not secure (anyone with access to the box can see
> the command line used to invoke the broker); the configuration includes
> sensitive information (e.g. a keystore password), so we need a secure
> mechanism for passing the configuration values.  I believe the real
> motivation for this KIP is to harden the configuration mechanism for
> Zookeeper TLS connectivity.
>
> 2) I believe the list of CLI tools that continue to use direct Zookeeper
> connectivity in a non-deprecated fashion is:
>   a) zookeeper-security-migration.sh/kafka.admin.ZkSecurityMigrator
>   b)
> kafka-reassign-partitions.{sh,bat}/kafka.admin.ReassignPartitionsCommand
>   c) kafka-configs.{sh, bat}/kafka.admin.ConfigCommand
>
> 3) I believe kafka.admin.ConfigCommand presents a conundrum because it
> explicitly states in a comment that a supported use case is bootstrapping a
> Kafka cluster with encrypted passwords in Zookeeper (see
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/admin/ConfigCommand.scala#L65).
> This means it will be especially difficult to deprecate this particular
> direct Zookeeper connectivity without a different storage mechanism for
> dynamic configuration values being available (i.e. the self-managed quorum
> referred to in KIP-500).
>
> I think it would be easier and simpler to harden the Zookeeper TLS
> configuration in both Kafka and in CLI tools compared to trying to
> deprecate the direct Zookeeper connectivity in the above 3 CLI tools --
> especially when ConfigCommand has no obvious short-term path to deprecation.
>
> Regarding how to harden the

Re: Kafka 2.4.0 & Mirror Maker 2.0 Error

2019-12-27 Thread Ryanne Dolan
Thanks Peter, I'll take a look.

Ryanne

On Fri, Dec 27, 2019, 7:48 AM Péter Sinóros-Szabó
 wrote:

> Hi,
>
> I see the same.
> I just downloaded the Kafka zip and I run:
>
> ~/kafka-2.4.0-rc3$ ./bin/connect-mirror-maker.sh
> config/connect-mirror-maker.properties
>
> Peter
>
> On Mon, 16 Dec 2019 at 17:14, Ryanne Dolan  wrote:
>
> > Hey Jamie, are you running the MM2 connectors on an existing Connect
> > cluster, or with the connect-mirror-maker.sh driver? Given your question
> > about plugin.path I'm guessing the former. Is the Connect cluster running
> > 2.4.0 as well? The jars should land in the Connect runtime without any
> > need to modify the plugin.path or copy jars around.
> >
> > Ryanne
> >
> > On Mon, Dec 16, 2019, 6:23 AM Jamie  wrote:
> >
> > > Hi All,
> > > I'm trying to set up mirror maker 2.0 with Kafka 2.4.0 however, I'm
> > > receiving the following errors on startup:
> > > ERROR Plugin class loader for connector
> > > 'org.apache.kafka.connect.mirror.MirrorSourceConnector' was not found.
> > > Returning:
> > > org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader@187eb9a8
> > > (org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader)
> > > ERROR Plugin class loader for connector
> > > 'org.apache.kafka.connect.mirror.MirrorHeartbeatConnector' was not
> > > found. Returning:
> > > org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader@187eb9a8
> > > (org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader)
> > > ERROR Plugin class loader for connector
> > > 'org.apache.kafka.connect.mirror.MirrorCheckpointConnector' was not
> > > found. Returning:
> > > org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader@187eb9a8
> > > (org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader)
> > >
> > > I've checked that the jar file containing these class files is in the
> > > class path.
> > > Is there anything I need to add to plugin.path for the connect
> > > properties when running mirror maker?
> > > Many Thanks,
> > > Jamie
> >
>
>
> --
>  - Sini
>


Re: Kafka 2.4.0 & Mirror Maker 2.0 Error

2019-12-27 Thread Péter Sinóros-Szabó
Hi,

I see the same.
I just downloaded the Kafka zip and I run:

~/kafka-2.4.0-rc3$ ./bin/connect-mirror-maker.sh
config/connect-mirror-maker.properties

Peter

On Mon, 16 Dec 2019 at 17:14, Ryanne Dolan  wrote:

> Hey Jamie, are you running the MM2 connectors on an existing Connect
> cluster, or with the connect-mirror-maker.sh driver? Given your question
> about plugin.path I'm guessing the former. Is the Connect cluster running
> 2.4.0 as well? The jars should land in the Connect runtime without any need
> to modify the plugin.path or copy jars around.
>
> Ryanne
>
> On Mon, Dec 16, 2019, 6:23 AM Jamie  wrote:
>
> > Hi All,
> > I'm trying to set up mirror maker 2.0 with Kafka 2.4.0 however, I'm
> > receiving the following errors on startup:
> > ERROR Plugin class loader for connector
> > 'org.apache.kafka.connect.mirror.MirrorSourceConnector' was not found.
> > Returning:
> > org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader@187eb9a8
> > (org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader)
> > ERROR Plugin class loader for connector
> > 'org.apache.kafka.connect.mirror.MirrorHeartbeatConnector' was not
> > found. Returning:
> > org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader@187eb9a8
> > (org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader)
> > ERROR Plugin class loader for connector
> > 'org.apache.kafka.connect.mirror.MirrorCheckpointConnector' was not
> > found. Returning:
> > org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader@187eb9a8
> > (org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader)
> >
> > I've checked that the jar file containing these class files is in the
> > class path.
> > Is there anything I need to add to plugin.path for the connect properties
> > when running mirror maker?
> > Many Thanks,
> > Jamie
>


-- 
 - Sini


Permission to create KIP

2019-12-27 Thread Sagar
Hi,

I have done some work on adding prefix scan for state store and would like
to create a KIP for the same.

Thanks!
Sagar.


Re: [DISCUSS] KIP-405: Kafka Tiered Storage

2019-12-27 Thread Ivan Yurchenko
Hi all,


Jun:
> (a) Cost: S3 list object requests cost $0.005 per 1000 requests. If you
> have 100,000 partitions and want to pull the metadata for each partition
at
> the rate of 1/sec. It can cost $0.5/sec, which is roughly $40K per day.

I want to note here that no reasonably durable storage will be cheap
at 100k RPS. For example, DynamoDB might give the same ballpark figures.
If we want to keep the pull-based approach, we can try to reduce this number
in several ways: doing listings less frequently (as Satish mentioned,
with the current defaults it's ~3.33k RPS for your example), or
batching listing operations in some way (depending on the storage;
it might require changing the RSM interface).


> There are different ways for doing push based metadata propagation. Some
> object stores may support that already. For example, S3 supports events
> notification
This sounds interesting. However, I see a couple of issues with using it:
  1. As I understand the documentation, notification delivery is not
guaranteed, and it's recommended to periodically do a LIST to fill the gaps,
which brings us back to the same LIST consistency guarantees issue.
  2. The same goes for broker start: to get the current state, we need
to LIST.
  3. The dynamic set of multiple consumers (RSMs): AFAIK SQS and SNS aren't
designed for such a case.


Alexandre:
> A.1 As commented on PR 7561, S3 consistency model [1][2] implies RSM
cannot
> relies solely on S3 APIs to guarantee the expected strong consistency. The
> proposed implementation [3] would need to be updated to take this into
> account. Let’s talk more about this.

Thank you for the feedback. I clearly see the need for changing the S3
implementation to provide stronger consistency guarantees. As I see from
this thread, there are several possible approaches to this. Let's discuss
RemoteLogManager's contract and behavior (like pull vs. push model) further
before picking one (or several?) of them.
I'm going to do some evaluation of DynamoDB for the pull-based approach, to
see whether it can be applied at a reasonable cost, and also of the
push-based approach with a Kafka topic as the medium.


> A.2.3 Atomicity – what does an implementation of RSM need to provide with
> respect to atomicity of the APIs copyLogSegment, cleanupLogUntil and
> deleteTopicPartition? If a partial failure happens in any of those (e.g.
in
> the S3 implementation, if one of the multiple uploads fails [4]),

The S3 implementation is going to change, but it's worth clarifying anyway.
The segment log file is uploaded only after S3 has acked the upload of all
other files associated with the segment; only then does the whole segment
file set become visible remotely to operations like listRemoteSegments [1].
In case of an upload failure, the files that have been successfully uploaded
stay as invisible garbage that is collected by cleanupLogUntil (or
overwritten successfully later).
The opposite happens during deletion: the log files are deleted first.
This approach should generally work once we solve the consistency issues
by adding a strongly consistent store: a segment's uploaded files remain
invisible garbage until some metadata about them is written.


> A.3 Caching – storing locally the segments retrieved from the remote
> storage is excluded as it does not align with the original intent and even
> defeat some of its purposes (save disk space etc.). That said, could there
> be other types of use cases where the pattern of access to the remotely
> stored segments would benefit from local caching (and potentially
> read-ahead)? Consider the use case of a large pool of consumers which
start
> a backfill at the same time for one day worth of data from one year ago
> stored remotely. Caching the segments locally would allow to uncouple the
> load on the remote storage from the load on the Kafka cluster. Maybe the
> RLM could expose a configuration parameter to switch that feature on/off?

I tend to agree here: caching remote segments locally and making
this configurable sounds pretty practical to me. We should implement this,
though maybe not in the first iteration.


Br,
Ivan

[1]
https://github.com/harshach/kafka/pull/18/files#diff-4d73d01c16caed6f2548fc3063550ef0R152

On Thu, 19 Dec 2019 at 19:49, Alexandre Dupriez 
wrote:

> Hi Jun,
>
> Thank you for the feedback. I am trying to understand how a push-based
> approach would work.
> In order for the metadata to be propagated (under the assumption you
> stated), would you plan to add a new API in Kafka to allow the
> metadata store to send them directly to the brokers?
>
> Thanks,
> Alexandre
>
>
> Le mer. 18 déc. 2019 à 20:14, Jun Rao  a écrit :
> >
> > Hi, Satish and Ivan,
> >
> > There are different ways for doing push based metadata propagation. Some
> > object stores may support that already. For example, S3 supports events
> > notification (
> > https://docs.aws.amazon.com/AmazonS3/latest/dev/NotificationHowTo.html).
> > Otherwise one could use

[jira] [Created] (KAFKA-9337) Simplifying standalone mm2-connect config

2019-12-27 Thread karan kumar (Jira)
karan kumar created KAFKA-9337:
--

 Summary: Simplifying standalone mm2-connect config
 Key: KAFKA-9337
 URL: https://issues.apache.org/jira/browse/KAFKA-9337
 Project: Kafka
  Issue Type: Improvement
  Components: KafkaConnect, mirrormaker
Affects Versions: 2.4.0
Reporter: karan kumar


One of the nice things about Kafka is that setting it up in a local 
environment is really simple. I gave the latest feature, i.e. MM2, a try and 
found it took me some time to get a minimal setup running. 
The default config provided assumes that there will already be 3 brokers 
running, due to the default replication factor of the admin topics the MM2 
connectors create. 

This got me thinking that most people would follow the same approach I 
followed: 
1. Start a single-broker cluster on 9092 
2. Start another single-broker cluster on, let's say, 10002 
3. Start MM2 with "./bin/connect-mirror-maker.sh 
./config/connect-mirror-maker.properties" 

What happened was that I had to supply a lot more configs. 
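
For illustration, getting the minimal two-single-broker setup going needed
overrides roughly like the following (property names are from memory and worth
double-checking against the MM2 docs; values are placeholders):

  clusters = A, B
  A.bootstrap.servers = localhost:9092
  B.bootstrap.servers = localhost:10002
  A->B.enabled = true
  A->B.topics = .*

  # bring the internal/admin topics down to a single replica
  replication.factor = 1
  checkpoints.topic.replication.factor = 1
  heartbeats.topic.replication.factor = 1
  offset-syncs.topic.replication.factor = 1
  offset.storage.replication.factor = 1
  status.storage.replication.factor = 1
  config.storage.replication.factor = 1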

This jira is created post discussion on the mailing list:
https://lists.apache.org/thread.html/%3ccajxudh13kw3nam3ho69wrozsyovwue1nxf9hkcbawc9r-3d...@mail.gmail.com%3E

cc [~ryannedolan]





Re: [DISCUSS] KIP-552: Add interface to handle unused config

2019-12-27 Thread Artur Burtsev
Hi,

Indeed, changing the log level to debug would be the easiest and I think
that would be a good solution. If no one objects, I'm ready to move
forward with this approach and submit an MR.

The only minor concern I have is that having it at debug level might make
it a bit less friendly for developers, especially those taking their first
steps with Kafka. For example, if you misspell a property name and are
trying to understand why things don't do what you expect, a warning might
save some time. Other than that, I cannot see any reason to keep the
warnings.
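
For reference, a minimal log4j sketch of the workaround John suggested,
assuming the warning is emitted by the AbstractConfig logger (worth verifying
against the actual logger name before relying on it):

  # demote/silence only the "unknown config" warnings
  log4j.logger.org.apache.kafka.common.config.AbstractConfig=ERROR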

Thanks,
Artur

On Thu, Dec 26, 2019 at 10:01 PM John Roesler  wrote:
>
> Thanks for the KIP, Artur!
>
> For reference, here is the kip: 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-552%3A+Add+interface+to+handle+unused+config
>
> I agree, these warnings are kind of a nuisance. Would it be feasible just to 
> leverage log4j in some way to make it easy to filter these messages? For 
> example, we could move those warnings to debug level, or even use a separate 
> logger for them.
>
> Thanks for starting the discussion.
> -John
>
> On Tue, Dec 24, 2019, at 07:23, Artur Burtsev wrote:
> > Hi,
> >
> > This KIP provides a way to deal with a warning "The configuration {}
> > was supplied but isn't a known config." when it is not relevant.
> >
> > Cheers,
> > Artur
> >