Re: [DISCUSS] Java 23 Support for 3.9.x

2024-11-20 Thread David Arthur
Greg,

I have not been following this closely, so apologies for some basic
questions.

Has the SecurityManager been fully removed in JDK 23?

What is the effect of running Kafka 3.9.0 with JDK 23?

By "4.0 breaking changes" do you mean changes to our JDK/Scala supported
versions, removal of ZK, Kafka API changes, or something else?

In general, I do not think we should change our supported JDK versions in a
hotfix release. I see https://issues.apache.org/jira/browse/KAFKA-17638
which explicitly adds JDK 23 to our CI with a fix version of 4.0.0. Lack of
support for JDK 23 in 3.9.x is not a bug, it is what we planned (as far as
I can tell).

Also, I feel that we should not add too much to 3.9.x aside from actual
bugs. If we backport things into 3.9.x, it will slow adoption of 4.x and
increase our maintenance burden over time.

Just my $0.02

Thanks!
David A

On Wed, Nov 20, 2024 at 12:22 PM Greg Harris 
wrote:

> Hi all,
>
> Now that 3.9.0 is released and 4.0.x is progressing, I'd like to understand
> everyone's expectations about the 3.9.x branch, and ask for a specific
> consensus on Java 23 support.
>
> Some context that I think is relevant to the discussion:
> * KIP-1006 [1] proposes a backwards-compatible strategy for handling the
> ongoing removal of the SecurityManager, which is merged and due to release
> in 4.0.0 [2].
> * KIP-1012 [3] rejected ongoing parallel feature development on a 3.x
> branch while having trunk on 4.x.
> * During the 3.9.0 release, the patch [2] was rejected [4] due to being a
> new feature which did not meet the feature freeze deadline.
> * Other than the SecurityManager removal, there are additional PRs which
> would also need to be backported for full Java 23 support [5] including a
> Scala patch upgrade.
> * Downstream users are asking for a backport [6], because otherwise picking
> up Java 23 support would obligate them to also take on the 4.0 breaking
> changes.
>
> So while adding Java version support in the past has been a KIP-less
> feature and normally only appears in the next version, it happens to align
> with a major version bump this time. This will cause additional pain for
> users if we do not elect to backport this.
>
> Thanks,
> Greg
>
> [1]
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1006%3A+Remove+SecurityManager+Support
> [2] https://github.com/apache/kafka/pull/16522
> [3]
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1012%3A+The+need+for+a+Kafka+3.8+and+3.9+release
> [4] https://lists.apache.org/thread/xy5rwd1w274qgpwf3qxxnzlqpoly5d4p
> [5] https://issues.apache.org/jira/browse/KAFKA-17638
> [6] https://github.com/apache/kafka/pull/16522#issuecomment-2488340682
>


-- 
David Arthur


Re: production ready for zookeeper to kraft migration

2024-04-04 Thread David Arthur
Matthieu,

There are a few things to look out for during the migration.

1) Migration not starting. This is pretty much always going to be due to
misconfiguration. The controller's KRaftMigrationDriver logs should reveal
what it's waiting for.
2) Migration of metadata fails. If some poison record in ZK causes the
metadata migration to fail, you'll see stacktraces and ERRORs in the
controller logs. The brokers will also not see an active controller and
won't be getting metadata updates.
3) Something in the dual-write path fails. If the write back to ZK fails
during dual-write mode (with ZK brokers or KRaft brokers), you'll see an
increasing lag (ZkWriteBehindLag metric)

So, in general: lack of an active controller, metadata not being propagated
(e.g., ISRs not getting updated), errors in the logs.

I'll try to get this written up in the docs soon.

Cheers,
David


On Thu, Apr 4, 2024 at 12:56 PM Matthieu Patou  wrote:

> Hey Luke,
>
> Thank you for the update.
> Out of curiosity if the migration is not working what are the symptoms ? is
> it just that the controller won't show that the migration is complete ? or
> could the controller claim (wrongfully) that the migration is complete when
> it's not ?
>
> Best.
> Matthieu
>
> On Wed, Apr 3, 2024 at 4:53 PM Luke Chen  wrote:
>
> > Hi Matthieu,
> >
> > Yes, the ZK migrating to KRaft feature is already GA in v3.6.0.
> > Sorry, we forgot to update the document in the Kafka-site repo.
> > I've filed a PR for it: https://github.com/apache/kafka-site/pull/594
> >
> > Thanks.
> > Luke
> >
> > On Thu, Apr 4, 2024 at 6:14 AM Matthieu Patou 
> wrote:
> >
> > > I looked at the notes for 3.7.x and the migration from ZK to Kraft is
> > still
> > > not marked as production ready.
> > >
> > > I'm wondering what are the issues that people could be facing during
> the
> > > migration.
> > >
> > > If 4.0 is still planned to be the full removal for ZK, is there a plan
> > for
> > > something after 3.7 to mark ZK migration as production ready ?
> > >
> > > Best.
> > >
> > > Matthieu.
> > >
> >
>


-- 
-David


Re: Kraft Multi Cluster (chroot equivalent)

2024-04-01 Thread David Arthur
Omer,

Thanks for the email. This is an interesting thing to consider.
Conceptually, there is no reason why the controllers couldn't manage the
metadata for multiple broker clusters. The main counter-argument I can think
of is
essentially the same as the motivation -- less isolation. With a shared
controller, one "noisy" broker cluster that put a lot of load on the
controller could affect metadata availability/latency for other broker
clusters. Related to this, having multiple broker clusters share one
controller cluster means a larger blast radius for controller failures.

The "noisy neighbor" problem could be mitigated with a good implementation,
but the failure coupling cannot.

In the containerized world, resources are abstracted away, so there is not
so much overhead to run a set of dedicated controller nodes. Even with
bare-metal hardware, controller processes can be run on the same nodes as
broker processes if needed.


The 2+1 data center example seems a bit tangential to me.

> This way metadata and data would have different level of availability
and it enable enterprises to design a more cost effective solution by
separating metadata and data service layer

Is the idea here to have a multi-region controller quorum and then single
region broker clusters? Could you achieve the same thing with one large
Kafka cluster spread across regions but with topics having assignments that
kept them region local? Is the "cost effectiveness" you're after just
inter-broker networking costs?

Maybe you could expand on this scenario and help motivate it a bit more?

-David


Re: In Kafka KRaft can controllers participate as bootstrap servers

2023-12-13 Thread David Arthur
Only brokers can be specified as --bootstrap-servers for AdminClient (the
bin/kafka-* scripts).

In 3.7, we are adding the ability to bootstrap from KRaft controllers for
certain scripts. In this case, the scripts will use --bootstrap-controllers
(the details are in
https://cwiki.apache.org/confluence/display/KAFKA/KIP-919%3A+Allow+AdminClient+to+Talk+Directly+with+the+KRaft+Controller+Quorum+and+add+Controller+Registration
)

But in general, no, controllers cannot be used as bootstrap servers.
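
As a concrete sketch of the 3.7 behavior described above (written from
KIP-919 rather than tested; the flag spelling and controller port are
assumptions to verify against your version):

```shell
# Describe quorum state by bootstrapping directly from a controller (3.7+).
# "controller-1:9093" is a placeholder; the controller listener port is
# deployment-specific.
bin/kafka-metadata-quorum.sh \
  --bootstrap-controller controller-1:9093 \
  describe --status
```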

-David

On Tue, Dec 5, 2023 at 10:05 AM Dima Brodsky  wrote:

> Hello, question,
>
> If I have my kafka cluster behind a VIP for bootstrapping, is it possible
> to have the controllers participate in the bootstrap process or only
> brokers can?
>
> Thanks!
> ttyl
> Dima
>
> --
> ddbrod...@gmail.com
>
> "The price of reliability is the pursuit of the utmost simplicity.
> It is a price which the very rich find the most hard to pay."
>(Sir
> Antony Hoare, 1980)
>


-- 
-David


Re: How to migrate single node Kafka with zookeeper to Kraft without ending up with additional Kraft controller node

2023-07-05 Thread David Arthur
Pengcheng,

Right now, migrating to a combined broker+controller is not supported. The
main reason for this is we don't have a way to change a KRaft broker into a
combined KRaft broker + controller. It is probably possible, and may be
supported at some point, but for now we're focusing on getting the
migration feature polished for production use cases.
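
For reference, the combined mode being asked about is what a fresh (non-
migrated) KRaft cluster uses. A minimal server.properties sketch, with IDs,
ports, and the log dir as placeholders:

```properties
# Combined broker+controller node (fresh KRaft cluster, not a migration
# target). Values below are placeholders, not recommendations.
process.roles=broker,controller
node.id=1
controller.quorum.voters=1@localhost:9093
listeners=PLAINTEXT://:9092,CONTROLLER://:9093
controller.listener.names=CONTROLLER
inter.broker.listener.name=PLAINTEXT
log.dirs=/tmp/kraft-combined-logs
```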

Thanks!
David

On Mon, Jul 3, 2023 at 5:25 PM Pengcheng Wang
 wrote:

> Hi,
>
> We have a single node kafka and single node zookeeper, which we want to
> migrate to a single node kraft (which act as both controller and broker) so
> we can drop the zookeeper.
>
> I know we have the documentation here “
> https://kafka.apache.org/documentation/#kraft_zk_migration” for zookeeper
> to Kraft Migration. I followed it and the migration was successful, but I
> end up with an additional kraft controller node which was provisioned for
> the migration purpose. So my question is: is there a way for us to remove
> that additional kraft controller node, so we end up with a single node
> kafka as before?
>
> I know it’s not recommended for production, but given that a single kraft
> node can act as both broker and controller, I guess this is possible?
> Thanks!
>
> Best,
> Pengcheng
>


-- 
-David


Re: Kafka Node Shutting Down Automatically

2023-05-05 Thread David Arthur
Akshay, this was recently fixed by Luke Chen. It will be a part of the 3.5
and 3.4.1 releases.

For reference, here is the bug
https://issues.apache.org/jira/browse/KAFKA-14946 and the fix
https://github.com/apache/kafka/pull/13653

Cheers,
David

On Tue, Apr 25, 2023 at 1:23 PM 
wrote:

> On 2023-04-22 02:53, Luke Chen wrote:
> > Hi Akshay,
> >
> > Thanks for reporting the issue.
> > It looks like a bug.
> > Could you open a JIRA 
> > ticket
> > to track it?
> >
> > Thank you.
> > Luke
> >
> >
> > On Fri, Apr 21, 2023 at 10:16 PM Akshay Kumar
> > 
> > wrote:
> >
> >> Hello team,
> >>
> >>- We are using the zookeeper less Kafka (kafka Kraft).
> >>- The cluster is having 3 nodes.
> >>- One of the nodes gets automatically shut down randomly.
> >>- Checked the logs but didn't get the exact reason.
> >>- Sharing the logs below. Kafka version - 3.3.1
> >>
> >> *Logs - *
> >>
> >> [2023-04-13 01:49:17,411] WARN [Controller 1] Renouncing the
> >> leadership
> >> due to a metadata log event. We were the leader at epoch 37110, but in
> >> the
> >> new epoch 37111, the leader is (none). Reverting to last committed
> >> offset
> >> 28291464. (org.apache.kafka.controller.QuorumController)
> >> [2023-04-13 01:49:17,531] INFO [RaftManager nodeId=1] Completed
> >> transition
> >> to Unattached(epoch=37112, voters=[1, 2, 3], electionTimeoutMs=982)
> >> (org.apache.kafka.raft.QuorumState)
> >>
> >> [2023-04-13 02:00:33,902] WARN [Controller 1] Renouncing the
> >> leadership
> >> due to a metadata log event. We were the leader at epoch 37116, but in
> >> the
> >> new epoch 37117, the leader is (none). Reverting to last committed
> >> offset
> >> 28292807. (org.apache.kafka.controller.QuorumController)
> >> [2023-04-13 02:00:33,936] INFO [RaftManager nodeId=1] Completed
> >> transition
> >> to Unattached(epoch=37118, voters=[1, 2, 3], electionTimeoutMs=1497)
> >> (org.apache.kafka.raft.QuorumState)
> >>
> >> [2023-04-13 02:00:35,014] ERROR [Controller 1] processBrokerHeartbeat:
> >> unable to start processing because of NotControllerException.
> >> (org.apache.kafka.controller.QuorumController)
> >>
> >> [2023-04-13 02:12:21,883] WARN [Controller 1] Renouncing the
> >> leadership
> >> due to a metadata log event. We were the leader at epoch 37129, but in
> >> the
> >> new epoch 37131, the leader is (none). Reverting to last committed
> >> offset
> >> 28294206. (org.apache.kafka.controller.QuorumController)
> >>
> >> [2023-04-13 02:13:41,328] WARN [Controller 1] Renouncing the
> >> leadership
> >> due to a metadata log event. We were the leader at epoch 37141, but in
> >> the
> >> new epoch 37142, the leader is (none). Reverting to last committed
> >> offset
> >> 28294325. (org.apache.kafka.controller.QuorumController)
> >>
> >> [2023-04-13 02:13:41,328] INFO [Controller 1] writeNoOpRecord: failed
> >> with
> >> NotControllerException in 16561838 us
> >> (org.apache.kafka.controller.QuorumController)
> >>
> >> [2023-04-13 02:13:41,328] INFO [Controller 1] maybeFenceReplicas:
> >> failed
> >> with NotControllerException in 8520846 us
> >> (org.apache.kafka.controller.QuorumController)
> >>
> >> [2023-04-13 02:13:41,328] INFO [BrokerToControllerChannelManager
> >> broker=1
> >> name=heartbeat] Client requested disconnect from node 1
> >> (org.apache.kafka.clients.NetworkClient)
> >> [2023-04-13 02:13:41,329] INFO [BrokerLifecycleManager id=1] Unable to
> >> send a heartbeat because the RPC got timed out before it could be
> >> sent.
> >> (kafka.server.BrokerLifecycleManager)
> >> [2023-04-13 02:13:41,351] ERROR Encountered fatal fault: exception
> >> while
> >> renouncing leadership
> >> (org.apache.kafka.server.fault.ProcessExitingFaultHandler)
> >> java.lang.NullPointerException
> >> at
> >>
> org.apache.kafka.timeline.SnapshottableHashTable$HashTier.mergeFrom(SnapshottableHashTable.java:125)
> >> at
> >> org.apache.kafka.timeline.Snapshot.mergeFrom(Snapshot.java:68)
> >> at
> >>
> org.apache.kafka.timeline.SnapshotRegistry.deleteSnapshot(SnapshotRegistry.java:236)
> >> at
> >>
> org.apache.kafka.timeline.SnapshotRegistry$SnapshotIterator.remove(SnapshotRegistry.java:67)
> >> at
> >>
> org.apache.kafka.timeline.SnapshotRegistry.revertToSnapshot(SnapshotRegistry.java:214)
> >> at
> >>
> org.apache.kafka.controller.QuorumController.renounce(QuorumController.java:1232)
> >> at
> >>
> org.apache.kafka.controller.QuorumController.access$3300(QuorumController.java:150)
> >> at
> >>
> org.apache.kafka.controller.QuorumController$QuorumMetaLogListener.lambda$handleLeaderChange$3(QuorumController.java:1076)
> >> at
> >>
> org.apache.kafka.controller.QuorumController$QuorumMetaLogListener.lambda$appendRaftEvent$4(QuorumController.java:1101)
> >> at
> >>
> org.apache.kafka.controller.QuorumController$ControlEvent.run(QuorumController.java:496)
> >> at
> >>
> org.apache.kafka.queue.Kafka

Re: Kafka Node Shutting Down Automatically

2023-04-22 Thread David Arthur
Akshay, this looks a lot like
https://issues.apache.org/jira/browse/KAFKA-14035 which was fixed for
3.3.0. Can you upload complete controller logs to that JIRA (or a new one
if you prefer)?

Thanks!
David

On Sat, Apr 22, 2023 at 2:54 AM Luke Chen  wrote:

> Hi Akshay,
>
> Thanks for reporting the issue.
> It looks like a bug.
> Could you open a JIRA  ticket
> to track it?
>
> Thank you.
> Luke
>
>
> On Fri, Apr 21, 2023 at 10:16 PM Akshay Kumar
> 
> wrote:
>
> > Hello team,
> >
> >- We are using the zookeeper less Kafka (kafka Kraft).
> >- The cluster is having 3 nodes.
> >- One of the nodes gets automatically shut down randomly.
> >- Checked the logs but didn't get the exact reason.
> >- Sharing the logs below. Kafka version - 3.3.1
> >
> > *Logs - *
> >
> > [2023-04-13 01:49:17,411] WARN [Controller 1] Renouncing the leadership
> > due to a metadata log event. We were the leader at epoch 37110, but in
> the
> > new epoch 37111, the leader is (none). Reverting to last committed offset
> > 28291464. (org.apache.kafka.controller.QuorumController)
> > [2023-04-13 01:49:17,531] INFO [RaftManager nodeId=1] Completed
> transition
> > to Unattached(epoch=37112, voters=[1, 2, 3], electionTimeoutMs=982)
> > (org.apache.kafka.raft.QuorumState)
> >
> > [2023-04-13 02:00:33,902] WARN [Controller 1] Renouncing the leadership
> > due to a metadata log event. We were the leader at epoch 37116, but in
> the
> > new epoch 37117, the leader is (none). Reverting to last committed offset
> > 28292807. (org.apache.kafka.controller.QuorumController)
> > [2023-04-13 02:00:33,936] INFO [RaftManager nodeId=1] Completed
> transition
> > to Unattached(epoch=37118, voters=[1, 2, 3], electionTimeoutMs=1497)
> > (org.apache.kafka.raft.QuorumState)
> >
> > [2023-04-13 02:00:35,014] ERROR [Controller 1] processBrokerHeartbeat:
> > unable to start processing because of NotControllerException.
> > (org.apache.kafka.controller.QuorumController)
> >
> > [2023-04-13 02:12:21,883] WARN [Controller 1] Renouncing the leadership
> > due to a metadata log event. We were the leader at epoch 37129, but in
> the
> > new epoch 37131, the leader is (none). Reverting to last committed offset
> > 28294206. (org.apache.kafka.controller.QuorumController)
> >
> > [2023-04-13 02:13:41,328] WARN [Controller 1] Renouncing the leadership
> > due to a metadata log event. We were the leader at epoch 37141, but in
> the
> > new epoch 37142, the leader is (none). Reverting to last committed offset
> > 28294325. (org.apache.kafka.controller.QuorumController)
> >
> > [2023-04-13 02:13:41,328] INFO [Controller 1] writeNoOpRecord: failed
> with
> > NotControllerException in 16561838 us
> > (org.apache.kafka.controller.QuorumController)
> >
> > [2023-04-13 02:13:41,328] INFO [Controller 1] maybeFenceReplicas: failed
> > with NotControllerException in 8520846 us
> > (org.apache.kafka.controller.QuorumController)
> >
> > [2023-04-13 02:13:41,328] INFO [BrokerToControllerChannelManager broker=1
> > name=heartbeat] Client requested disconnect from node 1
> > (org.apache.kafka.clients.NetworkClient)
> > [2023-04-13 02:13:41,329] INFO [BrokerLifecycleManager id=1] Unable to
> > send a heartbeat because the RPC got timed out before it could be sent.
> > (kafka.server.BrokerLifecycleManager)
> > [2023-04-13 02:13:41,351] ERROR Encountered fatal fault: exception while
> > renouncing leadership
> > (org.apache.kafka.server.fault.ProcessExitingFaultHandler)
> > java.lang.NullPointerException
> > at
> >
> org.apache.kafka.timeline.SnapshottableHashTable$HashTier.mergeFrom(SnapshottableHashTable.java:125)
> > at org.apache.kafka.timeline.Snapshot.mergeFrom(Snapshot.java:68)
> > at
> >
> org.apache.kafka.timeline.SnapshotRegistry.deleteSnapshot(SnapshotRegistry.java:236)
> > at
> >
> org.apache.kafka.timeline.SnapshotRegistry$SnapshotIterator.remove(SnapshotRegistry.java:67)
> > at
> >
> org.apache.kafka.timeline.SnapshotRegistry.revertToSnapshot(SnapshotRegistry.java:214)
> > at
> >
> org.apache.kafka.controller.QuorumController.renounce(QuorumController.java:1232)
> > at
> >
> org.apache.kafka.controller.QuorumController.access$3300(QuorumController.java:150)
> > at
> >
> org.apache.kafka.controller.QuorumController$QuorumMetaLogListener.lambda$handleLeaderChange$3(QuorumController.java:1076)
> > at
> >
> org.apache.kafka.controller.QuorumController$QuorumMetaLogListener.lambda$appendRaftEvent$4(QuorumController.java:1101)
> > at
> >
> org.apache.kafka.controller.QuorumController$ControlEvent.run(QuorumController.java:496)
> > at
> >
> org.apache.kafka.queue.KafkaEventQueue$EventContext.run(KafkaEventQueue.java:121)
> > at
> >
> org.apache.kafka.queue.KafkaEventQueue$EventHandler.handleEvents(KafkaEventQueue.java:200)
> > at
> >
> org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:173)
> 

Re: Kafka Cluster WITHOUT Zookeeper

2023-03-28 Thread David Arthur
Paul, thanks for the articles. It's great to see someone digging into
KRaft! It would be interesting to see your experiments against a newer
version of KRaft, such as 3.4.0. Also, we hope to improve on the maximum
number of partitions with the addition of
https://cwiki.apache.org/confluence/display/KAFKA/KIP-868+Metadata+Transactions
which is likely to land in Kafka 3.5.0.

Cheers,
David

On Mon, Mar 27, 2023 at 6:32 PM Brebner, Paul
 wrote:

> I have a recent 3 part blog series on Kraft (expanded version of ApacheCon
> 2022 talk):
>
>
>
>
> https://www.instaclustr.com/blog/apache-kafka-kraft-abandons-the-zookeeper-part-1-partitions-and-data-performance/
>
>
> https://www.instaclustr.com/blog/apache-kafka-kraft-abandons-the-zookeeper-part-2-partitions-and-meta-data-performance/
>
>
> https://www.instaclustr.com/blog/apache-kafka-kraft-abandons-the-zookeeper-part-3-maximum-partitions-and-conclusions/
>
>
>
> Regards, Paul
>
>
>
> *From: *Chia-Ping Tsai 
> *Date: *Monday, 27 March 2023 at 5:37 pm
> *To: *d...@kafka.apache.org 
> *Cc: *users@kafka.apache.org ,
> mmcfarl...@cavulus.com , Israel Ekpo <
> israele...@gmail.com>, ranlupov...@gmail.com ,
> scante...@gmail.com , show...@gmail.com <
> show...@gmail.com>, sunilmchaudhar...@gmail.com <
> sunilmchaudhar...@gmail.com>
> *Subject: *Re: Kafka Cluster WITHOUT Zookeeper
>
> hi
>
>
>
> You can use the keyword “kraft” to get the answer by google or chatgpt.
> For example:
>
>
>
> Introduction:
>
> KRaft - Apache Kafka Without ZooKeeper
> 
>
> developer.confluent.io 
>
>
>
>
>
> QuickStart:
>
> Apache Kafka 
>
> kafka.apache.org 
>
>
>
>
>
> —
>
> Chia-Ping
>
>
>
>
>
>
>
> Kafka Life  wrote on March 27, 2023 at 1:33 PM:
>
> Hello  Kafka experts
>
> Is there a way where we can have Kafka Cluster be functional serving
> producers and consumers without having Zookeeper cluster manage the
> instance .
>
> Any particular version of kafka for this or how can we achieve this please
>
>

-- 
-David


Re: Question about KRaft

2023-03-10 Thread David Arthur
Hi Zhenyu,

> Currently I am using 3.3.2 (upgrade from 3.2) with only one node, which is
both controller & broker, even ZK is installed on this node too (sorry I
know it is not distributed and I will try to improve it with more knowledge
learned in future)

Controllers are always colocated with brokers in ZK mode. Only in
KRaft mode do we separate the two concepts by introducing the
"process.roles" configuration.

As Luke mentioned, you can try a regular migration by following the
docs and it should work. Essentially, you would be bringing up a new
KRaft controller (on the same or different server) and letting it do a
migration of your single-node Kafka cluster. Once you've gone through
all the steps, you should have a single KRaft broker and a single
KRaft controller. At that point you can decommission ZooKeeper.
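
The migration-enabled controller described above roughly corresponds to a
config like the following. This is a sketch based on the KIP-866 migration
docs; node.id, listeners, and connection strings are placeholders:

```properties
# Migration-enabled KRaft controller (KIP-866 sketch; placeholder values).
process.roles=controller
node.id=3000
listeners=CONTROLLER://:9093
controller.listener.names=CONTROLLER
controller.quorum.voters=3000@localhost:9093
# Enable the ZK-to-KRaft migration machinery and point at the existing ZK:
zookeeper.metadata.migration.enable=true
zookeeper.connect=localhost:2181
```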

If you run into any trouble, feel free to reach out here on the users
list or file a JIRA (if you think you found a bug 😉)
https://issues.apache.org/jira/browse/KAFKA

Cheers,
David

On Fri, Mar 10, 2023 at 12:17 AM Luke Chen  wrote:
>
> For questions related to confluent, I think you'd better ask in their
> channel.
>
> Luke
>
> On Fri, Mar 10, 2023 at 12:54 PM sunil chaudhari <
> sunilmchaudhar...@gmail.com> wrote:
>
> > Hi Luke,
> > This docu is good.
> > Does it apply for confluent as well?
> >
> >
> >
> > On Fri, 10 Mar 2023 at 8:47 AM, Luke Chen  wrote:
> >
> > > Hi Zhenyu,
> > >
> > > Answering your question:
> > >
> > > > Should I simply
> > > 1. download 3.4 binary
> > > 2. stop ZK & Kafka service
> > > 3. upgrade Kafka to 3.4
> > > 4. start only Kafka service with KRaft server.properties
> > >
> > > That is not migrating, actually. That is just creating another kafka
> > > cluster in KRaft mode.
> > > The point for migration is to move metadata in ZK into KRaft controllers.
> > > You can follow the guide here to do migration:
> > > https://kafka.apache.org/documentation/#kraft_zk_migration
> > >
> > > Thank you.
> > > Luke
> > >
> > > On Tue, Mar 7, 2023 at 11:07 PM Zhenyu Wang 
> > > wrote:
> > >
> > > > Hi Sunil,
> > > >
> > > > As mentioned earlier in my question, I have only one "combined" node as
> > > > both controller and broker, and I totally accept downtime (stop
> > service)
> > > >
> > > > So just want to ask for my case, single node, if I want to upgrade to
> > 3.4
> > > > then start service under KRaft (get rid of ZK), what would be the
> > steps?
> > > >
> > > > Thanks~
> > > >
> > > > On Mon, Mar 6, 2023 at 11:49 PM sunil chaudhari <
> > > > sunilmchaudhar...@gmail.com>
> > > > wrote:
> > > >
> > > > > How will you achieve zero downtime of you stop zookeeper and kafka?
> > > > > There must be some standard steps so that stop zookeeper one by one
> > and
> > > > > start kraft same time so that it will be migrated gradually.
> > > > >
> > > > >
> > > > >
> > > > > On Tue, 7 Mar 2023 at 9:26 AM, Zhenyu Wang 
> > > > wrote:
> > > > >
> > > > > > Hi team,
> > > > > >
> > > > > > Here is a question about KRaft from normal user, who starts to use
> > > and
> > > > > > learn Kafka since 3.2
> > > > > >
> > > > > > Last month Kafka 3.4, the first bridge release was available, and I
> > > am
> > > > > > considering to have a plan to use KRaft (get rid of ZK) since this
> > > > > version
> > > > > >
> > > > > > Currently I am using 3.3.2 (upgrade from 3.2) with only one node,
> > > which
> > > > > is
> > > > > > both controller & broker, even ZK is installed on this node too
> > > (sorry
> > > > I
> > > > > > know it is not distributed and I will try to improve it with more
> > > > > knowledge
> > > > > > learned in future)
> > > > > >
> > > > > > When I read KIP-866, ZK to KRaft migration, from section Migration
> > > > > > Overview, seems like the document is for multi-nodes with no or
> > > almost
> > > > no
> > > > > > downtime, enable KRaft node by node; however my case accepts
> > downtime
> > > > > (one
> > > > > > node -_-!!), just want to have Kafka upgrade to 3.4 then start
> > > service
> > > > > > under KRaft mode, make sure everything works well and no log lost
> > > > > >
> > > > > > Should I simply
> > > > > > 1. download 3.4 binary
> > > > > > 2. stop ZK & Kafka service
> > > > > > 3. upgrade Kafka to 3.4
> > > > > > 4. start only Kafka service with KRaft server.properties
> > > > > >
> > > > > > Or any other thing I need to pay attention to?
> > > > > >
> > > > > > If there is a documentation as guide that would be quite helpful
> > > > > >
> > > > > > Really appreciate
> > > > > >
> > > > >
> > > >
> > >
> >



-- 
David Arthur


[ANNOUNCE] Apache Kafka 3.4.0

2023-02-07 Thread David Arthur
The Apache Kafka community is pleased to announce the release of
Apache Kafka 3.4.0.

This is a major release and it includes fixes and improvements from
over 120 JIRAs.

All of the changes in this release can be found in the release notes:
https://www.apache.org/dist/kafka/3.4.0/RELEASE_NOTES.html

An overview of the release can be found in our announcement blog post:
https://blogs.apache.org/kafka/entry/what-s-new-in-apache9

You can download the source and binary release (Scala 2.12 and Scala 2.13) from:

https://kafka.apache.org/downloads#3.4.0

---

Apache Kafka is a distributed streaming platform with four core APIs:
** The Producer API allows an application to publish a stream of
records to one or more Kafka topics.
** The Consumer API allows an application to subscribe to one or more
topics and process the stream of records produced to them.
** The Streams API allows an application to act as a stream processor,
consuming an input stream from one or more topics and producing an
output stream to one or more output topics, effectively transforming
the input streams to output streams.
** The Connector API allows building and running reusable producers or
consumers that connect Kafka topics to existing applications or data
systems. For example, a connector to a relational database might
capture every change to a table.

With these APIs, Kafka can be used for two broad classes of application:
** Building real-time streaming data pipelines that reliably get data
between systems or applications.
** Building real-time streaming applications that transform or react
to the streams of data.


Apache Kafka is in use at large and small companies worldwide,
including Capital One, Goldman Sachs, ING, LinkedIn, Netflix,
Pinterest, Rabobank, Target, The New York Times, Uber, Yelp, and
Zalando, among others.

A big thank you to the following 117 contributors to this release!

A. Sophie Blee-Goldman, Ahmed Sobeh, Akhilesh C, Akhilesh Chaganti,
Alan Sheinberg, aLeX, Alex Sorokoumov, Alexandre Garnier, Alyssa
Huang, Andras Katona, Andrew Borley, Andrew Dean, andymg3, Artem
Livshits, Ashmeet Lamba, Badai Aqrandista, Bill Bejeck, Bruno Cadonna,
Calvin Liu, Chase Thomas, Chia-Ping Tsai, Chris Egerton, Christo
Lolov, Christopher L. Shannon, Colin P. McCabe, Colin Patrick McCabe,
Dalibor Plavcic, Dan Stelljes, Daniel Fonai, David Arthur, David
Jacot, David Karlsson, David Mao, dengziming, Derek Troy-West, Divij
Vaidya, Edoardo Comar, Elkhan Eminov, Eugene Tolbakov, Federico
Valeri, Francesco Nigro, FUNKYE, Greg Harris, Guozhang Wang, Hao Li,
Himani Arora, Huilin Shi, Igor Soarez, Ismael Juma, James Hughes,
Janik Dotzel, Jason Gustafson, Jeff Kim, Jim Galasyn, JK-Wang, Joel
Hamill, John Roesler, Jonathan Albrecht, Jordan Bull, Jorge Esteban
Quilcate Otoya, José Armando García Sancio, Justine Olshan, K8sCat,
Kirk True, Kvicii, Levani Kokhreidze, Liam Clarke-Hutchinson,
LinShunKang, liuzc9, liuzhuang2017, Lucas Brutschy, Lucia Cerchie,
Luke Chen, Manikumar Reddy, Matthew de Detrich, Matthew Stidham,
Matthias J. Sax, Mickael Maison, Nandini Anagondi, Nick Telford,
nicolasguyomar, Niket, Niket Goel, Nikolay, Okada Haruki, Oliver
Eikemeier, Omnia G H Ibrahim, Orsák Maroš, Patrik Marton, Peter Nied,
Philip Nee, Philipp Trulson, Pratim SC, Proven Provenzano, Purshotam
Chauhan, Rajini Sivaram, Ramesh, Rens Groothuijsen, RivenSun, Rohan,
Ron Dagostino, runom, Sanjana Kaundinya, Satish Duggana, Shawn, Shay
Lin, Shenglong Zhang, srishti-saraswat, Stanislav Vodetskyi, Sushant
Mahajan, Tom Bentley, vamossagar12, venkatteki, Vicky Papavasileiou,
Walker Carlson, Yash Mayya, zou shengfu, 行路难行路


We welcome your help and feedback. For more information on how to
report problems, and to get involved, visit the project website at
https://kafka.apache.org/

Thank you!

Regards,
David Arthur


Re: [VOTE] 3.4.0 RC2

2023-02-06 Thread David Arthur
I'd like to go ahead and close out this vote. There were three binding
votes and one non-binding vote.

Binding +1 PMC votes:
* David Jacot
* Bill Bejeck
* Mickael Maison

Non-binding community votes:
* Federico Valeri

There were no -1 votes.

The vote for Apache Kafka 3.4.0 RC2 passes.

Thanks to all who voted and a big thanks to Sophie who did a vast majority
of the work to prepare this release. We will continue the release process
and get the announcement sent out this week.

Cheers,
David Arthur

On Fri, Feb 3, 2023 at 12:25 PM Federico Valeri 
wrote:

> +1 (non binding)
>
> - Ran the unit and integration test suites with Java 17 and Scala 2.13
> - Ran a series of basic examples and client configurations
> - Spot checked the docs and Javadocs
>
> Thanks
> Fede
>
> On Fri, Feb 3, 2023 at 5:29 PM Jakub Scholz  wrote:
> >
> > +1 (non-binding). I run my tests with the staged Scala 2.13 binaries and
> > staged Maven artifacts. All seems to work fine.
> >
> > Thanks & Regards
> > Jakub
> >
> > On Tue, Jan 31, 2023 at 8:01 PM David Arthur 
> wrote:
> >
> > > Hey folks, we found a couple of blockers with RC1 and have fixed them
> in
> > > the latest release candidate, RC2.
> > >
> > > The major features of this release include:
> > >
> > > * KIP-881: Rack-aware Partition Assignment for Kafka Consumers
> > > <
> > >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-881%3A+Rack-aware+Partition+Assignment+for+Kafka+Consumers
> > > >
> > >
> > > * KIP-876: Time based cluster metadata snapshots
> > > <
> > >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-876%3A+Time+based+cluster+metadata+snapshots
> > > >
> > >
> > > * KIP-787: MM2 manage Kafka resources with custom Admin implementation.
> > > <
> > >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191335620
> > > >
> > >
> > > * KIP-866 ZooKeeper to KRaft Migration
> > > <
> > >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-866+ZooKeeper+to+KRaft+Migration
> > > >
> > > (Early
> > > Access)
> > >
> > >
> > >
> > > Release notes for the 3.4.0 release:
> > >
> > >
> https://home.apache.org/~davidarthur/kafka-3.4.0-rc2/RELEASE_NOTES.html
> > >
> > >
> > > Please download, test and vote by Friday, February 3, 5pm PT
> > >
> > >
> > > ---
> > >
> > >
> > > Kafka's KEYS file containing PGP keys we use to sign the release:
> > >
> > > https://kafka.apache.org/KEYS
> > >
> > >
> > > * Release artifacts to be voted upon (source and binary):
> > >
> > > https://home.apache.org/~davidarthur/kafka-3.4.0-rc2/
> > >
> > >
> > > * Maven artifacts to be voted upon:
> > >
> > > https://repository.apache.org/content/groups/staging/org/apache/kafka/
> > >
> > >
> > > * Javadoc:
> > >
> > > https://home.apache.org/~davidarthur/kafka-3.4.0-rc2/javadoc/
> > >
> > >
> > > * Tag to be voted upon (off 3.4 branch) is the 3.4.0 tag:
> > >
> > > https://github.com/apache/kafka/releases/tag/3.4.0-rc2
> > >
> > >
> > > * Documentation:
> > >
> > > https://kafka.apache.org/34/documentation.html
> > >
> > >
> > > * Protocol:
> > >
> > > https://kafka.apache.org/34/protocol.html
> > >
> > >
> > > ---
> > >
> > >
> > > Test results:
> > >
> > >
> > > We haven't had a 100% passing build, but the latest system test run
> looks
> > > pretty good:
> > >
> > >
> http://confluent-kafka-system-test-results.s3-us-west-2.amazonaws.com/3.4/2023-01-31--001.system-test-kafka-3.4--1675184554--confluentinc--3.4--ef3f5bd834/report.html
> > >
> > >
> > > Here are the Jenkins test runs for 3.4:
> > > https://ci-builds.apache.org/job/Kafka/job/kafka/job/3.4/. We will
> > > continue
> > > trying to diagnose the flaky test failures as the release continues. I
> do
> > > not expect that any of these test failures are blockers for the
> release.
> > >
> > >
> > > Thanks!
> > >
> > > David Arthur
> > >
>


[VOTE] 3.4.0 RC2

2023-01-31 Thread David Arthur
Hey folks, we found a couple of blockers with RC1 and have fixed them in
the latest release candidate, RC2.

The major features of this release include:

* KIP-881: Rack-aware Partition Assignment for Kafka Consumers
<https://cwiki.apache.org/confluence/display/KAFKA/KIP-881%3A+Rack-aware+Partition+Assignment+for+Kafka+Consumers>

* KIP-876: Time based cluster metadata snapshots
<https://cwiki.apache.org/confluence/display/KAFKA/KIP-876%3A+Time+based+cluster+metadata+snapshots>

* KIP-787: MM2 manage Kafka resources with custom Admin implementation.
<https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191335620>

* KIP-866: ZooKeeper to KRaft Migration (Early Access)
<https://cwiki.apache.org/confluence/display/KAFKA/KIP-866+ZooKeeper+to+KRaft+Migration>



Release notes for the 3.4.0 release:

https://home.apache.org/~davidarthur/kafka-3.4.0-rc2/RELEASE_NOTES.html


Please download, test and vote by Friday, February 3, 5pm PT


---


Kafka's KEYS file containing PGP keys we use to sign the release:

https://kafka.apache.org/KEYS


* Release artifacts to be voted upon (source and binary):

https://home.apache.org/~davidarthur/kafka-3.4.0-rc2/
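For anyone validating the candidate, the usual first step is to check a downloaded artifact against the KEYS file and its published .sha512 file. The sketch below demonstrates the checksum step on a stand-in file (the real release tarball and its published checksum would replace the locally generated ones); the gpg steps are shown only as comments since they need network access, and `sha512sum` assumes GNU coreutils:

```shell
# Signature check (comments only; requires fetching KEYS and the .asc file):
#   curl -O https://kafka.apache.org/KEYS && gpg --import KEYS
#   gpg --verify kafka-3.4.0-src.tgz.asc kafka-3.4.0-src.tgz

# Checksum check, demonstrated on a stand-in file. With the real artifact
# you would download the published kafka-3.4.0-src.tgz.sha512 instead of
# generating one locally.
echo "stand-in for kafka-3.4.0-src.tgz" > kafka-3.4.0-src.tgz
sha512sum kafka-3.4.0-src.tgz > kafka-3.4.0-src.tgz.sha512

# Verify mode re-hashes the file and compares against the recorded digest;
# it prints "kafka-3.4.0-src.tgz: OK" on a match and exits non-zero otherwise.
sha512sum -c kafka-3.4.0-src.tgz.sha512
```

On macOS without coreutils, `shasum -a 512` is the equivalent invocation.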


* Maven artifacts to be voted upon:

https://repository.apache.org/content/groups/staging/org/apache/kafka/


* Javadoc:

https://home.apache.org/~davidarthur/kafka-3.4.0-rc2/javadoc/


* Tag to be voted upon (off 3.4 branch) is the 3.4.0 tag:

https://github.com/apache/kafka/releases/tag/3.4.0-rc2


* Documentation:

https://kafka.apache.org/34/documentation.html


* Protocol:

https://kafka.apache.org/34/protocol.html


---


Test results:


We haven't had a 100% passing build, but the latest system test run looks
pretty good:
http://confluent-kafka-system-test-results.s3-us-west-2.amazonaws.com/3.4/2023-01-31--001.system-test-kafka-3.4--1675184554--confluentinc--3.4--ef3f5bd834/report.html


Here are the Jenkins test runs for 3.4:
https://ci-builds.apache.org/job/Kafka/job/kafka/job/3.4/. We will continue
trying to diagnose the flaky test failures as the release continues. I do
not expect that any of these test failures are blockers for the release.


Thanks!

David Arthur


[VOTE] 3.4.0 RC0

2023-01-11 Thread David Arthur
Hello Kafka users, developers and client-developers,


This is the first candidate for release of Apache Kafka 3.4.0. Some of the
major features include:


* KIP-881: Rack-aware Partition Assignment for Kafka Consumers


* KIP-876: Time based cluster metadata snapshots


* KIP-787: MM2 manage Kafka resources with custom Admin implementation.


* KIP-866: ZooKeeper to KRaft Migration (Early Access)


For a full list of the features in this release, please refer to the
release notes:

https://home.apache.org/~davidarthur/kafka-3.4.0-rc0/RELEASE_NOTES.html


*** Please download, test and vote by Wednesday, Jan 18th, 2023.


Kafka's KEYS file containing PGP keys we use to sign the release:

https://kafka.apache.org/KEYS


* Release artifacts to be voted upon (source and binary):

https://home.apache.org/~davidarthur/kafka-3.4.0-rc0/


* Maven artifacts to be voted upon:

https://repository.apache.org/content/groups/staging/org/apache/kafka/


* Javadoc:

https://home.apache.org/~davidarthur/kafka-3.4.0-rc0/javadoc/


* Tag to be voted upon (off 3.4 branch) is the 3.4.0 tag:

https://github.com/apache/kafka/releases/tag/3.4.0-rc0


* Documentation:

https://kafka.apache.org/34/documentation.html


* Protocol:

https://kafka.apache.org/34/protocol.html


The Jenkins unit/integration build for 3.4 is currently experiencing
spurious timeouts. The recent build history of 3.4 shows some green builds.
https://ci-builds.apache.org/job/Kafka/job/kafka/job/3.4/. If we get a
solid all-green build, I'll update this thread.


System tests are currently running, we'll send out results once they are
available.


A big thanks to Sophie for running this release!


Re: [VOTE] 3.3.0 RC2

2022-09-28 Thread David Arthur
For those interested, here's a PR to fix the HTML rendering issue:
https://github.com/apache/kafka-site/pull/446

-David

On Wed, Sep 28, 2022 at 9:45 AM David Arthur  wrote:
>
> Thanks Divij, we made a bunch of documentation changes at the last minute for 
> this release, so we’re trying to figure out when the issue was introduced. 
> I’m hoping we can fix it before we send out the announcement.
>
> Thanks for calling it out :)
>
> Best,
> David
>
> On Wed, Sep 28, 2022 at 09:11 Divij Vaidya  wrote:
>>
>> Please ignore my previous email. Seems like that is a known issue and we
>> have a plan to fix it after the release.
>>
>> Divij Vaidya
>>
>>
>>
>> On Wed, Sep 28, 2022 at 3:08 PM Divij Vaidya 
>> wrote:
>>
>> > Hey folks
>> >
>> > I noticed a non-blocking bug with the documentation page where the arrow
>> > to left nav overlaps with the text and a blue color vertical bar appears at
>> > the right side. Please see the highlighted elements in the attached image.
>> > In contrast, the current documentation page does not have this bug.
>> >
>> > Reproducer:
>> > 1. Visit https://kafka.apache.org/33/documentation.html on a
>> > chrome browser.
>> > 2. Observe that arrow to expand left nav overlaps with text.
>> > 3. Remove "33/" from the url to observe the current documentation.
>> > 4. Observe that the current website does not have this bug.
>> >
>> >
>> > Divij Vaidya
>> >
>> >
>> >
>> > On Tue, Sep 27, 2022 at 8:35 PM David Arthur 
>> > wrote:
>> >
>> >> I re-ran the failing system tests last night and got passing builds
>> >> for each. There is still some flakiness it seems.
>> >>
>> >> Round trip test:
>> >>
>> >> http://confluent-kafka-branch-builder-system-test-results.s3-us-west-2.amazonaws.com/system-test-kafka-branch-builder--1664223686--apache--3.3--9b8a48ca2a/2022-09-26--001./2022-09-26--001./report.html
>> >>
>> >> Upgrade test:
>> >> http://confluent-kafka-branch-builder-system-test-results.s3-us-west-2.amazonaws.com/system-test-kafka-branch-builder--1664235839--apache--3.3--1ce7bd7f29/2022-09-26--001./2022-09-26--001./report.html
>> >>
>> >> I was also able to verify that the failing delegation token test was
>> >> not a regression, rather it was an issue with the test. I've opened a
>> >> PR with a fix for the test https://github.com/apache/kafka/pull/12693.
>> >> Included in the PR are the results of this fix applied to the 3.3
>> >> branch (confirming we don't have a regression).
>> >>
>> >> I also filed a JIRA for the flaky upgrade_test
>> >> https://issues.apache.org/jira/browse/KAFKA-14263
>> >>
>> >> With all that out of the way, I'm happy to close this vote with the
>> >> following results:
>> >>
>> >> 5 binding +1 votes from PMC members John R, David J, Bill B, Ismael J,
>> >> and Mickael M.
>> >> 1 non-binding +1 community vote from Jakub Scholz
>> >> No -1 votes
>> >>
>> >> The vote for Apache Kafka 3.3 passes!
>> >>
>> >> Thanks to everyone who voted and helped verify this release! A special
>> >> thanks to José who has driven the release up to this point.
>> >>
>> >> Best,
>> >> David Arthur
>> >>
>> >>
>> >> On Tue, Sep 27, 2022 at 6:50 AM Mickael Maison 
>> >> wrote:
>> >> >
>> >> > +1 (binding)
>> >> > I checked the signatures/checksums, built from source and ran tests,
>> >> > and ran the quickstart with the 2.13 binaries.
>> >> >
>> >> > Thanks José and David for running this release
>> >> >
>> >> > On Mon, Sep 26, 2022 at 11:07 PM David Arthur 
>> >> wrote:
>> >> > >
>> >> > > Thanks for the votes, everyone!
>> >> > >
>> >> > > Here is the best recent run of the system tests on 3.3
>> >> > >
>> >> http://confluent-kafka-system-test-results.s3-us-west-2.amazonaws.com/3.3/2022-09-24--001.system-test-kafka-3.3--1664037736--confluentinc--3.3--b084e9068c/report.html
>> >> .
>> >> > > I'm currently re-running the failed tests to confirm that they are
>> >> > > merely flaky and not broken. One exception is the failing delegation
> >> > > token test that is consistently failing.

Re: [VOTE] 3.3.0 RC2

2022-09-28 Thread David Arthur
Thanks Divij, we made a bunch of documentation changes at the last minute
for this release, so we’re trying to figure out when the issue was
introduced. I’m hoping we can fix it before we send out the announcement.

Thanks for calling it out :)

Best,
David

On Wed, Sep 28, 2022 at 09:11 Divij Vaidya  wrote:

> Please ignore my previous email. Seems like that is a known issue and we
> have a plan to fix it after the release.
>
> Divij Vaidya
>
>
>
> On Wed, Sep 28, 2022 at 3:08 PM Divij Vaidya 
> wrote:
>
> > Hey folks
> >
> > I noticed a non-blocking bug with the documentation page where the arrow
> > to left nav overlaps with the text and a blue color vertical bar appears
> at
> > the right side. Please see the highlighted elements in the attached
> image.
> > In contrast, the current documentation page does not have this bug.
> >
> > Reproducer:
> > 1. Visit https://kafka.apache.org/33/documentation.html on a
> > chrome browser.
> > 2. Observe that arrow to expand left nav overlaps with text.
> > 3. Remove "33/" from the url to observe the current documentation.
> > 4. Observe that the current website does not have this bug.
> >
> >
> > Divij Vaidya
> >
> >
> >
> > On Tue, Sep 27, 2022 at 8:35 PM David Arthur 
> > wrote:
> >
> >> I re-ran the failing system tests last night and got passing builds
> >> for each. There is still some flakiness it seems.
> >>
> >> Round trip test:
> >>
> >>
> http://confluent-kafka-branch-builder-system-test-results.s3-us-west-2.amazonaws.com/system-test-kafka-branch-builder--1664223686--apache--3.3--9b8a48ca2a/2022-09-26--001./2022-09-26--001./report.html
> >>
> >> Upgrade test:
> >>
> http://confluent-kafka-branch-builder-system-test-results.s3-us-west-2.amazonaws.com/system-test-kafka-branch-builder--1664235839--apache--3.3--1ce7bd7f29/2022-09-26--001./2022-09-26--001./report.html
> >>
> >> I was also able to verify that the failing delegation token test was
> >> not a regression, rather it was an issue with the test. I've opened a
> >> PR with a fix for the test https://github.com/apache/kafka/pull/12693.
> >> Included in the PR are the results of this fix applied to the 3.3
> >> branch (confirming we don't have a regression).
> >>
> >> I also filed a JIRA for the flaky upgrade_test
> >> https://issues.apache.org/jira/browse/KAFKA-14263
> >>
> >> With all that out of the way, I'm happy to close this vote with the
> >> following results:
> >>
> >> 5 binding +1 votes from PMC members John R, David J, Bill B, Ismael J,
> >> and Mickael M.
> >> 1 non-binding +1 community vote from Jakub Scholz
> >> No -1 votes
> >>
> >> The vote for Apache Kafka 3.3 passes!
> >>
> >> Thanks to everyone who voted and helped verify this release! A special
> >> thanks to José who has driven the release up to this point.
> >>
> >> Best,
> >> David Arthur
> >>
> >>
> >> On Tue, Sep 27, 2022 at 6:50 AM Mickael Maison <
> mickael.mai...@gmail.com>
> >> wrote:
> >> >
> >> > +1 (binding)
> >> > I checked the signatures/checksums, built from source and ran tests,
> >> > and ran the quickstart with the 2.13 binaries.
> >> >
> >> > Thanks José and David for running this release
> >> >
> >> > On Mon, Sep 26, 2022 at 11:07 PM David Arthur  >
> >> wrote:
> >> > >
> >> > > Thanks for the votes, everyone!
> >> > >
> >> > > Here is the best recent run of the system tests on 3.3
> >> > >
> >>
> http://confluent-kafka-system-test-results.s3-us-west-2.amazonaws.com/3.3/2022-09-24--001.system-test-kafka-3.3--1664037736--confluentinc--3.3--b084e9068c/report.html
> >> .
> >> > > I'm currently re-running the failed tests to confirm that they are
> >> > > merely flaky and not broken. One exception is the failing delegation
> >> > > token test that is consistently failing. This appears to be an issue
> >> > > with the test itself due to changes in the command output introduced
> >> > > recently.
> >> > >
> >> > > Similarly, the unit/integration tests are mostly passing, with some
> >> > > flaky failures. Here is a recent run that has two out of three jobs
> >> > > passing
> https://ci-builds.apache.org/job/Kafka/job/kafka/job/3.3/86/.

Re: [VOTE] 3.3.0 RC2

2022-09-27 Thread David Arthur
I re-ran the failing system tests last night and got passing builds
for each. There is still some flakiness it seems.

Round trip test:
http://confluent-kafka-branch-builder-system-test-results.s3-us-west-2.amazonaws.com/system-test-kafka-branch-builder--1664223686--apache--3.3--9b8a48ca2a/2022-09-26--001./2022-09-26--001./report.html

Upgrade test: 
http://confluent-kafka-branch-builder-system-test-results.s3-us-west-2.amazonaws.com/system-test-kafka-branch-builder--1664235839--apache--3.3--1ce7bd7f29/2022-09-26--001./2022-09-26--001./report.html

I was also able to verify that the failing delegation token test was
not a regression, rather it was an issue with the test. I've opened a
PR with a fix for the test https://github.com/apache/kafka/pull/12693.
Included in the PR are the results of this fix applied to the 3.3
branch (confirming we don't have a regression).

I also filed a JIRA for the flaky upgrade_test
https://issues.apache.org/jira/browse/KAFKA-14263

With all that out of the way, I'm happy to close this vote with the
following results:

5 binding +1 votes from PMC members John R, David J, Bill B, Ismael J,
and Mickael M.
1 non-binding +1 community vote from Jakub Scholz
No -1 votes

The vote for Apache Kafka 3.3 passes!

Thanks to everyone who voted and helped verify this release! A special
thanks to José who has driven the release up to this point.

Best,
David Arthur


On Tue, Sep 27, 2022 at 6:50 AM Mickael Maison  wrote:
>
> +1 (binding)
> I checked the signatures/checksums, built from source and ran tests,
> and ran the quickstart with the 2.13 binaries.
>
> Thanks José and David for running this release
>
> On Mon, Sep 26, 2022 at 11:07 PM David Arthur  wrote:
> >
> > Thanks for the votes, everyone!
> >
> > Here is the best recent run of the system tests on 3.3
> > http://confluent-kafka-system-test-results.s3-us-west-2.amazonaws.com/3.3/2022-09-24--001.system-test-kafka-3.3--1664037736--confluentinc--3.3--b084e9068c/report.html.
> > I'm currently re-running the failed tests to confirm that they are
> > merely flaky and not broken. One exception is the failing delegation
> > token test that is consistently failing. This appears to be an issue
> > with the test itself due to changes in the command output introduced
> > recently.
> >
> > Similarly, the unit/integration tests are mostly passing, with some
> > flaky failures. Here is a recent run that has two out of three jobs
> > passing https://ci-builds.apache.org/job/Kafka/job/kafka/job/3.3/86/.
> >
> > Once I verify the system tests are good, I'll update this thread and
> > close out the vote.
> >
> > Thanks!
> > David
> >
> >
> > On Mon, Sep 26, 2022 at 2:24 PM Ismael Juma  wrote:
> > >
> > > +1 provided that the system test results are good. Can you please post 
> > > them
> > > along with the JUnit test results (these seem ok although there are some
> > > flakes)?
> > >
> > > I tested the kraft quick start with the Scala 2.13 binary and ran the 
> > > tests
> > > on the source release. I noticed a non-blocker issue with the KRaft readme
> > > and submitted a PR:
> > >
> > > https://github.com/apache/kafka/pull/12688
> > >
> > > Ismael
> > >
> > > On Tue, Sep 20, 2022 at 4:17 PM David Arthur  
> > > wrote:
> > >
> > > > Hello Kafka users, developers and client-developers,
> > > >
> > > > This is the second release candidate for Apache Kafka 3.3.0. Many new
> > > > features and bug fixes are included in this major release of Kafka. A
> > > > significant number of the issues in this release are related to KRaft,
> > > > which will be considered "production ready" as part of this release
> > > > (KIP-833)
> > > >
> > > > KRaft improvements:
> > > > * KIP-778: Online KRaft to KRaft Upgrades
> > > > * KIP-833: Mark KRaft as Production Ready
> > > > * KIP-835: Monitor Quorum health (many new KRaft metrics)
> > > > * KIP-836: Expose voter lag via kafka-metadata-quorum.sh
> > > > * KIP-841: Fenced replicas should not be allowed to join the ISR in 
> > > > KRaft
> > > > * KIP-859: Add Metadata Log Processing Error Related Metrics
> > > >
> > > > Other major improvements include:
> > > > * KIP-618: Exactly-Once Support for Source Connectors
> > > > * KIP-831: Add metric for log recovery progress
> > > > * KIP-827: Expose logdirs total and usable space via Kafka API
> > > > * KIP-834: Add ability to Pause / Resume KafkaStreams Topologies

Re: [VOTE] 3.3.0 RC2

2022-09-26 Thread David Arthur
Thanks for the votes, everyone!

Here is the best recent run of the system tests on 3.3
http://confluent-kafka-system-test-results.s3-us-west-2.amazonaws.com/3.3/2022-09-24--001.system-test-kafka-3.3--1664037736--confluentinc--3.3--b084e9068c/report.html.
I'm currently re-running the failed tests to confirm that they are
merely flaky and not broken. One exception is the failing delegation
token test that is consistently failing. This appears to be an issue
with the test itself due to changes in the command output introduced
recently.

Similarly, the unit/integration tests are mostly passing, with some
flaky failures. Here is a recent run that has two out of three jobs
passing https://ci-builds.apache.org/job/Kafka/job/kafka/job/3.3/86/.

Once I verify the system tests are good, I'll update this thread and
close out the vote.

Thanks!
David


On Mon, Sep 26, 2022 at 2:24 PM Ismael Juma  wrote:
>
> +1 provided that the system test results are good. Can you please post them
> along with the JUnit test results (these seem ok although there are some
> flakes)?
>
> I tested the kraft quick start with the Scala 2.13 binary and ran the tests
> on the source release. I noticed a non-blocker issue with the KRaft readme
> and submitted a PR:
>
> https://github.com/apache/kafka/pull/12688
>
> Ismael
>
> On Tue, Sep 20, 2022 at 4:17 PM David Arthur  wrote:
>
> > Hello Kafka users, developers and client-developers,
> >
> > This is the second release candidate for Apache Kafka 3.3.0. Many new
> > features and bug fixes are included in this major release of Kafka. A
> > significant number of the issues in this release are related to KRaft,
> > which will be considered "production ready" as part of this release
> > (KIP-833)
> >
> > KRaft improvements:
> > * KIP-778: Online KRaft to KRaft Upgrades
> > * KIP-833: Mark KRaft as Production Ready
> > * KIP-835: Monitor Quorum health (many new KRaft metrics)
> > * KIP-836: Expose voter lag via kafka-metadata-quorum.sh
> > * KIP-841: Fenced replicas should not be allowed to join the ISR in KRaft
> > * KIP-859: Add Metadata Log Processing Error Related Metrics
> >
> > Other major improvements include:
> > * KIP-618: Exactly-Once Support for Source Connectors
> > * KIP-831: Add metric for log recovery progress
> > * KIP-827: Expose logdirs total and usable space via Kafka API
> > * KIP-834: Add ability to Pause / Resume KafkaStreams Topologies
> >
> > The full release notes are available here:
> > https://home.apache.org/~davidarthur/kafka-3.3.0-rc2/RELEASE_NOTES.html
> >
> > Please download, test and vote by Monday, Sep 26 at 5pm EDT
> >
> > Also, huge thanks to José for running the release so far. He has done
> > the vast majority of the work to prepare this rather large release :)
> >
> > -
> >
> > Kafka's KEYS file containing PGP keys we use to sign the release:
> > https://kafka.apache.org/KEYS
> >
> > * Release artifacts to be voted upon (source and binary):
> > https://home.apache.org/~davidarthur/kafka-3.3.0-rc2/
> >
> > * Maven artifacts to be voted upon:
> > https://repository.apache.org/content/groups/staging/org/apache/kafka/
> >
> > * Javadoc: https://home.apache.org/~davidarthur/kafka-3.3.0-rc2/javadoc/
> >
> > * Tag to be voted upon (off 3.3 branch) is the 3.3.0 tag:
> > https://github.com/apache/kafka/releases/tag/3.3.0-rc2
> >
> > * Documentation:  https://kafka.apache.org/33/documentation.html
> >
> > * Protocol: https://kafka.apache.org/33/protocol.html
> >
> >
> >
> >
> > Successful Jenkins builds to follow in a future update to this email.
> >
> >
> > Thanks!
> > David Arthur
> >


Re: [kafka-clients] Re: [VOTE] 3.3.0 RC2

2022-09-26 Thread David Arthur
Happy Monday, everyone! I've uploaded the 3.3 docs to the website to
make that part of the release validation easier

https://kafka.apache.org/33/documentation.html

These docs are not referenced by the top-level docs page, so they can
only be accessed directly via the "33" URL above.

I'd like to close out the vote by the end of day today, so please take a look.

Thanks!
David

On Thu, Sep 22, 2022 at 9:06 AM David Arthur
 wrote:
>
> Josep, thanks for the note. We will mention the CVEs fixed in this release
> in the announcement email. I believe we can also update the release notes
> HTML after the vote is complete.
>
> -David
>
> On Wed, Sep 21, 2022 at 2:51 AM 'Josep Prat' via kafka-clients <
> kafka-clie...@googlegroups.com> wrote:
>
> > Hi David,
> >
> > Thanks for driving this. One question, should we include in the release
> > notes the recently fixed CVE vulnerability? I understand this not being
> > explicitly mentioned on the recently released versions to not cause an
> > unintentional 0-day, but I think it could be mentioned for this release.
> > What do you think?
> >
> > Best,
> >
> > On Wed, Sep 21, 2022 at 1:17 AM David Arthur 
> > wrote:
> >
> >> Hello Kafka users, developers and client-developers,
> >>
> >> This is the second release candidate for Apache Kafka 3.3.0. Many new
> >> features and bug fixes are included in this major release of Kafka. A
> >> significant number of the issues in this release are related to KRaft,
> >> which will be considered "production ready" as part of this release
> >> (KIP-833)
> >>
> >> KRaft improvements:
> >> * KIP-778: Online KRaft to KRaft Upgrades
> >> * KIP-833: Mark KRaft as Production Ready
> >> * KIP-835: Monitor Quorum health (many new KRaft metrics)
> >> * KIP-836: Expose voter lag via kafka-metadata-quorum.sh
> >> * KIP-841: Fenced replicas should not be allowed to join the ISR in KRaft
> >> * KIP-859: Add Metadata Log Processing Error Related Metrics
> >>
> >> Other major improvements include:
> >> * KIP-618: Exactly-Once Support for Source Connectors
> >> * KIP-831: Add metric for log recovery progress
> >> * KIP-827: Expose logdirs total and usable space via Kafka API
> >> * KIP-834: Add ability to Pause / Resume KafkaStreams Topologies
> >>
> >> The full release notes are available here:
> >> https://home.apache.org/~davidarthur/kafka-3.3.0-rc2/RELEASE_NOTES.html
> >>
> >> Please download, test and vote by Monday, Sep 26 at 5pm EDT
> >>
> >> Also, huge thanks to José for running the release so far. He has done
> >> the vast majority of the work to prepare this rather large release :)
> >>
> >> -
> >>
> >> Kafka's KEYS file containing PGP keys we use to sign the release:
> >> https://kafka.apache.org/KEYS
> >>
> >> * Release artifacts to be voted upon (source and binary):
> >> https://home.apache.org/~davidarthur/kafka-3.3.0-rc2/
> >>
> >> * Maven artifacts to be voted upon:
> >> https://repository.apache.org/content/groups/staging/org/apache/kafka/
> >>
> >> * Javadoc: https://home.apache.org/~davidarthur/kafka-3.3.0-rc2/javadoc/
> >>
> >> * Tag to be voted upon (off 3.3 branch) is the 3.3.0 tag:
> >> https://github.com/apache/kafka/releases/tag/3.3.0-rc2
> >>
> >> * Documentation:  https://kafka.apache.org/33/documentation.html
> >>
> >> * Protocol: https://kafka.apache.org/33/protocol.html
> >>
> >>
> >>
> >>
> >> Successful Jenkins builds to follow in a future update to this email.
> >>
> >>
> >> Thanks!
> >> David Arthur
> >>
> >
> >
> > --
> > [image: Aiven] <https://www.aiven.io>
> >
> > *Josep Prat*
> > Open Source Engineering Director, *Aiven*
> > josep.p...@aiven.io   |   +491715557497
> > aiven.io <https://www.aiven.io>   |
> > <https://www.facebook.com/aivencloud>
> > <https://www.linkedin.com/company/aiven/>   <https://twitter.com/aiven_io>
> > *Aiven Deutschland GmbH*
> > Immanuelkirchstraße 26, 10405 Berlin
> > Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
> > Amtsgericht Charlottenburg, HRB 209739 B
> >
> > --
> > You received this message because you are subscribed to the Google Groups
> > "kafka-clients" group.
> > To unsubscribe from this group and stop receiving emails from it, send an
> > email to kafka-clients+unsubscr...@googlegroups.com.
> > To view this discussion on the web visit
> > https://groups.google.com/d/msgid/kafka-clients/CAOJ18G4DE9Q_DYyZTbDLF6J6MRj30WrCNj6njrYRV3SQeThs-w%40mail.gmail.com
> > <https://groups.google.com/d/msgid/kafka-clients/CAOJ18G4DE9Q_DYyZTbDLF6J6MRj30WrCNj6njrYRV3SQeThs-w%40mail.gmail.com?utm_medium=email&utm_source=footer>
> > .
> >
>
>
> --
> -David


Re: [kafka-clients] Re: [VOTE] 3.3.0 RC2

2022-09-22 Thread David Arthur
Josep, thanks for the note. We will mention the CVEs fixed in this release
in the announcement email. I believe we can also update the release notes
HTML after the vote is complete.

-David

On Wed, Sep 21, 2022 at 2:51 AM 'Josep Prat' via kafka-clients <
kafka-clie...@googlegroups.com> wrote:

> Hi David,
>
> Thanks for driving this. One question, should we include in the release
> notes the recently fixed CVE vulnerability? I understand this not being
> explicitly mentioned on the recently released versions to not cause an
> unintentional 0-day, but I think it could be mentioned for this release.
> What do you think?
>
> Best,
>
> On Wed, Sep 21, 2022 at 1:17 AM David Arthur 
> wrote:
>
>> Hello Kafka users, developers and client-developers,
>>
>> This is the second release candidate for Apache Kafka 3.3.0. Many new
>> features and bug fixes are included in this major release of Kafka. A
>> significant number of the issues in this release are related to KRaft,
>> which will be considered "production ready" as part of this release
>> (KIP-833)
>>
>> KRaft improvements:
>> * KIP-778: Online KRaft to KRaft Upgrades
>> * KIP-833: Mark KRaft as Production Ready
>> * KIP-835: Monitor Quorum health (many new KRaft metrics)
>> * KIP-836: Expose voter lag via kafka-metadata-quorum.sh
>> * KIP-841: Fenced replicas should not be allowed to join the ISR in KRaft
>> * KIP-859: Add Metadata Log Processing Error Related Metrics
>>
>> Other major improvements include:
>> * KIP-618: Exactly-Once Support for Source Connectors
>> * KIP-831: Add metric for log recovery progress
>> * KIP-827: Expose logdirs total and usable space via Kafka API
>> * KIP-834: Add ability to Pause / Resume KafkaStreams Topologies
>>
>> The full release notes are available here:
>> https://home.apache.org/~davidarthur/kafka-3.3.0-rc2/RELEASE_NOTES.html
>>
>> Please download, test and vote by Monday, Sep 26 at 5pm EDT
>>
>> Also, huge thanks to José for running the release so far. He has done
>> the vast majority of the work to prepare this rather large release :)
>>
>> -
>>
>> Kafka's KEYS file containing PGP keys we use to sign the release:
>> https://kafka.apache.org/KEYS
>>
>> * Release artifacts to be voted upon (source and binary):
>> https://home.apache.org/~davidarthur/kafka-3.3.0-rc2/
>>
>> * Maven artifacts to be voted upon:
>> https://repository.apache.org/content/groups/staging/org/apache/kafka/
>>
>> * Javadoc: https://home.apache.org/~davidarthur/kafka-3.3.0-rc2/javadoc/
>>
>> * Tag to be voted upon (off 3.3 branch) is the 3.3.0 tag:
>> https://github.com/apache/kafka/releases/tag/3.3.0-rc2
>>
>> * Documentation:  https://kafka.apache.org/33/documentation.html
>>
>> * Protocol: https://kafka.apache.org/33/protocol.html
>>
>>
>>
>>
>> Successful Jenkins builds to follow in a future update to this email.
>>
>>
>> Thanks!
>> David Arthur
>>
>
>
> --
> [image: Aiven] <https://www.aiven.io>
>
> *Josep Prat*
> Open Source Engineering Director, *Aiven*
> josep.p...@aiven.io   |   +491715557497
> aiven.io <https://www.aiven.io>   |
> <https://www.facebook.com/aivencloud>
> <https://www.linkedin.com/company/aiven/>   <https://twitter.com/aiven_io>
> *Aiven Deutschland GmbH*
> Immanuelkirchstraße 26, 10405 Berlin
> Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
> Amtsgericht Charlottenburg, HRB 209739 B
>
> --
> You received this message because you are subscribed to the Google Groups
> "kafka-clients" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to kafka-clients+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/kafka-clients/CAOJ18G4DE9Q_DYyZTbDLF6J6MRj30WrCNj6njrYRV3SQeThs-w%40mail.gmail.com
> <https://groups.google.com/d/msgid/kafka-clients/CAOJ18G4DE9Q_DYyZTbDLF6J6MRj30WrCNj6njrYRV3SQeThs-w%40mail.gmail.com?utm_medium=email&utm_source=footer>
> .
>


-- 
-David


[VOTE] 3.3.0 RC2

2022-09-20 Thread David Arthur
Hello Kafka users, developers and client-developers,

This is the second release candidate for Apache Kafka 3.3.0. Many new
features and bug fixes are included in this major release of Kafka. A
significant number of the issues in this release are related to KRaft,
which will be considered "production ready" as part of this release
(KIP-833)

KRaft improvements:
* KIP-778: Online KRaft to KRaft Upgrades
* KIP-833: Mark KRaft as Production Ready
* KIP-835: Monitor Quorum health (many new KRaft metrics)
* KIP-836: Expose voter lag via kafka-metadata-quorum.sh
* KIP-841: Fenced replicas should not be allowed to join the ISR in KRaft
* KIP-859: Add Metadata Log Processing Error Related Metrics
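As a concrete illustration of the KIP-835/KIP-836 work above, the new quorum tool can report controller status and per-voter replication lag. The commands below are only echoed, not executed, since running them needs a live KRaft quorum; the `localhost:9092` bootstrap address is an assumed placeholder:

```shell
# Sketch of the KIP-836 tooling. These invocations are printed rather than
# run, because they require a running KRaft cluster; the bootstrap address
# is illustrative.
describe_status='bin/kafka-metadata-quorum.sh --bootstrap-server localhost:9092 describe --status'
describe_replication='bin/kafka-metadata-quorum.sh --bootstrap-server localhost:9092 describe --replication'

echo "$describe_status"        # leader id, epoch, high watermark
echo "$describe_replication"   # per-replica log end offset and lag
```

`describe --replication` is the piece that exposes voter lag, which previously required scraping metrics.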

Other major improvements include:
* KIP-618: Exactly-Once Support for Source Connectors
* KIP-831: Add metric for log recovery progress
* KIP-827: Expose logdirs total and usable space via Kafka API
* KIP-834: Add ability to Pause / Resume KafkaStreams Topologies

The full release notes are available here:
https://home.apache.org/~davidarthur/kafka-3.3.0-rc2/RELEASE_NOTES.html

Please download, test and vote by Monday, Sep 26 at 5pm EDT

Also, huge thanks to José for running the release so far. He has done
the vast majority of the work to prepare this rather large release :)

-

Kafka's KEYS file containing PGP keys we use to sign the release:
https://kafka.apache.org/KEYS

* Release artifacts to be voted upon (source and binary):
https://home.apache.org/~davidarthur/kafka-3.3.0-rc2/

* Maven artifacts to be voted upon:
https://repository.apache.org/content/groups/staging/org/apache/kafka/

* Javadoc: https://home.apache.org/~davidarthur/kafka-3.3.0-rc2/javadoc/

* Tag to be voted upon (off 3.3 branch) is the 3.3.0 tag:
https://github.com/apache/kafka/releases/tag/3.3.0-rc2

* Documentation:  https://kafka.apache.org/33/documentation.html

* Protocol: https://kafka.apache.org/33/protocol.html




Successful Jenkins builds to follow in a future update to this email.


Thanks!
David Arthur


[ANNOUNCE] Apache Kafka 3.2.1

2022-08-01 Thread David Arthur
The Apache Kafka community is pleased to announce the release for
Apache Kafka 3.2.1

This is a bugfix release with several fixes since the release of
3.2.0. A few of the major issues include:

* KAFKA-14062 OAuth client token refresh fails with SASL extensions
* KAFKA-14079 Memory leak in connectors using errors.tolerance=all
* KAFKA-14024 Cooperative rebalance regression causing clients to get stuck


All of the changes in this release can be found in the release notes:

https://www.apache.org/dist/kafka/3.2.1/RELEASE_NOTES.html


You can download the source and binary release (Scala 2.12 and 2.13) from:

https://kafka.apache.org/downloads#3.2.1

---


Apache Kafka is a distributed streaming platform with four core APIs:

** The Producer API allows an application to publish a stream of
records to one or more Kafka topics.

** The Consumer API allows an application to subscribe to one or more
topics and process the stream of records produced to them.

** The Streams API allows an application to act as a stream processor,
consuming an input stream from one or more topics and producing an
output stream to one or more output topics, effectively transforming
the input streams to output streams.

** The Connector API allows building and running reusable producers or
consumers that connect Kafka topics to existing applications or data
systems. For example, a connector to a relational database might
capture every change to a table.


With these APIs, Kafka can be used for two broad classes of application:

** Building real-time streaming data pipelines that reliably get data
between systems or applications.

** Building real-time streaming applications that transform or react
to the streams of data.


Apache Kafka is in use at large and small companies worldwide,
including Capital One, Goldman Sachs, ING, LinkedIn, Netflix,
Pinterest, Rabobank, Target, The New York Times, Uber, Yelp, and
Zalando, among others.

A big thank you to the following 19 contributors to this release!

Akhilesh Chaganti, Bruno Cadonna, Christopher L. Shannon, David
Arthur, Divij Vaidya, Eugene Tolbakov, Guozhang Wang, Ismael Juma,
James Hughes, Jason Gustafson, Kirk True, Lucas Bradstreet, Luke Chen,
Nicolas Guyomar, Niket Goel, Okada Haruki, Shawn Wang, Viktor
Somogyi-Vass, Walker Carlson

We welcome your help and feedback. For more information on how to
report problems, and to get involved, visit the project website at
https://kafka.apache.org/


Thank you!

Regards,
David Arthur


[RESULTS] [VOTE] Release Kafka version 3.2.1

2022-07-27 Thread David Arthur
The vote for RC3 has passed with eight +1 votes (three binding) and no -1
votes. Here are the results:

+1 votes
PMC:
* Randall Hauch
* Rajini Sivaram
* Bill Bejeck

Committers:
None

Community:
* Christopher Shannon
* Federico Valeri
* Dongjoon Hyun
* Jakub Scholz
* Matthew de Detrich

0 Votes:
None

-1 Votes:
None

Vote Thread:
https://lists.apache.org/thread/kcr2xncr762sqy79rbl83w0hzw85w775
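For readers unfamiliar with the process: an Apache release vote passes with at least three binding +1 votes and more +1 than -1 votes. The tally above can be checked against that rule mechanically — a minimal sketch, where the function name and shape are illustrative and not part of any Kafka tooling:

```python
def release_vote_passes(binding_plus: int, total_plus: int, total_minus: int) -> bool:
    """ASF release policy: at least three binding +1s, and more +1s than -1s."""
    return binding_plus >= 3 and total_plus > total_minus

# Tally from this thread: eight +1 votes (three of them binding), no -1 votes.
print(release_vote_passes(binding_plus=3, total_plus=8, total_minus=0))  # True
```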

I'll continue with the release process and send out the release
announcement over the next few days.

Thanks!
David Arthur


Re: [VOTE] 3.2.1 RC3

2022-07-27 Thread David Arthur
I'm closing out the vote now. Thanks to everyone who voted. The RC passed
with the required number of votes. I'll send out the results thread shortly.

Cheers,
David Arthur

On Wed, Jul 27, 2022 at 11:54 AM Bill Bejeck  wrote:

> Hi David,
>
> Thanks for running the release!
>
> I did the following steps:
>
>- Validated all signatures and checksums
>- Built from source
>- Ran all the unit tests
>- I spot-checked the doc.  I did notice the same version number as
>Randall - but I expect that will get fixed when the docs are updated with
>the release.
>
> +1(binding)
>
> Thanks,
> Bill
>
> On Tue, Jul 26, 2022 at 5:56 PM Matthew Benedict de Detrich
>  wrote:
>
> > Thanks for the RC,
> >
> > I ran the full (unit + integration) tests using Scala 2.12 and 2.13
> across
> > OpenJDK (Linux) 11 and 17 and all tests passed apart from a single one
> > which is documented at https://issues.apache.org/jira/browse/KAFKA-13514
> >
> > +1 (non binding)
> >
> >
> >
> > On Fri, Jul 22, 2022 at 3:15 AM David Arthur 
> > wrote:
> >
> > > Hello Kafka users, developers and client-developers,
> > >
> > > This is the first release candidate of Apache Kafka 3.2.1.
> > >
> > > This is a bugfix release with several fixes since the release of
> 3.2.0. A
> > > few of the major issues include:
> > >
> > > * KAFKA-14062 OAuth client token refresh fails with SASL extensions
> > > * KAFKA-14079 Memory leak in connectors using errors.tolerance=all
> > > * KAFKA-14024 Cooperative rebalance regression causing clients to get
> > stuck
> > >
> > >
> > > Release notes for the 3.2.1 release:
> > >
> https://home.apache.org/~davidarthur/kafka-3.2.1-rc3/RELEASE_NOTES.html
> > >
> > >
> > >
> > >  Please download, test and vote by Wednesday July 27, 2022 at 17:00
> > PT.
> > > 
> > > Kafka's KEYS file containing PGP keys we use to sign the release:
> > > https://kafka.apache.org/KEYS
> > >
> > > Release artifacts to be voted upon (source and binary):
> > > https://home.apache.org/~davidarthur/kafka-3.2.1-rc3/
> > >
> > > Maven artifacts to be voted upon:
> > > https://repository.apache.org/content/groups/staging/org/apache/kafka/
> > >
> > > Javadoc: https://home.apache.org/~davidarthur/kafka-3.2.1-rc3/javadoc/
> > >
> > > Tag to be voted upon (off 3.2 branch) is the 3.2.1 tag:
> > > https://github.com/apache/kafka/releases/tag/3.2.1-rc3
> > >
> > > Documentation: https://kafka.apache.org/32/documentation.html
> > >
> > > Protocol: https://kafka.apache.org/32/protocol.html
> > >
> > >
> > > The past few builds have had flaky test failures. I will update this
> > thread
> > > with passing build links soon.
> > >
> > > Unit/Integration test job:
> > > https://ci-builds.apache.org/job/Kafka/job/kafka/job/3.2/
> > > System test job:
> > > https://jenkins.confluent.io/job/system-test-kafka/job/3.2/
> > >
> > >
> > > Thanks!
> > > David Arthur
> > >
> >
> >
> > --
> >
> > Matthew de Detrich
> >
> > *Aiven Deutschland GmbH*
> >
> > Immanuelkirchstraße 26, 10405 Berlin
> >
> > Amtsgericht Charlottenburg, HRB 209739 B
> >
> > Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
> >
> > *m:* +491603708037
> >
> > *w:* aiven.io *e:* matthew.dedetr...@aiven.io
> >
>


[VOTE] 3.2.1 RC3

2022-07-21 Thread David Arthur
Hello Kafka users, developers and client-developers,

This is the first release candidate of Apache Kafka 3.2.1.

This is a bugfix release with several fixes since the release of 3.2.0. A
few of the major issues include:

* KAFKA-14062 OAuth client token refresh fails with SASL extensions
* KAFKA-14079 Memory leak in connectors using errors.tolerance=all
* KAFKA-14024 Cooperative rebalance regression causing clients to get stuck


Release notes for the 3.2.1 release:
https://home.apache.org/~davidarthur/kafka-3.2.1-rc3/RELEASE_NOTES.html



 Please download, test and vote by Wednesday July 27, 2022 at 17:00 PT.

Kafka's KEYS file containing PGP keys we use to sign the release:
https://kafka.apache.org/KEYS

Release artifacts to be voted upon (source and binary):
https://home.apache.org/~davidarthur/kafka-3.2.1-rc3/

Maven artifacts to be voted upon:
https://repository.apache.org/content/groups/staging/org/apache/kafka/

Javadoc: https://home.apache.org/~davidarthur/kafka-3.2.1-rc3/javadoc/

Tag to be voted upon (off 3.2 branch) is the 3.2.1 tag:
https://github.com/apache/kafka/releases/tag/3.2.1-rc3

Documentation: https://kafka.apache.org/32/documentation.html

Protocol: https://kafka.apache.org/32/protocol.html


The past few builds have had flaky test failures. I will update this thread
with passing build links soon.

Unit/Integration test job:
https://ci-builds.apache.org/job/Kafka/job/kafka/job/3.2/
System test job: https://jenkins.confluent.io/job/system-test-kafka/job/3.2/


Thanks!
David Arthur


Re: [ANNOUNCE] Apache Kafka 2.5.0

2020-04-16 Thread David Arthur
I've just published a blog post highlighting many of the improvements that
landed with 2.5.0.

https://blogs.apache.org/kafka/entry/what-s-new-in-apache2

-David

On Wed, Apr 15, 2020 at 4:15 PM David Arthur  wrote:

> The Apache Kafka community is pleased to announce the release for Apache
> Kafka 2.5.0
>
> This release includes many new features, including:
>
> * TLS 1.3 support (1.2 is now the default)
> * Co-groups for Kafka Streams
> * Incremental rebalance for Kafka Consumer
> * New metrics for better operational insight
> * Upgrade Zookeeper to 3.5.7
> * Deprecate support for Scala 2.11
>
> All of the changes in this release can be found in the release notes:
> https://www.apache.org/dist/kafka/2.5.0/RELEASE_NOTES.html
>
>
> You can download the source and binary release (Scala 2.12 and 2.13) from:
> https://kafka.apache.org/downloads#2.5.0
>
>
> ---
>
>
> Apache Kafka is a distributed streaming platform with four core APIs:
>
>
> ** The Producer API allows an application to publish a stream of records to
> one or more Kafka topics.
>
> ** The Consumer API allows an application to subscribe to one or more
> topics and process the stream of records produced to them.
>
> ** The Streams API allows an application to act as a stream processor,
> consuming an input stream from one or more topics and producing an
> output stream to one or more output topics, effectively transforming the
> input streams to output streams.
>
> ** The Connector API allows building and running reusable producers or
> consumers that connect Kafka topics to existing applications or data
> systems. For example, a connector to a relational database might
> capture every change to a table.
>
>
> With these APIs, Kafka can be used for two broad classes of application:
>
> ** Building real-time streaming data pipelines that reliably get data
> between systems or applications.
>
> ** Building real-time streaming applications that transform or react
> to the streams of data.
>
>
> Apache Kafka is in use at large and small companies worldwide, including
> Capital One, Goldman Sachs, ING, LinkedIn, Netflix, Pinterest, Rabobank,
> Target, The New York Times, Uber, Yelp, and Zalando, among others.
>
> A big thank you to the following 108 contributors to this release!
>
> A. Sophie Blee-Goldman, Adam Bellemare, Alaa Zbair, Alex Kokachev, Alex
> Leung, Alex Mironov, Alice, Andrew Olson, Andy Coates, Anna Povzner, Antony
> Stubbs, Arvind Thirunarayanan, belugabehr, bill, Bill Bejeck, Bob Barrett,
> Boyang Chen, Brian Bushree, Brian Byrne, Bruno Cadonna, Bryan Ji, Chia-Ping
> Tsai, Chris Egerton, Chris Pettitt, Chris Stromberger, Colin P. Mccabe,
> Colin Patrick McCabe, commandini, Cyrus Vafadari, Dae-Ho Kim, David Arthur,
> David Jacot, David Kim, David Mao, dengziming, Dhruvil Shah, Edoardo Comar,
> Eduardo Pinto, Fábio Silva, gkomissarov, Grant Henke, Greg Harris, Gunnar
> Morling, Guozhang Wang, Harsha Laxman, high.lee, highluck, Hossein Torabi,
> huxi, huxihx, Ismael Juma, Ivan Yurchenko, Jason Gustafson, jiameixie, John
> Roesler, José Armando García Sancio, Jukka Karvanen, Karan Kumar, Kevin Lu,
> Konstantine Karantasis, Lee Dongjin, Lev Zemlyanov, Levani Kokhreidze,
> Lucas Bradstreet, Manikumar Reddy, Mathias Kub, Matthew Wong, Matthias J.
> Sax, Michael Gyarmathy, Michael Viamari, Mickael Maison, Mitch,
> mmanna-sapfgl, NanerLee, Narek Karapetian, Navinder Pal Singh Brar,
> nicolasguyomar, Nigel Liang, NIkhil Bhatia, Nikolay, ning2008wisc, Omkar
> Mestry, Rajini Sivaram, Randall Hauch, ravowlga123, Raymond Ng, Ron
> Dagostino, sainath batthala, Sanjana Kaundinya, Scott, Seungha Lee, Simon
> Clark, Stanislav Kozlovski, Svend Vanderveken, Sönke Liebau, Ted Yu, Tom
> Bentley, Tomislav, Tu Tran, Tu V. Tran, uttpal, Vikas Singh, Viktor
> Somogyi, vinoth chandar, wcarlson5, Will James, Xin Wang, zzccctv
>
> We welcome your help and feedback. For more information on how to
> report problems, and to get involved, visit the project website at
> https://kafka.apache.org/
>
> Thank you!
>
>
> Regards,
> David Arthur
>


Re: [RESULTS] [VOTE] 2.5.0 RC3

2020-04-15 Thread David Arthur
Gary, indeed the release is official now. There are many moving parts to
the release, which happen sequentially. Artifacts are generally available
from a few hours to a day before the announcement goes out.

Thanks,
David

On Wed, Apr 15, 2020 at 1:34 PM Gary Russell  wrote:

> I see 2.5.0 is in maven central (since yesterday).
>
> Can I assume it is officially released?
>
> Thanks.
>
> On Tue, Apr 14, 2020 at 11:15 AM David Arthur  wrote:
>
> > Thanks everyone! The vote passes with 7 +1 votes (4 of which are binding)
> > and no 0 or -1 votes.
> >
> > 4 binding +1 votes from PMC members Manikumar, Jun, Colin, and Matthias
> > 1 committer +1 vote from Bill
> > 2 community +1 votes from Israel Ekpo and Jonathan Santilli
> >
> > Voting email thread
> >
> >
> http://mail-archives.apache.org/mod_mbox/kafka-dev/202004.mbox/%3CCA%2B0Ze6rUxaPRvddHb50RfVxRtHHvnJD8j9Q9ni18Okc9s-_DSQ%40mail.gmail.com%3E
> >
> > I'll continue with the release steps and send out the announcement email
> > soon.
> >
> > -David
> >
> > On Tue, Apr 14, 2020 at 7:17 AM Jonathan Santilli <
> > jonathansanti...@gmail.com> wrote:
> >
> > > Hello,
> > >
> > > I have run the tests (passed)
> > > Followed the quick start guide with Scala 2.12 (success)
> > > +1
> > >
> > >
> > > Thanks!
> > > --
> > > Jonathan
> > >
> > > On Tue, Apr 14, 2020 at 1:16 AM Colin McCabe 
> wrote:
> > >
> > >> +1 (binding)
> > >>
> > >> verified checksums
> > >> ran unitTest
> > >> ran check
> > >>
> > >> best,
> > >> Colin
> > >>
> > >> On Tue, Apr 7, 2020, at 21:03, David Arthur wrote:
> > >> > Hello Kafka users, developers and client-developers,
> > >> >
> > >> > This is the fourth candidate for release of Apache Kafka 2.5.0.
> > >> >
> > >> > * TLS 1.3 support (1.2 is now the default)
> > >> > * Co-groups for Kafka Streams
> > >> > * Incremental rebalance for Kafka Consumer
> > >> > * New metrics for better operational insight
> > >> > * Upgrade Zookeeper to 3.5.7
> > >> > * Deprecate support for Scala 2.11
> > >> >
> > >> > Release notes for the 2.5.0 release:
> > >> >
> > https://home.apache.org/~davidarthur/kafka-2.5.0-rc3/RELEASE_NOTES.html
> > >> >
> > >> > *** Please download, test and vote by Friday April 10th 5pm PT
> > >> >
> > >> > Kafka's KEYS file containing PGP keys we use to sign the release:
> > >> > https://kafka.apache.org/KEYS
> > >> >
> > >> > * Release artifacts to be voted upon (source and binary):
> > >> > https://home.apache.org/~davidarthur/kafka-2.5.0-rc3/
> > >> >
> > >> > * Maven artifacts to be voted upon:
> > >> >
> > https://repository.apache.org/content/groups/staging/org/apache/kafka/
> > >> >
> > >> > * Javadoc:
> > >> > https://home.apache.org/~davidarthur/kafka-2.5.0-rc3/javadoc/
> > >> >
> > >> > * Tag to be voted upon (off 2.5 branch) is the 2.5.0 tag:
> > >> > https://github.com/apache/kafka/releases/tag/2.5.0-rc3
> > >> >
> > >> > * Documentation:
> > >> > https://kafka.apache.org/25/documentation.html
> > >> >
> > >> > * Protocol:
> > >> > https://kafka.apache.org/25/protocol.html
> > >> >
> > >> > Successful Jenkins builds to follow
> > >> >
> > >> > Thanks!
> > >> > David
> > >> >
> > >>
> > >> > --
> > >> >  You received this message because you are subscribed to the Google
> > >> Groups "kafka-clients" group.
> > >> >  To unsubscribe from this group and stop receiving emails from it,
> > send
> > >> an email to kafka-clients+unsubscr...@googlegroups.com.
> > >> >  To view this discussion on the web visit
> > >>
> >
> https://groups.google.com/d/msgid/kafka-clients/CA%2B0Ze6rUxaPRvddHb50RfVxRtHHvnJD8j9Q9ni18Okc9s-_DSQ%40mail.gmail.com
> > >> <
> > >>
> >
> https://groups.google.com/d/msgid/kafka-clients/CA%2B0Ze6rUxaPRvddHb50RfVxRtHHvnJD8j9Q9ni18Okc9s-_DSQ%40mail.gmail.com?utm_medium=email&utm_source=footer
> > >> >.
> > >>
> > >
> > >
> > > --
> > > Santilli Jonathan
> > >
> >
> >
> > --
> > David Arthur
> >
>


-- 
David Arthur


[ANNOUNCE] Apache Kafka 2.5.0

2020-04-15 Thread David Arthur
The Apache Kafka community is pleased to announce the release for Apache
Kafka 2.5.0

This release includes many new features, including:

* TLS 1.3 support (1.2 is now the default)
* Co-groups for Kafka Streams
* Incremental rebalance for Kafka Consumer
* New metrics for better operational insight
* Upgrade Zookeeper to 3.5.7
* Deprecate support for Scala 2.11

All of the changes in this release can be found in the release notes:
https://www.apache.org/dist/kafka/2.5.0/RELEASE_NOTES.html


You can download the source and binary release (Scala 2.12 and 2.13) from:
https://kafka.apache.org/downloads#2.5.0

---


Apache Kafka is a distributed streaming platform with four core APIs:


** The Producer API allows an application to publish a stream of records to
one or more Kafka topics.

** The Consumer API allows an application to subscribe to one or more
topics and process the stream of records produced to them.

** The Streams API allows an application to act as a stream processor,
consuming an input stream from one or more topics and producing an
output stream to one or more output topics, effectively transforming the
input streams to output streams.

** The Connector API allows building and running reusable producers or
consumers that connect Kafka topics to existing applications or data
systems. For example, a connector to a relational database might
capture every change to a table.


With these APIs, Kafka can be used for two broad classes of application:

** Building real-time streaming data pipelines that reliably get data
between systems or applications.

** Building real-time streaming applications that transform or react
to the streams of data.


Apache Kafka is in use at large and small companies worldwide, including
Capital One, Goldman Sachs, ING, LinkedIn, Netflix, Pinterest, Rabobank,
Target, The New York Times, Uber, Yelp, and Zalando, among others.

A big thank you to the following 108 contributors to this release!

A. Sophie Blee-Goldman, Adam Bellemare, Alaa Zbair, Alex Kokachev, Alex
Leung, Alex Mironov, Alice, Andrew Olson, Andy Coates, Anna Povzner, Antony
Stubbs, Arvind Thirunarayanan, belugabehr, bill, Bill Bejeck, Bob Barrett,
Boyang Chen, Brian Bushree, Brian Byrne, Bruno Cadonna, Bryan Ji, Chia-Ping
Tsai, Chris Egerton, Chris Pettitt, Chris Stromberger, Colin P. Mccabe,
Colin Patrick McCabe, commandini, Cyrus Vafadari, Dae-Ho Kim, David Arthur,
David Jacot, David Kim, David Mao, dengziming, Dhruvil Shah, Edoardo Comar,
Eduardo Pinto, Fábio Silva, gkomissarov, Grant Henke, Greg Harris, Gunnar
Morling, Guozhang Wang, Harsha Laxman, high.lee, highluck, Hossein Torabi,
huxi, huxihx, Ismael Juma, Ivan Yurchenko, Jason Gustafson, jiameixie, John
Roesler, José Armando García Sancio, Jukka Karvanen, Karan Kumar, Kevin Lu,
Konstantine Karantasis, Lee Dongjin, Lev Zemlyanov, Levani Kokhreidze,
Lucas Bradstreet, Manikumar Reddy, Mathias Kub, Matthew Wong, Matthias J.
Sax, Michael Gyarmathy, Michael Viamari, Mickael Maison, Mitch,
mmanna-sapfgl, NanerLee, Narek Karapetian, Navinder Pal Singh Brar,
nicolasguyomar, Nigel Liang, NIkhil Bhatia, Nikolay, ning2008wisc, Omkar
Mestry, Rajini Sivaram, Randall Hauch, ravowlga123, Raymond Ng, Ron
Dagostino, sainath batthala, Sanjana Kaundinya, Scott, Seungha Lee, Simon
Clark, Stanislav Kozlovski, Svend Vanderveken, Sönke Liebau, Ted Yu, Tom
Bentley, Tomislav, Tu Tran, Tu V. Tran, uttpal, Vikas Singh, Viktor
Somogyi, vinoth chandar, wcarlson5, Will James, Xin Wang, zzccctv

We welcome your help and feedback. For more information on how to
report problems, and to get involved, visit the project website at
https://kafka.apache.org/

Thank you!


Regards,
David Arthur


[RESULTS] [VOTE] 2.5.0 RC3

2020-04-14 Thread David Arthur
Thanks everyone! The vote passes with 7 +1 votes (4 of which are binding)
and no 0 or -1 votes.

4 binding +1 votes from PMC members Manikumar, Jun, Colin, and Matthias
1 committer +1 vote from Bill
2 community +1 votes from Israel Ekpo and Jonathan Santilli

Voting email thread
http://mail-archives.apache.org/mod_mbox/kafka-dev/202004.mbox/%3CCA%2B0Ze6rUxaPRvddHb50RfVxRtHHvnJD8j9Q9ni18Okc9s-_DSQ%40mail.gmail.com%3E

I'll continue with the release steps and send out the announcement email
soon.

-David

On Tue, Apr 14, 2020 at 7:17 AM Jonathan Santilli <
jonathansanti...@gmail.com> wrote:

> Hello,
>
> I have run the tests (passed)
> Followed the quick start guide with Scala 2.12 (success)
> +1
>
>
> Thanks!
> --
> Jonathan
>
> On Tue, Apr 14, 2020 at 1:16 AM Colin McCabe  wrote:
>
>> +1 (binding)
>>
>> verified checksums
>> ran unitTest
>> ran check
>>
>> best,
>> Colin
>>
>> On Tue, Apr 7, 2020, at 21:03, David Arthur wrote:
>> > Hello Kafka users, developers and client-developers,
>> >
>> > This is the fourth candidate for release of Apache Kafka 2.5.0.
>> >
>> > * TLS 1.3 support (1.2 is now the default)
>> > * Co-groups for Kafka Streams
>> > * Incremental rebalance for Kafka Consumer
>> > * New metrics for better operational insight
>> > * Upgrade Zookeeper to 3.5.7
>> > * Deprecate support for Scala 2.11
>> >
>> > Release notes for the 2.5.0 release:
>> > https://home.apache.org/~davidarthur/kafka-2.5.0-rc3/RELEASE_NOTES.html
>> >
>> > *** Please download, test and vote by Friday April 10th 5pm PT
>> >
>> > Kafka's KEYS file containing PGP keys we use to sign the release:
>> > https://kafka.apache.org/KEYS
>> >
>> > * Release artifacts to be voted upon (source and binary):
>> > https://home.apache.org/~davidarthur/kafka-2.5.0-rc3/
>> >
>> > * Maven artifacts to be voted upon:
>> > https://repository.apache.org/content/groups/staging/org/apache/kafka/
>> >
>> > * Javadoc:
>> > https://home.apache.org/~davidarthur/kafka-2.5.0-rc3/javadoc/
>> >
>> > * Tag to be voted upon (off 2.5 branch) is the 2.5.0 tag:
>> > https://github.com/apache/kafka/releases/tag/2.5.0-rc3
>> >
>> > * Documentation:
>> > https://kafka.apache.org/25/documentation.html
>> >
>> > * Protocol:
>> > https://kafka.apache.org/25/protocol.html
>> >
>> > Successful Jenkins builds to follow
>> >
>> > Thanks!
>> > David
>> >
>>
>>
>
>
> --
> Santilli Jonathan
>


-- 
David Arthur


Re: [VOTE] 2.5.0 RC3

2020-04-08 Thread David Arthur
Passing Jenkins build on 2.5 branch:
https://builds.apache.org/job/kafka-2.5-jdk8/90/

On Wed, Apr 8, 2020 at 12:03 AM David Arthur  wrote:

> Hello Kafka users, developers and client-developers,
>
> This is the fourth candidate for release of Apache Kafka 2.5.0.
>
> * TLS 1.3 support (1.2 is now the default)
> * Co-groups for Kafka Streams
> * Incremental rebalance for Kafka Consumer
> * New metrics for better operational insight
> * Upgrade Zookeeper to 3.5.7
> * Deprecate support for Scala 2.11
>
> Release notes for the 2.5.0 release:
> https://home.apache.org/~davidarthur/kafka-2.5.0-rc3/RELEASE_NOTES.html
>
> *** Please download, test and vote by Friday April 10th 5pm PT
>
> Kafka's KEYS file containing PGP keys we use to sign the release:
> https://kafka.apache.org/KEYS
>
> * Release artifacts to be voted upon (source and binary):
> https://home.apache.org/~davidarthur/kafka-2.5.0-rc3/
>
> * Maven artifacts to be voted upon:
> https://repository.apache.org/content/groups/staging/org/apache/kafka/
>
> * Javadoc:
> https://home.apache.org/~davidarthur/kafka-2.5.0-rc3/javadoc/
>
> * Tag to be voted upon (off 2.5 branch) is the 2.5.0 tag:
> https://github.com/apache/kafka/releases/tag/2.5.0-rc3
>
> * Documentation:
> https://kafka.apache.org/25/documentation.html
>
> * Protocol:
> https://kafka.apache.org/25/protocol.html
>
> Successful Jenkins builds to follow
>
> Thanks!
> David
>


-- 
David Arthur


[VOTE] 2.5.0 RC3

2020-04-07 Thread David Arthur
Hello Kafka users, developers and client-developers,

This is the fourth candidate for release of Apache Kafka 2.5.0.

* TLS 1.3 support (1.2 is now the default)
* Co-groups for Kafka Streams
* Incremental rebalance for Kafka Consumer
* New metrics for better operational insight
* Upgrade Zookeeper to 3.5.7
* Deprecate support for Scala 2.11

Release notes for the 2.5.0 release:
https://home.apache.org/~davidarthur/kafka-2.5.0-rc3/RELEASE_NOTES.html

*** Please download, test and vote by Friday April 10th 5pm PT

Kafka's KEYS file containing PGP keys we use to sign the release:
https://kafka.apache.org/KEYS

* Release artifacts to be voted upon (source and binary):
https://home.apache.org/~davidarthur/kafka-2.5.0-rc3/

* Maven artifacts to be voted upon:
https://repository.apache.org/content/groups/staging/org/apache/kafka/

* Javadoc:
https://home.apache.org/~davidarthur/kafka-2.5.0-rc3/javadoc/

* Tag to be voted upon (off 2.5 branch) is the 2.5.0 tag:
https://github.com/apache/kafka/releases/tag/2.5.0-rc3

* Documentation:
https://kafka.apache.org/25/documentation.html

* Protocol:
https://kafka.apache.org/25/protocol.html

Successful Jenkins builds to follow

Thanks!
David


[VOTE] 2.5.0 RC2

2020-03-17 Thread David Arthur
Hello Kafka users, developers and client-developers,

This is the third candidate for release of Apache Kafka 2.5.0.

* TLS 1.3 support (1.2 is now the default)
* Co-groups for Kafka Streams
* Incremental rebalance for Kafka Consumer
* New metrics for better operational insight
* Upgrade Zookeeper to 3.5.7
* Deprecate support for Scala 2.11


 Release notes for the 2.5.0 release:
https://home.apache.org/~davidarthur/kafka-2.5.0-rc2/RELEASE_NOTES.html

*** Please download, test and vote by Tuesday March 24, 2020 by 5pm PT.

Kafka's KEYS file containing PGP keys we use to sign the release:
https://kafka.apache.org/KEYS

* Release artifacts to be voted upon (source and binary):
https://home.apache.org/~davidarthur/kafka-2.5.0-rc2/

* Maven artifacts to be voted upon:
https://repository.apache.org/content/groups/staging/org/apache/kafka/

* Javadoc:
https://home.apache.org/~davidarthur/kafka-2.5.0-rc2/javadoc/

* Tag to be voted upon (off 2.5 branch) is the 2.5.0 tag:
https://github.com/apache/kafka/releases/tag/2.5.0-rc2

* Documentation:
https://kafka.apache.org/25/documentation.html

* Protocol:
https://kafka.apache.org/25/protocol.html


I'm thrilled to be able to include links to both build jobs with successful
builds! Thanks to everyone who has helped reduce our flaky test exposure
these past few weeks :)

* Successful Jenkins builds for the 2.5 branch:
Unit/integration tests: https://builds.apache.org/job/kafka-2.5-jdk8/64/
System tests: https://jenkins.confluent.io/job/system-test-kafka/job/2.5/42/

-- 
David Arthur


Re: [kafka-clients] Re: [VOTE] 2.5.0 RC1

2020-03-17 Thread David Arthur
Thanks, Israel. I agree with Gwen, this is a great list and would be useful
to add to our release candidate boilerplate. Since we found a blocker bug
on RC1, I'll go ahead and close voting. RC2 will be announced shortly.

-David

On Tue, Mar 17, 2020 at 10:46 AM Israel Ekpo  wrote:

> Thanks for the feedback, Gwen. I will create JIRA tasks to track the items
> shortly.
>
> The JIRA tasks will document the goals, expectations and relevant Kafka
> versions for each resource.
>
> I will volunteer for some of them and update the JIRA tasks accordingly.
>
>
> On Tue, Mar 17, 2020 at 12:51 AM Gwen Shapira  wrote:
>
> > Oh wow, I love this checklist. I don't think we'll have time to create
> one
> > for this release, but will be great to track this via JIRA and see if we
> > can get all those contributed before 2.6...
> >
> > Gwen Shapira
> > Engineering Manager | Confluent
> > 650.450.2760 | @gwenshap
> > Follow us: Twitter | blog
> >
> > > On Mon, Mar 16, 2020 at 3:02 PM, Israel Ekpo <israele...@gmail.com> wrote:
> >
> > >
> > >
> > >
> > > - Download artifacts successfully
> > > - Verified signatures successfully
> > > - All tests have passed so far for Scala 2.12. Have not run it on 2.13
> > yet
> > >
> > >
> > >
> > >
> > > +1 (non-binding) for the release
> > >
> > >
> > >
> > > I do have some feedback so I think we should include in the RC
> > > announcement a link for how the community should test and include
> > > information like:
> > >
> > >
> > >
> > > - How to set up test environment for unit and functional tests
> > > - Java version(s) needed for the tests
> > > - Scala version(s) needed for the tests
> > > - Gradle version needed
> > > - Sample script for running sanity checks and unit tests
> > > - Sample Helm Charts for running all the basic components on a
> Kubernetes
> > > - Sample Ansible Script for running all the basic components on Virtual
> > > Machines
> > >
> > >
> > >
> > > It takes a bit of time for newcomers to investigate why the tests are
> not
> > > running successfully in the beginning and providing guidance for these
> > > categories of contributors will be great. If I did not know where to
> look
> > > (kafka-2.5.0-src/gradle/dependencies.gradle) it would take longer to
> > > figure out why the tests are not working/running
> > >
> > >
> > >
> > > Thanks.
> > >
> > >
> > >
> > > On Thu, Mar 12, 2020 at 11:21 AM Bill Bejeck <bbej...@gmail.com> wrote:
> > >
> > >
> > >>
> > >>
> > >> Hi David,
> > >>
> > >>
> > >>
> > >> 1. Scanned the Javadoc, looks good
> > >> 2. Downloaded kafka_2.12-2.5.0 and ran the quickstart and streams
> > >> quickstart
> > >> 3. Verified the signatures
> > >>
> > >>
> > >>
> > >> +1 (non-binding)
> > >>
> > >>
> > >>
> > >> Thanks for running the release David!
> > >>
> > >>
> > >>
> > >> -Bill
> > >>
> > >>
> > >>
> > >> On Tue, Mar 10, 2020 at 4:01 PM David Arthur <david.art...@confluent.io> wrote:
> > >>
> > >>
> > >>>
> > >>>
> > >>> Thanks for the test failure reports, Tom. Tracking (and fixing) these
> > is
> > >>> important and will make future release managers have an easier time
> :)
> > >>>
> > >>>
> > >>>
> > >>> -David
> > >>>
> > >>>
> > >>>
> > >>> On Tue, Mar 10, 2020 at 10:16 AM Tom Bentley <tbent...@redhat.com> wrote:
> > >>>
> > >>>
> > >>>>
> > >>>>
> > >>>> Hi David,
> > >>>>
> > >>>>
> > >>>>
> > >>>> I verified signatures, built the tagged branch and ran unit and
> > >>>> integration
> > >>>> tests. I found some flaky tests, as follows:
> > >>>>

Re: [VOTE] 2.5.0 RC1

2020-03-10 Thread David Arthur
Thanks for the test failure reports, Tom. Tracking (and fixing) these is
important and will make future release managers have an easier time :)

-David

On Tue, Mar 10, 2020 at 10:16 AM Tom Bentley  wrote:

> Hi David,
>
> I verified signatures, built the tagged branch and ran unit and integration
> tests. I found some flaky tests, as follows:
>
> https://issues.apache.org/jira/browse/KAFKA-9691 (new)
> https://issues.apache.org/jira/browse/KAFKA-9692 (new)
> https://issues.apache.org/jira/browse/KAFKA-9283 (already reported)
>
> Many thanks,
>
> Tom
>
> On Tue, Mar 10, 2020 at 3:28 AM David Arthur  wrote:
>
> > Hello Kafka users, developers and client-developers,
> >
> > This is the second candidate for release of Apache Kafka 2.5.0. The first
> > release candidate included an erroneous NOTICE file, so another RC was
> > needed to fix that.
> >
> > This is a major release of Kafka which includes many new features,
> > improvements, and bug fixes including:
> >
> > * TLS 1.3 support (1.2 is now the default)
> > * Co-groups for Kafka Streams
> > * Incremental rebalance for Kafka Consumer
> > * New metrics for better operational insight
> > * Upgrade ZooKeeper to 3.5.7
> > * Deprecate support for Scala 2.11
> >
> > Release notes for the 2.5.0 release:
> > https://home.apache.org/~davidarthur/kafka-2.5.0-rc1/RELEASE_NOTES.html
> >
> > *** Please download, test and vote by Monday, March 16th 2020 5pm PT
> >
> > Kafka's KEYS file containing PGP keys we use to sign the release:
> > https://kafka.apache.org/KEYS
> >
> > * Release artifacts to be voted upon (source and binary):
> > https://home.apache.org/~davidarthur/kafka-2.5.0-rc1/
> >
> > * Maven artifacts to be voted upon:
> > https://repository.apache.org/content/groups/staging/org/apache/kafka/
> >
> > * Javadoc:
> > https://home.apache.org/~davidarthur/kafka-2.5.0-rc1/javadoc/
> >
> > * Tag to be voted upon (off 2.5 branch) is the 2.5.0 tag:
> > https://github.com/apache/kafka/releases/tag/2.5.0-rc1
> >
> > * Documentation:
> > https://kafka.apache.org/25/documentation.html
> >
> > * Protocol:
> > https://kafka.apache.org/25/protocol.html
> >
> > * Links to successful Jenkins builds for the 2.5 branch to follow
> >
> > Thanks,
> > David Arthur
> >
>


-- 
-David


[VOTE] 2.5.0 RC1

2020-03-09 Thread David Arthur
Hello Kafka users, developers and client-developers,

This is the second candidate for release of Apache Kafka 2.5.0. The first
release candidate included an erroneous NOTICE file, so another RC was
needed to fix that.

This is a major release of Kafka which includes many new features,
improvements, and bug fixes including:

* TLS 1.3 support (1.2 is now the default)
* Co-groups for Kafka Streams
* Incremental rebalance for Kafka Consumer
* New metrics for better operational insight
* Upgrade ZooKeeper to 3.5.7
* Deprecate support for Scala 2.11
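
As a rough illustration of the TLS 1.3 support noted above, a broker could opt in through its SSL listener settings. This is only a sketch of a `server.properties` fragment; the host name, keystore path, and password are placeholders, not values taken from the release:

```properties
# server.properties fragment (sketch; host and keystore details are placeholders)
listeners=SSL://broker1:9093
ssl.enabled.protocols=TLSv1.2,TLSv1.3
ssl.protocol=TLSv1.3
ssl.keystore.location=/path/to/kafka.keystore.jks
ssl.keystore.password=changeit
```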

Release notes for the 2.5.0 release:
https://home.apache.org/~davidarthur/kafka-2.5.0-rc1/RELEASE_NOTES.html

*** Please download, test and vote by Monday, March 16th 2020 5pm PT

Kafka's KEYS file containing PGP keys we use to sign the release:
https://kafka.apache.org/KEYS

* Release artifacts to be voted upon (source and binary):
https://home.apache.org/~davidarthur/kafka-2.5.0-rc1/

* Maven artifacts to be voted upon:
https://repository.apache.org/content/groups/staging/org/apache/kafka/

* Javadoc:
https://home.apache.org/~davidarthur/kafka-2.5.0-rc1/javadoc/

* Tag to be voted upon (off 2.5 branch) is the 2.5.0 tag:
https://github.com/apache/kafka/releases/tag/2.5.0-rc1

* Documentation:
https://kafka.apache.org/25/documentation.html

* Protocol:
https://kafka.apache.org/25/protocol.html

* Links to successful Jenkins builds for the 2.5 branch to follow

Thanks,
David Arthur


Re: Subject: [VOTE] 2.4.1 RC0

2020-03-06 Thread David Arthur
+1 (binding)

Download kafka_2.13-2.4.1 and verified signature, ran quickstart,
everything looks good.

Thanks for running this release, Bill!

-David
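
The verification steps voters describe in these threads (checksum check, then signature check) can be sketched as a small shell routine. The artifact below is a stand-in created locally so the checksum step is self-contained; real voters would substitute the downloaded RC tarball and its companion `.sha512` and `.asc` files:

```shell
set -e
# Fabricate a stand-in "artifact" so the checksum step is runnable here;
# in a real vote these two files are downloaded from the RC staging area.
printf 'release bytes' > kafka-demo.tgz
sha512sum kafka-demo.tgz > kafka-demo.tgz.sha512

# 1. Checksum step: the .sha512 file ships alongside the artifact.
sha512sum --check kafka-demo.tgz.sha512

# 2. Signature step (requires the real KEYS file and detached .asc signature):
#    gpg --import KEYS
#    gpg --verify kafka-demo.tgz.asc kafka-demo.tgz
```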



On Wed, Mar 4, 2020 at 6:06 AM Eno Thereska  wrote:

> Hi Bill,
>
> I built from source and ran unit and integration tests. They passed.
> There was a large number of skipped tests, but I'm assuming that is
> intentional.
>
> Cheers
> Eno
>
> On Tue, Mar 3, 2020 at 8:42 PM Eric Lalonde  wrote:
> >
> > Hi,
> >
> > I ran:
> > $  https://github.com/elalonde/kafka/blob/master/bin/verify-kafka-rc.sh
> <https://github.com/elalonde/kafka/blob/master/bin/verify-kafka-rc.sh>
> 2.4.1 https://home.apache.org/~bbejeck/kafka-2.4.1-rc0 <
> https://home.apache.org/~bbejeck/kafka-2.4.1-rc0>
> >
> > All checksums and signatures are good and all unit and integration tests
> that were executed passed successfully.
> >
> > - Eric
> >
> > > On Mar 2, 2020, at 6:39 PM, Bill Bejeck  wrote:
> > >
> > > Hello Kafka users, developers and client-developers,
> > >
> > > This is the first candidate for release of Apache Kafka 2.4.1.
> > >
> > > This is a bug fix release and it includes fixes and improvements from
> 38
> > > JIRAs, including a few critical bugs.
> > >
> > > Release notes for the 2.4.1 release:
> > > https://home.apache.org/~bbejeck/kafka-2.4.1-rc0/RELEASE_NOTES.html
> > >
> > > *Please download, test and vote by Thursday, March 5, 9 am PT*
> > >
> > > Kafka's KEYS file containing PGP keys we use to sign the release:
> > > https://kafka.apache.org/KEYS
> > >
> > > * Release artifacts to be voted upon (source and binary):
> > > https://home.apache.org/~bbejeck/kafka-2.4.1-rc0/
> > >
> > > * Maven artifacts to be voted upon:
> > > https://repository.apache.org/content/groups/staging/org/apache/kafka/
> > >
> > > * Javadoc:
> > > https://home.apache.org/~bbejeck/kafka-2.4.1-rc0/javadoc/
> > >
> > > * Tag to be voted upon (off 2.4 branch) is the 2.4.1 tag:
> > > https://github.com/apache/kafka/releases/tag/2.4.1-rc0
> > >
> > > * Documentation:
> > > https://kafka.apache.org/24/documentation.html
> > >
> > > * Protocol:
> > > https://kafka.apache.org/24/protocol.html
> > >
> > > * Successful Jenkins builds for the 2.4 branch:
> > > Unit/integration tests: Links to successful unit/integration test
> build to
> > > follow
> > > System tests:
> > > https://jenkins.confluent.io/job/system-test-kafka/job/2.4/152/
> > >
> > >
> > > Thanks,
> > > Bill Bejeck
> >
>


-- 
David Arthur


[VOTE] 2.5.0 RC0

2020-02-28 Thread David Arthur
Hello Kafka users, developers and client-developers,

This is the first candidate for release of Apache Kafka 2.5.0.

This is a major release of Kafka which includes many new features,
improvements, and bug fixes including:

* TLS 1.3 support (1.2 is now the default)
* Co-groups for Kafka Streams
* Incremental rebalance for Kafka Consumer
* New metrics for better operational insight
* Upgrade ZooKeeper to 3.5.7
* Deprecate support for Scala 2.11

The full release notes for 2.5.0 can be found here:
https://home.apache.org/~davidarthur/kafka-2.5.0-rc0/RELEASE_NOTES.html

*** Please download, test and vote by Thursday, March 5th, 5pm PT

Kafka's KEYS file containing PGP keys we use to sign the release:
https://kafka.apache.org/KEYS

* Release artifacts to be voted upon (source and binary):
https://home.apache.org/~davidarthur/kafka-2.5.0-rc0/

* Maven artifacts to be voted upon:
https://repository.apache.org/content/groups/staging/org/apache/kafka/

* Javadoc:
https://home.apache.org/~davidarthur/kafka-2.5.0-rc0/javadoc/

* Tag to be voted upon (off 2.5 branch) is the 2.5.0 tag:
https://github.com/apache/kafka/releases/tag/2.5.0-rc0

* Documentation:
https://kafka.apache.org/25/documentation.html

* Protocol:
https://kafka.apache.org/25/protocol.html

* Links to successful Jenkins builds for the 2.5 branch to follow

Thanks,
David Arthur


Re: Subject: [VOTE] 2.2.2 RC2

2019-11-08 Thread David Arthur
* Glanced through docs, release notes
* Downloaded RC2 binaries, verified signatures
* Ran through quickstart

+1 binding

Thanks for managing this release, Randall!

-David

On Wed, Nov 6, 2019 at 7:39 PM Eric Lalonde  wrote:

> Hello,
>
> In an effort to assist in the verification of release candidates, I have
> authored the following quick-and-dirty utility to help people verify
> release candidate artifacts:
> https://github.com/elalonde/kafka/blob/master/bin/verify-kafka-rc.sh <
> https://github.com/elalonde/kafka/blob/master/bin/verify-kafka-rc.sh> . I
> have executed this script for 2.2.2 rc2 and everything looks good:
> - all checksums verify
> - all executed gradle commands succeed
> - all unit and integration tests pass.
>
> Hope this helps in the release of 2.2.2.
>
> - Eric
>
> > On Nov 5, 2019, at 7:55 AM, Randall Hauch  wrote:
> >
> > Thanks, Mickael!
> >
> > Anyone else get a chance to validate the 2.2.2 RC2 build? It'd be great
> to
> > get this out the door.
> >
> > Randall
> >
> > On Tue, Nov 5, 2019 at 6:34 AM Mickael Maison 
> > wrote:
> >
> >> +1 (non binding)
> >> I verified signatures, built it from source, ran unit tests and
> quickstart
> >>
> >>
> >>
> >> On Fri, Oct 25, 2019 at 3:10 PM Randall Hauch  wrote:
> >>>
> >>> Hello all, we identified around three dozen bug fixes, including an
> >> update
> >>> of a third party dependency, and wanted to release a patch release for
> >> the
> >>> Apache Kafka 2.2.0 release.
> >>>
> >>> This is the *second* candidate for release of Apache Kafka 2.2.2. (RC1
> >> did
> >>> not include a fix for https://issues.apache.org/jira/browse/KAFKA-9053
> ,
> >> but
> >>> the fix appeared before RC1 was announced so it was easier to just
> create
> >>> RC2.)
> >>>
> >>> Check out the release notes for a complete list of the changes in this
> >>> release candidate:
> >>> https://home.apache.org/~rhauch/kafka-2.2.2-rc2/RELEASE_NOTES.html
> >>>
> >>> *** Please download, test and vote by Wednesday, October 30, 9am PT
> >>>
> >>> Kafka's KEYS file containing PGP keys we use to sign the release:
> >>> https://kafka.apache.org/KEYS
> >>>
> >>> * Release artifacts to be voted upon (source and binary):
> >>> https://home.apache.org/~rhauch/kafka-2.2.2-rc2/
> >>>
> >>> * Maven artifacts to be voted upon:
> >>> https://repository.apache.org/content/groups/staging/org/apache/kafka/
> >>>
> >>> * Javadoc:
> >>> https://home.apache.org/~rhauch/kafka-2.2.2-rc2/javadoc/
> >>>
> >>> * Tag to be voted upon (off 2.2 branch) is the 2.2.2 tag:
> >>> https://github.com/apache/kafka/releases/tag/2.2.2-rc2
> >>>
> >>> * Documentation:
> >>> https://kafka.apache.org/22/documentation.html
> >>>
> >>> * Protocol:
> >>> https://kafka.apache.org/22/protocol.html
> >>>
> >>> * Successful Jenkins builds for the 2.2 branch:
> >>> Unit/integration tests:
> https://builds.apache.org/job/kafka-2.2-jdk8/1/
> >>> System tests:
> >>> https://jenkins.confluent.io/job/system-test-kafka/job/2.2/216/
> >>>
> >>> /**
> >>>
> >>> Thanks,
> >>>
> >>> Randall Hauch
> >>
>
>

-- 
David Arthur


[ANNOUNCE] Apache Kafka 2.3.1

2019-10-24 Thread David Arthur
The Apache Kafka community is pleased to announce the release for Apache
Kafka 2.3.1

This is a bugfix release for Kafka 2.3.0. All of the changes in this
release can be found in the release notes:
https://www.apache.org/dist/kafka/2.3.1/RELEASE_NOTES.html


You can download the source and binary release (with Scala 2.11 or 2.12)
from:
https://kafka.apache.org/downloads#2.3.1

---


Apache Kafka is a distributed streaming platform with four core APIs:


** The Producer API allows an application to publish a stream of records to
one or more Kafka topics.

** The Consumer API allows an application to subscribe to one or more
topics and process the stream of records produced to them.

** The Streams API allows an application to act as a stream processor,
consuming an input stream from one or more topics and producing an
output stream to one or more output topics, effectively transforming the
input streams to output streams.

** The Connector API allows building and running reusable producers or
consumers that connect Kafka topics to existing applications or data
systems. For example, a connector to a relational database might
capture every change to a table.
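
To make the Producer API described above concrete, a minimal client configuration might look like the following sketch. The bootstrap address is a placeholder; the serializer classes are the standard string serializers shipped with the Java client:

```properties
# producer.properties sketch (bootstrap address is a placeholder)
bootstrap.servers=broker1:9092
key.serializer=org.apache.kafka.common.serialization.StringSerializer
value.serializer=org.apache.kafka.common.serialization.StringSerializer
acks=all
```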


With these APIs, Kafka can be used for two broad classes of application:

** Building real-time streaming data pipelines that reliably get data
between systems or applications.

** Building real-time streaming applications that transform or react
to the streams of data.


Apache Kafka is in use at large and small companies worldwide, including
Capital One, Goldman Sachs, ING, LinkedIn, Netflix, Pinterest, Rabobank,
Target, The New York Times, Uber, Yelp, and Zalando, among others.

A big thank you to the following 41 contributors to this release!

A. Sophie Blee-Goldman, Arjun Satish, Bill Bejeck, Bob Barrett, Boyang
Chen, Bruno Cadonna, Cheng Pan, Chia-Ping Tsai, Chris Egerton, Chris
Stromberger, Colin P. Mccabe, Colin Patrick McCabe, cpettitt-confluent,
cwildman, David Arthur, Dhruvil Shah, Greg Harris, Gunnar Morling, Guozhang
Wang, huxi, Ismael Juma, Jason Gustafson, John Roesler, Konstantine
Karantasis, Lee Dongjin, LuyingLiu, Magesh Nandakumar, Matthias J. Sax,
Michał Borowiecki, Mickael Maison, mjarvie, Nacho Muñoz Gómez, Nigel Liang,
Paul, Rajini Sivaram, Randall Hauch, Robert Yokota, slim, Tirtha
Chatterjee, vinoth chandar, Will James

We welcome your help and feedback. For more information on how to
report problems, and to get involved, visit the project website at
https://kafka.apache.org/

Thank you!


Regards,
David Arthur


Re: [VOTE] 2.3.1 RC2

2019-10-24 Thread David Arthur
Thanks to everyone who voted!

The vote for RC2 of the 2.3.1 release passes with 6 +1 votes and no +0 or
-1 votes.

+1 votes
PMC Members:
* Jason Gustafson
* Guozhang Wang
* Matthias Sax
* Rajini Sivaram

Committers:
* Colin McCabe

Community:
* Jonathan Santilli

0 votes
* No votes

-1 votes
* No votes

I will proceed with the release process and send out the release
announcement in the next day or so.

Cheers,
David

On Thu, Oct 24, 2019 at 4:43 AM Rajini Sivaram 
wrote:

> +1 (binding)
>
> Verified signatures, built source and ran tests, verified binary using
> broker, producer and consumer with security enabled.
>
> Regards,
>
> Rajini
>
>
>
> On Wed, Oct 23, 2019 at 11:37 PM Matthias J. Sax 
> wrote:
>
> > +1 (binding)
> >
> > - downloaded and compiled source code
> > - verified signatures for source code and Scala 2.11 binary
> > - run core/connect/streams quickstart using Scala 2.11 binaries
> >
> >
> > -Matthias
> >
> >
> > On 10/23/19 2:43 PM, Colin McCabe wrote:
> > > + d...@kafka.apache.org
> > >
> > > On Tue, Oct 22, 2019, at 15:48, Colin McCabe wrote:
> > >> +1.  I ran the broker, producer, consumer, etc.
> > >>
> > >> best,
> > >> Colin
> > >>
> > >> On Tue, Oct 22, 2019, at 13:32, Guozhang Wang wrote:
> > >>> +1. I've ran the quick start and unit tests.
> > >>>
> > >>>
> > >>> Guozhang
> > >>>
> > >>> On Tue, Oct 22, 2019 at 12:57 PM David Arthur 
> > wrote:
> > >>>
> > >>>> Thanks, Jonathan and Jason. I've updated the release notes along
> with
> > the
> > >>>> signature and checksums. KAFKA-9053 was also missing.
> > >>>>
> > >>>> On Tue, Oct 22, 2019 at 3:47 PM Jason Gustafson  >
> > >>>> wrote:
> > >>>>
> > >>>>> +1
> > >>>>>
> > >>>>> I ran the basic quickstart on the 2.12 artifact and verified
> > >>>>> signatures/checksums.
> > >>>>>
> > >>>>> I also looked over the release notes. I see that KAFKA-8950 is
> > included,
> > >>>> so
> > >>>>> maybe they just need to be refreshed.
> > >>>>>
> > >>>>> Thanks for running the release!
> > >>>>>
> > >>>>> -Jason
> > >>>>>
> > >>>>> On Fri, Oct 18, 2019 at 5:23 AM David Arthur 
> > wrote:
> > >>>>>
> > >>>>>> We found a few more critical issues and so have decided to do one
> > more
> > >>>> RC
> > >>>>>> for 2.3.1. Please review the release notes:
> > >>>>>>
> > >>>>
> > https://home.apache.org/~davidarthur/kafka-2.3.1-rc2/RELEASE_NOTES.html
> > >>>>>>
> > >>>>>>
> > >>>>>> *** Please download, test and vote by Tuesday, October 22, 9pm PDT
> > >>>>>>
> > >>>>>>
> > >>>>>> Kafka's KEYS file containing PGP keys we use to sign the release:
> > >>>>>>
> > >>>>>> https://kafka.apache.org/KEYS
> > >>>>>>
> > >>>>>>
> > >>>>>> * Release artifacts to be voted upon (source and binary):
> > >>>>>>
> > >>>>>> https://home.apache.org/~davidarthur/kafka-2.3.1-rc2/
> > >>>>>>
> > >>>>>>
> > >>>>>> * Maven artifacts to be voted upon:
> > >>>>>>
> > >>>>>>
> > https://repository.apache.org/content/groups/staging/org/apache/kafka/
> > >>>>>>
> > >>>>>>
> > >>>>>> * Javadoc:
> > >>>>>>
> > >>>>>> https://home.apache.org/~davidarthur/kafka-2.3.1-rc2/javadoc/
> > >>>>>>
> > >>>>>>
> > >>>>>> * Tag to be voted upon (off 2.3 branch) is the 2.3.1 tag:
> > >>>>>>
> > >>>>>> https://github.com/apache/kafka/releases/tag/2.3.1-rc2
> > >>>>>>
> > >>>>>>
> > >>>>>> * Documentation:
> > >>>>>>
> > >>>>>> https://kafka.apache.org/23/documentation.html
> > >>>>>>
> > >>>>>>
> > >>>>>> * Protocol:
> > >>>>>>
> > >>>>>> https://kafka.apache.org/23/protocol.html
> > >>>>>>
> > >>>>>>
> > >>>>>> * Successful Jenkins builds to follow
> > >>>>>>
> > >>>>>>
> > >>>>>> Thanks!
> > >>>>>>
> > >>>>>> David
> > >>>>>>
> > >>>>>
> > >>>>
> > >>>>
> > >>>> --
> > >>>> David Arthur
> > >>>>
> > >>>
> > >>>
> > >>> --
> > >>> -- Guozhang
> > >>>
> > >>
> >
> >
>


-- 
David Arthur


Re: [VOTE] 2.3.1 RC2

2019-10-22 Thread David Arthur
Thanks, Jonathan and Jason. I've updated the release notes along with the
signature and checksums. KAFKA-9053 was also missing.

On Tue, Oct 22, 2019 at 3:47 PM Jason Gustafson  wrote:

> +1
>
> I ran the basic quickstart on the 2.12 artifact and verified
> signatures/checksums.
>
> I also looked over the release notes. I see that KAFKA-8950 is included, so
> maybe they just need to be refreshed.
>
> Thanks for running the release!
>
> -Jason
>
> On Fri, Oct 18, 2019 at 5:23 AM David Arthur  wrote:
>
> > We found a few more critical issues and so have decided to do one more RC
> > for 2.3.1. Please review the release notes:
> > https://home.apache.org/~davidarthur/kafka-2.3.1-rc2/RELEASE_NOTES.html
> >
> >
> > *** Please download, test and vote by Tuesday, October 22, 9pm PDT
> >
> >
> > Kafka's KEYS file containing PGP keys we use to sign the release:
> >
> > https://kafka.apache.org/KEYS
> >
> >
> > * Release artifacts to be voted upon (source and binary):
> >
> > https://home.apache.org/~davidarthur/kafka-2.3.1-rc2/
> >
> >
> > * Maven artifacts to be voted upon:
> >
> > https://repository.apache.org/content/groups/staging/org/apache/kafka/
> >
> >
> > * Javadoc:
> >
> > https://home.apache.org/~davidarthur/kafka-2.3.1-rc2/javadoc/
> >
> >
> > * Tag to be voted upon (off 2.3 branch) is the 2.3.1 tag:
> >
> > https://github.com/apache/kafka/releases/tag/2.3.1-rc2
> >
> >
> > * Documentation:
> >
> > https://kafka.apache.org/23/documentation.html
> >
> >
> > * Protocol:
> >
> > https://kafka.apache.org/23/protocol.html
> >
> >
> > * Successful Jenkins builds to follow
> >
> >
> > Thanks!
> >
> > David
> >
>


-- 
David Arthur


[VOTE] 2.3.1 RC2

2019-10-18 Thread David Arthur
We found a few more critical issues and so have decided to do one more RC
for 2.3.1. Please review the release notes:
https://home.apache.org/~davidarthur/kafka-2.3.1-rc2/RELEASE_NOTES.html


*** Please download, test and vote by Tuesday, October 22, 9pm PDT


Kafka's KEYS file containing PGP keys we use to sign the release:

https://kafka.apache.org/KEYS


* Release artifacts to be voted upon (source and binary):

https://home.apache.org/~davidarthur/kafka-2.3.1-rc2/


* Maven artifacts to be voted upon:

https://repository.apache.org/content/groups/staging/org/apache/kafka/


* Javadoc:

https://home.apache.org/~davidarthur/kafka-2.3.1-rc2/javadoc/


* Tag to be voted upon (off 2.3 branch) is the 2.3.1 tag:

https://github.com/apache/kafka/releases/tag/2.3.1-rc2


* Documentation:

https://kafka.apache.org/23/documentation.html


* Protocol:

https://kafka.apache.org/23/protocol.html


* Successful Jenkins builds to follow


Thanks!

David


Re: [VOTE] 2.3.1 RC1

2019-10-06 Thread David Arthur
Passing builds:
Unit/integration tests https://builds.apache.org/job/kafka-2.3-jdk8/122/
System tests https://jenkins.confluent.io/job/system-test-kafka/job/2.3/142/


On Fri, Oct 4, 2019 at 9:52 PM David Arthur  wrote:

> Hello all, we identified a few bugs and a dependency update we wanted to
> get fixed for 2.3.1. In particular, there was a problem with rolling
> upgrades of streams applications (KAFKA-8649).
>
> Check out the release notes for a complete list.
> https://home.apache.org/~davidarthur/kafka-2.3.1-rc1/RELEASE_NOTES.html
>
> *** Please download, test and vote by Wednesday October 9th, 9pm PST
>
> Kafka's KEYS file containing PGP keys we use to sign the release:
> https://kafka.apache.org/KEYS
>
> * Release artifacts to be voted upon (source and binary):
> https://home.apache.org/~davidarthur/kafka-2.3.1-rc1/
>
> * Maven artifacts to be voted upon:
> https://repository.apache.org/content/groups/staging/org/apache/kafka/
>
> * Javadoc:
> https://home.apache.org/~davidarthur/kafka-2.3.1-rc1/javadoc/
>
> * Tag to be voted upon (off 2.3 branch) is the 2.3.1 tag:
> https://github.com/apache/kafka/releases/tag/2.3.1-rc1
>
> * Documentation:
> https://kafka.apache.org/23/documentation.html
>
> * Protocol:
> https://kafka.apache.org/23/protocol.html
>
> * Successful Jenkins builds for the 2.3 branch are TBD but will be located:
>
> Unit/integration tests: https://builds.apache.org/job/kafka-2.3-jdk8/
>
> System tests: https://jenkins.confluent.io/job/system-test-kafka/job/2.3/
>
>
> Thanks!
> David Arthur
>


-- 
David Arthur


Re: [kafka-clients] Re: [VOTE] 2.3.1 RC0

2019-10-04 Thread David Arthur
RC0 was cancelled and a new voting thread for RC1 was just sent out.

Thanks!

On Fri, Oct 4, 2019 at 11:06 AM Matt Farmer  wrote:

> Do we have an ETA on when y'all think 2.3.1 will land?
>
> On Sat, Sep 28, 2019 at 1:55 PM Matthias J. Sax 
> wrote:
>
> > There was a recent report about vulnerabilities of some dependent
> > libraries: https://issues.apache.org/jira/browse/KAFKA-8952
> >
> > I think we should fix this for 2.3.1.
> >
> > Furthermore, we identified the root cause of
> > https://issues.apache.org/jira/browse/KAFKA-8649 -- it seems to be a
> > critical issue because it affects upgrading of Kafka Streams
> > applications. We plan to do a PR asap and hope we can include it in
> 2.3.1.
> >
> >
> > -Matthias
> >
> > On 9/25/19 11:57 AM, David Arthur wrote:
> > > Thanks, Jason. I agree we should include this. I'll produce RC1 once
> > > this patch is available.
> > >
> > > -David
> > >
> > > On Tue, Sep 24, 2019 at 6:02 PM Jason Gustafson <ja...@confluent.io> wrote:
> > >
> > > Hi David,
> > >
> > > Thanks for running the release. I think we should consider getting
> > > this bug
> > > fixed: https://issues.apache.org/jira/browse/KAFKA-8896. The
> impact
> > > of this
> > > bug is that consumer groups cannot commit offsets or rebalance. The
> > > patch
> > > should be ready shortly.
> > >
> > > Thanks,
> > > Jason
> > >
> > >
> > >
> > > > On Fri, Sep 13, 2019 at 3:53 PM David Arthur <davidart...@apache.org> wrote:
> > >
> > > > Hello Kafka users, developers and client-developers,
> > > >
> > > >
> > > > This is the first candidate for release of Apache Kafka 2.3.1
> which
> > > > includes many bug fixes for Apache Kafka 2.3.
> > > >
> > > >
> > > > Release notes for the 2.3.1 release:
> > > >
> > > >
> > >
> > https://home.apache.org/~davidarthur/kafka-2.3.1-rc0/RELEASE_NOTES.html
> > > >
> > > >
> > > > *** Please download, test and vote by Wednesday, September 18,
> 9am
> > PT
> > > >
> > > >
> > > > Kafka's KEYS file containing PGP keys we use to sign the release:
> > > >
> > > > https://kafka.apache.org/KEYS
> > > >
> > > >
> > > > * Release artifacts to be voted upon (source and binary):
> > > >
> > > > https://home.apache.org/~davidarthur/kafka-2.3.1-rc0/
> > > >
> > > >
> > > > * Maven artifacts to be voted upon:
> > > >
> > > >
> > https://repository.apache.org/content/groups/staging/org/apache/kafka/
> > > >
> > > >
> > > > * Javadoc:
> > > >
> > > > https://home.apache.org/~davidarthur/kafka-2.3.1-rc0/javadoc/
> > > >
> > > >
> > > > * Tag to be voted upon (off 2.3 branch) is the 2.3.1 tag:
> > > >
> > > > https://github.com/apache/kafka/releases/tag/2.3.1-rc0
> > > >
> > > >
> > > > * Documentation:
> > > >
> > > > https://kafka.apache.org/23/documentation.html
> > > >
> > > >
> > > > * Protocol:
> > > >
> > > > https://kafka.apache.org/23/protocol.html
> > > >
> > > >
> > > > * Successful Jenkins builds for the 2.3 branch:
> > > >
> > > > Unit/integration tests:
> > https://builds.apache.org/job/kafka-2.3-jdk8/
> > > >
> > > > System tests:
> > > > https://jenkins.confluent.io/job/system-test-kafka/job/2.3/119
> > > >
> > > >
> > > >
> > > > We have yet to get a successful unit/integration job run due to
> > > some flaky
> > > > failures. I will send out a follow-up email once we have a
> passing
> > > build.
> > > >
> > > >
> > > > Thanks!
> > > >
> > > > David
> > > >
> > >
> > >
> > >
> > > --
> > > David Arthur
> > >
> >
> >
>


-- 
David Arthur


[VOTE] 2.3.1 RC1

2019-10-04 Thread David Arthur
Hello all, we identified a few bugs and a dependency update we wanted to
get fixed for 2.3.1. In particular, there was a problem with rolling
upgrades of streams applications (KAFKA-8649).

Check out the release notes for a complete list.
https://home.apache.org/~davidarthur/kafka-2.3.1-rc1/RELEASE_NOTES.html

*** Please download, test and vote by Wednesday October 9th, 9pm PST

Kafka's KEYS file containing PGP keys we use to sign the release:
https://kafka.apache.org/KEYS

* Release artifacts to be voted upon (source and binary):
https://home.apache.org/~davidarthur/kafka-2.3.1-rc1/

* Maven artifacts to be voted upon:
https://repository.apache.org/content/groups/staging/org/apache/kafka/

* Javadoc:
https://home.apache.org/~davidarthur/kafka-2.3.1-rc1/javadoc/

* Tag to be voted upon (off 2.3 branch) is the 2.3.1 tag:
https://github.com/apache/kafka/releases/tag/2.3.1-rc1

* Documentation:
https://kafka.apache.org/23/documentation.html

* Protocol:
https://kafka.apache.org/23/protocol.html

* Successful Jenkins builds for the 2.3 branch are TBD but will be located:

Unit/integration tests: https://builds.apache.org/job/kafka-2.3-jdk8/

System tests: https://jenkins.confluent.io/job/system-test-kafka/job/2.3/


Thanks!
David Arthur


Re: [VOTE] 2.3.1 RC0

2019-09-25 Thread David Arthur
Thanks, Jason. I agree we should include this. I'll produce RC1 once this
patch is available.

-David

On Tue, Sep 24, 2019 at 6:02 PM Jason Gustafson  wrote:

> Hi David,
>
> Thanks for running the release. I think we should consider getting this bug
> fixed: https://issues.apache.org/jira/browse/KAFKA-8896. The impact of
> this
> bug is that consumer groups cannot commit offsets or rebalance. The patch
> should be ready shortly.
>
> Thanks,
> Jason
>
>
>
> On Fri, Sep 13, 2019 at 3:53 PM David Arthur 
> wrote:
>
> > Hello Kafka users, developers and client-developers,
> >
> >
> > This is the first candidate for release of Apache Kafka 2.3.1 which
> > includes many bug fixes for Apache Kafka 2.3.
> >
> >
> > Release notes for the 2.3.1 release:
> >
> > https://home.apache.org/~davidarthur/kafka-2.3.1-rc0/RELEASE_NOTES.html
> >
> >
> > *** Please download, test and vote by Wednesday, September 18, 9am PT
> >
> >
> > Kafka's KEYS file containing PGP keys we use to sign the release:
> >
> > https://kafka.apache.org/KEYS
> >
> >
> > * Release artifacts to be voted upon (source and binary):
> >
> > https://home.apache.org/~davidarthur/kafka-2.3.1-rc0/
> >
> >
> > * Maven artifacts to be voted upon:
> >
> > https://repository.apache.org/content/groups/staging/org/apache/kafka/
> >
> >
> > * Javadoc:
> >
> > https://home.apache.org/~davidarthur/kafka-2.3.1-rc0/javadoc/
> >
> >
> > * Tag to be voted upon (off 2.3 branch) is the 2.3.1 tag:
> >
> > https://github.com/apache/kafka/releases/tag/2.3.1-rc0
> >
> >
> > * Documentation:
> >
> > https://kafka.apache.org/23/documentation.html
> >
> >
> > * Protocol:
> >
> > https://kafka.apache.org/23/protocol.html
> >
> >
> > * Successful Jenkins builds for the 2.3 branch:
> >
> > Unit/integration tests: https://builds.apache.org/job/kafka-2.3-jdk8/
> >
> > System tests:
> > https://jenkins.confluent.io/job/system-test-kafka/job/2.3/119
> >
> >
> >
> > We have yet to get a successful unit/integration job run due to some
> flaky
> > failures. I will send out a follow-up email once we have a passing build.
> >
> >
> > Thanks!
> >
> > David
> >
>


-- 
David Arthur


Re: Delivery Status Notification (Failure)

2019-09-16 Thread David Arthur
And here's a passing build for the 2.3 branch
https://builds.apache.org/view/All/job/kafka-2.3-jdk8/108/

On Mon, Sep 16, 2019 at 3:46 PM David Arthur  wrote:

> And here's a passing build for the 2.3 branch
> https://builds.apache.org/view/All/job/kafka-2.3-jdk8/108/
>
> On Fri, Sep 13, 2019 at 6:53 PM Mail Delivery Subsystem <
> mailer-dae...@googlemail.com> wrote:
>
>> Hello davidart...@apache.org,
>>
>> We're writing to let you know that the group you tried to contact
>> (kafka-clients) may not exist, or you may not have permission to post
>> messages to the group. A few more details on why you weren't able to post:
>>
>>  * You might have spelled or formatted the group name incorrectly.
>>  * The owner of the group may have removed this group.
>>  * You may need to join the group before receiving permission to post.
>>  * This group may not be open to posting.
>>
>> If you have questions related to this or any other Google Group, visit
>> the Help Center at https://groups.google.com/support/.
>>
>> Thanks,
>>
>> Google Groups
>>
>>
>>

[VOTE] 2.3.1 RC0

2019-09-13 Thread David Arthur
Hello Kafka users, developers and client-developers,


This is the first candidate for release of Apache Kafka 2.3.1 which
includes many bug fixes for Apache Kafka 2.3.


Release notes for the 2.3.1 release:

https://home.apache.org/~davidarthur/kafka-2.3.1-rc0/RELEASE_NOTES.html


*** Please download, test and vote by Wednesday, September 18, 9am PT


Kafka's KEYS file containing PGP keys we use to sign the release:

https://kafka.apache.org/KEYS


* Release artifacts to be voted upon (source and binary):

https://home.apache.org/~davidarthur/kafka-2.3.1-rc0/


* Maven artifacts to be voted upon:

https://repository.apache.org/content/groups/staging/org/apache/kafka/


* Javadoc:

https://home.apache.org/~davidarthur/kafka-2.3.1-rc0/javadoc/


* Tag to be voted upon (off 2.3 branch) is the 2.3.1 tag:

https://github.com/apache/kafka/releases/tag/2.3.1-rc0


* Documentation:

https://kafka.apache.org/23/documentation.html


* Protocol:

https://kafka.apache.org/23/protocol.html


* Successful Jenkins builds for the 2.3 branch:

Unit/integration tests: https://builds.apache.org/job/kafka-2.3-jdk8/

System tests: https://jenkins.confluent.io/job/system-test-kafka/job/2.3/119



We have yet to get a successful unit/integration job run due to some flaky
failures. I will send out a follow-up email once we have a passing build.


Thanks!

David


Re: [VOTE] 2.2.0 RC2

2019-03-19 Thread David Arthur
+1

Validated signatures, and ran through quick-start.

Thanks!

On Mon, Mar 18, 2019 at 4:00 AM Jakub Scholz  wrote:

> +1 (non-binding). I used the staged binaries and run some of my tests
> against them. All seems to look good to me.
>
> On Sat, Mar 9, 2019 at 11:56 PM Matthias J. Sax 
> wrote:
>
> > Hello Kafka users, developers and client-developers,
> >
> > This is the third candidate for release of Apache Kafka 2.2.0.
> >
> >  - Added SSL support for custom principal name
> >  - Allow SASL connections to periodically re-authenticate
> >  - Command line tool bin/kafka-topics.sh adds AdminClient support
> >  - Improved consumer group management
> >- default group.id is `null` instead of empty string
> >  - API improvement
> >- Producer: introduce close(Duration)
> >- AdminClient: introduce close(Duration)
> >- Kafka Streams: new flatTransform() operator in Streams DSL
> >- KafkaStreams (and other classes) now implement AutoCloseable to
> > support try-with-resource
> >- New Serdes and default method implementations
> >  - Kafka Streams exposed internal client.id via ThreadMetadata
> >  - Metric improvements:  All `-min`, `-avg` and `-max` metrics will now
> > output `NaN` as default value
> > Release notes for the 2.2.0 release:
> > https://home.apache.org/~mjsax/kafka-2.2.0-rc2/RELEASE_NOTES.html
> >
> > *** Please download, test, and vote by Thursday, March 14, 9am PST.
> >
> > Kafka's KEYS file containing PGP keys we use to sign the release:
> > https://kafka.apache.org/KEYS
> >
> > * Release artifacts to be voted upon (source and binary):
> > https://home.apache.org/~mjsax/kafka-2.2.0-rc2/
> >
> > * Maven artifacts to be voted upon:
> > https://repository.apache.org/content/groups/staging/org/apache/kafka/
> >
> > * Javadoc:
> > https://home.apache.org/~mjsax/kafka-2.2.0-rc2/javadoc/
> >
> > * Tag to be voted upon (off 2.2 branch) is the 2.2.0 tag:
> > https://github.com/apache/kafka/releases/tag/2.2.0-rc2
> >
> > * Documentation:
> > https://kafka.apache.org/22/documentation.html
> >
> > * Protocol:
> > https://kafka.apache.org/22/protocol.html
> >
> > * Jenkins builds for the 2.2 branch:
> > Unit/integration tests: https://builds.apache.org/job/kafka-2.2-jdk8/
> > System tests:
> https://jenkins.confluent.io/job/system-test-kafka/job/2.2/
> >
> > /**
> >
> > Thanks,
> >
> > -Matthias
> >
> >
>


IRC logs now available on botbot.me

2014-09-10 Thread David Arthur

https://botbot.me/freenode/apache-kafka/

Just FYI, wasn't sure if we had any logging in place

Cheers,
David




Re: python and kafka - how to use as a queue

2014-02-08 Thread David Arthur

jsh,

Please open an issue at https://github.com/mumrah/kafka-python so other 
users/devs of this library have visibility


Thanks!
-David

On 1/16/14 9:18 PM, Jagbir Hooda wrote:

Hi Arthur,

I'm running into a very similar issue even with the latest version ( 
kafka-python @ V. 0.8.1_1 used with kafka_2.8.0-0.8.0.tar.gz). I have created a 
topic 'my-topic' with two partitions and 1-replication (across a set of 3 kafka 
brokers). I've published 100 messages to the topic (see Reference below). Now 
each time when I run the following consumer test

--8<--
import logging
import time
from kafka.client import KafkaClient
from kafka.consumer import SimpleConsumer
from kafka.producer import SimpleProducer, KeyedProducer

kafka = KafkaClient("kafkabroker2", 9092)

consumer = SimpleConsumer(kafka, "my-group", "my-topic", auto_commit=True, 
auto_commit_every_n=10)

for message in consumer:
 time.sleep(1)
 print(message)
--8<--

I get back all the 100 messages. You mentioned that with kafka 0.8 there will be an 
offset stored in zookeeper (via broker) which will prevent consumers from getting older 
messages. I'm curious how to use this feature. I also want to run multiple consumers (on 
different machines) with exactly the same test code as above and get only one message 
delivered to only one of the consumers in "my-group" (multiple consumers per 
queue behavior).

Thanks,
jsh

REFERENCE:

I've used following code to publish messages
-8<---
from kafka.client import KafkaClient
from kafka.consumer import SimpleConsumer
from kafka.producer import SimpleProducer, KeyedProducer

kafka = KafkaClient("kafkabroker2", 9092)

producer = SimpleProducer(kafka, "my-topic")
for i in range(0,100):
 producer.send_messages("some message {0}".format(i))
-8<---  
 




Re: C++ Producer => Broker => Java Consumer?

2014-02-08 Thread David Arthur
If you're working with more complex messages than strings, any of the 
plethora of cross-language serialization frameworks will work.



On 1/31/14 5:51 PM, Otis Gospodnetic wrote:

Beautiful then!  I thought this would cause problems with the Java consumer
not knowing how to deserialize, but it sounds like I don't have to worry.
  Excellent, thanks!

Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/


On Fri, Jan 31, 2014 at 5:43 PM, Philip O'Toole  wrote:


Exactly.

Our C++ producers simply stream bytes to 0.72 Kafka, following Kafka's
byte-level message spec. Our Java-based Consumers just read bytes and use
the standard IO libraries to deserialize the data.

Philip


On Fri, Jan 31, 2014 at 2:38 PM, Tom Brown  wrote:


The C++ program writes bytes to kafka, and java reads bytes from kafka.

Is there something special about the way the messages are being

serialized

in C++?

--Tom


On Fri, Jan 31, 2014 at 2:36 PM, Philip O'Toole 

wrote:

Is this a Kafka C++ lib you wrote yourself, or some open-source

library?

What version of Kafka?

Philip


On Fri, Jan 31, 2014 at 1:30 PM, Otis Gospodnetic <
otis.gospodne...@gmail.com> wrote:


Hi,

If Kafka Producer is using a C++ Kafka lib to produce messages, how

can

Kafka Consumers written in Java deserialize them?

Thanks,
Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/





Re: New Producer Public API

2014-01-31 Thread David Arthur


On 1/24/14 7:41 PM, Jay Kreps wrote:

Yeah I'll fix that name.

Hmm, yeah, I agree that often you want to be able to delay network
connectivity until you have started everything up. But at the same time I
kind of loathe special init() methods because you always forget to call them
and get one round of errors every time.
One pattern I've used in the past is to use lazy initialization but also 
provide a method to eagerly do it. E.g., if init() wasn't called, the 
first call of send() would call it for you.
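A minimal sketch of that pattern (class and method names here are illustrative, not the actual producer API):

```python
class LazyProducer:
    """Lazy initialization with an optional eager init() escape hatch."""

    def __init__(self, brokers):
        self.brokers = brokers
        self._connected = False

    def init(self):
        # Eager path: call this up front if you want connectivity
        # errors surfaced at startup rather than on first send().
        if not self._connected:
            self._connect()
            self._connected = True

    def send(self, message):
        # Lazy path: the first send() initializes if init() was
        # never called, so forgetting init() is not an error.
        self.init()
        return ("sent", message)

    def _connect(self):
        # Placeholder for real network setup.
        pass
```

Either call order works: `p.init(); p.send(m)` fails fast, while plain `p.send(m)` just connects on demand.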

I wonder if in those cases one could
just avoid creating the producer instance until you are ready to connect.
Basically if you think of the producer instance as the equivalent of a
socket connection or whatever this kind of makes sense.

-Jay


On Fri, Jan 24, 2014 at 4:34 PM, Roger Hoover wrote:


Jay,

Thanks for the explanation.  I didn't realize that the broker list was for
bootstrapping and was not required to be a complete list of all brokers
(although I see now that it's clearly stated in the text description of the
parameter).  Nonetheless, does it still make sense to make the config
parameter more clear?  Instead of BROKER_LIST_CONFIG, it could be something
like BROKER_LIST_INITIAL_CONFIG or BROKER_DISCOVERY_LIST_CONFIG or
BROKER_BOOTSTRAP_LIST_CONFIG?

  I like the idea of proactively checking that at least one broker url is
working and failing fast if it is not.  My 2 cents is that it should be
triggered by a method call like initialize() rather than doing it in the
constructor.  Sometimes for unit tests or other purposes, you want to be
able to create objects without triggering network dependencies.

Cheers,

Roger


On Fri, Jan 24, 2014 at 4:13 PM, Jay Kreps  wrote:


Roger,

These are good questions.

1. The producer since 0.8 is actually zookeeper free, so this is not new to
this client; it is true for the current client as well. Our experience was
that direct zookeeper connections from zillions of producers wasn't a good
idea for a number of reasons. Our intention is to remove this dependency
from the consumer as well. The configuration in the producer doesn't need
the full set of brokers, though, just one or two machines to bootstrap the
state of the cluster from--in other words it isn't like you need to
reconfigure your clients every time you add some servers. This is exactly
how zookeeper works too--if we used zookeeper you would need to give a list
of zk urls in case a particular zk server was down. Basically either way
you need a few statically configured nodes to go to discover the full state
of the cluster. For people who don't like hard coding hosts you can use a
VIP or dns or something instead.

2. Yes this is a good point and was a concern I had too--the current
behavior is that with bad urls the client would start normally and then
hang trying to fetch metadata when the first message is sent and finally
give up and throw an exception. This is not ideal.

The challenge is this: we use the bootstrap urls to fetch metadata for
particular topics but we don't know which until we start getting messages
for them. We have the option of fetching metadata for all topics but the
problem is that for a cluster hosting tens of thousands of topics that is
actually a ton of data.

An alternative that this just made me think of is that we could proactively
connect to bootstrap urls sequentially until one succeeds when the producer
is first created and fail fast if we can't establish a connection. This
would not be wasted work as we could use the connection for the metadata
request when the first message is sent. I like this solution and will
implement it. So thanks for asking!

-Jay
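For what it's worth, the fail-fast loop described above could look roughly like this (a network-free sketch; the injected `connect` callable stands in for opening a real socket):

```python
def first_reachable(bootstrap_urls, connect):
    """Try each bootstrap url in order and return (url, connection) for
    the first one that works; raise if none does, so the caller fails
    fast at construction time instead of hanging on the first send().
    """
    errors = []
    for url in bootstrap_urls:
        try:
            return url, connect(url)
        except OSError as exc:
            # Remember the failure and fall through to the next url.
            errors.append((url, exc))
    raise ConnectionError("no bootstrap broker reachable: %r" % errors)
```

The connection returned here is exactly the one that would later carry the first metadata request, so nothing is wasted.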



On Fri, Jan 24, 2014 at 2:46 PM, Roger Hoover 
wrote:
A couple comments:

1) Why does the config use a broker list instead of discovering the brokers
in ZooKeeper?  It doesn't match the HighLevelConsumer API.

2) It looks like broker connections are created on demand.  I'm wondering
if sometimes you might want to flush out config or network connectivity
issues before pushing the first message through.

Should there also be a KafkaProducer.connect() or .open() method or
connectAll()?  I guess it would try to connect to all brokers in the
BROKER_LIST_CONFIG

HTH,

Roger


On Fri, Jan 24, 2014 at 11:54 AM, Jay Kreps 

wrote:

As mentioned in a previous email we are working on a re-implementation of
the producer. I would like to use this email thread to discuss the details
of the public API and the configuration. I would love for us to be
incredibly picky about this public api now so it is as good as possible and
we don't need to break it in the future.

The best way to get a feel for the API is actually to take a look at the
javadoc, my hope is to get the api docs good enough so that it is
self-explanatory:

http://empathybox.com/kafka-javadoc/index.html?kafka/clients/producer/KafkaProducer.html

Please take a look at this API and give me any thoughts you may have!

It may also be reasonab

Re: Anyone working on a Kafka book?

2013-12-10 Thread David Arthur

There was some talk a few months ago, not sure what the current status is.

On 12/10/13 10:01 AM, S Ahmed wrote:

Is there a book or this was just an idea?


On Mon, Mar 25, 2013 at 12:42 PM, Chris Curtin wrote:


Thanks Jun,

I've updated the example with this information.

I've also removed some of the unnecessary newlines.

Thanks,

Chris


On Mon, Mar 25, 2013 at 12:04 PM, Jun Rao  wrote:


Chris,

This looks good. One thing about partitioning. Currently, if a message
doesn't have a key, we always use the random partitioner (regardless of
what "partitioner.class" is set to).

Thanks,

Jun







IRC channel has moved to #apache-kafka

2013-10-23 Thread David Arthur
We have created and registered a new IRC channel for the project. 
Freenode has setup a forward from #kafka to #apache-kafka for us, but 
people already in this channel will need to /join #apache-kafka


Sorry for the inconvenience.

-David


Re: Metadata API returns localhost.localdomain for one of the brokers in EC2

2013-10-03 Thread David Arthur
You can configure the hostname for the broker with the "host.name" 
property in the broker's config (server.properties?). If you don't 
specify one, then all interfaces will be bound and one hostname will be 
chosen to be published via ZooKeeper (which is what the metadata API reads).

See: http://kafka.apache.org/documentation.html#brokerconfigs

-David
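For example, a minimal server.properties fragment (the broker id and hostname below are made up -- substitute your instance's externally routable name):

```properties
# server.properties (fragment) -- illustrative values only
broker.id=1
port=9092
# Publish a routable hostname via ZooKeeper instead of whatever
# the OS reports (avoids localhost.localdomain on EC2):
host.name=ec2-203-0-113-10.compute-1.amazonaws.com
```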

On 10/3/13 2:57 AM, Aniket Bhatnagar wrote:

I have installed 2 brokers on EC2. I also have a (scala) application that
receives a data stream and pushes to the kafka cluster. By coincidence, a
(slightly heavier) EC2 instance is running both a kafka broker and the data
receiver application. I am noticing that all data receiver application
nodes that are not on the shared kafka + receiver app EC2 instance are
complaining about connect errors to localhost.localdomain:9092. Is this a
possible bug that results in Kafka detecting the instance hostname
as localhost.localdomain instead of the actual hostname?

Also, how do I fix this temporarily until a permanent fix is available?





Re: as i understand rebalance happens on client side

2013-10-01 Thread David Arthur

Kane,

I'm the creator of kafka-python, just thought I'd give some insight.

Consumer rebalancing is actually pretty tricky to get right. It requires 
interaction with ZooKeeper which (though possible via kazoo) is 
something I've tried to avoid in kafka-python. It also seems a little 
strange to me to mix your consumers between Java/Scala and Python. If 
you really need rebalancing between Python consumers, you'd have to 
implement that on top of kafka-python.


Once the coordinator API is finalized for 0.9, I (or someone) will work 
on implementing it in kafka-python


Cheers
-David

On 10/1/13 11:56 AM, Kane Kane wrote:

The reason I was asking is that this library seems to support only
SimpleConsumer (https://github.com/mumrah/kafka-python/). I was curious
whether rebalancing has to be implemented entirely on the client, or whether
Kafka has some server-side logic that prevents consuming from the same queue
when the SimpleConsumer API is used, but I see now that everything must be
implemented on the client side.

Thanks.


On Tue, Oct 1, 2013 at 8:52 AM, Guozhang Wang  wrote:


I do not understand your question, what are you trying to implement?


On Tue, Oct 1, 2013 at 8:42 AM, Kane Kane  wrote:


So essentially you can't do "queue" pattern, unless you somehow implement
locking on the client?


On Tue, Oct 1, 2013 at 8:35 AM, Guozhang Wang 

wrote:

SimpleConsumer does not have any concept of group management; only the
high-level consumers have it. So multiple simple consumers can
independently consume from the same partition(s).

Guozhang


On Tue, Oct 1, 2013 at 8:11 AM, Kane Kane 

wrote:

Yeah, I noticed that, I'm curious how balancing happens if SimpleConsumer
is used. I.e. I can provide a partition to read from if I use
SimpleConsumer, but what if someone else is already attached to that
partition, what would happen? Also what would happen if one SimpleConsumer
attached to all partitions? No one would be able to join?


On Tue, Oct 1, 2013 at 6:33 AM, Neha Narkhede <

neha.narkh...@gmail.com

wrote:
There are 2 types of consumer clients in Kafka - ZookeeperConsumerConnector
and SimpleConsumer. Only the former has the rebalancing logic.

Thanks,
Neha
On Oct 1, 2013 6:30 AM, "Kane Kane"  wrote:


But it looks like some clients don't implement it?




--
-- Guozhang




--
-- Guozhang





Re: Reading Kafka directly from Pig?

2013-08-07 Thread David Arthur
I'd be happy to, if and when it becomes a real thing. Still very alpha 
quality right now


On 8/7/13 10:58 AM, Russell Jurney wrote:

David, can you share the code on Github so we can take a look? This
sounds awesome.

Russell Jurney http://datasyndrome.com

On Aug 7, 2013, at 7:49 AM, Jun Rao  wrote:


David,

That's interesting. Kafka provides an infinite stream of data whereas Pig
works on a finite amount of data. How did you solve the mismatch?

Thanks,

Jun


On Wed, Aug 7, 2013 at 7:41 AM, David Arthur  wrote:


I've thrown together a Pig LoadFunc to read data from Kafka, so you could
load data like:

QUERY_LOGS = load 'kafka://localhost:9092/logs.query#8' using
com.mycompany.pig.KafkaAvroLoader('com.mycompany.Query');

The path part of the uri is the Kafka topic, and the fragment is the
number of partitions. In the implementation I have, it makes one input
split per partition. Offsets are not really dealt with at this point - it's
a rough prototype.

Anyone have thoughts on whether or not this is a good idea? I know usually
the pattern is: kafka -> hdfs -> mapreduce. If I'm only reading from this
data from Kafka once, is there any reason why I can't skip writing to HDFS?

Thanks!
-David





Re: Reading Kafka directly from Pig?

2013-08-07 Thread David Arthur
Right now it only terminates if SimpleConsumer hits the timeout, so in 
theory it can run forever. To bound the InputFormat, I would probably add a 
max time or max number of messages to consume (in addition to the timeout).
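Either bound is easy to sketch (a generic sketch, not the actual InputFormat code -- `consumer` here is any message iterator):

```python
import time

def bounded_consume(consumer, max_messages, max_seconds):
    """Read from an otherwise-infinite message stream, stopping after
    max_messages or max_seconds, whichever comes first."""
    out = []
    deadline = time.monotonic() + max_seconds
    for msg in consumer:
        if len(out) >= max_messages or time.monotonic() > deadline:
            break
        out.append(msg)
    return out
```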


I started by looking at the Camus code, but it was easier to whip up a 
simple InputFormat for testing. If this becomes a real thing, I would 
probably figure out how to utilize Camus.


-David

On 8/7/13 10:49 AM, Jun Rao wrote:

David,

That's interesting. Kafka provides an infinite stream of data whereas Pig
works on a finite amount of data. How did you solve the mismatch?

Thanks,

Jun


On Wed, Aug 7, 2013 at 7:41 AM, David Arthur  wrote:


I've thrown together a Pig LoadFunc to read data from Kafka, so you could
load data like:

QUERY_LOGS = load 'kafka://localhost:9092/logs.query#8' using
com.mycompany.pig.KafkaAvroLoader('com.mycompany.Query');

The path part of the uri is the Kafka topic, and the fragment is the
number of partitions. In the implementation I have, it makes one input
split per partition. Offsets are not really dealt with at this point - it's
a rough prototype.

Anyone have thoughts on whether or not this is a good idea? I know usually
the pattern is: kafka -> hdfs -> mapreduce. If I'm only reading from this
data from Kafka once, is there any reason why I can't skip writing to HDFS?

Thanks!
-David





Reading Kafka directly from Pig?

2013-08-07 Thread David Arthur
I've thrown together a Pig LoadFunc to read data from Kafka, so you 
could load data like:


QUERY_LOGS = load 'kafka://localhost:9092/logs.query#8' using 
com.mycompany.pig.KafkaAvroLoader('com.mycompany.Query');


The path part of the uri is the Kafka topic, and the fragment is the 
number of partitions. In the implementation I have, it makes one input 
split per partition. Offsets are not really dealt with at this point - 
it's a rough prototype.


Anyone have thoughts on whether or not this is a good idea? I know 
usually the pattern is: kafka -> hdfs -> mapreduce. If I'm only reading 
from this data from Kafka once, is there any reason why I can't skip 
writing to HDFS?


Thanks!
-David


Re: kafka 0.7 jar file

2013-07-24 Thread David Arthur
You also need to include the scala-library.jar at a minimum. If you are 
doing anything with ZooKeeper you will need the zookeeper and zkclient 
jars. Likewise, if you want Snappy compression, you will need snappy-java


HTH
-David



On 7/24/13 4:06 PM, Nandigam, Sujitha wrote:

Hi,

I have one application, based on a netty client, which gets messages from my 
server. I want to send these messages to kafka, so I wrote my producer code 
into the application by importing the kafka jar into the project and built a 
jar file.

Now I ran the application jar to run the producer class but got the below 
exception. I included the kafka jar in the classpath.

java.lang.NoClassDefFoundError: scala/ScalaObject
 at java.lang.ClassLoader.defineClass1(Native Method)
 at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)
 at java.lang.ClassLoader.defineClass(ClassLoader.java:615)
 at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141)
 at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)
 at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
 at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
 at java.security.AccessController.doPrivileged(Native Method)
 at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
 at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
 at java.lang.ClassLoader.defineClass1(Native Method)
 at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)
 at java.lang.ClassLoader.defineClass(ClassLoader.java:615)
 at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141)
 at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)
 at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
 at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
 at java.security.AccessController.doPrivileged(Native Method)
 at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
 at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:247)

Thanks,
Sujitha
"This message (including any attachments) is intended only for the use of the 
individual or entity to which it is addressed, and may contain information that is 
non-public, proprietary, privileged, confidential and exempt from disclosure under 
applicable law or may be constituted as attorney work product. If you are not the 
intended recipient, you are hereby notified that any use, dissemination, distribution, or 
copying of this communication is strictly prohibited. If you have received this message 
in error, notify sender immediately and delete this message immediately."





Re: invalid pom in maven central, for 0.8.0-beta1

2013-07-24 Thread David Arthur
For any Ivy people looking for a workaround, here is an ivysettings.xml 
that will resolve from Apache repos


https://gist.github.com/mumrah/6070548


On 7/24/13 6:01 AM, Joe Stein wrote:

Hi Jason, can you please try using
https://repository.apache.org/content/repositories/releases as a repository
until the beta2 release and confirm that works for you so we know the next
beta will fix things before the 0.8.0 release.  Thanks!

/***

  Joe Stein
  Founder, Principal Consultant
  Big Data Open Source Security LLC
  http://www.stealth.ly 
  Twitter: @allthingshadoop 

/

On Wed, Jul 24, 2013 at 1:53 AM, Jason Rosenberg  wrote:


Sorry, I realize now what I am observing here was discussed in a previous
thread.  Although I'm a bit different, in that I'm just trying to use
straight maven (no sbt or gradle, etc.).  Anyway, the pom in maven central
is invalid, and should probably be removed, I should think.

Jason


On Wed, Jul 24, 2013 at 1:47 AM, Jason Rosenberg  wrote:


I have been using a pom file for 0.8.0 that I hand-edited from the one
generated with sbt make:pom.  Now that there's a version up on maven
central, I'm trying to use that.

It looks like the pom file hosted now on maven central, is invalid for
maven?

I'm looking at this:


http://search.maven.org/remotecontent?filepath=org/apache/kafka/kafka_2.8.0/0.8.0-beta1/kafka_2.8.0-0.8.0-beta1.pom

It has 2 <dependencies> sections, which causes this error:

[WARNING] Invalid POM for org.apache.kafka:kafka_2.8.0:jar:0.8.0-beta1,
transitive dependencies (if any) will not be available, enable debug
logging for more details: 1 problem was encountered while building the
effective model
[FATAL] Non-parseable POM


/Users/jbr/.m2/repository/org/apache/kafka/kafka_2.8.0/0.8.0-beta1/kafka_2.8.0-0.8.0-beta1.pom:

Duplicated tag: 'dependencies' (position: START_TAG seen
...\n... @36:19)  @ line 36, column 19
  for project

It seems the offending section is:














Anyway, since I haven't heard of others having this issue, I'm a bit
perplexed.  Is everyone just downloading and hosting their own hand-edited
version of this pom file?

I'm using maven 3.0.3 if that's relevant.

Jason





Re: Meetup in Raleigh/Durham, NC

2013-07-24 Thread David Arthur
Here are the slides from the meetup 
http://www.slideshare.net/mumrah/kafka-talk-tri-hug


We had 40-50 people show up which is about average for our meetup.

Roughly half of the audience had heard of Kafka before this talk, and 
about 10 or so folks had used it or were using it in production.


On 7/22/13 1:07 AM, Jun Rao wrote:

David,

Thanks for sharing this.

Jun


On Thu, Jul 18, 2013 at 9:19 PM, David Arthur  wrote:


There is a Hadoop meetup happening in Durham next week. I'm presenting an
intro to Kafka.

I don't suspect there are any Kafka users in the area who are not already
members of TriHUG, but sending this email out just in case :)

http://www.meetup.com/TriHUG/events/129831822/

-David





Re: Logo

2013-07-22 Thread David Arthur

  
  
I actually did this the last time a logo was discussed :)

https://docs.google.com/drawings/d/11WHfjkRGbSiZK6rRkedCrgmgFoP_vQ-QuWNENd4u7UY/edit

As it turns out, it was a dung beetle in the book (I thought it was
a roach as well).

-David

On 7/22/13 2:59 PM, David Harris wrote:


  
  It should be a roach in honor of Franz Kafka's Metamorphosis.
  
  On 7/22/2013 2:55 PM, S Ahmed wrote:
  
  
Similar, yet different.  I like it!


On Mon, Jul 22, 2013 at 1:25 PM, Jay Kreps  wrote:



  Yeah, good point. I hadn't seen that before.

-Jay


On Mon, Jul 22, 2013 at 10:20 AM, Radek Gruchalski <
radek.gruchal...@portico.io> wrote:


  
296 looks familiar: https://www.nodejitsu.com/

Kind regards,
Radek Gruchalski
radek.gruchal...@technicolor.com (mailto:

  
  radek.gruchal...@technicolor.com)

  
| radek.gruchal...@portico.io (mailto:radek.gruchal...@portico.io) |
ra...@gruchalski.com (mailto:ra...@gruchalski.com)
00447889948663

Confidentiality:
This communication is intended for the above-named person and may be
confidential and/or legally privileged.
If it has come to you in error you must take no action based on it, nor
must you copy or show it to anyone; please delete/destroy and inform the
sender immediately.


On Monday, 22 July 2013 at 18:51, Jay Kreps wrote:



  Hey guys,

We need a logo!

I got a few designs from a 99 designs contest that I would like to put
forward:
https://issues.apache.org/jira/browse/KAFKA-982

If anyone else would like to submit a design that would be great.

Let's do a vote to choose one.

-Jay




  

  
  
  -- 

  
  David Harris
  Bridge Interactive Group
  email: dhar...@big-llc.com
  cell: 404-831-7015
  office: 888-901-0150
  
  Bridge Software Products:
  www.big-llc.com
  www.realvaluator.com
  www.rvleadgen.com
  


  



Re: Retrieve most-recent-n messages from kafka topic

2013-07-19 Thread David Arthur
There is no index-based access to messages in 0.7 like there is in 0.8. 
You have to start from a known good offset and iterate through the messages.


What's your use case? Running a job periodically that reads the latest N 
message from the queue? Is it impractical to run from the last known 
offset and only keep the last N?
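If keeping the last N is workable, it is cheap -- e.g. with a bounded deque (a generic sketch, not kafka-python API; `messages` is any message iterable):

```python
from collections import deque

def tail_n(messages, n):
    """Consume forward from a known offset and retain only the most
    recent n messages; the bounded deque discards everything older."""
    last = deque(maxlen=n)
    for m in messages:
        last.append(m)
    return list(last)
```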


On 7/19/13 3:45 AM, Shane Moriah wrote:

We're running Kafka 0.7 and I'm hitting some issues trying to access the
newest n messages in a topic (or at least in a broker/partition combo) and
wondering if my use case just isn't supported or if I'm missing something.
  What I'd like to be able to do is get the most recent offset from a
broker/partition combo, subtract an amount of bytes roughly equivalent to
messages_desired*bytes_per_message and then issue a FetchRequest with that
offset and amount of bytes.

I gathered from this post that I need to use the Simple Consumer in order to
do offset manipulation beyond the start-from-beginning and start-from-end
options.  And I saw from this post that
the offsets returned by getOffsetsBefore are really only the major
checkpoints when files are rolled over, every 500MB by default.  I also
found that if I take an offset returned from getOffsetsBefore and subtract
a fixed value, say 100KB, and submit that offset with a FetchRequest I get
a kafka.common.InvalidMessageSizeException, presumably since my computed
offset didn't align with a real message offset.

As far as I can tell, this leaves me only able to find the most recent
milestone offset, perhaps up to 500MB behind current data, and extract a
batch from that point forward. Is there any other way that I'm missing
here? The two things that seem to be lacking are access to the most recent
offset and the ability to rollback from that offset by a fixed amount of
bytes or messages without triggering the InvalidMessageSizeException.

Thanks,
Shane





Meetup in Raleigh/Durham, NC

2013-07-18 Thread David Arthur
There is a Hadoop meetup happening in Durham next week. I'm presenting 
an intro to Kafka.


I don't suspect there are any Kafka users in the area who are not 
already members of TriHUG, but sending this email out just in case :)


http://www.meetup.com/TriHUG/events/129831822/

-David


Re: message order, guarenteed?

2013-06-14 Thread David Arthur

Simple example of how to take advantage of this behavior:

Suppose you're sending document updates through Kafka. If you use the 
document ID as the message key and the default hash partitioner, the 
updates for a given document will exist on the same partition and come 
into the consumer in order.
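The property being relied on can be sketched like this (an illustrative hash partitioner, not Kafka's actual hash function -- the point is only that a fixed key always maps to the same partition):

```python
import hashlib

def partition_for(key, num_partitions):
    """Stable hash of the message key modulo the partition count, so
    all updates for a given key land on the same partition."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions
```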


On 6/10/13 8:37 AM, Neha Narkhede wrote:

Kafka guarantees order per topic  partition per source client.

Thanks,
Neha
On Jun 9, 2013 5:33 PM, "S Ahmed"  wrote:


I understand that there are no guarantees per say that a message may be a
duplicate (its the consumer's job to guarantee that), but when it comes to
message order, is kafka built in such a way that it is impossible to get
messages in the wrong order?

Certain use cases might not be sensitive to order, but when order is very
important, is kafka the wrong tool for the job or is there a way to get
this requirement?





Re: Versioning Schema's

2013-06-14 Thread David Arthur
I've done this in the past, and it worked out well. Stored Avro schema 
in ZooKeeper with an integer id and prefixed each message with the id. 
You have to make sure when you register a new schema that it resolves 
with the current version (ResolvingDecoder helps with this).


-David
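A minimal sketch of the framing David describes (the names and the 4-byte big-endian layout are illustrative assumptions, not Kafka's wire format): each message carries an integer schema id, and a registry (ZooKeeper in his setup) maps that id to the Avro schema.

```python
import struct

def encode(schema_id, payload):
    # 4-byte unsigned big-endian id, then the Avro-encoded record bytes.
    return struct.pack(">I", schema_id) + payload

def decode(message):
    (schema_id,) = struct.unpack(">I", message[:4])
    return schema_id, message[4:]

msg = encode(7, b"avro-encoded-record")
assert decode(msg) == (7, b"avro-encoded-record")
```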

On 6/13/13 4:07 AM, Shone Sadler wrote:

Thanks Jun & Phil!

Shone


On Thu, Jun 13, 2013 at 12:00 AM, Jun Rao  wrote:


Yes, we just have a customized encoder that encodes the first 4 bytes of the
md5 of the schema, followed by the Avro bytes.

Thanks,

Jun
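An illustrative sketch of the scheme Jun describes: the first 4 bytes of the schema's md5 serve as the id, followed by the Avro-encoded bytes. The function names are assumptions, not LinkedIn's actual code.

```python
import hashlib

def schema_id(schema_json):
    # First 4 bytes of the md5 of the schema text act as the id.
    return hashlib.md5(schema_json.encode()).digest()[:4]

def encode(schema_json, avro_bytes):
    return schema_id(schema_json) + avro_bytes

schema = '{"type": "record", "name": "Event", "fields": []}'
msg = encode(schema, b"\x00\x02")
assert msg[:4] == schema_id(schema) and msg[4:] == b"\x00\x02"
```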


On Wed, Jun 12, 2013 at 9:50 AM, Shone Sadler 
wrote:
Jun,

I like the idea of an explicit version field, if the schema can be derived
from the topic name itself. The storage (say 1-4 bytes) would require less
overhead than a 128-bit md5, at the added cost of managing the version#.

Is it correct to assume that your applications are using two schemas then,
one system-level schema to deserialize the schema id and bytes for the
application message, and a second schema to deserialize those bytes with the
application schema?

Thanks again!
Shone


On Wed, Jun 12, 2013 at 11:31 AM, Jun Rao  wrote:


Actually, currently our schema id is the md5 of the schema itself. Not
fully sure how this compares with an explicit version field in the

schema.

Thanks,

Jun


On Wed, Jun 12, 2013 at 8:29 AM, Jun Rao  wrote:


At LinkedIn, we are using option 2.

Thanks,

Jun


On Wed, Jun 12, 2013 at 7:14 AM, Shone Sadler <

shone.sad...@gmail.com

wrote:


Hello everyone,

After doing some searching on the mailing list for best practices on
integrating Avro with Kafka, there appear to be at least three options for
integrating the Avro schema: 1) embedding the entire schema within the
message, 2) embedding a unique identifier for the schema in the message, and
3) deriving the schema from the topic/resource name.

Option 2 appears to be the best in terms of both efficiency and flexibility.
However, from a programming perspective it complicates the solution with the
need for both an envelope schema (containing a "schema id" and "bytes" field
for record data) and a message schema (containing the application-specific
message fields). This requires two levels of serialization/deserialization.

Questions:
1) How are others dealing with versioning of schemas?
2) Is there a more elegant means of embedding schema ids in an Avro
message (I am new to both currently ;-)?

Thanks in advance!

Shone







Re: key used by producer

2013-05-09 Thread David Arthur
Yes, I'm pretty sure keys are now retained with the message and returned 
in the consumer.


In the Java/Scala client, ConsumerIterator returns a MessageAndMetadata 
which includes key, message, topic, partition, and offset


-David

On 5/9/13 10:44 AM, Yu, Libo wrote:

Hi,

I am looking at the example producer code for kafka 0.8.
I notice it is possible to specify a key when creating
KeyedMessage. This key will be used for assigning the
message to some partition. I wonder if the key will be
received by the consumer. Thanks.


Libo






Re: a few questions from high level consumer documentation.

2013-05-09 Thread David Arthur


On 5/9/13 8:27 AM, Chris Curtin wrote:

On Thu, May 9, 2013 at 12:36 AM, Rob Withers  wrote:




-Original Message-
From: Chris Curtin [mailto:curtin.ch...@gmail.com]

1 When you say the iterator may block, do you mean hasNext() may block?


Yes.

Is this due to a potential non-blocking fetch (broker/zookeeper returns an
empty block if offset is current)?  Yet this blocks the network call of the
consumer iterator, do I have that right?  Are there other reasons it could
block?  Like the call fails and a backup call is made?


I'll let the Kafka team answer this. I don't know the low level details.
The iterator will block if there is no more data to consume. The 
iterator is actually reading messages from a BlockingQueue which is fed 
messages by the fetcher threads. The reason for this is to allow you to 
configure blocking with or without a timeout in the ConsumerIterator. 
This is reflected in the consumer timeout property (consumer.timeout.ms)
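A rough model (not actual Kafka code) of the behavior described above: the iterator pulls from a blocking queue fed by fetcher threads, and an optional timeout bounds how long it waits, mirroring what consumer.timeout.ms configures. The class name is illustrative.

```python
import queue

class TimeoutIterator:
    def __init__(self, q, timeout_ms=None):
        self.q = q
        self.timeout = timeout_ms / 1000.0 if timeout_ms else None

    def __iter__(self):
        return self

    def __next__(self):
        try:
            # Blocks until a message arrives or the timeout elapses.
            return self.q.get(timeout=self.timeout)
        except queue.Empty:
            # Models the ConsumerTimeoutException raised on timeout.
            raise StopIteration

q = queue.Queue()
q.put("m1")
it = TimeoutIterator(q, timeout_ms=50)
assert next(it) == "m1"
```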




b. For a client crash, what can the client do to avoid duplicate messages
when restarted? What I can think of is to read the last message from the
log file and ignore the first few duplicate messages received until seeing
the last read message. But is it possible for the client to read the log
file directly?

If you can't tolerate the possibility of duplicates you need to look at the
Simple Consumer example. There you control the offset storage.

Do you have example code that manages exactly-once processing, even when a
consumer for a given partition goes away?


No, but if you look at the Simple Consumer example where the read occurs
(and the write to System.out) at that point you know the offset you just
read, so you need to put it somewhere. Using the Simple Consumer Kafka
leaves all the offset management to you.
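A minimal sketch of the do-it-yourself offset storage this implies: persist the next offset after each batch so a restart resumes exactly where it left off. The file-based store and function names are illustrative; any durable store works.

```python
import os
import tempfile

def load_offset(path, default=0):
    if not os.path.exists(path):
        return default
    with open(path) as f:
        return int(f.read())

def save_offset(path, offset):
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        f.write(str(offset))
    os.replace(tmp, path)  # atomic rename: a crash never leaves a torn file

path = os.path.join(tempfile.mkdtemp(), "offsets")
assert load_offset(path) == 0       # fresh start
save_offset(path, 1001)             # checkpoint after processing a batch
assert load_offset(path) == 1001    # a restart resumes from here
```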



What does happen with rebalancing when a consumer goes away?


Hmm, I can't find the link to the algorithm right now. Jun or Neha can you?
Down at the bottom of the 0.7 design page 
http://kafka.apache.org/07/design.html




Is this
behavior of the high-level consumer group?


Yes.



Is there a way to supply one's own simple consumer with exactly-once
semantics, within a consumer group that rebalances?


No. Simple Consumers don't have rebalancing steps. Basically you take
control of what is requested from which topics and partitions. So you could
ask for a specific offset in a topic/partition 100 times in a row and Kafka
will happily return it to you. Nothing is written to ZooKeeper either, you
control everything.




What happens if a producer goes away?


Shouldn't matter to the consumers. The Brokers are what the consumers talk
to, so if nothing is writing the Broker won't have anything to send.


thanks much,
rob







Re: Kafka message order

2013-04-30 Thread David Arthur
Message order is guaranteed for a given partition, messages are read by 
the consumer in a FIFO manner.


You say partitioning based on topic. Do you mean you are using the 
default HashPartitioner with the topic name as the routing key? If this 
is the case, then all of your messages will be going to the same 
partition (which is not normally what you want). Perhaps you could paste 
a sample of the code you are using to produce messages, as this will 
clarify what you mean by "partitioned based on message topic"


Cheers
-David

On 4/30/13 5:57 AM, Arjun Harish wrote:

Hi

I have a kafka cluster partitioned based on message topic. My question is
if one of the topics gets a lot more messages than usual (I mean enough to
use up a lot of resources) and one of the other topics is coming in at a
normal rate, does Kafka ensure that the messages in the latter topic reach
the consumer (it's a pull at the consumer)?

Regards
Arjun Harish Nadh





Lucene Revolution

2013-04-23 Thread David Arthur
Anyone going this year? Probably nothing to do with Kafka, but lots of 
interesting talks on Solr/Lucene and some "big data" stuff (I think 
there's a Storm talk?)


http://www.lucenerevolution.org/

-David


Re: Kafka Broker - In Memory Topics & Messages

2013-04-23 Thread David Arthur

Sounds like you want something like zeromq?

http://zguide.zeromq.org/page:all#Divide-and-Conquer

-David

On 4/23/13 2:07 AM, Pankaj Misra wrote:

Hi All,

I am working on using Kafka for building a highly scalable system. As I 
understand and have seen, the Kafka broker has a very impressive and scalable 
file handling mechanism to provide guaranteed delivery. However, in one of the 
scenarios, I am facing a different challenge.

The scenario is such that the message payload is buffered and guaranteed for 
delivery by an external system, wherein there is no compelling need for 
guaranteed delivery from Kafka, but there is a need to parallel process the 
message streams. This made me wonder, if there is some way in Kafka, wherein I 
can avoid creation of files and instead stream the messages in-memory as they 
come and still take advantage of Kafka message streams, avoiding the small 
overhead of file management (avoid some more disk level IOPS).

Would greatly appreciate community's response.

Thanks & Regards
Pankaj Misra









NOTE: This message may contain information that is confidential, proprietary, 
privileged or otherwise protected by law. The message is intended solely for 
the named addressee. If received in error, please destroy and notify the 
sender. Any use of this email is prohibited when received in error. Impetus 
does not represent, warrant and/or guarantee, that the integrity of this 
communication has been maintained nor that the communication is free of errors, 
virus, interception or interference.





Re: Analysis of producer performance -- and Producer-Kafka reliability

2013-04-23 Thread David Arthur
It seems there are two underlying things here: storing messages to 
stable storage, and making messages available to consumers (i.e., 
storing messages on the broker). One can be achieved simply and reliably 
by spooling to local disk, the other requires network and is inherently 
less reliable. Buffering messages in memory does not help with the first 
one since they are in volatile storage, but it does help with the second 
one in the event of a network partition.


I could imagine a producer running in "ultra-reliability" mode where it 
uses a local log file as a buffer where all messages written to and read 
from. One issue with this, though, is that now you have to worry about 
the performance and capacity of the disks on your producers (which can 
be numerous compared to brokers). As for performance, the data being 
written by producers is already in active memory, so writing it to a 
disk then doing a zero-copy transfer to the network should be pretty 
fast (maybe?).
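A sketch of that "ultra-reliability" idea (all names are illustrative assumptions): append messages to a local spool file first, then drain them toward the broker; unsent messages survive a crash on disk.

```python
import os
import tempfile

class DiskSpool:
    def __init__(self, path):
        self.path = path

    def append(self, msg):
        with open(self.path, "ab") as f:
            f.write(msg + b"\n")
            f.flush()
            os.fsync(f.fileno())  # durable before acknowledging the caller

    def drain(self, send):
        # Replay everything spooled so far through the send callable.
        with open(self.path, "rb") as f:
            for line in f:
                send(line.rstrip(b"\n"))
        open(self.path, "wb").close()  # truncate after a full drain

spool = DiskSpool(os.path.join(tempfile.mkdtemp(), "spool.log"))
spool.append(b"event-1")
spool.append(b"event-2")
sent = []
spool.drain(sent.append)
assert sent == [b"event-1", b"event-2"]
```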


Or, Kafka can remain more "protocol-ish" and less "application-y" and 
just give you errors when brokers are unavailable and let your 
application deal with it. This is basically what TCP/HTTP/etc do. HTTP 
servers don't say "hold on, there's a problem, let me try that request 
again in a second.."


Interesting discussion, btw :)
-David

On 4/15/13 2:18 PM, Piotr Kozikowski wrote:

Philip,

We would not use spooling to local disk on the producer to deal with
problems with the connection to the brokers, but rather to absorb temporary
spikes in traffic that would overwhelm the brokers. This is assuming that
1) those spikes are relatively short, but when they come they require much
higher throughput than normal (otherwise we'd just have a capacity problem
and would need more brokers), and 2) the spikes are long enough for just a
RAM buffer to be dangerous. If the brokers did go down, spooling to disk
would give us more time to react, but that's not the primary reason for
wanting the feature.

-Piotr

On Fri, Apr 12, 2013 at 8:21 AM, Philip O'Toole  wrote:


This is just my opinion of course (who else's could it be? :-)) but I think
from an engineering point of view, one must spend one's time making the
Producer-Kafka connection solid, if it is mission-critical.

Kafka is all about getting messages to disk, and assuming your disks are
solid (and 0.8 has replication) those messages are safe. To then try to
build a system to cope with the Kafka brokers being unavailable seems like
you're setting yourself up for infinite regress. And to write code in the
Producer to spool to disk seems even more pointless. If you're that
worried, why not run a dedicated Kafka broker on the same node as the
Producer, and connect over localhost? To turn around and write code to
spool to disk, because the primary system that *spools to disk* is down
seems to be missing the point.

That said, even by going over local-host, I guess the network connection
could go down. In that case, Producers should buffer in RAM, and start
sending some major alerts to the Operations team. But this should almost
*never happen*. If it is happening regularly *something is fundamentally
wrong with your system design*. Those Producers should also refuse any more
incoming traffic and await intervention. Even bringing up "netcat -l" and
letting it suck in the data and write it to disk would work then.
Alternatives include having Producers connect to a load-balancer with
multiple Kafka brokers behind it, which helps you deal with any one Kafka
broker failing. Or just have your Producers connect directly to multiple
Kafka brokers, and switch over as needed if any one broker goes down.

I don't know if the standard Kafka producer that ships with Kafka supports
buffering in RAM in an emergency. We wrote our own that does, with a focus
on speed and simplicity, but I expect it will very rarely, if ever, buffer
in RAM.

Building and using semi-reliable system after semi-reliable system, and
chaining them all together, hoping to be more tolerant of failure is not
necessarily a good approach. Instead, identifying that one system that is
critical, and ensuring that it remains up (redundant installations,
redundant disks, redundant network connections etc) is a better approach
IMHO.

Philip


On Fri, Apr 12, 2013 at 7:54 AM, Jun Rao  wrote:


Another way to handle this is to provision enough client and broker servers
so that the peak load can be handled without spooling.

Thanks,

Jun


On Thu, Apr 11, 2013 at 5:45 PM, Piotr Kozikowski 
wrote:
Jun,

When talking about "catastrophic consequences" I was actually only
referring to the producer side. In our use case (logging requests from
webapp servers), a spike in traffic would force us to either tolerate a
dramatic increase in the response time or drop messages, both of which are
really undesirable. Hence the need to absorb spikes with some system on top
of Kafka, unless the spooling feature mentioned by Wing (
https://issues.apache.org/jira/

Re: pushing delete topic feature out of 0.8

2013-04-04 Thread David Arthur
Since you need to learn about the leader for a topic in order to do 
anything, you kind of already have "auto-create" since getting metadata 
for a topic will create it if it doesn't exist.


As for auto-delete, log files will be deleted over time (per the usual 
policy). There will be residual stuff in ZooKeeper, but I don't think it 
will be detrimental to performance (does the broker keep any active 
state for a topic that's not being used? - not sure)


Not to say topic deletion isn't a useful feature, but I think punting 
out of 0.8 is fine.


-David

On 4/4/13 3:01 PM, Jason Rosenberg wrote:

Ok,

This is a feature I've been hoping for, so I added an upvote to KAFKA-330.
But I will defer to you in terms of not wanting to delay 0.8 unnecessarily.

Will we still have a backhanded way to remove a topic if need be?
Ultimately, I'd like to see the feature where a topic is automatically
created on receipt of a new message and then automatically deleted after a
configurable period of inactivity. In other words, auto-creation and
auto-deletion.

Jason


On Thu, Apr 4, 2013 at 11:35 AM, Jun Rao  wrote:


Hi,

We started the work on deleting topics in 0.8 (kafka-330). We realized that
it touches quite a few critical components such as controller, replica
manager, and log, and it will take some time to stabilize this. In order
not to delay the 0.8 release too much, I propose that we push this feature
out of 0.8. Since we don't really support deleting topics in 0.7, this
doesn't reduce the existing features.

Any concerns from people?

Thanks,

Jun





Re: .net (c#) kafka client

2013-04-04 Thread David Arthur
You can certainly use different clients for producers and consumers. 
E.g., you could have a Python producer, a C producer, and a Scala 
consumer; or any combination thereof.


If you want your consumers to participate in a consumer group, you'll 
need to use the Java or Scala client (or a 3rd party client that 
supports consumer groups).


On 4/4/13 7:36 AM, Oleg Ruchovets wrote:

By the way. Should producer and consumer have to be the same client? I mean
is it possible that producer will be C or Python and consumer will be java?

Thanks
Oleg.


On Thu, Apr 4, 2013 at 12:24 AM, Matthew Rathbone wrote:


Maybe you could write a C# client. I don't think it would be super hard to
write a basic one (that doesn't require zookeeper for example). C# has a
lot of great features that would make it great for a solid kafka client.


On Wed, Apr 3, 2013 at 3:58 PM, Oleg Ruchovets 
wrote:


Yes, I agree. So there is no way to use a C# client or REST API with the
current version of Kafka.

Thanks
Oleg,


On Wed, Apr 3, 2013 at 10:29 PM, David Arthur  wrote:


What is the bridge between C# and Node.js? If you're writing some

custom

middle man, why not write it in Java or Scala so you can then use the
official clients?

-David


On 4/3/13 3:22 PM, Oleg Ruchovets wrote:


I see. Is it a good idea to use a Node.js client? C# would produce messages
to Node.js, and Node.js would push them to Kafka. Is there a potential
problem with such an approach?
Thanks
Oleg.


On Wed, Apr 3, 2013 at 9:18 PM, Joel Koshy 

wrote:

Unfortunately no - there is a legacy 0.7 client (
https://svn.apache.org/repos/asf/kafka/branches/legacy_client_libraries/csharp/
) but afaik, none for 0.8.

Re: rest API: it is one of the current open projects that people can
contribute to. There have been some discussions on the mailing list

about

implementing one and a jira as well: KAFKA-639

Joel

On Wed, Apr 3, 2013 at 9:46 AM, Oleg Ruchovets  wrote:

Is there a stable C# client for Kafka? Is there a rest API for Kafka?

Thanks
Oleg.





--
Matthew Rathbone
Foursquare | Software Engineer | Server Engineering Team
matt...@foursquare.com | @rathboma <http://twitter.com/rathboma> |
4sq<http://foursquare.com/rathboma>





Re: .net (c#) kafka client

2013-04-03 Thread David Arthur
What is the bridge between C# and Node.js? If you're writing some custom 
middle man, why not write it in Java or Scala so you can then use the 
official clients?


-David

On 4/3/13 3:22 PM, Oleg Ruchovets wrote:

I see. Is it a good idea to use a Node.js client? C# would produce messages
to Node.js, and Node.js would push them to Kafka. Is there a potential
problem with such an approach?
Thanks
Oleg.


On Wed, Apr 3, 2013 at 9:18 PM, Joel Koshy  wrote:


Unfortunately no - there is a legacy 0.7 client (

https://svn.apache.org/repos/asf/kafka/branches/legacy_client_libraries/csharp/
)
but afaik, none for 0.8.

Re: rest API: it is one of the current open projects that people can
contribute to. There have been some discussions on the mailing list about
implementing one and a jira as well: KAFKA-639

Joel

On Wed, Apr 3, 2013 at 9:46 AM, Oleg Ruchovets 
wrote:


Hi ,
Is there a stable C# client for Kafka? Is there a rest API for Kafka?

Thanks
Oleg.





0.8 Python client

2013-04-01 Thread David Arthur

Hello all,

I've been working on updating (i.e., rewriting) my Python client for the 
impending 0.8 release. Check it out:


https://github.com/mumrah/kafka-python/tree/0.8

In addition to 0.8 protocol support, the new client supports the 
broker-aware request routing required for replication in 0.8. Offset 
management is in there too, but disabled on the 0.8 branch since it will 
not be in that release. I believe it will be in a follow-on release 
(0.8.1, if JIRA can be believed).


TODOs:
* update unit tests for new protocol
* update queue.py with new producer/consumer stuff
* more tests
* more examples

I've been buried in the details of the protocol and request routing, so 
the actual useful API has had the least attention. I'm open to 
suggestions on how to make this library more useful/usable. And of 
course, pull requests are welcome!


Cheers,
David



Re: Hardware profile

2013-03-29 Thread David Arthur
Especially in light of replication (broker-broker communication), I'm 
wondering if all the brokers are in the same rack and what kind of 
networking interfaces are used (Gigabit ethernet, Fibre Channel, etc).


On 3/29/13 6:53 PM, Jun Rao wrote:

We have multiple Kafka clusters, each has about 10 brokers right now. Not
sure about the network topology. What kind of info do you want to know?

Thanks,

Jun

On Fri, Mar 29, 2013 at 11:47 AM, David Arthur  wrote:


How many brokers are you (LinkedIn) running? What kind of network topology?


On 3/29/13 2:45 PM, Neha Narkhede wrote:


1. We never share zookeeper and broker on the same hardware. Both need
significant memory to operate efficiently.
2. 14 drive setup is just for Kafka. We have a separate disk for the OS,
AFAIK.

Thanks,
Neha

On Fri, Mar 29, 2013 at 11:37 AM, Ian Friedman  wrote:


Thanks Jun. Couple more questions:
1. Do you guys have dedicated hardware for Zookeeper or do you have a
few machines run both a ZK and a broker? If so, do you keep the ZK and
Kafka data on separate volumes?
2. You use the 14 drive raid setup is just for Kafka data and a separate
drive for the OS?

Thanks again,
Ian


On Friday, March 29, 2013 at 12:43 PM, Jun Rao wrote:

It's more or less the same. Our new server has 14 SATA disks, each of 1 TB.
The disk also has better write latency due to larger write cache.

Thanks,

Jun

On Fri, Mar 29, 2013 at 8:32 AM, Ian Friedman  wrote:

  Hi all,

I'm wondering how up to date the hardware specs listed on this page
are:
https://cwiki.apache.org/confluence/display/KAFKA/Operations

We're evaluating hardware for a Kafka broker/ZK quorum buildout and
looking for some tips and/or sample configurations if anyone can help
us
out with some recommendations.

Thanks in advance,
Ian








Re: Hardware profile

2013-03-29 Thread David Arthur

How many brokers are you (LinkedIn) running? What kind of network topology?

On 3/29/13 2:45 PM, Neha Narkhede wrote:

1. We never share zookeeper and broker on the same hardware. Both need
significant memory to operate efficiently.
2. 14 drive setup is just for Kafka. We have a separate disk for the OS, AFAIK.

Thanks,
Neha

On Fri, Mar 29, 2013 at 11:37 AM, Ian Friedman  wrote:

Thanks Jun. Couple more questions:
1. Do you guys have dedicated hardware for Zookeeper or do you have a few 
machines run both a ZK and a broker? If so, do you keep the ZK and Kafka data 
on separate volumes?
2. You use the 14 drive raid setup is just for Kafka data and a separate drive 
for the OS?

Thanks again,
Ian


On Friday, March 29, 2013 at 12:43 PM, Jun Rao wrote:


It's more or less the same. Our new server has 14 sata disks, each of 1 TB.
The disk also has better write latency due to larger write cache.

Thanks,

Jun

On Fri, Mar 29, 2013 at 8:32 AM, Ian Friedman  wrote:


Hi all,

I'm wondering how up to date the hardware specs listed on this page are:
https://cwiki.apache.org/confluence/display/KAFKA/Operations

We're evaluating hardware for a Kafka broker/ZK quorum buildout and
looking for some tips and/or sample configurations if anyone can help us
out with some recommendations.

Thanks in advance,
Ian










Re: log.file.size limit?

2013-03-25 Thread David Arthur

If you look at the description of the "map" method, it states:

size - The size of the region to be mapped; must be non-negative and no 
greater than Integer.MAX_VALUE


-David
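The 2GB ceiling follows directly from that signature: a memory-mapped region's size cannot exceed Integer.MAX_VALUE bytes, i.e. 2 GiB minus one byte. A quick sanity check of the arithmetic (the helper name is illustrative):

```python
INTEGER_MAX_VALUE = 2**31 - 1

def fits_in_one_segment(segment_bytes):
    """True if a segment of this size can still be memory-mapped."""
    return 0 <= segment_bytes <= INTEGER_MAX_VALUE

assert INTEGER_MAX_VALUE == 2147483647
assert fits_in_one_segment(500 * 1024 * 1024)   # the 500MB default is fine
assert not fits_in_one_segment(2 * 1024**3)     # a full 2 GiB is one byte too big
```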

On 3/25/13 4:37 PM, S Ahmed wrote:

But it shows long, not int?

Isn't it then Long.MAX_VALUE?


On Mon, Mar 25, 2013 at 3:14 PM, David Arthur  wrote:


FileChannel#map docs indicate the max size is Integer.MAX_VALUE, so yea 2gb

http://docs.oracle.com/javase/6/docs/api/java/nio/channels/FileChannel.html#map(java.nio.channels.FileChannel.MapMode, long, long)



On 3/25/13 2:42 PM, S Ahmed wrote:


Is there any limit to how large a log file can be?
I swear I read somewhere that java's memory mapped implementation is
limited to 2GB but I'm not sure.






Re: log.file.size limit?

2013-03-25 Thread David Arthur

FileChannel#map docs indicate the max size is Integer.MAX_VALUE, so yea 2gb

http://docs.oracle.com/javase/6/docs/api/java/nio/channels/FileChannel.html#map(java.nio.channels.FileChannel.MapMode, 
long, long)



On 3/25/13 2:42 PM, S Ahmed wrote:

Is there any limit to how large a log file can be?
I swear I read somewhere that java's memory mapped implementation is
limited to 2GB but I'm not sure.





Re: Anyone working on a Kafka book?

2013-03-21 Thread David Arthur

This looks great! A few comments

* I think it would be useful to start with a complete example (ready to 
copy/paste) and then break it down bit by bit
* Some of the formatting is funky (gratuitous newlines), also I think 2 
spaces looks nicer than 4

* In the text, it might be useful to embolden or italicize class names

Also, maybe we should move this to a separate thread?

On 3/21/13 2:42 PM, Chris Curtin wrote:

I published my first Wiki example:

https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+Producer+Example

Can you guys take a look and see if the tone, format and of course content
fit into what you'd like to see?

Also, is there a naming convention we should be following?

Thanks,

Chris


On Thu, Mar 21, 2013 at 1:38 PM, Neha Narkhede wrote:


Yes, that works as well.

Thanks,
Neha


On Thu, Mar 21, 2013 at 10:33 AM, Chris Curtin 
wrote:
Or can I do it in the Wiki until you release 0.8.0 so people can comment on
them? I think I can edit the Wiki with my Apache login.


On Thu, Mar 21, 2013 at 12:17 AM, Jun Rao  wrote:


Our webpage source is at https://svn.apache.org/repos/asf/kafka/site .

You

can file a jira and attach a patch.

Thanks,

Jun






Re: Consume from X messages ago

2013-03-19 Thread David Arthur
This API is exposed through the SimpleConsumer scala class. See 
https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/consumer/SimpleConsumer.scala#L60


You will need to set earliestOrLatest to -1 for the latest offset.

There is also a command line tool 
https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/tools/GetOffsetShell.scala


-David

On 3/19/13 11:25 AM, James Englert wrote:

I'm still a bit lost.  Where is the offsets API?  I.e. which class?


On Tue, Mar 19, 2013 at 11:16 AM, David Arthur  wrote:


Using the Offsets API, you can get the latest offset by setting time to
-1. Then you subtract 1

There is no guarantee that 10k prior messages exist of course, so you'd
need to handle that case.

-David


On 3/19/13 11:04 AM, James Englert wrote:


Hi,

I'm using Kafka 0.8.  I would like to setup a consumer to fetch the last
10,000 messages and then start consuming messages.

I see the configuration autooffset.reset, but that isn't quite what I
want.  I want only the last 10,000 messages.

Is there a good way to achieve this in 0.8, besides just hacking the data
in ZK?

Thanks,
Jim








Re: Consume from X messages ago

2013-03-19 Thread David Arthur
Using the Offsets API, you can get the latest offset by setting time to 
-1. Then you subtract 1


There is no guarantee that 10k prior messages exist of course, so you'd 
need to handle that case.


-David
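A sketch of the approach above: ask the Offsets API for the latest offset (time = -1), then step back by the window you want, clamping at the earliest available offset since the prior messages may already have been deleted. The function name is illustrative.

```python
def start_offset(earliest, latest, window):
    # Never step back past the earliest retained offset.
    return max(earliest, latest - window)

assert start_offset(earliest=0, latest=50000, window=10000) == 40000
# Fewer than 10k messages retained: clamp to the earliest offset.
assert start_offset(earliest=45000, latest=50000, window=10000) == 45000
```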

On 3/19/13 11:04 AM, James Englert wrote:

Hi,

I'm using Kafka 0.8.  I would like to setup a consumer to fetch the last
10,000 messages and then start consuming messages.

I see the configuration autooffset.reset, but that isn't quite what I
want.  I want only the last 10,000 messages.

Is there a good way to achieve this in 0.8, besides just hacking the data
in ZK?

Thanks,
Jim





Anyone working on a Kafka book?

2013-03-19 Thread David Arthur
I was approached by a publisher the other day to do a book on Kafka - 
something I've actually thought about pursuing. Before I say yes (or 
consider saying yes), I wanted to make sure no one else was working on a 
book. No sense in producing competing texts at this point.


So, anyone working on a Kafka book? Self published or otherwise?

-David




Re: Is anyone able to consume from Kafka 0.7.x and write into Hadoop CDH 4.x ?

2013-03-14 Thread David Arthur
I have used KafkaETLJob to write a job that consumes from Kafka and 
writes to HDFS. Kafka version 0.7.2 rc5 and CDH 4.1.2.


Is anything in particular not working?

-David

On 3/14/13 5:31 PM, Matt Lieber wrote:

Just curious, were you able to make Camus work with CDH4 then ?

Cheers,
Matt













0.8 build problems

2013-02-21 Thread David Arthur
I'm having trouble building the project with sbt, specifically I am 
unable to run package and have the kafka-server-start.sh script work


git clone git://github.com/apache/kafka.git
./sbt update
./sbt "++2.8.0 package"
./bin/kafka-server-start.sh config/server.properties

Exception in thread "main" java.lang.NoClassDefFoundError: scala/ScalaObject
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)
at java.lang.ClassLoader.defineClass(ClassLoader.java:615)
at 
java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141)

at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)
at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)
at java.lang.ClassLoader.defineClass(ClassLoader.java:615)
at 
java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141)

at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)
at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
at kafka.Kafka.main(Kafka.scala)
Caused by: java.lang.ClassNotFoundException: scala.ScalaObject
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
... 25 more

I have tried manually building a classpath and running java directly, 
but then it complains about missing slf4j. The only way I've been able 
to run Kafka is through sbt interactively with the "run" task (I noticed 
in this case it's picking up slf4j from ~/.ivy/cache).


Any advice?

-David


Re: status of 0.8?

2013-02-21 Thread David Arthur
I've been having trouble building trunk since the changes to sbt in 0.8. 
Is there some documentation on building and running trunk?



On 2/21/13 3:53 PM, Neha Narkhede wrote:

HEAD is good as of today and has been stable for the past few days. There are
some bugs we are working on but you can certainly run the cluster and do
some basic send/receive operations. Also, let us know if you have feedback
on the APIs, protocols, tools etc since that takes some time to refactor
and change.

Good luck! :-)

Thanks,
Neha


On Thu, Feb 21, 2013 at 12:51 PM, Jason Rosenberg  wrote:


Thanks Neha,

That's helpful info.  Is there a reasonable checkpoint rev to check out now
and experiment with, or is HEAD as good as anything else?

Jason

On Thu, Feb 21, 2013 at 3:31 PM, Neha Narkhede 
wrote:
Hi Jason,

We are closely monitoring the health of one of our production clusters that
has the 0.8 code deployed. This cluster is feeding off of LinkedIn's
production traffic. Once this cluster is fairly stable, we'd like to run
all of our tools and ensure those are working. Another thing we are trying
is to introduce failures on this cluster when it is under load and ensure
that there is no data loss.

So far, we've been working on stabilizing this cluster and fixing bugs.
Next week, we will be working on tools and setting up audit so we can do
some data loss analysis, if any. This will probably take another month.
After that, I think we should be ready to release a public BETA and have
our users try it. If we release it sooner than that, I'm not sure it will
be helpful since tools and simple failure cases might not work as expected.

As far as formal release goes, I believe end of March or April will be a
good timeframe. We will try our best to update documentation for the BETA
release, 3-4 weeks from now.

Thanks,
Neha


On Thu, Feb 21, 2013 at 9:44 AM, Jason Rosenberg 

wrote:

Just wanted to inquire as to the status 0.8 being released to beta.

I have several use cases now that would like to take advantage of the new
features in 0.8, and I'm not sure if it makes sense to keep waiting for an
actual release, before attempting to use the latest HEAD version in
staging/production environments.

How stable is the 0.8 branch at this point?  What is the schedule for a
formal release?  When will there at least be a link to 0.8 documentation,
where to download it, on the main apache kafka site?

Thanks,

Jason





Re: python and kafka - how to use as a queue

2013-02-16 Thread David Arthur

It is indeed pure python

On 2/17/13 12:20 AM, David Montgomery wrote:

Key issue with gevent is there can be no C bindings.  If pure python then
the sockets can be monkey patched as long as pure python code.  I use
gevent to run redis-py to make async calls to redis even though the client
in nature is blocking.  I do believe your client is pure python?

Thanks


On Sun, Feb 17, 2013 at 1:15 PM, David Arthur  wrote:


Replying to both messages inline:


On 2/16/13 9:07 PM, David Montgomery wrote:


By the way..I assume that python-kafka is gevent safe?


No idea, I've never used greenlet/gevent before. If the question is "are
you doing anything unusual in kafka-python", then the answer is no. Just
basic method calls and some socket stuff



Thanks


On Sun, Feb 17, 2013 at 10:04 AM, David Montgomery <
davidmontgom...@gmail.com> wrote:

  Ok...great...I see now about offsets.  I see how I can manage on
restarts.  Thanks!

So...considering I am in a disposable machine world then I will consider
redis as a centralized store.  Makes sense?


You can certainly use Redis as a fast, in-memory queue. It is, of

course, an entirely different system with different design goals and
features. The applicability of Redis or Kafka depends on your use case.



What is the time frame for v8 release?


I believe it is fairly imminent, maybe sometime in March?








On Sun, Feb 17, 2013 at 3:27 AM, David Arthur  wrote:

  Greetings!

I am the maintainer of kafka-python. Very cool to see it used in the
wild.

The kafka-python library supports the low-level protocols of Kafka 0.7
(Produce/Fetch/MultiProduce/MultiFetch). When you ask Kafka for
messages via a Fetch request, you specify an offset + range (much like
reading a file). The `iter_messages` helper returns an iterator that
automatically handles paging offsets through successive Fetch requests.
However, it does not support _saving_ your offsets. One of the parameters
to iter_messages is the offset to start at, so when you re-run your script
it will start at that point again.

In 0.7, clients must talk to ZooKeeper in order to persist offsets in a
Kafka-compatible way (or they could just save them locally depending on the
use case). Talking to ZooKeeper from Python is somewhat troublesome, and
implementing the Kafka "consumer group rebalancing" is even more
troublesome - so I chose to omit it.

In 0.8 (not yet released), consumer offsets are managed centrally by the
Kafka brokers and have APIs for clients to commit and fetch offsets. I am
in the process of implementing a 0.8 compatible version of kafka-python.

So for the time being, you are on your own with regards to offset
management :-/

Cheers!

-David


On 2/16/13 1:35 PM, Philip O'Toole wrote:

  You need to read the Kafka design docs. Kafka does not delete messages
just because a Consumer reads it. It does not track what messages have been
consumed by any Consumer.

It is up to Consumers to start off where they left off, by always asking
for the right message (via offsets).

Philip

On Feb 16, 2013, at 4:48 AM, David Montgomery <
davidmontgom...@gmail.com>
wrote:

   Hi,


I have a zookeeper and kafka set up.

I am using this python client:  https://github.com/mumrah/kafka-python

I can send and receive messages but they are not deleted.

How can I send a message to kafka and no other consumer can use it?


I feel I am missing something on how kafka works

def produce():
  kafka = KafkaClient("xxx.xxx", 9092)
  kafka.send_messages_simple("my-topic", "some message")

  kafka.close()
  print 'done'

def consume():
  kafka = KafkaClient("xxx.xxx", 9092)
  for msg in kafka.iter_messages("my-topic", 0, 0, 1024*1024, False):
  print(msg.payload)
  kafka.close()
  print 'done'

Every time I ran the above... every time I ran consume, the messages just
grew from previous messages.

Am I missing something on the server.properties file?

Thanks






Re: python and kafka - how to use as a queue

2013-02-16 Thread David Arthur

Replying to both messages inline:

On 2/16/13 9:07 PM, David Montgomery wrote:

By the way..I assume that python-kafka is gevent safe?
No idea, I've never used greenlet/gevent before. If the question is "are 
you doing anything unusual in kafka-python", then the answer is no. Just 
basic method calls and some socket stuff


Thanks


On Sun, Feb 17, 2013 at 10:04 AM, David Montgomery <
davidmontgom...@gmail.com> wrote:


Ok...great...I see now about offsets.  I see how I can manage on
restarts.  Thanks!

So...considering I am in a disposable machine world then I will consider
redis as a centralized store.  Makes sense?
You can certainly use Redis as a fast, in-memory queue. It is, of 
course, an entirely different system with different design goals and 
features. The applicability of Redis or Kafka depends on your use case.


What is the time frame for v8 release?

I believe it is fairly imminent, maybe sometime in March?










On Sun, Feb 17, 2013 at 3:27 AM, David Arthur  wrote:


Greetings!

I am the maintainer of kafka-python. Very cool to see it used in the wild.

The kafka-python library supports the low-level protocols of Kafka 0.7
(Produce/Fetch/MultiProduce/MultiFetch). When you ask Kafka for
messages via a Fetch request, you specify an offset + range (much like
reading a file). The `iter_messages` helper returns an iterator that
automatically handles paging offsets through successive Fetch requests.
However, it does not support _saving_ your offsets. One of the parameters
to iter_messages is the offset to start at, so when you re-run your script
it will start at that point again.

In 0.7, clients must talk to ZooKeeper in order to persist offsets in a
Kafka-compatible way (or they could just save them locally depending on the
use case). Talking to ZooKeeper from Python is somewhat troublesome, and
implementing the Kafka "consumer group rebalancing" is even more
troublesome - so I chose to omit it.

In 0.8 (not yet released), consumer offsets are managed centrally by the
Kafka brokers and have APIs for clients to commit and fetch offsets. I am
in the process of implementing a 0.8 compatible version of kafka-python.

So for the time being, you are on your own with regards to offset
management :-/

Cheers!

-David


On 2/16/13 1:35 PM, Philip O'Toole wrote:


You need to read the Kafka design docs. Kafka does not delete messages
just because a Consumer reads it. It does not track what messages have been
consumed by any Consumer.

It is up to Consumers to start off where they left off, by always asking
for the right message (via offsets).

Philip

On Feb 16, 2013, at 4:48 AM, David Montgomery 
wrote:

  Hi,

I have a zookeeper and kafka set up.

I am using this python client:  https://github.com/mumrah/kafka-python

I can send and receive messages but they are not deleted.

How can I send a message to kafka and no other consumer can use it?


I feel I am missing something on how kafka works

def produce():
 kafka = KafkaClient("xxx.xxx", 9092)
 kafka.send_messages_simple("**my-topic", "some message")
 kafka.close()
 print 'done'

def consume():
 kafka = KafkaClient("xxx.xxx", 9092)
 for msg in kafka.iter_messages("my-topic", 0, 0, 1024*1024, False):
 print(msg.payload)
 kafka.close()
 print 'done'

Every time I ran the above... every time I ran consume, the messages just
grew from previous messages.

Am I missing something on the server.properties file?

Thanks





Re: python and kafka - how to use as a queue

2013-02-16 Thread David Arthur

Greetings!

I am the maintainer of kafka-python. Very cool to see it used in the wild.

The kafka-python library supports the low-level protocols of Kafka 0.7 
(Produce/Fetch/MultiProduce/MultiFetch). When you ask Kafka for messages 
via a Fetch request, you specify an offset + range (much like reading a 
file). The `iter_messages` helper returns an iterator that automatically 
handles paging offsets through successive Fetch requests. However, it 
does not support _saving_ your offsets. One of the parameters to 
iter_messages is the offset to start at, so when you re-run your script 
it will start at that point again.


In 0.7, clients must talk to ZooKeeper in order to persist offsets in a 
Kafka-compatible way (or they could just save them locally depending on 
the use case). Talking to ZooKeeper from Python is somewhat troublesome, 
and implementing the Kafka "consumer group rebalancing" is even more 
troublesome - so I chose to omit it.


In 0.8 (not yet released), consumer offsets are managed centrally by the 
Kafka brokers and have APIs for clients to commit and fetch offsets. I 
am in the process of implementing a 0.8 compatible version of kafka-python.


So for the time being, you are on your own with regards to offset 
management :-/


Cheers!

-David
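[Editor's note: the "save them locally" fallback David mentions can be sketched as a tiny file-based checkpoint. This is a hedged sketch, not kafka-python code: the checkpoint file name is made up, Kafka 0.7 offsets are assumed to be byte positions, and the `msg.next_offset` attribute in the commented usage is an assumption about the client, not a documented API.]

```python
import os

# Hypothetical checkpoint file name -- not part of kafka-python itself.
OFFSET_FILE = "my-topic.0.offset"

def load_offset(path=OFFSET_FILE):
    """Return the last saved offset, or 0 on a first run.

    Kafka 0.7 offsets are byte positions in the log, so resuming means
    passing this value back into iter_messages() as the start offset.
    """
    if not os.path.exists(path):
        return 0
    with open(path) as f:
        return int(f.read().strip())

def save_offset(offset, path=OFFSET_FILE):
    # Persist the next offset to read; called after processing a message.
    with open(path, "w") as f:
        f.write(str(offset))

# Usage against the 0.7 client would look roughly like this (not run here;
# msg.next_offset is an assumed attribute, check your client version):
#
#   kafka = KafkaClient("xxx.xxx", 9092)
#   for msg in kafka.iter_messages("my-topic", 0, load_offset(), 1024*1024, False):
#       process(msg.payload)
#       save_offset(msg.next_offset)
```

The point of the pattern is only that the consumer, not the broker, remembers where it left off, which is exactly the design constraint Philip describes below.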

On 2/16/13 1:35 PM, Philip O'Toole wrote:

You need to read the Kafka design docs. Kafka does not delete messages just 
because a Consumer reads it. It does not track what messages have been consumed 
by any Consumer.

It is up to Consumers to start off where they left off, by always asking for 
the right message (via offsets).

Philip

On Feb 16, 2013, at 4:48 AM, David Montgomery  wrote:


Hi,

I have a zookeeper and kafka set up.

I am using this python client:  https://github.com/mumrah/kafka-python

I can send and receive messages but they are not deleted.

How can I send a message to kafka and no other consumer can use it?


I feel I am missing something on how kafka works

def produce():
kafka = KafkaClient("xxx.xxx", 9092)
kafka.send_messages_simple("my-topic", "some message")
kafka.close()
print 'done'

def consume():
kafka = KafkaClient("xxx.xxx", 9092)
for msg in kafka.iter_messages("my-topic", 0, 0, 1024*1024,False):
print(msg.payload)
kafka.close()
print 'done'

Every time I ran the above... every time I ran consume, the messages just grew
from previous messages.

Am I missing something on the server.properties file?

Thanks




Re: join all three groups

2013-02-14 Thread David Arthur

You need to send an email to:

 users-subscr...@kafka.apache.org
 dev-subscr...@kafka.apache.org
 commits-subscr...@kafka.apache.org

in order to be subscribed to the lists

Cheers,
David

On 2/13/13 9:22 PM, Sining Ma wrote:

Hi,
I am using kafka API right now. I need to join all these groups so that I
can ask questions.
Please add me to these groups.





Re: where can I see the latest commits?

2013-02-13 Thread David Arthur

On 2/13/13 8:26 PM, S Ahmed wrote:

On the dev list I read how things are committed etc., but I don't see any
of the updates here: https://github.com/apache/kafka/commits/trunk
Github is just a mirror of the official Apache git repository located at 
http://git.apache.org/kafka.git/


Also, when someone submits a patch, is it possible to download the patched
version also?  Is it in a remote branch or something?

I tried to list the remote branches (in the apache repository) and I
couldn't see anything.

There are a few branches (listed at 
https://github.com/apache/kafka/branches), 0.8 being the most active as 
that is the current version in progress.


Re: Consumer re-design and Python

2013-02-11 Thread David Arthur

On 1/31/13 3:30 PM, Marc Labbe wrote:

Hi,

I am fairly new to Kafka and Scala, I am trying to see through the consumer
re-design changes, proposed and implemented for 0.8 and after, which will
affect other languages implementations. There are documentation pages on
the wiki, JIRA issues but I still can't figure out what's already there for
0.8, what will be there in the future and how it affects the consumers
written in other languages (Python in my case).

For instance, I am looking at
https://cwiki.apache.org/KAFKA/consumer-client-re-design.html and the very
well documented
https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Detailed+Consumer+Coordinator+Design
and
I am not sure what part is in the works, done and still a proposal. I feel
there are changes there already in 0.8 but not completely, referring
especially to KAFKA-364 and KAFKA-264.

https://cwiki.apache.org/confluence/display/KAFKA/Consumer+Client+Re-Design

and

https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Detailed+Consumer+Coordinator+Design

are the current design docs (as far as I know).


Is this all accurate and up to date? There are talks of a coordinator as
well but from what I see, this hasn't been implemented so far.
From my understanding, the client redesign has not been finalized and 
is still in progress.


After all, maybe my question is: other than the wire protocol changes, what
changes should I expect to do to SimpleConsumer client written in Python
for v0.8? What should I do next to implement a high level consumer
(ZookeeperConsumerConnector?) which fits with the design proposal?
With 0.8, you will not need to connect to ZooKeeper from the clients. 
With KAFKA-657, offsets are centrally managed by the broker. Any broker 
can handle these requests.


Has anyone started making changes to their implementation yet (thinking
Brod or Samsa)? I'll post that question on github too.
I am working on updating my Python client: 
https://github.com/mumrah/kafka-python, still a ways to go yet. The 
biggest change (besides centralized offset management) is that each 
topic+partition is owned by a specific broker (the leader). When 
producing messages, you must send them to the correct leader. This 
requires that clients maintain some state of what belongs where, which is 
a pain, but such is the cost of replication.
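[Editor's note: the per-partition leader routing described above can be sketched as a small client-side cache. Everything here is illustrative: the metadata callback, the broker address strings, and the class itself are made up for the sketch and are not a real kafka-python API.]

```python
# Sketch of the client-side routing state a 0.8 producer must keep:
# a cache of (topic, partition) -> leader broker, refreshed from cluster
# metadata on a miss and cleared after a send error.

class LeaderCache:
    def __init__(self, fetch_metadata):
        # fetch_metadata() returns {(topic, partition): "host:port"}
        self._fetch_metadata = fetch_metadata
        self._leaders = {}

    def leader_for(self, topic, partition):
        key = (topic, partition)
        if key not in self._leaders:
            # Cache miss: pull fresh metadata from any broker.
            self._leaders = dict(self._fetch_metadata())
        return self._leaders[key]

    def invalidate(self):
        # Call after a "not leader for partition" error, then retry the send.
        self._leaders.clear()
```

The design choice is that metadata is fetched lazily and thrown away on errors, so the client converges on the new leader after a failover without any ZooKeeper watch.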


Thanks and cheers!
marc


-David



Re: ETL with Kafka

2013-01-06 Thread David Arthur
Storm has support for Kafka, if that's the sort of thing you're looking
for. Maybe you could describe your use case a bit more?

On Sunday, January 6, 2013, Guy Doulberg wrote:

> Hi
>
> I am looking for an ETL tool that can connect to kafka, as a consumer and
> as a producer,
>
> Have you heard of such a tool?
>
> Thanks
> Guy
>
>

-- 
David Arthur


Re: S3 Consumer

2012-12-27 Thread David Arthur
I don't think anything exists like this in Kafka (or contrib), but it 
would be a useful addition! Personally, I have written this exact thing 
at previous jobs.


As for the Hadoop consumer, since there is a FileSystem implementation 
for S3 in Hadoop, it should be possible. The Hadoop consumer works by 
writing out data files containing the Kafka messages alongside offset 
files which contain the last offset read for each partition. If it is 
re-consuming from zero each time you run it, it means it's not finding 
the offset files from the previous run.


Having used it a bit, the Hadoop consumer is certainly an area that 
could use improvement.


HTH,
David
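[Editor's note: the data-file-plus-offset-file layout David describes can be sketched as follows. File names and the offset arithmetic (logical start + count) are assumptions for the sketch; the real Hadoop consumer would write through a Hadoop FileSystem, which is how an s3:// destination would slot in.]

```python
import os

def write_batch(out_dir, topic, partition, start_offset, messages):
    """Write one data file of messages plus a sibling offset file.

    Mirrors the layout described above: the offset file is what lets the
    next run resume instead of re-consuming from zero.
    """
    os.makedirs(out_dir, exist_ok=True)
    data_path = os.path.join(out_dir, "%s-%d-%d.data" % (topic, partition, start_offset))
    with open(data_path, "w") as f:
        for m in messages:
            f.write(m + "\n")
    next_offset = start_offset + len(messages)
    with open(os.path.join(out_dir, "%s-%d.offset" % (topic, partition)), "w") as f:
        f.write(str(next_offset))
    return next_offset

def last_offset(out_dir, topic, partition):
    # Where the next run should resume; missing file means start from 0,
    # which is exactly the re-consume-everything behavior Pratyush saw.
    path = os.path.join(out_dir, "%s-%d.offset" % (topic, partition))
    if not os.path.exists(path):
        return 0
    with open(path) as f:
        return int(f.read())
```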

On 12/27/12 4:41 AM, Pratyush Chandra wrote:

Hi,

I am looking for a S3 based consumer, which can write all the received
events to S3 bucket (say every minute). Something similar to Flume HDFSSink
http://flume.apache.org/FlumeUserGuide.html#hdfs-sink
I have tried evaluating hadoop-consumer in contrib folder. But it seems to
be more for offline processing, which will fetch everything from offset 0
at once and replace it in S3 bucket.
Any help would be appreciated.





Re: Kafka Node.js Integration Questions/Advice

2012-12-22 Thread David Arthur
FWIW, message production is much simpler than consumption. It does
not require the same complex coordination as the consumers. Producers
only use ZooKeeper to locate available brokers.

Sent from my phone

On Dec 22, 2012, at 1:00 PM, Apoorva Gaurav  wrote:

> Thanks Radek,
> We also are thinking of Java / Scala for Consumers, for Producers whether
> franz-kafka is a good choice?
>
> --
> Thanks & Regards,
> Apoorva
>
> On Sat, Dec 22, 2012 at 9:38 PM, Radek Gruchalski <
> radek.gruchal...@portico.io> wrote:
>
>> We started using node-kafka before we learned franz-kafka was available.
>> In node, franz-kafka would be my preferred choice now. But tbh, our
>> consumers are all java. node-kafka does not support consumer settings like
>> autooffset.reset and so on (or it is not obvious how to use those).
>>
>> Afair franz-kafka offers those. Also, java zkconsumer gives you the jmx
>> monitoring tools, which may be helpful if you want to add some scaling
>> logic when consumer is lagging.
>>
>> Our first choice is node too but we're consuming exclusively with java.
>>
>> Hope this helps a little.
>>
>> On 22 Dec 2012, at 05:21, Apoorva Gaurav  wrote:
>>
>>> Which is the best ZK-based implementation of kafka in node.js? Our use
>> case
>>> is that a pool of node js http servers will be listening to clients which
>>> will send json over http. Using node js we'll do minimal decoration and
>>> compression (preferably snappy) and write to brokers. We might also need
>>> json to avro conversion but thats not a deal breaker. Consumers will be
>>> writing these events to S3 (to begin with we don't plan to maintain HDFS
>>> cluster). To begin with we'll have to support a peak load of 50K events /
>>> second, average being much less, around 2K events / second. Suggestions
>>> please. Is any one using franz-kafka in production. I'm only two days
>> into
>>> kafka so don't know a lot, but franz-kafka looks clean and easy to work
>>> with.
>>>
>>> If none of the existing node.js implementation is capable of this then we
>>> are willing to move to Scala or Java but node.js is the first choice.
>>>
>>> Thanks & Regards,
>>> Apoorva
>>>
>>> On Sat, Dec 22, 2012 at 2:25 AM, Radek Gruchalski <
>>> radek.gruchal...@portico.io> wrote:
>>>
 We are using https://github.com/radekg/node-kafka, occasionally pushing
 about 2500 messages, 3.5K each / second. No issues so far. Different
>> story
 with consumers. They are stable but under heavy load we experienced CPU
 problems. I am the maintainer of that fork. The fork comes with ZK
 integration. Another kafka module is this one:
 https://github.com/dannycoates/franz-kafka.

 Kind regards,
 Radek Gruchalski
 radek.gruchal...@technicolor.com (mailto:
>> radek.gruchal...@technicolor.com)
 | radek.gruchal...@portico.io (mailto:radek.gruchal...@portico.io) |
 ra...@gruchalski.com (mailto:ra...@gruchalski.com)
 00447889948663

 Confidentiality:
 This communication is intended for the above-named person and may be
 confidential and/or legally privileged.
 If it has come to you in error you must take no action based on it, nor
 must you copy or show it to anyone; please delete/destroy and inform the
 sender immediately.


 On Thursday, 20 December 2012 at 18:31, Jun Rao wrote:

> Chris,
>
> Not sure how stable those node.js clients are. In 0.8, we plan to
 provide a
> native C version of the producer. A thin node.js layer can potentially
>> be
> built on top of that.
>
> Thanks,
>
> Jun
>
> On Thu, Dec 20, 2012 at 8:46 AM, Christopher Alexander <
> calexan...@gravycard.com (mailto:calexan...@gravycard.com)> wrote:
>
>> During my due diligence to assess use of Kafka for both our activity
 and
>> log message streams, I would like to ask the project committers and
>> community users about using Kafka with Node.js. Yes, I am aware that a
>> Kafka client exists for Node.js (
>> https://github.com/marcuswestin/node-kafka), which has spurred
>> further
>> interest by our front-end team. Here are my questions, excuse me if
 they
>> seem "noobish".
>>
>> 1. How reliable is the Node.js client (
>> https://github.com/marcuswestin/node-kafka) in production
 applications?
>> If there are issues, what are they (the GitHub repo currently lists
 none)?
>> 2. To support real-time activity streams within Node.js, what is the
>> recommended consumer polling interval?
>> 3. General advice and observations on integrating a front-end based
>> Node.js
>> application with Kafka mediated messaging.
>>
>> Thank you!
>>
>> Chris
>>


Re: Http based producer

2012-12-21 Thread David Arthur

Pratyush,

I'm not a big node.js user so I can't speak to any of the node.js clients. I 
mostly use the Java/Scala client. Some clients attempt to support the ZooKeeper 
consumer coordination, some don't (since it is hard to get right). There is 
work in progress within Kafka to simplify the consumer offset management and 
centralizing it to the brokers. This will make things easier for the clients 
(no ZooKeeper communication necessary).

If node.js is your application language, you might try asking in their IRC 
channel for people using Kafka.

Good luck
-David

On 12/21/12 12:16 AM, Pratyush Chandra wrote:

Hi David,

I was looking into the listed node.js library. Prozess doesn't seem to use
zookeeper for connection.

Instead, I found one (mentioned below) which uses zookeeper based
connection in node.js .
https://npmjs.org/package/franz-kafka
https://github.com/dannycoates/franz-kafka

Are you aware of this library ?

Thanks
Pratyush

On Thu, Dec 20, 2012 at 7:26 PM, David Arthur  wrote:


There are several clients available listed on the project wiki. Node.js is
among them

https://cwiki.apache.org/confluence/display/KAFKA/Kafka+non-java+clients

Since Kafka doesn't support websockets or HTTP directly, you would
need a middle man to redirect events from the browser to a Kafka broker.

-David


On 12/20/12 4:16 AM, Pratyush Chandra wrote:


Hi,

I am new to Kafka. I am exploring ways to pump events from http
browser(using javascript) or over tcp (say using node js) to broker.
Currently I see, only scala based producer in source code.
What is the best way to do it ? Is there any standard client library which
supports it ?

Thanks
Pratyush Chandra








Re: Kafka Node.js Integration Questions/Advice

2012-12-20 Thread David Arthur


On 12/20/12 11:46 AM, Christopher Alexander wrote:

During my due diligence to assess use of Kafka for both our activity and log message 
streams, I would like to ask the project committers and community users about using Kafka 
with Node.js. Yes, I am aware that a Kafka client exists for Node.js 
(https://github.com/marcuswestin/node-kafka), which has spurred further interest by our 
front-end team. Here are my questions, excuse me if they seem "noobish".

1. How reliable is the Node.js client 
(https://github.com/marcuswestin/node-kafka) in production applications? If 
there are issues, what are they (the GitHub repo currently lists none)?
Just FYI, there is another node.js library 
https://github.com/cainus/Prozess. I have no experience with either, so 
I cannot say how reliable they are.

2. To support real-time activity streams within Node.js, what is the 
recommended consumer polling interval?
What kind of data velocity do you expect? You should only have to poll 
if your consumer catches up to the broker and there's no more data. 
Blocking/polling behavior of the consumer depends entirely on the client 
implementation.

3. General advice and observations on integrating a front-end based Node.js 
application with Kafka mediated messaging.

Thank you!

Chris
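[Editor's note: David's answer to question 2 -- poll only once the consumer has caught up -- can be sketched as a fetch loop with idle backoff. The interval values and the loop shape are illustrative, not a Kafka recommendation; fetch() stands in for a client Fetch request.]

```python
def backoff_intervals(base=0.05, cap=2.0):
    # Exponential backoff for an idle consumer, capped; values are made up.
    delay = base
    while True:
        yield delay
        delay = min(delay * 2.0, cap)

def consume_until_idle(fetch, handle, max_empty_polls=3):
    """Drain fetch() until it comes back empty max_empty_polls times.

    fetch() returns a (possibly empty) list of messages. A real loop
    would time.sleep() between empty polls; here the generator is just
    advanced so the sketch stays side-effect free.
    """
    intervals = backoff_intervals()
    empty = 0
    while empty < max_empty_polls:
        batch = fetch()
        if batch:
            empty = 0
            intervals = backoff_intervals()  # reset backoff on new data
            for msg in batch:
                handle(msg)
        else:
            empty += 1
            next(intervals)  # placeholder for time.sleep(next(intervals))
```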




Re: Http based producer

2012-12-20 Thread David Arthur
There are several clients available listed on the project wiki. Node.js 
is among them


https://cwiki.apache.org/confluence/display/KAFKA/Kafka+non-java+clients

Since Kafka doesn't support websockets or HTTP directly, you would 
need a middle man to redirect events from the browser to a Kafka broker.


-David
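[Editor's note: the "middle man" above can be sketched as a thin handler that validates the browser's JSON and forwards it toward a broker. This is a made-up sketch of that bridge layer, not any real Kafka HTTP API: the producer is replaced by an in-memory queue and all names are illustrative.]

```python
import json
from queue import Queue

# Stand-in for a real producer connection; a real bridge would hand the
# message to a Kafka client here instead of an in-memory queue.
outgoing = Queue()

def handle_event(body):
    """Validate one HTTP request body and forward it toward the broker.

    Returns an (http_status, reason) pair for the web layer to send back.
    """
    try:
        event = json.loads(body)
    except ValueError:
        return 400, "body is not valid JSON"
    if "topic" not in event:
        return 400, "missing 'topic' field"
    outgoing.put((event["topic"], json.dumps(event.get("payload"))))
    return 202, "accepted"
```

Wiring handle_event behind any HTTP server (node.js or otherwise) gives the browser-to-broker path described above while keeping the Kafka wire protocol entirely server-side.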

On 12/20/12 4:16 AM, Pratyush Chandra wrote:

Hi,

I am new to Kafka. I am exploring ways to pump events from http
browser(using javascript) or over tcp (say using node js) to broker.
Currently I see, only scala based producer in source code.
What is the best way to do it ? Is there any standard client library which
supports it ?

Thanks
Pratyush Chandra




