[jira] [Created] (KAFKA-15007) MV is not set correctly in the MetadataPropagator in migration.

2023-05-18 Thread Akhilesh Chaganti (Jira)
Akhilesh Chaganti created KAFKA-15007:
-

 Summary: MV is not set correctly in the MetadataPropagator in 
migration.
 Key: KAFKA-15007
 URL: https://issues.apache.org/jira/browse/KAFKA-15007
 Project: Kafka
  Issue Type: Bug
Reporter: Akhilesh Chaganti


MV changes are not set in the propagator unless we're in DUAL_WRITE mode. But if 
we do this, we'll skip any MV changes that happened earlier. The propagator should 
always know the correct MV so that it sends correct UMR and LISR requests during DUAL_WRITE.
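
A minimal sketch of the intended behavior, using hypothetical stand-in classes 
rather than the actual migration driver/propagator code: record every MV change 
in the propagator unconditionally, and keep only the actual propagation of 
UMR/LISR gated on DUAL_WRITE.

    import java.util.Optional;

    // Hypothetical, simplified stand-ins; the real Kafka migration classes differ.
    public class PropagatorSketch {
        enum MigrationState { PRE_MIGRATION, MIGRATION, DUAL_WRITE }

        static class Propagator {
            private Optional<Short> metadataVersion = Optional.empty();

            // The fix implied above: update the MV on every change, regardless of state.
            void setMetadataVersion(short mv) { this.metadataVersion = Optional.of(mv); }

            // Propagation itself stays gated on DUAL_WRITE.
            void propagate(MigrationState state) {
                if (state != MigrationState.DUAL_WRITE) {
                    return;
                }
                short mv = metadataVersion.orElseThrow(
                        () -> new IllegalStateException("MV was never set"));
                System.out.println("Sending UMR/LISR built for metadata.version " + mv);
            }
        }

        public static void main(String[] args) {
            Propagator propagator = new Propagator();

            // An MV change observed before DUAL_WRITE must still reach the propagator,
            // otherwise the requests sent later are built for the wrong MV.
            propagator.setMetadataVersion((short) 7);
            propagator.propagate(MigrationState.MIGRATION);   // no-op: not in DUAL_WRITE yet
            propagator.propagate(MigrationState.DUAL_WRITE);  // uses the correct MV
        }
    }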



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: Migration from zookeeper to kraft not working

2023-05-18 Thread David Arthur
Elena,

Did you provision a KRaft controller quorum before restarting the brokers?
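
For reference, the broker-side settings for the migration phase generally
look something like the sketch below (host names, ports, and node IDs are
placeholders rather than values from your setup). The controller quorum
listed in controller.quorum.voters needs to be provisioned, with
zookeeper.metadata.migration.enable=true and zookeeper.connect set on the
controllers, before the brokers are restarted:

    # Hedged sketch of migration-phase broker settings (Kafka 3.4, KIP-866);
    # all host names and IDs below are placeholders.

    # Existing ZooKeeper-mode settings stay in place:
    broker.id=0
    zookeeper.connect=zk1:2181,zk2:2181,zk3:2181
    inter.broker.protocol.version=3.4

    # Enable the migration and point the broker at the KRaft controller quorum:
    zookeeper.metadata.migration.enable=true
    controller.quorum.voters=3000@controller1:9093,3001@controller2:9093,3002@controller3:9093
    controller.listener.names=CONTROLLER

    # The controller listener name must be mapped to a security protocol:
    listener.security.protocol.map=CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT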

If you don't mind, could you create a JIRA and attach the config files used
for the brokers before/after the migration along with the controller
configs? Please include the sequence of steps you took in the JIRA as well.

Here is our JIRA project:
https://issues.apache.org/jira/projects/KAFKA/issues, and general info on
filing issues
https://cwiki.apache.org/confluence/display/KAFKA/Reporting+Issues+in+Apache+Kafka

Thanks!
David



On Tue, May 16, 2023 at 2:54 AM Elena Batranu
 wrote:

> Hello! I have a problem with my Kafka configuration (Kafka 3.4). I'm
> trying to migrate from ZooKeeper to KRaft. I have 3 brokers; one of them
> also hosts ZooKeeper. I want to restart my brokers one by one, without
> downtime. I started by adding both the KRaft and the ZooKeeper
> configuration, to do the migration gradually. At this step my nodes are
> up, but I get the following errors in the KRaft-related logs:
> [2023-05-16 06:35:19,485] DEBUG [BrokerToControllerChannelManager broker=0 name=quorum]: No controller provided, retrying after backoff (kafka.server.BrokerToControllerRequestThread)
> [2023-05-16 06:35:19,585] DEBUG [BrokerToControllerChannelManager broker=0 name=quorum]: Controller isn't cached, looking for local metadata changes (kafka.server.BrokerToControllerRequestThread)
> [2023-05-16 06:35:19,586] DEBUG [BrokerToControllerChannelManager broker=0 name=quorum]: No controller provided, retrying after backoff (kafka.server.BrokerToControllerRequestThread)
> [2023-05-16 06:35:19,624] INFO [RaftManager nodeId=0] Node 3002 disconnected. (org.apache.kafka.clients.NetworkClient)
> [2023-05-16 06:35:19,624] WARN [RaftManager nodeId=0] Connection to node 3002 (/192.168.25.172:9093) could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
> [2023-05-16 06:35:19,642] INFO [RaftManager nodeId=0] Node 3001 disconnected. (org.apache.kafka.clients.NetworkClient)
> [2023-05-16 06:35:19,642] WARN [RaftManager nodeId=0] Connection to node 3001 (/192.168.25.232:9093) could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
> [2023-05-16 06:35:19,643] INFO [RaftManager nodeId=0] Node 3000 disconnected. (org.apache.kafka.clients.NetworkClient)
> [2023-05-16 06:35:19,643] WARN [RaftManager nodeId=0] Connection to node 3000 (/192.168.25.146:9093) could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
> I configured the controller on each broker, the file looks like this:
> # Licensed to the Apache Software Foundation (ASF) under one or more
> # contributor license agreements.  See the NOTICE file distributed with
> # this work for additional information regarding copyright ownership.
> # The ASF licenses this file to You under the Apache License, Version 2.0
> # (the "License"); you may not use this file except in compliance with
> # the License.  You may obtain a copy of the License at
> #
> #    http://www.apache.org/licenses/LICENSE-2.0
> #
> # Unless required by applicable law or agreed to in writing, software
> # distributed under the License is distributed on an "AS IS" BASIS,
> # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
> # See the License for the specific language governing permissions and
> # limitations under the License.
> #
> # This configuration file is intended for use in KRaft mode, where
> # Apache ZooKeeper is not present.  See config/kraft/README.md for details.
> #
> ############################# Server Basics #############################
>
> # The role of this server. Setting this puts us in KRaft mode
> process.roles=controller
>
> # The node id associated with this instance's roles
> node.id=3000
>
> # The connect string for the controller quorum
> #controller.quorum.voters=3000@localhost:9093
> controller.quorum.voters=3000@192.168.25.146:9093,3001@192.168.25.232:9093,3002@192.168.25.172:9093
>
> ############################# Socket Server Settings #############################
>
> # The address the socket server listens on.
> # Note that only the controller listeners are allowed here when `process.roles=controller`,
> # and this listener should be consistent with `controller.quorum.voters` value.
> #   FORMAT:
> #     listeners = listener_name://host_name:port
> #   EXAMPLE:
> #     listeners = PLAINTEXT://your.host.name:9092
> listeners=CONTROLLER://:9093
>
> # A comma-separated list of the names of the listeners used by the controller.
> # This is required if running in KRaft mode.
> controller.listener.names=CONTROLLER
>
> # Maps listener names to security protocols, the default is for them to be
> # the same. See the config documentation for more details
> #listener.security.protocol.map=PLAINTEXT:PLAINTEXT,SSL:SSL,SASL_PLAINTEXT:SASL_PLAINTEXT,SASL_SSL:SASL_SSL
>
> # The number o

[GitHub] [kafka-site] omkreddy opened a new pull request, #512: MINOR: Add System Properties to config documentation section

2023-05-18 Thread via GitHub


omkreddy opened a new pull request, #512:
URL: https://github.com/apache/kafka-site/pull/512

   (no comment)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [kafka-site] omkreddy commented on pull request #512: MINOR: Add System Properties to config documentation section

2023-05-18 Thread via GitHub


omkreddy commented on PR #512:
URL: https://github.com/apache/kafka-site/pull/512#issuecomment-1553474279

   ![Uploading Screenshot 2023-05-18 at 11.57.27 PM.png…]()
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [kafka-site] omkreddy commented on pull request #512: MINOR: Add System Properties to config documentation section

2023-05-18 Thread via GitHub


omkreddy commented on PR #512:
URL: https://github.com/apache/kafka-site/pull/512#issuecomment-1553478623

   https://github.com/apache/kafka-site/assets/8134545/5e291b31-1de8-4bbd-99fc-66baf2263932
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [DISCUSS] KIP-923: Add A Grace Period to Stream Table Join

2023-05-18 Thread Walker Carlson
Hey all,

Thanks for the comments, they gave me a lot to think about. I'll try to
address them all in order. I have made some updates to the KIP related to
them; I mention where below.

Lucas

Good idea about the example. I added a simple one.

1) I have thought about including options for the underlying buffer
configuration, one of which might be adding an in-memory option. My biggest
concern is about the semantic guarantees. This isn't like suppress or
windows, where producing incomplete results is relatively harmless; here we
could be producing incorrect results. I would also like to keep the
interface changes as simple as I can. Making more than this one change to
Joined could make this more complicated than it needs to be. If we really
want to, I could see adding a grace() option with a BufferConfig in there
or something, but I would rather not.

2) The buffer will be independent of whether the table is versioned or not.
If the table is not materialized, we will materialize it as versioned. It
might make sense to do a follow-up KIP where we force the retention period
of the versioned store to be greater than whatever the max of the stream
buffer's grace period is. (A rough sketch of both points is below.)
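
For concreteness, here is a rough sketch of how both points could look from
the DSL side: a grace period set on Joined for the stream-side buffer, and
the table materialized as a versioned store (KIP-889) whose history
retention is at least as large as that grace period. The withGracePeriod
name follows the KIP proposal, so treat this as a sketch rather than a
final interface; topic names and serdes are placeholders.

    import java.time.Duration;
    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.kstream.Consumed;
    import org.apache.kafka.streams.kstream.Joined;
    import org.apache.kafka.streams.kstream.KStream;
    import org.apache.kafka.streams.kstream.KTable;
    import org.apache.kafka.streams.kstream.Materialized;
    import org.apache.kafka.streams.kstream.Produced;
    import org.apache.kafka.streams.state.Stores;

    public class GracePeriodJoinSketch {
        public static void main(String[] args) {
            final Duration grace = Duration.ofMinutes(5);
            final StreamsBuilder builder = new StreamsBuilder();

            // Stream side: records may be buffered up to the grace period and
            // released in timestamp order.
            final KStream<String, String> clicks =
                builder.stream("clicks", Consumed.with(Serdes.String(), Serdes.String()));

            // Table side: a versioned store whose history retention covers the grace period.
            final KTable<String, String> users = builder.table(
                "users",
                Consumed.with(Serdes.String(), Serdes.String()),
                Materialized.<String, String>as(
                    Stores.persistentVersionedKeyValueStore("users-versioned", grace.multipliedBy(2))));

            final KStream<String, String> enriched = clicks.join(
                users,
                (click, user) -> user + "/" + click,
                Joined.<String, String, String>with(Serdes.String(), Serdes.String(), Serdes.String())
                      .withGracePeriod(grace));   // the option proposed in this KIP

            enriched.to("enriched-clicks", Produced.with(Serdes.String(), Serdes.String()));
        }
    }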

Victoria

1) Yes, records will exit in timestamp order, not in offset order.
2) Late records will be dropped ("late" meaning outside the grace period).
From my understanding that is the point of a grace period, no? Doesn't the
same thing happen with versioned stores?
3) The segment store already has an observed stream time, and we advance
based on that. It should only advance based on records that enter the
store, so yes, only stream-side records. We could maybe do an improvement
later to advance stream time from the table side as well, but that might be
debatable, as we might get more late records. Anyway, I would rather have
that as a separate discussion.

An in-memory option? We can do that. For the buffer I plan to use the
TimeOrderedKeyValueBuffer interface, which already has an in-memory
implementation, so it would be simple.

I said more in my answer to Lucas's question. The concern I have with
buffer configs or an in-memory option is complicating the interface, and
also the semantic guarantees, though an in-memory option shouldn't affect
those.

Matthias

1) Fixed the out-of-order vs. late terminology in the KIP.

2) I was referring to the stream side: after this KIP we can have a
buffered stream or a normal one. For the table we can use a versioned table
or a normal table.

3) Good call out. I clarified this as "If the table side uses a materialized
version store, it can store multiple versions of each record within its
defined grace period." and modified the rest of the paragraph a bit.

4) I get preserving offset ordering, but if the stream is buffered to join
on timestamp instead of offset, doesn't it already seem like we care more
about time in this case?

If we end up adding more options it might make sense to do this. Maybe
offset-order processing can be a follow-up?

I'll add a section for this in Rejected Alternatives. I think it makes
sense to do something like this, but maybe in a follow-up.

5) I hadn't thought about this. I suppose if they changed this in an
upgrade, the next record would either evict a lot of records (if the grace
period decreased) or there would be a pause until the new grace period is
reached. Increasing it is a bit more problematic, especially if the table's
grace period and retention time stay the same. If the data is reprocessed
after a change like that, the results would be different, but I feel like
that would be expected after such a change.

What do you think should happen?

Hopefully this answers your questions!

Walker

On Mon, May 8, 2023 at 11:32 AM Matthias J. Sax  wrote:

> Thanks for the KIP! Also some question/comments from my side:
>
> 10) Notation: you use the term "late data" but I think you mean
> out-of-order. We reserve the term "late" for records that arrive after the
> grace period has passed, and thus, "late == out-of-order data that is dropped".
>
>
> 20) "There is only one option from the stream side and only recently is
> there a second option on the table side."
>
> What are those options? Victoria already asked about the table side, but
> I am also not sure what option you mean for the stream side?
>
>
> 30) "If the table side uses a materialized version store the value is
> the latest by stream time rather than by offset within its defined grace
> period."
>
> The phrase "the value is the latest by stream time" is confusing -- in
> the end, a versioned store keeps multiple versions, not just one.
>
>
> 40) I am also wondering about ordering. In general, KS tries to preserve
> offset-order during processing (with some exception, when offset order
> preservation is not clearly defined). Given that the stream-side buffer
> is really just a "linear buffer", we could easily preserve offset-order.
> But I also see a benefit of re-ordering and emitting out-of-order data
> right away when read (instead of blocking them behind in-order records
> that are no

[jira] [Created] (KAFKA-15008) GCS Sink Connector to parse JSON having leading 0's for an integer field

2023-05-18 Thread Lubna Naqvi (Jira)
Lubna Naqvi created KAFKA-15008:
---

 Summary: GCS Sink Connector to parse JSON having leading 0's for 
an integer field
 Key: KAFKA-15008
 URL: https://issues.apache.org/jira/browse/KAFKA-15008
 Project: Kafka
  Issue Type: Bug
Reporter: Lubna Naqvi


Our Kafka data, which is in JSON format, has an attribute (gtin) whose value is a 
number starting with a zero, and the Kafka Connect GCS Sink Connector fails to parse 
it. gtin is a very common attribute across any item-related Kafka source, and this is 
causing a major (blocking) issue, as we are not able to parse this attribute.

 

We'd like the ability to parse an integer field in a JSON message even if it is 
prefixed with a zero. Can this be investigated for consideration in a release?
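
For context, the JSON specification does not allow numeric literals with leading 
zeros, which is why strict parsers reject such values. A small, hedged illustration 
using Jackson (whether the GCS sink connector's converter exposes a similar toggle 
is a question for that connector; note that even a lenient numeric parse drops the 
leading zeros, so producing gtin as a JSON string is the only way to preserve them):

    import com.fasterxml.jackson.core.JsonParser;
    import com.fasterxml.jackson.databind.ObjectMapper;

    public class LeadingZeroJsonDemo {
        public static void main(String[] args) throws Exception {
            // A leading zero makes this invalid JSON per the spec.
            String json = "{\"gtin\": 00123456789012}";

            ObjectMapper strict = new ObjectMapper();
            try {
                strict.readTree(json);
            } catch (Exception e) {
                System.out.println("Strict parse fails: " + e.getMessage());
            }

            // Jackson can be configured to tolerate leading zeros, but the zeros are
            // still dropped from the parsed number.
            ObjectMapper lenient = new ObjectMapper()
                    .enable(JsonParser.Feature.ALLOW_NUMERIC_LEADING_ZEROS);
            System.out.println("Lenient parse: " + lenient.readTree(json));
        }
    }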



--
This message was sent by Atlassian Jira
(v8.20.10#820010)