[jira] [Created] (FLINK-33866) KafkaSinkBuilder in flink-connector-kafka references DeliveryGuarantee in flink-connector-base

2023-12-17 Thread Kurt Ostfeld (Jira)
Kurt Ostfeld created FLINK-33866:


 Summary: KafkaSinkBuilder in flink-connector-kafka references 
DeliveryGuarantee in flink-connector-base
 Key: FLINK-33866
 URL: https://issues.apache.org/jira/browse/FLINK-33866
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Kafka
Affects Versions: kafka-3.0.2
Reporter: Kurt Ostfeld


I have a Flink project that has code like:

```
KafkaSink.builder().setDeliveryGuarantee(DeliveryGuarantee.AT_LEAST_ONCE)
```
 
This worked with flink-connector-kafka 3.0.1 as well as past versions of Flink.
 
This fails to compile with flink-connector-kafka 3.0.2 because that release 
changed flink-connector-base to a provided dependency, so the reference to the 
DeliveryGuarantee class (which lives in flink-connector-base) becomes a 
compiler error.
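
For reference, here is a minimal sketch of the failing usage (the imports and 
surrounding class are reconstructed for illustration; DeliveryGuarantee lives 
in the org.apache.flink.connector.base package of flink-connector-base):

```
import org.apache.flink.connector.base.DeliveryGuarantee; // lives in flink-connector-base
import org.apache.flink.connector.kafka.sink.KafkaSink;
import org.apache.flink.connector.kafka.sink.KafkaSinkBuilder;

public class SinkExample {
    KafkaSinkBuilder<String> newSinkBuilder() {
        return KafkaSink.<String>builder()
                .setBootstrapServers("localhost:9092")
                // Compile error under 3.0.2 unless flink-connector-base is on the compile classpath:
                .setDeliveryGuarantee(DeliveryGuarantee.AT_LEAST_ONCE);
    }
}
```

A likely workaround is declaring org.apache.flink:flink-connector-base as an 
explicit dependency of the application, since provided-scope dependencies are 
not visible transitively to consumers.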
 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [NOTICE] Experimental Java 17 support now available on master

2023-06-18 Thread Kurt Ostfeld
I think there is some confusion:

Chesnay, not me, recently checked changes into master so that Flink will 
build + test + run with experimental Java 17 support, but with Kryo 2.x 
as-is, so Java records will still error. Chesnay created this particular 
email thread for that work.

I (Kurt) created a PR+FLIP several weeks ago for upgrading Kryo from 2.x to 
5.x, with full backward compatibility for existing savepoints/checkpoints, 
which enables Flink to run on Java 17 with support for Java records. This 
isn't merged into master, and I haven't gotten much feedback on it.

I recently rebased the Kryo upgrade PR onto the master branch, which includes 
Chesnay's commits. The PR branch was already running successfully on Java 17; 
Chesnay's changes enable Flink to build and run the CI test suite on Java 17 
as well. However, without the Kryo upgrade, Flink isn't compatible with Java 
records.

I'd be happy to follow the standard process and do the FLIP vote, but before 
this is ready for a vote, the PR needs review + testing by someone other than 
me. Specifically, I'd like someone to try to create a Flink application that 
tries to break the upgrade process: either confirm that everything works or 
demonstrate an error scenario.

The Kryo PR code is passing all automated CI tests, which include several 
tests covering backwards-compatibility scenarios. I also created a simple 
application, https://github.com/kurtostfeld/flink-kryo-upgrade-demo, to create 
state with Flink 1.17 and test the upgrade process. From what I can see it 
works, but it definitely needs more testing from people other than just me.
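
To make concrete what "error with Java records" means, here is a minimal 
sketch of my own (separate from the demo repo above, and assuming Flink 
1.17-era type extraction): any record that goes through Flink's generic (Kryo) 
serialization path fails under Kryo 2.x, since records have no no-arg 
constructor and no mutable fields for Kryo's reflection-based field serializer 
to work with.

```
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class RecordDemo {
    // Records are not POJOs by Flink's rules (no default constructor, no setters),
    // so type extraction falls back to GenericTypeInfo, i.e. the Kryo path.
    public record Event(long id, String payload) {}

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.fromElements(new Event(1L, "a"), new Event(2L, "b")) // elements are serialized with Event's type serializer
                .keyBy(Event::id) // the keyBy adds another serialization boundary
                .print();
        env.execute("record-demo"); // fails with the usual Kryo 2.x reflection error
    }
}
```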



--- Original Message ---
On Sunday, June 18th, 2023 at 7:41 AM, Jing Ge wrote:


> 
> 
> Hi Kurt,
> 
> Thanks for your contribution. I am a little bit confused about the email
> title, since your PR[1] is not merged into the master yet. I guess, with
> "Experimental Java 17 support", you meant it is available on your branch
> which is based on the master.
> 
> If I am not mistaken, there is no vote thread of FLIP 317 on ML. Would you
> like to follow the standard process[2] defined by the Flink community?
> Thanks!
> 
> 
> Best regards,
> Jing
> 
> [1] https://github.com/apache/flink/pull/22660
> [2]
> https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals
> 
> On Sun, Jun 18, 2023 at 1:18 AM Kurt Ostfeld kurtostf...@proton.me.invalid wrote:
> 
> > I built the Flink master branch and tried running this simple Flink app
> > that uses a Java record:
> > 
> > https://github.com/kurtostfeld/flink-kryo-upgrade-demo/blob/main/flink-record-demo/src/main/java/demo/app/Main.java
> > 
> > It fails with the normal exception that Kryo 2.x throws when you try to
> > serialize a Java record. The full stack trace is here:
> > https://pastebin.com/HGhGKUWt
> > 
> > I tried removing this line:
> > 
> > https://github.com/kurtostfeld/flink-kryo-upgrade-demo/blob/main/flink-record-demo/src/main/java/demo/app/Main.java#L36
> > 
> > and that had no impact, I got the same error.
> > 
> > In the other thread, you said that the plan was to use PojoSerializer to
> > serialize records rather than Kryo. Currently, the Flink code base uses
> > Kryo 2.x by default for generic user data types, and that will fail when
> > the data type is a record or contains records. Ultimately, if Flink wants
> > to fully support Java records, it seems that it has to move off of Kryo
> > 2.x. PojoSerializer is part of what is basically a custom serialization
> > library internal to Flink that is an alternative to Kryo. That's one
> > option: move off of Kryo to a Flink-internal serialization library. The
> > other two options are upgrade to the new Kryo or use a different
> > serialization library.
> > 
> > The Kryo 5.5.0 upgrade PR I submitted (
> > https://github.com/apache/flink/pull/22660) with FLIP 317 (
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-317%3A+Upgrade+Kryo+from+2.24.0+to+5.5.0)
> > works with records. The Flink app linked above that uses records works with
> > the PR and that's what I posted to this mailing list a few weeks ago. I
> > rebased the pull request onto the latest master branch and it's passing
> > all tests. From my testing, it supports stateful upgrades, including
> > checkpoints. If you can demonstrate a scenario where stateful upgrades
> > error I can try to resolve that.


[NOTICE] Experimental Java 17 support now available on master

2023-06-17 Thread Kurt Ostfeld
I built the Flink master branch and tried running this simple Flink app that 
uses a Java record:

https://github.com/kurtostfeld/flink-kryo-upgrade-demo/blob/main/flink-record-demo/src/main/java/demo/app/Main.java

It fails with the normal exception that Kryo 2.x throws when you try to 
serialize a Java record. The full stack trace is here: 
https://pastebin.com/HGhGKUWt

I tried removing this line:

https://github.com/kurtostfeld/flink-kryo-upgrade-demo/blob/main/flink-record-demo/src/main/java/demo/app/Main.java#L36

and that had no impact, I got the same error.

In the other thread, you said that the plan was to use PojoSerializer to 
serialize records rather than Kryo. Currently, the Flink code base uses Kryo 
2.x by default for generic user data types, and that will fail when the data 
type is a record or contains records. Ultimately, if Flink wants to fully 
support Java records, it seems that it has to move off of Kryo 2.x. 
PojoSerializer is part of what is basically a custom serialization library 
internal to Flink that is an alternative to Kryo. That's one option: move off 
of Kryo to a Flink-internal serialization library. The other two options are 
upgrade to the new Kryo or use a different serialization library.
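
To make the distinction concrete, here is a small sketch of my own (based on 
my understanding of type extraction in Flink 1.17, not code from the PR): a 
class that satisfies Flink's POJO rules gets the internal PojoSerializer, 
while a record does not qualify and falls back to the generic Kryo path.

```
import org.apache.flink.api.common.typeinfo.TypeInformation;

public class TypeInfoDemo {
    // Meets the POJO rules: public no-arg constructor and public fields -> PojoSerializer.
    public static class PojoEvent {
        public long id;
        public String payload;
        public PojoEvent() {}
    }

    // No no-arg constructor, only final fields -> GenericTypeInfo, i.e. the Kryo path.
    public record RecordEvent(long id, String payload) {}

    public static void main(String[] args) {
        System.out.println(TypeInformation.of(PojoEvent.class));   // prints a PojoType
        System.out.println(TypeInformation.of(RecordEvent.class)); // prints a GenericType
    }
}
```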

The Kryo 5.5.0 upgrade PR I submitted 
(https://github.com/apache/flink/pull/22660) with FLIP 317 
(https://cwiki.apache.org/confluence/display/FLINK/FLIP-317%3A+Upgrade+Kryo+from+2.24.0+to+5.5.0)
 works with records. The Flink app linked above that uses records works with 
the PR and that's what I posted to this mailing list a few weeks ago. I rebased 
the pull request onto the latest master branch and it's passing all tests. 
From my testing, it supports stateful upgrades, including checkpoints. If you 
can demonstrate a scenario where stateful upgrades error I can try to resolve 
that.

Re: [DISCUSS] FLIP-317: Upgrade Kryo from 2.24.0 to 5.5.0

2023-06-10 Thread Kurt Ostfeld
Regarding this comment: "The version in the state is the serializer version, 
and applies to the entire state, independent of what it contains. If you use 
Kryo2 for reading and Kryo5 for writing (which also implies writing the new 
serializer version into state), then I'd assume that a migration is an 
all-or-nothing kind of deal."

Much of Flink uses the TypeSerializerSnapshot classes for serialization. With 
that, the fully qualified package+class name of a subclass of 
TypeSerializerSnapshot is written to the state as a string. The pull-request 
uses this class name to determine the correct version of Kryo to use. Flink up 
to and including 1.17.x uses 
`org.apache.flink.api.java.typeutils.runtime.kryo.KryoSerializerSnapshot` for 
Kryo 2.x serialized data. The pull request uses 
`org.apache.flink.api.java.typeutils.runtime.kryo5.KryoSerializerSnapshot` for 
Kryo 5.x serialized data. Serialized state can mix different types of 
snapshots; if it contains both Kryo 2.x and Kryo 5.x snapshot data, each part 
is deserialized with the correct version of Kryo without problems.
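
Schematically, the dispatch works something like this (a hypothetical helper 
for illustration only; the actual PR wires this through the 
TypeSerializerSnapshot restore machinery rather than a standalone method):

```
// Hypothetical sketch: choose the Kryo runtime from the snapshot class name read from state.
static int kryoMajorVersionFor(String snapshotClassName) {
    if (snapshotClassName.equals(
            "org.apache.flink.api.java.typeutils.runtime.kryo5.KryoSerializerSnapshot")) {
        return 5; // snapshot written by the upgraded serializer
    }
    if (snapshotClassName.equals(
            "org.apache.flink.api.java.typeutils.runtime.kryo.KryoSerializerSnapshot")) {
        return 2; // legacy snapshot from Flink 1.17.x and earlier
    }
    throw new IllegalArgumentException("Not a Kryo serializer snapshot: " + snapshotClassName);
}
```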

The state version number is used to determine the serialized Kryo version at 
only one point in the source code where the Snapshot classes are not used:

https://github.com/kurtostfeld/flink/blob/e013e9e95096efb41d376f3b6584b5d3d78dc916/flink-runtime/src/main/java/org/apache/flink/runtime/state/heap/StateTableByKeyGroupReaders.java#L73

This seems to work from my testing. If I can find a scenario where this doesn't 
work I can come up with a revised solution.

I'd like to conclude that this pull-request demonstrates that a 
backward-compatible Kryo upgrade is possible and is mostly done. More testing 
from a wider pool of people would be needed to proceed, but the approach is 
demonstrated. Whether this proceeds at all is up to the Flink project: if the 
plan is to drop all backward compatibility anyway for a 2.0 release, as 
Martijn Visser suggested in this thread, then the Kryo upgrade can be done in 
a much simpler fashion, and a more complex backward-compatible upgrade seems 
unnecessary.




Re: [DISCUSS] FLIP-317: Upgrade Kryo from 2.24.0 to 5.5.0

2023-06-08 Thread Kurt Ostfeld
ok:
- I start a Flink 1.17.1 cluster, run the job, then run `flink stop` and 
generate a savepoint. This savepoint will have Kryo 2.x data from standard 
Flink 1.17.1.
- I start a Flink 1.18-SNAPSHOT cluster with the pull-request, run the job 
resuming from the Flink 1.17 savepoint, then kill the cluster. I now have a 
checkpoint with metadata. I believe this checkpoint is entirely using Kryo 5.x 
serialization.
- I restart the cluster, run the job resuming from the checkpoint, and 
everything runs successfully. The job picks up where it left off and there are 
no errors, all output data looks correct.

Am I following the scenario correctly? Why would a checkpoint created by the 
new pull-request code have Kryo 2.x serialized data?

Here is the code for my test app that I'm using. The checkpoint configuration 
settings are mostly from 
https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/dev/datastream/fault-tolerance/checkpointing/

https://github.com/kurtostfeld/flink-kryo-upgrade-demo/blob/main/flink-kryo-upgraded/src/main/java/demo/app/Main.java
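
The relevant configuration looks roughly like this (a sketch following the 
1.17 checkpointing docs linked above; see the repo for the exact settings):

```
import org.apache.flink.streaming.api.CheckpointingMode;
import org.apache.flink.streaming.api.environment.CheckpointConfig;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class CheckpointSetup {
    static StreamExecutionEnvironment configuredEnv() {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.enableCheckpointing(10_000); // checkpoint every 10 seconds
        env.getCheckpointConfig().setCheckpointingMode(CheckpointingMode.EXACTLY_ONCE);
        env.getCheckpointConfig().setCheckpointTimeout(60_000);
        // Retain completed checkpoints when the cluster is killed, so the job can resume from them:
        env.getCheckpointConfig().setExternalizedCheckpointCleanup(
                CheckpointConfig.ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION);
        return env;
    }
}
```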


--- Original Message ---
On Thursday, June 8th, 2023 at 9:33 AM, Chesnay Schepler wrote:


> 
> 
> On 08/06/2023 16:06, Kurt Ostfeld wrote:
> 
> > If I understand correctly, the scenario is resuming from multiple 
> > checkpoint files or from a savepoint and checkpoint files which may be 
> > generated by different versions of Flink
> 
> 
> No; it's the same version of Flink, you just didn't do a full migration
> of the savepoint from the start.
> 
> So, load old savepoint -> create an incremental checkpoint (which writes a
> bit of new state with Kryo5) -> job fails -> try to recover the job (which
> now has to read state written with either Kryo2 or Kryo5).
> 
> On 08/06/2023 16:06, Kurt Ostfeld wrote:
> 
> > This pull-request build supports Java records
> 
> 
> We'd have to see, but off the top of my head I doubt we want to use Kryo
> for that, and would rather extend our PojoSerializer. At least so far that
> was the plan.


Re: [DISCUSS] FLIP-317: Upgrade Kryo from 2.24.0 to 5.5.0

2023-06-08 Thread Kurt Ostfeld
If the Flink project is planning to completely drop stateful upgrade 
compatibility within the next year for a Flink 2.0 release, then providing a 
stateful migration pathway from Kryo 2.x to Kryo 5.x is probably unnecessary. 
Is that correct? Is the Flink project pretty confident that Flink 2.0 will not 
be compatible with Flink 1.x state?




--- Original Message ---
On Monday, June 5th, 2023 at 7:51 AM, Martijn Visser wrote:


> 
> 
> Hi ConradJam,
> 
> That assumes that it will be possible to upgrade statefully to Flink 2.0:
> given that it is a major breaking change, I wouldn't assume that will be
> possible.
> 
> Best regards,
> 
> Martijn
> 
> On Mon, Jun 5, 2023 at 2:37 PM ConradJam jam.gz...@gmail.com wrote:
> 
> > Here I have a suggestion. Since I mentioned Flink 2.0 earlier, I am
> > wondering if there is a possibility that users could migrate all state to
> > Kryo5 on the first start-up after moving to version 2.0, until we give up
> > maintaining Kryo2 later.
> > 
> > I don't know if my idea coincides with Chesnay's.
> > 
> > Chesnay Schepler ches...@apache.org wrote on Thursday, June 1, 2023 at 23:25:
> > 
> > > The version in the state is the serializer version, and applies to the
> > > entire state, independent of what it contains.
> > > If you use Kryo2 for reading and Kryo5 for writing (which also implies
> > > writing the new serializer version into state), then I'd assume that a
> > > migration is an all-or-nothing kind of deal.
> > > IOW, you'd have to load a savepoint and write out an entirely new
> > > savepoint with the new state.
> > > Otherwise you may have only re-written part of the checkpoint, which now
> > > contains a mix of Kryo2/Kryo5 serialized classes, and should then fail
> > > hard on any recovery attempt because we wouldn't use Kryo2 to read
> > > anything.
> > > 
> > > If I'm right, then as-is this sounds like quite a trap for users to fall
> > > into, because from what I gathered this is the default behavior in the PR
> > > (I could be wrong though since I haven't fully dug through the 8k-line
> > > PR yet...)
> > > 
> > > What we kind of want is this:
> > > 1) Kryo5 is used as the default for new jobs. (maybe not even that,
> > > making it an explicit opt-in)
> > > 2) Kryo2 is used for reading AND writing for existing* jobs by default.
> > > 3) Users can explicitly (and easily!) do a full migration of their jobs,
> > > after which 2) should no longer apply.
> > > 
> > > In the PR you mentioned running into issues on Java 17; do you have
> > > some error stacktraces and example data/serializers still around?
> > > 
> > > On 30/05/2023 00:38, Kurt Ostfeld wrote:
> > > 
> > > > > I’d assumed that there wasn’t a good way to migrate state stored with
> > > > > an older version of Kryo to a newer version - if you’ve solved that, 
> > > > > kudos.
> > > > I hope I've solved this. The pull request is supposed to do exactly
> > > > this. Please let me know if you can propose a scenario that would
> > > > break this.
> > > > 
> > > > The pull-request has both Kryo 2.x and 5.x dependencies. It looks at
> > > > the state version number written to the state to determine which 
> > > > version of
> > > > Kryo to use for deserialization. Kryo 2.x is not used to write new 
> > > > state.
> > > > 
> > > > --- Original Message ---
> > > > On Monday, May 29th, 2023 at 5:17 PM, Ken Krugler <kkrugler_li...@transpac.com> wrote:
> > > > 
> > > > > Hi Kurt,
> > > > > 
> > > > > I personally think it’s a very nice improvement, and that the
> > > > > longer-term goal of removing built-in Kryo support for state 
> > > > > serialization
> > > > > (while a good one) warrants a separate FLIP.
> > > > > 
> > > > > Perhaps an intermediate approach would be to disable the use of Kryo
> > > > > for state serialization by default, and force a user to disregard 
> > > > > warnings
> > > > > and explicitly enable it if they want to go down that path.
> > > > > 
> > > > > I’d assumed that there wasn’t a good way to migrate state stored with
> > > > > an older version of Kryo to a newer version - if you’ve solved that,

Re: [DISCUSS] FLIP-317: Upgrade Kryo from 2.24.0 to 5.5.0

2023-06-08 Thread Kurt Ostfeld
Thank you very much for the feedback.

- With this pull-request build, Flink runs successfully with a JDK 17 runtime 
for applications without saved state, or for applications whose saved state 
was written by this pull-request build using Kryo 5.x. FYI, the Maven build is 
still run with JDK 8 or 11, but the Flink jobmanager and taskmanager can be 
run with a JDK 17 runtime.
- Kryo 2.x is still on the classpath for backwards compatibility purposes, but 
if you try to load a savepoint from Flink 1.17 or older (which uses the Kryo 
2.x serialization library) with JDK 17+, that will fail with exceptions.
- A stateful upgrade pathway looks like this: Applications run a Flink cluster 
with this pull-request under JDK 8 or 11, load an existing savepoint with Kryo 
2.x data, write out a new savepoint which automatically uses Kryo 5.x, restart 
the Flink cluster with a JDK 17 runtime, and resume from the new savepoint 
successfully.
- This pull-request build supports Java records (which obviously requires 
JDK17+ at runtime) with the Flink DataStream API. Kryo 5.x supports records so 
this works without any extra configuration. A simple demo is here: 
https://github.com/kurtostfeld/flink-kryo-upgrade-demo/blob/main/flink-record-demo/src/main/java/demo/app/Main.java.
 The app is built with JDK 17, Flink's Maven build still runs with JDK 8/11, 
but the Flink cluster uses JDK 17 at runtime.


I need to investigate the scenario you describe. If I understand correctly, the 
scenario is resuming from multiple checkpoint files or from a savepoint and 
checkpoint files which may be generated by different versions of Flink and 
therefore may be using different Kryo library versions. Is that accurate? We 
need to accommodate that scenario and I will investigate.




--- Original Message ---
On Thursday, June 1st, 2023 at 10:25 AM, Chesnay Schepler wrote:


>
>
> The version in the state is the serializer version, and applies to the
> entire state, independent of what it contains.
> If you use Kryo2 for reading and Kryo5 for writing (which also implies
> writing the new serializer version into state), then I'd assume that a
> migration is an all-or-nothing kind of deal.
> IOW, you'd have to load a savepoint and write out an entirely new
> savepoint with the new state.
> Otherwise you may have only re-written part of the checkpoint, which now
> contains a mix of Kryo2/Kryo5 serialized classes, and should then fail
> hard on any recovery attempt because we wouldn't use Kryo2 to read
> anything.
>
> If I'm right, then as-is this sounds like quite a trap for users to fall
> into, because from what I gathered this is the default behavior in the PR
> (I could be wrong though since I haven't fully dug through the 8k-line
> PR yet...)
>
> What we kind of want is this:
> 1) Kryo5 is used as the default for new jobs. (maybe not even that,
> making it an explicit opt-in)
> 2) Kryo2 is used for reading AND writing for existing* jobs by default.
> 3) Users can explicitly (and easily!) do a full migration of their jobs,
> after which 2) should no longer apply.
>
>
>
> In the PR you mentioned running into issues on Java 17; do you have
> some error stacktraces and example data/serializers still around?
>
> On 30/05/2023 00:38, Kurt Ostfeld wrote:
>
> > > I’d assumed that there wasn’t a good way to migrate state stored with an 
> > > older version of Kryo to a newer version - if you’ve solved that, kudos.
> > I hope I've solved this. The pull request is supposed to do exactly this. 
> > Please let me know if you can propose a scenario that would break this.
> >
> > The pull-request has both Kryo 2.x and 5.x dependencies. It looks at the 
> > state version number written to the state to determine which version of 
> > Kryo to use for deserialization. Kryo 2.x is not used to write new state.
> >
> > --- Original Message ---
> > On Monday, May 29th, 2023 at 5:17 PM, Ken Krugler kkrugler_li...@transpac.com wrote:
> >
> > > Hi Kurt,
> > >
> > > I personally think it’s a very nice improvement, and that the longer-term 
> > > goal of removing built-in Kryo support for state serialization (while a 
> > > good one) warrants a separate FLIP.
> > >
> > > Perhaps an intermediate approach would be to disable the use of Kryo for 
> > > state serialization by default, and force a user to disregard warnings 
> > > and explicitly enable it if they want to go down that path.
> > >
> > > I’d assumed that there wasn’t a good way to migrate state stored with an 
> > > older version of Kryo to a newer version - if you’ve solved that, kudos.
> > >
> > > — Ken
> > >
> > > > On May 29, 2023, at

Re: [DISCUSS] FLIP-317: Upgrade Kryo from 2.24.0 to 5.5.0

2023-05-29 Thread Kurt Ostfeld
> I’d assumed that there wasn’t a good way to migrate state stored with an 
> older version of Kryo to a newer version - if you’ve solved that, kudos.

I hope I've solved this. The pull request is supposed to do exactly this. 
Please let me know if you can propose a scenario that would break this.

The pull-request has both Kryo 2.x and 5.x dependencies. It looks at the state 
version number written to the state to determine which version of Kryo to use 
for deserialization. Kryo 2.x is not used to write new state.

--- Original Message ---
On Monday, May 29th, 2023 at 5:17 PM, Ken Krugler wrote:


> 
> 
> Hi Kurt,
> 
> I personally think it’s a very nice improvement, and that the longer-term 
> goal of removing built-in Kryo support for state serialization (while a good 
> one) warrants a separate FLIP.
> 
> Perhaps an intermediate approach would be to disable the use of Kryo for 
> state serialization by default, and force a user to disregard warnings and 
> explicitly enable it if they want to go down that path.
> 
> I’d assumed that there wasn’t a good way to migrate state stored with an 
> older version of Kryo to a newer version - if you’ve solved that, kudos.
> 
> — Ken
> 
> 
> > On May 29, 2023, at 2:21 PM, Kurt Ostfeld kurtostf...@proton.me.INVALID 
> > wrote:
> > 
> > Hi everyone. I would like to start the discussion thread for FLIP-317: 
> > Upgrade Kryo from 2.24.0 to 5.5.0 [1].
> > 
> > There is a pull-request associated with this linked in the FLIP.
> > 
> > I'd particularly like to hear about:
> > 
> > - Chesnay Schepler's request to consider removing Kryo serializers from the 
> > execution config. Is this a reasonable task to add into this FLIP? Is there 
> > consensus on how to resolve that? Would that be better addressed in a 
> > separate future FLIP after the Kryo upgrade FLIP is completed?
> > 
> > - Backwards compatibility. The automated CI tests have a lot of backwards 
> > compatibility tests that are passing. I also wrote a Flink application with 
> > keyed state using custom Kryo v2 serializers and then an upgraded version 
> > with both Kryo v2 and Kryo v5 serializers to stress test the upgrade 
> > process. I'd like to hear about additional scenarios that need to be tested.
> > 
> > - Is this worth pursuing or is the Flink project looking to go in a 
> > different direction? I'd like to do some more work on the pull request if 
> > this is being seriously considered for adoption.
> > 
> > I'm looking forward to hearing everyone's feedback and suggestions.
> > 
> > Thank you,
> > Kurt
> > 
> > [1] 
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-317%3A+Upgrade+Kryo+from+2.24.0+to+5.5.0
> 
> 
> --
> Ken Krugler
> http://www.scaleunlimited.com
> Custom big data solutions
> Flink, Pinot, Solr, Elasticsearch
>


[DISCUSS] FLIP-317: Upgrade Kryo from 2.24.0 to 5.5.0

2023-05-29 Thread Kurt Ostfeld
Hi everyone. I would like to start the discussion thread for FLIP-317: Upgrade 
Kryo from 2.24.0 to 5.5.0 [1].

There is a pull-request associated with this linked in the FLIP.

I'd particularly like to hear about:

- Chesnay Schepler's request to consider removing Kryo serializers from the 
execution config. Is this a reasonable task to add into this FLIP? Is there 
consensus on how to resolve that? Would that be better addressed in a separate 
future FLIP after the Kryo upgrade FLIP is completed?

- Backwards compatibility. The automated CI tests have a lot of backwards 
compatibility tests that are passing. I also wrote a Flink application with 
keyed state using custom Kryo v2 serializers and then an upgraded version with 
both Kryo v2 and Kryo v5 serializers to stress test the upgrade process. I'd 
like to hear about additional scenarios that need to be tested.

- Is this worth pursuing or is the Flink project looking to go in a different 
direction? I'd like to do some more work on the pull request if this is being 
seriously considered for adoption.

I'm looking forward to hearing everyone's feedback and suggestions.

Thank you,
Kurt

[1] 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-317%3A+Upgrade+Kryo+from+2.24.0+to+5.5.0


Kryo Upgrade: Request FLIP page create access

2023-05-28 Thread Kurt Ostfeld
Chesnay Schepler asked me to create a FLIP for this pull request:
https://github.com/apache/flink/pull/22660

I created an account for the Flink Confluence site with username "kurto", but I 
don't have access to create pages, and therefore don't have access to create a 
FLIP. I see the FLIP docs say:

> If you don't have the necessary permissions for creating a new page, please 
> ask on the development mailing list.

https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals

Can I request this access please? Thank you :)

[jira] [Created] (FLINK-32104) stop-with-savepoint fails and times out with simple reproducible example

2023-05-15 Thread Kurt Ostfeld (Jira)
Kurt Ostfeld created FLINK-32104:


 Summary: stop-with-savepoint fails and times out with simple 
reproducible example
 Key: FLINK-32104
 URL: https://issues.apache.org/jira/browse/FLINK-32104
 Project: Flink
  Issue Type: Bug
  Components: API / DataStream
Affects Versions: 1.17.0
Reporter: Kurt Ostfeld


I've put together a simple demo app that reproduces the issue with instructions 
on how to reproduce:

[https://github.com/kurtostfeld/flink-stop-issue]

 

The issue: with a very simple application written with the Flink DataStream 
API, `stop-with-savepoint` fails and times out like this:
 
{code:java}
./bin/flink stop --type native --savepointPath ../savepoints 
d69a952625497cca0665dfdcdb9f4718

Suspending job "d69a952625497cca0665dfdcdb9f4718" with a NATIVE savepoint.


 The program finished with the following exception:

org.apache.flink.util.FlinkException: Could not stop with a savepoint job 
"d69a952625497cca0665dfdcdb9f4718".
at 
org.apache.flink.client.cli.CliFrontend.lambda$stop$4(CliFrontend.java:595)
at 
org.apache.flink.client.cli.CliFrontend.runClusterAction(CliFrontend.java:1041)
at org.apache.flink.client.cli.CliFrontend.stop(CliFrontend.java:578)
at 
org.apache.flink.client.cli.CliFrontend.parseAndRun(CliFrontend.java:1110)
at 
org.apache.flink.client.cli.CliFrontend.lambda$mainInternal$9(CliFrontend.java:1189)
at 
org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:28)
at 
org.apache.flink.client.cli.CliFrontend.mainInternal(CliFrontend.java:1189)
at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1157)
Caused by: java.util.concurrent.TimeoutException
at 
java.base/java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1886)
at 
java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2021)
at 
org.apache.flink.client.cli.CliFrontend.lambda$stop$4(CliFrontend.java:591)
... 7 more {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-31938) Failing Unit Test: FlinkConnectionTest.testCatalogSchema "Failed to get response for the operation"

2023-04-25 Thread Kurt Ostfeld (Jira)
Kurt Ostfeld created FLINK-31938:


 Summary: Failing Unit Test: FlinkConnectionTest.testCatalogSchema 
"Failed to get response for the operation"
 Key: FLINK-31938
 URL: https://issues.apache.org/jira/browse/FLINK-31938
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / JDBC
Reporter: Kurt Ostfeld


{noformat}
[ERROR] Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 2.885 s 
<<< FAILURE! - in org.apache.flink.table.jdbc.FlinkConnectionTest
[ERROR] org.apache.flink.table.jdbc.FlinkConnectionTest.testCatalogSchema  Time 
elapsed: 1.513 s  <<< ERROR!
org.apache.flink.table.client.gateway.SqlExecutionException: Failed to get 
response for the operation 733f0d91-e9e8-4487-949f-f3abb13384e8.
at 
org.apache.flink.table.client.gateway.ExecutorImpl.getFetchResultResponse(ExecutorImpl.java:416)
at 
org.apache.flink.table.client.gateway.ExecutorImpl.fetchUtilResultsReady(ExecutorImpl.java:376)
at 
org.apache.flink.table.client.gateway.ExecutorImpl.executeStatement(ExecutorImpl.java:242)
at 
org.apache.flink.table.jdbc.FlinkConnectionTest.testCatalogSchema(FlinkConnectionTest.java:95){noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-31937) Failing Unit Test: ClientTest.testClientServerIntegration "Connection leak"

2023-04-25 Thread Kurt Ostfeld (Jira)
Kurt Ostfeld created FLINK-31937:


 Summary: Failing Unit Test: ClientTest.testClientServerIntegration 
"Connection leak"
 Key: FLINK-31937
 URL: https://issues.apache.org/jira/browse/FLINK-31937
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Queryable State
Reporter: Kurt Ostfeld


{code:java}
[ERROR] Tests run: 6, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 34.68 s <<< FAILURE! - in org.apache.flink.queryablestate.network.ClientTest
[ERROR] org.apache.flink.queryablestate.network.ClientTest.testClientServerIntegration  Time elapsed: 3.801 s <<< FAILURE!
java.lang.AssertionError: Connection leak (server)
    at org.apache.flink.queryablestate.network.ClientTest.testClientServerIntegration(ClientTest.java:719)
{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-31897) Failing Unit Test: org.apache.flink.queryablestate.network.ClientTest.testRequestUnavailableHost

2023-04-23 Thread Kurt Ostfeld (Jira)
Kurt Ostfeld created FLINK-31897:


 Summary: Failing Unit Test: 
org.apache.flink.queryablestate.network.ClientTest.testRequestUnavailableHost
 Key: FLINK-31897
 URL: https://issues.apache.org/jira/browse/FLINK-31897
 Project: Flink
  Issue Type: Bug
  Components: API / State Processor
Reporter: Kurt Ostfeld


 
 
{code:java}
[ERROR] Tests run: 6, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 1.612 s <<< FAILURE! - in org.apache.flink.queryablestate.network.ClientTest
[ERROR] org.apache.flink.queryablestate.network.ClientTest.testRequestUnavailableHost  Time elapsed: 0.006 s <<< FAILURE!
java.lang.AssertionError:
Expected: A CompletableFuture that will have failed within 360 milliseconds with: java.net.ConnectException
     but: Future completed with different exception: org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannel$AnnotatedSocketException: Can't assign requested address: /:0
Caused by: java.net.BindException: Can't assign requested address
{code}
 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-31880) Bad Test in OrcColumnarRowSplitReaderTest

2023-04-21 Thread Kurt Ostfeld (Jira)
Kurt Ostfeld created FLINK-31880:


 Summary: Bad Test in OrcColumnarRowSplitReaderTest
 Key: FLINK-31880
 URL: https://issues.apache.org/jira/browse/FLINK-31880
 Project: Flink
  Issue Type: Bug
  Components: Connectors / ORC, Formats (JSON, Avro, Parquet, ORC, 
SequenceFile)
Reporter: Kurt Ostfeld


This is a development issue with what looks like a buggy unit test.
 
I tried to build Flink with a clean copy of the repository and I get:
 
```
[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR] OrcColumnarRowSplitReaderTest.testReadFileWithTypes:365
expected: "1969-12-31"
but was: "1970-01-01"
[INFO]
[ERROR] Tests run: 26, Failures: 1, Errors: 0, Skipped: 0
```
 
I see the test is testing Date data types with `new Date(562423)`, which is 9 
minutes and 22 seconds after the epoch (1970-01-01 UTC). When I run that on my 
laptop in the CST timezone, I get `Wed Dec 31 18:09:22 CST 1969`.
 
I have a simple pull request ready that fixes this issue by using the Java 8 
LocalDate API instead, which avoids time zones entirely.
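
To illustrate the timezone dependence (my own sketch): `java.util.Date` holds 
milliseconds since the epoch in UTC but renders through the JVM's default 
timezone, whereas `LocalDate` carries no timezone at all.

```
import java.time.LocalDate;
import java.util.Date;
import java.util.TimeZone;

public class DateDemo {
    public static void main(String[] args) {
        TimeZone.setDefault(TimeZone.getTimeZone("America/Chicago")); // CST, as on my laptop
        Date d = new Date(562423); // 562,423 ms = 9 minutes 22 seconds after the epoch
        System.out.println(d);                       // Wed Dec 31 18:09:22 CST 1969
        System.out.println(LocalDate.ofEpochDay(0)); // 1970-01-01, no timezone involved
    }
}
```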



--
This message was sent by Atlassian Jira
(v8.20.10#820010)