Checkpoints and windows size

2024-06-18 Thread banu priya
Hi All,

I have a flink job with key by, tumbling window(2sec window time &uses
processing time)and aggregator.

How often should I run the check point??I don't need the data to be
retained after 2s.

I want to use incremental check point with rocksdb.


Thanks
Banupriya


Re: [ANNOUNCE] Apache Flink CDC 3.1.1 released

2024-06-18 Thread Paul Lam
Well done! Thanks a lot for your hard work!

Best,
Paul Lam

> 2024年6月19日 09:47,Leonard Xu  写道:
> 
> Congratulations! Thanks Qingsheng for the release work and all contributors 
> involved.
> 
> Best,
> Leonard 
> 
>> 2024年6月18日 下午11:50,Qingsheng Ren  写道:
>> 
>> The Apache Flink community is very happy to announce the release of Apache
>> Flink CDC 3.1.1.
>> 
>> Apache Flink CDC is a distributed data integration tool for real time data
>> and batch data, bringing the simplicity and elegance of data integration
>> via YAML to describe the data movement and transformation in a data
>> pipeline.
>> 
>> Please check out the release blog post for an overview of the release:
>> https://flink.apache.org/2024/06/18/apache-flink-cdc-3.1.1-release-announcement/
>> 
>> The release is available for download at:
>> https://flink.apache.org/downloads.html
>> 
>> Maven artifacts for Flink CDC can be found at:
>> https://search.maven.org/search?q=g:org.apache.flink%20cdc
>> 
>> The full release notes are available in Jira:
>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12354763
>> 
>> We would like to thank all contributors of the Apache Flink community who
>> made this release possible!
>> 
>> Regards,
>> Qingsheng Ren
> 



Re: [ANNOUNCE] Apache Flink CDC 3.1.1 released

2024-06-18 Thread Leonard Xu
Congratulations! Thanks Qingsheng for the release work and all contributors 
involved.

Best,
Leonard 

> 2024年6月18日 下午11:50,Qingsheng Ren  写道:
> 
> The Apache Flink community is very happy to announce the release of Apache
> Flink CDC 3.1.1.
> 
> Apache Flink CDC is a distributed data integration tool for real time data
> and batch data, bringing the simplicity and elegance of data integration
> via YAML to describe the data movement and transformation in a data
> pipeline.
> 
> Please check out the release blog post for an overview of the release:
> https://flink.apache.org/2024/06/18/apache-flink-cdc-3.1.1-release-announcement/
> 
> The release is available for download at:
> https://flink.apache.org/downloads.html
> 
> Maven artifacts for Flink CDC can be found at:
> https://search.maven.org/search?q=g:org.apache.flink%20cdc
> 
> The full release notes are available in Jira:
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12354763
> 
> We would like to thank all contributors of the Apache Flink community who
> made this release possible!
> 
> Regards,
> Qingsheng Ren



[ANNOUNCE] Apache Flink CDC 3.1.1 released

2024-06-18 Thread Qingsheng Ren
The Apache Flink community is very happy to announce the release of Apache
Flink CDC 3.1.1.

Apache Flink CDC is a distributed data integration tool for real time data
and batch data, bringing the simplicity and elegance of data integration
via YAML to describe the data movement and transformation in a data
pipeline.

Please check out the release blog post for an overview of the release:
https://flink.apache.org/2024/06/18/apache-flink-cdc-3.1.1-release-announcement/

The release is available for download at:
https://flink.apache.org/downloads.html

Maven artifacts for Flink CDC can be found at:
https://search.maven.org/search?q=g:org.apache.flink%20cdc

The full release notes are available in Jira:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12354763

We would like to thank all contributors of the Apache Flink community who
made this release possible!

Regards,
Qingsheng Ren


Elasticsearch 8 - FLINK-26088

2024-06-18 Thread Tauseef Janvekar
Dear Team,
As per https://issues.apache.org/jira/browse/FLINK-26088, elasticsearch 8
support is already added but I do not see it in any documentation.
Also the last version that supports any elasticsearch is 1.17.x.

Can I get the steps on how to integrate with elastic 8 and some sample code
would be appreciated.

Thank you,
Tauseef


[no subject]

2024-06-18 Thread Dat Nguyen Tien



[no subject]

2024-06-18 Thread Dat Nguyen Tien



RE: Problem reading a CSV file with pyflink datastream in k8s with Flink operator

2024-06-18 Thread gwenael . lebarzic
Hello Rob.

This workaround works indeed !

Cdt.
[Logo Orange]

Gwenael Le Barzic


De : Robert Young 
Envoyé : mardi 18 juin 2024 03:54
À : LE BARZIC Gwenael DTSI/SI 
Cc : user@flink.apache.org
Objet : Re: Problem reading a CSV file with pyflink datastream in k8s with 
Flink operator

CAUTION : This email originated outside the company. Do not click on any links 
or open attachments unless you are expecting them from the sender.
ATTENTION : Cet e-mail provient de l'extérieur de l'entreprise. Ne cliquez pas 
sur les liens ou n'ouvrez pas les pièces jointes à moins de connaitre 
l'expéditeur.

Hi Gwenael,

From the logs I thought it was a JVM module opens/exports issue, but I found it 
had a similar issue using a java8 base image too. I think the issue is it's not 
permitted for PythonCsvUtils to call the package-private constructor of 
CsvReaderFormat across class loaders.

One workaround I found is to add a `RUN cp /opt/flink/opt/flink-python* 
/opt/flink/lib/` to the Dockerfile, so that the flink-python-1.18.1,jar is 
present in both /opt and /lib. Then when Flink tries to classload 
org.apache.flink.formats.csv.PythonCsvUtils it will be available to the app 
classloader.

Thanks
Rob Young

On Mon, Jun 17, 2024 at 11:53 PM 
mailto:gwenael.lebar...@orange.com>> wrote:
Hello everyone.

Does someone know how to solve this please ?

Cdt.
[Logo Orange]

Gwenael Le Barzic
Ingénieur technique techno BigData
Orange/OF/DTSI/SI/DATA-IA/SOLID/CXP

Mobile : +33 6 48 70 85 75 

gwenael.lebar...@orange.com

Nouveau lien vers le Portail de suivi des Tickets du 
CXP



Orange Restricted
De : LE BARZIC Gwenael DTSI/SI
Envoyé : vendredi 14 juin 2024 22:02
À : user@flink.apache.org
Objet : Problem reading a CSV file with pyflink datastream in k8s with Flink 
operator

Hello everyone.

I get the following error when trying to read a CSV file with pyflink 
datastream in a k8s environment using the flink operator.
###
  File "/opt/myworkdir/myscript.py", line 30, in 
run_flink_job(myfile)
  File "/opt/myworkdir/myscript.py", line 21, in run_flink_job
csvshem = CsvReaderFormat.for_schema(file_csv_schema)
  File "/opt/flink/opt/python/pyflink.zip/pyflink/datastream/formats/csv.py", 
line 322, in for_schema
items = list(charFrequency[char].items())
  File "/opt/flink/opt/python/py4j-0.10.9.7-src.zip/py4j/java_gateway.py", line 
1322, in __call__
  File "/opt/flink/opt/python/pyflink.zip/pyflink/util/exceptions.py", line 
146, in deco
  File "/opt/flink/opt/python/py4j-0.10.9.7-src.zip/py4j/protocol.py", line 
326, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling 
z:org.apache.flink.formats.csv.PythonCsvUtils.createCsvReaderFormat.
: java.lang.IllegalAccessError: class 
org.apache.flink.formats.csv.PythonCsvUtils tried to access method 'void 
org.apache.flink.formats.csv.CsvReaderFormat.(org.apache.flink.util.function.SerializableSupplier,
 org.apache.flink.util.function.SerializableFunction, java.lang.Class, 
org.apache.flink.formats.common.Converter, 
org.apache.flink.api.common.typeinfo.TypeInformation, boolean)' 
(org.apache.flink.formats.csv.PythonCsvUtils is in unnamed module of loader 
org.apache.flink.util.ChildFirstClassLoader @5d9b7a8a; 
org.apache.flink.formats.csv.CsvReaderFormat is in unnamed module of loader 
'app')
at 
org.apache.flink.formats.csv.PythonCsvUtils.createCsvReaderFormat(PythonCsvUtils.java:48)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown 
Source)
at java.base/java.lang.reflect.Method.invoke(Unknown Source)
at 
org.apache.flink.api.python.shaded.py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at 
org.apache.flink.api.python.shaded.py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:374)
at 
org.apache.flink.api.python.shaded.py4j.Gateway.invoke(Gateway.java:282)
at 
org.apache.flink.api.python.shaded.py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at 
org.apache.flink.api.python.shaded.py4j.commands.CallCommand.execute(CallCommand.java:79)
at 
org.apache.flink.api.python.shaded.py4j.GatewayConnection.run(GatewayConnection.java:238)
at java.base/java.lang.Thread.run(Unknown Source)
###

Here is my dockerfile :
###
FROM flink:1.18.1

RUN apt-get update -y && \
apt-get install -y python3 python3-pip python3-dev && rm -rf 
/var/lib/apt/lists/*
RUN ln 

Help required to fix security vulnerabilities in Flink Docker image

2024-06-18 Thread elakiya udhayanan
Hi Community,

In one of our applications we are using a Fink Docker image and running
Flink as a Kubernetes pod. As per policy, we tried scanning the Docker
image for security vulnerabilities using JFrog XRay and we find that there
are multiple critical vulnerabilities being reported as seen in the below
table. This is the same case for the latest Flink version 1.19.0 as well

| Severity  | Direct Package   | Impacted Package  |
Impacted Package Version | Fixed Versions | Type  | CVE
   |
|---|--|---|---||---||
| Critical  | sha256__c6571bb0f39f334ef... | github.com/golang/go  |
1.11.1| [1.19.8, 1.20.3]   | Go|
CVE-2023-24538 |
| Critical  | sha256__c6571bb0f39f334ef... | github.com/golang/go  |
1.11.1| [1.19.9, 1.20.4]   | Go|
CVE-2023-24540 |
| Critical  | sha256__c6571bb0f39f334ef... | github.com/golang/go  |
1.11.1| [1.19.10, 1.20.5]  | Go|
CVE-2023-29404 |
| Critical  | sha256__c6571bb0f39f334ef... | github.com/golang/go  |
1.11.1| [1.19.10, 1.20.5]  | Go|
CVE-2023-29405 |
| Critical  | sha256__c6571bb0f39f334ef... | github.com/golang/go  |
1.11.1| [1.19.10, 1.20.5]  | Go|
CVE-2023-29402 |
| Critical  | sha256__c6571bb0f39f334ef... | github.com/golang/go  |
1.11.1| [1.16.9, 1.17.2]   | Go|
CVE-2021-38297 |
| Critical  | sha256__c6571bb0f39f334ef... | github.com/golang/go  |
1.11.1| [1.16.14, 1.17.7]  | Go|
CVE-2022-23806 |
| Critical  | sha256__0690274ef266a9a2f... | certifi   |
2020.6.20 | [2023.7.22]| Python|
CVE-2023-37920 |
| Critical  | sha256__c6571bb0f39f334ef... | github.com/golang/go  |
1.11.1| [1.12.6, 1.13beta1]| Go|
CVE-2019-11888 |
| Critical  | sha256__c6571bb0f39f334ef... | github.com/golang/go  |
1.11.1| [1.11.13, 1.12.8]  | Go|
CVE-2019-14809 |

These vulnerabilities are related to the github.com/golang/go and certifi
packages.

Please help me addressing the below questions:
Is there any known workaround for these vulnerabilities while using the
affected Flink versions?
Is there an ETA for a fix for these vulnerabilities in upcoming Flink
releases?
Are there any specific steps recommended to mitigate these issues in the
meantime?
Any guidance or recommendations would be greatly appreciated.

Thanks in advance

Thanks,
Elakiya U