Hi,
I'm using Flink (v1.11.2) and would like to use the StreamingFileSink for
writing into HDFS in Parquet format.
As the documentation says, I have added the dependency:
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-parquet_${scala.binary.version}</artifactId>
    <version>${flink.version}</version>
</dependency>
And this is my file sink definition:
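A minimal sketch of such a sink definition, assuming an Avro SpecificRecord class CovidEvent, a placeholder HDFS path, and an assumed input stream (none of these are from the original mail):

import org.apache.flink.core.fs.Path
import org.apache.flink.formats.parquet.avro.ParquetAvroWriters
import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink

// Bulk-format sink that writes Avro SpecificRecords as Parquet files.
val sink: StreamingFileSink[CovidEvent] = StreamingFileSink
  .forBulkFormat(
    new Path("hdfs:///data/covid/parquet"), // target directory (placeholder)
    ParquetAvroWriters.forSpecificRecord(classOf[CovidEvent]))
  .build()

stream.addSink(sink) // `stream` is an assumed DataStream[CovidEvent]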
https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/project-configuration.html#create-project
Best,
Jan
From: Dawid Wysakowicz
Sent: Thursday, 14 January 2021 12:42
To: Jan Oelschlegel ; user@flink.apache.org
Subject: Re: StreamingFileSink with ParquetAvroWriters
Hi Jan,
Could
To: Jan Oelschlegel ; user@flink.apache.org
Subject: Re: StreamingFileSink with ParquetAvroWriters
Hi Jan,
Could you try adding this dependency?
<dependency>
    <groupId>org.apache.parquet</groupId>
    <artifactId>parquet-avro</artifactId>
    <version>1.11.1</version>
</dependency>
Best,
Yun
-- Original Mail --
Sender: Jan
Yes, after unzipping it is in the jar:
[inline screenshot of the jar contents]
From: Dawid Wysakowicz
Sent: Friday, 15 January 2021 12:10
To: Jan Oelschlegel ; user@flink.apache.org
Subject: Re: RE: StreamingFileSink with ParquetAvroWriters
Have you checked if the class (org/apache/parquet
Sorry, wrong build. It is not in the jar.
From: Jan Oelschlegel
Sent: Friday, 15 January 2021 12:52
To: Dawid Wysakowicz ; user@flink.apache.org
Subject: RE: RE: StreamingFileSink with ParquetAvroWriters
Yes, after unzipping it is in the jar:
[inline screenshot of the jar contents]
Hi,
I'm looking for a comfortable way to read a CSV file with the DataStream API
in Flink 1.11, without using the Table/SQL API first.
This is my first approach:
val typeInfo =
  TypeInformation.of(classOf[CovidEvent]).asInstanceOf[PojoTypeInfo[CovidEvent]]
// completing the cut-off constructor call; the file path is a placeholder
val csvInputFormat =
  new PojoCsvInputFormat[CovidEvent](new Path("/path/to/covid.csv"), typeInfo)
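For completeness, a hedged sketch of how this format could then be consumed as a stream (the environment wiring is an assumption, not part of the original snippet):

import org.apache.flink.streaming.api.scala._

// Hypothetical wiring: read the CSV file through the POJO-based input format.
val env = StreamExecutionEnvironment.getExecutionEnvironment
val events: DataStream[CovidEvent] = env.createInput(csvInputFormat)(typeInfo)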
But then you need a way to consume a database as a DataStream.
I found this one https://github.com/ververica/flink-cdc-connectors.
I want to implement a similar use case, but I don't know how to parse the
SourceRecord (which comes from the connector) into a POJO for further
processing.
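With the CDC connector this is typically done by implementing a custom DebeziumDeserializationSchema. A minimal sketch, assuming connector version 1.x (package com.alibaba.ververica, which may differ in later versions) and hypothetical CovidEvent fields:

import com.alibaba.ververica.cdc.debezium.DebeziumDeserializationSchema
import org.apache.flink.api.common.typeinfo.TypeInformation
import org.apache.flink.util.Collector
import org.apache.kafka.connect.data.Struct
import org.apache.kafka.connect.source.SourceRecord

class CovidEventDeserializer extends DebeziumDeserializationSchema[CovidEvent] {
  override def deserialize(record: SourceRecord, out: Collector[CovidEvent]): Unit = {
    // The Debezium envelope carries the new row state in the "after" struct.
    val value = record.value().asInstanceOf[Struct]
    val after = value.getStruct("after")
    if (after != null) {
      // Field names here are assumptions about the table schema.
      out.collect(new CovidEvent(after.getString("id"), after.getInt64("eventTime")))
    }
  }

  override def getProducedType: TypeInformation[CovidEvent] =
    TypeInformation.of(classOf[CovidEvent])
}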
Best,
Hi all,
I'm using Flink 1.11 with the DataStream API. I would like to write my results
in Parquet format into HDFS.
Therefore I generated an Avro SpecificRecord with the avro-maven-plugin:
<plugin>
    <groupId>org.apache.avro</groupId>
    <artifactId>avro-maven-plugin</artifactId>
    <version>1.8.2</version>
</plugin>
to look up such conflicts manually and
then choose the same version as in the Flink dependencies?
Best,
Jan
From: Till Rohrmann
Sent: Wednesday, 3 February 2021 11:41
To: Jan Oelschlegel
Cc: user
Subject: Re: AbstractMethodError while writing to parquet
Hi Jan,
it looks to me as if you might
From: Till Rohrmann
Sent: Thursday, 4 February 2021 10:08
To: Jan Oelschlegel
Cc: user
Subject: Re: AbstractMethodError while writing to parquet
I guess it depends on where the other dependency is coming from. If you have
multiple dependencies that conflict, then you have to resolve it. One way to
detect
Hi,
I have a question regarding the FlinkSQL connector for Kafka. I have 3 Kafka
partitions and 1 Kafka SQL source connector (parallelism 1). The data within
the Kafka partitions are sorted by an event-time field, which is also my
event time in Flink. My watermark is generated with a delay of
By using the DataStream API with the same business logic I'm getting no
dropped events.
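For comparison, a sketch of the per-partition watermarking a DataStream job can use (the topic, bootstrap servers, and CovidEvent timestamp getter are assumptions). When the WatermarkStrategy is assigned on the FlinkKafkaConsumer itself, watermarks are tracked per Kafka partition, even if a single source subtask reads several partitions:

import java.time.Duration
import java.util.Properties
import org.apache.flink.api.common.eventtime.{SerializableTimestampAssigner, WatermarkStrategy}
import org.apache.flink.api.common.serialization.DeserializationSchema
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer

def perPartitionConsumer(schema: DeserializationSchema[CovidEvent]): FlinkKafkaConsumer[CovidEvent] = {
  val props = new Properties()
  props.setProperty("bootstrap.servers", "localhost:9092") // placeholder
  val consumer = new FlinkKafkaConsumer[CovidEvent]("covid-events", schema, props)
  // Assigning the strategy on the consumer enables per-partition watermarks.
  consumer.assignTimestampsAndWatermarks(
    WatermarkStrategy
      .forBoundedOutOfOrderness[CovidEvent](Duration.ofHours(12))
      .withTimestampAssigner(new SerializableTimestampAssigner[CovidEvent] {
        override def extractTimestamp(e: CovidEvent, ts: Long): Long = e.getEventTime
      }))
  consumer
}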
From: Jan Oelschlegel
Sent: Wednesday, 17 February 2021 19:18
To: user
Subject: Kafka SQL Connector: dropping events if more partitions than source
tasks
Hi,
I have a question regarding Fli
If I increase the watermark delay, the number of dropped events gets lower. But
why does the DataStream API job still run fine with a 12-hour watermark delay?
By the way, I'm using Flink 1.11. It would be nice if someone could give me
some advice.
Best,
Jan
From: Jan Oelschlegel
Sent: Thursday
https://issues.apache.org/jira/browse/FLINK-20041
Best,
Jan
From: Arvid Heise
Sent: Wednesday, 24 February 2021 14:10
To: Jan Oelschlegel
Cc: user ; Timo Walther
Subject: Re: Kafka SQL Connector: dropping events if more partitions than
source tasks
Hi Jan,
Are you running on historic data? Then
February 2021, 00:04
To: Jan Oelschlegel
Cc: Arvid Heise ; user ; Timo Walther
Subject: Re: Kafka SQL Connector: dropping events if more partitions than
source tasks
Hi Jan,
What you are observing is correct for the current implementation.
Current watermark generation is based on subtask
case I have observed that with a larger number of
source tasks no results are produced.
Best,
Jan
From: Shengkai Fang
Sent: Friday, 26 February 2021 15:32
To: Jan Oelschlegel
Cc: Benchao Li ; Arvid Heise ; user
; Timo Walther
Subject: Re: Kafka SQL Connector: dropping events if more
https://issues.apache.org/jira/browse/FLINK-20041
Best,
Jan
From: Shengkai Fang
Sent: Saturday, 27 February 2021 05:03
To: Jan Oelschlegel
Cc: Benchao Li ; Arvid Heise ; user
; Timo Walther
Subject: Re: Kafka SQL Connector: dropping events if more partitions than
source tasks
Hi Jan.
Thanks for your reply. Do you set the
Hi all,
I would like to know how far state schema evolution is possible when using
Flink's SQL API. Which query changes can I make without disrupting the schema
of my savepoint?
The documentation describes, only for the DataStream API, the do's and don'ts
regarding a safe sch
Maybe this strategy of evolving an application without the need for a state
restore could help you:
https://docs.ververica.com/v2.3/user_guide/sql_development/sql_scripts.html#sql-script-changes
-----Original Message-----
From: Chesnay Schepler
Sent: Wednesday, 3 March 2021 21:35
To: so
Wysakowicz
Cc: Jan Oelschlegel ; user
Subject: Re: State Schema Evolution within SQL API
Hello Dawid,
I'm interested in this discussion because I'm currently trying to upgrade Flink
from 1.9 to 1.12 for a bunch of SQL jobs running in production. From what you
said, this seems to
I think simply changing the parallelism or upgrading the underlying Flink
version in a cluster should be no problem.
But you want to change your queries, right?
Best,
Jan
From: Jan Oelschlegel
Sent: Thursday, 4 March 2021 12:26
To: XU Qinghui ; Dawid Wysakowicz
Cc: user
Subject: RE
Hi,
I'm using Flink 1.11.3. With Prometheus and Grafana I get some metrics from my
standalone cluster.
This simple calculation yields very little RAM (less than 10 MB):
flink_jobmanager_Status_JVM_Memory_NonHeap_Committed -
flink_jobmanager_Status_JVM_Memory_NonHeap_Used
Is that the r
Hi,
I'm using Flink 1.11.3 and running a batch job. In the JobManager log I see
that all operators switched from RUNNING to FINISHED. Then there is a timeout
of the JobManager, and after some pause the overall status switches from
RUNNING to FINISHED.
Why is there a big gap in betwee