On Mon, May 17, 2021 at 01:22:16PM +0200, Arvid Heise wrote:
> Hi ChangZhuo,
>
> This looks indeed like a bug. I created FLINK-22686 [1] to track it. It
> looks unrelated to reactive mode to me and more related to unaligned
> checkpoints. So, you can try out reactive mode with aligned checkpoints.
Hi,
can you maybe share some details about the code you're running?
Regards
Ingo
On Tue, May 18, 2021 at 5:10 AM 杨建春/00250041 wrote:
> I'm using Flink 1.13.0 with a table function. Why is this error reported?
> What is the reason? Thanks!
>
>
>
> Traceback (most recent call last):
> File
Not all fields need to be provided.
tableEnv.executeSql(DDLSourceSQLManager.cdcCreateTableSQL("order_info"));
tableEnv
    .toRetractStream(tableEnv.from("order_info"), Row.class)
    .filter((FilterFunction<Tuple2<Boolean, Row>>) booleanRowTuple2 -> booleanRowTuple2.f0)
    .map((MapFunction<Tuple2<Boolean, Row>, Row>)
Hi,
1) The cause of the exception:
The dependencies added via pipeline.jars / pipeline.classpaths are used to
construct the user class loader.
For your job, the exception happens when HadoopUtils.getHadoopConfiguration is
called. The reason is that HadoopUtils is provided by Flink, which is
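A minimal sketch (the jar path is hypothetical) of how pipeline.jars is set in
the Java API so that the listed jars end up on the user class loader; in
PyFlink the same key can be set on the table config:

    import java.util.Collections;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.configuration.PipelineOptions;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    Configuration conf = new Configuration();
    // Jars listed under pipeline.jars are shipped with the job and loaded
    // by the user class loader, not the Flink framework class loader.
    conf.set(PipelineOptions.JARS, Collections.singletonList("file:///opt/libs/my-udf.jar"));
    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment(conf);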
----
??: ""<62...@163.com;
: 2021??5??17??(??) 6:23
??: "user-zh"
My Flink SQL job is as follows:
SELECT
product_name,
window_start,
window_end,
CAST(SUM(trans_amt) AS DECIMAL(24,2)) trans_amt,
CAST(COUNT(order_no) AS BIGINT) trans_cnt,
-- LOCALTIMESTAMP AS insert_time,
'微支付事业部' AS bus_name
FROM (
The MySQL sink table is defined as follows:
CREATE TABLE XXX (
) Engine=InnoDB AUTO_INCREMENT=31 DEFAULT
When defining a source table in Flink SQL, which connectors allow declaring only a subset of the fields, and which require the full set of columns?
The connectors we commonly use are kafka, jdbc, clickhouse and mysql-cdc.
Does CDC require the full set of columns of the corresponding table? If a new column is added to the upstream MySQL business table, will the Flink SQL job fail?
Does a Kafka table definition support partial fields?
Hi all,
Most of the posts say that "Most of the times, the crashes in ZIP_GetEntry
occur when the jar file being accessed has been modified/overwritten while
the JVM instance was running.", but I do not know when and which jar file
was modified for the job running in Flink.
I'm using Flink 1.13.0 with a table function. Why is this error reported? What
is the reason? Thanks!
Traceback (most recent call last):
File "D:/yjc/AIOPS/Flink/UDTFcallstack.py", line 149, in
t_result.wait()
File "D:\Program Files
(x86)\python36\lib\site-packages\pyflink\table\table_result.py",
Hello
Could you describe your problem more concretely, e.g. provide some sample data and the SQL that reproduces the issue, so that others can help investigate.
Best,
Leonard Xu
> On May 18, 2021, at 09:49, 清酌 wrote:
>
> Hello!
> When using Flink SQL CDC on version 1.11, joining multiple tables with SQL to
> build a real-time wide table, I often find the wide table's data is inaccurate, especially when the source tables change via CDC. For example, 10 rows of the wide table should have changed, but only 3 actually did.
> I'd like to know whether this is caused by my misuse or is a problem of version 1.11, and if it is a version problem, whether it will be fixed later.
If you only need the latest dimension-table data, you can use processing time; then every fact-table row looks up the latest dimension data in MySQL in real time.
If the business can accept approximately-latest dimension data, you can also cache the lookup results to reduce MySQL I/O, via these two options:
lookup.cache.max-rows
lookup.cache.ttl
祝好,
Leonard
[1]
https://ci.apache.org/projects/flink/flink-docs-master/zh/docs/connectors/table/jdbc/#%E8%BF%9E%E6%8E%A5%E5%99%A8%E5%8F%82%E6%95%B0
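A sketch of what that looks like, with hypothetical table and column names (the
fact table needs a processing-time attribute, e.g. proc_time AS PROCTIME()):

    tableEnv.executeSql(
        "CREATE TABLE dim_product (" +
        "  id BIGINT," +
        "  product_name STRING," +
        "  PRIMARY KEY (id) NOT ENFORCED" +
        ") WITH (" +
        "  'connector' = 'jdbc'," +
        "  'url' = 'jdbc:mysql://localhost:3306/mydb'," +
        "  'table-name' = 'dim_product'," +
        "  'lookup.cache.max-rows' = '10000'," +  // cache at most 10k rows
        "  'lookup.cache.ttl' = '1min'" +         // entries expire after 1 minute
        ")");
    // Processing-time lookup join: each fact row fetches the (cached) latest row.
    tableEnv.executeSql(
        "SELECT o.order_no, d.product_name " +
        "FROM orders AS o " +
        "LEFT JOIN dim_product FOR SYSTEM_TIME AS OF o.proc_time AS d " +
        "ON o.product_id = d.id");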
ping?
On Tue, May 11, 2021 at 11:31 PM Jin Yi wrote:
> hello. thanks ahead of time for anyone who answers.
>
> 1. verifying my understanding: for a kafka source that's partitioned on
> the same piece of data that is later used in a keyBy, if we are relying on
> the kafka timestamp as the
Hi all, we have a scenario with a LEFT JOIN against a MySQL dimension table. The
dimension table updates slowly, roughly one row every 10 minutes, while the fact table is fast, tens of thousands of rows per second, and we need to join against the latest data. With the mysql-cdc approach, watermark alignment would introduce a long delay. Is there a good way to join the latest data, e.g. using processing time?
Does anyone know?
On 2021-05-13 17:20:15, "casel.chen" wrote:
How can Flink SQL convert a changelog stream into an append-only stream?
After ingesting a changelog stream via CDC, window aggregations such as Tumble, Hop and Session can no longer be used; the only way left to aggregate is state TTL +
GROUP BY timestamp.
Is there a way to convert the changelog stream into an append-only stream so that the window aggregations above become usable again? Thanks!
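One workaround sketch at the DataStream level (tableEnv is a
StreamTableEnvironment; "cdc_table" and "append_view" are hypothetical names).
Note the caveat that every update then appears as a brand-new row, so this only
approximates append semantics:

    DataStream<Row> appendOnly = tableEnv
            .toRetractStream(tableEnv.from("cdc_table"), Row.class)
            .filter((FilterFunction<Tuple2<Boolean, Row>>) t -> t.f0) // keep accumulate messages
            .map((MapFunction<Tuple2<Boolean, Row>, Row>) t -> t.f1)
            .returns(Row.class); // Row loses its type info in lambdas
    // Register the filtered stream as an append-only view for windowed SQL.
    tableEnv.createTemporaryView("append_view", appendOnly);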
Does anyone know?
On 2021-05-13 08:19:24, "casel.chen" wrote:
>How can I modify the execution plan of a Flink SQL job? For example, set different parallelisms for upstream and downstream operators, or manually break operator chains.
>I know how to obtain the execution plan of a Flink SQL job, but how do I intervene in it manually? Could someone answer this? Thanks!
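There is no direct plan-rewriting hook for SQL that I know of, but one knob that
does exist is the default parallelism of SQL operators; a sketch (the value is
arbitrary):

    tableEnv.getConfig().getConfiguration()
            .setInteger("table.exec.resource.default-parallelism", 4);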
Hello,
Is the new KafkaSource/KafkaSourceBuilder ready to be used? If so, is KafkaSource
state compatible with the legacy FlinkKafkaConsumer? For example, if I replace
FlinkKafkaConsumer with KafkaSource, will offsets continue from what we had in
FlinkKafkaConsumer?
Thanks,
Alexey
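For reference, a sketch of the new builder (broker, topic and group id are
placeholders); as far as I know the operator state is not migrated
automatically, but starting from the group's committed offsets is one way to
hand over from an old FlinkKafkaConsumer that used the same group id:

    import org.apache.flink.api.common.eventtime.WatermarkStrategy;
    import org.apache.flink.api.common.serialization.SimpleStringSchema;
    import org.apache.flink.connector.kafka.source.KafkaSource;
    import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
    import org.apache.kafka.clients.consumer.OffsetResetStrategy;

    KafkaSource<String> source = KafkaSource.<String>builder()
            .setBootstrapServers("broker:9092")
            .setTopics("input-topic")
            .setGroupId("my-group")
            // resume from the offsets the old consumer committed to Kafka,
            // falling back to earliest if none exist
            .setStartingOffsets(OffsetsInitializer.committedOffsets(OffsetResetStrategy.EARLIEST))
            .setValueOnlyDeserializer(new SimpleStringSchema())
            .build();
    env.fromSource(source, WatermarkStrategy.noWatermarks(), "kafka-source");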
I think it should be possible to use a Helm pre-upgrade hook to take a savepoint
and stop the currently running job, and then Helm will upgrade the image tags. The
problem is that if you hit a timeout while taking the savepoint, it is not clear
how to recover from that situation.
Alexey
Hi Pedro,
There is currently no official Kubernetes Operator for Flink and, by
extension, there is no official Helm chart. It would be relatively easy to
create a chart for simply deploying standalone Flink resources via the
Kubernetes manifests described here [1], though it would leave out the
Hi everyone!
We’re very excited to launch the Call for Presentations [1] for Flink
Forward Global 2021! If you have an inspiring Apache Flink use case,
real-world application or best practice, Flink Forward is the platform for
you to share your experiences.
We look forward to receiving your
On Mon, 17 May 2021, 17:38 Priyanka Manickam,
wrote:
> Hi Yang,
>
> I have checked the documents you have shared in the previous mail.
> But i am not sure what we have to give for the $(CONTAINER_SCRIPTS) in
> the line
Youngwoo,
I was trying to configure my flink/MINIO connector based on
"s3_setup_with_provider"
in the end-to-end test setup script; this requires that the AWS_ACCESS and
AWS_SECRET keys be set as ENV variables, but I think that needs to be set
during the installation of MinIO, not Flink.
I'm just not
Hello,
Forwarding this question from the dev mailing list in case this is a more
appropriate list.
Does flink have an official Helm Chart? I haven't been able to find any,
the closest most up-to-date one seems to be
https://github.com/GoogleCloudPlatform/flink-on-k8s-operator.
Is this correct or
Hey Robert,
I'm not sure why you need to set an env var, but that's a Flink configuration;
See
https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/deployment/filesystems/s3/#configure-path-style-access
Thanks,
Youngwoo
On Mon, May 17, 2021 at 10:47 PM, Robert Cullen wrote:
> Arvid, Is
Hi,
Flink allows you to enable latency tracking [1] and exposes several metrics
that might be what you are looking for [2, look for e.g. "numRecordsIn" or
"numBytesIn"]. You can query these metrics using the REST API [3] or by
registering a metrics reporter [4] that exposes them. As noted in the
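For example (a sketch; host, job and vertex ids are placeholders), latency
tracking is enabled in flink-conf.yaml via

    metrics.latency.interval: 30000

and a per-operator metric can be fetched over REST with

    curl "http://<jobmanager>:8081/jobs/<job-id>/vertices/<vertex-id>/metrics?get=numRecordsIn"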
Arvid, Is there a way to set environment variables in the flink-conf.yaml?
Setting them on the CLI isn't working.
On Sat, May 15, 2021 at 3:56 PM Arvid Heise wrote:
> Hi Robert,
>
> we have an end-to-end test with minio. You have to use
> s3.path.style.access; I think the actual key depends on
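For MinIO, a flink-conf.yaml sketch along those lines (endpoint and credentials
are placeholders) might be:

    s3.endpoint: http://minio:9000
    s3.path.style.access: true
    s3.access-key: <access-key>
    s3.secret-key: <secret-key>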
Hi,
I am not familiar with hibench. Does the Flink UI show the configured
parallelism of 20 for the job, and there are indeed 20 partitions on the
Kafka topic you consume?
Which Flink version are you running? The repo
https://github.com/Intel-bigdata/HiBench mentions Flink 1.0.3, which is *very
When using hibench to test Flink, the jobs submitted are built-in applications
of hibench, that is, the code logic of programs like wordcount cannot be
changed.
How can I get the throughput and processing delay of Flink?
In addition, in the /report/hibench.report file of hibench, we can't get
Hello,
I was wondering whether anyone has tried and/or had any luck creating a
custom catalog with Iceberg + Flink via the Python API (
https://iceberg.apache.org/flink/#custom-catalog)?
When doing so, the docs mention that dependencies need to be specified via
*pipeline.jars* /
Hi,
On Sat, May 15, 2021 at 5:07 PM wrote:
> First I was told that my application need only perform keyed aggregation
> of streaming IoT data on a sliding window. Flink seemed the obvious choice.
>
> Then I was told that the window size must be configurable, taking on one
> of 5 possible
Hi Suchithra,
this is very likely a version mixup: Can you make sure all jars in your
classpath are Flink 1.11.1?
On Mon, May 17, 2021 at 2:05 PM V N, Suchithra (Nokia - IN/Bangalore) <
suchithra@nokia.com> wrote:
> Hi,
>
>
>
> With flink 1.11.1 version, taskmanager initialization is
When testing Flink with HiBench, the hibench.report file under HiBench's report directory contains no throughput-related information.
How can we obtain Flink's throughput and processing latency when testing it with HiBench?
Hi,
With flink 1.11.1 version, taskmanager initialization is failing with below
error. Could you please help debug the issue?
log":"[main] org.apache.flink.runtime.io.network.netty.NettyConfig NettyConfig
[server address: /0.0.0.0, server port: 4121, ssl enabled: false, memory
segment size
Hi Yun,
That’s very helpful and good to know that the problem/use-case has been thought
about. Since my need is probably shorter-term than later, I’ll likely need to
explore a workaround.
Do you know of an approach that might not require the use of check pointing and
restarting? I was looking
Hey Arvid,
I found that it's a constant in Flink.
https://github.com/apache/flink/blob/master/flink-formats/flink-avro/src/main/java/org/apache/flink/formats/avro/typeutils/AvroSchemaConverter.java#L307
I believe it would be good to substitute 'record' with 'Record'.
What do you think?
Thanks,
Hi ChangZhuo,
This looks indeed like a bug. I created FLINK-22686 [1] to track it. It
looks unrelated to reactive mode to me and more related to unaligned
checkpoints. So, you can try out reactive mode with aligned checkpoints.
If you can provide us with the topology, we can also fix it soonish:
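A sketch of forcing aligned checkpoints (the interval is arbitrary):

    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    env.enableCheckpointing(60_000); // checkpoint every 60s
    env.getCheckpointConfig().enableUnalignedCheckpoints(false); // aligned checkpoints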
Hi Youngwoo,
You can try to use aliases for it [1].
Even better would be to use a different name for the record. In general,
since Avro originally comes from the Java world, it's more common to use
camel case for record names.
[1] https://avro.apache.org/docs/current/spec.html#Aliases
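A sketch of the renamed schema with an alias back to the old name (showing only
the dt field from the schema posted elsewhere in this thread):

    {
      "type": "record",
      "name": "Record",
      "aliases": ["record"],
      "fields": [
        { "name": "dt", "type": ["null", { "type": "int", "logicalType": "date" }] }
      ]
    }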
On Mon,
Hi Adi,
To me, this looks like a version conflict of some kind. Maybe you use
different Avro versions for your user program and on your Flink cluster.
Could you check that you don't have conflicting versions on your classpath?
It would also be helpful to have a minimal example that allows
How can we ensure that a stateful job automatically recovers from the most recent checkpoint when the cluster restarts?
From the source code, it happens in the catch block (the error presumably occurred during executeCheckpointing, but there is also a NullPointerException inside the catch block that is not caught, causing the program to exit):

if (LOG.isDebugEnabled()) {
    LOG.debug("{} - did NOT finish synchronous part of checkpoint {}. " +
            "Alignment duration:
One small addition: the old mapping seems to use
SubtaskStateMapper.RANGE whereas the new mapping seems to use
SubtaskStateMapper.ROUND_ROBIN.
On Mon, May 17, 2021 at 11:56 AM Till Rohrmann wrote:
> Hi ChangZhuo Chen,
>
> This looks like a bug in Flink. Could you provide us with the
Hi ChangZhuo,
IIRC, even if you have specified a savepoint when starting, the JobManager
could recover from the latest checkpoint when the JobManager failed.
Because when recovering, DefaultCompletedCheckpointStore will sort all the
checkpoints(including the savepoint) and pick the latest one.
So,
Hi ChangZhuo Chen,
This looks like a bug in Flink. Could you provide us with the logs of the
run and more information about your job? In particular, what does your
topology look like?
My suspicion is the following: You have an operator with two inputs. One
input is keyed whereas the other input
Hi Zerah,
I guess you could provide a Python implementation for
ConfluentRegistryAvroDeserializationSchema if needed. It’s just a wrapper for
the Java implementation and so it will be very easy to implement. You could
take a look at AvroRowDeserializationSchema [1] as an example.
Regards,
Hi,
I have the following use case:
1. Read streaming data from Kafka topic using Flink Python API
2. Apply transformations on the data stream
3. Write back to different kafka topics based on the incoming data
Input data is coming from Confluent Avro Producer. By using the existing
Hi,
I have a table backed by confluent avro format and the generated schema
from flink looks like following:
{
  "type": "record",
  "name": "record",
  "fields": [
    {
      "name": "dt",
      "type": [
        "null",
        {
          "type": "int",
          "logicalType": "date"
Could you share your pod-template.yaml or check whether the container name
is configured to "flink-main-container"?
Best,
Yang
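A minimal pod-template.yaml sketch (the volume part is hypothetical); Flink's
native Kubernetes integration requires the main container to be named
flink-main-container, and its image then comes from kubernetes.container.image.
If you apply such a template directly with kubectl instead, Kubernetes itself
insists on an image, which would match the error below:

    apiVersion: v1
    kind: Pod
    metadata:
      name: taskmanager-pod-template
    spec:
      containers:
        - name: flink-main-container
          volumeMounts:
            - name: scripts
              mountPath: /opt/scripts
      volumes:
        - name: scripts
          emptyDir: {}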
ChangZhuo Chen (陳昌倬) wrote on Mon, May 17, 2021 at 5:20 PM:
> On Mon, May 17, 2021 at 01:29:55PM +0530, Priyanka Manickam wrote:
> > Hi All,
> >
> > Do we required to add any image
On Mon, May 17, 2021 at 01:29:55PM +0530, Priyanka Manickam wrote:
> Hi All,
>
> > Do we need to add an image for flink-main-container in the
> > pod-template.yaml file? It is giving an error saying
> > "spec.containers(0).image value required".
>
>
> Could anyone help with this please
Hi,
You
Thanks for reading this email.
According to the introduction, the identity program in hibench reads data from
Kafka and then writes it back to Kafka.
When using the identity program in HiBench to test Flink, we set the
parallelism to 20 in the flink.conf file in the conf directory of
According to the introduction, identity reads data from Kafka and then writes it back to Kafka.
When testing Flink with the identity program in HiBench, the parallelism was set to 20 in the flink.conf file under HiBench's conf directory.
After submitting the job, the UI shows only a single subtask, i.e. only one slot on one node was assigned a task. How can we get multiple subtasks when testing Flink with identity?
(It seems images never display when attached, so no screenshot is provided.)
Hi All,
Do we need to add an image for flink-main-container in the
pod-template.yaml file? It is giving an error saying
"spec.containers(0).image value required".
Could anyone help with this please?
Thanks,
Priyanka Manickam
On Thu, 22 Apr 2021, 08:41 Milind Vaidya, wrote:
> Hi
>
> I see
Hi Milind,
A job can be stopped with a savepoint in the following way [1]: ./bin/flink
stop --savepointPath [:targetDirectory] :jobId
Best,
Matthias
[1]
https://ci.apache.org/projects/flink/flink-docs-master/docs/ops/state/savepoints/#stopping-a-job-with-savepoint
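An example invocation (target directory and job id are placeholders):

    ./bin/flink stop --savepointPath /tmp/flink-savepoints <jobId>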
On Sun, May 16, 2021 at 1:12
Yes, it does.
Internally, each re-scheduling is performed as stop-and-resume the job,
similar to a failover. Without checkpoints, the job will always restore
from the very beginning.
Thank you~
Xintong Song
On Mon, May 17, 2021 at 2:54 PM Alexey Trenikhun wrote:
> Hi Xintong,
> Does
Hi,
I agree that both `connect` and `registerTableSource` are useful for
generating Table API pipelines. It is likely that both API methods will
get a replacement in the near future.
Let me explain the current status briefly:
connect(): The CREATE TABLE DDL evolved faster than connect().
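For illustration, a sketch of the DDL route (table, topic and broker are
hypothetical) that currently covers what connect() was used for:

    tableEnv.executeSql(
        "CREATE TABLE my_source (" +
        "  id BIGINT," +
        "  name STRING" +
        ") WITH (" +
        "  'connector' = 'kafka'," +
        "  'topic' = 'my-topic'," +
        "  'properties.bootstrap.servers' = 'broker:9092'," +
        "  'scan.startup.mode' = 'earliest-offset'," +
        "  'format' = 'json'" +
        ")");
    Table t = tableEnv.from("my_source");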
Hi Xintong,
Does reactive mode need checkpoint for re-scheduling ?
Thanks,
Alexey
From: Xintong Song
Sent: Sunday, May 16, 2021 7:30:15 PM
To: Flink User Mail List
Subject: Re: reactive mode and back pressure
Hi Alexey,
I don't think the new reactive mode
You can check if there is a configuration option listed here:
https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/deployment/config/
If it is, you can add it to conf/flink-conf.yaml.
Maybe others have other pointers.
Otherwise you will need to use the Table API instead of SQL
Hi Rion,
I think FLIP-150 [1] should be able to solve this scenario.
Since FLIP-150 is still under discussion, a temporary method that comes to
mind for now might be:
1. Write a first job that reads from Kafka and updates the broadcast state of some
operator. The job
would keep the source alive after all
When I upgraded from Flink 1.10.0 to Flink 1.12.0, I was unable to restore the savepoint
and got the following error:
2021-05-14 22:02:44,716 WARN org.apache.flink.metrics.MetricGroup
[] - The operator name Calc(select=[((CAST((log_info
get_json_object2
Hi Alexey,
Flink supports rescaling from a normal checkpoint if you are not changing
your application too much. So if normal checkpointing works, you can just
use that for rescaling by using Retained Checkpoints and supply the path on
resume at the place where you supplied the savepoint path
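A sketch of the two pieces involved (paths and parallelism are placeholders):

    // In the job: keep checkpoints around when the job is cancelled.
    env.getCheckpointConfig().enableExternalizedCheckpoints(
            CheckpointConfig.ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION);

and on resume, pass the retained checkpoint path where a savepoint path would go:

    ./bin/flink run -s hdfs:///checkpoints/<job-id>/chk-<n> -p <new-parallelism> job.jar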