Hi Rion,
Sorry for the late reply. Another, simpler method might indeed be to have the
operator read the data directly from Kafka in initializeState to initialize its state.
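A minimal sketch of that idea, assuming a CheckpointedFunction-based operator; the topic name, bootstrap servers, group id and state descriptor are made up, and the single poll() stands in for a real bootstrap loop:

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.api.common.state.ListState;
import org.apache.flink.api.common.state.ListStateDescriptor;
import org.apache.flink.runtime.state.FunctionInitializationContext;
import org.apache.flink.runtime.state.FunctionSnapshotContext;
import org.apache.flink.streaming.api.checkpoint.CheckpointedFunction;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class BootstrappedMapper extends RichMapFunction<String, String>
        implements CheckpointedFunction {

    private transient ListState<String> bootstrapState;

    @Override
    public void initializeState(FunctionInitializationContext context) throws Exception {
        bootstrapState = context.getOperatorStateStore()
                .getListState(new ListStateDescriptor<>("bootstrap", String.class));

        if (!context.isRestored()) {
            // Not restoring from a checkpoint/savepoint: seed the state directly from Kafka.
            Properties props = new Properties();
            props.setProperty("bootstrap.servers", "kafka:9092");   // assumption
            props.setProperty("group.id", "state-bootstrap");       // assumption
            props.setProperty("key.deserializer", StringDeserializer.class.getName());
            props.setProperty("value.deserializer", StringDeserializer.class.getName());
            props.setProperty("auto.offset.reset", "earliest");
            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singletonList("bootstrap-topic")); // assumption
                for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofSeconds(5))) {
                    bootstrapState.add(record.value());
                }
            }
        }
    }

    @Override
    public void snapshotState(FunctionSnapshotContext context) {
        // Nothing to do: the ListState above is snapshotted by Flink automatically.
    }

    @Override
    public String map(String value) {
        return value; // real logic would consult the bootstrapped state here
    }
}
```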
Best,
Yun
-- Original Mail --
Sender: Rion Williams
Send Date: Mon May 17 19:53:35 2021
Hi,
When we use Flink 1.12 to read an ACID Hive table, we get the error "Reading or writing ACID table %s is
not supported". We tried modifying the source code to lift this restriction, but then ran into further
errors, such as a cast failure converting BytesColumnVector to LongColumnVector.
Background: in production we want to use Flink for ETL and other data migration work; the corresponding Hive versions are around Hive 3.0, or Hive
Try changing the format from json to debezium-json.
Thanks! I have updated the details and the task manager log in
https://issues.apache.org/jira/browse/FLINK-22688.
Regards,
-Gary
On Tue, 18 May 2021 at 16:22, Matthias Pohl wrote:
> Sorry for not getting back earlier. I missed that thread. It looks like
> a wrong assumption on our end. Hence,
I have a DataStream running in Batch Execution mode within YARN on EMR.
My job failed an hour in, twice in a row, because the task manager
heartbeat timed out.
Can somebody point me to how to restart a job in this situation? I can't
find that section of the documentation.
Thank you.
You can take the difference of the source's numRecordsOut values between two consecutive metric reports (at report time, or before the metrics are persisted), and then accumulate the deltas per time window when presenting the result.
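A minimal sketch of that delta idea, assuming you already hold a handle to the source's numRecordsOut Counter (for example inside a custom reporter):

```java
import org.apache.flink.metrics.Counter;

/** Remembers the previous numRecordsOut reading and returns only the increment per reporting interval. */
public class CounterDelta {
    private long previous;

    public long update(Counter numRecordsOut) {
        long current = numRecordsOut.getCount();
        long delta = current - previous;
        previous = current;
        // Summing these deltas per time bucket downstream gives "records consumed in the last N hours".
        return delta;
    }
}
```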
Yes, we integrate this information into our monitoring system. One of its items is the amount of data successfully consumed from Kafka over an arbitrary time range (e.g., the last day or the last week).
What step size? A rolling "last 24 hours" at any point in time?
zzzyw wrote on Tue, May 18, 2021 at 9:33 PM:
> Hi all,
> I need to count the amount of data Flink has successfully consumed from Kafka in the last n
> hours (e.g., 24 hours). Is there a good approach for this?
Hi,
I am using the DataStream API in Batch Execution Mode, and my "source" is
an S3 bucket with about 500 GB of data spread across many files.
Where does Flink store the results of processed/produced data between
tasks?
There is no way that 500 GB will fit in memory, so I am very curious
Hey all,
I’ve been taking a very TDD-oriented approach to developing many of the Flink
apps I’ve worked on, but recently I’ve encountered a problem that has me
scratching my head.
A majority of my integration tests leverage a few external technologies such as
Kafka and typically a relational
Hey all,
Thanks for the details, John! Hmm, that doesn't look too good either, but it's
probably a different issue with the RMQ source/sink. Hopefully, the new
FLIP-27 sources will help you out there! The upcoming HybridSource in
FLIP-150 [1] might also be interesting to you in finely
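For context, once FLIP-150 lands, wiring up a HybridSource might look roughly like the sketch below; `historical` and `live` are placeholders for already-built FLIP-27 sources, and the exact package and builder API may differ in the released version.

```java
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.connector.source.Source;
import org.apache.flink.connector.base.source.hybrid.HybridSource;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class HybridSourceSketch {

    /** Reads the bounded backfill source first, then switches over to the unbounded live source. */
    public static DataStream<String> backfillThenLive(
            StreamExecutionEnvironment env,
            Source<String, ?, ?> historical,   // e.g. a bounded file source over old data
            Source<String, ?, ?> live) {       // e.g. an unbounded Kafka source

        HybridSource<String> hybrid = HybridSource.builder(historical)
                .addSource(live)
                .build();

        return env.fromSource(hybrid, WatermarkStrategy.noWatermarks(), "hybrid-source");
    }
}
```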
There is already a ticket for this. Note that this functionality should
be implemented in a generic fashion to be usable for all reporters.
https://issues.apache.org/jira/browse/FLINK-17495
On 5/18/2021 8:16 PM, Andrew Otto wrote:
Sounds useful!
On Tue, May 18, 2021 at 2:02 PM Mason Chen
Sounds useful!
On Tue, May 18, 2021 at 2:02 PM Mason Chen wrote:
> Hi all,
>
> Would people appreciate enhancements to the prometheus reporter to include
> extra labels via a configuration, as a contribution to Flink? I can see it
> being useful for adding labels that are not job specific, but
Hi all,
Would people appreciate enhancements to the prometheus reporter to include
extra labels via a configuration, as a contribution to Flink? I can see it
being useful for adding labels that are not job specific, but infra specific.
The change would be nicely integrated with Flink’s
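As a side note, per-job labels can already be attached from user code, since user scope variables added via addGroup(key, value) are exposed as labels by the Prometheus reporter; a minimal sketch with made-up metric and label names:

```java
import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.metrics.Counter;

public class LabelledMapper extends RichMapFunction<String, String> {
    private transient Counter processed;

    @Override
    public void open(Configuration parameters) {
        processed = getRuntimeContext()
                .getMetricGroup()
                .addGroup("environment", "staging")   // hypothetical label key/value
                .counter("processedRecords");         // hypothetical metric name
    }

    @Override
    public String map(String value) {
        processed.inc();
        return value;
    }
}
```

The proposal above is different in that the labels would be infra-level, configured once on the reporter rather than in job code.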
If flink-conf.yaml is read-only, will Flink complain but otherwise work fine?
From: Chesnay Schepler
Sent: Wednesday, May 12, 2021 5:38 AM
To: Alex Drobinsky
Cc: user@flink.apache.org
Subject: Re: After upgrade from 1.11.2 to 1.13.0 parameter
Hi,
I'm trying to read from and write to S3 with Flink 1.12.2. I'm submitting
the job to a local cluster (tar.gz distribution). I do not have a Hadoop
installation running on the same machine. S3 (not Amazon) is running in a
remote location, and I have access to it via an endpoint and access/secret
Hi there,
I have the following (probably very common) use case: I have some lookup data
(100 million records) which changes only slowly (in the range of some thousands
of records per day). My event stream is in the order of tens of billions of events per day,
and each event needs to be enriched from the 100
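One commonly used pattern for this kind of enrichment keeps the slowly changing lookup records in keyed state and joins the event stream against it via a connected stream; a minimal sketch, using plain (key, value) string tuples as stand-ins for the real event and lookup types:

```java
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.co.KeyedCoProcessFunction;
import org.apache.flink.util.Collector;

public class LookupEnrichment {

    /** Connects the event stream with the lookup stream, both keyed by the String key in field f0. */
    public static DataStream<Tuple2<String, String>> enrich(
            DataStream<Tuple2<String, String>> events,    // (key, payload)
            DataStream<Tuple2<String, String>> lookups) { // (key, lookupValue)

        return events.keyBy(value -> value.f0)
                .connect(lookups.keyBy(value -> value.f0))
                .process(new KeyedCoProcessFunction<String, Tuple2<String, String>,
                        Tuple2<String, String>, Tuple2<String, String>>() {

                    private transient ValueState<String> lookupValue;

                    @Override
                    public void open(Configuration parameters) {
                        lookupValue = getRuntimeContext().getState(
                                new ValueStateDescriptor<>("lookup", String.class));
                    }

                    @Override
                    public void processElement1(Tuple2<String, String> event, Context ctx,
                            Collector<Tuple2<String, String>> out) throws Exception {
                        // Enrich with the latest lookup value for this key (may be null before it arrives).
                        out.collect(Tuple2.of(event.f1, lookupValue.value()));
                    }

                    @Override
                    public void processElement2(Tuple2<String, String> lookup, Context ctx,
                            Collector<Tuple2<String, String>> out) throws Exception {
                        lookupValue.update(lookup.f1); // keep only the most recent value per key
                    }
                });
    }
}
```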
Hey all,
Yeah, I'd be interested to see the Helm pre-upgrade hook setup, though I'd
agree with you, Alexey, that it does not provide enough control to be a
stable solution.
@Pedro Silva I don't know if there are talks of an
official operator yet, but Kubernetes support is important to the
Background: I want to try Flink SQL's deduplicate on a stream that has a *primary key*. I found that:
1. If I obtain the stream via mysql-cdc, it fails with "Deduplicate doesn't support consuming update
and delete changes".
2. If I obtain the stream via Kafka with the json format, deduplicate itself does not complain, but I cannot
define a primary key; it fails with "The Kafka table '...' with 'json' format doesn't support defining
PRIMARY KEY constraint on the
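For reference, the documented deduplication query shape being attempted might look roughly like this in Java, against a made-up `orders` table backed by the datagen connector:

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class DeduplicateSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().inStreamingMode().build());

        // Hypothetical source table; 'datagen' only provides rows to run the query against.
        tEnv.executeSql(
                "CREATE TABLE orders (" +
                "  order_id STRING," +
                "  payload STRING," +
                "  proctime AS PROCTIME()" +
                ") WITH ('connector' = 'datagen')");

        // Keep only the first row per order_id in processing-time order.
        // Runs as a continuous query; stop it manually when trying it out.
        tEnv.executeSql(
                "SELECT order_id, payload FROM (" +
                "  SELECT *, ROW_NUMBER() OVER (" +
                "    PARTITION BY order_id ORDER BY proctime ASC) AS rownum" +
                "  FROM orders" +
                ") WHERE rownum = 1").print();
    }
}
```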
Great to hear that you fixed the problem by specifying an explicit
serializer for the state.
Cheers,
Till
On Tue, May 18, 2021 at 9:43 AM Joshua Fan wrote:
> Hi Till,
> I also tried the job without gzip; it ran into the same error.
> But the problem is solved now. I was about to give up on
Hi all,
I need to count the amount of data Flink has successfully consumed from Kafka in the last n hours (e.g., 24 hours). Is there a good approach for this?
With help from Dian and friends, it turns out the root cause is:
when `create_temporary_function` is called, the default catalog is current. However,
when `execute_sql(TRANSFORM)` runs, the "hive" catalog is current. A function
defined as a temporary function in the "default" catalog is not accessible from
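A Java sketch of the same catalog behaviour (the original is PyFlink), using an in-memory catalog named "hive" as a stand-in for the real HiveCatalog; a temporary catalog function lives in the catalog/database current at registration time, while a temporary system function resolves regardless of the current catalog:

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.table.catalog.GenericInMemoryCatalog;
import org.apache.flink.table.functions.ScalarFunction;

public class CatalogScopedFunctions {

    public static class AddUdf extends ScalarFunction {
        public long eval(long i, long j) {
            return i + j;
        }
    }

    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().inStreamingMode().build());

        // Registered under whatever catalog/database is current *now*
        // (default_catalog.default_database).
        tEnv.createTemporaryFunction("add_udf", AddUdf.class);

        // Stand-in for the real HiveCatalog used in the thread.
        tEnv.registerCatalog("hive", new GenericInMemoryCatalog("hive"));
        tEnv.useCatalog("hive");

        // From here, "add_udf" is only resolvable when fully qualified as
        // default_catalog.default_database.add_udf. A temporary *system* function,
        // by contrast, is resolved independently of the current catalog:
        tEnv.createTemporarySystemFunction("add_udf_sys", AddUdf.class);
    }
}
```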
When I upgraded from Flink 1.10.0 to Flink 1.12.0, I was unable to restore from the savepoint
and got the following error:
2021-05-14 22:02:44,716 WARN org.apache.flink.metrics.MetricGroup
[] - The operator name Calc(select=[((CAST((log_info
get_json_object2
Then you need to check what character encoding each column of your database table uses, and whether the encoding was changed. It works fine on my side.
On 2021-05-18 18:19:31, "casel.chen" wrote:
> My JDBC URL already uses useUnicode=true&characterEncoding=UTF-8, but I still get garbled characters.
> On 2021-05-18 17:21:12, "王炳焱" <15307491...@163.com> wrote:
When I upgraded from Flink 1.10 to Flink 1.12, a Flink 1.10 SQL API job could not be restored from its savepoint. The error is as follows:
2021-05-14 22:02:44,716 WARN org.apache.flink.metrics.MetricGroup
[] - The operator name Calc(select=[((CAST((log_info
get_json_object2 _UTF-16LE'eventTime')) / 1000) FROM_UNIXTIME
Hi,
The stop command is failing with the error below on Apache Flink 1.12.3.
Could you please help?
log":"[Flink-RestClusterClient-IO-thread-2]
org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannel Force-closing a
channel whose registration task was not accepted by an event loop:
Hi Dian,
I changed the udf to:
```python
@udf(
    input_types=[DataTypes.BIGINT(), DataTypes.BIGINT()],
    result_type=DataTypes.BIGINT(),
)
def add(i, j):
    return i + j
```
But I still get the same error.
On Tue, May 18, 2021 at 5:47 PM Dian Fu wrote:
> Hi Yik San,
>
> The expected input types for
My JDBC URL already uses useUnicode=true&characterEncoding=UTF-8, but I still get garbled characters.
On 2021-05-18 17:21:12, "王炳焱" <15307491...@163.com> wrote:
> When you connect to the MySQL table in Flink SQL, set url=jdbc:mysql://127.0.0.1:3306/database?useUnicode=true&characterEncoding=UTF-8 in the table options, like this: CREATE
> TABLE jdbc_sink(id INT COMMENT 'order id',
Hi Jin,
1) As far as I know the order is only guaranteed for events from the same
partition. If you want events across partitions to remain in order you may
need to use parallelism 1. I'll attach some links here which might be
useful:
When submitting a Flink job in YARN mode, it always complains that the following is missing. Why is that?
org.apache.flink.metrics.MetricGroup.addGroup(Ljava/lang/String;Ljava/lang/String;)Lorg/apache/flink/metrics/MetricGroup;
mao18698726900
Email: mao18698726...@163.com
Could you share how you are starting the Flink native K8s application with the
pod template?
Usually it looks like the following commands, and you need to have the Flink
binary on your local machine.
Please note that the pod template only works with the native K8s mode, and you
could not use the
There is no question of passing it on or not; processWatermark will simply process both windows directly.
曲洋 wrote on Mon, May 10, 2021 at 10:31 AM:
> Right, two windows exist at the same time: (3,1,2) and (6,5,4) are the two windows. Then watermark(7)
> arrives, but I do not know whether this watermark triggers the computation of both windows at once, or
> only the first one. Since a watermark is also a special kind of record, I could not find in the source code
> whether, after an operator runs processWatermark, the watermark is passed on to the other concurrently
> existing window to trigger it as well.
Hi Yik San,
The expected input types for add are DataTypes.INT; however, the schema of
aiinfra.mysource has columns a and b of type BIGINT.
Regards,
Dian
> On May 18, 2021 at 5:38 PM, Yik San Chan wrote:
>
> Hi,
>
> I have a PyFlink script that fails to use a simple UDF. The full script can
> be found below:
>
Hi,
I have a PyFlink script that fails to use a simple UDF. The full script can
be found below:
```python
from pyflink.datastream import StreamExecutionEnvironment
from pyflink.table import (
    DataTypes,
    EnvironmentSettings,
    SqlDialect,
    StreamTableEnvironment,
)
from pyflink.table.udf import udf
Does flink mysql cdc support MySQL's json format?
When you connect to the MySQL table in Flink SQL, set url=jdbc:mysql://127.0.0.1:3306/database?useUnicode=true&characterEncoding=UTF-8 in the table options, like this: CREATE
TABLE jdbc_sink(id INT COMMENT 'order id', goods_name
VARCHAR(128) COMMENT 'product name', price DECIMAL(32,2) COMMENT 'product price',
user_name VARCHAR(64) COMMENT 'user name') WITH (
Sorry for not getting back earlier. I missed that thread. It looks like
a wrong assumption on our end. Hence, Yangze and Guowei are right. I'm
gonna look into the issue.
Matthias
On Fri, May 14, 2021 at 4:21 AM Guowei Ma wrote:
> Hi, Gary
>
> I think it might be a bug. So would you like to
Hi Till,
I also tried the job without gzip; it ran into the same error.
But the problem is solved now. I was about to give up on solving it when I found
the mail at
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/JVM-crash-SIGSEGV-in-ZIP-GetEntry-td17326.html.
So I think maybe it
Hi Dian,
thanks a lot for the explanation and help. Option 2) is what I needed and
it works.
Regards
Eduard
On Tue, May 18, 2021 at 6:21 AM Dian Fu wrote:
> Hi,
>
> 1) The cause of the exception:
> The dependencies added via pipeline.jars / pipeline.classpaths will be
> used to construct user
Hi Joshua,
could you try whether the job also fails when not using the gzip format?
This could help us narrow down the culprit. Moreover, you could try to run
your job and Flink with Java 11 now.
Cheers,
Till
On Tue, May 18, 2021 at 5:10 AM Joshua Fan wrote:
> Hi all,
>
> Most of the posts
Arvid,
I found a jira related to my issue.
https://issues.apache.org/jira/browse/FLINK-18096
Added a comment, and I think Seth's idea is much better than just renaming
the current record name from the Avro schema.
Thanks,
Youngwoo
On Mon, May 17, 2021 at 8:37 PM Youngwoo Kim (김영우)
wrote:
>