+1 to make the blink planner the default planner.
We should give the blink planner more exposure to encourage users to try out
new features, and to lead users to migrate to the blink planner.
Glad to see blink planner is used in production since 1.9! @Benchao
Best,
Jark
On Wed, 1 Apr 2020 at 11:31, Benchao Li
Hi Kurt,
It's exciting to hear that the community aims to make the Blink planner the default
in 1.11.
We have been using the blink planner for stream processing since 1.9; it
works very well and covers many use cases in our company.
So +1 to make it the default in 1.11 from our side.
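For anyone who wants to opt in before the default changes, selecting the blink planner explicitly looks roughly like this against the Flink 1.10 Table API (a sketch; the surrounding job setup is omitted and requires the flink-table-planner-blink dependency):

```java
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.java.StreamTableEnvironment;

public class BlinkPlannerExample {
    public static void main(String[] args) {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();
        // Explicitly pick the blink planner in streaming mode.
        EnvironmentSettings settings = EnvironmentSettings.newInstance()
                .useBlinkPlanner()
                .inStreamingMode()
                .build();
        StreamTableEnvironment tEnv = StreamTableEnvironment.create(env, settings);
    }
}
```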
Kurt Young
Hi Dev and User,

As you may know, the Blink planner is a brand-new Table API & SQL translator
and optimizer introduced in Flink 1.9, and we already made it the default
planner for the SQL CLI in Flink 1.10. Since the community decided early on
not to add any new features to the old optimizer, the old flink planner
already lacks, both functionally and performance-wise, many of the
community's latest SQL features, such as the new type system, broader DDL
support, and the new TableSource and TableSink interfaces coming in 1.11
together with the ability to interpret binlog-style changelogs.
Thanks a lot!
Jark Wu wrote on Wed, Apr 1, 2020 at 1:13 AM:
> Hi Jing,
>
> I created https://issues.apache.org/jira/browse/FLINK-16889 to support
> converting from BIGINT to TIMESTAMP.
>
> Best,
> Jark
>
> On Mon, 30 Mar 2020 at 20:30, jingjing bai
> wrote:
>
>> Hi jarkWu!
>>
>> Is there a FLIP to do so?
Hi
I find it strange that your whole job could run from scratch without any checkpoint. Your value state descriptor does not define a default
value, so calling #value() returns null; if the first call to #update still reads a value from the state and the job runs through anyway, that is odd.
I suggest debugging locally in your IDE. You could change the clear() condition so it does not wait until the next day, which should let you reproduce the problem locally.
Best,
Yun Tang
From: 守护 <346531...@qq.com>
Sent: Tuesday, March
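The null-default pitfall Yun Tang describes can be sketched without Flink: a value state with no default returns null until the first update, so the read side has to handle it. The class below mimics that behavior with an illustrative, hypothetical counter:

```java
public class PvCounter {
    // Mimics ValueState<Long> with no default value: like Flink's
    // ValueState#value(), this field is null until the first update.
    private Long state;

    // Safe increment: treat a null state as zero instead of dereferencing it.
    public long increment() {
        long current = (state == null) ? 0L : state;
        state = current + 1;
        return state;
    }

    // Like ValueState#clear(): the next read returns null again.
    public void clear() {
        state = null;
    }
}
```

Without the null check, the very first `increment()` (and every one after a `clear()`) would throw a NullPointerException when unboxing.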
Hi Jing,
I created https://issues.apache.org/jira/browse/FLINK-16889 to support
converting from BIGINT to TIMESTAMP.
Best,
Jark
On Mon, 30 Mar 2020 at 20:30, jingjing bai
wrote:
> Hi jarkWu!
>
> Is there a FLIP to do so? I'd be very glad to learn from the idea.
>
>
> Best,
> jing
>
> Jark Wu
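Until FLINK-16889 lands, a commonly cited workaround (assuming the BIGINT column holds epoch seconds; column and table names here are hypothetical) is to go through a string:

```sql
-- FROM_UNIXTIME renders epoch seconds as a datetime string in the
-- session time zone; TO_TIMESTAMP parses it back into TIMESTAMP(3).
SELECT TO_TIMESTAMP(FROM_UNIXTIME(ts_sec)) AS ts
FROM my_table;
```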
Forwarding Seth's answer to the list
-- Forwarded message -
From: Seth Wiesman
Date: Tue, Mar 31, 2020 at 4:47 PM
Subject: Re: Complex graph-based sessionization (potential use for stateful
functions)
To: Krzysztof Zarzycki
Cc: user ,
Hi Krzysztof,
This is a great use case
Hi, I'm trying to run a job in a Flink cluster in Amazon EMR from java code
but I'm having some problems
This is how I create the cluster:
StepConfig copyJarStep = new StepConfig()
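The snippet above is cut off; for reference, adding a step that runs a Flink job via the AWS EMR Java SDK looks roughly like this (the jar path and job arguments are hypothetical placeholders):

```java
import com.amazonaws.services.elasticmapreduce.model.HadoopJarStepConfig;
import com.amazonaws.services.elasticmapreduce.model.StepConfig;

// command-runner.jar lets an EMR step invoke the flink CLI on the cluster.
HadoopJarStepConfig flinkRun = new HadoopJarStepConfig()
        .withJar("command-runner.jar")
        .withArgs("flink", "run", "-m", "yarn-cluster",
                  "s3://my-bucket/my-flink-job.jar");

StepConfig runJobStep = new StepConfig()
        .withName("Run Flink job")
        .withActionOnFailure("CONTINUE")
        .withHadoopJarStep(flinkRun);
```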
I could not make it work as I wanted with taskmanager.numberOfTaskSlots set to
2, but I found a way to run them in parallel: just creating a cluster
for each job, since they are independent.
Thanks
On Mon, Mar 30, 2020 at 4:22 PM Gary Yao wrote:
> Can you try to set config option
Currently I have a Flink consumer with the following properties. Flink consumes
records at around 400 messages/sec at the start of the program, but later on, as
numBuffersOut exceeds 100, the data rate falls to 200 messages/sec. I've
set parallelism to only 1; it's an Avro-based consumer, and checkpointing is
disabled.
Hi,
We want to use Hive to store Flink catalog metadata. When saving or deleting metadata, how can we verify that we have permission to operate on the Hive metastore?
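For reference, registering a HiveCatalog in Flink 1.10 looks roughly like the sketch below (catalog name, database, conf directory, and Hive version are all placeholders); as far as I know, the permission checks themselves are enforced on the Hive metastore side, not by Flink:

```java
import org.apache.flink.table.catalog.hive.HiveCatalog;

// All names and paths below are illustrative; adjust to your deployment.
HiveCatalog hive = new HiveCatalog(
        "myhive",          // catalog name inside Flink
        "default",         // default database
        "/etc/hive/conf",  // directory containing hive-site.xml
        "2.3.4");          // Hive version

tableEnv.registerCatalog("myhive", hive);
tableEnv.useCatalog("myhive");
```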
Hi
We have, in both Flink 1.9.2 and 1.10, struggled with random deserialization
and index-out-of-range exceptions in one of our jobs. We also get out-of-memory
exceptions. We have now identified latency tracking together with broadcast
state as causing the problem. When we do integration testing
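If latency tracking turns out to be the culprit, it can be switched off while testing with a flink-conf.yaml fragment (an interval of 0 disables the latency markers):

```yaml
# Latency tracking is off when the marker interval is 0 ms.
metrics.latency.interval: 0
```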
The container cut-off accounts for not only metaspace, but also native
memory footprint such as thread stack, code cache, compressed class space.
If you run streaming jobs with rocksdb state backend, it also accounts for
the rocksdb memory usage.
The consequence of less cut-off depends on your
Hi all,
While setting up our Flink platform, I found that the dashboard has no account or permission control: anyone who knows the address can access it and submit jobs, which feels like a big risk. How do you handle this in production?
I've seen suggestions online to put authentication in front of it with Nginx, but that feels a bit inelegant. Does Flink offer an official authentication plugin, or is there a slightly more elegant solution?
Looking forward to your replies.
Hi community,
Now I am optimizing the Flink 1.6 task memory configuration. Looking at the
source code, the Flink task first configures the cut-off memory: cut-off
memory = Math.max(600, containerized.heap-cutoff-ratio * TaskManager
memory), where containerized.heap-cutoff-ratio defaults to 0.25. For
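The formula above can be illustrated with plain Java (the 600 MB floor and 0.25 ratio are the defaults quoted in the message; everything else is just arithmetic):

```java
public class MemoryCutoff {
    // cut-off = max(containerized.heap-cutoff-min (600 MB),
    //               containerized.heap-cutoff-ratio * TaskManager memory)
    static long cutoffMb(long taskManagerMemoryMb, double ratio) {
        return Math.max(600L, (long) (ratio * taskManagerMemoryMb));
    }

    public static void main(String[] args) {
        // A 4096 MB container reserves max(600, 0.25 * 4096) = 1024 MB.
        System.out.println(cutoffMb(4096, 0.25)); // prints 1024
        // A small container hits the 600 MB floor: max(600, 0.25 * 1024) = 600.
        System.out.println(cutoffMb(1024, 0.25)); // prints 600
    }
}
```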
Hi Tzu-Li,
thanks a lot for your answer. I will try this!
However, I was looking for something that fully simulates a Flink
cluster, including job-manager-to-task-manager serialization issues and
full isolation of the task managers (I guess with the MiniClusterResource, we
are still on the
Hello Mike,
thanks for the info.
I tried to do something similar in Java. I'm not there yet, but I think it should be
feasible.
However, like you said, that means additional operations for each event.
Yesterday I managed to find another solution: create the type information
outside of the class and pass
(original question garbled in transit; the recoverable fragments are:)
if (stateDate.equals("") || stateDate.equals(date)) pv_st.clear()
pv_st.update(pv_st.value() +
Hi
From the code, `if (stateDate.equals("") || stateDate.equals(date))` alone
does not show where the stateDate variable gets its value, so it is unclear whether the condition in there ever takes effect.
Also, after state.clear(), the next read returns null, and the code snippet does not show any check for that.
Best,
Yun Tang
From: 守护 <346531...@qq.com>
Sent: Tuesday, March 31, 2020 12:33
To: user-zh
Subject:
Hi, let me add more detail about my scenario, as shown in the figure below:
1. Partition 1 of Kafka topic A contains data for 3 users.
2. Flink generates watermarks w1, w2, w3, ... for that partition.
My questions:
1. Do the watermarks w1, w2, w3, ... only guarantee ordered processing of the overall data of user1, user2 and user3?
2. After a keyBy on user1, user2 and user3, can the watermarks w1, w2, w3, ... guarantee ordered processing for user1, user2 or user3 individually?
Looking forward to a reply!
[image: image.png]
jun su wrote on Tue, Mar 31, 2020:
Hi,
Right, I tried it earlier but did not declare any fields under it, so no values showed up, and I assumed MAP was not supported.
Accessing it as data['a'] works.
Best,
Xinghalo
On Mar 31, 2020 at 14:59, Benchao Li wrote:
You can try defining the data field as a MAP type.
111 wrote on Tue, Mar 31, 2020 at 2:56 PM:
Hi,
We use StreamSets as our CDC tool, and the records it writes to Kafka have nested, variable schemas, e.g.:
{database:a, table: b, type:update, data:{a:1,b:2,c:3}}
{database:a, table: c,
Hi, there.
In the latest version, the calculator supports dynamic options. You
could append all your dynamic options to the end of "bin/calculator.sh
[-h]".
Since "-tm" will be deprecated eventually, please replace it with
"-Dtaskmanager.memory.process.size=".
Best,
Yangze Guo
On Mon, Mar 30,
You can try defining the data field as a MAP type.
111 wrote on Tue, Mar 31, 2020 at 2:56 PM:
> Hi,
> We use StreamSets as our CDC tool, and the records it writes to Kafka have nested, variable schemas, e.g.:
> {database:a, table: b, type:update, data:{a:1,b:2,c:3}}
> {database:a, table: c, type:update, data:{c:1,d:2}}
> How should we define the DDL for this kind of record?
>
>
> Best,
> Xinghalo
>
>
--
Benchao Li
School of
Hi,
We use StreamSets as our CDC tool, and the records it writes to Kafka have nested, variable schemas, e.g.:
{database:a, table: b, type:update, data:{a:1,b:2,c:3}}
{database:a, table: c, type:update, data:{c:1,d:2}}
How should we define the DDL for this kind of record?
Best,
Xinghalo
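A sketch of Benchao's suggestion in DDL form (the field names follow the sample records above; the connector properties are omitted, and the value type is assumed to be STRING for illustration):

```sql
CREATE TABLE cdc_source (
  `database` STRING,
  `table`    STRING,
  `type`     STRING,
  -- Variable-keyed payload modeled as a map; read fields as data['a'].
  data       MAP<STRING, STRING>
) WITH (
  ...
);
```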
Hey Vitaliy,
Check this documentation on how to use Flink with Hadoop:
https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/deployment/hadoop.html
For your setup, I would recommend referencing the Hadoop jars from your
Hadoop vendor by setting
export HADOOP_CLASSPATH=`hadoop classpath`