Re: I call Pandas UDF N times, do I have to initiate the UDF N times?

2021-05-07 Thread Dian Fu
Hi Yik San, Is it acceptable to rewrite the UDF a bit to accept multiple parameters and then rewrite the program as follows: ``` SELECT LABEL_ENCODE(a, b, c) ... ``` Regards, Dian > On May 8, 2021, at 11:56 AM, Yik San Chan wrote: > > Hi community, > > I am using PyFlink and Pandas UDF in my job. > >
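A rough sketch of what that multi-parameter rewrite might look like in PyFlink (the mapping, result type, and registration call are assumptions, not from the thread):

```python
from pyflink.table import DataTypes
from pyflink.table.udf import udf, ScalarFunction
import pandas as pd

class LabelEncodeMulti(ScalarFunction):
    """Hypothetical multi-column variant of the LABEL_ENCODE UDF."""

    def open(self, function_context):
        # Heavy setup (e.g. loading the label mapping) runs here once per
        # parallel instance of this single UDF, instead of once per column.
        self.mapping = {"cat": 0, "dog": 1}

    def eval(self, a: pd.Series, b: pd.Series, c: pd.Series):
        # Encode all three columns in one call and pack them into one string column.
        return (a.map(self.mapping).astype(str) + ","
                + b.map(self.mapping).astype(str) + ","
                + c.map(self.mapping).astype(str))

label_encode = udf(LabelEncodeMulti(),
                   result_type=DataTypes.STRING(),
                   func_type="pandas")
# t_env.create_temporary_function("LABEL_ENCODE", label_encode)
# t_env.sql_query("SELECT LABEL_ENCODE(a, b, c) FROM my_table")
```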

Re: How to recover the last count when using a CUMULATE window after restarting a flink-sql job?

2021-05-07 Thread Kurt Young
Hi, please use the user mailing list only to discuss these issues. Best, Kurt On Sat, May 8, 2021 at 1:05 PM 1095193...@qq.com <1095193...@qq.com> wrote: > Hi > I have tried the cumulate window function in Flink-1.13 sql to accumulate > data from Kafka. When I restart a cumulate window sql job,

How to recover the last count when using a CUMULATE window after restarting a flink-sql job?

2021-05-07 Thread 1095193...@qq.com
Hi, I have tried the cumulate window function in Flink-1.13 sql to accumulate data from Kafka. When I restart a cumulate window sql job, the last count state is not considered and the count accumulates from 1 again. Are there any solutions to help recover the last count state when restarting a Flink-sql job?

I call Pandas UDF N times, do I have to initiate the UDF N times?

2021-05-07 Thread Yik San Chan
Hi community, I am using PyFlink and Pandas UDF in my job. The job executes a SQL like this: ``` SELECT LABEL_ENCODE(a), LABEL_ENCODE(b), LABEL_ENCODE(c) ... ``` And my LABEL_ENCODE UDF is defined below: ``` class LabelEncode(ScalarFunction): def open(self, function_context):
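For context, a minimal sketch of what the truncated LabelEncode UDF above might look like (the bodies of open()/eval(), the mapping, and the types are assumptions):

```python
from pyflink.table import DataTypes
from pyflink.table.udf import udf, ScalarFunction
import pandas as pd

class LabelEncode(ScalarFunction):
    def open(self, function_context):
        # One-time setup per parallel instance, e.g. loading a label mapping.
        self.mapping = {"cat": 0, "dog": 1}

    def eval(self, col: pd.Series):
        # Pandas UDF: receives a whole batch of values as a pandas.Series.
        return col.map(self.mapping)

label_encode = udf(LabelEncode(),
                   result_type=DataTypes.BIGINT(),
                   func_type="pandas")
# t_env.create_temporary_function("LABEL_ENCODE", label_encode)
```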

How to modify the Logging configuration for flink on native kubernetes?

2021-05-07 Thread casel.chen
I deployed a flink session cluster in native kubernetes mode and want to change the log level of a specific package. Editing log4j-console.properties in the configmap directly and redeploying does take effect, but starting the flink session cluster from the command line (./bin/kubernetes-session.sh -Dkubernetes.cluster-id=xxx) wipes out that earlier change. Is there a way to keep the previous modification? Is there a command-line startup parameter for specifying a custom logging configuration? Thanks!

Re: [External] : Re: StopWithSavepoint() method doesn't work in Java based flink native k8s operator

2021-05-07 Thread Yang Wang
Since your problem is about the flink-native-k8s-operator, let's move the discussion there. Best, Yang On Sat, May 8, 2021 at 5:41 AM, Fuyao Li wrote: > Hi Austin, Yang, Matthias, > > > > I am following up to see if you guys have any idea regarding this problem. > > > > I also created an issue here to

What does enableObjectReuse exactly do?

2021-05-07 Thread 杨力
I wrote a streaming job in Scala, using only immutable case classes. Is it safe to enable object reuse? Will it benefit from enabling object reuse? I looked through the documentation, but it covers neither streaming cases nor immutable data structures.

Flink CDC LocalDateTime conversion error

2021-05-07 Thread ELVIS_SUE
Flink CDC reports: Unable to convert to LocalDateTime from unexpected value '2020-02-25T23:26:14Z' of type java.lang.String. The MySQL column type is timestamp. Flink versions: flink-1.11.3, flink-sql-connector-mysql-cdc-1.0.0.jar. The Flink CDC table DDL: CREATE TABLE

Re: How to increase the number of task managers?

2021-05-07 Thread Yik San Chan
Hi Yangze, Thanks for the answer! That helps. Best, Yik San On Sat, May 8, 2021 at 10:15 AM Yangze Guo wrote: > Hi, > > > I wonder if I can tune the number of task managers? Is there a > corresponding config? > > With K8S/Yarn resource provider, the task managers are allocated on > demand.

Re: How to achieve exactly-once when writing from Flink to ClickHouse

2021-05-07 Thread 张锴
ClickHouse does not support transactions or idempotent writes, so end-to-end exactly-once cannot be guaranteed. On Fri, May 7, 2021 at 10:27 PM, 李一飞 wrote: > How can Flink achieve exactly-once when writing to ClickHouse? Is there a good approach?

Unsubscribe

2021-05-07 Thread 薛旭旭
Unsubscribe

Flink CDC question

2021-05-07 Thread lp
I have recently been studying Flink Connector related content, official docs: https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/connectors/table/overview/; I also learned about the Flink CDC concept: https://github.com/ververica/flink-cdc-connectors; could someone explain the relationship between Flink Connectors and Flink CDC? -- Sent from: http://apache-flink.147419.n8.nabble.com/

Re: Table name for table created fromDataStream

2021-05-07 Thread Leonard Xu
> On May 8, 2021, at 08:00, tbud wrote: > > Hi Leonard, > Yes that would be one solution. But why is it necessary to create a > temporaryView from an already created table? The name “Table” is quite misleading here: the table API object Table actually represents a relational query (e.g. Table table =

Re: How to increase the number of task managers?

2021-05-07 Thread Yangze Guo
Hi, > I wonder if I can tune the number of task managers? Is there a corresponding > config? With the K8S/Yarn resource provider, the task managers are allocated on demand. So, the number of them depends on the max parallelism and the slot sharing group topology of your job. In standalone mode,

Re: Enabling Checkpointing using FsStateBackend

2021-05-07 Thread Yangze Guo
Hi, I think the checkpointing is not the root cause of your job failure. As the log describes, your job failed caused by the authorization issue of Kafka. "Caused by: org.apache.kafka.common.errors.TransactionalIdAuthorizationException: Transactional Id authorization failed." Best, Yangze Guo

Re: flink job tasks are unevenly distributed across taskmanagers

2021-05-07 Thread wenyuan138
I tested it, and this parameter is indeed effective. -- Sent from: http://apache-flink.147419.n8.nabble.com/

Re: flink job tasks are unevenly distributed across taskmanagers

2021-05-07 Thread wenyuan138
Many thanks, 黄潇. The description of this parameter looks exactly like what I am observing; I will try changing it today. -- Sent from: http://apache-flink.147419.n8.nabble.com/

Unsubscribe

2021-05-07 Thread 袁刚
Unsubscribe

Unsubscribe

2021-05-07 Thread 杨军
Unsubscribe

Re: Table name for table created fromDataStream

2021-05-07 Thread tbud
Hi Leonard, Yes that would be one solution. But why is it necessary to create a temporaryView from an already created table? -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: Question regarding cpu limit config in Flink standalone mode

2021-05-07 Thread Fan Xie
Hi Xintong, Thanks for answering my question! After discussing with my teammates, we decided to rely on k8s pods and an external control plane to restrict the CPU usage of a job. Thanks again for your help! Best, Fan From: Xintong Song Sent: Thursday, May 6,

Re: [External] : Re: StopWithSavepoint() method doesn't work in Java based flink native k8s operator

2021-05-07 Thread Fuyao Li
Hi Austin, Yang, Matthias, I am following up to see if you guys have any idea regarding this problem. I also created an issue here to describe the problem. [1] After looking into the source code[1], I believe for native k8s, three configuration files should be added to the flink-config-

Re: Questions about implementing a flink source

2021-05-07 Thread Evan Palmer
Hi Arvid, thank you so much for the detailed reply! A few replies / questions inline. Somewhat relatedly, I'm also wondering where this connector should live. I saw that there's already a pubsub connector in

Savepoint/checkpoint confusion

2021-05-07 Thread Igor Basov
Hello, I got confused about usage of savepoints and checkpoints in different scenarios. I understand that checkpoints' main purpose is fault tolerance, they are more lightweight and don't support changing job graph, parallelism or state backend when restoring from them, as mentioned in the latest

how to split a column value into multiple rows in flink sql?

2021-05-07 Thread 1095193...@qq.com
Hi, for example, a table like this:
A | B | C
a1 | b1 | c1,c2,c3
How to split c1,c2,c3 into multiple rows like this with a flink sql function:
A | B | C
a1 | b1 | c1
a1 | b1 | c2
a1 | b1 | c3
Thank you 1095193...@qq.com
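One common approach is a user-defined table function joined via LATERAL TABLE; below is a sketch assuming C is a comma-separated STRING column (function and table names are illustrative):

```python
from pyflink.table import DataTypes
from pyflink.table.udf import udtf

# Table function: emits one row per comma-separated element of the input.
@udtf(result_types=[DataTypes.STRING()])
def split_col(c: str):
    for part in c.split(","):
        yield part

# t_env.create_temporary_function("split_col", split_col)
# t_env.sql_query("""
#     SELECT A, B, elem AS C
#     FROM my_table, LATERAL TABLE(split_col(C)) AS T(elem)
# """)
```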

Viewing the offsets stored in a Savepoint

2021-05-07 Thread Zachary Manno
Hello, Someone else asked this question on Stackoverflow and we would also find it very helpful: https://stackoverflow.com/questions/66256168/querying-kafka-offsets-from-flink-savepoint Is there a way to check the external savepoint data for what Kafka offset it is going to resume from? We

Re: Task Local Recovery with mountable disks in the cloud

2021-05-07 Thread Sonam Mandal
Hi Till, Thanks for getting back to me. Apologies for my delayed response. Thanks for confirming that the slot ID (Allocation ID) is indeed necessary today for task local recovery to kick in, and thanks for your insights on how to make this work. We are interested in exploring this

Enabling Checkpointing using FsStateBackend

2021-05-07 Thread sudhansu jena
Hi Team, We have recently enabled checkpointing using FsStateBackend, where we are trying to use an S3 bucket as the persistent storage, but after enabling it we are running into issues while submitting the job to the cluster. Can you please let us know if we are missing anything? Below is
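For reference, a minimal PyFlink sketch of enabling checkpointing with an S3-backed FsStateBackend (the bucket path and interval are placeholders, and the S3 filesystem plugin must be available on the cluster):

```python
from pyflink.datastream import StreamExecutionEnvironment
from pyflink.datastream.state_backend import FsStateBackend

env = StreamExecutionEnvironment.get_execution_environment()

# Checkpoint every 60 seconds and keep checkpoint data on S3.
env.enable_checkpointing(60000)
env.set_state_backend(FsStateBackend("s3://my-bucket/flink-checkpoints"))
```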

Iceberg Upsert: multiple rows with the same primary key inserted into Iceberg via Kafka in one batch cannot be queried

2021-05-07 Thread xwmr
Iceberg upsert: when multiple rows with the same primary key value are inserted into Iceberg in the same batch, querying with flink sql returns no data and the following error is reported: java.lang.IllegalArgumentException: Row arity: 3, but serializer arity: 2. The error means the original data has two columns and the table schema also has two columns, but the schema has now become three columns. Has anyone seen this error or issue? When the primary keys differ, multiple rows inserted in the same batch can be queried normally and can also be upserted. -- Sent from:

Iceberg Upsert: multiple rows with the same primary key inserted into Iceberg via Kafka in one batch cannot be queried

2021-05-07 Thread xwmr
Iceberg upsert: when multiple rows with the same primary key are inserted into Iceberg in the same batch, flink sql cannot query them and reports the following error: java.lang.IllegalArgumentException: Row arity: 3, but serializer arity: 2. The error means the data I inserted has two columns, but the table schema is already three columns. Does anyone know what this problem is? When the primary keys differ, multiple rows inserted in the same batch are inserted normally and can also be upserted. -- Sent from: http://apache-flink.147419.n8.nabble.com/

How to achieve exactly-once when writing from Flink to ClickHouse

2021-05-07 Thread 李一飞
How can Flink achieve exactly-once when writing to ClickHouse? Is there a good approach?

Re: Flink :Exception in thread "main" java.lang.NoSuchMethodError: scala.Function1.$init$(Lscala/Function1;)V

2021-05-07 Thread Chesnay Schepler
This is where the 2.12 dependency comes from: [INFO] +- io.confluent:kafka-schema-registry:jar:5.5.1:compile [INFO] |  +- org.apache.kafka:kafka_2.12:jar:5.5.1-ccs:compile This is the entry added by your dependency: [INFO] +- org.apache.kafka:kafka_2.11:jar:0.10.2.1:compile On 5/7/2021 3:15 PM,

Re: FileSystem Connector's Success File Was Submitted Late

2021-05-07 Thread Leonard Xu
Hi, forideal I also encountered this problem and opened an issue[1], you can have a look. Best, Leonard [1] https://issues.apache.org/jira/browse/FLINK-22472 > On May 7, 2021, at 20:31, forideal wrote: > > I found the reason: > > Late data processing: The record will be written into its partition

Re: Flink :Exception in thread "main" java.lang.NoSuchMethodError: scala.Function1.$init$(Lscala/Function1;)V

2021-05-07 Thread Ragini Manjaiah
hi, true.. but I am using scala.version 2.11. Wondering where this 2.12 is being added from. org.apache.kafka kafka_${scala.version} ${kafka.version} On Fri, May 7, 2021 at 6:24 PM Chesnay Schepler wrote: > I see several scala 2.12 dependencies in there. > > [INFO] | +-

Re: Flink :Exception in thread "main" java.lang.NoSuchMethodError: scala.Function1.$init$(Lscala/Function1;)V

2021-05-07 Thread Chesnay Schepler
I see several scala 2.12 dependencies in there. [INFO] |  +- org.apache.kafka:kafka_2.12:jar:5.5.1-ccs:compile [INFO] +- io.confluent:kafka-schema-registry:jar:5.5.1:compile [INFO] |  |  \- com.kjetland:mbknor-jackson-jsonschema_2.12:jar:1.0.39:compile On 5/7/2021 2:47 PM, Ragini Manjaiah

Re: Flink :Exception in thread "main" java.lang.NoSuchMethodError: scala.Function1.$init$(Lscala/Function1;)V

2021-05-07 Thread Ragini Manjaiah
Hi , Please find the details [INFO] X:XXSNAPSHOT [INFO] +- org.apache.flink:flink-java:jar:1.11.3:compile [INFO] | +- org.apache.flink:flink-core:jar:1.11.3:compile [INFO] | | +- org.apache.flink:flink-annotations:jar:1.11.3:compile [INFO] | | +-

Re: flink job task在taskmanager上分布不均衡

2021-05-07 Thread Shawn Huang
From your description this should be the Standalone deployment mode. The default scheduling works at slot granularity and tends to allocate into slots of the same TaskManager. To make full use of all slots, one approach is to set the total number of slots in the cluster to the sum of the parallelisms of all jobs, or try setting the config option cluster.evenly-spread-out-slots to true. Best, Shawn Huang On Fri, May 7, 2021 at 7:50 PM, 张锴 wrote: > Set a slot sharing group name for the other job; different groups do not share slots, so it will run in other slots, and slots can run flexibly on different TMs. > > allanqinjy

Re: Read kafka offsets from checkpoint - state processor

2021-05-07 Thread bat man
Has anyone tried this, or can anyone help with this? Thanks. On Thu, May 6, 2021 at 10:34 AM bat man wrote: > Hi Users, > > Is there a way that in Flink 1.9 the checkpointed data can be read using the > state processor api? > Docs [1] say - When reading operator state, users specify the operator >

Re: FileSystem Connector's Success File Was Submitted Late

2021-05-07 Thread forideal
I found the reason: Late data processing: When a record is supposed to be written into a partition that has already been committed, the record will still be written into its partition, and then the committing of this partition will be triggered again. So, I see that the success file is slower to

Re: Flink :Exception in thread "main" java.lang.NoSuchMethodError: scala.Function1.$init$(Lscala/Function1;)V

2021-05-07 Thread Chesnay Schepler
Can you show us the dependency tree of your project? (If you are using maven, run "mvn dependency:tree") On 5/7/2021 2:15 PM, Ragini Manjaiah wrote: The scala version is the same across the pom file: 2.11 On Fri, May 7, 2021 at 5:06 PM Chesnay Schepler wrote: It

Re: Flink :Exception in thread "main" java.lang.NoSuchMethodError: scala.Function1.$init$(Lscala/Function1;)V

2021-05-07 Thread Ragini Manjaiah
The scala version is the same across the pom file: 2.11. On Fri, May 7, 2021 at 5:06 PM Chesnay Schepler wrote: > It looks like you have different scala versions on the classpath. Please > check that all your dependencies use the same scala version. > > On 5/7/2021 1:25 PM, Ragini Manjaiah wrote: >

Re: flink job tasks are unevenly distributed across taskmanagers

2021-05-07 Thread 张锴
Set a slot sharing group name for the other job; different groups do not share slots, so it will run in other slots, and slots can run flexibly on different TMs. On Fri, May 7, 2021 at 7:38 PM, allanqinjy wrote: > > https://ci.apache.org/projects/flink/flink-docs-release-1.12/deployment/config.html > In the flink configuration there is a taskmanager setting for how many slots one TM has >

FileSystem Connector's Success File Was Submitted Late

2021-05-07 Thread forideal
Hi my friends: I use the FileSystem connector in Flink SQL, and I found that my success file was submitted late, probably dozens of minutes late. Here I provide some information: 1. Flink version is 1.11.1. 2. Source DDL: create table test ( `timestamp` bigint, event_time as
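The DDL above is truncated; for reference, a sketch of a filesystem sink whose partition commit writes a success file (table name, path, format, and delay value are illustrative, not from the thread):

```python
from pyflink.table import EnvironmentSettings, TableEnvironment

t_env = TableEnvironment.create(
    EnvironmentSettings.new_instance().in_streaming_mode().build())

# Filesystem sink with a partition-time commit trigger and a success-file policy.
t_env.execute_sql("""
    CREATE TABLE fs_sink (
        `timestamp` BIGINT,
        dt STRING,
        `hour` STRING
    ) PARTITIONED BY (dt, `hour`) WITH (
        'connector' = 'filesystem',
        'path' = 'hdfs:///tmp/fs_sink',
        'format' = 'parquet',
        'sink.partition-commit.trigger' = 'partition-time',
        'sink.partition-commit.delay' = '1 h',
        'sink.partition-commit.policy.kind' = 'success-file'
    )
""")
```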

Re: Flink :Exception in thread "main" java.lang.NoSuchMethodError: scala.Function1.$init$(Lscala/Function1;)V

2021-05-07 Thread Chesnay Schepler
It looks like you have different scala versions on the classpath. Please check that all your dependencies use the same scala version. On 5/7/2021 1:25 PM, Ragini Manjaiah wrote: Hi, I am seeing this when submitting flink from the IntelliJ IDE. What could the issue be? Do I need to change the scala

Re: How to increase the number of task managers?

2021-05-07 Thread Tamir Sagi
Hey, num of TMs = parallelism / num of slots per TM. parallelism.default is another config you should consider. Read also https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/execution/parallel/
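A small illustration of that relationship, with assumed numbers (total parallelism 8, 2 slots per TaskManager):

```python
from pyflink.datastream import StreamExecutionEnvironment

env = StreamExecutionEnvironment.get_execution_environment()
# With taskmanager.numberOfTaskSlots: 2, a job parallelism of 8 needs
# roughly 8 / 2 = 4 TaskManagers when TMs are allocated on demand (YARN/K8s).
env.set_parallelism(8)
```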

Re: flink job tasks are unevenly distributed across taskmanagers

2021-05-07 Thread allanqinjy
https://ci.apache.org/projects/flink/flink-docs-release-1.12/deployment/config.html In the flink configuration there is a taskmanager setting for how many slots one TM has. taskmanager.numberOfTaskSlots defaults to 1. The parallelism corresponds to the number of slots; usually we make the slots equal to the largest parallelism. You can check this parameter setting and compare it against the official documentation. | | allanqinjy | | allanqi...@163.com | Signature customized by NetEase Mail Master On May 7, 2021

Flink :Exception in thread "main" java.lang.NoSuchMethodError: scala.Function1.$init$(Lscala/Function1;)V

2021-05-07 Thread Ragini Manjaiah
Hi, I am seeing this when submitting flink from the IntelliJ IDE. What could the issue be? Do I need to change the scala version? flink 1.11.3, scala 2.11. Exception in thread "main" java.lang.NoSuchMethodError: scala.Function1.$init$(Lscala/Function1;)V at

How to increase the number of task managers?

2021-05-07 Thread Yik San Chan
Hi community, According to the [docs]( https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/deployment/config/ ): > taskmanager.numberOfTaskSlots: The number of slots that a TaskManager offers *(default: 1)*. Each slot can take one task or pipeline. Having multiple slots in a

Re: Watermark time zone issue

2021-05-07 Thread Leonard Xu
Hi, forideal It's not because of a time zone issue; the watermark value is a timestamp in UTC millis. You should convert it to a UTC timestamp and then compare it with your data. Best, Leonard > On May 7, 2021, at 18:28, forideal wrote: > > Hi My friends: > My watermark added 8 more hours to the timestamp
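A quick way to do that comparison, assuming the watermark value copied from the web UI is epoch milliseconds (the value below is only an example):

```python
from datetime import datetime, timezone

wm = 1620383288000  # watermark as shown in the web UI, epoch millis
# Convert to a UTC timestamp before comparing with the event-time column.
print(datetime.fromtimestamp(wm / 1000, tz=timezone.utc))  # 2021-05-07 10:28:08+00:00
```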

Watermark time zone issue

2021-05-07 Thread forideal
Hi my friends: My watermark adds 8 extra hours to the timestamp displayed on the flink web UI. What is the reason for this? Looking at the actual data, it is correct, so I don't know where the problem occurred. Is it because of the time zone? Flink 1.11.1 Best Wishes!!! forideal

Re: Flink sql custom udaf function: No match found for function signature count_uadf()

2021-05-07 Thread JackJia
Hello, could you share how you solved it? Best regards. On Dec 18, 2020, at 10:38, 丁浩浩<18579099...@163.com> wrote: I have solved the problem myself. On Dec 17, 2020, at 9:00 PM, 丁浩浩 <18579099...@163.com> wrote: flink version: 1.11.1 The udaf function code comes from the Alibaba Cloud official documentation. Here is the code: public class TestSql { public static void main(String[] args) throws Exception {

How to consume and send data with two different kerberos-certified kafka clusters in one flink task?

2021-05-07 Thread 1095193...@qq.com
Hi, by setting the security.kerberos.* configuration, we can connect to one kerberos-certified Kafka in a Flink sql task. How can we consume and produce with two different kerberos-certified Kafka clusters in one flink sql task? Kafka allows multiple SASL-authenticated Java clients in a single JVM process.
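One direction worth trying (a sketch only, not a verified solution): pass the SASL/GSSAPI client settings per table through the Kafka connector's 'properties.*' options instead of the global security.kerberos.* settings. All cluster addresses, principals, and keytab paths below are placeholders:

```python
from pyflink.table import EnvironmentSettings, TableEnvironment

t_env = TableEnvironment.create(
    EnvironmentSettings.new_instance().in_streaming_mode().build())

# Source on Kerberos cluster A; a sink table for cluster B would use the same
# keys with its own bootstrap servers, keytab, and principal.
t_env.execute_sql("""
    CREATE TABLE kafka_source (msg STRING) WITH (
        'connector' = 'kafka',
        'topic' = 'in-topic',
        'properties.bootstrap.servers' = 'cluster-a:9092',
        'properties.security.protocol' = 'SASL_PLAINTEXT',
        'properties.sasl.mechanism' = 'GSSAPI',
        'properties.sasl.kerberos.service.name' = 'kafka',
        'properties.sasl.jaas.config' = 'com.sun.security.auth.module.Krb5LoginModule required useKeyTab=true keyTab="/path/a.keytab" principal="user-a@REALM.A";',
        'format' = 'json'
    )
""")
```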

Re: The problem of getting data by Rest API

2021-05-07 Thread Chesnay Schepler
To be more precise, the update of the data is scheduled at most once every 10 seconds, but it can of course happen that the result of said update arrives in a different interval. As in, this would be possible: T00: Issue update 1 T10: Issue update 2 T12: Receive update 1 T14: Receive update 2

Re: [ANNOUNCE] Apache Flink 1.13.0 released

2021-05-07 Thread Yangze Guo
Thanks, Dawid & Guowei for the great work, thanks to everyone involved. Best, Yangze Guo On Thu, May 6, 2021 at 5:51 PM Rui Li wrote: > > Thanks to Dawid and Guowei for the great work! > > On Thu, May 6, 2021 at 4:48 PM Zhu Zhu wrote: >> >> Thanks Dawid and Guowei for being the release

Re: The problem of getting data by Rest API

2021-05-07 Thread Chesnay Schepler
The WebUI also retrieves all data from the REST API, which should be updated with a minimum interval of 10 seconds. On 5/7/2021 3:57 AM, penguin. wrote: On the Web UI page, we can see that the relevant data is updated every 3S, such as the read-bytes of each operator. But when I get data

minicluster

2021-05-07 Thread buaa
minicluster slot debug minicluster

Re: Can the History Server view aggregated TaskManager logs?

2021-05-07 Thread Yang Wang
Currently Flink's history server is not integrated with the Yarn NM log aggregation, so after a job finishes you can only look at the webui and exceptions; the logs cannot be viewed. Best, Yang On Fri, May 7, 2021 at 1:57 PM, lhuiseu wrote: > Hi: > flink 1.12.0 > on yarn mode > Finished jobs can be found in the history server, but viewing the TaskManager Log via the WebUI returns 404. Does the Flink > History Server currently not support viewing aggregated TaskManager logs? I would like to understand the history

Re: Re: Extended SqlServerDialect fails when running flink on k8s

2021-05-07 Thread 18756225...@163.com
Thank you very much! From: Leonard Xu Sent: 2021-05-07 14:26 To: user-zh Subject: Re: Extended SqlServerDialect fails when running flink on k8s Hi, from the logs the corresponding class file cannot be loaded. (1) Check whether the path inside your jar is incorrect; (2) check whether an older jar is still on the cluster and was not replaced cleanly. Best, Leonard > On May 7, 2021, at 13:58, 18756225...@163.com wrote: > Hi all, I ran into a problem: > Environment: flink version 1.12.1,

Re: Extended SqlServerDialect fails when running flink on k8s

2021-05-07 Thread Leonard Xu
Hi, from the logs the corresponding class file cannot be loaded. (1) Check whether the path inside your jar is incorrect; (2) check whether an older jar is still on the cluster and was not replaced cleanly. Best, Leonard > On May 7, 2021, at 13:58, 18756225...@163.com wrote: > > Hi all, I ran into a problem: > Environment: flink version 1.12.1, the k8s cluster is in session mode, this cluster could previously write data to mysql normally > Referring to mysqlDialect I extended a >

How to run the Flink History Server on native kubernetes?

2021-05-07 Thread casel.chen
How can I run the Flink History Server on native kubernetes? Is there any relevant documentation?

Extended SqlServerDialect fails when running flink on k8s

2021-05-07 Thread 18756225...@163.com
Hi all, I ran into a problem: Environment: flink version 1.12.1, the k8s cluster is in session mode; this cluster could previously write data to mysql normally. Referring to mysqlDialect, I extended a SqlServerDialect and replaced flink-connector-jdbc_2.11-1.12.1.jar under flink's lib directory. On yarn the job runs normally and flink-sql can also write data to sqlserver. In the same machine environment, when submitting to the flink session-mode cluster built on k8s, the following error is reported, JdbcBatchingOutputFormat