Re: Improved performance when using incremental checkpoints

2020-06-16 Thread Yun Tang
Hi Nick It's really strange that performance could improve when checkpoint is enabled. In general, enable checkpoint might bring a bit performance downside to the whole job. Could you give more details e.g. Flink version, configurations of RocksDB and simple code which could reproduce this

Flink SQL ddl 中含有关键字 且在ddl中使用TO_TIMESTAMP、TO_DATE函数语法检查异常问题

2020-06-16 Thread zilong xiao
如题,在SQL ddl中如含有关键字字段(已用反引号括上),并且在ddl中使用了TO_TIMESTAMP、TO_DATE函数,在进行语法检查的时候会报关键字冲突异常,如果将图中的```itime as TO_TIMESTAMP(time2)```去掉,语法检查正常通过,还麻烦各位大佬帮忙看看~ 代码如下图: [image: image.png] 异常堆栈:

Re: [External] Measuring Kafka consumer lag

2020-06-16 Thread Theo Diefenthal
Hi Padarn, We configure our Flink KafkaConsumer with setCommitOffsetsOnCheckpoints(true). In this case, the offsets are committed on each checkpoint for the conumer group of the application. We have an external monitoring on our kafka consumer groups (Just a small script) which writes kafka

?????? flink1.11 ??????(???? DDL ??????(???? Table ????))

2020-06-16 Thread kcz
?? ---- ??:""https://developer.aliyun.com/live/2894?accounttraceid=07deb589f50c4c1abbcbed103e534316qnxq 04:17:00?? Kurt Young

Re: pyflink连接elasticsearch5.4问题

2020-06-16 Thread Dian Fu
可以发一下完整的异常吗? > 在 2020年6月16日,下午3:45,jack 写道: > > 连接的版本部分我本地已经修改为 5了,发生了下面的报错; > >> st_env.connect( > >> Elasticsearch() > >> .version("5") > >> .host("localhost", 9200, "http") > >> .index("taxiid-cnts") > >> .document_type('taxiidcnt')

Re: pyflink连接elasticsearch5.4问题

2020-06-16 Thread Dian Fu
可以发一下完整的异常吗? > 在 2020年6月16日,下午3:45,jack 写道: > > 连接的版本部分我本地已经修改为 5了,发生了下面的报错; > >> st_env.connect( > >> Elasticsearch() > >> .version("5") > >> .host("localhost", 9200, "http") > >> .index("taxiid-cnts") > >> .document_type('taxiidcnt')

Re: Re: 如何做checkpoint的灾备

2020-06-16 Thread dixingxin...@163.com
@Congxian 感谢你的回复,我们会参考你的思路。 Best, Xingxing Di Sender: Congxian Qiu Send Time: 2020-06-15 09:55 Receiver: user-zh cc: zhangyingchen; pengxingbo Subject: Re: Re: 如何做checkpoint的灾备 正常的流程来说,能找到 checkpoint meta 文件,checkpoint 就是完整的。但是也可能会出现其他的一些异常(主要可能会有 FileNotFound 等异常),那些异常如果需要提前知道的话,可以再 JM

Re: Latency tracking together with broadcast state can cause job failure

2020-06-16 Thread Arvid Heise
Hi Lasse, your reported issue [1] will be fixed in the next release of 1.10 and the upcoming 1.11. Thank you for your detailed report. [1] https://issues.apache.org/jira/browse/FLINK-17322 On Wed, Apr 22, 2020 at 12:54 PM Lasse Nedergaard < lassenedergaardfl...@gmail.com> wrote: > Hi Yun > >

Re:Re: pyflink连接elasticsearch5.4问题

2020-06-16 Thread jack
连接的版本部分我本地已经修改为 5了,发生了下面的报错; >> st_env.connect( >> Elasticsearch() >> .version("5") >> .host("localhost", 9200, "http") >> .index("taxiid-cnts") >> .document_type('taxiidcnt') >> .key_delimiter("$")) \ 在

Re:Re: pyflink连接elasticsearch5.4问题

2020-06-16 Thread jack
连接的版本部分我本地已经修改为 5了,发生了下面的报错; >> st_env.connect( >> Elasticsearch() >> .version("5") >> .host("localhost", 9200, "http") >> .index("taxiid-cnts") >> .document_type('taxiidcnt') >> .key_delimiter("$")) \ 在

Improved performance when using incremental checkpoints

2020-06-16 Thread nick toker
Hello, We are using RocksDB as the backend state. At first we didn't enable the checkpoints mechanism. We observed the following behaviour and we are wondering why ? When using the rocksDB *without* checkpoint the performance was very extremely bad. And when we enabled the checkpoint the

Re:Re: Re: Re:Re: flink sql job 提交到yarn上报错

2020-06-16 Thread Zhou Zach
有输出的 在 2020-06-16 15:24:29,"王松" 写道: >那你在命令行执行:hadoop classpath,有hadoop的classpath输出吗? > >Zhou Zach 于2020年6月16日周二 下午3:22写道: > >> >> >> >> >> >> >> 在/etc/profile下,目前只加了 >> export HADOOP_CLASSPATH=`hadoop classpath` >> 我是安装的CDH,没找到sbin这个文件。。 >> >> >> >> >> >> >> >> >> >> >> >> 在

Re: pyflink连接elasticsearch5.4问题

2020-06-16 Thread Dian Fu
I guess it's because the ES version specified in the job is `6`, however, the jar used is `5`. > 在 2020年6月16日,下午1:47,jack 写道: > > 我这边使用的是pyflink连接es的一个例子,我这边使用的es为5.4.1的版本,pyflink为1.10.1,连接jar包我使用的是 > flink-sql-connector-elasticsearch5_2.11-1.10.1.jar,kafka,json的连接包也下载了,连接kafka测试成功了。 >

Re: pyflink连接elasticsearch5.4问题

2020-06-16 Thread Dian Fu
I guess it's because the ES version specified in the job is `6`, however, the jar used is `5`. > 在 2020年6月16日,下午1:47,jack 写道: > > 我这边使用的是pyflink连接es的一个例子,我这边使用的es为5.4.1的版本,pyflink为1.10.1,连接jar包我使用的是 > flink-sql-connector-elasticsearch5_2.11-1.10.1.jar,kafka,json的连接包也下载了,连接kafka测试成功了。 >

MapState bad performance

2020-06-16 Thread nick toker
Hello, We wrote a very simple streaming pipeline containing: 1. Kafka consumer 2. Process function 3. Kafka producer The code of the process function is listed below: private transient MapState testMapState; @Override public void processElement(Map value, Context ctx, Collector> out)

Re: Re:Re: flink sql job 提交到yarn上报错

2020-06-16 Thread Yang Wang
你这个看着hadoop兼容导致的问题,ContentSummary这个类是从hadoop 2.8以后发生了 变化。所以你需要确认你的lib下带的flink-shaded-hadoop与hdfs集群的版本是兼容的 Best, Yang Zhou Zach 于2020年6月16日周二 下午2:53写道: > flink/lib/下的jar: > flink-connector-hive_2.11-1.10.0.jar > flink-dist_2.11-1.10.0.jar > flink-jdbc_2.11-1.10.0.jar > flink-json-1.10.0.jar >

Re: Re: Re:Re: flink sql job 提交到yarn上报错

2020-06-16 Thread 王松
那你在命令行执行:hadoop classpath,有hadoop的classpath输出吗? Zhou Zach 于2020年6月16日周二 下午3:22写道: > > > > > > > 在/etc/profile下,目前只加了 > export HADOOP_CLASSPATH=`hadoop classpath` > 我是安装的CDH,没找到sbin这个文件。。 > > > > > > > > > > > > 在 2020-06-16 15:05:12,"王松" 写道: > >你配置HADOOP_HOME和HADOOP_CLASSPATH这两个环境变量了吗? > > >

Re:Re: Re:Re: flink sql job 提交到yarn上报错

2020-06-16 Thread Zhou Zach
在/etc/profile下,目前只加了 export HADOOP_CLASSPATH=`hadoop classpath` 我是安装的CDH,没找到sbin这个文件。。 在 2020-06-16 15:05:12,"王松" 写道: >你配置HADOOP_HOME和HADOOP_CLASSPATH这两个环境变量了吗? > >export HADOOP_HOME=/usr/local/hadoop-2.7.2 >export HADOOP_CLASSPATH=`hadoop classpath` >export

Re: flink1.11 小疑问(提升 DDL 易用性(动态 Table 属性))

2020-06-16 Thread 王松
6.14号的meetup中讲的动态 Table 属性很清楚,附个链接: https://developer.aliyun.com/live/2894?accounttraceid=07deb589f50c4c1abbcbed103e534316qnxq ,大概在04:17:00左右。 Kurt Young 于2020年6月16日周二 下午12:12写道: > table hint的语法是紧跟在你query中访问某张表的时候,所以我理解并不会有 ”这个动态参数作用在哪张表“ 上的疑问吧? > > Best, > Kurt > > > On Tue, Jun 16, 2020 at

Re: Re:Re: flink sql job 提交到yarn上报错

2020-06-16 Thread 王松
你配置HADOOP_HOME和HADOOP_CLASSPATH这两个环境变量了吗? export HADOOP_HOME=/usr/local/hadoop-2.7.2 export HADOOP_CLASSPATH=`hadoop classpath` export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH Zhou Zach 于2020年6月16日周二 下午2:53写道: > flink/lib/下的jar: > flink-connector-hive_2.11-1.10.0.jar >

Re:Re:Re: flink sql job 提交到yarn上报错

2020-06-16 Thread Zhou Zach
flink/lib/下的jar: flink-connector-hive_2.11-1.10.0.jar flink-dist_2.11-1.10.0.jar flink-jdbc_2.11-1.10.0.jar flink-json-1.10.0.jar flink-shaded-hadoop-2-3.0.0-cdh6.3.0-7.0.jar flink-sql-connector-kafka_2.11-1.10.0.jar flink-table_2.11-1.10.0.jar flink-table-blink_2.11-1.10.0.jar

Re:Re:Re: flink sql job 提交到yarn上报错

2020-06-16 Thread Zhou Zach
high-availability: zookeeper 在 2020-06-16 14:48:43,"Zhou Zach" 写道: > > > > >high-availability.storageDir: hdfs:///flink/ha/ >high-availability.zookeeper.quorum: cdh1:2181,cdh2:2181,cdh3:2181 >state.backend: filesystem >state.checkpoints.dir:

Re:Re: flink sql job 提交到yarn上报错

2020-06-16 Thread Zhou Zach
high-availability.storageDir: hdfs:///flink/ha/ high-availability.zookeeper.quorum: cdh1:2181,cdh2:2181,cdh3:2181 state.backend: filesystem state.checkpoints.dir: hdfs://nameservice1:8020//user/flink10/checkpoints state.savepoints.dir: hdfs://nameservice1:8020//user/flink10/savepoints

Re: flink sql job 提交到yarn上报错

2020-06-16 Thread 王松
你的配置文件中ha配置可以贴下吗 Zhou Zach 于2020年6月16日周二 下午1:49写道: > org.apache.flink.runtime.entrypoint.ClusterEntrypointException: Failed to > initialize the cluster entrypoint YarnJobClusterEntrypoint. > > at > org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:187) >

Re:flink sql job 提交到yarn上报错

2020-06-16 Thread Zhou Zach
将flink-shaded-hadoop-2-3.0.0-cdh6.3.0-7.0.jar放在flink/lib目录下,或者打入fat jar都不起作用。。。 At 2020-06-16 13:49:27, "Zhou Zach" wrote: org.apache.flink.runtime.entrypoint.ClusterEntrypointException: Failed to initialize the cluster entrypoint YarnJobClusterEntrypoint. at

Re: EventTimeSessionWindow firing too soon

2020-06-16 Thread Ori Popowski
So why is it happening? I have no clue at the moment. My event-time timestamps also do not have big gaps between them that would explain the window triggering. On Mon, Jun 15, 2020 at 9:21 PM Robert Metzger wrote: > If you are using event time in Flink, it is disconnected from the real > world

<    1   2