Re: Question about using transient in flink open()

2020-06-23 Thread kcz
…state ---Original message--- From: "" <13162790...@163.com>; Sent: 2020-06-24 (Wed) 1:36; To: "user-zh"

Re: Question about using transient in flink open()

2020-06-23 Thread 程龙
1. transient means the field it modifies is not serialized. 2. Be clear about why you are using transient, i.e. what it is for. 3. State can always be read and used; it just is not serialized as a plain field. At 2020-06-24 11:37:09, "kcz" <573693...@qq.com> wrote: >A code question: when initializing things like a MySQL client or >initializing state in open(), does marking them transient optimize the code? I don't quite understand this.

Re: Does FlinkSQL support something like a temporary intermediate table

2020-06-23 Thread Leonard Xu
Hello, What you actually need is to define the watermark on a field extracted from the record. That can only be done in the source table's DDL; it is not supported on views either. The computed column + udf in 1.10 should cover your need, roughly like this: CREATE TABLE sourceTable ( request_uri STRING, ts as extractTsUdf(request_uri), WATERMARK FOR ts AS ts - INTERVAL '5' SECOND ) WITH ( .. ); select ... from ( select ts, T.* from
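
For illustration, a minimal sketch of wiring up the computed-column-plus-watermark DDL described above, from Java (assumes Flink 1.10 with the blink planner; the UDF body and connector options are placeholders, and the UDF is assumed to return a type that maps to TIMESTAMP(3)):

    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.table.api.EnvironmentSettings;
    import org.apache.flink.table.api.java.StreamTableEnvironment;
    import org.apache.flink.table.functions.ScalarFunction;

    public class WatermarkOnComputedColumn {
        // Hypothetical UDF pulling the event timestamp out of request_uri.
        public static class ExtractTs extends ScalarFunction {
            public java.sql.Timestamp eval(String requestUri) {
                // ... parse the URI and return the embedded event time ...
                return new java.sql.Timestamp(0L);
            }
        }

        public static void main(String[] args) {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            EnvironmentSettings settings = EnvironmentSettings.newInstance()
                .useBlinkPlanner().inStreamingMode().build();
            StreamTableEnvironment tEnv = StreamTableEnvironment.create(env, settings);

            tEnv.registerFunction("extractTsUdf", new ExtractTs());
            // Computed column + watermark, as in the reply above.
            tEnv.sqlUpdate(
                "CREATE TABLE sourceTable (" +
                "  request_uri STRING," +
                "  ts AS extractTsUdf(request_uri)," +
                "  WATERMARK FOR ts AS ts - INTERVAL '5' SECOND" +
                ") WITH (" +
                "  'connector.type' = '...'" + // placeholder: real connector options go here
                ")");
        }
    }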

Re: Re: Does FlinkSQL support something like a temporary intermediate table

2020-06-23 Thread Jark Wu
You can use a computed column right in the DDL to derive heart_time from request_uri, and then define the watermark on that computed column. For example: CREATE TABLE sourceTable ( request_uri STRING, heart_time AS my_parse(request_uri), WATERMARK FOR heart_time AS heart_time - INTERVAL '1' SECOND ) WITH ( ... ); This does mean the field gets parsed twice, though. Best, Jark On Wed, 24 Jun 2020

Re: Re: Does FlinkSQL support something like a temporary intermediate table

2020-06-23 Thread Weixubin
Thanks for the subquery idea; unfortunately, after trying it, it still does not meet our needs. Our scenario reads messages from Aliyun SLS. Each message has a request_uri field. The first processing step parses request_uri into multiple attributes (columns), stored as one row per record. The second step declares heart_time as the event-time attribute with a 5-second watermark delay and does windowed aggregation over the records. //Flink does not support doing this in a subquery: a WATERMARK FOR declaration can only appear in DDL. For example: select ..ts as

datadog failed to send report

2020-06-23 Thread Fanbin Bu
Hi, Does anyone have any idea about the following error msg (it flooded my task manager log)? I do have datadog metrics present, so this probably only happens for some metrics. 2020-06-24 03:27:15,362 WARN org.apache.flink.metrics.datadog.DatadogHttpClient- Failed sending request to

Re: Question about using transient in flink open()

2020-06-23 Thread Benchao Li
The transient keyword mainly tells the JVM that the field should not be serialized. The reason many variables that can be initialized in open() are recommended to be transient is that such variables do not really need to take part in serialization, caches for example; or they simply cannot be serialized, such as connection-related objects. kcz <573693...@qq.com> wrote on Wed, Jun 24, 2020 at 11:37: > A code question: when initializing things like a MySQL client or > initializing state in open(), does marking them transient optimize the code? I don't quite understand this. -- Best, Benchao Li
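
For illustration, a minimal sketch of the pattern described above (the MySQL URL and credentials are placeholders): the non-serializable resource is declared transient and created in open(), so only the function object itself is serialized and shipped to the task managers.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import org.apache.flink.api.common.functions.RichMapFunction;
    import org.apache.flink.configuration.Configuration;

    public class EnrichFromMysql extends RichMapFunction<String, String> {
        // Excluded from Java serialization; rebuilt on each task manager.
        private transient Connection connection;

        @Override
        public void open(Configuration parameters) throws Exception {
            connection = DriverManager.getConnection("jdbc:mysql://host:3306/db", "user", "pwd");
        }

        @Override
        public String map(String value) throws Exception {
            // ... look something up via this.connection ...
            return value;
        }

        @Override
        public void close() throws Exception {
            if (connection != null) {
                connection.close();
            }
        }
    }

Note that transient is not an optimization here: without it, serializing the function would simply fail (or drag along a useless snapshot of the object), since connections cannot travel between JVMs.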

Question about using transient in flink open()

2020-06-23 Thread kcz
A code question: when initializing things like a MySQL client or initializing state in open(), does marking them transient optimize the code? I don't quite understand this.

Re: When a flink job fails and restarts, the last checkpoint fails but the job still restarts normally, leaving state inconsistent across the restart

2020-06-23 Thread 程龙
You could patch the consumer in the source code to check the offsets: start normally if they are the original ones, otherwise refuse to start. At 2020-06-22 20:09:11, "莫失莫忘" wrote: >As the subject asks: can flink be required to recover from the checkpoint when restarting after a failure, and fail the restart otherwise?

Why does KafkaFetcher spawn a new KafkaConsumerThread to fetch data instead of looping in LegacySourceFunctionThread?

2020-06-23 Thread Han Han1 Yue
Hi, In the while loop that fetches data from kafka, why isn't the current thread (LegacySourceFunctionThread) used directly? Instead a consumerThread is created that fetches a batch at a time and hands it over to the current thread. Source version: 1.10.0 // KafkaFetcher.java @Override public void runFetchLoop() throws Exception { try { final Handover handover = this.handover; // kick off the actual
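
For context, a much-simplified sketch of the handover pattern the question refers to (this is not Flink's actual Handover class, just the idea): a dedicated thread blocks on the external poll, while the fetch-loop thread only takes finished batches, so the blocking I/O never happens while holding the checkpoint lock.

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    public class HandoverSketch {
        public static void main(String[] args) {
            // Capacity-1 queue playing the role of Flink's Handover.
            BlockingQueue<String> handover = new ArrayBlockingQueue<>(1);

            Thread consumerThread = new Thread(() -> {
                while (!Thread.currentThread().isInterrupted()) {
                    try {
                        String batch = pollExternalSystem(); // blocking I/O lives here
                        handover.put(batch);                 // waits until the last batch was taken
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }
            }, "consumer-thread");
            consumerThread.start();

            // The fetch-loop thread never blocks on external I/O, so it can
            // take the checkpoint lock only briefly while emitting records.
            while (true) {
                try {
                    String batch = handover.take();
                    // synchronized (checkpointLock) { emit(batch); }
                    System.out.println(batch);
                } catch (InterruptedException e) {
                    return;
                }
            }
        }

        private static String pollExternalSystem() throws InterruptedException {
            Thread.sleep(100); // stand-in for a blocking poll, e.g. KafkaConsumer#poll
            return "records";
        }
    }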

Re: Jobmanager restart: cannot set up a jobmanager

2020-06-23 Thread Yang Wang
"service temporarily unavailable due to an ongoing leader election" 只是说明rest server leader还没有 选出来,是正常的 你把失败的JM以及新的JM log发出来吧,这样方便看到是不是Flink自己去清理的 Best, Yang 绘梦飘雪 <318666...@qq.com> 于2020年6月23日周二 下午4:28写道: > hdfs上 ha storage 目录还在,但里的文件没了,作业占用的资源还在并没有释放,访问flinkui 报service > temporarily

Re: Is there an interface or method in Flink 1.10 to get a batch job's execution progress

2020-06-23 Thread 程龙
You could try a custom listener. At 2020-06-24 09:12:05, "faaron zheng" wrote: >Is there an interface or method in Flink 1.10 to get a batch job's execution progress, as a percentage? faaron zheng Email: faaronzh...@gmail.com Signature customized by >网易邮箱大师
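
The reply does not spell out which listener is meant. One concrete alternative that requires no changes to the job is to poll the JobManager REST API and count finished vertices; a rough sketch (host, port and the JSON handling are simplified, use a real JSON library in practice):

    import java.io.InputStream;
    import java.net.URL;
    import java.util.Scanner;

    public class JobProgressPoller {
        public static void main(String[] args) throws Exception {
            // args[0] is the job id; /jobs/<jobid> lists every vertex with a "status" field.
            String url = "http://jobmanager:8081/jobs/" + args[0];
            try (InputStream in = new URL(url).openStream();
                 Scanner s = new Scanner(in).useDelimiter("\\A")) {
                String json = s.hasNext() ? s.next() : "";
                int total = count(json, "\"status\":");
                int finished = count(json, "\"status\":\"FINISHED\"");
                System.out.printf("progress: %d/%d vertices finished%n", finished, total);
            }
        }

        private static int count(String haystack, String needle) {
            int c = 0, i = 0;
            while ((i = haystack.indexOf(needle, i)) >= 0) { c++; i += needle.length(); }
            return c;
        }
    }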

Re: Flink DataStream

2020-06-23 Thread xuhaiLong
It was my mistake; I had pulled in the old planner. Thanks! On 6/23/2020 21:05, LakeShen wrote: Hi xuhaiLong, Judging by your dependencies you are on the old planner; you need the blink planner to use the row_number function, i.e. use flink-table-planner-blink_2.11. See the docs: https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/table/#table-program-dependencies Best, LakeShen

Is there an interface or method in Flink 1.10 to get a batch job's execution progress

2020-06-23 Thread faaron zheng
Is there an interface or method in Flink 1.10 to get a batch job's execution progress, as a percentage? faaron zheng Email: faaronzh...@gmail.com Signature customized by 网易邮箱大师

Re: Session Window with Custom Trigger

2020-06-23 Thread KristoffSC
It seems that I'm clearing the timers in the right way, but there is a new timer created from the WindowOperator::registerCleanupTimer method. This one is called from WindowOperator::processElement at the end of both if/else branches. How can I mitigate this? I don't want to have any "late firings" for

RichAggregationFunction

2020-06-23 Thread Steven Nelson
I am trying to add some custom metrics to my window (because the window is causing a lot of backpressure). However I can't seem to use a RichAggregationFunction instead of an AggregationFunction. I am trying to see how long things get held in our EventTimeSessionWindows.withGap window. Is there
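
Flink indeed rejects rich functions in the window aggregate slot. A common workaround (general knowledge, not from this thread) is to pair a plain AggregateFunction with a ProcessWindowFunction: the latter is a rich function, so custom metrics can be registered there. A sketch, with a window-span gauge standing in for "how long things get held":

    import org.apache.flink.api.common.functions.AggregateFunction;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.metrics.Gauge;
    import org.apache.flink.streaming.api.functions.windowing.ProcessWindowFunction;
    import org.apache.flink.streaming.api.windowing.windows.TimeWindow;
    import org.apache.flink.util.Collector;

    public class WindowMetricsSketch {

        // The pre-aggregation half must stay non-rich.
        public static class CountAgg implements AggregateFunction<String, Long, Long> {
            public Long createAccumulator() { return 0L; }
            public Long add(String value, Long acc) { return acc + 1; }
            public Long getResult(Long acc) { return acc; }
            public Long merge(Long a, Long b) { return a + b; }
        }

        // This half IS rich, so the metric lives here.
        public static class EmitWithMetrics
                extends ProcessWindowFunction<Long, Long, String, TimeWindow> {

            private transient volatile long lastWindowSpanMs;

            @Override
            public void open(Configuration parameters) {
                getRuntimeContext().getMetricGroup()
                    .gauge("lastWindowSpanMs", (Gauge<Long>) () -> lastWindowSpanMs);
            }

            @Override
            public void process(String key, Context ctx, Iterable<Long> counts,
                                Collector<Long> out) {
                // For a session window, window length is a rough proxy for
                // how long elements were buffered.
                lastWindowSpanMs = ctx.window().getEnd() - ctx.window().getStart();
                out.collect(counts.iterator().next());
            }
        }
        // Usage: stream.keyBy(...).window(EventTimeSessionWindows.withGap(...))
        //              .aggregate(new CountAgg(), new EmitWithMetrics());
    }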

Re: Non parallel file sources

2020-06-23 Thread Vishwas Siravara
Thanks, that makes sense. On Tue, Jun 23, 2020 at 2:13 PM Laurent Exsteens < laurent.exste...@euranova.eu> wrote: > Hi Nick, > > On a project I worked on, we simply made the file accessible on a shared > NFS drive. > Our source was custom, and we forced it to parallelism 1 inside the job, > so

Re: DROOLS rule engine with flink

2020-06-23 Thread Georg Heiler
Why not use flink CEP? https://flink.apache.org/news/2020/03/24/demo-fraud-detection-2.html has a nice interactive example Best, Georg Jaswin Shah wrote on Tue, Jun 23, 2020 at 21:03: > Hi I am thinking of using some rule engine like DROOLS with flink to solve > a problem described below:
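
For a flavor of what the CEP route looks like, a minimal self-contained sketch (the "two large amounts within a minute" rule is invented for illustration):

    import java.util.List;
    import java.util.Map;
    import org.apache.flink.cep.CEP;
    import org.apache.flink.cep.PatternSelectFunction;
    import org.apache.flink.cep.PatternStream;
    import org.apache.flink.cep.pattern.Pattern;
    import org.apache.flink.cep.pattern.conditions.SimpleCondition;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.windowing.time.Time;

    public class CepSketch {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            DataStream<Double> amounts = env.fromElements(5.0, 600.0, 700.0); // placeholder events

            Pattern<Double, ?> pattern = Pattern.<Double>begin("first")
                .where(new SimpleCondition<Double>() {
                    @Override public boolean filter(Double v) { return v > 500; }
                })
                .next("second")
                .where(new SimpleCondition<Double>() {
                    @Override public boolean filter(Double v) { return v > 500; }
                })
                .within(Time.minutes(1));

            PatternStream<Double> matches = CEP.pattern(amounts, pattern);
            matches.select(new PatternSelectFunction<Double, String>() {
                @Override
                public String select(Map<String, List<Double>> match) {
                    return "suspicious: " + match;
                }
            }).print();

            env.execute("cep-sketch");
        }
    }

Note that within() follows the stream's time characteristic; for event-time rules, assign timestamps and watermarks first.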

Re: Non parallel file sources

2020-06-23 Thread Laurent Exsteens
Hi Nick, On a project I worked on, we simply made the file accessible on a shared NFS drive. Our source was custom, and we forced it to parallelism 1 inside the job, so the file wouldn't be read multiple times. The rest of the job was distributed. This was also on a standalone cluster. On a
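
A minimal sketch of the "force the source to parallelism 1" part (the path is a placeholder and must be reachable from whichever task manager runs the single source subtask, e.g. via the shared NFS mount described above):

    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class SingleReaderFile {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            DataStream<String> lines = env
                .readTextFile("file:///mnt/shared/input.csv")
                .setParallelism(1); // one reader, so the file is read exactly once

            lines.map(String::toUpperCase).setParallelism(4).print(); // rest of the job stays parallel
            env.execute("single-reader-file");
        }
    }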

DROOLS rule engine with flink

2020-06-23 Thread Jaswin Shah
Hi, I am thinking of using a rule engine like DROOLS with flink to solve the problem described below: I have a stream of events coming from a kafka topic and I want to analyze those events based on some rules, emitting results to result streams when the rules are satisfied. Now, I am able to

Non parallel file sources

2020-06-23 Thread Nick Bendtner
Hi guys, What is the best way to process a file from a unix file system, given there is no guarantee as to which task manager will be assigned to process the file? We run flink in standalone mode. We currently follow the brute-force approach of copying the file to every task manager; is there a

Re: Session Window with Custom Trigger

2020-06-23 Thread KristoffSC
Hi Marco Villalobos-2, Unfortunately I don't think a Tumbling window will work in my case. The reasons: 1. The window must start only when there is a new event and the previous window is closed. A new Tumbling window is created just after the previous one is purged. In my case I have to use SessionWindow

Flink SQL: parsing a json array field with a UDF

2020-06-23 Thread lonely Wanderer
Hi, I'm using Flink (1.8) SQL with a UDF to parse a json field whose "name" member is an array of objects with names and values (e.g. appKey). Sample json: {"appKey": "qq", "eventId": "18", "name" : [{"a":"jack","b":"mark","c":"tark"},{...},...]} How can this json be parsed?

Re: Interact with different S3 buckets from a shared Flink cluster

2020-06-23 Thread Steven Wu
Internally, we have our own ConfigurableCredentialsProvider. Based on the config in core-site.xml, it does assume-role with the proper IAM credentials using STSAssumeRoleSessionCredentialsProvider. We just need to grant permission for the instance credentials to be able to assume the IAM role for
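
Roughly the shape such a provider can take (the class body is a guess at what is described above; the real APIs used are Hadoop's s3a provider contract and the AWS SDK v1 STSAssumeRoleSessionCredentialsProvider, and the config key is only illustrative):

    import java.net.URI;
    import com.amazonaws.auth.AWSCredentials;
    import com.amazonaws.auth.AWSCredentialsProvider;
    import com.amazonaws.auth.STSAssumeRoleSessionCredentialsProvider;
    import org.apache.hadoop.conf.Configuration;

    public class ConfigurableCredentialsProvider implements AWSCredentialsProvider {

        private final STSAssumeRoleSessionCredentialsProvider delegate;

        // s3a instantiates providers listed in fs.s3a.aws.credentials.provider
        // with a (URI, Configuration) constructor.
        public ConfigurableCredentialsProvider(URI uri, Configuration conf) {
            String roleArn = conf.get("fs.s3a.assumed.role.arn"); // role to assume for this bucket
            delegate = new STSAssumeRoleSessionCredentialsProvider
                .Builder(roleArn, "flink-" + uri.getHost())
                .build();
        }

        @Override
        public AWSCredentials getCredentials() { return delegate.getCredentials(); }

        @Override
        public void refresh() { delegate.refresh(); }
    }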

Re: Session Window with Custom Trigger

2020-06-23 Thread Marco Villalobos
Hi Kristoff, > On Jun 23, 2020, at 6:52 AM, KristoffSC > wrote: > > Hi all, > I'm using Flink 1.9.2 and I would like to ask about my use case and approach > I took to meet it. > > The use case: > I have a keyed stream, where I have to buffer messages with logic: > 1. Buffering should

Re: Session Window with Custom Trigger

2020-06-23 Thread KristoffSC
One addition: in the clear method of my custom trigger I call ctx.deleteProcessingTimeTimer(window.maxTimestamp()); -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

State leak

2020-06-23 Thread Ori Popowski
Hi, When working with an ever growing key-space (let's say session ID), and a SessionWindow with a ProcessFunction - should we worry about the state growing indefinitely? Or does the window make sure to clean state after triggers? Thanks

Re: flink1.9 on yarn: errors after running for over two months

2020-06-23 Thread LakeShen
Hi guanyq, From the log I can see that local storage on the TaskManager's machine is almost full. Check whether the problem is that the TaskManager machine is running out of storage. Best, LakeShen xueaohui_...@163.com wrote on Sat, Jun 20, 2020 at 09:57: > Not sure whether there are detailed logs on yarn. > > Could hdfs have a permission problem? > > xueaohui_...@163.com > > From: guanyq > Sent: 2020-06-20 08:48 > To: user-zh > Subject: flink1.9 on yarn

Session Window with Custom Trigger

2020-06-23 Thread KristoffSC
Hi all, I'm using Flink 1.9.2 and I would like to ask about my use case and the approach I took to meet it. The use case: I have a keyed stream, where I have to buffer messages with logic: 1. Buffering should start only when a message arrives. 2. The max buffer time should not be longer than 3
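
A custom trigger in the spirit of this use case might look like the sketch below (the 3-minute cap comes from the thread; everything else is an illustrative assumption, and a merging session window would additionally need canMerge()/onMerge(), omitted here):

    import org.apache.flink.api.common.state.ValueState;
    import org.apache.flink.api.common.state.ValueStateDescriptor;
    import org.apache.flink.api.common.typeinfo.Types;
    import org.apache.flink.streaming.api.windowing.triggers.Trigger;
    import org.apache.flink.streaming.api.windowing.triggers.TriggerResult;
    import org.apache.flink.streaming.api.windowing.windows.TimeWindow;

    public class MaxBufferTimeTrigger extends Trigger<Object, TimeWindow> {
        private static final long MAX_BUFFER_MS = 3 * 60 * 1000;

        private final ValueStateDescriptor<Long> deadlineDesc =
            new ValueStateDescriptor<>("deadline", Types.LONG);

        @Override
        public TriggerResult onElement(Object element, long ts, TimeWindow window,
                                       TriggerContext ctx) throws Exception {
            ValueState<Long> deadline = ctx.getPartitionedState(deadlineDesc);
            if (deadline.value() == null) {
                // First element of this window: start the max-buffer clock.
                long t = ctx.getCurrentProcessingTime() + MAX_BUFFER_MS;
                deadline.update(t);
                ctx.registerProcessingTimeTimer(t);
            }
            return TriggerResult.CONTINUE;
        }

        @Override
        public TriggerResult onProcessingTime(long time, TimeWindow window, TriggerContext ctx) {
            return TriggerResult.FIRE_AND_PURGE; // fire once, then drop the buffer
        }

        @Override
        public TriggerResult onEventTime(long time, TimeWindow window, TriggerContext ctx) {
            return TriggerResult.CONTINUE;
        }

        @Override
        public void clear(TimeWindow window, TriggerContext ctx) throws Exception {
            ValueState<Long> deadline = ctx.getPartitionedState(deadlineDesc);
            if (deadline.value() != null) {
                ctx.deleteProcessingTimeTimer(deadline.value());
                deadline.clear();
            }
        }
    }

As discussed later in the thread, WindowOperator registers its own cleanup timer at window.maxTimestamp() regardless of the trigger, which is the extra timer observed there.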

Failed to load dependency after migration to Flink 1.10

2020-06-23 Thread Thms Hmm
Hey all, we are currently migrating our Flink jobs from v1.9.1 to v1.10.1. The code has been migrated as well as our Docker images (deploying on K8s using standalone mode). Now an issue occurs if we use log4j2 and the Kafka Appender which was working before. There are a lot of errors regarding

Re: Flink DataStream

2020-06-23 Thread LakeShen
Hi xuhaiLong, Judging by your dependencies you are on the old planner; you need the blink planner to use the row_number function, i.e. use flink-table-planner-blink_2.11. See the docs: https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/table/#table-program-dependencies Best, LakeShen xuhaiLong wrote on Tue, Jun 23, 2020 at 20:14: > "org.apache.flink" %%
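
A minimal sketch of explicitly selecting the blink planner in Flink 1.10 (real API; the rest of the job is omitted):

    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.table.api.EnvironmentSettings;
    import org.apache.flink.table.api.java.StreamTableEnvironment;

    public class BlinkPlannerSetup {
        public static void main(String[] args) {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            EnvironmentSettings settings = EnvironmentSettings.newInstance()
                .useBlinkPlanner()   // ROW_NUMBER() OVER (...) needs this planner
                .inStreamingMode()
                .build();
            StreamTableEnvironment tEnv = StreamTableEnvironment.create(env, settings);
            // ... register tables and run SQL with ROW_NUMBER() here ...
        }
    }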

Re: Flink DataStream

2020-06-23 Thread xuhaiLong
"org.apache.flink" %% "flink-table-api-scala-bridge" % "1.10.1", "org.apache.flink" %% "flink-table-planner" % "1.10.1" % "provided", 看下粘贴的 sbt 依赖 On 6/23/2020 20:06,Jark Wu wrote: 图片无法查看,你可以把图片上传到某图床,然后将链接贴这里。 On Tue, 23 Jun 2020 at 19:59, xuhaiLong wrote: 使用的是1.10.1,在 table api

Re: Flink DataStream

2020-06-23 Thread Jark Wu
The image can't be viewed; upload it to an image host and paste the link here. On Tue, 23 Jun 2020 at 19:59, xuhaiLong wrote: > On 1.10.1, ROW_NUMBER can't be used in the table api > On 6/23/2020 19:52, Jark Wu wrote: > > Hi xuhaiLong, > > The 1.10 blink planner does support ROW_NUMBER() over (together with rownum <= N). Are you using the old > planner? > > Best, > Jark > > On Tue, 23

Re: Flink DataStream

2020-06-23 Thread xuhaiLong
On 1.10.1, ROW_NUMBER can't be used in the table api. On 6/23/2020 19:52, Jark Wu wrote: Hi xuhaiLong, The 1.10 blink planner does support ROW_NUMBER() over (together with rownum <= N). Are you using the old planner? Best, Jark On Tue, 23 Jun 2020 at 19:44, LakeShen wrote: Hi xuhaiLong, See whether a custom UDTF in Flink SQL can cover your need. Best,

Re: Flink DataStream

2020-06-23 Thread Jark Wu
Hi xuhaiLong, The 1.10 blink planner does support ROW_NUMBER() over (together with rownum <= N). Are you using the old planner? Best, Jark On Tue, 23 Jun 2020 at 19:44, LakeShen wrote: > Hi xuhaiLong, > > See whether a custom UDTF in Flink SQL can cover your need. > > Best, > LakeShen > > xuhaiLong wrote on Tue, Jun 23, 2020 at 19:18: > > Hi > > > >
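
The Top-N shape referred to here, written against the table from the original question (assumes a table scores(userId, categoryId, score) is already registered in a blink-planner TableEnvironment; picks the top category per user):

    import org.apache.flink.table.api.Table;
    import org.apache.flink.table.api.TableEnvironment;

    public class TopNSketch {
        public static Table topPerUser(TableEnvironment tEnv) {
            return tEnv.sqlQuery(
                "SELECT userId, categoryId, total " +
                "FROM (" +
                "  SELECT userId, categoryId, total," +
                "         ROW_NUMBER() OVER (PARTITION BY userId ORDER BY total DESC) AS rownum" +
                "  FROM (SELECT userId, categoryId, SUM(score) AS total" +
                "        FROM scores GROUP BY userId, categoryId)" +
                ") WHERE rownum <= 1");
        }
    }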

Re: When a flink job fails and restarts, the last checkpoint fails but the job still restarts normally, leaving state inconsistent across the restart

2020-06-23 Thread LakeShen
Hi, As Congxian said, when a Flink job recovers from a failure it restores from the last successful checkpoint. So what exactly do you mean by the last checkpoint failing? Best, LakeShen Congxian Qiu wrote on Mon, Jun 22, 2020 at 20:23: > hi > > What do you mean by inconsistent state here? Restoring from a checkpoint resets global state to some previous successful checkpoint. > > Best, > Congxian > > > 莫失莫忘 wrote on Mon, Jun 22, 2020 at 20:09: > >

Re: Flink DataStream

2020-06-23 Thread LakeShen
Hi xuhaiLong, See whether a custom UDTF in Flink SQL can cover your need. Best, LakeShen xuhaiLong wrote on Tue, Jun 23, 2020 at 19:18: > Hi > > A question: > > I need to compute a user's categoryId over data like this > | userId | articleID | categoryId | score | > | 01 | A | 1 | 10 | > | 01 | B | 1 | 20 | > | 01 | C | 2 | 30 | > > > >

Re: Flink(1.8.3) UI exception is overwritten, and the cause of the failure is not displayed

2020-06-23 Thread Arvid Heise
Hi Andrew, this looks like your Flink cluster has a flaky connection to the Kafka cluster or your Kafka cluster was down. Since the operator failed on the sync part of the snapshot, it resorted to failure to avoid having inconsistent operator state. If you configured restarts, it just restart

[flink web] Flink 1.7 on Yarn with http kerberos authentication enabled: 403 when accessing the flink web UI

2020-06-23 Thread tao wang
hi, A question: *Environment:* yarn 2.9.2, kerberos enabled for http (hadoop.http.authentication.type = kerberos) *flink version:* official 1.7.1 *Version 1.10 can be accessed normally.* Accessing the flink web UI reports the error below. [image: D79C0EE4-F084-436B-8944-83677A57A320_4_5005_c.jpeg]

Flink DataStream

2020-06-23 Thread xuhaiLong
Hi, A question: I need to compute a user's categoryId over data like this: | userId | articleID | categoryId | score | | 01 | A | 1 | 10 | | 01 | B | 1 | 20 | | 01 | C | 2 | 30 | Currently my implementation uses the tableAPI to group by userId and categoryId, aggregate score, and then do TopN ordering via state. Is there a better way to implement this? I tried ROW_NUMBER() over() on flink 1.10 and couldn't use it. For the group aggregation, besides using Table

Re: Job started from savepoint: inconsistent state

2020-06-23 Thread claylin
… UID … name ---Original message--- From: "Sun.Zhu" <17626017...@163.com>; Sent: 2020-06-23 (Tue) 5:18; To: "user-zh" … https://issues.apache.org/jira/browse/FLINK-5601

Re: Job started from savepoint: inconsistent state

2020-06-23 Thread Sun.Zhu
hi claylin, Did you set the uid? Has the DAG changed? | | Sun.Zhu | | 17626017...@163.com | On 2020-06-23 16:29, claylin <1012539...@qq.com> wrote: …

Re: Dimension-table join doesn't support cascading after event-time windows

2020-06-23 Thread Benchao Li
It's not a hidden function; it is exactly the function for declaring a processing-time attribute. If you declare a processing-time attribute through a computed column in DDL, you use this same function. 赵玉豪 wrote on Tue, Jun 23, 2020 at 16:37: > Thanks, I'll give it a try. Is proctime() a hidden function? I haven't seen it on the official site. > > ---Original message--- > From: "Benchao Li" Sent: 2020-06-23 (Tue) 16:31 > To: "user-zh" Subject: Re: Dimension-table join doesn't support cascading after event-time windows > > After the event-time window, you can try creating a view like `select *,

Re: what to consider when testing a data stream application using the TPC-H benchmark data?

2020-06-23 Thread Felipe Gutierrez
I am afraid that you can be much more precise if you use System.nanoTime() instead of System.currentTimeMillis() together with Thread.sleep(delay). First, because Thread.sleep is less precise [1], and second, because you can do fewer operations with System.nanoTime() in an empty loop. Like this:
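
The mail is cut off after "Like this:"; presumably the shape is something along these lines (a guess consistent with the description above, not the original snippet):

    public class NanoTimeLoop {
        public static void main(String[] args) {
            long ops = 0;
            long deadline = System.nanoTime() + 1_000_000_000L; // run for ~1 second
            while (System.nanoTime() < deadline) {
                ops++; // stand-in for the operation being measured
            }
            System.out.println("ops in ~1s: " + ops);
        }
    }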

Re: Dimension-table join doesn't support cascading after event-time windows

2020-06-23 Thread 赵玉豪
Thanks, I'll give it a try. Is proctime() a hidden function? I haven't seen it on the official site. ---Original message--- From: "Benchao Li"

Re: Dimension-table join doesn't support cascading after event-time windows

2020-06-23 Thread Benchao Li
After the event-time window is done, you can create another view, something like `select *, PROCTIME() AS proctime from window_result`; that gives you a processing-time attribute again, so the dimension-table join can be done downstream. 赵玉豪 wrote on Tue, Jun 23, 2020 at 16:21: > The current dimension-table join syntax requires a proctime field in the left table, but after an event-time window the proctime attribute is lost, so the join syntax can't be used. Is there a good solution? -- Best, Benchao Li
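
A minimal sketch of this suggestion (window_result and the dimension table dim are placeholders; dim is assumed to be a lookup-capable connector table registered elsewhere):

    import org.apache.flink.table.api.Table;
    import org.apache.flink.table.api.TableEnvironment;

    public class ProctimeAfterWindow {
        public static Table rejoin(TableEnvironment tEnv) {
            // Re-attach a processing-time attribute to the window result.
            Table withProctime = tEnv.sqlQuery(
                "SELECT *, PROCTIME() AS proctime FROM window_result");
            tEnv.createTemporaryView("with_proctime", withProctime);

            // Now the temporal (dimension) table join works again.
            return tEnv.sqlQuery(
                "SELECT w.*, d.name " +
                "FROM with_proctime AS w " +
                "JOIN dim FOR SYSTEM_TIME AS OF w.proctime AS d " +
                "ON w.user_id = d.id");
        }
    }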

Re: Job started from savepoint: inconsistent state

2020-06-23 Thread claylin
… savepoint … flatmap … jobgraph …

Re: Jobmanager restart: cannot set up a jobmanager

2020-06-23 Thread 绘梦飘雪
The ha storage directory on hdfs is still there, but the files inside are gone; the resources held by the job are not released, and visiting the flink ui reports service temporarily unavailable due to an ongoing leader election ---Original message--- From: "Yang Wang"

Re: [EXTERNAL] Re: Renaming the metrics

2020-06-23 Thread Ori Popowski
Thanks for the suggestion. After digging a bit, we've found it most convenient to just add labels to all our Prometheus queries, like this: flink_taskmanager_job_task_operator_currentOutputWatermark{job_name=""} The job_name label will be exposed if you run your job with a job name like this:
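
The mail is truncated after "like this:"; presumably it refers to the name passed to execute(), which is what the job_name label carries (an assumption consistent with the query above, not confirmed by the archive):

    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class NamedJob {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            env.fromElements(1, 2, 3).print();
            env.execute("my-job-name"); // surfaces as job_name="my-job-name" in Prometheus
        }
    }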

Dimension-table join doesn't support cascading after event-time windows

2020-06-23 Thread 赵玉豪
The current dimension-table join syntax requires a proctime field in the left table, but after an event-time window the proctime attribute is lost, so the dimension-table join syntax can't be used. Is there a good solution?

Re: Jobmanager restart: cannot set up a jobmanager

2020-06-23 Thread Yang Wang
Flink will not clean up the HA storage itself unless the job finishes or fails; on JM failover it pulls the data back from HDFS. Could some external system have cleaned up the contents of the HA storage? Best, Yang 绘梦飘雪 <318666...@qq.com> wrote on Tue, Jun 23, 2020 at 12:50: > When the jobmanager restarts: org.apache.flink.runtime.client.jobexecutionexception could > not set up jobmanager > cannot set up the user code libraries

Re: Does FlinkSQL support something like a temporary intermediate table

2020-06-23 Thread Leonard Xu
Hi, What I meant is: if the intermediate result needs to be written out, then use one sink that writes to the intermediate result table (whatever table you need) and another sink that writes to the final result table; the `select * from sourceTable , LATERAL TABLE(ParseUriRow(request_uri)) as T( )….` part in front of those two sinks can be reused, similar to a VIEW. If you don't need the intermediate result table and only want the final result, just write a nested sql: the inner layer is `select * from sourceTable , LATERAL TABLE(ParseUriRow(request_uri)) as
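
A sketch of the two-sink layout described above (all table names except sourceTable are placeholders, the column list of T() is invented, and the tables are assumed to be registered via DDL; in Flink 1.10 several sqlUpdate() calls submitted in one job share the common upstream):

    import org.apache.flink.table.api.Table;
    import org.apache.flink.table.api.TableEnvironment;

    public class TwoSinks {
        public static void wire(TableEnvironment tEnv) {
            Table parsed = tEnv.sqlQuery(
                "SELECT T.* FROM sourceTable, " +
                "LATERAL TABLE(ParseUriRow(request_uri)) AS T(heart_time, other_col)");
            tEnv.createTemporaryView("parsed", parsed);

            // Sink 1: the intermediate result table.
            tEnv.sqlUpdate("INSERT INTO middleSinkTable SELECT * FROM parsed");

            // Sink 2: the final result table (the real thread additionally needs the
            // watermark declared on a DDL computed column for event-time windows; elided).
            tEnv.sqlUpdate(
                "INSERT INTO finalSinkTable " +
                "SELECT other_col, COUNT(*) FROM parsed GROUP BY other_col");
        }
    }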

Re: Re: flink: passing https-xxx.xxx.jar via --classpath runs fine on 1.8, but flink 1.10 reports the passed jar cannot be found

2020-06-23 Thread 程龙
Both flink versions are installed on the same cluster. Using -C http://xxx.jar works on 1.8, which shows the URL is reachable from everywhere, yet it fails on 1.10. At 2020-06-22 17:46:56, "Yang Wang" wrote: > -C,--classpath Adds a URL to each user code > classloader on all nodes in the

Re: Re: flink: passing https-xxx.xxx.jar via --classpath runs fine on 1.8, but flink 1.10 reports the passed jar cannot be found

2020-06-23 Thread 程龙
Yes, confirmed every node can reach it over http. At 2020-06-23 10:04:55, "Weixubin" <18925434...@163.com> wrote: >It shouldn't be version-related. In a multi-node deployment, the URL given to -C must be reachable from every node. Can you confirm that all nodes can reach the URL? > Best, > Bin > >At 2020-06-22 11:43:11, "程龙" <13162790...@163.com> wrote: >>2020-06-22 10:16:34,379 INFO

Re: Re: Does FlinkSQL support something like a temporary intermediate table

2020-06-23 Thread Weixubin
Hi, About the sentence "insert the ` select * from sourceTable , LATERAL TABLE(ParseUriRow(request_uri)) as T( )….` part into the intermediate result table, then group and write into the final result table": I don't quite understand what the intermediate result table means. My understanding is to create an extra middleSinkTable in the database to serve as the intermediate result table. Is that understanding wrong? Could you give a simple example? Thanks, Bin At 2020-06-23 11:57:28, "Leonard Xu" wrote: >Hi, >Yes, this is in

jobmanager restart failed. could not set up jobmanager

2020-06-23 Thread 绘梦飘雪
jobmanager restart failed, throwing an exception: org.apache.flink.runtime.client.jobexecutionexception could not set up jobmanager cannot set up the user code libraries file does not exist /flink/recovery/appid/blob/job*** I cannot find this file in hdfs, but the directory exists in hdfs.

Re: Job started from savepoint: inconsistent state

2020-06-23 Thread Congxian Qiu
It depends on the logic that generates your watermark. In other words: what would the watermark be if the job never failed over, and what is it after the failover? You need to guarantee those two are consistent. There was an issue about including the watermark in checkpoints [1]; if you need that feature, you can comment on the issue. [1] https://issues.apache.org/jira/browse/FLINK-5601

Re: adding s3 object metadata while using StreamFileSink

2020-06-23 Thread Dawid Wysakowicz
Hi, Maybe a bit crazy idea, but you could also try extending the S3 filesystem and add the metadata there. You could write a thin wrapper for the existing filesystem. If you'd like to go that route you might want to check this page[1]. You could use that filesystem with your custom scheme. Best,

?????? ??????savepoint????????????????????

2020-06-23 Thread claylin
… watermark … ---Original message--- From: "Congxian Qiu"

Re: Rocksdb state directory path in EMR

2020-06-23 Thread Dawid Wysakowicz
Hi, If I understand you correctly, you want to check the local RocksDB files, right? They are stored locally on each TaskManager in a temporary directory. This can be configured via "state.backend.rocksdb.localdir"[1]. If not specified it will use the globally defined temporary directory set via
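
Two equivalent ways to pin that directory (the paths are placeholders): in flink-conf.yaml set state.backend.rocksdb.localdir, or configure the backend in code:

    import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class RocksDbLocalDir {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            RocksDBStateBackend backend =
                new RocksDBStateBackend("hdfs:///flink/checkpoints", true); // checkpoints go here
            backend.setDbStoragePath("/mnt/local-ssd/rocksdb"); // local working files live here
            env.setStateBackend(backend);
        }
    }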

Re: Job started from savepoint: inconsistent state

2020-06-23 Thread Congxian Qiu
The watermark is currently not recorded in checkpoints/savepoints, so results may be inconsistent; it depends on whether watermark generation after restoring from the savepoint is exactly the same as before. Best, Congxian claylin <1012539...@qq.com> wrote on Tue, Jun 23, 2020 at 09:35: > 1. The job that produced the savepoint is still running normally; I started another job from the savepoint and compared their outputs, and the results differ > 2. Yes, I have windows, using tumble event time windows > 3.