Re: Flink on Yarn resource arrangement

2019-11-13 Thread vino yang
Hi Alex, Which Flink version are you using? AFAIK, since Flink 1.8+, the config option "-yn" for Flink on YARN job-cluster mode does not take effect (it is always 1 and would be overridden). So the config options "-ys" and "-p" decide the number of TMs. The first example: -p(20)/-ys(3) should be

Re: Flink-JDBC JDBCUpsertTableSink keyFields Problem

2019-11-13 Thread Polarisary
My SQL is a regular insert like "insert into sink_table select c1,c2,c3 from source_table". I want to know in which case it will be judged append-only. Is there a doc for this? Many thanks! > On Nov 14, 2019, at 10:05 AM, 张万新 wrote: > > Yes it's related to your sql, flink checks the plan of your sql to

Flink on Yarn resource arrangement

2019-11-13 Thread qq
Hi all, Could you explain in detail how Flink job resources on YARN are managed? I used the command "-p 20 -yn 5 -ys 3 -yjm 2048m -ytm 2048m" to run a Flink job and got: Containers: 8, VCores: 22, Task Managers: 7, Total Task Slots: 21. I then used the command "-p 20 -yn 7 -ys 4 -yjm 2048m -ytm 2048m" to
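The numbers reported above can be reproduced with a bit of arithmetic. A hedged sketch in plain Java (not the Flink API), under two assumptions: -yn is ignored on Flink 1.8+ so the TM count is derived from -p and -ys, and YARN grants one vcore per slot plus one container/vcore for the JobManager.

```java
// Hedged sketch: reproducing the YARN UI numbers above, assuming -yn is
// ignored (Flink 1.8+) and one vcore per slot plus one JobManager container.
public class YarnResourceMath {
    static int taskManagers(int p, int ys) {
        return (p + ys - 1) / ys;            // ceil(p / ys)
    }
    static int totalSlots(int p, int ys) {
        return taskManagers(p, ys) * ys;     // each TM offers ys slots
    }
    static int containers(int p, int ys) {
        return taskManagers(p, ys) + 1;      // + 1 JobManager container
    }
    public static void main(String[] args) {
        // "-p 20 -ys 3": 7 TMs, 21 slots, 8 containers, matching the UI above
        System.out.println(taskManagers(20, 3)); // 7
        System.out.println(totalSlots(20, 3));   // 21
        System.out.println(containers(20, 3));   // 8
    }
}
```

With 7 TM containers at 3 vcores each plus 1 for the JM, the 22 VCores in the UI also line up under these assumptions.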

Re: Initialization of broadcast state before processing main stream

2019-11-13 Thread vino yang
Hi Vasily, Currently, Flink does not coordinate between a general stream and a broadcast stream; they are both just streams. Your scenario for using broadcast state is a special one. In a more general scenario, the state that needs to be broadcast is an unbounded stream, and the state events may be

Initialization of broadcast state before processing main stream

2019-11-13 Thread Vasily Melnik
Hi all. In our task we have two Kafka topics: one with a fact stream (web traffic) and one with a dimension. We would like to put the dimension data into broadcast state and look the facts up against it. But we see that not all dimension records are put into state before the first fact record is processed, so
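A common workaround for this race is to buffer fact records whose dimension lookup misses and replay them once the matching dimension record arrives. Below is a hedged sketch of that pattern in plain Java (not the actual Flink KeyedBroadcastProcessFunction API; all names are hypothetical):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Sketch of the buffer-until-ready pattern: facts that arrive before their
// dimension record are parked and retried when that dimension key shows up.
public class BufferUntilReady {
    private final Map<String, String> dimension = new HashMap<>(); // "broadcast state"
    private final Queue<String> pendingFacts = new ArrayDeque<>(); // buffered facts
    private final List<String> output = new ArrayList<>();

    // Analogous to processBroadcastElement: store the dimension row, then
    // retry every fact that was waiting.
    public void onDimension(String key, String value) {
        dimension.put(key, value);
        int n = pendingFacts.size();
        for (int i = 0; i < n; i++) onFact(pendingFacts.poll());
    }

    // Analogous to processElement: emit if the lookup succeeds, else buffer.
    public void onFact(String key) {
        String v = dimension.get(key);
        if (v != null) output.add(key + "->" + v);
        else pendingFacts.add(key);
    }

    public List<String> results() { return output; }
}
```

In a real Flink job the buffer would live in keyed state (e.g. ListState) inside a KeyedBroadcastProcessFunction so it survives failures.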

Re: How to recover earlier data after a streaming job fails

2019-11-13 Thread Dian Fu
If you are using event time, the watermark is computed from the events and has nothing to do with system time, so restoring from the last checkpoint is enough. Why do you think there is a problem? > On Nov 13, 2019, at 8:29 PM, 柯桂强 wrote: > > I have a streaming job that failed, and I kept a checkpoint/savepoint. I want to restore from the last checkpoint, but the job uses event time, so data falling outside its window would be dropped. One idea is to reprocess the earlier data with a batch job and then run the streaming job. I'd like to ask whether this plan is feasible, but it seems hard to bound the range of the batch job and stitch it seamlessly with the subsequent streaming job.

Re: Flink (Local) Environment Thread Leaks?

2019-11-13 Thread tison
It is because MiniCluster starts a SystemResourcesCounter for gathering metrics but has no logic to shut it down. Thus on cluster exit the thread leaks. Best, tison. On Thu, Nov 14, 2019 at 10:21 AM, tison wrote: > We found this issue previously. > > In our case, where the leaked thread comes from is tracked as >

Re: Flink (Local) Environment Thread Leaks?

2019-11-13 Thread tison
We found this issue previously. In our case, where the leaked thread comes from is tracked as https://issues.apache.org/jira/browse/FLINK-14565 Best, tison. On Thu, Nov 14, 2019 at 10:15 AM, vino yang wrote: > Hi Theo, > > If you think there is a thread leakage problem, you can create a JIRA > issue and write a

Re: Flink (Local) Environment Thread Leaks?

2019-11-13 Thread vino yang
Hi Theo, If you think there is a thread leakage problem, you can create a JIRA issue and write a detailed description. Ping @Gary Yao and @Zhu Zhu to help locate and analyze this problem? Best, Vino. On Thu, Nov 14, 2019 at 3:16 AM, Theo Diefenthal wrote: > I included a Solr End2End test in my

Re: Deleting Cassandra records from Flink

2019-11-13 Thread 163
Hello, we customized a Cassandra sink using the DataStax Cassandra Driver to implement the delete operation. > On Nov 13, 2019, at 3:54 PM, 陈程程 wrote: > > Hi all, > We can insert rows into Cassandra through CassandraSink, but how do we delete a row? > I tried to write a custom sink but got a NotSerializableException. How exactly do we perform row deletion? > Thanks, 程程

Re: Flink-JDBC JDBCUpsertTableSink keyFields Problem

2019-11-13 Thread 张万新
Yes, it's related to your SQL: Flink checks the plan of your SQL to judge whether your job is append-only or has updates. If your job is append-only, that means no results need to be updated. If you still have problems, please post your SQL and the complete error message to help people understand your
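Flink's actual decision comes from inspecting the optimized plan of the query. As a toy illustration only (NOT Flink's planner logic; the heuristic and names here are hypothetical), the intuition is that plain projections and filters keep a stream append-only, while an unbounded aggregation must update previously emitted results:

```java
// Toy illustration ONLY -- not Flink's planner. It just mirrors the
// intuition behind the reply above: projections/filters stay append-only,
// unbounded GROUP BY / joins emit updates for earlier results.
public class AppendOnlyIntuition {
    static boolean isAppendOnly(String sql) {
        String s = sql.toLowerCase();
        // hypothetical heuristic standing in for real plan analysis
        return !s.contains("group by") && !s.contains("join");
    }
    public static void main(String[] args) {
        System.out.println(isAppendOnly(
            "insert into sink_table select c1,c2,c3 from source_table"));        // true
        System.out.println(isAppendOnly(
            "insert into s select c1, count(*) from source_table group by c1")); // false
    }
}
```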

Flink (Local) Environment Thread Leaks?

2019-11-13 Thread Theo Diefenthal
I included a Solr End2End test in my project, inheriting from Junit 4 SolrCloudTestCase. The solr-test-framework for junit 4 makes use of com.carrotsearch.randomizedtesting which automatically tests for thread leakages on test end. In my other projects, that tool doesn't produce any

Re: Flink 1.7.2: how to use custom state in a Table API user-defined aggregate function

2019-11-13 Thread Yuan,Youjun
This scenario should be computable with standard SQL. The rough idea: 1) the inner query computes each device's max temperature per minute, max(temp) as max_temperature, with a tumble window; 2) the outer layer uses a row-over window to get the current minute's max_temperature together with the sum of the adjacent minutes' max temperatures, i.e. SUM(max_temperature) AS sum_temperature; 3) the outermost layer simply selects 2 * max_temperature - sum_temperature, which is the difference between the max temperatures of the two adjacent minutes that you need. Suppose the input message has three fields: Ts: timestamp; Deviceid: device id
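The three layers above can be sanity-checked in plain Java (a sketch of the arithmetic only, not Flink SQL). Assuming sum_temperature covers the current and previous minute, 2*max - sum reduces exactly to the minute-over-minute difference:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Plain-Java sketch of the SQL layering above: (1) max temperature per
// minute (tumble window), (2) sum of the current and previous minute's
// maxes (over window), (3) 2*max - sum, i.e. the minute-over-minute diff.
public class MaxTempDiff {
    static List<Double> diffs(long[] tsMinutes, double[] temps) {
        TreeMap<Long, Double> maxPerMinute = new TreeMap<>();
        for (int i = 0; i < tsMinutes.length; i++) {      // step 1
            maxPerMinute.merge(tsMinutes[i], temps[i], Math::max);
        }
        List<Double> out = new ArrayList<>();
        Double prev = null;
        for (double max : maxPerMinute.values()) {
            if (prev != null) {
                double sum = max + prev;                  // step 2
                out.add(2 * max - sum);                   // step 3: equals max - prev
            }
            prev = max;
        }
        return out;
    }
    public static void main(String[] args) {
        // minute 0: max 21.0; minute 1: max 25.0 -> difference 4.0
        System.out.println(diffs(new long[]{0, 0, 1, 1},
                                 new double[]{21.0, 19.5, 25.0, 24.0}));
    }
}
```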

Re: Running flink example programs-WordCount

2019-11-13 Thread RAMALINGESWARA RAO THOTTEMPUDI
Respected Sir, This is my code in Flink Gelly: for (Vertex vertex : node_set) { System.out.println(vertex); graph_copy.removeVertex(vertex); temp.add(vertex); } For some reason .removeVertex is not working even when given correct parameters. Please help. Given a list of edges, remove
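One possible explanation for removeVertex appearing to do nothing (an assumption, since the full code above is truncated, and worth checking against the Gelly javadocs): Gelly graphs are immutable, so removeVertex returns a new Graph that must be reassigned. The same pattern on a plain Set, sketched in Java, shows why discarding the return value changes nothing:

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch of the immutable-removal pattern: removeVertex returns a NEW
// instance instead of mutating the receiver, so the result must be kept.
public class ImmutableRemoval {
    final Set<String> vertices;
    ImmutableRemoval(Set<String> vertices) { this.vertices = vertices; }

    ImmutableRemoval removeVertex(String v) {    // returns a new "graph"
        Set<String> copy = new HashSet<>(vertices);
        copy.remove(v);
        return new ImmutableRemoval(copy);
    }

    public static void main(String[] args) {
        ImmutableRemoval g = new ImmutableRemoval(new HashSet<>(List.of("a", "b", "c")));
        for (String v : List.of("a", "b")) {
            g.removeVertex(v);                   // bug: result discarded, g unchanged
        }
        System.out.println(g.vertices.size());   // 3
        for (String v : List.of("a", "b")) {
            g = g.removeVertex(v);               // fix: reassign the returned graph
        }
        System.out.println(g.vertices.size());   // 1
    }
}
```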

How to recover earlier data after a streaming job fails

2019-11-13 Thread 柯桂强
I have a streaming job that failed, and I kept a checkpoint/savepoint. I want to restore from the last checkpoint, but the job uses event time, so data falling outside its window would be dropped. One idea is to reprocess the earlier data with a batch job and then run the streaming job. I'd like to ask whether this plan is feasible, but it seems hard to bound the range of the batch job and stitch it seamlessly with the subsequent streaming job.

Re: Flink 1.7.2: how to use custom state in a Table API user-defined aggregate function

2019-11-13 Thread Dian Fu
1) In the Table API &amp; SQL, RuntimeContext is not exposed to users, which is why it is private. 2) For the difference of aggregated values between windows, check whether CEP can meet the need; see the docs: https://ci.apache.org/projects/flink/flink-docs-master/dev/table/streaming/match_recognize.html > On

Flink-JDBC JDBCUpsertTableSink keyFields Problem

2019-11-13 Thread Polarisary
Hi, When I use the flink-jdbc JDBCUpsertTableSink to sink to MySQL, isAppendOnly is modified to true and keyFields is modified to null by StreamExecSink, but I want to upsert. Is this related to the SQL? The stack is as follows: at