Re: Remotely submitting code to a Flink cluster

2019-03-28 Thread Lifei Chen
There is a handy little Go CLI that supports deploying a jar directly to the Flink manager: https://github.com/ing-bank/flink-deployer Hope it helps! Kaibo Zhou wrote on Fri, Mar 29, 2019 at 11:08 AM: > You can use the RESTful API provided by Flink: upload the jar and then run it. > > See: > > https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/rest_api.html#jars-upload > and
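
Both suggestions boil down to driving the JobManager's REST endpoints. Below is a minimal sketch of triggering a run through /jars/<jarid>/run with Java 11's HttpClient, assuming the jar was already uploaded via /jars/upload (which returns the jar id); the host, port, jar id, and entry class are placeholders.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class RestJarRunner {
    public static void main(String[] args) throws Exception {
        // Placeholders: adjust the JobManager address, the jar id returned by
        // /jars/upload, and the entry class of your job.
        String jobManager = "http://localhost:8081";
        String jarId = "d9b7c3f0-example_job.jar";
        String body = "{\"entryClass\":\"com.example.MyJob\",\"parallelism\":2}";

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(jobManager + "/jars/" + jarId + "/run"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        // On success the response body contains the new job id, e.g. {"jobid":"..."}.
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```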

Does Flink support configuration via environment variables?

2019-03-28 Thread Lifei Chen
Hi guys, I am using Flink 1.7.2 deployed on Kubernetes, and I want to change Flink's configuration, for example to customize `taskmanager.heap.size`. Does Flink support using environment variables to override settings in `conf/flink-conf.yaml`?
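
Flink 1.7 itself does not read its configuration from environment variables, so a common workaround is a small entrypoint step that copies selected variables into flink-conf.yaml before the JobManager/TaskManager starts. The sketch below illustrates that idea only; the variable names, the FLINK_CONF_PATH fallback, and the key mapping are assumptions for this example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.Map;

public class EnvToFlinkConf {
    public static void main(String[] args) throws IOException {
        Path conf = Paths.get(System.getenv().getOrDefault(
                "FLINK_CONF_PATH", "/opt/flink/conf/flink-conf.yaml"));

        // Env var -> Flink configuration key (extend as needed).
        Map<String, String> mapping = Map.of(
                "TASKMANAGER_HEAP_SIZE", "taskmanager.heap.size",
                "JOBMANAGER_HEAP_SIZE", "jobmanager.heap.size");

        StringBuilder overrides = new StringBuilder();
        mapping.forEach((env, key) -> {
            String value = System.getenv(env);
            if (value != null) {
                overrides.append(key).append(": ").append(value)
                         .append(System.lineSeparator());
            }
        });

        // Append the overrides at the end of the file; verify that later
        // duplicate keys win in your Flink version before relying on this.
        Files.write(conf, overrides.toString().getBytes(), StandardOpenOption.APPEND);
    }
}
```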

Re: Remotely submitting code to a Flink cluster

2019-03-28 Thread Kaibo Zhou
You can use the RESTful API provided by Flink: upload the jar and then run it. See: https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/rest_api.html#jars-upload and https://files.alicdn.com/tpsservice/a8d224d6a3b8b82d03aa84e370c008cc.pdf for details. 文报 <1010467...@qq.com> wrote on Thu, Mar 28, 2019 at 9:06 PM: > Hi all! > > >

A question about Blink resource allocation

2019-03-28 Thread 邓成刚【qq】
A question about Blink resource allocation: with the job parallelism set to 20, it reports that 0 slots are fulfilled: Batch request 40 slots, but only 0 are fulfilled. Lowering the parallelism to 3 gives: Batch request 6 slots, but only 4 are fulfilled. But I have 48 TASK SLOTS configured and no other jobs running, so in theory there should be no resource shortage. Cluster configuration (everything else is the default): taskmanager.numberOfTaskSlots: 24 jobmanager.heap.size: 20480m # The heap size

Re: How to enable caching for dimension table joins in the open-source Blink

2019-03-28 Thread Kurt Young
Hi, the cache implementation was temporarily removed when Blink was open-sourced; you can implement a cache yourself according to your needs. Best, Kurt On Wed, Mar 27, 2019 at 4:44 PM 苏 欣 wrote: > I saw this in the slides, but I could not find the corresponding configuration in the open-source Blink. Could you please advise how to enable the caching strategy? > > > > Sent from the Mail app for Windows 10 > > >
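
A minimal sketch of what such a hand-rolled cache could look like: a size-bounded LRU map that a lookup/join function keeps per parallel instance. The class name, capacity, and the idea of wrapping it in your own RichFunction are illustrative, not part of Blink's API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Small LRU cache to put in front of a dimension-table lookup.
// Instantiate it in the open() method of your lookup function and
// consult it before querying the external store.
public class LruLookupCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    public LruLookupCache(int capacity) {
        // accessOrder = true makes the map evict by least-recent access.
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }
}
```

A TTL can be added by storing the insertion timestamp alongside the value and discarding entries that are too old on read.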

Re: AskTimeoutException - Cannot deploy task

2019-03-28 Thread Avi Levi
Hi All, Following my previous mail below, I see the exception below. I would really appreciate any help here. Attached are the log files. Looking at the logs we see this exception all around: 2019-03-28 23:51:58,460 WARN org.apache.kafka.common.network.Selector - Unexpected error from

Re: TUMBLE function with MONTH/YEAR intervals produce milliseconds window unexpectedly

2019-03-28 Thread Vinod Mehra
btw the max DAY window that is allowed is 99 days. After that it blows up here: https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/sql/SqlIntervalQualifier.java#L371 "SQL validation failed. From line 12, column 19 to line 12, column 36: Interval field value 100

Re: TUMBLE function with MONTH/YEAR intervals produce milliseconds window unexpectedly

2019-03-28 Thread Vinod Mehra
Dawid, After the above change my SQL (that uses TUMBLE(rowtime, INTERVAL '1' MONTH)) fails with an error now: *(testing with org.apache.flink:flink-table_2.11:jar:1.7.1:compile now)* org.apache.flink.table.api.TableException: *Only constant window intervals with millisecond resolution are

Re: Schema Evolution on Dynamic Schema

2019-03-28 Thread Shahar Cizer Kobrinsky
Hmm, kinda stuck here. It seems like SQL GROUP BY is translated to a *GroupAggProcessFunction* which stores state for every aggregation element (thus flattening the map items for the state store). Seems like there's no way around it. Am I wrong? Is there any way to evolve the map elements when doing

AskTimeoutException - Cannot deploy task

2019-03-28 Thread Avi Levi
Hi, I see the following exceptions and would really appreciate any help with them. Thanks, Avi. This is the first one (out of three): java.lang.Exception: Cannot deploy task KeyedProcess -> Sink: Unnamed (3/100) (2c9646634afe1488659da404e92697b0) - TaskManager

Re: Calcite SQL Map to Pojo Map

2019-03-28 Thread shkob1
Apparently the solution is to force map creation using a UDF and to have the UDF return Types.GENERIC(Map.class). That makes them compatible and both treated as GenericType. Thanks! -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
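
A minimal sketch of such a UDF under the old Table API, assuming a scalar function registered with the TableEnvironment; the class name and the alternating key/value convention are illustrative.

```java
import java.util.HashMap;
import java.util.Map;

import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.table.functions.ScalarFunction;

// UDF that builds a Map and declares its result type as a GenericType,
// matching a POJO field typed as java.util.Map.
public class ToMap extends ScalarFunction {

    // Builds a map from alternating key/value arguments, e.g. toMap('a', 'b').
    public Map<String, String> eval(String... kv) {
        Map<String, String> result = new HashMap<>();
        for (int i = 0; i + 1 < kv.length; i += 2) {
            result.put(kv[i], kv[i + 1]);
        }
        return result;
    }

    @Override
    public TypeInformation<?> getResultType(Class<?>[] signature) {
        // Treat the result as a GenericType, like the Map field of the target POJO.
        return Types.GENERIC(Map.class);
    }
}
```

Register it with tableEnv.registerFunction("toMap", new ToMap()) and call toMap(...) in the query instead of the map[...] constructor.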

Re: TUMBLE function with MONTH/YEAR intervals produce milliseconds window unexpectedly

2019-03-28 Thread Vinod Mehra
Thanks Dawid! Can you please point me to a jira which tracked the fix? Thanks! Vinod On Thu, Mar 28, 2019 at 12:46 PM Dawid Wysakowicz wrote: > It should be fixed since version 1.6.3. > Best, > Dawid > > > [1] >

Re: TUMBLE function with MONTH/YEAR intervals produce milliseconds window unexpectedly

2019-03-28 Thread Vinod Mehra
Doh! Sorry about that! :) Thanks again! On Thu, Mar 28, 2019 at 12:49 PM Dawid Wysakowicz wrote: > I did ;) but here is the link one more time: > https://issues.apache.org/jira/browse/FLINK-11017?jql=project%20%3D%20FLINK%20AND%20text%20~%20Month > > On Thu, 28 Mar 2019, 20:48 Vinod Mehra,

Re: TUMBLE function with MONTH/YEAR intervals produce milliseconds window unexpectedly

2019-03-28 Thread Dawid Wysakowicz
I did ;) but here is the link one more time: https://issues.apache.org/jira/browse/FLINK-11017?jql=project%20%3D%20FLINK%20AND%20text%20~%20Month On Thu, 28 Mar 2019, 20:48 Vinod Mehra, wrote: > Thanks Dawid! Can you please point me to a jira which tracked the fix? > > Thanks! > Vinod > > On

Re: TUMBLE function with MONTH/YEAR intervals produce milliseconds window unexpectedly

2019-03-28 Thread Dawid Wysakowicz
It should be fixed since version 1.6.3. Best, Dawid [1] https://issues.apache.org/jira/browse/FLINK-11017?jql=project%20%3D%20FLINK%20AND%20text%20~%20Month On Thu, 28 Mar 2019, 19:32 Vinod Mehra, wrote: > Hi All! > > We are using: org.apache.flink:flink-table_2.11:jar:1.4.2:compile > >

Support for custom triggers in Table / SQL

2019-03-28 Thread Piyush Narang
Hi folks, I’m trying to write a Flink job that computes a bunch of counters which requires custom triggers and I was trying to figure out the best way to express that. The query looks something like this: SELECT userId, customUDAF(...) AS counter1, customUDAF(...) AS counter2, ... FROM (
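
Custom triggers are not exposed through SQL, so one fallback (sketched below under the assumption that the counters can also be computed in the DataStream API) is to key the stream and use a window with an early-firing trigger; the tuple schema, window size, and firing interval are placeholders.

```java
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.streaming.api.windowing.triggers.ContinuousProcessingTimeTrigger;

public class EarlyFiringCounters {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Placeholder source of (userId, amount) pairs.
        DataStream<Tuple2<Long, Long>> events =
                env.fromElements(Tuple2.of(1L, 10L), Tuple2.of(1L, 5L), Tuple2.of(2L, 7L));

        events.keyBy(0)
              .window(TumblingProcessingTimeWindows.of(Time.hours(1)))
              // Emit partial counters every 10 seconds instead of waiting
              // for the window to close.
              .trigger(ContinuousProcessingTimeTrigger.of(Time.seconds(10)))
              .sum(1)
              .print();

        env.execute("early-firing counters");
    }
}
```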

Re: Throttling/effective back-pressure on a Kafka sink

2019-03-28 Thread Konstantin Knauf
Hi Marc, the Kafka Producer should be able to create backpressure. Could you try to increase max.block.ms to Long.MAX_VALUE? The exceptions you shared for the failure case don't look like the root causes of the problem. Could you share the full stacktraces or even the full logs for this time frame?
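
A minimal sketch of passing that producer property to the Kafka sink, assuming the 0.11 connector; swap in the producer class, broker list, and topic you actually use.

```java
import java.util.Properties;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer011;

public class BlockingKafkaSink {
    public static FlinkKafkaProducer011<String> create() {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "kafka:9092"); // placeholder broker list
        // Let send() block instead of failing with a timeout when the producer
        // buffer is full, so backpressure propagates upstream.
        props.setProperty("max.block.ms", String.valueOf(Long.MAX_VALUE));

        return new FlinkKafkaProducer011<>("output-topic", new SimpleStringSchema(), props);
    }
}
```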

TUMBLE function with MONTH/YEAR intervals produce milliseconds window unexpectedly

2019-03-28 Thread Vinod Mehra
Hi All! We are using: org.apache.flink:flink-table_2.11:jar:1.4.2:compile SELECT COALESCE(user_id, -1) AS user_id, count(id) AS count_per_window, sum(amount) AS charge_amount_per_window, TUMBLE_START(rowtime, INTERVAL '2' YEAR) AS twindow_start, TUMBLE_END(rowtime, INTERVAL '2' YEAR)
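
Since the window operators only accept constant intervals with millisecond resolution (see the replies above), a common workaround is to express the tumble in DAY units, which must stay below 100 per the validation error quoted above. A hedged sketch of the same aggregation with a 30-day window; the registered table name 'charges' is a placeholder.

```java
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.java.StreamTableEnvironment;

public class DayTumbleExample {
    public static Table aggregate(StreamTableEnvironment tableEnv) {
        // DAY/HOUR/MINUTE/SECOND intervals have millisecond resolution and are
        // accepted by the window operators.
        return tableEnv.sqlQuery(
            "SELECT COALESCE(user_id, -1) AS user_id, " +
            "       COUNT(id) AS count_per_window, " +
            "       SUM(amount) AS charge_amount_per_window, " +
            "       TUMBLE_START(rowtime, INTERVAL '30' DAY) AS twindow_start, " +
            "       TUMBLE_END(rowtime, INTERVAL '30' DAY) AS twindow_end " +
            "FROM charges " + // 'charges' is a placeholder table name
            "GROUP BY COALESCE(user_id, -1), TUMBLE(rowtime, INTERVAL '30' DAY)");
    }
}
```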

Re: Calcite SQL Map to Pojo Map

2019-03-28 Thread Shahar Cizer Kobrinsky
Based on this discussion http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/HashMap-HashSet-Serialization-Issue-td10899.html it seems to be by design that HashMap/Map are handled as GenericTypes. However, that doesn't work with the query result table schema, which generates a Map type.

Re: Calcite SQL Map to Pojo Map

2019-03-28 Thread Shahar Cizer Kobrinsky
Hey Rong, I don't think this is about a UDF; I reproduce the same exception with a simple map['a','b'] where the POJO has a Map property. Btw, for the UDF I'm already doing it (clazz is based on the specific map I'm creating): @Override public TypeInformation getResultType(Class[] signature) {

Re: Calcite SQL Map to Pojo Map

2019-03-28 Thread Rong Rong
If your conversion is done using a UDF, you need to override the getResultType method [1] to explicitly specify the key and value type information, as generic type erasure will not preserve that part of your code. Thanks, Rong [1]

IllegalArgumentException when trying to execute job

2019-03-28 Thread Papadopoulos, Konstantinos
Hi all, I am trying to execute a batch job that gets a list of IDs and performs a loop with a number of steps during each iteration, including reading from a MS SQL Server DB. Sample pseudo-code of our implementation is the following: List ids = ... ids.foreach( id ->
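
For the per-id read, a minimal sketch using flink-jdbc's JDBCInputFormat; the driver class, URL, query, and row schema below are placeholders, and the exact failure in the report may of course lie elsewhere.

```java
import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.io.jdbc.JDBCInputFormat;
import org.apache.flink.api.java.typeutils.RowTypeInfo;
import org.apache.flink.types.Row;

public class PerIdJdbcRead {
    // Reads the rows for a single id; the RowTypeInfo must match the selected columns.
    public static DataSet<Row> readForId(ExecutionEnvironment env, long id) {
        RowTypeInfo rowType = new RowTypeInfo(
                BasicTypeInfo.LONG_TYPE_INFO, BasicTypeInfo.STRING_TYPE_INFO);

        JDBCInputFormat input = JDBCInputFormat.buildJDBCInputFormat()
                .setDrivername("com.microsoft.sqlserver.jdbc.SQLServerDriver")
                .setDBUrl("jdbc:sqlserver://dbhost:1433;databaseName=mydb")
                .setQuery("SELECT id, name FROM my_table WHERE id = " + id)
                .setRowTypeInfo(rowType)
                .finish();

        return env.createInput(input);
    }
}
```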

Does anyone have a ready-made Blink Docker image? Could you share its coordinates or Dockerfile? Thanks

2019-03-28 Thread 陈韬
Does anyone have a ready-made Blink Docker image? Could you share its coordinates or Dockerfile? Thanks

Remotely submitting code to a Flink cluster

2019-03-28 Thread 文报
Hi all! How can I submit code (a jar) to a Flink cluster remotely, other than uploading the jar through the Flink web UI?

Throttling/effective back-pressure on a Kafka sink

2019-03-28 Thread Marc Rooding
Hi We’ve got a job producing to a Kafka sink. The Kafka topics have a retention of 2 weeks. When doing a complete replay, it seems like Flink isn’t able to back-pressure or throttle the amount of messages going to Kafka, causing the following error:

Re: RemoteTransportException: Connection unexpectedly closed by remote task manager

2019-03-28 Thread yinhua.dai
I have put the task manager of the data sink log to https://gist.github.com/yinhua2018/7de42ff9c1738d5fdf9d99030db903e2 -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: RemoteTransportException: Connection unexpectedly closed by remote task manager

2019-03-28 Thread yinhua.dai
Hi Qi, I checked and the JVM heap of the sink TM is low. I tried to read the Flink source code to identify exactly where the error happens. I think the exception happened inside DataSinkTask.invoke() // work! while (!this.taskCanceled &&

Re: RemoteTransportException: Connection unexpectedly closed by remote task manager

2019-03-28 Thread qi luo
Hi Yinhua, This looks like the TM executing the sink is down, maybe due to OOM or some other error. You can check the JVM heap and GC log to see if there’re any clues. Regards, Qi > On Mar 28, 2019, at 7:23 PM, yinhua.dai wrote: > > Hi, > > I write a single flink job with flink SQL with

Re: questions regarding offset

2019-03-28 Thread Avi Levi
Thanks for answering. Please see my comments below. On Thu, Mar 28, 2019 at 12:32 PM Dawid Wysakowicz wrote: > Hi Avi, > > Yes, you are right. Kafka offsets are kept in state. > > Ad. 1 If you try to restore a state in a completely different > environment, and offsets are no longer compatible it

Re: Flink Job monitoring

2019-03-28 Thread Biao Liu
Hi, you can take a look at the RESTful API: https://ci.apache.org/projects/flink/flink-docs-release-1.7/monitoring/rest_api.html cheng wrote on Thu, Mar 28, 2019 at 5:08 PM: > We currently deploy the cluster in standalone mode. Does the job state expose a metric for whether a job has failed or restarted? I could not seem to find one in the official docs. > > > On Mar 28, 2019, at 4:51 PM, 浪人 <1543332...@qq.com> wrote: > > > >
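
A minimal sketch of acting on that advice from an external monitor: poll /jobs/overview and alert when no job reports the RUNNING state. The JobManager address is a placeholder and the string check is deliberately crude; a real probe would parse the JSON.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class JobStatusProbe {
    public static void main(String[] args) throws Exception {
        // /jobs/overview lists every job with its state (RUNNING, FAILED,
        // RESTARTING, ...), which an alerting script can act on.
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://jobmanager:8081/jobs/overview"))
                .GET()
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());

        // Crude check for the demo: alert if no job reports the RUNNING state.
        if (!response.body().contains("\"state\":\"RUNNING\"")) {
            System.err.println("No RUNNING job found: " + response.body());
        }
    }
}
```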

RemoteTransportException: Connection unexpectedly closed by remote task manager

2019-03-28 Thread yinhua.dai
Hi, I wrote a single Flink job with Flink SQL on version 1.6.1. I have one table source which reads data from a database, and one table sink which outputs an Avro-format file. The table source has a parallelism of 19, and the table sink only has a parallelism of 1. But there is always

Re: What are savepoint state manipulation support plans

2019-03-28 Thread Ufuk Celebi
Thanks Gordon. We already have 5 people watching it. :-) On Thu, Mar 28, 2019 at 10:23 AM Tzu-Li (Gordon) Tai wrote: > > @Ufuk > > Yes, creating a JIRA now already to track this makes sense. > > I've proceeded to open one: https://issues.apache.org/jira/browse/FLINK-12047 > Let's move any

Re: questions regarding offset

2019-03-28 Thread Dawid Wysakowicz
Hi Avi, Yes, you are right. Kafka offsets are kept in state. Ad. 1 If you try to restore a state in a completely different environment, and offsets are no longer compatible it will most probably fail as it won't be able to derive up to which point we already processed the records. Ad.2 What do
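
A minimal sketch of how those two offset mechanisms are usually wired up on the consumer, assuming the 0.11 connector; the broker, group id, and topic are placeholders.

```java
import java.util.Properties;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer011;

public class OffsetAwareSource {
    public static FlinkKafkaConsumer011<String> create() {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "kafka:9092"); // placeholder
        props.setProperty("group.id", "my-consumer-group");   // placeholder

        FlinkKafkaConsumer011<String> consumer =
                new FlinkKafkaConsumer011<>("input-topic", new SimpleStringSchema(), props);

        // Used only when the job starts WITHOUT Flink state; on restore from a
        // checkpoint/savepoint the offsets stored in state take precedence.
        consumer.setStartFromGroupOffsets();

        // Also commit offsets back to Kafka on checkpoints, so external tools
        // can observe progress even though Flink does not rely on them.
        consumer.setCommitOffsetsOnCheckpoints(true);
        return consumer;
    }
}
```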

Re: What are savepoint state manipulation support plans

2019-03-28 Thread Tzu-Li (Gordon) Tai
@Ufuk Yes, creating a JIRA now already to track this makes sense. I've proceeded to open one: https://issues.apache.org/jira/browse/FLINK-12047 Let's move any further discussions there. Cheers, Gordon On Thu, Mar 28, 2019 at 5:01 PM Ufuk Celebi wrote: > I think such a tool would be really

Re: What are savepoint state manipulation support plans

2019-03-28 Thread Vishal Santoshi
+1 On Thu, Mar 28, 2019, 5:01 AM Ufuk Celebi wrote: > I think such a tool would be really valuable to users. > > @Gordon: What do you think about creating an umbrella ticket for this > and linking it in this thread? That way, it's easier to follow this > effort. You could also link Bravo and

Re: RocksDBStatebackend does not write checkpoints to backup path

2019-03-28 Thread Paul Lam
Hi Gordon, Thanks for your reply. I’ve found out that it should be a bug in RocksDBStateBackend [1]. [1] https://issues.apache.org/jira/browse/FLINK-12042 Best, Paul Lam > On Mar 28, 2019, at 17:03, Tzu-Li (Gordon) Tai wrote: > > Hi, > > Do you

Re: Flink Job monitoring

2019-03-28 Thread cheng
We currently deploy the cluster in standalone mode. Does the job state expose a metric for whether a job has failed or restarted? I could not seem to find one in the official docs. > On Mar 28, 2019, at 4:51 PM, 浪人 <1543332...@qq.com> wrote: > > If you run a Flink standalone cluster you can monitor Flink's job state; if it runs on YARN in detached mode you can monitor the YARN application state. > > > > > -- Original message -- > From: "cheng"; > Sent: Thursday, March 28, 2019, 4:38 PM >

Re: RocksDBStatebackend does not write checkpoints to backup path

2019-03-28 Thread Tzu-Li (Gordon) Tai
Hi, Do you have the full error message of the failure? A wild guess to begin with: have you made sure that there are sufficient permissions to create the directory? Best, Gordon On Tue, Mar 26, 2019 at 5:46 PM Paul Lam wrote: > Hi, > > I have a job (with Flink 1.6.4) which uses rocksdb
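
For reference, a minimal sketch of how the RocksDB backend and its paths are typically wired up, which is where permission problems on the checkpoint URI or the local storage path would surface; both paths below are placeholders.

```java
import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class RocksDbBackendSetup {
    public static void configure(StreamExecutionEnvironment env) throws Exception {
        // The user running the TaskManagers needs write permission on this
        // checkpoint path and on the local RocksDB directories.
        RocksDBStateBackend backend =
                new RocksDBStateBackend("hdfs:///flink/checkpoints", true);

        // Optional: pin the local working directories explicitly.
        backend.setDbStoragePath("/data/flink/rocksdb");

        env.setStateBackend(backend);
        env.enableCheckpointing(60_000); // checkpoint every minute
    }
}
```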

A question about parallelism when Blink consumes Kafka

2019-03-28 Thread 邓成刚【qq】
A question: when Blink consumes Kafka data with the parallelism set to 30, a timeout occurs and the job does not run; it seems no data is consumed. Lowering the parallelism to 5 works fine. The job uses 4 topics with 30 partitions each. Submitting the same job to Flink with a parallelism of 30 runs fine. Does anyone know what is going on?

Re: What are savepoint state manipulation support plans

2019-03-28 Thread Ufuk Celebi
I think such a tool would be really valuable to users. @Gordon: What do you think about creating an umbrella ticket for this and linking it in this thread? That way, it's easier to follow this effort. You could also link Bravo and Seth's tool in the ticket as starting points. – Ufuk

Re: RocksDB local snapshot sliently disappears and cause checkpoint to fail

2019-03-28 Thread Yu Li
Ok, much clearer now. Thanks. Best Regards, Yu On Thu, 28 Mar 2019 at 15:59, Paul Lam wrote: > Hi Yu, > > I’ve set `fs.default-scheme` to hdfs, and it's mainly used for simplifying > checkpoint / savepoint / HA paths. > > And I leave the rocksdb local dir empty, so the local snapshot still

Re: Flink Job monitoring

2019-03-28 Thread 浪人
If you run a Flink standalone cluster you can monitor Flink's job state; if it runs on YARN in detached mode you can monitor the YARN application state. -- Original message -- From: "cheng"; Sent: Thursday, March 28, 2019, 4:38 PM To: "user-zh"; Subject: Flink Job monitoring

Flink Job monitoring

2019-03-28 Thread cheng
Hi all! When Flink jobs run in production, what solutions do people usually use for monitoring and alerting on the job's running state? For example, monitoring whether a job is running normally and raising an alert when it has failed or restarted. I push some metrics to Prometheus, but I have not found a metric that indicates whether a job is down. I would appreciate advice from anyone who has built such a solution. Thanks!

Re: What are savepoint state manipulation support plans

2019-03-28 Thread Tzu-Li (Gordon) Tai
Hi! Regarding the support for savepoint reading / writing / processing directly in core Flink, we've been thinking about that lately and might push a bit for adding the functionality to Flink in the next release. For example, beside Bravo, Seth (CC'ed) also had implemented something [1] for this.

Re: What are savepoint state manipulation support plans

2019-03-28 Thread Gyula Fóra
Hi! I dont think there is any ongoing effort in core Flink other than this library we created. You are probably right that it is pretty hacky at the moment. I would say this one way we could do it that seemed convenient to me at the time I have written the code. If you have ideas how to

Re: Strange behavior when Blink consumes Kafka; it has puzzled me for a long time, does anyone know what is going on

2019-03-28 Thread 邓成刚【qq】
Testing shows it is not a problem with the SQL script but with the parallelism: a parallelism of 30 does not work, while changing it to 5 is fine... env.setParallelism(5); From: 邓成刚【qq】 Sent: 2019-03-26 18:17 To: user-zh Subject: Strange behavior when Blink consumes Kafka; it has puzzled me for a long time, does anyone know what is going on Hi all: I found a very strange problem: with the SQL API, a GROUP BY on a window makes the job time out after 5 minutes, but changing the query to SELECT * consumes Kafka normally... Note: the issue occurs both in local mode and when submitting the job. Related info: blink 1.5.1

Re: RocksDB local snapshot sliently disappears and cause checkpoint to fail

2019-03-28 Thread Paul Lam
Hi Yu, I’ve set `fs.default-scheme` to hdfs, and it's mainly used for simplifying checkpoint / savepoint / HA paths. And I leave the rocksdb local dir empty, so the local snapshot still goes to YARN local cache dirs. Hope that answers your question. Best, Paul Lam > On Mar 28, 2019, at 15:34, Yu Li

Re: RocksDB local snapshot sliently disappears and cause checkpoint to fail

2019-03-28 Thread Yu Li
Hi Paul, Regarding "mistakenly uses the default filesystem scheme, which is specified to hdfs in the new cluster in my case", could you further clarify the configuration property and value you're using? Do you mean you're using an HDFS directory to store the local snapshot data? Thanks. Best

Re: RocksDB local snapshot sliently disappears and cause checkpoint to fail

2019-03-28 Thread Paul Lam
Hi, It turns out that under certain circumstances rocksdb statebackend mistakenly uses the default filesystem scheme, which is specified to hdfs in the new cluster in my case. I’ve filed a Jira to track this[1]. [1] https://issues.apache.org/jira/browse/FLINK-12042