Re: Parsing Kafka data with Flink SQL

2022-07-04 Thread 林影
Hi, the option 'json.infer-schema.flatten-nested-columns.enable'='true' is not a feature of community Flink; it is only supported by Alibaba Cloud's VVR Flink engine. On Tue, Jul 5, 2022 at 11:33, JasonLee <17610775...@163.com> wrote: > Hi > For parsing nested JSON you can refer to this article: https://mp.weixin.qq.com/s/KHVUlOsLSHPzCprRWSJYcA > > Best > JasonLee > > Original message > | From |

Re: Parsing Kafka data with Flink SQL

2022-07-04 Thread JasonLee
Hi For parsing nested JSON you can refer to this article: https://mp.weixin.qq.com/s/KHVUlOsLSHPzCprRWSJYcA Best JasonLee Original message | From | 小昌同学 | | Date | 2022-06-30 15:02 | | To | user-zh@flink.apache.org | | Subject | Parsing Kafka data with Flink SQL | Hi all, a quick question: my Kafka data arrives in a nested format, with ROW types nested inside an ARRAY, and I would like to get the values of the innermost fields directly through a Flink SQL CREATE TABLE statement

Re: Parsing Kafka data with Flink SQL

2022-07-04 Thread Xuyang
Hi, I could not find the option 'json.infer-schema.flatten-nested-columns.enable'='true' on the current Flink master. You could try reading the complete data at the source and then expanding the nested type manually with a UDF. On 2022-06-30 15:02:55, "小昌同学" wrote: Hi all, a quick question: my Kafka data arrives in a nested format, with ROW types nested inside an ARRAY, and I would like to get the values of the innermost fields directly through a Flink SQL CREATE TABLE statement. On Baidu I found
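
A built-in alternative to the UDF route discussed in this thread: Flink SQL can flatten an ARRAY<ROW<...>> column with CROSS JOIN UNNEST. A minimal sketch, assuming a hypothetical topic and field layout (the names kafka_source, events, user_id and amount are made up for illustration):

    import org.apache.flink.table.api.EnvironmentSettings;
    import org.apache.flink.table.api.TableEnvironment;

    public class UnnestNestedJson {
        public static void main(String[] args) {
            TableEnvironment tEnv =
                    TableEnvironment.create(EnvironmentSettings.inStreamingMode());

            // Declare the nested JSON structure explicitly: an ARRAY of ROWs.
            // Topic and field names are placeholders for illustration.
            tEnv.executeSql(
                    "CREATE TABLE kafka_source (" +
                    "  id STRING," +
                    "  events ARRAY<ROW<user_id STRING, amount DOUBLE>>" +
                    ") WITH (" +
                    "  'connector' = 'kafka'," +
                    "  'topic' = 'my-topic'," +
                    "  'properties.bootstrap.servers' = 'localhost:9092'," +
                    "  'format' = 'json'" +
                    ")");

            // CROSS JOIN UNNEST turns each array element into its own row,
            // exposing the innermost ROW fields as ordinary columns.
            tEnv.executeSql(
                    "SELECT s.id, e.user_id, e.amount " +
                    "FROM kafka_source AS s " +
                    "CROSS JOIN UNNEST(s.events) AS e (user_id, amount)")
                .print();
        }
    }

Each array element becomes its own row, so the innermost ROW fields can be selected as ordinary columns without a custom UDF.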

Re: Re: what time support java17?

2022-07-04 Thread Xuyang
Hi, the community already has an issue [1] working toward Java 17 support; you can follow it there. [1] https://issues.apache.org/jira/browse/FLINK-15736 On 2022-07-04 17:04:52, "jiangjiguang719" wrote: >Is nobody answering this question? Spark already supports Java 17. > >At 2022-07-02 18:15:10, "jiangjiguang719" wrote: >>hi guys: >>when can flink support

Re: Can FIFO compaction with RocksDB result in data loss?

2022-07-04 Thread Hangxiang Yu
Hi, Vishal. IIUC: 1. FIFO compaction drops old data once the configured size in L0 is exceeded, so old data may be dropped without us knowing. That's why "it's basically a TTL compaction style and It is suited for keeping event log data with very low overhead (query log for example)". If it's the

Re: [CEP] State compatibility when a pattern is modified

2022-07-04 Thread Dian Fu
Hi Nicolas, The state isn't compatible; besides, the partial matches will not be dropped, so the behavior is undefined. The original events will be dropped after being evaluated, so when the pattern changes there is no way to evaluate them against the new pattern. Regards, Dian

Re: Can FIFO compaction with RocksDB result in data loss?

2022-07-04 Thread Zhanghao Chen
Hi Vishal, FIFO compaction with RocksDB can result in data loss, as it discards the oldest SST file based on a size trigger or on TTL (an internal RocksDB option, unrelated to the TTL setting in Flink). So use with caution. I'm not sure why you observed SST file compactions; did you
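
For anyone who still wants to experiment with FIFO compaction in Flink despite this caveat, here is a minimal sketch of a custom RocksDB options factory; the 1 GB size trigger is an arbitrary placeholder, not a recommendation:

    import java.util.Collection;

    import org.apache.flink.contrib.streaming.state.RocksDBOptionsFactory;
    import org.rocksdb.ColumnFamilyOptions;
    import org.rocksdb.CompactionOptionsFIFO;
    import org.rocksdb.CompactionStyle;
    import org.rocksdb.DBOptions;

    public class FifoCompactionOptionsFactory implements RocksDBOptionsFactory {

        @Override
        public DBOptions createDBOptions(
                DBOptions currentOptions, Collection<AutoCloseable> handlesToClose) {
            return currentOptions;
        }

        @Override
        public ColumnFamilyOptions createColumnOptions(
                ColumnFamilyOptions currentOptions, Collection<AutoCloseable> handlesToClose) {
            // Oldest SST files are deleted outright once the total size
            // exceeds this threshold; this is exactly where silent data
            // loss can occur.
            CompactionOptionsFIFO fifo =
                    new CompactionOptionsFIFO().setMaxTableFilesSize(1024L * 1024 * 1024);
            handlesToClose.add(fifo);
            return currentOptions
                    .setCompactionStyle(CompactionStyle.FIFO)
                    .setCompactionOptionsFIFO(fifo);
        }
    }

The factory would then be registered with the state.backend.rocksdb.options-factory configuration option.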

Re: How to mock new DataSource/Sink

2022-07-04 Thread Alexander Fedulov
Hi David, I started working on FLIP-238 exactly with the concerns you've mentioned in mind. It is currently in development, feel free to join the discussion [1]. If you need something ASAP and are not interested in rate-limiting functionality, you could drop this [2] class into your tests

Re: Can FIFO compaction with RocksDB result in data loss?

2022-07-04 Thread Alexander Fedulov
Hi Vishal, I am not sure I get what you mean by question #2: >2. SST files get created each time a checkpoint is triggered. At this point, does the data for a given key get merged in case the initial data was read from an SST file while the update must have happened in memory? Could you

Re: Parsing Protobuf with the Flink SQL Client

2022-07-04 Thread Min Tu
Thanks a lot, we will give this PR a try. On Sun, Jul 3, 2022 at 5:21 PM Benchao Li wrote: > Hi Min, > > There is a related PR for the ProtoBuf format [1]; we are pushing its review and improvements forward, and expect it to be released in 1.16. You can also build and package the code from this PR to try it out ahead of time. > > [1] https://github.com/apache/flink/pull/14376 > > On Mon, Jul 4, 2022 at 02:38, Min Tu wrote: > >> Hi all, >> >> We would like to use Flink SQL

Re: ContinuousFileMonitoringFunction retrieved invalid state.

2022-07-04 Thread Vishal Surana
Wow! This is bad! I am using reactive mode and this is indeed the issue. This should have been urgently patched, as jobs on the upgraded Flink version are in a very precarious position. With all the other upgrades (RocksDB, etc.) going into 1.15.0 there's no easy rollback. On Fri, Jul 1, 2022 at 8:14

Can FIFO compaction with RocksDB result in data loss?

2022-07-04 Thread Vishal Surana
In my load tests, I've found FIFO compaction to offer the best performance, as my job needs state only for so long. However, this particular statement in the RocksDB documentation concerns me: "Since we never rewrite the key-value pair, we also don't ever apply the compaction filter on the keys."
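
Note that the quoted sentence also has a Flink-specific consequence: Flink's RocksDB state TTL cleanup is implemented as a RocksDB compaction filter, which FIFO compaction never applies. A minimal sketch of that cleanup style for contrast (the TTL value and state name are placeholders):

    import org.apache.flink.api.common.state.StateTtlConfig;
    import org.apache.flink.api.common.state.ValueStateDescriptor;
    import org.apache.flink.api.common.time.Time;

    public class TtlStateExample {
        public static ValueStateDescriptor<String> ttlDescriptor() {
            StateTtlConfig ttlConfig = StateTtlConfig
                    .newBuilder(Time.hours(10))
                    // Cleanup runs inside RocksDB's compaction filter; under
                    // FIFO compaction that filter is never invoked, so expired
                    // entries would only disappear when whole SST files are
                    // dropped by the size/TTL triggers.
                    .cleanupInRocksdbCompactFilter(1000)
                    .build();

            ValueStateDescriptor<String> descriptor =
                    new ValueStateDescriptor<>("session-state", String.class);
            descriptor.enableTimeToLive(ttlConfig);
            return descriptor;
        }
    }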

[CEP] State compatibility when a pattern is modified

2022-07-04 Thread Nicolas Richard
Hello! What happens to partial matches if I deploy a new version of a CEP application with a modified pattern? * Application v1 looks for pattern a b c * Application v2 looks for pattern a b+ d c Is state compatible? Are partial matches dropped when a new version of an application is
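
For concreteness, the two patterns from the question might look like this in the CEP API (a sketch with a placeholder event type, and without the where() conditions a real application would attach to each step):

    import org.apache.flink.cep.pattern.Pattern;

    public class Patterns {
        // Placeholder event type standing in for the application's own events.
        public static class Event {}

        // Application v1: a b c
        public static Pattern<Event, ?> v1() {
            return Pattern.<Event>begin("a")
                    .next("b")
                    .next("c");
        }

        // Application v2: a b+ d c
        public static Pattern<Event, ?> v2() {
            return Pattern.<Event>begin("a")
                    .next("b").oneOrMore()
                    .next("d")
                    .next("c");
        }
    }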

RE: Unaligned checkpoint waiting in 'start delay' with AsyncDataStream

2022-07-04 Thread Nathan Sharp
Thank you for trying it out! Hopefully, there is just some setting that needs to be changed. I have an Ubuntu VM where I created a single node Docker swarm. Then I used the following command to run Flink 1.15.0 using the docker-compose.yml file in the repository: docker stack up -c

Re: Alternate Forms of Deserialization in Flink SQL CLI

2022-07-04 Thread Martijn Visser
Hi Eric, It would basically mean implementing Protobuf as is now done for AVRO or JSON. There is a pull request on adding Protobuf support currently being reviewed, hopefully that will make it in Flink 1.16 [1]. You could consider trying that out of course. Best regards, Martijn [1]
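
Once merged, usage should look roughly like any other format. A speculative sketch based on the options proposed in that pull request (option names may still change before release; the table, topic, and message class are made up):

    import org.apache.flink.table.api.EnvironmentSettings;
    import org.apache.flink.table.api.TableEnvironment;

    public class ProtobufSourceSketch {
        public static void main(String[] args) {
            TableEnvironment tEnv =
                    TableEnvironment.create(EnvironmentSettings.inStreamingMode());

            // Hypothetical DDL: the 'protobuf' format and its
            // 'protobuf.message-class-name' option follow the pending PR
            // and may differ in the released version.
            tEnv.executeSql(
                    "CREATE TABLE orders (" +
                    "  order_id STRING," +
                    "  amount DOUBLE" +
                    ") WITH (" +
                    "  'connector' = 'kafka'," +
                    "  'topic' = 'orders'," +
                    "  'properties.bootstrap.servers' = 'localhost:9092'," +
                    "  'format' = 'protobuf'," +
                    "  'protobuf.message-class-name' = 'com.example.OrderProto$Order'" +
                    ")");
        }
    }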

Re: Does Flink 1.14 support consuming Kafka 0.9?

2022-07-04 Thread Martijn Visser
Hi, You can't use the released Flink 1.14 to connect to a Kafka 0.9 cluster. If you want to do it yourself, you would have to change the Flink source code to incorporate the downgrade and all the API changes from the currently used Kafka Client version (v2.4.1) [1] to a version that is still

Re: A question about Flink job submission

2022-07-04 Thread Weihua Hu
Hi, from your description you are using a session cluster and submitting jobs via the command line; in that case the job id indeed only appears in the logs, at INFO level. You could try submitting jobs through the REST API [1], which returns the JobID. However, that changes the overall submission flow quite a bit, so alternatively you can set the client-side log level to INFO, which does not print that many logs. [1] https://nightlies.apache.org/flink/flink-docs-master/docs/ops/rest_api/#jars-jarid-run Best, Weihua On Mon, Jul 4, 2022 at
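
A minimal sketch of that REST route, assuming the jar was previously uploaded via POST /jars/upload (the jar id here is made up; on success the response body carries a 'jobid' field):

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class RestSubmit {
        public static void main(String[] args) throws Exception {
            // Jar id as returned by a prior POST to /jars/upload (made up here).
            String jarId = "myjob.jar";

            HttpClient client = HttpClient.newHttpClient();
            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create("http://localhost:8081/jars/" + jarId + "/run"))
                    .header("Content-Type", "application/json")
                    .POST(HttpRequest.BodyPublishers.ofString("{}"))
                    .build();

            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());

            // On success the body looks like {"jobid":"<32-hex-chars>"}.
            System.out.println(response.body());
        }
    }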

Re: Unaligned checkpoint waiting in 'start delay' with AsyncDataStream

2022-07-04 Thread Chesnay Schepler
I ran your code in the IDE and it worked just fine; checkpoints are being completed and results are printed to the console. Can you expand on how you run the job? On 02/07/2022 00:26, Nathan Sharp wrote: I am attempting to use unaligned checkpointing with AsyncDataStream, but the checkpoints
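
For reference, the wiring under discussion is small. A sketch with a trivial placeholder async function instead of a real external call:

    import java.util.Collections;
    import java.util.concurrent.CompletableFuture;
    import java.util.concurrent.TimeUnit;

    import org.apache.flink.streaming.api.datastream.AsyncDataStream;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.functions.async.ResultFuture;
    import org.apache.flink.streaming.api.functions.async.RichAsyncFunction;

    public class UnalignedAsyncJob {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env =
                    StreamExecutionEnvironment.getExecutionEnvironment();
            env.enableCheckpointing(10_000);
            // The setting under discussion in this thread.
            env.getCheckpointConfig().enableUnalignedCheckpoints();

            DataStream<Long> source = env.fromSequence(0, 1_000_000);

            DataStream<Long> result = AsyncDataStream.unorderedWait(
                    source,
                    new RichAsyncFunction<Long, Long>() {
                        @Override
                        public void asyncInvoke(Long input, ResultFuture<Long> future) {
                            // Placeholder async work; a real job would call
                            // an external service here.
                            CompletableFuture.runAsync(() ->
                                    future.complete(Collections.singleton(input * 2)));
                        }
                    },
                    1, TimeUnit.SECONDS,   // timeout per request
                    100);                  // max in-flight requests

            result.print();
            env.execute("unaligned-async-sketch");
        }
    }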

Re: A question about Flink job submission

2022-07-04 Thread Lijie Wang
Hi, by "cannot get the task id" do you mean the Flink job id? Also, what does your deployment mode look like? For session/per-job modes, where the job graph is compiled on the client side, you can print the job id in the main method. Best, Lijie On Mon, Jul 4, 2022 at 17:51, sherlock zw wrote: > Currently I need to monitor Flink jobs that have already been submitted,
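
A sketch of printing the job id from the main method as suggested: executeAsync() returns a JobClient that exposes it (the job body here is a trivial placeholder):

    import org.apache.flink.core.execution.JobClient;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class PrintJobId {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env =
                    StreamExecutionEnvironment.getExecutionEnvironment();

            env.fromSequence(0, 100).print();

            // executeAsync() submits the job and returns immediately with a
            // JobClient, unlike execute(), which blocks until the job finishes.
            JobClient jobClient = env.executeAsync("my-job");
            System.out.println("Submitted job with id: " + jobClient.getJobID());
        }
    }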

Re: How to mock new DataSource/Sink

2022-07-04 Thread Chesnay Schepler
It is indeed not easy to mock sources/sinks with the new interfaces. There is an effort to make this easier for sources in the future (FLIP-238). For the time being I'd stick with the
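
In the meantime, one workaround is to drive the pipeline from a bounded built-in source that already implements the new Source interface and pull the results back with executeAndCollect, instead of mocking Kafka itself. A minimal sketch (the map step stands in for the real pipeline):

    import java.util.ArrayList;
    import java.util.List;

    import org.apache.flink.api.common.eventtime.WatermarkStrategy;
    import org.apache.flink.api.common.typeinfo.Types;
    import org.apache.flink.api.connector.source.lib.NumberSequenceSource;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.util.CloseableIterator;

    public class PipelineTestSketch {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env =
                    StreamExecutionEnvironment.getExecutionEnvironment();

            // A bounded new-API source standing in for the real Kafka source.
            DataStream<Long> input = env.fromSource(
                    new NumberSequenceSource(1, 10),
                    WatermarkStrategy.noWatermarks(),
                    "mock-source");

            // The pipeline under test (a trivial stand-in transformation).
            DataStream<Long> output = input.map(x -> x * 2).returns(Types.LONG);

            // executeAndCollect replaces the sink and pulls the results back
            // into the test so they can be asserted on.
            List<Long> results = new ArrayList<>();
            try (CloseableIterator<Long> it = output.executeAndCollect()) {
                it.forEachRemaining(results::add);
            }
            System.out.println(results);
        }
    }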

Re: Will repeatedly submitting jobs cause TaskManager metaspace OOM?

2022-07-04 Thread tison
Something seems off with your email address; it may have been blocked by the mailing list, or you may not have subscribed via user-zh-subscr...@flink.apache.org first. Someone already replied to you above, but it seems you have not seen it. You can search for how to subscribe to the mailing list; I remember Flink China published a step-by-step tutorial. In the meantime you can read the other replies at https://lists.apache.org/thread/7v19bkqqwp49vpdmkcr4yvdh6bn5bfkm. Best, tison. On Mon, Jul 4, 2022 at 17:11, LuNing Wang wrote: >

Re: Will repeatedly submitting jobs cause TaskManager metaspace OOM?

2022-07-04 Thread LuNing Wang
For now I think the best workaround is to restart the JM and TM processes periodically. On Mon, Jul 4, 2022 at 17:07, 知而不惑 wrote: > Has nobody answered this question? > > -- Original message -- > From: "知而不惑" <chenliangv...@qq.com>

Re: Will repeatedly submitting jobs cause TaskManager metaspace OOM?

2022-07-04 Thread 知而不惑
[Message body mis-encoded in the archive and unrecoverable.]

Re: what time support java17?

2022-07-04 Thread jiangjiguang719
Is nobody answering this question? Spark already supports Java 17. At 2022-07-02 18:15:10, "jiangjiguang719" wrote: >hi guys: >when can flink support java17? I urgently want to use java17's new >features. > >I have tried to upgrade to java17, but failed. Has anyone succeeded? > >thanks, >jiguang

How to mock new DataSource/Sink

2022-07-04 Thread David Jost
Hi, we are currently looking at replacing our sinks and sources with the respective counterparts using the 'new' data source/sink API (mainly Kafka). What holds us back is that we are not sure how to test the pipeline with mocked sources/sinks. Up till now, we somewhat followed the 'Testing