Re: 求问为什么KafkaDynamicSource 在批模式下不能运行

2020-12-26 Thread Shengkai Fang
现在Flink 走的是流批一体, 为什么说 Kafka 不支持批模式呢? wxpcc 于2020年12月25日周五 下午7:10写道: > kafka在我们这边场景上除了用来存放实时流式数据,还会用作临时大数据量的存储,主要用于: > > 1. > > 数据同步时,将全量数据同步到一个临时的kafka中,增量数据持续性同步到kafka中,目前我们都使用流模式消费其中的数据,就会有手动停止,或者借助指标等自动停止流式任务 > 2. 数据恢复时 > 3. 临时查看某个时间区间的数据 > > 如果批模式 sql能够完成这些事情的话那该多好 > > > > -- > Sent from:

Re: rowtime的时区问题

2020-12-26 Thread Shengkai Fang
社区正在解决这个问题,1.13应该会有一个系统性地修复。 CC Leonard 作为work around,可以参考下这个博客[1] [1] https://blog.csdn.net/tzs_1041218129/article/details/109064015?utm_medium=distribute.pc_relevant.none-task-blog-title-3=1001.2101.3001.4242 ゞ野蠻遊戲χ 于2020年12月26日周六 下午11:16写道: > Hi 大家好 > >

Re: Flink SQL并发度设置问题

2020-12-26 Thread Shengkai Fang
可以通过该配置[1]来设置 [1] https://ci.apache.org/projects/flink/flink-docs-master/dev/table/config.html#table-exec-resource-default-parallelism 赵一旦 于2020年12月27日周日 下午12:44写道: > 了解下多少数据量呀,128的并发其实很高了感觉。 > > guaishushu1...@163.com 于2020年12月26日周六 下午5:39写道: > > > Flink > > >

Re: Re: checkpoint delay consume message

2020-12-26 Thread nick toker
Hi, Hi, We think we are using the default values unless we are missing something. So this doesn't explain the problem we are facing. Could you please tell us how to choose synchronous or asynchronous checkpoints just to be sure we are using the correct configuration ? BR, Nick ‫בתאריך יום ה׳,

Re: Long latency when consuming a message from KAFKA and checkpoint is enabled

2020-12-26 Thread nick toker
Hi any idea? is it a bug? regards' nick ‫בתאריך יום ד׳, 23 בדצמ׳ 2020 ב-11:10 מאת ‪nick toker‬‏ <‪ nick.toker@gmail.com‬‏>:‬ > Hello > > We noticed the following behavior: > If we enable the flink checkpoints, we saw that there is a delay between > the time we write a message to the KAFKA

Re: Dynamic StreamingFileSink

2020-12-26 Thread Rafi Aroch
Hi Sidney, Have a look at implementing a BucketAssigner for StreamingFileSink: https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/streamfile_sink.html#bucket-assignment Rafi On Sat, Dec 26, 2020 at 11:48 PM Sidney Feiner wrote: > Hey, > I would like to create a dynamic

Re: Chaining DataStream API Functions results in collision in Flink Stateful Function

2020-12-26 Thread Le Xu
Hello! I seem to wire the wrong function by attaching the first function's output to the remote reply function of the second function. The process works great now. Thanks! Le On Sat, Dec 26, 2020 at 11:23 PM Le Xu wrote: > I apologize -- I meant to point out line 152, where context.send was

Re: Chaining DataStream API Functions results in collision in Flink Stateful Function

2020-12-26 Thread Le Xu
I apologize -- I meant to point out line 152, where context.send was used. On Sat, Dec 26, 2020 at 11:21 PM Le Xu wrote: > Hi Igal: > > Thanks for the suggestion -- I changed the implementation based on your > suggestion by attaching the second function right after the first one using > the

Re: Chaining DataStream API Functions results in collision in Flink Stateful Function

2020-12-26 Thread Le Xu
Hi Igal: Thanks for the suggestion -- I changed the implementation based on your suggestion by attaching the second function right after the first one using the same builder. The only difference is that except the first function send to egress -- it now sends to the the second function and then

Re: Flink SQL并发度设置问题

2020-12-26 Thread 赵一旦
了解下多少数据量呀,128的并发其实很高了感觉。 guaishushu1...@163.com 于2020年12月26日周六 下午5:39写道: > Flink > SQL中Source和sink可以通过修改connector配置实现并发度配置,而其他算子的并发度都是根据Source并发度来设置的,这样最多是128个并发度。但是有些算子做聚合等处理,128并发明显不够这个应该怎么解决呢?支持通过配置设置其他算子并发度吗? > > > > guaishushu1...@163.com >

????????

2020-12-26 Thread ?I ?? ?? ?I
??

Re: Issue in WordCount Example with DataStream API in BATCH RuntimeExecutionMode

2020-12-26 Thread Derek Sheng
Thank you both very much! Happy holidays! Aljoscha Krettek 于2020年12月24日周四 下午4:00写道: > Thanks for reporting this! This is not the expected behaviour, I created > a Jira Issue: https://issues.apache.org/jira/browse/FLINK-20764. > > Best, > Aljoscha > > On 23.12.20 22:26, David Anderson wrote: > >

Re: Flink Stateful Function: The program's entry point class not found in the jar file

2020-12-26 Thread Le Xu
Thanks Igal! I might be missing something here. I did place statefun-flink-distribution as part of my dependency in the pom (see line 46 at [1]). Is there a correct way to include the jar? I'm having the same problem across many examples I'm running. [1]

Dynamic StreamingFileSink

2020-12-26 Thread Sidney Feiner
Hey, I would like to create a dynamic StreamingFileSink for my Streaming pipeline. By dynamic, I mean that it will write to a different directory based on the input. For example, redirect the row to a different directory based on the first 2 characters of the input, so if the content I'm writing

Re: Flink Stateful Function: The program's entry point class not found in the jar file

2020-12-26 Thread Igal Shilman
Hello :-) It seems like in your attached pom you are not bundling the dependencies. Check out the docs here [1]. [1] https://ci.apache.org/projects/flink/flink-statefun-docs-release-2.2/deployment-and-operations/packaging.html#flink-jar On Wed, Dec 23, 2020 at 3:07 AM Le Xu wrote: > Hello!

Re: Chaining DataStream API Functions results in collision in Flink Stateful Function

2020-12-26 Thread Igal Shilman
Hi Le, You can attach many different functions in a single StateFun builder, and let them message each other. In your example, you can make the "Greet" function message Greet2 directly (in addition to emitting a message as an egress). Embedding multiple copies of StateFun within a Datastream

Flink reads data from JDBC table only on startup

2020-12-26 Thread Taras Moisiuk
Hi everyone! I'm using Flink 1.12.0 with SQL API. I'm developing a streaming job with join and insertion into postgreSQL. There is two tables in join: 1. Dynamic table based on kafka topic 2. Small lookup JDBC table >From what I can see Flink job reads data from JDBC table only on startup and

rowtime??????????

2020-12-26 Thread ?g???U?[????
Hi ?? DataStreamtableeventTimerowtime??rowtime8??

table rowtime timezome problem

2020-12-26 Thread ?g???U?[????
Hi all When DataStream is converted to table, eventTime is converted to rowTime. Rowtime is 8 hours slow. How to solve this problem? Thanks Jiazhi

Flink SQL并发度设置问题

2020-12-26 Thread guaishushu1...@163.com
Flink SQL中Source和sink可以通过修改connector配置实现并发度配置,而其他算子的并发度都是根据Source并发度来设置的,这样最多是128个并发度。但是有些算子做聚合等处理,128并发明显不够这个应该怎么解决呢?支持通过配置设置其他算子并发度吗? guaishushu1...@163.com

Re: Flink 操作hive 一些疑问

2020-12-26 Thread Jacob
Thanks! 还是决定暂时用两个job执行吧 一个job执行流处理生成数据 另一个job(使用flink的hive功能)执行批处理。 主要验证一下第二个job的使用相比MapReduce节省了多少资源 - Thanks! Jacob -- Sent from: http://apache-flink.147419.n8.nabble.com/