Re: How to install Flink + YARN?

2019-12-06 Thread Pankaj Chand
Is it required to use exactly the same Hadoop version as the pre-bundled Hadoop version? I'm using a Hadoop 2.7.1 cluster with Flink 1.9.1 and the corresponding Pre-bundled Hadoop 2.7.5. When I submit a job using: [vagrant@node1 flink]$ ./bin/flink run -m yarn-cluster
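For reference, a minimal per-job submission to YARN usually looks like the sketch below; the config path, parallelism and example jar are placeholders rather than anything from this thread:

export HADOOP_CONF_DIR=/etc/hadoop/conf         # cluster's Hadoop config dir (assumed path)
export HADOOP_CLASSPATH=$(hadoop classpath)     # or rely on the pre-bundled Hadoop jar placed in flink/lib
./bin/flink run -m yarn-cluster -p 2 ./examples/streaming/WordCount.jar

As far as I know, an exact patch-level match between the cluster's Hadoop and the pre-bundled jar is not strictly required as long as both come from a compatible release line.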

Re: Flink writes 50,000 records to MySQL too slowly; besides multithreading, is there another way? Batch writes are already in use

2019-12-06 Thread sun
---- ??:"acoldbear"<1392427...@qq.com; :2019??12??7??(??) 12:33 ??:"user-zh"

Re: Flink writes 50,000 records to MySQL too slowly; besides multithreading, is there another way? Batch writes are already in use

2019-12-06 Thread sun
?? ---- ??:"18612537914"<18612537...@163.com; :2019??12??7??(??) 12:32 ??:"user-zh"

Re: Flink writes 50,000 records to MySQL too slowly; besides multithreading, is there another way? Batch writes are already in use

2019-12-06 Thread 18612537914
You can write the data to Hive and then import it into MySQL directly with Sqoop. Sent from my iPhone > On Dec 7, 2019, at 12:20 PM, sun <1392427...@qq.com> wrote: > > Hi, a question: > Flink writes 50,000 records to MySQL too slowly; besides multithreading, is there another way? Batch writes are already in use, > each write is 5
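A rough sketch of that Hive-then-Sqoop route; the connection string, credentials, table and export directory below are all hypothetical:

# 1) have Flink write the batch results to an HDFS/Hive warehouse directory
# 2) bulk-load that directory into MySQL with Sqoop
sqoop export \
  --connect jdbc:mysql://mysql-host:3306/mydb \
  --username myuser -P \
  --table results \
  --export-dir /warehouse/results \
  --input-fields-terminated-by '\001'    # default Hive field delimiter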

Re: How to use Flink SQL to continuously query the number of users who logged in to the website in the past 30 minutes?

2019-12-06 Thread 陈帅
Your platform is quite convenient for quick verification; is it an extension of Flink SQL? It didn't completely solve my problem, but thank you anyway. Yuan,Youjun wrote on Thursday, December 5, 2019 at 10:41 AM: > This can be handled with a 30-minute RANGE OVER window, but the two zero-value outputs you mentioned are probably not achievable: no data, no output. > Assuming your input has two fields, ts and userid (the timestamp and the user id), the SQL would look like this: > INSERT INTO mysink > SELECT > ts, userid, > COUNT(userid) > OVER (PARTITION BY userid
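For completeness, the 30-minute RANGE OVER query sketched in that reply would look roughly like this; mysink, ts and userid are the names used in the reply, mysource and login_cnt are invented here, and ts has to be declared as an event-time (rowtime) attribute:

INSERT INTO mysink
SELECT
  ts, userid,
  COUNT(userid) OVER (
    PARTITION BY userid
    ORDER BY ts
    RANGE BETWEEN INTERVAL '30' MINUTE PRECEDING AND CURRENT ROW
  ) AS login_cnt
FROM mysource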

Re: StreamingFileSink doesn't close multipart uploads to s3?

2019-12-06 Thread Li Peng
Ok I seem to have solved the issue by enabling checkpointing. Based on the docs (I'm using 1.9.0), it seemed like only StreamingFileSink.forBulkFormat() should've required checkpointing, but based on this
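For anyone else hitting this, a minimal sketch of the fix described above, i.e. enabling checkpointing so the row-format sink can finalize its in-progress (multipart) part files; the bucket path, checkpoint interval and stand-in source are placeholders:

import org.apache.flink.api.common.serialization.SimpleStringEncoder;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;

public class S3SinkSketch {
  public static void main(String[] args) throws Exception {
    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    env.enableCheckpointing(60_000); // pending part files are committed when a checkpoint completes

    DataStream<String> lines = env.socketTextStream("localhost", 9999); // stand-in source

    StreamingFileSink<String> sink = StreamingFileSink
        .forRowFormat(new Path("s3://my-bucket/output"), new SimpleStringEncoder<String>("UTF-8"))
        .build();

    lines.addSink(sink);
    env.execute("streaming-file-sink-sketch");
  }
}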

StreamingFileSink doesn't close multipart uploads to s3?

2019-12-06 Thread Li Peng
Hey folks, I'm trying to get StreamingFileSink to write to S3 every minute, with flink-s3-fs-hadoop, and based on the default rolling policy, which is configured to "roll" every 60 seconds, I thought that would happen automatically (I interpreted rolling to mean actually closing a multipart upload to S3).

Re: What S3 Permissions does StreamingFileSink need?

2019-12-06 Thread Li Peng
Ah, I figured it out after all; it turns out it was due to KMS encryption on the bucket. I needed to add KMS permissions to the IAM role, otherwise there is an unauthorized error. Thanks for your help! On Fri, Dec 6, 2019 at 2:34 AM Khachatryan Roman < khachatryan.ro...@gmail.com> wrote: > Hey Li, >

Re: [DISCUSS] Adding e2e tests for Flink's Mesos integration

2019-12-06 Thread Piyush Narang
+1 from our end as well. At Criteo, we are running some Flink jobs on Mesos in production to compute short-term features for machine learning. We’d love to help out and contribute to this initiative. Thanks, -- Piyush From: Till Rohrmann Date: Friday, December 6, 2019 at 8:10 AM To: dev Cc:

FLINK WEEKLY 2019/48

2019-12-06 Thread tison
FLINK WEEKLY 2019/48 Thanks to community member forideal for writing this issue of FLINK WEEKLY! User questions - How to become a Flink contributor

Re: Joining multiple temporal tables

2019-12-06 Thread Benoît Paris
Hi all! I believe this is a duplicate of another JIRA, https://issues.apache.org/jira/browse/FLINK-14200, where the query side does not accept a Table, only a TableSource (or has planner rule issues). I think in this case, the LogicalCorrelate extracted from the Temporal Table join transforms

Re: Joining multiple temporal tables

2019-12-06 Thread Kurt Young
Hi Chris, If you are only interested in the latest data of the dimension table, maybe you can try the temporal table join: https://ci.apache.org/projects/flink/flink-docs-release-1.9/dev/table/sql.html#operations see "Join with Temporal Table" Best, Kurt On Fri, Dec 6, 2019 at 11:13 PM Fabian Hueske
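For reference, the "Join with Temporal Table" pattern from that page looks roughly like this; Orders, Rates and the column names are illustrative, and Rates must first be registered as a temporal table function over the dimension (rates) table:

SELECT
  o.order_id,
  o.amount * r.rate AS converted_amount
FROM Orders AS o,
  LATERAL TABLE (Rates(o.order_time)) AS r
WHERE o.currency = r.currency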

Re: Joining multiple temporal tables

2019-12-06 Thread Fabian Hueske
Thank you! Please let me know if the workaround works for you. Best, Fabian On Fri., Dec. 6, 2019 at 16:11, Chris Miller wrote: > Hi Fabian, > > Thanks for confirming the issue and suggesting a workaround - I'll give > that a try. I've created a JIRA issue as you suggested, >

Re: Joining multiple temporal tables

2019-12-06 Thread Chris Miller
Hi Fabian, Thanks for confirming the issue and suggesting a workaround - I'll give that a try. I've created a JIRA issue as you suggested, https://issues.apache.org/jira/browse/FLINK-15112 Many thanks, Chris -- Original Message -- From: "Fabian Hueske" To: "Chris Miller" Cc:

Re: Joining multiple temporal tables

2019-12-06 Thread Fabian Hueske
Hi Chris, Your query looks OK to me. Moreover, you should get a SQLParseException (or something similar) if it weren't valid SQL. Hence, I assume you are running into a bug in one of the optimizer rules. I tried to reproduce the problem on the SQL training environment and couldn't write a

Re: Need help using AggregateFunction instead of FoldFunction

2019-12-06 Thread devinbost
I think there might be a bug in `.window(EventTimeSessionWindows.withGap(Time.seconds(5)))` (unless I'm just not using it correctly), because I'm able to get output when I use the simpler window `.timeWindow(Time.seconds(5))`. However, I don't get any output when I use the session-based window.
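In case it helps with debugging, here is a self-contained sketch (keys and timestamps made up) of an AggregateFunction on an event-time session window; with event-time session windows, "no output" usually means the watermark never advanced past the session gap, so the timestamp/watermark assignment is the first thing to check:

import org.apache.flink.api.common.functions.AggregateFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.TimeCharacteristic;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.timestamps.BoundedOutOfOrdernessTimestampExtractor;
import org.apache.flink.streaming.api.windowing.assigners.EventTimeSessionWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

public class SessionAggSketch {
  public static void main(String[] args) throws Exception {
    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);

    env.fromElements(Tuple2.of("a", 1_000L), Tuple2.of("a", 3_000L), Tuple2.of("b", 2_000L))
        // watermarks drive session-window firing; without them the window never closes
        .assignTimestampsAndWatermarks(
            new BoundedOutOfOrdernessTimestampExtractor<Tuple2<String, Long>>(Time.seconds(1)) {
              @Override
              public long extractTimestamp(Tuple2<String, Long> e) { return e.f1; }
            })
        .keyBy(0)
        .window(EventTimeSessionWindows.withGap(Time.seconds(5)))
        // count events per session; merge() is required because session windows merge
        .aggregate(new AggregateFunction<Tuple2<String, Long>, Long, Long>() {
          @Override public Long createAccumulator() { return 0L; }
          @Override public Long add(Tuple2<String, Long> value, Long acc) { return acc + 1; }
          @Override public Long getResult(Long acc) { return acc; }
          @Override public Long merge(Long a, Long b) { return a + b; }
        })
        .print();

    env.execute("session-window-aggregate-sketch");
  }
}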

Re: Flink 1.9.1 allocating more containers than needed

2019-12-06 Thread Chesnay Schepler
I would expect January. With 1.8.3 release being underway, 1.10 feature freeze coming close and, of course, Christmas, it seems unlikely that we'll manage to pump out another bugfix release in December. On 06/12/2019 15:18, eSKa wrote: Thank you for quick reply. Will wait for 1.9.2 then. I

Re: Flink 1.9.1 allocating more containers than needed

2019-12-06 Thread eSKa
Thank you for the quick reply. We will wait for 1.9.2 then. I believe you don't have any estimate of when that might happen? -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: Flink 1.9.1 allocating more containers than needed

2019-12-06 Thread Chesnay Schepler
Note that FLINK-10848 is included in 1.9.X, but it didn't fix the issue completely. On 06/12/2019 15:10, Chesnay Schepler wrote: There are some follow-up issues that are fixed for 1.9.2; release date for that is TBD. https://issues.apache.org/jira/browse/FLINK-12342

Re: Flink 1.9.1 allocating more containers than needed

2019-12-06 Thread Chesnay Schepler
There are some follow-up issues that are fixed for 1.9.2; release date for that is TBD. https://issues.apache.org/jira/browse/FLINK-12342 https://issues.apache.org/jira/browse/FLINK-13184 On 06/12/2019 15:08, eSKa wrote: Hello, recently we have upgraded our environment to from 1.6.4 to 1.9.1.

Re: Row arity of from does not match serializers.

2019-12-06 Thread Fabian Hueske
Hi, The inline lambda MapFunction produces a Row with 12 String fields (12 calls to String.join()). You use RowTypeInfo rowTypeDNS to declare the return type of the lambda MapFunction. However, rowTypeDNS is defined with many more String fields. The exception tells you that the number of fields
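To make the arity rule concrete, here is a toy sketch (the field contents are invented): the RowTypeInfo passed to returns() must declare exactly as many fields as the Row the map function actually builds, otherwise the arity/serializer mismatch error above is thrown:

import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.typeutils.RowTypeInfo;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.types.Row;

public class RowAritySketch {
  public static void main(String[] args) throws Exception {
    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

    // the lambda below emits rows with exactly 2 fields ...
    RowTypeInfo rowType = new RowTypeInfo(Types.STRING, Types.STRING); // ... so exactly 2 fields are declared here

    DataStream<Row> rows = env.fromElements("a", "b")
        .map(s -> Row.of(s.toLowerCase(), s.toUpperCase()))
        .returns(rowType);

    rows.print();
    env.execute("row-arity-sketch");
  }
}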

Flink 1.9.1 allocating more containers than needed

2019-12-06 Thread eSKa
Hello, recently we upgraded our environment from 1.6.4 to 1.9.1. We started to notice behaviour similar to what we saw in 1.6.2, which was allocating more containers on YARN than are needed by the job - I think it was fixed by https://issues.apache.org/jira/browse/FLINK-10848, but that one is still

Re: [DISCUSS] Adding e2e tests for Flink's Mesos integration

2019-12-06 Thread Till Rohrmann
Big +1 for adding a fully working e2e test for Flink's Mesos integration. Ideally we would have it ready for the 1.10 release. The lack of such a test has bitten us already multiple times. In general I would prefer to use the official image if possible since it frees us from maintaining our own

Re: Basic question about flink programms

2019-12-06 Thread KristoffSC
Hi, I'm having the same problem now. What is your approach now, after gaining some experience? Also, do you use Spring DI to set up/initialize your jobs/process functions? -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: flink1.9.0 standalone mode high-availability configuration question

2019-12-06 Thread pengchenglin
I just tested this for you: after starting jobmanager.sh ip port on both nodes and then killing one of them, the other node's web UI reports that leader election is in progress, which shows that high availability is working. From: pengchenglin Sent: 2019-12-06 19:47 To: user-zh Subject: flink1.9.0 standalone mode high-availability configuration question Hi all: Flink 1.9.0, two machines, identical flink-conf.yaml (high-availability.zookeeper.path.root and high-availability.cluster-id are also the same),

flink1.9.0 standalone mode high-availability configuration question

2019-12-06 Thread pengchenglin
Hi all: Flink 1.9.0, two machines, identical flink-conf.yaml (high-availability.zookeeper.path.root and high-availability.cluster-id are also the same). First run bin/jobmanager.sh start ip1 port on machine 1, then run bin/jobmanager.sh start ip2 port on machine 2. Both ip1:port and ip2:port are reachable at the same time; unlike 1.7.2, it does not redirect to ip1:port. Is this a configuration problem, or is this simply how high availability behaves in 1.9.0?
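For comparison, a minimal ZooKeeper-based HA block in flink-conf.yaml looks roughly like the sketch below; the quorum hosts and storage directory are placeholders, not values from this thread:

high-availability: zookeeper
high-availability.zookeeper.quorum: zk1:2181,zk2:2181,zk3:2181
high-availability.zookeeper.path.root: /flink
high-availability.cluster-id: /my_standalone_cluster
high-availability.storageDir: hdfs:///flink/ha/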

Re: [DISCUSS] Drop Kafka 0.8/0.9

2019-12-06 Thread Benchao Li
+1 for dropping. Zhenghua Gao wrote on Thursday, December 5, 2019 at 4:05 PM: > +1 for dropping. > > *Best Regards,* > *Zhenghua Gao* > > > On Thu, Dec 5, 2019 at 11:08 AM Dian Fu wrote: > > > +1 for dropping them. > > > > Just FYI: there was a similar discussion a few months ago [1]. > > > > [1] > > >

Re: What S3 Permissions does StreamingFileSink need?

2019-12-06 Thread Khachatryan Roman
Hey Li, > my permissions is as listed above As I understand it, that's a Terraform script above. But what are the actual permissions in AWS? It also makes sense to make sure that they are associated with the right role, and the role with the user. > Maybe I need to add the directory level as a

KeyBy/Rebalance overhead?

2019-12-06 Thread Komal Mariam
Hello everyone, I want to get some insight into the KeyBy (and Rebalance) operations: as I understand them, they partition our work across the defined parallelism and thus should make our pipeline faster. I am reading a topic which contains 170,000,000 pre-stored records with 11 Kafka
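As a small illustration of the two operations (topic, servers and parallelism below are made up): keyBy hash-partitions records by key so that all records with the same key go to the same parallel subtask, while rebalance simply round-robins records across subtasks; both introduce a network shuffle:

import java.util.Properties;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

public class PartitioningSketch {
  public static void main(String[] args) throws Exception {
    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

    Properties props = new Properties();
    props.setProperty("bootstrap.servers", "kafka:9092"); // placeholder
    props.setProperty("group.id", "benchmark");

    DataStream<String> source = env
        .addSource(new FlinkKafkaConsumer<>("my-topic", new SimpleStringSchema(), props))
        .setParallelism(11); // e.g. one source subtask per Kafka partition

    // hash-partition by key: required for keyed state / keyed windows
    source.keyBy(value -> value)
          .map(String::toUpperCase)
          .setParallelism(12)
          .print();

    // round-robin redistribution: only evens out load across subtasks
    source.rebalance()
          .map(String::toLowerCase)
          .setParallelism(12)
          .print();

    env.execute("keyby-vs-rebalance-sketch");
  }
}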

[DISCUSS] Adding e2e tests for Flink's Mesos integration

2019-12-06 Thread Yangze Guo
Hi all, Currently there is no end-to-end test or IT case for Mesos deployment, while common deployment-related development would inevitably touch the logic of this component. Thus, some work needs to be done to guarantee a good experience for both Mesos users and contributors. After offline

[flink-sql] How to define the rowtime attribute when creating a table with tableEnv.sqlUpdate(ddl)?

2019-12-06 Thread ????
When using tableEnv.sqlUpdate(ddl), how should the rowtime attribute be defined? The source is CSV-formatted data from Kafka, and the SQL is: CREATE

Re: How to explain the latency at different source injection rate?

2019-12-06 Thread Till Rohrmann
Hi Rui, it is hard to explain the results you are observing without knowing the complete benchmark setup. For example, it would be interesting to know the exact details of your job and the way you are measuring latencies. W/o this information I would only be able to guess things. Cheers, Till

Re: User program failures cause JobManager to be shutdown

2019-12-06 Thread Khachatryan Roman
Hi Dongwon, This should work but it could also interfere with Flink itself exiting in case of a fatal error. Regards, Roman On Fri, Dec 6, 2019 at 2:54 AM Dongwon Kim wrote: > FYI, we've launched a session cluster where multiple jobs are managed by a > job manager. If that happens, all the

Re: Question about Flink and Hadoop versions

2019-12-06 Thread jingwen jingwen
It is not a problem; Hadoop 2.8 and Hadoop 3.0 only differ in some features, and it will not affect your normal use of Flink. As for the Flink source build that never compiles successfully, you need to look into the cause of the failure; it may be an issue with the Maven repositories/mirrors. cljb...@163.com wrote on Friday, December 6, 2019 at 4:14 PM: > Hello: > > A question about Flink and Hadoop versions. Our production environment currently runs Hadoop 3.0+, and the website does not provide a ready-made Hadoop 3.0+ bundle for Flink 1.9+. > However, I downloaded Flink 1.9.1 myself and then downloaded the optional Pre-bundled

Question about Flink and Hadoop versions

2019-12-06 Thread cljb...@163.com
Hello: A question about Flink and Hadoop versions. Our production environment currently runs Hadoop 3.0+, and the website does not provide a ready-made Hadoop 3.0+ bundle for Flink 1.9+. However, I downloaded Flink 1.9.1 myself, downloaded the optional Pre-bundled Hadoop 2.8.3 (asc, sha1) component, put that jar in Flink's lib directory, and Hadoop operations work fine. Does this approach have any drawbacks? I ask because my own attempts to build the Flink source have never compiled successfully. Please advise! Thanks! 陈军 cljb...@163.com