flink table (title garbled in archive)

2020-07-21 Thread (sender name garbled)
Flink job with g_unit() reading from Kafka; mentions g_unit, g_summary and g_line (rest of the message garbled). from pyflink.datastream import StreamExecutionEnvironment, TimeCharacteristic, CheckpointingMode

Flink 1.11 startup problem

2020-07-21 Thread 酷酷的浑蛋
I give up. Why does Flink 1.11 have nothing but startup problems? 1.7, 1.8 and 1.9 all work fine for me, but 1.11 won't start. ./bin/flink run -m yarn-cluster -yqu root.rt_constant -ys 2 -yjm 1024 -yjm 1024 -ynm sql_test ./examples/batch/WordCount.jar --input hdfs://xxx/data/wangty/LICENSE-2.0.txt --output hdfs://xxx/data/wangty/a The error: Caused by:

Re: Key group is not in KeyGroupRange

2020-07-21 Thread Robert Metzger
Looks like this thread is already being resolved in https://issues.apache.org/jira/browse/FLINK-18637 On Tue, Jul 21, 2020 at 10:26 AM Robert Metzger wrote: > Hi Ori, > thanks a lot for your email. Which version of Flink are you using? > > On Tue, Jul 14, 2020 at 1:03 PM Ori Popowski wrote: >

Re: Re: pyflink 1.11.0 window

2020-07-21 Thread chengyanan1...@foxmail.com
Hi (message heavily garbled in archive); it discusses the Kafka table sink being an AppendStreamTableSink, the final_result2/source1 query with group by, joining final_result against MySQL, and that a group by on a streaming table requires a RetractStreamTableSink.

Re: FileNotFoundException when writing Hive orc tables

2020-07-21 Thread Rui Li
Hey Paul, Could you please share more about your job, e.g. the schema of your Hive table, whether it's partitioned, and the table properties you've set? On Tue, Jul 21, 2020 at 4:02 PM Paul Lam wrote: > Hi, > > I'm doing a POC on Hive connectors and find that when writing orc format > Hive

Re: Map type param escaping :

2020-07-21 Thread Robert Metzger
Cool! Thanks for sharing the solution! On Tue, Jul 14, 2020 at 11:39 PM Bohinski, Kevin wrote: > Figured it out, pulled StructuredOptionsSplitter into a debugger and was > able to get it working with: > > -Dkubernetes.jobmanager.annotations="\"KEY:\"\"V:A:L:U:E\"\"\"" > > > > Best > > kevin >

Re: Parsing Kafka JSON data with Flink

2020-07-21 Thread Leonard Xu
Hi, I don't think this is possible, because both of these format options are handled inside the format itself. json.ignore-parse-errors makes the format skip rows that fail to parse and continue with the next line, while json.fail-on-missing-field controls whether a row with missing fields fails the job or continues (missing fields are padded with null). The two cannot both be true; semantically they are mutually exclusive. Best Leonard Xu > On Jul 21, 2020, at 16:08, Dream-底限 wrote: > >
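Leonard's explanation of the two options can be mimicked outside Flink. The sketch below is a minimal pure-Python re-implementation of their semantics, not the Flink JSON format itself; the `decode_rows` helper and its field list are made up for illustration. It shows skipping unparsable rows versus padding missing fields with null, and why enabling both behaviors at once is contradictory.

```python
import json

def decode_rows(lines, ignore_parse_errors=False, fail_on_missing_field=True,
                fields=("id", "name")):
    """Illustrative re-implementation of the two Flink JSON format options."""
    out = []
    for line in lines:
        try:
            obj = json.loads(line)
        except json.JSONDecodeError:
            if ignore_parse_errors:
                continue  # skip the unparsable row and keep consuming
            raise
        if fail_on_missing_field and any(f not in obj for f in fields):
            raise ValueError("missing field in row: %s" % line)
        # when not failing, missing fields are padded with None (SQL NULL)
        out.append(tuple(obj.get(f) for f in fields))
    return out

rows = ['{"id": 1, "name": "a"}', 'not-json', '{"id": 2}']
print(decode_rows(rows, ignore_parse_errors=True, fail_on_missing_field=False))
# -> [(1, 'a'), (2, None)]
```

With ignore_parse_errors the bad row is silently dropped, which is exactly why the original question asks where those rows go: in this model (and in the real format) they are simply gone.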

Re: Possible bug clean up savepoint dump into filesystem and low network IO starting from savepoint

2020-07-21 Thread David Magalhães
Hi Till, I'm using the s3:// scheme, but I'm not sure whether the default resolves to s3a or s3p. then the state backend should try to directly write to the target file > system That was the behaviour I saw the second time I ran this with more slots. Does the savepoint write directly to S3 via

Re: Saving file to the ftp server

2020-07-21 Thread Robert Metzger
Hi Paweł, I believe this is a bug. I don't think many people use Flink to write to an FTP server, that's why this hasn't been addressed yet. There's probably something off with the semantics of distributed vs non-distributed file systems. I guess the easiest way to resolve this is by running your

Parsing Kafka JSON data with Flink

2020-07-21 Thread Dream-底限
hi, I'm using the SQL API to parse Kafka JSON data, and when creating the table I found the two options below. If they are enabled, are the rows that fail to parse simply dropped? Is there a way to send the failed records to external storage? json.ignore-parse-errors json.fail-on-missing-field
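The built-in JSON format either drops or fails on bad rows; it has no dead-letter output. A common workaround is to ingest the raw value as a plain string and split good and bad records yourself, sending the failures to a side output that feeds external storage. A minimal pure-Python sketch of that idea (the `parse_with_dead_letter` helper is hypothetical, not a Flink API):

```python
import json

def parse_with_dead_letter(raw_records):
    """Split raw records into parsed rows and a dead-letter list,
    a stand-in for a Flink side output feeding an external store."""
    good, dead = [], []
    for raw in raw_records:
        try:
            good.append(json.loads(raw))
        except json.JSONDecodeError:
            dead.append(raw)  # ship these to your external storage
    return good, dead

good, dead = parse_with_dead_letter(['{"a": 1}', '{broken'])
print(good, dead)  # -> [{'a': 1}] ['{broken']
```

In a real job this split would live in a ProcessFunction (or a UDF over a STRING column), with the dead list replaced by a side output sink.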

FileNotFoundException when writing Hive orc tables

2020-07-21 Thread Paul Lam
Hi, I'm doing a POC on Hive connectors and find that when writing orc format Hive tables, the job failed with FileNotFoundException right after ingesting data (full stacktrace at the bottom of the mail). The error can be steadily reproduced in my environment, which is Hadoop 2.6.5(CDH-5.6.0),

Re: flink 1.11 pyflink stream job exits prematurely

2020-07-21 Thread Xingbo Huang
Yes. execute is the API for 1.10 and earlier; execute_sql is the recommended API from 1.11 on. Best, Xingbo lgs <9925...@qq.com> wrote on Tue, Jul 21, 2020, 15:57: > Thanks, it works after adding that. > > Reverting to the original sql_update followed by st_env.execute("job") also seems to work. > > > > -- > Sent from: http://apache-flink.147419.n8.nabble.com/ >

Re: flink 1.11 pyflink stream job exits prematurely

2020-07-21 Thread lgs
Thanks, it works after adding that. Reverting to the original sql_update followed by st_env.execute("job") also seems to work. -- Sent from: http://apache-flink.147419.n8.nabble.com/

Re: flink 1.11 pyflink stream job exits prematurely

2020-07-21 Thread Xingbo Huang
Hi, execute_sql is an asynchronous, non-blocking method, so you need to add sql_result.get_job_client().get_job_execution_result().result() at the end of your code. I have created a JIRA for this [1] [1] https://issues.apache.org/jira/browse/FLINK-18598 Best, Xingbo lgs <9925...@qq.com> wrote on Tue, Jul 21, 2020, 15:35: > python flink_cep_example.py exits after a few seconds, but it should keep running. >

flink 1.11 pyflink stream job exits prematurely

2020-07-21 Thread lgs
python flink_cep_example.py exits after a few seconds, but it should keep running. The code, which uses MATCH_RECOGNIZE, is as follows: s_env = StreamExecutionEnvironment.get_execution_environment() b_s_settings = EnvironmentSettings.new_instance().use_blink_planner().in_streaming_mode().build() st_env = StreamTableEnvironment.create(s_env,

Re: flink 1.11 cdc: Kafka holds canal-json data for multiple tables; how to parse per table and sink to different downstream systems?

2020-07-21 Thread godfrey he
A similar problem was discussed in this thread: http://apache-flink.147419.n8.nabble.com/flink-1-10-sql-kafka-format-json-schema-json-object-tt4665.html Once https://issues.apache.org/jira/browse/FLINK-18002 is finished (1.12), you will be able to declare fields with an indeterminate schema, such as "data" and "mysqlType", as String and parse the nested JSON downstream with your own UDF. Best, Godfrey jindy_liu
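Godfrey's suggestion, declaring the variable parts as String and parsing them in a UDF, boils down to demultiplexing the canal-json envelope by its table field. A pure-Python sketch of that routing step, assuming the documented canal-json envelope fields (`database`, `table`, `type`, `data`); it is not Flink code, and the sink mapping is left to the reader:

```python
import json
from collections import defaultdict

def route_canal_records(raw_messages):
    """Group canal-json change events by their source table so each
    table can be forwarded to a different downstream sink."""
    by_table = defaultdict(list)
    for raw in raw_messages:
        envelope = json.loads(raw)
        # canal-json carries the origin database/table next to the row data
        key = (envelope["database"], envelope["table"])
        for row in envelope["data"]:
            by_table[key].append((envelope["type"], row))
    return dict(by_table)

msg = json.dumps({"database": "shop", "table": "orders",
                  "type": "INSERT", "data": [{"id": "1"}]})
print(route_canal_records([msg]))
```

In a real pipeline the per-table lists would instead become side outputs (or separate INSERT INTO statements), one per downstream sink.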

Re: Flink 1.11 submit job timed out

2020-07-21 Thread SmileSmile
Hi Congxian, this is a test environment, so HA is not configured. What I can see so far: the JM prints a large number of "no hostname could be resolved" messages, the JM loses contact, and the job submission fails. Setting the JM memory to 10g (jobmanager.memory.process.size: 10240m) makes no difference. After rolling back to 1.10 in the same environment, the problem does not occur and those errors are gone. Any other troubleshooting ideas? Best! SmileSmile (a511955...@163.com) On 07/16/2020 13:17, Congxian

Re: Kafka Rate Limit in FlinkConsumer ?

2020-07-21 Thread Till Rohrmann
Two quick comments: With unaligned checkpoints which are released with Flink 1.11.0, the problem of slow checkpoints under backpressure has been resolved/mitigated to a good extent. Moreover, the community wants to work on event time alignment for sources in the next release. This should prevent
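Before unaligned checkpoints, a common mitigation for a too-fast source was throttling the consumer with a token bucket (Guava's RateLimiter is built on the same idea). A small pure-Python sketch of the mechanism, not the FlinkKafkaConsumer API; the class and its parameters are illustrative:

```python
import time

class TokenBucket:
    """Token-bucket rate limiter: tokens refill at a fixed rate up to a
    capacity; a record may pass only if a token is available."""
    def __init__(self, rate_per_sec, capacity):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.tokens = capacity
        self.last = time.monotonic()

    def try_acquire(self, n=1):
        # refill proportionally to the elapsed time, capped at capacity
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= n:
            self.tokens -= n
            return True
        return False

bucket = TokenBucket(rate_per_sec=100, capacity=10)
print(bucket.try_acquire())  # first record passes immediately
```

In a Kafka deserializer or source wrapper, a failed `try_acquire` would translate into blocking (or sleeping) before emitting the next record, which is exactly the backpressure-inducing behavior the unaligned-checkpoint work makes less necessary.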

Re: (no subject)

2020-07-21 Thread 罗显宴
hi, I've found a solution: use a global window. I had assumed the stream had to be keyed before windowing, but you can use timeWindowAll directly and keep the result in state. val result = num.timeWindowAll(Time.seconds(20)) //.trigger(ContinuousEventTimeTrigger.of(Time.seconds(20))) .process(new ProcessAllWindowFunction[IncreaseNumPerHour,IncreasePerHour,TimeWindow] { private var

Re: Possible bug clean up savepoint dump into filesystem and low network IO starting from savepoint

2020-07-21 Thread Till Rohrmann
Hi David, which S3 file system implementation are you using? If I'm not mistaken, then the state backend should try to directly write to the target file system. If this should result in temporary files on your TM, then this might be a problem of the file system implementation. Having access to

Re: (no subject)

2020-07-21 Thread 罗显宴
hi, I think you're right. I misunderstood trigger earlier: I took the trigger for a 20-minute sub-window of a one-hour window. What I actually need is, for every 20 minutes, the cumulative number of distinct types. For example, if the current 20 minutes contain types A, B and C, the output is 3; if the next 20 minutes contain A, B and D, then A and B are repeats and only D is new, so the output is 4. 罗显宴 (15927482...@163.com) On Jul 21, 2020, 13:58, shizk233 wrote: Hi,
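The requirement described above, a running count of distinct types emitted once per window, can be modeled without Flink: keep a set in state that survives across windows, exactly what the ProcessAllWindowFunction with keyed state in the sibling message does. A minimal pure-Python sketch (the function name and the list-of-batches input shape are made up for illustration):

```python
def cumulative_distinct_per_window(windows):
    """For each window's batch of type labels, emit the cumulative
    count of distinct types seen in all windows so far."""
    seen = set()   # plays the role of operator state carried across windows
    counts = []
    for batch in windows:
        seen.update(batch)
        counts.append(len(seen))
    return counts

print(cumulative_distinct_per_window([["A", "B", "C"], ["A", "B", "D"]]))
# -> [3, 4]
```

This reproduces the example from the message: the first window sees {A, B, C} and outputs 3; the second adds only D, so the output grows to 4.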

Re: flink app crashed

2020-07-21 Thread Yang Wang
Could you check whether the Flink job has been submitted successfully? You could find some logs like the following in JobManager. Starting execution of job ... Also it will help a lot if you could share the full jobmanager and client log. Best, Yang Rainie Li 于2020年7月16日周四 上午4:03写道: > These

Handle idle kafka source in Flink 1.9

2020-07-21 Thread bat man
Hello, I have a pipeline which consumes data from a Kafka source. Since the topic is partitioned by device_id, some partitions stop receiving a normal flow of data whenever a group of devices goes down. I understand from the documentation here[1] that in Flink 1.11 one can declare the source idle -
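Flink 1.9 has no built-in idleness marking (source idleness support arrived in later releases), so a common workaround is a custom watermark merge that ignores partitions that have been silent longer than a timeout. A pure-Python sketch of that calculation; it is not a Flink API, and all names and the fallback policy are illustrative:

```python
def overall_watermark(partition_watermarks, last_activity, now, idle_timeout):
    """Minimum watermark over partitions, excluding partitions that have
    been idle longer than idle_timeout (a conceptual version of marking a
    source idle)."""
    active = [wm for p, wm in partition_watermarks.items()
              if now - last_activity[p] <= idle_timeout]
    # if every partition is idle, fall back to considering all of them
    if not active:
        active = list(partition_watermarks.values())
    return min(active)

wms = {"p0": 100, "p1": 5}    # p1 stopped receiving data
seen = {"p0": 50, "p1": 10}   # last time each partition produced a record
print(overall_watermark(wms, seen, now=60, idle_timeout=30))
# -> 100 (idle p1 no longer holds the watermark back)
```

Without the idleness check the result would be min(100, 5) = 5, so the dead partition would stall event-time progress for the whole job, which is the problem the question describes.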
