Is Flink 1.11's Streaming File Sink exactly-once when writing to HDFS?

2021-02-22 Thread op
Does Flink 1.11's Streaming File Sink guarantee exactly-once when writing to HDFS?

????

2021-02-22 Thread ??????

Re: Re: If I keyBy() twice and then call window(), how many windows are there?

2021-02-22 Thread yidan zhao
And if keyBy is chained, e.g. .keyBy(xx).keyBy(yy).window(), then no matter how many keyBy calls there are, only the last one takes effect; of course there is still only one window, and multiple windows will not appear. yidan zhao wrote on Tue, Feb 23, 2021 at 3:31 PM: > I suddenly feel this is still a communication problem. There is only one window, because you only wrote one window. > What I meant is that flatMap and window are separate operators, not one operator. > > hdxg1101300123 wrote on Mon, Feb 22, 2021 at 11:37 PM: >> >> Why does flatMap make it 2? >> >> >> Sent from a vivo smartphone

Re: Re: If I keyBy() twice and then call window(), how many windows are there?

2021-02-22 Thread yidan zhao
I suddenly feel this is still a communication problem. There is only one window, because you only wrote one window. What I meant is that flatMap and window are separate operators, not one operator. hdxg1101300123 wrote on Mon, Feb 22, 2021 at 11:37 PM: > > Why does flatMap make it 2? > > > Sent from a vivo smartphone > > No wait, I read your description but not your code. Written the way your code is, it is 2, because you did a flatMap after the keyBy, so the next keyBy starts another one. > > > > yidan zhao wrote on Mon, Feb 22, 2021 at 10:31 AM: > > > > > Only the last keyBy takes effect. > > > > >
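A minimal stdlib Python sketch of the point above (this is a simulation, not Flink's actual murmur-hash key-group routing): each keyBy fully re-partitions the stream, so chaining .keyBy(xx).keyBy(yy) leaves only the last key in effect.

```python
def key_by(records, key_fn, parallelism=4):
    """Re-partition records: each record goes to hash(key) % parallelism."""
    partitions = [[] for _ in range(parallelism)]
    for rec in records:
        partitions[hash(key_fn(rec)) % parallelism].append(rec)
    return partitions

records = [{"xx": "a", "yy": 1}, {"xx": "b", "yy": 1}, {"xx": "a", "yy": 2}]

first = key_by(records, lambda r: r["xx"])       # keyBy(xx)
second = key_by([r for p in first for r in p],   # keyBy(yy) re-partitions
                lambda r: r["yy"])

direct = key_by(records, lambda r: r["yy"])      # keyBy(yy) alone

# Each partition holds the same set of records either way: the earlier
# keyBy(xx) has no lasting effect on which records end up together.
assert [sorted(map(str, p)) for p in second] == \
       [sorted(map(str, p)) for p in direct]
```

Since the window operator sits on the stream produced by the final keyBy, only that key determines the window panes, and only one window operator exists.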

Re: Question

2021-02-22 Thread Matthias Pohl
Yes, Flink jobs are deployed using `./bin/flink run`. It will use the configuration in conf/flink-conf.yaml to connect to the Flink cluster. It looks like you don't have the right dependencies loaded onto your classpath. Have you had a look at the documentation about project configuration [1]?

Reading and writing Hive with plain DDL

2021-02-22 Thread silence
Does the community have any plans to support reading and writing Hive tables with plain DDL (without the Hive catalog)? Is there a particular reason it is not supported today? -- Sent from: http://apache-flink.147419.n8.nabble.com/

BroadcastState dropped when data deleted in Kafka

2021-02-22 Thread bat man
Hi, I have 2 streams one event data and the other rules. I broadcast the rules stream and then key the data stream on event type. The connected stream is processed thereafter. We faced an issue where the rules data in the topic got deleted because of Kafka retention policy. Post this the existing

Does assignTimestampsAndWatermarks() support setting the eventTime to a future time?

2021-02-22 Thread Hongyuan Ma
I want to predict the next 5 seconds of trajectory points for a given point and set the eventTime to that future time. I used AscendingTimestampExtractor but got WARN Timestamp monotony violated xxx < yyy // For each trajectory point, predict and output the points for the following 10 seconds; e.g. car A sends a point at second 10 and car B at second 15 stream.flatmap() // predicts car A's trajectory for seconds 11-20 and car B's for seconds 16-25 .assignTimestamps(new AscendingTimestampExtractor()) //
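A stdlib Python sketch (not the Flink API; the car names and timestamps are the hypothetical ones from the message) of why the warning appears: the predicted future timestamps interleave across vehicles, so the stream is no longer monotonically ascending. A bounded-out-of-orderness watermark strategy tolerates this where AscendingTimestampExtractor does not.

```python
def flat_map_predict(point, horizon=10):
    """For an input (car, t), emit predicted points at t+1 .. t+horizon."""
    car, t = point
    return [(car, t + i) for i in range(1, horizon + 1)]

inputs = [("A", 10), ("B", 15)]  # car A arrives at t=10, car B at t=15
predicted = [p for point in inputs for p in flat_map_predict(point)]

# Car A's predictions run up to t=20, but car B's start at t=16, so the
# extracted timestamps go ... 19, 20, 16 ... : monotony is violated.
violations = [(prev, cur) for prev, cur in zip(predicted, predicted[1:])
              if cur[1] < prev[1]]
assert violations == [(("A", 20), ("B", 16))]
```

Here the maximum out-of-orderness is bounded by the prediction horizon, which suggests a bounded-out-of-orderness watermark with a delay of at least the horizon instead of an ascending-timestamps assumption.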

Re: Configure operator based on key

2021-02-22 Thread Abhinav Sharma
Hi Yidan, Thank you for your reply. I was wondering if there is some way that the process function can know which condition fired the trigger. E.g.: if I set the trigger to fire when the object associated with a key has value 2, 8, or 10 (3 conditions for the trigger to fire), then in the process function, I

Flink custom trigger use case

2021-02-22 Thread Diwakar Jha
Hello, I'm trying to use a custom trigger for one of my use cases. I have some basic logic (as shown below) that uses keyBy on the input stream and a window of 1 min. .keyBy() .window(TumblingEventTimeWindows.of(Time.seconds(60))) .trigger(new CustomTrigger())
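As a sketch of the semantics only (plain Python, not the Flink Trigger API), this is what a count-based custom trigger over per-key 60-second tumbling event-time panes amounts to; the fire-at-3 condition is an assumption for illustration.

```python
from collections import defaultdict

WINDOW_MS = 60_000

def window_start(ts_ms):
    # Tumbling windows align to the epoch: start = ts - (ts % size).
    return ts_ms - (ts_ms % WINDOW_MS)

def run(events, fire_at_count=3):
    """events: (key, ts_ms, value) tuples. The toy trigger fires a pane as
    soon as it holds fire_at_count elements (like onElement returning FIRE)."""
    panes = defaultdict(list)            # (key, window_start) -> values
    fired = []
    for key, ts, value in events:
        pane = (key, window_start(ts))
        panes[pane].append(value)
        if len(panes[pane]) == fire_at_count:   # the custom condition
            fired.append((pane, list(panes[pane])))
    return fired

events = [("k", 1_000, 1), ("k", 2_000, 2), ("k", 59_000, 3), ("k", 61_000, 4)]
fired = run(events)
# The first three events share the window [0, 60_000); the trigger fires on
# the third element. The fourth event opens a new pane that has not fired.
assert fired == [(("k", 0), [1, 2, 3])]
```

In real Flink code the equivalent decision lives in Trigger.onElement/onEventTime, which return a TriggerResult such as FIRE or CONTINUE.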

Re: Re: Question about writing to Hive with Flink SQL

2021-02-22 Thread Rui Li
Yes, a Hive table must live in a HiveCatalog to be read and written correctly. On Tue, Feb 23, 2021 at 10:14 AM yinghua...@163.com wrote: > > The Flink version is 1.11.3. All of our tables currently use the GenericInMemoryCatalog catalog type; do Hive tables require a HiveCatalog? > > > > yinghua...@163.com > > From: Rui Li > Sent: 2021-02-23 10:05 > To: user-zh > Subject: Re: Re: Question about writing to Hive with Flink SQL

Re: Re: Question about writing to Hive with Flink SQL

2021-02-22 Thread yinghua...@163.com
The Flink version is 1.11.3. All of our tables currently use the GenericInMemoryCatalog catalog type; do Hive tables require a HiveCatalog? yinghua...@163.com From: Rui Li Sent: 2021-02-23 10:05 To: user-zh Subject: Re: Re: Question about writing to Hive with Flink SQL Hi, which Flink version are you using? Also, was this Hive table created in a HiveCatalog? On Tue, Feb 23, 2021 at 9:01 AM 邮件帮助中心 wrote: >

Re: Re: Question about writing to Hive with Flink SQL

2021-02-22 Thread Rui Li
Hi, which Flink version are you using? Also, was this Hive table created in a HiveCatalog? On Tue, Feb 23, 2021 at 9:01 AM 邮件帮助中心 wrote: > After adding debug logging I found that when the DDL creating the Hive table was executed, the dialect was set to hive. Based on the stack trace, the current error is raised while executing the insert > into DML, when creating the Hive table, complaining that no connector is configured > Table options are: 'is_generic'='false' > 'partition.time-extractor.timestamp-pattern'='$dt $hr' >

Re: Re:Re: Question about writing to Hive with Flink SQL

2021-02-22 Thread eriendeng
Creating a Kafka source table under a Hive catalog creates a metadata-only table in the Hive metastore. It is not queryable from Hive, but a Flink job can recognize it and read it as a Hive table; after that you just read and write normally under the hive dialect. See https://my.oschina.net/u/2828172/blog/4415970

Re:Re: Question about writing to Hive with Flink SQL

2021-02-22 Thread 邮件帮助中心
After adding debug logging I found that when the DDL creating the Hive table was executed, the dialect was set to hive. Based on the stack trace, the current error is raised while executing the insert into DML, when creating the Hive table, complaining that no connector is configured. Table options are: 'is_generic'='false' 'partition.time-extractor.timestamp-pattern'='$dt $hr' 'sink.partition-commit.delay'='0S' 'sink.partition-commit.policy.kind'='metastore,success-file'

Re: Community chat?

2021-02-22 Thread Yuval Itzchakov
A dedicated Slack would be awesome. On Mon, Feb 22, 2021, 22:57 Sebastián Magrí wrote: > Is there any chat from the community? > > I saw the freenode channel but it's pretty dead. > > A lot of the time a more chat alike venue where to discuss stuff > synchronously or just share ideas turns out

Community chat?

2021-02-22 Thread Sebastián Magrí
Is there any chat from the community? I saw the freenode channel but it's pretty dead. A lot of the time, a more chat-like venue in which to discuss things synchronously or just share ideas turns out very useful and stimulates the community. -- Sebastián Ramírez Magrí

Julia API/Interface for Flink

2021-02-22 Thread Beni Bilme
Hello, Is there a Julia API or interface for using Flink? Thanks in advance for any response. Beni

Reply: Re: If I keyBy() twice and then call window(), how many windows are there?

2021-02-22 Thread hdxg1101300123
Why does flatMap make it 2? Sent from a vivo smartphone > No wait, I read your description but not your code. Written the way your code is, it is 2, because you did a flatMap after the keyBy, so the next keyBy starts another one. > > yidan zhao wrote on Mon, Feb 22, 2021 at 10:31 AM: > > > Only the last keyBy takes effect. > > > > Hongyuan Ma wrote on Sun, Feb 21, 2021 at 10:59 PM: > > > >> If I keyBy twice and then call window(), are n windows generated based only on the last keyBy key, > >> or m*n windows on top of the previous keyBy?

WatermarkStrategy.for_bounded_out_of_orderness in table API

2021-02-22 Thread joris.vanagtmaal
Can I set the watermark strategy to bounded out-of-orderness when using the Table API and SQL DDL to assign watermarks? -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Install/Run Streaming Anomaly Detection R package in Flink

2021-02-22 Thread Robert Cullen
My customer wants us to install this package in our Flink Cluster: https://github.com/twitter/AnomalyDetection One of our engineers developed a python version: https://pypi.org/project/streaming-anomaly-detection/ Is there a way to install this in our cluster? -- Robert Cullen 240-475-4490

Re: Question

2021-02-22 Thread Matthias Pohl
Hi, running your job from within your IDE with no specific configuration provided (like the Flink job examples provided by the Flink project [1]) means that you spin up a local Flink cluster (see MiniCluster [2]). This does not have the web UI enabled by default. You could enable it by calling

trying to understand watermark effect on rolling average

2021-02-22 Thread joris.vanagtmaal
I'm trying to calculate a simple rolling average using pyflink, but somehow the last rows streaming in seem to be excluded, which I expected to be the result of data arriving out of order. However, I fail to understand why. exec_env = StreamExecutionEnvironment.get_execution_environment()
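One common cause, sketched in plain Python rather than pyflink: with a bounded-out-of-orderness strategy the watermark trails the maximum timestamp seen by the configured delay, so the window containing the last rows never closes unless later data (or an idleness timeout) pushes the watermark past its end. The 5-second delay and 10-second windows below are assumptions for illustration.

```python
DELAY_MS = 5_000      # bounded out-of-orderness delay (assumed)
WINDOW_MS = 10_000    # tumbling window size (assumed)

def windows_fired(rows):
    """rows: event-time timestamps (ms) in arrival order. Returns the ends
    of the tumbling windows that fire, i.e. whose end <= final watermark."""
    watermark = max(rows) - DELAY_MS          # watermark trails the max ts
    ends = sorted({ts - ts % WINDOW_MS + WINDOW_MS for ts in rows})
    return [end for end in ends if end <= watermark]

rows = [1_000, 4_000, 12_000, 15_000, 23_000]
# Final watermark = 23_000 - 5_000 = 18_000. Only the window ending at
# 10_000 fires; the windows holding the later rows never close, so those
# rows look "excluded" until more data advances the watermark.
assert windows_fired(rows) == [10_000]
```

In a finite test stream this shows up as the trailing rows silently dropping out of the aggregate, exactly as described above.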

Re: [Statefun] Dynamic behavior

2021-02-22 Thread Igal Shilman
Hi Miguel, I think that there are a couple of ways to achieve this, and it really depends on your specific use case, and the trade-offs that you are willing to accept. For example, one way to approach this: - Suppose you have an external service somewhere that returns a representation of the

RE: Sharding of Operators

2021-02-22 Thread Tripathi,Vikash
Thanks Chesnay, that answers my question. In my case NextOp is operating on keyed streams and now it makes sense to me that along with key re-distribution, the state will also be re-distributed so effectively the ‘NextOp4’ instance can process all the tuples together for key ‘A’, those that

Re: State Access Beyond RichCoFlatMapFunction

2021-02-22 Thread Kezhu Wang
Flink IT tests covers queryable state with mini cluster. All tests: https://github.com/apache/flink/tree/5c772f0a9cb5f8ac8e3f850a0278a75c5fa059d5/flink-queryable-state/flink-queryable-state-runtime/src/test/java/org/apache/flink/queryablestate/itcases Setup/Configs:

Re: Sharding of Operators

2021-02-22 Thread Chesnay Schepler
Let's clarify what NextOp is. Is it operating on a keyed stream, or is it something like a map function that has some internal state based on the keys of the input elements (e.g., it has something like a Map that it queries/modifies for each input element)? If NextOp operates on a keyed stream

Re: How can the logs of individual jobs be separated in Flink standalone mode?

2021-02-22 Thread Yang Wang
Flink's standalone application mode [1] can keep a separate log for each app. [1]. https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/resource-providers/standalone/kubernetes/#deploy-application-cluster Best, Yang xingoo <23603...@qq.com> wrote on Mon, Feb 22, 2021 at 12:01 PM: > Hi, > >

RE: Sharding of Operators

2021-02-22 Thread Tripathi,Vikash
Just needed more clarity in terms of a processing scenario. Say, I was having records of key 'A' on a parallel instance 'Op1' of operator 'Op', and the next operator 'NextOp' in the sequence of transformation was getting records of key 'A' on its parallel instance 'NextOp2' at the time when
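Flink routes keys via key groups (see KeyGroupRangeAssignment): a key's group is hash-based and fixed by the max parallelism, and each operator instance owns a contiguous range of groups. The sketch below uses Python's hash() in place of Flink's murmur hash, so it illustrates the routing shape rather than exact values; the point is that all records for key 'A' always land on exactly one instance at any given parallelism, and the state for A's key group is redistributed to that same instance.

```python
MAX_PARALLELISM = 128   # a common default max parallelism (assumed)

def key_group(key):
    # Flink computes murmurHash(key.hashCode()) % maxParallelism;
    # Python's hash() stands in for illustration.
    return hash(key) % MAX_PARALLELISM

def operator_index(key, parallelism):
    # Each parallel instance owns a contiguous range of key groups.
    return key_group(key) * parallelism // MAX_PARALLELISM

# Records for key "A" always target exactly one instance, at parallelism 4
# and again (possibly a different instance) after rescaling to 6. Because
# state travels with the key group, the new instance sees all of A's state.
for p in (4, 6):
    idx = operator_index("A", p)
    assert 0 <= idx < p
    assert all(operator_index("A", p) == idx for _ in range(10))
```

This is why, after a rescale, the 'NextOp4' instance in the scenario above can process all tuples for key 'A' together: key 'A' maps to one key group, and that group (records plus state) belongs to exactly one instance.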

Re: Joining and windowing multiple streams using DataStream API or Table API & SQL

2021-02-22 Thread Pieter Bonte
Hi Till, Thanks for the feedback. My use case is a little bit more tricky as I can't key all the streams by the same field. Basically I'm trying to solve continuous SPARQL queries, which consist of many joins. I've seen that SPARQL queries over RDF data have been discussed before on the

Performance tuning for writing to ClickHouse with Flink SQL

2021-02-22 Thread kandy.wang
Writing to ClickHouse with flink-jdbc on Flink 1.12 only reaches a TPS of around 350,000. Does anyone have suggestions for performance tuning?

Re: Question about writing to Hive with Flink SQL

2021-02-22 Thread eriendeng
You did not set the dialect to hive, so execution went into the else branch. The default dialect requires a connector to be specified; see the Kafka-to-Hive code in the documentation: https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/table/connectors/hive/hive_read_write.html#writing

Re: Compile time checking of SQL

2021-02-22 Thread Sebastián Magrí
Thanks a lot Timo! On Mon, 22 Feb 2021 at 08:19, Timo Walther wrote: > Yes, this is possible. You can run `tableEnv.sqlQuery("...").explain()` > or `tableEnv.toRetractStream(table)` which would trigger the complete > translation of the SQL query without executing it. > > Regards, > Timo > > On

Re: Compile time checking of SQL

2021-02-22 Thread Timo Walther
Yes, this is possible. You can run `tableEnv.sqlQuery("...").explain()` or `tableEnv.toRetractStream(table)` which would trigger the complete translation of the SQL query without executing it. Regards, Timo On 20.02.21 18:46, Sebastián Magrí wrote: I mean the SQL queries being validated when