Hi Andres Angel,
At present, there seems to be no such built-in function; you need to
register a user-defined table function to do that. You can look at the
following document to see how to do it.
https://ci.apache.org/projects/flink/flink-docs-master/dev/table/udfs.html#table-functions
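As a rough sketch of what the eval logic of such a table function would look like (stdlib only here, without the Flink dependency; in a real UDF this class would extend org.apache.flink.table.functions.TableFunction<String> and emit each piece via collect()):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Sketch of a split table function's core logic. Class and method names
// are illustrative; a real Flink TableFunction would call collect(piece)
// instead of accumulating into a list.
public class SplitSketch {
    private final List<String> collected = new ArrayList<>();

    // In a real TableFunction this would be `public void eval(String s)`.
    public void eval(String s, String separator) {
        for (String piece : s.split(Pattern.quote(separator))) {
            collected.add(piece);
        }
    }

    public List<String> results() {
        return collected;
    }

    public static void main(String[] args) {
        SplitSketch split = new SplitSketch();
        split.eval("a,b,c", ",");
        System.out.println(split.results()); // [a, b, c]
    }
}
```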
Best,
The following JIRA is about the problem you encountered. I think you will be
very interested in its comments. There does seem to be a problem with shading
Akka, and Flink is considering isolating the classloader that contains Akka and
Scala to allow the applications and Flink to use different
Hi Fanbin,
> 2. I have parallelism = 32 and only one task has the record. Can you
please elaborate more on why this would affect the watermark advancement?
Each parallel subtask of a source function usually generates its watermarks
independently, say wk1, wk2... wkn. The downstream window
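The effect on the downstream operator can be illustrated with a stdlib sketch: its watermark is the minimum over all input channels' watermarks, so a single subtask that never sees records (and stays at Long.MIN_VALUE) holds the whole window back:

```java
// The downstream operator's watermark is the minimum over all input
// subtasks' watermarks; one idle subtask stuck at Long.MIN_VALUE blocks
// watermark advancement for everyone.
public class WatermarkMin {
    public static long downstreamWatermark(long[] subtaskWatermarks) {
        long min = Long.MAX_VALUE;
        for (long wk : subtaskWatermarks) {
            min = Math.min(min, wk);
        }
        return min;
    }

    public static void main(String[] args) {
        // 31 of 32 subtasks are idle at Long.MIN_VALUE, one has advanced:
        long[] wks = new long[32];
        java.util.Arrays.fill(wks, Long.MIN_VALUE);
        wks[0] = 1_000_000L;
        System.out.println(downstreamWatermark(wks)); // Long.MIN_VALUE
    }
}
```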
Hi, it's because the Outer Joins will generate retractions, consider the
behavior of Left Outer Join
1. left record arrives, no matched right record, so +(left, null) will be
generated.
2. right record arrives, the previous result should be retracted, so
-(left, null) and +(left, right) will be
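The two steps above can be sketched as a changelog in plain Java (a stdlib illustration of the retraction pattern, not Flink's actual internals):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the changelog a streaming LEFT OUTER JOIN produces: first
// +(left, null); when a matching right record arrives, a retraction
// -(left, null) followed by the joined result +(left, right).
public class LeftJoinRetraction {
    public static List<String> changelog(String left, String right) {
        List<String> out = new ArrayList<>();
        out.add("+(" + left + ", null)");          // left arrives, no match yet
        if (right != null) {
            out.add("-(" + left + ", null)");      // retract the earlier result
            out.add("+(" + left + ", " + right + ")"); // emit the joined row
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(changelog("l1", "r1"));
        // [+(l1, null), -(l1, null), +(l1, r1)]
    }
}
```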
Hello guys I have registered some table environments and now I'm trying to
perform a query on these using LEFT JOIN like the example below:
Table fullenrichment = tenv.sqlQuery(
"SELECT pp.a,pp.b,pp.c,pp.d,pp.a " +
" FROM t1 pp LEFT JOIN t2 ent" +
Hequn,
Thanks for the help. It is indeed a watermark problem. From Flink UI, I can
see the low watermark value for each operator, and the groupBy operator has a
lagging watermark. I checked the link from SO and confirmed that:
1. I do see record coming in for this operator
2. I have
Hi everyone,
I'm happy to announce the program of the Flink Forward EU 2019 conference.
The conference takes place in the Berlin Congress Center (bcc) from October
7th to 9th.
On the first day, we'll have four training sessions [1]:
* Apache Flink Developer Training
* Apache Flink Operations
Hi, 仲尼:
Usually, data with timestamps ahead of the current time comes from machines
whose clocks are wrong (not synchronized), so the timestamps attached to the
collected data can run ahead of the present. You have the following options:
1. Filter out such data right after Flink consumes it from Kafka (extract the
event time and compare it with the current time; if it is ahead, or ahead by
more than some threshold, drop the record. I hit this kind of data problem in
my own project too, and dropped anything more than 5 minutes ahead). That
keeps such records out of watermark generation, so an inflated watermark
cannot cause the downstream problems you described.
2. Add a check where the watermark is generated: if the event time of the
collected data is far ahead of the current time (say more than 5
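Option 1 above can be sketched as a simple predicate (threshold and names are illustrative; in Flink this would go into a filter() right after the Kafka source):

```java
// Drop records whose event time is more than a threshold (here 5 minutes)
// ahead of the processing-time clock, before they can push the watermark
// into the future.
public class FutureEventFilter {
    static final long MAX_AHEAD_MS = 5 * 60 * 1000L;

    // Returns true if the record should be kept.
    public static boolean keep(long eventTimeMs, long nowMs) {
        return eventTimeMs - nowMs <= MAX_AHEAD_MS;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        System.out.println(keep(now + 60_000L, now));  // true: 1 min ahead
        System.out.println(keep(now + 600_000L, now)); // false: 10 min ahead
    }
}
```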
Also wanted to check if anyone has ventured into this exercise of shading
Akka in Flink ..
Is this something that qualifies as one of the roadmap items in Flink ?
regards.
On Wed, Jul 24, 2019 at 3:44 PM Debasish Ghosh
wrote:
> Hi Haibo - Thanks for the clarification ..
>
> regards.
>
> On
I was on vacation but wanted to thank Biao for summarizing the current
state! Thanks!
On Mon, Jul 15, 2019 at 2:00 AM Biao Liu wrote:
> Hi Aaron,
>
> From my understanding, you want to shut down a Task Manager without
> restarting the job that has tasks running on this Task Manager?
>
> Based on
Hello everyone,
Following the current available functions
https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/table/functions.html,
how could I split a column string by a character?
example
column content : col =a,b,c
query: Select col from tenv
expected return : cola , colb, colc
Hi.
Do we have an idea for this exception?
Thanks,
Yitzchak.
On Tue, Jul 23, 2019 at 12:59 PM Fabian Hueske wrote:
> Hi Yitzchak,
>
> Thanks for reaching out.
> I'm not an expert on the Kafka consumer, but I think the number of
> partitions and the number of source tasks might be interesting
Note that in order for the static class approach to work you have to
ensure that the class is loaded by the parent classloader, either by
placing the class in /lib or configuring
`classloader.parent-first-patterns.additional` to pick up this
particular class.
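For reference, the configuration variant might look like this in flink-conf.yaml (the package pattern is a placeholder for wherever your own class lives; key name as in recent Flink docs):

```yaml
# flink-conf.yaml: resolve classes under this package via the parent
# classloader so the static field is shared (package is a placeholder).
classloader.parent-first-patterns.additional: com.example.myudf.
```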
On 24/07/2019 10:24, Haibo Sun
Sure guys thanks for the support.
I need to create and register a table based on the content of a DS<>; the
point is that I need to parse the content somehow and extract the values and
the headers. I already tried to create a DS and register the new DS as a
table with headers
Note that this will only work when running the application in the
IDE; specifically, it will not work when running on an actual cluster,
since your function isn't executed on the same machine as your
(presumably) main() function.
We can give you better advice if you tell us what exactly
Hi Andres,
Just define a variable outside and modify it in the anonymous class.
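A minimal stdlib sketch of that pattern (names are illustrative; note the caveat raised elsewhere in this thread: on a real cluster the function runs in a different JVM, so the driver will not see the update — this only works for local runs):

```java
import java.util.concurrent.atomic.AtomicInteger;

// Modify a variable declared outside an anonymous class via an
// effectively-final holder object. The update is only visible when the
// function executes in the same JVM (e.g. an IDE run), not on a cluster.
public class OuterVariableSketch {
    public static int countElements(String[] elements) {
        AtomicInteger seen = new AtomicInteger();  // outer variable (holder)
        Runnable perElement = new Runnable() {     // stand-in for a flatMap fn
            @Override
            public void run() {
                seen.incrementAndGet();            // mutate the outer variable
            }
        };
        for (String ignored : elements) {
            perElement.run();
        }
        return seen.get();
    }

    public static void main(String[] args) {
        System.out.println(countElements(new String[]{"a", "b", "c"})); // 3
    }
}
```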
On Wed, Jul 24, 2019 at 8:44 PM, Andres Angel wrote:
> Hello everyone,
>
> I was wondering if there is a way to read, outside of the DataStream
> method, the content of a variable built within a map/flatmap function.
>
> example:
>
>
Hello everyone,
I was wondering if there is a way to read, outside of the DataStream method,
the content of a variable built within a map/flatmap function.
example:
DataStream dsString = env.fromElements("1,a,1.1|2,b,2.2,-2",
"3,c|4,d,4.4");
DataStream dsTuple = dsString.flatMap(new
FlatMapFunction()
Hi Caizhi,
thank you for your response, the full exception is the following:
Exception in thread "main" org.apache.flink.table.api.TableException: Arity
[7] of result [ArrayBuffer(String, String, String, String, String, String,
Timestamp)] does not match the number[1] of requested type
The problem is solved, thanks a lot!
Resolution steps:
1. I did find the "Could not load CLI class
org.apache.flink.yarn.cli.FlinkYarnSessionCli" exception under log/.
2. Set export HADOOP_CONF_DIR=`hadoop classpath`
3. Re-ran bin/flink run --help, and the `Options for yarn-cluster mode`
section appeared.
Thanks a lot! ❤❤❤
On Wed, Jul 24, 2019 at 9:51 AM, Zili Chen wrote:
> Hi, you can check log/
Hi Federico,
I can't reproduce the error in my local environment. Would you mind sharing
your code and the full exception stack trace with us? This will help us
diagnose the problem. Thanks.
On Wed, Jul 24, 2019 at 5:45 PM, Federico D'Ambrosio wrote:
> Hi Caizhi,
>
> thank you for your response.
>
> 1) I see, I'll
Hi Haibo - Thanks for the clarification ..
regards.
On Wed, Jul 24, 2019 at 2:58 PM Haibo Sun wrote:
> Hi Debasish Ghosh,
>
> I agree that Flink should shade its Akka.
>
> Maybe you misunderstood me. I mean, in the absence of official shading
> Akka in Flink, the relatively conservative way
Hi Caizhi,
thank you for your response.
1) I see, I'll use a compatible string format
2) I'm defining the case class like this:
case class cEvent(state: String, id: String, device: String,
                  instance: String, subInstance: String,
                  groupLabel: String, time: Timestamp)
object
Hi Debasish Ghosh,
I agree that Flink should shade its Akka.
Maybe you misunderstood me. I mean, in the absence of official shading of
Akka in Flink, the relatively conservative way is to shade the Akka of your
application (I'm concerned Flink might not work well after shading its Akka).
Best,
Haibo
Hi,
The memory of a Flink TaskManager k8s pod includes the following parts:
- jvm heap, limited by -Xmx
- jvm non-heap, limited by -XX:MaxMetaspaceSize
- jvm direct memory, limited by -XX:MaxDirectMemorySize
- native memory, used by rocksdb, just as Yun Tang said, could be
limited
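As a rough illustration (all values below are placeholders, not recommendations), the first three limits map onto JVM flags like:

```shell
-Xmx4g                      # jvm heap
-XX:MaxMetaspaceSize=256m   # jvm non-heap (metaspace)
-XX:MaxDirectMemorySize=1g  # jvm direct memory
# native memory (e.g. RocksDB) is outside these flags and must be
# budgeted into the pod's memory limit separately
```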
For our application users are expected to work with Akka APIs - hence if I
shade Akka in my application users will need to work with shaded imports
which feels unnatural. With Flink, Akka is an implementation detail and
Flink users are not expected to use Akka APIs. Hence shading will not have
any
Hi Federico,
1) As far as I know, you can't set a format for timestamp parsing currently
(see `SqlTimestampParser`, it just feeds your string to
`SqlTimestamp.valueOf`, so your timestamp format must be compatible with
SqlTimestamp).
2) How do you define your case class? You have to define its
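Concretely, for point 1: `java.sql.Timestamp.valueOf` only accepts the JDBC escape format `yyyy-[m]m-[d]d hh:mm:ss[.f...]`, so an ISO-style timestamp with a `T` separator (as some Spark outputs use) is rejected:

```java
import java.sql.Timestamp;

// Demonstrates which string formats java.sql.Timestamp.valueOf accepts.
public class TimestampFormat {
    public static void main(String[] args) {
        // Accepted: JDBC escape format "yyyy-mm-dd hh:mm:ss[.f...]"
        Timestamp ok = Timestamp.valueOf("2019-07-24 10:15:30.123");
        System.out.println(ok.getNanos()); // 123000000

        // Rejected: ISO-8601 'T' separator throws IllegalArgumentException
        try {
            Timestamp.valueOf("2019-07-24T10:15:30.123");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected");
        }
    }
}
```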
I think it is better to shade all the dependencies of flink so that all the
projects that use flink won't hit this kind of issue.
On Wed, Jul 24, 2019 at 4:07 PM, Haibo Sun wrote:
> Hi, Debasish Ghosh
>
> I don't know why not shade Akka, maybe it can be shaded. Chesnay may be
> able to answer that.
> I
Hi Stephen,
I don't think it's possible to use the same connection pool for the entire
topology, because the nodes on the topology may run in different JVMs and on
different machines.
If you want all operators running in the same JVM to use the same connection
pool, I think you can
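One common way to get a single pool per JVM (shared by all operators running in that JVM) is a lazily-initialized static holder that each operator's open() grabs; `Pool` below is a hypothetical stand-in for a real connection pool class (e.g. HikariCP):

```java
// One shared instance per JVM: every operator task running in this JVM
// sees the same pool. `Pool` is a hypothetical placeholder for a real
// connection pool implementation.
public class SharedPoolHolder {
    static class Pool {}

    private static volatile Pool instance;

    // Called from each operator's open(); double-checked locking ensures
    // exactly one pool is created per JVM.
    public static Pool getOrCreate() {
        if (instance == null) {
            synchronized (SharedPoolHolder.class) {
                if (instance == null) {
                    instance = new Pool();
                }
            }
        }
        return instance;
    }

    public static void main(String[] args) {
        // Two "operators" in the same JVM share one pool instance.
        System.out.println(getOrCreate() == getOrCreate()); // true
    }
}
```

Cleanup on shutdown (the transient-state concern raised later in this thread) would go into each operator's close(), e.g. via a reference count on the shared instance.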
Hello everyone,
I've always used the DataStream API and now I'm trying out the Table API to
create a datastream from a CSV and I'm finding a couple of issues:
1) I'm reading a csv with 7 total fields, the 7th of which is a date
serialized as a Spark TimestampType, written on the csv like this:
Hi, Debasish Ghosh
I don't know why Akka is not shaded; maybe it can be. Chesnay may be able to
answer that.
I recommend shading the Akka dependency of your application, because it's
unknown what would go wrong with shading Flink's Akka.
CC @Chesnay Schepler
Best,
Haibo
At 2019-07-24
The problem that I am facing is with Akka serialization .. Why not shade
the whole of Akka ?
java.lang.AbstractMethodError:
> akka.remote.RemoteActorRefProvider.serializationInformation()Lakka/serialization/Serialization$Information;
> at
>
Dear Flink community,
Hello!
I am using Flink SQL (Flink 1.8.0) for some aggregations, consuming Kafka
data with EventTime. Occasionally a record arrives whose rowtime field
carries a future timestamp (possibly caused by the time zone of the
submitted data), so the Watermark is pushed straight to that future point,
and all data arriving between this bad record and that future time is
discarded.
The root cause is indeed a business-side problem, but we would still like
some way to handle this anomaly.
Currently, our approach is:
I can see that we relocate Akka's Netty and Uncommons Math, but I am
curious why Flink doesn't shade all of Akka's dependencies...
Best,
tison.
On Wed, Jul 24, 2019 at 3:15 PM, Debasish Ghosh wrote:
> Hello Haibo -
>
> Yes, my application depends on Akka 2.5.
> Just curious, why do you think it's
Oh and I'd also need some way to clean up the per-node transient state if
the topology stops running on a specific node.
On Wed, 24 Jul 2019 at 08:18, Stephen Connolly <
stephen.alan.conno...@gmail.com> wrote:
> Hi,
>
> So we have a number of nodes in our topology that need to do things like
>
Hi,
So we have a number of nodes in our topology that need to do things like
checking a database, e.g.
* We need a filter step to drop events on the floor from systems we are no
longer interested in
* We need a step that outputs on a side-channel if the event is for an
object where the parent is
Hello Haibo -
Yes, my application depends on Akka 2.5.
Just curious, why do you think it's recommended to shade the Akka version of
my application instead of Flink's?
regards.
On Wed, Jul 24, 2019 at 12:42 PM Haibo Sun wrote:
> Hi Debasish Ghosh,
>
> Does your application have to depend on Akka
Hi Debasish Ghosh,
Does your application have to depend on Akka 2.5? If not, it's a good idea to
keep the Akka version that the application depends on in line with Flink's.
If you want to try shading the Akka dependency, I think it is more recommended
to shade the Akka dependency of your
Hi William
Have you set the memory limit of your taskmanager pod when launching it in
k8s? If not, I'm afraid your pod might run into a node out-of-memory [1]. You
could increase the limit by analyzing your memory usage.
When talking about the memory usage of RocksDB, a rough calculation
Hello -
An application that uses Akka 2.5 and Flink 1.8.0 gives runtime errors
because of version mismatch between Akka that we use and the one that Flink
uses (which is Akka 2.4). Anyone tried shading Akka dependency with Flink ?
Or is there any other alternative way to handle this issue ? I