Is there any way to directly convert org.apache.flink.table.api.Table into my POJO?

2020-12-12 Thread Luo Jason
Hello, I'm new to Flink. Thank you for your help. My application scenario is to process logs through a Flink program and finally store them in HBase. Through Kafka, my Flink application receives log information from other systems. This information cannot be immediately sent to
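For reference, the Flink 1.12 Table API can convert a Table into a typed DataStream of POJOs. A minimal sketch, assuming a hypothetical POJO class MyLogEvent whose public fields match the table's column names and types, and an existing StreamTableEnvironment tEnv (not runnable without a Flink runtime on the classpath):

```java
// Sketch: converting a Table to a DataStream of POJOs (Flink 1.12 API).
// MyLogEvent is a hypothetical POJO; its public fields must match the
// table's column names and types for the conversion to succeed.
Table logTable = tEnv.sqlQuery("SELECT ts, level, message FROM logs");

// For append-only tables:
DataStream<MyLogEvent> events = tEnv.toAppendStream(logTable, MyLogEvent.class);

// For tables that produce updates/retractions, each record carries a
// Boolean flag (true = add, false = retract):
DataStream<Tuple2<Boolean, MyLogEvent>> changes =
    tEnv.toRetractStream(logTable, MyLogEvent.class);
```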

Tumble Window produces no output for intervals that contain no data

2020-12-12 Thread guoliubi...@foxmail.com
On Flink 1.12.0, my Java code does the following: txnStream .window(TumblingEventTimeWindows.of(Time.seconds(3L))) .process(new NetValueAggregateFunction()) If a given 3-second interval contains no data, the process function is never invoked. Is there any configuration that makes the process function run every 3 seconds regardless? guoliubi...@foxmail.com
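For context: Flink creates a window only when an element falls into it, so an empty interval never fires, and there is no configuration option to change this. A common workaround is a KeyedProcessFunction that keeps re-registering event-time timers; a minimal sketch, assuming hypothetical Txn and NetValue types (not runnable without a Flink runtime):

```java
// Sketch: emit a result every 3 seconds even when no elements arrive,
// using timers instead of a window. Txn and NetValue are illustrative types.
public class EveryThreeSeconds extends KeyedProcessFunction<String, Txn, NetValue> {
    private static final long INTERVAL = 3000L;

    @Override
    public void processElement(Txn value, Context ctx, Collector<NetValue> out) {
        // Register a timer at the end of the current 3-second interval.
        long nextFire = (ctx.timestamp() / INTERVAL + 1) * INTERVAL;
        ctx.timerService().registerEventTimeTimer(nextFire);
        // ... accumulate `value` into keyed state here ...
    }

    @Override
    public void onTimer(long timestamp, OnTimerContext ctx, Collector<NetValue> out) {
        // ... emit the (possibly empty) aggregate from keyed state here ...
        // Re-register so the next interval fires even if no element arrives.
        ctx.timerService().registerEventTimeTimer(timestamp + INTERVAL);
    }
}
```

Note that with pure event time, timers only fire as the watermark advances, so an idle source still stalls output unless a watermark idleness strategy is configured.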

Sinking data processed by a Pandas UDF

2020-12-12 Thread Lucas
Using Flink 1.12.0 with Python 3.7. I defined a pandas UDF roughly as follows: @udf(input_types=[DataTypes.STRING(), DataTypes.FLOAT()], result_type=DataTypes.ROW([DataTypes.FIELD('buyQtl', DataTypes.BIGINT()), DataTypes.FIELD('aveBuy', DataTypes.INT())]), func_type='pandas') def orderCalc(code,

How can Flink SQL access tables on Alibaba Cloud MaxCompute?

2020-12-12 Thread 陈帅
How can Flink SQL access tables on Alibaba Cloud MaxCompute? Or the Kafka topics with schemas registered in the Confluent Schema Registry? Do I need to define my own Catalog? Are there any tutorials or resources for this? Thanks!
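For the Schema Registry half of the question: Flink 1.12 ships an avro-confluent format for the Kafka SQL connector, so no custom catalog is needed for that case. A hedged sketch of the DDL, where the topic name, bootstrap servers, registry URL, and columns are all placeholders:

```sql
-- Sketch: reading a Schema-Registry-backed Kafka topic in Flink 1.12.
-- All names and addresses below are placeholders.
CREATE TABLE orders (
  order_id STRING,
  amount DOUBLE
) WITH (
  'connector' = 'kafka',
  'topic' = 'orders',
  'properties.bootstrap.servers' = 'localhost:9092',
  'format' = 'avro-confluent',
  'avro-confluent.schema-registry.url' = 'http://localhost:8081'
);
```

MaxCompute, by contrast, has no built-in Flink catalog or connector in the open-source distribution, so accessing it would indeed require a custom Catalog or connector implementation.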

Re: Disk usage during savepoints

2020-12-12 Thread Rex Fenley
Our job just crashed while running a savepoint, it ran out of disk space. I inspected the disk and found the following: -rw--- 1 yarn yarn 10139680768 Dec 12 22:14 presto-s3-10125099138119182412.tmp -rw--- 1 yarn yarn 10071916544 Dec 12 22:14 presto-s3-10363672991943897408.tmp

Re: How to debug a Flink Exception that does not have a stack trace?

2020-12-12 Thread Dan Hill
Thanks! That makes sense. On Sat, Dec 12, 2020 at 11:13 AM Steven Wu wrote: > This is a performance optimization in JVM when the same exception is > thrown too frequently. You can set `-XX:-OmitStackTraceInFastThrow` to > disable the feature. You can typically find the full stack trace in the

Re: Optimizing for super long checkpoint times

2020-12-12 Thread Steven Wu
> things are actually moving pretty smoothly Do you mean the job is otherwise healthy? like there is no lag etc. Do you see any bottleneck at system level, like CPU, network, disk I/O etc.? On Sat, Dec 12, 2020 at 10:54 AM Rex Fenley wrote: > Hi, > > We're running a job with on the order of

Re: How to debug a Flink Exception that does not have a stack trace?

2020-12-12 Thread Steven Wu
This is a performance optimization in JVM when the same exception is thrown too frequently. You can set `-XX:-OmitStackTraceInFastThrow` to disable the feature. You can typically find the full stack trace in the log before the optimization kicks in. On Sat, Dec 12, 2020 at 2:05 AM Till Rohrmann
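In a Flink deployment, the JVM flag can be applied to the JobManager and TaskManager processes via flink-conf.yaml; a minimal fragment:

```yaml
# flink-conf.yaml: disable the JIT optimization that elides stack traces
# for frequently thrown exceptions, so full traces are always kept.
env.java.opts: "-XX:-OmitStackTraceInFastThrow"
```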

Optimizing for super long checkpoint times

2020-12-12 Thread Rex Fenley
Hi, We're running a job with on the order of >100GiB of state. For our initial run we wanted to keep things simple, so we allocated a single core node with 1 Taskmanager and 1 parallelism and 1 TiB storage (split between 4 disks on that machine). Overall, things are actually moving pretty

Re: Disk usage during savepoints

2020-12-12 Thread Rex Fenley
Also, small correction from earlier, there are 4 volumes of 256 GiB so that's 1 TiB total. On Sat, Dec 12, 2020 at 10:08 AM Rex Fenley wrote: > Our first big test run we wanted to eliminate as many variables as > possible, so this is on 1 machine with 1 task manager and 1 parallelism. > The

Re: Disk usage during savepoints

2020-12-12 Thread Rex Fenley
Our first big test run we wanted to eliminate as many variables as possible, so this is on 1 machine with 1 task manager and 1 parallelism. The machine has 4 disks though, and as you can see, they mostly all use around the same space for storage until a savepoint is triggered. Could it be that

Re: Spill RocksDB to external Storage

2020-12-12 Thread Rex Fenley
Noted, thanks! On Sat, Dec 12, 2020 at 2:28 AM David Anderson wrote: > RocksDB can not be configured to spill to another filesystem or object > store. It is designed as an embedded database, and each task manager needs > to have sufficient disk space for its state on the host disk. You might be

Strange time format output by Flink

2020-12-12 Thread Appleyuchi
I'm trying flatAggregate; the complete code is bug-free and is here: https://paste.ubuntu.com/p/TM6n2jdZfr/ The result I get is: 8> (true,1,+1705471-09-26T16:50,+1705471-09-26T16:55,+1705471-09-26T16:54:59.999,4,1) 4>

Determining the version of flink-shaded-hadoop-2-uber*-*

2020-12-12 Thread Jacob
When upgrading the Flink version, I need to put this jar into flink/lib, but how do I determine its version number? flink-shaded-hadoop-2-uber*-* -- Sent from: http://apache-flink.147419.n8.nabble.com/

Flink 1.10.0 on YARN: job submission fails

2020-12-12 Thread Jacob
Hello, what could be causing this problem when submitting a job with Flink 1.10.0 on YARN? Is it a Hadoop jar dependency issue? The program runs fine on versions below 1.10 but cannot be submitted on 1.10.0. Thanks! [jacob@hadoop001 bin]$ ./yarn logs -applicationId application_1603495749855_57650 20/12/11 18:52:55 INFO client.RMProxy: Connecting to

Re: Disk usage during savepoints

2020-12-12 Thread David Anderson
RocksDB does do compaction in the background, and incremental checkpoints simply mirror to S3 the set of RocksDB SST files needed by the current set of checkpoints. However, unlike checkpoints, which can be incremental, savepoints are always full snapshots. As for why one host would have much
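For context, incremental checkpointing with RocksDB is enabled in flink-conf.yaml; a minimal fragment with a placeholder bucket URI (savepoints remain full snapshots regardless of this setting):

```yaml
# flink-conf.yaml: RocksDB state backend with incremental checkpoints,
# so each checkpoint only uploads SST files not already in a prior one.
state.backend: rocksdb
state.backend.incremental: true
state.checkpoints.dir: s3://my-bucket/checkpoints  # placeholder URI
```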

Re: Spill RocksDB to external Storage

2020-12-12 Thread David Anderson
RocksDB can not be configured to spill to another filesystem or object store. It is designed as an embedded database, and each task manager needs to have sufficient disk space for its state on the host disk. You might be tempted to use network attached storage for the working state, but that's

Re: How to debug a Flink Exception that does not have a stack trace?

2020-12-12 Thread Till Rohrmann
Ok, then let's see whether it reoccurs. What you could do is to revert the fix and check the stack trace again. Cheers, Till On Sat, Dec 12, 2020, 02:16 Dan Hill wrote: > Hmm, I don't have a good job I can separate for reproduction. I was using > Table SQL and inserting a long field (which

Over window watermark does not trigger computation (complete code attached), thanks

2020-12-12 Thread Appleyuchi
The code is: https://paste.ubuntu.com/p/GTgGhhcjyZ/ The documentation is the Over Window Aggregation section of: https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table/tableApi.html#group-windows The code is bug-free but produces no output. Any help is appreciated, thanks.

tEnv.executeSql(query).print() fails to consume data from Kafka

2020-12-12 Thread jy l
Hi: My Flink job consumes data from Kafka. I created a table as follows: val kafkaSource = """ |create table kafka_order( |order_id string, |order_price decimal(10,2), |order_time timestamp(3) |) |with( |'connector' = 'kafka', |'topic' = 'iceberg.order',