[jira] [Created] (FLINK-16592) The doc of Streaming File Sink has a mistake of grammar

2020-03-13 Thread Chen (Jira)
Chen created FLINK-16592: Summary: The doc of Streaming File Sink has a mistake of grammar Key: FLINK-16592 URL: https://issues.apache.org/jira/browse/FLINK-16592 Project: Flink Issue Type: Task

[jira] [Created] (FLINK-16591) Flink-zh Doc's show a wrong Email address

2020-03-13 Thread forideal (Jira)
forideal created FLINK-16591: Summary: Flink-zh Doc's show a wrong Email address Key: FLINK-16591 URL: https://issues.apache.org/jira/browse/FLINK-16591 Project: Flink Issue Type: Bug

Re: Cancel the flink task and restore from checkpoint ,can I change the flink operator's parallelism

2020-03-13 Thread LakeShen
Hi Eleanore, if you resume from a savepoint, you can't change the Flink operator's max parallelism. On Sat, Mar 14, 2020 at 12:51 AM Eleanore Jin wrote: > Hi Piotr, > Does this also apply to savepoint? (meaning the max parallelism should not > change for job resume from savepoint?) > > Thanks a lot! >

Re: [DISCUSS] FLIP-112: Support User-Defined Metrics for Python UDF

2020-03-13 Thread Hequn Cheng
Hi everyone, If there are no more concerns, I will raise the vote next week on Monday. Thank you all for the feedback. Best, Hequn On Fri, Mar 13, 2020 at 9:00 AM jincheng sun wrote: > Hi Hequn, > > +1, thank you for this discussion, and metrics are very important for > monitoring the

Re: [DISCUSS] FLIP-108: Add GPU support in Flink

2020-03-13 Thread Yangze Guo
@Stephan Do you mean Minicluster? Yes, it makes sense to share the GPU Manager in such a scenario. If that's what you worry about, I'm +1 for holding GPUManager(ExternalResourceManagers) in TaskExecutor instead of TaskManagerServices. Regarding the RuntimeContext/FunctionContext, it just holds the

FLIP-117: HBase catalog

2020-03-13 Thread Flavio Pompermaier
Hello everybody, I started a new FLIP to discuss an HBaseCatalog implementation[1] after Bowen opened the related issue [2]. I drafted a very simple version of the FLIP just to discuss the critical points (in red) in order to decide how to proceed. Best, Flavio [1]

Re: time-windowed joins and tumbling windows

2020-03-13 Thread Vinod Mehra
I wanted to add that when I used the following, the watermark was delayed by 3 hours instead of the 2 hours that I would have expected: AND o.rowtime BETWEEN c.rowtime - INTERVAL '2' hour AND c.rowtime (time window constraint between o and c: 1st and 3rd table) Thanks, Vinod On Fri, Mar 13, 2020 at

Re: [DISCUSS] FLIP-108: Add GPU support in Flink

2020-03-13 Thread Isaac Godfried
On Fri, 13 Mar 2020 15:58:20 + se...@apache.org wrote > > Can we somehow keep this out of the TaskManager services > I fear that we could not. IMO, the GPUManager(or > ExternalServicesManagers in future) is conceptually one of the task > manager services, just like

Re: [DISCUSS] Releasing Flink 1.10.1

2020-03-13 Thread Andrey Zagrebin
> @Andrey and @Xintong - could we have a quick poll on the user mailing list > about increasing the metaspace size in Flink 1.10.1? Specifically asking > for who has very small TM setups? A survey on this topic has been open for 10 days: `[Survey] Default size for the new JVM Metaspace

Re: Cancel the flink task and restore from checkpoint ,can I change the flink operator's parallelism

2020-03-13 Thread Eleanore Jin
Hi Piotr, Does this also apply to savepoint? (meaning the max parallelism should not change for job resume from savepoint?) Thanks a lot! Eleanore On Fri, Mar 13, 2020 at 6:33 AM Piotr Nowojski wrote: > Hi, > > Yes, you can change the parallelism. One thing that you can not change is > “max

[jira] [Created] (FLINK-16590) flink-oss-fs-hadoop: Not all dependencies in NOTICE file are bundled

2020-03-13 Thread Gary Yao (Jira)
Gary Yao created FLINK-16590: Summary: flink-oss-fs-hadoop: Not all dependencies in NOTICE file are bundled Key: FLINK-16590 URL: https://issues.apache.org/jira/browse/FLINK-16590 Project: Flink

Re: [DISCUSS] FLIP-108: Add GPU support in Flink

2020-03-13 Thread Stephan Ewen
> > Can we somehow keep this out of the TaskManager services > I fear that we could not. IMO, the GPUManager(or > ExternalServicesManagers in future) is conceptually one of the task > manager services, just like MemoryManager before 1.10. > - It maintains/holds the GPU resource at TM level and all

Re: [DISCUSS] Releasing Flink 1.10.1

2020-03-13 Thread Yu Li
Another blocker for 1.10.1: FLINK-16576 State inconsistency on restore with memory state backends Let me recompile the watch list with recent feedback. There are 45 issues in total with Blocker/Critical priority for 1.10.1, of which 14 are already resolved and 31 remain, and the below ones are

Re: [Discussion] Job generation / submission hooks & Atlas integration

2020-03-13 Thread Kostas Kloudas
I think that the ExecutorListener idea could work. With a bit more than FLIP-85, it is true that we can get rid of the "exception throwing" environments and we need to introduce an "EmbeddedExecutor" which is going to run on the JM. So, the 2 above, coupled with an ExecutorListener can have the

Re: Flink Kafka consumer auto-commit timeout

2020-03-13 Thread Rong Rong
Yes, this is a Kafka-side issue. Since the affected Kafka versions are all below 1.1.0, ideally we should upgrade the Kafka minor version on flink-connector-kafka-0.10/0.11 once the fix is back-ported on the Kafka side. However, based on the fact that the PR was merged 2 years ago, I

Re: Flink Kafka consumer auto-commit timeout

2020-03-13 Thread Aljoscha Krettek
Thanks for the update! On 13.03.20 13:47, Rong Rong wrote: 1. I think we have finally pinpointed what the root cause of this issue is: When partitions are assigned manually (e.g. with the assign() API instead of the subscribe() API) the client will not try to rediscover the coordinator if it dies [1].

Re: Cancel the flink task and restore from checkpoint ,can I change the flink operator's parallelism

2020-03-13 Thread Piotr Nowojski
Hi, Yes, you can change the parallelism. One thing that you can not change is “max parallelism”. Piotrek > On 13 Mar 2020, at 04:34, Sivaprasanna wrote: > > I think you can modify the operator’s parallelism. It is only if you have set > maxParallelism, and while restoring from a checkpoint,
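Why parallelism can change on restore while max parallelism cannot is easier to see with a small model. The sketch below assumes Flink's key-group scheme in simplified form: each key is hashed into one of `max_parallelism` key groups, and key groups are handed to subtasks in contiguous ranges. The function names are hypothetical, and the plain modulo stands in for the hashing Flink actually applies, so treat this as an illustration rather than Flink's implementation.

```python
def key_group_for(key_hash: int, max_parallelism: int) -> int:
    """Map a non-negative key hash to a key group.

    Simplified assumption: the number of key groups equals max parallelism,
    and a plain modulo stands in for Flink's real hashing.
    """
    return key_hash % max_parallelism


def subtask_for(key_group: int, max_parallelism: int, parallelism: int) -> int:
    """Map a key group to the subtask that owns it (contiguous ranges)."""
    return key_group * parallelism // max_parallelism


if __name__ == "__main__":
    key_hash = 10

    # Rescaling from 2 to 4 subtasks: the key stays in the same key group,
    # only the owner of that group changes, so state can be redistributed.
    group = key_group_for(key_hash, max_parallelism=8)
    print(group, subtask_for(group, 8, 2), subtask_for(group, 8, 4))

    # Changing max parallelism moves the key to a different key group, so
    # state written under the old grouping no longer lines up with the key.
    print(key_group_for(key_hash, max_parallelism=12))
```

Under this model, changing parallelism only reassigns whole key groups to different subtasks, while changing max parallelism changes which group a key belongs to, which is consistent with the restriction described in the thread.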

[VOTE] FLIP-106: Support Python UDF in SQL Function DDL

2020-03-13 Thread Wei Zhong
Hi all, I would like to start the vote for FLIP-106[1], which was discussed and reached consensus in the discussion thread[2]. The vote will be open for at least 72 hours. I'll try to close it by 2020-03-18 14:00 UTC, unless there is an objection or not enough votes. Best, Wei [1]

Re: [DISCUSS] JMX remote monitoring integration with Flink

2020-03-13 Thread Forward Xu
Hi RongRong, Thank you for bringing up this discussion. It is indeed not appropriate to occupy additional ports in the production environment to provide jmxrmi services. I think [2] RestApi or JobManager/TaskManager UI is a good idea. Best, Forward On Fri, Mar 13, 2020 at 8:54 PM Rong Rong wrote: > Hi All, > >

Re: [DISCUSS] FLIP-106: Support Python UDF in SQL Function DDL

2020-03-13 Thread Wei Zhong
Hi all, Thanks for all of your responses. If there are no more comments, I would like to bring up the VOTE. Best, Wei > On Mar 13, 2020, at 20:50, Xingbo Huang wrote: > > Hi Wei, > Thanks a lot for drafting the FLIP and kicking off the discussion. > Big +1 for this feature. > This feature will greatly

[DISCUSS] JMX remote monitoring integration with Flink

2020-03-13 Thread Rong Rong
Hi All, Has anyone tried to manage production Flink applications through JMX remote monitoring & management[1]? We have been experimenting with enabling JMXRMI on Flink by default in production and would like to share some of our thoughts: ** Is there any straightforward way to dynamically allocate

Re: [DISCUSS] FLIP-108: Add GPU support in Flink

2020-03-13 Thread Yangze Guo
Thanks for the feedback, Stephan. > Can we somehow keep this out of the TaskManager services I fear that we could not. IMO, the GPUManager(or ExternalServicesManagers in future) is conceptually one of the task manager services, just like MemoryManager before 1.10. - It maintains/holds the GPU

Re: [DISCUSS] FLIP-106: Support Python UDF in SQL Function DDL

2020-03-13 Thread Xingbo Huang
Hi Wei, Thanks a lot for drafting the FLIP and kicking off the discussion. Big +1 for this feature. This feature will make it much easier for PyFlink users to use Python UDFs in SQL scenarios. Best, Xingbo On Fri, Mar 13, 2020 at 5:10 PM Hequn Cheng wrote: > Big +1 on this feature! It would be great to extend the

Re: Flink Kafka consumer auto-commit timeout

2020-03-13 Thread Rong Rong
Hi Aljoscha, Thank you for the help and reply. 1. I think we have finally pinpointed what the root cause of this issue is: When partitions are assigned manually (e.g. with the assign() API instead of the subscribe() API) the client will not try to rediscover the coordinator if it dies [1]. This seems to no

Re: [DISCUSS] FLIP-108: Add GPU support in Flink

2020-03-13 Thread Stephan Ewen
It sounds fine to initially start with GPU specific support and think about generalizing this once we better understand the space. About the implementation suggested in FLIP-108: - Can we somehow keep this out of the TaskManager services? Anything we have to pull through all layers of the TM

[jira] [Created] (FLINK-16589) Flink Table SQL fails/crashes with big queries with lots of fields

2020-03-13 Thread Viet Pham (Jira)
Viet Pham created FLINK-16589: Summary: Flink Table SQL fails/crashes with big queries with lots of fields Key: FLINK-16589 URL: https://issues.apache.org/jira/browse/FLINK-16589 Project: Flink

[jira] [Created] (FLINK-16588) Add Disk Space metrics to TaskManagers

2020-03-13 Thread Thomas Wozniakowski (Jira)
Thomas Wozniakowski created FLINK-16588: Summary: Add Disk Space metrics to TaskManagers Key: FLINK-16588 URL: https://issues.apache.org/jira/browse/FLINK-16588 Project: Flink Issue

Re: [DISCUSS] FLIP-115: Filesystem connector in Table

2020-03-13 Thread Yun Gao
Hi, Many thanks to Jingsong for bringing up this discussion! Enhancing the FileSystem connector in Table should greatly improve usability. I have the same question as Piotr. From my side, I think it would be better to be able to reuse existing

Re: [Discussion] Job generation / submission hooks & Atlas integration

2020-03-13 Thread Stephan Ewen
A few thoughts on the discussion: ## Changes on the Master If possible, let's avoid changes to the master (JobManager / Dispatcher). These components are complex, and we should strive to keep anything out of them that we can keep out of them. ## Problems in different deployments (applications /

Re: [DISCUSS] Drop Bucketing Sink

2020-03-13 Thread Guowei Ma
+1 to drop it. To Jingsong: we are planning to implement the ORC StreamingFileSink in 1.11. I think users could also reference the old BucketingSink from the old version. Best, Guowei On Fri, Mar 13, 2020 at 10:07 AM Jingsong Li wrote: > Hi Robert, > > +1 to drop it but maybe not 1.11. > > ORC has not been

Re: [DISCUSS] Releasing Flink 1.10.1

2020-03-13 Thread Stephan Ewen
@Andrey and @Xintong - could we have a quick poll on the user mailing list about increasing the metaspace size in Flink 1.10.1? Specifically asking for who has very small TM setups? On Fri, Mar 13, 2020 at 6:23 AM Yu Li wrote: > Thanks for the reminder Stephan and the inputs/discussion Andrey

Re: [VOTE] [FLIP-76] Unaligned checkpoints

2020-03-13 Thread Yu Li
+1 (binding) The updated FLIP doc LGTM. Thanks for addressing the comments Arvid and Roman. Best Regards, Yu On Fri, 13 Mar 2020 at 03:48, Arvid Heise wrote: > I added a roadmap section to the FLIP as suggested by Yu and Roman. > > Unless someone objects, I'd still consider the voting period

Re: [DISCUSS] FLIP-115: Filesystem connector in Table

2020-03-13 Thread Piotr Nowojski
Hi, Which actual sinks/sources are you planning to use in this feature? Is it about exposing StreamingFileSink in the Table API? Or do you want to implement new Sinks/Sources? Piotrek > On 13 Mar 2020, at 10:04, jinhai wang wrote: > > Thanks for FLIP-115. It is really useful feature for

[jira] [Created] (FLINK-16587) Add basic CheckpointBarrierHandler for unaligned checkpoint

2020-03-13 Thread Arvid Heise (Jira)
Arvid Heise created FLINK-16587: Summary: Add basic CheckpointBarrierHandler for unaligned checkpoint Key: FLINK-16587 URL: https://issues.apache.org/jira/browse/FLINK-16587 Project: Flink

Re: [DISCUSS] Drop Bucketing Sink

2020-03-13 Thread Yu Li
Thanks for bringing up this discussion Robert! Based on the input so far, I suggest we create an umbrella JIRA issue to track all critical improvements StreamingFileSink should have before we can completely discard the bucketing sink, so we have a clear view of the progress and how soon we

Re: [DISCUSS] FLIP-106: Support Python UDF in SQL Function DDL

2020-03-13 Thread Hequn Cheng
Big +1 on this feature! It would be great to extend the usage of Python UDF in SQL scenarios. The design doc looks good from my side now. Thank you for the update. Best, Hequn On Tue, Mar 10, 2020 at 3:50 PM Wei Zhong wrote: > Hi Timo, > > Thanks for your reply. > > If we aim for the option 1,

Re: [DISCUSS] FLIP-115: Filesystem connector in Table

2020-03-13 Thread jinhai wang
Thanks for FLIP-115. It is a really useful feature for platform developers who manage hundreds of Flink to Hive jobs in production. I think we need to add 'connector.sink.username' for UserGroupInformation when data is written to HDFS. On 2020/3/13 at 3:33 PM, "Jingsong Li" wrote: Hi everyone,

Re: [DISCUSS] FLIP-115: Filesystem connector in Table

2020-03-13 Thread jinhai wang
Thanks for FLIP-115. It is a really useful feature for platform developers who manage hundreds of Flink to Hive jobs in production. I think we need to add 'connector.sink.username' for UserGroupInformation when data is written to HDFS. On Fri, Mar 13, 2020 at 3:33 PM Jingsong Li wrote: > Hi everyone, > > I'd like

[Discuss] FLINK-16039 Add API method to get last element in session window

2020-03-13 Thread Manas Kale
Hi all, I would like to start a discussion on this feature request (JIRA link). Consider the events: [1, event], [2, event], where the first element is the event timestamp in seconds and the second element is the event code/name. Also consider that an Event
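To make the request concrete, here is a minimal sketch of "last element in a session window" on plain `(timestamp, name)` pairs. It does not use any Flink API: the function name and the `gap_seconds` parameter are hypothetical, and the session rule (a new session starts once the time since the previous event reaches the gap) is an assumption, since the snippet does not state the gap.

```python
from typing import List, Optional, Tuple

Event = Tuple[int, str]  # (event timestamp in seconds, event code/name)


def last_element_per_session(events: List[Event], gap_seconds: int) -> List[Event]:
    """Return the last event of every session.

    Assumption: a session ends when the next event arrives gap_seconds or
    more after the previous one. This is a plain-Python illustration only.
    """
    last_events: List[Event] = []
    previous: Optional[Event] = None
    for event in sorted(events, key=lambda e: e[0]):
        if previous is not None and event[0] - previous[0] >= gap_seconds:
            # The gap was reached, so `previous` closed its session.
            last_events.append(previous)
        previous = event
    if previous is not None:
        # The final event always closes the last open session.
        last_events.append(previous)
    return last_events


if __name__ == "__main__":
    # The two events from the message fall into one session with a 3 s gap.
    print(last_element_per_session([(1, "event"), (2, "event")], gap_seconds=3))
```

With a hypothetical gap of 3 seconds, the two events from the message form a single session whose last element is the event at timestamp 2.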

Re: [DISCUSS] Drop Bucketing Sink

2020-03-13 Thread jinhai wang
Hi, +1 to drop the bucketing sink. FLIP-115 also needs to be prioritized for 1.11. On 2020/3/13 at 10:07 AM, "Jingsong Li" wrote: Hi Robert, +1 to drop it but maybe not 1.11. ORC has not been supported on StreamingFileSink. I have seen lots of users run ORC in the bucketing sink.

[jira] [Created] (FLINK-16586) Build ResultSubpartitionInfo and InputChannelInfo in respective constructors

2020-03-13 Thread Zhijiang (Jira)
Zhijiang created FLINK-16586: Summary: Build ResultSubpartitionInfo and InputChannelInfo in respective constructors Key: FLINK-16586 URL: https://issues.apache.org/jira/browse/FLINK-16586 Project: Flink

Re: [Discussion] Job generation / submission hooks & Atlas integration

2020-03-13 Thread tison
Hi Gyula and all, Thanks for the discussion so far. It seems that the requirement is to deliver some metadata of the submitted job, and such metadata can simply be extracted from the StreamGraph. I'm unfamiliar with the metadata Atlas needs, so I am making some assumptions. Assumption: Metadata needed by

[DISCUSS] FLIP-115: Filesystem connector in Table

2020-03-13 Thread Jingsong Li
Hi everyone, I'd like to start a discussion about FLIP-115 Filesystem connector in Table [1]. This FLIP will bring: - Introduce Filesystem table factory in table, support csv/parquet/orc/json/avro formats. - Introduce streaming filesystem/hive sink in table CC to user mail list, if you have any

Re: [Discussion] Job generation / submission hooks & Atlas integration

2020-03-13 Thread Gyula Fóra
Thanks again Kostas for diving deep into this, it is great feedback! I agree with the concerns regarding the custom executor, it has to be able to properly handle the "original" executor somehow. This might be quite tricky if we want to implement the AtlasExecutor outside Flink. In any case does

[jira] [Created] (FLINK-16585) Table program cannot be compiled.Cannot determine simple type name "com"

2020-03-13 Thread hiliuxg (Jira)
hiliuxg created FLINK-16585: Summary: Table program cannot be compiled.Cannot determine simple type name "com" Key: FLINK-16585 URL: https://issues.apache.org/jira/browse/FLINK-16585 Project: Flink

Re: Flink YARN app terminated before the client receives the result

2020-03-13 Thread DONG, Weike
Hi Yangze and all, I have tried numerous times, and this behavior persists. Below is the tail log of taskmanager.log: 2020-03-13 12:06:14.240 [flink-akka.actor.default-dispatcher-3] INFO org.apache.flink.runtime.taskexecutor.slot.TaskSlotTableImpl - Free slot TaskSlot(index:0, state:ACTIVE,

[jira] [Created] (FLINK-16584) Whether to support the long type field in table planner when the source is kafka and event time field's type is long

2020-03-13 Thread hehuiyuan (Jira)
hehuiyuan created FLINK-16584: Summary: Whether to support the long type field in table planner when the source is kafka and event time field's type is long Key: FLINK-16584 URL:

[jira] [Created] (FLINK-16583) SQLClientKafkaITCase.testKafka failed in SqlClientException

2020-03-13 Thread Zhijiang (Jira)
Zhijiang created FLINK-16583: Summary: SQLClientKafkaITCase.testKafka failed in SqlClientException Key: FLINK-16583 URL: https://issues.apache.org/jira/browse/FLINK-16583 Project: Flink Issue