Re: [DISCUSS] FLIP-84 Feedback Summary

2020-05-16 Thread godfrey he
hi everyone,

I would like to bring up another topic: the return value of the
TableResult#collect() method.
Currently, the return type is `Iterator<Row>`, and we ran into some problems
while implementing FLINK-14807 [1].

In the current design, the sink operator has a buffer pool which buffers the
data from upstream
and waits for the client to consume it. The client pulls the data
when the `Iterator#next()` method is called.
If the client submits a select job, consumes part of the data, and then exits,
the job will not finish.
This causes a resource leak. We can't require the client to consume all the
data; for an unbounded streaming job, that is impossible anyway.
Currently, users can also cancel the job via the
`TableResult.getJobClient().get().cancel()` method,
but this approach is neither intuitive nor convenient.

So, I want to change the return type from `Iterator` to
`CloseableRowIterator`;
the new method looks like:

public interface TableResult {

  CloseableRowIterator collect();

}

public interface CloseableRowIterator extends Iterator<Row>, AutoCloseable {

}

Prefixing the name with "Closeable" is intended to remind users that
this iterator should be closed;
users can conveniently use a try-with-resources statement to release the
resources.
The resource leak problem still exists if users neither close the iterator
nor cancel the job through the job client;
we just provide an easier way for users to avoid it.
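For illustration, the intended usage could be sketched like this (a self-contained sketch: `String` stands in for Flink's `Row`, and the in-memory `collect()` is purely hypothetical, not the real TableResult implementation):

```java
import java.util.Arrays;
import java.util.Iterator;

public class CollectSketch {

    // The proposed shape: an Iterator that releases job resources on close().
    interface CloseableRowIterator extends Iterator<String>, AutoCloseable {
        @Override
        void close(); // narrowed so try-with-resources needs no checked exception
    }

    // Illustrative stand-in for TableResult#collect(); a real implementation
    // would pull buffered rows from the sink and cancel the job in close().
    static CloseableRowIterator collect() {
        final Iterator<String> rows = Arrays.asList("row-1", "row-2", "row-3").iterator();
        return new CloseableRowIterator() {
            public boolean hasNext() { return rows.hasNext(); }
            public String next() { return rows.next(); }
            public void close() { /* cancel the job / release the sink buffer pool here */ }
        };
    }

    // Consume only part of the result; close() still runs and frees the job.
    static String firstRow() {
        try (CloseableRowIterator it = collect()) {
            return it.next();
        }
    }

    public static void main(String[] args) {
        System.out.println(firstRow());
    }
}
```

The point of the try-with-resources block is that `close()` is guaranteed to run even if the client stops consuming early or an exception is thrown mid-iteration.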

I also noticed that there is a `CloseableIterator` interface in the
`org.apache.flink.util` package,
but I still prefer introducing `CloseableRowIterator`. My reasoning is:
1) `CloseableIterator` is in a util package and is not a public interface.
2) `CloseableRowIterator` is more convenient: users do not need to specify
the generic type `<Row>`.

What do you think?

Best,
Godfrey


[1] https://issues.apache.org/jira/browse/FLINK-14807


Fabian Hueske  于2020年5月7日周四 下午3:59写道:

> Thanks for the update Godfrey!
>
> +1 to this approach.
>
> Since there can be only one primary key, I'd also be fine to just use
> `PRI` even if it is composite, but `PRI(f0, f5)` might be more convenient
> for users.
>
> Thanks, Fabian
>
> Am Do., 7. Mai 2020 um 09:31 Uhr schrieb godfrey he :
>
>> Hi Fabian,
>> Thanks for your suggestions.
>>
>> Agree with you that `UNQ(f2, f3)` is clearer.
>>
>> A table can have only ONE primary key,
>> and this primary key can consist of a single column or multiple columns. [1]
>> If the primary key consists of a single column,
>> we can simply use `PRI` (or `PRI(xx)`) to represent it.
>> If the primary key has multiple columns,
>> we should use `PRI(xx, yy, ...)` to represent it.
>>
>> A table may have multiple unique keys,
>> and each unique key can consist of a single column or multiple columns. [2]
>> If there is only one unique key and it has only a single column,
>> we can simply use `UNQ` (or `UNQ(xx)`) to represent it.
>> Otherwise, we should use `UNQ(xx, yy, ...)` to represent it.
>> (A corner case: two unique keys with the same columns, like `UNQ(f2, f3)`
>> and `UNQ(f2, f3)`;
>> we can forbid this case or add a unique name for each key in the future.)
>>
>> primary key and unique key with multiple columns example:
>> create table MyTable (
>>   f0 BIGINT NOT NULL,
>>   f1 ROW,
>>   f2 VARCHAR<256>,
>>   f3 AS f0 + 1,
>>   f4 TIMESTAMP(3) NOT NULL,
>>   f5 BIGINT NOT NULL,
>>   PRIMARY KEY (f0, f5),
>>   UNIQUE (f3, f2),
>>   WATERMARK FOR f4 AS f4 - INTERVAL '3' SECOND
>> ) with (...)
>>
>>
>> +------+--------------+-------+-------------+----------------+-----------+
>> | name | type         | null  | key         | compute column | watermark |
>> +------+--------------+-------+-------------+----------------+-----------+
>> | f0   | BIGINT       | false | PRI(f0, f5) | (NULL)         | (NULL)    |
>> | f1   | ROW          | true  | (NULL)      | (NULL)         | (NULL)    |
>> | f2   | VARCHAR<256> | true  | UNQ(f2, f3) | (NULL)         | (NULL)    |
>> | f3   | BIGINT       | false | UNQ(f2, f3) | f0 + 1         | (NULL)    |
>> 

[jira] [Created] (FLINK-17768) UnalignedCheckpointITCase.shouldPerformUnalignedCheckpointOnLocalAndRemoteChannel is instable

2020-05-16 Thread Dian Fu (Jira)
Dian Fu created FLINK-17768:
---

 Summary: 
UnalignedCheckpointITCase.shouldPerformUnalignedCheckpointOnLocalAndRemoteChannel
 is instable
 Key: FLINK-17768
 URL: https://issues.apache.org/jira/browse/FLINK-17768
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Checkpointing
Reporter: Dian Fu


UnalignedCheckpointITCase.shouldPerformUnalignedCheckpointOnLocalAndRemoteChannel
 and shouldPerformUnalignedCheckpointOnParallelRemoteChannel failed in azure:
{code}
2020-05-16T12:41:32.3546620Z [ERROR] shouldPerformUnalignedCheckpointOnLocalAndRemoteChannel(org.apache.flink.test.checkpointing.UnalignedCheckpointITCase)  Time elapsed: 18.865 s  <<< ERROR!
2020-05-16T12:41:32.3548739Z java.util.concurrent.ExecutionException: org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
2020-05-16T12:41:32.3550177Z at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
2020-05-16T12:41:32.3551416Z at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
2020-05-16T12:41:32.3552959Z at org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1665)
2020-05-16T12:41:32.3554979Z at org.apache.flink.streaming.api.environment.LocalStreamEnvironment.execute(LocalStreamEnvironment.java:74)
2020-05-16T12:41:32.3556584Z at org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1645)
2020-05-16T12:41:32.3558068Z at org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1627)
2020-05-16T12:41:32.3559431Z at org.apache.flink.test.checkpointing.UnalignedCheckpointITCase.execute(UnalignedCheckpointITCase.java:158)
2020-05-16T12:41:32.3560954Z at org.apache.flink.test.checkpointing.UnalignedCheckpointITCase.shouldPerformUnalignedCheckpointOnLocalAndRemoteChannel(UnalignedCheckpointITCase.java:145)
2020-05-16T12:41:32.3562203Z at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
2020-05-16T12:41:32.3563433Z at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
2020-05-16T12:41:32.3564846Z at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
2020-05-16T12:41:32.3565894Z at java.lang.reflect.Method.invoke(Method.java:498)
2020-05-16T12:41:32.3566870Z at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
2020-05-16T12:41:32.3568064Z at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
2020-05-16T12:41:32.3569727Z at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
2020-05-16T12:41:32.3570818Z at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
2020-05-16T12:41:32.3571840Z at org.junit.rules.Verifier$1.evaluate(Verifier.java:35)
2020-05-16T12:41:32.3572771Z at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
2020-05-16T12:41:32.3574008Z at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
2020-05-16T12:41:32.3575406Z at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
2020-05-16T12:41:32.3576476Z at java.util.concurrent.FutureTask.run(FutureTask.java:266)
2020-05-16T12:41:32.3577253Z at java.lang.Thread.run(Thread.java:748)
2020-05-16T12:41:32.3578228Z Caused by: org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
2020-05-16T12:41:32.3579520Z at org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:147)
2020-05-16T12:41:32.3580935Z at org.apache.flink.client.program.PerJobMiniClusterFactory$PerJobMiniClusterJobClient.lambda$getJobExecutionResult$2(PerJobMiniClusterFactory.java:186)
2020-05-16T12:41:32.3582361Z at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:616)
2020-05-16T12:41:32.3583456Z at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:591)
2020-05-16T12:41:32.3584816Z at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
2020-05-16T12:41:32.3585874Z at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975)
2020-05-16T12:41:32.3587059Z at org.apache.flink.runtime.rpc.akka.AkkaInvocationHandler.lambda$invokeRpc$0(AkkaInvocationHandler.java:229)
2020-05-16T12:41:32.3588572Z at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)
2020-05-16T12:41:32.3589733Z at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750)
2020-05-16T12:41:32.3590860Z at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)

[jira] [Created] (FLINK-17767) Tumble and Hop window support offset

2020-05-16 Thread hailong wang (Jira)
hailong wang created FLINK-17767:


 Summary: Tumble and Hop window support offset
 Key: FLINK-17767
 URL: https://issues.apache.org/jira/browse/FLINK-17767
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / Planner
Affects Versions: 1.10.0
Reporter: hailong wang
 Fix For: 1.11.0


TUMBLE windows and HOP windows with an alignment offset are not supported yet. We 
can support this via 

(, , )
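For context, an offset here means shifting the window alignment, roughly `start = ts - (ts - offset + size) % size` as in the DataStream API's tumbling windows. A minimal, self-contained sketch of that arithmetic (the helper name is mine, not a Flink API):

```java
public class WindowOffsetSketch {

    // Start of the tumbling window of the given size containing `timestamp`,
    // with the window boundaries shifted by `offset`. Assumes non-negative
    // timestamps and 0 <= offset < size.
    static long windowStart(long timestamp, long offset, long size) {
        return timestamp - (timestamp - offset + size) % size;
    }

    public static void main(String[] args) {
        long hour = 3600_000L, day = 24 * hour;
        // A daily window with an 8-hour offset covers [08:00, 08:00 next day).
        long ts = day + 9 * hour;                    // day 1, 09:00
        long start = windowStart(ts, 8 * hour, day); // day 1, 08:00
        System.out.println(start == day + 8 * hour); // true
    }
}
```

With offset 0, a daily window aligns to midnight UTC; a non-zero offset lets users align windows to, e.g., a local time zone.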



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-17766) Use checkpoint lock instead of fine-grained locking in Kafka AbstractFetcher

2020-05-16 Thread Aljoscha Krettek (Jira)
Aljoscha Krettek created FLINK-17766:


 Summary: Use checkpoint lock instead of fine-grained locking in 
Kafka AbstractFetcher
 Key: FLINK-17766
 URL: https://issues.apache.org/jira/browse/FLINK-17766
 Project: Flink
  Issue Type: Sub-task
  Components: Connectors / Kafka
Reporter: Aljoscha Krettek
Assignee: Aljoscha Krettek
 Fix For: 1.11.0


In {{emitRecordsWithTimestamps()}}, we are currently locking on the partition 
state object itself to prevent concurrent access (and to make sure that changes 
are visible across threads). However, after recent changes (FLINK-17307) we 
hold the checkpoint lock for emitting the whole "bundle" of records from Kafka. 
We can now also just use the checkpoint lock in the periodic emitter callback 
and then don't need the fine-grained locking on the state for record emission.
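A simplified sketch of the locking scheme described above (an illustrative plain-Java structure, not the actual AbstractFetcher code): both the record-bundle emission and the periodic callback synchronize on the single checkpoint lock, so per-state locking becomes unnecessary.

```java
import java.util.ArrayList;
import java.util.List;

public class FetcherLockingSketch {
    // Shared with the task's checkpointing machinery.
    private final Object checkpointLock = new Object();
    private final List<Integer> emitted = new ArrayList<>();

    // Before: each record emission locked the partition state object itself.
    // After: the whole bundle of records is emitted under the checkpoint lock,
    // so emission and state update are atomic w.r.t. checkpoints.
    void emitRecordsWithTimestamps(List<Integer> bundle) {
        synchronized (checkpointLock) {
            for (int record : bundle) {
                emitted.add(record); // emit record + update partition state
            }
        }
    }

    // The periodic (watermark) emitter callback takes the same lock, which
    // also guarantees cross-thread visibility of the partition state.
    void onPeriodicEmit() {
        synchronized (checkpointLock) {
            // read partition state and emit a watermark here
        }
    }

    List<Integer> emitted() { return emitted; }
}
```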




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] Guarantee that @PublicEvolving classes are API and binary compatible across bug fix releases (x.y.u and x.y.v)

2020-05-16 Thread Zhu Zhu
+1

Thanks,
Zhu Zhu

Yun Tang  于2020年5月16日周六 下午11:18写道:

>
> +1 (non-binding)
>
> Yun Tang
> 
> From: Benchao Li 
> Sent: Saturday, May 16, 2020 22:00
> To: dev 
> Subject: Re: [VOTE] Guarantee that @PublicEvolving classes are API and
> binary compatible across bug fix releases (x.y.u and x.y.v)
>
> +1 (non-binding)
>
> Piotr Nowojski  于2020年5月16日周六 下午8:10写道:
>
> > +1
> >
> > Piotrek
> >
> > > On 16 May 2020, at 13:21, Yu Li  wrote:
> > >
> > > +1 (not sure whether my vote is binding, I guess yes since this is a
> > > development-oriented vote?)
> > >
> > > Minor:
> > > bq. This means that a version x.y.u is API and binary compatible to
> x.y.v
> > > with u <= v wrt all @PublicEvolving classes.
> > > I guess you mean "with u *>* v" to keep backward compatibility instead
> of
> > > forward?
> > >
> > > Thanks.
> > >
> > > Best Regards,
> > > Yu
> > >
> > >
> > > On Sat, 16 May 2020 at 18:40, Jeff Zhang  wrote:
> > >
> > >> Definitely +1
> > >>
> > >> Dian Fu  于2020年5月16日周六 下午5:48写道:
> > >>
> > >>> +1 (non-binding)
> > >>>
> > >>> Regards,
> > >>> Dian
> > >>>
> >  在 2020年5月16日,下午2:33,Congxian Qiu  写道:
> > 
> >  +1 (non-binding)
> >  Best,
> >  Congxian
> > 
> > 
> >  Yangze Guo  于2020年5月16日周六 上午12:51写道:
> > 
> > > +1
> > >
> > > Best,
> > > Yangze Guo
> > >
> > > On Sat, May 16, 2020 at 12:26 AM Yuan Mei 
> > >>> wrote:
> > >>
> > >> +1
> > >>
> > >> On Sat, May 16, 2020 at 12:21 AM Ufuk Celebi 
> > wrote:
> > >>
> > >>> +1
> > >>>
> > >>> – Ufuk
> > >>>
> > >>>
> > >>> On Fri, May 15, 2020 at 4:54 PM Zhijiang <
> > >> wangzhijiang...@aliyun.com
> > >>> .invalid>
> > >>> wrote:
> > >>>
> >  Sounds good, +1.
> > 
> >  Best,
> >  Zhijiang
> > 
> > 
> > 
> --
> >  From:Thomas Weise 
> >  Send Time:2020年5月15日(星期五) 21:33
> >  To:dev 
> >  Subject:Re: [VOTE] Guarantee that @PublicEvolving classes are
> API
> > >> and
> >  binary compatible across bug fix releases (x.y.u and x.y.v)
> > 
> >  +1
> > 
> > 
> >  On Fri, May 15, 2020 at 6:15 AM Till Rohrmann <
> > >> trohrm...@apache.org>
> >  wrote:
> > 
> > > Dear community,
> > >
> > > with reference to the dev ML thread about guaranteeing API and
> > > binary
> > > compatibility for @PublicEvolving classes across bug fix
> releases
> > > [1] I
> > > would like to start a vote about it.
> > >
> > > The proposal is that the Flink community starts to guarantee
> > > that @PublicEvolving classes will be API and binary compatible
> > > across
> > >>> bug
> > > fix releases of the same minor version. This means that a
> version
> > > x.y.u
> >  is
> > > API and binary compatible to x.y.v with u <= v wrt all
> > > @PublicEvolving
> > > classes.
> > >
> > > The voting options are the following:
> > >
> > > * +1, Provide the above described guarantees
> > > * -1, Do not provide the above described guarantees (please
> > >> provide
> > > specific comments)
> > >
> > > The vote will be open for at least 72 hours. It is adopted by
> > > majority
> > > approval with at least 3 PMC affirmative votes.
> > >
> > > [1]
> > >
> > >
> > 
> > >>>
> > >
> > >>>
> > >>
> >
> https://lists.apache.org/thread.html/rb0d0f887b291a490ed3773352c90ddf5f11e3d882dc501e3b8cf0ed0%40%3Cdev.flink.apache.org%3E
> > >
> > > Cheers,
> > > Till
> > >
> > 
> > 
> > >>>
> > >
> > >>>
> > >>>
> > >>
> > >> --
> > >> Best Regards
> > >>
> > >> Jeff Zhang
> > >>
> >
> >
>
> --
>
> Benchao Li
> School of Electronics Engineering and Computer Science, Peking University
> Tel:+86-15650713730
> Email: libenc...@gmail.com; libenc...@pku.edu.cn
>


[jira] [Created] (FLINK-17765) Verbose client error messages

2020-05-16 Thread Till Rohrmann (Jira)
Till Rohrmann created FLINK-17765:
-

 Summary: Verbose client error messages
 Key: FLINK-17765
 URL: https://issues.apache.org/jira/browse/FLINK-17765
 Project: Flink
  Issue Type: Bug
  Components: Client / Job Submission, Runtime / Coordination
Affects Versions: 1.10.1, 1.11.0
Reporter: Till Rohrmann
 Fix For: 1.11.0


Some client operations, when they fail, produce very verbose error messages that 
are hard for the user to decipher. For example, if the job submission fails 
because the savepoint path does not exist, the user sees the following:

{code}
org.apache.flink.client.program.ProgramInvocationException: The main method 
caused an error: java.util.concurrent.ExecutionException: 
org.apache.flink.runtime.client.JobSubmissionException: Failed to submit 
JobGraph.
at 
org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:302)
at 
org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:198)
at 
org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:148)
at 
org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:689)
at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:227)
at 
org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:906)
at 
org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:982)
at 
org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:982)
Caused by: java.lang.RuntimeException: java.util.concurrent.ExecutionException: 
org.apache.flink.runtime.client.JobSubmissionException: Failed to submit 
JobGraph.
at org.apache.flink.util.ExceptionUtils.rethrow(ExceptionUtils.java:290)
at 
org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.executeAsync(StreamExecutionEnvironment.java:1766)
at 
org.apache.flink.client.program.StreamContextEnvironment.executeAsync(StreamContextEnvironment.java:104)
at 
org.apache.flink.client.program.StreamContextEnvironment.execute(StreamContextEnvironment.java:71)
at 
org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1645)
at 
org.apache.flink.streaming.examples.statemachine.StateMachineExample.main(StateMachineExample.java:142)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:288)
... 8 more
Caused by: java.util.concurrent.ExecutionException: 
org.apache.flink.runtime.client.JobSubmissionException: Failed to submit 
JobGraph.
at 
java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
at 
java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
at 
org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.executeAsync(StreamExecutionEnvironment.java:1761)
... 17 more
Caused by: org.apache.flink.runtime.client.JobSubmissionException: Failed to 
submit JobGraph.
at 
org.apache.flink.client.program.rest.RestClusterClient.lambda$submitJob$7(RestClusterClient.java:366)
at 
java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:870)
at 
java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:852)
at 
java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
at 
java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
at 
org.apache.flink.runtime.concurrent.FutureUtils.lambda$retryOperationWithDelay$8(FutureUtils.java:290)
at 
java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
at 
java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)
at 
java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
at 
java.util.concurrent.CompletableFuture.postFire(CompletableFuture.java:561)
at 
java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:929)
at 
java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)

[jira] [Created] (FLINK-17763) No log files when starting scala-shell

2020-05-16 Thread Jeff Zhang (Jira)
Jeff Zhang created FLINK-17763:
--

 Summary: No log files when starting scala-shell
 Key: FLINK-17763
 URL: https://issues.apache.org/jira/browse/FLINK-17763
 Project: Flink
  Issue Type: Bug
  Components: Scala Shell
Affects Versions: 1.11.0
Reporter: Jeff Zhang


I see the following error when starting the Scala shell.

 
{code:java}
Starting Flink Shell:
ERROR StatusLogger No Log4j 2 configuration file found. Using default 
configuration (logging only errors to the console), or user programmatically 
provided configurations. Set system property 'log4j2.debug' to show Log4j 2 
internal initialization logging. See 
https://logging.apache.org/log4j/2.x/manual/configuration.html for instructions 
on how to configure Log4j 2 {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-17764) Update tips about the default planner when the parameter planner is not recognized

2020-05-16 Thread xushiwei (Jira)
xushiwei created FLINK-17764:


 Summary: Update tips about the default planner when the parameter 
planner is not recognized
 Key: FLINK-17764
 URL: https://issues.apache.org/jira/browse/FLINK-17764
 Project: Flink
  Issue Type: Improvement
  Components: Examples
Reporter: xushiwei


The default planner has been set to blink in the code.

However, when the planner parameter value is not recognized, the error message 
still suggests that the default planner is flink. 

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] Guarantee that @PublicEvolving classes are API and binary compatible across bug fix releases (x.y.u and x.y.v)

2020-05-16 Thread Yun Tang

+1 (non-binding)

Yun Tang

From: Benchao Li 
Sent: Saturday, May 16, 2020 22:00
To: dev 
Subject: Re: [VOTE] Guarantee that @PublicEvolving classes are API and binary 
compatible across bug fix releases (x.y.u and x.y.v)

+1 (non-binding)

Piotr Nowojski  于2020年5月16日周六 下午8:10写道:

> +1
>
> Piotrek
>
> > On 16 May 2020, at 13:21, Yu Li  wrote:
> >
> > +1 (not sure whether my vote is binding, I guess yes since this is a
> > development-oriented vote?)
> >
> > Minor:
> > bq. This means that a version x.y.u is API and binary compatible to x.y.v
> > with u <= v wrt all @PublicEvolving classes.
> > I guess you mean "with u *>* v" to keep backward compatibility instead of
> > forward?
> >
> > Thanks.
> >
> > Best Regards,
> > Yu
> >
> >
> > On Sat, 16 May 2020 at 18:40, Jeff Zhang  wrote:
> >
> >> Definitely +1
> >>
> >> Dian Fu  于2020年5月16日周六 下午5:48写道:
> >>
> >>> +1 (non-binding)
> >>>
> >>> Regards,
> >>> Dian
> >>>
>  在 2020年5月16日,下午2:33,Congxian Qiu  写道:
> 
>  +1 (non-binding)
>  Best,
>  Congxian
> 
> 
>  Yangze Guo  于2020年5月16日周六 上午12:51写道:
> 
> > +1
> >
> > Best,
> > Yangze Guo
> >
> > On Sat, May 16, 2020 at 12:26 AM Yuan Mei 
> >>> wrote:
> >>
> >> +1
> >>
> >> On Sat, May 16, 2020 at 12:21 AM Ufuk Celebi 
> wrote:
> >>
> >>> +1
> >>>
> >>> – Ufuk
> >>>
> >>>
> >>> On Fri, May 15, 2020 at 4:54 PM Zhijiang <
> >> wangzhijiang...@aliyun.com
> >>> .invalid>
> >>> wrote:
> >>>
>  Sounds good, +1.
> 
>  Best,
>  Zhijiang
> 
> 
>  --
>  From:Thomas Weise 
>  Send Time:2020年5月15日(星期五) 21:33
>  To:dev 
>  Subject:Re: [VOTE] Guarantee that @PublicEvolving classes are API
> >> and
>  binary compatible across bug fix releases (x.y.u and x.y.v)
> 
>  +1
> 
> 
>  On Fri, May 15, 2020 at 6:15 AM Till Rohrmann <
> >> trohrm...@apache.org>
>  wrote:
> 
> > Dear community,
> >
> > with reference to the dev ML thread about guaranteeing API and
> > binary
> > compatibility for @PublicEvolving classes across bug fix releases
> > [1] I
> > would like to start a vote about it.
> >
> > The proposal is that the Flink community starts to guarantee
> > that @PublicEvolving classes will be API and binary compatible
> > across
> >>> bug
> > fix releases of the same minor version. This means that a version
> > x.y.u
>  is
> > API and binary compatible to x.y.v with u <= v wrt all
> > @PublicEvolving
> > classes.
> >
> > The voting options are the following:
> >
> > * +1, Provide the above described guarantees
> > * -1, Do not provide the above described guarantees (please
> >> provide
> > specific comments)
> >
> > The vote will be open for at least 72 hours. It is adopted by
> > majority
> > approval with at least 3 PMC affirmative votes.
> >
> > [1]
> >
> >
> 
> >>>
> >
> >>>
> >>
> https://lists.apache.org/thread.html/rb0d0f887b291a490ed3773352c90ddf5f11e3d882dc501e3b8cf0ed0%40%3Cdev.flink.apache.org%3E
> >
> > Cheers,
> > Till
> >
> 
> 
> >>>
> >
> >>>
> >>>
> >>
> >> --
> >> Best Regards
> >>
> >> Jeff Zhang
> >>
>
>

--

Benchao Li
School of Electronics Engineering and Computer Science, Peking University
Tel:+86-15650713730
Email: libenc...@gmail.com; libenc...@pku.edu.cn


[jira] [Created] (FLINK-17762) Postgres Catalog should pass table's primary key to catalogTable

2020-05-16 Thread Leonard Xu (Jira)
Leonard Xu created FLINK-17762:
--

 Summary: Postgres Catalog should pass table's primary key to 
catalogTable
 Key: FLINK-17762
 URL: https://issues.apache.org/jira/browse/FLINK-17762
 Project: Flink
  Issue Type: Sub-task
Reporter: Leonard Xu


For an upsert query, if the table comes from a catalog rather than being created 
in Flink, the Postgres catalog should pass the table's primary key to the 
CatalogTable so that JdbcDynamicTableSink can determine whether to work in 
upsert mode or append-only mode.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] increase "state.backend.fs.memory-threshold" from 1K to 100K

2020-05-16 Thread Yun Tang
If we cannot get rid of union state, I think we should introduce memory control 
on the serialized TDDs when deploying tasks, instead of changing how union state 
is implemented when assigning state in StateAssignmentOperation.
The duplicated TaskStateSnapshots would not really increase memory much, as the 
ByteStreamStateHandles actually share the same reference until they are 
serialized.

When talking about the estimated memory footprint, I previously thought it 
depends on the pool size of the future executor (HardWare#getNumberCPUCores). 
However, with the simple program below, I found that the async task-submission 
logic makes the number of RemoteRpcInvocations existing in the JM at the same 
time larger than HardWare#getNumberCPUCores.
Take the program below as an example: with a source parallelism of 200, the 
number of RemoteRpcInvocations existing in the JM at the same time can be nearly 
200, while our future-executor pool size is only 96. I think if we could clear 
the serialized data in RemoteRpcInvocation as soon as possible, we might 
mitigate this problem greatly.

Simple program that uses union state to reproduce the memory-footprint problem: 
each sub-task's piece of the union state is a 100KB byte array, and with 200 
sub-tasks in total, the union state can require more than 100KB * 200 * 200 = 
3.8GB of memory.

public class Program {
   private static final Logger LOG = LoggerFactory.getLogger(Program.class);

   public static void main(String[] args) throws Exception {
      final StreamExecutionEnvironment env =
            StreamExecutionEnvironment.getExecutionEnvironment();
      env.enableCheckpointing(60 * 1000L);
      env.addSource(new MySource()).setParallelism(200).print();
      env.execute("Mock program");
   }

   private static class MySource extends RichParallelSourceFunction<Integer>
         implements CheckpointedFunction {
      private static final ListStateDescriptor<byte[]> stateDescriptor =
            new ListStateDescriptor<>("list-1", byte[].class);
      private ListState<byte[]> unionListState;
      private volatile boolean running = true;

      @Override
      public void snapshotState(FunctionSnapshotContext context) throws Exception {
         unionListState.clear();
         byte[] array = new byte[100 * 1024];
         ThreadLocalRandom.current().nextBytes(array);
         unionListState.add(array);
      }

      @Override
      public void initializeState(FunctionInitializationContext context) throws Exception {
         unionListState =
               context.getOperatorStateStore().getUnionListState(stateDescriptor);
         if (context.isRestored()) {
            List<byte[]> collect =
                  StreamSupport.stream(unionListState.get().spliterator(), false)
                        .collect(Collectors.toList());
            LOG.info("union state Collect size: {}.", collect.size());
         }
      }

      @Override
      public void run(SourceContext<Integer> ctx) throws Exception {
         while (running) {
            synchronized (ctx.getCheckpointLock()) {
               ctx.collect(ThreadLocalRandom.current().nextInt());
            }
            Thread.sleep(100);
         }
      }

      @Override
      public void cancel() {
         running = false;
      }
   }
}

Best
Yun Tang

From: Stephan Ewen 
Sent: Saturday, May 16, 2020 18:56
To: dev 
Cc: Till Rohrmann ; Piotr Nowojski 
Subject: Re: [DISCUSS] increase "state.backend.fs.memory-threshold" from 1K to 
100K

Okay, thank you for all the feedback.

So we should definitely work on getting rid of the Union State, or at least
change the way it is implemented (avoid duplicate serializer snapshots).

Can you estimate from which size of the cluster on the JM heap usage
becomes critical (if we increased the threshold to 100k, or maybe 50k) ?


On Sat, May 16, 2020 at 8:10 AM Congxian Qiu  wrote:

> Hi,
>
> Overall, I agree with increasing this value. but the default value set to
> 100K maybe something too large from my side.
>
> I want to share some more information from my side.
>
> The small-files problem is indeed one that many users may encounter in
> production. The states (keyed state and operator state) can become small
> files in DFS, but increasing the value of `state.backend.fs.memory-threshold`
> may trigger the JM OOM problem Yun described previously.
> We've tried increasing this value in our production environment, but some
> connectors that use UnionState prevent us from doing so; the memory consumed
> by these jobs can be very large (in our case, with thousands of parallelism
> and `state.backend.fs.memory-threshold` set to 64K, a job consumes 10G+ of
> memory on the JM). So in the end, we used the solution proposed in
> FLINK-11937 [1] for both keyed state and operator state.
>
>
> [1] https://issues.apache.org/jira/browse/FLINK-11937
> Best,
> Congxian
>
>
> Yun Tang  于2020年5月15日周五 下午9:09写道:
>
> > Please correct me if I am wrong, "put the increased value into the
> default
> > configuration" means
> > we will 

[jira] [Created] (FLINK-17761) FutureCompletingBlockingQueue should have a capacity limit.

2020-05-16 Thread Jiangjie Qin (Jira)
Jiangjie Qin created FLINK-17761:


 Summary: FutureCompletingBlockingQueue should have a capacity 
limit.
 Key: FLINK-17761
 URL: https://issues.apache.org/jira/browse/FLINK-17761
 Project: Flink
  Issue Type: Sub-task
  Components: Connectors / Common
Reporter: Jiangjie Qin


Currently, FutureCompletingBlockingQueue does not have a capacity limit. It 
should have one.
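A minimal sketch of what a capacity limit could look like (a plain bounded buffer for illustration; the real FutureCompletingBlockingQueue API is future-based, and none of these names are Flink's): `put` blocks when the queue is full, so producers back-pressure instead of buffering unboundedly.

```java
import java.util.ArrayDeque;

public class BoundedQueueSketch<E> {
    private final ArrayDeque<E> elements = new ArrayDeque<>();
    private final int capacity;

    public BoundedQueueSketch(int capacity) {
        this.capacity = capacity;
    }

    // Blocks the producer (e.g. a fetcher thread) when the capacity limit
    // is reached, providing back-pressure instead of unbounded memory growth.
    public synchronized void put(E element) throws InterruptedException {
        while (elements.size() >= capacity) {
            wait();
        }
        elements.addLast(element);
        notifyAll(); // wake consumers waiting in take()
    }

    // Blocks the consumer until an element is available.
    public synchronized E take() throws InterruptedException {
        while (elements.isEmpty()) {
            wait();
        }
        E element = elements.pollFirst();
        notifyAll(); // wake producers waiting in put()
        return element;
    }

    public synchronized int size() {
        return elements.size();
    }
}
```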



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-17760) Rework tests to not rely on legacy scheduling logics in ExecutionGraph anymore

2020-05-16 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-17760:
---

 Summary: Rework tests to not rely on legacy scheduling logics in 
ExecutionGraph anymore
 Key: FLINK-17760
 URL: https://issues.apache.org/jira/browse/FLINK-17760
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Coordination, Tests
Reporter: Zhu Zhu


The legacy scheduling logic in ExecutionGraph was used by the legacy scheduler 
and is no longer in production use.
In order to remove the legacy scheduling logic, the tests that rely on it need 
to be reworked.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-17759) Remove RestartIndividualStrategy

2020-05-16 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-17759:
---

 Summary: Remove RestartIndividualStrategy
 Key: FLINK-17759
 URL: https://issues.apache.org/jira/browse/FLINK-17759
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Coordination
Reporter: Zhu Zhu


It was used by the legacy scheduler and is not used anymore.
Removing it can ease the work to further remove the legacy scheduling logics in 
ExecutionGraph.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-17758) Remove AdaptedRestartPipelinedRegionStrategyNG

2020-05-16 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-17758:
---

 Summary: Remove AdaptedRestartPipelinedRegionStrategyNG
 Key: FLINK-17758
 URL: https://issues.apache.org/jira/browse/FLINK-17758
 Project: Flink
  Issue Type: Sub-task
Reporter: Zhu Zhu


It was used by the legacy scheduler and is not used anymore.
Removing it can ease the work to further remove the legacy scheduling logics in 
ExecutionGraph.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] Guarantee that @PublicEvolving classes are API and binary compatible across bug fix releases (x.y.u and x.y.v)

2020-05-16 Thread Benchao Li
+1 (non-binding)

Piotr Nowojski  于2020年5月16日周六 下午8:10写道:

> +1
>
> Piotrek
>
> > On 16 May 2020, at 13:21, Yu Li  wrote:
> >
> > +1 (not sure whether my vote is binding, I guess yes since this is a
> > development-oriented vote?)
> >
> > Minor:
> > bq. This means that a version x.y.u is API and binary compatible to x.y.v
> > with u <= v wrt all @PublicEvolving classes.
> > I guess you mean "with u *>* v" to keep backward compatibility instead of
> > forward?
> >
> > Thanks.
> >
> > Best Regards,
> > Yu
> >
> >
> > On Sat, 16 May 2020 at 18:40, Jeff Zhang  wrote:
> >
> >> Definitely +1
> >>
> >> Dian Fu  于2020年5月16日周六 下午5:48写道:
> >>
> >>> +1 (non-binding)
> >>>
> >>> Regards,
> >>> Dian
> >>>
>  在 2020年5月16日,下午2:33,Congxian Qiu  写道:
> 
>  +1 (non-binding)
>  Best,
>  Congxian
> 
> 
>  Yangze Guo  于2020年5月16日周六 上午12:51写道:
> 
> > +1
> >
> > Best,
> > Yangze Guo
> >
> > On Sat, May 16, 2020 at 12:26 AM Yuan Mei 
> >>> wrote:
> >>
> >> +1
> >>
> >> On Sat, May 16, 2020 at 12:21 AM Ufuk Celebi 
> wrote:
> >>
> >>> +1
> >>>
> >>> – Ufuk
> >>>
> >>>
> >>> On Fri, May 15, 2020 at 4:54 PM Zhijiang <
> >> wangzhijiang...@aliyun.com
> >>> .invalid>
> >>> wrote:
> >>>
>  Sounds good, +1.
> 
>  Best,
>  Zhijiang
> 
> 
>  --
>  From:Thomas Weise 
>  Send Time:2020年5月15日(星期五) 21:33
>  To:dev 
>  Subject:Re: [VOTE] Guarantee that @PublicEvolving classes are API
> >> and
>  binary compatible across bug fix releases (x.y.u and x.y.v)
> 
>  +1
> 
> 
>  On Fri, May 15, 2020 at 6:15 AM Till Rohrmann <
> >> trohrm...@apache.org>
>  wrote:
> 
> > Dear community,
> >
> > with reference to the dev ML thread about guaranteeing API and
> > binary
> > compatibility for @PublicEvolving classes across bug fix releases
> > [1] I
> > would like to start a vote about it.
> >
> > The proposal is that the Flink community starts to guarantee
> > that @PublicEvolving classes will be API and binary compatible
> > across
> >>> bug
> > fix releases of the same minor version. This means that a version
> > x.y.u
>  is
> > API and binary compatible to x.y.v with u <= v wrt all
> > @PublicEvolving
> > classes.
> >
> > The voting options are the following:
> >
> > * +1, Provide the above described guarantees
> > * -1, Do not provide the above described guarantees (please
> >> provide
> > specific comments)
> >
> > The vote will be open for at least 72 hours. It is adopted by
> > majority
> > approval with at least 3 PMC affirmative votes.
> >
> > [1]
> >
> >
> 
> >>>
> >
> >>>
> >>
> https://lists.apache.org/thread.html/rb0d0f887b291a490ed3773352c90ddf5f11e3d882dc501e3b8cf0ed0%40%3Cdev.flink.apache.org%3E
> >
> > Cheers,
> > Till
> >
> 
> 
> >>>
> >
> >>>
> >>>
> >>
> >> --
> >> Best Regards
> >>
> >> Jeff Zhang
> >>
>
>

-- 

Benchao Li
School of Electronics Engineering and Computer Science, Peking University
Tel:+86-15650713730
Email: libenc...@gmail.com; libenc...@pku.edu.cn


[jira] [Created] (FLINK-17757) Implement format factory for Avro serialization and deserialization schema of RowData type

2020-05-16 Thread Danny Chen (Jira)
Danny Chen created FLINK-17757:
--

 Summary: Implement format factory for Avro serialization and 
deserialization schema of RowData type
 Key: FLINK-17757
 URL: https://issues.apache.org/jira/browse/FLINK-17757
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / API
Affects Versions: 1.11.0
Reporter: Danny Chen
 Fix For: 1.11.0








Re: Google Season of Docs

2020-05-16 Thread Marta Paes Moreira
Hi, Amr!

You can certainly apply for Google Season of Docs with the Flink project.
All you have to do is have a look at our project ideas [1], think about
what you'd be interested in working on and submit your application once the
application phase opens (June 9th) [2].

Let me know if there is anything else I can help you with in the process!

Marta

[1] https://flink.apache.org/news/2020/05/04/season-of-docs.html
[2]
https://developers.google.com/season-of-docs/docs/tech-writer-application-hints

On Sat, May 16, 2020 at 8:32 AM Amr Maghraby  wrote:

> Dear Apache Flink,
> My name is Amr Maghraby. I am a recent graduate of AAST college, ranked
> first in my class with a CGPA of 3.92. I joined the international ROV
> competition in the US and placed second worldwide, and last summer I was
> involved in Google Summer of Code 2019 and did good work there. I also
> participated in the ACM ACPC and Hash Code problem-solving competitions.
> I was asking if I could apply for GSoD?
> Waiting for your reply.
> Thanks,
> Amr Maghraby
>


Re: regarding Google Season of Docs

2020-05-16 Thread Marta Paes Moreira
Hi, Yuvraj!

Thanks for your interest in contributing to Flink during Google Season of
Docs!

You can start by reading FLIP-60 [1] and thinking about what areas of the
documentation you'd like to focus on. There is a lot of work to be done, so
there is some flexibility for you to choose topics that are in line with
your knowledge or interests. If after that you're still unsure, I can also
suggest some areas of the Table API / SQL docs that could be a good fit for
newcomers to the project.

Then, you can submit your application once the application phase opens
(June 9th) [2].

Let me know if this is what you were looking for!

[1]
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=127405685
[2]
https://developers.google.com/season-of-docs/docs/tech-writer-application-hints

On Sat, May 16, 2020 at 8:32 AM Yuvraj Manral 
wrote:

> Respected sir/madam,
>
> I came across the projects proposed by Apache Flink for Season of Docs
> 2020.
> I am new to the organisation but really liked the project ideas and would
> love to start contributing and prepare my proposal for Season of Docs.
>
> Please guide me through: where should I start, and how should I proceed?
> Thanking you in anticipation
>
> Yuvraj Manral 
> RSVP
>


[jira] [Created] (FLINK-17756) Drop table/view shouldn't affect each other

2020-05-16 Thread Kurt Young (Jira)
Kurt Young created FLINK-17756:
--

 Summary: Drop table/view shouldn't affect each other
 Key: FLINK-17756
 URL: https://issues.apache.org/jira/browse/FLINK-17756
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / API
Reporter: Kurt Young
Assignee: Kurt Young
 Fix For: 1.11.0


Currently "DROP VIEW" can successfully drop a table, and "DROP TABLE" can 
successfully drop a view. We should disallow this.





Re: [VOTE] Guarantee that @PublicEvolving classes are API and binary compatible across bug fix releases (x.y.u and x.y.v)

2020-05-16 Thread Piotr Nowojski
+1

Piotrek

> On 16 May 2020, at 13:21, Yu Li  wrote:
> 
> +1 (not sure whether my vote is binding, I guess yes since this is a
> development-oriented vote?)
> 
> Minor:
> bq. This means that a version x.y.u is API and binary compatible to x.y.v
> with u <= v wrt all @PublicEvolving classes.
> I guess you mean "with u *>* v" to keep backward compatibility instead of
> forward?
> 
> Thanks.
> 
> Best Regards,
> Yu
> 
> 
> On Sat, 16 May 2020 at 18:40, Jeff Zhang  wrote:
> 
>> Definitely +1
>> 
>> Dian Fu  于2020年5月16日周六 下午5:48写道:
>> 
>>> +1 (non-binding)
>>> 
>>> Regards,
>>> Dian
>>> 
 在 2020年5月16日,下午2:33,Congxian Qiu  写道:
 
 +1 (non-binding)
 Best,
 Congxian
 
 
 Yangze Guo  于2020年5月16日周六 上午12:51写道:
 
> +1
> 
> Best,
> Yangze Guo
> 
> On Sat, May 16, 2020 at 12:26 AM Yuan Mei 
>>> wrote:
>> 
>> +1
>> 
>> On Sat, May 16, 2020 at 12:21 AM Ufuk Celebi  wrote:
>> 
>>> +1
>>> 
>>> – Ufuk
>>> 
>>> 
>>> On Fri, May 15, 2020 at 4:54 PM Zhijiang <
>> wangzhijiang...@aliyun.com
>>> .invalid>
>>> wrote:
>>> 
 Sounds good, +1.
 
 Best,
 Zhijiang
 
 
 --
 From:Thomas Weise 
 Send Time:2020年5月15日(星期五) 21:33
 To:dev 
 Subject:Re: [VOTE] Guarantee that @PublicEvolving classes are API
>> and
 binary compatible across bug fix releases (x.y.u and x.y.v)
 
 +1
 
 
 On Fri, May 15, 2020 at 6:15 AM Till Rohrmann <
>> trohrm...@apache.org>
 wrote:
 
> Dear community,
> 
> with reference to the dev ML thread about guaranteeing API and
> binary
> compatibility for @PublicEvolving classes across bug fix releases
> [1] I
> would like to start a vote about it.
> 
> The proposal is that the Flink community starts to guarantee
> that @PublicEvolving classes will be API and binary compatible
> across
>>> bug
> fix releases of the same minor version. This means that a version
> x.y.u
 is
> API and binary compatible to x.y.v with u <= v wrt all
> @PublicEvolving
> classes.
> 
> The voting options are the following:
> 
> * +1, Provide the above described guarantees
> * -1, Do not provide the above described guarantees (please
>> provide
> specific comments)
> 
> The vote will be open for at least 72 hours. It is adopted by
> majority
> approval with at least 3 PMC affirmative votes.
> 
> [1]
> 
> 
 
>>> 
> 
>>> 
>> https://lists.apache.org/thread.html/rb0d0f887b291a490ed3773352c90ddf5f11e3d882dc501e3b8cf0ed0%40%3Cdev.flink.apache.org%3E
> 
> Cheers,
> Till
> 
 
 
>>> 
> 
>>> 
>>> 
>> 
>> --
>> Best Regards
>> 
>> Jeff Zhang
>> 



Re: Interested In Google Season of Docs!

2020-05-16 Thread Marta Paes Moreira
Hi, Roopal.

Thanks for reaching out, we're glad to see that you're interested in giving
the Flink docs a try!

To participate in Google Season of Docs (GSoD), you'll have to submit an
application once the application period opens (June 9th). Because FLIP-60
is very broad, the first step would be to think about what parts of it
you'd like to focus on — or, what you'd like to have as a project proposal
[1].

You can find more information about the whole process of participating in
GSoD in [2].

Let me know if I can help you with anything else or if you need support
with choosing your focus areas.

Marta

[1]
https://developers.google.com/season-of-docs/docs/tech-writer-application-hints#project_proposal
[2]
https://developers.google.com/season-of-docs/docs/tech-writer-application-hints

On Sat, May 16, 2020 at 8:32 AM Roopal Jain  wrote:

> Hello Flink Dev Community!
>
> I am interested in participating in Google Season of Docs for Apache Flink.
> I went through the FLIP-60 detailed proposal and thought this is something
> I could do well. I am currently working as a software engineer and have a
> B.E in Computer Engineering from one of India's reputed engineering
> colleges. I have prior open-source experience, including mentoring for
> Google Summer of Code and Google Code-In.
> I also have prior work experience with Apache Spark and a good grasp of
> SQL, Java, and Python.
> Could you please guide me on how to get started?
>
> Thanks & Regards,
> Roopal Jain
>


Re: [PROPOSAL] Google Season of Docs 2020.

2020-05-16 Thread Marta Paes Moreira
Thanks, Robert!

I made the wrong assumption that most people would be familiar with how
mailing lists work and should have been clearer in the announcement.

On Sat, May 16, 2020 at 8:35 AM Robert Metzger  wrote:

> FYI: I'm a moderator of the dev@ list, and we've received about 5 emails
> from applicants who were not subscribed to the list.
> Initially, I rejected their messages, asking them to subscribe and send the
> email again. None of them did.
> That's why I have now accepted the new applicant messages. However, this means that
> they won't receive email responses if we just reply to the list.
>
> tl;dr: Please use "Reply to all" and make sure the applicant's email
> address is included when responding to any of those applications. Thanks :)
>
> On Tue, May 12, 2020 at 11:28 AM Till Rohrmann 
> wrote:
>
> > This is great news :-) Thanks Marta for driving this effort!
> >
> > On Mon, May 11, 2020 at 4:22 PM Sivaprasanna 
> > wrote:
> >
> > > Awesome. Great job.
> > >
> > > On Mon, 11 May 2020 at 7:22 PM, Seth Wiesman 
> > wrote:
> > >
> > > > Thank you for putting this together Marta!
> > > >
> > > > On Mon, May 11, 2020 at 8:35 AM Fabian Hueske 
> > wrote:
> > > >
> > > > > Thanks Marta and congratulations!
> > > > >
> > > > > Am Mo., 11. Mai 2020 um 14:55 Uhr schrieb Robert Metzger <
> > > > > rmetz...@apache.org>:
> > > > >
> > > > > > Awesome :)
> > > > > > Thanks a lot for driving this Marta!
> > > > > >
> > > > > > Nice to see Flink (by virtue of having Apache as part of the
> name)
> > so
> > > > > high
> > > > > > on the list, with other good open source projects :)
> > > > > >
> > > > > >
> > > > > > On Mon, May 11, 2020 at 2:18 PM Marta Paes Moreira <
> > > > ma...@ververica.com>
> > > > > > wrote:
> > > > > >
> > > > > > > I'm happy to announce that we were *accepted* into this year's
> > > Google
> > > > > > > Season of Docs!
> > > > > > >
> > > > > > > The list of projects was published today [1]. The next step is
> > for
> > > > > > > technical writers to reach out (May 11th-June 8th) and apply
> > (June
> > > > > > 9th-July
> > > > > > > 9th).
> > > > > > >
> > > > > > > Thanks again to everyone involved! I'm really looking forward
> to
> > > > > kicking
> > > > > > > off the project in September.
> > > > > > >
> > > > > > > [1]
> > > https://developers.google.com/season-of-docs/docs/participants/
> > > > > > >
> > > > > > > Marta
> > > > > > >
> > > > > > > On Thu, Apr 30, 2020 at 5:14 PM Marta Paes Moreira <
> > > > > ma...@ververica.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > The application to Season of Docs 2020 is close to being
> > > finalized.
> > > > > > I've
> > > > > > > > created a PR with the application announcement for the Flink
> > blog
> > > > [1]
> > > > > > (as
> > > > > > > > required by Google OSS).
> > > > > > > >
> > > > > > > > Thanks a lot to everyone who pitched in — and special thanks
> to
> > > > > > Aljoscha
> > > > > > > > and Seth for volunteering as mentors!
> > > > > > > >
> > > > > > > > I'll send an update to this thread once the results are out
> > (May
> > > > > 11th).
> > > > > > > >
> > > > > > > > [1] https://github.com/apache/flink-web/pull/332
> > > > > > > >
> > > > > > > > On Mon, Apr 27, 2020 at 9:28 PM Seth Wiesman <
> > > sjwies...@gmail.com>
> > > > > > > wrote:
> > > > > > > >
> > > > > > > >> Hi Marta,
> > > > > > > >>
> > > > > > > >> I think this is a great idea, I'd be happy to help mentor a
> > > table
> > > > > > > >> documentation project.
> > > > > > > >>
> > > > > > > >> Seth
> > > > > > > >>
> > > > > > > >> On Thu, Apr 23, 2020 at 8:38 AM Marta Paes Moreira <
> > > > > > ma...@ververica.com
> > > > > > > >
> > > > > > > >> wrote:
> > > > > > > >>
> > > > > > > >> > Thanks for the feedback!
> > > > > > > >> >
> > > > > > > >> > So far, the projects on the table are:
> > > > > > > >> >
> > > > > > > >> >1. Improving the Table API/SQL documentation.
> > > > > > > >> >2. Improving the documentation about Deployments.
> > > > > > > >> >3. Restructuring and standardizing the documentation
> > about
> > > > > > > >> Connectors.
> > > > > > > >> >4. Finishing the Chinese translation.
> > > > > > > >> >
> > > > > > > >> > I think 2. would require a lot of technical knowledge
> about
> > > > Flink,
> > > > > > > which
> > > > > > > >> > might not be a good fit for GSoD (as discussed last year).
> > > > > > > >> >
> > > > > > > >> > As for mentors, we have:
> > > > > > > >> >
> > > > > > > >> >- Aljoscha (Table API/SQL)
> > > > > > > >> >- Till (Deployments)
> > > > > > > >> >- Stephan also said he'd be happy to participate as a
> > > mentor
> > > > if
> > > > > > > >> needed.
> > > > > > > >> >
> > > > > > > >> > For the translation project, I'm pulling in the people
> > > involved
> > > > in
> > > > > > > last
> > > > > > > >> > year's thread (Jark and Jincheng), as we would need two
> > > > > > > chinese-speaking
> > > > > > > >> > mentors.
> > > > > > > >> >
> > > > > > > >> > 

Re: [VOTE] Guarantee that @PublicEvolving classes are API and binary compatible across bug fix releases (x.y.u and x.y.v)

2020-05-16 Thread Yu Li
+1 (not sure whether my vote is binding, I guess yes since this is a
development-oriented vote?)

Minor:
bq. This means that a version x.y.u is API and binary compatible to x.y.v
with u <= v wrt all @PublicEvolving classes.
I guess you mean "with u *>* v" to keep backward compatibility instead of
forward?

Thanks.

Best Regards,
Yu


On Sat, 16 May 2020 at 18:40, Jeff Zhang  wrote:

> Definitely +1
>
> Dian Fu wrote on Sat, May 16, 2020, at 5:48 PM:
>
> > +1 (non-binding)
> >
> > Regards,
> > Dian
> >
> > > 在 2020年5月16日,下午2:33,Congxian Qiu  写道:
> > >
> > > +1 (non-binding)
> > > Best,
> > > Congxian
> > >
> > >
> > > Yangze Guo  于2020年5月16日周六 上午12:51写道:
> > >
> > >> +1
> > >>
> > >> Best,
> > >> Yangze Guo
> > >>
> > >> On Sat, May 16, 2020 at 12:26 AM Yuan Mei 
> > wrote:
> > >>>
> > >>> +1
> > >>>
> > >>> On Sat, May 16, 2020 at 12:21 AM Ufuk Celebi  wrote:
> > >>>
> >  +1
> > 
> >  – Ufuk
> > 
> > 
> >  On Fri, May 15, 2020 at 4:54 PM Zhijiang <
> wangzhijiang...@aliyun.com
> >  .invalid>
> >  wrote:
> > 
> > > Sounds good, +1.
> > >
> > > Best,
> > > Zhijiang
> > >
> > >
> > > --
> > > From:Thomas Weise 
> > > Send Time:2020年5月15日(星期五) 21:33
> > > To:dev 
> > > Subject:Re: [VOTE] Guarantee that @PublicEvolving classes are API
> and
> > > binary compatible across bug fix releases (x.y.u and x.y.v)
> > >
> > > +1
> > >
> > >
> > > On Fri, May 15, 2020 at 6:15 AM Till Rohrmann <
> trohrm...@apache.org>
> > > wrote:
> > >
> > >> Dear community,
> > >>
> > >> with reference to the dev ML thread about guaranteeing API and
> > >> binary
> > >> compatibility for @PublicEvolving classes across bug fix releases
> > >> [1] I
> > >> would like to start a vote about it.
> > >>
> > >> The proposal is that the Flink community starts to guarantee
> > >> that @PublicEvolving classes will be API and binary compatible
> > >> across
> >  bug
> > >> fix releases of the same minor version. This means that a version
> > >> x.y.u
> > > is
> > >> API and binary compatible to x.y.v with u <= v wrt all
> > >> @PublicEvolving
> > >> classes.
> > >>
> > >> The voting options are the following:
> > >>
> > >> * +1, Provide the above described guarantees
> > >> * -1, Do not provide the above described guarantees (please
> provide
> > >> specific comments)
> > >>
> > >> The vote will be open for at least 72 hours. It is adopted by
> > >> majority
> > >> approval with at least 3 PMC affirmative votes.
> > >>
> > >> [1]
> > >>
> > >>
> > >
> > 
> > >>
> >
> https://lists.apache.org/thread.html/rb0d0f887b291a490ed3773352c90ddf5f11e3d882dc501e3b8cf0ed0%40%3Cdev.flink.apache.org%3E
> > >>
> > >> Cheers,
> > >> Till
> > >>
> > >
> > >
> > 
> > >>
> >
> >
>
> --
> Best Regards
>
> Jeff Zhang
>


Running individual test files taking too long

2020-05-16 Thread Manish G
Hi,

I am trying to run individual unit test files, and every time it spends a
lot of time (almost 4-5 minutes) on some build process, and then I get an
error like this:

Error running 'FlinkKafkaConsumerBaseTest': Command line is too long

I recently switched from IntelliJ Ultimate to CE. Can that be the cause?

With regards
Manish
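
The "Command line is too long" error usually means the test's classpath exceeds the OS command-line length limit; it is not specific to Ultimate vs. CE. A commonly cited workaround is to set "Shorten command line" to "JAR manifest" in the run configuration, or to enable the dynamic-classpath flag in `.idea/workspace.xml`. The sketch below shows the property as it appears in common IntelliJ setups; verify the component/property names against your IDE version:

```xml
<!-- .idea/workspace.xml: inside the <project> element -->
<component name="PropertiesComponent">
  <!-- Pass long classpaths via a temporary file instead of the command line -->
  <property name="dynamic.classpath" value="true" />
</component>
```

After changing this, re-create (or edit) the run configuration so the new setting takes effect.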


Re: [DISCUSS] increase "state.backend.fs.memory-threshold" from 1K to 100K

2020-05-16 Thread Stephan Ewen
Okay, thank you for all the feedback.

So we should definitely work on getting rid of the Union State, or at least
change the way it is implemented (avoid duplicate serializer snapshots).

Can you estimate from which size of the cluster on the JM heap usage
becomes critical (if we increased the threshold to 100k, or maybe 50k) ?


On Sat, May 16, 2020 at 8:10 AM Congxian Qiu  wrote:

> Hi,
>
> Overall, I agree with increasing this value, but a default of 100K may be
> too large in my view.
>
> I want to share some more information from my side.
>
> The small-files problem is indeed one that many users encounter in
> production. The states (keyed state and operator state) can become small
> files in DFS, but increasing the value of `state.backend.fs.memory-threshold`
> may lead to the JM OOM problem, as Yun said previously.
> We've tried increasing this value in our production environment, but some
> connectors which use UnionState prevented us from doing so; the memory
> consumed by these jobs can be very large (in our case, with thousands of
> parallelism and `state.backend.fs.memory-threshold` set to 64K, the JM
> consumes 10G+ memory), so in the end we used the solution proposed in
> FLINK-11937 [1] for both keyed state and operator state.
>
>
> [1] https://issues.apache.org/jira/browse/FLINK-11937
> Best,
> Congxian
>
>
> Yun Tang wrote on Fri, May 15, 2020, at 9:09 PM:
>
> > Please correct me if I am wrong, "put the increased value into the
> default
> > configuration" means
> > we will update that in default flink-conf.yaml but still leave the
> default
> > value of `state.backend.fs.memory-threshold` as previously?
> > It seems I did not get the point why existing setups with existing
> configs
> > will not be affected.
> >
> > The concern I raised is because one of our large-scale jobs with a
> > 1024-parallelism source using union state met the JM OOM problem when we
> > increased this value.
> > I think if we introduce memory control when serializing TDD
> asynchronously
> > [1], we could be much more confident to increase this configuration as
> the
> > memory footprint
> > expands at that time by a lot of serialized TDDs.
> >
> >
> > [1]
> >
> https://github.com/apache/flink/blob/32bd0944d0519093c0a4d5d809c6636eb3a7fc31/flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/Execution.java#L752
> >
> > Best
> > Yun Tang
> >
> > 
> > From: Stephan Ewen 
> > Sent: Friday, May 15, 2020 16:53
> > To: dev 
> > Cc: Till Rohrmann ; Piotr Nowojski <
> > pi...@ververica.com>
> > Subject: Re: [DISCUSS] increase "state.backend.fs.memory-threshold" from
> > 1K to 100K
> >
> > I see, thanks for all the input.
> >
> > I agree with Yun Tang that the use of UnionState is problematic and can
> > cause issues in conjunction with this.
> > However, most of the large-scale users I know that also struggle with
> > UnionState have also increased this threshold, because with this low
> > threshold, they get an excess number of small files and overwhelm their
> > HDFS / S3 / etc.
> >
> > An intermediate solution could be to put the increased value into the
> > default configuration. That way, existing setups with existing configs
> will
> > not be affected, but new users / installations will have a simper time.
> >
> > Best,
> > Stephan
> >
> >
> > On Thu, May 14, 2020 at 9:20 PM Yun Tang  wrote:
> >
> > > I tend not to be in favor of this proposal, as union state is somewhat
> > > abused in several popular source connectors (e.g. Kafka), and increasing
> > > this value could lead to JM OOM when sending TDDs from JM to TMs with
> > > large parallelism.
> > >
> > > After we collect union state and initialize the map list [1], we
> already
> > > have union state ready to assign. At this time, the memory footprint
> has
> > > not increase too much as the union state which shared across tasks have
> > the
> > > same reference of ByteStreamStateHandle. However, when we send tdd with
> > the
> > > taskRestore to TMs, akka will serialize those ByteStreamStateHandle
> > within
> > > tdd to increases the memory footprint. If the source have 1024
> > > parallelisms, and any one of the sub-task would then have 1024*100KB
> size
> > > state handles. The sum of total memory footprint cannot be ignored.
> > >
> > > If we plan to increase the default value of
> > > state.backend.fs.memory-threshold, we should first resolve the above
> > case.
> > > In other words, this proposal could be a trade-off, which benefit
> perhaps
> > > 99% users, but might bring harmful effects to 1% user with large-scale
> > > flink jobs.
> > >
> > >
> > > [1]
> > >
> >
> https://github.com/apache/flink/blob/c1ea6fcfd05c72a68739bda8bd16a2d1c15522c0/flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/RoundRobinOperatorStateRepartitioner.java#L64-L87
> > >
> > > Best
> > > Yun Tang
> > >
> > >
> > > 
> > > From: Yu Li 
> > > Sent: Thursday, May 14, 2020 23:51
> > > To: Till Rohrmann 

[jira] [Created] (FLINK-17755) Expiring states should be given as side output

2020-05-16 Thread Roey Shem Tov (Jira)
Roey Shem Tov created FLINK-17755:
-

 Summary: Expiring states should be given as side output
 Key: FLINK-17755
 URL: https://issues.apache.org/jira/browse/FLINK-17755
 Project: Flink
  Issue Type: New Feature
  Components: API / State Processor
Reporter: Roey Shem Tov
 Fix For: 2.0.0, 1.12.0


When we set a StateTtlConfig on a StateDescriptor, an expiring record is 
deleted from the StateBackend.

I want to suggest a new feature: expose expiring records as a side output, 
so that we can process them instead of just deleting them.

For example, if we have a ListState with TTL enabled, we could receive the 
expiring records of the list as a side output.

What do you think?
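
The proposed semantics can be illustrated with a minimal pure-JDK sketch (no Flink APIs; every class and method name below is illustrative, not part of any Flink interface): on cleanup, an expired record is routed to a side-output collection instead of being discarded.

```java
import java.util.*;

// Toy model of TTL state that emits expired records to a "side output"
// instead of silently dropping them. All names here are hypothetical.
public class TtlSideOutputSketch {
    private final long ttlMillis;
    private final Map<String, Long> writeTime = new HashMap<>();
    private final Map<String, String> values = new HashMap<>();
    private final List<String> sideOutput = new ArrayList<>();

    TtlSideOutputSketch(long ttlMillis) { this.ttlMillis = ttlMillis; }

    void put(String key, String value, long now) {
        values.put(key, value);
        writeTime.put(key, now);
    }

    // On cleanup, expired records go to the side output rather than vanishing.
    void cleanup(long now) {
        Iterator<Map.Entry<String, Long>> it = writeTime.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> e = it.next();
            if (now - e.getValue() >= ttlMillis) {
                sideOutput.add(values.remove(e.getKey()));
                it.remove();
            }
        }
    }

    List<String> sideOutput() { return sideOutput; }

    public static void main(String[] args) {
        TtlSideOutputSketch s = new TtlSideOutputSketch(1000);
        s.put("a", "first", 0);
        s.put("b", "second", 500);
        s.cleanup(1000);                    // "a" has expired, "b" has not
        System.out.println(s.sideOutput()); // prints [first]
    }
}
```

In Flink terms, the `sideOutput` list would correspond to something like an `OutputTag` stream that downstream operators could consume.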





Re: [VOTE] Guarantee that @PublicEvolving classes are API and binary compatible across bug fix releases (x.y.u and x.y.v)

2020-05-16 Thread Jeff Zhang
Definitely +1

Dian Fu wrote on Sat, May 16, 2020, at 5:48 PM:

> +1 (non-binding)
>
> Regards,
> Dian
>
> > 在 2020年5月16日,下午2:33,Congxian Qiu  写道:
> >
> > +1 (non-binding)
> > Best,
> > Congxian
> >
> >
> > Yangze Guo  于2020年5月16日周六 上午12:51写道:
> >
> >> +1
> >>
> >> Best,
> >> Yangze Guo
> >>
> >> On Sat, May 16, 2020 at 12:26 AM Yuan Mei 
> wrote:
> >>>
> >>> +1
> >>>
> >>> On Sat, May 16, 2020 at 12:21 AM Ufuk Celebi  wrote:
> >>>
>  +1
> 
>  – Ufuk
> 
> 
>  On Fri, May 15, 2020 at 4:54 PM Zhijiang   .invalid>
>  wrote:
> 
> > Sounds good, +1.
> >
> > Best,
> > Zhijiang
> >
> >
> > --
> > From:Thomas Weise 
> > Send Time:2020年5月15日(星期五) 21:33
> > To:dev 
> > Subject:Re: [VOTE] Guarantee that @PublicEvolving classes are API and
> > binary compatible across bug fix releases (x.y.u and x.y.v)
> >
> > +1
> >
> >
> > On Fri, May 15, 2020 at 6:15 AM Till Rohrmann 
> > wrote:
> >
> >> Dear community,
> >>
> >> with reference to the dev ML thread about guaranteeing API and
> >> binary
> >> compatibility for @PublicEvolving classes across bug fix releases
> >> [1] I
> >> would like to start a vote about it.
> >>
> >> The proposal is that the Flink community starts to guarantee
> >> that @PublicEvolving classes will be API and binary compatible
> >> across
>  bug
> >> fix releases of the same minor version. This means that a version
> >> x.y.u
> > is
> >> API and binary compatible to x.y.v with u <= v wrt all
> >> @PublicEvolving
> >> classes.
> >>
> >> The voting options are the following:
> >>
> >> * +1, Provide the above described guarantees
> >> * -1, Do not provide the above described guarantees (please provide
> >> specific comments)
> >>
> >> The vote will be open for at least 72 hours. It is adopted by
> >> majority
> >> approval with at least 3 PMC affirmative votes.
> >>
> >> [1]
> >>
> >>
> >
> 
> >>
> https://lists.apache.org/thread.html/rb0d0f887b291a490ed3773352c90ddf5f11e3d882dc501e3b8cf0ed0%40%3Cdev.flink.apache.org%3E
> >>
> >> Cheers,
> >> Till
> >>
> >
> >
> 
> >>
>
>

-- 
Best Regards

Jeff Zhang


[jira] [Created] (FLINK-17754) Walkthrough Table Java nightly end-to-end test failed to compile

2020-05-16 Thread Piotr Nowojski (Jira)
Piotr Nowojski created FLINK-17754:
--

 Summary: Walkthrough Table Java nightly end-to-end test failed to 
compile
 Key: FLINK-17754
 URL: https://issues.apache.org/jira/browse/FLINK-17754
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / API, Table SQL / Planner, Tests
Reporter: Piotr Nowojski
 Fix For: 1.11.0


https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=1490=logs=c88eea3b-64a0-564d-0031-9fdcd7b8abee=1e2bbe5b-4657-50be-1f07-d84bfce5b1f5

{noformat}
[ERROR] COMPILATION ERROR : 
[INFO] -
[ERROR] 
/home/vsts/work/1/s/flink-end-to-end-tests/test-scripts/temp-test-directory-41325742846/flink-walkthrough-table-java/src/main/java/org/apache/flink/walkthrough/SpendReport.java:[35,21]
 cannot find symbol
  symbol:   method 
registerTableSource(java.lang.String,org.apache.flink.walkthrough.common.table.BoundedTransactionTableSource)
  location: variable tEnv of type 
org.apache.flink.table.api.java.BatchTableEnvironment
[ERROR] 
/home/vsts/work/1/s/flink-end-to-end-tests/test-scripts/temp-test-directory-41325742846/flink-walkthrough-table-java/src/main/java/org/apache/flink/walkthrough/SpendReport.java:[36,21]
 cannot find symbol
  symbol:   method 
registerTableSink(java.lang.String,org.apache.flink.walkthrough.common.table.SpendReportTableSink)
  location: variable tEnv of type 
org.apache.flink.table.api.java.BatchTableEnvironment

(...)

[FAIL] 'Walkthrough Table Java nightly end-to-end test' failed after 0 minutes 
and 5 seconds! Test exited with exit code 1
{noformat}








Re: [DISCUSS] Exact feature freeze date

2020-05-16 Thread Dian Fu
+1 for Monday morning in Europe.

Regards,
Dian

> On May 16, 2020, at 1:51 PM, Congxian Qiu wrote:
> 
> +1 for Monday morning in Europe.
> Best,
> Congxian
> 
> 
> Danny Chan wrote on Sat, May 16, 2020, at 10:40 AM:
> 
>> +1 for Monday morning in Europe.
>> 
>> Best,
>> Danny Chan
>> On May 15, 2020, at 10:51 PM +0800, Zhijiang wrote:
>>> +1 for Monday morning in Europe.
>>> 
>>> Best,
>>> Zhijiang
>>> 
>>> 
>>> --
>>> From:Yun Tang 
>>> Send Time:2020年5月15日(星期五) 21:58
>>> To:dev 
>>> Subject:Re: [DISCUSS] Exact feature freeze date
>>> 
>>> +1 for Monday morning in Europe.
>>> 
>>> Best
>>> Yun Tang
>>> 
>>> From: Jingsong Li 
>>> Sent: Friday, May 15, 2020 21:17
>>> To: dev 
>>> Subject: Re: [DISCUSS] Exact feature freeze date
>>> 
>>> +1 for Monday morning.
>>> 
>>> Best,
>>> Jingsong Lee
>>> 
>>> On Fri, May 15, 2020 at 8:45 PM Till Rohrmann 
>> wrote:
>>> 
 +1 from my side extend the feature freeze until Monday morning.
 
 Cheers,
 Till
 
 On Fri, May 15, 2020 at 2:04 PM Robert Metzger 
 wrote:
 
> I'm okay, but I would suggest to agree on a time of day. What about
 Monday
> morning in Europe?
> 
> On Fri, May 15, 2020 at 1:43 PM Piotr Nowojski 
> wrote:
> 
>> Hi,
>> 
>> Couple of contributors asked for extending cutting the release
>> branch
>> until Monday, what do you think about such extension?
>> 
>> (+1 from my side)
>> 
>> Piotrek
>> 
>>> On 25 Apr 2020, at 21:24, Yu Li  wrote:
>>> 
>>> +1 for extending the feature freeze to May 15th.
>>> 
>>> Best Regards,
>>> Yu
>>> 
>>> 
>>> On Fri, 24 Apr 2020 at 14:43, Yuan Mei 
 wrote:
>>> 
 +1
 
 On Thu, Apr 23, 2020 at 4:10 PM Stephan Ewen >> 
> wrote:
 
> Hi all!
> 
> I want to bring up a discussion about when we want to do the
 feature
 freeze
> for 1.11.
> 
> When kicking off the release cycle, we tentatively set the
>> date to
> end
>> of
> April, which would be in one week.
> 
> I can say from the features I am involved with (FLIP-27,
>> FLIP-115,
> reviewing some state backend improvements, etc.) that it
>> would be
>> helpful
> to have two additional weeks.
> 
> When looking at various other feature threads, my feeling is
>> that
> there
 are
> more contributors and committers that could use a few more
>> days.
> The last two months were quite exceptional in and we did
>> lose a bit
> of
> development speed here and there.
> 
> How do you think about making *May 15th* the feature freeze?
> 
> Best,
> Stephan
> 
 
>> 
>> 
> 
 
>>> 
>>> 
>>> --
>>> Best, Jingsong Lee
>>> 
>> 



Re: [VOTE] Guarantee that @PublicEvolving classes are API and binary compatible across bug fix releases (x.y.u and x.y.v)

2020-05-16 Thread Dian Fu
+1 (non-binding)

Regards,
Dian

> On May 16, 2020, at 2:33 PM, Congxian Qiu wrote:
> 
> +1 (non-binding)
> Best,
> Congxian
> 
> 
> Yangze Guo  于2020年5月16日周六 上午12:51写道:
> 
>> +1
>> 
>> Best,
>> Yangze Guo
>> 
>> On Sat, May 16, 2020 at 12:26 AM Yuan Mei  wrote:
>>> 
>>> +1
>>> 
>>> On Sat, May 16, 2020 at 12:21 AM Ufuk Celebi  wrote:
>>> 
 +1
 
 – Ufuk
 
 
 On Fri, May 15, 2020 at 4:54 PM Zhijiang >>> .invalid>
 wrote:
 
> Sounds good, +1.
> 
> Best,
> Zhijiang
> 
> 
> --
> From:Thomas Weise 
> Send Time:2020年5月15日(星期五) 21:33
> To:dev 
> Subject:Re: [VOTE] Guarantee that @PublicEvolving classes are API and
> binary compatible across bug fix releases (x.y.u and x.y.v)
> 
> +1
> 
> 
> On Fri, May 15, 2020 at 6:15 AM Till Rohrmann 
> wrote:
> 
>> Dear community,
>> 
>> with reference to the dev ML thread about guaranteeing API and
>> binary
>> compatibility for @PublicEvolving classes across bug fix releases
>> [1] I
>> would like to start a vote about it.
>> 
>> The proposal is that the Flink community starts to guarantee
>> that @PublicEvolving classes will be API and binary compatible
>> across
 bug
>> fix releases of the same minor version. This means that a version
>> x.y.u
> is
>> API and binary compatible to x.y.v with u <= v wrt all
>> @PublicEvolving
>> classes.
>> 
>> The voting options are the following:
>> 
>> * +1, Provide the above described guarantees
>> * -1, Do not provide the above described guarantees (please provide
>> specific comments)
>> 
>> The vote will be open for at least 72 hours. It is adopted by
>> majority
>> approval with at least 3 PMC affirmative votes.
>> 
>> [1]
>> 
>> 
> 
 
>> https://lists.apache.org/thread.html/rb0d0f887b291a490ed3773352c90ddf5f11e3d882dc501e3b8cf0ed0%40%3Cdev.flink.apache.org%3E
>> 
>> Cheers,
>> Till
>> 
> 
> 
 
>> 



[jira] [Created] (FLINK-17753) watermark defined in ddl does not work in Table api

2020-05-16 Thread godfrey he (Jira)
godfrey he created FLINK-17753:
--

 Summary: watermark defined in ddl does not work in Table api
 Key: FLINK-17753
 URL: https://issues.apache.org/jira/browse/FLINK-17753
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Reporter: godfrey he
 Fix For: 1.11.0


The following code fails with {{org.apache.flink.table.api.ValidationException: A 
group window expects a time attribute for grouping in a stream environment.}}

{code:java}
@Test
  def testRowTimeTableSourceGroupWindow(): Unit = {
val ddl =
  s"""
 |CREATE TABLE rowTimeT (
 |  id int,
 |  rowtime timestamp(3),
 |  val bigint,
 |  name varchar(32),
 |  watermark for rowtime as rowtime
 |) WITH (
 |  'connector' = 'projectable-values',
 |  'bounded' = 'false'
 |)
   """.stripMargin
util.tableEnv.executeSql(ddl)

val t = util.tableEnv.from("rowTimeT")
  .where($"val" > 100)
  .window(Tumble over 10.minutes on 'rowtime as 'w)
  .groupBy('name, 'w)
  .select('name, 'w.end, 'val.avg)
util.verifyPlan(t)
  }
{code}

The reason is that the planner does not convert the {{watermarkSpecs}} in 
{{TableSchema}} to the correct type when calling {{tableEnv.from}}.




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-17752) Align the timestamp format with Flink SQL types in JSON format

2020-05-16 Thread Jark Wu (Jira)
Jark Wu created FLINK-17752:
---

 Summary: Align the timestamp format with Flink SQL types in JSON 
format
 Key: FLINK-17752
 URL: https://issues.apache.org/jira/browse/FLINK-17752
 Project: Flink
  Issue Type: Sub-task
  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
Reporter: Jark Wu
 Fix For: 1.11.0


Currently, we are using RFC3339_TIMESTAMP_FORMAT (which appends the timezone 
to the end of the string) as the timestamp format in JSON. However, the string 
representation of {{TIMESTAMP (WITHOUT TIME ZONE)}} shouldn't have a 'Z' 
appended at the end. 
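The distinction can be illustrated with plain {{java.time}} (this is only an illustration of the intended string shapes, not the actual format code used in the Flink JSON format):

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

public class TimestampFormats {
    public static void main(String[] args) {
        LocalDateTime ts = LocalDateTime.of(2020, 5, 16, 12, 30, 45);

        // RFC 3339-style rendering carries an offset suffix ('Z' for UTC);
        // this matches TIMESTAMP WITH (LOCAL) TIME ZONE semantics.
        String withOffset = ts.atOffset(ZoneOffset.UTC)
                .format(DateTimeFormatter.ISO_OFFSET_DATE_TIME);
        System.out.println(withOffset); // 2020-05-16T12:30:45Z

        // TIMESTAMP (WITHOUT TIME ZONE) denotes no absolute instant, so its
        // string form should carry no offset suffix at all.
        String local = ts.format(DateTimeFormatter.ISO_LOCAL_DATE_TIME);
        System.out.println(local);      // 2020-05-16T12:30:45
    }
}
```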





[jira] [Created] (FLINK-17751) proctime defined in ddl can't work with over window in Table api

2020-05-16 Thread godfrey he (Jira)
godfrey he created FLINK-17751:
--

 Summary: proctime defined in ddl can't work with over window in 
Table api
 Key: FLINK-17751
 URL: https://issues.apache.org/jira/browse/FLINK-17751
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Reporter: godfrey he
 Fix For: 1.11.0


The following test fails with {{org.apache.flink.table.api.ValidationException: 
Ordering must be defined on a time attribute.}}
{code:scala}
  @Test
  def testProcTimeTableSourceOverWindow(): Unit = {
val ddl =
  s"""
 |CREATE TABLE procTimeT (
 |  id int,
 |  val bigint,
 |  name varchar(32),
 |  proctime as PROCTIME()
 |) WITH (
 |  'connector' = 'projectable-values',
 |  'bounded' = 'false'
 |)
   """.stripMargin
util.tableEnv.executeSql(ddl)

val t = util.tableEnv.from("procTimeT")
  .window(Over partitionBy 'id orderBy 'proctime preceding 2.hours as 'w)
  .select('id, 'name, 'val.sum over 'w as 'valSum)
  .filter('valSum > 100)
util.verifyPlan(t)
  }
{code}

The reason is that the type of proctime is {{TIMESTAMP(3) NOT NULL}}, while 
{{LegacyTypeInfoDataTypeConverter}} does not handle the mapping between 
{{Types.LOCAL_DATE_TIME}} and the NOT NULL variant of {{DataTypes.TIMESTAMP(3)}}. 






[jira] [Created] (FLINK-17750) YARNHighAvailabilityITCase.testKillYarnSessionClusterEntrypoint failed on azure

2020-05-16 Thread Roman Khachatryan (Jira)
Roman Khachatryan created FLINK-17750:
-

 Summary: 
YARNHighAvailabilityITCase.testKillYarnSessionClusterEntrypoint failed on azure
 Key: FLINK-17750
 URL: https://issues.apache.org/jira/browse/FLINK-17750
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Coordination
Affects Versions: 1.11.0
Reporter: Roman Khachatryan


[https://dev.azure.com/khachatryanroman/810e80cc-0656-4d3c-9d8c-186764456a01/_apis/build/builds/6/logs/156]

 
{code:java}
2020-05-15T23:42:29.5307581Z [ERROR] 
testKillYarnSessionClusterEntrypoint(org.apache.flink.yarn.YARNHighAvailabilityITCase)
 Time elapsed: 21.68 s <<< ERROR! 2020-05-15T23:42:29.5308406Z 
java.util.concurrent.ExecutionException: 2020-05-15T23:42:29.5308864Z 
org.apache.flink.runtime.rest.util.RestClientException: [Internal server 
error., ] 
2020-05-15T23:42:29.5345940Z at 
java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) 
2020-05-15T23:42:29.5346289Z at 
java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908) 
2020-05-15T23:42:29.5346680Z at 
org.apache.flink.yarn.YARNHighAvailabilityITCase.lambda$waitUntilJobIsRunning$4(YARNHighAvailabilityITCase.java:312)
 2020-05-15T23:42:29.5347126Z at 
org.apache.flink.runtime.testutils.CommonTestUtils.waitUntilCondition(CommonTestUtils.java:123)
 2020-05-15T23:42:29.5347520Z at 
org.apache.flink.runtime.testutils.CommonTestUtils.waitUntilCondition(CommonTestUtils.java:119)
 2020-05-15T23:42:29.5347950Z at 
org.apache.flink.yarn.YARNHighAvailabilityITCase.waitUntilJobIsRunning(YARNHighAvailabilityITCase.java:310)
2020-05-15T23:42:29.5348430Z at 
org.apache.flink.yarn.YARNHighAvailabilityITCase.lambda$testKillYarnSessionClusterEntrypoint$0(YARNHighAvailabilityITCase.java:168)
 2020-05-15T23:42:29.5348835Z at 
org.apache.flink.yarn.YarnTestBase.runTest(YarnTestBase.java:258) 
2020-05-15T23:42:29.5349420Z at 
org.apache.flink.yarn.YARNHighAvailabilityITCase.testKillYarnSessionClusterEntrypoint(YARNHighAvailabilityITCase.java:156)
 2020-05-15T23:42:29.5349984Z at 
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
2020-05-15T23:42:29.5350385Z at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
2020-05-15T23:42:29.5350772Z at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 2020-05-15T23:42:29.5351125Z at 
java.lang.reflect.Method.invoke(Method.java:498) 2020-05-15T23:42:29.5351459Z 
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
 2020-05-15T23:42:29.5351868Z at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
 2020-05-15T23:42:29.5352274Z at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
 2020-05-15T23:42:29.5352661Z at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
 2020-05-15T23:42:29.5353219Z at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) 
2020-05-15T23:42:29.5353559Z at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) 
2020-05-15T23:42:29.5353886Z at 
org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55) 
2020-05-15T23:42:29.5354163Z at 
org.junit.rules.RunRules.evaluate(RunRules.java:20) 
2020-05-15T23:42:29.5354454Z at 
org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) 
2020-05-15T23:42:29.5354776Z at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
 2020-05-15T23:42:29.5355144Z at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
 2020-05-15T23:42:29.5355459Z at 
org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) 
2020-05-15T23:42:29.5355764Z at 
org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) 
2020-05-15T23:42:29.5356062Z at 
org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) 
2020-05-15T23:42:29.5356382Z at 
org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) 
2020-05-15T23:42:29.5356681Z at 
org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) 
2020-05-15T23:42:29.5357068Z at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) 
2020-05-15T23:42:29.5357432Z at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) 
2020-05-15T23:42:29.5357757Z at 
org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48) 
2020-05-15T23:42:29.5358088Z at 
org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48) 
2020-05-15T23:42:29.5358402Z at 
org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48) 
2020-05-15T23:42:29.5358705Z at 
org.junit.rules.RunRules.evaluate(RunRules.java:20) 
2020-05-15T23:42:29.5358976Z at 
org.junit.runners.ParentRunner.run(ParentRunner.java:363) 
2020-05-15T23:42:29.5359307Z at 

Re: [PROPOSAL] Google Season of Docs 2020.

2020-05-16 Thread Robert Metzger
FYI: I'm a moderator of the dev@ list, and we've received about 5 emails
from applicants that were not subscribed to the list.
Initially, I rejected their messages, asking them to subscribe and send the
email again. None of them has done so.
That's why I accepted new applicant messages now. However, this means that
they won't receive email responses if we just reply to the list.

tl;dr: Please use "Reply to all" and make sure the applicant's email
address is included when responding to any of those applications. Thanks :)

On Tue, May 12, 2020 at 11:28 AM Till Rohrmann  wrote:

> This is great news :-) Thanks Marta for driving this effort!
>
> On Mon, May 11, 2020 at 4:22 PM Sivaprasanna 
> wrote:
>
> > Awesome. Great job.
> >
> > On Mon, 11 May 2020 at 7:22 PM, Seth Wiesman 
> wrote:
> >
> > > Thank you for putting this together Marta!
> > >
> > > On Mon, May 11, 2020 at 8:35 AM Fabian Hueske 
> wrote:
> > >
> > > > Thanks Marta and congratulations!
> > > >
> > > > On Mon, May 11, 2020 at 14:55, Robert Metzger <
> > > > rmetz...@apache.org>:
> > > >
> > > > > Awesome :)
> > > > > Thanks a lot for driving this Marta!
> > > > >
> > > > > Nice to see Flink (by virtue of having Apache as part of the name)
> so
> > > > high
> > > > > on the list, with other good open source projects :)
> > > > >
> > > > >
> > > > > On Mon, May 11, 2020 at 2:18 PM Marta Paes Moreira <
> > > ma...@ververica.com>
> > > > > wrote:
> > > > >
> > > > > > I'm happy to announce that we were *accepted* into this year's
> > Google
> > > > > > Season of Docs!
> > > > > >
> > > > > > The list of projects was published today [1]. The next step is
> for
> > > > > > technical writers to reach out (May 11th-June 8th) and apply
> (June
> > > > > 9th-July
> > > > > > 9th).
> > > > > >
> > > > > > Thanks again to everyone involved! I'm really looking forward to
> > > > kicking
> > > > > > off the project in September.
> > > > > >
> > > > > > [1]
> > https://developers.google.com/season-of-docs/docs/participants/
> > > > > >
> > > > > > Marta
> > > > > >
> > > > > > On Thu, Apr 30, 2020 at 5:14 PM Marta Paes Moreira <
> > > > ma...@ververica.com>
> > > > > > wrote:
> > > > > >
> > > > > > > The application to Season of Docs 2020 is close to being
> > finalized.
> > > > > I've
> > > > > > > created a PR with the application announcement for the Flink
> blog
> > > [1]
> > > > > (as
> > > > > > > required by Google OSS).
> > > > > > >
> > > > > > > Thanks a lot to everyone who pitched in — and special thanks to
> > > > > Aljoscha
> > > > > > > and Seth for volunteering as mentors!
> > > > > > >
> > > > > > > I'll send an update to this thread once the results are out
> (May
> > > > 11th).
> > > > > > >
> > > > > > > [1] https://github.com/apache/flink-web/pull/332
> > > > > > >
> > > > > > > On Mon, Apr 27, 2020 at 9:28 PM Seth Wiesman <
> > sjwies...@gmail.com>
> > > > > > wrote:
> > > > > > >
> > > > > > >> Hi Marta,
> > > > > > >>
> > > > > > >> I think this is a great idea, I'd be happy to help mentor a
> > table
> > > > > > >> documentation project.
> > > > > > >>
> > > > > > >> Seth
> > > > > > >>
> > > > > > >> On Thu, Apr 23, 2020 at 8:38 AM Marta Paes Moreira <
> > > > > ma...@ververica.com
> > > > > > >
> > > > > > >> wrote:
> > > > > > >>
> > > > > > >> > Thanks for the feedback!
> > > > > > >> >
> > > > > > >> > So far, the projects on the table are:
> > > > > > >> >
> > > > > > >> >1. Improving the Table API/SQL documentation.
> > > > > > >> >2. Improving the documentation about Deployments.
> > > > > > >> >3. Restructuring and standardizing the documentation
> about
> > > > > > >> Connectors.
> > > > > > >> >4. Finishing the Chinese translation.
> > > > > > >> >
> > > > > > >> > I think 2. would require a lot of technical knowledge about
> > > Flink,
> > > > > > which
> > > > > > >> > might not be a good fit for GSoD (as discussed last year).
> > > > > > >> >
> > > > > > >> > As for mentors, we have:
> > > > > > >> >
> > > > > > >> >- Aljoscha (Table API/SQL)
> > > > > > >> >- Till (Deployments)
> > > > > > >> >- Stephan also said he'd be happy to participate as a
> > mentor
> > > if
> > > > > > >> needed.
> > > > > > >> >
> > > > > > >> > For the translation project, I'm pulling in the people
> > involved
> > > in
> > > > > > last
> > > > > > >> > year's thread (Jark and Jincheng), as we would need two
> > > > > > chinese-speaking
> > > > > > >> > mentors.
> > > > > > >> >
> > > > > > >> > I'll follow up with a draft proposal early next week, once
> we
> > > > reach
> > > > > a
> > > > > > >> > consensus and have enough mentors (2 per project). Thanks
> > again!
> > > > > > >> >
> > > > > > >> > Marta
> > > > > > >> >
> > > > > > >> >
> > > > > > >> > On Fri, Apr 17, 2020 at 2:53 PM Till Rohrmann <
> > > > trohrm...@apache.org
> > > > > >
> > > > > > >> > wrote:
> > > > > > >> >
> > > > > > >> > > Thanks for driving this effort Marta.
> > > > > > >> > >
> > > > > > >> > > I'd be 

Re: [VOTE] Guarantee that @PublicEvolving classes are API and binary compatible across bug fix releases (x.y.u and x.y.v)

2020-05-16 Thread Congxian Qiu
+1 (non-binding)
Best,
Congxian


On Sat, May 16, 2020 at 12:51 AM, Yangze Guo  wrote:

> +1
>
> Best,
> Yangze Guo
>
> On Sat, May 16, 2020 at 12:26 AM Yuan Mei  wrote:
> >
> > +1
> >
> > On Sat, May 16, 2020 at 12:21 AM Ufuk Celebi  wrote:
> >
> > > +1
> > >
> > > – Ufuk
> > >
> > >
> > > On Fri, May 15, 2020 at 4:54 PM Zhijiang  > > .invalid>
> > > wrote:
> > >
> > > > Sounds good, +1.
> > > >
> > > > Best,
> > > > Zhijiang
> > > >
> > > >
> > > > --
> > > > From:Thomas Weise 
> > > > Send Time: Friday, May 15, 2020 21:33
> > > > To:dev 
> > > > Subject:Re: [VOTE] Guarantee that @PublicEvolving classes are API and
> > > > binary compatible across bug fix releases (x.y.u and x.y.v)
> > > >
> > > > +1
> > > >
> > > >
> > > > On Fri, May 15, 2020 at 6:15 AM Till Rohrmann 
> > > > wrote:
> > > >
> > > > > Dear community,
> > > > >
> > > > > with reference to the dev ML thread about guaranteeing API and
> binary
> > > > > compatibility for @PublicEvolving classes across bug fix releases
> [1] I
> > > > > would like to start a vote about it.
> > > > >
> > > > > The proposal is that the Flink community starts to guarantee
> > > > > that @PublicEvolving classes will be API and binary compatible
> across
> > > bug
> > > > > fix releases of the same minor version. This means that a version
> x.y.u
> > > > is
> > > > > API and binary compatible to x.y.v with u <= v wrt all
> @PublicEvolving
> > > > > classes.
> > > > >
> > > > > The voting options are the following:
> > > > >
> > > > > * +1, Provide the above described guarantees
> > > > > * -1, Do not provide the above described guarantees (please provide
> > > > > specific comments)
> > > > >
> > > > > The vote will be open for at least 72 hours. It is adopted by
> majority
> > > > > approval with at least 3 PMC affirmative votes.
> > > > >
> > > > > [1]
> > > > >
> > > > >
> > > >
> > >
> https://lists.apache.org/thread.html/rb0d0f887b291a490ed3773352c90ddf5f11e3d882dc501e3b8cf0ed0%40%3Cdev.flink.apache.org%3E
> > > > >
> > > > > Cheers,
> > > > > Till
> > > > >
> > > >
> > > >
> > >
>


Re: [DISCUSS] Stability guarantees for @PublicEvolving classes

2020-05-16 Thread Congxian Qiu
Sorry for jumping in late.

+1 to keeping the compatibility of @PublicEvolving between bug-fix
releases (x.y.a -> x.y.b). As a user I always think of these as bug-fix
releases, and breaking compatibility between them may give users a
surprise.

As the previous emails said, there remains the question of how and when a
@PublicEvolving API becomes @Public, and I'm not sure whether we can have
a technical solution to enforce such a rule. (In my opinion, checking such
things -- a change from @PublicEvolving to @Public -- manually may not be
so easy.)
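On the question of mechanical checking: stability markers of this kind are plain Java annotations, so tooling can discover annotated types via reflection and diff the annotated surface between releases. The sketch below is illustrative only -- the annotation names mirror Flink's org.apache.flink.annotation package, but the declarations and the example interfaces are simplified stand-ins, not Flink code:

```java
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

public class StabilityAnnotations {

    // Simplified stand-ins for Flink's stability annotations.
    @Documented @Target(ElementType.TYPE) @Retention(RetentionPolicy.RUNTIME)
    @interface Public {}

    @Documented @Target(ElementType.TYPE) @Retention(RetentionPolicy.RUNTIME)
    @interface PublicEvolving {}

    // Stable API: compatible across minor releases (1.x -> 1.y).
    @Public interface TableResult {}

    // Evolving API: under the proposal, compatible across bug-fix
    // releases of the same minor version (x.y.u -> x.y.v).
    @PublicEvolving interface ResultKind {}

    public static void main(String[] args) {
        // A compatibility checker can enumerate annotated types like this
        // and compare their signatures between two release artifacts.
        System.out.println(TableResult.class.isAnnotationPresent(Public.class));        // true
        System.out.println(ResultKind.class.isAnnotationPresent(PublicEvolving.class)); // true
    }
}
```

In practice a tool such as japicmp (which Flink already runs in its build for @Public classes) would do the signature comparison; the reflection step above only shows how the annotated surface can be collected automatically.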

Best
Congxian


On Fri, May 15, 2020 at 9:18 PM, Till Rohrmann  wrote:

> The vote thread can be found here
>
> https://lists.apache.org/thread.html/rc58099fb0e31d0eac951a7bbf7f8bda8b7b65c9ed0c04622f5333745%40%3Cdev.flink.apache.org%3E
> .
>
> Cheers,
> Till
>
> On Fri, May 15, 2020 at 3:03 PM Till Rohrmann 
> wrote:
>
> > I completely agree that there are many other aspect of our guarantees and
> > processes around the @Public and @PublicEvolving classes which need to be
> > discussed and properly defined. For the sake of keeping this discussion
> > thread narrowly scoped, I would suggest to start a separate discussion
> > about the following points (not exhaustive):
> >
> > - What should be annotated with @Public and @PublicEvolving?
> > - Process for transforming @PublicEvolving into @Public; How to ensure
> > that @PublicEvolving will eventually be promoted to @Public?
> > - Process of retiring a @Public/@PublicEvolving API
> >
> > I will start a vote thread about the change I proposed here which is to
> > ensure API and binary compatibility for @PublicEvolving classes between
> > bugfix releases (x.y.z and x.y.u).
> >
> > Cheers,
> > Till
> >
> > On Fri, May 15, 2020 at 6:33 AM Zhu Zhu  wrote:
> >
> >> +1 for "API + binary compatibility for @PublicEvolving classes for all
> bug
> >> fix
> >> releases in a minor release (x.y.z is compatible to x.y.u)"
> >>
>> This @PublicEvolving guarantee would then be a hard limit on changes.
> >> So it's important to rethink the policy towards using it, as Stephan
> >> proposed.
> >>
> >> I think any Flink interfaces that are visible to users should be
> >> explicitly
>> marked as @Public or @PublicEvolving.
>> Any other interfaces should not be marked as @Public/@PublicEvolving.
>> This would be essential for us to check whether we are breaking any
>> user-facing interfaces unexpectedly.
>> The only exception would be the case where we had to expose a
>> method/class due to implementation limitations; such a method/class
>> should be explicitly marked as @Internal.
> >>
> >> Thanks,
> >> Zhu Zhu
> >>
> >> On Fri, May 15, 2020 at 11:41 AM, Yun Tang  wrote:
> >>
> >> > +1 for this idea, and I also like Xintong's suggestion to make it
> >> > explicitly when the @PublicEvolving API could upgrade to @Public API.
> >> > If we have the rule to upgrade API stable level but not define the
> clear
> >> > timeline, I'm afraid not everyone have the enthusiasm to upgrade this.
> >> >
> >> > A minor suggestion: I think two major releases (where a major release
> >> > is x.y.0, as Chesnay clarified) might be a bit quick. From the release
> >> > history [1], Flink bumps the major version every 3~6 months, so a
> >> > two-major-release gap could be as short as half a year.
> >> > I think half a year might be too short for users to collect enough
> >> > feedback, and upgrading the API stability level every 3 major versions
> >> > should be better.
> >> >
> >> > [1] https://flink.apache.org/downloads.html#flink
> >> >
> >> > Best
> >> > Yun Tang
> >> >
> >> >
> >> > 
> >> > From: Xintong Song 
> >> > Sent: Friday, May 15, 2020 11:04
> >> > To: dev 
> >> > Subject: Re: [DISCUSS] Stability guarantees for @PublicEvolving
> classes
> >> >
> >> > ### Documentation on API compatibility policies
> >> >
> >> > Do we have any formal documentation about the API compatibility
> >> policies?
> >> > The only things I found are:
> >> >
> >> >- In the release announcement (take 1.10.0 as an example) [1]:
> >> >"This version is API-compatible with previous 1.x releases for APIs
> >> >annotated with the @Public annotation."
> >> >- JavaDoc for Public [2] and PublicEvolving [3].
> >> >
> >> > I think we might have a formal documentation, clearly state our
> policies
> >> > for API compatibility.
> >> >
> >> >- What does the annotations mean
> >> >- In what circumstance would the APIs remain compatible / become
> >> >incompatible
> >> >- How do APIs retire (e.g., first deprecated then removed?)
> >> >
> >> > Maybe there is already such kind of documentation that I overlooked?
> If
> >> so,
> >> > we probably want to make it more explicit and easy-to-find.
> >> >
> >> > ### @Public vs. @PublicEvolving for new things
> >> >
> >> > I share Stephan's concern that, with @PublicEvolving used for every
> new
> >> > feature and rarely upgraded to @Public, we are practically making no
> >> > compatibility guarantee between minor versions (x.y.* / x.z.*). On the
> >> > other hand, I think in many 

Interested In Google Season of Docs!

2020-05-16 Thread Roopal Jain
Hello Flink Dev Community!

I am interested in participating in Google Season of Docs for Apache Flink.
I went through the detailed FLIP-60 proposal and thought this is something
I could do well. I am currently working as a software engineer and have a
B.E. in Computer Engineering from one of India's reputed engineering
colleges. I have prior open-source experience, including mentoring for
Google Summer of Code and Google Code-In.
I also have prior work experience with Apache Spark and a good grasp of
SQL, Java, and Python.
Could you please guide me on how to get started?

Thanks & Regards,
Roopal Jain


Google Season of Docs

2020-05-16 Thread Amr Maghraby
Dear Apache Flink,
My name is Amr Maghraby. I am a recent graduate of AAST college, where I
ranked first in my class with a CGPA of 3.92. My team placed second
worldwide in the ROV international competition held in the US, and last
summer I took part in Google Summer of Code 2019 and did good work there.
I have also participated in the ACM ACPC and Hash Code problem-solving
competitions. May I ask whether I could apply for GSoD?
Waiting for your reply.
Thanks,
Amr Maghraby


regarding Google Season of Docs

2020-05-16 Thread Yuvraj Manral
Respected sir/mam,

I came across the projects proposed by Apache Flink for Season of Docs
2020. I am new to the organisation but really liked the project ideas and
would love to start contributing and prepare my proposal for Season of
Docs.

Please guide me through: where should I start, and how should I proceed?
Thanking you in anticipation

Yuvraj Manral 
RSVP


Re: [DISCUSS] increase "state.backend.fs.memory-threshold" from 1K to 100K

2020-05-16 Thread Congxian Qiu
Hi,

Overall, I agree with increasing this value, but a default of 100K may be
somewhat too large in my opinion.

I want to share some more information from my side.

The small-files problem is indeed one that many users may encounter in
production environments. The states (keyed state and operator state) can
become small files on DFS, but increasing the value of
`state.backend.fs.memory-threshold` may trigger the JM OOM problem, as Yun
said previously.
We tried increasing this value in our production environment, but some
connectors that use UnionState prevented us from doing so: the memory
consumed by such jobs can be very large (in our case, with thousands of
parallelism and `state.backend.fs.memory-threshold` set to 64K, the JM
consumed 10G+ of memory). So in the end, we used the solution proposed in
FLINK-11937 [1] for both keyed state and operator state.


[1] https://issues.apache.org/jira/browse/FLINK-11937
Best,
Congxian
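For readers following the thread, the option in question is set in flink-conf.yaml. A sketch (the key name is real; the value shown is the 100K proposed in this thread, expressed in bytes -- whether a size unit suffix is accepted depends on the Flink version):

```yaml
# State smaller than this threshold is embedded directly in the checkpoint
# metadata (as a ByteStreamStateHandle) instead of being written as a
# separate file on DFS. Larger values mean fewer small files, but more
# state bytes held on -- and serialized by -- the JobManager.
state.backend.fs.memory-threshold: 102400   # 100K, up from the 1K default
```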


On Fri, May 15, 2020 at 9:09 PM, Yun Tang  wrote:

> Please correct me if I am wrong, "put the increased value into the default
> configuration" means
> we will update that in default flink-conf.yaml but still leave the default
> value of `state.backend.fs.memory-threshold`as previously?
> It seems I did not get the point why existing setups with existing configs
> will not be affected.
>
> The concern I raised is that one of our large-scale jobs, with a
> 1024-parallelism source using union state, met the JM OOM problem when we
> increased this value.
> I think if we introduce memory control when serializing TDDs asynchronously
> [1], we could be much more confident about increasing this configuration,
> as the memory footprint at that point is expanded by a lot of serialized
> TDDs.
>
>
> [1]
> https://github.com/apache/flink/blob/32bd0944d0519093c0a4d5d809c6636eb3a7fc31/flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/Execution.java#L752
>
> Best
> Yun Tang
>
> 
> From: Stephan Ewen 
> Sent: Friday, May 15, 2020 16:53
> To: dev 
> Cc: Till Rohrmann ; Piotr Nowojski <
> pi...@ververica.com>
> Subject: Re: [DISCUSS] increase "state.backend.fs.memory-threshold" from
> 1K to 100K
>
> I see, thanks for all the input.
>
> I agree with Yun Tang that the use of UnionState is problematic and can
> cause issues in conjunction with this.
> However, most of the large-scale users I know that also struggle with
> UnionState have also increased this threshold, because with this low
> threshold, they get an excess number of small files and overwhelm their
> HDFS / S3 / etc.
>
> An intermediate solution could be to put the increased value into the
> default configuration. That way, existing setups with existing configs will
> not be affected, but new users / installations will have a simpler time.
>
> Best,
> Stephan
>
>
> On Thu, May 14, 2020 at 9:20 PM Yun Tang  wrote:
>
> > I tend not to be in favor of this proposal, as union state is somewhat
> > abused in several popular source connectors (e.g. Kafka), and increasing
> > this value could lead to JM OOM when sending tdds from the JM to TMs
> > with large parallelism.
> >
> > After we collect union state and initialize the map list [1], we already
> > have union state ready to assign. At this time, the memory footprint has
> > not increase too much as the union state which shared across tasks have
> the
> > same reference of ByteStreamStateHandle. However, when we send the tdd
> > with the taskRestore to TMs, akka serializes those
> > ByteStreamStateHandles within the tdd, which increases the memory
> > footprint. If the source has a parallelism of 1024, any one of its
> > sub-tasks would then hold 1024 * 100KB of state handles. The total
> > memory footprint cannot be ignored.
> >
> > If we plan to increase the default value of
> > state.backend.fs.memory-threshold, we should first resolve the above
> case.
> > In other words, this proposal could be a trade-off which benefits
> > perhaps 99% of users, but might bring harmful effects to the 1% of
> > users with large-scale Flink jobs.
> >
> >
> > [1]
> >
> https://github.com/apache/flink/blob/c1ea6fcfd05c72a68739bda8bd16a2d1c15522c0/flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/RoundRobinOperatorStateRepartitioner.java#L64-L87
> >
> > Best
> > Yun Tang
> >
> >
> > 
> > From: Yu Li 
> > Sent: Thursday, May 14, 2020 23:51
> > To: Till Rohrmann 
> > Cc: dev ; Piotr Nowojski 
> > Subject: Re: [DISCUSS] increase "state.backend.fs.memory-threshold" from
> > 1K to 100K
> >
> > TL;DR: I have some reservations but tend to be +1 for the proposal,
> > meanwhile suggest we have a more thorough solution in the long run.
> >
> > Please correct me if I'm wrong, but it seems the root cause of the issue
> is
> > too many small files generated.
> >
> > I have some concerns for the case of session cluster [1], as well as
> > possible issues for users at large scale, otherwise I think increasing
> > `state.backend.fs.memory-threshold` to 100K is a good choice, based on
> the
> >