[jira] [Created] (FLINK-21855) Document Metrics Limitations

2021-03-18 Thread Konstantin Knauf (Jira)
Konstantin Knauf created FLINK-21855:


 Summary: Document Metrics Limitations 
 Key: FLINK-21855
 URL: https://issues.apache.org/jira/browse/FLINK-21855
 Project: Flink
  Issue Type: Sub-task
Reporter: Konstantin Knauf








[jira] [Created] (FLINK-21856) Fix the bug of using Python UDF from sub-query as input param of Python UDTF

2021-03-18 Thread Huang Xingbo (Jira)
Huang Xingbo created FLINK-21856:


 Summary: Fix the bug of using Python UDF from sub-query as input 
param of Python UDTF
 Key: FLINK-21856
 URL: https://issues.apache.org/jira/browse/FLINK-21856
 Project: Flink
  Issue Type: Bug
  Components: API / Python
Affects Versions: 1.12.2, 1.11.3, 1.13.0
Reporter: Huang Xingbo


This example comes from a user report. splitStr is a Python UDTF and 
train_and_predict is a Python UDF.
{code:python}
t_env.sql_query("""
 select A.hotime ,
 A.before_ta ,
 A.before_rssi ,
 A.after_ta ,
 A.after_rssil ,
 A.nb_tath ,
 A.nb_rssith ,
 nbr_rssi ,
 nbr_ta
 from (SELECT
 hotime ,
 before_ta ,
 before_rssi ,
 after_ta ,
 after_rssil ,
 nb_tath ,
 nb_rssith ,
 train_and_predict(nb_tath, nb_rssith) predict
 FROM
 source) as A, LATERAL TABLE(splitStr(predict)) as T(nbr_rssi, nbr_ta)
 """)

{code}

The root cause is that `train_and_predict` is translated into a 
RexCorrelVariable, for which we currently have no handling logic.

 A workaround is to use the Table API.
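
For reference, a rough sketch of that workaround (untested; it assumes the
source table is registered as "source" and the two functions are registered
under the names used above):
{code:python}
# Untested sketch: compute the Python UDF result first, then apply the
# Python UDTF via join_lateral instead of going through SQL.
t = t_env.from_path("source")
t = t.select("hotime, before_ta, before_rssi, after_ta, after_rssil, "
             "nb_tath, nb_rssith, train_and_predict(nb_tath, nb_rssith) as predict")
result = t.join_lateral("splitStr(predict) as (nbr_rssi, nbr_ta)") \
          .select("hotime, before_ta, before_rssi, after_ta, after_rssil, "
                  "nb_tath, nb_rssith, nbr_rssi, nbr_ta")
{code}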







[jira] [Created] (FLINK-21857) StackOverflow for large parallelism jobs when processing EndOfChannelStateEvent

2021-03-18 Thread Yingjie Cao (Jira)
Yingjie Cao created FLINK-21857:
---

 Summary: StackOverflow for large parallelism jobs when processing 
EndOfChannelStateEvent
 Key: FLINK-21857
 URL: https://issues.apache.org/jira/browse/FLINK-21857
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Network
Reporter: Yingjie Cao
 Fix For: 1.13.0


CheckpointedInputGate#handleEvent calls pollNext recursively when processing 
EndOfChannelStateEvent. For high-parallelism jobs with a large number of input 
channels, the stack can become very deep, causing a StackOverflowError. The 
following is the stack trace:
{code:java}
java.lang.StackOverflowError
  at 
org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.waitAndGetNextData(SingleInputGate.java:650)
  at 
org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.getNextBufferOrEvent(SingleInputGate.java:625)
  at 
org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.pollNext(SingleInputGate.java:611)
  at 
org.apache.flink.runtime.taskmanager.InputGateWithMetrics.pollNext(InputGateWithMetrics.java:109)
  at 
org.apache.flink.streaming.runtime.io.checkpointing.CheckpointedInputGate.pollNext(CheckpointedInputGate.java:149)
  at 
org.apache.flink.streaming.runtime.io.checkpointing.CheckpointedInputGate.handleEvent(CheckpointedInputGate.java:202)
  at 
org.apache.flink.streaming.runtime.io.checkpointing.CheckpointedInputGate.pollNext(CheckpointedInputGate.java:158)
  at 
org.apache.flink.streaming.runtime.io.checkpointing.CheckpointedInputGate.handleEvent(CheckpointedInputGate.java:202)
  at 
org.apache.flink.streaming.runtime.io.checkpointing.CheckpointedInputGate.pollNext(CheckpointedInputGate.java:158)
  at 
org.apache.flink.streaming.runtime.io.checkpointing.CheckpointedInputGate.handleEvent(CheckpointedInputGate.java:202)
  at 
org.apache.flink.streaming.runtime.io.checkpointing.CheckpointedInputGate.pollNext(CheckpointedInputGate.java:158)
  at 
org.apache.flink.streaming.runtime.io.checkpointing.CheckpointedInputGate.handleEvent(CheckpointedInputGate.java:202)
  at 
org.apache.flink.streaming.runtime.io.checkpointing.CheckpointedInputGate.pollNext(CheckpointedInputGate.java:158)
  at 
org.apache.flink.streaming.runtime.io.checkpointing.CheckpointedInputGate.handleEvent(CheckpointedInputGate.java:202)
  at 
org.apache.flink.streaming.runtime.io.checkpointing.CheckpointedInputGate.pollNext(CheckpointedInputGate.java:158)
  at 
org.apache.flink.streaming.runtime.io.checkpointing.CheckpointedInputGate.handleEvent(CheckpointedInputGate.java:202){code}
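
For illustration, a minimal sketch (plain Python pseudocode, not the actual
runtime classes) of how the mutual recursion could be flattened into a loop so
that the stack depth stays constant regardless of the number of input channels:
{code:python}
# Sketch of the fix idea only; the helper names are assumptions.
# pollNext() loops instead of handleEvent() recursing back into pollNext()
# for every EndOfChannelStateEvent it encounters.
def poll_next(gate):
    while True:
        element = gate.poll_wrapped_gate()  # assumed low-level, non-recursive poll
        if element is None or not is_end_of_channel_state_event(element):
            # a data buffer, some other event, or an empty gate
            return element
        # per-channel bookkeeping only; crucially, no recursive poll here
        gate.on_channel_state_restored(element.channel)
{code}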





Re: Flink job cannot find recover path after using entropy injection for s3 file systems

2021-03-18 Thread Till Rohrmann
Hi Rainie,

if I remember correctly (unfortunately I don't have an S3 deployment at hand
to try it out), then in v1.9 you should find the data files for the
checkpoint under s3a://{bucket
name}/dev/checkpoints/_entropy_/{job_id}/chk-2230. A checkpoint consists of
these data files and a metadata file which links the individual data files
from the different operators together to a checkpoint. The metadata file
should be stored under s3a://{bucket
name}/dev/checkpoints/{job_id}/chk-2230 so that it is easily discoverable.
If the data files are also contained in s3a://{bucket
name}/dev/checkpoints/{job_id}/chk-2230, then there is some problem or the
system did not properly use the entropy functionality.

My suspicion is that with FLINK-5763 (introduced in Flink 1.11) we also moved
the metadata file under the entropy folder to make the checkpoints/savepoints
self-contained and relocatable.
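
For illustration (untested, and the bucket name and prefix are placeholders),
scanning the entropy sub-folders for a given job id could look like this in
Python with boto3:

import boto3

def find_checkpoint_paths(bucket, job_id, prefix="dev/checkpoints/"):
    # List everything under the checkpoints prefix and keep the paths whose
    # key contains the job id, whatever the entropy segment resolved to.
    s3 = boto3.client("s3")
    hits = set()
    for page in s3.get_paginator("list_objects_v2").paginate(
            Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            if job_id in obj["Key"]:
                hits.add(obj["Key"].split(job_id)[0] + job_id)
    return sorted(hits)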

Cheers,
Till

On Wed, Mar 17, 2021 at 10:14 PM Rainie Li 
wrote:

> Thanks for checking, Till.
>
> I have a follow-up question for #2: do you know why the same job does not
> show the entropy checkpoint path in version 1.9?
> For example:
> *When it's running in v1.11, checkpoint path is: *
> s3a://{bucket name}/dev/checkpoints/_entropy_/{job_id}/chk-1537
> *When it's running in v1.9, checkpoint path is: *
> s3a://{bucket name}/dev/checkpoints/{job_id}/chk-2230
>
> Not sure what caused this inconsistency issue.
> Thanks
> Best regards
> Rainie
>
> On Wed, Mar 17, 2021 at 6:38 AM Till Rohrmann 
> wrote:
>
> > Hi Rainie,
> >
> > 1. I think what you need to do is to look for the {job_id} in all the
> > possible sub folders of the dev/checkpoints/ folder or you extract the
> > entropy from the logs.
> >
> > 2. According to [1] entropy should only be used for the data files and
> not
> > for the metadata files. The idea was to keep the metadata path entropy
> free
> > in order to make it more easily discoverable. I can imagine that this
> > changed with FLINK-5763 [2] which was added in Flink 1.11. This
> effectively
> > means that in order to make checkpoints/savepoints self contained we
> needed
> > to add the entropy also to the metadata file paths. Moreover, this also
> > means that the entropy injection works for 1.9 and 1.11. I think it was
> > introduced with Flink 1.6.2, 1.7.0 [3].
> >
> > [1]
> >
> >
> https://ci.apache.org/projects/flink/flink-docs-stable/deployment/filesystems/s3.html#entropy-injection-for-s3-file-systems
> > [2] https://issues.apache.org/jira/browse/FLINK-5763
> > [3] https://issues.apache.org/jira/browse/FLINK-9061
> >
> > Cheers,
> > Till
> >
> > On Tue, Mar 16, 2021 at 7:03 PM Rainie Li
> > wrote:
> >
> > > Hi Flink Developers.
> > >
> > > We enabled entropy injection for s3, here is our setting on Yarn
> Cluster.
> > > s3.entropy.key: _entropy_
> > > s3.entropy.length: 1
> > > state.checkpoints.dir: 's3a://{bucket name}/dev/checkpoints/_entropy_'
> > >
> > > I have two questions:
> > > 1. After enabling entropy, job's checkpoint path changed to:
> > > *s3://{bucket name}/dev/checkpoints/_entropy_/{job_id}/chk-607*
> > > Since we don't know which key is mapped to _entropy_,
> > > it cannot be used to relaunch Flink jobs by running
> > > *flink run -s s3://{bucket
> > > name}/dev/checkpoints/_entropy_/{job_id}/chk-607*
> > > If you also enabled entropy injection for s3, any suggestion how to
> > recover
> > > failed jobs using entropy checkpoints?
> > >
> > > 2. We added entropy settings on the Yarn cluster.
> > > But we can only see that Flink jobs on version 1.11 show the entropy
> > checkpoint
> > > path.
> > > Flink jobs on version 1.9 are still using checkpoint paths
> without
> > > entropy, like:
> > > *s3://{bucket name}/dev/checkpoints/{job_id}/chk-607*
> > > Is this path equal to *s3://{bucket
> > > name}/dev/checkpoints/_entropy_/{job_id}/chk-607*?
> > > Does entropy work for v1.9? If so, why does v1.9 job show checkpoint
> > paths
> > > *without* entropy?
> > >
> > > Appreciated any suggestions.
> > > Thanks
> > > Best regards
> > > Rainie
> > >
> >
>


[jira] [Created] (FLINK-21858) TaskMetricGroup taskName is too long, especially in sql tasks.

2021-03-18 Thread Andrew.D.lin (Jira)
Andrew.D.lin created FLINK-21858:


 Summary: TaskMetricGroup taskName is too long, especially in sql 
tasks.
 Key: FLINK-21858
 URL: https://issues.apache.org/jira/browse/FLINK-21858
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Metrics
Affects Versions: 1.12.2, 1.12.1, 1.12.0
Reporter: Andrew.D.lin


Currently, the operator name is limited to 80 characters by 
org.apache.flink.runtime.metrics.groups.TaskMetricGroup#METRICS_OPERATOR_NAME_MAX_LENGTH.

I propose to make the maximum length of the metric name configurable.

Here is an example:

"taskName":"GlobalGroupAggregate(groupBy=[dt, src, src1, src2, src3, ct1, ct2], 
select=[dt, src, src1, src2, src3, ct1, ct2, SUM_RETRACT((sum$0, count$1)) AS 
sx_pv, SUM_RETRACT((sum$2, count$3)) AS sx_uv, MAX_RETRACT(max$4) AS updt_time, 
MAX_RETRACT(max$5) AS time_id]) -> Calc(select=[((MD5((dt CONCAT _UTF-16LE'|' 
CONCAT src CONCAT _UTF-16LE'|' CONCAT src1 CONCAT _UTF-16LE'|' CONCAT src2 
CONCAT _UTF-16LE'|' CONCAT src3 CONCAT _UTF-16LE'|' CONCAT ct1 CONCAT 
_UTF-16LE'|' CONCAT ct2 CONCAT _UTF-16LE'|' CONCAT time_id)) SUBSTR 1 SUBSTR 2) 
CONCAT _UTF-16LE'_' CONCAT (dt CONCAT _UTF-16LE'|' CONCAT src CONCAT 
_UTF-16LE'|' CONCAT src1 CONCAT _UTF-16LE'|' CONCAT src2 CONCAT _UTF-16LE'|' 
CONCAT src3 CONCAT _UTF-16LE'|' CONCAT ct1 CONCAT _UTF-16LE'|' CONCAT ct2 
CONCAT _UTF-16LE'|' CONCAT time_id)) AS rowkey, sx_pv, sx_uv, updt_time]) -> 
LocalGroupAggregate(groupBy=[rowkey], select=[rowkey, MAX_RETRACT(sx_pv) AS 
max$0, MAX_RETRACT(sx_uv) AS max$1, MAX_RETRACT(updt_time) AS max$2, 
COUNT_RETRACT(*) AS count1$3])"


"operatorName":"GlobalGroupAggregate(groupBy=[dt, src, src1, src2, src3, ct1, 
ct2], selec=[dt, s"
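
A rough sketch of the proposed behavior (the configuration key below is
hypothetical, not an existing Flink option):
{code:python}
# Hypothetical illustration of making the truncation limit configurable;
# the option key "metrics.operator-name.max-length" is an assumption.
DEFAULT_MAX_LENGTH = 80  # today's hard-coded METRICS_OPERATOR_NAME_MAX_LENGTH

def truncate_metric_name(name, config):
    max_length = config.get("metrics.operator-name.max-length", DEFAULT_MAX_LENGTH)
    # behavior is unchanged when the configured limit equals the default
    return name if len(name) <= max_length else name[:max_length]
{code}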
 





[jira] [Created] (FLINK-21859) Batch job fails due to "Could not mark slot 61a637e3977c58a0e6b73533c419297d active"

2021-03-18 Thread Yingjie Cao (Jira)
Yingjie Cao created FLINK-21859:
---

 Summary: Batch job fails due to "Could not mark slot 
61a637e3977c58a0e6b73533c419297d active"
 Key: FLINK-21859
 URL: https://issues.apache.org/jira/browse/FLINK-21859
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Task
Affects Versions: 1.13.0
Reporter: Yingjie Cao


Here is the error stack:
{code:java}
2021-03-18 19:05:31
org.apache.flink.runtime.JobException: Recovery is suppressed by NoRestartBackoffTimeStrategy
	at org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.handleFailure(ExecutionFailureHandler.java:130)
	at org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.getFailureHandlingResult(ExecutionFailureHandler.java:81)
	at org.apache.flink.runtime.scheduler.DefaultScheduler.handleTaskFailure(DefaultScheduler.java:221)
	at org.apache.flink.runtime.scheduler.DefaultScheduler.maybeHandleTaskFailure(DefaultScheduler.java:212)
	at org.apache.flink.runtime.scheduler.DefaultScheduler.updateTaskExecutionStateInternal(DefaultScheduler.java:203)
	at org.apache.flink.runtime.scheduler.SchedulerBase.updateTaskExecutionState(SchedulerBase.java:701)
	at org.apache.flink.runtime.scheduler.UpdateSchedulerNgOnInternalFailuresListener.notifyTaskFailure(UpdateSchedulerNgOnInternalFailuresListener.java:51)
	at org.apache.flink.runtime.executiongraph.DefaultExecutionGraph.notifySchedulerNgAboutInternalTaskFailure(DefaultExecutionGraph.java:1449)
	at org.apache.flink.runtime.executiongraph.Execution.processFail(Execution.java:1105)
	at org.apache.flink.runtime.executiongraph.Execution.processFail(Execution.java:1045)
	at org.apache.flink.runtime.executiongraph.Execution.fail(Execution.java:754)
	at org.apache.flink.runtime.jobmaster.slotpool.SingleLogicalSlot.signalPayloadRelease(SingleLogicalSlot.java:195)
	at org.apache.flink.runtime.jobmaster.slotpool.SingleLogicalSlot.release(SingleLogicalSlot.java:182)
	at org.apache.flink.runtime.scheduler.SharedSlot.lambda$release$4(SharedSlot.java:271)
	at java.util.concurrent.CompletableFuture.uniAccept(CompletableFuture.java:656)
	at java.util.concurrent.CompletableFuture.uniAcceptStage(CompletableFuture.java:669)
	at java.util.concurrent.CompletableFuture.thenAccept(CompletableFuture.java:1997)
	at org.apache.flink.runtime.scheduler.SharedSlot.release(SharedSlot.java:271)
	at org.apache.flink.runtime.jobmaster.slotpool.AllocatedSlot.releasePayload(AllocatedSlot.java:152)
	at org.apache.flink.runtime.jobmaster.slotpool.DefaultDeclarativeSlotPool.releasePayload(DefaultDeclarativeSlotPool.java:385)
	at org.apache.flink.runtime.jobmaster.slotpool.DefaultDeclarativeSlotPool.lambda$releaseSlot$1(DefaultDeclarativeSlotPool.java:376)
	at java.util.Optional.ifPresent(Optional.java:159)
	at org.apache.flink.runtime.jobmaster.slotpool.DefaultDeclarativeSlotPool.releaseSlot(DefaultDeclarativeSlotPool.java:374)
	at org.apache.flink.runtime.jobmaster.slotpool.DeclarativeSlotPoolService.failAllocation(DeclarativeSlotPoolService.java:198)
	at org.apache.flink.runtime.jobmaster.JobMaster.internalFailAllocation(JobMaster.java:650)
	at org.apache.flink.runtime.jobmaster.JobMaster.failSlot(JobMaster.java:636)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:301)
	at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:212)
	at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:77)
	at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:158)
	at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
	at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
	at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
	at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
	at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170)
	at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
	at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
	at akka.actor.Actor$class.aroundReceive(Actor.scala:517)
	at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
	at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
	at akka.actor.ActorCell.invoke(ActorCell.scala:561)
	at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
	at akka.dispatch.Mailbox.run(Mailbox.scala:225)
	at akka.dispatch.Mai
{code}

[jira] [Created] (FLINK-21860) Support StreamExecExpand json serialization/deserialization

2021-03-18 Thread godfrey he (Jira)
godfrey he created FLINK-21860:
--

 Summary: Support StreamExecExpand json 
serialization/deserialization
 Key: FLINK-21860
 URL: https://issues.apache.org/jira/browse/FLINK-21860
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Reporter: godfrey he
 Fix For: 1.13.0








[jira] [Created] (FLINK-21861) Support StreamExecIncrementalGroupAggregate json serialization/deserialization

2021-03-18 Thread godfrey he (Jira)
godfrey he created FLINK-21861:
--

 Summary: Support StreamExecIncrementalGroupAggregate json 
serialization/deserialization
 Key: FLINK-21861
 URL: https://issues.apache.org/jira/browse/FLINK-21861
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Reporter: godfrey he
 Fix For: 1.13.0








[jira] [Created] (FLINK-21862) Add a Java SDK showcase tutorial to StateFun playground

2021-03-18 Thread Tzu-Li (Gordon) Tai (Jira)
Tzu-Li (Gordon) Tai created FLINK-21862:
---

 Summary: Add a Java SDK showcase tutorial to StateFun playground
 Key: FLINK-21862
 URL: https://issues.apache.org/jira/browse/FLINK-21862
 Project: Flink
  Issue Type: Task
  Components: Stateful Functions
Reporter: Tzu-Li (Gordon) Tai
Assignee: Tzu-Li (Gordon) Tai
 Fix For: statefun-3.0.0


The goal of the showcase project is to help new StateFun users get started 
implementing their StateFun application functions in Java (or any other JVM 
language).

The tutorial should be streamlined and split into a few parts, which we 
recommend going through in a specific order.

Each part demonstrates the SDK fundamentals with code snippets plus Javadocs 
and comments that guide new users through them.





[jira] [Created] (FLINK-21863) SortPartition should allow chaining of KeySelectors to support different sorting orders for the fields

2021-03-18 Thread Etienne Chauchot (Jira)
Etienne Chauchot created FLINK-21863:


 Summary: SortPartition should allow chaining of KeySelectors to 
support different sorting orders for the fields
 Key: FLINK-21863
 URL: https://issues.apache.org/jira/browse/FLINK-21863
 Project: Flink
  Issue Type: Improvement
Reporter: Etienne Chauchot


If we need to sort data in a DataSet using a KeySelector (for example, to 
extract Avro fields, or when sorting by field index or name is not available), 
we cannot have different sort orders for the fields. We can sort by a list of 
fields by making the KeySelector return a tuple, but all fields will be sorted 
in the same order.
 To allow _sort by field 1 ASC, field 2 DESC_ semantics with KeySelectors, we 
need to be able to chain the KeySelectors like this:
{code:java}
DataSet.sortPartition(field1KeySelector , Order.ASCENDING)
.sortPartition(field2KeySelector, Order.DESCENDING)
{code}
which is currently not possible.
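
For comparison, the requested semantics can be illustrated generically with
stable sorts (plain Python, not the DataSet API): chaining a primary and a
secondary sort with different orders is equivalent to sorting by the secondary
key first and then by the primary key.
{code:python}
# Generic illustration only: field 1 ASC, field 2 DESC via two stable sorts.
records = [("a", 1), ("b", 3), ("a", 2)]
records.sort(key=lambda r: r[1], reverse=True)  # secondary key, descending
records.sort(key=lambda r: r[0])                # primary key, ascending (stable)
# -> [("a", 2), ("a", 1), ("b", 3)]
{code}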





Re: [DISCUSS] Split PyFlink packages into two packages: apache-flink and apache-flink-libraries

2021-03-18 Thread Xingbo Huang
Hi Till,

The package size of tensorflow[1] is also very large (300+ MB). However,
it does not try to solve this problem; instead, it requests a PyPI space-limit
expansion whenever its project space is full. We could also choose this
option. Given our current release frequency, we would probably need to
apply for a 15GB expansion every year. There are not many similar cases,
so there is also no standard solution to refer to. But splitting a project
into multiple packages is quite common. For example, Apache Airflow
prepares a corresponding release package for each provider[2].

So I think there are currently two solutions that could work:

1. Just keep the current solution and expand the space limit in PyPI
whenever the space is full.

2. Split into two packages to reduce the wheel package size.

[1] https://pypi.org/project/tensorflow/#files
[2] https://pypi.org/search/?q=apache-airflow-*&o=
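
For reference, the `setup.py` download approach discussed below would look
roughly like this (untested sketch; the download URL and destination are
assumptions), though as noted it cannot work in clusters without external
network access:

import os
import urllib.request

def download_flink_jars(version, dest="pyflink/lib"):
    # Hypothetical install-time hook: fetch the Flink binary distribution
    # instead of bundling the jars inside the wheel.
    os.makedirs(dest, exist_ok=True)
    url = ("https://archive.apache.org/dist/flink/"
           f"flink-{version}/flink-{version}-bin-scala_2.11.tgz")
    urllib.request.urlretrieve(url, os.path.join(dest, f"flink-{version}.tgz"))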

Best,
Xingbo

Till Rohrmann  于2021年3月17日周三 下午9:22写道:

> How do other projects solve this problem?
>
> Cheers,
> Till
>
> On Wed, Mar 17, 2021 at 3:45 AM Xingbo Huang  wrote:
>
> > Hi Chesnay,
> >
> > Yes, in most cases, we can indeed download the required jars in
> `setup.py`,
> > which is also the solution I originally thought of reducing the size of
> > wheel packages. However, I'm afraid that it will not work in scenarios
> when
> > accessing the external network is not possible which is very common in
> the
> > production cluster.
> >
> > Best,
> > Xingbo
> >
> > Chesnay Schepler  于2021年3月16日周二 下午8:32写道:
> >
> > > This proposed apache-flink-libraries package would just contain the
> > > binary, right? And effectively be unusable to the python audience on
> > > its own.
> > >
> > > Essentially we are just abusing Pypi for shipping a java binary. Is
> > > there no way for us to download the jars when the python package is
> > > being installed? (e.g., in setup.py)
> > >
> > > On 3/16/2021 1:23 PM, Dian Fu wrote:
> > > > Yes, the size of .whl file in PyFlink will also be about 3MB if we
> > split
> > > the package. Currently the package is big because we bundled the jar
> > files
> > > in it.
> > > >
> > > >> 2021年3月16日 下午8:13,Chesnay Schepler  写道:
> > > >>
> > > >> key difference being that the beam .whl files are 3mb large, aka 60x
> > > smaller.
> > > >>
> > > >> On 3/16/2021 1:06 PM, Dian Fu wrote:
> > > >>> Hi Chesnay,
> > > >>>
> > > >>> We will publish binary packages separately for:
> > > >>> 1) Python 3.5 / 3.6 / 3.7 / 3.8 (since 1.12) separately
> > > >>> 2) Linux / Mac separately
> > > >>>
> > > >>> Besides, there is also a source package which is used when none of
> > the
> > > above binary packages is usable, e.g. for Windows users.
> > > >>>
> > > >>> PS: publishing multiple binary packages is very common in Python
> > > world, e.g. Beam published 22 packages in 2.28, Pandas published 16
> > > packages in 1.2.3 [2]. We could also publish more packages if we
> > > split the package, as the cost of adding another package will be
> very
> > > small.
> > > >>>
> > > >>> Regards,
> > > >>> Dian
> > > >>>
> > > >>> [1] https://pypi.org/project/apache-beam/#files <
> > > https://pypi.org/project/apache-beam/#files> <
> > > https://pypi.org/project/apache-beam/#files <
> > > https://pypi.org/project/apache-beam/#files>>
> > > >>> [2] https://pypi.org/project/pandas/#files
> > > >>>
> > > >>>
> > > >>> Hi Xintong,
> > > >>>
> > > >>> Yes, you are right that there are 9 packages in 1.12 as we added
> > Python
> > > 3.8 support in 1.12.
> > > >>>
> > > >>> Regards,
> > > >>> Dian
> > > >>>
> > >  2021年3月16日 下午7:45,Xintong Song  写道:
> > > 
> > >  And it's not only uploaded to PyPI, but the ASF mirrors as well.
> > > 
> > > 
> > https://dist.apache.org/repos/dist/release/flink/flink-1.12.2/python/
> > > 
> > >  Thank you~
> > > 
> > >  Xintong Song
> > > 
> > > 
> > > 
> > >  On Tue, Mar 16, 2021 at 7:41 PM Xintong Song <
> tonysong...@gmail.com
> > >
> > > wrote:
> > > 
> > > > Actually, I think it's 9 packages, not 7.
> > > >
> > > > Check here for the 1.12.2 packages.
> > > > https://pypi.org/project/apache-flink/#files
> > > >
> > > > Thank you~
> > > >
> > > > Xintong Song
> > > >
> > > >
> > > >
> > > > On Tue, Mar 16, 2021 at 7:08 PM Chesnay Schepler <
> > ches...@apache.org
> > > >
> > > > wrote:
> > > >
> > > >> Am I reading this correctly that we publish 7 different
> artifacts
> > > just
> > > >> for python?
> > > >> What does the release matrix look like?
> > > >>
> > > >> On 3/16/2021 3:45 AM, Dian Fu wrote:
> > > >>> Hi Xingbo,
> > > >>>
> > > >>>
> > > >>> Thanks a lot for bringing up this discussion. Actually the size
> > > limit
> > > >> already became an issue when releasing 1.11.3 and 1.12.1. It
> > > blocked us
> > > >> from publishing PyFlink packages to PyPI during the release as there
> is
> > > no
> > >

[VOTE] Move Flink ML pipeline API and library code to a separate repository named flink-ml

2021-03-18 Thread Dong Lin
Hi all,

I would like to start the vote for moving Flink ML pipeline API and library
code to a separate repository named flink-ml.

The main motivation is to allow us to remove the SQL planner in Flink 1.14
while still allowing ML pipeline API/library development in the coming year.

The vote will last for at least 72 hours, following the consensus voting
process.

Discussion thread:
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Move-Flink-ML-pipeline-API-and-library-code-to-a-separate-repository-named-flink-ml-tc49420.html


Thanks!
Dong


Re: [DISCUSS] Move Flink ML pipeline API and library code to a separate repository named flink-ml

2021-03-18 Thread Dong Lin
Thank you Becket and Till for your comments!

Since the discussion has been open for about a week and no concerns have been
raised about this proposal, I have started the voting thread. Please help vote
when you get time.

Cheers,
Dong

On Mon, Mar 15, 2021 at 6:00 PM Till Rohrmann  wrote:

> +1 for moving Flink ML to a separate repository. Thanks for driving this
> discussion and effort Dong!
>
> Cheers,
> Till
>
> On Fri, Mar 12, 2021 at 1:19 PM Becket Qin  wrote:
>
> > Thanks for raising the discussion, Dong. +1 on moving the Flink ML to a
> > separate repository.
> >
> > Machine learning is a big area which deserves a separate project so the
> > development can be decoupled from Flink core. In the meantime, it gives
> us
> > the flexibility of evolving Flink without breaking the existing ML users.
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> > On Fri, Mar 12, 2021 at 6:16 PM Dong Lin  wrote:
> >
> > > Hi everyone,
> > >
> > > I am opening this thread to discuss the idea of moving Flink ML
> pipeline
> > > API and library code to a separate repository in Flink (similar to what
> > we
> > > did for flink-statefun).
> > >
> > > The Flink ML pipeline API was proposed by FLIP-39: Flink ML pipeline
> and
> > ML
> > > libs
> > > <
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-39+Flink+ML+pipeline+and+ML+libs
> > > >.
> > > It allows MLlib developers and users to develop ML pipelines on top of
> > > Flink.
> > >
> > > According to the discussion in this
> > > <
> > >
> >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Deprecation-and-removal-of-the-legacy-SQL-planner-td48988.html
> > > >
> > > thread, we plan to remove SQL planner in Flink 1.14. However,
> > > there exist ML libraries which currently use Flink's DataSet API
> together
> > > with Table API. Those libraries will either stop working or suffer
> > > considerable performance regression if they bump up dependency to Flink
> > > 1.14. As a result, if we keep ML pipeline API in Flink, then those ML
> > > libraries can not use the latest ML pipeline API/lib in Flink until
> Flink
> > > compensates for the missing functionality with new DataStream APIs, which
> is
> > > supposed to happen about 1 year from now in e.g. Flink 1.15.
> > >
> > > In order to allow us to remove SQL planner in Flink 1.14 while still
> > > allowing ML pipeline API/lib development in the coming year, we propose
> > to
> > > move Flink ML pipeline API and library code to a separate repository.
> > More
> > > specifically, the new repo will have the following setup:
> > > - The repo will be created at https://github.com/apache/flink-ml. This
> > > repo
> > > will depend on the core Flink repo.
> > > - The flink-ml documentation will be linked from the existing main
> Flink
> > > docs similar to
> > > https://ci.apache.org/projects/flink/flink-statefun-docs-master.
> > > - The new repo will be under namespace org.apache.flink.
> > > - We can revisit whether we should put it back to the core Flink repo
> > after
> > > the above issue is resolved and if there is good reason to make the
> > change.
> > >
> > > Here is the proposed plan if we agree to make this change:
> > > - We will create the flink-ml repo and move Flink ML pipeline related
> > code
> > > to this repo before Flink 1.13 code release (3/31/2021)
> > > - Then we update flink-ml repo to depend on Flink 1.13 after Flink 1.13
> > is
> > > released.
> > > - Then we update core Flink with new DataStream API (e.g. DataStream
> > > iteration) such that core Flink can support the same (or better) ML lib
> > > performance as it does now with the SQL planner. This is supposed to
> > happen
> > > in about 1 year.
> > > - Then we update flink-ml repo to depend on the latest Flink version
> once
> > > Flink has the new DataStream API.
> > >
> > > Besides the main motivation described above, this change also shares
> > > similar pros/cons of creating a separate repo for flink-statefun
> > > (see this
> > > <
> > >
> >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Stateful-Functions-in-which-form-to-contribute-same-or-different-repository-td34034.html
> > > >
> > > and this
> > > <
> > >
> >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/PROPOSAL-Contribute-Stateful-Functions-to-Apache-Flink-td33913.html
> > > >
> > > for prior discussion).
> > >
> > > Pros:
> > > - A separate repo allows faster development for an early-stage project
> > > like flink ML pipeline (both API and libs).
> > > - Flink repo is already super large and it is good not to bloat its
> size
> > > (and the number of tests)
> > > - Less tests to run when we make code changes in each repo.
> > >
> > > Cons:
> > > - Code changes in core Flink might potentially break tests or
> > > cause performance regressions in flink-ml since they are in different
> > repos.
> > > So more effort is need

[jira] [Created] (FLINK-21864) Support StreamExecTemporalJoin json serialization/deserialization

2021-03-18 Thread Terry Wang (Jira)
Terry Wang created FLINK-21864:
--

 Summary: Support StreamExecTemporalJoin json 
serialization/deserialization
 Key: FLINK-21864
 URL: https://issues.apache.org/jira/browse/FLINK-21864
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Reporter: Terry Wang
 Fix For: 1.13.0


Support StreamExecTemporalJoin json serialization/deserialization





[jira] [Created] (FLINK-21865) Add a Docker Compose greeter example to StateFun playgrounds

2021-03-18 Thread Tzu-Li (Gordon) Tai (Jira)
Tzu-Li (Gordon) Tai created FLINK-21865:
---

 Summary: Add a Docker Compose greeter example to StateFun 
playgrounds
 Key: FLINK-21865
 URL: https://issues.apache.org/jira/browse/FLINK-21865
 Project: Flink
  Issue Type: Task
  Components: Stateful Functions
Reporter: Tzu-Li (Gordon) Tai
Assignee: Tzu-Li (Gordon) Tai
 Fix For: statefun-3.0.0


This example is intended as a follow-up after completion of the Java SDK 
Showcase Tutorial (FLINK-21862).

If users are already familiar with the Java SDK fundamentals and would like to 
get a better understanding of what a realistic StateFun application looks like, 
then this is the example to start with. Otherwise, we would recommend that 
users take a look at the Showcase tutorial first.

This example works with Docker Compose, and runs a few services that build up 
an end-to-end StateFun application:
- Functions service that runs the functions and exposes them through an HTTP 
endpoint.
- StateFun runtime processes (a manager plus workers) that will handle ingress, 
egress, and inter-function messages as well as function state storage in a 
consistent and fault-tolerant manner.
- Apache Kafka broker for the application ingress and egress.

To motivate this example, we'll implement a simple user greeter application, 
which has two functions - a {{UserFn}} that expects {{UserLogin}} JSON events 
from an ingress and keeps information about users in state storage, and a 
{{GreetingsFn}} that accepts user information to generate personalized greeting 
messages that are sent to users via an egress.
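
As a rough sketch (shown with the StateFun Python SDK for brevity; untested,
and all typenames, state names, and the egress below are placeholders), the two
functions could look like:
{code:python}
# Untested sketch; every typename and name here is a placeholder.
from statefun import (StatefulFunctions, ValueSpec, IntType,
                      message_builder, kafka_egress_message)

functions = StatefulFunctions()

@functions.bind(typename="example/user",
                specs=[ValueSpec(name="seen", type=IntType)])
async def user_fn(context, message):
    # Track this user's logins, then ask the greetings function to greet them.
    seen = (context.storage.seen or 0) + 1
    context.storage.seen = seen
    context.send(message_builder(target_typename="example/greetings",
                                 target_id=context.address.id,
                                 int_value=seen))

@functions.bind(typename="example/greetings")
async def greetings_fn(context, message):
    greeting = f"Hello {context.address.id}, login #{message.as_int()}!"
    context.send_egress(kafka_egress_message(typename="example/greets",
                                             topic="greetings",
                                             key=context.address.id,
                                             value=greeting))
{code}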


