Re: [DISCUSS] How about adding OLAP to Flink Roadmap?

2023-08-09 Thread Jing Ge
Hi Shammon, Hi Xiangyu,

Thanks for bringing this to our attention. I can see this is a great
proposal born from real business scenarios. +1 for it.

People have been keen to use one platform to cover all of their data
production and consumption requirements. Flink already does a great job on
the production side, i.e. streaming and batch processing, with an excellent
ecosystem. This is a big advantage for Flink to go one step further and
cover the consumption part as well. It would turn Flink into a unified
compute platform, similar to what the Ray project (the platform behind
ChatGPT, for those who are not aware of it) [1] is doing, and secure Flink's
position as one of the most interesting open source platforms for the next
decade.

Frankly speaking, it will be a big change. As far as I am concerned, at
least the following should be considered (these are just first thoughts;
there must be more).

Architecture upgrade - since we will have three capabilities (I wanted to
use "engines", but it might be too early for such a big word), i.e.
streaming, batch, and OLAP, it might make sense to upgrade the architecture
while we are building OLAP into Flink. A unified foundation or abstraction
for distributed computation should be designed and implemented underneath
those capabilities. In the future, new capabilities can leverage this
foundation and be developed at a very fast pace.

MPP architecture - the Flink session cluster is not an MPP architecture.
Generally speaking, a shared-nothing architecture (SNA) is the key to
implementing MPP, and Flink has everything needed to offer SNA. That is the
reason why we can consider building OLAP into or on top of Flink. Speaking
of MPP, there will be a lot of things to do, e.g. the Retrieval
Architecture [2], multi-level task splitting, dynamic retry or even
re-splitting, etc. I will not expand on all of those topics at this early
stage.

OLAP query syntax - at least some common syntax and statements need to be
implemented, e.g. CUBE, GROUPING SETS, OVER (PARTITION BY ...), you name it.
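
To make it concrete, below is a minimal sketch (using the Table API with a
placeholder datagen table; all names are illustrative) of the kind of
statements meant here. How completely and efficiently Flink supports each of
them for OLAP workloads is part of what this proposal would need to cover.

{code:java}
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class OlapSyntaxSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.inBatchMode());
        // Placeholder table, only to have something to query.
        tEnv.executeSql(
                "CREATE TABLE sales (region STRING, product STRING, amount DOUBLE)"
                        + " WITH ('connector' = 'datagen', 'number-of-rows' = '100')");
        // Multi-dimensional aggregation with CUBE (GROUPING SETS works analogously).
        tEnv.executeSql(
                "SELECT region, product, SUM(amount) AS total"
                        + " FROM sales GROUP BY CUBE (region, product)").print();
        // OVER (PARTITION BY ...) aggregation.
        tEnv.executeSql(
                "SELECT region, amount, SUM(amount) OVER ("
                        + " PARTITION BY region ORDER BY amount"
                        + " ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_total"
                        + " FROM sales").print();
    }
}
{code}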

Last but not least, there will be a big effort to upgrade the runtime
features to support OLAP with respect to performance and latency.

Best regards,
Jing


[1] https://www.ray.io/
[2] https://www.tutorialsbook.com/teradata/teradata-architecture

On Thu, Aug 10, 2023 at 11:39 AM Dan Zou  wrote:

> Thanks for bringing up this discussion, Shammon. I would like to share
> some of my observations and experiences.
>
> Flink has almost become the de facto standard for streaming computing, and
> Flink batch have been successfully applied in some companies. If Flink can
> support OLAP scenarios well, a unified engine to support streaming, batch,
> and OLAP will become a reality, which is very exciting.
>
> Based on the status quo, Flink can be used as a primary OLAP engine,
> although there is still a lot of room for optimization. This means that we
> do not need to carry out large-scale renovation at the beginning, but only
> gradually and continuously enhance it without affecting streaming.
>
> Flink OLAP can largely reuse the capabilities of Flink Batch SQL and
> optimizations in OLAP can also benefit Flink Batch. If we simplify job
> startup overhead and increase cross-job resource reuse (Plan reuse,
> Generated class reuse, Connection reuse, etc.) on this basis, Flink will
> become a good OLAP engine.
>
> So, I am big +1 for adding OLAP to Flink Roadmap, and I am willing to
> contribute to it.
>
>
> > On Aug 9, 2023, at 15:35, xiangyu feng wrote:
> >
> > Thank you Shammon for initiating this discussion. As one of the Flink
> OLAP
> > developers in ByteDance, I would also like to share a real case of our
> > users.
> >
> > About two years ago we found our first OLAP user internally by
> integrating
> > Flink OLAP with ByteHTAP. Users are willing to use Flink as an OLAP
> engine
> > mainly hoping to use Flink's rich cross-datasource join capability. In
> the
> > beginning, we only support simple query patterns with qps less than 2 and
> > joins less than 5. With the business growing and our system capabilities
> > evolving, users have moved more scenarios to Flink OLAP, and the query
> > pattern is getting more and more complicated. Until early this year, the
> > user's query pattern has changed to peak QPS greater than 20, join table
> > number greater than 30 and scan data volume exceeding 1 billion rows.
> Even
> > with the evolution of our engine over the past two years, computing at
> this
> > scale is still very challenging. It is difficult to satisfy the
> computation
> > scale, system stability and query latency at the same time.
> >
> > Through talking to our user, we easily build some intermediate views by
> > using Flink's streaming and batch engine. In a similar way to
> materialized
> > view, we have optimized user's query pattern to single query join less
> than
> > 10, scan data volume in tens of millions and QPS remains unchanged. In
> this
> > way, our OLAP service has not only perfectly met the business
> requirements
> > and also we have made this 

[jira] [Created] (FLINK-32824) Port Calcite's fix for the sql like operator

2023-08-09 Thread lincoln lee (Jira)
lincoln lee created FLINK-32824:
---

 Summary: Port Calcite's fix for the sql like operator
 Key: FLINK-32824
 URL: https://issues.apache.org/jira/browse/FLINK-32824
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Runtime
Affects Versions: 1.17.1, 1.18.0
Reporter: lincoln lee
 Fix For: 1.19.0


We should port the bugfix of the SQL LIKE operator from
https://issues.apache.org/jira/browse/CALCITE-1898:
{code}
The LIKE operator must match '.' (period) literally, not treat it as a 
wild-card. Currently it treats it the same as '_'.
{code}
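
A minimal sketch (using the Table API) of the behavior in question:

{code:java}
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class LikeOperatorCheck {
    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.inBatchMode());
        // Expected: true, because the period matches itself literally.
        tEnv.executeSql("SELECT 'a.c' LIKE 'a.c'").print();
        // Expected: false once the Calcite fix is ported; the buggy behavior
        // treats '.' like the single-character wildcard '_' and returns true.
        tEnv.executeSql("SELECT 'abc' LIKE 'a.c'").print();
    }
}
{code}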



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] How about adding OLAP to Flink Roadmap?

2023-08-09 Thread Dan Zou
Thanks for bringing up this discussion, Shammon. I would like to share some of 
my observations and experiences.

Flink has almost become the de facto standard for stream processing, and 
Flink batch has been successfully applied in some companies. If Flink can 
support OLAP scenarios well, a unified engine supporting streaming, batch, 
and OLAP will become a reality, which is very exciting.

Based on the status quo, Flink can already be used as a primary OLAP engine, 
although there is still a lot of room for optimization. This means that we do 
not need to carry out a large-scale overhaul at the beginning, but can 
gradually and continuously enhance it without affecting streaming.

Flink OLAP can largely reuse the capabilities of Flink Batch SQL, and 
optimizations made for OLAP can also benefit Flink Batch. If, on this basis, 
we reduce job startup overhead and increase cross-job resource reuse (plan 
reuse, generated class reuse, connection reuse, etc.), Flink will become a 
good OLAP engine.

So, a big +1 from me for adding OLAP to the Flink Roadmap, and I am willing 
to contribute to it.


> On Aug 9, 2023, at 15:35, xiangyu feng wrote:
> 
> Thank you Shammon for initiating this discussion. As one of the Flink OLAP
> developers in ByteDance, I would also like to share a real case of our
> users.
> 
> About two years ago we found our first OLAP user internally by integrating
> Flink OLAP with ByteHTAP. Users are willing to use Flink as an OLAP engine
> mainly hoping to use Flink's rich cross-datasource join capability. In the
> beginning, we only support simple query patterns with qps less than 2 and
> joins less than 5. With the business growing and our system capabilities
> evolving, users have moved more scenarios to Flink OLAP, and the query
> pattern is getting more and more complicated. Until early this year, the
> user's query pattern has changed to peak QPS greater than 20, join table
> number greater than 30 and scan data volume exceeding 1 billion rows. Even
> with the evolution of our engine over the past two years, computing at this
> scale is still very challenging. It is difficult to satisfy the computation
> scale, system stability and query latency at the same time.
> 
> Through talking to our user, we easily build some intermediate views by
> using Flink's streaming and batch engine. In a similar way to materialized
> view, we have optimized user's query pattern to single query join less than
> 10, scan data volume in tens of millions and QPS remains unchanged. In this
> way, our OLAP service has not only perfectly met the business requirements
> and also we have made this migration process very smooth, thanks to Flink's
> powerful streaming and batch computing ecosystem. Finally, we are highly
> recognized by our users.
> 
> *There are two points I want to make with this case:*
> 
> 1, Although there are many OLAP engines out there, Flink may not always
> provide the best performance. But thanks to Flink's strong ecosystem, we
> are confident that we can build an OLAP engine that provides a great user
> experience. This is very important for many small and medium sized
> companies;
> 
> 2, From another perspective, I personally believe that building OLAP will
> 'bring Flink closer to our end-users' and present a wider variety of
> computational challenges to Flink. As the case mentioned above, this is a
> very common case in data analytics, where the flink is used to precompute
> the data and feed it to OLAP services. These precalculations are often
> designed to compensate for the capabilities of other OLAP engines, such as
> some engines may not have strong join capabilities, some may not have
> complete SQL support and some may have weak plan optimization support. In
> general, Flink will not face these users directly, and thus cannot build a
> comprehensive end-to-end solution to solve this problem once and for all. *By
> building Flink OLAP, we can finally fill the last missing block in the
> puzzle!*
> 
> Of course, as we built Flink OLAP internally, we encountered many
> challenging issues, which is why we are putting this discussion out there
> and hoping to involve more contributors. Meanwhile, We also hope to
> contribute our optimizations back to the community through FLINK-25318.
> 
> So for me, it is a big +1 to add OLAP to Flink Roadmap. ^-^
> 
> Best,
> Xiangyu
> 
> Xintong Song wrote on Tue, Aug 8, 2023 at 18:01:
> 
>> Thanks for bringing this up, Shammon.
>> 
>> In general, I'd be +1 to improve Flink's ability to serve as an OLAP
>> engine.
>> 
>> I see a great value in Flink becoming a unified Large-scale Data Processing
>> / Analysis tool. Through my interactions with users (Alibaba internal
>> users, external users on Alibaba Cloud, developers from other companies via
>> conferences / meetups), it's commonly complained how complicated and costly
>> it is to build a data processing platform out of a bunch of different
>> tools. That usually means higher learning / developing / operation and
>> maintenance 

Re: [DISCUSS] FLIP-323: Support Attached Execution on Flink Application Completion for Batch Jobs

2023-08-09 Thread liu ron
Hi, Allison

Thanks for driving this proposal; it looks cool for batch jobs under
application mode. But after reading your FLIP document and [1], I have a
question. Why do you want to rename the execution.attached configuration to
client.attached.after.submission and at the same time deprecate
execution.attached? Based on your design, I understand that the roles of
these two options are the same. Introducing a new option would increase the
cost of understanding and use for the user, so why not follow the idea
discussed in FLINK-25495 and make application mode support
execution.attached?

[1] https://issues.apache.org/jira/browse/FLINK-25495

Best,
Ron

Venkatakrishnan Sowrirajan wrote on Wed, Aug 9, 2023 at 02:07:

> This is definitely a useful feature especially for the flink batch
> execution workloads using flow orchestrators like Airflow, Azkaban, Oozie
> etc. Thanks for reviving this issue and starting a FLIP.
>
> Regards
> Venkata krishnan
>
>
> On Mon, Aug 7, 2023 at 4:09 PM Allison Chang  >
> wrote:
>
> > Hi all,
> >
> > I am opening this thread to discuss this proposal to support attached
> > execution on Flink Application Completion for Batch Jobs. The link to the
> > FLIP proposal is here:
> >
> https://urldefense.com/v3/__https://cwiki.apache.org/confluence/display/FLINK/FLIP-323*3A*Support*Attached*Execution*on*Flink*Application*Completion*for*Batch*Jobs__;JSsrKysrKysrKys!!IKRxdwAv5BmarQ!friFO6bJub5FKSLhPIzA6kv-7uffv-zXlv9ZLMKqj_xMcmZl62HhsgvwDXSCS5hfSeyHZgoAVSFg3fk7ChaAFNKi$
> >
> > This FLIP proposes adding back attached execution for Application Mode.
> In
> > the past attached execution was supported for the per-job mode, which
> will
> > be deprecated and we want to include this feature back into Application
> > mode.
> >
> > Please reply to this email thread and share your thoughts/opinions.
> >
> > Thank you!
> >
> > Allison Chang
> >
>


[jira] [Created] (FLINK-32823) Update the adaptive scheduler doc about batch job limitation

2023-08-09 Thread Rui Fan (Jira)
Rui Fan created FLINK-32823:
---

 Summary: Update the adaptive scheduler doc about batch job 
limitation
 Key: FLINK-32823
 URL: https://issues.apache.org/jira/browse/FLINK-32823
 Project: Flink
  Issue Type: Technical Debt
  Components: Documentation, Runtime / Coordination
Affects Versions: 1.18.0
Reporter: Rui Fan
Assignee: Rui Fan
 Attachments: screenshot-1.png

FLINK-30846 [1] updated the fallback strategy from the default scheduler to 
the AdaptiveBatch scheduler when a batch job enables 
`jobmanager.scheduler: adaptive`.

However, the documentation hasn't been updated accordingly.



[1] https://github.com/apache/flink/pull/21814/files



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-32822) Add connector option to control whether to enable auto-commit of offsets when checkpoints is enabled

2023-08-09 Thread Zhanghao Chen (Jira)
Zhanghao Chen created FLINK-32822:
-

 Summary: Add connector option to control whether to enable 
auto-commit of offsets when checkpoints is enabled
 Key: FLINK-32822
 URL: https://issues.apache.org/jira/browse/FLINK-32822
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / Kafka
Reporter: Zhanghao Chen


When checkpointing is enabled, the Flink Kafka connector commits the current 
consuming offset when checkpoints are *completed*, although the Kafka source 
does *NOT* rely on committed offsets for fault tolerance. When the checkpoint 
interval is long, the lag curve behaves in a zig-zag way: the lag keeps 
increasing and then suddenly drops when a checkpoint completes. This has led 
to some confusion for users, as in 
[https://stackoverflow.com/questions/76419633/flink-kafka-source-commit-offset-to-error-offset-suddenly-increase-or-decrease]
 and may also affect external monitoring for setting up alarms (you'll have to 
use a high threshold due to the non-realtime commit of offsets) and 
autoscaling (the algorithm needs extra effort to distinguish whether the 
backlog is actually growing or the checkpoint simply hasn't completed yet).

Therefore, I think it is worthwhile to add an option to enable auto-commit of 
offsets when checkpointing is enabled. For the DataStream API, this would be a 
new configuration method. For the Table API, this would be a new connector 
option that wires to the DataStream API configuration underneath.
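
For the Table API side, a rough sketch of what such a connector option could
look like. The option name 'scan.auto-commit-offsets.enabled' is purely
hypothetical and only illustrates the idea; the other options are existing
Kafka SQL connector options.

{code:java}
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class AutoCommitOptionSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.inStreamingMode());
        tEnv.executeSql(
                "CREATE TABLE orders (id STRING, amount DOUBLE) WITH ("
                        + " 'connector' = 'kafka',"
                        + " 'topic' = 'orders',"
                        + " 'properties.bootstrap.servers' = 'localhost:9092',"
                        + " 'properties.group.id' = 'order-consumers',"
                        + " 'format' = 'json',"
                        // Hypothetical new option: keep committing offsets
                        // periodically even though checkpointing is enabled.
                        + " 'scan.auto-commit-offsets.enabled' = 'true'"
                        + ")");
    }
}
{code}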

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-32821) Streaming examples failed to execute due to error in packaging

2023-08-09 Thread Zhanghao Chen (Jira)
Zhanghao Chen created FLINK-32821:
-

 Summary: Streaming examples failed to execute due to error in 
packaging
 Key: FLINK-32821
 URL: https://issues.apache.org/jira/browse/FLINK-32821
 Project: Flink
  Issue Type: Improvement
  Components: Examples
Affects Versions: 1.18.0
Reporter: Zhanghao Chen


5 out of the 7 streaming examples failed to run:
 * Iteration, SessionWindowing, SocketWindowWordCount and WindowJoin failed to 
run due to java.lang.NoClassDefFoundError: 
org/apache/flink/streaming/examples/utils/ParameterTool
 * TopSpeedWindowing failed to run due to java.lang.ClassNotFoundException: 
org.apache.flink.connector.datagen.source.GeneratorFunction

The NoClassDefFoundError with ParameterTool was introduced by FLINK-32558 
(Properly deprecate DataSet API), and we'd better resolve FLINK-32820 
(ParameterTool is mistakenly marked as deprecated) first before fixing this 
problem.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-32820) ParameterTool is mistakenly marked as deprecated

2023-08-09 Thread Zhanghao Chen (Jira)
Zhanghao Chen created FLINK-32820:
-

 Summary: ParameterTool is mistakenly marked as deprecated
 Key: FLINK-32820
 URL: https://issues.apache.org/jira/browse/FLINK-32820
 Project: Flink
  Issue Type: Improvement
  Components: API / DataSet, API / DataStream
Affects Versions: 1.18.0
Reporter: Zhanghao Chen


ParameterTool and AbstractParameterTool in the flink-java module are 
mistakenly marked as deprecated in FLINK-32558 (Properly deprecate DataSet 
API). They are widely used for handling application parameters and are also 
listed in the Flink user documentation: 
https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/application_parameters/
Also, they are not directly related to the DataSet API.
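
For context, a minimal sketch of the typical DataStream-application usage 
referred to above (parameter names are illustrative only):

{code:java}
import org.apache.flink.api.java.utils.ParameterTool;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class ParameterToolUsageSketch {
    public static void main(String[] args) throws Exception {
        // Parse "--input ... --parallelism ..." style arguments.
        ParameterTool params = ParameterTool.fromArgs(args);
        String input = params.getRequired("input");
        int parallelism = params.getInt("parallelism", 1);

        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();
        env.setParallelism(parallelism);
        // Make the parameters visible in the web UI and inside user functions.
        env.getConfig().setGlobalJobParameters(params);
        // ... build and execute the pipeline against `input` here ...
    }
}
{code}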



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-32819) flink can not parse the param `#` correctly in k8s application mode

2023-08-09 Thread Jun Zhang (Jira)
Jun Zhang created FLINK-32819:
-

 Summary: flink can not parse the param `#` correctly in k8s 
application mode
 Key: FLINK-32819
 URL: https://issues.apache.org/jira/browse/FLINK-32819
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Configuration
Affects Versions: 1.17.1
Reporter: Jun Zhang
 Fix For: 1.18.0


When I submit a Flink job in K8s application mode with a parameter that 
contains `#` (for example, a MySQL password), Flink cannot parse the 
parameter correctly: the content after the `#` is lost.
{code:java}

/mnt/flink/flink-1.17.0/bin/flink run-application \
-Dexecution.target=kubernetes-application \
-Dkubernetes.container.image=x \
local:///opt/flink/usrlib/my.jar  \
--mysql-conf hostname=localhost \
--mysql-conf username=root \
--mysql-conf password=%&^GGJI#$jh665$fi^% \
--mysql-conf port=3306 

{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] Release flink-connector-mongodb v1.0.2, release candidate 1

2023-08-09 Thread Leonard Xu
Thanks Danny and Jiabao for driving the work!


+1 (binding)

- built from source code succeeded
- verified signatures
- verified hashsums 
- checked release notes
- reviewed the web PR 

Best,
Leonard



> On Aug 6, 2023, at 6:58 PM, Ahmed Hamdy  wrote:
> 
> Thanks Danny
> + 1 (non-binding)
> 
> * Signature and checksums are matching.
> * Source Code builds locally.
> * Web PR looks good.
> 
> Best Regards
> Ahmed Hamdy
> 
> 
> On Fri, 4 Aug 2023 at 17:44, Jiabao Sun 
> wrote:
> 
>> Thanks Danny,
>> 
>> +1 (non-binding)
>> 
>> - Build and compile the source code locally
>> - Tag exists in Github
>> - Checked release notes
>> - Source archive signature/checksum looks good
>> - Binary (from Maven) signature/checksum looks good
>> - Checked the contents contains jar and pom files in apache repo
>> 
>> Non blocking findings:
>> - NOTICE files year is 2022 and needs to be updated to 2023
>> 
>> Best,
>> Jiabao
>> 
>> 
>>> On Aug 4, 2023, at 10:10 PM, Danny Cranmer wrote:
>>> 
>>> Hi everyone,
>>> Please review and vote on the release candidate 1 for the version v1.0.2
>> of
>>> flink-connector-mongodb, as follows:
>>> [ ] +1, Approve the release
>>> [ ] -1, Do not approve the release (please provide specific comments)
>>> 
>>> The complete staging area is available for your review, which includes:
>>> * JIRA release notes [1],
>>> * the official Apache source release to be deployed to dist.apache.org
>> [2],
>>> which are signed with the key with fingerprint
>>> 0F79F2AFB2351BC29678544591F9C1EC125FD8DB [3],
>>> * all artifacts to be deployed to the Maven Central Repository [4],
>>> * source code tag v1.0.2-rc1 [5],
>>> * website pull request listing the new release [6].
>>> * successful CI run on this tag [7].
>>> 
>>> The vote will be open for at least 72 hours. It is adopted by majority
>>> approval, with at least 3 PMC affirmative votes.
>>> 
>>> Thanks,
>>> Danny
>>> 
>>> [1]
>>> 
>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12353146
>>> [2]
>>> 
>> https://dist.apache.org/repos/dist/dev/flink/flink-connector-mongodb-1.0.2-rc1
>>> [3] https://dist.apache.org/repos/dist/release/flink/KEYS
>>> [4]
>> https://repository.apache.org/content/repositories/orgapacheflink-1647/
>>> [5]
>>> 
>> https://github.com/apache/flink-connector-mongodb/releases/tag/v1.0.2-rc1
>>> [6] https://github.com/apache/flink-web/pull/669
>>> [7]
>>> 
>> https://github.com/apache/flink-connector-mongodb/actions/runs/5762820757
>> 
>> 



[jira] [Created] (FLINK-32817) Supports running jar file names with Spaces

2023-08-09 Thread Xianxun Ye (Jira)
Xianxun Ye created FLINK-32817:
--

 Summary: Supports running jar file names with Spaces
 Key: FLINK-32817
 URL: https://issues.apache.org/jira/browse/FLINK-32817
 Project: Flink
  Issue Type: Improvement
  Components: Deployment / YARN
Affects Versions: 1.14.0
Reporter: Xianxun Ye


When submitting a Flink jar to a YARN cluster, if the jar filename contains 
spaces, the file path cannot be parsed successfully in 
`YarnLocalResourceDescriptor`, and the following exception occurs in the 
JobManager.

The Flink jar file name is: StreamSQLExample 2.jar

 
{code:java}
2023-08-09 18:54:31,787 WARN  
org.apache.flink.runtime.extension.resourcemanager.NeActiveResourceManager [] - 
Failed requesting worker with resource spec WorkerResourceSpec {cpuCores=1.0, 
taskHeapSize=220.160mb (230854450 bytes), taskOffHeapSize=0 bytes, 
networkMemSize=158.720mb (166429984 bytes), managedMemSize=952.320mb (998579934 
bytes), numSlots=1}, current pending count: 0
java.util.concurrent.CompletionException: org.apache.flink.util.FlinkException: 
Error to parse YarnLocalResourceDescriptor from 
YarnLocalResourceDescriptor{key=StreamSQLExample 2.jar, 
path=hdfs://***/.flink/application_1586413220781_33151/StreamSQLExample 2.jar, 
size=7937, modificationTime=1691578403748, visibility=APPLICATION, type=FILE}
    at 
org.apache.flink.util.concurrent.FutureUtils.lambda$supplyAsync$21(FutureUtils.java:1052)
 ~[flink-dist_2.12-1.14.0.jar:1.14.0]
    at 
java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
 ~[?:1.8.0_152]
    at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
~[?:1.8.0_152]
    at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
~[?:1.8.0_152]
    at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_152]
Caused by: org.apache.flink.util.FlinkException: Error to parse 
YarnLocalResourceDescriptor from 
YarnLocalResourceDescriptor{key=StreamSQLExample 2.jar, 
path=hdfs://sloth-jd-pub/user/sloth/.flink/application_1586413220781_33151/StreamSQLExample
 2.jar, size=7937, modificationTime=1691578403748, visibility=APPLICATION, 
type=FILE}
    at 
org.apache.flink.yarn.YarnLocalResourceDescriptor.fromString(YarnLocalResourceDescriptor.java:112)
 ~[flink-dist_2.12-1.14.0.jar:1.14.0]
    at 
org.apache.flink.yarn.Utils.decodeYarnLocalResourceDescriptorListFromString(Utils.java:600)
 ~[flink-dist_2.12-1.14.0.jar:1.14.0]
    at org.apache.flink.yarn.Utils.createTaskExecutorContext(Utils.java:491) 
~[flink-dist_2.12-1.14.0.jar:1.14.0]
    at 
org.apache.flink.yarn.YarnResourceManagerDriver.createTaskExecutorLaunchContext(YarnResourceManagerDriver.java:452)
 ~[flink-dist_2.12-1.14.0.jar:1.14.0]
    at 
org.apache.flink.yarn.YarnResourceManagerDriver.lambda$startTaskExecutorInContainerAsync$1(YarnResourceManagerDriver.java:383)
 ~[flink-dist_2.12-1.14.0.jar:1.14.0]
    at 
org.apache.flink.util.concurrent.FutureUtils.lambda$supplyAsync$21(FutureUtils.java:1050)
 ~[flink-dist_2.12-1.14.0.jar:1.14.0]
    ... 4 more{code}
From what I understand, HDFS allows file names with spaces, and so does S3.

 

I think we could replace `LOCAL_RESOURCE_DESC_FORMAT` with
{code:java}
private static final Pattern LOCAL_RESOURCE_DESC_FORMAT =
        Pattern.compile(
                "YarnLocalResourceDescriptor\\{"
                        + "key=([\\S\\x20]+), path=([\\S\\x20]+), size=([\\d]+),"
                        + " modificationTime=([\\d]+), visibility=(\\S+), type=(\\S+)}");
{code}
i.e. add '\x20' so that the key and path groups also match spaces.
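
A small, self-contained sketch (plain java.util.regex, no Flink dependencies; 
the HDFS path is shortened/illustrative) to check that the amended pattern 
accepts a descriptor whose key and path contain spaces:

{code:java}
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ResourceDescriptorPatternCheck {
    // The amended pattern proposed above, with '\x20' added to the key/path groups.
    private static final Pattern LOCAL_RESOURCE_DESC_FORMAT =
            Pattern.compile(
                    "YarnLocalResourceDescriptor\\{"
                            + "key=([\\S\\x20]+), path=([\\S\\x20]+), size=([\\d]+),"
                            + " modificationTime=([\\d]+), visibility=(\\S+), type=(\\S+)}");

    public static void main(String[] args) {
        String desc =
                "YarnLocalResourceDescriptor{key=StreamSQLExample 2.jar,"
                        + " path=hdfs:///user/flink/.flink/app_1/StreamSQLExample 2.jar,"
                        + " size=7937, modificationTime=1691578403748,"
                        + " visibility=APPLICATION, type=FILE}";
        Matcher matcher = LOCAL_RESOURCE_DESC_FORMAT.matcher(desc);
        boolean matches = matcher.matches();
        System.out.println(matches);                          // expected: true
        System.out.println(matches ? matcher.group(1) : "");  // expected: StreamSQLExample 2.jar
    }
}
{code}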



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-32818) Support region failover for adaptive scheduler

2023-08-09 Thread Rui Fan (Jira)
Rui Fan created FLINK-32818:
---

 Summary: Support region failover for adaptive scheduler
 Key: FLINK-32818
 URL: https://issues.apache.org/jira/browse/FLINK-32818
 Project: Flink
  Issue Type: Improvement
Reporter: Rui Fan
Assignee: Rui Fan


The region failover strategy is useful for fast failover and reduces the 
impact on the business side. However, the adaptive scheduler doesn't support 
it so far.


https://nightlies.apache.org/flink/flink-docs-master/docs/ops/state/task_failure_recovery/#restart-pipelined-region-failover-strategy



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] FLIP-322 Cooldown period for adaptive scheduler. Second vote.

2023-08-09 Thread Maximilian Michels
+1 (binding)

-Max

On Tue, Aug 8, 2023 at 10:56 AM Etienne Chauchot  wrote:
>
> Hi all,
>
> As part of Flink bylaws, binding votes for FLIP changes are active
> committer votes.
>
> Up to now, we have only 2 binding votes. Can one of the committers/PMC
> members vote on this FLIP ?
>
> Thanks
>
> Etienne
>
>
> > On 08/08/2023 at 10:19, Etienne Chauchot wrote:
> >
> > Hi Joseph,
> >
> > Thanks for the detailled review !
> >
> > Best
> >
> > Etienne
> >
> >> On 14/07/2023 at 11:57, Prabhu Joseph wrote:
> >> *+1 (non-binding)*
> >>
> >> Thanks for working on this. We have seen good improvement during the cool
> >> down period with this feature.
> >> Below are details on the test results from one of our clusters:
> >>
> >> On a scale-out operation, 8 new nodes were added one by one with a gap of
> >> ~30 seconds. There were 8 restarts within 4 minutes with the default
> >> behaviour,
> >> whereas only one with this feature (cooldown period of 4 minutes).
> >>
> >> The number of records processed by the job with this feature during the
> >> restart window is higher (2909764), whereas it is only 1323960 with the
> >> default
> >> behaviour due to multiple restarts, where it spends most of the time
> >> recovering, and also whatever work progressed by the tasks after the last
> >> successful completed checkpoint is lost.
> >>
> >> Metrics             | Default Adaptive Scheduler                         | Adaptive Scheduler With Cooldown Period
> >> NumRecordsProcessed | 1323960                                            | 2909764
> >> Job Parallelism     | 13 -> 20 -> 27 -> 34 -> 41 -> 48 -> 55 -> 62 -> 69 | 13 -> 69
> >> NumRestarts         | 8                                                  | 1
> >>
> >> Remarks:
> >> 1. The NumRecordsProcessed metric shows the difference the cooldown period
> >> makes. When the job is doing multiple restarts, the task spends most of the
> >> time recovering, and the progress the task made after the last successful
> >> checkpoint is lost on each restart.
> >> 2. There was only one restart with the cooldown period, which happened when
> >> the 8th node was added back.
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >> On Wed, Jul 12, 2023 at 8:03 PM Etienne Chauchot
> >> wrote:
> >>
> >>> Hi all,
> >>>
> >>> I'm going on vacation tonight for 3 weeks.
> >>>
> >>> Even if the vote is not finished, as the implementation is rather quick
> >>> and the design discussion had settled, I preferred I implementing
> >>> FLIP-322 [1] to allow people to take a look while I'm off.
> >>>
> >>> [1]https://github.com/apache/flink/pull/22985
> >>>
> >>> Best
> >>>
> >>> Etienne
> >>>
> >>> On 12/07/2023 at 09:56, Etienne Chauchot wrote:
>  Hi all,
> 
>  Would you mind casting your vote to this second vote thread (opened
>  after new discussions) so that the subject can move forward ?
> 
>  @David, @Chesnay, @Robert you took part to the discussions, can you
>  please sent your vote ?
> 
>  Thank you very much
> 
>  Best
> 
>  Etienne
> 
>  On 06/07/2023 at 13:02, Etienne Chauchot wrote:
> > Hi all,
> >
> > Thanks for your feedback about the FLIP-322: Cooldown period for
> > adaptive scheduler [1].
> >
> > This FLIP was discussed in [2].
> >
> > I'd like to start a vote for it. The vote will be open for at least 72
> > hours (until July 9th 15:00 GMT) unless there is an objection or
> > insufficient votes.
> >
> > [1]
> >
> >>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-322+Cooldown+period+for+adaptive+scheduler
> > [2]https://lists.apache.org/thread/qvgxzhbp9rhlsqrybxdy51h05zwxfns6
> >
> > Best,
> >
> > Etienne


Re: [ANNOUNCE] New Apache Flink Committer - Yanfei Lei

2023-08-09 Thread Benchao Li
Congrats, YanFei!

Jing Ge wrote on Tue, Aug 8, 2023 at 17:41:

> Congrats, YanFei!
>
> Best regards,
> Jing
>
> On Tue, Aug 8, 2023 at 3:04 PM Yangze Guo  wrote:
>
> > Congrats, Yanfei!
> >
> > Best,
> > Yangze Guo
> >
> > On Tue, Aug 8, 2023 at 9:20 AM yuxia 
> wrote:
> > >
> > > Congratulations, Yanfei!
> > >
> > > Best regards,
> > > Yuxia
> > >
> > > - Original Message -
> > > From: "ron9 liu"
> > > To: "dev"
> > > Sent: Monday, Aug 7, 2023, 11:44:23 PM
> > > Subject: Re: [ANNOUNCE] New Apache Flink Committer - Yanfei Lei
> > >
> > > Congratulations Yanfei!
> > >
> > > Best,
> > > Ron
> > >
> > > > Zakelly Lan wrote on Mon, Aug 7, 2023 at 23:15:
> > >
> > > > Congratulations, Yanfei!
> > > >
> > > > Best regards,
> > > > Zakelly
> > > >
> > > > On Mon, Aug 7, 2023 at 9:04 PM Lincoln Lee 
> > wrote:
> > > > >
> > > > > Congratulations, Yanfei!
> > > > >
> > > > > Best,
> > > > > Lincoln Lee
> > > > >
> > > > >
> > > > > > Weihua Hu wrote on Mon, Aug 7, 2023 at 20:43:
> > > > >
> > > > > > Congratulations Yanfei!
> > > > > >
> > > > > > Best,
> > > > > > Weihua
> > > > > >
> > > > > >
> > > > > > On Mon, Aug 7, 2023 at 8:08 PM Feifan Wang 
> > wrote:
> > > > > >
> > > > > > > Congratulations Yanfei! :)
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > ——
> > > > > > > Name: Feifan Wang
> > > > > > > Email: zoltar9...@163.com
> > > > > > >
> > > > > > >
> > > > > > >  Replied Message 
> > > > > > > | From | Matt Wang |
> > > > > > > | Date | 08/7/2023 19:40 |
> > > > > > > | To | dev@flink.apache.org |
> > > > > > > | Subject | Re: [ANNOUNCE] New Apache Flink Committer - Yanfei
> > Lei |
> > > > > > > Congratulations Yanfei!
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > >
> > > > > > > Best,
> > > > > > > Matt Wang
> > > > > > >
> > > > > > >
> > > > > > >  Replied Message 
> > > > > > > | From | Mang Zhang |
> > > > > > > | Date | 08/7/2023 18:56 |
> > > > > > > | To |  |
> > > > > > > | Subject | Re:Re: [ANNOUNCE] New Apache Flink Committer -
> Yanfei
> > > > Lei |
> > > > > > > Congratulations--
> > > > > > >
> > > > > > > Best regards,
> > > > > > > Mang Zhang
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On 2023-08-07 18:17:58, "Yuxin Tan" wrote:
> > > > > > > Congrats, Yanfei!
> > > > > > >
> > > > > > > Best,
> > > > > > > Yuxin
> > > > > > >
> > > > > > >
> > > > > > > weijie guo wrote on Mon, Aug 7, 2023 at 17:59:
> > > > > > >
> > > > > > > Congrats, Yanfei!
> > > > > > >
> > > > > > > Best regards,
> > > > > > >
> > > > > > > Weijie
> > > > > > >
> > > > > > >
> > > > > > > Biao Geng wrote on Mon, Aug 7, 2023 at 17:03:
> > > > > > >
> > > > > > > Congrats, Yanfei!
> > > > > > > Best,
> > > > > > > Biao Geng
> > > > > > >
> > > > > > > Sent from Outlook for iOS
> > > > > > >
> > > > > > > From: Qingsheng Ren
> > > > > > > Sent: Monday, August 7, 2023 4:23:52 PM
> > > > > > > To: dev@flink.apache.org
> > > > > > > Subject: Re: [ANNOUNCE] New Apache Flink Committer - Yanfei Lei
> > > > > > >
> > > > > > > Congratulations and welcome, Yanfei!
> > > > > > >
> > > > > > > Best,
> > > > > > > Qingsheng
> > > > > > >
> > > > > > > On Mon, Aug 7, 2023 at 4:19 PM Matthias Pohl <
> > matthias.p...@aiven.io
> > > > > > > .invalid>
> > > > > > > wrote:
> > > > > > >
> > > > > > > Congratulations, Yanfei! :)
> > > > > > >
> > > > > > > On Mon, Aug 7, 2023 at 10:00 AM Junrui Lee <
> jrlee@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > Congratulations Yanfei!
> > > > > > >
> > > > > > > Best,
> > > > > > > Junrui
> > > > > > >
> > > > > > > Yun Tang wrote on Mon, Aug 7, 2023 at 15:19:
> > > > > > >
> > > > > > > Congratulations, Yanfei!
> > > > > > >
> > > > > > > Best
> > > > > > > Yun Tang
> > > > > > > 
> > > > > > > From: Danny Cranmer 
> > > > > > > Sent: Monday, August 7, 2023 15:10
> > > > > > > To: dev 
> > > > > > > Subject: Re: [ANNOUNCE] New Apache Flink Committer - Yanfei Lei
> > > > > > >
> > > > > > > Congrats Yanfei! Welcome to the team.
> > > > > > >
> > > > > > > Danny
> > > > > > >
> > > > > > > On Mon, 7 Aug 2023, 08:03 Rui Fan, <1996fan...@gmail.com>
> wrote:
> > > > > > >
> > > > > > > Congratulations Yanfei!
> > > > > > >
> > > > > > > Best,
> > > > > > > Rui
> > > > > > >
> > > > > > > On Mon, Aug 7, 2023 at 2:56 PM Yuan Mei <
> yuanmei.w...@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > On behalf of the PMC, I'm happy to announce Yanfei Lei as a new
> > > > > > > Flink
> > > > > > > Committer.
> > > > > > >
> > > > > > > Yanfei has been active in the Flink community for almost two
> > > > > > > years
> > > > > > > and
> > > > > > > has
> > > > > > > played an important role in developing and maintaining State
> > > > > > > and
> > > > > > > Checkpoint
> > > > > > > related features/components, including RocksDB Rescaling
> > > > > > > Performance
> > > > > > > Improvement and Generic Incremental Checkpoints.
> > > > > > >
> > > > > > > Yanfei also helps improve community infrastructure in 

Re: [ANNOUNCE] New Apache Flink Committer - Hangxiang Yu

2023-08-09 Thread Benchao Li
Congrats, Hangxiang!

Jing Ge wrote on Tue, Aug 8, 2023 at 17:44:

> Congrats, Hangxiang!
>
> Best regards,
> Jing
>
> On Tue, Aug 8, 2023 at 3:04 PM Yangze Guo  wrote:
>
> > Congrats, Hangxiang!
> >
> > Best,
> > Yangze Guo
> >
> > On Tue, Aug 8, 2023 at 11:28 AM yh z  wrote:
> > >
> > > Congratulations, Hangxiang !
> > >
> > >
> > > Best,
> > > Yunhong Zheng (Swuferhong)
> > >
> > > yuxia wrote on Tue, Aug 8, 2023 at 09:20:
> > >
> > > > Congratulations, Hangxiang !
> > > >
> > > > Best regards,
> > > > Yuxia
> > > >
> > > > - Original Message -
> > > > From: "Wencong Liu"
> > > > To: "dev"
> > > > Sent: Monday, Aug 7, 2023, 11:55:24 PM
> > > > Subject: Re: [ANNOUNCE] New Apache Flink Committer - Hangxiang Yu
> > > >
> > > > Congratulations, Hangxiang !
> > > >
> > > >
> > > > Best,
> > > > Wencong
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > At 2023-08-07 14:57:49, "Yuan Mei"  wrote:
> > > > >On behalf of the PMC, I'm happy to announce Hangxiang Yu as a new
> > Flink
> > > > >Committer.
> > > > >
> > > > >Hangxiang has been active in the Flink community for more than 1.5
> > years
> > > > >and has played an important role in developing and maintaining State
> > and
> > > > >Checkpoint related features/components, including Generic
> Incremental
> > > > >Checkpoints (take great efforts to make the feature prod-ready).
> > Hangxiang
> > > > >is also the main driver of the FLIP-263: Resolving schema
> > compatibility.
> > > > >
> > > > >Hangxiang is passionate about the Flink community. Besides the
> > technical
> > > > >contribution above, he is also actively promoting Flink: talks about
> > > > Generic
> > > > >Incremental Checkpoints in Flink Forward and Meet-up. Hangxiang also
> > spent
> > > > >a good amount of time supporting users, participating in
> Jira/mailing
> > list
> > > > >discussions, and reviewing code.
> > > > >
> > > > >Please join me in congratulating Hangxiang for becoming a Flink
> > Committer!
> > > > >
> > > > >Thanks,
> > > > >Yuan Mei (on behalf of the Flink PMC)
> > > >
> >
>


-- 

Best,
Benchao Li


Re: [ANNOUNCE] New Apache Flink PMC Member - Matthias Pohl

2023-08-09 Thread Benchao Li
Congratulations, Matthias!

Maximilian Michels wrote on Tue, Aug 8, 2023 at 17:54:

> Congrats, well done, and welcome to the PMC Matthias!
>
> -Max
>
> On Tue, Aug 8, 2023 at 8:36 AM yh z  wrote:
> >
> > Congratulations, Matthias!
> >
> > Best,
> > Yunhong Zheng (Swuferhong)
> >
> > Ryan Skraba wrote on Mon, Aug 7, 2023 at 21:39:
> >
> > > Congratulations Matthias -- very well-deserved, the community is lucky
> to
> > > have you <3
> > >
> > > All my best, Ryan
> > >
> > > On Mon, Aug 7, 2023 at 3:04 PM Lincoln Lee 
> wrote:
> > >
> > > > Congratulations!
> > > >
> > > > Best,
> > > > Lincoln Lee
> > > >
> > > >
> > > > Feifan Wang wrote on Mon, Aug 7, 2023 at 20:13:
> > > >
> > > > > Congrats Matthias!
> > > > >
> > > > >
> > > > >
> > > > > ——
> > > > > Name: Feifan Wang
> > > > > Email: zoltar9...@163.com
> > > > >
> > > > >
> > > > >  Replied Message 
> > > > > | From | Matthias Pohl |
> > > > > | Date | 08/7/2023 16:16 |
> > > > > | To |  |
> > > > > | Subject | Re: [ANNOUNCE] New Apache Flink PMC Member - Matthias
> Pohl
> > > |
> > > > > Thanks everyone. :)
> > > > >
> > > > > On Mon, Aug 7, 2023 at 3:18 AM Andriy Redko 
> wrote:
> > > > >
> > > > > Congrats Matthias, well deserved!!
> > > > >
> > > > > DC> Congrats Matthias!
> > > > >
> > > > > DC> Very well deserved, thankyou for your continuous, consistent
> > > > > contributions.
> > > > > DC> Welcome.
> > > > >
> > > > > DC> Thanks,
> > > > > DC> Danny
> > > > >
> > > > > DC> On Fri, Aug 4, 2023 at 9:30 AM Feng Jin  >
> > > > wrote:
> > > > >
> > > > > Congratulations, Matthias!
> > > > >
> > > > > Best regards
> > > > >
> > > > > Feng
> > > > >
> > > > > On Fri, Aug 4, 2023 at 4:29 PM weijie guo <
> guoweijieres...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > Congratulations, Matthias!
> > > > >
> > > > > Best regards,
> > > > >
> > > > > Weijie
> > > > >
> > > > >
> > > > > Wencong Liu wrote on Fri, Aug 4, 2023 at 15:50:
> > > > >
> > > > > Congratulations, Matthias!
> > > > >
> > > > > Best,
> > > > > Wencong Liu
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > At 2023-08-04 11:18:00, "Xintong Song" 
> > > > > wrote:
> > > > > Hi everyone,
> > > > >
> > > > > On behalf of the PMC, I'm very happy to announce that Matthias Pohl
> > > > > has
> > > > > joined the Flink PMC!
> > > > >
> > > > > Matthias has been consistently contributing to the project since
> > > > > Sep
> > > > > 2020,
> > > > > and became a committer in Dec 2021. He mainly works in Flink's
> > > > > distributed
> > > > > coordination and high availability areas. He has worked on many
> > > > > FLIPs
> > > > > including FLIP195/270/285. He helped a lot with the release
> > > > > management,
> > > > > being one of the Flink 1.17 release managers and also very active
> > > > > in
> > > > > Flink
> > > > > 1.18 / 2.0 efforts. He also contributed a lot to improving the
> > > > > build
> > > > > stability.
> > > > >
> > > > > Please join me in congratulating Matthias!
> > > > >
> > > > > Best,
> > > > >
> > > > > Xintong (on behalf of the Apache Flink PMC)
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > >
> > >
>


-- 

Best,
Benchao Li


[jira] [Created] (FLINK-32816) Remove the local recovery limitation for adaptive scheduler limitation

2023-08-09 Thread Rui Fan (Jira)
Rui Fan created FLINK-32816:
---

 Summary: Remove the local recovery limitation for adaptive 
scheduler limitation
 Key: FLINK-32816
 URL: https://issues.apache.org/jira/browse/FLINK-32816
 Project: Flink
  Issue Type: Improvement
Reporter: Rui Fan
Assignee: Rui Fan
 Attachments: image-2023-08-09-17-42-55-213.png

FLINK-21450 added support for local recovery with the adaptive scheduler, so 
this limitation can be removed.

 !image-2023-08-09-17-42-55-213.png! 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] How about adding OLAP to Flink Roadmap?

2023-08-09 Thread xiangyu feng
Thank you Shammon for initiating this discussion. As one of the Flink OLAP
developers in ByteDance, I would also like to share a real case of our
users.

About two years ago, we found our first OLAP user internally by integrating
Flink OLAP with ByteHTAP. Users are willing to use Flink as an OLAP engine
mainly because they want Flink's rich cross-datasource join capability. In
the beginning, we only supported simple query patterns with a QPS below 2
and fewer than 5 joins. With the business growing and our system
capabilities evolving, users have moved more scenarios to Flink OLAP, and
the query patterns have become more and more complicated. By early this
year, the users' query patterns had grown to a peak QPS greater than 20,
more than 30 joined tables, and a scan data volume exceeding 1 billion rows.
Even with the evolution of our engine over the past two years, computing at
this scale is still very challenging. It is difficult to satisfy the
computation scale, system stability and query latency requirements at the
same time.

After talking to our users, we could easily build some intermediate views
using Flink's streaming and batch engines. In a way similar to materialized
views, we optimized the users' query patterns so that a single query has
fewer than 10 joins and a scan data volume in the tens of millions of rows,
while the QPS remains unchanged. In this way, our OLAP service has not only
fully met the business requirements, but we have also made the migration
process very smooth, thanks to Flink's powerful streaming and batch
computing ecosystem. As a result, we are highly recognized by our users.

*There are two points I want to make with this case:*

1. Although there are many OLAP engines out there, Flink may not always
provide the best performance. But thanks to Flink's strong ecosystem, we
are confident that we can build an OLAP engine that provides a great user
experience. This is very important for many small and medium-sized
companies;

2. From another perspective, I personally believe that building OLAP will
'bring Flink closer to our end-users' and present a wider variety of
computational challenges to Flink. As in the case mentioned above, it is
very common in data analytics for Flink to be used to precompute data and
feed it to OLAP services. These precomputations are often designed to
compensate for the limitations of other OLAP engines: some engines may not
have strong join capabilities, some may not have complete SQL support, and
some may have weak plan optimization. In such setups, Flink does not face
these users directly, and thus cannot build a comprehensive end-to-end
solution that solves the problem once and for all. *By building Flink OLAP,
we can finally fill in the last missing piece of the puzzle!*

Of course, as we built Flink OLAP internally, we encountered many
challenging issues, which is why we are putting this discussion out there
and hoping to involve more contributors. Meanwhile, we also hope to
contribute our optimizations back to the community through FLINK-25318.

So for me, it is a big +1 to add OLAP to Flink Roadmap. ^-^

Best,
Xiangyu

Xintong Song wrote on Tue, Aug 8, 2023 at 18:01:

> Thanks for bringing this up, Shammon.
>
> In general, I'd be +1 to improve Flink's ability to serve as an OLAP
> engine.
>
> I see a great value in Flink becoming a unified Large-scale Data Processing
> / Analysis tool. Through my interactions with users (Alibaba internal
> users, external users on Alibaba Cloud, developers from other companies via
> conferences / meetups), it's commonly complained how complicated and costly
> it is to build a data processing platform out of a bunch of different
> tools. That usually means higher learning / developing / operation and
> maintenance cost. In addition, I also see a trend that many projects and
> products are going along the same direction.
>
> I personally would not be concerned about losing focus. Unlike Apache
> Paimon (Flink Table Store) which tries to solve a completely different
> problem other than data processing, OLAP querying is just a special case of
> batch SQL data processing, where typically you have massive concurrent
> short-lived queries. As Shammon mentioned, Flink already has most of the
> essential building blocks: batch SQL processing, session mode, sql-gateway,
> etc. IMHO, the missing piece is mostly about improving the performance in
> the specific OLAP scenarios. That sounds like a reasonable extension to me.
>
> I'd consider improving the OLAP capability as nice-to-have improvements.
> That is to say, it must not come at the price of sacrificing the
> experiences in other streaming / batch scenarios, nor significantly
> complicate the system. I think one of the reasons that FLINK-25318 became
> stale was that some of the proposed solutions are too dedicated for the
> OLAP scenarios and require extra efforts to carefully re-design in order to
> not affect other scenarios. I'd be glad to see such efforts being revived.
>
> Regarding whether to include