Re: [DISCUSS] String literal behavior in Flink

2023-03-05 Thread Jark Wu
Hi Aitozi,

I would suggest trying to contribute it to the upstream project Calcite first. 

Best,
Jark

> On Mar 6, 2023, at 11:51, Aitozi wrote:
> 
> Hi Jark,
> 
> Thank you for your helpful suggestion. It appears that E'foo\n' is a more
> versatile and widely accepted option. To assess its feasibility, I have
> reviewed the existing Unicode escape support and concluded that it may
> require modifications to the Parser.jj file to accommodate this new
> syntax.
> 
> 
> I am unsure whether we should initially incorporate this alteration in
> Calcite or if we can directly supersede the StringLiteral behavior within
> the Flink project. Nevertheless, I believe supporting this change is
> achievable.
> 
> 
> 
> Thanks,
> Aitozi.
> 
> On Mon, Mar 6, 2023 at 10:16, Jark Wu wrote:
> 
>> Hi Aitozi,
>> 
>> I think this is a good idea to improve the backslash escape strings.
>> However, I lean a bit more toward the Postgres approach[1],
>> which is more standard-compliant. PG allows backslash escape
>> string by writing the letter E (upper or lower case) just before the
>> opening single quote, e.g., E'foo\n'.
>> 
>> Recognizing backslash escapes in both regular and escape string constants
>> is not backward compatible in Flink, and is also deprecated in PG.
>> 
>> In addition, Flink also supports Unicode escape string constants by
>> writing U& before the opening quote[2], which works in the same way as
>> backslash escape strings.
>> 
>> Best,
>> Jark
>> 
>> [1]:
>> 
>> https://www.postgresql.org/docs/current/sql-syntax-lexical.html#SQL-SYNTAX-CONSTANTS
>> [2]:
>> 
>> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/queries/overview/
>> 
>> On Sat, 4 Mar 2023 at 23:31, Aitozi  wrote:
>> 
>>> Hi,
>>>  I encountered a problem when using string literal in Flink. Currently,
>>> Flink will escape the string literal during codegen, so for the query
>>> below:
>>> 
>>> SELECT 'a\nb'; it will print => a\nb
>>> 
>>> then for the query
>>> 
>>> SELECT SPLIT_INDEX(col, '\n', 0);
>>> 
>>> The column cannot be split on the newline this way. To split on the newline,
>>> we have to use
>>> 
>>> SELECT SPLIT_INDEX(col, '
>>> ', 0)
>>> 
>>> or
>>> 
>>> SELECT SPLIT_INDEX(col, CHR(10), 0)
>>> 
>>> The above way could be more intuitive. Some other databases support these
>>> "Special Character Escape Sequences"[1].
>>> 
>>> In this way, we can directly use
>>> SELECT SPLIT_INDEX(col, '\n', 0); for the query.
>>> 
>>> I know this is not standard behavior in ANSI SQL. I'm opening this thread
>>> for some opinions from the community guys.
>>> 
>>> [1]:
>>> 
>>> 
>> https://dev.mysql.com/doc/refman/8.0/en/string-literals.html#character-escape-sequences
>>> 
>>> Thanks,
>>> Aitozi
>>> 
>> 



[DISCUSS] FLIP-301: Hybrid Shuffle supports Remote Storage

2023-03-05 Thread Yuxin Tan
Hi everyone,

I would like to start a discussion on FLIP-301: Hybrid Shuffle supports
Remote Storage[1].

In the cloud-native environment, it is difficult to determine the
appropriate disk space for Batch shuffle, which will affect job stability.

This FLIP is to support Remote Storage for Hybrid Shuffle to improve the
Batch job stability in the cloud-native environment.

The goals of this FLIP are as follows.
1. By default, use the local memory and disk to ensure high shuffle
performance if the local storage space is sufficient.
2. When the local storage space is insufficient, use remote storage as
a supplement to avoid large-scale Batch job failure.
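
To make the two goals above concrete, here is a minimal sketch of a "local first, remote as fallback" decision. It is only an illustration under stated assumptions, not the design from the FLIP: the function name, the three tier labels, and the byte-based thresholds are all hypothetical.

```python
def choose_shuffle_tier(segment_bytes, free_memory_bytes, free_disk_bytes):
    """Pick a storage tier for one shuffle segment.

    Hypothetical policy for illustration only: prefer local memory, then
    local disk, and fall back to remote storage when neither has room.
    """
    if segment_bytes <= free_memory_bytes:
        return "memory"
    if segment_bytes <= free_disk_bytes:
        return "disk"
    # Local space is insufficient, so use remote storage instead of failing.
    return "remote"


if __name__ == "__main__":
    print(choose_shuffle_tier(64, 128, 1024))
    print(choose_shuffle_tier(512, 128, 1024))
    print(choose_shuffle_tier(4096, 128, 1024))
```

The point of the sketch is that remote storage is consulted only after both local options are exhausted, so jobs with enough local space should keep the local shuffle path.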

Looking forward to hearing from you.

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-301%3A+Hybrid+Shuffle+supports+Remote+Storage

Best,
Yuxin


[jira] [Created] (FLINK-31329) Fix Parquet stats extractor

2023-03-05 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-31329:


 Summary: Fix Parquet stats extractor
 Key: FLINK-31329
 URL: https://issues.apache.org/jira/browse/FLINK-31329
 Project: Flink
  Issue Type: Bug
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.4.0


Some bugs in the Parquet stats extractor:
 # Decimal support
 # Timestamp support
 # Null nullCounts support



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] FLIP-298: Unifying the Implementation of SlotManager

2023-03-05 Thread weijie guo
Hi Weihua,

Thanks for your clarification, SGTM.

Best regards,

Weijie


On Mon, Mar 6, 2023 at 11:43, Weihua Hu wrote:

> Thanks Weijie.
>
> Heterogeneous task managers will not be considered in this FLIP since
> it does not request heterogeneous resources as you said.
>
> My first thought is that we can change the meaning of the redundant
> configuration to the number of redundant task managers per resource type.
> This can be considered in detail when we decide to support heterogeneous
> task managers.
>
> Best,
> Weihua
>
>
> On Sat, Mar 4, 2023 at 1:13 AM weijie guo 
> wrote:
>
> > Thanks Weihua for preparing this FLIP.
> >
> > This FLIP overall looks reasonable to me after updating as suggested by
> > Matthias.
> >
> > I only have one small question about keeping some redundant task
> managers:
> > In the fine-grained resource management, theoretically, it can support
> > heterogeneous taskmanagers. When we complete the missing features for
> FGSM,
> > do we plan to take this into account?
> > Of course, if I remember correctly, FGSM will not request heterogeneous
> > resources at present, so it is also acceptable to me if there is no
> special
> > treatment now.
> >
> > +1 for these changes if we can ensure the test coverage.
> >
> > Best regards,
> >
> > Weijie
> >
> >
> > On Thu, Mar 2, 2023 at 12:53, John Roesler wrote:
> >
> > > Thanks for the test plan, Weihua!
> > >
> > > Yes, it addresses my concerns.
> > >
> > > Thanks,
> > > John
> > >
> > > On Wed, Mar 1, 2023, at 22:38, Weihua Hu wrote:
> > > > Hi, everyone,
> > > > Thanks for your suggestions and ideas.
> > > > Thanks Xintong for sharing the detailed backgrounds of SlotManager.
> > > >
> > > > *@Matthias
> > > >
> > > > 1. Did you do a proper test coverage analysis?
> > > >
> > > >
> > > > Just as Xintong said, we already have a CI stage for fine grained
> > > resource
> > > > managers.
> > > > And I will make sure FineGrainedSlotManager as the default
> SlotManager
> > > can
> > > > pass all the tests of CI.
> > > > In addition, I will review all unit tests of
> > DeclarativeSlotManager(DSM)
> > > to
> > > > ensure that there are no gaps in the
> > > > coverage provided by the FineGrainedSlotManager.
> > > > I also added the 'Test Plan' part to the FLIP.
> > > > @Matthias @John @Shammon Does this test plan address your concerns?
> > > >
> > > > 2.  DeclarativeSlotManager and FineGrainedSlotManager feel quite big
> in
> > > >
> > > > terms of lines of code
> > > >
> > > >
> > > > IMO, the refactoring of SlotManager does not belong to this FLIP
> since
> > it
> > > > may lead to some unstable risks. For
> > > > FineGrainedSlotManager(FGSM), we already split some reasonable
> > > components.
> > > > They are:
> > > > * TaskManagerTracker: Track task managers and their resources.
> > > > * ResourceTracker: track requirements of jobs
> > > > * ResourceAllocationStrategy: Try to fulfill the resource
> requirements
> > > with
> > > > available/pending resources.
> > > > * SlotStatusSyncer: communicate with TaskManager, for
> > allocating/freeing
> > > > slot and reconciling the slot status
> > > > Maybe we can start a discussion about refactoring SlotManager in
> > another
> > > > FLIP if there are some good suggestions.
> > > > WDYT
> > > >
> > > > 3. For me personally, having a more detailed summary comparing the
> > > >> subcomponents of both SlotManager implementations with where
> > > >> their functionality matches and where they differ might help
> > understand
> > > the
> > > >> consequences of the changes proposed in FLIP-298
> > > >
> > > > Good suggestion, I have updated the comparison in this FLIP. Looking
> > > > forward to any suggestions/thoughts
> > > > if they are not described clearly.
> > > >
> > > > *@John
> > > >
> > > > 4. In addition to changing the default, would it make sense to log a
> > > >> deprecation warning on initialization
> > > >
> > > > if the DeclarativeSlotManager is used?
> > > >>
> > > > SGTM, We should add Deprecated annotations to DSM for devs. And log a
> > > > deprecation warning for users.
> > > >
> > > > *@Shammon
> > > >
> > > > 1. For their functional differences, can you give some detailed tests
> > to
> > > >> verify that the new FineGrainedSlotManager has these capabilities?
> > This
> > > can
> > > >> effectively verify the new functions
> > > >>
> > > > As just mentioned, there is already a CI stage for FGSM, and I will
> > > > do more review of the unit tests for DSM.
> > > >
> > > >  2. I'm worried that many functions are not independent and it is
> > > difficult
> > > >> to migrate step-by-step. You can list the relationship between them
> in
> > > >> detail.
> > > >
> > > >  As Xintong said, the DSM is a subset of FGSM by design. But as time
> > > > has gone on, FGSM has come to lack some functions, as I listed in
> > > > this FLIP. I have also added the comparison between DSM and FGSM
> > > > to this FLIP.
> > > >
> > > >
> > > > Thanks again for all your thoughts. Any feedback is appreciated!
> > > >
> > > > Best,
> > > > Weihua
> > > 

Re: [DISCUSS] String literal behavior in Flink

2023-03-05 Thread Aitozi
Hi Jark,

Thank you for your helpful suggestion. It appears that E'foo\n' is a more
versatile and widely accepted option. To assess its feasibility, I have
reviewed the existing Unicode escape support and concluded that it may
require modifications to the Parser.jj file to accommodate this new
syntax.


I am unsure whether we should initially incorporate this alteration in
Calcite or if we can directly supersede the StringLiteral behavior within
the Flink project. Nevertheless, I believe supporting this change is
achievable.



Thanks,
Aitozi.

On Mon, Mar 6, 2023 at 10:16, Jark Wu wrote:

> Hi Aitozi,
>
> I think this is a good idea to improve the backslash escape strings.
> However, I lean a bit more toward the Postgres approach[1],
> which is more standard-compliant. PG allows backslash escape
> string by writing the letter E (upper or lower case) just before the
> opening single quote, e.g., E'foo\n'.
>
> Recognizing backslash escapes in both regular and escape string constants
> is not backward compatible in Flink, and is also deprecated in PG.
>
> In addition, Flink also supports Unicode escape string constants by
> writing U& before the opening quote[2], which works in the same way as
> backslash escape strings.
>
> Best,
> Jark
>
> [1]:
>
> https://www.postgresql.org/docs/current/sql-syntax-lexical.html#SQL-SYNTAX-CONSTANTS
> [2]:
>
> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/queries/overview/
>
> On Sat, 4 Mar 2023 at 23:31, Aitozi  wrote:
>
> > Hi,
> >   I encountered a problem when using string literal in Flink. Currently,
> > Flink will escape the string literal during codegen, so for the query
> > below:
> >
> > SELECT 'a\nb'; it will print => a\nb
> >
> > then for the query
> >
> > SELECT SPLIT_INDEX(col, '\n', 0);
> >
> > The column cannot be split on the newline this way. To split on the newline,
> > we have to use
> >
> > SELECT SPLIT_INDEX(col, '
> > ', 0)
> >
> > or
> >
> > SELECT SPLIT_INDEX(col, CHR(10), 0)
> >
> > The above way could be more intuitive. Some other databases support these
> > "Special Character Escape Sequences"[1].
> >
> > In this way, we can directly use
> > SELECT SPLIT_INDEX(col, '\n', 0); for the query.
> >
> > I know this is not standard behavior in ANSI SQL. I'm opening this thread
> > for some opinions from the community guys.
> >
> > [1]:
> >
> >
> https://dev.mysql.com/doc/refman/8.0/en/string-literals.html#character-escape-sequences
> >
> > Thanks,
> > Aitozi
> >
>


Re: [DISCUSS] Why does the Doc say "Flink requires at least Java 11 to build"?

2023-03-05 Thread Dong Lin
Given that the latest Flink version still supports Java 8, and that Flink
has not officially announced the drop of Java 8 support, I think it is
incorrect to say that "Flink requires at least Java 11 to build".

Because dropping Java 8 support is a backward incompatible change with huge
impact, it is probably necessary for Flink to give explicit notice of when
Java 8 support will be dropped before actually making this change. Maybe we
can follow the Kafka website doc, for
example, which says "Java 8, Java 11, and Java 17 are supported. Note that
Java 8 support has been deprecated since Apache Kafka 3.0 and will be
removed in Apache Kafka 4.0".

The Flink doc has been updated to say "Flink requires Java 8 (deprecated)
or Java 11 to build" in the Flink 1.17.0 branch and the master branch. The
reasons are documented in FLINK-30501.



On Wed, Mar 1, 2023 at 9:22 PM Zhongpu Chen  wrote:

> I reported the same question at StackOverflow [1]. As I tested in my
> computer, Java 8 is enough to build Flink from source.
>
> [1] https://stackoverflow.com/questions/75601233/
>
> --
> Zhongpu Chen
>


[jira] [Created] (FLINK-31328) Greedy option on the looping pattern at the end not working

2023-03-05 Thread Juntao Hu (Jira)
Juntao Hu created FLINK-31328:
-

 Summary: Greedy option on the looping pattern at the end not 
working
 Key: FLINK-31328
 URL: https://issues.apache.org/jira/browse/FLINK-31328
 Project: Flink
  Issue Type: Bug
  Components: Library / CEP
Affects Versions: 1.16.1, 1.15.3, 1.17.0
Reporter: Juntao Hu


If the greedy option is used on a looping pattern at the end of the whole
pattern, the matching result is not "greedy".

Example1

pattern: A.oneOrMore().consecutive().greedy() (SKIP_TO_NEXT)

sequence: a1, a2, a3

result: [a1] [a2] [a3]

Example2

pattern: B.next(A).oneOrMore().consecutive().greedy() (SKIP_TO_NEXT)

sequence: b1, a1, a2, a3

result: [b1 a1]





Re: [DISCUSS] FLIP-298: Unifying the Implementation of SlotManager

2023-03-05 Thread Weihua Hu
Thanks Weijie.

Heterogeneous task managers will not be considered in this FLIP since
it does not request heterogeneous resources as you said.

My first thought is that we can change the meaning of the redundant
configuration to the number of redundant task managers per resource type.
This can be considered in detail when we decide to support heterogeneous
task managers.

Best,
Weihua


On Sat, Mar 4, 2023 at 1:13 AM weijie guo  wrote:

> Thanks Weihua for preparing this FLIP.
>
> This FLIP overall looks reasonable to me after updating as suggested by
> Matthias.
>
> I only have one small question about keeping some redundant task managers:
> In the fine-grained resource management, theoretically, it can support
> heterogeneous taskmanagers. When we complete the missing features for FGSM,
> do we plan to take this into account?
> Of course, if I remember correctly, FGSM will not request heterogeneous
> resources at present, so it is also acceptable to me if there is no special
> treatment now.
>
> +1 for these changes if we can ensure the test coverage.
>
> Best regards,
>
> Weijie
>
>
> On Thu, Mar 2, 2023 at 12:53, John Roesler wrote:
>
> > Thanks for the test plan, Weihua!
> >
> > Yes, it addresses my concerns.
> >
> > Thanks,
> > John
> >
> > On Wed, Mar 1, 2023, at 22:38, Weihua Hu wrote:
> > > Hi, everyone,
> > > Thanks for your suggestions and ideas.
> > > Thanks Xintong for sharing the detailed backgrounds of SlotManager.
> > >
> > > *@Matthias
> > >
> > > 1. Did you do a proper test coverage analysis?
> > >
> > >
> > > Just as Xintong said, we already have a CI stage for fine grained
> > resource
> > > managers.
> > > And I will make sure FineGrainedSlotManager as the default SlotManager
> > can
> > > pass all the tests of CI.
> > > In addition, I will review all unit tests of
> DeclarativeSlotManager(DSM)
> > to
> > > ensure that there are no gaps in the
> > > coverage provided by the FineGrainedSlotManager.
> > > I also added the 'Test Plan' part to the FLIP.
> > > @Matthias @John @Shammon Does this test plan address your concerns?
> > >
> > > 2.  DeclarativeSlotManager and FineGrainedSlotManager feel quite big in
> > >
> > > terms of lines of code
> > >
> > >
> > > IMO, the refactoring of SlotManager does not belong to this FLIP since
> it
> > > may lead to some unstable risks. For
> > > FineGrainedSlotManager(FGSM), we already split some reasonable
> > components.
> > > They are:
> > > * TaskManagerTracker: Track task managers and their resources.
> > > * ResourceTracker: track requirements of jobs
> > > * ResourceAllocationStrategy: Try to fulfill the resource requirements
> > with
> > > available/pending resources.
> > > * SlotStatusSyncer: communicate with TaskManager, for
> allocating/freeing
> > > slot and reconciling the slot status
> > > Maybe we can start a discussion about refactoring SlotManager in
> another
> > > FLIP if there are some good suggestions.
> > > WDYT
> > >
> > > 3. For me personally, having a more detailed summary comparing the
> > >> subcomponents of both SlotManager implementations with where
> > >> their functionality matches and where they differ might help
> understand
> > the
> > >> consequences of the changes proposed in FLIP-298
> > >
> > > Good suggestion, I have updated the comparison in this FLIP. Looking
> > > forward to any suggestions/thoughts
> > > if they are not described clearly.
> > >
> > > *@John
> > >
> > > 4. In addition to changing the default, would it make sense to log a
> > >> deprecation warning on initialization
> > >
> > > if the DeclarativeSlotManager is used?
> > >>
> > > SGTM, We should add Deprecated annotations to DSM for devs. And log a
> > > deprecation warning for users.
> > >
> > > *@Shammon
> > >
> > > 1. For their functional differences, can you give some detailed tests
> to
> > >> verify that the new FineGrainedSlotManager has these capabilities?
> This
> > can
> > >> effectively verify the new functions
> > >>
> > > As just mentioned, there is already a CI stage for FGSM, and I will do
> > > more review of the unit tests for DSM.
> > >
> > >  2. I'm worried that many functions are not independent and it is
> > difficult
> > >> to migrate step-by-step. You can list the relationship between them in
> > >> detail.
> > >
> > >  As Xintong said, the DSM is a subset of FGSM by design. But as time
> > > has gone on, FGSM has come to lack some functions, as I listed in
> > > this FLIP. I have also added the comparison between DSM and FGSM
> > > to this FLIP.
> > >
> > >
> > > Thanks again for all your thoughts. Any feedback is appreciated!
> > >
> > > Best,
> > > Weihua
> > >
> > >
> > > On Wed, Mar 1, 2023 at 2:17 PM Xintong Song 
> > wrote:
> > >
> > >> Thanks Weihua for preparing this FLIP. +1 for the proposal.
> > >>
> > >>
> > >> As one of the contributors of the fine-grained slot manager, I'd like
> to
> > >> share some backgrounds here.
> > >>
> > >> - There used to be a default slot manager implementation, which is
> > >> non-declarative and has been removed now. The 

[jira] [Created] (FLINK-31327) Added conversion method for GenericRowData

2023-03-05 Thread hunter (Jira)
hunter created FLINK-31327:
--

 Summary: Added conversion method for GenericRowData
 Key: FLINK-31327
 URL: https://issues.apache.org/jira/browse/FLINK-31327
 Project: Flink
  Issue Type: New Feature
Reporter: hunter


I think that when using GenericRowData, it is rather difficult to convert it to
a Java POJO type, so I think it is necessary to add a method to GenericRowData
that converts the GenericRowData type to a Java POJO type.





Re: [VOTE] Release 1.15.4, release candidate #1

2023-03-05 Thread Dian Fu
Hi Danny,

It impacts all jobs using new sinks in Python DataStream API and has
nothing to do with the tumbling window.

I just want to ensure that it will eventually be included in a patch release
of 1.15.x. Since the discussion thread of 1.15.4 mentioned `Given that Flink
1.17 is on the horizon, it will be good to release critical bug fixes before
1.15 goes out of support`, I had thought that this would be the last release
of 1.15.x. If there will be a 1.15.5, I'm fine with including it in 1.15.5.
Thanks for the clarification.

Regards,
Dian






On Sat, Mar 4, 2023 at 12:07 AM Danny Cranmer 
wrote:

> Hello Dian,
>
> As per the new version support strategy [1], we will perform a Flink 1.15.5
> patch if there are any resolved critical/blocker issues at the point in
> which 1.17.0 is released. The issue referenced is a Major. I am not
> inclined to cancel the release candidate at this point unless I get a -1
> veto. Is this issue impacting all Python Datastream apps with a Tumbling
> window or a subset? I am wondering why it is only a Major?
>
> Thanks,
> Danny
>
> [1] https://lists.apache.org/thread/9w99mgx3nw5tc0v26wcvlyqxrcrkpzdz
>
> On Fri, Mar 3, 2023 at 2:09 AM Dian Fu  wrote:
>
> > Hi Danny,
> >
> > I'm sorry that I'm coming to this thread a little late. It seems that
> this
> > will be the last bugfix release of Flink 1.15? If so, I'd like to also
> > include https://issues.apache.org/jira/browse/FLINK-31272 into this
> > release
> > which fixes a serious issue of PyFlink.
> >
> > Regards,
> > Dian
> >
> >
> >
> > On Thu, Mar 2, 2023 at 5:51 PM Yu Li  wrote:
> >
> > > +1 (binding)
> > >
> > >
> > > - Checked the diff between 1.15.3 and 1.15.4-rc1: *OK* (
> > >
> >
> https://github.com/apache/flink/compare/release-1.15.3...release-1.15.4-rc1
> > > )
> > >
> > >   - AWS SDKv2 version has been bumped to 2.19.14 through FLINK-30633
> and
> > > all NOTICE files updated correctly
> > >
> > > - Checked release notes: *OK*
> > >
> > > - Checked sums and signatures: *OK*
> > >
> > > - Maven clean install from source: *OK* (8u181)
> > >
> > > - Checked the jars in the staging repo: *OK*
> > >
> > > - Checked the website updates: *OK*
> > >
> > >
> > > Thanks for driving this release, Danny!
> > >
> > >
> > > Best Regards,
> > > Yu
> > >
> > >
> > > On Wed, 1 Mar 2023 at 02:01, Ahmed Hamdy  wrote:
> > >
> > > > Thanks Danny,
> > > >
> > > > +1 (non-binding)
> > > >
> > > > - Verified hashes and signatures
> > > > - Built Source archive using maven
> > > > - Web PR looks good.
> > > > - Started WordCount Example
> > > >
> > > > On Tue, 28 Feb 2023 at 16:37, Jing Ge 
> > > wrote:
> > > >
> > > > > Thanks Danny,
> > > > >
> > > > > +1 (non-binding)
> > > > >
> > > > >  - GPG signatures looks good
> > > > >  - checked dist and maven repo
> > > > >  - maven clean install from source
> > > > >  - checked version consistency in pom files
> > > > >  - went through the web release notes and found one task is still
> > open:
> > > > > FLINK-31133 [1]
> > > > >  - download artifacts
> > > > >  - started/stopped local cluster and ran WordCount job in streaming
> > and
> > > > > batch
> > > > >
> > > > > Best regards,
> > > > > Jing
> > > > >
> > > > > [1] https://issues.apache.org/jira/browse/FLINK-31133
> > > > >
> > > > > On Tue, Feb 28, 2023 at 3:12 PM Matthias Pohl
> > > > >  wrote:
> > > > >
> > > > > > Thanks Danny.
> > > > > >
> > > > > > +1 (non-binding)
> > > > > >
> > > > > > * Downloaded artifacts
> > > > > > * Built Flink from sources
> > > > > > * Verified SHA512 checksums GPG signatures
> > > > > > * Compared checkout with provided sources
> > > > > > * Verified pom file versions
> > > > > > * Went over NOTICE file/pom files changes without finding
> anything
> > > > > > suspicious
> > > > > > * Deployed standalone session cluster and ran WordCount example
> in
> > > > batch
> > > > > > and streaming: Nothing suspicious in log files found
> > > > > >
> > > > > > On Tue, Feb 28, 2023 at 9:50 AM Teoh, Hong
> > > >  > > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Thanks Danny for driving this
> > > > > > >
> > > > > > > +1 (non-binding)
> > > > > > >
> > > > > > > * Hashes and Signatures look good
> > > > > > > * All required files on dist.apache.org
> > > > > > > * Source archive builds using maven
> > > > > > > * Started packaged example WordCountSQLExample job
> > > > > > > * Web PR looks good.
> > > > > > >
> > > > > > > Cheers,
> > > > > > > Hong
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > On 24 Feb 2023, at 05:36, Weihua Hu 
> > > > wrote:
> > > > > > > >
> > > > > > > > Thanks Danny.
> > > > > > > >
> > > > > > > > +1(non-binding)
> > > > > > > >
> > > > > > > > Tested the following:
> > > > > > 

Re:Re: help: [FLINK-31321] @flinkbot run azure does not work?

2023-03-05 Thread felixzh


http://security.ubuntu.com/ubuntu/pool/main/o/openssl1.0/libssl1.0.0_1.0.2n-1ubuntu5.11_amd64.deb
The above URL exists.
http://security.ubuntu.com/ubuntu/pool/main/o/openssl1.0/libssl1.0.0_1.0.2n-1ubuntu5.10_amd64.deb
The above URL does not exist.
ubuntu5.10 -> ubuntu5.11?







At 2023-03-06 09:45:10, "yuxia" wrote:
>I also encountered the same problem. I have no idea why it happens, but hope it 
>can be fixed ASAP.
>
>Best regards,
>Yuxia
>
>- Original Message -
>From: "felixzh"
>To: "dev"
>Sent: Monday, March 6, 2023, 8:46:49 AM
>Subject: help: [FLINK-31321] @flinkbot run azure does not work?
>
>2023-03-05T05:38:27.1951220Z libapr1 is already the newest version 
>(1.6.5-1ubuntu1).
>2023-03-05T05:38:27.1951745Z libapr1 set to manually installed.
>2023-03-05T05:38:27.1952264Z 0 upgraded, 0 newly installed, 0 to remove and 13 
>not upgraded.
>2023-03-05T05:38:27.1984256Z --2023-03-05 05:38:27--  
>http://security.ubuntu.com/ubuntu/pool/main/o/openssl1.0/libssl1.0.0_1.0.2n-1ubuntu5.10_amd64.deb
>2023-03-05T05:38:27.2104330Z Resolving security.ubuntu.com 
>(security.ubuntu.com)... 91.189.91.39, 91.189.91.38, 185.125.190.39, ...
>2023-03-05T05:38:27.2904245Z Connecting to security.ubuntu.com 
>(security.ubuntu.com)|91.189.91.39|:80... connected.
>2023-03-05T05:38:27.3707348Z HTTP request sent, awaiting response... 404 Not 
>Found
>2023-03-05T05:38:27.3708310Z 2023-03-05 05:38:27 ERROR 404: Not Found.
>2023-03-05T05:38:27.3708467Z 
>2023-03-05T05:38:27.4023505Z 
>2023-03-05T05:38:27.4024204Z WARNING: apt does not have a stable CLI 
>interface. Use with caution in scripts.
>2023-03-05T05:38:27.4024423Z 
>2023-03-05T05:38:27.4566409Z Reading package lists...
>2023-03-05T05:38:27.4595509Z E: Unsupported file 
>./libssl1.0.0_1.0.2n-1ubuntu5.10_amd64.deb given on commandline
>2023-03-05T05:38:27.4659700Z ##[error]Bash exited with code '100'.
>2023-03-05T05:38:27.4677676Z ##[section]Finishing: Prepare E2E run


Re: Large schemas lead to long DataStream-to-table transformation names

2023-03-05 Thread Jark Wu
Hi Xingcan,

I think `physicalDataType.toString()` is indeed verbose in this case.
Normal table scan generates descriptions using field names instead of the
full schema.
Will that help in your case?
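
As a rough sketch of what a field-name-based description with truncation might look like (an illustration only, not Flink's actual naming code; the function name, the bracket format, and the `max_len` parameter are hypothetical):

```python
def short_transformation_name(table_name, field_names, max_len=60):
    """Build a compact operator name from top-level field names only.

    Hypothetical helper: nested types are not expanded, and names longer
    than max_len (assumed to be at least 3) are cut and end with "...".
    """
    name = f"{table_name}[{', '.join(field_names)}]"
    if len(name) <= max_len:
        return name
    return name[: max_len - 3] + "..."


if __name__ == "__main__":
    print(short_transformation_name("orders", ["id", "amount"]))
    wide = [f"field_{i}" for i in range(50)]
    print(short_transformation_name("events", wide))
```

Using only the top-level field names keeps the name readable for nested schemas, and the length cap bounds what ends up in the UI and logs.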

Best,
Jark

On Sat, 4 Mar 2023 at 06:57, Xingcan Cui  wrote:

> Hi all,
>
> We are dealing with some streams with large (nested) schemas. When using
> `tableEnv.createTemporaryView()` to register a DataStream to a table, the
> transformation always gets a large name. It's not a big problem, but quite
> annoying since the UI and logs are hard to read.
>
> Internally, `ExternalDynamicSource` (and `ExternalDynamicSink`) invokes
> `physicalDataType.toString()` to generate an operator name (which will also
> be used as the transformation name). I'm thinking of introducing a new table
> config to either truncate the name or use a limited level of logicalType to
> generate the name (works for nested schemas).
>
> What do you think?
>
> Best,
> Xingcan
>


Re: [DISCUSS] String literal behavior in Flink

2023-03-05 Thread Jark Wu
Hi Aitozi,

I think this is a good idea to improve the backslash escape strings.
However, I lean a bit more toward the Postgres approach[1],
which is more standard-compliant. PG allows backslash escape
string by writing the letter E (upper or lower case) just before the
opening single quote, e.g., E'foo\n'.

Recognizing backslash escapes in both regular and escape string constants
is not backward compatible in Flink, and is also deprecated in PG.

In addition, Flink also supports Unicode escape string constants by
writing U& before the opening quote[2], which works in the same way as
backslash escape strings.
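
To illustrate the difference between a regular literal and a PG-style escape literal, here is a minimal sketch of the decoding rule. It is not Calcite or Flink parser code; the function name and the escape table are assumptions for illustration, and octal, hex, and Unicode escapes are left out.

```python
# Single-character escapes recognised in this sketch (an assumed subset).
ESCAPES = {"n": "\n", "t": "\t", "r": "\r", "b": "\b", "f": "\f"}


def decode_literal(token):
    """Decode a quoted SQL string token such as 'a\\nb' or E'a\\nb'.

    Backslash escapes are interpreted only for E'...' / e'...' literals;
    a regular '...' literal keeps its backslashes unchanged.
    """
    is_escape = token[:2] in ("E'", "e'")
    body = token[2:-1] if is_escape else token[1:-1]
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "'" and body[i + 1 : i + 2] == "'":
            out.append("'")  # a doubled quote stands for one quote
            i += 2
        elif is_escape and ch == "\\" and i + 1 < len(body):
            nxt = body[i + 1]
            out.append(ESCAPES.get(nxt, nxt))  # unknown escapes keep the char
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


if __name__ == "__main__":
    print(repr(decode_literal(r"'a\nb'")))
    print(repr(decode_literal(r"E'a\nb'")))
```

With this rule, existing queries that use '\n' in a regular literal keep their current meaning, and only the new E'...' form yields a real newline.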

Best,
Jark

[1]:
https://www.postgresql.org/docs/current/sql-syntax-lexical.html#SQL-SYNTAX-CONSTANTS
[2]:
https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/queries/overview/

On Sat, 4 Mar 2023 at 23:31, Aitozi  wrote:

> Hi,
>   I encountered a problem when using string literal in Flink. Currently,
> Flink will escape the string literal during codegen, so for the query
> below:
>
> SELECT 'a\nb'; it will print => a\nb
>
> then for the query
>
> SELECT SPLIT_INDEX(col, '\n', 0);
>
> The column cannot be split on the newline this way. To split on the newline,
> we have to use
>
> SELECT SPLIT_INDEX(col, '
> ', 0)
>
> or
>
> SELECT SPLIT_INDEX(col, CHR(10), 0)
>
> The above way could be more intuitive. Some other databases support these
> "Special Character Escape Sequences"[1].
>
> In this way, we can directly use
> SELECT SPLIT_INDEX(col, '\n', 0); for the query.
>
> I know this is not standard behavior in ANSI SQL. I'm opening this thread
> for some opinions from the community guys.
>
> [1]:
>
> https://dev.mysql.com/doc/refman/8.0/en/string-literals.html#character-escape-sequences
>
> Thanks,
> Aitozi
>


Re: help: [FLINK-31321] @flinkbot run azure does not work?

2023-03-05 Thread yuxia
I also encountered the same problem. I have no idea why it happens, but hope it 
can be fixed ASAP.

Best regards,
Yuxia

- Original Message -
From: "felixzh"
To: "dev"
Sent: Monday, March 6, 2023, 8:46:49 AM
Subject: help: [FLINK-31321] @flinkbot run azure does not work?

2023-03-05T05:38:27.1951220Z libapr1 is already the newest version 
(1.6.5-1ubuntu1).
2023-03-05T05:38:27.1951745Z libapr1 set to manually installed.
2023-03-05T05:38:27.1952264Z 0 upgraded, 0 newly installed, 0 to remove and 13 
not upgraded.
2023-03-05T05:38:27.1984256Z --2023-03-05 05:38:27--  
http://security.ubuntu.com/ubuntu/pool/main/o/openssl1.0/libssl1.0.0_1.0.2n-1ubuntu5.10_amd64.deb
2023-03-05T05:38:27.2104330Z Resolving security.ubuntu.com 
(security.ubuntu.com)... 91.189.91.39, 91.189.91.38, 185.125.190.39, ...
2023-03-05T05:38:27.2904245Z Connecting to security.ubuntu.com 
(security.ubuntu.com)|91.189.91.39|:80... connected.
2023-03-05T05:38:27.3707348Z HTTP request sent, awaiting response... 404 Not 
Found
2023-03-05T05:38:27.3708310Z 2023-03-05 05:38:27 ERROR 404: Not Found.
2023-03-05T05:38:27.3708467Z 
2023-03-05T05:38:27.4023505Z 
2023-03-05T05:38:27.4024204Z WARNING: apt does not have a stable CLI interface. 
Use with caution in scripts.
2023-03-05T05:38:27.4024423Z 
2023-03-05T05:38:27.4566409Z Reading package lists...
2023-03-05T05:38:27.4595509Z E: Unsupported file 
./libssl1.0.0_1.0.2n-1ubuntu5.10_amd64.deb given on commandline
2023-03-05T05:38:27.4659700Z ##[error]Bash exited with code '100'.
2023-03-05T05:38:27.4677676Z ##[section]Finishing: Prepare E2E run


help: [FLINK-31321] @flinkbot run azure does not work?

2023-03-05 Thread felixzh
2023-03-05T05:38:27.1951220Z libapr1 is already the newest version (1.6.5-1ubuntu1).
2023-03-05T05:38:27.1951745Z libapr1 set to manually installed.
2023-03-05T05:38:27.1952264Z 0 upgraded, 0 newly installed, 0 to remove and 13 not upgraded.
2023-03-05T05:38:27.1984256Z --2023-03-05 05:38:27--  http://security.ubuntu.com/ubuntu/pool/main/o/openssl1.0/libssl1.0.0_1.0.2n-1ubuntu5.10_amd64.deb
2023-03-05T05:38:27.2104330Z Resolving security.ubuntu.com (security.ubuntu.com)... 91.189.91.39, 91.189.91.38, 185.125.190.39, ...
2023-03-05T05:38:27.2904245Z Connecting to security.ubuntu.com (security.ubuntu.com)|91.189.91.39|:80... connected.
2023-03-05T05:38:27.3707348Z HTTP request sent, awaiting response... 404 Not Found
2023-03-05T05:38:27.3708310Z 2023-03-05 05:38:27 ERROR 404: Not Found.
2023-03-05T05:38:27.3708467Z 
2023-03-05T05:38:27.4023505Z 
2023-03-05T05:38:27.4024204Z WARNING: apt does not have a stable CLI interface. Use with caution in scripts.
2023-03-05T05:38:27.4024423Z 
2023-03-05T05:38:27.4566409Z Reading package lists...
2023-03-05T05:38:27.4595509Z E: Unsupported file ./libssl1.0.0_1.0.2n-1ubuntu5.10_amd64.deb given on commandline
2023-03-05T05:38:27.4659700Z ##[error]Bash exited with code '100'.
2023-03-05T05:38:27.4677676Z ##[section]Finishing: Prepare E2E run

[jira] [Created] (FLINK-31326) Disabled source scaling breaks downstream scaling if source busyTimeMsPerSecond is 0

2023-03-05 Thread Mate Czagany (Jira)
Mate Czagany created FLINK-31326:


 Summary: Disabled source scaling breaks downstream scaling if 
source busyTimeMsPerSecond is 0
 Key: FLINK-31326
 URL: https://issues.apache.org/jira/browse/FLINK-31326
 Project: Flink
  Issue Type: Bug
  Components: Autoscaler, Kubernetes Operator
Affects Versions: kubernetes-operator-1.5.0
Reporter: Mate Czagany


When 'scaling.sources.enabled' is set to 'false', the 'TARGET_DATA_RATE' of the 
source vertex is calculated as '(1000 / busyTimeMsPerSecond) * 
numRecordsOutPerSecond', which on the current main branch evaluates to an 
infinite value if 'busyTimeMsPerSecond' is 0. This also affects downstream 
operators.
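
A minimal sketch of the arithmetic in the formula quoted above, assuming it is evaluated with Java 'double' values. The class and method names are hypothetical and exist only for illustration; this is not the autoscaler's actual code.

```java
final class TargetDataRateSketch {

    // Hypothetical helper that mirrors the formula quoted in the report.
    static double targetDataRate(double busyTimeMsPerSecond, double numRecordsOutPerSecond) {
        // Floating-point division by zero does not throw in Java; it yields Infinity.
        return (1000 / busyTimeMsPerSecond) * numRecordsOutPerSecond;
    }

    public static void main(String[] args) {
        // A source that is busy half of each second and emits 5 records per second.
        System.out.println(targetDataRate(500.0, 5.0)); // 10.0
        // An idle source: busy time of 0 makes the rate infinite.
        System.out.println(targetDataRate(0.0, 5.0));   // Infinity
        // Infinity multiplied by a zero output rate gives NaN.
        System.out.println(targetDataRate(0.0, 0.0));   // NaN
    }
}
```

Any value derived from an infinite rate by multiplication or addition stays infinite or becomes NaN, which would be consistent with downstream vertices being affected as well.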

I'm not that familiar with the autoscaler code, but in my opinion it is 
quite unexpected to see such behavioral changes from setting 
'scaling.sources.enabled' to false.

 

With PR #543 for FLINK-30575 
(https://github.com/apache/flink-kubernetes-operator/pull/543) scaling will 
happen even when 'busyTimeMsPerSecond' is 0, but it will result in 
unreasonably high parallelism numbers for downstream operators, because 
'TARGET_DATA_RATE' becomes very high once a 'busyTimeMsPerSecond' of 0 is 
replaced with 1e-10.
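
To give a sense of scale, here is a rough sketch of what replacing a zero busy time with 1e-10 does to the same formula. The names are hypothetical, and exactly how the PR applies the replacement is an assumption here (a simple substitution when the value is exactly 0).

```java
final class BusyTimeFloorSketch {

    // Replacement value taken from the 1e-10 mentioned in the report.
    static final double MIN_BUSY_TIME_MS = 1e-10;

    // Hypothetical helper: same formula, but a busy time of 0 is substituted (assumed behavior).
    static double targetDataRate(double busyTimeMsPerSecond, double numRecordsOutPerSecond) {
        double busy = busyTimeMsPerSecond == 0 ? MIN_BUSY_TIME_MS : busyTimeMsPerSecond;
        return (1000 / busy) * numRecordsOutPerSecond;
    }

    public static void main(String[] args) {
        double rate = targetDataRate(0.0, 5.0);
        // The result is finite now, so scaling decisions can be computed ...
        System.out.println(Double.isInfinite(rate)); // false
        // ... but it is on the order of 5e13 records per second.
        System.out.println(rate > 1e13);             // true
    }
}
```

A target rate of this magnitude is finite but far beyond what any real vertex processes, which fits the unreasonably high downstream parallelism described above.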


Metrics from the operator logs (source=e5a72f353fc1e6bbf3bd96a41384998c, 
sink=51312116a3e504bccb3874fc80d5055e)

'scaling.sources.enabled'='true':
{code:java}
jobVertexID.e5a72f353fc1e6bbf3bd96a41384998c.PARALLELISM.Current: 1.0
jobVertexID.e5a72f353fc1e6bbf3bd96a41384998c.MAX_PARALLELISM.Current: 1.0
jobVertexID.e5a72f353fc1e6bbf3bd96a41384998c.TRUE_PROCESSING_RATE.Current: NaN
jobVertexID.e5a72f353fc1e6bbf3bd96a41384998c.TRUE_PROCESSING_RATE.Average: NaN
jobVertexID.e5a72f353fc1e6bbf3bd96a41384998c.CATCH_UP_DATA_RATE.Current: 0.0
jobVertexID.e5a72f353fc1e6bbf3bd96a41384998c.SCALE_UP_RATE_THRESHOLD.Current: 5.0
jobVertexID.e5a72f353fc1e6bbf3bd96a41384998c.SCALE_DOWN_RATE_THRESHOLD.Current: 10.0
jobVertexID.e5a72f353fc1e6bbf3bd96a41384998c.OUTPUT_RATIO.Current: 2.0
jobVertexID.e5a72f353fc1e6bbf3bd96a41384998c.OUTPUT_RATIO.Average: 2.0
jobVertexID.e5a72f353fc1e6bbf3bd96a41384998c.TRUE_OUTPUT_RATE.Current: Infinity
jobVertexID.e5a72f353fc1e6bbf3bd96a41384998c.TRUE_OUTPUT_RATE.Average: NaN
jobVertexID.e5a72f353fc1e6bbf3bd96a41384998c.TARGET_DATA_RATE.Current: 3.8667
jobVertexID.e5a72f353fc1e6bbf3bd96a41384998c.TARGET_DATA_RATE.Average: 3.8833

jobVertexID.51312116a3e504bccb3874fc80d5055e.PARALLELISM.Current: 4.0
jobVertexID.51312116a3e504bccb3874fc80d5055e.MAX_PARALLELISM.Current: 12.0
jobVertexID.51312116a3e504bccb3874fc80d5055e.TRUE_PROCESSING_RATE.Current: 4.827299209321681
jobVertexID.51312116a3e504bccb3874fc80d5055e.TRUE_PROCESSING_RATE.Average: 4.848351269098938
jobVertexID.51312116a3e504bccb3874fc80d5055e.CATCH_UP_DATA_RATE.Current: 0.0
jobVertexID.51312116a3e504bccb3874fc80d5055e.SCALE_UP_RATE_THRESHOLD.Current: 10.0
jobVertexID.51312116a3e504bccb3874fc80d5055e.SCALE_DOWN_RATE_THRESHOLD.Current: 21.0
jobVertexID.51312116a3e504bccb3874fc80d5055e.TARGET_DATA_RATE.Current: 7.733
jobVertexID.51312116a3e504bccb3874fc80d5055e.TARGET_DATA_RATE.Average: 7.767
{code}

'scaling.sources.enabled'='false':
{code:java}
jobVertexID.e5a72f353fc1e6bbf3bd96a41384998c.PARALLELISM.Current: 1.0
jobVertexID.e5a72f353fc1e6bbf3bd96a41384998c.MAX_PARALLELISM.Current: 1.0
jobVertexID.e5a72f353fc1e6bbf3bd96a41384998c.TRUE_PROCESSING_RATE.Current: NaN
jobVertexID.e5a72f353fc1e6bbf3bd96a41384998c.TRUE_PROCESSING_RATE.Average: NaN
jobVertexID.e5a72f353fc1e6bbf3bd96a41384998c.CATCH_UP_DATA_RATE.Current: 0.0
jobVertexID.e5a72f353fc1e6bbf3bd96a41384998c.SCALE_UP_RATE_THRESHOLD.Current: NaN
jobVertexID.e5a72f353fc1e6bbf3bd96a41384998c.SCALE_DOWN_RATE_THRESHOLD.Current: NaN
jobVertexID.e5a72f353fc1e6bbf3bd96a41384998c.OUTPUT_RATIO.Current: 2.0
jobVertexID.e5a72f353fc1e6bbf3bd96a41384998c.OUTPUT_RATIO.Average: 2.0
jobVertexID.e5a72f353fc1e6bbf3bd96a41384998c.TRUE_OUTPUT_RATE.Current: Infinity
jobVertexID.e5a72f353fc1e6bbf3bd96a41384998c.TRUE_OUTPUT_RATE.Average: NaN
jobVertexID.e5a72f353fc1e6bbf3bd96a41384998c.TARGET_DATA_RATE.Current: Infinity
jobVertexID.e5a72f353fc1e6bbf3bd96a41384998c.TARGET_DATA_RATE.Average: NaN

jobVertexID.51312116a3e504bccb3874fc80d5055e.PARALLELISM.Current: 4.0
jobVertexID.51312116a3e504bccb3874fc80d5055e.MAX_PARALLELISM.Current: 12.0
jobVertexID.51312116a3e504bccb3874fc80d5055e.TRUE_PROCESSING_RATE.Current: 5.0
jobVertexID.51312116a3e504bccb3874fc80d5055e.TRUE_PROCESSING_RATE.Average: 4.9805556
jobVertexID.51312116a3e504bccb3874fc80d5055e.CATCH_UP_DATA_RATE.Current: 0.0
jobVertexID.51312116a3e504bccb3874fc80d5055e.SCALE_UP_RATE_THRESHOLD.Current: NaN
jobVertexID.51312116a3e504bccb3874fc80d5055e.SCALE_DOWN_RATE_THRESHOLD.Current: NaN
jobVertexID.51312116a3e504bccb3874fc80d5055e.TARGET_DATA_RATE.Current: Infinity

[jira] [Created] (FLINK-31325) Improve performance of Swing

2023-03-05 Thread Yindi Wang (Jira)
Yindi Wang created FLINK-31325:
--

 Summary:  Improve performance of Swing
 Key: FLINK-31325
 URL: https://issues.apache.org/jira/browse/FLINK-31325
 Project: Flink
  Issue Type: Improvement
  Components: Library / Machine Learning
Affects Versions: ml-2.2.0
Reporter: Yindi Wang


Optimize the _ComputingSimilarItems_ operator of 
_org.apache.flink.ml.recommendation.swing.Swing_. The optimized code can 
reduce CPU time.


