[jira] [Created] (FLINK-30015) Benchmarks are failing

2022-11-13 Thread Martijn Visser (Jira)
Martijn Visser created FLINK-30015:
--

 Summary: Benchmarks are failing
 Key: FLINK-30015
 URL: https://issues.apache.org/jira/browse/FLINK-30015
 Project: Flink
  Issue Type: Bug
  Components: Benchmarks
Reporter: Martijn Visser


{code:java}
Build interrupted 1411 of flink-master-benchmarks-regression-check (Open): 
org.jenkinsci.plugins.workflow.steps.FlowInterruptedException
{code}

Builds 1405 through 1411 have all failed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-30014) Fix the NPE from aggregate util

2022-11-13 Thread Jiang Xin (Jira)
Jiang Xin created FLINK-30014:
-

 Summary: Fix the NPE from aggregate util
 Key: FLINK-30014
 URL: https://issues.apache.org/jira/browse/FLINK-30014
 Project: Flink
  Issue Type: Bug
  Components: Library / Machine Learning
Reporter: Jiang Xin
 Fix For: ml-2.2.0


The following exception is thrown in a Flink ML CI step.

 

```
[INFO] Running org.apache.flink.ml.feature.CountVectorizerTest
Error: Tests run: 12, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 6.419 s <<< FAILURE! - in org.apache.flink.ml.feature.CountVectorizerTest
Error: testFitAndPredict Time elapsed: 0.66 s <<< ERROR!
java.lang.RuntimeException: Failed to fetch next result
 at org.apache.flink.streaming.api.operators.collect.CollectResultIterator.nextResultFromFetcher(CollectResultIterator.java:109)
 at org.apache.flink.streaming.api.operators.collect.CollectResultIterator.hasNext(CollectResultIterator.java:80)
 at org.apache.commons.collections.IteratorUtils.toList(IteratorUtils.java:848)
 at org.apache.commons.collections.IteratorUtils.toList(IteratorUtils.java:825)
 at org.apache.flink.ml.feature.CountVectorizerTest.verifyPredictionResult(CountVectorizerTest.java:120)
 at org.apache.flink.ml.feature.CountVectorizerTest.testFitAndPredict(CountVectorizerTest.java:208)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
 at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
 at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
 at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
 at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
 at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
 at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
 at org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45)
 at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
 at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
 at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
 at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
```

CI log: https://github.com/apache/flink-ml/actions/runs/3459311341/jobs/5774576369#step:4:436

Re: [VOTE] Release 1.15.3, release candidate #1

2022-11-13 Thread Fabian Paul
Hi everyone,

I am still looking for volunteers to validate the release. I'll extend
the voting period by another 48 hours; please try to set aside some time for it.

Best,
Fabian


On Thu, Nov 10, 2022 at 5:18 PM Fabian Paul  wrote:
>
> Hi everyone,
>
> Please review and vote on the release candidate #1 for the version 1.15.3,
> as follows:
> [ ] +1, Approve the release
> [ ] -1, Do not approve the release (please provide specific comments)
>
> The complete staging area is available for your review, which includes:
> - JIRA release notes [1],
> - the official Apache source release and binary convenience releases to be
>   deployed to dist.apache.org [2], which are signed with the key with
>   fingerprint 90755B0A184BD9FFD22B6BE19D4F76C84EC11E37 [3],
> - all artifacts to be deployed to the Maven Central Repository [4],
> - source code tag "release-1.15.3-rc1" [5],
> - website pull request listing the new release and adding announcement blog
>   post [6].
>
> The vote will be open for at least 72 hours. It is adopted by majority
> approval, with at least 3 PMC affirmative votes.
>
> Best, Fabian
>
> [1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12352210
> [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.15.3-rc1
> [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> [4] https://repository.apache.org/content/repositories/orgapacheflink-1548
> [5] https://github.com/apache/flink/tree/release-1.15.3-rc1
> [6] https://github.com/apache/flink-web/pull/581


[jira] [Created] (FLINK-30013) Add data type compatibility check in SchemaChange.updateColumnType

2022-11-13 Thread Shammon (Jira)
Shammon created FLINK-30013:
---

 Summary: Add data type compatibility check in 
SchemaChange.updateColumnType
 Key: FLINK-30013
 URL: https://issues.apache.org/jira/browse/FLINK-30013
 Project: Flink
  Issue Type: Sub-task
  Components: Table Store
Affects Versions: table-store-0.3.0
Reporter: Shammon


Add a LogicalTypeCasts.supportsImplicitCast check to 
SchemaChange.updateColumnType, so that an incompatible type change is rejected 
up front instead of causing data type conversion failures when reading data.
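
To illustrate the idea, here is a minimal self-contained sketch. It does not use the Flink or Table Store classes: `ColumnTypeCheck`, `SimpleType` and the widening table are hypothetical stand-ins chosen for this example, and the table is not Flink's actual implicit-cast rule set. In the real change, the check would presumably delegate to `LogicalTypeCasts.supportsImplicitCast` on Flink's `LogicalType`.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for a column type update check; not Table Store code. */
final class ColumnTypeCheck {

    /** Simplified type model, assumed for illustration only. */
    enum SimpleType { TINYINT, SMALLINT, INT, BIGINT, FLOAT, DOUBLE, STRING }

    // Assumed widening rules: source type -> target types accepted without an explicit cast.
    private static final Map<SimpleType, Set<SimpleType>> WIDENING = new EnumMap<>(SimpleType.class);

    static {
        WIDENING.put(SimpleType.TINYINT, EnumSet.of(SimpleType.SMALLINT, SimpleType.INT, SimpleType.BIGINT));
        WIDENING.put(SimpleType.SMALLINT, EnumSet.of(SimpleType.INT, SimpleType.BIGINT));
        WIDENING.put(SimpleType.INT, EnumSet.of(SimpleType.BIGINT));
        WIDENING.put(SimpleType.FLOAT, EnumSet.of(SimpleType.DOUBLE));
    }

    /** Returns true if values of the source type can be read as the target type. */
    static boolean supportsImplicitCast(SimpleType source, SimpleType target) {
        if (source == target) {
            return true;
        }
        Set<SimpleType> targets = WIDENING.get(source);
        return targets != null && targets.contains(target);
    }

    /** Validates the requested type change before it is accepted into the schema. */
    static SimpleType updateColumnType(String column, SimpleType current, SimpleType requested) {
        if (!supportsImplicitCast(current, requested)) {
            throw new IllegalArgumentException(
                    "Column " + column + " cannot be changed from " + current + " to " + requested);
        }
        return requested;
    }
}
```

With these assumed rules, changing a column from INT to BIGINT is accepted, while BIGINT to INT or STRING to INT fails at schema-change time rather than at read time.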





[jira] [Created] (FLINK-30012) A typo in official Table Store document.

2022-11-13 Thread Hang HOU (Jira)
Hang HOU created FLINK-30012:


 Summary: A typo in official Table Store document.
 Key: FLINK-30012
 URL: https://issues.apache.org/jira/browse/FLINK-30012
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Affects Versions: 1.16.0
 Environment: Flink 1.16.0
Reporter: Hang HOU


Found a typo in the Rescale Bucket document: the word "exiting".
[Rescale Bucket|https://nightlies.apache.org/flink/flink-table-store-docs-release-0.2/docs/development/rescale-bucket/#rescale-bucket]





[jira] [Created] (FLINK-30011) HiveCatalogGenericMetadataTest azure CI failed due to catalog does not exist

2022-11-13 Thread Leonard Xu (Jira)
Leonard Xu created FLINK-30011:
--

 Summary: HiveCatalogGenericMetadataTest azure CI failed due to 
catalog does not exist
 Key: FLINK-30011
 URL: https://issues.apache.org/jira/browse/FLINK-30011
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Hive
Affects Versions: 1.16.1
Reporter: Leonard Xu



{noformat}

Nov 13 01:55:18 [ERROR]   HiveCatalogHiveMetadataTest>CatalogTest.testGetPartitionStats:1212 » Catalog F...
Nov 13 01:55:18 [ERROR]   HiveCatalogHiveMetadataTest>CatalogTest.testGetPartition_PartitionNotExist:1160 » Catalog
Nov 13 01:55:18 [ERROR]   HiveCatalogHiveMetadataTest>CatalogTest.testGetPartition_PartitionSpecInvalid_invalidPartitionSpec:1124 » Catalog
Nov 13 01:55:18 [ERROR]   HiveCatalogHiveMetadataTest>CatalogTest.testGetPartition_PartitionSpecInvalid_sizeNotEqual:1139 » Catalog
Nov 13 01:55:18 [ERROR]   HiveCatalogHiveMetadataTest>CatalogTest.testGetPartition_TableNotPartitioned:1110 » Catalog
Nov 13 01:55:18 [ERROR]   HiveCatalogHiveMetadataTest>CatalogTest.testGetTableStats_TableNotExistException:1201 » Catalog
Nov 13 01:55:18 [ERROR]   HiveCatalogHiveMetadataTest>CatalogTest.testGetTable_TableNotExistException:323 » Catalog
Nov 13 01:55:18 [ERROR]   HiveCatalogHiveMetadataTest.testHiveStatistics:251 » Catalog Failed to create ...
Nov 13 01:55:18 [ERROR]   HiveCatalogHiveMetadataTest>CatalogTest.testListFunctions:749 » Catalog Failed...
Nov 13 01:55:18 [ERROR]   HiveCatalogHiveMetadataTest>CatalogTest.testListPartitionPartialSpec:1188 » Catalog
Nov 13 01:55:18 [ERROR]   HiveCatalogHiveMetadataTest>CatalogTest.testListTables:498 » Catalog Failed to...
Nov 13 01:55:18 [ERROR]   HiveCatalogHiveMetadataTest>CatalogTest.testListView:620 » Catalog Failed to c...
Nov 13 01:55:18 [ERROR]   HiveCatalogHiveMetadataTest>CatalogTest.testPartitionExists:1174 » Catalog Fai...
Nov 13 01:55:18 [ERROR]   HiveCatalogHiveMetadataTest>CatalogTest.testRenameTable_TableAlreadyExistException:483 » Catalog
Nov 13 01:55:18 [ERROR]   HiveCatalogHiveMetadataTest>CatalogTest.testRenameTable_TableNotExistException:465 » Catalog
Nov 13 01:55:18 [ERROR]   HiveCatalogHiveMetadataTest>CatalogTest.testRenameTable_TableNotExistException_ignored:477 » Catalog
Nov 13 01:55:18 [ERROR]   HiveCatalogHiveMetadataTest>CatalogTest.testRenameTable_nonPartitionedTable:451 » Catalog
Nov 13 01:55:18 [ERROR]   HiveCatalogHiveMetadataTest>CatalogTest.testRenameView:637 » Catalog Failed to...
Nov 13 01:55:18 [ERROR]   HiveCatalogHiveMetadataTest>CatalogTest.testTableExists:510 » Catalog Failed t...
Nov 13 01:55:18 [ERROR]   HiveCatalogHiveMetadataTest.testViewCompatibility:115 » Catalog Failed to crea...
Nov 13 01:55:18 [INFO] 
Nov 13 01:55:18 [ERROR] Tests run: 361, Failures: 0, Errors: 132, Skipped: 0
{noformat}


https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=43104=logs=245e1f2e-ba5b-5570-d689-25ae21e5302f=d04c9862-880c-52f5-574b-a7a79fef8e0f





[jira] [Created] (FLINK-30010) flink-quickstart-test failed due to could not resolve dependencies

2022-11-13 Thread Leonard Xu (Jira)
Leonard Xu created FLINK-30010:
--

 Summary: flink-quickstart-test failed due to could not resolve 
dependencies 
 Key: FLINK-30010
 URL: https://issues.apache.org/jira/browse/FLINK-30010
 Project: Flink
  Issue Type: Bug
  Components: Examples, Tests
Affects Versions: 1.17.0
Reporter: Leonard Xu



{noformat}
Nov 13 02:10:37 [ERROR] Failed to execute goal on project flink-quickstart-test: Could not resolve dependencies for project org.apache.flink:flink-quickstart-test:jar:1.17-SNAPSHOT: Could not find artifact org.apache.flink:flink-quickstart-scala:jar:1.17-SNAPSHOT in apache.snapshots (https://repository.apache.org/snapshots) -> [Help 1]
Nov 13 02:10:37 [ERROR] 
Nov 13 02:10:37 [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
Nov 13 02:10:37 [ERROR] Re-run Maven using the -X switch to enable full debug logging.
Nov 13 02:10:37 [ERROR] 
Nov 13 02:10:37 [ERROR] For more information about the errors and possible solutions, please read the following articles:
Nov 13 02:10:37 [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
Nov 13 02:10:37 [ERROR] 
Nov 13 02:10:37 [ERROR] After correcting the problems, you can resume the build with the command
Nov 13 02:10:37 [ERROR]   mvn  -rf :flink-quickstart-test
Nov 13 02:10:38 Process exited with EXIT CODE: 1.
Nov 13 02:10:38 Trying to KILL watchdog (293).
/__w/1/s/tools/ci/watchdog.sh: line 100:   293 Terminated  watchdog
Nov 13 02:10:38 ==
Nov 13 02:10:38 Compilation failure detected, skipping test execution.
Nov 13 02:10:38 ==
{noformat}



https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=43102=logs=298e20ef-7951-5965-0e79-ea664ddc435e=d4c90338-c843-57b0-3232-10ae74f00347=18363





Re: SQL Gateway and SQL Client

2022-11-13 Thread yu zelin
Hi Jim,

It would be nice if you could take a look at 
https://github.com/apache/flink/pull/21133 and give me some feedback.

Best,
Yu Zelin

> On Nov 12, 2022, at 00:44, Jim Hughes wrote:
> 
> Hi Shengkai,
> 
> I think there is an additional case where a proxy is between the client and
> gateway.  In that case, being able to pass headers would allow for
> additional options / features.
> 
> I see several PRs from Yu Zelin.  Is there a first one to review?
> 
> Cheers,
> 
> Jim
> 
> On Thu, Nov 10, 2022 at 9:42 PM Shengkai Fang  wrote:
> 
>> Hi, Jim.
>> 
>>> how to pass additional headers when sending REST requests
>> 
>> Could you share which headers you want to send when using the SQL Client?  I
>> think there are two cases we need to consider. Please correct me if I am
>> wrong.
>> 
>> # Case 1
>> 
>> If users want to connect to the SQL Gateway with a password, I think they
>> should type
>> ```
>> ./sql-client.sh --user xxx --password xxx
>> ```
>> in the terminal and the OpenSessionRequest should be enough.
>> 
>> # Case 2
>> 
>> If users want to modify the execution config, they should type
>> ```
>> Flink SQL> SET  `` = ``;
>> ```
>> in the terminal. The Client can send ExecuteStatementRequest to the
>> Gateway.
>> 
>>> As you have FLIPs or PRs, feel free to let me, Jamie, and Alexey know.
>> 
>> It would be nice if you could join us to finish the feature. I think the
>> modification on the SQL Gateway side is ready for review.
>> 
>> Best,
>> Shengkai
>> 
>> 
>> On Fri, Nov 11, 2022 at 05:19, Jim Hughes wrote:
>> 
>>> Hi Yu Zelin,
>>> 
>>> I have read through your draft and it looks good.  I am new to Flink, so
>> I
>>> haven't learned about everything which needs to be done yet.
>>> 
>>> One of the considerations that I'm interested in understanding is how to
>>> pass additional headers when sending REST requests.  From looking at the
>>> code, it looks like a custom `OutboundChannelHandlerFactory` could be
>>> created to read additional configuration and set headers.  Does that make
>>> sense?
>>> 
>>> Thank you very much for sharing the proof of concept code and the
>>> document.  As you have FLIPs or PRs, feel free to let me, Jamie, and
>> Alexey
>>> know.  We'll be happy to review them.
>>> 
>>> Cheers,
>>> 
>>> Jim
>>> 
>>> On Wed, Nov 9, 2022 at 11:43 PM yu zelin  wrote:
>>> 
>>>> Hi, all
>>>> Sorry for late response. As Shengkai mentioned, Currently I’m working with
>>>> him on SQL Client, dedicating to implement the Remote Mode of SQL Client. I
>>>> have written a draft of implementation plan and will write a FLIP about it
>>>> ASAP. If you are interested in, please take a look at the draft and it’s
>>>> nice if you give me some feedback.
>>>> The doc is at:
>>>> https://docs.google.com/document/d/14cS4VBSamMUnlM_PZuK6QKLfriUuQU51iqET5oiYy_c/edit?usp=sharing
>>>>
>>>>> On Nov 7, 2022, at 11:19, Shengkai Fang wrote:
>>>>>
>>>>> Hi, all. Sorry for the late reply.
>>>>>
>>>>>> Is the gateway mode planned to be supported for SQL Client in 1.17?
>>>>>> Do you have anything you can already share so we can start with your
>>>>>> work or just play around with it.
>>>>>
>>>>> Yes. @yzl is working on it and he will list the implementation plan
>>>>> later and share the progress. I think the change is not very large and I
>>>>> think it's not a big problem to finish this in the release-1.17. I will
>>>>> join to develop this in the mid of November.
>>>>>
>>>>> Best,
>>>>> Shengkai
>>>>>
>>>>> On Sat, Nov 5, 2022 at 00:48, Jamie Grier <jgr...@apache.org> wrote:
>>>>>> Hi Shengkai,
>>>>>>
>>>>>> We're doing more and more Flink development at Confluent these days and
>>>>>> we're currently trying to bootstrap a prototype that relies on the SQL
>>>>>> Client and Gateway.  We will be using the components in some of our
>>>>>> projects and would like to co-develop them with you and the rest of the
>>>>>> Flink community.
>>>>>>
>>>>>> As of right now it's a pretty big blocker for our upcoming milestone
>>>>>> that the SQL Client has not yet been modified to talk to the SQL Gateway
>>>>>> and we want to help with this effort ASAP!  We would be even willing to
>>>>>> take over the work if it's not yet started but I suspect it already is.
>>>>>>
>>>>>> Anyway, rather than start working immediately on the SQL Client and
>>>>>> adding the new Gateway mode ourselves we wanted to start a conversation
>>>>>> with you and see where you're at with things and offer to help.
>>>>>>
>>>>>> Do you have anything you can already share so we can start with your
>>>>>> work or just play around with it.  Like I said, we just want to get
>>>>>> started and are very able to help in this area.  We see both the SQL
>>>>>> Client and Gateway being very important for us and have a good team to
>>>>>> help develop it.
>>>>>>
>>>>>> Let me know if there is a branch you can share, etc.  It would be much
>>>>>> appreciated!
>>>>>>
>>>>>> -Jamie Grier
>>>>>>
>>>>>>
>>>>>> On 2022/10/28 06:06:49 Shengkai Fang 

Re: SQL Gateway and SQL Client

2022-11-13 Thread Shengkai Fang
Hi Jim,

Thanks for your input. On the server side, you can look here [1]: you could
modify the return type of handleRequest to a Tuple2 to get session-specific
headers. On the client side, RestClient already exposes a method to send
requests with headers.

[1]
https://github.com/apache/flink/blob/master/flink-table/flink-sql-gateway/src/main/java/org/apache/flink/table/gateway/rest/handler/AbstractSqlGatewayRestHandler.java#L98
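
For the proxy case Jim describes, a rough sketch of "extra headers on a gateway request" could look like the following. It deliberately uses the JDK's `java.net.http` API rather than Flink's `RestClient`, whose exact method signature is not shown in this thread. The class name `GatewayRequests`, the header name `X-Proxy-Token`, the empty JSON body, and the `/v1/sessions` path and port are assumptions made for illustration.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.Map;

/** Hypothetical helper; builds (but does not send) an open-session request. */
final class GatewayRequests {

    // Assumed REST path for opening a session on the SQL Gateway.
    private static final String OPEN_SESSION_PATH = "/v1/sessions";

    static HttpRequest openSession(URI gatewayBase, Map<String, String> extraHeaders) {
        HttpRequest.Builder builder =
                HttpRequest.newBuilder(gatewayBase.resolve(OPEN_SESSION_PATH))
                        .header("Content-Type", "application/json")
                        .POST(HttpRequest.BodyPublishers.ofString("{}"));
        // Headers a proxy between client and gateway might require, e.g. a routing or auth token.
        extraHeaders.forEach(builder::header);
        return builder.build();
    }
}
```

A caller would pass something like `Map.of("X-Proxy-Token", "...")`, read from client configuration, and the proxy could then authenticate or route the request before it reaches the gateway.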

On Sat, Nov 12, 2022 at 00:45, Jim Hughes wrote:

> Hi Shengkai,
>
> I think there is an additional case where a proxy is between the client and
> gateway.  In that case, being able to pass headers would allow for
> additional options / features.
>
> I see several PRs from Yu Zelin.  Is there a first one to review?
>
> Cheers,
>
> Jim
>
> On Thu, Nov 10, 2022 at 9:42 PM Shengkai Fang  wrote:
>
> > Hi, Jim.
> >
> > > how to pass additional headers when sending REST requests
> >
> > Could you share which headers you want to send when using the SQL Client?
> > I think there are two cases we need to consider. Please correct me if I am
> > wrong.
> >
> > # Case 1
> >
> > If users want to connect to the SQL Gateway with a password, I think they
> > should type
> > ```
> > ./sql-client.sh --user xxx --password xxx
> > ```
> > in the terminal and the OpenSessionRequest should be enough.
> >
> > # Case 2
> >
> > If users want to modify the execution config, they should type
> > ```
> > Flink SQL> SET  `` = ``;
> > ```
> > in the terminal. The Client can send ExecuteStatementRequest to the
> > Gateway.
> >
> > > As you have FLIPs or PRs, feel free to let me, Jamie, and Alexey know.
> >
> > It would be nice if you could join us to finish the feature. I think the
> > modification on the SQL Gateway side is ready for review.
> >
> > Best,
> > Shengkai
> >
> >
> > On Fri, Nov 11, 2022 at 05:19, Jim Hughes wrote:
> >
> > > Hi Yu Zelin,
> > >
> > > I have read through your draft and it looks good.  I am new to Flink,
> so
> > I
> > > haven't learned about everything which needs to be done yet.
> > >
> > > One of the considerations that I'm interested in understanding is how
> to
> > > pass additional headers when sending REST requests.  From looking at
> the
> > > code, it looks like a custom `OutboundChannelHandlerFactory` could be
> > > created to read additional configuration and set headers.  Does that
> make
> > > sense?
> > >
> > > Thank you very much for sharing the proof of concept code and the
> > > document.  As you have FLIPs or PRs, feel free to let me, Jamie, and
> > Alexey
> > > know.  We'll be happy to review them.
> > >
> > > Cheers,
> > >
> > > Jim
> > >
> > > On Wed, Nov 9, 2022 at 11:43 PM yu zelin 
> wrote:
> > >
> > > > Hi, all
> > > > Sorry for late response. As Shengkai mentioned, Currently I’m working
> > > with
> > > > him on SQL Client, dedicating to implement the Remote Mode of SQL
> > > Client. I
> > > > have written a draft of implementation plan and will write a FLIP
> about
> > > it
> > > > ASAP. If you are interested in, please take a look at the draft and
> > it’s
> > > > nice if you give me some feedback.
> > > > The doc is at:
> > > >
> > >
> >
> https://docs.google.com/document/d/14cS4VBSamMUnlM_PZuK6QKLfriUuQU51iqET5oiYy_c/edit?usp=sharing
> > > >
> > > > > > On Nov 7, 2022, at 11:19, Shengkai Fang wrote:
> > > > >
> > > > > Hi, all. Sorry for the late reply.
> > > > >
> > > > > > Is the gateway mode planned to be supported for SQL Client in
> 1.17?
> > > > > > Do you have anything you can already share so we can start with
> > your
> > > > work or just play around with it.
> > > > >
> > > > > Yes. @yzl is working on it and he will list the implementation plan
> > > > later and share the progress. I think the change is not very large
> and
> > I
> > > > think it's not a big problem to finish this in the release-1.17. I
> will
> > > > join to develop this in the mid of November.
> > > > >
> > > > > Best,
> > > > > Shengkai
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > > On Sat, Nov 5, 2022 at 00:48, Jamie Grier <jgr...@apache.org> wrote:
> > > > >> Hi Shengkai,
> > > > >>
> > > > >> We're doing more and more Flink development at Confluent these
> days
> > > and
> > > > we're currently trying to bootstrap a prototype that relies on the
> SQL
> > > > > Client and Gateway.  We will be using the components in some of
> our
> > > > projects and would like to co-develop them with you and the rest of
> the
> > > > Flink community.
> > > > >>
> > > > >> As of right now it's a pretty big blocker for our upcoming
> milestone
> > > > that the SQL Client has not yet been modified to talk to the SQL
> > Gateway
> > > > and we want to help with this effort ASAP!  We would be even willing
> to
> > > > take over the work if it's not yet started but I suspect it already
> is.
> > > > >>
> > > > >> Anyway, rather than start working immediately on the SQL Client
> and
> > > > > adding the new Gateway mode ourselves we wanted to start a
> > conversation
> > > > with you and see where you're at with 

[jira] [Created] (FLINK-30009) OperatorCoordinator.start()'s JavaDoc mismatches its behavior

2022-11-13 Thread Yunfeng Zhou (Jira)
Yunfeng Zhou created FLINK-30009:


 Summary: OperatorCoordinator.start()'s JavaDoc mismatches its 
behavior
 Key: FLINK-30009
 URL: https://issues.apache.org/jira/browse/FLINK-30009
 Project: Flink
  Issue Type: Bug
  Components: Documentation
Affects Versions: 1.16.0
Reporter: Yunfeng Zhou


The following description appears in the JavaDoc of 
{{OperatorCoordinator.start()}}.

{{This method is called once at the beginning, before any other methods.}}

This description is incorrect because the method {{resetToCheckpoint()}} can 
be invoked before {{start()}}. For example, 
{{RecreateOnResetOperatorCoordinator.DeferrableCoordinator.resetAndStart()}} 
calls the methods in this order. The JavaDoc of {{OperatorCoordinator}}'s 
methods should therefore be updated to match this behavior.
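
As an illustration of the call order described above, the following self-contained sketch shows a coordinator-like class that tolerates {{resetToCheckpoint()}} arriving before {{start()}}. It does not implement the Flink {{OperatorCoordinator}} interface: the class name {{ResetTolerantCoordinator}} and the {{startMode()}} accessor are hypothetical and exist only to make the ordering observable.

```java
/** Hypothetical stand-in; mirrors two OperatorCoordinator method names only. */
final class ResetTolerantCoordinator {

    private byte[] restoredState;              // may be set before start() is called
    private boolean started;
    private String startMode = "not started";

    /** May be invoked before start(); only records the state to restore. */
    void resetToCheckpoint(long checkpointId, byte[] checkpointData) {
        restoredState = checkpointData == null ? null : checkpointData.clone();
    }

    /** Must not assume it is the first method called on this instance. */
    void start() {
        if (started) {
            throw new IllegalStateException("start() called twice");
        }
        started = true;
        startMode = restoredState == null ? "fresh" : "restored";
    }

    String startMode() {
        return startMode;
    }
}
```

An implementation written against the current JavaDoc might initialize its state in {{start()}} and overwrite whatever {{resetToCheckpoint()}} had restored. Deferring the decision as above avoids that.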





Re: [DISCUSS] Issue tracking workflow

2022-11-13 Thread Leonard Xu


> The mailing list has been created and I've opened a PR  to update the docs
> https://github.com/apache/flink-web/pull/583

Thanks @Martijn for the nice work.
I am happy to review this documentation PR. It also includes a Chinese 
version, which is great, so I should be able to offer some tips.

Best,
Leonard



> 
> On Sun, Nov 13, 2022 at 09:40, Martijn Visser wrote:
> 
>> Agreed. I've requested a new private mailing list [1]
>> 
>> [1] https://issues.apache.org/jira/browse/INFRA-23898
>> 
>> On Sat, Nov 12, 2022 at 12:09 PM Márton Balassi 
>> wrote:
>> 
>>> Hi Martjin,
>>> 
>>> Given the situation let us set up the Jira signup mailing list following
>>> the Calcite model. This seems the most sensible to me as of now.
>>> 
>>> On Fri, Nov 11, 2022 at 7:26 PM Martijn Visser 
>>> wrote:
>>> 
>>>> Hi everyone,
>>>>
>>>> Unfortunately ASF Infra has already implemented the change and new Jira
>>>> users can't sign up.
>>>>
>>>> I think there is consensus that we shouldn't move from Jira now. My
>>>> proposal would be to setup a separate mailing list to which users can send
>>>> their request for an account, which gets sent to the PMC so they can create
>>>> accounts for them. I don't see any other short term solution.
>>>>
>>>> If agreed, let's open up a vote thread on this.
>>>>
>>>> Thanks, Martijn
>>>>
>>>>
>>>> On Thu, Nov 3, 2022 at 04:51, Xintong Song wrote:
>>>>
>>>>> Thanks all for the valuable feedback, opinions and suggestions.
>>>>>
>>>>> # Option 1.
>>>>> I know this is the first choice for pretty much everyone. Many people from
>>>>> the Flink community (including myself) have shared their opinion with
>>>>> Infra. However, based on the feedback so far, TBH I don't think things
>>>>> would turn out the way we want. I don't see what else we can do. Does
>>>>> anyone have more suggestions on this option? Or we probably have to
>>>>> scratch it out of the list.
>>>>>
>>>>> # Option 4.
>>>>> Seems there are also quite some concerns on using solely GH issues: limited
>>>>> features (thus the significant changes to the current issue/release
>>>>> management processes), migration cost, source of truth, etc. I think I'm
>>>>> also convinced that this may not be a good choice.
>>>>>
>>>>> # Option 2 & 3.
>>>>> Between the two options, I'm leaning towards option 2.
>>>>> - IMO, making it as easy as possible for users to report issues should be a
>>>>> top priority. Having to wait for a human response for creating an account
>>>>> does not meet that requirement. That makes a strong objection to option 3
>>>>> from my side.
>>>>> - Using GH issues for consumer-facing issues and reflecting the valid ones
>>>>> back to Jira (either manually by committers or by bot) sounds good to me.
>>>>> The status (open/closed) and labels should make tracking the issues easier
>>>>> compared to in mailing lists / slack, in terms of whether an issue has been
>>>>> taken care of / reflected to Jira / closed as invalid. That does not mean
>>>>> we should not reflect things from mailing lists / slack to Jira. Ideally,
>>>>> we leverage every possible channel for collecting user issues / feedback,
>>>>> while guiding / suggesting users to choose GH issues over the others.
>>>>> - For new contributors, they still need to request an account from a PMC
>>>>> member. They can even make that request on GH issues, if they do not mind
>>>>> posting the email address publicly.
>>>>> - I would not be worried very much about the privacy issue, if the Jira
>>>>> account creation is restricted to contributors. Contributors are exposing
>>>>> their email addresses publicly anyway, in dev@ mailing list and commit
>>>>> history. I'm also not strongly against creating a dedicated mailing list
>>>>> though.
>>>>>
>>>>> Best,
>>>>>
>>>>> Xintong
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Nov 2, 2022 at 9:16 PM Chesnay Schepler wrote:
>>>>>
>>>>>> Calcite just requested a separate mailing list for users to request a
>>>>>> JIRA account.
>>>>>>
>>>>>>
>>>>>> I think I'd try going a similar route. While I prefer the openness of
>>>>>> github issues, they are really limited, and while some things can be
>>>>>> replicated with labels (like fix versions / components), things like
>>>>>> release notes can't.
>>>>>> We'd also lose a central place for collecting issues, since we'd have to
>>>>>> (?) scope issues per repo.
>>>>>>
>>>>>> I wouldn't want to import everything into GH issues (it's just a flawed
>>>>>> approach in the long-term imo), but on the other hand I don't know if
>>>>>> the auto linker even works if it has to link to either jira or a GH
>>>>>> issue.
>>>>>>
>>>>>> Given that we need to change workflows in any case, I think I'd prefer
>>>>>> sticking to JIRA.
>>>>>> For reported bugs I'd wager that in most cases we can file the tickets
>>>>>> ourselves and communicate with users on slack/MLs to 

Re: [DISCUSS] FLIP-268: Rack Awareness for Kafka Sources

2022-11-13 Thread Hang Ruan
Hi Jeremy,

Thanks for the proposal.
I think we should add some descriptions about how we plan to pass this
configuration in Flink SQL.
Maybe we need a new interface which is passed to the Kafka source like the
serializer/deserializer.
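
To make the "new interface" idea concrete, here is one possible shape, purely as a sketch. `RackIdSupplier` and `RackAwareConfig` are hypothetical names that are not taken from the FLIP. The only real identifier is `client.rack`, the Kafka consumer property for the client's rack. The interface is `Serializable` for the same reason the deserializers are: it would be shipped to the task managers and evaluated there.

```java
import java.io.Serializable;
import java.util.Properties;

/** Hypothetical interface; resolves the rack ID on the machine where the reader runs. */
@FunctionalInterface
interface RackIdSupplier extends Serializable {
    String getRackId();
}

/** Hypothetical helper that applies the supplied rack ID to consumer properties. */
final class RackAwareConfig {

    static final String CLIENT_RACK = "client.rack"; // Kafka consumer config key

    static Properties withRack(Properties base, RackIdSupplier supplier) {
        Properties copy = new Properties();
        copy.putAll(base); // leave the caller's properties untouched
        String rack = supplier.getRackId();
        if (rack != null && !rack.isEmpty()) {
            copy.setProperty(CLIENT_RACK, rack);
        }
        return copy;
    }
}
```

For Flink SQL, where only string options are available, a table option would still need to be mapped to such a supplier somehow; that mapping is the open question raised above and is not addressed by this sketch.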

Best,
Hang


Re: [DISCUSS] Issue tracking workflow

2022-11-13 Thread Martijn Visser
The mailing list has been created and I've opened a PR  to update the docs
https://github.com/apache/flink-web/pull/583

On Sun, Nov 13, 2022 at 09:40, Martijn Visser wrote:

> Agreed. I've requested a new private mailing list [1]
>
> [1] https://issues.apache.org/jira/browse/INFRA-23898
>
> On Sat, Nov 12, 2022 at 12:09 PM Márton Balassi 
> wrote:
>
>> Hi Martjin,
>>
>> Given the situation let us set up the Jira signup mailing list following
>> the Calcite model. This seems the most sensible to me as of now.
>>
>> On Fri, Nov 11, 2022 at 7:26 PM Martijn Visser 
>> wrote:
>>
>> > Hi everyone,
>> >
>> > Unfortunately ASF Infra has already implemented the change and new Jira
>> > users can't sign up.
>> >
>> > I think there is consensus that we shouldn't move from Jira now. My
>> > proposal would be to setup a separate mailing list to which users can
>> send
>> > their request for an account, which gets sent to the PMC so they can
>> create
>> > accounts for them. I don't see any other short term solution.
>> >
>> > If agreed, let's open up a vote thread on this.
>> >
>> > Thanks, Martijn
>> >
>> >
>> > On Thu, Nov 3, 2022 at 04:51, Xintong Song wrote:
>> >
>> > > Thanks all for the valuable feedback, opinions and suggestions.
>> > >
>> > > # Option 1.
>> > > I know this is the first choice for pretty much everyone. Many people
>> > from
>> > > the Flink community (including myself) have shared their opinion with
>> > > Infra. However, based on the feedback so far, TBH I don't think things
>> > > would turn out the way we want. I don't see what else we can do. Does
>> > > anyone have more suggestions on this option? Or we probably have to
>> > > scratch it out of the list.
>> > >
>> > > # Option 4.
>> > > Seems there are also quite some concerns on using solely GH issues:
>> > limited
>> > > features (thus the significant changes to the current issue/release
>> > > management processes), migration cost, source of truth, etc. I think
>> I'm
>> > > also convinced that this may not be a good choice.
>> > >
>> > > # Option 2 & 3.
>> > > Between the two options, I'm leaning towards option 2.
>> > > - IMO, making it as easy as possible for users to report issues should
>> > > be a top priority. Having to wait for a human response for creating an
>> > > account does not meet that requirement. That makes a strong objection
>> > > to option 3 from my side.
>> > > - Using GH issues for consumer-facing issues and reflecting the valid
>> > > ones back to Jira (either manually by committers or by bot) sounds good
>> > > to me. The status (open/closed) and labels should make tracking the
>> > > issues easier compared to in mailing lists / slack, in terms of whether
>> > > an issue has been taken care of / reflected to Jira / closed as invalid.
>> > > That does not mean we should not reflect things from mailing lists /
>> > > slack to Jira. Ideally, we leverage every possible channel for
>> > > collecting user issues / feedback, while guiding / suggesting users to
>> > > choose GH issues over the others.
>> > > - For new contributors, they still need to request an account from a
>> > > PMC member. They can even make that request on GH issues, if they do
>> > > not mind posting the email address publicly.
>> > > - I would not be worried very much about the privacy issue, if the Jira
>> > > account creation is restricted to contributors. Contributors are
>> > > exposing their email addresses publicly anyway, in dev@ mailing list
>> > > and commit history. I'm also not strongly against creating a dedicated
>> > > mailing list though.
>> > >
>> > > Best,
>> > >
>> > > Xintong
>> > >
>> > >
>> > >
>> > > On Wed, Nov 2, 2022 at 9:16 PM Chesnay Schepler wrote:
>> > >
>> > > > Calcite just requested a separate mailing list for users to request
>> > > > a JIRA account.
>> > > >
>> > > >
>> > > > I think I'd try going a similar route. While I prefer the openness
>> > > > of github issues, they are really limited, and while some things can
>> > > > be replicated with labels (like fix versions / components), things
>> > > > like release notes can't.
>> > > > We'd also lose a central place for collecting issues, since we'd
>> > > > have to (?) scope issues per repo.
>> > > >
>> > > > I wouldn't want to import everything into GH issues (it's just a
>> > > > flawed approach in the long-term imo), but on the other hand I don't
>> > > > know if the auto linker even works if it has to link to either jira
>> > > > or a GH issue.
>> > > >
>> > > > Given that we need to change workflows in any case, I think I'd
>> > > > prefer sticking to JIRA.
>> > > > For reported bugs I'd wager that in most cases we can file the
>> > > > tickets ourselves and communicate with users on slack/MLs to gather
>> > > > all the information. I'd argue that if we'd been more proactive with
>> > > > filing tickets for user issues (instead of relying on them to do it)
>> > > > we
>> > > > 

[jira] [Created] (FLINK-30008) Add Flink 1.16.0 Support

2022-11-13 Thread Danny Cranmer (Jira)
Danny Cranmer created FLINK-30008:
-

 Summary: Add Flink 1.16.0 Support
 Key: FLINK-30008
 URL: https://issues.apache.org/jira/browse/FLINK-30008
 Project: Flink
  Issue Type: Sub-task
  Components: Connectors / AWS
Reporter: Danny Cranmer
 Fix For: aws-connector-2.0.0








Re: [DISCUSS] Allow sharing (RocksDB) memory between slots

2022-11-13 Thread Xintong Song
I like the idea of sharing RocksDB memory across slots. However, I'm quite
concerned by the current proposed approach.

The proposed changes break several good properties that we designed for
managed memory.
1. It's isolated across slots
2. It should never be wasted (unless there's nothing in the job that needs
managed memory)
In addition, it further complicates the configuration and calculation logic
of managed memory.

As an alternative, I'd suggest introducing a variant of
RocksDBStateBackend that shares memory across slots and does not use
managed memory. This basically means the shared memory is not considered
part of managed memory. Users of this new feature would need to
configure how much memory the variant state backend should use, and
probably also a larger framework-off-heap / jvm-overhead memory. The latter
might require a bit of extra user effort compared to the current approach, but
would avoid significant complexity in the managed memory configuration and
calculation logic, which affects a much broader set of users.
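
As a rough illustration of the budgeting this alternative implies, here is a
minimal Python sketch. It is not Flink code: the MemoryPlan type, the function
names and the numbers are hypothetical and only model the idea that the
managed memory split per slot stays untouched while the shared RocksDB pool
is sized separately by the user.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryPlan:
    """Hypothetical model of a TaskManager memory budget (not a Flink type)."""
    managed_per_slot_mb: int  # managed memory reserved for each slot (isolated)
    shared_rocksdb_mb: int    # user-sized pool shared by all slots, outside managed memory


def current_plan(managed_mb: int, slots: int) -> MemoryPlan:
    """Current model: managed memory is divided equally between slots
    (ignoring fine-grained resource control)."""
    return MemoryPlan(managed_per_slot_mb=managed_mb // slots, shared_rocksdb_mb=0)


def variant_backend_plan(managed_mb: int, slots: int, shared_mb: int) -> MemoryPlan:
    """Alternative: the shared pool is configured on its own, so the
    managed memory split is computed exactly as before."""
    return MemoryPlan(managed_per_slot_mb=managed_mb // slots, shared_rocksdb_mb=shared_mb)


def extra_off_heap_mb(plan: MemoryPlan) -> int:
    """Native memory the user additionally has to budget for, e.g. through a
    larger framework-off-heap or jvm-overhead setting (assumption of this sketch)."""
    return plan.shared_rocksdb_mb
```

With an assumed 1024 MB of managed memory, 4 slots and a 750 MB shared pool,
variant_backend_plan(1024, 4, 750) keeps 256 MB of managed memory per slot and
leaves 750 MB for the user to account for outside managed memory.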


Best,

Xintong



On Sat, Nov 12, 2022 at 1:21 AM Roman Khachatryan wrote:

> Hi John, Yun,
>
> Thank you for your feedback
>
> @John
>
> > It seems like operators would either choose isolation for the cluster’s
> > jobs or they would want to share the memory between jobs.
> > I’m not sure I see the motivation to reserve only part of the memory for
> > sharing and allowing jobs to choose whether they will share or be isolated.
>
> I see two related questions here:
>
> 1) Whether to allow mixed workloads within the same cluster.
> I agree that most likely all the jobs will have the same "sharing"
> requirement.
> So we can drop "state.backend.memory.share-scope" from the proposal.
>
> 2) Whether to allow different memory consumers to use shared or exclusive
> memory.
> Currently, only RocksDB is proposed to use shared memory. For python, it's
> non-trivial because it is job-specific.
> So we have to partition managed memory into shared/exclusive and therefore
> can NOT replace "taskmanager.memory.managed.shared-fraction" with some
> boolean flag.
>
> I think your question was about (1), just wanted to clarify why the
> shared-fraction is needed.
>
> @Yun
>
> > I am just curious whether this could really bring benefits to our users
> > with such complex configuration logic.
> I agree, and configuration complexity seems a common concern.
> I hope that removing "state.backend.memory.share-scope" (as proposed above)
> reduces the complexity.
> Please share any ideas of how to simplify it further.
>
> > Could you share some real experimental results?
> I did an experiment to verify that the approach is feasible,
> i.e. multiple jobs can share the same memory/block cache.
> But I guess that's not what you mean here? Do you have any experiments in
> mind?
>
> > BTW, as talked before, I am not sure whether different lifecycles of
> > RocksDB state-backends would affect the memory usage of block cache &
> > write buffer manager in RocksDB.
> > Currently, all instances would start and destroy nearly simultaneously,
> > this would change after we introduce this feature with jobs running at
> > different scheduler times.
> IIUC, the concern is that closing a RocksDB instance might close the
> BlockCache.
> I checked that manually and it seems to work as expected.
> And I think that would contradict the sharing concept, as described in the
> documentation [1].
>
> [1]
> https://github.com/facebook/rocksdb/wiki/Block-Cache
>
> Regards,
> Roman
>
>
> On Wed, Nov 9, 2022 at 3:50 AM Yanfei Lei wrote:
>
> > Hi Roman,
> > Thanks for the proposal, this allows State Backend to make better use of
> > memory.
> >
> > After reading the ticket, I'm curious about some points:
> >
> > 1. Is shared-memory only for the state backend? If both
> > "taskmanager.memory.managed.shared-fraction: >0" and
> > "state.backend.rocksdb.memory.managed: false" are set at the same time,
> > will the shared-memory be wasted?
> > 2. It's said that "Jobs 4 and 5 will use the same 750Mb of unmanaged
> > memory and will compete with each other" in the example, how is the
> > memory size of unmanaged part calculated?
> > 3. For fine-grained-resource-management, the control
> > of cpuCores, taskHeapMemory can still work, right?  And I am a little
> > worried that too many memory-related configuration options are complicated
> > for users to understand.
> >
> > Regards,
> > Yanfei
> >
> > On Tue, Nov 8, 2022 at 23:22, Roman Khachatryan wrote:
> >
> > > Hi everyone,
> > >
> > > I'd like to discuss sharing RocksDB memory across slots as proposed in
> > > FLINK-29928 [1].
> > >
> > > Since 1.10 / FLINK-7289 [2], it is possible to:
> > > - share these objects among RocksDB instances of the same slot
> > > - bound the total memory usage by all RocksDB instances of a TM
> > >
> > > However, the memory is divided between the slots equally (unless using
> > > fine-grained resource control). This is sub-optimal if some slots
> > > contain more memory

Re: [DISCUSS] Issue tracking workflow

2022-11-13 Thread Martijn Visser
Agreed. I've requested a new private mailing list [1]

[1] https://issues.apache.org/jira/browse/INFRA-23898

On Sat, Nov 12, 2022 at 12:09 PM Márton Balassi wrote:

> Hi Martijn,
>
> Given the situation let us set up the Jira signup mailing list following
> the Calcite model. This seems the most sensible to me as of now.
>
> On Fri, Nov 11, 2022 at 7:26 PM Martijn Visser wrote:
>
> > Hi everyone,
> >
> > Unfortunately ASF Infra has already implemented the change and new Jira
> > users can't sign up.
> >
> > I think there is consensus that we shouldn't move from Jira now. My
> > proposal would be to set up a separate mailing list to which users can
> > send their request for an account, which gets sent to the PMC so they can
> > create accounts for them. I don't see any other short term solution.
> >
> > If agreed, let's open up a vote thread on this.
> >
> > Thanks, Martijn
> >
> >
> > On Thu, Nov 3, 2022 at 04:51, Xintong Song wrote:
> >
> > > Thanks all for the valuable feedback, opinions and suggestions.
> > >
> > > # Option 1.
> > > I know this is the first choice for pretty much everyone. Many people
> > > from the Flink community (including myself) have shared their opinion
> > > with Infra. However, based on the feedback so far, TBH I don't think
> > > things would turn out the way we want. I don't see what else we can do.
> > > Does anyone have more suggestions on this option? Or we probably have to
> > > scratch it out of the list.
> > >
> > > # Option 4.
> > > Seems there are also quite some concerns on using solely GH issues:
> > > limited features (thus the significant changes to the current
> > > issue/release management processes), migration cost, source of truth,
> > > etc. I think I'm also convinced that this may not be a good choice.
> > >
> > > # Option 2 & 3.
> > > Between the two options, I'm leaning towards option 2.
> > > - IMO, making it as easy as possible for users to report issues should
> > > be a top priority. Having to wait for a human response for creating an
> > > account does not meet that requirement. That makes a strong objection
> > > to option 3 from my side.
> > > - Using GH issues for consumer-facing issues and reflecting the valid
> > > ones back to Jira (either manually by committers or by bot) sounds good
> > > to me. The status (open/closed) and labels should make tracking the
> > > issues easier compared to in mailing lists / slack, in terms of whether
> > > an issue has been taken care of / reflected to Jira / closed as invalid.
> > > That does not mean we should not reflect things from mailing lists /
> > > slack to Jira. Ideally, we leverage every possible channel for
> > > collecting user issues / feedback, while guiding / suggesting users to
> > > choose GH issues over the others.
> > > - For new contributors, they still need to request an account from a
> > > PMC member. They can even make that request on GH issues, if they do
> > > not mind posting the email address publicly.
> > > - I would not be worried very much about the privacy issue, if the Jira
> > > account creation is restricted to contributors. Contributors are
> > > exposing their email addresses publicly anyway, in dev@ mailing list
> > > and commit history. I'm also not strongly against creating a dedicated
> > > mailing list though.
> > >
> > > Best,
> > >
> > > Xintong
> > >
> > >
> > >
> > > On Wed, Nov 2, 2022 at 9:16 PM Chesnay Schepler wrote:
> > >
> > > > Calcite just requested a separate mailing list for users to request a
> > > > JIRA account.
> > > >
> > > >
> > > > I think I'd try going a similar route. While I prefer the openness of
> > > > github issues, they are really limited, and while some things can be
> > > > replicated with labels (like fix versions / components), things like
> > > > release notes can't.
> > > > We'd also lose a central place for collecting issues, since we'd have
> > > > to (?) scope issues per repo.
> > > >
> > > > I wouldn't want to import everything into GH issues (it's just a
> > > > flawed approach in the long-term imo), but on the other hand I don't
> > > > know if the auto linker even works if it has to link to either jira
> > > > or a GH issue.
> > > >
> > > > Given that we need to change workflows in any case, I think I'd
> > > > prefer sticking to JIRA.
> > > > For reported bugs I'd wager that in most cases we can file the
> > > > tickets ourselves and communicate with users on slack/MLs to gather
> > > > all the information. I'd argue that if we'd been more proactive with
> > > > filing tickets for user issues (instead of relying on them to do it)
> > > > we would've addressed several issues way sooner.
> > > >
> > > > Additionally, since either option would be a sort of experiment, then
> > > > JIRA is a safer option. We have to change less and there aren't any
> > > > long-term ramifications (like having to re-import GH tickets into
> > > > JIRA).
> > > >
> > > > On 28/10/2022 

[jira] [Created] (FLINK-30007) Document how users can request a Jira account / file a bug

2022-11-13 Thread Martijn Visser (Jira)
Martijn Visser created FLINK-30007:
--

 Summary: Document how users can request a Jira account / file a bug
 Key: FLINK-30007
 URL: https://issues.apache.org/jira/browse/FLINK-30007
 Project: Flink
  Issue Type: Improvement
  Components: Documentation, Project Website
Reporter: Martijn Visser
Assignee: Martijn Visser


Follow-up of https://lists.apache.org/thread/y8vx7qr32xny31qq00f1jzpnz4kw8hpg


