[jira] [Created] (FLINK-23868) JobExecutionResult printed even if suppressSysout is on
Paul Lin created FLINK-23868:

Summary: JobExecutionResult printed even if suppressSysout is on
Key: FLINK-23868
URL: https://issues.apache.org/jira/browse/FLINK-23868
Project: Flink
Issue Type: Bug
Components: Client / Job Submission
Affects Versions: 1.13.2, 1.14.0
Reporter: Paul Lin

Environments print job execution results to stdout by default and provide a flag `suppressSysout` to disable that behavior. This flag is useful when submitting jobs through the REST API or programmatically. However, JobExecutionResult is still printed when this flag is on, which looks like a bug to me.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
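The behavior the report implies can be pictured with a minimal, Flink-independent sketch (the `ResultPrinter` class and `resultLine` method below are hypothetical, purely illustrative, not Flink's actual code): the job-execution-result printout should be guarded by the same `suppressSysout` flag that silences the other status output.

```java
// Hypothetical sketch of the guard the bug report implies is missing.
// This is NOT Flink's implementation, just the expected flag semantics.
class ResultPrinter {
    // Returns what would be written to stdout for a finished job.
    static String resultLine(boolean suppressSysout, String jobResult) {
        if (suppressSysout) {
            return ""; // flag on: nothing should reach stdout
        }
        return "Job execution result: " + jobResult;
    }
}
```

The reported bug corresponds to the result line being emitted even when the flag is `true`.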
Re: Configuring flink-python in IntelliJ
Hi Dian, thank you for the explanation. I guess this is, at least as far as we know, as good as it gets for an IntelliJ setup for all of Flink. I'll put up a pull request to update the IDE setup guide, but also point out that for specific work on PyFlink, a separate setup with PyCharm is recommended. Best Ingo On Wed, Aug 18, 2021 at 4:20 PM Dian Fu wrote: > Hi Ingo, > > There are both Java/Python source codes in PyFlink and I use two IDEs at > the same time: IntelliJ IDEA & PyCharm. > > Regarding ONLY using IntelliJ IDEA for both Python & Java development, I > have tried it and it works just as you said. I have done the following: > - Install Python Plugin > - Mark the module flink-python as a Python module: right click > "flink-python" -> "Open Module Settings", change the Module SDK to Python > - Click file setup.py under flink-python, then it will promote to install a > few Python libraries, just install them > - Run the Python tests under sub-directories of flink-python/pyflink > > Regarding “a) Step (3) will be undone on every Maven import since Maven > sets the SDK for the module”, this has not occurred in my environment for > now. Are there any differences for you regarding the above steps? > > PS: For me, I will still use both IDEs for PyFlink development as it > contains both Java & Python source code under flink-python directory and it > doesn't support configuring both Python & Java SDK for it. However, if > one works > with flink-python occasionally, using only IntelliJ IDEA is a good approach > and should be enough. > > Regards, > Dian > > On Wed, Aug 18, 2021 at 7:33 PM Ingo Bürk wrote: > > > Hi Dian, > > > > thanks for responding! No, I haven't tried, but I'm aware of it. I don't > > work exclusively on PyFlink, but rather just occasionally. So instead of > > having a whole separate IDE and project for one module I'm trying to get > > the whole Flink project to work as one in IntelliJ. 
Do PyFlink developers > > generally work solely on PyFlink and nothing else / do you switch IDEs > for > > everything? > > > > > > Best > > Ingo > > > > On Wed, Aug 18, 2021 at 1:20 PM Dian Fu wrote: > > > > > Hi Ingo, > > > > > > Thanks a lot for starting up this discussion. I have not tried IntelliJ > > and > > > I’m using PyCharm for PyFlink development. Have you tried this > guideline? > > > [1] > > > It’s written in 1.9 and I think it should still work. > > > > > > Regards, > > > Dian > > > > > > [1] > > > > > > > > > https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/flinkdev/ide_setup/#pycharm > > > > > > On Wed, Aug 18, 2021 at 5:38 PM Till Rohrmann > > > wrote: > > > > > > > Thanks for starting this discussion Ingo. I guess that Dian can > > probably > > > > answer this question. I am big +1 for updating our documentation for > > how > > > to > > > > set up the IDE since this problem will probably be encountered a > couple > > > of > > > > times. > > > > > > > > Cheers, > > > > Till > > > > > > > > On Wed, Aug 18, 2021 at 11:03 AM Ingo Bürk > wrote: > > > > > > > > > Hello @dev, > > > > > > > > > > like probably most of you, I am using IntelliJ to work on Flink. > > > Lately I > > > > > needed to also work with flink-python, and thus was wondering about > > how > > > > to > > > > > properly set it up to work in IntelliJ. So far, I have done the > > > > following: > > > > > > > > > > 1. Install the Python plugin > > > > > 2. Set up a custom Python SDK using a virtualenv > > > > > 3. Configure the flink-python module to use this Python SDK (rather > > > than > > > > a > > > > > Java SDK) > > > > > 4. Install as many of the Python dependencies as possible when > > IntelliJ > > > > > prompted me to do so > > > > > > > > > > This got me into a mostly working state, for example I can run > tests > > > from > > > > > the IDE, at least the ones I tried so far. 
However, there are two > > > > concerns: > > > > > > > > > > a) Step (3) will be undone on every Maven import since Maven sets > the > > > SDK > > > > > for the module > > > > > b) Step (4) installed most, but a couple of Python dependencies > could > > > not > > > > > be installed, though so far that didn't cause noticeable problems > > > > > > > > > > I'm wondering if there are additional / different steps to do, > > > > specifically > > > > > for (a), and maybe how the PyFlink developers are configuring this > in > > > > their > > > > > IDE. The IDE setup guide didn't seem to contain information about > > that > > > > > (only about separately setting up this module in PyCharm). Ideally, > > we > > > > > could even update the IDE setup guide. > > > > > > > > > > > > > > > Best > > > > > Ingo > > > > > > > > > > > > > > >
Re: [DISCUSS] Merge FLINK-23757 after feature freeze
Sounds good to me. Thank you both! Thank you~ Xintong Song On Thu, Aug 19, 2021 at 12:34 PM Ingo Bürk wrote: > Thanks everyone, and especially Dian, > > the PR was a draft because originally the task was for all JSON methods. > I've now split it to only refer to those which are merged for 1.14 already, > and converted the PR to a normal one. Dian kindly offered to review and > merge it. > > > Best > Ingo > > On Thu, Aug 19, 2021, 04:24 Dian Fu wrote: > >> Hi Xintong, >> >> I can help review the PR. >> >> Regards, >> Dian >> >> > 2021年8月19日 上午9:48,Xintong Song 写道: >> > >> > Thanks all for the discussion. >> > >> > Quick question for @Ingo: >> > When do you think the PR will be ready (given that it's still a draft >> now), >> > and who would review it? >> > >> > Thank you~ >> > >> > Xintong Song >> > >> > >> > >> > On Wed, Aug 18, 2021 at 10:27 PM Dian Fu wrote: >> > >> >> The risk should be very limited and it should not affect other parts >> of the >> >> functionality. So I'm also in favour of merging it. >> >> >> >> Regards, >> >> Dian >> >> >> >> On Wed, Aug 18, 2021 at 8:07 PM Till Rohrmann >> >> wrote: >> >> >> >>> @Dian Fu could you assess how involved this >> >>> change is? If the change is not very involved and the risk is limited, >> >> then >> >>> I'd be in favour of merging it because feature parity of APIs is quite >> >>> important for our users. >> >>> >> >>> Cheers, >> >>> Till >> >>> >> >>> On Wed, Aug 18, 2021 at 1:46 PM Ingo Bürk wrote: >> >>> >> Hello dev, >> >> I was wondering whether we could also consider merging >> FLINK-23757[1][2] >> after the freeze. This is about exposing two built-in functions >> which we >> added to Table API & SQL prior to the freeze also for PyFlink. >> Meaning >> that >> the feature itself isn't new, we only expose it on the Python API, >> and >> >> as >> such it's also entirely isolated from the rest of PyFlink and Flink >> itself. >> As such I'm not sure this is considered a new feature, but I'd rather >> >> ask. 
>> The main motivation for this would be to retain parity on the APIs. >> Thanks! >> >> [1] https://issues.apache.org/jira/browse/FLINK-23757 >> [2] https://github.com/apache/flink/pull/16874 >> >> >> Best >> Ingo >> >> >>> >> >> >> >>
[jira] [Created] (FLINK-23867) FlinkKafkaInternalProducerITCase.testCommitTransactionAfterClosed fails with IllegalStateException
Xintong Song created FLINK-23867:

Summary: FlinkKafkaInternalProducerITCase.testCommitTransactionAfterClosed fails with IllegalStateException
Key: FLINK-23867
URL: https://issues.apache.org/jira/browse/FLINK-23867
Project: Flink
Issue Type: Bug
Components: Connectors / Kafka
Affects Versions: 1.13.2
Reporter: Xintong Song
Fix For: 1.13.3

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=22465&view=logs&j=72d4811f-9f0d-5fd0-014a-0bc26b72b642&t=c1d93a6a-ba91-515d-3196-2ee8019fbda7&l=6862

{code}
Aug 18 23:20:14 [ERROR] Tests run: 10, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 51.905 s <<< FAILURE! - in org.apache.flink.streaming.connectors.kafka.FlinkKafkaInternalProducerITCase
Aug 18 23:20:14 [ERROR] testCommitTransactionAfterClosed(org.apache.flink.streaming.connectors.kafka.FlinkKafkaInternalProducerITCase) Time elapsed: 7.848 s <<< ERROR!
Aug 18 23:20:14 java.lang.Exception: Unexpected exception, expected but was
Aug 18 23:20:14     at org.junit.internal.runners.statements.ExpectException.evaluate(ExpectException.java:28)
Aug 18 23:20:14     at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
Aug 18 23:20:14     at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
Aug 18 23:20:14     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
Aug 18 23:20:14     at java.lang.Thread.run(Thread.java:748)
Aug 18 23:20:14 Caused by: java.lang.AssertionError: The message should have been successfully sent expected null, but was:
Aug 18 23:20:14     at org.junit.Assert.fail(Assert.java:88)
Aug 18 23:20:14     at org.junit.Assert.failNotNull(Assert.java:755)
Aug 18 23:20:14     at org.junit.Assert.assertNull(Assert.java:737)
Aug 18 23:20:14     at org.apache.flink.streaming.connectors.kafka.FlinkKafkaInternalProducerITCase.getClosedProducer(FlinkKafkaInternalProducerITCase.java:228)
Aug 18 23:20:14     at org.apache.flink.streaming.connectors.kafka.FlinkKafkaInternalProducerITCase.testCommitTransactionAfterClosed(FlinkKafkaInternalProducerITCase.java:177)
Aug 18 23:20:14     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
Aug 18 23:20:14     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
Aug 18 23:20:14     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
Aug 18 23:20:14     at java.lang.reflect.Method.invoke(Method.java:498)
Aug 18 23:20:14     at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
Aug 18 23:20:14     at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
Aug 18 23:20:14     at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
Aug 18 23:20:14     at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
Aug 18 23:20:14     at org.junit.internal.runners.statements.ExpectException.evaluate(ExpectException.java:19)
{code}
Re: [DISCUSS] Merge FLINK-23757 after feature freeze
Thanks everyone, and especially Dian, the PR was a draft because originally the task was for all JSON methods. I've now split it to only refer to those which are merged for 1.14 already, and converted the PR to a normal one. Dian kindly offered to review and merge it. Best Ingo On Thu, Aug 19, 2021, 04:24 Dian Fu wrote: > Hi Xintong, > > I can help review the PR. > > Regards, > Dian > > > 2021年8月19日 上午9:48,Xintong Song 写道: > > > > Thanks all for the discussion. > > > > Quick question for @Ingo: > > When do you think the PR will be ready (given that it's still a draft > now), > > and who would review it? > > > > Thank you~ > > > > Xintong Song > > > > > > > > On Wed, Aug 18, 2021 at 10:27 PM Dian Fu wrote: > > > >> The risk should be very limited and it should not affect other parts of > the > >> functionality. So I'm also in favour of merging it. > >> > >> Regards, > >> Dian > >> > >> On Wed, Aug 18, 2021 at 8:07 PM Till Rohrmann > >> wrote: > >> > >>> @Dian Fu could you assess how involved this > >>> change is? If the change is not very involved and the risk is limited, > >> then > >>> I'd be in favour of merging it because feature parity of APIs is quite > >>> important for our users. > >>> > >>> Cheers, > >>> Till > >>> > >>> On Wed, Aug 18, 2021 at 1:46 PM Ingo Bürk wrote: > >>> > Hello dev, > > I was wondering whether we could also consider merging > FLINK-23757[1][2] > after the freeze. This is about exposing two built-in functions which > we > added to Table API & SQL prior to the freeze also for PyFlink. Meaning > that > the feature itself isn't new, we only expose it on the Python API, and > >> as > such it's also entirely isolated from the rest of PyFlink and Flink > itself. > As such I'm not sure this is considered a new feature, but I'd rather > >> ask. > The main motivation for this would be to retain parity on the APIs. > Thanks! 
> > [1] https://issues.apache.org/jira/browse/FLINK-23757 > [2] https://github.com/apache/flink/pull/16874 > > > Best > Ingo > > >>> > >> > >
[jira] [Created] (FLINK-23866) KafkaTransactionLogITCase.testGetTransactionsToAbort fails with IllegalStateException
Xintong Song created FLINK-23866:

Summary: KafkaTransactionLogITCase.testGetTransactionsToAbort fails with IllegalStateException
Key: FLINK-23866
URL: https://issues.apache.org/jira/browse/FLINK-23866
Project: Flink
Issue Type: Bug
Components: Connectors / Kafka
Affects Versions: 1.14.0
Reporter: Xintong Song
Fix For: 1.14.0

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=22463&view=logs&j=c5f0071e-1851-543e-9a45-9ac140befc32&t=15a22db7-8faa-5b34-3920-d33c9f0ca23c&l=7098

{code}
Aug 18 23:14:24 [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 67.32 s <<< FAILURE! - in org.apache.flink.connector.kafka.sink.KafkaTransactionLogITCase
Aug 18 23:14:24 [ERROR] testGetTransactionsToAbort Time elapsed: 22.35 s <<< ERROR!
Aug 18 23:14:24 java.lang.IllegalStateException: You can only check the position for partitions assigned to this consumer.
Aug 18 23:14:24     at org.apache.kafka.clients.consumer.KafkaConsumer.position(KafkaConsumer.java:1717)
Aug 18 23:14:24     at org.apache.kafka.clients.consumer.KafkaConsumer.position(KafkaConsumer.java:1684)
Aug 18 23:14:24     at org.apache.flink.connector.kafka.sink.KafkaTransactionLog.hasReadAllRecords(KafkaTransactionLog.java:144)
Aug 18 23:14:24     at org.apache.flink.connector.kafka.sink.KafkaTransactionLog.getTransactionsToAbort(KafkaTransactionLog.java:133)
Aug 18 23:14:24     at org.apache.flink.connector.kafka.sink.KafkaTransactionLogITCase.testGetTransactionsToAbort(KafkaTransactionLogITCase.java:110)
Aug 18 23:14:24     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
Aug 18 23:14:24     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
Aug 18 23:14:24     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
Aug 18 23:14:24     at java.lang.reflect.Method.invoke(Method.java:498)
Aug 18 23:14:24     at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
Aug 18 23:14:24     at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
Aug 18 23:14:24     at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
Aug 18 23:14:24     at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
Aug 18 23:14:24     at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
Aug 18 23:14:24     at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
Aug 18 23:14:24     at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
Aug 18 23:14:24     at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
Aug 18 23:14:24     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
Aug 18 23:14:24     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
Aug 18 23:14:24     at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
Aug 18 23:14:24     at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
Aug 18 23:14:24     at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
Aug 18 23:14:24     at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
Aug 18 23:14:24     at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
Aug 18 23:14:24     at org.testcontainers.containers.FailureDetectingExternalResource$1.evaluate(FailureDetectingExternalResource.java:30)
Aug 18 23:14:24     at org.junit.rules.RunRules.evaluate(RunRules.java:20)
Aug 18 23:14:24     at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
Aug 18 23:14:24     at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
Aug 18 23:14:24     at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
Aug 18 23:14:24     at org.junit.runner.JUnitCore.run(JUnitCore.java:115)
Aug 18 23:14:24     at org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:43)
Aug 18 23:14:24     at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
Aug 18 23:14:24     at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
Aug 18 23:14:24     at java.util.Iterator.forEachRemaining(Iterator.java:116)
Aug 18 23:14:24     at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
Aug 18 23:14:24     at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
Aug 18 23:14:24     at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
Aug 18 23:14:24     at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
Aug 18 23:14:24
[jira] [Created] (FLINK-23865) Class cast error caused by nested Pojo in legacy outputConversion
zoucao created FLINK-23865:

Summary: Class cast error caused by nested Pojo in legacy outputConversion
Key: FLINK-23865
URL: https://issues.apache.org/jira/browse/FLINK-23865
Project: Flink
Issue Type: Bug
Components: Table SQL / Planner
Affects Versions: 1.13.2
Reporter: zoucao

code:
{code:java}
Table table = tbEnv.fromValues(
    DataTypes.ROW(
        DataTypes.FIELD("innerPojo", DataTypes.ROW(DataTypes.FIELD("c", STRING()))),
        DataTypes.FIELD("b", STRING()),
        DataTypes.FIELD("a", INT())),
    Row.of(Row.of("str-c"), "str-b", 1));
DataStream<Pojo> pojoDataStream = tbEnv.toAppendStream(table, Pojo.class);

public static class Pojo {
    public InnerPojo innerPojo;
    public String b;
    public int a;

    public Pojo() {}
}

public static class InnerPojo {
    public String c;

    public InnerPojo() {}
}
{code}
error:
{code:java}
java.lang.ClassCastException: org.apache.flink.table.types.logical.IntType cannot be cast to org.apache.flink.table.types.logical.RowType
    at org.apache.flink.table.planner.sinks.TableSinkUtils$$anonfun$1.apply(TableSinkUtils.scala:163)
    at org.apache.flink.table.planner.sinks.TableSinkUtils$$anonfun$1.apply(TableSinkUtils.scala:155)
{code}
The fields of PojoTypeInfo are in alphabetical order, so in `expandPojoTypeToSchema` the fields of 'pojoType' and 'queryLogicalType' each have their own index. Currently we use the POJO field index to look up 'queryLogicalType', which causes the field types to mismatch. It should be fixed like:
{code:java}
val queryIndex = queryLogicalType.getFieldIndex(name)
val nestedLogicalType = queryLogicalType.getFields()(queryIndex).getType.asInstanceOf[RowType]{code}
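The alphabetical-ordering pitfall described above can be made concrete with a small, self-contained sketch (hypothetical field lists, no Flink dependency): resolving a query-schema type by the POJO's positional index pairs `innerPojo` with `INT`, while resolving by field name, as the suggested fix does, finds the correct `ROW` type.

```java
// Hypothetical illustration of the FLINK-23865 index mismatch (no Flink
// dependency): the query schema keeps declaration order, but PojoTypeInfo
// exposes fields sorted alphabetically, so positional lookups pair fields
// with the wrong types.
class FieldIndexMismatch {
    // Query schema in declaration order: innerPojo ROW, b STRING, a INT.
    static final String[] QUERY_NAMES = {"innerPojo", "b", "a"};
    static final String[] QUERY_TYPES = {"ROW", "STRING", "INT"};
    // POJO fields as PojoTypeInfo exposes them: alphabetical order.
    static final String[] POJO_NAMES = {"a", "b", "innerPojo"};

    static int indexOf(String[] names, String name) {
        for (int i = 0; i < names.length; i++) {
            if (names[i].equals(name)) {
                return i;
            }
        }
        return -1;
    }

    // Buggy variant: reuse the POJO field index inside the query schema.
    static String typeByPojoIndex(String fieldName) {
        return QUERY_TYPES[indexOf(POJO_NAMES, fieldName)];
    }

    // Fixed variant (what the suggested patch does): look the field up by name.
    static String typeByName(String fieldName) {
        return QUERY_TYPES[indexOf(QUERY_NAMES, fieldName)];
    }
}
```

The buggy variant resolves `innerPojo` to `INT`, which mirrors the reported `IntType cannot be cast to RowType` failure.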
Re: [DISCUSS] Merge FLINK-23757 after feature freeze
Hi Xintong, I can help review the PR. Regards, Dian > 2021年8月19日 上午9:48,Xintong Song 写道: > > Thanks all for the discussion. > > Quick question for @Ingo: > When do you think the PR will be ready (given that it's still a draft now), > and who would review it? > > Thank you~ > > Xintong Song > > > > On Wed, Aug 18, 2021 at 10:27 PM Dian Fu wrote: > >> The risk should be very limited and it should not affect other parts of the >> functionality. So I'm also in favour of merging it. >> >> Regards, >> Dian >> >> On Wed, Aug 18, 2021 at 8:07 PM Till Rohrmann >> wrote: >> >>> @Dian Fu could you assess how involved this >>> change is? If the change is not very involved and the risk is limited, >> then >>> I'd be in favour of merging it because feature parity of APIs is quite >>> important for our users. >>> >>> Cheers, >>> Till >>> >>> On Wed, Aug 18, 2021 at 1:46 PM Ingo Bürk wrote: >>> Hello dev, I was wondering whether we could also consider merging FLINK-23757[1][2] after the freeze. This is about exposing two built-in functions which we added to Table API & SQL prior to the freeze also for PyFlink. Meaning that the feature itself isn't new, we only expose it on the Python API, and >> as such it's also entirely isolated from the rest of PyFlink and Flink itself. As such I'm not sure this is considered a new feature, but I'd rather >> ask. The main motivation for this would be to retain parity on the APIs. Thanks! [1] https://issues.apache.org/jira/browse/FLINK-23757 [2] https://github.com/apache/flink/pull/16874 Best Ingo >>> >>
Re: [DISCUSS] Merge Kafka-related PRs after feature freeze
Thanks for starting this discussion, Arvid.

I'd be fine with merging them, as long as there are no other objections and the PRs are ready for merging on Friday. If this takes more time than that, I'd rather consider it a distraction from the release stabilization, and it should be moved to the next release.

Thank you~
Xintong Song

On Wed, Aug 18, 2021 at 8:59 PM Arvid Heise wrote: > Dear devs, > > we would like to merge these PRs after features freeze: > FLINK-23838: Add FLIP-33 metrics to new KafkaSink [1] > FLINK-23801: Add FLIP-33 metrics to KafkaSource [2] > FLINK-23640: Create a KafkaRecordSerializationSchemas builder [3] > > All of the 3 PRs are smaller quality of life improvements that are purely > implemented in flink-connector-kafka, so the risk in merging them is > minimal in terms of production stability. They also reuse existing test > infrastructure, so I expect little impact on the test stability. > > We are still polishing the PRs and would be ready to merge them on Friday > when the objection period would be over. > > Happy to hear your thoughts, > > Arvid > > [1] https://github.com/apache/flink/pull/16875 > [2] https://github.com/apache/flink/pull/16838 > [3] https://github.com/apache/flink/pull/16783 >
Re: [DISCUSS] Merge FLINK-23757 after feature freeze
Thanks all for the discussion. Quick question for @Ingo: When do you think the PR will be ready (given that it's still a draft now), and who would review it? Thank you~ Xintong Song On Wed, Aug 18, 2021 at 10:27 PM Dian Fu wrote: > The risk should be very limited and it should not affect other parts of the > functionality. So I'm also in favour of merging it. > > Regards, > Dian > > On Wed, Aug 18, 2021 at 8:07 PM Till Rohrmann > wrote: > > > @Dian Fu could you assess how involved this > > change is? If the change is not very involved and the risk is limited, > then > > I'd be in favour of merging it because feature parity of APIs is quite > > important for our users. > > > > Cheers, > > Till > > > > On Wed, Aug 18, 2021 at 1:46 PM Ingo Bürk wrote: > > > >> Hello dev, > >> > >> I was wondering whether we could also consider merging FLINK-23757[1][2] > >> after the freeze. This is about exposing two built-in functions which we > >> added to Table API & SQL prior to the freeze also for PyFlink. Meaning > >> that > >> the feature itself isn't new, we only expose it on the Python API, and > as > >> such it's also entirely isolated from the rest of PyFlink and Flink > >> itself. > >> As such I'm not sure this is considered a new feature, but I'd rather > ask. > >> The main motivation for this would be to retain parity on the APIs. > >> Thanks! > >> > >> [1] https://issues.apache.org/jira/browse/FLINK-23757 > >> [2] https://github.com/apache/flink/pull/16874 > >> > >> > >> Best > >> Ingo > >> > > >
[jira] [Created] (FLINK-23864) The document for Pulsar Source
Yufan Sheng created FLINK-23864:

Summary: The document for Pulsar Source
Key: FLINK-23864
URL: https://issues.apache.org/jira/browse/FLINK-23864
Project: Flink
Issue Type: Sub-task
Affects Versions: 1.14.0
Reporter: Yufan Sheng
Fix For: 1.14.0

The new Pulsar source has been merged into the Flink master branch. We need detailed documentation on how to use it in Flink.
Re: [DISCUSS] Merge FLINK-23757 after feature freeze
The risk should be very limited and it should not affect other parts of the functionality. So I'm also in favour of merging it. Regards, Dian On Wed, Aug 18, 2021 at 8:07 PM Till Rohrmann wrote: > @Dian Fu could you assess how involved this > change is? If the change is not very involved and the risk is limited, then > I'd be in favour of merging it because feature parity of APIs is quite > important for our users. > > Cheers, > Till > > On Wed, Aug 18, 2021 at 1:46 PM Ingo Bürk wrote: > >> Hello dev, >> >> I was wondering whether we could also consider merging FLINK-23757[1][2] >> after the freeze. This is about exposing two built-in functions which we >> added to Table API & SQL prior to the freeze also for PyFlink. Meaning >> that >> the feature itself isn't new, we only expose it on the Python API, and as >> such it's also entirely isolated from the rest of PyFlink and Flink >> itself. >> As such I'm not sure this is considered a new feature, but I'd rather ask. >> The main motivation for this would be to retain parity on the APIs. >> Thanks! >> >> [1] https://issues.apache.org/jira/browse/FLINK-23757 >> [2] https://github.com/apache/flink/pull/16874 >> >> >> Best >> Ingo >> >
Re: Common type required when creating TableSchema
Hi Dominik, can you maybe share how you are trying to create the Schema and what you are doing with it afterwards? Best Ingo On Wed, Aug 18, 2021 at 4:08 PM Dominik Wosiński wrote: > Hey, > I've stumbled across a weird behavior and was wondering whether this is > intentional for some reason or the result of a weird bug. So, basically > currently if we want to create *org.apache.flink.table.api.Schema *taht has > one of the types defined as *RAW (*AVRO enum in my case) it's probably not > possible ATM. I've created arrays with names and types, but it's impossible > to create the *Schema* due to the following error: > > *Could not find a common type for arguments: [BIGINT, BIGINT, > RAW('org.test.TestEnum', '...'), INT, STRING, BIGINT]* > > Is that intentional ? If so, is there a way to create a *Schema* that has > *RAW* types ? > > > Thanks, > Cheers, > Dom. >
Re: Configuring flink-python in IntelliJ
Hi Ingo,

There is both Java and Python source code in PyFlink, and I use two IDEs at the same time: IntelliJ IDEA & PyCharm.

Regarding ONLY using IntelliJ IDEA for both Python & Java development, I have tried it and it works just as you said. I have done the following:
- Install the Python plugin
- Mark the module flink-python as a Python module: right click "flink-python" -> "Open Module Settings", change the Module SDK to Python
- Click the file setup.py under flink-python; it will then prompt you to install a few Python libraries, just install them
- Run the Python tests under the sub-directories of flink-python/pyflink

Regarding “a) Step (3) will be undone on every Maven import since Maven sets the SDK for the module”, this has not occurred in my environment so far. Are there any differences for you regarding the above steps?

PS: For me, I will still use both IDEs for PyFlink development, as it contains both Java & Python source code under the flink-python directory and IntelliJ doesn't support configuring both a Python and a Java SDK for it. However, if one works with flink-python occasionally, using only IntelliJ IDEA is a good approach and should be enough.

Regards,
Dian

On Wed, Aug 18, 2021 at 7:33 PM Ingo Bürk wrote: > Hi Dian, > > thanks for responding! No, I haven't tried, but I'm aware of it. I don't > work exclusively on PyFlink, but rather just occasionally. So instead of > having a whole separate IDE and project for one module I'm trying to get > the whole Flink project to work as one in IntelliJ. Do PyFlink developers > generally work solely on PyFlink and nothing else / do you switch IDEs for > everything? > > > Best > Ingo > > On Wed, Aug 18, 2021 at 1:20 PM Dian Fu wrote: > > > Hi Ingo, > > > > Thanks a lot for starting up this discussion. I have not tried IntelliJ > and > > I’m using PyCharm for PyFlink development. Have you tried this guideline? > > [1] > > It’s written in 1.9 and I think it should still work. 
> > > > Regards, > > Dian > > > > [1] > > > > > https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/flinkdev/ide_setup/#pycharm > > > > On Wed, Aug 18, 2021 at 5:38 PM Till Rohrmann > > wrote: > > > > > Thanks for starting this discussion Ingo. I guess that Dian can > probably > > > answer this question. I am big +1 for updating our documentation for > how > > to > > > set up the IDE since this problem will probably be encountered a couple > > of > > > times. > > > > > > Cheers, > > > Till > > > > > > On Wed, Aug 18, 2021 at 11:03 AM Ingo Bürk wrote: > > > > > > > Hello @dev, > > > > > > > > like probably most of you, I am using IntelliJ to work on Flink. > > Lately I > > > > needed to also work with flink-python, and thus was wondering about > how > > > to > > > > properly set it up to work in IntelliJ. So far, I have done the > > > following: > > > > > > > > 1. Install the Python plugin > > > > 2. Set up a custom Python SDK using a virtualenv > > > > 3. Configure the flink-python module to use this Python SDK (rather > > than > > > a > > > > Java SDK) > > > > 4. Install as many of the Python dependencies as possible when > IntelliJ > > > > prompted me to do so > > > > > > > > This got me into a mostly working state, for example I can run tests > > from > > > > the IDE, at least the ones I tried so far. However, there are two > > > concerns: > > > > > > > > a) Step (3) will be undone on every Maven import since Maven sets the > > SDK > > > > for the module > > > > b) Step (4) installed most, but a couple of Python dependencies could > > not > > > > be installed, though so far that didn't cause noticeable problems > > > > > > > > I'm wondering if there are additional / different steps to do, > > > specifically > > > > for (a), and maybe how the PyFlink developers are configuring this in > > > their > > > > IDE. The IDE setup guide didn't seem to contain information about > that > > > > (only about separately setting up this module in PyCharm). 
Ideally, > we > > > > could even update the IDE setup guide. > > > > > > > > > > > > Best > > > > Ingo > > > > > > > > > >
Common type required when creating TableSchema
Hey,

I've stumbled across a weird behavior and was wondering whether this is intentional for some reason or the result of a bug. So, basically, currently if we want to create an *org.apache.flink.table.api.Schema* that has one of the types defined as *RAW* (an AVRO enum in my case), it's probably not possible ATM. I've created arrays with names and types, but it's impossible to create the *Schema* due to the following error:

*Could not find a common type for arguments: [BIGINT, BIGINT, RAW('org.test.TestEnum', '...'), INT, STRING, BIGINT]*

Is that intentional? If so, is there a way to create a *Schema* that has *RAW* types?

Thanks,
Cheers,
Dom.
Re: Incompatible RAW types in Table API
FYI. I've managed to fix that by switching to using `toDataStream`. It seems to be working fine now. I have created the issue about the UDF though, since it seems to be a different issue. Not sure if an issue should be created for `toAppendStream` if this is meant to be deprecated. On Mon, Aug 9, 2021 at 19:33 Timo Walther wrote: > Sorry, I meant "will be deprecated in Flink 1.14" > > On 09.08.21 19:32, Timo Walther wrote: > > Hi Dominik, > > > > `toAppendStream` is soft deprecated in Flink 1.13 and will be deprecated > > in Flink 1.13. It uses the old type system and might not match perfectly > > with the other reworked type system in new functions and sources. > > > > For SQL, a lot of Avro classes need to be treated as RAW types. But we > > might address this issue soon and further improve Avro support. > > > > I would suggest to continue this discussion in a JIRA issue. Can you > > also share the code for `NewEvent` and the Avro schema or generated Avro > > class for `Event` to have a fully reproducible example? > > > > What can help is to explicitly define the types: > > > > E.g. you can also use `DataTypes.of(TypeInformation)` both in > > `ScalarFunction.getTypeInference` and > > `StreamTableEnvironment.toDataStream()`. > > > > I hope this helps. > > > > Timo > > > > On 09.08.21 16:27, Dominik Wosiński wrote: > >> It should be `id` instead of `licence` in the error, I've copy-pasted it > >> incorrectly :< > >> > >> I've also tried an additional thing, i.e. 
creating the ScalarFunction that > >> does mapping of one Avro-generated enum to another Avro-generated > >> enum: > >> > >> @FunctionHint( > >>input = Array( > >> new DataTypeHint(value = "RAW", bridgedTo = classOf[OneEnum]) > >>), > >>output = new DataTypeHint(value = "RAW", bridgedTo = > >> classOf[OtherEnum]) > >> ) > >> class EnumMappingFunction extends ScalarFunction { > >> > >>def eval(event: OneEnum): OtherEnum = {OtherEnum.DEFAULT_VALUE} > >> } > >> > >> This results in the following error: > >> > >> *Invalid argument type at position 0. Data type RAW('org.test.OneEnum', > >> '...') expected but RAW('org.test.OneEnum', '...') passed.* > >> > >> pon., 9 sie 2021 o 15:13 Dominik Wosiński > napisał(a): > >> > >>> Hey all, > >>> > >>> I think I've hit some weird issue in Flink TypeInformation generation. > I > >>> have the following code: > >>> > >>> val stream: DataStream[Event] = ... > >>> tableEnv.createTemporaryView("TableName",stream) > >>> val table = tableEnv > >>> .sqlQuery("SELECT id, timestamp, eventType from TableName") > >>> tableEnvironment.toAppendStream[NewEvent](table) > >>> > >>> In this particular example *Event* is an Avro-generated class and > >>> *NewEvent* > >>> is just a POJO. This is just a toy example, so please ignore the fact > that > >>> this operation doesn't make much sense.
> >>> > >>> When I try to run the code I am getting the following error: > >>> > >>> *org.apache.flink.table.api.ValidationException: Column types of query > >>> result and sink for unregistered table do not match. Cause: Incompatible > >>> types for sink column 'licence' at position 0. Query schema: [id: > >>> RAW('org.apache.avro.util.Utf8', '...'), timestamp: BIGINT NOT NULL, > >>> kind: RAW('org.test.EventType', '...')]* > >>> *Sink schema: [id: RAW('org.apache.avro.util.Utf8', '?'), timestamp: > >>> BIGINT, kind: RAW('org.test.EventType', '?')]* > >>> > >>> So, it seems that the type is recognized correctly, but for some reason > >>> there is still a mismatch according to Flink, maybe because of a > >>> different type serializer being used? > >>> > >>> Thanks in advance for any help, > >>> Best Regards, > >>> Dom.
[jira] [Created] (FLINK-23863) AVRO Raw types do not match for UDFs
Dominik Wosiński created FLINK-23863: Summary: AVRO Raw types do not match for UDFs Key: FLINK-23863 URL: https://issues.apache.org/jira/browse/FLINK-23863 Project: Flink Issue Type: Improvement Components: API / DataStream, Table SQL / API Affects Versions: 1.13.2 Reporter: Dominik Wosiński It seems that for some reason when using a UDF, we can't properly use RAW types, at least for AVRO-generated classes. The following code:
```
@FunctionHint(
  input = Array(
    new DataTypeHint(value = "RAW", bridgedTo = classOf[TestEnum])
  ),
  output = new DataTypeHint(value = "RAW", bridgedTo = classOf[SecondEnum])
)
class EnumMappingFunction extends ScalarFunction {
  def eval(event: TestEnum): SecondEnum = {SecondEnum.ANOTHER_TEST_VALUE}
}
```
will result in this rather unhelpful error:
```
Caused by: org.apache.flink.table.api.ValidationException: Invalid argument type at position 0. Data type RAW('org.test.TestEnum', '...') expected but RAW('org.test.TestEnum', '...') passed.
```
Perhaps it is due to a different type serializer being generated by Flink SQL and the `DataTypeHint`? I've created the following minimal reproducible example to make it easier to identify the error: https://github.com/Wosin/FlinkRepro. -- This message was sent by Atlassian Jira (v8.3.4#803005)
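The "different type serializer" hypothesis explains why the error message looks self-contradictory: a RAW type's identity includes its serializer, which the message elides as '...'. The following toy model is an illustration only, not Flink's internals; the serializer names ("kryo-generic", "avro-specific") are made up for the sketch.

```java
import java.util.Objects;

// Illustration (not Flink internals): a RAW type is identified by its class
// AND the serializer attached to it. Two RAW types bridging the same enum
// class still mismatch if their serializers differ, while the error message
// prints only the class and elides the serializer as '...'.
public class RawTypeMismatchDemo {

    public static final class RawType {
        public final String className;
        public final String serializerId; // stands in for the serializer snapshot

        public RawType(String className, String serializerId) {
            this.className = className;
            this.serializerId = serializerId;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof RawType)) {
                return false;
            }
            RawType other = (RawType) o;
            return className.equals(other.className)
                    && serializerId.equals(other.serializerId);
        }

        @Override
        public int hashCode() {
            return Objects.hash(className, serializerId);
        }

        @Override
        public String toString() {
            // Like the Flink error message: the serializer part is elided.
            return "RAW('" + className + "', '...')";
        }
    }

    public static void main(String[] args) {
        // Hypothetically, the UDF's DataTypeHint derives one serializer...
        RawType expected = new RawType("org.test.TestEnum", "kryo-generic");
        // ...while the table source derived another for the same class.
        RawType passed = new RawType("org.test.TestEnum", "avro-specific");

        // Both print identically, yet they are not equal.
        System.out.println(expected + " vs " + passed + " equal? " + expected.equals(passed));
    }
}
```

Under this model, "RAW('org.test.TestEnum', '...') expected but RAW('org.test.TestEnum', '...') passed" is exactly what one would see: identical printouts, unequal types.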
[jira] [Created] (FLINK-23862) Race condition while cancelling task during initialization
Roman Khachatryan created FLINK-23862: - Summary: Race condition while cancelling task during initialization Key: FLINK-23862 URL: https://issues.apache.org/jira/browse/FLINK-23862 Project: Flink Issue Type: Bug Components: Runtime / Task Affects Versions: 1.14.0 Reporter: Roman Khachatryan Fix For: 1.14.0 While debugging the recent failures in FLINK-22889, I see that sometimes the operator chain is not closed if the task is cancelled while it is being initialized. The reason is that on restore(), cleanUpInvoke() is only called if there was an exception, including CancelTaskException. The latter is only thrown if StreamTask.canceled is set, i.e. TaskCanceler has called StreamTask.cancel(). So if the StreamTask is cancelled between restore and normal invoke, it may neither close the operator chain nor do other cleanup. One solution is to make StreamTask.cleanup visible to and called from Task. cc: [~akalashnikov], [~pnowojski] -- This message was sent by Atlassian Jira (v8.3.4#803005)
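The race and the proposed fix direction can be sketched with a stripped-down model. The class and method names below are illustrative and much simplified, not Flink's actual Task/StreamTask API: the point is only that exception-driven cleanup misses the window between a successful restore and a skipped invoke, while caller-side cleanup in a finally block does not.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical simplification of the race in FLINK-23862: if cancel() lands
// after restore() succeeded but before invoke() starts, nothing throws, so a
// cleanup path that only runs on exceptions never fires.
public class CancelCleanupDemo {

    static class Task {
        final AtomicBoolean canceled = new AtomicBoolean(false);
        final AtomicBoolean cleanedUp = new AtomicBoolean(false);

        void restore() { /* state restore succeeds, nothing is thrown */ }

        void invoke() {
            // Skipped entirely when the task was canceled in between.
            if (canceled.get()) {
                return;
            }
        }

        void cleanup() {
            cleanedUp.set(true); // close operator chain, release resources
        }

        // Buggy shape: cleanup only runs when something throws.
        void runExceptionDrivenCleanup() {
            try {
                restore();
                invoke();
            } catch (Exception e) {
                cleanup();
            }
        }

        // Fixed shape: the caller always runs cleanup.
        void runWithGuaranteedCleanup() {
            try {
                restore();
                invoke();
            } finally {
                cleanup();
            }
        }
    }

    public static void main(String[] args) {
        Task buggy = new Task();
        buggy.canceled.set(true); // cancel arrives between restore and invoke
        buggy.runExceptionDrivenCleanup();
        System.out.println("buggy cleaned up: " + buggy.cleanedUp.get());

        Task fixed = new Task();
        fixed.canceled.set(true);
        fixed.runWithGuaranteedCleanup();
        System.out.println("fixed cleaned up: " + fixed.cleanedUp.get());
    }
}
```

Making cleanup the caller's responsibility, as the ticket suggests for Task, closes the window regardless of when cancellation arrives.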
[DISCUSS] Merge Kafka-related PRs after feature freeze
Dear devs, we would like to merge these PRs after the feature freeze: FLINK-23838: Add FLIP-33 metrics to new KafkaSink [1] FLINK-23801: Add FLIP-33 metrics to KafkaSource [2] FLINK-23640: Create a KafkaRecordSerializationSchemas builder [3] All three PRs are smaller quality-of-life improvements that are purely implemented in flink-connector-kafka, so the risk of merging them is minimal in terms of production stability. They also reuse existing test infrastructure, so I expect little impact on test stability. We are still polishing the PRs and would be ready to merge them on Friday, when the objection period is over. Happy to hear your thoughts, Arvid [1] https://github.com/apache/flink/pull/16875 [2] https://github.com/apache/flink/pull/16838 [3] https://github.com/apache/flink/pull/16783
[jira] [Created] (FLINK-23861) flink sql client support dynamic params
zhangbinzaifendou created FLINK-23861: - Summary: flink sql client support dynamic params Key: FLINK-23861 URL: https://issues.apache.org/jira/browse/FLINK-23861 Project: Flink Issue Type: Improvement Components: Table SQL / Client Affects Versions: 1.13.2 Reporter: zhangbinzaifendou 1. Every time the set command is executed, the method call chain is very long and a new TableEnvironment object is created via createTableEnvironment. 2. As a result of the previous discussion in [FLINK-22770|https://issues.apache.org/jira/browse/FLINK-22770], I don't think that adding quotation marks to key and value is a good habit for users. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-23860) Conversion to relational algebra failed to preserve datatypes
lixu created FLINK-23860: Summary: Conversion to relational algebra failed to preserve datatypes Key: FLINK-23860 URL: https://issues.apache.org/jira/browse/FLINK-23860 Project: Flink Issue Type: Bug Components: Table SQL / Planner Affects Versions: 1.13.2, 1.13.1 Reporter: lixu Fix For: 1.14.0, 1.13.3
{code:java}
// code placeholder
StreamExecutionEnvironment streamExecutionEnvironment = StreamExecutionEnvironment.getExecutionEnvironment();
StreamTableEnvironment tableEnvironment = StreamTableEnvironment.create(streamExecutionEnvironment);
tableEnvironment.executeSql("CREATE TABLE datagen (\n"
    + " f_sequence INT,\n"
    + " f_random INT,\n"
    + " f_random_str STRING,\n"
    + " ts AS localtimestamp,\n"
    + " WATERMARK FOR ts AS ts\n"
    + ") WITH (\n"
    + " 'connector' = 'datagen',\n"
    + " 'rows-per-second'='5',\n"
    + " 'fields.f_sequence.kind'='sequence',\n"
    + " 'fields.f_sequence.start'='1',\n"
    + " 'fields.f_sequence.end'='1000',\n"
    + " 'fields.f_random.min'='1',\n"
    + " 'fields.f_random.max'='1000',\n"
    + " 'fields.f_random_str.length'='10'\n"
    + ")");
Table table = tableEnvironment.sqlQuery("select row(f_sequence, f_random) as c from datagen");
Table table1 = tableEnvironment.sqlQuery("select * from " + table);
table1.execute().print();
{code}
{code:java}
// exception
Exception in thread "main" java.lang.AssertionError: Conversion to relational algebra failed to preserve datatypes:
validated type:
RecordType(RecordType:peek_no_expand(INTEGER EXPR$0, INTEGER EXPR$1) NOT NULL c) NOT NULL
converted type:
RecordType(RecordType(INTEGER EXPR$0, INTEGER EXPR$1) NOT NULL c) NOT NULL
rel:
LogicalProject(c=[ROW($0, $1)])
  LogicalWatermarkAssigner(rowtime=[ts], watermark=[$3])
    LogicalProject(f_sequence=[$0], f_random=[$1], f_random_str=[$2], ts=[LOCALTIMESTAMP])
      LogicalTableScan(table=[[default_catalog, default_database, datagen]])
 at
org.apache.calcite.sql2rel.SqlToRelConverter.checkConvertedType(SqlToRelConverter.java:467)
 at org.apache.calcite.sql2rel.SqlToRelConverter.convertQuery(SqlToRelConverter.java:582)
 at org.apache.flink.table.planner.calcite.FlinkPlannerImpl.org$apache$flink$table$planner$calcite$FlinkPlannerImpl$$rel(FlinkPlannerImpl.scala:169)
 at org.apache.flink.table.planner.calcite.FlinkPlannerImpl.rel(FlinkPlannerImpl.scala:161)
 at org.apache.flink.table.planner.operations.SqlToOperationConverter.toQueryOperation(SqlToOperationConverter.java:989)
 at org.apache.flink.table.planner.operations.SqlToOperationConverter.convertSqlQuery(SqlToOperationConverter.java:958)
 at org.apache.flink.table.planner.operations.SqlToOperationConverter.convert(SqlToOperationConverter.java:283)
 at org.apache.flink.table.planner.delegation.ParserImpl.parse(ParserImpl.java:101)
 at org.apache.flink.table.api.internal.TableEnvironmentImpl.sqlQuery(TableEnvironmentImpl.java:704)
{code}
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-23859) [typo][flink-core][flink-connectors]fix typo for code
wuguihu created FLINK-23859: --- Summary: [typo][flink-core][flink-connectors]fix typo for code Key: FLINK-23859 URL: https://issues.apache.org/jira/browse/FLINK-23859 Project: Flink Issue Type: Bug Reporter: wuguihu There are some typo issues in these modules. {code:java} # Use the codespell tool to check for typo issues. pip install codespell codespell -h {code} 1. codespell flink-java/src {code:java} flink-java/src/main/java/org/apache/flink/api/java/operators/PartitionOperator.java:125: partioning ==> partitioning flink-java/src/main/java/org/apache/flink/api/java/operators/PartitionOperator.java:128: neccessary ==> necessary {code} 2. codespell flink-clients/ {code:java} flink-clients/src/test/java/org/apache/flink/client/program/DefaultPackagedProgramRetrieverTest.java:545: acessible ==> accessible {code} 3. codespell flink-connectors/ -S '*.xml' -S '*.iml' -S '*.txt' {code:java} flink-connectors/flink-connector-base/src/main/java/org/apache/flink/connector/base/source/reader/SourceReaderOptions.java:25: tht ==> that flink-connectors/flink-connector-jdbc/src/test/java/org/apache/flink/connector/jdbc/catalog/PostgresCatalogTestBase.java:192: doens't ==> doesn't flink-connectors/flink-connector-jdbc/src/main/java/org/apache/flink/connector/jdbc/dialect/JdbcDialect.java:96: PostgresSQL ==> postgresql flink-connectors/flink-connector-cassandra/src/test/java/org/apache/flink/batch/connectors/cassandra/CustomCassandraAnnotatedPojo.java:38: instanciation ==> instantiation flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/planner/delegation/hive/HiveParserCalcitePlanner.java:822: partion ==> partition flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/planner/delegation/hive/HiveParserTypeCheckProcFactory.java:943: funtion ==> function flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/planner/delegation/hive/copy/HiveASTParseDriver.java:55: funtion ==> function
flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/planner/delegation/hive/copy/HiveASTParseDriver.java:51: characteres ==> characters flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/util/KinesisConfigUtil.java:436: paremeters ==> parameters flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/planner/delegation/hive/copy/HiveParserBaseSemanticAnalyzer.java:2369: Unkown ==> Unknown flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/config/ConsumerConfigConstants.java:75: reprsents ==> represents flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/functions/hive/HiveFunctionWrapper.java:28: functino ==> function flink-connectors/flink-connector-hbase-2.2/src/main/java/org/apache/flink/connector/hbase2/source/HBaseRowDataAsyncLookupFunction.java:62: implemenation flink-connectors/flink-connector-pulsar/src/main/java/org/apache/flink/connector/pulsar/source/enumerator/cursor/StartCursor.java:70: ture ==> true flink-connectors/flink-connector-pulsar/src/test/resources/containers/txnStandalone.conf:907: partions ==> partitions flink-connectors/flink-connector-pulsar/src/test/resources/containers/txnStandalone.conf:468: implementatation ==> implementation flink-connectors/flink-connector-files/src/main/java/org/apache/flink/connector/file/src/enumerate/BlockSplittingRecursiveEnumerator.java:141: bloc ==> block flink-connectors/flink-connector-files/src/main/java/org/apache/flink/connector/file/src/reader/SimpleStreamFormat.java:37: te ==> the flink-connectors/flink-connector-kafka/src/main/java/org/apache/flink/connector/kafka/source/reader/deserializer/KafkaRecordDeserializationSchema.java:70: determin ==> determine flink-connectors/flink-connector-kafka/src/main/java/org/apache/flink/connector/kafka/source/enumerator/subscriber/TopicListSubscriber.java:36: hav ==> have 
flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/planner/delegation/hive/copy/HiveParserQBSubQuery.java:555: correlatd ==> correlated flink-connectors/flink-connector-kafka/src/test/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaConsumerBaseTest.java:263: intial ==> initial flink-connectors/flink-connector-kafka/src/test/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaConsumerBaseTest.java:302: intial ==> initial flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/internals/KinesisDataFetcher.java:249: wth ==> with flink-connectors/flink-connector-files/src/test/java/org/apache/flink/connector/file/sink/writer/FileWriterBucketStateSerializerMigrationTest.java:232: comitted ==> committed flink-connectors/flink-connector-rabbitmq/src/main/java/org/apache/flink/streaming/connec
[jira] [Created] (FLINK-23858) Clarify StreamRecord#timestamp.
Arvid Heise created FLINK-23858: --- Summary: Clarify StreamRecord#timestamp. Key: FLINK-23858 URL: https://issues.apache.org/jira/browse/FLINK-23858 Project: Flink Issue Type: Technical Debt Components: Runtime / Network Reporter: Arvid Heise The new Source apparently changed the way we specify records without timestamps. Previously, we used separate methods to create and maintain timestamp-less records. Now, we are shifting towards using a special value (TimestampAssigner#NO_TIMESTAMP). We first of all need to document that somewhere, at the very least in the JavaDoc of StreamRecord. We should also revise the consequences: - Do we want to encode it in the {{StreamElementSerializer}}? Currently, we use a flag to indicate no-timestamp on the old path, but on the new path we now use 9 bytes to encode NO_TIMESTAMP. - We should check that all code paths deal with `hasTimestamp() == true && getTimestamp() == TimestampAssigner#NO_TIMESTAMP`, in particular sinks. -- This message was sent by Atlassian Jira (v8.3.4#803005)
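The 1-byte vs. 9-byte trade-off mentioned for the serializer can be checked with plain java.io. This is a back-of-the-envelope sketch, not the real StreamElementSerializer: a tag byte that omits the long entirely costs 1 byte for a timestamp-less record, while always writing a sentinel long costs the tag plus 8 bytes.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Rough size comparison for the two encodings discussed in FLINK-23858.
public class TimestampEncodingDemo {

    // Sentinel value, in the spirit of TimestampAssigner#NO_TIMESTAMP.
    static final long NO_TIMESTAMP = Long.MIN_VALUE;

    // Old-style: a tag byte distinguishes the cases; the long is only
    // written when a timestamp is present.
    static byte[] encodeWithFlag(boolean hasTimestamp, long ts) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeByte(hasTimestamp ? 1 : 0);
            if (hasTimestamp) {
                out.writeLong(ts);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Sentinel-style: tag + long are always written, even for NO_TIMESTAMP.
    static byte[] encodeWithSentinel(long ts) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeByte(1);
            out.writeLong(ts); // NO_TIMESTAMP still costs a full 8 bytes here
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("flag, no ts:     " + encodeWithFlag(false, 0).length + " bytes");
        System.out.println("sentinel, no ts: " + encodeWithSentinel(NO_TIMESTAMP).length + " bytes");
    }
}
```

So for timestamp-less records the flag encoding saves 8 bytes per record, which is the overhead the ticket asks about.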
[jira] [Created] (FLINK-23857) insert overwrite table select * from t where 1 != 1, Unable to clear table data
lixu created FLINK-23857: Summary: insert overwrite table select * from t where 1 != 1, Unable to clear table data Key: FLINK-23857 URL: https://issues.apache.org/jira/browse/FLINK-23857 Project: Flink Issue Type: Bug Components: Table SQL / Planner Affects Versions: 1.13.1 Reporter: lixu Fix For: 1.14.0, 1.13.3 "insert overwrite table select * from t where 1 != 1" is unable to clear the table data, unlike Hive. -- This message was sent by Atlassian Jira (v8.3.4#803005)
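The semantics the report expects (the Hive-like behavior) can be illustrated with a toy in-memory "table". This is an illustration of the expectation only, not Flink or Hive code: INSERT OVERWRITE replaces the table contents with the query result, so a query that matches no rows ("WHERE 1 != 1") should leave the table empty.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of INSERT OVERWRITE semantics: the table ends up holding exactly
// the query result, whatever it held before.
public class InsertOverwriteDemo {

    static List<String> overwrite(List<String> table, List<String> queryResult) {
        // Copy first so an aliased queryResult survives the clear().
        List<String> newContents = new ArrayList<>(queryResult);
        table.clear();
        table.addAll(newContents);
        return table;
    }

    public static void main(String[] args) {
        List<String> table = new ArrayList<>(List.of("a", "b", "c"));
        List<String> emptyResult = new ArrayList<>(); // SELECT * FROM t WHERE 1 != 1
        overwrite(table, emptyResult);
        System.out.println("rows after overwrite: " + table.size());
    }
}
```

The bug report is that Flink's INSERT OVERWRITE keeps the old rows in this empty-result case instead of clearing them.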
Re: [DISCUSS] Merge FLINK-23757 after feature freeze
@Dian Fu could you assess how involved this change is? If the change is not very involved and the risk is limited, then I'd be in favour of merging it because feature parity of APIs is quite important for our users. Cheers, Till On Wed, Aug 18, 2021 at 1:46 PM Ingo Bürk wrote: > Hello dev, > > I was wondering whether we could also consider merging FLINK-23757[1][2] > after the freeze. This is about exposing two built-in functions which we > added to Table API & SQL prior to the freeze also for PyFlink. Meaning that > the feature itself isn't new, we only expose it on the Python API, and as > such it's also entirely isolated from the rest of PyFlink and Flink itself. > As such I'm not sure this is considered a new feature, but I'd rather ask. > The main motivation for this would be to retain parity on the APIs. Thanks! > > [1] https://issues.apache.org/jira/browse/FLINK-23757 > [2] https://github.com/apache/flink/pull/16874 > > > Best > Ingo >
[DISCUSS] Merge FLINK-23757 after feature freeze
Hello dev, I was wondering whether we could also consider merging FLINK-23757[1][2] after the freeze. This is about exposing two built-in functions which we added to Table API & SQL prior to the freeze also for PyFlink. Meaning that the feature itself isn't new, we only expose it on the Python API, and as such it's also entirely isolated from the rest of PyFlink and Flink itself. As such I'm not sure this is considered a new feature, but I'd rather ask. The main motivation for this would be to retain parity on the APIs. Thanks! [1] https://issues.apache.org/jira/browse/FLINK-23757 [2] https://github.com/apache/flink/pull/16874 Best Ingo
[jira] [Created] (FLINK-23856) Support all JSON methods in PyFlink
Ingo Bürk created FLINK-23856: - Summary: Support all JSON methods in PyFlink Key: FLINK-23856 URL: https://issues.apache.org/jira/browse/FLINK-23856 Project: Flink Issue Type: Sub-task Reporter: Ingo Bürk -- This message was sent by Atlassian Jira (v8.3.4#803005)
Re: Configuring flink-python in IntelliJ
Hi Dian, thanks for responding! No, I haven't tried, but I'm aware of it. I don't work exclusively on PyFlink, but rather just occasionally. So instead of having a whole separate IDE and project for one module I'm trying to get the whole Flink project to work as one in IntelliJ. Do PyFlink developers generally work solely on PyFlink and nothing else / do you switch IDEs for everything? Best Ingo On Wed, Aug 18, 2021 at 1:20 PM Dian Fu wrote: > Hi Ingo, > > Thanks a lot for starting up this discussion. I have not tried IntelliJ and > I’m using PyCharm for PyFlink development. Have you tried this guideline? > [1] > It’s written in 1.9 and I think it should still work. > > Regards, > Dian > > [1] > > https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/flinkdev/ide_setup/#pycharm > > On Wed, Aug 18, 2021 at 5:38 PM Till Rohrmann > wrote: > > > Thanks for starting this discussion Ingo. I guess that Dian can probably > > answer this question. I am big +1 for updating our documentation for how > to > > set up the IDE since this problem will probably be encountered a couple > of > > times. > > > > Cheers, > > Till > > > > On Wed, Aug 18, 2021 at 11:03 AM Ingo Bürk wrote: > > > > > Hello @dev, > > > > > > like probably most of you, I am using IntelliJ to work on Flink. > Lately I > > > needed to also work with flink-python, and thus was wondering about how > > to > > > properly set it up to work in IntelliJ. So far, I have done the > > following: > > > > > > 1. Install the Python plugin > > > 2. Set up a custom Python SDK using a virtualenv > > > 3. Configure the flink-python module to use this Python SDK (rather > than > > a > > > Java SDK) > > > 4. Install as many of the Python dependencies as possible when IntelliJ > > > prompted me to do so > > > > > > This got me into a mostly working state, for example I can run tests > from > > > the IDE, at least the ones I tried so far. 
However, there are two > > concerns: > > > > > > a) Step (3) will be undone on every Maven import since Maven sets the > SDK > > > for the module > > > b) Step (4) installed most, but a couple of Python dependencies could > not > > > be installed, though so far that didn't cause noticeable problems > > > > > > I'm wondering if there are additional / different steps to do, > > specifically > > > for (a), and maybe how the PyFlink developers are configuring this in > > their > > > IDE. The IDE setup guide didn't seem to contain information about that > > > (only about separately setting up this module in PyCharm). Ideally, we > > > could even update the IDE setup guide. > > > > > > > > > Best > > > Ingo > > > > > >
Re: Configuring flink-python in IntelliJ
Hi Ingo, Thanks a lot for starting up this discussion. I have not tried IntelliJ and I’m using PyCharm for PyFlink development. Have you tried this guideline? [1] It’s written in 1.9 and I think it should still work. Regards, Dian [1] https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/flinkdev/ide_setup/#pycharm On Wed, Aug 18, 2021 at 5:38 PM Till Rohrmann wrote: > Thanks for starting this discussion Ingo. I guess that Dian can probably > answer this question. I am big +1 for updating our documentation for how to > set up the IDE since this problem will probably be encountered a couple of > times. > > Cheers, > Till > > On Wed, Aug 18, 2021 at 11:03 AM Ingo Bürk wrote: > > > Hello @dev, > > > > like probably most of you, I am using IntelliJ to work on Flink. Lately I > > needed to also work with flink-python, and thus was wondering about how > to > > properly set it up to work in IntelliJ. So far, I have done the > following: > > > > 1. Install the Python plugin > > 2. Set up a custom Python SDK using a virtualenv > > 3. Configure the flink-python module to use this Python SDK (rather than > a > > Java SDK) > > 4. Install as many of the Python dependencies as possible when IntelliJ > > prompted me to do so > > > > This got me into a mostly working state, for example I can run tests from > > the IDE, at least the ones I tried so far. However, there are two > concerns: > > > > a) Step (3) will be undone on every Maven import since Maven sets the SDK > > for the module > > b) Step (4) installed most, but a couple of Python dependencies could not > > be installed, though so far that didn't cause noticeable problems > > > > I'm wondering if there are additional / different steps to do, > specifically > > for (a), and maybe how the PyFlink developers are configuring this in > their > > IDE. The IDE setup guide didn't seem to contain information about that > > (only about separately setting up this module in PyCharm). 
Ideally, we > > could even update the IDE setup guide. > > > > > > Best > > Ingo > > >
[jira] [Created] (FLINK-23855) Table API & SQL Configuration Not displayed on flink dashboard
simenliuxing created FLINK-23855: Summary: Table API & SQL Configuration Not displayed on flink dashboard Key: FLINK-23855 URL: https://issues.apache.org/jira/browse/FLINK-23855 Project: Flink Issue Type: Improvement Components: Runtime / Web Frontend Affects Versions: 1.13.2 Reporter: simenliuxing Fix For: 1.14.0 Attachments: image-2021-08-18-18-48-10-985.png branch: 1.13.2 planner: blink Hi, when I run a Flink SQL task in standalone mode, I set some parameters starting with "table.", but I can't find them on the dashboard, although I know these parameters are effective. Can these parameters be displayed somewhere? -- This message was sent by Atlassian Jira (v8.3.4#803005)
Re: Configuring flink-python in IntelliJ
Thanks for starting this discussion Ingo. I guess that Dian can probably answer this question. I am big +1 for updating our documentation for how to set up the IDE since this problem will probably be encountered a couple of times. Cheers, Till On Wed, Aug 18, 2021 at 11:03 AM Ingo Bürk wrote: > Hello @dev, > > like probably most of you, I am using IntelliJ to work on Flink. Lately I > needed to also work with flink-python, and thus was wondering about how to > properly set it up to work in IntelliJ. So far, I have done the following: > > 1. Install the Python plugin > 2. Set up a custom Python SDK using a virtualenv > 3. Configure the flink-python module to use this Python SDK (rather than a > Java SDK) > 4. Install as many of the Python dependencies as possible when IntelliJ > prompted me to do so > > This got me into a mostly working state, for example I can run tests from > the IDE, at least the ones I tried so far. However, there are two concerns: > > a) Step (3) will be undone on every Maven import since Maven sets the SDK > for the module > b) Step (4) installed most, but a couple of Python dependencies could not > be installed, though so far that didn't cause noticeable problems > > I'm wondering if there are additional / different steps to do, specifically > for (a), and maybe how the PyFlink developers are configuring this in their > IDE. The IDE setup guide didn't seem to contain information about that > (only about separately setting up this module in PyCharm). Ideally, we > could even update the IDE setup guide. > > > Best > Ingo >
Re: [DISCUSS] FLIP-173: Support DAG of algorithms (Flink ML)
Hi, Mingliang, Thank you for providing a real-world case of heterogeneous topology in the training and inference phases; Becket has given you two options to choose from. Personally, I think Becket's two options are over-simplified in their description and may be somewhat misleading. Here, I would like to add some of my thoughts: Proposal Option-2 does NOT have to implement two DAGs in ALL cases. In most cases, the best practice in Proposal Option-2 is to put the common part (the inference part) into a Pipeline. In the training phase, the data is preprocessed by AlgoOps or another pipeline and then fed to Pipeline.fit(). The output PipelineModel can be directly used in the inference phase. The code will be much clearer and cleaner than the complicated manipulation of estimatorInputs and transformerInput in the Graph API. Proposal Option-1 can NOT ALWAYS encapsulate a heterogeneous topology with the Graph/GraphBuilder API. In [1], we already list some cases where the Graph API fails to encapsulate a complicated topology, and we also presented the concrete scenarios we encountered. Such an incapability could bring extra effort when incrementally developing your ML task. Even if Mingliang's cases happen to fall into the rare situations where both of Becket's options apply, the actual differences between the two options are shown in code snippets in [1]. I personally do not think implementing two DAGs brings much overhead. You may check those code snippets if you would like. As far as I can see, most inference/predict pipelines are used for online serving (as in offline inference, there is no need to export models). In the situation of online serving, the corresponding pipeline can only accept one dataset and produce one dataset. This means item 1 above applies: Proposal Option-2 does the same thing in a clear and clean way. So, Mingliang, if it does not bother you much, you may give more information about your scenarios, keeping the supplementary information above in mind.
[1] https://docs.google.com/document/d/1L3aI9LjkcUPoM52liEY6uFktMnFMNFQ6kXAjnz_11do Sincerely, Fan Hong -- From: 青雉(祁明良) Sent: Tuesday, August 10, 2021, 11:36 To: dev@flink.apache.org Subject: Re: [DISCUSS] FLIP-173: Support DAG of algorithms (Flink ML) Vote for option 2. It is similar to what we are doing with Tensorflow. 1. Define the graph in the training phase 2. Export the model with different input/output specs for online inference Thanks, Mingliang On Aug 10, 2021, at 9:39 AM, Becket Qin <becket@gmail.com> wrote:
[jira] [Created] (FLINK-23854) KafkaSink error when restart from the checkpoint with a lower parallelism
Hang Ruan created FLINK-23854: - Summary: KafkaSink error when restart from the checkpoint with a lower parallelism Key: FLINK-23854 URL: https://issues.apache.org/jira/browse/FLINK-23854 Project: Flink Issue Type: Bug Components: Connectors / Kafka Affects Versions: 1.14.0 Reporter: Hang Ruan The KafkaSink throws an exception when restarted with a lower parallelism. The exception is like this:
{code:java}
java.lang.IllegalStateException: Internal error: It is expected that state from previous executions is distributed to the same subtask id.
 at org.apache.flink.util.Preconditions.checkState(Preconditions.java:193)
 at org.apache.flink.connector.kafka.sink.KafkaWriter.recoverAndInitializeState(KafkaWriter.java:178)
 at org.apache.flink.connector.kafka.sink.KafkaWriter.<init>(KafkaWriter.java:130)
 at org.apache.flink.connector.kafka.sink.KafkaSink.createWriter(KafkaSink.java:99)
 at org.apache.flink.streaming.runtime.operators.sink.SinkOperator.initializeState(SinkOperator.java:134)
 at org.apache.flink.streaming.api.operators.StreamOperatorStateHandler.initializeOperatorState(StreamOperatorStateHandler.java:118)
 at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:286)
 at org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.initializeStateAndOpenOperators(RegularOperatorChain.java:109)
 at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreGates(StreamTask.java:690)
 at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.call(StreamTaskActionExecutor.java:55)
 at org.apache.flink.streaming.runtime.tasks.StreamTask.executeRestore(StreamTask.java:666)
 at org.apache.flink.streaming.runtime.tasks.StreamTask.runWithCleanUpOnFail(StreamTask.java:785)
 at org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:638)
 at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:766)
 at org.apache.flink.runtime.taskmanager.Task.run(Task.java:572)
 at java.lang.Thread.run(Thread.java:748)
Suppressed: java.lang.NullPointerException
 at org.apache.flink.streaming.runtime.operators.sink.SinkOperator.close(SinkOperator.java:195)
 at org.apache.flink.streaming.runtime.tasks.StreamOperatorWrapper.close(StreamOperatorWrapper.java:141)
 at org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.closeAllOperators(RegularOperatorChain.java:127)
 at org.apache.flink.streaming.runtime.tasks.StreamTask.closeAllOperators(StreamTask.java:1028)
 at org.apache.flink.streaming.runtime.tasks.StreamTask.runAndSuppressThrowable(StreamTask.java:1014)
 at org.apache.flink.streaming.runtime.tasks.StreamTask.cleanUpInvoke(StreamTask.java:927)
 at org.apache.flink.streaming.runtime.tasks.StreamTask.runWithCleanUpOnFail(StreamTask.java:797)
 ... 4 more
{code}
I started the Kafka cluster (kafka_2.13-2.8.0) and the Flink cluster on my own Mac. I changed the parallelism from 4 to 2 and restarted the job from a completed checkpoint; then the error occurs. The CLI command and the code are as follows.
{code:java}
// cli command
./bin/flink run -d -c com.test.KafkaExactlyOnceScaleDownTest -s /Users/test/checkpointDir/ExactlyOnceTest1/67105fcc1724e147fc6208af0dd90618/chk-1 /Users/test/project/self/target/test.jar
{code}
{code:java}
public class KafkaExactlyOnceScaleDownTest {
    public static void main(String[] args) throws Exception {
        final String kafkaSourceTopic = "flinkSourceTest";
        final String kafkaSinkTopic = "flinkSinkExactlyTest1";
        final String groupId = "ExactlyOnceTest1";
        final String brokers = "localhost:9092";
        final String ckDir = "file:///Users/hangruan/checkpointDir/" + groupId;

        final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.enableCheckpointing(6);
        env.getCheckpointConfig().setCheckpointingMode(CheckpointingMode.EXACTLY_ONCE);
        env.getCheckpointConfig().enableExternalizedCheckpoints(
                CheckpointConfig.ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION);
        env.getCheckpointConfig().setCheckpointStorage(ckDir);
        env.setParallelism(4);

        KafkaSource<String> source = KafkaSource.<String>builder()
                .setBootstrapServers(brokers)
                .setTopics(kafkaSourceTopic)
                .setGroupId(groupId)
                .setProperty(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest")
                .setValueOnlyDeserializer(new SimpleStringSchema())
                .build();

        DataStream<String> flintstones = env.fromSource(source, WatermarkStrategy.noWatermarks(), "Kafka Source");
        DataStream<String> adults = flintstones.filter(s -> s != null && s.length() > 2);

        Properties props = new Properties();
        props.setProperty("transaction.timeout.ms", "90");
        adults.sinkTo(KafkaSink.<String>builder()
                .setBootstrapServers(brokers)
                .setDeliverGuarantee(DeliveryGuarantee.EXACTLY_ONCE)
                .setTransactional
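Why the precondition must trip when going from parallelism 4 to 2 can be modeled with a hypothetical round-robin redistribution of writer states (the names and the exact assignment strategy are illustrative, not Flink's actual code): each state remembers the subtask id that produced it, and with fewer subtasks some states inevitably land on a subtask with a different id.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the failing check in FLINK-23854: on recovery, states
// produced by oldParallelism subtasks are spread round-robin over
// newParallelism subtasks. With a scale-down, a state produced by subtask 3
// can land on subtask 1, so "state's subtask id == current subtask id" fails.
public class ScaleDownStateDemo {

    // assignment.get(i) holds the producer subtask ids whose state lands on subtask i.
    static List<List<Integer>> redistribute(int oldParallelism, int newParallelism) {
        List<List<Integer>> assignment = new ArrayList<>();
        for (int i = 0; i < newParallelism; i++) {
            assignment.add(new ArrayList<>());
        }
        for (int producer = 0; producer < oldParallelism; producer++) {
            assignment.get(producer % newParallelism).add(producer);
        }
        return assignment;
    }

    public static void main(String[] args) {
        // 4 -> 2: subtask 0 receives states from {0, 2}, subtask 1 from {1, 3}.
        List<List<Integer>> assignment = redistribute(4, 2);
        for (int subtask = 0; subtask < assignment.size(); subtask++) {
            for (int producer : assignment.get(subtask)) {
                if (producer != subtask) {
                    System.out.println("subtask " + subtask + " got state from subtask "
                            + producer + " -> precondition violated");
                }
            }
        }
    }
}
```

With equal old and new parallelism every state stays on its own subtask id, which is why the job only fails after the scale-down.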
Configuring flink-python in IntelliJ
Hello @dev,

like probably most of you, I am using IntelliJ to work on Flink. Lately I also needed to work with flink-python, and thus was wondering how to properly set it up in IntelliJ. So far, I have done the following:

1. Install the Python plugin
2. Set up a custom Python SDK using a virtualenv
3. Configure the flink-python module to use this Python SDK (rather than a Java SDK)
4. Install as many of the Python dependencies as possible when IntelliJ prompted me to do so

This got me into a mostly working state; for example, I can run tests from the IDE, at least the ones I have tried so far. However, there are two concerns:

a) Step (3) is undone on every Maven import, since Maven sets the SDK for the module.
b) Step (4) installed most dependencies, but a couple of the Python dependencies could not be installed, though so far that hasn't caused noticeable problems.

I'm wondering whether there are additional or different steps, specifically for (a), and how the PyFlink developers configure this in their IDEs. The IDE setup guide didn't seem to contain information about that (only about separately setting up this module in PyCharm). Ideally, we could even update the IDE setup guide.

Best
Ingo
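(For reference, step (2) above can be approximated on the command line; this is a hedged sketch, not from the thread. The virtualenv name is made up, and the editable-install command for step (4) is an assumption about how flink-python's setup.py dependencies could be installed outside the IDE.)

```shell
# Step (2): create a virtualenv to use as the flink-python module SDK.
python3 -m venv .venv-flink-python
. .venv-flink-python/bin/activate

# Step (4) could then be done manually from the Flink checkout root
# (assumed path and approach, not from the thread):
# python -m pip install -e flink-python
```

The resulting interpreter at `.venv-flink-python/bin/python` is what would be registered as the Python SDK in IntelliJ.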
Flink 1.14 Bi-weekly 2021-08-17 Feature Freeze done, now stabilise!
Dear Flink Community,

Thanks for putting all that effort into the 1.14 release. We are now past the feature freeze, and it looks like most of the features made it. This will be a decent release. Here's a summary of today's bi-weekly, which was a bit longer than usual.

*Exceptions to the feature freeze*
There are two efforts, around FLIP-147 and the Kafka source, which have some pending PRs. The people driving them forward will (or already did) reach out to the mailing list and ask for an exception to merge them.

*1.14 release wiki page*
Please keep the page [1] up to date. To mark well-documented and well-tested features, we are introducing the new state "done done". Please mark a feature as "done done" when it is well documented and tested. Please add issues or PRs for documentation to the corresponding column, as well as tickets that can be picked up by other teams for cross-team testing.

*Cleaning up issues*
If a parent issue targeting the 1.14 release still has open subtasks, please move them to a new ticket and close the old one for the sake of transparency and clarity.

*Documentation*
During the stabilisation phase, documentation issues will be handled as blockers, as they also have an impact on testability.

*Cross-team testing*
As mentioned above, those responsible for a feature should create a ticket for cross-team testing containing instructions for testing the newly developed feature. This should be linked on the wiki page. Others should pick some of those tickets and test features they did not contribute to.

*Build stability*
Over the last weeks, build stability has been poor, which is somewhat natural given that a lot of features have just been merged. However, this is now blocking us from moving towards a first release candidate. We encourage every contributor to look into these issues and pick some up to improve the general CI experience. The release managers will try to find assignees for blockers as soon as they appear, and for critical issues during the syncs.
Please make sure to report issues with the right severity and fix version. You can use the Kanban board to get an overview and pick up issues. There are ready-to-use filters for 1.14-related blockers, critical issues, instabilities, and cross-team testing issues. So once again: make sure the fixVersion is set to "1.14.0" so that issues appear on this board. [2]

*Syncs*
From now on, the syncs will happen weekly (Tuesdays, same time, same place). If the load on the release managers becomes too high, we will move to two meetings per week. In the syncs we will go through the blockers and assign unassigned critical issues.

*Release branch and RC0*
As we want to collect user feedback as early as possible, we'd like to push for branching off the release and creating the first release candidate as soon as possible. This requires work on the stability issues (see above). We will decide on the timing in next week's sync.

Thanks for making this happen.

Best
Xintong, Dawid & Joe

[1] https://cwiki.apache.org/confluence/display/FLINK/1.14+Release
[2] https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=468
[jira] [Created] (FLINK-23853) Update StateFun's Flink dependency to 1.13.2
Tzu-Li (Gordon) Tai created FLINK-23853: --- Summary: Update StateFun's Flink dependency to 1.13.2 Key: FLINK-23853 URL: https://issues.apache.org/jira/browse/FLINK-23853 Project: Flink Issue Type: Improvement Components: Stateful Functions Reporter: Tzu-Li (Gordon) Tai Assignee: Tzu-Li (Gordon) Tai Fix For: statefun-3.1.0 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-23852) In sql-client mode, could not change the properties.
Shuaishuai Guo created FLINK-23852: -- Summary: In sql-client mode, could not change the properties. Key: FLINK-23852 URL: https://issues.apache.org/jira/browse/FLINK-23852 Project: Flink Issue Type: Improvement Components: Documentation, Table SQL / Client Affects Versions: 1.13.0 Reporter: Shuaishuai Guo Attachments: 微信图片_20210818152914.png

When we set a property as follows, it does not work:

{code}
SET 'sql-client.execution.result-mode' = 'tableau';
{code}

The reason is that the single quotation marks around the property key are needless. It should be used like this:

{code}
SET sql-client.execution.result-mode = 'tableau';
{code}

Note: all of the properties have this problem.

-- This message was sent by Atlassian Jira (v8.3.4#803005)