[jira] [Created] (FLINK-21332) Optimize releasing result partitions in RegionPartitionReleaseStrategy

2021-02-08 Thread Zhilong Hong (Jira)
Zhilong Hong created FLINK-21332:


 Summary: Optimize releasing result partitions in 
RegionPartitionReleaseStrategy
 Key: FLINK-21332
 URL: https://issues.apache.org/jira/browse/FLINK-21332
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Coordination
Reporter: Zhilong Hong
 Fix For: 1.13.0


RegionPartitionReleaseStrategy is responsible for releasing result partitions 
when all the downstream tasks finish.

The current implementation is:
{code:java}
for each consumed SchedulingResultPartition of current finished SchedulingPipelinedRegion:
  for each consumer SchedulingPipelinedRegion of the SchedulingResultPartition:
    if all the regions are finished:
      release the partitions
{code}
The time complexity of releasing a result partition is O(N^2). However, 
considering that during the entire stage, all the result partitions need to be 
released, the time complexity is actually O(N^3).

After the optimization of DefaultSchedulingTopology, the consumed result 
partitions are grouped. Since the result partitions in one group are 
isomorphic (they share the same consumers), we can simply cache the finished 
status of each result partition group and of its corresponding pipelined regions.

The optimized implementation is:
{code:java}
for each ConsumedPartitionGroup of current finished SchedulingPipelinedRegion:
  if all consumer SchedulingPipelinedRegions of the ConsumedPartitionGroup are finished:
    set the ConsumedPartitionGroup to be fully consumed
    for each result partition in the ConsumedPartitionGroup:
      if all the ConsumedPartitionGroups it belongs to are fully consumed:
        release the result partition
{code}
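
The optimized procedure above could be sketched in Java roughly as follows. This is only an illustration under stated assumptions: {{GroupReleaseSketch}}, {{PartitionGroup}}, {{register}} and {{onRegionFinished}} are hypothetical names, and regions and partitions are modeled as plain strings rather than Flink's scheduling types.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustrative sketch only; class and method names are hypothetical, not Flink's API. */
final class GroupReleaseSketch {

    /** Stand-in for a ConsumedPartitionGroup: partitions that share the same consumer regions. */
    static final class PartitionGroup {
        final List<String> partitions;
        final Set<String> consumerRegions;
        boolean fullyConsumed; // cached status, set once all consumer regions have finished

        PartitionGroup(List<String> partitions, Set<String> consumerRegions) {
            this.partitions = partitions;
            this.consumerRegions = consumerRegions;
        }
    }

    private final Set<String> finishedRegions = new HashSet<>();
    private final Map<String, List<PartitionGroup>> groupsConsumedByRegion = new HashMap<>();
    private final Map<String, List<PartitionGroup>> groupsOfPartition = new HashMap<>();

    /** Indexes a group by its consumer regions and by its member partitions. */
    void register(PartitionGroup group) {
        for (String region : group.consumerRegions) {
            groupsConsumedByRegion.computeIfAbsent(region, k -> new ArrayList<>()).add(group);
        }
        for (String partition : group.partitions) {
            groupsOfPartition.computeIfAbsent(partition, k -> new ArrayList<>()).add(group);
        }
    }

    /** Marks a region as finished and returns the partitions that can now be released. */
    List<String> onRegionFinished(String region) {
        finishedRegions.add(region);
        List<String> releasable = new ArrayList<>();
        for (PartitionGroup group :
                groupsConsumedByRegion.getOrDefault(region, Collections.emptyList())) {
            // Skip groups that were handled earlier or still have running consumers.
            if (group.fullyConsumed || !finishedRegions.containsAll(group.consumerRegions)) {
                continue;
            }
            group.fullyConsumed = true;
            for (String partition : group.partitions) {
                boolean allGroupsConsumed = true;
                for (PartitionGroup owner : groupsOfPartition.get(partition)) {
                    if (!owner.fullyConsumed) {
                        allGroupsConsumed = false;
                        break;
                    }
                }
                if (allGroupsConsumed) {
                    releasable.add(partition);
                }
            }
        }
        return releasable;
    }
}
```

In this sketch the finished status is cached once per group, so a region that finishes checks each of its consumed groups once instead of checking every consumed partition individually.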



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS]FLIP-163: SQL Client Improvements

2021-02-08 Thread Shengkai Fang
Hi, all.

I think it may confuse users. The main problem is that we have no means
of detecting conflicting configuration, e.g. when users set the option to
true and also use `TableResult#await`.

Best,
Shengkai.


[jira] [Created] (FLINK-21331) Optimize calculating tasks to restart in RestartPipelinedRegionFailoverStrategy

2021-02-08 Thread Zhilong Hong (Jira)
Zhilong Hong created FLINK-21331:


 Summary: Optimize calculating tasks to restart in 
RestartPipelinedRegionFailoverStrategy
 Key: FLINK-21331
 URL: https://issues.apache.org/jira/browse/FLINK-21331
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Coordination
Reporter: Zhilong Hong
 Fix For: 1.13.0


RestartPipelinedRegionFailoverStrategy is used to calculate the tasks to 
restart when a task failure occurs. The calculation has two parts: first, 
determine the regions to restart; then add all the tasks in those regions to 
the restart queue.

The bottleneck is mainly in the first part. This part traverses all the 
upstream and downstream regions of the failed region to determine whether they 
should be restarted or not.

For the current failed region, if one of its consumed result partitions is not 
available, the owner, i.e., the upstream region, should restart. Also, since the 
failed region needs to restart, its result partitions won't be available, so all 
the downstream regions need to restart, too.

1. Calculating the upstream regions that should restart

The current implementation is:
{code:java}
for each SchedulingExecutionVertex in current visited SchedulingPipelinedRegion:
  for each consumed SchedulingResultPartition of the SchedulingExecutionVertex:
    if the result partition is not available:
      add the producer region to the restart queue
{code}
Based on FLINK-21328, the consumed result partitions of a vertex are already 
grouped. Here we can use a HashSet to record the visited result partition 
groups. Vertices connected with all-to-all edges then only need to traverse 
the shared group once. This decreases the time complexity from O(N^2) to O(N).

2. Calculating the downstream regions that should restart

The current implementation is:
{code:java}
for each SchedulingExecutionVertex in current visited SchedulingPipelinedRegion:
  for each produced SchedulingResultPartition of the SchedulingExecutionVertex:
    for each consumer SchedulingExecutionVertex of the produced SchedulingResultPartition:
      if the region containing the consumer SchedulingExecutionVertex is not visited:
        add the region to the restart queue
{code}
Since the count of the produced result partitions of a vertex equals the count 
of output JobEdges, the time complexity of this procedure is actually O(N^2). 
As the consumer vertices of a result partition are already grouped, we can use 
a HashSet to record the visited ConsumerVertexGroup. The time complexity 
decreases from O(N^2) to O(N).
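
A minimal Java sketch of the downstream traversal with a HashSet of visited groups might look like this. It is an assumption-laden illustration: {{RestartRegionSketch}}, {{VertexGroup}} and the three lookup maps are hypothetical stand-ins for the scheduling topology, and vertices and regions are modeled as strings.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustrative sketch only; names are hypothetical, not Flink's API. */
final class RestartRegionSketch {

    /** Stand-in for a ConsumerVertexGroup: the vertices that consume the same partitions. */
    static final class VertexGroup {
        final List<String> vertices;

        VertexGroup(String... vertices) {
            this.vertices = Arrays.asList(vertices);
        }
    }

    /** Returns the failed region plus every downstream region reachable from it. */
    static Set<String> downstreamRegionsToRestart(
            String failedRegion,
            Map<String, List<String>> verticesOfRegion,
            Map<String, List<VertexGroup>> consumerGroupsOfVertex,
            Map<String, String> regionOfVertex) {
        Set<String> visitedRegions = new LinkedHashSet<>();
        // VertexGroup does not override equals, so the HashSet tracks group instances.
        Set<VertexGroup> visitedGroups = new HashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        visitedRegions.add(failedRegion);
        queue.add(failedRegion);
        while (!queue.isEmpty()) {
            String region = queue.poll();
            for (String vertex : verticesOfRegion.getOrDefault(region, Collections.emptyList())) {
                for (VertexGroup group :
                        consumerGroupsOfVertex.getOrDefault(vertex, Collections.emptyList())) {
                    if (!visitedGroups.add(group)) {
                        continue; // group already traversed via another producer vertex
                    }
                    for (String consumer : group.vertices) {
                        String consumerRegion = regionOfVertex.get(consumer);
                        if (visitedRegions.add(consumerRegion)) {
                            queue.add(consumerRegion);
                        }
                    }
                }
            }
        }
        return visitedRegions;
    }
}
```

Because a shared group is skipped after its first visit, producers connected through an all-to-all edge expand their common consumers once in total rather than once per producer.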





[jira] [Created] (FLINK-21330) Optimize the initialization of PipelinedRegionSchedulingStrategy

2021-02-08 Thread Zhilong Hong (Jira)
Zhilong Hong created FLINK-21330:


 Summary: Optimize the initialization of 
PipelinedRegionSchedulingStrategy
 Key: FLINK-21330
 URL: https://issues.apache.org/jira/browse/FLINK-21330
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Coordination
Reporter: Zhilong Hong
 Fix For: 1.13.0


{{PipelinedRegionSchedulingStrategy}} is used for task scheduling. Its 
initialization is located at {{PipelinedRegionSchedulingStrategy#init}}. The 
initialization can be divided into two parts:
 # Calculating consumed result partitions of SchedulingPipelinedRegions
 # Calculating the consumer pipelined region of SchedulingResultPartition

Based on FLINK-21328, the {{consumedResults}} of 
{{DefaultSchedulingPipelinedRegion}} can be replaced with 
{{ConsumedPartitionGroup}}.

Then we can optimize the procedures we mentioned above. After the optimization, 
the time complexity decreases from O(N^2) to O(N).

The related usage of {{getConsumedResults}} should be replaced, too.
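
As a rough illustration of the second initialization step, the consumer regions can be derived per group rather than per partition. The sketch below is hypothetical: {{RegionInitSketch}} and {{consumerRegionsOfGroups}} are made-up names, and the group type {{G}} stands in for {{ConsumedPartitionGroup}}.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustrative sketch only; names are hypothetical, not Flink's API. */
final class RegionInitSketch {

    /** Maps each consumed group to the set of regions that contain one of its consumers. */
    static <G> Map<G, Set<String>> consumerRegionsOfGroups(
            Map<String, List<String>> verticesOfRegion,
            Map<String, List<G>> consumedGroupsOfVertex) {
        Map<G, Set<String>> consumerRegions = new HashMap<>();
        for (Map.Entry<String, List<String>> region : verticesOfRegion.entrySet()) {
            for (String vertex : region.getValue()) {
                List<G> groups =
                        consumedGroupsOfVertex.getOrDefault(vertex, Collections.emptyList());
                // One step per consumed group, not one step per consumed partition.
                for (G group : groups) {
                    consumerRegions
                            .computeIfAbsent(group, g -> new HashSet<>())
                            .add(region.getKey());
                }
            }
        }
        return consumerRegions;
    }
}
```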

The detailed design doc is located at: 
[https://docs.google.com/document/d/1OjGAyJ9Z6KsxcMtBHr6vbbrwP9xye7CdCtrLvf8dFYw/edit#heading=h.a1mz4yjpry6m]





[jira] [Created] (FLINK-21329) "Local recovery and sticky scheduling end-to-end test" does not finish within 600 seconds

2021-02-08 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-21329:
--

 Summary: "Local recovery and sticky scheduling end-to-end test" 
does not finish within 600 seconds
 Key: FLINK-21329
 URL: https://issues.apache.org/jira/browse/FLINK-21329
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Coordination
Affects Versions: 1.13.0
Reporter: Robert Metzger


https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=13118&view=logs&j=c88eea3b-64a0-564d-0031-9fdcd7b8abee&t=ff888d9b-cd34-53cc-d90f-3e446d355529

{code}
Feb 08 22:25:46 ==
Feb 08 22:25:46 Running 'Local recovery and sticky scheduling end-to-end test'
Feb 08 22:25:46 ==
Feb 08 22:25:46 TEST_DATA_DIR: /home/vsts/work/1/s/flink-end-to-end-tests/test-scripts/temp-test-directory-46881214821
Feb 08 22:25:47 Flink dist directory: /home/vsts/work/1/s/flink-dist/target/flink-1.13-SNAPSHOT-bin/flink-1.13-SNAPSHOT
Feb 08 22:25:47 Running local recovery test with configuration:
Feb 08 22:25:47 parallelism: 4
Feb 08 22:25:47 max attempts: 10
Feb 08 22:25:47 backend: rocks
Feb 08 22:25:47 incremental checkpoints: false
Feb 08 22:25:47 kill JVM: false
Feb 08 22:25:47 Starting zookeeper daemon on host fv-az127-394.
Feb 08 22:25:47 Starting HA cluster with 1 masters.
Feb 08 22:25:48 Starting standalonesession daemon on host fv-az127-394.
Feb 08 22:25:49 Starting taskexecutor daemon on host fv-az127-394.
Feb 08 22:25:49 Waiting for Dispatcher REST endpoint to come up...
Feb 08 22:25:50 Waiting for Dispatcher REST endpoint to come up...
Feb 08 22:25:51 Waiting for Dispatcher REST endpoint to come up...
Feb 08 22:25:53 Waiting for Dispatcher REST endpoint to come up...
Feb 08 22:25:54 Dispatcher REST endpoint is up.
Feb 08 22:25:54 Started TM watchdog with PID 28961.
Feb 08 22:25:58 Job has been submitted with JobID e790e85a39040539f9386c0df7ca4812
Feb 08 22:35:47 Test (pid: 27970) did not finish after 600 seconds.
Feb 08 22:35:47 Printing Flink logs and killing it:

{code}

and

{code}

    at org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalDriver.unhandledError(ZooKeeperLeaderRetrievalDriver.java:184)
    at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl$6.apply(CuratorFrameworkImpl.java:713)
    at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl$6.apply(CuratorFrameworkImpl.java:709)
    at org.apache.flink.shaded.curator4.org.apache.curator.framework.listen.ListenerContainer$1.run(ListenerContainer.java:100)
    at org.apache.flink.shaded.curator4.org.apache.curator.shaded.com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30)
    at org.apache.flink.shaded.curator4.org.apache.curator.framework.listen.ListenerContainer.forEach(ListenerContainer.java:92)
    at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl.logError(CuratorFrameworkImpl.java:708)
    at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl.checkBackgroundRetry(CuratorFrameworkImpl.java:874)
    at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:990)
    at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:943)
    at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl.access$300(CuratorFrameworkImpl.java:66)
    at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:346)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
    at org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
    at org.apache.flink.shaded.curator4.org.apache.curator.framework.imps.CuratorFrameworkImpl.checkBackgroundRetry(CuratorFrameworkImpl.java:862)
    ... 10 more

{code}




[jira] [Created] (FLINK-21328) Optimize the initialization of DefaultExecutionTopology

2021-02-08 Thread Zhilong Hong (Jira)
Zhilong Hong created FLINK-21328:


 Summary: Optimize the initialization of DefaultExecutionTopology
 Key: FLINK-21328
 URL: https://issues.apache.org/jira/browse/FLINK-21328
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Coordination
Reporter: Zhilong Hong
 Fix For: 1.13.0


Based on FLINK-21326, the {{consumedResults}} in {{DefaultExecutionVertex}} and 
{{consumers}} in {{DefaultResultPartition}} can be replaced with 
{{ConsumedPartitionGroup}} and {{ConsumerVertexGroup}} in {{EdgeManager}}.
 # The method {{DefaultExecutionTopology#connectVerticesToConsumedPartitions}} 
could be removed.
 # All the related usages should be fixed.





[jira] [Created] (FLINK-21327) Support window TVF in batch mode

2021-02-08 Thread Jark Wu (Jira)
Jark Wu created FLINK-21327:
---

 Summary: Support window TVF in batch mode
 Key: FLINK-21327
 URL: https://issues.apache.org/jira/browse/FLINK-21327
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Reporter: Jark Wu


As a unified batch and streaming engine, Flink should also support running 
window TVFs in batch mode. Users can then use one query in streaming mode to 
produce data in real time and the same query in batch mode to backfill data for 
a specific day.





Re: Activate bloom filter in RocksDB State Backend via Flink configuration

2021-02-08 Thread Yun Tang
Hi Jun,

Some predefined options also activate bloom filters, e.g. 
PredefinedOptions#SPINNING_DISK_OPTIMIZED_HIGH_MEM, but I think offering a 
configurable option is a good idea. +1 for this.

Regarding the bloom filter default value, I slightly prefer the full format 
[1] over the old block-based format. This is related to FLINK-20496 [2], 
which tries to add an option to enable partitioned index & filter.

[1] 
https://github.com/facebook/rocksdb/wiki/RocksDB-Bloom-Filter#full-filters-new-format
[2] https://issues.apache.org/jira/browse/FLINK-20496

Best
Yun Tang

From: Till Rohrmann 
Sent: Monday, February 8, 2021 17:06
To: dev 
Subject: Re: Activate bloom filter in RocksDB State Backend via Flink 
configuration

Hi Jun,

Making things easier to use and configure is a good idea. Hence, +1 for
this proposal. Maybe create a JIRA ticket for it.

For the concrete default values it would be nice to hear the opinion of a
RocksDB expert.

Cheers,
Till

On Sun, Feb 7, 2021 at 7:23 PM Jun Qin  wrote:

> Hi,
>
> Activating bloom filter in the RocksDB state backend improves read
> performance. Currently activating bloom filter can only be done by
> implementing a custom ConfigurableRocksDBOptionsFactory. I think we should
> provide an option to activate bloom filter via Flink configuration.  What
> do you think? If so, what about the following configuration?
>
> state.backend.rocksdb.bloom-filter.enabled: false (default)
> state.backend.rocksdb.bloom-filter.bits-per-key: 10 (default)
> state.backend.rocksdb.bloom-filter.block-based: true (default)
>
>
> Thanks
> Jun


[jira] [Created] (FLINK-21326) Optimize building topology when initializing ExecutionGraph

2021-02-08 Thread Zhilong Hong (Jira)
Zhilong Hong created FLINK-21326:


 Summary: Optimize building topology when initializing 
ExecutionGraph
 Key: FLINK-21326
 URL: https://issues.apache.org/jira/browse/FLINK-21326
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Coordination
Reporter: Zhilong Hong
 Fix For: 1.13.0


The main idea of optimizing the procedure of building the topology is to put all 
the vertices that consume the same result partitions into one group 
({{ConsumerVertexGroup}}), and to put all the result partitions that have the 
same consumer vertices into one group ({{ConsumedPartitionGroup}}). 
{{EdgeManager}} is used to store the relationship between the groups. The 
procedure of creating {{ExecutionEdge}} is replaced with building 
{{EdgeManager}}.

With these improvements, the complexity of building topology in ExecutionGraph 
decreases from O(N^2) to O(N). 
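
To make the complexity claim concrete, here is a small Java sketch of how an all-to-all connection could be stored with shared groups instead of one edge per pair. It rests on assumptions: {{EdgeManagerSketch}} and {{connectAllToAll}} are hypothetical names, and the groups are modeled as plain immutable lists of ids.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; names are hypothetical, not Flink's EdgeManager. */
final class EdgeManagerSketch {

    // Consumer vertex id -> the group of partitions it consumes.
    final Map<String, List<String>> consumedPartitionGroupOfVertex = new HashMap<>();
    // Partition id -> the group of vertices that consume it.
    final Map<String, List<String>> consumerVertexGroupOfPartition = new HashMap<>();

    /** Connects every partition to every consumer using two shared group objects. */
    void connectAllToAll(List<String> partitions, List<String> consumers) {
        List<String> partitionGroup = Collections.unmodifiableList(new ArrayList<>(partitions));
        List<String> vertexGroup = Collections.unmodifiableList(new ArrayList<>(consumers));
        // One map entry per consumer and per partition: N + M steps instead of N * M edges.
        for (String consumer : consumers) {
            consumedPartitionGroupOfVertex.put(consumer, partitionGroup);
        }
        for (String partition : partitions) {
            consumerVertexGroupOfPartition.put(partition, vertexGroup);
        }
    }
}
```

All consumers reference the same partition group instance and all partitions reference the same vertex group instance, which is what lets later traversals treat the whole all-to-all edge as a single unit.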

Furthermore, {{ExecutionEdge}} and all its related calls are replaced with 
{{ConsumedPartitionGroup}} and {{ConsumerVertexGroup}}.

The detailed design doc is located at: 
https://docs.google.com/document/d/1OjGAyJ9Z6KsxcMtBHr6vbbrwP9xye7CdCtrLvf8dFYw/edit?usp=sharing





[jira] [Created] (FLINK-21325) NoResourceAvailableException while cancel then resubmit jobs

2021-02-08 Thread hayden zhou (Jira)
hayden zhou created FLINK-21325:
---

 Summary: NoResourceAvailableException while cancel then resubmit  
jobs
 Key: FLINK-21325
 URL: https://issues.apache.org/jira/browse/FLINK-21325
 Project: Flink
  Issue Type: Bug
 Environment: FLINK 1.12  with 
[flink-kubernetes_2.11-1.12-SNAPSHOT.jar] in libs directory to fix FLINK 
restart problem on k8s HA session mode.
Reporter: hayden zhou


I have five streaming jobs and wanted to clear all of their state, so I 
canceled all of them and then resubmitted them one by one. As a result, two 
jobs are in running status, while three jobs are stuck in created status with 
the error "NoResourceAvailableException: Slot request bulk is not fulfillable! 
Could not allocate the required slot within slot request timeout".

Below are the error logs:

{code:java}
java.util.concurrent.CompletionException: org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: Slot request bulk is not fulfillable! Could not allocate the required slot within slot request timeout
    at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
    at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308)
    at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:607)
    at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:591)
    at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
    at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990)
    at org.apache.flink.runtime.scheduler.SharedSlot.cancelLogicalSlotRequest(SharedSlot.java:195)
    at org.apache.flink.runtime.scheduler.SlotSharingExecutionSlotAllocator.cancelLogicalSlotRequest(SlotSharingExecutionSlotAllocator.java:147)
    at org.apache.flink.runtime.scheduler.SharingPhysicalSlotRequestBulk.cancel(SharingPhysicalSlotRequestBulk.java:84)
    at org.apache.flink.runtime.jobmaster.slotpool.PhysicalSlotRequestBulkWithTimestamp.cancel(PhysicalSlotRequestBulkWithTimestamp.java:66)
    at org.apache.flink.runtime.jobmaster.slotpool.PhysicalSlotRequestBulkCheckerImpl.lambda$schedulePendingRequestBulkWithTimestampCheck$0(PhysicalSlotRequestBulkCheckerImpl.java:87)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:404)
    at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:197)
    at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:74)
    at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:154)
    at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
    at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
    at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
    at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
    at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170)
    at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
    at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
    at akka.actor.Actor$class.aroundReceive(Actor.scala:517)
    at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
    at akka.actor.ActorCell.invoke(ActorCell.scala:561)
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
    at akka.dispatch.Mailbox.run(Mailbox.scala:225)
    at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
    at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: Slot request bulk is not fulfillable! Could not allocate the required slot within slot request timeout
    at org.apache.flink.runtime.jobmaster.slotpool.PhysicalSlotRequestBulkCheckerImpl.lambda$schedulePendingRequestBulkWithTimestampCheck$0(PhysicalSlotRequestBulkCheckerImpl.java:84)
    ... 24 more
Caused by: java.util.concurrent.TimeoutException: Timeout has occurred: 30 ms
    ... 25 more
{code}






Re: [DISCUSS]FLIP-163: SQL Client Improvements

2021-02-08 Thread Rui Li
Hi Jark,

I agree it's more consistent if table API also respects this config. But on
the other hand, it might make the `executeSql` API a little trickier to
use, because now DDL, DQL and DML all behave differently from one another:

   - DDL: always sync
   - DQL: always async
   - DML: can be sync or async according to the config

So I slightly prefer to apply this config only to the SQL Client. API users
can always easily achieve sync or async behavior in their code. And the
config option is just meant to give SQL Client users a chance to do the
same thing. But let's hear more opinions from other folks.

On Tue, Feb 9, 2021 at 10:21 AM Jark Wu  wrote:

> Hi Rui,
>
> That's a good point. From the naming of the option, I prefer to get sync
> behavior.
> It would be very straightforward that it affects all the DMLs on SQL CLI
> and
> TableEnvironment (including `executeSql`, `StatementSet`,
> `Table#executeInsert`, etc.).
> This can also make SQL CLI easy to support this configuration by passing
> through to the TableEnv.
>
> Best,
> Jark
>
> On Tue, 9 Feb 2021 at 10:07, Rui Li  wrote:
>
>> Hi,
>>
>> Glad to see we have reached consensus on option #2. +1 to it.
>>
>> Regarding the name, I'm fine with `table.dml-async`. But I wonder whether
>> this config also applies to table API. E.g. if a user
>> sets table.dml-async=false and calls TableEnvironment::executeSql to run a
>> DML, will he get sync behavior?
>>
>> On Mon, Feb 8, 2021 at 11:28 PM Jark Wu  wrote:
>>
>>> Ah, I just forgot the option name.
>>>
>>> I'm also fine with `table.dml-async`.
>>>
>>> What do you think @Rui Li  @Shengkai Fang
>>>  ?
>>>
>>> Best,
>>> Jark
>>>
>>> On Mon, 8 Feb 2021 at 23:06, Timo Walther  wrote:
>>>
>>>> Great to hear that. Can someone update the FLIP a final time before we
>>>> start a vote?
>>>>
>>>> We should quickly discuss how we would like to name the config option
>>>> for the async/sync mode. I heard voices internally that are strongly
>>>> against calling it "detach" due to historical reasons with a Flink job
>>>> detach mode. How about `table.dml-async`?
>>>>
>>>> Thanks,
>>>> Timo
>>>>
>>>>
>>>> On 08.02.21 15:55, Jark Wu wrote:
>>>> > Thanks Timo,
>>>> >
>>>> > I'm +1 for option#2 too.
>>>> >
>>>> > I think we have addressed all the concerns and can start a vote.
>>>> >
>>>> > Best,
>>>> > Jark
>>>> >
>>>> > On Mon, 8 Feb 2021 at 22:19, Timo Walther  wrote:
>>>> >
>>>> >> Hi Jark,
>>>> >>
>>>> >> you are right. Nesting STATEMENT SET and ASYNC might be too verbose.
>>>> >>
>>>> >> So let's stick to the config option approach.
>>>> >>
>>>> >> However, I strongly believe that we should not use the batch/streaming
>>>> >> mode for deriving semantics. This discussion is similar to time function
>>>> >> discussion. We should not derive sync/async submission behavior from a
>>>> >> flag that should only influence runtime operators and the incremental
>>>> >> computation. Statements for bounded streams should have the same
>>>> >> semantics in batch mode.
>>>> >>
>>>> >> I think your proposed option 2) is a good tradeoff. For the following
>>>> >> reasons:
>>>> >>
>>>> >> pros:
>>>> >> - by default, batch and streaming behave exactly the same
>>>> >> - SQL Client CLI behavior does not change compared to 1.12 and remains
>>>> >> async for batch and streaming
>>>> >> - consistent with the async Table API behavior
>>>> >>
>>>> >> con:
>>>> >> - batch files are not 100% SQL compliant by default
>>>> >>
>>>> >> The last item might not be an issue since we can expect that users have
>>>> >> long-running jobs and prefer async execution in most cases.
>>>> >>
>>>> >> Regards,
>>>> >> Timo
>>>> >>
>>>> >>
>>>> >> On 08.02.21 14:15, Jark Wu wrote:
>>>> >>> Hi Timo,
>>>> >>>
>>>> >>> Actually, I'm not in favor of explicit syntax `BEGIN ASYNC;... END;`.
>>>> >>> Because it makes submitting streaming jobs very verbose, every INSERT INTO
>>>> >>> and STATEMENT SET must be wrapped in the ASYNC clause which is
>>>> >>> not user-friendly and not backward-compatible.
>>>> >>>
>>>> >>> I agree we will have unified behavior but this is at the cost of hurting
>>>> >>> our main users.
>>>> >>> I'm worried that end users can't understand the technical decision, and
>>>> >>> they would feel streaming is harder to use.
>>>> >>>
>>>> >>> If we want to have a unified behavior, and let users decide what's the
>>>> >>> desirable behavior, I prefer to have a config option. A Flink cluster can
>>>> >>> be set to async, then
>>>> >>> users don't need to wrap every DML in an ASYNC clause. This is the least
>>>> >>> intrusive way to the users.
>>>> >>>
>>>> >>>
>>>> >>> Personally, I'm fine with following options in priority:
>>>> >>>
>>>> >>> 1) sync for batch DML and async for streaming DML
>>>> >>> ==> only breaks batch behavior, but makes both happy
>>>> >>>
>>>> >>> 2) async for both batch and streaming DML, and can be set to sync via a

Re: [DISCUSS]FLIP-163: SQL Client Improvements

2021-02-08 Thread Jark Wu
Hi Rui,

That's a good point. From the naming of the option, I prefer to get sync
behavior.
It would be very straightforward that it affects all the DMLs on SQL CLI
and
TableEnvironment (including `executeSql`, `StatementSet`,
`Table#executeInsert`, etc.).
This can also make SQL CLI easy to support this configuration by passing
through to the TableEnv.

Best,
Jark

On Tue, 9 Feb 2021 at 10:07, Rui Li  wrote:

> Hi,
>
> Glad to see we have reached consensus on option #2. +1 to it.
>
> Regarding the name, I'm fine with `table.dml-async`. But I wonder whether
> this config also applies to table API. E.g. if a user
> sets table.dml-async=false and calls TableEnvironment::executeSql to run a
> DML, will he get sync behavior?
>
> On Mon, Feb 8, 2021 at 11:28 PM Jark Wu  wrote:
>
>> Ah, I just forgot the option name.
>>
>> I'm also fine with `table.dml-async`.
>>
>> What do you think @Rui Li  @Shengkai Fang
>>  ?
>>
>> Best,
>> Jark
>>
>> On Mon, 8 Feb 2021 at 23:06, Timo Walther  wrote:
>>
>>> Great to hear that. Can someone update the FLIP a final time before we
>>> start a vote?
>>>
>>> We should quickly discuss how we would like to name the config option
>>> for the async/sync mode. I heard voices internally that are strongly
>>> against calling it "detach" due to historical reasons with a Flink job
>>> detach mode. How about `table.dml-async`?
>>>
>>> Thanks,
>>> Timo
>>>
>>>
>>> On 08.02.21 15:55, Jark Wu wrote:
>>> > Thanks Timo,
>>> >
>>> > I'm +1 for option#2 too.
>>> >
>>> > I think we have addressed all the concerns and can start a vote.
>>> >
>>> > Best,
>>> > Jark
>>> >
>>> > On Mon, 8 Feb 2021 at 22:19, Timo Walther  wrote:
>>> >
>>> >> Hi Jark,
>>> >>
>>> >> you are right. Nesting STATEMENT SET and ASYNC might be too verbose.
>>> >>
>>> >> So let's stick to the config option approach.
>>> >>
>>> >> However, I strongly believe that we should not use the batch/streaming
>>> >> mode for deriving semantics. This discussion is similar to time
>>> function
>>> >> discussion. We should not derive sync/async submission behavior from a
>>> >> flag that should only influence runtime operators and the incremental
>>> >> computation. Statements for bounded streams should have the same
>>> >> semantics in batch mode.
>>> >>
>>> >> I think your proposed option 2) is a good tradeoff. For the following
>>> >> reasons:
>>> >>
>>> >> pros:
>>> >> - by default, batch and streaming behave exactly the same
>>> >> - SQL Client CLI behavior does not change compared to 1.12 and remains
>>> >> async for batch and streaming
>>> >> - consistent with the async Table API behavior
>>> >>
>>> >> con:
>>> >> - batch files are not 100% SQL compliant by default
>>> >>
>>> >> The last item might not be an issue since we can expect that users
>>> have
>>> >> long-running jobs and prefer async execution in most cases.
>>> >>
>>> >> Regards,
>>> >> Timo
>>> >>
>>> >>
>>> >> On 08.02.21 14:15, Jark Wu wrote:
>>> >>> Hi Timo,
>>> >>>
>>> >>> Actually, I'm not in favor of explicit syntax `BEGIN ASYNC;... END;`.
>>> >>> Because it makes submitting streaming jobs very verbose, every INSERT
>>> >> INTO
>>> >>> and STATEMENT SET must be wrapped in the ASYNC clause which is
>>> >>> not user-friendly and not backward-compatible.
>>> >>>
>>> >>> I agree we will have unified behavior but this is at the cost of
>>> hurting
>>> >>> our main users.
>>> >>> I'm worried that end users can't understand the technical decision,
>>> and
>>> >>> they would
>>> >>> feel streaming is harder to use.
>>> >>>
>>> >>> If we want to have a unified behavior, and let users decide what's
>>> the
>>> >>> desirable behavior, I prefer to have a config option. A Flink
>>> cluster can
>>> >>> be set to async, then
>>> >>> users don't need to wrap every DML in an ASYNC clause. This is the
>>> least
>>> >>> intrusive
>>> >>> way to the users.
>>> >>>
>>> >>>
>>> >>> Personally, I'm fine with following options in priority:
>>> >>>
>>> >>> 1) sync for batch DML and async for streaming DML
>>> >>> ==> only breaks batch behavior, but makes both happy
>>> >>>
>>> >>> 2) async for both batch and streaming DML, and can be set to sync
>>> via a
>>> >>> configuration.
>>> >>> ==> compatible, and provides flexible configurable behavior
>>> >>>
>>> >>> 3) sync for both batch and streaming DML, and can be
>>> >>>   set to async via a configuration.
>>> >>> ==> +0 for this, because it breaks all the compatibility, esp. our
>>> main
>>> >>> users.
>>> >>>
>>> >>> Best,
>>> >>> Jark
>>> >>>
>>> >>> On Mon, 8 Feb 2021 at 17:34, Timo Walther 
>>> wrote:
>>> >>>
>>> >>>> Hi Jark, Hi Rui,
>>> >>>>
>>> >>>> 1) How should we execute statements in CLI and in file? Should there be
>>> >>>> a difference?
>>> >>>> So it seems we have consensus here with unified behavior. Even though
>>> >>>> this means we are breaking existing batch INSERT INTOs that were
>>> >>>> asynchronous before.
>>> >>>>
>>> >>>> 2) Should we have different behavior for batch and streaming?
>>> >>>> I think

Re: [DISCUSS]FLIP-163: SQL Client Improvements

2021-02-08 Thread Rui Li
Hi,

Glad to see we have reached consensus on option #2. +1 to it.

Regarding the name, I'm fine with `table.dml-async`. But I wonder whether
this config also applies to table API. E.g. if a user
sets table.dml-async=false and calls TableEnvironment::executeSql to run a
DML, will he get sync behavior?

On Mon, Feb 8, 2021 at 11:28 PM Jark Wu  wrote:

> Ah, I just forgot the option name.
>
> I'm also fine with `table.dml-async`.
>
> What do you think @Rui Li  @Shengkai Fang
>  ?
>
> Best,
> Jark
>
> On Mon, 8 Feb 2021 at 23:06, Timo Walther  wrote:
>
>> Great to hear that. Can someone update the FLIP a final time before we
>> start a vote?
>>
>> We should quickly discuss how we would like to name the config option
>> for the async/sync mode. I heard voices internally that are strongly
>> against calling it "detach" due to historical reasons with a Flink job
>> detach mode. How about `table.dml-async`?
>>
>> Thanks,
>> Timo
>>
>>
>> On 08.02.21 15:55, Jark Wu wrote:
>> > Thanks Timo,
>> >
>> > I'm +1 for option#2 too.
>> >
>> > I think we have addressed all the concerns and can start a vote.
>> >
>> > Best,
>> > Jark
>> >
>> > On Mon, 8 Feb 2021 at 22:19, Timo Walther  wrote:
>> >
>> >> Hi Jark,
>> >>
>> >> you are right. Nesting STATEMENT SET and ASYNC might be too verbose.
>> >>
>> >> So let's stick to the config option approach.
>> >>
>> >> However, I strongly believe that we should not use the batch/streaming
>> >> mode for deriving semantics. This discussion is similar to time
>> function
>> >> discussion. We should not derive sync/async submission behavior from a
>> >> flag that should only influence runtime operators and the incremental
>> >> computation. Statements for bounded streams should have the same
>> >> semantics in batch mode.
>> >>
>> >> I think your proposed option 2) is a good tradeoff. For the following
>> >> reasons:
>> >>
>> >> pros:
>> >> - by default, batch and streaming behave exactly the same
>> >> - SQL Client CLI behavior does not change compared to 1.12 and remains
>> >> async for batch and streaming
>> >> - consistent with the async Table API behavior
>> >>
>> >> con:
>> >> - batch files are not 100% SQL compliant by default
>> >>
>> >> The last item might not be an issue since we can expect that users have
>> >> long-running jobs and prefer async execution in most cases.
>> >>
>> >> Regards,
>> >> Timo
>> >>
>> >>
>> >> On 08.02.21 14:15, Jark Wu wrote:
>> >>> Hi Timo,
>> >>>
>> >>> Actually, I'm not in favor of explicit syntax `BEGIN ASYNC;... END;`.
>> >>> Because it makes submitting streaming jobs very verbose, every INSERT
>> >> INTO
>> >>> and STATEMENT SET must be wrapped in the ASYNC clause which is
>> >>> not user-friendly and not backward-compatible.
>> >>>
>> >>> I agree we will have unified behavior but this is at the cost of
>> hurting
>> >>> our main users.
>> >>> I'm worried that end users can't understand the technical decision,
>> and
>> >>> they would
>> >>> feel streaming is harder to use.
>> >>>
>> >>> If we want to have a unified behavior, and let users decide what's
>> the
>> >>> desirable behavior, I prefer to have a config option. A Flink cluster
>> can
>> >>> be set to async, then
>> >>> users don't need to wrap every DML in an ASYNC clause. This is the
>> least
>> >>> intrusive
>> >>> way to the users.
>> >>>
>> >>>
>> >>> Personally, I'm fine with following options in priority:
>> >>>
>> >>> 1) sync for batch DML and async for streaming DML
>> >>> ==> only breaks batch behavior, but makes both happy
>> >>>
>> >>> 2) async for both batch and streaming DML, and can be set to sync via
>> a
>> >>> configuration.
>> >>> ==> compatible, and provides flexible configurable behavior
>> >>>
>> >>> 3) sync for both batch and streaming DML, and can be
>> >>>   set to async via a configuration.
>> >>> ==> +0 for this, because it breaks all the compatibility, esp. our
>> main
>> >>> users.
>> >>>
>> >>> Best,
>> >>> Jark
>> >>>
>> >>> On Mon, 8 Feb 2021 at 17:34, Timo Walther  wrote:
>> >>>
>>  Hi Jark, Hi Rui,
>> 
>>  1) How should we execute statements in CLI and in file? Should there
>> be
>>  a difference?
>>  So it seems we have consensus here with unified behavior. Even though
>>  this means we are breaking existing batch INSERT INTOs that were
>>  asynchronous before.
>> 
>>  2) Should we have different behavior for batch and streaming?
>>  I think also batch users prefer async behavior because usually even
>>  those pipelines take some time to execute. But we should stick
>> to
>>  standard SQL blocking semantics.
>> 
>>  What are your opinions on making async explicit in SQL via `BEGIN
>> ASYNC;
>>  ... END;`? This would allow us to really have unified semantics
>> because
>>  batch and streaming would behave the same?
>> 
>>  Regards,
>>  Timo
>> 
>> 
>>  On 07.02.21 04:46, Rui Li wrote:
>> > Hi Timo,
>> >
>> > I agree with Jark that we should 

[jira] [Created] (FLINK-21324) statefun-testutil can't assert messages function sends to itself

2021-02-08 Thread Timur (Jira)
Timur created FLINK-21324:
-

 Summary: statefun-testutil can't assert messages function sends to 
itself
 Key: FLINK-21324
 URL: https://issues.apache.org/jira/browse/FLINK-21324
 Project: Flink
  Issue Type: Bug
  Components: Stateful Functions
Affects Versions: statefun-2.2.2
Reporter: Timur


Assertions don't work for messages sent by functions to themselves. The reason 
is that TestContext doesn't add a message to responses:

 
{code:java}
@Override
public void send(Address to, Object message) {
  if (to.equals(selfAddress)) {
    messages.add(new Envelope(self(), to, message));
    return;
  }
  responses.computeIfAbsent(to, ignore -> new ArrayList<>()).add(message);
}
{code}
Instead of adding the message to responses, the method returns right after 
the message is added to messages.

 

Here is an example of an assertion that doesn't work when a function sends 
a message to itself:
{code:java}
assertThat(harness.invoke(aliceDiseaseDiagnosedEvent()), sentNothing());
{code}
The test won't fail even though the message was really sent.
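
One possible direction for a fix, as a minimal sketch under assumptions: the self-addressed message is still queued for delivery, but it is also recorded in the responses so that matchers can see it. The class `SelfSendSketch`, its string addresses, and the helper methods `queuedSelfMessages` and `sentTo` are hypothetical stand-ins and are not part of statefun-testutil.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

/** Hypothetical stand-in for the test context; not the statefun-testutil class. */
final class SelfSendSketch {
  private final String selfAddress;
  // Self-addressed messages waiting to be delivered back to the function.
  private final Queue<Object> messages = new ArrayDeque<>();
  // Everything the function sent, keyed by target address.
  private final Map<String, List<Object>> responses = new HashMap<>();

  SelfSendSketch(String selfAddress) {
    this.selfAddress = selfAddress;
  }

  void send(String to, Object message) {
    if (to.equals(selfAddress)) {
      // Still queue the message for delivery, but do not return early.
      messages.add(message);
    }
    // Record every sent message, including self-sends, so assertions can see it.
    responses.computeIfAbsent(to, ignore -> new ArrayList<>()).add(message);
  }

  int queuedSelfMessages() {
    return messages.size();
  }

  List<Object> sentTo(String to) {
    return responses.getOrDefault(to, new ArrayList<>());
  }
}
```

With this shape, a "sent nothing" style check over the recorded responses would fail once the function has sent a message to itself, which is the behavior the report asks for.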



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Releasing Apache Flink 1.12.2

2021-02-08 Thread Matthias Pohl
Fabian and I were investigating strange behavior with stop-with-savepoint
not terminating when using the new fromSource to add a source to a job
definition. I created FLINK-21323 [1] to cover the issue. This might not be
a blocker for 1.12.2 since this bug would have been already around since
1.11, if I'm not mistaken? I wanted to bring this to your attention,
anyway. It would be good if someone more familiar with this part of the
source code could verify our findings and confirm the severity of the issue.

Best,
Matthias

[1] https://issues.apache.org/jira/browse/FLINK-21323

On Mon, Feb 8, 2021 at 7:25 PM Thomas Weise  wrote:

> +1 for the 1.12.2 release
>
> On Mon, Feb 8, 2021 at 3:20 AM Matthias Pohl 
> wrote:
>
> > Thanks Yuan for driving 1.12.2.
> > +1 for releasing 1.12.2
> >
> > One comment about FLINK-21030 [1]: I hope to fix it this week. But there
> > are still some uncertainties. The underlying problem is older than 1.12.
> > Hence, the suggestion is to not block the 1.12.2 release because of
> > FLINK-21030 [1]. I will leave it as a blocker issue, though, to underline
> > that it should be fixed for 1.13.
> >
> > Best,
> > Matthias
> >
> > [1] https://issues.apache.org/jira/browse/FLINK-21030
> >
> > On Sun, Feb 7, 2021 at 4:10 AM Xintong Song 
> wrote:
> >
> > > Thanks Yuan,
> > >
> > > +1 for releasing 1.12.2 and Yuan as the release manager.
> > >
> > > Thank you~
> > >
> > > Xintong Song
> > >
> > >
> > >
> > > On Sat, Feb 6, 2021 at 3:41 PM Yu Li  wrote:
> > >
> > > > +1 for releasing 1.12.2, and thanks for volunteering to be our
> release
> > > > manager Yuan.
> > > >
> > > > Besides the mentioned issues, I could see two more blockers with
> 1.12.2
> > > as
> > > > fix version [1] and need some tracking:
> > > > * FLINK-21013 Blink planner does not ingest timestamp into StreamRecord
> > > > * FLINK-21030 Broken job restart for job with disjoint graph
> > > >
> > > > Best Regards,
> > > > Yu
> > > >
> > > > [1] https://s.apache.org/flink-1.12.2-blockers
> > > >
> > > >
> > > > On Sat, 6 Feb 2021 at 08:57, Kurt Young  wrote:
> > > >
> > > > > Thanks for being our release manager Yuan.
> > > > >
> > > > > We found an out-of-memory issue [1] which will affect most batch
> jobs
> > > > thus I
> > > > > think
> > > > > it would be great if we can include this fix in 1.12.2.
> > > > >
> > > > > [1] https://issues.apache.org/jira/browse/FLINK-20663
> > > > >
> > > > > Best,
> > > > > Kurt
> > > > >
> > > > >
> > > > > On Sat, Feb 6, 2021 at 12:36 AM Till Rohrmann <
> trohrm...@apache.org>
> > > > > wrote:
> > > > >
> > > > > > Thanks for volunteering for being our release manager Yuan :-)
> > > > > >
> > > > > > +1 for a timely bug fix release.
> > > > > >
> > > > > > I will try to review the PR for FLINK-20417 [1] which is a good
> > fix
> > > to
> > > > > > include in the next bug fix release. We don't have to block the
> > > release
> > > > > on
> > > > > > this fix though.
> > > > > >
> > > > > > [1] https://issues.apache.org/jira/browse/FLINK-20417
> > > > > >
> > > > > > Cheers,
> > > > > > Till
> > > > > >
> > > > > > On Fri, Feb 5, 2021 at 5:12 PM Piotr Nowojski <
> > pnowoj...@apache.org>
> > > > > > wrote:
> > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > Thanks Yuan for bringing up this topic.
> > > > > > >
> > > > > > > +1 for the quick 1.12.2 release.
> > > > > > >
> > > > > > > As Yuan mentioned, me and Roman can help whenever committer
> > rights
> > > > will
> > > > > > be
> > > > > > > required.
> > > > > > >
> > > > > > > > I am a Firefox user and I just fixed a long lasting bug
> > > > > > > https://github.com/apache/flink/pull/14848 , wish this would
> be
> > > > merged
> > > > > > in
> > > > > > > this release.
> > > > > > >
> > > > > > > I will push to speed up review of your PR. Let's try to merge
> it
> > > > before
> > > > > > > 1.12.2, but at the same time I wouldn't block the release on
> this
> > > > bug.
> > > > > > >
> > > > > > > Best,
> > > > > > > Piotrek
> > > > > > >
> > > > > > > pt., 5 lut 2021 o 12:12 郁蓝 
> > > napisał(a):
> > > > > > >
> > > > > > > > Hi Yuan,
> > > > > > > >
> > > > > > > >
> > > > > > > > I am a Firefox user and I just fixed a long lasting bug
> > > > > > > > https://github.com/apache/flink/pull/14848 , wish this would
> > be
> > > > > merged
> > > > > > > in
> > > > > > > > this release.
> > > > > > > >
> > > > > > > >
> > > > > > > > Best wishes
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > -- Original Message --
> > > > > > > > From: "dev" <yuanmei.w...@gmail.com>;
> > > > > > > > Sent: Friday, February 5, 2021, 6:36 PM
> > > > > > > > To: "dev"
> > > > > > > > Subject: [DISCUSS] Releasing Apache Flink 1.12.2
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > 

[jira] [Created] (FLINK-21323) Stop-with-savepoint is not supported by SourceOperatorStreamTask

2021-02-08 Thread Matthias (Jira)
Matthias created FLINK-21323:


 Summary: Stop-with-savepoint is not supported by 
SourceOperatorStreamTask
 Key: FLINK-21323
 URL: https://issues.apache.org/jira/browse/FLINK-21323
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Common
Affects Versions: 1.12.1, 1.11.3
Reporter: Matthias
 Fix For: 1.11.4, 1.12.2, 1.13.0


While looking into FLINK-21030 and analyzing the stop-with-savepoint behavior,
we implemented different test jobs covering the old {{addSource}} and new
{{fromSource}} methods for adding sources. Stop-with-savepoint consists of
two phases:
 # Create the savepoint
 # Stop the source to trigger finalizing the job

The test using {{fromSource}} fails in the second phase. The reason for this
might be that {{finishTask}} is not implemented by
{{SourceOperatorStreamTask}}, in contrast to {{SourceStreamTask}}, which is
used when calling {{addSource}} in the job definition. Hence, the job
termination is never triggered.

We might have missed this because the {{JobMasterStopWithSavepointIT}} test
is not picked up by Maven due to its wrong suffix. The IT is failing right
now. FLINK-21031 already covers the fix of {{JobMasterStopWithSavepointIT}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Releasing Apache Flink 1.12.2

2021-02-08 Thread Thomas Weise
+1 for the 1.12.2 release

On Mon, Feb 8, 2021 at 3:20 AM Matthias Pohl  wrote:

> Thanks Yuan for driving 1.12.2.
> +1 for releasing 1.12.2
>
> One comment about FLINK-21030 [1]: I hope to fix it this week. But there
> are still some uncertainties. The underlying problem is older than 1.12.
> Hence, the suggestion is to not block the 1.12.2 release because of
> FLINK-21030 [1]. I will leave it as a blocker issue, though, to underline
> that it should be fixed for 1.13.
>
> Best,
> Matthias
>
> [1] https://issues.apache.org/jira/browse/FLINK-21030
>
> On Sun, Feb 7, 2021 at 4:10 AM Xintong Song  wrote:
>
> > Thanks Yuan,
> >
> > +1 for releasing 1.12.2 and Yuan as the release manager.
> >
> > Thank you~
> >
> > Xintong Song
> >
> >
> >
> > On Sat, Feb 6, 2021 at 3:41 PM Yu Li  wrote:
> >
> > > +1 for releasing 1.12.2, and thanks for volunteering to be our release
> > > manager Yuan.
> > >
> > > Besides the mentioned issues, I could see two more blockers with 1.12.2
> > as
> > > fix version [1] and need some tracking:
> > > * FLINK-21013 Blink planner does not ingest timestamp into StreamRecord
> > > * FLINK-21030 Broken job restart for job with disjoint graph
> > >
> > > Best Regards,
> > > Yu
> > >
> > > [1] https://s.apache.org/flink-1.12.2-blockers
> > >
> > >
> > > On Sat, 6 Feb 2021 at 08:57, Kurt Young  wrote:
> > >
> > > > Thanks for being our release manager Yuan.
> > > >
> > > > We found an out-of-memory issue [1] which will affect most batch jobs
> > > thus I
> > > > think
> > > > it would be great if we can include this fix in 1.12.2.
> > > >
> > > > [1] https://issues.apache.org/jira/browse/FLINK-20663
> > > >
> > > > Best,
> > > > Kurt
> > > >
> > > >
> > > > On Sat, Feb 6, 2021 at 12:36 AM Till Rohrmann 
> > > > wrote:
> > > >
> > > > > Thanks for volunteering for being our release manager Yuan :-)
> > > > >
> > > > > +1 for a timely bug fix release.
> > > > >
> > > > > I will try to review the PR for FLINK-20417 [1] which is a good
> fix
> > to
> > > > > include in the next bug fix release. We don't have to block the
> > release
> > > > on
> > > > > this fix though.
> > > > >
> > > > > [1] https://issues.apache.org/jira/browse/FLINK-20417
> > > > >
> > > > > Cheers,
> > > > > Till
> > > > >
> > > > > On Fri, Feb 5, 2021 at 5:12 PM Piotr Nowojski <
> pnowoj...@apache.org>
> > > > > wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > Thanks Yuan for bringing up this topic.
> > > > > >
> > > > > > +1 for the quick 1.12.2 release.
> > > > > >
> > > > > > As Yuan mentioned, me and Roman can help whenever committer
> rights
> > > will
> > > > > be
> > > > > > required.
> > > > > >
> > > > > > > I am a Firefox user and I just fixed a long lasting bug
> > > > > > https://github.com/apache/flink/pull/14848 , wish this would be
> > > merged
> > > > > in
> > > > > > this release.
> > > > > >
> > > > > > I will push to speed up review of your PR. Let's try to merge it
> > > before
> > > > > > 1.12.2, but at the same time I wouldn't block the release on this
> > > bug.
> > > > > >
> > > > > > Best,
> > > > > > Piotrek
> > > > > >
> > > > > > pt., 5 lut 2021 o 12:12 郁蓝 
> > napisał(a):
> > > > > >
> > > > > > > Hi Yuan,
> > > > > > >
> > > > > > >
> > > > > > > I am a Firefox user and I just fixed a long lasting bug
> > > > > > > https://github.com/apache/flink/pull/14848 , wish this would
> be
> > > > merged
> > > > > > in
> > > > > > > this release.
> > > > > > >
> > > > > > >
> > > > > > > Best wishes
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > -- Original Message --
> > > > > > > From: "dev" <yuanmei.w...@gmail.com>;
> > > > > > > Sent: Friday, February 5, 2021, 6:36 PM
> > > > > > > To: "dev"
> > > > > > > Subject: [DISCUSS] Releasing Apache Flink 1.12.2
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Hey devs,
> > > > > > >
> > > > > > > One of the major issues that have not been resolved in Apache
> > Flink
> > > > > > 1.12.1
> > > > > > > is "unaligned
> > > > > > > checkpoint recovery may lead to corrupted data stream"[1].
> Since
> > > the
> > > > > > > problem is now fixed and
> > > > > > > it is critical to the users, I would like to kick off a
> > discussion
> > > on
> > > > > > > releasing Flink 1.12.2 that
> > > > > > > includes unaligned checkpoint fixes.
> > > > > > >
> > > > > > > I would like to volunteer myself for managing this release.
> But I
> > > > > noticed
> > > > > > > that some of the release
> > > > > > > steps may require committer authorities. Luckily, Piotr and
> Roman
> > > are
> > > > > > very
> > > > > > > kind to provide help
> > > > > > > on such steps.
> > > > > > >
> > > > > > > Apart from the unaligned checkpoint issue, please let us know
> in
> > > this
> > > > > > > thread if there are any
> > > > > > > 

Re: [DISCUSS]FLIP-163: SQL Client Improvements

2021-02-08 Thread Jark Wu
Ah, I just forgot the option name.

I'm also fine with `table.dml-async`.

What do you think @Rui Li  @Shengkai Fang
 ?

Best,
Jark

On Mon, 8 Feb 2021 at 23:06, Timo Walther  wrote:

> Great to hear that. Can someone update the FLIP a final time before we
> start a vote?
>
> We should quickly discuss how we would like to name the config option
> for the async/sync mode. I heard voices internally that are strongly
> against calling it "detach" due to historical reasons with a Flink job
> detach mode. How about `table.dml-async`?
>
> Thanks,
> Timo
>
>
> On 08.02.21 15:55, Jark Wu wrote:
> > Thanks Timo,
> >
> > I'm +1 for option#2 too.
> >
> > I think we have addressed all the concerns and can start a vote.
> >
> > Best,
> > Jark
> >
> > On Mon, 8 Feb 2021 at 22:19, Timo Walther  wrote:
> >
> >> Hi Jark,
> >>
> >> you are right. Nesting STATEMENT SET and ASYNC might be too verbose.
> >>
> >> So let's stick to the config option approach.
> >>
> >> However, I strongly believe that we should not use the batch/streaming
> >> mode for deriving semantics. This discussion is similar to time function
> >> discussion. We should not derive sync/async submission behavior from a
> >> flag that should only influence runtime operators and the incremental
> >> computation. Statements for bounded streams should have the same
> >> semantics in batch mode.
> >>
> >> I think your proposed option 2) is a good tradeoff. For the following
> >> reasons:
> >>
> >> pros:
> >> - by default, batch and streaming behave exactly the same
> >> - SQL Client CLI behavior does not change compared to 1.12 and remains
> >> async for batch and streaming
> >> - consistent with the async Table API behavior
> >>
> >> con:
> >> - batch files are not 100% SQL compliant by default
> >>
> >> The last item might not be an issue since we can expect that users have
> >> long-running jobs and prefer async execution in most cases.
> >>
> >> Regards,
> >> Timo
> >>
> >>
> >> On 08.02.21 14:15, Jark Wu wrote:
> >>> Hi Timo,
> >>>
> >>> Actually, I'm not in favor of explicit syntax `BEGIN ASYNC;... END;`.
> >>> Because it makes submitting streaming jobs very verbose, every INSERT
> >> INTO
> >>> and STATEMENT SET must be wrapped in the ASYNC clause which is
> >>> not user-friendly and not backward-compatible.
> >>>
> >>> I agree we will have unified behavior but this is at the cost of
> hurting
> >>> our main users.
> >>> I'm worried that end users can't understand the technical decision, and
> >>> they would
> >>> feel streaming is harder to use.
> >>>
> >>> If we want to have a unified behavior, and let users decide what's the
> >>> desirable behavior, I prefer to have a config option. A Flink cluster
> can
> >>> be set to async, then
> >>> users don't need to wrap every DML in an ASYNC clause. This is the
> least
> >>> intrusive
> >>> way to the users.
> >>>
> >>>
> >>> Personally, I'm fine with following options in priority:
> >>>
> >>> 1) sync for batch DML and async for streaming DML
> >>> ==> only breaks batch behavior, but makes both happy
> >>>
> >>> 2) async for both batch and streaming DML, and can be set to sync via a
> >>> configuration.
> >>> ==> compatible, and provides flexible configurable behavior
> >>>
> >>> 3) sync for both batch and streaming DML, and can be
> >>>   set to async via a configuration.
> >>> ==> +0 for this, because it breaks all the compatibility, esp. our main
> >>> users.
> >>>
> >>> Best,
> >>> Jark
> >>>
> >>> On Mon, 8 Feb 2021 at 17:34, Timo Walther  wrote:
> >>>
>  Hi Jark, Hi Rui,
> 
>  1) How should we execute statements in CLI and in file? Should there
> be
>  a difference?
>  So it seems we have consensus here with unified behavior. Even though
>  this means we are breaking existing batch INSERT INTOs that were
>  asynchronous before.
> 
>  2) Should we have different behavior for batch and streaming?
>  I think also batch users prefer async behavior because usually even
>  those pipelines take some time to execute. But we should stick to
>  standard SQL blocking semantics.
> 
>  What are your opinions on making async explicit in SQL via `BEGIN
> ASYNC;
>  ... END;`? This would allow us to really have unified semantics
> because
>  batch and streaming would behave the same?
> 
>  Regards,
>  Timo
> 
> 
>  On 07.02.21 04:46, Rui Li wrote:
> > Hi Timo,
> >
> > I agree with Jark that we should provide consistent experience
> >> regarding
> > SQL CLI and files. Some systems even allow users to execute SQL files
> >> in
> > the CLI, e.g. the "SOURCE" command in MySQL. If we want to support
> that
>  in
> > the future, it's a little tricky to decide whether that should be
> >> treated
> > as CLI or file.
> >
> > I actually prefer a config option and let users decide what's the
> > desirable behavior. But if we have agreed not to use options, I'm
> also
>  fine
> > 

[jira] [Created] (FLINK-21322) Add ExecutionGraphHandler

2021-02-08 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-21322:


 Summary: Add ExecutionGraphHandler
 Key: FLINK-21322
 URL: https://issues.apache.org/jira/browse/FLINK-21322
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Coordination
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.13.0


Add a re-usable scheduler component containing common execution graph logic.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS]FLIP-163: SQL Client Improvements

2021-02-08 Thread Timo Walther
Great to hear that. Can someone update the FLIP a final time before we 
start a vote?


We should quickly discuss how we would like to name the config option 
for the async/sync mode. I heard voices internally that are strongly
against calling it "detach" due to historical reasons with a Flink job 
detach mode. How about `table.dml-async`?


Thanks,
Timo


On 08.02.21 15:55, Jark Wu wrote:

Thanks Timo,

I'm +1 for option#2 too.

I think we have addressed all the concerns and can start a vote.

Best,
Jark

On Mon, 8 Feb 2021 at 22:19, Timo Walther  wrote:


Hi Jark,

you are right. Nesting STATEMENT SET and ASYNC might be too verbose.

So let's stick to the config option approach.

However, I strongly believe that we should not use the batch/streaming
mode for deriving semantics. This discussion is similar to time function
discussion. We should not derive sync/async submission behavior from a
flag that should only influence runtime operators and the incremental
computation. Statements for bounded streams should have the same
semantics in batch mode.

I think your proposed option 2) is a good tradeoff. For the following
reasons:

pros:
- by default, batch and streaming behave exactly the same
- SQL Client CLI behavior does not change compared to 1.12 and remains
async for batch and streaming
- consistent with the async Table API behavior

con:
- batch files are not 100% SQL compliant by default

The last item might not be an issue since we can expect that users have
long-running jobs and prefer async execution in most cases.

Regards,
Timo


On 08.02.21 14:15, Jark Wu wrote:

Hi Timo,

Actually, I'm not in favor of explicit syntax `BEGIN ASYNC;... END;`.
Because it makes submitting streaming jobs very verbose, every INSERT INTO
and STATEMENT SET must be wrapped in the ASYNC clause which is
not user-friendly and not backward-compatible.

I agree we will have unified behavior but this is at the cost of hurting
our main users.
I'm worried that end users can't understand the technical decision, and
they would
feel streaming is harder to use.

If we want to have a unified behavior, and let users decide what's the
desirable behavior, I prefer to have a config option. A Flink cluster can
be set to async, then
users don't need to wrap every DML in an ASYNC clause. This is the least
intrusive
way to the users.


Personally, I'm fine with following options in priority:

1) sync for batch DML and async for streaming DML
==> only breaks batch behavior, but makes both happy

2) async for both batch and streaming DML, and can be set to sync via a
configuration.
==> compatible, and provides flexible configurable behavior

3) sync for both batch and streaming DML, and can be
  set to async via a configuration.
==> +0 for this, because it breaks all the compatibility, esp. our main
users.

Best,
Jark

On Mon, 8 Feb 2021 at 17:34, Timo Walther  wrote:


Hi Jark, Hi Rui,

1) How should we execute statements in CLI and in file? Should there be
a difference?
So it seems we have consensus here with unified behavior. Even though
this means we are breaking existing batch INSERT INTOs that were
asynchronous before.

2) Should we have different behavior for batch and streaming?
I think also batch users prefer async behavior because usually even
those pipelines take some time to execute. But we should stick to
standard SQL blocking semantics.

What are your opinions on making async explicit in SQL via `BEGIN ASYNC;
... END;`? This would allow us to really have unified semantics because
batch and streaming would behave the same?

Regards,
Timo


On 07.02.21 04:46, Rui Li wrote:

Hi Timo,

I agree with Jark that we should provide consistent experience regarding
SQL CLI and files. Some systems even allow users to execute SQL files in
the CLI, e.g. the "SOURCE" command in MySQL. If we want to support that in
the future, it's a little tricky to decide whether that should be treated
as CLI or file.

I actually prefer a config option and let users decide what's the
desirable behavior. But if we have agreed not to use options, I'm also fine
with Alternative #1.

On Sun, Feb 7, 2021 at 11:01 AM Jark Wu  wrote:


Hi Timo,

1) How should we execute statements in CLI and in file? Should there be a
difference?
I do think we should unify the behavior of CLI and SQL files. SQL files can
be thought of as a shortcut of
"start CLI" => "copy content of SQL files" => "paste content in CLI".
Actually, we already did this in kafka_e2e.sql [1].
I think it's hard for users to understand why SQL files behave differently
from CLI, all the other systems don't have such a difference.

If we distinguish SQL files and CLI, should there be a difference in JDBC
driver and UI platform?
Personally, they all should have consistent behavior.

2) Should we have different behavior for batch and streaming?
I think we all agree streaming users prefer async execution, otherwise it's
weird and difficult to use if the
submit script or CLI 

Re: [DISCUSS]FLIP-163: SQL Client Improvements

2021-02-08 Thread Jark Wu
Thanks Timo,

I'm +1 for option#2 too.

I think we have addressed all the concerns and can start a vote.

Best,
Jark

On Mon, 8 Feb 2021 at 22:19, Timo Walther  wrote:

> Hi Jark,
>
> you are right. Nesting STATEMENT SET and ASYNC might be too verbose.
>
> So let's stick to the config option approach.
>
> However, I strongly believe that we should not use the batch/streaming
> mode for deriving semantics. This discussion is similar to time function
> discussion. We should not derive sync/async submission behavior from a
> flag that should only influence runtime operators and the incremental
> computation. Statements for bounded streams should have the same
> semantics in batch mode.
>
> I think your proposed option 2) is a good tradeoff. For the following
> reasons:
>
> pros:
> - by default, batch and streaming behave exactly the same
> - SQL Client CLI behavior does not change compared to 1.12 and remains
> async for batch and streaming
> - consistent with the async Table API behavior
>
> con:
> - batch files are not 100% SQL compliant by default
>
> The last item might not be an issue since we can expect that users have
> long-running jobs and prefer async execution in most cases.
>
> Regards,
> Timo
>
>
> On 08.02.21 14:15, Jark Wu wrote:
> > Hi Timo,
> >
> > Actually, I'm not in favor of explicit syntax `BEGIN ASYNC;... END;`.
> > Because it makes submitting streaming jobs very verbose, every INSERT
> INTO
> > and STATEMENT SET must be wrapped in the ASYNC clause which is
> > not user-friendly and not backward-compatible.
> >
> > I agree we will have unified behavior but this is at the cost of hurting
> > our main users.
> > I'm worried that end users can't understand the technical decision, and
> > they would
> > feel streaming is harder to use.
> >
> > If we want to have a unified behavior, and let users decide what's the
> > desirable behavior, I prefer to have a config option. A Flink cluster can
> > be set to async, then
> > users don't need to wrap every DML in an ASYNC clause. This is the least
> > intrusive
> > way to the users.
> >
> >
> > Personally, I'm fine with following options in priority:
> >
> > 1) sync for batch DML and async for streaming DML
> > ==> only breaks batch behavior, but makes both happy
> >
> > 2) async for both batch and streaming DML, and can be set to sync via a
> > configuration.
> > ==> compatible, and provides flexible configurable behavior
> >
> > 3) sync for both batch and streaming DML, and can be
> >  set to async via a configuration.
> > ==> +0 for this, because it breaks all the compatibility, esp. our main
> > users.
> >
> > Best,
> > Jark
> >
> > On Mon, 8 Feb 2021 at 17:34, Timo Walther  wrote:
> >
> >> Hi Jark, Hi Rui,
> >>
> >> 1) How should we execute statements in CLI and in file? Should there be
> >> a difference?
> >> So it seems we have consensus here with unified behavior. Even though
> >> this means we are breaking existing batch INSERT INTOs that were
> >> asynchronous before.
> >>
> >> 2) Should we have different behavior for batch and streaming?
> >> I think also batch users prefer async behavior because usually even
> >> those pipelines take some time to execute. But we should stick to
> >> standard SQL blocking semantics.
> >>
> >> What are your opinions on making async explicit in SQL via `BEGIN ASYNC;
> >> ... END;`? This would allow us to really have unified semantics because
> >> batch and streaming would behave the same?
> >>
> >> Regards,
> >> Timo
> >>
> >>
> >> On 07.02.21 04:46, Rui Li wrote:
> >>> Hi Timo,
> >>>
> >>> I agree with Jark that we should provide consistent experience
> regarding
> >>> SQL CLI and files. Some systems even allow users to execute SQL files
> in
> >>> the CLI, e.g. the "SOURCE" command in MySQL. If we want to support that
> >> in
> >>> the future, it's a little tricky to decide whether that should be
> treated
> >>> as CLI or file.
> >>>
> >>> I actually prefer a config option and let users decide what's the
> >>> desirable behavior. But if we have agreed not to use options, I'm also
> >> fine
> >>> with Alternative #1.
> >>>
> >>> On Sun, Feb 7, 2021 at 11:01 AM Jark Wu  wrote:
> >>>
>  Hi Timo,
> 
>  1) How should we execute statements in CLI and in file? Should there
> be
> >> a
>  difference?
>  I do think we should unify the behavior of CLI and SQL files. SQL
> files
> >> can
>  be thought of as a shortcut of
>  "start CLI" => "copy content of SQL files" => "paste content in CLI".
>  Actually, we already did this in kafka_e2e.sql [1].
>  I think it's hard for users to understand why SQL files behave
> >> differently
>  from CLI, all the other systems don't have such a difference.
> 
>  If we distinguish SQL files and CLI, should there be a difference in
> >> JDBC
>  driver and UI platform?
>  Personally, they all should have consistent behavior.
> 
>  2) Should we have different behavior for batch and streaming?
>  I think we 

Re: [DISCUSS]FLIP-163: SQL Client Improvements

2021-02-08 Thread Timo Walther

Hi Jark,

you are right. Nesting STATEMENT SET and ASYNC might be too verbose.

So let's stick to the config option approach.

However, I strongly believe that we should not use the batch/streaming 
mode for deriving semantics. This discussion is similar to time function 
discussion. We should not derive sync/async submission behavior from a 
flag that should only influence runtime operators and the incremental 
computation. Statements for bounded streams should have the same 
semantics in batch mode.


I think your proposed option 2) is a good tradeoff. For the following 
reasons:


pros:
- by default, batch and streaming behave exactly the same
- SQL Client CLI behavior does not change compared to 1.12 and remains 
async for batch and streaming

- consistent with the async Table API behavior

con:
- batch files are not 100% SQL compliant by default

The last item might not be an issue since we can expect that users have 
long-running jobs and prefer async execution in most cases.
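
To make option 2) concrete, here is a minimal sketch under assumptions: one boolean option decides whether the client waits for a DML job, and the batch/streaming runtime mode is never consulted. The key `table.dml-async` is only the name proposed in this thread, and `DmlSubmissionSketch`, `isAsync`, and `submit` are hypothetical names, not Flink APIs.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;

/** Hypothetical model of option 2; not SQL Client code. */
final class DmlSubmissionSketch {
  // Option name as proposed in this thread; an assumption, not a released key.
  static final String OPTION = "table.dml-async";

  /** Async is the default for both batch and streaming when the option is unset. */
  static boolean isAsync(Map<String, String> config) {
    return Boolean.parseBoolean(config.getOrDefault(OPTION, "true"));
  }

  /** Returns right after submission when async, otherwise waits for the job result. */
  static String submit(Map<String, String> config, CompletableFuture<String> jobResult) {
    if (isAsync(config)) {
      return "SUBMITTED";
    }
    // Sync mode: block until the job finishes, matching standard SQL semantics.
    return jobResult.join();
  }
}
```

Because the decision depends only on the option, a bounded statement behaves the same in batch and in streaming mode, which is the property argued for above.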


Regards,
Timo


On 08.02.21 14:15, Jark Wu wrote:

Hi Timo,

Actually, I'm not in favor of explicit syntax `BEGIN ASYNC;... END;`,
because it makes submitting streaming jobs very verbose: every INSERT INTO
and STATEMENT SET must be wrapped in the ASYNC clause, which is
not user-friendly and not backward-compatible.

I agree we will have unified behavior but this is at the cost of hurting
our main users.
I'm worried that end users can't understand the technical decision, and
they would feel streaming is harder to use.

If we want to have a unified behavior, and let users decide what's the
desirable behavior, I prefer to have a config option. A Flink cluster can
be set to async, then users don't need to wrap every DML in an ASYNC
clause. This is the least intrusive way for the users.


Personally, I'm fine with the following options, in order of priority:

1) sync for batch DML and async for streaming DML
==> only breaks batch behavior, but makes both happy

2) async for both batch and streaming DML, and can be set to sync via a
configuration.
==> compatible, and provides flexible configurable behavior

3) sync for both batch and streaming DML, and can be
set to async via a configuration.
==> +0 for this, because it breaks all the compatibility, esp. for our main
users.

Best,
Jark

On Mon, 8 Feb 2021 at 17:34, Timo Walther  wrote:


Hi Jark, Hi Rui,

1) How should we execute statements in CLI and in file? Should there be
a difference?
So it seems we have consensus here with unified behavior. Even though
this means we are breaking existing batch INSERT INTOs that were
asynchronous before.

2) Should we have different behavior for batch and streaming?
I think also batch users prefer async behavior because usually even
those pipelines take some time to execute. But we should stick to
standard SQL blocking semantics.

What are your opinions on making async explicit in SQL via `BEGIN ASYNC;
... END;`? This would allow us to really have unified semantics because
batch and streaming would behave the same?

Regards,
Timo


On 07.02.21 04:46, Rui Li wrote:

Hi Timo,

I agree with Jark that we should provide consistent experience regarding
SQL CLI and files. Some systems even allow users to execute SQL files in
the CLI, e.g. the "SOURCE" command in MySQL. If we want to support that in
the future, it's a little tricky to decide whether that should be treated
as CLI or file.

I actually prefer a config option and let users decide what's the
desirable behavior. But if we have agreed not to use options, I'm also fine
with Alternative #1.

On Sun, Feb 7, 2021 at 11:01 AM Jark Wu  wrote:


Hi Timo,

1) How should we execute statements in CLI and in file? Should there be a
difference?
I do think we should unify the behavior of CLI and SQL files. SQL files can
be thought of as a shortcut of
"start CLI" => "copy content of SQL files" => "paste content in CLI".
Actually, we already did this in kafka_e2e.sql [1].
I think it's hard for users to understand why SQL files behave differently
from CLI, all the other systems don't have such a difference.

If we distinguish SQL files and CLI, should there be a difference in JDBC
driver and UI platform?
Personally, they all should have consistent behavior.

2) Should we have different behavior for batch and streaming?
I think we all agree streaming users prefer async execution, otherwise it's
weird and difficult to use if the submit script or CLI never exits. On the
other hand, batch SQL users are used to SQL statements being executed in a
blocking fashion.

Either unified async execution or unified sync execution will hurt one
side of the streaming/batch users. In order to make both sides happy, I
think we can have different behavior for batch and streaming.
There are many essential differences between batch and stream systems, I
think it's normal to have some different behaviors, and the behavior
doesn't break the unified batch/stream semantics.


Thus, I'm +1 to Alternative 1:
We consider batch/streaming 

[jira] [Created] (FLINK-21321) Change RocksDB incremental checkpoint re-scaling to use deleteRange

2021-02-08 Thread Joey Pereira (Jira)
Joey Pereira created FLINK-21321:


 Summary: Change RocksDB incremental checkpoint re-scaling to use 
deleteRange
 Key: FLINK-21321
 URL: https://issues.apache.org/jira/browse/FLINK-21321
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / State Backends
Reporter: Joey Pereira


In FLINK-8790, it was suggested to use RocksDB's {{deleteRange}} API to more 
efficiently clip the databases for the desired target group.

During the PR for that ticket, 
[#5582|https://github.com/apache/flink/pull/5582], the change did not end up 
using the {{deleteRange}} method, as it was an experimental feature in RocksDB.

At this point, {{deleteRange}} is in a far less experimental state, though I 
believe it is still formally "experimental". It is heavily used by many others 
like CockroachDB and TiKV, and they have teased out several bugs in complex 
interactions over the years.

For certain re-scaling situations where restores trigger {{restoreWithScaling}} 
and the DB clipping logic, this would likely reduce an O(n) operation (n = 
state size/records) to O(1). For large-state apps, this could save a 
non-trivial amount of time spent on re-scaling. In the case of my 
workplace, we have an operator with 100s of billions of records in state and 
re-scaling was taking a long time (>>30min, but it has been a while since doing 
it).
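
To illustrate the O(n) versus O(1) difference, here is a toy sketch that contrasts clipping by per-key deletes with clipping by range tombstones. It is plain JDK code, not RocksDB or Flink code: `ClippingSketch` and all of its methods are hypothetical, and the tombstone list only mimics the idea that a range delete is recorded as a single marker instead of one delete per record.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Toy key-value store that contrasts per-key deletes with range tombstones. */
class ClippingSketch {
    private final TreeMap<Integer, String> data = new TreeMap<>();
    // Each tombstone hides keys in [from, to); appending one is O(1).
    private final List<long[]> tombstones = new ArrayList<>();
    long writes = 0; // number of delete operations issued

    void put(int key, String value) {
        data.put(key, value);
    }

    /** Per-key clipping: one delete per record outside [keepFrom, keepTo), so O(n). */
    void clipByIteration(int keepFrom, int keepTo) {
        List<Integer> doomed = new ArrayList<>();
        for (Integer key : data.keySet()) {
            if (key < keepFrom || key >= keepTo) {
                doomed.add(key);
            }
        }
        for (Integer key : doomed) {
            data.remove(key);
            writes++;
        }
    }

    /** Range clipping: at most two tombstones, independent of the record count. */
    void clipByRange(int keepFrom, int keepTo) {
        deleteRange(Long.MIN_VALUE, keepFrom);
        deleteRange(keepTo, Long.MAX_VALUE);
    }

    void deleteRange(long from, long to) {
        if (from < to) {
            tombstones.add(new long[] {from, to});
            writes++;
        }
    }

    /** Reads consult the tombstones first; hidden keys look deleted. */
    String get(int key) {
        for (long[] t : tombstones) {
            if (key >= t[0] && key < t[1]) {
                return null;
            }
        }
        return data.get(key);
    }

    public static void main(String[] args) {
        ClippingSketch perKey = new ClippingSketch();
        ClippingSketch ranged = new ClippingSketch();
        for (int i = 0; i < 1000; i++) {
            perKey.put(i, "v");
            ranged.put(i, "v");
        }
        perKey.clipByIteration(250, 500);
        ranged.clipByRange(250, 500);
        System.out.println("per-key deletes issued: " + perKey.writes);
        System.out.println("range deletes issued:   " + ranged.writes);
    }
}
```

In this toy model the cost moves from write time to read time (every `get` checks the tombstones); how RocksDB handles that trade-off internally is outside the scope of the sketch.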



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS]FLIP-163: SQL Client Improvements

2021-02-08 Thread Jark Wu
Hi Timo,

Actually, I'm not in favor of explicit syntax `BEGIN ASYNC;... END;`,
because it makes submitting streaming jobs very verbose: every INSERT INTO
and STATEMENT SET must be wrapped in the ASYNC clause, which is
not user-friendly and not backward-compatible.

I agree we will have unified behavior but this is at the cost of hurting
our main users.
I'm worried that end users can't understand the technical decision, and
they would feel streaming is harder to use.

If we want to have a unified behavior, and let users decide what's the
desirable behavior, I prefer to have a config option. A Flink cluster can
be set to async, then users don't need to wrap every DML in an ASYNC
clause. This is the least intrusive way for the users.


Personally, I'm fine with the following options, in order of priority:

1) sync for batch DML and async for streaming DML
==> only breaks batch behavior, but makes both happy

2) async for both batch and streaming DML, and can be set to sync via a
configuration.
==> compatible, and provides flexible configurable behavior

3) sync for both batch and streaming DML, and can be
set to async via a configuration.
==> +0 for this, because it breaks all the compatibility, esp. for our main
users.

Best,
Jark

On Mon, 8 Feb 2021 at 17:34, Timo Walther  wrote:

> Hi Jark, Hi Rui,
>
> 1) How should we execute statements in CLI and in file? Should there be
> a difference?
> So it seems we have consensus here with unified behavior. Even though
> this means we are breaking existing batch INSERT INTOs that were
> asynchronous before.
>
> 2) Should we have different behavior for batch and streaming?
> I think also batch users prefer async behavior because usually even
> those pipelines take some time to execute. But we should stick to
> standard SQL blocking semantics.
>
> What are your opinions on making async explicit in SQL via `BEGIN ASYNC;
> ... END;`? This would allow us to really have unified semantics because
> batch and streaming would behave the same?
>
> Regards,
> Timo
>
>
> On 07.02.21 04:46, Rui Li wrote:
> > Hi Timo,
> >
> > I agree with Jark that we should provide consistent experience regarding
> > SQL CLI and files. Some systems even allow users to execute SQL files in
> > the CLI, e.g. the "SOURCE" command in MySQL. If we want to support that in
> > the future, it's a little tricky to decide whether that should be treated
> > as CLI or file.
> >
> > I actually prefer a config option and let users decide what's the
> > desirable behavior. But if we have agreed not to use options, I'm also fine
> > with Alternative #1.
> >
> > On Sun, Feb 7, 2021 at 11:01 AM Jark Wu  wrote:
> >
> >> Hi Timo,
> >>
> >> 1) How should we execute statements in CLI and in file? Should there be a
> >> difference?
> >> I do think we should unify the behavior of CLI and SQL files. SQL files can
> >> be thought of as a shortcut of
> >> "start CLI" => "copy content of SQL files" => "paste content in CLI".
> >> Actually, we already did this in kafka_e2e.sql [1].
> >> I think it's hard for users to understand why SQL files behave differently
> >> from CLI, all the other systems don't have such a difference.
> >>
> >> If we distinguish SQL files and CLI, should there be a difference in JDBC
> >> driver and UI platform?
> >> Personally, they all should have consistent behavior.
> >>
> >> 2) Should we have different behavior for batch and streaming?
> >> I think we all agree streaming users prefer async execution, otherwise it's
> >> weird and difficult to use if the submit script or CLI never exits. On the
> >> other hand, batch SQL users are used to SQL statements being executed in a
> >> blocking fashion.
> >>
> >> Either unified async execution or unified sync execution will hurt one
> >> side of the streaming/batch users. In order to make both sides happy, I
> >> think we can have different behavior for batch and streaming.
> >> There are many essential differences between batch and stream systems, I
> >> think it's normal to have some different behaviors, and the behavior
> >> doesn't break the unified batch/stream semantics.
> >>
> >>
> >> Thus, I'm +1 to Alternative 1:
> >> We consider batch/streaming mode and block for batch INSERT INTO and async
> >> for streaming INSERT INTO/STATEMENT SET.
> >> And this behavior is consistent across CLI and files.
> >>
> >> Best,
> >> Jark
> >>
> >> [1]:
> >>
> >>
> https://github.com/apache/flink/blob/master/flink-end-to-end-tests/flink-end-to-end-tests-common-kafka/src/test/resources/kafka_e2e.sql
> >>
> >> On Fri, 5 Feb 2021 at 21:49, Timo Walther  wrote:
> >>
> >>> Hi Jark,
> >>>
> >>> thanks for the summary. I hope we can also find a good long-term
> >>> solution on the async/sync execution behavior topic.
> >>>
> >>> It should be discussed in a bigger round because it is (similar to the
> >>> time function discussion) related to batch-streaming unification where
> >>> we should stick to the SQL standard to some degree but also need to come

Re: [DISCUSS] FLIP-164: Improve Schema Handling in Catalogs

2021-02-08 Thread Dawid Wysakowicz
Hi Timo,

From my perspective the proposed changes look good. I agree it is an
important step towards FLIP-129 and FLIP-136. Personally I feel
comfortable voting on the document.

Best,

Dawid

On 05/02/2021 16:09, Timo Walther wrote:
> Hi everyone,
>
> you might have seen that we discussed a better schema API in the past as
> part of FLIP-129 and FLIP-136. We also discussed this topic during
> different releases:
>
> https://issues.apache.org/jira/browse/FLINK-17793
>
> Jark and I had an offline discussion on how we can finally fix this
> shortcoming and maintain backwards compatibility for a couple of
> releases to give people time to update their code.
>
> I would like to propose the following FLIP:
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-164%3A+Improve+Schema+Handling+in+Catalogs
>
>
> The FLIP updates the class hierarchy to achieve the following goals:
>
> - make it visible whether a schema is resolved or unresolved and when
> the resolution happens
> - offer a unified API for FLIP-129, FLIP-136, and catalogs
> - allow arbitrary data types and expressions in the schema for
> watermark spec or columns
> - have access to other catalogs for declaring a data type or
> expression via CatalogManager
> - a cleaned up TableSchema
> - remain backwards compatible in the persisted properties and API
>
> Looking forward to your feedback.
>
> Thanks,
> Timo





Re: [DISCUSS] Releasing Apache Flink 1.12.2

2021-02-08 Thread Matthias Pohl
Thanks Yuan for driving 1.12.2.
+1 for releasing 1.12.2

One comment about FLINK-21030 [1]: I hope to fix it this week. But there
are still some uncertainties. The underlying problem is older than 1.12.
Hence, the suggestion is to not block the 1.12.2 release because of
FLINK-21030 [1]. I will leave it as a blocker issue, though, to underline
that it should be fixed for 1.13.

Best,
Matthias

[1] https://issues.apache.org/jira/browse/FLINK-21030

On Sun, Feb 7, 2021 at 4:10 AM Xintong Song  wrote:

> Thanks Yuan,
>
> +1 for releasing 1.12.2 and Yuan as the release manager.
>
> Thank you~
>
> Xintong Song
>
>
>
> On Sat, Feb 6, 2021 at 3:41 PM Yu Li  wrote:
>
> > +1 for releasing 1.12.2, and thanks for volunteering to be our release
> > manager Yuan.
> >
> > Besides the mentioned issues, I could see two more blockers with 1.12.2 as
> > fix version [1] that need some tracking:
> > * FLINK-21013: Blink planner does not ingest timestamp into StreamRecord
> > * FLINK-21030: Broken job restart for job with disjoint graph
> >
> > Best Regards,
> > Yu
> >
> > [1] https://s.apache.org/flink-1.12.2-blockers
> >
> >
> > On Sat, 6 Feb 2021 at 08:57, Kurt Young  wrote:
> >
> > > Thanks for being our release manager Yuan.
> > >
> > > We found an out of memory issue [1] which will affect most batch jobs,
> > > thus I think it would be great if we can include this fix in 1.12.2.
> > >
> > > [1] https://issues.apache.org/jira/browse/FLINK-20663
> > >
> > > Best,
> > > Kurt
> > >
> > >
> > > On Sat, Feb 6, 2021 at 12:36 AM Till Rohrmann wrote:
> > >
> > > > Thanks for volunteering for being our release manager Yuan :-)
> > > >
> > > > +1 for a timely bug fix release.
> > > >
> > > > I will try to review the PR for FLINK-20417 [1] which is a good fix to
> > > > include in the next bug fix release. We don't have to block the release
> > > > on this fix though.
> > > >
> > > > [1] https://issues.apache.org/jira/browse/FLINK-20417
> > > >
> > > > Cheers,
> > > > Till
> > > >
> > > > On Fri, Feb 5, 2021 at 5:12 PM Piotr Nowojski wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > Thanks Yuan for bringing up this topic.
> > > > >
> > > > > +1 for the quick 1.12.2 release.
> > > > >
> > > > > As Yuan mentioned, Roman and I can help whenever committer rights
> > > > > are required.
> > > > >
> > > > > > I am a Firefox user and I just fixed a long lasting bug
> > > > > > https://github.com/apache/flink/pull/14848 , wish this would be
> > > > > > merged in this release.
> > > > >
> > > > > I will push to speed up review of your PR. Let's try to merge it
> > > > > before 1.12.2, but at the same time I wouldn't block the release on
> > > > > this bug.
> > > > >
> > > > > Best,
> > > > > Piotrek
> > > > >
> > > > > On Fri, 5 Feb 2021 at 12:12, 郁蓝 wrote:
> > > > >
> > > > > > Hi Yuan,
> > > > > >
> > > > > >
> > > > > > I am a Firefox user and I just fixed a long lasting bug
> > > > > > https://github.com/apache/flink/pull/14848 , wish this would be
> > > > > > merged in this release.
> > > > > >
> > > > > >
> > > > > > Best wishes
> > > > > >
> > > > > >
> > > > > >
> > > > > > -- Original Message --
> > > > > > From: "dev" <yuanmei.w...@gmail.com>;
> > > > > > Sent: Friday, February 5, 2021, 6:36 PM
> > > > > > To: "dev"
> > > > > > Subject: [DISCUSS] Releasing Apache Flink 1.12.2
> > > > > >
> > > > > >
> > > > > >
> > > > > > Hey devs,
> > > > > >
> > > > > > One of the major issues that have not been resolved in Apache Flink
> > > > > > 1.12.1 is "unaligned checkpoint recovery may lead to corrupted data
> > > > > > stream"[1]. Since the problem is now fixed and it is critical to the
> > > > > > users, I would like to kick off a discussion on releasing Flink 1.12.2
> > > > > > that includes unaligned checkpoint fixes.
> > > > > >
> > > > > > I would like to volunteer myself for managing this release. But I
> > > > > > noticed that some of the release steps may require committer
> > > > > > authorities. Luckily, Piotr and Roman are very kind to provide help
> > > > > > on such steps.
> > > > > >
> > > > > > Apart from the unaligned checkpoint issue, please let us know in this
> > > > > > thread if there are any other fixes that we should try to include in
> > > > > > this version. I'll try to communicate with the issue owners and come
> > > > > > up with a time estimation next week.
> > > > > >
> > > > > > [1] https://issues.apache.org/jira/browse/FLINK-20654
> > > > > >
> > > > > > Best
> > > > > > Yuan
> > > > >
> > > >
> > >
> >


[jira] [Created] (FLINK-21320) Stream from FlinkKafkaShuffle.persistentKeyBy does not end if upstream ends

2021-02-08 Thread Kezhu Wang (Jira)
Kezhu Wang created FLINK-21320:
--

 Summary: Stream from FlinkKafkaShuffle.persistentKeyBy does not 
end if upstream ends
 Key: FLINK-21320
 URL: https://issues.apache.org/jira/browse/FLINK-21320
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Kafka
Affects Versions: 1.13.0
Reporter: Kezhu Wang


I pushed a [test 
case|https://github.com/kezhuw/flink/commit/5793083c370241717c485b5663a10f72884e028b]
 to my repository for evaluation; that test hangs after all input is consumed.

I think we could do something (such as setting {{running}} to {{false}} upon 
{{Watermark.MAX_WATERMARK}}) in 
{{KafkaShuffleFetcher.partitionConsumerRecordsHandler}} to end the Kafka consumer.
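
As a rough illustration of the idea, here is a self-contained toy fetch loop that stops once it sees the final watermark. It is not the actual {{KafkaShuffleFetcher}} code: {{ShuffleFetchLoopSketch}}, {{Element}} and {{handle}} are hypothetical names, the "partition" is an in-memory queue, and the sketch assumes a single upstream producer (with several producers the loop would presumably have to wait for the final watermark from each of them).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

class ShuffleFetchLoopSketch {
    // Timestamp carried by Flink's Watermark.MAX_WATERMARK.
    static final long MAX_WATERMARK = Long.MAX_VALUE;

    /** Either a data record (record != null) or a watermark. */
    static final class Element {
        final String record;
        final long watermark;

        private Element(String record, long watermark) {
            this.record = record;
            this.watermark = watermark;
        }

        static Element record(String value) {
            return new Element(value, Long.MIN_VALUE);
        }

        static Element watermark(long timestamp) {
            return new Element(null, timestamp);
        }
    }

    volatile boolean running = true;
    final List<String> emitted = new ArrayList<>();

    /** Emits records; the final watermark flips the running flag so the loop can end. */
    void handle(Element element) {
        if (element.record != null) {
            emitted.add(element.record);
        } else if (element.watermark == MAX_WATERMARK) {
            running = false;
        }
    }

    void runFetchLoop(Queue<Element> partition) {
        while (running) {
            Element next = partition.poll();
            if (next == null) {
                // A real consumer would block here forever, which is the reported hang.
                throw new IllegalStateException("input exhausted while still running");
            }
            handle(next);
        }
    }

    public static void main(String[] args) {
        Queue<Element> partition = new ArrayDeque<>();
        partition.add(Element.record("a"));
        partition.add(Element.record("b"));
        partition.add(Element.watermark(MAX_WATERMARK));

        ShuffleFetchLoopSketch fetcher = new ShuffleFetchLoopSketch();
        fetcher.runFetchLoop(partition);
        System.out.println("emitted " + fetcher.emitted + ", running=" + fetcher.running);
    }
}
```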

cc [~ym]  [~sewen] [~AHeise] [~pnowojski]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: Flink 1.11.4 release? And usual plan for maintenance etc

2021-02-08 Thread Chesnay Schepler
The 1.13 feature freeze is scheduled for the end of March, and the 
release can be expected to happen 2-4 weeks after that.


On 2/8/2021 11:21 AM, Adam Roberts wrote:

Hey Till, all - thanks for the information, very useful to know.

So the JIRAs in particular are for upgrading the version of Jackson
https://issues.apache.org/jira/browse/FLINK-21020, and there's another for
Netty as well https://issues.apache.org/jira/browse/FLINK-21019 but
admittedly I'm not sure how applicable/easy the former is to do against the
1.11.x stream.

Of course, if we can pick up any bug fixes and potentially security fixes
along the way that would be great - 35 fixes sounds good to me.

*Just out of interest, if you had to guess when do you think 1.13 would be
released?*

Thanks again,


On Fri, 5 Feb 2021 at 16:48, Till Rohrmann  wrote:


Hi Adam,

Currently there are no concrete plans for the 1.11.4 release. Usually the
community creates a new release after enough fixes have been merged or if
there is enough demand for it. We have indeed accumulated 35 fixes for
1.11.4 already which could justify a new release. Which fixes do you need
in particular?

1.11.x is still actively maintained. The community's update policy is a bit
hidden here [1]. In a nutshell, the community supports the latest two major
releases with bug fixes for critical issues. Consequently, we are currently
supporting 1.11.x and 1.12.x.

Usually enhancements are only backported if they don't change Flink's
behaviour and if there is demand for it. So usually, this does not happen.

Once Flink 1.13 is released, Flink 1.11.x won't be maintained anymore by
the community. This means that problems won't be fixed in the 1.11.x
release branch.

[1] https://flink.apache.org/downloads.html#update-policy-for-old-releases

Cheers,
Till

On Fri, Feb 5, 2021 at 10:46 AM Adam Roberts wrote:

Hey everyone, a few short questions please - is there an expected release
date/target for 1.11.4 please? I ask as I've noticed a couple of dependency
upgrades that I'd really like to pick up if possible, instead of moving to
Flink 1.12 or 1.13 due to the existing Flink jobs we have.

My second question is related...is the 1.11.x line still actively
maintained?

Is it just receiving fixes and patches or enhancements too?

And once 1.13 is released, might the 1.11.x release line be restricted to
just patches?

Curious how things are done, obviously if there is documentation I've just
missed out please send that my way.

Thanks!





Re: Flink 1.11.4 release? And usual plan for maintenance etc

2021-02-08 Thread Adam Roberts
Hey Till, all - thanks for the information, very useful to know.

So the JIRAs in particular are for upgrading the version of Jackson
https://issues.apache.org/jira/browse/FLINK-21020, and there's another for
Netty as well https://issues.apache.org/jira/browse/FLINK-21019 but
admittedly I'm not sure how applicable/easy the former is to do against the
1.11.x stream.

Of course, if we can pick up any bug fixes and potentially security fixes
along the way that would be great - 35 fixes sounds good to me.

*Just out of interest, if you had to guess when do you think 1.13 would be
released?*

Thanks again,


On Fri, 5 Feb 2021 at 16:48, Till Rohrmann  wrote:

> Hi Adam,
>
> Currently there are no concrete plans for the 1.11.4 release. Usually the
> community creates a new release after enough fixes have been merged or if
> there is enough demand for it. We have indeed accumulated 35 fixes for
> 1.11.4 already which could justify a new release. Which fixes do you need
> in particular?
>
> 1.11.x is still actively maintained. The community's update policy is a bit
> hidden here [1]. In a nutshell, the community supports the latest two major
> releases with bug fixes for critical issues. Consequently, we are currently
> supporting 1.11.x and 1.12.x.
>
> Usually enhancements are only backported if they don't change Flink's
> behaviour and if there is demand for it. So usually, this does not happen.
>
> Once Flink 1.13 is released, Flink 1.11.x won't be maintained anymore by
> the community. This means that problems won't be fixed in the 1.11.x
> release branch.
>
> [1] https://flink.apache.org/downloads.html#update-policy-for-old-releases
>
> Cheers,
> Till
>
> On Fri, Feb 5, 2021 at 10:46 AM Adam Roberts wrote:
>
> > Hey everyone, a few short questions please - is there an expected release
> > date/target for 1.11.4 please? I ask as I've noticed a couple of dependency
> > upgrades that I'd really like to pick up if possible, instead of moving to
> > Flink 1.12 or 1.13 due to the existing Flink jobs we have.
> >
> > My second question is related...is the 1.11.x line still actively
> > maintained?
> >
> > Is it just receiving fixes and patches or enhancements too?
> >
> > And once 1.13 is released, might the 1.11.x release line be restricted to
> > just patches?
> >
> > Curious how things are done, obviously if there is documentation I've just
> > missed out please send that my way.
> >
> > Thanks!
> >
>


Re: [DISCUSS]FLIP-163: SQL Client Improvements

2021-02-08 Thread Timo Walther

Hi Jark, Hi Rui,

1) How should we execute statements in CLI and in file? Should there be 
a difference?
So it seems we have consensus here with unified behavior. Even though 
this means we are breaking existing batch INSERT INTOs that were 
asynchronous before.


2) Should we have different behavior for batch and streaming?
I think also batch users prefer async behavior because usually even 
those pipelines take some time to execute. But we should stick to 
standard SQL blocking semantics.


What are your opinions on making async explicit in SQL via `BEGIN ASYNC; 
... END;`? This would allow us to really have unified semantics because 
batch and streaming would behave the same?


Regards,
Timo


On 07.02.21 04:46, Rui Li wrote:

Hi Timo,

I agree with Jark that we should provide consistent experience regarding
SQL CLI and files. Some systems even allow users to execute SQL files in
the CLI, e.g. the "SOURCE" command in MySQL. If we want to support that in
the future, it's a little tricky to decide whether that should be treated
as CLI or file.

I actually prefer a config option and let users decide what's the
desirable behavior. But if we have agreed not to use options, I'm also fine
with Alternative #1.

On Sun, Feb 7, 2021 at 11:01 AM Jark Wu  wrote:


Hi Timo,

1) How should we execute statements in CLI and in file? Should there be a
difference?
I do think we should unify the behavior of CLI and SQL files. SQL files can
be thought of as a shortcut of
"start CLI" => "copy content of SQL files" => "paste content in CLI".
Actually, we already did this in kafka_e2e.sql [1].
I think it's hard for users to understand why SQL files behave differently
from CLI, all the other systems don't have such a difference.

If we distinguish SQL files and CLI, should there be a difference in JDBC
driver and UI platform?
Personally, they all should have consistent behavior.

2) Should we have different behavior for batch and streaming?
I think we all agree streaming users prefer async execution, otherwise it's
weird and difficult to use if the submit script or CLI never exits. On the
other hand, batch SQL users are used to SQL statements being executed in a
blocking fashion.

Either unified async execution or unified sync execution will hurt one
side of the streaming/batch users. In order to make both sides happy, I
think we can have different behavior for batch and streaming.
There are many essential differences between batch and stream systems, I
think it's normal to have some different behaviors, and the behavior
doesn't break the unified batch/stream semantics.


Thus, I'm +1 to Alternative 1:
We consider batch/streaming mode and block for batch INSERT INTO and async
for streaming INSERT INTO/STATEMENT SET.
And this behavior is consistent across CLI and files.

Best,
Jark

[1]:

https://github.com/apache/flink/blob/master/flink-end-to-end-tests/flink-end-to-end-tests-common-kafka/src/test/resources/kafka_e2e.sql

On Fri, 5 Feb 2021 at 21:49, Timo Walther  wrote:


Hi Jark,

thanks for the summary. I hope we can also find a good long-term
solution on the async/sync execution behavior topic.

It should be discussed in a bigger round because it is (similar to the
time function discussion) related to batch-streaming unification where
we should stick to the SQL standard to some degree but also need to come
up with good streaming semantics.

Let me summarize the problem again to hear opinions:

- Batch SQL users are used to executing SQL files sequentially (from top
to bottom).
- Batch SQL users are used to SQL statements being executed blocking,
one after the other, esp. when moving around data with INSERT INTO.
- Streaming users prefer async execution because unbounded streams are
more frequent than bounded streams.
- We decided to make the Flink Table API async because in a programming
language it is easy to call `.await()` on the result to make it blocking.
- INSERT INTO statements in the current SQL Client implementation are
always submitted asynchronously.
- Other clients such as Ververica Platform allow only one INSERT INTO
or a STATEMENT SET at the end of a file that will run asynchronously.

Questions:

- How should we execute statements in CLI and in file? Should there be a
difference?
- Should we have different behavior for batch and streaming?
- Shall we solve parts with a config option or is it better to make it
explicit in the SQL job definition because it influences the semantics
of multiple INSERT INTOs?

Let me summarize my opinion at the moment:

- SQL files should always be executed blocking by default. Because they
could potentially contain a long list of INSERT INTO statements. This
would be SQL standard compliant.
- If we allow async execution, we should make this explicit in the SQL
file via `BEGIN ASYNC; ... END;`.
- In the CLI, we always execute async to maintain the old behavior. We
can also assume that people are only using the CLI to fire statements
and close the CLI afterwards.

Alternative 1:
- We 

Re: Activate bloom filter in RocksDB State Backend via Flink configuration

2021-02-08 Thread Till Rohrmann
Hi Jun,

Making things easier to use and configure is a good idea. Hence, +1 for
this proposal. Maybe create a JIRA ticket for it.

For the concrete default values it would be nice to hear the opinion of a
RocksDB expert.
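
As a starting point for that discussion, here is a small sketch of what a bits-per-key value means in theory. It uses the textbook Bloom filter estimate (1 - e^(-k/b))^k with k = round(b * ln 2) hash functions; RocksDB's own filter implementation differs in its details, so the numbers are only a rough guide. `BloomFilterMath` and its methods are names made up for this example.

```java
class BloomFilterMath {

    /** Hash count that minimizes the false-positive rate for a bits-per-key budget. */
    static int optimalHashCount(double bitsPerKey) {
        return (int) Math.max(1, Math.round(bitsPerKey * Math.log(2)));
    }

    /** Textbook estimate (1 - e^(-k/b))^k for b bits per key and k hash functions. */
    static double falsePositiveRate(double bitsPerKey) {
        int k = optimalHashCount(bitsPerKey);
        return Math.pow(1.0 - Math.exp(-k / bitsPerKey), k);
    }

    public static void main(String[] args) {
        for (int bits : new int[] {5, 10, 15}) {
            System.out.printf(
                    "bits-per-key=%d -> k=%d, estimated false-positive rate=%.4f%n",
                    bits, optimalHashCount(bits), falsePositiveRate(bits));
        }
    }
}
```

Under this estimate, the proposed default of 10 bits per key corresponds to a false-positive rate of roughly 0.8%, and each additional bit per key lowers it further at the cost of more memory per key.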

Cheers,
Till

On Sun, Feb 7, 2021 at 7:23 PM Jun Qin  wrote:

> Hi,
>
> Activating bloom filter in the RocksDB state backend improves read
> performance. Currently activating bloom filter can only be done by
> implementing a custom ConfigurableRocksDBOptionsFactory. I think we should
> provide an option to activate bloom filter via Flink configuration.  What
> do you think? If so, what about the following configuration?
>
> state.backend.rocksdb.bloom-filter.enabled: false (default)
> state.backend.rocksdb.bloom-filter.bits-per-key: 10 (default)
> state.backend.rocksdb.bloom-filter.block-based: true (default)
>
>
> Thanks
> Jun


Re: Flink upsert-kafka connector not working with Avro Confluent

2021-02-08 Thread Till Rohrmann
Hi Shamit,

thanks for reaching out to the community. I am pulling in Timo who might
know more about this problem.

Cheers,
Till

On Sun, Feb 7, 2021 at 6:22 AM shamit jain  wrote:

> Hello Team,
>
> I am facing an issue with the "upsert-kafka" connector, which should read Avro
> messages serialized using
> "io.confluent.kafka.serializers.KafkaAvroDeserializer". The same works
> with the "kafka" connector.
>
> It looks like we are not able to pass the schema registry URL and subject
> name the way we do when using the "kafka" connector.
>
> Please help.
>
>
> The table definition with upsert-kafka is below (not working):
>
> CREATE TABLE proposalLine (PROPOSAL_LINE_ID
> bigint,LAST_MODIFIED_BY STRING ,PRIMARY KEY (PROPOSAL_LINE_ID) NOT ENFORCED
> ) WITH ('connector' = 'upsert-kafka', 'properties.bootstrap.servers' =
> 'localhost:9092', 'properties.auto.offset.reset' = 'earliest', 'topic' =
> 'lndcdcadsprpslproposalline', 'key.format' = 'avro', 'value.format' =
> 'avro', 'properties.group.id'='dd', 'properties.schema.registry.url'='http://localhost:8081',
> 'properties.key.deserializer'='org.apache.kafka.common.serialization.LongDeserializer',
> 'properties.value.deserializer'='io.confluent.kafka.serializers.KafkaAvroDeserializer')
>
> ERROR:
> Caused by: java.io.IOException: Failed to deserialize Avro record.
>     at org.apache.flink.formats.avro.AvroRowDataDeserializationSchema.deserialize(AvroRowDataDeserializationSchema.java:101)
>     at org.apache.flink.formats.avro.AvroRowDataDeserializationSchema.deserialize(AvroRowDataDeserializationSchema.java:44)
>     at org.apache.flink.api.common.serialization.DeserializationSchema.deserialize(DeserializationSchema.java:82)
>     at org.apache.flink.streaming.connectors.kafka.table.DynamicKafkaDeserializationSchema.deserialize(DynamicKafkaDeserializationSchema.java:130)
>     at org.apache.flink.streaming.connectors.kafka.internals.KafkaFetcher.partitionConsumerRecordsHandler(KafkaFetcher.java:179)
>     at org.apache.flink.streaming.connectors.kafka.internals.KafkaFetcher.runFetchLoop(KafkaFetcher.java:142)
>     at org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase.run(FlinkKafkaConsumerBase.java:826)
>     at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:110)
>     at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:66)
>     at org.apache.flink.streaming.runtime.tasks.SourceStreamTask$LegacySourceFunctionThread.run(SourceStreamTask.java:241)
> Caused by: java.lang.ArrayIndexOutOfBoundsException: -1
>     at org.apache.avro.io.parsing.Symbol$Alternative.getSymbol(Symbol.java:460)
>     at org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:283)
>     at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:187)
>     at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:160)
>     at org.apache.avro.generic.GenericDatumReader.readField(GenericDatumReader.java:259)
>     at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:247)
>     at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:179)
>     at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:160)
>     at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:153)
>     at org.apache.flink.formats.avro.AvroDeserializationSchema.deserialize(AvroDeserializationSchema.java:139)
>     at org.apache.flink.formats.avro.AvroRowDataDeserializationSchema.deserialize(AvroRowDataDeserializationSchema.java:98)
>     ... 9 more
>
>
> The table definition with the kafka connector is below (working):
>
> CREATE TABLE proposalLine (PROPOSAL_LINE_ID bigint,LAST_MODIFIED_BY String
> ) WITH ('connector' = 'kafka', 'properties.bootstrap.servers' =
> 'localhost:9092', 'properties.auto.offset.reset' = 'earliest', 'topic' =
> 'lndcdcadsprpslproposalline',
> 'format'='avro-confluent','avro-confluent.schema-registry.url' = 'http://localhost:8081',
> 'avro-confluent.schema-registry.subject' =
> 'lndcdcadsprpslproposalline-value')
>
> Regards,
> Shamit