[jira] [Created] (FLINK-23355) Use Flink SQL ,How to implement insert into seq to multi sinks?

2021-07-12 Thread Bo Wang (Jira)
Bo Wang created FLINK-23355:
---

 Summary: Use Flink SQL ,How to implement insert into seq to multi 
sinks?
 Key: FLINK-23355
 URL: https://issues.apache.org/jira/browse/FLINK-23355
 Project: Flink
  Issue Type: Technical Debt
  Components: Table SQL / API
Affects Versions: 1.12.0
Reporter: Bo Wang


I want to use Flink SQL to insert into multiple sinks sequentially. For 
example, I have one Kafka source table, source_table_one, and two sink 
tables: a database table sink_db_one and a Kafka table sink_kafka_two.
When an element arrives from source_table_one, the first stage should run 
INSERT INTO sink_db_one SELECT * FROM source_table_one; once that element 
has been inserted, the second stage should run INSERT INTO sink_kafka_two 
SELECT * FROM source_table_one.

A web system will listen to the sink_kafka_two topic and query the data 
from the sink_db_one table.

 

Does anyone have an idea how to implement this?
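A minimal Flink SQL sketch of the two inserts (table definitions omitted; the STATEMENT SET syntax shown is from newer Flink versions than the 1.12.0 this issue reports against):

```sql
-- Both inserts run in one job, but a statement set gives NO ordering
-- guarantee between the two sinks for an individual element.
BEGIN STATEMENT SET;
INSERT INTO sink_db_one SELECT * FROM source_table_one;
INSERT INTO sink_kafka_two SELECT * FROM source_table_one;
END;
```

One possible workaround, assuming the database supports change capture, is to chain the stages instead: let the second insert read from a CDC stream of sink_db_one rather than from source_table_one, so the Kafka topic only sees elements that have already reached the database.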



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [ANNOUNCE] Kete Young is now part of the Flink PMC

2019-07-23 Thread Bo WANG
Congratulations Kurt!


Best,

Bo WANG


On Tue, Jul 23, 2019 at 5:24 PM Robert Metzger  wrote:

> Hi all,
>
> On behalf of the Flink PMC, I'm happy to announce that Kete Young is now
> part of the Apache Flink Project Management Committee (PMC).
>
> Kete has been a committer since February 2017, working a lot on Table API /
> SQL. He's currently co-managing the 1.9 release! Thanks a lot for your work
> for Flink!
>
> Congratulations & Welcome Kurt!
>
> Best,
> Robert
>


Re: [ANNOUNCE] Zhijiang Wang has been added as a committer to the Flink project

2019-07-22 Thread Bo WANG
Congratulations Zhijiang!


Best,

Bo WANG


On Mon, Jul 22, 2019 at 10:12 PM Robert Metzger  wrote:

> Hey all,
>
> We've added another committer to the Flink project: Zhijiang Wang.
>
> Congratulations Zhijiang!
>
> Best,
> Robert
> (on behalf of the Flink PMC)
>


Re: [ANNOUNCE] Jincheng Sun is now part of the Flink PMC

2019-06-24 Thread Bo WANG
Congratulations Jincheng! Well deserved!

On Tue, Jun 25, 2019 at 10:10 AM Kurt Young  wrote:

> Congratulations Jincheng!
>
> Best,
> Kurt
>
>
> On Tue, Jun 25, 2019 at 9:56 AM LakeShen 
> wrote:
>
> > Congratulations! Jincheng Sun
> >
> > Best,
> > LakeShen
> >
> > Robert Metzger wrote on Mon, Jun 24, 2019 at 11:09 PM:
> >
> > > Hi all,
> > >
> > > On behalf of the Flink PMC, I'm happy to announce that Jincheng Sun is
> > now
> > > part of the Apache Flink Project Management Committee (PMC).
> > >
> > > Jincheng has been a committer since July 2017. He has been very active
> on
> > > Flink's Table API / SQL component, as well as helping with releases.
> > >
> > > Congratulations & Welcome Jincheng!
> > >
> > > Best,
> > > Robert
> > >
> >
>


Re: [DISCUSS] Adaptive Parallelism of Job Vertex

2019-04-17 Thread Bo WANG
Thanks Till for the comments.
We will implement a scheduler that supports adaptive parallelism on top of
the new scheduling framework. Based on those scheduling interfaces, we can
do the work in parallel.

On Tue, Apr 16, 2019 at 11:18 PM Till Rohrmann  wrote:

> Hi Bo Wang,
>
> thanks for proposing this design document. I think it is an interesting
> idea to improve Flink's execution efficiency.
>
> At the moment, the community is actively working on making Flink's
> scheduler pluggable. Once this is possible, we could try this feature out
> by implementing a scheduler which supports adaptive parallelism without
> affecting the existing code. I think this would be a nice approach to
> further evaluate and benchmark the implications of such a strategy. What do
> you think?
>
> Cheers,
> Till
>
> On Mon, Apr 8, 2019 at 10:28 AM Bo WANG  wrote:
>
> > Hi all,
> > In distribution computing system, execution parallelism is vital for both
> > resource efficiency and execution performance. In Flink, execution
> > parallelism is a pre-specified parameter, which is usually an empirical
> > value and thus might not be optimal on the various amount of data
> processed
> > by each task.
> >
> > Furthermore, a fixed parallelism cannot scale to varying data size, which
> > is common in production cluster, since we may not frequently change the
> > cluster configuration.
> >
> > Thus, we propose adaptively determine the execution parallelism of each
> > vertex at runtime based on the actual input data size and an ideal data
> > size processed by each task. The ideal data size is a pre-specified
> > parameter according to the property of the operator.
> >
> > The design doc is ready:
> >
> >
> https://docs.google.com/document/d/1ZxnoJ3SOxUk1PL2xC1t-kepq28Pg20IL6eVKUUOWSKY/edit?usp=sharing
> > ,
> > any comments are highly appreciated.
> >
>


[DISCUSS] Adaptive Parallelism of Job Vertex

2019-04-08 Thread Bo WANG
Hi all,
In distributed computing systems, execution parallelism is vital for both
resource efficiency and execution performance. In Flink, execution
parallelism is a pre-specified parameter, which is usually an empirical
value and thus might not be optimal for the varying amounts of data
processed by each task.

Furthermore, a fixed parallelism cannot scale with varying data sizes,
which are common in production clusters, since we may not frequently
change the cluster configuration.

Thus, we propose to adaptively determine the execution parallelism of each
vertex at runtime, based on the actual input data size and an ideal data
size to be processed by each task. The ideal data size is a pre-specified
parameter chosen according to the properties of the operator.

The design doc is ready:
https://docs.google.com/document/d/1ZxnoJ3SOxUk1PL2xC1t-kepq28Pg20IL6eVKUUOWSKY/edit?usp=sharing,
any comments are highly appreciated.
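The core sizing rule described above can be sketched as follows (the method and parameter names, and the clamping to [1, maxParallelism], are illustrative assumptions, not taken from the design doc):

```java
public final class AdaptiveParallelism {

    // Decide a vertex's parallelism from the actual input size and the
    // ideal data size per task, clamped to the range [1, maxParallelism].
    static int decideParallelism(long inputBytes, long idealBytesPerTask,
                                 int maxParallelism) {
        if (inputBytes <= 0) {
            return 1; // an empty input still needs one task
        }
        // ceiling division: how many tasks of the ideal size cover the input
        long tasks = (inputBytes + idealBytesPerTask - 1) / idealBytesPerTask;
        return (int) Math.min(maxParallelism, Math.max(1, tasks));
    }

    public static void main(String[] args) {
        // 10 GiB input with an ideal 128 MiB per task -> 80 tasks
        System.out.println(decideParallelism(10L << 30, 128L << 20, 1000));
    }
}
```

With this rule, a small input automatically collapses to few tasks while a large input fans out, without changing any cluster configuration.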


Re: [DISCUSS] Shall we make SpillableSubpartition repeatedly readable to support fine grained recovery

2019-01-27 Thread Bo WANG
[...] because the proposed extension is actually necessary for proper batch
> > > fault tolerance (independent of the DataSet or Query Processor stack).
> > >
> > > I am adding Kurt to this thread - maybe he help us find that out.
> > >
> > > On Thu, Jan 24, 2019 at 2:36 PM Piotr Nowojski 
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > I’m not sure how much effort we will be willing to invest in the
> > existing
> > > > batch stack. We are currently focusing on the support of bounded
> > > > DataStreams (already done in Blink and will be merged to Flink soon)
> > and
> > > > unifing batch & stream under DataStream API.
> > > >
> > > > Piotrek
> > > >
> > > > > On 23 Jan 2019, at 04:45, Bo WANG  wrote:
> > > > >
> > > > > Hi all,
> > > > >
> > > > > When running the batch WordCount example,  I configured the job
> > > execution
> > > > > mode
> > > > > as BATCH_FORCED, and failover-strategy as region, I manually injected
> > > > some
> > > > > errors to let the execution fail in different phases. In some cases,
> > > the
> > > > > job could
> > > > > recovery from failover and became succeed, but in some cases, the job
> > > > > retried
> > > > > several times and failed.
> > > > >
> > > > > Example:
> > > > > - If the failure occurred before task read data, e.g., failed before
> > > > > invokable.invoke() in Task.java, failover could succeed.
> > > > > - If the failure occurred after task having read data, failover did
> > not
> > > > > work.
> > > > >
> > > > > Problem diagnose:
> > > > > Running the example described before, each ExecutionVertex is defined
> > > as
> > > > > a restart region, and the ResultPartitionType between executions is
> > > > > BLOCKING.
> > > > > Thus, SpillableSubpartition and SpillableSubpartitionView are used to
> > > > > write/read
> > > > > shuffle data, and data blocks are described as BufferConsumers stored
> > > in
> > > > a
> > > > > list
> > > > > called buffers, when task requires input data from
> > > > > SpillableSubpartitionView,
> > > > > BufferConsumers are REMOVED from buffers. Thus, when failures
> > occurred
> > > > > after having read data, some BufferConsumers have already released.
> > > > > Although tasks retried, the input data is incomplete.
> > > > >
> > > > > Fix Proposal:
> > > > > - BufferConsumer should not be removed from buffers until the
> > consumed
> > > > > ExecutionVertex is terminal.
> > > > > - SpillableSubpartition should not be released until the consumed
> > > > > ExecutionVertex is terminal.
> > > > > - SpillableSubpartition could creates multi
> > SpillableSubpartitionViews,
> > > > > each of which is corresponding to a ExecutionAttempt.
> > > > >
> > > > > Best,
> > > > > Bo
> > > >
> > > >
> > >
> > >
> >


[DISCUSS] Shall we make SpillableSubpartition repeatedly readable to support fine grained recovery

2019-01-22 Thread Bo WANG
Hi all,

When running the batch WordCount example, I configured the job execution
mode as BATCH_FORCED and the failover strategy as region. I manually
injected some errors to make the execution fail in different phases. In
some cases, the job could recover from the failover and succeed, but in
other cases, the job retried several times and then failed.

Example:
- If the failure occurred before the task read any data, e.g., before
invokable.invoke() in Task.java, failover could succeed.
- If the failure occurred after the task had read data, failover did not
work.

Problem diagnosis:
Running the example described above, each ExecutionVertex is defined as a
restart region, and the ResultPartitionType between executions is BLOCKING.
Thus, SpillableSubpartition and SpillableSubpartitionView are used to
write/read shuffle data, and data blocks are described as BufferConsumers
stored in a list called buffers. When a task requests input data from the
SpillableSubpartitionView, BufferConsumers are REMOVED from buffers. Thus,
when a failure occurred after data had been read, some BufferConsumers had
already been released. Although the tasks retried, the input data was
incomplete.

Fix proposal:
- A BufferConsumer should not be removed from buffers until the consuming
ExecutionVertex is terminal.
- A SpillableSubpartition should not be released until the consuming
ExecutionVertex is terminal.
- A SpillableSubpartition could create multiple SpillableSubpartitionViews,
each corresponding to one ExecutionAttempt.

Best,
Bo
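The essence of the proposed fix can be illustrated with a minimal sketch (this is not actual Flink code; class and method names are invented for illustration): buffers stay in the shared list, and each view keeps its own read position, so a restarted execution attempt can simply open a new view and re-read from the beginning.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration of a repeatedly readable subpartition: reading through a
// view never removes buffers, so a retried attempt can replay all data.
final class RepeatableSubpartition {
    private final List<String> buffers = new ArrayList<>();

    void add(String buffer) {
        buffers.add(buffer);
    }

    // One view per ExecutionAttempt; creating a view does not consume data.
    View createView() {
        return new View();
    }

    final class View {
        private int next = 0; // per-view read position

        // Returns the next buffer for this view, or null when exhausted.
        String poll() {
            return next < buffers.size() ? buffers.get(next++) : null;
        }
    }
}
```

In this scheme the subpartition itself (and its buffers) would only be released once the consuming ExecutionVertex is terminal, matching the proposal above.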


Apply for flink contributor permission

2018-12-24 Thread Bo WANG
Hi everyone,

We would like to contribute to JIRA (FLINK-10644). Would anyone
kindly give me and my team member contributor permissions?
Our JIRA IDs: Ryantaocer, eaglewatcher.

Best regards
Bo