Hello Ron, thank you for your detailed answers!

For the Visitor pattern, I thought about it the other way around, so that 
operations visit the scheduler, and not vice-versa :) In this way operations 
can get the required information in order to be executed in a tailored way.

Thank you for your effort, but, as you say:
> furthermore, I think the current does not see the benefits of the time,
simpler instead of better, similar to the design of
CatalogModificationEvent[2] and CatalogModificationListener[3], the
developer only needs instanceof judgment.

In java, most of the times, `instanceof` is considered an anti-pattern, that's 
why I was also thinking about a command pattern (every operations defines an 
`execute` method). However, I also understand this part is not crucial for the 
FLIP under discussion, and the implementation details can simply wait for the 
PRs to come.

> After discussing with Shengkai offline, there is no need for this REST API
to support multiple tables to be refreshed at the same time, so it would be
more appropriate to put the materialized table identifier in the path of
the URL, thanks for the suggestion.

Very good!

Thank you!
On Apr 29, 2024 at 05:04 +0200, Ron Liu <ron9....@gmail.com>, wrote:
> Hi, Lorenzo
>
> > I have a question there: how can the gateway update the refreshHandler in
> the Catalog before getting it from the scheduler?
>
> The refreshHandler in CatalogMateriazedTable is null before getting it from
> the scheduler, you can look at the CatalogMaterializedTable.Builder[1] for
> more details.
>
> > You have a typo here: WorkflowScheudler -> WorkflowScheduler :)
>
> Fix it now, thanks very much.
>
> > For the operations part, I still think that the FLIP would benefit from
> providing a specific pattern for operations. You could either propose a
> command pattern [1] or a visitor pattern (where the scheduler visits the
> operation to get relevant info) [2] for those operations at your choice.
>
> Thank you for your input, I find it very useful. I tried to understand your
> thinking through code and implemented the following pseudo code using the
> visitor design pattern:
> 1. first defined WorkflowOperationVisitor, providing several overloaded
> visit methods.
>
> public interface WorkflowOperationVisitor {
>
> <T extends RefreshHandler> T visit(CreateWorkflowOperation
> createWorkflowOperation);
>
> void visit(ModifyWorkflowOperation operation);
> }
>
> 2. then in the WorkflowOperation add the accept method.
>
> @PublicEvolving
> public interface WorkflowOperation {
>
> void accept(WorkflowOperationVisitor visitor);
> }
>
>
> 3. in the WorkflowScheduler call the implementation class of
> WorkflowOperationVisitor, complete the corresponding operations.
>
> I recognize this design pattern purely from a code design point of view,
> but from the point of our specific scenario:
> 1. For CreateWorkflowOperation, the visit method needs to return
> RefreshHandler, for ModifyWorkflowOperation, such as suspend and resume,
> the visit method doesn't need to return RefreshHandler. parameter,
> currently for different WorkflowOperation, WorkflowOperationVisitor#accept
> can't be unified, so I think visitor may not be applicable here.
>
> 2. In addition, I think using the visitor pattern will add complexity to
> the WorkflowScheduler implementer, which needs to implement one more
> interface WorkflowOperationVisitor, this interface is not for the engine to
> use, so I don't see any benefit from this design at the moment.
>
> 3. furthermore, I think the current does not see the benefits of the time,
> simpler instead of better, similar to the design of
> CatalogModificationEvent[2] and CatalogModificationListener[3], the
> developer only needs instanceof judgment.
>
> To summarize, I don't think there is a need to introduce command or visitor
> pattern at present.
>
> > About the REST API, I will wait for your offline discussion :)
>
> After discussing with Shengkai offline, there is no need for this REST API
> to support multiple tables to be refreshed at the same time, so it would be
> more appropriate to put the materialized table identifier in the path of
> the URL, thanks for the suggestion.
>
> [1]
> https://github.com/apache/flink/blob/e412402ca4dfc438e28fb990dc53ea7809430aee/flink-table/flink-table-common/src/main/java/org/apache/flink/table/catalog/CatalogMaterializedTable.java#L264
> [2]
> https://github.com/apache/flink/blob/b1544e4e513d2b75b350c20dbb1c17a8232c22fd/flink-table/flink-table-api-java/src/main/java/org/apache/flink/table/catalog/listener/CatalogModificationEvent.java#L28
> [3]
> https://github.com/apache/flink/blob/b1544e4e513d2b75b350c20dbb1c17a8232c22fd/flink-table/flink-table-api-java/src/main/java/org/apache/flink/table/catalog/listener/CatalogModificationListener.java#L31
>
> Best,
> Ron
>
> Ron Liu <ron9....@gmail.com> 于2024年4月28日周日 23:53写道:
>
> > Hi, Shengkai
> >
> > Thanks for your feedback and suggestion, it looks very useful for this
> > proposal, regarding your question I made the following optimization:
> >
> > > > *WorkflowScheduler*
> > > > 1. How to get the exception details if `modifyRefreshWorkflow` fails?
> > > > 2. Could you give us an example about how to configure the scheduler?
> >
> > 1. Added a new WorkflowException, WorkflowScheduler's related method
> > signature will throw WorkflowException, when creating or modifying Workflow
> > encountered an exception, so that the framework will sense and deal with it.
> >
> > 2. Added a new Configuration section, introduced a new Option, and gave an
> > example of how to define the Scheduler in flink-conf.yaml.
> >
> > > > *SQL Gateway*
> > > > 1. SqlGatewayService requires Session as the input, but the REST API
> > doesn't need any Session information.
> > > > 2. Use "-" instead of "_" in the REST URI and camel case for fields in
> > request/response
> > > > 3. Do we need scheduleTime and scheduleTimeFormat together?
> >
> > 1. If it is designed as a synchronous API, it may lead to network jitter,
> > thread resource exhaustion and other problems, which I have not considered
> > before. The asynchronous API, although increasing the cost of use for the
> > user, is friendly to the SqlGatewayService, as well as the Client thread
> > resources. In summary as discussed offline, so I also tend to think that
> > all APIs of SqlGateway should be unified, and all should be asynchronous
> > APIs, and bound to session. I have updated the REST API section in FLIP.
> >
> > 2. thanks for the reminder, it has been updated
> >
> > 3. After rethinking, I think it can indeed be simpler, there is no need to
> > pass in a custom time format, scheduleTime can be unified to the SQL
> > standard timestamp format: 'yyyy-MM-dd HH:mm:ss', it is able to satisfy the
> > time related needs of materialized table.
> >
> > Based on your feedback, I have optimized and updated the FLIP related
> > section.
> >
> > Best,
> > Ron
> >
> >
> > Shengkai Fang <fskm...@gmail.com> 于2024年4月28日周日 15:47写道:
> >
> > > > Hi, Liu.
> > > >
> > > > Thanks for your proposal. I have some questions about the FLIP:
> > > >
> > > > *WorkflowScheduler*
> > > >
> > > > 1. How to get the exception details if `modifyRefreshWorkflow` fails?
> > > > 2. Could you give us an example about how to configure the scheduler?
> > > >
> > > > *SQL Gateway*
> > > >
> > > > 1. SqlGatewayService requires Session as the input, but the REST API
> > > > doesn't need any Session information.
> > > >
> > > > From the perspective of a gateway developer, I tend to unify the API of
> > > > the
> > > > SQL gateway, binding all concepts to the session. On the one hand, this
> > > > approach allows us to reduce maintenance and understanding costs, as we
> > > > only need to maintain one set of architecture to complete basic 
> > > > concepts.
> > > > On the other hand, the benefits of an asynchronous architecture are
> > > > evident: we maintain state on the server side. If the request is a long
> > > > connection, even in the face of network layer jitter, we can still find
> > > > the
> > > > original result through session and operation handles.
> > > >
> > > > Using asynchronous APIs may increase the development cost for users, but
> > > > from a platform perspective, if a request remains in a blocking state 
> > > > for
> > > > a
> > > > long time, it also becomes a burden on the platform's JVM. This is 
> > > > because
> > > > thread switching and maintenance require certain resources.
> > > >
> > > > 2. Use "-" instead of "_" in the REST URI and camel case for fields in
> > > > request/response
> > > >
> > > > Please follow the Flink REST Design.
> > > >
> > > > 3. Do we need scheduleTime and scheduleTimeFormat together?
> > > >
> > > > I think we can use SQL timestamp format or ISO timestamp format. It is 
> > > > not
> > > > necessary to pass time in any specific format.
> > > >
> > > > https://en.wikipedia.org/wiki/ISO_8601
> > > >
> > > > Best,
> > > > Shengkai
> > > >
> >

Reply via email to