I have drafted the communication messages between the master node of core
and the worker node of core as below,


RPC: registDQWorkerNode
  Desc: When a DQ worker node starts, it registers itself with the master so
        that the master can submit tasks to the worker node.
  Request: String hostName
  Response: int code
  Resp Desc:
    200: the master returns 200 and will then assign tasks to the node
    other: registration failed; the worker retries regularly until the
           master returns 200

RPC: reportDQWorkNodeStatus
  Desc: The worker node reports its own status regularly, including the ID
        lists of running, waiting, successful, and failed tasks, etc.
  Request: List<Integer> runningIdList
           List<Integer> waittingIdList
           List<Integer> successIdList
           List<Integer> failedIdList
  Response: int code
  Resp Desc:
    200: the master returns 200 and will then assign tasks to the node

RPC: submitDQTask
  Desc: The master can submit tasks to the worker node.
  Request: int instanceId
  Response: int code
  Resp Desc:
    200: the worker accepts the task; the task's status can then be queried,
         or the task can be killed, on this node
    5XX: the task was rejected by the node because an error occurred in the
         node server
    4XX: the task was rejected by the node because the request is invalid,
         e.g. no task info was found for the instanceId

RPC: stopDQTask
  Desc: The master can stop tasks on the worker node.
  Request: int instanceId
  Response: int code
  Resp Desc:
    200: the worker processed the request successfully
    5XX: the request was rejected by the node because an error occurred in
         the node server
    4XX: the request was rejected by the node because the request is invalid,
         e.g. no task info was found on the node for the instanceId

RPC: querySingleDQTask
  Desc: The master can query a DQ task's status from the worker node.
  Request: int instanceId
  Response: int code
            int status
  Resp Desc:
    200: the worker processed the request successfully; the value of status
         then describes the task's status:
         0: init  1: waiting  2: recording  3: evaluating  4: alerting
         5: success  6: failed
    5XX: the request was rejected by the node because an error occurred in
         the node server
    4XX: the request was rejected by the node because the request is invalid,
         e.g. no task info was found on the node for the instanceId

RPC: nodeHeartBeat
  Desc: The heartbeat message is used to confirm that the worker is alive,
        so that tasks can be assigned to the node.
  Request: long timestamp
  Response: int code
  Resp Desc:
    200: the master accepts it and updates the worker info successfully

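To make the response handling above concrete, here is a minimal Java sketch
of how a worker might interpret these messages. It is only an illustration
under assumptions, not the agreed implementation: the class name
DqWorkerProtocolSketch, the enum names, the Outcome classification, and the
maxAttempts / retryDelayMillis parameters are all hypothetical, and
registerCall stands in for the real registDQWorkerNode RPC. The draft itself
says the worker retries until the master returns 200; the attempt limit is
added here only so the sketch terminates.

```java
import java.util.function.IntSupplier;

public class DqWorkerProtocolSketch {

    /** Task status values in the order listed for querySingleDQTask (0..6). */
    enum DqTaskStatus {
        INIT, WAITING, RECORDING, EVALUATING, ALERTING, SUCCESS, FAILED;

        /** Maps the integer "status" field to the enum; ordinals follow the draft. */
        static DqTaskStatus fromCode(int code) {
            DqTaskStatus[] all = values();
            if (code < 0 || code >= all.length) {
                throw new IllegalArgumentException("unknown status code: " + code);
            }
            return all[code];
        }
    }

    /** Coarse classification of the "code" field (hypothetical helper). */
    enum Outcome { OK, BAD_REQUEST, NODE_ERROR, UNKNOWN }

    static Outcome classify(int code) {
        if (code == 200) return Outcome.OK;
        if (code >= 400 && code < 500) return Outcome.BAD_REQUEST; // 4XX
        if (code >= 500 && code < 600) return Outcome.NODE_ERROR;  // 5XX
        return Outcome.UNKNOWN;
    }

    /**
     * Calls the registration RPC until it returns 200 and reports how many
     * attempts were needed. registerCall is a stand-in for registDQWorkerNode.
     */
    static int registerWithRetry(IntSupplier registerCall, int maxAttempts,
                                 long retryDelayMillis) throws InterruptedException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (registerCall.getAsInt() == 200) {
                return attempt;
            }
            Thread.sleep(retryDelayMillis); // wait before the next attempt
        }
        throw new IllegalStateException(
                "registration not accepted after " + maxAttempts + " attempts");
    }

    public static void main(String[] args) throws InterruptedException {
        // Fake master: rejects the first two registrations, then returns 200.
        int[] calls = {0};
        IntSupplier fakeMaster = () -> (++calls[0] < 3) ? 500 : 200;

        int attempts = registerWithRetry(fakeMaster, 5, 10L);
        System.out.println("registered after " + attempts + " attempts");
        System.out.println(classify(404) + " " + classify(503));
        System.out.println(DqTaskStatus.fromCode(2));
    }
}
```

Keeping the status as an enum on both sides, with the integer only on the
wire, would avoid magic numbers in the master and worker code.
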
Could you please review it?

On Sun, Oct 23, 2022 at 3:22 PM Warden Wang <wangd95...@gmail.com> wrote:

> Hi all
>
> I have modified the communication messages between core and dispatcher
> and added a description of code/value in the Response definition, as
> below:
>
> RPC: submitSql
>   Desc: The core module submits the record SQL to the dispatcher.
>   Request:
>     SubmitRequest {
>          String recordSql;
>          Enum engine;  // Spark, Hive, Presto, etc.
>          String owner;
>          Integer maxRetryCount;
>     }
>   Response:
>     SubmitResponse {
>          Integer code;
>          String jobId;
>          Enum errorCode;
>          Exception ex;
>     }
>
> RPC: getJobStatus
>   Desc: The core module gets the status of a job from the dispatcher by
>         jobId.
>   Request:
>     JobStatusRequest {
>          String jobId;
>     }
>   Response:
>     JobStatusResponse {
>          Integer code;
>          Enum jobStatus;
>          Enum errorCode;
>          Exception ex;
>     }
>
> RPC: getMetricResult
>   Desc: The core module gets the result of the recordSql from the
>         dispatcher by jobId.
>   Request:
>     MetricRequest {
>          String jobId;
>     }
>   Response:
>     MetricResponse {
>          Integer code;
>          Double metric;
>          Enum errorCode;
>          Exception ex;
>     }
>
> RPC: validateSQL
>   Desc: The core module submits the record SQL to the dispatcher to
>         validate its syntax.
>   Request:
>     ValidateSQLRequest {
>          String recordSql;
>          Enum engine;  // Spark, Hive, Presto, etc.
>     }
>   Response:
>     ValidateSQLResponse {
>          Integer code;
>          Enum errorCode;
>          Exception ex;
>     }
>
> Code:
>   200  The dispatcher accepted the request and processed it successfully
>   400  The request was rejected by the dispatcher because it is a bad
>        request
>   500  The dispatcher accepted the request but processing failed
>
> ErrorCode:
>   0  recordSql syntax error
>   1  internal error: the dispatcher itself crashed
>   2  external error: the target engine crashed when called by the
>      dispatcher, etc.
>
> JobStatus:
>   0  ACCEPTED
>   1  RUNNING
>   2  SUCCESS
>   3  FAILED
>
> On Thu, Oct 20, 2022 at 10:52 PM Eugene Law <liu...@apache.org> wrote:
>
>> Hi,
>>
>> Can you give a more detailed description about code/value in Response
>> definition?
>>
>> RecordResponse {
>>  Integer code;
>>  Long value;
>>  Enum errorCode; // if code != 200, please tell us what happened:
>>      //    1、 recordSql syntax error
>>      //    2、 internal error, dispatcher self is crashed
>>      //    3、 external error, target engine is crashed when dispatcher call, etc.
>>  Exception ex;   // error detail info
>> }
>>
>> Thx
>> Eugene
>>
>> On Tue, Oct 18, 2022 at 10:19 PM William Guo <gu...@apache.org> wrote:
>>
>> > Hi Warden,
>> >
>> > I think type long is not sufficient for metrics; if it is a number,
>> > type double would be more appropriate.
>> >
>> > Thanks,
>> > William
>> >
>> > On Mon, Oct 17, 2022 at 10:55 PM Warden Wang <wangd95...@gmail.com> wrote:
>> >
>> > > Hi Jianhua
>> > > For metrics, a response of type long is enough.
>> > >
>> > > Thanks
>> > > Warden
>> > >
>> > > On Thu, Oct 13, 2022 at 9:16 PM Eugene Law <liu...@apache.org> wrote:
>> > >
>> > > > +1
>> > > >
>> > > > Let's clarify this. If the Griffin engine is to be integrated into an
>> > > > external system, we need to consider a message protocol, such as
>> > > > protobuf or thrift.
>> > > >
>> > > > On the other hand, if the Griffin engine is to connect to various data
>> > > > engines, we only need to outline the data transfer structure.
>> > > >
>> > > > Thx
>> > > >
>> > > > On Thu, Oct 13, 2022 at 2:31 PM William Guo <gu...@apache.org> wrote:
>> > > >
>> > > > > hi,
>> > > > >
>> > > > > My opinion is,
>> > > > >
>> > > > > for external systems integrating with Apache Griffin,
>> > > > > we will keep the protocol open, since we need to consider
>> > > > > integration efforts.
>> > > > >
>> > > > > but for internal components,
>> > > > > say, to connect with different query engines (hive, spark, flink),
>> > > > > we prefer JDBC, which is a very typical way to connect to different
>> > > > > engines as a hub.
>> > > > >
>> > > > > What do you think?
>> > > > >
>> > > > > Thanks,
>> > > > > William
>> > > > >
>> > > > > On Thu, Oct 13, 2022 at 12:16 PM Edgar Joya <euj...@gmail.com> wrote:
>> > > > >
>> > > > > > Will you be using proto files for grpc for the new architecture?
>> > > > > > https://grpc.io/
>> > > > > >
>> > > > > > On Wed, Oct 12, 2022 at 10:08 PM William Guo <gu...@apache.org> wrote:
>> > > > > >
>> > > > > > > Hi jianhua,
>> > > > > > >
>> > > > > > > We cannot see the architecture diagram in the wiki
>> > > > > > > https://cwiki.apache.org/confluence/display/GRIFFIN/Components+Diagram
>> > > > > > > But we can see the sequence diagram there.
>> > > > > > >
>> > > > > > > Could you rephrase it ?
>> > > > > > >
>> > > > > > > And as Eugene said, we need to make it clear which protocol we are
>> > > > > > > using between different components.
>> > > > > > >
>> > > > > > > Thanks,
>> > > > > > > William
>> > > > > > >
>> > > > > > >
>> > > > > > >
>> > > > > > > On Wed, Oct 12, 2022 at 9:57 PM Eugene Law <liu...@apache.org> wrote:
>> > > > > > >
>> > > > > > > > Hi,
>> > > > > > > >
>> > > > > > > > Could you point out what kind of protocol definition schema you
>> > > > > > > > used in the interface description? I think we should know the
>> > > > > > > > exact range and type of each field in different languages.
>> > > > > > > >
>> > > > > > > > Thx
>> > > > > > > >
>> > > > > > > > On Wed, Oct 12, 2022 at 7:48 PM William Guo <gu...@apache.org> wrote:
>> > > > > > > >
>> > > > > > > > > Thanks jianhua.
>> > > > > > > > >
>> > > > > > > > > Could you draw the sequence diagram in the wiki
>> > > > > > > > > https://cwiki.apache.org/confluence/display/GRIFFIN/Components+Diagram
>> > > > > > > > >
>> > > > > > > > > We cannot see your attachments in email.
>> > > > > > > > >
>> > > > > > > > > Thanks,
>> > > > > > > > > William
>> > > > > > > > >
>> > > > > > > > > On Wed, Oct 12, 2022 at 5:58 PM jianhua guo <guojhk...@gmail.com> wrote:
>> > > > > > > > >
>> > > > > > > > > > The architecture of the core module and the dispatcher is below:
>> > > > > > > > > > [image: image.png]
>> > > > > > > > > > 1. The core module generates the DQ SQL according to the syntax
>> > > > > > > > > > features of the different engines and wraps it in the submit request.
>> > > > > > > > > > 2. The dispatcher provides a RESTful API to accept requests from the
>> > > > > > > > > > core server.
>> > > > > > > > > > 3. After receiving the request, the dispatcher submits the query to
>> > > > > > > > > > the specified engine (presto, spark, hive, flink).
>> > > > > > > > > > 4. The dispatcher needs to provide a method to the core module
>> > > > > > > > > > ("execute" would be a nice name). This method keeps getting the job
>> > > > > > > > > > status from the dispatcher; when the job is finished, it fetches the
>> > > > > > > > > > result from the dispatcher and returns the response to the core module.
>> > > > > > > > > > The sequence diagram is below.
>> > > > > > > > > > [image: image.png]
>> > > > > > > > > > [image: griffinDispatcher1.png]
>> > > > > > > > > > [image: image.png]
>> > > > > > > > > > Thanks for your review.
>> > > > > > > > > >
>> > > > > > > > > > On Wed, Oct 12, 2022 at 5:09 PM jianhua guo <guojhk...@gmail.com> wrote:
>> > > > > > > > > >
>> > > > > > > > > >> Most looks good to me. Just one point of confusion: does
>> > > > > > > > > >> "RecordResponse" only return a long value as the result?
>> > > > > > > > > >>
>> > > > > > > > > >> On Wed, Oct 12, 2022 at 1:22 PM William Guo <gu...@apache.org> wrote:
>> > > > > > > > > >>
>> > > > > > > > > >>> hi all,
>> > > > > > > > > >>>
>> > > > > > > > > >>> The following message is from Warden; we can discuss the interaction
>> > > > > > > > > >>> between the core and dispatcher modules here.
>> > > > > > > > > >>> ====================
>> > > > > > > > > >>> I have drafted the communication messages between core and dispatcher
>> > > > > > > > > >>> as below,
>> > > > > > > > > >>>
>> > > > > > > > > >>>
>> > > > > > > > > >>> // submit sql
>> > > > > > > > > >>> execute(RecordRequest) return RecordResponse;
>> > > > > > > > > >>>
>> > > > > > > > > >>> RecordRequest {
>> > > > > > > > > >>>  String recordSql;
>> > > > > > > > > >>>  Enum engine;  // Spark,Hive,Presto,etc.
>> > > > > > > > > >>>  String owner;
>> > > > > > > > > >>>  Integer maxRetryCount;
>> > > > > > > > > >>> }
>> > > > > > > > > >>>
>> > > > > > > > > >>> RecordResponse {
>> > > > > > > > > >>>  Integer code;
>> > > > > > > > > >>>  Long value;
>> > > > > > > > > >>>  Enum errorCode; // if code != 200, please tell us what happened:
>> > > > > > > > > >>>      //    1、 recordSql syntax error
>> > > > > > > > > >>>      //    2、 internal error, dispatcher self is crashed
>> > > > > > > > > >>>      //    3、 external error, target engine is crashed when dispatcher call, etc.
>> > > > > > > > > >>>  Exception ex;   // error detail info
>> > > > > > > > > >>> }
>> > > > > > > > > >>>
>> > > > > > > > > >>> // validate sql syntax
>> > > > > > > > > >>> validateSQL(ValidateSQLRequest) return ValidateSQLResponse;
>> > > > > > > > > >>>
>> > > > > > > > > >>> ValidateSQLRequest {
>> > > > > > > > > >>>  String recordSql;
>> > > > > > > > > >>>  Enum engine;  // Spark,Hive,Presto,etc.
>> > > > > > > > > >>> }
>> > > > > > > > > >>>
>> > > > > > > > > >>> ValidateSQLResponse {
>> > > > > > > > > >>>  Integer code;
>> > > > > > > > > >>>  Enum errorCode; // if code != 200, please tell us what happened:
>> > > > > > > > > >>>      //    1、 recordSql syntax error
>> > > > > > > > > >>>      //    2、 internal error, dispatcher self is crashed
>> > > > > > > > > >>>      //    3、 external error, target engine is crashed when dispatcher call, etc.
>> > > > > > > > > >>>  Exception ex;   // error detail info
>> > > > > > > > > >>> }
>> > > > > > > > > >>>
>> > > > > > > > > >>>
>> > > > > > > > > >>>
>> > > > > > > > > >>> Could you please review it? Give us your feedback.
>> > > > > > > > > >>>
>> > > > > > > > > >>> ==================
>> > > > > > > > > >>>
>> > > > > > > > > >>
>> > > > > > > > >
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > > >
>> > > > > > --
>> > > > > > Edgar Joya
>> > > > > > 980.259.0683
>> > > > > > euj...@gmail.com
>> > > > > > @eujc21
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>>
>
