Hi all,

Warden has reshaped the document in the wiki:
https://cwiki.apache.org/confluence/display/GRIFFIN/Components+Diagram

We can review it from there.

Thanks,
William

On Sun, Oct 23, 2022 at 3:28 PM Warden Wang <wangd95...@gmail.com> wrote:

> I have drafted the communication messages between the master node of core
> and the worker node of core as below.
>
> registDQWorkerNode
>   Desc:      When a DQ worker node starts, it registers itself with the
>              master so that the master can submit tasks to it.
>   Request:   String hostName
>   Response:  int code
>   Resp Desc: 200: the master returns 200 and will then assign tasks to
>                   the node.
>              other: registration failed; the worker retries regularly
>                   until the master returns 200.
>
> reportDQWorkNodeStatus
>   Desc:      The worker node reports its own status regularly, including
>              the id lists of running, waiting, successful, and failed
>              tasks, etc.
>   Request:   List<Integer> runningIdList
>              List<Integer> waittingIdList
>              List<Integer> successIdList
>              List<Integer> failedIdList
>   Response:  int code
>   Resp Desc: 200: the master returns 200 and will then assign tasks to
>                   the node.
>
> submitDQTask
>   Desc:      The master can submit tasks to the worker node.
>   Request:   int instanceId
>   Response:  int code
>   Resp Desc: 200: the worker accepts the task; the task's status can then
>                   be queried, or the task killed, on this node.
>              5XX: the task was rejected by the node because an error
>                   happened in the node server.
>              4XX: the task was rejected by the node because the request
>                   is invalid, e.g. no task info found for the instanceId.
>
> stopDQTask
>   Desc:      The master can stop tasks on the worker node.
>   Request:   int instanceId
>   Response:  int code
>   Resp Desc: 200: the worker processed the request successfully.
>              5XX: the request was rejected by the node because an error
>                   happened in the node server.
>              4XX: the request was rejected by the node because the
>                   request is invalid, e.g. no task info found on the node
>                   for the instanceId.
>
> querySingleDQTask
>   Desc:      The master can query a DQ task's status from the worker node.
>   Request:   int instanceId
>   Response:  int code
>              int status
>   Resp Desc: 200: the worker processed the request successfully; the
>                   value of status then describes the task's status:
>                   0: init        1: waiting     2: recording
>                   3: evaluating  4: alerting    5: success
>                   6: failed
>              5XX: the request was rejected by the node because an error
>                   happened in the node server.
>              4XX: the request was rejected by the node because the
>                   request is invalid, e.g. no task info found on the node
>                   for the instanceId.
>
> nodeHeartBeat
>   Desc:      The heartbeat message is used to confirm that the worker is
>              alive, so that tasks can be assigned to the node.
>   Request:   long timestamp
>   Response:  int code
>   Resp Desc: 200: the master accepts it and updates the worker info
>                   successfully.
>
> Could you please review it?
>
> On Sun, Oct 23, 2022 at 3:22 PM Warden Wang <wangd95...@gmail.com> wrote:
>
> > Hi all,
> >
> > I have modified the communication messages between core and dispatcher,
> > and added a description of code/value in the Response definition, as
> > below.
> >
> > submitSql
> >   Desc: The core module submits the record SQL to the dispatcher.
> >   Request:
> >     SubmitRequest {
> >       String recordSql;
> >       Enum engine; // Spark,Hive,Presto,etc.
> >       String owner;
> >       Integer maxRetryCount;
> >     }
> >   Response:
> >     SubmitResponse {
> >       Integer code;
> >       String jobId;
> >       Enum errorCode;
> >       Exception ex;
> >     }
> >
> > getJobStatus
> >   Desc: The core module gets the status of a job from the dispatcher
> >         by jobId.
> >   Request:
> >     JobStatusRequest {
> >       String jobId;
> >     }
> >   Response:
> >     JobStatusResponse {
> >       Integer code;
> >       Enum jobStatus;
> >       Enum errorCode;
> >       Exception ex;
> >     }
> >
> > getMetricResult
> >   Desc: The core module gets the result of the recordSql from the
> >         dispatcher by jobId.
> >   Request:
> >     MetricRequest {
> >       String jobId;
> >     }
> >   Response:
> >     MetricResponse {
> >       Integer code;
> >       Double metric;
> >       Enum errorCode;
> >       Exception ex;
> >     }
> >
> > validateSQL
> >   Desc: The core module submits the record SQL to the dispatcher to
> >         validate its syntax.
> >   Request:
> >     ValidateSQLRequest {
> >       String recordSql;
> >       Enum engine; // Spark,Hive,Presto,etc.
> >     }
> >   Response:
> >     ValidateSQLResponse {
> >       Integer code;
> >       Enum errorCode;
> >       Exception ex;
> >     }
> >
> > Code
> >   200  The dispatcher accepted the request and processed it successfully
> >   400  The request was rejected by the dispatcher because it is a bad
> >        request
> >   500  The dispatcher accepted the request but processing failed
> >
> > ErrorCode
> >   0  recordSql syntax error
> >   1  internal error: the dispatcher itself crashed
> >   2  external error: the target engine crashed when the dispatcher
> >      called it, etc.
> >
> > JobStatus
> >   0  ACCEPTED
> >   1  RUNNING
> >   2  SUCCESS
> >   3  FAILED
> >
> > On Thu, Oct 20, 2022 at 10:52 PM Eugene Law <liu...@apache.org> wrote:
> >
> > > Hi,
> > >
> > > Can you give a more detailed description about code/value in Response
> > > definition?
> > >
> > > RecordResponse {
> > >   Integer code;
> > >   Long value;
> > >   Enum errorCode; // if code != 200, please tell us what happened:
> > >                   //   1. recordSql syntax error
> > >                   //   2. internal error: the dispatcher itself crashed
> > >                   //   3. external error: the target engine crashed
> > >                   //      when the dispatcher called it, etc.
> > >   Exception ex;   // error detail info
> > > }
> > >
> > > Thx
> > > Eugene
> > >
> > > On Tue, Oct 18, 2022 at 10:19 PM William Guo <gu...@apache.org> wrote:
> > >
> > > > Hi Warden,
> > > >
> > > > I think type long is not sufficient for metrics; if the metric is a
> > > > number, type double should be more appropriate.
> > > >
> > > > Thanks,
> > > > William
> > > >
> > > > On Mon, Oct 17, 2022 at 10:55 PM Warden Wang <wangd95...@gmail.com> wrote:
> > > >
> > > > > Hi Jianhua,
> > > > >
> > > > > For metrics, a response of type long is enough.
> > > > >
> > > > > Thanks
> > > > > Warden
> > > > >
> > > > > On Thu, Oct 13, 2022 at 9:16 PM Eugene Law <liu...@apache.org> wrote:
> > > > >
> > > > > > +1
> > > > > >
> > > > > > Let's clarify this. If the Griffin engine is to be integrated
> > > > > > into an external system, we need to consider a message protocol
> > > > > > such as protobuf or thrift.
> > > > > >
> > > > > > On the other hand, if the Griffin engine is to connect to various
> > > > > > data engines, we only need to outline the data transfer structure.
> > > > > >
> > > > > > Thx
> > > > > >
> > > > > > On Thu, Oct 13, 2022 at 2:31 PM William Guo <gu...@apache.org> wrote:
> > > > > >
> > > > > > > hi,
> > > > > > >
> > > > > > > My opinion is:
> > > > > > >
> > > > > > > For external systems integrating with Apache Griffin, we will
> > > > > > > keep the protocol open, since we need to consider the
> > > > > > > integration effort.
> > > > > > >
> > > > > > > But for internal components, say, connecting to different query
> > > > > > > engines (hive, spark, flink), we prefer JDBC, which is a very
> > > > > > > typical way to connect to different engines as a hub.
> > > > > > >
> > > > > > > What do you think?
> > > > > > >
> > > > > > > Thanks,
> > > > > > > William
> > > > > > >
> > > > > > > On Thu, Oct 13, 2022 at 12:16 PM Edgar Joya <euj...@gmail.com> wrote:
> > > > > > >
> > > > > > > > Will you be using proto files for gRPC for the new
> > > > > > > > architecture?
> > > > > > > > https://grpc.io/
> > > > > > > >
> > > > > > > > On Wed, Oct 12, 2022 at 10:08 PM William Guo <gu...@apache.org> wrote:
> > > > > > > >
> > > > > > > > > Hi jianhua,
> > > > > > > > >
> > > > > > > > > We cannot see the architecture diagram in the wiki:
> > > > > > > > > https://cwiki.apache.org/confluence/display/GRIFFIN/Components+Diagram
> > > > > > > > > But we can see the sequence diagram there.
> > > > > > > > >
> > > > > > > > > Could you rephrase it?
> > > > > > > > >
> > > > > > > > > And as Eugene said, we need to make it clear which protocol
> > > > > > > > > we are using between different components.
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > William
> > > > > > > > >
> > > > > > > > > On Wed, Oct 12, 2022 at 9:57 PM Eugene Law <liu...@apache.org> wrote:
> > > > > > > > >
> > > > > > > > > > Hi,
> > > > > > > > > >
> > > > > > > > > > Could you point out what kind of protocol definition
> > > > > > > > > > schema you used in the interface description? I think we
> > > > > > > > > > should know the exact range and type of each field in
> > > > > > > > > > different languages.
> > > > > > > > > >
> > > > > > > > > > Thx
> > > > > > > > > >
> > > > > > > > > > On Wed, Oct 12, 2022 at 7:48 PM William Guo <gu...@apache.org> wrote:
> > > > > > > > > >
> > > > > > > > > > > Thanks jianhua.
> > > > > > > > > > >
> > > > > > > > > > > Could you draw the sequence diagram in the wiki?
> > > > > > > > > > >
> > > > > > > > > > > https://cwiki.apache.org/confluence/display/GRIFFIN/Components+Diagram
> > > > > > > > > > >
> > > > > > > > > > > We cannot see your attachments in email.
> > > > > > > > > > >
> > > > > > > > > > > Thanks,
> > > > > > > > > > > William
> > > > > > > > > > >
> > > > > > > > > > > On Wed, Oct 12, 2022 at 5:58 PM jianhua guo <guojhk...@gmail.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > The architecture of the core module and the dispatcher is below:
> > > > > > > > > > > > [image: image.png]
> > > > > > > > > > > > 1. The core module generates the DQ SQL according to the syntax
> > > > > > > > > > > >    features of each engine and wraps it in the submit request.
> > > > > > > > > > > > 2. The dispatcher provides a RESTful API to accept requests from
> > > > > > > > > > > >    the core server.
> > > > > > > > > > > > 3. After receiving the request info, the dispatcher submits the
> > > > > > > > > > > >    query to the specified engine (presto, spark, hive, flink).
> > > > > > > > > > > > 4. The dispatcher needs to provide a method to the core module;
> > > > > > > > > > > >    "execute" would be a nice name. Inside the method, it keeps
> > > > > > > > > > > >    polling the job status from the dispatcher; when the job is
> > > > > > > > > > > >    finished, it fetches the result from the dispatcher and then
> > > > > > > > > > > >    returns the response to the core module.
> > > > > > > > > > > > The sequence diagram is below.
> > > > > > > > > > > > [image: image.png]
> > > > > > > > > > > > [image: griffinDispatcher1.png]
> > > > > > > > > > > > [image: image.png]
> > > > > > > > > > > > Thanks for your review.
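
Step 4 above describes a blocking "execute" call: submit the SQL, poll the job status, and fetch the result once the job has finished. A minimal Java sketch of that loop follows, borrowing the submitSql, getJobStatus, and getMetricResult operation names proposed elsewhere in this thread. The `Dispatcher` interface, the `FakeDispatcher` stand-in, the method signatures, and the polling parameters are all assumptions made for illustration, not Griffin's actual API.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical names throughout: an illustrative sketch, not Griffin code.
final class DispatcherExecuteSketch {

    // Job states as listed in the thread: 0 ACCEPTED, 1 RUNNING, 2 SUCCESS, 3 FAILED.
    enum JobStatus { ACCEPTED, RUNNING, SUCCESS, FAILED }

    // Assumed client-side view of the three dispatcher operations.
    interface Dispatcher {
        String submitSql(String recordSql, String engine);
        JobStatus getJobStatus(String jobId);
        double getMetricResult(String jobId);
    }

    // Blocking "execute": submit, poll until a terminal state, then fetch the metric.
    static double execute(Dispatcher dispatcher, String recordSql, String engine,
                          long pollMillis, int maxPolls) {
        String jobId = dispatcher.submitSql(recordSql, engine);
        for (int i = 0; i < maxPolls; i++) {
            JobStatus status = dispatcher.getJobStatus(jobId);
            if (status == JobStatus.SUCCESS) {
                return dispatcher.getMetricResult(jobId);
            }
            if (status == JobStatus.FAILED) {
                throw new IllegalStateException("job " + jobId + " failed");
            }
            try {
                Thread.sleep(pollMillis); // job is still ACCEPTED or RUNNING
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for job " + jobId, e);
            }
        }
        throw new IllegalStateException("job " + jobId + " not finished after " + maxPolls + " polls");
    }

    // In-memory stand-in for a dispatcher: reports RUNNING twice, then SUCCESS.
    static final class FakeDispatcher implements Dispatcher {
        private final AtomicInteger polls = new AtomicInteger();

        public String submitSql(String recordSql, String engine) {
            return "job-1";
        }

        public JobStatus getJobStatus(String jobId) {
            return polls.incrementAndGet() <= 2 ? JobStatus.RUNNING : JobStatus.SUCCESS;
        }

        public double getMetricResult(String jobId) {
            return 42.0;
        }
    }

    public static void main(String[] args) {
        double metric = execute(new FakeDispatcher(), "SELECT COUNT(*) FROM t", "Spark", 1L, 10);
        System.out.println(metric); // prints 42.0
    }
}
```

The poll limit is included because an unbounded loop would hang the core module if a job never reached a terminal state; how many retries are appropriate, and whether maxRetryCount from the request should drive them, is left open in the thread.
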
> > > > > > > > > > > >
> > > > > > > > > > > > On Wed, Oct 12, 2022 at 5:09 PM jianhua guo <guojhk...@gmail.com> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Most of it looks good to me. Just one point of confusion: does
> > > > > > > > > > > > > "RecordResponse" only return a long value as the result?
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Wed, Oct 12, 2022 at 1:22 PM William Guo <gu...@apache.org> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > hi all,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > The following message is from Warden; we can discuss the
> > > > > > > > > > > > > > interaction between the core and dispatcher modules here.
> > > > > > > > > > > > > > ====================
> > > > > > > > > > > > > > I have drafted the communication messages between core and
> > > > > > > > > > > > > > dispatcher as below.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > // submit sql
> > > > > > > > > > > > > > execute(RecordRequest) return RecordResponse;
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > RecordRequest {
> > > > > > > > > > > > > >   String recordSql;
> > > > > > > > > > > > > >   Enum engine; // Spark,Hive,Presto,etc.
> > > > > > > > > > > > > >   String owner;
> > > > > > > > > > > > > >   Integer maxRetryCount;
> > > > > > > > > > > > > > }
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > RecordResponse {
> > > > > > > > > > > > > >   Integer code;
> > > > > > > > > > > > > >   Long value;
> > > > > > > > > > > > > >   Enum errorCode; // if code != 200, please tell us what happened:
> > > > > > > > > > > > > >                   //   1. recordSql syntax error
> > > > > > > > > > > > > >                   //   2. internal error: the dispatcher itself crashed
> > > > > > > > > > > > > >                   //   3. external error: the target engine crashed
> > > > > > > > > > > > > >                   //      when the dispatcher called it, etc.
> > > > > > > > > > > > > >   Exception ex;   // error detail info
> > > > > > > > > > > > > > }
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > // validate sql syntax
> > > > > > > > > > > > > > validateSQL(ValidateSQLRequest) return ValidateSQLResponse;
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > ValidateSQLRequest {
> > > > > > > > > > > > > >   String recordSql;
> > > > > > > > > > > > > >   Enum engine; // Spark,Hive,Presto,etc.
> > > > > > > > > > > > > > }
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > ValidateSQLResponse {
> > > > > > > > > > > > > >   Integer code;
> > > > > > > > > > > > > >   Enum errorCode; // if code != 200, please tell us what happened:
> > > > > > > > > > > > > >                   //   1. recordSql syntax error
> > > > > > > > > > > > > >                   //   2. internal error: the dispatcher itself crashed
> > > > > > > > > > > > > >                   //   3. external error: the target engine crashed
> > > > > > > > > > > > > >                   //      when the dispatcher called it, etc.
> > > > > > > > > > > > > >   Exception ex;   // error detail info
> > > > > > > > > > > > > > }
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Could you please review it? Give us your feedback.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > ==================
> > > > > > > >
> > > > > > > > --
> > > > > > > > Edgar Joya
> > > > > > > > 980.259.0683
> > > > > > > > euj...@gmail.com
> > > > > > > > @eujc21
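
The response structures in this thread pair an HTTP-like code with an errorCode that explains any non-200 result. One way a caller might interpret a ValidateSQLResponse is sketched below in Java. The class layout, the enum constant names, and the `describe` helper are assumptions for illustration only; the order of the constants follows the 0/1/2 ErrorCode numbering quoted earlier in the thread rather than the 1/2/3 comments in the draft above.

```java
// Illustrative sketch only; type and method names are assumptions, not Griffin's API.
final class ValidateResponseSketch {

    // Error categories from the thread; ordinal() follows the 0/1/2 numbering.
    enum ErrorCode { SQL_SYNTAX_ERROR, INTERNAL_ERROR, EXTERNAL_ERROR }

    // Mirrors ValidateSQLResponse { Integer code; Enum errorCode; Exception ex; }.
    static final class ValidateSQLResponse {
        final int code;
        final ErrorCode errorCode; // expected to be null when code == 200
        final Exception ex;        // error detail, may be null

        ValidateSQLResponse(int code, ErrorCode errorCode, Exception ex) {
            this.code = code;
            this.errorCode = errorCode;
            this.ex = ex;
        }
    }

    // Turns a response into a short human-readable summary.
    static String describe(ValidateSQLResponse r) {
        if (r.code == 200) {
            return "OK";
        }
        String reason;
        if (r.errorCode == null) {
            reason = "unspecified error";
        } else {
            switch (r.errorCode) {
                case SQL_SYNTAX_ERROR:
                    reason = "recordSql syntax error";
                    break;
                case INTERNAL_ERROR:
                    reason = "dispatcher internal error";
                    break;
                default:
                    reason = "target engine error";
                    break;
            }
        }
        String detail = (r.ex == null) ? "" : " (" + r.ex.getMessage() + ")";
        return r.code + " " + reason + detail;
    }

    public static void main(String[] args) {
        ValidateSQLResponse bad = new ValidateSQLResponse(
                400, ErrorCode.SQL_SYNTAX_ERROR, new IllegalArgumentException("unexpected token"));
        System.out.println(describe(bad)); // prints: 400 recordSql syntax error (unexpected token)
    }
}
```

Keeping code and errorCode separate, as the draft does, lets the caller distinguish a rejected request from a failed execution before looking at the cause; whether a syntax error should map to 400 or 500 is not settled in the thread, so the pairing in `main` is only an example.
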