[jira] [Commented] (GRIFFIN-210) [Measure] need to integrate with upstream/downstream nodes when bad records are founded

Nikolay Sokolov (JIRA) Sat, 17 Nov 2018 12:43:32 -0800


    [ 
https://issues.apache.org/jira/browse/GRIFFIN-210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16690691#comment-16690691
 ]


Nikolay Sokolov commented on GRIFFIN-210:
-----------------------------------------

We are using Griffin together with Grafana in order to get more flexible 
dashboards, alerts and triggers. My feeling is that getting the same degree of 
flexibility and robustness that third-party Elasticsearch-based alerters 
provide might take significant effort, both on the backend and on the UI. It 
might be easier to integrate external alerters with Griffin itself, by calling 
external logic and just storing IDs/references to external thresholds.

On the other hand, there is an uncovered area of integration between Griffin 
itself and the jobs producing data. For example, a job might want to trigger 
a DQ check stored in the service module against its own results, in order to 
validate some assertions. That would allow managing DQ definitions in the UI, 
decoupled from the job code. In cases like that, remedy actions would be 
taken on the job side, based on the result of the triggered DQ check. 
However, several things are currently missing for that: an API to get a job 
(or jobs?) by name, measure, or some associated tag; an API to trigger a job 
outside of its schedule and get the job instance id back; and the ability to 
verify the metric results of a job against thresholds (either internal or 
external).
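
To make the job-side flow concrete, here is a minimal sketch in Python of how 
a producing job could use the three missing pieces listed above. Everything in 
it is hypothetical: FakeDqService, find_jobs, trigger_job, verify, Threshold 
and DqJob are illustrative names backed by in-memory data, not existing 
Griffin APIs, and the "value must be at least the minimum" threshold rule is 
an assumption.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Dict, List, Optional


@dataclass
class Threshold:
    """Hypothetical threshold: the metric value must be >= minimum."""
    metric: str
    minimum: float


@dataclass
class DqJob:
    """Hypothetical description of a DQ job stored in the service module."""
    job_id: int
    name: str
    measure: str
    tags: List[str] = field(default_factory=list)


class FakeDqService:
    """In-memory stand-in for the three missing APIs (not a real Griffin client)."""

    def __init__(self, jobs: List[DqJob], metrics_by_job: Dict[int, Dict[str, float]]):
        self._jobs = jobs
        # job_id -> metrics that a triggered run of that job would produce
        self._metrics_by_job = metrics_by_job
        self._instances: Dict[int, Dict[str, float]] = {}
        self._instance_ids = count(1)

    def find_jobs(self, name: Optional[str] = None, measure: Optional[str] = None,
                  tag: Optional[str] = None) -> List[DqJob]:
        """API 1: look up jobs by name, measure, or an associated tag."""
        return [
            j for j in self._jobs
            if (name is None or j.name == name)
            and (measure is None or j.measure == measure)
            and (tag is None or tag in j.tags)
        ]

    def trigger_job(self, job_id: int) -> int:
        """API 2: run a job outside of its schedule and return the instance id."""
        instance_id = next(self._instance_ids)
        self._instances[instance_id] = dict(self._metrics_by_job[job_id])
        return instance_id

    def verify(self, instance_id: int, thresholds: List[Threshold]) -> List[str]:
        """API 3: compare an instance's metrics to thresholds; return violations."""
        metrics = self._instances[instance_id]
        violations = []
        for t in thresholds:
            value = metrics.get(t.metric)
            if value is None or value < t.minimum:
                violations.append(f"{t.metric}={value} (minimum {t.minimum})")
        return violations


def run_job_with_dq_check(service: FakeDqService, job_name: str,
                          thresholds: List[Threshold]) -> str:
    """Job-side flow: find the check, trigger it, then choose a remedy action."""
    jobs = service.find_jobs(name=job_name)
    if not jobs:
        raise LookupError(f"no DQ job named {job_name!r}")
    instance_id = service.trigger_job(jobs[0].job_id)
    violations = service.verify(instance_id, thresholds)
    return "retry" if violations else "publish"
```

The point of the sketch is that the remedy decision ("retry" or "publish") 
stays in the job, while the check definition is only referenced by name.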

The same APIs could be used on the receiving side: a receiving job could call 
the verification API to estimate the state of its input dataset based on 
previously performed checks, and take the corresponding action.
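
The receiving side described above could look roughly like the following 
Python sketch. It assumes the results of earlier checks are available as plain 
records; CheckResult, input_state and the action names are hypothetical and 
only illustrate the decision, not any existing Griffin API.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class CheckResult:
    """Hypothetical record of one previously performed DQ check."""
    dataset: str
    metric: str
    value: float


def input_state(results: Iterable[CheckResult], dataset: str,
                minimums: Dict[str, float]) -> str:
    """Classify an input dataset as 'ok', 'bad' or 'unknown'.

    `minimums` maps a metric name to the lowest acceptable value
    (an assumed threshold format).
    """
    latest: Dict[str, float] = {}
    for r in results:
        if r.dataset == dataset:
            latest[r.metric] = r.value  # later results override earlier ones
    if any(metric not in latest for metric in minimums):
        return "unknown"  # a required check has not been performed yet
    if all(latest[m] >= minimums[m] for m in minimums):
        return "ok"
    return "bad"


# Hypothetical mapping from dataset state to the receiving job's action.
ACTIONS = {
    "ok": "process",
    "bad": "skip_and_alert",
    "unknown": "trigger_check_first",
}
```

A receiving job would compute the state of each input before starting and 
look up its action, for example ACTIONS[input_state(results, "orders", 
{"accuracy": 0.95})].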

What do you think about this approach?

> [Measure] need to integrate with upstream/downstream nodes when bad records 
> are founded
> ---------------------------------------------------------------------------------------
>
>                 Key: GRIFFIN-210
>                 URL: https://issues.apache.org/jira/browse/GRIFFIN-210
>             Project: Griffin (Incubating)
>          Issue Type: Wish
>            Reporter: William Guo
>            Assignee: William Guo
>            Priority: Major
>
> In a typical data quality project, when Apache Griffin finds a data quality 
> issue, it usually needs to integrate with upstream or downstream nodes, 
> so that the corresponding systems have an opportunity to automatically take 
> some remedy action, such as a retry...



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
