[jira] [Commented] (GRIFFIN-190) Blank Health and DQ Metrics Screen

2018-10-03 Thread Cory Woytasik (JIRA)


[ https://issues.apache.org/jira/browse/GRIFFIN-190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16637382#comment-16637382 ]

Cory Woytasik commented on GRIFFIN-190:
---

I'm not sure why my last comment had all the extra brackets added in the 
env.json and dq.json sections.  Sorry.

 

> Blank Health and DQ Metrics Screen
> --
>
> Key: GRIFFIN-190
> URL: https://issues.apache.org/jira/browse/GRIFFIN-190
> Project: Griffin (Incubating)
> Issue Type: Bug
> Affects Versions: 0.2.0-incubating
> Reporter: Cory Woytasik
> Priority: Major
> Attachments: PLDataLineageLoad061818.csv
>
>
> Griffin is up and running.  We have both an accuracy measure and a profiling 
> measure that are set to run every minute via jobs.  When we click the chart 
> icon next to the job we receive a "no content" message.  When we click on the 
> Health link or DQ Metrics link they think for a second and then display a 
> blank screen.  We are thinking this might be ES related, but aren't 
> completely sure.  Need some help.  We assume it's a path or property setup 
> issue.  Here are the versions we are running:
> Hive - 3.1.0
> Elasticsearch - 5.3.1
> griffin - 0.2.0
> hadoop - 3.1.1
> livy - 0.3.0
> spark - 2.3.1
> Using postgres too
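
One way to test the "ES related" guess in the description above is to ask 
Elasticsearch directly whether any metric documents exist. The following is 
only a minimal sketch: the host (es:9200), index (griffin) and type (accuracy) 
are placeholder values to be replaced with whatever the ES sink is configured 
to use, and the function names are made up for illustration. It relies on the 
standard _search endpoint and on hits.total being a plain number, which is the 
case in Elasticsearch 5.x.

```python
import json
import urllib.request


def search_url(base, index, doc_type=None):
    """Build a _search URL for an index (and optional type) on an ES node."""
    parts = [base.rstrip("/"), index]
    if doc_type:
        parts.append(doc_type)
    return "/".join(parts) + "/_search?size=1"


def count_hits(url, timeout=5):
    """Return hits.total from a _search response (an integer in ES 5.x)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        body = json.load(resp)
    return body["hits"]["total"]


# Example call (placeholder host, index and type; needs a reachable ES node):
# print(count_hits(search_url("http://es:9200", "griffin", "accuracy")))
```

A count of zero would suggest that no metrics reached Elasticsearch at all; 
a non-zero count would point more toward the UI or service configuration.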



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (GRIFFIN-190) Blank Health and DQ Metrics Screen

2018-10-03 Thread Cory Woytasik (JIRA)


[ https://issues.apache.org/jira/browse/GRIFFIN-190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16637381#comment-16637381 ]

Cory Woytasik commented on GRIFFIN-190:
---

OK, here is what I have done, based on some info that I found at: 
https://griffin.incubator.apache.org/docs/profiling.html

1. In /home/server/griffin-0.2.0-incubating/measure/target/classes/env.json, 
I set the file to:

{
  "spark": {
    "log.level": "WARN"
  },
  "sinks": [
    {
      "type": "console"
    },
    {
      "type": "hdfs",
      "config": {
        "path": "hdfs:///griffin/persist"
      }
    },
    {
      "type": "elasticsearch",
      "config": {
        "method": "post",
        "api": "http://es:9200/griffin/accuracy"
      }
    }
  ]
}

 

2. I created 
/home/server/griffin-0.2.0-incubating/measure/target/classes/dq.json and set 
its contents based on my table (lineageload in Hive):

{
  "name": "batch_prof",
  "process.type": "batch",
  "data.sources": [
    {
      "name": "src",
      "baseline": true,
      "connectors": [
        {
          "type": "hive",
          "version": "3.1",
          "config": {
            "database": "default",
            "table.name": "lineageload"
          }
        }
      ]
    }
  ],
  "evaluate.rule": {
    "rules": [
      {
        "dsl.type": "griffin-dsl",
        "dq.type": "profiling",
        "out.dataframe.name": "prof",
        "rule": "src.asset.count() AS asset_count, src.asset.length().max() AS asset_length_max",
        "out": [
          {
            "type": "metric",
            "name": "prof"
          }
        ]
      }
    ]
  },
  "sinks": ["CONSOLE", "HDFS"]
}

3. I then ran the following command: 

./spark-submit --class org.apache.griffin.measure.Application --master yarn \
--deploy-mode client --queue default \
--driver-memory 1g --executor-memory 1g --num-executors 2 \
/home/server/griffin-0.2.0-incubating/measure/target/griffin-measure.jar \
/home/server/griffin-0.2.0-incubating/measure/target/classes/env.json \
/home/server/griffin-0.2.0-incubating/measure/target/classes/dq.json

4. I then looked at unit-tests.log in 
/home/server/spark-2.3.1-bin-hadoop2.7/bin/target and noticed the following 
message:

18/10/03 13:38:44.721 main WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
18/10/03 13:38:44.815 main INFO Application$: [Ljava.lang.String;@7c214cc0
18/10/03 13:38:44.815 main INFO Application$: /home/server/griffin-0.2.0-incubating/measure/target/classes/env.json
18/10/03 13:38:44.815 main INFO Application$: /home/server/griffin-0.2.0-incubating/measure/target/classes/dq.json
18/10/03 13:38:45.577 main INFO Application$: params validation pass
18/10/03 13:38:45.599 main ERROR Application$: process init error: null
18/10/03 13:38:45.610 Thread-1 INFO ShutdownHookManager: Shutdown hook called
18/10/03 13:38:45.610 Thread-1 INFO ShutdownHookManager: Deleting directory /tmp/spark-ea6a6ad3-e533-4a4d-a33e-55b0c35a8352
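
For longer logs it may help to pull out only the WARN and ERROR lines, such as 
the "process init error: null" entry above. A minimal sketch, assuming every 
entry sits on one physical line in the "yy/MM/dd HH:mm:ss.SSS thread LEVEL 
logger: message" layout seen in this log; the pattern and function name are 
illustrative only, and entries wrapped across lines would not be matched.

```python
import re

# One log entry per line: timestamp, thread, level, logger, message.
LOG_LINE = re.compile(
    r"^(?P<ts>\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"(?P<thread>\S+) (?P<level>[A-Z]+) (?P<logger>[^:]+): (?P<msg>.*)$"
)


def lines_at_level(log_text, levels=("ERROR", "WARN")):
    """Return (timestamp, level, logger, message) for entries at the given levels."""
    hits = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line)
        if match and match.group("level") in levels:
            hits.append(
                (
                    match.group("ts"),
                    match.group("level"),
                    match.group("logger"),
                    match.group("msg"),
                )
            )
    return hits


# Example: lines_at_level(open("unit-tests.log").read(), levels=("ERROR",))
```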

 

What am I doing wrong?  Or what are we missing?  Thanks, Lionel.
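
One thing that can be checked mechanically is whether the two files agree on 
sinks: dq.json above lists "CONSOLE" and "HDFS", while env.json also defines 
an elasticsearch sink. The following is a minimal sketch only. It assumes the 
names in dq.json are meant to match the "type" values in env.json without 
regard to case, which is not confirmed in this thread, and the function names 
are hypothetical. Whether a mismatch has anything to do with the blank 
screens or the init error is likewise not established here.

```python
import json


def load_json(path):
    """Parse a JSON file; raises json.JSONDecodeError if it is malformed."""
    with open(path) as handle:
        return json.load(handle)


def sink_mismatches(env, dq):
    """Compare dq.json sink names with env.json sink types.

    Assumption: names are compared case-insensitively. Returns a pair
    (undefined, unused): names requested in dq.json with no definition in
    env.json, and types defined in env.json that dq.json never requests.
    """
    defined = [sink.get("type", "").lower() for sink in env.get("sinks", [])]
    requested = [name.lower() for name in dq.get("sinks", [])]
    undefined = [name for name in requested if name not in defined]
    unused = [sink_type for sink_type in defined if sink_type not in requested]
    return undefined, unused


# Example: sink_mismatches(load_json("env.json"), load_json("dq.json"))
```

Loading the files with load_json also confirms that both parse as valid JSON 
before they are handed to spark-submit.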



