[jira] [Commented] (GRIFFIN-190) Blank Health and DQ Metrics Screen

2018-09-07 Thread Lionel Liu (JIRA)


[ 
https://issues.apache.org/jira/browse/GRIFFIN-190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16607886#comment-16607886
 ] 

Lionel Liu commented on GRIFFIN-190:


Actually, in griffin-0.2.0-incubating the "email" and "sms" parameters are 
not supported by the application. We didn't remove them from the code or from 
env.json, which has led to some misunderstanding. You can simply remove them 
from env.json; it will not make any difference.
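
A minimal sketch of that cleanup in Python, assuming env.json is a single JSON object with the notification settings as top-level keys. The key name "mail" is an assumption taken from the reference chain in the error reported in this thread (EnvParam["mail"]), and "sms" from the comment above; adjust both if your file differs. strip_notification_params is a hypothetical helper, not part of Griffin.

```python
import json

# Assumed top-level keys: "mail" comes from the error's reference chain
# (EnvParam["mail"]), "sms" from the comment above. Adjust as needed.
UNSUPPORTED_KEYS = ("mail", "sms")


def strip_notification_params(env_text, keys=UNSUPPORTED_KEYS):
    """Return env.json text with the given top-level keys removed."""
    env = json.loads(env_text)
    for key in keys:
        env.pop(key, None)  # keys that are absent are simply ignored
    return json.dumps(env, indent=2)


if __name__ == "__main__":
    # Made-up sample content, only to show the effect.
    sample = '{"spark": {"log.level": "WARN"}, "mail": [], "sms": []}'
    print(strip_notification_params(sample))
```

Because the job in this thread reads its environment file from HDFS (hdfs:///env/env.json), the edited file would then need to be uploaded again, for example with `hdfs dfs -put -f env.json /env/env.json`.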

We use Elasticsearch as the default metric storage, so users can rely on the 
alerting function of ES for notifications; Griffin itself focuses only on the 
DQ calculation.

In the latest version, griffin-0.3.0-incubating, the "email" and "sms" 
parameters have been removed. I think you can try that version; it is almost 
the same as 0.2.0, with a better configuration experience for the Spark job 
parameters for Livy and a clearer job config structure. It was released only 
recently, so the documents are not all updated yet; we're still working on them.

> Blank Health and DQ Metrics Screen
> --
>
> Key: GRIFFIN-190
> URL: https://issues.apache.org/jira/browse/GRIFFIN-190
> Project: Griffin (Incubating)
> Issue Type: Bug
> Affects Versions: 0.2.0-incubating
> Reporter: Cory Woytasik
> Priority: Major
>
> Griffin is up and running.  We have both an accuracy measure and a profiling 
> measure, each set to run every minute via jobs.  When we click the chart 
> icon next to a job we receive a "no content" message.  When we click the 
> Health link or the DQ Metrics link, the page loads for a second and then 
> displays a blank screen.  We think this might be ES related, but aren't 
> completely sure.  We need some help.  We assume it's a path or property setup 
> issue.  Here are the versions we are running:
> Hive - 3.1.0
> Elasticsearch - 5.3.1
> griffin - 0.2.0
> hadoop - 3.1.1
> livy - 0.3.0
> spark - 2.3.1
> Using postgres too



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (GRIFFIN-190) Blank Health and DQ Metrics Screen

2018-09-07 Thread Cory Woytasik (JIRA)


[ 
https://issues.apache.org/jira/browse/GRIFFIN-190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16607646#comment-16607646
 ] 

Cory Woytasik commented on GRIFFIN-190:
---

Please disregard the numeric formatting in my sparkJob.properties.  Jira seems 
to have added the numbering to our comment.

 



[jira] [Commented] (GRIFFIN-190) Blank Health and DQ Metrics Screen

2018-09-07 Thread Cory Woytasik (JIRA)


[ 
https://issues.apache.org/jira/browse/GRIFFIN-190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16607643#comment-16607643
 ] 

Cory Woytasik commented on GRIFFIN-190:
---

Thank you, Lionel.  We changed our Spark and Livy config files to match 
your Docker files, and we seem to get a bit further.

We changed the sparkJob.properties file to include: 

# spark required
sparkJob.file=hdfs:///griffin/griffin-measure.jar
sparkJob.className=org.apache.griffin.measure.Application
sparkJob.args_1=hdfs:///env/env.json
sparkJob.args_3=hdfs,raw
sparkJob.jars_1 = hdfs:///livy/datanucleus-api-jdo-3.2.6.jar
sparkJob.jars_2 = hdfs:///livy/datanucleus-core-3.2.10.jar
sparkJob.jars_3 = hdfs:///livy/datanucleus-rdbms-3.2.9.jar
#sparkJob.uri = http://:8998/batches


sparkJob.name=griffin
sparkJob.queue=default

# options
sparkJob.numExecutors=2
sparkJob.executorCores=1
sparkJob.driverMemory=1g
sparkJob.executorMemory=1g

# other dependent jars
sparkJob.jars =

# hive-site.xml location, as configured in spark conf if ignored here
spark.yarn.dist.files = hdfs:///conf/hive-site.xml

# livy
livy.uri=http://localhost:8998/batches

# spark-admin
spark.uri=http://localhost:8088
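
A file like this can be sanity-checked before restarting the service. The following is only a sketch: it handles plain key=value lines and # comments, not the full Java properties syntax (no escapes or line continuations), and load_properties and empty_keys are hypothetical helper names.

```python
def load_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first "=" only, so values may contain "=" themselves.
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def empty_keys(props):
    """Return the keys whose value is empty, in sorted order."""
    return sorted(k for k, v in props.items() if not v)


if __name__ == "__main__":
    sample = "sparkJob.jars =\nlivy.uri=http://localhost:8998/batches\n"
    print(empty_keys(load_properties(sample)))
```

An empty value is not necessarily an error; the check only lists such keys so they can be reviewed.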

 

We are now throwing the following error every time a profile job runs:

18/09/07 14:54:02.523 main ERROR Application$: Can not deserialize instance of 
org.apache.griffin.measure.config.params.env.EmailParam out of START_ARRAY token
 at [Source: org.apache.hadoop.hdfs.client.HdfsDataInputStream@3fae596; 
line: 40, column: 4] (through reference chain: 
org.apache.griffin.measure.config.params.env.EnvParam["mail"])

18/09/07 14:54:02.526 Thread-1 INFO ShutdownHookManager: Shutdown hook called

18/09/07 14:54:02.526 Thread-1 INFO ShutdownHookManager: Deleting directory 
/tmp/spark-dffc3893-9e6b-46d8-9f88-1e78ad227904

18/09/07 14:54:02.960 
SparProcApp_com.cloudera.livy.utils.SparkProcApp@31a994b6 ERROR SparkProcApp: 
spark-submit exited with code 254
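
The "out of START_ARRAY token" wording is what Jackson reports when it meets a JSON array where the target class is mapped from a JSON object, so the "mail" entry in env.json is most likely written as [...]. A small Python sketch to confirm this, assuming env.json is a single JSON object and the entry sits at top level under the key "mail" (as the reference chain suggests); describe_key is a hypothetical helper.

```python
import json


def describe_key(env_text, key="mail"):
    """Report the JSON type of a top-level key: array, object, absent, or other."""
    env = json.loads(env_text)
    if key not in env:
        return "absent"
    value = env[key]
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    return type(value).__name__


if __name__ == "__main__":
    # Made-up sample content, only to show the check.
    print(describe_key('{"mail": [], "sms": []}'))
```

If the result is "array", the entry has the shape that triggers the error above.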

 

In Jira we found a reference by you to email and sms 
(https://www.mail-archive.com/dev@griffin.incubator.apache.org/msg01715.html), 
but we aren't sure exactly what those users did to fix their env.json file, 
or whether our problem is the same as theirs.

We appreciate all of your help on this one. 



[GitHub] incubator-griffin pull request #411: Improve document quality

2018-09-07 Thread toyboxman
GitHub user toyboxman opened a pull request:

https://github.com/apache/incubator-griffin/pull/411

Improve document quality

Revise docker guide document and append more descriptions.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/toyboxman/incubator-griffin doc/docker

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-griffin/pull/411.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #411


commit bbdfcefa859a6eec2eafdee93102c36899c8454b
Author: Eugene 
Date:   2018-09-07T07:37:05Z

Improve document quality

Revise docker guide document and append more descriptions.




---