[jira] [Commented] (GRIFFIN-190) Blank Health and DQ Metrics Screen
[ https://issues.apache.org/jira/browse/GRIFFIN-190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16646042#comment-16646042 ]

Nikolay Sokolov commented on GRIFFIN-190:
-----------------------------------------

[~cwoytasik] it looks like it failed here: [https://github.com/apache/incubator-griffin/blob/griffin-0.2.0-incubating-rc4/measure/src/main/scala/org/apache/griffin/measure/Application.scala#L99]

If you are compiling it yourself, you can add the line {{ex.printStackTrace()}} before the exit call, so that it is clearer what the error is. Process init there is mostly setting up the Spark context, so very likely the config is incomplete or Spark itself is not configured properly. Looking at the Spark context init code, my gut feeling is that you are hitting an NPE around [this line|https://github.com/apache/incubator-griffin/blob/griffin-0.2.0-incubating-rc4/measure/src/main/scala/org/apache/griffin/measure/process/BatchDqProcess.scala#L57], and adding a "config" key to your env.json could help:

{code:none}
"spark": {"log.level": "WARN", "config": {}}
{code}

[~Lionel_3L] should we log the full stack trace there, instead of just the message?

> Blank Health and DQ Metrics Screen
> ----------------------------------
>
> Key: GRIFFIN-190
> URL: https://issues.apache.org/jira/browse/GRIFFIN-190
> Project: Griffin (Incubating)
> Issue Type: Bug
> Affects Versions: 0.2.0-incubating
> Reporter: Cory Woytasik
> Priority: Major
> Attachments: PLDataLineageLoad061818.csv
>
> Griffin is up and running. We have both an accuracy measure and a profiling
> measure that is set to run every minute via jobs. When we click the chart
> icon next to the job we receive a "no content" message. When we click on the
> Health link or DQ Metrics link they think for a second and then display a
> blank screen. We are thinking this might be ES related, but aren't
> completely sure. Need some help. We assume it's a path or property setup
> issue. Here are the versions we are running:
> Hive - 3.1.0
> Elasticsearch - 5.3.1
> griffin - 0.2.0
> hadoop - 3.1.1
> livy - 0.3.0
> spark - 2.3.1
> Using postgres too

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
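As a follow-up to the env.json suggestion in the comment above, here is a minimal Python sketch that adds an empty "config" map under "spark" when it is missing. The helper name `ensure_spark_config` is hypothetical and not part of Griffin; whether the missing key really is the cause of the "process init error: null" is only the guess made in the comment.

```python
import json


def ensure_spark_config(env: dict) -> dict:
    """Return a copy of env that has a "config" map under "spark".

    Assumption: env is the parsed content of env.json. Existing values are
    left untouched; only a missing "spark" or "config" entry is added.
    """
    fixed = json.loads(json.dumps(env))  # deep copy via a JSON round trip
    spark = fixed.setdefault("spark", {})
    spark.setdefault("config", {})
    return fixed


if __name__ == "__main__":
    env = {"spark": {"log.level": "WARN"}}
    print(json.dumps(ensure_spark_config(env), indent=2))
```

The function returns a new dictionary rather than editing the file in place, so the result can be inspected before it is written back to env.json.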
[jira] [Commented] (GRIFFIN-190) Blank Health and DQ Metrics Screen
[ https://issues.apache.org/jira/browse/GRIFFIN-190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16637382#comment-16637382 ]

Cory Woytasik commented on GRIFFIN-190:
---------------------------------------

I'm not sure why my last comment had all the extra brackets added in the env.json and dq.json section. Sorry
[jira] [Commented] (GRIFFIN-190) Blank Health and DQ Metrics Screen
[ https://issues.apache.org/jira/browse/GRIFFIN-190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16637381#comment-16637381 ]

Cory Woytasik commented on GRIFFIN-190:
---------------------------------------

OK, here is what I have done, based on some info that I found at: https://griffin.incubator.apache.org/docs/profiling.html

1. In /home/server/griffin-0.2.0-incubating/measure/target/classes/env.json I set the file to:

    {
      "spark": {
        "log.level": "WARN"
      },
      "sinks": [
        { "type": "console" },
        { "type": "hdfs", "config": { "path": "hdfs:///griffin/persist" } },
        { "type": "elasticsearch", "config": { "method": "post", "api": "http://es:9200/griffin/accuracy" } }
      ]
    }

2. I created /home/server/griffin-0.2.0-incubating/measure/target/classes/dq.json and set the file to the following, based on my table (lineageload in hive):

    {
      "name": "batch_prof",
      "process.type": "batch",
      "data.sources": [
        {
          "name": "src",
          "baseline": true,
          "connectors": [
            {
              "type": "hive",
              "version": "3.1",
              "config": {
                "database": "default",
                "table.name": "lineageload"
              }
            }
          ]
        }
      ],
      "evaluate.rule": {
        "rules": [
          {
            "dsl.type": "griffin-dsl",
            "dq.type": "profiling",
            "out.dataframe.name": "prof",
            "rule": "src.asset.count() AS asset_count, src.asset.length().max() AS asset_length_max",
            "out": [
              {
                "type": "metric",
                "name": "prof"
              }
            ]
          }
        ]
      },
      "sinks": ["CONSOLE", "HDFS"]
    }

3. I then ran the following command:

    ./spark-submit --class org.apache.griffin.measure.Application --master yarn --deploy-mode client --queue default \
      --driver-memory 1g --executor-memory 1g --num-executors 2 \
      /home/server/griffin-0.2.0-incubating/measure/target/griffin-measure.jar \
      /home/server/griffin-0.2.0-incubating/measure/target/classes/env.json /home/server/griffin-0.2.0-incubating/measure/target/classes/dq.json

4. I then looked at unit-tests.log in /home/server/spark-2.3.1-bin-hadoop2.7/bin/target and noticed the following messages:

    18/10/03 13:38:44.721 main WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    18/10/03 13:38:44.815 main INFO Application$: [Ljava.lang.String;@7c214cc0
    18/10/03 13:38:44.815 main INFO Application$: /home/server/griffin-0.2.0-incubating/measure/target/classes/env.json
    18/10/03 13:38:44.815 main INFO Application$: /home/server/griffin-0.2.0-incubating/measure/target/classes/dq.json
    18/10/03 13:38:45.577 main INFO Application$: params validation pass
    18/10/03 13:38:45.599 main ERROR Application$: process init error: null
    18/10/03 13:38:45.610 Thread-1 INFO ShutdownHookManager: Shutdown hook called
    18/10/03 13:38:45.610 Thread-1 INFO ShutdownHookManager: Deleting directory /tmp/spark-ea6a6ad3-e533-4a4d-a33e-55b0c35a8352

What am I doing wrong? Or what are we missing?

Thanks Lionel
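Before running spark-submit as in step 3 above, it can help to confirm that env.json and dq.json are well-formed JSON at all. The sketch below is a hypothetical stand-alone checker, not a Griffin tool. It uses Python's `json` module, whose strictness may differ in details from the parser Griffin itself uses, so a clean result here is a hint rather than a guarantee.

```python
import json
import sys


def check_json_text(text: str):
    """Return None if text parses as JSON, else a short location message."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return f"line {err.lineno}, column {err.colno}: {err.msg}"
    return None


if __name__ == "__main__":
    # Usage (paths are whatever you pass to spark-submit):
    #   python check_json.py env.json dq.json
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8") as handle:
            problem = check_json_text(handle.read())
        print(f"{path}: {problem or 'OK'}")
```

A missing closing quote or a stray brace is reported with its line and column, which is quicker to act on than a bare "process init error: null".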
[jira] [Commented] (GRIFFIN-190) Blank Health and DQ Metrics Screen
[ https://issues.apache.org/jira/browse/GRIFFIN-190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621545#comment-16621545 ]

Lionel Liu commented on GRIFFIN-190:
------------------------------------

Hi [~cwoytasik], the problem confuses me too; I think we need more tests to figure out where it went wrong. Since you're creating jobs via the UI, I suggest you generate a dq.json file like these: [https://github.com/apache/incubator-griffin/blob/griffin-0.2.0-incubating-rc4/griffin-doc/measure/measure-batch-sample.md#batch-profiling-sample], [https://github.com/apache/incubator-griffin/blob/griffin-0.2.0-incubating-rc4/measure/src/test/resources/_profiling-batch-griffindsl.json], and submit the job directly to the Spark cluster to pinpoint where the problem is. This way, we can also get the application log for details.
[jira] [Commented] (GRIFFIN-190) Blank Health and DQ Metrics Screen
[ https://issues.apache.org/jira/browse/GRIFFIN-190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621534#comment-16621534 ]

Lionel Liu commented on GRIFFIN-190:
------------------------------------

That's helpful, [~chemikadze]. I'll test that and document it.
[jira] [Commented] (GRIFFIN-190) Blank Health and DQ Metrics Screen
[ https://issues.apache.org/jira/browse/GRIFFIN-190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620025#comment-16620025 ]

Nikolay Sokolov commented on GRIFFIN-190:
-----------------------------------------

(Just passing by; this may be relevant.) Talking about Livy, I've found that escaping works as expected only if Livy is using cluster mode when submitting jobs. In client mode, the AM fails because \` is not expanded on the container side, and the Griffin job can't parse the broken JSON. That should probably be documented somewhere.
[jira] [Commented] (GRIFFIN-190) Blank Health and DQ Metrics Screen
[ https://issues.apache.org/jira/browse/GRIFFIN-190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16619498#comment-16619498 ]

Cory Woytasik commented on GRIFFIN-190:
---------------------------------------

The NULL count check is failing for all columns. I just created another job to verify. I ran NULL checks against the other columns that should have NULL values too, and I also verified that a select statement against Hive returns the expected rows; that is working. So what's the next step for us? We are still not seeing results when clicking the Health link or the DQ Metrics link either.
[jira] [Commented] (GRIFFIN-190) Blank Health and DQ Metrics Screen
[ https://issues.apache.org/jira/browse/GRIFFIN-190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16619286#comment-16619286 ]

Cory Woytasik commented on GRIFFIN-190:
---------------------------------------

Object has many rows that are NULL? When I query the hive table I also get many rows returned with a NULL object value.
[jira] [Commented] (GRIFFIN-190) Blank Health and DQ Metrics Screen
[ https://issues.apache.org/jira/browse/GRIFFIN-190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16619276#comment-16619276 ]

Lionel Liu commented on GRIFFIN-190:
------------------------------------

OK, I think I understand you now. The null-count rule "count(source.`object`) AS `object-nullcount` WHERE source.`object` IS NULL" gives no result. I've checked your data, and it looks fine too. How about testing null-count on the other columns?

Enum count measures the item count grouped by an enum column.
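One general SQL point about the null-count rule quoted above: in standard SQL, `count(column)` skips NULL values, so counting a column over rows where that same column IS NULL yields 0, while `count(*)` counts the rows. The sketch below shows this with SQLite from Python's standard library. It is not Spark SQL or the Griffin DSL, and whether this has anything to do with the failure discussed in this thread is not established; the table layout is made up for the demonstration.

```python
import sqlite3


def null_counts(values):
    """Return (count(column), count(*)) over the rows where the column is NULL."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical one-column table standing in for the Hive source table.
    conn.execute('CREATE TABLE source ("object" TEXT)')
    conn.executemany("INSERT INTO source VALUES (?)", [(v,) for v in values])
    by_column = conn.execute(
        'SELECT count("object") FROM source WHERE "object" IS NULL'
    ).fetchone()[0]
    by_star = conn.execute(
        'SELECT count(*) FROM source WHERE "object" IS NULL'
    ).fetchone()[0]
    conn.close()
    return by_column, by_star


if __name__ == "__main__":
    print(null_counts(["a", None, None]))
```

If a null-count metric comes back as zero or empty on data that clearly contains NULLs, comparing the two forms of `count` on a small sample is a cheap check.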
[jira] [Commented] (GRIFFIN-190) Blank Health and DQ Metrics Screen
[ https://issues.apache.org/jira/browse/GRIFFIN-190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16619207#comment-16619207 ]

Cory Woytasik commented on GRIFFIN-190:
---------------------------------------

Yes, null-count fails every single time; however, distinct count and total count seem to work for each column. I'm not sure what the enum measure is supposed to tell us, but that also does not provide results. I'm only using a single file, so I'm not sure what you mean by the 11:00 vs 12:00 data. Nothing changes in the .csv file. Are you referring to the times the job runs? Even so, the null count does not appear the first time the job runs.
[jira] [Commented] (GRIFFIN-190) Blank Health and DQ Metrics Screen
[ https://issues.apache.org/jira/browse/GRIFFIN-190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16618415#comment-16618415 ]

Lionel Liu commented on GRIFFIN-190:
------------------------------------

I need to double-check this: "some of the jobs are completing successfully now with metric files, but some of the rules still fail."

Do you mean that all jobs for certain rules, such as null-count, always fail? Or that only some jobs of the null-count rule fail while others succeed?

If the former (all the null-count rule jobs fail), we need to check the dq.json.

If the latter, it might be caused by differences in the data among partitions: for example, the 11:00 data may behave differently from the 12:00 data, which could make the calculation fail. In that case we have to check the data.
[jira] [Commented] (GRIFFIN-190) Blank Health and DQ Metrics Screen
[ https://issues.apache.org/jira/browse/GRIFFIN-190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16617726#comment-16617726 ]

Cory Woytasik commented on GRIFFIN-190:
---------------------------------------

[^PLDataLineageLoad061818.csv]

Here is the file that we are using. We are trying to perform the Null count on the object column which as you can see contains numerous rows that are NULL.

Thanks for all of your help to date.
[jira] [Commented] (GRIFFIN-190) Blank Health and DQ Metrics Screen
[ https://issues.apache.org/jira/browse/GRIFFIN-190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16615019#comment-16615019 ]

Michael Kisly commented on GRIFFIN-190:
---------------------------------------

We removed the backslashes from L181 (I didn't comment out the whole line since that variable is used later on) and it seems some of the jobs are completing successfully now with metric files, but some of the rules still fail. The error we see is:

    main ERROR SparkSqlEngine: collect metrics object-nullct error: Table or view not found: object-nullct;

object is the field header name. The JSON being generated further up the logs looks like this:

    "evaluate.rule" : {
      "id" : 5741,
      "rules" : [ {
        "id" : 5742,
        "rule" : "count(source.`object`) AS `object-nullcount` WHERE source.`object` IS NULL",
        "name" : "object-nullct",
        "dsl.type" : "griffin-dsl",
        "dq.type" : "profiling"
      }, {
        "id" : 5743,
[jira] [Commented] (GRIFFIN-190) Blank Health and DQ Metrics Screen
[ https://issues.apache.org/jira/browse/GRIFFIN-190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613227#comment-16613227 ]

Lionel Liu commented on GRIFFIN-190:
------------------------------------

Yes, the backslashes are generated by the UI; the code is around here: [https://github.com/apache/incubator-griffin/blob/griffin-0.2.0-incubating/ui/angular/src/app/measure/create-measure/pr/pr.component.ts#L291]

We have tested this, though, and it works well in our docker container. I noticed that your Livy version is 0.3.0 too, so it should behave the same as ours.

If you want to try fixing this, you can skip this step: [https://github.com/apache/incubator-griffin/blob/griffin-0.2.0-incubating/service/src/main/java/org/apache/griffin/core/job/SparkSubmitJob.java#L181]
* If this works, maybe our Livy works in a different way, or is a different version.
* If this causes an error in Livy's log, you can just remove all the backslashes in the UI code around the link above.
[jira] [Commented] (GRIFFIN-190) Blank Health and DQ Metrics Screen
[ https://issues.apache.org/jira/browse/GRIFFIN-190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611501#comment-16611501 ]

Lionel Liu commented on GRIFFIN-190:
------------------------------------

How did you submit the job? Directly with curl via Livy's API, using the escaped JSON string? Via Livy's API, using the config JSON file? Or via the Griffin server?
* If you're submitting the JSON string directly, you need to escape it like this: \"rule\": \"approx_count_distinct(source.\\`asset\\`) AS \\`asset-distcount\\`\"
* If you're submitting a JSON file, you don't need the backslash escaping; write it like this: "rule" : "approx_count_distinct(source.`asset`) AS `asset-distcount`"
* If you're submitting it via the Griffin server, you don't need the backslash escaping either.
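For the first case above (passing the config as a JSON string inside another JSON request), hand-written escaping is easy to get wrong. A minimal Python sketch of letting a JSON library do the nesting follows. The field names `file`, `className` and `args` are those of Livy's batch submission request; the helper name, the jar path and the idea of passing both configs as raw JSON strings are assumptions made for illustration. The sketch only covers JSON-level quoting, not any shell-level escaping of backticks that may happen when Livy launches the job.

```python
import json


def build_batch_payload(jar, class_name, env_conf, dq_conf):
    """Build the JSON body for a Livy batch request with JSON-string args."""
    # Each config becomes one plain string argument.
    args = [json.dumps(env_conf), json.dumps(dq_conf)]
    # Serializing the outer object escapes the inner quotes automatically.
    return json.dumps({"file": jar, "className": class_name, "args": args})


if __name__ == "__main__":
    dq = {"rule": "approx_count_distinct(source.`asset`) AS `asset-distcount`"}
    body = build_batch_payload(
        "hdfs:///griffin/griffin-measure.jar",  # hypothetical jar location
        "org.apache.griffin.measure.Application",
        {"spark": {"log.level": "WARN", "config": {}}},
        dq,
    )
    print(body)
```

The inner quotes come out as \" in the request body, while the backticks are left alone because JSON does not require them to be escaped.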
[jira] [Commented] (GRIFFIN-190) Blank Health and DQ Metrics Screen
[ https://issues.apache.org/jira/browse/GRIFFIN-190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611040#comment-16611040 ]

Michael Kisly commented on GRIFFIN-190:
---------------------------------------

We saw the error in the livy logs.
[jira] [Commented] (GRIFFIN-190) Blank Health and DQ Metrics Screen
[ https://issues.apache.org/jira/browse/GRIFFIN-190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16610559#comment-16610559 ]

Cory Woytasik commented on GRIFFIN-190:
---------------------------------------

Could we get a response on the error message we are seeing too? I would like to know whether this is related to the .csv file that Hive is using, or whether it is linked to the Griffin code. Thanks
[jira] [Commented] (GRIFFIN-190) Blank Health and DQ Metrics Screen
[ https://issues.apache.org/jira/browse/GRIFFIN-190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609927#comment-16609927 ]

William Guo commented on GRIFFIN-190:
-------------------------------------

You can reuse the same metadata/quartz tables for version 0.3; we didn't change the metadata in this version.
[jira] [Commented] (GRIFFIN-190) Blank Health and DQ Metrics Screen
[ https://issues.apache.org/jira/browse/GRIFFIN-190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609664#comment-16609664 ]

Michael Kisly commented on GRIFFIN-190:
---------------------------------------

Removing the email and sms has gotten us to a new error message:

    }; line: 39, column: 48] (through reference chain: org.apache.griffin.measure.config.params.user.UserParam["evaluate.rule"]->org.apache.griffin.measure.config.params.user.EvaluateRuleParam["rules"]->com.fasterxml.jackson.module.scala.deser.BuilderWrapper[0])

Prior to that we see:

    18/09/10 11:30:02.866 main ERROR Application$: Unrecognized character escape '`' (code 96)

Looking in the evaluate.rule parameters I see:

    "rule" : "approx_count_distinct(source.\`asset\`) AS \`asset-distcount\`",

I'm guessing Jackson is having an issue reading the backslashes.

On a different note, if we were to try version 0.3, would we be able to use the existing Hive metastore/tables and Quartz tables that were set up by the prior installation?
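The "Unrecognized character escape" message above fits the JSON grammar: a backslash inside a JSON string may only be followed by one of a small set of characters (such as `"`, `\`, `/`, `n`, `t` or `u`), and a backtick is not among them. The Python sketch below reproduces the same class of failure with the standard `json` module. It illustrates the rule only; it does not reproduce Jackson's exact behaviour or message, and the helper name is made up.

```python
import json


def parses(text: str) -> bool:
    """Return True if text is accepted as JSON by Python's json module."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# The rule as it appeared in the generated parameters (backslash before backtick).
escaped = r'{"rule": "approx_count_distinct(source.\`asset\`) AS \`asset-distcount\`"}'
# The same rule with the backslashes removed.
plain = '{"rule": "approx_count_distinct(source.`asset`) AS `asset-distcount`"}'

if __name__ == "__main__":
    print("escaped parses:", parses(escaped))
    print("plain parses:", parses(plain))
```

So if the backslashes reach the JSON parser unchanged, the document is invalid regardless of which JSON library reads it.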
[jira] [Commented] (GRIFFIN-190) Blank Health and DQ Metrics Screen
[ https://issues.apache.org/jira/browse/GRIFFIN-190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16607886#comment-16607886 ]

Lionel Liu commented on GRIFFIN-190:

Actually, in griffin-0.2.0-incubating the "email" and "sms" parameters are not supported by the application. We didn't remove them from the code or from env.json, which has led to some misunderstanding. You can simply remove them from env.json; it will not make any difference. We use Elasticsearch as the default metric storage, so users can leverage the alert function of ES for notifications, and griffin can focus on the DQ calculation.

In the latest version, griffin-0.3.0-incubating, the "email" and "sms" parameters have been removed. I think you can try that version. It is almost the same as 0.2.0, with a better config experience for the spark job parameters for livy and a clearer job config structure.

The latest version was released recently, so the documents are not all updated yet; we're still working on them.
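Removing the unsupported sections from env.json can be done by hand or with a few lines of script. The sketch below is only an illustration: the key name "mail" is taken from the reference chain in the reported error (EnvParam["mail"]), while "sms" as a top-level key, the helper drop_keys, and the sample content are assumptions and should be checked against your actual env.json.

```python
import json


def drop_keys(env: dict, keys=("mail", "sms")) -> dict:
    """Return a copy of the top-level env config without the given keys."""
    return {k: v for k, v in env.items() if k not in keys}


# Hypothetical stand-in for a parsed env.json; real files contain more sections.
sample_env = {
    "spark": {"log.level": "WARN"},
    "mail": [],  # assumed shape: the error mentions a START_ARRAY token
    "sms": [],   # assumed key name
}

cleaned = drop_keys(sample_env)
print(json.dumps(cleaned, indent=2))
```

To apply this to a real file, load it with json.load, pass the result through drop_keys, and write it back with json.dump before uploading it to HDFS again.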
[jira] [Commented] (GRIFFIN-190) Blank Health and DQ Metrics Screen
[ https://issues.apache.org/jira/browse/GRIFFIN-190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16607646#comment-16607646 ]

Cory Woytasik commented on GRIFFIN-190:

Please disregard the numeric formatting in my sparkJob.properties. Jira seems to add the numbering to our comment.
[jira] [Commented] (GRIFFIN-190) Blank Health and DQ Metrics Screen
[ https://issues.apache.org/jira/browse/GRIFFIN-190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16607643#comment-16607643 ]

Cory Woytasik commented on GRIFFIN-190:

Thank you Lionel. We made changes to our spark and livy config files to match your docker files and we seem to get a bit farther. We changed the sparkJob.properties file to include:

# spark required
sparkJob.file=hdfs:///griffin/griffin-measure.jar
sparkJob.className=org.apache.griffin.measure.Application
sparkJob.args_1=hdfs:///env/env.json
sparkJob.args_3=hdfs,raw
sparkJob.jars_1 = hdfs:///livy/datanucleus-api-jdo-3.2.6.jar
sparkJob.jars_2 = hdfs:///livy/datanucleus-core-3.2.10.jar
sparkJob.jars_3 = hdfs:///livy/datanucleus-rdbms-3.2.9.jar
#sparkJob.uri = http://:8998/batches
sparkJob.name=griffin
sparkJob.queue=default
# options
sparkJob.numExecutors=2
sparkJob.executorCores=1
sparkJob.driverMemory=1g
sparkJob.executorMemory=1g
# other dependent jars
sparkJob.jars =
# hive-site.xml location, as configured in spark conf if ignored here
spark.yarn.dist.files = hdfs:///conf/hive-site.xml
# livy
livy.uri=http://localhost:8998/batches
# spark-admin
spark.uri=http://localhost:8088

We are now throwing the following error every time a profile job runs:

18/09/07 14:54:02.523 main ERROR Application$: Can not deserialize instance of org.apache.griffin.measure.config.params.env.EmailParam out of START_ARRAY token at [Source: org.apache.hadoop.hdfs.client.HdfsDataInputStream@3fae596; line: 40, column: 4] (through reference chain: org.apache.griffin.measure.config.params.env.EnvParam["mail"])
18/09/07 14:54:02.526 Thread-1 INFO ShutdownHookManager: Shutdown hook called
18/09/07 14:54:02.526 Thread-1 INFO ShutdownHookManager: Deleting directory /tmp/spark-dffc3893-9e6b-46d8-9f88-1e78ad227904
18/09/07 14:54:02.960 SparProcApp_com.cloudera.livy.utils.SparkProcApp@31a994b6 ERROR SparkProcApp: spark-submit exited with code 254

In Jira we found a reference by you to email and sms (https://www.mail-archive.com/dev@griffin.incubator.apache.org/msg01715.html), but we aren't exactly sure what the users did to fix their env.json file. That is also assuming our problem is similar to theirs. We appreciate all of your help on this one.
[jira] [Commented] (GRIFFIN-190) Blank Health and DQ Metrics Screen
[ https://issues.apache.org/jira/browse/GRIFFIN-190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16606643#comment-16606643 ]

Lionel Liu commented on GRIFFIN-190:

Hi [~mkisly],

> When using something sparkJob.file=hdfs://localhost:9000/griffin/griffin-measure.jar we get an error wrong fs expected file:///

Where did you get this error? In the griffin service log or in the livy log? I suppose it would be in the livy log. In the griffin service we do NOT parse the value of "sparkJob.file" as a path; we send the string value directly to livy as the value of the "file" field, like this: "file": "hdfs://localhost:9000/griffin/griffin-measure.jar".

In application.properties, "fs.defaultFS" is only used to check whether the done file exists; it does not affect the spark job submission.

I guess there might be some issue with the environment. I'm not sure how your livy and spark are configured; maybe you can refer to the build scripts for our docker image:
https://github.com/bhlx3lyx7/griffin-docker/tree/master/env2/conf/spark
https://github.com/bhlx3lyx7/griffin-docker/tree/master/env2/conf/livy

Or the error might be caused by other parameters such as "sparkJob.jars" or "spark.yarn.dist.files", which also matter if you need to enable Hive Context when submitting spark jobs.
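The pass-through behaviour described in this comment can be pictured as building a Livy batch request body straight from the property values. This is a rough sketch, not Griffin's actual code: build_batch_payload is a hypothetical helper, and the way numbered keys such as sparkJob.jars_1 are collected into lists is an assumption. The field names (file, className, args, jars, name, queue) are those of Livy's POST /batches request body.

```python
import json


def build_batch_payload(props: dict) -> dict:
    """Map sparkJob.* properties onto a Livy batch request body (assumed mapping)."""
    payload = {
        # The value is forwarded as an opaque string; nothing here parses it as a path.
        "file": props["sparkJob.file"],
        "className": props["sparkJob.className"],
        "name": props.get("sparkJob.name", "griffin"),
        "queue": props.get("sparkJob.queue", "default"),
    }
    # Collect numbered keys (sparkJob.args_1, sparkJob.jars_2, ...) in numeric order.
    for field in ("args", "jars"):
        prefix = f"sparkJob.{field}_"
        numbered = sorted(
            (int(key[len(prefix):]), value)
            for key, value in props.items()
            if key.startswith(prefix)
        )
        if numbered:
            payload[field] = [value for _, value in numbered]
    return payload


props = {
    "sparkJob.file": "hdfs://localhost:9000/griffin/griffin-measure.jar",
    "sparkJob.className": "org.apache.griffin.measure.Application",
    "sparkJob.jars_2": "hdfs:///livy/datanucleus-core-3.2.10.jar",
    "sparkJob.jars_1": "hdfs:///livy/datanucleus-api-jdo-3.2.6.jar",
}

payload = build_batch_payload(props)
print(json.dumps(payload, indent=2))
```

Posting the same body to livy by hand (for example from Postman) is one way to tell whether a failure comes from the request itself or from the livy and spark environment.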
[jira] [Commented] (GRIFFIN-190) Blank Health and DQ Metrics Screen
[ https://issues.apache.org/jira/browse/GRIFFIN-190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16606483#comment-16606483 ]

Michael Kisly commented on GRIFFIN-190:

Hi Lionel,

The job appears to be failing somewhere in the interaction between livy/spark and hdfs. When we send a Postman request to livy on port 8998 with something like "file": "hdfs://localhost:9000/example/", it seems to resolve the path fine. When running the griffin job, livy reports the following error in its logs:

ERROR SparkProcApp: spark-submit exited with code 1

I've found some different behavior when editing the path to the jar files in sparkJob.properties, but I am unsure how the address should be specified. When using something like sparkJob.file=hdfs://localhost:9000/griffin/griffin-measure.jar we get the error "wrong fs expected file:///", so we changed the paths of the jars to hdfs:///griffin/ and hdfs:///livy/. Now, however, we just get that exit code 1.

Also, in application.properties we have fs.defaultFS = hdfs://localhost:9000 specified. No applications appear to get created in yarn.

Thanks,
Mike
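The two jar path forms tried in this comment differ only in whether the URI carries an authority (host and port). The sketch below uses Python's urllib.parse rather than Hadoop's own path handling, so it shows the URI structure only; describe is a hypothetical helper. With an empty authority the namenode address is not in the URI and has to come from the Hadoop configuration of whichever process resolves the path. That this difference explains the errors seen here is an assumption, not something the thread confirms.

```python
from urllib.parse import urlparse


def describe(uri: str) -> dict:
    """Split a URI into scheme, authority (host:port) and path."""
    parts = urlparse(uri)
    return {"scheme": parts.scheme, "authority": parts.netloc, "path": parts.path}


explicit = describe("hdfs://localhost:9000/griffin/griffin-measure.jar")
implicit = describe("hdfs:///griffin/griffin-measure.jar")

print(explicit)
print(implicit)
```

Both forms name the same path; only the first says which namenode to contact.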
[jira] [Commented] (GRIFFIN-190) Blank Health and DQ Metrics Screen
[ https://issues.apache.org/jira/browse/GRIFFIN-190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16605153#comment-16605153 ]

Lionel Liu commented on GRIFFIN-190:

Hi [~cwoytasik], you might need to check some information.

1. Assuming you're using the default env.json, the results should be persisted in hdfs if the measure job succeeds. You can find them under the path hdfs:///griffin/persist//, where there will be several directories named after the timestamp at which the job was triggered; the metrics are listed inside.
* If the "_METRICS" file looks good, the job succeeded in spark.
* If the "_METRICS" file doesn't exist, we have to find the yarn log of the spark application for the job. To do that, find the application id in the livy log or the griffin service log, then fetch the yarn log like this:
yarn logs -applicationId > app.log
This exports the application log into app.log, where you can find the ERROR message.

2. If the results exist in hdfs, we can try to query them from ES like this:
curl -XGET ':9200/griffin/accuracy/_search?pretty&filter_path=hits.hits._source' -d '{"query":{"match_all":{}}, "sort": [{"tmst": {"order": "asc"}}]}'
If they don't exist there, something might have gone wrong when the spark application submitted the metrics to ES.
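The ES query in step 2 can also be assembled in a script, which avoids shell quoting problems. This is a minimal sketch that only builds the request and does not send it: build_metrics_query is a hypothetical helper, the host name is a placeholder you must replace, and the query string assumes the Elasticsearch `pretty` and `filter_path` options. The index "griffin", the type "accuracy" and the sort on "tmst" are taken from the curl command above.

```python
import json
from urllib.parse import urlencode


def build_metrics_query(host: str, index: str = "griffin", doc_type: str = "accuracy"):
    """Return (url, body) for a match_all search sorted by ascending tmst."""
    params = urlencode({"pretty": "true", "filter_path": "hits.hits._source"})
    url = f"http://{host}:9200/{index}/{doc_type}/_search?{params}"
    body = {
        "query": {"match_all": {}},
        "sort": [{"tmst": {"order": "asc"}}],
    }
    return url, json.dumps(body)


# "es-host.example" is a placeholder, not a real address.
url, body = build_metrics_query("es-host.example")
print(url)
print(body)
```

An empty hits list in the response, combined with a present "_METRICS" file in hdfs, would point at the step where the spark application writes metrics to ES.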