[GitHub] incubator-griffin pull request #456: [GRIFFIN-213] Custom connector support

2018-11-17 Thread chemikadze
Github user chemikadze commented on a diff in the pull request:

https://github.com/apache/incubator-griffin/pull/456#discussion_r234426015
  
--- Diff: measure/src/main/scala/org/apache/griffin/measure/datasource/connector/DataConnectorFactory.scala ---
@@ -84,6 +87,26 @@ object DataConnectorFactory extends Loggable {
 }
   }
 
+  private def getCustomConnector(session: SparkSession,
+                                 context: StreamingContext,
+                                 param: DataConnectorParam,
+                                 storage: TimestampStorage,
+                                 maybeClient: Option[StreamingCacheClient]): DataConnector = {
+    val className = param.getConfig("class").asInstanceOf[String]
+    val cls = Class.forName(className)
+    if (classOf[BatchDataConnector].isAssignableFrom(cls)) {
+      val ctx = BatchDataConnectorContext(session, param, storage)
+      val meth = cls.getDeclaredMethod("apply", classOf[BatchDataConnectorContext])
+      meth.invoke(null, ctx).asInstanceOf[BatchDataConnector]
+    } else if (classOf[StreamingDataConnector].isAssignableFrom(cls)) {
+      val ctx = StreamingDataConnectorContext(session, context, param, storage, maybeClient)
+      val meth = cls.getDeclaredMethod("apply", classOf[StreamingDataConnectorContext])
+      meth.invoke(null, ctx).asInstanceOf[StreamingDataConnector]
+    } else {
+      throw new ClassCastException("")
--- End diff --

Oh, thanks for the reminder! I planned to do that, but got distracted.


---



[GitHub] incubator-griffin pull request #456: [GRIFFIN-213] Custom connector support

2018-11-17 Thread gavlyukovskiy
Github user gavlyukovskiy commented on a diff in the pull request:

https://github.com/apache/incubator-griffin/pull/456#discussion_r234425707
  
--- Diff: measure/src/main/scala/org/apache/griffin/measure/datasource/connector/DataConnectorFactory.scala ---
@@ -84,6 +87,26 @@ object DataConnectorFactory extends Loggable {
 }
   }
 
+  private def getCustomConnector(session: SparkSession,
+                                 context: StreamingContext,
+                                 param: DataConnectorParam,
+                                 storage: TimestampStorage,
+                                 maybeClient: Option[StreamingCacheClient]): DataConnector = {
+    val className = param.getConfig("class").asInstanceOf[String]
+    val cls = Class.forName(className)
+    if (classOf[BatchDataConnector].isAssignableFrom(cls)) {
+      val ctx = BatchDataConnectorContext(session, param, storage)
+      val meth = cls.getDeclaredMethod("apply", classOf[BatchDataConnectorContext])
+      meth.invoke(null, ctx).asInstanceOf[BatchDataConnector]
+    } else if (classOf[StreamingDataConnector].isAssignableFrom(cls)) {
+      val ctx = StreamingDataConnectorContext(session, context, param, storage, maybeClient)
+      val meth = cls.getDeclaredMethod("apply", classOf[StreamingDataConnectorContext])
+      meth.invoke(null, ctx).asInstanceOf[StreamingDataConnector]
+    } else {
+      throw new ClassCastException("")
--- End diff --

It would be nice to have a message here stating that the custom connector class 
must extend `BatchDataConnector` or `StreamingDataConnector`.
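
For illustration, the final branch could carry a message along these lines (the wording is only a suggestion):

```scala
throw new ClassCastException(
  s"custom connector class $className must extend " +
    "BatchDataConnector or StreamingDataConnector")
```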


---




[GitHub] incubator-griffin pull request #456: [GRIFFIN-213] Custom connector support

2018-11-17 Thread chemikadze
GitHub user chemikadze opened a pull request:

https://github.com/apache/incubator-griffin/pull/456

[GRIFFIN-213] Custom connector support

Provide ability to extend batch and streaming data integrations
with custom user-provided connectors. Introduces new data connector
type, `CUSTOM`, parameterized with `class` property. Also adds support
for custom data connector enum on service side.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/chemikadze/incubator-griffin GRIFFIN-213

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-griffin/pull/456.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #456


commit d487347a363f172cfc9e26225d5687cc3f95ab73
Author: Nikolay Sokolov 
Date:   2018-11-17T22:37:36Z

[GRIFFIN-213] Custom connector support

Provide ability to extend batch and streaming data integrations
with custom user-provided connectors. Introduces new data connector
type, `CUSTOM`, parameterized with `class` property. Also adds support
for custom data connector enum on service side.




---


[jira] [Commented] (GRIFFIN-213) Support pluggable datasource connectors

2018-11-17 Thread Nikolay Sokolov (JIRA)


[ 
https://issues.apache.org/jira/browse/GRIFFIN-213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16690695#comment-16690695
 ] 

Nikolay Sokolov commented on GRIFFIN-213:
-

[~Lionel_3L]  [~guoyp] do you have any thoughts on that proposal?



[jira] [Commented] (GRIFFIN-125) As a user, I want to enable the feature that can help me to check the data consistency for active-active elasticsearch clusters

2018-11-17 Thread Nikolay Sokolov (JIRA)


[ 
https://issues.apache.org/jira/browse/GRIFFIN-125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16690693#comment-16690693
 ] 

Nikolay Sokolov commented on GRIFFIN-125:
-

Looks like GRIFFIN-213 would make it possible to implement the missing 
elasticsearch connector without depending on an upstream change.

> As a user, I want to enable the feature that can help me to check the data 
> consistency for active-active elasticsearch clusters
> ---
>
> Key: GRIFFIN-125
> URL: https://issues.apache.org/jira/browse/GRIFFIN-125
> Project: Griffin (Incubating)
>  Issue Type: New Feature
>Reporter: Ruan, Yiming
>Assignee: Lionel Liu
>Priority: Minor
>
> 2 or 3 elasticsearch clusters are run in active-active mode. I want to check 
> the index lists and the index counts to make sure the data is consistent 
> between the clusters.





[jira] [Commented] (GRIFFIN-210) [Measure] need to integrate with upstream/downstream nodes when bad records are found

2018-11-17 Thread Nikolay Sokolov (JIRA)


[ 
https://issues.apache.org/jira/browse/GRIFFIN-210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16690691#comment-16690691
 ] 

Nikolay Sokolov commented on GRIFFIN-210:
-

We are using Griffin together with Grafana in order to get more flexible 
dashboards, alerts, and triggers. My feeling is that reaching the same degree of 
flexibility and robustness that third-party elasticsearch-based alerters provide 
might take significant effort, both on the backend and in the UI. It might 
be easier to integrate external alerters with Griffin itself, by calling 
external logic and just storing ids/references to external thresholds.

On the other hand, there is an uncovered area of integration between Griffin itself 
and the jobs producing data. For example, a job might want to trigger a DQ check 
stored in the service module against its own results, in order to validate some 
assertions. That would allow managing DQ definitions in the UI, decoupled from job 
code. In cases like that, remedy actions would be taken on the job side, based on 
the result of the triggered DQ check. However, right now several things are missing 
for that: an API to get a job (or jobs?) by name, measure, or some associated tag; 
an API to trigger a job outside of its schedule and get a job instance id back; and 
the ability to verify the metric results of a job against thresholds (either 
internal or external). A sketch of this flow follows below.

The same APIs could be used on the receiving side: a receiving job could call the 
verification API in order to estimate the state of an input dataset based on 
previously performed checks, and take a corresponding action.

What do you think about this approach?
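
To make the missing pieces concrete, here is a hypothetical client-side sketch of 
the producer flow described above. None of these endpoints exist in Griffin's 
service module today; all paths, parameters, and payloads are illustrative only:

{code:scala}
import java.net.URI
import java.net.http.{HttpClient, HttpRequest, HttpResponse}

// Hypothetical producer-side flow; every endpoint below is made up
// to illustrate the proposal, none exist in Griffin today.
object TriggerDqCheckSketch {
  private val http = HttpClient.newHttpClient()

  private def call(request: HttpRequest): String =
    http.send(request, HttpResponse.BodyHandlers.ofString()).body()

  def main(args: Array[String]): Unit = {
    val base = "http://griffin-service:8080/api/v1" // assumed service URL

    // 1. Find the DQ job defined in the UI by an associated tag.
    val jobId = call(HttpRequest.newBuilder(
      URI.create(s"$base/jobs?tag=orders-output")).GET().build())

    // 2. Trigger it outside of its schedule, getting a job instance id back.
    val instanceId = call(HttpRequest.newBuilder(
      URI.create(s"$base/jobs/$jobId/trigger"))
      .POST(HttpRequest.BodyPublishers.noBody()).build())

    // 3. Verify the instance's metric results against configured thresholds.
    val verdict = call(HttpRequest.newBuilder(
      URI.create(s"$base/jobInstances/$instanceId/verify")).GET().build())

    // 4. The remedy action itself stays on the job side.
    if (verdict != "PASS") sys.error("DQ check failed, running remedy action")
  }
}
{code}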

> [Measure] need to integrate with upstream/downstream nodes when bad records 
> are found
> ---
>
> Key: GRIFFIN-210
> URL: https://issues.apache.org/jira/browse/GRIFFIN-210
> Project: Griffin (Incubating)
>  Issue Type: Wish
>Reporter: William Guo
>Assignee: William Guo
>Priority: Major
>
> In a typical data quality project, when Apache Griffin finds a data quality 
> issue, it usually needs to integrate with upstream or downstream nodes, 
> so the corresponding systems have an opportunity to automatically take some 
> remedy action, such as a retry.





[jira] [Created] (GRIFFIN-213) Support pluggable datasource connectors

2018-11-17 Thread Nikolay Sokolov (JIRA)
Nikolay Sokolov created GRIFFIN-213:
---

 Summary: Support pluggable datasource connectors
 Key: GRIFFIN-213
 URL: https://issues.apache.org/jira/browse/GRIFFIN-213
 Project: Griffin (Incubating)
  Issue Type: Improvement
Reporter: Nikolay Sokolov


As of Griffin 0.3, code modification is required in order to add new data 
connectors.

The proposal is to add a new data connector type, CUSTOM, that would allow 
specifying the class name of the data connector implementation to use. Additional 
jars with custom connector implementations would be provided in the spark 
configuration template.

The class name would be specified in the "class" config of the data connector. For example:
{code:json}
"connectors": [
{
  "type": "CUSTOM",
  "config": {
"class": "org.example.griffin.JDBCConnector"
// extra connector-specific parameters
  }
}
  ]
{code}

Proposed contract for implementations is based on the current convention:
 - for batch
 ** class should be a subclass of BatchDataConnector
 ** it should have a method with this signature:
{code:java}
public static BatchDataConnector apply(ctx: BatchDataConnectorContext)
{code}

 - for streaming
 ** class should be a subclass of StreamingDataConnector
 ** it should have a method with this signature:
{code:java}
public static StreamingDataConnector apply(ctx: StreamingDataConnectorContext)
{code}

Signatures of context objects:
{code:scala}
case class BatchDataConnectorContext(@transient sparkSession: SparkSession,
                                     dcParam: DataConnectorParam,
                                     timestampStorage: TimestampStorage)

case class StreamingDataConnectorContext(@transient sparkSession: SparkSession,
                                         @transient ssc: StreamingContext,
                                         dcParam: DataConnectorParam,
                                         timestampStorage: TimestampStorage,
                                         streamingCacheClientOpt: Option[StreamingCacheClient])
{code}
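
For illustration, here is a minimal self-contained sketch of how a custom 
connector class lines up with this reflection-based lookup. The trait and context 
are simplified stand-ins (the real Griffin types carry Spark-specific members), 
and all names are hypothetical:

{code:scala}
// Simplified stand-ins for Griffin's types; a real connector would extend
// BatchDataConnector and receive a BatchDataConnectorContext instead.
trait CustomConnectorLike { def describe(): String }
case class DemoContext(config: Map[String, String])

// A user-provided connector as it would live in an extra jar. The case
// class gives its companion an apply(ctx) method, which Scala also emits
// as a static forwarder on the class itself; that static method is what
// getDeclaredMethod("apply", ...) finds.
case class DemoConnector(ctx: DemoContext) extends CustomConnectorLike {
  def describe(): String = s"demo connector with config ${ctx.config}"
}

object FactorySketch {
  // Mirrors the dispatch in DataConnectorFactory.getCustomConnector.
  def create(className: String, ctx: DemoContext): CustomConnectorLike = {
    val cls = Class.forName(className)
    if (!classOf[CustomConnectorLike].isAssignableFrom(cls)) {
      throw new ClassCastException(s"$className must extend CustomConnectorLike")
    }
    val meth = cls.getDeclaredMethod("apply", classOf[DemoContext])
    meth.invoke(null, ctx).asInstanceOf[CustomConnectorLike]
  }

  def main(args: Array[String]): Unit = {
    val conn = create("DemoConnector", DemoContext(Map("class" -> "DemoConnector")))
    println(conn.describe())
  }
}
{code}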





[jira] [Resolved] (GRIFFIN-203) "Plaintext mode" for measure creation

2018-11-17 Thread Nikolay Sokolov (JIRA)


 [ 
https://issues.apache.org/jira/browse/GRIFFIN-203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikolay Sokolov resolved GRIFFIN-203.
-
Resolution: Fixed

> "Plaintext mode" for measure creation
> -
>
> Key: GRIFFIN-203
> URL: https://issues.apache.org/jira/browse/GRIFFIN-203
> Project: Griffin (Incubating)
>  Issue Type: New Feature
>Reporter: Nikolay Sokolov
>Priority: Major
>
> Creating custom rules from the API can be cumbersome: the body has to be 
> prepared outside of the UI and then submitted via an HTTP call. To make 
> users' lives easier, it would be useful to allow measure creation by editing 
> JSON directly in the UI. The viewing side of this feature would be GRIFFIN-202.
> Also, experience-wise, JSON might not be the best option for complex 
> spark-sql rules. A possible solution would be to allow writing a YAML 
> representation instead of JSON, then either submitting the YAML body or 
> converting it to JSON on the UI side before submission (see the sketch below).
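
For illustration, the conversion could be as small as a Jackson round-trip. A 
minimal sketch, assuming jackson-databind and jackson-dataformat-yaml are on the 
classpath (the measure keys in the sample YAML are made up):

{code:scala}
import com.fasterxml.jackson.databind.ObjectMapper
import com.fasterxml.jackson.dataformat.yaml.YAMLFactory

// Sketch of the YAML -> JSON conversion idea; not part of Griffin today.
object YamlToJson {
  private val yamlMapper = new ObjectMapper(new YAMLFactory())
  private val jsonMapper = new ObjectMapper()

  def convert(yaml: String): String =
    jsonMapper.writeValueAsString(yamlMapper.readTree(yaml))

  def main(args: Array[String]): Unit = {
    val measureYaml =
      """rule:
        |  dsl.type: spark-sql
        |  rule: SELECT COUNT(*) FROM source
        |""".stripMargin
    // Prints: {"rule":{"dsl.type":"spark-sql","rule":"SELECT COUNT(*) FROM source"}}
    println(convert(measureYaml))
  }
}
{code}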





[jira] [Resolved] (GRIFFIN-208) Job status is SUCCESS even if some stages have failed

2018-11-17 Thread Nikolay Sokolov (JIRA)


 [ 
https://issues.apache.org/jira/browse/GRIFFIN-208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikolay Sokolov resolved GRIFFIN-208.
-
Resolution: Fixed
  Assignee: Nikolay Sokolov

> Job status is SUCCESS even if some stages have failed
> -
>
> Key: GRIFFIN-208
> URL: https://issues.apache.org/jira/browse/GRIFFIN-208
> Project: Griffin (Incubating)
>  Issue Type: Bug
>Reporter: Nikolay Sokolov
>Assignee: Nikolay Sokolov
>Priority: Major
>
> When some steps (MetricWrite or SparkSql, for example) fail, errors are just 
> logged but not reported as part of the job status. Symptoms:
> {code:none}
> 18/10/22 17:17:58 ERROR transform.SparkSqlTransformStep: run spark sql [ ] error: ...
> {code}
> YarnApplicationState: FINISHED
> FinalStatus Reported by AM: SUCCEEDED





[jira] [Resolved] (GRIFFIN-194) [service] Hive API improvement

2018-11-17 Thread Nikolay Sokolov (JIRA)


 [ 
https://issues.apache.org/jira/browse/GRIFFIN-194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikolay Sokolov resolved GRIFFIN-194.
-
Resolution: Fixed

> [service] Hive API improvement
> --
>
> Key: GRIFFIN-194
> URL: https://issues.apache.org/jira/browse/GRIFFIN-194
> Project: Griffin (Incubating)
>  Issue Type: Sub-task
>Reporter: Nikolay Sokolov
>Priority: Minor
>
> The purpose is mainly to support GRIFFIN-195 with a single request to get table 
> list information, while avoiding transferring all table metadata and making 
> lots of metastore requests.
> The Hive API provides the following relevant APIs right now:
> * listing DBs
> * getting all table names in a DB
> * listing all table _objects_ in all _dbs_
> What seems to be missing is an API call for all table names in all DBs (as a 
> middle ground between n+1 API requests and 1 API request with a huge payload 
> and n*m+1 metastore requests on the backend).
> This API request should take no parameters and return a Map<String, List<String>> 
> in response (see the example response below).
> Proposed API endpoint: TBD
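
For illustration, the response would map each database name to its table names. A 
hypothetical payload:

{code:json}
{
  "default": ["orders", "customers"],
  "analytics": ["daily_metrics", "weekly_rollup"]
}
{code}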





[jira] [Resolved] (GRIFFIN-193) Profiling measure UX improvements

2018-11-17 Thread Nikolay Sokolov (JIRA)


 [ 
https://issues.apache.org/jira/browse/GRIFFIN-193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikolay Sokolov resolved GRIFFIN-193.
-
Resolution: Fixed
  Assignee: Nikolay Sokolov

> Profiling measure UX improvements
> -
>
> Key: GRIFFIN-193
> URL: https://issues.apache.org/jira/browse/GRIFFIN-193
> Project: Griffin (Incubating)
>  Issue Type: Improvement
>Reporter: Nikolay Sokolov
>Assignee: Nikolay Sokolov
>Priority: Major
>
> While the profiling measure UI works fine at small scale, it becomes tricky 
> to use as the number of tables and databases grows, and is almost unusable 
> with 1000+ tables. APIs listing large numbers of tables fail frequently and 
> take minutes to complete, it is hard to find tables in the UI, and some 
> animations also start to slow down.
> This ticket will have subtasks for both UI and service improvements.





[jira] [Assigned] (GRIFFIN-194) [service] Hive API improvement

2018-11-17 Thread Nikolay Sokolov (JIRA)


 [ 
https://issues.apache.org/jira/browse/GRIFFIN-194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikolay Sokolov reassigned GRIFFIN-194:
---

Assignee: Nikolay Sokolov



