[jira] [Updated] (IMPALA-8473) Refactor lineage publication mechanism to allow for different consumers

2019-05-29 Thread Fredy Wijaya (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fredy Wijaya updated IMPALA-8473:
-
Issue Type: Sub-task  (was: Improvement)
Parent: IMPALA-8598

> Refactor lineage publication mechanism to allow for different consumers
> ---
>
> Key: IMPALA-8473
> URL: https://issues.apache.org/jira/browse/IMPALA-8473
> Project: IMPALA
> Issue Type: Sub-task
> Components: Backend, Frontend
> Reporter: radford nguyen
> Assignee: radford nguyen
> Priority: Critical
> Attachments: ImpalaPostExecHook-infra.patch
>
>
> Impetus for this change is to allow lineage to be consumed by Atlas via Kafka.
> h3. Design Proposal
> Implement a plugin approach (similar to {{authorization_provider}}) for 
> consuming query event hooks, where downstream users can provide their own 
> hook implementations as runtime dependencies.
> Keep but deprecate existing lineage event file writing.
> [~mad...@apache.org] has provided an fe patch (attached) with a suggested
> mechanism for allowing multiple hooks to be registered with the fe. Hooks
> would be invoked from the be at appropriate places, e.g.
> [https://github.com/apache/impala/blob/c1b0a073938c144e9bf33901bd4df6dcda0f09ec/be/src/service/impala-server.cc#L466].
> The hooks should all be executed asynchronously, so the current thinking is
> that this execution should happen in the fe, since the be does not know
> which hooks are registered. In other words, the
> {{ImpalaPostExecHookFactory.executeHooks}} method (see patch) should probably
> use a thread-pool executor service (or something similar) to execute all
> hooks in parallel and in a non-blocking manner, returning to the be as soon
> as possible.
>  
> h3. Code Review
> [https://gerrit.cloudera.org/#/c/13352/]
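
For illustration only, the asynchronous execution described above could be
sketched roughly as follows. This is not the code from the attached patch:
the PostExecHook interface, the HookExecutor class, the fixed pool size, and
passing lineage as a String are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class AsyncHookDemo {
  /** Hypothetical hook interface; the real one is defined by the patch. */
  interface PostExecHook {
    void onQueryComplete(String lineage);
  }

  /** Hypothetical stand-in for the fe-side entry point that the be would call. */
  static final class HookExecutor {
    private final List<PostExecHook> hooks;
    private final ExecutorService pool;

    HookExecutor(List<PostExecHook> hooks, int numThreads) {
      this.hooks = new ArrayList<>(hooks);
      this.pool = Executors.newFixedThreadPool(numThreads);
    }

    /** Queues one task per hook and returns without waiting for any of them. */
    void executeHooks(String lineage) {
      for (PostExecHook hook : hooks) {
        pool.submit(() -> {
          try {
            hook.onQueryComplete(lineage);
          } catch (RuntimeException e) {
            // One failing hook should not affect the other hooks or the caller.
            System.err.println("hook failed: " + e);
          }
        });
      }
    }

    /** Stops accepting new work; returns true if queued hooks finished in time. */
    boolean shutdownAndWait(long timeoutMs) {
      pool.shutdown();
      try {
        return pool.awaitTermination(timeoutMs, TimeUnit.MILLISECONDS);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
      }
    }
  }

  public static void main(String[] args) {
    List<String> seen = new CopyOnWriteArrayList<>();
    PostExecHook fileHook = l -> seen.add("file:" + l);
    PostExecHook brokenHook = l -> { throw new IllegalStateException("boom"); };
    PostExecHook kafkaHook = l -> seen.add("kafka:" + l);

    HookExecutor exec =
        new HookExecutor(Arrays.asList(fileHook, brokenHook, kafkaHook), 2);
    exec.executeHooks("{}");  // returns immediately; hooks run on pool threads
    exec.shutdownAndWait(5000);
    System.out.println("hooks completed: " + seen.size());
  }
}
```

Catching exceptions inside each task keeps a misbehaving hook from affecting
the others. A bounded queue or a per-hook timeout would be a reasonable
addition if hooks can block for a long time.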



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-8473) Refactor lineage publication mechanism to allow for different consumers

2019-05-23 Thread radford nguyen (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

radford nguyen updated IMPALA-8473:
---
Description: 
Impetus for this change is to allow lineage to be consumed by Atlas via Kafka.
h3. Design Proposal

Implement a plugin approach (similar to {{authorization_provider}}) for 
consuming query event hooks, where downstream users can provide their own hook 
implementations as runtime dependencies.

Keep but deprecate existing lineage event file writing.

[~mad...@apache.org] has provided a fe patch (attached) with suggested 
mechanism for allowing multiple hooks to be registered with the fe.  Hooks 
would be invoked from the be at appropriate places, e.g. 
[https://github.com/apache/impala/blob/c1b0a073938c144e9bf33901bd4df6dcda0f09ec/be/src/service/impala-server.cc#L466].
  The hooks should all be executed asynchronously, so the current thinking is 
that this execution should happen in the fe, since the be does not know about 
what hooks are registered.  IOW, the {{ImpalaPostExecHookFactory.executeHooks}} 
method (see patch) should probably make use of a thread-pool executor service 
(or something similar) in order to execute all hooks in parallel and in a 
non-blocking manner, returning to the be asap.

 
h3. Code Review

[https://gerrit.cloudera.org/#/c/13352/]

  was:
Impetus for this change is to allow lineage to be consumed by Atlas via Kafka.
h3. Design Proposal

Move lineage logging from be to fe, where we can make use of the same plugin 
approach as {{authorization_provider}} to allow a downstream user to provide 
their own lineage consumers as runtime dependencies.

[~mad...@apache.org] has provided a fe patch (attached) with suggested 
mechanism for allowing multiple hooks to be registered with the fe.  Hooks 
would be invoked from the be at appropriate places, e.g. 
[https://github.com/apache/impala/blob/c1b0a073938c144e9bf33901bd4df6dcda0f09ec/be/src/service/impala-server.cc#L466].
  The hooks should all be executed asynchronously, so the current thinking is 
that this execution should happen in the fe, since the be does not know about 
what hooks are registered.  IOW, the {{ImpalaPostExecHookFactory.executeHooks}} 
method (see patch) should probably make use of a thread-pool executor service 
(or something similar) in order to execute all hooks in parallel and in a 
non-blocking manner, returning to the be asap.

 
h3. Code Review

[https://gerrit.cloudera.org/#/c/13352/]


> Refactor lineage publication mechanism to allow for different consumers
> ---
>
> Key: IMPALA-8473
> URL: https://issues.apache.org/jira/browse/IMPALA-8473
> Project: IMPALA
> Issue Type: Improvement
> Components: Backend, Frontend
> Reporter: radford nguyen
> Assignee: radford nguyen
> Priority: Critical
> Attachments: ImpalaPostExecHook-infra.patch
>
>
> Impetus for this change is to allow lineage to be consumed by Atlas via Kafka.
> h3. Design Proposal
> Implement a plugin approach (similar to {{authorization_provider}}) for 
> consuming query event hooks, where downstream users can provide their own 
> hook implementations as runtime dependencies.
> Keep but deprecate existing lineage event file writing.
> [~mad...@apache.org] has provided a fe patch (attached) with suggested 
> mechanism for allowing multiple hooks to be registered with the fe.  Hooks 
> would be invoked from the be at appropriate places, e.g. 
> [https://github.com/apache/impala/blob/c1b0a073938c144e9bf33901bd4df6dcda0f09ec/be/src/service/impala-server.cc#L466].
>   The hooks should all be executed asynchronously, so the current thinking is 
> that this execution should happen in the fe, since the be does not know about 
> what hooks are registered.  IOW, the 
> {{ImpalaPostExecHookFactory.executeHooks}} method (see patch) should probably 
> make use of a thread-pool executor service (or something similar) in order to 
> execute all hooks in parallel and in a non-blocking manner, returning to the 
> be asap.
>  
> h3. Code Review
> [https://gerrit.cloudera.org/#/c/13352/]
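
As a rough illustration of the plugin approach described above, hook
implementations could be named in configuration and instantiated by
reflection at startup. Everything in this sketch is hypothetical: the
QueryEventHook interface, the LoggingHook class, and the idea of a
comma-separated list of class names are assumptions for the example, not the
mechanism in the attached patch or the one used by authorization_provider.

```java
import java.util.ArrayList;
import java.util.List;

class HookLoaderDemo {
  /** Hypothetical hook interface that downstream users would implement. */
  interface QueryEventHook {
    void onQueryComplete(String lineage);
  }

  /** Example implementation, standing in for a class shipped in a user's jar. */
  public static final class LoggingHook implements QueryEventHook {
    public LoggingHook() {}

    @Override
    public void onQueryComplete(String lineage) {
      System.out.println("lineage: " + lineage);
    }
  }

  /**
   * Instantiates every class named in a comma-separated list. Each class must
   * implement QueryEventHook and have a no-argument constructor.
   */
  static List<QueryEventHook> loadHooks(String commaSeparatedClassNames) {
    List<QueryEventHook> hooks = new ArrayList<>();
    for (String name : commaSeparatedClassNames.split(",")) {
      String trimmed = name.trim();
      if (trimmed.isEmpty()) continue;  // tolerate empty entries
      try {
        Class<?> clazz = Class.forName(trimmed);
        hooks.add(clazz.asSubclass(QueryEventHook.class)
            .getDeclaredConstructor()
            .newInstance());
      } catch (ReflectiveOperationException | ClassCastException e) {
        // Fail fast at startup rather than silently dropping a configured hook.
        throw new IllegalArgumentException("Cannot load hook class: " + trimmed, e);
      }
    }
    return hooks;
  }

  public static void main(String[] args) {
    // In a real deployment the class names would come from a startup flag.
    List<QueryEventHook> hooks = loadHooks(LoggingHook.class.getName());
    for (QueryEventHook hook : hooks) {
      hook.onQueryComplete("{}");
    }
  }
}
```

Because the classes are resolved by name at runtime, the hook implementations
only need to be on the classpath and do not have to be compile-time
dependencies of the fe.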






[jira] [Updated] (IMPALA-8473) Refactor lineage publication mechanism to allow for different consumers

2019-05-16 Thread radford nguyen (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

radford nguyen updated IMPALA-8473:
---
Description: 
Impetus for this change is to allow lineage to be consumed by Atlas via Kafka.
h3. Design Proposal

Move lineage logging from be to fe, where we can make use of the same plugin 
approach as {{authorization_provider}} to allow a downstream user to provide 
their own lineage consumers as runtime dependencies.

[~mad...@apache.org] has provided a fe patch (attached) with suggested 
mechanism for allowing multiple hooks to be registered with the fe.  Hooks 
would be invoked from the be at appropriate places, e.g. 
[https://github.com/apache/impala/blob/c1b0a073938c144e9bf33901bd4df6dcda0f09ec/be/src/service/impala-server.cc#L466].
  The hooks should all be executed asynchronously, so the current thinking is 
that this execution should happen in the fe, since the be does not know about 
what hooks are registered.  IOW, the {{ImpalaPostExecHookFactory.executeHooks}} 
method (see patch) should probably make use of a thread-pool executor service 
(or something similar) in order to execute all hooks in parallel and in a 
non-blocking manner, returning to the be asap.

 
h3. Code Review

[https://gerrit.cloudera.org/#/c/13352/]

  was:
Impetus for this change is to allow lineage to be consumed by Atlas via Kafka.
h3. Design Proposal

Move lineage logging from be to fe, where we can make use of the same plugin 
approach as {{authorization_provider}} to allow a downstream user to provide 
their own lineage consumers as runtime dependencies.

[~mad...@apache.org] has provided a fe patch (attached) with suggested 
mechanism for allowing multiple hooks to be registered with the fe.  Hooks 
would be invoked from the be at appropriate places, e.g. 
[https://github.com/apache/impala/blob/c1b0a073938c144e9bf33901bd4df6dcda0f09ec/be/src/service/impala-server.cc#L466].
  The hooks should all be executed asynchronously, so the current thinking is 
that this execution should happen in the fe, since the be does not know about 
what hooks are registered.  IOW, the {{ImpalaPostExecHookFactory.executeHooks}} 
method (see patch) should probably make use of a thread-pool executor service 
(or something similar) in order to execute all hooks in parallel and in a 
non-blocking manner, returning to the be asap.


> Refactor lineage publication mechanism to allow for different consumers
> ---
>
> Key: IMPALA-8473
> URL: https://issues.apache.org/jira/browse/IMPALA-8473
> Project: IMPALA
> Issue Type: Improvement
> Components: Backend, Frontend
> Reporter: radford nguyen
> Assignee: radford nguyen
> Priority: Critical
> Attachments: ImpalaPostExecHook-infra.patch
>
>
> Impetus for this change is to allow lineage to be consumed by Atlas via Kafka.
> h3. Design Proposal
> Move lineage logging from be to fe, where we can make use of the same plugin 
> approach as {{authorization_provider}} to allow a downstream user to provide 
> their own lineage consumers as runtime dependencies.
> [~mad...@apache.org] has provided a fe patch (attached) with suggested 
> mechanism for allowing multiple hooks to be registered with the fe.  Hooks 
> would be invoked from the be at appropriate places, e.g. 
> [https://github.com/apache/impala/blob/c1b0a073938c144e9bf33901bd4df6dcda0f09ec/be/src/service/impala-server.cc#L466].
>   The hooks should all be executed asynchronously, so the current thinking is 
> that this execution should happen in the fe, since the be does not know about 
> what hooks are registered.  IOW, the 
> {{ImpalaPostExecHookFactory.executeHooks}} method (see patch) should probably 
> make use of a thread-pool executor service (or something similar) in order to 
> execute all hooks in parallel and in a non-blocking manner, returning to the 
> be asap.
>  
> h3. Code Review
> [https://gerrit.cloudera.org/#/c/13352/]






[jira] [Updated] (IMPALA-8473) Refactor lineage publication mechanism to allow for different consumers

2019-05-02 Thread radford nguyen (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

radford nguyen updated IMPALA-8473:
---
Description: 
Impetus for this change is to allow lineage to be consumed by Atlas via Kafka.
h3. Design Proposal

Move lineage logging from be to fe, where we can make use of the same plugin 
approach as {{authorization_provider}} to allow a downstream user to provide 
their own lineage consumers as runtime dependencies.

[~mad...@apache.org] has provided a fe patch (attached) with suggested 
mechanism for allowing multiple hooks to be registered with the fe.  Hooks 
would be invoked from the be at appropriate places, e.g. 
[https://github.com/apache/impala/blob/c1b0a073938c144e9bf33901bd4df6dcda0f09ec/be/src/service/impala-server.cc#L466].
  The hooks should all be executed asynchronously, so the current thinking is 
that this execution should happen in the fe, since the be does not know about 
what hooks are registered.  IOW, the {{ImpalaPostExecHookFactory.executeHooks}} 
method (see patch) should probably make use of a thread-pool executor service 
(or something similar) in order to execute all hooks in parallel and in a 
non-blocking manner, returning to the be asap.

  was:
Impetus for this change is to allow lineage to be consumed by Atlas via Kafka.

Approach should be similar to that of choosing authorization provider, where 
the publication strategy can be chosen at runtime via configuration flag(s).

Scope of this ticket is to move lineage publication to the fe and add 
appropriate hooks that a user can implement.

[~mad...@apache.org] has provided a fe patch (attached) with suggested 
mechanism for allowing multiple hooks to be registered with a singleton.  
Singleton would be invoked from the be at appropriate places, e.g. 
[https://github.com/apache/impala/blob/c1b0a073938c144e9bf33901bd4df6dcda0f09ec/be/src/service/impala-server.cc#L466].
  The hooks should all be executed asynchronously, so the current thinking is 
that this execution should happen in the fe, since the be does not know about 
what hooks are registered.  IOW, the {{ImpalaPostExecHookFactory.executeHooks}} 
method (see patch) should probably make use of a thread-pool executor service 
(or something similar) in order to execute all hooks in parallel and in a 
non-blocking manner, returning to the be asap.


> Refactor lineage publication mechanism to allow for different consumers
> ---
>
> Key: IMPALA-8473
> URL: https://issues.apache.org/jira/browse/IMPALA-8473
> Project: IMPALA
> Issue Type: Improvement
> Components: Backend, Frontend
> Reporter: radford nguyen
> Assignee: radford nguyen
> Priority: Critical
> Attachments: ImpalaPostExecHook-infra.patch
>
>
> Impetus for this change is to allow lineage to be consumed by Atlas via Kafka.
> h3. Design Proposal
> Move lineage logging from be to fe, where we can make use of the same plugin 
> approach as {{authorization_provider}} to allow a downstream user to provide 
> their own lineage consumers as runtime dependencies.
> [~mad...@apache.org] has provided a fe patch (attached) with suggested 
> mechanism for allowing multiple hooks to be registered with the fe.  Hooks 
> would be invoked from the be at appropriate places, e.g. 
> [https://github.com/apache/impala/blob/c1b0a073938c144e9bf33901bd4df6dcda0f09ec/be/src/service/impala-server.cc#L466].
>   The hooks should all be executed asynchronously, so the current thinking is 
> that this execution should happen in the fe, since the be does not know about 
> what hooks are registered.  IOW, the 
> {{ImpalaPostExecHookFactory.executeHooks}} method (see patch) should probably 
> make use of a thread-pool executor service (or something similar) in order to 
> execute all hooks in parallel and in a non-blocking manner, returning to the 
> be asap.






[jira] [Updated] (IMPALA-8473) Refactor lineage publication mechanism to allow for different consumers

2019-05-02 Thread radford nguyen (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

radford nguyen updated IMPALA-8473:
---
Attachment: ImpalaPostExecHook-infra.patch

> Refactor lineage publication mechanism to allow for different consumers
> ---
>
> Key: IMPALA-8473
> URL: https://issues.apache.org/jira/browse/IMPALA-8473
> Project: IMPALA
> Issue Type: Improvement
> Components: Backend, Frontend
> Reporter: radford nguyen
> Assignee: radford nguyen
> Priority: Critical
> Attachments: ImpalaPostExecHook-infra.patch
>
>
> Impetus for this change is to allow lineage to be consumed by Atlas via Kafka.
> Approach should be similar to that of choosing authorization provider, where 
> the publication strategy can be chosen at runtime via configuration flag(s).
> Scope of this ticket is to move lineage publication to the fe and add 
> appropriate hooks that a user can implement.
> [~mad...@apache.org] has provided a fe patch (attached) with suggested 
> mechanism for allowing multiple hooks to be registered with a singleton.  
> Singleton would be invoked from the be at appropriate places, e.g. 
> [https://github.com/apache/impala/blob/c1b0a073938c144e9bf33901bd4df6dcda0f09ec/be/src/service/impala-server.cc#L466].
>   The hooks should all be executed asynchronously, so the current thinking is 
> that this execution should happen in the fe, since the be does not know about 
> what hooks are registered.  IOW, the 
> {{ImpalaPostExecHookFactory.executeHooks}} method (see patch) should probably 
> make use of a thread-pool executor service (or something similar) in order to 
> execute all hooks in parallel and in a non-blocking manner, returning to the 
> be asap.






[jira] [Updated] (IMPALA-8473) Refactor lineage publication mechanism to allow for different consumers

2019-05-02 Thread radford nguyen (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

radford nguyen updated IMPALA-8473:
---
Description: 
Impetus for this change is to allow lineage to be consumed by Atlas via Kafka.

Approach should be similar to that of choosing authorization provider, where 
the publication strategy can be chosen at runtime via configuration flag(s).

Scope of this ticket is to move lineage publication to the fe and add 
appropriate hooks that a user can implement.

[~mad...@apache.org] has provided a fe patch (attached) with suggested 
mechanism for allowing multiple hooks to be registered with a singleton.  
Singleton would be invoked from the be at appropriate places, e.g. 
[https://github.com/apache/impala/blob/c1b0a073938c144e9bf33901bd4df6dcda0f09ec/be/src/service/impala-server.cc#L466].
  The hooks should all be executed asynchronously, so the current thinking is 
that this execution should happen in the fe, since the be does not know about 
what hooks are registered.  IOW, the {{ImpalaPostExecHookFactory.executeHooks}} 
method (see patch) should probably make use of a thread-pool executor service 
(or something similar) in order to execute all hooks in parallel and in a 
non-blocking manner, returning to the be asap.

  was:
Impetus for this change is to allow lineage to be consumed by Atlas via Kafka.

Approach should be similar to that of choosing authorization provider, where 
the publication strategy can be chosen at runtime via configuration flag(s).

Scope of this ticket is to move lineage publication to the fe and add 
appropriate hooks that a user can implement


> Refactor lineage publication mechanism to allow for different consumers
> ---
>
> Key: IMPALA-8473
> URL: https://issues.apache.org/jira/browse/IMPALA-8473
> Project: IMPALA
> Issue Type: Improvement
> Components: Backend, Frontend
> Reporter: radford nguyen
> Assignee: radford nguyen
> Priority: Critical
>
> Impetus for this change is to allow lineage to be consumed by Atlas via Kafka.
> Approach should be similar to that of choosing authorization provider, where 
> the publication strategy can be chosen at runtime via configuration flag(s).
> Scope of this ticket is to move lineage publication to the fe and add 
> appropriate hooks that a user can implement.
> [~mad...@apache.org] has provided a fe patch (attached) with suggested 
> mechanism for allowing multiple hooks to be registered with a singleton.  
> Singleton would be invoked from the be at appropriate places, e.g. 
> [https://github.com/apache/impala/blob/c1b0a073938c144e9bf33901bd4df6dcda0f09ec/be/src/service/impala-server.cc#L466].
>   The hooks should all be executed asynchronously, so the current thinking is 
> that this execution should happen in the fe, since the be does not know about 
> what hooks are registered.  IOW, the 
> {{ImpalaPostExecHookFactory.executeHooks}} method (see patch) should probably 
> make use of a thread-pool executor service (or something similar) in order to 
> execute all hooks in parallel and in a non-blocking manner, returning to the 
> be asap.






[jira] [Updated] (IMPALA-8473) Refactor lineage publication mechanism to allow for different consumers

2019-04-30 Thread Dinesh Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dinesh Garg updated IMPALA-8473:

Priority: Critical  (was: Major)

> Refactor lineage publication mechanism to allow for different consumers
> ---
>
> Key: IMPALA-8473
> URL: https://issues.apache.org/jira/browse/IMPALA-8473
> Project: IMPALA
> Issue Type: Improvement
> Components: Backend, Frontend
> Reporter: radford nguyen
> Priority: Critical
>
> Impetus for this change is to allow lineage to be consumed by Atlas via Kafka.
> Approach should be similar to that of choosing authorization provider, where 
> the publication strategy can be chosen at runtime via configuration flag(s).
> Scope of this ticket is to move lineage publication to the fe and add 
> appropriate hooks that a user can implement






[jira] [Updated] (IMPALA-8473) Refactor lineage publication mechanism to allow for different consumers

2019-04-30 Thread radford nguyen (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

radford nguyen updated IMPALA-8473:
---
Description: 
Impetus for this change is to allow lineage to be consumed by Atlas via Kafka.

Approach should be similar to that of choosing authorization provider, where 
the publication strategy can be chosen at runtime via configuration flag(s).

Scope of this ticket is to move lineage publication to the fe and add 
appropriate hooks that a user can implement

  was:
Impetus for this change is to allow lineage to be consumed by Altus via Kafka.

Approach should be similar to that of choosing authorization provider, where 
the publication strategy can be chosen at runtime via configuration flag(s).

Scope of this ticket is to move lineage publication to the fe and add 
appropriate hooks that a user can implement


> Refactor lineage publication mechanism to allow for different consumers
> ---
>
> Key: IMPALA-8473
> URL: https://issues.apache.org/jira/browse/IMPALA-8473
> Project: IMPALA
> Issue Type: Improvement
> Components: Backend, Frontend
> Reporter: radford nguyen
> Priority: Major
>
> Impetus for this change is to allow lineage to be consumed by Atlas via Kafka.
> Approach should be similar to that of choosing authorization provider, where 
> the publication strategy can be chosen at runtime via configuration flag(s).
> Scope of this ticket is to move lineage publication to the fe and add 
> appropriate hooks that a user can implement


