[Impala-ASF-CR] IMPALA-8571: harden QueryEventHook execution

2019-10-17 Thread radford nguyen (Code Review)
Hello Bharath Vissapragada, Andrew Sherman, Fredy Wijaya, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13748

to look at the new patch set (#29).

Change subject: IMPALA-8571: harden QueryEventHook execution
..

IMPALA-8571: harden QueryEventHook execution

This commit hardens guarantees around QueryEventHook execution by
adding the following features:

*hook execution*

Query event hooks are still executed in a thread-pool, but you can
now configure the type of (Java) threads used to execute them via
the newly-added backend flag `query_event_hook_use_daemon_threads`,
which takes a true/false value.

A daemon thread is one that will be halted immediately when all
non-daemon thread have completed, for example when the JVM is
shutdown.  There are some technical implications of this, so please
see Java Thread.setDaemon(boolean) for more details.

*hook timeout/cancellation*

A timeout for hook execution can be configured through the backend
flag `query_event_hook_timeout_s`, which specifies a timeout value
in seconds. If a hook has not completed execution within this timeout
(measured from hook submission, not execution) then the hook task
will be cancelled in order to free up resources.

*hook rejection*

The hook execution engine now has a fixed-capacity work queue whose
capacity can be configured through the backend flag
`query_event_hook_queue_capacity`. This queue is used to store
hook tasks that are submitted when there are no free threads
available for hook execution. All hook tasks submitted when the
queue is at capacity will be rejected and logged without affecting
the result of the query.

*hook performance metrics*

Hook performance metrics are captured by the frontend, but not yet
published to the backend and are therefore not user-visible. See
IMPALA-8914 for implementation of this feature.

Testing:

- added unit tests for new features
- re-ran existing E2E tests

Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
---
M be/src/service/frontend.h
M be/src/service/impala-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
A fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java
M fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
A 
fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java
M fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java
M query-event-hook-api/src/main/java/org/apache/impala/hooks/QueryEventHook.java
11 files changed, 834 insertions(+), 118 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/13748/29
--
To view, visit http://gerrit.cloudera.org:8080/13748
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
Gerrit-Change-Number: 13748
Gerrit-PatchSet: 29
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: radford nguyen 


[Impala-ASF-CR] IMPALA-8571: harden QueryEventHook execution

2019-10-17 Thread radford nguyen (Code Review)
radford nguyen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13748 )

Change subject: IMPALA-8571: harden QueryEventHook execution
..


Patch Set 28:

(6 comments)

> Patch Set 28:
>
> (25 comments)
>
> Hi Radford. As Bharath and Fredy have disappeared I'll review this now :-)
> Code looks good. Like you I'm slightly worried about the timeout queue. It's 
> somehow wasteful to keep all that state around when the common case is that 
> the tasks completed quickly.

I think I have a better implementation for the timeout check that won't allow 
unbounded growth of the queue.  Before I upload that patchset, the next one 
addresses the rest of the feedback

http://gerrit.cloudera.org:8080/#/c/13748/28//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13748/28//COMMIT_MSG@19
PS28, Line 19: See Java Thread.setDaemon(boolean) for what a "daemon thread" 
entails.
> How can a user decide whether to use daemon threads?
I think it depends on whether or not they feel that query hook execution should 
block shutdown of the JVM, since a daemon thread should be killed immediately 
when all regular threads have completed.  There are some details like the fact 
that `finally` blocks are not executed if a daemon thread is halted in this 
way; they are kind of technical so I didn't want to document them here but 
refer them to a more-definitive source.


http://gerrit.cloudera.org:8080/#/c/13748/28/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java
File 
fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java:

http://gerrit.cloudera.org:8080/#/c/13748/28/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@82
PS28, Line 82:  * execute. You then create an executor with a thread pool of 
size 1 and a hook
> You can consider this a nit :-) , but I find the use of 'you' jarring
Flashbacks to my technical-writing course :)

(fixed)


http://gerrit.cloudera.org:8080/#/c/13748/28/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@182
PS28, Line 182: final ArrayBlockingQueue boundedQueue =
> Can you use BlockingQueue ?
You mean declare the var as the most general type possible, right?  Done.


http://gerrit.cloudera.org:8080/#/c/13748/28/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@290
PS28, Line 290: t);
> Does LOG.error work when there is an extra Throwable argument? Looks a bit
Yep, in fact it will log the stack trace this way! Does look weird and I always 
end up running tests to (re)convince myself each time I use it


http://gerrit.cloudera.org:8080/#/c/13748/28/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@318
PS28, Line 318: // task queue to grow unbounded if hook tasks are scheduled 
at an interval
> This approach seems clever, and the code is neat and tidy...
For a single timeout task, I think the memory overhead is that of the 
`Runnable` of the task itself, maybe plus a pointer to it.  Call that overhead 
`N`.

If hooks are scheduled every 30 min or more, then there will only ever be max N 
extraneous garbage sitting around.  Anything more frequent than that will cause 
(max) extraneous garbage to grow in increments of `N`.

It's this growth that is worrying to me, but I think I have an alternative way 
to implement this class that retains the simplicity but ensures that there is 
never any extraneous garbage.  I'll need to extend `ExecutorService` in order 
to do this though, so look for that in the next patchset.


http://gerrit.cloudera.org:8080/#/c/13748/28/fe/src/main/java/org/apache/impala/service/Frontend.java
File fe/src/main/java/org/apache/impala/service/Frontend.java:

http://gerrit.cloudera.org:8080/#/c/13748/28/fe/src/main/java/org/apache/impala/service/Frontend.java@1624
PS28, Line 1624:* for {@link QueryEventHook}, which is part of the 
published api.
> I would like to see the guarantees here.
So the `QueryEventHook` class is published in a separate jar than the rest of 
the FE, the intent being that a downstream hook implementation project can 
import only what is needed and remain decoupled from FE implementation.  That 
was my motivation for moving the guarantees there; I deemed the guarantees 
most-useful to users who will be writing hooks.

I removed the guarantees from this javadoc so that we wouldn't have duplication.

It does seem wrong to me that the guarantees are documented in one place but 
implemented in another though.  Do you think that I should move them back here 
and reference them from the `QueryEventHook` javadoc?  I just don't know how a 
user who only imports the published-api will be able to reference this javadoc.



--
To view, visit http://gerrit.cloudera.org:8080/13748
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Me

[Impala-ASF-CR] IMPALA-9053: DDLs should generate lineage graphs.

2019-10-16 Thread radford nguyen (Code Review)
radford nguyen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14458 )

Change subject: IMPALA-9053: DDLs should generate lineage graphs.
..


Patch Set 1: Code-Review+1

I've touched lineage-related code in the past but have to admit that I don't 
really have much domain knowledge.  That said, the changes look good to me


--
To view, visit http://gerrit.cloudera.org:8080/14458
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia6c7ed9fe3265fd777fe93590cf4eb2d9ba0dd1e
Gerrit-Change-Number: 14458
Gerrit-PatchSet: 1
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: radford nguyen 
Gerrit-Comment-Date: Wed, 16 Oct 2019 19:48:57 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8571: harden QueryEventHook execution

2019-10-16 Thread radford nguyen (Code Review)
radford nguyen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13748 )

Change subject: IMPALA-8571: harden QueryEventHook execution
..


Patch Set 28:

(1 comment)

added some detail to the comment that needs attention

http://gerrit.cloudera.org:8080/#/c/13748/28/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java
File 
fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java:

http://gerrit.cloudera.org:8080/#/c/13748/28/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@313
PS28, Line 313: // TODO: IMPALA-XXX
> In this latest approach, I schedule a task that will cancel the hook task i
Note that this potential for unbounded growth is possible even though the hook 
task executor has a bounded work queue.  Consider the following scenario:

(1) user configures a hook timeout of 3 seconds
(2) hook tasks take 1 second to execute
(3) hook tasks are submitted every 2 seconds (due to query execution)

Because (2) and (3), hook tasks will never fill the bounded queue and we can 
say that a hook task will be submitted every 2 seconds.  Every hook task 
submitted induces a timeout task to begin after 3 seconds.  Because timeout 
tasks are effectively submitted every 2 seconds but only run after 3 seconds, 
this will cause the timeout task queue to grow.



--
To view, visit http://gerrit.cloudera.org:8080/13748
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
Gerrit-Change-Number: 13748
Gerrit-PatchSet: 28
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: radford nguyen 
Gerrit-Comment-Date: Wed, 16 Oct 2019 14:34:55 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8571: harden QueryEventHook execution

2019-10-16 Thread radford nguyen (Code Review)
radford nguyen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13748 )

Change subject: IMPALA-8571: harden QueryEventHook execution
..


Patch Set 28:

(1 comment)

I've implemented a simpler approach as suggested by Bharath, but it comes with 
a caveat.  Please carefully review where I've commented (PS28, 
FixedCapacityQueryHookExecutor.java:313)

http://gerrit.cloudera.org:8080/#/c/13748/28/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java
File 
fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java:

http://gerrit.cloudera.org:8080/#/c/13748/28/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@313
PS28, Line 313: // TODO: IMPALA-XXX
In this latest approach, I schedule a task that will cancel the hook task if it 
is still running after `hookTimeout` duration.  This makes the code simpler 
(fewer threads) and easier to reason about (straight-forward use of executors). 
 However, it comes at a cost, described in the comment.

I've pre-emptively placed a TODO comment here to draw attention to it



--
To view, visit http://gerrit.cloudera.org:8080/13748
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
Gerrit-Change-Number: 13748
Gerrit-PatchSet: 28
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: radford nguyen 
Gerrit-Comment-Date: Wed, 16 Oct 2019 13:45:54 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8571: harden QueryEventHook execution

2019-10-16 Thread radford nguyen (Code Review)
Hello Bharath Vissapragada, Andrew Sherman, Fredy Wijaya, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13748

to look at the new patch set (#28).

Change subject: IMPALA-8571: harden QueryEventHook execution
..

IMPALA-8571: harden QueryEventHook execution

This commit hardens guarantees around QueryEventHook execution by
adding the following features:

*hook execution*

Query event hooks are still executed in a thread-pool, but you can
now configure the type of (Java) threads used to execute them via
the newly-added backend flag `query_event_hook_use_daemon_threads`,
which takes a true/false value.

See Java Thread.setDaemon(boolean) for what a "daemon thread" entails.

*hook timeout/cancellation*

A timeout for hook execution can be configured through the backend
flag `query_event_hook_timeout_s`, which specified a timeout value
in seconds. If a hook has not completed execution within this timeout
(measured from hook submission, not execution) then the hook task
will be cancelled in order to free up resources.

*hook rejection*

The hook execution engine now has a fixed-capacity work queue whose
capacity can be configured through the backend flag
`query_event_hook_queue_capacity`. This queue is used to store
hook tasks that are submitted when there are no free threads
available for hook execution. All hook tasks submitted when the
queue is at capacity will be rejected and logged without affecting
the result of the query.

*hook performance metrics*

Hook performance metrics are captured by the frontend, but not yet
published to the backend and are therefore not user-visible. See
IMPALA-8914 for implementation of this feature.

Testing:

- added unit tests for new features
- re-ran existing E2E tests

Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
---
M be/src/service/frontend.h
M be/src/service/impala-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
A fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java
M fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
A 
fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java
M fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java
M query-event-hook-api/src/main/java/org/apache/impala/hooks/QueryEventHook.java
11 files changed, 814 insertions(+), 114 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/13748/28
--
To view, visit http://gerrit.cloudera.org:8080/13748
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
Gerrit-Change-Number: 13748
Gerrit-PatchSet: 28
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: radford nguyen 


[Impala-ASF-CR] IMPALA-8571: harden QueryEventHook execution

2019-10-03 Thread radford nguyen (Code Review)
radford nguyen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13748 )

Change subject: IMPALA-8571: harden QueryEventHook execution
..


Patch Set 27:

Ok, I've come up with a simpler timeout-monitor implementation based on 
feedback; it uses a single-threaded `ScheduledThreadPoolExecutor` to cancel 
hook tasks.  It should be easier to reason about than the current impl.

Need to write some tests before pushing the patchset.


--
To view, visit http://gerrit.cloudera.org:8080/13748
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
Gerrit-Change-Number: 13748
Gerrit-PatchSet: 27
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: radford nguyen 
Gerrit-Comment-Date: Thu, 03 Oct 2019 10:18:58 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8571: harden QueryEventHook execution

2019-09-26 Thread radford nguyen (Code Review)
radford nguyen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13748 )

Change subject: IMPALA-8571: harden QueryEventHook execution
..


Patch Set 26:

(1 comment)

Missed a nit...

http://gerrit.cloudera.org:8080/#/c/13748/25/fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java
File fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java:

http://gerrit.cloudera.org:8080/#/c/13748/25/fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java@248
PS25, Line 248:   static final String ON_QUERY_COMPLETE = "onQueryComplete";
> nit: Move to the top, static final consts are usually on the top.
Done



--
To view, visit http://gerrit.cloudera.org:8080/13748
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
Gerrit-Change-Number: 13748
Gerrit-PatchSet: 26
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: radford nguyen 
Gerrit-Comment-Date: Thu, 26 Sep 2019 23:15:57 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8571: harden QueryEventHook execution

2019-09-26 Thread radford nguyen (Code Review)
Hello Bharath Vissapragada, Andrew Sherman, Fredy Wijaya, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13748

to look at the new patch set (#27).

Change subject: IMPALA-8571: harden QueryEventHook execution
..

IMPALA-8571: harden QueryEventHook execution

This commit hardens guarantees around QueryEventHook execution by
adding the following features:

*hook execution*

Query event hooks are still executed in a thread-pool, but you can
now configure the type of (Java) threads used to execute them via
the newly-added backend flag `query_event_hook_use_daemon_threads`,
which takes a true/false value.

See Java Thread.setDaemon(boolean) for what a "daemon thread" entails.

*hook timeout/cancellation*

A timeout for hook execution can be configured through the backend
flag `query_event_hook_timeout_s`, which specified a timeout value
in seconds. If a hook has not completed execution within this timeout
(measured from hook submission, not execution) then the hook task
will be cancelled in order to free up resources.

*hook rejection*

The hook execution engine now has a fixed-capacity work queue whose
capacity can be configured through the backend flag
`query_event_hook_queue_capacity`. This queue is used to store
hook tasks that are submitted when there are no free threads
available for hook execution. All hook tasks submitted when the
queue is at capacity will be rejected and logged without affecting
the result of the query.

*hook performance metrics*

Hook performance metrics are captured by the frontend, but not yet
published to the backend and are therefore not user-visible. See
IMPALA-8914 for implementation of this feature.

Testing:

- added unit tests for new features
- re-ran existing E2E tests

Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
---
M be/src/service/frontend.h
M be/src/service/impala-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
A fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java
M fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
A 
fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java
M fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java
M query-event-hook-api/src/main/java/org/apache/impala/hooks/QueryEventHook.java
11 files changed, 843 insertions(+), 114 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/13748/27
--
To view, visit http://gerrit.cloudera.org:8080/13748
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
Gerrit-Change-Number: 13748
Gerrit-PatchSet: 27
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: radford nguyen 


[Impala-ASF-CR] IMPALA-8571: harden QueryEventHook execution

2019-09-26 Thread radford nguyen (Code Review)
radford nguyen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13748 )

Change subject: IMPALA-8571: harden QueryEventHook execution
..


Patch Set 26:

(5 comments)

Addressed some nits; still need to look into simplifying the timeout monitor 
executor

http://gerrit.cloudera.org:8080/#/c/13748/25//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13748/25//COMMIT_MSG@41
PS25, Line 41: but not yet
 : published to the backend and are therefore not user-visible
> I'm guessing this is done as a separate patch?
Yes, I was asked to split out the metrics into a separate ticket in order to 
speed up acceptance of the hardening/monitoring


http://gerrit.cloudera.org:8080/#/c/13748/25/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/13748/25/be/src/service/impala-server.cc@285
PS25, Line 285: "
> nit: Prefix with "(Advanced)" like in other cases. We typically do not want
Will do


http://gerrit.cloudera.org:8080/#/c/13748/25/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java
File 
fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java:

http://gerrit.cloudera.org:8080/#/c/13748/25/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@169
PS25, Line 169:  this.metrics_ = Objects.requireNonNull(hookMetrics);
  :
  : Preconditions.checkArgument(hookTimeout >= 1,
  : "hook timeout should be >= 1 but was " + hookTimeout);
  : t
> nit: Preconditions.checkArgument(hookTimeout >=1,  mesg);
Ooh, I also used the wrong templating format in the log message and forgot the 
timeout:

```
String.format("hook timeout should be >= 1 but was %s.", hookTimeout);
```


http://gerrit.cloudera.org:8080/#/c/13748/25/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@181
PS25, Line 181:
> nit: extra newline.
Done


http://gerrit.cloudera.org:8080/#/c/13748/25/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@190
PS25, Line 190: t-
> nit: space
Done



--
To view, visit http://gerrit.cloudera.org:8080/13748
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
Gerrit-Change-Number: 13748
Gerrit-PatchSet: 26
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: radford nguyen 
Gerrit-Comment-Date: Thu, 26 Sep 2019 23:09:55 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8571: harden QueryEventHook execution

2019-09-26 Thread radford nguyen (Code Review)
Hello Bharath Vissapragada, Andrew Sherman, Fredy Wijaya, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13748

to look at the new patch set (#26).

Change subject: IMPALA-8571: harden QueryEventHook execution
..

IMPALA-8571: harden QueryEventHook execution

This commit hardens guarantees around QueryEventHook execution by
adding the following features:

*hook execution*

Query event hooks are still executed in a thread-pool, but you can
now configure the type of (Java) threads used to execute them via
the newly-added backend flag `query_event_hook_use_daemon_threads`,
which takes a true/false value.

See Java Thread.setDaemon(boolean) for what a "daemon thread" entails.

*hook timeout/cancellation*

A timeout for hook execution can be configured through the backend
flag `query_event_hook_timeout_s`, which specified a timeout value
in seconds. If a hook has not completed execution within this timeout
(measured from hook submission, not execution) then the hook task
will be cancelled in order to free up resources.

*hook rejection*

The hook execution engine now has a fixed-capacity work queue whose
capacity can be configured through the backend flag
`query_event_hook_queue_capacity`. This queue is used to store
hook tasks that are submitted when there are no free threads
available for hook execution. All hook tasks submitted when the
queue is at capacity will be rejected and logged without affecting
the result of the query.

*hook performance metrics*

Hook performance metrics are captured by the frontend, but not yet
published to the backend and are therefore not user-visible. See
IMPALA-8914 for implementation of this feature.

Testing:

- added unit tests for new features
- re-ran existing E2E tests

Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
---
M be/src/service/frontend.h
M be/src/service/impala-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
A fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java
M fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
A 
fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java
M fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java
M query-event-hook-api/src/main/java/org/apache/impala/hooks/QueryEventHook.java
11 files changed, 841 insertions(+), 112 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/13748/26
--
To view, visit http://gerrit.cloudera.org:8080/13748
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
Gerrit-Change-Number: 13748
Gerrit-PatchSet: 26
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: radford nguyen 


[Impala-ASF-CR] IMPALA-8571: harden QueryEventHook execution

2019-08-30 Thread radford nguyen (Code Review)
Hello Bharath Vissapragada, Andrew Sherman, Fredy Wijaya, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13748

to look at the new patch set (#25).

Change subject: IMPALA-8571: harden QueryEventHook execution
..

IMPALA-8571: harden QueryEventHook execution

This commit hardens guarantees around QueryEventHook execution by
adding the following features:

*hook execution*

Query event hooks are still executed in a thread-pool, but you can
now configure the type of (Java) threads used to execute them via
the newly-added backend flag `query_event_hook_use_daemon_threads`,
which takes a true/false value.

See Java Thread.setDaemon(boolean) for what a "daemon thread" entails.

*hook timeout/cancellation*

A timeout for hook execution can be configured through the backend
flag `query_event_hook_timeout_s`, which specified a timeout value
in seconds. If a hook has not completed execution within this timeout
(measured from hook submission, not execution) then the hook task
will be cancelled in order to free up resources.

*hook rejection*

The hook execution engine now has a fixed-capacity work queue whose
capacity can be configured through the backend flag
`query_event_hook_queue_capacity`. This queue is used to store
hook tasks that are submitted when there are no free threads
available for hook execution. All hook tasks submitted when the
queue is at capacity will be rejected and logged without affecting
the result of the query.

*hook performance metrics*

Hook performance metrics are captured by the frontend, but not yet
published to the backend and are therefore not user-visible. See
IMPALA-8914 for implementation of this feature.

Testing:

- added unit tests for new features
- re-ran existing E2E tests

Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
---
M be/src/service/frontend.h
M be/src/service/impala-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
A fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java
M fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
A 
fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java
M fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java
M query-event-hook-api/src/main/java/org/apache/impala/hooks/QueryEventHook.java
11 files changed, 843 insertions(+), 112 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/13748/25
--
To view, visit http://gerrit.cloudera.org:8080/13748
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
Gerrit-Change-Number: 13748
Gerrit-PatchSet: 25
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: radford nguyen 


[Impala-ASF-CR] IMPALA-8571: harden QueryEventHook execution

2019-08-30 Thread radford nguyen (Code Review)
Hello Bharath Vissapragada, Andrew Sherman, Fredy Wijaya, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13748

to look at the new patch set (#24).

Change subject: IMPALA-8571: harden QueryEventHook execution
..

IMPALA-8571: harden QueryEventHook execution

This commit hardens guarantees around QueryEventHook execution by
adding the following features:

*hook execution*

Query event hooks are still executed in a thread-pool, but you can
now configure the type of (Java) threads used to execute them via
the newly-added backend flag `query_event_hook_use_daemon_threads`,
which takes a true/false value.

See Java Thread.setDaemon(boolean) for what a "daemon thread" entails.

*hook timeout/cancellation*

A timeout for hook execution can be configured through the backend
flag `query_event_hook_timeout_s`, which specified a timeout value
in seconds. If a hook has not completed execution within this timeout
(measured from hook submission, not execution) then the hook task
will be cancelled in order to free up resources.

*hook rejection*

The hook execution engine now has a fixed-capacity work queue whose
capacity can be configured through the backend flag
`query_event_hook_queue_capacity`. This queue is used to store
hook tasks that are submitted when there are no free threads
available for hook execution. All hook tasks submitted when the
queue is at capacity will be rejected and logged without affecting
the result of the query.

*hook performance metrics*

Hook performance metrics are captured by the frontend, but not yet
published to the backend and are therefore not user-visible. See
IMPALA-8914 for implementation of this feature.

Testing:

- added unit tests for new features
- re-ran existing E2E tests

Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
---
M be/src/service/frontend.h
M be/src/service/impala-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
A fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java
M fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
A 
fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java
M fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java
M query-event-hook-api/src/main/java/org/apache/impala/hooks/QueryEventHook.java
12 files changed, 968 insertions(+), 111 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/13748/24
--
To view, visit http://gerrit.cloudera.org:8080/13748
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
Gerrit-Change-Number: 13748
Gerrit-PatchSet: 24
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: radford nguyen 


[Impala-ASF-CR] IMPALA-8571[WIP]: improve QueryEventHook execution

2019-08-27 Thread radford nguyen (Code Review)
Hello Bharath Vissapragada, Andrew Sherman, Fredy Wijaya, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13748

to look at the new patch set (#22).

Change subject: IMPALA-8571[WIP]: improve QueryEventHook execution
..

IMPALA-8571[WIP]: improve QueryEventHook execution

(WIP because still need to implement polling of metrics by backend)

This commit hardens guarantees around QueryEventHook execution by
adding the following features:

*hook execution*

Query event hooks are still executed in a thread-pool, but you can
now configure the type of (Java) threads used to execute them via
the newly-added backend flag `query_event_hook_use_daemon_threads`,
which takes a true/false value.

See Java Thread.setDaemon(boolean) for what a "daemon thread" entails.

*hook timeout/cancellation*

A timeout for hook execution can be configured through the backend
flag `query_event_hook_timeout_s`, which specified a timeout value
in seconds. If a hook has not completed execution within this timeout
(measured from hook submission, not execution) then the hook task
will be cancelled in order to free up resources.

*hook rejection*

The hook execution engine now has a fixed-capacity work queue whose
capacity can be configured through the backend flag
`query_event_hook_queue_capacity`. This queue is used to store
hook tasks that are submitted when there are no free threads
available for hook execution. All hook tasks submitted when the
queue is at capacity will be rejected and logged without affecting
the result of the query.

*hook performance metrics*

The following hook metrics are captured:

*query-event-hook.${hook_method}.execution-rejections*

Counter indicating how many submitted tasks have been rejected
due to a full work queue

*query-event-hook.${hook_method}.execution-exceptions*

Counter indicating how many tasks have thrown an exception
during execution

*query-event-hook.${hook_method}.execution-timeouts*

Counter indicating how many tasks have been cancelled due to
not completing within {@code hookTimeout_s} of submission.

*query-event-hook.${hook_method}.execution-submissions*

Counter indicating the number of times ${hookClass}.${method}
has been submitted for execution.

*query-event-hook.${hook_method}.mean-execution-time*

Mean time in [ns] that ${hook_name} has taken to complete,
whether normally or by error (e.g. timeout or exception).

*query-event-hook.${hook_method}.mean-queued-time*

Mean time in [ns] between hook task submission
and hook task execution. This indicates how long a hook
task has been queued waiting to execute.

Testing:

- added unit tests for new features
- re-ran existing E2E tests

Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
---
M be/src/service/frontend.cc
M be/src/service/frontend.h
M be/src/service/impala-server.cc
M be/src/util/CMakeLists.txt
M be/src/util/backend-gflag-util.cc
A be/src/util/hook-metrics-test.cc
A be/src/util/hook-metrics.cc
A be/src/util/hook-metrics.h
M common/thrift/BackendGflags.thrift
M common/thrift/Frontend.thrift
M common/thrift/metrics.json
A fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java
M fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
A 
fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java
M fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java
M query-event-hook-api/src/main/java/org/apache/impala/hooks/QueryEventHook.java
19 files changed, 1,364 insertions(+), 112 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/13748/22
--
To view, visit http://gerrit.cloudera.org:8080/13748
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
Gerrit-Change-Number: 13748
Gerrit-PatchSet: 22
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: radford nguyen 


[Impala-ASF-CR] IMPALA-8571[WIP]: improve QueryEventHook execution

2019-08-27 Thread radford nguyen (Code Review)
Hello Bharath Vissapragada, Andrew Sherman, Fredy Wijaya, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13748

to look at the new patch set (#23).

Change subject: IMPALA-8571[WIP]: improve QueryEventHook execution
..

IMPALA-8571[WIP]: improve QueryEventHook execution

(WIP because still need to implement polling of metrics by backend)

This commit hardens guarantees around QueryEventHook execution by
adding the following features:

*hook execution*

Query event hooks are still executed in a thread-pool, but you can
now configure the type of (Java) threads used to execute them via
the newly-added backend flag `query_event_hook_use_daemon_threads`,
which takes a true/false value.

See Java Thread.setDaemon(boolean) for what a "daemon thread" entails.

*hook timeout/cancellation*

A timeout for hook execution can be configured through the backend
flag `query_event_hook_timeout_s`, which specified a timeout value
in seconds. If a hook has not completed execution within this timeout
(measured from hook submission, not execution) then the hook task
will be cancelled in order to free up resources.

*hook rejection*

The hook execution engine now has a fixed-capacity work queue whose
capacity can be configured through the backend flag
`query_event_hook_queue_capacity`. This queue is used to store
hook tasks that are submitted when there are no free threads
available for hook execution. All hook tasks submitted when the
queue is at capacity will be rejected and logged without affecting
the result of the query.

*hook performance metrics*

The following hook metrics are captured:

*query-event-hook.${hook_method}.execution-rejections*

Counter indicating how many submitted tasks have been rejected
due to a full work queue

*query-event-hook.${hook_method}.execution-exceptions*

Counter indicating how many tasks have thrown an exception
during execution

*query-event-hook.${hook_method}.execution-timeouts*

Counter indicating how many tasks have been cancelled due to
not completing within {@code hookTimeout_s} of submission.

*query-event-hook.${hook_method}.execution-submissions*

Counter indicating the number of times ${hookClass}.${method}
has been submitted for execution.

*query-event-hook.${hook_method}.mean-execution-time*

Mean time in [ns] that ${hook_name} has taken to complete,
whether normally or by error (e.g. timeout or exception).

*query-event-hook.${hook_method}.mean-queued-time*

Mean time in [ns] between hook task submission
and hook task execution. This indicates how long a hook
task has been queued waiting to execute.

Testing:

- added unit tests for new features
- re-ran existing E2E tests

Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
---
M be/src/service/frontend.cc
M be/src/service/frontend.h
M be/src/service/impala-server.cc
M be/src/util/CMakeLists.txt
M be/src/util/backend-gflag-util.cc
A be/src/util/hook-metrics-test.cc
A be/src/util/hook-metrics.cc
A be/src/util/hook-metrics.h
M common/thrift/BackendGflags.thrift
M common/thrift/Frontend.thrift
M common/thrift/metrics.json
A fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java
M fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
A 
fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java
M fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java
M query-event-hook-api/src/main/java/org/apache/impala/hooks/QueryEventHook.java
19 files changed, 1,361 insertions(+), 112 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/13748/23
--
To view, visit http://gerrit.cloudera.org:8080/13748
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
Gerrit-Change-Number: 13748
Gerrit-PatchSet: 23
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: radford nguyen 


[Impala-ASF-CR] IMPALA-8571[WIP]: improve QueryEventHook execution

2019-08-27 Thread radford nguyen (Code Review)
Hello Bharath Vissapragada, Andrew Sherman, Fredy Wijaya, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13748

to look at the new patch set (#21).

Change subject: IMPALA-8571[WIP]: improve QueryEventHook execution
..

IMPALA-8571[WIP]: improve QueryEventHook execution

(WIP because still need to implement polling of metrics by backend)

This commit hardens guarantees around QueryEventHook execution by
adding the following features:

*hook execution*

Query event hooks are still executed in a thread-pool, but you can
now configure the type of (Java) threads used to execute them via
the newly-added backend flag `query_event_hook_use_daemon_threads`,
which takes a true/false value.

See Java Thread.setDaemon(boolean) for what a "daemon thread" entails.

*hook timeout/cancellation*

A timeout for hook execution can be configured through the backend
flag `query_event_hook_timeout_s`, which specified a timeout value
in seconds. If a hook has not completed execution within this timeout
(measured from hook submission, not execution) then the hook task
will be cancelled in order to free up resources.

*hook rejection*

The hook execution engine now has a fixed-capacity work queue whose
capacity can be configured through the backend flag
`query_event_hook_queue_capacity`. This queue is used to store
hook tasks that are submitted when there are no free threads
available for hook execution. All hook tasks submitted when the
queue is at capacity will be rejected and logged without affecting
the result of the query.

*hook performance metrics*

The following hook metrics are captured:

*query-event-hook.${hook_method}.execution-rejections*

Counter indicating how many submitted tasks have been rejected
due to a full work queue

*query-event-hook.${hook_method}.execution-exceptions*

Counter indicating how many tasks have thrown an exception
during execution

*query-event-hook.${hook_method}.execution-timeouts*

Counter indicating how many tasks have been cancelled due to
not completing within {@code hookTimeout_s} of submission.

*query-event-hook.${hook_method}.execution-submissions*

Counter indicating the number of times ${hookClass}.${method}
has been submitted for execution.

*query-event-hook.${hook_method}.mean-execution-time*

Mean time in [ns] that ${hook_name} has taken to complete,
whether normally or by error (e.g. timeout or exception).

*query-event-hook.${hook_method}.mean-queued-time*

Mean time in [ns] between hook task submission
and hook task execution. This indicates how long a hook
task has been queued waiting to execute.

Testing:

- added unit tests for new features
- re-ran existing E2E tests

Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
---
M be/src/service/frontend.cc
M be/src/service/frontend.h
M be/src/service/impala-server.cc
M be/src/util/CMakeLists.txt
M be/src/util/backend-gflag-util.cc
A be/src/util/hook-metrics-test.cc
A be/src/util/hook-metrics.cc
A be/src/util/hook-metrics.h
M common/thrift/BackendGflags.thrift
M common/thrift/Frontend.thrift
M common/thrift/metrics.json
A fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java
M fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
A 
fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java
M fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java
M query-event-hook-api/src/main/java/org/apache/impala/hooks/QueryEventHook.java
19 files changed, 1,364 insertions(+), 112 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/13748/21
--
To view, visit http://gerrit.cloudera.org:8080/13748
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
Gerrit-Change-Number: 13748
Gerrit-PatchSet: 21
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: radford nguyen 


[Impala-ASF-CR] IMPALA-8571[WIP]: improve QueryEventHook execution

2019-08-27 Thread radford nguyen (Code Review)
Hello Bharath Vissapragada, Andrew Sherman, Fredy Wijaya, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13748

to look at the new patch set (#20).

Change subject: IMPALA-8571[WIP]: improve QueryEventHook execution
..

IMPALA-8571[WIP]: improve QueryEventHook execution

(WIP because still need to implement polling of metrics by backend)

This commit hardens guarantees around QueryEventHook execution by
adding the following features:

*hook execution*

Query event hooks are still executed in a thread-pool, but you can
now configure the type of (Java) threads used to execute them via
the newly-added backend flag `query_event_hook_use_daemon_threads`,
which takes a true/false value.

See Java Thread.setDaemon(boolean) for what a "daemon thread" entails.

*hook timeout/cancellation*

A timeout for hook execution can be configured through the backend
flag `query_event_hook_timeout_s`, which specified a timeout value
in seconds. If a hook has not completed execution within this timeout
(measured from hook submission, not execution) then the hook task
will be cancelled in order to free up resources.

*hook rejection*

The hook execution engine now has a fixed-capacity work queue whose
capacity can be configured through the backend flag
`query_event_hook_queue_capacity`. This queue is used to store
hook tasks that are submitted when there are no free threads
available for hook execution. All hook tasks submitted when the
queue is at capacity will be rejected and logged without affecting
the result of the query.

*hook performance metrics*

The following hook metrics are captured:

*query-event-hook.${hook_method}.execution-rejections*

Counter indicating how many submitted tasks have been rejected
due to a full work queue

*query-event-hook.${hook_method}.execution-exceptions*

Counter indicating how many tasks have thrown an exception
during execution

*query-event-hook.${hook_method}.execution-timeouts*

Counter indicating how many tasks have been cancelled due to
not completing within {@code hookTimeout_s} of submission.

*query-event-hook.${hook_method}.execution-submissions*

Counter indicating the number of times ${hookClass}.${method}
has been submitted for execution.

*query-event-hook.${hook_method}.mean-execution-time*

Mean time in [ns] that ${hook_name} has taken to complete,
whether normally or by error (e.g. timeout or exception).

*query-event-hook.${hook_method}.mean-queued-time*

Mean time in [ns] between hook task submission
and hook task execution. This indicates how long a hook
task has been queued waiting to execute.

Testing:

- added unit tests for new features
- re-ran existing E2E tests

Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
---
A be/src/rpc/299e-533f-3ecd-6fea/krb5kdc/.k5.KRBTEST.COM
A 
be/src/rpc/299e-533f-3ecd-6fea/krb5kdc/impala-test_radimp.vpc.cloudera.com.keytab
A be/src/rpc/299e-533f-3ecd-6fea/krb5kdc/kdc.conf
A be/src/rpc/299e-533f-3ecd-6fea/krb5kdc/krb5.conf
A be/src/rpc/299e-533f-3ecd-6fea/krb5kdc/krb5cc
A be/src/rpc/299e-533f-3ecd-6fea/krb5kdc/principal
A be/src/rpc/299e-533f-3ecd-6fea/krb5kdc/principal.kadm5
A be/src/rpc/299e-533f-3ecd-6fea/krb5kdc/principal.kadm5.lock
A be/src/rpc/299e-533f-3ecd-6fea/krb5kdc/principal.ok
A be/src/rpc/54a1-4bf4-07b7-314f/krb5kdc/.k5.KRBTEST.COM
A 
be/src/rpc/54a1-4bf4-07b7-314f/krb5kdc/impala-test_radimp.vpc.cloudera.com.keytab
A be/src/rpc/54a1-4bf4-07b7-314f/krb5kdc/kdc.conf
A be/src/rpc/54a1-4bf4-07b7-314f/krb5kdc/krb5.conf
A be/src/rpc/54a1-4bf4-07b7-314f/krb5kdc/krb5cc
A be/src/rpc/54a1-4bf4-07b7-314f/krb5kdc/principal
A be/src/rpc/54a1-4bf4-07b7-314f/krb5kdc/principal.kadm5
A be/src/rpc/54a1-4bf4-07b7-314f/krb5kdc/principal.kadm5.lock
A be/src/rpc/54a1-4bf4-07b7-314f/krb5kdc/principal.ok
A be/src/rpc/ab02-9d46-479e-73ba/krb5kdc/.k5.KRBTEST.COM
A 
be/src/rpc/ab02-9d46-479e-73ba/krb5kdc/impala-test_radimp.vpc.cloudera.com.keytab
A be/src/rpc/ab02-9d46-479e-73ba/krb5kdc/kdc.conf
A be/src/rpc/ab02-9d46-479e-73ba/krb5kdc/krb5.conf
A be/src/rpc/ab02-9d46-479e-73ba/krb5kdc/krb5cc
A be/src/rpc/ab02-9d46-479e-73ba/krb5kdc/principal
A be/src/rpc/ab02-9d46-479e-73ba/krb5kdc/principal.kadm5
A be/src/rpc/ab02-9d46-479e-73ba/krb5kdc/principal.kadm5.lock
A be/src/rpc/ab02-9d46-479e-73ba/krb5kdc/principal.ok
M be/src/service/frontend.cc
M be/src/service/frontend.h
M be/src/service/impala-server.cc
M be/src/util/CMakeLists.txt
M be/src/util/backend-gflag-util.cc
A be/src/util/hook-metrics-test.cc
A be/src/util/hook-metrics.cc
A be/src/util/hook-metrics.h
M common/thrift/BackendGflags.thrift
M common/thrift/Frontend.thrift
M common/thrift/metrics.json
A fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java
M fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java

[Impala-ASF-CR] IMPALA-8571[WIP]: improve QueryEventHook execution

2019-08-19 Thread radford nguyen (Code Review)
Hello Bharath Vissapragada, Andrew Sherman, Fredy Wijaya, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13748

to look at the new patch set (#19).

Change subject: IMPALA-8571[WIP]: improve QueryEventHook execution
..

IMPALA-8571[WIP]: improve QueryEventHook execution

(WIP because still need to implement polling of metrics by backend)

This commit hardens guarantees around QueryEventHook execution by
adding the following features:

*hook execution*

Query event hooks are still executed in a thread-pool, but you can
now configure the type of (Java) threads used to execute them via
the newly-added backend flag `query_event_hook_use_daemon_threads`,
which takes a true/false value.

See Java Thread.setDaemon(boolean) for what a "daemon thread" entails.

*hook timeout/cancellation*

A timeout for hook execution can be configured through the backend
flag `query_event_hook_timeout_s`, which specified a timeout value
in seconds. If a hook has not completed execution within this timeout
(measured from hook submission, not execution) then the hook task
will be cancelled in order to free up resources.

*hook rejection*

The hook execution engine now has a fixed-capacity work queue whose
capacity can be configured through the backend flag
`query_event_hook_queue_capacity`. This queue is used to store
hook tasks that are submitted when there are no free threads
available for hook execution. All hook tasks submitted when the
queue is at capacity will be rejected and logged without affecting
the result of the query.

*hook performance metrics*

The following hook metrics are captured:

*query-event-hook.${hook_method}.execution-rejections*

Counter indicating how many submitted tasks have been rejected
due to a full work queue

*query-event-hook.${hook_method}.execution-exceptions*

Counter indicating how many tasks have thrown an exception
during execution

*query-event-hook.${hook_method}.execution-timeouts*

Counter indicating how many tasks have been cancelled due to
not completing within {@code hookTimeout_s} of submission.

*query-event-hook.${hook_method}.execution-submissions*

Counter indicating the number of times ${hookClass}.${method}
has been submitted for execution.

*query-event-hook.${hook_method}.mean-execution-time*

Mean time in [ns] that ${hook_name} has taken to complete,
whether normally or by error (e.g. timeout or exception).

*query-event-hook.${hook_method}.mean-queued-time*

Mean time in [ns] between hook task submission
and hook task execution. This indicates how long a hook
task has been queued waiting to execute.

Testing:

- added unit tests for new features
- re-ran existing E2E tests

Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
---
M be/src/service/frontend.cc
M be/src/service/frontend.h
M be/src/service/impala-server.cc
M be/src/util/CMakeLists.txt
M be/src/util/backend-gflag-util.cc
A be/src/util/hook-metrics-test.cc
A be/src/util/hook-metrics.cc
A be/src/util/hook-metrics.h
M common/thrift/BackendGflags.thrift
M common/thrift/Frontend.thrift
M common/thrift/metrics.json
A fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java
M fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
A 
fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java
M fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java
M query-event-hook-api/src/main/java/org/apache/impala/hooks/QueryEventHook.java
19 files changed, 1,364 insertions(+), 114 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/13748/19
--
To view, visit http://gerrit.cloudera.org:8080/13748
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
Gerrit-Change-Number: 13748
Gerrit-PatchSet: 19
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: radford nguyen 


[Impala-ASF-CR] IMPALA-8571[WIP]: improve QueryEventHook execution

2019-08-19 Thread radford nguyen (Code Review)
radford nguyen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13748 )

Change subject: IMPALA-8571[WIP]: improve QueryEventHook execution
..


Patch Set 18:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13748/18/be/src/util/hook-metrics.cc
File be/src/util/hook-metrics.cc:

http://gerrit.cloudera.org:8080/#/c/13748/18/be/src/util/hook-metrics.cc@82
PS18, Line 82: void QueryEventHookMetricContainer::Update(const 
TQueryEventHookMetrics& from) {
Is this all I really have to do to make sure the metrics will be published?



--
To view, visit http://gerrit.cloudera.org:8080/13748
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
Gerrit-Change-Number: 13748
Gerrit-PatchSet: 18
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: radford nguyen 
Gerrit-Comment-Date: Mon, 19 Aug 2019 22:16:07 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8571[WIP]: improve QueryEventHook execution

2019-08-19 Thread radford nguyen (Code Review)
radford nguyen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13748 )

Change subject: IMPALA-8571[WIP]: improve QueryEventHook execution
..


Patch Set 18:

(10 comments)

Thanks for the feedback, asherman.  Replies inline

http://gerrit.cloudera.org:8080/#/c/13748/18//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13748/18//COMMIT_MSG@31
PS18, Line 31:
> Do you want to discuss --query_event_hook_use_daemon_threads here? How woul
Yes, this is something that should be included in the commit message


http://gerrit.cloudera.org:8080/#/c/13748/18/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java
File 
fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java:

http://gerrit.cloudera.org:8080/#/c/13748/18/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@51
PS18, Line 51:  * Rejections
> This javadoc is great, and just the sort of thing I was hoping to see.
I was not planning on publishing the html; I have just been in the habit of 
using html because eclipse/Idea/etc will render it when displaying the javadoc 
in a tooltip window.  Everything just turns into one giant line in there if you 
don't format.

Anyway, I'll switch to markdown, since it's both human-readable and can be 
rendered by an IDE plugin or javadoc tool if needed.


http://gerrit.cloudera.org:8080/#/c/13748/18/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@102
PS18, Line 102:  * ${hookClass}.${method}.execution.exceptions
> I think the metrics now have the "query-event-hook" prefix.
re: "query-event-hook" prefix
-

So this is where the split b/w backend/frontend metrics is kind of confusing.  
This javadoc describes the metric names as they are collected in the frontend, 
and unfortunately the names do not exactly align with the backend.  Part of 
this is due to the fact that the Java metrics are objects (e.g. Timers) that 
can provide multiple metrics (e.g. mean time, count).

In passing the metrics to the backend, it seemed to me that the convention was 
to pass the raw data (e.g. mean time, count) as needed, instead of say 
translating the entire metric object to some similarly-capable object in the 
backend.

The "query-event-hook" prefix only exists in the backend metric names.

re: hookClass vs. method
-

In this context, ${hookClass} is always the fully-qualified name of the java 
class that implements `QueryEventHook`, and ${hookMethod} the method name 
invoked on that hook.  An example metric name would be:

org.foo.bar.MyQueryEventHook.onQueryComplete.execution.exceptions

I actually tried to make this very predictable and therefore amenable to 
automation/generation, but if there is some way I can improve that further, 
please let me know.


http://gerrit.cloudera.org:8080/#/c/13748/18/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@176
PS18, Line 176: // ArrayBlockingQueue contructor performs bounds-check on 
queue size for us
> typo: constructor
Done


http://gerrit.cloudera.org:8080/#/c/13748/18/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@184
PS18, Line 184: // this executor cancels any hook tasks that
> This may seem picky but in Imapla we like comments with Capitals at the beg
Done


http://gerrit.cloudera.org:8080/#/c/13748/18/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@255
PS18, Line 255:* {@link ExecutionException}.  This behavior is consistent 
with how futures normally
> We could be more concise by not saying this about how Futures normally beha
Done


http://gerrit.cloudera.org:8080/#/c/13748/18/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@286
PS18, Line 286:   LOG.warn("QueryEventHook {}.{} execution rejected because 
the " +
> I  sometimes weaselly write "probably because" as you never know :-)
Done


http://gerrit.cloudera.org:8080/#/c/13748/18/fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java
File 
fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java:

http://gerrit.cloudera.org:8080/#/c/13748/18/fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java@1
PS18, Line 1: /**
> Use // style comments for the apache license please
Done


http://gerrit.cloudera.org:8080/#/c/13748/18/fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java@80
PS18, Line 80:   // make this longer to reasonably guarantee a timeout
> Is this a FIXME note, not sure I understand what this keans
Reworded as: "Sleep much longer than `hookTimeout` to reasonably guarantee a 
timeout."


http://gerrit.cloudera.org:8080/#/c/13748/18/fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java@115
PS18, Line 115:   i

[Impala-ASF-CR] IMPALA-8571[WIP]: improve QueryEventHook execution

2019-07-26 Thread radford nguyen (Code Review)
Hello Bharath Vissapragada, Fredy Wijaya, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13748

to look at the new patch set (#18).

Change subject: IMPALA-8571[WIP]: improve QueryEventHook execution
..

IMPALA-8571[WIP]: improve QueryEventHook execution

(WIP because still need to implement polling of metrics by backend)

This commit hardens guarantees around QueryEventHook execution by
adding the following features:

*hook timeout/cancellation*

A timeout for hook execution can be configured through the backend
flag `query_event_hook_timeout_s`, which specified a timeout value
in seconds. If a hook has not completed execution within this timeout
(measured from hook submission, not execution) then the hook task
will be cancelled in order to free up resources.

*hook rejection*

The hook execution engine now has a fixed-capacity work queue whose
capacity can be configured through the backend flag
`query_event_hook_queue_capacity`. This queue is used to store
hook tasks that are submitted when there are no free threads
available for hook execution. All hook tasks submitted when the
queue is at capacity will be rejected and logged without affecting
the result of the query.

*hook performance metrics*

The following hook metrics are captured:

*query-event-hook.${hook_method}.execution-rejections*

Counter indicating how many submitted tasks have been rejected
due to a full work queue

*query-event-hook.${hook_method}.execution-exceptions*

Counter indicating how many tasks have thrown an exception
during execution

*query-event-hook.${hook_method}.execution-timeouts*

Counter indicating how many tasks have been cancelled due to
not completing within {@code hookTimeout_s} of submission.

*query-event-hook.${hook_method}.execution-submissions*

Counter indicating the number of times ${hookClass}.${method}
has been submitted for execution.

*query-event-hook.${hook_method}.mean-execution-time*

Mean time in [ns] that ${hook_name} has taken to complete,
whether normally or by error (e.g. timeout or exception).

*query-event-hook.${hook_method}.mean-queued-time*

Mean time in [ns] between hook task submission
and hook task execution. This indicates how long a hook
task has been queued waiting to execute.

Testing:

- added unit tests for new features
- re-ran existing E2E tests

Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
---
M be/src/service/frontend.cc
M be/src/service/frontend.h
M be/src/service/impala-server.cc
M be/src/util/CMakeLists.txt
M be/src/util/backend-gflag-util.cc
A be/src/util/hook-metrics-test.cc
A be/src/util/hook-metrics.cc
A be/src/util/hook-metrics.h
M common/thrift/BackendGflags.thrift
M common/thrift/Frontend.thrift
M common/thrift/metrics.json
A fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java
M fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
A 
fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java
M fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java
M query-event-hook-api/src/main/java/org/apache/impala/hooks/QueryEventHook.java
19 files changed, 1,376 insertions(+), 114 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/13748/18
--
To view, visit http://gerrit.cloudera.org:8080/13748
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
Gerrit-Change-Number: 13748
Gerrit-PatchSet: 18
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: radford nguyen 


[Impala-ASF-CR] IMPALA-8571[WIP]: improve QueryEventHook execution

2019-07-25 Thread radford nguyen (Code Review)
Hello Bharath Vissapragada, Fredy Wijaya, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13748

to look at the new patch set (#17).

Change subject: IMPALA-8571[WIP]: improve QueryEventHook execution
..

IMPALA-8571[WIP]: improve QueryEventHook execution

(WIP because still need to implement polling of metrics by backend)

This commit hardens guarantees around QueryEventHook execution by
adding the following features:

*hook timeout/cancellation*

A timeout for hook execution can be configured through the backend
flag `query_event_hook_timeout_s`, which specified a timeout value
in seconds. If a hook has not completed execution within this timeout
(measured from hook submission, not execution) then the hook task
will be cancelled in order to free up resources.

*hook rejection*

The hook execution engine now has a fixed-capacity work queue whose
capacity can be configured through the backend flag
`query_event_hook_queue_capacity`. This queue is used to store
hook tasks that are submitted when there are no free threads
available for hook execution. All hook tasks submitted when the
queue is at capacity will be rejected and logged without affecting
the result of the query.

*hook performance metrics*

The following hook metrics are captured:

*query-event-hook.${hook_method}.execution-rejections*

Counter indicating how many submitted tasks have been rejected
due to a full work queue

*query-event-hook.${hook_method}.execution-exceptions*

Counter indicating how many tasks have thrown an exception
during execution

*query-event-hook.${hook_method}.execution-timeouts*

Counter indicating how many tasks have been cancelled due to
not completing within {@code hookTimeout_s} of submission.

*query-event-hook.${hook_method}.execution-submissions*

Counter indicating the number of times ${hookClass}.${method}
has been submitted for execution.

*query-event-hook.${hook_method}.mean-execution-time*

Mean time in [ns] that ${hook_name} has taken to complete,
whether normally or by error (e.g. timeout or exception).

*query-event-hook.${hook_method}.mean-queued-time*

Mean time in [ns] between hook task submission
and hook task execution. This indicates how long a hook
task has been queued waiting to execute.

Testing:

- added unit tests for new features
- re-ran existing E2E tests

Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
---
M be/src/service/frontend.cc
M be/src/service/frontend.h
M be/src/service/impala-server.cc
M be/src/util/CMakeLists.txt
M be/src/util/backend-gflag-util.cc
A be/src/util/hook-metrics-test.cc
A be/src/util/hook-metrics.cc
A be/src/util/hook-metrics.h
M common/thrift/BackendGflags.thrift
M common/thrift/Frontend.thrift
M common/thrift/metrics.json
A fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java
M fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
A 
fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java
M fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java
M query-event-hook-api/src/main/java/org/apache/impala/hooks/QueryEventHook.java
19 files changed, 1,376 insertions(+), 114 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/13748/17
--
To view, visit http://gerrit.cloudera.org:8080/13748
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
Gerrit-Change-Number: 13748
Gerrit-PatchSet: 17
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: radford nguyen 


[Impala-ASF-CR] IMPALA-8571[WIP]: improve QueryEventHook execution

2019-07-23 Thread radford nguyen (Code Review)
Hello Bharath Vissapragada, Fredy Wijaya, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13748

to look at the new patch set (#16).

Change subject: IMPALA-8571[WIP]: improve QueryEventHook execution
..

IMPALA-8571[WIP]: improve QueryEventHook execution

(WIP because still need to implement polling of metrics by backend)

This commit hardens guarantees around QueryEventHook execution by
adding the following features:

*hook timeout/cancellation*

A timeout for hook execution can be configured through the backend
flag `query_event_hook_timeout_s`, which specified a timeout value
in seconds. If a hook has not completed execution within this timeout
(measured from hook submission, not execution) then the hook task
will be cancelled in order to free up resources.

*hook rejection*

The hook execution engine now has a fixed-capacity work queue whose
capacity can be configured through the backend flag
`query_event_hook_queue_capacity`. This queue is used to store
hook tasks that are submitted when there are no free threads
available for hook execution. All hook tasks submitted when the
queue is at capacity will be rejected and logged without affecting
the result of the query.

*hook performance metrics*

The following hook metrics are captured:

*query-event-hook.${hook_method}.execution-rejections*

Counter indicating how many submitted tasks have been rejected
due to a full work queue

*query-event-hook.${hook_method}.execution-exceptions*

Counter indicating how many tasks have thrown an exception
during execution

*query-event-hook.${hook_method}.execution-timeouts*

Counter indicating how many tasks have been cancelled due to
not completing within {@code hookTimeout_s} of submission.

*query-event-hook.${hook_method}.execution-submissions*

Counter indicating the number of times ${hookClass}.${method}
has been submitted for execution.

*query-event-hook.${hook_method}.mean-execution-time*

Mean time in [ns] that ${hook_name} has taken to complete,
whether normally or by error (e.g. timeout or exception).

*query-event-hook.${hook_method}.mean-queued-time*

Mean time in [ns] between hook task submission
and hook task execution. This indicates how long a hook
task has been queued waiting to execute.

Testing:

- added unit tests for new features
- re-ran existing E2E tests

Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
---
M be/src/service/frontend.cc
M be/src/service/frontend.h
M be/src/service/impala-server.cc
M be/src/util/CMakeLists.txt
M be/src/util/backend-gflag-util.cc
A be/src/util/hook-metrics-test.cc
A be/src/util/hook-metrics.cc
A be/src/util/hook-metrics.h
M common/thrift/BackendGflags.thrift
M common/thrift/Frontend.thrift
M common/thrift/metrics.json
A fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java
M fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
A 
fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java
M fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java
M query-event-hook-api/src/main/java/org/apache/impala/hooks/QueryEventHook.java
19 files changed, 1,375 insertions(+), 113 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/13748/16
--
To view, visit http://gerrit.cloudera.org:8080/13748
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
Gerrit-Change-Number: 13748
Gerrit-PatchSet: 16
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: radford nguyen 


[Impala-ASF-CR] IMPALA-8571[WIP]: improve QueryEventHook execution

2019-07-15 Thread radford nguyen (Code Review)
radford nguyen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13748 )

Change subject: IMPALA-8571[WIP]: improve QueryEventHook execution
..


Patch Set 15:

> Patch Set 15:
>
> Build Failed
>
> https://jenkins.impala.io/job/gerrit-code-review-checks/3879/ : Initial code 
> review checks failed. See linked job for details on the failure.

Linker failed due to undefined static member 
`QueryEventHook::metric_containers_` in be/src/util/hook-metrics.cc


--
To view, visit http://gerrit.cloudera.org:8080/13748
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
Gerrit-Change-Number: 13748
Gerrit-PatchSet: 15
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: radford nguyen 
Gerrit-Comment-Date: Mon, 15 Jul 2019 20:00:28 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8571[WIP]: improve QueryEventHook execution

2019-07-15 Thread radford nguyen (Code Review)
radford nguyen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13748 )

Change subject: IMPALA-8571[WIP]: improve QueryEventHook execution
..


Patch Set 15:

(3 comments)

Updated the review with metric code, including actual metric collection on the 
frontend, metric "translation" on the backend, and the JNI plumbing to connect 
the 2.

Only thing missing is the actual polling mechanism to make the connection... 
currently looking at other examples (e.g. event metrics) for some guidance.  
Commit `360b23bfa4eeb337cce380512f106a692bf3c49b` seems to be a good starting 
point(?)

http://gerrit.cloudera.org:8080/#/c/13748/15/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/13748/15/be/src/service/impala-server.cc@75
PS15, Line 75: #include "util/hook-metrics.h"
This will be used once I determine the best way to poll for metrics from the 
frontend


http://gerrit.cloudera.org:8080/#/c/13748/15/be/src/util/hook-metrics-test.cc
File be/src/util/hook-metrics-test.cc:

http://gerrit.cloudera.org:8080/#/c/13748/15/be/src/util/hook-metrics-test.cc@23
PS15, Line 23: TEST(HookMetricContainer, UpdateTest) {
Review is still WIP: I haven't actually ran this test yet, and I need to add 
more coverage for backend handling of metrics.


http://gerrit.cloudera.org:8080/#/c/13748/15/be/src/util/hook-metrics.h
File be/src/util/hook-metrics.h:

http://gerrit.cloudera.org:8080/#/c/13748/15/be/src/util/hook-metrics.h@65
PS15, Line 65:   TGetQueryEventHookMetricsResult* response);
Review is still WIP because I need to actually call this function with metrics 
from the frontend.  The JNI calls already exist, but I need some feedback on 
how to poll the frontend periodically for those metrics.



--
To view, visit http://gerrit.cloudera.org:8080/13748
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
Gerrit-Change-Number: 13748
Gerrit-PatchSet: 15
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: radford nguyen 
Gerrit-Comment-Date: Mon, 15 Jul 2019 12:45:45 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8571[WIP]: improve QueryEventHook execution

2019-07-15 Thread radford nguyen (Code Review)
Hello Bharath Vissapragada, Fredy Wijaya, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13748

to look at the new patch set (#15).

Change subject: IMPALA-8571[WIP]: improve QueryEventHook execution
..

IMPALA-8571[WIP]: improve QueryEventHook execution

This commit hardens guarantees around QueryEventHook execution by
adding the following features:

*hook timeout/cancellation*

A timeout for hook execution can be configured through the backend
flag `query_event_hook_timeout_s`, which specified a timeout value
in seconds. If a hook has not completed execution within this timeout
(measured from hook submission, not execution) then the hook task
will be cancelled in order to free up resources.

*hook rejection*

The hook execution engine now has a fixed-capacity work queue whose
capacity can be configured through the backend flag
`query_event_hook_queue_capacity`. This queue is used to store
hook tasks that are submitted when there are no free threads
available for hook execution. All hook tasks submitted when the
queue is at capacity will be rejected and logged without affecting
the result of the query.

*hook performance metrics*

The following hook metrics are captured:

*query-event-hook.${hook_method}.execution-rejections*

Counter indicating how many submitted tasks have been rejected
due to a full work queue

*query-event-hook.${hook_method}.execution-exceptions*

Counter indicating how many tasks have thrown an exception
during execution

*query-event-hook.${hook_method}.execution-timeouts*

Counter indicating how many tasks have been cancelled due to
not completing within {@code hookTimeout_s} of submission.

*query-event-hook.${hook_method}.execution-submissions*

Counter indicating the number of times ${hookClass}.${method}
has been submitted for execution.

*query-event-hook.${hook_method}.mean-execution-time*

Mean time in [ns] that ${hook_name} has taken to complete,
whether normally or by error (e.g. timeout or exception).

*query-event-hook.${hook_method}.mean-queued-time*

Mean time in [ns] between hook task submission
and hook task execution. This indicates how long a hook
task has been queued waiting to execute.

Testing:

- added unit tests for new features
- re-ran existing E2E tests

Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
---
M be/src/service/frontend.cc
M be/src/service/frontend.h
M be/src/service/impala-server.cc
M be/src/util/CMakeLists.txt
M be/src/util/backend-gflag-util.cc
A be/src/util/hook-metrics-test.cc
A be/src/util/hook-metrics.cc
A be/src/util/hook-metrics.h
M common/thrift/BackendGflags.thrift
M common/thrift/Frontend.thrift
M common/thrift/metrics.json
A fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java
M fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
A 
fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java
M fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java
M query-event-hook-api/src/main/java/org/apache/impala/hooks/QueryEventHook.java
19 files changed, 1,368 insertions(+), 113 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/13748/15
--
To view, visit http://gerrit.cloudera.org:8080/13748
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
Gerrit-Change-Number: 13748
Gerrit-PatchSet: 15
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: radford nguyen 


[Impala-ASF-CR] IMPALA-8571[WIP]: improve QueryEventHook execution

2019-07-10 Thread radford nguyen (Code Review)
radford nguyen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13748 )

Change subject: IMPALA-8571[WIP]: improve QueryEventHook execution
..


Patch Set 14:

> Patch Set 14:
>
> (8 comments)
>
> > Patch Set 14:
> >
> > (9 comments)
>
> Replied to comments


--
To view, visit http://gerrit.cloudera.org:8080/13748
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
Gerrit-Change-Number: 13748
Gerrit-PatchSet: 14
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: radford nguyen 
Gerrit-Comment-Date: Thu, 11 Jul 2019 00:06:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8571[WIP]: improve QueryEventHook execution

2019-07-10 Thread radford nguyen (Code Review)
radford nguyen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13748 )

Change subject: IMPALA-8571[WIP]: improve QueryEventHook execution
..


Patch Set 14:

(8 comments)

> Patch Set 14:
>
> (9 comments)

Replied to comments

http://gerrit.cloudera.org:8080/#/c/13748/14/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/13748/14/be/src/service/impala-server.cc@275
PS14, Line 275: 1
> Since the hook execution is async (to query unregister) anyway, how about w
That's a good idea; I'll bump it


http://gerrit.cloudera.org:8080/#/c/13748/14/be/src/service/impala-server.cc@279
PS14, Line 279: true
> Should anyone ever have to set this to false? If not, I suggest removing th
Nobody has requested it, but I was under the impression that a user might want 
to use higher-priority threads for executing hooks.  They might also want to 
ensure that the threads will not be killed at shutdown until the hooks have 
completed executing.


http://gerrit.cloudera.org:8080/#/c/13748/14/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java
File 
fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java:

http://gerrit.cloudera.org:8080/#/c/13748/14/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@97
PS14, Line 97:  * ${hookClass}.${method}.execution.rejections
> Where are all these metrics exposed? I don't see the plumbing to make them
I haven't completed the plumbing yet; I'm still trying to figure out how it 
works.  This is the main reason the CR is still WIP


http://gerrit.cloudera.org:8080/#/c/13748/14/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@101
PS14, Line 101: exceptions
> nit: failures?
I can change it to failures if you think that's more expressive


http://gerrit.cloudera.org:8080/#/c/13748/14/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@131
PS14, Line 131: private final long hookTimeout_;
  :   private final TimeUnit hookTimeoutUnit_;
> Just force it to be secs or something?
Yes, it is forced to `SECONDS` in the `QueryHookManager` when instantiating 
this executor.  I am allowing a different time unit here internally to make the 
unit tests faster.


http://gerrit.cloudera.org:8080/#/c/13748/14/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@290
PS14, Line 290:   context);
> probably helps to include query ID explicitly.
Query ID is not currently available; are you suggesting that it be passed to 
the hook from the backend?  I do like that idea


http://gerrit.cloudera.org:8080/#/c/13748/14/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@327
PS14, Line 327: final Future f = hookExecutor_.submit(() -> 
{
> I don't think you need to offload this to a separate executor pool again.
I think I'm missing something.  If I just do new FutureTask<>(callHook).run(), 
won't that just run callHook in this thread?  And doesn't that mean that we 
won't get to line 337 until callHook has completed?


http://gerrit.cloudera.org:8080/#/c/13748/14/fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java
File fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java:

http://gerrit.cloudera.org:8080/#/c/13748/14/fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java@77
PS14, Line 77:   static final String BE_HOOKS_FLAG = "query_event_hook_classes";
> nit: any reason to remove private qualifier?
I used one of the constants in an error message logged by 
`FixedCapacityQueryHookExecutor`.  If you think it's cleaner, I can make these 
private again and maybe(?) make `FCQHExecutor` an inner class



--
To view, visit http://gerrit.cloudera.org:8080/13748
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
Gerrit-Change-Number: 13748
Gerrit-PatchSet: 14
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: radford nguyen 
Gerrit-Comment-Date: Thu, 11 Jul 2019 00:04:35 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8589: Re-enable flaky test query event hooks.py

2019-07-08 Thread radford nguyen (Code Review)
radford nguyen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13713 )

Change subject: IMPALA-8589: Re-enable flaky test_query_event_hooks.py
..


Patch Set 9:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13713/9/tests/custom_cluster/test_query_event_hooks.py
File tests/custom_cluster/test_query_event_hooks.py:

http://gerrit.cloudera.org:8080/#/c/13713/9/tests/custom_cluster/test_query_event_hooks.py@49
PS9, Line 49:  expected_count=-1)
This test no longer checks for the execution of `onQueryComplete`... it only 
checks that `onImpalaStartup` is called, which is not a query event hook at 
all, really.

I think we still need coverage for `onQueryComplete` because it executes on the 
thread pool executor, which is different than `onImpalaStartup`



--
To view, visit http://gerrit.cloudera.org:8080/13713
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia64550e986b5eba59a1d77657943932bb977d470
Gerrit-Change-Number: 13713
Gerrit-PatchSet: 9
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: radford nguyen 
Gerrit-Comment-Date: Mon, 08 Jul 2019 20:32:17 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8571[WIP]: improve QueryEventHook execution

2019-07-08 Thread radford nguyen (Code Review)
Hello Bharath Vissapragada, Fredy Wijaya, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13748

to look at the new patch set (#14).

Change subject: IMPALA-8571[WIP]: improve QueryEventHook execution
..

IMPALA-8571[WIP]: improve QueryEventHook execution

This commit hardens guarantees around QueryEventHook execution by
adding the following features:

*hook timeout/cancellation*

A timeout for hook execution can be configured through the backend
flag `query_event_hook_timeout_s`, which specified a timeout value
in seconds. If a hook has not completed execution within this timeout
(measured from hook submission, not execution) then the hook task
will be cancelled in order to free up resources.

*hook rejection*

The hook execution engine now has a fixed-capacity work queue whose
capacity can be configured through the backend flag
`query_event_hook_queue_capacity`. This queue is used to store
hook tasks that are submitted when there are no free threads
available for hook execution. All hook tasks submitted when the
queue is at capacity will be rejected and logged without affecting
the result of the query.

*hook performance metrics*

The following hook metrics are captured:

*${hookClass}.${method}.execution.rejections*

Counter indicating how many submitted tasks have been rejected
due to a full work queue

*${hookClass}.${method}.execution.exceptions*

Counter indicating how many tasks have thrown an exception
during execution

*${hookClass}.${method}.execution.timeouts*

Counter indicating how many tasks have been cancelled due to
not completing within {@code hookTimeout_s} of submission.

*${hookClass}.${method}.submissions*

Counter indicating the number of times ${hookClass}.${method}
has been submitted for execution.

*${hookClass}.${method}.execution.time*

Timer indicating the amount of time ${hookClass}.${method}
has taken to complete, whether normally or by error (e.g. timeout or 
exception).
Because tasks can potentially be cancelled before beginning execution, the
Timer.getCount() of this timer may be less than
`${hookClass}.${method}.submissions`

*${hookClass}.${method}.queued.time*

Timer indicating the amount of time between hook task submission
and hook task execution. Since a task may be cancelled before it even begins
execution, {@link Timer#getCount()} of this timer may be less than
`${hookClass}.${method}.submissions`

Testing:

- added unit tests for new features
- re-ran existing E2E tests

Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
---
M be/src/service/impala-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
A fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java
M fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
A 
fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java
M fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java
M query-event-hook-api/src/main/java/org/apache/impala/hooks/QueryEventHook.java
10 files changed, 818 insertions(+), 109 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/13748/14
--
To view, visit http://gerrit.cloudera.org:8080/13748
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
Gerrit-Change-Number: 13748
Gerrit-PatchSet: 14
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: radford nguyen 


[Impala-ASF-CR] IMPALA-8571[WIP]: improve QueryEventHook execution

2019-07-08 Thread radford nguyen (Code Review)
Hello Fredy Wijaya, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13748

to look at the new patch set (#13).

Change subject: IMPALA-8571[WIP]: improve QueryEventHook execution
..

IMPALA-8571[WIP]: improve QueryEventHook execution

This commit hardens guarantees around QueryEventHook execution by
adding the following features:

*hook timeout/cancellation*

A timeout for hook execution can be configured through the backend
flag `query_event_hook_timeout_s`, which specified a timeout value
in seconds. If a hook has not completed execution within this timeout
(measured from hook submission, not execution) then the hook task
will be cancelled in order to free up resources.

*hook rejection*

The hook execution engine now has a fixed-capacity work queue whose
capacity can be configured through the backend flag
`query_event_hook_queue_capacity`. This queue is used to store
hook tasks that are submitted when there are no free threads
available for hook execution. All hook tasks submitted when the
queue is at capacity will be rejected and logged without affecting
the result of the query.

*hook performance metrics*

The following hook metrics are captured:

*${hookClass}.${method}.execution.rejections*

Counter indicating how many submitted tasks have been rejected
due to a full work queue

*${hookClass}.${method}.execution.exceptions*

Counter indicating how many tasks have thrown an exception
during execution

*${hookClass}.${method}.execution.timeouts*

Counter indicating how many tasks have been cancelled due to
not completing within {@code hookTimeout_s} of submission.

*${hookClass}.${method}.submissions*

Counter indicating the number of times ${hookClass}.${method}
has been submitted for execution.

*${hookClass}.${method}.execution.time*

Timer indicating the amount of time ${hookClass}.${method}
has taken to complete, whether normally or by error (e.g. timeout or 
exception).
Because tasks can potentially be cancelled before beginning execution, the
Timer.getCount() of this timer may be less than
`${hookClass}.${method}.submissions`

*${hookClass}.${method}.queued.time*

Timer indicating the amount of time between hook task submission
and hook task execution. Since a task may be cancelled before it even begins
execution, {@link Timer#getCount()} of this timer may be less than
`${hookClass}.${method}.submissions`

Testing:

- added unit tests for new features
- re-ran existing E2E tests

Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
---
M be/src/service/impala-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
A fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java
M fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
A 
fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java
M fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java
M query-event-hook-api/src/main/java/org/apache/impala/hooks/QueryEventHook.java
10 files changed, 817 insertions(+), 109 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/13748/13
--
To view, visit http://gerrit.cloudera.org:8080/13748
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
Gerrit-Change-Number: 13748
Gerrit-PatchSet: 13
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: radford nguyen 


[Impala-ASF-CR] IMPALA-8571[WIP]: improve QueryEventHook execution

2019-07-07 Thread radford nguyen (Code Review)
Hello Fredy Wijaya, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13748

to look at the new patch set (#12).

Change subject: IMPALA-8571[WIP]: improve QueryEventHook execution
..

IMPALA-8571[WIP]: improve QueryEventHook execution

This commit hardens guarantees around QueryEventHook execution by
adding the following features:

*hook timeout/cancellation*

*hook rejection*

*hook performance metrics*

Testing:

- added unit tests for new features
- re-ran existing E2E tests

Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
---
M be/src/service/impala-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
A fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java
M fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
A 
fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java
M fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java
M query-event-hook-api/src/main/java/org/apache/impala/hooks/QueryEventHook.java
10 files changed, 782 insertions(+), 109 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/13748/12
--
To view, visit http://gerrit.cloudera.org:8080/13748
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
Gerrit-Change-Number: 13748
Gerrit-PatchSet: 12
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: radford nguyen 


[Impala-ASF-CR] IMPALA-8572[WIP]: improve QueryEventHook execution

2019-07-07 Thread radford nguyen (Code Review)
Hello Fredy Wijaya, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13748

to look at the new patch set (#11).

Change subject: IMPALA-8572[WIP]: improve QueryEventHook execution
..

IMPALA-8572[WIP]: improve QueryEventHook execution

This commit hardens guarantees around QueryEventHook execution by
adding the following features:

*hook timeout/cancellation*

*hook rejection*

*hook performance metrics*

Testing:

- added unit tests for new features
- re-ran existing E2E tests

Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
---
M be/src/service/impala-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
A fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java
M fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
A 
fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java
M fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java
M query-event-hook-api/src/main/java/org/apache/impala/hooks/QueryEventHook.java
10 files changed, 782 insertions(+), 109 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/13748/11
--
To view, visit http://gerrit.cloudera.org:8080/13748
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
Gerrit-Change-Number: 13748
Gerrit-PatchSet: 11
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: radford nguyen 


[Impala-ASF-CR] IMPALA-8572: improve QueryEventHook execution

2019-07-02 Thread radford nguyen (Code Review)
Hello Fredy Wijaya, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13748

to look at the new patch set (#10).

Change subject: IMPALA-8572: improve QueryEventHook execution
..

IMPALA-8572: improve QueryEventHook execution

WIP

Testing:

TBD

Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
---
M be/src/service/impala-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
A fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java
M fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
A 
fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java
M fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java
M query-event-hook-api/src/main/java/org/apache/impala/hooks/QueryEventHook.java
10 files changed, 767 insertions(+), 109 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/13748/10
--
To view, visit http://gerrit.cloudera.org:8080/13748
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
Gerrit-Change-Number: 13748
Gerrit-PatchSet: 10
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: radford nguyen 


[Impala-ASF-CR] IMPALA-8572: improve QueryEventHook execution

2019-07-02 Thread radford nguyen (Code Review)
Hello Fredy Wijaya, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13748

to look at the new patch set (#9).

Change subject: IMPALA-8572: improve QueryEventHook execution
..

IMPALA-8572: improve QueryEventHook execution

WIP

Testing:

TBD

Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
---
M be/src/service/impala-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
A fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java
M fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
A 
fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java
M fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java
M query-event-hook-api/src/main/java/org/apache/impala/hooks/QueryEventHook.java
10 files changed, 658 insertions(+), 109 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/13748/9
--
To view, visit http://gerrit.cloudera.org:8080/13748
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
Gerrit-Change-Number: 13748
Gerrit-PatchSet: 9
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: radford nguyen 


[Impala-ASF-CR] IMPALA-8572: improve QueryEventHook execution

2019-07-01 Thread radford nguyen (Code Review)
Hello Fredy Wijaya, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13748

to look at the new patch set (#8).

Change subject: IMPALA-8572: improve QueryEventHook execution
..

IMPALA-8572: improve QueryEventHook execution

WIP

Testing:

TBD

Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
---
M be/src/service/impala-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
A fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java
M fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
A 
fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java
M fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java
M query-event-hook-api/src/main/java/org/apache/impala/hooks/QueryEventHook.java
10 files changed, 658 insertions(+), 109 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/13748/8
--
To view, visit http://gerrit.cloudera.org:8080/13748
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
Gerrit-Change-Number: 13748
Gerrit-PatchSet: 8
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-8572: improve QueryEventHook execution

2019-07-01 Thread radford nguyen (Code Review)
radford nguyen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13748 )

Change subject: IMPALA-8572: improve QueryEventHook execution
..


Patch Set 7:

(2 comments)

add comment where you should probably start reviewing

http://gerrit.cloudera.org:8080/#/c/13748/7/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java
File 
fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java:

http://gerrit.cloudera.org:8080/#/c/13748/7/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@161
PS7, Line 161:   Future submitOnQueryComplete(
Entry point for executing hook onQueryComplete asynchronously


http://gerrit.cloudera.org:8080/#/c/13748/7/fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java
File fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java:

http://gerrit.cloudera.org:8080/#/c/13748/7/fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java@133
PS7, Line 133:   final String msg = String.format("hook timeout should be 
positive but was {}. " +
might allow negative value to signify block indefinitely



--
To view, visit http://gerrit.cloudera.org:8080/13748
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
Gerrit-Change-Number: 13748
Gerrit-PatchSet: 7
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: radford nguyen 
Gerrit-Comment-Date: Tue, 02 Jul 2019 01:12:20 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8572: improve QueryEventHook execution

2019-07-01 Thread radford nguyen (Code Review)
radford nguyen has uploaded a new patch set (#7). ( 
http://gerrit.cloudera.org:8080/13748 )

Change subject: IMPALA-8572: improve QueryEventHook execution
..

IMPALA-8572: improve QueryEventHook execution

Description goes here

Testing:

TBD

Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
---
M be/src/service/impala-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
A fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java
M fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
A 
fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java
M fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java
M query-event-hook-api/src/main/java/org/apache/impala/hooks/QueryEventHook.java
10 files changed, 658 insertions(+), 109 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/13748/7
--
To view, visit http://gerrit.cloudera.org:8080/13748
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
Gerrit-Change-Number: 13748
Gerrit-PatchSet: 7
Gerrit-Owner: radford nguyen 


[Impala-ASF-CR] IMPALA-8473: Publish lineage info via hook

2019-05-24 Thread radford nguyen (Code Review)
radford nguyen has uploaded a new patch set (#26). ( 
http://gerrit.cloudera.org:8080/13352 )

Change subject: IMPALA-8473: Publish lineage info via hook
..

IMPALA-8473: Publish lineage info via hook

This commit introduces a hook mechanism for publishing,
lineage data specifically, but query information more
generally, from Impala.

The legacy behavior of writing the lineage file is
being retained but deprecated.

Hooks can be implemented by downstream consumers (i.e.
runtime dependencies) to hook into supported places during
Impala query execution:

- impalad startup
- query completion
- see IMPALA-8572 for caveat/details

The consumers are to be frontend Java dependencies
intiated at runtime. 2 backend flags configure this
behavior:

- `query_event_hook_classes` specifies a comma-separated
list of hook consumer implementation classes that
are instantiated and registered at impala start up.

- `query_event_hook_nthreads`
specifies the number of threads to use for asynchronous
hook execution.  (Relevant if multiple hooks are
registered.)

Lineage information is passed from the backend after
a query completes (but before it returns) and given
to every hook to execute asynchronously.  In other words,
a query may complete and return to the user before any
or all hooks have completed executing.  An exception
during hook on-query-complete execution will simply be logged
and will not be (directly) fatal to the system.

Tests:
- added unit tests for FE hook execution
- added E2E tests for hook configuration, execution, error
- ran full build, tests

Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1
---
M be/src/service/frontend.cc
M be/src/service/frontend.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M common/thrift/Frontend.thrift
A fe/src/main/java/org/apache/impala/hooks/QueryCompleteContext.java
A fe/src/main/java/org/apache/impala/hooks/QueryEventHook.java
A fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
A fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java
A fe/src/test/java/org/apache/impala/testutil/AlwaysErrorQueryEventHook.java
A fe/src/test/java/org/apache/impala/testutil/CountingQueryEventHook.java
A fe/src/test/java/org/apache/impala/testutil/DummyQueryEventHook.java
A fe/src/test/java/org/apache/impala/testutil/PostQueryErrorEventHook.java
M tests/authorization/test_provider.py
A tests/custom_cluster/test_query_event_hooks.py
20 files changed, 1,109 insertions(+), 26 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/13352/26
--
To view, visit http://gerrit.cloudera.org:8080/13352
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1
Gerrit-Change-Number: 13352
Gerrit-PatchSet: 26
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Anonymous Coward (498)
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: radford nguyen 


[Impala-ASF-CR] IMPALA-8473: Publish lineage info via hook

2019-05-24 Thread radford nguyen (Code Review)
radford nguyen has uploaded a new patch set (#24). ( 
http://gerrit.cloudera.org:8080/13352 )

Change subject: IMPALA-8473: Publish lineage info via hook
..

IMPALA-8473: Publish lineage info via hook

This commit introduces a hook mechanism for publishing,
lineage data specifically, but query information more
generally, from Impala.

The legacy behavior of writing the lineage file is
being retained but deprecated.

Hooks can be implemented by downstream consumers (i.e.
runtime dependencies) to hook into supported places during
Impala query execution:

- impalad startup
- query completion
- see IMPALA-8572 for caveat/details

The consumers are to be frontend Java dependencies
intiated at runtime. 2 backend flags configure this
behavior:

- `query_event_hook_classes` specifies a comma-separated
list of hook consumer implementation classes that
are instantiated and registered at impala start up.

- `query_event_hook_nthreads`
specifies the number of threads to use for asynchronous
hook execution.  (Relevant if multiple hooks are
registered.)

Lineage information is passed from the backend after
a query completes (but before it returns) and given
to every hook to execute asynchronously.  In other words,
a query may complete and return to the user before any
or all hooks have completed executing.  An exception
during hook on-query-complete execution will simply be logged
and will not be (directly) fatal to the system.

Tests:
- added unit tests for FE hook execution
- added E2E tests for hook configuration, execution, error
- ran full build, tests

Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1
---
M be/src/service/frontend.cc
M be/src/service/frontend.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M common/thrift/Frontend.thrift
A fe/src/main/java/org/apache/impala/hooks/QueryCompleteContext.java
A fe/src/main/java/org/apache/impala/hooks/QueryEventHook.java
A fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
A fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java
A fe/src/test/java/org/apache/impala/testutil/AlwaysErrorQueryEventHook.java
A fe/src/test/java/org/apache/impala/testutil/CountingQueryEventHook.java
A fe/src/test/java/org/apache/impala/testutil/DummyQueryEventHook.java
A fe/src/test/java/org/apache/impala/testutil/PostQueryErrorEventHook.java
M tests/authorization/test_provider.py
A tests/hooks/test_hooks.py
M tests/run-tests.py
21 files changed, 1,110 insertions(+), 27 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/13352/24
--
To view, visit http://gerrit.cloudera.org:8080/13352
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1
Gerrit-Change-Number: 13352
Gerrit-PatchSet: 24
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Anonymous Coward (498)
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: radford nguyen 


[Impala-ASF-CR] IMPALA-8473: Publish lineage info via hook

2019-05-23 Thread radford nguyen (Code Review)
radford nguyen has uploaded a new patch set (#21). ( 
http://gerrit.cloudera.org:8080/13352 )

Change subject: IMPALA-8473: Publish lineage info via hook
..

IMPALA-8473: Publish lineage info via hook

This commit introduces a hook mechanism for publishing,
lineage data specifically, but query information more
generally, from Impala.

The legacy behavior of writing the lineage file is
being retained but deprecated.

Hooks can be implemented by downstream consumers (i.e.
runtime dependencies) to hook into supported places during
Impala query execution:

- impalad startup
- query completion
- see IMPALA-8572 for caveat/details

The consumers are to be frontend Java dependencies
intiated at runtime. 2 backend flags configure this
behavior:

- `query_event_hook_classes` specifies a comma-separated
list of hook consumer implementation classes that
are instantiated and registered at impala start up.

- `query_event_hook_nthreads`
specifies the number of threads to use for asynchronous
hook execution.  (Relevant if multiple hooks are
registered.)

Lineage information is passed from the backend after
a query completes (but before it returns) and given
to every hook to execute asynchronously.  In other words,
a query may complete and return to the user before any
or all hooks have completed executing.  An exception
during hook on-query-complete execution will simply be logged
and will not be (directly) fatal to the system.

Tests:
- added unit tests for FE hook execution
- added E2E tests for hook configuration, execution, error
- ran full build, tests

Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1
---
M be/src/service/frontend.cc
M be/src/service/frontend.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M common/thrift/Frontend.thrift
A fe/src/main/java/org/apache/impala/hooks/QueryCompleteContext.java
A fe/src/main/java/org/apache/impala/hooks/QueryEventHook.java
A fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
A fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java
A fe/src/test/java/org/apache/impala/testutil/AlwaysErrorQueryEventHook.java
A fe/src/test/java/org/apache/impala/testutil/CountingQueryEventHook.java
A fe/src/test/java/org/apache/impala/testutil/DummyQueryEventHook.java
A fe/src/test/java/org/apache/impala/testutil/PostQueryErrorEventHook.java
M tests/authorization/test_provider.py
A tests/hooks/test_hooks.py
20 files changed, 1,109 insertions(+), 26 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/13352/21
--
To view, visit http://gerrit.cloudera.org:8080/13352
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1
Gerrit-Change-Number: 13352
Gerrit-PatchSet: 21
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Anonymous Coward (498)
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: radford nguyen 


[Impala-ASF-CR] IMPALA-8473: publish lineage info via hook

2019-05-22 Thread radford nguyen (Code Review)
radford nguyen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13352 )

Change subject: IMPALA-8473: publish lineage info via hook
..


Patch Set 20:

Fixed code-review checks, implemented minor feedback, and managed to 
drastically improve 1 e2e test case


--
To view, visit http://gerrit.cloudera.org:8080/13352
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1
Gerrit-Change-Number: 13352
Gerrit-PatchSet: 20
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Anonymous Coward (498)
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: radford nguyen 
Gerrit-Comment-Date: Thu, 23 May 2019 02:28:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8473: publish lineage info via hook

2019-05-22 Thread radford nguyen (Code Review)
Hello Andrew Sherman, Anonymous Coward (498), Austin Nobis, Fredy Wijaya, Tim 
Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13352

to look at the new patch set (#20).

Change subject: IMPALA-8473: publish lineage info via hook
..

IMPALA-8473: publish lineage info via hook

This commit introduces a hook mechanism for publishing,
lineage data specifically, but query information more
generally, from Impala.

The legacy behavior of writing the lineage file is
being retained but deprecated.

Hooks can be implemented by downstream consumers (i.e.
runtime dependencies) to hook into supported places during
Impala query execution:

- impalad startup
- query completion
- see IMPALA-8572 for caveat/details

The consumers are to be frontend Java dependencies
intiated at runtime. 2 backend flags configure this
behavior:

- `query_event_hook_classes` specifies a comma-separated
list of hook consumer implementation classes that
are instantiated and registered at impala start up.

- `query_event_hook_nthreads`
specifies the number of threads to use for asynchronous
hook execution.  (Relevant if multiple hooks are
registered.)

Lineage information is passed from the backend after
a query completes (but before it returns) and given
to every hook to execute asynchronously.  In other words,
a query may complete and return to the user before any
or all hooks have completed executing.  An exception
during hook on-query-complete execution will simply be logged
and will not be (directly) fatal to the system.

Tests:

- added unit tests for FE hook execution
- added e2e tests for hook configuration, execution, error
- ran full build, tests

Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1
---
M be/src/service/frontend.cc
M be/src/service/frontend.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M common/thrift/Frontend.thrift
A fe/src/main/java/org/apache/impala/hooks/QueryCompleteContext.java
A fe/src/main/java/org/apache/impala/hooks/QueryEventHook.java
A fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
A fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java
M fe/src/test/java/org/apache/impala/service/JniFrontendTest.java
A fe/src/test/java/org/apache/impala/testutil/AlwaysErrorQueryEventHook.java
A fe/src/test/java/org/apache/impala/testutil/CountingQueryEventHook.java
A fe/src/test/java/org/apache/impala/testutil/DummyQueryEventHook.java
A fe/src/test/java/org/apache/impala/testutil/PostQueryErrorEventHook.java
A tests/hooks/test_hooks.py
20 files changed, 1,169 insertions(+), 24 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/13352/20
--
To view, visit http://gerrit.cloudera.org:8080/13352
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1
Gerrit-Change-Number: 13352
Gerrit-PatchSet: 20
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Anonymous Coward (498)
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: radford nguyen 


[Impala-ASF-CR] IMPALA-8473: publish lineage info via hook

2019-05-22 Thread radford nguyen (Code Review)
radford nguyen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13352 )

Change subject: IMPALA-8473: publish lineage info via hook
..


Patch Set 19:

(2 comments)

Thanks for +1; replied to your comments.  Next patchset will fix code-review 
checks

http://gerrit.cloudera.org:8080/#/c/13352/18/fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java
File fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java:

http://gerrit.cloudera.org:8080/#/c/13352/18/fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java@44
PS18, Line 44:   private QueryCompleteContext mockQueryCompleteContext;
> I don't understand why this is marked as Mock, it seems a simple class that
I just typically use mocks to:

- avoid complex creation logic that tends to change with refactoring
- allow behavior verification, such as verifying that a method was invoked so 
many times with such and such arguments
- allow customized stubbing of behavior without boilerplate

None of these actually apply in this test suite currently, so I can replace it 
with a POJO if you think it's better that way.


http://gerrit.cloudera.org:8080/#/c/13352/18/fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java@67
PS18, Line 67:   private static QueryEventHookManager createSut(int nThreads, 
String... hooks)
> What is SUT? Maybe use something clearer to the reader
SUT is system-under-test; I thought that nomenclature was somewhat standard?  
Not sure, but I think it came from a Martin Fowler blog... maybe that's not as 
standard as I think



--
To view, visit http://gerrit.cloudera.org:8080/13352
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1
Gerrit-Change-Number: 13352
Gerrit-PatchSet: 19
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Anonymous Coward (498)
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: radford nguyen 
Gerrit-Comment-Date: Wed, 22 May 2019 23:36:57 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8473: publish lineage info via hook

2019-05-22 Thread radford nguyen (Code Review)
radford nguyen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13352 )

Change subject: IMPALA-8473: publish lineage info via hook
..


Patch Set 19:

Think this is ready for final review now.

Sorry about all the patchsets; don't really have my impala workflow established 
yet.


--
To view, visit http://gerrit.cloudera.org:8080/13352
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1
Gerrit-Change-Number: 13352
Gerrit-PatchSet: 19
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Anonymous Coward (498)
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: radford nguyen 
Gerrit-Comment-Date: Wed, 22 May 2019 21:39:55 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8473: publish lineage info via hook

2019-05-22 Thread radford nguyen (Code Review)
Hello Andrew Sherman, Anonymous Coward (498), Austin Nobis, Fredy Wijaya, Tim 
Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13352

to look at the new patch set (#19).

Change subject: IMPALA-8473: publish lineage info via hook
..

IMPALA-8473: publish lineage info via hook

This commit introduces a hook mechanism for publishing,
lineage data specifically, but query information more
generally, from Impala.

The legacy behavior of writing the lineage file is
being retained but deprecated.

Hooks can be implemented by downstream consumers (i.e.
runtime dependencies) to hook into supported places during
Impala query execution:

- impalad startup
- query completion
- see IMPALA-8572 for caveat/details

The consumers are to be frontend Java dependencies
intiated at runtime. 2 backend flags configure this
behavior:

- `query_event_hook_classes` specifies a comma-separated
list of hook consumer implementation classes that
are instantiated and registered at impala start up.

- `query_event_hook_nthreads`
specifies the number of threads to use for asynchronous
hook execution.  (Relevant if multiple hooks are
registered.)

Lineage information is passed from the backend after
a query completes (but before it returns) and given
to every hook to execute asynchronously.  In other words,
a query may complete and return to the user before any
or all hooks have completed executing.  An exception
during hook on-query-complete execution will simply be logged
and will not be (directly) fatal to the system.

Tests:

- added unit tests for FE hook execution
- added e2e tests for hook configuration, execution, error
- ran full build, tests

Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1
---
M be/src/service/frontend.cc
M be/src/service/frontend.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M common/thrift/Frontend.thrift
A fe/src/main/java/org/apache/impala/hooks/QueryCompleteContext.java
A fe/src/main/java/org/apache/impala/hooks/QueryEventHook.java
A fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
A fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java
M fe/src/test/java/org/apache/impala/service/JniFrontendTest.java
A fe/src/test/java/org/apache/impala/testutil/AlwaysErrorQueryEventHook.java
A fe/src/test/java/org/apache/impala/testutil/CountingQueryEventHook.java
A fe/src/test/java/org/apache/impala/testutil/DummyQueryEventHook.java
A fe/src/test/java/org/apache/impala/testutil/PostQueryErrorEventHook.java
A tests/hooks/test_hooks.py
20 files changed, 1,231 insertions(+), 24 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/13352/19
--
To view, visit http://gerrit.cloudera.org:8080/13352
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1
Gerrit-Change-Number: 13352
Gerrit-PatchSet: 19
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Anonymous Coward (498)
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: radford nguyen 


[Impala-ASF-CR] IMPALA-8473: publish lineage info via hook

2019-05-22 Thread radford nguyen (Code Review)
Hello Andrew Sherman, Anonymous Coward (498), Austin Nobis, Fredy Wijaya, Tim 
Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13352

to look at the new patch set (#18).

Change subject: IMPALA-8473: publish lineage info via hook
..

IMPALA-8473: publish lineage info via hook

This commit introduces a hook mechanism for publishing,
lineage data specifically, but query information more
generally, from Impala.

The legacy behavior of writing the lineage file is
being retained but deprecated.

Hooks can be implemented by downstream consumers (i.e.
runtime dependencies) to hook into supported places during
Impala query execution:

- impalad startup
- query completion
- see IMPALA-8572 for caveat/details

The consumers are to be frontend Java dependencies
intiated at runtime. 2 backend flags configure this
behavior:

- `query_event_hook_classes` specifies a comma-separated
list of hook consumer implementation classes that
are instantiated and registered at impala start up.

- `query_event_hook_nthreads`
specifies the number of threads to use for asynchronous
hook execution.  (Relevant if multiple hooks are
registered.)

Lineage information is passed from the backend after
a query completes (but before it returns) and given
to every hook to execute asynchronously.  In other words,
a query may complete and return to the user before any
or all hooks have completed executing.  An exception
during hook on-query-complete execution will simply be logged
and will not be (directly) fatal to the system.

Tests:

- added unit tests for FE hook execution
- added e2e tests for hook configuration, execution, error
- ran full build, tests

Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1
---
M be/src/service/frontend.cc
M be/src/service/frontend.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M common/thrift/Frontend.thrift
A fe/src/main/java/org/apache/impala/hooks/QueryCompleteContext.java
A fe/src/main/java/org/apache/impala/hooks/QueryEventHook.java
A fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
A fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java
M fe/src/test/java/org/apache/impala/service/JniFrontendTest.java
A fe/src/test/java/org/apache/impala/testutil/AlwaysErrorQueryEventHook.java
A fe/src/test/java/org/apache/impala/testutil/CountingQueryEventHook.java
A fe/src/test/java/org/apache/impala/testutil/DummyQueryEventHook.java
A fe/src/test/java/org/apache/impala/testutil/PostQueryErrorEventHook.java
A tests/hooks/test_hooks.py
20 files changed, 1,230 insertions(+), 24 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/13352/18
--
To view, visit http://gerrit.cloudera.org:8080/13352
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1
Gerrit-Change-Number: 13352
Gerrit-PatchSet: 18
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Anonymous Coward (498)
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: radford nguyen 


[Impala-ASF-CR] IMPALA-8473: publish lineage info via hook

2019-05-22 Thread radford nguyen (Code Review)
Hello Andrew Sherman, Anonymous Coward (498), Austin Nobis, Fredy Wijaya, Tim 
Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13352

to look at the new patch set (#17).

Change subject: IMPALA-8473: publish lineage info via hook
..

IMPALA-8473: publish lineage info via hook

This commit introduces a hook mechanism for publishing,
lineage data specifically, but query information more
generally, from Impala.

The legacy behavior of writing the lineage file is
being retained but deprecated.

Hooks can be implemented by downstream consumers (i.e.
runtime dependencies) to hook into supported places during
Impala query execution:

- impalad startup
- post-query execution

The consumers are to be frontend Java dependencies
intiated at runtime. 2 backend flags configure this
behavior:

- `query_exec_hook_classes` specifies a comma-separated
list of hook consumer implementation classes that
are instantiated and registered at impala start up.

- `num_query_exec_hook_threads`
specifies the number of threads to use for asynchronous
hook execution.  (Relevant if multiple hooks are
registered.)

Lineage information is passed from the backend after
a query completes (but before it returns) and given
to every hook to execute asynchronously.  In other words,
a query may complete and return to the user before any
or all hooks have completed executing.  An exception
during hook post-query execution will simply be logged
and will not bring down the system.

Tests:

- added unit tests for FE hook execution
- added e2e tests for hook configuration, execution, error
- ran full build, tests

Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1
---
M be/src/service/frontend.cc
M be/src/service/frontend.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M common/thrift/Frontend.thrift
A fe/src/main/java/org/apache/impala/hooks/PostQueryHookContext.java
A fe/src/main/java/org/apache/impala/hooks/QueryExecHook.java
A fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
A fe/src/test/java/org/apache/impala/hooks/QueryExecHookManagerTest.java
M fe/src/test/java/org/apache/impala/service/JniFrontendTest.java
A fe/src/test/java/org/apache/impala/testutil/AlwaysErrorQueryExecHook.java
A fe/src/test/java/org/apache/impala/testutil/CountingQueryExecHook.java
A fe/src/test/java/org/apache/impala/testutil/DummyQueryExecHook.java
A fe/src/test/java/org/apache/impala/testutil/PostQueryErrorExecHook.java
A tests/hooks/test_hooks.py
20 files changed, 1,183 insertions(+), 24 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/13352/17
--
To view, visit http://gerrit.cloudera.org:8080/13352
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1
Gerrit-Change-Number: 13352
Gerrit-PatchSet: 17
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Anonymous Coward (498)
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: radford nguyen 


[Impala-ASF-CR] IMPALA-8473: publish lineage info via hook

2019-05-22 Thread radford nguyen (Code Review)
radford nguyen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13352 )

Change subject: IMPALA-8473: publish lineage info via hook
..


Patch Set 17:

(15 comments)

still one more iteration needed

http://gerrit.cloudera.org:8080/#/c/13352/16//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13352/16//COMMIT_MSG@38
PS16, Line 38: to every hook to execute asynchronously.  In other words,
> Nit: spell out IOW to help future non-native English speakers
good point; done


http://gerrit.cloudera.org:8080/#/c/13352/17/fe/src/main/java/org/apache/impala/hooks/QueryExecHook.java
File fe/src/main/java/org/apache/impala/hooks/QueryExecHook.java:

http://gerrit.cloudera.org:8080/#/c/13352/17/fe/src/main/java/org/apache/impala/hooks/QueryExecHook.java@25
PS17, Line 25: public interface QueryExecHook {
Before we merge this, are we happy with this name?  The backend seems to use 
the terminology "query event", e.g. `(*request_state)->query_events()`

https://github.com/apache/impala/blob/3.2.0/be/src/service/impala-server.cc#L934

Perhaps we should rename this to `QueryEventHook` and likewise in other similar 
places.


http://gerrit.cloudera.org:8080/#/c/13352/16/tests/hooks/test_hooks.py
File tests/hooks/test_hooks.py:

http://gerrit.cloudera.org:8080/#/c/13352/16/tests/hooks/test_hooks.py@299
PS16, Line 299:   pass
> If you make LOG_DIR unique you could leave it around which might help someo
much better; done


http://gerrit.cloudera.org:8080/#/c/13352/17/tests/hooks/test_hooks.py
File tests/hooks/test_hooks.py:

http://gerrit.cloudera.org:8080/#/c/13352/17/tests/hooks/test_hooks.py@30
PS17, Line 30: from getpass import getuser
> flake8: E402 module level import not at top of file
Done


http://gerrit.cloudera.org:8080/#/c/13352/17/tests/hooks/test_hooks.py@31
PS17, Line 31: from ImpalaService import ImpalaHiveServer2Service
> flake8: E402 module level import not at top of file
Done


http://gerrit.cloudera.org:8080/#/c/13352/17/tests/hooks/test_hooks.py@32
PS17, Line 32: from TCLIService import TCLIService
> flake8: E402 module level import not at top of file
Done


http://gerrit.cloudera.org:8080/#/c/13352/17/tests/hooks/test_hooks.py@33
PS17, Line 33: from thrift.transport.TSocket import TSocket
> flake8: E402 module level import not at top of file
Done


http://gerrit.cloudera.org:8080/#/c/13352/17/tests/hooks/test_hooks.py@34
PS17, Line 34: from thrift.transport.TTransport import TBufferedTransport
> flake8: E402 module level import not at top of file
Done


http://gerrit.cloudera.org:8080/#/c/13352/17/tests/hooks/test_hooks.py@35
PS17, Line 35: from thrift.protocol import TBinaryProtocol
> flake8: E402 module level import not at top of file
Done


http://gerrit.cloudera.org:8080/#/c/13352/17/tests/hooks/test_hooks.py@36
PS17, Line 36: from tests.common.custom_cluster_test_suite import 
CustomClusterTestSuite
> flake8: E402 module level import not at top of file
Done


http://gerrit.cloudera.org:8080/#/c/13352/17/tests/hooks/test_hooks.py@37
PS17, Line 37: from tests.common.file_utils import assert_file_in_dir_contains,\
> flake8: E402 module level import not at top of file
Done


http://gerrit.cloudera.org:8080/#/c/13352/17/tests/hooks/test_hooks.py@37
PS17, Line 37: from tests.common.file_utils import assert_file_in_dir_contains,\
> flake8: F401 'tests.common.file_utils.assert_no_files_in_dir_contain' impor
Done


http://gerrit.cloudera.org:8080/#/c/13352/17/tests/hooks/test_hooks.py@205
PS17, Line 205:   def __wait_for_file(self, filepath, timeout_s=10):
Should this go in common/file_utils.py?  I didn't put it there now because it's 
just a first-pass, quick-and-dirty implementation


http://gerrit.cloudera.org:8080/#/c/13352/17/tests/hooks/test_hooks.py@217
PS17, Line 217: class TestHooksStartupFail(CustomClusterTestSuite):
> flake8: E302 expected 2 blank lines, found 1
Done


http://gerrit.cloudera.org:8080/#/c/13352/17/tests/hooks/test_hooks.py@248
PS17, Line 248: d
> flake8: E304 blank lines found after function decorator
Done



--
To view, visit http://gerrit.cloudera.org:8080/13352
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1
Gerrit-Change-Number: 13352
Gerrit-PatchSet: 17
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Anonymous Coward (498)
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: radford nguyen 
Gerrit-Comment-Date: Wed, 22 May 2019 20:11:33 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8473: publish lineage info via hook

2019-05-22 Thread radford nguyen (Code Review)
radford nguyen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13352 )

Change subject: IMPALA-8473: publish lineage info via hook
..


Patch Set 14:

(1 comment)

As discussed in Slack, I've uploaded a simple change from the last patchset 
where we pass the json lineage string as opposed to the thrift object.  (This 
is actually the same approach as the first few patchsets

http://gerrit.cloudera.org:8080/#/c/13352/15/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/13352/15/be/src/service/impala-server.cc@499
PS15, Line 499:   if (AreQueryHooksEnabled()) {
> One thing to be aware of this is that LogLineageRecord() is called from Unr
Really great point; I've copied this comment into the hook interface javadoc



--
To view, visit http://gerrit.cloudera.org:8080/13352
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1
Gerrit-Change-Number: 13352
Gerrit-PatchSet: 14
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Anonymous Coward (498)
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: radford nguyen 
Gerrit-Comment-Date: Wed, 22 May 2019 07:18:30 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8473: publish lineage info via hook

2019-05-22 Thread radford nguyen (Code Review)
Hello Andrew Sherman, Anonymous Coward (498), Austin Nobis, Fredy Wijaya, Tim 
Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13352

to look at the new patch set (#16).

Change subject: IMPALA-8473: publish lineage info via hook
..

IMPALA-8473: publish lineage info via hook

This commit introduces a hook mechanism for publishing,
lineage data specifically, but query information more
generally, from Impala.

The legacy behavior of writing the lineage file is
being retained but deprecated.

Hooks can be implemented by downstream consumers (i.e.
runtime dependencies) to hook into supported places during
Impala query execution:

- impalad startup
- post-query execution

The consumers are to be frontend Java dependencies
intiated at runtime. 2 backend flags configure this
behavior:

- `query_exec_hook_classes` specifies a comma-separated
list of hook consumer implementation classes that
are instantiated and registered at impala start up.

- `num_query_exec_hook_threads`
specifies the number of threads to use for asynchronous
hook execution.  (Relevant if multiple hooks are
registered.)

Lineage information is passed from the backend after
a query completes (but before it returns) and given
to every hook to execute asynchronously.  IOW, a
query may complete and return to the user before any
or all hooks have completed executing.  An exception
during hook post-query execution will simply be logged
and will not bring down the system.

Tests:

- added unit tests for FE hook execution
- added e2e tests for hook configuration, execution, error

Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1
---
M be/src/service/frontend.cc
M be/src/service/frontend.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M common/thrift/Frontend.thrift
A fe/src/main/java/org/apache/impala/hooks/PostQueryHookContext.java
A fe/src/main/java/org/apache/impala/hooks/QueryExecHook.java
A fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
A fe/src/test/java/org/apache/impala/hooks/QueryExecHookManagerTest.java
A fe/src/test/java/org/apache/impala/testutil/AlwaysErrorQueryExecHook.java
A fe/src/test/java/org/apache/impala/testutil/CountingQueryExecHook.java
A fe/src/test/java/org/apache/impala/testutil/DummyQueryExecHook.java
A fe/src/test/java/org/apache/impala/testutil/PostQueryErrorExecHook.java
A tests/hooks/test_hooks.py
19 files changed, 1,118 insertions(+), 24 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/13352/16
--
To view, visit http://gerrit.cloudera.org:8080/13352
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1
Gerrit-Change-Number: 13352
Gerrit-PatchSet: 16
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Anonymous Coward (498)
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: radford nguyen 


[Impala-ASF-CR] IMPALA-8473: publish lineage info via hook

2019-05-20 Thread radford nguyen (Code Review)
radford nguyen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13352 )

Change subject: IMPALA-8473: publish lineage info via hook
..


Patch Set 14:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13352/15/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java
File fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java:

http://gerrit.cloudera.org:8080/#/c/13352/15/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java@161
PS15, Line 161:   hookExecutor_.shutdown();
> Don't think so; my understanding is that all submitted jobs will still exec
Regardless, I think this does need to `awaitTermination` because any hooks that 
hang will hang the system exit, and we need to check for that and kill them here



--
To view, visit http://gerrit.cloudera.org:8080/13352
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1
Gerrit-Change-Number: 13352
Gerrit-PatchSet: 14
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Anonymous Coward (498)
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: radford nguyen 
Gerrit-Comment-Date: Tue, 21 May 2019 00:12:54 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8473: publish lineage info via hook

2019-05-20 Thread radford nguyen (Code Review)
radford nguyen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13352 )

Change subject: IMPALA-8473: publish lineage info via hook
..


Patch Set 14:

(8 comments)

Great feedback on PS14 and 15.  I replied to most comments.

Another PS coming soon, incorporating Andrew's suggestions about 
error-handling.  Also mentioned logging metrics to the webui to aid in 
debugging/diagnosis, but I'm not sure how to do this yet.  Will focus on 
capturing them for now.

http://gerrit.cloudera.org:8080/#/c/13352/15/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/13352/15/be/src/service/impala-server.cc@505
PS15, Line 505: if (!status.ok()) {
> I don't see a test case for this.
This status won't indicate if any futures threw an exception (based on current 
implementation of Frontend.postQueryHook).  It will only be a fail if the job 
submission itself threw an exception.

So there's no test case to check for hooks throwing exceptions to trigger a 
fail status because that's not the feature.  Currently, the intended behavior 
is that no exceptions thrown from a hook will do anything to the system except 
log themselves.  (See comments/javadoc in FE).


http://gerrit.cloudera.org:8080/#/c/13352/14/fe/src/main/java/org/apache/impala/hooks/QueryExecHook.java
File fe/src/main/java/org/apache/impala/hooks/QueryExecHook.java:

http://gerrit.cloudera.org:8080/#/c/13352/14/fe/src/main/java/org/apache/impala/hooks/QueryExecHook.java@59
PS14, Line 59:   void postQueryExecute(PostQueryHookContext context);
> Would it be useful to the Hook implementor to have some sort of Log object
Sounds useful.  Would you all be ok with simply passing in the actual `Logger`, 
or do you think that coupling the interface to our logging choice is a bad 
idea?  It's kind of a wash to me since slf4j is supposed to be a facade anyway.


http://gerrit.cloudera.org:8080/#/c/13352/15/fe/src/main/java/org/apache/impala/hooks/QueryExecHook.java
File fe/src/main/java/org/apache/impala/hooks/QueryExecHook.java:

http://gerrit.cloudera.org:8080/#/c/13352/15/fe/src/main/java/org/apache/impala/hooks/QueryExecHook.java@35
PS15, Line 35:* Any {@link Exception} thrown from this method will 
effectively fail
> Is this the proper behavior? Do we really want to prevent impala from start
I think so, as I am considering a failing hook startup to be something akin to 
a configuration error.

I like the `onFoo` naming scheme as well, and I'd change all hook methods to 
match this


http://gerrit.cloudera.org:8080/#/c/13352/15/fe/src/main/java/org/apache/impala/hooks/QueryExecHook.java@52
PS15, Line 52:* Any {@link Exception} thrown from this method will only be 
caught
> Aren't the exceptions from this function re-thrown in the Manager?
They're rethrown in the thread pool executor, which only affects the Future.  
Note that all the Futures are currently ignored by the Frontend.

The only reason they're caught here in the first place is to log them.


http://gerrit.cloudera.org:8080/#/c/13352/12/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java
File fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java:

http://gerrit.cloudera.org:8080/#/c/13352/12/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java@178
PS12, Line 178:* asynchronously, returning immediately with a List of 
{@link Future}s
> I missed this design doc, is there a link?
Here's the initial design doc: 
https://docs.google.com/document/d/1W476Zvjada9WBgmCCywnAakchfE1eNF7zyTy4cY9Q5w/edit?usp=sharing

Note that the design/approach has evolved since the time-of-writing, as more 
feedback has been incorporated


http://gerrit.cloudera.org:8080/#/c/13352/15/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java
File fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java:

http://gerrit.cloudera.org:8080/#/c/13352/15/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java@161
PS15, Line 161:   hookExecutor_.shutdown();
> I think it is possible to lose logs during a shutdown if we don't `awaitTer
Don't think so; my understanding is that all submitted jobs will still 
execute-to-completion on the executor threads.  `awaitTermination` would only 
block the `cleanUp` thread until they have completed.


http://gerrit.cloudera.org:8080/#/c/13352/15/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java@203
PS15, Line 203:   return hookExecutor_.submit(() -> {
> As I understand it, submit() adds the work to some sort of queue. So if the
I believe the `ExecutorService` I am using rejects new work submission when its 
internal queue is full, and it provides a configurable hook for handling when 
that happens.  This is something I need to look into some more, but I _believe_ 
that the code as-written will throw a runtime exception when this occurs on 
`hookExecutor_.

[Impala-ASF-CR] IMPALA-8473: publish lineage info via hook

2019-05-17 Thread radford nguyen (Code Review)
Hello Andrew Sherman, Anonymous Coward (498), Fredy Wijaya, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13352

to look at the new patch set (#15).

Change subject: IMPALA-8473: publish lineage info via hook
..

IMPALA-8473: publish lineage info via hook

This commit introduces a hook mechanism for publishing,
lineage data specifically, but query information more
generally, from Impala.

The legacy behavior of writing the lineage file is
being retained but deprecated.

Hooks can be implemented by downstream consumers (i.e.
runtime dependencies) to hook into supported places during
Impala query execution:

- impalad startup
- post-query execution

The consumers are to be frontend Java dependencies
intiated at runtime. 2 backend flags configure this
behavior:

- `query_exec_hook_classes` specifies a comma-separated
list of hook consumer implementation classes that
are instantiated and registered at impala start up.

- `num_query_exec_hook_threads`
specifies the number of threads to use for asynchronous
hook execution.  (Relevant if multiple hooks are
registered.)

Lineage information is passed from the backend after
a query completes (but before it returns) and given
to every hook to execute asynchronously.  IOW, a
query may complete and return to the user before any
or all hooks have completed executing.  An exception
during hook post-query execution will simply be logged
and will not bring down the system.

Tests:

- added unit tests for FE hook execution
- added e2e tests for hook configuration, execution, error

Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1
---
M be/src/service/frontend.cc
M be/src/service/frontend.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M common/thrift/Frontend.thrift
A fe/src/main/java/org/apache/impala/hooks/PostQueryHookContext.java
A fe/src/main/java/org/apache/impala/hooks/QueryExecHook.java
A fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
A fe/src/test/java/org/apache/impala/hooks/QueryExecHookManagerTest.java
A fe/src/test/java/org/apache/impala/testutil/AlwaysErrorQueryExecHook.java
A fe/src/test/java/org/apache/impala/testutil/CountingQueryExecHook.java
A fe/src/test/java/org/apache/impala/testutil/DummyQueryExecHook.java
A fe/src/test/java/org/apache/impala/testutil/PostQueryErrorExecHook.java
A tests/hooks/test_hooks.py
19 files changed, 1,105 insertions(+), 26 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/13352/15
--
To view, visit http://gerrit.cloudera.org:8080/13352
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1
Gerrit-Change-Number: 13352
Gerrit-PatchSet: 15
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Anonymous Coward (498)
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: radford nguyen 


[Impala-ASF-CR] IMPALA-8473: publish lineage info via hook

2019-05-17 Thread radford nguyen (Code Review)
radford nguyen has uploaded a new patch set (#14). ( 
http://gerrit.cloudera.org:8080/13352 )

Change subject: IMPALA-8473: publish lineage info via hook
..

IMPALA-8473: publish lineage info via hook

This commit introduces a hook mechanism for publishing,
lineage data specifically, but query information more
generally, from Impala.

The legacy behavior of writing the lineage file is
being retained but deprecated.

Hooks can be implemented by downstream consumers (i.e.
runtime dependencies) to hook into supported places during
Impala query execution:

- impalad startup
- post-query execution

The consumers are to be frontend Java dependencies
intiated at runtime. 2 backend flags configure this
behavior:

- `query_exec_hook_classes` specifies a comma-separated
list of hook consumer implementation classes that
are instantiated and registered at impala start up.

- `num_query_exec_hook_threads`
specifies the number of threads to use for asynchronous
hook execution.  (Relevant if multiple hooks are
registered.)

Lineage information is passed from the backend after
a query completes (but before it returns) and given
to every hook to execute asynchronously.  IOW, a
query may complete and return to the user before any
or all hooks have completed executing.  An exception
during hook post-query execution will simply be logged
and will not bring down the system.

Tests:

- added unit tests for FE hook execution
- added e2e tests for hook configuration, execution, error

Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1
---
M be/src/service/frontend.cc
M be/src/service/frontend.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M common/thrift/Frontend.thrift
A fe/src/main/java/org/apache/impala/hooks/PostQueryHookContext.java
A fe/src/main/java/org/apache/impala/hooks/QueryExecHook.java
A fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
A fe/src/test/java/org/apache/impala/hooks/QueryExecHookManagerTest.java
A fe/src/test/java/org/apache/impala/testutil/AlwaysErrorQueryExecHook.java
A fe/src/test/java/org/apache/impala/testutil/CountingQueryExecHook.java
A fe/src/test/java/org/apache/impala/testutil/DummyQueryExecHook.java
A fe/src/test/java/org/apache/impala/testutil/PostQueryErrorExecHook.java
A tests/hooks/test_hooks.py
19 files changed, 1,120 insertions(+), 26 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/13352/14
--
To view, visit http://gerrit.cloudera.org:8080/13352
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1
Gerrit-Change-Number: 13352
Gerrit-PatchSet: 14
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Anonymous Coward (498)
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: radford nguyen 


[Impala-ASF-CR] [WIP] IMPALA-8473: publish lineage info via hook

2019-05-17 Thread radford nguyen (Code Review)
radford nguyen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13352 )

Change subject: [WIP] IMPALA-8473: publish lineage info via hook
..


Patch Set 12:

(4 comments)

answered some concerns around logging and error handling

http://gerrit.cloudera.org:8080/#/c/13352/7/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java
File fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java:

http://gerrit.cloudera.org:8080/#/c/13352/7/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java@55
PS7, Line 55:  * performed synchronously during {@link 
#createFromConfig(BackendConfig)}.
> line too long (91 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/13352/12/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java
File fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java:

http://gerrit.cloudera.org:8080/#/c/13352/12/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java@131
PS12, Line 131: throw new InternalException(msg, e);
> As the class names are specified externally to Impala, this is one of the p
Yes, any exception thrown from any hook during startup will kill the entire 
system.  I documented this better in the javadoc for the hook interface, but I 
should add it to this class javadoc as well.

Good suggestion to also log the error message; I erroneously assumed that the 
exception message/trace would automatically be logged somehow.


http://gerrit.cloudera.org:8080/#/c/13352/12/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java@146
PS12, Line 146: throw new InternalException(msg, e);
> What will happen to the system if this occurs?
Will log.

An exception from the hook here will kill startup.  The thinking is that this 
is akin to a system configuration error, in which case I don't think we would 
want to continue.


http://gerrit.cloudera.org:8080/#/c/13352/12/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java@178
PS12, Line 178:   hookExecutor_.submit(() -> 
hook.postQueryExecute(context));
> Suppose the PostExecHook is failing, how will we know?
Still WIP: I intend to log the exception and simply swallow it.  (See design 
doc discussion and also QueryExecHook.postQueryExecute javadoc section on 
"Error Handling")

It's not possible to fail the query since hook execution is async, and the only 
way (besides logging) I can think of to indicate to the user that a hook has 
failed is to either fail a subsequent query or trigger a daemon 
`CLEAN_EXIT_WITH_ERROR` in the backend.  I'm not really familiar enough with 
Impala to understand the implications of either approach, so logging along with 
buyer-beware documentation in the interface is the route I am pursuing for now.

This warrants more discussion, but I do believe/hope that better error-handling 
can be added later on without any redesign.



--
To view, visit http://gerrit.cloudera.org:8080/13352
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1
Gerrit-Change-Number: 13352
Gerrit-PatchSet: 12
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Anonymous Coward (498)
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: radford nguyen 
Gerrit-Comment-Date: Sat, 18 May 2019 00:14:17 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] [WIP] IMPALA-8473: publish lineage info via hook

2019-05-17 Thread radford nguyen (Code Review)
Hello Anonymous Coward (498), Fredy Wijaya, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13352

to look at the new patch set (#12).

Change subject: [WIP] IMPALA-8473: publish lineage info via hook
..

[WIP] IMPALA-8473: publish lineage info via hook

This commit introduces a hook mechanism for publishing,
lineage data specifically, but query information more
generally, from Impala.

Hooks can be consumed by downstream consumers (i.e.
runtime dependencies) at supported places during Impala
execution:

- impalad startup
- post-query execution

The consumers are to be frontend Java dependencies
intiated at runtime.  The backend flag
`query_exec_hook_classes` specifies a comma-separated
list of hook consumer implementation classes that
are instantiated and registered at impala start up.

The backend flag `num_query_exec_hook_threads`
specifies the number of threads to use for asynchronous
hook execution.  (Relevant if multiple hooks are
registered.)

Lineage information is passed from the backend after
a query completes (but before it returns) and given
to every hook to execute asynchronously.  IOW, a
query may complete and return to the user before any
or all hooks have completed executing.

Tests:

TBD

Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1
---
M be/src/service/frontend.cc
M be/src/service/frontend.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M common/thrift/Frontend.thrift
A fe/src/main/java/org/apache/impala/hooks/PostQueryHookContext.java
A fe/src/main/java/org/apache/impala/hooks/QueryExecHook.java
A fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
A fe/src/test/java/org/apache/impala/hooks/QueryExecHookManagerTest.java
A fe/src/test/java/org/apache/impala/testutil/AlwaysErrorQueryExecHook.java
A fe/src/test/java/org/apache/impala/testutil/DummyQueryExecHook.java
16 files changed, 581 insertions(+), 26 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/13352/12
--
To view, visit http://gerrit.cloudera.org:8080/13352
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1
Gerrit-Change-Number: 13352
Gerrit-PatchSet: 12
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Anonymous Coward (498)
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: radford nguyen 


[Impala-ASF-CR] [WIP] IMPALA-8473: publish lineage info via hook

2019-05-17 Thread radford nguyen (Code Review)
Hello Anonymous Coward (498), Fredy Wijaya, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13352

to look at the new patch set (#11).

Change subject: [WIP] IMPALA-8473: publish lineage info via hook
..

[WIP] IMPALA-8473: publish lineage info via hook

This commit introduces a hook mechanism for publishing,
lineage data specifically, but query information more
generally, from Impala.

Hooks can be consumed by downstream consumers (i.e.
runtime dependencies) at supported places during Impala
execution:

- impalad startup
- post-query execution

The consumers are to be frontend Java dependencies
intiated at runtime.  The hadoop property
`impala.post.exec.hooks` specifies a comma-separated
list of hook consumer implementation classes that
are instantiated and registered at impala start up.

Lineage information is passed from the backend after
a query completes (but before it returns) and given
to every hook to execute asynchronously.  IOW, a
query may complete and return to the user before any
or all hooks have completed executing.

Tests:

TBD

Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1
---
M be/src/service/frontend.cc
M be/src/service/frontend.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M common/thrift/Frontend.thrift
A fe/src/main/java/org/apache/impala/hooks/PostQueryHookContext.java
A fe/src/main/java/org/apache/impala/hooks/QueryExecHook.java
A fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
A fe/src/test/java/org/apache/impala/hooks/QueryExecHookManagerTest.java
A fe/src/test/java/org/apache/impala/testutil/AlwaysErrorQueryExecHook.java
A fe/src/test/java/org/apache/impala/testutil/DummyQueryExecHook.java
16 files changed, 579 insertions(+), 26 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/13352/11
--
To view, visit http://gerrit.cloudera.org:8080/13352
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1
Gerrit-Change-Number: 13352
Gerrit-PatchSet: 11
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Anonymous Coward (498)
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: radford nguyen 


[Impala-ASF-CR] [WIP] IMPALA-8473: publish lineage info via hook

2019-05-16 Thread radford nguyen (Code Review)
radford nguyen has uploaded a new patch set (#7). ( 
http://gerrit.cloudera.org:8080/13352 )

Change subject: [WIP] IMPALA-8473: publish lineage info via hook
..

[WIP] IMPALA-8473: publish lineage info via hook

This commit introduces a hook mechanism for publishing,
lineage data specifically, but query information more
generally, from Impala.

Hooks can be consumed by downstream consumers (i.e.
runtime dependencies) at supported places during Impala
execution:

- impalad startup
- post-query execution

The consumers are to be frontend Java dependencies
intiated at runtime.  The hadoop property
`impala.post.exec.hooks` specifies a comma-separated
list of hook consumer implementation classes that
are instantiated and registered at impala start up.

Lineage information is passed from the backend after
a query completes (but before it returns) and given
to every hook to execute asynchronously.  IOW, a
query may complete and return to the user before any
or all hooks have completed executing.

Tests:

TBD

Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1
---
M be/src/service/frontend.cc
M be/src/service/frontend.h
M be/src/service/impala-server.cc
A fe/src/main/java/org/apache/impala/hooks/PostQueryHookContext.java
A fe/src/main/java/org/apache/impala/hooks/QueryExecHook.java
A fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
A fe/src/test/java/org/apache/impala/hooks/QueryExecHookManagerTest.java
8 files changed, 436 insertions(+), 25 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/13352/7
--
To view, visit http://gerrit.cloudera.org:8080/13352
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1
Gerrit-Change-Number: 13352
Gerrit-PatchSet: 7
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Anonymous Coward (498)
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: radford nguyen 


[Impala-ASF-CR] [WIP] IMPALA-8473: publish lineage info via hook

2019-05-16 Thread radford nguyen (Code Review)
radford nguyen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13352 )

Change subject: [WIP] IMPALA-8473: publish lineage info via hook
..


Patch Set 7:

(14 comments)

http://gerrit.cloudera.org:8080/#/c/13352/6/be/src/service/frontend.h
File be/src/service/frontend.h:

http://gerrit.cloudera.org:8080/#/c/13352/6/be/src/service/frontend.h@190
PS6, Line 190: PostExecHook
> nit: PostQueryExecHook is probably a better name?
Done


http://gerrit.cloudera.org:8080/#/c/13352/2/fe/src/main/java/org/apache/impala/hooks/ImpalaExecHook.java
File fe/src/main/java/org/apache/impala/hooks/ImpalaExecHook.java:

http://gerrit.cloudera.org:8080/#/c/13352/2/fe/src/main/java/org/apache/impala/hooks/ImpalaExecHook.java@36
PS2, Line 36:
> It is dangerous to let plugin fail Impala in this way. Could impala catches
This is during impala daemon start-up; is it not preferred to prevent start-up 
if there is a configuration problem?


http://gerrit.cloudera.org:8080/#/c/13352/2/fe/src/main/java/org/apache/impala/hooks/ImpalaExecHook.java@47
PS2, Line 47:
> "when a (qualifying) Impala query" should it be "after a (qualifying) Impal
"has executed" already communicates that the point of time in question is in 
the past, so "after" is redundant.

But I think it's clearer, so I'll change it to "after"


http://gerrit.cloudera.org:8080/#/c/13352/2/fe/src/main/java/org/apache/impala/hooks/ImpalaExecHook.java@48
PS2, Line 48:
> should "has executed" be "has been executed"?
This is just a difference between active voice and passive voice.  No 
difference in meaning


http://gerrit.cloudera.org:8080/#/c/13352/2/fe/src/main/java/org/apache/impala/hooks/ImpalaExecHook.java@63
PS2, Line 63:
> why don't you have cleanup function, which will be called by Impala when Im
Because of the way impala shuts down, it will never be guaranteed to execute.  
This is covered and commented on in the design doc


http://gerrit.cloudera.org:8080/#/c/13352/6/fe/src/main/java/org/apache/impala/hooks/PostQueryHookContext.java
File fe/src/main/java/org/apache/impala/hooks/PostQueryHookContext.java:

http://gerrit.cloudera.org:8080/#/c/13352/6/fe/src/main/java/org/apache/impala/hooks/PostQueryHookContext.java@28
PS6, Line 28: lineageGraph
> nit: use _ suffix naming convention for member varaibles, i.e. lineageGraph
kk


http://gerrit.cloudera.org:8080/#/c/13352/6/fe/src/main/java/org/apache/impala/hooks/PostQueryHookContext.java@31
PS6, Line 31: Objects.requireNonNull(lineageGraph).deepCopy();
> I don't think we want to do a deep copy each time.
Each time it's constructed?  Or called?

Because a single context object is passed to all the hook implementations


http://gerrit.cloudera.org:8080/#/c/13352/6/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java
File fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java:

http://gerrit.cloudera.org:8080/#/c/13352/6/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java@33
PS6, Line 33: singleton
> i'm not 100% convinced why this needs to be a singleton. Can Frontend owns
I agree.  But what about JniFrontend owning it?


http://gerrit.cloudera.org:8080/#/c/13352/6/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java@78
PS6, Line 78: private static final int NUM_HOOK_THREADS = 3;
> should this be configurable?
Probably.  I'd like to get the hook configuration sorted first before I tackle 
this though, since they should probably use the same config mechanism


http://gerrit.cloudera.org:8080/#/c/13352/6/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java@97
PS6, Line 97: Configuration config
> I don't think passing hadoop configuration is correct here. Impala doesn't
Yeah that sounds better.  Using a backend flag


http://gerrit.cloudera.org:8080/#/c/13352/6/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java@98
PS6, Line 98: LOG.info("impalaStartup hook invoked with config: {}", config);
> config can get pretty big, we don't want to print it on info log level
The `toString` of the config object actually only prints the file names, not 
their content.


http://gerrit.cloudera.org:8080/#/c/13352/6/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java@152
PS6, Line 152:   @Override
 :   protected void finalize() throws Throwable {
 : this.cleanUp();
 : super.finalize();
 :   }
> i don't know if it's a good idea do this in finalize(), maybe use Runtime.a
Will do


http://gerrit.cloudera.org:8080/#/c/13352/6/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java@175
PS6, Line 175: void
> can we return a List instead of void in case we want to do somethin
Do you intend on passing the futures back to the backend?  I fully support at 
least storing the futures here in case we decide to take action later on, but 
since this is the "manager", i think that any course of action woul

[Impala-ASF-CR] [WIP] IMPALA-8473: publish lineage info via hook

2019-05-16 Thread radford nguyen (Code Review)
radford nguyen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13352 )

Change subject: [WIP] IMPALA-8473: publish lineage info via hook
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13352/5/fe/src/main/java/org/apache/impala/hooks/ImpalaHookContext.java
File fe/src/main/java/org/apache/impala/hooks/ImpalaHookContext.java:

http://gerrit.cloudera.org:8080/#/c/13352/5/fe/src/main/java/org/apache/impala/hooks/ImpalaHookContext.java@31
PS5, Line 31: this.lineageGraph = 
Objects.requireNonNull(lineageGraph).deepCopy();
Lot of defensive copies because `TLineageGraph` is mutable, and context objects 
are given by reference to potentially multiple hooks.  This initial copy may 
effect performance since it occurs even if no hooks are registered.  This could 
be avoided if we provide an additional flag to disable hooks, independent of 
hook registration.  Then we could avoid the JNI call and this constructor 
altogether.

Feedback?



--
To view, visit http://gerrit.cloudera.org:8080/13352
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1
Gerrit-Change-Number: 13352
Gerrit-PatchSet: 5
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Anonymous Coward (498)
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: radford nguyen 
Gerrit-Comment-Date: Thu, 16 May 2019 19:34:37 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] [WIP] IMPALA-8473: publish lineage info via hook

2019-05-16 Thread radford nguyen (Code Review)
radford nguyen has uploaded a new patch set (#5). ( 
http://gerrit.cloudera.org:8080/13352 )

Change subject: [WIP] IMPALA-8473: publish lineage info via hook
..

[WIP] IMPALA-8473: publish lineage info via hook

This commit introduces a hook mechanism for publishing,
lineage data specifically, but query information more
generally, from Impala.

Hooks can be consumed by downstream consumers (i.e.
runtime dependencies) at supported places during Impala
execution:

- impalad startup
- post-query execution

The consumers are to be frontend Java dependencies
intiated at runtime.  The hadoop property
`impala.post.exec.hooks` specifies a comma-separated
list of hook consumer implementation classes that
are instantiated and registered at impala start up.

Lineage information is passed from the backend after
a query completes (but before it returns) and given
to every hook to execute asynchronously.  IOW, a
query may complete and return to the user before any
or all hooks have completed executing.

Tests:

TBD

Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1
---
M be/src/service/frontend.cc
M be/src/service/frontend.h
M be/src/service/impala-server.cc
A fe/src/main/java/org/apache/impala/hooks/ImpalaExecHookManager.java
A fe/src/main/java/org/apache/impala/hooks/ImpalaHookContext.java
A fe/src/main/java/org/apache/impala/hooks/ImpalaQueryExecHook.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
7 files changed, 361 insertions(+), 25 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/13352/5
--
To view, visit http://gerrit.cloudera.org:8080/13352
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1
Gerrit-Change-Number: 13352
Gerrit-PatchSet: 5
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Anonymous Coward (498)
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] [WIP] IMPALA-8473: publish lineage info via hook

2019-05-16 Thread radford nguyen (Code Review)
radford nguyen has removed Todd Lipcon from this change.  ( 
http://gerrit.cloudera.org:8080/13352 )

Change subject: [WIP] IMPALA-8473: publish lineage info via hook
..


Removed reviewer Todd Lipcon.
--
To view, visit http://gerrit.cloudera.org:8080/13352
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: deleteReviewer
Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1
Gerrit-Change-Number: 13352
Gerrit-PatchSet: 3
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Anonymous Coward (498)
Gerrit-Reviewer: Fredy Wijaya 


[Impala-ASF-CR] [WIP] IMPALA-8473: publish lineage info via hook

2019-05-16 Thread radford nguyen (Code Review)
radford nguyen has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/13352


Change subject: [WIP] IMPALA-8473: publish lineage info via hook
..

[WIP] IMPALA-8473: publish lineage info via hook

This commit introduces a hook mechanism for publishing,
lineage data specifically, but query information more
generally, from Impala.

Hooks can be consumed by downstream consumers (i.e.
runtime dependencies) at supported places during Impala
execution:

- impalad startup
- post-query execution

The consumers are to be frontend Java dependencies
intiated at runtime.  The hadoop property
`impala.post.exec.hooks` specifies a comma-separated
list of hook consumer implementation classes that
are instantiated and registered at impala start up.

Lineage information is passed from the backend after
a query completes (but before it returns) and given
to every hook to execute asynchronously.  IOW, a
query may complete and return to the user before any
or all hooks have completed executing.

Tests:

TBD

Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1
---
M be/src/service/frontend.cc
M be/src/service/frontend.h
M be/src/service/impala-server.cc
A fe/src/main/java/org/apache/impala/hooks/ImpalaExecHook.java
A fe/src/main/java/org/apache/impala/hooks/ImpalaExecHookManager.java
A fe/src/main/java/org/apache/impala/hooks/ImpalaHookContext.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
7 files changed, 361 insertions(+), 25 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/13352/3
--
To view, visit http://gerrit.cloudera.org:8080/13352
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1
Gerrit-Change-Number: 13352
Gerrit-PatchSet: 3
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Anonymous Coward (498)
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Todd Lipcon 


[Impala-ASF-CR] IMPALA-8309: add user authorization provider flag

2019-04-22 Thread radford nguyen (Code Review)
Hello Austin Nobis, Fredy Wijaya, Todd Lipcon, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12901

to look at the new patch set (#10).

Change subject: IMPALA-8309: add user authorization_provider flag
..

IMPALA-8309: add user authorization_provider flag

This commit adds a `authorization_provider` user-facing flag
in order to provide a more human-readable alternative to the
`authorization_factory_class` for internally-provided
authorization strategies.

The `authorization_factory_class` flag is retained, but no
longer takes a default value if not specified.  The default
for `authorization_provider` is "sentry" in order to retain
backwards-compatibility.

If specified, `authorization_factory_class` will take
precedence.

Testing:

Manually started minicluster with each of following flags
and verified correct authorization strategy chosen:

- provider='' factory='' => sentry
- provider=sentry factory='' => sentry
- provider=ranger factory='' => ranger
- provider='' factory=sentry => sentry
- provider='' factory=ranger => ranger
- provider=sentry factory=sentry => sentry
- provider=ranger factory=sentry => sentry
- provider=sentry factory=ranger => ranger
- provider=ranger factory=ranger => ranger

Wrote unit tests to capture above assertions

Ran fe unit and e2e tests

Wrote e2e test to verify new flag behavior

Change-Id: I001c20505ba4f0562b60fdef73d15308e8500c19
---
M be/src/service/frontend.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/authorization/AuthorizationConfig.java
M fe/src/main/java/org/apache/impala/authorization/AuthorizationProvider.java
M fe/src/main/java/org/apache/impala/authorization/NoopAuthorizationFactory.java
M 
fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationConfig.java
M 
fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationConfig.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M fe/src/test/java/org/apache/impala/service/JniFrontendTest.java
A tests/authorization/test_provider.py
M tests/authorization/test_ranger.py
15 files changed, 304 insertions(+), 60 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/12901/10
--
To view, visit http://gerrit.cloudera.org:8080/12901
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I001c20505ba4f0562b60fdef73d15308e8500c19
Gerrit-Change-Number: 12901
Gerrit-PatchSet: 10
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: radford nguyen 


[Impala-ASF-CR] IMPALA-8309: add user authorization provider flag

2019-04-22 Thread radford nguyen (Code Review)
radford nguyen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12901 )

Change subject: IMPALA-8309: add user authorization_provider flag
..


Patch Set 9:

(19 comments)

http://gerrit.cloudera.org:8080/#/c/12901/9/fe/src/main/java/org/apache/impala/service/BackendConfig.java
File fe/src/main/java/org/apache/impala/service/BackendConfig.java:

http://gerrit.cloudera.org:8080/#/c/12901/9/fe/src/main/java/org/apache/impala/service/BackendConfig.java@141
PS9, Line 141: getAuthorizationFactoryClassOrNull
> nit: remove the `orNull` in the function name
because the annotation is already present or because you don't like the name?  
i think there should be some indication that the return value can be null, if 
we're not using Optional


http://gerrit.cloudera.org:8080/#/c/12901/9/fe/src/main/java/org/apache/impala/service/JniCatalog.java
File fe/src/main/java/org/apache/impala/service/JniCatalog.java:

http://gerrit.cloudera.org:8080/#/c/12901/9/fe/src/main/java/org/apache/impala/service/JniCatalog.java@114
PS9, Line 114: //<<< 67f77d41d40523074385b8dbccfa6ef6ef81dd57
> Why is this commented out?
ugh was sanity checking a merge conflict and forgot to delete


http://gerrit.cloudera.org:8080/#/c/12901/5/tests/authorization/test_provider.py
File tests/authorization/test_provider.py:

http://gerrit.cloudera.org:8080/#/c/12901/5/tests/authorization/test_provider.py@21
PS5, Line 21: import os
> flake8: F401 'time' imported but unused
Ack


http://gerrit.cloudera.org:8080/#/c/12901/5/tests/authorization/test_provider.py@25
PS5, Line 25: from tests.common.custom_cluster_test_suite import 
CustomClusterTestSuite
> flake8: F401 'tests.util.filesystem_utils.IS_LOCAL' imported but unused
Ack


http://gerrit.cloudera.org:8080/#/c/12901/5/tests/authorization/test_provider.py@28
PS5, Line 28: class TestProviderFails(CustomClusterTestSuite):
> flake8: F401 'tests.common.impala_test_suite.ImpalaTestSuite' imported but
Ack


http://gerrit.cloudera.org:8080/#/c/12901/5/tests/authorization/test_provider.py@38
PS5, Line 38:   LOG_DIR = tempfile.mkdtemp(prefix="test_provider_", 
dir=os.getenv("LOG_DIR"))
> line has trailing whitespace
Ack


http://gerrit.cloudera.org:8080/#/c/12901/5/tests/authorization/test_provider.py@39
PS5, Line 39:
> flake8: E302 expected 2 blank lines, found 1
Ack


http://gerrit.cloudera.org:8080/#/c/12901/5/tests/authorization/test_provider.py@53
PS5, Line 53: e
> flake8: E502 the backslash is redundant between brackets
Ack


http://gerrit.cloudera.org:8080/#/c/12901/5/tests/authorization/test_provider.py@54
PS5, Line 54:
> flake8: E502 the backslash is redundant between brackets
Ack


http://gerrit.cloudera.org:8080/#/c/12901/5/tests/authorization/test_provider.py@55
PS5, Line 55: a
> flake8: E502 the backslash is redundant between brackets
Ack


http://gerrit.cloudera.org:8080/#/c/12901/5/tests/authorization/test_provider.py@58
PS5, Line 58:
> flake8: E502 the backslash is redundant between brackets
Ack


http://gerrit.cloudera.org:8080/#/c/12901/5/tests/authorization/test_provider.py@59
PS5, Line 59:
> flake8: E502 the backslash is redundant between brackets
Ack


http://gerrit.cloudera.org:8080/#/c/12901/5/tests/authorization/test_provider.py@60
PS5, Line 60:
> flake8: E502 the backslash is redundant between brackets
Ack


http://gerrit.cloudera.org:8080/#/c/12901/5/tests/authorization/test_provider.py@66
PS5, Line 66:
> flake8: E502 the backslash is redundant between brackets
Ack


http://gerrit.cloudera.org:8080/#/c/12901/5/tests/authorization/test_provider.py@67
PS5, Line 67:
> flake8: E501 line too long (92 > 90 characters)
Ack


http://gerrit.cloudera.org:8080/#/c/12901/5/tests/authorization/test_provider.py@68
PS5, Line 68:
> flake8: E124 closing bracket does not match visual indentation
Ack


http://gerrit.cloudera.org:8080/#/c/12901/6/tests/authorization/test_provider.py
File tests/authorization/test_provider.py:

http://gerrit.cloudera.org:8080/#/c/12901/6/tests/authorization/test_provider.py@60
PS6, Line 60:
not the happiest with this approach; open to suggestions


http://gerrit.cloudera.org:8080/#/c/12901/7/tests/authorization/test_provider.py
File tests/authorization/test_provider.py:

http://gerrit.cloudera.org:8080/#/c/12901/7/tests/authorization/test_provider.py@27
PS7, Line 27:
> flake8: E302 expected 2 blank lines, found 1
Ack


http://gerrit.cloudera.org:8080/#/c/12901/4/tests/authorization/test_ranger.py
File tests/authorization/test_ranger.py:

http://gerrit.cloudera.org:8080/#/c/12901/4/tests/authorization/test_ranger.py@38
PS4, Line 38:   """
> Would prefer if there was an e2e test with an invalid as well as valid `--a
Ack



--
To view, visit http://gerrit.cloudera.org:8080/12901
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I001c20505ba4f0562b60fdef73d15308e8500c19
Gerrit-Cha

[Impala-ASF-CR] IMPALA-8309: add user authorization provider flag

2019-04-21 Thread radford nguyen (Code Review)
Hello Austin Nobis, Fredy Wijaya, Todd Lipcon, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12901

to look at the new patch set (#9).

Change subject: IMPALA-8309: add user authorization_provider flag
..

IMPALA-8309: add user authorization_provider flag

This commit adds a `authorization_provider` user-facing flag
in order to provide a more human-readable alternative to the
`authorization_factory_class` for internally-provided
authorization strategies.

The `authorization_factory_class` flag is retained, but no
longer takes a default value if not specified.  The default
for `authorization_provider` is "sentry" in order to retain
backwards-compatibility.

If specified, `authorization_factory_class` will take
precedence.

Testing:

Manually started minicluster with each of following flags
and verified correct authorization strategy chosen:

- provider='' factory='' => sentry
- provider=sentry factory='' => sentry
- provider=ranger factory='' => ranger
- provider='' factory=sentry => sentry
- provider='' factory=ranger => ranger
- provider=sentry factory=sentry => sentry
- provider=ranger factory=sentry => sentry
- provider=sentry factory=ranger => ranger
- provider=ranger factory=ranger => ranger

Wrote unit tests to capture above assertions

Ran fe unit and e2e tests

Wrote e2e test to verify new flag behavior

Change-Id: I001c20505ba4f0562b60fdef73d15308e8500c19
---
M be/src/service/frontend.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/authorization/AuthorizationConfig.java
M fe/src/main/java/org/apache/impala/authorization/AuthorizationProvider.java
M fe/src/main/java/org/apache/impala/authorization/NoopAuthorizationFactory.java
M 
fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationConfig.java
M 
fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationConfig.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M fe/src/test/java/org/apache/impala/service/JniFrontendTest.java
A tests/authorization/test_provider.py
M tests/authorization/test_ranger.py
15 files changed, 331 insertions(+), 61 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/12901/9
--
To view, visit http://gerrit.cloudera.org:8080/12901
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I001c20505ba4f0562b60fdef73d15308e8500c19
Gerrit-Change-Number: 12901
Gerrit-PatchSet: 9
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: radford nguyen 


[Impala-ASF-CR] IMPALA-8309: add user authorization provider flag

2019-04-17 Thread radford nguyen (Code Review)
Hello Austin Nobis, Fredy Wijaya, Todd Lipcon, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12901

to look at the new patch set (#8).

Change subject: IMPALA-8309: add user authorization_provider flag
..

IMPALA-8309: add user authorization_provider flag

This commit adds a `authorization_provider` user-facing flag
in order to provide a more human-readable alternative to the
`authorization_factory_class` for internally-provided
authorization strategies.

The `authorization_factory_class` flag is retained, but no
longer takes a default value if not specified.  The default
for `authorization_provider` is "sentry" in order to retain
backwards-compatibility.

If specified, `authorization_factory_class` will take
precedence.

Testing:

Manually started minicluster with each of following flags
and verified correct authorization strategy chosen:

- provider='' factory='' => sentry
- provider=sentry factory='' => sentry
- provider=ranger factory='' => ranger
- provider='' factory=sentry => sentry
- provider='' factory=ranger => ranger
- provider=sentry factory=sentry => sentry
- provider=ranger factory=sentry => sentry
- provider=sentry factory=ranger => ranger
- provider=ranger factory=ranger => ranger

Wrote unit tests to capture above assertions

Ran fe unit and e2e tests

Wrote e2e test to verify new flag behavior

Change-Id: I001c20505ba4f0562b60fdef73d15308e8500c19
---
M be/src/service/frontend.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/authorization/AuthorizationConfig.java
M fe/src/main/java/org/apache/impala/authorization/AuthorizationProvider.java
M fe/src/main/java/org/apache/impala/authorization/NoneAuthorizationFactory.java
M 
fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationConfig.java
M 
fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationConfig.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M fe/src/test/java/org/apache/impala/service/JniFrontendTest.java
A tests/authorization/test_provider.py
M tests/authorization/test_ranger.py
16 files changed, 306 insertions(+), 61 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/12901/8
--
To view, visit http://gerrit.cloudera.org:8080/12901
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I001c20505ba4f0562b60fdef73d15308e8500c19
Gerrit-Change-Number: 12901
Gerrit-PatchSet: 8
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: radford nguyen 


[Impala-ASF-CR] IMPALA-8309: add user authorization provider flag

2019-04-17 Thread radford nguyen (Code Review)
Hello Austin Nobis, Fredy Wijaya, Todd Lipcon, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12901

to look at the new patch set (#6).

Change subject: IMPALA-8309: add user authorization_provider flag
..

IMPALA-8309: add user authorization_provider flag

This commit adds a `authorization_provider` user-facing flag
in order to provide a more human-readable alternative to the
`authorization_factory_class` for internally-provided
authorization strategies.

The `authorization_factory_class` flag is retained, but no
longer takes a default value if not specified.  The default
for `authorization_provider` is "sentry" in order to retain
backwards-compatibility.

If specified, `authorization_factory_class` will take
precedence.

Testing:

Manually started minicluster with each of following flags
and verified correct authorization strategy chosen:

- provider='' factory='' => sentry
- provider=sentry factory='' => sentry
- provider=ranger factory='' => ranger
- provider='' factory=sentry => sentry
- provider='' factory=ranger => ranger
- provider=sentry factory=sentry => sentry
- provider=ranger factory=sentry => sentry
- provider=sentry factory=ranger => ranger
- provider=ranger factory=ranger => ranger

Wrote unit tests to capture above assertions

Ran fe unit and e2e tests

Wrote e2e test to verify new flag behavior

Change-Id: I001c20505ba4f0562b60fdef73d15308e8500c19
---
M be/src/service/frontend.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/authorization/AuthorizationConfig.java
M fe/src/main/java/org/apache/impala/authorization/AuthorizationProvider.java
M fe/src/main/java/org/apache/impala/authorization/NoneAuthorizationFactory.java
M 
fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationConfig.java
M 
fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationConfig.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M fe/src/test/java/org/apache/impala/service/JniFrontendTest.java
A tests/authorization/test_provider.py
M tests/authorization/test_ranger.py
16 files changed, 313 insertions(+), 61 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/12901/6
--
To view, visit http://gerrit.cloudera.org:8080/12901
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I001c20505ba4f0562b60fdef73d15308e8500c19
Gerrit-Change-Number: 12901
Gerrit-PatchSet: 6
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: radford nguyen 


[Impala-ASF-CR] IMPALA-8309: add user authorization provider flag

2019-04-17 Thread radford nguyen (Code Review)
Hello Austin Nobis, Fredy Wijaya, Todd Lipcon, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12901

to look at the new patch set (#7).

Change subject: IMPALA-8309: add user authorization_provider flag
..

IMPALA-8309: add user authorization_provider flag

This commit adds a `authorization_provider` user-facing flag
in order to provide a more human-readable alternative to the
`authorization_factory_class` for internally-provided
authorization strategies.

The `authorization_factory_class` flag is retained, but no
longer takes a default value if not specified.  The default
for `authorization_provider` is "sentry" in order to retain
backwards-compatibility.

If specified, `authorization_factory_class` will take
precedence.

Testing:

Manually started minicluster with each of following flags
and verified correct authorization strategy chosen:

- provider='' factory='' => sentry
- provider=sentry factory='' => sentry
- provider=ranger factory='' => ranger
- provider='' factory=sentry => sentry
- provider='' factory=ranger => ranger
- provider=sentry factory=sentry => sentry
- provider=ranger factory=sentry => sentry
- provider=sentry factory=ranger => ranger
- provider=ranger factory=ranger => ranger

Wrote unit tests to capture above assertions

Ran fe unit and e2e tests

Wrote e2e test to verify new flag behavior

Change-Id: I001c20505ba4f0562b60fdef73d15308e8500c19
---
M be/src/service/frontend.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/authorization/AuthorizationConfig.java
M fe/src/main/java/org/apache/impala/authorization/AuthorizationProvider.java
M fe/src/main/java/org/apache/impala/authorization/NoneAuthorizationFactory.java
M 
fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationConfig.java
M 
fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationConfig.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M fe/src/test/java/org/apache/impala/service/JniFrontendTest.java
A tests/authorization/test_provider.py
M tests/authorization/test_ranger.py
16 files changed, 305 insertions(+), 61 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/12901/7
--
To view, visit http://gerrit.cloudera.org:8080/12901
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I001c20505ba4f0562b60fdef73d15308e8500c19
Gerrit-Change-Number: 12901
Gerrit-PatchSet: 7
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: radford nguyen 


[Impala-ASF-CR] IMPALA-8309: add user authorization provider flag

2019-04-17 Thread radford nguyen (Code Review)
Hello Austin Nobis, Fredy Wijaya, Todd Lipcon, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12901

to look at the new patch set (#5).

Change subject: IMPALA-8309: add user authorization_provider flag
..

IMPALA-8309: add user authorization_provider flag

This commit adds a `authorization_provider` user-facing flag
in order to provide a more human-readable alternative to the
`authorization_factory_class` for internally-provided
authorization strategies.

The `authorization_factory_class` flag is retained, but no
longer takes a default value if not specified.  The default
for `authorization_provider` is "sentry" in order to retain
backwards-compatibility.

If specified, `authorization_factory_class` will take
precedence.

Testing:

Manually started minicluster with each of following flags
and verified correct authorization strategy chosen:

- provider='' factory='' => sentry
- provider=sentry factory='' => sentry
- provider=ranger factory='' => ranger
- provider='' factory=sentry => sentry
- provider='' factory=ranger => ranger
- provider=sentry factory=sentry => sentry
- provider=ranger factory=sentry => sentry
- provider=sentry factory=ranger => ranger
- provider=ranger factory=ranger => ranger

Wrote unit tests to capture above assertions

Ran fe unit and e2e tests

Wrote e2e test to verify new flag behavior

Change-Id: I001c20505ba4f0562b60fdef73d15308e8500c19
---
M be/src/service/frontend.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/authorization/AuthorizationConfig.java
M fe/src/main/java/org/apache/impala/authorization/AuthorizationProvider.java
M fe/src/main/java/org/apache/impala/authorization/NoneAuthorizationFactory.java
M 
fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationConfig.java
M 
fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationConfig.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M fe/src/test/java/org/apache/impala/service/JniFrontendTest.java
A tests/authorization/test_provider.py
M tests/authorization/test_ranger.py
16 files changed, 315 insertions(+), 61 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/12901/5
--
To view, visit http://gerrit.cloudera.org:8080/12901
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I001c20505ba4f0562b60fdef73d15308e8500c19
Gerrit-Change-Number: 12901
Gerrit-PatchSet: 5
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: radford nguyen 


[Impala-ASF-CR] IMPALA-8309: add user authorization provider flag

2019-04-10 Thread radford nguyen (Code Review)
radford nguyen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12901 )

Change subject: IMPALA-8309: add user authorization_provider flag
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12901/2/fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationConfig.java
File 
fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationConfig.java:

http://gerrit.cloudera.org:8080/#/c/12901/2/fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationConfig.java@130
PS2, Line 130:   public String getProviderName() { return "sentry"; }
> I dunno, in this case the magic string is returned as the implementation of
IOW, if we provide a constant, that just means that there will be multiple 
paths to getting the same information and they could potentially and 
unintentionally diverge



--
To view, visit http://gerrit.cloudera.org:8080/12901
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I001c20505ba4f0562b60fdef73d15308e8500c19
Gerrit-Change-Number: 12901
Gerrit-PatchSet: 4
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: radford nguyen 
Gerrit-Comment-Date: Thu, 11 Apr 2019 04:45:10 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8363: Fix E2E start with impala log dir

2019-04-10 Thread radford nguyen (Code Review)
radford nguyen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12935 )

Change subject: IMPALA-8363: Fix E2E start with impala_log_dir
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12935/1/tests/common/custom_cluster_test_suite.py
File tests/common/custom_cluster_test_suite.py:

http://gerrit.cloudera.org:8080/#/c/12935/1/tests/common/custom_cluster_test_suite.py@147
PS1, Line 147: kwargs = {
 :   "cluster_size": cluster_size,
 :   "num_coordinators": cluster_size,
 :   "expected_num_executors": cluster_size,
 :   "default_query_options": 
method.func_dict.get(DEFAULT_QUERY_OPTIONS)
 : }
 : if IMPALA_LOG_DIR in method.func_dict:
 :   kwargs["impala_log_dir"] = method.func_dict[IMPALA_LOG_DIR]
 : self._start_impala_cluster(cluster_args, **kwargs)
 : super(CustomClusterTestSuite, self).setup_
> Agree with Fredy. Could also do something clever with kwargs, e.g.
I think the kwargs approach is better because then `start_impala_cluster` can 
still be "in charge" of defaulting the `impala_log_dir` option



--
To view, visit http://gerrit.cloudera.org:8080/12935
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4f46f40474b4b380abe88647a37e8e4d2231d745
Gerrit-Change-Number: 12935
Gerrit-PatchSet: 2
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: radford nguyen 
Gerrit-Comment-Date: Thu, 11 Apr 2019 04:27:06 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8363: Fix E2E start with impala log dir

2019-04-10 Thread radford nguyen (Code Review)
Hello Austin Nobis, Fredy Wijaya, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12935

to look at the new patch set (#2).

Change subject: IMPALA-8363: Fix E2E start with impala_log_dir
..

IMPALA-8363: Fix E2E start with impala_log_dir

This commit fixes the `CustomClusterTestSuite` to wait for the
correct number of executors when `impala_log_dir` is specified
in the test decorator.  Previously, the default value of 3
was always used, regardless of `cluster_size`.

Testing:

- Manual verification using tests/authorization/test_ranger.py
with custom `impala_log_dir` and `cluster_size` arguments.
Failed before changes, passed after changes

- Ran all original E2E tests

Change-Id: I4f46f40474b4b380abe88647a37e8e4d2231d745
---
M tests/common/custom_cluster_test_suite.py
1 file changed, 8 insertions(+), 8 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/12935/2
--
To view, visit http://gerrit.cloudera.org:8080/12935
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4f46f40474b4b380abe88647a37e8e4d2231d745
Gerrit-Change-Number: 12935
Gerrit-PatchSet: 2
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: radford nguyen 


[Impala-ASF-CR] IMPALA-8363: fix e2e start with impala log dir

2019-04-09 Thread radford nguyen (Code Review)
radford nguyen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12935 )

Change subject: IMPALA-8363: fix e2e start with impala_log_dir
..


Patch Set 1:

> Patch Set 1:
>
> (1 comment)

I think the kwargs approach is better because then `start_impala_cluster` can 
still be "in charge" of defaulting the `impala_log_dir` option


--
To view, visit http://gerrit.cloudera.org:8080/12935
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4f46f40474b4b380abe88647a37e8e4d2231d745
Gerrit-Change-Number: 12935
Gerrit-PatchSet: 1
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: radford nguyen 
Gerrit-Comment-Date: Tue, 09 Apr 2019 18:44:06 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8363: fix e2e start with impala log dir

2019-04-04 Thread radford nguyen (Code Review)
radford nguyen has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/12935


Change subject: IMPALA-8363: fix e2e start with impala_log_dir
..

IMPALA-8363: fix e2e start with impala_log_dir

This commit fixes the `CustomClusterTestSuite` to wait for the
correct number of executors when `impala_log_dir` is specified
in the test decorator.  Previously, the default value of 3
was always used, regardless of `cluster_size`.

Testing:

- Ran tests/authorization/test_ranger.py with custom
`impala_log_dir` and `cluster_size` arguments and
observed pass.  Observed to fail before changes

- Ran all original e2e tests

Change-Id: I4f46f40474b4b380abe88647a37e8e4d2231d745
---
M tests/common/custom_cluster_test_suite.py
1 file changed, 2 insertions(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/12935/1
--
To view, visit http://gerrit.cloudera.org:8080/12935
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I4f46f40474b4b380abe88647a37e8e4d2231d745
Gerrit-Change-Number: 12935
Gerrit-PatchSet: 1
Gerrit-Owner: radford nguyen 


[Impala-ASF-CR] IMPALA-8309: add user authorization provider flag

2019-04-04 Thread radford nguyen (Code Review)
radford nguyen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12901 )

Change subject: IMPALA-8309: add user authorization_provider flag
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12901/2/fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationConfig.java
File 
fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationConfig.java:

http://gerrit.cloudera.org:8080/#/c/12901/2/fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationConfig.java@130
PS2, Line 130:   public String getProviderName() { return "sentry"; }
> Can we make "sentry" a public static final String?
I dunno, in this case the magic string is returned as the implementation of the 
abstract `getProviderName` method.  IOW, the method itself kind of acts as the 
public contract for getting the name, and any other entity which wants to get 
that value should be calling the method (as opposed to accessing a public 
static constant).



--
To view, visit http://gerrit.cloudera.org:8080/12901
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I001c20505ba4f0562b60fdef73d15308e8500c19
Gerrit-Change-Number: 12901
Gerrit-PatchSet: 4
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: radford nguyen 
Gerrit-Comment-Date: Thu, 04 Apr 2019 20:24:10 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8226: Add grant/revoke to/from group for Ranger

2019-04-04 Thread radford nguyen (Code Review)
radford nguyen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12914 )

Change subject: IMPALA-8226: Add grant/revoke to/from group for Ranger
..


Patch Set 11:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/12914/7/fe/src/main/java/org/apache/impala/authorization/ranger/RangerCatalogdAuthorizationManager.java
File 
fe/src/main/java/org/apache/impala/authorization/ranger/RangerCatalogdAuthorizationManager.java:

http://gerrit.cloudera.org:8080/#/c/12914/7/fe/src/main/java/org/apache/impala/authorization/ranger/RangerCatalogdAuthorizationManager.java@193
PS7, Line 193:   String user, List groups, String clusterName, 
List privileges) {
nit: We could probably use a `Collection groups` to be more general 
here, since the group's items are copied into a `List` when creating the 
request.  Same with `privileges`.


http://gerrit.cloudera.org:8080/#/c/12914/7/fe/src/main/java/org/apache/impala/authorization/ranger/RangerCatalogdAuthorizationManager.java@233
PS7, Line 233: if (!groups.isEmpty()) request.getGroups().addAll(groups);
nit: is the `if` statement really necessary given the contract of `addAll`?


http://gerrit.cloudera.org:8080/#/c/12914/10/fe/src/main/java/org/apache/impala/authorization/sentry/SentryImpaladAuthorizationManager.java
File 
fe/src/main/java/org/apache/impala/authorization/sentry/SentryImpaladAuthorizationManager.java:

http://gerrit.cloudera.org:8080/#/c/12914/10/fe/src/main/java/org/apache/impala/authorization/sentry/SentryImpaladAuthorizationManager.java@170
PS10, Line 170: "%s is not supported in Impalad", 
ClassUtil.getMethodName()));
Isn't it more accurate to say that this isn't supported with sentry?



--
To view, visit http://gerrit.cloudera.org:8080/12914
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I28b7b3e4c776ad1bb5bdc184c7d733d0b5ef5e96
Gerrit-Change-Number: 12914
Gerrit-PatchSet: 11
Gerrit-Owner: Austin Nobis 
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: radford nguyen 
Gerrit-Comment-Date: Thu, 04 Apr 2019 19:57:09 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8309: add user authorization provider flag

2019-04-03 Thread radford nguyen (Code Review)
radford nguyen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12901 )

Change subject: IMPALA-8309: add user authorization_provider flag
..


Patch Set 4:

(1 comment)

Modified existing e2e tests to use new `authorization_provider` flag

http://gerrit.cloudera.org:8080/#/c/12901/4/tests/authorization/test_ranger.py
File tests/authorization/test_ranger.py:

http://gerrit.cloudera.org:8080/#/c/12901/4/tests/authorization/test_ranger.py@38
PS4, Line 38:   "--authorization_provider=ranger")
I changed these to use the new flag as I imagine this is the preferred way to 
select internally-supported auth providers.

I didn't see any existing e2e tests for the `authorization_factory_class` flag, 
otherwise I would have duplicated them using the `authorization_provider` flag.

Up to this point, looks like we only have full test coverage of authorization 
selection behavior in the fe, and I'm wondering if this is enough



--
To view, visit http://gerrit.cloudera.org:8080/12901
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I001c20505ba4f0562b60fdef73d15308e8500c19
Gerrit-Change-Number: 12901
Gerrit-PatchSet: 4
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: radford nguyen 
Gerrit-Comment-Date: Thu, 04 Apr 2019 04:30:17 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8309: add user authorization provider flag

2019-04-03 Thread radford nguyen (Code Review)
Hello Austin Nobis, Fredy Wijaya, Todd Lipcon, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12901

to look at the new patch set (#4).

Change subject: IMPALA-8309: add user authorization_provider flag
..

IMPALA-8309: add user authorization_provider flag

This commit adds a `authorization_provider` user-facing flag
in order to provide a more human-readable alternative to the
`authorization_factory_class` for internally-provided
authorization strategies.

The `authorization_factory_class` flag is retained, but no
longer takes a default value if not specified.  The default
for `authorization_provider` is "sentry" in order to retain
backwards-compatibility.

If specified, `authorization_factory_class` will take
precedence.

Testing:

Manually started minicluster with each of following flags
and verified correct authorization strategy chosen:

- provider='' factory='' => sentry
- provider=sentry factory='' => sentry
- provider=ranger factory='' => ranger
- provider='' factory=sentry => sentry
- provider='' factory=ranger => ranger
- provider=sentry factory=sentry => sentry
- provider=ranger factory=sentry => sentry
- provider=sentry factory=ranger => ranger
- provider=ranger factory=ranger => ranger

Wrote unit tests to capture above assertions

Ran fe unit and e2e tests

Change-Id: I001c20505ba4f0562b60fdef73d15308e8500c19
---
M be/src/service/frontend.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/authorization/AuthorizationConfig.java
M fe/src/main/java/org/apache/impala/authorization/AuthorizationProvider.java
M fe/src/main/java/org/apache/impala/authorization/NoneAuthorizationFactory.java
M 
fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationConfig.java
M 
fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationConfig.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M fe/src/test/java/org/apache/impala/service/JniFrontendTest.java
M tests/authorization/test_ranger.py
15 files changed, 229 insertions(+), 61 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/12901/4
--
To view, visit http://gerrit.cloudera.org:8080/12901
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I001c20505ba4f0562b60fdef73d15308e8500c19
Gerrit-Change-Number: 12901
Gerrit-PatchSet: 4
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: radford nguyen 


[Impala-ASF-CR] IMPALA-8309: add user authorization provider flag

2019-04-03 Thread radford nguyen (Code Review)
Hello Austin Nobis, Fredy Wijaya, Todd Lipcon, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12901

to look at the new patch set (#3).

Change subject: IMPALA-8309: add user authorization_provider flag
..

IMPALA-8309: add user authorization_provider flag

This commit adds a `authorization_provider` user-facing flag
in order to provide a more human-readable alternative to the
`authorization_factory_class` for internally-provided
authorization strategies.

The `authorization_factory_class` flag is retained, but no
longer takes a default value if not specified.  The default
for `authorization_provider` is "sentry" in order to retain
backwards-compatibility.

If specified, `authorization_factory_class` will take
precedence.

Testing:

Manually started minicluster with each of following flags
and verified correct authorization strategy chosen:

- provider='' factory='' => sentry
- provider=sentry factory='' => sentry
- provider=ranger factory='' => ranger
- provider='' factory=sentry => sentry
- provider='' factory=ranger => ranger
- provider=sentry factory=sentry => sentry
- provider=ranger factory=sentry => sentry
- provider=sentry factory=ranger => ranger
- provider=ranger factory=ranger => ranger

Wrote unit tests to capture above assertions

Ran fe unit and e2e tests

Change-Id: I001c20505ba4f0562b60fdef73d15308e8500c19
---
M be/src/service/frontend.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/authorization/AuthorizationConfig.java
M fe/src/main/java/org/apache/impala/authorization/AuthorizationProvider.java
M fe/src/main/java/org/apache/impala/authorization/NoneAuthorizationFactory.java
M 
fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationConfig.java
M 
fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationConfig.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M fe/src/test/java/org/apache/impala/service/JniFrontendTest.java
14 files changed, 227 insertions(+), 57 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/12901/3
--
To view, visit http://gerrit.cloudera.org:8080/12901
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I001c20505ba4f0562b60fdef73d15308e8500c19
Gerrit-Change-Number: 12901
Gerrit-PatchSet: 3
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: radford nguyen 


[Impala-ASF-CR] IMPALA-8309: add user authorization provider flag

2019-04-03 Thread radford nguyen (Code Review)
radford nguyen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12901 )

Change subject: IMPALA-8309: add user authorization_provider flag
..


Patch Set 2:

(8 comments)

Addressed nits; adding e2e tests next

http://gerrit.cloudera.org:8080/#/c/12901/2/be/src/service/frontend.cc
File be/src/service/frontend.cc:

http://gerrit.cloudera.org:8080/#/c/12901/2/be/src/service/frontend.cc@41
PS2, Line 41: "sentry",
> nit: move default to line above like other DEFINE
Honest question: why?

I don't really see consistent usage in this file; for example 
`DEFINE_string(ranger_app_id` begins the description string on a newline but 
`DEFINE_string(server_name` begins the description on the same line... even 
though both description strings end up wrapping.

I think the current approach is perfectly readable, so I don't understand the 
point of these 3 nits.


http://gerrit.cloudera.org:8080/#/c/12901/2/fe/src/main/java/org/apache/impala/authorization/AuthorizationProvider.java
File 
fe/src/main/java/org/apache/impala/authorization/AuthorizationProvider.java:

http://gerrit.cloudera.org:8080/#/c/12901/2/fe/src/main/java/org/apache/impala/authorization/AuthorizationProvider.java@31
PS2, Line 31: factoryClass
> Should this be `factoryClassName`?
Yeah that's a better name.  I was waffling back and forth between using the 
actual `Class` object vs. the canonical name.


http://gerrit.cloudera.org:8080/#/c/12901/2/fe/src/main/java/org/apache/impala/service/BackendConfig.java
File fe/src/main/java/org/apache/impala/service/BackendConfig.java:

http://gerrit.cloudera.org:8080/#/c/12901/2/fe/src/main/java/org/apache/impala/service/BackendConfig.java@141
PS2, Line 141:   public @Nullable String getAuthorizationFactoryClassOrNull() {
> nit: I'm not opposed to adding the @Nullable annotation and adding `orNull`
That's the thing, I wasn't able to find another example in the codebase for 
handling optional flags.  If you could point one out I'll make sure this 
conforms


http://gerrit.cloudera.org:8080/#/c/12901/2/fe/src/main/java/org/apache/impala/service/JniFrontend.java
File fe/src/main/java/org/apache/impala/service/JniFrontend.java:

http://gerrit.cloudera.org:8080/#/c/12901/2/fe/src/main/java/org/apache/impala/service/JniFrontend.java@773
PS2, Line 773: * @param beCfg
 :* @return
 :* @throws InternalException
> Finish the documentation
Done


http://gerrit.cloudera.org:8080/#/c/12901/2/fe/src/main/java/org/apache/impala/service/JniFrontend.java@778
PS2, Line 778:   throws InternalException {
> nit: indent 4
Done


http://gerrit.cloudera.org:8080/#/c/12901/2/fe/src/main/java/org/apache/impala/service/JniFrontend.java@794
PS2, Line 794: +authzProvider
> nit: add space around the `+`
Done


http://gerrit.cloudera.org:8080/#/c/12901/2/fe/src/main/java/org/apache/impala/service/JniFrontend.java@812
PS2, Line 812:   throws InternalException {
> nit: indent 4
Done


http://gerrit.cloudera.org:8080/#/c/12901/2/fe/src/main/java/org/apache/impala/service/JniFrontend.java@822
PS2, Line 822: +authzFactoryClassName
> nit: add space around the `+`
Done



--
To view, visit http://gerrit.cloudera.org:8080/12901
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I001c20505ba4f0562b60fdef73d15308e8500c19
Gerrit-Change-Number: 12901
Gerrit-PatchSet: 2
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: radford nguyen 
Gerrit-Comment-Date: Thu, 04 Apr 2019 02:08:02 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8309: add user authorization provider flag

2019-04-01 Thread radford nguyen (Code Review)
radford nguyen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12901 )

Change subject: IMPALA-8309: add user authorization_provider flag
..


Patch Set 2:

(10 comments)

> Patch Set 1:
>
> (8 comments)

http://gerrit.cloudera.org:8080/#/c/12901/2/fe/src/main/java/org/apache/impala/service/BackendConfig.java
File fe/src/main/java/org/apache/impala/service/BackendConfig.java:

http://gerrit.cloudera.org:8080/#/c/12901/2/fe/src/main/java/org/apache/impala/service/BackendConfig.java@141
PS2, Line 141:   public @Nullable String getAuthorizationFactoryClassOrNull() {
Not sure if there's a standard for handling optional flags... at least I didn't 
really see one.

We could alternatively return an `Optional` (generally my preference) or just 
return the raw value that the be populates and let callers decide what is 
"present" or not.


http://gerrit.cloudera.org:8080/#/c/12901/1/fe/src/main/java/org/apache/impala/service/JniCatalog.java
File fe/src/main/java/org/apache/impala/service/JniCatalog.java:

http://gerrit.cloudera.org:8080/#/c/12901/1/fe/src/main/java/org/apache/impala/service/JniCatalog.java@113
PS1, Line 113: final AuthorizationFactory authzFactory
> line too long (99 > 90)
Ack


http://gerrit.cloudera.org:8080/#/c/12901/1/fe/src/test/java/org/apache/impala/service/JniFrontendTest.java
File fe/src/test/java/org/apache/impala/service/JniFrontendTest.java:

http://gerrit.cloudera.org:8080/#/c/12901/1/fe/src/test/java/org/apache/impala/service/JniFrontendTest.java@92
PS1, Line 92:   Class 
authorization_factory_class,
> line too long (105 > 90)
Ack


http://gerrit.cloudera.org:8080/#/c/12901/1/fe/src/test/java/org/apache/impala/service/JniFrontendTest.java@93
PS1, Line 93:   String authorization_provider) {
> line too long (91 > 90)
Ack


http://gerrit.cloudera.org:8080/#/c/12901/1/fe/src/test/java/org/apache/impala/service/JniFrontendTest.java@110
PS1, Line 110: JniFrontend.authzFactoryClassNameFrom(authCfg("", 
"ranger"))
> line too long (102 > 90)
Ack


http://gerrit.cloudera.org:8080/#/c/12901/1/fe/src/test/java/org/apache/impala/service/JniFrontendTest.java@113
PS1, Line 113:
> line too long (98 > 90)
Ack


http://gerrit.cloudera.org:8080/#/c/12901/1/fe/src/test/java/org/apache/impala/service/JniFrontendTest.java@118
PS1, Line 118: SentryAuthorizationFactory.class.getCanonicalName(),
> line too long (98 > 90)
Ack


http://gerrit.cloudera.org:8080/#/c/12901/1/fe/src/test/java/org/apache/impala/service/JniFrontendTest.java@123
PS1, Line 123:
> line too long (98 > 90)
Ack


http://gerrit.cloudera.org:8080/#/c/12901/1/fe/src/test/java/org/apache/impala/service/JniFrontendTest.java@128
PS1, Line 128: )
> line too long (98 > 90)
Ack


http://gerrit.cloudera.org:8080/#/c/12901/2/fe/src/test/java/org/apache/impala/service/JniFrontendTest.java
File fe/src/test/java/org/apache/impala/service/JniFrontendTest.java:

http://gerrit.cloudera.org:8080/#/c/12901/2/fe/src/test/java/org/apache/impala/service/JniFrontendTest.java@47
PS2, Line 47: origFlags = BackendConfig.INSTANCE.getBackendCfg();
This seems to be what other tests do



--
To view, visit http://gerrit.cloudera.org:8080/12901
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I001c20505ba4f0562b60fdef73d15308e8500c19
Gerrit-Change-Number: 12901
Gerrit-PatchSet: 2
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: radford nguyen 
Gerrit-Comment-Date: Mon, 01 Apr 2019 09:38:38 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8309: add user authorization provider flag

2019-04-01 Thread radford nguyen (Code Review)
Hello Austin Nobis, Fredy Wijaya, Todd Lipcon, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12901

to look at the new patch set (#2).

Change subject: IMPALA-8309: add user authorization_provider flag
..

IMPALA-8309: add user authorization_provider flag

This commit adds a `authorization_provider` user-facing flag
in order to provide a more human-readable alternative to the
`authorization_factory_class` for internally-provided
authorization strategies.

The `authorization_factory_class` flag is retained, but no
longer takes a default value if not specified.  The default
for `authorization_provider` is "sentry" in order to retain
backwards-compatibility.

If specified, `authorization_factory_class` will take
precedence.

Testing:

Manually started minicluster with each of following flags
and verified correct authorization strategy chosen:

- provider='' factory='' => sentry
- provider=sentry factory='' => sentry
- provider=ranger factory='' => ranger
- provider='' factory=sentry => sentry
- provider='' factory=ranger => ranger
- provider=sentry factory=sentry => sentry
- provider=ranger factory=sentry => sentry
- provider=sentry factory=ranger => ranger
- provider=ranger factory=ranger => ranger

Wrote unit tests to capture above assertions

Ran fe unit and e2e tests

Change-Id: I001c20505ba4f0562b60fdef73d15308e8500c19
---
M be/src/service/frontend.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/authorization/AuthorizationConfig.java
M fe/src/main/java/org/apache/impala/authorization/AuthorizationProvider.java
M fe/src/main/java/org/apache/impala/authorization/NoneAuthorizationFactory.java
M 
fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationConfig.java
M 
fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationConfig.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M fe/src/test/java/org/apache/impala/service/JniFrontendTest.java
14 files changed, 217 insertions(+), 57 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/12901/2
--
To view, visit http://gerrit.cloudera.org:8080/12901
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I001c20505ba4f0562b60fdef73d15308e8500c19
Gerrit-Change-Number: 12901
Gerrit-PatchSet: 2
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: radford nguyen 


[Impala-ASF-CR] IMPALA-8309: add user authorization provider flag

2019-03-31 Thread radford nguyen (Code Review)
radford nguyen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12901 )

Change subject: IMPALA-8309: add user authorization_provider flag
..


Patch Set 1:

Hi guys, relatively simply ticket here to add a more human-friendly flag for 
choosing authorization provider(s)


--
To view, visit http://gerrit.cloudera.org:8080/12901
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I001c20505ba4f0562b60fdef73d15308e8500c19
Gerrit-Change-Number: 12901
Gerrit-PatchSet: 1
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: radford nguyen 
Gerrit-Comment-Date: Mon, 01 Apr 2019 05:24:24 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8309: add user authorization provider flag

2019-03-31 Thread radford nguyen (Code Review)
radford nguyen has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/12901


Change subject: IMPALA-8309: add user authorization_provider flag
..

IMPALA-8309: add user authorization_provider flag

This commit adds a `authorization_provider` user-facing flag
in order to provide a more human-readable alternative to the
`authorization_factory_class` for internally-provided
authorization strategies.

The `authorization_factory_class` flag is retained, but no
longer takes a default value if not specified.  The default
for `authorization_provider` is "sentry" in order to retain
backwards-compatibility.

If specified, `authorization_factory_class` will take
precedence.

Testing:

Manually started minicluster with each of following flags
and verified correct authorization strategy chosen:

- provider='' factory='' => sentry
- provider=sentry factory='' => sentry
- provider=ranger factory='' => ranger
- provider='' factory=sentry => sentry
- provider='' factory=ranger => ranger
- provider=sentry factory=sentry => sentry
- provider=ranger factory=sentry => sentry
- provider=sentry factory=ranger => ranger
- provider=ranger factory=ranger => ranger

Wrote unit tests to capture above assertions

Ran fe unit and e2e tests

Change-Id: I001c20505ba4f0562b60fdef73d15308e8500c19
---
M be/src/service/frontend.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/authorization/AuthorizationConfig.java
M fe/src/main/java/org/apache/impala/authorization/AuthorizationProvider.java
M fe/src/main/java/org/apache/impala/authorization/NoneAuthorizationFactory.java
M 
fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationConfig.java
M 
fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationConfig.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M fe/src/test/java/org/apache/impala/service/JniFrontendTest.java
14 files changed, 201 insertions(+), 57 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/12901/1
--
To view, visit http://gerrit.cloudera.org:8080/12901
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I001c20505ba4f0562b60fdef73d15308e8500c19
Gerrit-Change-Number: 12901
Gerrit-PatchSet: 1
Gerrit-Owner: radford nguyen 


[Impala-ASF-CR] IMPALA-7917 (Part 2): Decouple Sentry from Impala

2019-03-06 Thread radford nguyen (Code Review)
radford nguyen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12542 )

Change subject: IMPALA-7917 (Part 2): Decouple Sentry from Impala
..


Patch Set 12:

(1 comment)

hi @fredy, just adding a teeny little comment

http://gerrit.cloudera.org:8080/#/c/12542/2/fe/src/main/java/org/apache/impala/analysis/CopyTestCaseStmt.java
File fe/src/main/java/org/apache/impala/analysis/CopyTestCaseStmt.java:

http://gerrit.cloudera.org:8080/#/c/12542/2/fe/src/main/java/org/apache/impala/analysis/CopyTestCaseStmt.java@121
PS2, Line 121: analyzer.registerPrivReq(builder ->
> I like the last one. Will update.
I also prefer the last one because it makes it easy to see which methods are 
called on `builder` at-a-glance.  With the one-liner, method calls on other 
objects (e.g. arguments) produce noise



--
To view, visit http://gerrit.cloudera.org:8080/12542
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I051d06451fb14ba2ffb3f7e0d4aae37dee55ab86
Gerrit-Change-Number: 12542
Gerrit-PatchSet: 12
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: radford nguyen 
Gerrit-Comment-Date: Thu, 07 Mar 2019 03:48:30 +
Gerrit-HasComments: Yes