[Impala-ASF-CR] IMPALA-8571: harden QueryEventHook execution
Hello Bharath Vissapragada, Andrew Sherman, Fredy Wijaya, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13748 to look at the new patch set (#29). Change subject: IMPALA-8571: harden QueryEventHook execution .. IMPALA-8571: harden QueryEventHook execution This commit hardens guarantees around QueryEventHook execution by adding the following features: *hook execution* Query event hooks are still executed in a thread-pool, but you can now configure the type of (Java) threads used to execute them via the newly-added backend flag `query_event_hook_use_daemon_threads`, which takes a true/false value. A daemon thread is one that will be halted immediately when all non-daemon thread have completed, for example when the JVM is shutdown. There are some technical implications of this, so please see Java Thread.setDaemon(boolean) for more details. *hook timeout/cancellation* A timeout for hook execution can be configured through the backend flag `query_event_hook_timeout_s`, which specifies a timeout value in seconds. If a hook has not completed execution within this timeout (measured from hook submission, not execution) then the hook task will be cancelled in order to free up resources. *hook rejection* The hook execution engine now has a fixed-capacity work queue whose capacity can be configured through the backend flag `query_event_hook_queue_capacity`. This queue is used to store hook tasks that are submitted when there are no free threads available for hook execution. All hook tasks submitted when the queue is at capacity will be rejected and logged without affecting the result of the query. *hook performance metrics* Hook performance metrics are captured by the frontend, but not yet published to the backend and are therefore not user-visible. See IMPALA-8914 for implementation of this feature. Testing: - added unit tests for new features - re-ran existing E2E tests Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 --- M be/src/service/frontend.h M be/src/service/impala-server.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift A fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java M fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/Frontend.java A fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java M fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java M query-event-hook-api/src/main/java/org/apache/impala/hooks/QueryEventHook.java 11 files changed, 834 insertions(+), 118 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/13748/29 -- To view, visit http://gerrit.cloudera.org:8080/13748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 Gerrit-Change-Number: 13748 Gerrit-PatchSet: 29 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: radford nguyen
[Impala-ASF-CR] IMPALA-8571: harden QueryEventHook execution
radford nguyen has posted comments on this change. ( http://gerrit.cloudera.org:8080/13748 ) Change subject: IMPALA-8571: harden QueryEventHook execution .. Patch Set 28: (6 comments) > Patch Set 28: > > (25 comments) > > Hi Radford. As Bharath and Fredy have disappeared I'll review this now :-) > Code looks good. Like you I'm slightly worried about the timeout queue. It's > somehow wasteful to keep all that state around when the common case is that > the tasks completed quickly. I think I have a better implementation for the timeout check that won't allow unbounded growth of the queue. Before I upload that patchset, the next one addresses the rest of the feedback http://gerrit.cloudera.org:8080/#/c/13748/28//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/13748/28//COMMIT_MSG@19 PS28, Line 19: See Java Thread.setDaemon(boolean) for what a "daemon thread" entails. > How can a user decide whether to use daemon threads? I think it depends on whether or not they feel that query hook execution should block shutdown of the JVM, since a daemon thread should be killed immediately when all regular threads have completed. There are some details like the fact that `finally` blocks are not executed if a daemon thread is halted in this way; they are kind of technical so I didn't want to document them here but refer them to a more-definitive source. http://gerrit.cloudera.org:8080/#/c/13748/28/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java File fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java: http://gerrit.cloudera.org:8080/#/c/13748/28/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@82 PS28, Line 82: * execute. You then create an executor with a thread pool of size 1 and a hook > You can consider this a nit :-) , but I find the use of 'you' jarring Flashbacks to my technical-writing course :) (fixed) http://gerrit.cloudera.org:8080/#/c/13748/28/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@182 PS28, Line 182: final ArrayBlockingQueue boundedQueue = > Can you use BlockingQueue ? You mean declare the var as the most general type possible, right? Done. http://gerrit.cloudera.org:8080/#/c/13748/28/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@290 PS28, Line 290: t); > Does LOG.error work when there is an extra Throwable argument? Looks a bit Yep, in fact it will log the stack trace this way! Does look weird and I always end up running tests to (re)convince myself each time I use it http://gerrit.cloudera.org:8080/#/c/13748/28/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@318 PS28, Line 318: // task queue to grow unbounded if hook tasks are scheduled at an interval > This approach seems clever, and the code is neat and tidy... For a single timeout task, I think the memory overhead is that of the `Runnable` of the task itself, maybe plus a pointer to it. Call that overhead `N`. If hooks are scheduled every 30 min or more, then there will only ever be max N extraneous garbage sitting around. Anything more frequent than that will cause (max) extraneous garbage to grow in increments of `N`. It's this growth that is worrying to me, but I think I have an alternative way to implement this class that retains the simplicity but ensures that there is never any extraneous garbage. I'll need to extend `ExecutorService` in order to do this though, so look for that in the next patchset. http://gerrit.cloudera.org:8080/#/c/13748/28/fe/src/main/java/org/apache/impala/service/Frontend.java File fe/src/main/java/org/apache/impala/service/Frontend.java: http://gerrit.cloudera.org:8080/#/c/13748/28/fe/src/main/java/org/apache/impala/service/Frontend.java@1624 PS28, Line 1624:* for {@link QueryEventHook}, which is part of the published api. > I would like to see the guarantees here. So the `QueryEventHook` class is published in a separate jar than the rest of the FE, the intent being that a downstream hook implementation project can import only what is needed and remain decoupled from FE implementation. That was my motivation for moving the guarantees there; I deemed the guarantees most-useful to users who will be writing hooks. I removed the guarantees from this javadoc so that we wouldn't have duplication. It does seem wrong to me that the guarantees are documented in one place but implemented in another though. Do you think that I should move them back here and reference them from the `QueryEventHook` javadoc? I just don't know how a user who only imports the published-api will be able to reference this javadoc. -- To view, visit http://gerrit.cloudera.org:8080/13748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Me
[Impala-ASF-CR] IMPALA-9053: DDLs should generate lineage graphs.
radford nguyen has posted comments on this change. ( http://gerrit.cloudera.org:8080/14458 ) Change subject: IMPALA-9053: DDLs should generate lineage graphs. .. Patch Set 1: Code-Review+1 I've touched lineage-related code in the past but have to admit that I don't really have much domain knowledge. That said, the changes look good to me -- To view, visit http://gerrit.cloudera.org:8080/14458 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia6c7ed9fe3265fd777fe93590cf4eb2d9ba0dd1e Gerrit-Change-Number: 14458 Gerrit-PatchSet: 1 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: radford nguyen Gerrit-Comment-Date: Wed, 16 Oct 2019 19:48:57 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8571: harden QueryEventHook execution
radford nguyen has posted comments on this change. ( http://gerrit.cloudera.org:8080/13748 ) Change subject: IMPALA-8571: harden QueryEventHook execution .. Patch Set 28: (1 comment) added some detail to the comment that needs attention http://gerrit.cloudera.org:8080/#/c/13748/28/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java File fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java: http://gerrit.cloudera.org:8080/#/c/13748/28/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@313 PS28, Line 313: // TODO: IMPALA-XXX > In this latest approach, I schedule a task that will cancel the hook task i Note that this potential for unbounded growth is possible even though the hook task executor has a bounded work queue. Consider the following scenario: (1) user configures a hook timeout of 3 seconds (2) hook tasks take 1 second to execute (3) hook tasks are submitted every 2 seconds (due to query execution) Because (2) and (3), hook tasks will never fill the bounded queue and we can say that a hook task will be submitted every 2 seconds. Every hook task submitted induces a timeout task to begin after 3 seconds. Because timeout tasks are effectively submitted every 2 seconds but only run after 3 seconds, this will cause the timeout task queue to grow. -- To view, visit http://gerrit.cloudera.org:8080/13748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 Gerrit-Change-Number: 13748 Gerrit-PatchSet: 28 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: radford nguyen Gerrit-Comment-Date: Wed, 16 Oct 2019 14:34:55 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8571: harden QueryEventHook execution
radford nguyen has posted comments on this change. ( http://gerrit.cloudera.org:8080/13748 ) Change subject: IMPALA-8571: harden QueryEventHook execution .. Patch Set 28: (1 comment) I've implemented a simpler approach as suggested by Bharath, but it comes with a caveat. Please carefully review where I've commented (PS28, FixedCapacityQueryHookExecutor.java:313) http://gerrit.cloudera.org:8080/#/c/13748/28/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java File fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java: http://gerrit.cloudera.org:8080/#/c/13748/28/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@313 PS28, Line 313: // TODO: IMPALA-XXX In this latest approach, I schedule a task that will cancel the hook task if it is still running after `hookTimeout` duration. This makes the code simpler (fewer threads) and easier to reason about (straight-forward use of executors). However, it comes at a cost, described in the comment. I've pre-emptively placed a TODO comment here to draw attention to it -- To view, visit http://gerrit.cloudera.org:8080/13748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 Gerrit-Change-Number: 13748 Gerrit-PatchSet: 28 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: radford nguyen Gerrit-Comment-Date: Wed, 16 Oct 2019 13:45:54 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8571: harden QueryEventHook execution
Hello Bharath Vissapragada, Andrew Sherman, Fredy Wijaya, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13748 to look at the new patch set (#28). Change subject: IMPALA-8571: harden QueryEventHook execution .. IMPALA-8571: harden QueryEventHook execution This commit hardens guarantees around QueryEventHook execution by adding the following features: *hook execution* Query event hooks are still executed in a thread-pool, but you can now configure the type of (Java) threads used to execute them via the newly-added backend flag `query_event_hook_use_daemon_threads`, which takes a true/false value. See Java Thread.setDaemon(boolean) for what a "daemon thread" entails. *hook timeout/cancellation* A timeout for hook execution can be configured through the backend flag `query_event_hook_timeout_s`, which specified a timeout value in seconds. If a hook has not completed execution within this timeout (measured from hook submission, not execution) then the hook task will be cancelled in order to free up resources. *hook rejection* The hook execution engine now has a fixed-capacity work queue whose capacity can be configured through the backend flag `query_event_hook_queue_capacity`. This queue is used to store hook tasks that are submitted when there are no free threads available for hook execution. All hook tasks submitted when the queue is at capacity will be rejected and logged without affecting the result of the query. *hook performance metrics* Hook performance metrics are captured by the frontend, but not yet published to the backend and are therefore not user-visible. See IMPALA-8914 for implementation of this feature. Testing: - added unit tests for new features - re-ran existing E2E tests Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 --- M be/src/service/frontend.h M be/src/service/impala-server.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift A fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java M fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/Frontend.java A fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java M fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java M query-event-hook-api/src/main/java/org/apache/impala/hooks/QueryEventHook.java 11 files changed, 814 insertions(+), 114 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/13748/28 -- To view, visit http://gerrit.cloudera.org:8080/13748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 Gerrit-Change-Number: 13748 Gerrit-PatchSet: 28 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: radford nguyen
[Impala-ASF-CR] IMPALA-8571: harden QueryEventHook execution
radford nguyen has posted comments on this change. ( http://gerrit.cloudera.org:8080/13748 ) Change subject: IMPALA-8571: harden QueryEventHook execution .. Patch Set 27: Ok, I've come up with a simpler timeout-monitor implementation based on feedback; it uses a single-threaded `ScheduledThreadPoolExecutor` to cancel hook tasks. It should be easier to reason about than the current impl. Need to write some tests before pushing the patchset. -- To view, visit http://gerrit.cloudera.org:8080/13748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 Gerrit-Change-Number: 13748 Gerrit-PatchSet: 27 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: radford nguyen Gerrit-Comment-Date: Thu, 03 Oct 2019 10:18:58 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8571: harden QueryEventHook execution
radford nguyen has posted comments on this change. ( http://gerrit.cloudera.org:8080/13748 ) Change subject: IMPALA-8571: harden QueryEventHook execution .. Patch Set 26: (1 comment) Missed a nit... http://gerrit.cloudera.org:8080/#/c/13748/25/fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java File fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java: http://gerrit.cloudera.org:8080/#/c/13748/25/fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java@248 PS25, Line 248: static final String ON_QUERY_COMPLETE = "onQueryComplete"; > nit: Move to the top, static final consts are usually on the top. Done -- To view, visit http://gerrit.cloudera.org:8080/13748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 Gerrit-Change-Number: 13748 Gerrit-PatchSet: 26 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: radford nguyen Gerrit-Comment-Date: Thu, 26 Sep 2019 23:15:57 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8571: harden QueryEventHook execution
Hello Bharath Vissapragada, Andrew Sherman, Fredy Wijaya, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13748 to look at the new patch set (#27). Change subject: IMPALA-8571: harden QueryEventHook execution .. IMPALA-8571: harden QueryEventHook execution This commit hardens guarantees around QueryEventHook execution by adding the following features: *hook execution* Query event hooks are still executed in a thread-pool, but you can now configure the type of (Java) threads used to execute them via the newly-added backend flag `query_event_hook_use_daemon_threads`, which takes a true/false value. See Java Thread.setDaemon(boolean) for what a "daemon thread" entails. *hook timeout/cancellation* A timeout for hook execution can be configured through the backend flag `query_event_hook_timeout_s`, which specified a timeout value in seconds. If a hook has not completed execution within this timeout (measured from hook submission, not execution) then the hook task will be cancelled in order to free up resources. *hook rejection* The hook execution engine now has a fixed-capacity work queue whose capacity can be configured through the backend flag `query_event_hook_queue_capacity`. This queue is used to store hook tasks that are submitted when there are no free threads available for hook execution. All hook tasks submitted when the queue is at capacity will be rejected and logged without affecting the result of the query. *hook performance metrics* Hook performance metrics are captured by the frontend, but not yet published to the backend and are therefore not user-visible. See IMPALA-8914 for implementation of this feature. Testing: - added unit tests for new features - re-ran existing E2E tests Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 --- M be/src/service/frontend.h M be/src/service/impala-server.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift A fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java M fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/Frontend.java A fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java M fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java M query-event-hook-api/src/main/java/org/apache/impala/hooks/QueryEventHook.java 11 files changed, 843 insertions(+), 114 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/13748/27 -- To view, visit http://gerrit.cloudera.org:8080/13748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 Gerrit-Change-Number: 13748 Gerrit-PatchSet: 27 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: radford nguyen
[Impala-ASF-CR] IMPALA-8571: harden QueryEventHook execution
radford nguyen has posted comments on this change. ( http://gerrit.cloudera.org:8080/13748 ) Change subject: IMPALA-8571: harden QueryEventHook execution .. Patch Set 26: (5 comments) Addressed some nits; still need to look into simplifying the timeout monitor executor http://gerrit.cloudera.org:8080/#/c/13748/25//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/13748/25//COMMIT_MSG@41 PS25, Line 41: but not yet : published to the backend and are therefore not user-visible > I'm guessing this is done as a separate patch? Yes, I was asked to split out the metrics into a separate ticket in order to speed up acceptance of the hardening/monitoring http://gerrit.cloudera.org:8080/#/c/13748/25/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/13748/25/be/src/service/impala-server.cc@285 PS25, Line 285: " > nit: Prefix with "(Advanced)" like in other cases. We typically do not want Will do http://gerrit.cloudera.org:8080/#/c/13748/25/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java File fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java: http://gerrit.cloudera.org:8080/#/c/13748/25/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@169 PS25, Line 169: this.metrics_ = Objects.requireNonNull(hookMetrics); : : Preconditions.checkArgument(hookTimeout >= 1, : "hook timeout should be >= 1 but was " + hookTimeout); : t > nit: Preconditions.checkArgument(hookTimeout >=1, mesg); Ooh, I also used the wrong templating format in the log message and forgot the timeout: ``` String.format("hook timeout should be >= 1 but was %s.", hookTimeout); ``` http://gerrit.cloudera.org:8080/#/c/13748/25/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@181 PS25, Line 181: > nit: extra newline. Done http://gerrit.cloudera.org:8080/#/c/13748/25/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@190 PS25, Line 190: t- > nit: space Done -- To view, visit http://gerrit.cloudera.org:8080/13748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 Gerrit-Change-Number: 13748 Gerrit-PatchSet: 26 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: radford nguyen Gerrit-Comment-Date: Thu, 26 Sep 2019 23:09:55 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8571: harden QueryEventHook execution
Hello Bharath Vissapragada, Andrew Sherman, Fredy Wijaya, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13748 to look at the new patch set (#26). Change subject: IMPALA-8571: harden QueryEventHook execution .. IMPALA-8571: harden QueryEventHook execution This commit hardens guarantees around QueryEventHook execution by adding the following features: *hook execution* Query event hooks are still executed in a thread-pool, but you can now configure the type of (Java) threads used to execute them via the newly-added backend flag `query_event_hook_use_daemon_threads`, which takes a true/false value. See Java Thread.setDaemon(boolean) for what a "daemon thread" entails. *hook timeout/cancellation* A timeout for hook execution can be configured through the backend flag `query_event_hook_timeout_s`, which specified a timeout value in seconds. If a hook has not completed execution within this timeout (measured from hook submission, not execution) then the hook task will be cancelled in order to free up resources. *hook rejection* The hook execution engine now has a fixed-capacity work queue whose capacity can be configured through the backend flag `query_event_hook_queue_capacity`. This queue is used to store hook tasks that are submitted when there are no free threads available for hook execution. All hook tasks submitted when the queue is at capacity will be rejected and logged without affecting the result of the query. *hook performance metrics* Hook performance metrics are captured by the frontend, but not yet published to the backend and are therefore not user-visible. See IMPALA-8914 for implementation of this feature. Testing: - added unit tests for new features - re-ran existing E2E tests Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 --- M be/src/service/frontend.h M be/src/service/impala-server.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift A fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java M fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/Frontend.java A fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java M fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java M query-event-hook-api/src/main/java/org/apache/impala/hooks/QueryEventHook.java 11 files changed, 841 insertions(+), 112 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/13748/26 -- To view, visit http://gerrit.cloudera.org:8080/13748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 Gerrit-Change-Number: 13748 Gerrit-PatchSet: 26 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: radford nguyen
[Impala-ASF-CR] IMPALA-8571: harden QueryEventHook execution
Hello Bharath Vissapragada, Andrew Sherman, Fredy Wijaya, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13748 to look at the new patch set (#25). Change subject: IMPALA-8571: harden QueryEventHook execution .. IMPALA-8571: harden QueryEventHook execution This commit hardens guarantees around QueryEventHook execution by adding the following features: *hook execution* Query event hooks are still executed in a thread-pool, but you can now configure the type of (Java) threads used to execute them via the newly-added backend flag `query_event_hook_use_daemon_threads`, which takes a true/false value. See Java Thread.setDaemon(boolean) for what a "daemon thread" entails. *hook timeout/cancellation* A timeout for hook execution can be configured through the backend flag `query_event_hook_timeout_s`, which specified a timeout value in seconds. If a hook has not completed execution within this timeout (measured from hook submission, not execution) then the hook task will be cancelled in order to free up resources. *hook rejection* The hook execution engine now has a fixed-capacity work queue whose capacity can be configured through the backend flag `query_event_hook_queue_capacity`. This queue is used to store hook tasks that are submitted when there are no free threads available for hook execution. All hook tasks submitted when the queue is at capacity will be rejected and logged without affecting the result of the query. *hook performance metrics* Hook performance metrics are captured by the frontend, but not yet published to the backend and are therefore not user-visible. See IMPALA-8914 for implementation of this feature. Testing: - added unit tests for new features - re-ran existing E2E tests Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 --- M be/src/service/frontend.h M be/src/service/impala-server.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift A fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java M fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/Frontend.java A fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java M fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java M query-event-hook-api/src/main/java/org/apache/impala/hooks/QueryEventHook.java 11 files changed, 843 insertions(+), 112 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/13748/25 -- To view, visit http://gerrit.cloudera.org:8080/13748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 Gerrit-Change-Number: 13748 Gerrit-PatchSet: 25 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: radford nguyen
[Impala-ASF-CR] IMPALA-8571: harden QueryEventHook execution
Hello Bharath Vissapragada, Andrew Sherman, Fredy Wijaya, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13748 to look at the new patch set (#24). Change subject: IMPALA-8571: harden QueryEventHook execution .. IMPALA-8571: harden QueryEventHook execution This commit hardens guarantees around QueryEventHook execution by adding the following features: *hook execution* Query event hooks are still executed in a thread-pool, but you can now configure the type of (Java) threads used to execute them via the newly-added backend flag `query_event_hook_use_daemon_threads`, which takes a true/false value. See Java Thread.setDaemon(boolean) for what a "daemon thread" entails. *hook timeout/cancellation* A timeout for hook execution can be configured through the backend flag `query_event_hook_timeout_s`, which specified a timeout value in seconds. If a hook has not completed execution within this timeout (measured from hook submission, not execution) then the hook task will be cancelled in order to free up resources. *hook rejection* The hook execution engine now has a fixed-capacity work queue whose capacity can be configured through the backend flag `query_event_hook_queue_capacity`. This queue is used to store hook tasks that are submitted when there are no free threads available for hook execution. All hook tasks submitted when the queue is at capacity will be rejected and logged without affecting the result of the query. *hook performance metrics* Hook performance metrics are captured by the frontend, but not yet published to the backend and are therefore not user-visible. See IMPALA-8914 for implementation of this feature. Testing: - added unit tests for new features - re-ran existing E2E tests Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 --- M be/src/service/frontend.h M be/src/service/impala-server.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift A fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java M fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java A fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java M fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java M query-event-hook-api/src/main/java/org/apache/impala/hooks/QueryEventHook.java 12 files changed, 968 insertions(+), 111 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/13748/24 -- To view, visit http://gerrit.cloudera.org:8080/13748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 Gerrit-Change-Number: 13748 Gerrit-PatchSet: 24 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: radford nguyen
[Impala-ASF-CR] IMPALA-8571[WIP]: improve QueryEventHook execution
Hello Bharath Vissapragada, Andrew Sherman, Fredy Wijaya, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13748 to look at the new patch set (#22). Change subject: IMPALA-8571[WIP]: improve QueryEventHook execution .. IMPALA-8571[WIP]: improve QueryEventHook execution (WIP because still need to implement polling of metrics by backend) This commit hardens guarantees around QueryEventHook execution by adding the following features: *hook execution* Query event hooks are still executed in a thread-pool, but you can now configure the type of (Java) threads used to execute them via the newly-added backend flag `query_event_hook_use_daemon_threads`, which takes a true/false value. See Java Thread.setDaemon(boolean) for what a "daemon thread" entails. *hook timeout/cancellation* A timeout for hook execution can be configured through the backend flag `query_event_hook_timeout_s`, which specified a timeout value in seconds. If a hook has not completed execution within this timeout (measured from hook submission, not execution) then the hook task will be cancelled in order to free up resources. *hook rejection* The hook execution engine now has a fixed-capacity work queue whose capacity can be configured through the backend flag `query_event_hook_queue_capacity`. This queue is used to store hook tasks that are submitted when there are no free threads available for hook execution. All hook tasks submitted when the queue is at capacity will be rejected and logged without affecting the result of the query. *hook performance metrics* The following hook metrics are captured: *query-event-hook.${hook_method}.execution-rejections* Counter indicating how many submitted tasks have been rejected due to a full work queue *query-event-hook.${hook_method}.execution-exceptions* Counter indicating how many tasks have thrown an exception during execution *query-event-hook.${hook_method}.execution-timeouts* Counter indicating how many tasks have been cancelled due to not completing within {@code hookTimeout_s} of submission. *query-event-hook.${hook_method}.execution-submissions* Counter indicating the number of times ${hookClass}.${method} has been submitted for execution. *query-event-hook.${hook_method}.mean-execution-time* Mean time in [ns] that ${hook_name} has taken to complete, whether normally or by error (e.g. timeout or exception). *query-event-hook.${hook_method}.mean-queued-time* Mean time in [ns] between hook task submission and hook task execution. This indicates how long a hook task has been queued waiting to execute. Testing: - added unit tests for new features - re-ran existing E2E tests Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 --- M be/src/service/frontend.cc M be/src/service/frontend.h M be/src/service/impala-server.cc M be/src/util/CMakeLists.txt M be/src/util/backend-gflag-util.cc A be/src/util/hook-metrics-test.cc A be/src/util/hook-metrics.cc A be/src/util/hook-metrics.h M common/thrift/BackendGflags.thrift M common/thrift/Frontend.thrift M common/thrift/metrics.json A fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java M fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java A fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java M fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java M query-event-hook-api/src/main/java/org/apache/impala/hooks/QueryEventHook.java 19 files changed, 1,364 insertions(+), 112 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/13748/22 -- To view, visit http://gerrit.cloudera.org:8080/13748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 Gerrit-Change-Number: 13748 Gerrit-PatchSet: 22 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: radford nguyen
[Impala-ASF-CR] IMPALA-8571[WIP]: improve QueryEventHook execution
Hello Bharath Vissapragada, Andrew Sherman, Fredy Wijaya, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13748 to look at the new patch set (#23). Change subject: IMPALA-8571[WIP]: improve QueryEventHook execution .. IMPALA-8571[WIP]: improve QueryEventHook execution (WIP because still need to implement polling of metrics by backend) This commit hardens guarantees around QueryEventHook execution by adding the following features: *hook execution* Query event hooks are still executed in a thread-pool, but you can now configure the type of (Java) threads used to execute them via the newly-added backend flag `query_event_hook_use_daemon_threads`, which takes a true/false value. See Java Thread.setDaemon(boolean) for what a "daemon thread" entails. *hook timeout/cancellation* A timeout for hook execution can be configured through the backend flag `query_event_hook_timeout_s`, which specified a timeout value in seconds. If a hook has not completed execution within this timeout (measured from hook submission, not execution) then the hook task will be cancelled in order to free up resources. *hook rejection* The hook execution engine now has a fixed-capacity work queue whose capacity can be configured through the backend flag `query_event_hook_queue_capacity`. This queue is used to store hook tasks that are submitted when there are no free threads available for hook execution. All hook tasks submitted when the queue is at capacity will be rejected and logged without affecting the result of the query. *hook performance metrics* The following hook metrics are captured: *query-event-hook.${hook_method}.execution-rejections* Counter indicating how many submitted tasks have been rejected due to a full work queue *query-event-hook.${hook_method}.execution-exceptions* Counter indicating how many tasks have thrown an exception during execution *query-event-hook.${hook_method}.execution-timeouts* Counter indicating how many tasks have been cancelled due to not completing within {@code hookTimeout_s} of submission. *query-event-hook.${hook_method}.execution-submissions* Counter indicating the number of times ${hookClass}.${method} has been submitted for execution. *query-event-hook.${hook_method}.mean-execution-time* Mean time in [ns] that ${hook_name} has taken to complete, whether normally or by error (e.g. timeout or exception). *query-event-hook.${hook_method}.mean-queued-time* Mean time in [ns] between hook task submission and hook task execution. This indicates how long a hook task has been queued waiting to execute. Testing: - added unit tests for new features - re-ran existing E2E tests Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 --- M be/src/service/frontend.cc M be/src/service/frontend.h M be/src/service/impala-server.cc M be/src/util/CMakeLists.txt M be/src/util/backend-gflag-util.cc A be/src/util/hook-metrics-test.cc A be/src/util/hook-metrics.cc A be/src/util/hook-metrics.h M common/thrift/BackendGflags.thrift M common/thrift/Frontend.thrift M common/thrift/metrics.json A fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java M fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java A fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java M fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java M query-event-hook-api/src/main/java/org/apache/impala/hooks/QueryEventHook.java 19 files changed, 1,361 insertions(+), 112 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/13748/23 -- To view, visit http://gerrit.cloudera.org:8080/13748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 Gerrit-Change-Number: 13748 Gerrit-PatchSet: 23 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: radford nguyen
[Impala-ASF-CR] IMPALA-8571[WIP]: improve QueryEventHook execution
Hello Bharath Vissapragada, Andrew Sherman, Fredy Wijaya, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13748 to look at the new patch set (#21). Change subject: IMPALA-8571[WIP]: improve QueryEventHook execution .. IMPALA-8571[WIP]: improve QueryEventHook execution (WIP because still need to implement polling of metrics by backend) This commit hardens guarantees around QueryEventHook execution by adding the following features: *hook execution* Query event hooks are still executed in a thread-pool, but you can now configure the type of (Java) threads used to execute them via the newly-added backend flag `query_event_hook_use_daemon_threads`, which takes a true/false value. See Java Thread.setDaemon(boolean) for what a "daemon thread" entails. *hook timeout/cancellation* A timeout for hook execution can be configured through the backend flag `query_event_hook_timeout_s`, which specified a timeout value in seconds. If a hook has not completed execution within this timeout (measured from hook submission, not execution) then the hook task will be cancelled in order to free up resources. *hook rejection* The hook execution engine now has a fixed-capacity work queue whose capacity can be configured through the backend flag `query_event_hook_queue_capacity`. This queue is used to store hook tasks that are submitted when there are no free threads available for hook execution. All hook tasks submitted when the queue is at capacity will be rejected and logged without affecting the result of the query. *hook performance metrics* The following hook metrics are captured: *query-event-hook.${hook_method}.execution-rejections* Counter indicating how many submitted tasks have been rejected due to a full work queue *query-event-hook.${hook_method}.execution-exceptions* Counter indicating how many tasks have thrown an exception during execution *query-event-hook.${hook_method}.execution-timeouts* Counter indicating how many tasks have been cancelled due to not completing within {@code hookTimeout_s} of submission. *query-event-hook.${hook_method}.execution-submissions* Counter indicating the number of times ${hookClass}.${method} has been submitted for execution. *query-event-hook.${hook_method}.mean-execution-time* Mean time in [ns] that ${hook_name} has taken to complete, whether normally or by error (e.g. timeout or exception). *query-event-hook.${hook_method}.mean-queued-time* Mean time in [ns] between hook task submission and hook task execution. This indicates how long a hook task has been queued waiting to execute. Testing: - added unit tests for new features - re-ran existing E2E tests Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 --- M be/src/service/frontend.cc M be/src/service/frontend.h M be/src/service/impala-server.cc M be/src/util/CMakeLists.txt M be/src/util/backend-gflag-util.cc A be/src/util/hook-metrics-test.cc A be/src/util/hook-metrics.cc A be/src/util/hook-metrics.h M common/thrift/BackendGflags.thrift M common/thrift/Frontend.thrift M common/thrift/metrics.json A fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java M fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java A fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java M fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java M query-event-hook-api/src/main/java/org/apache/impala/hooks/QueryEventHook.java 19 files changed, 1,364 insertions(+), 112 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/13748/21 -- To view, visit http://gerrit.cloudera.org:8080/13748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 Gerrit-Change-Number: 13748 Gerrit-PatchSet: 21 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: radford nguyen
[Impala-ASF-CR] IMPALA-8571[WIP]: improve QueryEventHook execution
Hello Bharath Vissapragada, Andrew Sherman, Fredy Wijaya, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13748 to look at the new patch set (#20). Change subject: IMPALA-8571[WIP]: improve QueryEventHook execution .. IMPALA-8571[WIP]: improve QueryEventHook execution (WIP because still need to implement polling of metrics by backend) This commit hardens guarantees around QueryEventHook execution by adding the following features: *hook execution* Query event hooks are still executed in a thread-pool, but you can now configure the type of (Java) threads used to execute them via the newly-added backend flag `query_event_hook_use_daemon_threads`, which takes a true/false value. See Java Thread.setDaemon(boolean) for what a "daemon thread" entails. *hook timeout/cancellation* A timeout for hook execution can be configured through the backend flag `query_event_hook_timeout_s`, which specified a timeout value in seconds. If a hook has not completed execution within this timeout (measured from hook submission, not execution) then the hook task will be cancelled in order to free up resources. *hook rejection* The hook execution engine now has a fixed-capacity work queue whose capacity can be configured through the backend flag `query_event_hook_queue_capacity`. This queue is used to store hook tasks that are submitted when there are no free threads available for hook execution. All hook tasks submitted when the queue is at capacity will be rejected and logged without affecting the result of the query. *hook performance metrics* The following hook metrics are captured: *query-event-hook.${hook_method}.execution-rejections* Counter indicating how many submitted tasks have been rejected due to a full work queue *query-event-hook.${hook_method}.execution-exceptions* Counter indicating how many tasks have thrown an exception during execution *query-event-hook.${hook_method}.execution-timeouts* Counter indicating how many tasks have been cancelled due to not completing within {@code hookTimeout_s} of submission. *query-event-hook.${hook_method}.execution-submissions* Counter indicating the number of times ${hookClass}.${method} has been submitted for execution. *query-event-hook.${hook_method}.mean-execution-time* Mean time in [ns] that ${hook_name} has taken to complete, whether normally or by error (e.g. timeout or exception). *query-event-hook.${hook_method}.mean-queued-time* Mean time in [ns] between hook task submission and hook task execution. This indicates how long a hook task has been queued waiting to execute. Testing: - added unit tests for new features - re-ran existing E2E tests Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 --- A be/src/rpc/299e-533f-3ecd-6fea/krb5kdc/.k5.KRBTEST.COM A be/src/rpc/299e-533f-3ecd-6fea/krb5kdc/impala-test_radimp.vpc.cloudera.com.keytab A be/src/rpc/299e-533f-3ecd-6fea/krb5kdc/kdc.conf A be/src/rpc/299e-533f-3ecd-6fea/krb5kdc/krb5.conf A be/src/rpc/299e-533f-3ecd-6fea/krb5kdc/krb5cc A be/src/rpc/299e-533f-3ecd-6fea/krb5kdc/principal A be/src/rpc/299e-533f-3ecd-6fea/krb5kdc/principal.kadm5 A be/src/rpc/299e-533f-3ecd-6fea/krb5kdc/principal.kadm5.lock A be/src/rpc/299e-533f-3ecd-6fea/krb5kdc/principal.ok A be/src/rpc/54a1-4bf4-07b7-314f/krb5kdc/.k5.KRBTEST.COM A be/src/rpc/54a1-4bf4-07b7-314f/krb5kdc/impala-test_radimp.vpc.cloudera.com.keytab A be/src/rpc/54a1-4bf4-07b7-314f/krb5kdc/kdc.conf A be/src/rpc/54a1-4bf4-07b7-314f/krb5kdc/krb5.conf A be/src/rpc/54a1-4bf4-07b7-314f/krb5kdc/krb5cc A be/src/rpc/54a1-4bf4-07b7-314f/krb5kdc/principal A be/src/rpc/54a1-4bf4-07b7-314f/krb5kdc/principal.kadm5 A be/src/rpc/54a1-4bf4-07b7-314f/krb5kdc/principal.kadm5.lock A be/src/rpc/54a1-4bf4-07b7-314f/krb5kdc/principal.ok A be/src/rpc/ab02-9d46-479e-73ba/krb5kdc/.k5.KRBTEST.COM A be/src/rpc/ab02-9d46-479e-73ba/krb5kdc/impala-test_radimp.vpc.cloudera.com.keytab A be/src/rpc/ab02-9d46-479e-73ba/krb5kdc/kdc.conf A be/src/rpc/ab02-9d46-479e-73ba/krb5kdc/krb5.conf A be/src/rpc/ab02-9d46-479e-73ba/krb5kdc/krb5cc A be/src/rpc/ab02-9d46-479e-73ba/krb5kdc/principal A be/src/rpc/ab02-9d46-479e-73ba/krb5kdc/principal.kadm5 A be/src/rpc/ab02-9d46-479e-73ba/krb5kdc/principal.kadm5.lock A be/src/rpc/ab02-9d46-479e-73ba/krb5kdc/principal.ok M be/src/service/frontend.cc M be/src/service/frontend.h M be/src/service/impala-server.cc M be/src/util/CMakeLists.txt M be/src/util/backend-gflag-util.cc A be/src/util/hook-metrics-test.cc A be/src/util/hook-metrics.cc A be/src/util/hook-metrics.h M common/thrift/BackendGflags.thrift M common/thrift/Frontend.thrift M common/thrift/metrics.json A fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java M fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java
[Impala-ASF-CR] IMPALA-8571[WIP]: improve QueryEventHook execution
Hello Bharath Vissapragada, Andrew Sherman, Fredy Wijaya, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13748 to look at the new patch set (#19). Change subject: IMPALA-8571[WIP]: improve QueryEventHook execution .. IMPALA-8571[WIP]: improve QueryEventHook execution (WIP because still need to implement polling of metrics by backend) This commit hardens guarantees around QueryEventHook execution by adding the following features: *hook execution* Query event hooks are still executed in a thread-pool, but you can now configure the type of (Java) threads used to execute them via the newly-added backend flag `query_event_hook_use_daemon_threads`, which takes a true/false value. See Java Thread.setDaemon(boolean) for what a "daemon thread" entails. *hook timeout/cancellation* A timeout for hook execution can be configured through the backend flag `query_event_hook_timeout_s`, which specified a timeout value in seconds. If a hook has not completed execution within this timeout (measured from hook submission, not execution) then the hook task will be cancelled in order to free up resources. *hook rejection* The hook execution engine now has a fixed-capacity work queue whose capacity can be configured through the backend flag `query_event_hook_queue_capacity`. This queue is used to store hook tasks that are submitted when there are no free threads available for hook execution. All hook tasks submitted when the queue is at capacity will be rejected and logged without affecting the result of the query. *hook performance metrics* The following hook metrics are captured: *query-event-hook.${hook_method}.execution-rejections* Counter indicating how many submitted tasks have been rejected due to a full work queue *query-event-hook.${hook_method}.execution-exceptions* Counter indicating how many tasks have thrown an exception during execution *query-event-hook.${hook_method}.execution-timeouts* Counter indicating how many tasks have been cancelled due to not completing within {@code hookTimeout_s} of submission. *query-event-hook.${hook_method}.execution-submissions* Counter indicating the number of times ${hookClass}.${method} has been submitted for execution. *query-event-hook.${hook_method}.mean-execution-time* Mean time in [ns] that ${hook_name} has taken to complete, whether normally or by error (e.g. timeout or exception). *query-event-hook.${hook_method}.mean-queued-time* Mean time in [ns] between hook task submission and hook task execution. This indicates how long a hook task has been queued waiting to execute. Testing: - added unit tests for new features - re-ran existing E2E tests Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 --- M be/src/service/frontend.cc M be/src/service/frontend.h M be/src/service/impala-server.cc M be/src/util/CMakeLists.txt M be/src/util/backend-gflag-util.cc A be/src/util/hook-metrics-test.cc A be/src/util/hook-metrics.cc A be/src/util/hook-metrics.h M common/thrift/BackendGflags.thrift M common/thrift/Frontend.thrift M common/thrift/metrics.json A fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java M fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java A fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java M fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java M query-event-hook-api/src/main/java/org/apache/impala/hooks/QueryEventHook.java 19 files changed, 1,364 insertions(+), 114 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/13748/19 -- To view, visit http://gerrit.cloudera.org:8080/13748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 Gerrit-Change-Number: 13748 Gerrit-PatchSet: 19 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: radford nguyen
[Impala-ASF-CR] IMPALA-8571[WIP]: improve QueryEventHook execution
radford nguyen has posted comments on this change. ( http://gerrit.cloudera.org:8080/13748 ) Change subject: IMPALA-8571[WIP]: improve QueryEventHook execution .. Patch Set 18: (1 comment) http://gerrit.cloudera.org:8080/#/c/13748/18/be/src/util/hook-metrics.cc File be/src/util/hook-metrics.cc: http://gerrit.cloudera.org:8080/#/c/13748/18/be/src/util/hook-metrics.cc@82 PS18, Line 82: void QueryEventHookMetricContainer::Update(const TQueryEventHookMetrics& from) { Is this all I really have to do to make sure the metrics will be published? -- To view, visit http://gerrit.cloudera.org:8080/13748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 Gerrit-Change-Number: 13748 Gerrit-PatchSet: 18 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: radford nguyen Gerrit-Comment-Date: Mon, 19 Aug 2019 22:16:07 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8571[WIP]: improve QueryEventHook execution
radford nguyen has posted comments on this change. ( http://gerrit.cloudera.org:8080/13748 ) Change subject: IMPALA-8571[WIP]: improve QueryEventHook execution .. Patch Set 18: (10 comments) Thanks for the feedback, asherman. Replies inline http://gerrit.cloudera.org:8080/#/c/13748/18//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/13748/18//COMMIT_MSG@31 PS18, Line 31: > Do you want to discuss --query_event_hook_use_daemon_threads here? How woul Yes, this is something that should be included in the commit message http://gerrit.cloudera.org:8080/#/c/13748/18/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java File fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java: http://gerrit.cloudera.org:8080/#/c/13748/18/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@51 PS18, Line 51: * Rejections > This javadoc is great, and just the sort of thing I was hoping to see. I was not planning on publishing the html; I have just been in the habit of using html because eclipse/Idea/etc will render it when displaying the javadoc in a tooltip window. Everything just turns into one giant line in there if you don't format. Anyway, I'll switch to markdown, since it's both human-readable and can be rendered by an IDE plugin or javadoc tool if needed. http://gerrit.cloudera.org:8080/#/c/13748/18/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@102 PS18, Line 102: * ${hookClass}.${method}.execution.exceptions > I think the metrics now have the "query-event-hook" prefix. re: "query-event-hook" prefix - So this is where the split b/w backend/frontend metrics is kind of confusing. This javadoc describes the metric names as they are collected in the frontend, and unfortunately the names do not exactly align with the backend. Part of this is due to the fact that the Java metrics are objects (e.g. Timers) that can provide multiple metrics (e.g. mean time, count). In passing the metrics to the backend, it seemed to me that the convention was to pass the raw data (e.g. mean time, count) as needed, instead of say translating the entire metric object to some similarly-capable object in the backend. The "query-event-hook" prefix only exists in the backend metric names. re: hookClass vs. method - In this context, ${hookClass} is always the fully-qualified name of the java class that implements `QueryEventHook`, and ${hookMethod} the method name invoked on that hook. An example metric name would be: org.foo.bar.MyQueryEventHook.onQueryComplete.execution.exceptions I actually tried to make this very predictable and therefore amenable to automation/generation, but if there is some way I can improve that further, please let me know. http://gerrit.cloudera.org:8080/#/c/13748/18/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@176 PS18, Line 176: // ArrayBlockingQueue contructor performs bounds-check on queue size for us > typo: constructor Done http://gerrit.cloudera.org:8080/#/c/13748/18/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@184 PS18, Line 184: // this executor cancels any hook tasks that > This may seem picky but in Imapla we like comments with Capitals at the beg Done http://gerrit.cloudera.org:8080/#/c/13748/18/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@255 PS18, Line 255:* {@link ExecutionException}. This behavior is consistent with how futures normally > We could be more concise by not saying this about how Futures normally beha Done http://gerrit.cloudera.org:8080/#/c/13748/18/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@286 PS18, Line 286: LOG.warn("QueryEventHook {}.{} execution rejected because the " + > I sometimes weaselly write "probably because" as you never know :-) Done http://gerrit.cloudera.org:8080/#/c/13748/18/fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java File fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java: http://gerrit.cloudera.org:8080/#/c/13748/18/fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java@1 PS18, Line 1: /** > Use // style comments for the apache license please Done http://gerrit.cloudera.org:8080/#/c/13748/18/fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java@80 PS18, Line 80: // make this longer to reasonably guarantee a timeout > Is this a FIXME note, not sure I understand what this keans Reworded as: "Sleep much longer than `hookTimeout` to reasonably guarantee a timeout." http://gerrit.cloudera.org:8080/#/c/13748/18/fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java@115 PS18, Line 115: i
[Impala-ASF-CR] IMPALA-8571[WIP]: improve QueryEventHook execution
Hello Bharath Vissapragada, Fredy Wijaya, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13748 to look at the new patch set (#18). Change subject: IMPALA-8571[WIP]: improve QueryEventHook execution .. IMPALA-8571[WIP]: improve QueryEventHook execution (WIP because still need to implement polling of metrics by backend) This commit hardens guarantees around QueryEventHook execution by adding the following features: *hook timeout/cancellation* A timeout for hook execution can be configured through the backend flag `query_event_hook_timeout_s`, which specified a timeout value in seconds. If a hook has not completed execution within this timeout (measured from hook submission, not execution) then the hook task will be cancelled in order to free up resources. *hook rejection* The hook execution engine now has a fixed-capacity work queue whose capacity can be configured through the backend flag `query_event_hook_queue_capacity`. This queue is used to store hook tasks that are submitted when there are no free threads available for hook execution. All hook tasks submitted when the queue is at capacity will be rejected and logged without affecting the result of the query. *hook performance metrics* The following hook metrics are captured: *query-event-hook.${hook_method}.execution-rejections* Counter indicating how many submitted tasks have been rejected due to a full work queue *query-event-hook.${hook_method}.execution-exceptions* Counter indicating how many tasks have thrown an exception during execution *query-event-hook.${hook_method}.execution-timeouts* Counter indicating how many tasks have been cancelled due to not completing within {@code hookTimeout_s} of submission. *query-event-hook.${hook_method}.execution-submissions* Counter indicating the number of times ${hookClass}.${method} has been submitted for execution. *query-event-hook.${hook_method}.mean-execution-time* Mean time in [ns] that ${hook_name} has taken to complete, whether normally or by error (e.g. timeout or exception). *query-event-hook.${hook_method}.mean-queued-time* Mean time in [ns] between hook task submission and hook task execution. This indicates how long a hook task has been queued waiting to execute. Testing: - added unit tests for new features - re-ran existing E2E tests Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 --- M be/src/service/frontend.cc M be/src/service/frontend.h M be/src/service/impala-server.cc M be/src/util/CMakeLists.txt M be/src/util/backend-gflag-util.cc A be/src/util/hook-metrics-test.cc A be/src/util/hook-metrics.cc A be/src/util/hook-metrics.h M common/thrift/BackendGflags.thrift M common/thrift/Frontend.thrift M common/thrift/metrics.json A fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java M fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java A fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java M fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java M query-event-hook-api/src/main/java/org/apache/impala/hooks/QueryEventHook.java 19 files changed, 1,376 insertions(+), 114 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/13748/18 -- To view, visit http://gerrit.cloudera.org:8080/13748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 Gerrit-Change-Number: 13748 Gerrit-PatchSet: 18 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: radford nguyen
[Impala-ASF-CR] IMPALA-8571[WIP]: improve QueryEventHook execution
Hello Bharath Vissapragada, Fredy Wijaya, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13748 to look at the new patch set (#17). Change subject: IMPALA-8571[WIP]: improve QueryEventHook execution .. IMPALA-8571[WIP]: improve QueryEventHook execution (WIP because still need to implement polling of metrics by backend) This commit hardens guarantees around QueryEventHook execution by adding the following features: *hook timeout/cancellation* A timeout for hook execution can be configured through the backend flag `query_event_hook_timeout_s`, which specified a timeout value in seconds. If a hook has not completed execution within this timeout (measured from hook submission, not execution) then the hook task will be cancelled in order to free up resources. *hook rejection* The hook execution engine now has a fixed-capacity work queue whose capacity can be configured through the backend flag `query_event_hook_queue_capacity`. This queue is used to store hook tasks that are submitted when there are no free threads available for hook execution. All hook tasks submitted when the queue is at capacity will be rejected and logged without affecting the result of the query. *hook performance metrics* The following hook metrics are captured: *query-event-hook.${hook_method}.execution-rejections* Counter indicating how many submitted tasks have been rejected due to a full work queue *query-event-hook.${hook_method}.execution-exceptions* Counter indicating how many tasks have thrown an exception during execution *query-event-hook.${hook_method}.execution-timeouts* Counter indicating how many tasks have been cancelled due to not completing within {@code hookTimeout_s} of submission. *query-event-hook.${hook_method}.execution-submissions* Counter indicating the number of times ${hookClass}.${method} has been submitted for execution. *query-event-hook.${hook_method}.mean-execution-time* Mean time in [ns] that ${hook_name} has taken to complete, whether normally or by error (e.g. timeout or exception). *query-event-hook.${hook_method}.mean-queued-time* Mean time in [ns] between hook task submission and hook task execution. This indicates how long a hook task has been queued waiting to execute. Testing: - added unit tests for new features - re-ran existing E2E tests Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 --- M be/src/service/frontend.cc M be/src/service/frontend.h M be/src/service/impala-server.cc M be/src/util/CMakeLists.txt M be/src/util/backend-gflag-util.cc A be/src/util/hook-metrics-test.cc A be/src/util/hook-metrics.cc A be/src/util/hook-metrics.h M common/thrift/BackendGflags.thrift M common/thrift/Frontend.thrift M common/thrift/metrics.json A fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java M fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java A fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java M fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java M query-event-hook-api/src/main/java/org/apache/impala/hooks/QueryEventHook.java 19 files changed, 1,376 insertions(+), 114 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/13748/17 -- To view, visit http://gerrit.cloudera.org:8080/13748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 Gerrit-Change-Number: 13748 Gerrit-PatchSet: 17 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: radford nguyen
[Impala-ASF-CR] IMPALA-8571[WIP]: improve QueryEventHook execution
Hello Bharath Vissapragada, Fredy Wijaya, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13748 to look at the new patch set (#16). Change subject: IMPALA-8571[WIP]: improve QueryEventHook execution .. IMPALA-8571[WIP]: improve QueryEventHook execution (WIP because still need to implement polling of metrics by backend) This commit hardens guarantees around QueryEventHook execution by adding the following features: *hook timeout/cancellation* A timeout for hook execution can be configured through the backend flag `query_event_hook_timeout_s`, which specified a timeout value in seconds. If a hook has not completed execution within this timeout (measured from hook submission, not execution) then the hook task will be cancelled in order to free up resources. *hook rejection* The hook execution engine now has a fixed-capacity work queue whose capacity can be configured through the backend flag `query_event_hook_queue_capacity`. This queue is used to store hook tasks that are submitted when there are no free threads available for hook execution. All hook tasks submitted when the queue is at capacity will be rejected and logged without affecting the result of the query. *hook performance metrics* The following hook metrics are captured: *query-event-hook.${hook_method}.execution-rejections* Counter indicating how many submitted tasks have been rejected due to a full work queue *query-event-hook.${hook_method}.execution-exceptions* Counter indicating how many tasks have thrown an exception during execution *query-event-hook.${hook_method}.execution-timeouts* Counter indicating how many tasks have been cancelled due to not completing within {@code hookTimeout_s} of submission. *query-event-hook.${hook_method}.execution-submissions* Counter indicating the number of times ${hookClass}.${method} has been submitted for execution. *query-event-hook.${hook_method}.mean-execution-time* Mean time in [ns] that ${hook_name} has taken to complete, whether normally or by error (e.g. timeout or exception). *query-event-hook.${hook_method}.mean-queued-time* Mean time in [ns] between hook task submission and hook task execution. This indicates how long a hook task has been queued waiting to execute. Testing: - added unit tests for new features - re-ran existing E2E tests Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 --- M be/src/service/frontend.cc M be/src/service/frontend.h M be/src/service/impala-server.cc M be/src/util/CMakeLists.txt M be/src/util/backend-gflag-util.cc A be/src/util/hook-metrics-test.cc A be/src/util/hook-metrics.cc A be/src/util/hook-metrics.h M common/thrift/BackendGflags.thrift M common/thrift/Frontend.thrift M common/thrift/metrics.json A fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java M fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java A fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java M fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java M query-event-hook-api/src/main/java/org/apache/impala/hooks/QueryEventHook.java 19 files changed, 1,375 insertions(+), 113 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/13748/16 -- To view, visit http://gerrit.cloudera.org:8080/13748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 Gerrit-Change-Number: 13748 Gerrit-PatchSet: 16 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: radford nguyen
[Impala-ASF-CR] IMPALA-8571[WIP]: improve QueryEventHook execution
radford nguyen has posted comments on this change. ( http://gerrit.cloudera.org:8080/13748 ) Change subject: IMPALA-8571[WIP]: improve QueryEventHook execution .. Patch Set 15: > Patch Set 15: > > Build Failed > > https://jenkins.impala.io/job/gerrit-code-review-checks/3879/ : Initial code > review checks failed. See linked job for details on the failure. Linker failed due to undefined static member `QueryEventHook::metric_containers_` in be/src/util/hook-metrics.cc -- To view, visit http://gerrit.cloudera.org:8080/13748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 Gerrit-Change-Number: 13748 Gerrit-PatchSet: 15 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: radford nguyen Gerrit-Comment-Date: Mon, 15 Jul 2019 20:00:28 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8571[WIP]: improve QueryEventHook execution
radford nguyen has posted comments on this change. ( http://gerrit.cloudera.org:8080/13748 ) Change subject: IMPALA-8571[WIP]: improve QueryEventHook execution .. Patch Set 15: (3 comments) Updated the review with metric code, including actual metric collection on the frontend, metric "translation" on the backend, and the JNI plumbing to connect the 2. Only thing missing is the actual polling mechanism to make the connection... currently looking at other examples (e.g. event metrics) for some guidance. Commit `360b23bfa4eeb337cce380512f106a692bf3c49b` seems to be a good starting point(?) http://gerrit.cloudera.org:8080/#/c/13748/15/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/13748/15/be/src/service/impala-server.cc@75 PS15, Line 75: #include "util/hook-metrics.h" This will be used once I determine the best way to poll for metrics from the frontend http://gerrit.cloudera.org:8080/#/c/13748/15/be/src/util/hook-metrics-test.cc File be/src/util/hook-metrics-test.cc: http://gerrit.cloudera.org:8080/#/c/13748/15/be/src/util/hook-metrics-test.cc@23 PS15, Line 23: TEST(HookMetricContainer, UpdateTest) { Review is still WIP: I haven't actually ran this test yet, and I need to add more coverage for backend handling of metrics. http://gerrit.cloudera.org:8080/#/c/13748/15/be/src/util/hook-metrics.h File be/src/util/hook-metrics.h: http://gerrit.cloudera.org:8080/#/c/13748/15/be/src/util/hook-metrics.h@65 PS15, Line 65: TGetQueryEventHookMetricsResult* response); Review is still WIP because I need to actually call this function with metrics from the frontend. The JNI calls already exist, but I need some feedback on how to poll the frontend periodically for those metrics. -- To view, visit http://gerrit.cloudera.org:8080/13748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 Gerrit-Change-Number: 13748 Gerrit-PatchSet: 15 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: radford nguyen Gerrit-Comment-Date: Mon, 15 Jul 2019 12:45:45 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8571[WIP]: improve QueryEventHook execution
Hello Bharath Vissapragada, Fredy Wijaya, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13748 to look at the new patch set (#15). Change subject: IMPALA-8571[WIP]: improve QueryEventHook execution .. IMPALA-8571[WIP]: improve QueryEventHook execution This commit hardens guarantees around QueryEventHook execution by adding the following features: *hook timeout/cancellation* A timeout for hook execution can be configured through the backend flag `query_event_hook_timeout_s`, which specified a timeout value in seconds. If a hook has not completed execution within this timeout (measured from hook submission, not execution) then the hook task will be cancelled in order to free up resources. *hook rejection* The hook execution engine now has a fixed-capacity work queue whose capacity can be configured through the backend flag `query_event_hook_queue_capacity`. This queue is used to store hook tasks that are submitted when there are no free threads available for hook execution. All hook tasks submitted when the queue is at capacity will be rejected and logged without affecting the result of the query. *hook performance metrics* The following hook metrics are captured: *query-event-hook.${hook_method}.execution-rejections* Counter indicating how many submitted tasks have been rejected due to a full work queue *query-event-hook.${hook_method}.execution-exceptions* Counter indicating how many tasks have thrown an exception during execution *query-event-hook.${hook_method}.execution-timeouts* Counter indicating how many tasks have been cancelled due to not completing within {@code hookTimeout_s} of submission. *query-event-hook.${hook_method}.execution-submissions* Counter indicating the number of times ${hookClass}.${method} has been submitted for execution. *query-event-hook.${hook_method}.mean-execution-time* Mean time in [ns] that ${hook_name} has taken to complete, whether normally or by error (e.g. timeout or exception). *query-event-hook.${hook_method}.mean-queued-time* Mean time in [ns] between hook task submission and hook task execution. This indicates how long a hook task has been queued waiting to execute. Testing: - added unit tests for new features - re-ran existing E2E tests Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 --- M be/src/service/frontend.cc M be/src/service/frontend.h M be/src/service/impala-server.cc M be/src/util/CMakeLists.txt M be/src/util/backend-gflag-util.cc A be/src/util/hook-metrics-test.cc A be/src/util/hook-metrics.cc A be/src/util/hook-metrics.h M common/thrift/BackendGflags.thrift M common/thrift/Frontend.thrift M common/thrift/metrics.json A fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java M fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java A fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java M fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java M query-event-hook-api/src/main/java/org/apache/impala/hooks/QueryEventHook.java 19 files changed, 1,368 insertions(+), 113 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/13748/15 -- To view, visit http://gerrit.cloudera.org:8080/13748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 Gerrit-Change-Number: 13748 Gerrit-PatchSet: 15 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: radford nguyen
[Impala-ASF-CR] IMPALA-8571[WIP]: improve QueryEventHook execution
radford nguyen has posted comments on this change. ( http://gerrit.cloudera.org:8080/13748 ) Change subject: IMPALA-8571[WIP]: improve QueryEventHook execution .. Patch Set 14: > Patch Set 14: > > (8 comments) > > > Patch Set 14: > > > > (9 comments) > > Replied to comments -- To view, visit http://gerrit.cloudera.org:8080/13748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 Gerrit-Change-Number: 13748 Gerrit-PatchSet: 14 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: radford nguyen Gerrit-Comment-Date: Thu, 11 Jul 2019 00:06:04 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8571[WIP]: improve QueryEventHook execution
radford nguyen has posted comments on this change. ( http://gerrit.cloudera.org:8080/13748 ) Change subject: IMPALA-8571[WIP]: improve QueryEventHook execution .. Patch Set 14: (8 comments) > Patch Set 14: > > (9 comments) Replied to comments http://gerrit.cloudera.org:8080/#/c/13748/14/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/13748/14/be/src/service/impala-server.cc@275 PS14, Line 275: 1 > Since the hook execution is async (to query unregister) anyway, how about w That's a good idea; I'll bump it http://gerrit.cloudera.org:8080/#/c/13748/14/be/src/service/impala-server.cc@279 PS14, Line 279: true > Should anyone ever have to set this to false? If not, I suggest removing th Nobody has requested it, but I was under the impression that a user might want to use higher-priority threads for executing hooks. They might also want to ensure that the threads will not be killed at shutdown until the hooks have completed executing. http://gerrit.cloudera.org:8080/#/c/13748/14/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java File fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java: http://gerrit.cloudera.org:8080/#/c/13748/14/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@97 PS14, Line 97: * ${hookClass}.${method}.execution.rejections > Where are all these metrics exposed? I don't see the plumbing to make them I haven't completed the plumbing yet; I'm still trying to figure out how it works. This is the main reason the CR is still WIP http://gerrit.cloudera.org:8080/#/c/13748/14/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@101 PS14, Line 101: exceptions > nit: failures? I can change it to failures if you think that's more expressive http://gerrit.cloudera.org:8080/#/c/13748/14/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@131 PS14, Line 131: private final long hookTimeout_; : private final TimeUnit hookTimeoutUnit_; > Just force it to be secs or something? Yes, it is forced to `SECONDS` in the `QueryHookManager` when instantiating this executor. I am allowing a different time unit here internally to make the unit tests faster. http://gerrit.cloudera.org:8080/#/c/13748/14/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@290 PS14, Line 290: context); > probably helps to include query ID explicitly. Query ID is not currently available; are you suggesting that it be passed to the hook from the backend? I do like that idea http://gerrit.cloudera.org:8080/#/c/13748/14/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@327 PS14, Line 327: final Future f = hookExecutor_.submit(() -> { > I don't think you need to offload this to a separate executor pool again. I think I'm missing something. If I just do new FutureTask<>(callHook).run(), won't that just run callHook in this thread? And doesn't that mean that we won't get to line 337 until callHook has completed? http://gerrit.cloudera.org:8080/#/c/13748/14/fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java File fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java: http://gerrit.cloudera.org:8080/#/c/13748/14/fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java@77 PS14, Line 77: static final String BE_HOOKS_FLAG = "query_event_hook_classes"; > nit: any reason to remove private qualifier? I used one of the constants in an error message logged by `FixedCapacityQueryHookExecutor`. If you think it's cleaner, I can make these private again and maybe(?) make `FCQHExecutor` an inner class -- To view, visit http://gerrit.cloudera.org:8080/13748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 Gerrit-Change-Number: 13748 Gerrit-PatchSet: 14 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: radford nguyen Gerrit-Comment-Date: Thu, 11 Jul 2019 00:04:35 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8589: Re-enable flaky test query event hooks.py
radford nguyen has posted comments on this change. ( http://gerrit.cloudera.org:8080/13713 ) Change subject: IMPALA-8589: Re-enable flaky test_query_event_hooks.py .. Patch Set 9: (1 comment) http://gerrit.cloudera.org:8080/#/c/13713/9/tests/custom_cluster/test_query_event_hooks.py File tests/custom_cluster/test_query_event_hooks.py: http://gerrit.cloudera.org:8080/#/c/13713/9/tests/custom_cluster/test_query_event_hooks.py@49 PS9, Line 49: expected_count=-1) This test no longer checks for the execution of `onQueryComplete`... it only checks that `onImpalaStartup` is called, which is not a query event hook at all, really. I think we still need coverage for `onQueryComplete` because it executes on the thread pool executor, which is different than `onImpalaStartup` -- To view, visit http://gerrit.cloudera.org:8080/13713 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia64550e986b5eba59a1d77657943932bb977d470 Gerrit-Change-Number: 13713 Gerrit-PatchSet: 9 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: radford nguyen Gerrit-Comment-Date: Mon, 08 Jul 2019 20:32:17 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8571[WIP]: improve QueryEventHook execution
Hello Bharath Vissapragada, Fredy Wijaya, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13748 to look at the new patch set (#14). Change subject: IMPALA-8571[WIP]: improve QueryEventHook execution .. IMPALA-8571[WIP]: improve QueryEventHook execution This commit hardens guarantees around QueryEventHook execution by adding the following features: *hook timeout/cancellation* A timeout for hook execution can be configured through the backend flag `query_event_hook_timeout_s`, which specified a timeout value in seconds. If a hook has not completed execution within this timeout (measured from hook submission, not execution) then the hook task will be cancelled in order to free up resources. *hook rejection* The hook execution engine now has a fixed-capacity work queue whose capacity can be configured through the backend flag `query_event_hook_queue_capacity`. This queue is used to store hook tasks that are submitted when there are no free threads available for hook execution. All hook tasks submitted when the queue is at capacity will be rejected and logged without affecting the result of the query. *hook performance metrics* The following hook metrics are captured: *${hookClass}.${method}.execution.rejections* Counter indicating how many submitted tasks have been rejected due to a full work queue *${hookClass}.${method}.execution.exceptions* Counter indicating how many tasks have thrown an exception during execution *${hookClass}.${method}.execution.timeouts* Counter indicating how many tasks have been cancelled due to not completing within {@code hookTimeout_s} of submission. *${hookClass}.${method}.submissions* Counter indicating the number of times ${hookClass}.${method} has been submitted for execution. *${hookClass}.${method}.execution.time* Timer indicating the amount of time ${hookClass}.${method} has taken to complete, whether normally or by error (e.g. timeout or exception). Because tasks can potentially be cancelled before beginning execution, the Timer.getCount() of this timer may be less than `${hookClass}.${method}.submissions` *${hookClass}.${method}.queued.time* Timer indicating the amount of time between hook task submission and hook task execution. Since a task may be cancelled before it even begins execution, {@link Timer#getCount()} of this timer may be less than `${hookClass}.${method}.submissions` Testing: - added unit tests for new features - re-ran existing E2E tests Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 --- M be/src/service/impala-server.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift A fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java M fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/Frontend.java A fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java M fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java M query-event-hook-api/src/main/java/org/apache/impala/hooks/QueryEventHook.java 10 files changed, 818 insertions(+), 109 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/13748/14 -- To view, visit http://gerrit.cloudera.org:8080/13748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 Gerrit-Change-Number: 13748 Gerrit-PatchSet: 14 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: radford nguyen
[Impala-ASF-CR] IMPALA-8571[WIP]: improve QueryEventHook execution
Hello Fredy Wijaya, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13748 to look at the new patch set (#13). Change subject: IMPALA-8571[WIP]: improve QueryEventHook execution .. IMPALA-8571[WIP]: improve QueryEventHook execution This commit hardens guarantees around QueryEventHook execution by adding the following features: *hook timeout/cancellation* A timeout for hook execution can be configured through the backend flag `query_event_hook_timeout_s`, which specified a timeout value in seconds. If a hook has not completed execution within this timeout (measured from hook submission, not execution) then the hook task will be cancelled in order to free up resources. *hook rejection* The hook execution engine now has a fixed-capacity work queue whose capacity can be configured through the backend flag `query_event_hook_queue_capacity`. This queue is used to store hook tasks that are submitted when there are no free threads available for hook execution. All hook tasks submitted when the queue is at capacity will be rejected and logged without affecting the result of the query. *hook performance metrics* The following hook metrics are captured: *${hookClass}.${method}.execution.rejections* Counter indicating how many submitted tasks have been rejected due to a full work queue *${hookClass}.${method}.execution.exceptions* Counter indicating how many tasks have thrown an exception during execution *${hookClass}.${method}.execution.timeouts* Counter indicating how many tasks have been cancelled due to not completing within {@code hookTimeout_s} of submission. *${hookClass}.${method}.submissions* Counter indicating the number of times ${hookClass}.${method} has been submitted for execution. *${hookClass}.${method}.execution.time* Timer indicating the amount of time ${hookClass}.${method} has taken to complete, whether normally or by error (e.g. timeout or exception). Because tasks can potentially be cancelled before beginning execution, the Timer.getCount() of this timer may be less than `${hookClass}.${method}.submissions` *${hookClass}.${method}.queued.time* Timer indicating the amount of time between hook task submission and hook task execution. Since a task may be cancelled before it even begins execution, {@link Timer#getCount()} of this timer may be less than `${hookClass}.${method}.submissions` Testing: - added unit tests for new features - re-ran existing E2E tests Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 --- M be/src/service/impala-server.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift A fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java M fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/Frontend.java A fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java M fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java M query-event-hook-api/src/main/java/org/apache/impala/hooks/QueryEventHook.java 10 files changed, 817 insertions(+), 109 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/13748/13 -- To view, visit http://gerrit.cloudera.org:8080/13748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 Gerrit-Change-Number: 13748 Gerrit-PatchSet: 13 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: radford nguyen
[Impala-ASF-CR] IMPALA-8571[WIP]: improve QueryEventHook execution
Hello Fredy Wijaya, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13748 to look at the new patch set (#12). Change subject: IMPALA-8571[WIP]: improve QueryEventHook execution .. IMPALA-8571[WIP]: improve QueryEventHook execution This commit hardens guarantees around QueryEventHook execution by adding the following features: *hook timeout/cancellation* *hook rejection* *hook performance metrics* Testing: - added unit tests for new features - re-ran existing E2E tests Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 --- M be/src/service/impala-server.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift A fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java M fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/Frontend.java A fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java M fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java M query-event-hook-api/src/main/java/org/apache/impala/hooks/QueryEventHook.java 10 files changed, 782 insertions(+), 109 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/13748/12 -- To view, visit http://gerrit.cloudera.org:8080/13748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 Gerrit-Change-Number: 13748 Gerrit-PatchSet: 12 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: radford nguyen
[Impala-ASF-CR] IMPALA-8572[WIP]: improve QueryEventHook execution
Hello Fredy Wijaya, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13748 to look at the new patch set (#11). Change subject: IMPALA-8572[WIP]: improve QueryEventHook execution .. IMPALA-8572[WIP]: improve QueryEventHook execution This commit hardens guarantees around QueryEventHook execution by adding the following features: *hook timeout/cancellation* *hook rejection* *hook performance metrics* Testing: - added unit tests for new features - re-ran existing E2E tests Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 --- M be/src/service/impala-server.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift A fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java M fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/Frontend.java A fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java M fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java M query-event-hook-api/src/main/java/org/apache/impala/hooks/QueryEventHook.java 10 files changed, 782 insertions(+), 109 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/13748/11 -- To view, visit http://gerrit.cloudera.org:8080/13748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 Gerrit-Change-Number: 13748 Gerrit-PatchSet: 11 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: radford nguyen
[Impala-ASF-CR] IMPALA-8572: improve QueryEventHook execution
Hello Fredy Wijaya, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13748 to look at the new patch set (#10). Change subject: IMPALA-8572: improve QueryEventHook execution .. IMPALA-8572: improve QueryEventHook execution WIP Testing: TBD Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 --- M be/src/service/impala-server.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift A fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java M fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/Frontend.java A fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java M fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java M query-event-hook-api/src/main/java/org/apache/impala/hooks/QueryEventHook.java 10 files changed, 767 insertions(+), 109 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/13748/10 -- To view, visit http://gerrit.cloudera.org:8080/13748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 Gerrit-Change-Number: 13748 Gerrit-PatchSet: 10 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: radford nguyen
[Impala-ASF-CR] IMPALA-8572: improve QueryEventHook execution
Hello Fredy Wijaya, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13748 to look at the new patch set (#9). Change subject: IMPALA-8572: improve QueryEventHook execution .. IMPALA-8572: improve QueryEventHook execution WIP Testing: TBD Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 --- M be/src/service/impala-server.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift A fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java M fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/Frontend.java A fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java M fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java M query-event-hook-api/src/main/java/org/apache/impala/hooks/QueryEventHook.java 10 files changed, 658 insertions(+), 109 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/13748/9 -- To view, visit http://gerrit.cloudera.org:8080/13748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 Gerrit-Change-Number: 13748 Gerrit-PatchSet: 9 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: radford nguyen
[Impala-ASF-CR] IMPALA-8572: improve QueryEventHook execution
Hello Fredy Wijaya, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13748 to look at the new patch set (#8). Change subject: IMPALA-8572: improve QueryEventHook execution .. IMPALA-8572: improve QueryEventHook execution WIP Testing: TBD Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 --- M be/src/service/impala-server.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift A fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java M fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/Frontend.java A fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java M fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java M query-event-hook-api/src/main/java/org/apache/impala/hooks/QueryEventHook.java 10 files changed, 658 insertions(+), 109 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/13748/8 -- To view, visit http://gerrit.cloudera.org:8080/13748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 Gerrit-Change-Number: 13748 Gerrit-PatchSet: 8 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-8572: improve QueryEventHook execution
radford nguyen has posted comments on this change. ( http://gerrit.cloudera.org:8080/13748 ) Change subject: IMPALA-8572: improve QueryEventHook execution .. Patch Set 7: (2 comments) add comment where you should probably start reviewing http://gerrit.cloudera.org:8080/#/c/13748/7/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java File fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java: http://gerrit.cloudera.org:8080/#/c/13748/7/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@161 PS7, Line 161: Future submitOnQueryComplete( Entry point for executing hook onQueryComplete asynchronously http://gerrit.cloudera.org:8080/#/c/13748/7/fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java File fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java: http://gerrit.cloudera.org:8080/#/c/13748/7/fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java@133 PS7, Line 133: final String msg = String.format("hook timeout should be positive but was {}. " + might allow negative value to signify block indefinitely -- To view, visit http://gerrit.cloudera.org:8080/13748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 Gerrit-Change-Number: 13748 Gerrit-PatchSet: 7 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: radford nguyen Gerrit-Comment-Date: Tue, 02 Jul 2019 01:12:20 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8572: improve QueryEventHook execution
radford nguyen has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/13748 ) Change subject: IMPALA-8572: improve QueryEventHook execution .. IMPALA-8572: improve QueryEventHook execution Description goes here Testing: TBD Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 --- M be/src/service/impala-server.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift A fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java M fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/Frontend.java A fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java M fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java M query-event-hook-api/src/main/java/org/apache/impala/hooks/QueryEventHook.java 10 files changed, 658 insertions(+), 109 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/13748/7 -- To view, visit http://gerrit.cloudera.org:8080/13748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643 Gerrit-Change-Number: 13748 Gerrit-PatchSet: 7 Gerrit-Owner: radford nguyen
[Impala-ASF-CR] IMPALA-8473: Publish lineage info via hook
radford nguyen has uploaded a new patch set (#26). ( http://gerrit.cloudera.org:8080/13352 ) Change subject: IMPALA-8473: Publish lineage info via hook .. IMPALA-8473: Publish lineage info via hook This commit introduces a hook mechanism for publishing, lineage data specifically, but query information more generally, from Impala. The legacy behavior of writing the lineage file is being retained but deprecated. Hooks can be implemented by downstream consumers (i.e. runtime dependencies) to hook into supported places during Impala query execution: - impalad startup - query completion - see IMPALA-8572 for caveat/details The consumers are to be frontend Java dependencies intiated at runtime. 2 backend flags configure this behavior: - `query_event_hook_classes` specifies a comma-separated list of hook consumer implementation classes that are instantiated and registered at impala start up. - `query_event_hook_nthreads` specifies the number of threads to use for asynchronous hook execution. (Relevant if multiple hooks are registered.) Lineage information is passed from the backend after a query completes (but before it returns) and given to every hook to execute asynchronously. In other words, a query may complete and return to the user before any or all hooks have completed executing. An exception during hook on-query-complete execution will simply be logged and will not be (directly) fatal to the system. Tests: - added unit tests for FE hook execution - added E2E tests for hook configuration, execution, error - ran full build, tests Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1 --- M be/src/service/frontend.cc M be/src/service/frontend.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M common/thrift/Frontend.thrift A fe/src/main/java/org/apache/impala/hooks/QueryCompleteContext.java A fe/src/main/java/org/apache/impala/hooks/QueryEventHook.java A fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java A fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java A fe/src/test/java/org/apache/impala/testutil/AlwaysErrorQueryEventHook.java A fe/src/test/java/org/apache/impala/testutil/CountingQueryEventHook.java A fe/src/test/java/org/apache/impala/testutil/DummyQueryEventHook.java A fe/src/test/java/org/apache/impala/testutil/PostQueryErrorEventHook.java M tests/authorization/test_provider.py A tests/custom_cluster/test_query_event_hooks.py 20 files changed, 1,109 insertions(+), 26 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/13352/26 -- To view, visit http://gerrit.cloudera.org:8080/13352 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1 Gerrit-Change-Number: 13352 Gerrit-PatchSet: 26 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Anonymous Coward (498) Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: radford nguyen
[Impala-ASF-CR] IMPALA-8473: Publish lineage info via hook
radford nguyen has uploaded a new patch set (#24). ( http://gerrit.cloudera.org:8080/13352 ) Change subject: IMPALA-8473: Publish lineage info via hook .. IMPALA-8473: Publish lineage info via hook This commit introduces a hook mechanism for publishing, lineage data specifically, but query information more generally, from Impala. The legacy behavior of writing the lineage file is being retained but deprecated. Hooks can be implemented by downstream consumers (i.e. runtime dependencies) to hook into supported places during Impala query execution: - impalad startup - query completion - see IMPALA-8572 for caveat/details The consumers are to be frontend Java dependencies intiated at runtime. 2 backend flags configure this behavior: - `query_event_hook_classes` specifies a comma-separated list of hook consumer implementation classes that are instantiated and registered at impala start up. - `query_event_hook_nthreads` specifies the number of threads to use for asynchronous hook execution. (Relevant if multiple hooks are registered.) Lineage information is passed from the backend after a query completes (but before it returns) and given to every hook to execute asynchronously. In other words, a query may complete and return to the user before any or all hooks have completed executing. An exception during hook on-query-complete execution will simply be logged and will not be (directly) fatal to the system. Tests: - added unit tests for FE hook execution - added E2E tests for hook configuration, execution, error - ran full build, tests Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1 --- M be/src/service/frontend.cc M be/src/service/frontend.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M common/thrift/Frontend.thrift A fe/src/main/java/org/apache/impala/hooks/QueryCompleteContext.java A fe/src/main/java/org/apache/impala/hooks/QueryEventHook.java A fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java A fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java A fe/src/test/java/org/apache/impala/testutil/AlwaysErrorQueryEventHook.java A fe/src/test/java/org/apache/impala/testutil/CountingQueryEventHook.java A fe/src/test/java/org/apache/impala/testutil/DummyQueryEventHook.java A fe/src/test/java/org/apache/impala/testutil/PostQueryErrorEventHook.java M tests/authorization/test_provider.py A tests/hooks/test_hooks.py M tests/run-tests.py 21 files changed, 1,110 insertions(+), 27 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/13352/24 -- To view, visit http://gerrit.cloudera.org:8080/13352 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1 Gerrit-Change-Number: 13352 Gerrit-PatchSet: 24 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Anonymous Coward (498) Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: radford nguyen
[Impala-ASF-CR] IMPALA-8473: Publish lineage info via hook
radford nguyen has uploaded a new patch set (#21). ( http://gerrit.cloudera.org:8080/13352 ) Change subject: IMPALA-8473: Publish lineage info via hook .. IMPALA-8473: Publish lineage info via hook This commit introduces a hook mechanism for publishing, lineage data specifically, but query information more generally, from Impala. The legacy behavior of writing the lineage file is being retained but deprecated. Hooks can be implemented by downstream consumers (i.e. runtime dependencies) to hook into supported places during Impala query execution: - impalad startup - query completion - see IMPALA-8572 for caveat/details The consumers are to be frontend Java dependencies intiated at runtime. 2 backend flags configure this behavior: - `query_event_hook_classes` specifies a comma-separated list of hook consumer implementation classes that are instantiated and registered at impala start up. - `query_event_hook_nthreads` specifies the number of threads to use for asynchronous hook execution. (Relevant if multiple hooks are registered.) Lineage information is passed from the backend after a query completes (but before it returns) and given to every hook to execute asynchronously. In other words, a query may complete and return to the user before any or all hooks have completed executing. An exception during hook on-query-complete execution will simply be logged and will not be (directly) fatal to the system. Tests: - added unit tests for FE hook execution - added E2E tests for hook configuration, execution, error - ran full build, tests Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1 --- M be/src/service/frontend.cc M be/src/service/frontend.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M common/thrift/Frontend.thrift A fe/src/main/java/org/apache/impala/hooks/QueryCompleteContext.java A fe/src/main/java/org/apache/impala/hooks/QueryEventHook.java A fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java A fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java A fe/src/test/java/org/apache/impala/testutil/AlwaysErrorQueryEventHook.java A fe/src/test/java/org/apache/impala/testutil/CountingQueryEventHook.java A fe/src/test/java/org/apache/impala/testutil/DummyQueryEventHook.java A fe/src/test/java/org/apache/impala/testutil/PostQueryErrorEventHook.java M tests/authorization/test_provider.py A tests/hooks/test_hooks.py 20 files changed, 1,109 insertions(+), 26 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/13352/21 -- To view, visit http://gerrit.cloudera.org:8080/13352 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1 Gerrit-Change-Number: 13352 Gerrit-PatchSet: 21 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Anonymous Coward (498) Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: radford nguyen
[Impala-ASF-CR] IMPALA-8473: publish lineage info via hook
radford nguyen has posted comments on this change. ( http://gerrit.cloudera.org:8080/13352 ) Change subject: IMPALA-8473: publish lineage info via hook .. Patch Set 20: Fixed code-review checks, implemented minor feedback, and managed to drastically improve 1 e2e test case -- To view, visit http://gerrit.cloudera.org:8080/13352 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1 Gerrit-Change-Number: 13352 Gerrit-PatchSet: 20 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Anonymous Coward (498) Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: radford nguyen Gerrit-Comment-Date: Thu, 23 May 2019 02:28:44 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8473: publish lineage info via hook
Hello Andrew Sherman, Anonymous Coward (498), Austin Nobis, Fredy Wijaya, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13352 to look at the new patch set (#20). Change subject: IMPALA-8473: publish lineage info via hook .. IMPALA-8473: publish lineage info via hook This commit introduces a hook mechanism for publishing, lineage data specifically, but query information more generally, from Impala. The legacy behavior of writing the lineage file is being retained but deprecated. Hooks can be implemented by downstream consumers (i.e. runtime dependencies) to hook into supported places during Impala query execution: - impalad startup - query completion - see IMPALA-8572 for caveat/details The consumers are to be frontend Java dependencies intiated at runtime. 2 backend flags configure this behavior: - `query_event_hook_classes` specifies a comma-separated list of hook consumer implementation classes that are instantiated and registered at impala start up. - `query_event_hook_nthreads` specifies the number of threads to use for asynchronous hook execution. (Relevant if multiple hooks are registered.) Lineage information is passed from the backend after a query completes (but before it returns) and given to every hook to execute asynchronously. In other words, a query may complete and return to the user before any or all hooks have completed executing. An exception during hook on-query-complete execution will simply be logged and will not be (directly) fatal to the system. Tests: - added unit tests for FE hook execution - added e2e tests for hook configuration, execution, error - ran full build, tests Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1 --- M be/src/service/frontend.cc M be/src/service/frontend.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M common/thrift/Frontend.thrift A fe/src/main/java/org/apache/impala/hooks/QueryCompleteContext.java A fe/src/main/java/org/apache/impala/hooks/QueryEventHook.java A fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java A fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java M fe/src/test/java/org/apache/impala/service/JniFrontendTest.java A fe/src/test/java/org/apache/impala/testutil/AlwaysErrorQueryEventHook.java A fe/src/test/java/org/apache/impala/testutil/CountingQueryEventHook.java A fe/src/test/java/org/apache/impala/testutil/DummyQueryEventHook.java A fe/src/test/java/org/apache/impala/testutil/PostQueryErrorEventHook.java A tests/hooks/test_hooks.py 20 files changed, 1,169 insertions(+), 24 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/13352/20 -- To view, visit http://gerrit.cloudera.org:8080/13352 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1 Gerrit-Change-Number: 13352 Gerrit-PatchSet: 20 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Anonymous Coward (498) Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: radford nguyen
[Impala-ASF-CR] IMPALA-8473: publish lineage info via hook
radford nguyen has posted comments on this change. ( http://gerrit.cloudera.org:8080/13352 ) Change subject: IMPALA-8473: publish lineage info via hook .. Patch Set 19: (2 comments) Thanks for +1; replied to your comments. Next patchset will fix code-review checks http://gerrit.cloudera.org:8080/#/c/13352/18/fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java File fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java: http://gerrit.cloudera.org:8080/#/c/13352/18/fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java@44 PS18, Line 44: private QueryCompleteContext mockQueryCompleteContext; > I don't understand why this is marked as Mock, it seems a simple class that I just typically use mocks to: - avoid complex creation logic that tends to change with refactoring - allow behavior verification, such as verifying that a method was invoked so many times with such and such arguments - allow customized stubbing of behavior without boilerplate None of these actually apply in this test suite currently, so I can replace it with a POJO if you think it's better that way. http://gerrit.cloudera.org:8080/#/c/13352/18/fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java@67 PS18, Line 67: private static QueryEventHookManager createSut(int nThreads, String... hooks) > What is SUT? Maybe use something clearer to the reader SUT is system-under-test; I thought that nomenclature was somewhat standard? Not sure, but I think it came from a Martin Fowler blog... maybe that's not as standard as I think -- To view, visit http://gerrit.cloudera.org:8080/13352 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1 Gerrit-Change-Number: 13352 Gerrit-PatchSet: 19 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Anonymous Coward (498) Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: radford nguyen Gerrit-Comment-Date: Wed, 22 May 2019 23:36:57 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8473: publish lineage info via hook
radford nguyen has posted comments on this change. ( http://gerrit.cloudera.org:8080/13352 ) Change subject: IMPALA-8473: publish lineage info via hook .. Patch Set 19: Think this is ready for final review now. Sorry about all the patchsets; don't really have my impala workflow established yet. -- To view, visit http://gerrit.cloudera.org:8080/13352 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1 Gerrit-Change-Number: 13352 Gerrit-PatchSet: 19 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Anonymous Coward (498) Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: radford nguyen Gerrit-Comment-Date: Wed, 22 May 2019 21:39:55 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8473: publish lineage info via hook
Hello Andrew Sherman, Anonymous Coward (498), Austin Nobis, Fredy Wijaya, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13352 to look at the new patch set (#19). Change subject: IMPALA-8473: publish lineage info via hook .. IMPALA-8473: publish lineage info via hook This commit introduces a hook mechanism for publishing, lineage data specifically, but query information more generally, from Impala. The legacy behavior of writing the lineage file is being retained but deprecated. Hooks can be implemented by downstream consumers (i.e. runtime dependencies) to hook into supported places during Impala query execution: - impalad startup - query completion - see IMPALA-8572 for caveat/details The consumers are to be frontend Java dependencies intiated at runtime. 2 backend flags configure this behavior: - `query_event_hook_classes` specifies a comma-separated list of hook consumer implementation classes that are instantiated and registered at impala start up. - `query_event_hook_nthreads` specifies the number of threads to use for asynchronous hook execution. (Relevant if multiple hooks are registered.) Lineage information is passed from the backend after a query completes (but before it returns) and given to every hook to execute asynchronously. In other words, a query may complete and return to the user before any or all hooks have completed executing. An exception during hook on-query-complete execution will simply be logged and will not be (directly) fatal to the system. Tests: - added unit tests for FE hook execution - added e2e tests for hook configuration, execution, error - ran full build, tests Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1 --- M be/src/service/frontend.cc M be/src/service/frontend.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M common/thrift/Frontend.thrift A fe/src/main/java/org/apache/impala/hooks/QueryCompleteContext.java A fe/src/main/java/org/apache/impala/hooks/QueryEventHook.java A fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java A fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java M fe/src/test/java/org/apache/impala/service/JniFrontendTest.java A fe/src/test/java/org/apache/impala/testutil/AlwaysErrorQueryEventHook.java A fe/src/test/java/org/apache/impala/testutil/CountingQueryEventHook.java A fe/src/test/java/org/apache/impala/testutil/DummyQueryEventHook.java A fe/src/test/java/org/apache/impala/testutil/PostQueryErrorEventHook.java A tests/hooks/test_hooks.py 20 files changed, 1,231 insertions(+), 24 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/13352/19 -- To view, visit http://gerrit.cloudera.org:8080/13352 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1 Gerrit-Change-Number: 13352 Gerrit-PatchSet: 19 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Anonymous Coward (498) Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: radford nguyen
[Impala-ASF-CR] IMPALA-8473: publish lineage info via hook
Hello Andrew Sherman, Anonymous Coward (498), Austin Nobis, Fredy Wijaya, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13352 to look at the new patch set (#18). Change subject: IMPALA-8473: publish lineage info via hook .. IMPALA-8473: publish lineage info via hook This commit introduces a hook mechanism for publishing, lineage data specifically, but query information more generally, from Impala. The legacy behavior of writing the lineage file is being retained but deprecated. Hooks can be implemented by downstream consumers (i.e. runtime dependencies) to hook into supported places during Impala query execution: - impalad startup - query completion - see IMPALA-8572 for caveat/details The consumers are to be frontend Java dependencies intiated at runtime. 2 backend flags configure this behavior: - `query_event_hook_classes` specifies a comma-separated list of hook consumer implementation classes that are instantiated and registered at impala start up. - `query_event_hook_nthreads` specifies the number of threads to use for asynchronous hook execution. (Relevant if multiple hooks are registered.) Lineage information is passed from the backend after a query completes (but before it returns) and given to every hook to execute asynchronously. In other words, a query may complete and return to the user before any or all hooks have completed executing. An exception during hook on-query-complete execution will simply be logged and will not be (directly) fatal to the system. Tests: - added unit tests for FE hook execution - added e2e tests for hook configuration, execution, error - ran full build, tests Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1 --- M be/src/service/frontend.cc M be/src/service/frontend.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M common/thrift/Frontend.thrift A fe/src/main/java/org/apache/impala/hooks/QueryCompleteContext.java A fe/src/main/java/org/apache/impala/hooks/QueryEventHook.java A fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java A fe/src/test/java/org/apache/impala/hooks/QueryEventHookManagerTest.java M fe/src/test/java/org/apache/impala/service/JniFrontendTest.java A fe/src/test/java/org/apache/impala/testutil/AlwaysErrorQueryEventHook.java A fe/src/test/java/org/apache/impala/testutil/CountingQueryEventHook.java A fe/src/test/java/org/apache/impala/testutil/DummyQueryEventHook.java A fe/src/test/java/org/apache/impala/testutil/PostQueryErrorEventHook.java A tests/hooks/test_hooks.py 20 files changed, 1,230 insertions(+), 24 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/13352/18 -- To view, visit http://gerrit.cloudera.org:8080/13352 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1 Gerrit-Change-Number: 13352 Gerrit-PatchSet: 18 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Anonymous Coward (498) Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: radford nguyen
[Impala-ASF-CR] IMPALA-8473: publish lineage info via hook
Hello Andrew Sherman, Anonymous Coward (498), Austin Nobis, Fredy Wijaya, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13352 to look at the new patch set (#17). Change subject: IMPALA-8473: publish lineage info via hook .. IMPALA-8473: publish lineage info via hook This commit introduces a hook mechanism for publishing, lineage data specifically, but query information more generally, from Impala. The legacy behavior of writing the lineage file is being retained but deprecated. Hooks can be implemented by downstream consumers (i.e. runtime dependencies) to hook into supported places during Impala query execution: - impalad startup - post-query execution The consumers are to be frontend Java dependencies intiated at runtime. 2 backend flags configure this behavior: - `query_exec_hook_classes` specifies a comma-separated list of hook consumer implementation classes that are instantiated and registered at impala start up. - `num_query_exec_hook_threads` specifies the number of threads to use for asynchronous hook execution. (Relevant if multiple hooks are registered.) Lineage information is passed from the backend after a query completes (but before it returns) and given to every hook to execute asynchronously. In other words, a query may complete and return to the user before any or all hooks have completed executing. An exception during hook post-query execution will simply be logged and will not bring down the system. Tests: - added unit tests for FE hook execution - added e2e tests for hook configuration, execution, error - ran full build, tests Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1 --- M be/src/service/frontend.cc M be/src/service/frontend.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M common/thrift/Frontend.thrift A fe/src/main/java/org/apache/impala/hooks/PostQueryHookContext.java A fe/src/main/java/org/apache/impala/hooks/QueryExecHook.java A fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java A fe/src/test/java/org/apache/impala/hooks/QueryExecHookManagerTest.java M fe/src/test/java/org/apache/impala/service/JniFrontendTest.java A fe/src/test/java/org/apache/impala/testutil/AlwaysErrorQueryExecHook.java A fe/src/test/java/org/apache/impala/testutil/CountingQueryExecHook.java A fe/src/test/java/org/apache/impala/testutil/DummyQueryExecHook.java A fe/src/test/java/org/apache/impala/testutil/PostQueryErrorExecHook.java A tests/hooks/test_hooks.py 20 files changed, 1,183 insertions(+), 24 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/13352/17 -- To view, visit http://gerrit.cloudera.org:8080/13352 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1 Gerrit-Change-Number: 13352 Gerrit-PatchSet: 17 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Anonymous Coward (498) Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: radford nguyen
[Impala-ASF-CR] IMPALA-8473: publish lineage info via hook
radford nguyen has posted comments on this change. ( http://gerrit.cloudera.org:8080/13352 ) Change subject: IMPALA-8473: publish lineage info via hook .. Patch Set 17: (15 comments) still one more iteration needed http://gerrit.cloudera.org:8080/#/c/13352/16//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/13352/16//COMMIT_MSG@38 PS16, Line 38: to every hook to execute asynchronously. In other words, > Nit: spell out IOW to help future non-native English speakers good point; done http://gerrit.cloudera.org:8080/#/c/13352/17/fe/src/main/java/org/apache/impala/hooks/QueryExecHook.java File fe/src/main/java/org/apache/impala/hooks/QueryExecHook.java: http://gerrit.cloudera.org:8080/#/c/13352/17/fe/src/main/java/org/apache/impala/hooks/QueryExecHook.java@25 PS17, Line 25: public interface QueryExecHook { Before we merge this, are we happy with this name? The backend seems to use the terminology "query event", e.g. `(*request_state)->query_events()` https://github.com/apache/impala/blob/3.2.0/be/src/service/impala-server.cc#L934 Perhaps we should rename this to `QueryEventHook` and likewise in other similar places. http://gerrit.cloudera.org:8080/#/c/13352/16/tests/hooks/test_hooks.py File tests/hooks/test_hooks.py: http://gerrit.cloudera.org:8080/#/c/13352/16/tests/hooks/test_hooks.py@299 PS16, Line 299: pass > If you make LOG_DIR unique you could leave it around which might help someo much better; done http://gerrit.cloudera.org:8080/#/c/13352/17/tests/hooks/test_hooks.py File tests/hooks/test_hooks.py: http://gerrit.cloudera.org:8080/#/c/13352/17/tests/hooks/test_hooks.py@30 PS17, Line 30: from getpass import getuser > flake8: E402 module level import not at top of file Done http://gerrit.cloudera.org:8080/#/c/13352/17/tests/hooks/test_hooks.py@31 PS17, Line 31: from ImpalaService import ImpalaHiveServer2Service > flake8: E402 module level import not at top of file Done http://gerrit.cloudera.org:8080/#/c/13352/17/tests/hooks/test_hooks.py@32 PS17, Line 32: from TCLIService import TCLIService > flake8: E402 module level import not at top of file Done http://gerrit.cloudera.org:8080/#/c/13352/17/tests/hooks/test_hooks.py@33 PS17, Line 33: from thrift.transport.TSocket import TSocket > flake8: E402 module level import not at top of file Done http://gerrit.cloudera.org:8080/#/c/13352/17/tests/hooks/test_hooks.py@34 PS17, Line 34: from thrift.transport.TTransport import TBufferedTransport > flake8: E402 module level import not at top of file Done http://gerrit.cloudera.org:8080/#/c/13352/17/tests/hooks/test_hooks.py@35 PS17, Line 35: from thrift.protocol import TBinaryProtocol > flake8: E402 module level import not at top of file Done http://gerrit.cloudera.org:8080/#/c/13352/17/tests/hooks/test_hooks.py@36 PS17, Line 36: from tests.common.custom_cluster_test_suite import CustomClusterTestSuite > flake8: E402 module level import not at top of file Done http://gerrit.cloudera.org:8080/#/c/13352/17/tests/hooks/test_hooks.py@37 PS17, Line 37: from tests.common.file_utils import assert_file_in_dir_contains,\ > flake8: E402 module level import not at top of file Done http://gerrit.cloudera.org:8080/#/c/13352/17/tests/hooks/test_hooks.py@37 PS17, Line 37: from tests.common.file_utils import assert_file_in_dir_contains,\ > flake8: F401 'tests.common.file_utils.assert_no_files_in_dir_contain' impor Done http://gerrit.cloudera.org:8080/#/c/13352/17/tests/hooks/test_hooks.py@205 PS17, Line 205: def __wait_for_file(self, filepath, timeout_s=10): Should this go in common/file_utils.py? I didn't put it there now because it's just a first-pass, quick-and-dirty implementation http://gerrit.cloudera.org:8080/#/c/13352/17/tests/hooks/test_hooks.py@217 PS17, Line 217: class TestHooksStartupFail(CustomClusterTestSuite): > flake8: E302 expected 2 blank lines, found 1 Done http://gerrit.cloudera.org:8080/#/c/13352/17/tests/hooks/test_hooks.py@248 PS17, Line 248: d > flake8: E304 blank lines found after function decorator Done -- To view, visit http://gerrit.cloudera.org:8080/13352 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1 Gerrit-Change-Number: 13352 Gerrit-PatchSet: 17 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Anonymous Coward (498) Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: radford nguyen Gerrit-Comment-Date: Wed, 22 May 2019 20:11:33 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8473: publish lineage info via hook
radford nguyen has posted comments on this change. ( http://gerrit.cloudera.org:8080/13352 ) Change subject: IMPALA-8473: publish lineage info via hook .. Patch Set 14: (1 comment) As discussed in Slack, I've uploaded a simple change from the last patchset where we pass the json lineage string as opposed to the thrift object. (This is actually the same approach as the first few patchsets http://gerrit.cloudera.org:8080/#/c/13352/15/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/13352/15/be/src/service/impala-server.cc@499 PS15, Line 499: if (AreQueryHooksEnabled()) { > One thing to be aware of this is that LogLineageRecord() is called from Unr Really great point; I've copied this comment into the hook interface javadoc -- To view, visit http://gerrit.cloudera.org:8080/13352 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1 Gerrit-Change-Number: 13352 Gerrit-PatchSet: 14 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Anonymous Coward (498) Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: radford nguyen Gerrit-Comment-Date: Wed, 22 May 2019 07:18:30 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8473: publish lineage info via hook
Hello Andrew Sherman, Anonymous Coward (498), Austin Nobis, Fredy Wijaya, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13352 to look at the new patch set (#16). Change subject: IMPALA-8473: publish lineage info via hook .. IMPALA-8473: publish lineage info via hook This commit introduces a hook mechanism for publishing, lineage data specifically, but query information more generally, from Impala. The legacy behavior of writing the lineage file is being retained but deprecated. Hooks can be implemented by downstream consumers (i.e. runtime dependencies) to hook into supported places during Impala query execution: - impalad startup - post-query execution The consumers are to be frontend Java dependencies intiated at runtime. 2 backend flags configure this behavior: - `query_exec_hook_classes` specifies a comma-separated list of hook consumer implementation classes that are instantiated and registered at impala start up. - `num_query_exec_hook_threads` specifies the number of threads to use for asynchronous hook execution. (Relevant if multiple hooks are registered.) Lineage information is passed from the backend after a query completes (but before it returns) and given to every hook to execute asynchronously. IOW, a query may complete and return to the user before any or all hooks have completed executing. An exception during hook post-query execution will simply be logged and will not bring down the system. Tests: - added unit tests for FE hook execution - added e2e tests for hook configuration, execution, error Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1 --- M be/src/service/frontend.cc M be/src/service/frontend.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M common/thrift/Frontend.thrift A fe/src/main/java/org/apache/impala/hooks/PostQueryHookContext.java A fe/src/main/java/org/apache/impala/hooks/QueryExecHook.java A fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java A fe/src/test/java/org/apache/impala/hooks/QueryExecHookManagerTest.java A fe/src/test/java/org/apache/impala/testutil/AlwaysErrorQueryExecHook.java A fe/src/test/java/org/apache/impala/testutil/CountingQueryExecHook.java A fe/src/test/java/org/apache/impala/testutil/DummyQueryExecHook.java A fe/src/test/java/org/apache/impala/testutil/PostQueryErrorExecHook.java A tests/hooks/test_hooks.py 19 files changed, 1,118 insertions(+), 24 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/13352/16 -- To view, visit http://gerrit.cloudera.org:8080/13352 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1 Gerrit-Change-Number: 13352 Gerrit-PatchSet: 16 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Anonymous Coward (498) Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: radford nguyen
[Impala-ASF-CR] IMPALA-8473: publish lineage info via hook
radford nguyen has posted comments on this change. ( http://gerrit.cloudera.org:8080/13352 ) Change subject: IMPALA-8473: publish lineage info via hook .. Patch Set 14: (1 comment) http://gerrit.cloudera.org:8080/#/c/13352/15/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java File fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java: http://gerrit.cloudera.org:8080/#/c/13352/15/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java@161 PS15, Line 161: hookExecutor_.shutdown(); > Don't think so; my understanding is that all submitted jobs will still exec Regardless, I think this does need to `awaitTermination` because any hooks that hang will hang the system exit, and we need to check for that and kill them here -- To view, visit http://gerrit.cloudera.org:8080/13352 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1 Gerrit-Change-Number: 13352 Gerrit-PatchSet: 14 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Anonymous Coward (498) Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: radford nguyen Gerrit-Comment-Date: Tue, 21 May 2019 00:12:54 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8473: publish lineage info via hook
radford nguyen has posted comments on this change. ( http://gerrit.cloudera.org:8080/13352 ) Change subject: IMPALA-8473: publish lineage info via hook .. Patch Set 14: (8 comments) Great feedback on PS14 and 15. I replied to most comments. Another PS coming soon, incorporating Andrew's suggestions about error-handling. Also mentioned logging metrics to the webui to aid in debugging/diagnosis, but I'm not sure how to do this yet. Will focus on capturing them for now. http://gerrit.cloudera.org:8080/#/c/13352/15/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/13352/15/be/src/service/impala-server.cc@505 PS15, Line 505: if (!status.ok()) { > I don't see a test case for this. This status won't indicate if any futures threw an exception (based on current implementation of Frontend.postQueryHook). It will only be a fail if the job submission itself threw an exception. So there's no test case to check for hooks throwing exceptions to trigger a fail status because that's not the feature. Currently, the intended behavior is that no exceptions thrown from a hook will do anything to the system except log themselves. (See comments/javadoc in FE). http://gerrit.cloudera.org:8080/#/c/13352/14/fe/src/main/java/org/apache/impala/hooks/QueryExecHook.java File fe/src/main/java/org/apache/impala/hooks/QueryExecHook.java: http://gerrit.cloudera.org:8080/#/c/13352/14/fe/src/main/java/org/apache/impala/hooks/QueryExecHook.java@59 PS14, Line 59: void postQueryExecute(PostQueryHookContext context); > Would it be useful to the Hook implementor to have some sort of Log object Sounds useful. Would you all be ok with simply passing in the actual `Logger`, or do you think that coupling the interface to our logging choice is a bad idea? It's kind of a wash to me since slf4j is supposed to be a facade anyway. http://gerrit.cloudera.org:8080/#/c/13352/15/fe/src/main/java/org/apache/impala/hooks/QueryExecHook.java File fe/src/main/java/org/apache/impala/hooks/QueryExecHook.java: http://gerrit.cloudera.org:8080/#/c/13352/15/fe/src/main/java/org/apache/impala/hooks/QueryExecHook.java@35 PS15, Line 35:* Any {@link Exception} thrown from this method will effectively fail > Is this the proper behavior? Do we really want to prevent impala from start I think so, as I am considering a failing hook startup to be something akin to a configuration error. I like the `onFoo` naming scheme as well, and I'd change all hook methods to match this http://gerrit.cloudera.org:8080/#/c/13352/15/fe/src/main/java/org/apache/impala/hooks/QueryExecHook.java@52 PS15, Line 52:* Any {@link Exception} thrown from this method will only be caught > Aren't the exceptions from this function re-thrown in the Manager? They're rethrown in the thread pool executor, which only affects the Future. Note that all the Futures are currently ignored by the Frontend. The only reason they're caught here in the first place is to log them. http://gerrit.cloudera.org:8080/#/c/13352/12/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java File fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java: http://gerrit.cloudera.org:8080/#/c/13352/12/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java@178 PS12, Line 178:* asynchronously, returning immediately with a List of {@link Future}s > I missed this design doc, is there a link? Here's the initial design doc: https://docs.google.com/document/d/1W476Zvjada9WBgmCCywnAakchfE1eNF7zyTy4cY9Q5w/edit?usp=sharing Note that the design/approach has evolved since the time-of-writing, as more feedback has been incorporated http://gerrit.cloudera.org:8080/#/c/13352/15/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java File fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java: http://gerrit.cloudera.org:8080/#/c/13352/15/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java@161 PS15, Line 161: hookExecutor_.shutdown(); > I think it is possible to lose logs during a shutdown if we don't `awaitTer Don't think so; my understanding is that all submitted jobs will still execute-to-completion on the executor threads. `awaitTermination` would only block the `cleanUp` thread until they have completed. http://gerrit.cloudera.org:8080/#/c/13352/15/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java@203 PS15, Line 203: return hookExecutor_.submit(() -> { > As I understand it, submit() adds the work to some sort of queue. So if the I believe the `ExecutorService` I am using rejects new work submission when its internal queue is full, and it provides a configurable hook for handling when that happens. This is something I need to look into some more, but I _believe_ that the code as-written will throw a runtime exception when this occurs on `hookExecutor_.
[Impala-ASF-CR] IMPALA-8473: publish lineage info via hook
Hello Andrew Sherman, Anonymous Coward (498), Fredy Wijaya, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13352 to look at the new patch set (#15). Change subject: IMPALA-8473: publish lineage info via hook .. IMPALA-8473: publish lineage info via hook This commit introduces a hook mechanism for publishing, lineage data specifically, but query information more generally, from Impala. The legacy behavior of writing the lineage file is being retained but deprecated. Hooks can be implemented by downstream consumers (i.e. runtime dependencies) to hook into supported places during Impala query execution: - impalad startup - post-query execution The consumers are to be frontend Java dependencies intiated at runtime. 2 backend flags configure this behavior: - `query_exec_hook_classes` specifies a comma-separated list of hook consumer implementation classes that are instantiated and registered at impala start up. - `num_query_exec_hook_threads` specifies the number of threads to use for asynchronous hook execution. (Relevant if multiple hooks are registered.) Lineage information is passed from the backend after a query completes (but before it returns) and given to every hook to execute asynchronously. IOW, a query may complete and return to the user before any or all hooks have completed executing. An exception during hook post-query execution will simply be logged and will not bring down the system. Tests: - added unit tests for FE hook execution - added e2e tests for hook configuration, execution, error Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1 --- M be/src/service/frontend.cc M be/src/service/frontend.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M common/thrift/Frontend.thrift A fe/src/main/java/org/apache/impala/hooks/PostQueryHookContext.java A fe/src/main/java/org/apache/impala/hooks/QueryExecHook.java A fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java A fe/src/test/java/org/apache/impala/hooks/QueryExecHookManagerTest.java A fe/src/test/java/org/apache/impala/testutil/AlwaysErrorQueryExecHook.java A fe/src/test/java/org/apache/impala/testutil/CountingQueryExecHook.java A fe/src/test/java/org/apache/impala/testutil/DummyQueryExecHook.java A fe/src/test/java/org/apache/impala/testutil/PostQueryErrorExecHook.java A tests/hooks/test_hooks.py 19 files changed, 1,105 insertions(+), 26 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/13352/15 -- To view, visit http://gerrit.cloudera.org:8080/13352 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1 Gerrit-Change-Number: 13352 Gerrit-PatchSet: 15 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Anonymous Coward (498) Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: radford nguyen
[Impala-ASF-CR] IMPALA-8473: publish lineage info via hook
radford nguyen has uploaded a new patch set (#14). ( http://gerrit.cloudera.org:8080/13352 ) Change subject: IMPALA-8473: publish lineage info via hook .. IMPALA-8473: publish lineage info via hook This commit introduces a hook mechanism for publishing, lineage data specifically, but query information more generally, from Impala. The legacy behavior of writing the lineage file is being retained but deprecated. Hooks can be implemented by downstream consumers (i.e. runtime dependencies) to hook into supported places during Impala query execution: - impalad startup - post-query execution The consumers are to be frontend Java dependencies intiated at runtime. 2 backend flags configure this behavior: - `query_exec_hook_classes` specifies a comma-separated list of hook consumer implementation classes that are instantiated and registered at impala start up. - `num_query_exec_hook_threads` specifies the number of threads to use for asynchronous hook execution. (Relevant if multiple hooks are registered.) Lineage information is passed from the backend after a query completes (but before it returns) and given to every hook to execute asynchronously. IOW, a query may complete and return to the user before any or all hooks have completed executing. An exception during hook post-query execution will simply be logged and will not bring down the system. Tests: - added unit tests for FE hook execution - added e2e tests for hook configuration, execution, error Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1 --- M be/src/service/frontend.cc M be/src/service/frontend.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M common/thrift/Frontend.thrift A fe/src/main/java/org/apache/impala/hooks/PostQueryHookContext.java A fe/src/main/java/org/apache/impala/hooks/QueryExecHook.java A fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java A fe/src/test/java/org/apache/impala/hooks/QueryExecHookManagerTest.java A fe/src/test/java/org/apache/impala/testutil/AlwaysErrorQueryExecHook.java A fe/src/test/java/org/apache/impala/testutil/CountingQueryExecHook.java A fe/src/test/java/org/apache/impala/testutil/DummyQueryExecHook.java A fe/src/test/java/org/apache/impala/testutil/PostQueryErrorExecHook.java A tests/hooks/test_hooks.py 19 files changed, 1,120 insertions(+), 26 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/13352/14 -- To view, visit http://gerrit.cloudera.org:8080/13352 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1 Gerrit-Change-Number: 13352 Gerrit-PatchSet: 14 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Anonymous Coward (498) Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: radford nguyen
[Impala-ASF-CR] [WIP] IMPALA-8473: publish lineage info via hook
radford nguyen has posted comments on this change. ( http://gerrit.cloudera.org:8080/13352 ) Change subject: [WIP] IMPALA-8473: publish lineage info via hook .. Patch Set 12: (4 comments) answered some concerns around logging and error handling http://gerrit.cloudera.org:8080/#/c/13352/7/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java File fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java: http://gerrit.cloudera.org:8080/#/c/13352/7/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java@55 PS7, Line 55: * performed synchronously during {@link #createFromConfig(BackendConfig)}. > line too long (91 > 90) Done http://gerrit.cloudera.org:8080/#/c/13352/12/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java File fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java: http://gerrit.cloudera.org:8080/#/c/13352/12/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java@131 PS12, Line 131: throw new InternalException(msg, e); > As the class names are specified externally to Impala, this is one of the p Yes, any exception thrown from any hook during startup will kill the entire system. I documented this better in the javadoc for the hook interface, but I should add it to this class javadoc as well. Good suggestion to also log the error message; I erroneously assumed that the exception message/trace would automatically be logged somehow. http://gerrit.cloudera.org:8080/#/c/13352/12/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java@146 PS12, Line 146: throw new InternalException(msg, e); > What will happen to the system if this occurs? Will log. An exception from the hook here will kill startup. The thinking is that this is akin to a system configuration error, in which case I don't think we would want to continue. http://gerrit.cloudera.org:8080/#/c/13352/12/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java@178 PS12, Line 178: hookExecutor_.submit(() -> hook.postQueryExecute(context)); > Suppose the PostExecHook is failing, how will we know? Still WIP: I intend to log the exception and simply swallow it. (See design doc discussion and also QueryExecHook.postQueryExecute javadoc section on "Error Handling") It's not possible to fail the query since hook execution is async, and the only way (besides logging) I can think of to indicate to the user that a hook has failed is to either fail a subsequent query or trigger a daemon `CLEAN_EXIT_WITH_ERROR` in the backend. I'm not really familiar enough with Impala to understand the implications of either approach, so logging along with buyer-beware documentation in the interface is the route I am pursuing for now. This warrants more discussion, but I do believe/hope that better error-handling can be added later on without any redesign. -- To view, visit http://gerrit.cloudera.org:8080/13352 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1 Gerrit-Change-Number: 13352 Gerrit-PatchSet: 12 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Anonymous Coward (498) Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: radford nguyen Gerrit-Comment-Date: Sat, 18 May 2019 00:14:17 + Gerrit-HasComments: Yes
[Impala-ASF-CR] [WIP] IMPALA-8473: publish lineage info via hook
Hello Anonymous Coward (498), Fredy Wijaya, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13352 to look at the new patch set (#12). Change subject: [WIP] IMPALA-8473: publish lineage info via hook .. [WIP] IMPALA-8473: publish lineage info via hook This commit introduces a hook mechanism for publishing, lineage data specifically, but query information more generally, from Impala. Hooks can be consumed by downstream consumers (i.e. runtime dependencies) at supported places during Impala execution: - impalad startup - post-query execution The consumers are to be frontend Java dependencies intiated at runtime. The backend flag `query_exec_hook_classes` specifies a comma-separated list of hook consumer implementation classes that are instantiated and registered at impala start up. The backend flag `num_query_exec_hook_threads` specifies the number of threads to use for asynchronous hook execution. (Relevant if multiple hooks are registered.) Lineage information is passed from the backend after a query completes (but before it returns) and given to every hook to execute asynchronously. IOW, a query may complete and return to the user before any or all hooks have completed executing. Tests: TBD Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1 --- M be/src/service/frontend.cc M be/src/service/frontend.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M common/thrift/Frontend.thrift A fe/src/main/java/org/apache/impala/hooks/PostQueryHookContext.java A fe/src/main/java/org/apache/impala/hooks/QueryExecHook.java A fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java A fe/src/test/java/org/apache/impala/hooks/QueryExecHookManagerTest.java A fe/src/test/java/org/apache/impala/testutil/AlwaysErrorQueryExecHook.java A fe/src/test/java/org/apache/impala/testutil/DummyQueryExecHook.java 16 files changed, 581 insertions(+), 26 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/13352/12 -- To view, visit http://gerrit.cloudera.org:8080/13352 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1 Gerrit-Change-Number: 13352 Gerrit-PatchSet: 12 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Anonymous Coward (498) Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: radford nguyen
[Impala-ASF-CR] [WIP] IMPALA-8473: publish lineage info via hook
Hello Anonymous Coward (498), Fredy Wijaya, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13352 to look at the new patch set (#11). Change subject: [WIP] IMPALA-8473: publish lineage info via hook .. [WIP] IMPALA-8473: publish lineage info via hook This commit introduces a hook mechanism for publishing, lineage data specifically, but query information more generally, from Impala. Hooks can be consumed by downstream consumers (i.e. runtime dependencies) at supported places during Impala execution: - impalad startup - post-query execution The consumers are to be frontend Java dependencies intiated at runtime. The hadoop property `impala.post.exec.hooks` specifies a comma-separated list of hook consumer implementation classes that are instantiated and registered at impala start up. Lineage information is passed from the backend after a query completes (but before it returns) and given to every hook to execute asynchronously. IOW, a query may complete and return to the user before any or all hooks have completed executing. Tests: TBD Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1 --- M be/src/service/frontend.cc M be/src/service/frontend.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M common/thrift/Frontend.thrift A fe/src/main/java/org/apache/impala/hooks/PostQueryHookContext.java A fe/src/main/java/org/apache/impala/hooks/QueryExecHook.java A fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java A fe/src/test/java/org/apache/impala/hooks/QueryExecHookManagerTest.java A fe/src/test/java/org/apache/impala/testutil/AlwaysErrorQueryExecHook.java A fe/src/test/java/org/apache/impala/testutil/DummyQueryExecHook.java 16 files changed, 579 insertions(+), 26 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/13352/11 -- To view, visit http://gerrit.cloudera.org:8080/13352 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1 Gerrit-Change-Number: 13352 Gerrit-PatchSet: 11 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Anonymous Coward (498) Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: radford nguyen
[Impala-ASF-CR] [WIP] IMPALA-8473: publish lineage info via hook
radford nguyen has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/13352 ) Change subject: [WIP] IMPALA-8473: publish lineage info via hook .. [WIP] IMPALA-8473: publish lineage info via hook This commit introduces a hook mechanism for publishing, lineage data specifically, but query information more generally, from Impala. Hooks can be consumed by downstream consumers (i.e. runtime dependencies) at supported places during Impala execution: - impalad startup - post-query execution The consumers are to be frontend Java dependencies intiated at runtime. The hadoop property `impala.post.exec.hooks` specifies a comma-separated list of hook consumer implementation classes that are instantiated and registered at impala start up. Lineage information is passed from the backend after a query completes (but before it returns) and given to every hook to execute asynchronously. IOW, a query may complete and return to the user before any or all hooks have completed executing. Tests: TBD Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1 --- M be/src/service/frontend.cc M be/src/service/frontend.h M be/src/service/impala-server.cc A fe/src/main/java/org/apache/impala/hooks/PostQueryHookContext.java A fe/src/main/java/org/apache/impala/hooks/QueryExecHook.java A fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java A fe/src/test/java/org/apache/impala/hooks/QueryExecHookManagerTest.java 8 files changed, 436 insertions(+), 25 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/13352/7 -- To view, visit http://gerrit.cloudera.org:8080/13352 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1 Gerrit-Change-Number: 13352 Gerrit-PatchSet: 7 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Anonymous Coward (498) Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: radford nguyen
[Impala-ASF-CR] [WIP] IMPALA-8473: publish lineage info via hook
radford nguyen has posted comments on this change. ( http://gerrit.cloudera.org:8080/13352 ) Change subject: [WIP] IMPALA-8473: publish lineage info via hook .. Patch Set 7: (14 comments) http://gerrit.cloudera.org:8080/#/c/13352/6/be/src/service/frontend.h File be/src/service/frontend.h: http://gerrit.cloudera.org:8080/#/c/13352/6/be/src/service/frontend.h@190 PS6, Line 190: PostExecHook > nit: PostQueryExecHook is probably a better name? Done http://gerrit.cloudera.org:8080/#/c/13352/2/fe/src/main/java/org/apache/impala/hooks/ImpalaExecHook.java File fe/src/main/java/org/apache/impala/hooks/ImpalaExecHook.java: http://gerrit.cloudera.org:8080/#/c/13352/2/fe/src/main/java/org/apache/impala/hooks/ImpalaExecHook.java@36 PS2, Line 36: > It is dangerous to let plugin fail Impala in this way. Could impala catches This is during impala daemon start-up; is it not preferred to prevent start-up if there is a configuration problem? http://gerrit.cloudera.org:8080/#/c/13352/2/fe/src/main/java/org/apache/impala/hooks/ImpalaExecHook.java@47 PS2, Line 47: > "when a (qualifying) Impala query" should it be "after a (qualifying) Impal "has executed" already communicates that the point of time in question is in the past, so "after" is redundant. But I think it's clearer, so I'll change it to "after" http://gerrit.cloudera.org:8080/#/c/13352/2/fe/src/main/java/org/apache/impala/hooks/ImpalaExecHook.java@48 PS2, Line 48: > should "has executed" be "has been executed"? This is just a difference between active voice and passive voice. No difference in meaning http://gerrit.cloudera.org:8080/#/c/13352/2/fe/src/main/java/org/apache/impala/hooks/ImpalaExecHook.java@63 PS2, Line 63: > why don't you have cleanup function, which will be called by Impala when Im Because of the way impala shuts down, it will never be guaranteed to execute. This is covered and commented on in the design doc http://gerrit.cloudera.org:8080/#/c/13352/6/fe/src/main/java/org/apache/impala/hooks/PostQueryHookContext.java File fe/src/main/java/org/apache/impala/hooks/PostQueryHookContext.java: http://gerrit.cloudera.org:8080/#/c/13352/6/fe/src/main/java/org/apache/impala/hooks/PostQueryHookContext.java@28 PS6, Line 28: lineageGraph > nit: use _ suffix naming convention for member varaibles, i.e. lineageGraph kk http://gerrit.cloudera.org:8080/#/c/13352/6/fe/src/main/java/org/apache/impala/hooks/PostQueryHookContext.java@31 PS6, Line 31: Objects.requireNonNull(lineageGraph).deepCopy(); > I don't think we want to do a deep copy each time. Each time it's constructed? Or called? Because a single context object is passed to all the hook implementations http://gerrit.cloudera.org:8080/#/c/13352/6/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java File fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java: http://gerrit.cloudera.org:8080/#/c/13352/6/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java@33 PS6, Line 33: singleton > i'm not 100% convinced why this needs to be a singleton. Can Frontend owns I agree. But what about JniFrontend owning it? http://gerrit.cloudera.org:8080/#/c/13352/6/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java@78 PS6, Line 78: private static final int NUM_HOOK_THREADS = 3; > should this be configurable? Probably. I'd like to get the hook configuration sorted first before I tackle this though, since they should probably use the same config mechanism http://gerrit.cloudera.org:8080/#/c/13352/6/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java@97 PS6, Line 97: Configuration config > I don't think passing hadoop configuration is correct here. Impala doesn't Yeah that sounds better. Using a backend flag http://gerrit.cloudera.org:8080/#/c/13352/6/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java@98 PS6, Line 98: LOG.info("impalaStartup hook invoked with config: {}", config); > config can get pretty big, we don't want to print it on info log level The `toString` of the config object actually only prints the file names, not their content. http://gerrit.cloudera.org:8080/#/c/13352/6/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java@152 PS6, Line 152: @Override : protected void finalize() throws Throwable { : this.cleanUp(); : super.finalize(); : } > i don't know if it's a good idea do this in finalize(), maybe use Runtime.a Will do http://gerrit.cloudera.org:8080/#/c/13352/6/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java@175 PS6, Line 175: void > can we return a List instead of void in case we want to do somethin Do you intend on passing the futures back to the backend? I fully support at least storing the futures here in case we decide to take action later on, but since this is the "manager", i think that any course of action woul
[Impala-ASF-CR] [WIP] IMPALA-8473: publish lineage info via hook
radford nguyen has posted comments on this change. ( http://gerrit.cloudera.org:8080/13352 ) Change subject: [WIP] IMPALA-8473: publish lineage info via hook .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/13352/5/fe/src/main/java/org/apache/impala/hooks/ImpalaHookContext.java File fe/src/main/java/org/apache/impala/hooks/ImpalaHookContext.java: http://gerrit.cloudera.org:8080/#/c/13352/5/fe/src/main/java/org/apache/impala/hooks/ImpalaHookContext.java@31 PS5, Line 31: this.lineageGraph = Objects.requireNonNull(lineageGraph).deepCopy(); Lot of defensive copies because `TLineageGraph` is mutable, and context objects are given by reference to potentially multiple hooks. This initial copy may effect performance since it occurs even if no hooks are registered. This could be avoided if we provide an additional flag to disable hooks, independent of hook registration. Then we could avoid the JNI call and this constructor altogether. Feedback? -- To view, visit http://gerrit.cloudera.org:8080/13352 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1 Gerrit-Change-Number: 13352 Gerrit-PatchSet: 5 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Anonymous Coward (498) Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: radford nguyen Gerrit-Comment-Date: Thu, 16 May 2019 19:34:37 + Gerrit-HasComments: Yes
[Impala-ASF-CR] [WIP] IMPALA-8473: publish lineage info via hook
radford nguyen has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/13352 ) Change subject: [WIP] IMPALA-8473: publish lineage info via hook .. [WIP] IMPALA-8473: publish lineage info via hook This commit introduces a hook mechanism for publishing, lineage data specifically, but query information more generally, from Impala. Hooks can be consumed by downstream consumers (i.e. runtime dependencies) at supported places during Impala execution: - impalad startup - post-query execution The consumers are to be frontend Java dependencies intiated at runtime. The hadoop property `impala.post.exec.hooks` specifies a comma-separated list of hook consumer implementation classes that are instantiated and registered at impala start up. Lineage information is passed from the backend after a query completes (but before it returns) and given to every hook to execute asynchronously. IOW, a query may complete and return to the user before any or all hooks have completed executing. Tests: TBD Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1 --- M be/src/service/frontend.cc M be/src/service/frontend.h M be/src/service/impala-server.cc A fe/src/main/java/org/apache/impala/hooks/ImpalaExecHookManager.java A fe/src/main/java/org/apache/impala/hooks/ImpalaHookContext.java A fe/src/main/java/org/apache/impala/hooks/ImpalaQueryExecHook.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java 7 files changed, 361 insertions(+), 25 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/13352/5 -- To view, visit http://gerrit.cloudera.org:8080/13352 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1 Gerrit-Change-Number: 13352 Gerrit-PatchSet: 5 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Anonymous Coward (498) Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] [WIP] IMPALA-8473: publish lineage info via hook
radford nguyen has removed Todd Lipcon from this change. ( http://gerrit.cloudera.org:8080/13352 ) Change subject: [WIP] IMPALA-8473: publish lineage info via hook .. Removed reviewer Todd Lipcon. -- To view, visit http://gerrit.cloudera.org:8080/13352 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: deleteReviewer Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1 Gerrit-Change-Number: 13352 Gerrit-PatchSet: 3 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Anonymous Coward (498) Gerrit-Reviewer: Fredy Wijaya
[Impala-ASF-CR] [WIP] IMPALA-8473: publish lineage info via hook
radford nguyen has uploaded this change for review. ( http://gerrit.cloudera.org:8080/13352 Change subject: [WIP] IMPALA-8473: publish lineage info via hook .. [WIP] IMPALA-8473: publish lineage info via hook This commit introduces a hook mechanism for publishing, lineage data specifically, but query information more generally, from Impala. Hooks can be consumed by downstream consumers (i.e. runtime dependencies) at supported places during Impala execution: - impalad startup - post-query execution The consumers are to be frontend Java dependencies intiated at runtime. The hadoop property `impala.post.exec.hooks` specifies a comma-separated list of hook consumer implementation classes that are instantiated and registered at impala start up. Lineage information is passed from the backend after a query completes (but before it returns) and given to every hook to execute asynchronously. IOW, a query may complete and return to the user before any or all hooks have completed executing. Tests: TBD Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1 --- M be/src/service/frontend.cc M be/src/service/frontend.h M be/src/service/impala-server.cc A fe/src/main/java/org/apache/impala/hooks/ImpalaExecHook.java A fe/src/main/java/org/apache/impala/hooks/ImpalaExecHookManager.java A fe/src/main/java/org/apache/impala/hooks/ImpalaHookContext.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java 7 files changed, 361 insertions(+), 25 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/13352/3 -- To view, visit http://gerrit.cloudera.org:8080/13352 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1 Gerrit-Change-Number: 13352 Gerrit-PatchSet: 3 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Anonymous Coward (498) Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Todd Lipcon
[Impala-ASF-CR] IMPALA-8309: add user authorization provider flag
Hello Austin Nobis, Fredy Wijaya, Todd Lipcon, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/12901 to look at the new patch set (#10). Change subject: IMPALA-8309: add user authorization_provider flag .. IMPALA-8309: add user authorization_provider flag This commit adds a `authorization_provider` user-facing flag in order to provide a more human-readable alternative to the `authorization_factory_class` for internally-provided authorization strategies. The `authorization_factory_class` flag is retained, but no longer takes a default value if not specified. The default for `authorization_provider` is "sentry" in order to retain backwards-compatibility. If specified, `authorization_factory_class` will take precedence. Testing: Manually started minicluster with each of following flags and verified correct authorization strategy chosen: - provider='' factory='' => sentry - provider=sentry factory='' => sentry - provider=ranger factory='' => ranger - provider='' factory=sentry => sentry - provider='' factory=ranger => ranger - provider=sentry factory=sentry => sentry - provider=ranger factory=sentry => sentry - provider=sentry factory=ranger => ranger - provider=ranger factory=ranger => ranger Wrote unit tests to capture above assertions Ran fe unit and e2e tests Wrote e2e test to verify new flag behavior Change-Id: I001c20505ba4f0562b60fdef73d15308e8500c19 --- M be/src/service/frontend.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/authorization/AuthorizationConfig.java M fe/src/main/java/org/apache/impala/authorization/AuthorizationProvider.java M fe/src/main/java/org/apache/impala/authorization/NoopAuthorizationFactory.java M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationConfig.java M fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationConfig.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/JniCatalog.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java M fe/src/test/java/org/apache/impala/service/JniFrontendTest.java A tests/authorization/test_provider.py M tests/authorization/test_ranger.py 15 files changed, 304 insertions(+), 60 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/12901/10 -- To view, visit http://gerrit.cloudera.org:8080/12901 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I001c20505ba4f0562b60fdef73d15308e8500c19 Gerrit-Change-Number: 12901 Gerrit-PatchSet: 10 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: radford nguyen
[Impala-ASF-CR] IMPALA-8309: add user authorization provider flag
radford nguyen has posted comments on this change. ( http://gerrit.cloudera.org:8080/12901 ) Change subject: IMPALA-8309: add user authorization_provider flag .. Patch Set 9: (19 comments) http://gerrit.cloudera.org:8080/#/c/12901/9/fe/src/main/java/org/apache/impala/service/BackendConfig.java File fe/src/main/java/org/apache/impala/service/BackendConfig.java: http://gerrit.cloudera.org:8080/#/c/12901/9/fe/src/main/java/org/apache/impala/service/BackendConfig.java@141 PS9, Line 141: getAuthorizationFactoryClassOrNull > nit: remove the `orNull` in the function name because the annotation is already present or because you don't like the name? i think there should be some indication that the return value can be null, if we're not using Optional http://gerrit.cloudera.org:8080/#/c/12901/9/fe/src/main/java/org/apache/impala/service/JniCatalog.java File fe/src/main/java/org/apache/impala/service/JniCatalog.java: http://gerrit.cloudera.org:8080/#/c/12901/9/fe/src/main/java/org/apache/impala/service/JniCatalog.java@114 PS9, Line 114: //<<< 67f77d41d40523074385b8dbccfa6ef6ef81dd57 > Why is this commented out? ugh was sanity checking a merge conflict and forgot to delete http://gerrit.cloudera.org:8080/#/c/12901/5/tests/authorization/test_provider.py File tests/authorization/test_provider.py: http://gerrit.cloudera.org:8080/#/c/12901/5/tests/authorization/test_provider.py@21 PS5, Line 21: import os > flake8: F401 'time' imported but unused Ack http://gerrit.cloudera.org:8080/#/c/12901/5/tests/authorization/test_provider.py@25 PS5, Line 25: from tests.common.custom_cluster_test_suite import CustomClusterTestSuite > flake8: F401 'tests.util.filesystem_utils.IS_LOCAL' imported but unused Ack http://gerrit.cloudera.org:8080/#/c/12901/5/tests/authorization/test_provider.py@28 PS5, Line 28: class TestProviderFails(CustomClusterTestSuite): > flake8: F401 'tests.common.impala_test_suite.ImpalaTestSuite' imported but Ack http://gerrit.cloudera.org:8080/#/c/12901/5/tests/authorization/test_provider.py@38 PS5, Line 38: LOG_DIR = tempfile.mkdtemp(prefix="test_provider_", dir=os.getenv("LOG_DIR")) > line has trailing whitespace Ack http://gerrit.cloudera.org:8080/#/c/12901/5/tests/authorization/test_provider.py@39 PS5, Line 39: > flake8: E302 expected 2 blank lines, found 1 Ack http://gerrit.cloudera.org:8080/#/c/12901/5/tests/authorization/test_provider.py@53 PS5, Line 53: e > flake8: E502 the backslash is redundant between brackets Ack http://gerrit.cloudera.org:8080/#/c/12901/5/tests/authorization/test_provider.py@54 PS5, Line 54: > flake8: E502 the backslash is redundant between brackets Ack http://gerrit.cloudera.org:8080/#/c/12901/5/tests/authorization/test_provider.py@55 PS5, Line 55: a > flake8: E502 the backslash is redundant between brackets Ack http://gerrit.cloudera.org:8080/#/c/12901/5/tests/authorization/test_provider.py@58 PS5, Line 58: > flake8: E502 the backslash is redundant between brackets Ack http://gerrit.cloudera.org:8080/#/c/12901/5/tests/authorization/test_provider.py@59 PS5, Line 59: > flake8: E502 the backslash is redundant between brackets Ack http://gerrit.cloudera.org:8080/#/c/12901/5/tests/authorization/test_provider.py@60 PS5, Line 60: > flake8: E502 the backslash is redundant between brackets Ack http://gerrit.cloudera.org:8080/#/c/12901/5/tests/authorization/test_provider.py@66 PS5, Line 66: > flake8: E502 the backslash is redundant between brackets Ack http://gerrit.cloudera.org:8080/#/c/12901/5/tests/authorization/test_provider.py@67 PS5, Line 67: > flake8: E501 line too long (92 > 90 characters) Ack http://gerrit.cloudera.org:8080/#/c/12901/5/tests/authorization/test_provider.py@68 PS5, Line 68: > flake8: E124 closing bracket does not match visual indentation Ack http://gerrit.cloudera.org:8080/#/c/12901/6/tests/authorization/test_provider.py File tests/authorization/test_provider.py: http://gerrit.cloudera.org:8080/#/c/12901/6/tests/authorization/test_provider.py@60 PS6, Line 60: not the happiest with this approach; open to suggestions http://gerrit.cloudera.org:8080/#/c/12901/7/tests/authorization/test_provider.py File tests/authorization/test_provider.py: http://gerrit.cloudera.org:8080/#/c/12901/7/tests/authorization/test_provider.py@27 PS7, Line 27: > flake8: E302 expected 2 blank lines, found 1 Ack http://gerrit.cloudera.org:8080/#/c/12901/4/tests/authorization/test_ranger.py File tests/authorization/test_ranger.py: http://gerrit.cloudera.org:8080/#/c/12901/4/tests/authorization/test_ranger.py@38 PS4, Line 38: """ > Would prefer if there was an e2e test with an invalid as well as valid `--a Ack -- To view, visit http://gerrit.cloudera.org:8080/12901 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I001c20505ba4f0562b60fdef73d15308e8500c19 Gerrit-Cha
[Impala-ASF-CR] IMPALA-8309: add user authorization provider flag
Hello Austin Nobis, Fredy Wijaya, Todd Lipcon, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/12901 to look at the new patch set (#9). Change subject: IMPALA-8309: add user authorization_provider flag .. IMPALA-8309: add user authorization_provider flag This commit adds a `authorization_provider` user-facing flag in order to provide a more human-readable alternative to the `authorization_factory_class` for internally-provided authorization strategies. The `authorization_factory_class` flag is retained, but no longer takes a default value if not specified. The default for `authorization_provider` is "sentry" in order to retain backwards-compatibility. If specified, `authorization_factory_class` will take precedence. Testing: Manually started minicluster with each of following flags and verified correct authorization strategy chosen: - provider='' factory='' => sentry - provider=sentry factory='' => sentry - provider=ranger factory='' => ranger - provider='' factory=sentry => sentry - provider='' factory=ranger => ranger - provider=sentry factory=sentry => sentry - provider=ranger factory=sentry => sentry - provider=sentry factory=ranger => ranger - provider=ranger factory=ranger => ranger Wrote unit tests to capture above assertions Ran fe unit and e2e tests Wrote e2e test to verify new flag behavior Change-Id: I001c20505ba4f0562b60fdef73d15308e8500c19 --- M be/src/service/frontend.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/authorization/AuthorizationConfig.java M fe/src/main/java/org/apache/impala/authorization/AuthorizationProvider.java M fe/src/main/java/org/apache/impala/authorization/NoopAuthorizationFactory.java M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationConfig.java M fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationConfig.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/JniCatalog.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java M fe/src/test/java/org/apache/impala/service/JniFrontendTest.java A tests/authorization/test_provider.py M tests/authorization/test_ranger.py 15 files changed, 331 insertions(+), 61 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/12901/9 -- To view, visit http://gerrit.cloudera.org:8080/12901 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I001c20505ba4f0562b60fdef73d15308e8500c19 Gerrit-Change-Number: 12901 Gerrit-PatchSet: 9 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: radford nguyen
[Impala-ASF-CR] IMPALA-8309: add user authorization provider flag
Hello Austin Nobis, Fredy Wijaya, Todd Lipcon, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/12901 to look at the new patch set (#8). Change subject: IMPALA-8309: add user authorization_provider flag .. IMPALA-8309: add user authorization_provider flag This commit adds a `authorization_provider` user-facing flag in order to provide a more human-readable alternative to the `authorization_factory_class` for internally-provided authorization strategies. The `authorization_factory_class` flag is retained, but no longer takes a default value if not specified. The default for `authorization_provider` is "sentry" in order to retain backwards-compatibility. If specified, `authorization_factory_class` will take precedence. Testing: Manually started minicluster with each of following flags and verified correct authorization strategy chosen: - provider='' factory='' => sentry - provider=sentry factory='' => sentry - provider=ranger factory='' => ranger - provider='' factory=sentry => sentry - provider='' factory=ranger => ranger - provider=sentry factory=sentry => sentry - provider=ranger factory=sentry => sentry - provider=sentry factory=ranger => ranger - provider=ranger factory=ranger => ranger Wrote unit tests to capture above assertions Ran fe unit and e2e tests Wrote e2e test to verify new flag behavior Change-Id: I001c20505ba4f0562b60fdef73d15308e8500c19 --- M be/src/service/frontend.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/authorization/AuthorizationConfig.java M fe/src/main/java/org/apache/impala/authorization/AuthorizationProvider.java M fe/src/main/java/org/apache/impala/authorization/NoneAuthorizationFactory.java M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationConfig.java M fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationConfig.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/JniCatalog.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java M fe/src/test/java/org/apache/impala/service/JniFrontendTest.java A tests/authorization/test_provider.py M tests/authorization/test_ranger.py 16 files changed, 306 insertions(+), 61 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/12901/8 -- To view, visit http://gerrit.cloudera.org:8080/12901 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I001c20505ba4f0562b60fdef73d15308e8500c19 Gerrit-Change-Number: 12901 Gerrit-PatchSet: 8 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: radford nguyen
[Impala-ASF-CR] IMPALA-8309: add user authorization provider flag
Hello Austin Nobis, Fredy Wijaya, Todd Lipcon, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/12901 to look at the new patch set (#6). Change subject: IMPALA-8309: add user authorization_provider flag .. IMPALA-8309: add user authorization_provider flag This commit adds a `authorization_provider` user-facing flag in order to provide a more human-readable alternative to the `authorization_factory_class` for internally-provided authorization strategies. The `authorization_factory_class` flag is retained, but no longer takes a default value if not specified. The default for `authorization_provider` is "sentry" in order to retain backwards-compatibility. If specified, `authorization_factory_class` will take precedence. Testing: Manually started minicluster with each of following flags and verified correct authorization strategy chosen: - provider='' factory='' => sentry - provider=sentry factory='' => sentry - provider=ranger factory='' => ranger - provider='' factory=sentry => sentry - provider='' factory=ranger => ranger - provider=sentry factory=sentry => sentry - provider=ranger factory=sentry => sentry - provider=sentry factory=ranger => ranger - provider=ranger factory=ranger => ranger Wrote unit tests to capture above assertions Ran fe unit and e2e tests Wrote e2e test to verify new flag behavior Change-Id: I001c20505ba4f0562b60fdef73d15308e8500c19 --- M be/src/service/frontend.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/authorization/AuthorizationConfig.java M fe/src/main/java/org/apache/impala/authorization/AuthorizationProvider.java M fe/src/main/java/org/apache/impala/authorization/NoneAuthorizationFactory.java M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationConfig.java M fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationConfig.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/JniCatalog.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java M fe/src/test/java/org/apache/impala/service/JniFrontendTest.java A tests/authorization/test_provider.py M tests/authorization/test_ranger.py 16 files changed, 313 insertions(+), 61 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/12901/6 -- To view, visit http://gerrit.cloudera.org:8080/12901 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I001c20505ba4f0562b60fdef73d15308e8500c19 Gerrit-Change-Number: 12901 Gerrit-PatchSet: 6 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: radford nguyen
[Impala-ASF-CR] IMPALA-8309: add user authorization provider flag
Hello Austin Nobis, Fredy Wijaya, Todd Lipcon, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/12901 to look at the new patch set (#7). Change subject: IMPALA-8309: add user authorization_provider flag .. IMPALA-8309: add user authorization_provider flag This commit adds a `authorization_provider` user-facing flag in order to provide a more human-readable alternative to the `authorization_factory_class` for internally-provided authorization strategies. The `authorization_factory_class` flag is retained, but no longer takes a default value if not specified. The default for `authorization_provider` is "sentry" in order to retain backwards-compatibility. If specified, `authorization_factory_class` will take precedence. Testing: Manually started minicluster with each of following flags and verified correct authorization strategy chosen: - provider='' factory='' => sentry - provider=sentry factory='' => sentry - provider=ranger factory='' => ranger - provider='' factory=sentry => sentry - provider='' factory=ranger => ranger - provider=sentry factory=sentry => sentry - provider=ranger factory=sentry => sentry - provider=sentry factory=ranger => ranger - provider=ranger factory=ranger => ranger Wrote unit tests to capture above assertions Ran fe unit and e2e tests Wrote e2e test to verify new flag behavior Change-Id: I001c20505ba4f0562b60fdef73d15308e8500c19 --- M be/src/service/frontend.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/authorization/AuthorizationConfig.java M fe/src/main/java/org/apache/impala/authorization/AuthorizationProvider.java M fe/src/main/java/org/apache/impala/authorization/NoneAuthorizationFactory.java M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationConfig.java M fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationConfig.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/JniCatalog.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java M fe/src/test/java/org/apache/impala/service/JniFrontendTest.java A tests/authorization/test_provider.py M tests/authorization/test_ranger.py 16 files changed, 305 insertions(+), 61 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/12901/7 -- To view, visit http://gerrit.cloudera.org:8080/12901 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I001c20505ba4f0562b60fdef73d15308e8500c19 Gerrit-Change-Number: 12901 Gerrit-PatchSet: 7 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: radford nguyen
[Impala-ASF-CR] IMPALA-8309: add user authorization provider flag
Hello Austin Nobis, Fredy Wijaya, Todd Lipcon, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/12901 to look at the new patch set (#5). Change subject: IMPALA-8309: add user authorization_provider flag .. IMPALA-8309: add user authorization_provider flag This commit adds a `authorization_provider` user-facing flag in order to provide a more human-readable alternative to the `authorization_factory_class` for internally-provided authorization strategies. The `authorization_factory_class` flag is retained, but no longer takes a default value if not specified. The default for `authorization_provider` is "sentry" in order to retain backwards-compatibility. If specified, `authorization_factory_class` will take precedence. Testing: Manually started minicluster with each of following flags and verified correct authorization strategy chosen: - provider='' factory='' => sentry - provider=sentry factory='' => sentry - provider=ranger factory='' => ranger - provider='' factory=sentry => sentry - provider='' factory=ranger => ranger - provider=sentry factory=sentry => sentry - provider=ranger factory=sentry => sentry - provider=sentry factory=ranger => ranger - provider=ranger factory=ranger => ranger Wrote unit tests to capture above assertions Ran fe unit and e2e tests Wrote e2e test to verify new flag behavior Change-Id: I001c20505ba4f0562b60fdef73d15308e8500c19 --- M be/src/service/frontend.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/authorization/AuthorizationConfig.java M fe/src/main/java/org/apache/impala/authorization/AuthorizationProvider.java M fe/src/main/java/org/apache/impala/authorization/NoneAuthorizationFactory.java M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationConfig.java M fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationConfig.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/JniCatalog.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java M fe/src/test/java/org/apache/impala/service/JniFrontendTest.java A tests/authorization/test_provider.py M tests/authorization/test_ranger.py 16 files changed, 315 insertions(+), 61 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/12901/5 -- To view, visit http://gerrit.cloudera.org:8080/12901 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I001c20505ba4f0562b60fdef73d15308e8500c19 Gerrit-Change-Number: 12901 Gerrit-PatchSet: 5 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: radford nguyen
[Impala-ASF-CR] IMPALA-8309: add user authorization provider flag
radford nguyen has posted comments on this change. ( http://gerrit.cloudera.org:8080/12901 ) Change subject: IMPALA-8309: add user authorization_provider flag .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/12901/2/fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationConfig.java File fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationConfig.java: http://gerrit.cloudera.org:8080/#/c/12901/2/fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationConfig.java@130 PS2, Line 130: public String getProviderName() { return "sentry"; } > I dunno, in this case the magic string is returned as the implementation of IOW, if we provide a constant, that just means that there will be multiple paths to getting the same information and they could potentially and unintentionally diverge -- To view, visit http://gerrit.cloudera.org:8080/12901 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I001c20505ba4f0562b60fdef73d15308e8500c19 Gerrit-Change-Number: 12901 Gerrit-PatchSet: 4 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: radford nguyen Gerrit-Comment-Date: Thu, 11 Apr 2019 04:45:10 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8363: Fix E2E start with impala log dir
radford nguyen has posted comments on this change. ( http://gerrit.cloudera.org:8080/12935 ) Change subject: IMPALA-8363: Fix E2E start with impala_log_dir .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/12935/1/tests/common/custom_cluster_test_suite.py File tests/common/custom_cluster_test_suite.py: http://gerrit.cloudera.org:8080/#/c/12935/1/tests/common/custom_cluster_test_suite.py@147 PS1, Line 147: kwargs = { : "cluster_size": cluster_size, : "num_coordinators": cluster_size, : "expected_num_executors": cluster_size, : "default_query_options": method.func_dict.get(DEFAULT_QUERY_OPTIONS) : } : if IMPALA_LOG_DIR in method.func_dict: : kwargs["impala_log_dir"] = method.func_dict[IMPALA_LOG_DIR] : self._start_impala_cluster(cluster_args, **kwargs) : super(CustomClusterTestSuite, self).setup_ > Agree with Fredy. Could also do something clever with kwargs, e.g. I think the kwargs approach is better because then `start_impala_cluster` can still be "in charge" of defaulting the `impala_log_dir` option -- To view, visit http://gerrit.cloudera.org:8080/12935 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4f46f40474b4b380abe88647a37e8e4d2231d745 Gerrit-Change-Number: 12935 Gerrit-PatchSet: 2 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: radford nguyen Gerrit-Comment-Date: Thu, 11 Apr 2019 04:27:06 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8363: Fix E2E start with impala log dir
Hello Austin Nobis, Fredy Wijaya, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/12935 to look at the new patch set (#2). Change subject: IMPALA-8363: Fix E2E start with impala_log_dir .. IMPALA-8363: Fix E2E start with impala_log_dir This commit fixes the `CustomClusterTestSuite` to wait for the correct number of executors when `impala_log_dir` is specified in the test decorator. Previously, the default value of 3 was always used, regardless of `cluster_size`. Testing: - Manual verification using tests/authorization/test_ranger.py with custom `impala_log_dir` and `cluster_size` arguments. Failed before changes, passed after changes - Ran all original E2E tests Change-Id: I4f46f40474b4b380abe88647a37e8e4d2231d745 --- M tests/common/custom_cluster_test_suite.py 1 file changed, 8 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/12935/2 -- To view, visit http://gerrit.cloudera.org:8080/12935 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I4f46f40474b4b380abe88647a37e8e4d2231d745 Gerrit-Change-Number: 12935 Gerrit-PatchSet: 2 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: radford nguyen
[Impala-ASF-CR] IMPALA-8363: fix e2e start with impala log dir
radford nguyen has posted comments on this change. ( http://gerrit.cloudera.org:8080/12935 ) Change subject: IMPALA-8363: fix e2e start with impala_log_dir .. Patch Set 1: > Patch Set 1: > > (1 comment) I think the kwargs approach is better because then `start_impala_cluster` can still be "in charge" of defaulting the `impala_log_dir` option -- To view, visit http://gerrit.cloudera.org:8080/12935 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4f46f40474b4b380abe88647a37e8e4d2231d745 Gerrit-Change-Number: 12935 Gerrit-PatchSet: 1 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: radford nguyen Gerrit-Comment-Date: Tue, 09 Apr 2019 18:44:06 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8363: fix e2e start with impala log dir
radford nguyen has uploaded this change for review. ( http://gerrit.cloudera.org:8080/12935 Change subject: IMPALA-8363: fix e2e start with impala_log_dir .. IMPALA-8363: fix e2e start with impala_log_dir This commit fixes the `CustomClusterTestSuite` to wait for the correct number of executors when `impala_log_dir` is specified in the test decorator. Previously, the default value of 3 was always used, regardless of `cluster_size`. Testing: - Ran tests/authorization/test_ranger.py with custom `impala_log_dir` and `cluster_size` arguments and observed pass. Observed to fail before changes - Ran all original e2e tests Change-Id: I4f46f40474b4b380abe88647a37e8e4d2231d745 --- M tests/common/custom_cluster_test_suite.py 1 file changed, 2 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/12935/1 -- To view, visit http://gerrit.cloudera.org:8080/12935 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I4f46f40474b4b380abe88647a37e8e4d2231d745 Gerrit-Change-Number: 12935 Gerrit-PatchSet: 1 Gerrit-Owner: radford nguyen
[Impala-ASF-CR] IMPALA-8309: add user authorization provider flag
radford nguyen has posted comments on this change. ( http://gerrit.cloudera.org:8080/12901 ) Change subject: IMPALA-8309: add user authorization_provider flag .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/12901/2/fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationConfig.java File fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationConfig.java: http://gerrit.cloudera.org:8080/#/c/12901/2/fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationConfig.java@130 PS2, Line 130: public String getProviderName() { return "sentry"; } > Can we make "sentry" a public static final String? I dunno, in this case the magic string is returned as the implementation of the abstract `getProviderName` method. IOW, the method itself kind of acts as the public contract for getting the name, and any other entity which wants to get that value should be calling the method (as opposed to accessing a public static constant). -- To view, visit http://gerrit.cloudera.org:8080/12901 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I001c20505ba4f0562b60fdef73d15308e8500c19 Gerrit-Change-Number: 12901 Gerrit-PatchSet: 4 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: radford nguyen Gerrit-Comment-Date: Thu, 04 Apr 2019 20:24:10 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8226: Add grant/revoke to/from group for Ranger
radford nguyen has posted comments on this change. ( http://gerrit.cloudera.org:8080/12914 ) Change subject: IMPALA-8226: Add grant/revoke to/from group for Ranger .. Patch Set 11: (3 comments) http://gerrit.cloudera.org:8080/#/c/12914/7/fe/src/main/java/org/apache/impala/authorization/ranger/RangerCatalogdAuthorizationManager.java File fe/src/main/java/org/apache/impala/authorization/ranger/RangerCatalogdAuthorizationManager.java: http://gerrit.cloudera.org:8080/#/c/12914/7/fe/src/main/java/org/apache/impala/authorization/ranger/RangerCatalogdAuthorizationManager.java@193 PS7, Line 193: String user, List groups, String clusterName, List privileges) { nit: We could probably use a `Collection groups` to be more general here, since the group's items are copied into a `List` when creating the request. Same with `privileges`. http://gerrit.cloudera.org:8080/#/c/12914/7/fe/src/main/java/org/apache/impala/authorization/ranger/RangerCatalogdAuthorizationManager.java@233 PS7, Line 233: if (!groups.isEmpty()) request.getGroups().addAll(groups); nit: is the `if` statement really necessary given the contract of `addAll`? http://gerrit.cloudera.org:8080/#/c/12914/10/fe/src/main/java/org/apache/impala/authorization/sentry/SentryImpaladAuthorizationManager.java File fe/src/main/java/org/apache/impala/authorization/sentry/SentryImpaladAuthorizationManager.java: http://gerrit.cloudera.org:8080/#/c/12914/10/fe/src/main/java/org/apache/impala/authorization/sentry/SentryImpaladAuthorizationManager.java@170 PS10, Line 170: "%s is not supported in Impalad", ClassUtil.getMethodName())); Isn't it more accurate to say that this isn't supported with sentry? -- To view, visit http://gerrit.cloudera.org:8080/12914 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I28b7b3e4c776ad1bb5bdc184c7d733d0b5ef5e96 Gerrit-Change-Number: 12914 Gerrit-PatchSet: 11 Gerrit-Owner: Austin Nobis Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: radford nguyen Gerrit-Comment-Date: Thu, 04 Apr 2019 19:57:09 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8309: add user authorization provider flag
radford nguyen has posted comments on this change. ( http://gerrit.cloudera.org:8080/12901 ) Change subject: IMPALA-8309: add user authorization_provider flag .. Patch Set 4: (1 comment) Modified existing e2e tests to use new `authorization_provider` flag http://gerrit.cloudera.org:8080/#/c/12901/4/tests/authorization/test_ranger.py File tests/authorization/test_ranger.py: http://gerrit.cloudera.org:8080/#/c/12901/4/tests/authorization/test_ranger.py@38 PS4, Line 38: "--authorization_provider=ranger") I changed these to use the new flag as I imagine this is the preferred way to select internally-supported auth providers. I didn't see any existing e2e tests for the `authorization_factory_class` flag, otherwise I would have duplicated them using the `authorization_provider` flag. Up to this point, looks like we only have full test coverage of authorization selection behavior in the fe, and I'm wondering if this is enough -- To view, visit http://gerrit.cloudera.org:8080/12901 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I001c20505ba4f0562b60fdef73d15308e8500c19 Gerrit-Change-Number: 12901 Gerrit-PatchSet: 4 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: radford nguyen Gerrit-Comment-Date: Thu, 04 Apr 2019 04:30:17 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8309: add user authorization provider flag
Hello Austin Nobis, Fredy Wijaya, Todd Lipcon, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/12901 to look at the new patch set (#4). Change subject: IMPALA-8309: add user authorization_provider flag .. IMPALA-8309: add user authorization_provider flag This commit adds a `authorization_provider` user-facing flag in order to provide a more human-readable alternative to the `authorization_factory_class` for internally-provided authorization strategies. The `authorization_factory_class` flag is retained, but no longer takes a default value if not specified. The default for `authorization_provider` is "sentry" in order to retain backwards-compatibility. If specified, `authorization_factory_class` will take precedence. Testing: Manually started minicluster with each of following flags and verified correct authorization strategy chosen: - provider='' factory='' => sentry - provider=sentry factory='' => sentry - provider=ranger factory='' => ranger - provider='' factory=sentry => sentry - provider='' factory=ranger => ranger - provider=sentry factory=sentry => sentry - provider=ranger factory=sentry => sentry - provider=sentry factory=ranger => ranger - provider=ranger factory=ranger => ranger Wrote unit tests to capture above assertions Ran fe unit and e2e tests Change-Id: I001c20505ba4f0562b60fdef73d15308e8500c19 --- M be/src/service/frontend.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/authorization/AuthorizationConfig.java M fe/src/main/java/org/apache/impala/authorization/AuthorizationProvider.java M fe/src/main/java/org/apache/impala/authorization/NoneAuthorizationFactory.java M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationConfig.java M fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationConfig.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/JniCatalog.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java M fe/src/test/java/org/apache/impala/service/JniFrontendTest.java M tests/authorization/test_ranger.py 15 files changed, 229 insertions(+), 61 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/12901/4 -- To view, visit http://gerrit.cloudera.org:8080/12901 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I001c20505ba4f0562b60fdef73d15308e8500c19 Gerrit-Change-Number: 12901 Gerrit-PatchSet: 4 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: radford nguyen
[Impala-ASF-CR] IMPALA-8309: add user authorization provider flag
Hello Austin Nobis, Fredy Wijaya, Todd Lipcon, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/12901 to look at the new patch set (#3). Change subject: IMPALA-8309: add user authorization_provider flag .. IMPALA-8309: add user authorization_provider flag This commit adds a `authorization_provider` user-facing flag in order to provide a more human-readable alternative to the `authorization_factory_class` for internally-provided authorization strategies. The `authorization_factory_class` flag is retained, but no longer takes a default value if not specified. The default for `authorization_provider` is "sentry" in order to retain backwards-compatibility. If specified, `authorization_factory_class` will take precedence. Testing: Manually started minicluster with each of following flags and verified correct authorization strategy chosen: - provider='' factory='' => sentry - provider=sentry factory='' => sentry - provider=ranger factory='' => ranger - provider='' factory=sentry => sentry - provider='' factory=ranger => ranger - provider=sentry factory=sentry => sentry - provider=ranger factory=sentry => sentry - provider=sentry factory=ranger => ranger - provider=ranger factory=ranger => ranger Wrote unit tests to capture above assertions Ran fe unit and e2e tests Change-Id: I001c20505ba4f0562b60fdef73d15308e8500c19 --- M be/src/service/frontend.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/authorization/AuthorizationConfig.java M fe/src/main/java/org/apache/impala/authorization/AuthorizationProvider.java M fe/src/main/java/org/apache/impala/authorization/NoneAuthorizationFactory.java M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationConfig.java M fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationConfig.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/JniCatalog.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java M fe/src/test/java/org/apache/impala/service/JniFrontendTest.java 14 files changed, 227 insertions(+), 57 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/12901/3 -- To view, visit http://gerrit.cloudera.org:8080/12901 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I001c20505ba4f0562b60fdef73d15308e8500c19 Gerrit-Change-Number: 12901 Gerrit-PatchSet: 3 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: radford nguyen
[Impala-ASF-CR] IMPALA-8309: add user authorization provider flag
radford nguyen has posted comments on this change. ( http://gerrit.cloudera.org:8080/12901 ) Change subject: IMPALA-8309: add user authorization_provider flag .. Patch Set 2: (8 comments) Addressed nits; adding e2e tests next http://gerrit.cloudera.org:8080/#/c/12901/2/be/src/service/frontend.cc File be/src/service/frontend.cc: http://gerrit.cloudera.org:8080/#/c/12901/2/be/src/service/frontend.cc@41 PS2, Line 41: "sentry", > nit: move default to line above like other DEFINE Honest question: why? I don't really see consistent usage in this file; for example `DEFINE_string(ranger_app_id` begins the description string on a newline but `DEFINE_string(server_name` begins the description on the same line... even though both description strings end up wrapping. I think the current approach is perfectly readable, so I don't understand the point of these 3 nits. http://gerrit.cloudera.org:8080/#/c/12901/2/fe/src/main/java/org/apache/impala/authorization/AuthorizationProvider.java File fe/src/main/java/org/apache/impala/authorization/AuthorizationProvider.java: http://gerrit.cloudera.org:8080/#/c/12901/2/fe/src/main/java/org/apache/impala/authorization/AuthorizationProvider.java@31 PS2, Line 31: factoryClass > Should this be `factoryClassName`? Yeah that's a better name. I was waffling back and forth between using the actual `Class` object vs. the canonical name. http://gerrit.cloudera.org:8080/#/c/12901/2/fe/src/main/java/org/apache/impala/service/BackendConfig.java File fe/src/main/java/org/apache/impala/service/BackendConfig.java: http://gerrit.cloudera.org:8080/#/c/12901/2/fe/src/main/java/org/apache/impala/service/BackendConfig.java@141 PS2, Line 141: public @Nullable String getAuthorizationFactoryClassOrNull() { > nit: I'm not opposed to adding the @Nullable annotation and adding `orNull` That's the thing, I wasn't able to find another example in the codebase for handling optional flags. If you could point one out I'll make sure this conforms http://gerrit.cloudera.org:8080/#/c/12901/2/fe/src/main/java/org/apache/impala/service/JniFrontend.java File fe/src/main/java/org/apache/impala/service/JniFrontend.java: http://gerrit.cloudera.org:8080/#/c/12901/2/fe/src/main/java/org/apache/impala/service/JniFrontend.java@773 PS2, Line 773: * @param beCfg :* @return :* @throws InternalException > Finish the documentation Done http://gerrit.cloudera.org:8080/#/c/12901/2/fe/src/main/java/org/apache/impala/service/JniFrontend.java@778 PS2, Line 778: throws InternalException { > nit: indent 4 Done http://gerrit.cloudera.org:8080/#/c/12901/2/fe/src/main/java/org/apache/impala/service/JniFrontend.java@794 PS2, Line 794: +authzProvider > nit: add space around the `+` Done http://gerrit.cloudera.org:8080/#/c/12901/2/fe/src/main/java/org/apache/impala/service/JniFrontend.java@812 PS2, Line 812: throws InternalException { > nit: indent 4 Done http://gerrit.cloudera.org:8080/#/c/12901/2/fe/src/main/java/org/apache/impala/service/JniFrontend.java@822 PS2, Line 822: +authzFactoryClassName > nit: add space around the `+` Done -- To view, visit http://gerrit.cloudera.org:8080/12901 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I001c20505ba4f0562b60fdef73d15308e8500c19 Gerrit-Change-Number: 12901 Gerrit-PatchSet: 2 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: radford nguyen Gerrit-Comment-Date: Thu, 04 Apr 2019 02:08:02 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8309: add user authorization provider flag
radford nguyen has posted comments on this change. ( http://gerrit.cloudera.org:8080/12901 ) Change subject: IMPALA-8309: add user authorization_provider flag .. Patch Set 2: (10 comments) > Patch Set 1: > > (8 comments) http://gerrit.cloudera.org:8080/#/c/12901/2/fe/src/main/java/org/apache/impala/service/BackendConfig.java File fe/src/main/java/org/apache/impala/service/BackendConfig.java: http://gerrit.cloudera.org:8080/#/c/12901/2/fe/src/main/java/org/apache/impala/service/BackendConfig.java@141 PS2, Line 141: public @Nullable String getAuthorizationFactoryClassOrNull() { Not sure if there's a standard for handling optional flags... at least I didn't really see one. We could alternatively return an `Optional` (generally my preference) or just return the raw value that the be populates and let callers decide what is "present" or not. http://gerrit.cloudera.org:8080/#/c/12901/1/fe/src/main/java/org/apache/impala/service/JniCatalog.java File fe/src/main/java/org/apache/impala/service/JniCatalog.java: http://gerrit.cloudera.org:8080/#/c/12901/1/fe/src/main/java/org/apache/impala/service/JniCatalog.java@113 PS1, Line 113: final AuthorizationFactory authzFactory > line too long (99 > 90) Ack http://gerrit.cloudera.org:8080/#/c/12901/1/fe/src/test/java/org/apache/impala/service/JniFrontendTest.java File fe/src/test/java/org/apache/impala/service/JniFrontendTest.java: http://gerrit.cloudera.org:8080/#/c/12901/1/fe/src/test/java/org/apache/impala/service/JniFrontendTest.java@92 PS1, Line 92: Class authorization_factory_class, > line too long (105 > 90) Ack http://gerrit.cloudera.org:8080/#/c/12901/1/fe/src/test/java/org/apache/impala/service/JniFrontendTest.java@93 PS1, Line 93: String authorization_provider) { > line too long (91 > 90) Ack http://gerrit.cloudera.org:8080/#/c/12901/1/fe/src/test/java/org/apache/impala/service/JniFrontendTest.java@110 PS1, Line 110: JniFrontend.authzFactoryClassNameFrom(authCfg("", "ranger")) > line too long (102 > 90) Ack http://gerrit.cloudera.org:8080/#/c/12901/1/fe/src/test/java/org/apache/impala/service/JniFrontendTest.java@113 PS1, Line 113: > line too long (98 > 90) Ack http://gerrit.cloudera.org:8080/#/c/12901/1/fe/src/test/java/org/apache/impala/service/JniFrontendTest.java@118 PS1, Line 118: SentryAuthorizationFactory.class.getCanonicalName(), > line too long (98 > 90) Ack http://gerrit.cloudera.org:8080/#/c/12901/1/fe/src/test/java/org/apache/impala/service/JniFrontendTest.java@123 PS1, Line 123: > line too long (98 > 90) Ack http://gerrit.cloudera.org:8080/#/c/12901/1/fe/src/test/java/org/apache/impala/service/JniFrontendTest.java@128 PS1, Line 128: ) > line too long (98 > 90) Ack http://gerrit.cloudera.org:8080/#/c/12901/2/fe/src/test/java/org/apache/impala/service/JniFrontendTest.java File fe/src/test/java/org/apache/impala/service/JniFrontendTest.java: http://gerrit.cloudera.org:8080/#/c/12901/2/fe/src/test/java/org/apache/impala/service/JniFrontendTest.java@47 PS2, Line 47: origFlags = BackendConfig.INSTANCE.getBackendCfg(); This seems to be what other tests do -- To view, visit http://gerrit.cloudera.org:8080/12901 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I001c20505ba4f0562b60fdef73d15308e8500c19 Gerrit-Change-Number: 12901 Gerrit-PatchSet: 2 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: radford nguyen Gerrit-Comment-Date: Mon, 01 Apr 2019 09:38:38 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8309: add user authorization provider flag
Hello Austin Nobis, Fredy Wijaya, Todd Lipcon, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/12901 to look at the new patch set (#2). Change subject: IMPALA-8309: add user authorization_provider flag .. IMPALA-8309: add user authorization_provider flag This commit adds a `authorization_provider` user-facing flag in order to provide a more human-readable alternative to the `authorization_factory_class` for internally-provided authorization strategies. The `authorization_factory_class` flag is retained, but no longer takes a default value if not specified. The default for `authorization_provider` is "sentry" in order to retain backwards-compatibility. If specified, `authorization_factory_class` will take precedence. Testing: Manually started minicluster with each of following flags and verified correct authorization strategy chosen: - provider='' factory='' => sentry - provider=sentry factory='' => sentry - provider=ranger factory='' => ranger - provider='' factory=sentry => sentry - provider='' factory=ranger => ranger - provider=sentry factory=sentry => sentry - provider=ranger factory=sentry => sentry - provider=sentry factory=ranger => ranger - provider=ranger factory=ranger => ranger Wrote unit tests to capture above assertions Ran fe unit and e2e tests Change-Id: I001c20505ba4f0562b60fdef73d15308e8500c19 --- M be/src/service/frontend.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/authorization/AuthorizationConfig.java M fe/src/main/java/org/apache/impala/authorization/AuthorizationProvider.java M fe/src/main/java/org/apache/impala/authorization/NoneAuthorizationFactory.java M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationConfig.java M fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationConfig.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/JniCatalog.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java M fe/src/test/java/org/apache/impala/service/JniFrontendTest.java 14 files changed, 217 insertions(+), 57 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/12901/2 -- To view, visit http://gerrit.cloudera.org:8080/12901 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I001c20505ba4f0562b60fdef73d15308e8500c19 Gerrit-Change-Number: 12901 Gerrit-PatchSet: 2 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: radford nguyen
[Impala-ASF-CR] IMPALA-8309: add user authorization provider flag
radford nguyen has posted comments on this change. ( http://gerrit.cloudera.org:8080/12901 ) Change subject: IMPALA-8309: add user authorization_provider flag .. Patch Set 1: Hi guys, relatively simply ticket here to add a more human-friendly flag for choosing authorization provider(s) -- To view, visit http://gerrit.cloudera.org:8080/12901 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I001c20505ba4f0562b60fdef73d15308e8500c19 Gerrit-Change-Number: 12901 Gerrit-PatchSet: 1 Gerrit-Owner: radford nguyen Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: radford nguyen Gerrit-Comment-Date: Mon, 01 Apr 2019 05:24:24 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8309: add user authorization provider flag
radford nguyen has uploaded this change for review. ( http://gerrit.cloudera.org:8080/12901 Change subject: IMPALA-8309: add user authorization_provider flag .. IMPALA-8309: add user authorization_provider flag This commit adds a `authorization_provider` user-facing flag in order to provide a more human-readable alternative to the `authorization_factory_class` for internally-provided authorization strategies. The `authorization_factory_class` flag is retained, but no longer takes a default value if not specified. The default for `authorization_provider` is "sentry" in order to retain backwards-compatibility. If specified, `authorization_factory_class` will take precedence. Testing: Manually started minicluster with each of following flags and verified correct authorization strategy chosen: - provider='' factory='' => sentry - provider=sentry factory='' => sentry - provider=ranger factory='' => ranger - provider='' factory=sentry => sentry - provider='' factory=ranger => ranger - provider=sentry factory=sentry => sentry - provider=ranger factory=sentry => sentry - provider=sentry factory=ranger => ranger - provider=ranger factory=ranger => ranger Wrote unit tests to capture above assertions Ran fe unit and e2e tests Change-Id: I001c20505ba4f0562b60fdef73d15308e8500c19 --- M be/src/service/frontend.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/authorization/AuthorizationConfig.java M fe/src/main/java/org/apache/impala/authorization/AuthorizationProvider.java M fe/src/main/java/org/apache/impala/authorization/NoneAuthorizationFactory.java M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationConfig.java M fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationConfig.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/JniCatalog.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java M fe/src/test/java/org/apache/impala/service/JniFrontendTest.java 14 files changed, 201 insertions(+), 57 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/12901/1 -- To view, visit http://gerrit.cloudera.org:8080/12901 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I001c20505ba4f0562b60fdef73d15308e8500c19 Gerrit-Change-Number: 12901 Gerrit-PatchSet: 1 Gerrit-Owner: radford nguyen
[Impala-ASF-CR] IMPALA-7917 (Part 2): Decouple Sentry from Impala
radford nguyen has posted comments on this change. ( http://gerrit.cloudera.org:8080/12542 ) Change subject: IMPALA-7917 (Part 2): Decouple Sentry from Impala .. Patch Set 12: (1 comment) hi @fredy, just adding a teeny little comment http://gerrit.cloudera.org:8080/#/c/12542/2/fe/src/main/java/org/apache/impala/analysis/CopyTestCaseStmt.java File fe/src/main/java/org/apache/impala/analysis/CopyTestCaseStmt.java: http://gerrit.cloudera.org:8080/#/c/12542/2/fe/src/main/java/org/apache/impala/analysis/CopyTestCaseStmt.java@121 PS2, Line 121: analyzer.registerPrivReq(builder -> > I like the last one. Will update. I also prefer the last one because it makes it easy to see which methods are called on `builder` at-a-glance. With the one-liner, method calls on other objects (e.g. arguments) produce noise -- To view, visit http://gerrit.cloudera.org:8080/12542 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I051d06451fb14ba2ffb3f7e0d4aae37dee55ab86 Gerrit-Change-Number: 12542 Gerrit-PatchSet: 12 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: radford nguyen Gerrit-Comment-Date: Thu, 07 Mar 2019 03:48:30 + Gerrit-HasComments: Yes