[jira] [Commented] (DRILL-5287) Provide option to skip updates of ephemeral state changes in Zookeeper
[ https://issues.apache.org/jira/browse/DRILL-5287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893657#comment-15893657 ]

Sudheesh Katkam commented on DRILL-5287:

Fixed in [7ebb985|https://github.com/apache/drill/commit/7ebb985edc823692673a42276b4e2a80fd1f256c]

> Provide option to skip updates of ephemeral state changes in Zookeeper
> --
>
>                 Key: DRILL-5287
>                 URL: https://issues.apache.org/jira/browse/DRILL-5287
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.9.0
>            Reporter: Padma Penumarthy
>            Assignee: Padma Penumarthy
>              Labels: doc-impacting, ready-to-commit
>             Fix For: 1.10.0
>
>
> We put transient profiles in ZooKeeper and update their state as the query
> progresses and changes state. It is observed that this adds a latency of
> ~45 msec for each update in the query execution path. This gets even worse
> when a high number of concurrent queries are in progress. For
> concurrency=100, the average query response time, even for short queries,
> is 8 sec vs 0.2 sec with these updates disabled. For short-lived queries in
> a high-throughput scenario, there is no value in updating state changes in
> ZooKeeper. We need an option to disable these updates for short-running
> operational queries.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Commented] (DRILL-5287) Provide option to skip updates of ephemeral state changes in Zookeeper
[ https://issues.apache.org/jira/browse/DRILL-5287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893002#comment-15893002 ]

ASF GitHub Bot commented on DRILL-5287:
---

Github user asfgit closed the pull request at:

    https://github.com/apache/drill/pull/758
[jira] [Commented] (DRILL-5287) Provide option to skip updates of ephemeral state changes in Zookeeper
[ https://issues.apache.org/jira/browse/DRILL-5287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889678#comment-15889678 ]

ASF GitHub Bot commented on DRILL-5287:
---

Github user sudheeshkatkam commented on a diff in the pull request:

    https://github.com/apache/drill/pull/758#discussion_r103624801

    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/QueryManager.java ---
    @@ -280,31 +284,39 @@ public void interrupted(final InterruptedException ex) {
         }
       }

    -  QueryState updateEphemeralState(final QueryState queryState) {
    -    switch (queryState) {
    +  void updateEphemeralState(final QueryState queryState) {
    +    // If query is already in zk transient store, ignore the transient state update option.
    +    // Else, they will not be removed from transient store upon completion.
    +    if (!inTransientStore &&
    +        !foreman.getQueryContext().getOptions().getOption(ExecConstants.QUERY_TRANSIENT_STATE_UPDATE)) {
    +      return;
    +    }
    +
    +    switch (queryState) {
           case ENQUEUED:
           case STARTING:
           case RUNNING:
           case CANCELLATION_REQUESTED:
             transientProfiles.put(stringQueryId, getQueryInfo()); // store as ephemeral query profile.
    +        inTransientStore = true;
             break;
           case COMPLETED:
           case CANCELED:
           case FAILED:
             try {
               transientProfiles.remove(stringQueryId);
    +          inTransientStore = false;
             } catch(final Exception e) {
               logger.warn("Failure while trying to delete the estore profile for this query.", e);
             }
    -
             break;
           default:
             throw new IllegalStateException("unrecognized queryState " + queryState);
         }
    -    return queryState;
    +    return;
    --- End diff --

    remove unnecessary return
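The gating logic from the QueryManager.java review can be tried out in isolation. What follows is only a minimal sketch, not Drill's actual code: the class `EphemeralStateSketch`, the plain `HashMap` standing in for the ZooKeeper-backed transient store, and the mutable `progressUpdateEnabled` field standing in for the option lookup are all assumptions made for illustration. It also assumes the option value can differ between two updates of the same query, which is the case the in-memory flag is meant to cover.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for QueryManager; only the gating logic is modeled.
class EphemeralStateSketch {
  enum QueryState { ENQUEUED, STARTING, RUNNING, CANCELLATION_REQUESTED, COMPLETED, CANCELED, FAILED }

  // Assumption: a plain map replaces the ZooKeeper-backed transient store.
  final Map<String, QueryState> transientProfiles = new HashMap<>();
  final String queryId;
  // Assumption: a field replaces the option lookup on the query context.
  boolean progressUpdateEnabled;
  // In-memory record of whether this query currently has an entry in the store.
  private boolean inTransientStore = false;

  EphemeralStateSketch(String queryId, boolean progressUpdateEnabled) {
    this.queryId = queryId;
    this.progressUpdateEnabled = progressUpdateEnabled;
  }

  void updateEphemeralState(QueryState state) {
    // Skip only when updates are disabled AND nothing was written earlier.
    // An existing entry must still be updated and eventually removed.
    if (!inTransientStore && !progressUpdateEnabled) {
      return;
    }
    switch (state) {
      case ENQUEUED:
      case STARTING:
      case RUNNING:
      case CANCELLATION_REQUESTED:
        transientProfiles.put(queryId, state);
        inTransientStore = true;
        break;
      case COMPLETED:
      case CANCELED:
      case FAILED:
        transientProfiles.remove(queryId);
        inTransientStore = false;
        break;
      default:
        throw new IllegalStateException("unrecognized queryState " + state);
    }
  }

  public static void main(String[] args) {
    EphemeralStateSketch q = new EphemeralStateSketch("q1", true);
    q.updateEphemeralState(QueryState.RUNNING);    // written while updates are enabled
    q.progressUpdateEnabled = false;               // updates switched off mid-query
    q.updateEphemeralState(QueryState.COMPLETED);  // entry is still removed
    System.out.println("store empty: " + q.transientProfiles.isEmpty());
  }
}
```

Checking only the option would leave the `q1` entry behind in the second half of `main`; the flag is what lets the terminal state clean it up.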
[jira] [Commented] (DRILL-5287) Provide option to skip updates of ephemeral state changes in Zookeeper
[ https://issues.apache.org/jira/browse/DRILL-5287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889680#comment-15889680 ]

ASF GitHub Bot commented on DRILL-5287:
---

Github user sudheeshkatkam commented on the issue:

    https://github.com/apache/drill/pull/758

    +1 (minor comment)
[jira] [Commented] (DRILL-5287) Provide option to skip updates of ephemeral state changes in Zookeeper
[ https://issues.apache.org/jira/browse/DRILL-5287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15888321#comment-15888321 ]

ASF GitHub Bot commented on DRILL-5287:
---

Github user ppadma commented on a diff in the pull request:

    https://github.com/apache/drill/pull/758#discussion_r103487885

    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/QueryManager.java ---
    @@ -280,8 +281,15 @@ public void interrupted(final InterruptedException ex) {
         }
       }

    -  QueryState updateEphemeralState(final QueryState queryState) {
    -    switch (queryState) {
    +  void updateEphemeralState(final QueryState queryState) {
    +    // If query is already in zk transient store, ignore the transient state update option.
    +    // Else, they will not be removed from transient store upon completion.
    +    if (transientProfiles.get(stringQueryId) == null &&
    --- End diff --

    I want to bypass the option for queries that are already in the
    transient store when the option is enabled. Otherwise, their state will
    never get updated and/or they will never be removed from the transient
    store. The web UI will show these queries as running forever :-)

    Thanks for raising a good point regarding using transientProfiles.get.
    I made the change to update and use in-memory state instead. Please
    review the new diffs.
[jira] [Commented] (DRILL-5287) Provide option to skip updates of ephemeral state changes in Zookeeper
[ https://issues.apache.org/jira/browse/DRILL-5287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15887108#comment-15887108 ]

ASF GitHub Bot commented on DRILL-5287:
---

Github user sudheeshkatkam commented on a diff in the pull request:

    https://github.com/apache/drill/pull/758#discussion_r103364711

    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/QueryManager.java ---
    @@ -280,8 +281,15 @@ public void interrupted(final InterruptedException ex) {
         }
       }

    -  QueryState updateEphemeralState(final QueryState queryState) {
    -    switch (queryState) {
    +  void updateEphemeralState(final QueryState queryState) {
    +    // If query is already in zk transient store, ignore the transient state update option.
    +    // Else, they will not be removed from transient store upon completion.
    +    if (transientProfiles.get(stringQueryId) == null &&
    --- End diff --

    Why not just check the option? `transientProfiles.get(stringQueryId)`
    is quite expensive itself ([contacts ZooKeeper and deserializes
    data](https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/coord/zk/ZkEphemeralStore.java#L61)).
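The reviewer's cost concern can be made concrete by counting store calls. The sketch below is illustrative only: `CountingStore` is a hypothetical in-memory stand-in for the ZooKeeper-backed store in which every `get` or `put` counts as one round trip, and the two helper methods are assumptions rather than code from the pull request. It compares checking the store before honoring the option against tracking presence in a local boolean.

```java
import java.util.HashMap;
import java.util.Map;

class StoreCostSketch {
  // Hypothetical store; each call stands in for one remote round trip.
  static class CountingStore {
    private final Map<String, String> data = new HashMap<>();
    int roundTrips = 0;

    String get(String key) {
      roundTrips++;
      return data.get(key);
    }

    void put(String key, String value) {
      roundTrips++;
      data.put(key, value);
    }
  }

  // Variant under review: consult the store on every state change.
  static int roundTripsViaStoreCheck(boolean updatesEnabled, int transitions) {
    CountingStore store = new CountingStore();
    for (int i = 0; i < transitions; i++) {
      if (store.get("q1") == null && !updatesEnabled) {
        continue; // update skipped, but the get already cost a round trip
      }
      store.put("q1", "RUNNING");
    }
    return store.roundTrips;
  }

  // Alternative: remember locally whether an entry was written.
  static int roundTripsViaLocalFlag(boolean updatesEnabled, int transitions) {
    CountingStore store = new CountingStore();
    boolean inStore = false;
    for (int i = 0; i < transitions; i++) {
      if (!inStore && !updatesEnabled) {
        continue; // update skipped without touching the store
      }
      store.put("q1", "RUNNING");
      inStore = true;
    }
    return store.roundTrips;
  }

  public static void main(String[] args) {
    System.out.println("store check: " + roundTripsViaStoreCheck(false, 4));
    System.out.println("local flag:  " + roundTripsViaLocalFlag(false, 4));
  }
}
```

With updates disabled, the store-check variant still pays one call per state change while the flag variant pays none, which is the point of the review comment.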
[jira] [Commented] (DRILL-5287) Provide option to skip updates of ephemeral state changes in Zookeeper
[ https://issues.apache.org/jira/browse/DRILL-5287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15886231#comment-15886231 ]

ASF GitHub Bot commented on DRILL-5287:
---

Github user ppadma commented on a diff in the pull request:

    https://github.com/apache/drill/pull/758#discussion_r103271146

    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java ---
    @@ -413,4 +413,8 @@
       String DYNAMIC_UDF_SUPPORT_ENABLED = "exec.udf.enable_dynamic_support";
       BooleanValidator DYNAMIC_UDF_SUPPORT_ENABLED_VALIDATOR = new BooleanValidator(DYNAMIC_UDF_SUPPORT_ENABLED, true, true);
    +
    +  String ZK_QUERY_STATE_UPDATE_KEY = "drill.exec.zk.query.state.update";
    --- End diff --

    I changed it to QUERY_TRANSIENT_STATE_UPDATE_KEY and
    exec.query.progress.update. Please review the new diffs.
[jira] [Commented] (DRILL-5287) Provide option to skip updates of ephemeral state changes in Zookeeper
[ https://issues.apache.org/jira/browse/DRILL-5287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883279#comment-15883279 ]

ASF GitHub Bot commented on DRILL-5287:
---

Github user ppadma commented on a diff in the pull request:

    https://github.com/apache/drill/pull/758#discussion_r103005063

    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/Foreman.java ---
    @@ -1010,7 +1010,9 @@ public void addToEventQueue(final QueryState newState, final Exception exception
       private void recordNewState(final QueryState newState) {
         state = newState;
    -    queryManager.updateEphemeralState(newState);
    +    if (queryContext.getOptions().getOption(ExecConstants.ZK_QUERY_STATE_UPDATE)) {
    +      queryManager.updateEphemeralState(newState);
    +    }
    --- End diff --

    For long-running queries, it may not make much difference. It adds
    latency of around 50-60 msec for a single query. However, with high
    concurrency, the impact of contention caused by ZooKeeper updates is
    significant. As I mentioned in the JIRA, for concurrency=100, the
    average query response time for simple queries is 8 sec vs 0.2 sec
    with these updates disabled.

    It does not impact the query profile. The query profile gets updated
    and written at the end of the query as usual. This option affects only
    running queries: in the Web UI, you will not see running queries and
    their state.
[jira] [Commented] (DRILL-5287) Provide option to skip updates of ephemeral state changes in Zookeeper
[ https://issues.apache.org/jira/browse/DRILL-5287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881759#comment-15881759 ]

ASF GitHub Bot commented on DRILL-5287:
---

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/758#discussion_r102863394

    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/Foreman.java ---
    @@ -1010,7 +1010,9 @@ public void addToEventQueue(final QueryState newState, final Exception exception
       private void recordNewState(final QueryState newState) {
         state = newState;
    -    queryManager.updateEphemeralState(newState);
    +    if (queryContext.getOptions().getOption(ExecConstants.ZK_QUERY_STATE_UPDATE)) {
    +      queryManager.updateEphemeralState(newState);
    +    }
    --- End diff --

    How does this affect query operation for long-running queries? How
    does it impact the query profile?

    If updates are enabled, do we still do an update at query completion
    to finalize the profile? If not, should writing of the profile be
    automatically disabled if status updates are disabled?

    Do we do any timeout on updates? Will we notice that the query has not
    been updated and, say, kill the query due to timeouts?
[jira] [Commented] (DRILL-5287) Provide option to skip updates of ephemeral state changes in Zookeeper
[ https://issues.apache.org/jira/browse/DRILL-5287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15876784#comment-15876784 ]

Keys Botzum commented on DRILL-5287:

Just curious if it would make sense for all queries (short or long) for the
status update to be done in an async thread. That way it doesn't slow down
query processing.
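The async-thread idea raised in this comment could look roughly like the following. This is a sketch of the suggestion only, not what the pull request implemented: the class `AsyncStateUpdateSketch`, its method names, and the in-memory list standing in for the transient store are all hypothetical. A single-threaded executor is used so that state changes are applied in the order they were submitted.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: state updates are handed to a background thread
// so the query path does not wait for the store write.
class AsyncStateUpdateSketch {
  // One worker thread keeps updates in submission order.
  private final ExecutorService updater = Executors.newSingleThreadExecutor();
  // Assumption: a synchronized list replaces the transient store and
  // records the states in the order they were applied.
  final List<String> applied = Collections.synchronizedList(new ArrayList<>());

  // Called from the query path; returns without waiting for the write.
  void recordNewState(String state) {
    updater.execute(() -> applied.add(state));
  }

  // Drains pending updates; returns false on timeout or interruption.
  boolean close(long timeoutMillis) {
    updater.shutdown();
    try {
      return updater.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  public static void main(String[] args) {
    AsyncStateUpdateSketch sketch = new AsyncStateUpdateSketch();
    sketch.recordNewState("STARTING");
    sketch.recordNewState("RUNNING");
    sketch.recordNewState("COMPLETED");
    System.out.println("drained: " + sketch.close(5000) + ", applied: " + sketch.applied);
  }
}
```

Moving the write off the query path would hide its latency from the query, but it would presumably not reduce the number of writes the store has to absorb under high concurrency, which is the contention described in the issue.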