[jira] [Commented] (DRILL-5287) Provide option to skip updates of ephemeral state changes in Zookeeper

2017-03-02 Thread Sudheesh Katkam (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893657#comment-15893657
 ] 

Sudheesh Katkam commented on DRILL-5287:


Fixed in 
[7ebb985|https://github.com/apache/drill/commit/7ebb985edc823692673a42276b4e2a80fd1f256c]

> Provide option to skip updates of ephemeral state changes in Zookeeper
> --
>
> Key: DRILL-5287
> URL: https://issues.apache.org/jira/browse/DRILL-5287
> Project: Apache Drill
> Issue Type: Bug
> Affects Versions: 1.9.0
> Reporter: Padma Penumarthy
> Assignee: Padma Penumarthy
> Labels: doc-impacting, ready-to-commit
> Fix For: 1.10.0
>
>
> We store transient query profiles in ZooKeeper and update them each time a 
> query changes state. Each update is observed to add ~45 msec of latency to the 
> query execution path, and the cost grows when a high number of concurrent 
> queries are in progress. For concurrency=100, the average query response time, 
> even for short queries, is 8 sec vs 0.2 sec with these updates disabled. For 
> short-lived queries in a high-throughput scenario, updating state changes in 
> ZooKeeper is of no value. We need an option to disable these updates for 
> short-running operational queries.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (DRILL-5287) Provide option to skip updates of ephemeral state changes in Zookeeper

2017-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893002#comment-15893002
 ] 

ASF GitHub Bot commented on DRILL-5287:
---

Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/758




[jira] [Commented] (DRILL-5287) Provide option to skip updates of ephemeral state changes in Zookeeper

2017-02-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889678#comment-15889678
 ] 

ASF GitHub Bot commented on DRILL-5287:
---

Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/758#discussion_r103624801
  
--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/QueryManager.java ---
@@ -280,31 +284,39 @@ public void interrupted(final InterruptedException ex) {
     }
   }
 
-  QueryState updateEphemeralState(final QueryState queryState) {
-    switch (queryState) {
+  void updateEphemeralState(final QueryState queryState) {
+      // If query is already in zk transient store, ignore the transient state update option.
+      // Else, they will not be removed from transient store upon completion.
+      if (!inTransientStore &&
+          !foreman.getQueryContext().getOptions().getOption(ExecConstants.QUERY_TRANSIENT_STATE_UPDATE)) {
+        return;
+      }
+
+      switch (queryState) {
       case ENQUEUED:
       case STARTING:
       case RUNNING:
       case CANCELLATION_REQUESTED:
         transientProfiles.put(stringQueryId, getQueryInfo());  // store as ephemeral query profile.
+        inTransientStore = true;
         break;
 
       case COMPLETED:
       case CANCELED:
       case FAILED:
         try {
           transientProfiles.remove(stringQueryId);
+          inTransientStore = false;
         } catch(final Exception e) {
           logger.warn("Failure while trying to delete the estore profile for this query.", e);
         }
-
         break;
 
       default:
         throw new IllegalStateException("unrecognized queryState " + queryState);
     }
 
-    return queryState;
+    return;
--- End diff --

remove unnecessary return
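
For readers following the diff above, here is a minimal, self-contained sketch of the gating idea it introduces. It is not Drill's QueryManager: the class name EphemeralStateSketch, the map standing in for the ZooKeeper-backed transient store, and the updatesEnabled parameter (standing in for the option lookup) are all assumptions made for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for the gating logic in the diff; not Drill's actual class. */
final class EphemeralStateSketch {
  enum QueryState { ENQUEUED, STARTING, RUNNING, CANCELLATION_REQUESTED, COMPLETED, CANCELED, FAILED }

  // Assumption: a plain map stands in for the ZooKeeper-backed transient store.
  final Map<String, QueryState> transientProfiles = new ConcurrentHashMap<>();
  private final String queryId;
  // In-memory record of whether this query currently has an entry in the store.
  private boolean inTransientStore = false;

  EphemeralStateSketch(String queryId) {
    this.queryId = queryId;
  }

  /** updatesEnabled is a hypothetical parameter standing in for the option lookup. */
  void updateEphemeralState(QueryState state, boolean updatesEnabled) {
    // Skip only when updates are off AND nothing was written earlier.
    // An entry that already exists must still be updated and eventually removed.
    if (!inTransientStore && !updatesEnabled) {
      return;
    }
    switch (state) {
      case ENQUEUED:
      case STARTING:
      case RUNNING:
      case CANCELLATION_REQUESTED:
        transientProfiles.put(queryId, state);
        inTransientStore = true;
        break;
      case COMPLETED:
      case CANCELED:
      case FAILED:
        transientProfiles.remove(queryId);
        inTransientStore = false;
        break;
      default:
        throw new IllegalStateException("unrecognized queryState " + state);
    }
  }

  boolean isInTransientStore() {
    return inTransientStore;
  }
}
```

In this sketch, a query that wrote an entry while updates were enabled still has that entry removed on completion even if updates are disabled in between, which is the case the comment in the diff is guarding against.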




[jira] [Commented] (DRILL-5287) Provide option to skip updates of ephemeral state changes in Zookeeper

2017-02-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889680#comment-15889680
 ] 

ASF GitHub Bot commented on DRILL-5287:
---

Github user sudheeshkatkam commented on the issue:

https://github.com/apache/drill/pull/758
  
+1 (minor comment)




[jira] [Commented] (DRILL-5287) Provide option to skip updates of ephemeral state changes in Zookeeper

2017-02-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15888321#comment-15888321
 ] 

ASF GitHub Bot commented on DRILL-5287:
---

Github user ppadma commented on a diff in the pull request:

https://github.com/apache/drill/pull/758#discussion_r103487885
  
--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/QueryManager.java ---
@@ -280,8 +281,15 @@ public void interrupted(final InterruptedException ex) {
     }
   }
 
-  QueryState updateEphemeralState(final QueryState queryState) {
-    switch (queryState) {
+  void updateEphemeralState(final QueryState queryState) {
+      // If query is already in zk transient store, ignore the transient state update option.
+      // Else, they will not be removed from transient store upon completion.
+      if (transientProfiles.get(stringQueryId) == null &&
--- End diff --

I want to bypass the option for queries that are already in the transient 
store when the option is enabled. Otherwise, their state will never get updated 
and/or they will never be removed from the transient store, and the web UI will 
show these queries as running forever :-)

Thanks for raising a good point about using transientProfiles.get. I changed 
the code to maintain and check in-memory state instead. 

Please review the new diffs.





[jira] [Commented] (DRILL-5287) Provide option to skip updates of ephemeral state changes in Zookeeper

2017-02-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15887108#comment-15887108
 ] 

ASF GitHub Bot commented on DRILL-5287:
---

Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/758#discussion_r103364711
  
--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/QueryManager.java ---
@@ -280,8 +281,15 @@ public void interrupted(final InterruptedException ex) {
     }
   }
 
-  QueryState updateEphemeralState(final QueryState queryState) {
-    switch (queryState) {
+  void updateEphemeralState(final QueryState queryState) {
+      // If query is already in zk transient store, ignore the transient state update option.
+      // Else, they will not be removed from transient store upon completion.
+      if (transientProfiles.get(stringQueryId) == null &&
--- End diff --

Why not just check the option?

`transientProfiles.get(stringQueryId)` is quite expensive itself ([contacts 
ZooKeeper and deserializes 
data](https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/coord/zk/ZkEphemeralStore.java#L61)).
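
To make the cost concern concrete, the following is a rough sketch that counts store round trips for the two ways of gating the update. The CountingStoreSketch class, its remoteCalls counter, and both helper methods are hypothetical; each counted call merely stands in for one ZooKeeper round trip and is not a measurement of Drill itself.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical store whose calls are counted as stand-ins for ZooKeeper round trips. */
final class CountingStoreSketch {
  private final Map<String, String> data = new HashMap<>();
  int remoteCalls = 0;

  String get(String key) {
    remoteCalls++;
    return data.get(key);
  }

  void put(String key, String value) {
    remoteCalls++;
    data.put(key, value);
  }

  /** Gate by asking the store whether an entry exists: one round trip per state change. */
  static int checkViaStore(CountingStoreSketch store, String id, int stateChanges, boolean updatesEnabled) {
    for (int i = 0; i < stateChanges; i++) {
      if (store.get(id) == null && !updatesEnabled) {
        continue;  // update skipped, but the lookup was already paid for
      }
      store.put(id, "state-" + i);
    }
    return store.remoteCalls;
  }

  /** Gate by consulting an in-memory flag: no round trip when updates are skipped. */
  static int checkViaFlag(CountingStoreSketch store, String id, int stateChanges, boolean updatesEnabled) {
    boolean inStore = false;
    for (int i = 0; i < stateChanges; i++) {
      if (!inStore && !updatesEnabled) {
        continue;
      }
      store.put(id, "state-" + i);
      inStore = true;
    }
    return store.remoteCalls;
  }
}
```

With updates disabled, the store-based check in this sketch still pays one lookup per state change, while the flag-based check pays none, which is the point of the review comment.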




[jira] [Commented] (DRILL-5287) Provide option to skip updates of ephemeral state changes in Zookeeper

2017-02-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15886231#comment-15886231
 ] 

ASF GitHub Bot commented on DRILL-5287:
---

Github user ppadma commented on a diff in the pull request:

https://github.com/apache/drill/pull/758#discussion_r103271146
  
--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java ---
@@ -413,4 +413,8 @@
 
   String DYNAMIC_UDF_SUPPORT_ENABLED = "exec.udf.enable_dynamic_support";
   BooleanValidator DYNAMIC_UDF_SUPPORT_ENABLED_VALIDATOR = new BooleanValidator(DYNAMIC_UDF_SUPPORT_ENABLED, true, true);
+
+  String ZK_QUERY_STATE_UPDATE_KEY = "drill.exec.zk.query.state.update";
--- End diff --

I changed the constant to QUERY_TRANSIENT_STATE_UPDATE_KEY and the option name 
to exec.query.progress.update. Please review the new diffs. 




[jira] [Commented] (DRILL-5287) Provide option to skip updates of ephemeral state changes in Zookeeper

2017-02-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883279#comment-15883279
 ] 

ASF GitHub Bot commented on DRILL-5287:
---

Github user ppadma commented on a diff in the pull request:

https://github.com/apache/drill/pull/758#discussion_r103005063
  
--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/Foreman.java ---
@@ -1010,7 +1010,9 @@ public void addToEventQueue(final QueryState newState, final Exception exception
 
   private void recordNewState(final QueryState newState) {
     state = newState;
-    queryManager.updateEphemeralState(newState);
+    if (queryContext.getOptions().getOption(ExecConstants.ZK_QUERY_STATE_UPDATE)) {
+      queryManager.updateEphemeralState(newState);
+    }
--- End diff --

For long-running queries, it may not make much difference: the updates add 
around 50-60 msec of latency to a single query. With high concurrency, however, 
the impact of contention from the ZooKeeper updates is significant. As I 
mentioned in the JIRA, for concurrency=100, the average query response time for 
simple queries is 8 sec vs 0.2 sec with these updates disabled. The option does 
not affect the query profile, which is still updated and written at the end of 
the query as usual. It affects only running queries: in the Web UI, you will 
not see running queries and their state.




[jira] [Commented] (DRILL-5287) Provide option to skip updates of ephemeral state changes in Zookeeper

2017-02-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881759#comment-15881759
 ] 

ASF GitHub Bot commented on DRILL-5287:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/758#discussion_r102863394
  
--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/Foreman.java ---
@@ -1010,7 +1010,9 @@ public void addToEventQueue(final QueryState newState, final Exception exception
 
   private void recordNewState(final QueryState newState) {
     state = newState;
-    queryManager.updateEphemeralState(newState);
+    if (queryContext.getOptions().getOption(ExecConstants.ZK_QUERY_STATE_UPDATE)) {
+      queryManager.updateEphemeralState(newState);
+    }
--- End diff --

How does this affect query operation for long-running queries? How does it 
impact the query profile? If updates are enabled, do we still do an update at 
query completion to finalize the profile? If not, should writing of the profile 
be automatically disabled if status updates are disabled?

Do we do any timeout on updates? Will we notice that the query has not been 
updated and, say, kill the query due to timeouts?




[jira] [Commented] (DRILL-5287) Provide option to skip updates of ephemeral state changes in Zookeeper

2017-02-21 Thread Keys Botzum (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15876784#comment-15876784
 ] 

Keys Botzum commented on DRILL-5287:


Just curious whether it would make sense, for all queries (short or long), to 
do the status update in an async thread. That way it wouldn't slow down query 
processing.
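
As an illustration only, the async idea could look something like the sketch below. It is not what the patch for this issue does, and the AsyncStateUpdaterSketch class, its map-backed store, and the method names are all hypothetical. A single-threaded executor is used so that updates are applied in submission order while the caller returns immediately.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: state updates are handed to a background thread. */
final class AsyncStateUpdaterSketch {
  // Assumption: a plain map stands in for the ZooKeeper-backed transient store.
  final Map<String, String> store = new ConcurrentHashMap<>();
  // One worker thread, so queued updates run in the order they were submitted.
  private final ExecutorService updater = Executors.newSingleThreadExecutor();

  /** Queues the write and returns without waiting for it to complete. */
  void recordState(String queryId, String state) {
    updater.execute(() -> store.put(queryId, state));
  }

  /** Queues removal of the entry, for example when the query reaches a terminal state. */
  void remove(String queryId) {
    updater.execute(() -> store.remove(queryId));
  }

  /** Stops accepting work and waits for queued updates; returns false on timeout or interrupt. */
  boolean drain(long timeoutMillis) {
    updater.shutdown();
    try {
      return updater.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }
}
```

One trade-off in a design like this is that the store can briefly lag behind the query's real state, and queued updates need to be drained or discarded on shutdown.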
