[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI

2019-05-17 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16842636#comment-16842636
 ] 

ASF GitHub Bot commented on DRILL-5270:
---

kkhatua commented on pull request #1654: DRILL-5270: Improve loading of 
profiles listing in the WebUI
URL: https://github.com/apache/drill/pull/1654
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Improve loading of profiles listing in the WebUI
> 
>
> Key: DRILL-5270
> URL: https://issues.apache.org/jira/browse/DRILL-5270
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Web Server
>Affects Versions: 1.9.0
>Reporter: Kunal Khatua
>Assignee: Kunal Khatua
>Priority: Major
> Fix For: 1.17.0
>
>
> Currently, as the number of profiles increases, we reload the same list of 
> profiles from the FS.
> An ideal improvement would be to detect if there are any new profiles and 
> only reload from the disk then. Otherwise, a cached list is sufficient.
> For a directory of 280K profiles, the load time is close to 6 seconds on a 32 
> core server. With the caching, we can get it down to as little as a few 
> milliseconds.
> To decide whether the cache is invalid, we inspect the last modified time of 
> the directory to confirm whether a reload is needed. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI

2018-10-12 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16648613#comment-16648613
 ] 

ASF GitHub Bot commented on DRILL-5270:
---

kkhatua commented on issue #1250: DRILL-5270: Improve loading of profiles 
listing in the WebUI
URL: https://github.com/apache/drill/pull/1250#issuecomment-429493831
 
 
   @priteshm this required some more rework, which I hope I've addressed. 
We can review and try to get this in as part of 1.15.0.
   I've rebased this on top of the latest master, accounting for conflicts due to 
DRILL-6053 (locking of PStore), DRILL-6422 (shaded Guava imports) and 
DRILL-6492 (schema/workspace insensitivity).
   @arina-ielchiieva / @parthchandra / @ilooner anyone up for reviewing 
this?



> Improve loading of profiles listing in the WebUI
> 
>
> Key: DRILL-5270
> URL: https://issues.apache.org/jira/browse/DRILL-5270
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Web Server
>Affects Versions: 1.9.0
>Reporter: Kunal Khatua
>Assignee: Kunal Khatua
>Priority: Major
> Fix For: 1.15.0


[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI

2017-02-21 Thread Kunal Khatua (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15877119#comment-15877119
 ] 

Kunal Khatua commented on DRILL-5270:
-

As a sample experiment, we ran a Drillbit to display the profile list for a 
directory containing 280K files. 

We found that while the Drillbit took a long time to start up (DRILL-4990 will 
fix this), the load time improves as long as no new profiles are detected. 
Using a PathFilter (Hadoop API) implementation to find new profiles should have 
helped. However, it appears that the filter isn't pushed down to the file 
system, so we can't benefit from it unless the HDFS API itself is improved. 
This introduces no regression, however. 

Using this in conjunction with the patch for DRILL-5259 does show improved load 
times, since we don't go back to the DFS to re-read the profile list.

This is the content of the Drill log:
{code}
2017-02-17 11:09:30,929 pssc-65.qa.lab [main] INFO  o.apache.drill.exec.server.Drillbit - Startup completed (625436 ms).
2017-02-17 11:14:35,886 pssc-65.qa.lab [qtp2142187763-113] WARN  o.a.d.e.s.s.s.LocalPersistentStore - Took 5876 ms to list+map from 281001 profiles
2017-02-17 11:14:35,893 pssc-65.qa.lab [qtp2142187763-113] DEBUG  o.a.d.e.s.r.profile.ProfileResources - Time to load MRU 100 profiles: 5885 ms
2017-02-17 11:18:25,940 pssc-65.qa.lab [qtp2142187763-118] DEBUG  o.a.d.e.s.r.profile.ProfileResources - Time to load MRU 100 profiles: 1 ms
2017-02-17 11:19:14,977 pssc-65.qa.lab [qtp2142187763-122] DEBUG  o.a.d.e.s.r.profile.ProfileResources - Time to load MRU 75 profiles: 1 ms
2017-02-17 11:19:27,554 pssc-65.qa.lab [qtp2142187763-123] DEBUG  o.a.d.e.s.r.profile.ProfileResources - Time to load MRU 150 profiles: 1 ms
2017-02-17 11:21:58,137 pssc-65.qa.lab [qtp2142187763-124] INFO  o.a.drill.exec.client.DrillClient - Successfully connected to server pssc-65.qa.lab:31010
2017-02-17 11:21:58,409 pssc-65.qa.lab [2758b2a9-6e44-9d79-df64-e6e3e6a9eb4a:foreman] INFO  o.a.drill.exec.work.foreman.Foreman - Query text for query id 2758b2a9-6e44-9d79-df64-e6e3e6a9eb4a: select * from sys.drillbits
2017-02-17 11:32:02,320 pssc-65.qa.lab [2758b2a9-6e44-9d79-df64-e6e3e6a9eb4a:frag:0:0] INFO  o.a.d.e.w.fragment.FragmentExecutor - 2758b2a9-6e44-9d79-df64-e6e3e6a9eb4a:0:0: State change requested AWAITING_ALLOCATION --> RUNNING
2017-02-17 11:32:02,321 pssc-65.qa.lab [2758b2a9-6e44-9d79-df64-e6e3e6a9eb4a:frag:0:0] INFO  o.a.d.e.w.f.FragmentStatusReporter - 2758b2a9-6e44-9d79-df64-e6e3e6a9eb4a:0:0: State to report: RUNNING
2017-02-17 11:32:02,506 pssc-65.qa.lab [2758b2a9-6e44-9d79-df64-e6e3e6a9eb4a:frag:0:0] INFO  o.a.d.e.c.ClassCompilerSelector - Java compiler policy: DEFAULT, Debug option: true
2017-02-17 11:32:02,792 pssc-65.qa.lab [2758b2a9-6e44-9d79-df64-e6e3e6a9eb4a:frag:0:0] INFO  o.a.d.e.w.fragment.FragmentExecutor - 2758b2a9-6e44-9d79-df64-e6e3e6a9eb4a:0:0: State change requested RUNNING --> FINISHED
2017-02-17 11:32:02,793 pssc-65.qa.lab [2758b2a9-6e44-9d79-df64-e6e3e6a9eb4a:frag:0:0] INFO  o.a.d.e.w.f.FragmentStatusReporter - 2758b2a9-6e44-9d79-df64-e6e3e6a9eb4a:0:0: State to report: FINISHED
2017-02-17 11:32:02,837 pssc-65.qa.lab [CONTROL-rpc-event-queue] INFO  query.logger - {"queryId":"2758b2a9-6e44-9d79-df64-e6e3e6a9eb4a","schema":"","queryText":"select * from sys.drillbits","start":1487359318324,"finish":1487359922805,"outcome":"COMPLETED","username":"anonymous","remoteAddress":"10.10.103.65:56448"}
2017-02-17 11:34:49,443 pssc-65.qa.lab [qtp2142187763-129] WARN  o.a.d.e.s.s.s.LocalPersistentStore - Took 5545 ms to list+map from 1 profiles
2017-02-17 11:34:49,448 pssc-65.qa.lab [qtp2142187763-129] DEBUG  o.a.d.e.s.r.profile.ProfileResources - Time to load MRU 50 profiles: 5553 ms
2017-02-17 11:36:58,770 pssc-65.qa.lab [qtp2142187763-120] DEBUG  o.a.d.e.s.r.profile.ProfileResources - Time to load MRU 60 profiles: 1 ms
{code}
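
The behaviour visible in the log (one slow load, then 1 ms loads until something changes) can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: it uses `java.nio.file` instead of the Hadoop `FileSystem` API so that it runs without dependencies, the class name `CachedProfileListing` is hypothetical, and the `.sys.drill` suffix is an assumption standing in for `DRILL_SYS_FILE_SUFFIX`.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch: re-lists a directory only when its modification time changes. */
final class CachedProfileListing {
  private final Path dir;
  private long lastSeenModifiedMillis = -1L;           // -1 means "never loaded"
  private List<String> cachedNames = Collections.emptyList();
  private int reloadCount = 0;

  CachedProfileListing(Path dir) {
    this.dir = dir;
  }

  /** Returns the cached listing, reloading from disk only if the directory mtime differs. */
  synchronized List<String> list() throws IOException {
    long modified = Files.getLastModifiedTime(dir).toMillis();
    if (modified != lastSeenModifiedMillis) {
      List<String> names = new ArrayList<>();
      // Assumed suffix; the glob keeps only files that look like profiles.
      try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, "*.sys.drill")) {
        for (Path p : stream) {
          names.add(p.getFileName().toString());
        }
      }
      Collections.sort(names);                         // stable order for callers
      cachedNames = Collections.unmodifiableList(names);
      lastSeenModifiedMillis = modified;
      reloadCount++;
    }
    return cachedNames;
  }

  /** Number of times the directory was actually re-listed. */
  synchronized int reloadCount() {
    return reloadCount;
  }
}
```

The sketch relies on the directory's modification time changing whenever an entry is added or removed, which is the same signal the issue description proposes to use.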

> Improve loading of profiles listing in the WebUI
> 
>
> Key: DRILL-5270
> URL: https://issues.apache.org/jira/browse/DRILL-5270
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Web Server
>Affects Versions: 1.9.0
>Reporter: Kunal Khatua
>Assignee: Kunal Khatua
> Fix For: 1.10.0

[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI

2017-02-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15877200#comment-15877200
 ] 

ASF GitHub Bot commented on DRILL-5270:
---

GitHub user kkhatua opened a pull request:

https://github.com/apache/drill/pull/755

DRILL-5270: Improve loading of profiles listing in the WebUI

Using Hadoop API to filter and reduce profile list load time
Using an in-memory treeSet-based cache, maintain the list of most recent
profiles.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kkhatua/drill DRILL-5270

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/755.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #755


commit a5f20643850ad399622e5df9a6f37713545dc7a6
Author: Kunal Khatua 
Date:   2017-02-22T01:20:48Z

DRILL-5270: Improve loading of profiles listing in the WebUI

Using Hadoop API to filter and reduce profile list load time
Using an in-memory treeSet-based cache, maintain the list of most recent
profiles.




> Improve loading of profiles listing in the WebUI
> 
>
> Key: DRILL-5270
> URL: https://issues.apache.org/jira/browse/DRILL-5270
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Web Server
>Affects Versions: 1.9.0
>Reporter: Kunal Khatua
>Assignee: Kunal Khatua
> Fix For: 1.10.0


[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI

2017-02-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15877207#comment-15877207
 ] 

ASF GitHub Bot commented on DRILL-5270:
---

Github user kkhatua commented on the issue:

https://github.com/apache/drill/pull/755
  
A summary of the performance is available in this 
[comment](https://issues.apache.org/jira/browse/DRILL-5270?focusedCommentId=15877119&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15877119)
 on the JIRA (DRILL-5270)



> Improve loading of profiles listing in the WebUI
> 
>
> Key: DRILL-5270
> URL: https://issues.apache.org/jira/browse/DRILL-5270
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Web Server
>Affects Versions: 1.9.0
>Reporter: Kunal Khatua
>Assignee: Kunal Khatua
> Fix For: 1.10.0


[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI

2017-02-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15877250#comment-15877250
 ] 

ASF GitHub Bot commented on DRILL-5270:
---

Github user kkhatua commented on the issue:

https://github.com/apache/drill/pull/755
  
For 8266 profiles, when measured from Chrome browser's Network tool:
```
Load First Time: 2.43s 
Load Second Time (no new profiles): 829ms
```


> Improve loading of profiles listing in the WebUI
> 
>
> Key: DRILL-5270
> URL: https://issues.apache.org/jira/browse/DRILL-5270
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Web Server
>Affects Versions: 1.9.0
>Reporter: Kunal Khatua
>Assignee: Kunal Khatua
> Fix For: 1.10.0


[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI

2017-04-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15979633#comment-15979633
 ] 

ASF GitHub Bot commented on DRILL-5270:
---

Github user kkhatua commented on the issue:

https://github.com/apache/drill/pull/755
  
@sudheeshkatkam Can you please review the PR?


> Improve loading of profiles listing in the WebUI
> 
>
> Key: DRILL-5270
> URL: https://issues.apache.org/jira/browse/DRILL-5270
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Web Server
>Affects Versions: 1.9.0
>Reporter: Kunal Khatua
>Assignee: Kunal Khatua
> Fix For: 1.11.0


[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI

2018-02-22 Thread Pritesh Maker (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16373974#comment-16373974
 ] 

Pritesh Maker commented on DRILL-5270:
--

[~arina] can you please review this PR?

> Improve loading of profiles listing in the WebUI
> 
>
> Key: DRILL-5270
> URL: https://issues.apache.org/jira/browse/DRILL-5270
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Web Server
>Affects Versions: 1.9.0
>Reporter: Kunal Khatua
>Assignee: Kunal Khatua
>Priority: Major
> Fix For: 1.13.0


[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI

2018-02-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16379409#comment-16379409
 ] 

ASF GitHub Bot commented on DRILL-5270:
---

Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/755#discussion_r171084169
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/sys/store/LocalPersistentStore.java
 ---
@@ -61,16 +63,29 @@
   private final AutoCloseableLock readLock = new AutoCloseableLock(readWriteLock.readLock());
   private final AutoCloseableLock writeLock = new AutoCloseableLock(readWriteLock.writeLock());
 
+  //Provides a threshold above which we report the time to load
+  private static final long LISTTIME_THRESHOLD_MSEC = 2000L;
+
+  private static final int DrillSysFileExtSize = DRILL_SYS_FILE_SUFFIX.length();
--- End diff --

`DrillSysFileExtSize` -> `drillSysFileExtSize`


> Improve loading of profiles listing in the WebUI
> 
>
> Key: DRILL-5270
> URL: https://issues.apache.org/jira/browse/DRILL-5270
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Web Server
>Affects Versions: 1.9.0
>Reporter: Kunal Khatua
>Assignee: Kunal Khatua
>Priority: Major
> Fix For: 1.13.0


[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI

2018-02-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16379406#comment-16379406
 ] 

ASF GitHub Bot commented on DRILL-5270:
---

Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/755#discussion_r171082847
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/sys/store/DrillSysFilePathFilter.java
 ---
@@ -0,0 +1,53 @@
+/**
--- End diff --

Please use a regular comment for the header, not Javadoc.


> Improve loading of profiles listing in the WebUI
> 
>
> Key: DRILL-5270
> URL: https://issues.apache.org/jira/browse/DRILL-5270
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Web Server
>Affects Versions: 1.9.0
>Reporter: Kunal Khatua
>Assignee: Kunal Khatua
>Priority: Major
> Fix For: 1.13.0


[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI

2018-02-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16379408#comment-16379408
 ] 

ASF GitHub Bot commented on DRILL-5270:
---

Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/755#discussion_r171088565
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/sys/store/LocalPersistentStore.java
 ---
@@ -112,23 +127,65 @@ public static DrillFileSystem getFileSystem(DrillConfig config, Path root) throw
 
   @Override
   public Iterator> getRange(int skip, int take) {
+    //Marking currently seen modification time
+    long currBasePathModified = 0L;
+    try {
+      currBasePathModified = fs.getFileStatus(basePath).getModificationTime();
+    } catch (IOException ioexcp) {
+      ioexcp.printStackTrace();
+    }
+
+    //Acquiring lock to avoid reloading for request coming in before completion of profile read
--- End diff --

1. Previously, acquiring the read lock was enough, but with your changes you 
modify class fields. Since many threads can access this method, you'll end up 
with race conditions; class fields can also be cached by threads... I think 
the design here should be reconsidered.
2. The Guava library has several cache implementations. Can we leverage any of 
them instead of using a tree set?

Pinging @vlad, since he is working on DRILL-6053, which intends to make 
changes in the same class to avoid excessive locking, so that he is aware of 
the intended changes.
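
One way to address the thread-safety concern in point 1 is to publish the cached list as a single immutable snapshot through a `volatile` field and to rebuild it under a dedicated lock. The sketch below is only an illustration under that assumption; `SnapshotCache`, `versionSource` and `loader` are hypothetical names, and it uses plain `java.util.concurrent` rather than a Guava cache.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical sketch: a thread-safe cache that reloads only when a version value changes. */
final class SnapshotCache<T> {
  /** Immutable pair of (version, items); replaced atomically as a whole. */
  private static final class Snapshot<T> {
    final long version;
    final List<T> items;

    Snapshot(long version, List<T> items) {
      this.version = version;
      this.items = items;
    }
  }

  private final LongSupplier versionSource;   // e.g. the directory's modification time
  private final Supplier<List<T>> loader;     // the expensive listing operation
  private final Object reloadLock = new Object();
  private volatile Snapshot<T> current;       // volatile gives cross-thread visibility

  SnapshotCache(LongSupplier versionSource, Supplier<List<T>> loader) {
    this.versionSource = versionSource;
    this.loader = loader;
  }

  List<T> get() {
    long version = versionSource.getAsLong();
    Snapshot<T> snap = current;
    if (snap != null && snap.version == version) {
      return snap.items;                      // fast path: no locking
    }
    synchronized (reloadLock) {
      snap = current;                         // re-check: another thread may have reloaded
      if (snap == null || snap.version != version) {
        List<T> copy = Collections.unmodifiableList(new ArrayList<>(loader.get()));
        snap = new Snapshot<>(version, copy);
        current = snap;
      }
      return snap.items;
    }
  }
}
```

Because readers only ever see a fully built snapshot, no thread can observe a half-updated list, and concurrent requests arriving during a reload wait for it instead of triggering their own.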



> Improve loading of profiles listing in the WebUI
> 
>
> Key: DRILL-5270
> URL: https://issues.apache.org/jira/browse/DRILL-5270
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Web Server
>Affects Versions: 1.9.0
>Reporter: Kunal Khatua
>Assignee: Kunal Khatua
>Priority: Major
> Fix For: 1.13.0


[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI

2018-02-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16379411#comment-16379411
 ] 

ASF GitHub Bot commented on DRILL-5270:
---

Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/755#discussion_r171089128
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/sys/store/LocalPersistentStore.java
 ---
@@ -141,11 +198,33 @@ public static DrillFileSystem getFileSystem(DrillConfig config, Path root) throw
     }
   }
 
+  /**
+   * Add profile name to a TreeSet
+   * @param profileName
--- End diff --

Please do not leave `@param` or `@return` without a description. IDEs usually 
highlight them and ask for one. 
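
As an illustration of the requested style, here is a small sketch of a TreeSet-backed holder whose Javadoc tags all carry descriptions. The class `RecentProfileNames`, its capacity limit and its eviction rule are hypothetical and not taken from the patch; the sketch assumes, purely for illustration, that names sorting higher should be kept.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical sketch: keeps at most {@code capacity} profile names in sorted order. */
final class RecentProfileNames {
  private final int capacity;
  private final TreeSet<String> names = new TreeSet<>();

  /**
   * Creates an empty holder.
   *
   * @param capacity maximum number of names to retain; must be at least 1
   */
  RecentProfileNames(int capacity) {
    if (capacity < 1) {
      throw new IllegalArgumentException("capacity must be at least 1");
    }
    this.capacity = capacity;
  }

  /**
   * Adds a profile name, evicting the lowest-sorting name if the capacity is exceeded.
   *
   * @param profileName name of the profile file to track; must not be null
   * @return true if {@code profileName} is still held after the call, false if it was evicted
   */
  boolean add(String profileName) {
    names.add(profileName);
    if (names.size() <= capacity) {
      return true;
    }
    String evicted = names.pollFirst();   // removes the lowest-sorting element
    return !evicted.equals(profileName);
  }

  /**
   * Returns the retained names.
   *
   * @return a copy of the retained names in ascending order
   */
  List<String> snapshot() {
    return new ArrayList<>(names);
  }
}
```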


> Improve loading of profiles listing in the WebUI
> 
>
> Key: DRILL-5270
> URL: https://issues.apache.org/jira/browse/DRILL-5270
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Web Server
>Affects Versions: 1.9.0
>Reporter: Kunal Khatua
>Assignee: Kunal Khatua
>Priority: Major
> Fix For: 1.13.0


[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI

2018-02-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16379410#comment-16379410
 ] 

ASF GitHub Bot commented on DRILL-5270:
---

Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/755#discussion_r171084274
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/sys/store/LocalPersistentStore.java
 ---
@@ -61,16 +63,29 @@
   private final AutoCloseableLock readLock = new AutoCloseableLock(readWriteLock.readLock());
   private final AutoCloseableLock writeLock = new AutoCloseableLock(readWriteLock.writeLock());
 
+  //Provides a threshold above which we report the time to load
+  private static final long LISTTIME_THRESHOLD_MSEC = 2000L;
--- End diff --

`LISTTIME_THRESHOLD_MSEC` -> `LIST_TIME_THRESHOLD_MSEC`


> Improve loading of profiles listing in the WebUI
> 
>
> Key: DRILL-5270
> URL: https://issues.apache.org/jira/browse/DRILL-5270
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Web Server
>Affects Versions: 1.9.0
>Reporter: Kunal Khatua
>Assignee: Kunal Khatua
>Priority: Major
> Fix For: 1.13.0


[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI

2018-02-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16379412#comment-16379412
 ] 

ASF GitHub Bot commented on DRILL-5270:
---

Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/755#discussion_r171089532
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/sys/store/DrillSysFilePathFilter.java
 ---
@@ -0,0 +1,53 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.sys.store;
+
+import static org.apache.drill.exec.ExecConstants.DRILL_SYS_FILE_SUFFIX;
+
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.PathFilter;
+
+/**
+ * Filter for Drill System Files
+ */
+public class DrillSysFilePathFilter implements PathFilter {
--- End diff --

Please consider using `FileSystemUtil`, which helps to create filters. 
Passing a custom filter is also possible.


> Improve loading of profiles listing in the WebUI
> 
>
> Key: DRILL-5270
> URL: https://issues.apache.org/jira/browse/DRILL-5270
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Web Server
>Affects Versions: 1.9.0
>Reporter: Kunal Khatua
>Assignee: Kunal Khatua
>Priority: Major
> Fix For: 1.13.0


[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI

2018-02-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16379407#comment-16379407
 ] 

ASF GitHub Bot commented on DRILL-5270:
---

Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/755#discussion_r171084819
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/sys/store/LocalPersistentStore.java
 ---
@@ -112,23 +127,65 @@ public static DrillFileSystem getFileSystem(DrillConfig config, Path root) throw
 
   @Override
   public Iterator> getRange(int skip, int take) {
+    //Marking currently seen modification time
+    long currBasePathModified = 0L;
+    try {
+      currBasePathModified = fs.getFileStatus(basePath).getModificationTime();
+    } catch (IOException ioexcp) {
+      ioexcp.printStackTrace();
--- End diff --

Please do not use `printStackTrace()`
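
A minimal sketch of the alternative: route the exception through a logger and fall back to a default value. To stay dependency-free the sketch uses `java.util.logging`, which is an assumption made for illustration only; the project's own classes would use whatever logger they already declare. `ModificationTimeReader` and `TimeSource` are hypothetical names.

```java
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical sketch: logs an I/O failure instead of calling printStackTrace(). */
final class ModificationTimeReader {
  private static final Logger logger =
      Logger.getLogger(ModificationTimeReader.class.getName());

  /** Stand-in for a call such as fs.getFileStatus(basePath).getModificationTime(). */
  interface TimeSource {
    long modificationTime() throws IOException;
  }

  /**
   * Reads the modification time, logging any failure with its stack trace.
   *
   * @param source   supplier of the modification time in milliseconds
   * @param fallback value to return when the time cannot be read
   * @return the modification time, or {@code fallback} on failure
   */
  static long readOrDefault(TimeSource source, long fallback) {
    try {
      return source.modificationTime();
    } catch (IOException e) {
      // Passing the exception to the logger keeps the stack trace in the log output.
      logger.log(Level.WARNING, "Could not read modification time; using " + fallback, e);
      return fallback;
    }
  }
}
```

Unlike `printStackTrace()`, this sends the failure to the configured log destination with a severity level and a message that says what was being attempted.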


> Improve loading of profiles listing in the WebUI
> 
>
> Key: DRILL-5270
> URL: https://issues.apache.org/jira/browse/DRILL-5270
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Web Server
>Affects Versions: 1.9.0
>Reporter: Kunal Khatua
>Assignee: Kunal Khatua
>Priority: Major
> Fix For: 1.13.0


[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI

2018-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16384056#comment-16384056
 ] 

ASF GitHub Bot commented on DRILL-5270:
---

Github user kkhatua commented on a diff in the pull request:

https://github.com/apache/drill/pull/755#discussion_r171944876
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/sys/store/DrillSysFilePathFilter.java
 ---
@@ -0,0 +1,53 @@
+/**
--- End diff --

OK. Will fix this.


> Improve loading of profiles listing in the WebUI
> 
>
> Key: DRILL-5270
> URL: https://issues.apache.org/jira/browse/DRILL-5270
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Web Server
>Affects Versions: 1.9.0
>Reporter: Kunal Khatua
>Assignee: Kunal Khatua
>Priority: Major
> Fix For: 1.14.0
>
>
> Currently, as the number of profiles increase, we reload the same list of 
> profiles from the FS.
> An ideal improvement would be to detect if there are any new profiles and 
> only reload from the disk then. Otherwise, a cached list is sufficient.
> For a directory of 280K profiles, the load time is close to 6 seconds on a 32 
> core server. With the caching, we can get it down to as much as a few 
> milliseconds.
> To render the cache as invalid, we inspect the last modified time of the 
> directory to confirm whether a reload is needed. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI

2018-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16384054#comment-16384054
 ] 

ASF GitHub Bot commented on DRILL-5270:
---

Github user kkhatua commented on a diff in the pull request:

https://github.com/apache/drill/pull/755#discussion_r171944817
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/sys/store/LocalPersistentStore.java
 ---
@@ -61,16 +63,29 @@
   private final AutoCloseableLock readLock = new 
AutoCloseableLock(readWriteLock.readLock());
   private final AutoCloseableLock writeLock = new 
AutoCloseableLock(readWriteLock.writeLock());
 
+  //Provides a threshold above which we report the time to load
+  private static final long LISTTIME_THRESHOLD_MSEC = 2000L;
--- End diff --

OK. Will fix this.


[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI

2018-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16384048#comment-16384048
 ] 

ASF GitHub Bot commented on DRILL-5270:
---

Github user kkhatua commented on a diff in the pull request:

https://github.com/apache/drill/pull/755#discussion_r171944701
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/sys/store/DrillSysFilePathFilter.java
 ---
@@ -0,0 +1,53 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.sys.store;
+
+import static org.apache.drill.exec.ExecConstants.DRILL_SYS_FILE_SUFFIX;
+
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.PathFilter;
+
+/**
+ * Filter for Drill System Files
+ */
+public class DrillSysFilePathFilter implements PathFilter {
--- End diff --

Ok.  I was thinking of using
```
List<FileStatus> fileStatuses = DrillFileSystemUtil.listFiles(fs, basePath, false, sysFileSuffixFilter);
```
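
To make the suffix-filtered listing concrete, here is a minimal sketch of the same idea. It is a stand-in that uses `java.nio.file` rather than Hadoop's `FileSystem`/`PathFilter` or `DrillFileSystemUtil`; the class and method names (`SuffixListing`, `listSysFiles`) are hypothetical, and the `.sys.drill` suffix is taken from the PR description.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Illustrative stand-in: java.nio instead of Hadoop's FileSystem/PathFilter.
final class SuffixListing {
  // Assumed value of DRILL_SYS_FILE_SUFFIX, per the ".sys.drill" files in the PR text.
  static final String SYS_FILE_SUFFIX = ".sys.drill";

  /** Returns the names of regular files directly under dir that end with the suffix. */
  static List<String> listSysFiles(Path dir) throws IOException {
    List<String> names = new ArrayList<>();
    DirectoryStream.Filter<Path> filter =
        p -> Files.isRegularFile(p) && p.getFileName().toString().endsWith(SYS_FILE_SUFFIX);
    // Non-recursive listing; the filter is applied while the directory is scanned.
    try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, filter)) {
      for (Path p : stream) {
        names.add(p.getFileName().toString());
      }
    }
    return names;
  }

  public static void main(String[] args) throws IOException {
    Path dir = Paths.get(args.length > 0 ? args[0] : ".");
    System.out.println(listSysFiles(dir));
  }
}
```

The order of the returned names is whatever the directory scan yields, so a caller that needs sorted output still has to sort.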


[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI

2018-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16384053#comment-16384053
 ] 

ASF GitHub Bot commented on DRILL-5270:
---

Github user kkhatua commented on a diff in the pull request:

https://github.com/apache/drill/pull/755#discussion_r171944767
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/sys/store/LocalPersistentStore.java
 ---
@@ -141,11 +198,33 @@ public static DrillFileSystem 
getFileSystem(DrillConfig config, Path root) throw
 }
   }
 
+  /**
+   * Add profile name to a TreeSet
+   * @param profileName
--- End diff --

OK. Will fix this. Eclipse didn't pop it up for me.


[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI

2018-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16384060#comment-16384060
 ] 

ASF GitHub Bot commented on DRILL-5270:
---

Github user kkhatua commented on a diff in the pull request:

https://github.com/apache/drill/pull/755#discussion_r171945072
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/sys/store/LocalPersistentStore.java
 ---
@@ -112,23 +127,65 @@ public static DrillFileSystem 
getFileSystem(DrillConfig config, Path root) throw
 
   @Override
   public Iterator<Map.Entry<String, V>> getRange(int skip, int take) {
+//Marking currently seen modification time
+long currBasePathModified = 0L;
+try {
+  currBasePathModified = fs.getFileStatus(basePath).getModificationTime();
+} catch (IOException ioexcp) {
+  ioexcp.printStackTrace();
--- End diff --

Will publish a log message and return an empty iterator for now. Not sure 
how to bubble up an error to the UI. I'll take a look at how we do so for 
profile deserialization as a guide
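
A minimal sketch of the "log a message and return an empty iterator" approach described above. It is an assumption-laden stand-in: it uses `java.util.logging` and `java.nio.file` so that it is self-contained (the Drill code uses a Hadoop `FileSystem` and its own logger), and `ModTimeProbe` and `rangeOrEmpty` are hypothetical names.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper; not the actual LocalPersistentStore code.
final class ModTimeProbe {
  private static final Logger logger = Logger.getLogger(ModTimeProbe.class.getName());

  /** Returns the cached entries, or an empty iterator if basePath cannot be inspected. */
  static Iterator<String> rangeOrEmpty(Path basePath, Iterable<String> cached) {
    final long currBasePathModified;
    try {
      currBasePathModified = Files.getLastModifiedTime(basePath).toMillis();
    } catch (IOException e) {
      // Log with the exception attached instead of calling printStackTrace().
      logger.log(Level.WARNING, "Unable to read modification time of " + basePath, e);
      return Collections.emptyIterator();
    }
    // A real store would compare this value with the last one it saw.
    logger.log(Level.FINE, "Base path last modified at {0}", currBasePathModified);
    return cached.iterator();
  }

  public static void main(String[] args) {
    Iterator<String> it = rangeOrEmpty(Paths.get("."), Arrays.asList("profile-1"));
    System.out.println(it.hasNext());
  }
}
```

Returning an empty iterator keeps the page rendering, but it hides the failure from the user, which is the open question raised in the comment.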


[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI

2018-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16384058#comment-16384058
 ] 

ASF GitHub Bot commented on DRILL-5270:
---

Github user kkhatua commented on a diff in the pull request:

https://github.com/apache/drill/pull/755#discussion_r171944952
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/sys/store/LocalPersistentStore.java
 ---
@@ -61,16 +63,29 @@
   private final AutoCloseableLock readLock = new 
AutoCloseableLock(readWriteLock.readLock());
   private final AutoCloseableLock writeLock = new 
AutoCloseableLock(readWriteLock.writeLock());
 
+  //Provides a threshold above which we report the time to load
+  private static final long LISTTIME_THRESHOLD_MSEC = 2000L;
+
+  private static final int DrillSysFileExtSize = 
DRILL_SYS_FILE_SUFFIX.length();
--- End diff --

I wanted to treat this like a constant, but the camel-case name makes it 
read like a class name, which is confusing.


[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI

2018-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16384067#comment-16384067
 ] 

ASF GitHub Bot commented on DRILL-5270:
---

Github user kkhatua commented on a diff in the pull request:

https://github.com/apache/drill/pull/755#discussion_r171945885
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/sys/store/LocalPersistentStore.java
 ---
@@ -112,23 +127,65 @@ public static DrillFileSystem 
getFileSystem(DrillConfig config, Path root) throw
 
   @Override
   public Iterator<Map.Entry<String, V>> getRange(int skip, int take) {
+//Marking currently seen modification time
+long currBasePathModified = 0L;
+try {
+  currBasePathModified = fs.getFileStatus(basePath).getModificationTime();
+} catch (IOException ioexcp) {
+  ioexcp.printStackTrace();
+}
+
+//Acquiring lock to avoid reloading for request coming in before completion of profile read
--- End diff --

I'll provide the explanation below.


[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI

2018-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16384066#comment-16384066
 ] 

ASF GitHub Bot commented on DRILL-5270:
---

Github user kkhatua commented on the issue:

https://github.com/apache/drill/pull/755
  
@arina-ielchiieva I need to rebase this on top of the latest master, 
considering it was originally based on code that is nearly a year old. When 
ready, I'll create a new PR or push to this one. Let me know which one works.


[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI

2018-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16384080#comment-16384080
 ] 

ASF GitHub Bot commented on DRILL-5270:
---

Github user kkhatua commented on the issue:

https://github.com/apache/drill/pull/755
  
The choice for a `TreeSet` is to basically use a binary structure that 
keeps the (maximum permitted) profiles sorted and in memory. 

When Drill detects changes (see 
https://github.com/kkhatua/drill/blob/f7ad29b9a322bb215d16b3c3b9a2bfc40abfc1ed/exec/java-exec/src/main/java/org/apache/drill/exec/store/sys/store/LocalPersistentStore.java#L146),
it fetches all the available profiles in the PStore and reconstructs the 
tree (since the order of the profiles returned by the `FileSystem` is not 
guaranteed). 

I tried using the `PathFilter` to fetch only new profiles, but the 
`FileSystem` call costs the same whether it fetches only the new profiles or 
the entire list! Also, some profiles might have been deleted while new ones 
were added, so a full reconstruction takes care of that scenario as well. 

To evict, as I construct the TreeSet, I simply pop the oldest (by filename) 
entry. The Guava cache options don't seem to provide a way to define the basis 
on which to evict entries.

I believe, @vrozov's work on DRILL-6053 is to address locking during writes 
specifically. The lock I used (and need) is for reads to ensure that multiple 
requests don't trigger an expensive FileSystem call for the same state of the 
PStore. 
e.g. consider T# as timestamps
* `currBasePathModified` = T0 
* _ThreadA_ requests at t=T1 and issues a read-lock
* _ThreadB_ requests at t=T2 but is waiting for read-lock

If the tree exists and no change is detected, _ThreadA_ will use the 
`TreeSet` contents and resume by releasing the lock. 

If the `TreeSet` exists and a change is detected, _ThreadA_ will 
reconstruct the `TreeSet` before using its contents and it will update 
`lastBasePathModified`, before releasing the lock.

When _ThreadB_ gets the read-lock, it discovers that during the wait, the 
`TreeSet` was already updated. So, in terms of t=T2, this is the most recent 
snapshot, so it proceeds to use the treeSet's contents rather than reconstruct. 
That will be deferred to the next request.

We're using the `lastBasePathModified` as a way to provide a 
pseudo-versioned access to the list. That means if there are more profiles 
added *after* _ThreadB_ was waiting for the read-lock, it will not trigger the 
`FileSystem` call right away. 
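
The pseudo-versioned access described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in the PR: `VersionedListing`, its constructor arguments, and the `reloads` counter are hypothetical, the listing is reduced to a set of names, and a plain `synchronized` block stands in for whichever exclusive lock the store ends up using.

```java
import java.util.Arrays;
import java.util.TreeSet;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical sketch of a listing cache versioned by a modification time.
final class VersionedListing {
  private final LongSupplier modTime;              // e.g. directory modification time
  private final Supplier<TreeSet<String>> loader;  // expensive full listing
  private TreeSet<String> cached = new TreeSet<>();
  private long lastBasePathModified = -1L;
  int reloads = 0;                                 // exposed only for illustration

  VersionedListing(LongSupplier modTime, Supplier<TreeSet<String>> loader) {
    this.modTime = modTime;
    this.loader = loader;
  }

  TreeSet<String> list() {
    // Sampled at request time, before waiting for the lock.
    long currBasePathModified = modTime.getAsLong();
    synchronized (this) {
      // Rebuild only if this request saw a newer state than the cached one.
      if (currBasePathModified > lastBasePathModified) {
        cached = loader.get();
        lastBasePathModified = currBasePathModified;
        reloads++;
      }
      return new TreeSet<>(cached); // defensive copy for the caller
    }
  }

  public static void main(String[] args) {
    AtomicLong clock = new AtomicLong(1L);
    VersionedListing v =
        new VersionedListing(clock::get, () -> new TreeSet<>(Arrays.asList("b", "a")));
    v.list();
    v.list();
    System.out.println(v.reloads);
  }
}
```

A request that waited for the lock while another request rebuilt the set finds `lastBasePathModified` already at or past the value it sampled, so it reuses the cached set instead of triggering a second listing.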



[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI

2018-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16384296#comment-16384296
 ] 

ASF GitHub Bot commented on DRILL-5270:
---

Github user vrozov commented on the issue:

https://github.com/apache/drill/pull/755
  
@kkhatua
1. The read locks are not exclusive (single writer/multiple readers). To 
achieve the required functionality you need to introduce a different lock and 
use write (or exclusive) lock.
2. The choice for TreeSet is not obvious. What are the most common 
operations performed on the collection? Do you optimize for get, put or 
collection construction?

@arina-ielchiieva my github id is `vrozov`.
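
Point 1 above can be checked with a small experiment on `java.util.concurrent.locks.ReentrantReadWriteLock`. The sketch below is illustrative only (`SharedReadLockDemo` and `probe` are hypothetical names): while one thread holds the read lock, a second thread can still take the read lock but cannot take the write lock.

```java
import java.util.Arrays;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical demo class; shows that read locks are shared, write locks exclusive.
final class SharedReadLockDemo {
  /** Returns {otherGotReadLock, otherGotWriteLock} observed while this thread holds the read lock. */
  static boolean[] probe() throws InterruptedException {
    ReentrantReadWriteLock rw = new ReentrantReadWriteLock();
    boolean[] result = new boolean[2];
    rw.readLock().lock();
    try {
      Thread other = new Thread(() -> {
        result[0] = rw.readLock().tryLock();   // shared: succeeds
        if (result[0]) {
          rw.readLock().unlock();
        }
        result[1] = rw.writeLock().tryLock();  // exclusive: fails while a reader holds the lock
        if (result[1]) {
          rw.writeLock().unlock();
        }
      });
      other.start();
      other.join(); // join() makes the other thread's writes visible here
    } finally {
      rw.readLock().unlock();
    }
    return result;
  }

  public static void main(String[] args) throws InterruptedException {
    System.out.println(Arrays.toString(probe()));
  }
}
```

So a read lock alone does not stop two requests from rebuilding the listing at the same time; serializing the rebuild needs the write lock or a separate exclusive lock.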


[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI

2018-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16384327#comment-16384327
 ] 

ASF GitHub Bot commented on DRILL-5270:
---

Github user kkhatua commented on the issue:

https://github.com/apache/drill/pull/755
  
Thanks, @vrozov. I'll make use of a separate lock for read-only purpose in 
case of `#1`.
For `#2`, I need to construct a size-limited ordered set from a list of 
unordered elements.
In this case, the elements (i.e. profiles) need to be ordered by file-name, 
which is a 1:1 mapping function of the start time epoch for the query.
So, I need to be able to add to such a data structure in `O(log(n))` time, 
remove in `O(1)` and iterate through it in sequence. So, my puts are the most 
expensive operation. 
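
A size-limited ordered set of this kind can be sketched with a `TreeSet` that evicts its smallest element once the capacity is exceeded. The names below (`BoundedProfileSet`, `newestFirst`) are hypothetical, and the sketch assumes that the lexicographically smallest file name is the oldest profile; if the naming scheme sorts the other way, `pollLast()` would be the eviction call instead. Note that `pollFirst()` on `java.util.TreeSet` costs `O(log(n))`, not `O(1)`.

```java
import java.util.TreeSet;

// Hypothetical sketch; assumes smaller names are older profiles.
final class BoundedProfileSet {
  private final int capacity;
  private final TreeSet<String> names = new TreeSet<>();

  BoundedProfileSet(int capacity) {
    if (capacity < 1) {
      throw new IllegalArgumentException("capacity must be positive");
    }
    this.capacity = capacity;
  }

  /** Adds a profile name, evicting the oldest entry if the set grows past capacity. */
  void add(String profileName) {
    names.add(profileName);          // O(log n)
    if (names.size() > capacity) {
      names.pollFirst();             // removes the smallest element, O(log n)
    }
  }

  /** Iterates from the newest (largest) name to the oldest retained one. */
  Iterable<String> newestFirst() {
    return names.descendingSet();
  }

  int size() {
    return names.size();
  }

  public static void main(String[] args) {
    BoundedProfileSet set = new BoundedProfileSet(2);
    for (String name : new String[] {"003", "001", "002"}) {
      set.add(name);
    }
    for (String name : set.newestFirst()) {
      System.out.println(name);
    }
  }
}
```

Feeding the unordered listing through `add` keeps at most `capacity` names in memory at any point, regardless of how many files the directory holds.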



[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI

2018-03-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16398934#comment-16398934
 ] 

ASF GitHub Bot commented on DRILL-5270:
---

Github user kkhatua commented on the issue:

https://github.com/apache/drill/pull/755
  
Holding off to do a rebase once @vrozov 's PR #1163 (DRILL-6053) goes into 
Apache.


[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI

2018-05-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16464374#comment-16464374
 ] 

ASF GitHub Bot commented on DRILL-5270:
---

kkhatua commented on issue #755: DRILL-5270: Improve loading of profiles 
listing in the WebUI
URL: https://github.com/apache/drill/pull/755#issuecomment-386729649
 
 
   Closing this PR in favor of #1250 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI

2018-05-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16464375#comment-16464375
 ] 

ASF GitHub Bot commented on DRILL-5270:
---

kkhatua closed pull request #755: DRILL-5270: Improve loading of profiles 
listing in the WebUI
URL: https://github.com/apache/drill/pull/755
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git 
a/exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java 
b/exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java
index 54fb46ab68..1dafb51f06 100644
--- a/exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java
+++ b/exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java
@@ -165,6 +165,8 @@ private ExecConstants() {
   public static final String SYS_STORE_PROVIDER_LOCAL_ENABLE_WRITE = 
"drill.exec.sys.store.provider.local.write";
   public static final String PROFILES_STORE_INMEMORY = 
"drill.exec.profiles.store.inmemory";
   public static final String PROFILES_STORE_CAPACITY = 
"drill.exec.profiles.store.capacity";
+  public static final String PROFILES_STORE_ARCHIVE_ENABLED = 
"drill.exec.profiles.store.archive.enabled";
+  public static final String PROFILES_STORE_ARCHIVE_RATE = 
"drill.exec.profiles.store.archive.rate";
   public static final String IMPERSONATION_ENABLED = 
"drill.exec.impersonation.enabled";
   public static final String IMPERSONATION_MAX_CHAINED_USER_HOPS = 
"drill.exec.impersonation.max_chained_user_hops";
   public static final String AUTHENTICATION_MECHANISMS = 
"drill.exec.security.auth.mechanisms";
diff --git 
a/exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/profile/ProfileResources.java
 
b/exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/profile/ProfileResources.java
index ec06f0ef4b..3569db972c 100644
--- 
a/exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/profile/ProfileResources.java
+++ 
b/exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/profile/ProfileResources.java
@@ -18,6 +18,7 @@
 package org.apache.drill.exec.server.rest.profile;
 
 import java.text.SimpleDateFormat;
+import java.util.ArrayList;
 import java.util.Collections;
 import java.util.Date;
 import java.util.Iterator;
@@ -58,6 +59,7 @@
 import org.apache.drill.exec.work.foreman.Foreman;
 import org.glassfish.jersey.server.mvc.Viewable;
 
+import com.google.common.base.Stopwatch;
 import com.google.common.collect.Lists;
 
 @Path("/")
@@ -71,6 +73,7 @@
   @Inject SecurityContext sc;
 
   public static class ProfileInfo implements Comparable<ProfileInfo> {
+private static final String TRAILING_DOTS = " ... ";
 private static final int QUERY_SNIPPET_MAX_CHAR = 150;
 private static final int QUERY_SNIPPET_MAX_LINES = 8;
 
@@ -171,13 +174,13 @@ private String extractQuerySnippet(String queryText) {
   //Trimming down based on line-count
   if (QUERY_SNIPPET_MAX_LINES < queryParts.length) {
 int linesConstructed = 0;
-StringBuilder lineCappedQuerySnippet = new StringBuilder();
+StringBuilder lineCappedQuerySnippet = new 
StringBuilder(QUERY_SNIPPET_MAX_CHAR + TRAILING_DOTS.length());
 for (String qPart : queryParts) {
   lineCappedQuerySnippet.append(qPart);
   if (++linesConstructed < QUERY_SNIPPET_MAX_LINES) {
 lineCappedQuerySnippet.append(System.lineSeparator());
   } else {
-lineCappedQuerySnippet.append(" ... ");
+lineCappedQuerySnippet.append(TRAILING_DOTS);
 break;
   }
 }
@@ -260,8 +263,6 @@ public QProfiles getProfilesJSON(@Context UriInfo uriInfo) {
 
   Collections.sort(runningQueries, Collections.reverseOrder());
 
-  final List<ProfileInfo> finishedQueries = Lists.newArrayList();
-
   //Defining #Profiles to load
   int maxProfilesToLoad = 
work.getContext().getConfig().getInt(ExecConstants.HTTP_MAX_PROFILES);
   String maxProfilesParams = 
uriInfo.getQueryParameters().getFirst(MAX_QPROFILES_PARAM);
@@ -269,8 +270,9 @@ public QProfiles getProfilesJSON(@Context UriInfo uriInfo) {
 maxProfilesToLoad = Integer.valueOf(maxProfilesParams);
   }
 
-  final Iterator<Map.Entry<String, QueryProfile>> range = completed.getRange(0, maxProfilesToLoad);
+  final List<ProfileInfo> finishedQueries = new ArrayList<>(maxProfilesToLoad);
 
+  final Iterator<Map.Entry<String, QueryProfile>> range = completed.getRange(0, maxProfilesToLoad);
   while (range.hasNext()) {
 try {
   final Map.Entry<String, QueryProfile> profileEntry = range.next();
diff --git 
a/exec/java-exec/src/main/java/org/apache/drill/exec/store/sys/store/DrillSysFilePathFilter.java
 
b/exec/java-exec/src/main/java/org/apache/drill/exec/store/sys/store/DrillSysFilePathFilter.java
new

[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI

2018-05-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16464369#comment-16464369
 ] 

ASF GitHub Bot commented on DRILL-5270:
---

kkhatua opened a new pull request #1250: DRILL-5270: Improve loading of 
profiles listing in the WebUI
URL: https://github.com/apache/drill/pull/1250
 
 
   When Drill displays profiles stored on the file system (Local or 
Distributed), it loads the entire list of `.sys.drill` files in the profile 
directory, then sorts and deserializes them. This can get expensive, since 
only a single CPU thread does this work.
   As an example, for a directory of 120K profiles, just fetching the list of 
files takes over 6 seconds. After that, the time varies with the number of 
profiles being rendered. Deserializing a standard profile takes about 30ms on 
average, which translates to an additional 3sec for rendering the default 
100 profiles.
   
   A user-reported issue confirms just that:
   DRILL-5028 Opening profiles page from web ui gets very slow when a lot of 
history files have been stored in HDFS or Local FS
   
   Additional JIRAs ask for management of these profiles:
   DRILL-2362 Drill should manage Query Profiling archiving
   DRILL-2861 enhance drill profile file management
   
   This PR brings the following enhancements to achieve that:
   1. Mimic the in-memory persistence of profiles (DRILL-5481), by keeping 
only a predefined `max-capacity` number of profiles in the directory and moving 
the oldest to an 'archived' sub-directory.
   2. Improve loading times by pinning the deserialized list in memory 
(TreeSet; for maintaining a memory-efficient sortedness of the profiles). That 
way, if we do not detect any new profiles in the profileStore (i.e. profile 
directory) since the last time a web-request for rendering the profiles was 
made, we can re-serve the same listing and skip making a trip to the filesystem 
to re-fetch all the profiles.
   
   Reload & reconstruction of the profiles in the Tree is done when any of the 
following changes:
 i.   Modification time of the profile dir
 ii.  Number of profiles in the profile dir
 iii. Number of profiles requested exceeds the currently available list
   
   3. When 2 or more web-requests for rendering arrive, the WebServer code 
already processes the requests sequentially. As a result, the earliest request 
will trigger the reconstruction of the in-memory profile-set, and the 
last-modified timestamp of the profileStore is tracked. This way, the remaining 
blocked requests can re-use the freshly-reconstructed profile-set for rendering 
if the underlying profileStore has not been modified. There is an assumption 
made here that the rate of profiles being added to the profileStore is not too 
high to trigger a reconstruction for every queued up request. 
   4. To prevent frequent archiving, there is a threshold (max-capacity) 
defined for triggering the archive. However, the number of profiles archived is 
chosen so that the number of profiles left unarchived is 90% of the threshold.
   5. To prevent the archiving process from taking too long, an archival rate 
(`drill.exec.profiles.store.archive.rate`) is defined so that up to that many 
profiles are archived in one go, before re-rendering resumes.
   6. On a Distributed FileSystem (e.g. HDFS), multiple Drillbits might attempt 
to archive. To mitigate that, if a Drillbit detects that it is unable to 
archive a profile, it will assume that another Drillbit is also archiving, and 
stop archiving any more.
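   
   As a rough illustration of reload conditions (i)-(iii) above, the check 
could be sketched as follows. This is a minimal sketch and not the code in this 
PR: the class `ProfileListingCache`, its method names, and the use of plain 
strings for profile entries are all assumptions made for the example. The 
extra guard on condition (iii), which avoids reloading forever when the 
directory simply holds fewer profiles than requested, is also an assumption.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the reload check; names are illustrative only. */
class ProfileListingCache {
  private long lastModifiedMillis = -1; // directory modification time at last rebuild
  private int lastProfileCount = -1;    // number of profile files at last rebuild
  private List<String> cached = new ArrayList<>();

  /** Returns true if any of the three tracked states has changed. */
  boolean needsReload(long dirModifiedMillis, int profilesInDir, int requested) {
    boolean dirTouched = dirModifiedMillis != lastModifiedMillis; // (i)
    boolean countChanged = profilesInDir != lastProfileCount;     // (ii)
    // (iii) more profiles requested than cached, and the directory has more to give
    boolean needMore = requested > cached.size() && cached.size() < profilesInDir;
    return dirTouched || countChanged || needMore;
  }

  /** Records the state of the directory together with the freshly built listing. */
  void rebuild(long dirModifiedMillis, int profilesInDir, List<String> newestFirst) {
    lastModifiedMillis = dirModifiedMillis;
    lastProfileCount = profilesInDir;
    cached = new ArrayList<>(newestFirst);
  }

  List<String> listing() {
    return cached;
  }
}
```

   A caller would invoke `needsReload(...)` on each web-request and only go 
to the filesystem for the full listing when it returns true; otherwise the 
pinned listing is re-served.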


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Improve loading of profiles listing in the WebUI
> 
>
> Key: DRILL-5270
> URL: https://issues.apache.org/jira/browse/DRILL-5270
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Web Server
>Affects Versions: 1.9.0
>Reporter: Kunal Khatua
>Assignee: Kunal Khatua
>Priority: Major
> Fix For: 1.14.0
>
>
> Currently, as the number of profiles increase, we reload the same list of 
> profiles from the FS.
> An ideal improvement would be to detect if there are any new profiles and 
> only reload from the disk then. Otherwise, a cached list is sufficient.
> For a directory of 280K profiles, the load time is close to 6 seconds on a 32 
> core server. With the caching, we can get it down to as much as a few 
> milliseconds.
> To render the cache as invalid, we inspect the last modified time of the

[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI

2018-05-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16464379#comment-16464379
 ] 

ASF GitHub Bot commented on DRILL-5270:
---

kkhatua commented on issue #1250: DRILL-5270: Improve loading of profiles 
listing in the WebUI
URL: https://github.com/apache/drill/pull/1250#issuecomment-386731514
 
 
   **[Current Apache Master]** User latency when 8 web-clients (wget) request 
for `/profiles`  against a profile store of 123K profiles (max scale range= 
2min)
   
![image](https://user-images.githubusercontent.com/4335237/39652431-606b5f94-4fa2-11e8-8166-9da97bddbdc8.png)
   



[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI

2018-05-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16464382#comment-16464382
 ] 

ASF GitHub Bot commented on DRILL-5270:
---

kkhatua commented on issue #1250: DRILL-5270: Improve loading of profiles 
listing in the WebUI
URL: https://github.com/apache/drill/pull/1250#issuecomment-386732039
 
 
   [DRILL-5270] User latency when 8 web-clients (wget) request for `/profiles` 
against a profile store of 123K profiles (max scale range= 2min).
   Note: Only caching is enabled and no new profiles have been written to the 
store during the 2 min window.
   _Notice how all the subsequent responses return quickly the moment the first 
response is complete, because of the profile cache._
   
![image](https://user-images.githubusercontent.com/4335237/39652532-b533be9a-4fa2-11e8-815e-d46ddcf1b0c5.png)
   



[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI

2018-05-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16464383#comment-16464383
 ] 

ASF GitHub Bot commented on DRILL-5270:
---

kkhatua commented on issue #1250: DRILL-5270: Improve loading of profiles 
listing in the WebUI
URL: https://github.com/apache/drill/pull/1250#issuecomment-386731514
 
 
   **[Current Apache Master]** User latency when 8 web-clients (wget) request 
for `/profiles`  against a profile store of 123K profiles (max scale range= 
2min)
   _Notice how all the response end times are staggered by ~13 secs from the 
previous, because of the profiles being re-read from the disk despite there 
being no change_
   
   
![image](https://user-images.githubusercontent.com/4335237/39652431-606b5f94-4fa2-11e8-8166-9da97bddbdc8.png)
   



[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI

2018-05-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16464384#comment-16464384
 ] 

ASF GitHub Bot commented on DRILL-5270:
---

kkhatua commented on issue #1250: DRILL-5270: Improve loading of profiles 
listing in the WebUI
URL: https://github.com/apache/drill/pull/1250#issuecomment-386732039
 
 
   [DRILL-5270] User latency when 8 web-clients (wget) request for `/profiles` 
against a profile store of 123K profiles (max scale range= 2min).
   Note: Only caching is enabled and no new profiles have been written to the 
store during the 2 min window.
   _Notice how all the subsequent responses return quickly the moment the first 
response is complete, because of the profile cache._
   
   
![image](https://user-images.githubusercontent.com/4335237/39652532-b533be9a-4fa2-11e8-815e-d46ddcf1b0c5.png)
   



[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI

2018-05-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16464388#comment-16464388
 ] 

ASF GitHub Bot commented on DRILL-5270:
---

kkhatua commented on issue #1250: DRILL-5270: Improve loading of profiles 
listing in the WebUI
URL: https://github.com/apache/drill/pull/1250#issuecomment-386733307
 
 
   [DRILL-5270] User latency when 8 web-clients (wget) request for `/profiles` 
against a profile store of 123K profiles (max scale range= 2min). The requests 
are done in 2 waves.
   Note: Both caching **and** archiving are enabled and no new profiles have 
been written to the store during the 2 min window.
   Notice how all the subsequent responses return quickly the moment the third 
response is complete. The first 3 clients triggered archiving of profiles from 
123K down to about 92K, each time trying to build the cache. By the time the 
fourth request comes, there is no more archiving, so the requests are served 
from cache (and, hence, they are barely 2-3 seconds apart). The second wave of 
requests from the 8 clients is now completely served by the cache.
   
   
![image](https://user-images.githubusercontent.com/4335237/39652615-0ce957b2-4fa3-11e8-89ee-a8a09e25cbd7.png)
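   
   The step-wise drop from 123K to roughly 92K can be reproduced with a small 
worked calculation. This is only a sketch: the class `ArchivePlanner`, the 
max-capacity of 100,000 and the archive rate of 10,000 are assumed values 
picked for illustration, not settings confirmed in this thread. Each pass 
archives the excess over 90% of the capacity, capped by the archive rate, and 
archiving stops once the store is back under the capacity.

```java
/** Hypothetical helper showing the per-pass archive arithmetic. */
class ArchivePlanner {
  /**
   * Number of profiles to archive in one pass: nothing while the store is at
   * or below maxCapacity; otherwise the excess over 90% of maxCapacity,
   * limited to archiveRate profiles per pass.
   */
  static int profilesToArchive(int inStore, int maxCapacity, int archiveRate) {
    if (inStore <= maxCapacity) {
      return 0; // threshold not crossed, no archiving triggered
    }
    int target = (int) (maxCapacity * 9L / 10); // keep 90% of the threshold
    int excess = inStore - target;
    return Math.min(excess, archiveRate);
  }

  /** Counts how many passes are needed until no more archiving is triggered. */
  static int passesUntilStable(int inStore, int maxCapacity, int archiveRate) {
    int passes = 0;
    int batch;
    while ((batch = profilesToArchive(inStore, maxCapacity, archiveRate)) > 0) {
      inStore -= batch;
      passes++;
    }
    return passes;
  }
}
```

   Under these assumed values, a store of 123,000 profiles needs three passes 
of 10,000 each (123,000, 113,000, 103,000, 93,000), which lines up with the 
first three requests being slow and the fourth being served from the cache.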
   



[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI

2018-05-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16464389#comment-16464389
 ] 

ASF GitHub Bot commented on DRILL-5270:
---

kkhatua commented on issue #1250: DRILL-5270: Improve loading of profiles 
listing in the WebUI
URL: https://github.com/apache/drill/pull/1250#issuecomment-386734075
 
 
   @arina-ielchiieva / @parthchandra could you review this?



[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI

2018-05-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16464420#comment-16464420
 ] 

ASF GitHub Bot commented on DRILL-5270:
---

kkhatua commented on issue #1250: DRILL-5270: Improve loading of profiles 
listing in the WebUI
URL: https://github.com/apache/drill/pull/1250#issuecomment-386733307
 
 
   [DRILL-5270] User latency when 8 web-clients (wget) request for `/profiles` 
against a profile store of 123K profiles (max scale range= 2min). The requests 
are done in 2 waves.
   Note: Both caching **and** archiving are enabled and no new profiles have 
been written to the store during the 2 min window.
   Notice how all the subsequent responses return quickly the moment the third 
response is complete. The first 3 clients triggered archiving of profiles from 
123K down to about 92K, each time trying to build the cache. By the time the 
fourth request comes, there is no more archiving, so the requests are served 
from cache (and, hence, they are barely 2-3 seconds apart). The second wave of 
requests from the 8 clients is now completely served by the cache.
   
   
![image](https://user-images.githubusercontent.com/4335237/39652615-0ce957b2-4fa3-11e8-89ee-a8a09e25cbd7.png)
   
   Backend logging reveals the archiving process:
   ```
   2018-05-01 22:47:37,870 kk127.qa.lab [qtp132047013-85] INFO  
o.a.d.e.s.s.s.LocalPersistentStore - Requesting thread: qtp132047013-85-85
   2018-05-01 22:47:45,131 kk127.qa.lab [qtp132047013-85] INFO  
o.a.d.e.s.s.s.LocalPersistentStore - Found 32935 excess profiles. For now, will 
attempt archiving 1 profiles to maprfs:/drillbit/profiles/archived
   2018-05-01 22:48:04,771 kk127.qa.lab [qtp132047013-85] INFO  
o.a.d.e.s.s.s.LocalPersistentStore - Archived 1 profiles to 
maprfs:/drillbit/profiles/archived in 19635 ms
   2018-05-01 22:48:04,774 kk127.qa.lab [qtp132047013-85] WARN  
o.a.d.e.s.s.s.LocalPersistentStore - Took 26902 ms to list & map 300 profiles 
(out of 122935 profiles in store)
   2018-05-01 22:48:12,310 kk127.qa.lab [qtp132047013-85] INFO  
o.a.d.e.s.s.s.LocalPersistentStore - Requesting thread: qtp132047013-85-85
   2018-05-01 22:48:18,439 kk127.qa.lab [qtp132047013-85] INFO  
o.a.d.e.s.s.s.LocalPersistentStore - Found 22935 excess profiles. For now, will 
attempt archiving 1 profiles to maprfs:/drillbit/profiles/archived
   2018-05-01 22:48:38,234 kk127.qa.lab [qtp132047013-85] INFO  
o.a.d.e.s.s.s.LocalPersistentStore - Archived 1 profiles to 
maprfs:/drillbit/profiles/archived in 19791 ms
   2018-05-01 22:48:38,236 kk127.qa.lab [qtp132047013-85] WARN  
o.a.d.e.s.s.s.LocalPersistentStore - Took 25924 ms to list & map 300 profiles 
(out of 112935 profiles in store)
   2018-05-01 22:48:43,275 kk127.qa.lab [qtp132047013-85] INFO  
o.a.d.e.s.s.s.LocalPersistentStore - Requesting thread: qtp132047013-85-85
   2018-05-01 22:48:48,911 kk127.qa.lab [qtp132047013-85] INFO  
o.a.d.e.s.s.s.LocalPersistentStore - Found 12935 excess profiles. For now, will 
attempt archiving 1 profiles to maprfs:/drillbit/profiles/archived
   2018-05-01 22:49:09,757 kk127.qa.lab [qtp132047013-85] INFO  
o.a.d.e.s.s.s.LocalPersistentStore - Archived 1 profiles to 
maprfs:/drillbit/profiles/archived in 20842 ms
   2018-05-01 22:49:09,759 kk127.qa.lab [qtp132047013-85] WARN  
o.a.d.e.s.s.s.LocalPersistentStore - Took 26482 ms to list & map 300 profiles 
(out of 102935 profiles in store)
   2018-05-01 22:49:14,119 kk127.qa.lab [qtp132047013-85] INFO  
o.a.d.e.s.s.s.LocalPersistentStore - Requesting thread: qtp132047013-85-85
   2018-05-01 22:49:19,339 kk127.qa.lab [qtp132047013-85] WARN  
o.a.d.e.s.s.s.LocalPersistentStore - Took 5217 ms to list & map 300 profiles 
(out of 92935 profiles in store)
   2018-05-01 22:49:23,656 kk127.qa.lab [qtp132047013-85] INFO  
o.a.d.e.s.s.s.LocalPersistentStore - Requesting thread: qtp132047013-85-85
   2018-05-01 22:49:24,214 kk127.qa.lab [qtp132047013-85] INFO  
o.a.d.e.s.s.s.LocalPersistentStore - Requesting thread: qtp132047013-85-85
   2018-05-01 22:49:24,798 kk127.qa.lab [qtp132047013-85] INFO  
o.a.d.e.s.s.s.LocalPersistentStore - Requesting thread: qtp132047013-85-85
   2018-05-01 22:49:25,365 kk127.qa.lab [qtp132047013-85] INFO  
o.a.d.e.s.s.s.LocalPersistentStore - Requesting thread: qtp132047013-85-85
   2018-05-01 22:55:12,247 kk127.qa.lab [qtp132047013-92] INFO  
o.a.d.e.s.s.s.LocalPersistentStore - Requesting thread: qtp132047013-92-92
   2018-05-01 22:55:12,791 kk127.qa.lab [qtp132047013-92] INFO  
o.a.d.e.s.s.s.LocalPersistentStore - Requesting thread: qtp132047013-92-92
   2018-05-01 22:55:13,276 kk127.qa.lab [qtp132047013-92] INFO  
o.a.d.e.s.s.s.LocalPersistentStore - Requesting thread: qtp132047013-92-92
   2018-05-01 22:55:13,770 kk127.qa.lab [qtp132047013-92] INFO  
o.a.d.e.s.s.s.LocalPersistentStore - Requesting thread: qtp132047013-92-92
   2018-05-01 22:55:30,477 kk127.qa.lab [qtp132047013-92] 

[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI

2018-05-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16466821#comment-16466821
 ] 

ASF GitHub Bot commented on DRILL-5270:
---

ilooner commented on issue #1250: DRILL-5270: Improve loading of profiles 
listing in the WebUI
URL: https://github.com/apache/drill/pull/1250#issuecomment-387275507
 
 
   @kkhatua Why not use the Guava Cache? http://www.baeldung.com/guava-cache . 
I think it would simplify the implementation.



[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI

2018-05-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16467733#comment-16467733
 ] 

ASF GitHub Bot commented on DRILL-5270:
---

kkhatua commented on issue #1250: DRILL-5270: Improve loading of profiles 
listing in the WebUI
URL: https://github.com/apache/drill/pull/1250#issuecomment-387483281
 
 
   I did consider using the Guava Cache initially, but I could not figure out 
how to specify the eviction policy based on the profile name. Guava provides a 
mechanism to limit the cache size and evict the oldest entry, but I wanted to 
override the mechanism that defines 'oldest'. Lastly, the TreeSet allows us to 
access the elements in a sorted order, which seemed missing in Guava.



[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI

2018-05-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16467737#comment-16467737
 ] 

ASF GitHub Bot commented on DRILL-5270:
---

kkhatua commented on issue #1250: DRILL-5270: Improve loading of profiles 
listing in the WebUI
URL: https://github.com/apache/drill/pull/1250#issuecomment-387483281
 
 
   I did consider using the Guava Cache initially, but I could not figure out 
how to specify the eviction policy based on the profile name. Guava provides a 
mechanism to limit the cache size and evict the oldest entry, but I wanted to 
override the mechanism that defines 'oldest'. Lastly, the TreeSet allows us to 
access the elements in a sorted order, which seemed missing in Guava.
   
   Do you think it makes the code cleaner if I were to extract the mechanism 
into a separate implementation of this 'cache' ?



[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI

2018-05-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16467811#comment-16467811
 ] 

ASF GitHub Bot commented on DRILL-5270:
---

ilooner commented on issue #1250: DRILL-5270: Improve loading of profiles 
listing in the WebUI
URL: https://github.com/apache/drill/pull/1250#issuecomment-387502051
 
 
   @kkhatua I'm still not sure why you want to override the definition of 
oldest? Why is the default LRU eviction policy not sufficient? 
   
   If you need an ordered list of keys for the cache you can accomplish this 
with the Guava cache by adding a key to a TreeSet when the Loader is called, 
and removing a key from a TreeSet when the Removal Listener is called.
   
   My main concern is that implementing our own cache creates complexity and 
opens up the possibility for bugs. Whereas a pre-existing cache is already 
debugged and tested for us.
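   
   To make the suggestion concrete, here is a minimal sketch of keeping a 
sorted key set in step with an LRU cache. It deliberately does not use Guava: 
a `java.util.LinkedHashMap` in access order stands in for the Guava cache, its 
`removeEldestEntry` hook plays the role of the Removal Listener, and the 
`Function` passed to the constructor plays the role of the Loader. The class 
name `SortedKeyLruCache` is made up for the example, and the sketch is not 
thread-safe.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeSet;
import java.util.function.Function;

/** Hypothetical LRU cache that also exposes its keys in sorted order. */
class SortedKeyLruCache<V> {
  private final TreeSet<String> sortedKeys = new TreeSet<>();
  private final Function<String, V> loader;
  private final LinkedHashMap<String, V> lru;

  SortedKeyLruCache(final int maxSize, Function<String, V> loader) {
    this.loader = loader;
    // accessOrder = true: the eldest entry is the least recently used one
    this.lru = new LinkedHashMap<String, V>(16, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<String, V> eldest) {
        if (size() > maxSize) {
          sortedKeys.remove(eldest.getKey()); // "removal listener" side effect
          return true;
        }
        return false;
      }
    };
  }

  /** Returns the cached value, loading it (and recording its key) on a miss. */
  V get(String key) {
    V value = lru.get(key); // a hit also marks the entry as recently used
    if (value == null) {
      value = loader.apply(key); // "loader" call; values are assumed non-null
      sortedKeys.add(key);
      lru.put(key, value);       // may evict the least recently used entry
    }
    return value;
  }

  /** Snapshot of the currently cached keys in sorted order. */
  TreeSet<String> keys() {
    return new TreeSet<>(sortedKeys);
  }
}
```

   Note that what gets evicted here is decided by access recency, not by key 
order, which is exactly the behavioural difference under discussion.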



[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI

2018-05-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469293#comment-16469293
 ] 

ASF GitHub Bot commented on DRILL-5270:
---

kkhatua commented on issue #1250: DRILL-5270: Improve loading of profiles 
listing in the WebUI
URL: https://github.com/apache/drill/pull/1250#issuecomment-387839266
 
 
   The cache is constructed by first listing all the profile files and sorting 
them (profile IDs are generated as monotonically decreasing values to ensure 
sortedness in stores like HBase). This customized TreeSet is used to inject 
profiles (since the FileSystem is not guaranteed to return the list in order), 
so the TreeSet provides the ordering. We retain only the first N (which are, 
implicitly, the latest profiles). If we were to add more profiles than the max 
capacity, the TreeSet is pruned at the rightmost end.
   With Guava, the eviction policy provides the option of limiting the size, 
but it evicts based on the least-recently used/accessed entry, which would not 
work here.
   Also, this is currently not a true cache, because the moment we detect 
changes in the underlying store, we reconstruct this 'cache'. Ideally, we'd 
want to identify only the newest profiles returned from the FileSystem (using 
filename filters), but the Hadoop API performance is the same (irrespective of 
the filter).
   Primarily, we save the time spent fetching the file list from the FS and 
deserializing.
   I can move the implementation of the TreeSet to a separate class to clean up 
the code. That would make debugging simpler too. With Guava, I don't see the 
value add beyond a lower risk of bugs, which should be minimal with the TreeSet 
too. 
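   
   For reference, the pruning behaviour described above can be sketched in a 
few lines. This is an illustrative sketch only, not the class in the PR: the 
name `LatestProfiles` is hypothetical, and profile IDs are modelled as plain 
strings whose natural ordering puts the newest profile first (which is what 
monotonically decreasing IDs give us).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical capacity-bounded sorted set that keeps only the first N IDs. */
class LatestProfiles {
  private final TreeSet<String> ids = new TreeSet<>();
  private final int maxCapacity;

  LatestProfiles(int maxCapacity) {
    this.maxCapacity = maxCapacity;
  }

  /** Adds an ID in any order; anything beyond capacity is pruned from the right. */
  void add(String profileId) {
    ids.add(profileId);
    while (ids.size() > maxCapacity) {
      ids.pollLast(); // rightmost element = largest ID = oldest profile
    }
  }

  /** The retained IDs, smallest (newest) first. */
  List<String> newestFirst() {
    return new ArrayList<>(ids);
  }
}
```

   Because the set sorts on insert, the order in which the FileSystem returns 
the files does not matter; only the N smallest IDs survive.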


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Improve loading of profiles listing in the WebUI
> 
>
> Key: DRILL-5270
> URL: https://issues.apache.org/jira/browse/DRILL-5270
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Web Server
>Affects Versions: 1.9.0
>Reporter: Kunal Khatua
>Assignee: Kunal Khatua
>Priority: Major
> Fix For: 1.14.0
>
>
> Currently, as the number of profiles increase, we reload the same list of 
> profiles from the FS.
> An ideal improvement would be to detect if there are any new profiles and 
> only reload from the disk then. Otherwise, a cached list is sufficient.
> For a directory of 280K profiles, the load time is close to 6 seconds on a 32 
> core server. With the caching, we can get it down to as much as a few 
> milliseconds.
> To render the cache as invalid, we inspect the last modified time of the 
> directory to confirm whether a reload is needed. 
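
The invalidation check described above can be sketched roughly as follows. 
This is an assumption-laden illustration, not the code in the pull request: it 
uses `java.nio.file` on a local directory rather than the Hadoop FileSystem API 
that Drill uses, and the class name `CachedProfileListing` and its methods are 
hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: re-list the directory only when its last-modified time changes.
class CachedProfileListing {
  private final Path profileDir;
  private FileTime lastSeenModified;            // null until the first load
  private List<String> cachedNames = Collections.emptyList();
  private int reloadCount;                      // exposed only to observe cache hits

  CachedProfileListing(Path profileDir) {
    this.profileDir = profileDir;
  }

  synchronized List<String> list() {
    try {
      FileTime current = Files.getLastModifiedTime(profileDir);
      if (!current.equals(lastSeenModified)) {
        // Directory changed (or first call): rebuild the cached list from disk.
        List<String> names = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(profileDir)) {
          for (Path entry : entries) {
            names.add(entry.getFileName().toString());
          }
        }
        Collections.sort(names);
        cachedNames = Collections.unmodifiableList(names);
        lastSeenModified = current;
        reloadCount++;
      }
      return cachedNames;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  synchronized int reloadCount() {
    return reloadCount;
  }
}
```

Two caveats apply to this sketch: a directory's modification time typically 
changes only when entries are added, removed, or renamed, and timestamp 
granularity on some file systems may hide changes made within the same tick.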



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI

2018-05-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469487#comment-16469487
 ] 

ASF GitHub Bot commented on DRILL-5270:
---

ilooner commented on issue #1250: DRILL-5270: Improve loading of profiles 
listing in the WebUI
URL: https://github.com/apache/drill/pull/1250#issuecomment-387873929
 
 
   @kkhatua 
   
   I think I understand the difference in our two perspectives. You wanted a 
cache that always contains only the **N** most recently created profiles. 
If you happen to access the **N + 1**th youngest profile, the cache will not 
contain it and never will; it only ever holds the **N** most recently created 
profiles.
   
   I still prefer the approach with the Guava cache because you can 
effectively achieve the same result. As new profiles are created they can be 
added to the cache. If you access a very old profile, one more recently created 
profile will be evicted from the cache and the old profile will be added to the 
cache, since a user just requested it. I would argue this behavior is not only 
easier to implement, since we are leveraging a library, but actually more 
desirable, since it caches a profile based on when it is used, not when it was 
created.
   
   If you still disagree with using the Guava cache, I agree with your proposal 
of moving your cache into a separate class. I think you should also add some 
unit tests for the cache to verify that it works as expected. The unit tests 
will also make maintaining and enhancing the class easier for future developers.
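
   For contrast, access-based eviction of the kind argued for here can be 
sketched with the JDK alone. This is a stand-in, not Guava: it uses 
`LinkedHashMap` in access order, and the class name `LruProfileCache` is 
hypothetical. It only illustrates the behaviour where a requested old entry 
stays and a less recently used one is evicted.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a size-limited, access-ordered cache (not Guava).
class LruProfileCache<V> extends LinkedHashMap<String, V> {
  private static final long serialVersionUID = 1L;
  private final int maxEntries;

  LruProfileCache(int maxEntries) {
    // accessOrder = true: iteration order runs from least to most recently accessed.
    super(16, 0.75f, true);
    this.maxEntries = maxEntries;
  }

  @Override
  protected boolean removeEldestEntry(Map.Entry<String, V> eldest) {
    // Called after each insertion; evicts the least recently accessed entry when over capacity.
    return size() > maxEntries;
  }
}
```

   With a capacity of two, putting `a` and `b`, reading `a`, and then putting 
`c` evicts `b` rather than `a`, because `a` was used more recently even though 
it was inserted first.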
   




[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI

2018-05-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16470959#comment-16470959
 ] 

ASF GitHub Bot commented on DRILL-5270:
---

kkhatua commented on issue #1250: DRILL-5270: Improve loading of profiles 
listing in the WebUI
URL: https://github.com/apache/drill/pull/1250#issuecomment-388152463
 
 
   I actually like the Guava cache approach for its elegance and capabilities, 
but it expands the scope significantly without a huge benefit over what we 
currently have. The cache that you are envisioning holds the complete profile, 
whereas this one is only for the listing of the profiles. When an individual 
profile is accessed, Drill ends up fetching a new copy from the PStore to 
serialize the contents to visualize it. 
   I'll move the class and add some unit tests as well. 




[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI

2018-05-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16471016#comment-16471016
 ] 

ASF GitHub Bot commented on DRILL-5270:
---

ilooner commented on issue #1250: DRILL-5270: Improve loading of profiles 
listing in the WebUI
URL: https://github.com/apache/drill/pull/1250#issuecomment-388166323
 
 
   @kkhatua Sounds good. Thanks for the explanations and thanks for improving 
the performance so much :) !




[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI

2018-05-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16473159#comment-16473159
 ] 

ASF GitHub Bot commented on DRILL-5270:
---

kkhatua commented on issue #1250: DRILL-5270: Improve loading of profiles 
listing in the WebUI
URL: https://github.com/apache/drill/pull/1250#issuecomment-388564546
 
 
   Done with all the changes. I found an unused import in an unrelated file, 
so I fixed that to make sure the code builds after rebasing onto the latest 
master. 
   @arina-ielchiieva / @parthchandra / @ilooner 
   Can any (or all) of you do a review?




[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI

2018-05-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16474313#comment-16474313
 ] 

ASF GitHub Bot commented on DRILL-5270:
---

arina-ielchiieva commented on a change in pull request #1250: DRILL-5270: 
Improve loading of profiles listing in the WebUI
URL: https://github.com/apache/drill/pull/1250#discussion_r187981315
 
 

 ##
 File path: 
exec/java-exec/src/test/java/org/apache/drill/exec/store/sys/TestProfileSet.java
 ##
 @@ -0,0 +1,130 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.sys;
+
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertTrue;
+
+import java.util.LinkedList;
+import java.util.List;
+import java.util.Random;
+
+import org.apache.commons.lang3.StringUtils;
+import org.apache.drill.exec.store.sys.store.ProfileSet;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+/**
+ * Test the size-constrained ProfileSet for use in the webserver's '/profiles' listing
+ */
+public class TestProfileSet {
+  private final static String PROFILE_PREFIX = "t35t-pr0fil3-";
+  static int initCapacity;
+  static int finalCapacity;
+  static int storeCount;
+  static Random rand;
+  static List<String> masterList;
+
+  @BeforeClass
+  public static void setupProfileSet() {
+    initCapacity = 50;
+    finalCapacity = 70;
+    storeCount = 100;
+    rand = new Random();
+    //Generating source list of storeCount # 'profiles'
+    masterList = new LinkedList<String>();
+    for (int i = 0; i < storeCount; i++) {
+      masterList.add(PROFILE_PREFIX + StringUtils.leftPad(String.valueOf(i), String.valueOf(storeCount).length(), '0'));
+    }
+  }
+
+  @Test
+  public void testProfileOrder() throws Exception {
+    //clone initial # profiles and verify via iterator.
+    ProfileSet testSet = new ProfileSet(initCapacity);
+    List<String> srcList = new LinkedList<String>(masterList);
+
+    //Loading randomly
+    for (int i = 0; i < initCapacity; i++) {
+      String poppedProfile = testSet.add(srcList.remove(rand.nextInt(storeCount - i)));
+      assert (poppedProfile == null);
+      assertEquals(null, poppedProfile);
+    }
+
+    //Testing order
+    String prevProfile = null;
+    while (!testSet.isEmpty()) {
+      String currOldestProfile = testSet.removeOldest();
+      if (prevProfile != null) {
+        assertTrue( prevProfile.compareTo(currOldestProfile) > 0 );
+      }
+      prevProfile = currOldestProfile;
+    }
+  }
+
+  //Test if inserts exceeding capacity leads to eviction of oldest
+  @Test
+  public void testExcessInjection() throws Exception {
+    //clone initial # profiles and verify via iterator.
+    ProfileSet testSet = new ProfileSet(initCapacity);
+    List<String> srcList = new LinkedList<String>(masterList);
+
+    //Loading randomly
+    for (int i = 0; i < initCapacity; i++) {
+      String poppedProfile = testSet.add(srcList.remove(rand.nextInt(storeCount - i)));
+      assertEquals(null, poppedProfile);
+    }
+
+    //Testing Excess by looking at oldest popped
+    for (int i = initCapacity; i < finalCapacity; i++) {
+      String toInsert = srcList.remove(rand.nextInt(storeCount - i));
+      String expectedToPop = ( toInsert.compareTo(testSet.getOldest()) > 0 ?
+          toInsert : testSet.getOldest() );
+
+      String oldestPoppedProfile = testSet.add(toInsert);
+      assertEquals(expectedToPop, oldestPoppedProfile);
+    }
+
+    assertEquals(initCapacity, testSet.size());
+  }
+
+  //Test if size internally resizes to final capacity with no evictions
+  @Test
+  public void testSetResize() throws Exception {
+    //clone initial # profiles into a 700-capacity set.
+    ProfileSet testSet = new ProfileSet(finalCapacity);
+    List<String> srcList = new LinkedList<String>(masterList);
+
+    //Loading randomly
+    for (int i = 0; i < initCapacity; i++) {
+      String poppedProfile = testSet.add(srcList.remove(rand.nextInt(storeCount - i)));
+      assertEquals(null, poppedProfile);
+    }
+
+    assert(testSet.size() == initCapacity);
 
 Review comment:
   Please use junit assertions in tests.

---

[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI

2018-05-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16474317#comment-16474317
 ] 

ASF GitHub Bot commented on DRILL-5270:
---

arina-ielchiieva commented on a change in pull request #1250: DRILL-5270: 
Improve loading of profiles listing in the WebUI
URL: https://github.com/apache/drill/pull/1250#discussion_r187983484
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/sys/store/LocalPersistentStore.java
 ##
 @@ -1,220 +1,377 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package org.apache.drill.exec.store.sys.store;
-
-import static org.apache.drill.exec.ExecConstants.DRILL_SYS_FILE_SUFFIX;
-
-import java.io.File;
-import java.io.IOException;
-import java.io.InputStream;
-import java.io.OutputStream;
-import java.util.Collections;
-import java.util.Iterator;
-import java.util.List;
-import java.util.Map;
-import java.util.Map.Entry;
-
-import javax.annotation.Nullable;
-
-import org.apache.commons.io.IOUtils;
-import org.apache.drill.common.collections.ImmutableEntry;
-import org.apache.drill.common.config.DrillConfig;
-import org.apache.drill.exec.store.dfs.DrillFileSystem;
-import org.apache.drill.exec.util.DrillFileSystemUtil;
-import org.apache.drill.exec.store.sys.BasePersistentStore;
-import org.apache.drill.exec.store.sys.PersistentStoreConfig;
-import org.apache.drill.exec.store.sys.PersistentStoreMode;
-import org.apache.hadoop.conf.Configuration;
-import org.apache.hadoop.fs.FileStatus;
-import org.apache.hadoop.fs.FileSystem;
-import org.apache.hadoop.fs.Path;
-
-import com.google.common.base.Function;
-import com.google.common.base.Preconditions;
-import com.google.common.collect.Iterables;
-import com.google.common.collect.Lists;
-import org.apache.hadoop.fs.PathFilter;
-import org.slf4j.Logger;
-import org.slf4j.LoggerFactory;
-
-public class LocalPersistentStore<V> extends BasePersistentStore<V> {
-  private static final Logger logger = LoggerFactory.getLogger(LocalPersistentStore.class);
-
-  private final Path basePath;
-  private final PersistentStoreConfig<V> config;
-  private final DrillFileSystem fs;
-
-  public LocalPersistentStore(DrillFileSystem fs, Path base, PersistentStoreConfig<V> config) {
-    this.basePath = new Path(base, config.getName());
-    this.config = config;
-    this.fs = fs;
-    try {
-      mkdirs(getBasePath());
-    } catch (IOException e) {
-      throw new RuntimeException("Failure setting pstore configuration path.");
-    }
-  }
-
-  protected Path getBasePath() {
-    return basePath;
-  }
-
-  @Override
-  public PersistentStoreMode getMode() {
-    return PersistentStoreMode.PERSISTENT;
-  }
-
-  private void mkdirs(Path path) throws IOException {
-    fs.mkdirs(path);
-  }
-
-  public static Path getLogDir() {
-    String drillLogDir = System.getenv("DRILL_LOG_DIR");
-    if (drillLogDir == null) {
-      drillLogDir = System.getProperty("drill.log.dir");
-    }
-    if (drillLogDir == null) {
-      drillLogDir = "/var/log/drill";
-    }
-    return new Path(new File(drillLogDir).getAbsoluteFile().toURI());
-  }
-
-  public static DrillFileSystem getFileSystem(DrillConfig config, Path root) throws IOException {
-    Path blobRoot = root == null ? getLogDir() : root;
-    Configuration fsConf = new Configuration();
-    if (blobRoot.toUri().getScheme() != null) {
-      fsConf.set(FileSystem.FS_DEFAULT_NAME_KEY, blobRoot.toUri().toString());
-    }
-
-
-    DrillFileSystem fs = new DrillFileSystem(fsConf);
-    fs.mkdirs(blobRoot);
-    return fs;
-  }
-
-  @Override
-  public Iterator<Map.Entry<String, V>> getRange(int skip, int take) {
-    try {
-      // list only files with sys file suffix
-      PathFilter sysFileSuffixFilter = new PathFilter() {
-        @Override
-        public boolean accept(Path path) {
-          return path.getName().endsWith(DRILL_SYS_FILE_SUFFIX);
-        }
-      };
-
-      List<FileStatus> fileStatuses = DrillFileSystemUtil.listFiles(fs, basePath, false, sysFileSuffixFilter);
-      if (fileStatuses.isEmpty()) {
-        return Collections.emptyIterator();
-      }
-
-      List<String> files = Lists.newArrayList();
-      for (File

[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI

2018-05-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16474314#comment-16474314
 ] 

ASF GitHub Bot commented on DRILL-5270:
---

arina-ielchiieva commented on a change in pull request #1250: DRILL-5270: 
Improve loading of profiles listing in the WebUI
URL: https://github.com/apache/drill/pull/1250#discussion_r187981529
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/sys/store/ProfileSet.java
 ##
 @@ -0,0 +1,152 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.sys.store;
+
+import java.util.Iterator;
+import java.util.TreeSet;
+import java.util.concurrent.atomic.AtomicInteger;
+
+/**
+ * Wrapper around TreeSet to mimic a size-bound set ordered by name (implicitly the profiles' age)
+ */
+public class ProfileSet implements Iterable<String> {
+  private TreeSet<String> store;
+  private int maxCapacity;
+  //Using a dedicated counter to avoid
+  private AtomicInteger size;
+
+  @SuppressWarnings("unused")
+  @Deprecated
+  private ProfileSet() {}
+
+  public ProfileSet(int capacity) {
+    this.store = new TreeSet<String>();
+    this.maxCapacity = capacity;
+    this.size = new AtomicInteger();
+  }
+
+  public int size() {
+    return size.get();
+  }
+
+  /**
+   * Get max capacity of the profile set
+   * @return max capacity
+   */
+  public int capacity() {
+    return maxCapacity;
+  }
+
+  /**
+   * Add a profile name to the set, while removing the oldest, if exceeding capacity
+   * @param profile
+   * @return oldest profile
+   */
+  public String add(String profile) {
+    return add(profile, false);
+  }
+
+  /**
+   * Add a profile name to the set, while removing the oldest or youngest, based on flag
+   * @param profile
+   * @param retainOldest indicate retaining policy as oldest
+   * @return youngest/oldest profile
+   */
+  public String add(String profile, boolean retainOldest) {
+    store.add(profile);
+    if ( size.incrementAndGet() > maxCapacity ) {
 
 Review comment:
   Please remove spaces.




[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI

2018-05-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16474318#comment-16474318
 ] 

ASF GitHub Bot commented on DRILL-5270:
---

arina-ielchiieva commented on a change in pull request #1250: DRILL-5270: 
Improve loading of profiles listing in the WebUI
URL: https://github.com/apache/drill/pull/1250#discussion_r187984531
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/sys/store/LocalPersistentStore.java
 ##
 @@ -1,220 +1,377 @@

[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI

2018-05-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16474320#comment-16474320
 ] 

ASF GitHub Bot commented on DRILL-5270:
---

arina-ielchiieva commented on a change in pull request #1250: DRILL-5270: 
Improve loading of profiles listing in the WebUI
URL: https://github.com/apache/drill/pull/1250#discussion_r187984786
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/sys/store/LocalPersistentStore.java
 ##
 @@ -1,220 +1,377 @@

[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI

2018-05-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16474319#comment-16474319
 ] 

ASF GitHub Bot commented on DRILL-5270:
---

arina-ielchiieva commented on a change in pull request #1250: DRILL-5270: 
Improve loading of profiles listing in the WebUI
URL: https://github.com/apache/drill/pull/1250#discussion_r187982481
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/sys/store/LocalPersistentStore.java
 ##
 @@ -1,220 +1,377 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package org.apache.drill.exec.store.sys.store;
-
-import static org.apache.drill.exec.ExecConstants.DRILL_SYS_FILE_SUFFIX;
-
-import java.io.File;
-import java.io.IOException;
-import java.io.InputStream;
-import java.io.OutputStream;
-import java.util.Collections;
-import java.util.Iterator;
-import java.util.List;
-import java.util.Map;
-import java.util.Map.Entry;
-
-import javax.annotation.Nullable;
-
-import org.apache.commons.io.IOUtils;
-import org.apache.drill.common.collections.ImmutableEntry;
-import org.apache.drill.common.config.DrillConfig;
-import org.apache.drill.exec.store.dfs.DrillFileSystem;
-import org.apache.drill.exec.util.DrillFileSystemUtil;
-import org.apache.drill.exec.store.sys.BasePersistentStore;
-import org.apache.drill.exec.store.sys.PersistentStoreConfig;
-import org.apache.drill.exec.store.sys.PersistentStoreMode;
-import org.apache.hadoop.conf.Configuration;
-import org.apache.hadoop.fs.FileStatus;
-import org.apache.hadoop.fs.FileSystem;
-import org.apache.hadoop.fs.Path;
-
-import com.google.common.base.Function;
-import com.google.common.base.Preconditions;
-import com.google.common.collect.Iterables;
-import com.google.common.collect.Lists;
-import org.apache.hadoop.fs.PathFilter;
-import org.slf4j.Logger;
-import org.slf4j.LoggerFactory;
-
-public class LocalPersistentStore<V> extends BasePersistentStore<V> {
-  private static final Logger logger = LoggerFactory.getLogger(LocalPersistentStore.class);
-
-  private final Path basePath;
-  private final PersistentStoreConfig<V> config;
-  private final DrillFileSystem fs;
-
-  public LocalPersistentStore(DrillFileSystem fs, Path base, PersistentStoreConfig<V> config) {
-    this.basePath = new Path(base, config.getName());
-    this.config = config;
-    this.fs = fs;
-    try {
-      mkdirs(getBasePath());
-    } catch (IOException e) {
-      throw new RuntimeException("Failure setting pstore configuration path.");
-    }
-  }
-
-  protected Path getBasePath() {
-    return basePath;
-  }
-
-  @Override
-  public PersistentStoreMode getMode() {
-    return PersistentStoreMode.PERSISTENT;
-  }
-
-  private void mkdirs(Path path) throws IOException {
-    fs.mkdirs(path);
-  }
-
-  public static Path getLogDir() {
-    String drillLogDir = System.getenv("DRILL_LOG_DIR");
-    if (drillLogDir == null) {
-      drillLogDir = System.getProperty("drill.log.dir");
-    }
-    if (drillLogDir == null) {
-      drillLogDir = "/var/log/drill";
-    }
-    return new Path(new File(drillLogDir).getAbsoluteFile().toURI());
-  }
-
-  public static DrillFileSystem getFileSystem(DrillConfig config, Path root) throws IOException {
-    Path blobRoot = root == null ? getLogDir() : root;
-    Configuration fsConf = new Configuration();
-    if (blobRoot.toUri().getScheme() != null) {
-      fsConf.set(FileSystem.FS_DEFAULT_NAME_KEY, blobRoot.toUri().toString());
-    }
-
-
-    DrillFileSystem fs = new DrillFileSystem(fsConf);
-    fs.mkdirs(blobRoot);
-    return fs;
-  }
-
-  @Override
-  public Iterator<Map.Entry<String, V>> getRange(int skip, int take) {
-    try {
-      // list only files with sys file suffix
-      PathFilter sysFileSuffixFilter = new PathFilter() {
-        @Override
-        public boolean accept(Path path) {
-          return path.getName().endsWith(DRILL_SYS_FILE_SUFFIX);
-        }
-      };
-
-      List<FileStatus> fileStatuses = DrillFileSystemUtil.listFiles(fs, basePath, false, sysFileSuffixFilter);
-      if (fileStatuses.isEmpty()) {
-        return Collections.emptyIterator();
-      }
-
-      List<String> files = Lists.newArrayList();
-      for (File

[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI

2018-05-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16474316#comment-16474316
 ] 

ASF GitHub Bot commented on DRILL-5270:
---

arina-ielchiieva commented on a change in pull request #1250: DRILL-5270: 
Improve loading of profiles listing in the WebUI
URL: https://github.com/apache/drill/pull/1250#discussion_r187984321
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/sys/store/LocalPersistentStore.java
 ##
 @@ -1,220 +1,377 @@

[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI

2018-05-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16474315#comment-16474315
 ] 

ASF GitHub Bot commented on DRILL-5270:
---

arina-ielchiieva commented on a change in pull request #1250: DRILL-5270: 
Improve loading of profiles listing in the WebUI
URL: https://github.com/apache/drill/pull/1250#discussion_r187981874
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/sys/store/LocalPersistentStore.java
 ##
 @@ -1,220 +1,377 @@

[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI

2018-05-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16474652#comment-16474652
 ] 

ASF GitHub Bot commented on DRILL-5270:
---

kkhatua commented on a change in pull request #1250: DRILL-5270: Improve 
loading of profiles listing in the WebUI
URL: https://github.com/apache/drill/pull/1250#discussion_r188063691
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/sys/store/LocalPersistentStore.java
 ##
 @@ -1,220 +1,377 @@

[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI

2018-05-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16474658#comment-16474658
 ] 

ASF GitHub Bot commented on DRILL-5270:
---

kkhatua commented on a change in pull request #1250: DRILL-5270: Improve 
loading of profiles listing in the WebUI
URL: https://github.com/apache/drill/pull/1250#discussion_r188064161
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/sys/store/LocalPersistentStore.java
 ##
 @@ -1,220 +1,377 @@

[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI

2018-05-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16488008#comment-16488008
 ] 

ASF GitHub Bot commented on DRILL-5270:
---

kkhatua commented on issue #1250: DRILL-5270: Improve loading of profiles 
listing in the WebUI
URL: https://github.com/apache/drill/pull/1250#issuecomment-391489764
 
 
   @arina-ielchiieva I've made the following changes: 
   1. Refactored to introduce an Archiver.
   2. Restricted the cache so that it applies only to the WebServer.
   3. Added support for recursive listing for non-WebServer requests, such as 
SysTables. Archiving speeds up listing for the WebServer, but SysTables still 
needs access to archived profiles for analytics.
   4. Added tests for the ProfileSet cache.
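
   The difference between the flat listing (WebServer) and the recursive 
listing (SysTables) in item 3 could look roughly like the sketch below. This is 
only an illustration using java.nio rather than the Hadoop FileSystem API that 
the pull request works with; the class name ProfileLister, the method 
listProfiles, and the idea that archived profiles sit in a subdirectory are all 
assumptions made for the example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper; not the class introduced by the pull request. */
public class ProfileLister {

  /**
   * Lists profile files under baseDir whose names end with the given suffix.
   * With recursive == false only the top level is scanned (WebServer-style);
   * with recursive == true subdirectories, such as an archive folder, are
   * scanned as well (SysTables-style).
   */
  public static List<String> listProfiles(Path baseDir, String suffix, boolean recursive)
      throws IOException {
    // Depth 1 means "baseDir and its direct children only".
    int maxDepth = recursive ? Integer.MAX_VALUE : 1;
    try (Stream<Path> paths = Files.walk(baseDir, maxDepth)) {
      return paths
          .filter(Files::isRegularFile)
          .filter(p -> p.getFileName().toString().endsWith(suffix))
          // Report paths relative to baseDir so archived entries stay distinguishable.
          .map(p -> baseDir.relativize(p).toString())
          .sorted()
          .collect(Collectors.toList());
    }
  }
}
```

   A caller serving the WebUI would pass recursive = false and keep the listing 
small, while an analytics-style caller would pass recursive = true and also see 
whatever has been moved below the base directory.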
   




> Improve loading of profiles listing in the WebUI
> 
>
> Key: DRILL-5270
> URL: https://issues.apache.org/jira/browse/DRILL-5270
> Project: Apache Drill
> Issue Type: Improvement
> Components: Web Server
> Affects Versions: 1.9.0
> Reporter: Kunal Khatua
> Assignee: Kunal Khatua
> Priority: Major
> Fix For: 1.14.0
>
>
> Currently, even as the number of profiles increases, we reload the same list 
> of profiles from the FS each time.
> An ideal improvement would be to detect whether there are any new profiles 
> and reload from disk only then. Otherwise, a cached list is sufficient.
> For a directory of 280K profiles, the load time is close to 6 seconds on a 
> 32-core server. With caching, we can get it down to as little as a few 
> milliseconds.
> To decide whether the cache is invalid, we inspect the last modified time of 
> the directory to confirm whether a reload is needed. 
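
The invalidation rule described in the issue (reload only when the directory's 
last modified time has changed) can be sketched as follows. This is a minimal 
illustration under stated assumptions, not the code from the pull request: it 
uses java.nio instead of the Hadoop FileSystem API, and the class name 
CachedProfileListing, the suffix parameter, and the reload counter are all 
hypothetical.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical illustration of a listing cache keyed on directory mtime. */
public class CachedProfileListing {
  private final Path profileDir;
  private final String suffix;
  private FileTime cachedModTime;  // null until the first load
  private List<String> cachedNames = Collections.emptyList();
  private int reloadCount;         // exposed only so the behaviour can be observed

  public CachedProfileListing(Path profileDir, String suffix) {
    this.profileDir = profileDir;
    this.suffix = suffix;
  }

  /** Returns the cached listing, rescanning only if the directory changed. */
  public synchronized List<String> list() throws IOException {
    // Read the timestamp before scanning, so a write that lands during the
    // scan still triggers a reload on the next call.
    FileTime current = Files.getLastModifiedTime(profileDir);
    if (!current.equals(cachedModTime)) {
      List<String> names = new ArrayList<>();
      try (DirectoryStream<Path> stream = Files.newDirectoryStream(profileDir, "*" + suffix)) {
        for (Path p : stream) {
          names.add(p.getFileName().toString());
        }
      }
      Collections.sort(names);
      cachedNames = Collections.unmodifiableList(names);
      cachedModTime = current;
      reloadCount++;
    }
    return cachedNames;
  }

  public synchronized int getReloadCount() {
    return reloadCount;
  }
}
```

One caveat with this approach is timestamp granularity: on file systems with 
coarse modification times, a file added shortly after a scan may not change the 
directory's timestamp, so the cache could stay stale until the next change.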


