[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16842636#comment-16842636 ]

ASF GitHub Bot commented on DRILL-5270:
---

kkhatua commented on pull request #1654: DRILL-5270: Improve loading of profiles listing in the WebUI
URL: https://github.com/apache/drill/pull/1654

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: us...@infra.apache.org

> Improve loading of profiles listing in the WebUI
>
> Key: DRILL-5270
> URL: https://issues.apache.org/jira/browse/DRILL-5270
> Project: Apache Drill
> Issue Type: Improvement
> Components: Web Server
> Affects Versions: 1.9.0
> Reporter: Kunal Khatua
> Assignee: Kunal Khatua
> Priority: Major
> Fix For: 1.17.0
>
> Currently, as the number of profiles increases, we reload the same list of profiles from the FS.
> An ideal improvement would be to detect whether there are any new profiles and only reload from disk then. Otherwise, a cached list is sufficient.
> For a directory of 280K profiles, the load time is close to 6 seconds on a 32-core server. With caching, we can get it down to as little as a few milliseconds.
> To decide whether the cache is invalid, we inspect the last-modified time of the directory to confirm whether a reload is needed.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
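The invalidation rule in the issue description (reload only when the directory's last-modified time has changed) can be sketched roughly as follows. This is a minimal illustration, not the code from the pull request: the class name `MtimeGuardedCache` and its two supplier parameters are hypothetical, and in Drill the modification time would presumably come from the file system rather than from a plain `LongSupplier`.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical sketch: reloads a listing only when the reported modification time changes. */
class MtimeGuardedCache<T> {
  private final LongSupplier modifiedTime;   // assumption: returns the directory mtime in ms
  private final Supplier<List<T>> loader;    // assumption: the expensive full directory listing
  private long cachedModified = Long.MIN_VALUE;
  private List<T> cached = Collections.emptyList();
  private int loads = 0;

  MtimeGuardedCache(LongSupplier modifiedTime, Supplier<List<T>> loader) {
    this.modifiedTime = modifiedTime;
    this.loader = loader;
  }

  /** Returns the cached listing, reloading first if the modification time differs. */
  synchronized List<T> get() {
    // Read the mtime before loading, so a change made during the load
    // is picked up by the next call instead of being missed.
    long seen = modifiedTime.getAsLong();
    if (seen != cachedModified) {
      cached = Collections.unmodifiableList(new ArrayList<>(loader.get()));
      cachedModified = seen;
      loads++;
    }
    return cached;
  }

  /** Number of times the loader has actually run. */
  synchronized int loads() {
    return loads;
  }
}
```

Repeated calls with an unchanged modification time return the cached list without invoking the loader, which is where the drop from seconds to milliseconds described in the issue would come from.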
[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16648613#comment-16648613 ]

ASF GitHub Bot commented on DRILL-5270:
---

kkhatua commented on issue #1250: DRILL-5270: Improve loading of profiles listing in the WebUI
URL: https://github.com/apache/drill/pull/1250#issuecomment-429493831

@priteshm this required some more rework, which I'm hoping I've addressed. We can review and try to get this in as part of 1.15.0.

I've rebased this on top of latest master, accounting for conflicts due to DRILL-6053 (locking of PStore), DRILL-6422 (shaded Guava imports) and DRILL-6492 (schema/workspace insensitivity).

@arina-ielchiieva / @parthchandra / @ilooner anyone up for reviewing this?

> Improve loading of profiles listing in the WebUI
>
> Key: DRILL-5270
> URL: https://issues.apache.org/jira/browse/DRILL-5270
> Project: Apache Drill
> Issue Type: Improvement
> Components: Web Server
> Affects Versions: 1.9.0
> Reporter: Kunal Khatua
> Assignee: Kunal Khatua
> Priority: Major
> Fix For: 1.15.0
[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15877119#comment-15877119 ]

Kunal Khatua commented on DRILL-5270:
-

As a sample experiment, we ran a Drillbit to display the profile list for a directory containing 280K files. We found that while the Drillbit took a long time to start up (DRILL-4990 will fix this), the load time improves as long as no new profiles are detected.

Using a PathFilter (Hadoop API) implementation to find new profiles should have helped. However, it appears that the filter isn't pushed down to the file system, so we can't benefit from it unless the HDFS API itself is improved. There is no regression with this change, however.

Using it in conjunction with the patch for DRILL-5259 does show improved load times, since we don't go back to the DFS to re-read the profile list.

This is the content of the Drill log:
{code}
2017-02-17 11:09:30,929 pssc-65.qa.lab [main] INFO o.apache.drill.exec.server.Drillbit - Startup completed (625436 ms).
2017-02-17 11:14:35,886 pssc-65.qa.lab [qtp2142187763-113] WARN o.a.d.e.s.s.s.LocalPersistentStore - Took 5876 ms to list+map from 281001 profiles
2017-02-17 11:14:35,893 pssc-65.qa.lab [qtp2142187763-113] DEBUG o.a.d.e.s.r.profile.ProfileResources - Time to load MRU 100 profiles: 5885 ms
2017-02-17 11:18:25,940 pssc-65.qa.lab [qtp2142187763-118] DEBUG o.a.d.e.s.r.profile.ProfileResources - Time to load MRU 100 profiles: 1 ms
2017-02-17 11:19:14,977 pssc-65.qa.lab [qtp2142187763-122] DEBUG o.a.d.e.s.r.profile.ProfileResources - Time to load MRU 75 profiles: 1 ms
2017-02-17 11:19:27,554 pssc-65.qa.lab [qtp2142187763-123] DEBUG o.a.d.e.s.r.profile.ProfileResources - Time to load MRU 150 profiles: 1 ms
2017-02-17 11:21:58,137 pssc-65.qa.lab [qtp2142187763-124] INFO o.a.drill.exec.client.DrillClient - Successfully connected to server pssc-65.qa.lab:31010
2017-02-17 11:21:58,409 pssc-65.qa.lab [2758b2a9-6e44-9d79-df64-e6e3e6a9eb4a:foreman] INFO o.a.drill.exec.work.foreman.Foreman - Query text for query id 2758b2a9-6e44-9d79-df64-e6e3e6a9eb4a: select * from sys.drillbits
2017-02-17 11:32:02,320 pssc-65.qa.lab [2758b2a9-6e44-9d79-df64-e6e3e6a9eb4a:frag:0:0] INFO o.a.d.e.w.fragment.FragmentExecutor - 2758b2a9-6e44-9d79-df64-e6e3e6a9eb4a:0:0: State change requested AWAITING_ALLOCATION --> RUNNING
2017-02-17 11:32:02,321 pssc-65.qa.lab [2758b2a9-6e44-9d79-df64-e6e3e6a9eb4a:frag:0:0] INFO o.a.d.e.w.f.FragmentStatusReporter - 2758b2a9-6e44-9d79-df64-e6e3e6a9eb4a:0:0: State to report: RUNNING
2017-02-17 11:32:02,506 pssc-65.qa.lab [2758b2a9-6e44-9d79-df64-e6e3e6a9eb4a:frag:0:0] INFO o.a.d.e.c.ClassCompilerSelector - Java compiler policy: DEFAULT, Debug option: true
2017-02-17 11:32:02,792 pssc-65.qa.lab [2758b2a9-6e44-9d79-df64-e6e3e6a9eb4a:frag:0:0] INFO o.a.d.e.w.fragment.FragmentExecutor - 2758b2a9-6e44-9d79-df64-e6e3e6a9eb4a:0:0: State change requested RUNNING --> FINISHED
2017-02-17 11:32:02,793 pssc-65.qa.lab [2758b2a9-6e44-9d79-df64-e6e3e6a9eb4a:frag:0:0] INFO o.a.d.e.w.f.FragmentStatusReporter - 2758b2a9-6e44-9d79-df64-e6e3e6a9eb4a:0:0: State to report: FINISHED
2017-02-17 11:32:02,837 pssc-65.qa.lab [CONTROL-rpc-event-queue] INFO query.logger - {"queryId":"2758b2a9-6e44-9d79-df64-e6e3e6a9eb4a","schema":"","queryText":"select * from sys.drillbits","start":1487359318324,"finish":1487359922805,"outcome":"COMPLETED","username":"anonymous","remoteAddress":"10.10.103.65:56448"}
2017-02-17 11:34:49,443 pssc-65.qa.lab [qtp2142187763-129] WARN o.a.d.e.s.s.s.LocalPersistentStore - Took 5545 ms to list+map from 1 profiles
2017-02-17 11:34:49,448 pssc-65.qa.lab [qtp2142187763-129] DEBUG o.a.d.e.s.r.profile.ProfileResources - Time to load MRU 50 profiles: 5553 ms
2017-02-17 11:36:58,770 pssc-65.qa.lab [qtp2142187763-120] DEBUG o.a.d.e.s.r.profile.ProfileResources - Time to load MRU 60 profiles: 1 ms
{code}

> Improve loading of profiles listing in the WebUI
>
> Key: DRILL-5270
> URL: https://issues.apache.org/jira/browse/DRILL-5270
> Project: Apache Drill
> Issue Type: Improvement
> Components: Web Server
> Affects Versions: 1.9.0
> Reporter: Kunal Khatua
> Assignee: Kunal Khatua
> Fix For: 1.10.0
[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15877200#comment-15877200 ]

ASF GitHub Bot commented on DRILL-5270:
---

GitHub user kkhatua opened a pull request:

    https://github.com/apache/drill/pull/755

    DRILL-5270: Improve loading of profiles listing in the WebUI

    Using Hadoop API to filter and reduce profile list load time
    Using an in-memory TreeSet-based cache, maintain the list of most recent profiles.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kkhatua/drill DRILL-5270

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/755.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #755

commit a5f20643850ad399622e5df9a6f37713545dc7a6
Author: Kunal Khatua
Date: 2017-02-22T01:20:48Z

    DRILL-5270: Improve loading of profiles listing in the WebUI

    Using Hadoop API to filter and reduce profile list load time
    Using an in-memory TreeSet-based cache, maintain the list of most recent profiles.

> Improve loading of profiles listing in the WebUI
>
> Key: DRILL-5270
> URL: https://issues.apache.org/jira/browse/DRILL-5270
> Project: Apache Drill
> Issue Type: Improvement
> Components: Web Server
> Affects Versions: 1.9.0
> Reporter: Kunal Khatua
> Assignee: Kunal Khatua
> Fix For: 1.10.0
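For readers unfamiliar with the idea, a TreeSet-based cache of the most recent profiles might look something like the sketch below. It is only an illustration under stated assumptions: the class `RecentProfileSet`, its capacity parameter and the "most recent first" comparator are hypothetical, and the pull request's actual ordering and eviction logic may differ.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical sketch: keeps at most `capacity` profile names, most recent first. */
class RecentProfileSet {
  private final int capacity;
  private final TreeSet<String> names;

  /**
   * @param capacity        maximum number of names to retain
   * @param mostRecentFirst assumption: an ordering in which newer profiles sort before older ones
   */
  RecentProfileSet(int capacity, Comparator<String> mostRecentFirst) {
    this.capacity = capacity;
    this.names = new TreeSet<>(mostRecentFirst);
  }

  /** Adds a name and evicts the oldest entry once the capacity is exceeded. */
  void add(String profileName) {
    names.add(profileName);
    if (names.size() > capacity) {
      names.pollLast(); // the last element is the oldest under this ordering
    }
  }

  /** Returns a copy of the retained names, most recent first. */
  List<String> snapshot() {
    return new ArrayList<>(names);
  }
}
```

Because the set stays sorted and bounded, each insertion costs O(log n) and memory does not grow with the total number of profiles on disk.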
[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15877207#comment-15877207 ]

ASF GitHub Bot commented on DRILL-5270:
---

Github user kkhatua commented on the issue:

    https://github.com/apache/drill/pull/755

A summary of the performance is available in this [comment](https://issues.apache.org/jira/browse/DRILL-5270?focusedCommentId=15877119&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15877119) on the JIRA (DRILL-5270)

> Improve loading of profiles listing in the WebUI
>
> Key: DRILL-5270
> URL: https://issues.apache.org/jira/browse/DRILL-5270
> Project: Apache Drill
> Issue Type: Improvement
> Components: Web Server
> Affects Versions: 1.9.0
> Reporter: Kunal Khatua
> Assignee: Kunal Khatua
> Fix For: 1.10.0
[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15877250#comment-15877250 ]

ASF GitHub Bot commented on DRILL-5270:
---

Github user kkhatua commented on the issue:

    https://github.com/apache/drill/pull/755

For 8266 profiles, when measured from Chrome browser's Network tool:
```
Load First Time: 2.43s
Load Second Time (no new profiles): 829ms
```

> Improve loading of profiles listing in the WebUI
>
> Key: DRILL-5270
> URL: https://issues.apache.org/jira/browse/DRILL-5270
> Project: Apache Drill
> Issue Type: Improvement
> Components: Web Server
> Affects Versions: 1.9.0
> Reporter: Kunal Khatua
> Assignee: Kunal Khatua
> Fix For: 1.10.0
[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15979633#comment-15979633 ]

ASF GitHub Bot commented on DRILL-5270:
---

Github user kkhatua commented on the issue:

    https://github.com/apache/drill/pull/755

@sudheeshkatkam Can you please review the PR?

> Improve loading of profiles listing in the WebUI
>
> Key: DRILL-5270
> URL: https://issues.apache.org/jira/browse/DRILL-5270
> Project: Apache Drill
> Issue Type: Improvement
> Components: Web Server
> Affects Versions: 1.9.0
> Reporter: Kunal Khatua
> Assignee: Kunal Khatua
> Fix For: 1.11.0
[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16373974#comment-16373974 ]

Pritesh Maker commented on DRILL-5270:
--

[~arina] can you please review this PR?

> Improve loading of profiles listing in the WebUI
>
> Key: DRILL-5270
> URL: https://issues.apache.org/jira/browse/DRILL-5270
> Project: Apache Drill
> Issue Type: Improvement
> Components: Web Server
> Affects Versions: 1.9.0
> Reporter: Kunal Khatua
> Assignee: Kunal Khatua
> Priority: Major
> Fix For: 1.13.0
[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16379409#comment-16379409 ]

ASF GitHub Bot commented on DRILL-5270:
---

Github user arina-ielchiieva commented on a diff in the pull request:

    https://github.com/apache/drill/pull/755#discussion_r171084169

--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/sys/store/LocalPersistentStore.java ---
@@ -61,16 +63,29 @@
   private final AutoCloseableLock readLock = new AutoCloseableLock(readWriteLock.readLock());
   private final AutoCloseableLock writeLock = new AutoCloseableLock(readWriteLock.writeLock());

+  //Provides a threshold above which we report the time to load
+  private static final long LISTTIME_THRESHOLD_MSEC = 2000L;
+
+  private static final int DrillSysFileExtSize = DRILL_SYS_FILE_SUFFIX.length();
--- End diff --

`DrillSysFileExtSize` -> `drillSysFileExtSize`

> Improve loading of profiles listing in the WebUI
>
> Key: DRILL-5270
> URL: https://issues.apache.org/jira/browse/DRILL-5270
> Project: Apache Drill
> Issue Type: Improvement
> Components: Web Server
> Affects Versions: 1.9.0
> Reporter: Kunal Khatua
> Assignee: Kunal Khatua
> Priority: Major
> Fix For: 1.13.0
[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16379406#comment-16379406 ]

ASF GitHub Bot commented on DRILL-5270:
---

Github user arina-ielchiieva commented on a diff in the pull request:

    https://github.com/apache/drill/pull/755#discussion_r171082847

--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/sys/store/DrillSysFilePathFilter.java ---
@@ -0,0 +1,53 @@
+/**
--- End diff --

Please use a regular comment for the header, not Javadoc.

> Improve loading of profiles listing in the WebUI
>
> Key: DRILL-5270
> URL: https://issues.apache.org/jira/browse/DRILL-5270
> Project: Apache Drill
> Issue Type: Improvement
> Components: Web Server
> Affects Versions: 1.9.0
> Reporter: Kunal Khatua
> Assignee: Kunal Khatua
> Priority: Major
> Fix For: 1.13.0
[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16379408#comment-16379408 ]

ASF GitHub Bot commented on DRILL-5270:
---

Github user arina-ielchiieva commented on a diff in the pull request:

    https://github.com/apache/drill/pull/755#discussion_r171088565

--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/sys/store/LocalPersistentStore.java ---
@@ -112,23 +127,65 @@ public static DrillFileSystem getFileSystem(DrillConfig config, Path root) throw
   @Override
   public Iterator> getRange(int skip, int take) {
+    //Marking currently seen modification time
+    long currBasePathModified = 0L;
+    try {
+      currBasePathModified = fs.getFileStatus(basePath).getModificationTime();
+    } catch (IOException ioexcp) {
+      ioexcp.printStackTrace();
+    }
+
+    //Acquiring lock to avoid reloading for request coming in before completion of profile read
--- End diff --

1. Previously, acquiring the read lock was enough; with your changes you modify class fields. Since many threads can access this method, you'll end up with race conditions. Class fields can also be cached by threads... I think the design here should be reconsidered.
2. The Guava library has several cache implementations. Can we leverage any of them instead of using a tree set?

Pinging @vlad, since he is working on DRILL-6053, which intends to make changes in the same class to avoid excessive locking, so that he is aware of the intended changes.

> Improve loading of profiles listing in the WebUI
>
> Key: DRILL-5270
> URL: https://issues.apache.org/jira/browse/DRILL-5270
> Project: Apache Drill
> Issue Type: Improvement
> Components: Web Server
> Affects Versions: 1.9.0
> Reporter: Kunal Khatua
> Assignee: Kunal Khatua
> Priority: Major
> Fix For: 1.13.0
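One common way to address the kind of race described in this review comment is to publish the cached list and the modification time it was loaded at as a single immutable object behind a `volatile` field, so that readers never observe one updated without the other. The sketch below illustrates that pattern only; it is not the design adopted in Drill, it does not use Guava, and `SnapshotCache` and its supplier parameters are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical sketch: thread-safe listing cache that swaps one immutable snapshot. */
class SnapshotCache {
  /** Immutable pair: the listing and the modification time it was loaded at. */
  private static final class Snapshot {
    final long modified;
    final List<String> names;

    Snapshot(long modified, List<String> names) {
      this.modified = modified;
      this.names = Collections.unmodifiableList(new ArrayList<>(names));
    }
  }

  private final LongSupplier modifiedTime;      // assumption: directory mtime in ms
  private final Supplier<List<String>> loader;  // assumption: expensive directory listing
  private final Object reloadLock = new Object();
  private volatile Snapshot current =
      new Snapshot(Long.MIN_VALUE, Collections.<String>emptyList());

  SnapshotCache(LongSupplier modifiedTime, Supplier<List<String>> loader) {
    this.modifiedTime = modifiedTime;
    this.loader = loader;
  }

  List<String> get() {
    long now = modifiedTime.getAsLong();
    Snapshot seen = current;              // single volatile read, consistent pair
    if (seen.modified == now) {
      return seen.names;                  // fast path: no lock taken
    }
    synchronized (reloadLock) {
      if (current.modified != now) {      // re-check: another thread may have reloaded
        current = new Snapshot(now, loader.get());
      }
      return current.names;
    }
  }
}
```

With this shape, concurrent requests that arrive while the modification time is unchanged trigger at most one reload, and the fast path never blocks.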
[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16379411#comment-16379411 ]

ASF GitHub Bot commented on DRILL-5270:
---

Github user arina-ielchiieva commented on a diff in the pull request:

    https://github.com/apache/drill/pull/755#discussion_r171089128

--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/sys/store/LocalPersistentStore.java ---
@@ -141,11 +198,33 @@ public static DrillFileSystem getFileSystem(DrillConfig config, Path root) throw
     }
   }

+  /**
+   * Add profile name to a TreeSet
+   * @param profileName
--- End diff --

Please do not leave `@param` or `@return` without a description. IDEs usually highlight them and ask for a description to be added.

> Improve loading of profiles listing in the WebUI
>
> Key: DRILL-5270
> URL: https://issues.apache.org/jira/browse/DRILL-5270
> Project: Apache Drill
> Issue Type: Improvement
> Components: Web Server
> Affects Versions: 1.9.0
> Reporter: Kunal Khatua
> Assignee: Kunal Khatua
> Priority: Major
> Fix For: 1.13.0
[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16379410#comment-16379410 ]

ASF GitHub Bot commented on DRILL-5270:
---

Github user arina-ielchiieva commented on a diff in the pull request:

    https://github.com/apache/drill/pull/755#discussion_r171084274

--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/sys/store/LocalPersistentStore.java ---
@@ -61,16 +63,29 @@
   private final AutoCloseableLock readLock = new AutoCloseableLock(readWriteLock.readLock());
   private final AutoCloseableLock writeLock = new AutoCloseableLock(readWriteLock.writeLock());

+  //Provides a threshold above which we report the time to load
+  private static final long LISTTIME_THRESHOLD_MSEC = 2000L;
--- End diff --

`LISTTIME_THRESHOLD_MSEC` -> `LIST_TIME_THRESHOLD_MSEC`

> Improve loading of profiles listing in the WebUI
>
> Key: DRILL-5270
> URL: https://issues.apache.org/jira/browse/DRILL-5270
> Project: Apache Drill
> Issue Type: Improvement
> Components: Web Server
> Affects Versions: 1.9.0
> Reporter: Kunal Khatua
> Assignee: Kunal Khatua
> Priority: Major
> Fix For: 1.13.0
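The threshold constant in this diff exists so that the load time is reported only when it is exceeded. A minimal sketch of that check follows, using the renamed constant suggested in the review; the class `ListTimeReport` and the method `slowListMessage` are hypothetical, and the message text is modeled on the log line format quoted earlier in the thread rather than taken from the patch.

```java
import java.util.Optional;

/** Hypothetical sketch: produce a report line only for listings slower than the threshold. */
class ListTimeReport {
  // Threshold above which we report the time to load (value from the diff above)
  static final long LIST_TIME_THRESHOLD_MSEC = 2000L;

  /** Returns the message to log, or empty when the elapsed time is within the threshold. */
  static Optional<String> slowListMessage(long elapsedMs, int profileCount) {
    if (elapsedMs <= LIST_TIME_THRESHOLD_MSEC) {
      return Optional.empty();
    }
    return Optional.of(
        "Took " + elapsedMs + " ms to list+map from " + profileCount + " profiles");
  }
}
```

A caller would measure the elapsed time around the listing and pass any returned message to its logger at WARN level, which keeps fast, cached loads out of the log.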
[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16379412#comment-16379412 ]

ASF GitHub Bot commented on DRILL-5270:
---

Github user arina-ielchiieva commented on a diff in the pull request:

    https://github.com/apache/drill/pull/755#discussion_r171089532

--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/sys/store/DrillSysFilePathFilter.java ---
@@ -0,0 +1,53 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.sys.store;
+
+import static org.apache.drill.exec.ExecConstants.DRILL_SYS_FILE_SUFFIX;
+
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.PathFilter;
+
+/**
+ * Filter for Drill System Files
+ */
+public class DrillSysFilePathFilter implements PathFilter {
--- End diff --

Please consider using `FileSystemUtil`, which helps to create filters. Passing a custom filter is also possible.

> Improve loading of profiles listing in the WebUI
>
> Key: DRILL-5270
> URL: https://issues.apache.org/jira/browse/DRILL-5270
> Project: Apache Drill
> Issue Type: Improvement
> Components: Web Server
> Affects Versions: 1.9.0
> Reporter: Kunal Khatua
> Assignee: Kunal Khatua
> Priority: Major
> Fix For: 1.13.0
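The filter under review selects Drill system files by suffix, and the `DrillSysFileExtSize` constant seen in an earlier diff suggests the suffix length is then used to trim file names. As a rough, file-system-free illustration of that idea, the sketch below works on plain file-name strings. The class `SysFileNames` and its methods are hypothetical, and the suffix value `.sys.drill` is an assumption about `DRILL_SYS_FILE_SUFFIX`, not something stated in this thread.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical sketch: suffix-based selection of system files, on plain names. */
class SysFileNames {
  // Assumed value of ExecConstants.DRILL_SYS_FILE_SUFFIX; not confirmed by the thread
  static final String SYS_FILE_SUFFIX = ".sys.drill";

  /** True when the file name carries the system-file suffix. */
  static boolean isSysFile(String fileName) {
    return fileName.endsWith(SYS_FILE_SUFFIX);
  }

  /** Keeps only system files and strips the suffix, leaving the profile name. */
  static List<String> profileNames(List<String> fileNames) {
    return fileNames.stream()
        .filter(SysFileNames::isSysFile)
        .map(n -> n.substring(0, n.length() - SYS_FILE_SUFFIX.length()))
        .collect(Collectors.toList());
  }
}
```

In the real code the same predicate would be expressed as a Hadoop `PathFilter` (or built through the utility the reviewer mentions) and applied while listing the directory.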
[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16379407#comment-16379407 ]

ASF GitHub Bot commented on DRILL-5270:
---

Github user arina-ielchiieva commented on a diff in the pull request:

    https://github.com/apache/drill/pull/755#discussion_r171084819

--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/sys/store/LocalPersistentStore.java ---
@@ -112,23 +127,65 @@ public static DrillFileSystem getFileSystem(DrillConfig config, Path root) throw
   @Override
   public Iterator> getRange(int skip, int take) {
+    //Marking currently seen modification time
+    long currBasePathModified = 0L;
+    try {
+      currBasePathModified = fs.getFileStatus(basePath).getModificationTime();
+    } catch (IOException ioexcp) {
+      ioexcp.printStackTrace();
--- End diff --

Please do not use `printStackTrace()`

> Improve loading of profiles listing in the WebUI
>
> Key: DRILL-5270
> URL: https://issues.apache.org/jira/browse/DRILL-5270
> Project: Apache Drill
> Issue Type: Improvement
> Components: Web Server
> Affects Versions: 1.9.0
> Reporter: Kunal Khatua
> Assignee: Kunal Khatua
> Priority: Major
> Fix For: 1.13.0
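The usual alternative to `printStackTrace()` is to route the exception through a logger, so it carries a level, a message and the stack trace to the configured log destination. The sketch below shows that shape under loud assumptions: it uses `java.util.logging` and `java.nio.file` only so that it runs standalone, whereas the Drill code would use its own logger and the Hadoop `FileSystem` API; the class `ModTimeReader` and its method are hypothetical. The fallback of `0L` mirrors the initial value in the diff above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical sketch: read a modification time, logging failures instead of printing them. */
class ModTimeReader {
  private static final Logger logger = Logger.getLogger(ModTimeReader.class.getName());

  /** Returns the path's modification time in ms, or `fallback` if it cannot be read. */
  static long modifiedTimeOr(Path path, long fallback) {
    try {
      return Files.getLastModifiedTime(path).toMillis();
    } catch (IOException e) {
      // Logged with context and the exception attached, rather than e.printStackTrace()
      logger.log(Level.WARNING, "Unable to read modification time of " + path, e);
      return fallback;
    }
  }
}
```

For example, `ModTimeReader.modifiedTimeOr(basePath, 0L)` would yield `0L` for an unreadable path while leaving a WARNING entry that operators can find in the log.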
[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16384056#comment-16384056 ]

ASF GitHub Bot commented on DRILL-5270:
---

Github user kkhatua commented on a diff in the pull request:

    https://github.com/apache/drill/pull/755#discussion_r171944876

--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/sys/store/DrillSysFilePathFilter.java ---
@@ -0,0 +1,53 @@
+/**
--- End diff --

OK. Will fix this.

> Improve loading of profiles listing in the WebUI
>
> Key: DRILL-5270
> URL: https://issues.apache.org/jira/browse/DRILL-5270
> Project: Apache Drill
> Issue Type: Improvement
> Components: Web Server
> Affects Versions: 1.9.0
> Reporter: Kunal Khatua
> Assignee: Kunal Khatua
> Priority: Major
> Fix For: 1.14.0
[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16384054#comment-16384054 ]

ASF GitHub Bot commented on DRILL-5270:
---

Github user kkhatua commented on a diff in the pull request:

    https://github.com/apache/drill/pull/755#discussion_r171944817

--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/sys/store/LocalPersistentStore.java ---
@@ -61,16 +63,29 @@
   private final AutoCloseableLock readLock = new AutoCloseableLock(readWriteLock.readLock());
   private final AutoCloseableLock writeLock = new AutoCloseableLock(readWriteLock.writeLock());

+  //Provides a threshold above which we report the time to load
+  private static final long LISTTIME_THRESHOLD_MSEC = 2000L;
--- End diff --

OK. Will fix this.

> Improve loading of profiles listing in the WebUI
>
> Key: DRILL-5270
> URL: https://issues.apache.org/jira/browse/DRILL-5270
> Project: Apache Drill
> Issue Type: Improvement
> Components: Web Server
> Affects Versions: 1.9.0
> Reporter: Kunal Khatua
> Assignee: Kunal Khatua
> Priority: Major
> Fix For: 1.14.0
[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16384048#comment-16384048 ] ASF GitHub Bot commented on DRILL-5270: --- Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/755#discussion_r171944701 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/sys/store/DrillSysFilePathFilter.java --- @@ -0,0 +1,53 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.drill.exec.store.sys.store; + +import static org.apache.drill.exec.ExecConstants.DRILL_SYS_FILE_SUFFIX; + +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.fs.PathFilter; + +/** + * Filter for Drill System Files + */ +public class DrillSysFilePathFilter implements PathFilter { --- End diff -- Ok. 
I was thinking of using ``` List<FileStatus> fileStatuses = DrillFileSystemUtil.listFiles(fs, basePath, false, sysFileSuffixFilter); ``` > Improve loading of profiles listing in the WebUI > > > Key: DRILL-5270 > URL: https://issues.apache.org/jira/browse/DRILL-5270 > Project: Apache Drill > Issue Type: Improvement > Components: Web Server >Affects Versions: 1.9.0 >Reporter: Kunal Khatua >Assignee: Kunal Khatua >Priority: Major > Fix For: 1.14.0 > > > Currently, as the number of profiles increase, we reload the same list of > profiles from the FS. > An ideal improvement would be to detect if there are any new profiles and > only reload from the disk then. Otherwise, a cached list is sufficient. > For a directory of 280K profiles, the load time is close to 6 seconds on a 32 > core server. With the caching, we can get it down to as much as a few > milliseconds. > To render the cache as invalid, we inspect the last modified time of the > directory to confirm whether a reload is needed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16384053#comment-16384053 ] ASF GitHub Bot commented on DRILL-5270: --- Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/755#discussion_r171944767 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/sys/store/LocalPersistentStore.java --- @@ -141,11 +198,33 @@ public static DrillFileSystem getFileSystem(DrillConfig config, Path root) throw } } + /** + * Add profile name to a TreeSet + * @param profileName --- End diff -- OK. Will fix this. Eclipse didn't pop it up for me. > Improve loading of profiles listing in the WebUI > > > Key: DRILL-5270 > URL: https://issues.apache.org/jira/browse/DRILL-5270 > Project: Apache Drill > Issue Type: Improvement > Components: Web Server >Affects Versions: 1.9.0 >Reporter: Kunal Khatua >Assignee: Kunal Khatua >Priority: Major > Fix For: 1.14.0 > > > Currently, as the number of profiles increase, we reload the same list of > profiles from the FS. > An ideal improvement would be to detect if there are any new profiles and > only reload from the disk then. Otherwise, a cached list is sufficient. > For a directory of 280K profiles, the load time is close to 6 seconds on a 32 > core server. With the caching, we can get it down to as much as a few > milliseconds. > To render the cache as invalid, we inspect the last modified time of the > directory to confirm whether a reload is needed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16384060#comment-16384060 ] ASF GitHub Bot commented on DRILL-5270: --- Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/755#discussion_r171945072 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/sys/store/LocalPersistentStore.java --- @@ -112,23 +127,65 @@ public static DrillFileSystem getFileSystem(DrillConfig config, Path root) throw @Override public Iterator> getRange(int skip, int take) { +//Marking currently seen modification time +long currBasePathModified = 0L; +try { + currBasePathModified = fs.getFileStatus(basePath).getModificationTime(); +} catch (IOException ioexcp) { + ioexcp.printStackTrace(); --- End diff -- Will publish a log message and return an empty iterator for now. Not sure how to bubble up an error to the UI. I'll take a look at how we do so for profile deserialization as a guide > Improve loading of profiles listing in the WebUI > > > Key: DRILL-5270 > URL: https://issues.apache.org/jira/browse/DRILL-5270 > Project: Apache Drill > Issue Type: Improvement > Components: Web Server >Affects Versions: 1.9.0 >Reporter: Kunal Khatua >Assignee: Kunal Khatua >Priority: Major > Fix For: 1.14.0 > > > Currently, as the number of profiles increase, we reload the same list of > profiles from the FS. > An ideal improvement would be to detect if there are any new profiles and > only reload from the disk then. Otherwise, a cached list is sufficient. > For a directory of 280K profiles, the load time is close to 6 seconds on a 32 > core server. With the caching, we can get it down to as much as a few > milliseconds. > To render the cache as invalid, we inspect the last modified time of the > directory to confirm whether a reload is needed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
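The "log a message and return an empty iterator" approach mentioned in the comment above could look roughly like the sketch below. It is only an illustration: `ProfileListing` is a hypothetical stand-in (not Drill's `LocalPersistentStore`), `java.nio.file` replaces the Hadoop `FileSystem` call, and `java.util.logging` replaces whatever logger the project uses.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical stand-in for the store; names are illustrative only.
class ProfileListing {
  private static final Logger logger = Logger.getLogger(ProfileListing.class.getName());

  private final Path basePath;
  private final List<String> cachedNames;

  ProfileListing(Path basePath, List<String> cachedNames) {
    this.basePath = basePath;
    this.cachedNames = cachedNames;
  }

  // Returns up to 'take' names after skipping 'skip';
  // returns an empty iterator if the directory cannot be inspected.
  Iterator<String> getRange(int skip, int take) {
    try {
      // Plays the role of fs.getFileStatus(basePath).getModificationTime() in the diff
      long modified = Files.getLastModifiedTime(basePath).toMillis();
      logger.fine("Base path last modified at " + modified);
    } catch (IOException e) {
      // Log the failure instead of calling printStackTrace(), then fall back to an empty result
      logger.log(Level.WARNING, "Unable to read modification time of " + basePath, e);
      return Collections.emptyIterator();
    }
    return cachedNames.stream().skip(skip).limit(take).iterator();
  }
}
```

An empty iterator keeps the page rendering, but the caller cannot tell "no profiles" from "listing failed"; surfacing the error to the UI would need a separate channel, which the comment leaves open.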
[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16384058#comment-16384058 ] ASF GitHub Bot commented on DRILL-5270: --- Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/755#discussion_r171944952 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/sys/store/LocalPersistentStore.java --- @@ -61,16 +63,29 @@ private final AutoCloseableLock readLock = new AutoCloseableLock(readWriteLock.readLock()); private final AutoCloseableLock writeLock = new AutoCloseableLock(readWriteLock.writeLock()); + //Provides a threshold above which we report the time to load + private static final long LISTTIME_THRESHOLD_MSEC = 2000L; + + private static final int DrillSysFileExtSize = DRILL_SYS_FILE_SUFFIX.length(); --- End diff -- I wanted to treat this like a constant, but this makes it confusing as a Class name > Improve loading of profiles listing in the WebUI > > > Key: DRILL-5270 > URL: https://issues.apache.org/jira/browse/DRILL-5270 > Project: Apache Drill > Issue Type: Improvement > Components: Web Server >Affects Versions: 1.9.0 >Reporter: Kunal Khatua >Assignee: Kunal Khatua >Priority: Major > Fix For: 1.14.0 > > > Currently, as the number of profiles increase, we reload the same list of > profiles from the FS. > An ideal improvement would be to detect if there are any new profiles and > only reload from the disk then. Otherwise, a cached list is sufficient. > For a directory of 280K profiles, the load time is close to 6 seconds on a 32 > core server. With the caching, we can get it down to as much as a few > milliseconds. > To render the cache as invalid, we inspect the last modified time of the > directory to confirm whether a reload is needed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16384067#comment-16384067 ] ASF GitHub Bot commented on DRILL-5270: --- Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/755#discussion_r171945885 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/sys/store/LocalPersistentStore.java --- @@ -112,23 +127,65 @@ public static DrillFileSystem getFileSystem(DrillConfig config, Path root) throw @Override public Iterator> getRange(int skip, int take) { +//Marking currently seen modification time +long currBasePathModified = 0L; +try { + currBasePathModified = fs.getFileStatus(basePath).getModificationTime(); +} catch (IOException ioexcp) { + ioexcp.printStackTrace(); +} + +//Acquiring lock to avoid reloading for request coming in before completion of profile read --- End diff -- I'll provide the explanation below. > Improve loading of profiles listing in the WebUI > > > Key: DRILL-5270 > URL: https://issues.apache.org/jira/browse/DRILL-5270 > Project: Apache Drill > Issue Type: Improvement > Components: Web Server >Affects Versions: 1.9.0 >Reporter: Kunal Khatua >Assignee: Kunal Khatua >Priority: Major > Fix For: 1.14.0 > > > Currently, as the number of profiles increase, we reload the same list of > profiles from the FS. > An ideal improvement would be to detect if there are any new profiles and > only reload from the disk then. Otherwise, a cached list is sufficient. > For a directory of 280K profiles, the load time is close to 6 seconds on a 32 > core server. With the caching, we can get it down to as much as a few > milliseconds. > To render the cache as invalid, we inspect the last modified time of the > directory to confirm whether a reload is needed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16384066#comment-16384066 ] ASF GitHub Bot commented on DRILL-5270: --- Github user kkhatua commented on the issue: https://github.com/apache/drill/pull/755 @arina-ielchiieva I need to rebase this on top of the latest master, considering it was originally based on code that is nearly a year old. When ready, I'll create a new PR or push to this one. Let me know which one works. > Improve loading of profiles listing in the WebUI > > > Key: DRILL-5270 > URL: https://issues.apache.org/jira/browse/DRILL-5270 > Project: Apache Drill > Issue Type: Improvement > Components: Web Server >Affects Versions: 1.9.0 >Reporter: Kunal Khatua >Assignee: Kunal Khatua >Priority: Major > Fix For: 1.14.0 > > > Currently, as the number of profiles increase, we reload the same list of > profiles from the FS. > An ideal improvement would be to detect if there are any new profiles and > only reload from the disk then. Otherwise, a cached list is sufficient. > For a directory of 280K profiles, the load time is close to 6 seconds on a 32 > core server. With the caching, we can get it down to as much as a few > milliseconds. > To render the cache as invalid, we inspect the last modified time of the > directory to confirm whether a reload is needed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16384080#comment-16384080 ] ASF GitHub Bot commented on DRILL-5270: --- Github user kkhatua commented on the issue: https://github.com/apache/drill/pull/755 The choice of a `TreeSet` is basically to use a binary structure that keeps the (maximum permitted) profiles sorted and in memory. When Drill detects changes (refer to https://github.com/kkhatua/drill/blob/f7ad29b9a322bb215d16b3c3b9a2bfc40abfc1ed/exec/java-exec/src/main/java/org/apache/drill/exec/store/sys/store/LocalPersistentStore.java#L146), it will fetch all the available profiles in the PStore and reconstruct the tree (since the order of the profiles returned by the `FileSystem` is not guaranteed). I tried using the `PathFilter` to fetch only new profiles, but the cost of the `FileSystem` fetching only new profiles versus the entire list is the same! Also, there is the possibility that some profiles might have been deleted as new ones were added, so a full reconstruction would take care of that scenario as well. To evict, as I construct the `TreeSet`, I simply pop the oldest (by filename) entry. The Guava cache options don't seem to provide a way to define the basis on which to evict entries. I believe @vrozov's work on DRILL-6053 is to address locking during writes specifically. The lock I used (and need) is for reads, to ensure that multiple requests don't trigger an expensive `FileSystem` call for the same state of the PStore. E.g., consider T# as timestamps: * `currBasePathModified` = T0 * _ThreadA_ requests at t=T1 and issues a read-lock * _ThreadB_ requests at t=T2 but is waiting for read-lock If the tree exists and no change is detected, _ThreadA_ will use the `TreeSet` contents and resume by releasing the lock. 
If the `TreeSet` exists and a change is detected, _ThreadA_ will reconstruct the `TreeSet` before using its contents and it will update `lastBasePathModified`, before releasing the lock. When _ThreadB_ gets the read-lock, it discovers that during the wait, the `TreeSet` was already updated. So, in terms of t=T2, this is the most recent snapshot, so it proceeds to use the treeSet's contents rather than reconstruct. That will be deferred to the next request. We're using the `lastBasePathModified` as a way to provide a pseudo-versioned access to the list. That means if there are more profiles added *after* _ThreadB_ was waiting for the read-lock, it will not trigger the `FileSystem` call right away. > Improve loading of profiles listing in the WebUI > > > Key: DRILL-5270 > URL: https://issues.apache.org/jira/browse/DRILL-5270 > Project: Apache Drill > Issue Type: Improvement > Components: Web Server >Affects Versions: 1.9.0 >Reporter: Kunal Khatua >Assignee: Kunal Khatua >Priority: Major > Fix For: 1.14.0 > > > Currently, as the number of profiles increase, we reload the same list of > profiles from the FS. > An ideal improvement would be to detect if there are any new profiles and > only reload from the disk then. Otherwise, a cached list is sufficient. > For a directory of 280K profiles, the load time is close to 6 seconds on a 32 > core server. With the caching, we can get it down to as much as a few > milliseconds. > To render the cache as invalid, we inspect the last modified time of the > directory to confirm whether a reload is needed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
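The "pseudo-versioned" access described in the comment above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in the PR: `VersionedListingCache`, `modificationTime`, `loader` and `reloads` are hypothetical names, the directory timestamp and the expensive listing are passed in as suppliers, and an exclusive `ReentrantLock` is used so that only one request rebuilds the listing at a time.

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical cache illustrating the pseudo-versioned listing; not Drill's code.
class VersionedListingCache {
  private final ReentrantLock reloadLock = new ReentrantLock(); // exclusive lock
  private final LongSupplier modificationTime;  // stands in for the directory's modification time
  private final Supplier<List<String>> loader;  // stands in for the expensive FileSystem listing
  private long lastBasePathModified = Long.MIN_VALUE;
  private List<String> cached = Collections.emptyList();
  int reloads = 0; // exposed only so the behaviour can be observed

  VersionedListingCache(LongSupplier modificationTime, Supplier<List<String>> loader) {
    this.modificationTime = modificationTime;
    this.loader = loader;
  }

  List<String> get() {
    // Sample the "version" before waiting on the lock, as the diff under review does
    long currBasePathModified = modificationTime.getAsLong();
    reloadLock.lock();
    try {
      // Reload only if this request saw a newer state than the one already cached
      if (currBasePathModified > lastBasePathModified) {
        cached = loader.get();
        lastBasePathModified = currBasePathModified;
        reloads++;
      }
      return cached;
    } finally {
      reloadLock.unlock();
    }
  }
}
```

A request that waited on the lock while another request rebuilt the listing for the same timestamp finds `lastBasePathModified` already caught up and reuses the cached list, which matches the _ThreadA_/_ThreadB_ walk-through.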
[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16384296#comment-16384296 ] ASF GitHub Bot commented on DRILL-5270: --- Github user vrozov commented on the issue: https://github.com/apache/drill/pull/755 @kkhatua 1. The read locks are not exclusive (single writer/multiple readers). To achieve the required functionality you need to introduce a different lock and use write (or exclusive) lock. 2. The choice for TreeSet is not obvious. What are the most common operations performed on the collection? Do you optimize for get, put or collection construction? @arina-ielchiieva my github id is `vrozov`. > Improve loading of profiles listing in the WebUI > > > Key: DRILL-5270 > URL: https://issues.apache.org/jira/browse/DRILL-5270 > Project: Apache Drill > Issue Type: Improvement > Components: Web Server >Affects Versions: 1.9.0 >Reporter: Kunal Khatua >Assignee: Kunal Khatua >Priority: Major > Fix For: 1.14.0 > > > Currently, as the number of profiles increase, we reload the same list of > profiles from the FS. > An ideal improvement would be to detect if there are any new profiles and > only reload from the disk then. Otherwise, a cached list is sufficient. > For a directory of 280K profiles, the load time is close to 6 seconds on a 32 > core server. With the caching, we can get it down to as much as a few > milliseconds. > To render the cache as invalid, we inspect the last modified time of the > directory to confirm whether a reload is needed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
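Point 1 above (read locks are shared, write locks are exclusive) can be checked with a small probe using the JDK's `ReentrantReadWriteLock`. The class and method names below are hypothetical and exist only for this illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical demo class; shows that a held read lock does not block other readers.
class ReadLockDemo {
  // Returns {readAcquired, writeAcquired} as observed by a second thread
  // while the calling thread holds the read lock.
  static boolean[] probeWhileReadLocked() {
    ReentrantReadWriteLock rw = new ReentrantReadWriteLock();
    ExecutorService other = Executors.newSingleThreadExecutor();
    rw.readLock().lock();
    try {
      return other.submit(() -> {
        boolean read = rw.readLock().tryLock();   // succeeds: readers share the lock
        if (read) {
          rw.readLock().unlock();
        }
        boolean write = rw.writeLock().tryLock(); // fails: a reader is still active
        if (write) {
          rw.writeLock().unlock();
        }
        return new boolean[] {read, write};
      }).get();
    } catch (Exception e) {
      throw new IllegalStateException(e);
    } finally {
      rw.readLock().unlock();
      other.shutdown();
    }
  }
}
```

Because the second thread obtains the read lock immediately, a read lock alone cannot stop two requests from both triggering the expensive listing; that requires an exclusive lock, as the review comment suggests.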
[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16384327#comment-16384327 ] ASF GitHub Bot commented on DRILL-5270: --- Github user kkhatua commented on the issue: https://github.com/apache/drill/pull/755 Thanks, @vrozov. I'll make use of a separate lock for read-only purposes in the case of `#1`. For `#2`, I need to construct a size-limited ordered set from a list of unordered elements. In this case, the elements (i.e. profiles) need to be ordered by file name, which is a 1:1 mapping function of the start time epoch for the query. So, I need to be able to add to such a data structure in `O(log(n))` time, remove in `O(1)` and iterate through it in sequence. So, my puts are the most expensive operation. > Improve loading of profiles listing in the WebUI > > > Key: DRILL-5270 > URL: https://issues.apache.org/jira/browse/DRILL-5270 > Project: Apache Drill > Issue Type: Improvement > Components: Web Server >Affects Versions: 1.9.0 >Reporter: Kunal Khatua >Assignee: Kunal Khatua >Priority: Major > Fix For: 1.14.0 > > > Currently, as the number of profiles increase, we reload the same list of > profiles from the FS. > An ideal improvement would be to detect if there are any new profiles and > only reload from the disk then. Otherwise, a cached list is sufficient. > For a directory of 280K profiles, the load time is close to 6 seconds on a 32 > core server. With the caching, we can get it down to as much as a few > milliseconds. > To render the cache as invalid, we inspect the last modified time of the > directory to confirm whether a reload is needed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
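A size-limited ordered set of the kind described above can be sketched with a plain `TreeSet`. The class `BoundedNewestSet` is hypothetical, and the sketch assumes that names sort ascending from oldest to newest; if the file names sort the other way, the comparator or the polled end would need to be flipped. Note that the JDK documents log(n) cost for `TreeSet` add and remove, so evicting with `pollFirst()` is logarithmic rather than constant time.

```java
import java.util.TreeSet;

// Hypothetical bounded set that keeps only the 'maxSize' largest names.
class BoundedNewestSet {
  private final int maxSize;
  private final TreeSet<String> names = new TreeSet<>();

  BoundedNewestSet(int maxSize) {
    this.maxSize = maxSize;
  }

  // Adds a name and, if the bound is exceeded, evicts the smallest one.
  void add(String name) {
    names.add(name);
    if (names.size() > maxSize) {
      names.pollFirst(); // assumed to be the oldest entry under the ordering above
    }
  }

  // Iterates from the largest (assumed newest) name to the smallest.
  Iterable<String> newestFirst() {
    return names.descendingSet();
  }
}
```

Feeding an unordered listing through `add` leaves at most `maxSize` entries in memory, regardless of how many files the directory holds.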
[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16398934#comment-16398934 ] ASF GitHub Bot commented on DRILL-5270: --- Github user kkhatua commented on the issue: https://github.com/apache/drill/pull/755 Holding off to do a rebase once @vrozov 's PR #1163 (DRILL-6053) goes into Apache. > Improve loading of profiles listing in the WebUI > > > Key: DRILL-5270 > URL: https://issues.apache.org/jira/browse/DRILL-5270 > Project: Apache Drill > Issue Type: Improvement > Components: Web Server >Affects Versions: 1.9.0 >Reporter: Kunal Khatua >Assignee: Kunal Khatua >Priority: Major > Fix For: 1.14.0 > > > Currently, as the number of profiles increase, we reload the same list of > profiles from the FS. > An ideal improvement would be to detect if there are any new profiles and > only reload from the disk then. Otherwise, a cached list is sufficient. > For a directory of 280K profiles, the load time is close to 6 seconds on a 32 > core server. With the caching, we can get it down to as much as a few > milliseconds. > To render the cache as invalid, we inspect the last modified time of the > directory to confirm whether a reload is needed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16464374#comment-16464374 ] ASF GitHub Bot commented on DRILL-5270: --- kkhatua commented on issue #755: DRILL-5270: Improve loading of profiles listing in the WebUI URL: https://github.com/apache/drill/pull/755#issuecomment-386729649 Closing this PR in favor of #1250 This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Improve loading of profiles listing in the WebUI > > > Key: DRILL-5270 > URL: https://issues.apache.org/jira/browse/DRILL-5270 > Project: Apache Drill > Issue Type: Improvement > Components: Web Server >Affects Versions: 1.9.0 >Reporter: Kunal Khatua >Assignee: Kunal Khatua >Priority: Major > Fix For: 1.14.0 > > > Currently, as the number of profiles increase, we reload the same list of > profiles from the FS. > An ideal improvement would be to detect if there are any new profiles and > only reload from the disk then. Otherwise, a cached list is sufficient. > For a directory of 280K profiles, the load time is close to 6 seconds on a 32 > core server. With the caching, we can get it down to as much as a few > milliseconds. > To render the cache as invalid, we inspect the last modified time of the > directory to confirm whether a reload is needed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16464375#comment-16464375 ] ASF GitHub Bot commented on DRILL-5270: --- kkhatua closed pull request #755: DRILL-5270: Improve loading of profiles listing in the WebUI URL: https://github.com/apache/drill/pull/755 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic): diff --git a/exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java b/exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java index 54fb46ab68..1dafb51f06 100644 --- a/exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java +++ b/exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java @@ -165,6 +165,8 @@ private ExecConstants() { public static final String SYS_STORE_PROVIDER_LOCAL_ENABLE_WRITE = "drill.exec.sys.store.provider.local.write"; public static final String PROFILES_STORE_INMEMORY = "drill.exec.profiles.store.inmemory"; public static final String PROFILES_STORE_CAPACITY = "drill.exec.profiles.store.capacity"; + public static final String PROFILES_STORE_ARCHIVE_ENABLED = "drill.exec.profiles.store.archive.enabled"; + public static final String PROFILES_STORE_ARCHIVE_RATE = "drill.exec.profiles.store.archive.rate"; public static final String IMPERSONATION_ENABLED = "drill.exec.impersonation.enabled"; public static final String IMPERSONATION_MAX_CHAINED_USER_HOPS = "drill.exec.impersonation.max_chained_user_hops"; public static final String AUTHENTICATION_MECHANISMS = "drill.exec.security.auth.mechanisms"; diff --git a/exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/profile/ProfileResources.java 
b/exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/profile/ProfileResources.java index ec06f0ef4b..3569db972c 100644 --- a/exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/profile/ProfileResources.java +++ b/exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/profile/ProfileResources.java @@ -18,6 +18,7 @@ package org.apache.drill.exec.server.rest.profile; import java.text.SimpleDateFormat; +import java.util.ArrayList; import java.util.Collections; import java.util.Date; import java.util.Iterator; @@ -58,6 +59,7 @@ import org.apache.drill.exec.work.foreman.Foreman; import org.glassfish.jersey.server.mvc.Viewable; +import com.google.common.base.Stopwatch; import com.google.common.collect.Lists; @Path("/") @@ -71,6 +73,7 @@ @Inject SecurityContext sc; public static class ProfileInfo implements Comparable { +private static final String TRAILING_DOTS = " ... "; private static final int QUERY_SNIPPET_MAX_CHAR = 150; private static final int QUERY_SNIPPET_MAX_LINES = 8; @@ -171,13 +174,13 @@ private String extractQuerySnippet(String queryText) { //Trimming down based on line-count if (QUERY_SNIPPET_MAX_LINES < queryParts.length) { int linesConstructed = 0; -StringBuilder lineCappedQuerySnippet = new StringBuilder(); +StringBuilder lineCappedQuerySnippet = new StringBuilder(QUERY_SNIPPET_MAX_CHAR + TRAILING_DOTS.length()); for (String qPart : queryParts) { lineCappedQuerySnippet.append(qPart); if (++linesConstructed < QUERY_SNIPPET_MAX_LINES) { lineCappedQuerySnippet.append(System.lineSeparator()); } else { -lineCappedQuerySnippet.append(" ... 
"); +lineCappedQuerySnippet.append(TRAILING_DOTS); break; } } @@ -260,8 +263,6 @@ public QProfiles getProfilesJSON(@Context UriInfo uriInfo) { Collections.sort(runningQueries, Collections.reverseOrder()); - final List finishedQueries = Lists.newArrayList(); - //Defining #Profiles to load int maxProfilesToLoad = work.getContext().getConfig().getInt(ExecConstants.HTTP_MAX_PROFILES); String maxProfilesParams = uriInfo.getQueryParameters().getFirst(MAX_QPROFILES_PARAM); @@ -269,8 +270,9 @@ public QProfiles getProfilesJSON(@Context UriInfo uriInfo) { maxProfilesToLoad = Integer.valueOf(maxProfilesParams); } - final Iterator> range = completed.getRange(0, maxProfilesToLoad); + final List finishedQueries = new ArrayList(maxProfilesToLoad); + final Iterator> range = completed.getRange(0, maxProfilesToLoad); while (range.hasNext()) { try { final Map.Entry profileEntry = range.next(); diff --git a/exec/java-exec/src/main/java/org/apache/drill/exec/store/sys/store/DrillSysFilePathFilter.java b/exec/java-exec/src/main/java/org/apache/drill/exec/store/sys/store/DrillSysFilePathFilter.java new
[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16464369#comment-16464369 ] ASF GitHub Bot commented on DRILL-5270: --- kkhatua opened a new pull request #1250: DRILL-5270: Improve loading of profiles listing in the WebUI URL: https://github.com/apache/drill/pull/1250 When Drill is displaying profiles stored on the file system (Local or Distributed), it does so by loading the entire list of `.sys.drill` files in the profile directory, sorting and deserializing. This can get expensive, since only a single CPU thread does this. As an example, for a directory of 120K profiles, the time to just fetch the list of files alone is over 6 seconds. After that, based on the number of profiles being rendered, the time varies. An average of 30ms is needed to deserialize a standard profile, which translates to an additional 3sec for the rendering of the default 100 profiles. A user-reported issue confirms just that: DRILL-5028 Opening profiles page from web ui gets very slow when a lot of history files have been stored in HDFS or Local FS Additional JIRAs filed ask for managing these profiles: DRILL-2362 Drill should manage Query Profiling archiving DRILL-2861 enhance drill profile file management This PR brings the following enhancements to achieve that: 1. Mimic the in-memory persistence of profiles (DRILL-5481) by keeping only a predefined `max-capacity` number of profiles in the directory and moving the oldest to an 'archived' sub-directory. 2. Improve loading times by pinning the deserialized list in memory (TreeSet; for maintaining a memory-efficient sortedness of the profiles). That way, if we do not detect any new profiles in the profileStore (i.e. profile directory) since the last time a web-request for rendering the profiles was made, we can re-serve the same listing and skip making a trip to the filesystem to re-fetch all the profiles. 
Reload & reconstruction of the profiles in the Tree is done in the event of any of the following states changing: i. Modification Time of profile dir ii. Number of profiles in the profile dir iii. Number of profiles requested exceeds the currently available list 3. When 2 or more web-requests for rendering arrive, the WebServer code already processes the requests sequentially. As a result, the earliest request will trigger the reconstruction of the in-memory profile-set, and the last-modified timestamp of the profileStore is tracked. This way, the remaining blocked requests can re-use the freshly-reconstructed profile-set for rendering if the underlying profileStore has not been modified. There is an assumption made here that the rate of profiles being added to the profileStore is not high enough to trigger a reconstruction for every queued-up request. 4. To prevent frequent archiving, there is a threshold (max-capacity) defined for triggering the archive. However, the number of profiles archived is selected to ensure that the number of profiles not archived is 90% of the threshold. 5. To prevent the archiving process from taking too long, an archival rate (`drill.exec.profiles.store.archive.rate`) is defined so that up to that many profiles are archived in one go, before re-rendering resumes. 6. On a Distributed FileSystem (e.g. HDFS), multiple Drillbits might attempt to archive. To mitigate that, if a Drillbit detects that it is unable to archive a profile, it will assume that another Drillbit is also archiving, and stop archiving any more. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Improve loading of profiles listing in the WebUI > > > Key: DRILL-5270 > URL: https://issues.apache.org/jira/browse/DRILL-5270 > Project: Apache Drill > Issue Type: Improvement > Components: Web Server >Affects Versions: 1.9.0 >Reporter: Kunal Khatua >Assignee: Kunal Khatua >Priority: Major > Fix For: 1.14.0 > > > Currently, as the number of profiles increase, we reload the same list of > profiles from the FS. > An ideal improvement would be to detect if there are any new profiles and > only reload from the disk then. Otherwise, a cached list is sufficient. > For a directory of 280K profiles, the load time is close to 6 seconds on a 32 > core server. With the caching, we can get it down to as much as a few > milliseconds. > To render the cache as invalid, we inspect the last modified time of the
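Items 4 and 5 of the PR description above amount to a small piece of arithmetic, sketched below. This is one possible reading of the description, not the PR's actual implementation: `ArchivePlanner` and `profilesToArchive` are hypothetical names, and it is assumed that archiving is triggered only when the profile count exceeds `max-capacity`.

```java
// Hypothetical helper illustrating the archiving arithmetic; not taken from the PR.
class ArchivePlanner {
  // profileCount: profiles currently in the directory
  // maxCapacity:  threshold that triggers archiving (max-capacity)
  // archiveRate:  upper bound on profiles archived in one pass
  static int profilesToArchive(int profileCount, int maxCapacity, int archiveRate) {
    if (profileCount <= maxCapacity) {
      return 0; // below the threshold, nothing to archive
    }
    int retained = maxCapacity * 9 / 10;   // aim to leave 90% of the threshold in place
    int excess = profileCount - retained;
    return Math.min(excess, archiveRate);  // never archive more than the rate in one go
  }
}
```

With a capacity of 1000 and a rate of 500, a directory of 1001 profiles would archive 101 of them (leaving 900), while a directory of 5000 profiles would archive 500 per pass until the count drops back under the threshold.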
[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16464379#comment-16464379 ] ASF GitHub Bot commented on DRILL-5270: --- kkhatua commented on issue #1250: DRILL-5270: Improve loading of profiles listing in the WebUI URL: https://github.com/apache/drill/pull/1250#issuecomment-386731514 **[Current Apache Master]** User latency when 8 web-clients (wget) request for `/profiles` against a profile store of 123K profiles (max scale range= 2min) ![image](https://user-images.githubusercontent.com/4335237/39652431-606b5f94-4fa2-11e8-8166-9da97bddbdc8.png) This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Improve loading of profiles listing in the WebUI > > > Key: DRILL-5270 > URL: https://issues.apache.org/jira/browse/DRILL-5270 > Project: Apache Drill > Issue Type: Improvement > Components: Web Server >Affects Versions: 1.9.0 >Reporter: Kunal Khatua >Assignee: Kunal Khatua >Priority: Major > Fix For: 1.14.0 > > > Currently, as the number of profiles increase, we reload the same list of > profiles from the FS. > An ideal improvement would be to detect if there are any new profiles and > only reload from the disk then. Otherwise, a cached list is sufficient. > For a directory of 280K profiles, the load time is close to 6 seconds on a 32 > core server. With the caching, we can get it down to as much as a few > milliseconds. > To render the cache as invalid, we inspect the last modified time of the > directory to confirm whether a reload is needed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16464382#comment-16464382 ] ASF GitHub Bot commented on DRILL-5270: --- kkhatua commented on issue #1250: DRILL-5270: Improve loading of profiles listing in the WebUI URL: https://github.com/apache/drill/pull/1250#issuecomment-386732039 [DRILL-5270] User latency when 8 web-clients (wget) request for `/profiles` against a profile store of 123K profiles (max scale range= 2min). Note: Only caching is enabled and no new profiles have been written to the store during the 2 min window. _Notice how all the subsequent responses go fast the moment the first response is complete, because of the profile cache._ ![image](https://user-images.githubusercontent.com/4335237/39652532-b533be9a-4fa2-11e8-815e-d46ddcf1b0c5.png) This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Improve loading of profiles listing in the WebUI > > > Key: DRILL-5270 > URL: https://issues.apache.org/jira/browse/DRILL-5270 > Project: Apache Drill > Issue Type: Improvement > Components: Web Server >Affects Versions: 1.9.0 >Reporter: Kunal Khatua >Assignee: Kunal Khatua >Priority: Major > Fix For: 1.14.0 > > > Currently, as the number of profiles increase, we reload the same list of > profiles from the FS. > An ideal improvement would be to detect if there are any new profiles and > only reload from the disk then. Otherwise, a cached list is sufficient. > For a directory of 280K profiles, the load time is close to 6 seconds on a 32 > core server. With the caching, we can get it down to as much as a few > milliseconds. > To render the cache as invalid, we inspect the last modified time of the > directory to confirm whether a reload is needed. 
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
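The invalidation rule from the issue description (reload from disk only when the directory's last modified time has changed) can be sketched roughly as follows. This is a minimal illustration over a local directory using `java.nio.file`, not the code in the pull request; the class name `CachedProfileListing` and its methods are hypothetical, and Drill's store works through the Hadoop FileSystem API rather than `java.nio`.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical illustration only; this is not Drill's LocalPersistentStore. */
class CachedProfileListing {
  private final Path profileDir;
  private FileTime cachedDirTime;   // directory mtime observed when the cache was built
  private List<String> cachedNames = Collections.emptyList();
  private int reloadCount;          // number of times the directory was actually listed

  CachedProfileListing(Path profileDir) {
    this.profileDir = profileDir;
  }

  /** Returns sorted file names, re-listing the directory only if its mtime changed. */
  synchronized List<String> list() throws IOException {
    FileTime current = Files.getLastModifiedTime(profileDir);
    if (!current.equals(cachedDirTime)) {
      List<String> names = new ArrayList<>();
      try (DirectoryStream<Path> stream = Files.newDirectoryStream(profileDir)) {
        for (Path p : stream) {
          names.add(p.getFileName().toString());
        }
      }
      Collections.sort(names);
      cachedNames = Collections.unmodifiableList(names);
      cachedDirTime = current;
      reloadCount++;
    }
    return cachedNames;
  }

  synchronized int reloadCount() {
    return reloadCount;
  }
}
```

One caveat with this kind of check: file systems with coarse timestamp granularity can report the same modification time for two changes made in quick succession, so a real implementation may need an additional safeguard.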
[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16464383#comment-16464383 ] ASF GitHub Bot commented on DRILL-5270: --- kkhatua commented on issue #1250: DRILL-5270: Improve loading of profiles listing in the WebUI URL: https://github.com/apache/drill/pull/1250#issuecomment-386731514 **[Current Apache Master]** User latency when 8 web-clients (wget) request for `/profiles` against a profile store of 123K profiles (max scale range= 2min) _Notice how all the response end times are staggered by ~13 secs from the previous, because of the profiles being re-read from the disk despite there being no change_ ![image](https://user-images.githubusercontent.com/4335237/39652431-606b5f94-4fa2-11e8-8166-9da97bddbdc8.png) This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Improve loading of profiles listing in the WebUI > > > Key: DRILL-5270 > URL: https://issues.apache.org/jira/browse/DRILL-5270 > Project: Apache Drill > Issue Type: Improvement > Components: Web Server >Affects Versions: 1.9.0 >Reporter: Kunal Khatua >Assignee: Kunal Khatua >Priority: Major > Fix For: 1.14.0 > > > Currently, as the number of profiles increase, we reload the same list of > profiles from the FS. > An ideal improvement would be to detect if there are any new profiles and > only reload from the disk then. Otherwise, a cached list is sufficient. > For a directory of 280K profiles, the load time is close to 6 seconds on a 32 > core server. With the caching, we can get it down to as much as a few > milliseconds. > To render the cache as invalid, we inspect the last modified time of the > directory to confirm whether a reload is needed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16464388#comment-16464388 ]

ASF GitHub Bot commented on DRILL-5270:
---
kkhatua commented on issue #1250: DRILL-5270: Improve loading of profiles listing in the WebUI
URL: https://github.com/apache/drill/pull/1250#issuecomment-386733307

[DRILL-5270] User latency when 8 web-clients (wget) request `/profiles` against a profile store of 123K profiles (max scale range = 2 min). The requests are made in 2 waves.

Note: Both caching **and** archiving are enabled, and no new profiles were written to the store during the 2 min window.

Notice how all the subsequent responses speed up the moment the third response is complete. The first 3 clients triggered archiving of profiles from 123K down to about 92K, each time trying to build the cache. By the time the fourth request comes, there is no more archiving, so the requests are served from cache (and, hence, they are barely 2-3 seconds apart). The second wave of requests from the 8 clients is now completely served by the cache.

![image](https://user-images.githubusercontent.com/4335237/39652615-0ce957b2-4fa3-11e8-89ee-a8a09e25cbd7.png)
[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16464389#comment-16464389 ] ASF GitHub Bot commented on DRILL-5270: --- kkhatua commented on issue #1250: DRILL-5270: Improve loading of profiles listing in the WebUI URL: https://github.com/apache/drill/pull/1250#issuecomment-386734075 @arina-ielchiieva / @parthchandra could you review this? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Improve loading of profiles listing in the WebUI > > > Key: DRILL-5270 > URL: https://issues.apache.org/jira/browse/DRILL-5270 > Project: Apache Drill > Issue Type: Improvement > Components: Web Server >Affects Versions: 1.9.0 >Reporter: Kunal Khatua >Assignee: Kunal Khatua >Priority: Major > Fix For: 1.14.0 > > > Currently, as the number of profiles increase, we reload the same list of > profiles from the FS. > An ideal improvement would be to detect if there are any new profiles and > only reload from the disk then. Otherwise, a cached list is sufficient. > For a directory of 280K profiles, the load time is close to 6 seconds on a 32 > core server. With the caching, we can get it down to as much as a few > milliseconds. > To render the cache as invalid, we inspect the last modified time of the > directory to confirm whether a reload is needed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16464420#comment-16464420 ]

ASF GitHub Bot commented on DRILL-5270:
---
kkhatua commented on issue #1250: DRILL-5270: Improve loading of profiles listing in the WebUI
URL: https://github.com/apache/drill/pull/1250#issuecomment-386733307

[DRILL-5270] User latency when 8 web-clients (wget) request `/profiles` against a profile store of 123K profiles (max scale range = 2 min). The requests are made in 2 waves.

Note: Both caching **and** archiving are enabled, and no new profiles were written to the store during the 2 min window.

Notice how all the subsequent responses speed up the moment the third response is complete. The first 3 clients triggered archiving of profiles from 123K down to about 92K, each time trying to build the cache. By the time the fourth request comes, there is no more archiving, so the requests are served from cache (and, hence, they are barely 2-3 seconds apart). The second wave of requests from the 8 clients is now completely served by the cache.

![image](https://user-images.githubusercontent.com/4335237/39652615-0ce957b2-4fa3-11e8-89ee-a8a09e25cbd7.png)

Backend logging reveals the archiving process:
```
2018-05-01 22:47:37,870 kk127.qa.lab [qtp132047013-85] INFO o.a.d.e.s.s.s.LocalPersistentStore - Requesting thread: qtp132047013-85-85
2018-05-01 22:47:45,131 kk127.qa.lab [qtp132047013-85] INFO o.a.d.e.s.s.s.LocalPersistentStore - Found 32935 excess profiles. For now, will attempt archiving 1 profiles to maprfs:/drillbit/profiles/archived
2018-05-01 22:48:04,771 kk127.qa.lab [qtp132047013-85] INFO o.a.d.e.s.s.s.LocalPersistentStore - Archived 1 profiles to maprfs:/drillbit/profiles/archived in 19635 ms
2018-05-01 22:48:04,774 kk127.qa.lab [qtp132047013-85] WARN o.a.d.e.s.s.s.LocalPersistentStore - Took 26902 ms to list & map 300 profiles (out of 122935 profiles in store)
2018-05-01 22:48:12,310 kk127.qa.lab [qtp132047013-85] INFO o.a.d.e.s.s.s.LocalPersistentStore - Requesting thread: qtp132047013-85-85
2018-05-01 22:48:18,439 kk127.qa.lab [qtp132047013-85] INFO o.a.d.e.s.s.s.LocalPersistentStore - Found 22935 excess profiles. For now, will attempt archiving 1 profiles to maprfs:/drillbit/profiles/archived
2018-05-01 22:48:38,234 kk127.qa.lab [qtp132047013-85] INFO o.a.d.e.s.s.s.LocalPersistentStore - Archived 1 profiles to maprfs:/drillbit/profiles/archived in 19791 ms
2018-05-01 22:48:38,236 kk127.qa.lab [qtp132047013-85] WARN o.a.d.e.s.s.s.LocalPersistentStore - Took 25924 ms to list & map 300 profiles (out of 112935 profiles in store)
2018-05-01 22:48:43,275 kk127.qa.lab [qtp132047013-85] INFO o.a.d.e.s.s.s.LocalPersistentStore - Requesting thread: qtp132047013-85-85
2018-05-01 22:48:48,911 kk127.qa.lab [qtp132047013-85] INFO o.a.d.e.s.s.s.LocalPersistentStore - Found 12935 excess profiles. For now, will attempt archiving 1 profiles to maprfs:/drillbit/profiles/archived
2018-05-01 22:49:09,757 kk127.qa.lab [qtp132047013-85] INFO o.a.d.e.s.s.s.LocalPersistentStore - Archived 1 profiles to maprfs:/drillbit/profiles/archived in 20842 ms
2018-05-01 22:49:09,759 kk127.qa.lab [qtp132047013-85] WARN o.a.d.e.s.s.s.LocalPersistentStore - Took 26482 ms to list & map 300 profiles (out of 102935 profiles in store)
2018-05-01 22:49:14,119 kk127.qa.lab [qtp132047013-85] INFO o.a.d.e.s.s.s.LocalPersistentStore - Requesting thread: qtp132047013-85-85
2018-05-01 22:49:19,339 kk127.qa.lab [qtp132047013-85] WARN o.a.d.e.s.s.s.LocalPersistentStore - Took 5217 ms to list & map 300 profiles (out of 92935 profiles in store)
2018-05-01 22:49:23,656 kk127.qa.lab [qtp132047013-85] INFO o.a.d.e.s.s.s.LocalPersistentStore - Requesting thread: qtp132047013-85-85
2018-05-01 22:49:24,214 kk127.qa.lab [qtp132047013-85] INFO o.a.d.e.s.s.s.LocalPersistentStore - Requesting thread: qtp132047013-85-85
2018-05-01 22:49:24,798 kk127.qa.lab [qtp132047013-85] INFO o.a.d.e.s.s.s.LocalPersistentStore - Requesting thread: qtp132047013-85-85
2018-05-01 22:49:25,365 kk127.qa.lab [qtp132047013-85] INFO o.a.d.e.s.s.s.LocalPersistentStore - Requesting thread: qtp132047013-85-85
2018-05-01 22:55:12,247 kk127.qa.lab [qtp132047013-92] INFO o.a.d.e.s.s.s.LocalPersistentStore - Requesting thread: qtp132047013-92-92
2018-05-01 22:55:12,791 kk127.qa.lab [qtp132047013-92] INFO o.a.d.e.s.s.s.LocalPersistentStore - Requesting thread: qtp132047013-92-92
2018-05-01 22:55:13,276 kk127.qa.lab [qtp132047013-92] INFO o.a.d.e.s.s.s.LocalPersistentStore - Requesting thread: qtp132047013-92-92
2018-05-01 22:55:13,770 kk127.qa.lab [qtp132047013-92] INFO o.a.d.e.s.s.s.LocalPersistentStore - Requesting thread: qtp132047013-92-92
2018-05-01 22:55:30,477 kk127.qa.lab [qtp132047013-92]
```
[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16466821#comment-16466821 ] ASF GitHub Bot commented on DRILL-5270: --- ilooner commented on issue #1250: DRILL-5270: Improve loading of profiles listing in the WebUI URL: https://github.com/apache/drill/pull/1250#issuecomment-387275507 @kkhatua Why not use the Guava Cache? http://www.baeldung.com/guava-cache . I think it would simplify the implementation. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Improve loading of profiles listing in the WebUI > > > Key: DRILL-5270 > URL: https://issues.apache.org/jira/browse/DRILL-5270 > Project: Apache Drill > Issue Type: Improvement > Components: Web Server >Affects Versions: 1.9.0 >Reporter: Kunal Khatua >Assignee: Kunal Khatua >Priority: Major > Fix For: 1.14.0 > > > Currently, as the number of profiles increase, we reload the same list of > profiles from the FS. > An ideal improvement would be to detect if there are any new profiles and > only reload from the disk then. Otherwise, a cached list is sufficient. > For a directory of 280K profiles, the load time is close to 6 seconds on a 32 > core server. With the caching, we can get it down to as much as a few > milliseconds. > To render the cache as invalid, we inspect the last modified time of the > directory to confirm whether a reload is needed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16467733#comment-16467733 ]

ASF GitHub Bot commented on DRILL-5270:
---
kkhatua commented on issue #1250: DRILL-5270: Improve loading of profiles listing in the WebUI
URL: https://github.com/apache/drill/pull/1250#issuecomment-387483281

I did consider using the Guava Cache initially, but I could not figure out how to specify the eviction policy based on the profile name. Guava provides a mechanism to limit the cache size and evict the oldest entry, but I wanted to override the mechanism that defines 'oldest'. Lastly, the TreeSet allows us to access the elements in sorted order, which seemed to be missing in Guava.
[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16467737#comment-16467737 ]

ASF GitHub Bot commented on DRILL-5270:
---
kkhatua commented on issue #1250: DRILL-5270: Improve loading of profiles listing in the WebUI
URL: https://github.com/apache/drill/pull/1250#issuecomment-387483281

I did consider using the Guava Cache initially, but I could not figure out how to specify the eviction policy based on the profile name. Guava provides a mechanism to limit the cache size and evict the oldest entry, but I wanted to override the mechanism that defines 'oldest'. Lastly, the TreeSet allows us to access the elements in sorted order, which seemed to be missing in Guava.

Do you think it would make the code cleaner if I were to extract the mechanism into a separate implementation of this 'cache'?
[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16467811#comment-16467811 ] ASF GitHub Bot commented on DRILL-5270: --- ilooner commented on issue #1250: DRILL-5270: Improve loading of profiles listing in the WebUI URL: https://github.com/apache/drill/pull/1250#issuecomment-387502051 @kkhatua I'm still not sure why you want to override the definition of oldest? Why is the default LRU eviction policy not sufficient? If you need an ordered list of keys for the cache you can accomplish this with the Guava cache by adding a key to a TreeSet when the Loader is called, and removing a key from a TreeSet when the Removal Listener is called. My main concern is that implementing our own cache creates complexity and opens up the possibility for bugs. Whereas a pre-existing cache is already debugged and tested for us. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Improve loading of profiles listing in the WebUI > > > Key: DRILL-5270 > URL: https://issues.apache.org/jira/browse/DRILL-5270 > Project: Apache Drill > Issue Type: Improvement > Components: Web Server >Affects Versions: 1.9.0 >Reporter: Kunal Khatua >Assignee: Kunal Khatua >Priority: Major > Fix For: 1.14.0 > > > Currently, as the number of profiles increase, we reload the same list of > profiles from the FS. > An ideal improvement would be to detect if there are any new profiles and > only reload from the disk then. Otherwise, a cached list is sufficient. > For a directory of 280K profiles, the load time is close to 6 seconds on a 32 > core server. With the caching, we can get it down to as much as a few > milliseconds. 
> To render the cache as invalid, we inspect the last modified time of the > directory to confirm whether a reload is needed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469293#comment-16469293 ]

ASF GitHub Bot commented on DRILL-5270:
---
kkhatua commented on issue #1250: DRILL-5270: Improve loading of profiles listing in the WebUI
URL: https://github.com/apache/drill/pull/1250#issuecomment-387839266

The cache is constructed by first listing all the profile files and sorting them (the profile ID is generated as a monotonically decreasing value to ensure sortedness in stores like HBase). This customized TreeSet is used to inject profiles (since the FileSystem is not guaranteed to return the list in order), so the TreeSet provides the ordering. We retain only the first N (which are, implicitly, the latest profiles). If we were to add more profiles than the max capacity, the TreeSet is pruned at the rightmost end.

With Guava, the eviction policy provides the option of limiting the size, but its basis for evicting a profile (the least recently used/accessed entry) would not work here. Also, this is currently not a true cache, because the moment we detect changes in the underlying store, we reconstruct this 'cache'. Ideally, we'd want to identify only the newest profiles returned from the FileSystem (using filename filters), but the Hadoop API performance is the same irrespective of the filter. We primarily save the time spent fetching the file list from the FS and deserializing.

I can move the implementation of the TreeSet to a separate class to clean up the code. That would make debugging simpler too. With Guava, I don't see the value add beyond a lower risk of bugs, which should be minimal with the TreeSet too.
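The structure described in this comment (a sorted set that keeps only the first N elements and prunes at the rightmost end) can be sketched as below. This is an illustrative guess at the shape of such a class, not the implementation in the pull request; the name `BoundedSortedSet` and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical sketch; the class in the pull request may differ. */
class BoundedSortedSet<T> {
  private final TreeSet<T> set;
  private final int maxCapacity;

  BoundedSortedSet(int maxCapacity, Comparator<? super T> order) {
    if (maxCapacity < 1) {
      throw new IllegalArgumentException("maxCapacity must be at least 1");
    }
    this.set = new TreeSet<>(order);
    this.maxCapacity = maxCapacity;
  }

  /**
   * Adds the element, then prunes the last (rightmost) element if the set
   * exceeds its capacity. Returns the pruned element, or null if none.
   */
  T add(T element) {
    set.add(element);
    return set.size() > maxCapacity ? set.pollLast() : null;
  }

  /** Returns the retained elements in sorted order. */
  List<T> asList() {
    return new ArrayList<>(set);
  }

  int size() {
    return set.size();
  }
}
```

Because profile IDs are described as monotonically decreasing, natural ordering on the ID would place the newest profiles first, so pruning the last element drops the oldest retained profile. Input order does not matter: adding "e", "b", "d", "a", "c" to a set of capacity 3 leaves "a", "b", "c".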
[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469487#comment-16469487 ]

ASF GitHub Bot commented on DRILL-5270:
---
ilooner commented on issue #1250: DRILL-5270: Improve loading of profiles listing in the WebUI
URL: https://github.com/apache/drill/pull/1250#issuecomment-387873929

@kkhatua I think I understand the difference in our two perspectives. You wanted a cache that will always contain only the **N** most recently created profiles. If you happen to access the **N + 1**th youngest profile, the cache will not contain it and never will; it only holds the **N** most recently created profiles.

I still prefer the approach with the Guava cache because you can effectively achieve the same result. As new profiles are created, they can be added to the cache. If you access a very old profile, one more recently created profile will be evicted from the cache and the old profile will be added, since a user just requested it. I would argue this behavior is not only easier to implement, since we are leveraging a library, but actually more desirable, since it caches a profile based on when it is used, not when it was created.

If you still disagree with using the Guava cache, I agree with your proposal of moving your cache into a separate class. I think you should also add some unit tests for the cache to verify that it works as expected. The unit tests will also make maintaining and enhancing the class easier for future developers.
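The usage-based eviction argued for in this comment can be illustrated without Guava by a size-bounded, access-ordered `LinkedHashMap`. This is only a stand-in to show the behavior, under the assumption that a plain least-recently-used policy is what is meant; it is not Guava's cache, and the class name `LruCache` is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Stand-in for a size-bounded LRU cache; this is not Guava's Cache. */
class LruCache<K, V> extends LinkedHashMap<K, V> {
  private static final long serialVersionUID = 1L;
  private final int maxEntries;

  LruCache(int maxEntries) {
    // accessOrder = true: iteration order runs from least to most recently accessed
    super(16, 0.75f, true);
    this.maxEntries = maxEntries;
  }

  /** Called by put(); returning true evicts the least recently accessed entry. */
  @Override
  protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
    return size() > maxEntries;
  }
}
```

With a capacity of 2, putting `p1` and `p2`, reading `p1`, and then putting `p3` evicts `p2`: the entry that survives is the one most recently used, regardless of creation order. That is the difference from the TreeSet approach, which always retains the N newest profiles by ID.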
[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16470959#comment-16470959 ] ASF GitHub Bot commented on DRILL-5270: --- kkhatua commented on issue #1250: DRILL-5270: Improve loading of profiles listing in the WebUI URL: https://github.com/apache/drill/pull/1250#issuecomment-388152463 I actually like the Guava cache approach for its elegance and capabilities, but it expands the scope significantly without a huge benefit from what we currently have. The concept of the cache that you are envisioning is with the complete profile. This is only for listing of the profiles. When an individual profile is accessed, Drill ends up fetching a new copy from the PStore to serialize the contents to visualize it. I'll move the class and add some unit tests as well. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Improve loading of profiles listing in the WebUI > > > Key: DRILL-5270 > URL: https://issues.apache.org/jira/browse/DRILL-5270 > Project: Apache Drill > Issue Type: Improvement > Components: Web Server >Affects Versions: 1.9.0 >Reporter: Kunal Khatua >Assignee: Kunal Khatua >Priority: Major > Fix For: 1.14.0 > > > Currently, as the number of profiles increase, we reload the same list of > profiles from the FS. > An ideal improvement would be to detect if there are any new profiles and > only reload from the disk then. Otherwise, a cached list is sufficient. > For a directory of 280K profiles, the load time is close to 6 seconds on a 32 > core server. With the caching, we can get it down to as much as a few > milliseconds. > To render the cache as invalid, we inspect the last modified time of the > directory to confirm whether a reload is needed. 
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16471016#comment-16471016 ] ASF GitHub Bot commented on DRILL-5270: --- ilooner commented on issue #1250: DRILL-5270: Improve loading of profiles listing in the WebUI URL: https://github.com/apache/drill/pull/1250#issuecomment-388166323 @kkhatua Sounds good. Thanks for the explanations and thanks for improving the performance so much :) ! This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Improve loading of profiles listing in the WebUI > > > Key: DRILL-5270 > URL: https://issues.apache.org/jira/browse/DRILL-5270 > Project: Apache Drill > Issue Type: Improvement > Components: Web Server >Affects Versions: 1.9.0 >Reporter: Kunal Khatua >Assignee: Kunal Khatua >Priority: Major > Fix For: 1.14.0 > > > Currently, as the number of profiles increase, we reload the same list of > profiles from the FS. > An ideal improvement would be to detect if there are any new profiles and > only reload from the disk then. Otherwise, a cached list is sufficient. > For a directory of 280K profiles, the load time is close to 6 seconds on a 32 > core server. With the caching, we can get it down to as much as a few > milliseconds. > To render the cache as invalid, we inspect the last modified time of the > directory to confirm whether a reload is needed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16473159#comment-16473159 ]

ASF GitHub Bot commented on DRILL-5270:
---------------------------------------

kkhatua commented on issue #1250: DRILL-5270: Improve loading of profiles listing in the WebUI
URL: https://github.com/apache/drill/pull/1250#issuecomment-388564546

Done all the changes. Found an unused import in an unrelated file, so I fixed that to make sure the code builds after rebasing to latest master.
@arina-ielchiieva / @parthchandra / @ilooner Can any (or all) of you do a review?
[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16474313#comment-16474313 ]

ASF GitHub Bot commented on DRILL-5270:
---------------------------------------

arina-ielchiieva commented on a change in pull request #1250: DRILL-5270: Improve loading of profiles listing in the WebUI
URL: https://github.com/apache/drill/pull/1250#discussion_r187981315

## File path: exec/java-exec/src/test/java/org/apache/drill/exec/store/sys/TestProfileSet.java ##
@@ -0,0 +1,130 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.sys;
+
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertTrue;
+
+import java.util.LinkedList;
+import java.util.List;
+import java.util.Random;
+
+import org.apache.commons.lang3.StringUtils;
+import org.apache.drill.exec.store.sys.store.ProfileSet;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+/**
+ * Test the size-constrained ProfileSet for use in the webserver's '/profiles' listing
+ */
+public class TestProfileSet {
+  private final static String PROFILE_PREFIX = "t35t-pr0fil3-";
+  static int initCapacity;
+  static int finalCapacity;
+  static int storeCount;
+  static Random rand;
+  static List<String> masterList;
+
+  @BeforeClass
+  public static void setupProfileSet() {
+    initCapacity = 50;
+    finalCapacity = 70;
+    storeCount = 100;
+    rand = new Random();
+    //Generating source list of storeCount # 'profiles'
+    masterList = new LinkedList<String>();
+    for (int i = 0; i < storeCount; i++) {
+      masterList.add(PROFILE_PREFIX + StringUtils.leftPad(String.valueOf(i), String.valueOf(storeCount).length(), '0'));
+    }
+  }
+
+  @Test
+  public void testProfileOrder() throws Exception {
+    //clone initial # profiles and verify via iterator.
+    ProfileSet testSet = new ProfileSet(initCapacity);
+    List<String> srcList = new LinkedList<String>(masterList);
+
+    //Loading randomly
+    for (int i = 0; i < initCapacity; i++) {
+      String poppedProfile = testSet.add(srcList.remove(rand.nextInt(storeCount - i)));
+      assert (poppedProfile == null);
+      assertEquals(null, poppedProfile);
+    }
+
+    //Testing order
+    String prevProfile = null;
+    while (!testSet.isEmpty()) {
+      String currOldestProfile = testSet.removeOldest();
+      if (prevProfile != null) {
+        assertTrue( prevProfile.compareTo(currOldestProfile) > 0 );
+      }
+      prevProfile = currOldestProfile;
+    }
+  }
+
+  //Test if inserts exceeding capacity leads to eviction of oldest
+  @Test
+  public void testExcessInjection() throws Exception {
+    //clone initial # profiles and verify via iterator.
+    ProfileSet testSet = new ProfileSet(initCapacity);
+    List<String> srcList = new LinkedList<String>(masterList);
+
+    //Loading randomly
+    for (int i = 0; i < initCapacity; i++) {
+      String poppedProfile = testSet.add(srcList.remove(rand.nextInt(storeCount - i)));
+      assertEquals(null, poppedProfile);
+    }
+
+    //Testing Excess by looking at oldest popped
+    for (int i = initCapacity; i < finalCapacity; i++) {
+      String toInsert = srcList.remove(rand.nextInt(storeCount - i));
+      String expectedToPop = ( toInsert.compareTo(testSet.getOldest()) > 0 ?
+          toInsert : testSet.getOldest() );
+
+      String oldestPoppedProfile = testSet.add(toInsert);
+      assertEquals(expectedToPop, oldestPoppedProfile);
+    }
+
+    assertEquals(initCapacity, testSet.size());
+  }
+
+  //Test if size internally resizes to final capacity with no evictions
+  @Test
+  public void testSetResize() throws Exception {
+    //clone initial # profiles into a 700-capacity set.
+    ProfileSet testSet = new ProfileSet(finalCapacity);
+    List<String> srcList = new LinkedList<String>(masterList);
+
+    //Loading randomly
+    for (int i = 0; i < initCapacity; i++) {
+      String poppedProfile = testSet.add(srcList.remove(rand.nextInt(storeCount - i)));
+      assertEquals(null, poppedProfile);
+    }
+
+    assert(testSet.size() == initCapacity);

Review comment:
Please use junit assertions in tests.

---
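The reviewer does not state the reason for preferring JUnit assertions, but one common motivation (an assumption here) is that the Java `assert` keyword is silently skipped unless the JVM runs with assertions enabled (`-ea`), while an assertion method always executes and reports both values on failure. The sketch below illustrates the difference with the standard library only; `AssertionStyleDemo` and its `assertEquals` are hypothetical stand-ins written for this example, not JUnit's `org.junit.Assert`.

```java
import java.util.Objects;

/** Hypothetical demo class; its assertEquals is a stand-in, not JUnit's. */
class AssertionStyleDemo {

  /** Always-on check that reports both values, similar in spirit to JUnit's assertEquals. */
  static void assertEquals(Object expected, Object actual) {
    if (!Objects.equals(expected, actual)) {
      throw new AssertionError("expected:<" + expected + "> but was:<" + actual + ">");
    }
  }

  /** Returns true only if the JVM was started with assertions enabled (-ea). */
  static boolean assertKeywordEnabled() {
    boolean enabled = false;
    // The assignment is a side effect that runs only when 'assert' is active.
    assert enabled = true;
    return enabled;
  }

  public static void main(String[] args) {
    System.out.println("assert keyword active: " + assertKeywordEnabled());
    assertEquals(50, 50);           // passes regardless of JVM flags
    try {
      assertEquals(50, 49);         // fails regardless of JVM flags
    } catch (AssertionError e) {
      System.out.println("caught: " + e.getMessage());
    }
  }
}
```

In the test under review, this would mean replacing `assert(testSet.size() == initCapacity);` with `assertEquals(initCapacity, testSet.size());`, the form the same file already uses at the end of `testExcessInjection`.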
[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16474317#comment-16474317 ]

ASF GitHub Bot commented on DRILL-5270:
---------------------------------------

arina-ielchiieva commented on a change in pull request #1250: DRILL-5270: Improve loading of profiles listing in the WebUI
URL: https://github.com/apache/drill/pull/1250#discussion_r187983484

## File path: exec/java-exec/src/main/java/org/apache/drill/exec/store/sys/store/LocalPersistentStore.java ##
@@ -1,220 +1,377 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements. See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership. The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License. You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package org.apache.drill.exec.store.sys.store;
-
-import static org.apache.drill.exec.ExecConstants.DRILL_SYS_FILE_SUFFIX;
-
-import java.io.File;
-import java.io.IOException;
-import java.io.InputStream;
-import java.io.OutputStream;
-import java.util.Collections;
-import java.util.Iterator;
-import java.util.List;
-import java.util.Map;
-import java.util.Map.Entry;
-
-import javax.annotation.Nullable;
-
-import org.apache.commons.io.IOUtils;
-import org.apache.drill.common.collections.ImmutableEntry;
-import org.apache.drill.common.config.DrillConfig;
-import org.apache.drill.exec.store.dfs.DrillFileSystem;
-import org.apache.drill.exec.util.DrillFileSystemUtil;
-import org.apache.drill.exec.store.sys.BasePersistentStore;
-import org.apache.drill.exec.store.sys.PersistentStoreConfig;
-import org.apache.drill.exec.store.sys.PersistentStoreMode;
-import org.apache.hadoop.conf.Configuration;
-import org.apache.hadoop.fs.FileStatus;
-import org.apache.hadoop.fs.FileSystem;
-import org.apache.hadoop.fs.Path;
-
-import com.google.common.base.Function;
-import com.google.common.base.Preconditions;
-import com.google.common.collect.Iterables;
-import com.google.common.collect.Lists;
-import org.apache.hadoop.fs.PathFilter;
-import org.slf4j.Logger;
-import org.slf4j.LoggerFactory;
-
-public class LocalPersistentStore<V> extends BasePersistentStore<V> {
-  private static final Logger logger = LoggerFactory.getLogger(LocalPersistentStore.class);
-
-  private final Path basePath;
-  private final PersistentStoreConfig<V> config;
-  private final DrillFileSystem fs;
-
-  public LocalPersistentStore(DrillFileSystem fs, Path base, PersistentStoreConfig<V> config) {
-    this.basePath = new Path(base, config.getName());
-    this.config = config;
-    this.fs = fs;
-    try {
-      mkdirs(getBasePath());
-    } catch (IOException e) {
-      throw new RuntimeException("Failure setting pstore configuration path.");
-    }
-  }
-
-  protected Path getBasePath() {
-    return basePath;
-  }
-
-  @Override
-  public PersistentStoreMode getMode() {
-    return PersistentStoreMode.PERSISTENT;
-  }
-
-  private void mkdirs(Path path) throws IOException {
-    fs.mkdirs(path);
-  }
-
-  public static Path getLogDir() {
-    String drillLogDir = System.getenv("DRILL_LOG_DIR");
-    if (drillLogDir == null) {
-      drillLogDir = System.getProperty("drill.log.dir");
-    }
-    if (drillLogDir == null) {
-      drillLogDir = "/var/log/drill";
-    }
-    return new Path(new File(drillLogDir).getAbsoluteFile().toURI());
-  }
-
-  public static DrillFileSystem getFileSystem(DrillConfig config, Path root) throws IOException {
-    Path blobRoot = root == null ? getLogDir() : root;
-    Configuration fsConf = new Configuration();
-    if (blobRoot.toUri().getScheme() != null) {
-      fsConf.set(FileSystem.FS_DEFAULT_NAME_KEY, blobRoot.toUri().toString());
-    }
-
-
-    DrillFileSystem fs = new DrillFileSystem(fsConf);
-    fs.mkdirs(blobRoot);
-    return fs;
-  }
-
-  @Override
-  public Iterator<Map.Entry<String, V>> getRange(int skip, int take) {
-    try {
-      // list only files with sys file suffix
-      PathFilter sysFileSuffixFilter = new PathFilter() {
-        @Override
-        public boolean accept(Path path) {
-          return path.getName().endsWith(DRILL_SYS_FILE_SUFFIX);
-        }
-      };
-
-      List<FileStatus> fileStatuses = DrillFileSystemUtil.listFiles(fs, basePath, false, sysFileSuffixFilter);
-      if (fileStatuses.isEmpty()) {
-        return Collections.emptyIterator();
-      }
-
-      List<String> files = Lists.newArrayList();
-      for (File
[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16474314#comment-16474314 ]

ASF GitHub Bot commented on DRILL-5270:
---------------------------------------

arina-ielchiieva commented on a change in pull request #1250: DRILL-5270: Improve loading of profiles listing in the WebUI
URL: https://github.com/apache/drill/pull/1250#discussion_r187981529

## File path: exec/java-exec/src/main/java/org/apache/drill/exec/store/sys/store/ProfileSet.java ##
@@ -0,0 +1,152 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.sys.store;
+
+import java.util.Iterator;
+import java.util.TreeSet;
+import java.util.concurrent.atomic.AtomicInteger;
+
+/**
+ * Wrapper around TreeSet to mimic a size-bound set ordered by name (implicitly the profiles' age)
+ */
+public class ProfileSet implements Iterable<String> {
+  private TreeSet<String> store;
+  private int maxCapacity;
+  //Using a dedicated counter to avoid
+  private AtomicInteger size;
+
+  @SuppressWarnings("unused")
+  @Deprecated
+  private ProfileSet() {}
+
+  public ProfileSet(int capacity) {
+    this.store = new TreeSet<String>();
+    this.maxCapacity = capacity;
+    this.size = new AtomicInteger();
+  }
+
+  public int size() {
+    return size.get();
+  }
+
+  /**
+   * Get max capacity of the profile set
+   * @return max capacity
+   */
+  public int capacity() {
+    return maxCapacity;
+  }
+
+  /**
+   * Add a profile name to the set, while removing the oldest, if exceeding capacity
+   * @param profile
+   * @return oldest profile
+   */
+  public String add(String profile) {
+    return add(profile, false);
+  }
+
+  /**
+   * Add a profile name to the set, while removing the oldest or youngest, based on flag
+   * @param profile
+   * @param retainOldest indicate retaining policy as oldest
+   * @return youngest/oldest profile
+   */
+  public String add(String profile, boolean retainOldest) {
+    store.add(profile);
+    if ( size.incrementAndGet() > maxCapacity ) {

Review comment:
Please remove spaces.
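The size-bound behavior described in the `ProfileSet` Javadoc (add a name, and evict the oldest entry once capacity is exceeded) can be sketched in a few lines. This is an illustrative sketch, not the class from the pull request: `BoundedNameSet` and its methods are hypothetical names, and it assumes, as the expectations in `TestProfileSet` suggest, that the lexicographically largest name is the oldest profile.

```java
import java.util.TreeSet;

/**
 * Hypothetical sketch of a capacity-bound, name-ordered set.
 * Assumption: the lexicographically largest name is the oldest entry.
 */
class BoundedNameSet {
  private final TreeSet<String> store = new TreeSet<>();
  private final int maxCapacity;

  BoundedNameSet(int maxCapacity) {
    if (maxCapacity <= 0) {
      throw new IllegalArgumentException("capacity must be positive");
    }
    this.maxCapacity = maxCapacity;
  }

  /**
   * Adds a name. If the set now exceeds its capacity, removes and returns
   * the oldest (largest) name, which may be the one just added.
   * Returns null when nothing was evicted.
   */
  String add(String name) {
    store.add(name);
    return store.size() > maxCapacity ? store.pollLast() : null;
  }

  /** Returns the oldest (largest) name, or null if the set is empty. */
  String oldest() {
    return store.isEmpty() ? null : store.last();
  }

  int size() {
    return store.size();
  }
}
```

Unlike the quoted diff, this sketch relies on `TreeSet.size()` rather than a separate counter, so adding a name that is already present cannot make the reported size drift from the actual contents. It is not thread-safe; a caller sharing it across request threads would need external synchronization.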
[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16474318#comment-16474318 ]

ASF GitHub Bot commented on DRILL-5270:
---------------------------------------

arina-ielchiieva commented on a change in pull request #1250: DRILL-5270: Improve loading of profiles listing in the WebUI
URL: https://github.com/apache/drill/pull/1250#discussion_r187984531

## File path: exec/java-exec/src/main/java/org/apache/drill/exec/store/sys/store/LocalPersistentStore.java ##
@@ -1,220 +1,377 @@
[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16474320#comment-16474320 ]

ASF GitHub Bot commented on DRILL-5270:
---------------------------------------

arina-ielchiieva commented on a change in pull request #1250: DRILL-5270: Improve loading of profiles listing in the WebUI
URL: https://github.com/apache/drill/pull/1250#discussion_r187984786

## File path: exec/java-exec/src/main/java/org/apache/drill/exec/store/sys/store/LocalPersistentStore.java ##
@@ -1,220 +1,377 @@
[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16474319#comment-16474319 ]

ASF GitHub Bot commented on DRILL-5270:
---------------------------------------

arina-ielchiieva commented on a change in pull request #1250: DRILL-5270: Improve loading of profiles listing in the WebUI
URL: https://github.com/apache/drill/pull/1250#discussion_r187982481

## File path: exec/java-exec/src/main/java/org/apache/drill/exec/store/sys/store/LocalPersistentStore.java ##
@@ -1,220 +1,377 @@
[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16474316#comment-16474316 ]

ASF GitHub Bot commented on DRILL-5270:
---------------------------------------

arina-ielchiieva commented on a change in pull request #1250: DRILL-5270: Improve loading of profiles listing in the WebUI
URL: https://github.com/apache/drill/pull/1250#discussion_r187984321

## File path: exec/java-exec/src/main/java/org/apache/drill/exec/store/sys/store/LocalPersistentStore.java ##
@@ -1,220 +1,377 @@
[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16474315#comment-16474315 ] ASF GitHub Bot commented on DRILL-5270: --- arina-ielchiieva commented on a change in pull request #1250: DRILL-5270: Improve loading of profiles listing in the WebUI URL: https://github.com/apache/drill/pull/1250#discussion_r187981874 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/store/sys/store/LocalPersistentStore.java ##
[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16474652#comment-16474652 ] ASF GitHub Bot commented on DRILL-5270: --- kkhatua commented on a change in pull request #1250: DRILL-5270: Improve loading of profiles listing in the WebUI URL: https://github.com/apache/drill/pull/1250#discussion_r188063691 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/store/sys/store/LocalPersistentStore.java ##
[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16474658#comment-16474658 ] ASF GitHub Bot commented on DRILL-5270: --- kkhatua commented on a change in pull request #1250: DRILL-5270: Improve loading of profiles listing in the WebUI URL: https://github.com/apache/drill/pull/1250#discussion_r188064161 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/store/sys/store/LocalPersistentStore.java ##
[jira] [Commented] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16488008#comment-16488008 ]

ASF GitHub Bot commented on DRILL-5270:
---

kkhatua commented on issue #1250: DRILL-5270: Improve loading of profiles listing in the WebUI
URL: https://github.com/apache/drill/pull/1250#issuecomment-391489764

@arina-ielchiieva I've made the following changes:
1. Refactored to introduce an Archiver.
2. Allowed the cache to apply only to the WebServer.
3. For non-WebServer requests, like SysTables, added support for recursive listing. This is because, while archiving speeds up the WebServer, SysTables would still need access to archived profiles for analytics.
4. Added tests for the ProfileSet cache.

> Improve loading of profiles listing in the WebUI
>
> Key: DRILL-5270
> URL: https://issues.apache.org/jira/browse/DRILL-5270
> Project: Apache Drill
> Issue Type: Improvement
> Components: Web Server
> Affects Versions: 1.9.0
> Reporter: Kunal Khatua
> Assignee: Kunal Khatua
> Priority: Major
> Fix For: 1.14.0
>
> Currently, as the number of profiles increases, we reload the same list of
> profiles from the FS.
> An ideal improvement would be to detect if there are any new profiles and
> only reload from the disk then. Otherwise, a cached list is sufficient.
> For a directory of 280K profiles, the load time is close to 6 seconds on a 32
> core server. With the caching, we can get it down to as little as a few
> milliseconds.
> To render the cache as invalid, we inspect the last modified time of the
> directory to confirm whether a reload is needed.
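
For readers following the thread, the invalidation rule in the issue description (reload the listing only when the directory's last-modified time has changed, otherwise serve the cached list) can be sketched roughly as below. This is not the code from pull request #1250: the class name CachedProfileListing and its methods are hypothetical, and the sketch uses plain java.nio.file on a local directory instead of Drill's DrillFileSystem and Hadoop's FileStatus. The suffix passed in is an assumption standing in for DRILL_SYS_FILE_SUFFIX.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/**
 * Hypothetical sketch (not Drill's LocalPersistentStore): caches the names of
 * files with a given suffix and rescans only when the directory mtime changes.
 */
class CachedProfileListing {
  private final Path dir;
  private final String suffix;          // assumed stand-in for DRILL_SYS_FILE_SUFFIX
  private FileTime lastSeenModified;    // null until the first scan
  private List<String> cached = Collections.emptyList();
  private int reloadCount;              // exposed only so the behaviour can be observed

  CachedProfileListing(Path dir, String suffix) {
    this.dir = dir;
    this.suffix = suffix;
  }

  /** Returns the sorted file names, rescanning the directory only if its mtime moved. */
  synchronized List<String> list() throws IOException {
    // Read the timestamp before scanning: a file added mid-scan bumps the
    // mtime again, so the next call simply rescans.
    FileTime current = Files.getLastModifiedTime(dir);
    if (lastSeenModified == null || !current.equals(lastSeenModified)) {
      List<String> names = new ArrayList<>();
      try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, "*" + suffix)) {
        for (Path p : stream) {
          names.add(p.getFileName().toString());
        }
      }
      Collections.sort(names);
      cached = Collections.unmodifiableList(names);
      lastSeenModified = current;
      reloadCount++;
    }
    return cached;
  }

  synchronized int reloadCount() {
    return reloadCount;
  }
}
```

One caveat with this approach: a file system with coarse timestamp granularity can report the same directory mtime for two changes made in quick succession, so a real implementation may need an additional check. A directory's mtime also does not change when files are added inside a subdirectory, which is one reason a recursive listing (item 3 above) cannot rely on the top-level timestamp alone.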