
[GitHub] drill issue #755: DRILL-5270: Improve loading of profiles listing in the Web...

2018-03-14 Thread kkhatua

Github user kkhatua commented on the issue:

https://github.com/apache/drill/pull/755
  
Holding off on the rebase until @vrozov's PR #1163 (DRILL-6053) is merged into 
Apache.


---

[GitHub] drill issue #755: DRILL-5270: Improve loading of profiles listing in the Web...

2018-03-02 Thread kkhatua

Github user kkhatua commented on the issue:

https://github.com/apache/drill/pull/755
  
Thanks, @vrozov. For `#1`, I'll use a separate lock for the read-only 
path.
For `#2`, I need to construct a size-limited ordered set from a list of 
unordered elements.
In this case, the elements (i.e. profiles) need to be ordered by file name, 
which maps 1:1 to the start-time epoch of the query.
So I need a data structure that supports adding in `O(log(n))` time, 
removing in `O(1)`, and iterating in sequence. That makes puts the most 
expensive operation.
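
As a rough illustration of the size-limited ordered set described above, here is a minimal sketch. The class name `BoundedSortedNames`, its methods, and the sample file names are hypothetical and not taken from the PR. Note that `java.util.TreeSet.pollFirst()` is `O(log(n))` rather than the `O(1)` removal mentioned above, so this only approximates the stated requirements.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical helper (not Drill code): keeps at most `capacity` names, sorted. */
final class BoundedSortedNames {
  private final int capacity;
  private final TreeSet<String> names = new TreeSet<>();

  BoundedSortedNames(int capacity) {
    if (capacity <= 0) {
      throw new IllegalArgumentException("capacity must be positive");
    }
    this.capacity = capacity;
  }

  /** Adds a name in O(log n); if the set overflows, drops the smallest name. */
  void add(String name) {
    names.add(name);
    if (names.size() > capacity) {
      names.pollFirst(); // assumption: the smallest name is the one to evict
    }
  }

  /** Returns a copy of the retained names in ascending order. */
  List<String> inOrder() {
    return new ArrayList<>(names);
  }
}
```

Feeding it an unordered listing leaves only the `capacity` largest names, already sorted, so iteration needs no extra sort step.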



---

[GitHub] drill issue #755: DRILL-5270: Improve loading of profiles listing in the Web...

2018-03-02 Thread vrozov

Github user vrozov commented on the issue:

https://github.com/apache/drill/pull/755
  
@kkhatua
1. Read locks are not exclusive (single writer/multiple readers). To 
achieve the required functionality, you need to introduce a different lock and 
take its write (or exclusive) lock.
2. The choice of `TreeSet` is not obvious. What are the most common 
operations performed on the collection? Are you optimizing for get, put or 
collection construction?

@arina-ielchiieva my github id is `vrozov`.
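
To make point 1 concrete, here is a small sketch using `java.util.concurrent.locks.ReentrantReadWriteLock`. The class and method names are hypothetical and not from the PR. It holds one kind of lock on the calling thread and checks whether a second thread can take the same kind of lock at the same moment.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Illustrative only; names are hypothetical, not Drill code. */
final class LockSharingDemo {

  /**
   * Holds the read or write lock on the calling thread and reports whether
   * a second thread can acquire the same kind of lock while it is held.
   */
  static boolean secondThreadCanEnter(boolean useWriteLock) {
    ReentrantReadWriteLock rw = new ReentrantReadWriteLock();
    Lock held = useWriteLock ? rw.writeLock() : rw.readLock();
    AtomicBoolean entered = new AtomicBoolean(false);

    held.lock();
    try {
      Thread other = new Thread(() -> {
        Lock same = useWriteLock ? rw.writeLock() : rw.readLock();
        if (same.tryLock()) { // non-blocking attempt from the second thread
          entered.set(true);
          same.unlock();
        }
      });
      other.start();
      other.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } finally {
      held.unlock();
    }
    return entered.get();
  }
}
```

With the read lock the second thread gets in, so two requests could both start the expensive listing. With the write lock it does not, which is the behavior needed to serialize the refresh.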


---

[GitHub] drill issue #755: DRILL-5270: Improve loading of profiles listing in the Web...

2018-03-02 Thread kkhatua

Github user kkhatua commented on the issue:

https://github.com/apache/drill/pull/755

I chose a `TreeSet` because it is a binary tree structure that keeps the
profiles (up to the maximum permitted) sorted and in memory.

When Drill detects changes (refer to
https://github.com/kkhatua/drill/blob/f7ad29b9a322bb215d16b3c3b9a2bfc40abfc1ed/exec/java-exec/src/main/java/org/apache/drill/exec/store/sys/store/LocalPersistentStore.java#L146),
it fetches all the available profiles in the PStore and reconstructs the
tree (since the order of the profiles returned by the `FileSystem` is not
guaranteed).

I tried using the `PathFilter` to fetch only new profiles, but the
`FileSystem` call costs the same whether it fetches only the new profiles or
the entire list! Also, some profiles might have been deleted while new ones
were added, so a full reconstruction takes care of that scenario as well.

To evict, as I construct the TreeSet, I simply pop the oldest (by filename)
entry. The Guava cache options don't seem to provide a way to define the basis
on which to evict entries.

I believe @vrozov's work on DRILL-6053 addresses locking during writes
specifically. The lock I used (and need) is for reads, to ensure that multiple
requests don't trigger an expensive `FileSystem` call for the same state of the
PStore.
For example, treating T# as timestamps:
* `currBasePathModified` = T0
* _ThreadA_ requests at t=T1 and issues a read-lock
* _ThreadB_ requests at t=T2 but is waiting for read-lock

If the `TreeSet` exists and no change is detected, _ThreadA_ will use the
`TreeSet` contents and then release the lock.

If the `TreeSet` exists and a change is detected, _ThreadA_ will
reconstruct the `TreeSet` before using its contents and it will update
`lastBasePathModified`, before releasing the lock.

When _ThreadB_ gets the read-lock, it discovers that the `TreeSet` was
already updated during the wait. As of t=T2, this is the most recent
snapshot, so it uses the `TreeSet`'s contents rather than reconstructing.
Reconstruction is deferred to the next request.

We're using `lastBasePathModified` as a way to provide
pseudo-versioned access to the list. That means if more profiles are
added *after* _ThreadB_ started waiting for the read-lock, that will not trigger
the `FileSystem` call right away.
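
The flow above can be sketched roughly as follows. This is an assumption-laden illustration, not the PR's code: `VersionedListingCache`, `modifiedTime`, and `lister` are hypothetical stand-ins for the base-path modification time and the expensive `FileSystem` listing, and it uses a plain exclusive `ReentrantLock` rather than a read lock so that only one request refreshes at a time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical sketch (not Drill code) of a modification-time-versioned listing. */
final class VersionedListingCache {
  private final ReentrantLock lock = new ReentrantLock();
  private final LongSupplier modifiedTime;      // stand-in for the base path's mtime
  private final Supplier<List<String>> lister;  // stand-in for the expensive listing
  private long lastModified = Long.MIN_VALUE;   // version of the cached snapshot
  private TreeSet<String> cached = new TreeSet<>();
  private int rebuilds;

  VersionedListingCache(LongSupplier modifiedTime, Supplier<List<String>> lister) {
    this.modifiedTime = modifiedTime;
    this.lister = lister;
  }

  /** Returns the sorted listing, rebuilding only if the mtime has changed. */
  List<String> list() {
    lock.lock(); // a waiting thread blocks here, then sees the refreshed snapshot
    try {
      long current = modifiedTime.getAsLong();
      if (current != lastModified) {
        cached = new TreeSet<>(lister.get()); // full rebuild also handles deletions
        lastModified = current;
        rebuilds++;
      }
      return new ArrayList<>(cached);
    } finally {
      lock.unlock();
    }
  }

  /** Number of times the expensive listing was invoked. */
  int rebuildCount() {
    lock.lock();
    try {
      return rebuilds;
    } finally {
      lock.unlock();
    }
  }
}
```

A thread that was blocked on the lock re-reads the modification time after acquiring it, so it reuses the snapshot built by the previous holder unless the directory changed again in the meantime.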

---

[GitHub] drill issue #755: DRILL-5270: Improve loading of profiles listing in the Web...

2018-03-02 Thread kkhatua

Github user kkhatua commented on the issue:

https://github.com/apache/drill/pull/755
  
@arina-ielchiieva I need to rebase this on top of the latest master, 
since it was originally based on code that is nearly a year old. When ready, I'll 
create a new PR or push to this one. Let me know which one works.


---

[GitHub] drill issue #755: DRILL-5270: Improve loading of profiles listing in the Web...

2017-04-21 Thread kkhatua

Github user kkhatua commented on the issue:

https://github.com/apache/drill/pull/755
  
@sudheeshkatkam Can you please review the PR?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill issue #755: DRILL-5270: Improve loading of profiles listing in the Web...

2017-02-21 Thread kkhatua

Github user kkhatua commented on the issue:

https://github.com/apache/drill/pull/755
  
For 8266 profiles, when measured from Chrome browser's Network tool:
```
Load First Time: 2.43s 
Load Second Time (no new profiles): 829ms
```


---

[GitHub] drill issue #755: DRILL-5270: Improve loading of profiles listing in the Web...

2017-02-21 Thread kkhatua

Github user kkhatua commented on the issue:

https://github.com/apache/drill/pull/755
  
A summary of the performance is available in this 
[comment](https://issues.apache.org/jira/browse/DRILL-5270?focusedCommentId=15877119&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15877119)
on the JIRA (DRILL-5270).



---
