[jira] [Updated] (HDFS-6735) A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream

2014-12-02 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HDFS-6735:

   Resolution: Fixed
Fix Version/s: 2.7.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

I tried this on small cluster writing a load followed by verification I could 
read it all back.

Pushed to branch-2 and to trunk.

Thanks for the patch [~lhofhansl], the reviews [~cmccabe], and the original 
patches [~xieliang007]

> A minor optimization to avoid pread() be blocked by read() inside the same 
> DFSInputStream
> -
>
> Key: HDFS-6735
> URL: https://issues.apache.org/jira/browse/HDFS-6735
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.0.0
>Reporter: Liang Xie
>Assignee: Lars Hofhansl
> Fix For: 2.7.0
>
> Attachments: HDFS-6735-v2.txt, HDFS-6735-v3.txt, HDFS-6735-v4.txt, 
> HDFS-6735-v5.txt, HDFS-6735-v6.txt, HDFS-6735-v7.txt, HDFS-6735-v8.txt, 
> HDFS-6735.txt
>
>
> In current DFSInputStream impl, there're a couple of coarser-grained locks in 
> read/pread path, and it has became a HBase read latency pain point so far. In 
> HDFS-6698, i made a minor patch against the first encourtered lock, around 
> getFileLength, in deed, after reading code and testing, it shows still other 
> locks we could improve.
> In this jira, i'll make a patch against other locks, and a simple test case 
> to show the issue and the improved result.
> This is important for HBase application, since in current HFile read path, we 
> issue all read()/pread() requests in the same DFSInputStream for one HFile. 
> (Multi streams solution is another story i had a plan to do, but probably 
> will take more time than i expected)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6735) A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream

2014-11-28 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HDFS-6735:

Attachment: HDFS-6735-v8.txt

One more update. I noticed that the lock in ShortCircuitCache is taking more 
time than warranted. I noticed we have all these Precondition checks, where we 
prebuild the string that is only used in the exceptional case. Much better to 
use static strings with parameters so that the message string is constant and 
the final string is only built in the exception case.
That noticeably decreases the time spend in the ShortCircuitCache.lock.

Could do that in the separate jira, but it seemed easy enough.
Please let me know what you think. Thanks.

> A minor optimization to avoid pread() be blocked by read() inside the same 
> DFSInputStream
> -
>
> Key: HDFS-6735
> URL: https://issues.apache.org/jira/browse/HDFS-6735
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.0.0
>Reporter: Liang Xie
>Assignee: Lars Hofhansl
> Attachments: HDFS-6735-v2.txt, HDFS-6735-v3.txt, HDFS-6735-v4.txt, 
> HDFS-6735-v5.txt, HDFS-6735-v6.txt, HDFS-6735-v7.txt, HDFS-6735-v8.txt, 
> HDFS-6735.txt
>
>
> In current DFSInputStream impl, there're a couple of coarser-grained locks in 
> read/pread path, and it has became a HBase read latency pain point so far. In 
> HDFS-6698, i made a minor patch against the first encourtered lock, around 
> getFileLength, in deed, after reading code and testing, it shows still other 
> locks we could improve.
> In this jira, i'll make a patch against other locks, and a simple test case 
> to show the issue and the improved result.
> This is important for HBase application, since in current HFile read path, we 
> issue all read()/pread() requests in the same DFSInputStream for one HFile. 
> (Multi streams solution is another story i had a plan to do, but probably 
> will take more time than i expected)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6735) A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream

2014-11-27 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HDFS-6735:

Attachment: HDFS-6735-v7.txt

So here's the final one (with the findbugs tweak back in).

> A minor optimization to avoid pread() be blocked by read() inside the same 
> DFSInputStream
> -
>
> Key: HDFS-6735
> URL: https://issues.apache.org/jira/browse/HDFS-6735
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.0.0
>Reporter: Liang Xie
>Assignee: Lars Hofhansl
> Attachments: HDFS-6735-v2.txt, HDFS-6735-v3.txt, HDFS-6735-v4.txt, 
> HDFS-6735-v5.txt, HDFS-6735-v6.txt, HDFS-6735-v7.txt, HDFS-6735.txt
>
>
> In current DFSInputStream impl, there're a couple of coarser-grained locks in 
> read/pread path, and it has became a HBase read latency pain point so far. In 
> HDFS-6698, i made a minor patch against the first encourtered lock, around 
> getFileLength, in deed, after reading code and testing, it shows still other 
> locks we could improve.
> In this jira, i'll make a patch against other locks, and a simple test case 
> to show the issue and the improved result.
> This is important for HBase application, since in current HFile read path, we 
> issue all read()/pread() requests in the same DFSInputStream for one HFile. 
> (Multi streams solution is another story i had a plan to do, but probably 
> will take more time than i expected)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6735) A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream

2014-11-27 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HDFS-6735:

Attachment: (was: HDFS-6735-v6.txt)

> A minor optimization to avoid pread() be blocked by read() inside the same 
> DFSInputStream
> -
>
> Key: HDFS-6735
> URL: https://issues.apache.org/jira/browse/HDFS-6735
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.0.0
>Reporter: Liang Xie
>Assignee: Lars Hofhansl
> Attachments: HDFS-6735-v2.txt, HDFS-6735-v3.txt, HDFS-6735-v4.txt, 
> HDFS-6735-v5.txt, HDFS-6735-v6.txt, HDFS-6735.txt
>
>
> In current DFSInputStream impl, there're a couple of coarser-grained locks in 
> read/pread path, and it has became a HBase read latency pain point so far. In 
> HDFS-6698, i made a minor patch against the first encourtered lock, around 
> getFileLength, in deed, after reading code and testing, it shows still other 
> locks we could improve.
> In this jira, i'll make a patch against other locks, and a simple test case 
> to show the issue and the improved result.
> This is important for HBase application, since in current HFile read path, we 
> issue all read()/pread() requests in the same DFSInputStream for one HFile. 
> (Multi streams solution is another story i had a plan to do, but probably 
> will take more time than i expected)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6735) A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream

2014-11-27 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HDFS-6735:

Attachment: (was: HDFS-6735-v6.txt)

> A minor optimization to avoid pread() be blocked by read() inside the same 
> DFSInputStream
> -
>
> Key: HDFS-6735
> URL: https://issues.apache.org/jira/browse/HDFS-6735
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.0.0
>Reporter: Liang Xie
>Assignee: Lars Hofhansl
> Attachments: HDFS-6735-v2.txt, HDFS-6735-v3.txt, HDFS-6735-v4.txt, 
> HDFS-6735-v5.txt, HDFS-6735-v6.txt, HDFS-6735.txt
>
>
> In current DFSInputStream impl, there're a couple of coarser-grained locks in 
> read/pread path, and it has became a HBase read latency pain point so far. In 
> HDFS-6698, i made a minor patch against the first encourtered lock, around 
> getFileLength, in deed, after reading code and testing, it shows still other 
> locks we could improve.
> In this jira, i'll make a patch against other locks, and a simple test case 
> to show the issue and the improved result.
> This is important for HBase application, since in current HFile read path, we 
> issue all read()/pread() requests in the same DFSInputStream for one HFile. 
> (Multi streams solution is another story i had a plan to do, but probably 
> will take more time than i expected)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6735) A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream

2014-11-27 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HDFS-6735:

Attachment: HDFS-6735-v6.txt

Trying to get another build. The artifacts of the previous one are gone for 
some reason.

> A minor optimization to avoid pread() be blocked by read() inside the same 
> DFSInputStream
> -
>
> Key: HDFS-6735
> URL: https://issues.apache.org/jira/browse/HDFS-6735
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.0.0
>Reporter: Liang Xie
>Assignee: Lars Hofhansl
> Attachments: HDFS-6735-v2.txt, HDFS-6735-v3.txt, HDFS-6735-v4.txt, 
> HDFS-6735-v5.txt, HDFS-6735-v6.txt, HDFS-6735-v6.txt, HDFS-6735-v6.txt, 
> HDFS-6735.txt
>
>
> In current DFSInputStream impl, there're a couple of coarser-grained locks in 
> read/pread path, and it has became a HBase read latency pain point so far. In 
> HDFS-6698, i made a minor patch against the first encourtered lock, around 
> getFileLength, in deed, after reading code and testing, it shows still other 
> locks we could improve.
> In this jira, i'll make a patch against other locks, and a simple test case 
> to show the issue and the improved result.
> This is important for HBase application, since in current HFile read path, we 
> issue all read()/pread() requests in the same DFSInputStream for one HFile. 
> (Multi streams solution is another story i had a plan to do, but probably 
> will take more time than i expected)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6735) A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream

2014-11-26 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HDFS-6735:

Attachment: HDFS-6735-v6.txt

Updated patch.
The findbugs tweak is still necessary. Locking was correct before, findbugs 
does not seem to realize that all references to cachingStrategy is always 
guarded by the infoLock.

I'll run a 2.4.1 version of this patch against HBase again.


> A minor optimization to avoid pread() be blocked by read() inside the same 
> DFSInputStream
> -
>
> Key: HDFS-6735
> URL: https://issues.apache.org/jira/browse/HDFS-6735
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.0.0
>Reporter: Liang Xie
>Assignee: Liang Xie
> Attachments: HDFS-6735-v2.txt, HDFS-6735-v3.txt, HDFS-6735-v4.txt, 
> HDFS-6735-v5.txt, HDFS-6735-v6.txt, HDFS-6735-v6.txt, HDFS-6735.txt
>
>
> In current DFSInputStream impl, there're a couple of coarser-grained locks in 
> read/pread path, and it has became a HBase read latency pain point so far. In 
> HDFS-6698, i made a minor patch against the first encourtered lock, around 
> getFileLength, in deed, after reading code and testing, it shows still other 
> locks we could improve.
> In this jira, i'll make a patch against other locks, and a simple test case 
> to show the issue and the improved result.
> This is important for HBase application, since in current HFile read path, we 
> issue all read()/pread() requests in the same DFSInputStream for one HFile. 
> (Multi streams solution is another story i had a plan to do, but probably 
> will take more time than i expected)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6735) A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream

2014-11-25 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HDFS-6735:

Attachment: HDFS-6735-v6.txt

Thanks [~ste...@apache.org].

New patch with findbugs tweak.

> A minor optimization to avoid pread() be blocked by read() inside the same 
> DFSInputStream
> -
>
> Key: HDFS-6735
> URL: https://issues.apache.org/jira/browse/HDFS-6735
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.0.0
>Reporter: Liang Xie
>Assignee: Liang Xie
> Attachments: HDFS-6735-v2.txt, HDFS-6735-v3.txt, HDFS-6735-v4.txt, 
> HDFS-6735-v5.txt, HDFS-6735-v6.txt, HDFS-6735.txt
>
>
> In current DFSInputStream impl, there're a couple of coarser-grained locks in 
> read/pread path, and it has became a HBase read latency pain point so far. In 
> HDFS-6698, i made a minor patch against the first encourtered lock, around 
> getFileLength, in deed, after reading code and testing, it shows still other 
> locks we could improve.
> In this jira, i'll make a patch against other locks, and a simple test case 
> to show the issue and the improved result.
> This is important for HBase application, since in current HFile read path, we 
> issue all read()/pread() requests in the same DFSInputStream for one HFile. 
> (Multi streams solution is another story i had a plan to do, but probably 
> will take more time than i expected)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6735) A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream

2014-11-23 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HDFS-6735:

Attachment: HDFS-6735-v5.txt

Looked through the findbugs warning for DFSInputStream:
* indeed currentNode was wrongly synchronized (was so even before the patch). 
In getCurrentDataNode I had added synchronized(infoLock) but getCurrentData 
should just synchronized as currentNode is seek+read state.
* added a synchronized block in getBlockAt around access to pos, blockEnd, 
currentLocatedBlock. As explained in comment that is not needed, since we never 
get into that if block if we coming from a called synchronized on . But 
if that is so the extra synchronized won't hurt and it should make findbugs 
happy. 


> A minor optimization to avoid pread() be blocked by read() inside the same 
> DFSInputStream
> -
>
> Key: HDFS-6735
> URL: https://issues.apache.org/jira/browse/HDFS-6735
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.0.0
>Reporter: Liang Xie
>Assignee: Liang Xie
> Attachments: HDFS-6735-v2.txt, HDFS-6735-v3.txt, HDFS-6735-v4.txt, 
> HDFS-6735-v5.txt, HDFS-6735.txt
>
>
> In current DFSInputStream impl, there're a couple of coarser-grained locks in 
> read/pread path, and it has became a HBase read latency pain point so far. In 
> HDFS-6698, i made a minor patch against the first encourtered lock, around 
> getFileLength, in deed, after reading code and testing, it shows still other 
> locks we could improve.
> In this jira, i'll make a patch against other locks, and a simple test case 
> to show the issue and the improved result.
> This is important for HBase application, since in current HFile read path, we 
> issue all read()/pread() requests in the same DFSInputStream for one HFile. 
> (Multi streams solution is another story i had a plan to do, but probably 
> will take more time than i expected)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6735) A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream

2014-11-23 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HDFS-6735:

Attachment: HDFS-6735-v4.txt

> A minor optimization to avoid pread() be blocked by read() inside the same 
> DFSInputStream
> -
>
> Key: HDFS-6735
> URL: https://issues.apache.org/jira/browse/HDFS-6735
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.0.0
>Reporter: Liang Xie
>Assignee: Liang Xie
> Attachments: HDFS-6735-v2.txt, HDFS-6735-v3.txt, HDFS-6735-v4.txt, 
> HDFS-6735.txt
>
>
> In current DFSInputStream impl, there're a couple of coarser-grained locks in 
> read/pread path, and it has became a HBase read latency pain point so far. In 
> HDFS-6698, i made a minor patch against the first encourtered lock, around 
> getFileLength, in deed, after reading code and testing, it shows still other 
> locks we could improve.
> In this jira, i'll make a patch against other locks, and a simple test case 
> to show the issue and the improved result.
> This is important for HBase application, since in current HFile read path, we 
> issue all read()/pread() requests in the same DFSInputStream for one HFile. 
> (Multi streams solution is another story i had a plan to do, but probably 
> will take more time than i expected)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6735) A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream

2014-11-23 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HDFS-6735:

Attachment: (was: HDFS-6735-v4.txt)

> A minor optimization to avoid pread() be blocked by read() inside the same 
> DFSInputStream
> -
>
> Key: HDFS-6735
> URL: https://issues.apache.org/jira/browse/HDFS-6735
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.0.0
>Reporter: Liang Xie
>Assignee: Liang Xie
> Attachments: HDFS-6735-v2.txt, HDFS-6735-v3.txt, HDFS-6735.txt
>
>
> In current DFSInputStream impl, there're a couple of coarser-grained locks in 
> read/pread path, and it has became a HBase read latency pain point so far. In 
> HDFS-6698, i made a minor patch against the first encourtered lock, around 
> getFileLength, in deed, after reading code and testing, it shows still other 
> locks we could improve.
> In this jira, i'll make a patch against other locks, and a simple test case 
> to show the issue and the improved result.
> This is important for HBase application, since in current HFile read path, we 
> issue all read()/pread() requests in the same DFSInputStream for one HFile. 
> (Multi streams solution is another story i had a plan to do, but probably 
> will take more time than i expected)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6735) A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream

2014-11-23 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HDFS-6735:

Attachment: HDFS-6735-v4.txt

New patch:
* added synchronized back to tryZeroCopyRead
* renamed sharedLock to infoLock
* this time did all the correct indentation - harder to review, but this should 
be committable as is
* surrounded every reference to cachingStrategy with synchronized(infoLock) 
{...}, removed volatile

Looking at this again, we can be better about safe publishing with immutable 
state and avoid some of the locks.
For example FileEncryptionInfo and CachingStrategy are already immutable and 
can be 100% safely handled by just a volatile reference; most the LocatedBlocks 
state is also immutable and for those parts we can avoid the locks as well.

Immutable state is easier to reason about and more efficient.
(volatile still places read and write memory fences - but that is cheaper than 
synchronized). Can do that later :)


> A minor optimization to avoid pread() be blocked by read() inside the same 
> DFSInputStream
> -
>
> Key: HDFS-6735
> URL: https://issues.apache.org/jira/browse/HDFS-6735
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.0.0
>Reporter: Liang Xie
>Assignee: Liang Xie
> Attachments: HDFS-6735-v2.txt, HDFS-6735-v3.txt, HDFS-6735-v4.txt, 
> HDFS-6735.txt
>
>
> In current DFSInputStream impl, there're a couple of coarser-grained locks in 
> read/pread path, and it has became a HBase read latency pain point so far. In 
> HDFS-6698, i made a minor patch against the first encourtered lock, around 
> getFileLength, in deed, after reading code and testing, it shows still other 
> locks we could improve.
> In this jira, i'll make a patch against other locks, and a simple test case 
> to show the issue and the improved result.
> This is important for HBase application, since in current HFile read path, we 
> issue all read()/pread() requests in the same DFSInputStream for one HFile. 
> (Multi streams solution is another story i had a plan to do, but probably 
> will take more time than i expected)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6735) A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream

2014-11-08 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HDFS-6735:

Attachment: HDFS-6735-v3.txt

I classified the state in DFSInputStream into state used by read only and state 
used by both read and pread.

With that here's a new proposed patch.
* makes LocatedBlocks immutable (which was intended it seems)
* pread no longer affects currentNode (that was unintended I think)
* guards state shared between read and pread with an extra sharedLock
(the state used for read only is still guarded by a lock on , which we 
need to take anyway to avoid concurrent stateful reads against the same input 
stream)
* removed all synchronized on private method that were only called from methods 
already synchronized (good practice anyway)
* makes cachingStrategy volatile (made more sense than locking there)
* should be free of deadlocks (never acquire lock on  with sharedLock 
held, but the reverse is possible)
* pos, blockEnd, currentLocatedBlock are not updated in getBlockAt unless 
called on behalf of read (not for pread, hence locking on  not needed 
there)

I have not tested this, yet.
Please have a careful look and let me know what you think.
We might want to further disentangle the mixed state.

(And just maybe the best solution would be for HBase to have an input stream 
for each thread doing read and one for all threads doing preads - and not do 
any of this...?)

> A minor optimization to avoid pread() be blocked by read() inside the same 
> DFSInputStream
> -
>
> Key: HDFS-6735
> URL: https://issues.apache.org/jira/browse/HDFS-6735
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.0.0
>Reporter: Liang Xie
>Assignee: Liang Xie
> Attachments: HDFS-6735-v2.txt, HDFS-6735-v3.txt, HDFS-6735.txt
>
>
> In current DFSInputStream impl, there're a couple of coarser-grained locks in 
> read/pread path, and it has became a HBase read latency pain point so far. In 
> HDFS-6698, i made a minor patch against the first encourtered lock, around 
> getFileLength, in deed, after reading code and testing, it shows still other 
> locks we could improve.
> In this jira, i'll make a patch against other locks, and a simple test case 
> to show the issue and the improved result.
> This is important for HBase application, since in current HFile read path, we 
> issue all read()/pread() requests in the same DFSInputStream for one HFile. 
> (Multi streams solution is another story i had a plan to do, but probably 
> will take more time than i expected)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6735) A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream

2014-07-23 Thread Liang Xie (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liang Xie updated HDFS-6735:


Attachment: HDFS-6735-v2.txt

> A minor optimization to avoid pread() be blocked by read() inside the same 
> DFSInputStream
> -
>
> Key: HDFS-6735
> URL: https://issues.apache.org/jira/browse/HDFS-6735
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.0.0
>Reporter: Liang Xie
>Assignee: Liang Xie
> Attachments: HDFS-6735-v2.txt, HDFS-6735.txt
>
>
> In current DFSInputStream impl, there're a couple of coarser-grained locks in 
> read/pread path, and it has became a HBase read latency pain point so far. In 
> HDFS-6698, i made a minor patch against the first encourtered lock, around 
> getFileLength, in deed, after reading code and testing, it shows still other 
> locks we could improve.
> In this jira, i'll make a patch against other locks, and a simple test case 
> to show the issue and the improved result.
> This is important for HBase application, since in current HFile read path, we 
> issue all read()/pread() requests in the same DFSInputStream for one HFile. 
> (Multi streams solution is another story i had a plan to do, but probably 
> will take more time than i expected)



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6735) A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream

2014-07-22 Thread Liang Xie (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liang Xie updated HDFS-6735:


Description: 
In current DFSInputStream impl, there're a couple of coarser-grained locks in 
read/pread path, and it has became a HBase read latency pain point so far. In 
HDFS-6698, i made a minor patch against the first encourtered lock, around 
getFileLength, in deed, after reading code and testing, it shows still other 
locks we could improve.
In this jira, i'll make a patch against other locks, and a simple test case to 
show the issue and the improved result.

This is important for HBase application, since in current HFile read path, we 
issue all read()/pread() requests in the same DFSInputStream for one HFile. 
(Multi streams solution is another story i had a plan to do, but probably will 
take more time than i expected)

  was:
In current DFSInputStream impl, there're a couple of coarser-grained locks in 
read/pread path, and it has became a HBase read latency pain point so far. In 
HDFS-6698, i made a minor patch against the first encourtered lock, around 
getFileLength, in deed, after reading code and testing, it shows still other 
locks we could improve.
In this jira, i'll make a patch against other locks, and a simple test case to 
show the issue and the improved result.


> A minor optimization to avoid pread() be blocked by read() inside the same 
> DFSInputStream
> -
>
> Key: HDFS-6735
> URL: https://issues.apache.org/jira/browse/HDFS-6735
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.0.0
>Reporter: Liang Xie
>Assignee: Liang Xie
> Attachments: HDFS-6735.txt
>
>
> In current DFSInputStream impl, there're a couple of coarser-grained locks in 
> read/pread path, and it has became a HBase read latency pain point so far. In 
> HDFS-6698, i made a minor patch against the first encourtered lock, around 
> getFileLength, in deed, after reading code and testing, it shows still other 
> locks we could improve.
> In this jira, i'll make a patch against other locks, and a simple test case 
> to show the issue and the improved result.
> This is important for HBase application, since in current HFile read path, we 
> issue all read()/pread() requests in the same DFSInputStream for one HFile. 
> (Multi streams solution is another story i had a plan to do, but probably 
> will take more time than i expected)



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6735) A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream

2014-07-22 Thread Liang Xie (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liang Xie updated HDFS-6735:


Attachment: HDFS-6735.txt

> A minor optimization to avoid pread() be blocked by read() inside the same 
> DFSInputStream
> -
>
> Key: HDFS-6735
> URL: https://issues.apache.org/jira/browse/HDFS-6735
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.0.0
>Reporter: Liang Xie
>Assignee: Liang Xie
> Attachments: HDFS-6735.txt
>
>
> In current DFSInputStream impl, there're a couple of coarser-grained locks in 
> read/pread path, and it has became a HBase read latency pain point so far. In 
> HDFS-6698, i made a minor patch against the first encourtered lock, around 
> getFileLength, in deed, after reading code and testing, it shows still other 
> locks we could improve.
> In this jira, i'll make a patch against other locks, and a simple test case 
> to show the issue and the improved result.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6735) A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream

2014-07-22 Thread Liang Xie (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liang Xie updated HDFS-6735:


Status: Patch Available  (was: Open)

> A minor optimization to avoid pread() be blocked by read() inside the same 
> DFSInputStream
> -
>
> Key: HDFS-6735
> URL: https://issues.apache.org/jira/browse/HDFS-6735
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.0.0
>Reporter: Liang Xie
>Assignee: Liang Xie
> Attachments: HDFS-6735.txt
>
>
> In current DFSInputStream impl, there're a couple of coarser-grained locks in 
> read/pread path, and it has became a HBase read latency pain point so far. In 
> HDFS-6698, i made a minor patch against the first encourtered lock, around 
> getFileLength, in deed, after reading code and testing, it shows still other 
> locks we could improve.
> In this jira, i'll make a patch against other locks, and a simple test case 
> to show the issue and the improved result.



--
This message was sent by Atlassian JIRA
(v6.2#6252)