[jira] [Commented] (HDFS-6450) Support non-positional hedged reads in HDFS
[ https://issues.apache.org/jira/browse/HDFS-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16173490#comment-16173490 ]

Hadoop QA commented on HDFS-6450:
---------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 14s{color} | {color:red} HDFS-6450 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-6450 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12702940/HDFS-7782-001.patch |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/21245/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT http://yetus.apache.org |

This message was automatically generated.

> Support non-positional hedged reads in HDFS
> -------------------------------------------
>
>                 Key: HDFS-6450
>                 URL: https://issues.apache.org/jira/browse/HDFS-6450
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 2.4.0
>            Reporter: Colin P. McCabe
>            Assignee: Liang Xie
>         Attachments: HDFS-6450-like-pread.txt
>
>
> HDFS-5776 added support for hedged positional reads. We should also support
> hedged non-positional reads (aka regular reads).

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-6450) Support non-positional hedged reads in HDFS
[ https://issues.apache.org/jira/browse/HDFS-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15903506#comment-15903506 ]

Andrew Wang commented on HDFS-6450:
-----------------------------------

I think it's fine to re-use the existing config; it seems unlikely that a client would want hedging only for positional reads.

Somewhat relatedly, [~mmokhtar] is working on Impala benchmarks to quantify the perf difference between hedging+positional and non-positional reads. If we get this API completed, it might also be interesting to test.
[jira] [Commented] (HDFS-6450) Support non-positional hedged reads in HDFS
[ https://issues.apache.org/jira/browse/HDFS-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15902444#comment-15902444 ]

stack commented on HDFS-6450:
-----------------------------

[~elgoiri] Ignore my comment above. I thought this was the resolved positional hedged read issue. My bad.
[jira] [Commented] (HDFS-6450) Support non-positional hedged reads in HDFS
[ https://issues.apache.org/jira/browse/HDFS-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15902152#comment-15902152 ]

Inigo Goiri commented on HDFS-6450:
-----------------------------------

Thanks [~andrew.wang].
Most of the hedging management is done asynchronously, but we still need to copy data between threads, so I guess Colin's concern is still valid.
The solution is much faster than a regular read, but the thread management is still there.

In any case, I think it's good practice to make it an opt-in feature.
We already rely on the config from HDFS-5776 to enable/disable it.
If we want to separate it from the regular hedged reads, we can add a separate config for non-positional reads.
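The two opt-in options discussed here can be sketched roughly as below. This is only an illustration, not code from any patch: the first two keys are the client settings that came with HDFS-5776 (a thread pool size of 0 leaves hedging off), while {{dfs.client.hedged.read.non-positional.enabled}} is a made-up key name standing in for the "separate config" idea, and a plain {{Map}} stands in for the Hadoop {{Configuration}} object.

```java
import java.util.Map;

public class HedgedReadConfigSketch {
    // Existing client keys from HDFS-5776.
    static final String POOL_SIZE_KEY = "dfs.client.hedged.read.threadpool.size";
    static final String THRESHOLD_KEY = "dfs.client.hedged.read.threshold.millis";
    // HYPOTHETICAL key, invented for this sketch only.
    static final String NON_POSITIONAL_KEY = "dfs.client.hedged.read.non-positional.enabled";

    /** Hedged preads are on whenever the hedged-read thread pool has a non-zero size. */
    static boolean hedgedPreadEnabled(Map<String, String> conf) {
        return Integer.parseInt(conf.getOrDefault(POOL_SIZE_KEY, "0")) > 0;
    }

    /**
     * Option A (separateSwitch == false): non-positional hedging simply follows
     * the existing config. Option B (separateSwitch == true): it additionally
     * needs its own explicit opt-in.
     */
    static boolean hedgedReadEnabled(Map<String, String> conf, boolean separateSwitch) {
        if (!hedgedPreadEnabled(conf)) {
            return false;
        }
        if (!separateSwitch) {
            return true;
        }
        return Boolean.parseBoolean(conf.getOrDefault(NON_POSITIONAL_KEY, "false"));
    }
}
```

With option A, a client that already sets the thread pool size gets hedged read() with no further change; with option B the same client keeps today's behaviour until it sets the extra key.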
[jira] [Commented] (HDFS-6450) Support non-positional hedged reads in HDFS
[ https://issues.apache.org/jira/browse/HDFS-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15902141#comment-15902141 ]

Andrew Wang commented on HDFS-6450:
-----------------------------------

Added [~thinktaocs] as a contributor, so they can now be assigned JIRAs.

Does your patch solve the performance concerns raised by Colin above? If not, it needs to be hidden behind a config option.
[jira] [Commented] (HDFS-6450) Support non-positional hedged reads in HDFS
[ https://issues.apache.org/jira/browse/HDFS-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15901701#comment-15901701 ]

Inigo Goiri commented on HDFS-6450:
-----------------------------------

[~stack], we are trying to solve the exact same issue that this JIRA reports.
Actually, our solution is just an evolved version of the original patch in this JIRA.
I'm OK with creating a duplicate issue and linking back, but one of the two will eventually be marked as a duplicate.
Any particular reason not to keep working on this one?
[jira] [Commented] (HDFS-6450) Support non-positional hedged reads in HDFS
[ https://issues.apache.org/jira/browse/HDFS-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15901691#comment-15901691 ]

stack commented on HDFS-6450:
-----------------------------

Open a new issue I'd say [~elgoiri]. Link it back here.
[jira] [Commented] (HDFS-6450) Support non-positional hedged reads in HDFS
[ https://issues.apache.org/jira/browse/HDFS-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15901684#comment-15901684 ]

Inigo Goiri commented on HDFS-6450:
-----------------------------------

In our own scenario, some Datanodes are very likely to be much slower than others.
For this reason, we need non-positional hedged reads.

[~thinktaocs] has worked on porting and extending the original patch in this JIRA to work in 2.7 and support multiple hedged reads.
In our prototype, we use Futures and we allow cancelling/killing slow reads and so on.
We've been using it for the last month or so and the improvements are very significant.
We've been able to reduce the tail time of our reads by more than 2x.

Would it be possible for [~thinktaocs] to be assigned to this JIRA and upload our prototype?
Based on this patch we can start the discussion on the approach.
[jira] [Commented] (HDFS-6450) Support non-positional hedged reads in HDFS
[ https://issues.apache.org/jira/browse/HDFS-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15806191#comment-15806191 ]

Hadoop QA commented on HDFS-6450:
---------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 4s{color} | {color:red} HDFS-6450 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-6450 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12702940/HDFS-7782-001.patch |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/18093/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HDFS-6450) Support non-positional hedged reads in HDFS
[ https://issues.apache.org/jira/browse/HDFS-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382682#comment-14382682 ]

Colin Patrick McCabe commented on HDFS-6450:
--------------------------------------------

bq. Would be a (small) shame though if to get hedged non-positional reads one has to bring in erasure coding.

I understand the frustration. It would certainly be nice to have hedged non-positional reads, for HBase and other things.

The problem that we have always had with hedged non-positional reads is that they don't work well (or really at all) with short-circuit reads. We simply don't have any way to interrupt a blocking read from a file descriptor, other than closing the file descriptor. And closing the FD will also disrupt any other clients that are reading from that FD (the short-circuit code shares FDs across multiple readers). This problem doesn't exist for erasure coding because erasure coding doesn't support short-circuit.

Yes, the problem can be solved by doing the read in a Future, but the overhead of passing data across threads (and hence across CPU caches, much of the time) would be a significant performance regression.
[jira] [Commented] (HDFS-6450) Support non-positional hedged reads in HDFS
[ https://issues.apache.org/jira/browse/HDFS-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380745#comment-14380745 ]

Andrew Purtell commented on HDFS-6450:
--------------------------------------

bq. I think it's much easier to just implement hedged non-positional reads in the erasure coding-specific subclass of DFSInputStream.

Would be a (small) shame though if to get hedged non-positional reads one has to bring in erasure coding.
[jira] [Commented] (HDFS-6450) Support non-positional hedged reads in HDFS
[ https://issues.apache.org/jira/browse/HDFS-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351346#comment-14351346 ]

Zhe Zhang commented on HDFS-6450:
---------------------------------

Thanks for the suggestion Colin.

Hedged pread is already handling BlockReaderLocal (by wrapping it as a Future). I guess it's reasonable to do the same for non-positional reads too?

Is it correct to understand hedged vs. non-hedged and positional vs. non-positional as orthogonal dimensions? If so, the only new requirement in hedged non-positional reads is to utilize and maintain the states (pos, blockReader).

I'm still getting my head around this complex reader code, so please let me know if I missed something.
[jira] [Commented] (HDFS-6450) Support non-positional hedged reads in HDFS
[ https://issues.apache.org/jira/browse/HDFS-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350869#comment-14350869 ]

Colin Patrick McCabe commented on HDFS-6450:
--------------------------------------------

I think it would be possible to support hedged non-positional reads in {{BlockReaderLocal}}, but difficult. First we would have to stop re-using the same FD for all instances of a BlockReaderLocal that were reading the same replica. Perhaps we could use {{dup}} to create a new FD per block reader without doing multiple opens. Then we could close the block reader's FD if the local read were slow.

I think it's much easier to just implement hedged non-positional reads in the erasure coding-specific subclass of DFSInputStream. I also think we may want to create a base class for DFSInputStream that both the raid and the non-raid code paths inherit from. Inheriting from the non-raid code path is weird because there is a lot of stuff that is not relevant.
[jira] [Commented] (HDFS-6450) Support non-positional hedged reads in HDFS
[ https://issues.apache.org/jira/browse/HDFS-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14349499#comment-14349499 ]

Zhe Zhang commented on HDFS-6450:
---------------------------------

In the erasure coding project we need to implement the logic of reading from a striped file (HDFS-7782). Hedged reading is a natural fit to support the striping layout, because each I/O is likely to cover multiple DNs. So I'd like to dig a little deeper on this JIRA to see if hedged non-positional reads should be added to {{DFSInputStream}} or only to the subclass for erasure coding.

bq. Yeah... if the first read, done with the old block reader, doesn't finish first, then I think we should forget about that block reader... even if it's a BlockReaderLocal and the new one is remote. After all, slow or misbehaving local disks are one of the main problems we're trying to cover up with hedged reads. As you pointed out, we need to close the old block reader in this situation.

[~cmccabe] I like the idea of reusing the {{blockReader}} in the first Future, and updating it to the winning reader. The only concern is that we'll never come back to the local reader if it goes slow once. Should we give it another chance?
[jira] [Commented] (HDFS-6450) Support non-positional hedged reads in HDFS
[ https://issues.apache.org/jira/browse/HDFS-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14349724#comment-14349724 ]

Zhe Zhang commented on HDFS-6450:
---------------------------------

I just posted a [patch|https://issues.apache.org/jira/secure/attachment/12702940/HDFS-7782-001.patch] under HDFS-7782. It makes some changes to {{DFSInputStream}} that should partially support this JIRA too:
# A small {{HedgedReadResult}} class to return both a result buffer and a generated {{BlockReader}}, which should be used to adjust the maintained {{blockReader}}
# In {{hedgedFetchBlockByteRange}}, return the reader from the fastest Future
# That JIRA has the hedged read logic at the {{readBuffer()}} level instead of the {{readWithStrategy}} level. I'm not so sure which option is better though.

[~xieliang007] I wonder if you still plan to continue this work? In either case I'm happy to help complete this JIRA.
[jira] [Commented] (HDFS-6450) Support non-positional hedged reads in HDFS
[ https://issues.apache.org/jira/browse/HDFS-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14349867#comment-14349867 ]

Liang Xie commented on HDFS-6450:
---------------------------------

Thanks [~zhz] for the comments. I have not been active in this area recently, so please go ahead :)
[jira] [Commented] (HDFS-6450) Support non-positional hedged reads in HDFS
[ https://issues.apache.org/jira/browse/HDFS-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14066126#comment-14066126 ]

Liang Xie commented on HDFS-6450:
---------------------------------

bq. How is testHedgedReadBasic exercising the new functionality? Or is it just verifying nothing broke when it is turned on?

It is not finished yet; this patch is a preliminary one, and I put it on hold because of the block reader question. I think I now have a feasible direction based on Colin's last comments, and I'll continue with the rest ASAP :)
[jira] [Commented] (HDFS-6450) Support non-positional hedged reads in HDFS
[ https://issues.apache.org/jira/browse/HDFS-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14063892#comment-14063892 ]

Colin Patrick McCabe commented on HDFS-6450:
--------------------------------------------

Yeah... if the first read, done with the old block reader, doesn't finish first, then I think we should forget about that block reader... even if it's a {{BlockReaderLocal}} and the new one is remote. After all, slow or misbehaving local disks are one of the main problems we're trying to cover up with hedged reads. As you pointed out, we need to close the old block reader in this situation.

I think from an implementation point of view, you might consider unsetting {{DFSInputStream#blockReader}} at the beginning of the hedged read (set to null and move the old block reader into the first future). Then have the winning future set it to the block reader that it used.

There will be some performance regression just due to using threads and futures here, instead of just doing the read from the same thread that needs the data. Passing data between threads is slower because you might be going between CPU cores. I don't know if there's really a good way to address this without doing something like HDFS-6695, which is out of scope for this change. I think in the short term, applications will have to turn on non-positional hedged reads explicitly, and accept some small loss in throughput for a major reduction in long-tail latency.
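To make the unset-then-install suggestion concrete, here is a minimal sketch of just the ownership handoff, with no threads involved. It is not taken from any patch: {{Reader}}, {{detach}}, {{finishRace}} and {{current}} are names invented for the illustration, and the {{blockReader}} field only plays the role of {{DFSInputStream#blockReader}}.

```java
public class ReaderHandoffSketch {
    /** Hypothetical stand-in for a block reader; not the real HDFS class. */
    static final class Reader {
        final String source;   // e.g. "local" or "remote", for illustration only
        boolean closed;
        Reader(String source) { this.source = source; }
        void close() { closed = true; }
    }

    // Plays the role of DFSInputStream#blockReader.
    private Reader blockReader;

    ReaderHandoffSketch(Reader initial) { this.blockReader = initial; }

    /**
     * Start of a hedged read: unset the field and hand the old reader to the
     * caller, which would move it into the first future. May return null if
     * the stream had no reader yet.
     */
    synchronized Reader detach() {
        Reader old = blockReader;
        blockReader = null;
        return old;
    }

    /**
     * End of the race: the winning future installs the reader it used, and the
     * losing reader (if any) is closed so that it does not leak.
     */
    synchronized void finishRace(Reader winner, Reader loser) {
        blockReader = winner;
        if (loser != null && loser != winner) {
            loser.close();
        }
    }

    synchronized Reader current() { return blockReader; }
}
```

While the race is running the field is null, so no other code path can pick up a reader that a future is still using; after the race the stream continues with whichever reader won, even if that means moving from a local reader to a remote one.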
[jira] [Commented] (HDFS-6450) Support non-positional hedged reads in HDFS
[ https://issues.apache.org/jira/browse/HDFS-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14064564#comment-14064564 ]

Liang Xie commented on HDFS-6450:
---------------------------------

bq. I think in the short term, applications will have to turn on non-positional hedged reads explicitly, and accept some small loss in throughput for a major reduction in long-tail latency.

Yes, it's a trade-off :)
[jira] [Commented] (HDFS-6450) Support non-positional hedged reads in HDFS
[ https://issues.apache.org/jira/browse/HDFS-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14064598#comment-14064598 ]

stack commented on HDFS-6450:
-----------------------------

I like [~cmccabe]'s take and his suggestion on unsetting the current blockReader while the race is on.

In the patch, nit: do you have to make two instances of Random? Can't they be shared?

How is testHedgedReadBasic exercising the new functionality? Or is it just verifying nothing broke when it is turned on?

readWithStrategy has synchronized added (you've moved the synchronize down a level, it seems) but hedgedReadWithStrategy is not synchronized. That makes for interesting synchronized blocks inside hedgedReadWithStrategy. It could do w/ comments explaining what is going on. For example, the below is a little baffling until I look closely and see that pos accesses are always inside synchronized methods (I think).

+synchronized (this) {
+  if (pos >= getFileLength()) {
+    return -1;
+  }
+}

I took a look, and this seems to be the only data member that needs synchronized protection in this method. Any danger if someone else changes it under you while this method is running (between sync blocks)? And is it OK to have multiple threads inside hedgedReadWithStrategy at the same time when, say, readWithStrategy doesn't allow this to happen?

I like your use of an old English catchall when WARN logging a problem with a hedgedReadWithStrategy return. Might want to change that in v2 (smile).

Lots of overlap w/ the pread version. Would be coolio if the two methods could share code.

Thanks for working on this one [~xieliang007]
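The "between sync blocks" question can be illustrated with a small sketch. This is not the patch's code: it is one possible way, assumed here purely for illustration, to snapshot {{pos}} under the lock, do the slow work outside it, and then re-check before committing the new position. The class and method names are made up, and an in-memory byte array stands in for the file.

```java
public class PositionGuardSketch {
    private final byte[] data;   // stands in for the file contents
    private long pos;            // guarded by "this", like the stream position

    PositionGuardSketch(byte[] data) { this.data = data; }

    synchronized void seek(long target) { pos = target; }

    synchronized long getPos() { return pos; }

    /** Reads up to buf.length bytes from the current position, or returns -1 at EOF. */
    int read(byte[] buf) {
        long start;
        synchronized (this) {
            if (pos >= data.length) {
                return -1;
            }
            start = pos;             // snapshot taken under the lock
        }

        // The slow (possibly hedged) I/O would run here, outside the lock.
        int n = (int) Math.min(buf.length, data.length - start);
        System.arraycopy(data, (int) start, buf, 0, n);

        synchronized (this) {
            // Another thread may have called seek() or read() in between.
            if (pos != start) {
                throw new IllegalStateException("position moved during read");
            }
            pos = start + n;
        }
        return n;
    }
}
```

Whether to fail, retry, or simply hold one lock for the whole call (as readWithStrategy does) is a design decision; the sketch only shows where the window is.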
[jira] [Commented] (HDFS-6450) Support non-positional hedged reads in HDFS
[ https://issues.apache.org/jira/browse/HDFS-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14061760#comment-14061760 ]

Liang Xie commented on HDFS-6450:
---------------------------------

After a deeper look, it's rather hard to reuse/maintain the block reader as before. In pread(), we don't have this trouble, because we always create a new block reader. In read(), if we want to support hedged reads, in general:
1) The first read (r1) uses the old block reader if possible, then waits for the hedged read timeout setting.
2) The second read (r2) must create a new block reader and is submitted to the thread pool.
3) Wait for the first completed task and return the final read result to the client side. Here we need to set (remember) this task's block reader as the DFSInputStream's block reader variable and keep it open, but we also need to close the other block reader to avoid a leak.

Another thing to note is that if we remember the faster block reader and it's a remote block reader, then the following read() calls will bypass the local read in their r1 operations...

Any thoughts? [~cmccabe], [~saint@gmail.com] ...
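The three steps above can be sketched as a small race between two read attempts. This is an illustration under assumptions, not the attached patch: {{Reader}}, {{Result}}, {{hedgedRead}} and {{fakeReader}} are invented names, the {{Reader}} interface is a simplified stand-in for the real block reader, and error handling is reduced to rethrowing. Each attempt reads into its own buffer so the losing attempt cannot overwrite the winner's data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

public class HedgedReadSketch {
    /** Hypothetical stand-in for a block reader; not the real HDFS interface. */
    interface Reader {
        int read(byte[] buf);
        void close();
        boolean isClosed();
    }

    /** Outcome of one attempt: byte count, the buffer it filled, the reader it used. */
    static final class Result {
        final int n;
        final byte[] buf;
        final Reader reader;
        Result(int n, byte[] buf, Reader reader) { this.n = n; this.buf = buf; this.reader = reader; }
    }

    static Result hedgedRead(Reader current, Supplier<Reader> freshReader, int len,
                             long thresholdMillis, ExecutorService pool) {
        CompletionService<Result> cs = new ExecutorCompletionService<>(pool);
        List<Reader> readers = new ArrayList<>();
        List<Future<Result>> attempts = new ArrayList<>();
        try {
            // Step 1: r1 reuses the existing reader; wait up to the hedge threshold.
            readers.add(current);
            attempts.add(cs.submit(() -> attempt(current, len)));
            Future<Result> first = cs.poll(thresholdMillis, TimeUnit.MILLISECONDS);
            if (first == null) {
                // Step 2: r1 is slow, so r2 gets a brand-new reader on the pool.
                Reader hedge = freshReader.get();
                readers.add(hedge);
                attempts.add(cs.submit(() -> attempt(hedge, len)));
                first = cs.take();
            }
            // Step 3: keep the winner's reader open, cancel and close the rest.
            Result winner = first.get();
            for (Future<Result> f : attempts) {
                if (f != first) f.cancel(true);
            }
            for (Reader r : readers) {
                if (r != winner.reader) r.close();
            }
            return winner;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a read", e);
        } catch (ExecutionException e) {
            // Simplification: a real client would fall back to another attempt.
            throw new IllegalStateException("read attempt failed", e.getCause());
        }
    }

    private static Result attempt(Reader r, int len) {
        byte[] buf = new byte[len];   // private buffer per attempt
        return new Result(r.read(buf), buf, r);
    }

    /** Test stand-in: sleeps delayMillis, then fills the buffer with one byte value. */
    static Reader fakeReader(long delayMillis, byte fill) {
        return new Reader() {
            private volatile boolean closed;
            public int read(byte[] buf) {
                try { Thread.sleep(delayMillis); } catch (InterruptedException e) { return -1; }
                Arrays.fill(buf, fill);
                return buf.length;
            }
            public void close() { closed = true; }
            public boolean isClosed() { return closed; }
        };
    }
}
```

The caller would store {{winner.reader}} back into the stream's block reader variable, which is exactly where the "remote reader replaces local reader" side effect mentioned above comes from.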
[jira] [Commented] (HDFS-6450) Support non-positional hedged reads in HDFS
[ https://issues.apache.org/jira/browse/HDFS-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14057296#comment-14057296 ]

Hadoop QA commented on HDFS-6450:
---------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12654948/HDFS-6450-like-pread.txt
  against trunk revision .

    {color:green}+1 @author{color}. The patch does not contain any @author tags.

    {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.

    {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 javadoc{color}. There were no new javadoc warning messages.

    {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.

    {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

    {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

    {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:

                  org.apache.hadoop.hdfs.server.namenode.TestProcessCorruptBlocks

    {color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7320//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7320//console

This message is automatically generated.
[jira] [Commented] (HDFS-6450) Support non-positional hedged reads in HDFS
[ https://issues.apache.org/jira/browse/HDFS-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14053600#comment-14053600 ]

Hadoop QA commented on HDFS-6450:
---------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12654295/HDFS-6450-like-pread.txt
  against trunk revision .

    {color:green}+1 @author{color}. The patch does not contain any @author tags.

    {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.

    {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 javadoc{color}. There were no new javadoc warning messages.

    {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.

    {color:red}-1 findbugs{color}. The patch appears to introduce 4 new Findbugs (version 1.3.9) warnings.

    {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

    {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs.

    {color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7285//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/7285//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7285//console

This message is automatically generated.
[jira] [Commented] (HDFS-6450) Support non-positional hedged reads in HDFS
[ https://issues.apache.org/jira/browse/HDFS-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14053324#comment-14053324 ]

Liang Xie commented on HDFS-6450:
---------------------------------

[~cmccabe] [~saint@gmail.com] I have already made a prototype (still adding more test cases), but I think we need to clear up one thing right now: how to handle the block reader for read().

In the current impl, read() maintains a block reader to reuse for future read calls, while pread() always creates a new block reader in each call. So if we want to enhance read() to have hedged read ability, should we assign the first completed read request's block reader to the maintained block reader variable above? (I guess in most situations this will probably amount to changing the local block reader to a remote block reader?)
[jira] [Commented] (HDFS-6450) Support non-positional hedged reads in HDFS
[ https://issues.apache.org/jira/browse/HDFS-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14012033#comment-14012033 ]

Liang Xie commented on HDFS-6450:
---------------------------------

I will dive into it in a couple of days.