[jira] [Comment Edited] (HDFS-10702) Add a Client API and Proxy Provider to enable stale read from Standby

Sean Mackrory (JIRA) Tue, 10 Jan 2017 12:26:26 -0800

    [ https://issues.apache.org/jira/browse/HDFS-10702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15816062#comment-15816062 ]


Sean Mackrory edited comment on HDFS-10702 at 1/10/17 8:25 PM:
---------------------------------------------------------------

Attaching a patch that I believe wraps up everyone's feedback so far. 
Specifically:

* A SyncInfo object is used both to request current transaction info, and 
submit minimum transaction info in requests - as suggested by [~andrew.wang] to 
future-proof the interface.
* I had a good look at removing the SyncInfo from the RPC layer. Unless it's in 
the header, we'd have to add it to each individual request, and the more I look 
at it the more cumbersome it is to remove. The best solution there is having an 
HDFS-specific RPC header. I don't think it's that bad that it's in the RPC 
header, myself - no immediate plans to use this for YARN, obviously, but 
specifying bounds on the staleness of data could definitely be more generally 
useful for distributed systems than just in HDFS. 
* On a similar note, I was a little concerned about it being a static 
ThreadLocal on the RPC client, but it seems there are other analogous settings 
there so I gather there's some guarantee that clients for different filesystems 
are in different threads?
* I also had a look at supporting federation better. There's some more work I 
need to do there - it wasn't immediately working for me the way it seems it 
should. That's easy to add on later, though. For that, and for the optimization 
of using the non-checkpointing standby, I would suggest that I file a follow-up 
JIRA and build on top of this patch as-is.
* I had a look at the checkpointer, and I didn't see any dangerous assumptions 
that it was the only one reading the state.
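
To make the static ThreadLocal concern above concrete, here is a minimal 
sketch. It is not taken from the patch: FakeRpcClient, MIN_TXID, setMinTxId 
and minTxIdForNextCall are hypothetical names invented for illustration, and a 
plain long stands in for whatever SyncInfo actually carries. It only shows the 
general Java behaviour that a static ThreadLocal is scoped per thread, not per 
client instance, so two clients used from the same thread would see the same 
value.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreadLocalSketch {

    // Hypothetical stand-in for an RPC client; NOT the Hadoop class.
    static final class FakeRpcClient {
        // Static, so shared by every instance; ThreadLocal, so one slot per thread.
        // -1 is an assumed "no bound set" default for this sketch.
        private static final ThreadLocal<Long> MIN_TXID =
                ThreadLocal.withInitial(() -> -1L);

        // Record the minimum transaction id the calling thread will accept.
        void setMinTxId(long txid) {
            MIN_TXID.set(txid);
        }

        // Value that would be placed in the header of this thread's next call.
        long minTxIdForNextCall() {
            return MIN_TXID.get();
        }
    }

    // Read the client's value from a freshly created worker thread.
    static long readOnOtherThread(FakeRpcClient client) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Callable<Long> task = client::minTxIdForNextCall;
            return pool.submit(task).get();
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        FakeRpcClient fsA = new FakeRpcClient(); // imagine: client for filesystem A
        FakeRpcClient fsB = new FakeRpcClient(); // imagine: client for filesystem B

        fsA.setMinTxId(100L);

        // Same thread: fsB observes the bound that was set through fsA.
        System.out.println(fsB.minTxIdForNextCall()); // prints 100

        // Different thread: the slot is untouched, so the default is returned.
        System.out.println(readOnOtherThread(fsB)); // prints -1
    }
}
```

If clients for different filesystems can ever share a thread, a per-instance 
field (or a map keyed by client) would avoid this kind of leakage; whether 
that situation arises in the real RPC client is exactly the open question 
raised above.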

Thanks for the reviews everyone!


> Add a Client API and Proxy Provider to enable stale read from Standby
> ---------------------------------------------------------------------
>
>                 Key: HDFS-10702
>                 URL: https://issues.apache.org/jira/browse/HDFS-10702
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Jiayi Zhou
>            Assignee: Jiayi Zhou
>            Priority: Minor
>         Attachments: HDFS-10702.001.patch, HDFS-10702.002.patch, 
> HDFS-10702.003.patch, HDFS-10702.004.patch, HDFS-10702.005.patch, 
> HDFS-10702.006.patch, HDFS-10702.007.patch, StaleReadfromStandbyNN.pdf
>
>
> Currently, clients must always talk to the active NameNode when performing 
> any metadata operation, which means active NameNode could be a bottleneck for 
> scalability. One way to solve this problem is to send read-only operations to 
> Standby NameNode. The disadvantage is that it might be a stale read. 
> Here, I'm thinking of adding a Client API to enable/disable stale read from 
> Standby which gives Client the power to set the staleness restriction.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
