[ https://issues.apache.org/jira/browse/HDFS-10702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15816062#comment-15816062 ]
Sean Mackrory edited comment on HDFS-10702 at 1/10/17 8:25 PM:
---------------------------------------------------------------

Attaching a patch that I believe wraps up everyone's feedback so far. Specifically:
* A SyncInfo object is used both to request current transaction info, and submit minimum transaction info in requests - as suggested by [~andrew.wang] to future-proof the interface.
* I had a good look at removing that from the RPC layer. Unless it's in the header, we'd have to add it to each individual request, and the more I look at it the more cumbersome it is to remove. The best solution there is having an HDFS-specific RPC header. I don't think it's that bad that it's in the RPC header, myself - no immediate plans to use this for YARN, obviously, but specifying bounds on the staleness of data could definitely be more generally useful for distributed systems than just in HDFS.
* On a similar note, I was a little concerned about it being a static ThreadLocal on the RPC client, but it seems there are other analogous settings there so I gather there's some guarantee that clients for different filesystems are in different threads?
* I also had a look at supporting federation better. There's some more work I need to do there - it wasn't immediately working for me the way it seems it should. That's easy to add on later, though. For that, and for the optimization of using the non-checkpointing standby, I suggest filing a follow-up JIRA and building on top of this patch as-is.
* I had a look at the checkpointer, and I didn't see any dangerous assumptions that it was the only one reading the state.

Thanks for the reviews everyone!

was (Author: mackrorysd):
Attaching a patch that I believe wraps up everyone's feedback so far. Specifically:
* A SyncInfo object is used both to request current transaction info, and submit minimum transaction info in requests - as suggested by awang to future-proof the interface.
* I had a good look at removing that from the RPC layer.
Unless it's in the header, we'd have to add it to each individual request, and the more I look at it the more cumbersome it is to remove. The best solution there is having an HDFS-specific RPC header. I don't think it's that bad that it's in the RPC header, myself - no immediate plans to use this for YARN, obviously, but specifying bounds on the staleness of data could definitely be more generally useful for distributed systems than just in HDFS.
* On a similar note, I was a little concerned about it being a static ThreadLocal on the RPC client, but it seems there are other analogous settings there so I gather there's some guarantee that clients for different filesystems are in different threads?
* I also had a look at supporting federation better. There's some more work I need to do there - it wasn't immediately working for me the way it seems it should. That's easy to add on later, though. For that, and for the optimization of using the non-checkpointing standby, I suggest filing a follow-up JIRA and building on top of this patch as-is.
* I had a look at the checkpointer, and I didn't see any dangerous assumptions that it was the only one reading the state.

Thanks for the reviews everyone!

> Add a Client API and Proxy Provider to enable stale read from Standby
> ---------------------------------------------------------------------
>
>                 Key: HDFS-10702
>                 URL: https://issues.apache.org/jira/browse/HDFS-10702
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Jiayi Zhou
>            Assignee: Jiayi Zhou
>            Priority: Minor
>         Attachments: HDFS-10702.001.patch, HDFS-10702.002.patch,
> HDFS-10702.003.patch, HDFS-10702.004.patch, HDFS-10702.005.patch,
> HDFS-10702.006.patch, HDFS-10702.007.patch, StaleReadfromStandbyNN.pdf
>
>
> Currently, clients must always talk to the active NameNode when performing
> any metadata operation, which means the active NameNode could be a bottleneck
> for scalability.
> One way to solve this problem is to send read-only operations to the
> Standby NameNode. The disadvantage is that it might be a stale read.
> Here, I'm thinking of adding a Client API to enable/disable stale read from
> Standby, which gives the client the power to set the staleness restriction.
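
For readers following the thread, here is a minimal sketch of the staleness check discussed above: the client attaches a minimum transaction ID to its request, and a standby answers a read only if it has applied edits at least that far. This is not the code from the attached patch. `SyncInfo` is named in the comment, but its field (`minTxId`), the `StaleReadSketch` wrapper, the `CURRENT` holder, and both check methods are hypothetical names chosen for illustration.

```java
/** Illustrative only: a hypothetical stand-in, not the classes in the HDFS-10702 patch. */
final class StaleReadSketch {

    /** Assumed shape of SyncInfo: the minimum transaction ID a read must reflect. */
    static final class SyncInfo {
        private final long minTxId;

        SyncInfo(long minTxId) {
            this.minTxId = minTxId;
        }

        long getMinTxId() {
            return minTxId;
        }
    }

    /**
     * Per-thread holder, mirroring the "static ThreadLocal on the RPC client"
     * mentioned in the comment. Each thread sees only the value it set itself.
     */
    static final ThreadLocal<SyncInfo> CURRENT = new ThreadLocal<>();

    /**
     * Standby-side check (hypothetical): serve the read only if the standby has
     * applied edits up to at least the client's minimum transaction ID.
     * A null requirement means the client accepts any staleness.
     */
    static boolean canServeRead(long lastAppliedTxId, SyncInfo required) {
        return required == null || lastAppliedTxId >= required.getMinTxId();
    }

    /** Same check, reading the requirement from the calling thread's holder. */
    static boolean canServeReadForCurrentThread(long lastAppliedTxId) {
        return canServeRead(lastAppliedTxId, CURRENT.get());
    }

    public static void main(String[] args) {
        CURRENT.set(new SyncInfo(90L));
        try {
            System.out.println(canServeReadForCurrentThread(100L)); // standby is caught up
            System.out.println(canServeReadForCurrentThread(80L));  // standby is too stale
        } finally {
            CURRENT.remove(); // avoid leaking the value into pooled threads
        }
    }
}
```

The `ThreadLocal` keeps the sketch close to the design described in the comment, and it also shows the concern raised there: two filesystem clients sharing one thread would share the same `CURRENT` value.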