Re: DFS. How to read from a specific datanode
Yes, I agree that it should be negotiated: the namenode provides an ordered list, and the client chooses from it based on its own measurements. But I am afraid 0.17.1 does not provide an easy interface for this.

-Kevin

On Thu, Aug 7, 2008 at 3:40 AM, Steve Loughran <[EMAIL PROTECTED]> wrote:
> Kevin wrote:
>> Thank you for the suggestion. I looked at DFSClient. It appears that the
>> chooseDataNode method decides which data node to connect to. Currently
>> it chooses the first non-dead data node returned by the namenode, which
>> has sorted the nodes by proximity to the client. However,
>> chooseDataNode is private, so overriding it seems infeasible. Nor
>> are the callers of chooseDataNode public or protected.
>>
>> I need this because I do not want to trust the namenode's ordering. For
>> applications where network congestion is rare, we should let the
>> client decide which data node to load from.
>
> Dangerous. What happens when network congestion arrives and the apps are out
> there? Maybe it should be negotiated: the namenode provides an ordered list and
> the client can choose some based on its own measurements. If the name node
> provides only one, that's the one you get to use.
Re: DFS. How to read from a specific datanode
Kevin wrote:
> Thank you for the suggestion. I looked at DFSClient. It appears that the
> chooseDataNode method decides which data node to connect to. Currently
> it chooses the first non-dead data node returned by the namenode, which
> has sorted the nodes by proximity to the client. However,
> chooseDataNode is private, so overriding it seems infeasible. Nor
> are the callers of chooseDataNode public or protected.
>
> I need this because I do not want to trust the namenode's ordering. For
> applications where network congestion is rare, we should let the
> client decide which data node to load from.

Dangerous. What happens when network congestion arrives and the apps are out there? Maybe it should be negotiated: the namenode provides an ordered list and the client can choose some based on its own measurements. If the name node provides only one, that's the one you get to use.
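A minimal sketch of the negotiated approach described above, not Hadoop code: the client re-ranks the namenode's ordered list using its own latency measurements, and never picks a node outside that list. The class name, the `rank` method, and the latency map are all hypothetical; datanodes are represented as plain strings for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; none of these names exist in Hadoop. */
class NegotiatedReplicaChoice {

    /**
     * Re-rank the namenode's candidate list by client-measured latency.
     * Unmeasured nodes sort after measured ones and keep the namenode's
     * relative order. The result is always a permutation of the input,
     * so a single-node list is returned as-is.
     */
    static List<String> rank(List<String> namenodeOrder, Map<String, Long> measuredLatencyMs) {
        List<String> ranked = new ArrayList<>(namenodeOrder);
        // List.sort is stable: ties fall back to the namenode's ordering.
        ranked.sort((a, b) -> Long.compare(
                measuredLatencyMs.getOrDefault(a, Long.MAX_VALUE),
                measuredLatencyMs.getOrDefault(b, Long.MAX_VALUE)));
        return ranked;
    }

    public static void main(String[] args) {
        Map<String, Long> latency = new HashMap<>();
        latency.put("dn3", 2L); // assumed client-side measurements
        latency.put("dn1", 9L);
        System.out.println(rank(Arrays.asList("dn1", "dn2", "dn3"), latency));
    }
}
```

Falling back to the namenode's order for unmeasured nodes keeps the namenode's view as the default, which is one way to address the congestion concern: the client only overrides the ordering where it has evidence.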
Re: DFS. How to read from a specific datanode
Thank you for the idea of submitting a request. However, I am afraid I cannot wait until it is served. In the worst case, I will probably hack my copy of Hadoop and rebuild it.

-Kevin

On Wed, Aug 6, 2008 at 11:31 AM, lohit <[EMAIL PROTECTED]> wrote:
>> I need this because I do not want to trust the namenode's ordering. For
>> applications where network congestion is rare, we should let the
>> client decide which data node to load from.
>
> If this is the case, then providing a method to re-order the datanode list
> shouldn't be hard. Maybe open a JIRA
> (https://issues.apache.org/jira/secure/CreateIssue!default.jspa) as an
> improvement request and continue the discussion there?
>
> -Lohit
Re: DFS. How to read from a specific datanode
> I need this because I do not want to trust the namenode's ordering. For
> applications where network congestion is rare, we should let the
> client decide which data node to load from.

If this is the case, then providing a method to re-order the datanode list shouldn't be hard. Maybe open a JIRA (https://issues.apache.org/jira/secure/CreateIssue!default.jspa) as an improvement request and continue the discussion there?

-Lohit
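One shape such a re-ordering method could take is a small pluggable hook that the client applies to the namenode's list before connecting. This is only a sketch: the `ReplicaOrdering` interface and the `ReplicaOrderings` helpers are hypothetical and do not exist in Hadoop 0.17, and datanodes are plain strings here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical hook; not part of any Hadoop release. */
interface ReplicaOrdering {
    List<String> reorder(List<String> namenodeOrder);
}

class ReplicaOrderings {

    /** Default behaviour: trust the namenode's proximity ordering unchanged. */
    static final ReplicaOrdering NAMENODE_ORDER = nodes -> nodes;

    /**
     * Move one preferred datanode to the front if it holds a replica.
     * If it does not appear in the list, the list is returned untouched.
     */
    static ReplicaOrdering prefer(String preferredNode) {
        return nodes -> {
            if (!nodes.contains(preferredNode)) {
                return nodes;
            }
            List<String> out = new ArrayList<>(nodes);
            out.remove(preferredNode);
            out.add(0, preferredNode);
            return out;
        };
    }

    public static void main(String[] args) {
        List<String> fromNamenode = Arrays.asList("dn1", "dn2", "dn3");
        System.out.println(prefer("dn2").reorder(fromNamenode));
    }
}
```

Because the hook only permutes the list it is given, the remaining nodes stay available as fallbacks in the namenode's order if the preferred node turns out to be dead.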
Re: DFS. How to read from a specific datanode
Yes, the namenode is in charge of deciding proximity, using DNSToSwitchMapping. On the other hand, I am exploring the possibility of letting the client decide which data node to connect to, since the network hierarchy is sometimes so complex or dynamic that it is better to let the client find out which datanode is nearest.

-Kevin

On Wed, Aug 6, 2008 at 2:31 AM, Samuel Guo <[EMAIL PROTECTED]> wrote:
> Kevin wrote:
>> Hi,
>>
>> This is about DFS only, not MapReduce. It may sound like a
>> strange need, but sometimes I want to read a block from a specific
>> data node which holds a replica. Figuring out which datanodes have the
>> block is easy. But is there an easy way to specify which datanode I
>> want to load from?
>>
>> Best,
>> -Kevin
>
> DFSClient will choose a node that contains a replica of the block for you.
> The chosen node will be the nearest node to your client node. This method is
> awesome.
> Please let me know why you want to specify the datanode yourself :)
Re: DFS. How to read from a specific datanode
Thank you for the suggestion. I looked at DFSClient. It appears that the chooseDataNode method decides which data node to connect to. Currently it chooses the first non-dead data node returned by the namenode, which has sorted the nodes by proximity to the client. However, chooseDataNode is private, so overriding it seems infeasible. Nor are the callers of chooseDataNode public or protected.

I need this because I do not want to trust the namenode's ordering. For applications where network congestion is rare, we should let the client decide which data node to load from.

-Kevin

On Tue, Aug 5, 2008 at 7:57 PM, lohit <[EMAIL PROTECTED]> wrote:
> I haven't tried it, but see if you can create a DFSClient object and use its
> open() and read() calls to get the job done. Basically you would have to
> force currentNode to be your node of interest in there.
> Just curious, what is the use case for your request?
>
> Thanks,
> Lohit
>
> ----- Original Message -----
> From: Kevin <[EMAIL PROTECTED]>
> To: "core-user@hadoop.apache.org"
> Sent: Tuesday, August 5, 2008 6:59:55 PM
> Subject: DFS. How to read from a specific datanode
>
> Hi,
>
> This is about DFS only, not MapReduce. It may sound like a
> strange need, but sometimes I want to read a block from a specific
> data node which holds a replica. Figuring out which datanodes have the
> block is easy. But is there an easy way to specify which datanode I
> want to load from?
>
> Best,
> -Kevin
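The selection rule described above (take the first non-dead node from the namenode's proximity-sorted list) can be sketched as follows. This mirrors the description in the message only; it is not the DFSClient source, and the class name, method signature, and string-based node representation are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Illustrative stand-in for the behaviour described; not DFSClient code. */
class FirstLiveNodeChooser {

    /**
     * Returns the first node, in namenode order, that is not marked dead,
     * or null if every replica holder is currently considered dead.
     */
    static String choose(List<String> sortedByProximity, Set<String> deadNodes) {
        for (String node : sortedByProximity) {
            if (!deadNodes.contains(node)) {
                return node;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Set<String> dead = new HashSet<>(Arrays.asList("dn1"));
        System.out.println(choose(Arrays.asList("dn1", "dn2", "dn3"), dead));
    }
}
```

With this rule the client has no say in the outcome beyond its dead-node set, which is why changing the choice requires either re-ordering the input list or modifying the method itself.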
Re: DFS. How to read from a specific datanode
Kevin wrote:
> Hi,
>
> This is about DFS only, not MapReduce. It may sound like a
> strange need, but sometimes I want to read a block from a specific
> data node which holds a replica. Figuring out which datanodes have the
> block is easy. But is there an easy way to specify which datanode I
> want to load from?
>
> Best,
> -Kevin

DFSClient will choose a node that contains a replica of the block for you. The chosen node will be the nearest node to your client node. This method is awesome.

Please let me know why you want to specify the datanode yourself :)
Re: DFS. How to read from a specific datanode
I haven't tried it, but see if you can create a DFSClient object and use its open() and read() calls to get the job done. Basically you would have to force currentNode to be your node of interest in there.

Just curious, what is the use case for your request?

Thanks,
Lohit

----- Original Message -----
From: Kevin <[EMAIL PROTECTED]>
To: "core-user@hadoop.apache.org"
Sent: Tuesday, August 5, 2008 6:59:55 PM
Subject: DFS. How to read from a specific datanode

> Hi,
>
> This is about DFS only, not MapReduce. It may sound like a
> strange need, but sometimes I want to read a block from a specific
> data node which holds a replica. Figuring out which datanodes have the
> block is easy. But is there an easy way to specify which datanode I
> want to load from?
>
> Best,
> -Kevin
DFS. How to read from a specific datanode
Hi,

This is about DFS only, not MapReduce. It may sound like a strange need, but sometimes I want to read a block from a specific data node which holds a replica. Figuring out which datanodes have the block is easy. But is there an easy way to specify which datanode I want to load from?

Best,
-Kevin