Block placement question
Hello everyone, I am trying to devise my own block placement policy in Hadoop 3. My query is: when I choose favored nodes, is that choice valid for a particular file or a particular block? What if I wish to send different blocks of the same file to different nodes? How can I control that? Any inputs will be greatly appreciated. Warm Regards, Shuubham Ojha
Re: block placement
Hello Hari, can you give me some insight as to how I can put the IP addresses or rack numbers of datanodes in the chooseTarget method? Warm regards, Shuubham
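The archive shows no direct reply, but a rough sketch of the kind of path-to-IP pinning policy being discussed might look like the following. This is illustrative only: chooseTarget is an internal NameNode API whose exact signature varies across Hadoop 3.x releases, the PATH_TO_IP map and class name are hypothetical, and clusterMap is assumed to be the protected NetworkTopology that BlockPlacementPolicyDefault initializes.

import java.util.Collections;
import java.util.EnumSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

import org.apache.hadoop.hdfs.AddBlockFlag;
import org.apache.hadoop.hdfs.protocol.BlockStoragePolicy;
import org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault;
import org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor;
import org.apache.hadoop.hdfs.server.blockmanagement.DatanodeStorageInfo;
import org.apache.hadoop.net.Node;
import org.apache.hadoop.net.NodeBase;

// Sketch only: match the chooseTarget override against the
// BlockPlacementPolicy in the Hadoop tree you build with.
public class IpPinningPlacementPolicy extends BlockPlacementPolicyDefault {

  // Hypothetical rule: files under /pinned/ get their first replica
  // on the datanode registered with this IP.
  private static final Map<String, String> PATH_TO_IP =
      Collections.singletonMap("/pinned/", "10.0.0.11");

  @Override
  public DatanodeStorageInfo[] chooseTarget(String srcPath, int numOfReplicas,
      Node writer, List<DatanodeStorageInfo> chosen, boolean returnChosenNodes,
      Set<Node> excludedNodes, long blocksize, BlockStoragePolicy storagePolicy,
      EnumSet<AddBlockFlag> flags) {

    // Let the default policy do the real work first.
    DatanodeStorageInfo[] targets = super.chooseTarget(srcPath, numOfReplicas,
        writer, chosen, returnChosenNodes, excludedNodes, blocksize,
        storagePolicy, flags);

    String wantedIp = null;
    for (Map.Entry<String, String> e : PATH_TO_IP.entrySet()) {
      if (srcPath != null && srcPath.startsWith(e.getKey())) {
        wantedIp = e.getValue();
      }
    }
    if (wantedIp == null || targets.length == 0) {
      return targets; // no pinning rule for this path
    }

    // clusterMap holds the registered datanodes; getIpAddr() comes
    // from DatanodeID.
    for (Node n : clusterMap.getLeaves(NodeBase.ROOT)) {
      DatanodeDescriptor dn = (DatanodeDescriptor) n;
      if (wantedIp.equals(dn.getIpAddr()) && dn.getStorageInfos().length > 0) {
        targets[0] = dn.getStorageInfos()[0]; // pin the first replica
        break;
      }
    }
    return targets;
  }
}

The NameNode would pick such a class up via dfs.block.replicator.classname in hdfs-site.xml, as described further down in this archive.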
Re: block placement
Favoured nodes is a per-file property that you can set at the time of creation. Note that future rebalancing may not respect this. You can read more about it here: https://github.com/apache/hadoop/blob/a55d6bba71c81c1c4e9d8cd11f55c78f10a548b0/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSClient.java#L1171 If there is a fixed relation between your files and the datanodes they need to be on, then as Hilmi suggested you are better off writing a custom policy that extends the default one. You would need to extend the chooseTarget method to apply your logic there. ~ Hari
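For reference, the per-file favored-nodes hint Hari describes is exposed on DistributedFileSystem as a create() overload that takes an InetSocketAddress[]. A minimal sketch, assuming a Hadoop 3.x client; the addresses and paths are hypothetical, and 9866 is the Hadoop 3 default datanode transfer port (50010 in Hadoop 2):

import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class FavoredNodesExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    DistributedFileSystem dfs =
        (DistributedFileSystem) new Path("/").getFileSystem(conf);

    // Hypothetical datanodes; the hint applies to every block of the file
    // and is best-effort, not a guarantee.
    InetSocketAddress[] favored = {
        new InetSocketAddress("10.0.0.11", 9866),
        new InetSocketAddress("10.0.0.12", 9866)
    };

    try (FSDataOutputStream out = dfs.create(
        new Path("/data/favored-file"),
        FsPermission.getFileDefault(),
        true,                // overwrite
        4096,                // buffer size
        (short) 2,           // replication
        128 * 1024 * 1024L,  // block size
        null,                // no progress callback
        favored)) {
      out.write("hello".getBytes(StandardCharsets.UTF_8));
    }
  }
}

Because the hint is given once at creation time and covers the whole file, steering different blocks of the same file to different specific nodes (the original question in this thread) cannot be expressed with favored nodes; that requires a custom placement policy.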
Re: block placement
Hi Hilmi, thanks for your response. I have already done the following: 1. Defined the rack topology in a script and added that to core-site.xml. I think this ensures that Hadoop on the namenode knows the IP addresses of datanodes and their corresponding rack numbers. 2. I have gone through the default block placement policy code multiple times and I can see that a parameter called favored nodes has been used. I believe this parameter is of interest to me. The trouble is that I have a set of nodes with known IP addresses, but I can't see any specific IP addresses being used in the placement policy code. I specifically want to know how to tell my Hadoop code to place a particular block of data on a particular datanode given its IP address. I think this problem can be resolved by setting the given datanode as a favored node for the time when the data block of interest is being dequeued to be sent to datanodes, but I can't see how that should be done. Warm regards, Shuubham Ojha
Re: block placement
Hi Shuubham, You can simply create your own block placement class by extending BlockPlacementPolicy. As a starting point, you may want to have a look at the default policy's source code: https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java Regards, H. Egemen Ciritoglu
block placement
Can anyone give me some idea as to how I can write my own block placement strategy in Hadoop 3? Shuubham Ojha
Re: HDFS Block placement policy
The best practice is to have an edge/gateway node, so that there is no local copy of data. It is also good from a security perspective. I think this video of mine can help you understand this better: https://www.youtube.com/watch?v=t20niJDO1f4 Regards Gurmukh On 20/05/16 12:29 AM, Ruhua Jiang wrote: Hi all, I have a question related to the HDFS block placement policy. The default is as follows: "Place the first replica somewhere – either a random node (if the HDFS client is outside the Hadoop/DataNode cluster) or on the local node (if the HDFS client is running on a node inside the cluster). Place the second replica in a different rack." Let's consider the situation where data is on *1 datanode's local disk* and an *hdfs -put* command is used (which means the HDFS client is on a datanode) to ingest this data into HDFS. - What will happen (in terms of block placement) if this datanode's local disk is full? - Is there a list of available alternative block placement policy implementations that hdfs -put can use just by changing the hdfs-site.xml config? I noticed the https://issues.apache.org/jira/browse/HDFS-385 JIRA ticket but it seems not to be what we want. - I understand that placing the first block on the local machine can improve performance, and that we can use the HDFS balancer to solve the imbalance problem afterwards. However, I just want to explore alternative solutions to avoid this problem from the beginning. Thanks Ruhua Jiang -- Thanks and Regards Gurmukh Singh
Controlling the block placement and the file placement in HDFS writes
Hello All, I was wondering if the following issues can be solved by extending HDFS classes with custom implementations. Here are my requirements: 1. Is there a way to ensure that all file blocks belonging to a particular HDFS directory go to the same physical datanode (and their corresponding replicas as well)? 2. Is there a way to control the volume that is used to write a file block? Here are the finer details, to give some background for the above two queries. We are using the Impala engine to analyze our data, and to do so we generate the files in Parquet format using our custom data processing pipelines. It is ideal that all of the files belonging to the same partition be processed by the same node (as we see, our queries can be partitioned by, say, a key that is common across all queries). In this regard, we would like to ensure that a given file block in a particular path always lands on the same physical node, so that the Impala workers that need to process data for a given query send less data across nodes to assemble the result. Hence the first question. The second aspect is that our queries are time-based, and this time window follows the familiar pattern of old data not being queried much. Hence we would like to keep the most recent data in the HDFS cache (Impala is helping us manage this aspect via its command set), but we would like the next most recent chunks of data to land on an SSD that is present on every datanode. The remaining set of blocks, which are very old but in large quantities, would land on spinning disks. The decision to choose a given volume is based on the file name, as we can control the filename that is used to generate the file. Please note that the criteria for both of the above are based on the name of the file and not the size of the file. For the first query, I have looked into the BlockPlacementPolicy interface. The method chooseTarget() in the above interface gives me a list of datanodes to choose from, and I return one of them as the target. My dilemma is whether, given a path for a directory, the input DatanodeDescriptors to chooseTarget will remain the same for each invocation on the same directory, or whether that is not something that is controlled. For the second query, I have looked at the VolumeChoosingPolicy interface. It looks like the only handle I get is the list of current volumes, with no information about the incoming file. Any pointers regarding the above two aspects would be immensely helpful. Thanks a lot for your time. Regards, Ananth
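On the second question, the interface indeed confirms the limitation Ananth observes: a volume choosing policy sees only the candidate volumes and the replica size, never the file name. A minimal sketch of the shape of such a policy, picking the volume with the most free space (Hadoop already ships RoundRobinVolumeChoosingPolicy and AvailableSpaceVolumeChoosingPolicy; newer releases also pass a storageId argument to chooseVolume, so check the interface in your version):

import java.io.IOException;
import java.util.List;

import org.apache.hadoop.hdfs.server.datanode.fsdataset.FsVolumeSpi;
import org.apache.hadoop.hdfs.server.datanode.fsdataset.VolumeChoosingPolicy;

// Illustrative only: a policy like this is configured on datanodes via
// dfs.datanode.fsdataset.volume.choosing.policy.
public class MostAvailableVolumeChoosingPolicy<V extends FsVolumeSpi>
    implements VolumeChoosingPolicy<V> {

  @Override
  public V chooseVolume(List<V> volumes, long replicaSize) throws IOException {
    V best = null;
    for (V v : volumes) {
      // getAvailable() reports the volume's free space; note that nothing
      // here identifies the file the replica belongs to.
      if (v.getAvailable() >= replicaSize
          && (best == null || v.getAvailable() > best.getAvailable())) {
        best = v;
      }
    }
    if (best == null) {
      throw new IOException("no volume has " + replicaSize + " bytes free");
    }
    return best;
  }
}

For the SSD-versus-spinning-disk tiering described above, HDFS storage policies (for example ONE_SSD or ALL_SSD, settable per path) are likely a better fit than a volume policy, since they are path-based rather than size-based.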
Block placement without rack aware
What block placement policy does Hadoop follow when rack awareness is not enabled? Does it just round-robin? Thanks.
RE: Block placement without rack aware
It’s still random. If rack awareness is not enabled, all nodes are in “default-rack”, and random nodes are chosen for block replication. Regards, Yi Liu
Re: Block placement without rack aware
It appears to be randomly chosen. I just came across this blog post from Lars George about HBase file locality in HDFS: http://www.larsgeorge.com/2010/05/hbase-file-locality-in-hdfs.html
Re: Block placement without rack aware
Thanks for the info. Exactly what I needed. Cheers.
Support multiple block placement policies
Hi there, According to the code, the current implementation of HDFS only supports one specific type of block placement policy, which is BlockPlacementPolicyDefault by default. The default policy is enough for most circumstances, but under some special circumstances it does not work so well. For example, on a shared cluster, we want to erasure-encode all the files under some specified directories, so the files under these directories need to use a new placement policy. But at the same time, other files still use the default placement policy. Here we need to support multiple placement policies in HDFS. One plain thought is that the default placement policy is still configured as the default; on the other hand, HDFS can let the user specify a customized placement policy through the extended attributes (xattrs). When HDFS chooses the replica targets, it first checks the customized placement policy; if none is specified, it falls back to the default one. Any thoughts? -- Best Wishes! Yours, Zesheng
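For concreteness, the per-directory selection described above would presumably ride on the existing xattr API; a sketch, with the attribute name purely hypothetical since this is a proposal rather than a shipped feature:

import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class PlacementPolicyXAttrExample {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    // Hypothetical attribute: directories carrying it would use the
    // custom policy; everything else falls back to the default.
    fs.setXAttr(new Path("/ec-data"),
        "user.block.placement.policy",
        "erasure-coding".getBytes(StandardCharsets.UTF_8));
  }
}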
Examining effect of changing block placement policy.
Hi, I've made some changes to the default block placement policy and want to see how it affects a cluster. Any suggestions on how I can test the before and after of a cluster once these changes are made? I read up a bit on Rumen and GridMix in my search for tools that would help me benchmark things on a cluster. As far as I know, I need some job traces to get the ball rolling. I've googled for sample job traces but didn't find anything. I found this page: http://ftp.pdl.cmu.edu/pub/datasets/hla/dataset.html but I'm not sure how to use the data there. I don't have a ton of data, or a bunch of queries I could run on it. My best idea so far is to run a bunch of sorts on different input sizes, and word counts on different combinations of files, all while following an exponential inter-job arrival time. I'm planning to do this on AWS EC2's free tier. Any suggestions on how to observe the effects of changing the policy would be appreciated. Thank you, Arjun
Building custom block placement policy. What is srcPath?
Hi, I want to write a block placement policy that takes the size of the file being placed into account, something like what is done in the CoHadoop or BEEMR papers. I have the following questions: 1- What is srcPath in chooseTarget? Is it the path to the original un-chunked file, or a path to a single block, or something else? I added some code to BlockPlacementPolicyDefault to print out the value of srcPath but the results look odd. 2- Will a simple new File(srcPath) do? 3- I've spent time looking at the Hadoop source code. I can't find a way to go from srcPath in chooseTarget to a file size. Every function I think could do it, in FSNamesystem, FSDirectory, etc., is either non-public or cannot be called from inside the blockmanagement package or the block placement class. How do I go from srcPath in the block placement class to the size of the file being placed? Thank you, AB
Re: Building custom block placement policy. What is srcPath?
Hello, (Inline) On Thu, Jul 24, 2014 at 11:11 PM, Arjun Bakshi baksh...@mail.uc.edu wrote: > 1- What is srcPath in chooseTarget? Is it the path to the original un-chunked file, or a path to a single block, or something else? I added some code to BlockPlacementPolicyDefault to print out the value of srcPath but the results look odd. The arguments are documented in the interface javadoc: https://github.com/apache/hadoop-common/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicy.java#L61 The srcPath is the file path of the file on HDFS for which the block placement targets are being requested. > 2- Will a simple new File(srcPath) do? Please rephrase? The srcPath is not a local file, if that's what you meant. > 3- I've spent time looking at the Hadoop source code. I can't find a way to go from srcPath in chooseTarget to a file size. Every function I think could do it, in FSNamesystem, FSDirectory, etc., is either non-public or cannot be called from inside the blockmanagement package or the block placement class. The block placement is something that, within the context of a new file creation, is called when requesting a new block. At this point the file is not complete, so there is no way to determine its actual length, only the requested block size. I'm not certain that BlockPlacementPolicy is what will solve your goal. > How do I go from srcPath in the block placement class to the size of the file being placed? Are you targeting in-progress files or completed files? The latter form of files would result in placement policy calls iff there's under-replication/losses/etc. to block replicas of the original set. Only for such operations would you have a possibility of determining the actual full length of the file (as explained above). -- Harsh J
Re: Building custom block placement policy. What is srcPath?
Hi, Thanks for the reply. It cleared up a few things. I hadn't thought of situations of under-replication, but I'll give it some thought now. It should be easier since, as you've mentioned, by that time the namenode knows all the blocks that came from the same file as the under-replicated block. For the most part, I was thinking of when a new file is being placed on the cluster. I think this is what you called in-progress files. Say a new 1GB file needs to be placed onto the cluster. I want to make the system take the fact that the file is 1GB in size into account while placing all its blocks onto nodes in the cluster. I'm not clear on where the file is broken down into blocks/chunks, in terms of which class, which file system (local or HDFS), or where in the process flow. Knowing that will help me come up with a solution. Where is the last place, in terms of a function or point in the process, that I can find the name of the original file being placed on the system? I'm reading the namenode and FSNamesystem code just to see if I can do what I want from there. Any suggestions will be appreciated. Thank you, AB
Re: Building custom block placement policy. What is srcPath?
Hi, Inline. On Fri, Jul 25, 2014 at 2:55 AM, Arjun Bakshi baksh...@mail.uc.edu wrote: > For the most part, I was thinking of when a new file is being placed on the cluster. I think this is what you called in-progress files. Say a new 1GB file needs to be placed onto the cluster. I want to make the system take the fact that the file is 1GB in size into account while placing all its blocks onto nodes in the cluster. You are assuming that all files are loaded into the cluster from an existing file on another FS, such as a local FS via the command 'hadoop fs -put', and that you can thereby know the entire file length in advance. This assumption is incorrect. Programs can also write streams of arbitrary data based on their needs. An HDFS writer can simply create a new file and write any number of bytes it wants to its output stream. To HDFS this is no different from a load; it treats both such writes the same way, and it is merely the client's goal that differs. > I'm not clear on where the file is broken down into blocks/chunks, in terms of which class, which file system (local or HDFS), or where in the process flow. Knowing that will help me come up with a solution. Where is the last place, in terms of a function or point in the process, that I can find the name of the original file being placed on the system? The client program chunks the file as it writes. You can look at the DFSOutputStream class for the client implementation. -- Harsh J
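A small sketch of the kind of writer Harsh describes; to HDFS the total file length is unknowable until the stream is closed, so placement can only ever be decided one block at a time (the path is hypothetical):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class StreamingWriteExample {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    // The writer itself may not know in advance how many records it
    // will emit.
    try (FSDataOutputStream out = fs.create(new Path("/tmp/stream-demo"))) {
      for (int i = 0; i < 1_000_000; i++) {
        // Each time the current block fills, the client asks the NameNode
        // for a new block, and placement is decided for that block alone.
        out.writeBytes("record-" + i + "\n");
      }
    }
  }
}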
HDFS block placement
Hey, I am a bit confused about block placement in Hadoop. Assume that there is no replication and a task (map or reduce) writes a file to HDFS; will all blocks be stored on the same local node (the node on which the task runs)? I think yes, but I am not sure. Kind Regards, Lukas Kairies
Re: HDFS block placement
Your thought is correct. If space is available locally, then the block is automatically stored locally. -- Harsh J
Re: Pluggable Block placement policy
Hi Harsh, Here is a silly question for you :) I want to ask where the pluggable block placement policy code should be put. Going by Ivan's description of how to configure it, I think it should go in org.apache.hadoop.hdfs.server.namenode. But when I put the file in this location and execute the mvn command, it fails :( Please explain the procedure. -- *With regards ---* *Mohammad Mustaqeem*, M.Tech (CSE) MNNIT Allahabad 9026604270
Pluggable Block placement policy
Has anybody used a pluggable block placement policy? If yes, please point me in the right direction: how do I use this feature in Hadoop 2.0.3-alpha? I also need the code of a pluggable block placement policy. Thanks in advance. -- *With regards ---* *Mohammad Mustaqeem*, M.Tech (CSE) MNNIT Allahabad 9026604270
Re: Pluggable Block placement policy
Hi Mohammad, I believe we've already answered this earlier to an almost exact question: http://search-hadoop.com/m/b8FAa2eI6kj1. Does that not suffice? Is there something more specific you are looking for? -- Harsh J
Block placement Policy
I have read somewhere that a user can specify his own ReplicaPlacementPolicy. How can I specify my own ReplicaPlacementPolicy? If you have any sample ReplicaPlacementPolicy, please share it. -- *With regards ---* *Mohammad Mustaqeem*, M.Tech (CSE) MNNIT Allahabad 9026604270
Re: Block placement Policy
You might find this useful: https://issues.apache.org/jira/browse/HDFS-385 Warm Regards, Tariq https://mtariq.jux.com/ cloudfront.blogspot.com
Re: Block placement Policy
Do you know how to use it? Sorry, I am very new :( -- *With regards ---* *Mohammad Mustaqeem*, M.Tech (CSE) MNNIT Allahabad 9026604270
Re: Block placement Policy
Have a look at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault. This is the default placement policy; you should be able to extend it to implement your own policy and set the config parameter dfs.block.replicator.classname to point to your class.
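For reference, the corresponding hdfs-site.xml entry on the NameNode might look like this; the class name is hypothetical, and the property name should be checked against the hdfs-default.xml of your release:

<!-- hdfs-site.xml on the NameNode; a NameNode restart is required. -->
<property>
  <name>dfs.block.replicator.classname</name>
  <!-- hypothetical custom policy class -->
  <value>com.example.hdfs.MyBlockPlacementPolicy</value>
</property>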
Can a jobtracker directly access datanode information (block placement) on namenode?
Hello all, From the jobtracker, I want to get the datanode information (related to block placement) that is kept at the namenode. As far as I understand, the jobtracker uses the locality of data for job scheduling, so I believe the jobtracker keeps this information somewhere in the source code. However, I could not find the location. Can anyone give me a starting point (source code) where the jobtracker has access to block placement information? Thanks.
Re: Can a jobtracker directly access datanode information (block placement) on namenode?
Hi, Any HDFS client can request a list of block locations for a given file path (node-level detail of where blocks are placed for a file) via the FileSystem#getFileBlockLocations API: http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/fs/FileSystem.html#getFileBlockLocations(org.apache.hadoop.fs.FileStatus,%20long,%20long) MR too gets this info via the user's InputFormat#getSplits method, and schedules with these locations. -- Harsh J
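A minimal sketch of that client-side API; the file path is hypothetical:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockLocationsExample {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    FileStatus st = fs.getFileStatus(new Path("/data/sample.txt"));
    // One BlockLocation per block overlapping the requested byte range.
    for (BlockLocation loc : fs.getFileBlockLocations(st, 0, st.getLen())) {
      System.out.printf("offset=%d length=%d hosts=%s%n",
          loc.getOffset(), loc.getLength(),
          String.join(",", loc.getHosts()));
    }
  }
}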
Data block placement in HDFS
Hello, It's interesting to see that HDFS provided an interface, BlockPlacementPolicy, for users to define their own data placement policy in Hadoop v0.21 and v0.22, and I think it's very useful for some applications. However, I can't find this interface in the latest stable version, Hadoop v1.0. Has the interface been replaced by something else or abandoned completely? If it's abandoned, could anyone tell me why? Thanks, Da
Re: Data block placement in HDFS
I think this deals more with our messed-up version numbering than anything else. Hadoop 1.0 is not a release derived from Hadoop 0.21 or 0.22; it comes from the 0.20 line. Hadoop 2.0 is what comes from 0.21 and 0.22. So Hadoop 2 is going to have this, but Hadoop 1 will not. --Bobby Evans
Re: Data block placement in HDFS
Is v0.21 or v0.22 a stable version? The website doesn't say it explicitly. Thanks, Da
RE: block placement
Well, here's my first Hadoop Jira :-) https://issues.apache.org/jira/browse/MAPREDUCE-2636 Cheers, Evert
Re: block placement
Evert, With the default behavior, every block request is handed a storage device in a round-robin fashion. So yes, parallel writes should be well spread over the configured number of disks. But that's not to say that the file is perfectly distributed across the DNs as you describe. The DNs are chosen randomly for writes (if no local one is available). Regarding MR, I do not believe it does any such optimization right now (in fact, the MR code is quite FS-agnostic). Right now, tasks are run on nodes where blocks are located, but metadata about which disk a block may reside on is not maintained by the NN, so MR can't naturally know this to do anything about it. This would be good to discuss, however (search or file a new JIRA?). On Thu, Jun 30, 2011 at 6:33 PM, Evert Lammerts evert.lamme...@sara.nl wrote: Hi list, How does the NN place blocks on the disks within a single node? Does it spread adjacent blocks of a single file horizontally over the disks? For example, let's say I have four DNs and each has 4 disks (and forget about replication). If I copy a file consisting of 16 blocks of 128MB each to the cluster, will each disk have exactly one block of the file? If I run some job over this file with its sixteen blocks this is important, since the cluster would use its maximum I/O capabilities. This leads me to another question (which might be better off on mapred-user). Does the JT schedule its tasks to maximally use I/O capabilities? Would it try to process blocks that reside on a disk that is not currently being read from or written to? Or does it just use a randomized strategy? Cheers, Evert -- Harsh J
Local block placement policy, request
Is there a way to send a request to the namenode to replicate block(s) to a specific DataNode? If not, what would be a way to do this? -Thanks
Re: Local block placement policy, request
Hey Jason, There are some non-public APIs to do this -- have a look at how the Balancer works; the dispatch() function is the guts of what you're looking for. It might be nice to expose this functionality as a limited-private evolving API. In general, though, keep in mind that whenever you write data, you'll get a local copy first if the writer is in the cluster. That's how HBase gets locality for most of its accesses. -Todd -- Todd Lipcon Software Engineer, Cloudera
Re: Block placement in HDFS
Hi Dennis, There were some discussions on this topic earlier: http://issues.apache.org/jira/browse/HADOOP-3799 Do you have any specific use-case for this feature? thanks, dhruba
Re: Block placement in HDFS
Fyi - Owen is referring to: https://issues.apache.org/jira/browse/HADOOP-2559
Re: Block placement in HDFS
Hi All, I try to divide some data into partitions explicitly (like the regions of HBase). I wonder whether the following way of doing it is the best method. For example, assuming a block size of 64MB, is the file portion corresponding to 0~63MB allocated to the first block? I have three questions: Is the above method valid? Is it the best method? Is there an alternative method? Thanks in advance. -- Hyunsik Choi Database Information Systems Group Dept. of Computer Science Engineering, Korea University On Mon, 2008-11-24 at 20:44 -0800, Mahadev Konar wrote: Hi Dennis, I don't think that is possible to do. The block placement is determined by HDFS internally (which is local, rack local and off rack). mahadev On 11/24/08 6:59 PM, dennis81 [EMAIL PROTECTED] wrote: Hi everyone, I was wondering whether it is possible to control the placement of the blocks of a file in HDFS. Is it possible to instruct HDFS about which nodes will hold the block replicas? Thanks!
Re: Block placement in HDFS
On Nov 24, 2008, at 8:44 PM, Mahadev Konar wrote: Hi Dennis, I don't think that is possible to do. No, it is not possible. The block placement is determined by HDFS internally (which is local, rack local and off rack). Actually, it was changed in 0.17 or so to be node-local, off-rack, and a second node off rack. -- Owen
Random block placement
My understanding is that HDFS places blocks randomly. As I would expect, then, when I use hadoop fsck to look at block placements for my files, I see that some nodes have more blocks than the average. I would expect these hot spots to cause a performance hit relative to a more even placement of blocks. I'd like to experiment with non-random block placement to see if it can provide a performance improvement. Where in the code should I start looking to find the existing code for random placement? Cheers, John
Re: Random block placement
Hi John, This file should be a good starting point for you: src/hdfs/org/apache/hadoop/hdfs/server/namenode/ReplicationTargetChooser.java There have been discussions about a pluggable block placement policy: https://issues.apache.org/jira/browse/HADOOP-3799 Thanks, Lohit
Re: Question on DFS block placement and 'what is a rack' wrt DFS block placement
When you factor in colo, power, configuration, and administration costs, fewer boxes is always cheaper. I'm not expecting a 2x speedup for the extra CPU, but I'm curious what the hit is. Ted Dunning wrote: Doesn't the incremental CPU cost you as much as an entire extra box?
RE: Question on DFS block placement and 'what is a rack' wrt DFS block placement
There may still be remaining issues with this. One I am aware of is https://issues.apache.org/jira/browse/HADOOP-2677 where smaller-capacity nodes become too highly utilized to store mapred intermediate output.
Re: Question on DFS block placement and 'what is a rack' wrt DFS block placement
It isn't popular much anymore, but once upon a time, network topology for clustering was a big topic. Since then, switches have gotten pretty fast and worrying about these things has gone out of fashion a bit, other than something on the level of the current rack-aware locality in Hadoop. With 4 NICs, you could replay some history, however, by building what amounts to a 4-dimensional hyper-torus. I *think* you could do pretty well by having four parallel two-level switch networks and assigning boxes to second-level switches according to a systematic pattern so that you would have local access to much of the cluster. As a simple example, suppose that you have 16 machines M1 through M16, each with two interfaces. You would have two top-level switches T1 and T2, and each of those would have four second-level switches: S1.1 ... S1.4 connected to T1 and S2.1 ... S2.4 connected to T2. The machine connectivity on each interface would look like this:

Machine   eth0   eth1
M1        S1.1   S2.1
M2        S1.1   S2.2
M3        S1.1   S2.3
M4        S1.1   S2.4
M5        S1.2   S2.1
M6        S1.2   S2.2
M7        S1.2   S2.3
M8        S1.2   S2.4
M9        S1.3   S2.1
M10       S1.3   S2.2
M11       S1.3   S2.3
M12       S1.3   S2.4
M13       S1.4   S2.1
M14       S1.4   S2.2
M15       S1.4   S2.3
M16       S1.4   S2.4

With this arrangement machine M1 is on a local switch with M2, M3, M4, M5, M9, and M13, which is twice as many machines as would be local if only one interface were used. With four interfaces and four machines on a local switch, the entire cluster is local and T1 and T2 should see no traffic. In a larger cluster, you get the same 16x benefit in locality. The cost is that your ops guys will kill you if you suggest something this elaborate. The wiring between racks alone will make this a nightmare. On 2/12/08 3:01 PM, Jason Venner [EMAIL PROTECTED] wrote: In terms of more exotic situations we were discussing having 4 NICs: 1 for the local subset, 1 for a pair of local subsets, 1 for another pair of local subsets, and 1 for the backbone.
Re: Question on DFS block placement and 'what is a rack' wrt DFS block placement
Doesn't the incremental CPU cost you as much as an entire extra box? On 2/12/08 12:19 PM, Colin Evans [EMAIL PROTECTED] wrote: The big question for me is how well a dual-CPU 4-core (8 cores per box) configuration will do. Has anyone tried out this configuration with Intel or AMD CPUs? Is the memory throughput sufficient?
Re: Question on DFS block placement and 'what is a rack' wrt DFS block placement
I would concur that it is much better to have sufficient storage in the compute farm for DFS files to be local to the compute tasks. Also, a 16-disk machine typically costs a good bit more than a 6-disk machine + 10 disks because you usually require a second chassis; Sun's Thumper would be an interesting counter-example of this. I have found (in my limited experience) that you want as many disk controllers as you can get and that you want the disk as close to your compute power as possible. For me, that means my ideal machine is a moderate CPU or two attached to 1-3 TB of storage. My smallest machines have a slow CPU with two SATA drives (could be 2 x 500GB, but mostly are 500GB + 73GB for historical reasons). These machines can be had for $500 second-hand and $1000 new from reputable vendors. My larger machines have 6 disks and dual Xeons, but cost about $3-4K, only have about twice the net Hadoop throughput, and take up twice the rack space. I would *much* rather have 6 times as many of the little boxes.
Re: Question on DFS block placement and 'what is a rack' wrt DFS block placement
Jason Venner wrote: Is disk arm contention (seek) a problem in a 6-disk configuration, as most likely all of the disks would be serving /local/ and /dfs/? It should not be. MapReduce I/O is sequential, in chunks large enough that seeks should not dominate. Doug
Re: Question on DFS block placement and 'what is a rack' wrt DFS block placement
Why not downgrade the CPU power and increase the number of chassis to get more disks (and controllers and network interfaces)? On 2/12/08 12:53 PM, Jason Venner [EMAIL PROTECTED] wrote: We have 3 types of machines we can get: 2-disk, 6-disk, and 16-disk machines. *They all have 4 dual-core CPUs.*
Re: Question on DFS block placement and 'what is a rack' wrt DFS block placement
Jason Venner wrote: We have 3 types of machines we can get: 2-disk, 6-disk, and 16-disk machines. They all have 4 dual-core CPUs. The 2-disk machines have about 1 TB, the 6-disk about 3TB, and the 16-disk about 8TB. The 16-disk machines have about 25% slower CPUs than the 2/6-disk machines. We handle a lot of bulky data, and don't think we can fit it all on the 3TB machines if those are our sole compute/DFS nodes. Your performance will be better if you buy enough of the 6-disk nodes to hold all your data than if you intermix 16-disk nodes. Are the 16-disk nodes considerably cheaper per byte stored than the 6-disk boxes? From my reading, I conjecture that an ideal configuration would be 1 local disk per CPU for local data/reducing, and some number of separate disks for DFS. Is this an accurate assessment? DFS storage is typically local on compute nodes. Doug
Re: Question on DFS block placement and 'what is a rack' wrt DFS block placement
We run Intel CPUs. The newer northbridges seem to have more memory bandwidth than the older ones. We have a mix of special-purpose assembly we call from Java, so we are locked into Intel right now. The performance benchmarks I have seen suggest Java runs about 30% faster on the AMDs due to the higher memory bandwidth. -- Jason Venner Attributor - Publish with Confidence http://www.attributor.com/ Attributor is hiring Hadoop Wranglers, contact if interested
Re: Question on DFS block placement and 'what is a rack' wrt DFS block placement
Because of acquiring servers of different capacities at different times, we have 2 servers with 1TB of disk each and 11 servers with ~300GB each. The 1TB servers tend to be under-utilized by HDFS given their capacity. This makes sense, as block replicas need to be relatively evenly distributed across the cluster in order to allow tasks to be run close to data. For our next cluster, we're going with uniform disk, CPU, and memory configurations. The big question for me is how well a dual-CPU 4-core (8 cores per box) configuration will do. Has anyone tried out this configuration with Intel or AMD CPUs? Is the memory throughput sufficient?
Re: Question on DFS block placement and 'what is a rack' wrt DFS block placement
I have had issues with machines that are highly disparate in terms of disk space. I expect that some of those issues have been mitigated in recent releases. On 2/12/08 11:51 AM, Jason Venner [EMAIL PROTECTED] wrote: We are starting to build larger clusters, and want to better understand how to configure the network topology. Up to now we have just been setting up a private VLAN for the small clusters. We have been thinking about the following machine configurations: compute nodes with a number of spindles and medium disk, that also serve DFS; and, for every 4-8 of the above, one compute node with a large number of spindles and a large number of disks, to bulk out the DFS capacity. We are wondering what the best practices are for network topology in clusters that are built out of the above building blocks. We can readily have 2 or 4 network cards in each node.
Re: Question on DFS block placement and 'what is a rack' wrt DFS block placement
We are currently running 15.3, and hope to move to 16.1 when it comes out... Were the heterogeneous disk space issues fixed in 15.3? Ted Dunning wrote: I have had issues with machines that are highly disparate in terms of disk space. I expect that some of those issues have been mitigated in recent releases.