Hi, this is indeed a good way to explain; most of the improvements have already been discussed. Waiting for the sequel of this comic.
Regards,
Abhishek

On Wed, Nov 30, 2011 at 1:55 PM, maneesh varshney <mvarsh...@gmail.com> wrote:

> Hi Matthew,
>
> I agree with both you and Prashant. The strip needs to be modified to
> explain that these can be default values that can be optionally
> overridden (which I will fix in the next iteration).
>
> However, from the 'understanding concepts of HDFS' point of view, I
> still think that block size and replication factor are the real
> strengths of HDFS, and learners must be exposed to them so that they
> get to see how HDFS differs significantly from conventional file
> systems.
>
> On a personal note: thanks for the first part of your message :)
>
> -Maneesh
>
> On Wed, Nov 30, 2011 at 1:36 PM, GOEKE, MATTHEW (AG/1000) <matthew.go...@monsanto.com> wrote:
>
> > Maneesh,
> >
> > Firstly, I love the comic :)
> >
> > Secondly, I am inclined to agree with Prashant on this latest point.
> > While one code path could take us through the user defining
> > command-line overrides (e.g. hadoop fs -D blah -put foo bar), I think
> > it might confuse a person new to Hadoop. The most common flow would be
> > using admin-determined values from hdfs-site, and the only thing that
> > would need to change is that the conversation happens between
> > client / server and not user / client.
> >
> > Matt
> >
> > -----Original Message-----
> > From: Prashant Kommireddi [mailto:prash1...@gmail.com]
> > Sent: Wednesday, November 30, 2011 3:28 PM
> > To: common-user@hadoop.apache.org
> > Subject: Re: HDFS Explained as Comics
> >
> > Sure, it's just a case of how readers interpret it:
> >
> > 1. The client is required to specify block size and replication
> > factor each time.
> > 2. The client does not need to worry about it, since an admin has set
> > the properties in the default configuration files.
> >
> > A client would not be allowed to override the default configs if they
> > are set final (well, there are ways to get around that too, as you
> > suggest, by using create(....)
> > :)
> >
> > The information is great and helpful. Just want to make sure a
> > beginner who wants to write a "WordCount" in MapReduce does not worry
> > about specifying block size and replication factor in his code.
> >
> > Thanks,
> > Prashant
> >
> > On Wed, Nov 30, 2011 at 1:18 PM, maneesh varshney <mvarsh...@gmail.com> wrote:
> >
> > > Hi Prashant,
> > >
> > > Others may correct me if I am wrong here...
> > >
> > > The client (org.apache.hadoop.hdfs.DFSClient) has knowledge of the
> > > block size and replication factor. In the source code, I see the
> > > following in the DFSClient constructor:
> > >
> > >   defaultBlockSize = conf.getLong("dfs.block.size", DEFAULT_BLOCK_SIZE);
> > >   defaultReplication = (short) conf.getInt("dfs.replication", 3);
> > >
> > > My understanding is that the client considers the following chain
> > > for the values:
> > > 1. Manual values (the long-form constructor, when a user provides
> > > these values)
> > > 2. Configuration file values (these are cluster-level defaults:
> > > dfs.block.size and dfs.replication)
> > > 3. Finally, the hardcoded values (DEFAULT_BLOCK_SIZE and 3)
> > >
> > > Moreover, in org.apache.hadoop.hdfs.protocol.ClientProtocol, the
> > > API to create a file is
> > >
> > >   void create(...., short replication, long blocksize);
> > >
> > > I presume this means that the client already has knowledge of these
> > > values and passes them to the NameNode when creating a new file.
> > >
> > > Hope that helps.
> > >
> > > Thanks,
> > > -Maneesh
> > >
> > > On Wed, Nov 30, 2011 at 1:04 PM, Prashant Kommireddi <prash1...@gmail.com> wrote:
> > >
> > > > Thanks Maneesh.
> > > > Quick question: does a client really need to know block size and
> > > > replication factor? A lot of times the client has no control over
> > > > these (set at the cluster level).
> > > >
> > > > -Prashant Kommireddi
> > > >
> > > > On Wed, Nov 30, 2011 at 12:51 PM, Dejan Menges <dejan.men...@gmail.com> wrote:
> > > >
> > > > > Hi Maneesh,
> > > > >
> > > > > Thanks a lot for this! Just distributed it over the team, and
> > > > > the comments are great :)
> > > > >
> > > > > Best regards,
> > > > > Dejan
> > > > >
> > > > > On Wed, Nov 30, 2011 at 9:28 PM, maneesh varshney <mvarsh...@gmail.com> wrote:
> > > > >
> > > > > > For your reading pleasure!
> > > > > >
> > > > > > PDF (3.3 MB) uploaded at (the mailing list has a cap of 1 MB
> > > > > > on attachments):
> > > > > >
> > > > > > https://docs.google.com/open?id=0B-zw6KHOtbT4MmRkZWJjYzEtYjI3Ni00NTFjLWE0OGItYTU5OGMxYjc0N2M1
> > > > > >
> > > > > > I would appreciate it if you could spare some time to peruse
> > > > > > this little experiment of mine to use comics as a medium to
> > > > > > explain computer science topics. This particular issue
> > > > > > explains the protocols and internals of HDFS.
> > > > > >
> > > > > > I am eager to hear your opinions on the usefulness of this
> > > > > > visual medium to teach complex protocols and algorithms.
> > > > > >
> > > > > > [My personal motivations: I have always found text
> > > > > > descriptions to be too verbose, as a lot of effort is spent
> > > > > > putting the concepts in the proper time-space context (which
> > > > > > can be easily avoided in a visual medium); sequence diagrams
> > > > > > are unwieldy for non-trivial protocols, and they do not
> > > > > > explain concepts; and finally, animations/videos happen "too
> > > > > > fast" and do not offer a self-paced learning experience.]
> > > > > > All forms of criticism, comments (and encouragement)
> > > > > > welcome :)
> > > > > >
> > > > > > Thanks,
> > > > > > Maneesh
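[Editor's note: for readers who want to try the admin-side flow Matt describes, cluster-wide defaults live in hdfs-site.xml. A minimal sketch, using the two property names discussed in the thread; the 128 MB value is purely illustrative, not a recommendation:]

```xml
<!-- hdfs-site.xml: cluster-wide defaults that the HDFS client picks up. -->
<configuration>
  <property>
    <name>dfs.block.size</name>
    <value>134217728</value> <!-- 128 MB, illustrative -->
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
    <!-- Marking a property final forbids per-client overrides,
         as Prashant notes above. -->
    <final>true</final>
  </property>
</configuration>
```

A per-command override, as in Matt's example, would look like `hadoop fs -D dfs.block.size=67108864 -put foo bar`; it is ignored for properties marked final.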
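[Editor's note: Maneesh's three-level fallback chain can be sketched in plain Java with no Hadoop dependency. The class and method names and the 64 MB default below are illustrative stand-ins, not the actual DFSClient source; only the property name dfs.block.size and the precedence order come from the thread.]

```java
import java.util.Properties;

// Simplified model of how a client like DFSClient resolves block size:
// explicit caller value -> configuration value -> hardcoded default.
public class BlockSizeResolution {
    static final long DEFAULT_BLOCK_SIZE = 64L * 1024 * 1024; // illustrative

    static long resolveBlockSize(Long explicit, Properties conf) {
        if (explicit != null) {
            return explicit;                 // 1. manual value wins
        }
        String fromConf = conf.getProperty("dfs.block.size");
        if (fromConf != null) {
            return Long.parseLong(fromConf); // 2. cluster-level default
        }
        return DEFAULT_BLOCK_SIZE;           // 3. hardcoded fallback
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty("dfs.block.size", "134217728"); // 128 MB "hdfs-site" value

        System.out.println(resolveBlockSize(null, conf));              // 134217728
        System.out.println(resolveBlockSize(536870912L, conf));        // 536870912
        System.out.println(resolveBlockSize(null, new Properties()));  // 67108864
    }
}
```

The same precedence applies to dfs.replication; the client resolves both and passes them to the NameNode in the create(...) call, which is why the signature carries a replication and a blocksize parameter.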