
With all due respect to the senior members of this site, I first want to 
congratulate Lokesh on his interest in Hadoop. I wonder how many fresh 
graduates are interested in this technology. I guess not many. So we should 
welcome Lokesh to the Hadoop world.

I agree with the seniors. It is good and important to know the real-world 
problems.

But coming to your question: as far as I know, if you want to learn and shine 
in Hadoop, you must know the following.
1) Linux
2) Java
3) SQL

Seniors may correct me, or add to or modify the list above.


 From: Sanjay Subramanian <sanjay.subraman...@wizecommerce.com>
To: "user@hadoop.apache.org" <user@hadoop.apache.org>; "ch...@embree.us" 
Sent: Thursday, May 23, 2013 11:03 PM
Subject: Re: Where to begin from??

I agree with Chris. Don't worry about what the technology is called: Hadoop, 
Bigtable, Lucene, Hive. Model the problem and see what the solution could 
be. That's very important.

And Lokesh, please don't mind. We are perhaps writing things you don't 
want to hear, but it's an important real-world perspective.

To illustrate what I mean, let me give you a few problems to think about and 
see how you would solve them.

1. Before Microsoft took over Skype, at least, it had this feature: you typed 
the name of a person and it came back with search results in milliseconds, 
often searching close to a billion names. How would you design such a search 
architecture?

2. In 2012, say 50 million users (cookie based) searched Macys.com on a sales 
weekend and say 20,000 bought $100 shoes. This year, 2013, on the same sales 
weekend, 60 million users (cookie based) are buying on the website. You want 
to give a 25% extra reward to only those cookies that were also there last 
year. So you are looking for an intersection of possibly 20,000 cookies 
between two sets of 50 million and 60 million. How would you solve this 
problem within milliseconds?

3. Last, my favorite. The Postal Services department wants to think of new 
business ideas to avoid bankruptcy. One idea I have: they have a zillion small 
delivery vans that go to each street in the country. Say I lease out the space 
to big wireless phone providers and promise them that I will mount wireless 
signal strength measurement systems on these vans and provide them data 3 
times a day. How would you devise a solution to store and analyse that data?
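
As a starting point for problems 1 and 2, here is a minimal single-machine 
sketch in Python. It is not a full architecture and not necessarily what any 
real site does. The function names (prefix_search, returning_cookies) and the 
sample data are made up for illustration. Problem 1 is treated as a prefix 
lookup over a sorted list of names using binary search, and problem 2 as a 
hash-set intersection built from the smaller set. At a billion names or tens 
of millions of cookies, you would presumably have to partition the data across 
machines, which is where the modelling starts.

```python
from bisect import bisect_left


def prefix_search(sorted_names, prefix, limit=10):
    """Return up to `limit` names that start with `prefix`.

    Assumes `sorted_names` is already sorted. Binary search finds the first
    candidate in O(log n) instead of scanning every name.
    """
    start = bisect_left(sorted_names, prefix)
    matches = []
    for name in sorted_names[start:start + limit]:
        if not name.startswith(prefix):
            break  # names are sorted, so no later name can match
        matches.append(name)
    return matches


def returning_cookies(last_year_cookies, this_year_cookies):
    """Return the cookies that appear in both years.

    Builds a hash set from last year's cookies (assumed to be the smaller
    side) and streams this year's cookies through it, so each membership
    check is O(1) on average.
    """
    known = set(last_year_cookies)
    return {cookie for cookie in this_year_cookies if cookie in known}


if __name__ == "__main__":
    # Hypothetical sample data, only to show the calls.
    names = sorted(["bob", "alice", "carol", "bobby", "alicia"])
    print(prefix_search(names, "bob"))
    print(returning_cookies(["c1", "c2", "c3"], ["c9", "c2", "c3", "c7"]))
```

Once the set is built, checking a single incoming cookie is one hash lookup, 
which is what makes a millisecond answer per user plausible. Building the set 
itself is the expensive part and would be done ahead of the sales weekend.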

I am sure if you look around in India as well, you will see a lot of 
situations where you want to solve a problem.

As Chris says, think about the problem you want to solve, then model the 
solutions and pick the best one.

On the flip side, I can tell you it will still be a few years until many banks 
and stock trading houses believe in Cassandra and HBase for OLTP, because that 
data is critical. If your timeline on Facebook does not show a photo, it's 
possibly OK, but if your 1 million deposit in a bank does not show up for days 
or suddenly vanishes, you are probably not going to take that lightly.

OK, enough rambling.

Good luck


From: Chris Embree <cemb...@gmail.com>
Reply-To: "user@hadoop.apache.org" <user@hadoop.apache.org>, "ch...@embree.us" 
Date: Thursday, May 23, 2013 7:47 PM
To: "user@hadoop.apache.org" <user@hadoop.apache.org>
Subject: Re: Where to begin from??

I'll be chastised and have mean things said about me for this. 

Get some experience in IT before you start looking at Hadoop.  My reasoning is 
this:  If you don't know how to develop real applications in a Non-Hadoop 
world, you'll struggle a lot to develop with Hadoop.

Asking what "things you need to know in compulsory" is like saying you want to 
"learn computers" -- totally worthless!  Find a problem to solve and seek to 
learn the tools you need to solve your problem.  Otherwise, your learning is 
un-applied and somewhat useless. 

Picture a recent acting school graduate asking how to direct the next Star 
Wars movie. It's almost like that.

On Thu, May 23, 2013 at 10:39 PM, Lokesh Basu <lokesh.b...@gmail.com> wrote:

>Hi all,
>I'm a computer science undergraduate and have recently started to explore 
>Hadoop. I find it very interesting and want to get involved both as a 
>contributor and developer for this open source project. I have been going 
>through many textbooks related to Hadoop and HDFS, but I still find it very 
>difficult to know where a beginner should start before writing his first 
>line of code as a contributor or developer.
>Also, please tell me what things I compulsorily need to know before I 
>dive into the depths of these things.
>Thanking you all in anticipation.
>Lokesh Chandra Basu
>B. Tech
>Computer Science and Engineering
>Indian Institute of Technology, Roorkee
>India(GMT +5hr 30min)

