Re: hadoop learning
Hi Rishabh,

I didn't know anything about Hadoop a few months ago, and I started from the very beginning. I don't suggest starting with the online documentation, which is fragmented, incomplete and sometimes not even up to date. Starting by using Hadoop directly is also the fastest way to frustration and may just lead you to abandon the technology.

I can suggest two books I started with. They were quite helpful for someone who didn't even know what MapReduce is, and they provide many examples and use cases (especially the first one):

- O'Reilly - Hadoop: The Definitive Guide, 3rd Edition. This is quite old but, apart from the coding part, it explains quite well what Hadoop is, what it does and how it works. It is mainly about old versions of Hadoop, but I believe it's something you should know, not least because most articles online still use the pre-YARN terminology.
- Addison-Wesley Professional - Apache Hadoop YARN: Moving beyond MapReduce and Batch Processing with Apache Hadoop 2. This is what I used to really understand the new Hadoop architecture and terminology. Sometimes it gives too many details, but better more than less. It also has a couple of chapters about installing Hadoop.

Good luck
Fabio

On Sat, Feb 21, 2015 at 3:33 PM, Ted Yu yuzhih...@gmail.com wrote:

Rishabh: You can start with: http://wiki.apache.org/hadoop/HowToContribute

There are several components: common, hdfs, YARN, mapreduce, ... Which ones are you interested in?

Cheers
Re: hadoop learning
I have been learning Hadoop and trying to implement a Hadoop ecosystem for a POC for the last month or so, and I think the best way to learn is by doing. Hadoop as a concept has many implementations, and I picked the Hortonworks Sandbox for learning. It has helped me gauge some of the concepts and gain some practical understanding as well.

Happy learning

Bhupendra Gupta

On 21-Feb-2015, at 1:39 pm, Rishabh Agrawal ss.rishab...@gmail.com wrote:

Hello,

Please tell me where I can learn the concepts of Big Data and Hadoop from scratch. Please provide some links online.

Rishabh Agrawal
Re: hadoop learning
Rishabh: You can start with: http://wiki.apache.org/hadoop/HowToContribute

There are several components: common, hdfs, YARN, mapreduce, ... Which ones are you interested in?

Cheers

On Sat, Feb 21, 2015 at 12:18 AM, Bhupendra Gupta bhupendra1...@gmail.com wrote:

I have been learning Hadoop and trying to implement a Hadoop ecosystem for a POC for the last month or so, and I think the best way to learn is by doing. Hadoop as a concept has many implementations, and I picked the Hortonworks Sandbox for learning. It has helped me gauge some of the concepts and gain some practical understanding as well.

Happy learning

Bhupendra Gupta
Re: Hadoop Learning Environment
Hi Jay,

I followed the steps you described and got the following error. Any idea?

vagrant up
creating provisioner directive for running tests
Bringing machine 'bigtop1' up with 'virtualbox' provider...
==> bigtop1: Box 'puppetlab-centos-64-nocm' could not be found. Attempting to find and install...
    bigtop1: Box Provider: virtualbox
    bigtop1: Box Version: >= 0
==> bigtop1: Adding box 'puppetlab-centos-64-nocm' (v0) for provider: virtualbox
    bigtop1: Downloading: http://puppet-vagrant-boxes.puppetlabs.com/centos-64-x64-vbox4210-nocm.box
==> bigtop1: Successfully added box 'puppetlab-centos-64-nocm' (v0) for 'virtualbox'!
There are errors in the configuration of this machine. Please fix the following errors and try again:

vm:
* The 'hostmanager' provisioner could not be found.

Thanks
Jim

On Nov 4, 2014, at 6:36 PM, jay vyas jayunit100.apa...@gmail.com wrote:

Hi Daemeon,

Actually, for most folks who want to actually use a Hadoop cluster, I would think setting up Bigtop is super easy! If you have issues with it, ping me and I can help you get started. Also, we have Docker containers, so you don't even *need* a VM to run a 4 or 5 node Hadoop cluster.

install vagrant
install VirtualBox
git clone https://github.com/apache/bigtop
cd bigtop/bigtop-deploy/vm/vagrant-puppet
vagrant up

Then run vagrant destroy when you're done.

To me this is easier than manually downloading an appliance, picking memory, starting the VirtualBox GUI, loading the appliance, etc. It is also easy to turn the simple single-node Bigtop VM into a multi-node one, just by modifying the Vagrantfile.
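The 'hostmanager' provisioner error reported above usually means Vagrant cannot find a plugin that the Vagrantfile expects. A minimal sketch of a fix, assuming the provisioner in question comes from the vagrant-hostmanager plugin (the error message itself does not say so, and the helper name plugin_missing is made up for this example):

```shell
#!/bin/sh
# Sketch: install the vagrant-hostmanager plugin only if it is missing.
# Assumption: the 'hostmanager' provisioner named in the error is provided
# by the vagrant-hostmanager plugin.

# Reads `vagrant plugin list` output on stdin.
# Succeeds (exit 0) when no line starts with the plugin name.
plugin_missing() {
    ! grep -q '^vagrant-hostmanager'
}

if command -v vagrant >/dev/null 2>&1; then
    if vagrant plugin list | plugin_missing; then
        vagrant plugin install vagrant-hostmanager
    fi
else
    echo "vagrant not found on PATH; install Vagrant first" >&2
fi
```

After the plugin is installed, running vagrant up again from the same directory should get past the configuration check, provided nothing else is missing.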
Re: Hadoop Learning Environment
Hi Tim,

I'd suggest using Apache Bigtop for this. Bigtop integrates the Hadoop ecosystem into a single upstream distribution, packages everything, and curates smoke tests plus Vagrant and Docker recipes for deployment. We also curate a blueprint Hadoop application (BigPetStore), which you can easily build yourself and run to generate, process, and visualize data using the big data ecosystem. You can also easily deploy Bigtop onto EC2 if you want to pay for it.

On Tue, Nov 4, 2014 at 2:28 PM, Tim Dunphy bluethu...@gmail.com wrote:

Hey all,

I want to set up an environment where I can teach myself Hadoop. Usually the way I handle this is to grab a machine off the Amazon free tier and set up whatever software I want. However, I realize that Hadoop is a memory-intensive, big data solution. So what I'm wondering is: would a t2.micro instance be sufficient for setting up a cluster of Hadoop nodes with the intention of learning it? To keep things running longer in the free tier, I would either set up however many nodes I want and keep them stopped when I'm not actively using them, or just set up a few nodes with a few different accounts (with a different Gmail address for each one; easy enough to do).

Failing that, what are some other free/cheap solutions for setting up a Hadoop learning environment?

Thanks,
Tim

--
GPG me!! gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B

--
jay vyas
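Tim's t2.micro question can be approached with some back-of-the-envelope arithmetic. The sketch below is only an estimate under stated assumptions: a t2.micro has 1 GiB of RAM, while the four daemons, the 1000 MB heap per daemon, and the 512 MB OS reserve are illustrative figures to adjust for your own configuration. The function name fits_in_ram is made up for this example.

```shell
#!/bin/sh
# Back-of-the-envelope check: do the daemons of a single-node cluster fit in RAM?
# Every figure passed in below is an illustrative assumption, not a measurement.

# fits_in_ram RAM_MB DAEMON_COUNT HEAP_MB_PER_DAEMON OS_RESERVE_MB
# Prints "fits" or "too small" together with the estimated requirement.
fits_in_ram() {
    ram_mb=$1; daemons=$2; heap_mb=$3; reserve_mb=$4
    needed_mb=$(( daemons * heap_mb + reserve_mb ))
    if [ "$needed_mb" -le "$ram_mb" ]; then
        echo "fits: need ${needed_mb} MB of ${ram_mb} MB"
    else
        echo "too small: need ${needed_mb} MB of ${ram_mb} MB"
    fi
}

# t2.micro: 1 GiB of RAM. Four daemons (NameNode, DataNode, ResourceManager,
# NodeManager) at an assumed 1000 MB heap each, plus 512 MB for the OS.
fits_in_ram 1024 4 1000 512

# The same workload on a hypothetical VM with 8 GB of RAM.
fits_in_ram 8192 4 1000 512
```

Under these assumptions the first call reports "too small" and the second reports "fits", which is consistent with the advice elsewhere in the thread to use a local VM rather than the smallest free-tier instance.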
Re: Hadoop Learning Environment
Tim,

Download the Sandbox from http://hortonworks.com. You will have everything you need in a small VM instance that will run on your home desktop.

Thank you!

Sincerely,
Leonid Fedotov
Systems Architect - Professional Services
lfedo...@hortonworks.com
office: +1 855 846 7866 ext 292
mobile: +1 650 430 1673

On Tue, Nov 4, 2014 at 11:28 AM, Tim Dunphy bluethu...@gmail.com wrote:

Hey all,

I want to set up an environment where I can teach myself Hadoop. Usually the way I handle this is to grab a machine off the Amazon free tier and set up whatever software I want. However, I realize that Hadoop is a memory-intensive, big data solution. So what I'm wondering is: would a t2.micro instance be sufficient for setting up a cluster of Hadoop nodes with the intention of learning it? To keep things running longer in the free tier, I would either set up however many nodes I want and keep them stopped when I'm not actively using them, or just set up a few nodes with a few different accounts (with a different Gmail address for each one; easy enough to do).

Failing that, what are some other free/cheap solutions for setting up a Hadoop learning environment?

Thanks,
Tim

--
GPG me!! gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B

--
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.
Re: Hadoop Learning Environment
Hello Tim,

Hortonworks and Cloudera both offer VMs (including for VirtualBox, which is free) that you can pull down to play with, if you're just looking for something small to get you started. I'm partial to the Hortonworks one myself.

Hope that helps.

JC

On Nov 4, 2014, at 2:28 PM, Tim Dunphy bluethu...@gmail.com wrote:

Hey all,

I want to set up an environment where I can teach myself Hadoop. Usually the way I handle this is to grab a machine off the Amazon free tier and set up whatever software I want. However, I realize that Hadoop is a memory-intensive, big data solution. So what I'm wondering is: would a t2.micro instance be sufficient for setting up a cluster of Hadoop nodes with the intention of learning it? To keep things running longer in the free tier, I would either set up however many nodes I want and keep them stopped when I'm not actively using them, or just set up a few nodes with a few different accounts (with a different Gmail address for each one; easy enough to do).

Failing that, what are some other free/cheap solutions for setting up a Hadoop learning environment?

Thanks,
Tim

--
GPG me!! gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B
Re: Hadoop Learning Environment
Or, on your local laptop or desktop, you can set up the environment using a VM and a VM image of Hadoop and related components. I wrote instructions some time back here: https://www.linkedin.com/today/post/article/20140924133831-2560863-new-to-hadoop-and-want-to-setup-dev-environment

On Nov 5, 2014 2:25 AM, Jim Colestock j...@ramblingredneck.com wrote:

Hello Tim,

Hortonworks and Cloudera both offer VMs (including for VirtualBox, which is free) that you can pull down to play with, if you're just looking for something small to get you started. I'm partial to the Hortonworks one myself.

Hope that helps.

JC
Re: Hadoop Learning Environment
You can try the Pivotal VM as well.

http://pivotalhd.docs.pivotal.io/tutorial/getting-started/pivotalhd-vm.html

On Tue, Nov 4, 2014 at 3:13 PM, Leonid Fedotov lfedo...@hortonworks.com wrote:

Tim,

Download the Sandbox from http://hortonworks.com. You will have everything you need in a small VM instance that will run on your home desktop.

Thank you!

Sincerely,
Leonid Fedotov
Systems Architect - Professional Services
lfedo...@hortonworks.com
office: +1 855 846 7866 ext 292
mobile: +1 650 430 1673
Re: Hadoop Learning Environment
What you want as a sandbox depends on what you are trying to learn. If you are trying to learn to code in e.g. Pig Latin, Sqoop, or similar, all of the suggestions (perhaps excluding Bigtop due to its setup complexities) are great.

A laptop? Perhaps, but laptops are really kind of infuriatingly slow (because of the hardware: you pay a price for a 30-45 watt average heating bill). A laptop is an OK place to start if it is e.g. an i5 or i7 with lots of memory. What do you think of the thought that you will pretty quickly graduate to wanting a smallish desktop for your sandbox?

A simple, single-node Hadoop instance will let you learn many things. The next level of complexity comes when you are dealing with data whose processing needs to be split up, so you can learn how to split data in the map phase, combine the splits via reduce jobs, etc. For that, you could get a Windows desktop box or e.g. RedHat/CentOS and use virtualization: something like a 4-core i5 with 32 GB of memory, running 3 or, for some things, 4 VMs. You could load e.g. Hortonworks into each of the VMs and practice setting up a 3- or 4-node cluster. Throw in 2-3 1 TB drives off eBay and you can have a lot of learning.

...
“The race is not to the swift, nor the battle to the strong, but to those who can see it coming and jump aside.” - Hunter Thompson
Daemeon

On Tue, Nov 4, 2014 at 1:24 PM, oscar sumano osum...@gmail.com wrote:

You can try the Pivotal VM as well.

http://pivotalhd.docs.pivotal.io/tutorial/getting-started/pivotalhd-vm.html
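The map and reduce flow described above can be tried on a single machine before any cluster exists. The sketch below is only an analogy built from standard Unix tools, not Hadoop itself: the function names map_words, shuffle and reduce_counts are made up for this example, and each stage stands in for what Hadoop would distribute across nodes.

```shell
#!/bin/sh
# Word count as a three-stage pipeline that mirrors map, shuffle and reduce.
# Everything runs locally; Hadoop would spread the same stages over many nodes.

# map: emit one lowercase word per line (each line acts as a key).
map_words() {
    tr -cs '[:alnum:]' '\n' | tr '[:upper:]' '[:lower:]' | grep -v '^$'
}

# shuffle: bring identical keys next to each other.
shuffle() {
    sort
}

# reduce: collapse each run of identical keys into "word count".
reduce_counts() {
    uniq -c | awk '{ print $2, $1 }'
}

printf 'Hadoop splits data\nHadoop reduces splits\n' | map_words | shuffle | reduce_counts
```

The pipeline prints one line per distinct word followed by its count, for example "hadoop 2". Once the idea is clear, the same three stages map onto a real word-count job on a single-node Hadoop instance.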
Re: Hadoop Learning Environment
Hi Daemeon,

Actually, for most folks who want to actually use a Hadoop cluster, I would think setting up Bigtop is super easy! If you have issues with it, ping me and I can help you get started. Also, we have Docker containers, so you don't even *need* a VM to run a 4 or 5 node Hadoop cluster.

install vagrant
install VirtualBox
git clone https://github.com/apache/bigtop
cd bigtop/bigtop-deploy/vm/vagrant-puppet
vagrant up

Then run vagrant destroy when you're done.

To me this is easier than manually downloading an appliance, picking memory, starting the VirtualBox GUI, loading the appliance, etc. It is also easy to turn the simple single-node Bigtop VM into a multi-node one, just by modifying the Vagrantfile.

On Tue, Nov 4, 2014 at 5:32 PM, daemeon reiydelle daeme...@gmail.com wrote:

What you want as a sandbox depends on what you are trying to learn. If you are trying to learn to code in e.g. Pig Latin, Sqoop, or similar, all of the suggestions (perhaps excluding Bigtop due to its setup complexities) are great.

A laptop? Perhaps, but laptops are really kind of infuriatingly slow (because of the hardware: you pay a price for a 30-45 watt average heating bill). A laptop is an OK place to start if it is e.g. an i5 or i7 with lots of memory. What do you think of the thought that you will pretty quickly graduate to wanting a smallish desktop for your sandbox?

A simple, single-node Hadoop instance will let you learn many things. The next level of complexity comes when you are dealing with data whose processing needs to be split up, so you can learn how to split data in the map phase, combine the splits via reduce jobs, etc. For that, you could get a Windows desktop box or e.g. RedHat/CentOS and use virtualization: something like a 4-core i5 with 32 GB of memory, running 3 or, for some things, 4 VMs. You could load e.g. Hortonworks into each of the VMs and practice setting up a 3- or 4-node cluster. Throw in 2-3 1 TB drives off eBay and you can have a lot of learning.

...
“The race is not to the swift, nor the battle to the strong, but to those who can see it coming and jump aside.” - Hunter Thompson
Daemeon

--
jay vyas
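The Bigtop steps listed above can be collected into one small script. This is a sketch rather than an official Bigtop installer: it assumes Vagrant, VirtualBox and git are already installed, it uses the repository path exactly as given in the thread (the layout may have changed since), and DRY_RUN, run and bigtop_up are names invented for this example.

```shell
#!/bin/sh
# Sketch of the Bigtop bring-up steps from the thread.
# DRY_RUN defaults to 1, so the script only prints what it would do;
# set DRY_RUN=0 in the environment to actually run the commands.
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, otherwise execute it in this shell.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Clone Bigtop, enter the Vagrant recipe directory, and start the VM.
# Each step runs only if the previous one succeeded.
bigtop_up() {
    run git clone https://github.com/apache/bigtop &&
    run cd bigtop/bigtop-deploy/vm/vagrant-puppet &&
    run vagrant up
}

bigtop_up
```

In dry-run mode the script prints the three commands prefixed with "+". When you are finished with the cluster, run vagrant destroy from the same directory, as the message suggests.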
Re: Hadoop Learning Environment
Try Docker! http://ferry.opencore.io/en/latest/examples/hadoop.html

On Tue, Nov 4, 2014 at 6:36 PM, jay vyas jayunit100.apa...@gmail.com wrote:

Hi Daemeon,

Actually, for most folks who want to actually use a Hadoop cluster, I would think setting up Bigtop is super easy! If you have issues with it, ping me and I can help you get started. Also, we have Docker containers, so you don't even *need* a VM to run a 4 or 5 node Hadoop cluster.

install vagrant
install VirtualBox
git clone https://github.com/apache/bigtop
cd bigtop/bigtop-deploy/vm/vagrant-puppet
vagrant up

Then run vagrant destroy when you're done.

To me this is easier than manually downloading an appliance, picking memory, starting the VirtualBox GUI, loading the appliance, etc. It is also easy to turn the simple single-node Bigtop VM into a multi-node one, just by modifying the Vagrantfile.

--
jay vyas