Hi Jay,
I followed the steps you described and got the following error.
Any idea?
vagrant up
creating provisioner directive for running tests
Bringing machine 'bigtop1' up with 'virtualbox' provider...
==> bigtop1: Box 'puppetlab-centos-64-nocm' could not be found. Attempting to
find and install...
    bigtop1: Box Provider: virtualbox
    bigtop1: Box Version: >= 0
==> bigtop1: Adding box 'puppetlab-centos-64-nocm' (v0) for provider: virtualbox
    bigtop1: Downloading:
    http://puppet-vagrant-boxes.puppetlabs.com/centos-64-x64-vbox4210-nocm.box
==> bigtop1: Successfully added box 'puppetlab-centos-64-nocm' (v0) for
'virtualbox'!
There are errors in the configuration of this machine. Please fix
the following errors and try again:

vm:
* The 'hostmanager' provisioner could not be found.
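From a bit of searching, it looks like this error may just mean the
vagrant-hostmanager plugin isn't installed. Would this be the right fix?
(I'm only guessing from the error text:)

```shell
# Install the (apparently missing) hostmanager provisioner plugin, then retry
vagrant plugin install vagrant-hostmanager
vagrant up
```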
Thanks
Jim
On Nov 4, 2014, at 6:36 PM, jay vyas jayunit100.apa...@gmail.com wrote:
Hi Daemeon: Actually, for most folks who would want to actually use a Hadoop
cluster, I would think setting up Bigtop is super easy! If you have issues
with it, ping me and I can help you get started.
Also, we have Docker containers - so you don't even *need* a VM to run a 4- or
5-node Hadoop cluster.
install vagrant
install VirtualBox
git clone https://github.com/apache/bigtop
cd bigtop/bigtop-deploy/vm/vagrant-puppet
vagrant up
Then run vagrant destroy when you're done.
This, to me, is easier than manually downloading an appliance, picking memory,
starting the VirtualBox GUI, loading the appliance, etc... and it's also
easy to turn the simple single-node Bigtop VM into a multi-node one,
just by modifying the Vagrantfile.
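For illustration, a multi-node setup in a Vagrantfile usually boils down to a
small loop like the sketch below. This is a generic, hypothetical sketch (the
node count, hostnames, and IPs are made up), not the actual Bigtop Vagrantfile:

```ruby
# Hypothetical multi-node Vagrantfile sketch; names and IPs are illustrative.
NUM_NODES = 3

Vagrant.configure("2") do |config|
  config.vm.box = "puppetlab-centos-64-nocm"
  # Define one VM per node, each with its own hostname and private IP
  (1..NUM_NODES).each do |i|
    config.vm.define "bigtop#{i}" do |node|
      node.vm.hostname = "bigtop#{i}.vagrant"
      node.vm.network :private_network, ip: "10.10.10.#{10 + i}"
    end
  end
end
```

Bumping NUM_NODES is then all it takes to grow the cluster.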
On Tue, Nov 4, 2014 at 5:32 PM, daemeon reiydelle daeme...@gmail.com wrote:
What you want as a sandbox depends on what you are trying to learn.
If you are trying to learn to code in e.g. Pig Latin, Sqoop, or similar, all of
the suggestions (perhaps excluding Bigtop due to its setup complexity) are
great. A laptop? Perhaps, but laptops are really kind of infuriatingly slow
(because of the hardware - you pay a price for a 30-45 watt average power
budget). A laptop is an OK place to start if it is, e.g., an i5 or i7 with lots
of memory. What do you think of the thought that you will pretty quickly
graduate to wanting a smallish desktop for your sandbox?
A simple, single-node Hadoop instance will let you learn many things. The
next level of complexity comes when you are attempting to deal with data
whose processing needs to be split up, so you can learn how to split
data in map tasks, reduce the splits via reduce jobs, etc. For that, you could
get a Windows desktop box, or e.g. RedHat/CentOS, and use virtualization.
Something like a 4-core i5 with 32 GB of memory, running 3 (or for some things
4) VMs. You could load e.g. Hortonworks into each of the VMs and practice
setting up a 3- or 4-way cluster. Throw in 2-3 1 TB drives off of eBay and
you can have a lot of learning.
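Before you even have a cluster, you can get a feel for the map -> shuffle ->
reduce flow with plain Unix tools - a toy stand-in for Hadoop streaming, not
the real thing:

```shell
# map:     split each line of input into one word per line
# shuffle: sort so identical keys end up adjacent
# reduce:  count each group of identical keys (i.e. word count)
printf 'one fish two fish\nred fish blue fish\n' | tr ' ' '\n' | sort | uniq -c
```

A real cluster does the same thing, just with the map and reduce steps running
on different machines and the sort happening in the shuffle phase.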
...
“The race is not to the swift,
nor the battle to the strong,
but to those who can see it coming and jump aside.” - Hunter Thompson
Daemeon
On Tue, Nov 4, 2014 at 1:24 PM, oscar sumano osum...@gmail.com wrote:
You can try the Pivotal VM as well.
http://pivotalhd.docs.pivotal.io/tutorial/getting-started/pivotalhd-vm.html
On Tue, Nov 4, 2014 at 3:13 PM, Leonid Fedotov lfedo...@hortonworks.com
wrote:
Tim,
download the Sandbox from http://hortonworks.com
You will have everything needed in a small VM instance which will run on your
home desktop.
Thank you!
Sincerely,
Leonid Fedotov
Systems Architect - Professional Services
lfedo...@hortonworks.com
office: +1 855 846 7866 ext 292
mobile: +1 650 430 1673
On Tue, Nov 4, 2014 at 11:28 AM, Tim Dunphy bluethu...@gmail.com wrote:
Hey all,
I want to set up an environment where I can teach myself Hadoop. Usually the
way I'll handle this is to grab a machine off the Amazon free tier and set up
whatever software I want.
However, I realize that Hadoop is a memory-intensive big data solution. So
what I'm wondering is, would a t2.micro instance be sufficient for setting up
a cluster of Hadoop nodes with the intention of learning it? To keep things
running longer in the free tier, I would either set up however many nodes I
want and keep them stopped when I'm not actively using them, or just set up a
few nodes with a few different accounts (with a different Gmail address for
each one... easy enough to do).
Failing that, what are some other free/cheap solutions for setting up a
hadoop learning environment?
Thanks,
Tim
--
GPG me!!
gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B