Sandbox is just meant to be a learning environment, I guess, to see what's
possible, how things can be connected. The real distribution will have much
higher performance and is the one you need when you want to investigate
performance issues. The only real drawback of the real distributions is
that
Okay, thank you D, I will start playing around with the Sandbox version.
On Thu, Mar 13, 2014 at 5:55 AM, Dieter De Witte drdwi...@gmail.com wrote:
Sandbox is just meant to be a learning environment, I guess, to see what's
possible, how things can be connected. The real distribution will
Hello Team,
I am starting out with the Hadoop ecosystem and first wanted to learn, based on
my use case, whether Hadoop is the right tool for me.
I have only structured data, and my goal is to save this data into Hadoop
and take advantage of the replication factor. I am using Microsoft tools for
doing analysis and it
I would suggest that, given the level of detail that you are looking for
and the fundamental nature of your questions, you should get hold of books or
online documentation. Basically, some reading/research.
Latest edition of
http://www.amazon.com/Hadoop-Definitive-Guide-Tom-White/dp/1449311520 is
Thank you Shahab but it would be really nice if I can get some input on my
initial question as it would really help.
On Wed, Mar 12, 2014 at 3:11 PM, Shahab Yunus shahab.yu...@gmail.com wrote:
I would suggest that, given the level of detail that you are looking for
and the fundamental nature of
Assuming that the following is your initial question:
*My question here is what benefits the YARN architecture gives me in terms of
analysis that my Microsoft, Netezza, or Tableau products are not giving me.
I am just trying to understand the value of introducing Hadoop into my
architecture in terms of
Hi,
1) HDFS is just a file system; it hides the fact that it is distributed.
2) MapReduce is the most low-level analytics tool, I think. You just
specify an input, and in your map and reduce functions you define some
functionality to deal with this input. No need for HBase,... although they
can be
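The map and reduce idea in point 2 can be sketched in plain Python, I think. This is only an illustration of the programming model, not the Hadoop API: the names map_fn, reduce_fn and run_job are made up for this example, and everything runs in a single process instead of on a cluster.

```python
from collections import defaultdict


def map_fn(line):
    # Map step: emit one (key, value) pair per word in the input line.
    for word in line.lower().split():
        yield word, 1


def reduce_fn(key, values):
    # Reduce step: combine all values that were emitted for one key.
    return key, sum(values)


def run_job(lines):
    # Shuffle step: group the mapped values by key. A real framework
    # would do this across machines; here it is a local dictionary.
    groups = defaultdict(list)
    for line in lines:
        for key, value in map_fn(line):
            groups[key].append(value)
    return dict(reduce_fn(key, values) for key, values in groups.items())


if __name__ == "__main__":
    print(run_job(["hadoop stores data", "hadoop processes data"]))
```

The point is that you only write the two small functions; grouping the intermediate pairs and distributing the work is what the framework takes care of.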
Thanks D, that certainly answers my question.
I was just taking a quick look at Hortonworks HDP vs Hortonworks Sandbox. Do
you know of any benefits of using Sandbox as opposed to Hortonworks Data
Platform?
On Wed, Mar 12, 2014 at 4:02 PM, Dieter De Witte drdwi...@gmail.com wrote:
Hi,
1) HDFS
Hey D,
Regarding your point 5: For a proof of concept I would use a ready-made
virtual machine from one of the 3 big vendors - Cloudera, MapR and Hortonworks.
I want to understand how this virtual setup would work and how many master
and slave nodes I can have in this virtual setup and in general