Re: Use Cases for Structured Data

2014-03-13 Thread Dieter De Witte
Sandbox is just meant to be a learning environment, I guess, to see what's possible and how things can be connected. The real distribution will have much higher performance and is the one you need when you want to investigate performance issues. The only real drawback of the real distributions is that

Re: Use Cases for Structured Data

2014-03-13 Thread ados1...@gmail.com
Okies, thank you D, I will start playing around with the Sandbox version. On Thu, Mar 13, 2014 at 5:55 AM, Dieter De Witte drdwi...@gmail.com wrote: Sandbox is just meant to be a learning environment, I guess, to see what's possible and how things can be connected. The real distribution will

Use Cases for Structured Data

2014-03-12 Thread ados1...@gmail.com
Hello Team, I am starting off with the Hadoop ecosystem and first wanted to learn, based on my use case, whether Hadoop is the right tool for me. I have only structured data, and my goal is to save this data into Hadoop and take advantage of the replication factor. I am using Microsoft tools for doing analysis and it

Re: Use Cases for Structured Data

2014-03-12 Thread Shahab Yunus
I would suggest that, given the level of detail you are looking for and the fundamental nature of your questions, you should get hold of books or online documentation. Basically, some reading/research. Latest edition of http://www.amazon.com/Hadoop-Definitive-Guide-Tom-White/dp/1449311520 is

Re: Use Cases for Structured Data

2014-03-12 Thread ados1...@gmail.com
Thank you Shahab, but it would be really nice if I could get some input on my initial question, as it would really help. On Wed, Mar 12, 2014 at 3:11 PM, Shahab Yunus shahab.yu...@gmail.com wrote: I would suggest that, given the level of detail you are looking for and the fundamental nature of

Re: Use Cases for Structured Data

2014-03-12 Thread Shahab Yunus
Assuming that the following is your initial question: *My question here is what benefits the YARN architecture gives me in terms of analysis that my Microsoft, Netezza or Tableau products are not giving me. I am just trying to understand the value of introducing Hadoop in my architecture in terms of

Re: Use Cases for Structured Data

2014-03-12 Thread Dieter De Witte
Hi, 1) HDFS is just a file system; it hides the fact that it is distributed. 2) MapReduce is the most low-level analytics tool, I think: you just specify an input, and in your map and reduce functions you define some functionality to deal with this input. No need for HBase,... although they can be
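To make point 2 concrete, here is a minimal sketch of the map/reduce idea in plain Python. It does not use Hadoop at all; the sample records, the function names (map_record, reduce_values, run_job) and the "region,amount" layout are all assumptions made up for illustration. It only shows the shape of the model: a map function emits key/value pairs from each input line, values are grouped by key, and a reduce function combines each group.

```python
from collections import defaultdict

# Hypothetical structured input: one "region,amount" record per line.
RECORDS = [
    "east,10",
    "west,5",
    "east,7",
]

def map_record(line):
    """Map step: turn one input line into (key, value) pairs."""
    region, amount = line.split(",")
    yield region, int(amount)

def reduce_values(key, values):
    """Reduce step: combine all values that share a key."""
    return key, sum(values)

def run_job(lines):
    """Simulate the shuffle between map and reduce by grouping values by key."""
    grouped = defaultdict(list)
    for line in lines:
        for key, value in map_record(line):
            grouped[key].append(value)
    return dict(reduce_values(k, v) for k, v in grouped.items())

totals = run_job(RECORDS)
print(totals)
```

In a real Hadoop job the grouping step is done by the framework across machines; only the map and reduce functions are written by the user.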

Re: Use Cases for Structured Data

2014-03-12 Thread ados1...@gmail.com
Thanks D, that certainly answers my question. I was just taking a quick look at Hortonworks HDP vs Hortonworks Sandbox; do you know of any benefits of using Sandbox as opposed to Hortonworks Data Platform? On Wed, Mar 12, 2014 at 4:02 PM, Dieter De Witte drdwi...@gmail.com wrote: Hi, 1) HDFS

Re: Use Cases for Structured Data

2014-03-12 Thread ados1...@gmail.com
Hey D, Regarding your point 5: For a proof of concept I would use a ready-made virtual machine from one of the 3 big vendors - Cloudera, MapR and Hortonworks. I want to understand how this virtual setup would work and how many master and slave nodes I can have in this virtual setup and in general