Architecture question on Ingesting Data into Hadoop

2014-03-24 Thread ados1...@gmail.com
Hello Team, I am doing a POC in Hadoop and want to understand the recommended architecture for ingesting data from different data streams, such as web logs, portals, mobile, and POS systems, into Hadoop. Also, what are the use cases where we need HBase on top of HDFS? Can't we only have HDFS and

Re: Architecture question on Ingesting Data into Hadoop

2014-03-24 Thread Geoffry Roberts
Based on what you have said, it sounds as if you want to append records to one or more files in HDFS. I was able to do this with WebHDFS and with the Hadoop client. But you asked about architecture. Would a POST to a URL satisfy you as an architecture? If so, set up WebHDFS and POST to it. On Mon, Mar
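
A minimal sketch of the WebHDFS append described above, assuming a NameNode HTTP endpoint at a hypothetical host and port and simple (non-Kerberos) authentication through the `user.name` query parameter. The helper names `webhdfs_url` and `append_record` are illustrative and not part of any Hadoop client library. WebHDFS append is a two-step POST: the NameNode answers with a 307 redirect, and the record bytes are then sent to the DataNode URL from the `Location` header.

```python
import urllib.error
import urllib.request
from urllib.parse import quote, urlencode


def webhdfs_url(host, port, path, op, params=None):
    """Build a WebHDFS REST URL: http://host:port/webhdfs/v1/<path>?op=<OP>&..."""
    query = urlencode({"op": op, **(params or {})})
    return f"http://{host}:{port}/webhdfs/v1/{quote(path.lstrip('/'))}?{query}"


def append_record(host, port, path, record, user):
    """Append bytes to an existing HDFS file via WebHDFS (hypothetical helper)."""
    url = webhdfs_url(host, port, path, "APPEND", {"user.name": user})

    # Step 1: POST with no payload to the NameNode. urllib does not follow a
    # 307 redirect for POST, so the redirect surfaces as an HTTPError.
    try:
        urllib.request.urlopen(urllib.request.Request(url, data=b"", method="POST"))
    except urllib.error.HTTPError as err:
        if err.code != 307:
            raise
        datanode_url = err.headers["Location"]
    else:
        raise RuntimeError("expected a 307 redirect from the NameNode")

    # Step 2: POST the record bytes to the DataNode location.
    request = urllib.request.Request(
        datanode_url,
        data=record,
        method="POST",
        headers={"Content-Type": "application/octet-stream"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status


if __name__ == "__main__":
    # Host, port, path, and user are placeholders for illustration only.
    print(webhdfs_url("namenode.example.com", 50070, "/logs/web/events.log",
                      "APPEND", {"user.name": "etl"}))
```

The target file has to exist before appending (WebHDFS creates files with a PUT and `op=CREATE`), and append support may need to be enabled on the cluster depending on the Hadoop version.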

Re: Architecture question on Ingesting Data into Hadoop

2014-03-24 Thread Mohammad Tariq
Hi Apurva, I would use a data ingestion tool like Apache Flume to make the task easier and reduce manual intervention. Create sources for your different systems, and the rest will be taken care of by Flume. However, it is not a must to use something like Flume. But it will definitely make your
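
One way an upstream system could hand events to Flume is sketched below, assuming the Flume agent is configured with an HTTP source using its default JSON handler, which accepts a JSON array of events that each carry a `headers` map and a string `body`. The thread does not say which source type was intended, so the HTTP source, the host and port, and the helper names `flume_events` and `send_to_flume` are all assumptions for illustration.

```python
import json
import urllib.request


def flume_events(lines, source_name):
    """Encode log lines as a JSON array of Flume events (headers + body)."""
    return json.dumps(
        [{"headers": {"source": source_name}, "body": line} for line in lines]
    )


def send_to_flume(url, lines, source_name):
    """POST a batch of events to a Flume HTTP source (hypothetical endpoint)."""
    request = urllib.request.Request(
        url,
        data=flume_events(lines, source_name).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status


if __name__ == "__main__":
    # Print the payload only; sending requires a running Flume agent.
    print(flume_events(["GET /index.html 200", "GET /cart 500"], "weblog"))
```

The `source` header is only an example of tagging events by origin (web log, portal, mobile, POS) so that a Flume agent could route or partition them on the way to an HDFS sink.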

Re: Architecture question on Ingesting Data into Hadoop

2014-03-24 Thread Shahab Yunus
@ados1984, HDFS is a file system and HBase is a data store on top of that. You cannot create tables (in the conventional meaning of the word table in database/store) directly on HDFS without HBase. Regards, Shahab On Mon, Mar 24, 2014 at 4:11 PM, Geoffry Roberts threadedb...@gmail.com wrote:

Re: Architecture question on Ingesting Data into Hadoop

2014-03-24 Thread ados1...@gmail.com
Thank you, Shahab, but I am using Impala, so I can create tables directly on HDFS with Impala without using HBase or any other NoSQL technology. The other question I have is this: let's say that I have 3 nodes. If I install Impala on one node, then on that node I am able to create tables

Re: Architecture question on Ingesting Data into Hadoop

2014-03-24 Thread ados1...@gmail.com
Thank you, Tariq, but with Flume, how is structured data captured into HDFS? Let's say I do not have HBase or any other data store on top of Hadoop. In that case, how will structured and unstructured data from different input streams be captured into HDFS using Flume, and how can I go in and

Re: Architecture question on Ingesting Data into Hadoop

2014-03-24 Thread ados1...@gmail.com
Geoffry, can you elaborate on your architecture? Also, when you refer to the Hadoop client, what exactly are you referring to: HUE, Cloudera Manager, or something else? Kindly advise. On Mon, Mar 24, 2014 at 4:11 PM, Geoffry Roberts threadedb...@gmail.com wrote: Based on what you have said,