Re: getting into the industry, is theory enough?

2014-10-18 Thread Dima Spivak
Dear S,

If you're interested in HBase, you can't go wrong with reading the ref
guide (hbase.apache.org/book.html) and checking out open JIRAs
(issues.apache.org/jira/browse/HBASE) waiting for someone to fix them :).
Even if you don't want to go into developing the technology itself, there's
no better way to prove your ability to use it than by contributing to the
open source project, and employers absolutely take note of that.

All the best,
   Dima

On Fri, Oct 17, 2014 at 2:14 PM, Rendon, Carlos (KBB - Irvine) 
carlos.ren...@kbb.com wrote:

 Hi,

 I can tell you my experience. I saw a job opening at KBB.com that had to
 do with crunching large amounts of data for user personalization. I took
 the job and they provided Hadoop/HBase training.

 When interviewing, I focused on my previous successful experiences working
 on projects involving data analysis at large scale. In my case prior
 technologies were things like Microsoft SQL Server and custom Java code. I
 didn't have any Hadoop experience but I did have experience with data
 analysis. I also talked about prior successful software projects I had
 finished, and general software engineering skills.

 Hope that helps,
 Carlos

 -Original Message-
 From: S Ahmed [mailto:sahmed1...@gmail.com]
 Sent: Friday, October 17, 2014 1:18 PM
 To: user@hbase.apache.org
 Subject: RE: getting into the industry, is theory enough?

 Hello!

 How do you guys suggest someone get into the HBase/Hadoop industry with a
 focus on the software development side (as opposed to ops)?

 If all you have done is read a few books and played around with HBase and
 maybe Cloudera's packages, how exactly would that result in getting some
 kind of employment in the industry?

 I'm not sure how mature the market is, so I am wary of whether it is a good
 idea to focus on this domain.

 Do most people just fall into the industry b/c their company provides
 training and real-life problems to solve? I.e., you have to be at a company
 that at some point adopts Hadoop/HBase, and you are fortunate enough to get
 on the project.



Archive Files

2014-10-18 Thread Ravindranath Akila
Is there any way HBase can store archive-like, rarely used files on
cheap storage?

That's a vague question. If I may elaborate...

Our current office stores terabytes of well-structured log data on S3 to
save cost. The other day I was asked to process all these files. These
files are still used for analytics and other decision making. The logs come
from an RTB (Real Time Bidding) system.

Ideally these files would have been on HDFS, but that would incur large
storage costs over time: they are only occasionally used, yet the servers
need to be up and running to store them.

In the context of Big Data, aren't these files big data files? If so, is
there a cheap way of storing them on HBase? For example, by writing a
storage adapter of sorts.

I'm really sorry if this isn't the right place to ask this. Thanks in
advance :)


-- 
R. A.
BTW, there is a website called *Thank God it's Friday!*
It tells you fun things to do in your area over the weekend.
See here: http://www.ThankGodItIsFriday.com


Re: Archive Files

2014-10-18 Thread Wilm Schumacher


On 18.10.2014 at 12:35, Ravindranath Akila wrote:
 Is there any way HBase can store archive-like, rarely used files on
 cheap storage?
Hadoop itself is equipped for that. There are HAR files, map files, and
sequence files.

If I understand correctly, sequence files are what you are looking for.

On the other hand, you could read the data into HBase and do your
processing there, if you write your log data directly into the HBase
cluster by default.

Best wishes,

Wilm


Re: Archive Files

2014-10-18 Thread Ted Yu
Take a look at https://issues.apache.org/jira/browse/HDFS-6584

It is in the upcoming Hadoop 2.6 release.

Cheers



Re: Archive Files

2014-10-18 Thread Ravindranath Akila
Thanks so much guys! This is exactly what I was looking for! :-)



