Re: Using parquet

2014-10-20 Thread Nick Dimiduk
Not currently. HBase uses its own file format that makes different assumptions than Parquet. Instead, HBase supports its own format optimizations, such as block encodings and compression. I would be interested in an exercise to see what things are necessary for HBase to support a columnar format.
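For context, the knobs Nick mentions are configured per column family. A minimal sketch against the 0.98-era Java admin API, with hypothetical table and family names ("events", "d"), creating a table with FAST_DIFF block encoding and Snappy compression:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.HBaseAdmin;
    import org.apache.hadoop.hbase.io.compress.Compression;
    import org.apache.hadoop.hbase.io.encoding.DataBlockEncoding;

    public class CreateEncodedTable {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin admin = new HBaseAdmin(conf);

        // Hypothetical table and family names, for illustration only.
        HTableDescriptor desc = new HTableDescriptor(TableName.valueOf("events"));
        HColumnDescriptor family = new HColumnDescriptor("d");

        // FAST_DIFF encoding shrinks repeated key prefixes within HFile blocks.
        family.setDataBlockEncoding(DataBlockEncoding.FAST_DIFF);
        // Compress blocks on disk; SNAPPY requires native libs on the cluster.
        family.setCompressionType(Compression.Algorithm.SNAPPY);

        desc.addFamily(family);
        admin.createTable(desc);
        admin.close();
      }
    }

These reduce storage within HBase's row-oriented HFile layout; they are not a columnar format in the Parquet sense.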

Using parquet

2014-10-20 Thread Nishanth S
Hey folks, I have been reading a bit about Parquet and how Hive and Impala work well on data stored in Parquet format. Is it even possible to do the same with HBase, to reduce storage etc.? Thanks, Nishanth

Re: getting into the industry, is theory enough?

2014-10-20 Thread Andrew Purtell
I think Dima's point was, please correct me if I am wrong, that if you are looking for HBase-related work, then contributing something to the project that gets your name associated with the technology in public will have relevant companies taking an interest in you that they wouldn't otherwise. This is de

Re: can read operation be performed during the execution of major compaction

2014-10-20 Thread Elliott Clark
No, read operations are not blocked by any compactions. Compaction will swap in its newly created file for any subsequent read requests. Any read requests that come in during the compaction will be answered from the files already there and the memstore. On Mon, Oct 20, 2014 at 8:52 AM, sahanashan
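To see this behavior from the client side, here is a minimal sketch against the 0.98-era client API (table and row names are hypothetical): majorCompact() only queues the compaction and returns immediately, and the Get that follows is answered from the existing store files and the memstore while the compaction runs in the background:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HBaseAdmin;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.util.Bytes;

    public class ReadDuringCompaction {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin admin = new HBaseAdmin(conf);
        HTable table = new HTable(conf, "events"); // hypothetical table name

        // Asynchronous: queues the major compaction and returns.
        admin.majorCompact("events");

        // Served from the current HFiles plus the memstore; not blocked.
        Result r = table.get(new Get(Bytes.toBytes("row-0001")));
        System.out.println("isEmpty=" + r.isEmpty());

        table.close();
        admin.close();
      }
    }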

can read operation be performed during the execution of major compaction

2014-10-20 Thread sahanashankar
Hello, I am trying to understand the working of major compaction and I am wondering if a read operation can be performed while the major compaction process is running on HBase. Will the read operation be queued and start after major compaction has finished executing, or can both execute together?

Re: getting into the industry, is theory enough?

2014-10-20 Thread S Ahmed
Dima, I think there is definitely a benefit to contributing to the actual open-source project, but at the same time it is also not really what you would be doing at work, and the learning curve is that much higher. On Sat, Oct 18, 2014 at 3:47 AM, Dima Spivak wrote: > Dear S, > > If you're interested in HBa

Re: How can I set the num of mappers when I use hbase RowCounter on Yarn?

2014-10-20 Thread Ted Yu
RowCounter calls TableMapReduceUtil.initTableMapperJob(tableName, scan, RowCounterMapper.class, ImmutableBytesWritable.class, Result.class, job), which uses TableInputFormat. From its javadoc: "Calculates the splits that will serve as input for the map tasks. The number of splits matches the number of regions in a table."
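For readers who want to see where TableInputFormat enters the picture, here is a hedged sketch of a RowCounter-style job setup, assuming the 0.98-era mapreduce API; the table name and the trivial mapper are stand-ins for illustration, not RowCounter's actual internals:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
    import org.apache.hadoop.hbase.mapreduce.TableMapper;
    import org.apache.hadoop.mapreduce.Job;

    public class CountSketch {
      // Trivial mapper standing in for RowCounter's internal RowCounterMapper.
      static class RowMapper extends TableMapper<ImmutableBytesWritable, Result> {
        @Override
        protected void map(ImmutableBytesWritable row, Result value, Context ctx) {
          ctx.getCounter("rows", "seen").increment(1);
        }
      }

      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        Job job = Job.getInstance(conf, "row-count-sketch");
        job.setJarByClass(CountSketch.class);

        Scan scan = new Scan();
        scan.setCaching(500);        // bigger scanner caching for a full scan
        scan.setCacheBlocks(false);  // don't pollute the block cache from MR

        // initTableMapperJob wires in TableInputFormat, whose getSplits()
        // emits one split (and therefore one mapper) per region.
        TableMapReduceUtil.initTableMapperJob(
            "events",                // hypothetical table name
            scan, RowMapper.class,
            ImmutableBytesWritable.class, Result.class, job);
        job.setNumReduceTasks(0);
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }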

Re: How can I set the num of mappers when I use hbase RowCounter on Yarn?

2014-10-20 Thread Shahab Yunus
Have you tried setting the following property through the command line? -D mapreduce.job.mappers Regards, Shahab On Mon, Oct 20, 2014 at 2:24 AM, liub...@inspur.com wrote: > Hello, > I used HBase RowCounter on YARN, but the num of mappers was 1, and the > progress was 0%. > The input dat

Re: How can I set the num of mappers when I use hbase RowCounter on Yarn?

2014-10-20 Thread Ted Yu
How many regions does the table have? Did you presplit the table? Thanks On Oct 19, 2014, at 11:24 PM, "liub...@inspur.com" wrote: > Hello, > I used HBase RowCounter on YARN, but the num of mappers was 1, and the > progress was 0%. > The input data was quite large, about 587GB. > So how c

How can I set the num of mappers when I use hbase RowCounter on Yarn?

2014-10-20 Thread liub...@inspur.com
Hello, I used HBase RowCounter on YARN, but the number of mappers was 1, and the progress was 0%. The input data was quite large, about 587GB. So how can I set the number of mappers when I use HBase RowCounter on YARN to make it faster? The command I used was below: hadoop jar hbase-server-0.99.0
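Since TableInputFormat yields one mapper per region (see Ted's reply above), the practical way to get more mappers is to have more regions. A minimal sketch of creating a presplit table with the 0.98-era admin API; the table name, family, and key range are hypothetical and assume roughly uniform row keys:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.HBaseAdmin;
    import org.apache.hadoop.hbase.util.Bytes;

    public class PresplitTable {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin admin = new HBaseAdmin(conf);

        HTableDescriptor desc = new HTableDescriptor(TableName.valueOf("events"));
        desc.addFamily(new HColumnDescriptor("d"));

        // Create 32 regions by splitting the key space evenly between
        // "00000000" and "ffffffff"; each region becomes one mapper.
        admin.createTable(desc,
            Bytes.toBytes("00000000"),
            Bytes.toBytes("ffffffff"),
            32);
        admin.close();
      }
    }

For a table that already holds the 587GB, regions can also be split after the fact (e.g. HBaseAdmin.split or the shell's split command), after which RowCounter would get one mapper per resulting region.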