RE: completebulkload Exception

2011-06-10 Thread Panayotis Antonopoulos
Hi, This has been discussed before. Find the thread: Help with NPE during bulk load (completebulkload). Completebulkload doesn't load the HBase configuration properly. Try: completebulkload -Dhbase.cluster.distributed=true /tmp/output myTable > Subject: completebulkload Exception > To: user@hb

RE: HFiles that fit within a single region VS better load balancing at reduce phase

2011-05-25 Thread Panayotis Antonopoulos
o the following JIRAs which speedup LoadIncrementalHFiles: > https://issues.apache.org/jira/browse/HBASE-3871 > https://issues.apache.org/jira/browse/HBASE-3721 > > Note: parallelizing splitting of HFile(s) by LoadIncrementalHFiles is done > on a single machine. > > Thanks > > 2011/5/2

HFiles that fit within a single region VS better load balancing at reduce phase

2011-05-25 Thread Panayotis Antonopoulos
Hello, I am currently working on an MR job that will output HFiles to be bulk loaded into an HBase table. According to the HBase site, for the bulk load to be efficient, each HFile produced by the MR job should fit within a single region. To achieve that I use the TotalOrderPartiti
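
For readers following this thread, here is a minimal sketch of the idea behind total-order partitioning: every row key is routed to a reducer according to sorted split points, so each reducer's output covers one contiguous key range. It uses plain Java strings only. The class name, the method name and the split points are hypothetical and are not part of the Hadoop or HBase API. HBase compares row keys as bytes, which for the ASCII keys used here gives the same order as Java string comparison.

```java
import java.util.Arrays;

// Illustrative only: plain-Java stand-in for range partitioning by split points.
// Nothing here is a Hadoop or HBase class; all names are made up for the example.
final class RangePartitionSketch {

    // Returns the index of the partition (reducer) whose key range contains rowKey.
    // splitPoints must be sorted ascending; n split points define n + 1 partitions.
    static int partitionFor(String rowKey, String[] splitPoints) {
        int pos = Arrays.binarySearch(splitPoints, rowKey);
        // Exact match: the key is the first key of the partition to the right of that split point.
        // No match: binarySearch returns -(insertionPoint) - 1, and the insertion point
        // is already the index of the partition that holds the key.
        return pos >= 0 ? pos + 1 : -(pos + 1);
    }

    public static void main(String[] args) {
        // Hypothetical region start keys, assumed to be known before the job runs.
        String[] regionStartKeys = {"g", "n", "t"};
        for (String key : new String[] {"apple", "g", "kiwi", "zebra"}) {
            System.out.println(key + " -> partition " + partitionFor(key, regionStartKeys));
        }
    }
}
```

If the split points are taken from the region boundaries of the target table, each reducer's HFile falls inside one region. Load balancing then depends on how evenly the data is spread across those ranges, which is the trade-off raised in this thread.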

RE: HFiles created by MR Jobs and HBase Performance

2011-05-19 Thread Panayotis Antonopoulos
does not need to have distinct key ranges, they just need to fit > within the overall range of the region. This does impact read performance so > multiple hfiles get cleaned up and condensed into one during a compaction. > > -chris > > 2011/5/17 Panayotis Antonopoulos > >

HFiles created by MR Jobs and HBase Performance

2011-05-17 Thread Panayotis Antonopoulos
Hello, I am writing an MR job where each reducer will output one HFile containing some of the rows of the table that will be created. At first I thought of using the HashPartitioner to achieve load balancing, but this would mix the rows and the output of each reducer will not be a continuous pa

RE: Pagination through families / columns?

2011-05-12 Thread Panayotis Antonopoulos
If I understand what you need, there is the ColumnPaginationFilter that does exactly what you mention. > From: m...@imageshack.net > Subject: Pagination through families / columns? > Date: Thu, 12 May 2011 13:49:16 -0700 > To: user@hbase.apache.org > > Hey Guys, > > Not sure if this functiona
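
To make the semantics concrete, here is a rough sketch of what paginating over the columns of one row means: skip the first `offset` columns in sorted order, then return at most `limit` of them. As far as I know, ColumnPaginationFilter is constructed with a limit and an offset in the same spirit, but the code below is plain Java and does not call HBase. The class name, method name and sample columns are hypothetical.

```java
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Illustrative only: limit/offset pagination over the sorted column names of a single row.
// This is not the HBase filter itself; all names are made up for the example.
final class ColumnPageSketch {

    // Returns at most `limit` column names, skipping the first `offset` columns of the row.
    static List<String> page(SortedMap<String, String> rowColumns, int limit, int offset) {
        return rowColumns.keySet().stream()
                .skip(offset)   // columns before the requested page
                .limit(limit)   // size of the page
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        SortedMap<String, String> row = new TreeMap<>();
        for (String qualifier : new String[] {"e", "a", "c", "b", "d"}) {
            row.put(qualifier, "value-" + qualifier);
        }
        // Second page when the page size is 2.
        System.out.println(page(row, 2, 2));
    }
}
```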

RE: HFileOutputFormat: writing to multiple columns for each row (not multiple column families)

2011-05-01 Thread Panayotis Antonopoulos
It seems that I was not sorting the KeyValues properly as I was not using the KeyValueSortReducer that comes with HBase. > From: antonopoulos...@hotmail.com > To: user@hbase.apache.org > Subject: HFileOutputFormat: writing to multiple columns for each row (not > multiple column families) > Date

HFileOutputFormat: writing to multiple columns for each row (not multiple column families)

2011-04-30 Thread Panayotis Antonopoulos
Hello, I am trying to use HFileOutputFormat to write to many columns (of the same column family) for each row but I can't figure out how I will do that. Can anyone give me some advice? Thank you in advance! Panagiotis.
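
The reply in this thread traces the problem to KeyValues that were not sorted. A minimal sketch of the underlying idea, assuming that an HFile expects its entries in sorted order: emit one entry per column of a row, then order the entries by row key and column qualifier before writing. This is plain Java and only mimics the ordering step. The `Cell` class and `sortForHFile` method are hypothetical stand-ins, not the HBase KeyValue or KeyValueSortReducer classes.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative only: orders the cells of a row by (row key, qualifier) before they are written.
// Cell is a made-up stand-in for an HBase KeyValue within a single column family.
final class SortCellsSketch {

    static final class Cell {
        final String row;
        final String qualifier;
        final String value;

        Cell(String row, String qualifier, String value) {
            this.row = row;
            this.qualifier = qualifier;
            this.value = value;
        }

        @Override
        public String toString() {
            return row + "/" + qualifier + "=" + value;
        }
    }

    // Returns a new list with the cells ordered by row key, then by column qualifier.
    static List<Cell> sortForHFile(List<Cell> cells) {
        List<Cell> sorted = new ArrayList<>(cells);
        sorted.sort(Comparator.comparing((Cell c) -> c.row)
                .thenComparing((Cell c) -> c.qualifier));
        return sorted;
    }

    public static void main(String[] args) {
        List<Cell> cells = new ArrayList<>();
        // Several columns of the same family for one row, produced in arbitrary order.
        cells.add(new Cell("row1", "name", "x"));
        cells.add(new Cell("row1", "age", "y"));
        cells.add(new Cell("row1", "city", "z"));
        for (Cell cell : sortForHFile(cells)) {
            System.out.println(cell);
        }
    }
}
```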

RE: Help with NPE during bulk load (completebulkload)

2011-04-30 Thread Panayotis Antonopoulos
You are right, the -c flag doesn't work on the CDH3U0 version. -Dhbase.cluster.distributed=true solved the problem for the bulk upload. However, I am getting the same NPE while trying to use the TableInputFormat. Does anyone know what is going on? Regards, Panagiotis. > From: andy.saut...@returnp

RE: Problem with zookeeper port while using completebulkupload

2011-04-27 Thread Panayotis Antonopoulos
going on or > are you using hbase configs? > > St.Ack > > > 2011/4/27 Panayotis Antonopoulos : > > > > I downloaded HBase 0.90.1 again and it works perfectly. > > Is there anything wrong with HBase 0.90.2 and completebulkload? > > > >> From: antonop

RE: Problem with zookeeper port while using completebulkupload

2011-04-27 Thread Panayotis Antonopoulos
I downloaded HBase 0.90.1 again and it works perfectly. Is there anything wrong with HBase 0.90.2 and completebulkload? > From: antonopoulos...@hotmail.com > To: user@hbase.apache.org > Subject: Problem with zookeeper port while using completebulkupload > Date: Wed, 27 Apr 2011 18:51:38 +0300 >

Problem with zookeeper port while using completebulkupload

2011-04-27 Thread Panayotis Antonopoulos
Hello, I am trying to use completebulkload in HBase 0.90.2 and I get the following exception: 11/04/27 18:45:51 ERROR zookeeper.ZKConfig: no clientPort found in zoo.cfg Exception in thread "main" org.apache.hadoop.hbase.ZooKeeperConnectionException: java.io.IOException: Unable to determine Zoo

RE: HBase - Column family

2011-04-23 Thread Panayotis Antonopoulos
I am also a beginner, so I would like to ask you something about the method you proposed. HBase is column-oriented. This means (as far as I know from databases) that it stores its data column by column and not row by row. If we use the schema you suggested then when we want some of the documents