A question about HBase MapReduce

2012-05-24 Thread Florin P
Hello! I've read Lars George's blog http://www.larsgeorge.com/2009/05/hbase-mapreduce-101-part-i.html where at the end of the article, he mentioned "In the next post I will show you how to import data from a raw data file into a HBase table and how you eventually process the data in the HBase t

HBASE Secondary index support

2011-07-07 Thread Florin P
Hello! Perhaps this question comes up all the time. Maybe it should be in the FAQ. So, what are the options for achieving a secondary index on HBase? And what is the best option for implementing a secondary index on HBase? Thank you, Florin

Re: zookeeper connection issue - distributed mode

2011-07-05 Thread Florin P
Hello! The property hbase.zookeeper.quorum is taken from hbase-site.xml on the HBase master machine. For me, it worked. Success, Florin --- On Tue, 7/5/11, d
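
A minimal sketch of reading that property out of an hbase-site.xml file, assuming the standard Hadoop-style layout of `<property>` elements with `<name>` and `<value>` children. The host names and the helper `read_property` are placeholders for illustration and do not come from the original message.

```python
import xml.etree.ElementTree as ET

# Hypothetical hbase-site.xml content; the host names are placeholders,
# not values from the original message.
HBASE_SITE_XML = """
<configuration>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>zk1.example.com,zk2.example.com,zk3.example.com</value>
  </property>
</configuration>
"""


def read_property(xml_text, wanted_name):
    """Return the <value> of the named <property>, or None if it is absent."""
    root = ET.fromstring(xml_text.strip())
    for prop in root.findall("property"):
        if prop.findtext("name") == wanted_name:
            return prop.findtext("value")
    return None


# The quorum is a comma-separated list of ZooKeeper hosts.
quorum = read_property(HBASE_SITE_XML, "hbase.zookeeper.quorum")
quorum_hosts = quorum.split(",") if quorum else []
```

A client in distributed mode needs the same quorum value that the master uses, which is why copying it from the master's hbase-site.xml resolves the connection issue described here.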

Re: Possible issue when creating/deleting HBase table multiple times

2011-07-03 Thread Florin P
running for months on end.  I wonder what's > different.  You are on > > 0.90.3? > > > > St.Ack > > > > On Fri, Jul 1, 2011 at 12:59 AM, Florin P > wrote: > > > Hello! > > >   I'm using HBase > 0.90.1-cdh3u1-SNAPSHOT. Running the attached

Some questions about HBase Architecture

2011-07-01 Thread Florin P
Hello! I've read the HBase architecture chapter of the book http://hbase.apache.org/book.html#architecture (HBA) and compared it with HBase: The Definitive Guide (HBDG) http://ofps.oreilly.com/titles/9781449396107/architecture.html Some questions arose: 1. How many MemStores can a Region have? HBDG: "A HReg

Re: HBase region size

2011-07-01 Thread Florin P
Hello! Thank you for your responses. We are going to implement the solution by storing the metadata in HBase and the content of the files in HDFS map files. We'll keep the reference to the map file in HBase. Kind regards, Florin --- On Fri, 7/1/11, Andrew Purtell wrote: >

Possible issue when creating/deleting HBase table multiple times

2011-07-01 Thread Florin P
Hello! I'm using HBase 0.90.1-cdh3u1-SNAPSHOT. Running the attached code (adapted from sujee at sujee.net), after a while I got the exception below. The main scenario is like this: 1. if the table does not exist, create it 2. populate the table with some data 3. flush the data 4. close th

RE: HBase region size

2011-06-29 Thread Florin P
Hello! We have almost the same scenario as Aditya, but with some differences. 1. our files are documents in any format (xls, pdf, doc, html etc) 2. we are expecting to have more than 5 million of these documents 3. Their size varies like this: 70% of them have their

Re: Obtain many mappers (or regions)

2011-06-29 Thread Florin P
o pre split temporary tables and a > lot less work and overhead. > > This is something that could be part of an indexing > solution. ;-P > (meaning that the classes are reusable for other > solutions...) > > HTH -Mike > > Sent from a remote device. Please excuse

RE: Obtain many mappers (or regions)

2011-06-27 Thread Florin P
then we obtain the "magic" number of mappers 2. We have observed this behavior by implementing a Driver for the MR job and setting mapred.map.tasks to, say, 40. The number of mappers is then calculated correctly as 32. Regards, Florin --- On Mon, 6/2

RE: Obtain many mappers (or regions)

2011-06-27 Thread Florin P
th this to get a background on > what HBase can do... > http://hbase.apache.org/book.html > .. there is a section on MapReduce with HBase as well. > > -Original Message- > From: Florin P [mailto:florinp...@yahoo.com] > > Sent: Monday, June 27, 2011 4:53 AM > To:

Obtain many mappers (or regions)

2011-06-27 Thread Florin P
Hello! I have the following scenario: 1. A temporary HBase table with a small number of rows (approx. 100) 2. A cluster with 2 machines, which I would like to use to crunch the data contained in the rows I would like to create two mappers that will crunch the data from the rows. How can I achieve this? A gene
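
One way to get two mappers, assuming the job reads the table through TableInputFormat (which by default creates one map task per region), is to pre-split the temporary table into two regions. The sketch below only shows how split boundaries could be picked from known row keys. The key format `row-NNN` and the helpers `split_keys` and `region_of` are hypothetical; this is not HBase client code.

```python
from bisect import bisect_right


def split_keys(row_keys, num_regions):
    """Pick up to num_regions - 1 boundary keys that cut the sorted row keys
    into roughly equal contiguous ranges."""
    if num_regions < 1:
        raise ValueError("num_regions must be at least 1")
    keys = sorted(row_keys)
    boundaries = []
    for i in range(1, num_regions):
        idx = i * len(keys) // num_regions
        # Skip empty leading ranges and duplicate boundaries.
        if 0 < idx < len(keys) and (not boundaries or keys[idx] != boundaries[-1]):
            boundaries.append(keys[idx])
    return boundaries


def region_of(key, boundaries):
    """Index of the region holding key; a region covers [start, end)."""
    return bisect_right(boundaries, key)


# Hypothetical row keys: about 100 rows, as in the question.
rows = ["row-%03d" % n for n in range(100)]
boundaries = split_keys(rows, 2)

# Count how many rows land in each of the resulting regions.
counts = [0] * (len(boundaries) + 1)
for key in rows:
    counts[region_of(key, boundaries)] += 1
```

With the boundaries in hand, the table would be created with those split keys so that each region, and therefore each mapper, receives about half of the rows.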

Re: How to efficiently join HBase tables?

2011-06-16 Thread Florin P
Hello! Regarding the same subject of joining, I have the following scenario: 1. I have a big table DOCS that contains the columns UUID DOCID sdsd 1 hdhs 3 gdhg 7 shdg 9 and so on (hope you got the idea) 2. an external list of docID (LIST) 3 1 7
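
The message is cut off before the actual question. Assuming the goal is to select the DOCS rows whose DOCID appears in LIST, the core of such a join can be sketched in plain Python as a hash-based semi-join. The sample values are the ones quoted in the message; the function name `semi_join` is hypothetical.

```python
# Rows of the DOCS table as (UUID, DOCID) pairs, copied from the message.
DOCS = [("sdsd", 1), ("hdhs", 3), ("gdhg", 7), ("shdg", 9)]

# External list of document ids (LIST), copied from the message.
LIST = [3, 1, 7]


def semi_join(docs, wanted_ids):
    """Keep only the rows whose DOCID is in wanted_ids.

    The ids go into a set first, so each row costs one hash lookup
    instead of a scan over the whole list.
    """
    wanted = set(wanted_ids)
    return [(uuid, doc_id) for uuid, doc_id in docs if doc_id in wanted]


matches = semi_join(DOCS, LIST)
```

In a MapReduce job the same idea is a map-side join: if LIST is small enough to fit in memory, each mapper loads it once and filters the DOCS rows it scans.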