Re: hbase hashing algorithm and schema design

2011-06-09 Thread Sam Seigal
Hi Otis, This approach looks much better than the UUID approach. I will definitely try it out. Thanks for the contribution. Sam

Re: Hbase Hardware requirement

2011-06-09 Thread Ted Dunning
On Fri, Jun 10, 2011 at 3:58 AM, Michel Segel wrote: > Not to mention you don't get a linear boost with port bonding. Well, you don't get a linear boost with switch-level bonding. You can get it, however. > You have to be careful on hardware recommendations because there are pricing sweet spots

Re: hbase hashing algorithm and schema design

2011-06-09 Thread Sam Seigal
Thanks for the reply Joey. This sounds better than using UUIDs. I will give it a shot. One more thing: with such a setup, will it still be possible to do MapReduce jobs? Is it possible to create a single scanner that will look at all the prefixes? If not, is it possible to MapReduce with mul
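Not from the thread itself, but a minimal Java sketch of the per-prefix scan idea being discussed, assuming row keys salted with a fixed-width numeric bucket prefix such as "07-"; the bucket count, key layout, and class name are illustrative assumptions only:

import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

// Sketch: one Scan per salt bucket. A MapReduce job (or a client-side merge)
// can then cover every bucket's range instead of needing one full-table scan.
public class SaltedScans {
  public static Scan[] scansForAllBuckets(int numBuckets) {
    Scan[] scans = new Scan[numBuckets];
    for (int bucket = 0; bucket < numBuckets; bucket++) {
      Scan scan = new Scan();
      // start at this bucket's prefix, stop at the next bucket's prefix
      scan.setStartRow(Bytes.toBytes(String.format("%02d-", bucket)));
      scan.setStopRow(Bytes.toBytes(String.format("%02d-", bucket + 1)));
      scans[bucket] = scan;
    }
    return scans;
  }
}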

Re: Hbase Hardware requirement

2011-06-09 Thread Michel Segel
Expensive is relative, and with the latest Intel hardware release you're starting to see 10 GbE on the motherboard. Not to mention you don't get a linear boost with port bonding. You have to be careful on hardware recommendations because there are pricing sweet spots and technology changes.

Re: LZO Compression

2011-06-09 Thread Arun Ramakrishnan
I followed the instructions for building on Ubuntu Maverick. I tried to build the packager, but I have the following potential errors. The jar seemed to build fine. Looking at the last line of the log, is there anything that is a problem? I am yet to get my HBase running, hence trying to resolv

Re: Hadoop not working after replacing hadoop-core.jar with hadoop-core-append.jar

2011-06-09 Thread Stack
On Tue, Jun 7, 2011 at 2:32 PM, Stack wrote: > On Mon, Jun 6, 2011 at 10:37 PM, Mike Spreitzer wrote: >> So my suggestion is to be unequivocal about it: when running distributed, always build your own Hadoop and put its -core JAR into your HBase installation (or use Cloudera, which has d

RE: Hadoop not working after replacing hadoop-core.jar with hadoop-core-append.jar

2011-06-09 Thread Zhong, Andy
Yes, the name does matter. It should work after renaming hadoop-core-append.jar to hadoop-core.jar. It may help to check http://www.michael-noll.com/blog/2011/04/14/building-an-hadoop-0-20-x-version-for-hbase-0-90-2/

copy_table.rb doesn't work in 0.20.6

2011-06-09 Thread James Hammerton
Hi, Before trying to merge regions on a table in our live database, we decided to copy the table and merge the regions on the copy first, to test that the merging code works before risking our live data. However, when we try to run the copy_table.rb script it fails due to the following error (this is act

Re: Hadoop not working after replacing hadoop-core.jar with hadoop-core-append.jar

2011-06-09 Thread Stack
On Tue, Jun 7, 2011 at 2:32 PM, Stack wrote: > On Mon, Jun 6, 2011 at 10:37 PM, Mike Spreitzer wrote: >> Also: explicitly explain how the file has to be named (there is a strict naming requirement so that the launching scripts work, right?). >> ... > Also, I second what Andrew says where

Re: Hbase Hardware requirement

2011-06-09 Thread M. C. Srivas
Ensure enough networking bandwidth to match your drive bandwidth; otherwise your compaction rates are going to be abysmal. 10 GigE ports are expensive, so consider 2 x 1 GigE per box (or even 4 x 1 GigE if you can get that many on-board NICs). On Wed, Jun 8, 2011 at 8:49 AM, Andrew Purtell wrote:

Re: in-memory data grid vs. ehcache + hbase

2011-06-09 Thread Stack
I don't know of a generic solution to the problem you describe. Sounds like you have hacked up something for your purposes only. The local cache is read-only? Can't you change your inserts so they update both HBase and push out to the local cache? St.Ack On Thu, Jun 9, 2011 at 9:20 AM, Hiller, Dean x6607

Re: 0.92.0 availability

2011-06-09 Thread Stack
On Thu, Jun 9, 2011 at 5:03 AM, Eric Charles wrote: > Hi, From http://s.apache.org/x4, there are 178 open issues (40%). Btw, it would be cool if the Atlassian guys would provide a graph showing the evolution in time of that number. Are we more on the rising or descending side? lol.

Re: How to efficiently join HBase tables?

2011-06-09 Thread Michel Segel
>> This sounds effective to me, so long as you can perform any desired operations on all the rows matching a single join value via a single iteration through the stream of reduce input values (for example, if the set of data for each join value fits in memory). Otherwise you'd need

Re: How to efficiently join HBase tables?

2011-06-09 Thread Michel Segel
Doug, I think I should clarify something... Yes, I am the only one who is saying get() won't work. The question asked was how to do an efficient join where there were no specific parameters, like joining on key values. It wasn't until yesterday that Eran gave an example of the specific problem

RE: in-memory data grid vs. ehcache + hbase

2011-06-09 Thread Hiller, Dean x66079
Oh, and I was hoping something like this kind of framework using the HBase slaves file was already in existence... hard to believe it would not be, since our performance increase would be around 100 times in this case. We are currently using something other than HBase, and when we change to this typ

RE: in-memory data grid vs. ehcache + hbase

2011-06-09 Thread Hiller, Dean x66079
Well, I was hoping there is something with ehcache or some kind of cache where it would work like this: 1. write using the HBase client into the grid, which came from some web update (which is a VERY rare occurrence as this is admin stuff); 2. write something out to all nodes telling them to evict the stale
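A rough sketch of the kind of write path being described here, not code from the thread: it shows step 1 and only the local half of step 2, using the Ehcache API. The cache name, table name, and column names are made-up placeholders, and the cache is assumed to be configured in ehcache.xml; evicting on other nodes would still need a replicated cache or a separate broadcast channel.

import net.sf.ehcache.Cache;
import net.sf.ehcache.CacheManager;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

// Write through to HBase, then drop the stale entry from the local Ehcache.
public class WriteThroughExample {
  private final HTable table;
  private final Cache localCache;

  public WriteThroughExample() throws Exception {
    Configuration conf = HBaseConfiguration.create();
    table = new HTable(conf, "admin_data");
    localCache = CacheManager.create().getCache("admin_data");
  }

  public void update(String rowKey, String value) throws Exception {
    Put put = new Put(Bytes.toBytes(rowKey));
    put.add(Bytes.toBytes("cf"), Bytes.toBytes("val"), Bytes.toBytes(value));
    table.put(put);            // 1. write into HBase
    localCache.remove(rowKey); // 2. evict the stale local copy
    // Evicting on *other* nodes would need a broadcast mechanism
    // (replicated cache or a message to each node) - not shown here.
  }
}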

Re: bulk load map/reduce sample code

2011-06-09 Thread Harsh J
Fleming, You can use the importtsv tool with the bulk load option (-Dimporttsv.bulk.output=path) and the "," delimiter, and then follow it up with the completebulkload tool to finish a bulk load. For help with usage etc., run the following (commands relative to HBASE_HOME): $ HADOOP_CLASSPATH=`b

Re: 0.92.0 availability

2011-06-09 Thread Eric Charles
Hi, From http://s.apache.org/x4, there are 178 open issues (40%). Btw, it would be cool if the Atlassian guys would provide a graph showing the evolution in time of that number. Are we more on the rising or descending side? Tks, - Eric On 09/06/11 02:58, Otis Gospodnetic wrote: I wouldn't

Re: How to efficiently join HBase tables?

2011-06-09 Thread Eran Kutner
Exactly! Thanks Dave for a much better explanation than mine! -eran On Thu, Jun 9, 2011 at 00:35, Dave Latham wrote: > I believe this is what Eran is suggesting:
> Table A
> ---
> Row1 (has joinVal_1)
> Row2 (has joinVal_2)
> Row3 (has joinVal_1)
> Table B
> ---
> Row4 (has joinVa
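For illustration only, a small reduce-side join reducer along the lines Dave describes; it assumes mappers over the two tables emit (joinValue, "A:" + rowKey) or (joinValue, "B:" + rowKey), that the rows for one join value fit in memory (the caveat raised earlier in the thread), and all names are made up:

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Pairs every table-A row with every table-B row that shares a join value.
public class JoinReducer extends Reducer<Text, Text, Text, Text> {
  @Override
  protected void reduce(Text joinValue, Iterable<Text> taggedRows, Context ctx)
      throws IOException, InterruptedException {
    List<String> aRows = new ArrayList<String>();
    List<String> bRows = new ArrayList<String>();
    for (Text t : taggedRows) {
      String s = t.toString();
      if (s.startsWith("A:")) aRows.add(s.substring(2));
      else bRows.add(s.substring(2));
    }
    // Emit the cross product of matching rows for this join value.
    for (String a : aRows) {
      for (String b : bRows) {
        ctx.write(new Text(a), new Text(b));
      }
    }
  }
}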

bulk load map/reduce sample code

2011-06-09 Thread hmchiud
Hi, I am planning to use bulk load to import my CSV file into HBase 0.90. Is there any bulk load map/reduce sample code? Thank you very much. Fleming Chiu (邱宏明) Ext: 707-2260 Be Veg, Go Green, Save the Planet!
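Since sample code is what is being asked for, here is a hedged sketch of such a job for a simple two-column "rowkey,value" CSV, written against the 0.90-era mapreduce API. The table name, column family, and CSV layout are assumptions, and the generated HFiles still have to be loaded afterwards with the completebulkload tool described in the reply above.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Parses "rowkey,value" lines into Puts; configureIncrementalLoad() wires in
// the reducer/partitioner that sort the output into HFiles matching the
// table's regions. Usage: CsvBulkLoad <csv input dir> <hfile output dir>
public class CsvBulkLoad {

  static class CsvMapper
      extends Mapper<LongWritable, Text, ImmutableBytesWritable, Put> {
    @Override
    protected void map(LongWritable offset, Text line, Context ctx)
        throws java.io.IOException, InterruptedException {
      String[] fields = line.toString().split(",");
      byte[] row = Bytes.toBytes(fields[0]);
      Put put = new Put(row);
      put.add(Bytes.toBytes("cf"), Bytes.toBytes("val"), Bytes.toBytes(fields[1]));
      ctx.write(new ImmutableBytesWritable(row), put);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Job job = new Job(conf, "csv-bulk-load");
    job.setJarByClass(CsvBulkLoad.class);
    job.setMapperClass(CsvMapper.class);
    job.setMapOutputKeyClass(ImmutableBytesWritable.class);
    job.setMapOutputValueClass(Put.class);
    job.setInputFormatClass(TextInputFormat.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    // Sets the reducer, partitioner, and output format so the HFiles line up
    // with the target table's current region boundaries.
    HFileOutputFormat.configureIncrementalLoad(job, new HTable(conf, "mytable"));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}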

Re: HBase Backups

2011-06-09 Thread Eric Charles
Oops, sorry, I confused MapR (the company) with MapReduce (MR, the technology). Time for me to update my knowledge of the Hadoop ecosystem. Tks, - Eric On 09/06/11 09:49, Ted Dunning wrote: On Thu, Jun 9, 2011 at 9:12 AM, Eric Charles wrote: Good news! I suppose there's a risk of "incoherent" b

Re: HBase Backups

2011-06-09 Thread Ted Dunning
On Thu, Jun 9, 2011 at 9:12 AM, Eric Charles wrote: > Good news! > I suppose there's a risk of "incoherent" backup. There would be, but we spent a ton of time making that not so. And the HBase devs have done a bunch of work making sure that the WAL works right. > I mean, with classical SQL

RE: Does Put support "don't put if row exists"?

2011-06-09 Thread Ma, Ming
It looks like there is an HBase API called checkAndPut. By setting the value to "null", you can achieve "put only when the row + column family + column qualifier doesn't exist". Nice feature.
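A small illustration of that call against the 0.90 client API; the table, column, and value names are placeholders, not from the thread:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

// checkAndPut with a null expected value: the Put is applied only if the
// checked cell does not already exist, giving put-if-absent semantics.
public class PutIfAbsentExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "mytable");
    byte[] row = Bytes.toBytes("row1");
    byte[] fam = Bytes.toBytes("cf");
    byte[] qual = Bytes.toBytes("q");
    Put put = new Put(row);
    put.add(fam, qual, Bytes.toBytes("value"));
    // null expected value => succeed only when the cell is absent
    boolean applied = table.checkAndPut(row, fam, qual, null, put);
    System.out.println("Put applied: " + applied);
    table.close();
  }
}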

Re: HBase Backups

2011-06-09 Thread Eric Charles
Good news! I suppose there's a risk of "incoherent" backup. I mean, with classical SQL databases, online backups ensure that the captured dataset can be restored in a state where all open transactions are committed. Even if the backup takes hours, the initial backed-up data is finally updated to