Re: Using external indexes in an HBase Map/Reduce job...

2010-10-12 Thread Andrey Stepachev
> > As to is it worth it? Yes, because right now there is not a good indexing solution to HBase when it comes to a map/reduce. > > I don't think I'm the first one to think about it.... > > Thx > -Mike > >> Subject: Re: Using external indexes in an HBa

RE: Using external indexes in an HBase Map/Reduce job...

2010-10-12 Thread Michael Segel
t? Yes, because right now there is not a good indexing solution to HBase when it comes to a map/reduce. I don't think I'm the first one to think about it.... Thx -Mike > Subject: Re: Using external indexes in an HBase Map/Reduce job... > From: m...@mlogiciels.com > Date: Tu

Re: Using external indexes in an HBase Map/Reduce job...

2010-10-12 Thread Matthew LeMieux
what's the best way to do it? > > Can you use an object to feed in to a m/r job? (And that's the key point I'm > trying to solve.) > > Does that make sense? > > -Mike > >> Subject: Re: Using external indexes in an HBase Map/Reduce job... >>

Re: Using external indexes in an HBase Map/Reduce job...

2010-10-12 Thread Jack Levin
) > > Suppose I have an external index. I get the list of row keys in a List Object. > > Now I want to process the list in a m/r job. > > So what's the best way to do it? > > Can you use an object to feed in to a m/r job? (And that's the key point I'm >

RE: Using external indexes in an HBase Map/Reduce job...

2010-10-12 Thread Michael Segel
it? Can you use an object to feed in to a m/r job? (And that's the key point I'm trying to solve.) Does that make sense? -Mike > Subject: Re: Using external indexes in an HBase Map/Reduce job... > From: m...@mlogiciels.com > Date: Tue, 12 Oct 2010 11:53:11 -0700 > To: us

Re: Using external indexes in an HBase Map/Reduce job...

2010-10-12 Thread Matthew LeMieux
n do a m/r reading from the file and >> setting my own splits to control parallelism. >> But I'm hoping for a more elegant solution. >> >> I know that it's possible, but I haven't thought it out... Was hoping someone >> else had this solved. >> >>

Re: Using external indexes in an HBase Map/Reduce job...

2010-10-12 Thread jason
se had this solved. > > thx > >> From: buttl...@llnl.gov >> To: user@hbase.apache.org >> Date: Tue, 12 Oct 2010 08:35:25 -0700 >> Subject: RE: Using external indexes in an HBase Map/Reduce job... >> >> Sorry, I am not clear on exactly what you are trying

Re: Using external indexes in an HBase Map/Reduce job...

2010-10-12 Thread Steven Noels
Did you have a look at Lily? A billion items will be interesting, but we offer M/R index rebuild (against SOLR) and incremental updates as well. You could take a look at the RowLog library we did to do this in a robust way - which has no Lily dependencies. www.lilyproject.org Cheers, Steven. On

RE: Using external indexes in an HBase Map/Reduce job...

2010-10-12 Thread Michael Segel
n't thought it out... Was hoping someone else had this solved. thx > From: buttl...@llnl.gov > To: user@hbase.apache.org > Date: Tue, 12 Oct 2010 08:35:25 -0700 > Subject: RE: Using external indexes in an HBase Map/Reduce job... > > Sorry, I am not clear on exactly what you a

RE: Using external indexes in an HBase Map/Reduce job...

2010-10-12 Thread Michael Segel
the best way to do it? > From: oct...@gmail.com > Date: Tue, 12 Oct 2010 16:54:00 +0400 > Subject: Re: Using external indexes in an HBase Map/Reduce job... > To: user@hbase.apache.org > > Hi Michael Segel. > > If I understand your question correctly, you are looking for op

RE: Using external indexes in an HBase Map/Reduce job...

2010-10-12 Thread Buttler, David
ase-u...@hadoop.apache.org Subject: Using external indexes in an HBase Map/Reduce job... Hi, Now I realize that most everyone is sitting in NY, while some of us can't leave our respective cities. Came across this problem and I was wondering how others solved it. Suppose you have a really large ta

Re: Using external indexes in an HBase Map/Reduce job...

2010-10-12 Thread Andrey Stepachev
Hi Michael Segel. If I understand your question correctly, you are looking for an optimal way to scan index search results? If not, my answer below is not relevant :). 1. For M/R joins or scans over large index results, Bloom filters can be used, as described here: http://blog.rapleaf.com/dev/2009/09/25/b

Using external indexes in an HBase Map/Reduce job...

2010-10-12 Thread Michael Segel
Hi, Now I realize that most everyone is sitting in NY, while some of us can't leave our respective cities. Came across this problem and I was wondering how others solved it. Suppose you have a really large table with 1 billion rows of data. Since HBase really doesn't have any indexes built
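
The approach discussed in this thread (take the row keys returned by an external index, then set your own splits to control parallelism in the m/r job) might be sketched roughly as follows. This is only an illustration in plain Python, not code from any of the posters: `make_splits` and the sample keys are hypothetical names, and nothing here uses the HBase or Hadoop APIs. Each resulting chunk stands in for the list of row keys one map task would be handed.

```python
from typing import Iterable, List


def make_splits(row_keys: Iterable[str], num_splits: int) -> List[List[str]]:
    """Partition row keys into at most num_splits contiguous, sorted chunks.

    Hypothetical helper: each chunk represents the keys one map task would fetch.
    """
    if num_splits < 1:
        raise ValueError("num_splits must be at least 1")

    # De-duplicate and sort so that each chunk covers a contiguous key range.
    keys = sorted(set(row_keys))
    if not keys:
        return []

    # Never create more chunks than there are keys.
    n = min(num_splits, len(keys))
    base, extra = divmod(len(keys), n)

    splits: List[List[str]] = []
    start = 0
    for i in range(n):
        # The first `extra` chunks take one additional key to balance sizes.
        size = base + (1 if i < extra else 0)
        splits.append(keys[start:start + size])
        start += size
    return splits


if __name__ == "__main__":
    # Assumed sample of row keys returned by an external index lookup.
    hits = ["row-0007", "row-0001", "row-0042", "row-0013", "row-0001"]
    for i, chunk in enumerate(make_splits(hits, 2)):
        print(i, chunk)
```

Sorting before chunking is a design choice made here, on the assumption that keys close together in sort order tend to live in the same region; the number of chunks is the knob that controls how many map tasks run in parallel.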