Re: Hash indexing of HFiles

2011-07-19 Thread Claudio Martella
e a bit more suitable for my needs. > > On Mon, Jul 18, 2011 at 12:32 PM, Stack wrote: > >> On Mon, Jul 18, 2011 at 9:22 AM, Claudio Martella >> wrote: >>> Yes, I had a look at it a while ago. For what I know perfect hashing >>> doesn't work that good fo

Re: HBase and Hadoop 0.20-security-append

2011-07-18 Thread Claudio Martella
> > For production purposes internally, we're running a custom Hadoop build > based off of CDH3. > > If you run into any problems getting things setup, let us know and we'll try > to help out. > > Gary > > > > On Mon, Jul 18, 2011 at 9:13 AM, Claudio Mart

Re: Hash indexing of HFiles

2011-07-18 Thread Claudio Martella
On 7/18/11 6:05 PM, Stack wrote: > On Mon, Jul 18, 2011 at 4:04 AM, Claudio Martella > wrote: >> No, you can have collisions, so the index is not perfect (which means >> you can have buckets for colliding keys and empty unused entries in the >> hashtable directory). >

Re: HBase and Hadoop 0.20-security-append

2011-07-18 Thread Claudio Martella
On 7/18/11 5:50 PM, Stack wrote: > On Mon, Jul 18, 2011 at 6:01 AM, Claudio Martella > wrote: >> I'm guessing how HBase behaves with 0.20-security-append. Can I run it >> on this hadoop version? >> > My guess is that it will work (where'd you find this bran

HBase and Hadoop 0.20-security-append

2011-07-18 Thread Claudio Martella
g with Yahoo!'s 0.20-security-append to get both these features and allowing to deploy both the systems on the same cluster. I'm guessing how HBase behaves with 0.20-security-append. Can I run it on this hadoop version? Can anybody quickly report on that? Thanks Claudio -- Claudio Mart

Re: Hash indexing of HFiles

2011-07-18 Thread Claudio Martella
On 7/16/11 10:08 PM, Stack wrote: > On Fri, Jul 15, 2011 at 10:06 AM, Claudio Martella > wrote: >> On 7/15/11 6:24 PM, Stack wrote: >>> How do you figure the N in the below Claudio? >> N is the total amount of pairs in the sequence file. You know that when >>

Re: Hash indexing of HFiles

2011-07-15 Thread Claudio Martella
ted > in this recent posting by Mikhail Bautin of an hfile v2). I'd be interested in that, do you have a reference to it? > St.Ack > > On Fri, Jul 15, 2011 at 7:58 AM, Claudio Martella > wrote: >> Hi Michal, >> >> >> what I was talking about is more of a

Re: Hash indexing of HFiles

2011-07-15 Thread Claudio Martella
ster >> random i/o from hash indexing of data in each sequence file. >> >> Does anybody know if anybody has developed other indexing techniques for >> sequence files other than Btrees? >> >> >> Thanks! >> >> -- >> Claudio Martel

Re: data structure

2011-07-15 Thread Claudio Martella
>> - Original Message - >> From: Claudio Martella >> Sent: Fri Jul 15 2011 14:40:38 GMT+0200 (CET) >> To: >> CC: >> Subject: Re: data structure > >> supposed you want a per-hour granularity, you could have a key like this >> >> _ >

Re: data structure

2011-07-15 Thread Claudio Martella
the date into the row key, there are many > requests/impressions for a particular user (row key) > so no chance to put a date into the key > > andre > > > > -- Claudio Martella Free Software & Open Technologies Analyst TIS innovation park Via Siemens 19 | Siemenss

Hash indexing of HFiles

2011-07-15 Thread Claudio Martella
nce file. Does anybody know if anybody has developed other indexing techniques for sequence files other than Btrees? Thanks! -- Claudio Martella Free Software & Open Technologies Analyst TIS innovation park Via Siemens 19 | Siemensstr. 19 39100 Bolzano | 39100 Bozen Tel. +39 0471 068 123 Fa

Re: client-side caching

2011-07-05 Thread Claudio Martella
g sounds easy until you need to worry about invalidation. It's hard to > build efficient and correct invalidation. > On Jul 5, 2011 2:13 AM, "Claudio Martella" > wrote: >> I've seen that. But that's about caching on regionserver-side through >> memcache.

Re: client-side caching

2011-07-05 Thread Claudio Martella
'll implement it through memcache. On 7/4/11 7:03 PM, Ted Yu wrote: > See HBASE-4018 > > On Mon, Jul 4, 2011 at 7:33 AM, Claudio Martella > wrote: >> Hello list, >> >> i'm using hbase 0.90.3 on a 5 nodes cluster. I'm using a table as a >> st

client-side caching

2011-07-04 Thread Claudio Martella
e in these situations? some client-side caching already in hbase? Best, Claudio -- Claudio Martella Digital Technologies Unit Research & Development - Analyst TIS innovation park Via Siemens 19 | Siemensstr. 19 39100 Bolzano | 39100 Bozen Tel. +39 0471 068 123 Fax +39 0471 068 129 claudi

Re: on the impact of incremental counters

2011-06-20 Thread Claudio Martella
on RegionServer failure) your use case can tolerate. So in addition to > taking advantage of group commit you can amortize sync overhead further with > the tradeoff that under failure conditions your counters (or other data) may > become imprecise. For some use cases that is fine. > >

on the impact of incremental counters

2011-06-18 Thread Claudio Martella
bloom filters). Thanks! Claudio -- Claudio Martella Digital Technologies Unit Research & Development - Analyst TIS innovation park Via Siemens 19 | Siemensstr. 19 39100 Bolzano | 39100 Bozen Tel. +39 0471 068 123 Fax +39 0471 068 129 claudio.marte...@tis.bz.it http://www.tis.bz.it Short in

Re: is there an atomic checkAndPut in hbase?

2011-01-31 Thread Claudio Martella
> the > message is not the intended recipient or an authorized representative of the > intended recipient, you are hereby notified that any dissemination of this > communication is strictly prohibited. If you have received this communication > in > error, please notify us

Re: I give up, help please

2010-12-22 Thread Claudio Martella
Happy to have helped. On 12/22/10 6:37 AM, Pete Haidinyak wrote: > Good call, looks like that might have been my problem. Seems simple > now. ;-) > > Thanks > > -Pete > > > On Tue, 21 Dec 2010 20:33:31 -0800, Claudio Martella > wrote: > >> Could you ch

Re: I give up, help please

2010-12-21 Thread Claudio Martella
r.java:1236) > ... 9 more > Caused by: java.io.EOFException > at > java.io.DataInputStream.readUnsignedShort(DataInputStream.java:323) > at java.io.DataInputStream.readUTF(DataInputStream.java:572) > at > org.apache.hadoop.hbase.util.FSUtils.getVersion(FSUtils.java:151)

the semantics of HTable.put()

2010-12-18 Thread Claudio Martella
Hello list, just two lines for a proposal. Wouldn't it make more sense if put would return the old value in case the put ends up being an update instead of an insert? This would mimic HashMap's behavior and would be very useful. What do you think? -- Claudio Martella Digital Technol

Re: Determine in which row a column exists

2010-12-10 Thread Claudio Martella
>Value()); > } > > I was just thinking of making this scan faster within this table, maybe with > a BloomFilter. > > Another table with productId - clusters row will be the next option. > > Thank you. > > > On Fri, Dec 10, 2010 at 1:52 PM, Claudio Martella < >

Re: Determine in which row a column exists

2010-12-10 Thread Claudio Martella
o determine which clusters a product belongs to, we perform a > scan over the table using column, > > e.g. > > Scan s = new Scan(); > s.addColumn(Bytes.toBytes("products"), Bytes.toBytes("24659517")); > ResultScanner scanner = table.getScanner(s); > > I

GraphDB over HBase or Columnstore in general

2010-12-08 Thread Claudio Martella
ompact and fit mostly on the same regionserver. 3) I guess that for the scanning I'd make extensive use of Filters. I guess regexp Filter will be my friend. Do you have concerns about performance of filters applied to this data model? Thank you very much Claudio -- Claudio Martella Digital Te

Re: incremental counters and a global String->Long Dictionary

2010-12-02 Thread Claudio Martella
byte[], byte[], org.apache.hadoop.hbase.client.Put) > > St.Ack > > On Thu, Dec 2, 2010 at 7:42 AM, Claudio Martella > wrote: >> Hi Ryan, >> >> yes that would help for sure. Shouldn't this feature be documented? >> >> Thanks >> >> >>

Re: incremental counters and a global String->Long Dictionary

2010-12-02 Thread Claudio Martella
ed if the value does not exist. > > Would that help? > > -ryan > > On Tue, Nov 30, 2010 at 6:07 AM, Claudio Martella > wrote: >> Hi Dave, >> >> thanks for your idea. I also considered this possibility. Although the >> possibility of a collision is very small

Re: incremental counters and a global String->Long Dictionary

2010-12-02 Thread Claudio Martella
Hi Todd, you're right, there's no need to be purists in this case. Thanks On 12/1/10 9:24 AM, Todd Lipcon wrote: > On Tue, Nov 30, 2010 at 6:02 AM, Claudio Martella < > claudio.marte...@tis.bz.it> wrote: > >> Lars, >> >> yes, that's exactly

Re: incremental counters and a global String->Long Dictionary

2010-11-30 Thread Claudio Martella
t integer mapping for your dictionary. And, it is somewhat > recoverable if you ever lose your dictionary for some reason. > > Dave > > -----Original Message- > From: Claudio Martella [mailto:claudio.marte...@tis.bz.it] > Sent: Monday, November 29, 2010 7:13 AM >

Re: incremental counters and a global String->Long Dictionary

2010-11-30 Thread Claudio Martella
. And, if you can always check your >> dictionary later for collisions if this feels wrong. >> This should be a good deal simpler than trying to keep around an order >> dependent integer mapping for your dictionary. And, it is somewhat >> recoverable if you ev

Re: incremental counters and a global String->Long Dictionary

2010-11-29 Thread Claudio Martella
succeeding to do so is adding it and then releasing > the lock. Or some such. > > Lars > > On Nov 29, 2010, at 16:12, Claudio Martella > wrote: > >> Hello list, >> >> I'm kind of new to HBase, so I'll post this email with a request for >&

incremental counters and a global String->Long Dictionary

2010-11-29 Thread Claudio Martella
'd rely on a system with higher latency (ZK). Does anybody have a better solution with hbase? I guess using hbase_transational would also be a possibility, but again, what about speed and the actual issues with the package (like recovering in the face of hregion failure). Thank you, Claudio --