e a bit more suitable for my needs.
>
> On Mon, Jul 18, 2011 at 12:32 PM, Stack wrote:
>
>> On Mon, Jul 18, 2011 at 9:22 AM, Claudio Martella
>> wrote:
>>> Yes, I had a look at it a while ago. From what I know, perfect hashing
>>> doesn't work that well fo
>
> For production purposes internally, we're running a custom Hadoop build
> based off of CDH3.
>
> If you run into any problems getting things set up, let us know and we'll try
> to help out.
>
> Gary
>
>
>
> On Mon, Jul 18, 2011 at 9:13 AM, Claudio Mart
On 7/18/11 6:05 PM, Stack wrote:
> On Mon, Jul 18, 2011 at 4:04 AM, Claudio Martella
> wrote:
>> No, you can have collisions, so the index is not perfect (which means
>> you can have buckets for colliding keys and empty unused entries in the
>> hashtable directory).
>
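As an illustration of the point quoted above, here is a minimal sketch of a non-perfect hash index: a fixed-size directory where colliding keys share a bucket and some slots stay empty. The class and method names (`BucketedHashIndex`, `collidingSlots`, `emptySlots`) are hypothetical and are not taken from the thread or from any Hadoop/HBase API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a hash directory that is NOT perfect, so it needs
// buckets for colliding keys and may leave some directory slots unused.
public class BucketedHashIndex {
    private final List<List<String>> directory = new ArrayList<>();

    public BucketedHashIndex(int slots) {
        for (int i = 0; i < slots; i++) {
            directory.add(new ArrayList<>());
        }
    }

    // Map a key to a directory slot; floorMod keeps the result non-negative.
    private int slotFor(String key) {
        return Math.floorMod(key.hashCode(), directory.size());
    }

    // Add a key to the bucket of its slot, ignoring duplicates.
    public void add(String key) {
        List<String> bucket = directory.get(slotFor(key));
        if (!bucket.contains(key)) {
            bucket.add(key);
        }
    }

    // Look a key up by scanning only the bucket of its slot.
    public boolean contains(String key) {
        return directory.get(slotFor(key)).contains(key);
    }

    // Number of slots holding more than one key (collisions).
    public int collidingSlots() {
        int count = 0;
        for (List<String> bucket : directory) {
            if (bucket.size() > 1) {
                count++;
            }
        }
        return count;
    }

    // Number of slots holding no key at all (wasted directory entries).
    public int emptySlots() {
        int count = 0;
        for (List<String> bucket : directory) {
            if (bucket.isEmpty()) {
                count++;
            }
        }
        return count;
    }
}
```

With more keys than slots, at least one slot must hold a bucket of several keys; with fewer keys than slots, some entries are necessarily empty.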
On 7/18/11 5:50 PM, Stack wrote:
> On Mon, Jul 18, 2011 at 6:01 AM, Claudio Martella
> wrote:
>> I'm wondering how HBase behaves with 0.20-security-append. Can I run it
>> on this Hadoop version?
>>
> My guess is that it will work (where'd you find this bran
g with Yahoo!'s
0.20-security-append to get both these features and to allow deploying
both systems on the same cluster.
I'm wondering how HBase behaves with 0.20-security-append. Can I run it
on this Hadoop version?
Can anybody quickly report on that?
Thanks
Claudio
--
Claudio Mart
On 7/16/11 10:08 PM, Stack wrote:
> On Fri, Jul 15, 2011 at 10:06 AM, Claudio Martella
> wrote:
>> On 7/15/11 6:24 PM, Stack wrote:
>>> How do you figure the N in the below Claudio?
>> N is the total number of pairs in the sequence file. You know that when
>>
ted
> in this recent posting by Mikhail Bautin of an hfile v2).
I'd be interested in that, do you have a reference to it?
> St.Ack
>
> On Fri, Jul 15, 2011 at 7:58 AM, Claudio Martella
> wrote:
>> Hi Michal,
>>
>>
>> what I was talking about is more of a
ster
>> random i/o from hash indexing of data in each sequence file.
>>
>> Does anybody know whether anyone has developed indexing techniques for
>> sequence files other than B-trees?
>>
>>
>> Thanks!
>>
>> --
>> Claudio Martel
>> - Original Message -
>> From: Claudio Martella
>> Sent: Fri Jul 15 2011 14:40:38 GMT+0200 (CET)
>> To:
>> CC:
>> Subject: Re: data structure
>
>> Suppose you want per-hour granularity; you could have a key like this
>>
>> _
>
the date into the row key, there are many
> requests/impressions for a particular user (row key)
> so no chance to put a date into the key
>
> andre
>
>
>
>
--
Claudio Martella
Free Software & Open Technologies
Analyst
TIS innovation park
Via Siemens 19 | Siemenss
nce file.
Does anybody know whether anyone has developed indexing techniques for
sequence files other than B-trees?
Thanks!
--
Claudio Martella
Free Software & Open Technologies
Analyst
TIS innovation park
Via Siemens 19 | Siemensstr. 19
39100 Bolzano | 39100 Bozen
Tel. +39 0471 068 123
Fa
g sounds easy until you need to worry about invalidation. It's hard to
> build efficient and correct invalidation.
> On Jul 5, 2011 2:13 AM, "Claudio Martella"
> wrote:
>> I've seen that. But that's about caching on regionserver-side through
>> memcache.
>
'll
implement it through memcache.
On 7/4/11 7:03 PM, Ted Yu wrote:
> See HBASE-4018
>
> On Mon, Jul 4, 2011 at 7:33 AM, Claudio Martella wrote:
>> Hello list,
>>
>> I'm using HBase 0.90.3 on a 5-node cluster. I'm using a table as a
>> st
e in these situations? Is there some client-side caching
already in HBase?
Best,
Claudio
--
Claudio Martella
Digital Technologies
Unit Research & Development - Analyst
TIS innovation park
Via Siemens 19 | Siemensstr. 19
39100 Bolzano | 39100 Bozen
Tel. +39 0471 068 123
Fax +39 0471 068 129
claudi
on RegionServer failure) your use case can tolerate. So in addition to
> taking advantage of group commit you can amortize sync overhead further with
> the tradeoff that under failure conditions your counters (or other data) may
> become imprecise. For some use cases that is fine.
>
>
bloom filters).
Thanks!
Claudio
--
Claudio Martella
Digital Technologies
Unit Research & Development - Analyst
TIS innovation park
Via Siemens 19 | Siemensstr. 19
39100 Bolzano | 39100 Bozen
Tel. +39 0471 068 123
Fax +39 0471 068 129
claudio.marte...@tis.bz.it http://www.tis.bz.it
Short in
Happy to have helped.
On 12/22/10 6:37 AM, Pete Haidinyak wrote:
> Good call, looks like that might have been my problem. Seems simple
> now. ;-)
>
> Thanks
>
> -Pete
>
>
> On Tue, 21 Dec 2010 20:33:31 -0800, Claudio Martella
> wrote:
>
>> Could you ch
r.java:1236)
> ... 9 more
> Caused by: java.io.EOFException
> at
> java.io.DataInputStream.readUnsignedShort(DataInputStream.java:323)
> at java.io.DataInputStream.readUTF(DataInputStream.java:572)
> at
> org.apache.hadoop.hbase.util.FSUtils.getVersion(FSUtils.java:151)
>
Hello list,
just a quick proposal. Wouldn't it make more sense for put to
return the old value when the put ends up being an update rather than
an insert?
This would mimic HashMap's behavior and would be very useful. What do
you think?
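
For reference, the HashMap behavior the proposal refers to can be sketched as follows. This uses only `java.util.HashMap`; the wrapper name `putAndGetPrevious` is hypothetical, and nothing here is an HBase API.

```java
import java.util.HashMap;
import java.util.Map;

public class PutReturnsOldValue {
    // Store value under key and hand back whatever was there before.
    // Map.put returns the previous value, or null if the key was absent.
    public static String putAndGetPrevious(Map<String, String> table, String key, String value) {
        return table.put(key, value);
    }

    public static void main(String[] args) {
        Map<String, String> table = new HashMap<>();
        // First write to the key: an insert, so there is no previous value.
        System.out.println(putAndGetPrevious(table, "row1", "a"));
        // Second write to the same key: an update, so the old value comes back.
        System.out.println(putAndGetPrevious(table, "row1", "b"));
    }
}
```

The first call returns null (insert) and the second returns "a" (update), which is how a caller can tell the two cases apart, provided null is never stored as a value.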
--
Claudio Martella
Digital Technol
>Value());
> }
>
> I was just thinking of making this scan faster within this table, maybe with
> a BloomFilter.
>
> Another table with productId - clusters row will be the next option.
>
> Thank you.
>
>
> On Fri, Dec 10, 2010 at 1:52 PM, Claudio Martella <
>
o determine which clusters a product belongs to, we perform a
> scan over the table using column,
>
> e.g.
>
> Scan s = new Scan();
> s.addColumn(Bytes.toBytes("products"), Bytes.toBytes("24659517"));
> ResultScanner scanner = table.getScanner(s);
>
> I
ompact
and fit mostly on the same regionserver.
3) I guess that for the scanning I'd make extensive use of Filters. I
guess regexp Filter will be my friend. Do you have concerns about
performance of filters applied to this data model?
Thank you very much
Claudio
--
Claudio Martella
Digital Te
byte[], byte[], org.apache.hadoop.hbase.client.Put)
>
> St.Ack
>
> On Thu, Dec 2, 2010 at 7:42 AM, Claudio Martella
> wrote:
>> Hi Ryan,
>>
>> yes that would help for sure. Shouldn't this feature be documented?
>>
>> Thanks
>>
>>
>>
ed if the value does not exist.
>
> Would that help?
>
> -ryan
>
> On Tue, Nov 30, 2010 at 6:07 AM, Claudio Martella
> wrote:
>> Hi Dave,
>>
>> thanks for your idea. I also considered this possibility. Although the
>> possibility of a collision is very small
Hi Todd,
you're right, there's no need to be purists in this case.
Thanks
On 12/1/10 9:24 AM, Todd Lipcon wrote:
> On Tue, Nov 30, 2010 at 6:02 AM, Claudio Martella <
> claudio.marte...@tis.bz.it> wrote:
>
>> Lars,
>>
>> yes, that's exactly
t integer mapping for your dictionary. And, it is somewhat
> recoverable if you ever lose your dictionary for some reason.
>
> Dave
>
> -----Original Message-
> From: Claudio Martella [mailto:claudio.marte...@tis.bz.it]
> Sent: Monday, November 29, 2010 7:13 AM
>
. And you can always check your
>> dictionary later for collisions if this feels wrong.
>> This should be a good deal simpler than trying to keep around an order
>> dependent integer mapping for your dictionary. And, it is somewhat
>> recoverable if you ev
succeeding to do so is adding it and then releasing
> the lock. Or some such.
>
> Lars
>
> On Nov 29, 2010, at 16:12, Claudio Martella
> wrote:
>
>> Hello list,
>>
>> I'm kind of new to HBase, so I'll post this email with a request for
>>
'd rely on a system with higher latency (ZK).
Does anybody have a better solution with HBase? I guess using
hbase_transactional would also be a possibility, but again, what about
speed and the actual issues with the package (like recovering in the
face of hregion failure).
Thank you,
Claudio
--