On Nov 13, Stack wrote:
>On Sun, Nov 13, 2011 at 7:13 AM, Arvind Jayaprakash
>wrote:
>> A common confusion is b/w MAX_FILESIZE and BLOCKSIZE. Given that
>> MAX_FILESIZE is not listed on :60010/master.jsp, one tends to assume
>> BLOCKSIZE represents that value.
>We s
A common confusion is b/w MAX_FILESIZE and BLOCKSIZE. Given that
MAX_FILESIZE is not listed on :60010/master.jsp, one tends to assume
BLOCKSIZE represents that value.
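To make the distinction concrete, here is a rough back-of-the-envelope sketch. It assumes that BLOCKSIZE is the size of a block inside a store file and that MAX_FILESIZE is the store size at which a region becomes a candidate for splitting. The 64 KB block size is the default quoted elsewhere in these threads; the 256 MB MAX_FILESIZE, the 10 GB table size, and the helper names are assumptions chosen for illustration only, and real block and region counts will differ.

```python
KB = 1024
MB = 1024 * KB


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division, avoiding float rounding."""
    return -(-a // b)


def blocks_per_store_file(file_bytes: int, block_bytes: int) -> int:
    """Approximate number of blocks in one store file of the given size."""
    return max(1, ceil_div(file_bytes, block_bytes))


def approx_region_count(table_bytes: int, max_filesize_bytes: int) -> int:
    """Very rough region count if every region grew to exactly MAX_FILESIZE."""
    return max(1, ceil_div(table_bytes, max_filesize_bytes))


block_size = 64 * KB       # 65536, the default block size cited in the thread
max_filesize = 256 * MB    # assumed value, for illustration only
table_size = 10 * 1024 * MB  # hypothetical 10 GB table

print(blocks_per_store_file(max_filesize, block_size))  # blocks in a full store file
print(approx_region_count(table_size, max_filesize))    # rough number of regions
```

Under these assumptions, changing BLOCKSIZE changes how many blocks a store file holds, while changing MAX_FILESIZE changes how many regions the table ends up with.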
On Nov 10, lars hofhansl wrote:
>"BLOCKSIZE => '536870912'"
>
>
>You set your blocksize to 512mb? The default is 64k (65536), try t
On Sep 06, sagar naik wrote:
>I can dedup based on timestamp of the event.
>Can I increment the counter value and assign the version as the timestamp of
>this event ?
Is it because you have an infinitesimally fine-grained timestamp that
you assume two events won't happen at the "same time" (as defined
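The concern raised here can be shown with a toy model. This is not HBase code: the class and method names are hypothetical, and it only mimics the idea of storing one cell per event with the event timestamp as the version, so that a redelivered event overwrites itself instead of being counted twice.

```python
class VersionedEventLog:
    """Toy model: one cell per (counter key, event timestamp)."""

    def __init__(self):
        # counter_key -> {timestamp: event_id}
        self._cells = {}

    def record(self, counter_key, timestamp, event_id):
        # Writing the same timestamp again overwrites the earlier cell.
        self._cells.setdefault(counter_key, {})[timestamp] = event_id

    def count(self, counter_key):
        # The count is the number of distinct timestamps seen.
        return len(self._cells.get(counter_key, {}))


log = VersionedEventLog()
log.record("clicks", 1000, "event-a")
log.record("clicks", 1000, "event-a")  # duplicate delivery, collapses as intended
log.record("clicks", 1001, "event-b")
log.record("clicks", 1001, "event-c")  # distinct event, same timestamp: lost
print(log.count("clicks"))
```

Four writes describe three distinct events, but the model counts two: the duplicate is removed as intended, and so is the distinct event that happened to share a timestamp. That undercount is the risk if the timestamp is not fine-grained enough to be unique per event.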
On Sep 07, lars hofhansl wrote:
>Hi Arvind,
>
>This is interesting:
>
>> * Multiple machines can concurrently/actively handle requests for the
>> same key, so the loss of one server does not mean that a range of keys
>> is temporarily unavailable. An HBase cluster does have a partial,
>> temporary o
On Sep 06, Something Something wrote:
>Anyway, before I spend a lot of time on it, I thought I should check if
>anyone has compared HBase against CitrusLeaf. If you have, I would greatly
>appreciate it if you would share your experiences.
Disclaimer: I was an early evaluator/tester of citrusleaf ab
On Sep 02, sagar naik wrote:
>We are counting events for our application.
>Sometimes, the same event arrives multiple times.
>This leads to counting of same event multiple times.
>Is there a way I can avoid this ?
>(Say timestamp on value or filters ?)
A basic data model question is do you want a
Is it possible to control the region size (hstore size) on a per table
basis? I have certain applications where the overall keyspace is small
but I'd like the data to spread nicely over many region servers that use
a certain table and another one that has potentially 2 orders of
magnitude of data a
On Jul 14, Andre Reiter wrote:
>Now we are running MapReduce jobs to generate a report: for example, we
>want to know how many impressions were made by all users in the last x
>days. Therefore the scan of the MR job runs over all the data in our
>HBase table for the particular family. this takes at t
On Jul 07, Andrew Purtell wrote:
>> Since HDFS is mostly write once how are updates/deletes handled?
>
>Not mostly, only write once.
>
>Deletes are just another write, but one that writes tombstones
>"covering" data with older timestamps.
>
>When serving queries, HBase searches store files back in
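The tombstone idea described above can be sketched with a small in-memory model. This is an illustration only, not HBase's read path: the data layout (a list of dicts standing in for store files) and the function name are assumptions, and it handles a single column with no compaction.

```python
TOMBSTONE = object()  # sentinel marking a delete


def read_latest(store_files, row):
    """Return the newest live value for row, or None if it is deleted or absent.

    store_files: list of dicts mapping row -> list of (timestamp, value).
    Files are never modified; a delete is just another written cell.
    """
    cells = []
    for store_file in store_files:
        cells.extend(store_file.get(row, []))

    # The newest tombstone covers every cell at or below its timestamp.
    delete_ts = max((ts for ts, v in cells if v is TOMBSTONE), default=None)

    live = [
        (ts, v)
        for ts, v in cells
        if v is not TOMBSTONE and (delete_ts is None or ts > delete_ts)
    ]
    if not live:
        return None
    return max(live, key=lambda cell: cell[0])[1]


files = [{"r1": [(1, "a")]}, {"r1": [(2, TOMBSTONE)]}]
print(read_latest(files, "r1"))   # value at ts=1 is covered by the tombstone

files.append({"r1": [(3, "b")]})
print(read_latest(files, "r1"))   # a later write is visible again
```

Nothing is rewritten in place: the delete and the later put are each appended as new data, and the read decides what is visible by comparing timestamps.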
On Jun 02, Sam Seigal wrote:
> -
>
>My eventId can be one of 12 distinct values (let us say from A-L), and I
>have a 4 node cluster running HBase right now.
>
>After doing some research in our OLTP database, I found that the majority
>(about 45% of the data) from the last 6 months written in the
On May 31, Ferdy Galema wrote:
>You can use the merge tool to combine adjacent regions. It requires a
>bit of manual work because you need to specify the regions by hand. The
>cluster also needs to be offline (I recommend keeping ZooKeeper running
>though). Check if merging succeeded with the hb
My setup seems to have a lot of regions with no data that just keep
accumulating over time. Here are some details:
I have time-series data (created by OpenTSDB) being inserted into HBase
every minute. Since the data has little value after, say, 15 days, I go
ahead and delete all old data.
When I lo