1.) We use a different cluster for each app. I don't know if this is
best practice or not to be honest. We just wanted to separate downtime
and potential damage for each application.
2.) We usually use the HBase APIs directly. Having said that, we
recently started working on a new service. Currently we're trying to
use the REST API on HBase 0.90.1, and I'm getting a 500 response saying
"Invalid row key".
I am trying to post this from a Python program:
URL:
http://ourserver:8080/rest_test/713c5967-b2e9-4f44-b0a1-8b862838f865/metrics:json
Data:
{"Row": {"Cell": [{"@column": "metrics:json
> What I'd like ideally is to get an idea of what the
> fixed cost (in terms of bytes) is for each of my tables, and then understand
> how I can calculate a variable bytes/record cost.
>
> Is this feasible?
>
> Norbert
>
> On Mon, Jan 24, 2011 at 1:16 PM, Xavier Stevens wrote:
>
Norbert,
It would probably be best if you wrote a quick MapReduce job that
iterates over those records and outputs the sum of bytes for each one.
Then you could use that output and get some general descriptive
statistics based on it.
Cheers,
-Xavier
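
For big tables, the MapReduce approach Xavier describes is the right tool;
as a quick client-side sanity check on a smaller table, the same per-row
byte accounting can be sketched in Python with the happybase package (the
host and table names here are hypothetical, and the HBase Thrift gateway
has to be running):

import happybase  # third-party client that talks to the HBase Thrift gateway

connection = happybase.Connection("ourserver")   # hypothetical host
table = connection.table("some_table")           # hypothetical table

sizes = []
for row_key, columns in table.scan():
    # Approximate logical size of the row: key + column names + values.
    # On-disk KeyValue overhead (timestamps, type bytes, etc.) is not
    # counted, so treat this as a lower bound.
    sizes.append(len(row_key) +
                 sum(len(col) + len(val) for col, val in columns.items()))

if sizes:
    print("rows:", len(sizes))
    print("total bytes:", sum(sizes))
    print("mean bytes/row:", sum(sizes) / float(len(sizes)))

The per-row numbers are what you would feed into the descriptive statistics
mentioned above; past a few million rows, do the same accounting inside a
MapReduce job over TableInputFormat instead.
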
Are you using a newer Linux kernel with the new and "improved" memory
allocator?
If so, try setting this in hadoop-env.sh:
export MALLOC_ARENA_MAX=4
Maybe start by setting it to 4. You can thank Todd Lipcon if this works
for you.
Cheers,
-Xavier
> But why is your system performing better? The default TableInputFormat is just
> creating N map tasks, one for each region, which are all roughly the
> same data-size.
>
> What do you do?
> -ryan
>
> On Mon, Jul 26, 2010 at 3:29 PM, Xavier Stevens wrote:
We have something that might interest you.
http://socorro.googlecode.com/svn/trunk/analysis/src/java/org/apache/hadoop/hbase/mapreduce/
We haven't fully tested everything yet, so don't blame us if something
goes wrong. It's basically the exact same as TableInputFormat except it
takes an array o