On Wed, May 18, 2011 at 5:11 PM, Weihua JIANG wrote:
> All the DNs almost have the same number of blocks. Major compaction
> makes no difference.
>
I would expect major compaction to even out the number of blocks across
the cluster, and it would move each region's data local to its
regionserver.
I have another question about option 2. It seems I need to handle the
distributed scan differently to read from start row to end row, assuming a
1-byte hash of the original key is used as the prefix, since the order of the
original key range differs from that of the resulting distributed key range.
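As a sketch of what that distributed scan could look like (the bucketing scheme below is an assumption, not HBaseWD's actual API; a TreeMap stands in for an HBase table, and `salt()`, `distributedScan()` and `BUCKETS` are hypothetical names):

```java
import java.util.*;

public class SaltedScan {
    static final int BUCKETS = 16; // assumed bucket count derived from the 1-byte hash

    // Hypothetical salting: prefix = hash(originalKey) % BUCKETS, two hex digits.
    static String salt(String originalKey) {
        int bucket = (originalKey.hashCode() & 0x7fffffff) % BUCKETS;
        return String.format("%02x-%s", bucket, originalKey);
    }

    // A logical scan over [startRow, stopRow) of original keys must be run
    // once per bucket and the results merged, because salting destroys the
    // global ordering of the original keys.
    static List<String> distributedScan(NavigableMap<String, String> table,
                                        String startRow, String stopRow) {
        List<String> merged = new ArrayList<>();
        for (int b = 0; b < BUCKETS; b++) {
            String prefix = String.format("%02x-", b);
            // Per-bucket range scan: [prefix + startRow, prefix + stopRow)
            for (Map.Entry<String, String> e :
                     table.subMap(prefix + startRow, prefix + stopRow).entrySet()) {
                merged.add(e.getKey().substring(3)); // strip the "xx-" salt
            }
        }
        Collections.sort(merged); // restore original-key order client-side
        return merged;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> table = new TreeMap<>();
        for (String k : new String[] {"row-01", "row-02", "row-03", "row-10"}) {
            table.put(salt(k), "v");
        }
        // prints [row-01, row-02, row-03]
        System.out.println(distributedScan(table, "row-01", "row-04"));
    }
}
```

The point is that a single [start, end) range becomes BUCKETS parallel range scans plus a client-side merge; with a real table the per-bucket scans can run concurrently.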
All the DNs almost have the same number of blocks. Major compaction
makes no difference.
Thanks
Weihua
2011/5/18 Stack :
> Are there more blocks on these hot DNs than there are on the cool
> ones? If you run a major compaction and then run your tests, does it
> make a difference?
> St.Ack
>
Alex:
Can you summarize HBaseWD in your blog, including points 1 and 2 below ?
Thanks
On Wed, May 18, 2011 at 8:03 AM, Alex Baranau wrote:
> There are several options here. E.g.:
>
> 1) Given that you have "original key" of the record, you can fetch the
> stored record key from HBase and use it
Hi there-
Re: " When I started inserting data in the tables it seems that they are
always inserting in a single region,"
You probably want to read this as a general warning...
http://hbase.apache.org/book.html#timeseries
.. and check this out as a potential solution for bucketing timeseries k
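A minimal illustration of that kind of bucketing for time-based keys (the modulo scheme, bucket count, and key layout below are assumptions for the sketch, not something the book prescribes):

```java
public class TimeseriesKeys {
    static final int BUCKETS = 8; // assumed; in practice tune to your region/server count

    // A purely time-ordered key sends every write to the single "latest"
    // region. Prefixing the key with (timestamp % BUCKETS) spreads writes
    // across up to BUCKETS regions, at the cost of needing BUCKETS parallel
    // scans to read a time range back in order.
    static String bucketedKey(long tsMillis, String source) {
        long bucket = tsMillis % BUCKETS;
        return bucket + "|" + tsMillis + "|" + source;
    }

    public static void main(String[] args) {
        System.out.println(bucketedKey(1305750000000L, "sensor-1"));
    }
}
```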
Hi,
I have three tables; one receives 1500 m/s and the other two about 500
m/s each.
My row key is based on time in all three tables. When I started inserting data
into the tables, it seems that they always insert into a single region, which
is supposed to be normal given that the key i
Ian:
Please take a look at https://issues.apache.org/jira/browse/HBASE-3794:
+TEST_UTIL.getConfiguration().setInt("hbase.regionserver.port", 0);
TestRegionServer rs = new TestRegionServer(TEST_UTIL.getConfiguration());
On Wed, May 18, 2011 at 2:23 PM, Stack wrote:
> On Wed, May 18, 201
On Wed, May 18, 2011 at 1:50 PM, Ian Stevens wrote:
> Hi everyone. We had some tests which were using HBaseTestingUtility to start
> a single node cluster. These worked fine on our desktops and testing
> environments, but when we switched to running the tests on Amazon Web
> Services, startMini
Thank you, I like the second option better to avoid the roundtrip to HBase.
I am trying it out now.
On Wed, May 18, 2011 at 10:03 AM, Alex Baranau wrote:
> There are several options here. E.g.:
>
> 1) Given that you have "original key" of the record, you can fetch the
> stored record key from HBa
Hi everyone. We had some tests which were using HBaseTestingUtility to start a
single node cluster. These worked fine on our desktops and testing
environments, but when we switched to running the tests on Amazon Web Services,
startMiniCluster() raised a BindException:
> [exec] u.startM
Can you run hbck?
J-D
2011/5/17 bijieshan :
> Yes, you're right. When counting .META., the result will exclude the -ROOT-
> region and the .META. region. Pardon me, I should not have mentioned that.
> Maybe the 2 missing regions are just a coincidence here; I can show another
> scenario about this
Vidhyashankar:
table.getRegionsInfo() is for advanced users (such as you) :-)
Anyway, we shouldn't force users to call it.
On Wed, May 18, 2011 at 11:12 AM, Vidhyashankar Venkataraman <
vidhy...@yahoo-inc.com> wrote:
> Thanks Ted! Will do it right away.
>
> 1. we should provide the following new
Thanks Ted! Will do it right away.
1. we should provide the following new API where numOfRegions is the
expected number of regions to go online:
I used table.getRegionsInfo() to make sure all regions were online instead of
this function. But that function requires a priori knowledge of the number
Vidhyashankar:
Please file the following JIRAs:
1. we should provide the following new API where numOfRegions is the
expected number of regions to go online:
public boolean isTableAvailable(final byte[] tableName, int numOfRegions)
    throws IOException {
2. HBaseAdmin.createTableAsync() should c
On Tue, May 17, 2011 at 4:25 PM, Vidhyashankar Venkataraman
wrote:
> 2. The master getting stuck unable to delete a WAL (I have seen this before
> on this forum and a related JIRA on this one): We had worked around by
> manually deleting a WAL. But during times when the master crashed during
As in, the use of isTableAvailable there indicates that a bulk load should
happen only if all the regions are available.
But that may not be the case, since the function returns true if even one
region (the regionCount.get() > 0 check) is online.
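The difference between the two semantics can be sketched in isolation (the method names and signatures below are illustrative, not HBaseAdmin's actual code):

```java
public class TableAvailability {
    // Current semantics described above: "available" as soon as ANY region
    // of the table is online (the regionCount > 0 check).
    static boolean isTableAvailableBuggy(int onlineRegions) {
        return onlineRegions > 0;
    }

    // Proposed semantics: available only once the expected number of regions
    // is online (the numOfRegions overload suggested for the new API).
    static boolean isTableAvailable(int onlineRegions, int numOfRegions) {
        return onlineRegions >= numOfRegions;
    }

    public static void main(String[] args) {
        System.out.println(isTableAvailableBuggy(1));   // true: unsafe before a bulk load
        System.out.println(isTableAvailable(1, 20));    // false: keep waiting
        System.out.println(isTableAvailable(20, 20));   // true: all regions online
    }
}
```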
V
On 5/17/11 7:14 PM, "Ted Yu" wrote:
Did you mean
Is this the issue:
http://search-hadoop.com/m/4uDV51XrPxj/ipv6&subj=Re+HBase+Client+connect+to+remote+HBase?
St.Ack
On Tue, May 17, 2011 at 7:47 AM, Sergey Bartunov wrote:
> I'd just installed new Ubuntu 11.04, downloaded hbase 0.90.2 and run
> bin/start-hbase.sh
>
> It starts but not correctly,
For jobs you create, it sounds like a great idea.
What about other jobs, such as Hive/Pig jobs?
Does anyone have any idea how this can be done for all MR jobs in a cluster,
no matter how the jobs are triggered?
Ophir
On Wed, May 18, 2011 at 7:09 PM, Joey Echeverria wrote:
> Hi Ophir,
>
> That sounds like
I'd just installed the new Ubuntu 11.04, downloaded hbase 0.90.2 and ran
bin/start-hbase.sh
It starts, but not correctly, i.e. I couldn't create a new table from the
shell. Everything had worked on Ubuntu 10.10.
The error message from the logs:
org.apache.hadoop.hbase.client.RetriesExhaustedException: F
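If the IPv6 problem from the thread Stack linked is the cause (a known issue on some Ubuntu setups where the JVM binds to ::1), one commonly cited workaround is forcing the JVM onto the IPv4 stack; the exact file and variable below assume a standard HBase layout:

```shell
# conf/hbase-env.sh -- prefer IPv4 so master/regionserver don't bind to ::1
export HBASE_OPTS="$HBASE_OPTS -Djava.net.preferIPv4Stack=true"
```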
Hi Ophir,
That sounds like a useful feature, maybe file a jira?
I've never tried to save counters from the MR job into HBase, but you
could pull them from the file as you said, or from the Job object after
waitForCompletion() returns by calling getCounters().
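As a sketch of the "store counters into HBase" idea (the row/qualifier layout is an assumption; counters are simulated with a plain Map so this runs without Hadoop, whereas in a real job you would iterate job.getCounters() after waitForCompletion()):

```java
import java.util.*;

public class CountersToRows {
    // Hypothetical mapping of MR counters to HBase cells:
    //   row key          = job id
    //   column qualifier = "group:counter"
    //   value            = counter value as a string
    // Here cells are modeled as a Map from "rowKey/qualifier" to value.
    static Map<String, String> toCells(String jobId, Map<String, Long> counters) {
        Map<String, String> cells = new TreeMap<>();
        for (Map.Entry<String, Long> c : counters.entrySet()) {
            cells.put(jobId + "/" + c.getKey(), Long.toString(c.getValue()));
        }
        return cells;
    }

    public static void main(String[] args) {
        Map<String, Long> counters = new LinkedHashMap<>();
        counters.put("FileSystemCounters:HDFS_BYTES_READ", 123456L);
        counters.put("MyApp:RECORDS_SKIPPED", 42L);
        System.out.println(toCells("job_201105180001_0007", counters));
    }
}
```

With this layout, one Put per job (row = job id) carries all counters, and per-counter history across jobs is a simple prefix scan on the counter table.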
-Joey
On Wed, May 18, 2011 at 8:21 AM,
Hi All,
Currently an MR job spills its counters into a file at the end of the run.
Is there any built-in configuration/plug-in to make it store these counters
in HBase as well?
Sounds to me like a great feature!
Has anybody done something similar?
If you did, how did you do it? Run on directory an
There are several options here. E.g.:
1) Given that you have "original key" of the record, you can fetch the
stored record key from HBase and use it to create Put with updated (or new)
cells.
Currently you'll need to use a distributed scan for that; there's no analogue
for the Get operation yet (see h
Are there more blocks on these hot DNs than there are on the cool
ones? If you run a major compaction and then run your tests, does it
make a difference?
St.Ack
On Tue, May 17, 2011 at 8:03 PM, Weihua JIANG wrote:
> -ROOT- and .META. table are not served by these hot region servers.
>
> I gener
It's not the number of tables that is of import, it's the number of
regions. You can have your regions in as many tables as you like. I
do not believe there is a cost to having more tables.
St.Ack
On Wed, May 18, 2011 at 5:54 AM, Wayne wrote:
> How many tables can a cluster realistically handle or
stack-3 wrote:
>
> On Mon, May 16, 2011 at 4:55 AM, Stan Barton wrote:
>>> Sorry. How do you enable overcommitment of memory, or do you mean to
>>> say that your processes add up to more than the RAM you have?
>>>
>>
>> The memory overcommitment is needed because in order to let java still
>>
How many tables can a cluster realistically handle or how many tables/node
can be supported? I am looking for a realistic idea of whether a 10 node
cluster can support 100 or even 500 tables. I realize it is recommended to
have a few tables at most (and to use the row key to add everything to one
t
Not out of the box. I use the following resource for packaging LZO with
the Cloudera release:
https://github.com/toddlipcon/hadoop-lzo-packager
On 05/18/2011 08:35 AM, Pete Haidinyak wrote:
Does the Cloudera VM have LZO data compression available? If not,
since it's a 32-bit system, what's the be