Hi,
Thanks for the testing and performance report!
You said you used the Stargate client package? It is pretty basic, written
mainly as a convenience for writing test cases in the test suite.
Regarding Stargate quality in general, this is an alpha release. It can survive
torture testing with P
fine tuning? I am not sure if caching helps in the case of
random reads. I agree that the all-local setup is naive; I will do a more
realistic test and share my observations.
Haijun
____
From: Andrew Purtell
To: hbase-user@hadoop.apache.org
Sent: Monday, August 3,
Hello,
It looks like the load put on HDFS by HBase during this rapid upload of
data exceeds the I/O capacity of your cluster. What I would do in this case
is add more nodes until the load is sufficiently spread around.
You have a five node cluster? How much RAM? How many CPU cores? How is the
di
I agree. Well done!
You may also want to look at how Vertica implements parallel analytics on
top of a column based store commercially:
http://www.vertica.com/_pdf/VerticaArchitectureWhitePaper.pdf
For example, this is interesting:
"Logical tables are decomposed and physically stored as over
This will work on the vanilla 0.18.3 also.
Let me pull up this branch now to the latest 0.20 branch. It has not been
updated since the first RC.
- Andy
From: stack
To: hbase-user@hadoop.apache.org; arie.karhend...@gmail.com
Sent: Monday, August 10, 2009 10
Your replication setting is 6? Why do you raise it above the
default (3)? I run with replication=2 on small clusters to
improve write performance.
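For a small cluster, the lower replication factor can be set cluster-wide in hdfs-site.xml. A sketch (replication below 3 trades durability for write throughput, so weigh that against your data-loss tolerance):

```xml
<!-- hdfs-site.xml: default block replication for newly written files -->
<property>
  <name>dfs.replication</name>
  <value>2</value>
</property>
```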
- Andy
From: llpind
To: hbase-user@hadoop.apache.org
Sent: Wednesday, August 12, 2009 1:15:31 PM
Subject: Re
Yes, if you are using Put.setWriteToWAL(false), then the data won't be
persisted until a memstore flush happens.
- Andy
--- On Sat, 8/15/09, Amandeep Khurana wrote:
From: Amandeep Khurana
Subject: Re: data loss with hbase 0.19.3
To: hbase-user@hadoop.apache.org
Date: Saturday, Augus
Most that I am aware of set up transient test environments on EC2.
You can use one instance to create an EBS volume containing all software
and config you need, then snapshot it, then clone volumes based on the
snapshot to attach to any number of instances you need. Use X-Large
instances, at l
I back Stargate. It is not going away.
- Andy
From: Greg Cottman
To: "hbase-user@hadoop.apache.org"
Sent: Tuesday, August 18, 2009 4:30:19 PM
Subject: REST or Stargate?
I'm still a little confused by 0.20.0 releasing both a rewrite of the REST
interface
Some number of CPU instructions are always emulated when running in a VM:
anything that would affect real processor state with respect to hardware or
the integrity of other tasks. MMU functions are virtualized/shadowed
and require an extra level of mediation. Emulation of privileged op
The behavior of TableInputFormat is to schedule one mapper for every table
region.
In addition to what others have said already, if your reducer is doing little
more than storing data back into HBase (via TableOutputFormat), then you can
consider writing results back to HBase directly from the
There are plans to host live region assignments in ZK and keep only an
up-to-date copy of this state in META for use on cold boot. This is on the
roadmap for 0.21 but perhaps could be considered for 0.20.1 also. This may help
here.
A TM development group saw the same behavior on a 0.19 cluster
Hi,
This is far too little RAM and underpowered for CPU also. The rule of thumb is
1GB RAM for system, 1GB RAM for each Hadoop daemon (HDFS, jobtracker,
tasktracker, etc.), 1GB RAM for Zookeeper, 1GB RAM (but more if you want
performance/caching) for HBase region servers; and 1 hardware core fo
avior.
JG
Andrew Purtell wrote:
> There are plans to host live region assignments in ZK and keep only an
> up-to-date copy of this state in META for use on cold boot. This is on the
> roadmap for 0.21 but perhaps could be considered for 0.20.1 also. This may
> help here.
> A TM d
See http://issues.apache.org/jira/browse/HDFS-200. I think support for what
HBase needs from HDFS is almost there. Alternatively you could consider a
different type of underlying filesystem -- Lustre, Gluster, etc.
However, you say that HBase crashes for a very small table. I wonder why that
is
In this keynote address here at VLDB 2009 (http://vldb2009.org/?q=node/22)
Raghu Ramakrishnan, Yahoo! Research's Chief Scientist, made prominent mention
of HBase, much to my surprise (and later chagrin). This happened near the end
of the talk when a number of the new elastic/scalable/"nosql" sto
ZooKeeper is deployed as an ensemble of quorum peers in order to be highly
available. Ensembles should have an odd number of peers so that one side of any
set of conflicting votes (should they occur) is always a majority. Typical
deployments are of 3, 5, or 7 dedicated servers.
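The majority arithmetic behind the odd-sized recommendation can be sketched in a few lines (illustrative only; QuorumMath is a made-up name, not a ZooKeeper class):

```java
public class QuorumMath {
    // Votes needed for a majority in an ensemble of n peers
    static int majority(int n) { return n / 2 + 1; }

    // Peers an ensemble of n can lose and still form a majority
    static int tolerated(int n) { return n - majority(n); }

    public static void main(String[] args) {
        // An even-sized ensemble buys nothing: 4 peers tolerate the
        // same single failure as 3, hence odd-sized deployments.
        for (int n : new int[] {3, 4, 5, 7}) {
            System.out.println(n + " peers: majority " + majority(n)
                + ", tolerates " + tolerated(n) + " failure(s)");
        }
    }
}
```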
- Andy
an improvement (do you think the kick up the behind I rendered
him after his Amsterdam talk could have had anything to do with it?).
Can we write him to find out more about how the evaluation was done?
Should we try and get into VLDB next year?
Good stuff Andy. Anything else interesting at the conference?
dfordsteph...@gmail.com> wrote:
> Interesting. I need to see what sort of eval was going on for that
> presentation...
>
> He probably forgot to tweak GC :)
>
> On Tue, Aug 25, 2009 at 9:32 AM, Andrew Purtell
> wrote:
>
> > > Can we write him to figure more on how e
ested in our project. But we
> need more flexible data store schema to resolve engineering problems,
> especially performance and practicability.
>
> @andy
> Does Ryan's result differ from JG's?
> On Wed, Aug 26, 2009 at 2:50 AM, Andrew Purtell wrote:
>
>> H
> dfs.datanode.socket.write.timeout => 0
This isn't needed anymore given the locally patched Hadoop jar we distribute
containing the fix for HDFS-127.
From: stack
To: hbase-user@hadoop.apache.org
Sent: Thursday, August 27, 2009 6:29:38 AM
Subject: Re: Setting
It wouldn't hurt, as you mention, for increased tolerance of loss of data nodes.
But the clients cache ROOT and META, so the read performance gain will be negligible.
- Andy
From: Alexandra Alecu
To: hbase-user@hadoop.apache.org
Sent: Thursday, August 27, 2009 6:
Stargate scanners support start and stop row keys, so you can bound it to a
range scan. You don't need to know an exact key when specifying start and stop
keys for HBase scanners. Just use something lexicographically close.
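The "lexicographically close" trick above can be sketched in plain Java (hypothetical helper, not part of HBase or Stargate; naive about a trailing 0xFF byte):

```java
import java.util.Arrays;

public class RowKeyBounds {
    // Smallest key strictly greater than every key that starts with
    // this prefix (naive: does not handle a trailing 0xFF byte).
    static byte[] stopKeyForPrefix(byte[] prefix) {
        byte[] stop = Arrays.copyOf(prefix, prefix.length);
        stop[stop.length - 1]++;
        return stop;
    }

    public static void main(String[] args) {
        byte[] start = "user42".getBytes();
        byte[] stop = stopKeyForPrefix(start);  // "user43"
        // A scanner bounded by [start, stop) returns exactly the
        // rows whose keys begin with "user42".
        System.out.println(new String(stop));
    }
}
```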
For 0.21 Stargate needs an extension for composing scan (and get) filter
I think Yair is right. I've seen this same thing when the major versions of ZK
differ on the client and server side.
- Andy
From: Jean-Daniel Cryans
To: hbase-user@hadoop.apache.org
Sent: Tuesday, September 1, 2009 3:45:20 AM
Subject: Re: problem running H
Another possibility is the security groups up on EC2 are not set up to allow
the ZK client port (2181).
- Andy
From: Andrew Purtell
To: hbase-user@hadoop.apache.org
Sent: Tuesday, September 1, 2009 9:47:38 AM
Subject: Re: problem running Hbase & K
>From my point of view the biggest difference between the two systems is the
>following: Cassandra can always accept writes. It uses its own local storage
>(no dependence on HDFS or anything like that) and P2P data replication. In
>contrast, HBase depends on HDFS so is unavailable if the filesys
Right... I recall an incident in AWS where a malformed gossip packet took
down all of Dynamo. Seems that even P2P doesn't protect against corner cases.
On Tue, Sep 1, 2009 at 3:12 PM, Jonathan Ellis wrote:
> The big win for Cassandra is that its p2p distribution model -- which
> drives the con
To be precise, S3. http://status.aws.amazon.com/s3-20080720.html
- Andy
From: Andrew Purtell
To: hbase-user@hadoop.apache.org
Sent: Tuesday, September 1, 2009 5:53:09 PM
Subject: Re: Cassandra vs HBase
Right... I recall an incident in AWS where a
200 MB is below the region split point of 256 MB so your table will only have
one region, therefore hosted on one region server. You can split the table
manually by any of the following options:
1) Adjust hbase.hregion.max.filesize in hbase-site.xml (global setting)
2) Set the MAX_FILE_SIZE att
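For option 1, the global setting looks something like this in hbase-site.xml (value in bytes; 64 MB here is purely illustrative, chosen below the 256 MB default so a 200 MB table splits):

```xml
<property>
  <name>hbase.hregion.max.filesize</name>
  <!-- split a region once a store file exceeds 64 MB (default 256 MB) -->
  <value>67108864</value>
</property>
```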
need a data storage engine, which need
> not so strong consistency, and it can provide better writing and
> reading throughput like HDFS. Maybe, we can design another system like a
> simpler HBase ?
>
> Schubert
>
> On Wed, Sep 2, 2009 at 8:56 AM, Andrew Purtell wrote:
>
Regionserver logs will mention if major compaction happened on a region
or not, but maybe only at DEBUG log level, I'm not sure. I was running
with DEBUG at the time and saw immediately on a region server log tail
that the store files were rewritten with LZO.
I downloaded an HFile out of a randomly
libgplcompression.so in hbase/lib/native/ on all region server nodes?
hadoop-gpl-compression.jar in hbase/lib on all region server nodes?
Run at DEBUG log level and initiate major_compaction. Anything relevant in the
master log? Anything relevant in the region server logs?
- Andy
Hi Ken,
Compactions are serviced by a thread which sleeps for a configurable interval
and then wakes to do work. As compaction requests are raised, they are queued
and the thread is signaled and wakes early. When a region server first starts
up, a limit is imposed on how many compaction request
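The sleep-then-wake-early pattern described above can be sketched with a monitor (illustrative only; these names are made up and are not the actual HBase classes):

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class CompactionQueueSketch {
    private final Queue<String> requests = new ArrayDeque<>();
    private final long intervalMs;

    CompactionQueueSketch(long intervalMs) {
        this.intervalMs = intervalMs;
    }

    // Callers queue a region and signal the worker to wake early.
    synchronized void requestCompaction(String region) {
        requests.add(region);
        notifyAll();
    }

    // Worker side: sleep up to the configured interval, but wake as
    // soon as a request arrives; null means the wait timed out empty.
    synchronized String takeNext() {
        long deadline = System.currentTimeMillis() + intervalMs;
        while (requests.isEmpty()) {
            long remaining = deadline - System.currentTimeMillis();
            if (remaining <= 0) {
                return null;
            }
            try {
                wait(remaining);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return null;
            }
        }
        return requests.poll();
    }
}
```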
compaction on open IFF the region has
references, maybe this facility is no longer needed?
St.Ack
On Wed, Sep 2, 2009 at 6:05 PM, Andrew Purtell wrote:
> Hi Ken,
>
> Compactions are serviced by a thread which sleeps for a configurable
> interval and then wakes to do work. As compaction
Ok, I'm going to ramble about HBase, Bigtable, and storage engineering. Sorry
about that in advance.
IIRC, Google aims for ~100 tablets per tablet server. I believe each tablet is
also kept to around 200 MB. These details are several years old so may have
changed. This minimizes the amount of
y and
> > people
> > > do
> > > > > >> know how to optimise, use and manage them. It seems
> > column-oriented
> > > > > database
> > > > > >> systems are still young :)
> > > > > >>
> > >
Yes I understand now. Thank you.
- Andy
From: stack
To: hbase-user@hadoop.apache.org
Sent: Friday, September 4, 2009 10:45:40 AM
Subject: Re: Time-series data problem (WAS -> Cassandra vs HBase)
On Fri, Sep 4, 2009 at 10:16 AM, Andrew Purtell wr
There is some limited functionality for this already -- HTable.checkAndPut().
Java object serialization doesn't work the way I think you are expecting it to
work. Bytecode is not sent, only class name and field data. Therefore the class
implementation must already exist server side somewhere on
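The point about what actually goes over the wire is easy to demonstrate (a sketch; Point is a made-up class): the serialized form names the class and carries field values, but contains no bytecode.

```java
import java.io.*;
import java.nio.charset.StandardCharsets;

public class SerializationDemo {
    static class Point implements Serializable {
        int x = 3, y = 4;
    }

    static byte[] serialize(Object o) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(o);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) {
        byte[] wire = serialize(new Point());
        // The stream embeds the class *name* (for lookup on the
        // receiving side) plus field data -- a few dozen bytes --
        // never the class's compiled bytecode.
        System.out.println(wire.length + " bytes; names the class: "
            + new String(wire, StandardCharsets.ISO_8859_1).contains("Point"));
    }
}
```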
If I/we can get around to HBASE-1002 it may be substantially easier to use
filters.
- Andy
From: stack
To: hbase-user@hadoop.apache.org
Sent: Tuesday, September 8, 2009 11:18:34 AM
Subject: Re: Any basic documentation about using filters with hbase?
I'm
If I/we can get around to HBASE-1002 it may be substantially easier to use
filters.
- Andy
From: stack
To: hbase-user@hadoop.apache.org
Sent: Tuesday, September 8, 2009 11:18:34 AM
Subject: Re: Any basic documentation about using filters with hbase?
I'm
The 0.20_on_hadoop-0.18.3 branch has been pulled up to 0.20.0 release and is
now closed.
- Andy
uestions
I am using the Stargate REST API (version 0.20.0) - thanks to Andrew Purtell
for his help!! - and I had a couple of questions:
1. If I don't put the optional "timestamp" attribute in a Cell when I'm
inserting data, the timestamp ends up as zero, which caused m
THBase and ITHBase are officially supported components in 0.20.0 and will
remain so in 0.21.0. Their presence in contrib/ is because they are not core
functionality but that placement should not be construed as second class.
- Andy
From: Keith Thomas
To:
A pseudo distributed config (LocalHBaseCluster) starts up in process region
servers by invoking the HRegionServer constructor, the first few lines of which
is:
machineName = DNS.getDefaultHost(
conf.get("hbase.regionserver.dns.interface","default"),
conf.get("hbase.regionserv
The shell will work on the same host as the server. The region server and
master are bound to the localhost interface. Any remote client connecting to ZK
to find these addresses will get localhost addresses. So the question is why
the master and region server sockets are bound to lo. This is som
You can also set a time-to-live (TTL) on a per-column-family basis. Expired
cells are garbage collected at the next major compaction.
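In the HBase shell this looks roughly like the following (syntax from memory for this era; TTL is in seconds, and the table must be disabled before altering):

```
hbase> disable 'mytable'
hbase> alter 'mytable', {NAME => 'cf', TTL => 604800}
hbase> enable 'mytable'
```

'mytable' and 'cf' are placeholder names; 604800 seconds is seven days.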
- Andy
From: stack
To: hbase-user@hadoop.apache.org
Sent: Monday, September 14, 2009 7:27:20 AM
Subject: Re: Schema design:
It should be possible to associate N+1 elastic IPs with the master and N region
server instances. Allocate a pool of elastic IPs and associate them as
appropriate when (re)starting instances. Be sure to set the security group
policy to allow connections to the ZK, master, and region server ports
Lucas,
> [...] ServerManager: 0 region servers, 1 dead, average load
Is this a pseudo distributed test? I.e. everything running on one EC2 instance?
What instance type are you using?
The reported exceptions indicate a possible problem with DFS. Are there
exceptions in the DataNode logs? Or t
Bonjour Guillaume,
Your issue #2 looks like two separate issues:
2a) Memcache flusher gating. This is better in 0.20.0. I encourage you to
upgrade for this and any number of other reasons.
2b) HDFS-127. See https://issues.apache.org/jira/browse/HDFS-127. Upgrade to
HBase 0.20.0 or patc
er.CompactSplitThread:
regionserver/10.209.207.80:60020.compactor exiting
Lucas
On Wed, Sep 16, 2009 at 1:37 PM, Andrew Purtell wrote:
> Lucas,
>
> > [...] ServerManager: 0 region servers, 1 dead, average load
>
> Is this a pseudo distributed test? I.e. everything running
Nazário dos Santos <
nazario.lu...@gmail.com> wrote:
> Thanks, Andy. I'll give it a shot.
>
>
> On Wed, Sep 16, 2009 at 3:18 PM, Andrew Purtell
> wrote:
>
> > Hi Lucas,
> >
> > The log fragment you sent is actually after the problem occurred, and it
Can I wear my user hat?
I have used the Region Historian on occasion as it is much easier than grepping
through master logs to find transitions. We also discard master logs after 7
days because they are large, especially when running with DEBUG from time to
time. Obviously we don't have that pr
Hive and HBase can coexist independently on top of the same DFS filesystem and
cluster.
There are two ongoing projects to hook up HBase and HIVE that I am aware of:
http://issues.apache.org/jira/browse/HIVE-705
https://issues.apache.org/jira/browse/HIVE-806
Depending on how far along t
y that useful compared to reading logs
and that it caused us a lot of trouble thus we should disable it (to
prevent bugs) and build something better.
J-D
On Thu, Sep 17, 2009 at 9:45 PM, Andrew Purtell wrote:
> Can I wear my user hat?
>
> I have used the Region Historian on occasion as
Everything I talked about is better done in parallel on logs. Definitely agree.
On Thu, September 17, 2009 11:04 pm, stack wrote:
> Its a sweet feature, I know how it works, but I find myself never really
> using it. Instead I go to logs because there I can get a more
> comprehensive picture t
>> <property>
>>   <name>hbase.regionserver.class</name>
>>   <value>org.apache.hadoop.hbase.ipc.IndexedRegionInterface</value>
>> </property>
>> <property>
>>   <name>hbase.regionserver.impl</name>
>>   <value>org.apache.hadoop.hbase.regionserver.tableinde
Hi Adam, thanks for writing in.
I suggest using Thrift or the native Java API instead of REST for benchmarking
performance. If you must use REST, for bulk throughput benching, consider using
Stargate (contrib/stargate/) and bulk transactions -- scanners with 'batch'
parameter set >= 100, or mul
Looks like your DFS NameNode became unavailable about the same time that
ZooKeeper timeouts started happening. Overloading? Anything relevant in the
NameNode logs?
- Andy
From: Lucas Nazário dos Santos
To: hbase-user@hadoop.apache.org
Sent: Wed, October 7
-1 due to 1890. We need a new RC.
- Andy
From: stack
To: HBase Dev List ; hbase-user@hadoop.apache.org
Sent: Mon, October 5, 2009 10:41:58 PM
Subject: ANN: hbase 0.20.1 Release Candidate 1 available for download
I've posted an hbase 0.20.1 release candidat
No regionservers are started?
None have checked in.
- Andy
From: Ananth T. Sarathy
To: hbase-user@hadoop.apache.org
Sent: Wed, October 7, 2009 10:35:20 AM
Subject: Re: hbase on s3 and safemode
Here is the log since I started it...
Wed Oct 7 13:27:26 E
ns(ms): 20Number of transactions batched in
Syncs: 32 Number of syncs: 581 SyncTimes(ms): 12750
2009-10-07 11:31:28,581 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
NameSystem.addToInvalidates: blk_5038589320720026567 is added to invalidSet
of 192.168.1.3:50010
On Wed, Oct 7, 2009 at 1:58 PM, A
HBase won't leave safe mode if the region servers cannot contact the master. So
the question is why your region servers cannot contact the master. If the
region server processes are confirmed running, then it's most likely a firewall
or AWS Security Groups config problem.
status was a shell comma
rs on implicitly besides
start-hbase.sh?
Ananth T Sarathy
On Wed, Oct 7, 2009 at 2:31 PM, Andrew Purtell wrote:
> HBase won't leave safe mode if the regionservers cannot contact the master.
> So the question is why cannot your regionservers contact the master. If the
> regionserver proc
rom: Ananth T. Sarathy
> To: hbase-user@hadoop.apache.org
> Sent: Wed, October 7, 2009 11:36:22 AM
> Subject: Re: hbase on s3 and safemode
>
> is there a way to turn my regionservers on implicitly besides
> start-hbase.sh?
> Ananth T Sarathy
>
>
> On Wed, Oct 7, 200
n Wed, Oct 7, 2009 at 3:41 PM, Andrew Purtell wrote:
> Did you edit hbase-site.xml such that HBase data directories are not in
> /tmp? Maybe a silly question... but it happens sometimes.
>
> If your hbase.rootdir points to an HDFS filesystem, what does 'hadoop fs
> -lsr hdfs://na
l sorts of things in the bucket when I explore it.
We are going to set up .20.0 and point it to a new bucket. Any tips I should
know about to avoid something like this or data loss?
Ananth T Sarathy
On Wed, Oct 7, 2009 at 3:55 PM, Andrew Purtell wrote:
> One possibility is you loaded data, b
hing like this or data loss?
>
> Ananth T Sarathy
>
>
> On Wed, Oct 7, 2009 at 3:55 PM, Andrew Purtell wrote:
>
>> One possibility is you loaded data, but not enough to cause a flush, then
>> there appeared to be some network related problem, and you killed the
>&
wrote:
> there is all sorts of things in the bucket when I explore it.
>
> We are going to set up .20.0 and point it to a new bucket. Any tips I should
> know about to avoid something like this or data loss?
>
> Ananth T Sarathy
>
>
> On Wed, Oct 7, 2009 at 3:55 PM,
Do you really only have "localhost" as the value of hbase.zookeeper.quorum in
your hbase-site.xml config file? Or perhaps no such entry at all? Or maybe
hbase-site.xml is not on the classpath of the task trackers? Are you running an
external ZooKeeper ensemble or is HBase managing it?
How many
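For a distributed setup, the quorum must be spelled out in hbase-site.xml on every client and task tracker node; something like the following (hostnames are illustrative):

```xml
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>zk1.example.com,zk2.example.com,zk3.example.com</value>
</property>
```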
tandalone" mode for now. I know it sounds
weird, but hopefully we will get funding for additional hardware soon.
So.. just one HBase master for now :(
--- On Mon, 10/12/09, Andrew Purtell wrote:
From: Andrew Purtell
Subject: Re: SessionExpiredException: KeeperErrorCode = Session exp
HBase goes down if the ZooKeeper leases expire. This can easily happen if you
overload your system. The processes will not get enough CPU, or will be
partially swapped out and suspended, when the next ZK heartbeat is due. If a
process misses the heartbeat, ZK expires the session. HBase daemons r
Nice job Tatsuya-san.
Even on small clusters I raise hbase.regionserver.handler.count to 100 and
recommend this in general. I added a comment about this on the troubleshooting
page of the HBase wiki: http://wiki.apache.org/hadoop/Hbase/Troubleshooting
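The setting goes in hbase-site.xml on the region servers (100 is the value suggested above; the default in this era was much lower):

```xml
<property>
  <name>hbase.regionserver.handler.count</name>
  <value>100</value>
</property>
```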
Also I filed an issue on this to track it:
Troubleshooting wiki updated:
http://wiki.apache.org/hadoop/Hbase/Troubleshooting
- Andy
From: Ananth T. Sarathy
To: hbase-user@hadoop.apache.org
Sent: Tue, October 20, 2009 12:52:45 AM
Subject: Re: zookeeper configuration
that fixed it!
Ananth T Sarathy
The reason JG points to load as the problem is that all signs point to it: this
is usually the culprit behind DFS "no live block" errors -- the namenode is too
busy and/or falling behind, or the datanodes are falling behind, or actually
failing. Also, in the log snippets you provide, HBase is com
of the
test, but I don't think that this should result in data loss. Failed
inserts at the client level I can handle, but loss of data that was
previously thought to be stored in hbase is a major issue. Are there
plans to make hbase more resilient to load based failures?
Regards,
elsif
Please ignore my earlier response. That does look suspiciously like client
error.
Can you correlate the block
/mnt/8d8aa987-017c-4f4c-822c-fe53f4163cee/current/subdir18/blk_-6090437519946022724
to a file? And this file to a region? What is in the master log as relates to
this region around th
> In general, I do not recommend running with VMs... Running two hbase
> nodes on a single node in VMs vs running one hbase node on the same node
> w/o VM, I don't really see where you'd get any benefit.
We use a mixed deployment model where we move HDFS and HBase into Xen's dom0,
then deploy
HBase is not developed or tested against Hadoop's S3FS. Have you considered
using HDFS backed by EBS volumes instead of S3? HDFS on top of EBS ensures the
filesystem semantics that HBase expects.
Getting back to the first pastebin, I see this
> 2009-10-26 11:38:12,282 INFO
org.apache.hadoop.hba
26,548 INFO org.apache.hadoop.hbase.master.BaseScanner:
RegionManager.metaScanner scanning meta region {server:
10.1.29.20:60020, regionname: .META.,,1, startKey: <>}
Andrew Purtell wrote:
> Please ignore my earlier response. That does look suspiciously like client
> error.
>
> Can yo
he add table. Any ideas?
> Ananth T Sarathy
>
>
> On Mon, Oct 26, 2009 at 7:31 PM, Andrew Purtell
> wrote:
>
> > HBase is not developed or tested against Hadoop's S3FS. Have you
> considered
> > using HDFS backed by EBS volumes instead of S3? HDFS on
The FindBugs static code checker also flagged WildcardColumnTracker.checkColumn:
a null dereference is possible on some code paths. I checked in a fix which
did not hurt this issue but would not help it either. I added null tests and flow
control to prevent the issue seen by the checker but did not chan
By the way that's not an admission of anything. The recursion issue predates
the FindBugs warning fix. :-)
From: Andrew Purtell
To: hbase-user@hadoop.apache.org
Sent: Thu, October 29, 2009 5:59:39 PM
Subject: Re: stackoverflow error
The FindBugs static
INFO org.apache.hadoop.hbase.regionserver.HRegion:
compaction completed on region .META.,,1 in 0sec
Forwarded Message
From: Andrew Purtell (RD-US)
Sent: Thu 10/29/2009 4:44 PM
Subject: RE: jhat (was: RE: HBase 0.20.0 region server crash due to out
of memory error
You should consider provisioning more nodes to get beyond this ceiling you
encountered.
DFS write latency spikes from 3 seconds to 6 seconds, to 15! Flushing cannot
happen fast enough to avoid an OOME. Possibly there was even insufficient CPU
to GC. The log entries you highlighted indicate the
Not that I'm aware of. What are you trying to do Mike?
- Andy
From: mike anderson
To: hbase-user@hadoop.apache.org
Sent: Fri, November 6, 2009 11:36:23 PM
Subject: ruby/stargate
Does anybody know of a Ruby wrapper for the Stargate interface?
Cheers,
Mike
2:29:32 AM
Subject: Re: ruby/stargate
From another thread in the list, see http://github.com/greglu/hbase-ruby
J-D
On Sun, Nov 8, 2009 at 1:35 AM, Andrew Purtell wrote:
> Not that I'm aware of. What are you trying to do Mike?
>
> - Andy
>
>
>
>
> __
hbase to run itself into the ground causing
data loss or corruption under any circumstances.
*
*
Andrew Purtell wrote:
> You should consider provisioning more nodes to get beyond this ceiling you
> encountered.
>
> DFS write latency spikes from 3 seconds to 6 seconds, to 15! Flu
gt; point at some point in its lifetime as more and more data is added. We
>> need to have a graceful method to put the cluster into safe mode until
>> more resources can be added or the load on the cluster has been
>> reduced. We cannot allow hbase to run itself into the ground
When you try to start the region servers, what do you see in the log?
If you don't change the client port (hbase.zookeeper.property.clientPort), does
it work?
- Andy
From: Jeff Zhang
To: hbase-user@hadoop.apache.org
Sent: Tue, November 10, 2009 2:40:2
So that's about 1500 regions per region server. Should be more like 200-250 I
think because as a practical matter the recovery time for redeploying 1500
regions upon hardware failure will block clients for a long time. It may not
even be possible if all region servers are almost at the point of
See http://issues.apache.org/jira/browse/HBASE-1537 . Already committed on
trunk. You can set a parameter on a Scan object that will cause HBase to chunk
the response. The semantics of the scanner's next() method changes in this
case. More than one call to next() may be required to move to the n
Hi Jeff,
Do what is best for the most common use you anticipate will happen on query or
scan. Are you going to be most often displaying the value as a string? If so,
store as string via Bytes.toBytes(String). Or are you most often going to
perform some integer arithmetic or some other operation
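The trade-off can be sketched in plain Java, mimicking what Bytes.toBytes does for the two types (ValueEncoding is a made-up name):

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class ValueEncoding {
    // String form: human-readable in scans and the shell, variable
    // width, but sorts as text ("9" comes after "10" as raw bytes).
    static byte[] asString(long v) {
        return Long.toString(v).getBytes(StandardCharsets.UTF_8);
    }

    // Fixed-width big-endian form (what Bytes.toBytes(long) yields):
    // always 8 bytes, decodes straight back to a long for arithmetic.
    static byte[] asLong(long v) {
        return ByteBuffer.allocate(8).putLong(v).array();
    }

    static long decode(byte[] b) {
        return ByteBuffer.wrap(b).getLong();
    }
}
```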
The new scripts in trunk at src/contrib/ec2 will offer this approach soon.
Right now they simply back HDFS with instance storage (volatile) and rely on
not having more than the HDFS replication factor (default = 3) instances crash
or terminate at one time. Using EBS is a big win for its persiste
Flush files have good read locality as soon as they are written. As JD and Ryan
say, after the major compaction interval elapses in a cluster's lifetime,
region store files generally have good read locality also. This interval is
configurable and you can also trigger it manually via the shell or
There's a team evaluating HBase in Trend that raised this very issue today.
This is the test as described:
"We execute the following steps via the Java API:
a. create many tables (about 1000 tables); each table has 10 columns and 20
rows (value length is 60-100 bytes)
b. delete some t
[WAS -> Re: Table disabled but all
regions still online?]
I am torn. I sort of want to just fix it right in 0.21, but your TM team
writing such a test would indicate this is an important feature, and maybe we
should not wait?
On Nov 18, 2009, at 11:13 AM, Andrew Purtell wrote:
> There
1) How long did you wait before logging on to the master and checking status? I
bet if you checked the slave instances status at that time (e.g. using
elasticfox) they were still pending startup. The slaves are started last and
EC2 may not start them right away nor all at once.
2) The way priva
As for convenient ssh between master and slaves, that's an easy thing to set
up. I'll put up an issue for that.
On Wed Nov 18th, 2009 11:07 PM PST Andrew Purtell wrote:
>1) How long did you wait before logging on to the master and checking status?
>I bet if you checked the
Naresh,
> How many regions are created after splitting the one region ?
Two.
> Do all the split regions remain on the same node that hosted the first region?
One half remains on the same node. The other is closed and released for
reassignment. Depending on the number of nodes and current clus
pe I think.
- Andy
From: Lars George
To: hbase-user@hadoop.apache.org
Sent: Thu, November 19, 2009 3:32:36 AM
Subject: .deb
Hi,
I know that Andrew Purtell did compile a .rpm for HBase; do we still have that,
and is it current? Is there a .deb too? Would it make sense to