We have a couple of clusters running with LZO compression. When testing
the new 0.20.1 release I set up a single-node cluster and reused the
compression jar and native libraries from the 0.20.0 release. The
following session log shows a table being created with the lzo option
and some rows being added
Random writes are exactly the area where HBase shines. Like you
said, you can't just rewrite large sequence files every time you have a
change, so you would consider using a system like HBase to help with
that.
I'd say if you want some MapReduce capacity plus HDFS and HBase, you will
need 4 CPUs per node
On Wed, Sep 23, 2009 at 2:01 PM, Andrzej Jan Taramina wrote:
> I asked about the best way to process a large quantity of smaller XML files
> using Hadoop mapred, on the main Hadoop
> mailing list, and was advised that HBase would be a good alternative to
> handle this.
>
> ...
>
> What I would li
Looks like it was removed before the release of 0.19.0 by HBASE-728 (do svn
diff -r705770:707247 conf/hbase-default.xml), so it hasn't been working for a
while?
St.Ack
On Wed, Sep 23, 2009 at 2:09 PM, Clint Morgan wrote:
> <property>
>   <name>hbase.regionserver.optionalcacheflushinterval</name>
>   <value>1800000</value>
> </property>
<property>
  <name>hbase.regionserver.optionalcacheflushinterval</name>
  <value>1800000</value>
  <description>Amount of time to wait since the last time a region was
  flushed before invoking an optional cache flush (an optional cache flush
  is a flush even though memcache is not at memcache.flush.size).
  Default: 30 minutes.</description>
</property>
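For reference, overriding that interval in a 0.19-era deployment would have gone in conf/hbase-site.xml, along these lines (a sketch only; the 600000 value, ten minutes, is an illustrative assumption, not a recommendation):

```xml
<!-- conf/hbase-site.xml: override the optional cache flush interval.
     The value is in milliseconds; 600000 = 10 minutes (illustrative). -->
<configuration>
  <property>
    <name>hbase.regionserver.optionalcacheflushinterval</name>
    <value>600000</value>
  </property>
</configuration>
```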
I asked about the best way to process a large quantity of smaller XML files
using Hadoop mapred, on the main Hadoop
mailing list, and was advised that HBase would be a good alternative to handle
this.
More specifically, we need to start by processing about 250K XML files, each of
which is in th
What's the option name, Clint? I just checked out 0.19.0 and had a look in
hbase-default to try to jog my memory, but I'm not sure which setting it is
(was).
To force a flush you could run the below from a cron job:
echo "flush 'TABLENAME'" | ./bin/hbase shell
... or variations thereof.
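For example, that one-liner could be scheduled from cron; the schedule and the HBase install path below are assumptions, not prescriptions:

```
# Hypothetical crontab entry: flush 'TABLENAME' every 30 minutes.
# Adjust /opt/hbase to wherever HBase is actually installed.
*/30 * * * * echo "flush 'TABLENAME'" | /opt/hbase/bin/hbase shell
```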
St.Ack
On Wed,
Is there no optional memstore flush anymore? I recall that in 0.19 the memcache
would flush every so often, and you could configure this period (the optional
cache flush interval).
Digging through now, I don't see it in 0.20. Is this mechanism no longer
supported?
Due to a couple of mixups, our stop cluste
On Wed, Sep 23, 2009 at 9:56 AM, Molinari, Guy wrote:
> Hi Stack (and others),
> The reason for the small initial region size was intended to force
> splits so that the load would be evenly distributed. If I could
> pre-define the key ranges for the splits, then I could go to a much
> larger
Hi Stack (and others),
The small initial region size was intended to force splits so that the
load would be evenly distributed. If I could pre-define the key ranges
for the splits, then I could go to a much larger block size. So, say I
have 10 nodes and a 100MB data set,
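On pre-defining split keys: later HBase releases let you create a table already split at chosen row-key boundaries, so load spreads across regions from the start. A sketch in the hbase shell (the table name, column family, and split points here are all hypothetical):

```
# Create a table pre-split at the given row-key boundaries,
# yielding four regions before any data is loaded.
create 'mytable', 'cf', SPLITS => ['g', 'n', 'u']
```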