Hi Steven,
First off, congrats on the progress! This is super exciting.
As usual, if you need help you know where to find us :) but it seems
you have it all well under control.
As far as Buzzwords is concerned, I had submitted a proposal but it was
rejected. But with projects like Lily we will get
Ackermann
- "Datenmodellierung und Architektur beim Einsatz von
Semantik-Web-Technologien in der Praxis" ("Data Modeling and Architecture
When Using Semantic Web Technologies in Practice") by Sacha Berger
- "Hadoop and HBase - Ein Überblick" ("An Overview") by Lars George
As usual this is followed by an open discussion at a nearby
pub/restaurant so that we can also enjoy
Hi Lucas,
What OS are you on? What kernel version? What is your Hadoop and HBase
version? How much heap do you assign to each Java process?
Lars
On Wed, Nov 17, 2010 at 3:05 PM, Lucas Nazário dos Santos
wrote:
> Hi,
>
> This problem is widely known, but I'm not able to come up with a decent
> so
JD,
Should we create a metric for it so that it dynamically counts usage
per region? That could then be exposed via a Ganglia context or JMX.
Just wondering.
Lars
On Wed, Nov 17, 2010 at 5:04 PM, Vaibhav Puranik wrote:
> hi,
>
> Thanks for the suggestions JD & Michael.
> The region servers serv
Lucas
>
>
>
> On Wed, Nov 17, 2010 at 2:12 PM, Lars George wrote:
>
>> Hi Lucas,
>>
>> What OS are you on? What kernel version? What is your Hadoop and HBase
>> version? How much heap do you assign to each Java process?
>>
>> Lars
>>
>>
ase your epoll limit. Some
> tips about that here:
>
> http://www.cloudera.com/blog/2009/03/configuration-parameters-what-can-you-just-ignore/
>
> Thanks
> -Todd
>
> On Wed, Nov 17, 2010 at 9:10 AM, Lars George wrote:
>
>> Are you running on EC2? Couldn't you sim
/proc/sys/fs/epoll/max_user_watches. I'm not quite sure about what to do.
>
> Can I favor max_user_watches over max_user_instances? With what value?
>
> I also tried to play with the Xss argument and decreased it to 128K with no
> luck (xcievers at 4096).
>
> Lucas
>
>
>
Hi Navraj,
This is because 0.90 uses Maven, and that has a local cache (usually
under ~/.m2). You need to replace the existing jar with yours, see
http://maven.apache.org/guides/mini/guide-3rd-party-jars-local.html
for example on how to do this. Replace the jar with yours and use the
following art
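The local-install step from the linked Maven guide can be sketched like this; note that the jar path and all coordinates below are placeholders for illustration, not the real HBase artifact coordinates:

```shell
# Install a locally built jar into the local Maven cache (~/.m2),
# overriding the cached artifact. All -D values here are placeholders.
mvn install:install-file -Dfile=my-patched.jar \
    -DgroupId=org.example -DartifactId=my-lib \
    -Dversion=1.0 -Dpackaging=jar
```

After this, reference the same groupId/artifactId/version in the build so Maven picks up the replaced jar from the local cache.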
I would not say "no" immediately. I know some have done so (given the
version was the same) and used add_table.rb to add the table to META.
YMMV.
Lars
On Thu, Nov 18, 2010 at 6:01 AM, Ted Yu wrote:
> No.
>
> See https://issues.apache.org/jira/browse/HBASE-1684
>
> On Wed, Nov 17, 2010 at 8:25 PM
Hi Joy,
[1] is what [2] does. They are just a thin wrapper around the raw API.
And as Alex pointed out and you noticed too, [2] adds the benefit of
locality support. If you were to add this to [1] then you would have
[2].
Lars
On Thu, Nov 18, 2010 at 5:30 AM, Saptarshi Guha
wrote:
> Hello,
>
>
compactions do not run while you are copying. Splits/compactions
>> change hfile layout on disk. You must freeze the table so your copy is
>> consistent (source data remains unchanged start to finish).
>>
>> Best regards,
>>
>> - Andy
>>
>>
>&
Have a read here:
http://outerthought.org/blog/417-ot.html
Especially: "One interesting option that is missing is the ability to
retrieve the latest version less than or equal to a given timestamp,
thus giving the 'latest' state of the record at a certain point in
time. Update: this is (obviously
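The semantics being discussed ("latest version less than or equal to a given timestamp") can be illustrated with a small language-neutral sketch in Python. This models versioned cells as a plain timestamp-to-value map; it is not the HBase API, just the lookup rule:

```python
def latest_at_or_before(versions, ts):
    """Return the value of the newest version whose timestamp is <= ts.

    versions: dict mapping timestamp -> cell value, a toy stand-in for
    the multiple versions HBase keeps per cell.
    """
    eligible = [t for t in versions if t <= ts]
    if not eligible:
        return None  # no version existed at or before ts
    return versions[max(eligible)]

# A cell written at t=100, then updated at t=200 and t=300:
cell = {100: "a", 200: "b", 300: "c"}
```

Asking for the state at t=250 yields "b" — the 'latest' state of the record at that point in time, which is exactly the option the blog post discusses.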
Yeah, this will be superseded by WHIRR-25 over the next month or two.
The "root" name was simply a choice, no reason not to change it. As
for Ganglia, do you see the Ganglia daemon run on each node? If not,
please have a look into the logs on the servers, the user scripts
usually log their process
Hi Henning,
Could you look at the Master UI while doing the import? The issue with
a cold bulk import is that you are hitting one region server
initially, and while it is filling up its in-memory structures all is
nice and dandy. Then you start to tax the server as it has to flush
data out and it b
> the start of the run and then even got better. An unexpected load
> behavior for me (would have expected early changes but then
> some stable behavior up to the end).
>
> Thanks,
> Henning
>
> On Friday, 19.11.2010, at 15:21 +0100, Lars George wrote:
>
>> Hi
On Friday, 19.11.2010, at 16:16 +0100, Lars George wrote:
>
>> Hi Henning,
>>
>> And yes, what you have seen is often difficult to explain. What I
>> listed are the obvious contenders. But ideally you would do a post
>> mortem on the master and slave logs for H
Hi Hari,
This is most certainly a classpath issue. You either have to add the jar to all
TaskTracker servers and add it into the hadoop-env.sh in the HADOOP_CLASSPATH
line (and copy it to all servers again *and* restart the TaskTracker process!)
or put the jar into a /lib directory inside the job jar
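The first option can look like this in hadoop-env.sh; the jar path below is only an example of wherever the jar actually lives on your nodes:

```shell
# hadoop-env.sh -- must be edited on every TaskTracker node,
# followed by a TaskTracker restart. The path is a placeholder.
export HADOOP_CLASSPATH="$HADOOP_CLASSPATH:/usr/local/hbase/hbase-0.20.6.jar"
```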
g the same error. Weird thing is that tasks on the master node are also
> failing with the same error, even though all my files are available on
> master. I am sure I'm missing something basic here, but unable to pinpoint
> the exact problem.
>
> hari
>
> On Sun,
Hi Hari,
You are missing the quorum setting. It seems the hbase-site.xml is missing from
the classpath on the clients. Did you pack it into the jar?
And yes, even one ZK server is fine in such a small cluster.
You can see it is trying to connect to localhost, which is the default if the
site file is not found
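The missing setting is the client-side hbase-site.xml entry; a minimal fragment, with a placeholder host name:

```xml
<!-- hbase-site.xml, packed into the job jar or on the client classpath -->
<property>
  <name>hbase.zookeeper.quorum</name>
  <!-- comma-separated list of ZooKeeper hosts; name is a placeholder -->
  <value>node1.example.com</value>
</property>
```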
fixed when I add ejabber to the
> quorum right? After all, it is responding to changes I make in my xml file.
> What else can be the issue here?
>
> hari
>
> On Mon, Nov 22, 2010 at 12:54 AM, Lars George wrote:
>
>> Hi Hari,
>>
>> You are missing the quorum
Hi Mark,
First please read this post: http://outerthought.org/blog/417-ot.html
Rest inline below.
On Nov 22, 2010, at 7:45, Mark Jarecki wrote:
> Hi,
>
> I'm completely new to HBase and have some questions regarding cell
> timestamps.
>
> My questions: Are there practical limitations to t
what is the default value
> of HBASE_MANAGES_ZK ? Because I have not explicitly set it to true in my
> hbase-env.sh file.
>
> thanks,
> hari
>
> On Mon, Nov 22, 2010 at 10:39 AM, Lars George wrote:
>
>> Hi Hari,
>>
>> On which of these four machines do you h
I agree with Mark. HBase starts the built-in ZK support on the nodes that are
listed in the quorum. That is why it works as Mark says when you add the
ejabber host.
What is broken is your job config. For some reason you do not seem to have the
right config in your jar as it tries to connect to
ion/split(?) rate and make
> judgement on whether some configuration is properly set (e.g.
> hbase.hregion.memstore.flush.size).
>
> Thanks,
> Alex Baranau
>
> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - Hadoop - HBase
>
> On Fri, Nov 19, 2010 at 5:
Also see: https://issues.apache.org/jira/browse/HBASE-2470 (I
completely forgot I had opened it).
On Fri, Nov 19, 2010 at 2:15 PM, Pan W wrote:
> Hi, Lars
>
> It's very nice of you to show the helpful blog to me. Now I see :-)
> --
> Pan W
>
>
null this way. But when I
> change dfs.replication in my config file, I do see a change in replication
> after upload into HDFS.
>
> On Mon, Nov 22, 2010 at 2:36 PM, Hari Sreekumar
> wrote:
>
>> Hi Lars,
>>
>> I start them through HBase implicitly. I
Oleg,
Do you have Ganglia or some other graphing tool running against the
cluster? It gives you metrics that are crucial here, for example the
load on Hadoop and its DataNodes as well as insertion rates etc. on
HBase. What is also interesting is the compaction queue to see if the
cluster is going
mon.
>
> It will help a lot if you can provide your configurations and system
> characteristics (maybe in a Wiki page).
> It will also help to get more of the "small tweaks" that you found helpful.
>
>
> Lior Schachter
>
>
>
>
>
>
>
> On Mo
lt config
> all the while! Printing out the config was really helpful. So
> getConf() reads the hadoop default and core-site.xml files, and
> HBaseConfiguration() reads hbase properties.
>
> Thanks a lot,
> Hari
>
> On Mon, Nov 22, 2010 at 4:41 PM, Lars George w
heap !! very cheap :)
>>
>> I believe that more money will come when we show the viability of the
>> system... I also read that heterogeneous clusters are common.
>>
>> It will help a lot if you can provide your configurations and system
>> characteristics (maybe
Hi fnord,
See https://issues.apache.org/jira/browse/HBASE-1537 and
https://issues.apache.org/jira/browse/HBASE-2673 for details. Not sure
when that went in though but you should have that available, no?
Lars
On Tue, Nov 23, 2010 at 2:48 PM, fnord 99 wrote:
> Hi,
>
> our machines have 24GB of RA
Hi Hari,
Disabling a table simply takes time as all RSs need to report back
that the regions are flushed and closed. You may time out on that.
This is where the async, "fire and forget" version of that call comes
in. But if you need to wait, use the async version and then poll
the status of the
ction. Set "hbase.client.retries.number"
to something higher until it works.
Lars
On Wed, Nov 24, 2010 at 12:01 PM, Hari Sreekumar
wrote:
> Hi Lars,
>
> Is the async version available in hbase-0.20.6 ASF version? It is
> still in development right?
>
> hari
>
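The retry setting mentioned above goes into hbase-site.xml; for example, with an illustrative value:

```xml
<property>
  <name>hbase.client.retries.number</name>
  <!-- raise from the default until the operation completes; 20 is
       just an example value -->
  <value>20</value>
</property>
```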
I have set up and maintained clusters between 6 and 40 machines while
being a full time developer, so all as part of the development
process. I used simple scripts like the ones I documented here
(http://www.larsgeorge.com/2009/02/hadoop-scripts-part-1.html).
Cluster SSH as mentioned is also used m
Hi Alex,
Oh no, you do NOT want to use column families that way. They are semi-static and
should not be changed too often nor should there be too many. Adding a CF
requires disabling the table too.
Use columns, row keys or timestamps for that use-case.
Lars
On Nov 25, 2010, at 17:31, Nanheng
Hi Tim,
You can issue a $ hbase-daemon.sh stop regionserver on each node which tells
the master to move the regions over and shut down the RS properly.
Lars
On Nov 25, 2010, at 18:24, Tim Robertson wrote:
> Hi all,
>
> Please forgive this rather naive question - I have a cluster and want
>
Hi Tim,
Understood, but yes it is handled by the master.
Lars
On Nov 25, 2010, at 19:25, Tim Robertson wrote:
> Hi Lars,
>
> Thanks. I was just wary of doing that on the .META and ROOT. and
> wanted a confirmation.
>
> Cheers,
> Tim
>
>
>
> On Thu, Nov
Hi Rod,
I guess you will have to be careful setting them right. The
hbase.regionserver.global.memstore.upperLimit does not allocate
anything but acts as a barrier or threshold to flush data out. If you
were to set this to 1.0 then it would never trigger, while only
relying on the other triggers to
t; Alex
>
>
> On Thu, Nov 25, 2010 at 10:18 AM, Lars George wrote:
>> Hi Alex,
>>
>> Oh no, you do NOT want to use column families that way. They are semi-static
>> and should not be changed too often nor should there be too many. Adding a
>> CF require
by the
> hbase.regionserver.global.memstore.upperLimit
> other than the block cache?
> Rod
>
> On Thu, Nov 25, 2010 at 2:00 PM, Lars George wrote:
>
>> Hi Rod,
>>
>> I guess you will have to be careful setting them right. The
>> hbase.regionserver.global.memstore.upperLimit does not allocate
&
Hi Hari,
ITHBase is what you are asking for I assume? Check the contrib
directory; the classes are in a separate jar you will also need to add.
Lars
On Fri, Nov 26, 2010 at 8:12 AM, Hari Sreekumar
wrote:
> Hi,
>
> From which version of HBase is this available. I have v0.20.6, but
> couldn't find t
ything like that? What is the logic behind contrib
> folder?
>
> On Fri, Nov 26, 2010 at 2:05 PM, Lars George wrote:
>
>> Hi Hari,
>>
>> ITHBase is what you are asking for I assume? Check the contrib
>> directory, they are in a separate jar you also will need to add
n you knowledge?
>
> Thanks a lot for you time,
> Hari
>
> On Fri, Nov 26, 2010 at 8:09 PM, Lars George wrote:
>
>> Hi Hari,
>>
>> In the new 0.89+ versions those were all completely removed and moved to
>> GitHub. The biggest reason being that you need to ha
It sure does. You have a number of handler threads that serve multiple clients.
As with Hadoop you can easily saturate on IO, so the core count only helps if
you have a chance to use the cores properly. That is also why you should have,
if you can, a spindle per core. Must run
Hi Friso,
Great to know! Todd was the last one to try to crash G1 and the recent
iteration seemed much more stable.
Lars
On Nov 29, 2010, at 10:49, Friso van Vollenhoven
wrote:
> On a slightly related note, we've been running with G1 with default settings
> on a 16GB heap for some weeks no
Hi Claudio,
Did you have a look at Google's Percolator paper? I think a mechanism like this
may work. Another option often used to implement distributed transactions is
using Zookeeper where you could create an ephemeral node on the new word and
the host succeeding to do so is adding it and the
I like that idea Dave.
As for the checkAndPut(), this will not work as Claudio intended? He
wanted the counter and put to run together, so the former is only
half the deal? Just wondering.
Lars
On Tue, Nov 30, 2010 at 1:43 AM, Buttler, David wrote:
> A while back I had a strange idea to bypass
What version of HBase are you using?
On Dec 1, 2010, at 9:24, 梁景明 wrote:
> i found that if i didn't control timestamp of the put
> mapreduce can run, otherwise just one time mapreduce.
> the question is i scan by timestamp to get my data
> so to put timestamp is my scan thing.
>
> any ideas
:
> 0.20.6
>
> 2010/12/2 Lars George
>
>> What version of HBase are you using?
>>
>> On Dec 1, 2010, at 9:24, 梁景明 wrote:
>>
>> > i found that if i didnt control timestamp of the put
>> > mapreduce can run, otherwise just one time mapreduce.
&
Have you seen the "HBase on Windows" page on the HBase Wiki? It may help along
the way.
On Dec 2, 2010, at 19:30, Vijay wrote:
> Hi Dave,
> As you suggested, I started the zookeeper explicitly using the
> following command
> $ /cygdrive/g/hbase/bin/hbase-daemon.sh --config
> /cygdri
5. i ran javacode again.
> 6. i ran shell_2 to scan, insert data failed
> --
> ROW COLUMN+CELL
> ------
> 7. i ran shell_4
> 8. i ran javacode again.
> 9. i ran shell_2 to scan, ins
Hi Alex,
You will need to add your client IP address (or - but not really
recommended - 0.0.0.0/0 for the world) into the Security Group that
you used to start the cluster on EC2 and allow TCP access to a few
ports that the client needs to communicate with HBase. For starters
2181 which is the Zo
mons.sh start zookeeper
> Of course, you needn't restart the HBase cluster if you do as I say.
>
> On Sun, Dec 5, 2010 at 1:03 AM, Lars George wrote:
>
>> Hi Alex,
>>
>> You will need to add your client IP address (or - but not really
>> recommended - 0.0.0.0/0 f
Hi Hari,
The filters are applied server side (aka predicate pushdown). If you
were to scan and skip you incur extra IO cost. Filters, as they are
implemented right now are simply skipping and therefore all they can
do is slightly improve normal scan performance. In the future you
might get bette
Hi Dmitriy,
I think you sent this to the wrong list? You sent to hbase-user but
this is a Mahout related question. Please check.
Lars
On Mon, Dec 6, 2010 at 12:17 AM, Dmitriy Lyubimov wrote:
> Dear all,
>
> I am testing the command line integration for the SSVD patch in hadoop mode
> and runnin
ather the shell delete thing setting some
> current timestamp in hbase.
>
> so, when i put the data timestamp before current , that would not set.
>
> i am not sure about this.
>
> thanks any way ,
>
> 2010/12/3 Lars George
>
>> Did you check that the compaction was
Hi Jiajun,
Sure, why not? What are you trying to achieve?
Lars
On Mon, Dec 6, 2010 at 3:19 AM, 陈加俊 wrote:
> Hi
>
> Do I can use the scan filter ? the HBase that we used is version 0.20.6.
>
> jiajun
>
> On Mon, Dec 6, 2010 at 2:05 AM, Lars George wrote:
>
>>
Hi Hari,
What you are asking for is transactions. I'd say try to avoid it.
HBase can only guarantee atomicity on the row level. So if you want
something across tables and rows then you need to use for example
ZooKeeper to implement a transactional support system. There is also
THBase, which gives
Hi Exception,
For starters the logs say you are trying the wrong ZooKeeper node to
get the HBase details (localhost) and you config has:
hbase.zookeeper.quorum = dev32
hbase.zookeeper.quorum = localhost
You are declaring it twice and the last one wins. Remove the second one.
Hi Gabriel,
What max heap do you give the various daemons? This is really odd that
you see OOMEs, I would like to know what it has consumed. You are
saying the Hadoop DataNodes actually crash with the OOME?
Lars
On Mon, Dec 6, 2010 at 9:02 AM, Gabriel Reid wrote:
> Hi,
>
> We're currently runni
Hi AK,
This issue? https://issues.apache.org/jira/browse/HBASE-3310
Lars
On Mon, Dec 6, 2010 at 9:17 AM, Amandeep Khurana wrote:
> The command I'm running on the shell:
>
> create 'table', {NAME=>'fam', COMPRESSION=>'GZ'}
>
> or
>
> create 'table', {NAME=>'fam', COMPRESSION=>'LZO'}
>
>
> Here'
problem.
>
> I also run the flume instance on the same cluster. Do the flume and hbase
> share the same zookeeper? Is this the reason why I get this problem?
>
>
>
>
> On Mon, Dec 6, 2010 at 7:27 PM, Lars George wrote:
>
>> Hi Exception
Hi Alex,
That is indeed the recommended way, i.e. use binary values if you can. As long
as you can express the same sorting as a long as opposed to a string then
that's the way to go for sure.
Lars
On Dec 7, 2010, at 8:21, Alex Baranau wrote:
> I think I've faced by the key format, smth lik
= 0x12cc00271d00013, negotiated
> timeout = 4
>
> As you can see, it can connect to zookeeper but still block.
>
> I dig into the code a little bit and find the program blocked at this line:
> this.connection.locateRegion(tableName, HConstants.EMPTY_START_ROW);
> at HTable.java
>
> T
Hi Jan,
Any day now!
Really, there are just a few little road bumps, but nothing major, and once
they are resolved it will be released. Just rushing it for the sake of
releasing it will not make anyone happy (if we find issues right away
just afterwards). Please bear with us!
Lars
On Tue, Dec 14, 2010
t;
> On 14.12.2010 16:17, Lars George wrote:
>>
>> Hi Jan,
>>
>> Any day now!
>>
>> Really, there are just a few little road bumps, but nothing major, and once
>> they are resolved it will be released. Just rushing it for the sake of
>> releasing it will n
Hi Norbert,
You have seen
http://dumbotics.com/2009/07/31/dumbo-over-hbase/
and
https://github.com/tims/lasthbase
though, right? Isn't that what you are looking for?
Lars
On Tue, Dec 14, 2010 at 7:32 PM, Norbert Burger
wrote:
> Thanks J-D :-) Somehow, I missed the javadocs for TIFB/TIF, w
Hi Mohit,
The one under /hbase/.logs is the one per region server. It is split
in case there is a region server crash and put into
/hbase/&lt;table&gt;/&lt;region&gt;/.oldlogs before the region is
redeployed.
Are you sure you saw a .logs underneath the region directory or was it
in fact a .oldlogs?
Lars
On Wed, Dec 15, 2
Hi,
What Stack says is right and the same for trunk, i.e. when you ask with a
specific timestamp and have not compacted the stores yet you will see the
specific version even if there are 3 or more newer ones. The logic is in the
ScanQueryMatcher.match() function. It skips the newer version and
Hi Bradford,
I heard this before recently and one of the things that bit the person
in question in the butt was swapping. Could you check that all
machines are positively healthy and not swapping etc. - just to rule
out the (not so) obvious stuff.
Lars
On Mon, Dec 20, 2010 at 8:22 PM, Bradford S
If this is up on EC2 then you may know that write performance is an order of
magnitude slower than on a comparable dedicated cluster! Most EC2 clusters I
have tested (with and without EBS and various instance sizes etc.) only did
about 2-3MB/s -
taken this into account can you do the math if they do even less
Hi Michael,
I noticed the same and raised that issue a few days back. We will add
the documentation back in, it must have been dropped during the merge.
Thanks for bringing it up here though.
Lars
On Thu, Dec 23, 2010 at 4:32 AM, Michael Russo wrote:
> The 0.20 branch has detailed documentation
Please note that there is an issue with a test config file in the
hbase-test.jar that overrides the configuration. Can you make sure you
do not have the hbase-test.jar on your client's classpath?
Lars
On Thu, Dec 23, 2010 at 9:34 AM, King JKing wrote:
> When I comment any line contain 127.0.0.1
Writing data only hits the WAL and MemStore, so that should equal in
the same performance for both models. One thing that Mike mentioned is
how you distribute the load. How many servers are you using? How are
inserting your data (sequential or random)? Why do you use a Put since
this sounds like a
Hi Marc,
> 1) It seems importtsv will only accept one family at a time. It shows some
> sort of security access error if I give it a column list with columns from
> different families. Is this a limitation of the bulk loader, or is this a
> consequence of some security configuration somewhere?
T
>
> Does that sound right?
>
> Marc
>
>
> On Thu, Dec 23, 2010 at 2:34 PM, Todd Lipcon wrote:
>
>> You beat me to it, Lars! Was writing a response when some family arrived
>> for
>> the holidays, and when I came back, you had written just what I had started
Hi H,
While you can do that by hand I strongly recommend using Apache Whirr
(http://incubator.apache.org/projects/whirr.html) which has Hadoop and
(in trunk now) also HBase support, straight from the Apache tarballs.
If you want to set them up manually then you simply spin up N machines
and follo
p.com/
>
>
>
> - Original Message
>> From: Lars George
>> To: user@hbase.apache.org
>> Sent: Mon, January 3, 2011 12:32:11 PM
>> Subject: Re: Hbase/Hadoop cluster setup on AWS
>>
>> Hi H,
>>
>> While you can do that by hand I stro
.philwhln.com/map-reduce-with-ruby-using-hadoop
>
> Thanks,
> Phil
>
> On Tue, Jan 4, 2011 at 12:23 AM, Lars George wrote:
>> Hi Otis,
>>
>> It also supports CDH although it does only start Hadoop
>> (HDFS/MapReduce). I am going to open a JIRA to facilitate the s
10-245-121-242.ec2.internal:50075/browseDirectory.jsp?namenodeInfoPort=50070&dir=/
>
> I'm guessing if you do not browse the file-system through the browser.
> Otherwise, it's possible I missed a step that then required me to have
> to do this.
>
> Cheers,
> Phil
>
>
Hi,
I ran some tests on various EC2 clusters from c1.medium, c1.xlarge to
m2.2xlarge with EBS on 1+10 instances. The instance storage usually
averages at around 2-3 MB/s for writes and the EBS backed m2.2xlarge
did 7-8 MB/s on writes. Reading I think is less an issue, but writing
is really bad. On
I never got that reverse logic, who came up with that?
On Wed, Jan 5, 2011 at 6:59 PM, Jean-Daniel Cryans wrote:
> What you are doing is filtering out the rows with the value you are looking
> for.
>
> J-D
>
> On Wed, Jan 5, 2011 at 12:45 AM, how to get the cell value
> wrote:
>> I would like
Oh, well spotted Tatsuya, didn't even see his use of "tablename" there!
On Thu, Jan 6, 2011 at 4:22 AM, Tatsuya Kawano wrote:
>
> Hi,
>
>> byte [] by = rs.getValue(Bytes.toBytes(tablename));
>
> Result#getValue() doesn't take a table name but a column name in
> "family:qualifier" format, so you
Hi Joel,
Marking it "in-memory" is *not* making it all stay or be loaded into
memory. It is just a priority flag to retain blocks of that CF
preferably in the block caches. So it caches it up to the max block
cache size. The rest may cause some churn but that is the best you can
do.
Lars
On Tue,
Hi Wayne,
0.90.0 is out. Get it while it's hot from the HBase home page.
Lars
On Jan 21, 2011, at 20:22, Wayne wrote:
> I enthusiastically created a ticket:
> https://issues.apache.org/jira/browse/HBASE-3463
>
> This might be a dumb question I should already know the answer to...but when
> i
I agree with Friso, using Todd's LZO Packager this is really easy:
https://github.com/toddlipcon/hadoop-lzo-packager
Lars
On Wed, Jan 26, 2011 at 10:41 AM, Friso van Vollenhoven
wrote:
> Are you sure it is not a problem with the network on your side? Clicking the
> link and downloading that ja
No as the data is now stored compressed and needs to be read somehow.
You simply have to follow the steps outlined above and get it started
again. :(
Lars
On Tue, Jan 25, 2011 at 8:16 PM, Peter Haidinyak wrote:
> Thanks, is there a way to turn off compression on a table when the region
> server
Benoit,
You probably tripped this up? https://issues.apache.org/jira/browse/HBASE-3476
Lars
On Wed, Jan 26, 2011 at 5:53 AM, tsuna wrote:
> You can run ``hbase org.apache.hadoop.hbase.io.hfile.HFile -f
> "$region" -m'' where $region is every HFile (located under
> /hbase/$table/*/$family). Thi
Hi,
Have a look at
http://ikaisays.com/2011/01/25/app-engine-datastore-tip-monotonically-increasing-values-are-bad/
This is awesome and applies equally to HBase, simply s/Tablet/Region/g
and you have the same issues using sequential row keys.
Ikai, if you read this, please do another set for HB
Hi Nanheng,
You have to use its own configuration key to enable it. So when you
create the job configuration do add an
conf.set("hfile.compression", "gz");
Obviously before you create the job or use
job.getConfiguration().set("hfile.compression", "gz");
instead.
Lars
On Thu, Jan 27, 2011 at
Hi Stuart,
Do you have the usual
job.setJarByClass(YourJob.class);
?
Lars
On Thu, Jan 27, 2011 at 7:53 AM, Stack wrote:
> Does the job run anyway?
> St.Ack
>
> On Wed, Jan 26, 2011 at 10:43 PM, Stuart Scott wrote:
>> Hi,
>>
>>
>>
>> Has anyone come across the error below? Any ideas how to resol
Hi Jesse,
Yeah, I'd recommend Todd's version as well. I will ask if we should
update the Wiki accordingly.
Lars
On Wed, Jan 26, 2011 at 9:27 PM, Jesse Hutton wrote:
> There may still be an issue with http://github.com/kevinweil/hadoop-lzo and
> CDH3B3. I ran into something similar to
> https://
, so it also works
> for the local installs on dev machines...
>
>
> Friso
>
>
>
> On 26 jan 2011, at 12:38, Lars George wrote:
>
>> I agree with Friso, using Todd's LZO Packager this is really easy:
>> https://github.com/toddlipcon/hadoop-lzo-packager
&g
AME+"_"+tablename);
> job.setJarByClass(RowCount.class);
>
>
> -Original Message-
> From: Lars George [mailto:lars.geo...@gmail.com]
> Sent: 27 January 2011 07:07
> To: user@hbase.apache.org
> Subject: Re: No job jar file set Map Reduce Job
>
>
Hi Pete,
Look into the Mozilla Socorro project
(http://code.google.com/p/socorro/) for how to "salt" the keys to get
better load balancing across sequential keys. The principle is to add
a salt, in this case a number reflecting the number of servers
available (some multiple of that to allow for gr
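The salting idea can be sketched as follows — a small Python illustration of the principle. The bucket count and key format here are arbitrary choices for demonstration, not Socorro's actual scheme:

```python
def salted_key(row_key, num_buckets=8):
    """Prefix a sequential row key with a deterministic salt so writes
    spread across num_buckets key ranges instead of hammering one region."""
    # Use a stable hash (sum of bytes), NOT Python's randomized hash()
    salt = sum(row_key.encode("utf-8")) % num_buckets
    return "%d-%s" % (salt, row_key)

# Sequential keys now fan out over several key ranges:
keys = [salted_key("user%05d" % i) for i in range(4)]
```

The trade-off is on the read side: a scan over the original key order now has to fan out over all buckets and merge the results.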
Thank you Stack for doing this. Appreciated.
On Mon, Jan 31, 2011 at 5:33 AM, Stack wrote:
> Daniel:
>
> It looks like 0.90.0 hbase is showing in the releases repository now.
> Let me know if an issue with it.
>
> Sorry it took so long,
> St.Ack
>
>
>
> On Wed, Jan 26, 2011 at 1:05 PM, Daniel Ian
Hi Dean,
Yes you can do that. See a (slightly outdated) post about them here
http://hbaseblog.com/2010/11/30/hbase-coprocessors/
Think of Coprocessors at their simplest level as triggers before and
after basically every event that can happen on the server side. You
can use this as you intended to
Sorry for the late bump...
It is quite nice to store JSON as strings in HBase, i.e. use for
example JSONObject to convert to something like { "name": "lars" }
and then Bytes.toBytes(jsonString). Since Hive now has a HBase handler
you can use Hive and its built in JSON support to query cells lik
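The round trip described — serialize a record to a JSON string, then store its UTF-8 bytes in a cell — looks like this as a sketch. The original mail uses Java's JSONObject and Bytes.toBytes; this Python version is only an analogous illustration:

```python
import json

def to_cell_bytes(record):
    """Serialize a record to the UTF-8 bytes that would go into a cell."""
    return json.dumps(record).encode("utf-8")

def from_cell_bytes(data):
    """Recover the record from the stored cell bytes."""
    return json.loads(data.decode("utf-8"))

cell = to_cell_bytes({"name": "lars"})
```

Because the cell content is plain JSON text, a layer with JSON support (such as Hive, as mentioned above) can then parse fields out of it at query time.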
Hi Stack,
I was just asking Todd the same thing, ie. fixed new size vs NewRatio.
He and you have done way more on GC debugging than me so I trust
whatever Todd or you say. I would leave the UseParNewGC for good
measure (not relying on implicit defaults). I also re-read just before
I saw your reply