I need to run through some server maintenance on my data nodes, including a
reboot. My splitlogs, though, only seem to have a replication factor of 1
(when a data node is taken offline, I sometimes have missing blocks for
it). I know I can decommission data nodes with the exclude.dfs file, but
t
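A rough sketch of checking and raising replication on those files before
taking nodes down; the /hbase/splitlog path and the factor of 3 are
illustrative, not prescriptive:

  # Surface under-replicated or missing blocks before the reboot:
  hdfs fsck /hbase -files -blocks -locations | grep -i 'Under replicated'
  # Bump the replication factor on the split-log files and wait for it to take:
  hdfs dfs -setrep -w 3 /hbase/splitlog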
piece of software yourself.
>
> Best regards,
> Vladimir Rodionov
> Principal Platform Engineer
> Carrier IQ, www.carrieriq.com
> e-mail: vrodio...@carrieriq.com
>
> ____
> From: Patrick Schless [patrick.schl...@gmail.com]
> Sent:
q.com
>
>
> From: Ted Yu [yuzhih...@gmail.com]
> Sent: Friday, December 13, 2013 3:33 PM
> To: user@hbase.apache.org
> Cc: user
> Subject: Re: 3-Hour Periodic Network/CPU/Disk/Latency Spikes
>
> Patrick:
> Attachment didn't go through
rs enough store files build up to require compactions.
>
> There's nothing else automated in HDFS or HBase that I could see causing
> this.
>
> On Fri, Dec 13, 2013 at 3:07 PM, Patrick Schless
> wrote:
>
> > CDH4.1.2
> > HBase 0.92.1
> > HDFS 2.0.0
CDH4.1.2
HBase 0.92.1
HDFS 2.0.0
Every 3 hours, our production HBase cluster does something that causes all
the data nodes to have a sustained spike in CPU/network/disk. The spike
lasts about 30 mins, and during this time the cluster has greatly increased
latencies for our typical application usage.
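If compactions turn out to be the culprit, one common mitigation of that era
is to disable time-based major compactions (hbase.hregion.majorcompaction set
to 0 in hbase-site.xml) and trigger them off-peak instead; a sketch, with the
table name illustrative:

  # Run from cron at a quiet hour rather than letting the interval timer fire:
  echo "major_compact 'mytable'" | hbase shell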
to make use of the setBatch
> feature
>
> Regards,
> Dhaval
>
>
> ____________
> From: Patrick Schless
> To: user
> Sent: Monday, 4 November 2013 6:03 PM
> Subject: Scanner Caching with wildly varying row widths
>
>
We have an application where a row can contain anywhere between 1 and
360 cells (there's only 1 column family). In practice, most rows have
under 100 cells.
Now we want to run some mapreduce jobs that touch every cell within a range
(e.g., count how many cells we have). With scanner caching set
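The idea behind batching, sketched in the hbase shell (option support varies
by version; table name illustrative): BATCH caps cells returned per row per
call, so one very wide row can't blow up a single RPC, while CACHE stays modest.

  # At most 100 cells per row per RPC, 10 rows cached per round-trip:
  scan 'mytable', {CACHE => 10, BATCH => 100}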
I should mention that I shot an email to Cloudera today, and expect to hear
from them soon. It might be that they're the standard go-to for this sort
of thing, but I'm wondering if there are other options I should be
considering.
On Mon, Nov 4, 2013 at 4:36 PM, Patrick Schless
wrote:
Our team's strengths lie more around application development and devops
than jvm and hadoop tuning. We currently run two CDH4 clusters (without the
cloudera manager), and are interested in establishing a relationship with a
consultancy that can help us tune and maintain things (config, server
spec
We run two hbase clusters, and one (master) replicates to the other
(standby). We did some maintenance last night which involved bringing all
of hbase down while we made changes to HDFS. After bringing things back up,
our ageOfLastShippedOp on a few of the master region servers jumped to
around -9
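One way to watch that metric without bouncing anything, assuming your
version's info server exposes the /jmx servlet (hostname and the 0.92-era
default port are illustrative):

  curl -s http://regionserver01.example.com:60030/jmx | grep -i ageOfLastShippedOp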
Are you missing data? Replication/recovery of blocks is automatic, and
there isn't a manual process to it.
FWIW, for something like changing the hard drive configs on the box, it
would be a better idea to unbalance the node ahead of time and then
rebalance it.
On Fri, Aug 9, 2013 at 7:18 AM, oc
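A sketch of the usual two-sided drain before pulling a box; the hostname is
illustrative and the exclude file is whatever dfs.hosts.exclude points at in
your configs:

  # HBase side: move regions off the box first (script ships in HBase's bin/):
  bin/graceful_stop.sh datanode01.example.com
  # HDFS side: add the host to the dfs.hosts.exclude file, then tell the NN:
  hdfs dfsadmin -refreshNodes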
Doing a stop on the master and the region
> > servers will screw things up.
> >
> > J-D
> >
> > On Fri, Aug 2, 2013 at 3:28 PM, Patrick Schless
> > wrote:
> > > Doesn't stop-hbase.sh (and its ilk) require the server to be able to
> > manage
> >
I run HBase replication, and while improperly restarting my standby cluster
I lost a few splitlog blocks in my replicated table (on the standby
cluster).
I'm thinking that my standby table is possibly borked now (I can't use the
VerifyRep job because I use the increment API). Is it reasonable to f
Doing a stop on the master and the region
> servers will screw things up.
>
> J-D
>
> On Fri, Aug 2, 2013 at 3:28 PM, Patrick Schless
> wrote:
> > Doesn't stop-hbase.sh (and its ilk) require the server to be able to
> manage
> > the clients (using unpassworded SSH keys
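For what it's worth, the cluster scripts are just an SSH fan-out; if
passwordless SSH is off the table, the per-daemon script can be run locally
on each node instead (a sketch; run on the host in question):

  bin/hbase-daemon.sh stop regionserver   # on each region server
  bin/hbase-daemon.sh stop master         # on the master host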
at org.apache.hadoop.hbase.util.HBaseFsck.main(HBaseFsck.java:3205)
On Fri, Aug 2, 2013 at 11:31 AM, Ted Yu wrote:
> Can you try running:
>
> hbck -repair
>
> On Fri, Aug 2, 2013 at 9:28 AM, Patrick Schless
> wrote:
>
> > I was testing an hbase table rename script I found in a JIRA
Jean-Daniel Cryans wrote:
> Doing a bin/stop-hbase.sh is the way to go, then on the Hadoop side
> you do stop-all.sh. I think your ordering is correct but I'm not sure
> you are using the right commands.
>
> J-D
>
> On Fri, Aug 2, 2013 at 8:27 AM, Patrick Schless
> wrote:
> >
I was testing an hbase table rename script I found in a JIRA, and it didn't
work for me. Not a huge deal (I went with a different solution), but it
left some data I want to clean up.
I was trying to rename a table from "t1" to "t1.renamed". Now in HBase,
'list' shows 't1.renamed'. In HDFS, I have
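A quick way to compare what HBase lists against what's actually on disk
before cleaning anything up (root dir illustrative):

  echo "list" | hbase shell
  hdfs dfs -ls /hbase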
the Namenode and
> datanode logs? I'd suggest you start by doing a fsck on one of those
> files with the option that gives the block locations first.
>
> By the way, why do you have split logs? Are region servers dying every
> time you try out something?
>
> On Thu, Aug 1, 20
anyone else.
On Thu, Aug 1, 2013 at 5:04 PM, Jean-Daniel Cryans wrote:
> I can't think of a way how your missing blocks would be related to
> HBase replication, there's something else going on. Are all the
> datanodes checking back in?
>
> J-D
>
> On Thu, Aug 1,
Is there a way to reload the HBase configs without restarting the whole
system (in other words, without an interruption of service)?
I'm on:
CDH4.1.2
HBase 0.92.1
Hadoop 2.0.0
Thanks,
Patrick
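In this era most properties are only read at startup, so the usual answer is
a rolling restart rather than a live reload; graceful_stop.sh ships with
HBase, and the flags below restart a region server and move its regions back
(hostname illustrative):

  bin/graceful_stop.sh --restart --reload --debug regionserver01.example.com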
I'm running:
CDH4.1.2
HBase 0.92.1
Hadoop 2.0.0
Is there an issue with restarting a standby cluster with replication
running? I am doing the following on the standby cluster:
- stop hmaster
- stop name_node
- start name_node
- start hmaster
When the name node comes back up, it's reliably missing
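A more defensive version of that sequence, assuming the stock daemon scripts
(init-script names vary by packaging): make the master wait until the
namenode has all block reports in before it comes back.

  bin/hbase-daemon.sh stop master
  sbin/hadoop-daemon.sh stop namenode
  sbin/hadoop-daemon.sh start namenode
  hdfs dfsadmin -safemode wait     # block until HDFS leaves safe mode
  bin/hbase-daemon.sh start master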
tables in two different clusters.
> WARNING: It doesn't work for incrementColumnValues'd cells since the
> timestamp is changed after being appended to the log.
>
>
> The problem is that increments' timestamps are different in the WAL
> and in the final KV that's sto
On Thu, Jul 11, 2013 at 12:53 PM, Jean-Daniel Cryans wrote:
> Are those incremented cells?
>
> J-D
>
> On Thu, Jul 11, 2013 at 10:23 AM, Patrick Schless
> wrote:
> > I have had replication running for about a week now, and have had a lot
> of
> > data flowing to o
I have had replication running for about a week now, and have had a lot of
data flowing to our slave cluster over that time. Now, I'm running the
verifyrep MR job over a 1-hour period a couple days ago (which should be
fully replicated), and I'm seeing a small number of "BADROWS".
Spot-checking a few
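For reference, the shape of a windowed verify run (peer id, epoch-millisecond
bounds, and table name are all illustrative):

  hbase org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication \
      --starttime=1373500800000 --stoptime=1373504400000 1 mytable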
n section 2.3.1, you would see that its value is 1.
>
> Cheers
>
> On Thu, Jul 11, 2013 at 9:28 AM, Patrick Schless
> wrote:
>
> > In 0.94 I noticed (in the "Job File") my VerifyRep job was running
> with
> > hbase.client.scanner.caching set to 1, even
In 0.94 I noticed (in the "Job File") my VerifyRep job was running with
hbase.client.scanner.caching set to 1, even though the hbase docs [1] say
it defaults to 100. I didn't have that property being set in any of my
configs. I added the properties to hbase-site.xml (set to 100), and now
that j
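One way to pin the value per job without touching configs, assuming the tool
parses Hadoop's generic options, as the stock HBase MR jobs do (table name
illustrative):

  hbase org.apache.hadoop.hbase.mapreduce.RowCounter \
      -Dhbase.client.scanner.caching=100 mytable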
On Thu, Jul 11, 2013 at 8:46 AM, Patrick Schless
wrote:
> Yes [1], I set that in hbase-site.xml when I turned on replication. This
> box is solely my job-tracker, so maybe it doesn't pick up the
> hbase-site.xml? Trying this job from the HMaster didn't work, because it
> doesn't h
configuration value is false.
> Below is the relevant code:
> if (!conf.getBoolean(HConstants.REPLICATION_ENABLE_KEY, false)) {
>   throw new IOException("Replication needs to be enabled to verify it.");
> }
>
> Jieshan
> -----Original Message-----
On 0.92.1, I have (recently) enabled replication, and I'm trying to verify
that it's working correctly. I am getting an error saying that replication
needs to be enabled, but replication *is* enabled, so I assume I'm doing
something wrong. Looking at the age of the last shipped op (on the master
cl
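Since that check runs in the client JVM, it's worth confirming what the
submitting host actually resolves; HBaseConfTool just prints the effective
value of a configuration key as that client sees it:

  hbase org.apache.hadoop.hbase.util.HBaseConfTool hbase.replication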
https://issues.apache.org/jira/browse/HBASE-7122
>
> Thanks,
> Himanshu
>
>
> On Tue, Jul 2, 2013 at 3:09 PM, Patrick Schless
> wrote:
>
> > I've just enabled replication (to 1 peer), and I'm seeing a bunch of
> > errors, along the lines of [1]. Replication does seem to work, though
I've just enabled replication (to 1 peer), and I'm seeing a bunch of
errors, along the lines of [1]. Replication does seem to work, though (data
is showing up in the standby cluster).
The file exists (I can see it in the HDFS web GUI), but it seems to be empty.
Is this an error I need to worry about
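Two ways to poke at the file in question (paths illustrative); note a
zero-length listing for the WAL currently being written is normal, since its
length isn't updated until the log rolls. The HLog main class of this vintage
is assumed to support --dump:

  hdfs dfs -ls /hbase/.logs/regionserver01.example.com,60020,1372700000000
  hbase org.apache.hadoop.hbase.regionserver.wal.HLog --dump /hbase/.logs/somefile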
sure thing: https://issues.apache.org/jira/browse/HBASE-8844
On Mon, Jul 1, 2013 at 3:59 PM, Jean-Daniel Cryans wrote:
> Yeah that package documentation ought to be changed. Mind opening a jira?
>
> Thx,
>
> J-D
>
> On Mon, Jul 1, 2013 at 1:51 PM, Patrick Schless
>
The first two tutorials for enabling replication that google gives me [1],
[2] take very different tones with regard to stop_replication. The HBase
docs [1] make it sound fine to start and stop replication as desired. The
Cloudera docs [2] say it may cause data loss.
Which is true? If data loss is
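The command in question is the cluster-wide switch in the shell; the
conservative reading is to treat it as a kill switch rather than a pause,
since on these versions edits written while it is off may never be queued
for shipping:

  echo "stop_replication" | hbase shell
  echo "start_replication" | hbase shell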
in this case you have to disable your table first.
>
>
> Matteo
>
>
>
> On Wed, Jun 19, 2013 at 6:19 PM, Patrick Schless
> wrote:
>
> > Unfortunately, I'm on 0.92.1, and the snapshot approach you linked isn't
> > available until 0.94. Bummer, looked
doesn't seem to be a good way to rename a table
>>
>> Have you looked at http://hbase.apache.org/book.html#table.rename ?
>>
>> Cheers
>>
>> On Mon, Jun 17, 2013 at 12:20 PM, Patrick Schless <
>> patrick.schl...@gmail.com
>> > wrote:
>>
On Wed, Jun 19, 2013 at 12:41 AM, Stack wrote:
> On Mon, Jun 17, 2013 at 12:06 PM, Patrick Schless <
> patrick.schl...@gmail.com
> > wrote:
>
> > Working on setting up HBase replication across a VPN tunnel, and
> following
> > the docs here: [1] (and here: [2]).
Have you looked at http://hbase.apache.org/book.html#table.rename ?
>
> Cheers
>
> On Mon, Jun 17, 2013 at 12:20 PM, Patrick Schless <
> patrick.schl...@gmail.com
> > wrote:
>
> > Context:
> > I'm working on getting replication set up, and a prerequisite for me is
> to
> >
Context:
I'm working on getting replication set up, and a prerequisite for me is to
rename the table (since you have to replicate to the same name as the
source). For this, I'm testing a CopyTable strategy, since there doesn't
seem to be a good way to rename a table (please correct me if I'm wrong)
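For reference, the copy-as-rename shape with CopyTable (the --new.name flag
is real; table names illustrative):

  hbase org.apache.hadoop.hbase.mapreduce.CopyTable --new.name=t1.renamed t1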
Working on setting up HBase replication across a VPN tunnel, and following
the docs here: [1] (and here: [2]).
Two questions, regarding firewall allowances required:
1) The docs say that the zookeeper clusters must be able to reach each
other. I don't see any docs on why this is (the high-level di
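As a sketch of the allowances, using the default ports of this era (2181
ZooKeeper client, 60000 master RPC, 60020 region server RPC; the peer CIDR
is illustrative):

  iptables -A INPUT -p tcp -s 10.8.0.0/24 -m multiport \
      --dports 2181,60000,60020 -j ACCEPT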
I like having access to the web admin pages that HBase, HDFS, etc
provide. I can't find a way to put them behind SSL, though. For the
HMaster it's easy enough (nginx+SSL as a reverse proxy), but the
HMaster generates links like data01.company.com:60030. Is there a way
to change the scheme and port
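One stopgap until the per-node UIs sit behind a proxy is to tunnel them on
demand instead of exposing them (hostnames illustrative):

  ssh -N -L 60030:data01.company.com:60030 gateway.company.com
  # then browse http://localhost:60030 over the encrypted tunnel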
I am trying to find out the number of data points (cells) in a table
with "hbase org.apache.hadoop.hbase.mapreduce.CellCounter
". on a very small table (3 cells), it works fine. On a table
with a couple thousand cells, I get this error (4 times):
org.apache.hadoop.mapred.Counters$CountersExceededException
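The usual fix is raising the counter ceiling for the job; the property name
differs across MR1/MR2, so both are shown (values illustrative, assuming the
job honors -D generic options; the JobTracker side may need the raised limit
too):

  hbase org.apache.hadoop.hbase.mapreduce.CellCounter \
      -Dmapreduce.job.counters.limit=10000 \
      -Dmapreduce.job.counters.max=10000 \
      mytable /tmp/cellcounter-out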
the old nodes), I was able to remove the /etc/hosts entries and
bounce hbase without any problem
I still have no idea where the new hbase is getting the references to the
old nodes.
Filed a bug report: https://issues.apache.org/jira/browse/HBASE-6343
On Thu, Jul 5, 2012 at 6:44 PM, Patrick
I have an existing hbase cluster (old.domain.com) and I am trying to
migrate the data to a new set of boxes (new.domain.com). Both are running
hbase 0.90.x.
I would like to minimize downtime, so I'm looking at the Backup tool from
mozilla (
http://blog.mozilla.org/data/2011/02/04/migrating-hbase-i
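The underlying move in that tool is close to a distcp of the table
directories followed by re-inserting the regions into .META.; the raw copy
step looks roughly like this (hosts, ports, and paths illustrative, with
hftp as the usual cross-version-safe source scheme):

  hadoop distcp hftp://old.domain.com:50070/hbase/t1 hdfs://new.domain.com:8020/hbase/t1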