Re: Column families

2017-06-22 Thread Brian Jeltema
One use case that applies to my tables: I have a table with a set of columns whose data is always processed with MR jobs, and other, rather large, columns that are generally only accessed through a UI. By separating those into two column families, MR jobs that do a full table scan
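
A minimal hbase shell sketch of that layout (table and family names are illustrative): one family for the data the MR jobs scan, one for the large UI-only columns.

    create 'mytable', {NAME => 'mr'}, {NAME => 'ui'}

In the MR job, calling Scan.addFamily for the 'mr' family then keeps the large 'ui' data out of the full-table scan entirely.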

Re: `hbase classpath` command causes “File name too long” error

2016-06-17 Thread Brian Jeltema
It’s the first ‘$’ that’s killing you. It should be: export HADOOP_CLASSPATH=`${HBASE_HOME}/bin/hbase classpath` > On Jun 17, 2016, at 6:41 AM, Mahesha999 wrote: > > I am trying out HBase bulk loading. The command looks like this: > >
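
For reference, a hedged sketch of the corrected sequence (paths, jar version and table name are illustrative; the second line follows the standard completebulkload usage):

    export HADOOP_CLASSPATH=`${HBASE_HOME}/bin/hbase classpath`
    hadoop jar ${HBASE_HOME}/lib/hbase-server-VERSION.jar completebulkload /user/me/hfiles mytable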

constantly adding snapshot info

2015-12-27 Thread Brian Jeltema
I recently upgraded to Hadoop 2.7.1 and HBase 1.1.2. Since the upgrade, the HBase master logs have been filling and rolling over about every 5 minutes, filled with variations of the following (modified to obscure internal details): 2015-12-27 12:24:52,481 DEBUG

Re: constantly adding snapshot info

2015-12-27 Thread Brian Jeltema
in the previous post. Brian > > Regards > Samir > > On Sun, Dec 27, 2015 at 6:34 PM, Brian Jeltema <bdjelt...@gmail.com> wrote: > >> I recently upgraded to Hadoop 2.7.1 and HBase 1.1.2. >> >> Since the upgrade, the HBase master logs have be

Re: regions in transition

2015-12-23 Thread Brian Jeltema
I appear to have resolved the OOM error by greatly increasing the max process limit (to 64K). Using HDP 2.1 a limit of 1024 seemed to be working OK. I’m surprised I had to make a change of this magnitude. Brian > On Dec 23, 2015, at 7:20 AM, Brian Jeltema <bdjelt...@gmail.com&
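
A hedged sketch of that change, assuming the limit in question is the per-user process/thread limit (nproc) set via limits.d; the file name and user are illustrative:

    # /etc/security/limits.d/hbase-nproc.conf
    hbase  soft  nproc  65536
    hbase  hard  nproc  65536

    # verify as the hbase user:
    ulimit -u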

Re: regions in transition

2015-12-23 Thread Brian Jeltema
be causing an OOM error? Thanks Brian > On Dec 22, 2015, at 12:46 PM, Brian Jeltema <bdjelt...@gmail.com> wrote: > >> >> You should really find out where you hmaster ui lives (there is a master UI >> for every node provided by the apache project) because it gives y

Re: regions in transition

2015-12-22 Thread Brian Jeltema
safe to stop HBase and delete the ZK node? > > Thanks > > On Mon, Dec 21, 2015 at 3:54 PM, Brian Jeltema <bdjelt...@gmail.com> wrote: > >> I am doing a cluster upgrade to the HDP 2.2 stack. For some reason, after >> the upgrade HBase >> cannot find any regions for

Re: regions in transition

2015-12-22 Thread Brian Jeltema
(onlineRegions != null && onlineRegions.size() > 0) %> >> ... >> <%else> >>Not serving regions >> >> >> The message means that there was no region online on the underlying server. >> >> FYI >> >> On Tue, Dec

Re: regions in transition

2015-12-22 Thread Brian Jeltema
doesn’t have any regions to serve. > On Dec 22, 2015, at 6:19 AM, Brian Jeltema <bdjelt...@gmail.com> wrote: > >> >> Can you pick a few regions stuck in transition and check related region >> server logs to see why they couldn't be assigned ? > > I don’t see

Re: regions in transition

2015-12-22 Thread Brian Jeltema
ns > > > The message means that there was no region online on the underlying server. > > FYI > > On Tue, Dec 22, 2015 at 7:18 AM, Brian Jeltema <bdjelt...@gmail.com> wrote: > >> Following up, if I look at the HBase Master UI in the Ambari console I see >&

regions in transition

2015-12-21 Thread Brian Jeltema
I am doing a cluster upgrade to the HDP 2.2 stack. For some reason, after the upgrade HBase cannot find any regions for existing tables. I believe the HDFS file system is OK. But looking at the ZooKeeper nodes, I noticed that many (maybe all) of the regions were listed in the ZooKeeper

unexpected replication on export

2015-03-09 Thread Brian Jeltema
I used ExportSnapshot to copy a snapshot from a cluster with a default replication factor of 3 to a smaller development cluster with a default replication factor of 1. The resulting table appears to have been created with a replication of 3, ignoring the default setting. Is this expected? Is
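
If the goal is simply to bring the copied files back to the destination cluster's factor, one hedged workaround (path and table name are illustrative) is to reset the HDFS replication on the restored table's directory:

    # applies recursively to the files under the table directory
    hdfs dfs -setrep 1 /apps/hbase/data/data/default/MyTable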

periodicFlusher get stuck

2015-02-24 Thread Brian Jeltema
I’m seeing occasional HBase log output similar to the output shown below. It appears there is a request to flush a region, repeated every 10 seconds, that apparently is never being performed. It’s causing MR jobs to time out because they cannot write to this region. Is this a known problem?

Re: periodicFlusher get stuck

2015-02-24 Thread Brian Jeltema
/configuration On Feb 24, 2015, at 11:28 AM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote: Interesting... Can you share your hbase-site.xml? Have you setup hbase.regionserver.optionalcacheflushinterval? Can you hadoop fs -ls -R this region folder? 2015-02-24 11:15 GMT-05:00 Brian Jeltema

Re: periodicFlusher get stuck

2015-02-24 Thread Brian Jeltema
, Brian Jeltema brian.jelt...@digitalenvoy.net wrote: I’m seeing occasional HBase log output similar to the output shown below. It appears there is a request to flush a region, repeated every 10 seconds, that apparently is never being performed. It’s causing MR jobs to time out because

Re: http://stackoverflow.com/questions/28350940/cannot-start-standalone-instance-of-hbase

2015-02-06 Thread Brian Jeltema
You’re running on Windows. Did you follow this: http://hbase.apache.org/cygwin.html On Feb 6, 2015, at 2:39 AM, sibtain sibtain_ab...@ymail.com wrote: Please follow this link http://stackoverflow.com/questions/28350940/cannot-start-standalone-instance-of-hbase for my question. I'm

Re: 0.94 going forward

2014-12-16 Thread Brian Jeltema
I have been able to export snapshots from 0.94 to 0.98. I’ve pasted the instructions that I developed and published on our internal wiki. I also had to significantly increase retry count parameters due to a high number of timeout failures during the export. Cross-cluster transfers To export
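
A minimal sketch of the transfer step, consistent with the command shown in the 2014-09-16 message below (snapshot name, host and port are illustrative):

    # run on the 0.94 (source) cluster, copying into the 0.98 cluster over webhdfs
    hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot \
        -snapshot my_snapshot \
        -copy-to webhdfs://namenode-on-098-cluster:50070/apps/hbase/data \
        -mappers 12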

Re: what can cause RegionTooBusyException?

2014-11-11 Thread Brian Jeltema
Request Count. You can monitor the value for the underlying region to see if it receives above-normal writes. Cheers On Mon, Nov 10, 2014 at 4:06 PM, Brian Jeltema bdjelt...@gmail.com wrote: Was the region containing this row hot around the time of failure ? How do I measure

what can cause RegionTooBusyException?

2014-11-10 Thread Brian Jeltema
I’m running a map/reduce job against a table that is performing a large number of writes (probably updating every row). The job is failing with the exception below. This is a solid failure; it dies at the same point in the application, and at the same row in the table. So I doubt it’s a conflict

Re: what can cause RegionTooBusyException?

2014-11-10 Thread Brian Jeltema
using ? Cheers On Mon, Nov 10, 2014 at 11:10 AM, Brian Jeltema brian.jelt...@digitalenvoy.net wrote: I’m running a map/reduce job against a table that is performing a large number of writes (probably updating every row). The job is failing with the exception below. This is a solid

Re: what can cause RegionTooBusyException?

2014-11-10 Thread Brian Jeltema
with monitoring tool) what memstore pressure was ? Thanks On Nov 10, 2014, at 11:34 AM, Brian Jeltema brian.jelt...@digitalenvoy.net wrote: How many tasks may write to this row concurrently ? only 1 mapper should be writing to this row. Is there a way to check which locks are being held

snapshot timeouts

2014-10-08 Thread Brian Jeltema
I’m trying to snapshot a moderately large table (3 billion rows, but not a huge amount of data per row). Those snapshots have been timing out, so I set the following parameters to relatively large values: hbase.snapshot.master.timeoutMillis hbase.snapshot.region.timeout
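
A hedged example of how such overrides might look in hbase-site.xml; the first key name is as given above, the second appears truncated there (hbase.snapshot.region.timeout is the full name in 0.98-era releases, but verify against your version), and the values are illustrative:

    <property>
      <name>hbase.snapshot.master.timeoutMillis</name>
      <value>600000</value>
    </property>
    <property>
      <name>hbase.snapshot.region.timeout</name>
      <value>600000</value>
    </property>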

Re: snapshot timeouts

2014-10-08 Thread Brian Jeltema
more information: the release of HBase you're using; the value for hbase.rpc.timeout (looks like you leave it at the default); more of the error (please include the stack trace if possible). Cheers On Wed, Oct 8, 2014 at 12:09 PM, Brian Jeltema brian.jelt...@foo.net wrote: I’m trying to snapshot

Re: snapshot timeouts

2014-10-08 Thread Brian Jeltema
is screwed up. So I’ll clean up the mess and try again tomorrow. Regrets for the possible false alarm Brian On Oct 8, 2014, at 3:25 PM, Brian Jeltema brian.jelt...@digitalenvoy.net wrote: Sorry, I usually include that info. HBase version is 0.98. hbase.rpc.timeout is the default. When

ExportSnapshot webhdfs problems

2014-09-30 Thread Brian Jeltema
I’m trying to use ExportSnapshot to copy a snapshot from a Hadoop 1 to a Hadoop 2 cluster using the webhdfs protocol. I’ve done this successfully before, though there are always mapper failures and retries in the job log. However, I’m not having success with a rather large table due to an

problem restoring snapshot

2014-09-25 Thread Brian Jeltema
I exported a snapshot to another cluster, same version of all software. A restore_snapshot on the target system hung and eventually timed out, I think due to file ownership issues. I restored hbase ownership to everything in /apps/hbase and tried the restore_snapshot again. It’s still hanging,

Re: problem restoring snapshot

2014-09-25 Thread Brian Jeltema
: Preconditions.checkArgument(!metaChanges.hasRegionsToRestore(), A clone should not have regions to restore); Was there region split prior to snapshot restore action ? Cheers On Thu, Sep 25, 2014 at 9:19 AM, Brian Jeltema brian.jelt...@digitalenvoy.net wrote: I exported a snapshot

Re: problem restoring snapshot

2014-09-25 Thread Brian Jeltema
, Brian Jeltema brian.jelt...@digitalenvoy.net wrote: The table did not exist on the target cluster when I tried the first restore_clone. Is there some way I can delete all traces of the table and start over? On Sep 25, 2014, at 12:25 PM, Ted Yu yuzhih...@gmail.com wrote: It is from

Re: problem restoring snapshot

2014-09-25 Thread Brian Jeltema
Does hbck report any inconsistency ? Not for the table in question. There are inconsistencies in an unrelated table. I do see related content in: /apps/hbase/data/.tmp/data/default/Foo is it safe to delete that stuff? Cheers On Thu, Sep 25, 2014 at 9:52 AM, Brian Jeltema

Re: problem restoring snapshot

2014-09-25 Thread Brian Jeltema
Deleting the contents of /apps/hbase/data/.tmp fixed the problem On Sep 25, 2014, at 1:48 PM, Ted Yu yuzhih...@gmail.com wrote: bq. is it safe to delete that stuff? Yes. You have the exported snapshot as source of truth. On Thu, Sep 25, 2014 at 10:43 AM, Brian Jeltema brian.jelt
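
For reference, the cleanup amounts to removing the leftover restore staging data under HBase's .tmp directory (path as given earlier in the thread; make sure no restore or clone is in progress first):

    hdfs dfs -rm -r /apps/hbase/data/.tmp/data/default/Foo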

Re: Copying data from 94 to 98 ..

2014-09-16 Thread Brian Jeltema
I’ve been successfully moving snapshots from 94 to 98 using webhdfs. On the 94 cluster: hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot snappy -copy-to webhdfs://host-on-98-cluster/apps/hbase/data -mappers 12 and then manually fixing the file system layout. On Sep 16,

Re: need help understand log output

2014-09-12 Thread Brian Jeltema
first began to appear, but nothing jumped out at me; a write-heavy MR job was running at the time, so there might be something buried in the noise (a lot of noise). On 2014-9-11, at 19:08, Brian Jeltema brian.jelt...@digitalenvoy.net wrote: the RS log is huge. What do you want to see other than what

Re: need help understand log output

2014-09-11 Thread Brian Jeltema
, but revisiting the code, the flush every 10s in the RS log actually comes from HRegion#shouldFlush, so something is triggering it... could you pastebin the RS log? On Wed, Sep 10, 2014 at 6:59 PM, Brian Jeltema brian.jelt...@digitalenvoy.net wrote: out of curiosity, did you see below messages

Re: need help understand log output

2014-09-10 Thread Brian Jeltema
out of curiosity, did you see the messages below in the RS log? LOG.warn("Snapshot called again without clearing previous." + " Doing nothing. Another ongoing flush or did we fail last attempt?"); Nope thanks. On Tue, Sep 9, 2014 at 2:15 AM, Brian Jeltema brian.jelt

Re: need help understand log output

2014-09-08 Thread Brian Jeltema
-hadoop2 The MR job is reading from an HBase snapshot, if that’s relevant. Cheers On Sun, Sep 7, 2014 at 8:50 AM, Brian Jeltema brian.jelt...@digitalenvoy.net wrote: I have a map/reduce job that is consistently failing with timeouts. The failing mapper log files contain a series

Re: directory usage question

2014-09-08 Thread Brian Jeltema
of initTableSnapshotMapperJob that I didn’t expect, and I’m just trying to understand what it’s doing. BTW in tip of 0.98, with HBASE-11742, related code looks a bit different. Cheers On Sun, Sep 7, 2014 at 8:27 AM, Brian Jeltema brian.jelt...@digitalenvoy.net wrote: Eclipse doesn't show

Re: need help understand log output

2014-09-08 Thread Brian Jeltema
@,1400624237999.5bb6bd41597ddd8dd7ca03e78f3a3e65. after a delay of 12420 a log entry being generated every 10 seconds starting about 4 days ago. I presume these problems are related. On Sep 8, 2014, at 7:10 AM, Brian Jeltema brian.jelt...@digitalenvoy.net wrote: When number of attempts is greater than the value

Re: need help understand log output

2014-09-08 Thread Brian Jeltema
I’ve resolved these problems by restarting the region server that owned the region in question. I don’t know what the underlying issue was, but at this point it’s not worth pursuing. Thanks for responding. Brian On Sep 8, 2014, at 11:06 AM, Brian Jeltema brian.jelt...@digitalenvoy.net wrote

Re: directory usage question

2014-09-07 Thread Brian Jeltema
probably just something I'm doing wrong. Brian Cheers On Sat, Sep 6, 2014 at 6:09 AM, Brian Jeltema brian.jelt...@digitalenvoy.net wrote: I'm trying to track down a problem I'm having running map/reduce jobs against snapshots. Can someone explain the difference between files stored

Re: directory usage question

2014-09-07 Thread Brian Jeltema
: The files under archive directory are referenced by snapshots. Please don't delete them manually. You can delete unused snapshots. Cheers On Sep 7, 2014, at 4:08 AM, Brian Jeltema brian.jelt...@digitalenvoy.net wrote: On Sep 6, 2014, at 9:32 AM, Ted Yu yuzhih...@gmail.com wrote
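
Since the archive directory only holds files still referenced by snapshots, the supported way to reclaim that space is to delete the snapshots themselves from the hbase shell (snapshot name is illustrative):

    list_snapshots
    delete_snapshot 'my_old_snapshot'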

Re: directory usage question

2014-09-07 Thread Brian Jeltema
Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) Cheers On Sun, Sep 7, 2014 at 5:48 AM, Brian Jeltema

need help understand log output

2014-09-07 Thread Brian Jeltema
I have a map/reduce job that is consistently failing with timeouts. The failing mapper log files contain a series of records similar to those below. When I look at the hbase and hdfs logs (on foo.net in this case) I don’t see anything obvious at these timestamps. The mapper task times out

directory usage question

2014-09-06 Thread Brian Jeltema
I'm trying to track down a problem I'm having running map/reduce jobs against snapshots. Can someone explain the difference between files stored in: /apps/hbase/data/archive/data/default and files stored in /apps/hbase/data/data/default (Hadoop 2.4, HBase 0.98) Thanks

Re: snapshot timeout problem

2014-07-22 Thread Brian Jeltema
I ran the balancer from hbase shell, but don’t see any change. Is there a way to balance a specific table? bq. One RegionServer has 69 regions Can you run load balancer so that your regions are better balanced ? Cheers On Mon, Jul 21, 2014 at 6:56 AM, Brian Jeltema brian.jelt

Re: snapshot timeout problem

2014-07-22 Thread Brian Jeltema
, 2014 at 6:56 AM, Brian Jeltema brian.jelt...@digitalenvoy.net wrote: There are 174 regions, not well balanced. One RegionServer has 69 regions. That RegionServer generates a series of log entries (modified and shown below), one for each region, at roughly 1 to 2 second intervals. The timeout

Re: snapshot timeout problem

2014-07-21 Thread Brian Jeltema
server which is slow in completing its part of the snapshot procedure. Have you looked at region server logs ? Feel free to pastebin relevant portion. Thanks On Jul 21, 2014, at 4:03 AM, Brian Jeltema brian.jelt...@digitalenvoy.net wrote: I’m running HBase 0.98. I’m trying to snapshot

determining which region a mapper processed

2014-07-20 Thread Brian Jeltema
When a MapReduce job is run against HBase, a mapper is created for each region (using TableInputFormat). Is there a way to look at the history and determine which region a given mapper ran against? Or can I rely on the mapper number being the same as the region number? TIA Brian
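
One hedged approach (0.98-era API, assuming TableInputFormat): rather than relying on mapper numbering, read the region's identity from the input split, which is a TableSplit carrying the table name, key range and region location. A minimal sketch (class name is illustrative):

    import java.io.IOException;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapreduce.TableMapper;
    import org.apache.hadoop.hbase.mapreduce.TableSplit;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.apache.hadoop.io.NullWritable;

    public class RegionLoggingMapper extends TableMapper<NullWritable, NullWritable> {
      @Override
      protected void setup(Context context) {
        // With TableInputFormat each split corresponds to one region; log its identity
        // so the task logs/history record which region this mapper processed.
        TableSplit split = (TableSplit) context.getInputSplit();
        System.out.println("table=" + Bytes.toString(split.getTableName())
            + " startRow=" + Bytes.toStringBinary(split.getStartRow())
            + " endRow=" + Bytes.toStringBinary(split.getEndRow())
            + " location=" + split.getRegionLocation());
      }

      @Override
      protected void map(ImmutableBytesWritable key, Result value, Context context)
          throws IOException, InterruptedException {
        // per-row processing goes here
      }
    }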

Re: Introducing Project Trafodion

2014-06-11 Thread Brian Jeltema
I'm sure you saw this, but just in case, here's another SQL interface to HBase. Brian On Jun 11, 2014, at 5:57 PM, Birdsall, Dave dave.birds...@hp.com wrote: Hi, The cat is already out of the bag on Trafodion on this dlist (and we are very happy about it actually). But, here's an official

how to get source table from MultiTableInputFormat

2014-04-23 Thread Brian Jeltema
If I’m using MultiTableInputFormat to process input from several tables in a map/reduce job, is there any way in the mapper to determine which table a given Result is coming from? Brian
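
A hedged sketch along the same lines (0.98-era API): MultiTableInputFormat also hands each mapper a TableSplit, so the source table can be captured once in setup() and consulted per record. Table and class names are illustrative:

    import java.io.IOException;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapreduce.TableMapper;
    import org.apache.hadoop.hbase.mapreduce.TableSplit;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.apache.hadoop.io.NullWritable;

    public class MultiTableMapper extends TableMapper<NullWritable, NullWritable> {
      private String sourceTable;

      @Override
      protected void setup(Context context) {
        // Every Result seen by this mapper comes from the single table behind its split.
        sourceTable = Bytes.toString(((TableSplit) context.getInputSplit()).getTableName());
      }

      @Override
      protected void map(ImmutableBytesWritable key, Result value, Context context)
          throws IOException, InterruptedException {
        if ("tableA".equals(sourceTable)) {
          // handle rows from tableA
        } else {
          // handle rows from the other tables
        }
      }
    }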

Re: Lease exception when I execute large scan with filters.

2014-04-12 Thread Brian Jeltema
I don't want to be argumentative here, but by definition it's not an internal feature because it's part of the public API. We use versioning in a way that makes me somewhat uncomfortable, but it's been quite useful. I'd like to see a clear explanation of why it exists and what use cases it was

ExportSnapshot using webhdfs

2014-03-21 Thread Brian Jeltema
Is it possible to use webhdfs to export a snapshot to another cluster? If so, what would the command look like? TIA Brian

Re: ExportSnapshot using webhdfs

2014-03-21 Thread Brian Jeltema
? Regards, Shahab On Fri, Mar 21, 2014 at 8:14 AM, Matteo Bertozzi theo.berto...@gmail.com wrote: ExportSnapshot uses the FileSystem API so you'll probably be able to say: -copy-to: webhdfs://host/path Matteo On Fri, Mar 21, 2014 at 12:09 PM, Brian Jeltema brian.jelt

Re: ExportSnapshot using webhdfs

2014-03-21 Thread Brian Jeltema
, 2014 at 4:27 PM, Brian Jeltema brian.jelt...@digitalenvoy.net wrote: Exporting across versions was why I tried webhdfs. I have a cluster running HBase 0.94 and wanted to export a table to a different cluster running HBase 0.96. I got the export to work, but attempting to do

Re: ExportSnapshot using webhdfs

2014-03-21 Thread Brian Jeltema
and the full stack trace Matteo On Fri, Mar 21, 2014 at 5:22 PM, Brian Jeltema brian.jelt...@digitalenvoy.net wrote: My paths were different, and it was .tabledesc rather than .tableinfo, but I got past the problem. Now the restore_snapshot seems to be hung, and I'm seeing many warnings

Re: ClusterId read in ZooKeeper is null

2013-07-11 Thread Brian Jeltema
This issue has been resolved. It was caused by version skew between the client library and the running service. On Jul 10, 2013, at 11:47 AM, Brian Jeltema wrote: As far as I can tell the HMaster process is running correctly. There are no obvious problems in the logs. As suggested, I

Re: ClusterId read in ZooKeeper is null

2013-07-10 Thread Brian Jeltema
value for zookeeper.znode.parent in your cluster configuration, but not set this in your client code? On Tue, Jul 9, 2013 at 10:05 AM, Brian Jeltema brian.jelt...@digitalenvoy.net wrote: I'm new to HBase, and need a little guidance. I've set up a 6-node cluster, with 3 nodes running
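
For what it's worth, the eventual root cause in this thread (see the 2013-07-11 reply above) was client/server version skew, but if zookeeper.znode.parent is non-default it does also need to be set on the client. A minimal Java sketch (quorum hosts and znode path are illustrative):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;

    public class ClientConfig {
      static Configuration create() {
        Configuration conf = HBaseConfiguration.create(); // also reads hbase-site.xml from the classpath
        conf.set("hbase.zookeeper.quorum", "zk1.example.com,zk2.example.com,zk3.example.com");
        conf.set("zookeeper.znode.parent", "/hbase-unsecure"); // must match the cluster's setting
        return conf;
      }
    }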

ClusterId read in ZooKeeper is null

2013-07-09 Thread Brian Jeltema
I'm new to HBase, and need a little guidance. I've set up a 6-node cluster, with 3 nodes running the ZooKeeper server. The database seems to be working from the hbase shell; I can create tables, insert, scan, etc. But when I try to perform operations in a Java app, I hang at: 13/07/09 12:40:34