[jira] [Commented] (HBASE-5282) Possible file handle leak with truncated HLog file.

2012-01-26 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13193917#comment-13193917
 ] 

Jonathan Hsieh commented on HBASE-5282:
---

Ah, got it.  Good catch.  

> Possible file handle leak with truncated HLog file.
> ---
>
> Key: HBASE-5282
> URL: https://issues.apache.org/jira/browse/HBASE-5282
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.0, 0.90.5, 0.92.0
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
> Attachments: hbase-5282.patch
>
>
> When debugging hbck, found that the code responsible for this exception can 
> leak open file handles.
> {code}
> 12/01/15 05:58:11 INFO regionserver.HRegion: Replaying edits from hdfs://haus01.sf.cloudera.com:56020/hbase-jon/test5/98a1e7255731aae44b3836641840113e/recovered.edits/3211315; minSequenceid=3214658
> 12/01/15 05:58:11 ERROR handler.OpenRegionHandler: Failed open of region=test5,8\x90\x00\x00\x00\x00\x00\x00/05_0,1326597390073.98a1e7255731aae44b3836641840113e.
> java.io.EOFException
>     at java.io.DataInputStream.readByte(DataInputStream.java:250)
>     at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:299)
>     at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:320)
>     at org.apache.hadoop.io.Text.readString(Text.java:400)
>     at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1486)
>     at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1437)
>     at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1424)
>     at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1419)
>     at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader$WALReader.<init>(SequenceFileLogReader.java:57)
>     at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.init(SequenceFileLogReader.java:158)
>     at org.apache.hadoop.hbase.regionserver.wal.HLog.getReader(HLog.java:572)
>     at org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEdits(HRegion.java:1940)
>     at org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEditsIfAny(HRegion.java:1896)
>     at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:366)
>     at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2661)
>     at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2647)
>     at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:312)
>     at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:99)
>     at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:158)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>     at java.lang.Thread.run(Thread.java:619)
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5282) Possible file handle leak with truncated HLog file.

2012-01-26 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13193902#comment-13193902
 ] 

Jonathan Hsieh commented on HBASE-5282:
---

True, but #replayRecoveredEdits is only used in one place, wrapped in a 
{{try/catch}} that checks for IOE, and that seems like reasonable behavior:

#replayRecoveredEditsIfAny(...)
{code}
  try {
    seqid = replayRecoveredEdits(edits, seqid, reporter);
  } catch (IOException e) {
    boolean skipErrors = conf.getBoolean("hbase.skip.errors", false);
    if (skipErrors) {
      Path p = HLog.moveAsideBadEditsFile(fs, edits);
      LOG.error("hbase.skip.errors=true so continuing. Renamed " + edits +
        " as " + p, e);
    } else {
      throw e;
    }
  }
{code}

What do you mean by protecting status.cleanup()? Checking for {{status == null}}? 
(It cannot be null.)
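
A minimal sketch of the skip-or-rethrow flow in the snippet above, using java.nio.file on a local temp directory instead of HDFS. Replayer, moveAside and the ".bad" suffix are assumptions made for illustration; they do not describe what HLog.moveAsideBadEditsFile actually does.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class SkipErrorsDemo {
  /** Hypothetical replay step; stands in for replayRecoveredEdits. */
  interface Replayer {
    long replay(Path edits, long seqid) throws IOException;
  }

  /** Renames a bad edits file so later opens do not trip over it again. */
  static Path moveAside(Path edits) throws IOException {
    // The ".bad" suffix is an assumption, not the HBase naming scheme.
    Path target = edits.resolveSibling(edits.getFileName() + ".bad");
    return Files.move(edits, target, StandardCopyOption.REPLACE_EXISTING);
  }

  /**
   * Replays one edits file. On IOException, either sidelines the file and
   * keeps the previous seqid (skipErrors == true) or rethrows.
   */
  static long replayOrSkip(Replayer r, Path edits, long seqid, boolean skipErrors)
      throws IOException {
    try {
      return r.replay(edits, seqid);
    } catch (IOException e) {
      if (!skipErrors) {
        throw e;
      }
      Path p = moveAside(edits);
      System.err.println("skipErrors=true so continuing. Renamed " + edits + " as " + p);
      return seqid;
    }
  }

  /** Runs the skip path against a temp file; true if the file was sidelined. */
  static boolean demo() {
    try {
      Path dir = Files.createTempDirectory("edits");
      Path edits = Files.createFile(dir.resolve("3211315"));
      Replayer failing = (p, s) -> { throw new IOException("truncated"); };
      long seqid = replayOrSkip(failing, edits, 42L, true);
      return seqid == 42L
          && !Files.exists(edits)
          && Files.exists(dir.resolve("3211315.bad"));
    } catch (IOException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println("sidelined: " + demo());
  }
}
```

Note that the catch block only decides what happens to the bad file; it does nothing about resources opened inside the replay step, so any reader opened there still has to be closed by the replay code itself.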






[jira] [Commented] (HBASE-5282) Possible file handle leak with truncated HLog file.

2012-01-26 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13193839#comment-13193839
 ] 

Jonathan Hsieh commented on HBASE-5282:
---


While debugging, I found that opening a region was attempting to open either a 
truncated or zero-size HLog file (which throws an IOException out of getReader) 
and leaking a file handle on every open attempt.

Patch applies on 0.92 and trunk.
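
For illustration, a minimal sketch of the leak pattern described above and one way to avoid it: if reading the header fails after the stream has been opened, close the stream before rethrowing. This is not the attached patch. HeaderReader, TrackingStream and LeakDemo are hypothetical names, and a 4-byte header stands in for the SequenceFile header.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

class LeakDemo {
  /** Returns true if a failed open still closed the underlying stream. */
  static boolean closesOnTruncatedInput() {
    TrackingStream zeroLength = new TrackingStream(new byte[0]);
    try {
      HeaderReader.open(zeroLength);
      return false; // not expected: the input is empty
    } catch (IOException expected) {
      return zeroLength.closed;
    }
  }

  public static void main(String[] args) {
    System.out.println("closed after failed open: " + closesOnTruncatedInput());
  }
}

/** Hypothetical stand-in for a log reader that parses a header on open. */
class HeaderReader implements AutoCloseable {
  private final DataInputStream in;

  private HeaderReader(DataInputStream in) {
    this.in = in;
  }

  /**
   * Opens a reader over the given stream. If the header cannot be read
   * (for example because the file is truncated or empty), the stream is
   * closed before the exception propagates, so no handle is leaked.
   */
  static HeaderReader open(InputStream raw) throws IOException {
    DataInputStream in = new DataInputStream(raw);
    try {
      in.readInt(); // header read; throws EOFException on truncated input
      return new HeaderReader(in);
    } catch (IOException e) {
      try {
        in.close();
      } catch (IOException onClose) {
        e.addSuppressed(onClose);
      }
      throw e;
    }
  }

  @Override
  public void close() throws IOException {
    in.close();
  }
}

/** Stream that records whether close() was called, for demonstration only. */
class TrackingStream extends ByteArrayInputStream {
  boolean closed = false;

  TrackingStream(byte[] data) {
    super(data);
  }

  @Override
  public void close() throws IOException {
    closed = true;
    super.close();
  }
}
```

Without the catch block in open(), the caller never receives a reader object to close, which is why a failure during construction is an easy place to leak a handle.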






[jira] [Commented] (HBASE-5276) PerformanceEvaluation does not set the correct classpath for MR because it lives in the test jar

2012-01-25 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13193536#comment-13193536
 ] 

Jonathan Hsieh commented on HBASE-5276:
---

Hi Tim, 

From the HBASE-4688 issue, it looks like this isn't in Apache HBase until 0.92.0.  
If you would like this in a future CDH3 release, please file an issue here:

https://issues.cloudera.org/browse/DISTRO

Since CDH4 is based on Apache HBase 0.92, it will be in the CDH4 HBase.  

Thanks,
Jon.

> PerformanceEvaluation does not set the correct classpath for MR because it 
> lives in the test jar
> 
>
> Key: HBASE-5276
> URL: https://issues.apache.org/jira/browse/HBASE-5276
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.90.4
>Reporter: Tim Robertson
>Priority: Minor
>
> Note: This was discovered running the CDH version hbase-0.90.4-cdh3u2
> Running the PerformanceEvaluation as follows:
>   $HADOOP_HOME/bin/hadoop org.apache.hadoop.hbase.PerformanceEvaluation scan 5
> fails because the MR tasks do not get the HBase jar on the CP, and thus hit 
> ClassNotFoundExceptions.
> The job gets the following only:
>   file:/Users/tim/dev/hadoop/hbase-0.90.4-cdh3u2/hbase-0.90.4-cdh3u2-tests.jar
>   file:/Users/tim/dev/hadoop/hadoop-0.20.2-cdh3u2/hadoop-core-0.20.2-cdh3u2.jar
>   file:/Users/tim/dev/hadoop/hbase-0.90.4-cdh3u2/lib/zookeeper-3.3.3-cdh3u2.jar
> The RowCounter etc. all work because they live in the HBase jar, not the test 
> jar, and they get the following:
>   file:/Users/tim/dev/hadoop/hbase-0.90.4-cdh3u2/lib/guava-r06.jar
>   file:/Users/tim/dev/hadoop/hadoop-0.20.2-cdh3u2/hadoop-core-0.20.2-cdh3u2.jar
>   file:/Users/tim/dev/hadoop/hbase-0.90.4-cdh3u2/hbase-0.90.4-cdh3u2.jar
>   file:/Users/tim/dev/hadoop/hbase-0.90.4-cdh3u2/lib/zookeeper-3.3.3-cdh3u2.jar
> Presumably this relates to 
>   job.setJarByClass(PerformanceEvaluation.class);
>   ...
>   TableMapReduceUtil.addDependencyJars(job);
> A (cowboy) workaround to run PE is to unpack the jars, and copy the 
> PerformanceEvaluation* classes building a patched jar.





[jira] [Commented] (HBASE-5128) [uber hbck] Enable hbck to automatically repair table integrity problems as well as region consistency problems while online.

2012-01-25 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13193499#comment-13193499
 ] 

Jonathan Hsieh commented on HBASE-5128:
---

It was also suggested that I need to worry about compactions triggered by an 
HRegion flush when I close regions during overlap merging.  At least in 0.90, 
this is not actually necessary -- the closeRegion path invoked from the HMaster 
side does flush, but it ignores the flag returned by internalFlushcache that 
specifies whether a region needs to be compacted.


> [uber hbck] Enable hbck to automatically repair table integrity problems as 
> well as region consistency problems while online.
> -
>
> Key: HBASE-5128
> URL: https://issues.apache.org/jira/browse/HBASE-5128
> Project: HBase
>  Issue Type: New Feature
>  Components: hbck
>Affects Versions: 0.90.5, 0.92.0
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
>
> The current (0.90.5, 0.92.0rc2) versions of hbck detect most region 
> consistency and table integrity invariant violations.  However, with '-fix' it 
> can only automatically repair region consistency cases having to do with 
> deployment problems.  This updated version should be able to handle all cases 
> (including a new orphan regiondir case).  When complete, it will likely 
> deprecate the OfflineMetaRepair tool and subsume several open META-hole 
> related issues.
> Here's the approach (from the comment at the top of the new version of the 
> file).
> {code}
> /**
>  * HBaseFsck (hbck) is a tool for checking and repairing region consistency
>  * and table integrity.
>  *
>  * Region consistency checks verify that META, region deployment on
>  * region servers and the state of data in HDFS (.regioninfo files) all are in
>  * accordance.
>  *
>  * Table integrity checks verify that all possible row keys can resolve to
>  * exactly one region of a table.  This means there are no individual
>  * degenerate or backwards regions; no holes between regions; and that there
>  * are no overlapping regions.
>  *
>  * The general repair strategy works in these steps.
>  * 1) Repair Table Integrity on HDFS. (merge or fabricate regions)
>  * 2) Repair Region Consistency with META and assignments
>  *
>  * For table integrity repairs, the tables' region directories are scanned
>  * for .regioninfo files.  Each table's integrity is then verified.  If there
>  * are any orphan regions (regions with no .regioninfo files), or holes, new
>  * regions are fabricated.  Backwards regions are sidelined as well as empty
>  * degenerate (endkey==startkey) regions.  If there are any overlapping
>  * regions, a new region is created and all data is merged into the new region.
>  *
>  * Table integrity repairs deal solely with HDFS and can be done offline --
>  * the hbase region servers or master do not need to be running.  This phase
>  * can be used to completely reconstruct the META table in an offline fashion.
>  *
>  * Region consistency requires three conditions -- 1) valid .regioninfo file
>  * present in an hdfs region dir,  2) valid row with .regioninfo data in META,
>  * and 3) a region is deployed only at the regionserver that it was assigned
>  * to.
>  *
>  * Region consistency requires hbck to contact the HBase master and region
>  * servers, so the connect() must first be called successfully.  Much of the
>  * region consistency information is transient and less risky to repair.
>  */
> {code}
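
The table integrity invariants in the comment above (no degenerate or backwards regions, no holes, no overlaps) can be sketched as a single pass over regions sorted by start key. This is only an illustration under simplifying assumptions, not the hbck implementation: keys are Strings rather than byte arrays, an empty key means "unbounded", and IntegrityDemo, Region and check are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class IntegrityDemo {
  /** Region as a half-open key range [start, end); "" means unbounded. */
  static final class Region {
    final String start;
    final String end;

    Region(String start, String end) {
      this.start = start;
      this.end = end;
    }
  }

  /** Returns a description of every invariant violation found. */
  static List<String> check(List<Region> regions) {
    List<String> problems = new ArrayList<>();
    List<Region> valid = new ArrayList<>();
    for (Region r : regions) {
      if (!r.end.isEmpty() && r.start.equals(r.end)) {
        problems.add("degenerate [" + r.start + "," + r.end + ")");
      } else if (!r.end.isEmpty() && r.start.compareTo(r.end) > 0) {
        problems.add("backwards [" + r.start + "," + r.end + ")");
      } else {
        valid.add(r);
      }
    }
    valid.sort(Comparator.comparing((Region r) -> r.start));

    String cursor = "";         // every key below cursor is covered so far
    boolean reachedEnd = false; // true once a region with an unbounded end is seen
    for (Region r : valid) {
      if (reachedEnd || r.start.compareTo(cursor) < 0) {
        problems.add("overlap at " + r.start);
      } else if (r.start.compareTo(cursor) > 0) {
        problems.add("hole [" + cursor + "," + r.start + ")");
      }
      if (r.end.isEmpty()) {
        reachedEnd = true;
      } else if (r.end.compareTo(cursor) > 0) {
        cursor = r.end;
      }
    }
    if (!reachedEnd) {
      problems.add("hole [" + cursor + ",)");
    }
    return problems;
  }

  /** Sample table with one violation of each kind. */
  static List<Region> sample() {
    return Arrays.asList(
        new Region("", "b"), new Region("b", "d"), new Region("c", "e"),
        new Region("f", ""), new Region("x", "x"), new Region("z", "y"));
  }

  public static void main(String[] args) {
    for (String p : check(sample())) {
      System.out.println(p);
    }
  }
}
```

The sketch only detects problems. The repair side described in the comment (fabricating regions for holes, merging overlaps, sidelining bad regions) is a separate step.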





[jira] [Commented] (HBASE-5278) HBase shell script refers to removed "migrate" functionality

2012-01-25 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13193260#comment-13193260
 ] 

Jonathan Hsieh commented on HBASE-5278:
---

Wow, you are fast Stack.  I was trying to commit. :)

> HBase shell script refers to removed "migrate" functionality
> 
>
> Key: HBASE-5278
> URL: https://issues.apache.org/jira/browse/HBASE-5278
> Project: HBase
>  Issue Type: Bug
>  Components: scripts
>Affects Versions: 0.94.0, 0.90.5, 0.92.0
>Reporter: Shaneal Manek
>Assignee: Shaneal Manek
>Priority: Trivial
> Fix For: 0.94.0, 0.92.1
>
> Attachments: hbase-5278.patch
>
>
> $ hbase migrate
> Exception in thread "main" java.lang.NoClassDefFoundError: 
> org/apache/hadoop/hbase/util/Migrate
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.hadoop.hbase.util.Migrate
> at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
> at java.security.AccessController.doPrivileged(Native Method)
> at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
> Could not find the main class: org.apache.hadoop.hbase.util.Migrate. Program 
> will exit.
> The 'hbase' shell script has docs referring to a 'migrate' command which no 
> longer exists.





[jira] [Commented] (HBASE-5278) HBase shell script refers to removed "migrate" functionality

2012-01-25 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13193248#comment-13193248
 ] 

Jonathan Hsieh commented on HBASE-5278:
---

+1. lgtm. 






[jira] [Commented] (HBASE-5209) HConnection/HMasterInterface should allow for way to get hostname of currently active master in multi-master HBase setup

2012-01-24 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192804#comment-13192804
 ] 

Jonathan Hsieh commented on HBASE-5209:
---

As a suggestion, instead of adding new methods to the HMasterInterface or 
HConnection, how about adding and serializing data into the 
o.a.h.hbase.ClusterStatus object? 

If we want to get lists of the standby masters, we'd probably want to add some 
info into ZK in a znode such as {{/hbase/backup-masters/}}.
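
As a rough sketch of the first suggestion, a status object could carry the active master and the backup masters and serialize them with a write/readFields pair in the style of Hadoop's Writable interface. ClusterStatusSketch, its field names and its wire layout are all hypothetical; the real o.a.h.hbase.ClusterStatus fields and format are not shown here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class ClusterStatusSketch {
  String activeMaster = "";                        // hypothetical field
  List<String> backupMasters = new ArrayList<>();  // hypothetical field

  /** Serializes the master information (assumed layout: UTF, count, UTF...). */
  void write(DataOutput out) throws IOException {
    out.writeUTF(activeMaster);
    out.writeInt(backupMasters.size());
    for (String m : backupMasters) {
      out.writeUTF(m);
    }
  }

  /** Reads back exactly what write() produced. */
  void readFields(DataInput in) throws IOException {
    activeMaster = in.readUTF();
    int n = in.readInt();
    backupMasters = new ArrayList<>(n);
    for (int i = 0; i < n; i++) {
      backupMasters.add(in.readUTF());
    }
  }

  /** Serializes and deserializes in memory, as an RPC layer would over the wire. */
  static ClusterStatusSketch roundTrip(ClusterStatusSketch s) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      s.write(new DataOutputStream(bytes));
      ClusterStatusSketch copy = new ClusterStatusSketch();
      copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
      return copy;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    ClusterStatusSketch s = new ClusterStatusSketch();
    s.activeMaster = "master1.example.com";        // made-up host names
    s.backupMasters.add("master2.example.com");
    ClusterStatusSketch copy = roundTrip(s);
    System.out.println(copy.activeMaster + " " + copy.backupMasters);
  }
}
```

Adding fields to an existing serialized object changes its wire format, so a real change would also need to consider compatibility between clients and masters of different versions.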

> HConnection/HMasterInterface should allow for way to get hostname of 
> currently active master in multi-master HBase setup
> 
>
> Key: HBASE-5209
> URL: https://issues.apache.org/jira/browse/HBASE-5209
> Project: HBase
>  Issue Type: Improvement
>  Components: master
>Affects Versions: 0.94.0, 0.90.5, 0.92.0
>Reporter: Aditya Acharya
>Assignee: David S. Wang
>
> I have a multi-master HBase set up, and I'm trying to programmatically 
> determine which of the masters is currently active. But the API does not 
> allow me to do this. There is a getMaster() method in the HConnection class, 
> but it returns an HMasterInterface, whose methods do not allow me to find out 
> which master won the last race. The API should have a 
> getActiveMasterHostname() or something to that effect.





[jira] [Commented] (HBASE-4920) We need a mascot, a totem

2012-01-23 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191339#comment-13191339
 ] 

Jonathan Hsieh commented on HBASE-4920:
---

I feel the "cyber" look and the hard edges of the wordmark don't quite fit 
with the roundness of the image, but I like the general idea (maybe a "shaper" 
style for the same idea).

@Stack http://en.wikipedia.org/wiki/Vancouver_Canucks

@Lars Actual Pacific Northwest Native American totem poles with octopus/squid:
http://users.imag.net/~sry.jkramer/nativetotems/common.htm
http://www.flickr.com/photos/lostviking/3419653151/

Giant squids are pretty close to the International Orange (Engineering) color.
http://blogs.sfweekly.com/thesnitch/2008/01/breaking_giant_cartoon_squid_a.php

> We need a mascot, a totem
> -
>
> Key: HBASE-4920
> URL: https://issues.apache.org/jira/browse/HBASE-4920
> Project: HBase
>  Issue Type: Task
>Reporter: stack
> Attachments: HBase Orca Logo.jpg, Orca_479990801.jpg, Screen shot 
> 2011-11-30 at 4.06.17 PM.png, apache hbase orca logo_Proof 3.pdf, photo 
> (2).JPG
>
>
> We need a totem for our t-shirt that is yet to be printed.  O'Reilly owns the 
> Clydesdale.  We need something else.
> We could have a fluffy little duck that quacks 'hbase!' when you squeeze it 
> and we could order boxes of them from some off-shore sweatshop that 
> subcontracts to a contractor who employs child labor only.
> Or we could have an Orca (Big!, Fast!, Killer!, and in a poem that Marcy from 
> Salesforce showed me, that was a bit too spiritual for me to be seen quoting 
> here, it had the Orca as the 'Guardian of the Cosmic Memory': i.e. in 
> translation, bigdata).





[jira] [Commented] (HBASE-5218) [book] book.xml - added Arch/Hfile, added link to HFile v2 info in appendix

2012-01-17 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13187970#comment-13187970
 ] 

Jonathan Hsieh commented on HBASE-5218:
---

@Doug

I updated review board -- sorry for my minor dyslexia.


> [book] book.xml - added Arch/Hfile, added link to HFile v2 info in appendix
> ---
>
> Key: HBASE-5218
> URL: https://issues.apache.org/jira/browse/HBASE-5218
> Project: HBase
>  Issue Type: Improvement
>Reporter: Doug Meil
>Assignee: Doug Meil
>Priority: Trivial
> Attachments: book_hbase_5218.xml.patch
>
>
> Stack asked me to do this in December:  added link in Arch/HFile to the HFile 
> v2 information in the appendix.





[jira] [Commented] (HBASE-5128) [uber hbck] Enable hbck to automatically repair table integrity problems as well as region consistency problems while online.

2012-01-16 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13187134#comment-13187134
 ] 

Jonathan Hsieh commented on HBASE-5128:
---

I've been testing using failed splits generated by cycling the hbase master 
while doing a heavy write load with a high split frequency, prior to the 
HBASE-5196 patch.  A subset of the problems has been fixed automatically, but 
there seems to be a class of problems with splitting regions that isn't being 
handled properly.  This is probably the case we are most likely to encounter.






[jira] [Commented] (HBASE-5128) [uber hbck] Enable hbck to automatically repair table integrity problems as well as region consistency problems while online.

2012-01-16 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13187127#comment-13187127
 ] 

Jonathan Hsieh commented on HBASE-5128:
---

@Ted sounds good.

> [uber hbck] Enable hbck to automatically repair table integrity problems as 
> well as region consistency problems while online.
> -
>
> Key: HBASE-5128
> URL: https://issues.apache.org/jira/browse/HBASE-5128
> Project: HBase
>  Issue Type: New Feature
>  Components: hbck
>Affects Versions: 0.92.0, 0.90.5
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
>
> The current (0.90.5, 0.92.0rc2) versions of hbck detects most of region 
> consistency and table integrity invariant violations.  However with '-fix' it 
> can only automatically repair region consistency cases having to do with 
> deployment problems.  This updated version should be able to handle all cases 
> (including a new orphan regiondir case).  When complete will likely deprecate 
> the OfflineMetaRepair tool and subsume several open META-hole related issue.
> Here's the approach (from the comment of at the top of the new version of the 
> file).
> {code}
> /**
>  * HBaseFsck (hbck) is a tool for checking and repairing region consistency
>  * and table integrity.
>  * 
>  * Region consistency checks verify that META, region deployment on
>  * region servers and the state of data in HDFS (.regioninfo files) all are in
>  * accordance. 
>  * 
>  * Table integrity checks verify that all possible row keys can resolve to
>  * exactly one region of a table.  This means there are no individual
>  * degenerate or backwards regions; no holes between regions; and that there
>  * are no overlapping regions.
>  * 
>  * The general repair strategy works in these steps.
>  * 1) Repair Table Integrity on HDFS. (merge or fabricate regions)
>  * 2) Repair Region Consistency with META and assignments
>  * 
>  * For table integrity repairs, the tables' region directories are scanned
>  * for .regioninfo files.  Each table's integrity is then verified.  If there 
>  * are any orphan regions (regions with no .regioninfo files), or holes, new 
>  * regions are fabricated.  Backwards regions are sidelined as well as empty
>  * degenerate (endkey==startkey) regions.  If there are any overlapping
>  * regions, a new region is created and all data is merged into the new region.
>  * 
>  * Table integrity repairs deal solely with HDFS and can be done offline --
>  * the hbase region servers or master do not need to be running.  This phase
>  * can be used to completely reconstruct the META table in an offline fashion.
>  * 
>  * Region consistency requires three conditions -- 1) valid .regioninfo file 
>  * present in an hdfs region dir,  2) valid row with .regioninfo data in META,
>  * and 3) a region is deployed only at the regionserver that it was assigned to.
>  * 
>  * Region consistency requires hbck to contact the HBase master and region
>  * servers, so the connect() must first be called successfully.  Much of the
>  * region consistency information is transient and less risky to repair.
>  */
> {code}
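
The table integrity invariants described in the comment above (no degenerate or backwards regions, no holes, no overlaps) can be illustrated with a small, self-contained sketch. This is not the HBaseFsck code: the class name, the `Region` type, and the problem labels are hypothetical, and row keys are modeled as Java strings with the empty string standing for an unbounded table start or end, whereas HBase itself compares byte arrays.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Hypothetical, simplified model of the table integrity checks; not the real hbck. */
final class TableIntegritySketch {

    /** A region covering [startKey, endKey). An empty key means "unbounded" (assumption). */
    static final class Region {
        final String name;
        final String startKey;
        final String endKey;

        Region(String name, String startKey, String endKey) {
            this.name = name;
            this.startKey = startKey;
            this.endKey = endKey;
        }
    }

    /** Returns one human-readable entry per detected integrity problem. */
    static List<String> check(List<Region> regions) {
        List<String> problems = new ArrayList<>();
        List<Region> valid = new ArrayList<>();

        // Pass 1: single-region problems (degenerate and backwards regions).
        for (Region r : regions) {
            boolean unboundedEnd = r.endKey.isEmpty();
            if (!unboundedEnd && r.startKey.equals(r.endKey)) {
                problems.add("DEGENERATE " + r.name);
            } else if (!unboundedEnd && r.startKey.compareTo(r.endKey) > 0) {
                problems.add("BACKWARDS " + r.name);
            } else {
                valid.add(r);
            }
        }

        // Pass 2: walk the remaining regions in start-key order and track
        // how much of the key space has been covered so far.
        valid.sort(Comparator.comparing((Region r) -> r.startKey));
        String coveredUpTo = "";
        boolean reachedEnd = false;
        for (Region r : valid) {
            if (reachedEnd) {
                problems.add("OVERLAP " + r.name);
                continue;
            }
            int cmp = r.startKey.compareTo(coveredUpTo);
            if (cmp > 0) {
                problems.add("HOLE [" + coveredUpTo + ", " + r.startKey + ")");
            } else if (cmp < 0) {
                problems.add("OVERLAP " + r.name);
            }
            if (r.endKey.isEmpty()) {
                reachedEnd = true;
            } else if (r.endKey.compareTo(coveredUpTo) > 0) {
                coveredUpTo = r.endKey;
            }
        }
        if (!reachedEnd) {
            problems.add("HOLE [" + coveredUpTo + ", <end>)");
        }
        return problems;
    }

    public static void main(String[] args) {
        List<Region> regions = Arrays.asList(
                new Region("r1", "", "b"),
                new Region("r2", "b", "d"),
                new Region("r3", "c", "e"),   // starts inside r2
                new Region("r4", "g", ""),    // leaves a gap after r3
                new Region("r5", "x", "x"),   // startkey == endkey
                new Region("r6", "z", "m"));  // startkey > endkey
        check(regions).forEach(System.out::println);
    }
}
```

The sketch only detects problems; the repair steps from the comment (fabricating regions for holes, sidelining, merging overlaps) are deliberately left out.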

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5128) [uber hbck] Enable hbck to automatically repair table integrity problems as well as region consistency problems while online.

2012-01-16 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13187128#comment-13187128
 ] 

Jonathan Hsieh commented on HBASE-5128:
---

@Ted sounds good.





[jira] [Commented] (HBASE-5128) [uber hbck] Enable hbck to automatically repair table integrity problems as well as region consistency problems while online.

2012-01-14 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13186251#comment-13186251
 ] 

Jonathan Hsieh commented on HBASE-5128:
---

@Ram

I think I need a few days (test/polish) to get this completely ready -- if you 
are willing to wait/review to get this through, I'm willing to hack on it 
today/tomorrow to get it through.





[jira] [Commented] (HBASE-5128) [uber hbck] Enable hbck to automatically repair table integrity problems as well as region consistency problems while online.

2012-01-13 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13185891#comment-13185891
 ] 

Jonathan Hsieh commented on HBASE-5128:
---

I'm working on it.  I was working on some of the TODOs and got caught on 
another snag.  It will come soon.





[jira] [Commented] (HBASE-5128) [uber hbck] Enable hbck to automatically repair table integrity problems as well as region consistency problems while online.

2012-01-12 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13185373#comment-13185373
 ] 

Jonathan Hsieh commented on HBASE-5128:
---

I still need to do some more testing on actual clusters, but I'm going to post 
another version that solves problems #1 and #2 later tonight.

#1 -- added offline(byte[] regionname) method to master ipc interface.  
#2 -- added code to wait for region to exit RIT (region-in-transition) status 
before moving on.  The tests don't seem flaky anymore (all these tests now pass 
about 25 times in a row).

I really would like to have this in the 0.90.6 release if possible -- any 
complaints if I add some compatibility checks to see if the new API is present 
and blare some mean-sounding warnings if you attempt to use the overlap fixing 
feature against a version that does not support it? (It will mostly work but 
will likely require an hmaster restart to be "clean" again.)
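
One way such a compatibility check could look, as a rough sketch: probe an interface for the new method by reflection and warn when it is missing. The two interfaces below are hypothetical stand-ins, not the real HBase master interface, and only the `offline(byte[])` name comes from the comment above. Note also that this inspects classes on the local classpath; how hbck would learn what a remote master supports is not stated in the thread.

```java
import java.lang.reflect.Method;

/** Hypothetical sketch of an "is the new API present?" probe using reflection. */
final class ApiCompatSketch {

    /** Stand-in for an older master interface (assumption, not the HBase type). */
    interface OldMaster {
        void unassign(byte[] regionName, boolean force);
    }

    /** Stand-in for a newer master interface that adds offline(byte[]). */
    interface NewMaster extends OldMaster {
        void offline(byte[] regionName);
    }

    /** Returns true if the given type exposes a public method with this name and parameters. */
    static boolean supports(Class<?> type, String methodName, Class<?>... paramTypes) {
        try {
            Method m = type.getMethod(methodName, paramTypes);
            return m != null;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    /** Prints a loud warning when the overlap-fixing prerequisite is missing. */
    static void warnIfUnsupported(Class<?> masterType) {
        if (!supports(masterType, "offline", byte[].class)) {
            System.err.println("WARNING: " + masterType.getSimpleName()
                    + " has no offline(byte[]); overlap repair may leave the master"
                    + " needing a restart.");
        }
    }

    public static void main(String[] args) {
        warnIfUnsupported(OldMaster.class);  // warns
        warnIfUnsupported(NewMaster.class);  // silent
    }
}
```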







[jira] [Commented] (HBASE-4934) Display Master server and Regionserver start time on respective info servers.

2012-01-12 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13185130#comment-13185130
 ] 

Jonathan Hsieh commented on HBASE-4934:
---

@Ming

I haven't looked into this but makes sense.  Since this is committed already, 
maybe file another issue to fix this?

> Display Master server and Regionserver start time on respective info servers.
> -
>
> Key: HBASE-4934
> URL: https://issues.apache.org/jira/browse/HBASE-4934
> Project: HBase
> Issue Type: Improvement
> Affects Versions: 0.92.0
> Reporter: Jonathan Hsieh
> Assignee: Jonathan Hsieh
> Priority: Minor
> Fix For: 0.92.0
>
> Attachments: hbase-4934.patch, hmaster.png, hregion.png
>
>
> With operations like rolling restart or master failovers, it is difficult to 
> tell if a server is the "old" instance or the "new" restarted instance.  
> Adding a start date stamp on the info web pages would be helpful for 
> determining this.





[jira] [Commented] (HBASE-5128) [uber hbck] Enable hbck to automatically repair table integrity problems as well as region consistency problems while online.

2012-01-09 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13182948#comment-13182948
 ] 

Jonathan Hsieh commented on HBASE-5128:
---

The code in HBASE-1621 does something similar to my problem cases, so it 
might be the solution as well -- apparently meta regioninfos have an offline 
flag (not sure if this is just trunk though). 





[jira] [Commented] (HBASE-5128) [uber hbck] Enable hbck to automatically repair table integrity problems as well as region consistency problems while online.

2012-01-09 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13182898#comment-13182898
 ] 

Jonathan Hsieh commented on HBASE-5128:
---

@Ted,

For #1, I'd ideally like the tool to be backwards compatible with existing 
0.90's.  I think this version will work for older versions in cases where the 
problem is table region holes.  The problem only arises when attempting to 
repair overlapping regions.   If I need to modify servers to update the 
unassign/close api, I'll probably put warnings in the code so that the user is 
aware of potential issues if using hbck to fix older versions (or possibly ask 
the user to failover to another master). 

For #2, makes sense -- I'll spend more time digging into what is "in-motion" 
causing the flaky tests.







[jira] [Commented] (HBASE-5128) [uber hbck] Enable hbck to automatically repair table integrity problems as well as region consistency problems while online.

2012-01-09 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13182417#comment-13182417
 ] 

Jonathan Hsieh commented on HBASE-5128:
---

I'm posting a preliminary version that I'm currently testing on real clusters.  
The tests are flakey on the 0.90 branch (so there is something async that I 
didn't synchronize properly), and there are a few more TODO's I want to knock 
out before this is ready for full review to be considered for committing.  It's 
got some problems I need some advice figuring out.  

Problem 1:

In the unit tests, I have a few cases where I fabricate new regions and try to 
force the overlapping regions to be closed. For some of these, I cannot delete 
a table after it is repaired without causing subsequent tests to fail.  I think 
this is due to a few things:

1) The disable table handler uses in-memory assignment manager state while 
delete uses the assignment information in META.
2) Currently I'm using the sneaky closeRegion that purposely doesn't go through 
the master and in turn doesn't modify in-memory state -- disable uses out of 
date in-memory region assignments.  If I use the unassign method, it sends RIT 
transitions to the master, which then ends up attempting to assign the region 
again, causing timing/transient states.  

What is a good way to clear the HMaster's assignment manager's assignment data 
for particular regions or to force it to re-read from META (without modifying 
the 0.90 HBases it is meant to repair)?  

Problem 2:

Sometimes tests fail reporting HOLE_IN_REGION_CHAIN and 
SERVER_DOES_NOT_MATCH_META.  This means the old and new regions are confused 
with each other and basically something is still happening asynchronously.  I 
think this is because the new region is being assigned and is still 
transitioning.  Sound about right?  To make the unit test deterministic, should 
hbck wait for these to settle or should just the unit test wait? 


[jira] [Commented] (HBASE-5128) [uber hbck] Enable hbck to automatically repair table integrity problems as well as region consistency problems while online.

2012-01-04 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13180138#comment-13180138
 ] 

Jonathan Hsieh commented on HBASE-5128:
---

I've been working on a new version of hbck that solves a whole bunch of 
potential problems in HBase tables.   Currently it is implemented with a variant 
of 0.90 in mind -- there will likely be some minor work to port it to stock 
0.90.5, and significant work required to port it to trunk / 0.92.


> [uber hbck] Enable hbck to automatically repair table integrity problems as 
> well as region consistency problems while online.
> -
>
> Key: HBASE-5128
> URL: https://issues.apache.org/jira/browse/HBASE-5128
> Project: HBase
>  Issue Type: New Feature
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
>
> The current (0.90.5, 0.92.0rc2) versions of hbck detect most of the invariant 
> violations (orphans is new).  However, with '-fix' it can only automatically 
> handle deployment problems in region consistency cases.  This updated 
> version should be able to handle all cases.  When complete, it will likely 
> deprecate the OfflineMetaRepair tool and subsume several META hole related 
> problems.
> {code}
> /**
>  * HBaseFsck (hbck) is a tool for checking and repairing region consistency and
>  * table integrity.
>  *
>  * Region consistency checks verify that META, region deployment on
>  * region servers and the state of data in HDFS (.regioninfo files) are all in
>  * accordance.
>  *
>  * Table integrity checks verify that all possible row keys resolve to
>  * exactly one region of a table.  This means there are no individual degenerate
>  * or backwards regions, no holes between regions, and no overlapping
>  * regions.
>  *
>  * The general repair strategy works in these steps:
>  * 1) Repair Table Integrity on HDFS. (merge or fabricate regions)
>  * 2) Repair Region Consistency with META and assignments
>  *
>  * For table integrity repairs, the tables' region directories are scanned
>  * for .regioninfo files.  Each table's integrity is then verified.  If there
>  * are any orphan regions (regions with no .regioninfo files), or holes, new
>  * regions are fabricated.  Backwards regions are sidelined, as are empty
>  * degenerate (endkey==startkey) regions.  If there are any overlapping regions,
>  * a new region is created and all data is merged into the new region.
>  *
>  * Table integrity repairs deal solely with HDFS and can be done offline -- the
>  * hbase region servers or master do not need to be running.  This phase can be
>  * used to completely reconstruct the META table in an offline fashion.
>  *
>  * Region consistency requires three conditions -- 1) valid .regioninfo file
>  * present in an hdfs region dir,  2) valid row with .regioninfo data in META,
>  * and 3) a region is deployed only at the regionserver that it was assigned to.
>  *
>  * Region consistency requires hbck to contact the HBase master and region
>  * servers, so the connect() must first be called successfully.  Much of the
>  * region consistency information is transient and less risky to repair.
>  */
> {code}
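
As a rough illustration of the table integrity invariants described in the comment above (no degenerate or backwards regions, no holes, no overlaps), here is a minimal sketch. It is not hbck's code: the class and method names are hypothetical, row keys are modelled as plain Java strings compared with compareTo rather than as byte arrays, and an empty string is assumed to mean an open start or end key.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch, not hbck's real classes. Row keys are plain strings;
// "" means "open": the start of the first region or the end of the last one.
final class TableIntegritySketch {
  static final class Region {
    final String start;
    final String end;
    Region(String start, String end) { this.start = start; this.end = end; }
    boolean openEnded() { return end.isEmpty(); }
  }

  /** Returns one message per violation; an empty list means the table is intact. */
  static List<String> check(List<Region> regions) {
    List<String> problems = new ArrayList<>();
    List<Region> usable = new ArrayList<>();
    for (Region r : regions) {
      if (!r.openEnded() && r.start.equals(r.end)) {
        problems.add("degenerate region at " + r.start);
      } else if (!r.openEnded() && r.start.compareTo(r.end) > 0) {
        problems.add("backwards region " + r.start + " > " + r.end);
      } else {
        usable.add(r);
      }
    }
    usable.sort(Comparator.comparing((Region r) -> r.start));

    String covered = "";          // every key below this one is covered so far
    boolean coveredToEnd = false; // true once an open-ended region has been seen
    for (Region r : usable) {
      if (coveredToEnd || r.start.compareTo(covered) < 0) {
        problems.add("overlap at " + r.start);
      } else if (r.start.compareTo(covered) > 0) {
        problems.add("hole between " + covered + " and " + r.start);
      }
      if (r.openEnded()) {
        coveredToEnd = true;
      } else if (r.end.compareTo(covered) > 0) {
        covered = r.end;
      }
    }
    if (!coveredToEnd) {
      problems.add("hole after " + covered);
    }
    return problems;
  }
}
```

The sketch only detects violations; the repair steps (fabricating, sidelining or merging regions) are a separate matter and are not modelled here.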

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5103) Fix improper master znode deserialization

2011-12-29 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13177396#comment-13177396
 ] 

Jonathan Hsieh commented on HBASE-5103:
---

should this get marked as 0.92.1 instead of 0.92.0?


> Fix improper master znode deserialization
> -
>
> Key: HBASE-5103
> URL: https://issues.apache.org/jira/browse/HBASE-5103
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
>Priority: Minor
> Fix For: 0.92.0, 0.94.0
>
> Attachments: hbase-5103.patch
>
>
> In ActiveMasterManager#blockUntilBecomingActiveMaster the master znode is 
> created as a versioned serialized version of ServerName
> {code}
>  if (ZKUtil.createEphemeralNodeAndWatch(this.watcher,
>   this.watcher.masterAddressZNode, sn.getVersionedBytes())) {
> {code}
> There are a few user visible places where it is used but not deserialized 
> properly.
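
To make the failure mode concrete, here is a minimal sketch of what goes wrong when a version-prefixed payload is decoded as if it were the bare value. This is not HBase's ServerName code: the 2-byte prefix, the class name and the helper names are assumptions made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for a version-prefixed znode payload. The 2-byte
// prefix and the helper names are assumptions, not HBase's actual layout.
final class VersionedBytesSketch {
  static final short VERSION = 0;

  /** Serializes the name with a leading version marker. */
  static byte[] toVersionedBytes(String serverName) {
    byte[] payload = serverName.getBytes(StandardCharsets.UTF_8);
    return ByteBuffer.allocate(2 + payload.length)
        .putShort(VERSION)
        .put(payload)
        .array();
  }

  /** Correct reader: checks and strips the version marker before decoding. */
  static String parseVersionedBytes(byte[] data) {
    ByteBuffer buf = ByteBuffer.wrap(data);
    short version = buf.getShort();
    if (version != VERSION) {
      throw new IllegalArgumentException("unexpected version " + version);
    }
    byte[] payload = new byte[buf.remaining()];
    buf.get(payload);
    return new String(payload, StandardCharsets.UTF_8);
  }

  /** The bug pattern: decodes the raw bytes, so the prefix leaks into the name. */
  static String naiveParse(byte[] data) {
    return new String(data, StandardCharsets.UTF_8);
  }
}
```

Any reader that takes the naiveParse route ends up with a name carrying stray leading bytes, which is the kind of improper deserialization this issue is about.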





[jira] [Commented] (HBASE-5101) Add a max number of regions per regionserver limit

2011-12-29 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13177379#comment-13177379
 ] 

Jonathan Hsieh commented on HBASE-5101:
---

@Todd

Looks like it, closing as a dupe.

> Add a max number of regions per regionserver limit
> --
>
> Key: HBASE-5101
> URL: https://issues.apache.org/jira/browse/HBASE-5101
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Affects Versions: 0.92.0
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
>
> In a testing environment, a cluster got to a state with more than 1500 
> regions per region server, and essentially became stuck and unavailable.  We 
> could add a limit to the number of regions that a region server can serve to 
> prevent this from happening.  This looks like it could be implemented in the 
> core or as a coprocessor.





[jira] [Commented] (HBASE-5101) Add a max number of regions per regionserver limit

2011-12-29 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13177206#comment-13177206
 ] 

Jonathan Hsieh commented on HBASE-5101:
---

@Ted
First cut -- simply some configured number of max regions.  Maybe it would stop 
splitting at a point where it can still cope with some number of RSs going down 
and their regions being reassigned.

There could possibly be a limit on tables as well -- and we could put a cap on 
the # of region servers/table and # of regions/region server.  
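
One way to read the "stop splitting while we can still absorb failures" idea is as a simple headroom calculation. The sketch below is only an illustration: the class, method and parameter names are hypothetical, and it assumes regions are spread evenly across servers both before and after a failure.

```java
// Hypothetical helper; none of these names exist in HBase.
final class RegionHeadroomSketch {
  /**
   * Largest per-server region count at which `tolerated` of `servers` region
   * servers can fail and every survivor still stays at or below `hardLimit`
   * after the failed servers' regions are reassigned (assumes even balance).
   */
  static int splitCeiling(int hardLimit, int servers, int tolerated) {
    if (servers <= 0 || tolerated < 0 || tolerated >= servers) {
      throw new IllegalArgumentException("need 0 <= tolerated < servers");
    }
    // c * servers regions in total must fit on (servers - tolerated) survivors:
    // c * servers / (servers - tolerated) <= hardLimit.
    return (int) ((long) hardLimit * (servers - tolerated) / servers);
  }
}
```

For example, with a hard limit of 1500 regions per server, 10 servers and 2 tolerated failures, splitting would stop at 1200 regions per server.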

> Add a max number of regions per regionserver limit
> --
>
> Key: HBASE-5101
> URL: https://issues.apache.org/jira/browse/HBASE-5101
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Affects Versions: 0.92.0
>Reporter: Jonathan Hsieh
>
> In a testing environment, a cluster got to a state with more than 1500 
> regions per region server, and essentially became stuck and unavailable.  We 
> could add a limit to the number of regions that a region server can serve to 
> prevent this from happening.  This looks like it could be implemented in the 
> core or as a coprocessor.





[jira] [Commented] (HBASE-5101) Add a max number of regions per regionserver limit

2011-12-28 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176959#comment-13176959
 ] 

Jonathan Hsieh commented on HBASE-5101:
---

@Andrew 
I think that would be the idea. The hope is to avoid getting region servers 
into trouble and to give an admin some warning when they are approaching 
trouble (maybe reached some percentage of region limit).
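
A sketch of what such an early warning could look like is below. The names and the warning fraction are made up for illustration; nothing here is an existing HBase setting.

```java
// Hypothetical sketch of a per-region-server limit with an early warning.
final class RegionLimitSketch {
  enum Status { OK, WARN, AT_LIMIT }

  /**
   * @param regionCount  regions currently served by this region server
   * @param maxRegions   configured hard limit (assumed setting)
   * @param warnFraction fraction of the limit at which to start warning
   */
  static Status check(int regionCount, int maxRegions, double warnFraction) {
    if (regionCount >= maxRegions) {
      return Status.AT_LIMIT; // further splits would be refused here
    }
    if (regionCount >= maxRegions * warnFraction) {
      return Status.WARN;     // an admin should be told trouble is approaching
    }
    return Status.OK;
  }
}
```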

@Ted
I've been purposely testing with a stress configuration under a heavy write load 
that forces flushes (4 MB), splits (64 MB) and compactions all the 
time.  Along the way region servers crash (which is fine -- fault injection is 
part of this workload). 

I've encountered some situations where folks don't know the distribution of 
their row keys (or don't have uniform row key distributions).  This could be a 
useful go-between in situations where region pre-splitting with dynamic 
splitting off may not be effective.  

> Add a max number of regions per regionserver limit
> --
>
> Key: HBASE-5101
> URL: https://issues.apache.org/jira/browse/HBASE-5101
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Affects Versions: 0.92.0
>Reporter: Jonathan Hsieh
>
> In a testing environment, a cluster got to a state with more than 1500 
> regions per region server, and essentially became stuck and unavailable.  We 
> could add a limit to the number of regions that a region server can serve to 
> prevent this from happening.  This looks like it could be implemented in the 
> core or as a coprocessor.





[jira] [Commented] (HBASE-5083) Backup HMaster should have http infoport open with link to the active master

2011-12-22 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175061#comment-13175061
 ] 

Jonathan Hsieh commented on HBASE-5083:
---

Agreed and can probably be handled in the same patch.  (updated description to 
add the missing other option).

> Backup HMaster should have http infoport open with link to the active master
> 
>
> Key: HBASE-5083
> URL: https://issues.apache.org/jira/browse/HBASE-5083
> Project: HBase
>  Issue Type: Improvement
>  Components: master
>Affects Versions: 0.92.0
>Reporter: Jonathan Hsieh
>
> Without ssh'ing and jps/ps'ing, it is difficult to see if a backup hmaster is 
> up.  It seems like it would be good for a backup hmaster to have a basic web 
> page up on the info port so that users could see that it is up.  Also it 
> should probably either provide a link to the active master.





[jira] [Commented] (HBASE-5063) RegionServers fail to report to backup HMaster after primary goes down.

2011-12-19 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13172928#comment-13172928
 ] 

Jonathan Hsieh commented on HBASE-5063:
---

I don't think the failures are due to this patch.  The MR ones have been failing 
recently so I buy that.

I'd love to know the maven voodoo to make the hanging tests print/save their 
output...

> RegionServers fail to report to backup HMaster after primary goes down.
> ---
>
> Key: HBASE-5063
> URL: https://issues.apache.org/jira/browse/HBASE-5063
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
>Priority: Critical
> Attachments: HBASE-5063.patch, hbase-5063.v2.0.92.patch, 
> hbase-5063.v2.trunk.patch
>
>
> # Setup cluster with two HMasters
> # Observe that HM1 is up and that all RS's are in the RegionServer list on 
> web page.
> # Kill (not even -9) the active HMaster
> # Wait for ZK to time out (default 3 minutes).
> # Observe that HM2 is now active.  Tables may show up but RegionServers never 
> report on web page.  Existing connections are fine.  New connections cannot 
> find regionservers.
> Note: 
> * If we start a new HM1 in the same place and kill HM2, the cluster 
> functions normally again after recovery.  This seems to indicate that 
> regionservers are stuck trying to talk to the old HM1.





[jira] [Commented] (HBASE-5063) RegionServers fail to report to backup HMaster after primary goes down.

2011-12-19 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13172917#comment-13172917
 ] 

Jonathan Hsieh commented on HBASE-5063:
---

@Stack

Here's what I got from a local run:

{code}
Running org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 506.988 sec
Running org.apache.hadoop.hbase.mapreduce.TestTableMapReduce
Running org.apache.hadoop.hbase.mapred.TestTableMapReduce
Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 501.953 sec <<< FAILURE!
Running org.apache.hadoop.hbase.io.hfile.TestHFileBlock
Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 76.135 sec
{code}

mapreduce.TestTableMapReduce seems to be a hang.  (grr... how do I get maven 
just to spit out all test output instead of waiting for the test to "finish")
mapred.TestTableMapReduce seems to be a failed MR job.


> RegionServers fail to report to backup HMaster after primary goes down.
> ---
>
> Key: HBASE-5063
> URL: https://issues.apache.org/jira/browse/HBASE-5063
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
>Priority: Critical
> Attachments: HBASE-5063.patch, hbase-5063.v2.0.92.patch, 
> hbase-5063.v2.trunk.patch
>
>
> # Setup cluster with two HMasters
> # Observe that HM1 is up and that all RS's are in the RegionServer list on 
> web page.
> # Kill (not even -9) the active HMaster
> # Wait for ZK to time out (default 3 minutes).
> # Observe that HM2 is now active.  Tables may show up but RegionServers never 
> report on web page.  Existing connections are fine.  New connections cannot 
> find regionservers.
> Note: 
> * If we start a new HM1 in the same place and kill HM2, the cluster 
> functions normally again after recovery.  This seems to indicate that 
> regionservers are stuck trying to talk to the old HM1.





[jira] [Commented] (HBASE-5063) RegionServers fail to report to backup HMaster after primary goes down.

2011-12-19 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13172832#comment-13172832
 ] 

Jonathan Hsieh commented on HBASE-5063:
---

@Lars

I got tied up with something this morning and just started looking at this 
again.  It will take a significant amount of work to make this testable, so I'm 
going to punt on making a test if this is ok.  (There is a specific interleaving 
which I got once but can't seem to easily duplicate).



> RegionServers fail to report to backup HMaster after primary goes down.
> ---
>
> Key: HBASE-5063
> URL: https://issues.apache.org/jira/browse/HBASE-5063
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
>Priority: Critical
> Attachments: HBASE-5063.patch, hbase-5063.v2.0.92.patch, 
> hbase-5063.v2.trunk.patch
>
>
> # Setup cluster with two HMasters
> # Observe that HM1 is up and that all RS's are in the RegionServer list on 
> web page.
> # Kill (not even -9) the active HMaster
> # Wait for ZK to time out (default 3 minutes).
> # Observe that HM2 is now active.  Tables may show up but RegionServers never 
> report on web page.  Existing connections are fine.  New connections cannot 
> find regionservers.
> Note: 
> * If we start a new HM1 in the same place and kill HM2, the cluster 
> functions normally again after recovery.  This seems to indicate that 
> regionservers are stuck trying to talk to the old HM1.





[jira] [Commented] (HBASE-5063) RegionServers fail to report to backup HMaster after primary goes down.

2011-12-19 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13172599#comment-13172599
 ] 

Jonathan Hsieh commented on HBASE-5063:
---

I think it is valid and will address it.  I'd like to write a unit test that 
captures this issue as well (it is odd that TestMasterFailover does not).



> RegionServers fail to report to backup HMaster after primary goes down.
> ---
>
> Key: HBASE-5063
> URL: https://issues.apache.org/jira/browse/HBASE-5063
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
>Priority: Critical
> Attachments: HBASE-5063.patch
>
>
> # Setup cluster with two HMasters
> # Observe that HM1 is up and that all RS's are in the RegionServer list on 
> web page.
> # Kill (not even -9) the active HMaster
> # Wait for ZK to time out (default 3 minutes).
> # Observe that HM2 is now active.  Tables may show up but RegionServers never 
> report on web page.  Existing connections are fine.  New connections cannot 
> find regionservers.
> Note: 
> * If we start a new HM1 in the same place and kill HM2, the cluster 
> functions normally again after recovery.  This seems to indicate that 
> regionservers are stuck trying to talk to the old HM1.





[jira] [Commented] (HBASE-5063) RegionServers fail to report to backup HMaster after primary goes down.

2011-12-17 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13171765#comment-13171765
 ] 

Jonathan Hsieh commented on HBASE-5063:
---

Here's the exception -- unfortunately it doesn't say which master it is unable 
to connect to.

{code}
11/12/17 18:50:24 WARN regionserver.HRegionServer: Unable to connect to master. Retrying. Error was:
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:408)
    at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupConnection(HBaseClient.java:328)
    at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:362)
    at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1024)
    at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:876)
    at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:150)
    at $Proxy8.getProtocolVersion(Unknown Source)
    at org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:183)
    at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:303)
    at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:280)
    at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:332)
    at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:236)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.getMaster(HRegionServer.java:1616)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.tryRegionServerReport(HRegionServer.java:787)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:674)
    at java.lang.Thread.run(Thread.java:619)
{code}
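
Since the trace above does not name the master being contacted, one general remedy is to put the target address into the exception message at the point where the connection is attempted. The following is a plain-JDK sketch of that idea, not HBase's RPC code; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper showing how a connect failure can carry the target address.
final class ConnectWithAddressSketch {
  static Socket connect(InetSocketAddress addr, int timeoutMs) throws IOException {
    Socket socket = new Socket();
    try {
      socket.connect(addr, timeoutMs);
      return socket;
    } catch (IOException e) {
      socket.close();
      // Keep the original exception as the cause and add the address to the message.
      throw new IOException("Unable to connect to master at " + addr, e);
    }
  }
}
```

With a wrapper like this, the WARN line would show which host and port the region server was still trying to reach.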

> RegionServers fail to report to backup HMaster after primary goes down.
> ---
>
> Key: HBASE-5063
> URL: https://issues.apache.org/jira/browse/HBASE-5063
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
>Priority: Critical
> Attachments: HBASE-5063.patch
>
>
> # Setup cluster with two HMasters
> # Observe that HM1 is up and that all RS's are in the RegionServer list on 
> web page.
> # Kill (not even -9) the active HMaster
> # Wait for ZK to time out (default 3 minutes).
> # Observe that HM2 is now active.  Tables may show up but RegionServers never 
> report on web page.  Existing connections are fine.  New connections cannot 
> find regionservers.
> Note: 
> * If we start a new HM1 in the same place and kill HM2, the cluster 
> functions normally again after recovery.  This seems to indicate that 
> regionservers are stuck trying to talk to the old HM1.





[jira] [Commented] (HBASE-4934) Display Master server and Regionserver start time on respective info servers.

2011-12-16 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13171126#comment-13171126
 ] 

Jonathan Hsieh commented on HBASE-4934:
---

I don't think the failed tests are related to this patch at all, there seem to 
be instances of similar failures in the 4-5 previous builds.

> Display Master server and Regionserver start time on respective info servers.
> -
>
> Key: HBASE-4934
> URL: https://issues.apache.org/jira/browse/HBASE-4934
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.92.0
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
>Priority: Minor
> Attachments: hbase-4934.patch, hmaster.png, hregion.png
>
>
> With operations like rolling restart or master failovers, it is difficult to 
> tell if a server is the "old" instance or the "new" restarted instance.  
> Adding a start date stamp on the info web pages would be helpful for 
> determining this.
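
For illustration, a start time can be obtained from the JVM itself. The sketch below uses the JDK's RuntimeMXBean; it is an assumption that JVM start time is a good enough proxy for server start time, and the class and method names are hypothetical rather than taken from the attached patch.

```java
import java.lang.management.ManagementFactory;
import java.util.Date;

// Hypothetical sketch: the patch may well record the start time differently.
final class StartTimeSketch {
  /** JVM start time in milliseconds since the epoch, as reported by the JDK. */
  static long jvmStartMillis() {
    return ManagementFactory.getRuntimeMXBean().getStartTime();
  }

  /** A line that an info page could render next to the server name. */
  static String startedLine(long startMillis) {
    return "Started: " + new Date(startMillis);
  }
}
```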





[jira] [Commented] (HBASE-4934) Display Master server and Regionserver start time on respective info servers.

2011-12-16 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13170984#comment-13170984
 ] 

Jonathan Hsieh commented on HBASE-4934:
---

I find this useful for quickly determining which regionservers have been 
restarted/resurrected.

> Display Master server and Regionserver start time on respective info servers.
> -
>
> Key: HBASE-4934
> URL: https://issues.apache.org/jira/browse/HBASE-4934
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.92.0
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
>Priority: Minor
> Attachments: hbase-4934.patch, hmaster.png, hregion.png
>
>
> With operations like rolling restart or master failovers, it is difficult to 
> tell if a server is the "old" instance or the "new" restarted instance.  
> Adding a start date stamp on the info web pages would be helpful for 
> determining this.





[jira] [Commented] (HBASE-5049) TestHLogSplit.testLogRollAfterSplitStart not working due to HBASE-5006

2011-12-15 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13170634#comment-13170634
 ] 

Jonathan Hsieh commented on HBASE-5049:
---

So the old code had an error in semantics (see 
http://docs.oracle.com/javase/6/docs/api/java/lang/Throwable.html#initCause(java.lang.Throwable) ).

Specifically, initCause can be called at most once.  When an exception is passed 
in at the constructor, the subsequent initCause call will *always* fail.  

We encountered this exception when backporting parts of HBASE-5006 for a 
HBASE-2312/HADOOP-6840 bug fix series to a 0.90-based branch.   This hadoop branch 
has some fixes related to HADOOP-6840 that are currently only on the 1.1.0 branch.
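
The initCause behaviour is easy to reproduce with plain JDK classes. The helper below is a small sketch with made-up names, not the HLog code; it only reports whether Throwable.initCause rejected the call.

```java
// Hypothetical demo class; only Throwable.initCause itself is real JDK API.
final class InitCauseSketch {
  /** Returns true if initCause refused to set the cause on t. */
  static boolean initCauseRejected(Throwable t, Throwable cause) {
    try {
      t.initCause(cause);
      return false;
    } catch (IllegalStateException e) {
      // "Can't overwrite cause": the cause was already fixed, either by a
      // constructor that takes a cause or by an earlier initCause call.
      return true;
    }
  }
}
```

An exception built with a cause-taking constructor, such as new IOException("msg", cause), rejects any later initCause call, even when the constructor argument was null. An exception built without a cause accepts exactly one initCause call. The usual fix is to supply the cause in only one of the two places.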



> TestHLogSplit.testLogRollAfterSplitStart not working due to HBASE-5006
> --
>
> Key: HBASE-5049
> URL: https://issues.apache.org/jira/browse/HBASE-5049
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0, 0.94.0
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Trivial
> Attachments: 
> 0001-HBASE-5049-TestHLogSplit.testLogRollAfterSplitStart-.patch
>
>
> java.lang.IllegalStateException: Can't overwrite cause
>   at java.lang.Throwable.initCause(Throwable.java:320)
>   at org.apache.hadoop.hbase.regionserver.wal.HLog.createWriter(HLog.java:624)
>   at org.apache.hadoop.hbase.regionserver.wal.HLog.createWriterInstance(HLog.java:570)
>   at org.apache.hadoop.hbase.regionserver.wal.HLog.rollWriter(HLog.java:504)
>   at org.apache.hadoop.hbase.regionserver.wal.TestHLogSplit.testLogRollAfterSplitStart(TestHLogSplit.java:880)





[jira] [Commented] (HBASE-5034) Make distributed log splitting the default once we gain confidence in it.

2011-12-15 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13170091#comment-13170091
 ] 

Jonathan Hsieh commented on HBASE-5034:
---

My bad -- I didn't check that.  I think we should still deprecate and remove 
the older path so we only have one path/configuration to exercise in the future.

> Make distributed log splitting the default once we gain confidence in it.
> -
>
> Key: HBASE-5034
> URL: https://issues.apache.org/jira/browse/HBASE-5034
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.94.0
>Reporter: Jonathan Hsieh
>
> As a suggestion:
> To reduce the number of paths necessary for testing, we should make 
> distributed log splitting the default setting for recovery once we gain 
> confidence with it.  After a release where it is the default (0.94 
> hopefully?), the release after could remove the original non-distributed 
> version.





[jira] [Commented] (HBASE-4610) Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk (definitely bring in config params, decide if we need to do more to fix the bug)

2011-12-07 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164844#comment-13164844
 ] 

Jonathan Hsieh commented on HBASE-4610:
---

I think if the tests are no worse than before, 0.92.0 sounds reasonable to me.

> Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk 
> (definitely bring in config params, decide if we need to do more to fix the 
> bug)
> -
>
> Key: HBASE-4610
> URL: https://issues.apache.org/jira/browse/HBASE-4610
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.92.0, 0.94.0
>Reporter: Jonathan Gray
>Assignee: Jonathan Gray
> Fix For: 0.92.0, 0.94.0
>
> Attachments: 4610.txt
>
>
> Over in HBASE-3380 we were having some TestMasterFailover flakiness.  We 
> added some more config parameters to better control the master startup loop 
> where it waits for RS to heartbeat in.  We had thought at the time that 92 
> would have a different solution but it is still relying on heartbeats to 
> learn about RSs.
> For now, we should definitely bring these config params into 92/trunk.  
> Otherwise this is an incompatible regression and adding these will also make 
> things like what was just reported over in HBASE-4603 trivial to fix in an 
> optimal way.





[jira] [Commented] (HBASE-4610) Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk (definitely bring in config params, decide if we need to do more to fix the bug)

2011-12-07 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164835#comment-13164835
 ] 

Jonathan Hsieh commented on HBASE-4610:
---

I had started doing this also -- are you sure you want to keep the 'if (count 
== oldcount && count > 0) break' line?  It was removed on the 0.90 version.

{code}
+long slept = 0;
 for (int oldcount = countOfRegionServers(); !this.master.isStopped();) {
   Thread.sleep(interval);
+  slept += interval;
   count = countOfRegionServers();
   if (count == oldcount && count > 0) break;
 
   String msg;
+  if (count == oldcount && count >= minToStart && slept >= timeout) {
+    LOG.info("Finished waiting for regionserver count to settle; " +
+        "count=" + count + ", sleptFor=" + slept);
+    break;
{code}
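
For reference, here is a standalone sketch of the wait loop in the diff above, written without the early 'count == oldcount && count > 0' break. It is not the ServerManager code: the suppliers stand in for countOfRegionServers() and master.isStopped(), the parameter names are hypothetical, and updating oldCount on every pass is an assumption because the excerpt does not show that part.

```java
import java.util.function.BooleanSupplier;
import java.util.function.IntSupplier;

// Hypothetical standalone version of the master's startup wait loop.
final class WaitForRegionServersSketch {
  /**
   * Polls until the region server count is stable between two polls, at least
   * minToStart servers have checked in, and at least timeoutMs has elapsed.
   * Returns the last observed count (also when the master is stopped early).
   */
  static int waitForCountToSettle(IntSupplier countOfRegionServers,
      BooleanSupplier stopped, long intervalMs, long timeoutMs, int minToStart)
      throws InterruptedException {
    long slept = 0;
    int oldCount = countOfRegionServers.getAsInt();
    while (!stopped.getAsBoolean()) {
      Thread.sleep(intervalMs);
      slept += intervalMs;
      int count = countOfRegionServers.getAsInt();
      if (count == oldCount && count >= minToStart && slept >= timeoutMs) {
        return count;
      }
      oldCount = count; // assumption: compare against the previous poll
    }
    return oldCount;
  }
}
```

With the early break left in, the loop exits as soon as two consecutive polls agree on a nonzero count, so the minToStart and timeout conditions never get a chance to apply. That is why the question about keeping that line matters.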

Both before and after applying this change, TestMasterFailover seemed flaky for 
me on the 0.92 branch.  

Is the plan for this 0.92.0 or 0.92.1?

> Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk 
> (definitely bring in config params, decide if we need to do more to fix the 
> bug)
> -
>
> Key: HBASE-4610
> URL: https://issues.apache.org/jira/browse/HBASE-4610
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.92.0, 0.94.0
>Reporter: Jonathan Gray
>Assignee: Jonathan Gray
> Fix For: 0.92.1
>
> Attachments: 4610.txt
>
>
> Over in HBASE-3380 we were having some TestMasterFailover flakiness.  We 
> added some more config parameters to better control the master startup loop 
> where it waits for RS to heartbeat in.  We had thought at the time that 92 
> would have a different solution but it is still relying on heartbeats to 
> learn about RSs.
> For now, we should definitely bring these config params into 92/trunk.  
> Otherwise this is an incompatible regression and adding these will also make 
> things like what was just reported over in HBASE-4603 trivial to fix in an 
> optimal way.





[jira] [Commented] (HBASE-4972) Investigate and port patches on 0.90 branch that are not on 0.92/trunk branch.

2011-12-07 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164830#comment-13164830
 ] 

Jonathan Hsieh commented on HBASE-4972:
---

Ted has filed HBASE-4977 and closed HBASE-3848.  I will be resolving this issue 
as "Not a bug".

> Investigate and port patches on 0.90 branch that are not on 0.92/trunk branch.
> --
>
> Key: HBASE-4972
> URL: https://issues.apache.org/jira/browse/HBASE-4972
> Project: HBase
>  Issue Type: Task
>Affects Versions: 0.92.0
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
>Priority: Critical
> Fix For: 0.92.0
>
>
> There are several issues that have been committed in the 0.90 branch but were 
> not in trunk/0.92 branch.   These regressions should be "forward" ported.
> HBASE-3320  ! 
> HBASE-3380  ! -> HBASE-4610 is a jira to backport this, but it is not done.
> HBASE-3410  ! 
> HBASE-3501  !
> HBASE-3714  ! 
> HBASE-3729  !! Marked in 0.92 but not committed there, committed in 0.90 
> branch.
> HBASE-3848  !
> HBASE-3892  ! * Comments say trunk does not need.
> HBASE-3906  !
> HBASE-3989  !
> HBASE-4109  !
> HBASE-4160  !! Marked resolved 0.90.5, but no corresponding commit in either 
> 0.90 or 0.92
> HBASE-4423  ! 





[jira] [Commented] (HBASE-4974) Remove some resources leaks on the tests

2011-12-07 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164827#comment-13164827
 ] 

Jonathan Hsieh commented on HBASE-4974:
---

The test failures are related to a problem in HBASE-4927.  An addendum was 
added and those 3 tests should pass now.

> Remove some resources leaks on the tests
> 
>
> Key: HBASE-4974
> URL: https://issues.apache.org/jira/browse/HBASE-4974
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 0.94.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Attachments: 4974_all.patch
>
>
> Cf. title and HBASE-4965





[jira] [Commented] (HBASE-4927) CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the last region when the endkey is empty

2011-12-07 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164686#comment-13164686
 ] 

Jonathan Hsieh commented on HBASE-4927:
---

The version initially committed with this patch made HRegionInfo's comparator 
declare region ['','') smaller than ['','A').  Previously it was the other way 
around.  

In the TestOfflineMeta* tests, the disableTable call eventually calls 
AssignmentManager#getRegionsOfTable(table).  This returns 3 regions instead of 
4, because the lookup uses a "boundary" region that has [startkey='', 
endkey='').  The change likely left either the first or the last region out of 
this call.

The core problem is that the definition of greater-than and less-than for 
regions is inconsistent with respect to '' start and end keys.   
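
To make the ordering concrete, here is a minimal sketch that uses plain 
strings for keys.  It assumes the convention that an empty end key means "end 
of table" and so compares greater than any non-empty end key.  RegionOrder, 
Range and SPLIT_PARENT_FIRST are hypothetical names, not HBase classes, and 
this is not the code from the patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class RegionOrder {
  /** A key range [start, end); an empty end means "to the end of the table". */
  static final class Range {
    final String start;
    final String end;
    Range(String start, String end) { this.start = start; this.end = end; }
    @Override public String toString() { return "[" + start + "," + end + ")"; }
  }

  /** Orders by start key; on ties, the wider range (the split parent) first. */
  static final Comparator<Range> SPLIT_PARENT_FIRST = (a, b) -> {
    int c = a.start.compareTo(b.start);  // an empty start key sorts first
    if (c != 0) return c;
    if (a.end.equals(b.end)) return 0;
    if (a.end.isEmpty()) return -1;      // a runs to end of table, so it is wider
    if (b.end.isEmpty()) return 1;
    return b.end.compareTo(a.end);       // larger end key sorts first
  };

  /** Returns a new list holding the given ranges in split-parent-first order. */
  static List<Range> sorted(Range... ranges) {
    List<Range> out = new ArrayList<>(Arrays.asList(ranges));
    out.sort(SPLIT_PARENT_FIRST);
    return out;
  }
}
```

Sorting [A,B), [A,) and [B,) with this comparator gives [A,), [A,B), [B,), 
which is the order the issue description asks for.  Whatever convention is 
chosen, it needs to be applied the same way everywhere regions are compared.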


> CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the 
> last region when the endkey is empty
> ---
>
> Key: HBASE-4927
> URL: https://issues.apache.org/jira/browse/HBASE-4927
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.92.0, 0.94.0
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Minor
> Fix For: 0.92.0
>
> Attachments: 
> 0001-Fixed-TestOffline-failure-caused-by-HBASE-4927.patch, 
> 0001-Fixed-TestOffline-failure-caused-by-HBASE-4927.patch, 
> 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator-.patch, 
> 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator_v2.patch, 
> hbase-4927-fix-ws.txt
>
>
> When reviewing HBASE-4238 backporting, Jon found this issue.
> What happens if the split points are  (empty end key is the last key, empty 
> start key is the first key)
> Parent [A,)
> L daughter [A,B), 
> R daughter [B,)
> When sorted, we get to the end key comparison, which results in this 
> incorrect order:
> [A,B), [A,), [B,) 
> we wanted:
> [A,), [A,B), [B,)





[jira] [Commented] (HBASE-4972) Investigate and port patches on 0.90 branch that are not on 0.92/trunk branch.

2011-12-07 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164512#comment-13164512
 ] 

Jonathan Hsieh commented on HBASE-4972:
---

So two issues remain: 
* HBASE-4610 which is explicitly a forward porting issue. 
* HBASE-3848 which is open -- currently with a commit on 0.90 branch but not 
trunk/0.92.  Maybe this should be closed on 0.90 and a new forward porting 
issue should be created?

The other issues are basically non-issues code-wise: 
* subsequent patches picked up the fix.
* patch is not relevant to 0.92/trunk branches. (would be nice to have this in 
title).
* typos in commit messages.  

> Investigate and port patches on 0.90 branch that are not on 0.92/trunk branch.
> --
>
> Key: HBASE-4972
> URL: https://issues.apache.org/jira/browse/HBASE-4972
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
>Priority: Critical
> Fix For: 0.92.0
>
>
> There are several issues that have been committed in the 0.90 branch but were 
> not in trunk/0.92 branch.   These regressions should be "forward" ported.
> HBASE-3320  ! 
> HBASE-3380  ! -> HBASE-4610 is a jira to backport this, but it is not done.
> HBASE-3410  ! 
> HBASE-3501  !
> HBASE-3714  ! 
> HBASE-3729  !! Marked in 0.92 but not committed there, committed in 0.90 
> branch.
> HBASE-3848  !
> HBASE-3892  ! * Comments say trunk does not need.
> HBASE-3906  !
> HBASE-3989  !
> HBASE-4109  !
> HBASE-4160  !! Marked resolved 0.90.5, but no corresponding commit in either 
> 0.90 or 0.92
> HBASE-4423  ! 





[jira] [Commented] (HBASE-4972) Investigate and port patches on 0.90 branch that are not on 0.92/trunk branch.

2011-12-07 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164501#comment-13164501
 ] 

Jonathan Hsieh commented on HBASE-4972:
---

* HBASE-3848 This work has been idle since Jun/11
* HBASE-3892 Comments say trunk doesn't need, but no test case so can't verify 
without effort. Seems to have significant differences between 0.90 and 0.92.  
* HBASE-3906 Comments say doesn't make sense on trunk.
* HBASE-3989 Comments say not needed on trunk
* HBASE-4109 Comments say not needed on trunk
* HBASE-4160 Patch and commit present but does not contain name HBASE-4160.
* HBASE-4423 Contained in 0.92's HBASE-4238 


> Investigate and port patches on 0.90 branch that are not on 0.92/trunk branch.
> --
>
> Key: HBASE-4972
> URL: https://issues.apache.org/jira/browse/HBASE-4972
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
>Priority: Critical
> Fix For: 0.92.0
>
>
> There are several issues that have been committed in the 0.90 branch but were 
> not in trunk/0.92 branch.   These regressions should be "forward" ported.
> HBASE-3320  ! 
> HBASE-3380  ! -> HBASE-4610 is a jira to backport this, but it is not done.
> HBASE-3410  ! 
> HBASE-3501  !
> HBASE-3714  ! 
> HBASE-3729  !! Marked in 0.92 but not committed there, committed in 0.90 
> branch.
> HBASE-3848  !
> HBASE-3892  ! * Comments say trunk does not need.
> HBASE-3906  !
> HBASE-3989  !
> HBASE-4109  !
> HBASE-4160  !! Marked resolved 0.90.5, but no corresponding commit in either 
> 0.90 or 0.92
> HBASE-4423  ! 





[jira] [Commented] (HBASE-4972) Investigate and port patches on 0.90 branch that are not on 0.92/trunk branch.

2011-12-07 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164466#comment-13164466
 ] 

Jonathan Hsieh commented on HBASE-4972:
---

Here's the first half, 

* HBASE-3320 - Contained in HBASE-3290's trunk/0.92 commit.
* HBASE-3380/HBASE-4610 - conflicts with HBASE-4749; in both cases 
TestMasterFailover is still flaky.
* HBASE-3410 - Contained in HBASE-3374's trunk/0.92 commit.
* HBASE-3501 - Bad commit message - was named HBASE-3502 in trunk/0.92
* HBASE-3714 - Contained in HBASE-4552's trunk/0.92 commit
* HBASE-3729 - Bad commit message - was named HBASE-3749 in trunk/0.92

> Investigate and port patches on 0.90 branch that are not on 0.92/trunk branch.
> --
>
> Key: HBASE-4972
> URL: https://issues.apache.org/jira/browse/HBASE-4972
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
>Priority: Critical
> Fix For: 0.92.0
>
>
> There are several issues that have been committed in the 0.90 branch but were 
> not in trunk/0.92 branch.   These regressions should be "forward" ported.
> HBASE-3320  ! 
> HBASE-3380  ! -> HBASE-4610 is a jira to backport this, but it is not done.
> HBASE-3410  ! 
> HBASE-3501  !
> HBASE-3714  ! 
> HBASE-3729  !! Marked in 0.92 but not committed there, committed in 0.90 
> branch.
> HBASE-3848  !
> HBASE-3892  ! * Comments say trunk does not need.
> HBASE-3906  !
> HBASE-3989  !
> HBASE-4109  !
> HBASE-4160  !! Marked resolved 0.90.5, but no corresponding commit in either 
> 0.90 or 0.92
> HBASE-4423  ! 





[jira] [Commented] (HBASE-4972) Investigate and port patches on 0.90 branch that are not on 0.92/trunk branch.

2011-12-07 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164419#comment-13164419
 ] 

Jonathan Hsieh commented on HBASE-4972:
---

Good news is that most of these patches are small.

> Investigate and port patches on 0.90 branch that are not on 0.92/trunk branch.
> --
>
> Key: HBASE-4972
> URL: https://issues.apache.org/jira/browse/HBASE-4972
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
>Priority: Critical
> Fix For: 0.92.0
>
>
> There are several issues that have been committed in the 0.90 branch but were 
> not in trunk/0.92 branch.   These regressions should be "forward" ported.
> HBASE-3320  ! 
> HBASE-3380  ! -> HBASE-4610 is a jira to backport this, but it is not done.
> HBASE-3410  ! 
> HBASE-3501  !
> HBASE-3714  ! 
> HBASE-3729  !! Marked in 0.92 but not committed there, committed in 0.90 
> branch.
> HBASE-3848  !
> HBASE-3892  ! * Comments say trunk does not need.
> HBASE-3906  !
> HBASE-3989  !
> HBASE-4109  !
> HBASE-4160  !! Marked resolved 0.90.5, but no corresponding commit in either 
> 0.90 or 0.92
> HBASE-4423  ! 





[jira] [Commented] (HBASE-3755) Catch zk's ConnectionLossException and augment error message with more help

2011-12-05 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13163160#comment-13163160
 ] 

Jonathan Hsieh commented on HBASE-3755:
---

Note: This was also committed in the 0.92/trunk branch but doesn't show up 
because its commit log has HBASE=3755.
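
This kind of mismatch is what makes a branch-to-branch audit by commit message 
unreliable.  A minimal sketch of the effect, assuming commit subjects are 
matched with a regular expression; IssueKeys and both patterns are 
illustrative only and not an existing tool:

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class IssueKeys {
  // Strict form: exactly "HBASE-<digits>".
  private static final Pattern STRICT = Pattern.compile("HBASE-(\\d+)");
  // Lenient form: also accepts '=', '_' or a space as the separator.
  private static final Pattern LENIENT = Pattern.compile("HBASE[-=_ ](\\d+)");

  static Optional<String> strictKey(String commitSubject) {
    return find(STRICT, commitSubject);
  }

  static Optional<String> lenientKey(String commitSubject) {
    return find(LENIENT, commitSubject);
  }

  /** Returns the normalized key "HBASE-<n>" for the first match, if any. */
  private static Optional<String> find(Pattern pattern, String subject) {
    Matcher m = pattern.matcher(subject);
    return m.find() ? Optional.of("HBASE-" + m.group(1)) : Optional.empty();
  }
}
```

The strict pattern finds nothing in a subject that starts with "HBASE=3755", 
while the lenient one normalizes it to HBASE-3755.  Neither helps when the 
number itself is wrong in the commit message; those cases still need a manual 
check.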

> Catch zk's ConnectionLossException and augment error message with more help
> ---
>
> Key: HBASE-3755
> URL: https://issues.apache.org/jira/browse/HBASE-3755
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.1
>Reporter: Jean-Daniel Cryans
>Assignee: Jean-Daniel Cryans
> Fix For: 0.90.3
>
> Attachments: HBASE-3755-v2.patch, HBASE-3755.patch
>
>
> 0.90 has a different behavior regarding ZK connections, it tends to create 
> too many of them and it's not obvious to users what they should do to fix. I 
> think I've helped at least 5 different users this week with this error.
> By catching ConnectionLossException and augmenting its message, we could say 
> something like "it's possible that the ZooKeeper server has too many 
> connections from this IP, see doc at blah" since the ZK server isn't nice 
> enough to let us know what's going on.





[jira] [Commented] (HBASE-4934) Display Master server and Regionserver start time on respective info servers.

2011-12-02 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13161761#comment-13161761
 ] 

Jonathan Hsieh commented on HBASE-4934:
---

Sure it is, but as a user I think in human readable dates (did the triggered 
restart happen 2 minutes ago?) instead of milliseconds since 1970... :)
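
Something along these lines would do.  This is a minimal sketch using 
java.time; StartTimes.describe is a hypothetical helper, not anything in the 
info server code, and the current time is passed in only to keep the output 
reproducible.

```java
import java.time.Duration;
import java.time.Instant;

final class StartTimes {
  /**
   * Renders a start timestamp (milliseconds since 1970) as an ISO-8601 UTC
   * instant followed by the elapsed time in days, hours and minutes.
   * Assumes nowMillis is not earlier than startMillis.
   */
  static String describe(long startMillis, long nowMillis) {
    Instant start = Instant.ofEpochMilli(startMillis);
    Duration up = Duration.ofMillis(nowMillis - startMillis);
    long days = up.toDays();
    long hours = up.toHours() % 24;
    long minutes = up.toMinutes() % 60;
    return start + " (up " + days + "d " + hours + "h " + minutes + "m)";
  }
}
```

A caller would pass the server's start time and System.currentTimeMillis().  
Showing both the absolute date and the elapsed time answers the "did the 
restart happen 2 minutes ago?" question at a glance.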

> Display Master server and Regionserver start time on respective info servers.
> -
>
> Key: HBASE-4934
> URL: https://issues.apache.org/jira/browse/HBASE-4934
> Project: HBase
>  Issue Type: Improvement
>Reporter: Jonathan Hsieh
>Priority: Minor
>
> With operations like rolling restart or master failovers, it is difficult to 
> tell if a server is the "old" instance or the "new" restarted instance.  
> Adding a start date stamp on the info web pages would be helpful for 
> determining this.





[jira] [Commented] (HBASE-4931) CopyTable instructions are out of date in book and usage in source.

2011-12-01 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13161355#comment-13161355
 ] 

Jonathan Hsieh commented on HBASE-4931:
---

* Docs should give warning about zoo.cfg settings vs hbase-site.xml settings.
* Docs should say requires same/compatible versions of hbase.
* Run time warning if there are unexpected arguments (or typos)
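
For the last point, a minimal sketch of the kind of check I mean.  ArgCheck is 
a hypothetical helper, and the list of known options is taken only from the 
usage text quoted in this issue; the real tool may accept others.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ArgCheck {
  // Option prefixes from the usage text quoted in this issue (illustrative).
  private static final List<String> KNOWN =
      Arrays.asList("--starttime=", "--endtime=", "--peer.adr=");

  /** Returns every "--" argument that does not start with a known prefix. */
  static List<String> unexpected(String[] args) {
    List<String> bad = new ArrayList<>();
    for (String arg : args) {
      if (!arg.startsWith("--")) {
        continue;  // positional argument, e.g. the table name
      }
      boolean known = false;
      for (String prefix : KNOWN) {
        if (arg.startsWith(prefix)) {
          known = true;
          break;
        }
      }
      if (!known) {
        bad.add(arg);
      }
    }
    return bad;
  }
}
```

A typo such as --per.adr=... would be reported instead of being silently 
ignored, and the job could warn or refuse to start.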

> CopyTable instructions are out of date in book and usage in source.
> ---
>
> Key: HBASE-4931
> URL: https://issues.apache.org/jira/browse/HBASE-4931
> Project: HBase
>  Issue Type: Bug
>  Components: util
>Affects Versions: 0.90.4, 0.92.0
>Reporter: Jonathan Hsieh
>
> The book and the usage instructions refer to ReplicationRegionInterface and 
> ReplicationRegionServer which are no longer present in the 0.90+ versions.
> {code}
> $ bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable
> --rs.class=org.apache.hadoop.hbase.ipc.ReplicationRegionInterface
> --rs.impl=org.apache.hadoop.hbase.regionserver.replication.ReplicationRegionServer
> --starttime=1265875194289 --endtime=1265878794289
> --peer.adr=server1,server2,server3:2181:/hbase TestTable
> {code}





[jira] [Commented] (HBASE-4614) Can't CopyTable between clusters if zoo.cfg is on the classpath

2011-12-01 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13161353#comment-13161353
 ] 

Jonathan Hsieh commented on HBASE-4614:
---

Good thread from Lars G on zoo.cfg: 
http://comments.gmane.org/gmane.comp.java.hadoop.hbase.devel/22462

> Can't CopyTable between clusters if zoo.cfg is on the classpath
> ---
>
> Key: HBASE-4614
> URL: https://issues.apache.org/jira/browse/HBASE-4614
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.4
>Reporter: Jean-Daniel Cryans
> Fix For: 0.94.0
>
>
> Kinger on IRC found out that it's currently impossible to CopyTable between 
> clusters if there's a zoo.cfg on the classpath as it will take precedence 
> over the --peer.adr or whatever else the user passes.





[jira] [Commented] (HBASE-4931) CopyTable instructions are out of date in book and usage in source.

2011-12-01 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13161336#comment-13161336
 ] 

Jonathan Hsieh commented on HBASE-4931:
---

This should give a better actionable error message.

{code}
11/12/01 17:05:19 WARN client.HConnectionManager$HConnectionImplementation: Encountered problems when prefetch META table: 
org.apache.hadoop.hbase.TableNotFoundException: Cannot find row in .META. for table: splittable, row=splittable,,99
    at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:136)
    at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:95)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.prefetchRegionCache(HConnectionManager.java:649)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:703)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:594)
{code}

> CopyTable instructions are out of date in book and usage in source.
> ---
>
> Key: HBASE-4931
> URL: https://issues.apache.org/jira/browse/HBASE-4931
> Project: HBase
>  Issue Type: Bug
>  Components: util
>Affects Versions: 0.90.4, 0.92.0
>Reporter: Jonathan Hsieh
>
> The book and the usage instructions refer to ReplicationRegionInterface and 
> ReplicationRegionServer which are no longer present in the 0.90+ versions.
> {code}
> $ bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable
> --rs.class=org.apache.hadoop.hbase.ipc.ReplicationRegionInterface
> --rs.impl=org.apache.hadoop.hbase.regionserver.replication.ReplicationRegionServer
> --starttime=1265875194289 --endtime=1265878794289
> --peer.adr=server1,server2,server3:2181:/hbase TestTable
> {code}





[jira] [Commented] (HBASE-4931) CopyTable instructions are out of date in book and usage in source.

2011-12-01 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13161335#comment-13161335
 ] 

Jonathan Hsieh commented on HBASE-4931:
---

Other things:

* Docs should say this is a push operation.
* Run time warning if bad zk quorum peer is specified
* Run time warning if bad zk node is specified
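
As a starting point for the last two items, a minimal sketch that checks only 
the syntax of the --peer.adr value, using the quorum:port:/znode form from the 
usage example in this issue.  PeerAddress is a hypothetical class; it does not 
contact ZooKeeper, so an unreachable quorum or a missing znode would still 
need a separate runtime check.

```java
final class PeerAddress {
  final String[] quorum;
  final int clientPort;
  final String znodeParent;

  private PeerAddress(String[] quorum, int clientPort, String znodeParent) {
    this.quorum = quorum;
    this.clientPort = clientPort;
    this.znodeParent = znodeParent;
  }

  /**
   * Parses "host1,host2:port:/znode".  Throws IllegalArgumentException with
   * a message naming the offending part when the value is malformed.
   */
  static PeerAddress parse(String peerAdr) {
    String[] parts = peerAdr.split(":");
    if (parts.length != 3) {
      throw new IllegalArgumentException(
          "expected quorum:port:/znode but got '" + peerAdr + "'");
    }
    String[] quorum = parts[0].split(",");
    if (quorum.length == 0) {
      throw new IllegalArgumentException("no hosts in quorum '" + parts[0] + "'");
    }
    for (String host : quorum) {
      if (host.trim().isEmpty()) {
        throw new IllegalArgumentException("empty host in quorum '" + parts[0] + "'");
      }
    }
    int port;
    try {
      port = Integer.parseInt(parts[1]);
    } catch (NumberFormatException e) {
      throw new IllegalArgumentException("bad client port '" + parts[1] + "'");
    }
    if (!parts[2].startsWith("/")) {
      throw new IllegalArgumentException(
          "znode parent must start with '/': '" + parts[2] + "'");
    }
    return new PeerAddress(quorum, port, parts[2]);
  }
}
```

Failing fast with a message like these would be more actionable than the 
TableNotFoundException users currently end up with.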

> CopyTable instructions are out of date in book and usage in source.
> ---
>
> Key: HBASE-4931
> URL: https://issues.apache.org/jira/browse/HBASE-4931
> Project: HBase
>  Issue Type: Bug
>  Components: util
>Affects Versions: 0.90.4, 0.92.0
>Reporter: Jonathan Hsieh
>
> The book and the usage instructions refer to ReplicationRegionInterface and 
> ReplicationRegionServer which are no longer present in the 0.90+ versions.
> {code}
> $ bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable
> --rs.class=org.apache.hadoop.hbase.ipc.ReplicationRegionInterface
> --rs.impl=org.apache.hadoop.hbase.regionserver.replication.ReplicationRegionServer
> --starttime=1265875194289 --endtime=1265878794289
> --peer.adr=server1,server2,server3:2181:/hbase TestTable
> {code}





[jira] [Commented] (HBASE-4912) HDFS API Changes

2011-12-01 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13160713#comment-13160713
 ] 

Jonathan Hsieh commented on HBASE-4912:
---

I think this might be one that is related to this bucket: HADOOP-7873.

> HDFS API Changes
> 
>
> Key: HBASE-4912
> URL: https://issues.apache.org/jira/browse/HBASE-4912
> Project: HBase
>  Issue Type: Sub-task
>  Components: client, regionserver
>Reporter: Nicolas Spiegelberg
>Assignee: Pritam Damania
> Fix For: 0.94.0
>
>






[jira] [Commented] (HBASE-4862) Splitting hlog and opening region concurrently may cause data loss

2011-11-26 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157671#comment-13157671
 ] 

Jonathan Hsieh commented on HBASE-4862:
---

@chenhui

I have a question and a few nits. 

What happens if the .temp gets left behind without being renamed?

You might want to mention that hlog files in progress (those with the .temp 
suffix) are excluded here.
{code}
+// After creating writer, simulate partial region's
+// replayRecoveredEditsIfAny() which gets SplitEditFiles of this
+// region,and delete them.
{code}

Also, probably want to update javadoc of getSplitEditFilesSorted.
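
For the javadoc, the behaviour I'd expect to see documented is roughly the 
following.  This is a sketch over plain file names, not the patch's code; 
SplitEditFiles, completedSorted and the ".temp" constant are my assumptions 
based on the suffix mentioned above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class SplitEditFiles {
  static final String IN_PROGRESS_SUFFIX = ".temp";

  /**
   * Returns the names of completed edit files in sorted order.  Files still
   * being written by the log splitter (names ending in ".temp") are skipped.
   * Sorting is lexicographic, which matches sequence-id order only if the
   * names are zero-padded to the same width.
   */
  static List<String> completedSorted(List<String> names) {
    List<String> done = new ArrayList<>();
    for (String name : names) {
      if (!name.endsWith(IN_PROGRESS_SUFFIX)) {
        done.add(name);
      }
    }
    Collections.sort(done);
    return done;
  }
}
```

A .temp file that is never renamed would simply never show up in this list, 
which is why I'm asking above what cleans it up.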

Comment should probably be "most likely" instead of "mostly"
{code}
+    try{
+      logSplitter.splitLog();
+    } catch (IOException e) {
+      LOG.info(e);
+      Assert.fail("Throws IOException when spliting "
+          + "log, it is mostly because writing file does not "
+          + "exist which is caused by concurrent replayRecoveredEditsIfAny()");
+    }
+    if (fs.exists(corruptDir)) {
+      if (fs.listStatus(corruptDir).length > 0) {
+        Assert.fail("There are some corrupt logs, "
+            + "it is mostly caused by concurrent replayRecoveredEditsIfAny()");
+      }
+    }
+  }
{code}


> Splitting hlog and opening region concurrently may cause data loss
> --
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.2
>Reporter: chunhui shen
>Assignee: chunhui shen
>Priority: Critical
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, 
> hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff, hbase-4862v1 for 
> trunk.diff, hbase-4862v2for0.90.diff, hbase-4862v2fortrunk.diff, 
> hbase-4862v3for0.90.diff, hbase-4862v3fortrunk.diff, 
> hbase-4862v5for0.90.diff, hbase-4862v5fortrunk.diff
>
>
> Case Description:
> 1. The split hlog thread creates a writer for the file region 
> A/recovered.edits/123456 and is appending log entries.
> 2. The regionserver is opening region A now, and in the process 
> replayRecoveredEditsIfAny() will delete the file region 
> A/recovered.edits/123456.
> 3. The split hlog thread catches the IO exception and stops parsing this log 
> file. If skipError = true, it adds the file to the corrupt logs. However, 
> data in other regions in this log file will be lost.
> 4. Or, if skipError = false, it will check the filesystem. Of course, the 
> file system is ok, so it only prints an error log and continues assigning 
> regions. Therefore, data in other log files will also be lost!
> The case may happen in the following scenario:
> 1. Move region from server A to server B
> 2. Kill server A and server B
> 3. Restart server A and server B
> We could prevent this exception by forbidding deletion of a recovered.edits 
> file that is being appended to by the split hlog thread.





[jira] [Commented] (HBASE-4838) Port 2856 (TestAcidGuarantee is failing) to 0.92

2011-11-26 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157653#comment-13157653
 ] 

Jonathan Hsieh commented on HBASE-4838:
---

@Lars:

+1 lgtm.

I didn't do a deep code review, but I applied v3, ran TestAcidGuarantees 20 
times, and also ran the failing cases enumerated in HBASE-2856; they all 
pass.  (Wow, the diff between v1 and v3 is pretty subtle.)





> Port 2856 (TestAcidGuarantee is failing) to 0.92
> 
>
> Key: HBASE-4838
> URL: https://issues.apache.org/jira/browse/HBASE-4838
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.92.0
>
> Attachments: 4838-v1.txt, 4838-v3.txt
>
>
> Moving back port into a separate issue (as suggested by JonH), because this 
> is not trivial.





[jira] [Commented] (HBASE-4862) Splitting hlog and opening region concurrently may cause data loss

2011-11-26 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157636#comment-13157636
 ] 

Jonathan Hsieh commented on HBASE-4862:
---

How feasible is it to add testing to this patch?  Maybe simulate the failure 
situation by aborting RS's and then starting them like in the 
TestSplitTransactionOnCluster tests?

> Splitting hlog and opening region concurrently may cause data loss
> --
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.2
>Reporter: chunhui shen
>Assignee: chunhui shen
>Priority: Critical
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, 
> hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff, hbase-4862v1 for 
> trunk.diff, hbase-4862v2for0.90.diff, hbase-4862v2fortrunk.diff, 
> hbase-4862v3for0.90.diff, hbase-4862v3fortrunk.diff
>
>
> Case Description:
> 1. The split hlog thread creates a writer for the file region 
> A/recovered.edits/123456 and is appending log entries.
> 2. The regionserver is opening region A now, and in the process 
> replayRecoveredEditsIfAny() will delete the file region 
> A/recovered.edits/123456.
> 3. The split hlog thread catches the IO exception and stops parsing this log 
> file. If skipError = true, it adds the file to the corrupt logs. However, 
> data in other regions in this log file will be lost.
> 4. Or, if skipError = false, it will check the filesystem. Of course, the 
> file system is ok, so it only prints an error log and continues assigning 
> regions. Therefore, data in other log files will also be lost!
> The case may happen in the following scenario:
> 1. Move region from server A to server B
> 2. Kill server A and server B
> 3. Restart server A and server B
> We could prevent this exception by forbidding deletion of a recovered.edits 
> file that is being appended to by the split hlog thread.





[jira] [Commented] (HBASE-4868) TestOfflineMetaRebuildBase#testMetaRebuild occasionally fails

2011-11-26 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157550#comment-13157550
 ] 

Jonathan Hsieh commented on HBASE-4868:
---

I mis-spoke -- the timeouts are already there.  Sorry about that.  

The check should be added to similar spots in the tests in 
TestOfflineMetaRebuildHole and TestOfflineMetaRebuildOverlap -- they would 
likely be vulnerable to the same kind of race.

> TestOfflineMetaRebuildBase#testMetaRebuild occasionally fails
> -
>
> Key: HBASE-4868
> URL: https://issues.apache.org/jira/browse/HBASE-4868
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.92.0
>Reporter: gaojinchao
>Assignee: gaojinchao
>Priority: Minor
> Fix For: 0.92.0, 0.94.0
>
> Attachments: HBASE-4868_trial.patch, HBASE-4868_trunkv2.patch
>
>
> See: 
> https://builds.apache.org/job/HBase-TRUNK-security/7/testReport/org.apache.hadoop.hbase.util.hbck/TestOfflineMetaRebuildBase/testMetaRebuild/
> Please review and see whether the method makes sense. 
> If it makes sense, I will check the other cases.





[jira] [Commented] (HBASE-4868) testMetaRebuild#TestOfflineMetaRebuildBase occasionally fails

2011-11-25 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157346#comment-13157346
 ] 

Jonathan Hsieh commented on HBASE-4868:
---

@Jinchao

Also, if this is a call that can block indefinitely, I'd add timeout values for 
the test.

So instead of just 

{code}
@Test
{code}

change it to 

{code}
@Test(timeout=180000)  // fail test after 180s (JUnit timeouts are in milliseconds)
{code}
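
For reference, JUnit 4 interprets the timeout attribute of @Test in milliseconds. The effect of such a timeout can be sketched without JUnit, using only java.util.concurrent. This is an illustration under stated assumptions: TimeoutSketch and finishesWithin are hypothetical names, not JUnit or HBase APIs.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of JUnit or HBase: runs a task and reports
// whether it finished within the given number of milliseconds.
final class TimeoutSketch {
  static boolean finishesWithin(Runnable task, long timeoutMillis) {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    Future<?> result = pool.submit(task);
    try {
      result.get(timeoutMillis, TimeUnit.MILLISECONDS);
      return true;                        // task completed in time
    } catch (TimeoutException e) {
      result.cancel(true);                // interrupt the blocked task
      return false;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the interrupt status
      return false;
    } catch (ExecutionException e) {
      throw new RuntimeException(e.getCause());
    } finally {
      pool.shutdownNow();                 // never leave the worker thread behind
    }
  }

  public static void main(String[] args) {
    Runnable blocked = () -> {
      try {
        Thread.sleep(2_000);              // stands in for a call that blocks
      } catch (InterruptedException ignored) {
        // interrupted by cancel(true); just exit
      }
    };
    System.out.println("quick task in time: " + finishesWithin(() -> { }, 1_000));
    System.out.println("blocked task in time: " + finishesWithin(blocked, 100));
  }
}
```

The unit is the detail that matters here: a value of 180 would give the task 180 milliseconds, not 180 seconds.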





[jira] [Commented] (HBASE-4866) Fix possible NPE in AssignmentManager#regionOnline

2011-11-24 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13156779#comment-13156779
 ] 

Jonathan Hsieh commented on HBASE-4866:
---


Looks like it corresponds to this line, which is AssignmentManager:724 on the 
0.90 branch:

{code}
  HServerInfo hsiWithoutLoad = new HServerInfo(
      serverInfo.getServerAddress(), serverInfo.getStartCode(),
      serverInfo.getInfoPort(), serverInfo.getHostname());
{code}
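
If the NPE comes from serverInfo itself being null (an assumption; nothing here confirms which reference was null), a guard before the dereference would avoid the master shutdown. The sketch below uses illustrative stand-in types only. RegionOnlineSketch and its ServerInfo class are hypothetical and are not the HBase 0.90 classes.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative stand-ins only; these are not the HBase 0.90 classes.
final class RegionOnlineSketch {
  static final class ServerInfo {
    final String hostname;
    final long startCode;

    ServerInfo(String hostname, long startCode) {
      this.hostname = hostname;
      this.startCode = startCode;
    }
  }

  private final Map<String, ServerInfo> regionToServer = new HashMap<>();

  // Returns true if the region was recorded, false if serverInfo was null.
  boolean regionOnline(String regionName, ServerInfo serverInfo) {
    if (serverInfo == null) {
      // Assumption: null means the server is unknown to the master, so skip
      // the region instead of dereferencing and crashing.
      return false;
    }
    // Copy the fields, mirroring the quoted "without load" constructor call.
    ServerInfo withoutLoad = new ServerInfo(serverInfo.hostname, serverInfo.startCode);
    regionToServer.put(regionName, withoutLoad);
    return true;
  }

  boolean isOnline(String regionName) {
    return regionToServer.containsKey(regionName);
  }

  public static void main(String[] args) {
    RegionOnlineSketch am = new RegionOnlineSketch();
    System.out.println("null server accepted: " + am.regionOnline("regionA", null));
    System.out.println("known server accepted: "
        + am.regionOnline("regionB", new ServerInfo("host1", 1L)));
  }
}
```

Whether skipping the region is the right recovery is a separate design question; the sketch only shows where the check would sit.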

> Fix possible NPE in AssignmentManager#regionOnline
> --
>
> Key: HBASE-4866
> URL: https://issues.apache.org/jira/browse/HBASE-4866
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.90.4
>Reporter: Jonathan Hsieh
>
> NPE encountered in a user's HMaster logs:
> {code}
> 11/11/22 23:45:37 FATAL master.HMaster: Unhandled exception. Starting 
> shutdown.
> java.lang.NullPointerException
>at 
> org.apache.hadoop.hbase.master.AssignmentManager.regionOnline(AssignmentManager.java:731)
>at 
> org.apache.hadoop.hbase.master.AssignmentManager.processFailover(AssignmentManager.java:215)
>at 
> org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:422)
>at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:295)
> {code}
> From user list: 
> http://mail-archives.apache.org/mod_mbox/hbase-user/20.mbox/%3C4ECC9AFC.6030307%40qualtrics.com%3E





[jira] [Commented] (HBASE-4842) [hbck] Fix intermittent failures on TestHBaseFsck.testHBaseFsck

2011-11-22 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13155544#comment-13155544
 ] 

Jonathan Hsieh commented on HBASE-4842:
---

Also, I don't think dist log splitting has anything to do with this failure.

> [hbck] Fix intermittent failures on TestHBaseFsck.testHBaseFsck
> ---
>
> Key: HBASE-4842
> URL: https://issues.apache.org/jira/browse/HBASE-4842
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.4, 0.92.0, 0.94.0
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
> Attachments: 4842-v3.txt, hbase-4842-breaker.patch, hbase-4842.patch
>
>
> It seems that on the 0.92 branch in particular, TestHBaseFsck.testHBaseFsck 
> is intermittently failing.
> In the test, a region's assignment is purposely changed in META but not in 
> ZK.  After the equivalent of 'hbck -fix', a subsequent check that should be 
> clean comes up with a new ZK assignment but with META still being 
> inconsistent with ZK.  The RS in ZK sometimes points to the same RS, but 
> sometimes it "moves" to another RS. 





[jira] [Commented] (HBASE-4842) [hbck] Fix intermittent failures on TestHBaseFsck.testHBaseFsck

2011-11-22 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13155543#comment-13155543
 ] 

Jonathan Hsieh commented on HBASE-4842:
---

I'll file a new issue.

The main issue isn't what is returned, but when.  With the first 'hbck -fix', 
the master makes a call to the regionserver requesting that it open the region 
(which adds data to meta).  This call returns right away.  The next hbck call 
causes the master to query meta again, and that result is used to check 
consistency.  Sometimes the new meta entries are written before the second hbck 
call runs (the test passes); sometimes they are not (the test fails).  

The slight delay allows the open request to finish and the meta entry to be 
updated before the subsequent 'hbck' call.
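
As an alternative to a fixed delay in the test, a bounded poll can wait until the meta entry has actually been updated. This is a minimal sketch using only the JDK. WaitForSketch.waitFor and the metaUpdated flag are hypothetical stand-ins, not HBase test utilities.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

// Hypothetical test helper: polls a condition instead of sleeping blindly.
final class WaitForSketch {
  // Returns true as soon as the condition holds, false once the timeout passes.
  static boolean waitFor(long timeoutMillis, long intervalMillis, BooleanSupplier condition) {
    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
    while (true) {
      if (condition.getAsBoolean()) {
        return true;
      }
      if (System.nanoTime() >= deadline) {
        return false;
      }
      try {
        Thread.sleep(intervalMillis);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
      }
    }
  }

  public static void main(String[] args) {
    AtomicBoolean metaUpdated = new AtomicBoolean(false);
    Thread updater = new Thread(() -> {
      try {
        Thread.sleep(200);                // simulated asynchronous meta write
      } catch (InterruptedException e) {
        return;
      }
      metaUpdated.set(true);
    });
    updater.start();
    System.out.println("meta updated in time: " + waitFor(5_000, 20, metaUpdated::get));
  }
}
```

Compared with a fixed sleep, the poll returns as soon as the condition holds and still fails within a bounded time when it never does.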





[jira] [Commented] (HBASE-4842) [hbck] Fix intermittent failures on TestHBaseFsck.testHBaseFsck

2011-11-21 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13154835#comment-13154835
 ] 

Jonathan Hsieh commented on HBASE-4842:
---

Hm... the minicluster failed to start properly in that one.  Seems likely due to 
a problem in there somewhere.





[jira] [Commented] (HBASE-4842) [hbck] Fix intermittent failures on TestHBaseFsck.testHBaseFsck

2011-11-21 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13154834#comment-13154834
 ] 

Jonathan Hsieh commented on HBASE-4842:
---

Looks like it failed on iteration 39.. 





[jira] [Commented] (HBASE-4842) [hbck] Fix intermittent failures on TestHBaseFsck.testHBaseFsck

2011-11-21 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13154802#comment-13154802
 ] 

Jonathan Hsieh commented on HBASE-4842:
---

Stack,

For now, the fix is adding a sleep.  Longer term, we could add some 
synchronization options to the open region call, or update the region's state 
handling to return something like an OPENING state and then an OPEN state after 
meta and ZK have been updated.
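
One way to picture the longer-term option is a handle that starts in OPENING and flips to OPEN only after the meta update has run, so callers can block on it. The sketch below is illustrative only. OpenRegionSketch, OpenHandle, and openRegion are hypothetical names, not HBase APIs.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: an open call that exposes OPENING and OPEN separately.
final class OpenRegionSketch {
  enum State { OPENING, OPEN }

  static final class OpenHandle {
    private final CountDownLatch done = new CountDownLatch(1);
    private volatile State state = State.OPENING;

    State state() {
      return state;
    }

    void markOpen() {
      state = State.OPEN;
      done.countDown();                   // releases anyone blocked in awaitOpen
    }

    // Blocks until the region is OPEN or the timeout passes.
    boolean awaitOpen(long timeoutMillis) {
      try {
        return done.await(timeoutMillis, TimeUnit.MILLISECONDS);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
      }
    }
  }

  // Returns immediately in OPENING; flips to OPEN only after metaUpdate ran.
  static OpenHandle openRegion(ExecutorService pool, Runnable metaUpdate) {
    OpenHandle handle = new OpenHandle();
    pool.submit(() -> {
      metaUpdate.run();
      handle.markOpen();
    });
    return handle;
  }

  public static void main(String[] args) {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    OpenHandle handle = openRegion(pool, () -> System.out.println("meta updated"));
    System.out.println("opened in time: " + handle.awaitOpen(5_000));
    System.out.println("final state: " + handle.state());
    pool.shutdown();
  }
}
```

A caller such as the test would then wait on awaitOpen before running the second check, instead of guessing how long the open takes.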





[jira] [Commented] (HBASE-4842) [hbck] Fix intermittent failures on TestHBaseFsck.testHBaseFsck

2011-11-21 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13154791#comment-13154791
 ] 

Jonathan Hsieh commented on HBASE-4842:
---


I've attached a patch that inserts a sleep into the RegionServer code right 
before writing to meta which causes the test to fail consistently.  There are 
some hanging threads if you run this using mvn.  I ran the change in eclipse as 
a unit test where it fails the test (but the unit test remains hung).






[jira] [Commented] (HBASE-4842) [hbck] Fix intermittent failures on TestHBaseFsck.testHBaseFsck

2011-11-21 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13154788#comment-13154788
 ] 

Jonathan Hsieh commented on HBASE-4842:
---

The story behind this problem.

HBCK repairs a bad assignment by offlining the region in ZK and using the admin 
interface to reassign that region.  This calls master.assign -- 
eventually the master uses its serverManager and issues an 
HRegionServer.openRegion().

It looks like HRegionServer.openRegion being essentially asynchronous is what 
causes the failure.  The call submits an OpenRegionHandler (ORH) callback to 
the RS's ExecutorService and then immediately returns, reporting the 
RegionState as OPENED.

The ORH thread calls ORH.process -> updateMeta, which creates a 
PostOpenDeployTaskThread and starts another thread that calls  
HRegionServer.postOpenDeployTasks -> MetaEditor.updateRegionLocation which 
updates the meta table.  

The problem is that the RegionState OPENED is reported to the master even 
though it may not have written all its new assignment to META yet.
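
The sequence above can be reproduced in miniature: a call that queues the real work on an executor and reports OPENED before the queued work has written anything. This is a simplified model, not the HBase code. AsyncOpenRaceSketch, metaLocation, and the artificial delay are all hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Simplified model of the race; none of these names are HBase APIs.
final class AsyncOpenRaceSketch {
  // Stand-in for the META row: the server the region is listed on.
  final AtomicReference<String> metaLocation = new AtomicReference<>("oldServer");
  private final ExecutorService executor = Executors.newSingleThreadExecutor();

  // Queues the handler and reports OPENED right away, as described above.
  String openRegion(String newServer, long delayBeforeMetaWriteMillis) {
    executor.submit(() -> {
      try {
        Thread.sleep(delayBeforeMetaWriteMillis);  // widens the race window
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return;
      }
      metaLocation.set(newServer);                 // the late META update
    });
    return "OPENED";                               // reported before META is written
  }

  // Stops accepting work and waits for the queued handler to finish.
  boolean shutdownAndWait(long timeoutMillis) {
    executor.shutdown();
    try {
      return executor.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  public static void main(String[] args) {
    AsyncOpenRaceSketch rs = new AsyncOpenRaceSketch();
    String reported = rs.openRegion("newServer", 500);
    System.out.println(reported + ", META says: " + rs.metaLocation.get());
    rs.shutdownAndWait(5_000);
    System.out.println("after handler finished, META says: " + rs.metaLocation.get());
  }
}
```

A consistency check that runs between the return of openRegion and the handler's write sees the stale location, which matches the intermittent failure described here.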








[jira] [Commented] (HBASE-4842) [hbck] Fix intermittent failures on TestHBaseFsck.testHBaseFsck

2011-11-21 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13154729#comment-13154729
 ] 

Jonathan Hsieh commented on HBASE-4842:
---

Hm.. this looks like a race or due to the lack of a rendezvous of some sort.  
Up to HBASE-4378, there was a 15000ms (yikes 15 sec!) sleep between the 'hbck 
-fix' call and the subsequent 'hbck' call that is supposed to be clean.  
HBASE-4703 removed this.  

My hunch is that the update to META made by 'hbck -fix' isn't yet visible on 
the second 'hbck' run.

https://github.com/apache/hbase/commit/6ca0e79a6ac92190238d5cda56f787ab9702d7fc#L61L138
TestHBaseFsck.java:138 






[jira] [Commented] (HBASE-4842) [hbck] Fix intermittent failures on TestHBaseFsck.testHBaseFsck

2011-11-21 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13154686#comment-13154686
 ] 

Jonathan Hsieh commented on HBASE-4842:
---

Output Examples:

Note that the ZK assignment and the META assignment did not change.
{code}
// hbck -fix call
ERROR: Region 
tableBadMetaAssign,,1321733234211.35120fc878802e3b6829e6d7b597b44c. listed in 
META on region server ubuntu64-build01.sf.cloudera.com,51134,1321733229687 but 
found on region server ubuntu64-build01.sf.cloudera.com,38112,1321733229583
Trying to fix assignment error...
...
// hbck after fix
ERROR: Region 
tableBadMetaAssign,,1321733234211.35120fc878802e3b6829e6d7b597b44c. listed in 
META on region server ubuntu64-build01.sf.cloudera.com,51134,1321733229687 but 
found on region server ubuntu64-build01.sf.cloudera.com,38112,1321733229583
{code}

Note that the ZK assignment changed but meta had not yet changed.
{code}
// hbck -fix
ERROR: Region 
tableBadMetaAssign,,1321719700727.af24fbbe3e1df676b8e31e3ff5765fb6. listed in 
META on region server p0123.sf.cloudera.com,36067,1321719696277 but found on 
region server p0123.sf.cloudera.com,54221,1321719696237
Trying to fix assignment error...
...
// hbck after fix
ERROR: Region 
tableBadMetaAssign,,1321719700727.af24fbbe3e1df676b8e31e3ff5765fb6. listed in 
META on region server p0123.sf.cloudera.com,36067,1321719696277 but found on 
region server p0123.sf.cloudera.com,59522,1321719696305
{code}





[jira] [Commented] (HBASE-2856) TestAcidGuarantee broken on trunk

2011-11-21 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13154453#comment-13154453
 ] 

Jonathan Hsieh commented on HBASE-2856:
---

On the bulkload operation, the error has something to do with the split point 
-- in the test I force a split, and the resulting error has something to do with 
the point where the second daughter starts.

@Lars -- since the original issue is resolved, and since this seems non-trivial, 
maybe this should get moved into a new issue?

> TestAcidGuarantee broken on trunk 
> --
>
> Key: HBASE-2856
> URL: https://issues.apache.org/jira/browse/HBASE-2856
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.89.20100621
>Reporter: ryan rawson
>Assignee: Amitanand Aiyer
>Priority: Blocker
> Fix For: 0.94.0
>
> Attachments: 2856-0.92.txt, 2856-v2.txt, 2856-v3.txt, 2856-v4.txt, 
> 2856-v5.txt, 2856-v6.txt, 2856-v7.txt, 2856-v8.txt, 
> 2856-v9-all-inclusive.txt, acid.txt
>
>
> TestAcidGuarantee has a test whereby it attempts to read a number of columns 
> from a row, and every so often the first column of N is different, when it 
> should be the same.  This is a bug deep inside the scanner whereby the first 
> peek() of a row is done at time T then the rest of the read is done at T+1 
> after a flush, thus the memstoreTS data is lost, and previously 'uncommitted' 
> data becomes committed and flushed to disk.
> One possible solution is to introduce the memstoreTS (or similarly equivalent 
> value) to the HFile thus allowing us to preserve read consistency past 
> flushes.  Another solution involves fixing the scanners so that peek() is not 
> destructive (and thus might return different things at different times alas).





[jira] [Commented] (HBASE-4820) Distributed log splitting coding enhancement to make it easier to understand, no semantics change

2011-11-21 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13154443#comment-13154443
 ] 

Jonathan Hsieh commented on HBASE-4820:
---

@Kannan, I'm looking at this from the point of view of someone who recently 
spent many hours reviewing the dist log splitting patches in aggregate and 
may be responsible for fixing issues if it has problems.  I had a harder time 
than I'd prefer, and will likely have the same problem again if there are 
problems in the future.  Making a few semantics-preserving changes, such as 
making var/method/class names more descriptive and encapsulating pieces, would 
go a long way toward making the code more easily and quickly understandable by 
more people.

Are you suggesting splitting these changes into smaller pieces such as:

* add better exception error messages.
* consolidate calls only used once. Ex: async callbacks submethods; inline 
finishInitialization into SLM's constructor
* rename vague methods. ex: installTask(String taskName) might be better as 
enqueueSplitLog(String logPath);  handleDeadWorker might be better as 
blacklistDeadWorker;  'exec(String name, Progressable)' might be better as  
'split(String logfilename, Progressable)'
* rename vague classes. ex: Task to SplitTask, TaskBatch to 
SplitTaskState/SplitTaskContext
* correct comments to be consistent with code (comments in SplitLogWorker talk 
about a SUCCESS state, which is actually the DONE state).








> Distributed log splitting coding enhancement to make it easier to understand, 
> no semantics change
> -
>
> Key: HBASE-4820
> URL: https://issues.apache.org/jira/browse/HBASE-4820
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Affects Versions: 0.94.0
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Minor
>  Labels: newbie
> Fix For: 0.94.0
>
> Attachments: 
> 0001-HBASE-4820-Distributed-log-splitting-coding-enhancement-to-makeit-easier-to-understand,-no-semantics-change..patch,
>  
> 0001-HBASE-4820-Distributed-log-splitting-coding-enhancement-to-makeit-easier-to-understand,-no-semantics-change..patch
>
>
> In reviewing distributed log splitting feature, we found some cosmetic 
> issues.  They make the code hard to understand.
> It will be great to fix them.  For this issue, there should be no semantic 
> change.





[jira] [Commented] (HBASE-2856) TestAcidGuarantee broken on trunk

2011-11-21 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13154249#comment-13154249
 ] 

Jonathan Hsieh commented on HBASE-2856:
---

@lars the 0.92 version of TestAcidGuarantees ran for about 12 hours without 
problems. 






[jira] [Commented] (HBASE-2856) TestAcidGuarantee broken on trunk

2011-11-20 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13153925#comment-13153925
 ] 

Jonathan Hsieh commented on HBASE-2856:
---

@larsh I posted it for you here.  https://reviews.apache.org/r/2893/

I applied the patch, committed it and generated a git-patch via 'git 
format-patch HEAD^' which has enough info to find the right branch.





[jira] [Commented] (HBASE-2856) TestAcidGuarantee broken on trunk

2011-11-20 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13153774#comment-13153774
 ] 

Jonathan Hsieh commented on HBASE-2856:
---

On trunk, TestAcidGuarantees ran for a solid day and a half (33+ hours) without 
failing.  

@larsh I'll loop the 0.92 version, let it run through today, and report how 
it fared around midday Monday.





[jira] [Commented] (HBASE-2856) TestAcidGuarantee broken on trunk

2011-11-20 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13153775#comment-13153775
 ] 

Jonathan Hsieh commented on HBASE-2856:
---

On trunk, TestAcidGuarantees ran for a solid day and a half (33+ hours) without 
failing.  

larsh@ I'll loop the 0.92 version and let it run through today and report how 
it fared around midday monday.

> TestAcidGuarantee broken on trunk 
> --
>
> Key: HBASE-2856
> URL: https://issues.apache.org/jira/browse/HBASE-2856
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.89.20100621
>Reporter: ryan rawson
>Assignee: Amitanand Aiyer
>Priority: Blocker
> Fix For: 0.94.0
>
> Attachments: 2856-0.92.txt, 2856-v2.txt, 2856-v3.txt, 2856-v4.txt, 
> 2856-v5.txt, 2856-v6.txt, 2856-v7.txt, 2856-v8.txt, 
> 2856-v9-all-inclusive.txt, acid.txt
>
>
> TestAcidGuarantee has a test whereby it attempts to read a number of columns 
> from a row, and every so often the first column of N is different, when it 
> should be the same.  This is a bug deep inside the scanner whereby the first 
> peek() of a row is done at time T then the rest of the read is done at T+1 
> after a flush, thus the memstoreTS data is lost, and previously 'uncommitted' 
> data becomes committed and flushed to disk.
> One possible solution is to introduce the memstoreTS (or similarly equivalent 
> value) to the HFile thus allowing us to preserve read consistency past 
> flushes.  Another solution involves fixing the scanners so that peek() is not 
> destructive (and thus might return different things at different times alas).





[jira] [Commented] (HBASE-2856) TestAcidGuarantee broken on trunk

2011-11-18 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13153282#comment-13153282
 ] 

Jonathan Hsieh commented on HBASE-2856:
---

I've been looping TestAcidGuarantees for about 6 hours now and it is still 
chugging along and has not failed.  I'm going to let it go overnight.  (I 
believe it used to fail within an hour.)

What are people's thoughts on backporting this to the 0.92 branch (as a 
separate issue)?

> TestAcidGuarantee broken on trunk 
> --
>
> Key: HBASE-2856
> URL: https://issues.apache.org/jira/browse/HBASE-2856
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.89.20100621
>Reporter: ryan rawson
>Assignee: Amitanand Aiyer
>Priority: Blocker
> Fix For: 0.94.0
>
> Attachments: 2856-v2.txt, 2856-v3.txt, 2856-v4.txt, 2856-v5.txt, 
> 2856-v6.txt, 2856-v7.txt, 2856-v8.txt, 2856-v9-all-inclusive.txt, acid.txt
>
>
> TestAcidGuarantee has a test whereby it attempts to read a number of columns 
> from a row, and every so often the first column of N is different, when it 
> should be the same.  This is a bug deep inside the scanner whereby the first 
> peek() of a row is done at time T then the rest of the read is done at T+1 
> after a flush, thus the memstoreTS data is lost, and previously 'uncommitted' 
> data becomes committed and flushed to disk.
> One possible solution is to introduce the memstoreTS (or similarly equivalent 
> value) to the HFile thus allowing us to preserve read consistency past 
> flushes.  Another solution involves fixing the scanners so that peek() is not 
> destructive (and thus might return different things at different times alas).





[jira] [Commented] (HBASE-4623) Remove @deprecated Scan methods in 0.90 from TRUNK and 0.92

2011-11-18 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13153271#comment-13153271
 ] 

Jonathan Hsieh commented on HBASE-4623:
---

@stack.  hbase-4623-0.92.patch doesn't apply on trunk.  The robot tried the 
0.92 version on trunk.

I did the diff for trunk backwards.  Fixing.

> Remove @deprecated Scan methods in 0.90 from TRUNK and 0.92
> ---
>
> Key: HBASE-4623
> URL: https://issues.apache.org/jira/browse/HBASE-4623
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
> Fix For: 0.92.0
>
> Attachments: hbase-4623-0.92.patch, hbase-4623.patch
>
>






[jira] [Commented] (HBASE-4623) Remove @deprecated Scan methods in 0.90 from TRUNK and 0.92

2011-11-18 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13153208#comment-13153208
 ] 

Jonathan Hsieh commented on HBASE-4623:
---

Found the info -- it is in 
hbase/target/surefire-reports/TEST-org.apache.hadoop.hbase.client.TestShell.xml 

> Remove @deprecated Scan methods in 0.90 from TRUNK and 0.92
> ---
>
> Key: HBASE-4623
> URL: https://issues.apache.org/jira/browse/HBASE-4623
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
> Fix For: 0.94.0
>
>






[jira] [Commented] (HBASE-4623) Remove @deprecated Scan methods in 0.90 from TRUNK and 0.92

2011-11-18 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13153195#comment-13153195
 ] 

Jonathan Hsieh commented on HBASE-4623:
---

@stack Is that fix in reference to HBASE-4973 or specific to this patch?  (The 
failure I'm mentioning is specific to the patch on this issue.)

Is there a place to find this without Jenkins?


> Remove @deprecated Scan methods in 0.90 from TRUNK and 0.92
> ---
>
> Key: HBASE-4623
> URL: https://issues.apache.org/jira/browse/HBASE-4623
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
> Fix For: 0.94.0
>
>






[jira] [Commented] (HBASE-4623) Remove @deprecated Scan methods in 0.90 from TRUNK and 0.92

2011-11-17 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13152040#comment-13152040
 ] 

Jonathan Hsieh commented on HBASE-4623:
---

TestShell is failing and I need some hints on where to find test output from 
TestShell.

I'm getting an error in TestShell, likely because some methods have been removed 
from Scan.  It is telling me:

{code}
...
at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:68)
Caused by: org.jruby.exceptions.RaiseException: (RuntimeError) Shell unit tests failed. Check output file for details.
{code}

Currently I'm running via 'mvn test -Dtest=TestShell' and don't know 
where to find this output log.  (Looking in target/surefire-reports doesn't 
provide useful log data.)



> Remove @deprecated Scan methods in 0.90 from TRUNK and 0.92
> ---
>
> Key: HBASE-4623
> URL: https://issues.apache.org/jira/browse/HBASE-4623
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
> Fix For: 0.94.0
>
>






[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.

2011-11-16 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13151743#comment-13151743
 ] 

Jonathan Hsieh commented on HBASE-4377:
---

Todd took a quick look and mentioned that "fs.defaultFS" is a Hadoop 0.21+'ism.  
On a 0.20.x release, setting it has no real effect.  Any concerns about this on 
the 0.90 backport?

{code}
+  public static void main(String[] args) throws Exception {
+
+// create a fsck object
+Configuration conf = HBaseConfiguration.create();
+conf.set("fs.defaultFS", conf.get(HConstants.HBASE_DIR));
+HBaseFsck fsck = new HBaseFsck(conf);
+
+
{code}
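
One possible way around the version difference, shown only as a sketch: set the default filesystem under both the newer key and the older `fs.default.name` key that 0.20.x releases read. The example uses `java.util.Properties` as a stand-in for Hadoop's `Configuration` so that it runs on its own; the class name `DefaultFsKeys`, the helper `setDefaultFs`, and the example URI are assumptions and are not part of the patch.

```java
import java.util.Properties;

/** Sketch only: Properties stands in for Hadoop's Configuration. */
final class DefaultFsKeys {
    static final String NEW_KEY = "fs.defaultFS";     // read by Hadoop 0.21+
    static final String OLD_KEY = "fs.default.name";  // read by Hadoop 0.20.x

    /** Stores the default filesystem URI under both keys. */
    static void setDefaultFs(Properties conf, String uri) {
        conf.setProperty(NEW_KEY, uri);
        conf.setProperty(OLD_KEY, uri);
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        // Hypothetical root directory; the patch takes it from HConstants.HBASE_DIR.
        setDefaultFs(conf, "hdfs://namenode.example:8020/hbase");
        System.out.println(OLD_KEY + "=" + conf.getProperty(OLD_KEY));
        System.out.println(NEW_KEY + "=" + conf.getProperty(NEW_KEY));
    }
}
```

Whether setting both keys is acceptable for the 0.90 backport is a separate question; the sketch only shows that the two names can be kept in step.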

> [hbck] Offline rebuild .META. from fs data only.
> 
>
> Key: HBASE-4377
> URL: https://issues.apache.org/jira/browse/HBASE-4377
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: 0.92.0
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
> Attachments: 
> 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90-v4.patch, 
> 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90.v3.patch, 
> 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, 
> 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.trunk.v3.patch, 
> 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v1.patch, 
> 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v2.patch, 
> EXT_AC.regioninfo, EXT_ATU_05f84d32cbc0bdabf00e00bc2f3570f0.regioninfo, 
> hbase-4377-trunk.v2.patch, hbase-4377.0.90.v6.patch, hbase-4377.trunk.v3.txt, 
> hbase-4377.trunk.v4.txt, hbase-4377.trunk.v5.txt, hbase-4377.trunk.v6.patch
>
>
> In a worst case situation, it may be helpful to have an offline .META. 
> rebuilder that just looks at the file system's .regioninfos and rebuilds meta 
> from scratch.  Users could move bad regions out until there is a clean 
> rebuild.  
> It would likely fill in region split holes.  Follow-on work could give 
> options to merge or select regions that overlap, or do online rebuilds.





[jira] [Commented] (HBASE-4804) Minor Dyslexia in CHANGES.txt

2011-11-16 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13151630#comment-13151630
 ] 

Jonathan Hsieh commented on HBASE-4804:
---

Haha..  I have a spelling problem and a tendency to omit words which may be 
incurable. :)

> Minor Dyslexia in CHANGES.txt
> -
>
> Key: HBASE-4804
> URL: https://issues.apache.org/jira/browse/HBASE-4804
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0, 0.94.0
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
> Fix For: 0.92.0
>
> Attachments: hbase-4804.patch
>
>
> I was going through the 0.92 CHANGES and found a few entries in 
> CHANGES.txt where JIRA numbers don't match their descriptions.





[jira] [Commented] (HBASE-4506) [hbck] Allow HBaseFsck to be instantiated without connecting

2011-11-14 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13150217#comment-13150217
 ] 

Jonathan Hsieh commented on HBASE-4506:
---

@Nicolas I don't see the revert on this particular patch -- the executor 
instantiation is just moved to a separate method and uses the same numThreads 
value which is the hard coded value or the one set in the hbase-site.xml file.

Which lines are we talking about?  

> [hbck] Allow HBaseFsck to be instantiated without connecting
> 
>
> Key: HBASE-4506
> URL: https://issues.apache.org/jira/browse/HBASE-4506
> Project: HBase
>  Issue Type: Improvement
>  Components: hbck
>Affects Versions: 0.90.5
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
> Fix For: 0.90.5
>
> Attachments: 
> 0001-HBASE-4506-hbck-Allow-HBaseFsck-to-be-instantiated-w.patch, 
> hbase-4506-0.90.patch
>
>
> This is a semantics preserving patch that allows for offline meta rebuild 
> (HBASE-4377) to reuse code in the existing hbck code when hbase is down.





[jira] [Commented] (HBASE-4718) Backport HBASE-4552 to 0.90 branch.

2011-11-14 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13150061#comment-13150061
 ] 

Jonathan Hsieh commented on HBASE-4718:
---

Patch failed to apply since this was targeted to the 0.90 branch instead of the 
trunk/0.92 branch.  Attached is the output of a run of selected unit tests.


{code}
---
 T E S T S
---
Running org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 146.315 sec
Running org.apache.hadoop.hbase.mapreduce.TestLoadIncrementalHFilesSplitRecovery
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 26.407 sec

Results :

Tests run: 10, Failures: 0, Errors: 0, Skipped: 0
{code}

> Backport HBASE-4552 to 0.90 branch.
> ---
>
> Key: HBASE-4718
> URL: https://issues.apache.org/jira/browse/HBASE-4718
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.4
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
> Fix For: 0.90.5
>
> Attachments: 4718-v2.90, hbase-4718.0.90.patch, 
> hbase-4718.v3.includes-hbase-3316.patch, hbase-4718.v4.patch
>
>
> In discussion of HBASE-4552 / HBASE-4677 there has been some discussion about 
> whether and how to backport HBASE-4552 to the 0.90 branch.  This is 
> potentially compatibility-breaking, so several approaches have been suggested.
> 1) provide patch but do not integrate
> 2) integrate patch that extends and deprecates old api without removing old 
> api.  It has been argued that clients are supposed to use the 
> LoadIncrementalHFiles api and not the internal HRegionServer RPC api.
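
As a rough illustration of approach 2 (extend and deprecate the old api without removing it), the sketch below keeps an old entry point, marks it `@Deprecated`, and has it delegate to a new one. The interface `BulkLoader`, its methods `loadFile` and `loadFiles`, and the `InMemoryLoader` class are all hypothetical; they are not the real HRegionServer RPC signatures.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class DeprecatedApiSketch {
    /** Hypothetical server-side interface, for illustration only. */
    interface BulkLoader {
        /** Old entry point, kept so that existing clients keep working. */
        @Deprecated
        void loadFile(String family, String path);

        /** New entry point: takes all (family, path) pairs at once and reports success. */
        boolean loadFiles(List<String[]> familyAndPath);
    }

    static final class InMemoryLoader implements BulkLoader {
        final List<String> loaded = new ArrayList<>();

        @Override
        @Deprecated
        public void loadFile(String family, String path) {
            // The old call delegates to the new one so there is a single code path.
            loadFiles(Collections.singletonList(new String[] {family, path}));
        }

        @Override
        public boolean loadFiles(List<String[]> familyAndPath) {
            for (String[] fp : familyAndPath) {
                loaded.add(fp[0] + ":" + fp[1]);
            }
            return true;
        }
    }

    public static void main(String[] args) {
        InMemoryLoader loader = new InMemoryLoader();
        loader.loadFile("cf", "/tmp/hfile1"); // an old client still works
        System.out.println(loader.loaded);
    }
}
```

The design point is that older callers compile and run unchanged while new callers move to the extended method, which is the compatibility property the backport discussion is concerned with.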





[jira] [Commented] (HBASE-4718) Backport HBASE-4552 to 0.90 branch.

2011-11-14 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13150062#comment-13150062
 ] 

Jonathan Hsieh commented on HBASE-4718:
---

Patch *failed* to apply..

> Backport HBASE-4552 to 0.90 branch.
> ---
>
> Key: HBASE-4718
> URL: https://issues.apache.org/jira/browse/HBASE-4718
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.4
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
> Fix For: 0.90.5
>
> Attachments: 4718-v2.90, hbase-4718.0.90.patch, 
> hbase-4718.v3.includes-hbase-3316.patch, hbase-4718.v4.patch
>
>
> In discussion of HBASE-4552 / HBASE-4677 there has been some discussion about 
> whether and how to backport HBASE-4552 to the 0.90 branch.  This is 
> potentially compatibility-breaking, so several approaches have been suggested.
> 1) provide patch but do not integrate
> 2) integrate patch that extends and deprecates old api without removing old 
> api.  It has been argued that clients are supposed to use the 
> LoadIncrementalHFiles api and not the internal HRegionServer RPC api.





[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.

2011-11-06 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13145197#comment-13145197
 ] 

Jonathan Hsieh commented on HBASE-4377:
---

@mingjian 

If there was a split that didn't complete cleanly, a parent region with 
daughters should look like an overlap.  The tool will tell you where these 
overlaps are.

One way to fix the problem is to keep the parent region and then move or remove 
the daughter regions from hdfs.  Since it is in the middle of a split, the 
parent should have all the data.  Alternately, you could copy the store files 
from the daughters into the dir of the parent and then run the offline 
rebuilder.
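
To make the overlap idea concrete, here is a minimal sketch of how a parent region and its daughters show up as overlapping key ranges. It is not the hbck implementation: the `Region` class and `findOverlaps` method are hypothetical, keys are plain strings rather than byte arrays, ranges are assumed to be half-open `[start, end)`, and an empty end key is assumed to mean "to the end of the table".

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class RegionOverlapSketch {
    /** Hypothetical region descriptor covering the key range [start, end). */
    static final class Region {
        final String name;
        final String start;
        final String end; // empty string means unbounded

        Region(String name, String start, String end) {
            this.name = name;
            this.start = start;
            this.end = end;
        }
    }

    /** Returns one message per pair of regions whose key ranges overlap. */
    static List<String> findOverlaps(List<Region> regions) {
        List<Region> sorted = new ArrayList<>(regions);
        sorted.sort(Comparator.comparing((Region r) -> r.start));
        List<String> overlaps = new ArrayList<>();
        for (int i = 0; i < sorted.size(); i++) {
            Region a = sorted.get(i);
            for (int j = i + 1; j < sorted.size(); j++) {
                Region b = sorted.get(j);
                // Once b starts at or after a's end, no later region can overlap a.
                if (!a.end.isEmpty() && b.start.compareTo(a.end) >= 0) {
                    break;
                }
                overlaps.add(a.name + " overlaps " + b.name);
            }
        }
        return overlaps;
    }

    public static void main(String[] args) {
        List<Region> regions = new ArrayList<>();
        regions.add(new Region("parent", "a", "m"));
        regions.add(new Region("daughterA", "a", "g"));
        regions.add(new Region("daughterB", "g", "m"));
        regions.add(new Region("next", "m", ""));
        for (String line : findOverlaps(regions)) {
            System.out.println(line);
        }
    }
}
```

In this toy layout the parent is reported against each daughter, while the two daughters and the following region do not overlap one another, which matches the picture of an incomplete split described above.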

I plan on writing a blog post and hopefully adding to the book on how to fix 
these problems.

> [hbck] Offline rebuild .META. from fs data only.
> 
>
> Key: HBASE-4377
> URL: https://issues.apache.org/jira/browse/HBASE-4377
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: 0.92.0
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
> Attachments: 
> 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90-v4.patch, 
> 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90.v3.patch, 
> 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, 
> 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.trunk.v3.patch, 
> 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v1.patch, 
> 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v2.patch, 
> EXT_AC.regioninfo, EXT_ATU_05f84d32cbc0bdabf00e00bc2f3570f0.regioninfo, 
> hbase-4377-trunk.v2.patch, hbase-4377.0.90.v6.patch, hbase-4377.trunk.v3.txt, 
> hbase-4377.trunk.v4.txt, hbase-4377.trunk.v5.txt, hbase-4377.trunk.v6.patch
>
>
> In a worst case situation, it may be helpful to have an offline .META. 
> rebuilder that just looks at the file system's .regioninfos and rebuilds meta 
> from scratch.  Users could move bad regions out until there is a clean 
> rebuild.  
> It would likely fill in region split holes.  Follow-on work could give 
> options to merge or select regions that overlap, or do online rebuilds.





[jira] [Commented] (HBASE-4740) [bulk load] the HBASE-4552 API can't tell if errors on region server is recoverable or unrecoverable error.

2011-11-05 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144842#comment-13144842
 ] 

Jonathan Hsieh commented on HBASE-4740:
---

@Stack I've updated the patch -- if this is insufficient, I'm probably going to 
be spotty for a week or so.

> [bulk load]  the HBASE-4552 API can't tell if errors on region server is 
> recoverable or unrecoverable error.
> 
>
> Key: HBASE-4740
> URL: https://issues.apache.org/jira/browse/HBASE-4740
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
>Priority: Blocker
> Fix For: 0.92.0
>
> Attachments: 
> 0001-HBASE-4740-bulkload-HBASE-4552-API-can-t-tell-if-err.patch, 
> hbase-4740.v2.patch
>
>
> Running TestHFileOutputFormat more frequently seems to show that it has 
> become flaky.   It is difficult to tell if this is because of an unrecoverable 
> failure or a recoverable failure.   To make this visible to tests and to 
> users, we need to make a change to the bulkload call's interface on 
> HRegionServer.  The change should make successful rpcs return true, 
> recoverable failures return false, and unrecoverable failures throw an 
> IOException.  This is an RPC change, so it would be really good to get this 
> api right before the final 0.92 goes out.
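
The three-way contract described above (true for success, false for a recoverable failure, IOException for an unrecoverable one) suggests a client-side retry loop along the following lines. This is a sketch under assumptions, not the patch itself: `BulkLoadCall`, `tryLoad`, and `loadWithRetries` are hypothetical names, and the retry limit is arbitrary.

```java
import java.io.IOException;

final class BulkLoadRetrySketch {
    /** Hypothetical stand-in for the region server bulk load RPC. */
    interface BulkLoadCall {
        /** true = loaded, false = recoverable failure, IOException = unrecoverable. */
        boolean tryLoad() throws IOException;
    }

    /**
     * Retries while the call reports a recoverable failure.
     * Returns the attempt number that succeeded; an IOException thrown by the
     * call propagates immediately, so unrecoverable failures are never retried.
     */
    static int loadWithRetries(BulkLoadCall call, int maxAttempts) throws IOException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (call.tryLoad()) {
                return attempt;
            }
        }
        throw new IOException("bulk load still failing after " + maxAttempts + " attempts");
    }

    public static void main(String[] args) throws IOException {
        int[] calls = {0};
        // Simulated call that reports a recoverable failure twice, then succeeds.
        int attempts = loadWithRetries(() -> ++calls[0] >= 3, 5);
        System.out.println("succeeded on attempt " + attempts);
    }
}
```

With a void return type the caller cannot tell the first two outcomes apart, which is the ambiguity the issue title complains about.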





[jira] [Commented] (HBASE-4740) [bulk load] the HBASE-4552 API can't tell if errors on region server is recoverable or unrecoverable error.

2011-11-04 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144485#comment-13144485
 ] 

Jonathan Hsieh commented on HBASE-4740:
---

@Stack

Yeah, 0 is actually the original behavior.  In the pre-HBASE-4552 version, I 
think it would just eat exceptions and bail out without completing.  It is more 
complicated now because of bulk atomicity.

Will update to boolean if it works --  there is some template checking in another 
place, so I assumed it needed the boxed type.

The difference is that the version uses a different LoadIncrementalHandlers 
instance.  I'll refactor to exclude that portion and require it in the test.

I tried the previous version with a small data set on a pseudo-distributed 
cluster and on a live cluster.  For this particular patch, I looped the 
relevant unit tests 100 times and saw that they passed every time.  I 
haven't tested this exact version on a real cluster.



> [bulk load]  the HBASE-4552 API can't tell if errors on region server is 
> recoverable or unrecoverable error.
> 
>
> Key: HBASE-4740
> URL: https://issues.apache.org/jira/browse/HBASE-4740
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
>Priority: Blocker
> Fix For: 0.92.0
>
> Attachments: 
> 0001-HBASE-4740-bulkload-HBASE-4552-API-can-t-tell-if-err.patch
>
>
> Running TestHFileOutputFormat more frequently seems to show that it has 
> become flaky.   It is difficult to tell if this is because of an unrecoverable 
> failure or a recoverable failure.   To make this visible to tests and to 
> users, we need to make a change to the bulkload call's interface on 
> HRegionServer.  The change should make successful rpcs return true, 
> recoverable failures return false, and unrecoverable failures throw an 
> IOException.  This is an RPC change, so it would be really good to get this 
> api right before the final 0.92 goes out.





[jira] [Commented] (HBASE-4740) [bulk load] the HBASE-4552 API can't tell if errors on region server is recoverable or unrecoverable error.

2011-11-04 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144364#comment-13144364
 ] 

Jonathan Hsieh commented on HBASE-4740:
---

Review here: https://reviews.apache.org/r/2730/

> [bulk load]  the HBASE-4552 API can't tell if errors on region server is 
> recoverable or unrecoverable error.
> 
>
> Key: HBASE-4740
> URL: https://issues.apache.org/jira/browse/HBASE-4740
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
>Priority: Blocker
> Fix For: 0.92.0
>
> Attachments: 
> 0001-HBASE-4740-bulkload-HBASE-4552-API-can-t-tell-if-err.patch
>
>
> Running TestHFileOutputFormat more frequently seems to show that it has 
> become flaky.   It is difficult to tell if this is because of an unrecoverable 
> failure or a recoverable failure.   To make this visible to tests and to 
> users, we need to make a change to the bulkload call's interface on 
> HRegionServer.  The change should make successful rpcs return true, 
> recoverable failures return false, and unrecoverable failures throw an 
> IOException.  This is an RPC change, so it would be really good to get this 
> api right before the final 0.92 goes out.





[jira] [Commented] (HBASE-4740) [bulk load] the HBASE-4552 API can't tell if errors on region server is recoverable or unrecoverable error.

2011-11-04 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144362#comment-13144362
 ] 

Jonathan Hsieh commented on HBASE-4740:
---

It turns out that I was splitting in the wrong place, and splitting an empty 
region returns scary error messages when it should return an innocuous one.

> [bulk load]  the HBASE-4552 API can't tell if errors on region server is 
> recoverable or unrecoverable error.
> 
>
> Key: HBASE-4740
> URL: https://issues.apache.org/jira/browse/HBASE-4740
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
>Priority: Blocker
> Fix For: 0.92.0
>
> Attachments: 
> 0001-HBASE-4740-bulkload-HBASE-4552-API-can-t-tell-if-err.patch
>
>
> Running TestHFileOutputFormat more frequently seems to show that it has 
> become flaky.   It is difficult to tell if this is because of an unrecoverable 
> failure or a recoverable failure.   To make this visible to tests and to 
> users, we need to make a change to the bulkload call's interface on 
> HRegionServer.  The change should make successful rpcs return true, 
> recoverable failures return false, and unrecoverable failures throw an 
> IOException.  This is an RPC change, so it would be really good to get this 
> api right before the final 0.92 goes out.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4740) [bulk load] the HBASE-4552 API can't tell if errors on region server is recoverable or unrecoverable error.

2011-11-03 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13143642#comment-13143642
 ] 

Jonathan Hsieh commented on HBASE-4740:
---

While reworking the tests for recoverable and simulated unrecoverable failures 
with the updated api, I noticed that there are some problems in the test cases 
I previously wrote.  There will be some significant changes with the tests in 
this patch as well.

Oddly, I have a case where splitting does not happen in one particular test 
case but does in another.

> [bulk load]  the HBASE-4552 API can't tell if errors on region server is 
> recoverable or unrecoverable error.
> 
>
> Key: HBASE-4740
> URL: https://issues.apache.org/jira/browse/HBASE-4740
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
>Priority: Critical
> Fix For: 0.92.0
>
>
> Running TestHFileOutputFormat more frequently seems to show that it has 
> become flaky.   It is difficult to tell if this is because of an unrecoverable 
> failure or a recoverable failure.   To make this visible to tests and to 
> users, we need to make a change to the bulkload call's interface on 
> HRegionServer.  The change should make successful rpcs return true, 
> recoverable failures return false, and unrecoverable failures throw an 
> IOException.  This is an RPC change, so it would be really good to get this 
> api right before the final 0.92 goes out.





[jira] [Commented] (HBASE-4718) Backport HBASE-4552 to 0.90 branch.

2011-11-02 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142691#comment-13142691
 ] 

Jonathan Hsieh commented on HBASE-4718:
---

Here are the results of running the relevant unit tests on 0.90.x.  I'm 
fairly confident that only known flakies could fail on the full run; I will 
post any anomalies.

{code}

~/proj/hbase-0.90$ mvn test -Dtest=TestLoadIncrementalHFilesSplitRecovery,TestHRegionServerBulkLoad



---
 T E S T S
---
Running org.apache.hadoop.hbase.regionserver.TestHRegionServerBulkLoad
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 47.143 sec
Running org.apache.hadoop.hbase.mapreduce.TestLoadIncrementalHFilesSplitRecovery
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 72.985 sec

Results :

Tests run: 4, Failures: 0, Errors: 0, Skipped: 0
{code}

> Backport HBASE-4552 to 0.90 branch.
> ---
>
> Key: HBASE-4718
> URL: https://issues.apache.org/jira/browse/HBASE-4718
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.4
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
> Attachments: hbase-4718.0.90.patch
>
>
> In discussion of HBASE-4552 / HBASE-4677 there has been some discussion about 
> whether and how to backport HBASE-4552 to the 0.90 branch.  This is 
> potentially compatibility-breaking, so several approaches have been suggested.
> 1) provide patch but do not integrate
> 2) integrate patch that extends and deprecates old api without removing old 
> api.  It has been argued that clients are supposed to use the 
> LoadIncrementalHFiles api and not the internal HRegionServer RPC api.





[jira] [Commented] (HBASE-4718) Backport HBASE-4552 to 0.90 branch.

2011-11-02 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142658#comment-13142658
 ] 

Jonathan Hsieh commented on HBASE-4718:
---

I have verified that stock 0.90.4's non-atomic bulk import still works against 
a standalone cluster running 0.90.5-snapshot including the HBASE-4718 patch 
(combined backport of HBASE-4552/HBASE-4716) and the HBASE-3316 patch (a 
separate but trivial backport).

> Backport HBASE-4552 to 0.90 branch.
> ---
>
> Key: HBASE-4718
> URL: https://issues.apache.org/jira/browse/HBASE-4718
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.4
> Reporter: Jonathan Hsieh
> Assignee: Jonathan Hsieh
> Attachments: hbase-4718.0.90.patch
>
>
> In the discussion of HBASE-4552 / HBASE-4677 there has been some debate about
> whether and how to backport HBASE-4552 to the 0.90 branch.  This is a
> potentially compatibility-breaking change, so several approaches have been
> suggested.
> 1) provide patch but do not integrate
> 2) integrate patch that extends and deprecates old api without removing old
> api.  It has been argued that clients are supposed to use the
> LoadIncrementalHFiles api and not the internal HRegionServer RPC api.





[jira] [Commented] (HBASE-4718) Backport HBASE-4552 to 0.90 branch.

2011-11-01 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13141449#comment-13141449
 ] 

Jonathan Hsieh commented on HBASE-4718:
---

An initial backport was done by apurtell.  I've taken it and made it work 
against 0.90.  It requires a backport of HBASE-3316.  Before I submit, I would 
like to test cross-version RPC to verify compatibility or reasonable warning 
messages.

If it is decided not to integrate, I will post the patch after testing.
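
As a rough illustration of what a "reasonable warning message" on a cross-version RPC mismatch could look like, here is a minimal sketch. Everything in it is an assumption for illustration: RpcVersionCheckSketch, VersionMismatchException, the protocol name, and the version numbers are made up and are not the Hadoop or HBase RPC classes.

```java
/** Hypothetical version handshake; names are illustrative, not Hadoop/HBase RPC classes. */
final class RpcVersionCheckSketch {
    /** Thrown when client and server disagree on the protocol version. */
    static final class VersionMismatchException extends RuntimeException {
        VersionMismatchException(String message) {
            super(message);
        }
    }

    /** Fails fast with a descriptive message when the two versions differ. */
    static void checkVersion(String protocol, long clientVersion, long serverVersion) {
        if (clientVersion != serverVersion) {
            throw new VersionMismatchException(
                "Protocol " + protocol + " version mismatch: client=" + clientVersion
                    + ", server=" + serverVersion);
        }
    }

    public static void main(String[] args) {
        // The version numbers below are arbitrary example values.
        checkVersion("ExampleRegionProtocol", 28L, 28L); // same version: passes silently
        try {
            checkVersion("ExampleRegionProtocol", 28L, 29L); // bumped server: rejected
        } catch (VersionMismatchException e) {
            System.out.println(e.getMessage());
        }
    }
}
```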

> Backport HBASE-4552 to 0.90 branch.
> ---
>
> Key: HBASE-4718
> URL: https://issues.apache.org/jira/browse/HBASE-4718
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.4
> Reporter: Jonathan Hsieh
>
> In the discussion of HBASE-4552 / HBASE-4677 there has been some debate about
> whether and how to backport HBASE-4552 to the 0.90 branch.  This is a
> potentially compatibility-breaking change, so several approaches have been
> suggested.
> 1) provide patch but do not integrate
> 2) integrate patch that extends and deprecates old api without removing old
> api.  It has been argued that clients are supposed to use the
> LoadIncrementalHFiles api and not the internal HRegionServer RPC api.





[jira] [Commented] (HBASE-4677) Remove old single bulkLoadHFile method

2011-10-31 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13140742#comment-13140742
 ] 

Jonathan Hsieh commented on HBASE-4677:
---

I think in this case the api was flawed and the only real way to fix it is to
extend it. If this gets backported to 0.90.5 we'll keep the old API call to
maintain compatibility.

0.90 and 0.92 are different major versions so the apis can change.

> Remove old single bulkLoadHFile method
> --
>
> Key: HBASE-4677
> URL: https://issues.apache.org/jira/browse/HBASE-4677
> Project: HBase
> Issue Type: Sub-task
> Components: regionserver
> Reporter: Jonathan Hsieh
> Assignee: Jonathan Hsieh
> Fix For: 0.92.0
>
> Attachments: hbase-4677.patch
>
>
> In review for HBASE-4649, there is some debate as to whether to remove,
> deprecate, or leave the HRegionServer.bulkLoadHFile method (see
> https://reviews.apache.org/r/2545/).  This jira will take care of that for
> the 0.92 and trunk releases, and allow the same patch to remain for an
> optional 0.90.x patch.





[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.

2011-10-31 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13140533#comment-13140533
 ] 

Jonathan Hsieh commented on HBASE-4377:
---

Addressed most of stack's comments:

* Removed try-catch from deleteTable.
* Updated comment related issues.
* Renamed splits in populateTable to values (splits is for region splits, the 
latter is for creating values.)
* Have separate patch for filling in holes.
* Removed setTableName and added internal check code to getTableName().
* Refactored the sidelining function to check rename returns.

I'm going to punt on these two.

* HRegion creation was done manually because the version that existed attempted 
to open stores and I didn't want or need that.
* MetaReader was not used because at the time I was trying to figure out the 
different table existence semantics in 0.90 vs trunk.   
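
The "check rename returns" item above refers to a common pitfall: rename calls that report failure through a boolean result rather than an exception. A minimal sketch of the idea, assuming local files via java.io.File instead of HDFS (SidelineSketch and its sideline method are hypothetical names, not the hbck code):

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

/** Hypothetical sidelining helper using java.io.File; hbck itself works against HDFS. */
final class SidelineSketch {
    /** Moves src into sidelineDir and fails loudly if the rename reports false. */
    static File sideline(File src, File sidelineDir) throws IOException {
        if (!sidelineDir.isDirectory() && !sidelineDir.mkdirs()) {
            throw new IOException("Could not create sideline dir " + sidelineDir);
        }
        File dst = new File(sidelineDir, src.getName());
        // renameTo signals failure through its boolean result, not an exception.
        if (!src.renameTo(dst)) {
            throw new IOException("Failed to sideline " + src + " to " + dst);
        }
        return dst;
    }

    public static void main(String[] args) throws IOException {
        File dir = Files.createTempDirectory("sideline-demo").toFile();
        File region = new File(dir, "region-a");
        if (!region.createNewFile()) {
            throw new IOException("Could not create " + region);
        }
        System.out.println("Sidelined to " + sideline(region, new File(dir, "sidelined")));
    }
}
```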


> [hbck] Offline rebuild .META. from fs data only.
> 
>
> Key: HBASE-4377
> URL: https://issues.apache.org/jira/browse/HBASE-4377
> Project: HBase
> Issue Type: New Feature
> Affects Versions: 0.92.0
> Reporter: Jonathan Hsieh
> Assignee: Jonathan Hsieh
> Attachments: 
> 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90-v4.patch, 
> 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90.v3.patch, 
> 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, 
> 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.trunk.v3.patch, 
> 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v1.patch, 
> 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v2.patch, 
> EXT_AC.regioninfo, EXT_ATU_05f84d32cbc0bdabf00e00bc2f3570f0.regioninfo, 
> hbase-4377-trunk.v2.patch, hbase-4377.trunk.v3.txt, hbase-4377.trunk.v4.txt, 
> hbase-4377.trunk.v5.txt
>
>
> In a worst-case situation, it may be helpful to have an offline .META.
> rebuilder that just looks at the file system's .regioninfos and rebuilds meta
> from scratch.  Users could move bad regions out until there is a clean
> rebuild.
> It would likely fill in region split holes.  Follow-on work could give
> options to merge or select regions that overlap, or do online rebuilds.
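
To make "region split holes" and "regions that overlap" concrete, here is a minimal sketch of a chain check over region key ranges. It is an illustration under assumptions, not the hbck implementation: RegionChainSketch and Region are hypothetical, keys are plain Strings rather than byte arrays, and an empty key stands for the start or end of the table.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Hypothetical region-chain check; real HBase keys are byte arrays, Strings keep this short. */
final class RegionChainSketch {
    static final class Region {
        final String startKey; // "" means start of table
        final String endKey;   // "" means end of table

        Region(String startKey, String endKey) {
            this.startKey = startKey;
            this.endKey = endKey;
        }
    }

    /** Walks regions in start-key order and reports holes and overlaps. */
    static List<String> findProblems(List<Region> regions) {
        List<Region> sorted = new ArrayList<>(regions);
        sorted.sort(Comparator.comparing((Region r) -> r.startKey));
        List<String> problems = new ArrayList<>();
        String coveredUpTo = "";      // the key range covered so far ends here
        boolean coveredToEnd = false; // true once a region with endKey "" was seen
        for (Region r : sorted) {
            if (coveredToEnd || r.startKey.compareTo(coveredUpTo) < 0) {
                problems.add("overlap at " + r.startKey);
            } else if (r.startKey.compareTo(coveredUpTo) > 0) {
                problems.add("hole [" + coveredUpTo + "," + r.startKey + ")");
            }
            if (r.endKey.isEmpty()) {
                coveredToEnd = true;
            } else if (r.endKey.compareTo(coveredUpTo) > 0) {
                coveredUpTo = r.endKey;
            }
        }
        if (!coveredToEnd) {
            problems.add("hole [" + coveredUpTo + ",end of table)");
        }
        return problems;
    }

    public static void main(String[] args) {
        List<Region> regions = Arrays.asList(
            new Region("", "b"), new Region("c", "f"), new Region("e", ""));
        System.out.println(findProblems(regions));
    }
}
```

A rebuilder could run such a check after collecting the .regioninfo files and refuse to write meta, or fill in placeholder regions, until the list of problems is empty.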





[jira] [Commented] (HBASE-4552) multi-CF bulk load is not atomic across column families

2011-10-31 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13140317#comment-13140317
 ] 

Jonathan Hsieh commented on HBASE-4552:
---

@Ram, on trunk or 0.92 branches, HTableDescriptor(conf,tablename) doesn't seem 
to be in the api.  In patch v4, it seems like all the HTable constructors have 
been updated to explicitly take the configuration reference.

I'm assuming you meant HTable? 

> multi-CF bulk load is not atomic across column families
> ---
>
> Key: HBASE-4552
> URL: https://issues.apache.org/jira/browse/HBASE-4552
> Project: HBase
> Issue Type: Bug
> Components: regionserver
> Affects Versions: 0.92.0
> Reporter: Todd Lipcon
> Assignee: Jonathan Hsieh
> Fix For: 0.92.0
>
> Attachments: hbase-4552.consolidated.patch,
> hbase-4552.consolidated.v2.patch, hbase-4552.consolidated.v3.patch,
> hbase-4552.consolidated.v4.patch
>
>
> Currently the bulk load API simply imports one HFile at a time. With
> multi-column-family support, this is inappropriate, since different CFs show
> up separately. Instead, the IPC endpoint should take a map of CF -> HFiles,
> so we can online them all under a single region-wide lock.
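
The idea in the description above, loading the files for all column families under one region-wide lock, can be sketched as follows. This is a simplified in-memory illustration, not the HRegion code: AtomicBulkLoadSketch, its fields, and its methods are hypothetical, and file names stand in for real HFiles.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical in-memory region; shows all-or-nothing loading under one lock. */
final class AtomicBulkLoadSketch {
    private final ReentrantReadWriteLock regionLock = new ReentrantReadWriteLock();
    private final Map<String, List<String>> storeFiles = new HashMap<>();

    AtomicBulkLoadSketch(String... families) {
        for (String family : families) {
            storeFiles.put(family, new ArrayList<>());
        }
    }

    /** Adds every file for every family, or none of them. */
    void bulkLoad(Map<String, List<String>> familyToHFiles) {
        regionLock.writeLock().lock();
        try {
            // Validate everything first so a bad entry leaves the region untouched.
            for (String family : familyToHFiles.keySet()) {
                if (!storeFiles.containsKey(family)) {
                    throw new IllegalArgumentException("Unknown column family: " + family);
                }
            }
            for (Map.Entry<String, List<String>> e : familyToHFiles.entrySet()) {
                storeFiles.get(e.getKey()).addAll(e.getValue());
            }
        } finally {
            regionLock.writeLock().unlock();
        }
    }

    /** Readers take the read lock, so they never see a half-applied load. */
    Map<String, List<String>> snapshot() {
        regionLock.readLock().lock();
        try {
            Map<String, List<String>> copy = new HashMap<>();
            for (Map.Entry<String, List<String>> e : storeFiles.entrySet()) {
                copy.put(e.getKey(), new ArrayList<>(e.getValue()));
            }
            return copy;
        } finally {
            regionLock.readLock().unlock();
        }
    }

    public static void main(String[] args) {
        AtomicBulkLoadSketch region = new AtomicBulkLoadSketch("cf1", "cf2");
        Map<String, List<String>> load = new HashMap<>();
        load.put("cf1", Arrays.asList("hfile-a"));
        load.put("cf2", Arrays.asList("hfile-b"));
        region.bulkLoad(load);
        System.out.println(region.snapshot());
    }
}
```

With one HFile per call, a reader could observe a row with cf1 loaded but cf2 still missing; taking the whole CF -> HFiles map in one call is what makes a single lock scope possible.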





[jira] [Commented] (HBASE-4677) Remove old single bulkLoadHFile method

2011-10-31 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13140217#comment-13140217
 ] 

Jonathan Hsieh commented on HBASE-4677:
---

@Stack

I lean slightly towards removing instead of deprecating.  From those reviews, I 
was initially leaning towards deprecating until it became clear we'd need to 
bump the rpc version numbers in both cases.

The patch is broken out so it is easy to pick one path or the other.


> Remove old single bulkLoadHFile method
> --
>
> Key: HBASE-4677
> URL: https://issues.apache.org/jira/browse/HBASE-4677
> Project: HBase
> Issue Type: Sub-task
> Components: regionserver
> Reporter: Jonathan Hsieh
> Assignee: Jonathan Hsieh
> Fix For: 0.92.0
>
> Attachments: hbase-4677.patch
>
>
> In review for HBASE-4649, there is some debate as to whether to remove,
> deprecate, or leave the HRegionServer.bulkLoadHFile method (see
> https://reviews.apache.org/r/2545/).  This jira will take care of that for
> the 0.92 and trunk releases, and allow the same patch to remain for an
> optional 0.90.x patch.





[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.

2011-10-29 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13139500#comment-13139500
 ] 

Jonathan Hsieh commented on HBASE-4377:
---

I'll do an update tomorrow or Monday to address the nits and get the 0.90
version caught up again.

> [hbck] Offline rebuild .META. from fs data only.
> 
>
> Key: HBASE-4377
> URL: https://issues.apache.org/jira/browse/HBASE-4377
> Project: HBase
> Issue Type: New Feature
> Affects Versions: 0.92.0
> Reporter: Jonathan Hsieh
> Assignee: Jonathan Hsieh
> Attachments: 
> 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90-v4.patch, 
> 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90.v3.patch, 
> 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, 
> 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.trunk.v3.patch, 
> 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v1.patch, 
> 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v2.patch, 
> EXT_AC.regioninfo, EXT_ATU_05f84d32cbc0bdabf00e00bc2f3570f0.regioninfo, 
> hbase-4377-trunk.v2.patch, hbase-4377.trunk.v3.txt, hbase-4377.trunk.v4.txt, 
> hbase-4377.trunk.v5.txt
>
>
> In a worst-case situation, it may be helpful to have an offline .META.
> rebuilder that just looks at the file system's .regioninfos and rebuilds meta
> from scratch.  Users could move bad regions out until there is a clean
> rebuild.
> It would likely fill in region split holes.  Follow-on work could give
> options to merge or select regions that overlap, or do online rebuilds.





[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.

2011-10-29 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13139386#comment-13139386
 ] 

Jonathan Hsieh commented on HBASE-4377:
---

@Stack

I have a patch written that optionally handles filling in holes, but haven't 
polished it for review yet.  I'll add it after this patch gets through.  IIRC 
it adds this functionality to hbck and to the offline meta rebuilder.

> [hbck] Offline rebuild .META. from fs data only.
> 
>
> Key: HBASE-4377
> URL: https://issues.apache.org/jira/browse/HBASE-4377
> Project: HBase
> Issue Type: New Feature
> Affects Versions: 0.92.0
> Reporter: Jonathan Hsieh
> Assignee: Jonathan Hsieh
> Attachments: 
> 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90-v4.patch, 
> 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90.v3.patch, 
> 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, 
> 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.trunk.v3.patch, 
> 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v1.patch, 
> 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v2.patch, 
> EXT_AC.regioninfo, EXT_ATU_05f84d32cbc0bdabf00e00bc2f3570f0.regioninfo, 
> hbase-4377-trunk.v2.patch, hbase-4377.trunk.v3.txt, hbase-4377.trunk.v4.txt, 
> hbase-4377.trunk.v5.txt
>
>
> In a worst-case situation, it may be helpful to have an offline .META.
> rebuilder that just looks at the file system's .regioninfos and rebuilds meta
> from scratch.  Users could move bad regions out until there is a clean
> rebuild.
> It would likely fill in region split holes.  Follow-on work could give
> options to merge or select regions that overlap, or do online rebuilds.





[jira] [Commented] (HBASE-4552) multi-CF bulk load is not atomic across column families

2011-10-28 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13138977#comment-13138977
 ] 

Jonathan Hsieh commented on HBASE-4552:
---

This was due to HBASE-4634 which got committed two days ago.  The old 
getTestDir was a public method and apparently was just removed.  This will 
probably break on trunk as well.

https://github.com/apache/hbase/commit/ed21cd6c4c266f610352d76d3d4b6f5cff492a97#src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java

I think this should be replaced with getDataTestDir calls (that's what the old 
bulk load test calls to getTestDir were changed to).

> multi-CF bulk load is not atomic across column families
> ---
>
> Key: HBASE-4552
> URL: https://issues.apache.org/jira/browse/HBASE-4552
> Project: HBase
> Issue Type: Bug
> Components: regionserver
> Affects Versions: 0.92.0
> Reporter: Todd Lipcon
> Assignee: Jonathan Hsieh
> Fix For: 0.92.0
>
> Attachments: hbase-4552.consolidated.patch
>
>
> Currently the bulk load API simply imports one HFile at a time. With
> multi-column-family support, this is inappropriate, since different CFs show
> up separately. Instead, the IPC endpoint should take a map of CF -> HFiles,
> so we can online them all under a single region-wide lock.





[jira] [Commented] (HBASE-4677) Remove old single bulkLoadHFile method

2011-10-28 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13138876#comment-13138876
 ] 

Jonathan Hsieh commented on HBASE-4677:
---

Updated patch that bumps RPC version.

> Remove old single bulkLoadHFile method
> --
>
> Key: HBASE-4677
> URL: https://issues.apache.org/jira/browse/HBASE-4677
> Project: HBase
> Issue Type: Sub-task
> Components: regionserver
> Reporter: Jonathan Hsieh
> Assignee: Jonathan Hsieh
> Fix For: 0.92.0
>
> Attachments: hbase-4677.patch
>
>
> In review for HBASE-4649, there is some debate as to whether to remove,
> deprecate, or leave the HRegionServer.bulkLoadHFile method (see
> https://reviews.apache.org/r/2545/).  This jira will take care of that for
> the 0.92 and trunk releases, and allow the same patch to remain for an
> optional 0.90.x patch.





[jira] [Commented] (HBASE-4649) Add atomic bulk load function to region server

2011-10-28 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13138813#comment-13138813
 ] 

Jonathan Hsieh commented on HBASE-4649:
---

There was an extra System.out.println introduced by HBASE-4532 that was causing 
the new unit tests to fail when run from mvn (they worked fine from eclipse or 
via junit's test runner: 'bin/hbase org.junit.runner.JUnitCore 
org.apache.hadoop.hbase.regionserver.TestHRegionServerBulkLoad').

I've attached a patch there.


> Add atomic bulk load function to region server
> --
>
> Key: HBASE-4649
> URL: https://issues.apache.org/jira/browse/HBASE-4649
> Project: HBase
> Issue Type: Sub-task
> Components: regionserver
> Affects Versions: 0.90.4, 0.92.0
> Reporter: Jonathan Hsieh
> Assignee: Jonathan Hsieh
> Attachments:
> 0001-HBASE-4649-Add-atomic-bulk-load-function-to-region-s.patch,
> hbase-4649.v2.patch
>
>
> Add a method that atomically bulk loads multiple hfiles.  Row atomicity
> guarantees for multi-column-family rows require this functionality.




