Hi Michael,

I'll create a Jira task and fix the patch spacing. I can't really say much about the HRegionServer/HClient extension I'm developing, but I do think there could be a general-purpose need. For example, HBase lets you filter a scan by row key and column key. But what about actual data values? An extension could be an HRegionServer with a scanner that filters rows by column values given some WHERE criteria. Or maybe that's a bad example because that should be built directly into HBase? Another would be implementing distributed joins between tables...
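
To make the column-value idea a bit more concrete, here is a minimal sketch of the filtering step on its own. It is not HBase code: it assumes rows are plain in-memory maps from column name to byte[] value, and the names `ValueFilterSketch` and `filterRows` are hypothetical. A real extension would apply the same test inside a region server scanner.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in HBase.
public class ValueFilterSketch {

  /**
   * Keeps only the rows whose value in the given column equals the expected
   * bytes, roughly what "WHERE column = value" would mean for a scan.
   * Rows that lack the column are dropped.
   */
  public static Map<String, Map<String, byte[]>> filterRows(
      Map<String, Map<String, byte[]>> rows, String column, byte[] expected) {
    Map<String, Map<String, byte[]>> kept = new LinkedHashMap<>();
    for (Map.Entry<String, Map<String, byte[]>> row : rows.entrySet()) {
      byte[] value = row.getValue().get(column);
      if (value != null && Arrays.equals(value, expected)) {
        kept.put(row.getKey(), row.getValue());
      }
    }
    return kept;
  }

  /** Builds a one-column row; helper for the demo below. */
  private static Map<String, byte[]> row(String column, String value) {
    Map<String, byte[]> cells = new HashMap<>();
    cells.put(column, value.getBytes(StandardCharsets.UTF_8));
    return cells;
  }

  public static void main(String[] args) {
    Map<String, Map<String, byte[]>> rows = new LinkedHashMap<>();
    rows.put("row1", row("info:city", "Paris"));
    rows.put("row2", row("info:city", "Oslo"));
    rows.put("row3", row("info:city", "Paris"));

    byte[] wanted = "Paris".getBytes(StandardCharsets.UTF_8);
    // Prints the keys of the rows that matched the value criterion.
    System.out.println(filterRows(rows, "info:city", wanted).keySet());
  }
}
```

Doing the comparison on the server side, rather than in the client, would avoid shipping non-matching rows over the wire, which is the main reason to want it as a region server extension.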

I haven't had a chance to re-profile yet. I'd modified the HBase code so I could extend it, so part of the motivation for this patch was to let me revert, update, add the patch you suggested, and then re-apply the extension patch.

I've done that and will hopefully get back to profiling this afternoon.

Thanks,
James

Michael Stack wrote:
The patch looks like an improvement to me. What's the rationale for needing to extend client/server? Do you think it is of general applicability? I'd suggest making an issue and attaching a patch (file against the hbase component; it looks like your tabs are not the hadoop two-spaces convention, going by the below). We can continue discussion therein. I offer to vote for it after review and trying it locally.

St.Ack
P.S. Did HADOOP-1498, applied yesterday, change the profiling characteristics you wrote about a few days ago?


James Kennedy wrote:
For what I'm doing, I found it necessary to extend HRegionServer/HRegion/HClient for some custom functionality.

Following good Java practice, I see that the HBase code has been programmed defensively, keeping stuff private as much as possible.

However, for extensibility it would be nice if the servers/client were easy to extend.

Attached is a patch that makes several methods protected instead of private, adds getters to fields of inner classes, and makes some other modifications I found useful for some simple extension code.
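
As a rough illustration of why the visibility change matters, here is a minimal, self-contained sketch. It does not use the real HBase classes: `BaseClient`, `ExtendedClient`, `openTable`, and the reserved names "ROOT" and "META" are all hypothetical stand-ins. The point is only that a check declared protected can be reused and tightened by a subclass, whereas a private one cannot.

```java
// Hypothetical illustration only; these are not the HBase classes.
public class ExtensionSketch {

  /** Stand-in for a client whose validation hook is protected. */
  public static class BaseClient {

    // Because this is protected rather than private, a subclass can
    // call it and override it.
    protected void checkReservedTableName(String tableName) {
      // "ROOT" and "META" are placeholder reserved names for this sketch.
      if ("ROOT".equals(tableName) || "META".equals(tableName)) {
        throw new IllegalArgumentException(tableName + " is a reserved table name");
      }
    }

    public String openTable(String tableName) {
      checkReservedTableName(tableName);
      return "opened " + tableName;
    }
  }

  /** An extension that adds its own rule on top of the base check. */
  public static class ExtendedClient extends BaseClient {
    @Override
    protected void checkReservedTableName(String tableName) {
      super.checkReservedTableName(tableName); // keep the base behaviour
      if (tableName.startsWith("sys_")) {
        throw new IllegalArgumentException(tableName + " uses a reserved prefix");
      }
    }
  }

  public static void main(String[] args) {
    BaseClient client = new ExtendedClient();
    System.out.println(client.openTable("users"));
    try {
      client.openTable("sys_audit");
    } catch (IllegalArgumentException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

With the hook private, the subclass would have to copy `openTable` wholesale to change the validation, which is the kind of duplication the patch is meant to avoid.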

I didn't make this a Jira task because I wasn't sure if you guys approved of opening up the code like this but hopefully someone will find this useful.

- James K
------------------------------------------------------------------------

Index: /opt/eclipse/sandbox2/Hadoop/src/contrib/hbase/src/java/org/apache/hadoop/hbase/HClient.java
===================================================================
--- /opt/eclipse/sandbox2/Hadoop/src/contrib/hbase/src/java/org/apache/hadoop/hbase/HClient.java (revision 549130)
+++ /opt/eclipse/sandbox2/Hadoop/src/contrib/hbase/src/java/org/apache/hadoop/hbase/HClient.java (working copy)
@@ -62,7 +62,7 @@
   /*
* Data structure that holds current location for a region and its info.
    */
-  static class RegionLocation {
+  protected static class RegionLocation {
     HRegionInfo regionInfo;
     HServerAddress serverAddress;
@@ -76,6 +76,22 @@
       return "address: " + this.serverAddress.toString() + ", regioninfo: " +
         this.regionInfo;
     }
+
+    public HRegionInfo getRegionInfo() {
+        return regionInfo;
+    }
+
+    public void setRegionInfo(HRegionInfo regionInfo) {
+        this.regionInfo = regionInfo;
+    }
+
+    public HServerAddress getServerAddress() {
+        return serverAddress;
+    }
+
+    public void setServerAddress(HServerAddress serverAddress) {
+        this.serverAddress = serverAddress;
+    }
   }
      // Map tableName -> (Map startRow -> (HRegionInfo, HServerAddress)
@@ -116,7 +132,7 @@
     this.rand = new Random();
   }
-  private void handleRemoteException(RemoteException e) throws IOException {
+  protected void handleRemoteException(RemoteException e) throws IOException {
     String msg = e.getMessage();
if(e.getClassName().equals("org.apache.hadoop.hbase.InvalidColumnNameException")) {
       throw new InvalidColumnNameException(msg);
@@ -143,7 +159,7 @@
      /* Find the address of the master and connect to it
    */
-  private void checkMaster() throws MasterNotRunningException {
+  protected void checkMaster() throws MasterNotRunningException {
     if (this.master != null) {
       return;
     }
@@ -531,7 +547,7 @@
    * @param tableName - the table name to be checked
    * @throws IllegalArgumentException - if the table name is reserved
    */
-  private void checkReservedTableName(Text tableName) {
+  protected void checkReservedTableName(Text tableName) {
     if(tableName.equals(ROOT_TABLE_NAME)
         || tableName.equals(META_TABLE_NAME)) {
@@ -547,7 +563,7 @@
//////////////////////////////////////////////////////////////////////////////
   // Client API
//////////////////////////////////////////////////////////////////////////////
-
+
   /**
    * Loads information so that a table can be manipulated.
    *
@@ -558,8 +574,21 @@
     if(tableName == null || tableName.getLength() == 0) {
throw new IllegalArgumentException("table name cannot be null or zero length");
     }
-    this.tableServers = tablesToServers.get(tableName);
-    if (this.tableServers == null ) {
+    this.tableServers = getTableServers(tableName);
+  }
+
+  /**
+   * Gets the servers of the given table.
+   *
+   * @param tableName - the table to be located
+   * @throws IOException - if the table can not be located after retrying
+   */
+  protected synchronized SortedMap<Text, RegionLocation> getTableServers(Text tableName) throws IOException {
+    if(tableName == null || tableName.getLength() == 0) {
+      throw new IllegalArgumentException("table name cannot be null or zero length");
+    }
+    SortedMap<Text, RegionLocation> serverResult = tablesToServers.get(tableName);
+    if (serverResult == null ) {
       if (LOG.isDebugEnabled()) {
         LOG.debug("No servers for " + tableName + ". Doing a find...");
       }
@@ -565,8 +594,9 @@
       }
       // We don't know where the table is.
       // Load the information from meta.
-      this.tableServers = findServersForTable(tableName);
+      serverResult = findServersForTable(tableName);
     }
+    return serverResult;
   }
/*
@@ -836,7 +866,7 @@
    * @param regionServer - the server to connect to
    * @throws IOException
    */
-  synchronized HRegionInterface getHRegionConnection(HServerAddress regionServer)
+  protected synchronized HRegionInterface getHRegionConnection(HServerAddress regionServer)
       throws IOException {
// See if we already have a connection
@@ -916,7 +946,7 @@
    * @param row Row to find.
    * @return Location of row.
    */
-  synchronized RegionLocation getRegionLocation(Text row) {
+  protected synchronized RegionLocation getRegionLocation(Text row) {
     if(row == null || row.getLength() == 0) {
throw new IllegalArgumentException("row key cannot be null or zero length");
     }
@@ -1554,6 +1584,20 @@
     }
          return errCode;
+  }
+
+  /**
+   * @return the map of opened servers
+   */
+  protected TreeMap<String, HRegionInterface> getOpenServers(){
+    return servers;
+  }
+
+  /**
+   * @return the configuration for this server
+   */
+  public Configuration getConf(){
+    return conf;
   }
      /**
@@ -1565,4 +1609,5 @@
     int errCode = (new HClient(c)).doCommandLine(args);
     System.exit(errCode);
   }
+
 }
Index: /opt/eclipse/sandbox2/Hadoop/src/contrib/hbase/src/java/org/apache/hadoop/hbase/HRegion.java
===================================================================
--- /opt/eclipse/sandbox2/Hadoop/src/contrib/hbase/src/java/org/apache/hadoop/hbase/HRegion.java (revision 549130)
+++ /opt/eclipse/sandbox2/Hadoop/src/contrib/hbase/src/java/org/apache/hadoop/hbase/HRegion.java (working copy)
@@ -55,7 +55,7 @@
* regionName is a unique identifier for this HRegion. (startKey, endKey]
  * defines the keyspace for this HRegion.
  */
-class HRegion implements HConstants {
+public class HRegion implements HConstants {
   static String SPLITDIR = "splits";
   static String MERGEDIR = "merges";
   static String TMPREGION_PREFIX = "tmpregion_";
@@ -298,7 +298,7 @@
    *
    * @throws IOException
    */
-  HRegion(Path rootDir, HLog log, FileSystem fs, Configuration conf,
+  public HRegion(Path rootDir, HLog log, FileSystem fs, Configuration conf,
       HRegionInfo regionInfo, Path initialFiles)
   throws IOException {
@@ -386,7 +386,7 @@
    * This method could take some time to execute, so don't call it from a
    * time-sensitive thread.
    */
-  Vector<HStoreFile> close() throws IOException {
+  public Vector<HStoreFile> close() throws IOException {
     lock.obtainWriteLock();
     try {
       boolean shouldClose = false;
@@ -548,43 +548,43 @@
   // HRegion accessors
   //////////////////////////////////////////////////////////////////////////////

-  Text getStartKey() {
+  public Text getStartKey() {
     return regionInfo.startKey;
   }

-  Text getEndKey() {
+  public Text getEndKey() {
     return regionInfo.endKey;
   }

-  long getRegionId() {
+  public long getRegionId() {
     return regionInfo.regionId;
   }

-  Text getRegionName() {
+  public Text getRegionName() {
     return regionInfo.regionName;
   }

-  Path getRootDir() {
+  public Path getRootDir() {
     return rootDir;
   }

-  HTableDescriptor getTableDesc() {
+  public HTableDescriptor getTableDesc() {
     return regionInfo.tableDesc;
   }

-  HLog getLog() {
+  public HLog getLog() {
     return log;
   }

-  Configuration getConf() {
+  public Configuration getConf() {
     return conf;
   }

-  Path getRegionDir() {
+  public Path getRegionDir() {
     return regiondir;
   }

-  FileSystem getFilesystem() {
+  public FileSystem getFilesystem() {
     return fs;
   }
@@ -973,7 +973,7 @@
    * Return an iterator that scans over the HRegion, returning the indicated
    * columns. This Iterator must be closed by the caller.
    */
-  HInternalScannerInterface getScanner(Text[] cols, Text firstRow)
+  public HInternalScannerInterface getScanner(Text[] cols, Text firstRow)
   throws IOException {
     lock.obtainReadLock();
     try {
@@ -1011,7 +1011,7 @@
    * @return lockid
    * @see #put(long, Text, BytesWritable)
    */
-  long startUpdate(Text row) throws IOException {
+  public long startUpdate(Text row) throws IOException {
     // We obtain a per-row lock, so other clients will block while one client
     // performs an update. The read lock is released by the client calling
     // #commit or #abort or if the HRegionServer lease on the lock expires.
@@ -1029,7 +1029,7 @@
    * This method really just tests the input, then calls an internal localput()
    * method.
    */
-  void put(long lockid, Text targetCol, byte [] val) throws IOException {
+  public void put(long lockid, Text targetCol, byte [] val) throws IOException {
     if (DELETE_BYTES.compareTo(val) == 0) {
       throw new IOException("Cannot insert value: " + val);
     }
@@ -1039,7 +1039,7 @@
   /**
* Delete a value or write a value. This is a just a convenience method for put().
    */
-  void delete(long lockid, Text targetCol) throws IOException {
+  public void delete(long lockid, Text targetCol) throws IOException {
     localput(lockid, targetCol, DELETE_BYTES.get());
   }
@@ -1055,7 +1055,7 @@
    * @param val Value to enter into cell
    * @throws IOException
    */
-  void localput(final long lockid, final Text targetCol,
+  public void localput(final long lockid, final Text targetCol,
     final byte [] val)
   throws IOException {
     checkColumn(targetCol);
@@ -1090,7 +1090,7 @@
* writes associated with the given row-lock. These values have not yet
    * been placed in memcache or written to the log.
    */
-  void abort(long lockid) throws IOException {
+  public void abort(long lockid) throws IOException {
     Text row = getRowFromLock(lockid);
     if(row == null) {
       throw new LockException("No write lock for lockid " + lockid);
@@ -1124,7 +1124,7 @@
    * @param lockid Lock for row we're to commit.
    * @throws IOException
    */
-  void commit(final long lockid) throws IOException {
+  public void commit(final long lockid) throws IOException {
     // Remove the row from the pendingWrites list so
     // that repeated executions won't screw this up.
     Text row = getRowFromLock(lockid);
Index: /opt/eclipse/sandbox2/Hadoop/src/contrib/hbase/src/java/org/apache/hadoop/hbase/HRegionInfo.java
===================================================================
--- /opt/eclipse/sandbox2/Hadoop/src/contrib/hbase/src/java/org/apache/hadoop/hbase/HRegionInfo.java (revision 549130)
+++ /opt/eclipse/sandbox2/Hadoop/src/contrib/hbase/src/java/org/apache/hadoop/hbase/HRegionInfo.java (working copy)
@@ -139,6 +139,76 @@
     this.regionName.readFields(in);
     this.offLine = in.readBoolean();
   }
+
+  /**
+   * @return the endKey
+   */
+  public Text getEndKey(){
+    return endKey;
+  }
+
+  /**
+   * @param endKey the endKey to set
+   */
+  public void setEndKey(Text endKey){
+    this.endKey = endKey;
+  }
+
+  /**
+   * @return the regionId
+   */
+  public long getRegionId(){
+    return regionId;
+  }
+
+  /**
+   * @param regionId the regionId to set
+   */
+  public void setRegionId(long regionId){
+    this.regionId = regionId;
+  }
+
+  /**
+   * @return the regionName
+   */
+  public Text getRegionName(){
+    return regionName;
+  }
+
+  /**
+   * @param regionName the regionName to set
+   */
+  public void setRegionName(Text regionName){
+    this.regionName = regionName;
+  }
+
+  /**
+   * @return the startKey
+   */
+  public Text getStartKey(){
+    return startKey;
+  }
+
+  /**
+   * @param startKey the startKey to set
+   */
+  public void setStartKey(Text startKey){
+    this.startKey = startKey;
+  }
+
+  /**
+   * @return the tableDesc
+   */
+  public HTableDescriptor getTableDesc(){
+    return tableDesc;
+  }
+
+  /**
+   * @param tableDesc the tableDesc to set
+   */
+  public void setTableDesc(HTableDescriptor tableDesc){
+    this.tableDesc = tableDesc;
+  }
//////////////////////////////////////////////////////////////////////////////
   // Comparable
@@ -162,4 +232,6 @@
     // Compare end keys.
     return this.endKey.compareTo(other.endKey);
   }
+
+
 }
Index: /opt/eclipse/sandbox2/Hadoop/src/contrib/hbase/src/java/org/apache/hadoop/hbase/HRegionServer.java
===================================================================
--- /opt/eclipse/sandbox2/Hadoop/src/contrib/hbase/src/java/org/apache/hadoop/hbase/HRegionServer.java (revision 549130)
+++ /opt/eclipse/sandbox2/Hadoop/src/contrib/hbase/src/java/org/apache/hadoop/hbase/HRegionServer.java (working copy)
@@ -468,7 +468,7 @@
* Sets a flag that will cause all the HRegionServer threads to shut down
    * in an orderly fashion.
    */
-  synchronized void stop() {
+  public synchronized void stop() {
     stopRequested = true;
notifyAll(); // Wakes run() if it is sleeping
   }
@@ -1079,7 +1079,7 @@
   }
   /**
-   * Private utility method for safely obtaining an HRegion handle.
+   * Protected utility method for safely obtaining an HRegion handle.
    * @param regionName Name of online {@link HRegion} to return
    * @return {@link HRegion} for <code>regionName</code>
    * @throws NotServingRegionException
@@ -1084,7 +1084,7 @@
    * @return {@link HRegion} for <code>regionName</code>
    * @throws NotServingRegionException
    */
-  private HRegion getRegion(final Text regionName)
+  protected HRegion getRegion(final Text regionName)
   throws NotServingRegionException {
     return getRegion(regionName, false);
   }
@@ -1090,7 +1090,7 @@
   }
   /**
-   * Private utility method for safely obtaining an HRegion handle.
+   * Protected utility method for safely obtaining an HRegion handle.
    * @param regionName Name of online {@link HRegion} to return
    * @param checkRetiringRegions Set true if we're to check retiring regions
    * as well as online regions.
@@ -1097,7 +1097,7 @@
    * @return {@link HRegion} for <code>regionName</code>
    * @throws NotServingRegionException
    */
-  private HRegion getRegion(final Text regionName,
+  protected HRegion getRegion(final Text regionName,
       final boolean checkRetiringRegions)
   throws NotServingRegionException {
     HRegion region = null;

