[jira] [Commented] (HBASE-11386) Replication#table,CF config will be wrong if the table name includes namespace

2015-10-30 Thread Qianxi Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983821#comment-14983821
 ] 

Qianxi Zhang commented on HBASE-11386:
--

Thanks [~ashish singhi].
I appreciate your work!

> Replication#table,CF config will be wrong if the table name includes namespace
> --
>
> Key: HBASE-11386
> URL: https://issues.apache.org/jira/browse/HBASE-11386
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
>Priority: Critical
> Attachments: HBASE_11386_trunk_v1.patch, HBASE_11386_trunk_v2.patch
>
>
> Now we can configure the table and CF list for Replication, but I think the parse 
> will be wrong if the table name includes a namespace
> ReplicationPeer#parseTableCFsFromConfig(line 125)
> {code}
> Map<String, List<String>> tableCFsMap = null;
> // parse out (table, cf-list) pairs from tableCFsConfig
> // format: "table1:cf1,cf2;table2:cfA,cfB"
> String[] tables = tableCFsConfig.split(";");
> for (String tab : tables) {
>   // 1 ignore empty table config
>   tab = tab.trim();
>   if (tab.length() == 0) {
> continue;
>   }
>   // 2 split to "table" and "cf1,cf2"
>   //   for each table: "table:cf1,cf2" or "table"
>   String[] pair = tab.split(":");
>   String tabName = pair[0].trim();
>   if (pair.length > 2 || tabName.length() == 0) {
> LOG.error("ignore invalid tableCFs setting: " + tab);
> continue;
>   }
> {code}
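
To make the failure concrete (a minimal illustration, not code from the patch): a namespace-qualified entry produces three ':' separated pieces and is rejected by the pair.length > 2 check. One possible direction is to split on the last ':' only, though entries like "ns1:table1" without an explicit CF list would then need extra care.
{code}
String tab = "ns1:table1:cf1";
String[] pair = tab.split(":");          // ["ns1", "table1", "cf1"] -> length 3
// pair.length > 2, so this legitimate entry is logged as invalid and skipped.

// Sketch of a last-':'-wins parse (assumes the CF part is always spelled out):
int idx = tab.lastIndexOf(':');
String tabName = tab.substring(0, idx);  // "ns1:table1"
String cfList  = tab.substring(idx + 1); // "cf1"
{code}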



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-11625) Reading datablock throws "Invalid HFile block magic" and can not switch to hdfs checksum

2016-01-03 Thread Qianxi Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15080650#comment-15080650
 ] 

Qianxi Zhang commented on HBASE-11625:
--

I think this is a big problem. [~tedyu] [~apurtell]
When we use HBase checksums and the data is corrupt, the line "b = new 
HFileBlock(headerBuf, fileContext.isUseHBaseChecksum());" throws an exception 
before the checksum check "if (verifyChecksum && !validateBlockChecksum(b, 
onDiskBlock, hdrSize))" is ever reached, so the reader never gets the chance to 
switch to HDFS checksums.

{code}
        b = new HFileBlock(headerBuf, fileContext.isUseHBaseChecksum());
        onDiskBlock = new byte[b.getOnDiskSizeWithHeader() + hdrSize];
        // headerBuf is HBB
        System.arraycopy(headerBuf.array(), headerBuf.arrayOffset(), onDiskBlock, 0, hdrSize);
        nextBlockOnDiskSize =
            readAtOffset(is, onDiskBlock, hdrSize, b.getOnDiskSizeWithHeader() - hdrSize,
                true, offset + hdrSize, pread);
        onDiskSizeWithHeader = b.onDiskSizeWithoutHeader + hdrSize;
      }

      if (!fileContext.isCompressedOrEncrypted()) {
        b.assumeUncompressed();
      }

      if (verifyChecksum && !validateBlockChecksum(b, onDiskBlock, hdrSize)) {
        return null; // checksum mismatch
      }
{code}
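
One possible direction (a sketch only, not the committed fix): catch the header-parse failure and report it the same way as a checksum mismatch, so the caller can retry the read with HDFS checksums. Names follow the fragment above; catching IOException is an assumption based on the "Invalid HFile block magic" error being an IOException.
{code}
HFileBlock b;
try {
  b = new HFileBlock(headerBuf, fileContext.isUseHBaseChecksum());
} catch (IOException e) {
  // Corrupt header while HBase checksums are in use: instead of propagating,
  // report it like a checksum mismatch so the read path can switch to HDFS
  // checksum verification and retry (sketch only).
  if (verifyChecksum) {
    LOG.warn("HFile block header appears corrupt; falling back to HDFS checksums", e);
    return null;
  }
  throw e;
}
{code}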

> Reading datablock throws "Invalid HFile block magic" and can not switch to 
> hdfs checksum 
> -
>
> Key: HBASE-11625
> URL: https://issues.apache.org/jira/browse/HBASE-11625
> Project: HBase
>  Issue Type: Bug
>  Components: HFile
>Affects Versions: 0.94.21, 0.98.4, 0.98.5
>Reporter: qian wang
>Assignee: Pankaj Kumar
> Attachments: 2711de1fdf73419d9f8afc6a8b86ce64.gz
>
>
> when using hbase checksum,call readBlockDataInternal() in hfileblock.java, it 
> could happen file corruption but it only can switch to hdfs checksum 
> inputstream till validateBlockChecksum(). If the datablock's header corrupted 
> when b = new HFileBlock(),it throws the exception "Invalid HFile block magic" 
> and the rpc call fail



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-11625) Reading datablock throws "Invalid HFile block magic" and can not switch to hdfs checksum

2016-01-03 Thread Qianxi Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15080643#comment-15080643
 ] 

Qianxi Zhang commented on HBASE-11625:
--

I encountered the same problem with HBase 1.0 + Hadoop 2.6.0.
Caused by: java.io.IOException: Invalid HFile block magic: \x00\x00\x00\x00\x00\x00\x00\x00
        at org.apache.hadoop.hbase.io.hfile.BlockType.parse(BlockType.java:154)
        at org.apache.hadoop.hbase.io.hfile.BlockType.read(BlockType.java:167)
        at org.apache.hadoop.hbase.io.hfile.HFileBlock.<init>(HFileBlock.java:252)
        at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockDataInternal(HFileBlock.java:1644)
        at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockData(HFileBlock.java:1467)
        at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:430)
        at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.seekTo(HFileReaderV2.java:865)
        at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekAtOrAfter(StoreFileScanner.java:254)
        at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:156)

> Reading datablock throws "Invalid HFile block magic" and can not switch to 
> hdfs checksum 
> -
>
> Key: HBASE-11625
> URL: https://issues.apache.org/jira/browse/HBASE-11625
> Project: HBase
>  Issue Type: Bug
>  Components: HFile
>Affects Versions: 0.94.21, 0.98.4, 0.98.5
>Reporter: qian wang
>Assignee: Pankaj Kumar
> Attachments: 2711de1fdf73419d9f8afc6a8b86ce64.gz
>
>
> when using hbase checksum,call readBlockDataInternal() in hfileblock.java, it 
> could happen file corruption but it only can switch to hdfs checksum 
> inputstream till validateBlockChecksum(). If the datablock's header corrupted 
> when b = new HFileBlock(),it throws the exception "Invalid HFile block magic" 
> and the rpc call fail



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-11625) Reading datablock throws "Invalid HFile block magic" and can not switch to hdfs checksum

2016-01-04 Thread Qianxi Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15082186#comment-15082186
 ] 

Qianxi Zhang commented on HBASE-11625:
--

I agree. For now I can set "hbase.regionserver.checksum.verify" to false; it may 
cost some performance, but it is stable.
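
For reference, the workaround as an hbase-site.xml entry (this disables HBase-level checksum verification so reads rely on HDFS checksum files again, at some extra read cost):
{code}
<property>
  <name>hbase.regionserver.checksum.verify</name>
  <value>false</value>
</property>
{code}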

> Reading datablock throws "Invalid HFile block magic" and can not switch to 
> hdfs checksum 
> -
>
> Key: HBASE-11625
> URL: https://issues.apache.org/jira/browse/HBASE-11625
> Project: HBase
>  Issue Type: Bug
>  Components: HFile
>Affects Versions: 0.94.21, 0.98.4, 0.98.5, 1.0.1.1, 1.0.3
>Reporter: qian wang
>Assignee: Pankaj Kumar
> Fix For: 2.0.0
>
> Attachments: 2711de1fdf73419d9f8afc6a8b86ce64.gz, HBASE-11625.patch
>
>
> when using hbase checksum,call readBlockDataInternal() in hfileblock.java, it 
> could happen file corruption but it only can switch to hdfs checksum 
> inputstream till validateBlockChecksum(). If the datablock's header corrupted 
> when b = new HFileBlock(),it throws the exception "Invalid HFile block magic" 
> and the rpc call fail



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-14267) In Mapreduce on HBase scenario, restart in TableInputFormat will result in getting wrong data.

2015-08-20 Thread Qianxi Zhang (JIRA)
Qianxi Zhang created HBASE-14267:


 Summary: In Mapreduce on HBase scenario, restart in 
TableInputFormat will result in getting wrong data.
 Key: HBASE-14267
 URL: https://issues.apache.org/jira/browse/HBASE-14267
 Project: HBase
  Issue Type: Bug
  Components: Client, mapreduce
Affects Versions: 1.1.0.1
Reporter: Qianxi Zhang
Assignee: Qianxi Zhang


When I run a MapReduce job on HBase, I modify the row byte array obtained from 
Result.getRow(), for example by reversing it. Since my data processing is 
complicated, it takes a long time and the scanner lease in the region server 
expires.
Result#195
{code}
  public byte [] getRow() {
    if (this.row == null) {
      this.row = (this.cells == null || this.cells.length == 0) ?
          null :
          CellUtil.cloneRow(this.cells[0]);
    }
    return this.row;
  }
{code}

TableInputFormat then restarts the scan from the last successful row, but that 
row array has been modified, so it reads the wrong data.
TableRecordReaderImpl#218
{code}
    } catch (IOException e) {
      // do not retry if the exception tells us not to do so
      if (e instanceof DoNotRetryIOException) {
        throw e;
      }
      // try to handle all other IOExceptions by restarting
      // the scanner, if the second call fails, it will be rethrown
      LOG.info("recovered from " + StringUtils.stringifyException(e));
      if (lastSuccessfulRow == null) {
        LOG.warn("We are restarting the first next() invocation," +
          " if your mapper has restarted a few other times like this" +
          " then you should consider killing this job and investigate" +
          " why it's taking so long.");
      }
      if (lastSuccessfulRow == null) {
        restart(scan.getStartRow());
      } else {
        restart(lastSuccessfulRow);
        scanner.next();    // skip presumed already mapped row
      }
      value = scanner.next();
      if (value != null && value.isStale()) numStale++;
      numRestarts++;
    }
    if (value != null && value.size() > 0) {
      key.set(value.getRow());
      lastSuccessfulRow = key.get();
      return true;
    }
{code}
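
As a user-side workaround (a sketch, not part of any attached patch): copy the row before modifying it, so the array cached inside the Result, and later reused as lastSuccessfulRow, stays intact.
{code}
// In the mapper: never mutate the array returned by Result.getRow().
byte[] row = value.getRow();
byte[] myRow = java.util.Arrays.copyOf(row, row.length);  // defensive copy
// ... reverse or otherwise transform myRow, leaving the Result untouched
{code}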



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14267) In Mapreduce on HBase scenario, restart in TableInputFormat will result in getting wrong data.

2015-08-20 Thread Qianxi Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704494#comment-14704494
 ] 

Qianxi Zhang commented on HBASE-14267:
--

Hi, [~te...@apache.org] and [~stack]
What do you think about it?
I think the Result should not be modifiable. Maybe we can return cloneValue(row), 
i.e. a clone of the row; see the sketch below.
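
A minimal sketch of that clone-on-return idea for Result#getRow (illustrative only, not the attached patch; it trades an extra allocation per call for safety):
{code}
  public byte [] getRow() {
    if (this.cells == null || this.cells.length == 0) {
      return null;
    }
    // Always hand out a fresh copy so callers cannot corrupt the row that
    // the scanner-restart logic later depends on.
    return CellUtil.cloneRow(this.cells[0]);
  }
{code}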

> In Mapreduce on HBase scenario, restart in TableInputFormat will result in 
> getting wrong data.
> --
>
> Key: HBASE-14267
> URL: https://issues.apache.org/jira/browse/HBASE-14267
> Project: HBase
>  Issue Type: Bug
>  Components: Client, mapreduce
>Affects Versions: 1.1.0.1
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
>
> When I run a mapreduce job on HBase, I will modify the row got from 
> Result.getRow(), for example, reverse the row. Since my program is very 
> complicated to handle data, it takes long time, and the lease int Region 
> server expired. 
> Result#195
> {code}
>   public byte [] getRow() {
> if (this.row == null) {
>   this.row = (this.cells == null || this.cells.length == 0) ?
>   null :
>   CellUtil.cloneRow(this.cells[0]);
> }
> return this.row;
>   }
> {code}
> TableInputFormat will restart the scan from last row, but the row has been 
> modified, so it will read wrong data.
> TableRecordReaderImpl#218
> {code}
>   } catch (IOException e) {
> // do not retry if the exception tells us not to do so
> if (e instanceof DoNotRetryIOException) {
>   throw e;
> }
> // try to handle all other IOExceptions by restarting
> // the scanner, if the second call fails, it will be rethrown
> LOG.info("recovered from " + StringUtils.stringifyException(e));
> if (lastSuccessfulRow == null) {
>   LOG.warn("We are restarting the first next() invocation," +
>   " if your mapper has restarted a few other times like this" +
>   " then you should consider killing this job and investigate" +
>   " why it's taking so long.");
> }
> if (lastSuccessfulRow == null) {
>   restart(scan.getStartRow());
> } else {
>   restart(lastSuccessfulRow);
>   scanner.next();// skip presumed already mapped row
> }
> value = scanner.next();
> if (value != null && value.isStale()) numStale++;
> numRestarts++;
>   }
>   if (value != null && value.size() > 0) {
> key.set(value.getRow());
> lastSuccessfulRow = key.get();
> return true;
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14267) In Mapreduce on HBase scenario, restart in TableInputFormat will result in getting wrong data.

2015-08-20 Thread Qianxi Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qianxi Zhang updated HBASE-14267:
-
Affects Version/s: (was: 1.1.0.1)
   Status: Patch Available  (was: Open)

> In Mapreduce on HBase scenario, restart in TableInputFormat will result in 
> getting wrong data.
> --
>
> Key: HBASE-14267
> URL: https://issues.apache.org/jira/browse/HBASE-14267
> Project: HBase
>  Issue Type: Bug
>  Components: Client, mapreduce
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
>
> When I run a mapreduce job on HBase, I will modify the row got from 
> Result.getRow(), for example, reverse the row. Since my program is very 
> complicated to handle data, it takes long time, and the lease int Region 
> server expired. 
> Result#195
> {code}
>   public byte [] getRow() {
> if (this.row == null) {
>   this.row = (this.cells == null || this.cells.length == 0) ?
>   null :
>   CellUtil.cloneRow(this.cells[0]);
> }
> return this.row;
>   }
> {code}
> TableInputFormat will restart the scan from last row, but the row has been 
> modified, so it will read wrong data.
> TableRecordReaderImpl#218
> {code}
>   } catch (IOException e) {
> // do not retry if the exception tells us not to do so
> if (e instanceof DoNotRetryIOException) {
>   throw e;
> }
> // try to handle all other IOExceptions by restarting
> // the scanner, if the second call fails, it will be rethrown
> LOG.info("recovered from " + StringUtils.stringifyException(e));
> if (lastSuccessfulRow == null) {
>   LOG.warn("We are restarting the first next() invocation," +
>   " if your mapper has restarted a few other times like this" +
>   " then you should consider killing this job and investigate" +
>   " why it's taking so long.");
> }
> if (lastSuccessfulRow == null) {
>   restart(scan.getStartRow());
> } else {
>   restart(lastSuccessfulRow);
>   scanner.next();// skip presumed already mapped row
> }
> value = scanner.next();
> if (value != null && value.isStale()) numStale++;
> numRestarts++;
>   }
>   if (value != null && value.size() > 0) {
> key.set(value.getRow());
> lastSuccessfulRow = key.get();
> return true;
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14267) In Mapreduce on HBase scenario, restart in TableInputFormat will result in getting wrong data.

2015-08-20 Thread Qianxi Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qianxi Zhang updated HBASE-14267:
-
Status: Open  (was: Patch Available)

> In Mapreduce on HBase scenario, restart in TableInputFormat will result in 
> getting wrong data.
> --
>
> Key: HBASE-14267
> URL: https://issues.apache.org/jira/browse/HBASE-14267
> Project: HBase
>  Issue Type: Bug
>  Components: Client, mapreduce
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
> Attachments: HBASE_14267_trunk_v1.patch
>
>
> When I run a mapreduce job on HBase, I will modify the row got from 
> Result.getRow(), for example, reverse the row. Since my program is very 
> complicated to handle data, it takes long time, and the lease int Region 
> server expired. 
> Result#195
> {code}
>   public byte [] getRow() {
> if (this.row == null) {
>   this.row = (this.cells == null || this.cells.length == 0) ?
>   null :
>   CellUtil.cloneRow(this.cells[0]);
> }
> return this.row;
>   }
> {code}
> TableInputFormat will restart the scan from last row, but the row has been 
> modified, so it will read wrong data.
> TableRecordReaderImpl#218
> {code}
>   } catch (IOException e) {
> // do not retry if the exception tells us not to do so
> if (e instanceof DoNotRetryIOException) {
>   throw e;
> }
> // try to handle all other IOExceptions by restarting
> // the scanner, if the second call fails, it will be rethrown
> LOG.info("recovered from " + StringUtils.stringifyException(e));
> if (lastSuccessfulRow == null) {
>   LOG.warn("We are restarting the first next() invocation," +
>   " if your mapper has restarted a few other times like this" +
>   " then you should consider killing this job and investigate" +
>   " why it's taking so long.");
> }
> if (lastSuccessfulRow == null) {
>   restart(scan.getStartRow());
> } else {
>   restart(lastSuccessfulRow);
>   scanner.next();// skip presumed already mapped row
> }
> value = scanner.next();
> if (value != null && value.isStale()) numStale++;
> numRestarts++;
>   }
>   if (value != null && value.size() > 0) {
> key.set(value.getRow());
> lastSuccessfulRow = key.get();
> return true;
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14267) In Mapreduce on HBase scenario, restart in TableInputFormat will result in getting wrong data.

2015-08-20 Thread Qianxi Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qianxi Zhang updated HBASE-14267:
-
Attachment: HBASE_14267_trunk_v1.patch

> In Mapreduce on HBase scenario, restart in TableInputFormat will result in 
> getting wrong data.
> --
>
> Key: HBASE-14267
> URL: https://issues.apache.org/jira/browse/HBASE-14267
> Project: HBase
>  Issue Type: Bug
>  Components: Client, mapreduce
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
> Attachments: HBASE_14267_trunk_v1.patch
>
>
> When I run a mapreduce job on HBase, I will modify the row got from 
> Result.getRow(), for example, reverse the row. Since my program is very 
> complicated to handle data, it takes long time, and the lease int Region 
> server expired. 
> Result#195
> {code}
>   public byte [] getRow() {
> if (this.row == null) {
>   this.row = (this.cells == null || this.cells.length == 0) ?
>   null :
>   CellUtil.cloneRow(this.cells[0]);
> }
> return this.row;
>   }
> {code}
> TableInputFormat will restart the scan from last row, but the row has been 
> modified, so it will read wrong data.
> TableRecordReaderImpl#218
> {code}
>   } catch (IOException e) {
> // do not retry if the exception tells us not to do so
> if (e instanceof DoNotRetryIOException) {
>   throw e;
> }
> // try to handle all other IOExceptions by restarting
> // the scanner, if the second call fails, it will be rethrown
> LOG.info("recovered from " + StringUtils.stringifyException(e));
> if (lastSuccessfulRow == null) {
>   LOG.warn("We are restarting the first next() invocation," +
>   " if your mapper has restarted a few other times like this" +
>   " then you should consider killing this job and investigate" +
>   " why it's taking so long.");
> }
> if (lastSuccessfulRow == null) {
>   restart(scan.getStartRow());
> } else {
>   restart(lastSuccessfulRow);
>   scanner.next();// skip presumed already mapped row
> }
> value = scanner.next();
> if (value != null && value.isStale()) numStale++;
> numRestarts++;
>   }
>   if (value != null && value.size() > 0) {
> key.set(value.getRow());
> lastSuccessfulRow = key.get();
> return true;
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14267) In Mapreduce on HBase scenario, restart in TableInputFormat will result in getting wrong data.

2015-08-20 Thread Qianxi Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704531#comment-14704531
 ] 

Qianxi Zhang commented on HBASE-14267:
--

But I think we should provide this safeguard on the HBase side rather than leave 
it to the user, or at least offer API documentation for it.

> In Mapreduce on HBase scenario, restart in TableInputFormat will result in 
> getting wrong data.
> --
>
> Key: HBASE-14267
> URL: https://issues.apache.org/jira/browse/HBASE-14267
> Project: HBase
>  Issue Type: Bug
>  Components: Client, mapreduce
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
> Attachments: HBASE_14267_trunk_v1.patch
>
>
> When I run a mapreduce job on HBase, I will modify the row got from 
> Result.getRow(), for example, reverse the row. Since my program is very 
> complicated to handle data, it takes long time, and the lease int Region 
> server expired. 
> Result#195
> {code}
>   public byte [] getRow() {
> if (this.row == null) {
>   this.row = (this.cells == null || this.cells.length == 0) ?
>   null :
>   CellUtil.cloneRow(this.cells[0]);
> }
> return this.row;
>   }
> {code}
> TableInputFormat will restart the scan from last row, but the row has been 
> modified, so it will read wrong data.
> TableRecordReaderImpl#218
> {code}
>   } catch (IOException e) {
> // do not retry if the exception tells us not to do so
> if (e instanceof DoNotRetryIOException) {
>   throw e;
> }
> // try to handle all other IOExceptions by restarting
> // the scanner, if the second call fails, it will be rethrown
> LOG.info("recovered from " + StringUtils.stringifyException(e));
> if (lastSuccessfulRow == null) {
>   LOG.warn("We are restarting the first next() invocation," +
>   " if your mapper has restarted a few other times like this" +
>   " then you should consider killing this job and investigate" +
>   " why it's taking so long.");
> }
> if (lastSuccessfulRow == null) {
>   restart(scan.getStartRow());
> } else {
>   restart(lastSuccessfulRow);
>   scanner.next();// skip presumed already mapped row
> }
> value = scanner.next();
> if (value != null && value.isStale()) numStale++;
> numRestarts++;
>   }
>   if (value != null && value.size() > 0) {
> key.set(value.getRow());
> lastSuccessfulRow = key.get();
> return true;
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14267) In Mapreduce on HBase scenario, restart in TableInputFormat will result in getting wrong data.

2015-08-20 Thread Qianxi Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704532#comment-14704532
 ] 

Qianxi Zhang commented on HBASE-14267:
--

But I think we should provide this safeguard on the HBase side rather than leave 
it to the user, or at least offer API documentation for it.

> In Mapreduce on HBase scenario, restart in TableInputFormat will result in 
> getting wrong data.
> --
>
> Key: HBASE-14267
> URL: https://issues.apache.org/jira/browse/HBASE-14267
> Project: HBase
>  Issue Type: Bug
>  Components: Client, mapreduce
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
> Attachments: HBASE_14267_trunk_v1.patch
>
>
> When I run a mapreduce job on HBase, I will modify the row got from 
> Result.getRow(), for example, reverse the row. Since my program is very 
> complicated to handle data, it takes long time, and the lease int Region 
> server expired. 
> Result#195
> {code}
>   public byte [] getRow() {
> if (this.row == null) {
>   this.row = (this.cells == null || this.cells.length == 0) ?
>   null :
>   CellUtil.cloneRow(this.cells[0]);
> }
> return this.row;
>   }
> {code}
> TableInputFormat will restart the scan from last row, but the row has been 
> modified, so it will read wrong data.
> TableRecordReaderImpl#218
> {code}
>   } catch (IOException e) {
> // do not retry if the exception tells us not to do so
> if (e instanceof DoNotRetryIOException) {
>   throw e;
> }
> // try to handle all other IOExceptions by restarting
> // the scanner, if the second call fails, it will be rethrown
> LOG.info("recovered from " + StringUtils.stringifyException(e));
> if (lastSuccessfulRow == null) {
>   LOG.warn("We are restarting the first next() invocation," +
>   " if your mapper has restarted a few other times like this" +
>   " then you should consider killing this job and investigate" +
>   " why it's taking so long.");
> }
> if (lastSuccessfulRow == null) {
>   restart(scan.getStartRow());
> } else {
>   restart(lastSuccessfulRow);
>   scanner.next();// skip presumed already mapped row
> }
> value = scanner.next();
> if (value != null && value.isStale()) numStale++;
> numRestarts++;
>   }
>   if (value != null && value.size() > 0) {
> key.set(value.getRow());
> lastSuccessfulRow = key.get();
> return true;
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14267) In Mapreduce on HBase scenario, restart in TableInputFormat will result in getting wrong data.

2015-08-20 Thread Qianxi Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qianxi Zhang updated HBASE-14267:
-
Status: Patch Available  (was: Open)

> In Mapreduce on HBase scenario, restart in TableInputFormat will result in 
> getting wrong data.
> --
>
> Key: HBASE-14267
> URL: https://issues.apache.org/jira/browse/HBASE-14267
> Project: HBase
>  Issue Type: Bug
>  Components: Client, mapreduce
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
> Attachments: HBASE_14267_trunk_v1.patch
>
>
> When I run a mapreduce job on HBase, I will modify the row got from 
> Result.getRow(), for example, reverse the row. Since my program is very 
> complicated to handle data, it takes long time, and the lease int Region 
> server expired. 
> Result#195
> {code}
>   public byte [] getRow() {
> if (this.row == null) {
>   this.row = (this.cells == null || this.cells.length == 0) ?
>   null :
>   CellUtil.cloneRow(this.cells[0]);
> }
> return this.row;
>   }
> {code}
> TableInputFormat will restart the scan from last row, but the row has been 
> modified, so it will read wrong data.
> TableRecordReaderImpl#218
> {code}
>   } catch (IOException e) {
> // do not retry if the exception tells us not to do so
> if (e instanceof DoNotRetryIOException) {
>   throw e;
> }
> // try to handle all other IOExceptions by restarting
> // the scanner, if the second call fails, it will be rethrown
> LOG.info("recovered from " + StringUtils.stringifyException(e));
> if (lastSuccessfulRow == null) {
>   LOG.warn("We are restarting the first next() invocation," +
>   " if your mapper has restarted a few other times like this" +
>   " then you should consider killing this job and investigate" +
>   " why it's taking so long.");
> }
> if (lastSuccessfulRow == null) {
>   restart(scan.getStartRow());
> } else {
>   restart(lastSuccessfulRow);
>   scanner.next();// skip presumed already mapped row
> }
> value = scanner.next();
> if (value != null && value.isStale()) numStale++;
> numRestarts++;
>   }
>   if (value != null && value.size() > 0) {
> key.set(value.getRow());
> lastSuccessfulRow = key.get();
> return true;
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14267) In Mapreduce on HBase scenario, restart in TableInputFormat will result in getting wrong data.

2015-08-20 Thread Qianxi Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704572#comment-14704572
 ] 

Qianxi Zhang commented on HBASE-14267:
--

IMO, this problem is serious and easy to miss. Users should know the data must not 
be modified.
In our use case the bug stayed hidden: it only shows up once the scanner lease 
expires, otherwise it does not.

There are three ways to solve this problem (see the sketch after this list):
1. disallow modifying the row returned by Result.getRow();
2. return a clone of the data;
3. document in the API that the returned data must not be modified.
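
A possible wording for option 3 (a hypothetical Javadoc note, not taken from the codebase):
{code}
  /**
   * Returns the row key of this Result.
   * <p>
   * The returned array is backed by this Result's cells and is also reused by
   * TableRecordReaderImpl as the restart position (lastSuccessfulRow). Callers
   * must NOT modify it; copy it first (e.g. Arrays.copyOf) before any in-place
   * transformation.
   */
  public byte [] getRow() { ... }
{code}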


> In Mapreduce on HBase scenario, restart in TableInputFormat will result in 
> getting wrong data.
> --
>
> Key: HBASE-14267
> URL: https://issues.apache.org/jira/browse/HBASE-14267
> Project: HBase
>  Issue Type: Bug
>  Components: Client, mapreduce
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
> Attachments: HBASE_14267_trunk_v1.patch
>
>
> When I run a mapreduce job on HBase, I will modify the row got from 
> Result.getRow(), for example, reverse the row. Since my program is very 
> complicated to handle data, it takes long time, and the lease int Region 
> server expired. 
> Result#195
> {code}
>   public byte [] getRow() {
> if (this.row == null) {
>   this.row = (this.cells == null || this.cells.length == 0) ?
>   null :
>   CellUtil.cloneRow(this.cells[0]);
> }
> return this.row;
>   }
> {code}
> TableInputFormat will restart the scan from last row, but the row has been 
> modified, so it will read wrong data.
> TableRecordReaderImpl#218
> {code}
>   } catch (IOException e) {
> // do not retry if the exception tells us not to do so
> if (e instanceof DoNotRetryIOException) {
>   throw e;
> }
> // try to handle all other IOExceptions by restarting
> // the scanner, if the second call fails, it will be rethrown
> LOG.info("recovered from " + StringUtils.stringifyException(e));
> if (lastSuccessfulRow == null) {
>   LOG.warn("We are restarting the first next() invocation," +
>   " if your mapper has restarted a few other times like this" +
>   " then you should consider killing this job and investigate" +
>   " why it's taking so long.");
> }
> if (lastSuccessfulRow == null) {
>   restart(scan.getStartRow());
> } else {
>   restart(lastSuccessfulRow);
>   scanner.next();// skip presumed already mapped row
> }
> value = scanner.next();
> if (value != null && value.isStale()) numStale++;
> numRestarts++;
>   }
>   if (value != null && value.size() > 0) {
> key.set(value.getRow());
> lastSuccessfulRow = key.get();
> return true;
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-11490) In HBase Shell set_peer_tableCFs, if the tableCFS is null, it will be wrong

2015-08-20 Thread Qianxi Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14706272#comment-14706272
 ] 

Qianxi Zhang commented on HBASE-11490:
--

Thanks [~ashish singhi] and [~anoop.hbase].
I will modify the patch and add a test case.

> In HBase Shell set_peer_tableCFs, if the tableCFS is null, it will be wrong
> ---
>
> Key: HBASE-11490
> URL: https://issues.apache.org/jira/browse/HBASE-11490
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 0.99.0
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
>Priority: Minor
> Attachments: HBASE_11490_trunk_v1.patch
>
>
> In HBase Shell set_peer_tableCFs, if the tableCFs is null, it will throw an NPE
>  # set all tables to be replicable for a peer
> hbase> set_peer_tableCFs '1', ""
> hbase> set_peer_tableCFs '1'
> ReplicationAdmin#199
> {code}
>   public void setPeerTableCFs(String id, String tableCFs) throws 
> ReplicationException {
> this.replicationPeers.setPeerTableCFsConfig(id, tableCFs);
>   }
> {code}
> ReplicationPeersZKImpl#177
> {code}
>   public void setPeerTableCFsConfig(String id, String tableCFsStr) throws 
> ReplicationException {
> try {
>   if (!peerExists(id)) {
> throw new IllegalArgumentException("Cannot set peer tableCFs because 
> id=" + id
> + " does not exist.");
>   }
>   String tableCFsZKNode = getTableCFsNode(id);
>   byte[] tableCFs = Bytes.toBytes(tableCFsStr);
>   if (ZKUtil.checkExists(this.zookeeper, tableCFsZKNode) != -1) {
> ZKUtil.setData(this.zookeeper, tableCFsZKNode, tableCFs);
>   } else {
> ZKUtil.createAndWatch(this.zookeeper, tableCFsZKNode, tableCFs);
>   }
>   LOG.info("Peer tableCFs with id= " + id + " is now " + tableCFsStr);
> } catch (KeeperException e) {
>   throw new ReplicationException("Unable to change tableCFs of the peer 
> with id=" + id, e);
> }
>   }
> {code}
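
A minimal guard sketch for the null case (illustrative only; the NPE presumably comes from passing null on to Bytes.toBytes() further down, and the real fix plus test case may look different):
{code}
  public void setPeerTableCFs(String id, String tableCFs) throws ReplicationException {
    // Treat a missing table-CFs argument as "replicate all tables" instead of
    // handing null to the ZK layer (sketch only).
    if (tableCFs == null) {
      tableCFs = "";
    }
    this.replicationPeers.setPeerTableCFsConfig(id, tableCFs);
  }
{code}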



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14267) In Mapreduce on HBase scenario, restart in TableInputFormat will result in getting wrong data.

2015-09-05 Thread Qianxi Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14732150#comment-14732150
 ] 

Qianxi Zhang commented on HBASE-14267:
--

Thanks stack, you are right.
When the scan is restarted, TableInputFormat seeks to the last successful row, 
which has already been modified inside the Result.
TableRecordReaderImpl
{code}
    try {
      value = this.scanner.next();
      if (logScannerActivity) {
        rowcount++;
        if (rowcount >= logPerRowCount) {
          long now = System.currentTimeMillis();
          LOG.info("Mapper took " + (now - timestamp)
            + "ms to process " + rowcount + " rows");
          timestamp = now;
          rowcount = 0;
        }
      }
    } catch (IOException e) {
      // try to handle all IOExceptions by restarting
      // the scanner, if the second call fails, it will be rethrown
      LOG.info("recovered from " + StringUtils.stringifyException(e));
      if (lastSuccessfulRow == null) {
        LOG.warn("We are restarting the first next() invocation," +
          " if your mapper has restarted a few other times like this" +
          " then you should consider killing this job and investigate" +
          " why it's taking so long.");
      }
      if (lastSuccessfulRow == null) {
        restart(scan.getStartRow());
      } else {
        restart(lastSuccessfulRow);
        scanner.next();    // skip presumed already mapped row
      }
{code}

{code}
    if (value != null && value.size() > 0) {
      key.set(value.getRow());
      lastSuccessfulRow = key.get();
      lastKey = value.getRow();
      return true;
    }
{code}

lastSuccessfulRow ends up referencing the same byte array as the row inside the Result.
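
One possible reader-side fix (a sketch under the assumption that an extra copy per row is acceptable; not necessarily what the attached patch does): store an independent copy of the row as the restart position.
{code}
    if (value != null && value.size() > 0) {
      key.set(value.getRow());
      // Keep our own copy (java.util.Arrays) so a mapper that mutates the
      // array returned by Result.getRow() cannot corrupt the restart position.
      lastSuccessfulRow = Arrays.copyOf(value.getRow(), value.getRow().length);
      return true;
    }
{code}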

> In Mapreduce on HBase scenario, restart in TableInputFormat will result in 
> getting wrong data.
> --
>
> Key: HBASE-14267
> URL: https://issues.apache.org/jira/browse/HBASE-14267
> Project: HBase
>  Issue Type: Bug
>  Components: Client, mapreduce
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
> Attachments: HBASE_14267_trunk_v1.patch
>
>
> When I run a mapreduce job on HBase, I will modify the row got from 
> Result.getRow(), for example, reverse the row. Since my program is very 
> complicated to handle data, it takes long time, and the lease int Region 
> server expired. 
> Result#195
> {code}
>   public byte [] getRow() {
> if (this.row == null) {
>   this.row = (this.cells == null || this.cells.length == 0) ?
>   null :
>   CellUtil.cloneRow(this.cells[0]);
> }
> return this.row;
>   }
> {code}
> TableInputFormat will restart the scan from last row, but the row has been 
> modified, so it will read wrong data.
> TableRecordReaderImpl#218
> {code}
>   } catch (IOException e) {
> // do not retry if the exception tells us not to do so
> if (e instanceof DoNotRetryIOException) {
>   throw e;
> }
> // try to handle all other IOExceptions by restarting
> // the scanner, if the second call fails, it will be rethrown
> LOG.info("recovered from " + StringUtils.stringifyException(e));
> if (lastSuccessfulRow == null) {
>   LOG.warn("We are restarting the first next() invocation," +
>   " if your mapper has restarted a few other times like this" +
>   " then you should consider killing this job and investigate" +
>   " why it's taking so long.");
> }
> if (lastSuccessfulRow == null) {
>   restart(scan.getStartRow());
> } else {
>   restart(lastSuccessfulRow);
>   scanner.next();// skip presumed already mapped row
> }
> value = scanner.next();
> if (value != null && value.isStale()) numStale++;
> numRestarts++;
>   }
>   if (value != null && value.size() > 0) {
> key.set(value.getRow());
> lastSuccessfulRow = key.get();
> return true;
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-14391) Empty regionserver WAL will never be deleted although the coresponding regionserver has been stale

2015-09-09 Thread Qianxi Zhang (JIRA)
Qianxi Zhang created HBASE-14391:


 Summary: Empty regionserver WAL will never be deleted although the 
coresponding regionserver has been stale
 Key: HBASE-14391
 URL: https://issues.apache.org/jira/browse/HBASE-14391
 Project: HBase
  Issue Type: Bug
  Components: wal
Affects Versions: 1.0.2, 2.0.0
Reporter: Qianxi Zhang
Assignee: Qianxi Zhang


When I restarted an HBase cluster that held very little data, I found two WAL 
directories for the same host with different timestamps, which indicates that the 
old regionserver WAL directory is never deleted.





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14391) Empty regionserver WAL will never be deleted although the coresponding regionserver has been stale

2015-09-09 Thread Qianxi Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qianxi Zhang updated HBASE-14391:
-
Affects Version/s: (was: 2.0.0)

> Empty regionserver WAL will never be deleted although the coresponding 
> regionserver has been stale
> --
>
> Key: HBASE-14391
> URL: https://issues.apache.org/jira/browse/HBASE-14391
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 1.0.2
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
>
> When I restarted the hbase cluster in which there was few data, I found there 
> are two directories for one host with different timestamp which indicates 
> that the old regionserver wal directory is not deleted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14391) Empty regionserver WAL will never be deleted although the coresponding regionserver has been stale

2015-09-09 Thread Qianxi Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qianxi Zhang updated HBASE-14391:
-
Description: 
When I restarted an HBase cluster that held very little data, I found two WAL 
directories for the same host with different timestamps, which indicates that the 
old regionserver WAL directory is never deleted.
FSHLog#989
{code}
 @Override
  public void close() throws IOException {
shutdown();
final FileStatus[] files = getFiles();
if (null != files && 0 != files.length) {
  for (FileStatus file : files) {
Path p = getWALArchivePath(this.fullPathArchiveDir, file.getPath());
// Tell our listeners that a log is going to be archived.
if (!this.listeners.isEmpty()) {
  for (WALActionsListener i : this.listeners) {
i.preLogArchive(file.getPath(), p);
  }
}

if (!FSUtils.renameAndSetModifyTime(fs, file.getPath(), p)) {
  throw new IOException("Unable to rename " + file.getPath() + " to " + 
p);
}
// Tell our listeners that a log was archived.
if (!this.listeners.isEmpty()) {
  for (WALActionsListener i : this.listeners) {
i.postLogArchive(file.getPath(), p);
  }
}
  }
  LOG.debug("Moved " + files.length + " WAL file(s) to " +
FSUtils.getPath(this.fullPathArchiveDir));
}
LOG.info("Closed WAL: " + toString());
  }
{code}
When a regionserver is stopped, its WALs are archived, so its WAL directory in 
HDFS is left empty.

MasterFileSystem#252
{code}
      if (curLogFiles == null || curLogFiles.length == 0) {
        // Empty log folder. No recovery needed
        continue;
      }
{code}
The regionserver directory is not split, which makes sense, but it is never 
deleted either.

  was:
When I restarted the hbase cluster in which there was few data, I found there 
are two directories for one host with different timestamp which indicates that 
the old regionserver wal directory is not deleted.




> Empty regionserver WAL will never be deleted although the coresponding 
> regionserver has been stale
> --
>
> Key: HBASE-14391
> URL: https://issues.apache.org/jira/browse/HBASE-14391
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 1.0.2
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
>
> When I restarted the hbase cluster in which there was few data, I found there 
> are two directories for one host with different timestamp which indicates 
> that the old regionserver wal directory is not deleted.
> FHLog#989
> {code}
>  @Override
>   public void close() throws IOException {
> shutdown();
> final FileStatus[] files = getFiles();
> if (null != files && 0 != files.length) {
>   for (FileStatus file : files) {
> Path p = getWALArchivePath(this.fullPathArchiveDir, file.getPath());
> // Tell our listeners that a log is going to be archived.
> if (!this.listeners.isEmpty()) {
>   for (WALActionsListener i : this.listeners) {
> i.preLogArchive(file.getPath(), p);
>   }
> }
> if (!FSUtils.renameAndSetModifyTime(fs, file.getPath(), p)) {
>   throw new IOException("Unable to rename " + file.getPath() + " to " 
> + p);
> }
> // Tell our listeners that a log was archived.
> if (!this.listeners.isEmpty()) {
>   for (WALActionsListener i : this.listeners) {
> i.postLogArchive(file.getPath(), p);
>   }
> }
>   }
>   LOG.debug("Moved " + files.length + " WAL file(s) to " +
> FSUtils.getPath(this.fullPathArchiveDir));
> }
> LOG.info("Closed WAL: " + toString());
>   }
> {code}
> When regionserver is stopped, the hlog will be archived, so wal/regionserver 
> is empty in hdfs.
> MasterFileSystem#252
> {code}
> if (curLogFiles == null || curLogFiles.length == 0) {
> // Empty log folder. No recovery needed
> continue;
>   }
> {code}
> The regionserver directory will be not splitted, it makes sense. But it will 
> be not deleted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Work started] (HBASE-14391) Empty regionserver WAL will never be deleted although the coresponding regionserver has been stale

2015-09-09 Thread Qianxi Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HBASE-14391 started by Qianxi Zhang.

> Empty regionserver WAL will never be deleted although the coresponding 
> regionserver has been stale
> --
>
> Key: HBASE-14391
> URL: https://issues.apache.org/jira/browse/HBASE-14391
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 1.0.2
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
>
> When I restarted the hbase cluster in which there was few data, I found there 
> are two directories for one host with different timestamp which indicates 
> that the old regionserver wal directory is not deleted.
> FHLog#989
> {code}
>  @Override
>   public void close() throws IOException {
> shutdown();
> final FileStatus[] files = getFiles();
> if (null != files && 0 != files.length) {
>   for (FileStatus file : files) {
> Path p = getWALArchivePath(this.fullPathArchiveDir, file.getPath());
> // Tell our listeners that a log is going to be archived.
> if (!this.listeners.isEmpty()) {
>   for (WALActionsListener i : this.listeners) {
> i.preLogArchive(file.getPath(), p);
>   }
> }
> if (!FSUtils.renameAndSetModifyTime(fs, file.getPath(), p)) {
>   throw new IOException("Unable to rename " + file.getPath() + " to " 
> + p);
> }
> // Tell our listeners that a log was archived.
> if (!this.listeners.isEmpty()) {
>   for (WALActionsListener i : this.listeners) {
> i.postLogArchive(file.getPath(), p);
>   }
> }
>   }
>   LOG.debug("Moved " + files.length + " WAL file(s) to " +
> FSUtils.getPath(this.fullPathArchiveDir));
> }
> LOG.info("Closed WAL: " + toString());
>   }
> {code}
> When regionserver is stopped, the hlog will be archived, so wal/regionserver 
> is empty in hdfs.
> MasterFileSystem#252
> {code}
> if (curLogFiles == null || curLogFiles.length == 0) {
> // Empty log folder. No recovery needed
> continue;
>   }
> {code}
> The regionserver directory will be not splitted, it makes sense. But it will 
> be not deleted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14391) Empty regionserver WAL will never be deleted although the coresponding regionserver has been stale

2015-09-09 Thread Qianxi Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14737894#comment-14737894
 ] 

Qianxi Zhang commented on HBASE-14391:
--

Yeah, I think so. I intend to fix this bug this weekend.

> Empty regionserver WAL will never be deleted although the coresponding 
> regionserver has been stale
> --
>
> Key: HBASE-14391
> URL: https://issues.apache.org/jira/browse/HBASE-14391
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 1.0.2
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
> Attachments: WALs-leftover-dir.txt
>
>
> When I restarted the hbase cluster in which there was few data, I found there 
> are two directories for one host with different timestamp which indicates 
> that the old regionserver wal directory is not deleted.
> FHLog#989
> {code}
>  @Override
>   public void close() throws IOException {
> shutdown();
> final FileStatus[] files = getFiles();
> if (null != files && 0 != files.length) {
>   for (FileStatus file : files) {
> Path p = getWALArchivePath(this.fullPathArchiveDir, file.getPath());
> // Tell our listeners that a log is going to be archived.
> if (!this.listeners.isEmpty()) {
>   for (WALActionsListener i : this.listeners) {
> i.preLogArchive(file.getPath(), p);
>   }
> }
> if (!FSUtils.renameAndSetModifyTime(fs, file.getPath(), p)) {
>   throw new IOException("Unable to rename " + file.getPath() + " to " 
> + p);
> }
> // Tell our listeners that a log was archived.
> if (!this.listeners.isEmpty()) {
>   for (WALActionsListener i : this.listeners) {
> i.postLogArchive(file.getPath(), p);
>   }
> }
>   }
>   LOG.debug("Moved " + files.length + " WAL file(s) to " +
> FSUtils.getPath(this.fullPathArchiveDir));
> }
> LOG.info("Closed WAL: " + toString());
>   }
> {code}
> When regionserver is stopped, the hlog will be archived, so wal/regionserver 
> is empty in hdfs.
> MasterFileSystem#252
> {code}
> if (curLogFiles == null || curLogFiles.length == 0) {
> // Empty log folder. No recovery needed
> continue;
>   }
> {code}
> The regionserver directory will be not splitted, it makes sense. But it will 
> be not deleted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14391) Empty regionserver WAL will never be deleted although the coresponding regionserver has been stale

2015-09-09 Thread Qianxi Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738067#comment-14738067
 ] 

Qianxi Zhang commented on HBASE-14391:
--

Hey, Jerry

I think there are two possible approaches (a rough sketch of the first one follows below).
1. When the regionserver stops normally (not an abort) and closes the WAL, delete 
the regionserver WAL directory (HBase 0.98 uses this method).
2. When the cluster starts, the master checks every regionserver WAL directory and 
deletes the ones that are empty.

I am inclined toward the first method.

What do you think about it?
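
For what it is worth, a rough sketch of the first approach, appended at the end of close() after the archive loop (walDir and fs stand in for whatever fields the class actually uses; this is not the attached patch):
{code}
    // After every WAL file has been archived, the per-regionserver WAL
    // directory should be empty; remove it so it does not linger forever.
    if (fs.exists(walDir) && fs.listStatus(walDir).length == 0) {
      if (!fs.delete(walDir, false)) {
        LOG.warn("Unable to delete empty WAL directory " + walDir);
      }
    }
{code}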



> Empty regionserver WAL will never be deleted although the coresponding 
> regionserver has been stale
> --
>
> Key: HBASE-14391
> URL: https://issues.apache.org/jira/browse/HBASE-14391
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 1.0.2
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
> Attachments: WALs-leftover-dir.txt
>
>
> When I restarted the hbase cluster in which there was few data, I found there 
> are two directories for one host with different timestamp which indicates 
> that the old regionserver wal directory is not deleted.
> FHLog#989
> {code}
>  @Override
>   public void close() throws IOException {
> shutdown();
> final FileStatus[] files = getFiles();
> if (null != files && 0 != files.length) {
>   for (FileStatus file : files) {
> Path p = getWALArchivePath(this.fullPathArchiveDir, file.getPath());
> // Tell our listeners that a log is going to be archived.
> if (!this.listeners.isEmpty()) {
>   for (WALActionsListener i : this.listeners) {
> i.preLogArchive(file.getPath(), p);
>   }
> }
> if (!FSUtils.renameAndSetModifyTime(fs, file.getPath(), p)) {
>   throw new IOException("Unable to rename " + file.getPath() + " to " 
> + p);
> }
> // Tell our listeners that a log was archived.
> if (!this.listeners.isEmpty()) {
>   for (WALActionsListener i : this.listeners) {
> i.postLogArchive(file.getPath(), p);
>   }
> }
>   }
>   LOG.debug("Moved " + files.length + " WAL file(s) to " +
> FSUtils.getPath(this.fullPathArchiveDir));
> }
> LOG.info("Closed WAL: " + toString());
>   }
> {code}
> When regionserver is stopped, the hlog will be archived, so wal/regionserver 
> is empty in hdfs.
> MasterFileSystem#252
> {code}
> if (curLogFiles == null || curLogFiles.length == 0) {
> // Empty log folder. No recovery needed
> continue;
>   }
> {code}
> The regionserver directory will be not splitted, it makes sense. But it will 
> be not deleted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14391) Empty regionserver WAL will never be deleted although the coresponding regionserver has been stale

2015-09-09 Thread Qianxi Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738068#comment-14738068
 ] 

Qianxi Zhang commented on HBASE-14391:
--

Hey, Jerry

I think there are two possible approaches.
1. When the regionserver stops normally (not an abort) and closes the WAL, delete 
the regionserver WAL directory (HBase 0.98 uses this method).
2. When the cluster starts, the master checks every regionserver WAL directory and 
deletes the ones that are empty.

I am inclined toward the first method.

What do you think about it?



> Empty regionserver WAL will never be deleted although the coresponding 
> regionserver has been stale
> --
>
> Key: HBASE-14391
> URL: https://issues.apache.org/jira/browse/HBASE-14391
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 1.0.2
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
> Attachments: WALs-leftover-dir.txt
>
>
> When I restarted the hbase cluster in which there was few data, I found there 
> are two directories for one host with different timestamp which indicates 
> that the old regionserver wal directory is not deleted.
> FHLog#989
> {code}
>  @Override
>   public void close() throws IOException {
> shutdown();
> final FileStatus[] files = getFiles();
> if (null != files && 0 != files.length) {
>   for (FileStatus file : files) {
> Path p = getWALArchivePath(this.fullPathArchiveDir, file.getPath());
> // Tell our listeners that a log is going to be archived.
> if (!this.listeners.isEmpty()) {
>   for (WALActionsListener i : this.listeners) {
> i.preLogArchive(file.getPath(), p);
>   }
> }
> if (!FSUtils.renameAndSetModifyTime(fs, file.getPath(), p)) {
>   throw new IOException("Unable to rename " + file.getPath() + " to " 
> + p);
> }
> // Tell our listeners that a log was archived.
> if (!this.listeners.isEmpty()) {
>   for (WALActionsListener i : this.listeners) {
> i.postLogArchive(file.getPath(), p);
>   }
> }
>   }
>   LOG.debug("Moved " + files.length + " WAL file(s) to " +
> FSUtils.getPath(this.fullPathArchiveDir));
> }
> LOG.info("Closed WAL: " + toString());
>   }
> {code}
> When regionserver is stopped, the hlog will be archived, so wal/regionserver 
> is empty in hdfs.
> MasterFileSystem#252
> {code}
> if (curLogFiles == null || curLogFiles.length == 0) {
> // Empty log folder. No recovery needed
> continue;
>   }
> {code}
> The regionserver directory will be not splitted, it makes sense. But it will 
> be not deleted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14391) Empty regionserver WAL will never be deleted although the coresponding regionserver has been stale

2015-09-09 Thread Qianxi Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738066#comment-14738066
 ] 

Qianxi Zhang commented on HBASE-14391:
--

Hey, Jerry

I think there are two possible approaches.
1. When the regionserver stops normally (not an abort) and closes the WAL, delete 
the regionserver WAL directory (HBase 0.98 uses this method).
2. When the cluster starts, the master checks every regionserver WAL directory and 
deletes the ones that are empty.

I am inclined toward the first method.

What do you think about it?



> Empty regionserver WAL will never be deleted although the coresponding 
> regionserver has been stale
> --
>
> Key: HBASE-14391
> URL: https://issues.apache.org/jira/browse/HBASE-14391
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 1.0.2
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
> Attachments: WALs-leftover-dir.txt
>
>
> When I restarted the hbase cluster in which there was few data, I found there 
> are two directories for one host with different timestamp which indicates 
> that the old regionserver wal directory is not deleted.
> FHLog#989
> {code}
>  @Override
>   public void close() throws IOException {
> shutdown();
> final FileStatus[] files = getFiles();
> if (null != files && 0 != files.length) {
>   for (FileStatus file : files) {
> Path p = getWALArchivePath(this.fullPathArchiveDir, file.getPath());
> // Tell our listeners that a log is going to be archived.
> if (!this.listeners.isEmpty()) {
>   for (WALActionsListener i : this.listeners) {
> i.preLogArchive(file.getPath(), p);
>   }
> }
> if (!FSUtils.renameAndSetModifyTime(fs, file.getPath(), p)) {
>   throw new IOException("Unable to rename " + file.getPath() + " to " 
> + p);
> }
> // Tell our listeners that a log was archived.
> if (!this.listeners.isEmpty()) {
>   for (WALActionsListener i : this.listeners) {
> i.postLogArchive(file.getPath(), p);
>   }
> }
>   }
>   LOG.debug("Moved " + files.length + " WAL file(s) to " +
> FSUtils.getPath(this.fullPathArchiveDir));
> }
> LOG.info("Closed WAL: " + toString());
>   }
> {code}
> When regionserver is stopped, the hlog will be archived, so wal/regionserver 
> is empty in hdfs.
> MasterFileSystem#252
> {code}
> if (curLogFiles == null || curLogFiles.length == 0) {
> // Empty log folder. No recovery needed
> continue;
>   }
> {code}
> The regionserver directory will be not splitted, it makes sense. But it will 
> be not deleted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14391) Empty regionserver WAL will never be deleted although the coresponding regionserver has been stale

2015-09-09 Thread Qianxi Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738064#comment-14738064
 ] 

Qianxi Zhang commented on HBASE-14391:
--

Hey, Jerry

I think there are two possible approaches.
1. When the regionserver stops normally (not an abort) and closes the WAL, delete 
the regionserver WAL directory (HBase 0.98 uses this method).
2. When the cluster starts, the master checks every regionserver WAL directory and 
deletes the ones that are empty.

I am inclined toward the first method.

What do you think about it?



> Empty regionserver WAL will never be deleted although the coresponding 
> regionserver has been stale
> --
>
> Key: HBASE-14391
> URL: https://issues.apache.org/jira/browse/HBASE-14391
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 1.0.2
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
> Attachments: WALs-leftover-dir.txt
>
>
> When I restarted the hbase cluster in which there was few data, I found there 
> are two directories for one host with different timestamp which indicates 
> that the old regionserver wal directory is not deleted.
> FHLog#989
> {code}
>  @Override
>   public void close() throws IOException {
> shutdown();
> final FileStatus[] files = getFiles();
> if (null != files && 0 != files.length) {
>   for (FileStatus file : files) {
> Path p = getWALArchivePath(this.fullPathArchiveDir, file.getPath());
> // Tell our listeners that a log is going to be archived.
> if (!this.listeners.isEmpty()) {
>   for (WALActionsListener i : this.listeners) {
> i.preLogArchive(file.getPath(), p);
>   }
> }
> if (!FSUtils.renameAndSetModifyTime(fs, file.getPath(), p)) {
>   throw new IOException("Unable to rename " + file.getPath() + " to " 
> + p);
> }
> // Tell our listeners that a log was archived.
> if (!this.listeners.isEmpty()) {
>   for (WALActionsListener i : this.listeners) {
> i.postLogArchive(file.getPath(), p);
>   }
> }
>   }
>   LOG.debug("Moved " + files.length + " WAL file(s) to " +
> FSUtils.getPath(this.fullPathArchiveDir));
> }
> LOG.info("Closed WAL: " + toString());
>   }
> {code}
> When regionserver is stopped, the hlog will be archived, so wal/regionserver 
> is empty in hdfs.
> MasterFileSystem#252
> {code}
> if (curLogFiles == null || curLogFiles.length == 0) {
> // Empty log folder. No recovery needed
> continue;
>   }
> {code}
> The regionserver directory will be not splitted, it makes sense. But it will 
> be not deleted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14391) Empty regionserver WAL will never be deleted although the coresponding regionserver has been stale

2015-09-09 Thread Qianxi Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qianxi Zhang updated HBASE-14391:
-
Attachment: HBASE_14391_trunk_v1.patch

> Empty regionserver WAL will never be deleted although the coresponding 
> regionserver has been stale
> --
>
> Key: HBASE-14391
> URL: https://issues.apache.org/jira/browse/HBASE-14391
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 1.0.2
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
> Attachments: HBASE_14391_trunk_v1.patch, WALs-leftover-dir.txt
>
>
> When I restarted an hbase cluster that contained little data, I found two WAL 
> directories for the same host with different timestamps, which indicates that 
> the old regionserver WAL directory is never deleted.
> FHLog#989
> {code}
>  @Override
>   public void close() throws IOException {
> shutdown();
> final FileStatus[] files = getFiles();
> if (null != files && 0 != files.length) {
>   for (FileStatus file : files) {
> Path p = getWALArchivePath(this.fullPathArchiveDir, file.getPath());
> // Tell our listeners that a log is going to be archived.
> if (!this.listeners.isEmpty()) {
>   for (WALActionsListener i : this.listeners) {
> i.preLogArchive(file.getPath(), p);
>   }
> }
> if (!FSUtils.renameAndSetModifyTime(fs, file.getPath(), p)) {
>   throw new IOException("Unable to rename " + file.getPath() + " to " 
> + p);
> }
> // Tell our listeners that a log was archived.
> if (!this.listeners.isEmpty()) {
>   for (WALActionsListener i : this.listeners) {
> i.postLogArchive(file.getPath(), p);
>   }
> }
>   }
>   LOG.debug("Moved " + files.length + " WAL file(s) to " +
> FSUtils.getPath(this.fullPathArchiveDir));
> }
> LOG.info("Closed WAL: " + toString());
>   }
> {code}
> When a regionserver is stopped, its hlogs are archived, so its wal/regionserver 
> directory in hdfs is left empty.
> MasterFileSystem#252
> {code}
> if (curLogFiles == null || curLogFiles.length == 0) {
> // Empty log folder. No recovery needed
> continue;
>   }
> {code}
> The regionserver directory will not be split, which makes sense, but it will 
> never be deleted either.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14391) Empty regionserver WAL will never be deleted although the coresponding regionserver has been stale

2015-09-09 Thread Qianxi Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738249#comment-14738249
 ] 

Qianxi Zhang commented on HBASE-14391:
--

Sorry Elliott, maybe I did not catch your idea. There are no failed moves; in 
fact, we did not delete the directory at all before this patch.

> Empty regionserver WAL will never be deleted although the coresponding 
> regionserver has been stale
> --
>
> Key: HBASE-14391
> URL: https://issues.apache.org/jira/browse/HBASE-14391
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 1.0.2
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
> Attachments: HBASE_14391_trunk_v1.patch, WALs-leftover-dir.txt
>
>
> When I restarted an hbase cluster that contained little data, I found two WAL 
> directories for the same host with different timestamps, which indicates that 
> the old regionserver WAL directory is never deleted.
> FHLog#989
> {code}
>  @Override
>   public void close() throws IOException {
> shutdown();
> final FileStatus[] files = getFiles();
> if (null != files && 0 != files.length) {
>   for (FileStatus file : files) {
> Path p = getWALArchivePath(this.fullPathArchiveDir, file.getPath());
> // Tell our listeners that a log is going to be archived.
> if (!this.listeners.isEmpty()) {
>   for (WALActionsListener i : this.listeners) {
> i.preLogArchive(file.getPath(), p);
>   }
> }
> if (!FSUtils.renameAndSetModifyTime(fs, file.getPath(), p)) {
>   throw new IOException("Unable to rename " + file.getPath() + " to " 
> + p);
> }
> // Tell our listeners that a log was archived.
> if (!this.listeners.isEmpty()) {
>   for (WALActionsListener i : this.listeners) {
> i.postLogArchive(file.getPath(), p);
>   }
> }
>   }
>   LOG.debug("Moved " + files.length + " WAL file(s) to " +
> FSUtils.getPath(this.fullPathArchiveDir));
> }
> LOG.info("Closed WAL: " + toString());
>   }
> {code}
> When a regionserver is stopped, its hlogs are archived, so its wal/regionserver 
> directory in hdfs is left empty.
> MasterFileSystem#252
> {code}
> if (curLogFiles == null || curLogFiles.length == 0) {
> // Empty log folder. No recovery needed
> continue;
>   }
> {code}
> The regionserver directory will not be split, which makes sense, but it will 
> never be deleted either.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14391) Empty regionserver WAL will never be deleted although the coresponding regionserver has been stale

2015-09-10 Thread Qianxi Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14739960#comment-14739960
 ] 

Qianxi Zhang commented on HBASE-14391:
--

Thanks, [~jerryhe] and [~eclark]
I will update the patch.

> Empty regionserver WAL will never be deleted although the coresponding 
> regionserver has been stale
> --
>
> Key: HBASE-14391
> URL: https://issues.apache.org/jira/browse/HBASE-14391
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 1.0.2
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
> Attachments: HBASE_14391_trunk_v1.patch, WALs-leftover-dir.txt
>
>
> When I restarted an hbase cluster that contained little data, I found two WAL 
> directories for the same host with different timestamps, which indicates that 
> the old regionserver WAL directory is never deleted.
> FHLog#989
> {code}
>  @Override
>   public void close() throws IOException {
> shutdown();
> final FileStatus[] files = getFiles();
> if (null != files && 0 != files.length) {
>   for (FileStatus file : files) {
> Path p = getWALArchivePath(this.fullPathArchiveDir, file.getPath());
> // Tell our listeners that a log is going to be archived.
> if (!this.listeners.isEmpty()) {
>   for (WALActionsListener i : this.listeners) {
> i.preLogArchive(file.getPath(), p);
>   }
> }
> if (!FSUtils.renameAndSetModifyTime(fs, file.getPath(), p)) {
>   throw new IOException("Unable to rename " + file.getPath() + " to " 
> + p);
> }
> // Tell our listeners that a log was archived.
> if (!this.listeners.isEmpty()) {
>   for (WALActionsListener i : this.listeners) {
> i.postLogArchive(file.getPath(), p);
>   }
> }
>   }
>   LOG.debug("Moved " + files.length + " WAL file(s) to " +
> FSUtils.getPath(this.fullPathArchiveDir));
> }
> LOG.info("Closed WAL: " + toString());
>   }
> {code}
> When a regionserver is stopped, its hlogs are archived, so its wal/regionserver 
> directory in hdfs is left empty.
> MasterFileSystem#252
> {code}
> if (curLogFiles == null || curLogFiles.length == 0) {
> // Empty log folder. No recovery needed
> continue;
>   }
> {code}
> The regionserver directory will not be split, which makes sense, but it will 
> never be deleted either.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14391) Empty regionserver WAL will never be deleted although the coresponding regionserver has been stale

2015-09-10 Thread Qianxi Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qianxi Zhang updated HBASE-14391:
-
Attachment: HBASE_14391_trunk_v2.patch

> Empty regionserver WAL will never be deleted although the coresponding 
> regionserver has been stale
> --
>
> Key: HBASE-14391
> URL: https://issues.apache.org/jira/browse/HBASE-14391
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 1.0.2
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
> Attachments: HBASE_14391_trunk_v1.patch, HBASE_14391_trunk_v2.patch, 
> WALs-leftover-dir.txt
>
>
> When I restarted an hbase cluster that contained little data, I found two WAL 
> directories for the same host with different timestamps, which indicates that 
> the old regionserver WAL directory is never deleted.
> FHLog#989
> {code}
>  @Override
>   public void close() throws IOException {
> shutdown();
> final FileStatus[] files = getFiles();
> if (null != files && 0 != files.length) {
>   for (FileStatus file : files) {
> Path p = getWALArchivePath(this.fullPathArchiveDir, file.getPath());
> // Tell our listeners that a log is going to be archived.
> if (!this.listeners.isEmpty()) {
>   for (WALActionsListener i : this.listeners) {
> i.preLogArchive(file.getPath(), p);
>   }
> }
> if (!FSUtils.renameAndSetModifyTime(fs, file.getPath(), p)) {
>   throw new IOException("Unable to rename " + file.getPath() + " to " 
> + p);
> }
> // Tell our listeners that a log was archived.
> if (!this.listeners.isEmpty()) {
>   for (WALActionsListener i : this.listeners) {
> i.postLogArchive(file.getPath(), p);
>   }
> }
>   }
>   LOG.debug("Moved " + files.length + " WAL file(s) to " +
> FSUtils.getPath(this.fullPathArchiveDir));
> }
> LOG.info("Closed WAL: " + toString());
>   }
> {code}
> When a regionserver is stopped, its hlogs are archived, so its wal/regionserver 
> directory in hdfs is left empty.
> MasterFileSystem#252
> {code}
> if (curLogFiles == null || curLogFiles.length == 0) {
> // Empty log folder. No recovery needed
> continue;
>   }
> {code}
> The regionserver directory will not be split, which makes sense, but it will 
> never be deleted either.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14391) Empty regionserver WAL will never be deleted although the coresponding regionserver has been stale

2015-09-13 Thread Qianxi Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qianxi Zhang updated HBASE-14391:
-
Status: Patch Available  (was: In Progress)

> Empty regionserver WAL will never be deleted although the coresponding 
> regionserver has been stale
> --
>
> Key: HBASE-14391
> URL: https://issues.apache.org/jira/browse/HBASE-14391
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 1.0.2
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
> Attachments: HBASE_14391_trunk_v1.patch, HBASE_14391_trunk_v2.patch, 
> WALs-leftover-dir.txt
>
>
> When I restarted an hbase cluster that contained little data, I found two WAL 
> directories for the same host with different timestamps, which indicates that 
> the old regionserver WAL directory is never deleted.
> FHLog#989
> {code}
>  @Override
>   public void close() throws IOException {
> shutdown();
> final FileStatus[] files = getFiles();
> if (null != files && 0 != files.length) {
>   for (FileStatus file : files) {
> Path p = getWALArchivePath(this.fullPathArchiveDir, file.getPath());
> // Tell our listeners that a log is going to be archived.
> if (!this.listeners.isEmpty()) {
>   for (WALActionsListener i : this.listeners) {
> i.preLogArchive(file.getPath(), p);
>   }
> }
> if (!FSUtils.renameAndSetModifyTime(fs, file.getPath(), p)) {
>   throw new IOException("Unable to rename " + file.getPath() + " to " 
> + p);
> }
> // Tell our listeners that a log was archived.
> if (!this.listeners.isEmpty()) {
>   for (WALActionsListener i : this.listeners) {
> i.postLogArchive(file.getPath(), p);
>   }
> }
>   }
>   LOG.debug("Moved " + files.length + " WAL file(s) to " +
> FSUtils.getPath(this.fullPathArchiveDir));
> }
> LOG.info("Closed WAL: " + toString());
>   }
> {code}
> When a regionserver is stopped, its hlogs are archived, so its wal/regionserver 
> directory in hdfs is left empty.
> MasterFileSystem#252
> {code}
> if (curLogFiles == null || curLogFiles.length == 0) {
> // Empty log folder. No recovery needed
> continue;
>   }
> {code}
> The regionserver directory will not be split, which makes sense, but it will 
> never be deleted either.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14391) Empty regionserver WAL will never be deleted although the coresponding regionserver has been stale

2015-09-20 Thread Qianxi Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14900238#comment-14900238
 ] 

Qianxi Zhang commented on HBASE-14391:
--

Of course. Thanks for the good advice.

> Empty regionserver WAL will never be deleted although the coresponding 
> regionserver has been stale
> --
>
> Key: HBASE-14391
> URL: https://issues.apache.org/jira/browse/HBASE-14391
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 1.0.2
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
> Attachments: HBASE-14391-master-v3.patch, HBASE_14391_trunk_v1.patch, 
> HBASE_14391_trunk_v2.patch, WALs-leftover-dir.txt
>
>
> When I restarted an hbase cluster that contained little data, I found two WAL 
> directories for the same host with different timestamps, which indicates that 
> the old regionserver WAL directory is never deleted.
> FHLog#989
> {code}
>  @Override
>   public void close() throws IOException {
> shutdown();
> final FileStatus[] files = getFiles();
> if (null != files && 0 != files.length) {
>   for (FileStatus file : files) {
> Path p = getWALArchivePath(this.fullPathArchiveDir, file.getPath());
> // Tell our listeners that a log is going to be archived.
> if (!this.listeners.isEmpty()) {
>   for (WALActionsListener i : this.listeners) {
> i.preLogArchive(file.getPath(), p);
>   }
> }
> if (!FSUtils.renameAndSetModifyTime(fs, file.getPath(), p)) {
>   throw new IOException("Unable to rename " + file.getPath() + " to " 
> + p);
> }
> // Tell our listeners that a log was archived.
> if (!this.listeners.isEmpty()) {
>   for (WALActionsListener i : this.listeners) {
> i.postLogArchive(file.getPath(), p);
>   }
> }
>   }
>   LOG.debug("Moved " + files.length + " WAL file(s) to " +
> FSUtils.getPath(this.fullPathArchiveDir));
> }
> LOG.info("Closed WAL: " + toString());
>   }
> {code}
> When a regionserver is stopped, its hlogs are archived, so its wal/regionserver 
> directory in hdfs is left empty.
> MasterFileSystem#252
> {code}
> if (curLogFiles == null || curLogFiles.length == 0) {
> // Empty log folder. No recovery needed
> continue;
>   }
> {code}
> The regionserver directory will not be split, which makes sense, but it will 
> never be deleted either.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14391) Empty regionserver WAL will never be deleted although the coresponding regionserver has been stale

2015-09-20 Thread Qianxi Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14900237#comment-14900237
 ] 

Qianxi Zhang commented on HBASE-14391:
--

Of course. Thanks for the good advice.

> Empty regionserver WAL will never be deleted although the coresponding 
> regionserver has been stale
> --
>
> Key: HBASE-14391
> URL: https://issues.apache.org/jira/browse/HBASE-14391
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 1.0.2
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
> Attachments: HBASE-14391-master-v3.patch, HBASE_14391_trunk_v1.patch, 
> HBASE_14391_trunk_v2.patch, WALs-leftover-dir.txt
>
>
> When I restarted an hbase cluster that contained little data, I found two WAL 
> directories for the same host with different timestamps, which indicates that 
> the old regionserver WAL directory is never deleted.
> FHLog#989
> {code}
>  @Override
>   public void close() throws IOException {
> shutdown();
> final FileStatus[] files = getFiles();
> if (null != files && 0 != files.length) {
>   for (FileStatus file : files) {
> Path p = getWALArchivePath(this.fullPathArchiveDir, file.getPath());
> // Tell our listeners that a log is going to be archived.
> if (!this.listeners.isEmpty()) {
>   for (WALActionsListener i : this.listeners) {
> i.preLogArchive(file.getPath(), p);
>   }
> }
> if (!FSUtils.renameAndSetModifyTime(fs, file.getPath(), p)) {
>   throw new IOException("Unable to rename " + file.getPath() + " to " 
> + p);
> }
> // Tell our listeners that a log was archived.
> if (!this.listeners.isEmpty()) {
>   for (WALActionsListener i : this.listeners) {
> i.postLogArchive(file.getPath(), p);
>   }
> }
>   }
>   LOG.debug("Moved " + files.length + " WAL file(s) to " +
> FSUtils.getPath(this.fullPathArchiveDir));
> }
> LOG.info("Closed WAL: " + toString());
>   }
> {code}
> When a regionserver is stopped, its hlogs are archived, so its wal/regionserver 
> directory in hdfs is left empty.
> MasterFileSystem#252
> {code}
> if (curLogFiles == null || curLogFiles.length == 0) {
> // Empty log folder. No recovery needed
> continue;
>   }
> {code}
> The regionserver directory will not be split, which makes sense, but it will 
> never be deleted either.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14391) Empty regionserver WAL will never be deleted although the coresponding regionserver has been stale

2015-09-21 Thread Qianxi Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qianxi Zhang updated HBASE-14391:
-
Attachment: HBASE_14391_master_v4.patch

> Empty regionserver WAL will never be deleted although the coresponding 
> regionserver has been stale
> --
>
> Key: HBASE-14391
> URL: https://issues.apache.org/jira/browse/HBASE-14391
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 1.0.2
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
> Attachments: HBASE-14391-master-v3.patch, 
> HBASE_14391_master_v4.patch, HBASE_14391_trunk_v1.patch, 
> HBASE_14391_trunk_v2.patch, WALs-leftover-dir.txt
>
>
> When I restarted an hbase cluster that contained little data, I found two WAL 
> directories for the same host with different timestamps, which indicates that 
> the old regionserver WAL directory is never deleted.
> FHLog#989
> {code}
>  @Override
>   public void close() throws IOException {
> shutdown();
> final FileStatus[] files = getFiles();
> if (null != files && 0 != files.length) {
>   for (FileStatus file : files) {
> Path p = getWALArchivePath(this.fullPathArchiveDir, file.getPath());
> // Tell our listeners that a log is going to be archived.
> if (!this.listeners.isEmpty()) {
>   for (WALActionsListener i : this.listeners) {
> i.preLogArchive(file.getPath(), p);
>   }
> }
> if (!FSUtils.renameAndSetModifyTime(fs, file.getPath(), p)) {
>   throw new IOException("Unable to rename " + file.getPath() + " to " 
> + p);
> }
> // Tell our listeners that a log was archived.
> if (!this.listeners.isEmpty()) {
>   for (WALActionsListener i : this.listeners) {
> i.postLogArchive(file.getPath(), p);
>   }
> }
>   }
>   LOG.debug("Moved " + files.length + " WAL file(s) to " +
> FSUtils.getPath(this.fullPathArchiveDir));
> }
> LOG.info("Closed WAL: " + toString());
>   }
> {code}
> When a regionserver is stopped, its hlogs are archived, so its wal/regionserver 
> directory in hdfs is left empty.
> MasterFileSystem#252
> {code}
> if (curLogFiles == null || curLogFiles.length == 0) {
> // Empty log folder. No recovery needed
> continue;
>   }
> {code}
> The regionserver directory will not be split, which makes sense, but it will 
> never be deleted either.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-11302) ReplicationSourceManager is not thread safe

2014-06-06 Thread Qianxi Zhang (JIRA)
Qianxi Zhang created HBASE-11302:


 Summary: ReplicationSourceManager is not thread safe
 Key: HBASE-11302
 URL: https://issues.apache.org/jira/browse/HBASE-11302
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.99.0
Reporter: Qianxi Zhang


In ReplicationSourceManager, the sources list records the replication sources for 
the peers. An entry can be removed in the removePeer method while the list is 
being read in the preLogRoll method, so access to it is not thread safe.
ReplicationSourceManager#296
{code}
  void preLogRoll(Path newLog) throws IOException {

synchronized (this.hlogsById) {
  String name = newLog.getName();
  for (ReplicationSourceInterface source : this.sources) {
try {
  this.replicationQueues.addLog(source.getPeerClusterZnode(), name);
} catch (ReplicationException e) {
  throw new IOException("Cannot add log to replication queue with id="
  + source.getPeerClusterZnode() + ", filename=" + name, e);
}
  }
  for (SortedSet<String> hlogs : this.hlogsById.values()) {
if (this.sources.isEmpty()) {
  // If there's no slaves, don't need to keep the old hlogs since
  // we only consider the last one when a new slave comes in
  hlogs.clear();
}
hlogs.add(name);
  }
}

this.latestPath = newLog;
  }
{code}

ReplicationSourceManager#392
{code}
  public void removePeer(String id) {
LOG.info("Closing the following queue " + id + ", currently have "
+ sources.size() + " and another "
+ oldsources.size() + " that were recovered");
String terminateMessage = "Replication stream was removed by a user";
ReplicationSourceInterface srcToRemove = null;
List<ReplicationSourceInterface> oldSourcesToDelete =
new ArrayList<ReplicationSourceInterface>();
// First close all the recovered sources for this peer
for (ReplicationSourceInterface src : oldsources) {
  if (id.equals(src.getPeerClusterId())) {
oldSourcesToDelete.add(src);
  }
}
for (ReplicationSourceInterface src : oldSourcesToDelete) {
  src.terminate(terminateMessage);
  closeRecoveredQueue((src));
}
LOG.info("Number of deleted recovered sources for " + id + ": "
+ oldSourcesToDelete.size());
// Now look for the one on this cluster
for (ReplicationSourceInterface src : this.sources) {
  if (id.equals(src.getPeerClusterId())) {
srcToRemove = src;
break;
  }
}
if (srcToRemove == null) {
  LOG.error("The queue we wanted to close is missing " + id);
  return;
}
srcToRemove.terminate(terminateMessage);
this.sources.remove(srcToRemove);
deleteSource(id, true);
  }
{code}




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HBASE-11302) ReplicationSourceManager is not thread safe

2014-06-06 Thread Qianxi Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qianxi Zhang updated HBASE-11302:
-

Attachment: HBase-11302-0.99.diff

CopyOnWriteArrayList may be used to solve this problem.
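For illustration, a minimal sketch of that idea (assuming the sources field is 
currently a plain list and that java.util.concurrent.CopyOnWriteArrayList is 
imported; this is not the attached diff):
{code}
  // Hypothetical sketch: back sources with a CopyOnWriteArrayList so that the
  // iteration in preLogRoll() cannot race with the removal in removePeer().
  private final List<ReplicationSourceInterface> sources =
      new CopyOnWriteArrayList<ReplicationSourceInterface>();
{code}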

> ReplicationSourceManager is not thread safe
> ---
>
> Key: HBASE-11302
> URL: https://issues.apache.org/jira/browse/HBASE-11302
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 0.99.0
>Reporter: Qianxi Zhang
> Attachments: HBase-11302-0.99.diff
>
>
> In ReplicationSourceManager, the sources list records the replication sources 
> for the peers. An entry can be removed in the removePeer method while the list 
> is being read in the preLogRoll method, so access to it is not thread safe.
> ReplicationSourceManager#296
> {code}
>   void preLogRoll(Path newLog) throws IOException {
> synchronized (this.hlogsById) {
>   String name = newLog.getName();
>   for (ReplicationSourceInterface source : this.sources) {
> try {
>   this.replicationQueues.addLog(source.getPeerClusterZnode(), name);
> } catch (ReplicationException e) {
>   throw new IOException("Cannot add log to replication queue with id="
>   + source.getPeerClusterZnode() + ", filename=" + name, e);
> }
>   }
>   for (SortedSet<String> hlogs : this.hlogsById.values()) {
> if (this.sources.isEmpty()) {
>   // If there's no slaves, don't need to keep the old hlogs since
>   // we only consider the last one when a new slave comes in
>   hlogs.clear();
> }
> hlogs.add(name);
>   }
> }
> this.latestPath = newLog;
>   }
> {code}
> ReplicationSourceManager#392
> {code}
>   public void removePeer(String id) {
> LOG.info("Closing the following queue " + id + ", currently have "
> + sources.size() + " and another "
> + oldsources.size() + " that were recovered");
> String terminateMessage = "Replication stream was removed by a user";
> ReplicationSourceInterface srcToRemove = null;
> List<ReplicationSourceInterface> oldSourcesToDelete =
> new ArrayList<ReplicationSourceInterface>();
> // First close all the recovered sources for this peer
> for (ReplicationSourceInterface src : oldsources) {
>   if (id.equals(src.getPeerClusterId())) {
> oldSourcesToDelete.add(src);
>   }
> }
> for (ReplicationSourceInterface src : oldSourcesToDelete) {
>   src.terminate(terminateMessage);
>   closeRecoveredQueue((src));
> }
> LOG.info("Number of deleted recovered sources for " + id + ": "
> + oldSourcesToDelete.size());
> // Now look for the one on this cluster
> for (ReplicationSourceInterface src : this.sources) {
>   if (id.equals(src.getPeerClusterId())) {
> srcToRemove = src;
> break;
>   }
> }
> if (srcToRemove == null) {
>   LOG.error("The queue we wanted to close is missing " + id);
>   return;
> }
> srcToRemove.terminate(terminateMessage);
> this.sources.remove(srcToRemove);
> deleteSource(id, true);
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HBASE-11302) ReplicationSourceManager is not thread safe

2014-06-06 Thread Qianxi Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qianxi Zhang updated HBASE-11302:
-

Affects Version/s: 0.98.2

> ReplicationSourceManager is not thread safe
> ---
>
> Key: HBASE-11302
> URL: https://issues.apache.org/jira/browse/HBASE-11302
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 0.99.0, 0.98.2
>Reporter: Qianxi Zhang
> Attachments: HBase-11302-0.99.diff
>
>
> In ReplicationSourceManager, the sources list records the replication sources 
> for the peers. An entry can be removed in the removePeer method while the list 
> is being read in the preLogRoll method, so access to it is not thread safe.
> ReplicationSourceManager#296
> {code}
>   void preLogRoll(Path newLog) throws IOException {
> synchronized (this.hlogsById) {
>   String name = newLog.getName();
>   for (ReplicationSourceInterface source : this.sources) {
> try {
>   this.replicationQueues.addLog(source.getPeerClusterZnode(), name);
> } catch (ReplicationException e) {
>   throw new IOException("Cannot add log to replication queue with id="
>   + source.getPeerClusterZnode() + ", filename=" + name, e);
> }
>   }
>   for (SortedSet<String> hlogs : this.hlogsById.values()) {
> if (this.sources.isEmpty()) {
>   // If there's no slaves, don't need to keep the old hlogs since
>   // we only consider the last one when a new slave comes in
>   hlogs.clear();
> }
> hlogs.add(name);
>   }
> }
> this.latestPath = newLog;
>   }
> {code}
> ReplicationSourceManager#392
> {code}
>   public void removePeer(String id) {
> LOG.info("Closing the following queue " + id + ", currently have "
> + sources.size() + " and another "
> + oldsources.size() + " that were recovered");
> String terminateMessage = "Replication stream was removed by a user";
> ReplicationSourceInterface srcToRemove = null;
> List<ReplicationSourceInterface> oldSourcesToDelete =
> new ArrayList<ReplicationSourceInterface>();
> // First close all the recovered sources for this peer
> for (ReplicationSourceInterface src : oldsources) {
>   if (id.equals(src.getPeerClusterId())) {
> oldSourcesToDelete.add(src);
>   }
> }
> for (ReplicationSourceInterface src : oldSourcesToDelete) {
>   src.terminate(terminateMessage);
>   closeRecoveredQueue((src));
> }
> LOG.info("Number of deleted recovered sources for " + id + ": "
> + oldSourcesToDelete.size());
> // Now look for the one on this cluster
> for (ReplicationSourceInterface src : this.sources) {
>   if (id.equals(src.getPeerClusterId())) {
> srcToRemove = src;
> break;
>   }
> }
> if (srcToRemove == null) {
>   LOG.error("The queue we wanted to close is missing " + id);
>   return;
> }
> srcToRemove.terminate(terminateMessage);
> this.sources.remove(srcToRemove);
> deleteSource(id, true);
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HBASE-11342) The method isChildReadLock in class ZKInterProcessLockBase is wrong

2014-06-13 Thread Qianxi Zhang (JIRA)
Qianxi Zhang created HBASE-11342:


 Summary: The method isChildReadLock in class 
ZKInterProcessLockBase is wrong
 Key: HBASE-11342
 URL: https://issues.apache.org/jira/browse/HBASE-11342
 Project: HBase
  Issue Type: Bug
  Components: Zookeeper
Affects Versions: 0.98.3, 0.99.0
Reporter: Qianxi Zhang
Assignee: Qianxi Zhang
Priority: Minor


The method isChildReadLock in class ZKInterProcessLockBase, which determines 
whether a lock is a read lock, looks wrong: it should compare the node name with 
READ_LOCK_CHILD_NODE_PREFIX rather than WRITE_LOCK_CHILD_NODE_PREFIX. Since no 
other method invokes "isChildReadLock" at the moment, the bug has not caused a 
visible error yet.
{code}
  protected static boolean isChildReadLock(String child) {
int idx = child.lastIndexOf(ZKUtil.ZNODE_PATH_SEPARATOR);
String suffix = child.substring(idx + 1);
return suffix.startsWith(WRITE_LOCK_CHILD_NODE_PREFIX);
  }
{code}
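The fix described above is a one-token change; a minimal sketch:
{code}
  // Corrected check: a child znode is a read lock when its name starts with
  // the read-lock prefix, not the write-lock prefix.
  protected static boolean isChildReadLock(String child) {
    int idx = child.lastIndexOf(ZKUtil.ZNODE_PATH_SEPARATOR);
    String suffix = child.substring(idx + 1);
    return suffix.startsWith(READ_LOCK_CHILD_NODE_PREFIX);
  }
{code}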



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HBASE-11342) The method isChildReadLock in class ZKInterProcessLockBase is wrong

2014-06-13 Thread Qianxi Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qianxi Zhang updated HBASE-11342:
-

Attachment: HBASE_11342.patch

> The method isChildReadLock in class ZKInterProcessLockBase is wrong
> ---
>
> Key: HBASE-11342
> URL: https://issues.apache.org/jira/browse/HBASE-11342
> Project: HBase
>  Issue Type: Bug
>  Components: Zookeeper
>Affects Versions: 0.99.0, 0.98.3
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
>Priority: Minor
> Attachments: HBASE_11342.patch
>
>
> The method isChildReadLock in class ZKInterProcessLockBase, which determines 
> whether a lock is a read lock, looks wrong: it should compare the node name 
> with READ_LOCK_CHILD_NODE_PREFIX rather than WRITE_LOCK_CHILD_NODE_PREFIX. 
> Since no other method invokes "isChildReadLock" at the moment, the bug has not 
> caused a visible error yet.
> {code}
>   protected static boolean isChildReadLock(String child) {
> int idx = child.lastIndexOf(ZKUtil.ZNODE_PATH_SEPARATOR);
> String suffix = child.substring(idx + 1);
> return suffix.startsWith(WRITE_LOCK_CHILD_NODE_PREFIX);
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HBASE-11354) HConnectionImplementation#DelayedClosing does not start

2014-06-15 Thread Qianxi Zhang (JIRA)
Qianxi Zhang created HBASE-11354:


 Summary: HConnectionImplementation#DelayedClosing does not start
 Key: HBASE-11354
 URL: https://issues.apache.org/jira/browse/HBASE-11354
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.98.3, 0.99.0
Reporter: Qianxi Zhang
Assignee: Qianxi Zhang


The method "createAndStart" in class DelayedClosing only creates a instance, 
but forgets to start it. So thread delayedClosing is not running all the time.
ConnectionManager#1623
{code}
  static DelayedClosing createAndStart(HConnectionImplementation hci){
Stoppable stoppable = new Stoppable() {
  private volatile boolean isStopped = false;
  @Override public void stop(String why) { isStopped = true;}
  @Override public boolean isStopped() {return isStopped;}
};

return new DelayedClosing(hci, stoppable);
  }
{code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HBASE-11354) HConnectionImplementation#DelayedClosing does not start

2014-06-15 Thread Qianxi Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14031810#comment-14031810
 ] 

Qianxi Zhang commented on HBASE-11354:
--

I want to know why we need DelayedClosing; I cannot find the issue that introduced 
it. I modified the method "createAndStart" so that it creates the instance and 
starts the thread, but then the client exits more slowly. I think the reason is 
that the thread sleeps for a while before it notices the stop flag. Could we 
interrupt the thread when the connection is closed?
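For reference, the modification described here amounts to something like the 
following sketch (not the attached patch; it assumes DelayedClosing exposes a 
start() method, e.g. via a thread or chore class it extends):
{code}
  static DelayedClosing createAndStart(HConnectionImplementation hci){
    Stoppable stoppable = new Stoppable() {
      private volatile boolean isStopped = false;
      @Override public void stop(String why) { isStopped = true;}
      @Override public boolean isStopped() {return isStopped;}
    };
    DelayedClosing delayedClosing = new DelayedClosing(hci, stoppable);
    delayedClosing.start(); // actually start the chore instead of only creating it
    return delayedClosing;
  }
{code}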

> HConnectionImplementation#DelayedClosing does not start
> ---
>
> Key: HBASE-11354
> URL: https://issues.apache.org/jira/browse/HBASE-11354
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 0.99.0, 0.98.3
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
>
> The method "createAndStart" in class DelayedClosing only creates a instance, 
> but forgets to start it. So thread delayedClosing is not running all the time.
> ConnectionManager#1623
> {code}
>   static DelayedClosing createAndStart(HConnectionImplementation hci){
> Stoppable stoppable = new Stoppable() {
>   private volatile boolean isStopped = false;
>   @Override public void stop(String why) { isStopped = true;}
>   @Override public boolean isStopped() {return isStopped;}
> };
> return new DelayedClosing(hci, stoppable);
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HBASE-11354) HConnectionImplementation#DelayedClosing does not start

2014-06-15 Thread Qianxi Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qianxi Zhang updated HBASE-11354:
-

Attachment: HBASE_11354.patch

> HConnectionImplementation#DelayedClosing does not start
> ---
>
> Key: HBASE-11354
> URL: https://issues.apache.org/jira/browse/HBASE-11354
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 0.99.0, 0.98.3
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
> Attachments: HBASE_11354.patch
>
>
> The method "createAndStart" in class DelayedClosing only creates a instance, 
> but forgets to start it. So thread delayedClosing is not running all the time.
> ConnectionManager#1623
> {code}
>   static DelayedClosing createAndStart(HConnectionImplementation hci){
> Stoppable stoppable = new Stoppable() {
>   private volatile boolean isStopped = false;
>   @Override public void stop(String why) { isStopped = true;}
>   @Override public boolean isStopped() {return isStopped;}
> };
> return new DelayedClosing(hci, stoppable);
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HBASE-11354) HConnectionImplementation#DelayedClosing does not start

2014-06-15 Thread Qianxi Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qianxi Zhang updated HBASE-11354:
-

Priority: Minor  (was: Major)

> HConnectionImplementation#DelayedClosing does not start
> ---
>
> Key: HBASE-11354
> URL: https://issues.apache.org/jira/browse/HBASE-11354
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 0.99.0, 0.98.3
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
>Priority: Minor
> Attachments: HBASE_11354.patch
>
>
> The method "createAndStart" in class DelayedClosing only creates a instance, 
> but forgets to start it. So thread delayedClosing is not running all the time.
> ConnectionManager#1623
> {code}
>   static DelayedClosing createAndStart(HConnectionImplementation hci){
> Stoppable stoppable = new Stoppable() {
>   private volatile boolean isStopped = false;
>   @Override public void stop(String why) { isStopped = true;}
>   @Override public boolean isStopped() {return isStopped;}
> };
> return new DelayedClosing(hci, stoppable);
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HBASE-11354) HConnectionImplementation#DelayedClosing does not start

2014-06-16 Thread Qianxi Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14033268#comment-14033268
 ] 

Qianxi Zhang commented on HBASE-11354:
--

[~stack] I think so, and I will check it again.

> HConnectionImplementation#DelayedClosing does not start
> ---
>
> Key: HBASE-11354
> URL: https://issues.apache.org/jira/browse/HBASE-11354
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 0.99.0, 0.98.3
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
>Priority: Minor
> Attachments: HBASE_11354.patch
>
>
> The method "createAndStart" in class DelayedClosing only creates a instance, 
> but forgets to start it. So thread delayedClosing is not running all the time.
> ConnectionManager#1623
> {code}
>   static DelayedClosing createAndStart(HConnectionImplementation hci){
> Stoppable stoppable = new Stoppable() {
>   private volatile boolean isStopped = false;
>   @Override public void stop(String why) { isStopped = true;}
>   @Override public boolean isStopped() {return isStopped;}
> };
> return new DelayedClosing(hci, stoppable);
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HBASE-11354) HConnectionImplementation#DelayedClosing does not start

2014-06-18 Thread Qianxi Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14035783#comment-14035783
 ] 

Qianxi Zhang commented on HBASE-11354:
--

I think we can delete DelayedClosing. The connection can now be managed by 
ConnectionManager or by the invoker, and when a connection needs to be closed, 
the internalClose method can close the connection to zk.

> HConnectionImplementation#DelayedClosing does not start
> ---
>
> Key: HBASE-11354
> URL: https://issues.apache.org/jira/browse/HBASE-11354
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 0.99.0, 0.98.3
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
>Priority: Minor
> Attachments: HBASE_11354.patch
>
>
> The method "createAndStart" in class DelayedClosing only creates a instance, 
> but forgets to start it. So thread delayedClosing is not running all the time.
> ConnectionManager#1623
> {code}
>   static DelayedClosing createAndStart(HConnectionImplementation hci){
> Stoppable stoppable = new Stoppable() {
>   private volatile boolean isStopped = false;
>   @Override public void stop(String why) { isStopped = true;}
>   @Override public boolean isStopped() {return isStopped;}
> };
> return new DelayedClosing(hci, stoppable);
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HBASE-11386) Replication#table,CF config will be wrong if the table name includes namespace

2014-06-20 Thread Qianxi Zhang (JIRA)
Qianxi Zhang created HBASE-11386:


 Summary: Replication#table,CF config will be wrong if the table 
name includes namespace
 Key: HBASE-11386
 URL: https://issues.apache.org/jira/browse/HBASE-11386
 Project: HBase
  Issue Type: Bug
  Components: Replication
Reporter: Qianxi Zhang
Assignee: Qianxi Zhang
Priority: Minor


Now we can configure the tables and CFs for Replication, but I think the parsing 
will be wrong if the table name includes a namespace.

ReplicationPeer#parseTableCFsFromConfig(line 125)
{code}
Map<String, List<String>> tableCFsMap = null;

// parse out (table, cf-list) pairs from tableCFsConfig
// format: "table1:cf1,cf2;table2:cfA,cfB"
String[] tables = tableCFsConfig.split(";");
for (String tab : tables) {
  // 1 ignore empty table config
  tab = tab.trim();
  if (tab.length() == 0) {
continue;
  }
  // 2 split to "table" and "cf1,cf2"
  //   for each table: "table:cf1,cf2" or "table"
  String[] pair = tab.split(":");
  String tabName = pair[0].trim();
  if (pair.length > 2 || tabName.length() == 0) {
LOG.error("ignore invalid tableCFs setting: " + tab);
continue;
  }
{code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HBASE-11386) Replication#table,CF config will be wrong if the table name includes namespace

2014-06-20 Thread Qianxi Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038584#comment-14038584
 ] 

Qianxi Zhang commented on HBASE-11386:
--

[~fenghh] Please correct me if I am wrong. If the config is "ns:table1:cf1,cf2", 
the parser will get it wrong. But if the config format is that flexible, the 
parser becomes complex, since it has to handle "ns:table:cf1,cf2", "table:cf1", 
"table" and "ns:table".

> Replication#table,CF config will be wrong if the table name includes namespace
> --
>
> Key: HBASE-11386
> URL: https://issues.apache.org/jira/browse/HBASE-11386
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
>Priority: Minor
>
> Now we can configure the tables and CFs for Replication, but I think the parsing 
> will be wrong if the table name includes a namespace.
> ReplicationPeer#parseTableCFsFromConfig(line 125)
> {code}
> Map<String, List<String>> tableCFsMap = null;
> // parse out (table, cf-list) pairs from tableCFsConfig
> // format: "table1:cf1,cf2;table2:cfA,cfB"
> String[] tables = tableCFsConfig.split(";");
> for (String tab : tables) {
>   // 1 ignore empty table config
>   tab = tab.trim();
>   if (tab.length() == 0) {
> continue;
>   }
>   // 2 split to "table" and "cf1,cf2"
>   //   for each table: "table:cf1,cf2" or "table"
>   String[] pair = tab.split(":");
>   String tabName = pair[0].trim();
>   if (pair.length > 2 || tabName.length() == 0) {
> LOG.error("ignore invalid tableCFs setting: " + tab);
> continue;
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HBASE-11388) The order parameter is wrong when invoking the constructor of the ReplicationPeer In the method "getPeer" of the class ReplicationPeersZKImpl

2014-06-20 Thread Qianxi Zhang (JIRA)
Qianxi Zhang created HBASE-11388:


 Summary: The order parameter is wrong when invoking the 
constructor of the ReplicationPeer In the method "getPeer" of the class 
ReplicationPeersZKImpl
 Key: HBASE-11388
 URL: https://issues.apache.org/jira/browse/HBASE-11388
 Project: HBase
  Issue Type: Bug
Reporter: Qianxi Zhang
Assignee: Qianxi Zhang
Priority: Minor


The parameters is "Configurationi", "ClusterKey" and "id" in the constructor of 
the class ReplicationPeer. But he order parameter is "Configurationi", "id" and 
"ClusterKey" when invoking the constructor of the ReplicationPeer In the method 
"getPeer" of the class ReplicationPeersZKImpl
ReplicationPeer#76
{code}
  public ReplicationPeer(Configuration conf, String key, String id) throws 
ReplicationException {
this.conf = conf;
this.clusterKey = key;
this.id = id;
try {
  this.reloadZkWatcher();
} catch (IOException e) {
  throw new ReplicationException("Error connecting to peer cluster with 
peerId=" + id, e);
}
  }
{code}

ReplicationPeersZKImpl#498
{code}
ReplicationPeer peer =
new ReplicationPeer(peerConf, peerId, 
ZKUtil.getZooKeeperClusterKey(peerConf));
{code}
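In other words, either the call site or the constructor signature has to change so 
that the cluster key and the id line up. A minimal sketch of the call-site variant 
(not necessarily what the attached patch does):
{code}
// Pass the arguments in the order the constructor declares them: (conf, clusterKey, id).
ReplicationPeer peer =
    new ReplicationPeer(peerConf, ZKUtil.getZooKeeperClusterKey(peerConf), peerId);
{code}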



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HBASE-11388) The order parameter is wrong when invoking the constructor of the ReplicationPeer In the method "getPeer" of the class ReplicationPeersZKImpl

2014-06-20 Thread Qianxi Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qianxi Zhang updated HBASE-11388:
-

Attachment: HBASE_11388.patch

> The order parameter is wrong when invoking the constructor of the 
> ReplicationPeer In the method "getPeer" of the class ReplicationPeersZKImpl
> -
>
> Key: HBASE-11388
> URL: https://issues.apache.org/jira/browse/HBASE-11388
> Project: HBase
>  Issue Type: Bug
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
>Priority: Minor
> Attachments: HBASE_11388.patch
>
>
> The parameters is "Configurationi", "ClusterKey" and "id" in the constructor 
> of the class ReplicationPeer. But he order parameter is "Configurationi", 
> "id" and "ClusterKey" when invoking the constructor of the ReplicationPeer In 
> the method "getPeer" of the class ReplicationPeersZKImpl
> ReplicationPeer#76
> {code}
>   public ReplicationPeer(Configuration conf, String key, String id) throws 
> ReplicationException {
> this.conf = conf;
> this.clusterKey = key;
> this.id = id;
> try {
>   this.reloadZkWatcher();
> } catch (IOException e) {
>   throw new ReplicationException("Error connecting to peer cluster with 
> peerId=" + id, e);
> }
>   }
> {code}
> ReplicationPeersZKImpl#498
> {code}
> ReplicationPeer peer =
> new ReplicationPeer(peerConf, peerId, 
> ZKUtil.getZooKeeperClusterKey(peerConf));
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HBASE-11388) The order parameter is wrong when invoking the constructor of the ReplicationPeer In the method "getPeer" of the class ReplicationPeersZKImpl

2014-06-20 Thread Qianxi Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qianxi Zhang updated HBASE-11388:
-

Affects Version/s: 0.99.0
   0.98.3

> The order parameter is wrong when invoking the constructor of the 
> ReplicationPeer In the method "getPeer" of the class ReplicationPeersZKImpl
> -
>
> Key: HBASE-11388
> URL: https://issues.apache.org/jira/browse/HBASE-11388
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 0.99.0, 0.98.3
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
>Priority: Minor
> Fix For: 0.99.0
>
> Attachments: HBASE_11388.patch
>
>
> The parameters is "Configurationi", "ClusterKey" and "id" in the constructor 
> of the class ReplicationPeer. But he order parameter is "Configurationi", 
> "id" and "ClusterKey" when invoking the constructor of the ReplicationPeer In 
> the method "getPeer" of the class ReplicationPeersZKImpl
> ReplicationPeer#76
> {code}
>   public ReplicationPeer(Configuration conf, String key, String id) throws 
> ReplicationException {
> this.conf = conf;
> this.clusterKey = key;
> this.id = id;
> try {
>   this.reloadZkWatcher();
> } catch (IOException e) {
>   throw new ReplicationException("Error connecting to peer cluster with 
> peerId=" + id, e);
> }
>   }
> {code}
> ReplicationPeersZKImpl#498
> {code}
> ReplicationPeer peer =
> new ReplicationPeer(peerConf, peerId, 
> ZKUtil.getZooKeeperClusterKey(peerConf));
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HBASE-11388) The order parameter is wrong when invoking the constructor of the ReplicationPeer In the method "getPeer" of the class ReplicationPeersZKImpl

2014-06-20 Thread Qianxi Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qianxi Zhang updated HBASE-11388:
-

Component/s: Replication

> The order parameter is wrong when invoking the constructor of the 
> ReplicationPeer In the method "getPeer" of the class ReplicationPeersZKImpl
> -
>
> Key: HBASE-11388
> URL: https://issues.apache.org/jira/browse/HBASE-11388
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 0.99.0, 0.98.3
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
>Priority: Minor
> Fix For: 0.99.0
>
> Attachments: HBASE_11388.patch
>
>
> The parameters is "Configurationi", "ClusterKey" and "id" in the constructor 
> of the class ReplicationPeer. But he order parameter is "Configurationi", 
> "id" and "ClusterKey" when invoking the constructor of the ReplicationPeer In 
> the method "getPeer" of the class ReplicationPeersZKImpl
> ReplicationPeer#76
> {code}
>   public ReplicationPeer(Configuration conf, String key, String id) throws 
> ReplicationException {
> this.conf = conf;
> this.clusterKey = key;
> this.id = id;
> try {
>   this.reloadZkWatcher();
> } catch (IOException e) {
>   throw new ReplicationException("Error connecting to peer cluster with 
> peerId=" + id, e);
> }
>   }
> {code}
> ReplicationPeersZKImpl#498
> {code}
> ReplicationPeer peer =
> new ReplicationPeer(peerConf, peerId, 
> ZKUtil.getZooKeeperClusterKey(peerConf));
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HBASE-11388) The order parameter is wrong when invoking the constructor of the ReplicationPeer In the method "getPeer" of the class ReplicationPeersZKImpl

2014-06-20 Thread Qianxi Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qianxi Zhang updated HBASE-11388:
-

Fix Version/s: 0.99.0

> The order parameter is wrong when invoking the constructor of the 
> ReplicationPeer In the method "getPeer" of the class ReplicationPeersZKImpl
> -
>
> Key: HBASE-11388
> URL: https://issues.apache.org/jira/browse/HBASE-11388
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 0.99.0, 0.98.3
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
>Priority: Minor
> Fix For: 0.99.0
>
> Attachments: HBASE_11388.patch
>
>
> The parameters is "Configurationi", "ClusterKey" and "id" in the constructor 
> of the class ReplicationPeer. But he order parameter is "Configurationi", 
> "id" and "ClusterKey" when invoking the constructor of the ReplicationPeer In 
> the method "getPeer" of the class ReplicationPeersZKImpl
> ReplicationPeer#76
> {code}
>   public ReplicationPeer(Configuration conf, String key, String id) throws 
> ReplicationException {
> this.conf = conf;
> this.clusterKey = key;
> this.id = id;
> try {
>   this.reloadZkWatcher();
> } catch (IOException e) {
>   throw new ReplicationException("Error connecting to peer cluster with 
> peerId=" + id, e);
> }
>   }
> {code}
> ReplicationPeersZKImpl#498
> {code}
> ReplicationPeer peer =
> new ReplicationPeer(peerConf, peerId, 
> ZKUtil.getZooKeeperClusterKey(peerConf));
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HBASE-11388) The order parameter is wrong when invoking the constructor of the ReplicationPeer In the method "getPeer" of the class ReplicationPeersZKImpl

2014-06-20 Thread Qianxi Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039651#comment-14039651
 ] 

Qianxi Zhang commented on HBASE-11388:
--

[~tedyu] Sorry, I do not understand what you mean. Is that about this issue?

> The order parameter is wrong when invoking the constructor of the 
> ReplicationPeer In the method "getPeer" of the class ReplicationPeersZKImpl
> -
>
> Key: HBASE-11388
> URL: https://issues.apache.org/jira/browse/HBASE-11388
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 0.99.0, 0.98.3
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
>Priority: Minor
> Fix For: 0.99.0
>
> Attachments: HBASE_11388.patch
>
>
> The parameters is "Configurationi", "ClusterKey" and "id" in the constructor 
> of the class ReplicationPeer. But he order parameter is "Configurationi", 
> "id" and "ClusterKey" when invoking the constructor of the ReplicationPeer In 
> the method "getPeer" of the class ReplicationPeersZKImpl
> ReplicationPeer#76
> {code}
>   public ReplicationPeer(Configuration conf, String key, String id) throws 
> ReplicationException {
> this.conf = conf;
> this.clusterKey = key;
> this.id = id;
> try {
>   this.reloadZkWatcher();
> } catch (IOException e) {
>   throw new ReplicationException("Error connecting to peer cluster with 
> peerId=" + id, e);
> }
>   }
> {code}
> ReplicationPeersZKImpl#498
> {code}
> ReplicationPeer peer =
> new ReplicationPeer(peerConf, peerId, 
> ZKUtil.getZooKeeperClusterKey(peerConf));
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HBASE-11388) The order parameter is wrong when invoking the constructor of the ReplicationPeer In the method "getPeer" of the class ReplicationPeersZKImpl

2014-06-20 Thread Qianxi Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039656#comment-14039656
 ] 

Qianxi Zhang commented on HBASE-11388:
--

ok thanks [~tedyu]

> The order parameter is wrong when invoking the constructor of the 
> ReplicationPeer In the method "getPeer" of the class ReplicationPeersZKImpl
> -
>
> Key: HBASE-11388
> URL: https://issues.apache.org/jira/browse/HBASE-11388
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 0.99.0, 0.98.3
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
>Priority: Minor
> Fix For: 0.99.0
>
> Attachments: HBASE_11388.patch
>
>
> The parameters is "Configurationi", "ClusterKey" and "id" in the constructor 
> of the class ReplicationPeer. But he order parameter is "Configurationi", 
> "id" and "ClusterKey" when invoking the constructor of the ReplicationPeer In 
> the method "getPeer" of the class ReplicationPeersZKImpl
> ReplicationPeer#76
> {code}
>   public ReplicationPeer(Configuration conf, String key, String id) throws 
> ReplicationException {
> this.conf = conf;
> this.clusterKey = key;
> this.id = id;
> try {
>   this.reloadZkWatcher();
> } catch (IOException e) {
>   throw new ReplicationException("Error connecting to peer cluster with 
> peerId=" + id, e);
> }
>   }
> {code}
> ReplicationPeersZKImpl#498
> {code}
> ReplicationPeer peer =
> new ReplicationPeer(peerConf, peerId, 
> ZKUtil.getZooKeeperClusterKey(peerConf));
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HBASE-11354) HConnectionImplementation#DelayedClosing does not start

2014-06-23 Thread Qianxi Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14040740#comment-14040740
 ] 

Qianxi Zhang commented on HBASE-11354:
--

Thanks [~nkeywal]. So what do you think we should do about this issue? It is a 
bug in fact, am I right?

> HConnectionImplementation#DelayedClosing does not start
> ---
>
> Key: HBASE-11354
> URL: https://issues.apache.org/jira/browse/HBASE-11354
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 0.99.0, 0.98.3
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
>Priority: Minor
> Attachments: HBASE_11354.patch
>
>
> The method "createAndStart" in class DelayedClosing only creates a instance, 
> but forgets to start it. So thread delayedClosing is not running all the time.
> ConnectionManager#1623
> {code}
>   static DelayedClosing createAndStart(HConnectionImplementation hci){
> Stoppable stoppable = new Stoppable() {
>   private volatile boolean isStopped = false;
>   @Override public void stop(String why) { isStopped = true;}
>   @Override public boolean isStopped() {return isStopped;}
> };
> return new DelayedClosing(hci, stoppable);
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HBASE-11388) The order parameter is wrong when invoking the constructor of the ReplicationPeer In the method "getPeer" of the class ReplicationPeersZKImpl

2014-06-23 Thread Qianxi Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14041597#comment-14041597
 ] 

Qianxi Zhang commented on HBASE-11388:
--

thanks [~jdcryans]. That is a good idea, and I will do it.

> The order parameter is wrong when invoking the constructor of the 
> ReplicationPeer In the method "getPeer" of the class ReplicationPeersZKImpl
> -
>
> Key: HBASE-11388
> URL: https://issues.apache.org/jira/browse/HBASE-11388
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 0.99.0, 0.98.3
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
>Priority: Minor
> Fix For: 0.99.0
>
> Attachments: HBASE_11388.patch
>
>
> The parameters is "Configurationi", "ClusterKey" and "id" in the constructor 
> of the class ReplicationPeer. But he order parameter is "Configurationi", 
> "id" and "ClusterKey" when invoking the constructor of the ReplicationPeer In 
> the method "getPeer" of the class ReplicationPeersZKImpl
> ReplicationPeer#76
> {code}
>   public ReplicationPeer(Configuration conf, String key, String id) throws 
> ReplicationException {
> this.conf = conf;
> this.clusterKey = key;
> this.id = id;
> try {
>   this.reloadZkWatcher();
> } catch (IOException e) {
>   throw new ReplicationException("Error connecting to peer cluster with 
> peerId=" + id, e);
> }
>   }
> {code}
> ReplicationPeersZKImpl#498
> {code}
> ReplicationPeer peer =
> new ReplicationPeer(peerConf, peerId, 
> ZKUtil.getZooKeeperClusterKey(peerConf));
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HBASE-11388) The order parameter is wrong when invoking the constructor of the ReplicationPeer In the method "getPeer" of the class ReplicationPeersZKImpl

2014-06-24 Thread Qianxi Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14042089#comment-14042089
 ] 

Qianxi Zhang commented on HBASE-11388:
--

[~jdcryans] I thought about your idea. I think we should not delete clusterKey, 
which is an attribute of ReplicationPeer. Other code can invoke the "getClusterKey" 
method to get it; although we could infer it with ZKUtil#getZooKeeperClusterKey(), 
keeping the attribute seems more reasonable to me. Please correct me if I am wrong.

> The order parameter is wrong when invoking the constructor of the 
> ReplicationPeer In the method "getPeer" of the class ReplicationPeersZKImpl
> -
>
> Key: HBASE-11388
> URL: https://issues.apache.org/jira/browse/HBASE-11388
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 0.99.0, 0.98.3
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
>Priority: Minor
> Fix For: 0.99.0
>
> Attachments: HBASE_11388.patch
>
>
> The parameters are "Configuration", "ClusterKey" and "id" in the constructor 
> of the class ReplicationPeer, but the argument order is "Configuration", 
> "id" and "ClusterKey" when the constructor of ReplicationPeer is invoked in 
> the method "getPeer" of the class ReplicationPeersZKImpl.
> ReplicationPeer#76
> {code}
>   public ReplicationPeer(Configuration conf, String key, String id) throws 
> ReplicationException {
> this.conf = conf;
> this.clusterKey = key;
> this.id = id;
> try {
>   this.reloadZkWatcher();
> } catch (IOException e) {
>   throw new ReplicationException("Error connecting to peer cluster with 
> peerId=" + id, e);
> }
>   }
> {code}
> ReplicationPeersZKImpl#498
> {code}
> ReplicationPeer peer =
> new ReplicationPeer(peerConf, peerId, 
> ZKUtil.getZooKeeperClusterKey(peerConf));
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HBASE-11388) The order parameter is wrong when invoking the constructor of the ReplicationPeer In the method "getPeer" of the class ReplicationPeersZKImpl

2014-07-03 Thread Qianxi Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qianxi Zhang updated HBASE-11388:
-

Attachment: HBASE_11388_trunk_V1.patch

> The order parameter is wrong when invoking the constructor of the 
> ReplicationPeer In the method "getPeer" of the class ReplicationPeersZKImpl
> -
>
> Key: HBASE-11388
> URL: https://issues.apache.org/jira/browse/HBASE-11388
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 0.99.0, 0.98.3
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
>Priority: Minor
> Fix For: 0.99.0, 0.98.5
>
> Attachments: HBASE_11388.patch, HBASE_11388_trunk_V1.patch
>
>
> The parameters are "Configuration", "ClusterKey" and "id" in the constructor 
> of the class ReplicationPeer, but the argument order is "Configuration", 
> "id" and "ClusterKey" when the constructor of ReplicationPeer is invoked in 
> the method "getPeer" of the class ReplicationPeersZKImpl.
> ReplicationPeer#76
> {code}
>   public ReplicationPeer(Configuration conf, String key, String id) throws 
> ReplicationException {
> this.conf = conf;
> this.clusterKey = key;
> this.id = id;
> try {
>   this.reloadZkWatcher();
> } catch (IOException e) {
>   throw new ReplicationException("Error connecting to peer cluster with 
> peerId=" + id, e);
> }
>   }
> {code}
> ReplicationPeersZKImpl#498
> {code}
> ReplicationPeer peer =
> new ReplicationPeer(peerConf, peerId, 
> ZKUtil.getZooKeeperClusterKey(peerConf));
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HBASE-11388) The order parameter is wrong when invoking the constructor of the ReplicationPeer In the method "getPeer" of the class ReplicationPeersZKImpl

2014-07-03 Thread Qianxi Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051435#comment-14051435
 ] 

Qianxi Zhang commented on HBASE-11388:
--

Thanks [~jdcryans]. I resubmitted the patch.

> The order parameter is wrong when invoking the constructor of the 
> ReplicationPeer In the method "getPeer" of the class ReplicationPeersZKImpl
> -
>
> Key: HBASE-11388
> URL: https://issues.apache.org/jira/browse/HBASE-11388
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 0.99.0, 0.98.3
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
>Priority: Minor
> Fix For: 0.99.0, 0.98.5
>
> Attachments: HBASE_11388.patch, HBASE_11388_trunk_V1.patch
>
>
> The parameters are "Configuration", "ClusterKey" and "id" in the constructor 
> of the class ReplicationPeer, but the argument order is "Configuration", 
> "id" and "ClusterKey" when the constructor of ReplicationPeer is invoked in 
> the method "getPeer" of the class ReplicationPeersZKImpl.
> ReplicationPeer#76
> {code}
>   public ReplicationPeer(Configuration conf, String key, String id) throws 
> ReplicationException {
> this.conf = conf;
> this.clusterKey = key;
> this.id = id;
> try {
>   this.reloadZkWatcher();
> } catch (IOException e) {
>   throw new ReplicationException("Error connecting to peer cluster with 
> peerId=" + id, e);
> }
>   }
> {code}
> ReplicationPeersZKImpl#498
> {code}
> ReplicationPeer peer =
> new ReplicationPeer(peerConf, peerId, 
> ZKUtil.getZooKeeperClusterKey(peerConf));
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HBASE-11386) Replication#table,CF config will be wrong if the table name includes namespace

2014-07-03 Thread Qianxi Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051437#comment-14051437
 ] 

Qianxi Zhang commented on HBASE-11386:
--

If the config format is flexible, the parser will be complex; maybe we need more 
than one delimiter. Could someone give some advice?

> Replication#table,CF config will be wrong if the table name includes namespace
> --
>
> Key: HBASE-11386
> URL: https://issues.apache.org/jira/browse/HBASE-11386
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
>Priority: Minor
>
> Now we can configure the table and CF in Replication, but I think the parsing 
> will be wrong if the table name includes a namespace.
> ReplicationPeer#parseTableCFsFromConfig(line 125)
> {code}
> Map<String, List<String>> tableCFsMap = null;
> // parse out (table, cf-list) pairs from tableCFsConfig
> // format: "table1:cf1,cf2;table2:cfA,cfB"
> String[] tables = tableCFsConfig.split(";");
> for (String tab : tables) {
>   // 1 ignore empty table config
>   tab = tab.trim();
>   if (tab.length() == 0) {
> continue;
>   }
>   // 2 split to "table" and "cf1,cf2"
>   //   for each table: "table:cf1,cf2" or "table"
>   String[] pair = tab.split(":");
>   String tabName = pair[0].trim();
>   if (pair.length > 2 || tabName.length() == 0) {
> LOG.error("ignore invalid tableCFs setting: " + tab);
> continue;
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HBASE-11386) Replication#table,CF config will be wrong if the table name includes namespace

2014-07-03 Thread Qianxi Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052122#comment-14052122
 ] 

Qianxi Zhang commented on HBASE-11386:
--

Thanks [~stack]. I think we can add a blank (' ') delimiter between the table 
and its CFs, so the form becomes "ns:table cf1,cf2,cf3". Do you agree?
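For illustration, the proposed form could be parsed roughly like this (a hypothetical sketch, not the attached patch; it assumes ';' still separates table entries, ',' still separates CFs, and the usual java.util collections are available):

{code}
// format: "ns1:table1 cf1,cf2;ns2:table2;table3 cfA"
Map<String, List<String>> tableCFsMap = new HashMap<String, List<String>>();
for (String tab : tableCFsConfig.split(";")) {
  // 1 ignore empty table config
  tab = tab.trim();
  if (tab.length() == 0) {
    continue;
  }
  // 2 split on the first whitespace run: "ns:table cf1,cf2" or "ns:table"
  //   the ':' inside the table name is left untouched
  String[] pair = tab.split("\\s+", 2);
  String tabName = pair[0].trim();
  List<String> cfs = new ArrayList<String>();
  if (pair.length == 2) {
    for (String cf : pair[1].split(",")) {
      if (cf.trim().length() > 0) {
        cfs.add(cf.trim());
      }
    }
  }
  tableCFsMap.put(tabName, cfs);
}
{code}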

> Replication#table,CF config will be wrong if the table name includes namespace
> --
>
> Key: HBASE-11386
> URL: https://issues.apache.org/jira/browse/HBASE-11386
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
>Priority: Minor
>
> Now we can configure the table and CF in Replication, but I think the parsing 
> will be wrong if the table name includes a namespace.
> ReplicationPeer#parseTableCFsFromConfig(line 125)
> {code}
> Map<String, List<String>> tableCFsMap = null;
> // parse out (table, cf-list) pairs from tableCFsConfig
> // format: "table1:cf1,cf2;table2:cfA,cfB"
> String[] tables = tableCFsConfig.split(";");
> for (String tab : tables) {
>   // 1 ignore empty table config
>   tab = tab.trim();
>   if (tab.length() == 0) {
> continue;
>   }
>   // 2 split to "table" and "cf1,cf2"
>   //   for each table: "table:cf1,cf2" or "table"
>   String[] pair = tab.split(":");
>   String tabName = pair[0].trim();
>   if (pair.length > 2 || tabName.length() == 0) {
> LOG.error("ignore invalid tableCFs setting: " + tab);
> continue;
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HBASE-11386) Replication#table,CF config will be wrong if the table name includes namespace

2014-07-03 Thread Qianxi Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052207#comment-14052207
 ] 

Qianxi Zhang commented on HBASE-11386:
--

OK, thanks [~stack], I will work on this patch.

> Replication#table,CF config will be wrong if the table name includes namespace
> --
>
> Key: HBASE-11386
> URL: https://issues.apache.org/jira/browse/HBASE-11386
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
>Priority: Minor
>
> Now we can configure the table and CF in Replication, but I think the parsing 
> will be wrong if the table name includes a namespace.
> ReplicationPeer#parseTableCFsFromConfig(line 125)
> {code}
> Map<String, List<String>> tableCFsMap = null;
> // parse out (table, cf-list) pairs from tableCFsConfig
> // format: "table1:cf1,cf2;table2:cfA,cfB"
> String[] tables = tableCFsConfig.split(";");
> for (String tab : tables) {
>   // 1 ignore empty table config
>   tab = tab.trim();
>   if (tab.length() == 0) {
> continue;
>   }
>   // 2 split to "table" and "cf1,cf2"
>   //   for each table: "table:cf1,cf2" or "table"
>   String[] pair = tab.split(":");
>   String tabName = pair[0].trim();
>   if (pair.length > 2 || tabName.length() == 0) {
> LOG.error("ignore invalid tableCFs setting: " + tab);
> continue;
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HBASE-11386) Replication#table,CF config will be wrong if the table name includes namespace

2014-07-04 Thread Qianxi Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qianxi Zhang updated HBASE-11386:
-

Attachment: HBASE_11386_trunk_v1.patch

> Replication#table,CF config will be wrong if the table name includes namespace
> --
>
> Key: HBASE-11386
> URL: https://issues.apache.org/jira/browse/HBASE-11386
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
>Priority: Minor
> Attachments: HBASE_11386_trunk_v1.patch
>
>
> Now we can configure the table and CF in Replication, but I think the parsing 
> will be wrong if the table name includes a namespace.
> ReplicationPeer#parseTableCFsFromConfig(line 125)
> {code}
> Map<String, List<String>> tableCFsMap = null;
> // parse out (table, cf-list) pairs from tableCFsConfig
> // format: "table1:cf1,cf2;table2:cfA,cfB"
> String[] tables = tableCFsConfig.split(";");
> for (String tab : tables) {
>   // 1 ignore empty table config
>   tab = tab.trim();
>   if (tab.length() == 0) {
> continue;
>   }
>   // 2 split to "table" and "cf1,cf2"
>   //   for each table: "table:cf1,cf2" or "table"
>   String[] pair = tab.split(":");
>   String tabName = pair[0].trim();
>   if (pair.length > 2 || tabName.length() == 0) {
> LOG.error("ignore invalid tableCFs setting: " + tab);
> continue;
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HBASE-11386) Replication#table,CF config will be wrong if the table name includes namespace

2014-07-04 Thread Qianxi Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qianxi Zhang updated HBASE-11386:
-

Attachment: HBASE_11386_trunk_v2.patch

> Replication#table,CF config will be wrong if the table name includes namespace
> --
>
> Key: HBASE-11386
> URL: https://issues.apache.org/jira/browse/HBASE-11386
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
>Priority: Minor
> Attachments: HBASE_11386_trunk_v1.patch, HBASE_11386_trunk_v2.patch
>
>
> Now we can configure the table and CF in Replication, but I think the parsing 
> will be wrong if the table name includes a namespace.
> ReplicationPeer#parseTableCFsFromConfig(line 125)
> {code}
> Map<String, List<String>> tableCFsMap = null;
> // parse out (table, cf-list) pairs from tableCFsConfig
> // format: "table1:cf1,cf2;table2:cfA,cfB"
> String[] tables = tableCFsConfig.split(";");
> for (String tab : tables) {
>   // 1 ignore empty table config
>   tab = tab.trim();
>   if (tab.length() == 0) {
> continue;
>   }
>   // 2 split to "table" and "cf1,cf2"
>   //   for each table: "table:cf1,cf2" or "table"
>   String[] pair = tab.split(":");
>   String tabName = pair[0].trim();
>   if (pair.length > 2 || tabName.length() == 0) {
> LOG.error("ignore invalid tableCFs setting: " + tab);
> continue;
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HBASE-11386) Replication#table,CF config will be wrong if the table name includes namespace

2014-07-04 Thread Qianxi Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qianxi Zhang updated HBASE-11386:
-

Attachment: (was: HBASE_11386_trunk_v2.patch)

> Replication#table,CF config will be wrong if the table name includes namespace
> --
>
> Key: HBASE-11386
> URL: https://issues.apache.org/jira/browse/HBASE-11386
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
>Priority: Minor
> Attachments: HBASE_11386_trunk_v1.patch
>
>
> Now we can configure the table and CF in Replication, but I think the parsing 
> will be wrong if the table name includes a namespace.
> ReplicationPeer#parseTableCFsFromConfig(line 125)
> {code}
> Map<String, List<String>> tableCFsMap = null;
> // parse out (table, cf-list) pairs from tableCFsConfig
> // format: "table1:cf1,cf2;table2:cfA,cfB"
> String[] tables = tableCFsConfig.split(";");
> for (String tab : tables) {
>   // 1 ignore empty table config
>   tab = tab.trim();
>   if (tab.length() == 0) {
> continue;
>   }
>   // 2 split to "table" and "cf1,cf2"
>   //   for each table: "table:cf1,cf2" or "table"
>   String[] pair = tab.split(":");
>   String tabName = pair[0].trim();
>   if (pair.length > 2 || tabName.length() == 0) {
> LOG.error("ignore invalid tableCFs setting: " + tab);
> continue;
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HBASE-11386) Replication#table,CF config will be wrong if the table name includes namespace

2014-07-04 Thread Qianxi Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qianxi Zhang updated HBASE-11386:
-

Attachment: HBASE_11386_trunk_v2.patch

> Replication#table,CF config will be wrong if the table name includes namespace
> --
>
> Key: HBASE-11386
> URL: https://issues.apache.org/jira/browse/HBASE-11386
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
>Priority: Minor
> Attachments: HBASE_11386_trunk_v1.patch, HBASE_11386_trunk_v2.patch
>
>
> Now we can configure the table and CF in Replication, but I think the parsing 
> will be wrong if the table name includes a namespace.
> ReplicationPeer#parseTableCFsFromConfig(line 125)
> {code}
> Map<String, List<String>> tableCFsMap = null;
> // parse out (table, cf-list) pairs from tableCFsConfig
> // format: "table1:cf1,cf2;table2:cfA,cfB"
> String[] tables = tableCFsConfig.split(";");
> for (String tab : tables) {
>   // 1 ignore empty table config
>   tab = tab.trim();
>   if (tab.length() == 0) {
> continue;
>   }
>   // 2 split to "table" and "cf1,cf2"
>   //   for each table: "table:cf1,cf2" or "table"
>   String[] pair = tab.split(":");
>   String tabName = pair[0].trim();
>   if (pair.length > 2 || tabName.length() == 0) {
> LOG.error("ignore invalid tableCFs setting: " + tab);
> continue;
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HBASE-11386) Replication#table,CF config will be wrong if the table name includes namespace

2014-07-04 Thread Qianxi Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052764#comment-14052764
 ] 

Qianxi Zhang commented on HBASE-11386:
--

[~stack]
We'll break anyone currently using this feature? Their configuration string 
will break when this code goes in?
Yes, since the input format is changed, people will need to re-enter their 
configuration in the new format.

Does it have to be this way? Could we support old format and new?
I want to find a way to maintain backward compatibility, but I have no idea 
yet. Maybe we could offer a new interface, but I do not think that is a good 
way. If anyone has ideas, please share them.

> Replication#table,CF config will be wrong if the table name includes namespace
> --
>
> Key: HBASE-11386
> URL: https://issues.apache.org/jira/browse/HBASE-11386
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
>Priority: Minor
> Attachments: HBASE_11386_trunk_v1.patch, HBASE_11386_trunk_v2.patch
>
>
> Now we can configure the table and CF in Replication, but I think the parsing 
> will be wrong if the table name includes a namespace.
> ReplicationPeer#parseTableCFsFromConfig(line 125)
> {code}
> Map<String, List<String>> tableCFsMap = null;
> // parse out (table, cf-list) pairs from tableCFsConfig
> // format: "table1:cf1,cf2;table2:cfA,cfB"
> String[] tables = tableCFsConfig.split(";");
> for (String tab : tables) {
>   // 1 ignore empty table config
>   tab = tab.trim();
>   if (tab.length() == 0) {
> continue;
>   }
>   // 2 split to "table" and "cf1,cf2"
>   //   for each table: "table:cf1,cf2" or "table"
>   String[] pair = tab.split(":");
>   String tabName = pair[0].trim();
>   if (pair.length > 2 || tabName.length() == 0) {
> LOG.error("ignore invalid tableCFs setting: " + tab);
> continue;
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HBASE-11490) In HBase Shell set_peer_tableCFs, if the tableCFS is null, it will be wrong

2014-07-10 Thread Qianxi Zhang (JIRA)
Qianxi Zhang created HBASE-11490:


 Summary: In HBase Shell set_peer_tableCFs, if the tableCFS is 
null, it will be wrong
 Key: HBASE-11490
 URL: https://issues.apache.org/jira/browse/HBASE-11490
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.99.0
Reporter: Qianxi Zhang
Assignee: Qianxi Zhang
Priority: Minor


In HBase Shell set_peer_tableCFs, if the tableCFs is null, it will throw an NPE:

 # set all tables to be replicable for a peer
hbase> set_peer_tableCFs '1', ""
hbase> set_peer_tableCFs '1'

ReplicationAdmin#199
{code}
  public void setPeerTableCFs(String id, String tableCFs) throws 
ReplicationException {
this.replicationPeers.setPeerTableCFsConfig(id, tableCFs);
  }
{code}

ReplicationPeersZKImpl#177
{code}
  public void setPeerTableCFsConfig(String id, String tableCFsStr) throws 
ReplicationException {
try {
  if (!peerExists(id)) {
throw new IllegalArgumentException("Cannot set peer tableCFs because 
id=" + id
+ " does not exist.");
  }
  String tableCFsZKNode = getTableCFsNode(id);
  byte[] tableCFs = Bytes.toBytes(tableCFsStr);
  if (ZKUtil.checkExists(this.zookeeper, tableCFsZKNode) != -1) {
ZKUtil.setData(this.zookeeper, tableCFsZKNode, tableCFs);
  } else {
ZKUtil.createAndWatch(this.zookeeper, tableCFsZKNode, tableCFs);
  }
  LOG.info("Peer tableCFs with id= " + id + " is now " + tableCFsStr);
} catch (KeeperException e) {
  throw new ReplicationException("Unable to change tableCFs of the peer 
with id=" + id, e);
}
  }
{code}
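One possible guard, sketched here only to illustrate the NPE path (Bytes.toBytes(null) would throw); the attached patch may take a different approach:

{code}
  public void setPeerTableCFs(String id, String tableCFs) throws ReplicationException {
    // treat a null tableCFs as "replicate all tables": store an empty string
    // instead of passing null down to Bytes.toBytes()
    this.replicationPeers.setPeerTableCFsConfig(id, tableCFs == null ? "" : tableCFs);
  }
{code}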



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HBASE-11490) In HBase Shell set_peer_tableCFs, if the tableCFS is null, it will be wrong

2014-07-10 Thread Qianxi Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qianxi Zhang updated HBASE-11490:
-

Attachment: HBASE_11490_trunk_v1.patch

> In HBase Shell set_peer_tableCFs, if the tableCFS is null, it will be wrong
> ---
>
> Key: HBASE-11490
> URL: https://issues.apache.org/jira/browse/HBASE-11490
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 0.99.0
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
>Priority: Minor
> Attachments: HBASE_11490_trunk_v1.patch
>
>
> In HBase Shell set_peer_tableCFs, if the tableCFs is null, it will throw an NPE:
>  # set all tables to be replicable for a peer
> hbase> set_peer_tableCFs '1', ""
> hbase> set_peer_tableCFs '1'
> ReplicationAdmin#199
> {code}
>   public void setPeerTableCFs(String id, String tableCFs) throws 
> ReplicationException {
> this.replicationPeers.setPeerTableCFsConfig(id, tableCFs);
>   }
> {code}
> ReplicationPeersZKImpl#177
> {code}
>   public void setPeerTableCFsConfig(String id, String tableCFsStr) throws 
> ReplicationException {
> try {
>   if (!peerExists(id)) {
> throw new IllegalArgumentException("Cannot set peer tableCFs because 
> id=" + id
> + " does not exist.");
>   }
>   String tableCFsZKNode = getTableCFsNode(id);
>   byte[] tableCFs = Bytes.toBytes(tableCFsStr);
>   if (ZKUtil.checkExists(this.zookeeper, tableCFsZKNode) != -1) {
> ZKUtil.setData(this.zookeeper, tableCFsZKNode, tableCFs);
>   } else {
> ZKUtil.createAndWatch(this.zookeeper, tableCFsZKNode, tableCFs);
>   }
>   LOG.info("Peer tableCFs with id= " + id + " is now " + tableCFsStr);
> } catch (KeeperException e) {
>   throw new ReplicationException("Unable to change tableCFs of the peer 
> with id=" + id, e);
> }
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HBASE-11388) The order parameter is wrong when invoking the constructor of the ReplicationPeer In the method "getPeer" of the class ReplicationPeersZKImpl

2014-07-24 Thread Qianxi Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14073937#comment-14073937
 ] 

Qianxi Zhang commented on HBASE-11388:
--

[~jdcryans] OK, I will do it.

> The order parameter is wrong when invoking the constructor of the 
> ReplicationPeer In the method "getPeer" of the class ReplicationPeersZKImpl
> -
>
> Key: HBASE-11388
> URL: https://issues.apache.org/jira/browse/HBASE-11388
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 0.99.0, 0.98.3
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
>Priority: Minor
> Fix For: 0.99.0, 0.98.5
>
> Attachments: HBASE_11388.patch, HBASE_11388_trunk_V1.patch
>
>
> The parameters are "Configuration", "ClusterKey" and "id" in the constructor 
> of the class ReplicationPeer, but the argument order is "Configuration", 
> "id" and "ClusterKey" when the constructor of ReplicationPeer is invoked in 
> the method "getPeer" of the class ReplicationPeersZKImpl.
> ReplicationPeer#76
> {code}
>   public ReplicationPeer(Configuration conf, String key, String id) throws 
> ReplicationException {
> this.conf = conf;
> this.clusterKey = key;
> this.id = id;
> try {
>   this.reloadZkWatcher();
> } catch (IOException e) {
>   throw new ReplicationException("Error connecting to peer cluster with 
> peerId=" + id, e);
> }
>   }
> {code}
> ReplicationPeersZKImpl#498
> {code}
> ReplicationPeer peer =
> new ReplicationPeer(peerConf, peerId, 
> ZKUtil.getZooKeeperClusterKey(peerConf));
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HBASE-11388) The order parameter is wrong when invoking the constructor of the ReplicationPeer In the method "getPeer" of the class ReplicationPeersZKImpl

2014-07-24 Thread Qianxi Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14073962#comment-14073962
 ] 

Qianxi Zhang commented on HBASE-11388:
--

[~jdcryans] I think current trunk has already fixed this bug, so we can close 
this issue.

> The order parameter is wrong when invoking the constructor of the 
> ReplicationPeer In the method "getPeer" of the class ReplicationPeersZKImpl
> -
>
> Key: HBASE-11388
> URL: https://issues.apache.org/jira/browse/HBASE-11388
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 0.99.0, 0.98.3
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
>Priority: Minor
> Fix For: 0.99.0, 0.98.5
>
> Attachments: HBASE_11388.patch, HBASE_11388_trunk_V1.patch
>
>
> The parameters are "Configuration", "ClusterKey" and "id" in the constructor 
> of the class ReplicationPeer, but the argument order is "Configuration", 
> "id" and "ClusterKey" when the constructor of ReplicationPeer is invoked in 
> the method "getPeer" of the class ReplicationPeersZKImpl.
> ReplicationPeer#76
> {code}
>   public ReplicationPeer(Configuration conf, String key, String id) throws 
> ReplicationException {
> this.conf = conf;
> this.clusterKey = key;
> this.id = id;
> try {
>   this.reloadZkWatcher();
> } catch (IOException e) {
>   throw new ReplicationException("Error connecting to peer cluster with 
> peerId=" + id, e);
> }
>   }
> {code}
> ReplicationPeersZKImpl#498
> {code}
> ReplicationPeer peer =
> new ReplicationPeer(peerConf, peerId, 
> ZKUtil.getZooKeeperClusterKey(peerConf));
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)