[jira] [Commented] (HBASE-17330) SnapshotFileCache will always refresh the file cache
[ https://issues.apache.org/jira/browse/HBASE-17330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15771727#comment-15771727 ] Jianwei Cui commented on HBASE-17330: - Thanks for pointing out the mod time problem, [~stack]. I tried the patch locally as follows: 1. start a client that takes snapshots periodically; 2. make {{SnapshotFileCache#refreshCache}} log the hfile names it loads each time it is scheduled. The log shows {{SnapshotFileCache}} can load the hfiles referenced by snapshots taken before {{refreshCache}} started. However, as you mentioned, relying on the mod time is risky: the accuracy of the mod time depends on the implementation of the underlying file system, and the mod time can also be updated externally (such as by {{FSNamesystem#setTimes}}). To be safer, we could make {{SnapshotFileCache#getUnreferencedFiles}} load hfile names from the on-disk snapshots whenever a passed file is not in the in-memory cache, as:
{code}
public synchronized Iterable<FileStatus> getUnreferencedFiles(Iterable<FileStatus> files,
    final SnapshotManager snapshotManager) throws IOException {
  ...
  for (FileStatus file : files) {
    String fileName = file.getPath().getName();
    if (!refreshed && !cache.contains(fileName)) {
      // ==> Always load hfile names from the on-disk snapshots (without considering the mod time).
      refreshCache();
      refreshed = true;
    }
    if (cache.contains(fileName)) {
      continue;
    }
  ...
{code}
> SnapshotFileCache will always refresh the file cache > > > Key: HBASE-17330 > URL: https://issues.apache.org/jira/browse/HBASE-17330 > Project: HBase > Issue Type: Bug > Components: snapshots >Affects Versions: 2.0.0, 1.3.1, 0.98.23 >Reporter: Jianwei Cui >Assignee: Jianwei Cui >Priority: Minor > Fix For: 2.0.0, 1.4.0 > > Attachments: HBASE-17330-v1.patch, HBASE-17330-v2.patch > > > In {{SnapshotFileCache#refreshCache}}, {{hasChanges}} is computed as: > {code} > try { > FileStatus dirStatus = fs.getFileStatus(snapshotDir); > lastTimestamp = dirStatus.getModificationTime(); > hasChanges |= (lastTimestamp >= lastModifiedTime); // >= will make > hasChanges always be true > {code} > The {{(lastTimestamp >= lastModifiedTime)}} will make {{hasChanges}} always > be true because {{lastModifiedTime}} is updated as: > {code} > this.lastModifiedTime = lastTimestamp; > {code} > So, SnapshotFileCache will always refresh the file cache. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
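The refresh-on-miss idea proposed above can be sketched in plain Java. This is a minimal sketch with hypothetical class and field names; the {{Set}} of on-disk file names stands in for the real scan of on-disk snapshots in SnapshotFileCache:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch of the proposed getUnreferencedFiles behavior: on the first cache
// miss, rebuild the cache from the authoritative on-disk listing exactly once,
// then re-check membership. All names here are illustrative.
public class RefreshOnMissCache {
  private final Set<String> cache = new HashSet<>();
  private final Set<String> onDiskSnapshotFiles; // stands in for scanning on-disk snapshots

  public RefreshOnMissCache(Set<String> onDiskSnapshotFiles) {
    this.onDiskSnapshotFiles = onDiskSnapshotFiles;
  }

  // Always reload from the on-disk listing, ignoring directory mod times.
  private void refreshCache() {
    cache.clear();
    cache.addAll(onDiskSnapshotFiles);
  }

  public synchronized List<String> getUnreferencedFiles(List<String> files) {
    List<String> unreferenced = new ArrayList<>();
    boolean refreshed = false;
    for (String fileName : files) {
      if (!refreshed && !cache.contains(fileName)) {
        refreshCache(); // refresh at most once per call
        refreshed = true;
      }
      if (cache.contains(fileName)) {
        continue; // still referenced by some snapshot
      }
      unreferenced.add(fileName);
    }
    return unreferenced;
  }
}
```

Refreshing at most once per call keeps a batch of misses from triggering repeated on-disk scans, while a miss never produces a false "unreferenced" answer from a stale cache.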
[jira] [Commented] (HBASE-17330) SnapshotFileCache will always refresh the file cache
[ https://issues.apache.org/jira/browse/HBASE-17330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15769907#comment-15769907 ] Jianwei Cui commented on HBASE-17330: - Thanks for the review, Ted.
[jira] [Commented] (HBASE-17347) ExportSnapshot may write snapshot info file to wrong directory when specifying target name
[ https://issues.apache.org/jira/browse/HBASE-17347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15765912#comment-15765912 ] Jianwei Cui commented on HBASE-17347: - Thanks for the review [~tedyu] > ExportSnapshot may write snapshot info file to wrong directory when > specifying target name > -- > > Key: HBASE-17347 > URL: https://issues.apache.org/jira/browse/HBASE-17347 > Project: HBase > Issue Type: Bug > Components: snapshots >Affects Versions: 2.0.0 >Reporter: Jianwei Cui >Assignee: Jianwei Cui >Priority: Minor > Fix For: 2.0.0, 1.4.0 > > Attachments: HBASE-17347-v1.patch > > > ExportSnapshot will write a new snapshot info file when specifying the target > name: > {code} > if (!targetName.equals(snapshotName)) { > SnapshotDescription snapshotDesc = > SnapshotDescriptionUtils.readSnapshotInfo(inputFs, snapshotDir) > .toBuilder() > .setName(targetName) > .build(); > SnapshotDescriptionUtils.writeSnapshotInfo(snapshotDesc, > snapshotTmpDir, outputFs); > } > {code} > The snapshot info file will be written to the snapshot tmp directory; > however, it should be written directly to the snapshot directory if > {{snapshot.export.skip.tmp}} is true. In addition, owner and permission > should be set for the new snapshot info file when needed.
[jira] [Updated] (HBASE-17347) ExportSnapshot may write snapshot info file to wrong directory when specifying target name
[ https://issues.apache.org/jira/browse/HBASE-17347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jianwei Cui updated HBASE-17347: Attachment: HBASE-17347-v1.patch
[jira] [Created] (HBASE-17347) ExportSnapshot may write snapshot info file to wrong directory when specifying target name
Jianwei Cui created HBASE-17347: --- Summary: ExportSnapshot may write snapshot info file to wrong directory when specifying target name Key: HBASE-17347 URL: https://issues.apache.org/jira/browse/HBASE-17347 Project: HBase Issue Type: Bug Components: snapshots Affects Versions: 2.0.0 Reporter: Jianwei Cui Priority: Minor ExportSnapshot will write a new snapshot info file when specifying the target name:
{code}
if (!targetName.equals(snapshotName)) {
  SnapshotDescription snapshotDesc =
      SnapshotDescriptionUtils.readSnapshotInfo(inputFs, snapshotDir)
        .toBuilder()
        .setName(targetName)
        .build();
  SnapshotDescriptionUtils.writeSnapshotInfo(snapshotDesc, snapshotTmpDir, outputFs);
}
{code}
The snapshot info file will be written to the snapshot tmp directory; however, it should be written directly to the snapshot directory if {{snapshot.export.skip.tmp}} is true. In addition, owner and permission should be set for the new snapshot info file when needed.
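The directory-selection part of the fix boils down to one decision, which can be illustrated with a tiny sketch. Plain strings stand in for Hadoop Path/FileSystem, and the helper name is made up for illustration, not ExportSnapshot's actual code:

```java
// When snapshot.export.skip.tmp is true there is no tmp phase, so the
// rewritten snapshot info must go straight to the final snapshot directory;
// otherwise it goes to the tmp directory and is moved into place later.
public class SnapshotInfoDirChooser {
  static String infoFileDir(String snapshotDir, String snapshotTmpDir, boolean skipTmp) {
    return skipTmp ? snapshotDir : snapshotTmpDir;
  }
}
```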
[jira] [Commented] (HBASE-17330) SnapshotFileCache will always refresh the file cache
[ https://issues.apache.org/jira/browse/HBASE-17330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15763115#comment-15763115 ] Jianwei Cui commented on HBASE-17330: - The failed test passes locally. Could you please take a look at patch v2, [~tedyu]? Thanks.
[jira] [Assigned] (HBASE-17330) SnapshotFileCache will always refresh the file cache
[ https://issues.apache.org/jira/browse/HBASE-17330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jianwei Cui reassigned HBASE-17330: --- Assignee: Jianwei Cui
[jira] [Updated] (HBASE-17330) SnapshotFileCache will always refresh the file cache
[ https://issues.apache.org/jira/browse/HBASE-17330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jianwei Cui updated HBASE-17330: Attachment: HBASE-17330-v2.patch
[jira] [Commented] (HBASE-17330) SnapshotFileCache will always refresh the file cache
[ https://issues.apache.org/jira/browse/HBASE-17330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15757119#comment-15757119 ] Jianwei Cui commented on HBASE-17330: - Thanks for the review, [~tedyu]. As mentioned above, it seems there is no need to consider the modification time of the tmp directory in {{SnapshotFileCache#refreshCache}}. Then {{hasChanges}} could simply be computed as {{fs.getFileStatus(snapshotDir).getModificationTime() > lastModifiedTime}}.
[jira] [Commented] (HBASE-17330) SnapshotFileCache will always refresh the file cache
[ https://issues.apache.org/jira/browse/HBASE-17330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15756766#comment-15756766 ] Jianwei Cui commented on HBASE-17330: - In {{SnapshotFileCache#refreshCache}}, the modification time of the snapshot tmp directory is also considered:
{code}
// get the status of the snapshots temporary directory and check if it has changes
// The top-level directory timestamp is not updated, so we have to check the inner-level.
try {
  Path snapshotTmpDir = new Path(snapshotDir, SnapshotDescriptionUtils.SNAPSHOT_TMP_DIR_NAME);
  FileStatus tempDirStatus = fs.getFileStatus(snapshotTmpDir);
  lastTimestamp = Math.min(lastTimestamp, tempDirStatus.getModificationTime());
  hasChanges |= (lastTimestamp >= lastModifiedTime);
  ...
} catch (FileNotFoundException e) {
  // Nothing todo, if the tmp dir is empty
}
{code}
It seems the in-progress snapshots under the tmp directory are no longer loaded in {{SnapshotFileCache#refreshCache}} after [HBASE-12627|https://issues.apache.org/jira/browse/HBASE-12627], so is there any need to consider the modification time of the tmp directory in {{SnapshotFileCache#refreshCache}}?
[jira] [Updated] (HBASE-17330) SnapshotFileCache will always refresh the file cache
[ https://issues.apache.org/jira/browse/HBASE-17330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jianwei Cui updated HBASE-17330: Attachment: HBASE-17330-v1.patch
[jira] [Created] (HBASE-17330) SnapshotFileCache will always refresh the file cache
Jianwei Cui created HBASE-17330: --- Summary: SnapshotFileCache will always refresh the file cache Key: HBASE-17330 URL: https://issues.apache.org/jira/browse/HBASE-17330 Project: HBase Issue Type: Bug Components: snapshots Affects Versions: 0.98.23, 2.0.0, 1.3.1 Reporter: Jianwei Cui Priority: Minor In {{SnapshotFileCache#refreshCache}}, {{hasChanges}} is computed as:
{code}
try {
  FileStatus dirStatus = fs.getFileStatus(snapshotDir);
  lastTimestamp = dirStatus.getModificationTime();
  hasChanges |= (lastTimestamp >= lastModifiedTime); // >= will make hasChanges always be true
{code}
The {{(lastTimestamp >= lastModifiedTime)}} will make {{hasChanges}} always be true because {{lastModifiedTime}} is updated as:
{code}
this.lastModifiedTime = lastTimestamp;
{code}
So, SnapshotFileCache will always refresh the file cache.
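The effect of the {{>=}} comparison can be shown with a minimal simulation in plain Java (no HBase types; method names are illustrative):

```java
// After a refresh, lastModifiedTime is set to the directory's mod time, so on
// the next cycle an *unchanged* directory compares equal. With >= that still
// triggers a refresh every time; with strict > the unchanged case is skipped
// while a real change (a newer mod time) is still detected.
public class ModTimeCheck {
  static boolean needsRefreshGe(long dirModTime, long lastModifiedTime) {
    return dirModTime >= lastModifiedTime; // current (buggy) comparison
  }

  static boolean needsRefreshGt(long dirModTime, long lastModifiedTime) {
    return dirModTime > lastModifiedTime; // proposed comparison
  }
}
```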
[jira] [Commented] (HBASE-15616) CheckAndMutate will encounter NPE if qualifier to check is null
[ https://issues.apache.org/jira/browse/HBASE-15616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15695572#comment-15695572 ] Jianwei Cui commented on HBASE-15616: - Yes, an empty string works well. I think all operations should respond consistently when a null qualifier is passed, so we could either allow a null qualifier and convert null to an empty string internally for all operations, or throw an exception whenever users pass null. It seems a null qualifier is allowed for Put/Get/Scan/Append, and users may already rely on null qualifiers in these operations, so we probably also need to allow a null qualifier for checkAndMutate and increment. > CheckAndMutate will encounter NPE if qualifier to check is null > -- > > Key: HBASE-15616 > URL: https://issues.apache.org/jira/browse/HBASE-15616 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 2.0.0 >Reporter: Jianwei Cui >Assignee: Jianwei Cui > Attachments: HBASE-15616-v1.patch, HBASE-15616-v2.patch > > > If qualifier to check is null, the checkAndMutate/checkAndPut/checkAndDelete > will encounter NPE.
> The test code: > {code} > table.checkAndPut(row, family, null, Bytes.toBytes(0), new > Put(row).addColumn(family, null, Bytes.toBytes(1))); > {code} > The exception: > {code} > Exception in thread "main" > org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after > attempts=3, exceptions: > Fri Apr 08 15:51:31 CST 2016, > RpcRetryingCaller{globalStartTime=1460101891615, pause=100, maxAttempts=3}, > java.io.IOException: com.google.protobuf.ServiceException: > java.lang.NullPointerException > Fri Apr 08 15:51:31 CST 2016, > RpcRetryingCaller{globalStartTime=1460101891615, pause=100, maxAttempts=3}, > java.io.IOException: com.google.protobuf.ServiceException: > java.lang.NullPointerException > Fri Apr 08 15:51:32 CST 2016, > RpcRetryingCaller{globalStartTime=1460101891615, pause=100, maxAttempts=3}, > java.io.IOException: com.google.protobuf.ServiceException: > java.lang.NullPointerException > at > org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:120) > at org.apache.hadoop.hbase.client.HTable.checkAndPut(HTable.java:772) > at ... > Caused by: java.io.IOException: com.google.protobuf.ServiceException: > java.lang.NullPointerException > at > org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:341) > at org.apache.hadoop.hbase.client.HTable$7.call(HTable.java:768) > at org.apache.hadoop.hbase.client.HTable$7.call(HTable.java:755) > at > org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:99) > ... 
2 more > Caused by: com.google.protobuf.ServiceException: > java.lang.NullPointerException > at > org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:239) > at > org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:331) > at > org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.mutate(ClientProtos.java:35252) > at org.apache.hadoop.hbase.client.HTable$7.call(HTable.java:765) > ... 4 more > Caused by: java.lang.NullPointerException > at com.google.protobuf.LiteralByteString.size(LiteralByteString.java:76) > at > com.google.protobuf.CodedOutputStream.computeBytesSizeNoTag(CodedOutputStream.java:767) > at > com.google.protobuf.CodedOutputStream.computeBytesSize(CodedOutputStream.java:539) > at > org.apache.hadoop.hbase.protobuf.generated.ClientProtos$Condition.getSerializedSize(ClientProtos.java:7483) > at > com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749) > at > com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530) > at > org.apache.hadoop.hbase.protobuf.generated.ClientProtos$MutateRequest.getSerializedSize(ClientProtos.java:12431) > at > org.apache.hadoop.hbase.ipc.IPCUtil.getTotalSizeWhenWrittenDelimited(IPCUtil.java:311) > at > org.apache.hadoop.hbase.ipc.AsyncRpcChannel.writeRequest(AsyncRpcChannel.java:409) > at > org.apache.hadoop.hbase.ipc.AsyncRpcChannel.callMethod(AsyncRpcChannel.java:333) > at > org.apache.hadoop.hbase.ipc.AsyncRpcClient.call(AsyncRpcClient.java:245) > at > org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:226) > ... 7 more > {code} > The reason is {{LiteralByteString.size()}} will throw NPE if wrapped byte > array is null. It is possible to invoke {{put}} and {{checkAndMutate}} on the > same column, because null qualifier is allowed for {{Put}}, users may be > confused if
[jira] [Commented] (HBASE-15616) CheckAndMutate will encounter NPE if qualifier to check is null
[ https://issues.apache.org/jira/browse/HBASE-15616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15686690#comment-15686690 ] Jianwei Cui commented on HBASE-15616: - Sorry for the late reply, [~anoop.hbase], and thanks for your questions. bq. So we can do this way of not setting qualifier on PB when qualifier is null? Do we need pass empty string to be set? The mentioned code is defined in AccessControlUtil.java? The {{permissionBuilder}} is the builder of the proto message {{TablePermission}}, and the qualifier is optional in the pb definition of TablePermission. When the qualifier is not set, it has a clear meaning: the permission is granted on any column of the family. On the other hand, checkAndMutate must check a specific column, so the qualifier field is required in the pb definition of Condition:
{code}
message Condition {
  required bytes row = 1;
  required bytes family = 2;
  required bytes qualifier = 3;
  ...
{code}
A null qualifier is also a legal column, so passing an empty string seems clearer in this situation. bq. There are some other places in RequestConverter, we are doing this setQualifier(ByteStringer.wrap(qualifier)) See buildIncrementRequest() eg. HTable provides two ways to do increment:
{code}
public long incrementColumnValue(final byte [] row, final byte [] family,
    final byte [] qualifier, final long amount)
public Result increment(final Increment increment)
{code}
The first method checks the qualifier and throws NPE before issuing the request to the server if the qualifier is null, so we can't use it to increment a null qualifier column. In the second method, {{Increment}} provides two ways to add a column:
{code}
public Increment addColumn(byte [] family, byte [] qualifier, long amount)
public Increment add(Cell cell)
{code}
{{addColumn}} also checks that the qualifier is not null, but {{add(Cell cell)}} does no such check, so we can increment a null qualifier column as:
{code}
Increment incr = new Increment(Bytes.toBytes("row"));
KeyValue kv = new KeyValue(Bytes.toBytes("row"), Bytes.toBytes("C"), null,
    HConstants.LATEST_TIMESTAMP, KeyValue.Type.Put, Bytes.toBytes(1l));
incr.add(kv);
table.increment(incr);
{code}
Therefore, the increment methods of HTable behave differently when the qualifier is null, which is confusing. I think a null qualifier is legal in HBase, so it should be allowed consistently across the increment methods, and we could also pass an empty string for a null qualifier in buildIncrementRequest(). What do you think, [~anoop.hbase]? Thanks!
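The null-to-empty normalization discussed above could be done once at the client boundary; a minimal sketch in plain Java with a hypothetical helper name, not HBase's actual code:

```java
// Normalizing a null qualifier to the empty byte array gives
// Put/Get/Increment/checkAndMutate a consistent view of the "null qualifier"
// column, and the empty array can always be serialized into the protobuf
// Condition's required qualifier field (unlike null, which caused the NPE).
public class QualifierNormalizer {
  static final byte[] EMPTY = new byte[0];

  static byte[] normalize(byte[] qualifier) {
    return qualifier == null ? EMPTY : qualifier;
  }
}
```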
[jira] [Commented] (HBASE-17026) VerifyReplication log should distinguish whether good row key is result of revalidation
[ https://issues.apache.org/jira/browse/HBASE-17026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15642957#comment-15642957 ] Jianwei Cui commented on HBASE-17026: - The patch looks good to me [~tedyu]. BTW, because the rowkey may be an unreadable binary array, do we need to use 'Bytes.toStringBinary(...)' to print the rowkey? > VerifyReplication log should distinguish whether good row key is result of > revalidation > --- > > Key: HBASE-17026 > URL: https://issues.apache.org/jira/browse/HBASE-17026 > Project: HBase > Issue Type: Improvement >Reporter: Ted Yu >Assignee: Ted Yu >Priority: Minor > Attachments: 17026.v1.txt > > > Inspecting app log from VerifyReplication, I saw lines in the following form: > {code} > 2016-11-03 15:28:44,877 INFO [main] > org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication: Good row > key: X000 X > {code} > where 'X' is the delimiter. > Without line number, it is difficult to tell whether the good row has gone > through revalidation. > This issue is to distinguish the two logs.
[jira] [Commented] (HBASE-16771) VerifyReplication should increase GOODROWS counter if re-comparison passes
[ https://issues.apache.org/jira/browse/HBASE-16771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15561132#comment-15561132 ] Jianwei Cui commented on HBASE-16771: - [~tedyu], patch v3 looks good to me, thanks for the fix. > VerifyReplication should increase GOODROWS counter if re-comparison passes > -- > > Key: HBASE-16771 > URL: https://issues.apache.org/jira/browse/HBASE-16771 > Project: HBase > Issue Type: Bug >Reporter: Ted Yu >Assignee: Ted Yu > Fix For: 2.0.0, 1.4.0 > > Attachments: 16771.v1.txt, 16771.v2.txt, 16771.v3.txt > > > HBASE-16423 added re-comparison feature to reduce false positive rate. > However, before logFailRowAndIncreaseCounter() is called, GOODROWS counter is > not incremented. Neither is GOODROWS incremented when re-comparison passes. > This may produce inconsistent results across multiple runs of the same > verifyrep command.
[jira] [Commented] (HBASE-16771) VerifyReplication should increase GOODROWS counter if re-comparison passes
[ https://issues.apache.org/jira/browse/HBASE-16771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15561049#comment-15561049 ] Jianwei Cui commented on HBASE-16771: - Patch v2 looks good to me. Btw, in Verifier#map:
{code}
if (rowCmpRet == 0) {
  // rowkey is same, need to compare the content of the row
  try {
    Result.compareResults(value, currentCompareRowInPeerTable);
    context.getCounter(Counters.GOODROWS).increment(1);
    if (verbose) {
      LOG.info("Good row key: " + delimiter + Bytes.toString(value.getRow()) + delimiter);
    }
  } catch (Exception e) {
    logFailRowAndIncreaseCounter(context, Counters.CONTENT_DIFFERENT_ROWS, value);
    LOG.error("Exception while comparing row : " + e); // ==> unnecessary to log an exception
  }
{code}
An exception message is logged whenever the values differ for the same rowkey. The row may still turn out to be good on re-check, and if not, {{logFailRowAndIncreaseCounter}} already logs an error message for it, so is it unnecessary to log an exception here?
[jira] [Commented] (HBASE-16762) NullPointerException is thrown when constructing sourceTable in verifyrep
[ https://issues.apache.org/jira/browse/HBASE-16762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15561033#comment-15561033 ] Jianwei Cui commented on HBASE-16762: - [~tedyu], patch looks good to me, thanks for the fix. > NullPointerException is thrown when constructing sourceTable in verifyrep > - > > Key: HBASE-16762 > URL: https://issues.apache.org/jira/browse/HBASE-16762 > Project: HBase > Issue Type: Bug >Reporter: Ted Yu >Assignee: Ted Yu > Attachments: 16762.branch-1.txt > > > Branch-1 patch for HBASE-16423 incorrectly constructed sourceTable, leading > to the following exception: > {code} > 16/10/04 17:00:30 INFO mapreduce.Job: Task Id : > attempt_1473183665588_0082_m_16_1, Status : FAILED > Error: java.lang.NullPointerException > at org.apache.hadoop.hbase.TableName.valueOf(TableName.java:436) > at org.apache.hadoop.hbase.client.HTable.(HTable.java:150) > at > org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication$Verifier.map(VerifyReplication.java:128) > > at > org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication$Verifier.map(VerifyReplication.java:86) > > at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146) > at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787) > {code} > I checked master patch where there is no such bug
[jira] [Commented] (HBASE-16423) Add re-compare option to VerifyReplication to avoid occasional inconsistent rows
[ https://issues.apache.org/jira/browse/HBASE-16423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15515650#comment-15515650 ] Jianwei Cui commented on HBASE-16423: - Thanks for the review [~tedyu], could you please take a look at the patch for branch-1? > Add re-compare option to VerifyReplication to avoid occasional inconsistent > rows > > > Key: HBASE-16423 > URL: https://issues.apache.org/jira/browse/HBASE-16423 > Project: HBase > Issue Type: Improvement > Components: Replication >Affects Versions: 2.0.0 >Reporter: Jianwei Cui >Assignee: Jianwei Cui >Priority: Minor > Fix For: 2.0.0, 1.4.0 > > Attachments: HBASE-16423-branch-1-v1.patch, HBASE-16423-v1.patch, > HBASE-16423-v2.patch, HBASE-16423-v3.patch > > > Because replication keeps eventually consistency, VerifyReplication may > report inconsistent rows if there are data being written to source or peer > clusters during scanning. These occasionally inconsistent rows will have the > same data if we do the comparison again after a short period. It is not easy > to find the really inconsistent rows if VerifyReplication report a large > number of such occasionally inconsistency. To avoid this case, we can add an > option to make VerifyReplication read out the inconsistent rows again after > sleeping a few seconds and re-compare the rows during scanning. This behavior > follows the eventually consistency of hbase's replication. Suggestions and > discussions are welcomed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16423) Add re-compare option to VerifyReplication to avoid occasional inconsistent rows
[ https://issues.apache.org/jira/browse/HBASE-16423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jianwei Cui updated HBASE-16423: Attachment: HBASE-16423-branch-1-v1.patch patch for branch-1. > Add re-compare option to VerifyReplication to avoid occasional inconsistent > rows > > > Key: HBASE-16423 > URL: https://issues.apache.org/jira/browse/HBASE-16423 > Project: HBase > Issue Type: Improvement > Components: Replication >Affects Versions: 2.0.0 >Reporter: Jianwei Cui >Assignee: Jianwei Cui >Priority: Minor > Fix For: 2.0.0, 1.4.0 > > Attachments: HBASE-16423-branch-1-v1.patch, HBASE-16423-v1.patch, > HBASE-16423-v2.patch, HBASE-16423-v3.patch > > > Because replication keeps eventually consistency, VerifyReplication may > report inconsistent rows if there are data being written to source or peer > clusters during scanning. These occasionally inconsistent rows will have the > same data if we do the comparison again after a short period. It is not easy > to find the really inconsistent rows if VerifyReplication report a large > number of such occasionally inconsistency. To avoid this case, we can add an > option to make VerifyReplication read out the inconsistent rows again after > sleeping a few seconds and re-compare the rows during scanning. This behavior > follows the eventually consistency of hbase's replication. Suggestions and > discussions are welcomed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16423) Add re-compare option to VerifyReplication to avoid occasional inconsistent rows
[ https://issues.apache.org/jira/browse/HBASE-16423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jianwei Cui updated HBASE-16423: Attachment: HBASE-16423-v3.patch trunk patch v3 to fix whitespace. > Add re-compare option to VerifyReplication to avoid occasional inconsistent > rows > > > Key: HBASE-16423 > URL: https://issues.apache.org/jira/browse/HBASE-16423 > Project: HBase > Issue Type: Improvement > Components: Replication >Affects Versions: 2.0.0 >Reporter: Jianwei Cui >Assignee: Jianwei Cui >Priority: Minor > Fix For: 2.0.0, 1.4.0 > > Attachments: HBASE-16423-v1.patch, HBASE-16423-v2.patch, > HBASE-16423-v3.patch > > > Because replication keeps eventually consistency, VerifyReplication may > report inconsistent rows if there are data being written to source or peer > clusters during scanning. These occasionally inconsistent rows will have the > same data if we do the comparison again after a short period. It is not easy > to find the really inconsistent rows if VerifyReplication report a large > number of such occasionally inconsistency. To avoid this case, we can add an > option to make VerifyReplication read out the inconsistent rows again after > sleeping a few seconds and re-compare the rows during scanning. This behavior > follows the eventually consistency of hbase's replication. Suggestions and > discussions are welcomed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16423) Add re-compare option to VerifyReplication to avoid occasional inconsistent rows
[ https://issues.apache.org/jira/browse/HBASE-16423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15513317#comment-15513317 ] Jianwei Cui commented on HBASE-16423: - Thanks for the review [~tedyu], upload patch v2. > Add re-compare option to VerifyReplication to avoid occasional inconsistent > rows > > > Key: HBASE-16423 > URL: https://issues.apache.org/jira/browse/HBASE-16423 > Project: HBase > Issue Type: Improvement > Components: Replication >Affects Versions: 2.0.0 >Reporter: Jianwei Cui >Priority: Minor > Attachments: HBASE-16423-v1.patch, HBASE-16423-v2.patch > > > Because replication keeps eventually consistency, VerifyReplication may > report inconsistent rows if there are data being written to source or peer > clusters during scanning. These occasionally inconsistent rows will have the > same data if we do the comparison again after a short period. It is not easy > to find the really inconsistent rows if VerifyReplication report a large > number of such occasionally inconsistency. To avoid this case, we can add an > option to make VerifyReplication read out the inconsistent rows again after > sleeping a few seconds and re-compare the rows during scanning. This behavior > follows the eventually consistency of hbase's replication. Suggestions and > discussions are welcomed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16423) Add re-compare option to VerifyReplication to avoid occasional inconsistent rows
[ https://issues.apache.org/jira/browse/HBASE-16423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jianwei Cui updated HBASE-16423: Attachment: HBASE-16423-v2.patch > Add re-compare option to VerifyReplication to avoid occasional inconsistent > rows > > > Key: HBASE-16423 > URL: https://issues.apache.org/jira/browse/HBASE-16423 > Project: HBase > Issue Type: Improvement > Components: Replication >Affects Versions: 2.0.0 >Reporter: Jianwei Cui >Priority: Minor > Attachments: HBASE-16423-v1.patch, HBASE-16423-v2.patch > > > Because replication keeps eventually consistency, VerifyReplication may > report inconsistent rows if there are data being written to source or peer > clusters during scanning. These occasionally inconsistent rows will have the > same data if we do the comparison again after a short period. It is not easy > to find the really inconsistent rows if VerifyReplication report a large > number of such occasionally inconsistency. To avoid this case, we can add an > option to make VerifyReplication read out the inconsistent rows again after > sleeping a few seconds and re-compare the rows during scanning. This behavior > follows the eventually consistency of hbase's replication. Suggestions and > discussions are welcomed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16423) Add re-compare option to VerifyReplication to avoid occasional inconsistent rows
[ https://issues.apache.org/jira/browse/HBASE-16423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jianwei Cui updated HBASE-16423: Attachment: HBASE-16423-v1.patch > Add re-compare option to VerifyReplication to avoid occasional inconsistent > rows > > > Key: HBASE-16423 > URL: https://issues.apache.org/jira/browse/HBASE-16423 > Project: HBase > Issue Type: Improvement > Components: Replication >Affects Versions: 2.0.0 >Reporter: Jianwei Cui >Priority: Minor > Attachments: HBASE-16423-v1.patch > > > Because replication keeps eventually consistency, VerifyReplication may > report inconsistent rows if there are data being written to source or peer > clusters during scanning. These occasionally inconsistent rows will have the > same data if we do the comparison again after a short period. It is not easy > to find the really inconsistent rows if VerifyReplication report a large > number of such occasionally inconsistency. To avoid this case, we can add an > option to make VerifyReplication read out the inconsistent rows again after > sleeping a few seconds and re-compare the rows during scanning. This behavior > follows the eventually consistency of hbase's replication. Suggestions and > discussions are welcomed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16423) Add re-compare option to VerifyReplication to avoid occasional inconsistent rows
[ https://issues.apache.org/jira/browse/HBASE-16423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15423733#comment-15423733 ] Jianwei Cui commented on HBASE-16423: - [~churromorales], there are cases that can change the versions within [startTime, endTime) while VerifyReplication is scanning. For example, if new versions are written to the source cluster, the total number of versions may exceed the family's max versions, so compaction can delete versions within [startTime, endTime); the compaction may happen at different times in the source and peer clusters, making VerifyReplication report inconsistent rows. > Add re-compare option to VerifyReplication to avoid occasional inconsistent > rows > > > Key: HBASE-16423 > URL: https://issues.apache.org/jira/browse/HBASE-16423 > Project: HBase > Issue Type: Improvement > Components: Replication >Affects Versions: 2.0.0 >Reporter: Jianwei Cui >Priority: Minor > > Because replication keeps eventually consistency, VerifyReplication may > report inconsistent rows if there are data being written to source or peer > clusters during scanning. These occasionally inconsistent rows will have the > same data if we do the comparison again after a short period. It is not easy > to find the really inconsistent rows if VerifyReplication report a large > number of such occasionally inconsistency. To avoid this case, we can add an > option to make VerifyReplication read out the inconsistent rows again after > sleeping a few seconds and re-compare the rows during scanning. This behavior > follows the eventually consistency of hbase's replication. Suggestions and > discussions are welcomed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
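The version-retention effect described in that comment can be sketched with a tiny in-memory model (a toy illustration, not HBase code): a cell keeping at most {{maxVersions}} versions drops its oldest versions as new writes arrive, so a version whose timestamp falls inside [startTime, endTime) can disappear, and it may disappear at different times on the source and peer clusters.

```java
import java.util.TreeMap;

// Toy model of per-cell version retention under a maxVersions limit; not HBase code.
public class VersionRetention {
    private final int maxVersions;
    // timestamp -> value; TreeMap keeps versions ordered, oldest first
    private final TreeMap<Long, String> versions = new TreeMap<>();

    VersionRetention(int maxVersions) { this.maxVersions = maxVersions; }

    void put(long ts, String value) {
        versions.put(ts, value);
        // "compaction": discard the oldest versions beyond maxVersions
        while (versions.size() > maxVersions) {
            versions.pollFirstEntry();
        }
    }

    boolean hasVersionAt(long ts) { return versions.containsKey(ts); }
}
```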
[jira] [Created] (HBASE-16423) Add re-compare option to VerifyReplication to avoid occasional inconsistent rows
Jianwei Cui created HBASE-16423: --- Summary: Add re-compare option to VerifyReplication to avoid occasional inconsistent rows Key: HBASE-16423 URL: https://issues.apache.org/jira/browse/HBASE-16423 Project: HBase Issue Type: Improvement Components: Replication Affects Versions: 2.0.0 Reporter: Jianwei Cui Priority: Minor Because replication keeps eventually consistency, VerifyReplication may report inconsistent rows if there are data being written to source or peer clusters during scanning. These occasionally inconsistent rows will have the same data if we do the comparison again after a short period. It is not easy to find the really inconsistent rows if VerifyReplication report a large number of such occasionally inconsistency. To avoid this case, we can add an option to make VerifyReplication read out the inconsistent rows again after sleeping a few seconds and re-compare the rows during scanning. This behavior follows the eventually consistency of hbase's replication. Suggestions and discussions are welcomed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-15626) RetriesExhaustedWithDetailsException#getDesc won't return the full message
[ https://issues.apache.org/jira/browse/HBASE-15626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15259345#comment-15259345 ] Jianwei Cui commented on HBASE-15626: - Same problem as [HBASE-15710|https://issues.apache.org/jira/browse/HBASE-15710]. > RetriesExhaustedWithDetailsException#getDesc won't return the full message > -- > > Key: HBASE-15626 > URL: https://issues.apache.org/jira/browse/HBASE-15626 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 2.0.0 >Reporter: Jianwei Cui >Priority: Minor > Attachments: HBASE-15626-v1.patch > > > The RetriesExhaustedWithDetailsException#getDesc will include server > addresses as: > {code} > public static String getDesc(List<Throwable> exceptions, >List<? extends Row> actions, >List<String> hostnamePort) { > String s = getDesc(classifyExs(exceptions)); > StringBuilder addrs = new StringBuilder(s); > addrs.append("servers with issues: "); > Set<String> uniqAddr = new HashSet<String>(); > uniqAddr.addAll(hostnamePort); > for(String addr : uniqAddr) { > addrs.append(addr).append(", "); > } > return s; // ==> should be addrs.toString() > } > {code} > However, the returned value is {{s}}, which only includes the exceptions. To > include the server addresses, the returned value should be > {{addrs.toString()}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-15626) RetriesExhaustedWithDetailsException#getDesc won't return the full message
[ https://issues.apache.org/jira/browse/HBASE-15626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jianwei Cui updated HBASE-15626: Attachment: HBASE-15626-v1.patch > RetriesExhaustedWithDetailsException#getDesc won't return the full message > -- > > Key: HBASE-15626 > URL: https://issues.apache.org/jira/browse/HBASE-15626 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 2.0.0 >Reporter: Jianwei Cui >Priority: Minor > Attachments: HBASE-15626-v1.patch > > > The RetriesExhaustedWithDetailsException#getDesc will include server > addresses as: > {code} > public static String getDesc(List<Throwable> exceptions, >List<? extends Row> actions, >List<String> hostnamePort) { > String s = getDesc(classifyExs(exceptions)); > StringBuilder addrs = new StringBuilder(s); > addrs.append("servers with issues: "); > Set<String> uniqAddr = new HashSet<String>(); > uniqAddr.addAll(hostnamePort); > for(String addr : uniqAddr) { > addrs.append(addr).append(", "); > } > return s; // ==> should be addrs.toString() > } > {code} > However, the returned value is {{s}}, which only includes the exceptions. To > include the server addresses, the returned value should be > {{addrs.toString()}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-15626) RetriesExhaustedWithDetailsException#getDesc won't return the full message
Jianwei Cui created HBASE-15626: --- Summary: RetriesExhaustedWithDetailsException#getDesc won't return the full message Key: HBASE-15626 URL: https://issues.apache.org/jira/browse/HBASE-15626 Project: HBase Issue Type: Bug Components: Client Affects Versions: 2.0.0 Reporter: Jianwei Cui Priority: Minor The RetriesExhaustedWithDetailsException#getDesc will include server addresses as: {code} public static String getDesc(List<Throwable> exceptions, List<? extends Row> actions, List<String> hostnamePort) { String s = getDesc(classifyExs(exceptions)); StringBuilder addrs = new StringBuilder(s); addrs.append("servers with issues: "); Set<String> uniqAddr = new HashSet<String>(); uniqAddr.addAll(hostnamePort); for(String addr : uniqAddr) { addrs.append(addr).append(", "); } return s; // ==> should be addrs.toString() } {code} However, the returned value is {{s}}, which only includes the exceptions. To include the server addresses, the returned value should be {{addrs.toString()}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
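A simplified standalone version of the fix (the real method builds the summary via {{classifyExs}}; here that summary is passed in as a plain string, so this is a sketch, not the full HBase method):

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Simplified sketch of the corrected getDesc; the exception summary is a
// plain string here instead of the classified exception list.
public class GetDescFix {
    static String getDesc(String exceptionSummary, List<String> hostnamePort) {
        StringBuilder addrs = new StringBuilder(exceptionSummary);
        addrs.append("servers with issues: ");
        Set<String> uniqAddr = new HashSet<>(hostnamePort); // de-duplicate addresses
        for (String addr : uniqAddr) {
            addrs.append(addr).append(", ");
        }
        return addrs.toString(); // the fix: return the full buffer, not just the summary
    }
}
```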
[jira] [Updated] (HBASE-15616) CheckAndMutate will encouter NPE if qualifier to check is null
[ https://issues.apache.org/jira/browse/HBASE-15616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jianwei Cui updated HBASE-15616: Attachment: HBASE-15616-v2.patch Add unit test for null qualifier > CheckAndMutate will encouter NPE if qualifier to check is null > -- > > Key: HBASE-15616 > URL: https://issues.apache.org/jira/browse/HBASE-15616 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 2.0.0 >Reporter: Jianwei Cui >Assignee: Jianwei Cui > Attachments: HBASE-15616-v1.patch, HBASE-15616-v2.patch > > > If qualifier to check is null, the checkAndMutate/checkAndPut/checkAndDelete > will encounter NPE. > The test code: > {code} > table.checkAndPut(row, family, null, Bytes.toBytes(0), new > Put(row).addColumn(family, null, Bytes.toBytes(1))); > {code} > The exception: > {code} > Exception in thread "main" > org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after > attempts=3, exceptions: > Fri Apr 08 15:51:31 CST 2016, > RpcRetryingCaller{globalStartTime=1460101891615, pause=100, maxAttempts=3}, > java.io.IOException: com.google.protobuf.ServiceException: > java.lang.NullPointerException > Fri Apr 08 15:51:31 CST 2016, > RpcRetryingCaller{globalStartTime=1460101891615, pause=100, maxAttempts=3}, > java.io.IOException: com.google.protobuf.ServiceException: > java.lang.NullPointerException > Fri Apr 08 15:51:32 CST 2016, > RpcRetryingCaller{globalStartTime=1460101891615, pause=100, maxAttempts=3}, > java.io.IOException: com.google.protobuf.ServiceException: > java.lang.NullPointerException > at > org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:120) > at org.apache.hadoop.hbase.client.HTable.checkAndPut(HTable.java:772) > at ... 
> Caused by: java.io.IOException: com.google.protobuf.ServiceException: > java.lang.NullPointerException > at > org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:341) > at org.apache.hadoop.hbase.client.HTable$7.call(HTable.java:768) > at org.apache.hadoop.hbase.client.HTable$7.call(HTable.java:755) > at > org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:99) > ... 2 more > Caused by: com.google.protobuf.ServiceException: > java.lang.NullPointerException > at > org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:239) > at > org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:331) > at > org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.mutate(ClientProtos.java:35252) > at org.apache.hadoop.hbase.client.HTable$7.call(HTable.java:765) > ... 4 more > Caused by: java.lang.NullPointerException > at com.google.protobuf.LiteralByteString.size(LiteralByteString.java:76) > at > com.google.protobuf.CodedOutputStream.computeBytesSizeNoTag(CodedOutputStream.java:767) > at > com.google.protobuf.CodedOutputStream.computeBytesSize(CodedOutputStream.java:539) > at > org.apache.hadoop.hbase.protobuf.generated.ClientProtos$Condition.getSerializedSize(ClientProtos.java:7483) > at > com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749) > at > com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530) > at > org.apache.hadoop.hbase.protobuf.generated.ClientProtos$MutateRequest.getSerializedSize(ClientProtos.java:12431) > at > org.apache.hadoop.hbase.ipc.IPCUtil.getTotalSizeWhenWrittenDelimited(IPCUtil.java:311) > at > org.apache.hadoop.hbase.ipc.AsyncRpcChannel.writeRequest(AsyncRpcChannel.java:409) > at > org.apache.hadoop.hbase.ipc.AsyncRpcChannel.callMethod(AsyncRpcChannel.java:333) > at > 
org.apache.hadoop.hbase.ipc.AsyncRpcClient.call(AsyncRpcClient.java:245) > at > org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:226) > ... 7 more > {code} > The reason is {{LiteralByteString.size()}} will throw NPE if wrapped byte > array is null. It is possible to invoke {{put}} and {{checkAndMutate}} on the > same column, because null qualifier is allowed for {{Put}}, users may be > confused if null qualifier is not allowed for {{checkAndMutate}}. We can also > convert null qualifier to empty byte array for {{checkAndMutate}} in client > side. Discussions and suggestions are welcomed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
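The client-side conversion suggested in that description could look like the following sketch (illustrative only, not the actual patch; HBase has an {{HConstants.EMPTY_BYTE_ARRAY}} constant, but this standalone version defines its own): normalize a null qualifier to an empty byte array before it is wrapped into a protobuf ByteString, which cannot wrap null.

```java
// Sketch of the suggested client-side normalization; not the actual patch.
public class QualifierNormalizer {
    static final byte[] EMPTY_BYTE_ARRAY = new byte[0];

    /** protobuf's ByteString cannot wrap null, so map a null qualifier to empty. */
    static byte[] normalize(byte[] qualifier) {
        return qualifier == null ? EMPTY_BYTE_ARRAY : qualifier;
    }
}
```

This keeps the client behavior consistent with {{Put}}, where a null qualifier is already accepted.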
[jira] [Commented] (HBASE-15497) Incorrect javadoc for atomicity guarantee of Increment and Append
[ https://issues.apache.org/jira/browse/HBASE-15497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15233922#comment-15233922 ] Jianwei Cui commented on HBASE-15497: - Can anyone help to review the patch? Thanks:) > Incorrect javadoc for atomicity guarantee of Increment and Append > - > > Key: HBASE-15497 > URL: https://issues.apache.org/jira/browse/HBASE-15497 > Project: HBase > Issue Type: Bug > Components: documentation >Affects Versions: 2.0.0 >Reporter: Jianwei Cui >Priority: Minor > Attachments: HBASE-15497-v1.patch > > > At the front of {{Increment.java}} file, there is comment about read > atomicity: > {code} > * This operation does not appear atomic to readers. Increments are done > * under a single row lock, so write operations to a row are synchronized, but > * readers do not take row locks so get and scan operations can see this > * operation partially completed. > {code} > It seems this comment is not true after MVCC integrated > [HBASE-4583|https://issues.apache.org/jira/browse/HBASE-4583]. Currently, the > readers can be guaranteed to read the whole result of Increment if I am not > wrong. Similar comments also exist in {{Append.java}}, {{Table#append(...)}} > and {{Table#increment(...)}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-15497) Incorrect javadoc for atomicity guarantee of Increment and Append
[ https://issues.apache.org/jira/browse/HBASE-15497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jianwei Cui updated HBASE-15497: Attachment: HBASE-15497-v1.patch > Incorrect javadoc for atomicity guarantee of Increment and Append > - > > Key: HBASE-15497 > URL: https://issues.apache.org/jira/browse/HBASE-15497 > Project: HBase > Issue Type: Bug > Components: documentation >Affects Versions: 2.0.0 >Reporter: Jianwei Cui >Priority: Minor > Attachments: HBASE-15497-v1.patch > > > At the front of {{Increment.java}} file, there is comment about read > atomicity: > {code} > * This operation does not appear atomic to readers. Increments are done > * under a single row lock, so write operations to a row are synchronized, but > * readers do not take row locks so get and scan operations can see this > * operation partially completed. > {code} > It seems this comment is not true after MVCC integrated > [HBASE-4583|https://issues.apache.org/jira/browse/HBASE-4583]. Currently, the > readers can be guaranteed to read the whole result of Increment if I am not > wrong. Similar comments also exist in {{Append.java}}, {{Table#append(...)}} > and {{Table#increment(...)}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-15616) CheckAndMutate will encouter NPE if qualifier to check is null
[ https://issues.apache.org/jira/browse/HBASE-15616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15233882#comment-15233882 ] Jianwei Cui commented on HBASE-15616: - Thanks for the review [~stack]. The patch could also be applied to branch-1. > CheckAndMutate will encouter NPE if qualifier to check is null > -- > > Key: HBASE-15616 > URL: https://issues.apache.org/jira/browse/HBASE-15616 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 2.0.0 >Reporter: Jianwei Cui >Assignee: Jianwei Cui > Attachments: HBASE-15616-v1.patch > > > If qualifier to check is null, the checkAndMutate/checkAndPut/checkAndDelete > will encounter NPE. > The test code: > {code} > table.checkAndPut(row, family, null, Bytes.toBytes(0), new > Put(row).addColumn(family, null, Bytes.toBytes(1))); > {code} > The exception: > {code} > Exception in thread "main" > org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after > attempts=3, exceptions: > Fri Apr 08 15:51:31 CST 2016, > RpcRetryingCaller{globalStartTime=1460101891615, pause=100, maxAttempts=3}, > java.io.IOException: com.google.protobuf.ServiceException: > java.lang.NullPointerException > Fri Apr 08 15:51:31 CST 2016, > RpcRetryingCaller{globalStartTime=1460101891615, pause=100, maxAttempts=3}, > java.io.IOException: com.google.protobuf.ServiceException: > java.lang.NullPointerException > Fri Apr 08 15:51:32 CST 2016, > RpcRetryingCaller{globalStartTime=1460101891615, pause=100, maxAttempts=3}, > java.io.IOException: com.google.protobuf.ServiceException: > java.lang.NullPointerException > at > org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:120) > at org.apache.hadoop.hbase.client.HTable.checkAndPut(HTable.java:772) > at ... 
> Caused by: java.io.IOException: com.google.protobuf.ServiceException: > java.lang.NullPointerException > at > org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:341) > at org.apache.hadoop.hbase.client.HTable$7.call(HTable.java:768) > at org.apache.hadoop.hbase.client.HTable$7.call(HTable.java:755) > at > org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:99) > ... 2 more > Caused by: com.google.protobuf.ServiceException: > java.lang.NullPointerException > at > org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:239) > at > org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:331) > at > org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.mutate(ClientProtos.java:35252) > at org.apache.hadoop.hbase.client.HTable$7.call(HTable.java:765) > ... 4 more > Caused by: java.lang.NullPointerException > at com.google.protobuf.LiteralByteString.size(LiteralByteString.java:76) > at > com.google.protobuf.CodedOutputStream.computeBytesSizeNoTag(CodedOutputStream.java:767) > at > com.google.protobuf.CodedOutputStream.computeBytesSize(CodedOutputStream.java:539) > at > org.apache.hadoop.hbase.protobuf.generated.ClientProtos$Condition.getSerializedSize(ClientProtos.java:7483) > at > com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749) > at > com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530) > at > org.apache.hadoop.hbase.protobuf.generated.ClientProtos$MutateRequest.getSerializedSize(ClientProtos.java:12431) > at > org.apache.hadoop.hbase.ipc.IPCUtil.getTotalSizeWhenWrittenDelimited(IPCUtil.java:311) > at > org.apache.hadoop.hbase.ipc.AsyncRpcChannel.writeRequest(AsyncRpcChannel.java:409) > at > org.apache.hadoop.hbase.ipc.AsyncRpcChannel.callMethod(AsyncRpcChannel.java:333) > at > 
org.apache.hadoop.hbase.ipc.AsyncRpcClient.call(AsyncRpcClient.java:245) > at > org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:226) > ... 7 more > {code} > The reason is {{LiteralByteString.size()}} will throw NPE if wrapped byte > array is null. It is possible to invoke {{put}} and {{checkAndMutate}} on the > same column, because null qualifier is allowed for {{Put}}, users may be > confused if null qualifier is not allowed for {{checkAndMutate}}. We can also > convert null qualifier to empty byte array for {{checkAndMutate}} in client > side. Discussions and suggestions are welcomed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-15616) CheckAndMutate will encouter NPE if qualifier to check is null
[ https://issues.apache.org/jira/browse/HBASE-15616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jianwei Cui updated HBASE-15616: Status: Patch Available (was: In Progress) > CheckAndMutate will encouter NPE if qualifier to check is null > -- > > Key: HBASE-15616 > URL: https://issues.apache.org/jira/browse/HBASE-15616 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 2.0.0 >Reporter: Jianwei Cui >Assignee: Jianwei Cui > Attachments: HBASE-15616-v1.patch > > > If qualifier to check is null, the checkAndMutate/checkAndPut/checkAndDelete > will encounter NPE. > The test code: > {code} > table.checkAndPut(row, family, null, Bytes.toBytes(0), new > Put(row).addColumn(family, null, Bytes.toBytes(1))); > {code} > The exception: > {code} > Exception in thread "main" > org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after > attempts=3, exceptions: > Fri Apr 08 15:51:31 CST 2016, > RpcRetryingCaller{globalStartTime=1460101891615, pause=100, maxAttempts=3}, > java.io.IOException: com.google.protobuf.ServiceException: > java.lang.NullPointerException > Fri Apr 08 15:51:31 CST 2016, > RpcRetryingCaller{globalStartTime=1460101891615, pause=100, maxAttempts=3}, > java.io.IOException: com.google.protobuf.ServiceException: > java.lang.NullPointerException > Fri Apr 08 15:51:32 CST 2016, > RpcRetryingCaller{globalStartTime=1460101891615, pause=100, maxAttempts=3}, > java.io.IOException: com.google.protobuf.ServiceException: > java.lang.NullPointerException > at > org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:120) > at org.apache.hadoop.hbase.client.HTable.checkAndPut(HTable.java:772) > at ... 
> Caused by: java.io.IOException: com.google.protobuf.ServiceException: > java.lang.NullPointerException > at > org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:341) > at org.apache.hadoop.hbase.client.HTable$7.call(HTable.java:768) > at org.apache.hadoop.hbase.client.HTable$7.call(HTable.java:755) > at > org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:99) > ... 2 more > Caused by: com.google.protobuf.ServiceException: > java.lang.NullPointerException > at > org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:239) > at > org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:331) > at > org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.mutate(ClientProtos.java:35252) > at org.apache.hadoop.hbase.client.HTable$7.call(HTable.java:765) > ... 4 more > Caused by: java.lang.NullPointerException > at com.google.protobuf.LiteralByteString.size(LiteralByteString.java:76) > at > com.google.protobuf.CodedOutputStream.computeBytesSizeNoTag(CodedOutputStream.java:767) > at > com.google.protobuf.CodedOutputStream.computeBytesSize(CodedOutputStream.java:539) > at > org.apache.hadoop.hbase.protobuf.generated.ClientProtos$Condition.getSerializedSize(ClientProtos.java:7483) > at > com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749) > at > com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530) > at > org.apache.hadoop.hbase.protobuf.generated.ClientProtos$MutateRequest.getSerializedSize(ClientProtos.java:12431) > at > org.apache.hadoop.hbase.ipc.IPCUtil.getTotalSizeWhenWrittenDelimited(IPCUtil.java:311) > at > org.apache.hadoop.hbase.ipc.AsyncRpcChannel.writeRequest(AsyncRpcChannel.java:409) > at > org.apache.hadoop.hbase.ipc.AsyncRpcChannel.callMethod(AsyncRpcChannel.java:333) > at > 
org.apache.hadoop.hbase.ipc.AsyncRpcClient.call(AsyncRpcClient.java:245) > at > org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:226) > ... 7 more > {code} > The reason is {{LiteralByteString.size()}} will throw NPE if wrapped byte > array is null. It is possible to invoke {{put}} and {{checkAndMutate}} on the > same column, because null qualifier is allowed for {{Put}}, users may be > confused if null qualifier is not allowed for {{checkAndMutate}}. We can also > convert null qualifier to empty byte array for {{checkAndMutate}} in client > side. Discussions and suggestions are welcomed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Work started] (HBASE-15616) CheckAndMutate will encounter NPE if qualifier to check is null
[ https://issues.apache.org/jira/browse/HBASE-15616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HBASE-15616 started by Jianwei Cui. --- > CheckAndMutate will encouter NPE if qualifier to check is null > -- > > Key: HBASE-15616 > URL: https://issues.apache.org/jira/browse/HBASE-15616 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 2.0.0 >Reporter: Jianwei Cui >Assignee: Jianwei Cui > Attachments: HBASE-15616-v1.patch > > > If qualifier to check is null, the checkAndMutate/checkAndPut/checkAndDelete > will encounter NPE. > The test code: > {code} > table.checkAndPut(row, family, null, Bytes.toBytes(0), new > Put(row).addColumn(family, null, Bytes.toBytes(1))); > {code} > The exception: > {code} > Exception in thread "main" > org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after > attempts=3, exceptions: > Fri Apr 08 15:51:31 CST 2016, > RpcRetryingCaller{globalStartTime=1460101891615, pause=100, maxAttempts=3}, > java.io.IOException: com.google.protobuf.ServiceException: > java.lang.NullPointerException > Fri Apr 08 15:51:31 CST 2016, > RpcRetryingCaller{globalStartTime=1460101891615, pause=100, maxAttempts=3}, > java.io.IOException: com.google.protobuf.ServiceException: > java.lang.NullPointerException > Fri Apr 08 15:51:32 CST 2016, > RpcRetryingCaller{globalStartTime=1460101891615, pause=100, maxAttempts=3}, > java.io.IOException: com.google.protobuf.ServiceException: > java.lang.NullPointerException > at > org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:120) > at org.apache.hadoop.hbase.client.HTable.checkAndPut(HTable.java:772) > at ... 
> Caused by: java.io.IOException: com.google.protobuf.ServiceException: > java.lang.NullPointerException > at > org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:341) > at org.apache.hadoop.hbase.client.HTable$7.call(HTable.java:768) > at org.apache.hadoop.hbase.client.HTable$7.call(HTable.java:755) > at > org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:99) > ... 2 more > Caused by: com.google.protobuf.ServiceException: > java.lang.NullPointerException > at > org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:239) > at > org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:331) > at > org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.mutate(ClientProtos.java:35252) > at org.apache.hadoop.hbase.client.HTable$7.call(HTable.java:765) > ... 4 more > Caused by: java.lang.NullPointerException > at com.google.protobuf.LiteralByteString.size(LiteralByteString.java:76) > at > com.google.protobuf.CodedOutputStream.computeBytesSizeNoTag(CodedOutputStream.java:767) > at > com.google.protobuf.CodedOutputStream.computeBytesSize(CodedOutputStream.java:539) > at > org.apache.hadoop.hbase.protobuf.generated.ClientProtos$Condition.getSerializedSize(ClientProtos.java:7483) > at > com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749) > at > com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530) > at > org.apache.hadoop.hbase.protobuf.generated.ClientProtos$MutateRequest.getSerializedSize(ClientProtos.java:12431) > at > org.apache.hadoop.hbase.ipc.IPCUtil.getTotalSizeWhenWrittenDelimited(IPCUtil.java:311) > at > org.apache.hadoop.hbase.ipc.AsyncRpcChannel.writeRequest(AsyncRpcChannel.java:409) > at > org.apache.hadoop.hbase.ipc.AsyncRpcChannel.callMethod(AsyncRpcChannel.java:333) > at > 
org.apache.hadoop.hbase.ipc.AsyncRpcClient.call(AsyncRpcClient.java:245) > at > org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:226) > ... 7 more > {code} > The reason is that {{LiteralByteString.size()}} throws an NPE if the wrapped byte > array is null. Since a null qualifier is allowed for {{Put}}, it is possible to invoke {{put}} and {{checkAndMutate}} on the > same column, and users may be confused if a null qualifier is not allowed for {{checkAndMutate}}. We could also > convert a null qualifier to an empty byte array for {{checkAndMutate}} on the client > side. Discussions and suggestions are welcome. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HBASE-15616) CheckAndMutate will encounter NPE if qualifier to check is null
[ https://issues.apache.org/jira/browse/HBASE-15616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jianwei Cui reassigned HBASE-15616:
---
Assignee: Jianwei Cui

> CheckAndMutate will encounter NPE if qualifier to check is null
> --
>
> Key: HBASE-15616
> URL: https://issues.apache.org/jira/browse/HBASE-15616
> Project: HBase
> Issue Type: Bug
> Components: Client
> Affects Versions: 2.0.0
> Reporter: Jianwei Cui
> Assignee: Jianwei Cui
> Attachments: HBASE-15616-v1.patch
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-15616) CheckAndMutate will encounter NPE if qualifier to check is null
[ https://issues.apache.org/jira/browse/HBASE-15616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jianwei Cui updated HBASE-15616:
Attachment: HBASE-15616-v1.patch

> CheckAndMutate will encounter NPE if qualifier to check is null
> --
>
> Key: HBASE-15616
> URL: https://issues.apache.org/jira/browse/HBASE-15616
> Project: HBase
> Issue Type: Bug
> Components: Client
> Affects Versions: 2.0.0
> Reporter: Jianwei Cui
> Attachments: HBASE-15616-v1.patch
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-15616) CheckAndMutate will encounter NPE if qualifier to check is null
Jianwei Cui created HBASE-15616:
---
Summary: CheckAndMutate will encounter NPE if qualifier to check is null
Key: HBASE-15616
URL: https://issues.apache.org/jira/browse/HBASE-15616
Project: HBase
Issue Type: Bug
Components: Client
Affects Versions: 2.0.0
Reporter: Jianwei Cui

If the qualifier to check is null, checkAndMutate/checkAndPut/checkAndDelete will encounter an NPE. The test code:
{code}
table.checkAndPut(row, family, null, Bytes.toBytes(0), new Put(row).addColumn(family, null, Bytes.toBytes(1)));
{code}
The exception:
{code}
Exception in thread "main" org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=3, exceptions:
Fri Apr 08 15:51:31 CST 2016, RpcRetryingCaller{globalStartTime=1460101891615, pause=100, maxAttempts=3}, java.io.IOException: com.google.protobuf.ServiceException: java.lang.NullPointerException
Fri Apr 08 15:51:31 CST 2016, RpcRetryingCaller{globalStartTime=1460101891615, pause=100, maxAttempts=3}, java.io.IOException: com.google.protobuf.ServiceException: java.lang.NullPointerException
Fri Apr 08 15:51:32 CST 2016, RpcRetryingCaller{globalStartTime=1460101891615, pause=100, maxAttempts=3}, java.io.IOException: com.google.protobuf.ServiceException: java.lang.NullPointerException
  at org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:120)
  at org.apache.hadoop.hbase.client.HTable.checkAndPut(HTable.java:772)
  at ...
Caused by: java.io.IOException: com.google.protobuf.ServiceException: java.lang.NullPointerException
  at org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:341)
  at org.apache.hadoop.hbase.client.HTable$7.call(HTable.java:768)
  at org.apache.hadoop.hbase.client.HTable$7.call(HTable.java:755)
  at org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:99)
  ... 2 more
Caused by: com.google.protobuf.ServiceException: java.lang.NullPointerException
  at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:239)
  at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:331)
  at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.mutate(ClientProtos.java:35252)
  at org.apache.hadoop.hbase.client.HTable$7.call(HTable.java:765)
  ... 4 more
Caused by: java.lang.NullPointerException
  at com.google.protobuf.LiteralByteString.size(LiteralByteString.java:76)
  at com.google.protobuf.CodedOutputStream.computeBytesSizeNoTag(CodedOutputStream.java:767)
  at com.google.protobuf.CodedOutputStream.computeBytesSize(CodedOutputStream.java:539)
  at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$Condition.getSerializedSize(ClientProtos.java:7483)
  at com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
  at com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
  at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$MutateRequest.getSerializedSize(ClientProtos.java:12431)
  at org.apache.hadoop.hbase.ipc.IPCUtil.getTotalSizeWhenWrittenDelimited(IPCUtil.java:311)
  at org.apache.hadoop.hbase.ipc.AsyncRpcChannel.writeRequest(AsyncRpcChannel.java:409)
  at org.apache.hadoop.hbase.ipc.AsyncRpcChannel.callMethod(AsyncRpcChannel.java:333)
  at org.apache.hadoop.hbase.ipc.AsyncRpcClient.call(AsyncRpcClient.java:245)
  at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:226)
  ... 7 more
{code}
The reason is that {{LiteralByteString.size()}} throws an NPE if the wrapped byte array is null. Since a null qualifier is allowed for {{Put}}, it is possible to invoke {{put}} and {{checkAndMutate}} on the same column, and users may be confused if a null qualifier is not allowed for {{checkAndMutate}}. We could also convert a null qualifier to an empty byte array for {{checkAndMutate}} on the client side. Discussions and suggestions are welcome.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
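The client-side workaround suggested above can be sketched as follows; `normalizeQualifier` is a hypothetical helper for illustration, not an HBase API:

```java
// Hypothetical sketch of the proposed client-side fix: map a null qualifier
// to an empty byte array before it reaches the protobuf builder, mirroring
// what Put already tolerates. Not actual HBase client code.
public class QualifierNormalizer {
    static final byte[] EMPTY = new byte[0];

    // Protobuf's LiteralByteString.size() throws NPE when the wrapped byte
    // array is null, so the qualifier must never be passed downstream as null.
    static byte[] normalizeQualifier(byte[] qualifier) {
        return qualifier == null ? EMPTY : qualifier;
    }

    public static void main(String[] args) {
        if (normalizeQualifier(null).length != 0) throw new AssertionError();
        byte[] q = {1, 2};
        if (normalizeQualifier(q) != q) throw new AssertionError(); // non-null passes through
        System.out.println("ok");
    }
}
```

An empty qualifier and a null qualifier address the same cell in HBase's data model, which is why this normalization is behavior-preserving for {{Put}}-compatible callers.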
[jira] [Created] (HBASE-15588) Use nonce for checkAndMutate operation
Jianwei Cui created HBASE-15588:
---
Summary: Use nonce for checkAndMutate operation
Key: HBASE-15588
URL: https://issues.apache.org/jira/browse/HBASE-15588
Project: HBase
Issue Type: Bug
Components: Client
Affects Versions: 2.0.0
Reporter: Jianwei Cui

Like {{increment}}/{{append}}, the {{checkAndPut}}/{{checkAndDelete}} operations are non-idempotent, so a client may get an incorrect result when there are retries, and such an incorrect result may lead the application into an error state. A possible solution is to use a nonce for checkAndMutate operations; discussions and suggestions are welcome.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
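One way to picture the nonce idea is a toy server that records the outcome of each client-chosen nonce and replays it on retry; `NonceServer` and `checkAndIncrement` are hypothetical names that only mimic the shape of the mechanism, not HBase's implementation:

```java
import java.util.HashMap;
import java.util.Map;

// Toy illustration of nonce-based retry protection for a non-idempotent
// operation (all names here are hypothetical, not HBase API).
public class NonceServer {
    private long value = 0;
    // Remembers the outcome recorded for each nonce so that a retried RPC
    // is answered from this cache instead of being executed a second time.
    private final Map<Long, Boolean> seenNonces = new HashMap<>();

    // "Check and mutate": increment only if the current value matches `expected`.
    boolean checkAndIncrement(long expected, long nonce) {
        Boolean cached = seenNonces.get(nonce);
        if (cached != null) {
            return cached; // retry: replay the recorded result, do not mutate again
        }
        boolean success = (value == expected);
        if (success) {
            value++;
        }
        seenNonces.put(nonce, success);
        return success;
    }

    long getValue() { return value; }

    public static void main(String[] args) {
        NonceServer server = new NonceServer();
        long nonce = 42L; // client-generated, reused on retry
        if (!server.checkAndIncrement(0, nonce)) throw new AssertionError();
        // The client times out and retries with the same nonce. Without the
        // nonce cache, this second call would see value == 1, fail the check,
        // and wrongly report that the operation never happened.
        if (!server.checkAndIncrement(0, nonce)) throw new AssertionError();
        if (server.getValue() != 1) throw new AssertionError(); // applied exactly once
        System.out.println("ok");
    }
}
```

A real implementation would also need to scope nonces by client (nonce group) and expire old entries; this sketch only shows why replaying the recorded result fixes the retry ambiguity.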
[jira] [Updated] (HBASE-15327) Canary will always invoke admin.balancer() in each sniffing period when writeSniffing is enabled
[ https://issues.apache.org/jira/browse/HBASE-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jianwei Cui updated HBASE-15327:
Attachment: HBASE-15327-branch-1-v1.patch
            HBASE-15327-v1.patch

Made patches for trunk and branch-1. Thanks for the review [~stack] and [~yuzhih...@gmail.com] :)

> Canary will always invoke admin.balancer() in each sniffing period when writeSniffing is enabled
> --
>
> Key: HBASE-15327
> URL: https://issues.apache.org/jira/browse/HBASE-15327
> Project: HBase
> Issue Type: Bug
> Components: canary
> Affects Versions: 2.0.0
> Reporter: Jianwei Cui
> Priority: Minor
> Attachments: HBASE-15327-branch-1-v1.patch, HBASE-15327-trunk.patch, HBASE-15327-trunk.patch, HBASE-15327-v1.patch
>
> When Canary#writeSniffing is enabled, Canary#checkWriteTableDistribution will make sure the regions of the write table are distributed on all region servers, as:
> {code}
> int numberOfServers = admin.getClusterStatus().getServers().size();
> ..
> int numberOfCoveredServers = serverSet.size();
> if (numberOfCoveredServers < numberOfServers) {
>   admin.balancer();
> }
> {code}
> The master also works as a regionserver, so ClusterStatus#getServers will contain the master. On the other hand, the Canary write table will not be assigned to the master, making numberOfCoveredServers always smaller than numberOfServers, so admin.balancer is invoked in every sniffing period. This may cause frequent region moves. A simple fix is to exclude the master from numberOfServers.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
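The proposed fix can be sketched with a toy model of the coverage check; the set-of-server-names representation and the `needsBalance` helper are hypothetical simplifications, not the Admin API:

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of the Canary coverage check: the cluster reports all
// servers (including the master, which also registers as a regionserver),
// but the write table's regions are never placed on the master, so the
// master must be excluded before comparing coverage.
public class CanaryCoverageCheck {
    static boolean needsBalance(Set<String> allServers, String master,
                                Set<String> serversCoveredByWriteTable) {
        Set<String> regionServers = new HashSet<>(allServers);
        regionServers.remove(master); // the fix: do not count the master
        return serversCoveredByWriteTable.size() < regionServers.size();
    }

    public static void main(String[] args) {
        Set<String> all = new HashSet<>();
        all.add("master"); all.add("rs1"); all.add("rs2");
        Set<String> covered = new HashSet<>();
        covered.add("rs1"); covered.add("rs2");
        // Without excluding the master, covered (2) < all (3) holds forever,
        // so admin.balancer() would fire on every sniffing period.
        if (needsBalance(all, "master", covered)) throw new AssertionError();
        System.out.println("ok");
    }
}
```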
[jira] [Commented] (HBASE-15433) SnapshotManager#restoreSnapshot not update table and region count quota correctly when encountering exception
[ https://issues.apache.org/jira/browse/HBASE-15433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15205607#comment-15205607 ]
Jianwei Cui commented on HBASE-15433:
-
Thanks for the review, Ted and Ashish.

> SnapshotManager#restoreSnapshot not update table and region count quota correctly when encountering exception
> -
>
> Key: HBASE-15433
> URL: https://issues.apache.org/jira/browse/HBASE-15433
> Project: HBase
> Issue Type: Bug
> Components: snapshots
> Affects Versions: 2.0.0
> Reporter: Jianwei Cui
> Assignee: Jianwei Cui
> Fix For: 2.0.0, 1.3.0, 1.4.0, 1.1.5, 1.2.2
> Attachments: HBASE-15433-branch-1-v1.patch, HBASE-15433-trunk-v1.patch, HBASE-15433-trunk-v2.patch, HBASE-15433-trunk.patch, HBASE-15433-v3.patch, HBASE-15433-v4.patch
>
> In SnapshotManager#restoreSnapshot, the table and region quota will be checked and updated as:
> {code}
> try {
>   // Table already exist. Check and update the region quota for this table namespace
>   checkAndUpdateNamespaceRegionQuota(manifest, tableName);
>   restoreSnapshot(snapshot, snapshotTableDesc);
> } catch (IOException e) {
>   this.master.getMasterQuotaManager().removeTableFromNamespaceQuota(tableName);
>   LOG.error("Exception occurred while restoring the snapshot " + snapshot.getName()
>       + " as table " + tableName.getNameAsString(), e);
>   throw e;
> }
> {code}
> The 'checkAndUpdateNamespaceRegionQuota' call will fail if the regions in the snapshot make the region count quota exceeded; the table is then removed from the namespace quota in the 'catch' block. This makes the current table count and region count decrease, so a following table creation or region split can succeed even though the actual quota is exceeded.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
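The accounting problem can be reproduced in miniature; `QuotaRollback` and its methods are hypothetical stand-ins for the SnapshotManager/MasterQuotaManager interplay, not HBase code:

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Toy model of the quota bookkeeping bug: on a failed restore, the table's
// quota entry was removed entirely instead of being rolled back to its
// previous region count, so existing regions stopped being counted.
public class QuotaRollback {
    private final Map<String, Integer> regionsPerTable = new HashMap<>();
    private final int maxRegions;

    QuotaRollback(int maxRegions) { this.maxRegions = maxRegions; }

    int totalRegions() {
        return regionsPerTable.values().stream().mapToInt(Integer::intValue).sum();
    }

    // Throws before mutating anything if the new count would exceed the quota.
    void checkAndUpdate(String table, int newRegionCount) throws IOException {
        int old = regionsPerTable.getOrDefault(table, 0);
        if (totalRegions() - old + newRegionCount > maxRegions) {
            throw new IOException("region quota exceeded");
        }
        regionsPerTable.put(table, newRegionCount);
    }

    // Buggy handling: drop the table's quota entry entirely on failure.
    void restoreSnapshotBuggy(String table, int snapshotRegions) throws IOException {
        try {
            checkAndUpdate(table, snapshotRegions);
        } catch (IOException e) {
            regionsPerTable.remove(table); // existing regions no longer counted!
            throw e;
        }
    }

    // Safer handling: restore the table's previous count on failure.
    void restoreSnapshotFixed(String table, int snapshotRegions) throws IOException {
        int old = regionsPerTable.getOrDefault(table, 0);
        try {
            checkAndUpdate(table, snapshotRegions);
        } catch (IOException e) {
            regionsPerTable.put(table, old);
            throw e;
        }
    }

    public static void main(String[] args) throws IOException {
        QuotaRollback q = new QuotaRollback(10);
        q.checkAndUpdate("t1", 8);
        try { q.restoreSnapshotBuggy("t1", 20); } catch (IOException expected) { }
        // t1's 8 existing regions vanished from the accounting:
        if (q.totalRegions() != 0) throw new AssertionError();

        QuotaRollback q2 = new QuotaRollback(10);
        q2.checkAndUpdate("t1", 8);
        try { q2.restoreSnapshotFixed("t1", 20); } catch (IOException expected) { }
        if (q2.totalRegions() != 8) throw new AssertionError();
        System.out.println("ok");
    }
}
```

With the buggy handling, subsequent "create table" or "split region" checks compare against an undercounted total, which is exactly the symptom described above.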
[jira] [Created] (HBASE-15497) Incorrect javadoc for atomicity guarantee of Increment and Append
Jianwei Cui created HBASE-15497:
---
Summary: Incorrect javadoc for atomicity guarantee of Increment and Append
Key: HBASE-15497
URL: https://issues.apache.org/jira/browse/HBASE-15497
Project: HBase
Issue Type: Bug
Components: documentation
Affects Versions: 2.0.0
Reporter: Jianwei Cui
Priority: Minor

At the front of the {{Increment.java}} file, there is a comment about read atomicity:
{code}
 * This operation does not appear atomic to readers. Increments are done
 * under a single row lock, so write operations to a row are synchronized, but
 * readers do not take row locks so get and scan operations can see this
 * operation partially completed.
{code}
It seems this comment is no longer true after MVCC was integrated in [HBASE-4583|https://issues.apache.org/jira/browse/HBASE-4583]. Currently, readers are guaranteed to see the whole result of an Increment, if I am not wrong. Similar comments also exist in {{Append.java}}, {{Table#append(...)}} and {{Table#increment(...)}}.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
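The atomicity claim can be illustrated by analogy; this is a deliberately simplified toy model of MVCC-style publication (a complete new row version published in one step), not HBase's actual MVCC implementation:

```java
import java.util.Arrays;

// Toy analogy for why readers see an Increment atomically under MVCC-style
// visibility: the writer prepares a complete new version of the row and
// publishes it in a single step, so a concurrent reader observes either the
// whole old row or the whole new row, never a mix of the two.
public class AtomicIncrementSketch {
    // The committed row version; replaced wholesale on publish.
    private volatile long[] committedRow = {0, 0};

    synchronized void incrementAllColumns() {
        long[] next = committedRow.clone();
        for (int i = 0; i < next.length; i++) next[i]++; // private, invisible edits
        committedRow = next; // the single publish point
    }

    long[] readRow() { return committedRow; } // always a consistent version

    public static void main(String[] args) {
        AtomicIncrementSketch row = new AtomicIncrementSketch();
        row.incrementAllColumns();
        row.incrementAllColumns();
        if (!Arrays.equals(row.readRow(), new long[]{2, 2})) throw new AssertionError();
        System.out.println("ok");
    }
}
```

HBase achieves the same effect with sequence ids and a read point rather than whole-row swaps, but the visibility guarantee the JIRA describes is the same: no partially completed Increment is ever readable.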
[jira] [Commented] (HBASE-15469) Take snapshot by family
[ https://issues.apache.org/jira/browse/HBASE-15469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15203878#comment-15203878 ]
Jianwei Cui commented on HBASE-15469:
-
For our case, the goal is to copy existing data for given families and clone the snapshot, so creating a new table with only the subset of families is the better choice. For the restore case, the goal is to roll back the table to some historical state; a snapshot with only a subset of families may not represent any historical state of the table, so it should not be used for the restore purpose.
{quote}
we may block the restore of snapshots with only a subset of families. and that will solve the strange situation of restore. and when we clone we just create a new table with only the subset. In theory this is more clear for the end user.
{quote}
Agreed with your analysis [~mbertozzi], and also expecting other opinions and cases. Thanks!

> Take snapshot by family
> ---
>
> Key: HBASE-15469
> URL: https://issues.apache.org/jira/browse/HBASE-15469
> Project: HBase
> Issue Type: Improvement
> Components: snapshots
> Affects Versions: 2.0.0
> Reporter: Jianwei Cui
> Attachments: HBASE-15469-v1.patch, HBASE-15469-v2.patch
>
> In our production environment, there are some 'wide' tables in the offline cluster. A 'wide' table has a number of families, and different applications access different families of the table through MapReduce. When an application starts to provide online service, we need to copy the needed families from the offline cluster to the online cluster. For future writes, inter-cluster replication supports setting families for a table, so we can use it to copy future edits for the needed families. For existing data, we can take a snapshot of the table on the offline cluster, then use {{ExportSnapshot}} to copy the snapshot to the online cluster and clone it. However, we can only take a snapshot of the whole table, in which many families are not needed by the application; this leads to unnecessary data copy. I think it is useful to support taking a snapshot by family, so that we only copy the needed data.
> Possible solution to support such function:
> 1. Add a family names field to the protobuf definition of {{SnapshotDescription}}
> 2. Allow setting families when taking a snapshot in the hbase shell, such as:
> {code}
> snapshot 'tableName', 'snapshotName', 'FamilyA', 'FamilyB', {SKIP_FLUSH => true}
> {code}
> 3. Add family names to {{SnapshotDescription}} on the client side
> 4. Read family names from {{SnapshotDescription}} in the Master/Regionserver, and keep only the requested families when taking the snapshot for a region.
> Discussions and suggestions are welcomed.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
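Step 4 of the proposal could look roughly like this; `filterStoreFiles` and the string-based "family/hfile" paths are hypothetical simplifications of the real manifest-building code:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of snapshot-by-family filtering: when building a
// region's snapshot manifest, keep only store files whose family is in the
// requested set. "family/hfile" strings stand in for real file references.
public class FamilyFilter {
    static List<String> filterStoreFiles(List<String> storeFiles,
                                         Set<String> requestedFamilies) {
        List<String> kept = new ArrayList<>();
        for (String path : storeFiles) {
            String family = path.substring(0, path.indexOf('/'));
            // A null set models the legacy behavior: snapshot every family.
            if (requestedFamilies == null || requestedFamilies.contains(family)) {
                kept.add(path);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> files = Arrays.asList("A/hf1", "A/hf2", "B/hf3", "C/hf4");
        Set<String> wanted = new HashSet<>(Arrays.asList("A", "C"));
        List<String> kept = filterStoreFiles(files, wanted);
        if (!kept.equals(Arrays.asList("A/hf1", "A/hf2", "C/hf4"))) throw new AssertionError();
        System.out.println("ok");
    }
}
```

Exporting the filtered manifest then copies only the hfiles of the requested families, which is the data-volume saving the issue is after.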
[jira] [Commented] (HBASE-15433) SnapshotManager#restoreSnapshot not update table and region count quota correctly when encountering exception
[ https://issues.apache.org/jira/browse/HBASE-15433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15203848#comment-15203848 ]
Jianwei Cui commented on HBASE-15433:
-
HBASE-15433-branch-1-v1.patch could also be applied to branch-1.1/1.2/1.3.

> SnapshotManager#restoreSnapshot not update table and region count quota correctly when encountering exception
> -
>
> Key: HBASE-15433
> URL: https://issues.apache.org/jira/browse/HBASE-15433
> Project: HBase
> Issue Type: Bug
> Components: snapshots
> Affects Versions: 2.0.0
> Reporter: Jianwei Cui
> Assignee: Jianwei Cui
> Fix For: 2.0.0, 1.3.0, 1.4.0, 1.1.5, 1.2.2
> Attachments: HBASE-15433-branch-1-v1.patch, HBASE-15433-trunk-v1.patch, HBASE-15433-trunk-v2.patch, HBASE-15433-trunk.patch, HBASE-15433-v3.patch, HBASE-15433-v4.patch
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-15433) SnapshotManager#restoreSnapshot not update table and region count quota correctly when encountering exception
[ https://issues.apache.org/jira/browse/HBASE-15433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15203833#comment-15203833 ]
Jianwei Cui commented on HBASE-15433:
-
Sorry for the late reply, and thanks for your review [~yuzhih...@gmail.com]. I have attached a patch for branch-1 and will wait for the test results. Thanks.

> SnapshotManager#restoreSnapshot not update table and region count quota correctly when encountering exception
> -
>
> Key: HBASE-15433
> URL: https://issues.apache.org/jira/browse/HBASE-15433
> Project: HBase
> Issue Type: Bug
> Components: snapshots
> Affects Versions: 2.0.0
> Reporter: Jianwei Cui
> Assignee: Jianwei Cui
> Fix For: 2.0.0, 1.3.0, 1.4.0, 1.1.5, 1.2.2
> Attachments: HBASE-15433-branch-1-v1.patch, HBASE-15433-trunk-v1.patch, HBASE-15433-trunk-v2.patch, HBASE-15433-trunk.patch, HBASE-15433-v3.patch, HBASE-15433-v4.patch
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-15433) SnapshotManager#restoreSnapshot not update table and region count quota correctly when encountering exception
[ https://issues.apache.org/jira/browse/HBASE-15433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jianwei Cui updated HBASE-15433:
Attachment: HBASE-15433-branch-1-v1.patch

> SnapshotManager#restoreSnapshot not update table and region count quota correctly when encountering exception
> -
>
> Key: HBASE-15433
> URL: https://issues.apache.org/jira/browse/HBASE-15433
> Project: HBase
> Issue Type: Bug
> Components: snapshots
> Affects Versions: 2.0.0
> Reporter: Jianwei Cui
> Assignee: Jianwei Cui
> Fix For: 2.0.0, 1.3.0, 1.4.0, 1.1.5, 1.2.2
> Attachments: HBASE-15433-branch-1-v1.patch, HBASE-15433-trunk-v1.patch, HBASE-15433-trunk-v2.patch, HBASE-15433-trunk.patch, HBASE-15433-v3.patch, HBASE-15433-v4.patch
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-15469) Take snapshot by family
[ https://issues.apache.org/jira/browse/HBASE-15469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jianwei Cui updated HBASE-15469:
Attachment: HBASE-15469-v1.patch

> Take snapshot by family
> ---
>
> Key: HBASE-15469
> URL: https://issues.apache.org/jira/browse/HBASE-15469
> Project: HBase
> Issue Type: Improvement
> Components: snapshots
> Affects Versions: 2.0.0
> Reporter: Jianwei Cui
> Attachments: HBASE-15469-v1.patch
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-15469) Take snapshot by family
[ https://issues.apache.org/jira/browse/HBASE-15469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15200847#comment-15200847 ]

Jianwei Cui commented on HBASE-15469:

Good question! Yes, the current patch will create all families when cloning or restoring. This could be made optional for the user. For most cases, is it more reasonable to retain only the requested families when taking the snapshot? Users can add any other needed families after cloning or restoring. What do you think? [~mbertozzi]. Thanks.
[jira] [Commented] (HBASE-15469) Take snapshot by family
[ https://issues.apache.org/jira/browse/HBASE-15469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15199512#comment-15199512 ]

Jianwei Cui commented on HBASE-15469:

Uploaded the patch. In the hbase shell, we can specify families when taking a snapshot:
{code}
hbase(main):004:0> snapshot 'test_table', 'test-snapshot', 'f1'
0 row(s) in 0.3830 seconds
{code}
And {{list_snapshots}} will show the table and families of the snapshot:
{code}
hbase(main):001:0> list_snapshots
SNAPSHOT        TABLE/CFs + CREATION TIME
test-snapshot   test_table/f1 (Thu Mar 17 20:54:22 +0800 2016)
1 row(s) in 0.2890 seconds
{code}
This snapshot can be used by the other snapshot operations, such as {{clone_snapshot}}, {{restore_snapshot}}, etc.
[jira] [Commented] (HBASE-15469) Take snapshot by family
[ https://issues.apache.org/jira/browse/HBASE-15469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15201278#comment-15201278 ]

Jianwei Cui commented on HBASE-15469:

Upload v2 to remove unrelated changes in hbase-site.xml and create RB.
[jira] [Updated] (HBASE-15469) Take snapshot by family
[ https://issues.apache.org/jira/browse/HBASE-15469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jianwei Cui updated HBASE-15469:
Attachment: HBASE-15469-v2.patch
[jira] [Created] (HBASE-15469) Take snapshot by family
Jianwei Cui created HBASE-15469:

Summary: Take snapshot by family
Key: HBASE-15469
URL: https://issues.apache.org/jira/browse/HBASE-15469
Project: HBase
Issue Type: Improvement
Components: snapshots
Affects Versions: 2.0.0
Reporter: Jianwei Cui
[jira] [Commented] (HBASE-15433) SnapshotManager#restoreSnapshot not update table and region count quota correctly when encountering exception
[ https://issues.apache.org/jira/browse/HBASE-15433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15201155#comment-15201155 ]

Jianwei Cui commented on HBASE-15433:

Ran the failed tests locally and they passed; the test failures seem unrelated to this patch. Could you please take a look at patch v4? [~yuzhih...@gmail.com] [~mbertozzi]. Thanks.

> SnapshotManager#restoreSnapshot not update table and region count quota correctly when encountering exception
> -------------------------------------------------------------------------------------------------------------
>
> Key: HBASE-15433
> URL: https://issues.apache.org/jira/browse/HBASE-15433
> Project: HBase
> Issue Type: Bug
> Components: snapshots
> Affects Versions: 2.0.0
> Reporter: Jianwei Cui
> Assignee: Jianwei Cui
> Fix For: 2.0.0, 1.3.0, 1.2.1, 1.4.0, 1.1.5
> Attachments: HBASE-15433-trunk-v1.patch, HBASE-15433-trunk-v2.patch, HBASE-15433-trunk.patch, HBASE-15433-v3.patch, HBASE-15433-v4.patch
>
> In SnapshotManager#restoreSnapshot, the table and region quota are checked and updated as:
> {code}
> try {
>   // Table already exist. Check and update the region quota for this table namespace
>   checkAndUpdateNamespaceRegionQuota(manifest, tableName);
>   restoreSnapshot(snapshot, snapshotTableDesc);
> } catch (IOException e) {
>   this.master.getMasterQuotaManager().removeTableFromNamespaceQuota(tableName);
>   LOG.error("Exception occurred while restoring the snapshot " + snapshot.getName()
>       + " as table " + tableName.getNameAsString(), e);
>   throw e;
> }
> {code}
> The 'checkAndUpdateNamespaceRegionQuota' will fail if the regions in the snapshot make the region count quota exceeded; the table is then removed in the 'catch' block. This decreases the current table count and region count, so a following table creation or region split will succeed even though the actual quota is exceeded.
[jira] [Updated] (HBASE-15433) SnapshotManager#restoreSnapshot not update table and region count quota correctly when encountering exception
[ https://issues.apache.org/jira/browse/HBASE-15433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jianwei Cui updated HBASE-15433:
Attachment: HBASE-15433-v4.patch

fix checkstyle and whitespace
[jira] [Assigned] (HBASE-15433) SnapshotManager#restoreSnapshot not update table and region count quota correctly when encountering exception
[ https://issues.apache.org/jira/browse/HBASE-15433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jianwei Cui reassigned HBASE-15433:
Assignee: Jianwei Cui
[jira] [Commented] (HBASE-15433) SnapshotManager#restoreSnapshot not update table and region count quota correctly when encountering exception
[ https://issues.apache.org/jira/browse/HBASE-15433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195033#comment-15195033 ]

Jianwei Cui commented on HBASE-15433:

{quote}
Instead of getting table region count from quota cache we can get it from RegionLocator which will solve the corner case you described.
{quote}
This may make other corner cases fail, if I am not wrong. For example, suppose the table has 5 regions, clientA is trying to restore the table to a snapshot with 8 regions, and clientB is trying to restore a snapshot with 10 regions. Then:
1. clientA invokes {{checkAndUpdateNamespaceRegionQuota}} before {{restoreSnapshot}}; the {{tableRegionCount}} is 5 for clientA, and it updates the region count of the table to 8.
2. Before clientA invokes {{restoreSnapshot}}, clientB invokes {{checkAndUpdateNamespaceRegionQuota}}; the {{tableRegionCount}} is also 5 (when using RegionLocator) for clientB, and it updates the region count of the table to 10.
3. clientA successfully restores its snapshot, so the actual region count is 8.
4. clientB encounters an IOException in {{restoreSnapshot}} and resets the region count to 5 in the IOException catch clause. However, the region count should be 8 because clientA succeeded.
I think it is not easy to resolve the concurrency issues in {{SnapshotManager}} without a lock; maybe we should wait for RestoreSnapshotHandler to be rewritten with procedure v2 and move the quota updating into RestoreSnapshotHandler?
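The interleaving in steps 1-4 above can be replayed deterministically. The sketch below is illustrative only: `QuotaBook` is a hypothetical stand-in for the namespace quota cache, not HBase API, and the "clients" run as sequential steps rather than real threads so the lost update is reproducible.

```java
// Deterministic replay of the interleaving described above, using a
// hypothetical QuotaBook in place of the namespace quota cache.
// All names are illustrative assumptions, not HBase API.
public class QuotaRaceDemo {
  static final class QuotaBook {
    private int regionCount;
    QuotaBook(int initial) { this.regionCount = initial; }
    int get() { return regionCount; }
    void set(int n) { regionCount = n; }
  }

  /** Returns the final quota count after the interleaving; the correct value would be 8. */
  public static int replay() {
    QuotaBook quota = new QuotaBook(5);  // quota cache: table has 5 regions
    int liveRegionCount = 5;             // what RegionLocator reports for the disabled table

    int seenByA = liveRegionCount;       // 1. clientA reads 5 via RegionLocator ...
    quota.set(8);                        //    ... and updates the quota to 8
    int seenByB = liveRegionCount;       // 2. clientB also reads 5 via RegionLocator ...
    quota.set(10);                       //    ... and updates the quota to 10

    // 3. clientA's restoreSnapshot succeeds: the table really has 8 regions now.
    // 4. clientB's restoreSnapshot fails and resets the quota to what it observed:
    quota.set(seenByB);                  // quota drops back to 5, though 8 is correct
    return quota.get();
  }
}
```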
[jira] [Commented] (HBASE-15433) SnapshotManager#restoreSnapshot not update table and region count quota correctly when encountering exception
[ https://issues.apache.org/jira/browse/HBASE-15433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194966#comment-15194966 ]

Jianwei Cui commented on HBASE-15433:

Uploaded patch v3 according to the comments. There is an existing unit test named TestNamespaceAuditor#testRestoreSnapshotQuotaExceed; the new patch checks the exception cause type and region count in this unit test. {{TestNamespaceAuditor}} and {{TestRestoreFlushSnapshotFromClient}} passed locally. [~ashish singhi], could you please have a look at v3? Thanks.
[jira] [Updated] (HBASE-15433) SnapshotManager#restoreSnapshot not update table and region count quota correctly when encountering exception
[ https://issues.apache.org/jira/browse/HBASE-15433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jianwei Cui updated HBASE-15433:
Attachment: HBASE-15433-v3.patch
[jira] [Commented] (HBASE-15433) SnapshotManager#restoreSnapshot not update table and region count quota correctly when encountering exception
[ https://issues.apache.org/jira/browse/HBASE-15433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194845#comment-15194845 ]

Jianwei Cui commented on HBASE-15433:

Reasonable for this case, IMO. It seems there are other issues when concurrently restoring snapshots for the same table: we need to keep the {{checkAndUpdateNamespaceRegionQuota}} calls in the same order before and after {{restoreSnapshot}} among concurrent restore requests. For example, if clientA invoked {{checkAndUpdateNamespaceRegionQuota}} ahead of clientB before {{restoreSnapshot}}, then after {{restoreSnapshot}} we need to make sure clientA also invokes {{checkAndUpdateNamespaceRegionQuota}} ahead of clientB? In the document of [HBASE-12439|https://issues.apache.org/jira/browse/HBASE-12439], it seems CloneSnapshotHandler/RestoreSnapshotHandler will be rewritten with procedure v2? After that, we can keep the quota updating in sync with the CloneSnapshot/RestoreSnapshot steps and rollbacks. Currently, without steps and rollbacks, RestoreSnapshotHandler may not update the quota information correctly. Therefore, I think we can keep the quota updating in {{SnapshotManager}} until the procedure v2 rewrite? For the concurrent-request issues, we can add some comments in the code to explain the problem?
[jira] [Commented] (HBASE-15433) SnapshotManager#restoreSnapshot not update table and region count quota correctly when encountering exception
[ https://issues.apache.org/jira/browse/HBASE-15433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194691#comment-15194691 ]

Jianwei Cui commented on HBASE-15433:

Thanks for your comment.
{quote}
Not required I think, because we are having enough quota for this table in the cache before restoring the snapshot and after restoring snapshot we are only decrementing it, so it will work.
{quote}
There may be a corner case, if I am not wrong. For example, if the table has 5 regions, clientA is trying to restore the table to a snapshot with 8 regions, and clientB is trying to restore a snapshot with 10 regions. Then:
1. clientA invokes {{checkAndUpdateNamespaceRegionQuota}} before {{restoreSnapshot}}; the {{tableRegionCount}} is 5 for clientA, and it updates the region count of the table to 8.
2. Before clientA invokes {{restoreSnapshot}}, clientB invokes {{checkAndUpdateNamespaceRegionQuota}}; the {{tableRegionCount}} is 8 for clientB, and it updates the region count of the table to 10.
3. Both clientA and clientB encounter an IOException in {{restoreSnapshot}}, and the two clients try to reset the region count in the IOException catch clause.
4. clientA resets the region count to 5 first, then clientB resets it to 8, so the final region count for the table is 8 in this case, but it should be 5 because both operations failed.
It seems not easy to update the quota information correctly without a lock when there are concurrent restoreSnapshot requests, IMO. Maybe it is easier to do this work in {{RestoreSnapshotHandler}} with the table lock held (like {{CreateTableProcedure}})?
1. In {{RestoreSnapshotHandler}}, override the {{prepareWithTableLock}} method to call {{checkAndUpdateNamespaceRegionQuota}} if {{snapshotRegionCount}} is larger than {{tableRegionCount}}. If {{checkAndUpdateNamespaceRegionQuota}} fails here, we do not need to reset the region count, and {{SnapshotManager}} will throw the exception directly.
2. In {{RestoreSnapshotHandler#completed}}, if an exception was received and {{tableRegionCount < snapshotRegionCount}}, reset the region count to {{tableRegionCount}}; if no exception was received and {{tableRegionCount > snapshotRegionCount}}, set the region count to {{snapshotRegionCount}}.
What is your opinion on this issue? [~ashish singhi]
{quote}
We can also include quota has exceeded in the error message
{quote}
Yes, will polish the message and update the patch.
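The lock-based proposal above can be sketched as follows. This is a minimal sketch under stated assumptions: `LockedQuotaUpdate` and its fields are hypothetical stand-ins for the table lock and the namespace quota entry, not HBase API; it only shows why holding one lock across check, restore, and rollback removes the interleaving.

```java
import java.util.concurrent.locks.ReentrantLock;

// Minimal sketch (hypothetical names, not HBase API) of holding a per-table
// lock across the quota check, the restore, and the rollback, so concurrent
// restores of the same table cannot interleave their check-and-update steps.
public class LockedQuotaUpdate {
  private final ReentrantLock tableLock = new ReentrantLock();
  private int regionQuotaUsed;

  public LockedQuotaUpdate(int initial) { this.regionQuotaUsed = initial; }

  public int get() { return regionQuotaUsed; }

  /** Restore the table to snapshotRegionCount regions; doRestore may throw. */
  public void restore(int snapshotRegionCount, Runnable doRestore) {
    tableLock.lock();
    try {
      int before = regionQuotaUsed;             // count observed under the lock
      if (snapshotRegionCount > before) {
        regionQuotaUsed = snapshotRegionCount;  // reserve the extra regions up front
      }
      try {
        doRestore.run();
        regionQuotaUsed = snapshotRegionCount;  // success: final region count
      } catch (RuntimeException e) {
        regionQuotaUsed = before;               // failure: roll back to the observed count
        throw e;
      }
    } finally {
      tableLock.unlock();
    }
  }
}
```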
[jira] [Commented] (HBASE-15433) SnapshotManager#restoreSnapshot not update table and region count quota correctly when encountering exception
[ https://issues.apache.org/jira/browse/HBASE-15433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193148#comment-15193148 ] Jianwei Cui commented on HBASE-15433: - The table must be disabled during restoreSnapshot, so {{tableRegionCount}} won't change. Assuming there are no concurrent restoreSnapshot requests for the same table, the {{checkAndUpdateNamespaceRegionQuota}} after {{restoreSnapshot}} will only be executed when {{tableRegionCount > snapshotRegionCount}}; this means we have already reserved enough region count from the namespace quota for it, so operations from other threads won't make that {{checkAndUpdateNamespaceRegionQuota}} fail as long as they operate on different tables. However, concurrent restoreSnapshot requests for the same table would cause problems. We may need a lock to make sure the quota information is updated correctly, or we could move the quota check-and-update logic into {{RestoreSnapshotHandler}} after the table lock is held.
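The lock idea mentioned above can be sketched as follows. This is only an illustrative shape, not HBase's quota API ({{NamespaceRegionQuota}}, {{checkAndUpdate}} and the counters are hypothetical names); it shows how a per-table lock would serialize concurrent restores of the same table while a shared monitor protects the namespace counter:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch only: hypothetical names, not HBase's quota API. Two concurrent
// restores of the same table cannot interleave their quota adjustments.
public class NamespaceRegionQuota {
  private final int maxRegions;
  private int usedRegions;                       // guarded by 'this'
  private final Map<String, Object> tableLocks = new ConcurrentHashMap<>();

  public NamespaceRegionQuota(int maxRegions, int initiallyUsed) {
    this.maxRegions = maxRegions;
    this.usedRegions = initiallyUsed;
  }

  /**
   * Atomically replaces a table's recorded region count (oldCount -> newCount),
   * rejecting the change if it would push the namespace past its quota.
   */
  public boolean checkAndUpdate(String table, int oldCount, int newCount) {
    Object tableLock = tableLocks.computeIfAbsent(table, t -> new Object());
    synchronized (tableLock) {                   // serialize restores of one table
      synchronized (this) {                      // protect the shared counter
        int proposed = usedRegions - oldCount + newCount;
        if (proposed > maxRegions) {
          return false;                          // would exceed the namespace quota
        }
        usedRegions = proposed;
        return true;
      }
    }
  }

  public synchronized int used() {
    return usedRegions;
  }
}
```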
[jira] [Commented] (HBASE-15433) SnapshotManager#restoreSnapshot not update table and region count quota correctly when encountering exception
[ https://issues.apache.org/jira/browse/HBASE-15433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193038#comment-15193038 ] Jianwei Cui commented on HBASE-15433: - {quote} When QEE is thrown we will still end up in updating the region quota which is not really required, may be we can avoid that. {quote} Yes, we should catch QEE first and not update the quota information in that situation, as you suggested above. {quote} Also suggest to rename currentRegionCount to tableRegionCount and updatedRegionCount to snapshotRegionCount for better understanding. Please add more comments like why are we doing this way. {quote} Good suggestions, will update the patch. {quote} If this throws exception then there will be another issue, because now the snapshot has been successfully restored but in the catch clause we are updating the table region count in namespace quota. {quote} Good find. Here, the {{checkAndUpdateNamespaceRegionQuota}} should succeed because it only reduces the region count for the table. However, if it does throw an exception, something unexpected has happened, and calling {{checkAndUpdateNamespaceRegionQuota}} in the catch clause may also fail. We can log an error message in the QEE catch clause and rethrow it directly. The code here can be updated as:
{code}
int tableRegionCount = -1;
try {
  // Table already exists. Check and update the region quota for this table namespace.
  // Table is disabled, so the table region count won't change during restoreSnapshot.
  tableRegionCount = getRegionCountOfTable(tableName);
  int snapshotRegionCount = manifest.getRegionManifestsMap().size();
  // Update the region count before restoreSnapshot if snapshotRegionCount is larger. If we
  // updated the region count to a smaller value before restoreSnapshot and the restoreSnapshot
  // fails, we may fail to reset the region count to its original value if the namespace
  // region count quota is consumed by other tables during the restoreSnapshot, such as
  // region split or table create under the same namespace.
  if (tableRegionCount > 0 && tableRegionCount < snapshotRegionCount) {
    checkAndUpdateNamespaceRegionQuota(snapshotRegionCount, tableName);
  }
  restoreSnapshot(snapshot, snapshotTableDesc);
  // Update the region count after restoreSnapshot succeeded if snapshotRegionCount is
  // smaller. This step should not fail because it will reduce the region count for the table.
  if (tableRegionCount > 0 && tableRegionCount > snapshotRegionCount) {
    checkAndUpdateNamespaceRegionQuota(snapshotRegionCount, tableName);
  }
} catch (QuotaExceededException e) {
  LOG.error("Exception occurred while restoring the snapshot " + snapshot.getName()
      + " as table " + tableName.getNameAsString(), e);
  // If QEE is thrown before restoreSnapshot, quota information is not updated, and we
  // should throw the exception directly. If QEE is thrown after restoreSnapshot, there
  // must be unexpected reasons, so we also throw the exception directly.
  throw e;
} catch (IOException e) {
  if (tableRegionCount > 0) {
    // reset the region count for the table
    checkAndUpdateNamespaceRegionQuota(tableRegionCount, tableName);
  }
  LOG.error("Exception occurred while restoring the snapshot " + snapshot.getName()
      + " as table " + tableName.getNameAsString(), e);
  throw e;
}
{code}
What's your opinion about this issue?
[~ashish singhi]
[jira] [Commented] (HBASE-15433) SnapshotManager#restoreSnapshot not update table and region count quota correctly when encountering exception
[ https://issues.apache.org/jira/browse/HBASE-15433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15192690#comment-15192690 ] Jianwei Cui commented on HBASE-15433: - Thanks for your comment and sorry for the late reply [~ashish singhi]. {quote} So in catch clause first let us catch QEE and then IOE. If QEE is caught then we will not update the quota information. {quote} Yes, we don't need to update the quota information if QEE is caught. However, if IOE is caught, it means {{checkAndUpdateNamespaceRegionQuota}} succeeded while the following {{restoreSnapshot(SnapshotDescription, HTableDescriptor)}} failed, and the quota information has already been updated to the region count in the snapshot. For example, if the table originally has 10 regions and the snapshot has 5, the recorded region count will have been updated to 5 in this case, and we need to reset it to 10 in the {{catch}} block.
[jira] [Commented] (HBASE-15433) SnapshotManager#restoreSnapshot not update table and region count quota correctly when encountering exception
[ https://issues.apache.org/jira/browse/HBASE-15433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15190901#comment-15190901 ] Jianwei Cui commented on HBASE-15433: - I get your point. Yes, it will be more concise to remove the 'else' keyword :)
[jira] [Commented] (HBASE-15433) SnapshotManager#restoreSnapshot not update table and region count quota correctly when encountering exception
[ https://issues.apache.org/jira/browse/HBASE-15433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15190895#comment-15190895 ] Jianwei Cui commented on HBASE-15433: - Thanks for your comment, will add a new unit test and update the patch. {quote} I was thinking of a simple fix like, just catch the QuoteExceededException and don't remove the table from namespace quota. {quote} If the snapshot contains fewer regions than the current table, 'checkAndUpdateNamespaceRegionQuota' will reduce the region count for the table, so we still need to reset the region count in the 'catch' block if 'restoreSnapshot' throws an exception.
[jira] [Updated] (HBASE-15433) SnapshotManager#restoreSnapshot not update table and region count quota correctly when encountering exception
[ https://issues.apache.org/jira/browse/HBASE-15433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jianwei Cui updated HBASE-15433: Attachment: HBASE-15433-trunk-v2.patch Added a unit test for this case.
[jira] [Commented] (HBASE-15433) SnapshotManager#restoreSnapshot not update table and region count quota correctly when encountering exception
[ https://issues.apache.org/jira/browse/HBASE-15433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188919#comment-15188919 ] Jianwei Cui commented on HBASE-15433: - Thanks for your comments [~yuzhih...@gmail.com]. The 'else' block throws an exception when the NamespaceStateManager is not initialized, which makes sure the NamespaceAuditor is in the right state when the method is invoked. Will add a unit test for this case.
[jira] [Updated] (HBASE-15433) SnapshotManager#restoreSnapshot not update table and region count quota correctly when encountering exception
[ https://issues.apache.org/jira/browse/HBASE-15433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jianwei Cui updated HBASE-15433: Attachment: HBASE-15433-trunk-v1.patch This patch could be applied to 2.0.0, 1.4.0, 1.3.0, 1.2.0 and 1.1.4.
[jira] [Commented] (HBASE-15433) SnapshotManager#restoreSnapshot not update table and region count quota correctly when encountering exception
[ https://issues.apache.org/jira/browse/HBASE-15433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15187210#comment-15187210 ] Jianwei Cui commented on HBASE-15433: - If the snapshot contains fewer regions than the current table, 'checkAndUpdateNamespaceRegionQuota' will reduce the region count of the table. The remaining region quota may be consumed by others (such as table creation, region split, etc.) during the restore procedure; therefore, we cannot simply reset the region count for the table in the 'catch' block if the restore procedure fails. Will upload another patch to fix this case.
[jira] [Commented] (HBASE-15433) SnapshotManager#restoreSnapshot not update table and region count quota correctly when encountering exception
[ https://issues.apache.org/jira/browse/HBASE-15433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15187064#comment-15187064 ] Jianwei Cui commented on HBASE-15433: - I tried the patch on trunk and will try it on other branches.
[jira] [Updated] (HBASE-15433) SnapshotManager#restoreSnapshot not update table and region count quota correctly when encountering exception
[ https://issues.apache.org/jira/browse/HBASE-15433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jianwei Cui updated HBASE-15433: Attachment: HBASE-15433-trunk.patch
[jira] [Commented] (HBASE-15433) SnapshotManager#restoreSnapshot not update table and region count quota correctly when encountering exception
[ https://issues.apache.org/jira/browse/HBASE-15433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15187024#comment-15187024 ] Jianwei Cui commented on HBASE-15433: - Thanks for your comment [~ashish singhi], I made a patch to fix this case and will upload it after the tests pass :)
[jira] [Created] (HBASE-15433) SnapshotManager#restoreSnapshot not update table and region count quota correctly when encountering exception
Jianwei Cui created HBASE-15433: --- Summary: SnapshotManager#restoreSnapshot not update table and region count quota correctly when encountering exception Key: HBASE-15433 URL: https://issues.apache.org/jira/browse/HBASE-15433 Project: HBase Issue Type: Bug Components: snapshots Affects Versions: 2.0.0 Reporter: Jianwei Cui In SnapshotManager#restoreSnapshot, the table and region quota will be checked and updated as:
{code}
try {
  // Table already exist. Check and update the region quota for this table namespace
  checkAndUpdateNamespaceRegionQuota(manifest, tableName);
  restoreSnapshot(snapshot, snapshotTableDesc);
} catch (IOException e) {
  this.master.getMasterQuotaManager().removeTableFromNamespaceQuota(tableName);
  LOG.error("Exception occurred while restoring the snapshot " + snapshot.getName()
      + " as table " + tableName.getNameAsString(), e);
  throw e;
}
{code}
The 'checkAndUpdateNamespaceRegionQuota' will fail if regions in the snapshot make the region count quota exceeded; then the table will be removed in the 'catch' block. This will make the current table count and region count decrease, and following table creation or region split will succeed even if the actual quota is exceeded.
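The effect of the buggy catch block can be shown with a tiny simulation. This is plain Java with hypothetical names ({{QuotaLeakDemo}}, {{tryReserve}}, {{buggyRestoreFailure}} are not HBase's API): after a failed restore drops the table from the recorded count, a later reservation that should be rejected is admitted, so the real region count exceeds the quota.

```java
// Simulates the bookkeeping bug: on a failed restore the catch block drops
// the table's regions from the namespace count even though the (disabled)
// table still owns them. Names are illustrative, not HBase's API.
public class QuotaLeakDemo {
  static final int MAX_REGIONS = 12;
  int recorded;       // regions the quota manager thinks exist
  int actual;         // regions that really exist in the namespace

  QuotaLeakDemo(int existing) {
    recorded = existing;
    actual = existing;
  }

  /** Admits n new regions only if the *recorded* count allows it. */
  boolean tryReserve(int n) {
    if (recorded + n > MAX_REGIONS) {
      return false;
    }
    recorded += n;
    actual += n;
    return true;
  }

  /** The buggy catch-block behaviour: forget the failed table entirely. */
  void buggyRestoreFailure(int tableRegions) {
    recorded -= tableRegions;   // table removed from the quota...
    // ...but its regions still exist, so 'actual' is unchanged.
  }
}
```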
[jira] [Commented] (HBASE-15340) Partial row result of scan may return data violates the row-level transaction
[ https://issues.apache.org/jira/browse/HBASE-15340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15181663#comment-15181663 ] Jianwei Cui commented on HBASE-15340: - {quote} The solution of having a client aware readPnt will solve even that(?) {quote} It seems [HBASE-13099|https://issues.apache.org/jira/browse/HBASE-13099] has proposed such a solution: https://issues.apache.org/jira/browse/HBASE-13099?focusedCommentId=14337017=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14337017. However, there are cases the solution can't cover (if I am not wrong). For example: 1. the client holds the readPoint when the scanner is created on serverA and has read partial row data from serverA; 2. the region is moved to serverB before the whole row is returned; 3. before the client creates a new scanner for the row with the readPoint on serverB, new mutations are applied to the region, including deletes for the row, and a major compaction starts and completes. The major compaction could delete the cells of the row because the new server can't get a proper smallestReadPoint for the compaction before all ongoing scan requests arrive. Then, the client cannot read the remaining cells of the row after the compaction, which will break per-row atomicity for scan. > Partial row result of scan may return data violates the row-level transaction > -- > > Key: HBASE-15340 > URL: https://issues.apache.org/jira/browse/HBASE-15340 > Project: HBase > Issue Type: Bug > Components: Scanners, Transactions/MVCC >Affects Versions: 2.0.0 >Reporter: Jianwei Cui > > There are cases the region server will return partial row result, such as the > client set batch for scan or configured size limit reached. In these > situations, the client may return data that violates the row-level > transaction to the application. The following steps show the problem: > {code} > // assume there is a test table 'test_table' with one family 'F' and one > region 'region'.
> // meanwhile there are two region servers 'rsA' and 'rsB'. > 1. Let 'region' firstly located in 'rsA' and put one row with two columns > 'c1' and 'c2' as: > > put 'test_table', 'row', 'F:c1', 'value1', 'F:c2', 'value1' > 2. Start a client to scan 'test_table', with scan.setBatch(1) and > scan.setCaching(1). The client will get one column as : {column='F:c1' and > value='value1'} in the first rpc call after scanner created, and the result > will be returned to application. > 3. Before the client issues the next request, the 'region' was moved to 'rsB' > and accepted another mutations for the two columns 'c1' and 'c2' as: > > put 'test_table', 'row', 'F:c1', 'value2', 'F:c2', 'value2' > 4. Then, the client will receive a RegionMovedException when issuing next > request and will retry to open scanner on 'rsB'. The newly opened scanner > will higher mvcc than old data so that could read out column as : { > column='F:c2' with value='value2'} and return the result to application. >Therefore, the application will get data as: > 'row'column='F:c1' value='value1' > 'row'column='F:c2', value='value2' >The returned data is combined from two different mutations and violates > the row-level transaction. > {code} > The reason is that the newly opened scanner after region moved will get a > different mvcc. I am not sure whether this result is by design for scan if > partial row result is allowed. However, such row result combined from > different transactions may make the application have unexpected state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
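The scenario in the issue description can be reduced to a greatly simplified MVCC model (a sketch under assumed names, not HBase's internals): each put writes both columns of the row at one sequence id, and a scanner returns the newest value visible at its read point. Re-opening the scanner mid-row at a higher read point yields a row that mixes two transactions:

```java
import java.util.HashMap;
import java.util.Map;

// Simplified MVCC model of the partial-row scan problem. Each put writes all
// columns of the row at one sequence id; a read returns a column's latest
// value with seqId <= the scanner's read point.
public class PartialRowScanDemo {
  // column -> (seqId -> value)
  static final Map<String, Map<Long, String>> ROW = new HashMap<>();

  /** One row-level transaction: both columns written at the same sequence id. */
  static void put(long seqId, String value) {
    for (String col : new String[] {"c1", "c2"}) {
      ROW.computeIfAbsent(col, c -> new HashMap<>()).put(seqId, value);
    }
  }

  /** Latest value of the column visible at the given read point. */
  static String read(String col, long readPoint) {
    String best = null;
    long bestSeq = -1;
    for (Map.Entry<Long, String> e : ROW.get(col).entrySet()) {
      if (e.getKey() <= readPoint && e.getKey() > bestSeq) {
        bestSeq = e.getKey();
        best = e.getValue();
      }
    }
    return best;
  }
}
```

Reading 'c1' at read point 1, then 'c2' at read point 2 after the second put, reproduces the mixed row ('value1', 'value2') that violates per-row atomicity; a client-aware read point would pin both reads to read point 1.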
[jira] [Commented] (HBASE-15355) region.jsp can not be found on info server of master
[ https://issues.apache.org/jira/browse/HBASE-15355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15175490#comment-15175490 ] Jianwei Cui commented on HBASE-15355: - [~stack] Thanks for your comment :). If we decide to undo master hosting meta in the near future, this issue is not a problem IMO; otherwise, we can move the jsp files to fix this issue. BTW, do we have any design or plan to split meta for scaling? > region.jsp can not be found on info server of master > > > Key: HBASE-15355 > URL: https://issues.apache.org/jira/browse/HBASE-15355 > Project: HBase > Issue Type: Bug > Components: UI >Affects Versions: 2.0.0 >Reporter: Jianwei Cui >Priority: Minor > > After [HBASE-10569|https://issues.apache.org/jira/browse/HBASE-10569], master > is also a regionserver and it will serve regions of system tables. The meta > region info could be viewed on master at the address such as : > http://localhost:16010/region.jsp?name=1588230740. The real path of > region.jsp for the request will be hbase-webapps/master/region.jsp on master, > however, the region.jsp is under the directory hbase-webapps/regionserver, so > that can not be found on master.
[jira] [Commented] (HBASE-15355) region.jsp can not be found on info server of master
[ https://issues.apache.org/jira/browse/HBASE-15355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15170835#comment-15170835 ] Jianwei Cui commented on HBASE-15355: - Master is also a regionserver, do we need to put the jsp file in the same folder?
[jira] [Created] (HBASE-15355) region.jsp can not be found on info server of master
Jianwei Cui created HBASE-15355: --- Summary: region.jsp can not be found on info server of master Key: HBASE-15355 URL: https://issues.apache.org/jira/browse/HBASE-15355 Project: HBase Issue Type: Bug Components: UI Affects Versions: 2.0.0 Reporter: Jianwei Cui Priority: Minor After [HBASE-10569|https://issues.apache.org/jira/browse/HBASE-10569], master is also a regionserver and it will serve regions of system tables. The meta region info could be viewed on master at the address such as : http://localhost:16010/region.jsp?name=1588230740. The real path of region.jsp for the request will be hbase-webapps/master/region.jsp on master, however, the region.jsp is under the directory hbase-webapps/regionserver, so that can not be found on master. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-15340) Partial row result of scan may return data violates the row-level transaction
[ https://issues.apache.org/jira/browse/HBASE-15340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168832#comment-15168832 ] Jianwei Cui commented on HBASE-15340: - {quote} The solution of having a client aware readPnt will solve even that(?) {quote} It seems workable IMO; I will try to find whether there is any existing discussion about this issue. > Partial row result of scan may return data violates the row-level transaction > -- > > Key: HBASE-15340 > URL: https://issues.apache.org/jira/browse/HBASE-15340 > Project: HBase > Issue Type: Bug > Components: Scanners, Transactions/MVCC >Affects Versions: 2.0.0 >Reporter: Jianwei Cui > > There are cases where the region server will return a partial row result, such as when the client sets a batch for the scan or the configured size limit is reached. In these situations, the client may return data to the application that violates the row-level transaction. The following steps show the problem: > {code} > // assume there is a test table 'test_table' with one family 'F' and one region 'region'. > // meanwhile there are two region servers 'rsA' and 'rsB'. > 1. Let 'region' first be located on 'rsA' and put one row with two columns 'c1' and 'c2': > > put 'test_table', 'row', 'F:c1', 'value1', 'F:c2', 'value1' > 2. Start a client to scan 'test_table' with scan.setBatch(1) and scan.setCaching(1). The client will get one column, {column='F:c1', value='value1'}, in the first rpc call after the scanner is created, and the result will be returned to the application. > 3. Before the client issues the next request, 'region' is moved to 'rsB', which accepts another mutation for the two columns 'c1' and 'c2': > > put 'test_table', 'row', 'F:c1', 'value2', 'F:c2', 'value2' > 4. Then the client will receive a RegionMovedException when issuing the next request and will retry by opening a scanner on 'rsB'. The newly opened scanner will have a higher mvcc than the old data, so it could read out the column {column='F:c2', value='value2'} and return the result to the application. > Therefore, the application will get data as: > 'row', column='F:c1', value='value1' > 'row', column='F:c2', value='value2' > The returned data is combined from two different mutations and violates the row-level transaction. > {code} > The reason is that the newly opened scanner after the region move will get a different mvcc. I am not sure whether this result is by design for scans when partial row results are allowed. However, a row result combined from different transactions may leave the application in an unexpected state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
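The anomaly in the steps above can be reproduced with a toy model. Everything below (Cell, visible, the readPoint bookkeeping) is a simplified stand-in invented for illustration, not HBase code: it only models "a scanner sees cells whose mvcc is at or below its read point".

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class MixedRowSketch {
    // A cell tagged with the mvcc of the mutation that wrote it.
    record Cell(String qualifier, String value, long mvcc) {}

    // Latest visible value per qualifier at the given read point.
    static Map<String, String> visible(List<Cell> cells, long readPoint) {
        Map<String, String> out = new TreeMap<>();
        Map<String, Long> newest = new HashMap<>();
        for (Cell c : cells) {
            if (c.mvcc() <= readPoint
                    && c.mvcc() >= newest.getOrDefault(c.qualifier(), -1L)) {
                newest.put(c.qualifier(), c.mvcc());
                out.put(c.qualifier(), c.value());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Cell> store = new ArrayList<>(List.of(
            new Cell("c1", "value1", 1), new Cell("c2", "value1", 1)));

        // First rpc with batch=1: only 'c1' is read, at readPoint 1.
        String first = visible(store, 1).get("c1");

        // Region moves; a second whole-row mutation commits with mvcc 2.
        store.add(new Cell("c1", "value2", 2));
        store.add(new Cell("c2", "value2", 2));

        // The re-opened scanner gets a fresh readPoint 2 and reads 'c2'.
        String second = visible(store, 2).get("c2");

        // The application sees a row version that never existed as a whole.
        System.out.println("c1=" + first + " c2=" + second);
    }
}
```

Running this prints a row mixing value1 and value2, which is exactly the combined-transaction result described in the steps above.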
[jira] [Commented] (HBASE-15340) Partial row result of scan may return data violates the row-level transaction
[ https://issues.apache.org/jira/browse/HBASE-15340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168829#comment-15168829 ] Jianwei Cui commented on HBASE-15340: - After [HBASE-11544|https://issues.apache.org/jira/browse/HBASE-11544], the maxScannerResultSize of ClientScanner defaults to 2MB. This makes the server return partial results more easily when the size limit is reached, so this issue can happen even when the user does not set a batch for the scan. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-15340) Partial row result of scan may return data violates the row-level transaction
[ https://issues.apache.org/jira/browse/HBASE-15340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168771#comment-15168771 ] Jianwei Cui commented on HBASE-15340: - [~anoop.hbase], thanks for your comment, I get your point:). Yes, the case you mentioned will happen. The page https://hbase.apache.org/acid-semantics.html explains the consistency guarantee for scans: {code} A scan is not a consistent view of a table. Scans do not exhibit snapshot isolation. Rather, scans have the following properties: 1. Any row returned by the scan will be a consistent view (i.e. that version of the complete row existed at some point in time) [1] 2. A scan will always reflect a view of the data at least as new as the beginning of the scan. This satisfies the visibility guarantees enumerated below. 1. For example, if client A writes data X and then communicates via a side channel to client B, any scans started by client B will contain data at least as new as X. 2. A scan _must_ reflect all mutations committed prior to the construction of the scanner, and _may_ reflect some mutations committed subsequent to the construction of the scanner. 3. Scans must include all data written prior to the scan (except in the case where data is subsequently mutated, in which case it _may_ reflect the mutation) {code} It seems the consistency guarantee for scans is only to read out data at least as new as the beginning of the scan; there is no guarantee about whether data written concurrently with, or after, the beginning of the scan is read out. At the end of the page: {code} [1] A consistent view is not guaranteed intra-row scanning -- i.e. fetching a portion of a row in one RPC then going back to fetch another portion of the row in a subsequent RPC. Intra-row scanning happens when you set a limit on how many values to return per Scan#next (See Scan#setBatch(int)). {code} It mentions exactly the problem of this jira: a row-level consistent view is not guaranteed for intra-row scanning. So is this a known problem? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
>Therefore, the application will get data as: > 'row'column='F:c1' value='value1' > 'row'column='F:c2', value='value2' >The returned data is combined from two different mutations and violates > the row-level transaction. > {code} > The reason is that the newly opened scanner after region moved will get a > different mvcc. I am not sure whether this result is by design for scan if > partial row result is allowed. However, such row result combined from > different transactions may make the application have unexpected state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-15340) Partial row result of scan may return data violates the row-level transaction
[ https://issues.apache.org/jira/browse/HBASE-15340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168736#comment-15168736 ] Jianwei Cui commented on HBASE-15340: - [~anoop.hbase], intra-row scanning seems to come from [HBASE-1537|https://issues.apache.org/jira/browse/HBASE-1537], so versions after 0.90.0 will have this issue. I will make a patch following the idea and check the result:) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-15340) Partial row result of scan may return data violates the row-level transaction
[ https://issues.apache.org/jira/browse/HBASE-15340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168678#comment-15168678 ] Jianwei Cui commented on HBASE-15340: - [~ram_krish], IMO this is a different problem, caused by region move during scanning. When [HBASE-15325|https://issues.apache.org/jira/browse/HBASE-15325] is resolved there is no data loss; however, the returned data may be combined from different row-level transactions, which is unexpected for the application. I think we should also keep the READ_COMMITTED isolation level in this situation? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-15325) ResultScanner allowing partial result will miss the rest of the row if the region is moved between two rpc requests
[ https://issues.apache.org/jira/browse/HBASE-15325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168664#comment-15168664 ] Jianwei Cui commented on HBASE-15325: - When the user sets a batch for the scan, the client may also return a partial row result to the application and suffer this problem if the region moves. The reason is that the server judges whether the result is partial as: {code} boolean partialResultFormed() { return scannerState == NextState.SIZE_LIMIT_REACHED_MID_ROW || scannerState == NextState.TIME_LIMIT_REACHED_MID_ROW; } {code} NextState.BATCH_LIMIT_REACHED is not considered a partial result, so the ClientScanner won't get a partial result from the server and will go to the next row when retrying: {code} if (!this.lastResult.isPartial()) { if (scan.isReversed()) { scan.setStartRow(createClosestRowBefore(lastResult.getRow())); } else { scan.setStartRow(Bytes.add(lastResult.getRow(), new byte[1])); // <=== a partial result from the batch-limit-reached case will go to the next row and miss the rest of the data } } else { // we need to rescan this row because we only loaded part of the row before scan.setStartRow(lastResult.getRow()); } {code} I think if the user sets a batch for the scan, it means the user allows partial results? We can set scan.allowPartialResults to true in this situation, and the server should also treat NextState.BATCH_LIMIT_REACHED as a partial result; then the ClientScanner will receive a partial result and retry the same row if the region moved, after applying the patch. > ResultScanner allowing partial result will miss the rest of the row if the > region is moved between two rpc requests > --- > > Key: HBASE-15325 > URL: https://issues.apache.org/jira/browse/HBASE-15325 > Project: HBase > Issue Type: Bug >Affects Versions: 1.2.0, 1.1.3 >Reporter: Phil Yang >Assignee: Phil Yang >Priority: Critical > Attachments: 15325-test.txt, HBASE-15325-v1.txt > > > HBASE-11544 allows a scan rpc to return part of a row to reduce memory usage for one rpc request, and the client can setAllowPartial or setBatch to get several cells of a row instead of the whole row. > However, the status of the scanner is saved on the server, and we need it to get the next part if there was a partial result before. If we move the region to another RS, the client will get a NotServingRegionException and open a new scanner on the new RS, which will be regarded as a new scan starting from the end of this row. So the remaining cells of the row of the last result will be missing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
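The resume decision discussed in the comment above can be sketched as follows. This is a simplified model, not the actual ClientScanner code: rows are plain strings, nextStartRow is a made-up name, and the reversed-scan branch (createClosestRowBefore) is omitted.

```java
public class NextStartRowSketch {
    // Decide where a re-opened forward scanner should resume after a retry.
    static String nextStartRow(String lastRow, boolean lastWasPartial) {
        if (lastWasPartial) {
            // Rescan the same row to fetch the remaining cells.
            return lastRow;
        }
        // Appending a zero byte yields the smallest row strictly after lastRow,
        // mirroring Bytes.add(lastResult.getRow(), new byte[1]).
        return lastRow + "\0";
    }

    public static void main(String[] args) {
        // Batch limit reached mid-row but not flagged partial: after a region
        // move, the scan resumes past 'row1' and the rest of it is skipped.
        System.out.println(nextStartRow("row1", false));
        // Flagged partial (the suggested fix): 'row1' is rescanned instead.
        System.out.println(nextStartRow("row1", true));
    }
}
```

This makes the bug mechanical to see: whether the rest of the row survives a region move depends entirely on whether the batch-limit case is reported as partial.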
[jira] [Commented] (HBASE-15340) Partial row result of scan may return data violates the row-level transaction
[ https://issues.apache.org/jira/browse/HBASE-15340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168630#comment-15168630 ] Jianwei Cui commented on HBASE-15340: - A direct solution is to make ClientScanner record the readPoint when the scanner for the region is first opened, and have subsequent scanners for the same region use the same readPoint if a RegionMovedException happens. Any suggestions? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
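The proposal above could be sketched as client-side bookkeeping like this. The names (readPointForRegion, regionReadPoints) are hypothetical, and a real change would also need to plumb the recorded readPoint into the scan request sent to the new server; this only shows the record-and-reuse idea.

```java
import java.util.HashMap;
import java.util.Map;

public class ReadPointReuseSketch {
    // Remember the readPoint of the first scanner opened on each region, so a
    // scanner re-opened after a RegionMovedException can reuse it.
    private final Map<String, Long> regionReadPoints = new HashMap<>();

    long readPointForRegion(String regionName, long freshServerReadPoint) {
        // First open records the server-assigned readPoint; retries reuse it.
        return regionReadPoints.computeIfAbsent(regionName, r -> freshServerReadPoint);
    }

    public static void main(String[] args) {
        ReadPointReuseSketch client = new ReadPointReuseSketch();
        long first = client.readPointForRegion("region", 1L); // first open
        // After the region moves, the new server would hand out readPoint 2,
        // but the client keeps scanning at the recorded value.
        long retry = client.readPointForRegion("region", 2L);
        System.out.println(first + " " + retry);
    }
}
```

With both opens pinned to the same readPoint, the re-opened scanner cannot see the second mutation, so the row stays a consistent view.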
[jira] [Created] (HBASE-15340) Partial row result of scan may return data violates the row-level transaction
Jianwei Cui created HBASE-15340: --- Summary: Partial row result of scan may return data violates the row-level transaction Key: HBASE-15340 URL: https://issues.apache.org/jira/browse/HBASE-15340 Project: HBase Issue Type: Bug Components: Scanners, Transactions/MVCC Affects Versions: 2.0.0 Reporter: Jianwei Cui There are cases where the region server will return a partial row result, such as when the client sets a batch for the scan or the configured size limit is reached. In these situations, the client may return data to the application that violates the row-level transaction. The following steps show the problem: {code} // assume there is a test table 'test_table' with one family 'F' and one region 'region'. // meanwhile there are two region servers 'rsA' and 'rsB'. 1. Let 'region' first be located on 'rsA' and put one row with two columns 'c1' and 'c2': > put 'test_table', 'row', 'F:c1', 'value1', 'F:c2', 'value1' 2. Start a client to scan 'test_table' with scan.setBatch(1) and scan.setCaching(1). The client will get one column, {column='F:c1', value='value1'}, in the first rpc call after the scanner is created, and the result will be returned to the application. 3. Before the client issues the next request, 'region' is moved to 'rsB', which accepts another mutation for the two columns 'c1' and 'c2': > put 'test_table', 'row', 'F:c1', 'value2', 'F:c2', 'value2' 4. Then the client will receive a RegionMovedException when issuing the next request and will retry by opening a scanner on 'rsB'. The newly opened scanner will have a higher mvcc than the old data, so it could read out the column {column='F:c2', value='value2'} and return the result to the application. Therefore, the application will get data as: 'row', column='F:c1', value='value1' 'row', column='F:c2', value='value2' The returned data is combined from two different mutations and violates the row-level transaction. {code} The reason is that the newly opened scanner after the region move will get a different mvcc. I am not sure whether this result is by design for scans when partial row results are allowed. However, a row result combined from different transactions may leave the application in an unexpected state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-15327) Canary will always invoke admin.balancer() in each sniffing period when writeSniffing is enabled
[ https://issues.apache.org/jira/browse/HBASE-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jianwei Cui updated HBASE-15327: Attachment: HBASE-15327-trunk.patch > Canary will always invoke admin.balancer() in each sniffing period when > writeSniffing is enabled > > > Key: HBASE-15327 > URL: https://issues.apache.org/jira/browse/HBASE-15327 > Project: HBase > Issue Type: Bug > Components: canary >Affects Versions: 2.0.0 >Reporter: Jianwei Cui >Priority: Minor > Attachments: HBASE-15327-trunk.patch > > > When Canary#writeSniffing is enabled, Canary#checkWriteTableDistribution will make sure the regions of the write table are distributed on all region servers: > {code} > int numberOfServers = admin.getClusterStatus().getServers().size(); > .. > int numberOfCoveredServers = serverSet.size(); > if (numberOfCoveredServers < numberOfServers) { > admin.balancer(); > } > {code} > The master also works as a regionserver, so ClusterStatus#getServers will contain the master. On the other hand, the Canary write table will not be assigned to the master, making numberOfCoveredServers always smaller than numberOfServers, so admin.balancer is invoked in every sniffing period. This may cause frequent region moves. A simple fix is to exclude the master from numberOfServers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-15327) Canary will always invoke admin.balancer() in each sniffing period when writeSniffing is enabled
Jianwei Cui created HBASE-15327: --- Summary: Canary will always invoke admin.balancer() in each sniffing period when writeSniffing is enabled Key: HBASE-15327 URL: https://issues.apache.org/jira/browse/HBASE-15327 Project: HBase Issue Type: Bug Components: canary Affects Versions: 2.0.0 Reporter: Jianwei Cui Priority: Minor When Canary#writeSniffing is enabled, Canary#checkWriteTableDistribution will make sure the regions of the write table are distributed on all region servers: {code} int numberOfServers = admin.getClusterStatus().getServers().size(); .. int numberOfCoveredServers = serverSet.size(); if (numberOfCoveredServers < numberOfServers) { admin.balancer(); } {code} The master also works as a regionserver, so ClusterStatus#getServers will contain the master. On the other hand, the Canary write table will not be assigned to the master, making numberOfCoveredServers always smaller than numberOfServers, so admin.balancer is invoked in every sniffing period. This may cause frequent region moves. A simple fix is to exclude the master from numberOfServers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
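The suggested fix of excluding the master from numberOfServers might look roughly like this. The method name effectiveServerCount and the string server names are illustrative only, not the committed HBASE-15327 patch:

```java
import java.util.Set;

public class CanaryServerCountSketch {
    // Count only servers that can actually host the Canary write table,
    // i.e. exclude the master even though it also acts as a regionserver.
    static int effectiveServerCount(Set<String> servers, String masterName) {
        return (int) servers.stream()
            .filter(s -> !s.equals(masterName))
            .count();
    }

    public static void main(String[] args) {
        Set<String> servers = Set.of("master,16000", "rs1,16020", "rs2,16020");
        // numberOfCoveredServers == 2 no longer looks like under-coverage,
        // so admin.balancer() is not invoked every sniffing period.
        System.out.println(effectiveServerCount(servers, "master,16000"));
    }
}
```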
[jira] [Commented] (HBASE-15325) ResultScanner allowing partial result will reset to the start of the row if the region is moved between two rpc requests
[ https://issues.apache.org/jira/browse/HBASE-15325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15166847#comment-15166847 ] Jianwei Cui commented on HBASE-15325: - In ScannerCallable#call(), the NotServingRegionException will be wrapped as a DoNotRetryIOException: {code} if (ioe instanceof NotServingRegionException) { // Throw a DNRE so that we break out of cycle of calling NSRE // when what we need is to open scanner against new location. // Attach NSRE to signal client that it needs to re-setup scanner. if (this.scanMetrics != null) { this.scanMetrics.countOfNSRE.incrementAndGet(); } throw new DoNotRetryIOException("Resetting the scanner -- see exception cause", ioe); {code} > ResultScanner allowing partial result will reset to the start of the row if > the region is moved between two rpc requests > > > Key: HBASE-15325 > URL: https://issues.apache.org/jira/browse/HBASE-15325 > Project: HBase > Issue Type: Bug >Affects Versions: 1.1.3 >Reporter: Phil Yang >Assignee: Phil Yang >Priority: Critical > Attachments: 15325-test.txt > > > HBASE-11544 allows a scan rpc to return part of a row to reduce memory usage for one rpc request, and the client can setAllowPartial or setBatch to get several cells of a row instead of the whole row. > However, the status of the scanner is saved on the server, and we need it to get the next part if there was a partial result before. If we move the region to another RS, the client will get a NotServingRegionException and open a new scanner on the new RS, which will be regarded as a new scan starting from the start of this row. So we will see cells which have been seen before. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-15304) SecureBulkLoadEndpoint#bulkLoadHFiles not consider assignSeqNum flag(In 0.94 branch)
[ https://issues.apache.org/jira/browse/HBASE-15304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jianwei Cui updated HBASE-15304: Attachment: HBASE-15304-0.94-v1.patch > SecureBulkLoadEndpoint#bulkLoadHFiles not consider assignSeqNum flag(In 0.94 > branch) > > > Key: HBASE-15304 > URL: https://issues.apache.org/jira/browse/HBASE-15304 > Project: HBase > Issue Type: Bug > Components: Coprocessors >Affects Versions: 0.94.27 >Reporter: Jianwei Cui >Priority: Minor > Attachments: HBASE-15304-0.94-v1.patch > > > In 0.94, it seems SecureBulkLoadEndpoint#bulkLoadHFiles never uses the assignSeqNum flag, so the server won't assign a sequence number for bulk load hfiles even when assignSeqNum is set to true by the client. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-15304) SecureBulkLoadEndpoint#bulkLoadHFiles not consider assignSeqNum flag(In 0.94 branch)
Jianwei Cui created HBASE-15304: --- Summary: SecureBulkLoadEndpoint#bulkLoadHFiles not consider assignSeqNum flag(In 0.94 branch) Key: HBASE-15304 URL: https://issues.apache.org/jira/browse/HBASE-15304 Project: HBase Issue Type: Bug Components: Coprocessors Affects Versions: 0.94.27 Reporter: Jianwei Cui Priority: Minor In 0.94, it seems SecureBulkLoadEndpoint#bulkLoadHFiles never uses the assignSeqNum flag, so the server won't assign a sequence number for bulk load hfiles even when assignSeqNum is set to true by the client. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HBASE-15303) LoadIncrementalHFiles will encounter NoSuchMethodException when using secure
[ https://issues.apache.org/jira/browse/HBASE-15303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jianwei Cui resolved HBASE-15303. - Resolution: Invalid > LoadIncrementalHFiles will encounter NoSuchMethodException when using secure > > > Key: HBASE-15303 > URL: https://issues.apache.org/jira/browse/HBASE-15303 > Project: HBase > Issue Type: Bug > Components: Coprocessors >Affects Versions: 0.94.27 >Reporter: Jianwei Cui > > After [HBASE-8521|https://issues.apache.org/jira/browse/HBASE-8521], LoadIncrementalHFiles could ask the server to assign a sequence id for bulk load hfiles by invoking SecureBulkLoadClient#bulkLoadHFiles: > {code} > public boolean bulkLoadHFiles(List> familyPaths, > Token userToken, > String bulkToken, boolean assignSeqNum) throws IOException { > try { > return (Boolean) Methods.call(protocolClazz, proxy, "bulkLoadHFiles", > new Class[] { > List.class, Token.class, String.class, Boolean.class }, > new Object[] { familyPaths, userToken, bulkToken, assignSeqNum }); > } catch (Exception e) { > throw new IOException("Failed to bulkLoadHFiles", e); > } > } > {code} > However, SecureBulkLoadProtocol does not define such a method (with assignSeqNum as the last parameter), so the client will encounter a NoSuchMethodException when the secure endpoint is used. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-15303) LoadIncrementalHFiles will encounter NoSuchMethodException when using secure
Jianwei Cui created HBASE-15303: --- Summary: LoadIncrementalHFiles will encounter NoSuchMethodException when using secure Key: HBASE-15303 URL: https://issues.apache.org/jira/browse/HBASE-15303 Project: HBase Issue Type: Bug Components: Coprocessors Affects Versions: 0.94.27 Reporter: Jianwei Cui After [HBASE-8521|https://issues.apache.org/jira/browse/HBASE-8521], LoadIncrementalHFiles could ask the server to assign a sequence id for bulk load hfiles by invoking SecureBulkLoadClient#bulkLoadHFiles: {code} public boolean bulkLoadHFiles(List> familyPaths, Token userToken, String bulkToken, boolean assignSeqNum) throws IOException { try { return (Boolean) Methods.call(protocolClazz, proxy, "bulkLoadHFiles", new Class[] { List.class, Token.class, String.class, Boolean.class }, new Object[] { familyPaths, userToken, bulkToken, assignSeqNum }); } catch (Exception e) { throw new IOException("Failed to bulkLoadHFiles", e); } } {code} However, SecureBulkLoadProtocol does not define such a method (with assignSeqNum as the last parameter), so the client will encounter a NoSuchMethodException when the secure endpoint is used. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
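The failure mode is plain Java reflection behavior: looking up a method signature the target type never declared throws NoSuchMethodException. A self-contained illustration, where Protocol is a made-up stand-in for the real SecureBulkLoadProtocol (which lacks the extra Boolean parameter):

```java
import java.lang.reflect.Method;

public class ReflectionMismatchSketch {
    // Stand-in protocol: declares bulkLoadHFiles without an assignSeqNum flag.
    interface Protocol {
        boolean bulkLoadHFiles(String familyPaths);
    }

    public static void main(String[] args) {
        try {
            // Reflective lookup of a signature the interface never declared,
            // mirroring the mismatched Methods.call in the description above.
            Method m = Protocol.class.getMethod(
                "bulkLoadHFiles", String.class, Boolean.class);
            System.out.println("found " + m);
        } catch (NoSuchMethodException e) {
            // This branch is taken: the extra Boolean parameter has no match.
            System.out.println("NoSuchMethodException: " + e.getMessage());
        }
    }
}
```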
[jira] [Commented] (HBASE-14259) Backport Namespace quota support to 98 branch
[ https://issues.apache.org/jira/browse/HBASE-14259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118848#comment-15118848 ] Jianwei Cui commented on HBASE-14259:
Thanks for the patch, [~avandana]. There seems to be a typo in patch v3:
{code}
+ public void updateQuotaForRegionMerge(HRegionInfo hri) throws IOException {
+   if (isInitialized()) { // ===> should be: if (!isInitialized()) {
+     throw new IOException(
+         "Merge operation is being performed even before namespace auditor is initialized.");
+   }
{code}
In TestZKLessNamespaceAuditor, setBoolean("hbase.assignment.usezk", false) is applied to TestZKLessNamespaceAuditor#UTIL, not TestNamespaceAuditor#UTIL, so TestZKLessNamespaceAuditor will still use ZooKeeper to assign regions. Should TestNamespaceAuditor#UTIL be made protected?
{code}
+@Category(MediumTests.class)
+public class TestZKLessNamespaceAuditor extends TestNamespaceAuditor {
+  private static final HBaseTestingUtility UTIL = new HBaseTestingUtility();
+
+  @BeforeClass
+  public static void before() throws Exception {
+    UTIL.getConfiguration().setBoolean("hbase.assignment.usezk", false);
+    setupOnce();
+  }
{code}
In AssignmentManager:
{code}
case READY_TO_MERGE:
case MERGE_PONR:
case MERGED:
+ try {
+   regionStateListener.onRegionMerged(hri);
+ } catch (IOException exp) {
+   errorMsg = StringUtils.stringifyException(exp);
+ }
{code}
regionStateListener.onRegionMerged should be invoked only in the MERGED case:
{code}
case MERGE_PONR:
case MERGED:
+ if (code == TransitionCode.MERGED) {
+   try {
+     regionStateListener.onRegionMerged(hri);
+   } catch (IOException exp) {
+     errorMsg = StringUtils.stringifyException(exp);
+   }
+ }
{code}

> Backport Namespace quota support to 98 branch
> --
>
> Key: HBASE-14259
> URL: https://issues.apache.org/jira/browse/HBASE-14259
> Project: HBase
> Issue Type: Task
> Reporter: Vandana Ayyalasomayajula
> Assignee: Andrew Purtell
> Fix For: 0.98.18
>
> Attachments: HBASE-14259_v1_0.98.patch, HBASE-14259_v2_0.98.patch, HBASE-14259_v3_0.98.patch
>
> Namespace quota support (HBASE-8410) has been backported to branch-1 (HBASE-13438). This jira would backport the same to 98 branch.
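The switch fall-through issue raised in the review comment above can be illustrated with a self-contained sketch. The enum and listener hook are hypothetical simplifications of AssignmentManager's transition handling: grouped `case` labels share one body, so without the guard the merge listener would fire for every grouped transition, not just MERGED.

```java
public class MergeGuardDemo {
    public enum TransitionCode { READY_TO_MERGE, MERGE_PONR, MERGED }

    public static int mergedCallbacks = 0;

    // Hypothetical hook standing in for RegionStateListener#onRegionMerged.
    static void onRegionMerged() { mergedCallbacks++; }

    // The grouped cases fall through to shared handling; the guard ensures
    // the listener fires only for the MERGED transition, as the review suggests.
    public static void handle(TransitionCode code) {
        switch (code) {
        case READY_TO_MERGE:
        case MERGE_PONR:
        case MERGED:
            if (code == TransitionCode.MERGED) {
                onRegionMerged();
            }
            break;
        }
    }
}
```

Feeding all three transition codes through `handle` triggers the callback exactly once, for MERGED.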
[jira] [Created] (HBASE-14992) Add cache stats of past n periods in region server status page
Jianwei Cui created HBASE-14992:
---
Summary: Add cache stats of past n periods in region server status page
Key: HBASE-14992
URL: https://issues.apache.org/jira/browse/HBASE-14992
Project: HBase
Issue Type: Improvement
Components: BlockCache, metrics
Affects Versions: 2.0.0
Reporter: Jianwei Cui
Priority: Minor

The cache stats of the past n periods, such as SumHitCountsPastNPeriods, SumHitCachingCountsPastNPeriods, etc., are useful for indicating the real-time read load of a region server, especially during temporary read peaks. It would be helpful to add such metrics to the BlockCache#Stats tab of the region server status page. Discussion and suggestions are welcome.
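One plausible way to implement "past n periods" counters like the SumHitCountsPastNPeriods metric proposed above is a ring buffer of completed-period counts. The sketch below is illustrative only, an assumption about the shape of such a stat, not HBase's actual CacheStats implementation.

```java
public class RollingCacheStats {
    private final long[] hitCounts;   // one slot per completed period
    private int windowIndex = 0;
    private long currentHits = 0;

    public RollingCacheStats(int numPeriods) {
        hitCounts = new long[numPeriods];
    }

    // Record one cache hit in the current (still open) period.
    public void hit() { currentHits++; }

    // Close out the current period: store its count in the ring buffer,
    // overwriting the oldest period once the window is full.
    public void rollMetricsPeriod() {
        hitCounts[windowIndex] = currentHits;
        currentHits = 0;
        windowIndex = (windowIndex + 1) % hitCounts.length;
    }

    // Sum of hits over the past n completed periods.
    public long sumHitCountsPastNPeriods() {
        long sum = 0;
        for (long h : hitCounts) sum += h;
        return sum;
    }
}
```

A periodic chore would call `rollMetricsPeriod()` once per interval; the status page would then read `sumHitCountsPastNPeriods()` to show recent load rather than lifetime totals.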
[jira] [Commented] (HBASE-14936) CombinedBlockCache should overwrite CacheStats#rollMetricsPeriod()
[ https://issues.apache.org/jira/browse/HBASE-14936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15055856#comment-15055856 ] Jianwei Cui commented on HBASE-14936:
Thanks for your review, [~chenheng].

> CombinedBlockCache should overwrite CacheStats#rollMetricsPeriod()
> --
>
> Key: HBASE-14936
> URL: https://issues.apache.org/jira/browse/HBASE-14936
> Project: HBase
> Issue Type: Bug
> Components: BlockCache
> Affects Versions: 1.1.2
> Reporter: Jianwei Cui
> Assignee: Jianwei Cui
> Fix For: 2.0.0, 1.2, 1.1, 1.3, 1.0
>
> Attachments: HBASE-14936-branch-1.0-1.1.patch, HBASE-14936-branch-1.0-addendum.patch, HBASE-14936-trunk-v1.patch, HBASE-14936-trunk-v2.patch, HBASE-14936-trunk.patch
>
> It seems CombinedBlockCache should overwrite CacheStats#rollMetricsPeriod() as
> {code}
> public void rollMetricsPeriod() {
>   lruCacheStats.rollMetricsPeriod();
>   bucketCacheStats.rollMetricsPeriod();
> }
> {code}
> otherwise, CombinedBlockCache.getHitRatioPastNPeriods() and CombinedBlockCache.getHitCachingRatioPastNPeriods() will always return 0.
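The fix quoted in the issue is a straightforward composite delegation. The stand-alone sketch below uses a hypothetical simplified `Stats` class, not HBase's actual CacheStats, to show why the override matters: if the combined stats object does not delegate `rollMetricsPeriod` to both inner caches, the inner periods never roll and any past-n-periods figure derived from them stays at zero.

```java
public class CombinedStatsDemo {
    // Minimal stand-in for CacheStats with a per-period counter.
    public static class Stats {
        public long periodHits = 0;
        public long rolledHits = 0;
        public void hit() { periodHits++; }
        public void rollMetricsPeriod() { rolledHits += periodHits; periodHits = 0; }
    }

    // Composite stats over two inner caches, mirroring the LRU + bucket
    // cache pair. Overriding rollMetricsPeriod to delegate is the fix; a
    // no-op inherited version would leave both inner periods unrolled.
    public static class Combined extends Stats {
        public final Stats lru = new Stats();
        public final Stats bucket = new Stats();
        @Override
        public void rollMetricsPeriod() {
            lru.rollMetricsPeriod();
            bucket.rollMetricsPeriod();
        }
    }
}
```

After one hit on each inner cache and a single roll on the composite, both inner stats have rolled their period counts, which is what the past-n-periods ratios need.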
[jira] [Commented] (HBASE-14936) CombinedBlockCache should overwrite CacheStats#rollMetricsPeriod()
[ https://issues.apache.org/jira/browse/HBASE-14936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15055786#comment-15055786 ] Jianwei Cui commented on HBASE-14936:
Sure. It seems HBASE-14936-trunk-v2.patch can be applied to branch-1.2; I added a patch for branch-1.0 and branch-1.1.

> CombinedBlockCache should overwrite CacheStats#rollMetricsPeriod()
> --
>
> Key: HBASE-14936
> URL: https://issues.apache.org/jira/browse/HBASE-14936
> Project: HBase
> Issue Type: Bug
> Components: BlockCache
> Affects Versions: 1.1.2
> Reporter: Jianwei Cui
> Assignee: Jianwei Cui
> Fix For: 2.0.0, 1.2, 1.1, 1.3
>
> Attachments: HBASE-14936-branch-1.0-1.1.patch, HBASE-14936-trunk-v1.patch, HBASE-14936-trunk-v2.patch, HBASE-14936-trunk.patch
>
> It seems CombinedBlockCache should overwrite CacheStats#rollMetricsPeriod() as
> {code}
> public void rollMetricsPeriod() {
>   lruCacheStats.rollMetricsPeriod();
>   bucketCacheStats.rollMetricsPeriod();
> }
> {code}
> otherwise, CombinedBlockCache.getHitRatioPastNPeriods() and CombinedBlockCache.getHitCachingRatioPastNPeriods() will always return 0.
[jira] [Updated] (HBASE-14936) CombinedBlockCache should overwrite CacheStats#rollMetricsPeriod()
[ https://issues.apache.org/jira/browse/HBASE-14936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jianwei Cui updated HBASE-14936:
Attachment: HBASE-14936-branch-1.0-1.1.patch

Patch for branch-1.0 and branch-1.1.

> CombinedBlockCache should overwrite CacheStats#rollMetricsPeriod()
> --
>
> Key: HBASE-14936
> URL: https://issues.apache.org/jira/browse/HBASE-14936
> Project: HBase
> Issue Type: Bug
> Components: BlockCache
> Affects Versions: 1.1.2
> Reporter: Jianwei Cui
> Assignee: Jianwei Cui
> Fix For: 2.0.0, 1.2, 1.1, 1.3
>
> Attachments: HBASE-14936-branch-1.0-1.1.patch, HBASE-14936-trunk-v1.patch, HBASE-14936-trunk-v2.patch, HBASE-14936-trunk.patch
>
> It seems CombinedBlockCache should overwrite CacheStats#rollMetricsPeriod() as
> {code}
> public void rollMetricsPeriod() {
>   lruCacheStats.rollMetricsPeriod();
>   bucketCacheStats.rollMetricsPeriod();
> }
> {code}
> otherwise, CombinedBlockCache.getHitRatioPastNPeriods() and CombinedBlockCache.getHitCachingRatioPastNPeriods() will always return 0.
[jira] [Updated] (HBASE-14936) CombinedBlockCache should overwrite CacheStats#rollMetricsPeriod()
[ https://issues.apache.org/jira/browse/HBASE-14936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jianwei Cui updated HBASE-14936:
Attachment: HBASE-14936-trunk-v2.patch

Added the license header for TestCombinedBlockCache.java.

> CombinedBlockCache should overwrite CacheStats#rollMetricsPeriod()
> --
>
> Key: HBASE-14936
> URL: https://issues.apache.org/jira/browse/HBASE-14936
> Project: HBase
> Issue Type: Bug
> Components: BlockCache
> Affects Versions: 1.1.2
> Reporter: Jianwei Cui
> Assignee: Jianwei Cui
> Fix For: 2.0.0, 1.2, 1.1, 1.3
>
> Attachments: HBASE-14936-trunk-v1.patch, HBASE-14936-trunk-v2.patch, HBASE-14936-trunk.patch
>
> It seems CombinedBlockCache should overwrite CacheStats#rollMetricsPeriod() as
> {code}
> public void rollMetricsPeriod() {
>   lruCacheStats.rollMetricsPeriod();
>   bucketCacheStats.rollMetricsPeriod();
> }
> {code}
> otherwise, CombinedBlockCache.getHitRatioPastNPeriods() and CombinedBlockCache.getHitCachingRatioPastNPeriods() will always return 0.
[jira] [Commented] (HBASE-14936) CombinedBlockCache should overwrite CacheStats#rollMetricsPeriod()
[ https://issues.apache.org/jira/browse/HBASE-14936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15050452#comment-15050452 ] Jianwei Cui commented on HBASE-14936:
There is no need to overwrite getHitRatio() because getHitCount() and getRequestCount() have been overwritten.

> CombinedBlockCache should overwrite CacheStats#rollMetricsPeriod()
> --
>
> Key: HBASE-14936
> URL: https://issues.apache.org/jira/browse/HBASE-14936
> Project: HBase
> Issue Type: Bug
> Components: BlockCache
> Affects Versions: 1.1.2
> Reporter: Jianwei Cui
>
> Attachments: HBASE-14936-trunk-v1.patch, HBASE-14936-trunk.patch
>
> It seems CombinedBlockCache should overwrite CacheStats#rollMetricsPeriod() as
> {code}
> public void rollMetricsPeriod() {
>   lruCacheStats.rollMetricsPeriod();
>   bucketCacheStats.rollMetricsPeriod();
> }
> {code}
> otherwise, CombinedBlockCache.getHitRatioPastNPeriods() and CombinedBlockCache.getHitCachingRatioPastNPeriods() will always return 0.
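The reasoning in the comment above, that a ratio accessor needs no override when it is derived from the count accessors, can be sketched with a hypothetical simplified stats hierarchy (not HBase's actual CacheStats classes):

```java
public class HitRatioDemo {
    // Base class computes the ratio from the count accessors, so a subclass
    // that overrides only the accessors gets a correct ratio for free.
    public static class Stats {
        public long getHitCount() { return 0; }
        public long getRequestCount() { return 0; }
        public double getHitRatio() {
            long requests = getRequestCount();
            return requests == 0 ? 0.0 : (double) getHitCount() / requests;
        }
    }

    // Combined stats override the counts (values here are hypothetical,
    // standing in for summed LRU + bucket cache counters).
    public static class Combined extends Stats {
        @Override public long getHitCount() { return 30; }
        @Override public long getRequestCount() { return 100; }
    }
}
```

Because `getHitRatio()` dispatches to the overridden accessors at runtime, `Combined` reports the combined ratio without overriding the ratio method itself; this is the template-method shape the comment relies on.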