[jira] [Comment Edited] (HBASE-21011) Provide CLI option to run oldwals and hfiles cleaner separately when cleaner chore is disabled

2018-08-10 Thread Tak Lon (Stephen) Wu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16577029#comment-16577029
 ] 

Tak Lon (Stephen) Wu edited comment on HBASE-21011 at 8/11/18 5:08 AM:
---

Thanks [~reidchan] for the suggestion about using {{hdfs -rm}}, but I don't 
think that is a good suggestion for operators, because an error-prone entry 
(pointing at a directory other than oldWALs or the HFiles archive directory) 
while using {{-rm}} could, I hope not, lead to a data-loss disaster (that's why 
I don't suggest operators use the hdfs command as a workaround if it can be 
avoided). But I do agree with your point about an `endless requirement`; once I 
hear back about our use cases, I will close this item. 

Also, I think part of HBase on Cloud should be related to HBASE-20952, which 
decouples the WAL interface from HDFS (although I haven't commented there), but 
IMHO backward compatibility should concern not only this {{cleaner_chore_run}} 
admin CLI but all components that handle the WAL. Anyway, I will keep this in 
mind for future WAL-related improvements/changes.




was (Author: taklwu):
Thanks [~reidchan] for the suggestion about using {{hdfs -rm}}, but I don't 
think that is a good suggestion for operators, because an error-prone entry 
while using {{-rm}} could lead to a data-loss disaster. But I do agree with 
your point about an `endless requirement`; once I hear back about our use 
cases, I will close this item. 

Also, I think part of HBase on Cloud should be related to HBASE-20952, which 
decouples the WAL interface from HDFS (although I haven't commented there), but 
IMHO backward compatibility should concern not only this {{cleaner_chore_run}} 
admin CLI but all components that handle the WAL. Anyway, I will keep this in 
mind for future WAL-related improvements/changes.



> Provide CLI option to run oldwals and hfiles cleaner separately when cleaner 
> chore is disabled
> --
>
> Key: HBASE-21011
> URL: https://issues.apache.org/jira/browse/HBASE-21011
> Project: HBase
>  Issue Type: Improvement
>  Components: Admin, Client
>Affects Versions: 3.0.0, 1.4.6, 2.1.1
>Reporter: Tak Lon (Stephen) Wu
>Assignee: Tak Lon (Stephen) Wu
>Priority: Minor
> Attachments: HBASE-21011.master.001.patch, 
> HBASE-21011.master.002.patch, HBASE-21011.master.003.patch, 
> HBASE-21011.master.004.patch
>
>
> There is a corner case: when the cleaner chore for HFiles and oldwals is 
> disabled, the admin/user needs to manually execute the admin command 
> {{cleaner_chore_run}} to clean up old HFiles and oldwals. The existing logic of 
> {{cleaner_chore_run}} is to [first trigger the HFiles cleaner and then the 
> oldwals 
> cleaner|https://github.com/taklwu/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterRpcServices.java#L1414-L1420],
>  and it only returns success if both complete. 
> But when running {{cleaner_chore_run}}, there is a potential use case where the 
> admin would like to trigger the cleaner for only oldwals or HFiles while 
> keeping the automatic cleaner chore disabled. This change aims to support that 
> corner case and gives users who keep the cleaner chore disabled by default the 
> flexibility to run the oldwals and HFiles cleaning procedures individually from 
> the admin CLI.
> NOTE: {{cleaner_chore_run}} was introduced in HBASE-17280; this patch adds the 
> options 'hfiles' and 'oldwals' to it. It also changes the default behavior so 
> that {{cleaner_chore_run}} only runs when the cleaner chore is disabled. The 
> proposed admin CLI options are:
> {noformat}
> hbase> cleaner_chore_run   # introduced in HBASE-17280; now runs 
> only when the cleaner chore is disabled
> hbase> cleaner_chore_run 'hfiles'  # added; runs only when the cleaner chore 
> is disabled
> hbase> cleaner_chore_run 'oldwals' # added; runs only when the cleaner chore 
> is disabled
> {noformat}
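
For reference, the existing single-command behavior described above can also be 
driven from the Java client; a minimal sketch, assuming the {{Admin}} methods 
added alongside the shell command in HBASE-17280 ({{runCleanerChore()}}, 
{{setCleanerChoreRunning()}}). The proposed 'hfiles'/'oldwals' arguments exist 
only in the attached patches and are not shown here.
{noformat}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class ManualCleanerRun {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    try (Connection conn = ConnectionFactory.createConnection(conf);
         Admin admin = conn.getAdmin()) {
      // Keep the periodic chore switched off, but trigger one manual pass,
      // which runs the HFile cleaner first and then the oldWALs cleaner.
      admin.setCleanerChoreRunning(false);
      boolean ran = admin.runCleanerChore();
      System.out.println("manual cleaner pass succeeded: " + ran);
    }
  }
}
{noformat}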



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-21011) Provide CLI option to run oldwals and hfiles cleaner separately when cleaner chore is disabled

2018-08-10 Thread Reid Chan (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16577042#comment-16577042
 ] 

Reid Chan edited comment on HBASE-21011 at 8/11/18 4:02 AM:


I stand by my point, and my vote won't change.

Please ensure it has at least two +1s from other committers before it goes into 
any branch, since there's a -1. 


was (Author: reidchan):
I stand by my point, and my vote won't change.

Please ensure it has at least two {{+1}}s from other committers before it goes 
into any branch, since there's a {{-1}}. :)




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21011) Provide CLI option to run oldwals and hfiles cleaner separately when cleaner chore is disabled

2018-08-10 Thread Reid Chan (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16577042#comment-16577042
 ] 

Reid Chan commented on HBASE-21011:
---

I stand by my point, and my vote won't change.

Please ensure it has at least two {{+1}}s from other committers before it goes 
into any branch, since there's a {{-1}}. :)




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21011) Provide CLI option to run oldwals and hfiles cleaner separately when cleaner chore is disabled

2018-08-10 Thread Reid Chan (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16577037#comment-16577037
 ] 

Reid Chan commented on HBASE-21011:
---

{{rm}} loses nothing; if you think it does, then you still don't understand the 
mechanism of the cleaner and its target. (I didn't ask you to {{rm}} the data 
dir.)

{quote}
Exposing too many server-side details to the client is not always good 
practice, not to mention it only covers 1% of users' needs; HBase can't cook 
everything.
{quote}
You can cook it yourself in your company's release.




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21011) Provide CLI option to run oldwals and hfiles cleaner separately when cleaner chore is disabled

2018-08-10 Thread Tak Lon (Stephen) Wu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16577029#comment-16577029
 ] 

Tak Lon (Stephen) Wu commented on HBASE-21011:
--

Thanks [~reidchan] for the suggestion about using {{hdfs -rm}}, but I don't 
think that is a good suggestion for operators, because an error-prone entry 
while using {{-rm}} could lead to a data-loss disaster. But I do agree with 
your point about an `endless requirement`; once I hear back about our use 
cases, I will close this item. 

Also, I think part of HBase on Cloud should be related to HBASE-20952, which 
decouples the WAL interface from HDFS (although I haven't commented there), but 
IMHO backward compatibility should concern not only this {{cleaner_chore_run}} 
admin CLI but all components that handle the WAL. Anyway, I will keep this in 
mind for future WAL-related improvements/changes.






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21018) RS crashed because AsyncFS was unable to update HDFS data encryption key

2018-08-10 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-21018:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.1.1
   2.0.2
   3.0.0
   Status: Resolved  (was: Patch Available)

Pushed to branch-2.0+. Thanks for the patch, [~jojochuang].

> RS crashed because AsyncFS was unable to update HDFS data encryption key
> 
>
> Key: HBASE-21018
> URL: https://issues.apache.org/jira/browse/HBASE-21018
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 2.0.0
> Environment: Hadoop 3.0.0, HBase 2.0.0, 
> HDFS configuration dfs.encrypt.data.transfer = true
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Critical
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: HBASE-21018.master.001.patch
>
>
> We (+[~uagashe]) found that the HBase RegionServer doesn't update the HDFS 
> data encryption key correctly, and in some cases, after retrying 10 times, it 
> aborts.
> {noformat}
> 2018-08-03 17:37:03,233 WARN 
> org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputHelper: create 
> fan-out dfs output 
> /hbase/WALs/rs1.example.com,22101,1533318719239/rs1.example.com%2C22101%2C1533318719239.rs1.example.com%2C22101%2C1533318719239.regiongroup-0.1533343022981
>  failed, retry = 1
> org.apache.hadoop.hdfs.protocol.datatransfer.InvalidEncryptionKeyException: 
> Can't re-compute encryption key for nonce, since the required block key 
> (keyID=1685436998) doesn't exist. Current key: 1085959374
> at 
> org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputSaslHelper$SaslNegotiateHandler.check(FanOutOneBlockAsyncDFSOutputSaslHelper.java:399)
> at 
> org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputSaslHelper$SaslNegotiateHandler.channelRead(FanOutOneBlockAsyncDFSOutputSaslHelper.java:470)
> at 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
> at 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
> at 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
> at 
> org.apache.hbase.thirdparty.io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
> at 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
> at 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
> at 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
> at 
> org.apache.hbase.thirdparty.io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:310)
> at 
> org.apache.hbase.thirdparty.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:284)
> at 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
> at 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
> at 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
> at 
> org.apache.hbase.thirdparty.io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:286)
> at 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
> at 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
> at 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
> at 
> org.apache.hbase.thirdparty.io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1359)
> at 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
> at 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
> at 
> 

[jira] [Commented] (HBASE-21011) Provide CLI option to run oldwals and hfiles cleaner separately when cleaner chore is disabled

2018-08-10 Thread Reid Chan (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16577012#comment-16577012
 ] 

Reid Chan commented on HBASE-21011:
---

You can always delete the specific directory using the {{hdfs -rm}} command, 
since you know the details.

And with your feature applied, your client may ask "what if I only want to 
delete some directories, not all"; it is an endless requirement. And even for 
that requirement, you could clean them up as I said above.

The team is working on `HBase on Cloud`, where WALs may no longer exist; with 
your feature applied, backward compatibility would have to be taken into 
consideration for such a corner case, which I don't think is worth it and can 
be avoided now.

Exposing too many server-side details to the client is not always good 
practice, not to mention it only covers 1% of users' needs; HBase can't cook 
everything.





--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21005) Maven site configuration causes downstream projects to get a directory named ${project.basedir}

2018-08-10 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576993#comment-16576993
 ] 

Sean Busbey commented on HBASE-21005:
-

+1

> Maven site configuration causes downstream projects to get a directory named 
> ${project.basedir}
> ---
>
> Key: HBASE-21005
> URL: https://issues.apache.org/jira/browse/HBASE-21005
> Project: HBase
>  Issue Type: Bug
>  Components: build
>Affects Versions: 2.0.0
>Reporter: Matt Burgess
>Assignee: Josh Elser
>Priority: Minor
> Attachments: HBASE-21005.001.patch, HBASE-21005.002.patch
>
>
> Matt told me about this interesting issue they see in Apache NiFi's build.
> NiFi depends on HBase for some code that they provide to their users. As part 
> of NiFi's build process, they see a directory named 
> {{$\{project.basedir}}} get created the first time they build with an empty 
> Maven repo. Matt reports that after a javax.el artifact is cached, Maven will 
> stop creating the directory; however, if you wipe that artifact from the 
> Maven repo, the next build will end up re-creating it.
> I believe I've seen this with Phoenix, too, but never investigated why it was 
> actually happening.
> My hunch is that it's related to the local Maven repo that we create to 
> "patch" in our custom maven-fluido-skin jar (HBASE-14785). I'm not sure if we 
> can work around this by pushing the custom local repo into a profile and only 
> activating it for the mvn site. Another solution would be to publish the 
> maven-fluido-skin jar to central with custom coordinates.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20512) document change to running tests on secure clusters

2018-08-10 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576991#comment-16576991
 ] 

Sean Busbey commented on HBASE-20512:
-

In general I'd suggest including the long description in the earlier version's 
upgrade considerations section and then referencing it from newer ones (like we 
did for some 1.4+ stuff that's also mentioned in 2.0+), but I don't feel 
strongly about it.

+1

> document change to running tests on secure clusters
> ---
>
> Key: HBASE-20512
> URL: https://issues.apache.org/jira/browse/HBASE-20512
> Project: HBase
>  Issue Type: Task
>  Components: documentation, integration tests, Usability
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Sean Busbey
>Assignee: stack
>Priority: Critical
> Fix For: 3.0.0, 2.0.2
>
> Attachments: HBASE-20512.master.001.patch
>
>
> We should document the change to authentication handling in HBASE-16231 in 
> the upgrade section of the reference guide.
> It's surprising to folks that have existing automated testing that's been 
> working on our prior stable release lines. We should give a warning to those 
> updating. The release note is probably suitable for a first pass.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21011) Provide CLI option to run oldwals and hfiles cleaner separately when cleaner chore is disabled

2018-08-10 Thread Tak Lon (Stephen) Wu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576977#comment-16576977
 ] 

Tak Lon (Stephen) Wu commented on HBASE-21011:
--

Thanks [~yuzhih...@gmail.com]. In fact I do get what [~reidchan] is trying to 
help with in this case, and yeah, just switching on the cleaner chore would 
save my ass (because HBASE-18309 and HBASE-18083 have been backported to 
branch-1). I'm also talking with one of our users to see if they agree to just 
turn the cleaner chore on by default, and will come back to update here when I 
can close it.




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20941) Create and implement HbckService in master

2018-08-10 Thread Umesh Agashe (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576973#comment-16576973
 ] 

Umesh Agashe commented on HBASE-20941:
--

Thanks for the review, [~busbey]. Working on changes per review comments.

> Create and implement HbckService in master
> --
>
> Key: HBASE-20941
> URL: https://issues.apache.org/jira/browse/HBASE-20941
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Umesh Agashe
>Assignee: Umesh Agashe
>Priority: Major
> Attachments: hbase-20941.master.001.patch
>
>
> Create HbckService in master and implement the following methods:
>  # setTableState(): If table states are inconsistent with the actions/ 
> procedures working on them, manipulating their states in meta can sometimes 
> fix things.
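
A hypothetical sketch of the shape such a client-facing repair call could take, 
drawn only from the description above (the committed HbckService API may 
differ):
{noformat}
import java.io.IOException;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.TableState;

/** Hypothetical client view of the repair method described above. */
interface HbckRepairClient {
  /**
   * Force the state recorded for a table in hbase:meta, for cases where the
   * persisted state has drifted from what running procedures expect.
   * @return the state that was previously recorded
   */
  TableState setTableState(TableName table, TableState.State newState) throws IOException;
}
{noformat}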



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21028) Backport HBASE-18633 to branch-1.3

2018-08-10 Thread Tak Lon (Stephen) Wu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576970#comment-16576970
 ] 

Tak Lon (Stephen) Wu commented on HBASE-21028:
--

[non-binding] +1, and thanks for fixing the checkstyle.

> Backport HBASE-18633 to branch-1.3
> --
>
> Key: HBASE-21028
> URL: https://issues.apache.org/jira/browse/HBASE-21028
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Affects Versions: 1.3.2
>Reporter: Daniel Wong
>Priority: Minor
> Fix For: 1.3.3
>
> Attachments: HBASE-21028-branch-1.3.patch, 
> HBASE-21028-branch-1.3.patch
>
>
> The logging improvements in HBASE-18633 would give greater visibility on 
> systems in 1.3.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21005) Maven site configuration causes downstream projects to get a directory named ${project.basedir}

2018-08-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576967#comment-16576967
 ] 

Hadoop QA commented on HBASE-21005:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
13s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:orange}-0{color} | {color:orange} test4tests {color} | {color:orange}  
0m  0s{color} | {color:orange} The patch doesn't appear to include any new or 
modified tests. Please justify why no new tests are needed for this patch. Also 
please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
45s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
31s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 16m 
31s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
30s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
51s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m  
4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  6m  
4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 14m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
5s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
12s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
8m  1s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 
or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
41s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}193m  
1s{color} | {color:green} root in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
30s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}269m 50s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21005 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12935177/HBASE-21005.002.patch 
|
| Optional Tests |  asflicense  javac  javadoc  unit  shadedjars  hadoopcheck  
xml  compile  mvnsite  |
| uname | Linux 7f006fdc696c 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 86821dee22 |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_171 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/14004/testReport/ |
| Max. process+thread count | 4591 (vs. ulimit of 1) |
| modules | C: . U: . |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/14004/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically generated.



> Maven site configuration causes downstream projects to get a directory named 
> ${project.basedir}
> ---
>
> Key: HBASE-21005
>  

[jira] [Commented] (HBASE-21028) Backport HBASE-18633 to branch-1.3

2018-08-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576953#comment-16576953
 ] 

Hadoop QA commented on HBASE-21028:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
13s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-1.3 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
33s{color} | {color:green} branch-1.3 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
33s{color} | {color:green} branch-1.3 passed with JDK v1.8.0_172 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
36s{color} | {color:green} branch-1.3 passed with JDK v1.7.0_181 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
18s{color} | {color:green} branch-1.3 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  2m 
32s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
25s{color} | {color:green} branch-1.3 passed with JDK v1.8.0_172 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
35s{color} | {color:green} branch-1.3 passed with JDK v1.7.0_181 {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed with JDK v1.8.0_172 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed with JDK v1.7.0_181 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  2m 
28s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
6m  7s{color} | {color:green} Patch does not cause any errors with Hadoop 2.4.1 
2.5.2 2.6.5 2.7.4. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed with JDK v1.8.0_172 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed with JDK v1.7.0_181 {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 81m 
11s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}103m 13s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:53dba69 |
| JIRA Issue | HBASE-21028 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12935199/HBASE-21028-branch-1.3.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 57c76dce6843 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 
10:45:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | 

[jira] [Created] (HBASE-21037) Hbck needs to call Master.offlineRegion() to clean in-memory state after issuing a RS.closeRegion()

2018-08-10 Thread huaxiang sun (JIRA)
huaxiang sun created HBASE-21037:


 Summary: Hbck needs to call Master.offlineRegion() to clean 
in-memory state after issuing a RS.closeRegion()
 Key: HBASE-21037
 URL: https://issues.apache.org/jira/browse/HBASE-21037
 Project: HBase
  Issue Type: Bug
  Components: hbck
Affects Versions: 1.2.6
Reporter: huaxiang sun
Assignee: huaxiang sun


In certain cases, hbck issues an RS.closeRegion() to close a region on an RS. 
It does not clean up the in-memory state on the master for the offlined region, 
so the balancer will bring the closed region back, causing region 
inconsistency. Certain code paths need to be re-examined to see where a 
Master.offlineRegion() call is needed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20387) flaky infrastructure should work for all branches

2018-08-10 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576896#comment-16576896
 ] 

Hudson commented on HBASE-20387:


Results for branch HBASE-20387
[build #4 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20387/4/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20387/4//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20387/4//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20387/4//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> flaky infrastructure should work for all branches
> -
>
> Key: HBASE-20387
> URL: https://issues.apache.org/jira/browse/HBASE-20387
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Reporter: Sean Busbey
>Assignee: Sean Busbey
>Priority: Critical
>
> We need a flaky list per-branch, since what does/does not work reliably on 
> master isn't really relevant to our older maintenance release lines.
> We should just make the invocation a step in the current per-branch nightly 
> jobs, prior to when we need the list in the stages that run unit tests. We 
> can publish it in the nightly job as well so that precommit can still get it. 
> (and can fetch it per-branch if needed)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21028) Backport HBASE-18633 to branch-1.3

2018-08-10 Thread Daniel Wong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Wong updated HBASE-21028:

Attachment: HBASE-21028-branch-1.3.patch




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21028) Backport HBASE-18633 to branch-1.3

2018-08-10 Thread Daniel Wong (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576884#comment-16576884
 ] 

Daniel Wong commented on HBASE-21028:
-

I wasn't sure if the community preferred having the original commit exactly; 
I'll update the patch with checkstyle fixes. [~taklwu], I don't seem to have 
Jira contributor status to self-assign.




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21025) Add cache for TableStateManager

2018-08-10 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576868#comment-16576868
 ] 

Hudson commented on HBASE-21025:


Results for branch branch-2.1
[build #168 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/168/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/168//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/168//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/168//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Add cache for TableStateManager
> ---
>
> Key: HBASE-21025
> URL: https://issues.apache.org/jira/browse/HBASE-21025
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.2.0, 2.1.1
>
> Attachments: HBASE-21025-v1.patch, HBASE-21025-v2.patch, 
> HBASE-21025.patch
>
>
> After HBASE-20881, we will check whether a table is disabled in SCP, so we 
> need to add a cache for it to improve MTTR and also reduce requests to meta.
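
A minimal sketch of the caching idea only (hypothetical names, not the actual 
TableStateManager change): serve reads from an in-memory map and fall back to 
hbase:meta on a miss, so the disabled-table check in SCP usually costs no meta 
round trip.
{noformat}
import java.io.IOException;
import java.util.concurrent.ConcurrentHashMap;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.TableState;

/** Sketch of a table-state cache in front of hbase:meta (hypothetical). */
class CachedTableStates {
  private final ConcurrentHashMap<TableName, TableState.State> cache = new ConcurrentHashMap<>();
  private final MetaStateStore meta;  // hypothetical wrapper around hbase:meta reads/writes

  CachedTableStates(MetaStateStore meta) { this.meta = meta; }

  /** Reads hit the cache first; only a cold read goes to hbase:meta. */
  TableState.State get(TableName table) throws IOException {
    TableState.State state = cache.get(table);
    if (state == null) {
      state = meta.read(table);   // single meta round trip on a miss
      cache.put(table, state);
    }
    return state;
  }

  /** Writers persist to meta first, then update the cache. */
  void set(TableName table, TableState.State state) throws IOException {
    meta.write(table, state);
    cache.put(table, state);
  }

  interface MetaStateStore {
    TableState.State read(TableName table) throws IOException;
    void write(TableName table, TableState.State state) throws IOException;
  }
}
{noformat}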



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21025) Add cache for TableStateManager

2018-08-10 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576810#comment-16576810
 ] 

Hudson commented on HBASE-21025:


Results for branch branch-2.0
[build #656 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/656/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/656//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/656//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/656//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.





--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21025) Add cache for TableStateManager

2018-08-10 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576808#comment-16576808
 ] 

Hudson commented on HBASE-21025:


Results for branch branch-2
[build #1090 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1090/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1090//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1090//JDK8_Nightly_Build_Report_(Hadoop2)/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1090//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}





--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-18477) Umbrella JIRA for HBase Read Replica clusters

2018-08-10 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-18477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576766#comment-16576766
 ] 

Hudson commented on HBASE-18477:


Results for branch HBASE-18477
[build #291 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-18477/291/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-18477/291//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-18477/291//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-18477/291//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(x) {color:red}-1 client integration test{color}
--Failed when running client tests on top of Hadoop 2. [see log for 
details|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-18477/291//artifact/output-integration/hadoop-2.log].
 (note that this means we didn't run on Hadoop 3)


> Umbrella JIRA for HBase Read Replica clusters
> -
>
> Key: HBASE-18477
> URL: https://issues.apache.org/jira/browse/HBASE-18477
> Project: HBase
>  Issue Type: New Feature
>Reporter: Zach York
>Assignee: Zach York
>Priority: Major
> Attachments: HBase Read-Replica Clusters Scope doc.docx, HBase 
> Read-Replica Clusters Scope doc.pdf, HBase Read-Replica Clusters Scope 
> doc_v2.docx, HBase Read-Replica Clusters Scope doc_v2.pdf
>
>
> Recently, changes (such as HBASE-17437) have made it possible for HBase to run 
> with a root directory external to the cluster (such as in Amazon S3). This means 
> that the data is stored outside of the cluster and can be accessible after 
> the cluster has been terminated. One use case that is often asked about is 
> pointing multiple clusters to one root directory (sharing the data) to have 
> read resiliency in the case of a cluster failure.
>  
> This JIRA is an umbrella JIRA to contain all the tasks necessary to create a 
> read-replica HBase cluster that is pointed at the same root directory.
>  
> This requires making the Read-Replica cluster Read-Only (no metadata or data 
> operations).
> Separating the hbase:meta table for each cluster (otherwise HBase gets 
> confused when multiple clusters try to update the meta table with their IP 
> addresses)
> Adding refresh functionality for the meta table to ensure new metadata is 
> picked up on the read replica cluster.
> Adding refresh functionality for HFiles for a given table to ensure new data 
> is picked up on the read replica cluster.
>  
> This can be used with any existing cluster that is backed by an external 
> filesystem.
>  
> Please note that this feature is still quite manual (with the potential for 
> automation later).
>  
> More information on this particular feature can be found here: 
> https://aws.amazon.com/blogs/big-data/setting-up-read-replica-clusters-with-hbase-on-amazon-s3/



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20387) flaky infrastructure should work for all branches

2018-08-10 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576765#comment-16576765
 ] 

Sean Busbey commented on HBASE-20387:
-

Messing with this on a branch has convinced me that it can't be a part of the 
current nightly job. Jobs that consume the include/exclude list we generate 
should be robust to failures (like if Docker changes behavior). That means they 
ought to get artifacts off of the "last successful build" URL. Incorporating 
the report into the nightly job means that if any part of our nightly test 
suite fails, all the various test jobs will be stuck consuming an old flaky 
report, thus missing information about a test that caused nightly to fail.

Doing this as multibranch pipeline job(s) still makes sense; they'll just need 
to be new ones.




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20387) flaky infrastructure should work for all branches

2018-08-10 Thread Sean Busbey (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated HBASE-20387:

Summary: flaky infrastructure should work for all branches  (was: Fold 
flaky test finding into nightly job)

> flaky infrastructure should work for all branches
> -
>
> Key: HBASE-20387
> URL: https://issues.apache.org/jira/browse/HBASE-20387
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Reporter: Sean Busbey
>Assignee: Sean Busbey
>Priority: Critical
>
> We need a flaky list per-branch, since what does/does not work reliably on 
> master isn't really relevant to our older maintenance release lines.
> We should just make the invocation a step in the current per-branch nightly 
> jobs, prior to when we need the list in the stages that run unit tests. We 
> can publish it in the nightly job as well so that precommit can still get it. 
> (and can fetch it per-branch if needed)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20705) Having RPC Quota on a table prevents Space quota to be recreated/removed

2018-08-10 Thread Josh Elser (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576763#comment-16576763
 ] 

Josh Elser commented on HBASE-20705:


[~jatsakthi], _you_ should be making the effort to understand what you changed 
and whether you happened to have broken something else :)

That said, you only made changes to TestMasterQuotasObserver, so I can't 
imagine how you would have broken that other test.

Looking at 
[https://builds.apache.org/job/HBASE-Find-Flaky-Tests/lastSuccessfulBuild/artifact/dashboard.html],
 the test which failed has already been reported as being flaky.

> Having RPC Quota on a table prevents Space quota to be recreated/removed
> 
>
> Key: HBASE-20705
> URL: https://issues.apache.org/jira/browse/HBASE-20705
> Project: HBase
>  Issue Type: Bug
>Reporter: Biju Nair
>Assignee: Sakthi
>Priority: Major
> Attachments: hbase-20705.master.001.patch
>
>
> * Property {{hbase.quota.remove.on.table.delete}} is set to {{true}} by 
> default
>  * Create a table and set RPC and Space quota
> {noformat}
> hbase(main):022:0> create 't2','cf1'
> Created table t2
> Took 0.7420 seconds
> => Hbase::Table - t2
> hbase(main):023:0> set_quota TYPE => SPACE, TABLE => 't2', LIMIT => '1G', 
> POLICY => NO_WRITES
> Took 0.0105 seconds
> hbase(main):024:0> set_quota TYPE => THROTTLE, TABLE => 't2', LIMIT => 
> '10M/sec'
> Took 0.0186 seconds
> hbase(main):025:0> list_quotas
> TABLE => t2 TYPE => THROTTLE, THROTTLE_TYPE => REQUEST_SIZE, LIMIT => 
> 10M/sec, SCOPE => MACHINE
> TABLE => t2 TYPE => SPACE, TABLE => t2, LIMIT => 1073741824, VIOLATION_POLICY 
> => NO_WRITES{noformat}
>  * Drop the table and the Space quota is set to {{REMOVE => true}}
> {noformat}
> hbase(main):026:0> disable 't2'
> Took 0.4363 seconds
> hbase(main):027:0> drop 't2'
> Took 0.2344 seconds
> hbase(main):028:0> list_quotas
> TABLE => t2 TYPE => SPACE, TABLE => t2, REMOVE => true
> USER => u1 TYPE => THROTTLE, THROTTLE_TYPE => REQUEST_SIZE, LIMIT => 10M/sec, 
> SCOPE => MACHINE{noformat}
>  * Recreate the table and set Space quota back. The Space quota on the table 
> is still set to {{REMOVE => true}}
> {noformat}
> hbase(main):029:0> create 't2','cf1'
> Created table t2
> Took 0.7348 seconds
> => Hbase::Table - t2
> hbase(main):031:0> set_quota TYPE => SPACE, TABLE => 't2', LIMIT => '1G', 
> POLICY => NO_WRITES
> Took 0.0088 seconds
> hbase(main):032:0> list_quotas
> OWNER QUOTAS
> TABLE => t2 TYPE => THROTTLE, THROTTLE_TYPE => REQUEST_SIZE, LIMIT => 
> 10M/sec, SCOPE => MACHINE
> TABLE => t2 TYPE => SPACE, TABLE => t2, REMOVE => true{noformat}
>  * Remove the RPC quota and drop the table; the Space quota is not removed
> {noformat}
> hbase(main):033:0> set_quota TYPE => THROTTLE, TABLE => 't2', LIMIT => NONE
> Took 0.0193 seconds
> hbase(main):036:0> disable 't2'
> Took 0.4305 seconds
> hbase(main):037:0> drop 't2'
> Took 0.2353 seconds
> hbase(main):038:0> list_quotas
> OWNER QUOTAS
> TABLE => t2                               TYPE => SPACE, TABLE => t2, REMOVE 
> => true{noformat}
>  * Deleting the quota entry from {{hbase:quota}} seems to be the option to 
> reset it. 
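
A rough sketch of that manual workaround (editor's illustration, not part of 
any attached patch; it assumes the table-quota rows in {{hbase:quota}} are 
keyed as the {{t.}} prefix plus the table name, so verify with 
{{scan 'hbase:quota'}} before deleting anything):

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class DropStaleTableQuota {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    try (Connection conn = ConnectionFactory.createConnection(conf);
         Table quotaTable = conn.getTable(TableName.valueOf("hbase:quota"))) {
      // Assumption: the stale SPACE quota row for table 't2' is keyed "t.t2".
      // Confirm the exact row key with `scan 'hbase:quota'` first.
      quotaTable.delete(new Delete(Bytes.toBytes("t." + "t2")));
    }
  }
}
{code}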



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21005) Maven site configuration causes downstream projects to get a directory named ${project.basedir}

2018-08-10 Thread Josh Elser (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser updated HBASE-21005:
---
Attachment: HBASE-21005.002.patch

> Maven site configuration causes downstream projects to get a directory named 
> ${project.basedir}
> ---
>
> Key: HBASE-21005
> URL: https://issues.apache.org/jira/browse/HBASE-21005
> Project: HBase
>  Issue Type: Bug
>  Components: build
>Affects Versions: 2.0.0
>Reporter: Matt Burgess
>Assignee: Josh Elser
>Priority: Minor
> Attachments: HBASE-21005.001.patch, HBASE-21005.002.patch
>
>
> Matt told me about this interesting issue they see down in Apache NiFi's build.
> NiFi depends on HBase for some code that they provide to their users. As a 
> part of the build process of NiFi, they are seeing a directory named 
> {{$\{project.basedir}}} get created the first time they build with an empty 
> Maven repo. Matt reports that after a javax.el artifact is cached, Maven will 
> stop creating the directory; however, if you wipe that artifact from the 
> Maven repo, the next build will end up re-creating it.
> I believe I've seen this with Phoenix, too, but never investigated why it was 
> actually happening.
> My hunch is that it's related to the local maven repo that we create to 
> "patch" in our custom maven-fluido-skin jar (HBASE-14785). I'm not sure if we 
> can "work" around this by pushing the custom local repo into a profile and 
> only activating that for the mvn-site. Another solution would be to publish 
> the maven-fluido-jar to central with custom coordinates.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21005) Maven site configuration causes downstream projects to get a directory named ${project.basedir}

2018-08-10 Thread Josh Elser (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576741#comment-16576741
 ] 

Josh Elser commented on HBASE-21005:


{quote}Can you add to the comment explaining why we are pulling from your 
personal maven GAV space with a link to the github fork? Is it 
[https://github.com/joshelser/maven-fluido-skin/tree/1.4-HBase-patched] ?
{quote}
Missed this action item. Will throw up a new patch with an updated comment.

Any other objections/requests?

> Maven site configuration causes downstream projects to get a directory named 
> ${project.basedir}
> ---
>
> Key: HBASE-21005
> URL: https://issues.apache.org/jira/browse/HBASE-21005
> Project: HBase
>  Issue Type: Bug
>  Components: build
>Affects Versions: 2.0.0
>Reporter: Matt Burgess
>Assignee: Josh Elser
>Priority: Minor
> Attachments: HBASE-21005.001.patch
>
>
> Matt told me about this interesting issue they see down in Apache NiFi's build.
> NiFi depends on HBase for some code that they provide to their users. As a 
> part of the build process of NiFi, they are seeing a directory named 
> {{$\{project.basedir}}} get created the first time they build with an empty 
> Maven repo. Matt reports that after a javax.el artifact is cached, Maven will 
> stop creating the directory; however, if you wipe that artifact from the 
> Maven repo, the next build will end up re-creating it.
> I believe I've seen this with Phoenix, too, but never investigated why it was 
> actually happening.
> My hunch is that it's related to the local maven repo that we create to 
> "patch" in our custom maven-fluido-skin jar (HBASE-14785). I'm not sure if we 
> can "work" around this by pushing the custom local repo into a profile and 
> only activating that for the mvn-site. Another solution would be to publish 
> the maven-fluido-jar to central with custom coordinates.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-21028) Backport HBASE-18633 to branch-1.3

2018-08-10 Thread Tak Lon (Stephen) Wu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576731#comment-16576731
 ] 

Tak Lon (Stephen) Wu edited comment on HBASE-21028 at 8/10/18 7:00 PM:
---

[non-binding]

Verified the patch is the same as HBASE-18633, but (a nit/trivial comment) 
would you fix the 
[checkstyle|https://builds.apache.org/job/PreCommit-HBASE-Build/13996/artifact/patchprocess/diff-checkstyle-hbase-server.txt]
 error for {{TestMultiLogThreshold.java}} (where all those lines are longer 
than 100)?

Also, we will need a committer or RM for branch-1 or branch-1.3 to see if 1.3.3 
is the right fix version. Should you also assign this item to yourself?


was (Author: taklwu):
[non-binding]

Verified the patch is the same as HBASE-18633 , but (a nit/trivial comment) 
would you fix the 
[checkstyle|https://builds.apache.org/job/PreCommit-HBASE-Build/13996/artifact/patchprocess/diff-checkstyle-hbase-server.txt]
 error for {{TestMultiLogThreshold.java}} (where all those lines are longer 
than 100) ?

Also we will need a committer or RM for branch-1 or branch-1.3 to see if 1.3.3 
is the right fix version.

> Backport HBASE-18633 to branch-1.3
> --
>
> Key: HBASE-21028
> URL: https://issues.apache.org/jira/browse/HBASE-21028
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Affects Versions: 1.3.2
>Reporter: Daniel Wong
>Priority: Minor
> Fix For: 1.3.3
>
> Attachments: HBASE-21028-branch-1.3.patch
>
>
> The logging improvements in HBASE-18633 would give greater visibility on 
> systems in 1.3.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21028) Backport HBASE-18633 to branch-1.3

2018-08-10 Thread Tak Lon (Stephen) Wu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576731#comment-16576731
 ] 

Tak Lon (Stephen) Wu commented on HBASE-21028:
--

[non-binding]

Verified the patch is the same as HBASE-18633, but (a nit/trivial comment) 
would you fix the 
[checkstyle|https://builds.apache.org/job/PreCommit-HBASE-Build/13996/artifact/patchprocess/diff-checkstyle-hbase-server.txt]
 error for {{TestMultiLogThreshold.java}} (where all those lines are longer 
than 100)?

Also, we will need a committer or RM for branch-1 or branch-1.3 to see if 1.3.3 
is the right fix version.

> Backport HBASE-18633 to branch-1.3
> --
>
> Key: HBASE-21028
> URL: https://issues.apache.org/jira/browse/HBASE-21028
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Affects Versions: 1.3.2
>Reporter: Daniel Wong
>Priority: Minor
> Fix For: 1.3.3
>
> Attachments: HBASE-21028-branch-1.3.patch
>
>
> The logging improvements in HBASE-18633 would give greater visibility on 
> systems in 1.3.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21025) Add cache for TableStateManager

2018-08-10 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576704#comment-16576704
 ] 

stack commented on HBASE-21025:
---

+1 for branch-2.0. The clashing JMX ports are an old issue.

> Add cache for TableStateManager
> ---
>
> Key: HBASE-21025
> URL: https://issues.apache.org/jira/browse/HBASE-21025
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.2.0, 2.1.1
>
> Attachments: HBASE-21025-v1.patch, HBASE-21025-v2.patch, 
> HBASE-21025.patch
>
>
> After HBASE-20881, we will check whether a table is disabled in SCP, so we 
> need to add cache for it to improve MTTR, and also reduce the request to meta.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20512) document change to running tests on secure clusters

2018-08-10 Thread Tak Lon (Stephen) Wu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576677#comment-16576677
 ] 

Tak Lon (Stephen) Wu commented on HBASE-20512:
--

+1 [non-binding] 

> document change to running tests on secure clusters
> ---
>
> Key: HBASE-20512
> URL: https://issues.apache.org/jira/browse/HBASE-20512
> Project: HBase
>  Issue Type: Task
>  Components: documentation, integration tests, Usability
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Sean Busbey
>Assignee: stack
>Priority: Critical
> Fix For: 3.0.0, 2.0.2
>
> Attachments: HBASE-20512.master.001.patch
>
>
> We should document the change to authentication handling in HBASE-16231 in 
> the upgrade section of the reference guide.
> It's surprising to folks that have existing automated testing that's been 
> working on our prior stable release lines. We should give a warning to those 
> updating. The release note is probably suitable for a first pass.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20753) reference guide should direct security related issues to priv...@hbase.apache.org

2018-08-10 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576671#comment-16576671
 ] 

Hudson commented on HBASE-20753:


Results for branch master
[build #425 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/425/]: (x) 
*{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/425//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/425//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/425//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> reference guide should direct security related issues to 
> priv...@hbase.apache.org
> -
>
> Key: HBASE-20753
> URL: https://issues.apache.org/jira/browse/HBASE-20753
> Project: HBase
>  Issue Type: Bug
>  Components: documentation, security
>Reporter: Sean Busbey
>Assignee: Sahil Aggarwal
>Priority: Critical
>  Labels: beginner
> Fix For: 3.0.0
>
> Attachments: HBASE-20753.master.001.patch
>
>
> the reference guide currently directs folks to send security issues to 
> priv...@apache.org:
> {quote}
> To protect existing HBase installations from new vulnerabilities, please do 
> not use JIRA to report security-related bugs. Instead, send your report to 
> the mailing list priv...@apache.org, which allows anyone to send messages, 
> but restricts who can read them. Someone on that list will contact you to 
> follow up on your report.
> {quote}
> This address does not exist. It should tell folks to send the email to 
> priv...@hbase.apache.org.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21025) Add cache for TableStateManager

2018-08-10 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576670#comment-16576670
 ] 

Hudson commented on HBASE-21025:


Results for branch master
[build #425 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/425/]: (x) 
*{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/425//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/425//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/425//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Add cache for TableStateManager
> ---
>
> Key: HBASE-21025
> URL: https://issues.apache.org/jira/browse/HBASE-21025
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.2.0, 2.1.1
>
> Attachments: HBASE-21025-v1.patch, HBASE-21025-v2.patch, 
> HBASE-21025.patch
>
>
> After HBASE-20881, we will check whether a table is disabled in SCP, so we 
> need to add cache for it to improve MTTR, and also reduce the request to meta.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20753) reference guide should direct security related issues to priv...@hbase.apache.org

2018-08-10 Thread Mingliang Liu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576644#comment-16576644
 ] 

Mingliang Liu commented on HBASE-20753:
---

+1 (non-binding)

> reference guide should direct security related issues to 
> priv...@hbase.apache.org
> -
>
> Key: HBASE-20753
> URL: https://issues.apache.org/jira/browse/HBASE-20753
> Project: HBase
>  Issue Type: Bug
>  Components: documentation, security
>Reporter: Sean Busbey
>Assignee: Sahil Aggarwal
>Priority: Critical
>  Labels: beginner
> Fix For: 3.0.0
>
> Attachments: HBASE-20753.master.001.patch
>
>
> the reference guide currently directs folks to send security issues to 
> priv...@apache.org:
> {quote}
> To protect existing HBase installations from new vulnerabilities, please do 
> not use JIRA to report security-related bugs. Instead, send your report to 
> the mailing list priv...@apache.org, which allows anyone to send messages, 
> but restricts who can read them. Someone on that list will contact you to 
> follow up on your report.
> {quote}
> This address does not exist. It should tell folks to send the email to 
> priv...@hbase.apache.org.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20917) MetaTableMetrics#stop references uninitialized requestsMap for non-meta region

2018-08-10 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576643#comment-16576643
 ] 

Ted Yu commented on HBASE-20917:


[~xucang]:
Can you take a look at the addendum?

Thanks

> MetaTableMetrics#stop references uninitialized requestsMap for non-meta region
> --
>
> Key: HBASE-20917
> URL: https://issues.apache.org/jira/browse/HBASE-20917
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: 1.5.0, 1.4.6, 2.2.0
>
> Attachments: 20917.addendum, 20917.v1.txt, 20917.v2.txt
>
>
> I noticed the following in test output:
> {code}
> 2018-07-21 15:54:43,181 ERROR [RS_CLOSE_REGION-regionserver/172.17.5.4:0-1] 
> executor.EventHandler(186): Caught throwable while processing event 
> M_RS_CLOSE_REGION
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hbase.coprocessor.MetaTableMetrics.stop(MetaTableMetrics.java:329)
>   at 
> org.apache.hadoop.hbase.coprocessor.BaseEnvironment.shutdown(BaseEnvironment.java:91)
>   at 
> org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$RegionEnvironment.shutdown(RegionCoprocessorHost.java:165)
>   at 
> org.apache.hadoop.hbase.coprocessor.CoprocessorHost.shutdown(CoprocessorHost.java:290)
>   at 
> org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$4.postEnvCall(RegionCoprocessorHost.java:559)
>   at 
> org.apache.hadoop.hbase.coprocessor.CoprocessorHost.execOperation(CoprocessorHost.java:622)
>   at 
> org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.postClose(RegionCoprocessorHost.java:551)
>   at org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1678)
>   at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:1484)
>   at 
> org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler.process(CloseRegionHandler.java:104)
>   at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:104)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> {code}
> {{requestsMap}} is only initialized for the meta region.
> However, the check for the meta region is absent in the stop method:
> {code}
>   public void stop(CoprocessorEnvironment e) throws IOException {
> // since meta region can move around, clear stale metrics when stop.
> for (String meterName : requestsMap.keySet()) {
> {code}
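
A minimal sketch of the guard implied above (an illustration, not the attached 
addendum; {{unregisterMeter}} is a hypothetical placeholder for whatever 
per-meter cleanup the coprocessor performs):

{code:java}
@Override
public void stop(CoprocessorEnvironment e) throws IOException {
  // requestsMap is only created in start() for the meta region, so guard the
  // cleanup the same way and skip it for non-meta regions instead of NPE-ing.
  if (requestsMap != null) {
    // since meta region can move around, clear stale metrics when stop.
    for (String meterName : requestsMap.keySet()) {
      unregisterMeter(meterName); // hypothetical helper, for illustration only
    }
  }
}
{code}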



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20917) MetaTableMetrics#stop references uninitialized requestsMap for non-meta region

2018-08-10 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20917:
---
Attachment: 20917.addendum

> MetaTableMetrics#stop references uninitialized requestsMap for non-meta region
> --
>
> Key: HBASE-20917
> URL: https://issues.apache.org/jira/browse/HBASE-20917
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: 1.5.0, 1.4.6, 2.2.0
>
> Attachments: 20917.addendum, 20917.v1.txt, 20917.v2.txt
>
>
> I noticed the following in test output:
> {code}
> 2018-07-21 15:54:43,181 ERROR [RS_CLOSE_REGION-regionserver/172.17.5.4:0-1] 
> executor.EventHandler(186): Caught throwable while processing event 
> M_RS_CLOSE_REGION
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hbase.coprocessor.MetaTableMetrics.stop(MetaTableMetrics.java:329)
>   at 
> org.apache.hadoop.hbase.coprocessor.BaseEnvironment.shutdown(BaseEnvironment.java:91)
>   at 
> org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$RegionEnvironment.shutdown(RegionCoprocessorHost.java:165)
>   at 
> org.apache.hadoop.hbase.coprocessor.CoprocessorHost.shutdown(CoprocessorHost.java:290)
>   at 
> org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$4.postEnvCall(RegionCoprocessorHost.java:559)
>   at 
> org.apache.hadoop.hbase.coprocessor.CoprocessorHost.execOperation(CoprocessorHost.java:622)
>   at 
> org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.postClose(RegionCoprocessorHost.java:551)
>   at org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1678)
>   at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:1484)
>   at 
> org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler.process(CloseRegionHandler.java:104)
>   at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:104)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> {code}
> {{requestsMap}} is only initialized for the meta region.
> However, the check for the meta region is absent in the stop method:
> {code}
>   public void stop(CoprocessorEnvironment e) throws IOException {
> // since meta region can move around, clear stale metrics when stop.
> for (String meterName : requestsMap.keySet()) {
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-15233) Bytes.toBytes() methods should allow arrays to be re-used

2018-08-10 Thread Mike Drob (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576599#comment-16576599
 ] 

Mike Drob commented on HBASE-15233:
---

Assuming you meant {{byte[] reuse}}, not {{byte reuse}}, then in the first, you 
allocate a new array and then immediately discard it, while in the second you 
allocate an array and then pass it to the method to be filled in.
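
A tiny sketch of the two patterns being compared (it assumes the proposed 
{{toBytes(long, byte[])}} overload from the description exists, which it does 
not yet):

{code:java}
import org.apache.hadoop.hbase.util.Bytes;

public class ReuseExample {
  public static void main(String[] args) {
    byte[] reuse = new byte[Bytes.SIZEOF_LONG];

    // Pattern 1: toBytes(long) allocates a brand new byte[8] internally and
    // returns it, so the array pre-allocated above is simply discarded.
    reuse = Bytes.toBytes(42L);

    // Pattern 2 (the proposed overload, not yet in Bytes): the caller's array
    // is filled in place, so no new allocation happens on this call.
    Bytes.toBytes(42L, reuse);
  }
}
{code}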

> Bytes.toBytes() methods should allow arrays to be re-used 
> --
>
> Key: HBASE-15233
> URL: https://issues.apache.org/jira/browse/HBASE-15233
> Project: HBase
>  Issue Type: Improvement
>  Components: API
>Affects Versions: 1.1.3
>Reporter: Jean-Marc Spaggiari
>Assignee: Michael Ernest
>Priority: Minor
>  Labels: beginner
>
> Today we have this:
> {code}
>   public static byte[] toBytes(long val) {
> byte [] b = new byte[8];
> for (int i = 7; i > 0; i--) {
>   b[i] = (byte) val;
>   val >>>= 8;
> }
> b[0] = (byte) val;
> return b;
>   }
> {code}
> might be nice to also have this:
> {code}
>   public static byte[] toBytes(long val, byte[] reuse) {
> for (int i = 7; i > 0; i--) {
>   reuse[i] = (byte) val;
>   val >>>= 8;
> }
> reuse[0] = (byte) val;
> return reuse;
>   }
> {code}
> Same for all the other Bytes.toBytes() methods.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21032) ScanResponses contain only one cell each

2018-08-10 Thread Andrey Elenskiy (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576525#comment-16576525
 ] 

Andrey Elenskiy commented on HBASE-21032:
-

Yes, it should always try to fit MaxResultSize worth of Cells into a partial 
ScanResponse. You can see the difference by running the code I provided: HBase 
2.0.0 returns only ~2-3 ScanResponses, while HBase 2.1.0 returns ~260 (2X that 
if you account for heartbeats).
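
For readers without the attachment, a minimal client-side scan along the lines 
described (an editor's sketch, not the attached App.java; the table name and 
the 2MB limit are illustrative assumptions):

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;

public class PartialScanRepro {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    try (Connection conn = ConnectionFactory.createConnection(conf);
         Table table = conn.getTable(TableName.valueOf("wide_table"))) {
      Scan scan = new Scan()
          .setAllowPartialResults(true)        // stream partials of the wide row
          .setMaxResultSize(2L * 1024 * 1024); // ask for ~2MB per ScanResponse
      int partials = 0;
      try (ResultScanner scanner = table.getScanner(scan)) {
        for (Result partial : scanner) {
          partials++;
          System.out.println("partial #" + partials
              + " cells=" + partial.rawCells().length);
        }
      }
    }
  }
}
{code}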

> ScanResponses contain only one cell each
> 
>
> Key: HBASE-21032
> URL: https://issues.apache.org/jira/browse/HBASE-21032
> Project: HBase
>  Issue Type: Bug
>  Components: Scanners
>Affects Versions: 2.1.0
> Environment: HBase 2.1.0
> Hadoop 2.8.4
> Java 8
>Reporter: Andrey Elenskiy
>Priority: Major
> Attachments: App.java
>
>
> I have a long row with a bunch of columns that I'm scanning with 
> setAllowPartialResults(true). In the response, the first partial 
> ScanResponse is around 2MB with multiple cells, while all of the subsequent 
> ones carry 1 cell per ScanResponse. After digging more, I found that each of 
> those single-cell ScanResponse partials is preceded by a heartbeat (zero 
> cells). This results in two requests per cell to a regionserver.
> I've attached code to reproduce it on hbase version 2.1.0 (it works as 
> expected on 2.0.0 and 2.0.1).
> [^App.java]
> I'm fairly certain it's a server-side issue, as the 
> [gohbase|https://github.com/tsuna/gohbase] client is having the same issue. I 
> have not tried to reproduce this with a multi-row scan.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21031) Memory leak if replay edits failed during region opening

2018-08-10 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576494#comment-16576494
 ] 

Ted Yu commented on HBASE-21031:


Can you come up with a test which fails if {{dropMemStoreContents}} only rolls 
back a single region (the region which encounters the Throwable)?

Thanks

> Memory leak if replay edits failed during region opening
> 
>
> Key: HBASE-21031
> URL: https://issues.apache.org/jira/browse/HBASE-21031
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: HBASE-21031.branch-2.0.001.patch, 
> HBASE-21031.branch-2.0.002.patch, HBASE-21031.branch-2.0.003.patch, 
> memoryleak.png
>
>
> Due to HBASE-21029, when replaying edits with a lot of identical cells, the 
> memstore won't flush, and an exception is thrown once all heap space is used:
> {code}
> 2018-08-06 15:52:27,590 ERROR 
> [RS_OPEN_REGION-regionserver/hb-bp10cw4ejoy0a2f3f-009:16020-2] 
> handler.OpenRegionHandler(302): Failed open of 
> region=hbase_test,dffa78,1531227033378.cbf9a2daf3aaa0c7e931e9c9a7b53f41., 
> starting to roll back the global memstore size.
> java.lang.OutOfMemoryError: Java heap space
> at java.nio.HeapByteBuffer.(HeapByteBuffer.java:57)
> at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
> at 
> org.apache.hadoop.hbase.regionserver.OnheapChunk.allocateDataBuffer(OnheapChunk.java:41)
> at org.apache.hadoop.hbase.regionserver.Chunk.init(Chunk.java:104)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:226)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:180)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:163)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.getOrMakeChunk(MemStoreLABImpl.java:273)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.copyCellInto(MemStoreLABImpl.java:148)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.copyCellInto(MemStoreLABImpl.java:111)
> at 
> org.apache.hadoop.hbase.regionserver.Segment.maybeCloneWithAllocator(Segment.java:178)
> at 
> org.apache.hadoop.hbase.regionserver.AbstractMemStore.maybeCloneWithAllocator(AbstractMemStore.java:287)
> at 
> org.apache.hadoop.hbase.regionserver.AbstractMemStore.add(AbstractMemStore.java:107)
> at org.apache.hadoop.hbase.regionserver.HStore.add(HStore.java:706)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.restoreEdit(HRegion.java:5494)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEdits(HRegion.java:4608)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEditsIfAny(HRegion.java:4404)
> {code}
> After this exception, the memstore is not rolled back, and since MSLAB is 
> used, none of the allocated chunks will ever be released. That memory is 
> leaked forever...
> We need to roll back the memstore memory if opening the region fails (for 
> now, only the global memstore size is decreased after a failure).
> Another problem is that we use replayEditsPerRegion in RegionServerAccounting 
> to record how much memory is used during replay, and decrease the global 
> memstore size if the replay fails. This is not right: during replay we may 
> also flush the memstore, so the size in the replayEditsPerRegion map is not 
> accurate at all! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21031) Memory leak if replay edits failed during region opening

2018-08-10 Thread Allan Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576492#comment-16576492
 ] 

Allan Yang commented on HBASE-21031:


{quote}
TestRecoveredEidtsReplayAndAbort passes with the above change.
{quote}
Sorry, [~yuzhih...@gmail.com], I missed your point.

> Memory leak if replay edits failed during region opening
> 
>
> Key: HBASE-21031
> URL: https://issues.apache.org/jira/browse/HBASE-21031
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: HBASE-21031.branch-2.0.001.patch, 
> HBASE-21031.branch-2.0.002.patch, HBASE-21031.branch-2.0.003.patch, 
> memoryleak.png
>
>
> Due to HBASE-21029, when replaying edits with a lot of identical cells, the 
> memstore won't flush, and an exception is thrown once all heap space is used:
> {code}
> 2018-08-06 15:52:27,590 ERROR 
> [RS_OPEN_REGION-regionserver/hb-bp10cw4ejoy0a2f3f-009:16020-2] 
> handler.OpenRegionHandler(302): Failed open of 
> region=hbase_test,dffa78,1531227033378.cbf9a2daf3aaa0c7e931e9c9a7b53f41., 
> starting to roll back the global memstore size.
> java.lang.OutOfMemoryError: Java heap space
> at java.nio.HeapByteBuffer.(HeapByteBuffer.java:57)
> at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
> at 
> org.apache.hadoop.hbase.regionserver.OnheapChunk.allocateDataBuffer(OnheapChunk.java:41)
> at org.apache.hadoop.hbase.regionserver.Chunk.init(Chunk.java:104)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:226)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:180)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:163)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.getOrMakeChunk(MemStoreLABImpl.java:273)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.copyCellInto(MemStoreLABImpl.java:148)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.copyCellInto(MemStoreLABImpl.java:111)
> at 
> org.apache.hadoop.hbase.regionserver.Segment.maybeCloneWithAllocator(Segment.java:178)
> at 
> org.apache.hadoop.hbase.regionserver.AbstractMemStore.maybeCloneWithAllocator(AbstractMemStore.java:287)
> at 
> org.apache.hadoop.hbase.regionserver.AbstractMemStore.add(AbstractMemStore.java:107)
> at org.apache.hadoop.hbase.regionserver.HStore.add(HStore.java:706)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.restoreEdit(HRegion.java:5494)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEdits(HRegion.java:4608)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEditsIfAny(HRegion.java:4404)
> {code}
> After this exception, the memstore is not rolled back, and since MSLAB is 
> used, none of the allocated chunks will ever be released. That memory is 
> leaked forever...
> We need to roll back the memstore memory if opening the region fails (for 
> now, only the global memstore size is decreased after a failure).
> Another problem is that we use replayEditsPerRegion in RegionServerAccounting 
> to record how much memory is used during replay, and decrease the global 
> memstore size if the replay fails. This is not right: during replay we may 
> also flush the memstore, so the size in the replayEditsPerRegion map is not 
> accurate at all! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-21029) Miscount of memstore's heap/offheap size if same cell was put

2018-08-10 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576479#comment-16576479
 ] 

Ted Yu edited comment on HBASE-21029 at 8/10/18 3:59 PM:
-

I applied the test change only in the master branch and ran:

-Dtest=TestDefaultMemStore#testPutSameCell

The test passed.
Can you come up with a test which fails without the change to Segment.java?

Thanks


was (Author: yuzhih...@gmail.com):
I applied the test change only and ran:

-Dtest=TestDefaultMemStore#testPutSameCell

The test passed.
Can you come up with test which fails without change to Segment.java ?

Thanks

> Miscount of memstore's heap/offheap size if same cell was put
> -
>
> Key: HBASE-21029
> URL: https://issues.apache.org/jira/browse/HBASE-21029
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: HBASE-21029.branch-2.0.001.patch, 
> HBASE-21029.branch-2.0.002.patch
>
>
> We are now using memstore.heapSize() + memstore.offheapSize() to decide 
> whether a flush is needed. But if the same cell is put into the memstore 
> again, only the memstore's dataSize is increased; the heap/offheap size is 
> not. We encountered a case where a user kept putting the same KV again and 
> again, but the memstore never flushed since the heap size was not counted 
> properly, and the RS was eventually killed by the system for running out of 
> memory.
> Actually, if MSLAB is used, the heap/offheap size will increase no matter 
> whether the cell is added or not. IIRC, the memstore's heap/offheap size 
> should always be bigger than the data size. We introduced heap/offheap size 
> besides data size in order to reflect the memory footprint more precisely. 
> {code}
> // If there's already a same cell in the CellSet and we are using MSLAB, 
> we must count in the
> // MSLAB allocation size as well, or else there will be memory leak 
> (occupied heap size larger
> // than the counted number)
> if (succ || mslabUsed) {
>   cellSize = getCellLength(cellToAdd);
> }
> // heap/offheap size is changed only if the cell is truly added in the 
> cellSet
> long heapSize = heapSizeChange(cellToAdd, succ);
> long offHeapSize = offHeapSizeChange(cellToAdd, succ);
> {code}
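
A hedged sketch of the fix direction described above (not the attached patch): 
make the heap/offheap accounting follow the same condition as the data-size 
accounting, since MSLAB chunk space is consumed even when the cell already 
exists in the CellSet.

{code:java}
// Sketch only, inside Segment's add path; names follow the snippet above.
long cellSize = 0;
boolean countSize = succ || mslabUsed;
if (countSize) {
  cellSize = getCellLength(cellToAdd);
}
// Count heap/offheap growth whenever MSLAB space was consumed, not only when
// the cell was truly added to the CellSet.
long heapSize = heapSizeChange(cellToAdd, countSize);
long offHeapSize = offHeapSizeChange(cellToAdd, countSize);
{code}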



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21029) Miscount of memstore's heap/offheap size if same cell was put

2018-08-10 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576479#comment-16576479
 ] 

Ted Yu commented on HBASE-21029:


I applied the test change only and ran:

-Dtest=TestDefaultMemStore#testPutSameCell

The test passed.
Can you come up with a test which fails without the change to Segment.java?

Thanks

> Miscount of memstore's heap/offheap size if same cell was put
> -
>
> Key: HBASE-21029
> URL: https://issues.apache.org/jira/browse/HBASE-21029
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: HBASE-21029.branch-2.0.001.patch, 
> HBASE-21029.branch-2.0.002.patch
>
>
> We are now using memstore.heapSize() + memstore.offheapSize() to decide 
> whether a flush is needed. But if the same cell is put into the memstore 
> again, only the memstore's dataSize is increased; the heap/offheap size is 
> not. We encountered a case where a user kept putting the same KV again and 
> again, but the memstore never flushed since the heap size was not counted 
> properly, and the RS was eventually killed by the system for running out of 
> memory.
> Actually, if MSLAB is used, the heap/offheap size will increase no matter 
> whether the cell is added or not. IIRC, the memstore's heap/offheap size 
> should always be bigger than the data size. We introduced heap/offheap size 
> besides data size in order to reflect the memory footprint more precisely. 
> {code}
> // If there's already a same cell in the CellSet and we are using MSLAB, 
> we must count in the
> // MSLAB allocation size as well, or else there will be memory leak 
> (occupied heap size larger
> // than the counted number)
> if (succ || mslabUsed) {
>   cellSize = getCellLength(cellToAdd);
> }
> // heap/offheap size is changed only if the cell is truly added in the 
> cellSet
> long heapSize = heapSizeChange(cellToAdd, succ);
> long offHeapSize = offHeapSizeChange(cellToAdd, succ);
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21031) Memory leak if replay edits failed during region opening

2018-08-10 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576466#comment-16576466
 ] 

Ted Yu commented on HBASE-21031:


I modified the dropMemStoreContents() method by passing it the region which 
encounters the Throwable.
{code}
  public MemStoreSize dropMemStoreContents(HRegion r) throws IOException {
...
  for (HStore s : stores.values()) {
if (!s.getHRegion().equals(r)) continue;
{code}
TestRecoveredEidtsReplayAndAbort passes with the above change.

> Memory leak if replay edits failed during region opening
> 
>
> Key: HBASE-21031
> URL: https://issues.apache.org/jira/browse/HBASE-21031
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: HBASE-21031.branch-2.0.001.patch, 
> HBASE-21031.branch-2.0.002.patch, HBASE-21031.branch-2.0.003.patch, 
> memoryleak.png
>
>
> Due to HBASE-21029, when replaying edits with a lot of identical cells, the 
> memstore won't flush, and an exception is thrown once all heap space is used:
> {code}
> 2018-08-06 15:52:27,590 ERROR 
> [RS_OPEN_REGION-regionserver/hb-bp10cw4ejoy0a2f3f-009:16020-2] 
> handler.OpenRegionHandler(302): Failed open of 
> region=hbase_test,dffa78,1531227033378.cbf9a2daf3aaa0c7e931e9c9a7b53f41., 
> starting to roll back the global memstore size.
> java.lang.OutOfMemoryError: Java heap space
> at java.nio.HeapByteBuffer.(HeapByteBuffer.java:57)
> at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
> at 
> org.apache.hadoop.hbase.regionserver.OnheapChunk.allocateDataBuffer(OnheapChunk.java:41)
> at org.apache.hadoop.hbase.regionserver.Chunk.init(Chunk.java:104)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:226)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:180)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:163)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.getOrMakeChunk(MemStoreLABImpl.java:273)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.copyCellInto(MemStoreLABImpl.java:148)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.copyCellInto(MemStoreLABImpl.java:111)
> at 
> org.apache.hadoop.hbase.regionserver.Segment.maybeCloneWithAllocator(Segment.java:178)
> at 
> org.apache.hadoop.hbase.regionserver.AbstractMemStore.maybeCloneWithAllocator(AbstractMemStore.java:287)
> at 
> org.apache.hadoop.hbase.regionserver.AbstractMemStore.add(AbstractMemStore.java:107)
> at org.apache.hadoop.hbase.regionserver.HStore.add(HStore.java:706)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.restoreEdit(HRegion.java:5494)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEdits(HRegion.java:4608)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEditsIfAny(HRegion.java:4404)
> {code}
> After this exception, the memstore is not rolled back, and since MSLAB is 
> used, none of the allocated chunks will ever be released. That memory is 
> leaked forever...
> We need to roll back the memstore memory if opening the region fails (for 
> now, only the global memstore size is decreased after a failure).
> Another problem is that we use replayEditsPerRegion in RegionServerAccounting 
> to record how much memory is used during replay, and decrease the global 
> memstore size if the replay fails. This is not right: during replay we may 
> also flush the memstore, so the size in the replayEditsPerRegion map is not 
> accurate at all! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-15233) Bytes.toBytes() methods should allow arrays to be re-used

2018-08-10 Thread Anni Du (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576457#comment-16576457
 ] 

Anni Du edited comment on HBASE-15233 at 8/10/18 3:38 PM:
--

Suppose we want to convert b to bytes, and we want to reuse it, what's the 
difference between
{code:java}
byte reuse = new byte[Byte.SIZEOF_LONG]
reuse = Bytes.toBytes(b)
{code}
and
{code:java}
byte reuse = new byte[Byte.SIZEOF_LONG]
Bytes.toBytes(b,reuse)
{code}
 

 


was (Author: adu):
Suppose we want to convert b to bytes, and we want to reuse it, what's the 
difference between

```

byte reuse = new byte[Byte.SIZEOF_LONG]
reuse = Bytes.toBytes(b)

```

and

```

byte reuse = new byte[Byte.SIZEOF_LONG]
Bytes.toBytes(b,reuse)

```

 

> Bytes.toBytes() methods should allow arrays to be re-used 
> --
>
> Key: HBASE-15233
> URL: https://issues.apache.org/jira/browse/HBASE-15233
> Project: HBase
>  Issue Type: Improvement
>  Components: API
>Affects Versions: 1.1.3
>Reporter: Jean-Marc Spaggiari
>Assignee: Michael Ernest
>Priority: Minor
>  Labels: beginner
>
> Today we have this:
> {code}
>   public static byte[] toBytes(long val) {
> byte [] b = new byte[8];
> for (int i = 7; i > 0; i--) {
>   b[i] = (byte) val;
>   val >>>= 8;
> }
> b[0] = (byte) val;
> return b;
>   }
> {code}
> might be nice to also have this:
> {code}
>   public static byte[] toBytes(long val, byte[] reuse) {
> for (int i = 7; i > 0; i--) {
>   reuse[i] = (byte) val;
>   val >>>= 8;
> }
> reuse[0] = (byte) val;
> return reuse;
>   }
> {code}
> Same for all the other Bytes.toBytes() methods.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work started] (HBASE-20387) Fold flaky test finding into nightly job

2018-08-10 Thread Sean Busbey (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HBASE-20387 started by Sean Busbey.
---
> Fold flaky test finding into nightly job
> 
>
> Key: HBASE-20387
> URL: https://issues.apache.org/jira/browse/HBASE-20387
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Reporter: Sean Busbey
>Assignee: Sean Busbey
>Priority: Critical
>
> We need a flaky list per-branch, since what does/does not work reliably on 
> master isn't really relevant to our older maintenance release lines.
> We should just make the invocation a step in the current per-branch nightly 
> jobs, prior to when we need the list in the stages that run unit tests. We 
> can publish it in the nightly job as well so that precommit can still get it. 
> (and can fetch it per-branch if needed)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HBASE-20387) Fold flaky test finding into nightly job

2018-08-10 Thread Sean Busbey (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey reassigned HBASE-20387:
---

Assignee: Sean Busbey

> Fold flaky test finding into nightly job
> 
>
> Key: HBASE-20387
> URL: https://issues.apache.org/jira/browse/HBASE-20387
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Reporter: Sean Busbey
>Assignee: Sean Busbey
>Priority: Critical
>
> We need a flaky list per-branch, since what does/does not work reliably on 
> master isn't really relevant to our older maintenance release lines.
> We should just make the invocation a step in the current per-branch nightly 
> jobs, prior to when we need the list in the stages that run unit tests. We 
> can publish it in the nightly job as well so that precommit can still get it. 
> (and can fetch it per-branch if needed)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (HBASE-15839) Track our flaky tests and use them to improve our build environment

2018-08-10 Thread Sean Busbey (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-15839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey resolved HBASE-15839.
-
Resolution: Fixed

> Track our flaky tests and use them to improve our build environment
> ---
>
> Key: HBASE-15839
> URL: https://issues.apache.org/jira/browse/HBASE-15839
> Project: HBase
>  Issue Type: Improvement
>Reporter: Appy
>Assignee: Appy
>Priority: Major
> Attachments: Screen Shot 2016-05-16 at 4.02.46 PM.png
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-15233) Bytes.toBytes() methods should allow arrays to be re-used

2018-08-10 Thread Anni Du (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576457#comment-16576457
 ] 

Anni Du commented on HBASE-15233:
-

Suppose we want to convert b to bytes, and we want to reuse it, what's the 
difference between

```

byte reuse = new byte[Byte.SIZEOF_LONG]
reuse = Bytes.toBytes(b)

```

and

```

byte reuse = new byte[Byte.SIZEOF_LONG]
Bytes.toBytes(b,reuse)

```

 

> Bytes.toBytes() methods should allow arrays to be re-used 
> --
>
> Key: HBASE-15233
> URL: https://issues.apache.org/jira/browse/HBASE-15233
> Project: HBase
>  Issue Type: Improvement
>  Components: API
>Affects Versions: 1.1.3
>Reporter: Jean-Marc Spaggiari
>Assignee: Michael Ernest
>Priority: Minor
>  Labels: beginner
>
> Today we have this:
> {code}
>   public static byte[] toBytes(long val) {
> byte [] b = new byte[8];
> for (int i = 7; i > 0; i--) {
>   b[i] = (byte) val;
>   val >>>= 8;
> }
> b[0] = (byte) val;
> return b;
>   }
> {code}
> might be nice to also have this:
> {code}
>   public static byte[] toBytes(long val, byte[] reuse) {
> for (int i = 7; i > 0; i--) {
>   reuse[i] = (byte) val;
>   val >>>= 8;
> }
> reuse[0] = (byte) val;
> return reuse;
>   }
> {code}
> Same for all the other Bytes.toBytes() methods.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20988) TestShell shouldn't be skipped for hbase-shell module test

2018-08-10 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576454#comment-16576454
 ] 

Sean Busbey commented on HBASE-20988:
-

I believe this is a duplicate of HBASE-19265

> TestShell shouldn't be skipped for hbase-shell module test
> --
>
> Key: HBASE-20988
> URL: https://issues.apache.org/jira/browse/HBASE-20988
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>Priority: Major
>
> Here is snippet for QA run 13862 for HBASE-20985 :
> {code}
> 13:42:50 cd /testptch/hbase/hbase-shell
> 13:42:50 /usr/share/maven/bin/mvn 
> -Dmaven.repo.local=/home/jenkins/yetus-m2/hbase-master-patch-1 
> -DHBasePatchProcess -PrunAllTests 
> -Dtest.exclude.pattern=**/master.normalizer.
> TestSimpleRegionNormalizerOnCluster.java,**/replication.regionserver.TestSerialReplicationEndpoint.java,**/master.procedure.TestServerCrashProcedure.java,**/master.procedure.TestCreateTableProcedure.
> 
> java,**/TestClientOperationTimeout.java,**/client.TestSnapshotFromClientWithRegionReplicas.java,**/master.TestAssignmentManagerMetrics.java,**/client.TestShell.java,**/client.
> 
> TestCloneSnapshotFromClientWithRegionReplicas.java,**/master.TestDLSFSHLog.java,**/replication.TestReplicationSmallTestsSync.java,**/master.procedure.TestModifyTableProcedure.java,**/regionserver.
>
> TestCompactionInDeadRegionServer.java,**/client.TestFromClientSide3.java,**/master.procedure.TestRestoreSnapshotProcedure.java,**/client.TestRestoreSnapshotFromClient.java,**/security.access.
> 
> TestCoprocessorWhitelistMasterObserver.java,**/replication.regionserver.TestDrainReplicationQueuesForStandBy.java,**/master.procedure.TestProcedurePriority.java,**/master.locking.TestLockProcedure.
>   
> java,**/master.cleaner.TestSnapshotFromMaster.java,**/master.assignment.TestSplitTableRegionProcedure.java,**/client.TestMobRestoreSnapshotFromClient.java,**/replication.TestReplicationKillSlaveRS.
>   
> java,**/regionserver.TestHRegion.java,**/security.access.TestAccessController.java,**/master.procedure.TestTruncateTableProcedure.java,**/client.TestAsyncReplicationAdminApiWithClusters.java,**/
>  
> coprocessor.TestMetaTableMetrics.java,**/client.TestMobSnapshotCloneIndependence.java,**/namespace.TestNamespaceAuditor.java,**/master.TestMasterAbortAndRSGotKilled.java,**/client.TestAsyncTable.java,**/master.TestMasterOperationsForRegionReplicas.java,**/util.TestFromClientSide3WoUnsafe.java,**/client.TestSnapshotCloneIndependence.java,**/client.TestAsyncDecommissionAdminApi.java,**/client.
> 
> TestRestoreSnapshotFromClientWithRegionReplicas.java,**/master.assignment.TestMasterAbortWhileMergingTable.java,**/client.TestFromClientSide.java,**/client.TestAdmin1.java,**/client.
>  
> TestFromClientSideWithCoprocessor.java,**/replication.TestReplicationKillSlaveRSWithSeparateOldWALs.java,**/master.procedure.TestMasterFailoverWithProcedures.java,**/regionserver.
> TestSplitTransactionOnCluster.java clean test -fae > 
> /testptch/patchprocess/patch-unit-hbase-shell.txt 2>&1
> {code}
> In this case, there was a modification to a shell script, which should lead 
> to running the shell tests.
> However, TestShell was excluded in the QA run, defeating the purpose.
> Meanwhile, QA posted the following onto HBASE-20985:
> bq. +1  unit  7m 4s  hbase-shell in the patch passed.
> That is misleading - no related test was actually run.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (HBASE-12436) Fix release artifact distribution

2018-08-10 Thread Sean Busbey (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-12436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey resolved HBASE-12436.
-
Resolution: Cannot Reproduce

At some point this all got fixed.

> Fix release artifact distribution
> -
>
> Key: HBASE-12436
> URL: https://issues.apache.org/jira/browse/HBASE-12436
> Project: HBase
>  Issue Type: Task
>Reporter: Sean Busbey
>Priority: Blocker
>
> {quote}
> The old rsync way of release artifact distribution has been turned off. As
> a result, all of our mirrored download links now lead only to hbase-0.94.23
> and hbase-0.96.2. We are going to need to figure out how to use the new svn
> pubsub method and recreate and republish artifacts for earlier releases.
> {quote}
> The artifacts for releases that need to be republished are available on the 
> ASF archive: http://archive.apache.org/dist/hbase/



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20753) reference guide should direct security related issues to priv...@hbase.apache.org

2018-08-10 Thread Sean Busbey (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated HBASE-20753:

   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Thanks for the patch, [~awked06]! And thanks for changing the status so QABot 
(and I!) would notice, [~liuml07].

> reference guide should direct security related issues to 
> priv...@hbase.apache.org
> -
>
> Key: HBASE-20753
> URL: https://issues.apache.org/jira/browse/HBASE-20753
> Project: HBase
>  Issue Type: Bug
>  Components: documentation, security
>Reporter: Sean Busbey
>Assignee: Sahil Aggarwal
>Priority: Critical
>  Labels: beginner
> Fix For: 3.0.0
>
> Attachments: HBASE-20753.master.001.patch
>
>
> the reference guide currently directs folks to send security issues to 
> priv...@apache.org:
> {quote}
> To protect existing HBase installations from new vulnerabilities, please do 
> not use JIRA to report security-related bugs. Instead, send your report to 
> the mailing list priv...@apache.org, which allows anyone to send messages, 
> but restricts who can read them. Someone on that list will contact you to 
> follow up on your report.
> {quote}
> This address does not exist. It should tell folks to send the email to 
> priv...@hbase.apache.org.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HBASE-20753) reference guide should direct security related issues to priv...@hbase.apache.org

2018-08-10 Thread Sean Busbey (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey reassigned HBASE-20753:
---

Assignee: Sahil Aggarwal

> reference guide should direct security related issues to 
> priv...@hbase.apache.org
> -
>
> Key: HBASE-20753
> URL: https://issues.apache.org/jira/browse/HBASE-20753
> Project: HBase
>  Issue Type: Bug
>  Components: documentation, security
>Reporter: Sean Busbey
>Assignee: Sahil Aggarwal
>Priority: Critical
>  Labels: beginner
> Attachments: HBASE-20753.master.001.patch
>
>
> the reference guide currently directs folks to send security issues to 
> priv...@apache.org:
> {quote}
> To protect existing HBase installations from new vulnerabilities, please do 
> not use JIRA to report security-related bugs. Instead, send your report to 
> the mailing list priv...@apache.org, which allows anyone to send messages, 
> but restricts who can read them. Someone on that list will contact you to 
> follow up on your report.
> {quote}
> This address does not exist. It should tell folks to send the email to 
> priv...@hbase.apache.org.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21031) Memory leak if replay edits failed during region opening

2018-08-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576432#comment-16576432
 ] 

Hadoop QA commented on HBASE-21031:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-2.0 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
23s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
58s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
20s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
41s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
27s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
36s{color} | {color:green} branch-2.0 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
49s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
15s{color} | {color:red} hbase-server: The patch generated 1 new + 250 
unchanged - 0 fixed = 251 total (was 250) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} shadedjars {color} | {color:red}  3m  
7s{color} | {color:red} patch has 10 errors when building our shaded downstream 
artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
11m 17s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}101m 
41s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
19s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}141m 57s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:6f01af0 |
| JIRA Issue | HBASE-21031 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12935134/HBASE-21031.branch-2.0.003.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 2ad2799203a4 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | branch-2.0 / 7ee4aa459c |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC3 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HBASE-Build/14002/artifact/patchprocess/diff-checkstyle-hbase-server.txt
 |
| shadedjars | 
https://builds.apache.org/job/PreCommit-HBASE-Build/14002/artifact/patchprocess/patch-shadedjars.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/14002/testReport/ |
| 

[jira] [Commented] (HBASE-20979) Flaky test reporting should specify what JSON it needs and handle HTTP errors

2018-08-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576431#comment-16576431
 ] 

Hadoop QA commented on HBASE-20979:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
13s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} pylint {color} | {color:green}  0m  
3s{color} | {color:green} There were no new pylint issues. {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
10s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}  0m 38s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-20979 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12935144/HBASE-20979.2.patch |
| Optional Tests |  asflicense  pylint  |
| uname | Linux 2c02c04fb48c 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 
10:45:36 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 397388316e |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| pylint | v1.6.5 |
| Max. process+thread count | 42 (vs. ulimit of 1) |
| modules | C: . U: . |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/14003/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically generated.



> Flaky test reporting should specify what JSON it needs and handle HTTP errors
> -
>
> Key: HBASE-20979
> URL: https://issues.apache.org/jira/browse/HBASE-20979
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Reporter: Sean Busbey
>Assignee: Sean Busbey
>Priority: Minor
> Attachments: HBASE-20979.0.txt, HBASE-20979.1.patch, 
> HBASE-20979.2.patch
>
>
> Current flaky test report should be including the {{tree=}} parameter in its 
> Jenkins API calls (see 
> https://support.cloudbees.com/hc/en-us/articles/217911388-Best-Practice-For-Using-Jenkins-REST-API).
> Also should provide some info on failure so that when jobs change or go away 
> we don't get blank failures.
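
For anyone skimming: the {{tree=}} parameter tells the Jenkins JSON API to return 
only the named fields instead of the whole build model, and a non-200 response 
should fail loudly rather than silently produce an empty report. A minimal sketch 
of that pattern follows, written in Java purely for illustration (the actual 
flaky-test report tooling is a Python script, and the job URL and field list 
below are made up):

{code:java}
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class JenkinsJsonFetch {
  /**
   * Fetches a trimmed-down JSON view of a Jenkins job. The tree= parameter
   * limits the response to the listed fields; any non-200 status becomes an
   * exception with enough context to debug, instead of a blank report entry.
   */
  public static String fetchBuildSummary(String jobUrl) throws IOException {
    String api = jobUrl + "/api/json?tree=builds[number,result,timestamp]";
    HttpURLConnection conn = (HttpURLConnection) new URL(api).openConnection();
    conn.setConnectTimeout(10_000);
    conn.setReadTimeout(30_000);
    int status = conn.getResponseCode();
    if (status != HttpURLConnection.HTTP_OK) {
      throw new IOException("Jenkins returned HTTP " + status + " for " + api);
    }
    try (InputStream in = conn.getInputStream()) {
      return new String(in.readAllBytes(), StandardCharsets.UTF_8);
    }
  }
}
{code}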



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20979) Flaky test reporting should specify what JSON it needs and handle HTTP errors

2018-08-10 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576416#comment-16576416
 ] 

Sean Busbey commented on HBASE-20979:
-

-v2
  - fix the pylint error (sorry, just noticed it)

> Flaky test reporting should specify what JSON it needs and handle HTTP errors
> -
>
> Key: HBASE-20979
> URL: https://issues.apache.org/jira/browse/HBASE-20979
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Reporter: Sean Busbey
>Assignee: Sean Busbey
>Priority: Minor
> Attachments: HBASE-20979.0.txt, HBASE-20979.1.patch, 
> HBASE-20979.2.patch
>
>
> Current flaky test report should be including the {{tree=}} parameter in its 
> Jenkins API calls (see 
> https://support.cloudbees.com/hc/en-us/articles/217911388-Best-Practice-For-Using-Jenkins-REST-API).
> Also should provide some info on failure so that when jobs change or go away 
> we don't get blank failures.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20979) Flaky test reporting should specify what JSON it needs and handle HTTP errors

2018-08-10 Thread Sean Busbey (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated HBASE-20979:

Attachment: HBASE-20979.2.patch

> Flaky test reporting should specify what JSON it needs and handle HTTP errors
> -
>
> Key: HBASE-20979
> URL: https://issues.apache.org/jira/browse/HBASE-20979
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Reporter: Sean Busbey
>Assignee: Sean Busbey
>Priority: Minor
> Attachments: HBASE-20979.0.txt, HBASE-20979.1.patch, 
> HBASE-20979.2.patch
>
>
> Current flaky test report should be including the {{tree=}} parameter in its 
> Jenkins API calls (see 
> https://support.cloudbees.com/hc/en-us/articles/217911388-Best-Practice-For-Using-Jenkins-REST-API).
> Also should provide some info on failure so that when jobs change or go away 
> we don't get blank failures.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20979) Flaky test reporting should specify what JSON it needs and handle HTTP errors

2018-08-10 Thread Sean Busbey (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated HBASE-20979:

Status: Patch Available  (was: In Progress)

-v1
  - update result retrieval to work with yetus builds


Okay, I've tested this now with yetus and non-yetus builds and it works as 
expected.

Are you still +1, [~Apache9]?

> Flaky test reporting should specify what JSON it needs and handle HTTP errors
> -
>
> Key: HBASE-20979
> URL: https://issues.apache.org/jira/browse/HBASE-20979
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Reporter: Sean Busbey
>Assignee: Sean Busbey
>Priority: Minor
> Attachments: HBASE-20979.0.txt, HBASE-20979.1.patch
>
>
> Current flaky test report should be including the {{tree=}} parameter in its 
> Jenkins API calls (see 
> https://support.cloudbees.com/hc/en-us/articles/217911388-Best-Practice-For-Using-Jenkins-REST-API).
> Also should provide some info on failure so that when jobs change or go away 
> we don't get blank failures.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20979) Flaky test reporting should specify what JSON it needs and handle HTTP errors

2018-08-10 Thread Sean Busbey (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated HBASE-20979:

Attachment: HBASE-20979.1.patch

> Flaky test reporting should specify what JSON it needs and handle HTTP errors
> -
>
> Key: HBASE-20979
> URL: https://issues.apache.org/jira/browse/HBASE-20979
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Reporter: Sean Busbey
>Assignee: Sean Busbey
>Priority: Minor
> Attachments: HBASE-20979.0.txt, HBASE-20979.1.patch
>
>
> Current flaky test report should be including the {{tree=}} parameter in its 
> Jenkins API calls (see 
> https://support.cloudbees.com/hc/en-us/articles/217911388-Best-Practice-For-Using-Jenkins-REST-API).
> Also should provide some info on failure so that when jobs change or go away 
> we don't get blank failures.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HBASE-21030) Correct javadoc for append operation

2018-08-10 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-21030:
--

Assignee: Subrat Mishra

> Correct javadoc for append operation
> 
>
> Key: HBASE-21030
> URL: https://issues.apache.org/jira/browse/HBASE-21030
> Project: HBase
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 3.0.0, 1.5.0
>Reporter: Nihal Jain
>Assignee: Subrat Mishra
>Priority: Minor
>  Labels: beginner, beginners
>
> The doc for {{append}} operation is incorrect. (see {{@param append}} in the 
> code snippet below or 
> [Table.java#L566|https://github.com/apache/hbase/blob/3f5033f88ee9da2a5a42d058b9aefe57b089b3e1/hbase-client/src/main/java/org/apache/hadoop/hbase/client/Table.java#L566])
> {code:java}
>   /**
>* Appends values to one or more columns within a single row.
>* <p>
>* This operation guaranteed atomicity to readers. Appends are done
>* under a single row lock, so write operations to a row are synchronized, 
> and
>* readers are guaranteed to see this operation fully completed.
>*
>* @param append object that specifies the columns and amounts to be used
>*  for the increment operations
>* @throws IOException e
>* @return values of columns after the append operation (maybe null)
>*/
> {code}
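
One possible corrected wording, shown only as a sketch (not taken from any patch 
on this issue), would describe appends rather than increments and tidy the 
grammar of the surrounding sentences:

{code:java}
  /**
   * Appends values to one or more columns within a single row.
   * <p>
   * This operation guarantees atomicity to readers. Appends are done
   * under a single row lock, so write operations to a row are synchronized,
   * and readers are guaranteed to see this operation fully completed.
   *
   * @param append object that specifies the columns and values to be appended
   * @return values of columns after the append operation (maybe null)
   * @throws IOException e
   */
  Result append(Append append) throws IOException;
{code}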



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21036) Upgrade our hadoop 3 dependency to 3.1.1

2018-08-10 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576382#comment-16576382
 ] 

Wei-Chiu Chuang commented on HBASE-21036:
-

(sorry, updated my comments) My team spent a bunch of time on the Hadoop 3.0 
release line, so I didn't want it to sound like 3.0 is crap :). But I'm sure the 
Hadoop 3.1 release team also doesn't want it to sound like 3.1 is crap.

> Upgrade our hadoop 3 dependency to 3.1.1
> 
>
> Key: HBASE-21036
> URL: https://issues.apache.org/jira/browse/HBASE-21036
> Project: HBase
>  Issue Type: Sub-task
>  Components: build, hadoop3
>Reporter: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.2.0, 2.1.1
>
>
> https://lists.apache.org/thread.html/895f28e0941b37f006812afa383ff8ff9148fafc4a5be385aebd0fa1@%3Cgeneral.hadoop.apache.org%3E
> 3.1.1 is a stable release. And since 3.0.x is not compatible with hbase due 
> to YARN-7190, we should upgrade our dependency directly to 3.1.x line.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21036) Upgrade our hadoop 3 dependency to 3.1.1

2018-08-10 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576379#comment-16576379
 ] 

Sean Busbey commented on HBASE-21036:
-

ships in the night! I also just verified 3.0.3 has YARN-7190 in it. I've 
updated that jira and started figuring out what I need to do to make sure it's 
listed.

* What do folks think about moving 3.0.3 and 3.1.1 to "NT" status?
* making 3.0.3 the pom version in 2.y.z?
* making 3.1.1 the pom version in master?

Should we make it a goal for 2.2.0 to get the 3.0 version into "S"?

> Upgrade our hadoop 3 dependency to 3.1.1
> 
>
> Key: HBASE-21036
> URL: https://issues.apache.org/jira/browse/HBASE-21036
> Project: HBase
>  Issue Type: Sub-task
>  Components: build, hadoop3
>Reporter: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.2.0, 2.1.1
>
>
> https://lists.apache.org/thread.html/895f28e0941b37f006812afa383ff8ff9148fafc4a5be385aebd0fa1@%3Cgeneral.hadoop.apache.org%3E
> 3.1.1 is a stable release. And since 3.0.x is not compatible with hbase due 
> to YARN-7190, we should upgrade our dependency directly to 3.1.x line.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-21036) Upgrade our hadoop 3 dependency to 3.1.1

2018-08-10 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576359#comment-16576359
 ] 

Wei-Chiu Chuang edited comment on HBASE-21036 at 8/10/18 2:48 PM:
--

I've checked that Hadoop 3.0.3 has YARN-7190 in it.

Hadoop 3.1 brought in a bunch of new YARN features. If HBase requires them to 
develop new features, it would make sense to move up to 3.1.

With my Hadoop hat on, I strive to ensure the Hadoop 3.0.x line is stable.


was (Author: jojochuang):
I've checked Hadoop 3.0.3 has YARN-7190 in it.

Hadoop 3.1 release line has a bunch of very new YARN features in it. Hadoop 
3.0.x is much better tested IMHO.

> Upgrade our hadoop 3 dependency to 3.1.1
> 
>
> Key: HBASE-21036
> URL: https://issues.apache.org/jira/browse/HBASE-21036
> Project: HBase
>  Issue Type: Sub-task
>  Components: build, hadoop3
>Reporter: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.2.0, 2.1.1
>
>
> https://lists.apache.org/thread.html/895f28e0941b37f006812afa383ff8ff9148fafc4a5be385aebd0fa1@%3Cgeneral.hadoop.apache.org%3E
> 3.1.1 is a stable release. And since 3.0.x is not compatible with hbase due 
> to YARN-7190, we should upgrade our dependency directly to 3.1.x line.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21030) Correct javadoc for append operation

2018-08-10 Thread Nihal Jain (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576374#comment-16576374
 ] 

Nihal Jain commented on HBASE-21030:


[~yuzhih...@gmail.com], can you add [~subrat.mishra] as a contributor?!

> Correct javadoc for append operation
> 
>
> Key: HBASE-21030
> URL: https://issues.apache.org/jira/browse/HBASE-21030
> Project: HBase
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 3.0.0, 1.5.0
>Reporter: Nihal Jain
>Priority: Minor
>  Labels: beginner, beginners
>
> The doc for {{append}} operation is incorrect. (see {{@param append}} in the 
> code snippet below or 
> [Table.java#L566|https://github.com/apache/hbase/blob/3f5033f88ee9da2a5a42d058b9aefe57b089b3e1/hbase-client/src/main/java/org/apache/hadoop/hbase/client/Table.java#L566])
> {code:java}
>   /**
>* Appends values to one or more columns within a single row.
>* <p>
>* This operation guaranteed atomicity to readers. Appends are done
>* under a single row lock, so write operations to a row are synchronized, 
> and
>* readers are guaranteed to see this operation fully completed.
>*
>* @param append object that specifies the columns and amounts to be used
>*  for the increment operations
>* @throws IOException e
>* @return values of columns after the append operation (maybe null)
>*/
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21036) Upgrade our hadoop 3 dependency to 3.1.1

2018-08-10 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576359#comment-16576359
 ] 

Wei-Chiu Chuang commented on HBASE-21036:
-

I've checked Hadoop 3.0.3 has YARN-7190 in it.

Hadoop 3.1 release line has a bunch of very new YARN features in it. Hadoop 
3.0.x is much better tested IMHO.

> Upgrade our hadoop 3 dependency to 3.1.1
> 
>
> Key: HBASE-21036
> URL: https://issues.apache.org/jira/browse/HBASE-21036
> Project: HBase
>  Issue Type: Sub-task
>  Components: build, hadoop3
>Reporter: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.2.0, 2.1.1
>
>
> https://lists.apache.org/thread.html/895f28e0941b37f006812afa383ff8ff9148fafc4a5be385aebd0fa1@%3Cgeneral.hadoop.apache.org%3E
> 3.1.1 is a stable release. And since 3.0.x is not compatible with hbase due 
> to YARN-7190, we should upgrade our dependency directly to 3.1.x line.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20257) hbase-spark should not depend on com.google.code.findbugs.jsr305

2018-08-10 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576350#comment-16576350
 ] 

Sean Busbey commented on HBASE-20257:
-

It looks like the {{hbase-spark-it}} module didn't get updated. It still has 
the "only warn about jsr305" plugin setting. With that removed and a clean QA 
run, I'm +1.

> hbase-spark should not depend on com.google.code.findbugs.jsr305
> 
>
> Key: HBASE-20257
> URL: https://issues.apache.org/jira/browse/HBASE-20257
> Project: HBase
>  Issue Type: Task
>  Components: build, spark
>Affects Versions: 3.0.0
>Reporter: Ted Yu
>Assignee: Artem Ervits
>Priority: Minor
>  Labels: beginner
> Attachments: HBASE-20257.v01.patch, HBASE-20257.v02.patch, 
> HBASE-20257.v03.patch, HBASE-20257.v04.patch, HBASE-20257.v05.patch
>
>
> The following can be observed in the build output of master branch:
> {code}
> [WARNING] Rule 0: org.apache.maven.plugins.enforcer.BannedDependencies failed 
> with message:
> We don't allow the JSR305 jar from the Findbugs project, see HBASE-16321.
> Found Banned Dependency: com.google.code.findbugs:jsr305:jar:1.3.9
> Use 'mvn dependency:tree' to locate the source of the banned dependencies.
> {code}
> Here is related snippet from hbase-spark/pom.xml:
> {code}
> <dependency>
>   <groupId>com.google.code.findbugs</groupId>
>   <artifactId>jsr305</artifactId>
> {code}
> Dependency on jsr305 should be dropped.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20965) Separate region server report requests to new handlers

2018-08-10 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576335#comment-16576335
 ] 

Hudson commented on HBASE-20965:


Results for branch master
[build #424 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/424/]: (x) 
*{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/424//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/424//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/424//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Separate region server report requests to new handlers
> --
>
> Key: HBASE-20965
> URL: https://issues.apache.org/jira/browse/HBASE-20965
> Project: HBase
>  Issue Type: Improvement
>  Components: Performance
>Reporter: Yi Mei
>Assignee: Yi Mei
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.1
>
> Attachments: HBASE-20965.branch-2.1.001.patch, 
> HBASE-20965.master.001.patch, HBASE-20965.master.002.patch, 
> HBASE-20965.master.003.patch, HBASE-20965.master.004.patch, 
> HBASE-20965.master.005.patch, HBASE-20965.master.006.patch, 
> HBASE-20965.master.007.patch, HBASE-20965.master.008.patch, 
> HBASE-20965.master.009.patch, HBASE-20965.master.010.patch, 
> HBASE-20965.master.011.patch
>
>
> In master rpc scheduler, all rpc requests are executed in a thread pool. This 
> task separates rs report requests to new handlers.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-18201) add UT and docs for DataBlockEncodingTool

2018-08-10 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-18201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576337#comment-16576337
 ] 

Hudson commented on HBASE-18201:


Results for branch master
[build #424 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/424/]: (x) 
*{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/424//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/424//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/424//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> add UT and docs for DataBlockEncodingTool
> -
>
> Key: HBASE-18201
> URL: https://issues.apache.org/jira/browse/HBASE-18201
> Project: HBase
>  Issue Type: Sub-task
>  Components: tooling
>Reporter: Chia-Ping Tsai
>Assignee: Kuan-Po Tseng
>Priority: Minor
>  Labels: beginner
> Fix For: 3.0.0, 2.2.0, 2.1.1
>
> Attachments: HBASE-18201.master.001.patch, 
> HBASE-18201.master.002.patch, HBASE-18201.master.002.patch, 
> HBASE-18201.master.003.patch, HBASE-18201.master.004.patch, 
> HBASE-18201.master.005.patch, HBASE-18201.master.005.patch, 
> HBASE-18201.master.005.patch, HBASE-18201.master.006.patch, 
> HBASE-18201.master.006.patch
>
>
> There is no example, documents, or tests for DataBlockEncodingTool. We should 
> have it friendly if any use case exists. Otherwise, we should just get rid of 
> it because DataBlockEncodingTool presumes that the implementation of cell 
> returned from DataBlockEncoder is KeyValue. The presume may obstruct the 
> cleanup of KeyValue references in the code base of read/write path.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21027) Inconsistent synchronization in CacheableDeserializerIdManager

2018-08-10 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576336#comment-16576336
 ] 

Hudson commented on HBASE-21027:


Results for branch master
[build #424 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/424/]: (x) 
*{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/424//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/424//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/424//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Inconsistent synchronization in CacheableDeserializerIdManager 
> ---
>
> Key: HBASE-21027
> URL: https://issues.apache.org/jira/browse/HBASE-21027
> Project: HBase
>  Issue Type: Task
>Affects Versions: 3.0.0
>Reporter: Mike Drob
>Assignee: Mike Drob
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-21027.master.001.patch, 
> HBASE-21027.master.002.patch
>
>
> There is some inconsistent synchronization going on in CDIM, we should switch 
> it to using ConcurrentHashMap and simplify our code.
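
The gist of that switch, as a minimal sketch only (class and method names below 
are illustrative, not the actual CacheableDeserializerIdManager internals): a 
ConcurrentHashMap plus an AtomicInteger gives atomic registration and lookup 
without explicit synchronized blocks.

{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative registry: thread-safe without any synchronized blocks.
public final class DeserializerRegistry<T> {
  private final Map<Integer, T> registry = new ConcurrentHashMap<>();
  private final AtomicInteger nextId = new AtomicInteger(0);

  /** Registers a deserializer and returns the identifier assigned to it. */
  public int register(T deserializer) {
    int id = nextId.incrementAndGet();
    registry.put(id, deserializer);
    return id;
  }

  /** Looks up a previously registered deserializer, or null if unknown. */
  public T get(int id) {
    return registry.get(id);
  }
}
{code}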



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21025) Add cache for TableStateManager

2018-08-10 Thread Duo Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-21025:
--
  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Pushed to branch-2.0+. Thanks [~stack] for reviewing.

> Add cache for TableStateManager
> ---
>
> Key: HBASE-21025
> URL: https://issues.apache.org/jira/browse/HBASE-21025
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.2.0, 2.1.1
>
> Attachments: HBASE-21025-v1.patch, HBASE-21025-v2.patch, 
> HBASE-21025.patch
>
>
> After HBASE-20881, we will check whether a table is disabled in SCP, so we 
> need to add cache for it to improve MTTR, and also reduce the request to meta.
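
The caching idea, as a rough sketch (names here are illustrative; the real 
TableStateManager reads states from the meta table and handles more cases): keep 
an in-memory map of table states, serve reads from it, and update it whenever a 
state changes so that meta is only consulted on a miss.

{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative cache: consult the in-memory map first and only fall back to
// the (expensive) meta lookup on a miss; keep it in sync on state changes.
public class CachedTableStates {
  public enum State { ENABLED, DISABLED, DISABLING, ENABLING }

  private final Map<String, State> cache = new ConcurrentHashMap<>();

  public State get(String tableName) {
    return cache.computeIfAbsent(tableName, this::readFromMeta);
  }

  public void update(String tableName, State newState) {
    cache.put(tableName, newState); // refresh the cache when a state changes
  }

  // Placeholder for the real meta-table read.
  private State readFromMeta(String tableName) {
    return State.ENABLED;
  }
}
{code}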



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work started] (HBASE-20979) Flaky test reporting should specify what JSON it needs and handle HTTP errors

2018-08-10 Thread Sean Busbey (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HBASE-20979 started by Sean Busbey.
---
> Flaky test reporting should specify what JSON it needs and handle HTTP errors
> -
>
> Key: HBASE-20979
> URL: https://issues.apache.org/jira/browse/HBASE-20979
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Reporter: Sean Busbey
>Assignee: Sean Busbey
>Priority: Minor
> Attachments: HBASE-20979.0.txt
>
>
> Current flaky test report should be including the {{tree=}} parameter in its 
> Jenkins API calls (see 
> https://support.cloudbees.com/hc/en-us/articles/217911388-Best-Practice-For-Using-Jenkins-REST-API).
> Also should provide some info on failure so that when jobs change or go away 
> we don't get blank failures.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21036) Upgrade our hadoop 3 dependency to 3.1.1

2018-08-10 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576318#comment-16576318
 ] 

Sean Busbey commented on HBASE-21036:
-

would this mean giving up on getting a Hadoop 3.0.z release with YARN-7190?

> Upgrade our hadoop 3 dependency to 3.1.1
> 
>
> Key: HBASE-21036
> URL: https://issues.apache.org/jira/browse/HBASE-21036
> Project: HBase
>  Issue Type: Sub-task
>  Components: build, hadoop3
>Reporter: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.2.0, 2.1.1
>
>
> https://lists.apache.org/thread.html/895f28e0941b37f006812afa383ff8ff9148fafc4a5be385aebd0fa1@%3Cgeneral.hadoop.apache.org%3E
> 3.1.1 is a stable release. And since 3.0.x is not compatible with hbase due 
> to YARN-7190, we should upgrade our dependency directly to 3.1.x line.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21036) Upgrade our hadoop 3 dependency to 3.1.1

2018-08-10 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576317#comment-16576317
 ] 

Sean Busbey commented on HBASE-21036:
-

This is a good idea for new minor releases. I don't think we've done enough 
diligence on what's changed from 3.0 to do maintenance releases yet.

> Upgrade our hadoop 3 dependency to 3.1.1
> 
>
> Key: HBASE-21036
> URL: https://issues.apache.org/jira/browse/HBASE-21036
> Project: HBase
>  Issue Type: Sub-task
>  Components: build, hadoop3
>Reporter: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.2.0, 2.1.1
>
>
> https://lists.apache.org/thread.html/895f28e0941b37f006812afa383ff8ff9148fafc4a5be385aebd0fa1@%3Cgeneral.hadoop.apache.org%3E
> 3.1.1 is a stable release. And since 3.0.x is not compatible with hbase due 
> to YARN-7190, we should upgrade our dependency directly to 3.1.x line.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-21014) Improve Stochastic Balancer to write HDFS favoured node hints for region primary blocks to avoid destroying data locality if needing to use HDFS Balancer

2018-08-10 Thread Hari Sekhon (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576290#comment-16576290
 ] 

Hari Sekhon edited comment on HBASE-21014 at 8/10/18 1:36 PM:
--

I thought that was really the crux of it:

Write the HDFS location preference hints the same way the FavoredNodeBalancer 
does, while applying all the usual Stochastic Balancer heuristics to make sure 
regions and load are evenly spread. Since the HBase Balancer chooses where to 
move regions, it can update the block location preference metadata to match 
whenever it migrates regions.

That way, when you need to rebalance HDFS blocks, the HDFS Balancer won't move 
the region blocks away from the RegionServers the regions are being served 
from, and HBase data locality is preserved.


was (Author: harisekhon):
I thought that was really the crux of it:

Write the HDFS location preference hints the same as the FavoredNodeBalancer 
does while applying all the usual Stochastic Balancer balancing heuristics to 
make sure regions and load are evenly spread. Since the HBase Balancer chooses 
where to move regions to it can update the block location preferences metadata 
to match it whenever it migrates regions.

That way if I need to rebalance HDFS blocks, the HDFS Balancer won't move the 
region blocks out of their primary active region locations when hdfs block 
pinning is enabled.

> Improve Stochastic Balancer to write HDFS favoured node hints for region 
> primary blocks to avoid destroying data locality if needing to use HDFS 
> Balancer
> -
>
> Key: HBASE-21014
> URL: https://issues.apache.org/jira/browse/HBASE-21014
> Project: HBase
>  Issue Type: Improvement
>  Components: Balancer
>Affects Versions: 1.1.2
>Reporter: Hari Sekhon
>Priority: Major
>
> Improve Stochastic Balancer to include the HDFS region location hints to 
> avoid HDFS Balancer destroying data locality.
> Right now according to a mix of docs, jiras and mailing list info it appears 
> that one must change
> {code:java}
> hbase.master.loadbalancer.class{code}
> to the org.apache.hadoop.hbase.favored.FavoredNodeLoadBalancer as it looks 
> like this functionality is only within FavoredNodeBalancer and not the 
> standard Stochastic Balancer.
> [http://hbase.apache.org/book.html#_hbase_and_hdfs]
> This is not ideal because we'd still like to use all the heuristics and work 
> that has gone in the Stochastic Balancer which I believe right now is the 
> best and most mature HBase balancer.
> See also the linked Jiras and this discussion:
> [http://apache-hbase.679495.n3.nabble.com/HDFS-Balancer-td4086607.html]
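
For reference, the switch described above amounts to pointing the master at the 
favored-node balancer class. A minimal sketch follows; in practice this property 
is normally set in hbase-site.xml on the master rather than in code:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class BalancerConfigExample {
  public static void main(String[] args) {
    // Equivalent to setting the property in hbase-site.xml on the master.
    Configuration conf = HBaseConfiguration.create();
    conf.set("hbase.master.loadbalancer.class",
        "org.apache.hadoop.hbase.favored.FavoredNodeLoadBalancer");
    System.out.println(conf.get("hbase.master.loadbalancer.class"));
  }
}
{code}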



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-21014) Improve Stochastic Balancer to write HDFS favoured node hints for region primary blocks to avoid destroying data locality if needing to use HDFS Balancer

2018-08-10 Thread Hari Sekhon (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576290#comment-16576290
 ] 

Hari Sekhon edited comment on HBASE-21014 at 8/10/18 1:33 PM:
--

I thought that was really the crux of it:

Write the HDFS location preference hints the same as the FavoredNodeBalancer 
does while applying all the usual Stochastic Balancer balancing heuristics to 
make sure regions and load are evenly spread. Since the HBase Balancer chooses 
where to move regions to it can update the block location preferences metadata 
to match it whenever it migrates regions.

That way if I need to rebalance HDFS blocks, the HDFS Balancer won't move the 
region blocks out of their primary active region locations when hdfs block 
pinning is enabled.


was (Author: harisekhon):
I thought this was really the crux of it:

Write the HDFS location preference hints the same as the FavoredNodeBalancer 
does while applying all the usual Stochastic Balancer balancing heuristics to 
make sure regions and load are evenly spread.

That way if I need to rebalance HDFS blocks, the HDFS Balancer won't move the 
region blocks out of their primary active region locations when hdfs block 
pinning is enabled.

> Improve Stochastic Balancer to write HDFS favoured node hints for region 
> primary blocks to avoid destroying data locality if needing to use HDFS 
> Balancer
> -
>
> Key: HBASE-21014
> URL: https://issues.apache.org/jira/browse/HBASE-21014
> Project: HBase
>  Issue Type: Improvement
>  Components: Balancer
>Affects Versions: 1.1.2
>Reporter: Hari Sekhon
>Priority: Major
>
> Improve Stochastic Balancer to include the HDFS region location hints to 
> avoid HDFS Balancer destroying data locality.
> Right now according to a mix of docs, jiras and mailing list info it appears 
> that one must change
> {code:java}
> hbase.master.loadbalancer.class{code}
> to the org.apache.hadoop.hbase.favored.FavoredNodeLoadBalancer as it looks 
> like this functionality is only within FavoredNodeBalancer and not the 
> standard Stochastic Balancer.
> [http://hbase.apache.org/book.html#_hbase_and_hdfs]
> This is not ideal because we'd still like to use all the heuristics and work 
> that has gone in the Stochastic Balancer which I believe right now is the 
> best and most mature HBase balancer.
> See also the linked Jiras and this discussion:
> [http://apache-hbase.679495.n3.nabble.com/HDFS-Balancer-td4086607.html]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21014) Improve Stochastic Balancer to write HDFS favoured node hints for region primary blocks to avoid destroying data locality if needing to use HDFS Balancer

2018-08-10 Thread Hari Sekhon (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576290#comment-16576290
 ] 

Hari Sekhon commented on HBASE-21014:
-

I thought this was really the crux of it:

Write the HDFS location preference hints the same as the FavoredNodeBalancer 
does while applying all the usual Stochastic Balancer balancing heuristics to 
make sure regions and load are evenly spread.

That way if I need to rebalance HDFS blocks, the HDFS Balancer won't move the 
region blocks out of their primary active region locations when hdfs block 
pinning is enabled.

> Improve Stochastic Balancer to write HDFS favoured node hints for region 
> primary blocks to avoid destroying data locality if needing to use HDFS 
> Balancer
> -
>
> Key: HBASE-21014
> URL: https://issues.apache.org/jira/browse/HBASE-21014
> Project: HBase
>  Issue Type: Improvement
>  Components: Balancer
>Affects Versions: 1.1.2
>Reporter: Hari Sekhon
>Priority: Major
>
> Improve Stochastic Balancer to include the HDFS region location hints to 
> avoid HDFS Balancer destroying data locality.
> Right now according to a mix of docs, jiras and mailing list info it appears 
> that one must change
> {code:java}
> hbase.master.loadbalancer.class{code}
> to the org.apache.hadoop.hbase.favored.FavoredNodeLoadBalancer as it looks 
> like this functionality is only within FavoredNodeBalancer and not the 
> standard Stochastic Balancer.
> [http://hbase.apache.org/book.html#_hbase_and_hdfs]
> This is not ideal because we'd still like to use all the heuristics and work 
> that has gone in the Stochastic Balancer which I believe right now is the 
> best and most mature HBase balancer.
> See also the linked Jiras and this discussion:
> [http://apache-hbase.679495.n3.nabble.com/HDFS-Balancer-td4086607.html]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21035) Meta Table should be able to online even if all procedures are lost

2018-08-10 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576279#comment-16576279
 ] 

Duo Zhang commented on HBASE-21035:
---

The goal of HBASE-20708 is to remove the unnecessary scheduling of SCPs and 
also remove the usage of RecoverMetaProcedure; if you want them back, then 
HBASE-20708 is useless...

I still stand by my point: this is not the normal case, as it breaks our 
assumptions. We can provide tools for operators to override these errors, and 
the operators will take their own risk, but we should not try to address them 
in the normal code path.

And if you think the procedure wal is not stable and may lead to corrupted 
files, please start making it stable. We may also introduce something like a 
backup for the procedure wals to prevent manual damage.

> Meta Table should be able to online even if all procedures are lost
> ---
>
> Key: HBASE-21035
> URL: https://issues.apache.org/jira/browse/HBASE-21035
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: HBASE-21035.branch-2.0.001.patch
>
>
> After HBASE-20708, we changed the way we init after the master starts. It will 
> only check WAL dirs and compare them to Zookeeper RS nodes to decide which 
> servers need to expire. For servers whose dir ends with 'SPLITTING', we assume 
> that there will be an SCP for it.
> But if the server with the meta region crashed before the master restarts, and 
> if all the procedure wals are lost (due to a bug, deleted manually, or 
> whatever), the newly restarted master will be stuck when initing, since no one 
> will bring the meta region online.
> Although it is an anomalous case, I think that no matter what happens we need 
> to bring the meta region online. Otherwise we are sitting ducks; nothing can 
> be done.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-21035) Meta Table should be able to online even if all procedures are lost

2018-08-10 Thread Allan Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576240#comment-16576240
 ] 

Allan Yang edited comment on HBASE-21035 at 8/10/18 1:11 PM:
-

If scheduling an SCP for servers with '-splitting' is not a good idea, maybe we 
can work around it.
Before HBASE-20708, there was a method called 
processofflineServersWithOnlineRegions which would schedule an SCP for any dead 
server that had regions on it (which causes HBASE-20976...). But after 
HBASE-20708 there isn't one (it was replaced by processOfflineRegions). Can we 
just bring the logic in processofflineServersWithOnlineRegions back? What I want 
is the same behavior with or without HBASE-20708.


was (Author: allan163):
If scheduling an SCP for servers with '-splitting' is not a good idea, then we 
can work around it. Before HBASE-20708, there was a method called 
processofflineServersWithOnlineRegions which would schedule an assign procedure 
for any regions on a dead server. But after HBASE-20708 there isn't one (it was 
replaced by processOfflineRegions). Can we just bring the logic in 
processofflineServersWithOnlineRegions back? What I want is the same behavior 
with or without HBASE-20708.

> Meta Table should be able to online even if all procedures are lost
> ---
>
> Key: HBASE-21035
> URL: https://issues.apache.org/jira/browse/HBASE-21035
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: HBASE-21035.branch-2.0.001.patch
>
>
> After HBASE-20708, we changed the way we init after the master starts. It will 
> only check WAL dirs and compare them to Zookeeper RS nodes to decide which 
> servers need to expire. For servers whose dir ends with 'SPLITTING', we assume 
> that there will be an SCP for it.
> But if the server with the meta region crashed before the master restarts, and 
> if all the procedure wals are lost (due to a bug, deleted manually, or 
> whatever), the newly restarted master will be stuck when initing, since no one 
> will bring the meta region online.
> Although it is an anomalous case, I think that no matter what happens we need 
> to bring the meta region online. Otherwise we are sitting ducks; nothing can 
> be done.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21035) Meta Table should be able to online even if all procedures are lost

2018-08-10 Thread Allan Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576240#comment-16576240
 ] 

Allan Yang commented on HBASE-21035:


If scheduling an SCP for servers with '-splitting' is not a good idea, then we 
can work around it. Before HBASE-20708, there was a method called 
processofflineServersWithOnlineRegions which would schedule an assign procedure 
for any regions on a dead server. But after HBASE-20708 there isn't one (it was 
replaced by processOfflineRegions). Can we just bring the logic in 
processofflineServersWithOnlineRegions back? What I want is the same behavior 
with or without HBASE-20708.

> Meta Table should be able to online even if all procedures are lost
> ---
>
> Key: HBASE-21035
> URL: https://issues.apache.org/jira/browse/HBASE-21035
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: HBASE-21035.branch-2.0.001.patch
>
>
> After HBASE-20708, we changed the way we init after the master starts. It will 
> only check WAL dirs and compare them to Zookeeper RS nodes to decide which 
> servers need to expire. For servers whose dir ends with 'SPLITTING', we assume 
> that there will be an SCP for it.
> But if the server with the meta region crashed before the master restarts, and 
> if all the procedure wals are lost (due to a bug, deleted manually, or 
> whatever), the newly restarted master will be stuck when initing, since no one 
> will bring the meta region online.
> Although it is an anomalous case, I think that no matter what happens we need 
> to bring the meta region online. Otherwise we are sitting ducks; nothing can 
> be done.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21036) Upgrade our hadoop 3 dependency to 3.1.1

2018-08-10 Thread Duo Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-21036:
--
Summary: Upgrade our hadoop 3 dependency to 3.1.1  (was: Upgrade our hadoop 
3 dependency to 3.1.0)

> Upgrade our hadoop 3 dependency to 3.1.1
> 
>
> Key: HBASE-21036
> URL: https://issues.apache.org/jira/browse/HBASE-21036
> Project: HBase
>  Issue Type: Sub-task
>  Components: build, hadoop3
>Reporter: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.2.0, 2.1.1
>
>
> https://lists.apache.org/thread.html/895f28e0941b37f006812afa383ff8ff9148fafc4a5be385aebd0fa1@%3Cgeneral.hadoop.apache.org%3E
> 3.1.1 is a stable release. And since 3.0.x is not compatible with hbase due 
> to YARN-7190, we should upgrade our dependency directly to 3.1.x line.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21035) Meta Table should be able to online even if all procedures are lost

2018-08-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576233#comment-16576233
 ] 

Hadoop QA commented on HBASE-21035:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
13s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-2.0 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
36s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
43s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
12s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
 5s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
2s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
29s{color} | {color:green} branch-2.0 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
42s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
11s{color} | {color:red} hbase-server: The patch generated 5 new + 176 
unchanged - 0 fixed = 181 total (was 176) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} shadedjars {color} | {color:red}  2m 
57s{color} | {color:red} patch has 10 errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
11m 10s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
11s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
28s{color} | {color:red} hbase-server generated 1 new + 1 unchanged - 0 fixed = 
2 total (was 1) {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}104m 
40s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
18s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}142m  0s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:6f01af0 |
| JIRA Issue | HBASE-21035 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12935119/HBASE-21035.branch-2.0.001.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 0a4799b4bac1 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 
10:45:36 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | branch-2.0 / 7ee4aa459c |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC3 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HBASE-Build/14001/artifact/patchprocess/diff-checkstyle-hbase-server.txt
 |
| shadedjars | 
https://builds.apache.org/job/PreCommit-HBASE-Build/14001/artifact/patchprocess/patch-shadedjars.txt
 |
| javadoc | 

[jira] [Created] (HBASE-21036) Upgrade our hadoop 3 dependency to 3.1.0

2018-08-10 Thread Duo Zhang (JIRA)
Duo Zhang created HBASE-21036:
-

 Summary: Upgrade our hadoop 3 dependency to 3.1.0
 Key: HBASE-21036
 URL: https://issues.apache.org/jira/browse/HBASE-21036
 Project: HBase
  Issue Type: Sub-task
  Components: build, hadoop3
Reporter: Duo Zhang
 Fix For: 3.0.0, 2.0.2, 2.2.0, 2.1.1


https://lists.apache.org/thread.html/895f28e0941b37f006812afa383ff8ff9148fafc4a5be385aebd0fa1@%3Cgeneral.hadoop.apache.org%3E

3.1.1 is a stable release. And since 3.0.x is not compatible with hbase due to 
YARN-7190, we should upgrade our dependency directly to 3.1.x line.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21035) Meta Table should be able to online even if all procedures are lost

2018-08-10 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576229#comment-16576229
 ] 

Duo Zhang commented on HBASE-21035:
---

What if you lose some edits of the meta region and then bring the meta region 
online?

There are some assumptions in the code: the renaming of the wal directory is 
done in the SCP, so we can be sure there is an SCP if the wal directory already 
ends with '-splitting'. If this is not the case, we do not know what to do. In 
your case you deleted all the procedures, but this is not the only possible 
case, right? The damage is made from outside the system, or from unexpected 
behavior, i.e. a serious bug in the code, so the decision should also be made 
outside the system.

I think we can add an admin method to allow operators to submit an SCP for a 
specific RS, and also add an option to HBCK that does the same thing as the 
patch here: scan the wal directory and re-submit an SCP for every RS whose wal 
directory ends with '-splitting'. But I'm strongly against adding this piece of 
code to the normal master startup path. It is really dangerous.

> Meta Table should be able to online even if all procedures are lost
> ---
>
> Key: HBASE-21035
> URL: https://issues.apache.org/jira/browse/HBASE-21035
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: HBASE-21035.branch-2.0.001.patch
>
>
> After HBASE-20708, we changed the way we init after the master starts. It will 
> only check WAL dirs and compare them to Zookeeper RS nodes to decide which 
> servers need to expire. For servers whose dir ends with 'SPLITTING', we assume 
> that there will be an SCP for it.
> But if the server with the meta region crashed before the master restarts, and 
> if all the procedure wals are lost (due to a bug, deleted manually, or 
> whatever), the newly restarted master will be stuck when initing, since no one 
> will bring the meta region online.
> Although it is an anomalous case, I think that no matter what happens we need 
> to bring the meta region online. Otherwise we are sitting ducks; nothing can 
> be done.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21031) Memory leak if replay edits failed during region opening

2018-08-10 Thread Allan Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allan Yang updated HBASE-21031:
---
Attachment: HBASE-21031.branch-2.0.003.patch

> Memory leak if replay edits failed during region opening
> 
>
> Key: HBASE-21031
> URL: https://issues.apache.org/jira/browse/HBASE-21031
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: HBASE-21031.branch-2.0.001.patch, 
> HBASE-21031.branch-2.0.002.patch, HBASE-21031.branch-2.0.003.patch, 
> memoryleak.png
>
>
> Due to HBASE-21029, when replaying edits with a lot of identical cells, the 
> memstore won't flush, and an exception is thrown once all heap space is used:
> {code}
> 2018-08-06 15:52:27,590 ERROR 
> [RS_OPEN_REGION-regionserver/hb-bp10cw4ejoy0a2f3f-009:16020-2] 
> handler.OpenRegionHandler(302): Failed open of 
> region=hbase_test,dffa78,1531227033378.cbf9a2daf3aaa0c7e931e9c9a7b53f41., 
> starting to roll back the global memstore size.
> java.lang.OutOfMemoryError: Java heap space
> at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
> at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
> at 
> org.apache.hadoop.hbase.regionserver.OnheapChunk.allocateDataBuffer(OnheapChunk.java:41)
> at org.apache.hadoop.hbase.regionserver.Chunk.init(Chunk.java:104)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:226)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:180)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:163)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.getOrMakeChunk(MemStoreLABImpl.java:273)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.copyCellInto(MemStoreLABImpl.java:148)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.copyCellInto(MemStoreLABImpl.java:111)
> at 
> org.apache.hadoop.hbase.regionserver.Segment.maybeCloneWithAllocator(Segment.java:178)
> at 
> org.apache.hadoop.hbase.regionserver.AbstractMemStore.maybeCloneWithAllocator(AbstractMemStore.java:287)
> at 
> org.apache.hadoop.hbase.regionserver.AbstractMemStore.add(AbstractMemStore.java:107)
> at org.apache.hadoop.hbase.regionserver.HStore.add(HStore.java:706)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.restoreEdit(HRegion.java:5494)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEdits(HRegion.java:4608)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEditsIfAny(HRegion.java:4404)
> {code}
> After this exception, the memstore did not roll back, and since MSLAB is 
> used, all the chunks allocated will never be released. That memory is leaked 
> forever...
> We need to roll back the memory if opening the region fails (for now, only the 
> global memstore size is decreased after failure).
> Another problem is that we use replayEditsPerRegion in RegionServerAccounting 
> to record how much memory is used during replay, and decrease the global 
> memstore size if replay fails. This is not right: during replay we may also 
> flush the memstore, so the size in the replayEditsPerRegion map is not accurate 
> at all!
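
For illustration only, here is a minimal sketch of the rollback pattern the 
description asks for, using hypothetical stand-in types rather than the real 
HRegion/RegionServerAccounting classes: account for replayed edits from the 
memstore's own size, and on failure close the memstore so its MSLAB chunks go 
back to the pool instead of leaking.

{code:java}
// Hypothetical stand-in for the region's memstore during replay.
interface ReplayMemstore {
  void add(byte[] edit);   // apply one recovered edit
  long heapSize();         // bytes currently held by this memstore
  void close();            // drop segments and return MSLAB chunks to the pool
}

final class ReplayAccounting {
  private long globalMemstoreSize;

  /** Replays edits and returns the heap size actually added to the memstore. */
  long replayRecoveredEdits(ReplayMemstore memstore, Iterable<byte[]> edits) {
    long sizeBefore = memstore.heapSize();
    try {
      for (byte[] edit : edits) {
        memstore.add(edit);
      }
      // Account from what the memstore really holds, not from a side counter
      // that can drift if a flush happens during replay.
      long delta = memstore.heapSize() - sizeBefore;
      globalMemstoreSize += delta;
      return delta;
    } catch (RuntimeException | Error e) {
      // Nothing was added to the global counter yet, so no decrement is needed;
      // closing the memstore releases the allocated MSLAB chunks.
      memstore.close();
      throw e;
    }
  }
}
{code}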



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20976) SCP can be scheduled multiple times for the same RS

2018-08-10 Thread Allan Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576210#comment-16576210
 ] 

Allan Yang commented on HBASE-20976:


{quote}
I think we'd better do it a bit more cleanly without adding too many checks...

I think here we need to make sure that the deadServers check can work and 
prevent scheduling redundant SCPs. We can do the SCP check when restarting 
because we have not started the PE yet, so it is safe; but during execution this 
is not a good idea, as there is no fencing...
{quote}
Yes, there is no fencing here... But the worst case is that a race condition lets 
us still schedule redundant SCPs, which I think is still better than the current 
behavior.
Making deadServers work is indeed a better idea, but I can't think of a better way 
to do it for now. IIRC, the dead servers are removed so that the master Web UI 
won't show a dead server there forever...

> SCP can be scheduled multiple times for the same RS
> ---
>
> Key: HBASE-20976
> URL: https://issues.apache.org/jira/browse/HBASE-20976
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Fix For: 2.0.2
>
> Attachments: HBASE-20976.branch-2.0.001.patch, 
> HBASE-20976.branch-2.0.002.patch
>
>
> SCP can be scheduled multiple times for the same RS:
> 1. An RS crashed and an SCP was submitted for it.
> 2. Before this SCP finished, the Master crashed.
> 3. The new master scanned the meta table and found some regions still open on a 
> dead server.
> 4. The new master submitted an SCP for the dead server again.
> The two SCPs for the same RS can even execute concurrently without HBASE-20846…
> A test case to reproduce this issue and a fix are provided in the patch.
> Another case where an SCP might be scheduled multiple times for the same RS (with 
> HBASE-20708):
> 1. An RS crashed and an SCP was submitted for it.
> 2. A new RS on the same host started, and the old RS's ServerName was removed 
> from DeadServer.deadServers.
> 3. After the SCP passed the Handle_RIT state, an UnassignProcedure needed to send 
> a close-region operation to the crashed RS.
> 4. The UnassignProcedure's dispatch failed with 'NoServerDispatchException'.
> 5. We began to expire the RS, but found it neither online nor in the deadServer 
> list, so an SCP was submitted for the same RS again.
>  
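
To make the dedup idea concrete, here is a minimal sketch with hypothetical names 
(not the actual ServerManager/DeadServer API) of the double-check discussed for the 
duplicate-SCP scenario above: claim the server name atomically before submitting an 
SCP, so the master restart path and the expire path cannot both schedule one.

{code:java}
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class ScpDeduplicator {
  // ServerNames (host,port,startcode) that already have an unfinished SCP.
  private final Set<String> serversWithPendingScp = ConcurrentHashMap.newKeySet();

  /** Returns true only if the caller should actually submit a new SCP for this server. */
  boolean tryClaim(String serverName) {
    // Set.add() is atomic here: only one caller wins for a given server name.
    return serversWithPendingScp.add(serverName);
  }

  /** Called when the SCP for this server finishes. */
  void release(String serverName) {
    serversWithPendingScp.remove(serverName);
  }
}
{code}

As the thread notes, without fencing this kind of check only narrows the race 
window; it does not eliminate it.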



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21035) Meta Table should be able to online even if all procedures are lost

2018-08-10 Thread Allan Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576204#comment-16576204
 ] 

Allan Yang commented on HBASE-21035:


Or, in short, my opinion is that we have to try our best to make sure the meta 
region can come online while the master is initializing; otherwise we can do 
nothing without the meta region. But after HBASE-20708 (which is indeed a great 
patch that simplifies the startup process), we obviously cannot. You can try my 
test case: without HBASE-20708, it works.

> Meta Table should be able to online even if all procedures are lost
> ---
>
> Key: HBASE-21035
> URL: https://issues.apache.org/jira/browse/HBASE-21035
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: HBASE-21035.branch-2.0.001.patch
>
>
> After HBASE-20708, we changed the way we initialize after the master starts. It 
> will only check WAL dirs and compare them to the ZooKeeper RS nodes to decide 
> which servers need to be expired. For servers whose dir ends with 'SPLITTING', 
> we assume that there will be an SCP for them.
> But if the server holding the meta region crashed before the master restarted, 
> and if all the procedure WALs are lost (due to a bug, manual deletion, whatever), 
> the newly restarted master will be stuck while initializing, since no one will 
> bring the meta region online.
> Although this is an abnormal case, I think that no matter what happens, we need 
> to be able to bring the meta region online. Otherwise we are sitting ducks; 
> nothing can be done.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21035) Meta Table should be able to online even if all procedures are lost

2018-08-10 Thread Allan Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576189#comment-16576189
 ] 

Allan Yang commented on HBASE-21035:


{quote}
You bring meta online and mess up all the data and cause unrecoverable data 
loss...
{quote}
I can't think of any condition under which bringing meta online would cause data 
loss or any other unrecoverable situation…

In HBase 1.x, if something goes wrong with the AssignmentManager, restarting the 
master fixes the problem in most cases. In some catastrophic scenarios we can even 
delete all the RIT znodes and assign the regions again. As long as HDFS/ZK are 
healthy and the RSes work normally, the meta region can at least come online no 
matter what.

But this is not the case with AMv2. All the states and procedures are persisted, 
so restarting the master results in the same state as before the restart (we are 
trying hard to ensure that...). Restarting the master won't help like it used to, 
and it is also hard to interfere with procedures. That means it is not easy for us 
to recover the system if there are any bugs in AMv2 (which is very likely...).

In some cases we do indeed need to delete all procedures to get a clean slate for 
recovery, as addressed in the doc ('Fixing regions stuck in transition in HBase 
2.0') in HBASE-19121.
But if a clean start still causes the system to hang, it is hard for other fix 
tools like HBCK to kick in.

{quote}
For me, crash or hang is much much better than doing dangerous operations in 
code...
{quote}
From my point of view, it is essential that a production-ready system can recover 
without changing any code.


> Meta Table should be able to online even if all procedures are lost
> ---
>
> Key: HBASE-21035
> URL: https://issues.apache.org/jira/browse/HBASE-21035
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: HBASE-21035.branch-2.0.001.patch
>
>
> After HBASE-20708, we changed the way we initialize after the master starts. It 
> will only check WAL dirs and compare them to the ZooKeeper RS nodes to decide 
> which servers need to be expired. For servers whose dir ends with 'SPLITTING', 
> we assume that there will be an SCP for them.
> But if the server holding the meta region crashed before the master restarted, 
> and if all the procedure WALs are lost (due to a bug, manual deletion, whatever), 
> the newly restarted master will be stuck while initializing, since no one will 
> bring the meta region online.
> Although this is an abnormal case, I think that no matter what happens, we need 
> to be able to bring the meta region online. Otherwise we are sitting ducks; 
> nothing can be done.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20976) SCP can be scheduled multiple times for the same RS

2018-08-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576186#comment-16576186
 ] 

Hadoop QA commented on HBASE-20976:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:orange}-0{color} | {color:orange} test4tests {color} | {color:orange}  
0m  0s{color} | {color:orange} The patch doesn't appear to include any new or 
modified tests. Please justify why no new tests are needed for this patch. Also 
please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} branch-2.0 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
29s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
41s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 0s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 
41s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
0s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
26s{color} | {color:green} branch-2.0 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 
46s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
10m 50s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}187m 
11s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}223m 45s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:6f01af0 |
| JIRA Issue | HBASE-20976 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12935097/HBASE-20976.branch-2.0.002.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 8949f767b952 4.4.0-130-generic #156-Ubuntu SMP Thu Jun 14 
08:53:28 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@2/component/dev-support/hbase-personality.sh
 |
| git revision | branch-2.0 / 7ee4aa459c |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC3 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/14000/testReport/ |
| Max. process+thread count | 4466 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |
| Console output | 

[jira] [Commented] (HBASE-21029) Miscount of memstore's heap/offheap size if same cell was put

2018-08-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576153#comment-16576153
 ] 

Hadoop QA commented on HBASE-21029:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
19s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-2.0 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
45s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
41s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 3s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 
56s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
59s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
30s{color} | {color:green} branch-2.0 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 
43s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
10m 31s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}188m 
33s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
17s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}225m 29s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:6f01af0 |
| JIRA Issue | HBASE-21029 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12935095/HBASE-21029.branch-2.0.002.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux ecdce4734d16 4.4.0-130-generic #156-Ubuntu SMP Thu Jun 14 
08:53:28 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | branch-2.0 / 7ee4aa459c |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC3 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13999/testReport/ |
| Max. process+thread count | 4451 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13999/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically 

[jira] [Commented] (HBASE-21030) Correct javadoc for append operation

2018-08-10 Thread Subrat Mishra (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576139#comment-16576139
 ] 

Subrat Mishra commented on HBASE-21030:
---

I would like to attach a patch for this. 

Can someone please add me as a contributor?

 

> Correct javadoc for append operation
> 
>
> Key: HBASE-21030
> URL: https://issues.apache.org/jira/browse/HBASE-21030
> Project: HBase
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 3.0.0, 1.5.0
>Reporter: Nihal Jain
>Priority: Minor
>  Labels: beginner, beginners
>
> The doc for {{append}} operation is incorrect. (see {{@param append}} in the 
> code snippet below or 
> [Table.java#L566|https://github.com/apache/hbase/blob/3f5033f88ee9da2a5a42d058b9aefe57b089b3e1/hbase-client/src/main/java/org/apache/hadoop/hbase/client/Table.java#L566])
> {code:java}
>   /**
>* Appends values to one or more columns within a single row.
>* <p>
>* This operation guaranteed atomicity to readers. Appends are done
>* under a single row lock, so write operations to a row are synchronized, 
> and
>* readers are guaranteed to see this operation fully completed.
>*
>* @param append object that specifies the columns and amounts to be used
>*  for the increment operations
>* @throws IOException e
>* @return values of columns after the append operation (maybe null)
>*/
> {code}
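
For reference, the flagged parameter description could read along these lines (an 
assumption, not necessarily the committed wording):

{code:java}
// Hypothetical fragment only, to illustrate the wording; the real method lives in Table.java.
import java.io.IOException;
import org.apache.hadoop.hbase.client.Append;
import org.apache.hadoop.hbase.client.Result;

interface AppendJavadocExample {
  /**
   * Appends values to one or more columns within a single row.
   *
   * @param append object that specifies the columns and values to be appended
   *               to the row
   * @return values of columns after the append operation (maybe null)
   * @throws IOException e
   */
  Result append(Append append) throws IOException;
}
{code}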



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-18201) add UT and docs for DataBlockEncodingTool

2018-08-10 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-18201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576124#comment-16576124
 ] 

Hudson commented on HBASE-18201:


Results for branch branch-2
[build #1088 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1088/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1088//General_Nightly_Build_Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1088//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1088//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> add UT and docs for DataBlockEncodingTool
> -
>
> Key: HBASE-18201
> URL: https://issues.apache.org/jira/browse/HBASE-18201
> Project: HBase
>  Issue Type: Sub-task
>  Components: tooling
>Reporter: Chia-Ping Tsai
>Assignee: Kuan-Po Tseng
>Priority: Minor
>  Labels: beginner
> Fix For: 3.0.0, 2.2.0, 2.1.1
>
> Attachments: HBASE-18201.master.001.patch, 
> HBASE-18201.master.002.patch, HBASE-18201.master.002.patch, 
> HBASE-18201.master.003.patch, HBASE-18201.master.004.patch, 
> HBASE-18201.master.005.patch, HBASE-18201.master.005.patch, 
> HBASE-18201.master.005.patch, HBASE-18201.master.006.patch, 
> HBASE-18201.master.006.patch
>
>
> There are no examples, documents, or tests for DataBlockEncodingTool. We should 
> make it friendly if any use case exists. Otherwise, we should just get rid of 
> it, because DataBlockEncodingTool presumes that the cell implementation 
> returned from DataBlockEncoder is KeyValue. That assumption may obstruct the 
> cleanup of KeyValue references in the read/write path of the code base.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-18201) add UT and docs for DataBlockEncodingTool

2018-08-10 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-18201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576111#comment-16576111
 ] 

Hudson commented on HBASE-18201:


Results for branch branch-2.1
[build #166 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/166/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/166//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/166//JDK8_Nightly_Build_Report_(Hadoop2)/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/166//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> add UT and docs for DataBlockEncodingTool
> -
>
> Key: HBASE-18201
> URL: https://issues.apache.org/jira/browse/HBASE-18201
> Project: HBase
>  Issue Type: Sub-task
>  Components: tooling
>Reporter: Chia-Ping Tsai
>Assignee: Kuan-Po Tseng
>Priority: Minor
>  Labels: beginner
> Fix For: 3.0.0, 2.2.0, 2.1.1
>
> Attachments: HBASE-18201.master.001.patch, 
> HBASE-18201.master.002.patch, HBASE-18201.master.002.patch, 
> HBASE-18201.master.003.patch, HBASE-18201.master.004.patch, 
> HBASE-18201.master.005.patch, HBASE-18201.master.005.patch, 
> HBASE-18201.master.005.patch, HBASE-18201.master.006.patch, 
> HBASE-18201.master.006.patch
>
>
> There are no examples, documents, or tests for DataBlockEncodingTool. We should 
> make it friendly if any use case exists. Otherwise, we should just get rid of 
> it, because DataBlockEncodingTool presumes that the cell implementation 
> returned from DataBlockEncoder is KeyValue. That assumption may obstruct the 
> cleanup of KeyValue references in the read/write path of the code base.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-20540) [umbrella] Hadoop 3 compatibility

2018-08-10 Thread Pavel (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576098#comment-16576098
 ] 

Pavel edited comment on HBASE-20540 at 8/10/18 10:56 AM:
-

It looks like the Hadoop 3.1 branch has a production-ready release now

[Apache Hadoop 3.1.1 
release|https://lists.apache.org/thread.html/895f28e0941b37f006812afa383ff8ff9148fafc4a5be385aebd0fa1@%3Cgeneral.hadoop.apache.org%3E]


was (Author: pkirillov):
It looks like hadoop 3.1 bunch has production ready release now

[[ANNOUNCE] Apache Hadoop 3.1.1 
release|https://lists.apache.org/thread.html/895f28e0941b37f006812afa383ff8ff9148fafc4a5be385aebd0fa1@%3Cgeneral.hadoop.apache.org%3E]

> [umbrella] Hadoop 3 compatibility
> -
>
> Key: HBASE-20540
> URL: https://issues.apache.org/jira/browse/HBASE-20540
> Project: HBase
>  Issue Type: Umbrella
>Reporter: Duo Zhang
>Priority: Major
> Fix For: 2.0.2, 2.1.1
>
>
> There are known issues with Hadoop 3 compatibility for HBase 2, but Hadoop 3 is 
> still not production ready. So we will link the issues here, and once there is a 
> production-ready Hadoop 3 release, we will fix these issues soon, upgrade our 
> dependencies on Hadoop, and also update the support matrix.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20540) [umbrella] Hadoop 3 compatibility

2018-08-10 Thread Pavel (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576098#comment-16576098
 ] 

Pavel commented on HBASE-20540:
---

It looks like the Hadoop 3.1 branch has a production-ready release now

[[ANNOUNCE] Apache Hadoop 3.1.1 
release|https://lists.apache.org/thread.html/895f28e0941b37f006812afa383ff8ff9148fafc4a5be385aebd0fa1@%3Cgeneral.hadoop.apache.org%3E]

> [umbrella] Hadoop 3 compatibility
> -
>
> Key: HBASE-20540
> URL: https://issues.apache.org/jira/browse/HBASE-20540
> Project: HBase
>  Issue Type: Umbrella
>Reporter: Duo Zhang
>Priority: Major
> Fix For: 2.0.2, 2.1.1
>
>
> There are known issues with Hadoop 3 compatibility for HBase 2, but Hadoop 3 is 
> still not production ready. So we will link the issues here, and once there is a 
> production-ready Hadoop 3 release, we will fix these issues soon, upgrade our 
> dependencies on Hadoop, and also update the support matrix.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21035) Meta Table should be able to online even if all procedures are lost

2018-08-10 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576096#comment-16576096
 ] 

Duo Zhang commented on HBASE-21035:
---

Anyway, I'm -1 on doing anything automatically to try to recover if the core 
systems are broken. Here you are removing all the procedures, so the solution may 
be fine, as the error cannot be recovered from any more. But what if it is just a 
permission problem or something else? You bring meta online, mess up all the data, 
and cause unrecoverable data loss...

For me, crash or hang is much much better than doing dangerous operations in 
code...

> Meta Table should be able to online even if all procedures are lost
> ---
>
> Key: HBASE-21035
> URL: https://issues.apache.org/jira/browse/HBASE-21035
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: HBASE-21035.branch-2.0.001.patch
>
>
> After HBASE-20708, we changed the way we initialize after the master starts. It 
> will only check WAL dirs and compare them to the ZooKeeper RS nodes to decide 
> which servers need to be expired. For servers whose dir ends with 'SPLITTING', 
> we assume that there will be an SCP for them.
> But if the server holding the meta region crashed before the master restarted, 
> and if all the procedure WALs are lost (due to a bug, manual deletion, whatever), 
> the newly restarted master will be stuck while initializing, since no one will 
> bring the meta region online.
> Although this is an abnormal case, I think that no matter what happens, we need 
> to be able to bring the meta region online. Otherwise we are sitting ducks; 
> nothing can be done.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20976) SCP can be scheduled multiple times for the same RS

2018-08-10 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576095#comment-16576095
 ] 

Duo Zhang commented on HBASE-20976:
---

I think we'd better do it a bit more cleanly without adding too many checks...

I think here we need to make sure that the deadServers check can work and 
prevent scheduling redundant SCPs. We can do the SCP check when restarting 
because we have not started the PE yet, so it is safe; but during execution this 
is not a good idea, as there is no fencing...

> SCP can be scheduled multiple times for the same RS
> ---
>
> Key: HBASE-20976
> URL: https://issues.apache.org/jira/browse/HBASE-20976
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Fix For: 2.0.2
>
> Attachments: HBASE-20976.branch-2.0.001.patch, 
> HBASE-20976.branch-2.0.002.patch
>
>
> SCP can be scheduled multiple times for the same RS:
> 1. An RS crashed and an SCP was submitted for it.
> 2. Before this SCP finished, the Master crashed.
> 3. The new master scanned the meta table and found some regions still open on a 
> dead server.
> 4. The new master submitted an SCP for the dead server again.
> The two SCPs for the same RS can even execute concurrently without HBASE-20846…
> A test case to reproduce this issue and a fix are provided in the patch.
> Another case where an SCP might be scheduled multiple times for the same RS (with 
> HBASE-20708):
> 1. An RS crashed and an SCP was submitted for it.
> 2. A new RS on the same host started, and the old RS's ServerName was removed 
> from DeadServer.deadServers.
> 3. After the SCP passed the Handle_RIT state, an UnassignProcedure needed to send 
> a close-region operation to the crashed RS.
> 4. The UnassignProcedure's dispatch failed with 'NoServerDispatchException'.
> 5. We began to expire the RS, but found it neither online nor in the deadServer 
> list, so an SCP was submitted for the same RS again.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21035) Meta Table should be able to online even if all procedures are lost

2018-08-10 Thread Allan Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576089#comment-16576089
 ] 

Allan Yang commented on HBASE-21035:


{quote}
I think this should be a tool in HBCK? As in the normal code path, if all the 
procedures are lost, we do not know the best way to address this problem; forcing 
the meta online may introduce more problems...
{quote}
IIRC, tools like HBCK depend on meta being online to do their scans...

> Meta Table should be able to online even if all procedures are lost
> ---
>
> Key: HBASE-21035
> URL: https://issues.apache.org/jira/browse/HBASE-21035
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: HBASE-21035.branch-2.0.001.patch
>
>
> After HBASE-20708, we changed the way we initialize after the master starts. It 
> will only check WAL dirs and compare them to the ZooKeeper RS nodes to decide 
> which servers need to be expired. For servers whose dir ends with 'SPLITTING', 
> we assume that there will be an SCP for them.
> But if the server holding the meta region crashed before the master restarted, 
> and if all the procedure WALs are lost (due to a bug, manual deletion, whatever), 
> the newly restarted master will be stuck while initializing, since no one will 
> bring the meta region online.
> Although this is an abnormal case, I think that no matter what happens, we need 
> to be able to bring the meta region online. Otherwise we are sitting ducks; 
> nothing can be done.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20976) SCP can be scheduled multiple times for the same RS

2018-08-10 Thread Allan Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576082#comment-16576082
 ] 

Allan Yang commented on HBASE-20976:


{code}
I think there may still be races? As if the previous SCP has also been done and 
removed from ProcedureExecutor, and then the UnassignProcedure tries to expire 
the server...
{code}
Maybe I deleted some procedure WALs, which caused this. But whatever, I think a 
double check won't hurt.

> SCP can be scheduled multiple times for the same RS
> ---
>
> Key: HBASE-20976
> URL: https://issues.apache.org/jira/browse/HBASE-20976
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Fix For: 2.0.2
>
> Attachments: HBASE-20976.branch-2.0.001.patch, 
> HBASE-20976.branch-2.0.002.patch
>
>
> SCP can be scheduled multiple times for the same RS:
> 1. An RS crashed and an SCP was submitted for it.
> 2. Before this SCP finished, the Master crashed.
> 3. The new master scanned the meta table and found some regions still open on a 
> dead server.
> 4. The new master submitted an SCP for the dead server again.
> The two SCPs for the same RS can even execute concurrently without HBASE-20846…
> A test case to reproduce this issue and a fix are provided in the patch.
> Another case where an SCP might be scheduled multiple times for the same RS (with 
> HBASE-20708):
> 1. An RS crashed and an SCP was submitted for it.
> 2. A new RS on the same host started, and the old RS's ServerName was removed 
> from DeadServer.deadServers.
> 3. After the SCP passed the Handle_RIT state, an UnassignProcedure needed to send 
> a close-region operation to the crashed RS.
> 4. The UnassignProcedure's dispatch failed with 'NoServerDispatchException'.
> 5. We began to expire the RS, but found it neither online nor in the deadServer 
> list, so an SCP was submitted for the same RS again.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21035) Meta Table should be able to online even if all procedures are lost

2018-08-10 Thread Allan Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allan Yang updated HBASE-21035:
---
Attachment: HBASE-21035.branch-2.0.001.patch

> Meta Table should be able to online even if all procedures are lost
> ---
>
> Key: HBASE-21035
> URL: https://issues.apache.org/jira/browse/HBASE-21035
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: HBASE-21035.branch-2.0.001.patch
>
>
> After HBASE-20708, we changed the way we initialize after the master starts. It 
> will only check WAL dirs and compare them to the ZooKeeper RS nodes to decide 
> which servers need to be expired. For servers whose dir ends with 'SPLITTING', 
> we assume that there will be an SCP for them.
> But if the server holding the meta region crashed before the master restarted, 
> and if all the procedure WALs are lost (due to a bug, manual deletion, whatever), 
> the newly restarted master will be stuck while initializing, since no one will 
> bring the meta region online.
> Although this is an abnormal case, I think that no matter what happens, we need 
> to be able to bring the meta region online. Otherwise we are sitting ducks; 
> nothing can be done.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21025) Add cache for TableStateManager

2018-08-10 Thread Duo Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-21025:
--
Fix Version/s: 2.0.2

> Add cache for TableStateManager
> ---
>
> Key: HBASE-21025
> URL: https://issues.apache.org/jira/browse/HBASE-21025
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.2.0, 2.1.1
>
> Attachments: HBASE-21025-v1.patch, HBASE-21025-v2.patch, 
> HBASE-21025.patch
>
>
> After HBASE-20881, we will check whether a table is disabled in SCP, so we 
> need to add a cache for it to improve MTTR and also reduce the requests to meta.
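
For illustration, a minimal sketch of the caching idea with hypothetical names (not 
the actual TableStateManager change): keep the last known state per table in memory 
so the SCP check does not need a meta read every time, and update the cache whenever 
a state change is persisted.

{code:java}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

final class CachedTableStates {
  enum State { ENABLED, DISABLED, DISABLING, ENABLING }

  private final ConcurrentMap<String, State> cache = new ConcurrentHashMap<>();

  /** Reads through to meta (via the supplied reader) only on a cache miss. */
  State getState(String tableName, Function<String, State> metaReader) {
    return cache.computeIfAbsent(tableName, metaReader);
  }

  /** Must be called on every state change written to meta, to keep the cache coherent. */
  void setState(String tableName, State newState) {
    cache.put(tableName, newState);
  }
}
{code}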



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21025) Add cache for TableStateManager

2018-08-10 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576075#comment-16576075
 ] 

Duo Zhang commented on HBASE-21025:
---

{noformat}
2018-08-10 08:45:29,568 ERROR [RS:0;b6feabd0a074:55588] 
coprocessor.CoprocessorHost(398): The coprocessor 
org.apache.hadoop.hbase.JMXListener threw java.rmi.server.ExportException: Port 
already in use: 59872; nested exception is: 
java.net.BindException: Address already in use (Bind failed)
java.rmi.server.ExportException: Port already in use: 59872; nested exception 
is: 
java.net.BindException: Address already in use (Bind failed)
at sun.rmi.transport.tcp.TCPTransport.listen(TCPTransport.java:346)
at 
sun.rmi.transport.tcp.TCPTransport.exportObject(TCPTransport.java:254)
at sun.rmi.transport.tcp.TCPEndpoint.exportObject(TCPEndpoint.java:411)
at sun.rmi.transport.LiveRef.exportObject(LiveRef.java:147)
at 
sun.rmi.server.UnicastServerRef.exportObject(UnicastServerRef.java:236)
at sun.rmi.registry.RegistryImpl.setup(RegistryImpl.java:213)
at sun.rmi.registry.RegistryImpl.<init>(RegistryImpl.java:198)
at 
java.rmi.registry.LocateRegistry.createRegistry(LocateRegistry.java:203)
at 
org.apache.hadoop.hbase.JMXListener.startConnectorServer(JMXListener.java:134)
at org.apache.hadoop.hbase.JMXListener.start(JMXListener.java:209)
at 
org.apache.hadoop.hbase.coprocessor.BaseEnvironment.startup(BaseEnvironment.java:72)
at 
org.apache.hadoop.hbase.coprocessor.CoprocessorHost.checkAndLoadInstance(CoprocessorHost.java:263)
at 
org.apache.hadoop.hbase.coprocessor.CoprocessorHost.loadSystemCoprocessors(CoprocessorHost.java:157)
at 
org.apache.hadoop.hbase.regionserver.RegionServerCoprocessorHost.<init>(RegionServerCoprocessorHost.java:70)
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:943)
at 
org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer.runRegionServer(MiniHBaseCluster.java:184)
at 
org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer.access$000(MiniHBaseCluster.java:130)
at 
org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer$1.run(MiniHBaseCluster.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:360)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1742)
at 
org.apache.hadoop.hbase.security.User$SecureHadoopUser.runAs(User.java:341)
at 
org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer.run(MiniHBaseCluster.java:165)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.BindException: Address already in use (Bind failed)
at java.net.PlainSocketImpl.socketBind(Native Method)
at 
java.net.AbstractPlainSocketImpl.bind(AbstractPlainSocketImpl.java:387)
at java.net.ServerSocket.bind(ServerSocket.java:375)
at java.net.ServerSocket.<init>(ServerSocket.java:237)
at java.net.ServerSocket.<init>(ServerSocket.java:128)
at 
sun.rmi.transport.proxy.RMIDirectSocketFactory.createServerSocket(RMIDirectSocketFactory.java:45)
at 
sun.rmi.transport.proxy.RMIMasterSocketFactory.createServerSocket(RMIMasterSocketFactory.java:345)
at 
sun.rmi.transport.tcp.TCPEndpoint.newServerSocket(TCPEndpoint.java:666)
at sun.rmi.transport.tcp.TCPTransport.listen(TCPTransport.java:335)
... 23 more
{noformat}

This is the cause of the UT failure; it is not related to the patch here.

Let me commit.

> Add cache for TableStateManager
> ---
>
> Key: HBASE-21025
> URL: https://issues.apache.org/jira/browse/HBASE-21025
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.2.0, 2.1.1
>
> Attachments: HBASE-21025-v1.patch, HBASE-21025-v2.patch, 
> HBASE-21025.patch
>
>
> After HBASE-20881, we will check whether a table is disabled in SCP, so we 
> need to add a cache for it to improve MTTR and also reduce the requests to meta.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

