[jira] [Commented] (HBASE-21102) ServerCrashProcedure should select target server where no other replicas exist for the current region

2018-09-18 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16619886#comment-16619886
 ] 

Ted Yu commented on HBASE-21102:


From 
https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1267/testReport/junit/org.apache.hadoop.hbase.master.balancer/TestRSGroupBasedLoadBalancer/health_checks___yetus_jdk8_hadoop3_checks___testRetainAssignment/
:
{code}
java.lang.NullPointerException
at 
org.apache.hadoop.hbase.master.balancer.TestRSGroupBasedLoadBalancer.testRetainAssignment(TestRSGroupBasedLoadBalancer.java:166)
{code}
It was due to this code:
{code}
  if (this.services != null && this.services.getAssignmentManager() != null) { // for tests
    if (!hasRegionReplica
        && this.services.getAssignmentManager().getRegionStates()
            .isReplicaAvailableForRegion(region)) {
{code}
this.services.getAssignmentManager().getRegionStates() may return null.
Ram:
Can you take a look at 21102.addendum2.txt ?

The above test passes with the additional check.
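The extra guard can be sketched as follows. The classes below are minimal stand-ins for the real AssignmentManager/RegionStates, so this only illustrates the null-safe call chain, not the actual addendum:

```java
// Stand-ins for the real HBase classes (assumed shapes, illustration only).
class AssignmentManager {
    RegionStates regionStates; // may be null, e.g. in tests
    RegionStates getRegionStates() { return regionStates; }
}

class RegionStates {
    boolean isReplicaAvailableForRegion(String region) { return true; }
}

public class NullGuardSketch {
    // Guard every link of the call chain before dereferencing it.
    static boolean replicaAvailable(AssignmentManager am, String region) {
        return am != null
            && am.getRegionStates() != null
            && am.getRegionStates().isReplicaAvailableForRegion(region);
    }

    public static void main(String[] args) {
        AssignmentManager am = new AssignmentManager(); // regionStates == null
        System.out.println(replicaAvailable(am, "r1")); // false, no NPE
        am.regionStates = new RegionStates();
        System.out.println(replicaAvailable(am, "r1")); // true
    }
}
```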

> ServerCrashProcedure should select target server where no other replicas 
> exist for the current region
> -
>
> Key: HBASE-21102
> URL: https://issues.apache.org/jira/browse/HBASE-21102
> Project: HBase
>  Issue Type: Bug
>  Components: Region Assignment
>Affects Versions: 3.0.0, 2.2.0
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Major
> Attachments: 21102.addendum2.txt, HBASE-21102_1.patch, 
> HBASE-21102_2.patch, HBASE-21102_3.patch, HBASE-21102_4.patch, 
> HBASE-21102_addendum.patch, HBASE-21102_addendum.patch, 
> HBASE-21102_addendum.patch, HBASE-21102_branch-2.1.patch, 
> HBASE-21102_branch-2.1.patch, HBASE-21102_initial.patch
>
>
> Currently, when a server hosting region replicas crashes, there is no 
> guarantee that the target server chosen for a replica region assignment hosts 
> no other replica of the region being assigned. Today the assignment is done 
> randomly, and the load balancer later identifies such cases and issues MOVEs 
> for those regions. It would be better to pick target servers so that, at a 
> minimum, replicas are not colocated.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21102) ServerCrashProcedure should select target server where no other replicas exist for the current region

2018-09-18 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-21102:
---
Attachment: 21102.addendum2.txt

> ServerCrashProcedure should select target server where no other replicas 
> exist for the current region
> -
>
> Key: HBASE-21102
> URL: https://issues.apache.org/jira/browse/HBASE-21102
> Project: HBase
>  Issue Type: Bug
>  Components: Region Assignment
>Affects Versions: 3.0.0, 2.2.0
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Major
> Attachments: 21102.addendum2.txt, HBASE-21102_1.patch, 
> HBASE-21102_2.patch, HBASE-21102_3.patch, HBASE-21102_4.patch, 
> HBASE-21102_addendum.patch, HBASE-21102_addendum.patch, 
> HBASE-21102_addendum.patch, HBASE-21102_branch-2.1.patch, 
> HBASE-21102_branch-2.1.patch, HBASE-21102_initial.patch
>
>
> Currently, when a server hosting region replicas crashes, there is no 
> guarantee that the target server chosen for a replica region assignment hosts 
> no other replica of the region being assigned. Today the assignment is done 
> randomly, and the load balancer later identifies such cases and issues MOVEs 
> for those regions. It would be better to pick target servers so that, at a 
> minimum, replicas are not colocated.





[jira] [Commented] (HBASE-21196) HTableMultiplexer clears the meta cache after every put operation

2018-09-18 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16619836#comment-16619836
 ] 

Ted Yu commented on HBASE-21196:


lgtm

> HTableMultiplexer clears the meta cache after every put operation
> -
>
> Key: HBASE-21196
> URL: https://issues.apache.org/jira/browse/HBASE-21196
> Project: HBase
>  Issue Type: Bug
>  Components: Performance
>Affects Versions: 3.0.0, 1.3.3, 2.2.0
>Reporter: Nihal Jain
>Assignee: Nihal Jain
>Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HBASE-21196.master.001.patch, 
> HBASE-21196.master.001.patch, HBASE-21196.master.002.patch, 
> HTableMultiplexer1000Puts.UT.txt
>
>
> *Problem:* Operations that use the 
> {{AsyncRequestFutureImpl.receiveMultiAction(MultiAction, ServerName, 
> MultiResponse, int)}} API with tableName set to null reset the meta cache of 
> the corresponding server after each call. One such operation is the put 
> operation of HTableMultiplexer (it might not be the only one). This may 
> severely impact the performance of the system, as every new op directed to 
> that server first has to go to ZK to get the meta table address and then get 
> the location of the table region, since the cache becomes empty after every 
> HTableMultiplexer put.
> From the logs below, one can see that after every other put the cached region 
> locations are cleared. As a side effect of this, before every put the client 
> needs to contact ZK to get the meta table location and then read meta to get 
> the region locations of the table.
> {noformat}
> 2018-09-13 22:21:15,467 TRACE [htable-pool11-t1] client.MetaCache(283): 
> Removed all cached region locations that map to 
> root1-thinkpad-t440p,35811,1536857446588
> 2018-09-13 22:21:15,467 DEBUG [HTableFlushWorker-5] 
> client.HTableMultiplexer$FlushWorker(632): Processed 1 put requests for 
> root1-ThinkPad-T440p:35811 and 0 failed, latency for this send: 5
> 2018-09-13 22:21:15,515 TRACE 
> [RpcServer.reader=1,bindAddress=root1-ThinkPad-T440p,port=35811] 
> ipc.RpcServer$Connection(1954): RequestHeader call_id: 218 method_name: "Get" 
> request_param: true priority: 0 timeout: 6 totalRequestSize: 137 bytes
> 2018-09-13 22:21:15,515 TRACE 
> [RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] 
> ipc.CallRunner(105): callId: 218 service: ClientService methodName: Get size: 
> 137 connection: 127.0.0.1:42338 executing as root1
> 2018-09-13 22:21:15,515 TRACE 
> [RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] 
> ipc.RpcServer(2356): callId: 218 service: ClientService methodName: Get size: 
> 137 connection: 127.0.0.1:42338 param: region= 
> testHTableMultiplexer_1,,1536857451720.304d914b641a738624937c7f9b4d684f., 
> row=\x00\x00\x00\xC4 connection: 127.0.0.1:42338, response result { 
> associated_cell_count: 1 stale: false } queueTime: 0 processingTime: 0 
> totalTime: 0
> 2018-09-13 22:21:15,516 TRACE 
> [RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] 
> io.BoundedByteBufferPool(106): runningAverage=16384, totalCapacity=0, 
> count=0, allocations=1
> 2018-09-13 22:21:15,516 TRACE [main] ipc.AbstractRpcClient(236): Call: Get, 
> callTime: 2ms
> 2018-09-13 22:21:15,516 TRACE [main] client.ClientScanner(122): Scan 
> table=hbase:meta, 
> startRow=testHTableMultiplexer_1,\x00\x00\x00\xC5,99
> 2018-09-13 22:21:15,516 TRACE [main] client.ClientSmallReversedScanner(179): 
> Advancing internal small scanner to startKey at 
> 'testHTableMultiplexer_1,\x00\x00\x00\xC5,99'
> 2018-09-13 22:21:15,517 TRACE [main] client.ZooKeeperRegistry(59): Looking up 
> meta region location in ZK, 
> connection=org.apache.hadoop.hbase.client.ZooKeeperRegistry@599f571f
> {noformat}
> From the minicluster logs [^HTableMultiplexer1000Puts.UT.txt] one can see 
> that the string "Removed all cached region locations that map" and "Looking 
> up meta region location in ZK" are present for every put.
> *Analysis:*
>  The problem occurs because the {{cleanServerCache}} method always clears the 
> server cache when tableName is null and the exception is null. See 
> [AsyncRequestFutureImpl.java#L918|https://github.com/apache/hbase/blob/5d14c1af65c02f4e87059337c35e4431505de91c/hbase-client/src/main/java/org/apache/hadoop/hbase/client/AsyncRequestFutureImpl.java#L918]
> {code:java}
> private void cleanServerCache(ServerName server, Throwable regionException) {
> if (tableName == null && 
> ClientExceptionsUtil.isMetaClearingExceptio

[jira] [Updated] (KYLIN-2650) Update to Apache Calcite Avatica 1.12.0

2018-09-18 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-2650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated KYLIN-2650:
--
Summary: Update to Apache Calcite Avatica 1.12.0  (was: Update to Apache 
Calcite Avatica 1.10.0)

> Update to Apache Calcite Avatica 1.12.0
> ---
>
> Key: KYLIN-2650
> URL: https://issues.apache.org/jira/browse/KYLIN-2650
> Project: Kylin
>  Issue Type: Improvement
>    Reporter: Ted Yu
>Priority: Minor
>
> Apache Calcite Avatica 1.10.0 has just been released.
> This issue upgrades Avatica dependency.





[jira] [Updated] (KYLIN-3095) Use ArrayDeque instead of LinkedList for queue implementation

2018-09-18 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-3095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated KYLIN-3095:
--
Description: 
Use ArrayDeque instead of LinkedList for queue implementation where thread 
safety is not needed.

From https://docs.oracle.com/javase/7/docs/api/java/util/ArrayDeque.html
{quote}
Resizable-array implementation of the Deque interface. Array deques have no 
capacity restrictions; they grow as necessary to support usage. They are not 
thread-safe; in the absence of external synchronization, they do not support 
concurrent access by multiple threads. Null elements are prohibited. This class 
is likely to be faster than Stack when used as a stack, and *faster than 
LinkedList when used as a queue.*
{quote}

  was:
Use ArrayDeque instead of LinkedList for queue implementation where thread 
safety is not needed.

From https://docs.oracle.com/javase/7/docs/api/java/util/ArrayDeque.html

{quote}
Resizable-array implementation of the Deque interface. Array deques have no 
capacity restrictions; they grow as necessary to support usage. They are not 
thread-safe; in the absence of external synchronization, they do not support 
concurrent access by multiple threads. Null elements are prohibited. This class 
is likely to be faster than Stack when used as a stack, and *faster than 
LinkedList when used as a queue.*
{quote}


> Use ArrayDeque instead of LinkedList for queue implementation
> -
>
> Key: KYLIN-3095
> URL: https://issues.apache.org/jira/browse/KYLIN-3095
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee:  Kaige Liu
>Priority: Minor
>  Labels: parallel
> Fix For: Backlog
>
>
> Use ArrayDeque instead of LinkedList for queue implementation where thread 
> safety is not needed.
> From https://docs.oracle.com/javase/7/docs/api/java/util/ArrayDeque.html
> {quote}
> Resizable-array implementation of the Deque interface. Array deques have no 
> capacity restrictions; they grow as necessary to support usage. They are not 
> thread-safe; in the absence of external synchronization, they do not support 
> concurrent access by multiple threads. Null elements are prohibited. This 
> class is likely to be faster than Stack when used as a stack, and *faster 
> than LinkedList when used as a queue.*
> {quote}
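For reference, using {{ArrayDeque}} as a single-threaded FIFO queue is a drop-in change for most {{LinkedList}} queue usages; a minimal sketch:

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class DequeSketch {
    public static void main(String[] args) {
        // ArrayDeque as a single-threaded FIFO queue:
        // offer() appends at the tail, poll() removes from the head.
        Deque<String> queue = new ArrayDeque<>();
        queue.offer("a");
        queue.offer("b");
        System.out.println(queue.poll()); // a
        System.out.println(queue.poll()); // b
    }
}
```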





[jira] [Updated] (KYLIN-3272) Upgrade Spark dependency to 2.3.1

2018-09-18 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-3272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated KYLIN-3272:
--
Description: 
Currently Spark 2.1.2 is used.

Spark 2.3.1 was released.
We should upgrade the dependency to 2.3.1

  was:
Currently Spark 2.1.2 is used.

Spark 2.3.0 was just released.
We should upgrade the dependency to 2.3.0


> Upgrade Spark dependency to 2.3.1
> -
>
> Key: KYLIN-3272
> URL: https://issues.apache.org/jira/browse/KYLIN-3272
> Project: Kylin
>  Issue Type: Improvement
>  Components: Spark Engine
>    Reporter: Ted Yu
>Priority: Minor
>
> Currently Spark 2.1.2 is used.
> Spark 2.3.1 was released.
> We should upgrade the dependency to 2.3.1





[jira] [Comment Edited] (KYLIN-3334) Prepare for Java 10

2018-09-18 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16473200#comment-16473200
 ] 

Ted Yu edited comment on KYLIN-3334 at 9/18/18 4:17 PM:


Compiling against jdk 11 gives similar error.


was (Author: yuzhih...@gmail.com):
Compiling against jdk 11 gives similar error .

> Prepare for Java 10
> ---
>
> Key: KYLIN-3334
> URL: https://issues.apache.org/jira/browse/KYLIN-3334
> Project: Kylin
>  Issue Type: Task
>    Reporter: Ted Yu
>Priority: Major
>
> When compiling with Java 10 , MapReduce Engine module fails with
> {code}
> [ERROR] Failed to execute goal on project kylin-engine-mr: Could not resolve 
> dependencies for project org.apache.kylin:kylin-engine-mr:jar:2.4.0-SNAPSHOT: 
> Could not find artifact jdk.tools:jdk.tools:jar:1.7 at specified path 
> /a/jdk-10/../lib/tools.jar -> [Help 1]
> {code}





[jira] [Commented] (HBASE-21206) Scan with batch size may return incomplete cells

2018-09-18 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16619268#comment-16619268
 ] 

Ted Yu commented on HBASE-21206:


Can you attach patch for branch-1 ?

Thanks

> Scan with batch size may return incomplete cells
> 
>
> Key: HBASE-21206
> URL: https://issues.apache.org/jira/browse/HBASE-21206
> Project: HBase
>  Issue Type: Bug
>  Components: scan
>Reporter: Zheng Hu
>Assignee: Zheng Hu
>Priority: Critical
> Fix For: 3.0.0, 1.5.0, 1.3.3, 1.2.8, 2.2.0, 1.4.8, 2.1.1, 2.0.3
>
> Attachments: HBASE-21206.v1.patch, HBASE-21206.v1.patch, ut.patch
>
>
> See the attached UT. The table has 5 columns and each column has at least 
> one cell in it, but when we scan the table with batchSize=3 we only get 3 
> cells returned; the other 2 cells are lost.
> It's a critical bug and should be fixed.





[jira] [Commented] (HBASE-21208) Bytes#toShort doesn't work without unsafe

2018-09-18 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16619230#comment-16619230
 ] 

Ted Yu commented on HBASE-21208:


+1

> Bytes#toShort doesn't work without unsafe
> -
>
> Key: HBASE-21208
> URL: https://issues.apache.org/jira/browse/HBASE-21208
> Project: HBase
>  Issue Type: Bug
>Reporter: Chia-Ping Tsai
>Assignee: Chia-Ping Tsai
>Priority: Critical
> Fix For: 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-21208.v0.patch
>
>
> It seems we put the brackets in the wrong place.
> {code}
>   short n = 0;
>   n = (short) ((n ^ bytes[offset]) & 0xFF);
>   n = (short) (n << 8);
>   n = (short) ((n ^ bytes[offset+1]) & 0xFF);   // this one
>   return n;
> {code}
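A corrected version masks each input byte before XOR-ing it into the accumulator, rather than masking the accumulated value (which wipes out the high byte already shifted into {{n}}). A minimal stand-alone sketch (method name is illustrative):

```java
public class ToShortSketch {
    // Big-endian decode of two bytes into a short. The fix: apply & 0xFF to
    // the byte being mixed in, not to the accumulated value, so the high
    // byte already shifted into n is preserved.
    static short toShort(byte[] bytes, int offset) {
        short n = 0;
        n = (short) (n ^ (bytes[offset] & 0xFF));
        n = (short) (n << 8);
        n = (short) (n ^ (bytes[offset + 1] & 0xFF));
        return n;
    }

    public static void main(String[] args) {
        byte[] b = { (byte) 0x12, (byte) 0x34 };
        System.out.println(toShort(b, 0)); // 4660 (0x1234)
    }
}
```

With the original bracketing, the final {{& 0xFF}} would truncate the result of the second XOR to a single byte.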





[jira] [Updated] (HBASE-21204) NPE when scan raw DELETE_FAMILY_VERSION and codec is not set

2018-09-17 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-21204:
---
Status: Patch Available  (was: Open)

> NPE when scan raw DELETE_FAMILY_VERSION and codec is not set
> 
>
> Key: HBASE-21204
> URL: https://issues.apache.org/jira/browse/HBASE-21204
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.0.0, 2.1.0, 2.2.0
>Reporter: Jingyun Tian
>Assignee: Jingyun Tian
>Priority: Major
> Fix For: 2.2.0, 2.0.0, 2.1.0
>
> Attachments: HBASE-21204.master.001.patch, 
> HBASE-21204.master.002.patch
>
>
> There are 7 types of our Cell,
> Minimum((byte)0),
> Put((byte)4),
> Delete((byte)8),
> DeleteFamilyVersion((byte)10),
> DeleteColumn((byte)12),
> DeleteFamily((byte)14),
> Maximum((byte)255);
> But there are only 6 types of our CellType protobuf definition:
> enum CellType {
> MINIMUM = 0;
> PUT = 4;
> DELETE = 8;
> DELETE_FAMILY_VERSION = 10;
> DELETE_COLUMN = 12;
> DELETE_FAMILY = 14;
> MAXIMUM = 255;
> }
> Thus if we scan raw data which is DELETE_FAMILY_VERSION, it will throw an NPE.





[jira] [Commented] (HDFS-6092) DistributedFileSystem#getCanonicalServiceName() and DistributedFileSystem#getUri() may return inconsistent results w.r.t. port

2018-09-17 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-6092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16618259#comment-16618259
 ] 

Ted Yu commented on HDFS-6092:
--

Test failure was not related.

> DistributedFileSystem#getCanonicalServiceName() and 
> DistributedFileSystem#getUri() may return inconsistent results w.r.t. port
> --
>
> Key: HDFS-6092
> URL: https://issues.apache.org/jira/browse/HDFS-6092
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.3.0
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
>  Labels: BB2015-05-TBR
> Attachments: HDFS-6092-v4.patch, HDFS-6092-v5.patch, 
> haosdent-HDFS-6092-v2.patch, haosdent-HDFS-6092.patch, hdfs-6092-v1.txt, 
> hdfs-6092-v2.txt, hdfs-6092-v3.txt
>
>
> I discovered this when working on HBASE-10717
> Here is sample code to reproduce the problem:
> {code}
> Path desPath = new Path("hdfs://127.0.0.1/");
> FileSystem desFs = desPath.getFileSystem(conf);
> 
> String s = desFs.getCanonicalServiceName();
> URI uri = desFs.getUri();
> {code}
> The canonical service name string contains the default port (8020),
> but the URI doesn't contain a port.
> This would result in the following exception:
> {code}
> testIsSameHdfs(org.apache.hadoop.hbase.util.TestFSHDFSUtils)  Time elapsed: 
> 0.001 sec  <<< ERROR!
> java.lang.IllegalArgumentException: port out of range:-1
> at java.net.InetSocketAddress.checkPort(InetSocketAddress.java:143)
> at java.net.InetSocketAddress.(InetSocketAddress.java:224)
> at 
> org.apache.hadoop.hbase.util.FSHDFSUtils.getNNAddresses(FSHDFSUtils.java:88)
> {code}
> Thanks to Brando Li who helped debug this.
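The inconsistency is easy to demonstrate with plain {{java.net.URI}}: a URI with no explicit port reports -1, which is exactly the value that later fails the port-range check in {{InetSocketAddress}}:

```java
import java.net.URI;

public class UriPortSketch {
    public static void main(String[] args) {
        // No explicit port in the authority, so getPort() returns -1;
        // feeding -1 into new InetSocketAddress(host, port) throws
        // IllegalArgumentException: port out of range:-1
        URI uri = URI.create("hdfs://127.0.0.1/");
        System.out.println(uri.getPort()); // -1
    }
}
```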




-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-13515) NetUtils#connect should log remote address for NoRouteToHostException

2018-09-17 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16471170#comment-16471170
 ] 

Ted Yu edited comment on HDFS-13515 at 9/17/18 11:01 PM:
-

Can you log the remote address in case of exception ?

Thanks


was (Author: yuzhih...@gmail.com):
Can you log the remote address in case of exception?

Thanks

> NetUtils#connect should log remote address for NoRouteToHostException
> -
>
> Key: HDFS-13515
> URL: https://issues.apache.org/jira/browse/HDFS-13515
> Project: Hadoop HDFS
>  Issue Type: Improvement
>    Reporter: Ted Yu
>Priority: Minor
>
> {code}
> hdfs.BlockReaderFactory: I/O error constructing remote block reader.
> java.net.NoRouteToHostException: No route to host
> at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
> at 
> org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
> at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529)
> at org.apache.hadoop.hdfs.DFSClient.newConnectedPeer(DFSClient.java:2884)
> {code}
> In the above stack trace, the remote host was not logged.
> This makes troubleshooting a bit hard.
> NetUtils#connect should log the remote address for NoRouteToHostException.
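One way to surface the remote endpoint is to rethrow connect failures with the address in the message. This is only a sketch of the idea; {{ConnectSketch}} and its helper are hypothetical, not the actual NetUtils change:

```java
import java.io.IOException;
import java.net.Socket;
import java.net.SocketAddress;

public class ConnectSketch {
    // Hypothetical helper: rethrow connect failures with the remote
    // endpoint in the message, so the stack trace shows which host failed.
    static void connect(Socket socket, SocketAddress endpoint, int timeoutMs)
            throws IOException {
        try {
            socket.connect(endpoint, timeoutMs);
        } catch (IOException e) {
            throw new IOException("Failed to connect to " + endpoint, e);
        }
    }
}
```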







Re: Stripe Compactions Stability

2018-09-17 Thread Ted Yu
Hi,
To my knowledge, stripe compaction has not seen patches for a few years.

Have you looked at :
http://hbase.apache.org/book.html#ops.date.tiered

If the above doesn't suit your needs, can you tell us more about your use
case ?

Thanks

On Mon, Sep 17, 2018 at 11:39 AM Austin Heyne  wrote:

> The HBase cluster we're running has well over 100TB of data spread
> across two tables and as you'd guess we're suffering from compaction
> times taking way too long (and we need to double or triple that volume).
> I've found information on Stripe Compactions [1], which it seems like we
> could benefit a lot from; however, 'experimental' is a scary word for
> production systems, so I just wanted to get a general sentiment around
> the stability of the feature. Any experience or input would be very
> helpful.
>
> Thanks again,
> Austin
>
> [1] https://hbase.apache.org/book.html#ops.stripe
>
>


[jira] [Comment Edited] (FLINK-7642) Upgrade maven surefire plugin to 2.21.0

2018-09-17 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-7642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16433258#comment-16433258
 ] 

Ted Yu edited comment on FLINK-7642 at 9/17/18 10:38 PM:
-

SUREFIRE-1439 is in 2.21.0 which is needed for compiling with Java 10 .


was (Author: yuzhih...@gmail.com):
SUREFIRE-1439 is in 2.21.0 which is needed for compiling with Java 10.

> Upgrade maven surefire plugin to 2.21.0
> ---
>
> Key: FLINK-7642
> URL: https://issues.apache.org/jira/browse/FLINK-7642
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System
>    Reporter: Ted Yu
>Assignee: vinoyang
>Priority: Major
>
> Surefire 2.19 release introduced more useful test filters which would let us 
> run a subset of the tests.
> This issue is for upgrading maven surefire plugin to 2.21.0 which contains 
> SUREFIRE-1422





[jira] [Commented] (HBASE-21178) [BC break] : Get and Scan operation with a custom converter_class not working

2018-09-17 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16617714#comment-16617714
 ] 

Ted Yu commented on HBASE-21178:


Can you take a look at ruby-lint warning(s) which is related to the patch ?

> [BC break] : Get and Scan operation with a custom converter_class not working
> -
>
> Key: HBASE-21178
> URL: https://issues.apache.org/jira/browse/HBASE-21178
> Project: HBase
>  Issue Type: Bug
>  Components: shell
>Affects Versions: 2.0.0
>Reporter: Subrat Mishra
>Assignee: Subrat Mishra
>Priority: Critical
> Attachments: HBASE-21178.master.001.patch, 
> HBASE-21178.master.002.patch
>
>
> Consider a simple scenario:
> {code:java}
> create 'foo', {NAME => 'f1'}
> put 'foo','r1','f1:a',1000
> get 'foo','r1',{COLUMNS => 
> ['f1:a:c(org.apache.hadoop.hbase.util.Bytes).len']} 
> scan 'foo',{COLUMNS => 
> ['f1:a:c(org.apache.hadoop.hbase.util.Bytes).len']}{code}
> Both get and scan fails with ERROR
> {code:java}
> ERROR: wrong number of arguments (3 for 1) {code}
> Looks like the converter_method in table.rb expects 3 arguments [(bytes, 
> offset, len)] since version 2.0.0; prior to version 2.0.0 it took only 
> 1 argument [(bytes)]





[jira] [Updated] (HBASE-21160) Assertion in TestVisibilityLabelsWithDeletes#testDeleteColumnsWithoutAndWithVisibilityLabels is ignored

2018-09-17 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-21160:
---
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Thanks for the patch, liubang.

> Assertion in 
> TestVisibilityLabelsWithDeletes#testDeleteColumnsWithoutAndWithVisibilityLabels
>  is ignored
> ---
>
> Key: HBASE-21160
> URL: https://issues.apache.org/jira/browse/HBASE-21160
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>Assignee: liubangchen
>Priority: Trivial
> Fix For: 3.0.0
>
> Attachments: HBASE-21160-1.patch, HBASE-21160-2.patch, 
> HBASE-21160-3.patch, HBASE-21160-4.patch
>
>
> From 
> https://builds.apache.org/job/PreCommit-HBASE-Build/14327/artifact/patchprocess/diff-compile-javac-hbase-server.txt
>  (HBASE-21138 QA run):
> {code}
> [WARNING] 
> /testptch/hbase/hbase-server/src/test/java/org/apache/hadoop/hbase/security/visibility/TestVisibilityLabelsWithDeletes.java:[315,25]
>  [AssertionFailureIgnored] This assertion throws an AssertionError if it 
> fails, which will be caught by an enclosing try block.
> {code}
> Here is related code:
> {code}
>   PrivilegedExceptionAction scanAction = new 
> PrivilegedExceptionAction() {
> @Override
> public Void run() throws Exception {
>   try (Connection connection = 
> ConnectionFactory.createConnection(conf);
> ...
> assertEquals(1, next.length);
>   } catch (Throwable t) {
> throw new IOException(t);
>   }
> {code}
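The warning is about {{catch (Throwable t)}} swallowing the {{AssertionError}} thrown by {{assertEquals}}: the failure gets wrapped and never reaches the test runner as an assertion failure. Narrowing the catch to {{Exception}} lets the assertion propagate. A minimal stand-alone illustration (names are made up):

```java
public class AssertionSwallowSketch {
    // Anti-pattern: catch (Throwable) also catches AssertionError, so a
    // failed assertion is converted into an ordinary value/wrapper and the
    // test framework never sees the assertion failure itself.
    static String runCatchingThrowable() {
        try {
            if (true) throw new AssertionError("boom"); // stands in for assertEquals
            return "passed";
        } catch (Throwable t) {
            return "swallowed: " + t.getMessage();
        }
    }

    // Fix: catch Exception instead, so AssertionError escapes to the runner.
    static String runCatchingException() {
        try {
            if (true) throw new AssertionError("boom");
            return "passed";
        } catch (Exception e) {
            return "swallowed: " + e.getMessage();
        }
    }
}
```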





[jira] [Updated] (KYLIN-3484) Update Hadoop version to 2.7.7

2018-09-16 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-3484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated KYLIN-3484:
--
Description: We should upgrade the Hadoop 2.7 dependency to 2.7.7, to pick 
up bug and security fixes.  (was: We should upgrade the Hadoop 2.7 dependency 
to 2.7.7, to pick up bug and security fixes .)

> Update Hadoop version to 2.7.7
> --
>
> Key: KYLIN-3484
> URL: https://issues.apache.org/jira/browse/KYLIN-3484
> Project: Kylin
>  Issue Type: Task
>    Reporter: Ted Yu
>Priority: Minor
>
> We should upgrade the Hadoop 2.7 dependency to 2.7.7, to pick up bug and 
> security fixes.





[jira] [Commented] (KYLIN-3543) Unclosed Job instance in CreateHTableJob#exportHBaseConfiguration

2018-09-16 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16616840#comment-16616840
 ] 

Ted Yu commented on KYLIN-3543:
---

Is it possible to do something for the building cube with MR case ?

> Unclosed Job instance in CreateHTableJob#exportHBaseConfiguration
> -
>
> Key: KYLIN-3543
> URL: https://issues.apache.org/jira/browse/KYLIN-3543
> Project: Kylin
>  Issue Type: Bug
>    Reporter: Ted Yu
>Priority: Minor
>
> {code}
> out = fs.create(new Path(hbaseConfPath));
> job.getConfiguration().writeXml(out);
> {code}
> The job instance should be closed upon return from the method.
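A sketch of the fix using try-with-resources. The {{Job}} below is a plain {{Closeable}} stand-in, not Hadoop's {{org.apache.hadoop.mapreduce.Job}} (which is AutoCloseable in recent Hadoop releases); the real fix would wrap the Job the same way:

```java
import java.io.Closeable;

public class CloseSketch {
    // Minimal stand-in for the MapReduce Job used in exportHBaseConfiguration.
    static class Job implements Closeable {
        boolean closed;
        @Override public void close() { closed = true; }
        void writeXml() { /* stands in for job.getConfiguration().writeXml(out) */ }
    }

    static Job lastJob; // kept only so callers can observe close()

    static void exportConfiguration() {
        // try-with-resources closes the Job on every exit path,
        // including when writeXml() throws.
        try (Job job = new Job()) {
            lastJob = job;
            job.writeXml();
        }
    }

    public static void main(String[] args) {
        exportConfiguration();
        System.out.println(lastJob.closed); // true
    }
}
```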





[jira] [Comment Edited] (KAFKA-7175) Make version checking logic more flexible in streams_upgrade_test.py

2018-09-16 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16574181#comment-16574181
 ] 

Ted Yu edited comment on KAFKA-7175 at 9/16/18 6:15 PM:


Thanks for taking this, Ray .


was (Author: yuzhih...@gmail.com):
Thanks for taking this, Ray.

> Make version checking logic more flexible in streams_upgrade_test.py
> 
>
> Key: KAFKA-7175
> URL: https://issues.apache.org/jira/browse/KAFKA-7175
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams, system tests
>    Reporter: Ted Yu
>Assignee: Ray Chiang
>Priority: Major
>  Labels: newbie++
>
> During debugging of system test failure for KAFKA-5037, it was re-discovered 
> that the version numbers inside version probing related messages are hard 
> coded in streams_upgrade_test.py
> This is inflexible.
> We should correlate latest version from Java class with the expected version 
> numbers.
> Matthias made the following suggestion:
> We should also make this more generic and test upgrades from 3 -> 4, 3 -> 5 
> and 4 -> 5. The current code only goes from the latest version to the future 
> version.





[jira] [Commented] (KAFKA-7316) Use of filter method in KTable.scala may result in StackOverflowError

2018-09-16 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16616834#comment-16616834
 ] 

Ted Yu commented on KAFKA-7316:
---

Can this be resolved ?

> Use of filter method in KTable.scala may result in StackOverflowError
> -
>
> Key: KAFKA-7316
> URL: https://issues.apache.org/jira/browse/KAFKA-7316
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.0.0
>Reporter: Ted Yu
>Priority: Major
>  Labels: scala
> Fix For: 2.0.1, 2.1.0
>
> Attachments: 7316.v4.txt
>
>
> In this thread:
> http://search-hadoop.com/m/Kafka/uyzND1dNbYKXzC4F1?subj=Issue+in+Kafka+2+0+0+
> Druhin reported seeing StackOverflowError when using filter method from 
> KTable.scala
> This can be reproduced with the following change:
> {code}
> diff --git 
> a/streams/streams-scala/src/test/scala/org/apache/kafka/streams/scala/StreamToTableJoinScalaIntegrationTestImplicitSerdes.scala
>  b/streams/streams-scala/src/test/scala
> index 3d1bab5..e0a06f2 100644
> --- 
> a/streams/streams-scala/src/test/scala/org/apache/kafka/streams/scala/StreamToTableJoinScalaIntegrationTestImplicitSerdes.scala
> +++ 
> b/streams/streams-scala/src/test/scala/org/apache/kafka/streams/scala/StreamToTableJoinScalaIntegrationTestImplicitSerdes.scala
> @@ -58,6 +58,7 @@ class StreamToTableJoinScalaIntegrationTestImplicitSerdes 
> extends StreamToTableJ
>  val userClicksStream: KStream[String, Long] = 
> builder.stream(userClicksTopic)
>  val userRegionsTable: KTable[String, String] = 
> builder.table(userRegionsTopic)
> +userRegionsTable.filter { case (_, count) => true }
>  // Compute the total per region by summing the individual click counts 
> per region.
>  val clicksPerRegion: KTable[String, Long] =
> {code}





[jira] [Updated] (KAFKA-7276) Consider using re2j to speed up regex operations

2018-09-16 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated KAFKA-7276:
--
Description: 
https://github.com/google/re2j

re2j claims to do linear time regular expression matching in Java.

Its benefit is most obvious for deeply nested regex (such as a | b | c | d).
We should consider using re2j to speed up regex operations.

  was:
https://github.com/google/re2j

re2j claims to do linear time regular expression matching in Java.

Its benefit is most obvious for deeply nested regex (such as a | b | c | d).

We should consider using re2j to speed up regex operations.


> Consider using re2j to speed up regex operations
> 
>
> Key: KAFKA-7276
> URL: https://issues.apache.org/jira/browse/KAFKA-7276
> Project: Kafka
>  Issue Type: Task
>  Components: packaging
>    Reporter: Ted Yu
>Assignee: kevin.chen
>Priority: Major
>
> https://github.com/google/re2j
> re2j claims to do linear time regular expression matching in Java.
> Its benefit is most obvious for deeply nested regex (such as a | b | c | d).
> We should consider using re2j to speed up regex operations.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-7344) Return early when all tasks are assigned in StickyTaskAssignor#assignActive

2018-09-16 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated KAFKA-7344:
--
Description: 
After re-assigning existing active tasks to clients that previously had the 
same active task, there is a chance that {{taskIds.size() == assigned.size()}}, 
i.e. all tasks are assigned.
The method continues with:

{code}
final Set<TaskId> unassigned = new HashSet<>(taskIds);
unassigned.removeAll(assigned);
{code}
We can check the above condition and return early, before allocating the HashSet.

Similar optimization can be done before the following (around line 112):
{code}
// assign any remaining unassigned tasks
final List<TaskId> sortedTasks = new ArrayList<>(unassigned);
{code}

  was:
After re-assigning existing active tasks to clients that previously had the 
same active task, there is chance that {{taskIds.size() == assigned.size()}}, 
i.e. all tasks are assigned .
The method continues with:
{code}
final Set<TaskId> unassigned = new HashSet<>(taskIds);
unassigned.removeAll(assigned);
{code}
We can check the above condition and return early before allocating HashSet.

Similar optimization can be done before the following (around line 112):
{code}
// assign any remaining unassigned tasks
final List<TaskId> sortedTasks = new ArrayList<>(unassigned);
{code}


> Return early when all tasks are assigned in StickyTaskAssignor#assignActive
> ---
>
> Key: KAFKA-7344
> URL: https://issues.apache.org/jira/browse/KAFKA-7344
> Project: Kafka
>  Issue Type: Improvement
>      Components: streams
>Reporter: Ted Yu
>Assignee: kevin.chen
>Priority: Minor
>  Labels: optimization
>
> After re-assigning existing active tasks to clients that previously had the 
> same active task, there is a chance that {{taskIds.size() == assigned.size()}}, 
> i.e. all tasks are assigned.
> The method continues with:
> {code}
> final Set<TaskId> unassigned = new HashSet<>(taskIds);
> unassigned.removeAll(assigned);
> {code}
> We can check the above condition and return early, before allocating the HashSet.
> Similar optimization can be done before the following (around line 112):
> {code}
> // assign any remaining unassigned tasks
> final List<TaskId> sortedTasks = new ArrayList<>(unassigned);
> {code}
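A minimal standalone sketch of the proposed early return (types and names illustrative; the real code operates on Kafka's TaskId type):

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

public class EarlyReturnDemo {
    // Sketch of the proposed check: when every task is already assigned,
    // skip the HashSet allocation and removeAll pass entirely.
    static Set<Integer> unassigned(Set<Integer> taskIds, Set<Integer> assigned) {
        if (taskIds.size() == assigned.size()) {
            return Collections.emptySet(); // early return: nothing left to assign
        }
        Set<Integer> unassigned = new HashSet<>(taskIds);
        unassigned.removeAll(assigned);
        return unassigned;
    }

    public static void main(String[] args) {
        Set<Integer> tasks = new HashSet<>(Arrays.asList(1, 2, 3));
        System.out.println(unassigned(tasks, tasks).isEmpty());
        System.out.println(unassigned(tasks, new HashSet<>(Arrays.asList(1))));
    }
}
```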



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AMBARI-23287) Lack of synchronization accessing topologyHolder in HostResourceProvider#processDeleteHostRequests

2018-09-15 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/AMBARI-23287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated AMBARI-23287:

Description: 
HostResourceProvider#processDeleteHostRequests accesses topologyHolder without 
any synchronization.
 

  was:
HostResourceProvider#processDeleteHostRequests accesses topologyHolder without 
any synchronization .
 


> Lack of synchronization accessing topologyHolder in 
> HostResourceProvider#processDeleteHostRequests
> --
>
> Key: AMBARI-23287
> URL: https://issues.apache.org/jira/browse/AMBARI-23287
> Project: Ambari
>  Issue Type: Bug
>    Reporter: Ted Yu
>Priority: Minor
>
> HostResourceProvider#processDeleteHostRequests accesses topologyHolder 
> without any synchronization.
>  
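A generic sketch of the kind of guard being suggested (all names hypothetical; this is not Ambari's actual code):

```java
// Serializing access to a shared holder so concurrent delete requests
// observe a consistent view instead of racing on the field.
public class HolderAccessDemo {
    private Object topologyHolder = "topology";
    private final Object lock = new Object();

    Object readHolder() {
        synchronized (lock) { // all reads and writes go through the lock
            return topologyHolder;
        }
    }

    void updateHolder(Object newValue) {
        synchronized (lock) {
            topologyHolder = newValue;
        }
    }

    public static void main(String[] args) {
        HolderAccessDemo d = new HolderAccessDemo();
        d.updateHolder("updated-topology");
        System.out.println(d.readHolder());
    }
}
```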



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AMBARI-23288) stateWatcherClient should be closed upon return from OutputSolr#createSolrStateWatcher

2018-09-15 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/AMBARI-23288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated AMBARI-23288:

Description: 
{code}
CloudSolrClient stateWatcherClient = createSolrClient();
{code}
stateWatcherClient should be closed upon return from the method.

  was:
{code}
CloudSolrClient stateWatcherClient = createSolrClient();
{code}

stateWatcherClient should be closed upon return from the method.


> stateWatcherClient should be closed upon return from 
> OutputSolr#createSolrStateWatcher
> --
>
> Key: AMBARI-23288
> URL: https://issues.apache.org/jira/browse/AMBARI-23288
> Project: Ambari
>  Issue Type: Bug
>    Reporter: Ted Yu
>Priority: Minor
>
> {code}
> CloudSolrClient stateWatcherClient = createSolrClient();
> {code}
> stateWatcherClient should be closed upon return from the method.
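CloudSolrClient implements java.io.Closeable, so try-with-resources is the usual remedy; a generic sketch using a plain Closeable stand-in rather than SolrJ:

```java
import java.io.Closeable;

public class CloseOnReturnDemo {
    static String createStateWatcher() throws Exception {
        // Stand-in for createSolrClient(); CloudSolrClient implements Closeable,
        // so the same try-with-resources pattern applies.
        try (Closeable stateWatcherClient = () -> System.out.println("client closed")) {
            // ... register the state watcher using the client ...
            return "watcher registered";
        } // the client is closed here on normal return and on exception
    }

    public static void main(String[] args) throws Exception {
        System.out.println(createStateWatcher());
    }
}
```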



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AMBARI-22621) Ensure value for hbase.coprocessor.abortonerror is true

2018-09-15 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/AMBARI-22621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated AMBARI-22621:

Description: 
In the coprocessor refactor for hbase-2, Server#abort has been taken out of 
reach.

We should ensure that the value of hbase.coprocessor.abortonerror is true so that 
a coprocessor can abort the server by throwing an exception.

See HBASE-19341 for related details.

  was:
In the coprocessor refactor for hbase-2, Server#abort has been taken out of 
reach.


We should ensure that value for hbase.coprocessor.abortonerror is true so that 
coprocessor can abort the server by throwing exception.

See HBASE-19341 for related details.


> Ensure value for hbase.coprocessor.abortonerror is true
> ---
>
> Key: AMBARI-22621
> URL: https://issues.apache.org/jira/browse/AMBARI-22621
> Project: Ambari
>  Issue Type: Improvement
>    Reporter: Ted Yu
>Priority: Major
>
> In the coprocessor refactor for hbase-2, Server#abort has been taken out of 
> reach.
> We should ensure that the value of hbase.coprocessor.abortonerror is true so 
> that a coprocessor can abort the server by throwing an exception.
> See HBASE-19341 for related details.
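The setting above, expressed as a standard hbase-site.xml entry:

```xml
<property>
  <name>hbase.coprocessor.abortonerror</name>
  <value>true</value>
</property>
```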



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AMBARI-24607) rand should not be used in WebSocketProtocol.h

2018-09-15 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/AMBARI-24607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated AMBARI-24607:

Description: 
In 
ambari-admin/src/main/resources/ui/admin-web/node_modules/karma/node_modules/socket.io/node_modules/engine.io/node_modules/uws/src/WebSocketProtocol.h
 :

{code}
if (!isServer) {
dst[1] |= 0x80;
uint32_t random = rand();
{code}
Linear congruential algorithms are too easy to break.

  was:
In 
ambari-admin/src/main/resources/ui/admin-web/node_modules/karma/node_modules/socket.io/node_modules/engine.io/node_modules/uws/src/WebSocketProtocol.h
 :
{code}
if (!isServer) {
dst[1] |= 0x80;
uint32_t random = rand();
{code}
Linear congruential algorithms are too easy to break.


> rand should not be used in WebSocketProtocol.h
> --
>
> Key: AMBARI-24607
> URL: https://issues.apache.org/jira/browse/AMBARI-24607
> Project: Ambari
>  Issue Type: Bug
>    Reporter: Ted Yu
>Priority: Minor
>
> In 
> ambari-admin/src/main/resources/ui/admin-web/node_modules/karma/node_modules/socket.io/node_modules/engine.io/node_modules/uws/src/WebSocketProtocol.h
>  :
> {code}
> if (!isServer) {
> dst[1] |= 0x80;
> uint32_t random = rand();
> {code}
> Linear congruential algorithms are too easy to break.
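A sketch of the usual replacement, expressed in Java terms (the affected file is C++, where std::random_device or a platform CSPRNG plays the analogous role to SecureRandom):

```java
import java.security.SecureRandom;

public class MaskKeyDemo {
    public static void main(String[] args) {
        // WebSocket client masking keys should come from a cryptographically
        // secure generator, not a linear congruential rand().
        SecureRandom rng = new SecureRandom();
        int maskingKey = rng.nextInt();
        // 32-bit key; print a sanity check rather than the random value itself
        System.out.println(Integer.toBinaryString(maskingKey).length() <= 32);
    }
}
```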



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AMBARI-23353) Provide sanity check for hbase in memory flush parameters

2018-09-15 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/AMBARI-23353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated AMBARI-23353:

Description: 
For the hbase 2.0 release, there is a correlation between the following parameters:

* hbase.memstore.inmemoryflush.threshold.factor : threshold for the active 
segment
* hbase.hregion.compacting.pipeline.segments.limit : pipeline length

For SSD, use a threshold of 2% for the active segment 
(hbase.memstore.inmemoryflush.threshold.factor=0.02) and a pipeline length of 4 
(hbase.hregion.compacting.pipeline.segments.limit=4).

For HDD, hbase.hregion.compacting.pipeline.segments.limit should be 3 (due to 
the lower throughput of HDD).

  was:
For hbase 2.0 release, there is correlation between the following parameters:

* hbase.memstore.inmemoryflush.threshold.factor : threshold for the active 
segment
* hbase.hregion.compacting.pipeline.segments.limit : pipeline length


For SSD, a threshold of 2% for the active segment 
(hbase.memstore.inmemoryflush.threshold.factor=0.02) and pipeline length of 4 
(hbase.hregion.compacting.pipeline.segments.limit=4).

For HDD, hbase.hregion.compacting.pipeline.segments.limit should be 3 (due to 
lower throughput of HDD).


> Provide sanity check for hbase in memory flush parameters
> -
>
> Key: AMBARI-23353
> URL: https://issues.apache.org/jira/browse/AMBARI-23353
> Project: Ambari
>  Issue Type: Task
>    Reporter: Ted Yu
>Priority: Major
>
> For the hbase 2.0 release, there is a correlation between the following parameters:
> * hbase.memstore.inmemoryflush.threshold.factor : threshold for the active 
> segment
> * hbase.hregion.compacting.pipeline.segments.limit : pipeline length
> For SSD, use a threshold of 2% for the active segment 
> (hbase.memstore.inmemoryflush.threshold.factor=0.02) and a pipeline length of 4 
> (hbase.hregion.compacting.pipeline.segments.limit=4).
> For HDD, hbase.hregion.compacting.pipeline.segments.limit should be 3 (due to 
> the lower throughput of HDD).
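The SSD recommendation above, expressed as hbase-site.xml properties (values taken directly from the description):

```xml
<property>
  <name>hbase.memstore.inmemoryflush.threshold.factor</name>
  <value>0.02</value>
</property>
<property>
  <name>hbase.hregion.compacting.pipeline.segments.limit</name>
  <value>4</value>
</property>
```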



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-9824) Support IPv6 literal

2018-09-15 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated FLINK-9824:
--
Description: 
Currently we use a colon as the separator when parsing host and port.

We should support the usage of IPv6 literals in parsing.

  was:
Currently we use colon as separator when parsing host and port.


We should support the usage of IPv6 literals in parsing.


> Support IPv6 literal
> 
>
> Key: FLINK-9824
> URL: https://issues.apache.org/jira/browse/FLINK-9824
> Project: Flink
>  Issue Type: Bug
>  Components: Network
>    Reporter: Ted Yu
>Assignee: vinoyang
>Priority: Minor
>
> Currently we use a colon as the separator when parsing host and port.
> We should support the usage of IPv6 literals in parsing.
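A standalone sketch of colon-splitting that tolerates bracketed IPv6 literals (RFC 3986 style, e.g. "[::1]:8080"); names are illustrative, not Flink's actual parser:

```java
public class HostPortDemo {
    // Split "host:port", treating "[...]" as an IPv6 literal whose
    // internal colons must not be used as the separator.
    static String[] parse(String s) {
        if (s.startsWith("[")) {
            int end = s.indexOf(']');
            return new String[] { s.substring(1, end), s.substring(end + 2) };
        }
        // lastIndexOf keeps this correct even for unbracketed edge cases
        int colon = s.lastIndexOf(':');
        return new String[] { s.substring(0, colon), s.substring(colon + 1) };
    }

    public static void main(String[] args) {
        String[] a = parse("[::1]:8080");
        System.out.println(a[0] + " " + a[1]);
        String[] b = parse("broker.example.com:9092");
        System.out.println(b[0] + " " + b[1]);
    }
}
```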



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (GORA-542) Upgrade to JUnit 5

2018-09-14 Thread Ted Yu (JIRA)
Ted Yu created GORA-542:
---

 Summary: Upgrade to JUnit 5
 Key: GORA-542
 URL: https://issues.apache.org/jira/browse/GORA-542
 Project: Apache Gora
  Issue Type: Task
Reporter: Ted Yu


JUnit 5 brings multiple useful features so tests are easier to read and write.

We can bump up the dependency version and create new tests with JUnit 5 
features.

Relevant features of JUnit 5: dynamic test, nested tests, parameterized tests

https://twitter.com/nipafx/status/1027095088059559936



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KYLIN-3561) Upgrade to JUnit 5

2018-09-14 Thread Ted Yu (JIRA)
Ted Yu created KYLIN-3561:
-

 Summary: Upgrade to JUnit 5
 Key: KYLIN-3561
 URL: https://issues.apache.org/jira/browse/KYLIN-3561
 Project: Kylin
  Issue Type: Task
Reporter: Ted Yu


JUnit 5 brings multiple useful features so tests are easier to read and write.

We can bump up the dependency version and create new tests with JUnit 5 
features.

Relevant features of JUnit 5: dynamic test, nested tests, parameterized tests
https://twitter.com/nipafx/status/1027095088059559936



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)



[jira] [Commented] (HBASE-21196) HTableMultiplexer clears the meta cache after every put operation

2018-09-14 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16615189#comment-16615189
 ] 

Ted Yu commented on HBASE-21196:


You can keep them.

https://builds.apache.org/job/PreCommit-HBASE-Build/14415/testReport/org.apache.hadoop.hbase.client/TestRegionLocationCaching/

It seems the new test class can be categorized as medium, judging from the duration it took.

> HTableMultiplexer clears the meta cache after every put operation
> -
>
> Key: HBASE-21196
> URL: https://issues.apache.org/jira/browse/HBASE-21196
> Project: HBase
>  Issue Type: Bug
>  Components: Performance
>Affects Versions: 3.0.0, 1.3.3, 2.2.0
>Reporter: Nihal Jain
>Assignee: Nihal Jain
>Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HBASE-21196.master.001.patch, 
> HBASE-21196.master.001.patch, HTableMultiplexer1000Puts.UT.txt
>
>
> *Problem:* Operations which use 
> {{AsyncRequestFutureImpl.receiveMultiAction(MultiAction, ServerName, 
> MultiResponse, int)}} API with tablename set to null reset the meta cache of 
> the corresponding server after each call. One such operation is put operation 
> of HTableMultiplexer (Might not be the only one). This may impact the 
> performance of the system severely as all new ops directed to that server 
> will have to go to zk first to get the meta table address and then get the 
> location of the table region as it will become empty after every 
> htablemultiplexer put.
> From the logs below, one can see after every other put the cached region 
> locations are cleared. As a side effect of this, before every put the server 
> needs to contact zk and get meta table location and read meta to get region 
> locations of the table.
> {noformat}
> 2018-09-13 22:21:15,467 TRACE [htable-pool11-t1] client.MetaCache(283): 
> Removed all cached region locations that map to 
> root1-thinkpad-t440p,35811,1536857446588
> 2018-09-13 22:21:15,467 DEBUG [HTableFlushWorker-5] 
> client.HTableMultiplexer$FlushWorker(632): Processed 1 put requests for 
> root1-ThinkPad-T440p:35811 and 0 failed, latency for this send: 5
> 2018-09-13 22:21:15,515 TRACE 
> [RpcServer.reader=1,bindAddress=root1-ThinkPad-T440p,port=35811] 
> ipc.RpcServer$Connection(1954): RequestHeader call_id: 218 method_name: "Get" 
> request_param: true priority: 0 timeout: 6 totalRequestSize: 137 bytes
> 2018-09-13 22:21:15,515 TRACE 
> [RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] 
> ipc.CallRunner(105): callId: 218 service: ClientService methodName: Get size: 
> 137 connection: 127.0.0.1:42338 executing as root1
> 2018-09-13 22:21:15,515 TRACE 
> [RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] 
> ipc.RpcServer(2356): callId: 218 service: ClientService methodName: Get size: 
> 137 connection: 127.0.0.1:42338 param: region= 
> testHTableMultiplexer_1,,1536857451720.304d914b641a738624937c7f9b4d684f., 
> row=\x00\x00\x00\xC4 connection: 127.0.0.1:42338, response result { 
> associated_cell_count: 1 stale: false } queueTime: 0 processingTime: 0 
> totalTime: 0
> 2018-09-13 22:21:15,516 TRACE 
> [RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] 
> io.BoundedByteBufferPool(106): runningAverage=16384, totalCapacity=0, 
> count=0, allocations=1
> 2018-09-13 22:21:15,516 TRACE [main] ipc.AbstractRpcClient(236): Call: Get, 
> callTime: 2ms
> 2018-09-13 22:21:15,516 TRACE [main] client.ClientScanner(122): Scan 
> table=hbase:meta, 
> startRow=testHTableMultiplexer_1,\x00\x00\x00\xC5,99
> 2018-09-13 22:21:15,516 TRACE [main] client.ClientSmallReversedScanner(179): 
> Advancing internal small scanner to startKey at 
> 'testHTableMultiplexer_1,\x00\x00\x00\xC5,99'
> 2018-09-13 22:21:15,517 TRACE [main] client.ZooKeeperRegistry(59): Looking up 
> meta region location in ZK, 
> connection=org.apache.hadoop.hbase.client.ZooKeeperRegistry@599f571f
> {noformat}
> From the minicluster logs [^HTableMultiplexer1000Puts.UT.txt] one can see 
> that the string "Removed all cached region locations that map" and "Looking 
> up meta region location in ZK" are present for every put.
> *Analysis:*
>  The problem occurs because the {{cleanServerCache}} method always clears 
> the server cache when the tablename is null and the exception is null. See 
> [AsyncRequestFutureImpl.java#L918|https://github.com/apache/hbase/blob/5d14c1af65c02f4e87059337c35e4431505de91c/hbase-client/src/main/java/org/apache/hadoop/hbase/client/AsyncRequestFutureImpl.java#L918]
> {code:java}
> priv
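A hypothetical sketch of the guard implied by the analysis above (simplified types; the real method works against HBase's per-connection meta cache):

```java
import java.util.HashMap;
import java.util.Map;

public class CacheGuardDemo {
    // Sketch of the fix: only drop the server's cached region locations
    // when an actual error occurred, instead of on every successful call
    // with a null tablename.
    static void cleanServerCache(String tableName, Throwable error,
                                 Map<String, String> metaCache, String server) {
        if (tableName == null && error == null) {
            return; // nothing went wrong; keep the cache warm
        }
        metaCache.remove(server);
    }

    public static void main(String[] args) {
        Map<String, String> cache = new HashMap<>();
        cache.put("server1", "region-locations");
        cleanServerCache(null, null, cache, "server1");
        System.out.println(cache.containsKey("server1")); // cache kept
        cleanServerCache(null, new RuntimeException("io error"), cache, "server1");
        System.out.println(cache.containsKey("server1")); // cache cleared
    }
}
```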

[jira] [Commented] (KYLIN-3171) Support hadoop 3 release

2018-09-14 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16615047#comment-16615047
 ] 

Ted Yu commented on KYLIN-3171:
---

How would the master-hadoop3.1 branch be synced with the master branch?

Currently it lags behind by a few days' worth of commits.

> Support hadoop 3 release
> 
>
> Key: KYLIN-3171
> URL: https://issues.apache.org/jira/browse/KYLIN-3171
> Project: Kylin
>  Issue Type: Improvement
>    Reporter: Ted Yu
>Priority: Major
>
> When compiling against hadoop 3, I got:
> {code}
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-compiler-plugin:3.5.1:compile 
> (default-compile) on project kylin-engine-mr: Compilation failure: 
> Compilation  failure:
> [ERROR] 
> /a/kylin/engine-mr/src/main/java/org/apache/kylin/engine/mr/common/DefaultSslProtocolSocketFactory.java:[29,36]
>  error: package org.apache.commons.httpclient does not   exist
> [ERROR] 
> /a/kylin/engine-mr/src/main/java/org/apache/kylin/engine/mr/common/DefaultSslProtocolSocketFactory.java:[30,36]
>  error: package org.apache.commons.httpclient does not   exist
> [ERROR] 
> /a/kylin/engine-mr/src/main/java/org/apache/kylin/engine/mr/common/DefaultSslProtocolSocketFactory.java:[31,43]
>  error: package org.apache.commons.httpclient.params does not exist
> [ERROR] 
> /a/kylin/engine-mr/src/main/java/org/apache/kylin/engine/mr/common/DefaultSslProtocolSocketFactory.java:[32,45]
>  error: package org.apache.commons.httpclient.protocol   does not exist
> [ERROR] 
> /a/kylin/engine-mr/src/main/java/org/apache/kylin/engine/mr/common/DefaultSslProtocolSocketFactory.java:[33,45]
>  error: package org.apache.commons.httpclient.protocol   does not exist
> [ERROR] 
> /a/kylin/engine-mr/src/main/java/org/apache/kylin/engine/mr/common/DefaultSslProtocolSocketFactory.java:[41,56]
>  error: cannot find symbol
> [ERROR]   symbol: class SecureProtocolSocketFactory
> [ERROR] 
> /a/kylin/engine-mr/src/main/java/org/apache/kylin/engine/mr/common/DefaultSslProtocolSocketFactory.java:[94,125]
>  error: cannot find symbol
> [ERROR]   symbol:   class HttpConnectionParams
> [ERROR]   location: class DefaultSslProtocolSocketFactory
> [ERROR] 
> /a/kylin/engine-mr/src/main/java/org/apache/kylin/engine/mr/common/DefaultSslProtocolSocketFactory.java:[94,196]
>  error: cannot find symbol
> [ERROR]   symbol:   class ConnectTimeoutException
> [ERROR]   location: class DefaultSslProtocolSocketFactory
> [ERROR] 
> /a/kylin/engine-mr/src/main/java/org/apache/kylin/engine/mr/common/DefaultSslProtocolSocketFactory.java:[105,19]
>  error: cannot find symbol
> [ERROR]   symbol:   variable ControllerThreadSocketFactory
> [ERROR]   location: class DefaultSslProtocolSocketFactory
> {code}
> We should allow building against hadoop 3.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KYLIN-3560) Should not depend on personal repository

2018-09-14 Thread Ted Yu (JIRA)
Ted Yu created KYLIN-3560:
-

 Summary: Should not depend on personal repository
 Key: KYLIN-3560
 URL: https://issues.apache.org/jira/browse/KYLIN-3560
 Project: Kylin
  Issue Type: Task
Reporter: Ted Yu


In core-common/pom.xml:
{code}
com.github.joshelser
{code}
We shouldn't depend on a personal repository.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)




[jira] [Commented] (HBASE-21178) [BC break] : Get and Scan operation with a custom converter_class not working

2018-09-14 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614891#comment-16614891
 ] 

Ted Yu commented on HBASE-21178:


That should be fine.

> [BC break] : Get and Scan operation with a custom converter_class not working
> -
>
> Key: HBASE-21178
> URL: https://issues.apache.org/jira/browse/HBASE-21178
> Project: HBase
>  Issue Type: Bug
>  Components: shell
>Affects Versions: 2.0.0
>Reporter: Subrat Mishra
>Assignee: Subrat Mishra
>Priority: Critical
> Attachments: HBASE-21178.master.001.patch
>
>
> Consider a simple scenario:
> {code:java}
> create 'foo', {NAME => 'f1'}
> put 'foo','r1','f1:a',1000
> get 'foo','r1',{COLUMNS => 
> ['f1:a:c(org.apache.hadoop.hbase.util.Bytes).len']} 
> scan 'foo',{COLUMNS => 
> ['f1:a:c(org.apache.hadoop.hbase.util.Bytes).len']}{code}
> Both get and scan fails with ERROR
> {code:java}
> ERROR: wrong number of arguments (3 for 1) {code}
> Looks like in table.rb file converter_method expects 3 arguments [(bytes, 
> offset, len)] since version 2.0.0, prior to version 2.0.0 it was taking only 
> 1 argument [(bytes)]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21178) [BC break] : Get and Scan operation with a custom converter_class not working

2018-09-14 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614584#comment-16614584
 ] 

Ted Yu commented on HBASE-21178:


{code}
782   converter_class = 'org.apache.hadoop.hbase.util.Bytes' unless 
converter_class
783   converter_method = 'toStringBinary' unless converter_method
784   eval(converter_class).method(converter_method).call(bytes)
{code}
Is it possible that one of converter_class / converter_method is null?
In that case the conversion should be skipped, right?

Also, is it possible to add a test with conversion to prevent regression?

Thanks for the finding.

> [BC break] : Get and Scan operation with a custom converter_class not working
> -
>
> Key: HBASE-21178
> URL: https://issues.apache.org/jira/browse/HBASE-21178
> Project: HBase
>  Issue Type: Bug
>  Components: shell
>Affects Versions: 2.0.0
>Reporter: Subrat Mishra
>Assignee: Subrat Mishra
>Priority: Critical
> Attachments: HBASE-21178.master.001.patch
>
>
> Consider a simple scenario:
> {code:java}
> create 'foo', {NAME => 'f1'}
> put 'foo','r1','f1:a',1000
> get 'foo','r1',{COLUMNS => 
> ['f1:a:c(org.apache.hadoop.hbase.util.Bytes).len']} 
> scan 'foo',{COLUMNS => 
> ['f1:a:c(org.apache.hadoop.hbase.util.Bytes).len']}{code}
> Both get and scan fails with ERROR
> {code:java}
> ERROR: wrong number of arguments (3 for 1) {code}
> Looks like in table.rb file converter_method expects 3 arguments [(bytes, 
> offset, len)] since version 2.0.0, prior to version 2.0.0 it was taking only 
> 1 argument [(bytes)]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-21198) Exclude dependency on net.minidev:json-smart

2018-09-14 Thread Ted Yu (JIRA)
Ted Yu created HBASE-21198:
--

 Summary: Exclude dependency on net.minidev:json-smart
 Key: HBASE-21198
 URL: https://issues.apache.org/jira/browse/HBASE-21198
 Project: HBase
  Issue Type: Task
Reporter: Ted Yu


From 
https://builds.apache.org/job/PreCommit-HBASE-Build/14414/artifact/patchprocess/patch-javac-3.0.0.txt
 :
{code}
[ERROR] Failed to execute goal on project hbase-common: Could not resolve 
dependencies for project org.apache.hbase:hbase-common:jar:3.0.0-SNAPSHOT: 
Failed to collect dependencies at org.apache.hadoop:hadoop-common:jar:3.0.0 -> 
org.apache.hadoop:hadoop-auth:jar:3.0.0 -> 
com.nimbusds:nimbus-jose-jwt:jar:4.41.1 -> 
net.minidev:json-smart:jar:2.3-SNAPSHOT: Failed to read artifact descriptor for 
net.minidev:json-smart:jar:2.3-SNAPSHOT: Could not transfer artifact 
net.minidev:json-smart:pom:2.3-SNAPSHOT from/to dynamodb-local-oregon 
(https://s3-us-west-2.amazonaws.com/dynamodb-local/release): Access denied to: 
https://s3-us-west-2.amazonaws.com/dynamodb-local/release/net/minidev/json-smart/2.3-SNAPSHOT/json-smart-2.3-SNAPSHOT.pom
 , ReasonPhrase:Forbidden. -> [Help 1]
{code}
We should exclude the dependency on net.minidev:json-smart.

hbase-common/bin/pom.xml has done so.

The other pom.xml should do the same.
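A sketch of the kind of exclusion involved, applicable to the other modules (coordinates taken from the error above; the surrounding dependency element is shown for context):

```xml
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-common</artifactId>
  <exclusions>
    <exclusion>
      <groupId>net.minidev</groupId>
      <artifactId>json-smart</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```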



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)




[jira] [Commented] (HBASE-21160) Assertion in TestVisibilityLabelsWithDeletes#testDeleteColumnsWithoutAndWithVisibilityLabels is ignored

2018-09-14 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614570#comment-16614570
 ] 

Ted Yu commented on HBASE-21160:


Can you address these?
{code}
[WARNING] 
/testptch/hbase/hbase-server/src/test/java/org/apache/hadoop/hbase/security/visibility/TestVisibilityLabelsWithDefaultVisLabelService.java:[94,9]
 [CatchFail] Ignoring exceptions and calling fail() is unnecessary, and makes 
test output less useful
[WARNING] 
/testptch/hbase/hbase-server/src/test/java/org/apache/hadoop/hbase/security/visibility/TestVisibilityLabelsWithDefaultVisLabelService.java:[183,9]
 [CatchFail] Ignoring exceptions and calling fail() is unnecessary, and makes 
test output less useful
[WARNING] 
/testptch/hbase/hbase-server/src/test/java/org/apache/hadoop/hbase/security/visibility/TestVisibilityLabelsWithDefaultVisLabelService.java:[195,71]
 [DefaultCharset] Implicit use of the platform default charset, which can 
result in e.g. non-ASCII characters being silently replaced with '?' in many 
environments
[WARNING] 
/testptch/hbase/hbase-server/src/test/java/org/apache/hadoop/hbase/security/visibility/TestVisibilityLabelsWithDefaultVisLabelService.java:[196,74]
 [DefaultCharset] Implicit use of the platform default charset, which can 
result in e.g. non-ASCII characters being silently replaced with '?' in many 
environments
[WARNING] 
/testptch/hbase/hbase-server/src/test/java/org/apache/hadoop/hbase/security/visibility/TestVisibilityLabelsWithDefaultVisLabelService.java:[197,77]
 [DefaultCharset] Implicit use of the platform default charset, which can 
result in e.g. non-ASCII characters being silently replaced with '?' in many 
environments
{code}

> Assertion in 
> TestVisibilityLabelsWithDeletes#testDeleteColumnsWithoutAndWithVisibilityLabels
>  is ignored
> ---
>
> Key: HBASE-21160
> URL: https://issues.apache.org/jira/browse/HBASE-21160
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>Assignee: liubangchen
>Priority: Trivial
> Attachments: HBASE-21160-1.patch
>
>
> From 
> https://builds.apache.org/job/PreCommit-HBASE-Build/14327/artifact/patchprocess/diff-compile-javac-hbase-server.txt
>  (HBASE-21138 QA run):
> {code}
> [WARNING] 
> /testptch/hbase/hbase-server/src/test/java/org/apache/hadoop/hbase/security/visibility/TestVisibilityLabelsWithDeletes.java:[315,25]
>  [AssertionFailureIgnored] This assertion throws an AssertionError if it 
> fails, which will be caught by an enclosing try block.
> {code}
> Here is related code:
> {code}
>   PrivilegedExceptionAction<Void> scanAction = new 
> PrivilegedExceptionAction<Void>() {
> @Override
> public Void run() throws Exception {
>   try (Connection connection = 
> ConnectionFactory.createConnection(conf);
> ...
> assertEquals(1, next.length);
>   } catch (Throwable t) {
> throw new IOException(t);
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21102) ServerCrashProcedure should select target server where no other replicas exist for the current region

2018-09-14 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614563#comment-16614563
 ] 

Ted Yu commented on HBASE-21102:


Addendum looks good.
{code}
  public boolean isReplicaAvailbleForRegion(final RegionInfo info) {
{code}
Missing 'a' in 'Availble'

> ServerCrashProcedure should select target server where no other replicas 
> exist for the current region
> -
>
> Key: HBASE-21102
> URL: https://issues.apache.org/jira/browse/HBASE-21102
> Project: HBase
>  Issue Type: Bug
>  Components: Region Assignment
>Affects Versions: 3.0.0, 2.2.0
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Major
> Attachments: HBASE-21102_1.patch, HBASE-21102_2.patch, 
> HBASE-21102_3.patch, HBASE-21102_4.patch, HBASE-21102_addendum.patch, 
> HBASE-21102_addendum.patch, HBASE-21102_initial.patch
>
>
> Currently when a server with region replica crashes, when the target server 
> is created for the replica region assignment there is no guarantee that a 
> server is selected where there is no other replica for the current region 
> getting assigned. It so happens that currently we do an assignment randomly 
> and later the LB comes and identifies these cases and again does MOVE for 
> such regions. It will be better if we can identify target servers at least 
> minimally ensuring that replicas are not colocated.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21196) HTableMultiplexer clears the meta cache after every put operation

2018-09-14 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614555#comment-16614555
 ] 

Ted Yu commented on HBASE-21196:


Ran test without fix:
{code}
testCachingForHTableMultiplexerMultiPut(org.apache.hadoop.hbase.client.TestRegionLocationCaching)
  Time elapsed: 0.469 sec  <<< FAILURE!
java.lang.AssertionError: Expected non-zero number of cached region locations, 
but it is0. Actual: 0
at 
org.apache.hadoop.hbase.client.TestRegionLocationCaching.checkRegionLocationIsCached(TestRegionLocationCaching.java:156)
at 
org.apache.hadoop.hbase.client.TestRegionLocationCaching.testCachingForHTableMultiplexerMultiPut(TestRegionLocationCaching.java:101)

testCachingForHTableMultiplexerSinglePut(org.apache.hadoop.hbase.client.TestRegionLocationCaching)
  Time elapsed: 0.305 sec  <<< FAILURE!
java.lang.AssertionError: Expected non-zero number of cached region locations, 
but it is0. Actual: 0
at 
org.apache.hadoop.hbase.client.TestRegionLocationCaching.checkRegionLocationIsCached(TestRegionLocationCaching.java:156)
at 
org.apache.hadoop.hbase.client.TestRegionLocationCaching.testCachingForHTableMultiplexerSinglePut(TestRegionLocationCaching.java:79)
{code}
Please fix the assertion ('it is0').

Since the other two tests don't fail without the fix, I think you may take them out.



> HTableMultiplexer clears the meta cache after every put operation
> -
>
> Key: HBASE-21196
> URL: https://issues.apache.org/jira/browse/HBASE-21196
> Project: HBase
>  Issue Type: Bug
>  Components: Performance
>Affects Versions: 3.0.0, 1.3.3, 2.2.0
>Reporter: Nihal Jain
>Assignee: Nihal Jain
>Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HBASE-21196.master.001.patch, 
> HBASE-21196.master.001.patch, HTableMultiplexer1000Puts.UT.txt
>
>
> *Problem:* Operations that use the 
> {{AsyncRequestFutureImpl.receiveMultiAction(MultiAction, ServerName, 
> MultiResponse, int)}} API with tablename set to null reset the meta cache of 
> the corresponding server after each call. One such operation is the put 
> operation of HTableMultiplexer (it might not be the only one). This may 
> severely impact the performance of the system: since the cache becomes empty 
> after every HTableMultiplexer put, all new ops directed to that server must 
> first go to zk to get the meta table address and then get the location of 
> the table region.
> From the logs below, one can see that after every other put the cached region 
> locations are cleared. As a side effect, before every put the client needs to 
> contact zk, get the meta table location, and read meta to get the region 
> locations of the table.
> {noformat}
> 2018-09-13 22:21:15,467 TRACE [htable-pool11-t1] client.MetaCache(283): 
> Removed all cached region locations that map to 
> root1-thinkpad-t440p,35811,1536857446588
> 2018-09-13 22:21:15,467 DEBUG [HTableFlushWorker-5] 
> client.HTableMultiplexer$FlushWorker(632): Processed 1 put requests for 
> root1-ThinkPad-T440p:35811 and 0 failed, latency for this send: 5
> 2018-09-13 22:21:15,515 TRACE 
> [RpcServer.reader=1,bindAddress=root1-ThinkPad-T440p,port=35811] 
> ipc.RpcServer$Connection(1954): RequestHeader call_id: 218 method_name: "Get" 
> request_param: true priority: 0 timeout: 6 totalRequestSize: 137 bytes
> 2018-09-13 22:21:15,515 TRACE 
> [RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] 
> ipc.CallRunner(105): callId: 218 service: ClientService methodName: Get size: 
> 137 connection: 127.0.0.1:42338 executing as root1
> 2018-09-13 22:21:15,515 TRACE 
> [RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] 
> ipc.RpcServer(2356): callId: 218 service: ClientService methodName: Get size: 
> 137 connection: 127.0.0.1:42338 param: region= 
> testHTableMultiplexer_1,,1536857451720.304d914b641a738624937c7f9b4d684f., 
> row=\x00\x00\x00\xC4 connection: 127.0.0.1:42338, response result { 
> associated_cell_count: 1 stale: false } queueTime: 0 processingTime: 0 
> totalTime: 0
> 2018-09-13 22:21:15,516 TRACE 
> [RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] 
> io.BoundedByteBufferPool(106): runningAverage=16384, totalCapacity=0, 
> count=0, allocations=1
> 2018-09-13 22:21:15,516 TRACE [main] ipc.AbstractRpcClient(236): Call: Get, 
> callTime: 2ms
> 2018-09-13 22:21:15,516 TRACE [main] client.ClientScanner(122): Scan 
> table=hbase:meta, 
> startRow=testHTableMultiplexer_1,\x00\x00\x00\xC5,99
> 2018-09-13 22:21:15,516 TRACE [main] client.ClientSmallReversedScanner(179): 

[jira] [Commented] (HBASE-21160) Assertion in TestVisibilityLabelsWithDeletes#testDeleteColumnsWithoutAndWithVisibilityLabels is ignored

2018-09-14 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614551#comment-16614551
 ] 

Ted Yu commented on HBASE-21160:


Ran the test with the patch locally; it passed.

> Assertion in 
> TestVisibilityLabelsWithDeletes#testDeleteColumnsWithoutAndWithVisibilityLabels
>  is ignored
> ---
>
> Key: HBASE-21160
> URL: https://issues.apache.org/jira/browse/HBASE-21160
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>Assignee: liubangchen
>Priority: Trivial
> Attachments: HBASE-21160-1.patch
>
>
> From 
> https://builds.apache.org/job/PreCommit-HBASE-Build/14327/artifact/patchprocess/diff-compile-javac-hbase-server.txt
>  (HBASE-21138 QA run):
> {code}
> [WARNING] 
> /testptch/hbase/hbase-server/src/test/java/org/apache/hadoop/hbase/security/visibility/TestVisibilityLabelsWithDeletes.java:[315,25]
>  [AssertionFailureIgnored] This assertion throws an AssertionError if it 
> fails, which will be caught by an enclosing try block.
> {code}
> Here is related code:
> {code}
>   PrivilegedExceptionAction scanAction = new 
> PrivilegedExceptionAction() {
> @Override
> public Void run() throws Exception {
>   try (Connection connection = 
> ConnectionFactory.createConnection(conf);
> ...
> assertEquals(1, next.length);
>   } catch (Throwable t) {
> throw new IOException(t);
>   }
> {code}
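The pattern that the [AssertionFailureIgnored] warning flags can be reproduced in miniature. The stub below is illustrative, not the actual test code: an AssertionError raised inside the try block is caught by the enclosing {{catch (Throwable t)}} and rewrapped, so the caller sees an IOException rather than a plain assertion failure.

```java
class AssertionSwallowDemo {
    // Returns the class of what the caller would actually observe as the
    // root cause after the catch block rewraps the failure.
    static Class<?> outcomeOfWrappedAssertion() {
        try {
            try {
                // Stand-in for a failed assertEquals inside the try block.
                throw new AssertionError("expected:<1> but was:<0>");
            } catch (Throwable t) {
                // The enclosing catch from the test rewraps *everything*,
                // including assertion failures, into an IOException.
                throw new java.io.IOException(t);
            }
        } catch (java.io.IOException e) {
            return e.getCause().getClass();
        }
    }
}
```

The usual fixes are to move the assertion outside the try block or to narrow the catch so it does not capture AssertionError.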



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21160) Assertion in TestVisibilityLabelsWithDeletes#testDeleteColumnsWithoutAndWithVisibilityLabels is ignored

2018-09-13 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614279#comment-16614279
 ] 

Ted Yu commented on HBASE-21160:


As I said above, when there is no assertion at the end of the try block, you don't 
need to make a change.

Please also keep the try-with-resources structure, which releases the resources.

> Assertion in 
> TestVisibilityLabelsWithDeletes#testDeleteColumnsWithoutAndWithVisibilityLabels
>  is ignored
> ---
>
> Key: HBASE-21160
> URL: https://issues.apache.org/jira/browse/HBASE-21160
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>Assignee: liubangchen
>Priority: Trivial
>
> From 
> https://builds.apache.org/job/PreCommit-HBASE-Build/14327/artifact/patchprocess/diff-compile-javac-hbase-server.txt
>  (HBASE-21138 QA run):
> {code}
> [WARNING] 
> /testptch/hbase/hbase-server/src/test/java/org/apache/hadoop/hbase/security/visibility/TestVisibilityLabelsWithDeletes.java:[315,25]
>  [AssertionFailureIgnored] This assertion throws an AssertionError if it 
> fails, which will be caught by an enclosing try block.
> {code}
> Here is related code:
> {code}
>   PrivilegedExceptionAction scanAction = new 
> PrivilegedExceptionAction() {
> @Override
> public Void run() throws Exception {
>   try (Connection connection = 
> ConnectionFactory.createConnection(conf);
> ...
> assertEquals(1, next.length);
>   } catch (Throwable t) {
> throw new IOException(t);
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20734) Colocate recovered edits directory with hbase.wal.dir

2018-09-13 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614222#comment-16614222
 ] 

Ted Yu commented on HBASE-20734:


I wouldn't have a big chunk of time to review - I am working on WAL refactoring.

FYI

> Colocate recovered edits directory with hbase.wal.dir
> -
>
> Key: HBASE-20734
> URL: https://issues.apache.org/jira/browse/HBASE-20734
> Project: HBase
>  Issue Type: Improvement
>  Components: MTTR, Recovery, wal
>    Reporter: Ted Yu
>Assignee: Zach York
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-20734.branch-1.001.patch, 
> HBASE-20734.branch-1.002.patch, HBASE-20734.branch-1.003.patch, 
> HBASE-20734.branch-1.004.patch, HBASE-20734.master.001.patch, 
> HBASE-20734.master.002.patch, HBASE-20734.master.003.patch, 
> HBASE-20734.master.004.patch, HBASE-20734.master.005.patch, 
> HBASE-20734.master.006.patch, HBASE-20734.master.007.patch, 
> HBASE-20734.master.008.patch, HBASE-20734.master.009.patch, 
> HBASE-20734.master.010.patch, HBASE-20734.master.011.patch, 
> HBASE-20734.master.012.patch
>
>
> During investigation of HBASE-20723, I realized that we wouldn't get the best 
> performance for recovered edits when hbase.wal.dir is configured to be on 
> different (fast) media than the hbase rootdir, since the recovered edits 
> directory is currently under the rootdir.
> Such a setup may not result in fast recovery when there is a region server 
> failover.
> This issue is to find a proper (hopefully backward compatible) way of 
> colocating the recovered edits directory with hbase.wal.dir.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21160) Assertion in TestVisibilityLabelsWithDeletes#testDeleteColumnsWithoutAndWithVisibilityLabels is ignored

2018-09-13 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613541#comment-16613541
 ] 

Ted Yu commented on HBASE-21160:


If there is no assertion in the try block where Throwable is caught, you don't 
need to make a change.

> Assertion in 
> TestVisibilityLabelsWithDeletes#testDeleteColumnsWithoutAndWithVisibilityLabels
>  is ignored
> ---
>
> Key: HBASE-21160
> URL: https://issues.apache.org/jira/browse/HBASE-21160
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>Assignee: liubangchen
>Priority: Trivial
>
> From 
> https://builds.apache.org/job/PreCommit-HBASE-Build/14327/artifact/patchprocess/diff-compile-javac-hbase-server.txt
>  (HBASE-21138 QA run):
> {code}
> [WARNING] 
> /testptch/hbase/hbase-server/src/test/java/org/apache/hadoop/hbase/security/visibility/TestVisibilityLabelsWithDeletes.java:[315,25]
>  [AssertionFailureIgnored] This assertion throws an AssertionError if it 
> fails, which will be caught by an enclosing try block.
> {code}
> Here is related code:
> {code}
>   PrivilegedExceptionAction scanAction = new 
> PrivilegedExceptionAction() {
> @Override
> public Void run() throws Exception {
>   try (Connection connection = 
> ConnectionFactory.createConnection(conf);
> ...
> assertEquals(1, next.length);
>   } catch (Throwable t) {
> throw new IOException(t);
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KYLIN-3559) Use Splitter for splitting String

2018-09-12 Thread Ted Yu (JIRA)
Ted Yu created KYLIN-3559:
-

 Summary: Use Splitter for splitting String
 Key: KYLIN-3559
 URL: https://issues.apache.org/jira/browse/KYLIN-3559
 Project: Kylin
  Issue Type: Task
Reporter: Ted Yu


See http://errorprone.info/bugpattern/StringSplitter for why Splitter is 
preferred.
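The StringSplitter bug pattern boils down to two JDK surprises, shown here with plain {{String.split}}: the separator argument is a regex, and trailing empty strings are silently dropped. Guava's {{Splitter.on(',')}} avoids both; it is omitted below to keep the sketch dependency-free.

```java
class SplitPitfalls {
    // String.split takes a regex and silently drops trailing empty strings --
    // the two surprises the StringSplitter pattern warns about.
    static int piecesOf(String s, String separator) {
        return s.split(separator).length;
    }
}
```

For example, splitting "a,b,," on "," yields only two pieces, and splitting on an unescaped "." yields none, because "." is interpreted as a regex matching every character.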



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (KYLIN-3310) Use lint for maven-compiler-plugin

2018-09-12 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560940#comment-16560940
 ] 

Ted Yu edited comment on KYLIN-3310 at 9/13/18 3:16 AM:


Thanks, Jiatao .


was (Author: yuzhih...@gmail.com):
Thanks, Jiatao.

> Use lint for maven-compiler-plugin
> --
>
> Key: KYLIN-3310
> URL: https://issues.apache.org/jira/browse/KYLIN-3310
> Project: Kylin
>  Issue Type: Improvement
>  Components: Tools, Build and Test
>    Reporter: Ted Yu
>Assignee: jiatao.tao
>Priority: Major
>
> lint helps identify structural problems.
> We should enable lint for maven-compiler-plugin
> {code}
>   <artifactId>maven-compiler-plugin</artifactId>
>   <version>${maven-compiler-plugin.version}</version>
>   <configuration>
>     <source>1.8</source>
>     <target>1.8</target>
>     <compilerArgs>
>       <arg>-Xlint:all</arg>
>       <arg>${compiler.error.flag}</arg>
>       <arg>-Xlint:-options</arg>
>       <arg>-Xlint:-cast</arg>
>       <arg>-Xlint:-deprecation</arg>
>       <arg>-Xlint:-processing</arg>
>       <arg>-Xlint:-rawtypes</arg>
>       <arg>-Xlint:-serial</arg>
>       <arg>-Xlint:-try</arg>
>       <arg>-Xlint:-unchecked</arg>
>       <arg>-Xlint:-varargs</arg>
>     </compilerArgs>
>     <showWarnings>true</showWarnings>
>     <showDeprecation>false</showDeprecation>
>   </configuration>
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-21194) Add TestCopyTable which exercises MOB feature

2018-09-12 Thread Ted Yu (JIRA)
Ted Yu created HBASE-21194:
--

 Summary: Add TestCopyTable which exercises MOB feature
 Key: HBASE-21194
 URL: https://issues.apache.org/jira/browse/HBASE-21194
 Project: HBase
  Issue Type: Test
Reporter: Ted Yu


Currently TestCopyTable doesn't cover table(s) with MOB feature enabled.

> We should add a variant that enables MOB on the table being copied and verify 
> that the MOB content is copied correctly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)



[jira] [Updated] (HBASE-21097) Flush pressure assertion may fail in testFlushThroughputTuning

2018-09-12 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-21097:
---
Attachment: (was: 21097.v3.txt)

> Flush pressure assertion may fail in testFlushThroughputTuning 
> ---
>
> Key: HBASE-21097
> URL: https://issues.apache.org/jira/browse/HBASE-21097
> Project: HBase
>  Issue Type: Test
>  Components: regionserver
>    Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3
>
> Attachments: 21097.v1.txt, 21097.v2.txt, 21097.v4.txt, 
> HBASE-21097.patch
>
>
> From 
> https://builds.apache.org/job/PreCommit-HBASE-Build/14137/artifact/patchprocess/patch-unit-hbase-server.txt
>  :
> {code}
> [ERROR] 
> testFlushThroughputTuning(org.apache.hadoop.hbase.regionserver.throttle.TestFlushWithThroughputController)
>   Time elapsed: 17.446 s  <<< FAILURE!
> java.lang.AssertionError: expected:<0.0> but was:<1.2906294173808417E-6>
>   at 
> org.apache.hadoop.hbase.regionserver.throttle.TestFlushWithThroughputController.testFlushThroughputTuning(TestFlushWithThroughputController.java:185)
> {code}
> Here is the related assertion:
> {code}
> assertEquals(0.0, regionServer.getFlushPressure(), EPSILON);
> {code}
> where EPSILON = 1E-6
> In the above case, the assertion failed due to a margin of about 2.9E-7.
> It seems the epsilon can be adjusted to accommodate different workload / 
> hardware combinations.
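The delta-form double assertion behaves like the helper below: it passes only when |expected - actual| <= epsilon. With EPSILON = 1E-6, the observed flush pressure 1.2906294173808417E-6 misses by roughly 2.9E-7, which is exactly why the test failed; a slightly larger epsilon absorbs that margin. This is a sketch of the semantics, not the JUnit implementation itself.

```java
class EpsilonCheck {
    // Semantics of a delta-based double assertion: pass only when the
    // absolute difference is within epsilon.
    static boolean withinDelta(double expected, double actual, double epsilon) {
        return Math.abs(expected - actual) <= epsilon;
    }
}
```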



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21097) Flush pressure assertion may fail in testFlushThroughputTuning

2018-09-12 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-21097:
---
Attachment: 21097.v4.txt

> Flush pressure assertion may fail in testFlushThroughputTuning 
> ---
>
> Key: HBASE-21097
> URL: https://issues.apache.org/jira/browse/HBASE-21097
> Project: HBase
>  Issue Type: Test
>  Components: regionserver
>    Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3
>
> Attachments: 21097.v1.txt, 21097.v2.txt, 21097.v4.txt, 
> HBASE-21097.patch
>
>
> From 
> https://builds.apache.org/job/PreCommit-HBASE-Build/14137/artifact/patchprocess/patch-unit-hbase-server.txt
>  :
> {code}
> [ERROR] 
> testFlushThroughputTuning(org.apache.hadoop.hbase.regionserver.throttle.TestFlushWithThroughputController)
>   Time elapsed: 17.446 s  <<< FAILURE!
> java.lang.AssertionError: expected:<0.0> but was:<1.2906294173808417E-6>
>   at 
> org.apache.hadoop.hbase.regionserver.throttle.TestFlushWithThroughputController.testFlushThroughputTuning(TestFlushWithThroughputController.java:185)
> {code}
> Here is the related assertion:
> {code}
> assertEquals(0.0, regionServer.getFlushPressure(), EPSILON);
> {code}
> where EPSILON = 1E-6
> In the above case, the assertion failed due to a margin of about 2.9E-7.
> It seems the epsilon can be adjusted to accommodate different workload / 
> hardware combinations.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21160) Assertion in TestVisibilityLabelsWithDeletes#testDeleteColumnsWithoutAndWithVisibilityLabels is ignored

2018-09-12 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16612119#comment-16612119
 ] 

Ted Yu commented on HBASE-21160:


Since the catch block currently just re-throws as an IOException, the catch 
block is no longer needed.

Please run with the change locally before attaching the patch.

Thanks

> Assertion in 
> TestVisibilityLabelsWithDeletes#testDeleteColumnsWithoutAndWithVisibilityLabels
>  is ignored
> ---
>
> Key: HBASE-21160
> URL: https://issues.apache.org/jira/browse/HBASE-21160
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>Assignee: liubangchen
>Priority: Trivial
>
> From 
> https://builds.apache.org/job/PreCommit-HBASE-Build/14327/artifact/patchprocess/diff-compile-javac-hbase-server.txt
>  (HBASE-21138 QA run):
> {code}
> [WARNING] 
> /testptch/hbase/hbase-server/src/test/java/org/apache/hadoop/hbase/security/visibility/TestVisibilityLabelsWithDeletes.java:[315,25]
>  [AssertionFailureIgnored] This assertion throws an AssertionError if it 
> fails, which will be caught by an enclosing try block.
> {code}
> Here is related code:
> {code}
>   PrivilegedExceptionAction scanAction = new 
> PrivilegedExceptionAction() {
> @Override
> public Void run() throws Exception {
>   try (Connection connection = 
> ConnectionFactory.createConnection(conf);
> ...
> assertEquals(1, next.length);
>   } catch (Throwable t) {
> throw new IOException(t);
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21097) Flush pressure assertion may fail in testFlushThroughputTuning

2018-09-11 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-21097:
---
Attachment: (was: 21097.v3.txt)

> Flush pressure assertion may fail in testFlushThroughputTuning 
> ---
>
> Key: HBASE-21097
> URL: https://issues.apache.org/jira/browse/HBASE-21097
> Project: HBase
>  Issue Type: Test
>    Reporter: Ted Yu
>    Assignee: Ted Yu
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: 21097.v1.txt, 21097.v2.txt, 21097.v3.txt, 
> HBASE-21097.patch
>
>
> From 
> https://builds.apache.org/job/PreCommit-HBASE-Build/14137/artifact/patchprocess/patch-unit-hbase-server.txt
>  :
> {code}
> [ERROR] 
> testFlushThroughputTuning(org.apache.hadoop.hbase.regionserver.throttle.TestFlushWithThroughputController)
>   Time elapsed: 17.446 s  <<< FAILURE!
> java.lang.AssertionError: expected:<0.0> but was:<1.2906294173808417E-6>
>   at 
> org.apache.hadoop.hbase.regionserver.throttle.TestFlushWithThroughputController.testFlushThroughputTuning(TestFlushWithThroughputController.java:185)
> {code}
> Here is the related assertion:
> {code}
> assertEquals(0.0, regionServer.getFlushPressure(), EPSILON);
> {code}
> where EPSILON = 1E-6
> In the above case, the assertion failed due to a margin of about 2.9E-7.
> It seems the epsilon can be adjusted to accommodate different workload / 
> hardware combinations.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21097) Flush pressure assertion may fail in testFlushThroughputTuning

2018-09-11 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-21097:
---
Attachment: 21097.v3.txt

> Flush pressure assertion may fail in testFlushThroughputTuning 
> ---
>
> Key: HBASE-21097
> URL: https://issues.apache.org/jira/browse/HBASE-21097
> Project: HBase
>  Issue Type: Test
>    Reporter: Ted Yu
>    Assignee: Ted Yu
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: 21097.v1.txt, 21097.v2.txt, 21097.v3.txt, 
> HBASE-21097.patch
>
>
> From 
> https://builds.apache.org/job/PreCommit-HBASE-Build/14137/artifact/patchprocess/patch-unit-hbase-server.txt
>  :
> {code}
> [ERROR] 
> testFlushThroughputTuning(org.apache.hadoop.hbase.regionserver.throttle.TestFlushWithThroughputController)
>   Time elapsed: 17.446 s  <<< FAILURE!
> java.lang.AssertionError: expected:<0.0> but was:<1.2906294173808417E-6>
>   at 
> org.apache.hadoop.hbase.regionserver.throttle.TestFlushWithThroughputController.testFlushThroughputTuning(TestFlushWithThroughputController.java:185)
> {code}
> Here is the related assertion:
> {code}
> assertEquals(0.0, regionServer.getFlushPressure(), EPSILON);
> {code}
> where EPSILON = 1E-6
> In the above case, the assertion failed due to a margin of about 2.9E-7.
> It seems the epsilon can be adjusted to accommodate different workload / 
> hardware combinations.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KYLIN-3557) PreparedStatement should be closed in JDBCResourceDAO#checkTableExists

2018-09-11 Thread Ted Yu (JIRA)
Ted Yu created KYLIN-3557:
-

 Summary: PreparedStatement should be closed in 
JDBCResourceDAO#checkTableExists
 Key: KYLIN-3557
 URL: https://issues.apache.org/jira/browse/KYLIN-3557
 Project: Kylin
  Issue Type: Bug
Reporter: Ted Yu


{code}
final PreparedStatement ps = 
connection.prepareStatement(getCheckTableExistsSql(tableName));
final ResultSet rs = ps.executeQuery();
{code}
{{ps}} should be closed upon return.
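One way to guarantee closing is try-with-resources. The stub below is illustrative (not Kylin's actual code, and no database is involved): an AutoCloseable declared in the resource clause is closed on every exit path, including the return itself.

```java
class CloseOnReturnDemo {
    // Stand-in for PreparedStatement, so the behavior is demonstrable
    // without a database connection.
    static class StubStatement implements AutoCloseable {
        boolean closed = false;
        @Override public void close() { closed = true; }
    }

    // Mirrors the shape of checkTableExists: declaring the statement in a
    // try-with-resources closes it even on the return path.
    static StubStatement queryAndReturn() {
        StubStatement ps = new StubStatement();
        try (StubStatement resource = ps) {
            // ... executeQuery(), inspect the ResultSet ...
            return ps;
        }
    }
}
```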



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KYLIN-3556) Interned string should not be used as lock object

2018-09-11 Thread Ted Yu (JIRA)
Ted Yu created KYLIN-3556:
-

 Summary: Interned string should not be used as lock object
 Key: KYLIN-3556
 URL: https://issues.apache.org/jira/browse/KYLIN-3556
 Project: Kylin
  Issue Type: Bug
Reporter: Ted Yu


In JDBCResourceDAO :
{code}
public void execute(Connection connection) throws SQLException {
synchronized (resPath.intern()) {
{code}
Locking on an interned string can cause unexpected locking collisions with 
other parts of the code.
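The collision risk exists because interning maps every equal string in the JVM to one shared object, so unrelated call sites that synchronize on equal interned strings share a monitor. A common alternative, sketched below with illustrative names, is a dedicated lock object per key held in a map scoped to the class.

```java
import java.util.concurrent.ConcurrentHashMap;

class LockObjects {
    // One private lock object per logical key: equal keys still coordinate,
    // but the monitors are not shared JVM-wide like interned strings are.
    private static final ConcurrentHashMap<String, Object> LOCKS =
        new ConcurrentHashMap<>();

    static Object lockFor(String resPath) {
        return LOCKS.computeIfAbsent(resPath, k -> new Object());
    }
}
```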



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-9924) Upgrade zookeeper to 3.4.13

2018-09-11 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-9924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated FLINK-9924:
--
Description: 
zookeeper 3.4.13 is being released.

ZOOKEEPER-2959 fixes data loss when observer is used
ZOOKEEPER-2184 allows ZooKeeper Java clients to work in dynamic IP (container / 
cloud) environment

  was:
zookeeper 3.4.13 is being released.

ZOOKEEPER-2959 fixes data loss when observer is used

ZOOKEEPER-2184 allows ZooKeeper Java clients to work in dynamic IP (container / 
cloud) environment


> Upgrade zookeeper to 3.4.13
> ---
>
> Key: FLINK-9924
> URL: https://issues.apache.org/jira/browse/FLINK-9924
> Project: Flink
>  Issue Type: Task
>    Reporter: Ted Yu
>Assignee: vinoyang
>Priority: Major
>
> zookeeper 3.4.13 is being released.
> ZOOKEEPER-2959 fixes data loss when observer is used
> ZOOKEEPER-2184 allows ZooKeeper Java clients to work in dynamic IP (container 
> / cloud) environment



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KYLIN-3447) Upgrade zookeeper to 3.4.13

2018-09-11 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-3447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated KYLIN-3447:
--
Description: 
zookeeper 3.4.13 is being released with the following fixes:

ZOOKEEPER-2959 fixes data loss when observer is used
ZOOKEEPER-2184 allows ZooKeeper Java clients to work in dynamic IP (container / 
cloud)
environment

  was:
zookeeper 3.4.13 is being released with the following fixes:

ZOOKEEPER-2959 fixes data loss when observer is used

ZOOKEEPER-2184 allows ZooKeeper Java clients to work in dynamic IP (container / 
cloud)
environment


> Upgrade zookeeper to 3.4.13
> ---
>
> Key: KYLIN-3447
> URL: https://issues.apache.org/jira/browse/KYLIN-3447
> Project: Kylin
>  Issue Type: Improvement
>    Reporter: Ted Yu
>Priority: Major
>
> zookeeper 3.4.13 is being released with the following fixes:
> ZOOKEEPER-2959 fixes data loss when observer is used
> ZOOKEEPER-2184 allows ZooKeeper Java clients to work in dynamic IP (container 
> / cloud)
> environment



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21179) Fix the number of actions in responseTooSlow log

2018-09-11 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16610426#comment-16610426
 ] 

Ted Yu commented on HBASE-21179:


Looks good.
The "actions" can be written as "action(s)"

> Fix the number of actions in responseTooSlow log
> 
>
> Key: HBASE-21179
> URL: https://issues.apache.org/jira/browse/HBASE-21179
> Project: HBase
>  Issue Type: Bug
>  Components: rpc
>Reporter: Guangxu Cheng
>Assignee: Guangxu Cheng
>Priority: Major
> Attachments: HBASE-21179.master.001.patch
>
>
> {panel:title=responseTooSlow|borderStyle=dashed|borderColor=#ccc|titleBGColor=#F7D6C1|bgColor=#CE}
> 2018-09-10 16:13:53,022 WARN  
> [B.DefaultRpcServer.handler=209,queue=29,port=60020] ipc.RpcServer: 
> (responseTooSlow): 
> {"processingtimems":321262,"call":"Multi(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$MultiRequest)","client":"127.0.0.1:56149","param":"region=
>  
> tsdb,\\x00\\x00.[\\x89\\x1F\\xB0\\x00\\x00\\x01\\x00\\x01Y\\x00\\x00\\x02\\x00\\x00\\x04,1536133210446.7c752de470bd5558a001117b123a5db5.,
>  {color:red}for 1 actions and 1st row{color} 
> key=\\x00\\x00.[\\x96\\x16p","starttimems":1536566911759,"queuetimems":0,"class":"HRegionServer","responsesize":2,"method":"Multi"}
> {panel}
> The responseTooSlow log is printed when the processing time of a request 
> exceeds the specified threshold. The number of actions and the contents of 
> the first rowkey in the request will be included in the log.
> However, the number of actions is inaccurate; it is actually the number 
> of regions that the request needs to visit.
> As in the log above, users may mistakenly conclude that 321262 ms was spent 
> processing a single action, which is incredible, so we need to fix it.
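The distinction can be sketched with a plain map (illustrative types, not HBase's MultiRequest): the misleading figure is the number of region entries, while the number that should be logged is the total count of actions across all regions.

```java
import java.util.*;

class MultiActionCount {
    // Illustrative fix: report the total number of actions in the multi
    // request, not the number of region entries in the map.
    static int totalActions(Map<String, List<String>> actionsByRegion) {
        int total = 0;
        for (List<String> actions : actionsByRegion.values()) {
            total += actions.size();
        }
        return total;
    }
}
```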



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21173) Remove the duplicate HRegion#close in TestHRegion

2018-09-10 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-21173:
---
Priority: Minor  (was: Major)

> Remove the duplicate HRegion#close in TestHRegion
> -
>
> Key: HBASE-21173
> URL: https://issues.apache.org/jira/browse/HBASE-21173
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.0.0, 2.2.0
>Reporter: Guangxu Cheng
>Assignee: Guangxu Cheng
>Priority: Minor
> Attachments: HBASE-21173.master.001.patch, 
> HBASE-21173.master.002.patch
>
>
>  After HBASE-21138, some test methods still have a duplicate 
> HRegion#close. So open this issue to remove the duplicate close.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21173) Remove the duplicate HRegion#close in TestHRegion

2018-09-10 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609869#comment-16609869
 ] 

Ted Yu commented on HBASE-21173:


+1

> Remove the duplicate HRegion#close in TestHRegion
> -
>
> Key: HBASE-21173
> URL: https://issues.apache.org/jira/browse/HBASE-21173
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.0.0, 2.2.0
>Reporter: Guangxu Cheng
>Assignee: Guangxu Cheng
>Priority: Major
> Attachments: HBASE-21173.master.001.patch, 
> HBASE-21173.master.002.patch
>
>
>  After HBASE-21138, some test methods still have a duplicate 
> HRegion#close. So open this issue to remove the duplicate close.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Extremely high CPU usage after upgrading to Hbase 1.4.4

2018-09-10 Thread Ted Yu
In the previous stack trace you sent, shortCompactions and longCompactions
threads were not active.

Was the stack trace captured during period when the number of client
operations was low ?

If not, can you capture stack trace during off peak hours ?

Cheers

On Mon, Sep 10, 2018 at 12:08 PM Srinidhi Muppalla 
wrote:

> Hi Ted,
>
> The highest number of filters used is 10, but the average is generally
> close to 1. Is it possible the CPU usage spike has to do with Hbase
> internal maintenance operations? It looks like post-upgrade the spike isn’t
> correlated with the frequency of reads/writes we are making, because the
> high CPU usage persisted when the number of operations went down.
>
> Thank you,
> Srinidhi
>
> On 9/8/18, 9:44 AM, "Ted Yu"  wrote:
>
> Srinidhi :
> Do you know the average / highest number of ColumnPrefixFilter's in the
> FilterList ?
>
> Thanks
>
> On Fri, Sep 7, 2018 at 10:00 PM Ted Yu  wrote:
>
> > Thanks for detailed background information.
> >
> > I assume your code has done de-dup for the filters contained in
> > FilterListWithOR.
> >
> > I took a look at JIRAs which
> > touched hbase-client/src/main/java/org/apache/hadoop/hbase/filter in
> > branch-1.4
> > There were a few patches (some were very big) since the release of
> 1.3.0
> > So it is not obvious at first glance which one(s) might be related.
> >
> > I noticed ColumnPrefixFilter.getNextCellHint (and
> > KeyValueUtil.createFirstOnRow) appearing many times in the stack
> trace.
> >
> > I plan to dig more in this area.
> >
> > Cheers
> >
> > On Fri, Sep 7, 2018 at 11:30 AM Srinidhi Muppalla <
> srinid...@trulia.com>
> > wrote:
> >
> >> Sure thing. For our table schema, each row represents one user and
> the
> >> row key is that user’s unique id in our system. We currently only
> use one
> >> column family in the table. The column qualifiers represent an item
> that
> >> has been surfaced to that user as well as additional information to
> >> differentiate the way the item has been surfaced to the user.
> Without
> >> getting into too many specifics, the qualifier follows the rough
> format of:
> >>
> >> “Channel-itemId-distinguisher”.
> >>
> >> The channel here is the channel through the item was previously
> surfaced
> >> to the user. The itemid is the unique id of the item that has been
> surfaced
> >> to the user. A distinguisher is some attribute about how that item
> was
> >> surfaced to the user.
> >>
> >> When we run a scan, we currently only ever run it on one row at a
> time.
> >> It was chosen over ‘get’ because (from our understanding) the
> performance
> >> difference is negligible, and down the road using scan would allow
> us some
> >> more flexibility.
> >>
> >> The filter list that is constructed with scan works by using a
> >> ColumnPrefixFilter as you mentioned. When a user is being
> communicated to
> >> on a particular channel, we have a list of items that we want to
> >> potentially surface for that user. So, we construct a prefix list
> with the
> >> channel and each of the item ids in the form of: “channel-itemId”.
> Then we
> >> run a scan on that row with that filter list using “WithOr” to get
> all of
> >> the matching channel-itemId combinations currently in that
> row/column
> >> family in the table. This way we can then know which of the items
> we want
> >> to surface to that user on that channel have already been surfaced
> on that
> >> channel. The reason we query using a prefix filter is so that we
> don’t need
> >> to know the ‘distinguisher’ part of the record when writing the
> actual
> >> query, because the distinguisher is only relevant in certain
> circumstances.
> >>
> >> Let me know if this is the information about our query pattern that
> you
> >> were looking for and if there is anything I can clarify or add.
> >>
> >> Thanks,
> >> Srinidhi
> >>
> >> On 9/6/18, 12:24 PM, "Ted Yu"  wrote:
> >>
> >> From the stack trace, ColumnPrefixFilter is used during scan.
> >>
> >> Can you illustrate how various filters are formed thru
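The single-row, prefix-based query pattern Srinidhi describes corresponds to a FilterList with Operator.MUST_PASS_ONE wrapping one ColumnPrefixFilter per "channel-itemId" prefix. The sketch below models only that matching semantics in plain Java, without HBase dependencies; the qualifier and prefix values are made up for illustration.

```java
import java.util.List;

// Plain-Java model of the query semantics described in this thread: a column
// qualifier is kept if it starts with ANY "channel-itemId" prefix, mirroring
// what a FilterList(MUST_PASS_ONE) of ColumnPrefixFilters evaluates per cell.
public class PrefixMatch {
    public static boolean matchesAny(String qualifier, List<String> prefixes) {
        // Linear scan over the prefixes; each one acts like a ColumnPrefixFilter.
        for (String p : prefixes) {
            if (qualifier.startsWith(p)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Hypothetical qualifiers of the form "channel-itemId-distinguisher".
        List<String> prefixes = List.of("email-42", "push-7");
        System.out.println(matchesAny("email-42-v2", prefixes)); // true
        System.out.println(matchesAny("sms-42-v1", prefixes));   // false
    }
}
```

With ten or fewer prefixes per scan (as reported above), this per-cell matching cost is small; the CPU regression discussed in the thread is more likely in how the filter list seeks between cells than in the prefix comparison itself.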

[jira] [Commented] (HBASE-20952) Re-visit the WAL API

2018-09-10 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609624#comment-16609624
 ] 

Ted Yu commented on HBASE-20952:


Thanks for helpful comments.

bq. comments nor copy/paste allowed

Googledoc interface changed recently. I have given everyone the "comment" 
permission.

Let me study / think about every point raised above.

Will update the doc.

> Re-visit the WAL API
> 
>
> Key: HBASE-20952
> URL: https://issues.apache.org/jira/browse/HBASE-20952
> Project: HBase
>  Issue Type: Sub-task
>  Components: wal
>Reporter: Josh Elser
>Priority: Major
> Attachments: 20952.v1.txt
>
>
> Take a step back from the current WAL implementations and think about what an 
> HBase WAL API should look like. What are the primitive calls that we require 
> to guarantee durability of writes with a high degree of performance?
> The API needs to take the current implementations into consideration. We 
> should also have a mind for what is happening in the Ratis LogService (but 
> the LogService should not dictate what HBase's WAL API looks like RATIS-272).
> Other "systems" inside of HBase that use WALs are replication and 
> backup. Replication has the use-case for "tail"ing the WAL, which we 
> should provide via our new API. Backup doesn't do anything fancy (IIRC). We 
> should make sure all consumers are generally going to be OK with the API we 
> create.
> The API may be "OK" (or OK in part). We also need to consider other methods 
> which were "bolted" on, such as {{AbstractFSWAL}} and 
> {{WALFileLengthProvider}}. Other corners of "WAL use" (like 
> {{WALSplitter}}) should also be looked at to use WAL APIs only.
> We also need to make sure that adequate interface audience and stability 
> annotations are chosen.
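To make the "primitive calls" question concrete, here is a deliberately minimal in-memory sketch of the append/sync shape such an API might take. All names and signatures here are assumptions for discussion, not the API proposed in the design doc or review request.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative-only WAL shape: append() stages an edit and returns its
// sequence id; sync(id) blocks until that id is durable. A real WAL would
// flush to storage; this in-memory stand-in treats "in the list" as durable.
public class WalSketch {
    public interface WAL {
        long append(byte[] entry);   // stage an edit, return its sequence id
        void sync(long sequenceId);  // block until sequenceId is durable
    }

    public static class InMemoryWAL implements WAL {
        private final List<byte[]> entries = new ArrayList<>();

        public synchronized long append(byte[] entry) {
            entries.add(entry);
            return entries.size() - 1; // sequence id = index of the entry
        }

        public synchronized void sync(long sequenceId) {
            if (sequenceId >= entries.size()) {
                throw new IllegalStateException("unknown sequence id");
            }
            // In-memory: already "durable"; nothing to flush.
        }

        public synchronized int size() {
            return entries.size();
        }
    }

    public static void main(String[] args) {
        InMemoryWAL wal = new InMemoryWAL();
        long seq = wal.append("edit-1".getBytes());
        wal.sync(seq);
        System.out.println(seq); // 0: first appended edit
    }
}
```

Tailing (the replication use-case above) would add a read-side primitive on top of these two; the point of the sketch is only that durability can be expressed with a sequence-id contract independent of any filesystem.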



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KYLIN-3394) Prepare for Kafka 2.0

2018-09-10 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-3394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated KYLIN-3394:
--
Description: 
Kafka 2.0 is around the corner.

I got the following when compiling against Kafka 2.0.0:

{code}
[ERROR] /a/kylin/kylin-it/src/test/scala/kafka/admin/AdminUtils.scala:266: 
error: type mismatch;
[ERROR]  found   : Option[org.apache.kafka.common.Node]
[ERROR]  required: org.apache.kafka.common.Node
[ERROR] getBrokerInfoFromCache(zkUtils, cachedBrokerInfo, 
List(l)).head.getNode(listenerName)
[ERROR] 
   ^
[ERROR] /a/kylin/kylin-it/src/test/scala/kafka/admin/AdminUtils.scala:273: 
error: type mismatch;
[ERROR]  found   : Seq[Option[org.apache.kafka.common.Node]]
[ERROR]  required: Seq[org.apache.kafka.common.Node]
[ERROR] replicaInfo = getBrokerInfoFromCache(zkUtils, 
cachedBrokerInfo, replicas).map(_.getNode(listenerName))
[ERROR] 
 ^
[ERROR] /a/kylin/kylin-it/src/test/scala/kafka/admin/AdminUtils.scala:274: 
error: type mismatch;
[ERROR]  found   : Seq[Option[org.apache.kafka.common.Node]]
[ERROR]  required: Seq[org.apache.kafka.common.Node]
[ERROR] isrInfo = getBrokerInfoFromCache(zkUtils, cachedBrokerInfo, 
inSyncReplicas).map(_.getNode(listenerName))
[ERROR] 
   ^
[ERROR] /a/kylin/kylin-it/src/test/scala/kafka/admin/AdminUtils.scala:476: 
error: value getConsumersInGroup is not a member of kafka.utils.ZkUtils
[ERROR] zkUtils.getConsumersInGroup(group).nonEmpty
[ERROR] ^
[ERROR] /a/kylin/kylin-it/src/test/scala/kafka/admin/AdminUtils.scala:489: 
error: not found: type ZKGroupDirs
[ERROR]   val dir = new ZKGroupDirs(group)
[ERROR] ^
[ERROR] /a/kylin/kylin-it/src/test/scala/kafka/admin/AdminUtils.scala:507: 
error: value getTopicsByConsumerGroup is not a member of kafka.utils.ZkUtils
[ERROR] val topics = zkUtils.getTopicsByConsumerGroup(group)
[ERROR]  ^
[ERROR] /a/kylin/kylin-it/src/test/scala/kafka/admin/AdminUtils.scala:512: 
error: not found: type ZKGroupTopicDirs
[ERROR]   val dir = new ZKGroupTopicDirs(group, topic)
[ERROR] ^
[ERROR] /a/kylin/kylin-it/src/test/scala/kafka/admin/AdminUtils.scala:528: 
error: value getAllConsumerGroupsForTopic is not a member of kafka.utils.ZkUtils
[ERROR] val groups = zkUtils.getAllConsumerGroupsForTopic(topic)
[ERROR]  ^
[ERROR] /a/kylin/kylin-it/src/test/scala/kafka/admin/AdminUtils.scala:720: 
error: value encode is not a member of object kafka.utils.Json
[ERROR] val content = 
Json.encode(getConfigChangeZnodeData(sanitizedEntityPath))
[ERROR]^
[ERROR] /a/kylin/kylin-it/src/test/scala/kafka/admin/AdminUtils.scala:733: 
error: value encode is not a member of object kafka.utils.Json
[ERROR] zkUtils.updatePersistentPath(entityPath, Json.encode(map))
{code}

  was:
Kafka 2.0 is around the corner.

I got the following when compiling against Kafka 2.0.0-SNAPSHOT :

{code}
[ERROR] 
/a/kylin/kylin-it/src/test/java/org/apache/kylin/provision/MockKafka.java:[79,74]
 error: cannot find symbol
[ERROR]   symbol:   method serverConfig()
[ERROR]   location: variable kafkaServer of type KafkaServerStartable
[ERROR] 
/a/kylin/kylin-it/src/test/java/org/apache/kylin/provision/MockKafka.java:[79,113]
 error: cannot find symbol
[ERROR]   symbol:   method serverConfig()
[ERROR]   location: variable kafkaServer of type KafkaServerStartable
[ERROR] 
/a/kylin/kylin-it/src/test/java/org/apache/kylin/provision/MockKafka.java:[79,148]
 error: cannot find symbol
[ERROR]   symbol:   method serverConfig()
[ERROR]   location: variable kafkaServer of type KafkaServerStartable
[ERROR] 
/a/kylin/kylin-it/src/test/java/org/apache/kylin/provision/MockKafka.java:[98,65]
 error: cannot find symbol
{code}


> Prepare for Kafka 2.0
> -
>
> Key: KYLIN-3394
> URL: https://issues.apache.org/jira/browse/KYLIN-3394
> Project: Kylin
>  Issue Type: Task
>    Reporter: Ted Yu
>Priority: Major
>
> Kafka 2.0 is around the corner.
> I got the following when compiling against Kafka 2.0.0:
> {code}
> [ERROR] /a/kylin/kylin-it/src/test/scala/kafka/admin/AdminUtils.scala:266: 
> error: type mismatch;
> [ERROR]  found   : Option[org.apache.kafka.common.Node]
> [ERROR]  required: org.apache.kafka.common.Node
> [ERROR] getBrokerInfoFromCache(zkUtils, cachedBrokerInfo, 
> List(l)).head.getNode(listenerName)
> [ERROR]   
>   

[jira] [Created] (HBASE-21180) findbugs incurs DataflowAnalysisException for hbase-server module

2018-09-10 Thread Ted Yu (JIRA)
Ted Yu created HBASE-21180:
--

 Summary: findbugs incurs DataflowAnalysisException for 
hbase-server module
 Key: HBASE-21180
 URL: https://issues.apache.org/jira/browse/HBASE-21180
 Project: HBase
  Issue Type: Task
Reporter: Ted Yu


Running findbugs, I noticed the following in hbase-server module:
{code}
[INFO] --- findbugs-maven-plugin:3.0.4:findbugs (default-cli) @ hbase-server ---
[INFO] Fork Value is true
 [java] The following errors occurred during analysis:
 [java]   Error generating derefs for 
org.apache.hadoop.hbase.generated.master.table_jsp._jspService(Ljavax/servlet/http/HttpServletRequest;Ljavax/servlet/http/HttpServletResponse;)V
 [java] edu.umd.cs.findbugs.ba.DataflowAnalysisException: can't get 
position -1 of stack
 [java]   At edu.umd.cs.findbugs.ba.Frame.getStackValue(Frame.java:250)
 [java]   At 
edu.umd.cs.findbugs.ba.Hierarchy.resolveMethodCallTargets(Hierarchy.java:743)
 [java]   At 
edu.umd.cs.findbugs.ba.npe.DerefFinder.getAnalysis(DerefFinder.java:141)
 [java]   At 
edu.umd.cs.findbugs.classfile.engine.bcel.UsagesRequiringNonNullValuesFactory.analyze(UsagesRequiringNonNullValuesFactory.java:50)
 [java]   At 
edu.umd.cs.findbugs.classfile.engine.bcel.UsagesRequiringNonNullValuesFactory.analyze(UsagesRequiringNonNullValuesFactory.java:31)
 [java]   At 
edu.umd.cs.findbugs.classfile.impl.AnalysisCache.analyzeMethod(AnalysisCache.java:369)
 [java]   At 
edu.umd.cs.findbugs.classfile.impl.AnalysisCache.getMethodAnalysis(AnalysisCache.java:322)
 [java]   At 
edu.umd.cs.findbugs.ba.ClassContext.getMethodAnalysis(ClassContext.java:1005)
 [java]   At 
edu.umd.cs.findbugs.ba.ClassContext.getUsagesRequiringNonNullValues(ClassContext.java:325)
 [java]   At 
edu.umd.cs.findbugs.detect.FindNullDeref.foundGuaranteedNullDeref(FindNullDeref.java:1510)
 [java]   At 
edu.umd.cs.findbugs.ba.npe.NullDerefAndRedundantComparisonFinder.reportBugs(NullDerefAndRedundantComparisonFinder.java:361)
 [java]   At 
edu.umd.cs.findbugs.ba.npe.NullDerefAndRedundantComparisonFinder.examineNullValues(NullDerefAndRedundantComparisonFinder.java:266)
 [java]   At 
edu.umd.cs.findbugs.ba.npe.NullDerefAndRedundantComparisonFinder.execute(NullDerefAndRedundantComparisonFinder.java:164)
 [java]   At 
edu.umd.cs.findbugs.detect.FindNullDeref.analyzeMethod(FindNullDeref.java:278)
 [java]   At 
edu.umd.cs.findbugs.detect.FindNullDeref.visitClassContext(FindNullDeref.java:209)
 [java]   At 
edu.umd.cs.findbugs.DetectorToDetector2Adapter.visitClass(DetectorToDetector2Adapter.java:76)
 [java]   At 
edu.umd.cs.findbugs.FindBugs2.analyzeApplication(FindBugs2.java:1089)
 [java]   At edu.umd.cs.findbugs.FindBugs2.execute(FindBugs2.java:283)
 [java]   At edu.umd.cs.findbugs.FindBugs.runMain(FindBugs.java:393)
 [java]   At edu.umd.cs.findbugs.FindBugs2.main(FindBugs2.java:1200)
 [java] The following classes needed for analysis were missing:
 [java]   accept
 [java]   apply
 [java]   run
 [java]   test
 [java]   call
 [java]   exec
 [java]   getAsInt
 [java]   applyAsLong
 [java]   storeFile
 [java]   get
 [java]   visit
 [java]   compare
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Improving on MTTR of cluster [Hbase - 1.1.13]

2018-09-10 Thread Ted Yu
For the second config you mentioned, hbase.master.distributed.log.replay,
see http://hbase.apache.org/book.html#upgrade2.0.distributed.log.replay

FYI

On Mon, Sep 10, 2018 at 8:52 AM sahil aggarwal 
wrote:

> Hi,
>
> My cluster has around 50k regions and 130 RS. In case of unclean shutdown,
> the cluster takes around 40 to 50 mins to come up (mostly slow on region
> assignment, from observation). Trying to optimize it, I found the following
> possible configs:
>
> *hbase.assignment.usezk:* co-hosts the meta table with HMaster and
> avoids zk interaction for region assignment.
> *hbase.master.distributed.log.replay:* replays the edit logs in a
> distributed manner.
>
>
> Testing *hbase.assignment.usezk* alone on a small cluster (2200 regions, 4 RS)
> gave the following results:
>
> hbase.assignment.usezk=true -> 12 mins
> hbase.assignment.usezk=false -> 9 mins
>
>
> From this blog, I was expecting better results, so probably I am missing
> something. Will appreciate any pointers.
>
> Thanks,
> Sahil
>


[jira] [Commented] (HBASE-20952) Re-visit the WAL API

2018-09-10 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609379#comment-16609379
 ] 

Ted Yu commented on HBASE-20952:


This is the google doc :

https://docs.google.com/document/d/141FDNSKHIY0DZeIWQd1Dc1QOw-3zlZxUB4Jqabch24c/edit?usp=sharing

This is the condensed version of the review request:

https://reviews.apache.org/r/68672/

The condensed version closely matches the googledoc in terms of key interfaces 
/ classes.

> Re-visit the WAL API
> 
>
> Key: HBASE-20952
> URL: https://issues.apache.org/jira/browse/HBASE-20952
> Project: HBase
>  Issue Type: Sub-task
>  Components: wal
>Reporter: Josh Elser
>Priority: Major
> Attachments: 20952.v1.txt
>
>
> Take a step back from the current WAL implementations and think about what an 
> HBase WAL API should look like. What are the primitive calls that we require 
> to guarantee durability of writes with a high degree of performance?
> The API needs to take the current implementations into consideration. We 
> should also have a mind for what is happening in the Ratis LogService (but 
> the LogService should not dictate what HBase's WAL API looks like RATIS-272).
> Other "systems" inside of HBase that use WALs are replication and 
> backup. Replication has the use-case for "tail"ing the WAL, which we 
> should provide via our new API. Backup doesn't do anything fancy (IIRC). We 
> should make sure all consumers are generally going to be OK with the API we 
> create.
> The API may be "OK" (or OK in part). We also need to consider other methods 
> which were "bolted" on, such as {{AbstractFSWAL}} and 
> {{WALFileLengthProvider}}. Other corners of "WAL use" (like 
> {{WALSplitter}}) should also be looked at to use WAL APIs only.
> We also need to make sure that adequate interface audience and stability 
> annotations are chosen.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-21175) Partially initialized SnapshotHFileCleaner leads to NPE during TestHFileArchiving

2018-09-09 Thread Ted Yu (JIRA)
Ted Yu created HBASE-21175:
--

 Summary: Partially initialized SnapshotHFileCleaner leads to NPE 
during TestHFileArchiving
 Key: HBASE-21175
 URL: https://issues.apache.org/jira/browse/HBASE-21175
 Project: HBase
  Issue Type: Test
Reporter: Ted Yu


TestHFileArchiving#testCleaningRace creates HFileCleaner instance within the 
test.
When SnapshotHFileCleaner.init() is called, there is no master parameter passed 
in {{params}}.

When the chore runs the cleaner during the test, NPE comes out of this line in 
getDeletableFiles():
{code}
  return cache.getUnreferencedFiles(files, master.getSnapshotManager());
{code}
since master is null.

We should either check for a null master or pass the master instance properly 
when constructing the cleaner instance.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21150) Avoid delay in first flushes due to overheads in table metrics registration

2018-09-09 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-21150:
---
Attachment: (was: 21150.v6.txt)

> Avoid delay in first flushes due to overheads in table metrics registration
> ---
>
> Key: HBASE-21150
> URL: https://issues.apache.org/jira/browse/HBASE-21150
> Project: HBase
>  Issue Type: Improvement
>    Reporter: Ted Yu
>Priority: Minor
> Attachments: 21150.v1.txt, 21150.v2.txt, 21150.v3.txt, 21150.v4.txt, 
> 21150.v4.txt, 21150.v5.txt
>
>
> After HBASE-15728 is integrated, the lazy table metrics registration results 
> in a penalty for the first flushes.
> Excerpt from log shows delay (note the same timestamp 08:18:23,234) :
> {code:java}
> 2018-09-02 08:18:23,232 DEBUG 
> [rs(hw13463.attlocal.net,52760,1535901497280)-snapshot-pool10-thread-2] 
> regionserver.MetricsTableSourceImpl(124): Creating new  
> MetricsTableSourceImpl for table 'testtb-1535901500805'
> 2018-09-02 08:18:23,233 DEBUG 
> [rs(hw13463.attlocal.net,52760,1535901497280)-snapshot-pool10-thread-2] 
> regionserver.MetricsTableSourceImpl(137): registering metrics for testtb-   
> 1535901500805
> 2018-09-02 08:18:23,234 INFO  
> [rs(hw13463.attlocal.net,52760,1535901497280)-snapshot-pool10-thread-1] 
> regionserver.HRegion(2822): Finished flush of dataSize ~2.29 KB/2343,   
> heapSize ~5.16 KB/5280, currentSize=0 B/0 for 
> fa403f6a4fb8dbc1a1c389744fce2d58 in 280ms, sequenceid=5, compaction 
> requested=false
> 2018-09-02 08:18:23,234 DEBUG 
> [rs(hw13463.attlocal.net,52758,1535901497238)-snapshot-pool11-thread-1] 
> regionserver.MetricsTableAggregateSourceImpl(84): it took 6 ms to register  
> testtb-1535901500805 
> Thread[rs(hw13463.attlocal.net,52758,1535901497238)-snapshot-pool11-thread-1,5,FailOnTimeoutGroup]
> 2018-09-02 08:18:23,234 DEBUG 
> [rs(hw13463.attlocal.net,52760,1535901497280)-snapshot-pool10-thread-1] 
> regionserver.MetricsTableAggregateSourceImpl(84): it took 0 ms to register  
> testtb-1535901500805 
> Thread[rs(hw13463.attlocal.net,52760,1535901497280)-snapshot-pool10-thread-1,5,FailOnTimeoutGroup]
> 2018-09-02 08:18:23,234 DEBUG 
> [rs(hw13463.attlocal.net,52762,1535901497314)-snapshot-pool9-thread-1] 
> regionserver.MetricsTableAggregateSourceImpl(84): it took 6 ms to register   
> testtb-1535901500805 
> Thread[rs(hw13463.attlocal.net,52762,1535901497314)-snapshot-pool9-thread-1,5,FailOnTimeoutGroup]
> 2018-09-02 08:18:23,234 DEBUG 
> [rs(hw13463.attlocal.net,52762,1535901497314)-snapshot-pool9-thread-2] 
> regionserver.MetricsTableAggregateSourceImpl(84): it took 6 ms to register   
> testtb-1535901500805 
> Thread[rs(hw13463.attlocal.net,52762,1535901497314)-snapshot-pool9-thread-2,5,FailOnTimeoutGroup]
> 2018-09-02 08:18:23,234 DEBUG 
> [rs(hw13463.attlocal.net,52758,1535901497238)-snapshot-pool11-thread-2] 
> regionserver.MetricsTableAggregateSourceImpl(84): it took 5 ms to register  
> testtb-1535901500805 
> Thread[rs(hw13463.attlocal.net,52758,1535901497238)-snapshot-pool11-thread-2,5,FailOnTimeoutGroup]
> 2018-09-02 08:18:23,234 DEBUG 
> [rs(hw13463.attlocal.net,52760,1535901497280)-snapshot-pool10-thread-2] 
> regionserver.MetricsTableAggregateSourceImpl(84): it took 6 ms to register  
> testtb-1535901500805 
> Thread[rs(hw13463.attlocal.net,52760,1535901497280)-snapshot-pool10-thread-2,5,FailOnTimeoutGroup]
> {code}
> This is a regression in that there were multiple (6 ms) delays before the 
> flush could finish, waiting for the table metrics to be registered.
> When the first region of the table is opened on a region server, we can 
> proactively register table metrics.
> This would avoid the penalty on first flushes for the table.
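The eager-registration idea above can be sketched with an idempotent registry. Names are hypothetical (the real code is MetricsTableAggregateSourceImpl); the sketch shows why calling the same method at region-open time and again at flush time is safe, and why the expensive work runs only once per table.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of proactive table-metrics registration: the first region open pays
// the registration cost, so later flushes only do a cheap map lookup.
// computeIfAbsent keeps registration idempotent under concurrent opens.
public class TableMetricsRegistry {
    private final Map<String, Object> sources = new ConcurrentHashMap<>();
    public int registrations = 0; // counts how often the expensive path ran

    public Object getOrRegister(String table) {
        return sources.computeIfAbsent(table, t -> {
            registrations++;     // expensive registration happens once per table
            return new Object(); // stand-in for a MetricsTableSource instance
        });
    }

    public static void main(String[] args) {
        TableMetricsRegistry r = new TableMetricsRegistry();
        // First call simulates region open (eager); second simulates flush time.
        Object a = r.getOrRegister("testtb-1535901500805");
        Object b = r.getOrRegister("testtb-1535901500805");
        System.out.println(a == b);          // true: same source reused
        System.out.println(r.registrations); // 1: registered only once
    }
}
```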



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21150) Avoid delay in first flushes due to overheads in table metrics registration

2018-09-09 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-21150:
---
Attachment: (was: 21150.v7.txt)

> Avoid delay in first flushes due to overheads in table metrics registration
> ---
>
> Key: HBASE-21150
> URL: https://issues.apache.org/jira/browse/HBASE-21150
> Project: HBase
>  Issue Type: Improvement
>    Reporter: Ted Yu
>Priority: Minor
> Attachments: 21150.v1.txt, 21150.v2.txt, 21150.v3.txt, 21150.v4.txt, 
> 21150.v4.txt, 21150.v5.txt
>
>
> After HBASE-15728 is integrated, the lazy table metrics registration results 
> in a penalty for the first flushes.
> Excerpt from log shows delay (note the same timestamp 08:18:23,234) :
> {code:java}
> 2018-09-02 08:18:23,232 DEBUG 
> [rs(hw13463.attlocal.net,52760,1535901497280)-snapshot-pool10-thread-2] 
> regionserver.MetricsTableSourceImpl(124): Creating new  
> MetricsTableSourceImpl for table 'testtb-1535901500805'
> 2018-09-02 08:18:23,233 DEBUG 
> [rs(hw13463.attlocal.net,52760,1535901497280)-snapshot-pool10-thread-2] 
> regionserver.MetricsTableSourceImpl(137): registering metrics for testtb-   
> 1535901500805
> 2018-09-02 08:18:23,234 INFO  
> [rs(hw13463.attlocal.net,52760,1535901497280)-snapshot-pool10-thread-1] 
> regionserver.HRegion(2822): Finished flush of dataSize ~2.29 KB/2343,   
> heapSize ~5.16 KB/5280, currentSize=0 B/0 for 
> fa403f6a4fb8dbc1a1c389744fce2d58 in 280ms, sequenceid=5, compaction 
> requested=false
> 2018-09-02 08:18:23,234 DEBUG 
> [rs(hw13463.attlocal.net,52758,1535901497238)-snapshot-pool11-thread-1] 
> regionserver.MetricsTableAggregateSourceImpl(84): it took 6 ms to register  
> testtb-1535901500805 
> Thread[rs(hw13463.attlocal.net,52758,1535901497238)-snapshot-pool11-thread-1,5,FailOnTimeoutGroup]
> 2018-09-02 08:18:23,234 DEBUG 
> [rs(hw13463.attlocal.net,52760,1535901497280)-snapshot-pool10-thread-1] 
> regionserver.MetricsTableAggregateSourceImpl(84): it took 0 ms to register  
> testtb-1535901500805 
> Thread[rs(hw13463.attlocal.net,52760,1535901497280)-snapshot-pool10-thread-1,5,FailOnTimeoutGroup]
> 2018-09-02 08:18:23,234 DEBUG 
> [rs(hw13463.attlocal.net,52762,1535901497314)-snapshot-pool9-thread-1] 
> regionserver.MetricsTableAggregateSourceImpl(84): it took 6 ms to register   
> testtb-1535901500805 
> Thread[rs(hw13463.attlocal.net,52762,1535901497314)-snapshot-pool9-thread-1,5,FailOnTimeoutGroup]
> 2018-09-02 08:18:23,234 DEBUG 
> [rs(hw13463.attlocal.net,52762,1535901497314)-snapshot-pool9-thread-2] 
> regionserver.MetricsTableAggregateSourceImpl(84): it took 6 ms to register   
> testtb-1535901500805 
> Thread[rs(hw13463.attlocal.net,52762,1535901497314)-snapshot-pool9-thread-2,5,FailOnTimeoutGroup]
> 2018-09-02 08:18:23,234 DEBUG 
> [rs(hw13463.attlocal.net,52758,1535901497238)-snapshot-pool11-thread-2] 
> regionserver.MetricsTableAggregateSourceImpl(84): it took 5 ms to register  
> testtb-1535901500805 
> Thread[rs(hw13463.attlocal.net,52758,1535901497238)-snapshot-pool11-thread-2,5,FailOnTimeoutGroup]
> 2018-09-02 08:18:23,234 DEBUG 
> [rs(hw13463.attlocal.net,52760,1535901497280)-snapshot-pool10-thread-2] 
> regionserver.MetricsTableAggregateSourceImpl(84): it took 6 ms to register  
> testtb-1535901500805 
> Thread[rs(hw13463.attlocal.net,52760,1535901497280)-snapshot-pool10-thread-2,5,FailOnTimeoutGroup]
> {code}
> This is a regression in that there were multiple (6 ms) delays before the 
> flush could finish, waiting for the table metrics to be registered.
> When the first region of the table is opened on a region server, we can 
> proactively register table metrics.
> This would avoid the penalty on first flushes for the table.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-16458) Shorten backup / restore test execution time

2018-09-09 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16608470#comment-16608470
 ] 

Ted Yu edited comment on HBASE-16458 at 9/9/18 4:29 PM:


HBASE-16458-v2.patch was the last one from Vlad.

Patch v5 is from me, based on Vlad's v2.


was (Author: yuzhih...@gmail.com):
HBASE-16458-v2.patch was the last one from Vlad.

Patch v5 is from me.

> Shorten backup / restore test execution time
> 
>
> Key: HBASE-16458
> URL: https://issues.apache.org/jira/browse/HBASE-16458
> Project: HBase
>  Issue Type: Test
>    Reporter: Ted Yu
>Assignee: Vladimir Rodionov
>Priority: Major
>  Labels: backup
> Attachments: 16458-v1.patch, 16458.HBASE-7912.v3.txt, 
> 16458.HBASE-7912.v4.txt, 16458.v1.txt, 16458.v2.txt, 16458.v2.txt, 
> 16458.v3.txt, 16458.v4.txt, 16458.v5.txt, HBASE-16458-v1.patch, 
> HBASE-16458-v2.patch
>
>
> Below was timing information for all the backup / restore tests (today's 
> result):
> {code}
> Running org.apache.hadoop.hbase.backup.TestIncrementalBackup
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 576.273 sec - 
> in org.apache.hadoop.hbase.backup.TestIncrementalBackup
> Running org.apache.hadoop.hbase.backup.TestBackupBoundaryTests
> Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 124.67 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupBoundaryTests
> Running org.apache.hadoop.hbase.backup.TestBackupStatusProgress
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 102.34 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupStatusProgress
> Running org.apache.hadoop.hbase.backup.TestBackupAdmin
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 490.251 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupAdmin
> Running org.apache.hadoop.hbase.backup.TestHFileArchiving
> Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 84.323 sec - 
> in org.apache.hadoop.hbase.backup.TestHFileArchiving
> Running org.apache.hadoop.hbase.backup.TestSystemTableSnapshot
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 65.492 sec - 
> in org.apache.hadoop.hbase.backup.TestSystemTableSnapshot
> Running org.apache.hadoop.hbase.backup.TestBackupDescribe
> Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 93.758 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupDescribe
> Running org.apache.hadoop.hbase.backup.TestBackupLogCleaner
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 109.187 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupLogCleaner
> Running org.apache.hadoop.hbase.backup.TestIncrementalBackupNoDataLoss
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 330.539 sec - 
> in org.apache.hadoop.hbase.backup.TestIncrementalBackupNoDataLoss
> Running org.apache.hadoop.hbase.backup.TestRemoteBackup
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 84.371 sec - 
> in org.apache.hadoop.hbase.backup.TestRemoteBackup
> Running org.apache.hadoop.hbase.backup.TestBackupSystemTable
> Tests run: 15, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 67.893 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupSystemTable
> Running org.apache.hadoop.hbase.backup.TestRestoreBoundaryTests
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 120.779 sec - 
> in org.apache.hadoop.hbase.backup.TestRestoreBoundaryTests
> Running org.apache.hadoop.hbase.backup.TestFullBackupSetRestoreSet
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 117.815 sec - 
> in org.apache.hadoop.hbase.backup.TestFullBackupSetRestoreSet
> Running org.apache.hadoop.hbase.backup.TestBackupShowHistory
> Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 136.517 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupShowHistory
> Running org.apache.hadoop.hbase.backup.TestRemoteRestore
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 91.799 sec - 
> in org.apache.hadoop.hbase.backup.TestRemoteRestore
> Running org.apache.hadoop.hbase.backup.TestFullRestore
> Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 317.711 sec 
> - in org.apache.hadoop.hbase.backup.TestFullRestore
> Running org.apache.hadoop.hbase.backup.TestFullBackupSet
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 87.045 sec - 
> in org.apache.hadoop.hbase.backup.TestFullBackupSet
> Running org.apache.hadoop.hbase.backup.TestBackupDelete
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Tim

[jira] [Commented] (HBASE-16458) Shorten backup / restore test execution time

2018-09-09 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16608470#comment-16608470
 ] 

Ted Yu commented on HBASE-16458:


HBASE-16458-v2.patch was the last one from Vlad.

Patch v5 is from me.

> Shorten backup / restore test execution time
> 
>
> Key: HBASE-16458
> URL: https://issues.apache.org/jira/browse/HBASE-16458
> Project: HBase
>  Issue Type: Test
>    Reporter: Ted Yu
>Assignee: Vladimir Rodionov
>Priority: Major
>  Labels: backup
> Attachments: 16458-v1.patch, 16458.HBASE-7912.v3.txt, 
> 16458.HBASE-7912.v4.txt, 16458.v1.txt, 16458.v2.txt, 16458.v2.txt, 
> 16458.v3.txt, 16458.v4.txt, 16458.v5.txt, HBASE-16458-v1.patch, 
> HBASE-16458-v2.patch
>
>
> Below was timing information for all the backup / restore tests (today's 
> result):
> {code}
> Running org.apache.hadoop.hbase.backup.TestIncrementalBackup
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 576.273 sec - 
> in org.apache.hadoop.hbase.backup.TestIncrementalBackup
> Running org.apache.hadoop.hbase.backup.TestBackupBoundaryTests
> Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 124.67 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupBoundaryTests
> Running org.apache.hadoop.hbase.backup.TestBackupStatusProgress
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 102.34 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupStatusProgress
> Running org.apache.hadoop.hbase.backup.TestBackupAdmin
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 490.251 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupAdmin
> Running org.apache.hadoop.hbase.backup.TestHFileArchiving
> Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 84.323 sec - 
> in org.apache.hadoop.hbase.backup.TestHFileArchiving
> Running org.apache.hadoop.hbase.backup.TestSystemTableSnapshot
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 65.492 sec - 
> in org.apache.hadoop.hbase.backup.TestSystemTableSnapshot
> Running org.apache.hadoop.hbase.backup.TestBackupDescribe
> Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 93.758 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupDescribe
> Running org.apache.hadoop.hbase.backup.TestBackupLogCleaner
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 109.187 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupLogCleaner
> Running org.apache.hadoop.hbase.backup.TestIncrementalBackupNoDataLoss
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 330.539 sec - 
> in org.apache.hadoop.hbase.backup.TestIncrementalBackupNoDataLoss
> Running org.apache.hadoop.hbase.backup.TestRemoteBackup
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 84.371 sec - 
> in org.apache.hadoop.hbase.backup.TestRemoteBackup
> Running org.apache.hadoop.hbase.backup.TestBackupSystemTable
> Tests run: 15, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 67.893 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupSystemTable
> Running org.apache.hadoop.hbase.backup.TestRestoreBoundaryTests
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 120.779 sec - 
> in org.apache.hadoop.hbase.backup.TestRestoreBoundaryTests
> Running org.apache.hadoop.hbase.backup.TestFullBackupSetRestoreSet
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 117.815 sec - 
> in org.apache.hadoop.hbase.backup.TestFullBackupSetRestoreSet
> Running org.apache.hadoop.hbase.backup.TestBackupShowHistory
> Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 136.517 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupShowHistory
> Running org.apache.hadoop.hbase.backup.TestRemoteRestore
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 91.799 sec - 
> in org.apache.hadoop.hbase.backup.TestRemoteRestore
> Running org.apache.hadoop.hbase.backup.TestFullRestore
> Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 317.711 sec 
> - in org.apache.hadoop.hbase.backup.TestFullRestore
> Running org.apache.hadoop.hbase.backup.TestFullBackupSet
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 87.045 sec - 
> in org.apache.hadoop.hbase.backup.TestFullBackupSet
> Running org.apache.hadoop.hbase.backup.TestBackupDelete
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 86.214 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupDelete
> Running org.apache.hadoop.hbase.backup.TestBackupDeleteRestore
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapse

[jira] [Updated] (HBASE-16458) Shorten backup / restore test execution time

2018-09-09 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-16458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-16458:
---
Attachment: (was: 16458.HBASE-7912.v5.txt)

> Shorten backup / restore test execution time
> 
>
> Key: HBASE-16458
> URL: https://issues.apache.org/jira/browse/HBASE-16458
> Project: HBase
>  Issue Type: Test
>    Reporter: Ted Yu
>Assignee: Vladimir Rodionov
>Priority: Major
>  Labels: backup
> Attachments: 16458-v1.patch, 16458.HBASE-7912.v3.txt, 
> 16458.HBASE-7912.v4.txt, 16458.v1.txt, 16458.v2.txt, 16458.v2.txt, 
> 16458.v3.txt, 16458.v4.txt, 16458.v5.txt, HBASE-16458-v1.patch, 
> HBASE-16458-v2.patch
>
>
> Below was timing information for all the backup / restore tests (today's 
> result):
> {code}
> Running org.apache.hadoop.hbase.backup.TestIncrementalBackup
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 576.273 sec - 
> in org.apache.hadoop.hbase.backup.TestIncrementalBackup
> Running org.apache.hadoop.hbase.backup.TestBackupBoundaryTests
> Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 124.67 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupBoundaryTests
> Running org.apache.hadoop.hbase.backup.TestBackupStatusProgress
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 102.34 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupStatusProgress
> Running org.apache.hadoop.hbase.backup.TestBackupAdmin
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 490.251 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupAdmin
> Running org.apache.hadoop.hbase.backup.TestHFileArchiving
> Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 84.323 sec - 
> in org.apache.hadoop.hbase.backup.TestHFileArchiving
> Running org.apache.hadoop.hbase.backup.TestSystemTableSnapshot
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 65.492 sec - 
> in org.apache.hadoop.hbase.backup.TestSystemTableSnapshot
> Running org.apache.hadoop.hbase.backup.TestBackupDescribe
> Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 93.758 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupDescribe
> Running org.apache.hadoop.hbase.backup.TestBackupLogCleaner
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 109.187 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupLogCleaner
> Running org.apache.hadoop.hbase.backup.TestIncrementalBackupNoDataLoss
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 330.539 sec - 
> in org.apache.hadoop.hbase.backup.TestIncrementalBackupNoDataLoss
> Running org.apache.hadoop.hbase.backup.TestRemoteBackup
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 84.371 sec - 
> in org.apache.hadoop.hbase.backup.TestRemoteBackup
> Running org.apache.hadoop.hbase.backup.TestBackupSystemTable
> Tests run: 15, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 67.893 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupSystemTable
> Running org.apache.hadoop.hbase.backup.TestRestoreBoundaryTests
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 120.779 sec - 
> in org.apache.hadoop.hbase.backup.TestRestoreBoundaryTests
> Running org.apache.hadoop.hbase.backup.TestFullBackupSetRestoreSet
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 117.815 sec - 
> in org.apache.hadoop.hbase.backup.TestFullBackupSetRestoreSet
> Running org.apache.hadoop.hbase.backup.TestBackupShowHistory
> Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 136.517 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupShowHistory
> Running org.apache.hadoop.hbase.backup.TestRemoteRestore
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 91.799 sec - 
> in org.apache.hadoop.hbase.backup.TestRemoteRestore
> Running org.apache.hadoop.hbase.backup.TestFullRestore
> Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 317.711 sec 
> - in org.apache.hadoop.hbase.backup.TestFullRestore
> Running org.apache.hadoop.hbase.backup.TestFullBackupSet
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 87.045 sec - 
> in org.apache.hadoop.hbase.backup.TestFullBackupSet
> Running org.apache.hadoop.hbase.backup.TestBackupDelete
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 86.214 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupDelete
> Running org.apache.hadoop.hbase.backup.TestBackupDeleteRestore
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 77.631 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupDeleteRestore

[jira] [Commented] (HBASE-21174) [REST] Failed to parse empty qualifier in TableResource#getScanResource

2018-09-09 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16608469#comment-16608469
 ] 

Ted Yu commented on HBASE-21174:


+1, pending QA

> [REST] Failed to parse empty qualifier in TableResource#getScanResource
> ---
>
> Key: HBASE-21174
> URL: https://issues.apache.org/jira/browse/HBASE-21174
> Project: HBase
>  Issue Type: Bug
>  Components: REST
>Affects Versions: 3.0.0, 2.2.0
>Reporter: Guangxu Cheng
>Assignee: Guangxu Cheng
>Priority: Major
> Attachments: HBASE-21174.master.001.patch
>
>
> {code:xml}
> GET /t1/*?column=f:c1=f:
> {code}
> To get the values of 'f:' (empty qualifier) for all rows in the table
> through the REST server, I send the above request. However, this request 
> returns all column values.
> {code:java|title=TableResource#getScanResource|borderStyle=solid}
>   for (String csplit : column) {
> String[] familysplit = csplit.trim().split(":");
> if (familysplit.length == 2) {
>   if (familysplit[1].length() > 0) {
> if (LOG.isTraceEnabled()) {
>   LOG.trace("Scan family and column : " + familysplit[0] + "  " + 
> familysplit[1]);
> }
> tableScan.addColumn(Bytes.toBytes(familysplit[0]), 
> Bytes.toBytes(familysplit[1]));
>   } else {
> tableScan.addFamily(Bytes.toBytes(familysplit[0]));
> if (LOG.isTraceEnabled()) {
>   LOG.trace("Scan family : " + familysplit[0] + " and empty 
> qualifier.");
> }
> tableScan.addColumn(Bytes.toBytes(familysplit[0]), null);
>   }
> } else if (StringUtils.isNotEmpty(familysplit[0])) {
>   if (LOG.isTraceEnabled()) {
> LOG.trace("Scan family : " + familysplit[0]);
>   }
>   tableScan.addFamily(Bytes.toBytes(familysplit[0]));
> }
>   }
> {code}
> Through the above code, when the column has an empty qualifier, the empty 
> qualifier cannot be parsed correctly. In other words, 'f:' (empty qualifier) 
> and 'f' (column family) are treated as having the same meaning, which is 
> wrong.
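The root cause can be reproduced in plain Java: String.split discards trailing empty tokens, so the empty qualifier disappears, while locating the separator directly preserves it. This is a sketch of the pitfall, not the actual patch:

```java
public class SplitPitfall {
    public static void main(String[] args) {
        // String.split drops trailing empty tokens, so "f:" and "f" are
        // indistinguishable after splitting -- the root cause described above.
        System.out.println("f:".split(":").length);   // 1, not 2
        System.out.println("f".split(":").length);    // 1
        // Scanning for the separator directly keeps the empty qualifier visible:
        System.out.println("f:".indexOf(':'));        // 1  -> family "f", empty qualifier
        System.out.println("f".indexOf(':'));         // -1 -> whole column family
    }
}
```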



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-9824) Support IPv6 literal

2018-09-09 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated FLINK-9824:
--
Description: 
Currently we use colon as separator when parsing host and port.


We should support the usage of IPv6 literals in parsing.

  was:
Currently we use colon as separator when parsing host and port.

We should support the usage of IPv6 literals in parsing.


> Support IPv6 literal
> 
>
> Key: FLINK-9824
> URL: https://issues.apache.org/jira/browse/FLINK-9824
> Project: Flink
>  Issue Type: Bug
>  Components: Network
>    Reporter: Ted Yu
>Assignee: vinoyang
>Priority: Minor
>
> Currently we use colon as separator when parsing host and port.
> We should support the usage of IPv6 literals in parsing.
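A minimal sketch of the requested behavior, assuming the RFC 3986 bracket convention for IPv6 literals ("[::1]:6123"). This is illustrative, not Flink's actual parser; the class and method names are made up:

```java
public class HostPort {
    // Accept both "host:port" and bracketed IPv6 literals like "[::1]:6123".
    static String[] parse(String s) {
        if (s.startsWith("[")) {
            int close = s.indexOf(']');
            if (close < 0 || close + 1 >= s.length() || s.charAt(close + 1) != ':') {
                throw new IllegalArgumentException("malformed address: " + s);
            }
            return new String[] { s.substring(1, close), s.substring(close + 2) };
        }
        // Plain hostname or IPv4: the last colon separates host and port.
        int sep = s.lastIndexOf(':');
        if (sep < 0) {
            throw new IllegalArgumentException("missing port: " + s);
        }
        return new String[] { s.substring(0, sep), s.substring(sep + 1) };
    }

    public static void main(String[] args) {
        String[] v6 = parse("[::1]:6123");
        System.out.println(v6[0] + " " + v6[1]);                 // ::1 6123
        System.out.println(String.join(" ", parse("example.com:6123"))); // example.com 6123
    }
}
```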



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-7795) Utilize error-prone to discover common coding mistakes

2018-09-09 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-7795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated FLINK-7795:
--
Description: 
http://errorprone.info/ is a tool which detects common coding mistakes.

We should incorporate it into the Flink build process.
Here are the dependencies:

{code}
<dependency>
  <groupId>com.google.errorprone</groupId>
  <artifactId>error_prone_annotation</artifactId>
  <version>${error-prone.version}</version>
  <scope>provided</scope>
</dependency>
<dependency>
  <groupId>com.google.auto.service</groupId>
  <artifactId>auto-service</artifactId>
  <version>1.0-rc3</version>
  <optional>true</optional>
</dependency>
<dependency>
  <groupId>com.google.errorprone</groupId>
  <artifactId>error_prone_check_api</artifactId>
  <version>${error-prone.version}</version>
  <scope>provided</scope>
  <exclusions>
    <exclusion>
      <groupId>com.google.code.findbugs</groupId>
      <artifactId>jsr305</artifactId>
    </exclusion>
  </exclusions>
</dependency>
<dependency>
  <groupId>com.google.errorprone</groupId>
  <artifactId>javac</artifactId>
  <version>9-dev-r4023-3</version>
  <scope>provided</scope>
</dependency>
{code}

  was:
http://errorprone.info/ is a tool which detects common coding mistakes.

We should incorporate it into the Flink build process.
Here are the dependencies:
{code}
<dependency>
  <groupId>com.google.errorprone</groupId>
  <artifactId>error_prone_annotation</artifactId>
  <version>${error-prone.version}</version>
  <scope>provided</scope>
</dependency>
<dependency>
  <groupId>com.google.auto.service</groupId>
  <artifactId>auto-service</artifactId>
  <version>1.0-rc3</version>
  <optional>true</optional>
</dependency>
<dependency>
  <groupId>com.google.errorprone</groupId>
  <artifactId>error_prone_check_api</artifactId>
  <version>${error-prone.version}</version>
  <scope>provided</scope>
  <exclusions>
    <exclusion>
      <groupId>com.google.code.findbugs</groupId>
      <artifactId>jsr305</artifactId>
    </exclusion>
  </exclusions>
</dependency>
<dependency>
  <groupId>com.google.errorprone</groupId>
  <artifactId>javac</artifactId>
  <version>9-dev-r4023-3</version>
  <scope>provided</scope>
</dependency>
{code}


> Utilize error-prone to discover common coding mistakes
> --
>
> Key: FLINK-7795
> URL: https://issues.apache.org/jira/browse/FLINK-7795
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System
>    Reporter: Ted Yu
>Priority: Major
>
> http://errorprone.info/ is a tool which detects common coding mistakes.
> We should incorporate it into the Flink build process.
> Here are the dependencies:
> {code}
> <dependency>
>   <groupId>com.google.errorprone</groupId>
>   <artifactId>error_prone_annotation</artifactId>
>   <version>${error-prone.version}</version>
>   <scope>provided</scope>
> </dependency>
> <dependency>
>   <groupId>com.google.auto.service</groupId>
>   <artifactId>auto-service</artifactId>
>   <version>1.0-rc3</version>
>   <optional>true</optional>
> </dependency>
> <dependency>
>   <groupId>com.google.errorprone</groupId>
>   <artifactId>error_prone_check_api</artifactId>
>   <version>${error-prone.version}</version>
>   <scope>provided</scope>
>   <exclusions>
>     <exclusion>
>       <groupId>com.google.code.findbugs</groupId>
>       <artifactId>jsr305</artifactId>
>     </exclusion>
>   </exclusions>
> </dependency>
> <dependency>
>   <groupId>com.google.errorprone</groupId>
>   <artifactId>javac</artifactId>
>   <version>9-dev-r4023-3</version>
>   <scope>provided</scope>
> </dependency>
> {code}
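For illustration, here is the kind of mistake error-prone catches at compile time. Its ArrayEquals check flags the first comparison below, which uses reference equality instead of element-wise comparison (this is a generic example, not taken from the Flink codebase):

```java
public class ArrayEqualsExample {
    public static void main(String[] args) {
        int[] a = {1, 2, 3};
        int[] b = {1, 2, 3};
        // Object.equals on arrays compares references, so logically equal
        // arrays compare unequal -- error-prone reports this as ArrayEquals.
        System.out.println(a.equals(b));                    // false
        // The element-wise comparison the author almost certainly intended:
        System.out.println(java.util.Arrays.equals(a, b));  // true
    }
}
```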



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: ABORTING region server and following HBase cluster "crash"

2018-09-08 Thread Ted Yu
It seems you should deploy HBase with the following fix:

HBASE-21069 NPE in StoreScanner.updateReaders causes RS to crash

1.4.7 was recently released.

FYI
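The NullPointerException in the trace below originates in the ArrayList copy constructor, which throws when handed a null collection. A generic null-tolerant copy, sketched here, is the general shape of defense such a fix applies; this is an illustration, not the actual HBASE-21069 patch:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class NullSafeCopy {
    // new ArrayList<>(null) throws NPE inside the ArrayList constructor
    // (the java.util.ArrayList.<init> frame in the stack trace below).
    // Copying defensively avoids the crash when the source list is null.
    static <T> List<T> copyOf(List<T> src) {
        return src == null ? new ArrayList<>() : new ArrayList<>(src);
    }

    public static void main(String[] args) {
        System.out.println(copyOf(null).size());                            // 0
        System.out.println(copyOf(Collections.singletonList("x")).size());  // 1
    }
}
```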

On Sat, Sep 8, 2018 at 3:32 PM Batyrshin Alexander <0x62...@gmail.com>
wrote:

>  Hello,
>
> We got this exception from *prod006* server
>
> Sep 09 00:38:02 prod006 hbase[18907]: 2018-09-09 00:38:02,532 FATAL
> [MemStoreFlusher.1] regionserver.HRegionServer: ABORTING region server
> prod006,60020,1536235102833: Replay of WAL required. Forcing server shutdown
> Sep 09 00:38:02 prod006 hbase[18907]:
> org.apache.hadoop.hbase.DroppedSnapshotException:
> region: 
> KM,c\xEF\xBF\xBD\x16I7\xEF\xBF\xBD\x0A"A\xEF\xBF\xBDd\xEF\xBF\xBD\xEF\xBF\xBD\x19\x07t,1536178245576.60c121ba50e67f2429b9ca2ba2a11bad.
> Sep 09 00:38:02 prod006 hbase[18907]: at
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2645)
> Sep 09 00:38:02 prod006 hbase[18907]: at
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2322)
> Sep 09 00:38:02 prod006 hbase[18907]: at
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2284)
> Sep 09 00:38:02 prod006 hbase[18907]: at
> org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:2170)
> Sep 09 00:38:02 prod006 hbase[18907]: at
> org.apache.hadoop.hbase.regionserver.HRegion.flush(HRegion.java:2095)
> Sep 09 00:38:02 prod006 hbase[18907]: at
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:508)
> Sep 09 00:38:02 prod006 hbase[18907]: at
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:478)
> Sep 09 00:38:02 prod006 hbase[18907]: at
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$900(MemStoreFlusher.java:76)
> Sep 09 00:38:02 prod006 hbase[18907]: at
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:264)
> Sep 09 00:38:02 prod006 hbase[18907]: at
> java.lang.Thread.run(Thread.java:748)
> Sep 09 00:38:02 prod006 hbase[18907]: Caused by:
> java.lang.NullPointerException
> Sep 09 00:38:02 prod006 hbase[18907]: at
> java.util.ArrayList.<init>(ArrayList.java:178)
> Sep 09 00:38:02 prod006 hbase[18907]: at
> org.apache.hadoop.hbase.regionserver.StoreScanner.updateReaders(StoreScanner.java:863)
> Sep 09 00:38:02 prod006 hbase[18907]: at
> org.apache.hadoop.hbase.regionserver.HStore.notifyChangedReadersObservers(HStore.java:1172)
> Sep 09 00:38:02 prod006 hbase[18907]: at
> org.apache.hadoop.hbase.regionserver.HStore.updateStorefiles(HStore.java:1145)
> Sep 09 00:38:02 prod006 hbase[18907]: at
> org.apache.hadoop.hbase.regionserver.HStore.access$900(HStore.java:122)
> Sep 09 00:38:02 prod006 hbase[18907]: at
> org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.commit(HStore.java:2505)
> Sep 09 00:38:02 prod006 hbase[18907]: at
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2600)
> Sep 09 00:38:02 prod006 hbase[18907]: ... 9 more
> Sep 09 00:38:02 prod006 hbase[18907]: 2018-09-09 00:38:02,532 FATAL
> [MemStoreFlusher.1] regionserver.HRegionServer: RegionServer abort: loaded
> coprocessors
> are: [org.apache.hadoop.hbase.regionserver.IndexHalfStoreFileReaderGenerator,
> org.apache.phoenix.coprocessor.SequenceRegionObserver, org.apache.phoenix.c
>
> After that we got ABORTING on almost every Region Servers in cluster with
> different reasons:
>
> *prod003*
> Sep 09 01:12:11 prod003 hbase[11552]: 2018-09-09 01:12:11,799 FATAL
> [PostOpenDeployTasks:88bfac1dfd807c4cd1e9c1f31b4f053f]
> regionserver.HRegionServer: ABORTING region
> server prod003,60020,1536444066291: Exception running postOpenDeployTasks;
> region=88bfac1dfd807c4cd1e9c1f31b4f053f
> Sep 09 01:12:11 prod003 hbase[11552]: java.io.InterruptedIOException:
> #139, interrupted. currentNumberOfTask=8
> Sep 09 01:12:11 prod003 hbase[11552]: at
> org.apache.hadoop.hbase.client.AsyncProcess.waitForMaximumCurrentTasks(AsyncProcess.java:1853)
> Sep 09 01:12:11 prod003 hbase[11552]: at
> org.apache.hadoop.hbase.client.AsyncProcess.waitForMaximumCurrentTasks(AsyncProcess.java:1823)
> Sep 09 01:12:11 prod003 hbase[11552]: at
> org.apache.hadoop.hbase.client.AsyncProcess.waitForAllPreviousOpsAndReset(AsyncProcess.java:1899)
> Sep 09 01:12:11 prod003 hbase[11552]: at
> org.apache.hadoop.hbase.client.BufferedMutatorImpl.backgroundFlushCommits(BufferedMutatorImpl.java:250)
> Sep 09 01:12:11 prod003 hbase[11552]: at
> org.apache.hadoop.hbase.client.BufferedMutatorImpl.flush(BufferedMutatorImpl.java:213)
> Sep 09 01:12:11 prod003 hbase[11552]: at
> org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1484)
> Sep 09 01:12:11 prod003 hbase[11552]: at
> org.apache.hadoop.hbase.client.HTable.put(HTable.java:1031)
> Sep 09 01:12:11 

[jira] [Commented] (HBASE-21052) After restoring a snapshot, table.jsp page for the table gets stuck

2018-09-08 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16608191#comment-16608191
 ] 

Ted Yu commented on HBASE-21052:


lgtm

Test failure was not related.

> After restoring a snapshot, table.jsp page for the table gets stuck
> ---
>
> Key: HBASE-21052
> URL: https://issues.apache.org/jira/browse/HBASE-21052
> Project: HBase
>  Issue Type: Bug
>  Components: snapshots
>Reporter: Toshihiro Suzuki
>Assignee: Toshihiro Suzuki
>Priority: Major
> Attachments: HBASE-21052.master.001.patch, 
> HBASE-21052.master.002.patch, HBASE-21052.master.003.patch
>
>
> Steps to reproduce are as follows:
> 1. Create a table
> {code}
> create "test", "cf"
> {code}
> 2. Take a hbase snapshot for the table
> {code}
> snapshot "test", "snap"
> {code}
> 3. Disable the table
> {code}
> disable "test"
> {code}
> 4. Restore the hbase snapshot
> {code}
> restore_snapshot "snap"
> {code}
> 5. Open the table.jsp page for the table in a browser, but it gets stuck
> {code}
> http://:16010/table.jsp?name=test
> {code}
> According to the following thread dump, it looks like 
> ConnectionImplementation.locateRegionInMeta() gets stuck when getting a 
> compaction state.
> {code}
> "qtp2068100669-89" #89 daemon prio=5 os_prio=31 tid=0x7febac55b800 
> nid=0xf403 waiting on condition [0x762b7000]
>java.lang.Thread.State: TIMED_WAITING (sleeping)
> at java.lang.Thread.sleep(Native Method)
> at 
> org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegionInMeta(ConnectionImplementation.java:933)
> at 
> org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegion(ConnectionImplementation.java:752)
> at 
> org.apache.hadoop.hbase.client.ConnectionUtils$ShortCircuitingClusterConnection.locateRegion(ConnectionUtils.java:131)
> at 
> org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegion(ConnectionImplementation.java:738)
> at 
> org.apache.hadoop.hbase.client.ConnectionUtils$ShortCircuitingClusterConnection.locateRegion(ConnectionUtils.java:131)
> at 
> org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegions(ConnectionImplementation.java:694)
> at 
> org.apache.hadoop.hbase.client.ConnectionUtils$ShortCircuitingClusterConnection.locateRegions(ConnectionUtils.java:131)
> at 
> org.apache.hadoop.hbase.client.HBaseAdmin.getCompactionState(HBaseAdmin.java:3336)
> at 
> org.apache.hadoop.hbase.client.HBaseAdmin.getCompactionState(HBaseAdmin.java:2521)
> at 
> org.apache.hadoop.hbase.generated.master.table_jsp._jspService(table_jsp.java:316)
> at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:111)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
> at 
> org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:840)
> at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1772)
> at 
> org.apache.hadoop.hbase.http.lib.StaticUserWebFilter$StaticUserFilter.doFilter(StaticUserWebFilter.java:112)
> at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
> at 
> org.apache.hadoop.hbase.http.ClickjackingPreventionFilter.doFilter(ClickjackingPreventionFilter.java:48)
> at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
> at 
> org.apache.hadoop.hbase.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:1374)
> at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
> at 
> org.apache.hadoop.hbase.http.NoCacheFilter.doFilter(NoCacheFilter.java:49)
> at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
> at 
> org.apache.hadoop.hbase.http.NoCacheFilter.doFilter(NoCacheFilter.java:49)
> at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
> at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
> at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
> at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
> at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.jav

[jira] [Commented] (HBASE-16458) Shorten backup / restore test execution time

2018-09-08 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16608182#comment-16608182
 ] 

Ted Yu commented on HBASE-16458:


From the test console, you would see:
{code}
17:01:53 |  +1  |   unit  |  13m  3s   | hbase-backup in the patch 
passed. 
{code}
There is no slowdown from tearing down the cluster through a shutdown hook.

We should still shut down the mini cluster so that intermediate files generated 
during test runs are removed.
Otherwise there is a chance that such files stay on the Jenkins machine(s).

> Shorten backup / restore test execution time
> 
>
> Key: HBASE-16458
> URL: https://issues.apache.org/jira/browse/HBASE-16458
> Project: HBase
>  Issue Type: Test
>    Reporter: Ted Yu
>Assignee: Vladimir Rodionov
>Priority: Major
>  Labels: backup
> Attachments: 16458-v1.patch, 16458.HBASE-7912.v3.txt, 
> 16458.HBASE-7912.v4.txt, 16458.HBASE-7912.v5.txt, 16458.v1.txt, 16458.v2.txt, 
> 16458.v2.txt, 16458.v3.txt, 16458.v4.txt, 16458.v5.txt, HBASE-16458-v1.patch, 
> HBASE-16458-v2.patch
>
>
> Below was timing information for all the backup / restore tests (today's 
> result):
> {code}
> Running org.apache.hadoop.hbase.backup.TestIncrementalBackup
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 576.273 sec - 
> in org.apache.hadoop.hbase.backup.TestIncrementalBackup
> Running org.apache.hadoop.hbase.backup.TestBackupBoundaryTests
> Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 124.67 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupBoundaryTests
> Running org.apache.hadoop.hbase.backup.TestBackupStatusProgress
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 102.34 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupStatusProgress
> Running org.apache.hadoop.hbase.backup.TestBackupAdmin
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 490.251 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupAdmin
> Running org.apache.hadoop.hbase.backup.TestHFileArchiving
> Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 84.323 sec - 
> in org.apache.hadoop.hbase.backup.TestHFileArchiving
> Running org.apache.hadoop.hbase.backup.TestSystemTableSnapshot
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 65.492 sec - 
> in org.apache.hadoop.hbase.backup.TestSystemTableSnapshot
> Running org.apache.hadoop.hbase.backup.TestBackupDescribe
> Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 93.758 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupDescribe
> Running org.apache.hadoop.hbase.backup.TestBackupLogCleaner
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 109.187 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupLogCleaner
> Running org.apache.hadoop.hbase.backup.TestIncrementalBackupNoDataLoss
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 330.539 sec - 
> in org.apache.hadoop.hbase.backup.TestIncrementalBackupNoDataLoss
> Running org.apache.hadoop.hbase.backup.TestRemoteBackup
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 84.371 sec - 
> in org.apache.hadoop.hbase.backup.TestRemoteBackup
> Running org.apache.hadoop.hbase.backup.TestBackupSystemTable
> Tests run: 15, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 67.893 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupSystemTable
> Running org.apache.hadoop.hbase.backup.TestRestoreBoundaryTests
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 120.779 sec - 
> in org.apache.hadoop.hbase.backup.TestRestoreBoundaryTests
> Running org.apache.hadoop.hbase.backup.TestFullBackupSetRestoreSet
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 117.815 sec - 
> in org.apache.hadoop.hbase.backup.TestFullBackupSetRestoreSet
> Running org.apache.hadoop.hbase.backup.TestBackupShowHistory
> Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 136.517 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupShowHistory
> Running org.apache.hadoop.hbase.backup.TestRemoteRestore
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 91.799 sec - 
> in org.apache.hadoop.hbase.backup.TestRemoteRestore
> Running org.apache.hadoop.hbase.backup.TestFullRestore
> Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 317.711 sec 
> - in org.apache.hadoop.hbase.backup.TestFullRestore
> Running org.apache.hadoop.hbase.backup.TestFullBackupSet
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 87.045 sec - 
> in org.apache.hadoop.hbase.backup.Te

Re: Extremely high CPU usage after upgrading to Hbase 1.4.4

2018-09-08 Thread Ted Yu
Srinidhi:
Do you know the average / highest number of ColumnPrefixFilters in the
FilterList?

Thanks
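The query pattern described downthread (one ColumnPrefixFilter per "channel-itemId" prefix, ORed together in a FilterList with MUST_PASS_ONE) can be modeled in plain Java as a keep-any-qualifier-matching-some-prefix predicate. Names and values here are illustrative, not from the actual system:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class PrefixFilterSketch {
    // Model of the scan semantics: a qualifier survives the filter list
    // if it starts with at least one of the "channel-itemId" prefixes.
    static List<String> matching(List<String> qualifiers, List<String> prefixes) {
        return qualifiers.stream()
            .filter(q -> prefixes.stream().anyMatch(q::startsWith))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> row = Arrays.asList("email-1-top", "email-2-side", "push-1-top");
        List<String> wanted = Arrays.asList("email-1", "email-2");
        System.out.println(matching(row, wanted)); // [email-1-top, email-2-side]
    }
}
```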

On Fri, Sep 7, 2018 at 10:00 PM Ted Yu  wrote:

> Thanks for detailed background information.
>
> I assume your code has done de-dup for the filters contained in
> FilterListWithOR.
>
> I took a look at JIRAs which
> touched hbase-client/src/main/java/org/apache/hadoop/hbase/filter in
> branch-1.4
> There were a few patches (some were very big) since the release of 1.3.0
> So it is not obvious at first glance which one(s) might be related.
>
> I noticed ColumnPrefixFilter.getNextCellHint (and
> KeyValueUtil.createFirstOnRow) appearing many times in the stack trace.
>
> I plan to dig more in this area.
>
> Cheers
>
> On Fri, Sep 7, 2018 at 11:30 AM Srinidhi Muppalla 
> wrote:
>
>> Sure thing. For our table schema, each row represents one user and the
>> row key is that user’s unique id in our system. We currently only use one
>> column family in the table. The column qualifiers represent an item that
>> has been surfaced to that user as well as additional information to
>> differentiate the way the item has been surfaced to the user. Without
>> getting into too many specifics, the qualifier follows the rough format of:
>>
>> “Channel-itemId-distinguisher”.
>>
>> The channel here is the channel through which the item was previously surfaced
>> to the user. The itemid is the unique id of the item that has been surfaced
>> to the user. A distinguisher is some attribute about how that item was
>> surfaced to the user.
>>
>> When we run a scan, we currently only ever run it on one row at a time.
>> It was chosen over ‘get’ because (from our understanding) the performance
>> difference is negligible, and down the road using scan would allow us some
>> more flexibility.
>>
>> The filter list that is constructed with scan works by using a
>> ColumnPrefixFilter as you mentioned. When a user is being communicated to
>> on a particular channel, we have a list of items that we want to
>> potentially surface for that user. So, we construct a prefix list with the
>> channel and each of the item ids in the form of: “channel-itemId”. Then we
>> run a scan on that row with that filter list using “WithOr” to get all of
>> the matching channel-itemId combinations currently in that row/column
>> family in the table. This way we can then know which of the items we want
>> to surface to that user on that channel have already been surfaced on that
>> channel. The reason we query using a prefix filter is so that we don’t need
>> to know the ‘distinguisher’ part of the record when writing the actual
>> query, because the distinguisher is only relevant in certain circumstances.
>>
>> Let me know if this is the information about our query pattern that you
>> were looking for and if there is anything I can clarify or add.
>>
>> Thanks,
>> Srinidhi
>>
>> On 9/6/18, 12:24 PM, "Ted Yu"  wrote:
>>
>> From the stack trace, ColumnPrefixFilter is used during scan.
>>
>> Can you illustrate how the various filters are formed through
>> FilterListWithOR? It would be easier for other people to reproduce the
>> problem given your query pattern.
>>
>> Cheers
>>
>> On Thu, Sep 6, 2018 at 11:43 AM Srinidhi Muppalla <
>> srinid...@trulia.com>
>> wrote:
>>
>> > Hi Vlad,
>> >
>> > Thank you for the suggestion. I recreated the issue and attached the
>> > stack traces I took. Let me know if there’s any other info I can
>> > provide. We narrowed the issue down to occurring when upgrading from
>> > 1.3.0 to any 1.4.x version.
>> >
>> > Thanks,
>> > Srinidhi
>> >
>> > On 9/4/18, 8:19 PM, "Vladimir Rodionov" 
>> wrote:
>> >
>> > Hi, Srinidhi
>> >
>> > Next time you see this issue, take a jstack of a RS several times
>> > in a row. Without stack traces it is hard to tell what was going on
>> > with your cluster after the upgrade.
>> >
>> > -Vlad
>> >
>> >
>> >
>> > On Tue, Sep 4, 2018 at 3:50 PM Srinidhi Muppalla <srinid...@trulia.com>
>> > wrote:
>> >
>> > > Hello all,
>> > >
>> > > We are currently running Hbase 1.3.0
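The OR-of-prefixes query pattern Srinidhi describes above can be sketched in self-contained Java. In the real HBase client this would be a FilterList(Operator.MUST_PASS_ONE) of ColumnPrefixFilter instances applied to a single-row Scan; the class, channel, and itemId names below are illustrative stand-ins modeling only the matching semantics, not the HBase API.

```java
import java.util.ArrayList;
import java.util.List;

public class PrefixOrFilterModel {

    // Build the "channel-itemId" prefixes the scan's filter list would use.
    public static List<String> buildPrefixes(String channel, List<String> itemIds) {
        List<String> prefixes = new ArrayList<>();
        for (String itemId : itemIds) {
            prefixes.add(channel + "-" + itemId);
        }
        return prefixes;
    }

    // A qualifier passes if ANY prefix matches (OR semantics, like MUST_PASS_ONE),
    // so the trailing "distinguisher" never needs to be known at query time.
    public static List<String> matchingQualifiers(List<String> qualifiers, List<String> prefixes) {
        List<String> matched = new ArrayList<>();
        for (String q : qualifiers) {
            for (String p : prefixes) {
                if (q.startsWith(p)) {
                    matched.add(q);
                    break;
                }
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        List<String> stored = List.of("email-42-promo", "email-7-banner", "push-42-promo");
        List<String> prefixes = buildPrefixes("email", List.of("42", "99"));
        System.out.println(matchingQualifiers(stored, prefixes)); // [email-42-promo]
    }
}
```

Note that each ColumnPrefixFilter in the real filter list can emit a seek hint via getNextCellHint, which is where the createFirstOnRow allocations discussed in this thread come from.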

[jira] [Commented] (HBASE-21173) Remove the duplicate HRegion#close in TestHRegion

2018-09-08 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16608119#comment-16608119
 ] 

Ted Yu commented on HBASE-21173:


{code}
 region.close();
 assertEquals(max, region.getMaxFlushedSeqId());
+region = null;
{code}
I think the intention of HBASE-21138 is to let 
HBaseTestingUtility.closeRegionAndWAL do the cleanup.

Can you remove the duplicate region.close() call in these subtests?

Thanks
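The pattern behind the patch hunk above can be shown with a self-contained model: after an explicit close, the test nulls its field so that a generic null-guarded teardown (HBaseTestingUtility.closeRegionAndWAL in the real code) does not close the region a second time. The class and method names here are illustrative stand-ins, not HBase APIs.

```java
public class CloseOnceModel {
    static class FakeRegion {
        int closeCount = 0;
        void close() { closeCount++; }
    }

    // Stand-in for closeRegionAndWAL: a no-op when the region is already null.
    static void closeRegionAndWal(FakeRegion region) {
        if (region != null) {
            region.close();
        }
    }

    public static void main(String[] args) {
        FakeRegion region = new FakeRegion();
        region.close();             // explicit close inside the test body
        FakeRegion ref = region;
        region = null;              // the "+region = null;" line from the patch
        closeRegionAndWal(region);  // teardown: skips, so no double close
        System.out.println(ref.closeCount); // 1
    }
}
```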

> Remove the duplicate HRegion#close in TestHRegion
> -
>
> Key: HBASE-21173
> URL: https://issues.apache.org/jira/browse/HBASE-21173
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.0.0, 2.2.0
>Reporter: Guangxu Cheng
>Assignee: Guangxu Cheng
>Priority: Major
> Attachments: HBASE-21173.master.001.patch
>
>
>  After HBASE-21138, some test methods still have a duplicate
> HRegion#close, so this issue is opened to remove the duplicate close.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-16458) Shorten backup / restore test execution time

2018-09-08 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-16458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-16458:
---
Attachment: 16458.v5.txt

> Shorten backup / restore test execution time
> 
>
> Key: HBASE-16458
> URL: https://issues.apache.org/jira/browse/HBASE-16458
> Project: HBase
>  Issue Type: Test
>    Reporter: Ted Yu
>Assignee: Vladimir Rodionov
>Priority: Major
>  Labels: backup
> Attachments: 16458-v1.patch, 16458.HBASE-7912.v3.txt, 
> 16458.HBASE-7912.v4.txt, 16458.HBASE-7912.v5.txt, 16458.v1.txt, 16458.v2.txt, 
> 16458.v2.txt, 16458.v3.txt, 16458.v4.txt, 16458.v5.txt, HBASE-16458-v1.patch, 
> HBASE-16458-v2.patch
>
>
> Below was timing information for all the backup / restore tests (today's 
> result):
> {code}
> Running org.apache.hadoop.hbase.backup.TestIncrementalBackup
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 576.273 sec - 
> in org.apache.hadoop.hbase.backup.TestIncrementalBackup
> Running org.apache.hadoop.hbase.backup.TestBackupBoundaryTests
> Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 124.67 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupBoundaryTests
> Running org.apache.hadoop.hbase.backup.TestBackupStatusProgress
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 102.34 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupStatusProgress
> Running org.apache.hadoop.hbase.backup.TestBackupAdmin
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 490.251 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupAdmin
> Running org.apache.hadoop.hbase.backup.TestHFileArchiving
> Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 84.323 sec - 
> in org.apache.hadoop.hbase.backup.TestHFileArchiving
> Running org.apache.hadoop.hbase.backup.TestSystemTableSnapshot
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 65.492 sec - 
> in org.apache.hadoop.hbase.backup.TestSystemTableSnapshot
> Running org.apache.hadoop.hbase.backup.TestBackupDescribe
> Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 93.758 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupDescribe
> Running org.apache.hadoop.hbase.backup.TestBackupLogCleaner
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 109.187 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupLogCleaner
> Running org.apache.hadoop.hbase.backup.TestIncrementalBackupNoDataLoss
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 330.539 sec - 
> in org.apache.hadoop.hbase.backup.TestIncrementalBackupNoDataLoss
> Running org.apache.hadoop.hbase.backup.TestRemoteBackup
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 84.371 sec - 
> in org.apache.hadoop.hbase.backup.TestRemoteBackup
> Running org.apache.hadoop.hbase.backup.TestBackupSystemTable
> Tests run: 15, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 67.893 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupSystemTable
> Running org.apache.hadoop.hbase.backup.TestRestoreBoundaryTests
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 120.779 sec - 
> in org.apache.hadoop.hbase.backup.TestRestoreBoundaryTests
> Running org.apache.hadoop.hbase.backup.TestFullBackupSetRestoreSet
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 117.815 sec - 
> in org.apache.hadoop.hbase.backup.TestFullBackupSetRestoreSet
> Running org.apache.hadoop.hbase.backup.TestBackupShowHistory
> Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 136.517 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupShowHistory
> Running org.apache.hadoop.hbase.backup.TestRemoteRestore
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 91.799 sec - 
> in org.apache.hadoop.hbase.backup.TestRemoteRestore
> Running org.apache.hadoop.hbase.backup.TestFullRestore
> Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 317.711 sec 
> - in org.apache.hadoop.hbase.backup.TestFullRestore
> Running org.apache.hadoop.hbase.backup.TestFullBackupSet
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 87.045 sec - 
> in org.apache.hadoop.hbase.backup.TestFullBackupSet
> Running org.apache.hadoop.hbase.backup.TestBackupDelete
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 86.214 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupDelete
> Running org.apache.hadoop.hbase.backup.TestBackupDeleteRestore
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 77.631 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupDeleteRestore

[jira] [Commented] (HBASE-16458) Shorten backup / restore test execution time

2018-09-08 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16608029#comment-16608029
 ] 

Ted Yu commented on HBASE-16458:


Attached 16458.v4.txt to address the checkstyle warnings.

> Shorten backup / restore test execution time
> 
>
> Key: HBASE-16458
> URL: https://issues.apache.org/jira/browse/HBASE-16458
> Project: HBase
>  Issue Type: Test
>    Reporter: Ted Yu
>Assignee: Vladimir Rodionov
>Priority: Major
>  Labels: backup
> Attachments: 16458-v1.patch, 16458.HBASE-7912.v3.txt, 
> 16458.HBASE-7912.v4.txt, 16458.HBASE-7912.v5.txt, 16458.v1.txt, 16458.v2.txt, 
> 16458.v2.txt, 16458.v3.txt, 16458.v4.txt, HBASE-16458-v1.patch, 
> HBASE-16458-v2.patch
>
>

[jira] [Updated] (HBASE-16458) Shorten backup / restore test execution time

2018-09-08 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-16458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-16458:
---
Attachment: 16458.v4.txt

> Shorten backup / restore test execution time
> 
>
> Key: HBASE-16458
> URL: https://issues.apache.org/jira/browse/HBASE-16458
> Project: HBase
>  Issue Type: Test
>    Reporter: Ted Yu
>Assignee: Vladimir Rodionov
>Priority: Major
>  Labels: backup
> Attachments: 16458-v1.patch, 16458.HBASE-7912.v3.txt, 
> 16458.HBASE-7912.v4.txt, 16458.HBASE-7912.v5.txt, 16458.v1.txt, 16458.v2.txt, 
> 16458.v2.txt, 16458.v3.txt, 16458.v4.txt, HBASE-16458-v1.patch, 
> HBASE-16458-v2.patch
>
>

Re: Extremely high CPU usage after upgrading to Hbase 1.4.4

2018-09-07 Thread Ted Yu
createFirstOnRow() is used by the ColumnXXFilter getNextCellHint() methods.
I am thinking about adding a variant of getNextCellHint() which returns a
tuple representing the first cell on the row, consisting of:
  Cell - the passed-in Cell instance
  byte[] - qualifier array
  int - qualifier offset
  int - qualifier length
This variant doesn't allocate a new Cell / KeyValue.

This way, FilterListWithOR#shouldPassCurrentCellToFilter can use the
returned tuple for comparison.

FYI
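The proposal above can be sketched as a small holder class: instead of materializing a new KeyValue via KeyValueUtil.createFirstOnRow, the hint is expressed as an (array, offset, length) view of the filter's qualifier prefix, which FilterListWithOR#shouldPassCurrentCellToFilter could compare against directly. All names below are illustrative; this is not the actual HBase API.

```java
public class QualifierHint {
    final byte[] qualifierArray;
    final int qualifierOffset;
    final int qualifierLength;

    QualifierHint(byte[] array, int offset, int length) {
        this.qualifierArray = array;
        this.qualifierOffset = offset;
        this.qualifierLength = length;
    }

    // Compare a cell's qualifier bytes to the hint without allocating anything:
    // lexicographic, unsigned-byte comparison, shorter prefix sorts first.
    static int compareQualifier(byte[] cellQual, int cellOff, int cellLen, QualifierHint hint) {
        int n = Math.min(cellLen, hint.qualifierLength);
        for (int i = 0; i < n; i++) {
            int diff = (cellQual[cellOff + i] & 0xff)
                     - (hint.qualifierArray[hint.qualifierOffset + i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return cellLen - hint.qualifierLength;
    }

    public static void main(String[] args) {
        byte[] prefix = "email-42".getBytes();
        QualifierHint hint = new QualifierHint(prefix, 0, prefix.length);
        byte[] q = "email-42-promo".getBytes();
        // Positive: the cell's qualifier sorts at or after the hinted prefix.
        System.out.println(compareQualifier(q, 0, q.length, hint) > 0); // true
    }
}
```

The point of the design is that the hot comparison path touches only existing byte arrays, avoiding the per-cell Cell/KeyValue allocations showing up in the stack traces.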


Re: Extremely high CPU usage after upgrading to Hbase 1.4.4

2018-09-07 Thread Ted Yu
Thanks for detailed background information.

I assume your code has done de-dup for the filters contained in
FilterListWithOR.

I took a look at the JIRAs which
touched hbase-client/src/main/java/org/apache/hadoop/hbase/filter in
branch-1.4.
There have been a few patches (some very big) since the release of 1.3.0,
so it is not obvious at first glance which one(s) might be related.

I noticed ColumnPrefixFilter.getNextCellHint (and
KeyValueUtil.createFirstOnRow) appearing many times in the stack trace.

I plan to dig more in this area.

Cheers



[jira] [Commented] (HBASE-16458) Shorten backup / restore test execution time

2018-09-07 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16607893#comment-16607893
 ] 

Ted Yu commented on HBASE-16458:


Vlad:
Can you take a look at 16458.v2.txt?

This is based on your patch, using a shutdown hook to tear down the mini-cluster
at the end of the last test that is a subclass of TestBackupBase.
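The shutdown-hook approach described above can be sketched in self-contained Java: register a single JVM shutdown hook from a shared base class so the mini-cluster is torn down only once, after the last subclass test finishes. The cluster object here is a stand-in, not the HBase mini-cluster API.

```java
public class ShutdownHookTeardown {
    static final class FakeCluster {
        volatile boolean running = true;
        void shutdown() { running = false; }
    }

    // One shared cluster and one teardown hook for all test subclasses.
    static final FakeCluster CLUSTER = new FakeCluster();
    static final Thread TEARDOWN = new Thread(CLUSTER::shutdown);

    public static void main(String[] args) {
        // Register once; the JVM runs the hook at exit, after every test class.
        Runtime.getRuntime().addShutdownHook(TEARDOWN);
        // removeShutdownHook returns true only if the hook was registered,
        // which lets us verify the registration without exiting the JVM here.
        boolean wasRegistered = Runtime.getRuntime().removeShutdownHook(TEARDOWN);
        System.out.println(wasRegistered); // true
    }
}
```

The design choice is that per-class @AfterClass teardowns are skipped entirely, trading per-test isolation for the startup/shutdown time saved across the whole backup test suite.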



> Shorten backup / restore test execution time
> 
>
> Key: HBASE-16458
> URL: https://issues.apache.org/jira/browse/HBASE-16458
> Project: HBase
>  Issue Type: Test
>    Reporter: Ted Yu
>Assignee: Vladimir Rodionov
>Priority: Major
>  Labels: backup
> Attachments: 16458-v1.patch, 16458.HBASE-7912.v3.txt, 
> 16458.HBASE-7912.v4.txt, 16458.HBASE-7912.v5.txt, 16458.v1.txt, 16458.v2.txt, 
> 16458.v2.txt, 16458.v3.txt, HBASE-16458-v1.patch, HBASE-16458-v2.patch
>
>

[jira] [Updated] (HBASE-16458) Shorten backup / restore test execution time

2018-09-07 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-16458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-16458:
---
Attachment: 16458.v2.txt

> Shorten backup / restore test execution time
> 
>
> Key: HBASE-16458
> URL: https://issues.apache.org/jira/browse/HBASE-16458
> Project: HBase
>  Issue Type: Test
>    Reporter: Ted Yu
>Assignee: Vladimir Rodionov
>Priority: Major
>  Labels: backup
> Attachments: 16458-v1.patch, 16458.HBASE-7912.v3.txt, 
> 16458.HBASE-7912.v4.txt, 16458.HBASE-7912.v5.txt, 16458.v1.txt, 16458.v2.txt, 
> 16458.v2.txt, 16458.v3.txt, HBASE-16458-v1.patch, HBASE-16458-v2.patch
>
>
> Below was timing information for all the backup / restore tests (today's 
> result):
> {code}
> Running org.apache.hadoop.hbase.backup.TestIncrementalBackup
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 576.273 sec - 
> in org.apache.hadoop.hbase.backup.TestIncrementalBackup
> Running org.apache.hadoop.hbase.backup.TestBackupBoundaryTests
> Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 124.67 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupBoundaryTests
> Running org.apache.hadoop.hbase.backup.TestBackupStatusProgress
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 102.34 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupStatusProgress
> Running org.apache.hadoop.hbase.backup.TestBackupAdmin
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 490.251 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupAdmin
> Running org.apache.hadoop.hbase.backup.TestHFileArchiving
> Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 84.323 sec - 
> in org.apache.hadoop.hbase.backup.TestHFileArchiving
> Running org.apache.hadoop.hbase.backup.TestSystemTableSnapshot
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 65.492 sec - 
> in org.apache.hadoop.hbase.backup.TestSystemTableSnapshot
> Running org.apache.hadoop.hbase.backup.TestBackupDescribe
> Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 93.758 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupDescribe
> Running org.apache.hadoop.hbase.backup.TestBackupLogCleaner
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 109.187 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupLogCleaner
> Running org.apache.hadoop.hbase.backup.TestIncrementalBackupNoDataLoss
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 330.539 sec - 
> in org.apache.hadoop.hbase.backup.TestIncrementalBackupNoDataLoss
> Running org.apache.hadoop.hbase.backup.TestRemoteBackup
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 84.371 sec - 
> in org.apache.hadoop.hbase.backup.TestRemoteBackup
> Running org.apache.hadoop.hbase.backup.TestBackupSystemTable
> Tests run: 15, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 67.893 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupSystemTable
> Running org.apache.hadoop.hbase.backup.TestRestoreBoundaryTests
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 120.779 sec - 
> in org.apache.hadoop.hbase.backup.TestRestoreBoundaryTests
> Running org.apache.hadoop.hbase.backup.TestFullBackupSetRestoreSet
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 117.815 sec - 
> in org.apache.hadoop.hbase.backup.TestFullBackupSetRestoreSet
> Running org.apache.hadoop.hbase.backup.TestBackupShowHistory
> Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 136.517 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupShowHistory
> Running org.apache.hadoop.hbase.backup.TestRemoteRestore
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 91.799 sec - 
> in org.apache.hadoop.hbase.backup.TestRemoteRestore
> Running org.apache.hadoop.hbase.backup.TestFullRestore
> Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 317.711 sec 
> - in org.apache.hadoop.hbase.backup.TestFullRestore
> Running org.apache.hadoop.hbase.backup.TestFullBackupSet
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 87.045 sec - 
> in org.apache.hadoop.hbase.backup.TestFullBackupSet
> Running org.apache.hadoop.hbase.backup.TestBackupDelete
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 86.214 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupDelete
> Running org.apache.hadoop.hbase.backup.TestBackupDeleteRestore
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 77.631 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupDeleteRestore
> 
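As a quick sanity check, the per-suite elapsed times in a listing like the one above can be totaled with a small shell pipeline. This is a sketch: the file name timings.txt and its two sample lines are illustrative, not taken from a real build.

```shell
# Two sample surefire summary lines (values copied from the listing for illustration).
cat > timings.txt <<'EOF'
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 576.273 sec
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 124.67 sec
EOF

# Pull out each "Time elapsed: N" fragment, then sum the third field with awk.
grep -o 'Time elapsed: [0-9.]*' timings.txt |
  awk '{s += $3} END {printf "%.2f\n", s}'   # prints 700.94
```

Running the same pipeline over a full surefire log gives the aggregate wall-clock time the issue is trying to shorten.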

[jira] [Updated] (HBASE-16458) Shorten backup / restore test execution time

2018-09-07 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-16458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-16458:
---
Attachment: 16458-v1.patch

> Shorten backup / restore test execution time
> 
>
> Key: HBASE-16458
> URL: https://issues.apache.org/jira/browse/HBASE-16458
> Project: HBase
>  Issue Type: Test
>    Reporter: Ted Yu
>Assignee: Vladimir Rodionov
>Priority: Major
>  Labels: backup
> Attachments: 16458-v1.patch, 16458.HBASE-7912.v3.txt, 
> 16458.HBASE-7912.v4.txt, 16458.HBASE-7912.v5.txt, 16458.v1.txt, 16458.v2.txt, 
> 16458.v3.txt, HBASE-16458-v1.patch
>
>

[jira] [Commented] (HBASE-16458) Shorten backup / restore test execution time

2018-09-07 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16607737#comment-16607737
 ] 

Ted Yu commented on HBASE-16458:


On Linux, with the patch applied, from the first test output:
{code}
2018-09-07 22:06:50,491 INFO  [Time-limited test] hbase.ResourceChecker(148): 
before: backup.TestBackupUtils#TestGetBulkOutputDir Thread=8, 
OpenFileDescriptor=179, MaxFileDescriptor=32000, SystemLoadAverage=242, 
ProcessCount=363, AvailableMemoryMB=56614
{code}
to the last:
{code}
2018-09-07 22:23:48,010 INFO  [Block report processor] 
blockmanagement.BlockManager(2645): BLOCK* addStoredBlock: blockMap updated: 
127.0.0.1:36058 is added to blk_1073741829_1005{UCState=COMMITTED, 
truncateBlock=null, primaryNodeIndex=-1, 
replicas=[ReplicaUC[[DISK]DS-453ccfd4-ec24-490b-a51e-2b75f5b1da9f:NORMAL:127.0.0.1:36058|RBW]]}
 size 146414
2018-09-07 22:23:48,413 INFO  [Thread-3] 
regionserver.ShutdownHook$ShutdownHookThread(135): Shutdown hook finished.
{code}
That was ~17 minutes.

> Shorten backup / restore test execution time
> 
>
> Key: HBASE-16458
> URL: https://issues.apache.org/jira/browse/HBASE-16458
> Project: HBase
>  Issue Type: Test
>    Reporter: Ted Yu
>Assignee: Vladimir Rodionov
>Priority: Major
>  Labels: backup
> Attachments: 16458.HBASE-7912.v3.txt, 16458.HBASE-7912.v4.txt, 
> 16458.HBASE-7912.v5.txt, 16458.v1.txt, 16458.v2.txt, 16458.v3.txt, 
> HBASE-16458-v1.patch
>
>

[jira] [Updated] (HBASE-20743) ASF License warnings for branch-1

2018-09-07 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20743:
---
Description: 
>From 
>https://builds.apache.org/job/HBase%20Nightly/job/branch-1/450/artifact/output-general/patch-asflicense-problems.txt
> :
{code}
Lines that start with ? in the ASF License  report indicate files that do 
not have an Apache license header:
 !? hbase-error-prone/target/checkstyle-result.xml
 !? 
hbase-error-prone/target/classes/META-INF/services/com.google.errorprone.bugpatterns.BugChecker
 !? 
hbase-error-prone/target/maven-status/maven-compiler-plugin/compile/default-compile/inputFiles.lst
 !? 
hbase-error-prone/target/maven-status/maven-compiler-plugin/compile/default-compile/createdFiles.lst
{code}
Looks like they should be excluded.
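The exclusion the description calls for would typically go through the apache-rat-plugin configuration in the root pom.xml. The snippet below is only a sketch: the wildcard pattern is an assumption for illustration, not the actual HBase configuration.

{code}
<plugin>
  <groupId>org.apache.rat</groupId>
  <artifactId>apache-rat-plugin</artifactId>
  <configuration>
    <excludes>
      <!-- Generated build outputs under target/ do not need license headers.
           This pattern is illustrative; the real pom may list finer-grained entries. -->
      <exclude>**/target/**</exclude>
    </excludes>
  </configuration>
</plugin>
{code}

With such an exclude in place, the four hbase-error-prone/target files flagged in the report would no longer be checked for Apache license headers.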

  was:
>From 
>https://builds.apache.org/job/HBase%20Nightly/job/branch-1/350/artifact/output-general/patch-asflicense-problems.txt
> :
{code}
Lines that start with ? in the ASF License  report indicate files that do 
not have an Apache license header:
 !? hbase-error-prone/target/checkstyle-result.xml
 !? 
hbase-error-prone/target/classes/META-INF/services/com.google.errorprone.bugpatterns.BugChecker
 !? 
hbase-error-prone/target/maven-status/maven-compiler-plugin/compile/default-compile/inputFiles.lst
 !? 
hbase-error-prone/target/maven-status/maven-compiler-plugin/compile/default-compile/createdFiles.lst
{code}
Looks like they should be excluded.


> ASF License warnings for branch-1
> -
>
> Key: HBASE-20743
> URL: https://issues.apache.org/jira/browse/HBASE-20743
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Priority: Minor
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-16458) Shorten backup / restore test execution time

2018-09-07 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-16458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-16458:
---
Status: Patch Available  (was: Reopened)

> Shorten backup / restore test execution time
> 
>
> Key: HBASE-16458
> URL: https://issues.apache.org/jira/browse/HBASE-16458
> Project: HBase
>  Issue Type: Test
>    Reporter: Ted Yu
>Assignee: Vladimir Rodionov
>Priority: Major
>  Labels: backup
> Attachments: 16458.HBASE-7912.v3.txt, 16458.HBASE-7912.v4.txt, 
> 16458.HBASE-7912.v5.txt, 16458.v1.txt, 16458.v2.txt, 16458.v3.txt, 
> HBASE-16458-v1.patch
>
>

[jira] [Assigned] (HBASE-16458) Shorten backup / restore test execution time

2018-09-07 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-16458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-16458:
--

Assignee: Vladimir Rodionov  (was: Ted Yu)

Assigning to Vlad, who has done the experiments.

> Shorten backup / restore test execution time
> 
>
> Key: HBASE-16458
> URL: https://issues.apache.org/jira/browse/HBASE-16458
> Project: HBase
>  Issue Type: Test
>    Reporter: Ted Yu
>Assignee: Vladimir Rodionov
>Priority: Major
>  Labels: backup
> Attachments: 16458.HBASE-7912.v3.txt, 16458.HBASE-7912.v4.txt, 
> 16458.HBASE-7912.v5.txt, 16458.v1.txt, 16458.v2.txt, 16458.v3.txt
>
>
