[jira] [Updated] (HBASE-16012) Major compaction can't work because left scanner read point in RegionServer

2016-06-14 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-16012:
---
Attachment: HBASE-16012-v2.patch

> Major compaction can't work because left scanner read point in RegionServer
> ---
>
> Key: HBASE-16012
> URL: https://issues.apache.org/jira/browse/HBASE-16012
> Project: HBase
>  Issue Type: Bug
>  Components: Compaction, Scanners
>Affects Versions: 2.0.0, 0.94.27
>Reporter: Guanghao Zhang
> Fix For: 2.0.0
>
> Attachments: HBASE-16012-v1.patch, HBASE-16012-v2.patch, 
> HBASE-16012.patch
>
>
> When a new RegionScanner is created, it adds a scanner read point to 
> scannerReadPoints. But if an exception is thrown after the read point is 
> added, the read point stays in the region server, and deletes newer than this 
> mvcc number will never be compacted away.
> Our HBase version is based on 0.94. The master branch has this bug too, if 
> another exception is thrown while initializing the RegionScanner.
> ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: Failed openScanner 
> java.io.IOException: Could not seek StoreFileScanner
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:160)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.seekScanners(StoreScanner.java:268)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.<init>(StoreScanner.java:168)
>   at org.apache.hadoop.hbase.regionserver.Store.getScanner(Store.java:2232)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.<init>(HRegion.java:4026)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.instantiateRegionScanner(HRegion.java:1895)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1879)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1854)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.internalOpenScanner(HRegionServer.java:3032)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:2995)
>   at sun.reflect.GeneratedMethodAccessor67.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hbase.ipc.SecureRpcEngine$Server.call(SecureRpcEngine.java:338)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1595)
> Caused by: org.apache.hadoop.hbase.ipc.CallerDisconnectedException: Aborting 
> call openScanner, since caller disconnected
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Call.throwExceptionIfCallerDisconnected(HBaseServer.java:475)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileBlock$AbstractFSReader.readAtOffset(HFileBlock.java:1443)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockDataInternal(HFileBlock.java:1902)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockData(HFileBlock.java:1766)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:345)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:254)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:499)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:520)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekAtOrAfter(StoreFileScanner.java:235)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:148)
>   ... 14 more



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14331) a single callQueue related improvements

2016-06-14 Thread Hiroshi Ikeda (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15331253#comment-15331253
 ] 

Hiroshi Ikeda commented on HBASE-14331:
---

bq. Do you want to close this issue then in favor of HBASE-14479?

I think these issues focus on different concerns (even though they are 
related). A big issue is hard to tackle.


bq. So, where would we do scheduling if the Reader threads ran the request 
Hiroshi Ikeda?

Even if we adopt the strategy of having reader threads tend to directly execute 
tasks, we can use queues to store excess tasks waiting for their turn 
(putting aside whether we should do so).

Passing tasks between different threads can actually reduce latency (and that 
would be justified under heavy congestion), but throughput is harmed by context 
switches, and using CAS-based queues might not be so harmful.

Rather, I worry about compatibility issues around callQueue if we make readers 
execute tasks directly.



> a single callQueue related improvements
> ---
>
> Key: HBASE-14331
> URL: https://issues.apache.org/jira/browse/HBASE-14331
> Project: HBase
>  Issue Type: Improvement
>  Components: IPC/RPC, Performance
>Reporter: Hiroshi Ikeda
>Assignee: Hiroshi Ikeda
> Attachments: BlockingQueuesPerformanceTestApp-output.pdf, 
> BlockingQueuesPerformanceTestApp-output.txt, 
> BlockingQueuesPerformanceTestApp.java, CallQueuePerformanceTestApp.java, 
> HBASE-14331-V2.patch, HBASE-14331-V3.patch, HBASE-14331-V4.patch, 
> HBASE-14331-V5.patch, HBASE-14331-V6.patch, HBASE-14331-V6.patch, 
> HBASE-14331.patch, HBASE-14331.patch, SemaphoreBasedBlockingQueue.java, 
> SemaphoreBasedLinkedBlockingQueue.java, 
> SemaphoreBasedPriorityBlockingQueue.java
>
>
> {{LinkedBlockingQueue}} separates locks well between the {{take}} method and 
> the {{put}} method, but not between takers, and not between putters. These 
> methods take their locks almost at the beginning of their logic. 
> HBASE-11355 introduces multiple call-queues to reduce such possible 
> congestion, but I doubt that it is required to stick to {{BlockingQueue}}.
> There are other shortcomings to using {{BlockingQueue}}. When using 
> multiple queues, since {{BlockingQueue}} blocks threads, it is required to 
> prepare enough threads for each queue. It is possible to have one queue 
> starving for threads while threads sit idle on another queue. 
> Even if you can tune parameters to avoid such situations, the tuning is not 
> trivial.
> I suggest using a single {{ConcurrentLinkedQueue}} with a {{Semaphore}}.
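A minimal sketch of the suggested combination, for illustration only (the 
attached SemaphoreBased*Queue.java files are the real proposals; this just 
shows the core put/take pairing):

{code}
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Semaphore;

// Sketch: a blocking put/take pair built from a lock-free queue plus a
// semaphore whose permits count available elements, so takers park without
// contending on a shared queue lock.
class SemaphoreQueue<E> {
  private final Queue<E> q = new ConcurrentLinkedQueue<>();
  private final Semaphore available = new Semaphore(0);

  void put(E e) {
    q.offer(e);          // CAS-based enqueue, no putter lock
    available.release(); // signal one waiting taker
  }

  E take() throws InterruptedException {
    available.acquire(); // block until some element has been offered
    return q.poll();     // non-null: permits never exceed enqueued elements
  }
}
{code}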



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-9393) Hbase does not closing a closed socket resulting in many CLOSE_WAIT

2016-06-14 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15331234#comment-15331234
 ] 

Anoop Sam John commented on HBASE-9393:
---

Thanks.
We should get this work in. It seems many users hit this issue here and there.

> Hbase does not closing a closed socket resulting in many CLOSE_WAIT 
> 
>
> Key: HBASE-9393
> URL: https://issues.apache.org/jira/browse/HBASE-9393
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.2, 0.98.0, 1.0.1.1, 1.1.2
> Environment: Centos 6.4 - 7 regionservers/datanodes, 8 TB per node, 
> 7279 regions
>Reporter: Avi Zrachya
>Assignee: Ashish Singhi
>Priority: Critical
> Fix For: 2.0.0
>
> Attachments: HBASE-9393.patch, HBASE-9393.v1.patch, 
> HBASE-9393.v10.patch, HBASE-9393.v11.patch, HBASE-9393.v12.patch, 
> HBASE-9393.v13.patch, HBASE-9393.v14.patch, HBASE-9393.v15.patch, 
> HBASE-9393.v15.patch, HBASE-9393.v2.patch, HBASE-9393.v3.patch, 
> HBASE-9393.v4.patch, HBASE-9393.v5.patch, HBASE-9393.v5.patch, 
> HBASE-9393.v5.patch, HBASE-9393.v6.patch, HBASE-9393.v6.patch, 
> HBASE-9393.v6.patch, HBASE-9393.v7.patch, HBASE-9393.v8.patch, 
> HBASE-9393.v9.patch
>
>
> HBase does not close a dead connection with the datanode.
> This results in over 60K sockets in CLOSE_WAIT, and at some point HBase can no 
> longer connect to the datanode because there are too many mapped sockets from 
> one host to another on the same port.
> The example below shows a low CLOSE_WAIT count because we had to restart 
> hbase to solve the problem; over time it will increase to 60-100K sockets 
> in CLOSE_WAIT.
> [root@hd2-region3 ~]# netstat -nap |grep CLOSE_WAIT |grep 21592 |wc -l
> 13156
> [root@hd2-region3 ~]# ps -ef |grep 21592
> root     17255 17219  0 12:26 pts/0    00:00:00 grep 21592
> hbase    21592     1 17 Aug29 ?        03:29:06 
> /usr/java/jdk1.6.0_26/bin/java -XX:OnOutOfMemoryError=kill -9 %p -Xmx8000m 
> -ea -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode 
> -Dhbase.log.dir=/var/log/hbase 
> -Dhbase.log.file=hbase-hbase-regionserver-hd2-region3.swnet.corp.log ...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16012) Major compaction can't work because left scanner read point in RegionServer

2016-06-14 Thread Jingcheng Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15331230#comment-15331230
 ] 

Jingcheng Du commented on HBASE-16012:
--

Remove the changes in RSRpcServices? It seems we don't need them.

> Major compaction can't work because left scanner read point in RegionServer
> ---
>
> Key: HBASE-16012
> URL: https://issues.apache.org/jira/browse/HBASE-16012
> Project: HBase
>  Issue Type: Bug
>  Components: Compaction, Scanners
>Affects Versions: 2.0.0, 0.94.27
>Reporter: Guanghao Zhang
> Fix For: 2.0.0
>
> Attachments: HBASE-16012-v1.patch, HBASE-16012.patch
>
>
> When a new RegionScanner is created, it adds a scanner read point to 
> scannerReadPoints. But if an exception is thrown after the read point is 
> added, the read point stays in the region server, and deletes newer than this 
> mvcc number will never be compacted away.
> Our HBase version is based on 0.94. The master branch has this bug too, if 
> another exception is thrown while initializing the RegionScanner.
> ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: Failed openScanner 
> java.io.IOException: Could not seek StoreFileScanner
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:160)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.seekScanners(StoreScanner.java:268)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.<init>(StoreScanner.java:168)
>   at org.apache.hadoop.hbase.regionserver.Store.getScanner(Store.java:2232)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.<init>(HRegion.java:4026)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.instantiateRegionScanner(HRegion.java:1895)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1879)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1854)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.internalOpenScanner(HRegionServer.java:3032)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:2995)
>   at sun.reflect.GeneratedMethodAccessor67.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hbase.ipc.SecureRpcEngine$Server.call(SecureRpcEngine.java:338)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1595)
> Caused by: org.apache.hadoop.hbase.ipc.CallerDisconnectedException: Aborting 
> call openScanner, since caller disconnected
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Call.throwExceptionIfCallerDisconnected(HBaseServer.java:475)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileBlock$AbstractFSReader.readAtOffset(HFileBlock.java:1443)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockDataInternal(HFileBlock.java:1902)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockData(HFileBlock.java:1766)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:345)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:254)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:499)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:520)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekAtOrAfter(StoreFileScanner.java:235)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:148)
>   ... 14 more



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-16028) All Regions are flushed at about same time when MEMSTORE_PERIODIC_FLUSH is on, causing flush spike

2016-06-14 Thread Tianying Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tianying Chang resolved HBASE-16028.

Resolution: Duplicate

> All Regions are flushed at about same time when MEMSTORE_PERIODIC_FLUSH is 
> on, causing flush spike
> --
>
> Key: HBASE-16028
> URL: https://issues.apache.org/jira/browse/HBASE-16028
> Project: HBase
>  Issue Type: Improvement
>  Components: hbase, Performance
>Affects Versions: 1.2.1
>Reporter: Tianying Chang
>Assignee: Tianying Chang
>
> In our production cluster, we observed that memstore flushes spike every hour 
> for all regions/RS (we use the default memstore periodic flush interval of 1 
> hour). 
> This happens when two conditions are met: 
> 1. the memstore does not have enough data to be flushed before the 1 hour 
> limit is reached;
> 2. all regions are opened around the same time (e.g. all RS are started at 
> the same time when starting a cluster). 
> With the above two conditions, all the regions will be flushed around the same 
> time, at startTime+1hour-delay, again and again.
> We added a flush jittering time to randomize the flush time of each region, 
> so that they don't get flushed at around the same time. We had this feature 
> running in our 94.7 and 94.26 clusters. Recently we upgraded to 1.2 and found 
> this issue is still there, so we are porting the fix into the 1.2 branch. 
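The jitter idea itself is simple; a sketch of the shape of the change (class, 
field, and method names here are illustrative, not the actual patch):

{code}
import java.util.concurrent.ThreadLocalRandom;

// Sketch: give each region a random per-region offset so regions opened at
// the same time do not all reach the periodic-flush deadline together.
class PeriodicFlushJitter {
  static final long FLUSH_INTERVAL_MS = 3600_000L; // default: 1 hour
  // Fixed at region open; e.g. up to 10% of the interval.
  final long jitterMs =
      ThreadLocalRandom.current().nextLong(FLUSH_INTERVAL_MS / 10);

  boolean shouldPeriodicFlush(long nowMs, long lastFlushTimeMs) {
    return nowMs - lastFlushTimeMs > FLUSH_INTERVAL_MS - jitterMs;
  }
}
{code}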



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-16029) All Regions are flushed at about same time when MEMSTORE_PERIODIC_FLUSH is on, causing flush spike

2016-06-14 Thread Tianying Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tianying Chang resolved HBASE-16029.

Resolution: Duplicate

> All Regions are flushed at about same time when MEMSTORE_PERIODIC_FLUSH is 
> on, causing flush spike
> --
>
> Key: HBASE-16029
> URL: https://issues.apache.org/jira/browse/HBASE-16029
> Project: HBase
>  Issue Type: Improvement
>  Components: hbase, Performance
>Affects Versions: 1.2.1
>Reporter: Tianying Chang
>Assignee: Tianying Chang
>
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-16027) All Regions are flushed at about same time when MEMSTORE_PERIODIC_FLUSH is on, causing flush spike

2016-06-14 Thread Tianying Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tianying Chang resolved HBASE-16027.

Resolution: Duplicate

> All Regions are flushed at about same time when MEMSTORE_PERIODIC_FLUSH is 
> on, causing flush spike
> --
>
> Key: HBASE-16027
> URL: https://issues.apache.org/jira/browse/HBASE-16027
> Project: HBase
>  Issue Type: Bug
>  Components: hbase, Performance
>Affects Versions: 1.2.1
>Reporter: Tianying Chang
>Assignee: Tianying Chang
>
> In our production cluster, we observed that memstore flushes spike every hour 
> for all regions/RS (we use the default memstore periodic flush interval of 1 
> hour). 
> This happens when two conditions are met: 
> 1. the memstore does not have enough data to be flushed before the 1 hour 
> limit is reached;
> 2. all regions are opened around the same time (e.g. all RS are started at 
> the same time when starting a cluster). 
> With the above two conditions, all the regions will be flushed around the same 
> time, at startTime+1hour-delay, again and again.
> We added a flush jittering time to randomize the flush time of each region, 
> so that they don't get flushed at around the same time. We had this feature 
> running in our 94.7 and 94.26 clusters. Recently we upgraded to 1.2 and found 
> this issue is still there, so we are porting the fix into the 1.2 branch. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16012) Major compaction can't work because left scanner read point in RegionServer

2016-06-14 Thread Jingcheng Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15331218#comment-15331218
 ] 

Jingcheng Du commented on HBASE-16012:
--

bq. Yes, it should close the store scanner, too. But there are 
additionalScanners when initializing the RegionScanner. I am not sure whether 
we should close these additionalScanners. The current code in the master branch 
never uses additionalScanners and it is always null.
When the region scanner is closed, the additionalScanners are closed too, so I 
guess it is okay to close them when exceptions occur. Any thoughts on this 
change, guys?
And the heaps have to be closed if they are not empty/null; otherwise, close 
all the store scanners?
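For reference, the shape of the fix would presumably be to undo the read-point 
registration on any failing initialization path; a sketch only, with the field 
and helper names (scannerReadPoints, initializeScanners, closeStoreScanners) 
assumed from the description rather than taken from the patch:

{code}
// Sketch of RegionScanner construction with cleanup on failure.
scannerReadPoints.put(this, readPt);
try {
  initializeScanners(scan, additionalScanners); // may throw, e.g. on a failed seek
} catch (IOException e) {
  scannerReadPoints.remove(this); // don't leave a stale read point behind
  closeStoreScanners();           // hypothetical helper closing opened scanners
  throw e;
}
{code}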


> Major compaction can't work because left scanner read point in RegionServer
> ---
>
> Key: HBASE-16012
> URL: https://issues.apache.org/jira/browse/HBASE-16012
> Project: HBase
>  Issue Type: Bug
>  Components: Compaction, Scanners
>Affects Versions: 2.0.0, 0.94.27
>Reporter: Guanghao Zhang
> Fix For: 2.0.0
>
> Attachments: HBASE-16012-v1.patch, HBASE-16012.patch
>
>
> When a new RegionScanner is created, it adds a scanner read point to 
> scannerReadPoints. But if an exception is thrown after the read point is 
> added, the read point stays in the region server, and deletes newer than this 
> mvcc number will never be compacted away.
> Our HBase version is based on 0.94. The master branch has this bug too, if 
> another exception is thrown while initializing the RegionScanner.
> ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: Failed openScanner 
> java.io.IOException: Could not seek StoreFileScanner
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:160)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.seekScanners(StoreScanner.java:268)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.<init>(StoreScanner.java:168)
>   at org.apache.hadoop.hbase.regionserver.Store.getScanner(Store.java:2232)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.<init>(HRegion.java:4026)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.instantiateRegionScanner(HRegion.java:1895)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1879)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1854)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.internalOpenScanner(HRegionServer.java:3032)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:2995)
>   at sun.reflect.GeneratedMethodAccessor67.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hbase.ipc.SecureRpcEngine$Server.call(SecureRpcEngine.java:338)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1595)
> Caused by: org.apache.hadoop.hbase.ipc.CallerDisconnectedException: Aborting 
> call openScanner, since caller disconnected
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Call.throwExceptionIfCallerDisconnected(HBaseServer.java:475)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileBlock$AbstractFSReader.readAtOffset(HFileBlock.java:1443)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockDataInternal(HFileBlock.java:1902)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockData(HFileBlock.java:1766)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:345)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:254)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:499)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:520)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekAtOrAfter(StoreFileScanner.java:235)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:148)
>   ... 14 more



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-9393) Hbase does not closing a closed socket resulting in many CLOSE_WAIT

2016-06-14 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15331202#comment-15331202
 ] 

Chris Nauroth commented on HBASE-9393:
--

With short-circuit read, there is still a brief interaction between the HDFS 
client and the DataNode's data transfer port to request sharing file 
descriptors to perform the direct read.  It's much less data compared to an 
actual block transfer, but the TCP socket is still there.

> Hbase does not closing a closed socket resulting in many CLOSE_WAIT 
> 
>
> Key: HBASE-9393
> URL: https://issues.apache.org/jira/browse/HBASE-9393
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.2, 0.98.0, 1.0.1.1, 1.1.2
> Environment: Centos 6.4 - 7 regionservers/datanodes, 8 TB per node, 
> 7279 regions
>Reporter: Avi Zrachya
>Assignee: Ashish Singhi
>Priority: Critical
> Fix For: 2.0.0
>
> Attachments: HBASE-9393.patch, HBASE-9393.v1.patch, 
> HBASE-9393.v10.patch, HBASE-9393.v11.patch, HBASE-9393.v12.patch, 
> HBASE-9393.v13.patch, HBASE-9393.v14.patch, HBASE-9393.v15.patch, 
> HBASE-9393.v15.patch, HBASE-9393.v2.patch, HBASE-9393.v3.patch, 
> HBASE-9393.v4.patch, HBASE-9393.v5.patch, HBASE-9393.v5.patch, 
> HBASE-9393.v5.patch, HBASE-9393.v6.patch, HBASE-9393.v6.patch, 
> HBASE-9393.v6.patch, HBASE-9393.v7.patch, HBASE-9393.v8.patch, 
> HBASE-9393.v9.patch
>
>
> HBase does not close a dead connection with the datanode.
> This results in over 60K sockets in CLOSE_WAIT, and at some point HBase can no 
> longer connect to the datanode because there are too many mapped sockets from 
> one host to another on the same port.
> The example below shows a low CLOSE_WAIT count because we had to restart 
> hbase to solve the problem; over time it will increase to 60-100K sockets 
> in CLOSE_WAIT.
> [root@hd2-region3 ~]# netstat -nap |grep CLOSE_WAIT |grep 21592 |wc -l
> 13156
> [root@hd2-region3 ~]# ps -ef |grep 21592
> root     17255 17219  0 12:26 pts/0    00:00:00 grep 21592
> hbase    21592     1 17 Aug29 ?        03:29:06 
> /usr/java/jdk1.6.0_26/bin/java -XX:OnOutOfMemoryError=kill -9 %p -Xmx8000m 
> -ea -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode 
> -Dhbase.log.dir=/var/log/hbase 
> -Dhbase.log.file=hbase-hbase-regionserver-hd2-region3.swnet.corp.log ...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-9393) Hbase does not closing a closed socket resulting in many CLOSE_WAIT

2016-06-14 Thread Ashish Singhi (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15331195#comment-15331195
 ] 

Ashish Singhi commented on HBASE-9393:
--

I think if SCR is enabled then there will not be any socket connection to the 
DN port. Which port are the sockets in CLOSE_WAIT connected to, then?

> Hbase does not closing a closed socket resulting in many CLOSE_WAIT 
> 
>
> Key: HBASE-9393
> URL: https://issues.apache.org/jira/browse/HBASE-9393
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.2, 0.98.0, 1.0.1.1, 1.1.2
> Environment: Centos 6.4 - 7 regionservers/datanodes, 8 TB per node, 
> 7279 regions
>Reporter: Avi Zrachya
>Assignee: Ashish Singhi
>Priority: Critical
> Fix For: 2.0.0
>
> Attachments: HBASE-9393.patch, HBASE-9393.v1.patch, 
> HBASE-9393.v10.patch, HBASE-9393.v11.patch, HBASE-9393.v12.patch, 
> HBASE-9393.v13.patch, HBASE-9393.v14.patch, HBASE-9393.v15.patch, 
> HBASE-9393.v15.patch, HBASE-9393.v2.patch, HBASE-9393.v3.patch, 
> HBASE-9393.v4.patch, HBASE-9393.v5.patch, HBASE-9393.v5.patch, 
> HBASE-9393.v5.patch, HBASE-9393.v6.patch, HBASE-9393.v6.patch, 
> HBASE-9393.v6.patch, HBASE-9393.v7.patch, HBASE-9393.v8.patch, 
> HBASE-9393.v9.patch
>
>
> HBase does not close a dead connection with the datanode.
> This results in over 60K sockets in CLOSE_WAIT, and at some point HBase can no 
> longer connect to the datanode because there are too many mapped sockets from 
> one host to another on the same port.
> The example below shows a low CLOSE_WAIT count because we had to restart 
> hbase to solve the problem; over time it will increase to 60-100K sockets 
> in CLOSE_WAIT.
> [root@hd2-region3 ~]# netstat -nap |grep CLOSE_WAIT |grep 21592 |wc -l
> 13156
> [root@hd2-region3 ~]# ps -ef |grep 21592
> root     17255 17219  0 12:26 pts/0    00:00:00 grep 21592
> hbase    21592     1 17 Aug29 ?        03:29:06 
> /usr/java/jdk1.6.0_26/bin/java -XX:OnOutOfMemoryError=kill -9 %p -Xmx8000m 
> -ea -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode 
> -Dhbase.log.dir=/var/log/hbase 
> -Dhbase.log.file=hbase-hbase-regionserver-hd2-region3.swnet.corp.log ...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-15977) Failed variable substitution on home page

2016-06-14 Thread Dima Spivak (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dima Spivak updated HBASE-15977:

Attachment: HBASE-15977.patch

Posted a trivial patch that fixes the unsubstituted variable on the homepage.

> Failed variable substitution on home page
> -
>
> Key: HBASE-15977
> URL: https://issues.apache.org/jira/browse/HBASE-15977
> Project: HBase
>  Issue Type: Bug
>  Components: website
>Reporter: Nick Dimiduk
>Assignee: Dima Spivak
> Attachments: HBASE-15977.patch, banner.name.png
>
>
> Check out the top-left of hbase.apache.org, there's an unevaluated variable 
> {{$banner.name}} leaking through.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-15977) Failed variable substitution on home page

2016-06-14 Thread Dima Spivak (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dima Spivak updated HBASE-15977:

Status: Patch Available  (was: Open)

> Failed variable substitution on home page
> -
>
> Key: HBASE-15977
> URL: https://issues.apache.org/jira/browse/HBASE-15977
> Project: HBase
>  Issue Type: Bug
>  Components: website
>Reporter: Nick Dimiduk
>Assignee: Dima Spivak
> Attachments: HBASE-15977.patch, banner.name.png
>
>
> Check out the top-left of hbase.apache.org, there's an unevaluated variable 
> {{$banner.name}} leaking through.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HBASE-15977) Failed variable substitution on home page

2016-06-14 Thread Dima Spivak (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dima Spivak reassigned HBASE-15977:
---

Assignee: Dima Spivak  (was: stack)

I've got a one-liner that fixes the problem when I do a {{mvn site}} locally if 
someone wants to commit it.

> Failed variable substitution on home page
> -
>
> Key: HBASE-15977
> URL: https://issues.apache.org/jira/browse/HBASE-15977
> Project: HBase
>  Issue Type: Bug
>  Components: website
>Reporter: Nick Dimiduk
>Assignee: Dima Spivak
> Attachments: banner.name.png
>
>
> Check out the top-left of hbase.apache.org, there's an unevaluated variable 
> {{$banner.name}} leaking through.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-15406) Split / merge switch left disabled after early termination of hbck

2016-06-14 Thread Mikhail Antonov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Antonov resolved HBASE-15406.
-
Resolution: Fixed

Several separate issues (linked) have been opened to address this, so I'm 
closing this one.

I think the current consensus is to revert this from 1.3 (it's not a really 
critical bug) and leave it in branch-1 and master until a better solution is 
found and implemented.

Feel free to re-open if I missed something.

> Split / merge switch left disabled after early termination of hbck
> --
>
> Key: HBASE-15406
> URL: https://issues.apache.org/jira/browse/HBASE-15406
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Heng Chen
>  Labels: reviewed
> Fix For: 2.0.0, 1.3.0, 1.4.0
>
> Attachments: HBASE-15406.patch, HBASE-15406.v1.patch, 
> HBASE-15406_v1.patch, HBASE-15406_v2.patch, test.patch, wip.patch
>
>
> This was what I did on cluster with 1.4.0-SNAPSHOT built Thursday:
> Run 'hbase hbck -disableSplitAndMerge' on gateway node of the cluster
> Terminate hbck early
> Enter hbase shell where I observed:
> {code}
> hbase(main):001:0> splitormerge_enabled 'SPLIT'
> false
> 0 row(s) in 0.3280 seconds
> hbase(main):002:0> splitormerge_enabled 'MERGE'
> false
> 0 row(s) in 0.0070 seconds
> {code}
> Expectation is that the split / merge switches should be restored to default 
> value after hbck exits.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15136) Explore different queuing behaviors while busy

2016-06-14 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15331179#comment-15331179
 ] 

Mikhail Antonov commented on HBASE-15136:
-

[~stack] setting it as the default is flipping one flag, almost a one-liner 
patch, and that is where I definitely want to get (yeah, a new issue would be 
good; I actually thought I might have created one before, need to check).

The reasons I didn't set it as the default yet are that 1) though I'd been 
running it on some decent-size test clusters with realistic workloads, it's not 
as battle-tested yet and the defaults need to be tweaked, and 2) the workloads 
I was running are different from, say, read-only workloads with ~100% cache hit 
rate (like this latest one which revealed the regression in the scheduler), so 
my focus was a bit different.

Since it's mostly FIFO, and since fastpath hand-off and codel are kind of 
non-intersecting optimizations as I see it (if you need to combat long queues, 
there's definitely no option for direct hand-off), applying fastpath scheduling 
to codel should be pretty straightforward. Then yes, we can make this the 
default?

> Explore different queuing behaviors while busy
> --
>
> Key: HBASE-15136
> URL: https://issues.apache.org/jira/browse/HBASE-15136
> Project: HBase
>  Issue Type: New Feature
>  Components: IPC/RPC
>Reporter: Elliott Clark
>Assignee: Mikhail Antonov
>Priority: Critical
> Fix For: 2.0.0, 1.3.0
>
> Attachments: HBASE-15136-1.2.v1.patch, HBASE-15136-v2.patch, 
> deadline_scheduler_v_0_2.patch
>
>
> http://queue.acm.org/detail.cfm?id=2839461



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16023) Fastpath for the FIFO rpcscheduler

2016-06-14 Thread Hiroshi Ikeda (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15331172#comment-15331172
 ] 

Hiroshi Ikeda commented on HBASE-16023:
---

Sorry for my lack of explanation.

A concrete example of the race condition:

1. Worker checks that there is no task.
2. Reader checks that there is no ready handler.
3. Worker pushes itself as a ready handler and waits on the semaphore.
4. Reader queues a task to the queue, without directly passing it to a ready 
handler or releasing the semaphore.

(1,3) and (2,4) should be executed exclusively of each other. Whether this 
happens depends on luck, and it might not be severe(?)
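One common way to narrow that window is to re-check the queue after 
registering, so a task enqueued in the gap is noticed before parking. A sketch 
(loadedCallRunner is an assumed field name; note this is still not airtight, 
which supports the point that there may be no simple flawless implementation):

{code}
// Sketch: double-check the queue after pushing onto the handler stack.
this.fastPathHandlerStack.push(this);
CallRunner cr = this.q.poll();
if (cr == null) {
  this.semaphore.acquire();     // park until a reader hands a task to us
  cr = this.loadedCallRunner;
} else {
  // We are still on the stack, so a reader may also hand us a task; we
  // would need to deregister atomically or absorb the duplicate handoff.
}
{code}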


> Fastpath for the FIFO rpcscheduler
> --
>
> Key: HBASE-16023
> URL: https://issues.apache.org/jira/browse/HBASE-16023
> Project: HBase
>  Issue Type: Sub-task
>  Components: Performance, rpc
>Affects Versions: 2.0.0, 1.3.0
>Reporter: stack
>Assignee: stack
> Fix For: 2.0.0, 1.3.0
>
> Attachments: HBASE-16023.branch-1.001.patch, 
> hits.nofifo.fifoplusfp.fifownofp.hacks.png
>
>
> This is an idea copied from kudu where we skip queuing a request if there is 
> a handler ready to go; we just do a direct handoff from reader to handler.
> Makes for close to a 20% improvement in random-read workload testing, moving 
> the bottleneck to HBASE-15716 and to returning the results.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15136) Explore different queuing behaviors while busy

2016-06-14 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15331148#comment-15331148
 ] 

stack commented on HBASE-15136:
---

[~mantonov] Why ain't we working on making this our default? It has FIFO 
behavior when it is not overloaded, so we can do stuff like our fastpath 
speedup HBASE-16023 when all is 'normal', but then when we start to back up, it 
moves to LIFO w/ controlled delay and shedding, an algorithm that is pretty 
basic and seems to be used with some success in a few places at your shop. What 
do you think? We'd have to do some work to make it as frictionless as possible 
in the 'normal' case. We can make a new issue to make it default?

> Explore different queuing behaviors while busy
> --
>
> Key: HBASE-15136
> URL: https://issues.apache.org/jira/browse/HBASE-15136
> Project: HBase
>  Issue Type: New Feature
>  Components: IPC/RPC
>Reporter: Elliott Clark
>Assignee: Mikhail Antonov
>Priority: Critical
> Fix For: 2.0.0, 1.3.0
>
> Attachments: HBASE-15136-1.2.v1.patch, HBASE-15136-v2.patch, 
> deadline_scheduler_v_0_2.patch
>
>
> http://queue.acm.org/detail.cfm?id=2839461



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-15136) Explore different queuing behaviors while busy

2016-06-14 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-15136:
--
Priority: Critical  (was: Major)

> Explore different queuing behaviors while busy
> --
>
> Key: HBASE-15136
> URL: https://issues.apache.org/jira/browse/HBASE-15136
> Project: HBase
>  Issue Type: New Feature
>  Components: IPC/RPC
>Reporter: Elliott Clark
>Assignee: Mikhail Antonov
>Priority: Critical
> Fix For: 2.0.0, 1.3.0
>
> Attachments: HBASE-15136-1.2.v1.patch, HBASE-15136-v2.patch, 
> deadline_scheduler_v_0_2.patch
>
>
> http://queue.acm.org/detail.cfm?id=2839461



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15978) Netty API leaked into public API

2016-06-14 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15331142#comment-15331142
 ] 

stack commented on HBASE-15978:
---

HBase 2.0 is jdk8 only. We voted on it up on the dev list. Example conversion 
methods are here: 
http://stackoverflow.com/questions/23301598/transform-java-future-into-a-completablefuture



> Netty API leaked into public API
> 
>
> Key: HBASE-15978
> URL: https://issues.apache.org/jira/browse/HBASE-15978
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 2.0.0
>Reporter: Nick Dimiduk
>Assignee: Jurriaan Mous
>Priority: Blocker
> Attachments: HBASE-15978-guava.patch, HBASE-15978.patch
>
>
> Noticed our public 
> {{[client.Future|http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Future.html]}}
>  interface extends Netty's, which means our public API is bound to a specific 
> Netty API and release. IIRC we were minimizing our public-facing surface area 
> and asserting ownership over the whole of it so as to control our 
> compatibility. Ie, we've done this with Protobuf as well. Not sure if this 
> has made it back to other branches.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15406) Split / merge switch left disabled after early termination of hbck

2016-06-14 Thread Heng Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15331137#comment-15331137
 ] 

Heng Chen commented on HBASE-15406:
---

OK.  Let's go on with your proposal [~enis] 

> Split / merge switch left disabled after early termination of hbck
> --
>
> Key: HBASE-15406
> URL: https://issues.apache.org/jira/browse/HBASE-15406
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Heng Chen
>  Labels: reviewed
> Fix For: 2.0.0, 1.3.0, 1.4.0
>
> Attachments: HBASE-15406.patch, HBASE-15406.v1.patch, 
> HBASE-15406_v1.patch, HBASE-15406_v2.patch, test.patch, wip.patch
>
>
> This was what I did on cluster with 1.4.0-SNAPSHOT built Thursday:
> Run 'hbase hbck -disableSplitAndMerge' on gateway node of the cluster
> Terminate hbck early
> Enter hbase shell where I observed:
> {code}
> hbase(main):001:0> splitormerge_enabled 'SPLIT'
> false
> 0 row(s) in 0.3280 seconds
> hbase(main):002:0> splitormerge_enabled 'MERGE'
> false
> 0 row(s) in 0.0070 seconds
> {code}
> Expectation is that the split / merge switches should be restored to default 
> value after hbck exits.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16023) Fastpath for the FIFO rpcscheduler

2016-06-14 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15331135#comment-15331135
 ] 

stack commented on HBASE-16023:
---

bq. Checking for no task and waiting on the semaphore is not atomic, and it is 
possible that a reader queues a task while a worker is about to wait on the 
semaphore.

Even if the Reader's semaphore release runs before we get to the acquire, that's 
ok? The acquire will just not block when called? Let me know if I misunderstand.
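For what it's worth, java.util.concurrent.Semaphore does behave that way: a 
release that happens before the acquire simply leaves a permit behind, so the 
acquire returns immediately. A standalone demo (not from the patch):

{code}
import java.util.concurrent.Semaphore;

public class ReleaseBeforeAcquire {
  public static void main(String[] args) throws InterruptedException {
    Semaphore s = new Semaphore(0); // zero permits initially
    s.release();                    // the "wakeup" arrives before anyone waits
    s.acquire();                    // does not block: consumes the stored permit
    System.out.println("acquire returned immediately");
  }
}
{code}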

> Fastpath for the FIFO rpcscheduler
> --
>
> Key: HBASE-16023
> URL: https://issues.apache.org/jira/browse/HBASE-16023
> Project: HBase
>  Issue Type: Sub-task
>  Components: Performance, rpc
>Affects Versions: 2.0.0, 1.3.0
>Reporter: stack
>Assignee: stack
> Fix For: 2.0.0, 1.3.0
>
> Attachments: HBASE-16023.branch-1.001.patch, 
> hits.nofifo.fifoplusfp.fifownofp.hacks.png
>
>
> This is an idea copied from kudu where we skip queuing a request if there is 
> a handler ready to go; we just do a direct handoff from reader to handler.
> Makes for close to a 20% improvement in random-read workload testing, moving 
> the bottleneck to HBASE-15716 and to returning the results.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-15862) Backup - Delete- Restore does not restore deleted data

2016-06-14 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu resolved HBASE-15862.

  Resolution: Fixed
Hadoop Flags: Reviewed

> Backup - Delete- Restore does not restore deleted data
> --
>
> Key: HBASE-15862
> URL: https://issues.apache.org/jira/browse/HBASE-15862
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>  Labels: backup
> Fix For: 2.0.0
>
> Attachments: HBASE-15862-v1.patch, HBASE-15862-v2.patch, 
> HBASE-15862-v3.patch
>
>
> This was discovered during testing. If we delete a row after a full backup 
> and immediately perform a restore, the deleted row still remains deleted. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-15746) Remove extra RegionCoprocessor preClose() in RSRpcServices#closeRegion

2016-06-14 Thread Sean Busbey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated HBASE-15746:

Fix Version/s: (was: 1.2.2)
   1.2.3

> Remove extra RegionCoprocessor preClose() in RSRpcServices#closeRegion
> --
>
> Key: HBASE-15746
> URL: https://issues.apache.org/jira/browse/HBASE-15746
> Project: HBase
>  Issue Type: Bug
>  Components: Coprocessors, regionserver
>Affects Versions: 2.0.0, 1.3.0, 1.2.1, 1.1.4, 0.98.19
>Reporter: Matteo Bertozzi
>Assignee: Stephen Yuan Jiang
>Priority: Minor
> Fix For: 2.0.0, 1.3.0, 1.0.4, 1.4.0, 1.2.3
>
> Attachments: HBASE-15746.v1-master.patch
>
>
> The preClose() region coprocessor call gets called 3 times via rpc.
> The first one is when we receive the RPC:
> https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java#L1329
> The second is when we ask the RS to close the region:
> https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java#L2852
> The third is when the doClose() on the region is executed:
> https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java#L1419
> I'm pretty sure the first one can be removed, since there is no code between 
> it and the second call and they are a copy-paste.
> The second one explicitly says it is there to enforce ACLs before starting 
> the operation, which leads me to think that the third one, in the region, 
> gets executed too late in the process. But region.close() may be called by 
> someone other than the RS, so we should probably leave that preClose() in 
> there (e.g. OpenRegionHandler on failure cleanup). 
> Any ideas?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16030) All Regions are flushed at about same time when MEMSTORE_PERIODIC_FLUSH is on, causing flush spike

2016-06-14 Thread Sean Busbey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated HBASE-16030:

Fix Version/s: (was: 1.2.2)
   1.2.3

> All Regions are flushed at about same time when MEMSTORE_PERIODIC_FLUSH is 
> on, causing flush spike
> --
>
> Key: HBASE-16030
> URL: https://issues.apache.org/jira/browse/HBASE-16030
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 1.2.1
>Reporter: Tianying Chang
>Assignee: Tianying Chang
> Fix For: 2.0.0, 1.3.0, 1.2.3
>
> Attachments: hbase-16030.patch
>
>
> In our production cluster, we observed that memstore flush spike every hour 
> for all regions/RS. (we use the default memstore periodic flush time of 1 
> hour). 
> This will happend when two conditions are met: 
> 1. the memstore does not have enough data to be flushed before 1 hour limit 
> reached;
> 2. all regions are opened around the same time, (e.g. all RS are started at 
> the same time when start a cluster). 
> With above two conditions, all the regions will be flushed around the same 
> time at: startTime+1hour-delay again and again.
> We added a flush jittering time to randomize the flush time of each region, 
> so that they don't get flushed at around the same time. We had this feature 
> running in our 94.7 and 94.26 cluster. Recently, we upgrade to 1.2, found 
> this issue still there in 1.2. So we are porting this into 1.2 branch. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16023) Fastpath for the FIFO rpcscheduler

2016-06-14 Thread Hiroshi Ikeda (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15331123#comment-15331123
 ] 

Hiroshi Ikeda commented on HBASE-16023:
---

{code}
protected CallRunner getCallRunner() throws InterruptedException {
  // Get a callrunner if one in the Q.
  CallRunner cr = this.q.poll();
  if (cr == null) {
    // Else, if a fastPathHandlerStack present and no callrunner in Q, register
    // ourselves for the fastpath handoff done via fastPathHandlerStack.
    if (this.fastPathHandlerStack != null) {
      this.fastPathHandlerStack.push(this);
      this.semaphore.acquire();
{code}

Checking for no task and waiting on the semaphore is not atomic, and it is 
possible that a reader queues a task while a worker is about to wait on the 
semaphore.

There might be no simple way to implement this without a flaw.
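For context, the reader-side dispatch of such a fastpath presumably looks 
something like the following (a sketch, not the actual patch; the type and 
field names are assumptions). The race is between this poll of the handler 
stack returning null and the handler's q.poll() above returning null:

{code}
// Hypothetical reader-side dispatch for the fastpath (sketch only).
boolean dispatch(CallRunner task) {
  FastPathHandler h = fastPathHandlerStack.poll(); // a ready handler, if any
  if (h != null) {
    h.loadedCallRunner = task; // direct handoff, no queuing
    h.semaphore.release();     // wake the parked handler
    return true;
  }
  return q.offer(task);        // no ready handler: fall back to the queue
}
{code}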

> Fastpath for the FIFO rpcscheduler
> --
>
> Key: HBASE-16023
> URL: https://issues.apache.org/jira/browse/HBASE-16023
> Project: HBase
>  Issue Type: Sub-task
>  Components: Performance, rpc
>Affects Versions: 2.0.0, 1.3.0
>Reporter: stack
>Assignee: stack
> Fix For: 2.0.0, 1.3.0
>
> Attachments: HBASE-16023.branch-1.001.patch, 
> hits.nofifo.fifoplusfp.fifownofp.hacks.png
>
>
> This is an idea copied from kudu where we skip queuing a request if there is 
> a handler ready to go; we just do a direct handoff from reader to handler.
> Makes for close to a %20 improvement in random read workloadc testing moving 
> the bottleneck to HBASE-15716 and to returning the results.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15406) Split / merge switch left disabled after early termination of hbck

2016-06-14 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15331115#comment-15331115
 ] 

stack commented on HBASE-15406:
---

Sorry. Late to the reply.

My -1 comment was against a system that had a tool set a state that the master 
would clean up under certain conditions. I was against complicating the master 
role to deal with weird tooling behaviors. To keep things manageable and 
contained, I suggested letting the tool do the fix-up/clean-up.

Enis shows up with a general category under which we could gather a fleet of 
operation types that could impinge on master state while keeping it simple for 
the master. I'm in favor of his suggestion.

Sorry if I ended up making more work for you [~chenheng]

> Split / merge switch left disabled after early termination of hbck
> --
>
> Key: HBASE-15406
> URL: https://issues.apache.org/jira/browse/HBASE-15406
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Heng Chen
>  Labels: reviewed
> Fix For: 2.0.0, 1.3.0, 1.4.0
>
> Attachments: HBASE-15406.patch, HBASE-15406.v1.patch, 
> HBASE-15406_v1.patch, HBASE-15406_v2.patch, test.patch, wip.patch
>
>
> This was what I did on cluster with 1.4.0-SNAPSHOT built Thursday:
> Run 'hbase hbck -disableSplitAndMerge' on gateway node of the cluster
> Terminate hbck early
> Enter hbase shell where I observed:
> {code}
> hbase(main):001:0> splitormerge_enabled 'SPLIT'
> false
> 0 row(s) in 0.3280 seconds
> hbase(main):002:0> splitormerge_enabled 'MERGE'
> false
> 0 row(s) in 0.0070 seconds
> {code}
> Expectation is that the split / merge switches should be restored to default 
> value after hbck exits.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15862) Backup - Delete- Restore does not restore deleted data

2016-06-14 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15331104#comment-15331104
 ] 

Jerry He commented on HBASE-15862:
--

Ok. What you said above makes sense, [~vrodionov].

> Backup - Delete- Restore does not restore deleted data
> --
>
> Key: HBASE-15862
> URL: https://issues.apache.org/jira/browse/HBASE-15862
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>  Labels: backup
> Fix For: 2.0.0
>
> Attachments: HBASE-15862-v1.patch, HBASE-15862-v2.patch, 
> HBASE-15862-v3.patch
>
>
> This was discovered during testing. If we delete a row after a full backup 
> and immediately perform a restore, the deleted row still remains deleted. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16031) Documents about "hbase.replication" default value seems wrong

2016-06-14 Thread Heng Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Heng Chen updated HBASE-16031:
--
Attachment: HBASE-16031.patch

> Documents about "hbase.replication" default value seems wrong
> -
>
> Key: HBASE-16031
> URL: https://issues.apache.org/jira/browse/HBASE-16031
> Project: HBase
>  Issue Type: Bug
>Reporter: Heng Chen
> Attachments: HBASE-16031.patch
>
>
> {code}
> public static final String REPLICATION_ENABLE_KEY = "hbase.replication";
> public static final boolean REPLICATION_ENABLE_DEFAULT = true;
> {code}
> The code shows that the default value is true, but the documentation shows 
> the default value is false.
> {code}
> | hbase.replication
> | Whether replication is enabled or disabled on a given cluster
> | false
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16024) Revert HBASE-15406 from branch-1.3

2016-06-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15331081#comment-15331081
 ] 

Hadoop QA commented on HBASE-16024:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} rubocop {color} | {color:blue} 0m 0s 
{color} | {color:blue} rubocop was not available. {color} |
| {color:blue}0{color} | {color:blue} ruby-lint {color} | {color:blue} 0m 0s 
{color} | {color:blue} Ruby-lint was not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 
0s {color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 33s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 
48s {color} | {color:green} branch-1.3 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 3s 
{color} | {color:green} branch-1.3 passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 16s 
{color} | {color:green} branch-1.3 passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
54s {color} | {color:green} branch-1.3 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
50s {color} | {color:green} branch-1.3 passed {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 0m 
58s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 
21s {color} | {color:green} branch-1.3 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 38s 
{color} | {color:green} branch-1.3 passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 4s 
{color} | {color:green} branch-1.3 passed with JDK v1.7.0_79 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 8s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
31s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 52s 
{color} | {color:green} the patch passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 1m 52s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 52s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 19s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 1m 19s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 19s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 8m 
30s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
43s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
11m 21s {color} | {color:green} Patch does not cause any errors with Hadoop 
2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 0m 
50s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 8s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 34s 
{color} | {color:green} the patch passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 5s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 17s 
{color} | {color:green} hbase-protocol in the patch passed. {color} |
| {color:green}+1{color

[jira] [Created] (HBASE-16031) Documents about "hbase.replication" default value seems wrong

2016-06-14 Thread Heng Chen (JIRA)
Heng Chen created HBASE-16031:
-

 Summary: Documents about "hbase.replication" default value seems 
wrong
 Key: HBASE-16031
 URL: https://issues.apache.org/jira/browse/HBASE-16031
 Project: HBase
  Issue Type: Bug
Reporter: Heng Chen


{code}
public static final String REPLICATION_ENABLE_KEY = "hbase.replication";
public static final boolean REPLICATION_ENABLE_DEFAULT = true;
{code}

The code shows that the default value is true, but the documentation shows the 
default value is false.

{code}
| hbase.replication
| Whether replication is enabled or disabled on a given cluster
| false
{code}





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14331) a single callQueue related improvements

2016-06-14 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15331034#comment-15331034
 ] 

stack commented on HBASE-14331:
---

Do you want to close this issue then in favor of HBASE-14479?

The handoff from Reader to Handler is a perf killer. Running more threads than 
there are CPUs, at least in the read case where everything comes from cache, 
only makes us slower.

But the handoff from Reader to Handler is where we do request scheduling. 
Currently we have a scheduler that tries to background requests that come 
from long-running Scans in favor of other request types ('deadline'). It could 
be made to do more sophisticated scheduling, but we should probably put this 
aside and just take on the recent FB addition of AdaptiveLifoCoDelCallQueue as 
our default policy (see http://queue.acm.org/detail.cfm?id=2839461). It is a 
general heuristic that works well for a broad set of loadings.

So, where would we do scheduling if the Reader threads ran the request 
[~ikeda]? Thanks.
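Roughly, the CoDel-style policy from that article decides "overloaded" from 
observed queue wait times: if even the minimum wait seen over an interval 
exceeds a target, the queue is standing, and the scheduler flips to LIFO with 
shedding. A much-simplified sketch of the decision (not the FB class; the 
5ms/100ms values are the ones the article quotes, used here illustratively):

{code}
// Simplified CoDel-style overload check (sketch only).
class CoDelState {
  static final long TARGET_DELAY_MS = 5;
  static final long INTERVAL_MS = 100;
  long minDelaySeenMs = Long.MAX_VALUE;
  long intervalStartMs = System.currentTimeMillis();
  volatile boolean lifoMode = false;

  void onDequeue(long enqueueTimeMs) {
    long now = System.currentTimeMillis();
    minDelaySeenMs = Math.min(minDelaySeenMs, now - enqueueTimeMs);
    if (now - intervalStartMs >= INTERVAL_MS) {
      // Even the best-case wait exceeded the target: a standing queue.
      lifoMode = minDelaySeenMs > TARGET_DELAY_MS;
      minDelaySeenMs = Long.MAX_VALUE; // start a fresh interval
      intervalStartMs = now;
    }
  }
}
{code}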

> a single callQueue related improvements
> ---
>
> Key: HBASE-14331
> URL: https://issues.apache.org/jira/browse/HBASE-14331
> Project: HBase
>  Issue Type: Improvement
>  Components: IPC/RPC, Performance
>Reporter: Hiroshi Ikeda
>Assignee: Hiroshi Ikeda
> Attachments: BlockingQueuesPerformanceTestApp-output.pdf, 
> BlockingQueuesPerformanceTestApp-output.txt, 
> BlockingQueuesPerformanceTestApp.java, CallQueuePerformanceTestApp.java, 
> HBASE-14331-V2.patch, HBASE-14331-V3.patch, HBASE-14331-V4.patch, 
> HBASE-14331-V5.patch, HBASE-14331-V6.patch, HBASE-14331-V6.patch, 
> HBASE-14331.patch, HBASE-14331.patch, SemaphoreBasedBlockingQueue.java, 
> SemaphoreBasedLinkedBlockingQueue.java, 
> SemaphoreBasedPriorityBlockingQueue.java
>
>
> {{LinkedBlockingQueue}} separates locks well between the {{take}} method and 
> the {{put}} method, but not between takers, nor between putters. These 
> methods take their locks almost at the very beginning of their logic. 
> HBASE-11355 introduces multiple call-queues to reduce such possible 
> congestion, but I doubt that it is required to stick to {{BlockingQueue}}.
> There are other shortcomings to using {{BlockingQueue}}. When using 
> multiple queues, since {{BlockingQueue}} blocks threads, it is required to 
> prepare enough threads for each queue. It is possible that one queue is 
> starved for threads while threads are idle on another queue. 
> Even if you can tune parameters to avoid such situations, the tuning is not 
> trivial.
> I suggest using a single {{ConcurrentLinkedQueue}} with {{Semaphore}}.
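
A minimal sketch of that suggestion (the class and method names are 
illustrative, not from the attached SemaphoreBased*.java files): a lock-free 
{{ConcurrentLinkedQueue}} holds the calls, and a {{Semaphore}} makes takers 
block until an element is available.

{code}
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Semaphore;

class SemaphoreCallQueueSketch<E> {
  private final ConcurrentLinkedQueue<E> queue = new ConcurrentLinkedQueue<>();
  private final Semaphore available = new Semaphore(0);

  void put(E call) {
    queue.offer(call);   // lock-free; putters never contend on a lock
    available.release(); // one permit per queued element
  }

  E take() throws InterruptedException {
    available.acquire(); // blocks until some put() has released a permit
    return queue.poll(); // non-null: each permit maps to one queued element
  }
}
{code}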



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15971) Regression: Random Read/WorkloadC slower in 1.x than 0.98

2016-06-14 Thread Hiroshi Ikeda (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15331029#comment-15331029
 ] 

Hiroshi Ikeda commented on HBASE-15971:
---

I meant that LinkedTransferQueue implements BlockingQueue efficiently for 
simultaneous puts/takes. TransferQueue adds extra methods, but I forget the 
details.

Anyway, LinkedTransferQueue is unbounded, and we should do something about 
that, or tasks might be queued without limit, causing out-of-memory errors. 
Even if we could use semaphores for that control, ConcurrentLinkedQueue will 
be good enough after all, with a simpler implementation (and lighter 
overhead).
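
On the bounding concern, a hedged sketch (illustrative only, not from any 
attached file): a second semaphore can cap an otherwise unbounded queue, with 
putters acquiring a capacity permit and takers releasing one.

{code}
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Semaphore;

class BoundedQueueSketch<E> {
  private final ConcurrentLinkedQueue<E> queue = new ConcurrentLinkedQueue<>();
  private final Semaphore available = new Semaphore(0); // elements present
  private final Semaphore capacity;                     // free slots left

  BoundedQueueSketch(int maxCalls) {
    this.capacity = new Semaphore(maxCalls);
  }

  void put(E call) throws InterruptedException {
    capacity.acquire();  // blocks when the queue is full (could fail fast)
    queue.offer(call);
    available.release();
  }

  E take() throws InterruptedException {
    available.acquire();
    E e = queue.poll();
    capacity.release();  // frees one slot for a waiting putter
    return e;
  }
}
{code}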

> Regression: Random Read/WorkloadC slower in 1.x than 0.98
> -
>
> Key: HBASE-15971
> URL: https://issues.apache.org/jira/browse/HBASE-15971
> Project: HBase
>  Issue Type: Sub-task
>  Components: rpc
>Affects Versions: 2.0.0, 1.3.0
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Fix For: 2.0.0, 1.3.0
>
> Attachments: 098.hits.png, 098.png, HBASE-15971.branch-1.001.patch, 
> HBASE-15971.branch-1.002.patch, Screen Shot 2016-06-10 at 5.08.24 PM.png, 
> Screen Shot 2016-06-10 at 5.08.26 PM.png, branch-1.hits.png, branch-1.png, 
> flight_recording_10172402220203_28.branch-1.jfr, 
> flight_recording_10172402220203_29.09820.0.98.20.jfr, handlers.fp.png, 
> hits.fp.png, hits.patched1.0.vs.unpatched1.0.vs.098.png, run_ycsb.sh
>
>
> branch-1 is slower than 0.98 doing YCSB random read/workloadC. It seems to be 
> doing about 1/2 the throughput of 0.98.
> In branch-1, we have low handler occupancy compared to 0.98. Hacking in a 
> reader thread occupancy metric shows it is about the same in both. In the 
> parent issue, hacking out the scheduler, I am able to get branch-1 to go 3x 
> faster, so will dig in here.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15978) Netty API leaked into public API

2016-06-14 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15331018#comment-15331018
 ] 

Enis Soztutar commented on HBASE-15978:
---

How about using Java {{CompletableFuture}}? I don't remember whether we decided 
to make HBase 2.0 JDK 8 only, but we can make it so if needed. 
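
To make the suggestion concrete, a hedged sketch of what a Netty-free async 
signature could look like (the interface name is hypothetical; {{Get}} and 
{{Result}} are the standard HBase client classes):

{code}
import java.util.concurrent.CompletableFuture;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;

interface AsyncTableSketch {
  // Only JDK types appear in the signature, so no Netty release is pinned.
  CompletableFuture<Result> get(Get get);
}
{code}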

> Netty API leaked into public API
> 
>
> Key: HBASE-15978
> URL: https://issues.apache.org/jira/browse/HBASE-15978
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 2.0.0
>Reporter: Nick Dimiduk
>Assignee: Jurriaan Mous
>Priority: Blocker
> Attachments: HBASE-15978-guava.patch, HBASE-15978.patch
>
>
> Noticed our public 
> {{[client.Future|http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Future.html]}}
>  interface extends Netty's {{Future}}, which means our public API is bound to 
> a specific Netty API and release. IIRC we were minimizing our public-facing 
> surface area and asserting ownership over the whole of it so as to control 
> our compatibility; we've done this with Protobuf as well. Not sure if this 
> has made it back to other branches.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14331) a single callQueue related improvements

2016-06-14 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15331015#comment-15331015
 ] 

stack commented on HBASE-14331:
---

I'd commit these [~ikeda]

> a single callQueue related improvements
> ---
>
> Key: HBASE-14331
> URL: https://issues.apache.org/jira/browse/HBASE-14331
> Project: HBase
>  Issue Type: Improvement
>  Components: IPC/RPC, Performance
>Reporter: Hiroshi Ikeda
>Assignee: Hiroshi Ikeda
> Attachments: BlockingQueuesPerformanceTestApp-output.pdf, 
> BlockingQueuesPerformanceTestApp-output.txt, 
> BlockingQueuesPerformanceTestApp.java, CallQueuePerformanceTestApp.java, 
> HBASE-14331-V2.patch, HBASE-14331-V3.patch, HBASE-14331-V4.patch, 
> HBASE-14331-V5.patch, HBASE-14331-V6.patch, HBASE-14331-V6.patch, 
> HBASE-14331.patch, HBASE-14331.patch, SemaphoreBasedBlockingQueue.java, 
> SemaphoreBasedLinkedBlockingQueue.java, 
> SemaphoreBasedPriorityBlockingQueue.java
>
>
> {{LinkedBlockingQueue}} separates locks well between the {{take}} method and 
> the {{put}} method, but not between takers, nor between putters. These 
> methods take their locks almost at the very beginning of their logic. 
> HBASE-11355 introduces multiple call-queues to reduce such possible 
> congestion, but I doubt that it is required to stick to {{BlockingQueue}}.
> There are other shortcomings to using {{BlockingQueue}}. When using 
> multiple queues, since {{BlockingQueue}} blocks threads, it is required to 
> prepare enough threads for each queue. It is possible that one queue is 
> starved for threads while threads are idle on another queue. 
> Even if you can tune parameters to avoid such situations, the tuning is not 
> trivial.
> I suggest using a single {{ConcurrentLinkedQueue}} with {{Semaphore}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16024) Revert HBASE-15406 from branch-1.3

2016-06-14 Thread Mikhail Antonov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Antonov updated HBASE-16024:

Status: Patch Available  (was: Open)

> Revert HBASE-15406 from branch-1.3
> --
>
> Key: HBASE-16024
> URL: https://issues.apache.org/jira/browse/HBASE-16024
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.0
>Reporter: Mikhail Antonov
>Assignee: Mikhail Antonov
> Fix For: 1.3.0
>
> Attachments: HBASE-15406-reverted-from.branch-1.3.v1.patch, 
> HBASE-16024-branch-1.3.v2.patch
>
>
> As discussed there, this is a somewhat controversial change, and more thought 
> is probably needed around it to get it right.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16024) Revert HBASE-15406 from branch-1.3

2016-06-14 Thread Mikhail Antonov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Antonov updated HBASE-16024:

Attachment: HBASE-16024-branch-1.3.v2.patch

[~enis] restored the @Deprecated annotation in one place

> Revert HBASE-15406 from branch-1.3
> --
>
> Key: HBASE-16024
> URL: https://issues.apache.org/jira/browse/HBASE-16024
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.0
>Reporter: Mikhail Antonov
>Assignee: Mikhail Antonov
> Fix For: 1.3.0
>
> Attachments: HBASE-15406-reverted-from.branch-1.3.v1.patch, 
> HBASE-16024-branch-1.3.v2.patch
>
>
> As discussed there, this is a somewhat controversial change, and more thought 
> is probably needed around it to get it right.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16024) Revert HBASE-15406 from branch-1.3

2016-06-14 Thread Mikhail Antonov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Antonov updated HBASE-16024:

Status: Open  (was: Patch Available)

> Revert HBASE-15406 from branch-1.3
> --
>
> Key: HBASE-16024
> URL: https://issues.apache.org/jira/browse/HBASE-16024
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.0
>Reporter: Mikhail Antonov
>Assignee: Mikhail Antonov
> Fix For: 1.3.0
>
> Attachments: HBASE-15406-reverted-from.branch-1.3.v1.patch, 
> HBASE-16024-branch-1.3.v2.patch
>
>
> As discussed there, this is a somewhat controversial change, and more thought 
> is probably needed around it to get it right.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15950) Fix memstore size estimates to be more tighter

2016-06-14 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15331013#comment-15331013
 ] 

stack commented on HBASE-15950:
---

branch-1.3?

[~mantonov]?

> Fix memstore size estimates to be more tighter
> --
>
> Key: HBASE-15950
> URL: https://issues.apache.org/jira/browse/HBASE-15950
> Project: HBase
>  Issue Type: Bug
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
> Fix For: 2.0.0
>
> Attachments: Screen Shot 2016-06-02 at 8.48.27 PM.png, 
> hbase-15950-v0.patch, hbase-15950-v1.patch, hbase-15950-v2.branch-1.patch, 
> hbase-15950-v2.patch
>
>
> While testing something else, I was loading a region with a lot of data, 
> writing 30M cells in 1M rows, with 1 byte values. 
> The memstore size turned out to be estimated as 4.5GB, while JFR profiling 
> shows that we are using 2.8GB for all the objects in the 
> memstore (KV + KV byte[] + CSLM.Node + CSLM.Index). 
> This obviously means that there is room in the write cache that we are not 
> using effectively. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16017) HBase TableOutputFormat has connection leak in getRecordWriter

2016-06-14 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-16017:
---
 Hadoop Flags: Reviewed
Fix Version/s: 2.0.0

> HBase TableOutputFormat has connection leak in getRecordWriter
> --
>
> Key: HBASE-16017
> URL: https://issues.apache.org/jira/browse/HBASE-16017
> Project: HBase
>  Issue Type: Bug
>Reporter: Zhan Zhang
>Assignee: Zhan Zhang
> Fix For: 2.0.0
>
> Attachments: HBASE-16017-1.patch
>
>
> Currently getRecordWriter will not release the connection until the JVM 
> terminates, which is not a correct assumption given that the function may be 
> invoked many times in the same JVM lifecycle. Inside of mapreduce, the issue 
> has already been fixed. 
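
A hedged sketch of the fix idea (not the attached HBASE-16017-1.patch; the 
{{mutator}} and {{connection}} fields are assumptions): give each RecordWriter 
its own Connection and release it in close() rather than holding it for the 
life of the JVM.

{code}
@Override
public void close(TaskAttemptContext context) throws IOException {
  try {
    mutator.close();     // flush any buffered Puts for this writer
  } finally {
    connection.close();  // release the per-writer connection
  }
}
{code}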



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16017) HBase TableOutputFormat has connection leak in getRecordWriter

2016-06-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15331010#comment-15331010
 ] 

Hadoop QA commented on HBASE-16017:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 
0s {color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 
19s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 49s 
{color} | {color:green} master passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 35s 
{color} | {color:green} master passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
0s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
21s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
14s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 36s 
{color} | {color:green} master passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 39s 
{color} | {color:green} master passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
53s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 59s 
{color} | {color:green} the patch passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 59s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 38s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
21s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
31m 49s {color} | {color:green} Patch does not cause any errors with Hadoop 
2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
38s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 36s 
{color} | {color:green} the patch passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 41s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 104m 21s 
{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
18s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 158m 15s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12810076/HBASE-16017-1.patch |
| JIRA Issue | HBASE-16017 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  hadoopcheck  
hbaseanti  checkstyle  compile  |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / fa50d45 |
| Default Java | 1.7.0_79 |
| Multi-JDK versions |  /home/jenkins/tools/java/jdk1.8.0:1.8.0 
/usr/local/jenkins/java/jdk1.7.0_79:1.7.0_7

[jira] [Updated] (HBASE-15950) Fix memstore size estimates to be more tighter

2016-06-14 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-15950:
--
Attachment: hbase-15950-v2.branch-1.patch

branch-1 patch. 

> Fix memstore size estimates to be more tighter
> --
>
> Key: HBASE-15950
> URL: https://issues.apache.org/jira/browse/HBASE-15950
> Project: HBase
>  Issue Type: Bug
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
> Fix For: 2.0.0
>
> Attachments: Screen Shot 2016-06-02 at 8.48.27 PM.png, 
> hbase-15950-v0.patch, hbase-15950-v1.patch, hbase-15950-v2.branch-1.patch, 
> hbase-15950-v2.patch
>
>
> While testing something else, I was loading a region with a lot of data, 
> writing 30M cells in 1M rows, with 1 byte values. 
> The memstore size turned out to be estimated as 4.5GB, while JFR profiling 
> shows that we are using 2.8GB for all the objects in the 
> memstore (KV + KV byte[] + CSLM.Node + CSLM.Index). 
> This obviously means that there is room in the write cache that we are not 
> using effectively. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15335) Add composite key support in row key

2016-06-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15330986#comment-15330986
 ] 

Hadoop QA commented on HBASE-15335:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 
0s {color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 1m 13s 
{color} | {color:red} root in master failed. {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 53s 
{color} | {color:green} master passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 9s 
{color} | {color:green} master passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
13s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
16s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 0m 
44s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
45s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 20s 
{color} | {color:green} master passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 12s 
{color} | {color:green} master passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} scaladoc {color} | {color:green} 0m 
33s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
17s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 2s 
{color} | {color:green} the patch passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 1m 2s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 2s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} scalac {color} | {color:green} 1m 2s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 11s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 1m 11s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 11s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} scalac {color} | {color:green} 1m 11s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
12s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
15s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red} 0m 52s 
{color} | {color:red} Patch causes 15 errors with Hadoop v2.4.0. {color} |
| {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red} 1m 43s 
{color} | {color:red} Patch causes 15 errors with Hadoop v2.4.1. {color} |
| {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red} 2m 36s 
{color} | {color:red} Patch causes 15 errors with Hadoop v2.5.0. {color} |
| {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red} 3m 29s 
{color} | {color:red} Patch causes 15 errors with Hadoop v2.5.1. {color} |
| {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red} 4m 23s 
{color} | {color:red} Patch causes 15 errors with Hadoop v2.5.2. {color} |
| {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red} 5m 17s 
{color} | {color:red} Patch causes 15 errors with Hadoop v2.6.1. {color} |
| {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red} 6m 10s 
{color} | {color:red} Patch causes 15 errors with Hadoop v2.6.2. {color} |
| {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red} 7m 3s 
{color} |

[jira] [Commented] (HBASE-3727) MultiHFileOutputFormat

2016-06-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15330982#comment-15330982
 ] 

Hadoop QA commented on HBASE-3727:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 
0s {color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 1m 15s 
{color} | {color:red} root in master failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 43s 
{color} | {color:red} hbase-server in master failed with JDK v1.8.0. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 30s 
{color} | {color:red} hbase-server in master failed with JDK v1.7.0_79. {color} 
|
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
57s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
15s {color} | {color:green} master passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 23s 
{color} | {color:red} hbase-server in master failed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s 
{color} | {color:green} master passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 34s 
{color} | {color:green} master passed with JDK v1.7.0_79 {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 30s 
{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 37s 
{color} | {color:red} hbase-server in the patch failed with JDK v1.8.0. {color} 
|
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 37s {color} 
| {color:red} hbase-server in the patch failed with JDK v1.8.0. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 30s 
{color} | {color:red} hbase-server in the patch failed with JDK v1.7.0_79. 
{color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 30s {color} 
| {color:red} hbase-server in the patch failed with JDK v1.7.0_79. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
53s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red} 0m 55s 
{color} | {color:red} Patch causes 15 errors with Hadoop v2.4.0. {color} |
| {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red} 1m 48s 
{color} | {color:red} Patch causes 15 errors with Hadoop v2.4.1. {color} |
| {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red} 2m 43s 
{color} | {color:red} Patch causes 15 errors with Hadoop v2.5.0. {color} |
| {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red} 3m 36s 
{color} | {color:red} Patch causes 15 errors with Hadoop v2.5.1. {color} |
| {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red} 4m 31s 
{color} | {color:red} Patch causes 15 errors with Hadoop v2.5.2. {color} |
| {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red} 5m 26s 
{color} | {color:red} Patch causes 15 errors with Hadoop v2.6.1. {color} |
| {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red} 6m 21s 
{color} | {color:red} Patch causes 15 errors with Hadoop v2.6.2. {color} |
| {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red} 7m 15s 
{color} | {color:red} Patch causes 15 errors with Hadoop v2.6.3. {color} |
| {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red} 8m 9s 
{color} | {color:red} Patch causes 15 errors with Hadoop v2.7.1. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 24s 
{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s 
{color} | {color:green} the patch passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {colo

[jira] [Commented] (HBASE-16026) Master UI should display status of additional ZK switches

2016-06-14 Thread Gary Helmling (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15330980#comment-15330980
 ] 

Gary Helmling commented on HBASE-16026:
---

+1 with those changes

> Master UI should display status of additional ZK switches
> -
>
> Key: HBASE-16026
> URL: https://issues.apache.org/jira/browse/HBASE-16026
> Project: HBase
>  Issue Type: Bug
>Reporter: Mikhail Antonov
>Assignee: Mikhail Antonov
> Attachments: HBASE-16026-branch-1.3.patch
>
>
> Currently in the warning section we show warnings for a bad JVM version, 
> master not initialized OR catalog janitor disabled, and the balancer being 
> disabled.
> We should also have status for the split / merge switches (so that if someone 
> ran hbck and aborted it while the switches are in a bad state, we can see 
> that) and possibly the normalizer.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16026) Master UI should display status of additional ZK switches

2016-06-14 Thread Gary Helmling (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15330969#comment-15330969
 ] 

Gary Helmling commented on HBASE-16026:
---

A couple minor suggestions for the wording:

"Region splits are disabled. This may be the result of HBCK aborting while 
running in repair mode. Manually enable splits from the HBase shell, or re-run 
HBCK in repair mode."

"Region merges are disabled. This may be the result of HBCK aborting while 
running in repair mode. Manually enable merges from the HBase shell, or re-run 
HBCK in repair mode."

> Master UI should display status of additional ZK switches
> -
>
> Key: HBASE-16026
> URL: https://issues.apache.org/jira/browse/HBASE-16026
> Project: HBase
>  Issue Type: Bug
>Reporter: Mikhail Antonov
>Assignee: Mikhail Antonov
> Attachments: HBASE-16026-branch-1.3.patch
>
>
> Currently in the warning section we show warnings for a bad JVM version, 
> master not initialized OR catalog janitor disabled, and the balancer being 
> disabled.
> We should also have status for the split / merge switches (so that if someone 
> ran hbck and aborted it while the switches are in a bad state, we can see 
> that) and possibly the normalizer.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16030) All Regions are flushed at about same time when MEMSTORE_PERIODIC_FLUSH is on, causing flush spike

2016-06-14 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-16030:
--
Fix Version/s: (was: 1.2.1)
   1.2.2
   1.3.0
   2.0.0

> All Regions are flushed at about same time when MEMSTORE_PERIODIC_FLUSH is 
> on, causing flush spike
> --
>
> Key: HBASE-16030
> URL: https://issues.apache.org/jira/browse/HBASE-16030
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 1.2.1
>Reporter: Tianying Chang
>Assignee: Tianying Chang
> Fix For: 2.0.0, 1.3.0, 1.2.2
>
> Attachments: hbase-16030.patch
>
>
> In our production cluster, we observed that memstore flushes spike every hour 
> for all regions/RS. (We use the default memstore periodic flush time of 1 
> hour.) 
> This will happen when two conditions are met: 
> 1. the memstore does not have enough data to be flushed before the 1 hour 
> limit is reached;
> 2. all regions are opened around the same time (e.g. all RS are started at 
> the same time when starting a cluster). 
> With the above two conditions, all the regions will be flushed around the 
> same time, at startTime+1hour-delay, again and again.
> We added a flush jittering time to randomize the flush time of each region, 
> so that they don't get flushed at around the same time. We had this feature 
> running in our 94.7 and 94.26 clusters. Recently, we upgraded to 1.2 and 
> found this issue is still there, so we are porting this into the 1.2 branch. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16030) All Regions are flushed at about same time when MEMSTORE_PERIODIC_FLUSH is on, causing flush spike

2016-06-14 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15330968#comment-15330968
 ] 

Enis Soztutar commented on HBASE-16030:
---

We should already be doing jitter in PeriodicMemstoreFlusher: 
{code}
if (((HRegion)r).shouldFlush(whyFlush)) {
  FlushRequester requester = server.getFlushRequester();
  if (requester != null) {
    long randomDelay = RandomUtils.nextInt(RANGE_OF_DELAY) + MIN_DELAY_TIME;
    LOG.info(getName() + " requesting flush of " +
        r.getRegionInfo().getRegionNameAsString() + " because " +
        whyFlush.toString() +
        " after random delay " + randomDelay + "ms");
    // Throttle the flushes by putting a delay. If we don't throttle,
    // and there is a balanced write-load on the regions in a table,
    // we might end up overwhelming the filesystem with too many flushes
    // at once.
    requester.requestDelayedFlush(r, randomDelay, false);
  }
}
{code}

You mean the delayed flush with jitter is not working? The range of delay is 5 
mins, so a 2.5 min average jitter is not enough? 

> All Regions are flushed at about same time when MEMSTORE_PERIODIC_FLUSH is 
> on, causing flush spike
> --
>
> Key: HBASE-16030
> URL: https://issues.apache.org/jira/browse/HBASE-16030
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 1.2.1
>Reporter: Tianying Chang
>Assignee: Tianying Chang
> Fix For: 2.0.0, 1.3.0, 1.2.2
>
> Attachments: hbase-16030.patch
>
>
> In our production cluster, we observed that memstore flushes spike every hour 
> for all regions/RS. (We use the default memstore periodic flush time of 1 
> hour.) 
> This will happen when two conditions are met: 
> 1. the memstore does not have enough data to be flushed before the 1 hour 
> limit is reached;
> 2. all regions are opened around the same time (e.g. all RS are started at 
> the same time when starting a cluster). 
> With the above two conditions, all the regions will be flushed around the 
> same time, at startTime+1hour-delay, again and again.
> We added a flush jittering time to randomize the flush time of each region, 
> so that they don't get flushed at around the same time. We had this feature 
> running in our 94.7 and 94.26 clusters. Recently, we upgraded to 1.2 and 
> found this issue is still there, so we are porting this into the 1.2 branch. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16030) All Regions are flushed at about same time when MEMSTORE_PERIODIC_FLUSH is on, causing flush spike

2016-06-14 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15330967#comment-15330967
 ] 

Enis Soztutar commented on HBASE-16030:
---

We should already be doing jitter in PeriodicMemstoreFlusher: 
{code}
if (((HRegion)r).shouldFlush(whyFlush)) {
  FlushRequester requester = server.getFlushRequester();
  if (requester != null) {
    long randomDelay = RandomUtils.nextInt(RANGE_OF_DELAY) + MIN_DELAY_TIME;
    LOG.info(getName() + " requesting flush of " +
        r.getRegionInfo().getRegionNameAsString() + " because " +
        whyFlush.toString() +
        " after random delay " + randomDelay + "ms");
    // Throttle the flushes by putting a delay. If we don't throttle,
    // and there is a balanced write-load on the regions in a table,
    // we might end up overwhelming the filesystem with too many flushes
    // at once.
    requester.requestDelayedFlush(r, randomDelay, false);
  }
}
{code}

You mean the delayed flush with jitter is not working? The range of delay is 5 
mins, so a 2.5 min average jitter is not enough? 

> All Regions are flushed at about same time when MEMSTORE_PERIODIC_FLUSH is 
> on, causing flush spike
> --
>
> Key: HBASE-16030
> URL: https://issues.apache.org/jira/browse/HBASE-16030
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 1.2.1
>Reporter: Tianying Chang
>Assignee: Tianying Chang
> Fix For: 1.2.1
>
> Attachments: hbase-16030.patch
>
>
> In our production cluster, we observed that memstore flushes spike every hour 
> for all regions/RS. (We use the default memstore periodic flush time of 1 
> hour.) 
> This will happen when two conditions are met: 
> 1. the memstore does not have enough data to be flushed before the 1 hour 
> limit is reached;
> 2. all regions are opened around the same time (e.g. all RS are started at 
> the same time when starting a cluster). 
> With the above two conditions, all the regions will be flushed around the 
> same time, at startTime+1hour-delay, again and again.
> We added a flush jittering time to randomize the flush time of each region, 
> so that they don't get flushed at around the same time. We had this feature 
> running in our 94.7 and 94.26 clusters. Recently, we upgraded to 1.2 and 
> found this issue is still there, so we are porting this into the 1.2 branch. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (HBASE-16030) All Regions are flushed at about same time when MEMSTORE_PERIODIC_FLUSH is on, causing flush spike

2016-06-14 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-16030:
--
Comment: was deleted

(was: We should already be doing jitter in PeriodicMemstoreFlusher: 
{code}
if (((HRegion)r).shouldFlush(whyFlush)) {
  FlushRequester requester = server.getFlushRequester();
  if (requester != null) {
    long randomDelay = RandomUtils.nextInt(RANGE_OF_DELAY) + MIN_DELAY_TIME;
    LOG.info(getName() + " requesting flush of " +
        r.getRegionInfo().getRegionNameAsString() + " because " +
        whyFlush.toString() +
        " after random delay " + randomDelay + "ms");
    // Throttle the flushes by putting a delay. If we don't throttle,
    // and there is a balanced write-load on the regions in a table,
    // we might end up overwhelming the filesystem with too many flushes
    // at once.
    requester.requestDelayedFlush(r, randomDelay, false);
  }
}
{code}

You mean the delayed flush with jitter is not working? The range of delay is 5 
mins, so a 2.5 min average jitter is not enough? )

> All Regions are flushed at about same time when MEMSTORE_PERIODIC_FLUSH is 
> on, causing flush spike
> --
>
> Key: HBASE-16030
> URL: https://issues.apache.org/jira/browse/HBASE-16030
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 1.2.1
>Reporter: Tianying Chang
>Assignee: Tianying Chang
> Fix For: 2.0.0, 1.3.0, 1.2.2
>
> Attachments: hbase-16030.patch
>
>
> In our production cluster, we observed that memstore flushes spike every hour 
> for all regions/RS. (We use the default memstore periodic flush time of 1 
> hour.) 
> This will happen when two conditions are met: 
> 1. the memstore does not have enough data to be flushed before the 1 hour 
> limit is reached;
> 2. all regions are opened around the same time (e.g. all RS are started at 
> the same time when starting a cluster). 
> With the above two conditions, all the regions will be flushed around the 
> same time, at startTime+1hour-delay, again and again.
> We added a flush jittering time to randomize the flush time of each region, 
> so that they don't get flushed at around the same time. We had this feature 
> running in our 94.7 and 94.26 clusters. Recently, we upgraded to 1.2 and 
> found this issue is still there, so we are porting this into the 1.2 branch. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-9393) Hbase does not closing a closed socket resulting in many CLOSE_WAIT

2016-06-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15330965#comment-15330965
 ] 

Hadoop QA commented on HBASE-9393:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 
0s {color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 
45s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 51s 
{color} | {color:green} master passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 36s 
{color} | {color:green} master passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
54s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
15s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 2s 
{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 42s 
{color} | {color:green} master passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 37s 
{color} | {color:green} master passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
50s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 53s 
{color} | {color:green} the patch passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 53s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 36s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
15s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
16s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
27m 2s {color} | {color:green} Patch does not cause any errors with Hadoop 
2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
25s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 41s 
{color} | {color:green} the patch passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 36s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 90m 26s 
{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
20s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 135m 29s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12794484/HBASE-9393.v15.patch |
| JIRA Issue | HBASE-9393 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  hadoopcheck  
hbaseanti  checkstyle  compile  |
| uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / b69c77a |
| Default Java | 1.7.0_79 |
| Multi-JDK versions |  /home/jenkins/tools/java/jdk1.8.0:1.8.0 
/usr/local/jenkins/java/jdk1.7.0_79:1.7.0_79 |
|

[jira] [Commented] (HBASE-15967) Metric for active ipc Readers and make default fraction of cpu count

2016-06-14 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15330955#comment-15330955
 ] 

stack commented on HBASE-15967:
---

I am interested in pursuing this idea, Hiroshi. If we handle the entire request 
on the Reader thread, we go faster. Let's take it up over in HBASE-14479.

> Metric for active ipc Readers and make default fraction of cpu count
> 
>
> Key: HBASE-15967
> URL: https://issues.apache.org/jira/browse/HBASE-15967
> Project: HBase
>  Issue Type: Sub-task
>Reporter: stack
>Assignee: stack
> Attachments: HBASE-15967.master.001.patch
>
>
> Our ipc Readers are hard coded at 10 regardless since . Running w/ fewer 
> Readers, we go faster (e.g. 12 Readers has us doing 135k with workloadc and 
> 6 Readers has us doing 145k), but it is hard to tell what count of Readers is 
> needed since there is no metric.
> This issue changes Readers to be 1/4 the installed CPUs or 8, whichever is 
> the minimum, and then adds a new hbase.regionserver.ipc.runningReaders metric 
> so you have a chance of seeing what's needed.
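
The sizing rule described above, as a hedged one-liner (the variable name and 
the floor of 1 are my assumptions, not the patch):

{code}
// 1/4 of the installed CPUs, capped at 8, with an assumed floor of 1 reader.
int readerCount = Math.max(1,
    Math.min(8, Runtime.getRuntime().availableProcessors() / 4));
{code}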



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14331) a single callQueue related improvements

2016-06-14 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-14331:
--
Priority: Major  (was: Minor)

> a single callQueue related improvements
> ---
>
> Key: HBASE-14331
> URL: https://issues.apache.org/jira/browse/HBASE-14331
> Project: HBase
>  Issue Type: Improvement
>  Components: IPC/RPC, Performance
>Reporter: Hiroshi Ikeda
>Assignee: Hiroshi Ikeda
> Attachments: BlockingQueuesPerformanceTestApp-output.pdf, 
> BlockingQueuesPerformanceTestApp-output.txt, 
> BlockingQueuesPerformanceTestApp.java, CallQueuePerformanceTestApp.java, 
> HBASE-14331-V2.patch, HBASE-14331-V3.patch, HBASE-14331-V4.patch, 
> HBASE-14331-V5.patch, HBASE-14331-V6.patch, HBASE-14331-V6.patch, 
> HBASE-14331.patch, HBASE-14331.patch, SemaphoreBasedBlockingQueue.java, 
> SemaphoreBasedLinkedBlockingQueue.java, 
> SemaphoreBasedPriorityBlockingQueue.java
>
>
> {{LinkedBlockingQueue}} separates locks well between the {{take}} method and 
> the {{put}} method, but not between takers, nor between putters. These 
> methods take their locks almost at the very beginning of their logic. 
> HBASE-11355 introduces multiple call-queues to reduce such possible 
> congestion, but I doubt that it is required to stick to {{BlockingQueue}}.
> There are other shortcomings to using {{BlockingQueue}}. When using 
> multiple queues, since {{BlockingQueue}} blocks threads, it is required to 
> prepare enough threads for each queue. It is possible that one queue is 
> starved for threads while threads are idle on another queue. 
> Even if you can tune parameters to avoid such situations, the tuning is not 
> trivial.
> I suggest using a single {{ConcurrentLinkedQueue}} with {{Semaphore}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16023) Fastpath for the FIFO rpcscheduler

2016-06-14 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-16023:
--
Component/s: rpc

> Fastpath for the FIFO rpcscheduler
> --
>
> Key: HBASE-16023
> URL: https://issues.apache.org/jira/browse/HBASE-16023
> Project: HBase
>  Issue Type: Sub-task
>  Components: Performance, rpc
>Affects Versions: 2.0.0, 1.3.0
>Reporter: stack
>Assignee: stack
> Fix For: 2.0.0, 1.3.0
>
> Attachments: HBASE-16023.branch-1.001.patch, 
> hits.nofifo.fifoplusfp.fifownofp.hacks.png
>
>
> This is an idea copied from Kudu where we skip queuing a request if there is 
> a handler ready to go; we just do a direct handoff from reader to handler.
> Makes for close to a 20% improvement in random read workloadc testing, moving 
> the bottleneck to HBASE-15716 and to returning the results.
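
A hedged sketch of the fastpath handoff ({{Handler}}, {{idleHandlers}}, and 
{{handOff}} are illustrative names, not the attached patch; {{CallRunner}} is 
the real HBase call wrapper): try to hand the call straight to a parked 
handler and only fall back to the queue when none is ready.

{code}
boolean dispatch(CallRunner task) {
  Handler idle = idleHandlers.poll();  // handlers parked and waiting for work
  if (idle != null) {
    idle.handOff(task);                // direct reader-to-handler handoff
    return true;
  }
  return callQueue.offer(task);        // normal path: queue for a busy handler
}
{code}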



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15971) Regression: Random Read/WorkloadC slower in 1.x than 0.98

2016-06-14 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15330952#comment-15330952
 ] 

stack commented on HBASE-15971:
---

[~ikeda] Do you mean use LinkedTransferQueue#transfer? That will block the 
Reader though, right? We don't want to do that. Would be interested in your 
input on HBASE-16023

> Regression: Random Read/WorkloadC slower in 1.x than 0.98
> -
>
> Key: HBASE-15971
> URL: https://issues.apache.org/jira/browse/HBASE-15971
> Project: HBase
>  Issue Type: Sub-task
>  Components: rpc
>Affects Versions: 2.0.0, 1.3.0
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Fix For: 2.0.0, 1.3.0
>
> Attachments: 098.hits.png, 098.png, HBASE-15971.branch-1.001.patch, 
> HBASE-15971.branch-1.002.patch, Screen Shot 2016-06-10 at 5.08.24 PM.png, 
> Screen Shot 2016-06-10 at 5.08.26 PM.png, branch-1.hits.png, branch-1.png, 
> flight_recording_10172402220203_28.branch-1.jfr, 
> flight_recording_10172402220203_29.09820.0.98.20.jfr, handlers.fp.png, 
> hits.fp.png, hits.patched1.0.vs.unpatched1.0.vs.098.png, run_ycsb.sh
>
>
> branch-1 is slower than 0.98 doing YCSB random read/workloadC. It seems to be 
> doing about 1/2 the throughput of 0.98.
> In branch-1, we have low handler occupancy compared to 0.98. Hacking in a 
> reader thread occupancy metric shows it is about the same in both. In the 
> parent issue, hacking out the scheduler, I am able to get branch-1 to go 3x 
> faster, so will dig in here.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16023) Fastpath for the FIFO rpcscheduler

2016-06-14 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-16023:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 1.3.0
   2.0.0
 Release Note: Adds a 'fastpath' when using the default FIFO rpc scheduler 
('fifo'). Does a direct handoff from Reader thread to Handler if there is one 
ready and willing. Will shine best under a high random-read workload (YCSB 
workloadc, for instance).
   Status: Resolved  (was: Patch Available)

Pushed to branch-1.3+. Thanks for the review, [~mantonov].

> Fastpath for the FIFO rpcscheduler
> --
>
> Key: HBASE-16023
> URL: https://issues.apache.org/jira/browse/HBASE-16023
> Project: HBase
>  Issue Type: Sub-task
>  Components: Performance
>Affects Versions: 2.0.0, 1.3.0
>Reporter: stack
>Assignee: stack
> Fix For: 2.0.0, 1.3.0
>
> Attachments: HBASE-16023.branch-1.001.patch, 
> hits.nofifo.fifoplusfp.fifownofp.hacks.png
>
>
> This is an idea copied from Kudu where we skip queuing a request if there is 
> a handler ready to go; we just do a direct handoff from reader to handler.
> Makes for close to a 20% improvement in random read workloadc testing, moving 
> the bottleneck to HBASE-15716 and to returning the results.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16030) All Regions are flushed at about same time when MEMSTORE_PERIODIC_FLUSH is on, causing flush spike

2016-06-14 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15330943#comment-15330943
 ] 

Elliott Clark commented on HBASE-16030:
---

* Use the thread-local random, please.
* Where are you resetting lastFlushTime?
* This can cause a region to flush half as often as others. That amount of 
jitter seems unreasonably high; see the sketch below.
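
For reference, a hedged sketch of jitter along those lines (the bound and the 
{{lastFlushTimeMs}} variable are assumptions, not the attached patch):

{code}
import java.util.concurrent.ThreadLocalRandom;

// Offset each region's periodic flush deadline by a small random amount so
// regions opened together don't all flush together. A bound much smaller
// than the flush interval avoids the halved-flush-rate problem noted above.
long flushIntervalMs = 3600_000L;                 // default 1 hour
long jitterMs = ThreadLocalRandom.current().nextLong(flushIntervalMs / 20);
long nextFlushDeadline = lastFlushTimeMs + flushIntervalMs + jitterMs;
{code}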

> All Regions are flushed at about same time when MEMSTORE_PERIODIC_FLUSH is 
> on, causing flush spike
> --
>
> Key: HBASE-16030
> URL: https://issues.apache.org/jira/browse/HBASE-16030
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 1.2.1
>Reporter: Tianying Chang
>Assignee: Tianying Chang
> Fix For: 1.2.1
>
> Attachments: hbase-16030.patch
>
>
> In our production cluster, we observed that memstore flushes spike every hour 
> for all regions/RS. (We use the default memstore periodic flush time of 1 
> hour.) 
> This will happen when two conditions are met: 
> 1. the memstore does not have enough data to be flushed before the 1 hour 
> limit is reached;
> 2. all regions are opened around the same time (e.g. all RS are started at 
> the same time when starting a cluster). 
> With the above two conditions, all the regions will be flushed around the 
> same time, at startTime+1hour-delay, again and again.
> We added a flush jittering time to randomize the flush time of each region, 
> so that they don't get flushed at around the same time. We had this feature 
> running in our 94.7 and 94.26 clusters. Recently, we upgraded to 1.2 and 
> found this issue is still there, so we are porting this into the 1.2 branch. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15971) Regression: Random Read/WorkloadC slower in 1.x than 0.98

2016-06-14 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15330929#comment-15330929
 ] 

stack commented on HBASE-15971:
---

If you don't care about ordering, you could maybe make use of the fastpath over 
in HBASE-16023 [~mantonov]


> Regression: Random Read/WorkloadC slower in 1.x than 0.98
> -
>
> Key: HBASE-15971
> URL: https://issues.apache.org/jira/browse/HBASE-15971
> Project: HBase
>  Issue Type: Sub-task
>  Components: rpc
>Affects Versions: 2.0.0, 1.3.0
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Fix For: 2.0.0, 1.3.0
>
> Attachments: 098.hits.png, 098.png, HBASE-15971.branch-1.001.patch, 
> HBASE-15971.branch-1.002.patch, Screen Shot 2016-06-10 at 5.08.24 PM.png, 
> Screen Shot 2016-06-10 at 5.08.26 PM.png, branch-1.hits.png, branch-1.png, 
> flight_recording_10172402220203_28.branch-1.jfr, 
> flight_recording_10172402220203_29.09820.0.98.20.jfr, handlers.fp.png, 
> hits.fp.png, hits.patched1.0.vs.unpatched1.0.vs.098.png, run_ycsb.sh
>
>
> branch-1 is slower than 0.98 doing YCSB random read/workloadC. It seems to be 
> doing about 1/2 the throughput of 0.98.
> In branch-1, we have low handler occupancy compared to 0.98. Hacking in a 
> reader thread occupancy metric shows it is about the same in both. In the 
> parent issue, hacking out the scheduler, I am able to get branch-1 to go 3x 
> faster, so will dig in here.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16026) Master UI should display status of additional ZK switches

2016-06-14 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15330922#comment-15330922
 ] 

Mikhail Antonov commented on HBASE-16026:
-

(tested on local cluster)

> Master UI should display status of additional ZK switches
> -
>
> Key: HBASE-16026
> URL: https://issues.apache.org/jira/browse/HBASE-16026
> Project: HBase
>  Issue Type: Bug
>Reporter: Mikhail Antonov
>Assignee: Mikhail Antonov
> Attachments: HBASE-16026-branch-1.3.patch
>
>
> Currently in the warning section we show warnings for a bad JVM version, 
> master not initialized OR catalog janitor disabled, and the balancer being 
> disabled.
> We should also have status for the split / merge switches (so that if someone 
> ran hbck and aborted it while the switches are in a bad state, we can see 
> that) and possibly the normalizer.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-15971) Regression: Random Read/WorkloadC slower in 1.x than 0.98

2016-06-14 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-15971:
--
   Resolution: Fixed
Fix Version/s: 1.3.0
   2.0.0
   Status: Resolved  (was: Patch Available)

Pushed to branch-1.3+. The test failure looks unrelated and the test passes 
locally. Will revert if it shows up again. Thanks for the reviews.

> Regression: Random Read/WorkloadC slower in 1.x than 0.98
> -
>
> Key: HBASE-15971
> URL: https://issues.apache.org/jira/browse/HBASE-15971
> Project: HBase
>  Issue Type: Sub-task
>  Components: rpc
>Affects Versions: 2.0.0, 1.3.0
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Fix For: 2.0.0, 1.3.0
>
> Attachments: 098.hits.png, 098.png, HBASE-15971.branch-1.001.patch, 
> HBASE-15971.branch-1.002.patch, Screen Shot 2016-06-10 at 5.08.24 PM.png, 
> Screen Shot 2016-06-10 at 5.08.26 PM.png, branch-1.hits.png, branch-1.png, 
> flight_recording_10172402220203_28.branch-1.jfr, 
> flight_recording_10172402220203_29.09820.0.98.20.jfr, handlers.fp.png, 
> hits.fp.png, hits.patched1.0.vs.unpatched1.0.vs.098.png, run_ycsb.sh
>
>
> branch-1 is slower than 0.98 doing YCSB random read/workloadC. It seems to be 
> doing about 1/2 the throughput of 0.98.
> In branch-1, we have low handler occupancy compared to 0.98. Hacking in a 
> reader thread occupancy metric, it is about the same in both. In the parent 
> issue, hacking out the scheduler, I was able to get branch-1 to go 3x faster, 
> so I will dig in here.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16018) Better documentation of ReplicationPeers

2016-06-14 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15330916#comment-15330916
 ] 

Enis Soztutar commented on HBASE-16018:
---

Why cache? ReplicationPeers is not caching anything AFAIK. 
{code}
+  public boolean getStatusOfPeerFromCache(String id) {
{code}
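A hedged sketch of the kind of javadoc/naming the comment is asking for; the signature below is illustrative, not the committed interface:

{code}
/**
 * Returns whether replication to the peer with the given id is enabled.
 * No caching is implied by the name; the ZooKeeper-backed implementation
 * may consult its tracked state or ZooKeeper directly.
 */
boolean getStatusOfPeer(String id);
{code}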

> Better documentation of ReplicationPeers
> 
>
> Key: HBASE-16018
> URL: https://issues.apache.org/jira/browse/HBASE-16018
> Project: HBase
>  Issue Type: Improvement
>Reporter: Joseph
>Assignee: Joseph
>Priority: Minor
> Attachments: HBASE-16018.patch
>
>
> Some of the ReplicationPeers interface's methods are not documented and are 
> tied to a ZooKeeper implementation of ReplicationPeers. Also some method 
> names are a little confusing.
> Review board at: https://reviews.apache.org/r/48696/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16026) Master UI should display status of additional ZK switches

2016-06-14 Thread Mikhail Antonov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Antonov updated HBASE-16026:

Attachment: HBASE-16026-branch-1.3.patch

simple patch

> Master UI should display status of additional ZK switches
> -
>
> Key: HBASE-16026
> URL: https://issues.apache.org/jira/browse/HBASE-16026
> Project: HBase
>  Issue Type: Bug
>Reporter: Mikhail Antonov
>Assignee: Mikhail Antonov
> Attachments: HBASE-16026-branch-1.3.patch
>
>
> Currently in the warning section we show warnings for bad JVM version, master 
> not initialized OR catalog janitor disabled, and balancer being disabled.
> We should also have status for split / merge switches (so that if someone ran 
> hbck, aborted and it switches are in the bad state we can see that) and 
> possibly normalizer.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16030) All Regions are flushed at about same time when MEMSTORE_PERIODIC_FLUSH is on, causing flush spike

2016-06-14 Thread Tianying Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tianying Chang updated HBASE-16030:
---
Fix Version/s: 1.2.1
   Status: Patch Available  (was: In Progress)

> All Regions are flushed at about same time when MEMSTORE_PERIODIC_FLUSH is 
> on, causing flush spike
> --
>
> Key: HBASE-16030
> URL: https://issues.apache.org/jira/browse/HBASE-16030
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 1.2.1
>Reporter: Tianying Chang
>Assignee: Tianying Chang
> Fix For: 1.2.1
>
> Attachments: hbase-16030.patch
>
>
> In our production cluster, we observed that memstore flushes spike every hour 
> for all regions/RS (we use the default memstore periodic flush interval of 1 
> hour). 
> This happens when two conditions are met: 
> 1. the memstore does not have enough data to be flushed before the 1 hour 
> limit is reached;
> 2. all regions are opened around the same time (e.g. all RS are started at 
> the same time when starting a cluster). 
> With the above two conditions, all the regions will be flushed around the 
> same time, at startTime+1hour-delay, again and again.
> We added a flush jitter time to randomize the flush time of each region, so 
> that they don't all get flushed at around the same time. We had this feature 
> running in our 0.94.7 and 0.94.26 clusters. Recently, we upgraded to 1.2 and 
> found this issue is still there in 1.2, so we are porting it into the 1.2 
> branch. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16030) All Regions are flushed at about same time when MEMSTORE_PERIODIC_FLUSH is on, causing flush spike

2016-06-14 Thread Tianying Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15330913#comment-15330913
 ] 

Tianying Chang commented on HBASE-16030:


Attached a patch for 1.2.1.

> All Regions are flushed at about same time when MEMSTORE_PERIODIC_FLUSH is 
> on, causing flush spike
> --
>
> Key: HBASE-16030
> URL: https://issues.apache.org/jira/browse/HBASE-16030
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 1.2.1
>Reporter: Tianying Chang
>Assignee: Tianying Chang
> Fix For: 1.2.1
>
> Attachments: hbase-16030.patch
>
>
> In our production cluster, we observed that memstore flushes spike every hour 
> for all regions/RS (we use the default memstore periodic flush interval of 1 
> hour). 
> This happens when two conditions are met: 
> 1. the memstore does not have enough data to be flushed before the 1 hour 
> limit is reached;
> 2. all regions are opened around the same time (e.g. all RS are started at 
> the same time when starting a cluster). 
> With the above two conditions, all the regions will be flushed around the 
> same time, at startTime+1hour-delay, again and again.
> We added a flush jitter time to randomize the flush time of each region, so 
> that they don't all get flushed at around the same time. We had this feature 
> running in our 0.94.7 and 0.94.26 clusters. Recently, we upgraded to 1.2 and 
> found this issue is still there in 1.2, so we are porting it into the 1.2 
> branch. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16030) All Regions are flushed at about same time when MEMSTORE_PERIODIC_FLUSH is on, causing flush spike

2016-06-14 Thread Tianying Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tianying Chang updated HBASE-16030:
---
Attachment: hbase-16030.patch

> All Regions are flushed at about same time when MEMSTORE_PERIODIC_FLUSH is 
> on, causing flush spike
> --
>
> Key: HBASE-16030
> URL: https://issues.apache.org/jira/browse/HBASE-16030
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 1.2.1
>Reporter: Tianying Chang
>Assignee: Tianying Chang
> Fix For: 1.2.1
>
> Attachments: hbase-16030.patch
>
>
> In our production cluster, we observed that memstore flushes spike every hour 
> for all regions/RS (we use the default memstore periodic flush interval of 1 
> hour). 
> This happens when two conditions are met: 
> 1. the memstore does not have enough data to be flushed before the 1 hour 
> limit is reached;
> 2. all regions are opened around the same time (e.g. all RS are started at 
> the same time when starting a cluster). 
> With the above two conditions, all the regions will be flushed around the 
> same time, at startTime+1hour-delay, again and again.
> We added a flush jitter time to randomize the flush time of each region, so 
> that they don't all get flushed at around the same time. We had this feature 
> running in our 0.94.7 and 0.94.26 clusters. Recently, we upgraded to 1.2 and 
> found this issue is still there in 1.2, so we are porting it into the 1.2 
> branch. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Work started] (HBASE-16030) All Regions are flushed at about same time when MEMSTORE_PERIODIC_FLUSH is on, causing flush spike

2016-06-14 Thread Tianying Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HBASE-16030 started by Tianying Chang.
--
> All Regions are flushed at about same time when MEMSTORE_PERIODIC_FLUSH is 
> on, causing flush spike
> --
>
> Key: HBASE-16030
> URL: https://issues.apache.org/jira/browse/HBASE-16030
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 1.2.1
>Reporter: Tianying Chang
>Assignee: Tianying Chang
>
> In our production cluster, we observed that memstore flushes spike every hour 
> for all regions/RS (we use the default memstore periodic flush interval of 1 
> hour). 
> This happens when two conditions are met: 
> 1. the memstore does not have enough data to be flushed before the 1 hour 
> limit is reached;
> 2. all regions are opened around the same time (e.g. all RS are started at 
> the same time when starting a cluster). 
> With the above two conditions, all the regions will be flushed around the 
> same time, at startTime+1hour-delay, again and again.
> We added a flush jitter time to randomize the flush time of each region, so 
> that they don't all get flushed at around the same time. We had this feature 
> running in our 0.94.7 and 0.94.26 clusters. Recently, we upgraded to 1.2 and 
> found this issue is still there in 1.2, so we are porting it into the 1.2 
> branch. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-15950) Fix memstore size estimates to be more tighter

2016-06-14 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-15950:
--
Attachment: hbase-15950-v2.patch

Thanks Anoop. Will commit v2. Let me attach a branch-1 patch as well. 

> Fix memstore size estimates to be more tighter
> --
>
> Key: HBASE-15950
> URL: https://issues.apache.org/jira/browse/HBASE-15950
> Project: HBase
>  Issue Type: Bug
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
> Fix For: 2.0.0
>
> Attachments: Screen Shot 2016-06-02 at 8.48.27 PM.png, 
> hbase-15950-v0.patch, hbase-15950-v1.patch, hbase-15950-v2.patch
>
>
> While testing something else, I was loading a region with a lot of data: 
> writing 30M cells in 1M rows, with 1-byte values. 
> The memstore size turned out to be estimated as 4.5GB, while with JFR 
> profiling I can see that we are using 2.8GB for all the objects in the 
> memstore (KV + KV byte[] + CSLM.Node + CSLM.Index). 
> This obviously means that there is room in the write cache that we are not 
> using effectively. 
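The gap is easy to see per cell; a quick back-of-the-envelope from the numbers above (treating GB as 10^9 bytes for round figures):

{code}
public class MemstoreEstimateGap {
  public static void main(String[] args) {
    long cells = 30_000_000L;
    double estimated = 4.5e9;  // memstore's own estimate, in bytes
    double measured  = 2.8e9;  // JFR-measured heap for the same objects
    System.out.printf("estimated per cell: %.0f bytes%n", estimated / cells); // ~150
    System.out.printf("measured per cell:  %.0f bytes%n", measured / cells);  // ~93
    System.out.printf("overestimate: %.0f%%%n",
        100 * (estimated - measured) / measured);                             // ~61%
  }
}
{code}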



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-16030) All Regions are flushed at about same time when MEMSTORE_PERIODIC_FLUSH is on, causing flush spike

2016-06-14 Thread Tianying Chang (JIRA)
Tianying Chang created HBASE-16030:
--

 Summary: All Regions are flushed at about same time when 
MEMSTORE_PERIODIC_FLUSH is on, causing flush spike
 Key: HBASE-16030
 URL: https://issues.apache.org/jira/browse/HBASE-16030
 Project: HBase
  Issue Type: Improvement
Affects Versions: 1.2.1
Reporter: Tianying Chang
Assignee: Tianying Chang


In our production cluster, we observed that memstore flushes spike every hour 
for all regions/RS (we use the default memstore periodic flush interval of 1 
hour). 

This happens when two conditions are met: 
1. the memstore does not have enough data to be flushed before the 1 hour limit 
is reached;
2. all regions are opened around the same time (e.g. all RS are started at the 
same time when starting a cluster). 

With the above two conditions, all the regions will be flushed around the same 
time, at startTime+1hour-delay, again and again.

We added a flush jitter time to randomize the flush time of each region, so 
that they don't all get flushed at around the same time. We had this feature 
running in our 0.94.7 and 0.94.26 clusters. Recently, we upgraded to 1.2 and 
found this issue is still there in 1.2, so we are porting it into the 1.2 
branch. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HBASE-16026) Master UI should display status of additional ZK switches

2016-06-14 Thread Mikhail Antonov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Antonov reassigned HBASE-16026:
---

Assignee: Mikhail Antonov

> Master UI should display status of additional ZK switches
> -
>
> Key: HBASE-16026
> URL: https://issues.apache.org/jira/browse/HBASE-16026
> Project: HBase
>  Issue Type: Bug
>Reporter: Mikhail Antonov
>Assignee: Mikhail Antonov
>
> Currently in the warning section we show warnings for bad JVM version, master 
> not initialized OR catalog janitor disabled, and balancer being disabled.
> We should also have status for split / merge switches (so that if someone ran 
> hbck, aborted and it switches are in the bad state we can see that) and 
> possibly normalizer.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-16029) All Regions are flushed at about same time when MEMSTORE_PERIODIC_FLUSH is on, causing flush spike

2016-06-14 Thread Tianying Chang (JIRA)
Tianying Chang created HBASE-16029:
--

 Summary: All Regions are flushed at about same time when 
MEMSTORE_PERIODIC_FLUSH is on, causing flush spike
 Key: HBASE-16029
 URL: https://issues.apache.org/jira/browse/HBASE-16029
 Project: HBase
  Issue Type: Improvement
  Components: hbase, Performance
Affects Versions: 1.2.1
Reporter: Tianying Chang
Assignee: Tianying Chang


 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-16028) All Regions are flushed at about same time when MEMSTORE_PERIODIC_FLUSH is on, causing flush spike

2016-06-14 Thread Tianying Chang (JIRA)
Tianying Chang created HBASE-16028:
--

 Summary: All Regions are flushed at about same time when 
MEMSTORE_PERIODIC_FLUSH is on, causing flush spike
 Key: HBASE-16028
 URL: https://issues.apache.org/jira/browse/HBASE-16028
 Project: HBase
  Issue Type: Improvement
  Components: hbase, Performance
Affects Versions: 1.2.1
Reporter: Tianying Chang
Assignee: Tianying Chang


In our production cluster, we observed that memstore flushes spike every hour 
for all regions/RS (we use the default memstore periodic flush interval of 1 
hour). 

This happens when two conditions are met: 
1. the memstore does not have enough data to be flushed before the 1 hour limit 
is reached;
2. all regions are opened around the same time (e.g. all RS are started at the 
same time when starting a cluster). 

With the above two conditions, all the regions will be flushed around the same 
time, at startTime+1hour-delay, again and again.

We added a flush jitter time to randomize the flush time of each region, so 
that they don't all get flushed at around the same time. We had this feature 
running in our 0.94.7 and 0.94.26 clusters. Recently, we upgraded to 1.2 and 
found this issue is still there in 1.2, so we are porting it into the 1.2 
branch. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-16027) All Regions are flushed at about same time when MEMSTORE_PERIODIC_FLUSH is on, causing flush spike

2016-06-14 Thread Tianying Chang (JIRA)
Tianying Chang created HBASE-16027:
--

 Summary: All Regions are flushed at about same time when 
MEMSTORE_PERIODIC_FLUSH is on, causing flush spike
 Key: HBASE-16027
 URL: https://issues.apache.org/jira/browse/HBASE-16027
 Project: HBase
  Issue Type: Bug
  Components: hbase, Performance
Affects Versions: 1.2.1
Reporter: Tianying Chang
Assignee: Tianying Chang


In our production cluster, we observed that memstore flushes spike every hour 
for all regions/RS (we use the default memstore periodic flush interval of 1 
hour). 

This happens when two conditions are met: 
1. the memstore does not have enough data to be flushed before the 1 hour limit 
is reached;
2. all regions are opened around the same time (e.g. all RS are started at the 
same time when starting a cluster). 

With the above two conditions, all the regions will be flushed around the same 
time, at startTime+1hour-delay, again and again.

We added a flush jitter time to randomize the flush time of each region, so 
that they don't all get flushed at around the same time. We had this feature 
running in our 0.94.7 and 0.94.26 clusters. Recently, we upgraded to 1.2 and 
found this issue is still there in 1.2, so we are porting it into the 1.2 
branch. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-15643) Need metrics of cache hit ratio, etc for one table

2016-06-14 Thread Alicia Ying Shu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alicia Ying Shu updated HBASE-15643:

Status: Patch Available  (was: Open)

> Need metrics of cache hit ratio, etc for one table
> --
>
> Key: HBASE-15643
> URL: https://issues.apache.org/jira/browse/HBASE-15643
> Project: HBase
>  Issue Type: Improvement
>Reporter: Heng Chen
>Assignee: Alicia Ying Shu
> Attachments: HBASE-15643.patch
>
>
> There are many tables on our cluster, but only some of them need to be read 
> online.
> We could improve read performance with the cache, but we need some metrics 
> for it at the table level. There are a few we can collect: BlockCacheCount, 
> BlockCacheSize, BlockCacheHitCount, BlockCacheMissCount, BlockCacheHitPercent
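The last metric in the list is derived from the two counters before it; a small sketch of the relationship (the counter values are made up for illustration):

{code}
public class BlockCacheHitPercentExample {
  public static void main(String[] args) {
    long hitCount = 9_000;   // example BlockCacheHitCount
    long missCount = 1_000;  // example BlockCacheMissCount
    double hitPercent = 100.0 * hitCount / (hitCount + missCount);
    System.out.printf("BlockCacheHitPercent = %.1f%%%n", hitPercent); // 90.0%
  }
}
{code}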



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15643) Need metrics of cache hit ratio, etc for one table

2016-06-14 Thread Alicia Ying Shu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15330885#comment-15330885
 ] 

Alicia Ying Shu commented on HBASE-15643:
-

Uploaded a patch for collecting table-level metrics for the block cache: 
BlockCacheCount, BlockCacheSize, BlockCacheHitCount, BlockCacheMissCount, 
BlockCacheHitPercent

> Need metrics of cache hit ratio, etc for one table
> --
>
> Key: HBASE-15643
> URL: https://issues.apache.org/jira/browse/HBASE-15643
> Project: HBase
>  Issue Type: Improvement
>Reporter: Heng Chen
>Assignee: Alicia Ying Shu
> Attachments: HBASE-15643.patch
>
>
> There are many tables on our cluster, but only some of them need to be read 
> online.
> We could improve read performance with the cache, but we need some metrics 
> for it at the table level. There are a few we can collect: BlockCacheCount, 
> BlockCacheSize, BlockCacheHitCount, BlockCacheMissCount, BlockCacheHitPercent



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-15643) Need metrics of cache hit ratio, etc for one table

2016-06-14 Thread Alicia Ying Shu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alicia Ying Shu updated HBASE-15643:

Attachment: HBASE-15643.patch

> Need metrics of cache hit ratio, etc for one table
> --
>
> Key: HBASE-15643
> URL: https://issues.apache.org/jira/browse/HBASE-15643
> Project: HBase
>  Issue Type: Improvement
>Reporter: Heng Chen
>Assignee: Alicia Ying Shu
> Attachments: HBASE-15643.patch
>
>
> There are many tables on our cluster, but only some of them need to be read 
> online.
> We could improve read performance with the cache, but we need some metrics 
> for it at the table level. There are a few we can collect: BlockCacheCount, 
> BlockCacheSize, BlockCacheHitCount, BlockCacheMissCount, BlockCacheHitPercent



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-15643) Need metrics of cache hit ratio, etc for one table

2016-06-14 Thread Alicia Ying Shu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alicia Ying Shu updated HBASE-15643:

Description: 
There are many tables on our cluster, but only some of them need to be read 
online.

We could improve read performance with the cache, but we need some metrics 
for it at the table level. There are a few we can collect: BlockCacheCount, 
BlockCacheSize, BlockCacheHitCount, BlockCacheMissCount, BlockCacheHitPercent

  was:
There are many tables on our cluster, but only some of them need to be read 
online.

We could improve read performance with the cache, but we need some metrics 
for it at the table level.


> Need metrics of cache hit ratio, etc for one table
> --
>
> Key: HBASE-15643
> URL: https://issues.apache.org/jira/browse/HBASE-15643
> Project: HBase
>  Issue Type: Improvement
>Reporter: Heng Chen
>Assignee: Alicia Ying Shu
>
> There are many tables on our cluster, but only some of them need to be read 
> online.
> We could improve read performance with the cache, but we need some metrics 
> for it at the table level. There are a few we can collect: BlockCacheCount, 
> BlockCacheSize, BlockCacheHitCount, BlockCacheMissCount, BlockCacheHitPercent



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-15643) Need metrics of cache hit ratio, etc for one table

2016-06-14 Thread Alicia Ying Shu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alicia Ying Shu updated HBASE-15643:

Summary: Need metrics of cache hit ratio, etc for one table  (was: Need 
metrics of cache hit ratio for one table)

> Need metrics of cache hit ratio, etc for one table
> --
>
> Key: HBASE-15643
> URL: https://issues.apache.org/jira/browse/HBASE-15643
> Project: HBase
>  Issue Type: Improvement
>Reporter: Heng Chen
>Assignee: Alicia Ying Shu
>
> There are many tables on our cluster, but only some of them need to be read 
> online.
> We could improve read performance with the cache, but we need some metrics 
> for it at the table level.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-3727) MultiHFileOutputFormat

2016-06-14 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15330866#comment-15330866
 ] 

Ted Yu commented on HBASE-3727:
---

Can you add a unit test?

Putting the patch on review board would be appreciated.

> MultiHFileOutputFormat
> --
>
> Key: HBASE-3727
> URL: https://issues.apache.org/jira/browse/HBASE-3727
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: 2.0.0
>Reporter: Andrew Purtell
>Assignee: yi liang
>Priority: Minor
> Attachments: MH.patch, MultiHFileOutputFormat.java, 
> MultiHFileOutputFormat.java, MultiHFileOutputFormat.java
>
>
> Like MultiTableOutputFormat, but outputting HFiles. Key is tablename as an 
> IBW. Creates sub-writers (code cut and pasted from HFileOutputFormat) on 
> demand that produce HFiles in per-table subdirectories of the configured 
> output path. Does not currently support partitioning for existing tables / 
> incremental update.
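A hedged sketch of the demand-created sub-writer idea described above; the class and factory names are illustrative, not the attached MultiHFileOutputFormat.java:

{code}
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.mapreduce.RecordWriter;
import org.apache.hadoop.mapreduce.TaskAttemptContext;

public class MultiHFileRecordWriterSketch
    extends RecordWriter<ImmutableBytesWritable, Cell> {

  private final Path outputRoot;
  // one sub-writer per table name (the IBW key), created on first use
  private final Map<ImmutableBytesWritable, RecordWriter<ImmutableBytesWritable, Cell>>
      tableWriters = new HashMap<>();

  public MultiHFileRecordWriterSketch(Path outputRoot) {
    this.outputRoot = outputRoot;
  }

  @Override
  public void write(ImmutableBytesWritable table, Cell cell)
      throws IOException, InterruptedException {
    RecordWriter<ImmutableBytesWritable, Cell> writer = tableWriters.get(table);
    if (writer == null) {
      // route HFiles into a per-table subdirectory of the configured output path
      Path tableDir = new Path(outputRoot, Bytes.toString(table.copyBytes()));
      writer = newHFileWriter(tableDir);
      tableWriters.put(new ImmutableBytesWritable(table.copyBytes()), writer);
    }
    writer.write(table, cell);
  }

  @Override
  public void close(TaskAttemptContext context)
      throws IOException, InterruptedException {
    for (RecordWriter<ImmutableBytesWritable, Cell> w : tableWriters.values()) {
      w.close(context);
    }
  }

  // Hypothetical factory standing in for the HFileOutputFormat-derived code.
  private RecordWriter<ImmutableBytesWritable, Cell> newHFileWriter(Path tableDir) {
    throw new UnsupportedOperationException("sub-writer creation omitted in this sketch");
  }
}
{code}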



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-15335) Add composite key support in row key

2016-06-14 Thread Zhan Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhan Zhang updated HBASE-15335:
---
Status: Patch Available  (was: Open)

> Add composite key support in row key
> 
>
> Key: HBASE-15335
> URL: https://issues.apache.org/jira/browse/HBASE-15335
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Zhan Zhang
>Assignee: Zhan Zhang
> Attachments: HBASE-15335-1.patch, HBASE-15335-2.patch, 
> HBASE-15335-3.patch, HBASE-15335-4.patch, HBASE-15335-5.patch, 
> HBASE-15335-6.patch, HBASE-15335-7.patch
>
>
> Add composite key filter support in the connector.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-15335) Add composite key support in row key

2016-06-14 Thread Zhan Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhan Zhang updated HBASE-15335:
---
Status: Open  (was: Patch Available)

> Add composite key support in row key
> 
>
> Key: HBASE-15335
> URL: https://issues.apache.org/jira/browse/HBASE-15335
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Zhan Zhang
>Assignee: Zhan Zhang
> Attachments: HBASE-15335-1.patch, HBASE-15335-2.patch, 
> HBASE-15335-3.patch, HBASE-15335-4.patch, HBASE-15335-5.patch, 
> HBASE-15335-6.patch, HBASE-15335-7.patch
>
>
> Add composite key filter support in the connector.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16024) Revert HBASE-15406 from branch-1.3

2016-06-14 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15330861#comment-15330861
 ] 

Mikhail Antonov commented on HBASE-16024:
-

Didn't know you were doing this (maybe I missed a jira, sorry - might be a dup 
then).

Yeah, compile-protobuf ran fine and didn't change any files (didn't run the full 
tests yet; waiting for the precommit bot to vote on it as well).

> Revert HBASE-15406 from branch-1.3
> --
>
> Key: HBASE-16024
> URL: https://issues.apache.org/jira/browse/HBASE-16024
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.0
>Reporter: Mikhail Antonov
>Assignee: Mikhail Antonov
> Fix For: 1.3.0
>
> Attachments: HBASE-15406-reverted-from.branch-1.3.v1.patch
>
>
> As discussed there, this is a kind of controversial change, and more thought 
> is probably needed around it to get it right.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16024) Revert HBASE-15406 from branch-1.3

2016-06-14 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15330829#comment-15330829
 ] 

Enis Soztutar commented on HBASE-16024:
---

I was doing it via {{git revert }} yesterday, but I was not able to commit it 
due to a different issue. 

I think this should stay (from HBASE-15608): 
{code}
-  @Deprecated
{code}

Otherwise looks good. Can you please run: 
{code}
mvn clean install -Pcompile-protobuf 
{code}
to make sure that protobuf is recompiled, just in case. 




> Revert HBASE-15406 from branch-1.3
> --
>
> Key: HBASE-16024
> URL: https://issues.apache.org/jira/browse/HBASE-16024
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.0
>Reporter: Mikhail Antonov
>Assignee: Mikhail Antonov
> Fix For: 1.3.0
>
> Attachments: HBASE-15406-reverted-from.branch-1.3.v1.patch
>
>
> As discussed there, this is a kind of controversial change, and more thought 
> is probably needed around it to get it right.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15971) Regression: Random Read/WorkloadC slower in 1.x than 0.98

2016-06-14 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15330843#comment-15330843
 ] 

Enis Soztutar commented on HBASE-15971:
---

+1. Great finding. What to do for 1.2 and 1.1? [~ndimiduk], [~busbey]. 

> Regression: Random Read/WorkloadC slower in 1.x than 0.98
> -
>
> Key: HBASE-15971
> URL: https://issues.apache.org/jira/browse/HBASE-15971
> Project: HBase
>  Issue Type: Sub-task
>  Components: rpc
>Affects Versions: 2.0.0, 1.3.0
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Attachments: 098.hits.png, 098.png, HBASE-15971.branch-1.001.patch, 
> HBASE-15971.branch-1.002.patch, Screen Shot 2016-06-10 at 5.08.24 PM.png, 
> Screen Shot 2016-06-10 at 5.08.26 PM.png, branch-1.hits.png, branch-1.png, 
> flight_recording_10172402220203_28.branch-1.jfr, 
> flight_recording_10172402220203_29.09820.0.98.20.jfr, handlers.fp.png, 
> hits.fp.png, hits.patched1.0.vs.unpatched1.0.vs.098.png, run_ycsb.sh
>
>
> branch-1 is slower than 0.98 doing YCSB random read/workloadC. It seems to be 
> doing about 1/2 the throughput of 0.98.
> In branch-1, we have low handler occupancy compared to 0.98. Hacking in a 
> reader thread occupancy metric, it is about the same in both. In the parent 
> issue, hacking out the scheduler, I was able to get branch-1 to go 3x faster, 
> so I will dig in here.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-3727) MultiHFileOutputFormat

2016-06-14 Thread yi liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

yi liang updated HBASE-3727:

Status: Patch Available  (was: Open)

> MultiHFileOutputFormat
> --
>
> Key: HBASE-3727
> URL: https://issues.apache.org/jira/browse/HBASE-3727
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: 2.0.0
>Reporter: Andrew Purtell
>Assignee: yi liang
>Priority: Minor
> Attachments: MH.patch, MultiHFileOutputFormat.java, 
> MultiHFileOutputFormat.java, MultiHFileOutputFormat.java
>
>
> Like MultiTableOutputFormat, but outputting HFiles. Key is tablename as an 
> IBW. Creates sub-writers (code cut and pasted from HFileOutputFormat) on 
> demand that produce HFiles in per-table subdirectories of the configured 
> output path. Does not currently support partitioning for existing tables / 
> incremental update.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-3727) MultiHFileOutputFormat

2016-06-14 Thread yi liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

yi liang updated HBASE-3727:

Attachment: MH.patch

> MultiHFileOutputFormat
> --
>
> Key: HBASE-3727
> URL: https://issues.apache.org/jira/browse/HBASE-3727
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: 2.0.0
>Reporter: Andrew Purtell
>Assignee: yi liang
>Priority: Minor
> Attachments: MH.patch, MultiHFileOutputFormat.java, 
> MultiHFileOutputFormat.java, MultiHFileOutputFormat.java
>
>
> Like MultiTableOutputFormat, but outputting HFiles. Key is tablename as an 
> IBW. Creates sub-writers (code cut and pasted from HFileOutputFormat) on 
> demand that produce HFiles in per-table subdirectories of the configured 
> output path. Does not currently support partitioning for existing tables / 
> incremental update.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-15966) Bulk load unable to read HFiles from different filesystem type than fs.defaultFS

2016-06-14 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-15966:
---
Assignee: Ted Yu
  Status: Patch Available  (was: Open)

> Bulk load unable to read HFiles from different filesystem type than 
> fs.defaultFS
> 
>
> Key: HBASE-15966
> URL: https://issues.apache.org/jira/browse/HBASE-15966
> Project: HBase
>  Issue Type: Bug
>  Components: hbase, HFile
>Affects Versions: 0.98.4
> Environment: Microsoft Azure HDInsight 3.2 cluster with eight hosts
> - Ubuntu 12.04.5
> - HDP 2.2
> - Hadoop 2.6.0
> - HBase 0.98.4
>Reporter: Dustin Christmann
>Assignee: Ted Yu
> Attachments: 15966.v1.txt
>
>
> In a YARN job, I am creating HFiles with code that has been cribbed from the 
> TableOutputFormat class and bulkloading them with 
> LoadIncrementalHFiles.doBulkLoad.
> On other clusters, where fs.defaultFS is set to an hdfs: URI, and my HFiles 
> are placed in an hdfs: URI, the bulkload works as intended.
> On this particular cluster, where fs.defaultFS is set to a wasb: URI and my 
> HFiles are placed in a wasb: URI, the bulkload also works as intended.
> However, on this same cluster, whenever I place the HFiles in an hdfs: URI, I 
> get the following logs in my application from the HBase client logging 
> repeatedly:
> [02 Jun 23:23:26.002](20259/140062246807296) 
> Info2:org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles: Trying to load 
> hfile=hdfs://[my cluster]/[my path] first=\x00\x00\x11\x06 last=;\x8B\x85\x18
> [02 Jun 23:23:26.002](20259/140062245754624) 
> Info3:org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles: Going to 
> connect to server region=[my namespace]:[my 
> table],,1464909723920.00eafdb73989312bd8864f0913255f50., 
> hostname=10.0.1.6,16020,1464698786237, seqNum=2 for row  with hfile group 
> [{[B@4d0409e7,hdfs://[my cluster]/[my path]}]
> [02 Jun 23:23:26.012](20259/140062245754624) 
> Info1:org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles: Attempt to 
> bulk load region containing  into table [my namespace]:[my table] with files 
> [family:[my family] path:hdfs://[my cluster]/[my path]] failed.  This is 
> recoverable and they will be retried.
> [02 Jun 23:23:26.019](20259/140061634982912) 
> Info2:org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles: Split occured 
> while grouping HFiles, retry attempt 2 with 1 files remaining to group or 
> split
> And when I look at the appropriate region server's log, I find the following 
> exception repeatedly:
> 2016-06-02 20:22:50,771 ERROR 
> [B.DefaultRpcServer.handler=22,queue=2,port=16020] 
> access.SecureBulkLoadEndpoint: Failed to complete bulk load
> java.io.FileNotFoundException: File doesn't exist: hdfs://[my cluster]/[my 
> path]  at 
> org.apache.hadoop.fs.azure.NativeAzureFileSystem.setPermission(NativeAzureFileSystem.java:2192)
>at 
> org.apache.hadoop.hbase.security.access.SecureBulkLoadEndpoint$1.run(SecureBulkLoadEndpoint.java:280)
>at 
> org.apache.hadoop.hbase.security.access.SecureBulkLoadEndpoint$1.run(SecureBulkLoadEndpoint.java:270)
>at java.security.AccessController.doPrivileged(Native Method)
>at javax.security.auth.Subject.doAs(Subject.java:356)
>at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1651)
>at 
> org.apache.hadoop.hbase.security.access.SecureBulkLoadEndpoint.secureBulkLoadHFiles(SecureBulkLoadEndpoint.java:270)
>at 
> org.apache.hadoop.hbase.protobuf.generated.SecureBulkLoadProtos$SecureBulkLoadService.callMethod(SecureBulkLoadProtos.java:4631)
>at 
> org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:6986)
>at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.execServiceOnRegion(HRegionServer.java:3456)
>at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.execService(HRegionServer.java:3438)
>at 
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29998)
>at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2080)
>at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108)
>at 
> org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
>at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
>at java.lang.Thread.run(Thread.java:745)
> Looking at the appropriate code in SecureBulkLoadEndpoint.java, I'm finding 
> the following:
> public Boolean run() {
>   FileSystem fs = null;
>   try {
>     Configuration conf = env.getConfiguration();
>     fs = FileSystem.get(conf);
>     for (Pair<byte[], String> el : familyPaths) {
>       Path p = new Path(el.getSecond());
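A hedged reading of the failure and a fix sketch: FileSystem.get(conf) resolves fs.defaultFS (wasb: on this cluster), so an hdfs: HFile path cannot be served by the returned filesystem. Deriving the filesystem from each source path instead honors the path's own scheme. This reuses the names from the snippet above and is illustrative, not necessarily the committed fix:

{code}
Configuration conf = env.getConfiguration();
for (Pair<byte[], String> el : familyPaths) {
  Path p = new Path(el.getSecond());
  FileSystem srcFs = p.getFileSystem(conf); // hdfs: paths get an hdfs filesystem
  // ... perform the permission/rename work against srcFs, not the default fs ...
}
{code}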

[jira] [Updated] (HBASE-15746) Remove extra RegionCoprocessor preClose() in RSRpcServices#closeRegion

2016-06-14 Thread Stephen Yuan Jiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Yuan Jiang updated HBASE-15746:
---
   Resolution: Fixed
Fix Version/s: 1.2.2
   1.4.0
   1.0.4
   1.3.0
   2.0.0
   Status: Resolved  (was: Patch Available)

> Remove extra RegionCoprocessor preClose() in RSRpcServices#closeRegion
> --
>
> Key: HBASE-15746
> URL: https://issues.apache.org/jira/browse/HBASE-15746
> Project: HBase
>  Issue Type: Bug
>  Components: Coprocessors, regionserver
>Affects Versions: 2.0.0, 1.3.0, 1.2.1, 1.1.4, 0.98.19
>Reporter: Matteo Bertozzi
>Assignee: Stephen Yuan Jiang
>Priority: Minor
> Fix For: 2.0.0, 1.3.0, 1.0.4, 1.4.0, 1.2.2
>
> Attachments: HBASE-15746.v1-master.patch
>
>
> The preClose() region coprocessor call gets called 3 times via rpc.
> The first one is when we receive the RPC
> https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java#L1329
> The second time is when we ask the RS to close the region
> https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java#L2852
> The third time is when the doClose() on the region is executed.
> https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java#L1419
> I'm pretty sure the first one can be removed, since there is no code between 
> that and the second call and they are a copy-paste.
> The second one explicitly says that it is there to enforce ACLs before 
> starting the operation, which leads me to think that the 3rd one in the 
> region gets executed too late in the process. But region.close() may be 
> called by someone other than the RS, so we should probably leave the 
> preClose() in there (e.g. OpenRegionHandler on failure cleanup). 
> Any ideas?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-3727) MultiHFileOutputFormat

2016-06-14 Thread yi liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

yi liang updated HBASE-3727:

Status: Open  (was: Patch Available)

> MultiHFileOutputFormat
> --
>
> Key: HBASE-3727
> URL: https://issues.apache.org/jira/browse/HBASE-3727
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: 2.0.0
>Reporter: Andrew Purtell
>Assignee: yi liang
>Priority: Minor
> Attachments: MultiHFileOutputFormat.java, 
> MultiHFileOutputFormat.java, MultiHFileOutputFormat.java
>
>
> Like MultiTableOutputFormat, but outputting HFiles. Key is tablename as an 
> IBW. Creates sub-writers (code cut and pasted from HFileOutputFormat) on 
> demand that produce HFiles in per-table subdirectories of the configured 
> output path. Does not currently support partitioning for existing tables / 
> incremental update.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-3727) MultiHFileOutputFormat

2016-06-14 Thread yi liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

yi liang updated HBASE-3727:

Attachment: (was: MH.patch)

> MultiHFileOutputFormat
> --
>
> Key: HBASE-3727
> URL: https://issues.apache.org/jira/browse/HBASE-3727
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: 2.0.0
>Reporter: Andrew Purtell
>Assignee: yi liang
>Priority: Minor
> Attachments: MultiHFileOutputFormat.java, 
> MultiHFileOutputFormat.java, MultiHFileOutputFormat.java
>
>
> Like MultiTableOutputFormat, but outputting HFiles. Key is tablename as an 
> IBW. Creates sub-writers (code cut and pasted from HFileOutputFormat) on 
> demand that produce HFiles in per-table subdirectories of the configured 
> output path. Does not currently support partitioning for existing tables / 
> incremental update.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-5291) Add Kerberos HTTP SPNEGO authentication support to HBase web consoles

2016-06-14 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15330813#comment-15330813
 ] 

Ted Yu commented on HBASE-5291:
---

lgtm

Mind filling out the release notes?

> Add Kerberos HTTP SPNEGO authentication support to HBase web consoles
> -
>
> Key: HBASE-5291
> URL: https://issues.apache.org/jira/browse/HBASE-5291
> Project: HBase
>  Issue Type: Improvement
>  Components: master, regionserver, security
>Reporter: Andrew Purtell
>Assignee: Josh Elser
> Fix For: 2.0.0
>
> Attachments: HBASE-5291.001.patch, HBASE-5291.002.patch, 
> HBASE-5291.003.patch, HBASE-5291.004.patch, HBASE-5291.005.patch
>
>
> Like HADOOP-7119, the same motivations:
> {quote}
> Hadoop RPC already supports Kerberos authentication. 
> {quote}
> As does the HBase secure RPC engine.
> {quote}
> Kerberos enables single sign-on.
> Popular browsers (Firefox and Internet Explorer) have support for Kerberos 
> HTTP SPNEGO.
> Adding support for Kerberos HTTP SPNEGO to [HBase] web consoles would provide 
> a unified authentication mechanism and single sign-on for web UI and RPC.
> {quote}
> Also like HADOOP-7119, the same solution:
> A servlet filter is configured in front of all Hadoop web consoles for 
> authentication.
> This filter verifies if the incoming request is already authenticated by the 
> presence of a signed HTTP cookie. If the cookie is present, its signature is 
> valid and its value didn't expire; then the request continues its way to the 
> page invoked by the request. If the cookie is not present, it is invalid or 
> it expired; then the request is delegated to an authenticator handler. The 
> authenticator handler then is responsible for requesting/validating the 
> user-agent for the user credentials. This may require one or more additional 
> interactions between the authenticator handler and the user-agent (which will 
> be multiple HTTP requests). Once the authenticator handler verifies the 
> credentials and generates an authentication token, a signed cookie is 
> returned to the user-agent for all subsequent invocations.
> The authenticator handler is pluggable and 2 implementations are provided out 
> of the box: pseudo/simple and kerberos.
> 1. The pseudo/simple authenticator handler is equivalent to the Hadoop 
> pseudo/simple authentication. It trusts the value of the user.name query 
> string parameter. The pseudo/simple authenticator handler supports an 
> anonymous mode which accepts any request without requiring the user.name 
> query string parameter to create the token. This is the default behavior, 
> preserving the behavior of the HBase web consoles before this patch.
> 2. The kerberos authenticator handler implements the Kerberos HTTP SPNEGO 
> implementation. This authenticator handler will generate a token only if a 
> successful Kerberos HTTP SPNEGO interaction is performed between the 
> user-agent and the authenticator. Browsers like Firefox and Internet Explorer 
> support Kerberos HTTP SPNEGO.
> We can build on the support added to Hadoop via HADOOP-7119. Should just be a 
> matter of wiring up the filter to our infoservers in a similar manner. 
> And from 
> https://issues.apache.org/jira/browse/HBASE-5050?focusedCommentId=13171086&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13171086
> {quote}
> Hadoop 0.23 onwards has a hadoop-auth artifact that provides SPNEGO/Kerberos 
> authentication for webapps via a filter. You should consider using it. You 
> don't have to move Hbase to 0.23 for that, just consume the hadoop-auth 
> artifact, which has no dependencies on the rest of Hadoop 0.23 artifacts.
> {quote}
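A hedged sketch of the filter flow described above. The real mechanism is Hadoop's AuthenticationFilter in the hadoop-auth artifact; the token and handler types below are simplified stand-ins, and parseSignedCookie is a hypothetical helper:

{code}
import java.io.IOException;
import javax.servlet.FilterChain;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class SpnegoAuthFilterSketch {
  interface Token { boolean isValid(); }
  interface AuthHandler { void authenticate(HttpServletRequest req, HttpServletResponse resp); }

  private final AuthHandler authHandler;  // pluggable: pseudo/simple or kerberos SPNEGO

  SpnegoAuthFilterSketch(AuthHandler authHandler) { this.authHandler = authHandler; }

  public void doFilter(ServletRequest req, ServletResponse resp, FilterChain chain)
      throws IOException, ServletException {
    HttpServletRequest httpReq = (HttpServletRequest) req;
    Token token = parseSignedCookie(httpReq);  // checks signature and expiry
    if (token != null && token.isValid()) {
      chain.doFilter(req, resp);  // already authenticated: continue to the page
    } else {
      // delegate to the handler; on success it sets a signed cookie so
      // subsequent requests take the fast path above
      authHandler.authenticate(httpReq, (HttpServletResponse) resp);
    }
  }

  private Token parseSignedCookie(HttpServletRequest req) {
    return null;  // cookie verification omitted in this sketch
  }
}
{code}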



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15746) Remove extra RegionCoprocessor preClose() in RSRpcServices#closeRegion

2016-06-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15330811#comment-15330811
 ] 

Hadoop QA commented on HBASE-15746:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 3s {color} 
| {color:red} HBASE-15746 does not apply to master. Rebase required? Wrong 
Branch? See https://yetus.apache.org/documentation/0.2.1/precommit-patchnames 
for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12810661/HBASE-15746.v1-master.patch
 |
| JIRA Issue | HBASE-15746 |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/2213/console |
| Powered by | Apache Yetus 0.2.1   http://yetus.apache.org |


This message was automatically generated.



> Remove extra RegionCoprocessor preClose() in RSRpcServices#closeRegion
> --
>
> Key: HBASE-15746
> URL: https://issues.apache.org/jira/browse/HBASE-15746
> Project: HBase
>  Issue Type: Bug
>  Components: Coprocessors, regionserver
>Affects Versions: 2.0.0, 1.3.0, 1.2.1, 1.1.4, 0.98.19
>Reporter: Matteo Bertozzi
>Assignee: Stephen Yuan Jiang
>Priority: Minor
> Attachments: HBASE-15746.v1-master.patch
>
>
> The preClose() region coprocessor call gets called 3 times via rpc.
> The first one is when we receive the RPC
> https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java#L1329
> The second time is when we ask the RS to close the region
> https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java#L2852
> The third time is when the doClose() on the region is executed.
> https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java#L1419
> I'm pretty sure the first one can be removed, since there is no code between 
> that and the second call and they are a copy-paste.
> The second one explicitly says that it is there to enforce ACLs before 
> starting the operation, which leads me to think that the 3rd one in the 
> region gets executed too late in the process. But region.close() may be 
> called by someone other than the RS, so we should probably leave the 
> preClose() in there (e.g. OpenRegionHandler on failure cleanup). 
> Any ideas?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-16026) Master UI should display status of additional ZK switches

2016-06-14 Thread Mikhail Antonov (JIRA)
Mikhail Antonov created HBASE-16026:
---

 Summary: Master UI should display status of additional ZK switches
 Key: HBASE-16026
 URL: https://issues.apache.org/jira/browse/HBASE-16026
 Project: HBase
  Issue Type: Bug
Reporter: Mikhail Antonov


Currently in the warning section we show warnings for a bad JVM version, master 
not initialized OR catalog janitor disabled, and the balancer being disabled.

We should also have status for the split / merge switches (so that if someone 
ran hbck, aborted it, and the switches are left in a bad state, we can see that) 
and possibly the normalizer.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HBASE-16025) Cache table state to reduce load on META

2016-06-14 Thread Gary Helmling (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Helmling reassigned HBASE-16025:
-

Assignee: Gary Helmling

> Cache table state to reduce load on META
> 
>
> Key: HBASE-16025
> URL: https://issues.apache.org/jira/browse/HBASE-16025
> Project: HBase
>  Issue Type: Improvement
>  Components: Client
>Reporter: Gary Helmling
>Assignee: Gary Helmling
>Priority: Critical
> Fix For: 2.0.0
>
>
> HBASE-12035 moved keeping table enabled/disabled state from ZooKeeper into 
> hbase:meta.  When we retry operations on the client, we check table state in 
> order to return a specific message if the table is disabled.  This means that 
> in master we will be going back to meta for every retry, even if a region's 
> location has not changed.  This is going to cause performance issues when a 
> cluster is already loaded, i.e. in cases where regionservers may be returning 
> CallQueueTooBigException.
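A hedged sketch of the caching idea, assuming a short TTL bounds staleness (the names and the TTL are illustrative, not the eventual patch):

{code}
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import org.apache.hadoop.hbase.TableName;

class TableStateCache {
  private static final long TTL_MS = 1000;  // assumption: 1s bound on staleness

  private static final class Entry {
    final boolean disabled;
    final long fetchedAt;
    Entry(boolean disabled, long fetchedAt) {
      this.disabled = disabled;
      this.fetchedAt = fetchedAt;
    }
  }

  private final ConcurrentHashMap<TableName, Entry> cache = new ConcurrentHashMap<>();

  /** Returns cached state when fresh; otherwise fetches from hbase:meta once. */
  boolean isDisabled(TableName table, Function<TableName, Boolean> fetchFromMeta) {
    long now = System.currentTimeMillis();
    Entry e = cache.get(table);
    if (e == null || now - e.fetchedAt > TTL_MS) {
      e = new Entry(fetchFromMeta.apply(table), now);  // one meta read per TTL
      cache.put(table, e);
    }
    return e.disabled;  // retries within the TTL never touch meta
  }
}
{code}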



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16025) Cache table state to reduce load on META

2016-06-14 Thread Gary Helmling (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Helmling updated HBASE-16025:
--
Description: HBASE-12035 moved keeping table enabled/disabled state from 
ZooKeeper into hbase:meta.  When we retry operations on the client, we check 
table state in order to return a specific message if the table is disabled.  
This means that in master we will be going back to meta for every retry, even 
if a region's location has not changed.  This is going to cause performance 
issues when a cluster is already loaded, i.e. in cases where regionservers may 
be returning CallQueueTooBigException.  (was: HBASE-12035 moved keeping table 
enabled/disabled state from ZooKeeper into hbase:meta.  When we retry 
operations on the client, we check table state in order to return a specific 
message if the table is disabled.  This means that in master we will be going 
back to meta for every retry, even if a region's location has not changed.)

> Cache table state to reduce load on META
> 
>
> Key: HBASE-16025
> URL: https://issues.apache.org/jira/browse/HBASE-16025
> Project: HBase
>  Issue Type: Improvement
>  Components: Client
>Reporter: Gary Helmling
>Priority: Critical
> Fix For: 2.0.0
>
>
> HBASE-12035 moved keeping table enabled/disabled state from ZooKeeper into 
> hbase:meta.  When we retry operations on the client, we check table state in 
> order to return a specific message if the table is disabled.  This means that 
> in master we will be going back to meta for every retry, even if a region's 
> location has not changed.  This is going to cause performance issues when a 
> cluster is already loaded, i.e. in cases where regionservers may be returning 
> CallQueueTooBigException.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-16025) Cache table state to reduce load on META

2016-06-14 Thread Gary Helmling (JIRA)
Gary Helmling created HBASE-16025:
-

 Summary: Cache table state to reduce load on META
 Key: HBASE-16025
 URL: https://issues.apache.org/jira/browse/HBASE-16025
 Project: HBase
  Issue Type: Improvement
  Components: Client
Reporter: Gary Helmling
Priority: Critical
 Fix For: 2.0.0


HBASE-12035 moved keeping table enabled/disabled state from ZooKeeper into 
hbase:meta.  When we retry operations on the client, we check table state in 
order to return a specific message if the table is disabled.  This means that 
in master we will be going back to meta for every retry, even if a region's 
location has not changed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16021) graceful_stop.sh: Wrap variables in double quote to avoid "[: too many arguments" error

2016-06-14 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-16021:
---
  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Thanks for the patch, Samir.

> graceful_stop.sh: Wrap variables in double quote to avoid  "[: too many 
> arguments" error
> 
>
> Key: HBASE-16021
> URL: https://issues.apache.org/jira/browse/HBASE-16021
> Project: HBase
>  Issue Type: Bug
>  Components: scripts
>Affects Versions: 2.0.0
>Reporter: Samir Ahmic
>Assignee: Samir Ahmic
>Priority: Minor
> Fix For: 2.0.0
>
> Attachments: HBASE-16021_v1.patch
>
>
> In a few places in graceful_stop.sh there are if conditions where variables 
> are not double quoted, which may cause errors.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15406) Split / merge switch left disabled after early termination of hbck

2016-06-14 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15330785#comment-15330785
 ] 

Mikhail Antonov commented on HBASE-15406:
-

Linked a separate jira to revert it from 1.3, as I can't attach files to this 
one. Reviews?

> Split / merge switch left disabled after early termination of hbck
> --
>
> Key: HBASE-15406
> URL: https://issues.apache.org/jira/browse/HBASE-15406
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Heng Chen
>  Labels: reviewed
> Fix For: 2.0.0, 1.3.0, 1.4.0
>
> Attachments: HBASE-15406.patch, HBASE-15406.v1.patch, 
> HBASE-15406_v1.patch, HBASE-15406_v2.patch, test.patch, wip.patch
>
>
> This is what I did on a cluster with a 1.4.0-SNAPSHOT built on Thursday:
> Run 'hbase hbck -disableSplitAndMerge' on a gateway node of the cluster
> Terminate hbck early
> Enter the hbase shell, where I observed:
> {code}
> hbase(main):001:0> splitormerge_enabled 'SPLIT'
> false
> 0 row(s) in 0.3280 seconds
> hbase(main):002:0> splitormerge_enabled 'MERGE'
> false
> 0 row(s) in 0.0070 seconds
> {code}
> Expectation is that the split / merge switches should be restored to their 
> default values after hbck exits.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16024) Revert HBASE-15406 from branch-1.3

2016-06-14 Thread Mikhail Antonov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Antonov updated HBASE-16024:

Status: Patch Available  (was: Open)

> Revert HBASE-15406 from branch-1.3
> --
>
> Key: HBASE-16024
> URL: https://issues.apache.org/jira/browse/HBASE-16024
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.0
>Reporter: Mikhail Antonov
>Assignee: Mikhail Antonov
> Fix For: 1.3.0
>
> Attachments: HBASE-15406-reverted-from.branch-1.3.v1.patch
>
>
> As discussed there, this is a kind of controversial change, and more thought 
> is probably needed around it to get it right.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16024) Revert HBASE-15406 from branch-1.3

2016-06-14 Thread Mikhail Antonov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Antonov updated HBASE-16024:

Attachment: HBASE-15406-reverted-from.branch-1.3.v1.patch

It looks like I can't attach a patch to the original jira (not sure why - is it 
because I'm not the assignee, or because I'm missing from some ldap group, or 
some jira glitch), but here's the patch.

> Revert HBASE-15406 from branch-1.3
> --
>
> Key: HBASE-16024
> URL: https://issues.apache.org/jira/browse/HBASE-16024
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.0
>Reporter: Mikhail Antonov
>Assignee: Mikhail Antonov
> Fix For: 1.3.0
>
> Attachments: HBASE-15406-reverted-from.branch-1.3.v1.patch
>
>
> As discussed there, this is a kind of controversial change, and more thought 
> is probably needed around it to get it right.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-16024) Revert HBASE-15406 from branch-1.3

2016-06-14 Thread Mikhail Antonov (JIRA)
Mikhail Antonov created HBASE-16024:
---

 Summary: Revert HBASE-15406 from branch-1.3
 Key: HBASE-16024
 URL: https://issues.apache.org/jira/browse/HBASE-16024
 Project: HBase
  Issue Type: Bug
Affects Versions: 1.3.0
Reporter: Mikhail Antonov
Assignee: Mikhail Antonov
 Fix For: 1.3.0


As discussed there, this is a kind of controversial change, and more thought is 
probably needed around it to get it right.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-15335) Add composite key support in row key

2016-06-14 Thread Zhan Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhan Zhang updated HBASE-15335:
---
Status: Open  (was: Patch Available)

> Add composite key support in row key
> 
>
> Key: HBASE-15335
> URL: https://issues.apache.org/jira/browse/HBASE-15335
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Zhan Zhang
>Assignee: Zhan Zhang
> Attachments: HBASE-15335-1.patch, HBASE-15335-2.patch, 
> HBASE-15335-3.patch, HBASE-15335-4.patch, HBASE-15335-5.patch, 
> HBASE-15335-6.patch, HBASE-15335-7.patch
>
>
> Add composite key filter support in the connector.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

