[jira] [Commented] (HBASE-14280) Bulk Upload from HA cluster to remote HA hbase cluster fails

2015-09-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14877317#comment-14877317
 ] 

Hudson commented on HBASE-14280:


SUCCESS: Integrated in HBase-1.2-IT #157 (See 
[https://builds.apache.org/job/HBase-1.2-IT/157/])
HBASE-14280 Bulk Upload from HA cluster to remote HA hbase cluster fails (Ankit 
Singhal) (tedyu: rev 34032492d7984a865e956aa43689834a519b93ca)
* hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSHDFSUtils.java


> Bulk Upload from HA cluster to remote HA hbase cluster fails
> 
>
> Key: HBASE-14280
> URL: https://issues.apache.org/jira/browse/HBASE-14280
> Project: HBase
>  Issue Type: Bug
>  Components: hadoop2, regionserver
>Affects Versions: 0.98.4
>Reporter: Ankit Singhal
>Assignee: Ankit Singhal
>Priority: Minor
>  Labels: easyfix, patch
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.1.3
>
> Attachments: 14280-v4.patch, HBASE-14280_v1.0.patch, 
> HBASE-14280_v2.patch, HBASE-14280_v3.patch
>
>
> Caused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(java.io.IOException): java.io.IOException: Wrong FS: hdfs://ha-aggregation-nameservice1/hbase_upload/82c89692-6e78-46ef-bbea-c9e825318bfe/A/131358d641c69d6c34b803c187b0, expected: hdfs://ha-hbase-nameservice1
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2113)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108)
>   at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
>   at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.IllegalArgumentException: Wrong FS: hdfs://ha-aggregation-nameservice1/hbase_upload/82c89692-6e78-46ef-bbea-c9e825318bfe/A/131358d641c69d6c34b803c187b0, expected: hdfs://ha-hbase-nameservice1
>   at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:645)
>   at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:193)
>   at org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:105)
>   at org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:1136)
>   at org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:1132)
>   at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1132)
>   at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:414)
>   at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1423)
>   at org.apache.hadoop.hbase.regionserver.HRegionFileSystem.commitStoreFile(HRegionFileSystem.java:372)
>   at org.apache.hadoop.hbase.regionserver.HRegionFileSystem.bulkLoadStoreFile(HRegionFileSystem.java:451)
>   at org.apache.hadoop.hbase.regionserver.HStore.bulkLoadHFile(HStore.java:750)
>   at org.apache.hadoop.hbase.regionserver.HRegion.bulkLoadHFiles(HRegion.java:4894)
>   at org.apache.hadoop.hbase.regionserver.HRegion.bulkLoadHFiles(HRegion.java:4799)
>   at org.apache.hadoop.hbase.regionserver.HRegionServer.bulkLoadHFile(HRegionServer.java:3377)
>   at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29996)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2078)
>   ... 4 more
>   at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1498)
>   at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1684)
>   at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1737)
>   at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.bulkLoadHFile(ClientProtos.java:29276)
>   at org.apache.hadoop.hbase.protobuf.ProtobufUtil.bulkLoadHFile(ProtobufUtil.java:1548)
>   ... 11 more
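
For readers unfamiliar with the "Wrong FS" message in the trace above: it is raised when a path's scheme or authority does not match the file system it is handed to. The sketch below is a simplified stand-in for that comparison, not Hadoop's actual FileSystem.checkPath and not the patch attached to this issue. The class name WrongFsSketch and the shortened path are made up for illustration.

```java
import java.net.URI;

final class WrongFsSketch {
    /**
     * Simplified illustration of a scheme/authority check. This is NOT
     * Hadoop's FileSystem.checkPath; it only mirrors the idea.
     */
    static void checkPath(URI fsUri, URI path) {
        String scheme = path.getScheme();
        String authority = path.getAuthority();
        // A path without a scheme is relative to the file system: accept it.
        if (scheme == null) {
            return;
        }
        boolean sameScheme = scheme.equalsIgnoreCase(fsUri.getScheme());
        boolean sameAuthority = authority == null
                || authority.equalsIgnoreCase(fsUri.getAuthority());
        if (!(sameScheme && sameAuthority)) {
            throw new IllegalArgumentException(
                    "Wrong FS: " + path + ", expected: " + fsUri);
        }
    }

    public static void main(String[] args) {
        URI hbaseFs = URI.create("hdfs://ha-hbase-nameservice1");
        // Hypothetical shortened path on the other HA nameservice.
        URI staged = URI.create("hdfs://ha-aggregation-nameservice1/hbase_upload/A/hfile");
        try {
            checkPath(hbaseFs, staged);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this model the two HA nameservices differ only in their authority, which is enough to reject the staged file even though both are HDFS.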



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14448) Refine RegionGroupingProvider Phase-2: remove provider nesting and formalize wal group name

2015-09-19 Thread Yu Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Li updated HBASE-14448:
--
Attachment: HBASE-14448.patch

> Refine RegionGroupingProvider Phase-2: remove provider nesting and formalize 
> wal group name
> ---
>
> Key: HBASE-14448
> URL: https://issues.apache.org/jira/browse/HBASE-14448
> Project: HBase
>  Issue Type: Improvement
>Reporter: Yu Li
>Assignee: Yu Li
> Attachments: HBASE-14448.patch
>
>
> Currently we nest DefaultWALProvider inside RegionGroupingProvider, which 
> makes the logic ambiguous, since a "provider" itself should provide logs. 
> I suggest directly instantiating FSHLog in RegionGroupingProvider.
> W.r.t. the WAL group name, RegionGroupingProvider currently uses something like 
> "-null-", which is quite long and unnecessary. I suggest 
> directly using ".".
> For more details, please refer to the initial patch.





[jira] [Updated] (HBASE-14448) Refine RegionGroupingProvider Phase-2: remove provider nesting and formalize wal group name

2015-09-19 Thread Yu Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Li updated HBASE-14448:
--
Attachment: HBASE-14448_v2.patch

Uploaded a patch in sync with RB.

> Refine RegionGroupingProvider Phase-2: remove provider nesting and formalize 
> wal group name
> ---
>
> Key: HBASE-14448
> URL: https://issues.apache.org/jira/browse/HBASE-14448
> Project: HBase
>  Issue Type: Improvement
>Reporter: Yu Li
>Assignee: Yu Li
> Attachments: HBASE-14448.patch, HBASE-14448_v2.patch
>
>
> Currently we nest DefaultWALProvider inside RegionGroupingProvider, which 
> makes the logic ambiguous, since a "provider" itself should provide logs. 
> I suggest directly instantiating FSHLog in RegionGroupingProvider.
> W.r.t. the WAL group name, RegionGroupingProvider currently uses something like 
> "-null-", which is quite long and unnecessary. I suggest 
> directly using ".".
> For more details, please refer to the initial patch.





[jira] [Commented] (HBASE-14280) Bulk Upload from HA cluster to remote HA hbase cluster fails

2015-09-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14877322#comment-14877322
 ] 

Hudson commented on HBASE-14280:


FAILURE: Integrated in HBase-1.3 #186 (See 
[https://builds.apache.org/job/HBase-1.3/186/])
HBASE-14280 Bulk Upload from HA cluster to remote HA hbase cluster fails (Ankit 
Singhal) (tedyu: rev 566a20145c425cca03c968c1849b5052dbae705d)
* hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSHDFSUtils.java







[jira] [Updated] (HBASE-10000) Initiate lease recovery for outstanding WAL files at the very beginning of recovery

2015-09-19 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-10000:
---
   Resolution: Later
Fix Version/s: (was: 2.0.0)
   Status: Resolved  (was: Patch Available)

> Initiate lease recovery for outstanding WAL files at the very beginning of 
> recovery
> ---
>
> Key: HBASE-10000
> URL: https://issues.apache.org/jira/browse/HBASE-10000
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Ted Yu
>Assignee: Ted Yu
> Attachments: 1-0.96-v5.txt, 1-0.96-v6.txt, 
> 1-recover-ts-with-pb-2.txt, 1-recover-ts-with-pb-3.txt, 
> 1-recover-ts-with-pb-4.txt, 1-recover-ts-with-pb-5.txt, 
> 1-recover-ts-with-pb-6.txt, 1-recover-ts-with-pb-7.txt, 
> 1-recover-ts-with-pb-8.txt, 1-recover-ts-with-pb-8.txt, 1-v4.txt, 
> 1-v5.txt, 1-v6.txt
>
>
> At the beginning of recovery, master can send lease recovery requests 
> concurrently for outstanding WAL files using a thread pool.
> Each split worker would first check whether the WAL file it processes is 
> closed.
> Thanks to Nicolas Liochon and Jeffery, whose discussion gave rise to this 
> idea. 
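
The master-side half of the proposal above (concurrent lease recovery requests through a thread pool) could look roughly like the sketch below. It is only an illustration under stated assumptions: LeaseRecoverySketch, LeaseRecoverer and recoverAll are hypothetical names, not HBase or HDFS APIs, and the real recovery call is replaced by a callback.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class LeaseRecoverySketch {
    /** Hypothetical stand-in for whatever asks the file system to recover a lease. */
    interface LeaseRecoverer {
        boolean recoverLease(String walPath) throws Exception;
    }

    /**
     * Submits one lease recovery request per WAL file to a fixed-size pool and
     * returns the paths whose request reported success.
     */
    static List<String> recoverAll(List<String> walPaths, LeaseRecoverer recoverer, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        List<String> recovered = new ArrayList<>();
        try {
            List<Callable<Boolean>> tasks = new ArrayList<>();
            for (String path : walPaths) {
                tasks.add(() -> recoverer.recoverLease(path));
            }
            // invokeAll blocks until all tasks finish; futures keep task order.
            List<Future<Boolean>> results = pool.invokeAll(tasks);
            for (int i = 0; i < results.size(); i++) {
                try {
                    if (results.get(i).get()) {
                        recovered.add(walPaths.get(i));
                    }
                } catch (ExecutionException e) {
                    // A failed request is skipped; the file is simply not reported.
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for callers
        } finally {
            pool.shutdown();
        }
        return recovered;
    }

    public static void main(String[] args) {
        List<String> wals = Arrays.asList("wal-1", "wal-2", "wal-3");
        // Pretend that recovery succeeds for every file except "wal-2".
        System.out.println(recoverAll(wals, path -> !path.equals("wal-2"), 2));
    }
}
```

A file missing from the returned list is one a split worker would still find unclosed when it does its own check.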





[jira] [Commented] (HBASE-14407) NotServingRegion: hbase region closed forever

2015-09-19 Thread Shuaifeng Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14877385#comment-14877385
 ] 

Shuaifeng Zhou commented on HBASE-14407:


lgtm

Should the patch go to 0.98 and branch-1.1?

> NotServingRegion: hbase region closed forever
> -
>
> Key: HBASE-14407
> URL: https://issues.apache.org/jira/browse/HBASE-14407
> Project: HBase
>  Issue Type: Bug
>  Components: Region Assignment
>Affects Versions: 0.98.10, 1.2.0, 1.1.2, 1.3.0
>Reporter: Shuaifeng Zhou
>Assignee: Shuaifeng Zhou
>Priority: Critical
> Attachments: 14407-branch-1.2.patch, hbase-14407-0.98.patch, 
> hbase-14407-1.1.patch, hbase-14407-1.2.patch, hs4.log, master.log
>
>
> I found a situation that may cause a region to stay closed forever. It 
> happens often on my cluster, which runs 0.98.10, but 1.1.2 has the same 
> problem:
> 1. The master sends a region open request to the regionserver.
> 2. The rs starts a handler to open the region.
> 3. The rs returns a response to the master.
> 4. The master does not receive the response, or times out, and sends the 
> open region request again.
> 5. The rs has already opened the region.
> 6. The master runs processAlreadyOpenedRegion and updates the region state 
> to open in master memory.
> 7. The master receives the zk "region opened" message (late for some reason, 
> e.g. the network) and triggers an update of the region state to open, but 
> finds that the region is already open: ERROR!
> 8. The master sends a close region request, and the region stays closed forever.
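
The sequence reported above can be replayed with a toy state model. This is a sketch of the described race only; RegionOpenRaceSketch, MasterView and the State values are invented for illustration and do not correspond to HBase's assignment code.

```java
final class RegionOpenRaceSketch {
    enum State { OFFLINE, PENDING_OPEN, OPEN, CLOSED }

    /** Toy model of the master's in-memory view of one region; not HBase code. */
    static final class MasterView {
        State state = State.OFFLINE;

        /** Master sends an open request (also used for the retry). */
        void sendOpen() {
            if (state != State.OPEN) {
                state = State.PENDING_OPEN;
            }
        }

        /** The rs answers the retry with "already opened". */
        void onAlreadyOpenedReply() {
            state = State.OPEN;
        }

        /** The zk "region opened" notification reaches the master. */
        void onZkRegionOpened() {
            if (state == State.PENDING_OPEN) {
                state = State.OPEN; // the normal, in-order case
            } else {
                // The notification no longer matches the expected transition,
                // so this model treats it as an error and closes the region.
                state = State.CLOSED;
            }
        }
    }

    /** Replays the open sequence, optionally with the late zk notification. */
    static State replay(boolean zkEventArrivesLate) {
        MasterView master = new MasterView();
        master.sendOpen();
        if (zkEventArrivesLate) {
            master.sendOpen();             // retry after the missing response
            master.onAlreadyOpenedReply(); // master marks the region open
        }
        master.onZkRegionOpened();
        return master.state;
    }

    public static void main(String[] args) {
        System.out.println("in order: " + replay(false));
        System.out.println("late zk event: " + replay(true));
    }
}
```

In this model the in-order case ends with the region open, while the late notification ends with it closed, matching the report.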





[jira] [Commented] (HBASE-14280) Bulk Upload from HA cluster to remote HA hbase cluster fails

2015-09-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14877309#comment-14877309
 ] 

Hudson commented on HBASE-14280:


FAILURE: Integrated in HBase-1.1 #668 (See 
[https://builds.apache.org/job/HBase-1.1/668/])
HBASE-14280 Bulk Upload from HA cluster to remote HA hbase cluster fails (Ankit 
Singhal) (tedyu: rev 36200f1ea9ef2f02d7158e49c0f8c1e2ea290b9a)
* hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSHDFSUtils.java







[jira] [Commented] (HBASE-14280) Bulk Upload from HA cluster to remote HA hbase cluster fails

2015-09-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14877315#comment-14877315
 ] 

Hudson commented on HBASE-14280:


FAILURE: Integrated in HBase-1.2 #186 (See 
[https://builds.apache.org/job/HBase-1.2/186/])
HBASE-14280 Bulk Upload from HA cluster to remote HA hbase cluster fails (Ankit 
Singhal) (tedyu: rev 34032492d7984a865e956aa43689834a519b93ca)
* hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSHDFSUtils.java







[jira] [Commented] (HBASE-14280) Bulk Upload from HA cluster to remote HA hbase cluster fails

2015-09-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14877325#comment-14877325
 ] 

Hudson commented on HBASE-14280:


FAILURE: Integrated in HBase-TRUNK #6822 (See 
[https://builds.apache.org/job/HBase-TRUNK/6822/])
HBASE-14280 Bulk Upload from HA cluster to remote HA hbase cluster fails (Ankit 
Singhal) (tedyu: rev a7afc132e2db7ebbb571b43ae17161bff40b59c5)
* hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSHDFSUtils.java







[jira] [Commented] (HBASE-14453) HBaseAdmin#deleteTable should relocate META when cached location is stale

2015-09-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876930#comment-14876930
 ] 

Hudson commented on HBASE-14453:


FAILURE: Integrated in HBase-0.98 #1126 (See 
[https://builds.apache.org/job/HBase-0.98/1126/])
HBASE-14453 HBaseAdmin#deleteTable should relocate META when cached location is 
stale (apurtell: rev c4fa84965b48b1dae3ecd005ec80cac8872d1ece)
* hbase-client/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java


> HBaseAdmin#deleteTable should relocate META when cached location is stale
> -
>
> Key: HBASE-14453
> URL: https://issues.apache.org/jira/browse/HBASE-14453
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.14
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
> Fix For: 0.98.15
>
> Attachments: HBASE-14453-0.98.patch, HBASE-14453-0.98.patch
>
>
> After HBASE-14275, in HBaseAdmin#deleteTable, when using MetaReader to wait 
> until all regions are deleted, we won't attempt to relocate META should its 
> cached location be stale.





[jira] [Updated] (HBASE-14431) AsyncRpcClient#removeConnection() never removes connection from connections pool if server fails

2015-09-19 Thread Samir Ahmic (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Samir Ahmic updated HBASE-14431:

Attachment: HBASE-14431-v2.patch

Thanks for the review, [~tedyu]. Here is a new patch using a local variable. 

> AsyncRpcClient#removeConnection() never removes connection from connections 
> pool if server fails
> 
>
> Key: HBASE-14431
> URL: https://issues.apache.org/jira/browse/HBASE-14431
> Project: HBase
>  Issue Type: Bug
>  Components: IPC/RPC
>Affects Versions: 2.0.0, 1.0.2, 1.1.2
>Reporter: Samir Ahmic
>Assignee: Samir Ahmic
>Priority: Critical
> Attachments: HBASE-14431-v2.patch, HBASE-14431.patch
>
>
> I was playing with the master branch in distributed mode (3 rs + master + 
> backup_master) and noticed strange behavior when I was testing this sequence 
> of events on a single rs: /kill/start/run_balancer while a client was writing 
> data to the cluster (LoadTestTool).
> I noticed that LTT fails with the following:
> {code}
> 2015-09-09 11:05:58,364 INFO  [main] client.AsyncProcess: #2, waiting for 
> some tasks to finish. Expected max=0, tasksInProgress=35
> Exception in thread "main" 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 1 
> action: BindException: 1 time, 
>   at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:228)
>   at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$1800(AsyncProcess.java:208)
>   at 
> org.apache.hadoop.hbase.client.AsyncProcess.waitForAllPreviousOpsAndReset(AsyncProcess.java:1697)
>   at 
> org.apache.hadoop.hbase.client.BufferedMutatorImpl.backgroundFlushCommits(BufferedMutatorImpl.java:211)
> {code}
> After some digging and adding some more logging in the code, I noticed that 
> the following condition in {code}AsyncRpcClient.removeConnection(AsyncRpcChannel 
> connection) {code} is never true:
> {code}
> if (connectionInPool == connection) {
> {code} 
> As a result, the {code}AsyncRpcChannel{code} connection is never removed from the 
> {code}connections{code} pool when an rs fails.
> After changing this condition to:
> {code}
> if (connectionInPool.address.equals(connection.address)) {
> {code}
> the issue was resolved and the client removed the failed server from the 
> connections pool.
> I will attach a patch after running some more tests.
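
The difference between the two conditions quoted above can be reproduced in isolation. The sketch below assumes that the pool holds a different object for the same server address than the one passed to the removal method; the report does not say why the identity check fails, so that is an assumption. ConnectionPoolSketch and Channel are hypothetical stand-ins, not the real AsyncRpcClient or AsyncRpcChannel, and the host name and port are made up.

```java
import java.net.InetSocketAddress;
import java.util.HashMap;
import java.util.Map;

final class ConnectionPoolSketch {
    /** Hypothetical stand-in for a pooled channel; only the address matters here. */
    static final class Channel {
        final InetSocketAddress address;

        Channel(InetSocketAddress address) {
            this.address = address;
        }
    }

    private final Map<String, Channel> connections = new HashMap<>();

    void add(String key, Channel channel) {
        connections.put(key, channel);
    }

    int size() {
        return connections.size();
    }

    /** Removes the entry only if the pooled object IS the given object. */
    boolean removeByIdentity(String key, Channel connection) {
        Channel inPool = connections.get(key);
        if (inPool == connection) {
            connections.remove(key);
            return true;
        }
        return false;
    }

    /** Removes the entry if the pooled object has an equal address. */
    boolean removeByAddress(String key, Channel connection) {
        Channel inPool = connections.get(key);
        if (inPool != null && inPool.address.equals(connection.address)) {
            connections.remove(key);
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        ConnectionPoolSketch pool = new ConnectionPoolSketch();
        // createUnresolved avoids a DNS lookup; host and port are illustrative.
        pool.add("rs1", new Channel(InetSocketAddress.createUnresolved("rs1.example.org", 16020)));
        Channel failed = new Channel(InetSocketAddress.createUnresolved("rs1.example.org", 16020));

        System.out.println("identity check removed: " + pool.removeByIdentity("rs1", failed));
        System.out.println("address check removed:  " + pool.removeByAddress("rs1", failed));
    }
}
```

Comparing addresses is looser than comparing references: it also removes a pooled channel that was created independently for the same server.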





[jira] [Commented] (HBASE-14453) HBaseAdmin#deleteTable should relocate META when cached location is stale

2015-09-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14877011#comment-14877011
 ] 

Hudson commented on HBASE-14453:


FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #1080 (See 
[https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/1080/])
HBASE-14453 HBaseAdmin#deleteTable should relocate META when cached location is 
stale (apurtell: rev c4fa84965b48b1dae3ecd005ec80cac8872d1ece)
* hbase-client/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java







[jira] [Commented] (HBASE-14431) AsyncRpcClient#removeConnection() never removes connection from connections pool if server fails

2015-09-19 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14877053#comment-14877053
 ] 

Ted Yu commented on HBASE-14431:


+1

> AsyncRpcClient#removeConnection() never removes connection from connections 
> pool if server fails
> 
>
> Key: HBASE-14431
> URL: https://issues.apache.org/jira/browse/HBASE-14431
> Project: HBase
>  Issue Type: Bug
>  Components: IPC/RPC
>Affects Versions: 2.0.0, 1.0.2, 1.1.2
>Reporter: Samir Ahmic
>Assignee: Samir Ahmic
>Priority: Critical
> Attachments: HBASE-14431-v2.patch, HBASE-14431.patch
>
>
> I was playing with the master branch in distributed mode (3 rs + master +
> backup_master) and noticed strange behavior when testing this sequence
> of events on a single rs: /kill/start/run_balancer while a client was
> writing data to the cluster (LoadTestTool).
> I noticed that LTT fails with the following:
> {code}
> 2015-09-09 11:05:58,364 INFO  [main] client.AsyncProcess: #2, waiting for 
> some tasks to finish. Expected max=0, tasksInProgress=35
> Exception in thread "main" 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 1 
> action: BindException: 1 time, 
>   at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:228)
>   at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$1800(AsyncProcess.java:208)
>   at 
> org.apache.hadoop.hbase.client.AsyncProcess.waitForAllPreviousOpsAndReset(AsyncProcess.java:1697)
>   at 
> org.apache.hadoop.hbase.client.BufferedMutatorImpl.backgroundFlushCommits(BufferedMutatorImpl.java:211)
> {code}
> After some digging and adding more logging to the code, I noticed that the
> following condition in {code}AsyncRpcClient.removeConnection(AsyncRpcChannel
> connection) {code} is never true:
> {code}
> if (connectionInPool == connection) {
> {code} 
> As a result, the {code}AsyncRpcChannel{code} connection is never removed from
> the {code}connections{code} pool when an rs fails.
> After changing this condition to:
> {code}
> if (connectionInPool.address.equals(connection.address)) {
> {code}
> the issue was resolved and the client removed the failed server from the
> connections pool.
> I will attach a patch after running some more tests.
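
The identity-versus-equality problem described in the report can be illustrated with a minimal, self-contained Java sketch. This is not the HBase code: `ConnectionPoolSketch`, `Channel`, the integer pool key, and the host name are hypothetical stand-ins, and the sketch assumes that the failing channel passed to the remove method is a different object from the one stored in the pool (the report does not say why the identity check never matches).

```java
import java.net.InetSocketAddress;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; the names are illustrative, not the HBase classes.
class ConnectionPoolSketch {
    // Stand-in for AsyncRpcChannel: only the remote address matters here.
    static final class Channel {
        final InetSocketAddress address;
        Channel(InetSocketAddress address) { this.address = address; }
    }

    private final Map<Integer, Channel> connections = new HashMap<>();

    // Builds a channel for host:port without performing a DNS lookup.
    static Channel channel(String host, int port) {
        return new Channel(InetSocketAddress.createUnresolved(host, port));
    }

    void add(int key, Channel c) { connections.put(key, c); }

    int size() { return connections.size(); }

    // Reference comparison: matches only if both variables point to the same object.
    boolean removeByIdentity(int key, Channel connection) {
        Channel inPool = connections.get(key);
        if (inPool == connection) {
            connections.remove(key);
            return true;
        }
        return false;
    }

    // Address comparison: matches any channel that targets the same host and port.
    boolean removeByAddress(int key, Channel connection) {
        Channel inPool = connections.get(key);
        if (inPool != null && inPool.address.equals(connection.address)) {
            connections.remove(key);
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        ConnectionPoolSketch pool = new ConnectionPoolSketch();
        pool.add(1, channel("rs1.example.com", 16020));
        // A second object for the same server, as assumed in the lead-in.
        Channel failed = channel("rs1.example.com", 16020);
        System.out.println("removed by identity: " + pool.removeByIdentity(1, failed));
        System.out.println("removed by address:  " + pool.removeByAddress(1, failed));
    }
}
```

Comparing addresses removes whatever channel is pooled for that server, so it is only appropriate when at most one pooled channel per address is expected.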





[jira] [Updated] (HBASE-14431) AsyncRpcClient#removeConnection() never removes connection from connections pool if server fails

2015-09-19 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-14431:
---
 Hadoop Flags: Reviewed
Fix Version/s: 1.1.3
   1.3.0
   1.2.0
   2.0.0

> AsyncRpcClient#removeConnection() never removes connection from connections 
> pool if server fails
> 
>
> Key: HBASE-14431
> URL: https://issues.apache.org/jira/browse/HBASE-14431
> Project: HBase
>  Issue Type: Bug
>  Components: IPC/RPC
>Affects Versions: 2.0.0, 1.0.2, 1.1.2
>Reporter: Samir Ahmic
>Assignee: Samir Ahmic
>Priority: Critical
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.1.3
>
> Attachments: HBASE-14431-v2.patch, HBASE-14431.patch
>
>
> I was playing with the master branch in distributed mode (3 rs + master +
> backup_master) and noticed strange behavior when testing this sequence
> of events on a single rs: /kill/start/run_balancer while a client was
> writing data to the cluster (LoadTestTool).
> I noticed that LTT fails with the following:
> {code}
> 2015-09-09 11:05:58,364 INFO  [main] client.AsyncProcess: #2, waiting for 
> some tasks to finish. Expected max=0, tasksInProgress=35
> Exception in thread "main" 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 1 
> action: BindException: 1 time, 
>   at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:228)
>   at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$1800(AsyncProcess.java:208)
>   at 
> org.apache.hadoop.hbase.client.AsyncProcess.waitForAllPreviousOpsAndReset(AsyncProcess.java:1697)
>   at 
> org.apache.hadoop.hbase.client.BufferedMutatorImpl.backgroundFlushCommits(BufferedMutatorImpl.java:211)
> {code}
> After some digging and adding more logging to the code, I noticed that the
> following condition in {code}AsyncRpcClient.removeConnection(AsyncRpcChannel
> connection) {code} is never true:
> {code}
> if (connectionInPool == connection) {
> {code} 
> As a result, the {code}AsyncRpcChannel{code} connection is never removed from
> the {code}connections{code} pool when an rs fails.
> After changing this condition to:
> {code}
> if (connectionInPool.address.equals(connection.address)) {
> {code}
> the issue was resolved and the client removed the failed server from the
> connections pool.
> I will attach a patch after running some more tests.





[jira] [Updated] (HBASE-14280) Bulk Upload from HA cluster to remote HA hbase cluster fails

2015-09-19 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-14280:
---
Attachment: 14280-v4.patch

Patch v4 addresses checkstyle warnings.

> Bulk Upload from HA cluster to remote HA hbase cluster fails
> 
>
> Key: HBASE-14280
> URL: https://issues.apache.org/jira/browse/HBASE-14280
> Project: HBase
>  Issue Type: Bug
>  Components: hadoop2, regionserver
>Affects Versions: 0.98.4
>Reporter: Ankit Singhal
>Assignee: Ankit Singhal
>Priority: Minor
>  Labels: easyfix, patch
> Attachments: 14280-v4.patch, HBASE-14280_v1.0.patch, 
> HBASE-14280_v2.patch, HBASE-14280_v3.patch
>
>
> Caused by: 
> org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(java.io.IOException): 
> java.io.IOException: Wrong FS: 
> hdfs://ha-aggregation-nameservice1/hbase_upload/82c89692-6e78-46ef-bbea-c9e825318bfe/A/131358d641c69d6c34b803c187b0,
>  expected: hdfs://ha-hbase-nameservice1
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2113)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
>   at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.IllegalArgumentException: Wrong FS: 
> hdfs://ha-aggregation-nameservice1/hbase_upload/82c89692-6e78-46ef-bbea-c9e825318bfe/A/131358d641c69d6c34b803c187b0,
>  expected: hdfs://ha-hbase-nameservice1
>   at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:645)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:193)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:105)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:1136)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:1132)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1132)
>   at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:414)
>   at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1423)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionFileSystem.commitStoreFile(HRegionFileSystem.java:372)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionFileSystem.bulkLoadStoreFile(HRegionFileSystem.java:451)
>   at 
> org.apache.hadoop.hbase.regionserver.HStore.bulkLoadHFile(HStore.java:750)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.bulkLoadHFiles(HRegion.java:4894)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.bulkLoadHFiles(HRegion.java:4799)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.bulkLoadHFile(HRegionServer.java:3377)
>   at 
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29996)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2078)
>   ... 4 more
>   at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1498)
>   at 
> org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1684)
>   at 
> org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1737)
>   at 
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.bulkLoadHFile(ClientProtos.java:29276)
>   at 
> org.apache.hadoop.hbase.protobuf.ProtobufUtil.bulkLoadHFile(ProtobufUtil.java:1548)
>   ... 11 more
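
The "Wrong FS" message in the trace above comes from a check that a path belongs to the filesystem it is being used with. A simplified sketch of that kind of scheme-and-authority comparison, using only `java.net.URI`, is shown below. `WrongFsSketch`, `sameFileSystem`, and this `checkPath` are hypothetical helpers written for illustration; they are not Hadoop's `FileSystem.checkPath`, which also handles details such as default ports and canonical names, and the shortened path is made up.

```java
import java.net.URI;

// Hypothetical sketch of a scheme/authority check; not the Hadoop implementation.
class WrongFsSketch {
    // Returns true when the path can be served by the filesystem at fsUri.
    static boolean sameFileSystem(String fsUri, String path) {
        URI fs = URI.create(fsUri);
        URI p = URI.create(path);
        String scheme = p.getScheme();
        if (scheme == null) {
            return true; // unqualified path: assumed to belong to this filesystem
        }
        if (!scheme.equalsIgnoreCase(fs.getScheme())) {
            return false; // different scheme, e.g. file:// vs hdfs://
        }
        String authority = p.getAuthority();
        if (authority == null) {
            return true; // scheme only, no nameservice given
        }
        // Plain string comparison of the nameservice (authority) parts.
        return authority.equalsIgnoreCase(fs.getAuthority());
    }

    // Rejects a foreign path with a message shaped like the one in the trace.
    static void checkPath(String fsUri, String path) {
        if (!sameFileSystem(fsUri, path)) {
            throw new IllegalArgumentException(
                "Wrong FS: " + path + ", expected: " + fsUri);
        }
    }

    public static void main(String[] args) {
        String hbaseFs = "hdfs://ha-hbase-nameservice1";
        try {
            checkPath(hbaseFs, "hdfs://ha-aggregation-nameservice1/hbase_upload/A");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With two HA clusters, the authority is a logical nameservice name rather than a host and port, so a file staged on one nameservice fails this kind of check when handed to a filesystem object bound to the other.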





[jira] [Commented] (HBASE-14431) AsyncRpcClient#removeConnection() never removes connection from connections pool if server fails

2015-09-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14877141#comment-14877141
 ] 

Hadoop QA commented on HBASE-14431:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12761281/HBASE-14431-v2.patch
  against master branch at commit b0f52332651ecbb8af11557df5af3189c7283212.
  ATTACHMENT ID: 12761281

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0 2.7.1)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn post-site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   
org.apache.hadoop.hbase.client.TestSnapshotCloneIndependence

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15641//testReport/
Release Findbugs (version 2.0.3) warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15641//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15641//artifact/patchprocess/checkstyle-aggregate.html

  Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15641//console

This message is automatically generated.

> AsyncRpcClient#removeConnection() never removes connection from connections 
> pool if server fails
> 
>
> Key: HBASE-14431
> URL: https://issues.apache.org/jira/browse/HBASE-14431
> Project: HBase
>  Issue Type: Bug
>  Components: IPC/RPC
>Affects Versions: 2.0.0, 1.0.2, 1.1.2
>Reporter: Samir Ahmic
>Assignee: Samir Ahmic
>Priority: Critical
> Attachments: HBASE-14431-v2.patch, HBASE-14431.patch
>
>
> I was playing with the master branch in distributed mode (3 rs + master +
> backup_master) and noticed strange behavior when testing this sequence
> of events on a single rs: /kill/start/run_balancer while a client was
> writing data to the cluster (LoadTestTool).
> I noticed that LTT fails with the following:
> {code}
> 2015-09-09 11:05:58,364 INFO  [main] client.AsyncProcess: #2, waiting for 
> some tasks to finish. Expected max=0, tasksInProgress=35
> Exception in thread "main" 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 1 
> action: BindException: 1 time, 
>   at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:228)
>   at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$1800(AsyncProcess.java:208)
>   at 
> org.apache.hadoop.hbase.client.AsyncProcess.waitForAllPreviousOpsAndReset(AsyncProcess.java:1697)
>   at 
> org.apache.hadoop.hbase.client.BufferedMutatorImpl.backgroundFlushCommits(BufferedMutatorImpl.java:211)
> {code}
> After some digging and adding more logging to the code, I noticed that the
> following condition in {code}AsyncRpcClient.removeConnection(AsyncRpcChannel
> connection) {code} is never true:
> {code}
> if (connectionInPool == connection) {
> {code} 
> As a result, the {code}AsyncRpcChannel{code} connection is never removed from
> the {code}connections{code} pool when an rs fails.
> After changing this condition to:
> {code}
> if (connectionInPool.address.equals(connection.address)) {
> {code}
> the issue was resolved and the client removed the failed server from the
> connections pool.
> I will attach a patch after running some more tests.





[jira] [Updated] (HBASE-14431) AsyncRpcClient#removeConnection() never removes connection from connections pool if server fails

2015-09-19 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-14431:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Thanks for the patch, Samir

> AsyncRpcClient#removeConnection() never removes connection from connections 
> pool if server fails
> 
>
> Key: HBASE-14431
> URL: https://issues.apache.org/jira/browse/HBASE-14431
> Project: HBase
>  Issue Type: Bug
>  Components: IPC/RPC
>Affects Versions: 2.0.0, 1.0.2, 1.1.2
>Reporter: Samir Ahmic
>Assignee: Samir Ahmic
>Priority: Critical
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.1.3
>
> Attachments: HBASE-14431-v2.patch, HBASE-14431.patch
>
>
> I was playing with the master branch in distributed mode (3 rs + master +
> backup_master) and noticed strange behavior when testing this sequence
> of events on a single rs: /kill/start/run_balancer while a client was
> writing data to the cluster (LoadTestTool).
> I noticed that LTT fails with the following:
> {code}
> 2015-09-09 11:05:58,364 INFO  [main] client.AsyncProcess: #2, waiting for 
> some tasks to finish. Expected max=0, tasksInProgress=35
> Exception in thread "main" 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 1 
> action: BindException: 1 time, 
>   at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:228)
>   at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$1800(AsyncProcess.java:208)
>   at 
> org.apache.hadoop.hbase.client.AsyncProcess.waitForAllPreviousOpsAndReset(AsyncProcess.java:1697)
>   at 
> org.apache.hadoop.hbase.client.BufferedMutatorImpl.backgroundFlushCommits(BufferedMutatorImpl.java:211)
> {code}
> After some digging and adding more logging to the code, I noticed that the
> following condition in {code}AsyncRpcClient.removeConnection(AsyncRpcChannel
> connection) {code} is never true:
> {code}
> if (connectionInPool == connection) {
> {code} 
> As a result, the {code}AsyncRpcChannel{code} connection is never removed from
> the {code}connections{code} pool when an rs fails.
> After changing this condition to:
> {code}
> if (connectionInPool.address.equals(connection.address)) {
> {code}
> the issue was resolved and the client removed the failed server from the
> connections pool.
> I will attach a patch after running some more tests.





[jira] [Updated] (HBASE-14280) Bulk Upload from HA cluster to remote HA hbase cluster fails

2015-09-19 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-14280:
---
Fix Version/s: 1.1.3
   1.3.0
   1.2.0
   2.0.0

> Bulk Upload from HA cluster to remote HA hbase cluster fails
> 
>
> Key: HBASE-14280
> URL: https://issues.apache.org/jira/browse/HBASE-14280
> Project: HBase
>  Issue Type: Bug
>  Components: hadoop2, regionserver
>Affects Versions: 0.98.4
>Reporter: Ankit Singhal
>Assignee: Ankit Singhal
>Priority: Minor
>  Labels: easyfix, patch
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.1.3
>
> Attachments: 14280-v4.patch, HBASE-14280_v1.0.patch, 
> HBASE-14280_v2.patch, HBASE-14280_v3.patch
>
>
> Caused by: 
> org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(java.io.IOException): 
> java.io.IOException: Wrong FS: 
> hdfs://ha-aggregation-nameservice1/hbase_upload/82c89692-6e78-46ef-bbea-c9e825318bfe/A/131358d641c69d6c34b803c187b0,
>  expected: hdfs://ha-hbase-nameservice1
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2113)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
>   at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.IllegalArgumentException: Wrong FS: 
> hdfs://ha-aggregation-nameservice1/hbase_upload/82c89692-6e78-46ef-bbea-c9e825318bfe/A/131358d641c69d6c34b803c187b0,
>  expected: hdfs://ha-hbase-nameservice1
>   at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:645)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:193)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:105)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:1136)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:1132)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1132)
>   at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:414)
>   at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1423)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionFileSystem.commitStoreFile(HRegionFileSystem.java:372)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionFileSystem.bulkLoadStoreFile(HRegionFileSystem.java:451)
>   at 
> org.apache.hadoop.hbase.regionserver.HStore.bulkLoadHFile(HStore.java:750)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.bulkLoadHFiles(HRegion.java:4894)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.bulkLoadHFiles(HRegion.java:4799)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.bulkLoadHFile(HRegionServer.java:3377)
>   at 
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29996)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2078)
>   ... 4 more
>   at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1498)
>   at 
> org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1684)
>   at 
> org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1737)
>   at 
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.bulkLoadHFile(ClientProtos.java:29276)
>   at 
> org.apache.hadoop.hbase.protobuf.ProtobufUtil.bulkLoadHFile(ProtobufUtil.java:1548)
>   ... 11 more





[jira] [Commented] (HBASE-14431) AsyncRpcClient#removeConnection() never removes connection from connections pool if server fails

2015-09-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14877218#comment-14877218
 ] 

Hudson commented on HBASE-14431:


FAILURE: Integrated in HBase-1.1 #667 (See 
[https://builds.apache.org/job/HBase-1.1/667/])
HBASE-14431 AsyncRpcClient#removeConnection() never removes connection from 
connections pool if server fails (Samir Ahmic) (tedyu: rev 
911c4342ae66447d51ec05e25eeb3b6c4d348a22)
* hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/AsyncRpcClient.java
* hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/AsyncRpcChannel.java


> AsyncRpcClient#removeConnection() never removes connection from connections 
> pool if server fails
> 
>
> Key: HBASE-14431
> URL: https://issues.apache.org/jira/browse/HBASE-14431
> Project: HBase
>  Issue Type: Bug
>  Components: IPC/RPC
>Affects Versions: 2.0.0, 1.0.2, 1.1.2
>Reporter: Samir Ahmic
>Assignee: Samir Ahmic
>Priority: Critical
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.1.3
>
> Attachments: HBASE-14431-v2.patch, HBASE-14431.patch
>
>
> I was playing with the master branch in distributed mode (3 rs + master +
> backup_master) and noticed strange behavior when testing this sequence
> of events on a single rs: /kill/start/run_balancer while a client was
> writing data to the cluster (LoadTestTool).
> I noticed that LTT fails with the following:
> {code}
> 2015-09-09 11:05:58,364 INFO  [main] client.AsyncProcess: #2, waiting for 
> some tasks to finish. Expected max=0, tasksInProgress=35
> Exception in thread "main" 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 1 
> action: BindException: 1 time, 
>   at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:228)
>   at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$1800(AsyncProcess.java:208)
>   at 
> org.apache.hadoop.hbase.client.AsyncProcess.waitForAllPreviousOpsAndReset(AsyncProcess.java:1697)
>   at 
> org.apache.hadoop.hbase.client.BufferedMutatorImpl.backgroundFlushCommits(BufferedMutatorImpl.java:211)
> {code}
> After some digging and adding more logging to the code, I noticed that the
> following condition in {code}AsyncRpcClient.removeConnection(AsyncRpcChannel
> connection) {code} is never true:
> {code}
> if (connectionInPool == connection) {
> {code} 
> As a result, the {code}AsyncRpcChannel{code} connection is never removed from
> the {code}connections{code} pool when an rs fails.
> After changing this condition to:
> {code}
> if (connectionInPool.address.equals(connection.address)) {
> {code}
> the issue was resolved and the client removed the failed server from the
> connections pool.
> I will attach a patch after running some more tests.





[jira] [Commented] (HBASE-14431) AsyncRpcClient#removeConnection() never removes connection from connections pool if server fails

2015-09-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14877222#comment-14877222
 ] 

Hudson commented on HBASE-14431:


FAILURE: Integrated in HBase-1.3 #185 (See 
[https://builds.apache.org/job/HBase-1.3/185/])
HBASE-14431 AsyncRpcClient#removeConnection() never removes connection from 
connections pool if server fails (Samir Ahmic) (tedyu: rev 
88adccd553e4f70a0e5362d5ab5158f45d57d201)
* hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/AsyncRpcChannel.java
* hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/AsyncRpcClient.java


> AsyncRpcClient#removeConnection() never removes connection from connections 
> pool if server fails
> 
>
> Key: HBASE-14431
> URL: https://issues.apache.org/jira/browse/HBASE-14431
> Project: HBase
>  Issue Type: Bug
>  Components: IPC/RPC
>Affects Versions: 2.0.0, 1.0.2, 1.1.2
>Reporter: Samir Ahmic
>Assignee: Samir Ahmic
>Priority: Critical
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.1.3
>
> Attachments: HBASE-14431-v2.patch, HBASE-14431.patch
>
>
> I was playing with the master branch in distributed mode (3 rs + master +
> backup_master) and noticed strange behavior when testing this sequence
> of events on a single rs: /kill/start/run_balancer while a client was
> writing data to the cluster (LoadTestTool).
> I noticed that LTT fails with the following:
> {code}
> 2015-09-09 11:05:58,364 INFO  [main] client.AsyncProcess: #2, waiting for 
> some tasks to finish. Expected max=0, tasksInProgress=35
> Exception in thread "main" 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 1 
> action: BindException: 1 time, 
>   at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:228)
>   at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$1800(AsyncProcess.java:208)
>   at 
> org.apache.hadoop.hbase.client.AsyncProcess.waitForAllPreviousOpsAndReset(AsyncProcess.java:1697)
>   at 
> org.apache.hadoop.hbase.client.BufferedMutatorImpl.backgroundFlushCommits(BufferedMutatorImpl.java:211)
> {code}
> After some digging and adding more logging to the code, I noticed that the
> following condition in {code}AsyncRpcClient.removeConnection(AsyncRpcChannel
> connection) {code} is never true:
> {code}
> if (connectionInPool == connection) {
> {code} 
> As a result, the {code}AsyncRpcChannel{code} connection is never removed from
> the {code}connections{code} pool when an rs fails.
> After changing this condition to:
> {code}
> if (connectionInPool.address.equals(connection.address)) {
> {code}
> the issue was resolved and the client removed the failed server from the
> connections pool.
> I will attach a patch after running some more tests.





[jira] [Updated] (HBASE-14280) Bulk Upload from HA cluster to remote HA hbase cluster fails

2015-09-19 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-14280:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Thanks for the patch, Ankit

> Bulk Upload from HA cluster to remote HA hbase cluster fails
> 
>
> Key: HBASE-14280
> URL: https://issues.apache.org/jira/browse/HBASE-14280
> Project: HBase
>  Issue Type: Bug
>  Components: hadoop2, regionserver
>Affects Versions: 0.98.4
>Reporter: Ankit Singhal
>Assignee: Ankit Singhal
>Priority: Minor
>  Labels: easyfix, patch
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.1.3
>
> Attachments: 14280-v4.patch, HBASE-14280_v1.0.patch, 
> HBASE-14280_v2.patch, HBASE-14280_v3.patch
>
>
> Caused by: 
> org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(java.io.IOException): 
> java.io.IOException: Wrong FS: 
> hdfs://ha-aggregation-nameservice1/hbase_upload/82c89692-6e78-46ef-bbea-c9e825318bfe/A/131358d641c69d6c34b803c187b0,
>  expected: hdfs://ha-hbase-nameservice1
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2113)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
>   at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.IllegalArgumentException: Wrong FS: 
> hdfs://ha-aggregation-nameservice1/hbase_upload/82c89692-6e78-46ef-bbea-c9e825318bfe/A/131358d641c69d6c34b803c187b0,
>  expected: hdfs://ha-hbase-nameservice1
>   at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:645)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:193)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:105)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:1136)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:1132)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1132)
>   at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:414)
>   at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1423)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionFileSystem.commitStoreFile(HRegionFileSystem.java:372)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionFileSystem.bulkLoadStoreFile(HRegionFileSystem.java:451)
>   at 
> org.apache.hadoop.hbase.regionserver.HStore.bulkLoadHFile(HStore.java:750)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.bulkLoadHFiles(HRegion.java:4894)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.bulkLoadHFiles(HRegion.java:4799)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.bulkLoadHFile(HRegionServer.java:3377)
>   at 
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29996)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2078)
>   ... 4 more
>   at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1498)
>   at 
> org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1684)
>   at 
> org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1737)
>   at 
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.bulkLoadHFile(ClientProtos.java:29276)
>   at 
> org.apache.hadoop.hbase.protobuf.ProtobufUtil.bulkLoadHFile(ProtobufUtil.java:1548)
>   ... 11 more





[jira] [Commented] (HBASE-14280) Bulk Upload from HA cluster to remote HA hbase cluster fails

2015-09-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14877234#comment-14877234
 ] 

Hadoop QA commented on HBASE-14280:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12761294/14280-v4.patch
  against master branch at commit 1545e1ed8d68b780dca49084cf5d8173481f72c0.
  ATTACHMENT ID: 12761294

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0 2.7.1)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn post-site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
 

 {color:red}-1 core zombie tests{color}.  There are 1 zombie test(s):   
at 
org.apache.hadoop.hbase.coprocessor.TestMasterObserver.testTableOperations(TestMasterObserver.java:1365)

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15642//testReport/
Release Findbugs (version 2.0.3) warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15642//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15642//artifact/patchprocess/checkstyle-aggregate.html

  Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15642//console

This message is automatically generated.


[jira] [Commented] (HBASE-14280) Bulk Upload from HA cluster to remote HA hbase cluster fails

2015-09-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14877318#comment-14877318
 ] 

Hudson commented on HBASE-14280:


SUCCESS: Integrated in HBase-1.3-IT #168 (See 
[https://builds.apache.org/job/HBase-1.3-IT/168/])
HBASE-14280 Bulk Upload from HA cluster to remote HA hbase cluster fails (Ankit 
Singhal) (tedyu: rev 566a20145c425cca03c968c1849b5052dbae705d)
* hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSHDFSUtils.java


> Bulk Upload from HA cluster to remote HA hbase cluster fails
> 
>
> Key: HBASE-14280
> URL: https://issues.apache.org/jira/browse/HBASE-14280
> Project: HBase
> Issue Type: Bug
> Components: hadoop2, regionserver
> Affects Versions: 0.98.4
> Reporter: Ankit Singhal
> Assignee: Ankit Singhal
> Priority: Minor
> Labels: easyfix, patch
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.1.3
>
> Attachments: 14280-v4.patch, HBASE-14280_v1.0.patch, 
> HBASE-14280_v2.patch, HBASE-14280_v3.patch
>
>
> Caused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(java.io.IOException): java.io.IOException: Wrong FS: hdfs://ha-aggregation-nameservice1/hbase_upload/82c89692-6e78-46ef-bbea-c9e825318bfe/A/131358d641c69d6c34b803c187b0, expected: hdfs://ha-hbase-nameservice1
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2113)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108)
>   at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
>   at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.IllegalArgumentException: Wrong FS: hdfs://ha-aggregation-nameservice1/hbase_upload/82c89692-6e78-46ef-bbea-c9e825318bfe/A/131358d641c69d6c34b803c187b0, expected: hdfs://ha-hbase-nameservice1
>   at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:645)
>   at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:193)
>   at org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:105)
>   at org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:1136)
>   at org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:1132)
>   at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1132)
>   at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:414)
>   at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1423)
>   at org.apache.hadoop.hbase.regionserver.HRegionFileSystem.commitStoreFile(HRegionFileSystem.java:372)
>   at org.apache.hadoop.hbase.regionserver.HRegionFileSystem.bulkLoadStoreFile(HRegionFileSystem.java:451)
>   at org.apache.hadoop.hbase.regionserver.HStore.bulkLoadHFile(HStore.java:750)
>   at org.apache.hadoop.hbase.regionserver.HRegion.bulkLoadHFiles(HRegion.java:4894)
>   at org.apache.hadoop.hbase.regionserver.HRegion.bulkLoadHFiles(HRegion.java:4799)
>   at org.apache.hadoop.hbase.regionserver.HRegionServer.bulkLoadHFile(HRegionServer.java:3377)
>   at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29996)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2078)
>   ... 4 more
>   at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1498)
>   at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1684)
>   at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1737)
>   at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.bulkLoadHFile(ClientProtos.java:29276)
>   at org.apache.hadoop.hbase.protobuf.ProtobufUtil.bulkLoadHFile(ProtobufUtil.java:1548)
>   ... 11 more
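
The trace above shows the region server's file system rejecting a path that lives on a different nameservice. As a rough illustration of that kind of check (a simplified sketch using only java.net.URI, not Hadoop's actual FileSystem.checkPath and not the fix committed for this issue), the class name, method name, and shortened paths below are hypothetical:

```java
import java.net.URI;

// Hypothetical, simplified stand-in for a file system path check.
// It only compares scheme and authority; real implementations do more.
final class WrongFsSketch {

    /** Rejects fully qualified paths that belong to a different file system. */
    static void checkPath(URI fsUri, URI path) {
        String scheme = path.getScheme();
        if (scheme == null) {
            return; // unqualified path: treated as belonging to this file system
        }
        String authority = path.getAuthority();
        boolean sameScheme = scheme.equalsIgnoreCase(fsUri.getScheme());
        boolean sameAuthority = authority != null
                && authority.equalsIgnoreCase(fsUri.getAuthority());
        if (!(sameScheme && sameAuthority)) {
            throw new IllegalArgumentException(
                    "Wrong FS: " + path + ", expected: " + fsUri);
        }
    }

    public static void main(String[] args) {
        URI hbaseFs = URI.create("hdfs://ha-hbase-nameservice1");

        // Same nameservice: accepted.
        checkPath(hbaseFs, URI.create("hdfs://ha-hbase-nameservice1/hbase/f1"));

        // Different nameservice (shortened, illustrative path): rejected.
        try {
            checkPath(hbaseFs,
                    URI.create("hdfs://ha-aggregation-nameservice1/hbase_upload/f1"));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The point of the sketch is only that a path qualified with one nameservice cannot be handed to a file system object bound to another; the staging files in the report sit on ha-aggregation-nameservice1 while the region server expects ha-hbase-nameservice1.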



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14431) AsyncRpcClient#removeConnection() never removes connection from connections pool if server fails

2015-09-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14877211#comment-14877211
 ] 

Hudson commented on HBASE-14431:


FAILURE: Integrated in HBase-TRUNK #6821 (See 
[https://builds.apache.org/job/HBase-TRUNK/6821/])
HBASE-14431 AsyncRpcClient#removeConnection() never removes connection from 
connections pool if server fails (Samir Ahmic) (tedyu: rev 
1545e1ed8d68b780dca49084cf5d8173481f72c0)
* hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/AsyncRpcClient.java
* hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/AsyncRpcChannel.java


> AsyncRpcClient#removeConnection() never removes connection from connections 
> pool if server fails
> 
>
> Key: HBASE-14431
> URL: https://issues.apache.org/jira/browse/HBASE-14431
> Project: HBase
> Issue Type: Bug
> Components: IPC/RPC
> Affects Versions: 2.0.0, 1.0.2, 1.1.2
> Reporter: Samir Ahmic
> Assignee: Samir Ahmic
> Priority: Critical
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.1.3
>
> Attachments: HBASE-14431-v2.patch, HBASE-14431.patch
>
>
> I was testing the master branch in distributed mode (3 rs + master + 
> backup_master) and noticed strange behavior while running this sequence 
> of events on a single rs: /kill/start/run_balancer while a client was writing 
> data to the cluster (LoadTestTool).
> I noticed that LTT fails with the following:
> {code}
> 2015-09-09 11:05:58,364 INFO  [main] client.AsyncProcess: #2, waiting for some tasks to finish. Expected max=0, tasksInProgress=35
> Exception in thread "main" org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 1 action: BindException: 1 time, 
>   at org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:228)
>   at org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$1800(AsyncProcess.java:208)
>   at org.apache.hadoop.hbase.client.AsyncProcess.waitForAllPreviousOpsAndReset(AsyncProcess.java:1697)
>   at org.apache.hadoop.hbase.client.BufferedMutatorImpl.backgroundFlushCommits(BufferedMutatorImpl.java:211)
> {code}
> After some digging and adding more logging to the code, I noticed that the 
> following condition in {code}AsyncRpcClient.removeConnection(AsyncRpcChannel 
> connection){code} is never true:
> {code}
> if (connectionInPool == connection) {
> {code} 
> As a result, the {code}AsyncRpcChannel{code} connection is never removed from the 
> {code}connections{code} pool when an rs fails.
> After changing this condition to:
> {code}
> if (connectionInPool.address.equals(connection.address)) {
> {code}
> the issue was resolved and the client removed the failed server from the connections 
> pool.
> I will attach a patch after running some more tests.
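
To make the difference between the two conditions concrete, here is a minimal, self-contained sketch. It is not the AsyncRpcClient code: the class, the pool keyed by an int, and both remove methods are hypothetical, and it assumes the failing case is one where the caller holds a different channel instance for the same server address, which the report does not state explicitly.

```java
import java.net.InetSocketAddress;
import java.util.HashMap;
import java.util.Map;

// Hypothetical pool used only to contrast identity and address comparison.
final class ConnectionPoolSketch {

    /** Stand-in for a pooled channel; only the address matters here. */
    static final class Channel {
        final InetSocketAddress address;

        Channel(InetSocketAddress address) {
            this.address = address;
        }
    }

    private final Map<Integer, Channel> connections = new HashMap<>();

    void add(int key, Channel channel) {
        connections.put(key, channel);
    }

    int size() {
        return connections.size();
    }

    /** Removes the entry only if the pooled object is the very same instance. */
    boolean removeByIdentity(int key, Channel connection) {
        Channel inPool = connections.get(key);
        if (inPool == connection) {
            connections.remove(key);
            return true;
        }
        return false;
    }

    /** Removes the entry if the pooled channel points at the same address. */
    boolean removeByAddress(int key, Channel connection) {
        Channel inPool = connections.get(key);
        if (inPool != null && inPool.address.equals(connection.address)) {
            connections.remove(key);
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        ConnectionPoolSketch pool = new ConnectionPoolSketch();
        pool.add(1, new Channel(
                InetSocketAddress.createUnresolved("rs1.example.com", 16020)));

        // A distinct instance that refers to the same server address.
        Channel failed = new Channel(
                InetSocketAddress.createUnresolved("rs1.example.com", 16020));

        System.out.println("identity match: " + pool.removeByIdentity(1, failed));
        System.out.println("address match:  " + pool.removeByAddress(1, failed));
    }
}
```

Comparing by address trades precision for robustness: it removes whatever channel is pooled for that server, even if it is not the instance that failed, which may or may not be the desired behavior in a real client.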





[jira] [Commented] (HBASE-14431) AsyncRpcClient#removeConnection() never removes connection from connections pool if server fails

2015-09-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14877206#comment-14877206
 ] 

Hudson commented on HBASE-14431:


FAILURE: Integrated in HBase-1.2-IT #156 (See 
[https://builds.apache.org/job/HBase-1.2-IT/156/])
HBASE-14431 AsyncRpcClient#removeConnection() never removes connection from 
connections pool if server fails (Samir Ahmic) (tedyu: rev 
388e948dfedab59cfe8fe8cf42001fec0eb32cd3)
* hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/AsyncRpcClient.java
* hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/AsyncRpcChannel.java




[jira] [Commented] (HBASE-14431) AsyncRpcClient#removeConnection() never removes connection from connections pool if server fails

2015-09-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14877204#comment-14877204
 ] 

Hudson commented on HBASE-14431:


SUCCESS: Integrated in HBase-1.3-IT #167 (See 
[https://builds.apache.org/job/HBase-1.3-IT/167/])
HBASE-14431 AsyncRpcClient#removeConnection() never removes connection from 
connections pool if server fails (Samir Ahmic) (tedyu: rev 
88adccd553e4f70a0e5362d5ab5158f45d57d201)
* hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/AsyncRpcClient.java
* hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/AsyncRpcChannel.java




[jira] [Commented] (HBASE-14431) AsyncRpcClient#removeConnection() never removes connection from connections pool if server fails

2015-09-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14877214#comment-14877214
 ] 

Hudson commented on HBASE-14431:


FAILURE: Integrated in HBase-1.2 #185 (See 
[https://builds.apache.org/job/HBase-1.2/185/])
HBASE-14431 AsyncRpcClient#removeConnection() never removes connection from 
connections pool if server fails (Samir Ahmic) (tedyu: rev 
388e948dfedab59cfe8fe8cf42001fec0eb32cd3)
* hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/AsyncRpcClient.java
* hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/AsyncRpcChannel.java

