[jira] [Updated] (HBASE-9740) A corrupt HFile could cause endless attempts to assign the region without a chance of success

2014-01-21 Thread Ping (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-9740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ping updated HBASE-9740:


Affects Version/s: 0.94.16
 Release Note: A corrupt or missing HFile could cause endless 
attempts to assign the region without a chance of success
   Status: Patch Available  (was: Open)

> A corrupt HFile could cause endless attempts to assign the region without a 
> chance of success
> -
>
> Key: HBASE-9740
> URL: https://issues.apache.org/jira/browse/HBASE-9740
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.16
>Reporter: Aditya Kishore
>Assignee: Aditya Kishore
>
> As described in HBASE-9737, a corrupt HFile in a region could lead to an 
> assignment storm in the cluster, since the Master will keep trying to assign 
> the region to each region server, one after another, and none will 
> succeed.
> The region server, upon detecting such a scenario, should mark the region as 
> "RS_ZK_REGION_FAILED_ERROR" (or something to that effect) in ZooKeeper, 
> which should signal the Master to stop assigning the region until the error 
> has been resolved (via an HBase shell command, probably "assign"?).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-9740) A corrupt HFile could cause endless attempts to assign the region without a chance of success

2014-01-21 Thread Ping (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-9740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ping updated HBASE-9740:


Attachment: TestFailedAssignRegion.java
patch-9740_0.94.txt

Patch and test attached.

> A corrupt HFile could cause endless attempts to assign the region without a 
> chance of success
> -
>
> Key: HBASE-9740
> URL: https://issues.apache.org/jira/browse/HBASE-9740
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.16
>Reporter: Aditya Kishore
>Assignee: Aditya Kishore
> Attachments: TestFailedAssignRegion.java, patch-9740_0.94.txt
>
>


[jira] [Updated] (HBASE-9740) A corrupt HFile could cause endless attempts to assign the region without a chance of success

2014-01-21 Thread Ping (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-9740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ping updated HBASE-9740:


Attachment: (was: TestFailedAssignRegion.java)

> A corrupt HFile could cause endless attempts to assign the region without a 
> chance of success
> -
>
> Key: HBASE-9740
> URL: https://issues.apache.org/jira/browse/HBASE-9740
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.16
>Reporter: Aditya Kishore
>Assignee: Aditya Kishore
>


[jira] [Updated] (HBASE-9740) A corrupt HFile could cause endless attempts to assign the region without a chance of success

2014-01-21 Thread Ping (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-9740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ping updated HBASE-9740:


Attachment: (was: patch-9740_0.94.txt)

> A corrupt HFile could cause endless attempts to assign the region without a 
> chance of success
> -
>
> Key: HBASE-9740
> URL: https://issues.apache.org/jira/browse/HBASE-9740
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.16
>Reporter: Aditya Kishore
>Assignee: Aditya Kishore
>


[jira] [Updated] (HBASE-9740) A corrupt HFile could cause endless attempts to assign the region without a chance of success

2014-01-21 Thread Ping (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-9740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ping updated HBASE-9740:


Attachment: HBASE-9740_0.94.16.patch

> A corrupt HFile could cause endless attempts to assign the region without a 
> chance of success
> -
>
> Key: HBASE-9740
> URL: https://issues.apache.org/jira/browse/HBASE-9740
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.16
>Reporter: Aditya Kishore
>Assignee: Aditya Kishore
> Attachments: HBASE-9740_0.94.16.patch
>
>


[jira] [Updated] (HBASE-9740) A corrupt HFile could cause endless attempts to assign the region without a chance of success

2014-01-21 Thread Ping (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-9740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ping updated HBASE-9740:


Attachment: (was: HBASE-9740_0.94.16.patch)

> A corrupt HFile could cause endless attempts to assign the region without a 
> chance of success
> -
>
> Key: HBASE-9740
> URL: https://issues.apache.org/jira/browse/HBASE-9740
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.16
>Reporter: Aditya Kishore
>Assignee: Aditya Kishore
> Attachments: patch-9740_0.94.txt
>
>


[jira] [Updated] (HBASE-9740) A corrupt HFile could cause endless attempts to assign the region without a chance of success

2014-01-21 Thread Ping (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-9740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ping updated HBASE-9740:


Attachment: patch-9740_0.94.txt

> A corrupt HFile could cause endless attempts to assign the region without a 
> chance of success
> -
>
> Key: HBASE-9740
> URL: https://issues.apache.org/jira/browse/HBASE-9740
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.16
>Reporter: Aditya Kishore
>Assignee: Aditya Kishore
> Attachments: patch-9740_0.94.txt
>
>


[jira] [Updated] (HBASE-9740) A corrupt HFile could cause endless attempts to assign the region without a chance of success

2014-01-23 Thread Ping (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-9740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ping updated HBASE-9740:


Attachment: HBase-9749_0.94_v2.patch

This version includes a test.

> A corrupt HFile could cause endless attempts to assign the region without a 
> chance of success
> -
>
> Key: HBASE-9740
> URL: https://issues.apache.org/jira/browse/HBASE-9740
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.16
>Reporter: Aditya Kishore
>Assignee: Aditya Kishore
> Attachments: HBase-9749_0.94_v2.patch, patch-9740_0.94.txt
>
>


[jira] [Updated] (HBASE-9740) A corrupt HFile could cause endless attempts to assign the region without a chance of success

2014-01-23 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-9740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-9740:
-

Fix Version/s: 0.94.17

> A corrupt HFile could cause endless attempts to assign the region without a 
> chance of success
> -
>
> Key: HBASE-9740
> URL: https://issues.apache.org/jira/browse/HBASE-9740
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.16
>Reporter: Aditya Kishore
>Assignee: Aditya Kishore
> Fix For: 0.94.17
>
> Attachments: HBase-9749_0.94_v2.patch, patch-9740_0.94.txt
>
>


[jira] [Updated] (HBASE-9740) A corrupt HFile could cause endless attempts to assign the region without a chance of success

2014-01-23 Thread Ping (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-9740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ping updated HBASE-9740:


Attachment: HBase-9749_0.94_v3.patch

Formatted the code and added the license header. Please review, thanks.

> A corrupt HFile could cause endless attempts to assign the region without a 
> chance of success
> -
>
> Key: HBASE-9740
> URL: https://issues.apache.org/jira/browse/HBASE-9740
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.16
>Reporter: Aditya Kishore
>Assignee: Aditya Kishore
> Fix For: 0.94.17
>
> Attachments: HBase-9749_0.94_v2.patch, HBase-9749_0.94_v3.patch, 
> patch-9740_0.94.txt
>
>


[jira] [Updated] (HBASE-9740) A corrupt HFile could cause endless attempts to assign the region without a chance of success

2014-02-05 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-9740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-9740:
-

Fix Version/s: (was: 0.94.17)
   0.94.18

Lemme push this to 0.94.18.

> A corrupt HFile could cause endless attempts to assign the region without a 
> chance of success
> -
>
> Key: HBASE-9740
> URL: https://issues.apache.org/jira/browse/HBASE-9740
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.16
>Reporter: Aditya Kishore
>Assignee: Aditya Kishore
> Fix For: 0.94.18
>
> Attachments: HBase-9749_0.94_v2.patch, HBase-9749_0.94_v3.patch, 
> patch-9740_0.94.txt
>
>


[jira] [Updated] (HBASE-9740) A corrupt HFile could cause endless attempts to assign the region without a chance of success

2014-03-13 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-9740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-9740:
-

Assignee: Ping  (was: Aditya Kishore)

> A corrupt HFile could cause endless attempts to assign the region without a 
> chance of success
> -
>
> Key: HBASE-9740
> URL: https://issues.apache.org/jira/browse/HBASE-9740
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.16
>Reporter: Aditya Kishore
>Assignee: Ping
> Fix For: 0.94.18
>
> Attachments: HBase-9749_0.94_v2.patch, HBase-9749_0.94_v3.patch, 
> patch-9740_0.94.txt
>
>



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HBASE-9740) A corrupt HFile could cause endless attempts to assign the region without a chance of success

2014-03-13 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-9740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-9740:
-

Fix Version/s: (was: 0.94.18)
   0.94.19

> A corrupt HFile could cause endless attempts to assign the region without a 
> chance of success
> -
>
> Key: HBASE-9740
> URL: https://issues.apache.org/jira/browse/HBASE-9740
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.16
>Reporter: Aditya Kishore
>Assignee: Ping
> Fix For: 0.94.19
>
> Attachments: HBase-9749_0.94_v2.patch, HBase-9749_0.94_v3.patch, 
> patch-9740_0.94.txt
>
>


[jira] [Updated] (HBASE-9740) A corrupt HFile could cause endless attempts to assign the region without a chance of success

2014-03-18 Thread Ping (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-9740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ping updated HBASE-9740:


Attachment: HBase-9740_0.94_v4.patch

Hi Hofhansl, thanks for your suggestions. I changed the HashMap to a 
ConcurrentHashMap and replaced Integer with AtomicInteger for its values.
Please review.
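
For readers without the patch at hand, the change described above can be sketched roughly as follows. This is only an illustration of a ConcurrentHashMap with AtomicInteger values used to count failed open attempts per region. The class name FailedAssignTracker, its methods, and the maxAttempts limit are hypothetical and are not taken from the attached patch.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper for illustration only; not the code in the patch.
class FailedAssignTracker {
    // Region name -> number of failed open attempts seen so far.
    private final ConcurrentMap<String, AtomicInteger> failures =
        new ConcurrentHashMap<String, AtomicInteger>();
    // Assumed limit after which the caller would stop retrying.
    private final int maxAttempts;

    FailedAssignTracker(int maxAttempts) {
        this.maxAttempts = maxAttempts;
    }

    // Records one failed attempt; returns true once the limit is reached.
    boolean recordFailure(String regionName) {
        AtomicInteger counter = failures.get(regionName);
        if (counter == null) {
            AtomicInteger fresh = new AtomicInteger();
            // putIfAbsent returns the existing counter if another thread won.
            counter = failures.putIfAbsent(regionName, fresh);
            if (counter == null) {
                counter = fresh;
            }
        }
        return counter.incrementAndGet() >= maxAttempts;
    }

    // Clears the count, e.g. after an operator re-assigns the region.
    void reset(String regionName) {
        failures.remove(regionName);
    }

    // Current failure count for a region (0 if none recorded).
    int failureCount(String regionName) {
        AtomicInteger counter = failures.get(regionName);
        return counter == null ? 0 : counter.get();
    }

    public static void main(String[] args) {
        FailedAssignTracker tracker = new FailedAssignTracker(3);
        for (int i = 0; i < 3; i++) {
            System.out.println(tracker.recordFailure("region-a"));
        }
    }
}
```

With a plain HashMap and Integer values, two handler threads incrementing the same region's count could lose an update. The putIfAbsent plus incrementAndGet combination avoids that without a lock.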

> A corrupt HFile could cause endless attempts to assign the region without a 
> chance of success
> -
>
> Key: HBASE-9740
> URL: https://issues.apache.org/jira/browse/HBASE-9740
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.16
>Reporter: Aditya Kishore
>Assignee: Ping
> Fix For: 0.94.19
>
> Attachments: HBase-9740_0.94_v4.patch, HBase-9749_0.94_v2.patch, 
> HBase-9749_0.94_v3.patch, patch-9740_0.94.txt
>
>


[jira] [Updated] (HBASE-9740) A corrupt HFile could cause endless attempts to assign the region without a chance of success

2014-04-21 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-9740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-9740:
-

Fix Version/s: (was: 0.94.19)
   0.94.20

I am going to push this one more (hopefully last) time.
The issue is that as long as we have the region in transition, we know we have 
something to do. With this patch we lose that signal after a while, and that 
would be a behavior change in 0.94.
I'm not saying it's wrong, but I would like to discuss this aspect.

> A corrupt HFile could cause endless attempts to assign the region without a 
> chance of success
> -
>
> Key: HBASE-9740
> URL: https://issues.apache.org/jira/browse/HBASE-9740
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.16
>Reporter: Aditya Kishore
>Assignee: Ping
> Fix For: 0.94.20
>
> Attachments: HBase-9740_0.94_v4.patch, HBase-9749_0.94_v2.patch, 
> HBase-9749_0.94_v3.patch, patch-9740_0.94.txt
>
>


[jira] [Updated] (HBASE-9740) A corrupt HFile could cause endless attempts to assign the region without a chance of success

2014-05-12 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-9740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-9740:
-

   Resolution: Won't Fix
Fix Version/s: (was: 0.94.20)
   Status: Resolved  (was: Patch Available)

After pushing this around for a few releases... Do we need this in 0.94? It's 
not clear how anybody would find out about these regions. Currently it is clear 
that something is wrong.
We can resurrect this if it is important to have it in 0.94.

> A corrupt HFile could cause endless attempts to assign the region without a 
> chance of success
> -
>
> Key: HBASE-9740
> URL: https://issues.apache.org/jira/browse/HBASE-9740
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.16
>Reporter: Aditya Kishore
>Assignee: Ping
> Attachments: HBase-9740_0.94_v4.patch, HBase-9749_0.94_v2.patch, 
> HBase-9749_0.94_v3.patch, patch-9740_0.94.txt
>
>