[jira] [Updated] (HBASE-11667) Simplify ClientScanner logic for NSREs.
[ https://issues.apache.org/jira/browse/HBASE-11667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-11667:
---
    Status: Open (was: Patch Available)

Simplify ClientScanner logic for NSREs.
---
    Key: HBASE-11667
    URL: https://issues.apache.org/jira/browse/HBASE-11667
    Project: HBase
    Issue Type: Improvement
    Reporter: Lars Hofhansl
    Assignee: Lars Hofhansl
    Fix For: 0.99.0, 2.0.0, 0.94.23, 0.98.6
    Attachments: 11667-0.94.txt, 11667-doc-0.94.txt, 11667-trunk.txt, HBASE-11667-0.98.patch, IntegrationTestBigLinkedListWithRegionMovement.patch

We ran into an issue with Phoenix where a RegionObserver coprocessor intercepts a scan and returns an aggregate (in this case a count) with a fake row key. It turns out this does not work when the {{ClientScanner}} encounters NSREs, as it uses the last key it saw to reset the scanner and try again (which in this case would be the fake key).

While this is arguably a rare case and one could also argue that a region observer just shouldn't do this...

While looking at {{ClientScanner}}'s code I found that this logic is not necessary. An NSRE occurs because we contacted a region server with a key that it no longer hosts. This is the start key, so it is always correct to retry with this same key. That simplifies the {{ClientScanner}} logic and also makes this sort of coprocessor possible.

--
This message was sent by Atlassian JIRA (v6.2#6252)
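To illustrate the retry rule described in the issue, here is a minimal, self-contained sketch. It is not the actual {{ClientScanner}} code and uses no HBase APIs: `ToyCluster`, `ToyNsre`, `AggregateResult` and `countAll` are hypothetical names made up for this example. Each toy region answers a scan with a single aggregate under a fake row key, and the client retries a failed request with the same start key it just sent, so the fake key never influences the retry.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeMap;

public class NsreRetrySketch {

    /** Hypothetical stand-in for an NSRE; not the HBase exception class. */
    static class ToyNsre extends RuntimeException {
        ToyNsre(String msg) { super(msg); }
    }

    /** What the toy coprocessor returns per region: a count under a fake row key. */
    static class AggregateResult {
        final String fakeRowKey;   // not a real row; must never drive a retry
        final long count;
        final String regionEndKey; // "" marks the last region of the table
        AggregateResult(String fakeRowKey, long count, String regionEndKey) {
            this.fakeRowKey = fakeRowKey;
            this.count = count;
            this.regionEndKey = regionEndKey;
        }
    }

    /** Toy cluster: regions indexed by start key, with injectable failures. */
    static class ToyCluster {
        final TreeMap<String, AggregateResult> regionsByStartKey = new TreeMap<>();
        final Set<Integer> failOnCalls = new HashSet<>();     // 1-based call numbers that fail
        final List<String> requestedKeys = new ArrayList<>(); // start key of every request
        int calls = 0;

        void addRegion(String startKey, String endKey, long rowCount) {
            regionsByStartKey.put(startKey,
                new AggregateResult("fake-count-key", rowCount, endKey));
        }

        AggregateResult scanRegion(String startKey) {
            calls++;
            requestedKeys.add(startKey);
            if (failOnCalls.contains(calls)) {
                throw new ToyNsre("region for key '" + startKey + "' is not served here");
            }
            // The region whose start key is the greatest one <= the requested key.
            return regionsByStartKey.floorEntry(startKey).getValue();
        }
    }

    /** Sums the per-region counts, retrying a failed request with the same start key. */
    static long countAll(ToyCluster cluster, int maxRetries) {
        long total = 0;
        String startKey = ""; // "" = start of the table
        int retries = 0;
        while (true) {
            AggregateResult r;
            try {
                r = cluster.scanRegion(startKey);
            } catch (ToyNsre e) {
                if (++retries > maxRetries) {
                    throw e;
                }
                continue; // startKey is unchanged: retry exactly the request that failed
            }
            retries = 0;
            total += r.count;
            if (r.regionEndKey.isEmpty()) {
                return total;
            }
            startKey = r.regionEndKey; // advance by region boundary, not by r.fakeRowKey
        }
    }

    /** Three regions: ["", "g"), ["g", "p"), ["p", "") holding 3, 4 and 5 rows. */
    static ToyCluster demoCluster() {
        ToyCluster c = new ToyCluster();
        c.addRegion("", "g", 3);
        c.addRegion("g", "p", 4);
        c.addRegion("p", "", 5);
        return c;
    }

    public static void main(String[] args) {
        ToyCluster cluster = demoCluster();
        cluster.failOnCalls.add(2); // the second and third requests hit a moved region
        cluster.failOnCalls.add(3);
        System.out.println("total = " + countAll(cluster, 5));
        System.out.println("requested start keys = " + cluster.requestedKeys);
    }
}
```

In this toy model the client advances using the region end key, so a retry never needs the last row key it saw. Whether the same holds for every failure mode in the real scanner (for example an NSRE in the middle of a region) is exactly the kind of corner case the patch review needs to check.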
[jira] [Updated] (HBASE-11667) Simplify ClientScanner logic for NSREs.
Lars Hofhansl updated HBASE-11667:
---
    Attachment: 11667-doc-0.94.txt

How is this for a comment? (0.94)
[jira] [Updated] (HBASE-11667) Simplify ClientScanner logic for NSREs.
Lars Hofhansl updated HBASE-11667:
---
    Priority: Minor (was: Major)
[jira] [Updated] (HBASE-11667) Simplify ClientScanner logic for NSREs.
Lars Hofhansl updated HBASE-11667:
---
    Attachment: 11667-0.94.txt

Here's a *proposal* against 0.94. Please have a very careful look, as there might be corner conditions lurking that I have not seen. I ran TestFromClientSide (including the new test I added) and it worked fine. Since we retry with the previous key that caused the failure, all the skipFirst huh-hah just goes away, which is nice.
[jira] [Updated] (HBASE-11667) Simplify ClientScanner logic for NSREs.
Andrew Purtell updated HBASE-11667:
---
    Fix Version/s: (was: 0.98.6) 0.98.5
[jira] [Updated] (HBASE-11667) Simplify ClientScanner logic for NSREs.
Lars Hofhansl updated HBASE-11667:
---
    Status: Patch Available (was: Open)

Let's try Hadoop QA.
[jira] [Updated] (HBASE-11667) Simplify ClientScanner logic for NSREs.
Lars Hofhansl updated HBASE-11667:
---
    Attachment: 11667-trunk.txt

And a trunk version. The logic here was more complicated due to region replicas. {{TestRegionReplicas}} fails locally for me with or without the patch, so I'm not sure about it. [~enis] and whoever knows about region replicas (maybe [~jeffreyz]?), please have a careful look. The simplification of this code would be nice if it is correct.
[jira] [Updated] (HBASE-11667) Simplify ClientScanner logic for NSREs.
Andrew Purtell updated HBASE-11667:
---
    Attachment: HBASE-11667-0.98.patch

Attaching a patch for 0.98. Working on an integration test now.
[jira] [Updated] (HBASE-11667) Simplify ClientScanner logic for NSREs.
Andrew Purtell updated HBASE-11667:
---
    Fix Version/s: (was: 0.98.5) 0.98.6
[jira] [Updated] (HBASE-11667) Simplify ClientScanner logic for NSREs.
Lars Hofhansl updated HBASE-11667:
---
    Attachment: (was: 11667-trunk.txt)
[jira] [Updated] (HBASE-11667) Simplify ClientScanner logic for NSREs.
Lars Hofhansl updated HBASE-11667:
---
    Attachment: 11667-trunk.txt

Reattaching trunk patch to make sure Hadoop QA picks up the last attachment (is that still necessary?)
[jira] [Updated] (HBASE-11667) Simplify ClientScanner logic for NSREs.
Andrew Purtell updated HBASE-11667:
---
    Attachment: IntegrationTestBigLinkedListWithRegionMovement.patch

Attached as IntegrationTestBigLinkedListWithRegionMovement, an integration test that extends ITBLL with a fixed monkey policy that moves a random region of the table every second.
[jira] [Updated] (HBASE-11667) Simplify ClientScanner logic for NSREs.
Andrew Purtell updated HBASE-11667:
---
    Attachment: (was: IntegrationTestBigLinkedListWithRegionMovement.patch)
[jira] [Updated] (HBASE-11667) Simplify ClientScanner logic for NSREs.
Andrew Purtell updated HBASE-11667:
---
    Attachment: IntegrationTestBigLinkedListWithRegionMovement.patch