[jira] [Commented] (PHOENIX-3176) Rows will be skipped which are having future timestamp in row_timestamp column
[ https://issues.apache.org/jira/browse/PHOENIX-3176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15418185#comment-15418185 ] James Taylor commented on PHOENIX-3176: --- Looks like the RENEW_LEASE feature is impacting this. If that feature is disabled, an infinite loop occurs as it did before when running UpsertSelectAutoCommitIT#testUpsertSelectDoesntSeeUpsertedData. This statement won't actually run on the server because of the usage of the sequence. I think it's working now because we have a single scan running instead of the separate scans per chunk as before. It'd likely still be a problem if a split occurs while the scan is running, though. It might not cause an infinite loop, but if the select starts seeing the rows from the upsert, the results will be wrong. How about adapting that test to force a split while it's running, [~an...@apache.org] or [~samarthjain]? This would be a pretty fundamental change, so we need to be careful with it. Also, if I change that test to not auto commit, but instead commit after the upsert statements, it fails. That may just mean the count we're returning for the UPSERT call is wrong, but that'd be a bug too. I filed PHOENIX-3178 for that. Do you see anything wrong with that test, [~samarthjain]? > Rows will be skipped which are having future timestamp in row_timestamp column > -- > > Key: PHOENIX-3176 > URL: https://issues.apache.org/jira/browse/PHOENIX-3176 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.6.0 >Reporter: Ankit Singhal > Fix For: 4.8.1 > > Attachments: PHOENIX-3176.patch > > > Rows will be skipped when row_timestamp have future timestamp > {code} > : jdbc:phoenix:localhost> CREATE TABLE historian.data ( > . . . . . . . . . . . . .> assetid unsigned_int not null, > . . . . . . . . . . . . .> metricid unsigned_int not null, > . . . . . . . . . . . . .> ts timestamp not null, > . . . . . . . . . . . . .> val double > . . . . . . . . . . . . .> CONSTRAINT pk PRIMARY KEY (assetid, metricid, ts > row_timestamp)) > . . . . . . . . . . . . .> IMMUTABLE_ROWS=true; > No rows affected (1.283 seconds) > 0: jdbc:phoenix:localhost> upsert into historian.data > values(1,2,'2015-01-01',1.2); > 1 row affected (0.047 seconds) > 0: jdbc:phoenix:localhost> upsert into historian.data > values(1,2,'2018-01-01',1.2); > 1 row affected (0.005 seconds) > 0: jdbc:phoenix:localhost> select * from historian.data; > +--+---+--+--+ > | ASSETID | METRICID |TS| VAL | > +--+---+--+--+ > | 1| 2 | 2015-01-01 00:00:00.000 | 1.2 | > +--+---+--+--+ > 1 row selected (0.04 seconds) > 0: jdbc:phoenix:localhost> select count(*) from historian.data; > +---+ > | COUNT(1) | > +---+ > | 1 | > +---+ > 1 row selected (0.013 seconds) > {code} > Explain plan, where scan range is capped to compile time. > {code} > | CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER HISTORIAN.DATA | > | ROW TIMESTAMP FILTER [0, 1470901929982) | > | SERVER FILTER BY FIRST KEY ONLY | > | SERVER AGGREGATE INTO SINGLE ROW | > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
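A minimal sketch of the kind of adaptation proposed in the comment above (force a split while testUpsertSelectDoesntSeeUpsertedData is running). This is illustrative only, not the committed test change: it assumes the connection, table, and "keys" sequence already set up by the test, that an HBase Admin is reachable through the Phoenix connection's query services, and that expectedCount is a placeholder for the count the unsplit run produces.
{code}
// Illustrative sketch only -- not the committed test change. Admin.split() returns
// immediately and the split completes asynchronously, so it races with the scan
// driven by the UPSERT SELECT below.
PhoenixConnection pconn = conn.unwrap(PhoenixConnection.class);
try (Admin admin = pconn.getQueryServices().getAdmin()) {
    admin.split(TableName.valueOf(tableName));
    int upsertCount = conn.createStatement().executeUpdate(
        "UPSERT INTO " + tableName + " SELECT NEXT VALUE FOR keys, val FROM " + tableName);
    // Even with a region split in flight, the statement should only see rows that
    // existed when it was compiled, so the returned count should be unchanged.
    assertEquals(expectedCount, upsertCount);   // expectedCount is a placeholder
}
{code}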
[jira] [Created] (PHOENIX-3178) Row count incorrect for UPSERT SELECT when auto commit is false
James Taylor created PHOENIX-3178: - Summary: Row count incorrect for UPSERT SELECT when auto commit is false Key: PHOENIX-3178 URL: https://issues.apache.org/jira/browse/PHOENIX-3178 Project: Phoenix Issue Type: Bug Reporter: James Taylor To reproduce, use the following test:
{code}
@Test
public void testRowCountWithNoAutoCommitOnUpsertSelect() throws Exception {
    Properties props = PropertiesUtil.deepCopy(TEST_PROPERTIES);
    props.setProperty(QueryServices.MUTATE_BATCH_SIZE_ATTRIB, Integer.toString(3));
    props.setProperty(QueryServices.SCAN_CACHE_SIZE_ATTRIB, Integer.toString(3));
    props.setProperty(QueryServices.SCAN_RESULT_CHUNK_SIZE, Integer.toString(3));
    Connection conn = DriverManager.getConnection(getUrl(), props);
    conn.setAutoCommit(false);
    conn.createStatement().execute("CREATE SEQUENCE keys");
    String tableName = generateRandomString();
    conn.createStatement().execute(
        "CREATE TABLE " + tableName + " (pk INTEGER PRIMARY KEY, val INTEGER)");
    conn.createStatement().execute(
        "UPSERT INTO " + tableName + " VALUES (NEXT VALUE FOR keys,1)");
    conn.commit();
    for (int i=0; i<6; i++) {
        Statement stmt = conn.createStatement();
        int upsertCount = stmt.executeUpdate(
            "UPSERT INTO " + tableName + " SELECT NEXT VALUE FOR keys, val FROM " + tableName);
        conn.commit();
        assertEquals((int)Math.pow(2, i), upsertCount);
    }
    conn.close();
}
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
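For reference on what the assertion in this reproducer expects: the table starts with one row, and each UPSERT SELECT reads every existing row and writes it back under a fresh sequence key, so the statement should report one upserted row per row read and the count should double each pass -- 1, 2, 4, 8, 16, 32 for i = 0..5, matching assertEquals((int)Math.pow(2, i), upsertCount). The bug reported here is that with auto commit off (commit issued after executeUpdate) the returned count no longer follows this progression.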
[jira] [Commented] (PHOENIX-3036) Modify phoenix IT tests to extend BaseHBaseManagedTimeTableReuseIT
[ https://issues.apache.org/jira/browse/PHOENIX-3036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15418139#comment-15418139 ] ASF GitHub Bot commented on PHOENIX-3036: - Github user samarthjain commented on a diff in the pull request: https://github.com/apache/phoenix/pull/189#discussion_r74523759 --- Diff: phoenix-core/src/it/java/org/apache/phoenix/end2end/SkipScanQueryIT.java --- @@ -111,10 +113,10 @@ private void initSelectAfterUpsertTable(Connection conn) throws Exception { @Test public void testSkipScanFilterQuery() throws Exception { -String createTableDDL = "CREATE TABLE test" + "(col1 VARCHAR," + "col2 VARCHAR," + "col3 VARCHAR," +String createTableDDL = "CREATE TABLE test1" + "(col1 VARCHAR," + "col2 VARCHAR," + "col3 VARCHAR," --- End diff -- Is there a reason why table name here wasn't generated randomly? > Modify phoenix IT tests to extend BaseHBaseManagedTimeTableReuseIT > -- > > Key: PHOENIX-3036 > URL: https://issues.apache.org/jira/browse/PHOENIX-3036 > Project: Phoenix > Issue Type: Improvement >Reporter: Samarth Jain >Assignee: prakul agarwal > Fix For: 4.9.0 > > Attachments: PHOENIX-3036.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-3036) Modify phoenix IT tests to extend BaseHBaseManagedTimeTableReuseIT
[ https://issues.apache.org/jira/browse/PHOENIX-3036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15418140#comment-15418140 ] ASF GitHub Bot commented on PHOENIX-3036: - Github user samarthjain commented on a diff in the pull request: https://github.com/apache/phoenix/pull/189#discussion_r74523807 --- Diff: phoenix-core/src/it/java/org/apache/phoenix/end2end/SkipScanQueryIT.java --- @@ -246,15 +248,15 @@ public void testVarCharXIntInQuery() throws Exception { public void testPreSplitCompositeFixedKey() throws Exception { Connection conn = DriverManager.getConnection(getUrl()); try { -conn.createStatement().execute("create table test(key_1 char(3) not null, key_2 char(4) not null, v varchar(8) CONSTRAINT pk PRIMARY KEY (key_1,key_2)) split on('000','100','200')"); +conn.createStatement().execute("create table test2(key_1 char(3) not null, key_2 char(4) not null, v varchar(8) CONSTRAINT pk PRIMARY KEY (key_1,key_2)) split on('000','100','200')"); --- End diff -- Is there a reason why table name here wasn't generated randomly? > Modify phoenix IT tests to extend BaseHBaseManagedTimeTableReuseIT > -- > > Key: PHOENIX-3036 > URL: https://issues.apache.org/jira/browse/PHOENIX-3036 > Project: Phoenix > Issue Type: Improvement >Reporter: Samarth Jain >Assignee: prakul agarwal > Fix For: 4.9.0 > > Attachments: PHOENIX-3036.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
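Both review comments above ask for the same change. As a minimal sketch (not the committed patch) of what the reviewer is suggesting, the table name would come from generateRandomString() -- the helper the other converted tests in this thread use -- rather than a hard-coded "test1"/"test2", so the class can safely extend BaseHBaseManagedTimeTableReuseIT. The column list below is shortened for illustration.
{code}
// Sketch of the change the reviewer is asking about, not the committed patch.
String tableName = generateRandomString();   // unique per test, safe for table reuse
String createTableDDL = "CREATE TABLE " + tableName
        + " (col1 VARCHAR NOT NULL, col2 VARCHAR NOT NULL, col3 VARCHAR"
        + " CONSTRAINT pk PRIMARY KEY (col1, col2))";
conn.createStatement().execute(createTableDDL);
{code}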
[jira] [Commented] (PHOENIX-3036) Modify phoenix IT tests to extend BaseHBaseManagedTimeTableReuseIT
[ https://issues.apache.org/jira/browse/PHOENIX-3036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15418117#comment-15418117 ] ASF GitHub Bot commented on PHOENIX-3036: - Github user JamesRTaylor commented on the issue: https://github.com/apache/phoenix/pull/189 Is this ready to go, @prakul & @samarthjain? Sounds like there are just a few minor issues turned up by Samarth. Let's get this committed to 4.x ASAP, please. > Modify phoenix IT tests to extend BaseHBaseManagedTimeTableReuseIT > -- > > Key: PHOENIX-3036 > URL: https://issues.apache.org/jira/browse/PHOENIX-3036 > Project: Phoenix > Issue Type: Improvement >Reporter: Samarth Jain >Assignee: prakul agarwal > Fix For: 4.9.0 > > Attachments: PHOENIX-3036.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (PHOENIX-3176) Rows will be skipped which are having future timestamp in row_timestamp column
[ https://issues.apache.org/jira/browse/PHOENIX-3176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15418114#comment-15418114 ] Samarth Jain edited comment on PHOENIX-3176 at 8/11/16 11:27 PM: - Wrote a test with ROW_TIMESTAMP column following UpsertSelectAutoCommitIT#testUpsertSelectDoesntSeeUpsertedData. The test passes with and without the patch. The time range on the scan doesn't take into account the time at which the table was resolved. Here is the test that I wrote: {code} @Test public void testUpsertSelectDoesntSeeUpsertedDataWithRowTimestampColumn() throws Exception { Properties props = PropertiesUtil.deepCopy(TEST_PROPERTIES); props.setProperty(QueryServices.MUTATE_BATCH_SIZE_ATTRIB, Integer.toString(3)); props.setProperty(QueryServices.SCAN_CACHE_SIZE_ATTRIB, Integer.toString(3)); props.setProperty(QueryServices.SCAN_RESULT_CHUNK_SIZE, Integer.toString(3)); Connection conn = DriverManager.getConnection(getUrl(), props); conn.setAutoCommit(true); conn.createStatement().execute("CREATE SEQUENCE keys"); String tableName = generateRandomString(); conn.createStatement().execute( "CREATE TABLE " + tableName + " (pk BIGINT PRIMARY KEY ROW_TIMESTAMP, val INTEGER)"); conn.createStatement().execute( "UPSERT INTO " + tableName + " VALUES (NEXT VALUE FOR keys,1)"); for (int i=0; i<6; i++) { Statement stmt = conn.createStatement(); int upsertCount = stmt.executeUpdate( "UPSERT INTO " + tableName + " SELECT NEXT VALUE FOR keys, val FROM " + tableName); assertEquals((int)Math.pow(2, i), upsertCount); } conn.close(); } {code} was (Author: samarthjain): Wrote a test with ROW_TIMESTAMP column following UpsertSelectAutoCommitIT#testUpsertSelectDoesntSeeUpsertedData. The test passes with and without the patch. The time range on the scan doesn't take into account the time at which the table was resolved. Here is the test that I wrote: @Test public void testUpsertSelectDoesntSeeUpsertedDataWithRowTimestampColumn() throws Exception { Properties props = PropertiesUtil.deepCopy(TEST_PROPERTIES); props.setProperty(QueryServices.MUTATE_BATCH_SIZE_ATTRIB, Integer.toString(3)); props.setProperty(QueryServices.SCAN_CACHE_SIZE_ATTRIB, Integer.toString(3)); props.setProperty(QueryServices.SCAN_RESULT_CHUNK_SIZE, Integer.toString(3)); Connection conn = DriverManager.getConnection(getUrl(), props); conn.setAutoCommit(true); conn.createStatement().execute("CREATE SEQUENCE keys"); String tableName = generateRandomString(); conn.createStatement().execute( "CREATE TABLE " + tableName + " (pk BIGINT PRIMARY KEY ROW_TIMESTAMP, val INTEGER)"); conn.createStatement().execute( "UPSERT INTO " + tableName + " VALUES (NEXT VALUE FOR keys,1)"); for (int i=0; i<6; i++) { Statement stmt = conn.createStatement(); int upsertCount = stmt.executeUpdate( "UPSERT INTO " + tableName + " SELECT NEXT VALUE FOR keys, val FROM " + tableName); assertEquals((int)Math.pow(2, i), upsertCount); } conn.close(); } {code} > Rows will be skipped which are having future timestamp in row_timestamp column > -- > > Key: PHOENIX-3176 > URL: https://issues.apache.org/jira/browse/PHOENIX-3176 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.6.0 >Reporter: Ankit Singhal > Fix For: 4.8.1 > > Attachments: PHOENIX-3176.patch > > > Rows will be skipped when row_timestamp have future timestamp > {code} > : jdbc:phoenix:localhost> CREATE TABLE historian.data ( > . . . . . . . . . . . . .> assetid unsigned_int not null, > . . . . . . . . . . . . .> metricid unsigned_int not null, > . . . . . . . . . . . . 
.> ts timestamp not null, > . . . . . . . . . . . . .> val double > . . . . . . . . . . . . .> CONSTRAINT pk PRIMARY KEY (assetid, metricid, ts > row_timestamp)) > . . . . . . . . . . . . .> IMMUTABLE_ROWS=true; > No rows affected (1.283 seconds) > 0: jdbc:phoenix:localhost> upsert into historian.data > values(1,2,'2015-01-01',1.2); > 1 row affected (0.047 seconds) > 0: jdbc:phoenix:localhost> upsert into historian.data > values(1,2,'2018-01-01',1.2); > 1 row affected (0.005 seconds) > 0: jdbc:phoenix:localhost> select * from historian.data; > +--+---+--+--+ > | ASSETID | METRICID |TS| VAL | > +--+---+--+--+ > | 1| 2 | 2015-01-01 00:00:00.000 | 1.2 | > +--+-
[jira] [Commented] (PHOENIX-3176) Rows will be skipped which are having future timestamp in row_timestamp column
[ https://issues.apache.org/jira/browse/PHOENIX-3176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15418114#comment-15418114 ] Samarth Jain commented on PHOENIX-3176: --- Wrote a test with ROW_TIMESTAMP column following UpsertSelectAutoCommitIT#testUpsertSelectDoesntSeeUpsertedData. The test passes with and without the patch. The time range on the scan doesn't take into account the time at which the table was resolved. Here is the test that I wrote: @Test public void testUpsertSelectDoesntSeeUpsertedDataWithRowTimestampColumn() throws Exception { Properties props = PropertiesUtil.deepCopy(TEST_PROPERTIES); props.setProperty(QueryServices.MUTATE_BATCH_SIZE_ATTRIB, Integer.toString(3)); props.setProperty(QueryServices.SCAN_CACHE_SIZE_ATTRIB, Integer.toString(3)); props.setProperty(QueryServices.SCAN_RESULT_CHUNK_SIZE, Integer.toString(3)); Connection conn = DriverManager.getConnection(getUrl(), props); conn.setAutoCommit(true); conn.createStatement().execute("CREATE SEQUENCE keys"); String tableName = generateRandomString(); conn.createStatement().execute( "CREATE TABLE " + tableName + " (pk BIGINT PRIMARY KEY ROW_TIMESTAMP, val INTEGER)"); conn.createStatement().execute( "UPSERT INTO " + tableName + " VALUES (NEXT VALUE FOR keys,1)"); for (int i=0; i<6; i++) { Statement stmt = conn.createStatement(); int upsertCount = stmt.executeUpdate( "UPSERT INTO " + tableName + " SELECT NEXT VALUE FOR keys, val FROM " + tableName); assertEquals((int)Math.pow(2, i), upsertCount); } conn.close(); } {code} > Rows will be skipped which are having future timestamp in row_timestamp column > -- > > Key: PHOENIX-3176 > URL: https://issues.apache.org/jira/browse/PHOENIX-3176 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.6.0 >Reporter: Ankit Singhal > Fix For: 4.8.1 > > Attachments: PHOENIX-3176.patch > > > Rows will be skipped when row_timestamp have future timestamp > {code} > : jdbc:phoenix:localhost> CREATE TABLE historian.data ( > . . . . . . . . . . . . .> assetid unsigned_int not null, > . . . . . . . . . . . . .> metricid unsigned_int not null, > . . . . . . . . . . . . .> ts timestamp not null, > . . . . . . . . . . . . .> val double > . . . . . . . . . . . . .> CONSTRAINT pk PRIMARY KEY (assetid, metricid, ts > row_timestamp)) > . . . . . . . . . . . . .> IMMUTABLE_ROWS=true; > No rows affected (1.283 seconds) > 0: jdbc:phoenix:localhost> upsert into historian.data > values(1,2,'2015-01-01',1.2); > 1 row affected (0.047 seconds) > 0: jdbc:phoenix:localhost> upsert into historian.data > values(1,2,'2018-01-01',1.2); > 1 row affected (0.005 seconds) > 0: jdbc:phoenix:localhost> select * from historian.data; > +--+---+--+--+ > | ASSETID | METRICID |TS| VAL | > +--+---+--+--+ > | 1| 2 | 2015-01-01 00:00:00.000 | 1.2 | > +--+---+--+--+ > 1 row selected (0.04 seconds) > 0: jdbc:phoenix:localhost> select count(*) from historian.data; > +---+ > | COUNT(1) | > +---+ > | 1 | > +---+ > 1 row selected (0.013 seconds) > {code} > Explain plan, where scan range is capped to compile time. > {code} > | CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER HISTORIAN.DATA | > | ROW TIMESTAMP FILTER [0, 1470901929982) | > | SERVER FILTER BY FIRST KEY ONLY | > | SERVER AGGREGATE INTO SINGLE ROW | > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-3072) Deadlock on region opening with secondary index recovery
[ https://issues.apache.org/jira/browse/PHOENIX-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15418049#comment-15418049 ] James Taylor commented on PHOENIX-3072: --- I'm confused by this, [~enis]. On the RS, we already make index table updates higher priority than data table updates. Why doesn't this solve the issue? Also, would you mind generating a patch that ignores whitespace changes as it's difficult to find the change you've made. > Deadlock on region opening with secondary index recovery > > > Key: PHOENIX-3072 > URL: https://issues.apache.org/jira/browse/PHOENIX-3072 > Project: Phoenix > Issue Type: Bug >Reporter: Enis Soztutar >Assignee: Enis Soztutar > Fix For: 4.9.0, 4.8.1 > > Attachments: phoenix-3072_v1.patch > > > There is a distributed deadlock happening in clusters with some moderate > number of regions for the data tables and secondary index tables and cluster > and it is cluster restart or some large failure. We have seen this in a > couple of production cases already. > Opening of regions in hbase is performed by a thread pool with 3 threads by > default. Every regionserver can open 3 regions at a time. However, opening > data table regions has to write to multiple index regions during WAL > recovery. All other region open requests are queued up in a single queue. > This causes a deadlock, since the secondary index regions are also opened by > the same thread pools that we do the work. So if there is greater number of > data table regions then available number of region opening threads from > regionservers, the secondary index region open requests just wait to be > processed in the queue. Since these index regions are not open, the region > opening of data table regions just block the region opening threads for a > long time. > One proposed fix is to use a different thread pool for opening regions of the > secondary index tables so that we will not deadlock. See HBASE-16095 for the > HBase-level fix. In Phoenix, we just have to set the priority for secondary > index tables. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
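The Phoenix-side piece proposed in the issue description ("we just have to set the priority for secondary index tables") could look roughly like the sketch below: tag the index table's descriptor with a high priority so that, with HBASE-16095 in place, its region-open requests land on a dedicated handler pool during WAL recovery. This is an illustration under assumed APIs (HTableDescriptor#setPriority and HConstants.HIGH_QOS in the target HBase version), not the attached phoenix-3072_v1.patch; table and family names are placeholders.
{code}
// Illustration only -- not the attached phoenix-3072_v1.patch.
void createIndexTableWithHighPriority(Admin admin) throws IOException {
    HTableDescriptor indexTable = new HTableDescriptor(TableName.valueOf("MY_SCHEMA.MY_INDEX"));
    indexTable.addFamily(new HColumnDescriptor("0"));
    // Mark index regions as higher priority so their open requests can be routed to a
    // separate thread pool (the HBASE-16095 side of the fix) during WAL recovery.
    indexTable.setPriority(HConstants.HIGH_QOS);
    admin.createTable(indexTable);
}
{code}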
[jira] [Commented] (PHOENIX-3176) Rows will be skipped which are having future timestamp in row_timestamp column
[ https://issues.apache.org/jira/browse/PHOENIX-3176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15417896#comment-15417896 ] James Taylor commented on PHOENIX-3176: --- Those UpsertSelectIT test supply a CURRENT_SCN. We need tests that don't. Also, this question: bq. Does the time range on the scan already take into account the time at which the table was resolved? > Rows will be skipped which are having future timestamp in row_timestamp column > -- > > Key: PHOENIX-3176 > URL: https://issues.apache.org/jira/browse/PHOENIX-3176 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.6.0 >Reporter: Ankit Singhal > Fix For: 4.8.1 > > Attachments: PHOENIX-3176.patch > > > Rows will be skipped when row_timestamp have future timestamp > {code} > : jdbc:phoenix:localhost> CREATE TABLE historian.data ( > . . . . . . . . . . . . .> assetid unsigned_int not null, > . . . . . . . . . . . . .> metricid unsigned_int not null, > . . . . . . . . . . . . .> ts timestamp not null, > . . . . . . . . . . . . .> val double > . . . . . . . . . . . . .> CONSTRAINT pk PRIMARY KEY (assetid, metricid, ts > row_timestamp)) > . . . . . . . . . . . . .> IMMUTABLE_ROWS=true; > No rows affected (1.283 seconds) > 0: jdbc:phoenix:localhost> upsert into historian.data > values(1,2,'2015-01-01',1.2); > 1 row affected (0.047 seconds) > 0: jdbc:phoenix:localhost> upsert into historian.data > values(1,2,'2018-01-01',1.2); > 1 row affected (0.005 seconds) > 0: jdbc:phoenix:localhost> select * from historian.data; > +--+---+--+--+ > | ASSETID | METRICID |TS| VAL | > +--+---+--+--+ > | 1| 2 | 2015-01-01 00:00:00.000 | 1.2 | > +--+---+--+--+ > 1 row selected (0.04 seconds) > 0: jdbc:phoenix:localhost> select count(*) from historian.data; > +---+ > | COUNT(1) | > +---+ > | 1 | > +---+ > 1 row selected (0.013 seconds) > {code} > Explain plan, where scan range is capped to compile time. > {code} > | CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER HISTORIAN.DATA | > | ROW TIMESTAMP FILTER [0, 1470901929982) | > | SERVER FILTER BY FIRST KEY ONLY | > | SERVER AGGREGATE INTO SINGLE ROW | > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-3176) Rows will be skipped which are having future timestamp in row_timestamp column
[ https://issues.apache.org/jira/browse/PHOENIX-3176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15417888#comment-15417888 ] Samarth Jain commented on PHOENIX-3176: --- We have tests that do UPSERT SELECT with row_timestamp. UpsertSelectIT class has testUpsertSelectWithRowtimeStampColumn among a few others. [~an...@apache.org] - did all IT tests pass successfully for you? > Rows will be skipped which are having future timestamp in row_timestamp column > -- > > Key: PHOENIX-3176 > URL: https://issues.apache.org/jira/browse/PHOENIX-3176 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.6.0 >Reporter: Ankit Singhal > Fix For: 4.8.1 > > Attachments: PHOENIX-3176.patch > > > Rows will be skipped when row_timestamp have future timestamp > {code} > : jdbc:phoenix:localhost> CREATE TABLE historian.data ( > . . . . . . . . . . . . .> assetid unsigned_int not null, > . . . . . . . . . . . . .> metricid unsigned_int not null, > . . . . . . . . . . . . .> ts timestamp not null, > . . . . . . . . . . . . .> val double > . . . . . . . . . . . . .> CONSTRAINT pk PRIMARY KEY (assetid, metricid, ts > row_timestamp)) > . . . . . . . . . . . . .> IMMUTABLE_ROWS=true; > No rows affected (1.283 seconds) > 0: jdbc:phoenix:localhost> upsert into historian.data > values(1,2,'2015-01-01',1.2); > 1 row affected (0.047 seconds) > 0: jdbc:phoenix:localhost> upsert into historian.data > values(1,2,'2018-01-01',1.2); > 1 row affected (0.005 seconds) > 0: jdbc:phoenix:localhost> select * from historian.data; > +--+---+--+--+ > | ASSETID | METRICID |TS| VAL | > +--+---+--+--+ > | 1| 2 | 2015-01-01 00:00:00.000 | 1.2 | > +--+---+--+--+ > 1 row selected (0.04 seconds) > 0: jdbc:phoenix:localhost> select count(*) from historian.data; > +---+ > | COUNT(1) | > +---+ > | 1 | > +---+ > 1 row selected (0.013 seconds) > {code} > Explain plan, where scan range is capped to compile time. > {code} > | CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER HISTORIAN.DATA | > | ROW TIMESTAMP FILTER [0, 1470901929982) | > | SERVER FILTER BY FIRST KEY ONLY | > | SERVER AGGREGATE INTO SINGLE ROW | > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-3176) Rows will be skipped which are having future timestamp in row_timestamp column
[ https://issues.apache.org/jira/browse/PHOENIX-3176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15417879#comment-15417879 ] James Taylor commented on PHOENIX-3176: --- We may need to advise against forward dating rows using the ROW_TIMESTAMP feature given the correlation between the Cell time stamp and the way in which Phoenix works (showing you the rows that exist as of when the query is compiled). FWIW, a workaround for users is to connect with a CURRENT_SCN of Long.MAX_VALUE. > Rows will be skipped which are having future timestamp in row_timestamp column > -- > > Key: PHOENIX-3176 > URL: https://issues.apache.org/jira/browse/PHOENIX-3176 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.6.0 >Reporter: Ankit Singhal > Fix For: 4.8.1 > > Attachments: PHOENIX-3176.patch > > > Rows will be skipped when row_timestamp have future timestamp > {code} > : jdbc:phoenix:localhost> CREATE TABLE historian.data ( > . . . . . . . . . . . . .> assetid unsigned_int not null, > . . . . . . . . . . . . .> metricid unsigned_int not null, > . . . . . . . . . . . . .> ts timestamp not null, > . . . . . . . . . . . . .> val double > . . . . . . . . . . . . .> CONSTRAINT pk PRIMARY KEY (assetid, metricid, ts > row_timestamp)) > . . . . . . . . . . . . .> IMMUTABLE_ROWS=true; > No rows affected (1.283 seconds) > 0: jdbc:phoenix:localhost> upsert into historian.data > values(1,2,'2015-01-01',1.2); > 1 row affected (0.047 seconds) > 0: jdbc:phoenix:localhost> upsert into historian.data > values(1,2,'2018-01-01',1.2); > 1 row affected (0.005 seconds) > 0: jdbc:phoenix:localhost> select * from historian.data; > +--+---+--+--+ > | ASSETID | METRICID |TS| VAL | > +--+---+--+--+ > | 1| 2 | 2015-01-01 00:00:00.000 | 1.2 | > +--+---+--+--+ > 1 row selected (0.04 seconds) > 0: jdbc:phoenix:localhost> select count(*) from historian.data; > +---+ > | COUNT(1) | > +---+ > | 1 | > +---+ > 1 row selected (0.013 seconds) > {code} > Explain plan, where scan range is capped to compile time. > {code} > | CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER HISTORIAN.DATA | > | ROW TIMESTAMP FILTER [0, 1470901929982) | > | SERVER FILTER BY FIRST KEY ONLY | > | SERVER AGGREGATE INTO SINGLE ROW | > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
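A sketch of the workaround mentioned above, assuming a plain JDBC client: CURRENT_SCN is passed as the "CurrentSCN" connection property (PhoenixRuntime.CURRENT_SCN_ATTRIB), and pinning it to Long.MAX_VALUE makes the scan's upper time bound effectively unbounded so rows with a future ROW_TIMESTAMP are no longer filtered out. The JDBC URL is a placeholder.
{code}
// Workaround sketch as suggested above; URL and table are from the issue description.
long countAllRows() throws SQLException {
    Properties props = new Properties();
    props.setProperty(PhoenixRuntime.CURRENT_SCN_ATTRIB, Long.toString(Long.MAX_VALUE));
    try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost", props);
         ResultSet rs = conn.createStatement().executeQuery("SELECT COUNT(*) FROM historian.data")) {
        rs.next();
        return rs.getLong(1);   // should now include the 2018-01-01 row as well
    }
}
{code}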
[jira] [Commented] (PHOENIX-3176) Rows will be skipped which are having future timestamp in row_timestamp column
[ https://issues.apache.org/jira/browse/PHOENIX-3176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15417866#comment-15417866 ] James Taylor commented on PHOENIX-3176: --- It's important to run the scan as of the time stamp from which the table was resolved so that we have a consistent time across all our parallel scans. One example where we depend on this is when an UPSERT SELECT is done on the same table. We ensure that we don't see the new rows being inserted so that we don't get into an infinite loop. See the UpsertSelectAutoCommitIT.testUpsertSelectDoesntSeeUpsertedData(). Does the time range on the scan already take into account the time at which the table was resolved? It'd be interesting to have another similar test on a table using the ROW_TIMESTAMP feature. Maybe we need a special case for this feature (which would be a shame)? If so, we'd still need to handle the UPSERT SELECT case. > Rows will be skipped which are having future timestamp in row_timestamp column > -- > > Key: PHOENIX-3176 > URL: https://issues.apache.org/jira/browse/PHOENIX-3176 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.6.0 >Reporter: Ankit Singhal > Fix For: 4.8.1 > > Attachments: PHOENIX-3176.patch > > > Rows will be skipped when row_timestamp have future timestamp > {code} > : jdbc:phoenix:localhost> CREATE TABLE historian.data ( > . . . . . . . . . . . . .> assetid unsigned_int not null, > . . . . . . . . . . . . .> metricid unsigned_int not null, > . . . . . . . . . . . . .> ts timestamp not null, > . . . . . . . . . . . . .> val double > . . . . . . . . . . . . .> CONSTRAINT pk PRIMARY KEY (assetid, metricid, ts > row_timestamp)) > . . . . . . . . . . . . .> IMMUTABLE_ROWS=true; > No rows affected (1.283 seconds) > 0: jdbc:phoenix:localhost> upsert into historian.data > values(1,2,'2015-01-01',1.2); > 1 row affected (0.047 seconds) > 0: jdbc:phoenix:localhost> upsert into historian.data > values(1,2,'2018-01-01',1.2); > 1 row affected (0.005 seconds) > 0: jdbc:phoenix:localhost> select * from historian.data; > +--+---+--+--+ > | ASSETID | METRICID |TS| VAL | > +--+---+--+--+ > | 1| 2 | 2015-01-01 00:00:00.000 | 1.2 | > +--+---+--+--+ > 1 row selected (0.04 seconds) > 0: jdbc:phoenix:localhost> select count(*) from historian.data; > +---+ > | COUNT(1) | > +---+ > | 1 | > +---+ > 1 row selected (0.013 seconds) > {code} > Explain plan, where scan range is capped to compile time. > {code} > | CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER HISTORIAN.DATA | > | ROW TIMESTAMP FILTER [0, 1470901929982) | > | SERVER FILTER BY FIRST KEY ONLY | > | SERVER AGGREGATE INTO SINGLE ROW | > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
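The invariant described in the comment above -- and what the explain plan's "ROW TIMESTAMP FILTER [0, 1470901929982)" corresponds to -- is that every parallel chunk scan shares one exclusive upper time bound taken when the table was resolved, so an UPSERT SELECT over the same table cannot read the rows it is writing. A minimal sketch of that idea in raw HBase terms, with the resolution timestamp as a placeholder rather than Phoenix's internal plumbing:
{code}
// Minimal sketch of the described invariant; not Phoenix internals.
Scan newChunkScan(long tableResolutionTs) throws IOException {
    Scan scan = new Scan();
    // Every chunk gets the same exclusive upper time bound, taken once when the table
    // was resolved, so later writes -- including an UPSERT SELECT's own -- stay invisible.
    scan.setTimeRange(0L, tableResolutionTs);      // corresponds to "ROW TIMESTAMP FILTER [0, ts)"
    scan.setFilter(new FirstKeyOnlyFilter());      // corresponds to "SERVER FILTER BY FIRST KEY ONLY"
    return scan;
}
{code}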
[jira] [Commented] (PHOENIX-3176) Rows will be skipped which are having future timestamp in row_timestamp column
[ https://issues.apache.org/jira/browse/PHOENIX-3176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15417815#comment-15417815 ] Samarth Jain commented on PHOENIX-3176: --- Patch looks good to me, [~an...@apache.org]. [~jamestaylor], would be good to get your keen eye on it too. I would move the test that you have added to UpsertValuesIT as this is where the other upsert tests for row_timestamp column were added. > Rows will be skipped which are having future timestamp in row_timestamp column > -- > > Key: PHOENIX-3176 > URL: https://issues.apache.org/jira/browse/PHOENIX-3176 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.6.0 >Reporter: Ankit Singhal > Fix For: 4.8.1 > > Attachments: PHOENIX-3176.patch > > > Rows will be skipped when row_timestamp have future timestamp > {code} > : jdbc:phoenix:localhost> CREATE TABLE historian.data ( > . . . . . . . . . . . . .> assetid unsigned_int not null, > . . . . . . . . . . . . .> metricid unsigned_int not null, > . . . . . . . . . . . . .> ts timestamp not null, > . . . . . . . . . . . . .> val double > . . . . . . . . . . . . .> CONSTRAINT pk PRIMARY KEY (assetid, metricid, ts > row_timestamp)) > . . . . . . . . . . . . .> IMMUTABLE_ROWS=true; > No rows affected (1.283 seconds) > 0: jdbc:phoenix:localhost> upsert into historian.data > values(1,2,'2015-01-01',1.2); > 1 row affected (0.047 seconds) > 0: jdbc:phoenix:localhost> upsert into historian.data > values(1,2,'2018-01-01',1.2); > 1 row affected (0.005 seconds) > 0: jdbc:phoenix:localhost> select * from historian.data; > +--+---+--+--+ > | ASSETID | METRICID |TS| VAL | > +--+---+--+--+ > | 1| 2 | 2015-01-01 00:00:00.000 | 1.2 | > +--+---+--+--+ > 1 row selected (0.04 seconds) > 0: jdbc:phoenix:localhost> select count(*) from historian.data; > +---+ > | COUNT(1) | > +---+ > | 1 | > +---+ > 1 row selected (0.013 seconds) > {code} > Explain plan, where scan range is capped to compile time. > {code} > | CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER HISTORIAN.DATA | > | ROW TIMESTAMP FILTER [0, 1470901929982) | > | SERVER FILTER BY FIRST KEY ONLY | > | SERVER AGGREGATE INTO SINGLE ROW | > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-3165) System table integrity check and repair tool
[ https://issues.apache.org/jira/browse/PHOENIX-3165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15417802#comment-15417802 ] Andrew Purtell commented on PHOENIX-3165: - bq. For SYSTEM.SEQUENCE corruption, we're in a similar, but more precarious situation. Agreed, not all corruptions are amenable to an automated approach. That does not take away from the important (and, IMHO, mandatory) contribution of metadata integrity check and repair tools to the "mission critical"-ness of a data store. > System table integrity check and repair tool > > > Key: PHOENIX-3165 > URL: https://issues.apache.org/jira/browse/PHOENIX-3165 > Project: Phoenix > Issue Type: New Feature >Reporter: Andrew Purtell >Priority: Critical > > When the Phoenix system tables become corrupt recovery is a painstaking > process of low level examination of table contents and manipulation of same > with the HBase shell. This is very difficult work providing no margin of > safety, and is a critical gap in terms of usability. > At the OS level, we have fsck. > At the HDFS level, we have fsck (integrity checking only, though) > At the HBase level, we have hbck. > At the Phoenix level, we lack a system table repair tool. > Implement a tool that: > - Does not depend on the Phoenix client. > - Supports integrity checking of SYSTEM tables. Check for the existence of > all required columns in entries. Check that entries exist for all Phoenix > managed tables (implies Phoenix should add supporting advisory-only metadata > to the HBase table schemas). Check that serializations are valid. > - Supports complete repair of SYSTEM.CATALOG and recreation, if necessary, of > other tables like SYSTEM.STATS which can be dropped to recover from an > emergency. We should be able to drop SYSTEM.CATALOG (or any other SYSTEM > table), run the tool, and have a completely correct recreation of > SYSTEM.CATALOG available at the end of its execution. > - To the extent we have or introduce cross-system-table invariants, check > them and offer a repair or reconstruction option. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
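To make the "integrity checking of SYSTEM tables ... does not depend on the Phoenix client" idea concrete, a check pass could read SYSTEM.CATALOG through the plain HBase client and flag header rows missing required qualifiers. The column family "0" and the TABLE_TYPE/COLUMN_COUNT qualifiers below are illustrative assumptions about the catalog layout, not a specification of the eventual tool.
{code}
// Illustrative sketch of one integrity check, not the proposed tool.
byte[] family = Bytes.toBytes("0");
byte[] tableType = Bytes.toBytes("TABLE_TYPE");
byte[] columnCount = Bytes.toBytes("COLUMN_COUNT");
try (org.apache.hadoop.hbase.client.Connection hconn =
         ConnectionFactory.createConnection(HBaseConfiguration.create());
     Table catalog = hconn.getTable(TableName.valueOf("SYSTEM.CATALOG"));
     ResultScanner scanner = catalog.getScanner(new Scan().addFamily(family))) {
    for (Result row : scanner) {
        // A row that looks like a table header (has TABLE_TYPE) should also carry a column count.
        if (row.containsColumn(family, tableType) && !row.containsColumn(family, columnCount)) {
            System.out.println("Suspect catalog row: " + Bytes.toStringBinary(row.getRow()));
        }
    }
}
{code}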
[jira] [Comment Edited] (PHOENIX-3165) System table integrity check and repair tool
[ https://issues.apache.org/jira/browse/PHOENIX-3165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15417792#comment-15417792 ] Andrew Purtell edited comment on PHOENIX-3165 at 8/11/16 7:27 PM: -- bq. Unfortunately, that's not possible across all the features of Phoenix: No integrity check or repair tool will handle 100% of the cases. Indeed there will be cases where fallback to manual recovery approaches will be necessary and some aspects of metadata tricky or not amenable at all to automated repair approaches. That said, I'm comfortable stating long operator experience with 'fsck' class tools over the history of operation of computing systems demonstrates their utility. Take HBase fsck as an example. I know it to only cover a subset of possible problems, but it allowed me to recover a critical production system in minutes. Imagine if I only had as recourse hacking of META table HFiles with the Hadoop fsshell! It would have been hours of high profile downtime as opposed to minutes, which was serious enough. bq. Corruption can take many forms, though. I think it's important to understand the root cause of the corruption, as IMHO prevention is the best medicine. It's not possible to prevent corruption. There are so many opportunities, so many chains of events that lead to this outcome. Like I said even with a recovery tool there are going to be cases where the tool won't help, but on the other hand there are cases - and with care and attention, those likely to be common - where a recovery tool will allow the user to bring their systems back to an available state very quickly. Snapshots and backups increase the margin of safety overall but are never a quick nor complete solution for system recovery. By definition they miss latest updates. Recovering from latest state by applying a dynamically analyzed delta is faster and a deft surgical tool compared to the big drop-and-restore hammer. Metadata repair tools are not different from index rebuild tools. The RDBMS system has metadata. The system is meant for mission critical operation. The system requires operational tools that meet that objective. bq. Updating HBase metadata with every change to the SYSTEM.CATALOG would put a huge drag on the system. How so? bq. If we're going to do something like that, better to change the design and keep the system-of-record in zookeeper instead. I don't think "system of record" is a use case suitable for ZooKeeper, and I believe this to be a common understanding. It's certainly a frequent conclusion in system design discussions of which I have been a part. That is not a knock on ZooKeeper. It is rock solid as a coordination and consensus service. bq. Best to have Phoenix-level APIs instead that can guarantee that the system catalog is kept in a valid state with commits being performed transactionally. Sure, "Does not depend on the Phoenix client" is rephrased alternatively and hopefully better as "Is Phoenix code using blessed repair mechanisms that do not depend on the normal client code paths" I don't think we can depend on transactional functionality to always be in a workable state, if you are referring to the 4.8+ transactional functionality that requires Tephra and its metadata to be in working order. bq. If the table becomes corrupt, it'd be potentially ambiguous on how to fix it. In theory, I suppose, a tool could let the user choose between the possible choices it'd make to fix it. 
This is often the case in other 'fsck' style applications and a very common option. Consider Windows CHKDSK and the Linux fsck suite as two very widely deployed examples of this.
[jira] [Commented] (PHOENIX-3165) System table integrity check and repair tool
[ https://issues.apache.org/jira/browse/PHOENIX-3165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15417792#comment-15417792 ] Andrew Purtell commented on PHOENIX-3165: - bq. Unfortunately, that's not possible across all the features of Phoenix: No integrity check or repair tool will handle 100% of the cases. Indeed there will be cases where fallback to manual recovery approaches will be necessary and some aspects of metadata tricky or not amenable at all to automated repair approaches. That said, I'm comfortable stating long operator experience with 'fsck' class tools over the history of operation of computing systems demonstrates their utility. Take HBase fsck as an example. I know it to only cover a subset of possible problems, but it allowed me to recover a critical production system in minutes. Imagine if I only had as recourse hacking of META table HFiles with the Hadoop fsshell! It would have been hours of high profile downtime as opposed to minutes, which was serious enough. bq. Corruption can take many forms, though. I think it's important to understand the root cause of the corruption, as IMHO prevention is the best medicine. It's not possible to prevent corruption. There are so many opportunities, so many chains of events that lead to this outcome. Like I said even with a recovery tool there are going to be cases where the tool won't help, but on the other hand there are cases - and with care and attention, those likely to be common - where a recovery tool will allow the user to bring their systems back to an available state very quickly. Snapshots and backups increase the margin of safety overall but are never a quick nor complete solution for system recovery. By definition they miss latest updates. Recovering from latest state by applying a dynamically analyzed delta is faster and a deft surgical tool compared to the big drop-and-restore hammer. Metadata repair tools are not different from index rebuild tools. The RDBMS system has metadata. The system is meant for mission critical operation. The system requires operational tools that meet that objective. bq. Updating HBase metadata with every change to the SYSTEM.CATALOG would put a huge drag on the system. How so? bq. If we're going to do something like that, better to change the design and keep the system-of-record in zookeeper instead. I don't think "system of record" is a use case suitable for ZooKeeper, and I believe this to be a common understanding. It's certainly a frequent conclusion in system design discussions of which I have been a part. That is not a knock on ZooKeeper. It is rock solid as a coordination and consensus service. bq. Best to have Phoenix-level APIs instead that can guarantee that the system catalog is kept in a valid state with commits being performed transactionally. Sure, "Does not depend on the Phoenix client" is rephrased alternatively and hopefully better as "Is Phoenix code using blessed repair mechanisms that do not depend on the normal client code paths" I don't think we can depend on transactional functionality to always be in a workable state, if you are referring to the 4.8+ transactional functionality that requires Tephra and its metadata to be in working order. 
> System table integrity check and repair tool > > > Key: PHOENIX-3165 > URL: https://issues.apache.org/jira/browse/PHOENIX-3165 > Project: Phoenix > Issue Type: New Feature >Reporter: Andrew Purtell >Priority: Critical > > When the Phoenix system tables become corrupt recovery is a painstaking > process of low level examination of table contents and manipulation of same > with the HBase shell. This is very difficult work providing no margin of > safety, and is a critical gap in terms of usability. > At the OS level, we have fsck. > At the HDFS level, we have fsck (integrity checking only, though) > At the HBase level, we have hbck. > At the Phoenix level, we lack a system table repair tool. > Implement a tool that: > - Does not depend on the Phoenix client. > - Supports integrity checking of SYSTEM tables. Check for the existence of > all required columns in entries. Check that entries exist for all Phoenix > managed tables (implies Phoenix should add supporting advisory-only metadata > to the HBase table schemas). Check that serializations are valid. > - Supports complete repair of SYSTEM.CATALOG and recreation, if necessary, of > other tables like SYSTEM.STATS which can be dropped to recover from an > emergency. We should be able to drop SYSTEM.CATALOG (or any other SYSTEM > table), run the tool, and have a completely correct recreation of > SYSTEM.CATALOG available at the end of its execution. > - To the extent we have or introduce cross-system-table invariants, check > them
[jira] [Updated] (PHOENIX-1119) Use Zipkin to visualize Phoenix metrics data
[ https://issues.apache.org/jira/browse/PHOENIX-1119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nishani updated PHOENIX-1119: -- Attachment: spans.png dependency.png callcount.png dependencytree.png Hi, The screenshots are attached. Please share your feedback. https://github.com/AyolaJayamaha/phoenix/tree/zipkin > Use Zipkin to visualize Phoenix metrics data > > > Key: PHOENIX-1119 > URL: https://issues.apache.org/jira/browse/PHOENIX-1119 > Project: Phoenix > Issue Type: Bug >Reporter: James Taylor >Assignee: Nishani > Labels: gsoc2016, tracing > Attachments: Screenshot from 2016-07-30.png, callcount.png, > dependency.png, dependencytree.png, spans.png > > > Zipkin provides a nice tool for visualizing trace information: > http://twitter.github.io/zipkin/ > It's likely not difficult to visualize the Phoenix tracing data through this > tool. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-3036) Modify phoenix IT tests to extend BaseHBaseManagedTimeTableReuseIT
[ https://issues.apache.org/jira/browse/PHOENIX-3036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15417761#comment-15417761 ] ASF GitHub Bot commented on PHOENIX-3036: - Github user prakul commented on a diff in the pull request: https://github.com/apache/phoenix/pull/189#discussion_r74483839 --- Diff: phoenix-core/src/it/java/org/apache/phoenix/end2end/HashJoinMoreIT.java --- @@ -43,7 +43,7 @@ import com.google.common.collect.Maps; -public class HashJoinMoreIT extends BaseHBaseManagedTimeIT { +public class HashJoinMoreIT extends BaseHBaseManagedTimeTableReuseIT { --- End diff -- This class uses a unique table name for each test and thus can be derived from BaseHBaseManagedTimeTableReuseIT > Modify phoenix IT tests to extend BaseHBaseManagedTimeTableReuseIT > -- > > Key: PHOENIX-3036 > URL: https://issues.apache.org/jira/browse/PHOENIX-3036 > Project: Phoenix > Issue Type: Improvement >Reporter: Samarth Jain >Assignee: prakul agarwal > Fix For: 4.9.0 > > Attachments: PHOENIX-3036.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-3165) System table integrity check and repair tool
[ https://issues.apache.org/jira/browse/PHOENIX-3165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15417753#comment-15417753 ] James Taylor commented on PHOENIX-3165: --- I agree that mucking with the system catalog in the HBase shell is not the right approach if it becomes corrupted. It's safest to rely on a backup/restore mechanism to put the system catalog back into a known/good state IMHO. For particular scenarios in which the SYSTEM.CATALOG is being updated across many rows (such as during upgrade), I think PHOENIX-808 will be a good, simple, and quick to implement safeguard. Corruption can take many forms, though. I think it's important to understand the root cause of the corruption, as IMHO prevention is the best medicine. If a system interacts with Phoenix at the HBase level, this is very dangerous as that system will not know if it's changing the table in an invalid manner. Best to have Phoenix-level APIs instead that can guarantee that the system catalog is kept in a valid state with commits being performed transactionally. Another approach would be to have an RDBMS-style referential integrity check (https://en.wikipedia.org/wiki/Referential_integrity) to prevent invalid states from being entered. This would require, of course, that tools mucking with the SYSTEM.CATALOG go through APIs that check integrity. This would be a pretty big undertaking and it's typically the first thing that a real installation disables because it adds too much overhead. It also wouldn't provide all the integrity checks we need with the denormalization we do. In theory, we could enhance our integrity checks to be able to express these and include them in the check. This would be a very big undertaking. bq. We should be able to drop SYSTEM.CATALOG (or any other SYSTEM table), run the tool, and have a completely correct recreation of SYSTEM.CATALOG available at the end of its execution. Unfortunately, that's not possible across all the features of Phoenix: - The SYSTEM.CATALOG has Phoenix table definitions for all tenants in the form of views. These views are essentially unbounded - for example, a time-series metric system such as Argus may have 10M of them. Other use cases may have multiple per user of a system. There's no other place this information can be retrieved or derived from. - The SYSTEM.CATALOG may vary over time. A client can connect at an earlier time stamp with our CURRENT_SCN capability and see the version that was in place at that time which may be different than the latest. - Updating HBase metadata with every change to the SYSTEM.CATALOG would put a huge drag on the system. If we're going to do something like that, better to change the design and keep the system-of-record in zookeeper instead. - Because we need updates to the system catalog to have all or none commit behavior (i.e. a DDL operation should succeed completely or on failure have made no change), we store both column and table information in the same table (in contiguous rows). We also store view and index metadata in the table. If the table becomes corrupt, it'd be potentially ambiguous on how to fix it. In theory, I suppose, a tool could let the user choose between the possible choices it'd make to fix it. - Since the SYSTEM.CATALOG table is essentially data, corruption may mean data loss. You can't recover from this (other than by restoring from a backup). I don't think guessing or default values that are loss would be viable. 
In theory, the tool could ask the user what value they'd like to use, but if even a small percentage of 10M rows are corrupt, I don't think this is feasible. For SYSTEM.SEQUENCE corruption, we're in a similar, but more precarious situation. If any attempts to fix sequences cause sequences to no longer be monotonically increasing, then user data can start to be corrupted. It'd be a bit scary to have an automated system drive this. Might need to fall back to a manual approach here, as you might need to look at user data (and Phoenix wouldn't know which data) to know what to reset the current value of a sequence to. > System table integrity check and repair tool > > > Key: PHOENIX-3165 > URL: https://issues.apache.org/jira/browse/PHOENIX-3165 > Project: Phoenix > Issue Type: New Feature >Reporter: Andrew Purtell >Priority: Critical > > When the Phoenix system tables become corrupt recovery is a painstaking > process of low level examination of table contents and manipulation of same > with the HBase shell. This is very difficult work providing no margin of > safety, and is a critical gap in terms of usability. > At the OS level, we have fsck. > At the HDFS level, we have fsck (integrity checking only, though) > At the HBase level, we have hbck. > At the Phoenix level, we lack a system table repair tool.
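As a concrete form of the "rely on a backup/restore mechanism" point in the comment above, an operator could keep periodic HBase snapshots of the catalog and restore the last known-good one when corruption is detected. A sketch under assumed conditions (an open HBase Admin handle, a snapshot previously taken with admin.snapshot, placeholder snapshot name):
{code}
// Sketch of one possible backup/restore flow using HBase snapshots; names are placeholders.
void restoreCatalogFromSnapshot(Admin admin) throws IOException {
    TableName catalog = TableName.valueOf("SYSTEM.CATALOG");
    admin.disableTable(catalog);                        // table must be offline to restore
    admin.restoreSnapshot("syscat-backup-20160811");    // previously taken via admin.snapshot(...)
    admin.enableTable(catalog);
}
{code}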
[jira] [Commented] (PHOENIX-2336) Queries with small case column-names return empty result-set when working with Spark Datasource Plugin
[ https://issues.apache.org/jira/browse/PHOENIX-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15417740#comment-15417740 ] Josh Mahonin commented on PHOENIX-2336: --- Hi [~kalyanhadoop] That github commit looks good. Can you upload a new patch file which includes the changes with all three unit tests? i.e. a squashed diff of commits 81df0c698ba4155a8f73ffe0ad657e9a5640d811 937cab227c26bc364129e6395bf06378ee536103 Thanks! Josh > Queries with small case column-names return empty result-set when working > with Spark Datasource Plugin > --- > > Key: PHOENIX-2336 > URL: https://issues.apache.org/jira/browse/PHOENIX-2336 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.6.0 >Reporter: Suhas Nalapure >Assignee: Josh Mahonin > Labels: verify > Fix For: 4.9.0 > > > Hi, > The Spark DataFrame filter operation returns empty result-set when > column-name is in the smaller case. Example below: > DataFrame df = > sqlContext.read().format("org.apache.phoenix.spark").options(params).load(); > df.filter("\"col1\" = '5.0'").show(); > Result: > +---++---+---+---+--- > | ID|col1| c1| d2| d3| d4| > +---++---+---+---+---+ > +---++---+---+---+---+ > Whereas the table actually has some rows matching the filter condition. And > if double quotes are removed from around the column name i.e. df.filter("col1 > = '5.0'").show(); , a ColumnNotFoundException is thrown: > Exception in thread "main" java.lang.RuntimeException: > org.apache.phoenix.schema.ColumnNotFoundException: ERROR 504 (42703): > Undefined column. columnName=D1 > at > org.apache.phoenix.mapreduce.PhoenixInputFormat.getQueryPlan(PhoenixInputFormat.java:125) > at > org.apache.phoenix.mapreduce.PhoenixInputFormat.getSplits(PhoenixInputFormat.java:80) > at > org.apache.spark.rdd.NewHadoopRDD.getPartitions(NewHadoopRDD.scala:95) > at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239) > at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237) > at scala.Option.getOrElse(Option.scala:120) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-3123) Document new local index implementation and upgrade steps
[ https://issues.apache.org/jira/browse/PHOENIX-3123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15417516#comment-15417516 ] James Taylor commented on PHOENIX-3123: --- Thanks for the doc updates, [~rajeshbabu]. The release note changes look good, but would you mind making yours on top of the patch that Josh did so we can commit both together without merge conflicts? For the changes to secondary_indexing.md, I'd recommend changing this: {code} +Prior to 4.8.0 version Local indexing requires following 3 special configurations to ensure data table and local index regions co-location but 4.8.0 onwords we don't need any special configurations for local indexes. {code} to this: {code}
+From Phoenix 4.8.0 onward, no configuration changes are required to use local indexing. In Phoenix 4.7 and below, the following configuration changes are required to the server-side hbase-site.xml on the master and region server nodes:
<property>
  <name>hbase.master.loadbalancer.class</name>
  <value>org.apache.phoenix.hbase.index.balancer.IndexLoadBalancer</value>
</property>
<property>
  <name>hbase.coprocessor.master.classes</name>
  <value>org.apache.phoenix.hbase.index.master.IndexMasterObserver</value>
</property>
<property>
  <name>hbase.coprocessor.regionserver.classes</name>
  <value>org.apache.hadoop.hbase.regionserver.LocalIndexMerger</value>
</property>
{code} > Document new local index implementation and upgrade steps > - > > Key: PHOENIX-3123 > URL: https://issues.apache.org/jira/browse/PHOENIX-3123 > Project: Phoenix > Issue Type: Bug >Reporter: Rajeshbabu Chintaguntla >Assignee: Rajeshbabu Chintaguntla > Attachments: PHOENIX-3123.patch, PHOENIX-3123_v2.patch > > > Document local index new implementation after PHOENIX-1734 and steps to > upgrade from the old local indexes to new local indexes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-3163) Split during global index creation may cause ERROR 201 error.
[ https://issues.apache.org/jira/browse/PHOENIX-3163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15417500#comment-15417500 ] Hadoop QA commented on PHOENIX-3163: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12823287/PHOENIX-3163_v3.patch against master branch at commit ba82b1cb5a14c2cf109deb8a862389142d92f541. ATTACHMENT ID: 12823287 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 34 warning messages. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 lineLengths{color}. The patch introduces the following lines longer than 100: + queryPlan.getContext().getScan().setAttribute("CLIENT_SIDE_UPSERT_SELECT", Bytes.toBytes(true)); +new ScanningResultIterator(htable.getScanner(newScan), scanMetrics); {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-PHOENIX-Build/515//testReport/ Javadoc warnings: https://builds.apache.org/job/PreCommit-PHOENIX-Build/515//artifact/patchprocess/patchJavadocWarnings.txt Console output: https://builds.apache.org/job/PreCommit-PHOENIX-Build/515//console This message is automatically generated. > Split during global index creation may cause ERROR 201 error. > - > > Key: PHOENIX-3163 > URL: https://issues.apache.org/jira/browse/PHOENIX-3163 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.8.0 >Reporter: Sergey Soldatov >Assignee: Sergey Soldatov > Fix For: 4.8.1 > > Attachments: PHOENIX-3163_v1.patch, PHOENIX-3163_v3.patch > > > When we create global index and split happen meanwhile there is a chance to > fail with ERROR 201: > {noformat} > 2016-08-08 15:55:17,248 INFO [Thread-6] > org.apache.phoenix.iterate.BaseResultIterators(878): Failed to execute task > during cancel > java.util.concurrent.ExecutionException: java.sql.SQLException: ERROR 201 > (22000): Illegal data. 
> at java.util.concurrent.FutureTask.report(FutureTask.java:122) > at java.util.concurrent.FutureTask.get(FutureTask.java:192) > at > org.apache.phoenix.iterate.BaseResultIterators.close(BaseResultIterators.java:872) > at > org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:809) > at > org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:713) > at > org.apache.phoenix.iterate.RoundRobinResultIterator.getIterators(RoundRobinResultIterator.java:176) > at > org.apache.phoenix.iterate.RoundRobinResultIterator.next(RoundRobinResultIterator.java:91) > at > org.apache.phoenix.compile.UpsertCompiler$2.execute(UpsertCompiler.java:815) > at > org.apache.phoenix.compile.DelegateMutationPlan.execute(DelegateMutationPlan.java:31) > at > org.apache.phoenix.compile.PostIndexDDLCompiler$1.execute(PostIndexDDLCompiler.java:124) > at > org.apache.phoenix.query.ConnectionQueryServicesImpl.updateData(ConnectionQueryServicesImpl.java:2823) > at > org.apache.phoenix.schema.MetaDataClient.buildIndex(MetaDataClient.java:1079) > at > org.apache.phoenix.schema.MetaDataClient.createIndex(MetaDataClient.java:1382) > at > org.apache.phoenix.compile.CreateIndexCompiler$1.execute(CreateIndexCompiler.java:85) > at > org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:343) > at > org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:331) > at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53) > at > org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:330) > at > org.apache.phoenix.jdbc.PhoenixStatement.execute(PhoenixStatement.java:1440) > at > org.apache.phoenix.hbase.index.write.TestIndexWriter$1.run(TestIndexWriter.java:93) > Caused by: java.sql.SQLException: ERROR 201 (22000): Illegal data. > at > org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:441) > at > org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionIn
[jira] [Updated] (PHOENIX-3163) Split during global index creation may cause ERROR 201 error.
[ https://issues.apache.org/jira/browse/PHOENIX-3163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajeshbabu Chintaguntla updated PHOENIX-3163: - Attachment: PHOENIX-3163_v3.patch Here is the patch what I was telling. > Split during global index creation may cause ERROR 201 error. > - > > Key: PHOENIX-3163 > URL: https://issues.apache.org/jira/browse/PHOENIX-3163 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.8.0 >Reporter: Sergey Soldatov >Assignee: Sergey Soldatov > Fix For: 4.8.1 > > Attachments: PHOENIX-3163_v1.patch, PHOENIX-3163_v3.patch > > > When we create global index and split happen meanwhile there is a chance to > fail with ERROR 201: > {noformat} > 2016-08-08 15:55:17,248 INFO [Thread-6] > org.apache.phoenix.iterate.BaseResultIterators(878): Failed to execute task > during cancel > java.util.concurrent.ExecutionException: java.sql.SQLException: ERROR 201 > (22000): Illegal data. > at java.util.concurrent.FutureTask.report(FutureTask.java:122) > at java.util.concurrent.FutureTask.get(FutureTask.java:192) > at > org.apache.phoenix.iterate.BaseResultIterators.close(BaseResultIterators.java:872) > at > org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:809) > at > org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:713) > at > org.apache.phoenix.iterate.RoundRobinResultIterator.getIterators(RoundRobinResultIterator.java:176) > at > org.apache.phoenix.iterate.RoundRobinResultIterator.next(RoundRobinResultIterator.java:91) > at > org.apache.phoenix.compile.UpsertCompiler$2.execute(UpsertCompiler.java:815) > at > org.apache.phoenix.compile.DelegateMutationPlan.execute(DelegateMutationPlan.java:31) > at > org.apache.phoenix.compile.PostIndexDDLCompiler$1.execute(PostIndexDDLCompiler.java:124) > at > org.apache.phoenix.query.ConnectionQueryServicesImpl.updateData(ConnectionQueryServicesImpl.java:2823) > at > org.apache.phoenix.schema.MetaDataClient.buildIndex(MetaDataClient.java:1079) > at > org.apache.phoenix.schema.MetaDataClient.createIndex(MetaDataClient.java:1382) > at > org.apache.phoenix.compile.CreateIndexCompiler$1.execute(CreateIndexCompiler.java:85) > at > org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:343) > at > org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:331) > at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53) > at > org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:330) > at > org.apache.phoenix.jdbc.PhoenixStatement.execute(PhoenixStatement.java:1440) > at > org.apache.phoenix.hbase.index.write.TestIndexWriter$1.run(TestIndexWriter.java:93) > Caused by: java.sql.SQLException: ERROR 201 (22000): Illegal data. 
> at > org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:441) > at > org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:145) > at > org.apache.phoenix.schema.types.PDataType.newIllegalDataException(PDataType.java:287) > at > org.apache.phoenix.schema.types.PUnsignedSmallint$UnsignedShortCodec.decodeShort(PUnsignedSmallint.java:146) > at > org.apache.phoenix.schema.types.PSmallint.toObject(PSmallint.java:104) > at org.apache.phoenix.schema.types.PSmallint.toObject(PSmallint.java:28) > at > org.apache.phoenix.schema.types.PDataType.toObject(PDataType.java:980) > at > org.apache.phoenix.schema.types.PUnsignedSmallint.toObject(PUnsignedSmallint.java:102) > at > org.apache.phoenix.schema.types.PDataType.toObject(PDataType.java:980) > at > org.apache.phoenix.schema.types.PDataType.toObject(PDataType.java:992) > at > org.apache.phoenix.schema.types.PDataType.coerceBytes(PDataType.java:830) > at > org.apache.phoenix.schema.types.PDecimal.coerceBytes(PDecimal.java:342) > at > org.apache.phoenix.schema.types.PDataType.coerceBytes(PDataType.java:810) > at > org.apache.phoenix.expression.CoerceExpression.evaluate(CoerceExpression.java:149) > at > org.apache.phoenix.compile.ExpressionProjector.getValue(ExpressionProjector.java:69) > at > org.apache.phoenix.jdbc.PhoenixResultSet.getBytes(PhoenixResultSet.java:308) > at > org.apache.phoenix.compile.UpsertCompiler.upsertSelect(UpsertCompiler.java:197) > at > org.apache.phoenix.compile.UpsertCompiler.access$000(UpsertCompiler.java:115) > at > org.apache.phoenix.compile.UpsertCompiler$UpsertingParallelIteratorFactory.mutate(UpsertCompiler.java:259) > at > org.apache.phoenix.compile.MutatingParallelI
[jira] [Comment Edited] (PHOENIX-3163) Split during global index creation may cause ERROR 201 error.
[ https://issues.apache.org/jira/browse/PHOENIX-3163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15417355#comment-15417355 ] Rajeshbabu Chintaguntla edited comment on PHOENIX-3163 at 8/11/16 2:45 PM: --- Here is the patch what I was telling. Which can be improved by moving the attribute name to static field in BaseScannerRegionObserver. was (Author: rajeshbabu): Here is the patch what I was telling. > Split during global index creation may cause ERROR 201 error. > - > > Key: PHOENIX-3163 > URL: https://issues.apache.org/jira/browse/PHOENIX-3163 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.8.0 >Reporter: Sergey Soldatov >Assignee: Sergey Soldatov > Fix For: 4.8.1 > > Attachments: PHOENIX-3163_v1.patch, PHOENIX-3163_v3.patch > > > When we create global index and split happen meanwhile there is a chance to > fail with ERROR 201: > {noformat} > 2016-08-08 15:55:17,248 INFO [Thread-6] > org.apache.phoenix.iterate.BaseResultIterators(878): Failed to execute task > during cancel > java.util.concurrent.ExecutionException: java.sql.SQLException: ERROR 201 > (22000): Illegal data. > at java.util.concurrent.FutureTask.report(FutureTask.java:122) > at java.util.concurrent.FutureTask.get(FutureTask.java:192) > at > org.apache.phoenix.iterate.BaseResultIterators.close(BaseResultIterators.java:872) > at > org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:809) > at > org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:713) > at > org.apache.phoenix.iterate.RoundRobinResultIterator.getIterators(RoundRobinResultIterator.java:176) > at > org.apache.phoenix.iterate.RoundRobinResultIterator.next(RoundRobinResultIterator.java:91) > at > org.apache.phoenix.compile.UpsertCompiler$2.execute(UpsertCompiler.java:815) > at > org.apache.phoenix.compile.DelegateMutationPlan.execute(DelegateMutationPlan.java:31) > at > org.apache.phoenix.compile.PostIndexDDLCompiler$1.execute(PostIndexDDLCompiler.java:124) > at > org.apache.phoenix.query.ConnectionQueryServicesImpl.updateData(ConnectionQueryServicesImpl.java:2823) > at > org.apache.phoenix.schema.MetaDataClient.buildIndex(MetaDataClient.java:1079) > at > org.apache.phoenix.schema.MetaDataClient.createIndex(MetaDataClient.java:1382) > at > org.apache.phoenix.compile.CreateIndexCompiler$1.execute(CreateIndexCompiler.java:85) > at > org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:343) > at > org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:331) > at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53) > at > org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:330) > at > org.apache.phoenix.jdbc.PhoenixStatement.execute(PhoenixStatement.java:1440) > at > org.apache.phoenix.hbase.index.write.TestIndexWriter$1.run(TestIndexWriter.java:93) > Caused by: java.sql.SQLException: ERROR 201 (22000): Illegal data. 
> at > org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:441) > at > org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:145) > at > org.apache.phoenix.schema.types.PDataType.newIllegalDataException(PDataType.java:287) > at > org.apache.phoenix.schema.types.PUnsignedSmallint$UnsignedShortCodec.decodeShort(PUnsignedSmallint.java:146) > at > org.apache.phoenix.schema.types.PSmallint.toObject(PSmallint.java:104) > at org.apache.phoenix.schema.types.PSmallint.toObject(PSmallint.java:28) > at > org.apache.phoenix.schema.types.PDataType.toObject(PDataType.java:980) > at > org.apache.phoenix.schema.types.PUnsignedSmallint.toObject(PUnsignedSmallint.java:102) > at > org.apache.phoenix.schema.types.PDataType.toObject(PDataType.java:980) > at > org.apache.phoenix.schema.types.PDataType.toObject(PDataType.java:992) > at > org.apache.phoenix.schema.types.PDataType.coerceBytes(PDataType.java:830) > at > org.apache.phoenix.schema.types.PDecimal.coerceBytes(PDecimal.java:342) > at > org.apache.phoenix.schema.types.PDataType.coerceBytes(PDataType.java:810) > at > org.apache.phoenix.expression.CoerceExpression.evaluate(CoerceExpression.java:149) > at > org.apache.phoenix.compile.ExpressionProjector.getValue(ExpressionProjector.java:69) > at > org.apache.phoenix.jdbc.PhoenixResultSet.getBytes(PhoenixResultSet.java:308) > at > org.apache.phoenix.compile.UpsertCompiler.upsertSelect(UpsertCompiler.java:197) > at > org.apache.phoenix.compil
[jira] [Commented] (PHOENIX-3163) Split during global index creation may cause ERROR 201 error.
[ https://issues.apache.org/jira/browse/PHOENIX-3163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15417352#comment-15417352 ] Rajeshbabu Chintaguntla commented on PHOENIX-3163: -- For non aggregate queries on local indexes we need to call the iterator to create scanners for both the split regions otherwise query will fail with StaleRegionBoundaryCacheException. So we can add an attribute for client side upgrade select and create the ScanningResultIterator in that case or else go by calling iterator. > Split during global index creation may cause ERROR 201 error. > - > > Key: PHOENIX-3163 > URL: https://issues.apache.org/jira/browse/PHOENIX-3163 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.8.0 >Reporter: Sergey Soldatov >Assignee: Sergey Soldatov > Fix For: 4.8.1 > > Attachments: PHOENIX-3163_v1.patch > > > When we create global index and split happen meanwhile there is a chance to > fail with ERROR 201: > {noformat} > 2016-08-08 15:55:17,248 INFO [Thread-6] > org.apache.phoenix.iterate.BaseResultIterators(878): Failed to execute task > during cancel > java.util.concurrent.ExecutionException: java.sql.SQLException: ERROR 201 > (22000): Illegal data. > at java.util.concurrent.FutureTask.report(FutureTask.java:122) > at java.util.concurrent.FutureTask.get(FutureTask.java:192) > at > org.apache.phoenix.iterate.BaseResultIterators.close(BaseResultIterators.java:872) > at > org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:809) > at > org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:713) > at > org.apache.phoenix.iterate.RoundRobinResultIterator.getIterators(RoundRobinResultIterator.java:176) > at > org.apache.phoenix.iterate.RoundRobinResultIterator.next(RoundRobinResultIterator.java:91) > at > org.apache.phoenix.compile.UpsertCompiler$2.execute(UpsertCompiler.java:815) > at > org.apache.phoenix.compile.DelegateMutationPlan.execute(DelegateMutationPlan.java:31) > at > org.apache.phoenix.compile.PostIndexDDLCompiler$1.execute(PostIndexDDLCompiler.java:124) > at > org.apache.phoenix.query.ConnectionQueryServicesImpl.updateData(ConnectionQueryServicesImpl.java:2823) > at > org.apache.phoenix.schema.MetaDataClient.buildIndex(MetaDataClient.java:1079) > at > org.apache.phoenix.schema.MetaDataClient.createIndex(MetaDataClient.java:1382) > at > org.apache.phoenix.compile.CreateIndexCompiler$1.execute(CreateIndexCompiler.java:85) > at > org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:343) > at > org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:331) > at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53) > at > org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:330) > at > org.apache.phoenix.jdbc.PhoenixStatement.execute(PhoenixStatement.java:1440) > at > org.apache.phoenix.hbase.index.write.TestIndexWriter$1.run(TestIndexWriter.java:93) > Caused by: java.sql.SQLException: ERROR 201 (22000): Illegal data. 
> at > org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:441) > at > org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:145) > at > org.apache.phoenix.schema.types.PDataType.newIllegalDataException(PDataType.java:287) > at > org.apache.phoenix.schema.types.PUnsignedSmallint$UnsignedShortCodec.decodeShort(PUnsignedSmallint.java:146) > at > org.apache.phoenix.schema.types.PSmallint.toObject(PSmallint.java:104) > at org.apache.phoenix.schema.types.PSmallint.toObject(PSmallint.java:28) > at > org.apache.phoenix.schema.types.PDataType.toObject(PDataType.java:980) > at > org.apache.phoenix.schema.types.PUnsignedSmallint.toObject(PUnsignedSmallint.java:102) > at > org.apache.phoenix.schema.types.PDataType.toObject(PDataType.java:980) > at > org.apache.phoenix.schema.types.PDataType.toObject(PDataType.java:992) > at > org.apache.phoenix.schema.types.PDataType.coerceBytes(PDataType.java:830) > at > org.apache.phoenix.schema.types.PDecimal.coerceBytes(PDecimal.java:342) > at > org.apache.phoenix.schema.types.PDataType.coerceBytes(PDataType.java:810) > at > org.apache.phoenix.expression.CoerceExpression.evaluate(CoerceExpression.java:149) > at > org.apache.phoenix.compile.ExpressionProjector.getValue(ExpressionProjector.java:69) > at > org.apache.phoenix.jdbc.PhoenixResultSet.getBytes(PhoenixResultSet.java:308) > at > org.apache.phoenix.compile.UpsertCompiler.upsertSelect(UpsertComp
[jira] [Commented] (PHOENIX-3171) Update release notes for known issue PHOENIX-3164 in 4.8
[ https://issues.apache.org/jira/browse/PHOENIX-3171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15417341#comment-15417341 ] Rajeshbabu Chintaguntla commented on PHOENIX-3171: -- Added the release notes as part of PHOENIX-3123. > Update release notes for known issue PHOENIX-3164 in 4.8 > > > Key: PHOENIX-3171 > URL: https://issues.apache.org/jira/browse/PHOENIX-3171 > Project: Phoenix > Issue Type: Bug >Reporter: Ankit Singhal >Assignee: Rajeshbabu Chintaguntla > Attachments: PHOENIX-3171.patch > > > please update the release notes on below page related to PHOENIX-3164 and > with any workaround if possible. > https://phoenix.apache.org/release_notes.html -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (PHOENIX-3123) Document new local index implementation and upgrade steps
[ https://issues.apache.org/jira/browse/PHOENIX-3123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajeshbabu Chintaguntla updated PHOENIX-3123: - Attachment: PHOENIX-3123_v2.patch Added the required documentation and release notes as well. [~jamestaylor] please review. > Document new local index implementation and upgrade steps > - > > Key: PHOENIX-3123 > URL: https://issues.apache.org/jira/browse/PHOENIX-3123 > Project: Phoenix > Issue Type: Bug >Reporter: Rajeshbabu Chintaguntla >Assignee: Rajeshbabu Chintaguntla > Attachments: PHOENIX-3123.patch, PHOENIX-3123_v2.patch > > > Document local index new implementation after PHOENIX-1734 and steps to > upgrade from the old local indexes to new local indexes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (PHOENIX-3123) Document new local index implementation and upgrade steps
[ https://issues.apache.org/jira/browse/PHOENIX-3123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajeshbabu Chintaguntla updated PHOENIX-3123: - Attachment: (was: PHOENIX-3123.patch) > Document new local index implementation and upgrade steps > - > > Key: PHOENIX-3123 > URL: https://issues.apache.org/jira/browse/PHOENIX-3123 > Project: Phoenix > Issue Type: Bug >Reporter: Rajeshbabu Chintaguntla >Assignee: Rajeshbabu Chintaguntla > Attachments: PHOENIX-3123.patch > > > Document local index new implementation after PHOENIX-1734 and steps to > upgrade from the old local indexes to new local indexes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (PHOENIX-3123) Document new local index implementation and upgrade steps
[ https://issues.apache.org/jira/browse/PHOENIX-3123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajeshbabu Chintaguntla updated PHOENIX-3123: - Attachment: PHOENIX-3123.patch > Document new local index implementation and upgrade steps > - > > Key: PHOENIX-3123 > URL: https://issues.apache.org/jira/browse/PHOENIX-3123 > Project: Phoenix > Issue Type: Bug >Reporter: Rajeshbabu Chintaguntla >Assignee: Rajeshbabu Chintaguntla > Attachments: PHOENIX-3123.patch > > > Document local index new implementation after PHOENIX-1734 and steps to > upgrade from the old local indexes to new local indexes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (PHOENIX-3123) Document new local index implementation and upgrade steps
[ https://issues.apache.org/jira/browse/PHOENIX-3123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajeshbabu Chintaguntla updated PHOENIX-3123: - Attachment: PHOENIX-3123.patch > Document new local index implementation and upgrade steps > - > > Key: PHOENIX-3123 > URL: https://issues.apache.org/jira/browse/PHOENIX-3123 > Project: Phoenix > Issue Type: Bug >Reporter: Rajeshbabu Chintaguntla >Assignee: Rajeshbabu Chintaguntla > Attachments: PHOENIX-3123.patch > > > Document local index new implementation after PHOENIX-1734 and steps to > upgrade from the old local indexes to new local indexes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (PHOENIX-3123) Document new local index implementation and upgrade steps
[ https://issues.apache.org/jira/browse/PHOENIX-3123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajeshbabu Chintaguntla updated PHOENIX-3123: - Attachment: (was: PHOENIX-3123.patch) > Document new local index implementation and upgrade steps > - > > Key: PHOENIX-3123 > URL: https://issues.apache.org/jira/browse/PHOENIX-3123 > Project: Phoenix > Issue Type: Bug >Reporter: Rajeshbabu Chintaguntla >Assignee: Rajeshbabu Chintaguntla > > Document local index new implementation after PHOENIX-1734 and steps to > upgrade from the old local indexes to new local indexes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (PHOENIX-3123) Document new local index implementation and upgrade steps
[ https://issues.apache.org/jira/browse/PHOENIX-3123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajeshbabu Chintaguntla updated PHOENIX-3123: - Attachment: PHOENIX-3123.patch [~jamestaylor] Please review. > Document new local index implementation and upgrade steps > - > > Key: PHOENIX-3123 > URL: https://issues.apache.org/jira/browse/PHOENIX-3123 > Project: Phoenix > Issue Type: Bug >Reporter: Rajeshbabu Chintaguntla >Assignee: Rajeshbabu Chintaguntla > Attachments: PHOENIX-3123.patch > > > Document local index new implementation after PHOENIX-1734 and steps to > upgrade from the old local indexes to new local indexes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (PHOENIX-3176) Rows will be skipped which are having future timestamp in row_timestamp column
[ https://issues.apache.org/jira/browse/PHOENIX-3176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ankit Singhal updated PHOENIX-3176: --- Attachment: (was: PHOENIX-3176.patch) > Rows will be skipped which are having future timestamp in row_timestamp column > -- > > Key: PHOENIX-3176 > URL: https://issues.apache.org/jira/browse/PHOENIX-3176 > Project: Phoenix > Issue Type: Bug >Reporter: Ankit Singhal > Fix For: 4.8.1 > > Attachments: PHOENIX-3176.patch > > > Rows will be skipped when row_timestamp have future timestamp > {code} > : jdbc:phoenix:localhost> CREATE TABLE historian.data ( > . . . . . . . . . . . . .> assetid unsigned_int not null, > . . . . . . . . . . . . .> metricid unsigned_int not null, > . . . . . . . . . . . . .> ts timestamp not null, > . . . . . . . . . . . . .> val double > . . . . . . . . . . . . .> CONSTRAINT pk PRIMARY KEY (assetid, metricid, ts > row_timestamp)) > . . . . . . . . . . . . .> IMMUTABLE_ROWS=true; > No rows affected (1.283 seconds) > 0: jdbc:phoenix:localhost> upsert into historian.data > values(1,2,'2015-01-01',1.2); > 1 row affected (0.047 seconds) > 0: jdbc:phoenix:localhost> upsert into historian.data > values(1,2,'2018-01-01',1.2); > 1 row affected (0.005 seconds) > 0: jdbc:phoenix:localhost> select * from historian.data; > +--+---+--+--+ > | ASSETID | METRICID |TS| VAL | > +--+---+--+--+ > | 1| 2 | 2015-01-01 00:00:00.000 | 1.2 | > +--+---+--+--+ > 1 row selected (0.04 seconds) > 0: jdbc:phoenix:localhost> select count(*) from historian.data; > +---+ > | COUNT(1) | > +---+ > | 1 | > +---+ > 1 row selected (0.013 seconds) > {code} > Explain plan, where scan range is capped to compile time. > {code} > | CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER HISTORIAN.DATA | > | ROW TIMESTAMP FILTER [0, 1470901929982) | > | SERVER FILTER BY FIRST KEY ONLY | > | SERVER AGGREGATE INTO SINGLE ROW | > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (PHOENIX-3176) Rows will be skipped which are having future timestamp in row_timestamp column
[ https://issues.apache.org/jira/browse/PHOENIX-3176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ankit Singhal updated PHOENIX-3176: --- Attachment: PHOENIX-3176.patch > Rows will be skipped which are having future timestamp in row_timestamp column > -- > > Key: PHOENIX-3176 > URL: https://issues.apache.org/jira/browse/PHOENIX-3176 > Project: Phoenix > Issue Type: Bug >Reporter: Ankit Singhal > Fix For: 4.8.1 > > Attachments: PHOENIX-3176.patch > > > Rows will be skipped when row_timestamp have future timestamp > {code} > : jdbc:phoenix:localhost> CREATE TABLE historian.data ( > . . . . . . . . . . . . .> assetid unsigned_int not null, > . . . . . . . . . . . . .> metricid unsigned_int not null, > . . . . . . . . . . . . .> ts timestamp not null, > . . . . . . . . . . . . .> val double > . . . . . . . . . . . . .> CONSTRAINT pk PRIMARY KEY (assetid, metricid, ts > row_timestamp)) > . . . . . . . . . . . . .> IMMUTABLE_ROWS=true; > No rows affected (1.283 seconds) > 0: jdbc:phoenix:localhost> upsert into historian.data > values(1,2,'2015-01-01',1.2); > 1 row affected (0.047 seconds) > 0: jdbc:phoenix:localhost> upsert into historian.data > values(1,2,'2018-01-01',1.2); > 1 row affected (0.005 seconds) > 0: jdbc:phoenix:localhost> select * from historian.data; > +--+---+--+--+ > | ASSETID | METRICID |TS| VAL | > +--+---+--+--+ > | 1| 2 | 2015-01-01 00:00:00.000 | 1.2 | > +--+---+--+--+ > 1 row selected (0.04 seconds) > 0: jdbc:phoenix:localhost> select count(*) from historian.data; > +---+ > | COUNT(1) | > +---+ > | 1 | > +---+ > 1 row selected (0.013 seconds) > {code} > Explain plan, where scan range is capped to compile time. > {code} > | CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER HISTORIAN.DATA | > | ROW TIMESTAMP FILTER [0, 1470901929982) | > | SERVER FILTER BY FIRST KEY ONLY | > | SERVER AGGREGATE INTO SINGLE ROW | > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (PHOENIX-3176) Rows will be skipped which are having future timestamp in row_timestamp column
[ https://issues.apache.org/jira/browse/PHOENIX-3176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ankit Singhal updated PHOENIX-3176: --- Attachment: PHOENIX-3176.patch [~samarthjain], when SCN is not set and no row_timestamp filter is in the query, we should not need to cap the scan. {code} // If we haven't resolved the time at the beginning of compilation, don't - // force the lookup on the server, but use HConstants.LATEST_TIMESTAMP instead. - scn = tableRef.getTimeStamp(); - if (scn == QueryConstants.UNSET_TIMESTAMP) { - scn = HConstants.LATEST_TIMESTAMP; - } {code} > Rows will be skipped which are having future timestamp in row_timestamp column > -- > > Key: PHOENIX-3176 > URL: https://issues.apache.org/jira/browse/PHOENIX-3176 > Project: Phoenix > Issue Type: Bug >Reporter: Ankit Singhal > Fix For: 4.8.1 > > Attachments: PHOENIX-3176.patch > > > Rows will be skipped when row_timestamp have future timestamp > {code} > : jdbc:phoenix:localhost> CREATE TABLE historian.data ( > . . . . . . . . . . . . .> assetid unsigned_int not null, > . . . . . . . . . . . . .> metricid unsigned_int not null, > . . . . . . . . . . . . .> ts timestamp not null, > . . . . . . . . . . . . .> val double > . . . . . . . . . . . . .> CONSTRAINT pk PRIMARY KEY (assetid, metricid, ts > row_timestamp)) > . . . . . . . . . . . . .> IMMUTABLE_ROWS=true; > No rows affected (1.283 seconds) > 0: jdbc:phoenix:localhost> upsert into historian.data > values(1,2,'2015-01-01',1.2); > 1 row affected (0.047 seconds) > 0: jdbc:phoenix:localhost> upsert into historian.data > values(1,2,'2018-01-01',1.2); > 1 row affected (0.005 seconds) > 0: jdbc:phoenix:localhost> select * from historian.data; > +--+---+--+--+ > | ASSETID | METRICID |TS| VAL | > +--+---+--+--+ > | 1| 2 | 2015-01-01 00:00:00.000 | 1.2 | > +--+---+--+--+ > 1 row selected (0.04 seconds) > 0: jdbc:phoenix:localhost> select count(*) from historian.data; > +---+ > | COUNT(1) | > +---+ > | 1 | > +---+ > 1 row selected (0.013 seconds) > {code} > Explain plan, where scan range is capped to compile time. > {code} > | CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER HISTORIAN.DATA | > | ROW TIMESTAMP FILTER [0, 1470901929982) | > | SERVER FILTER BY FIRST KEY ONLY | > | SERVER AGGREGATE INTO SINGLE ROW | > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
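For context on the SCN referenced above: it is the client-supplied CurrentSCN connection property, which pins a connection's reads to a fixed timestamp. A minimal sketch against the HISTORIAN.DATA table from the description, with the JDBC URL and timestamp value used purely as illustrations:

{code}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.util.Properties;

public class CurrentScnSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Read the table as of a fixed point in time (epoch milliseconds).
        props.setProperty("CurrentSCN", Long.toString(1470901929982L));

        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost", props);
             ResultSet rs = conn.createStatement()
                                .executeQuery("SELECT COUNT(*) FROM HISTORIAN.DATA")) {
            if (rs.next()) {
                // Rows whose row_timestamp is later than the SCN are not visible.
                System.out.println("rows visible at this SCN: " + rs.getLong(1));
            }
        }
    }
}
{code}

The point of the patch is that when this property is not set and there is no row_timestamp filter in the query, the scan range should not be capped at the compile-time timestamp.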
[jira] [Created] (PHOENIX-3176) Rows will be skipped which are having future timestamp in row_timestamp column
Ankit Singhal created PHOENIX-3176: -- Summary: Rows will be skipped which are having future timestamp in row_timestamp column Key: PHOENIX-3176 URL: https://issues.apache.org/jira/browse/PHOENIX-3176 Project: Phoenix Issue Type: Bug Reporter: Ankit Singhal Fix For: 4.8.1 Rows will be skipped when row_timestamp have future timestamp {code} : jdbc:phoenix:localhost> CREATE TABLE historian.data ( . . . . . . . . . . . . .> assetid unsigned_int not null, . . . . . . . . . . . . .> metricid unsigned_int not null, . . . . . . . . . . . . .> ts timestamp not null, . . . . . . . . . . . . .> val double . . . . . . . . . . . . .> CONSTRAINT pk PRIMARY KEY (assetid, metricid, ts row_timestamp)) . . . . . . . . . . . . .> IMMUTABLE_ROWS=true; No rows affected (1.283 seconds) 0: jdbc:phoenix:localhost> upsert into historian.data values(1,2,'2015-01-01',1.2); 1 row affected (0.047 seconds) 0: jdbc:phoenix:localhost> upsert into historian.data values(1,2,'2018-01-01',1.2); 1 row affected (0.005 seconds) 0: jdbc:phoenix:localhost> select * from historian.data; +--+---+--+--+ | ASSETID | METRICID |TS| VAL | +--+---+--+--+ | 1| 2 | 2015-01-01 00:00:00.000 | 1.2 | +--+---+--+--+ 1 row selected (0.04 seconds) 0: jdbc:phoenix:localhost> select count(*) from historian.data; +---+ | COUNT(1) | +---+ | 1 | +---+ 1 row selected (0.013 seconds) {code} Explain plan, where scan range is capped to compile time. {code} | CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER HISTORIAN.DATA | | ROW TIMESTAMP FILTER [0, 1470901929982) | | SERVER FILTER BY FIRST KEY ONLY | | SERVER AGGREGATE INTO SINGLE ROW | {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-2336) Queries with small case column-names return empty result-set when working with Spark Datasource Plugin
[ https://issues.apache.org/jira/browse/PHOENIX-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15417025#comment-15417025 ] Kalyan commented on PHOENIX-2336: - Hi Josh Mahonin, the same patch will work for PHOENIX-2336, PHOENIX-2290 and PHOENIX-2547. I added the unit tests with appropriate comments as well. Comments: Limitation: filter / where expressions are not allowed with "double quotes"; pass them as column expressions instead. Reason: if the expression contains "double quotes", the Spark SQL parser skips evaluating it and hands it down to the next level to handle. Please review this patch https://github.com/kalyanhadooptraining/phoenix/commit/81df0c698ba4155a8f73ffe0ad657e9a5640d811 > Queries with small case column-names return empty result-set when working > with Spark Datasource Plugin > --- > > Key: PHOENIX-2336 > URL: https://issues.apache.org/jira/browse/PHOENIX-2336 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.6.0 >Reporter: Suhas Nalapure >Assignee: Josh Mahonin > Labels: verify > Fix For: 4.9.0 > > > Hi, > The Spark DataFrame filter operation returns empty result-set when > column-name is in the smaller case. Example below: > DataFrame df = > sqlContext.read().format("org.apache.phoenix.spark").options(params).load(); > df.filter("\"col1\" = '5.0'").show(); > Result: > +---++---+---+---+--- > | ID|col1| c1| d2| d3| d4| > +---++---+---+---+---+ > +---++---+---+---+---+ > Whereas the table actually has some rows matching the filter condition. And > if double quotes are removed from around the column name i.e. df.filter("col1 > = '5.0'").show(); , a ColumnNotFoundException is thrown: > Exception in thread "main" java.lang.RuntimeException: > org.apache.phoenix.schema.ColumnNotFoundException: ERROR 504 (42703): > Undefined column. columnName=D1 > at > org.apache.phoenix.mapreduce.PhoenixInputFormat.getQueryPlan(PhoenixInputFormat.java:125) > at > org.apache.phoenix.mapreduce.PhoenixInputFormat.getSplits(PhoenixInputFormat.java:80) > at > org.apache.spark.rdd.NewHadoopRDD.getPartitions(NewHadoopRDD.scala:95) > at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239) > at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237) > at scala.Option.getOrElse(Option.scala:120) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
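A sketch of the column-expression workaround mentioned above, assuming a Spark 1.x SQLContext and the phoenix-spark datasource; the table name and ZooKeeper URL are illustrative, not taken from the issue:

{code}
import java.util.HashMap;
import java.util.Map;

import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;

public class PhoenixSparkFilterSketch {
    public static DataFrame filterOnLowerCaseColumn(SQLContext sqlContext) {
        Map<String, String> params = new HashMap<String, String>();
        params.put("table", "MY_TABLE");        // illustrative table name
        params.put("zkUrl", "localhost:2181");  // illustrative ZooKeeper quorum

        DataFrame df = sqlContext.read()
                                 .format("org.apache.phoenix.spark")
                                 .options(params)
                                 .load();

        // Instead of a string expression with escaped double quotes
        // (df.filter("\"col1\" = '5.0'")), which hits the limitation described
        // above, build a Column expression so the case-sensitive lower-case
        // column name reaches Phoenix unchanged.
        return df.filter(df.col("col1").equalTo("5.0"));
    }
}
{code}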
[jira] [Commented] (PHOENIX-541) Make mutable batch size bytes-based instead of row-based
[ https://issues.apache.org/jira/browse/PHOENIX-541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15416922#comment-15416922 ] Saurabh Seth commented on PHOENIX-541: -- If no one has started working on this, maybe I can take this up. I have a question though. What is the plan for the phoenix.mutate.batchSize and phoenix.mutate.maxSize configuration variables that are exposed to end users? How do we want to deal with this for upgrades? > Make mutable batch size bytes-based instead of row-based > > > Key: PHOENIX-541 > URL: https://issues.apache.org/jira/browse/PHOENIX-541 > Project: Phoenix > Issue Type: Improvement >Affects Versions: 3.0-Release >Reporter: mujtaba > Labels: newbie > > With current configuration of row-count based mutable batch size, ideal value > for batch size is around 800 rather than the current 15k when creating indexes > based on memory consumption, CPU and GC (data size: key: ~60 bytes, 14 > integer columns in separate CFs) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
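For reference, a sketch of how an end user sets these row-based knobs today via client-side connection properties; the JDBC URL, table name, and values are illustrative, and under this proposal the values would presumably become byte counts rather than row counts:

{code}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.util.Properties;

public class MutateBatchSizeSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.setProperty("phoenix.mutate.batchSize", "800");   // illustrative rows-per-batch value
        props.setProperty("phoenix.mutate.maxSize", "500000");  // illustrative max-pending-rows value

        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost", props)) {
            conn.setAutoCommit(false);
            try (PreparedStatement ps =
                     conn.prepareStatement("UPSERT INTO MY_TABLE (PK, VAL) VALUES (?, ?)")) {
                for (int i = 0; i < 10000; i++) {
                    ps.setInt(1, i);
                    ps.setInt(2, i * 2);
                    ps.executeUpdate();
                    if (i % 800 == 0) {
                        conn.commit();  // flush the accumulated mutations periodically
                    }
                }
            }
            conn.commit();
        }
    }
}
{code}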
[jira] [Resolved] (PHOENIX-2821) Document Offset and Fetch SQL construct
[ https://issues.apache.org/jira/browse/PHOENIX-2821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ankit Singhal resolved PHOENIX-2821. Resolution: Fixed Fix Version/s: 4.8.0 > Document Offset and Fetch SQL construct > > > Key: PHOENIX-2821 > URL: https://issues.apache.org/jira/browse/PHOENIX-2821 > Project: Phoenix > Issue Type: Sub-task >Reporter: Ankit Singhal >Assignee: Ankit Singhal > Fix For: 4.8.0 > > Attachments: PHOENIX-2821.patch, PHOENIX-2821_v1.patch, > PHOENIX-2821_v2.patch, PHOENIX-2821_v3.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
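For readers who haven't seen the construct being documented here, a small sketch of paging through a table with it; the table and column names are hypothetical, and the committed documentation is the authoritative reference for the grammar:

{code}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;

public class OffsetFetchSketch {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost")) {
            // Skip the first 20 rows of the ordered result and return the next 10.
            String sql = "SELECT * FROM MY_TABLE ORDER BY PK LIMIT 10 OFFSET 20";
            try (ResultSet rs = conn.createStatement().executeQuery(sql)) {
                while (rs.next()) {
                    System.out.println(rs.getString("PK"));
                }
            }
        }
    }
}
{code}

The ANSI FETCH FIRST/NEXT n ROWS ONLY spelling documented by the same work expresses the same limit.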
[jira] [Commented] (PHOENIX-2821) Document Offset and Fetch SQL construct
[ https://issues.apache.org/jira/browse/PHOENIX-2821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15416693#comment-15416693 ] James Taylor commented on PHOENIX-2821: --- +1. Thanks for the updates! > Document Offset and Fetch SQL construct > > > Key: PHOENIX-2821 > URL: https://issues.apache.org/jira/browse/PHOENIX-2821 > Project: Phoenix > Issue Type: Sub-task >Reporter: Ankit Singhal >Assignee: Ankit Singhal > Attachments: PHOENIX-2821.patch, PHOENIX-2821_v1.patch, > PHOENIX-2821_v2.patch, PHOENIX-2821_v3.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (PHOENIX-2821) Document Offset and Fetch SQL construct
[ https://issues.apache.org/jira/browse/PHOENIX-2821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ankit Singhal updated PHOENIX-2821: --- Attachment: PHOENIX-2821_v3.patch Sorry about that, the last diff was mistakenly taken from the publish directory > Document Offset and Fetch SQL construct > > > Key: PHOENIX-2821 > URL: https://issues.apache.org/jira/browse/PHOENIX-2821 > Project: Phoenix > Issue Type: Sub-task >Reporter: Ankit Singhal >Assignee: Ankit Singhal > Attachments: PHOENIX-2821.patch, PHOENIX-2821_v1.patch, > PHOENIX-2821_v2.patch, PHOENIX-2821_v3.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-3128) Remove extraneous operations during upsert with local immutable index
[ https://issues.apache.org/jira/browse/PHOENIX-3128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15416686#comment-15416686 ] James Taylor commented on PHOENIX-3128: --- There are Gets done on the server side to look up the prior value of a row; these should no longer be done. Maybe you can confirm, [~rajeshbabu]? The overhead of index maintenance for local immutable indexes should be much lower (maybe 30% lower). > Remove extraneous operations during upsert with local immutable index > - > > Key: PHOENIX-3128 > URL: https://issues.apache.org/jira/browse/PHOENIX-3128 > Project: Phoenix > Issue Type: Bug >Reporter: Junegunn Choi >Assignee: Junegunn Choi > Fix For: 4.8.0 > > Attachments: PHOENIX-3128.patch, PHOENIX-3128_v2.patch, > PHOENIX-3128_v3.patch, PHOENIX-3128_v4.patch, PHOENIX-3128_v5.patch, > PHOENIX-3128_v6.patch, PHOENIX-3128_v7.patch, PHOENIX-3128_v8.patch, > PHOENIX-3128_wip.patch > > > Upsert to a table with a local immutable index is supposed to be more > efficient than to a table with a local mutable index, but it's actually > slower (in our environment by 30%) due to extraneous operations involved. > The problem is twofold: > 1. Client unnecessarily prepares and sends index update. > 2. Index cleanup is done regardless of the immutability of the table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
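To ground the discussion, a sketch of the kind of schema in question: an immutable table with a local index, where each upsert ideally needs no server-side read of the prior row state. The table, index, and column names are illustrative:

{code}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class LocalImmutableIndexSketch {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost")) {
            conn.setAutoCommit(true);
            Statement stmt = conn.createStatement();

            // Rows are written once and never updated, so index maintenance should
            // not need to look up the previous row state on the server.
            stmt.execute("CREATE TABLE EVENTS (ID BIGINT PRIMARY KEY, HOST VARCHAR, VAL DOUBLE) "
                    + "IMMUTABLE_ROWS=true");
            stmt.execute("CREATE LOCAL INDEX EVENTS_HOST_IDX ON EVENTS (HOST)");

            stmt.executeUpdate("UPSERT INTO EVENTS VALUES (1, 'host-a', 1.5)");
        }
    }
}
{code}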
[jira] [Commented] (PHOENIX-2821) Document Offset and Fetch SQL construct
[ https://issues.apache.org/jira/browse/PHOENIX-2821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15416683#comment-15416683 ] James Taylor commented on PHOENIX-2821: --- I'm not seeing any .md or site.xml in the patch. Maybe some files were left out (or I missed something)? > Document Offset and Fetch SQL construct > > > Key: PHOENIX-2821 > URL: https://issues.apache.org/jira/browse/PHOENIX-2821 > Project: Phoenix > Issue Type: Sub-task >Reporter: Ankit Singhal >Assignee: Ankit Singhal > Attachments: PHOENIX-2821.patch, PHOENIX-2821_v1.patch, > PHOENIX-2821_v2.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-2821) Document Offset and Fetch SQL construct
[ https://issues.apache.org/jira/browse/PHOENIX-2821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15416673#comment-15416673 ] Ankit Singhal commented on PHOENIX-2821: bq. let's move this page back to the Features menu (maybe under Row Timestamp Column item) Done in the latest patch. Is it ready for commit now? > Document Offset and Fetch SQL construct > > > Key: PHOENIX-2821 > URL: https://issues.apache.org/jira/browse/PHOENIX-2821 > Project: Phoenix > Issue Type: Sub-task >Reporter: Ankit Singhal >Assignee: Ankit Singhal > Attachments: PHOENIX-2821.patch, PHOENIX-2821_v1.patch, > PHOENIX-2821_v2.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (PHOENIX-2821) Document Offset and Fetch SQL construct
[ https://issues.apache.org/jira/browse/PHOENIX-2821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ankit Singhal updated PHOENIX-2821: --- Attachment: PHOENIX-2821_v2.patch > Document Offset and Fetch SQL construct > > > Key: PHOENIX-2821 > URL: https://issues.apache.org/jira/browse/PHOENIX-2821 > Project: Phoenix > Issue Type: Sub-task >Reporter: Ankit Singhal >Assignee: Ankit Singhal > Attachments: PHOENIX-2821.patch, PHOENIX-2821_v1.patch, > PHOENIX-2821_v2.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)