[ https://issues.apache.org/jira/browse/PHOENIX-5136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Thomas D'Silva reassigned PHOENIX-5136: --------------------------------------- Assignee: Miles Spielberg > Rows with null values inserted by UPSERT .. ON DUPLICATE KEY UPDATE are > included in query results when they shouldn't be > ------------------------------------------------------------------------------------------------------------------------ > > Key: PHOENIX-5136 > URL: https://issues.apache.org/jira/browse/PHOENIX-5136 > Project: Phoenix > Issue Type: Bug > Affects Versions: 5.0.0 > Reporter: Hieu Nguyen > Assignee: Miles Spielberg > Priority: Major > Fix For: 4.15.0, 5.1.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > Rows with null values inserted using UPSERT .. ON DUPLICATE KEY UPDATE will > be selected in queries when they should not be. > Here is a failing test that demonstrates the issue: > {noformat} > @Test > public void > testRowsCreatedViaUpsertOnDuplicateKeyShouldNotBeReturnedInQueryIfNotMatched() > throws Exception { > Properties props = PropertiesUtil.deepCopy(TEST_PROPERTIES); > Connection conn = DriverManager.getConnection(getUrl(), props); > String tableName = generateUniqueName(); > String ddl = " create table " + tableName + "(pk varchar primary key, > counter1 bigint, counter2 smallint)"; > conn.createStatement().execute(ddl); > createIndex(conn, tableName); > // The data has to be specifically starting with null for the first > counter to fail the test. If you reverse the values, the test passes. > String dml1 = "UPSERT INTO " + tableName + " VALUES('a',NULL,2) ON > DUPLICATE KEY UPDATE " + > "counter1 = CASE WHEN (counter1 IS NULL) THEN NULL ELSE counter1 > END, " + > "counter2 = CASE WHEN (counter1 IS NULL) THEN 2 ELSE counter2 > END"; > conn.createStatement().execute(dml1); > conn.commit(); > String dml2 = "UPSERT INTO " + tableName + " VALUES('b',1,2) ON DUPLICATE > KEY UPDATE " + > "counter1 = CASE WHEN (counter1 IS NULL) THEN 1 ELSE counter1 > END, " + > "counter2 = CASE WHEN (counter1 IS NULL) THEN 2 ELSE counter2 > END"; > conn.createStatement().execute(dml2); > conn.commit(); > // Using this statement causes the test to pass > //ResultSet rs = conn.createStatement().executeQuery("SELECT * FROM " + > tableName + " WHERE counter2 = 2 AND counter1 = 1"); > // This statement should be equivalent to the one above, but it selects > both rows. > ResultSet rs = conn.createStatement().executeQuery("SELECT * FROM " + > tableName + " WHERE counter2 = 2 AND (counter1 = 1 OR counter1 = 1)"); > assertTrue(rs.next()); > assertEquals("b",rs.getString(1)); > assertEquals(1,rs.getLong(2)); > assertEquals(2,rs.getLong(3)); > assertFalse(rs.next()); > conn.close(); > }{noformat} > The conditions are fairly specific: > * Must use ON DUPLICATE KEY UPDATE. Inserting rows using UPSERT by itself > will have correct results > * The "counter2 = 2 AND (counter1 = 1 OR counter1 = 1)" condition caused the > test to fail, as opposed to the equivalent but simpler "counter2 = 2 AND > counter1 = 1". I tested a similar "counter2 = 2 AND (counter1 = 1 OR > counter1 < 1)", which also caused the test to fail. > * If the NULL value for row 'a' is instead in the last position (counter2), > then row 'a' is not selected in the query as expected. The below test > demonstrates this behavior (it passes as expected): > {noformat} > @Test > public void > testRowsCreatedViaUpsertOnDuplicateKeyShouldNotBeReturnedInQueryIfNotMatched() > throws Exception { > Properties props = PropertiesUtil.deepCopy(TEST_PROPERTIES); > Connection conn = DriverManager.getConnection(getUrl(), props); > String tableName = generateUniqueName(); > String ddl = " create table " + tableName + "(pk varchar primary key, > counter1 bigint, counter2 smallint)"; > conn.createStatement().execute(ddl); > createIndex(conn, tableName); > String dml1 = "UPSERT INTO " + tableName + " VALUES('a',1,NULL) ON > DUPLICATE KEY UPDATE " + > "counter1 = CASE WHEN (counter1 IS NULL) THEN 1 ELSE counter1 > END, " + > "counter2 = CASE WHEN (counter1 IS NULL) THEN NULL ELSE counter2 > END"; > conn.createStatement().execute(dml1); > conn.commit(); > String dml2 = "UPSERT INTO " + tableName + " VALUES('b',1,2) ON DUPLICATE > KEY UPDATE " + > "counter1 = CASE WHEN (counter1 IS NULL) THEN 1 ELSE counter1 > END, " + > "counter2 = CASE WHEN (counter1 IS NULL) THEN 2 ELSE counter2 > END"; > conn.createStatement().execute(dml2); > conn.commit(); > ResultSet rs = conn.createStatement().executeQuery("SELECT * FROM " + > tableName + " WHERE counter1 = 1 AND (counter2 = 2 OR counter2 = 2)"); > assertTrue(rs.next()); > assertEquals("b",rs.getString(1)); > assertEquals(1,rs.getLong(2)); > assertEquals(2,rs.getLong(3)); > assertFalse(rs.next()); > conn.close(); > } > {noformat} > We also noticed this behavior when upserting and selecting manually against a > View. > Any ideas on where to look to fix this issue? -- This message was sent by Atlassian JIRA (v7.6.14#76016)