[
https://issues.apache.org/jira/browse/PHOENIX-5136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Thomas D'Silva reassigned PHOENIX-5136:
---------------------------------------
Assignee: Miles Spielberg
> Rows with null values inserted by UPSERT .. ON DUPLICATE KEY UPDATE are
> included in query results when they shouldn't be
> ------------------------------------------------------------------------------------------------------------------------
>
> Key: PHOENIX-5136
> URL: https://issues.apache.org/jira/browse/PHOENIX-5136
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 5.0.0
> Reporter: Hieu Nguyen
> Assignee: Miles Spielberg
> Priority: Major
> Fix For: 4.15.0, 5.1.0
>
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> Rows with null values inserted using UPSERT .. ON DUPLICATE KEY UPDATE will
> be selected in queries when they should not be.
> Here is a failing test that demonstrates the issue:
> {noformat}
> @Test
> public void
> testRowsCreatedViaUpsertOnDuplicateKeyShouldNotBeReturnedInQueryIfNotMatched()
> throws Exception {
> Properties props = PropertiesUtil.deepCopy(TEST_PROPERTIES);
> Connection conn = DriverManager.getConnection(getUrl(), props);
> String tableName = generateUniqueName();
> String ddl = " create table " + tableName + "(pk varchar primary key,
> counter1 bigint, counter2 smallint)";
> conn.createStatement().execute(ddl);
> createIndex(conn, tableName);
> // The data has to be specifically starting with null for the first
> counter to fail the test. If you reverse the values, the test passes.
> String dml1 = "UPSERT INTO " + tableName + " VALUES('a',NULL,2) ON
> DUPLICATE KEY UPDATE " +
> "counter1 = CASE WHEN (counter1 IS NULL) THEN NULL ELSE counter1
> END, " +
> "counter2 = CASE WHEN (counter1 IS NULL) THEN 2 ELSE counter2
> END";
> conn.createStatement().execute(dml1);
> conn.commit();
> String dml2 = "UPSERT INTO " + tableName + " VALUES('b',1,2) ON DUPLICATE
> KEY UPDATE " +
> "counter1 = CASE WHEN (counter1 IS NULL) THEN 1 ELSE counter1
> END, " +
> "counter2 = CASE WHEN (counter1 IS NULL) THEN 2 ELSE counter2
> END";
> conn.createStatement().execute(dml2);
> conn.commit();
> // Using this statement causes the test to pass
> //ResultSet rs = conn.createStatement().executeQuery("SELECT * FROM " +
> tableName + " WHERE counter2 = 2 AND counter1 = 1");
> // This statement should be equivalent to the one above, but it selects
> both rows.
> ResultSet rs = conn.createStatement().executeQuery("SELECT * FROM " +
> tableName + " WHERE counter2 = 2 AND (counter1 = 1 OR counter1 = 1)");
> assertTrue(rs.next());
> assertEquals("b",rs.getString(1));
> assertEquals(1,rs.getLong(2));
> assertEquals(2,rs.getLong(3));
> assertFalse(rs.next());
> conn.close();
> }{noformat}
> The conditions are fairly specific:
> * Must use ON DUPLICATE KEY UPDATE. Inserting rows using UPSERT by itself
> will have correct results
> * The "counter2 = 2 AND (counter1 = 1 OR counter1 = 1)" condition caused the
> test to fail, as opposed to the equivalent but simpler "counter2 = 2 AND
> counter1 = 1". I tested a similar "counter2 = 2 AND (counter1 = 1 OR
> counter1 < 1)", which also caused the test to fail.
> * If the NULL value for row 'a' is instead in the last position (counter2),
> then row 'a' is not selected in the query as expected. The below test
> demonstrates this behavior (it passes as expected):
> {noformat}
> @Test
> public void
> testRowsCreatedViaUpsertOnDuplicateKeyShouldNotBeReturnedInQueryIfNotMatched()
> throws Exception {
> Properties props = PropertiesUtil.deepCopy(TEST_PROPERTIES);
> Connection conn = DriverManager.getConnection(getUrl(), props);
> String tableName = generateUniqueName();
> String ddl = " create table " + tableName + "(pk varchar primary key,
> counter1 bigint, counter2 smallint)";
> conn.createStatement().execute(ddl);
> createIndex(conn, tableName);
> String dml1 = "UPSERT INTO " + tableName + " VALUES('a',1,NULL) ON
> DUPLICATE KEY UPDATE " +
> "counter1 = CASE WHEN (counter1 IS NULL) THEN 1 ELSE counter1
> END, " +
> "counter2 = CASE WHEN (counter1 IS NULL) THEN NULL ELSE counter2
> END";
> conn.createStatement().execute(dml1);
> conn.commit();
> String dml2 = "UPSERT INTO " + tableName + " VALUES('b',1,2) ON DUPLICATE
> KEY UPDATE " +
> "counter1 = CASE WHEN (counter1 IS NULL) THEN 1 ELSE counter1
> END, " +
> "counter2 = CASE WHEN (counter1 IS NULL) THEN 2 ELSE counter2
> END";
> conn.createStatement().execute(dml2);
> conn.commit();
> ResultSet rs = conn.createStatement().executeQuery("SELECT * FROM " +
> tableName + " WHERE counter1 = 1 AND (counter2 = 2 OR counter2 = 2)");
> assertTrue(rs.next());
> assertEquals("b",rs.getString(1));
> assertEquals(1,rs.getLong(2));
> assertEquals(2,rs.getLong(3));
> assertFalse(rs.next());
> conn.close();
> }
> {noformat}
> We also noticed this behavior when upserting and selecting manually against a
> View.
> Any ideas on where to look to fix this issue?
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)