[ 
https://issues.apache.org/jira/browse/PHOENIX-5136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas D'Silva reassigned PHOENIX-5136:
---------------------------------------

    Assignee: Miles Spielberg

> Rows with null values inserted by UPSERT .. ON DUPLICATE KEY UPDATE are 
> included in query results when they shouldn't be
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-5136
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-5136
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 5.0.0
>            Reporter: Hieu Nguyen
>            Assignee: Miles Spielberg
>            Priority: Major
>             Fix For: 4.15.0, 5.1.0
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Rows with null values inserted using UPSERT .. ON DUPLICATE KEY UPDATE will 
> be selected in queries when they should not be.
> Here is a failing test that demonstrates the issue:
> {noformat}
> @Test
> public void 
> testRowsCreatedViaUpsertOnDuplicateKeyShouldNotBeReturnedInQueryIfNotMatched()
>  throws Exception {
>     Properties props = PropertiesUtil.deepCopy(TEST_PROPERTIES);
>     Connection conn = DriverManager.getConnection(getUrl(), props);
>     String tableName = generateUniqueName();
>     String ddl = " create table " + tableName + "(pk varchar primary key, 
> counter1 bigint, counter2 smallint)";
>     conn.createStatement().execute(ddl);
>     createIndex(conn, tableName);
>     // The data has to be specifically starting with null for the first 
> counter to fail the test. If you reverse the values, the test passes.
>     String dml1 = "UPSERT INTO " + tableName + " VALUES('a',NULL,2) ON 
> DUPLICATE KEY UPDATE " +
>             "counter1 = CASE WHEN (counter1 IS NULL) THEN NULL ELSE counter1 
> END, " +
>             "counter2 = CASE WHEN (counter1 IS NULL) THEN 2 ELSE counter2 
> END";
>     conn.createStatement().execute(dml1);
>     conn.commit();
>     String dml2 = "UPSERT INTO " + tableName + " VALUES('b',1,2) ON DUPLICATE 
> KEY UPDATE " +
>             "counter1 = CASE WHEN (counter1 IS NULL) THEN 1 ELSE counter1 
> END, " +
>             "counter2 = CASE WHEN (counter1 IS NULL) THEN 2 ELSE counter2 
> END";
>     conn.createStatement().execute(dml2);
>     conn.commit();
>     // Using this statement causes the test to pass
>     //ResultSet rs = conn.createStatement().executeQuery("SELECT * FROM " + 
> tableName + " WHERE counter2 = 2 AND counter1 = 1");
>     // This statement should be equivalent to the one above, but it selects 
> both rows.
>     ResultSet rs = conn.createStatement().executeQuery("SELECT * FROM " + 
> tableName + " WHERE counter2 = 2 AND (counter1 = 1 OR counter1 = 1)");
>     assertTrue(rs.next());
>     assertEquals("b",rs.getString(1));
>     assertEquals(1,rs.getLong(2));
>     assertEquals(2,rs.getLong(3));
>     assertFalse(rs.next());
>     conn.close();
> }{noformat}
> The conditions are fairly specific:
>  * Must use ON DUPLICATE KEY UPDATE.  Inserting rows using UPSERT by itself 
> will have correct results
>  * The "counter2 = 2 AND (counter1 = 1 OR counter1 = 1)" condition caused the 
> test to fail, as opposed to the equivalent but simpler "counter2 = 2 AND 
> counter1 = 1".  I tested a similar "counter2 = 2 AND (counter1 = 1 OR 
> counter1 < 1)", which also caused the test to fail.
>  * If the NULL value for row 'a' is instead in the last position (counter2), 
> then row 'a' is not selected in the query as expected.  The below test 
> demonstrates this behavior (it passes as expected):
> {noformat}
> @Test
> public void 
> testRowsCreatedViaUpsertOnDuplicateKeyShouldNotBeReturnedInQueryIfNotMatched()
>  throws Exception {
>     Properties props = PropertiesUtil.deepCopy(TEST_PROPERTIES);
>     Connection conn = DriverManager.getConnection(getUrl(), props);
>     String tableName = generateUniqueName();
>     String ddl = " create table " + tableName + "(pk varchar primary key, 
> counter1 bigint, counter2 smallint)";
>     conn.createStatement().execute(ddl);
>     createIndex(conn, tableName);
>     String dml1 = "UPSERT INTO " + tableName + " VALUES('a',1,NULL) ON 
> DUPLICATE KEY UPDATE " +
>             "counter1 = CASE WHEN (counter1 IS NULL) THEN 1 ELSE counter1 
> END, " +
>             "counter2 = CASE WHEN (counter1 IS NULL) THEN NULL ELSE counter2 
> END";
>     conn.createStatement().execute(dml1);
>     conn.commit();
>     String dml2 = "UPSERT INTO " + tableName + " VALUES('b',1,2) ON DUPLICATE 
> KEY UPDATE " +
>             "counter1 = CASE WHEN (counter1 IS NULL) THEN 1 ELSE counter1 
> END, " +
>             "counter2 = CASE WHEN (counter1 IS NULL) THEN 2 ELSE counter2 
> END";
>     conn.createStatement().execute(dml2);
>     conn.commit();
>     ResultSet rs = conn.createStatement().executeQuery("SELECT * FROM " + 
> tableName + " WHERE counter1 = 1 AND (counter2 = 2 OR counter2 = 2)");
>     assertTrue(rs.next());
>     assertEquals("b",rs.getString(1));
>     assertEquals(1,rs.getLong(2));
>     assertEquals(2,rs.getLong(3));
>     assertFalse(rs.next());
>     conn.close();
> }
> {noformat}
> We also noticed this behavior when upserting and selecting manually against a 
> View.
> Any ideas on where to look to fix this issue?



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

Reply via email to