zheliu2 opened a new pull request, #15564:
URL: https://github.com/apache/iceberg/pull/15564

   ## Summary
   
   The executor cache for equality deletes in 
`BaseDeleteLoader.getOrReadEqDeletes` used only the delete file location as the 
cache key. When different queries on the same table require different 
projections of the equality delete columns (determined by 
`DeleteFilter.applyEqDeletes` based on the query's required schema), the cached 
result from the first query would be incorrectly reused for subsequent queries 
with a different projection. This caused equality deletes to be silently 
ignored for the second query.
   
   The fix adds the field IDs of the projected columns to the cache key, so that 
entries read with different projections are cached and retrieved independently.
   
   **Root cause:** In `BaseDeleteLoader` line 111, `String cacheKey = 
deleteFile.location()` does not account for the `projection` parameter. The 
projection varies per query because `DeleteFilter.applyEqDeletes` computes 
`deleteSchema = TypeUtil.select(requiredSchema, ids)` where `requiredSchema` 
depends on which columns the query reads.
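
The collision described above can be reproduced with a toy model. This is a minimal sketch, not Iceberg code: `LocationOnlyCacheSketch`, `getOrLoad`, the file path, and the column names are all hypothetical, and a plain `HashMap` stands in for the executor cache. It only illustrates that a location-only key returns the first projection's result to a later caller that asked for a different one.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Toy model of the bug (hypothetical names, not the Iceberg implementation):
// a cache keyed only by the delete file location.
class LocationOnlyCacheSketch {
  private final Map<String, List<String>> cache = new HashMap<>();

  // Returns the cached value for this location, or loads and caches it.
  List<String> getOrLoad(String location, Supplier<List<String>> loader) {
    return cache.computeIfAbsent(location, key -> loader.get());
  }

  // Simulates two queries that read the same delete file with different projections.
  static List<String> demo() {
    LocationOnlyCacheSketch cache = new LocationOnlyCacheSketch();
    String location = "s3://bucket/table/data/eq-delete-1.parquet"; // hypothetical path

    // First query projects only the column "id".
    cache.getOrLoad(location, () -> List.of("id"));

    // Second query needs "id" and "name", but the loader is never invoked:
    // the entry stored by the first query is returned instead.
    return cache.getOrLoad(location, () -> List.of("id", "name"));
  }
}
```

In this sketch `demo()` returns the projection cached by the first call, which mirrors how the second query ends up with delete data read under the wrong projection.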
   
   **Fix:** Append `#fieldId1,fieldId2,...` to the cache key so that the same 
delete file read with different projections gets separate cache entries.
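
The key format can be sketched as follows. This is an illustration under assumptions, not the code in the patch: `EqDeleteCacheKeySketch` and `cacheKey` are hypothetical names, and the projection is represented as a plain list of field IDs rather than an Iceberg `Schema`. The IDs are joined in the order given; the description above does not say whether the actual change sorts them.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper that builds a projection-aware cache key.
class EqDeleteCacheKeySketch {
  // Produces "<location>#<id1>,<id2>,..." from the delete file location
  // and the field IDs of the projected columns, in the order given.
  static String cacheKey(String location, List<Integer> projectedFieldIds) {
    String ids =
        projectedFieldIds.stream().map(String::valueOf).collect(Collectors.joining(","));
    return location + "#" + ids;
  }
}
```

With this shape, the same location combined with field IDs `[1]` and `[1, 2]` yields two distinct keys, so each projection gets its own cache entry.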
   
   Fixes #15039


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

