wgtmac commented on code in PR #590:
URL: https://github.com/apache/iceberg-cpp/pull/590#discussion_r2986684167


##########
src/iceberg/manifest/manifest_list.h:
##########
@@ -272,3 +272,12 @@ ICEBERG_EXPORT inline constexpr Result<ManifestContent> 
ManifestContentFromStrin
 }
 
 }  // namespace iceberg
+
+namespace std {
+template <>
+struct hash<iceberg::ManifestFile> {

Review Comment:
   **Style / Logic Issue:** The added `std::hash<iceberg::ManifestFile>` 
specialization only hashes `manifest_path`. However, `ManifestFile::operator==` 
defaults to member-wise equality. If an `unordered_set` (like the one used in 
`IncrementalAppendScan::PlanFiles`) encounters two `ManifestFile` objects with 
the same path but slightly different metadata fields, they will have the same 
hash but fail equality, resulting in duplicates being inserted. In Java, 
`GenericManifestFile` uses *only* `manifestPath` for both `hashCode()` and 
`equals()`.
   
   *Suggested fix:* Modify `ManifestFile::operator==` to only compare 
`manifest_path` to align with Java, or provide a custom equality functor 
specifically for the `unordered_set<ManifestFile>`.



##########
src/iceberg/table_scan.cc:
##########
@@ -536,20 +639,110 @@ Result<std::vector<std::shared_ptr<FileScanTask>>> 
DataTableScan::PlanFiles() co
   return manifest_group->PlanFiles();
 }
 
+// Friend function template for IncrementalScan that implements the shared 
PlanFiles
+// logic. It resolves the from/to snapshot range from the scan context and 
delegates
+// to the two-arg virtual PlanFiles() override in the concrete subclass.
+// Defined as a friend to access the protected two-arg PlanFiles().
+template <typename ScanTaskType>
+Result<std::vector<std::shared_ptr<ScanTaskType>>> ResolvePlanFiles(

Review Comment:
   **Logic / Parity Issue:** `scan.metadata()->Snapshot()` returns a `NotFound` 
error if the table has no current snapshot (e.g., table is empty), rather than 
a successful result with `nullptr`. `ICEBERG_ASSIGN_OR_RAISE` will immediately 
propagate this `NotFound` error, making the `if (current_snapshot == nullptr)` 
check unreachable. This breaks parity with Java, which handles empty tables 
gracefully by returning an empty iterable.
   
   *Suggested fix:* Check the ID directly: `if 
(scan.metadata()->current_snapshot_id == kInvalidSnapshotId) { return ...; }` 
or use `scan.snapshot()` which properly resolves to `nullptr`.
   
   *(Note: The same issue exists on line 214 in `ToSnapshotIdInclusive`, where 
`ICEBERG_ASSIGN_OR_RAISE` will propagate `NotFound` before the custom error 
message can be triggered).*



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to