mxm commented on code in PR #13831: URL: https://github.com/apache/iceberg/pull/13831#discussion_r2314316256
##########
flink/v2.0/flink/src/test/java/org/apache/iceberg/flink/maintenance/operator/TestDeleteFilesProcessor.java:
##########

@@ -74,27 +75,158 @@ void testDeleteMissingFile() throws Exception {
     Path dummyFile =
         FileSystems.getDefault().getPath(table.location().substring(5), DUMMY_FILE_NAME);
-    deleteFile(tableLoader(), dummyFile.toString());
+    deleteFile(tableLoader(), dummyFile.toString(), true /* expectSuccess */);

     assertThat(listFiles(table)).isEqualTo(TABLE_FILES);
   }

   @Test
   void testInvalidURIScheme() throws Exception {
-    deleteFile(tableLoader(), "wrong://");
+    deleteFile(tableLoader(), "wrong://", false /* expectFail */);

     assertThat(listFiles(table)).isEqualTo(TABLE_FILES);
   }

-  private void deleteFile(TableLoader tableLoader, String fileName) throws Exception {
-    tableLoader().open();
+  @Test
+  void testDeleteNonExistentFile() throws Exception {
+    String nonexistentFile = "nonexistentFile.txt";
+
+    deleteFile(tableLoader(), nonexistentFile, true /* expectSuccess */);
+
+    assertThat(listFiles(table)).isEqualTo(TABLE_FILES);
+  }
+
+  @Test
+  void testDelete10MBFile() throws Exception {
+    // Simulate a large file (e.g., 10MB file)
+    String largeFileName = "largeFile.txt";
+    Path largeFile = Path.of(tablePath(table).toString(), largeFileName);
+
+    // Write a large file to disk (this will simulate the large file in the filesystem)
+    byte[] largeData = new byte[1024 * 1024 * 10]; // 10 MB
+    Files.write(largeFile, largeData);
+
+    // Verify that the file was created
+    Set<String> files = listFiles(table);
+    assertThat(files).contains(largeFileName);
+
+    // Use the DeleteFilesProcessor to delete the large file
+    deleteFile(tableLoader(), largeFile.toString(), true /* expectSuccess */);
+
+    // Verify that the large file has been deleted
+    files = listFiles(table);
+    assertThat(files).doesNotContain(largeFileName);
+  }

Review Comment:
I'm sure that we can delete a 10 MB file, at least on the local file system. I'm not sure what we are testing here.
Why does the size matter here?

##########
flink/v2.0/flink/src/main/java/org/apache/iceberg/flink/maintenance/operator/DeleteFilesProcessor.java:
##########

@@ -76,6 +79,10 @@ public void open() throws Exception {
         taskMetricGroup.counter(TableMaintenanceMetrics.DELETE_FILE_FAILED_COUNTER);
     this.succeededCounter =
         taskMetricGroup.counter(TableMaintenanceMetrics.DELETE_FILE_SUCCEEDED_COUNTER);
+    this.deleteFileTimeMsHistogram =
+        taskMetricGroup.histogram(
+            TableMaintenanceMetrics.DELETE_FILE_TIME_MS_HISTOGRAM,
+            new DescriptiveStatisticsHistogram(1000));

Review Comment:
We haven't identified a need to measure per-file delete duration. We are generally more concerned with the total time the maintenance takes. I would assume that the actual deletion of the files is a large chunk, but we have managed storage, so there is usually very little we can do about it. Maybe it would suffice to keep the max duration here instead of a full histogram?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
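[Editor's note] The reviewer's suggestion, tracking only the maximum delete duration rather than a full `DescriptiveStatisticsHistogram`, could be sketched roughly as below. This is a hypothetical illustration, not the PR's actual code: `MaxDurationTracker` is an invented name, and the Flink wiring mentioned in the comment is an assumption about how it would be registered.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: retain only the maximum observed delete duration
// instead of a full histogram. The tracker could back a Flink Gauge<Long>,
// e.g. something like taskMetricGroup.gauge("deleteFileMaxTimeMs",
// tracker::maxMs) inside open() -- the metric name here is invented.
public class MaxDurationTracker {
  private final AtomicLong maxMs = new AtomicLong(0);

  // Record one delete's duration, keeping only the largest value seen so far.
  public void record(long durationMs) {
    maxMs.accumulateAndGet(durationMs, Math::max);
  }

  public long maxMs() {
    return maxMs.get();
  }

  public static void main(String[] args) {
    MaxDurationTracker tracker = new MaxDurationTracker();
    tracker.record(12);
    tracker.record(45);
    tracker.record(7);
    System.out.println(tracker.maxMs()); // prints 45
  }
}
```

A single `AtomicLong` with `accumulateAndGet` keeps the update lock-free and avoids the per-sample storage that `DescriptiveStatisticsHistogram(1000)` allocates, which is the trade-off the reviewer is pointing at.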