[GitHub] [incubator-hudi] prashantwason commented on a change in pull request #1073: [HUDI-377] Adding Delete() support to DeltaStreamer
prashantwason commented on a change in pull request #1073: [HUDI-377] Adding Delete() support to DeltaStreamer
URL: https://github.com/apache/incubator-hudi/pull/1073#discussion_r390562273

## File path: hudi-utilities/src/test/java/org/apache/hudi/utilities/TestHoodieDeltaStreamer.java ##
@@ -394,8 +394,8 @@ private void testUpsertsContinuousMode(HoodieTableType tableType, String tempDir
     } else {
       TestHelpers.assertAtleastNCompactionCommits(5, datasetBasePath, dfs);
     }
-    TestHelpers.assertRecordCount(totalRecords, datasetBasePath + "/*/*.parquet", sqlContext);
-    TestHelpers.assertDistanceCount(totalRecords, datasetBasePath + "/*/*.parquet", sqlContext);
+    TestHelpers.assertRecordCount(totalRecords + 200, datasetBasePath + "/*/*.parquet", sqlContext);

Review comment:
Thanks Sivabalan for such a quick reply. I had filed https://issues.apache.org/jira/browse/HUDI-667 to work on this fix. You may wish to use it to submit your PR.

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

With regards,
Apache Git Services
[GitHub] [incubator-hudi] prashantwason commented on a change in pull request #1073: [HUDI-377] Adding Delete() support to DeltaStreamer
prashantwason commented on a change in pull request #1073: [HUDI-377] Adding Delete() support to DeltaStreamer
URL: https://github.com/apache/incubator-hudi/pull/1073#discussion_r390555211

## File path: hudi-utilities/src/test/java/org/apache/hudi/utilities/TestHoodieDeltaStreamer.java ##
@@ -394,8 +394,8 @@ private void testUpsertsContinuousMode(HoodieTableType tableType, String tempDir
     } else {
       TestHelpers.assertAtleastNCompactionCommits(5, datasetBasePath, dfs);
     }
-    TestHelpers.assertRecordCount(totalRecords, datasetBasePath + "/*/*.parquet", sqlContext);
-    TestHelpers.assertDistanceCount(totalRecords, datasetBasePath + "/*/*.parquet", sqlContext);
+    TestHelpers.assertRecordCount(totalRecords + 200, datasetBasePath + "/*/*.parquet", sqlContext);

Review comment:
I could not work out why there is a +200 here. The inserts, deletes, and updates are calculated to keep the number of records equal to totalRecords.
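The invariant described in this comment can be sketched as simple bookkeeping. This is only an illustration with hypothetical batch sizes (the class, method, and numbers below are not taken from the test); it assumes updates touch existing records and so leave the count unchanged.

```java
public class RecordCountSketch {
    // Hypothetical helper: record count after applying one batch.
    // Updates rewrite existing records, so only inserts and deletes move the count.
    static long countAfterBatch(long current, long inserts, long deletes, long updates) {
        return current + inserts - deletes;
    }

    // Applies the same batch shape several times and returns the final count.
    static long countAfterBatches(long start, int batches, long inserts, long deletes, long updates) {
        long count = start;
        for (int i = 0; i < batches; i++) {
            count = countAfterBatch(count, inserts, deletes, updates);
        }
        return count;
    }

    public static void main(String[] args) {
        long totalRecords = 1000; // hypothetical starting size
        // Balanced batches: as many deletes as inserts, so the count stays at totalRecords.
        System.out.println(countAfterBatches(totalRecords, 5, 100, 100, 50));
        // Unbalanced batches: 40 more inserts than deletes per batch, so the count drifts upward.
        System.out.println(countAfterBatches(totalRecords, 5, 100, 60, 50));
    }
}
```

Under these assumptions, an assertion on `totalRecords + 200` would only hold if the batches together inserted 200 more records than they deleted.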
[GitHub] [incubator-hudi] prashantwason commented on a change in pull request #1073: [HUDI-377] Adding Delete() support to DeltaStreamer
prashantwason commented on a change in pull request #1073: [HUDI-377] Adding Delete() support to DeltaStreamer
URL: https://github.com/apache/incubator-hudi/pull/1073#discussion_r390553417

## File path: hudi-client/src/test/java/org/apache/hudi/common/HoodieTestDataGenerator.java ##
@@ -435,11 +439,46 @@ public HoodieRecord generateUpdateRecord(HoodieKey key, String commitTime) throw
       index = (index + 1) % numExistingKeys;
       kp = existingKeys.get(index);
     }
+    existingKeys.remove(kp);

Review comment:
Shouldn't the remove be with the key rather than the value?
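The concern raised here can be reproduced in isolation. The sketch below assumes `existingKeys` is a map from an integer index to a key/partition object, as the comment implies; `KeyPartition` and its fields are hypothetical stand-ins, not the generator's actual class. Because `Map.remove(Object)` interprets its argument as a key, passing the value compiles but removes nothing.

```java
import java.util.HashMap;
import java.util.Map;

public class MapRemoveSketch {
    // Hypothetical stand-in for the generator's key/partition holder.
    static final class KeyPartition {
        final String key;
        final String partitionPath;

        KeyPartition(String key, String partitionPath) {
            this.key = key;
            this.partitionPath = partitionPath;
        }
    }

    // Builds a small index -> KeyPartition map (assumed shape of existingKeys).
    static Map<Integer, KeyPartition> sampleKeys() {
        Map<Integer, KeyPartition> existingKeys = new HashMap<>();
        existingKeys.put(0, new KeyPartition("key-0", "partition-a"));
        existingKeys.put(1, new KeyPartition("key-1", "partition-b"));
        return existingKeys;
    }

    // Passes the value to remove(); it is looked up as a key, so no entry is removed.
    static int sizeAfterRemoveByValue() {
        Map<Integer, KeyPartition> existingKeys = sampleKeys();
        KeyPartition kp = existingKeys.get(1);
        existingKeys.remove(kp);
        return existingKeys.size();
    }

    // Passes the integer index to remove(); the matching entry is deleted.
    static int sizeAfterRemoveByKey() {
        Map<Integer, KeyPartition> existingKeys = sampleKeys();
        int index = 1;
        existingKeys.remove(index);
        return existingKeys.size();
    }

    public static void main(String[] args) {
        System.out.println("after remove(kp): " + sizeAfterRemoveByValue());
        System.out.println("after remove(index): " + sizeAfterRemoveByKey());
    }
}
```

If the collection is in fact keyed by index, removing with `index` rather than `kp` would be the change this comment points toward.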