[GitHub] [incubator-hudi] prashantwason commented on a change in pull request #1073: [HUDI-377] Adding Delete() support to DeltaStreamer

2020-03-10 Thread GitBox
prashantwason commented on a change in pull request #1073: [HUDI-377] Adding 
Delete() support to DeltaStreamer
URL: https://github.com/apache/incubator-hudi/pull/1073#discussion_r390562273
 
 

 ##
 File path: 
hudi-utilities/src/test/java/org/apache/hudi/utilities/TestHoodieDeltaStreamer.java
 ##
 @@ -394,8 +394,8 @@ private void testUpsertsContinuousMode(HoodieTableType tableType, String tempDir
   } else {
 TestHelpers.assertAtleastNCompactionCommits(5, datasetBasePath, dfs);
   }
-  TestHelpers.assertRecordCount(totalRecords, datasetBasePath + "/*/*.parquet", sqlContext);
-  TestHelpers.assertDistanceCount(totalRecords, datasetBasePath + "/*/*.parquet", sqlContext);
+  TestHelpers.assertRecordCount(totalRecords + 200, datasetBasePath + "/*/*.parquet", sqlContext);
 
 Review comment:
   Thanks, Sivabalan, for such a quick reply.
   
   I filed https://issues.apache.org/jira/browse/HUDI-667 to work on this fix. You may wish to use it to submit your PR.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-hudi] prashantwason commented on a change in pull request #1073: [HUDI-377] Adding Delete() support to DeltaStreamer

2020-03-10 Thread GitBox
prashantwason commented on a change in pull request #1073: [HUDI-377] Adding 
Delete() support to DeltaStreamer
URL: https://github.com/apache/incubator-hudi/pull/1073#discussion_r390555211
 
 

 ##
 File path: 
hudi-utilities/src/test/java/org/apache/hudi/utilities/TestHoodieDeltaStreamer.java
 ##
 @@ -394,8 +394,8 @@ private void testUpsertsContinuousMode(HoodieTableType tableType, String tempDir
   } else {
 TestHelpers.assertAtleastNCompactionCommits(5, datasetBasePath, dfs);
   }
-  TestHelpers.assertRecordCount(totalRecords, datasetBasePath + "/*/*.parquet", sqlContext);
-  TestHelpers.assertDistanceCount(totalRecords, datasetBasePath + "/*/*.parquet", sqlContext);
+  TestHelpers.assertRecordCount(totalRecords + 200, datasetBasePath + "/*/*.parquet", sqlContext);
 
 Review comment:
   I cannot see why there is a +200 here. The inserts, deletes, and updates are calculated to keep the number of records equal to totalRecords.
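
   The bookkeeping behind this expectation can be sketched as follows. This is only an illustration: the class name, the method, and all batch sizes are hypothetical and are not taken from the test or from the PR. Updates rewrite existing rows, so only inserts and deletes change the count; the total stays at totalRecords only when each batch inserts as many records as it deletes.

```java
public class RecordCountSketch {
    // Hypothetical helper: updates replace existing rows, so only
    // inserts and deletes change the number of records.
    static long countAfterBatch(long before, long inserts, long updates, long deletes) {
        return before + inserts - deletes;
    }

    public static void main(String[] args) {
        long totalRecords = 1000; // hypothetical starting count

        // Balanced batch: as many inserts as deletes, count is unchanged.
        long balanced = countAfterBatch(totalRecords, 100, 50, 100);

        // Unbalanced batch: 200 more inserts than deletes shows up as +200.
        long surplus = countAfterBatch(totalRecords, 300, 50, 100);

        System.out.println("balanced=" + balanced + " surplus=" + surplus);
    }
}
```

   Under these assumed numbers, an expected count of totalRecords + 200 would mean that inserts outnumber deletes by 200 somewhere across the batches.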




[GitHub] [incubator-hudi] prashantwason commented on a change in pull request #1073: [HUDI-377] Adding Delete() support to DeltaStreamer

2020-03-10 Thread GitBox
prashantwason commented on a change in pull request #1073: [HUDI-377] Adding 
Delete() support to DeltaStreamer
URL: https://github.com/apache/incubator-hudi/pull/1073#discussion_r390553417
 
 

 ##
 File path: 
hudi-client/src/test/java/org/apache/hudi/common/HoodieTestDataGenerator.java
 ##
 @@ -435,11 +439,46 @@ public HoodieRecord generateUpdateRecord(HoodieKey key, String commitTime) throw
 index = (index + 1) % numExistingKeys;
 kp = existingKeys.get(index);
   }
+  existingKeys.remove(kp);
 
 Review comment:
   Shouldn't remove() be called with the key rather than the value (kp)?
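
   A minimal sketch of the concern, assuming existingKeys is a java.util.Map keyed by an Integer index (the diff only shows existingKeys.get(index), so the exact type is an assumption). The KeyPartition class below is a hypothetical stand-in, not the one from HoodieTestDataGenerator. Because Map.remove(Object) treats its argument as a key, passing the value compiles but removes nothing.

```java
import java.util.HashMap;
import java.util.Map;

public class MapRemoveSketch {
    // Hypothetical stand-in for the generator's key/partition pair.
    static final class KeyPartition {
        final String recordKey;
        final String partitionPath;

        KeyPartition(String recordKey, String partitionPath) {
            this.recordKey = recordKey;
            this.partitionPath = partitionPath;
        }
    }

    // Assumed shape: a map from index to KeyPartition with two entries.
    static Map<Integer, KeyPartition> sampleKeys() {
        Map<Integer, KeyPartition> keys = new HashMap<>();
        keys.put(0, new KeyPartition("k0", "2020/03/10"));
        keys.put(1, new KeyPartition("k1", "2020/03/10"));
        return keys;
    }

    // Passing the value: remove(Object) looks kp up as a key, finds
    // no such key, and leaves the map untouched.
    static int sizeAfterRemoveByValue() {
        Map<Integer, KeyPartition> existingKeys = sampleKeys();
        KeyPartition kp = existingKeys.get(0);
        existingKeys.remove(kp);
        return existingKeys.size();
    }

    // Passing the key: the entry at that index is removed.
    static int sizeAfterRemoveByKey() {
        Map<Integer, KeyPartition> existingKeys = sampleKeys();
        int index = 0;
        existingKeys.remove(index);
        return existingKeys.size();
    }

    public static void main(String[] args) {
        System.out.println("remove(kp): size=" + sizeAfterRemoveByValue());
        System.out.println("remove(index): size=" + sizeAfterRemoveByKey());
    }
}
```

   If existingKeys were a List instead, remove(kp) would delete the matching element, so the comment only applies under the Map assumption.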

