[GitHub] [incubator-hudi] nsivabalan commented on a change in pull request #1073: [HUDI-377] Adding Delete() support to DeltaStreamer
nsivabalan commented on a change in pull request #1073: [HUDI-377] Adding Delete() support to DeltaStreamer
URL: https://github.com/apache/incubator-hudi/pull/1073#discussion_r390557620

## File path: hudi-utilities/src/test/java/org/apache/hudi/utilities/TestHoodieDeltaStreamer.java ##

@@ -394,8 +394,8 @@ private void testUpsertsContinuousMode(HoodieTableType tableType, String tempDir
       } else {
         TestHelpers.assertAtleastNCompactionCommits(5, datasetBasePath, dfs);
       }
-      TestHelpers.assertRecordCount(totalRecords, datasetBasePath + "/*/*.parquet", sqlContext);
-      TestHelpers.assertDistanceCount(totalRecords, datasetBasePath + "/*/*.parquet", sqlContext);
+      TestHelpers.assertRecordCount(totalRecords + 200, datasetBasePath + "/*/*.parquet", sqlContext);

Review comment:
You might be right. As mentioned in the other thread, I am working on the fix. For some reason, my continuous-mode tests time out without reaching the expected number of commits.

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: us...@infra.apache.org

With regards,
Apache Git Services
[GitHub] [incubator-hudi] nsivabalan commented on a change in pull request #1073: [HUDI-377] Adding Delete() support to DeltaStreamer
nsivabalan commented on a change in pull request #1073: [HUDI-377] Adding Delete() support to DeltaStreamer
URL: https://github.com/apache/incubator-hudi/pull/1073#discussion_r390557169

## File path: hudi-client/src/test/java/org/apache/hudi/common/HoodieTestDataGenerator.java ##

@@ -435,11 +439,46 @@ public HoodieRecord generateUpdateRecord(HoodieKey key, String commitTime) throw
         index = (index + 1) % numExistingKeys;
         kp = existingKeys.get(index);
       }
+      existingKeys.remove(kp);

Review comment:
Yes, you are right. I figured this out recently and am working on the fix.
[GitHub] [incubator-hudi] nsivabalan commented on a change in pull request #1073: [HUDI-377] Adding Delete() support to DeltaStreamer
nsivabalan commented on a change in pull request #1073: [HUDI-377] Adding Delete() support to DeltaStreamer
URL: https://github.com/apache/incubator-hudi/pull/1073#discussion_r362278320

## File path: hudi-client/src/test/java/org/apache/hudi/common/HoodieTestDataGenerator.java ##

@@ -74,14 +75,15 @@
   public static final String[] DEFAULT_PARTITION_PATHS = {DEFAULT_FIRST_PARTITION_PATH, DEFAULT_SECOND_PARTITION_PATH, DEFAULT_THIRD_PARTITION_PATH};
   public static final int DEFAULT_PARTITION_DEPTH = 3;
-  public static String TRIP_EXAMPLE_SCHEMA = "{\"type\": \"record\",\"name\": \"triprec\",\"fields\": [ "
-      + "{\"name\": \"timestamp\",\"type\": \"double\"},{\"name\": \"_row_key\", \"type\": \"string\"},"
-      + "{\"name\": \"rider\", \"type\": \"string\"},{\"name\": \"driver\", \"type\": \"string\"},"
-      + "{\"name\": \"begin_lat\", \"type\": \"double\"},{\"name\": \"begin_lon\", \"type\": \"double\"},"
-      + "{\"name\": \"end_lat\", \"type\": \"double\"},{\"name\": \"end_lon\", \"type\": \"double\"},"
-      + "{\"name\":\"fare\",\"type\": \"double\"}]}";
+  public static String TRIP_EXAMPLE_SCHEMA = "{\"type\": \"record\"," + "\"name\": \"triprec\"," + "\"fields\": [ "
+      + "{\"name\": \"timestamp\",\"type\": \"double\"}," + "{\"name\": \"_row_key\", \"type\": \"string\"},"
+      + "{\"name\": \"rider\", \"type\": \"string\"}," + "{\"name\": \"driver\", \"type\": \"string\"},"
+      + "{\"name\": \"begin_lat\", \"type\": \"double\"}," + "{\"name\": \"begin_lon\", \"type\": \"double\"},"
+      + "{\"name\": \"end_lat\", \"type\": \"double\"}," + "{\"name\": \"end_lon\", \"type\": \"double\"},"
+      + "{\"name\":\"fare\",\"type\": \"double\"},"
+      + "{\"name\": \"_hoodie_delete_marker\", \"type\": \"boolean\", \"default\": false} ]}";
   public static String NULL_SCHEMA = Schema.create(Schema.Type.NULL).toString();
-  public static String TRIP_HIVE_COLUMN_TYPES = "double,string,string,string,double,double,double,double,double";
+  public static String TRIP_HIVE_COLUMN_TYPES =
+      "double,string,string,string,double,double,double,double,double,string";

Review comment:
The delete marker's actual data type is boolean, as declared on line 84. I missed updating this list of column types to match. Will fix it.