[GitHub] [incubator-hudi] nsivabalan commented on a change in pull request #1073: [HUDI-377] Adding Delete() support to DeltaStreamer

2020-03-10 Thread GitBox
nsivabalan commented on a change in pull request #1073: [HUDI-377] Adding 
Delete() support to DeltaStreamer
URL: https://github.com/apache/incubator-hudi/pull/1073#discussion_r390557620
 
 

 ##
 File path: hudi-utilities/src/test/java/org/apache/hudi/utilities/TestHoodieDeltaStreamer.java
 ##
 @@ -394,8 +394,8 @@ private void testUpsertsContinuousMode(HoodieTableType tableType, String tempDir
   } else {
     TestHelpers.assertAtleastNCompactionCommits(5, datasetBasePath, dfs);
   }
-  TestHelpers.assertRecordCount(totalRecords, datasetBasePath + "/*/*.parquet", sqlContext);
-  TestHelpers.assertDistanceCount(totalRecords, datasetBasePath + "/*/*.parquet", sqlContext);
+  TestHelpers.assertRecordCount(totalRecords + 200, datasetBasePath + "/*/*.parquet", sqlContext);
 
 Review comment:
   You might be right. As mentioned in the other thread, I am working on the 
fix. For some reason, my continuous-mode tests time out without reaching the 
expected number of commits.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-hudi] nsivabalan commented on a change in pull request #1073: [HUDI-377] Adding Delete() support to DeltaStreamer

2020-03-10 Thread GitBox
nsivabalan commented on a change in pull request #1073: [HUDI-377] Adding 
Delete() support to DeltaStreamer
URL: https://github.com/apache/incubator-hudi/pull/1073#discussion_r390557169
 
 

 ##
 File path: hudi-client/src/test/java/org/apache/hudi/common/HoodieTestDataGenerator.java
 ##
 @@ -435,11 +439,46 @@ public HoodieRecord generateUpdateRecord(HoodieKey key, String commitTime) throw
 index = (index + 1) % numExistingKeys;
 kp = existingKeys.get(index);
   }
+  existingKeys.remove(kp);
 
 Review comment:
   Yes, you are right. I figured this out recently and am working on the fix.




[GitHub] [incubator-hudi] nsivabalan commented on a change in pull request #1073: [HUDI-377] Adding Delete() support to DeltaStreamer

2019-12-31 Thread GitBox
nsivabalan commented on a change in pull request #1073: [HUDI-377] Adding 
Delete() support to DeltaStreamer
URL: https://github.com/apache/incubator-hudi/pull/1073#discussion_r362278320
 
 

 ##
 File path: hudi-client/src/test/java/org/apache/hudi/common/HoodieTestDataGenerator.java
 ##
 @@ -74,14 +75,15 @@
   public static final String[] DEFAULT_PARTITION_PATHS =
   {DEFAULT_FIRST_PARTITION_PATH, DEFAULT_SECOND_PARTITION_PATH, DEFAULT_THIRD_PARTITION_PATH};
   public static final int DEFAULT_PARTITION_DEPTH = 3;
-  public static String TRIP_EXAMPLE_SCHEMA = "{\"type\": \"record\",\"name\": \"triprec\",\"fields\": [ "
-  + "{\"name\": \"timestamp\",\"type\": \"double\"},{\"name\": \"_row_key\", \"type\": \"string\"},"
-  + "{\"name\": \"rider\", \"type\": \"string\"},{\"name\": \"driver\", \"type\": \"string\"},"
-  + "{\"name\": \"begin_lat\", \"type\": \"double\"},{\"name\": \"begin_lon\", \"type\": \"double\"},"
-  + "{\"name\": \"end_lat\", \"type\": \"double\"},{\"name\": \"end_lon\", \"type\": \"double\"},"
-  + "{\"name\":\"fare\",\"type\": \"double\"}]}";
+  public static String TRIP_EXAMPLE_SCHEMA = "{\"type\": \"record\"," + "\"name\": \"triprec\"," + "\"fields\": [ "
+  + "{\"name\": \"timestamp\",\"type\": \"double\"}," + "{\"name\": \"_row_key\", \"type\": \"string\"},"
+  + "{\"name\": \"rider\", \"type\": \"string\"}," + "{\"name\": \"driver\", \"type\": \"string\"},"
+  + "{\"name\": \"begin_lat\", \"type\": \"double\"}," + "{\"name\": \"begin_lon\", \"type\": \"double\"},"
+  + "{\"name\": \"end_lat\", \"type\": \"double\"}," + "{\"name\": \"end_lon\", \"type\": \"double\"},"
+  + "{\"name\":\"fare\",\"type\": \"double\"},"
+  + "{\"name\": \"_hoodie_delete_marker\", \"type\": \"boolean\", \"default\": false} ]}";
   public static String NULL_SCHEMA = Schema.create(Schema.Type.NULL).toString();
-  public static String TRIP_HIVE_COLUMN_TYPES = "double,string,string,string,double,double,double,double,double";
+  public static String TRIP_HIVE_COLUMN_TYPES = "double,string,string,string,double,double,double,double,double,string";
 
 Review comment:
   The actual data type of the delete marker is boolean, as declared in line 84. 
I missed updating this list of column types. Will fix it.
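
   One way to keep the two constants from drifting apart is to derive the Hive 
column types from the same field list as the schema. The following is only a 
minimal sketch under stated assumptions: the class HiveColumnTypesSketch and 
its methods are hypothetical and not part of Hudi, the field list is copied by 
hand from TRIP_EXAMPLE_SCHEMA in the diff above, and the type mapping covers 
only the three primitive types that appear there.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical illustration, not Hudi code: builds the Hive column-type list
// from an ordered field-name -> type map so the two cannot disagree.
public class HiveColumnTypesSketch {

  // Maps a primitive schema type name to a Hive type name.
  // Only the types used by the trip schema in the diff are handled.
  static String toHiveType(String schemaType) {
    switch (schemaType) {
      case "double":
        return "double";
      case "string":
        return "string";
      case "boolean":
        return "boolean";
      default:
        throw new IllegalArgumentException("Unsupported type: " + schemaType);
    }
  }

  // Joins the Hive types of all fields, in declaration order, with commas.
  static String hiveColumnTypes(Map<String, String> fields) {
    StringJoiner joiner = new StringJoiner(",");
    for (String schemaType : fields.values()) {
      joiner.add(toHiveType(schemaType));
    }
    return joiner.toString();
  }

  // Field list transcribed from TRIP_EXAMPLE_SCHEMA as shown in the diff.
  static Map<String, String> tripFields() {
    Map<String, String> fields = new LinkedHashMap<>();
    fields.put("timestamp", "double");
    fields.put("_row_key", "string");
    fields.put("rider", "string");
    fields.put("driver", "string");
    fields.put("begin_lat", "double");
    fields.put("begin_lon", "double");
    fields.put("end_lat", "double");
    fields.put("end_lon", "double");
    fields.put("fare", "double");
    fields.put("_hoodie_delete_marker", "boolean");
    return fields;
  }

  public static void main(String[] args) {
    System.out.println(hiveColumnTypes(tripFields()));
  }
}
```

   With the field list above, the last entry comes out as boolean rather than 
string, which matches the type declared for _hoodie_delete_marker in the schema.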

