[GitHub] [hudi] codope commented on a change in pull request #4473: [HUDI-2590] Adding tests to validate different key generators
codope commented on a change in pull request #4473:
URL: https://github.com/apache/hudi/pull/4473#discussion_r777829954

## File path: hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestCOWDataSourceStorage.scala ##

@@ -100,8 +120,26 @@ class TestCOWDataSourceStorage extends SparkClientFunctionalTestHarness {
 assertEquals(updatedVerificationVal, snapshotDF2.filter(col("_row_key") === verificationRowKey).select(verificationCol).first.getString(0))

 // Upsert Operation without Hudi metadata columns
-val records2 = recordsToStrings(dataGen.generateUpdates("001", 100)).toList
-val inputDF2 = spark.read.json(spark.sparkContext.parallelize(records2 , 2))
+val records2 = recordsToStrings(dataGen.generateUpdates("002", 100)).toList
+var inputDF2 = spark.read.json(spark.sparkContext.parallelize(records2, 2))
+
+if (classOf[TimestampBasedKeyGenerator].getName.equals(keyGenClass)) {
+  // incase of Timestamp based key gen, current_ts should not be updated. but dataGen.generateUpdates() would have updated

Review comment:
Sounds good. Will land this.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] codope commented on a change in pull request #4473: [HUDI-2590] Adding tests to validate different key generators
codope commented on a change in pull request #4473:
URL: https://github.com/apache/hudi/pull/4473#discussion_r776919486

## File path: hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestCOWDataSourceStorage.scala ##

@@ -100,8 +120,26 @@ class TestCOWDataSourceStorage extends SparkClientFunctionalTestHarness {
 assertEquals(updatedVerificationVal, snapshotDF2.filter(col("_row_key") === verificationRowKey).select(verificationCol).first.getString(0))

 // Upsert Operation without Hudi metadata columns
-val records2 = recordsToStrings(dataGen.generateUpdates("001", 100)).toList
-val inputDF2 = spark.read.json(spark.sparkContext.parallelize(records2 , 2))
+val records2 = recordsToStrings(dataGen.generateUpdates("002", 100)).toList
+var inputDF2 = spark.read.json(spark.sparkContext.parallelize(records2, 2))
+
+if (classOf[TimestampBasedKeyGenerator].getName.equals(keyGenClass)) {
+  // incase of Timestamp based key gen, current_ts should not be updated. but dataGen.generateUpdates() would have updated

Review comment:
So this was the issue with the test datagen. Would special handling of timestamp keygen in the datagen itself be better than doing it here in a test case?