GitHub user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16564#discussion_r95979368

    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DatasetSuite.scala ---
    @@ -898,11 +899,15 @@ class DatasetSuite extends QueryTest with SharedSQLContext {
           (1, 2), (1, 1), (2, 1), (2, 2))
       }

    -  test("dropDuplicates should not change child plan output") {
    -    val ds = Seq(("a", 1), ("a", 2), ("b", 1), ("a", 1)).toDS()
    -    checkDataset(
    -      ds.dropDuplicates("_1").select(ds("_1").as[String], ds("_2").as[Int]),
    -      ("a", 1), ("b", 1))
    +  test("SPARK-19065 dropDuplicates should not create expressions using the same id") {
    --- End diff --

    How about we remove this test and add a new one that makes the behavior change more obvious?
    ```
    val df = ...
    val df2 = df.dropDuplicates("i")
    intercept[AnalysisException] {
      df2.select(df("i"))
    }
    ```