Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17077#discussion_r110794303

    --- Diff: python/pyspark/sql/tests.py ---
    @@ -2167,6 +2167,61 @@ def test_BinaryType_serialization(self):
             df = self.spark.createDataFrame(data, schema=schema)
             df.collect()

    +    def test_bucketed_write(self):
    +        data = [
    +            (1, "foo", 3.0), (2, "foo", 5.0),
    +            (3, "bar", -1.0), (4, "bar", 6.0),
    +        ]
    +        df = self.spark.createDataFrame(data, ["x", "y", "z"])
    +
    +        # Test write with one bucketing column
    +        df.write.bucketBy(3, "x").mode("overwrite").saveAsTable("pyspark_bucket")
    +        self.assertEqual(
    +            len([c for c in self.spark.catalog.listColumns("pyspark_bucket")
    +                 if c.name == "x" and c.isBucket]),
    +            1
    +        )
    +        self.assertSetEqual(set(data), set(self.spark.table("pyspark_bucket").collect()))
    +
    +        # Test write two bucketing columns
    +        df.write.bucketBy(3, "x", "y").mode("overwrite").saveAsTable("pyspark_bucket")
    +        self.assertEqual(
    +            len([c for c in self.spark.catalog.listColumns("pyspark_bucket")
    +                 if c.name in ("x", "y") and c.isBucket]),
    --- End diff --

    Thank you for taking my opinion into account. Yes, we should remove or change the version; I meant that we should follow the rest of the contents. To my knowledge, the documentation has generally been kept consistent across the APIs in the different languages. I don't think this is a must, but it is safer, both to avoid being blamed for any divergence later and to avoid confusing users. I have seen several minor PRs fixing documentation (e.g., typos) that then had to be fixed identically for the corresponding APIs in the other languages, and I have also made some PRs myself to keep the documentation in sync, e.g., https://github.com/apache/spark/pull/17429
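For readers following the diff under discussion: `bucketBy(n, col)` assigns each row to one of `n` buckets by hashing the bucketing column, which is what the test's catalog assertions verify. The sketch below illustrates that idea in plain Python, without a Spark session. It is a conceptual stand-in only: Spark actually uses a Murmur3 hash for bucketing, and the `bucket_id` helper and its md5-based hash here are hypothetical, chosen just to make the assignment deterministic.

```python
import hashlib


def bucket_id(key, num_buckets):
    # Deterministic hash of the bucketing-column value.
    # Spark uses Murmur3 internally; md5 is an illustrative stand-in.
    digest = hashlib.md5(repr(key).encode("utf-8")).hexdigest()
    return int(digest, 16) % num_buckets


# Same sample rows as in the test, with schema (x, y, z).
rows = [(1, "foo", 3.0), (2, "foo", 5.0), (3, "bar", -1.0), (4, "bar", 6.0)]

# Bucket by the "x" column (row[0]) into 3 buckets, mirroring
# df.write.bucketBy(3, "x") at a conceptual level.
buckets = {}
for row in rows:
    buckets.setdefault(bucket_id(row[0], 3), []).append(row)
```

Because the bucket assignment depends only on the column value, rows with equal keys always land in the same bucket, which is what lets Spark avoid a shuffle when joining two tables bucketed the same way.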