GitHub user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17077#discussion_r110794303
  
    --- Diff: python/pyspark/sql/tests.py ---
    @@ -2167,6 +2167,61 @@ def test_BinaryType_serialization(self):
             df = self.spark.createDataFrame(data, schema=schema)
             df.collect()
     
    +    def test_bucketed_write(self):
    +        data = [
    +            (1, "foo", 3.0), (2, "foo", 5.0),
    +            (3, "bar", -1.0), (4, "bar", 6.0),
    +        ]
    +        df = self.spark.createDataFrame(data, ["x", "y", "z"])
    +
    +        # Test write with one bucketing column
    +        df.write.bucketBy(3, "x").mode("overwrite").saveAsTable("pyspark_bucket")
    +        self.assertEqual(
    +            len([c for c in self.spark.catalog.listColumns("pyspark_bucket")
    +                 if c.name == "x" and c.isBucket]),
    +            1
    +        )
    +        self.assertSetEqual(set(data), set(self.spark.table("pyspark_bucket").collect()))
    +
    +        # Test write two bucketing columns
    +        df.write.bucketBy(3, "x", "y").mode("overwrite").saveAsTable("pyspark_bucket")
    +        self.assertEqual(
    +            len([c for c in self.spark.catalog.listColumns("pyspark_bucket")
    +                 if c.name in ("x", "y") and c.isBucket]),
    --- End diff --
    
    Thank you for taking my opinion into account. Yes, we should remove or change the version. I meant that we should follow the rest of the contents.
    
    As far as I know, the documentation contents have generally been kept consistent across the APIs in different languages. I don't think this is a must, but I think matching them is safer: it avoids blame in the future and confusion for users.
    
    I have seen several minor PRs fixing documentation (e.g., typos) where the same fix had to be applied identically to the APIs in other languages, and I have also made some PRs to match the documentation, e.g., https://github.com/apache/spark/pull/17429

