Zhen Li created SPARK-45679:
-------------------------------

             Summary: Add clusterBy in DataFrame API
                 Key: SPARK-45679
                 URL: https://issues.apache.org/jira/browse/SPARK-45679
             Project: Spark
          Issue Type: Improvement
          Components: Connect
    Affects Versions: 3.5.1
            Reporter: Zhen Li


Add clusterBy to the DataFrame API, e.g. in Python:

DataFrameWriterV1
```
# Proposed usage; parentheses allow the chained calls to span lines in Python.
(df.write
  .format("delta")
  .clusterBy("clusteringColumn1", "clusteringColumn2")
  .save(...))  # or .saveAsTable(...)
```

DataFrameWriterV2
```
# Proposed usage; parentheses allow the chained calls to span lines in Python.
(df.writeTo(...).using("delta")
  .clusterBy("clusteringColumn1", "clusteringColumn2")
  .create())  # or .replace() / .createOrReplace()
```
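
To make the intended call shape concrete, here is a minimal pure-Python sketch of a fluent writer that records clustering columns. It does not use Spark: `WriterSketch` and `describe` are hypothetical names invented for illustration, and requiring at least one column in `clusterBy` is an assumption, not something this issue specifies.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class WriterSketch:
    """Hypothetical stand-in for a DataFrame writer; not Spark's implementation."""

    source: Optional[str] = None
    clustering_columns: List[str] = field(default_factory=list)

    def format(self, source: str) -> "WriterSketch":
        # Record the data source name and return self so calls can be chained.
        self.source = source
        return self

    def clusterBy(self, col_name: str, *col_names: str) -> "WriterSketch":
        # Assumption: at least one clustering column is required, so the
        # first one is a mandatory positional argument.
        self.clustering_columns = [col_name, *col_names]
        return self

    def describe(self) -> Dict[str, object]:
        # Hypothetical terminal step: report what a real save()/create()
        # would receive, instead of writing any data.
        return {"format": self.source, "clusterBy": list(self.clustering_columns)}


plan = (
    WriterSketch()
    .format("delta")
    .clusterBy("clusteringColumn1", "clusteringColumn2")
    .describe()
)
print(plan)
```

The point of the sketch is only that `clusterBy` is a builder step: it stores column names and returns the writer, and the terminal operation consumes them.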



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

