Re: [PR] feat: Added DataFrameWriteOptions option when writing as csv, json, p… [datafusion-python]

via GitHub Fri, 06 Sep 2024 05:36:52 -0700


timsaucer commented on code in PR #857:
URL: https://github.com/apache/datafusion-python/pull/857#discussion_r1747044279



##########
python/datafusion/dataframe.py:
##########
@@ -409,37 +409,62 @@ def except_all(self, other: DataFrame) -> DataFrame:
         """
         return DataFrame(self.df.except_all(other.df))
 
-    def write_csv(self, path: str | pathlib.Path, with_header: bool = False) 
-> None:
+    def write_csv(
+        self, 
+        path: str | pathlib.Path, 
+        with_header: bool = False,
+        write_options_overwrite: bool = False,
+        write_options_single_file_output: bool = False,
+        write_options_partition_by: List = [],

Review Comment:
   I think it's okay to remove the `write_options_` prefixes here.
   
   ```
           with_header: bool = False,
           overwrite: bool = False,
           single_file_output: bool = False,
   ```
   
   Also for the partition by, I took a *very* quick look at the code and it 
looks like `partition_by` takes a list of strings, which I think our users 
would be surprised because all other uses of `partition_by` takes a list of 
expressions. So I think we want to add to the documentation a tiny bit about 
how to use that. 
   
   My understanding is that it's bad form in python to pass in a `[]` as a 
default, but I'm no expert. I bet we could change the type hint to 
`partition_by: Optional[List[str]] = None` and make the appropriate change on 
the call in the lines below.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] feat: Added DataFrameWriteOptions option when writing as csv, json, p… [datafusion-python]

Reply via email to