Re: [PR] Default to ZSTD compression when writing Parquet [datafusion-python]

2025-01-10 Thread via GitHub
timsaucer commented on PR #981: URL: https://github.com/apache/datafusion-python/pull/981#issuecomment-2585006200 Thank you for another great addition @kosiew ! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

Re: [PR] Default to ZSTD compression when writing Parquet [datafusion-python]

2025-01-10 Thread via GitHub
timsaucer merged PR #981: URL: https://github.com/apache/datafusion-python/pull/981

Re: [PR] Default to ZSTD compression when writing Parquet [datafusion-python]

2025-01-09 Thread via GitHub
timsaucer commented on PR #981: URL: https://github.com/apache/datafusion-python/pull/981#issuecomment-2579966500 It looks like some minor difference in ruff versions probably caused yours to pass and the CI to fail. I pushed a correction to this branch.

Re: [PR] Default to ZSTD compression when writing Parquet [datafusion-python]

2025-01-08 Thread via GitHub
kosiew commented on PR #981: URL: https://github.com/apache/datafusion-python/pull/981#issuecomment-2579331381 Does anyone know how to fix this error: ``` ruff check --output-format=github python/ ruff format --check python/ shell: /usr/bin/bash -e {0} env: py

Re: [PR] Default to ZSTD compression when writing Parquet [datafusion-python]

2025-01-08 Thread via GitHub
kosiew commented on code in PR #981: URL: https://github.com/apache/datafusion-python/pull/981#discussion_r1908071361 ## python/datafusion/dataframe.py: ## @@ -35,6 +35,65 @@ from datafusion._internal import DataFrame as DataFrameInternal from datafusion.expr import Expr, So

Re: [PR] Default to ZSTD compression when writing Parquet [datafusion-python]

2025-01-08 Thread via GitHub
kosiew commented on code in PR #981: URL: https://github.com/apache/datafusion-python/pull/981#discussion_r1908071116 ## python/datafusion/dataframe.py: ## @@ -620,17 +679,34 @@ def write_csv(self, path: str | pathlib.Path, with_header: bool = False) -> None def write_parq

Re: [PR] Default to ZSTD compression when writing Parquet [datafusion-python]

2025-01-08 Thread via GitHub
kosiew commented on code in PR #981: URL: https://github.com/apache/datafusion-python/pull/981#discussion_r1908068498 ## python/datafusion/dataframe.py: ## @@ -620,17 +679,34 @@ def write_csv(self, path: str | pathlib.Path, with_header: bool = False) -> None def write_parq

Re: [PR] Default to ZSTD compression when writing Parquet [datafusion-python]

2025-01-08 Thread via GitHub
timsaucer commented on code in PR #981: URL: https://github.com/apache/datafusion-python/pull/981#discussion_r1907925883 ## python/datafusion/dataframe.py: ## @@ -620,17 +679,34 @@ def write_csv(self, path: str | pathlib.Path, with_header: bool = False) -> None def write_p

Re: [PR] Default to ZSTD compression when writing Parquet [datafusion-python]

2025-01-07 Thread via GitHub
kosiew commented on code in PR #981: URL: https://github.com/apache/datafusion-python/pull/981#discussion_r1906516715 ## python/datafusion/dataframe.py: ## @@ -620,16 +620,25 @@ def write_csv(self, path: str | pathlib.Path, with_header: bool = False) -> None def write_parq

Re: [PR] Default to ZSTD compression when writing Parquet [datafusion-python]

2025-01-07 Thread via GitHub
ion-elgreco commented on code in PR #981: URL: https://github.com/apache/datafusion-python/pull/981#discussion_r1905619163 ## python/datafusion/dataframe.py: ## @@ -620,16 +620,25 @@ def write_csv(self, path: str | pathlib.Path, with_header: bool = False) -> None def write

Re: [PR] Default to ZSTD compression when writing Parquet [datafusion-python]

2025-01-07 Thread via GitHub
kevinjqliu commented on code in PR #981: URL: https://github.com/apache/datafusion-python/pull/981#discussion_r1905611807 ## python/datafusion/dataframe.py: ## @@ -620,16 +620,25 @@ def write_csv(self, path: str | pathlib.Path, with_header: bool = False) -> None def write_

Re: [PR] Default to ZSTD compression when writing Parquet [datafusion-python]

2025-01-06 Thread via GitHub
kosiew commented on code in PR #981: URL: https://github.com/apache/datafusion-python/pull/981#discussion_r1904826148 ## python/datafusion/dataframe.py: ## @@ -620,16 +620,25 @@ def write_csv(self, path: str | pathlib.Path, with_header: bool = False) -> None def write_parq

Re: [PR] Default to ZSTD compression when writing Parquet [datafusion-python]

2025-01-06 Thread via GitHub
kevinjqliu commented on code in PR #981: URL: https://github.com/apache/datafusion-python/pull/981#discussion_r1904816353 ## python/datafusion/dataframe.py: ## @@ -620,16 +620,25 @@ def write_csv(self, path: str | pathlib.Path, with_header: bool = False) -> None def write_

Re: [PR] Default to ZSTD compression when writing Parquet [datafusion-python]

2025-01-06 Thread via GitHub
kosiew commented on code in PR #981: URL: https://github.com/apache/datafusion-python/pull/981#discussion_r1904789223 ## python/datafusion/dataframe.py: ## @@ -620,16 +620,24 @@ def write_csv(self, path: str | pathlib.Path, with_header: bool = False) -> None def write_parq

Re: [PR] Default to ZSTD compression when writing Parquet [datafusion-python]

2025-01-06 Thread via GitHub
kylebarron commented on code in PR #981: URL: https://github.com/apache/datafusion-python/pull/981#discussion_r1904715954 ## python/datafusion/dataframe.py: ## @@ -620,16 +620,24 @@ def write_csv(self, path: str | pathlib.Path, with_header: bool = False) -> None def write_

Re: [PR] Default to ZSTD compression when writing Parquet [datafusion-python]

2024-12-30 Thread via GitHub
kosiew commented on code in PR #981: URL: https://github.com/apache/datafusion-python/pull/981#discussion_r1899871918 ## python/datafusion/dataframe.py: ## @@ -620,16 +620,24 @@ def write_csv(self, path: str | pathlib.Path, with_header: bool = False) -> None def write_parq

Re: [PR] Default to ZSTD compression when writing Parquet [datafusion-python]

2024-12-28 Thread via GitHub
ion-elgreco commented on PR #981: URL: https://github.com/apache/datafusion-python/pull/981#issuecomment-2564288959 In delta-rs we have the default to use "snappy" compression, except our optimize operation which uses ZSTD(4)

Re: [PR] Default to ZSTD compression when writing Parquet [datafusion-python]

2024-12-26 Thread via GitHub
kosiew commented on code in PR #981: URL: https://github.com/apache/datafusion-python/pull/981#discussion_r1898185864 ## python/datafusion/dataframe.py: ## @@ -620,16 +620,24 @@ def write_csv(self, path: str | pathlib.Path, with_header: bool = False) -> None def write_parq

Re: [PR] Default to ZSTD compression when writing Parquet [datafusion-python]

2024-12-26 Thread via GitHub
kosiew commented on code in PR #981: URL: https://github.com/apache/datafusion-python/pull/981#discussion_r1898185454 ## python/datafusion/dataframe.py: ## @@ -620,16 +620,24 @@ def write_csv(self, path: str | pathlib.Path, with_header: bool = False) -> None def write_parq

Re: [PR] Default to ZSTD compression when writing Parquet [datafusion-python]

2024-12-24 Thread via GitHub
kylebarron commented on code in PR #981: URL: https://github.com/apache/datafusion-python/pull/981#discussion_r1896895123 ## python/datafusion/dataframe.py: ## @@ -620,16 +620,24 @@ def write_csv(self, path: str | pathlib.Path, with_header: bool = False) -> None def write_

Re: [PR] Default to ZSTD compression when writing Parquet [datafusion-python]

2024-12-24 Thread via GitHub
kylebarron commented on code in PR #981: URL: https://github.com/apache/datafusion-python/pull/981#discussion_r1896894738 ## python/datafusion/dataframe.py: ## @@ -620,16 +620,24 @@ def write_csv(self, path: str | pathlib.Path, with_header: bool = False) -> None def write_

[PR] Default to ZSTD compression when writing Parquet [datafusion-python]

2024-12-23 Thread via GitHub
kosiew opened a new pull request, #981: URL: https://github.com/apache/datafusion-python/pull/981 # Which issue does this PR close? Closes #978. # Rationale for this change Currently, the write_parquet method defaults to "uncompressed" Parquet files, whi
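
For readers skimming this thread, here is a rough sketch of the idea being discussed: resolving a compression name to a codec and a default level, with ZSTD as the fallback. It is not the code from the PR. The names `Compression`, `_DEFAULT_LEVELS`, and `resolve_compression` are hypothetical, the ZSTD level of 4 is borrowed from the delta-rs comment in this thread, and the gzip level of 6 is assumed from gzip's customary default.

```python
from enum import Enum
from typing import Optional, Tuple


class Compression(Enum):
    """Hypothetical subset of Parquet codec names, for illustration only."""

    UNCOMPRESSED = "uncompressed"
    SNAPPY = "snappy"
    GZIP = "gzip"
    ZSTD = "zstd"


# Assumed default levels; codecs missing from this map take no level.
_DEFAULT_LEVELS = {Compression.ZSTD: 4, Compression.GZIP: 6}


def resolve_compression(
    name: str = "zstd", level: Optional[int] = None
) -> Tuple[Compression, Optional[int]]:
    """Map a user-supplied codec name to (codec, level), defaulting to ZSTD."""
    try:
        codec = Compression(name.strip().lower())
    except ValueError as exc:
        valid = ", ".join(c.value for c in Compression)
        raise ValueError(
            f"Unknown compression {name!r}; expected one of: {valid}"
        ) from exc
    if level is None:
        # Fall back to the assumed per-codec default, if one exists.
        level = _DEFAULT_LEVELS.get(codec)
    return codec, level
```

Calling `resolve_compression()` with no arguments picks ZSTD with the assumed level, while an explicit level such as `resolve_compression("gzip", 9)` is passed through unchanged.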