[jira] [Updated] (SPARK-35807) Deprecate the `num_files` argument

2021-07-19 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-35807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon updated SPARK-35807:
-
Fix Version/s: 3.2.0

> Deprecate the `num_files` argument
> --
>
> Key: SPARK-35807
> URL: https://issues.apache.org/jira/browse/SPARK-35807
> Project: Spark
> Issue Type: Sub-task
> Components: PySpark
> Affects Versions: 3.2.0
> Reporter: Haejoon Lee
> Assignee: Haejoon Lee
> Priority: Major
> Fix For: 3.2.0
>
>
> We should deprecate the num_files argument in
> [DataFrame.to_csv|https://koalas.readthedocs.io/en/latest/reference/api/databricks.koalas.DataFrame.to_csv.html]
> and
> [DataFrame.to_json|https://koalas.readthedocs.io/en/latest/reference/api/databricks.koalas.DataFrame.to_json.html],
> because num_files does not actually specify the number of files; it
> specifies the number of partitions.
> The deprecation warning should therefore encourage users to use
> [DataFrame.spark.repartition|https://koalas.readthedocs.io/en/latest/reference/api/databricks.koalas.DataFrame.spark.repartition.html]
> instead.
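
A minimal sketch of the deprecation pattern described in the issue, using only the Python standard library. The function name `to_csv_sketch`, the warning class (`FutureWarning`), and the message wording are assumptions made for illustration; they are not taken from the actual PySpark change.

```python
import warnings
from typing import Optional


def to_csv_sketch(path: str, num_files: Optional[int] = None) -> Optional[int]:
    """Hypothetical stand-in for a writer that still accepts `num_files`.

    Returns the partition count that would be requested, or None when
    `num_files` is not given. Nothing is written to `path`.
    """
    if num_files is not None:
        # Assumed message: point users at the replacement named in the issue.
        warnings.warn(
            "`num_files` is deprecated because it sets the number of "
            "partitions, not the number of files. "
            "Use `DataFrame.spark.repartition` instead.",
            FutureWarning,
            stacklevel=2,
        )
    return num_files


# Capture the warning to show that passing `num_files` triggers it.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    requested = to_csv_sketch("/tmp/out", num_files=4)

print(requested, caught[0].category.__name__)
```

With a real DataFrame, the replacement the issue recommends would be to repartition first through `DataFrame.spark.repartition` and then call `to_csv` or `to_json` without `num_files`.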



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-35807) Deprecate the `num_files` argument

2021-07-18 Thread Haejoon Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-35807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haejoon Lee updated SPARK-35807:

Summary: Deprecate the `num_files` argument  (was: Rename the `num_files` 
argument)

> Deprecate the `num_files` argument
> --
>
> Key: SPARK-35807
> URL: https://issues.apache.org/jira/browse/SPARK-35807
> Project: Spark
> Issue Type: Sub-task
> Components: PySpark
> Affects Versions: 3.2.0
> Reporter: Haejoon Lee
> Priority: Major
>
> We should rename the num_files argument in
> [DataFrame.to_csv|https://koalas.readthedocs.io/en/latest/reference/api/databricks.koalas.DataFrame.to_csv.html]
> and
> [DataFrame.to_json|https://koalas.readthedocs.io/en/latest/reference/api/databricks.koalas.DataFrame.to_json.html],
> because num_files does not actually specify the number of files; it
> specifies the number of partitions.
> Alternatively, we could simply remove it and use
> +[DataFrame.spark.repartition|https://koalas.readthedocs.io/en/latest/reference/api/databricks.koalas.DataFrame.spark.repartition.html]+
> as a workaround.






[jira] [Updated] (SPARK-35807) Deprecate the `num_files` argument

2021-07-18 Thread Haejoon Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-35807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haejoon Lee updated SPARK-35807:

Description: 
We should deprecate the num_files argument in
[DataFrame.to_csv|https://koalas.readthedocs.io/en/latest/reference/api/databricks.koalas.DataFrame.to_csv.html]
and
[DataFrame.to_json|https://koalas.readthedocs.io/en/latest/reference/api/databricks.koalas.DataFrame.to_json.html],
because num_files does not actually specify the number of files; it specifies
the number of partitions.

The deprecation warning should therefore encourage users to use
[DataFrame.spark.repartition|https://koalas.readthedocs.io/en/latest/reference/api/databricks.koalas.DataFrame.spark.repartition.html]
instead.

  was:
We should rename the num_files argument in
[DataFrame.to_csv|https://koalas.readthedocs.io/en/latest/reference/api/databricks.koalas.DataFrame.to_csv.html]
and
[DataFrame.to_json|https://koalas.readthedocs.io/en/latest/reference/api/databricks.koalas.DataFrame.to_json.html],
because num_files does not actually specify the number of files; it specifies
the number of partitions.

Alternatively, we could simply remove it and use
+[DataFrame.spark.repartition|https://koalas.readthedocs.io/en/latest/reference/api/databricks.koalas.DataFrame.spark.repartition.html]+
as a workaround.


> Deprecate the `num_files` argument
> --
>
> Key: SPARK-35807
> URL: https://issues.apache.org/jira/browse/SPARK-35807
> Project: Spark
> Issue Type: Sub-task
> Components: PySpark
> Affects Versions: 3.2.0
> Reporter: Haejoon Lee
> Priority: Major
>
> We should deprecate the num_files argument in
> [DataFrame.to_csv|https://koalas.readthedocs.io/en/latest/reference/api/databricks.koalas.DataFrame.to_csv.html]
> and
> [DataFrame.to_json|https://koalas.readthedocs.io/en/latest/reference/api/databricks.koalas.DataFrame.to_json.html],
> because num_files does not actually specify the number of files; it
> specifies the number of partitions.
> The deprecation warning should therefore encourage users to use
> [DataFrame.spark.repartition|https://koalas.readthedocs.io/en/latest/reference/api/databricks.koalas.DataFrame.spark.repartition.html]
> instead.


