[jira] [Commented] (SPARK-39732) pyspark.pandas.DataFrame.drop drops dataframe if axis not specified

2022-07-14 Thread Andreas Saltveit (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-39732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17566839#comment-17566839
 ] 

Andreas Saltveit commented on SPARK-39732:
--

3.2.0

[https://spark.apache.org/docs/3.2.0/api/python/reference/pyspark.pandas/api/pyspark.pandas.DataFrame.drop.html#pyspark.pandas.DataFrame.drop]

3.3.0

[https://spark.apache.org/docs/3.3.0/api/python/reference/pyspark.pandas/api/pyspark.pandas.DataFrame.drop.html#pyspark.pandas.DataFrame.drop]

 

> pyspark.pandas.DataFrame.drop drops dataframe if axis not specified
> ---
>
> Key: SPARK-39732
> URL: https://issues.apache.org/jira/browse/SPARK-39732
> Project: Spark
> Issue Type: Bug
> Components: Pandas API on Spark
> Affects Versions: 3.3.0
> Reporter: Andreas Saltveit
> Priority: Major
>
> import pyspark.pandas as pd
> data = [{"Category": 'A', "ID": 1, "Value": 121.44, "Truth": True},
>         {"Category": 'B', "ID": 2, "Value": 300.01, "Truth": False},
>         {"Category": 'C', "ID": 3, "Value": 10.99, "Truth": None},
>         {"Category": 'E', "ID": 4, "Value": 33.87, "Truth": True}
>         ]
> df = pd.DataFrame(data)
> df.display()
> --drops dataframe "Query returned no results"
> df1=df.drop(["ID","Category"])
> df1.display()
> --works
> df2=df.drop(["ID","Category"], 1)
> df2.display()



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-39732) pyspark.pandas.DataFrame.drop drops dataframe if axis not specified

2022-07-14 Thread Andreas Saltveit (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-39732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17566840#comment-17566840
 ] 

Andreas Saltveit commented on SPARK-39732:
--

It seems weird to change the default.







[jira] [Commented] (SPARK-39732) pyspark.pandas.DataFrame.drop drops dataframe if axis not specified

2022-07-14 Thread Andreas Saltveit (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-39732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17566835#comment-17566835
 ] 

Andreas Saltveit commented on SPARK-39732:
--

This is a behavior change from the older Spark version. I used this call in 
production code and had an incident because of it. The default axis used to be 1.
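
To make the effect of the default concrete, here is a minimal pure-Python sketch of how the same label list is read differently depending on the axis. `drop_labels` is a hypothetical helper written for illustration only; it is not the pandas-on-Spark implementation. It silently ignores labels that match nothing, which is a simplification and does not reproduce the empty result reported in this issue.

```python
# Toy model of axis-dependent label dropping.
# Assumption: rows are plain dicts and the row index is the list position.
from typing import Any, Dict, Hashable, List

Row = Dict[str, Any]


def drop_labels(rows: List[Row], labels: List[Hashable], axis: int) -> List[Row]:
    """Hypothetical helper: drop rows (axis=0) or columns (axis=1) by label."""
    wanted = set(labels)
    if axis == 0:
        # Labels are matched against the row index (positions 0..n-1 here).
        return [row for i, row in enumerate(rows) if i not in wanted]
    if axis == 1:
        # Labels are matched against column names.
        return [{k: v for k, v in row.items() if k not in wanted} for row in rows]
    raise ValueError(f"axis must be 0 or 1, got {axis!r}")


data = [
    {"Category": "A", "ID": 1, "Value": 121.44, "Truth": True},
    {"Category": "B", "ID": 2, "Value": 300.01, "Truth": False},
]

# axis=1: "ID" and "Category" are column names, so both columns are removed.
by_column = drop_labels(data, ["ID", "Category"], axis=1)

# axis=0: the same strings are looked up in the row index and match nothing.
by_row = drop_labels(data, ["ID", "Category"], axis=0)

print(by_column)
print(by_row)
```

In real code, stating the intent explicitly avoids depending on whichever default a given version uses: either pass the axis as in the working call from the report, or use the `columns=` keyword that the `DataFrame.drop` documentation lists.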







[jira] [Commented] (SPARK-39732) pyspark.pandas.DataFrame.drop drops dataframe if axis not specified

2022-07-10 Thread Andreas Saltveit (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-39732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17564767#comment-17564767
 ] 

Andreas Saltveit commented on SPARK-39732:
--

This was introduced after 2022-07-04.







[jira] [Updated] (SPARK-39732) pyspark.pandas.DataFrame.drop drops dataframe if axis not specified

2022-07-10 Thread Andreas Saltveit (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-39732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andreas Saltveit updated SPARK-39732:
-
Description: 
import pyspark.pandas as pd
data = [{"Category": 'A', "ID": 1, "Value": 121.44, "Truth": True},
        {"Category": 'B', "ID": 2, "Value": 300.01, "Truth": False},
        {"Category": 'C', "ID": 3, "Value": 10.99, "Truth": None},
        {"Category": 'E', "ID": 4, "Value": 33.87, "Truth": True}
        ]
df = pd.DataFrame(data)
df.display()

--drops dataframe "Query returned no results"
df1=df.drop(["ID","Category"])
df1.display()

--works

df2=df.drop(["ID","Category"], 1)
df2.display()

  was:
import pyspark.pandas as pd
data = [{"Category": 'A', "ID": 1, "Value": 121.44, "Truth": True},
        {"Category": 'B', "ID": 2, "Value": 300.01, "Truth": False},
        {"Category": 'C', "ID": 3, "Value": 10.99, "Truth": None},
        {"Category": 'E', "ID": 4, "Value": 33.87, "Truth": True}
        ]
df = pd.DataFrame(data)
df.display()

# drops dataframe "Query returned no results"
df1=df.drop(["ID","Category"])
df1.display()

# works

df2=df.drop(["ID","Category"], 1)
df2.display()








[jira] [Created] (SPARK-39732) pyspark.pandas.DataFrame.drop drops dataframe if axis not specified

2022-07-10 Thread Andreas Saltveit (Jira)
Andreas Saltveit created SPARK-39732:


 Summary: pyspark.pandas.DataFrame.drop drops dataframe if axis not 
specified
 Key: SPARK-39732
 URL: https://issues.apache.org/jira/browse/SPARK-39732
 Project: Spark
  Issue Type: Bug
  Components: Pandas API on Spark
Affects Versions: 3.3.0
Reporter: Andreas Saltveit


import pyspark.pandas as pd
data = [{"Category": 'A', "ID": 1, "Value": 121.44, "Truth": True},
        {"Category": 'B', "ID": 2, "Value": 300.01, "Truth": False},
        {"Category": 'C', "ID": 3, "Value": 10.99, "Truth": None},
        {"Category": 'E', "ID": 4, "Value": 33.87, "Truth": True}
        ]
df = pd.DataFrame(data)
df.display()

# drops dataframe "Query returned no results"
df1=df.drop(["ID","Category"])
df1.display()

# works

df2=df.drop(["ID","Category"], 1)
df2.display()


