Santosh Pingale created SPARK-39895:
---------------------------------------

             Summary: pyspark drop doesn't accept *cols 
                 Key: SPARK-39895
                 URL: https://issues.apache.org/jira/browse/SPARK-39895
             Project: Spark
          Issue Type: Bug
          Components: PySpark
    Affects Versions: 3.2.2, 3.3.0, 3.0.3
            Reporter: Santosh Pingale


The PySpark DataFrame.drop method has the following signature:

{color:#4c9aff}{{def drop(self, *cols: "ColumnOrName") -> "DataFrame":}}{color}

However, passing more than one Column to drop raises a TypeError:

{{each col in the param list should be a string}}

*Minimal reproducible example:*
{color:#4c9aff}values = [("id_1", 5, 9), ("id_2", 5, 1), ("id_3", 4, 3), ("id_1", 3, 3), ("id_2", 4, 3)]{color}
{color:#4c9aff}df = spark.createDataFrame(values, "id string, point int, count int"){color}
 |-- id: string (nullable = true)
 |-- point: integer (nullable = true)
 |-- count: integer (nullable = true)

{color:#4c9aff}{{df.drop(df.point, df.count)}}{color}
{quote}{color:#505f79}/spark/python/lib/pyspark.zip/pyspark/sql/dataframe.py in drop(self, *cols){color}
{color:#505f79}2537 for col in cols:{color}
{color:#505f79}2538 if not isinstance(col, str):{color}
{color:#505f79}-> 2539 raise TypeError("each col in the param list should be a string"){color}
{color:#505f79}2540 jdf = self._jdf.drop(self._jseq(cols)){color}
{color:#505f79}2541{color}

{color:#505f79}TypeError: each col in the param list should be a string{color}
{quote}
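
A possible workaround, sketched under assumptions: the traceback shows that the multi-argument path only accepts strings, so this sketch calls drop once per column, which relies on the single-argument path accepting a Column. The helper name drop_each is hypothetical and not part of PySpark. It only assumes that the object passed in has a drop method returning a new DataFrame-like object.

```python
from functools import reduce


def drop_each(df, *cols):
    """Drop columns one at a time (hypothetical helper, not a PySpark API).

    Each call passes exactly one argument to df.drop, so the
    "each col in the param list should be a string" check in the
    multi-argument path is never reached.
    """
    # Start from df and fold over cols, chaining one drop call per column.
    return reduce(lambda acc, col: acc.drop(col), cols, df)
```

With the DataFrame from the example above, this could be used as drop_each(df, df.point, df["count"]). Note that df["count"] is used rather than df.count, because attribute access on a DataFrame resolves count to the DataFrame.count method rather than to the column. Passing plain names, as in df.drop("point", "count"), avoids the error without any helper.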


