[jira] [Updated] (SPARK-40063) pyspark.pandas .apply() changing rows ordering

2022-08-14 Thread Marcelo Rossini Castro (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-40063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcelo Rossini Castro updated SPARK-40063:
---
Description: 
When apply is used to run a function over a DataFrame column, the rows of 
that column end up in a different order.

A command like this reproduces the problem:
{code:python}
# Square a single value taken from one row.
def example_func(df_col):
  return df_col ** 2

# Overwrite the column with the row-wise result of apply.
df['col_to_apply_function'] = df.apply(
  lambda row: example_func(row['col_to_apply_function']), axis=1)
{code}
A workaround is to assign the results to a new column instead of the same one, 
but if the old column is then dropped, the same problem occurs.

Setting one of the columns as the index did not help either.

  was:
When using the apply function to apply a function to a DataFrame column, it 
ends up mixing the column's rows ordering.

A command like this:
{code:java}
def example_func(df_col):
  return df_col ** 2 

df['row_to_apply_function'] = df.apply(lambda row: 
example_func(row['row_to_apply_function']), axis=1) {code}
A workaround is to assign the results to a new column instead of the same one, 
but if the old column is dropped, the same error is produced.

Setting one column as index also didn't work.


> pyspark.pandas .apply() changing rows ordering
> --
>
> Key: SPARK-40063
> URL: https://issues.apache.org/jira/browse/SPARK-40063
> Project: Spark
> Issue Type: Bug
> Components: Pandas API on Spark
> Affects Versions: 3.3.0
> Environment: Databricks Runtime 11.1
> Reporter: Marcelo Rossini Castro
> Priority: Minor
> Labels: Pandas, PySpark
>
> When apply is used to run a function over a DataFrame column, the rows of 
> that column end up in a different order.
> A command like this reproduces the problem:
> {code:python}
> # Square a single value taken from one row.
> def example_func(df_col):
>   return df_col ** 2
> # Overwrite the column with the row-wise result of apply.
> df['col_to_apply_function'] = df.apply(
>   lambda row: example_func(row['col_to_apply_function']), axis=1)
> {code}
> A workaround is to assign the results to a new column instead of the same 
> one, but if the old column is then dropped, the same problem occurs.
> Setting one of the columns as the index did not help either.
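
One way to confirm the reordering described above is to compare each row's result against the value expected for that row's key. The sketch below is not part of the report. It assumes the frame has a unique key column (a hypothetical id) and that the rows before and after the apply call have been collected into plain Python (key, value) tuples. The helper name misordered_keys and the sample data are made up for illustration.

```python
from typing import Callable, Hashable, Iterable, List, Tuple

# (key, value); `key` stands for a hypothetical unique id column.
Row = Tuple[Hashable, float]


def example_func(value: float) -> float:
    # Same transformation as in the report.
    return value ** 2


def misordered_keys(
    before: Iterable[Row],
    after: Iterable[Row],
    func: Callable[[float], float] = example_func,
) -> List[Hashable]:
    """Return the keys whose value in `after` is not func(value in `before`)."""
    expected = {key: func(value) for key, value in before}
    return [key for key, value in after if expected.get(key) != value]


# Illustrative data only, not taken from the report.
before = [(1, 1.0), (2, 2.0), (3, 3.0)]
aligned = [(1, 1.0), (2, 4.0), (3, 9.0)]
shuffled = [(1, 9.0), (2, 1.0), (3, 4.0)]

print(misordered_keys(before, aligned))   # []
print(misordered_keys(before, shuffled))  # [1, 2, 3]
```

An empty list means every key still carries its own squared value. Any key in the result marks a row whose value was computed from a different row.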



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-40063) pyspark.pandas .apply() changing rows ordering

2022-08-12 Thread Marcelo Rossini Castro (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-40063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcelo Rossini Castro updated SPARK-40063:
---
Description: 
When using the apply function to apply a function to a DataFrame column, it 
ends up mixing the column's rows ordering.

A command like this:
{code:java}
def example_func(df_col):
  return df_col ** 2 

df['row_to_apply_function'] = df.apply(lambda row: 
example_func(row['row_to_apply_function']), axis=1) {code}
A workaround is to assign the results to a new column instead of the same one, 
but if the old column is dropped, the same error is produced.

Setting one column as index also didn't work.

  was:
When using the apply function to apply a function to a DataFrame column, it 
ends up mixing the column's rows ordering.

A command like this:
{code:java}
def example_func(df_col):
  return df_col ** 2 

df['row_to_apply_function'] = df.apply(lambda row: 
example_func(row['row_to_apply_function']), axis=1) {code}
A workaround is to assign the results to a new column instead of the same one, 
but if the old column is dropped, the same error is produced.








[jira] [Updated] (SPARK-40063) pyspark.pandas .apply() changing rows ordering

2022-08-12 Thread Marcelo Rossini Castro (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-40063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcelo Rossini Castro updated SPARK-40063:
---
   Language: Python
Environment: Databricks Runtime 11.1
 Labels: Pandas PySpark  (was: )







[jira] [Updated] (SPARK-40063) pyspark.pandas .apply() changing rows ordering

2022-08-12 Thread Marcelo Rossini Castro (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-40063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcelo Rossini Castro updated SPARK-40063:
---
Description: 
When using the apply function to apply a function to a DataFrame column, it 
ends up mixing the column's rows ordering.

A command like this:
{code:java}
def example_func(df_col):
  return df_col ** 2 

df['row_to_apply_function'] = df.apply(lambda row: 
example_func(row['row_to_apply_function']), axis=1) {code}
A workaround is to assign the results to a new column instead of the same one, 
but if the old column is dropped, the same error is produced.

  was:
When using the apply function to apply a function to a DataFrame column, it 
ends up mixing the column's rows ordering.

A command like this:
{code:java}
def example_func(df_col):
  return df_col ** 2 

df['row_to_apply_function'] = df.apply(lambda row: 
example_func(row['row_to_apply_function']), axis=1) {code}
 

A workaround is to assign the results to a new column instead of the same one, 
but if the old column is dropped, the same error is produced.








[jira] [Updated] (SPARK-40063) pyspark.pandas .apply() changing rows ordering

2022-08-12 Thread Marcelo Rossini Castro (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-40063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcelo Rossini Castro updated SPARK-40063:
---
Description: 
When using the apply function to apply a function to a DataFrame column, it 
ends up mixing the column's rows ordering.

A command like this:
{code:java}
def example_func(df_col):
  return df_col ** 2 

df['row_to_apply_function'] = df.apply(lambda row: 
example_func(row['row_to_apply_function']), axis=1) {code}
 

A workaround is to assign the results to a new column instead of the same one, 
but if the old column is dropped, the same error is produced.

  was:
When using the apply function to apply a function to a DataFrame column, it 
ends up mixing the column's rows ordering.

A command like this:
{code:java}
def example_func(df_col):
  return df_col ** 2

df['row_to_apply_function'] = df.apply(lambda row: 
example_func(row['row_to_apply_function']), axis=1){code}








[jira] [Updated] (SPARK-40063) pyspark.pandas .apply() changing rows ordering

2022-08-12 Thread Marcelo Rossini Castro (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-40063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcelo Rossini Castro updated SPARK-40063:
---
Summary: pyspark.pandas .apply() changing rows ordering  (was: 
pyspark.pandas .apply() chaging rows ordering)



