[jira] [Updated] (SPARK-40063) pyspark.pandas .apply() changing rows ordering
[ https://issues.apache.org/jira/browse/SPARK-40063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcelo Rossini Castro updated SPARK-40063:
---
    Description:
When using the apply function to apply a function to a DataFrame column, it ends up mixing the column's rows ordering.

A command like this:
{code:java}
def example_func(df_col):
    return df_col ** 2

df['col_to_apply_function'] = df.apply(lambda row: example_func(row['col_to_apply_function']), axis=1)
{code}
A workaround is to assign the results to a new column instead of the same one, but if the old column is dropped, the same error is produced.

Setting one column as index also didn't work.

  was:
When using the apply function to apply a function to a DataFrame column, it ends up mixing the column's rows ordering.

A command like this:
{code:java}
def example_func(df_col):
    return df_col ** 2

df['row_to_apply_function'] = df.apply(lambda row: example_func(row['row_to_apply_function']), axis=1)
{code}
A workaround is to assign the results to a new column instead of the same one, but if the old column is dropped, the same error is produced.

Setting one column as index also didn't work.

> pyspark.pandas .apply() changing rows ordering
> --
>
>                 Key: SPARK-40063
>                 URL: https://issues.apache.org/jira/browse/SPARK-40063
>             Project: Spark
>          Issue Type: Bug
>          Components: Pandas API on Spark
>    Affects Versions: 3.3.0
>         Environment: Databricks Runtime 11.1
>            Reporter: Marcelo Rossini Castro
>            Priority: Minor
>              Labels: Pandas, PySpark
>
> When using the apply function to apply a function to a DataFrame column, it
> ends up mixing the column's rows ordering.
> A command like this:
> {code:java}
> def example_func(df_col):
>     return df_col ** 2
> df['col_to_apply_function'] = df.apply(lambda row:
> example_func(row['col_to_apply_function']), axis=1) {code}
> A workaround is to assign the results to a new column instead of the same
> one, but if the old column is dropped, the same error is produced.
> Setting one column as index also didn't work.

--
This message was sent by Atlassian Jira (v8.20.10#820010)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-40063) pyspark.pandas .apply() changing rows ordering
[ https://issues.apache.org/jira/browse/SPARK-40063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcelo Rossini Castro updated SPARK-40063:
---
    Description:
When using the apply function to apply a function to a DataFrame column, it ends up mixing the column's rows ordering.

A command like this:
{code:java}
def example_func(df_col):
    return df_col ** 2

df['row_to_apply_function'] = df.apply(lambda row: example_func(row['row_to_apply_function']), axis=1)
{code}
A workaround is to assign the results to a new column instead of the same one, but if the old column is dropped, the same error is produced.

Setting one column as index also didn't work.

  was:
When using the apply function to apply a function to a DataFrame column, it ends up mixing the column's rows ordering.

A command like this:
{code:java}
def example_func(df_col):
    return df_col ** 2

df['row_to_apply_function'] = df.apply(lambda row: example_func(row['row_to_apply_function']), axis=1)
{code}
A workaround is to assign the results to a new column instead of the same one, but if the old column is dropped, the same error is produced.

> pyspark.pandas .apply() changing rows ordering
> --
>
>                 Key: SPARK-40063
>                 URL: https://issues.apache.org/jira/browse/SPARK-40063
>             Project: Spark
>          Issue Type: Bug
>          Components: Pandas API on Spark
>    Affects Versions: 3.3.0
>         Environment: Databricks Runtime 11.1
>            Reporter: Marcelo Rossini Castro
>            Priority: Minor
>              Labels: Pandas, PySpark
>
> When using the apply function to apply a function to a DataFrame column, it
> ends up mixing the column's rows ordering.
> A command like this:
> {code:java}
> def example_func(df_col):
>     return df_col ** 2
> df['row_to_apply_function'] = df.apply(lambda row:
> example_func(row['row_to_apply_function']), axis=1) {code}
> A workaround is to assign the results to a new column instead of the same
> one, but if the old column is dropped, the same error is produced.
> Setting one column as index also didn't work.
[jira] [Updated] (SPARK-40063) pyspark.pandas .apply() changing rows ordering
[ https://issues.apache.org/jira/browse/SPARK-40063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcelo Rossini Castro updated SPARK-40063:
---
       Language: Python
    Environment: Databricks Runtime 11.1
         Labels: Pandas PySpark  (was: )

> pyspark.pandas .apply() changing rows ordering
> --
>
>                 Key: SPARK-40063
>                 URL: https://issues.apache.org/jira/browse/SPARK-40063
>             Project: Spark
>          Issue Type: Bug
>          Components: Pandas API on Spark
>    Affects Versions: 3.3.0
>         Environment: Databricks Runtime 11.1
>            Reporter: Marcelo Rossini Castro
>            Priority: Minor
>              Labels: Pandas, PySpark
>
> When using the apply function to apply a function to a DataFrame column, it
> ends up mixing the column's rows ordering.
> A command like this:
> {code:java}
> def example_func(df_col):
>     return df_col ** 2
> df['row_to_apply_function'] = df.apply(lambda row:
> example_func(row['row_to_apply_function']), axis=1) {code}
> A workaround is to assign the results to a new column instead of the same
> one, but if the old column is dropped, the same error is produced.
[jira] [Updated] (SPARK-40063) pyspark.pandas .apply() changing rows ordering
[ https://issues.apache.org/jira/browse/SPARK-40063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcelo Rossini Castro updated SPARK-40063:
---
    Description:
When using the apply function to apply a function to a DataFrame column, it ends up mixing the column's rows ordering.

A command like this:
{code:java}
def example_func(df_col):
    return df_col ** 2

df['row_to_apply_function'] = df.apply(lambda row: example_func(row['row_to_apply_function']), axis=1)
{code}
A workaround is to assign the results to a new column instead of the same one, but if the old column is dropped, the same error is produced.

  was:
When using the apply function to apply a function to a DataFrame column, it ends up mixing the column's rows ordering.

A command like this:
{code:java}
def example_func(df_col):
    return df_col ** 2

df['row_to_apply_function'] = df.apply(lambda row: example_func(row['row_to_apply_function']), axis=1)
{code}
A workaround is to assign the results to a new column instead of the same one, but if the old column is dropped, the same error is produced.

> pyspark.pandas .apply() changing rows ordering
> --
>
>                 Key: SPARK-40063
>                 URL: https://issues.apache.org/jira/browse/SPARK-40063
>             Project: Spark
>          Issue Type: Bug
>          Components: Pandas API on Spark
>    Affects Versions: 3.3.0
>            Reporter: Marcelo Rossini Castro
>            Priority: Minor
>
> When using the apply function to apply a function to a DataFrame column, it
> ends up mixing the column's rows ordering.
> A command like this:
> {code:java}
> def example_func(df_col):
>     return df_col ** 2
> df['row_to_apply_function'] = df.apply(lambda row:
> example_func(row['row_to_apply_function']), axis=1) {code}
> A workaround is to assign the results to a new column instead of the same
> one, but if the old column is dropped, the same error is produced.
[jira] [Updated] (SPARK-40063) pyspark.pandas .apply() changing rows ordering
[ https://issues.apache.org/jira/browse/SPARK-40063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcelo Rossini Castro updated SPARK-40063:
---
    Description:
When using the apply function to apply a function to a DataFrame column, it ends up mixing the column's rows ordering.

A command like this:
{code:java}
def example_func(df_col):
    return df_col ** 2

df['row_to_apply_function'] = df.apply(lambda row: example_func(row['row_to_apply_function']), axis=1)
{code}
A workaround is to assign the results to a new column instead of the same one, but if the old column is dropped, the same error is produced.

  was:
When using the apply function to apply a function to a DataFrame column, it ends up mixing the column's rows ordering.

A command like this:
{code:java}
def example_func(df_col):
    return df_col ** 2

df['row_to_apply_function'] = df.apply(lambda row: example_func(row['row_to_apply_function']), axis=1){code}

> pyspark.pandas .apply() changing rows ordering
> --
>
>                 Key: SPARK-40063
>                 URL: https://issues.apache.org/jira/browse/SPARK-40063
>             Project: Spark
>          Issue Type: Bug
>          Components: Pandas API on Spark
>    Affects Versions: 3.3.0
>            Reporter: Marcelo Rossini Castro
>            Priority: Minor
>
> When using the apply function to apply a function to a DataFrame column, it
> ends up mixing the column's rows ordering.
> A command like this:
> {code:java}
> def example_func(df_col):
>     return df_col ** 2
> df['row_to_apply_function'] = df.apply(lambda row:
> example_func(row['row_to_apply_function']), axis=1) {code}
>
> A workaround is to assign the results to a new column instead of the same
> one, but if the old column is dropped, the same error is produced.
[jira] [Updated] (SPARK-40063) pyspark.pandas .apply() changing rows ordering
[ https://issues.apache.org/jira/browse/SPARK-40063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcelo Rossini Castro updated SPARK-40063:
---
    Summary: pyspark.pandas .apply() changing rows ordering  (was: pyspark.pandas .apply() chaging rows ordering)

> pyspark.pandas .apply() changing rows ordering
> --
>
>                 Key: SPARK-40063
>                 URL: https://issues.apache.org/jira/browse/SPARK-40063
>             Project: Spark
>          Issue Type: Bug
>          Components: Pandas API on Spark
>    Affects Versions: 3.3.0
>            Reporter: Marcelo Rossini Castro
>            Priority: Minor
>
> When using the apply function to apply a function to a DataFrame column, it
> ends up mixing the column's rows ordering.
> A command like this:
> {code:java}
> def example_func(df_col):
>     return df_col ** 2
> df['row_to_apply_function'] = df.apply(lambda row:
> example_func(row['row_to_apply_function']), axis=1){code}
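The report in this thread describes a row-wise apply whose results come back in a different row order than the input. One way such a reordering could be checked is sketched below with plain Python lists standing in for the DataFrame column, so no Spark installation is needed. The names `apply_rowwise`, `order_preserved`, `rows` and `shuffled` are hypothetical and are not part of pyspark.pandas; the idea is only to carry an explicit key with every row so the output can be compared with the input position by position.

```python
def example_func(value):
    # Same function as in the report: square the input.
    return value ** 2


def apply_rowwise(rows, func):
    # Hypothetical stand-in for a row-wise apply: maps func over
    # (key, value) pairs and keeps each key attached to its result.
    return [(key, func(value)) for key, value in rows]


def order_preserved(before, after):
    # True when both lists carry the same keys in the same positions.
    return [key for key, _ in before] == [key for key, _ in after]


rows = [(0, 3), (1, 1), (2, 2)]
result = apply_rowwise(rows, example_func)
print(order_preserved(rows, result))

# Simulate the reported symptom: the same results in a different order.
shuffled = [result[2], result[0], result[1]]
print(order_preserved(rows, shuffled))

# Because every result still carries its key, sorting by key recovers
# the original order.
print(sorted(shuffled) == result)
```

With a real pyspark.pandas DataFrame the analogous diagnostic would be to keep an explicit key column and compare input and output after sorting by it. That is an assumption about how the problem could be investigated, not a confirmed fix for SPARK-40063.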