[ https://issues.apache.org/jira/browse/SPARK-38988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17527432#comment-17527432 ]
Bjørn Jørgensen commented on SPARK-38988:
-----------------------------------------

I added a new file, "warning printed.txt". It shows that whether the warning is printed depends on the DataFrame size.

If the DataFrame is

Int64Index: 34 entries, 0 to 33
Data columns (total 37 columns):

the warning is not printed. If the DataFrame is

Int64Index: 109 entries, 0 to 108
Data columns (total 112 columns):

the warning is printed 13 times.

> Pandas API - "PerformanceWarning: DataFrame is highly fragmented." get printed many times.
> -------------------------------------------------------------------------------------------
>
>                 Key: SPARK-38988
>                 URL: https://issues.apache.org/jira/browse/SPARK-38988
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 3.3.0, 3.4.0
>            Reporter: Bjørn Jørgensen
>            Priority: Major
>         Attachments: Untitled.html, info.txt, warning printed.txt
>
>
> I added a file and a notebook with the info message I get when I run df.info().
> Spark master build from 13.04.22.
> df.shape
> (763300, 224)

--
This message was sent by Atlassian Jira (v8.20.7#820007)
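The size dependence reported in the comment can be explored in plain pandas, without Spark. The following is a minimal sketch, not a reproduction of the pandas API on Spark code path: it assumes the warning comes from pandas' check on the number of internal blocks when columns are inserted one at a time (in the pandas versions I am aware of, this fires once a frame holds more than 100 blocks). Whether `df.info()` in the pandas API on Spark builds its frame this way is an assumption, not something confirmed in this thread. The helper name `count_fragmentation_warnings` is hypothetical.

```python
import warnings

import pandas as pd
from pandas.errors import PerformanceWarning


def count_fragmentation_warnings(n_columns: int, n_rows: int = 10) -> int:
    """Add columns one at a time and count the PerformanceWarnings emitted.

    Hypothetical helper for illustration only; it mimics a column-by-column
    build, which is an assumed (not confirmed) cause of the reported warning.
    """
    df = pd.DataFrame(index=range(n_rows))
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning instead of showing each location only once.
        warnings.simplefilter("always")
        for i in range(n_columns):
            # Each new column is inserted as its own internal block.
            df[f"col_{i}"] = i
    return sum(issubclass(w.category, PerformanceWarning) for w in caught)


if __name__ == "__main__":
    # Column counts taken from the two cases in the comment.
    for n in (37, 112):
        print(n, "columns:", count_fragmentation_warnings(n), "warning(s)")
```

If this assumption holds, the 37-column case stays silent and the 112-column case warns repeatedly, which matches the pattern described above. Building all columns at once, for example with `pd.concat(..., axis=1)`, is the remedy the warning text itself suggests.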