Re: Dataframe's storage size

2021-12-24 Thread Gourav Sengupta
Hi, even cached dataframes holding exactly the same data can occupy different amounts of memory, depending on a lot of conditions. I generally try to understand the problem before jumping to conclusions through assumptions, sadly a habit I cannot overcome. Is there a way to understand what

Re: Dataframe's storage size

2021-12-24 Thread Sean Owen
I assume it means size in memory when cached, which does make sense. The fastest thing is to look at it in the UI Storage tab after it is cached.
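
A minimal sketch of the approach suggested above, assuming a PySpark DataFrame: cache() alone is lazy, so an action such as count() is needed before anything shows up in the Storage tab. The helper name cache_and_materialize is hypothetical and not part of PySpark; it only relies on the DataFrame's cache() and count() methods.

```python
def cache_and_materialize(df):
    """Mark df for caching and run an action so the cache is actually filled.

    df is assumed to be a pyspark.sql.DataFrame. This helper is a
    hypothetical convenience wrapper, not a PySpark API.
    """
    cached = df.cache()    # lazy: only marks the DataFrame for caching
    rows = cached.count()  # an action: computes and stores the partitions
    return cached, rows


# Hypothetical usage in a PySpark shell, where `spark` already exists:
#   df = spark.range(1_000_000)
#   cached, n = cache_and_materialize(df)
# Then open the Spark web UI (port 4040 by default for a local driver)
# and read the in-memory size of the cached data from the Storage tab.
```

The size shown there is the size of the cached representation, so it can differ between storage levels and between runs, and it is not the same as the size of the data on disk.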

Re: Dataframe's storage size

2021-12-24 Thread Gourav Sengupta
Hi, This question, once again like the last one, does not make much sense at all. Where are you trying to store the data frame, and how? Are you just trying to write a blog, as you were mentioning in an earlier email, and trying to fill in some gaps? I think that the questions are entirely

Dataframe's storage size

2021-12-23 Thread bitfox
Hello,

Is it possible to know a dataframe's total storage size in bytes? For example:

    df.size()
    Traceback (most recent call last):
      File "", line 1, in
      File "/opt/spark/python/pyspark/sql/dataframe.py", line 1660, in __getattr__
        "'%s' object has no attribute '%s'" %