Is one Spark partition mapped to one and only one Spark task?

2024-03-24 Thread Sreyan Chakravarty
. -- Regards, Sreyan Chakravarty

Re: pyspark - Where are Dataframes created from Python objects stored?

2024-03-18 Thread Sreyan Chakravarty
be faked. I want data to actually reside on the storage or executors. Maybe this will be better tackled in a separate thread here: https://lists.apache.org/thread/w6f7rq7m8fj6hzwpyhvvx3c42wbmkwdq -- Regards, Sreyan Chakravarty

pyspark - Use Spark to generate a large dataset on the fly

2024-03-18 Thread Sreyan Chakravarty
n the data from the Kafka topic? Basically, my problem calls for sending each piece of data to the worker node as I receive it. Can that be done somehow? -- Regards, Sreyan Chakravarty

Re: pyspark - Where are Dataframes created from Python objects stored?

2024-03-18 Thread Sreyan Chakravarty
> So, just to be clear: the transformations are always executed on the worker nodes, but execution is deferred until an action on the DataFrame is triggered. Am I correct? If so, how do I generate a large dataset? I may need something like that as synthetic data for testing. Is there any way to do that? -- Regards, Sreyan Chakravarty

pyspark - Where are Dataframes created from Python objects stored?

2024-03-14 Thread Sreyan Chakravarty
. -- Regards, Sreyan Chakravarty