HI,
i have a simple question about creating RDD . Whenever RDD is created in
spark streaming for the particular time window .When does the RDD gets
stored .
1. Does it get stored at the Driver machine ? or it gets stored on all the
machines in the cluster ?
2. Does the data gets stored in memory
Hey Abhi,
many of StreamingContext's methods to create input streams take a
StorageLevel parameter to configure this behavior. RDD partitions are
generally stored in the in-memory cache of worker nodes I think. You can
also configure replication and spilling to disk if needed.
Regards,
Jeff