Hi everybody. Can someone tell me whether it is possible to read and filter a 60 GB file of tweets (JSON documents) in a standalone Spark deployment running on a single machine with 40 GB of RAM and 8 cores?
In other words, is it possible to configure Spark to use only part of the available memory (say 20 GB) and spill the rest of the processing to disk, so as to avoid OutOfMemory exceptions? Regards, Abel
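For context, this is roughly the kind of setup I have in mind (just a sketch; the memory values, the spill directory, and the `filter_tweets.py` script name are placeholders, not a tested configuration):

```shell
# Sketch of the configuration in question (values are illustrative).
# local[8] runs Spark in local mode on all 8 cores; the driver is
# capped at 20 GB, and spark.local.dir points at a disk location
# with enough free space for data that spills out of memory.
spark-submit \
  --master "local[8]" \
  --driver-memory 20g \
  --conf spark.local.dir=/tmp/spark-spill \
  filter_tweets.py
```

The idea would be that Spark processes the 60 GB input one partition at a time, so the whole file never has to fit in the 20 GB of memory at once.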