Hi, I have a question regarding Spark Streaming resiliency, where the documentation seems ambiguous to me:
The documentation says that the default configuration uses a replication factor of 2 for received data, but it also recommends enabling write-ahead logs (WAL) to guarantee data resiliency with receivers: "Additionally, it is recommended that the replication of the received data within Spark be disabled when the write ahead log is enabled as the log is already stored in a replicated storage system." So the docs say it is redundant to replicate in memory when the WAL is enabled, but what is the benefit of using the WAL instead of the built-in in-memory replication? I would assume that, performance-wise, replicating in memory is better than writing to a replicated file system. Could a streaming expert explain this?

BR
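
P.S. For reference, this is roughly how I understand the setup the docs recommend. It is only a minimal sketch and may not match what the docs intend exactly: the app name, host, port, batch interval and checkpoint path are placeholders I made up, and in a real deployment the checkpoint directory would sit on a replicated file system such as HDFS rather than a local path.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Seconds, StreamingContext}

object WalConfigSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("wal-config-sketch") // placeholder name
      .setMaster("local[2]")           // placeholder master; a receiver needs at least 2 threads locally
      // Turn on the receiver write-ahead log.
      .set("spark.streaming.receiver.writeAheadLog.enable", "true")

    val ssc = new StreamingContext(conf, Seconds(5)) // placeholder batch interval

    // The WAL is written under the checkpoint directory (placeholder path).
    ssc.checkpoint("/tmp/wal-sketch-checkpoint")

    // The default storage level for this receiver is MEMORY_AND_DISK_SER_2 (replication factor 2).
    // With the WAL enabled, the docs suggest the non-replicated variant instead.
    val lines = ssc.socketTextStream("localhost", 9999, StorageLevel.MEMORY_AND_DISK_SER)

    lines.print()
    ssc.start()
    ssc.awaitTermination()
  }
}
```

My question is about what this buys me compared with leaving the WAL off and keeping the default `MEMORY_AND_DISK_SER_2` storage level.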