Hi,

We've been considering using the Spark 2.4.4 download package that's pre-built for Hadoop 2.7, together with Hadoop 2.7.7.
When used with Spark, Hadoop 2.7 is often cited as the most stable. However, Hadoop 2.7.7 is end-of-life, so it's no longer supported or available for download. The most recent Hadoop vulnerabilities have only been fixed in versions 2.8.5 and above, and currently only Hadoop 2.9.2 and above are available to download.

We've searched the Spark user forum and have also been following discussions on the development forum, and it's still unclear what the best approach to this issue is. The Spark 3.0.0 discussions currently lean towards keeping Hadoop 2.7 as the default, which is a concern given the known vulnerabilities.

What's our best way forward here? Which versions of Hadoop 2.x do you support? Should we switch to the "Spark 2.4.4 with user-provided Apache Hadoop" package? If so, which supported version of Hadoop should we pair it with?

Thanks,
Jeff
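P.S. In case it helps frame the question: if we did move to the "with user-provided Apache Hadoop" (Hadoop-free) package, our understanding from the Spark docs is that it would be wired up roughly like this (a sketch only; the Hadoop install path and the choice of Hadoop version are placeholders, not a recommendation):

```shell
# conf/spark-env.sh (sketch; /path/to/hadoop is a placeholder for our install)
# The Hadoop-free Spark build picks up Hadoop's jars via SPARK_DIST_CLASSPATH,
# so Hadoop (e.g. 2.8.5 or 2.9.2) could then be patched independently of Spark.
export SPARK_DIST_CLASSPATH=$(/path/to/hadoop/bin/hadoop classpath)
```

Is that the intended way to stay on a patched Hadoop 2.x line with Spark 2.4.4?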