Hi Akshay
On the Spark download page, when you select Spark 2.2.1 it gives you an
option to select the package type. One of the options is
"Pre-Built for Apache Hadoop 2.7 and later", so I am assuming that it
does support Hadoop 3.0.
http://spark.apache.org/downloads.html
Hey,
We use a customized receiver to receive data from our MQ. We used to call def
store(dataItem: T) to store each record, but I found that the resulting block
sizes vary widely, from 0.5K to 5M, so the per-partition processing time is
very uneven. A shuffle is an option, but I want to avoid it.
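One way to even out block sizes without a shuffle (a sketch, not from this thread) is to buffer records inside the receiver and push fixed-size batches through the multi-record store(ArrayBuffer[T]) overload of the Receiver API, instead of calling store(dataItem) per record. The BatchingMqReceiver name, the batchSize parameter, and the pollFromMq helper below are all hypothetical placeholders for your own MQ client:

```scala
import scala.collection.mutable.ArrayBuffer
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.receiver.Receiver

// Hypothetical receiver: buffers records and stores them in fixed-size
// batches, so each block handed to Spark has a predictable size.
class BatchingMqReceiver[T](batchSize: Int)
    extends Receiver[T](StorageLevel.MEMORY_AND_DISK_SER) {

  override def onStart(): Unit = {
    new Thread("mq-receiver") {
      override def run(): Unit = {
        val buffer = new ArrayBuffer[T](batchSize)
        while (!isStopped()) {
          buffer += pollFromMq()       // hypothetical blocking MQ poll
          if (buffer.size >= batchSize) {
            store(buffer)              // one block per fixed-size batch
            buffer.clear()
          }
        }
        if (buffer.nonEmpty) store(buffer)  // flush the tail on shutdown
      }
    }.start()
  }

  override def onStop(): Unit = {}

  // Placeholder: replace with your MQ client's receive call.
  private def pollFromMq(): T = ???
}
```

Note that for receiver-based streams the block boundaries are also governed by spark.streaming.blockInterval, so tuning that setting alongside the batch size may help balance partitions.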
AFAIK, there's no large-scale test for Hadoop 3.0 in the community, so it
is not clear whether it is supported or not (or has some issues). I think
on the download page "Pre-Built for Apache Hadoop 2.7 and later" mostly
means that it supports Hadoop 2.7+ (2.8, ...), but not 3.0 (IIUC).
Thanks
Jerry