Re: Is Apache Spark-2.2.1 compatible with Hadoop-3.0.0

2018-01-07 Thread Saisai Shao
AFAIK, there's no large-scale test of Hadoop 3.0 in the community, so it is not clear whether it is supported or whether it has issues. I think "Pre-Built for Apache Hadoop 2.7 and later" on the download page mostly means that it supports Hadoop 2.7+ (2.8...), but not 3.0 (IIUC). Thanks, Jerry

Limit the block size of data received by Spark Streaming receiver

2018-01-07 Thread Xilang Yan
Hey, we use a custom receiver to receive data from our MQ. We used to call def store(dataItem: T) to store data; however, I found that the block size can vary widely, from 0.5K to 5M, so the processing time per data partition varies widely too. Shuffle is an option, but I want to avoid it. I
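
One possible way to even out block sizes, sketched here under assumptions rather than taken from the thread: buffer items inside the receiver and hand them over with the Receiver's store(dataBuffer: ArrayBuffer[T]) overload, which stores the whole buffer as one block, instead of storing item by item. BlockBatcher, maxBlockBytes, sizeOf and emit are hypothetical names introduced only for illustration; none of them is part of Spark. The helper has no Spark dependency, so it can be tried on its own.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical helper (not a Spark class): groups items into blocks whose
// total size stays near maxBlockBytes, then passes each block to `emit`.
final class BlockBatcher[T](
    maxBlockBytes: Long,           // assumed target upper bound per block
    sizeOf: T => Long,             // caller-supplied size estimate for one item
    emit: ArrayBuffer[T] => Unit   // e.g. block => store(block) inside a Receiver
) {
  private var buffer = ArrayBuffer.empty[T]
  private var bufferedBytes = 0L

  def add(item: T): Unit = synchronized {
    val size = sizeOf(item)
    // Flush first if this item would push the current block over the limit.
    if (buffer.nonEmpty && bufferedBytes + size > maxBlockBytes) flush()
    buffer += item
    bufferedBytes += size
    // A single oversized item still goes out, as a block of its own.
    if (bufferedBytes >= maxBlockBytes) flush()
  }

  // Emit whatever is buffered; call this on a timer and on shutdown as well,
  // so a slow stream does not hold data back indefinitely.
  def flush(): Unit = synchronized {
    if (buffer.nonEmpty) {
      val block = buffer
      buffer = ArrayBuffer.empty[T]
      bufferedBytes = 0L
      emit(block)
    }
  }
}
```

Inside a custom receiver this might be wired up as new BlockBatcher[Array[Byte]](1L << 20, _.length.toLong, block => store(block)), with add called from the thread that reads the MQ. Items larger than the limit are not split, so this bounds block size only as well as the largest single message allows.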

Re: Is Apache Spark-2.2.1 compatible with Hadoop-3.0.0

2018-01-07 Thread Raj Adyanthaya
Hi Akshay, on the Spark download page, when you select Spark 2.2.1 it gives you an option to select the package type. One of the options there is "Pre-Built for Apache Hadoop 2.7 and later". I am assuming that means it does support Hadoop 3.0. http://spark.apache.org/downloads.html