I'm learning Apache Spark, and I'm trying to run a basic Spark program
written in Java. I've installed Apache Spark
*(spark-2.4.3-bin-without-hadoop)*, downloaded from https://spark.apache.org/.
I've created a Maven project in Eclipse and added the following dependency:
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.11</artifactId>
    <version>2.4.3</version>
</dependency>
After building the project, I tried to run the program with the Spark
master set to local through the Spark config, and I encountered the
following error:
java.io.IOException: HADOOP_HOME or hadoop.home.dir are not set.
After referring to some sites, I installed hadoop-2.7.7 and added
HADOOP_HOME to my .bash_profile.
And now I'm able to execute my Spark program!
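For reference, the .bash_profile change amounts to something like the
following (the install path here is an assumption; adjust it to wherever
hadoop-2.7.7 was actually unpacked):

```shell
# Hypothetical install location -- point this at your actual hadoop-2.7.7 directory.
export HADOOP_HOME="$HOME/hadoop-2.7.7"
# Optionally put the Hadoop binaries on the PATH as well.
export PATH="$PATH:$HADOOP_HOME/bin"
```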
*Now I need to know where and how Hadoop is necessary for Spark??*
I posted the same question on Stack Overflow a while back, but still
haven't received a response:
https://stackoverflow.com/questions/57435163/why-apache-hadoop-need-to-be-installed-for-running-apache-spark
Regards,
Praveen Kumar Ramachandran