Re: Spark SQL configuration

2014-10-27 Thread Akhil Das
You will face problems if the Spark version isn't compatible with your
Hadoop version. (Let's say you have Hadoop 2.x but downloaded Spark
pre-compiled against Hadoop 1.x; that would be a problem.) Of course, you
can use Spark without providing any Hadoop configuration unless you are
trying to access HDFS.

If Spark was built against the same Hadoop version as your cluster, then
you can set HADOOP_CONF_DIR=/path/to/your/hadoop/conf/ in the
spark-env.sh file under $SPARK_HOME/conf/
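
For example (a minimal sketch; the /etc/hadoop/conf path below is an
assumed example location, not from this thread; use wherever your own
core-site.xml and hdfs-site.xml actually live):

    # $SPARK_HOME/conf/spark-env.sh
    # Point Spark at the directory containing core-site.xml,
    # hdfs-site.xml, and yarn-site.xml:
    export HADOOP_CONF_DIR=/etc/hadoop/conf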

Thanks
Best Regards

On Mon, Oct 27, 2014 at 9:46 AM, Pagliari, Roberto rpagli...@appcomsci.com
wrote:

 What is a YARN cluster?



 And does Spark necessarily need Hadoop already installed in the cluster?
 For example, can one download Spark and run it on a bunch of nodes with no
 prior installation of Hadoop?



 Thanks,





 *From:* Yi Tian [mailto:tianyi.asiai...@gmail.com]
 *Sent:* Sunday, October 26, 2014 9:08 PM
 *To:* Pagliari, Roberto
 *Cc:* u...@spark.incubator.apache.org
 *Subject:* Re: Spark SQL configuration



 You can add `HADOOP_CONF_DIR=your_hadoop_conf_path` to
 `conf/spark-env.sh` to enable:



 1. connecting to your YARN cluster

 2. setting `hdfs` as the default FileSystem; otherwise you have to prefix
 every path you define with `hdfs://`, like `val input =
 sc.textFile("hdfs://user/spark/test.dat")` (see the sketch below)
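
 For instance (a minimal Scala sketch of the difference; the SparkContext
 setup and the namenode host/port are illustrative assumptions, not part of
 the original mail):

     import org.apache.spark.{SparkConf, SparkContext}

     val conf = new SparkConf().setAppName("HdfsPathExample")
     val sc = new SparkContext(conf)

     // With HADOOP_CONF_DIR set, HDFS is the default FileSystem,
     // so a bare path resolves against HDFS:
     val input = sc.textFile("/user/spark/test.dat")

     // Without it, the scheme and namenode must be spelled out
     // (host and port here are placeholders for your own cluster):
     val explicit = sc.textFile("hdfs://namenode-host:8020/user/spark/test.dat")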




 Best Regards,

 Yi Tian
 tianyi.asiai...@gmail.com




 On Oct 27, 2014, at 07:59, Pagliari, Roberto rpagli...@appcomsci.com
 wrote:



 I'm a newbie with Spark. After installing it on all the machines I want to
 use, do I need to tell it about the Hadoop configuration, or will it be able
 to find it by itself?



 Thank you,





Spark SQL configuration

2014-10-26 Thread Pagliari, Roberto
I'm a newbie with Spark. After installing it on all the machines I want to use,
do I need to tell it about the Hadoop configuration, or will it be able to find
it by itself?

Thank you,