Dear all, I am trying to connect a remote Windows machine to a standalone Spark cluster (a single VM running Ubuntu Server with 8 cores and 64GB RAM). Both the client and the server have Spark 2.0 prebuilt for Hadoop 2.6, and Hadoop 2.7.
I have the following settings on the cluster:

  export SPARK_WORKER_MEMORY=32G
  export SPARK_WORKER_CORES=8

and the following settings on the client (spark-defaults.conf):

  spark.driver.memory 4g
  spark.executor.memory 8g
  spark.executor.cores 2

When I start pyspark, everything works smoothly: in the Spark UI I see that my app is running and has 4 executors attached to it, each with 2 cores and 8g of memory. However, when I try to read some HDFS files, it hangs and repeats the following message in a loop:

  >>> df = sqlContext.read.parquet('/projects/kaggle-bimbo/dataset_full.pqt')
  16/08/17 01:04:34 WARN DomainSocketFactory: The short-circuit local reads feature cannot be used because UNIX Domain sockets are not available on Windows.
  16/08/17 01:04:52 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources

When I go back to the Spark UI, I see that it is actually trying to start another set of 4 executors. These executors hang for some time, fail, and start again, so if the app is left alone it accumulates many executors in the EXITED state. Nothing really happens. The application UI just shows "Stages: 0/1 (1 failed)" and "Tasks: No tasks have started yet".

Is this a bug, or am I doing something wrong? It looks like a recurrence of https://issues.apache.org/jira/browse/SPARK-2260
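If I understand the standalone scheduler correctly, the first set of 4 executors should already fully occupy the single worker, which may be why any replacement executors cannot be scheduled. A quick sketch of the arithmetic, using the numbers above (this is my reasoning, not something the UI reports directly):

```python
# Resources advertised by the single worker (from SPARK_WORKER_* above).
worker_cores = 8
worker_memory_gb = 32

# Per-executor request from spark-defaults.conf on the client.
executor_cores = 2
executor_memory_gb = 8

# The standalone master packs executors onto the worker until either
# cores or memory run out.
max_by_cores = worker_cores // executor_cores        # 8 / 2 = 4
max_by_memory = worker_memory_gb // executor_memory_gb  # 32 / 8 = 4
executors = min(max_by_cores, max_by_memory)

print(executors)  # 4 -- the worker is fully booked by the first set
```

So the worker is saturated on both cores and memory by the initial 4 executors, and a second set of executors has nothing left to claim until the first set releases its resources.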