Re: sqlCtx.sql('some_hive_table') works in pyspark but not spark-submit

2015-11-08 Thread Deng Ching-Mallete
Hi,

Did you check whether HADOOP_CONF_DIR is configured in your YARN application classpath? By default, the shell runs in local client mode, which is probably why it resolves the env variable you're setting and is able to get the Hive metastore from your hive-site.xml.

HTH,
Deng

On Sun, Nov 8,
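One quick way to test the suggestion above is to run a small check in the same environment as the batch job and see whether a hive-site.xml is actually reachable through the configuration directory. The following is only a minimal sketch: the helper name find_hive_site is hypothetical, and checking SPARK_CONF_DIR in addition to HADOOP_CONF_DIR is an assumption about how the cluster might be set up, not something stated in this thread.

```python
import os


def find_hive_site(env=None):
    """Return the path of the first hive-site.xml found through the
    config-directory environment variables, or None if none is found.

    `env` defaults to os.environ; passing a dict makes the check testable.
    """
    env = os.environ if env is None else env
    # HADOOP_CONF_DIR comes from the thread; SPARK_CONF_DIR is an assumed
    # extra location that some setups use for hive-site.xml.
    for var in ("HADOOP_CONF_DIR", "SPARK_CONF_DIR"):
        conf_dir = env.get(var)
        if not conf_dir:
            continue  # variable unset or empty in this environment
        candidate = os.path.join(conf_dir, "hive-site.xml")
        if os.path.isfile(candidate):
            return candidate
    return None


if __name__ == "__main__":
    path = find_hive_site()
    if path is None:
        print("No hive-site.xml found via HADOOP_CONF_DIR or SPARK_CONF_DIR")
    else:
        print("Found Hive config at: %s" % path)
```

If this reports a path in the pyspark shell but nothing when run under spark-submit, that would be consistent with the environment variable not reaching the batch job.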

sqlCtx.sql('some_hive_table') works in pyspark but not spark-submit

2015-11-07 Thread YaoPau
Within a pyspark shell, both of these work for me:

    print hc.sql("SELECT * from raw.location_tbl LIMIT 10").collect()
    print sqlCtx.sql("SELECT * from raw.location_tbl LIMIT 10").collect()

But when I submit both of those in batch mode (hc and sqlCtx both exist), I get the following error. Why is