/user/biapp already exists in HDFS. The problem is that hive-site.xml is
being ignored, so it is looking for it locally.
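
One way to narrow this down is to check which of the configuration directories mentioned in this thread actually contain a hive-site.xml. Below is a minimal Python sketch. The helper name `find_hive_site` is made up for illustration, and the candidate directories are simply the ones used in the commands quoted below, not an authoritative list of where Spark looks. It only inspects the local filesystem and says nothing about which copy, if any, Spark finally loads.

```python
import os


def find_hive_site(env=None):
    """Return the candidate config dirs that contain a hive-site.xml.

    Hypothetical helper: the candidates are the directories named in this
    thread ($SPARK_HOME/conf, HADOOP_CONF_DIR, YARN_CONF_DIR).
    """
    env = os.environ if env is None else env
    candidates = []
    spark_home = env.get("SPARK_HOME")
    if spark_home:
        candidates.append(os.path.join(spark_home, "conf"))
    for var in ("HADOOP_CONF_DIR", "YARN_CONF_DIR"):
        if env.get(var):
            candidates.append(env[var])
    # Keep only the directories where the file is really present.
    return [d for d in candidates
            if os.path.isfile(os.path.join(d, "hive-site.xml"))]


if __name__ == "__main__":
    for directory in find_hive_site():
        print("hive-site.xml found in %s" % directory)
```

Running it with the same environment variables as the pyspark or spark-shell command shows which copies of the file exist for that invocation.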

> Create /user/biapp in hdfs manually first.
>> Sure, I did it with spark-shell, which seems to be showing the same error
>> - not using the hive-site.xml
>> $ HADOOP_CONF_DIR=$SPARK_HOME/hadoop-conf \
>>     $SPARK_HOME/bin/pyspark --deploy-mode client --driver-class-path
>> Python 2.6.6 (r266:84292, Jul 23 2015, 05:13:40)
>> [GCC 4.4.7 20120313 (Red Hat 4.4.7-16)] on linux2
>> Type "help", "copyright", "credits" or "license" for more information.
>> SLF4J: Class path contains multiple SLF4J bindings.
>> SLF4J: Found binding in
>> [jar:file:/usr/lib/spark-1.5.1-bin-without-hadoop/lib/spark-assembly-1.5.1-hadoop2.5.0-cdh5.3.3.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>> SLF4J: Found binding in
>> [jar:file:/usr/lib/zookeeper/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>> SLF4J: See for an
>> explanation.
>> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
>> 15/10/29 10:33:20 WARN MetricsSystem: Using default name DAGScheduler for
>> source because is not set.
>> 15/10/29 10:33:22 WARN NativeCodeLoader: Unable to load native-hadoop
>> library for your platform... using builtin-java classes where applicable
>> 15/10/29 10:33:50 WARN HiveConf: HiveConf of name hive.metastore.local
>> does not exist
>> Welcome to
>>       ____              __
>>      / __/__  ___ _____/ /__
>>     _\ \/ _ \/ _ `/ __/  '_/
>>    /__ / .__/\_,_/_/ /_/\_\   version 1.5.1
>>       /_/
>> Using Python version 2.6.6 (r266:84292, Jul 23 2015 05:13:40)
>> SparkContext available as sc, HiveContext available as sqlContext.
>> >>>
>> biapps@biapps-qa01:~> HADOOP_CONF_DIR=$SPARK_HOME/hadoop-conf \
>>     $SPARK_HOME/bin/spark-shell --deploy-mode client
>> SLF4J: Class path contains multiple SLF4J bindings.
>> SLF4J: Found binding in
>> [jar:file:/usr/lib/spark-1.5.1-bin-without-hadoop/lib/spark-assembly-1.5.1-hadoop2.5.0-cdh5.3.3.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>> SLF4J: Found binding in
>> [jar:file:/usr/lib/zookeeper/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>> SLF4J: See for an
>> explanation.
>> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
>> Welcome to
>>       ____              __
>>      / __/__  ___ _____/ /__
>>     _\ \/ _ \/ _ `/ __/  '_/
>>    /___/ .__/\_,_/_/ /_/\_\   version 1.5.1
>>       /_/
>> Using Scala version 2.10.4 (OpenJDK 64-Bit Server VM, Java 1.7.0_91)
>> Type in expressions to have them evaluated.
>> Type :help for more information.
>> 15/10/29 10:34:15 WARN MetricsSystem: Using default name DAGScheduler for
>> source because is not set.
>> 15/10/29 10:34:16 WARN NativeCodeLoader: Unable to load native-hadoop
>> library for your platform... using builtin-java classes where applicable
>> Spark context available as sc.
>> 15/10/29 10:34:46 WARN HiveConf: HiveConf of name hive.metastore.local
>> does not exist
>> 15/10/29 10:34:46 WARN ShellBasedUnixGroupsMapping: got exception trying
>> to get groups for user biapp: id: biapp: No such user
>> 15/10/29 10:34:46 WARN UserGroupInformation: No groups available for user
>> biapp
>> java.lang.RuntimeException:
>> Permission denied:
>> user=biapp, access=WRITE, inode="/user":hdfs:supergroup:drwxr-xr-x
>> at
>> org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkFsPermission(
>> at
>> org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(
>> at
>> org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(
>> at
>> org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkPermission(
>> at
>> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(
>> at
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(
>> at
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(
>> at
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(
>> at
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(
>> at
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(
>> at
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(
>> at
>> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(
>> at
>> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.mkdirs(
>> at
>> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(
>> at
>> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(
>> at
>> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$
>> at org.apache.hadoop.ipc.RPC$
>> at org.apache.hadoop.ipc.Server$Handler$
>> at org.apache.hadoop.ipc.Server$Handler$
>> at Method)
>> at
>> at
>> at org.apache.hadoop.ipc.Server$
>> at
>> org.apache.hadoop.hive.ql.session.SessionState.start(
>> at
>> org.apache.spark.sql.hive.client.ClientWrapper.<init>(ClientWrapper.scala:171)
>> at
>> org.apache.spark.sql.hive.HiveContext.executionHive$lzycompute(HiveContext.scala:162)
>> at
>> org.apache.spark.sql.hive.HiveContext.executionHive(HiveContext.scala:160)
>> at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:167)
>> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>> at
>> sun.reflect.NativeConstructorAccessorImpl.newInstance(
>> at
>> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(
>> at java.lang.reflect.Constructor.newInstance(
>> at
>> org.apache.spark.repl.SparkILoop.createSQLContext(SparkILoop.scala:1028)
>> at $iwC$$iwC.<init>(<console>:9)
>> at $iwC.<init>(<console>:18)
>> at <init>(<console>:20)
>> at .<init>(<console>:24)
>> at .<clinit>(<console>)
>> at .<init>(<console>:7)
>> at .<clinit>(<console>)
>> at $print(<console>)
>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> at
>> sun.reflect.NativeMethodAccessorImpl.invoke(
>> at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(
>> at java.lang.reflect.Method.invoke(
>> at
>> org.apache.spark.repl.SparkIMain$
>> at
>> org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1340)
>> at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
>> at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
>> at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
>> at
>> org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857)
>> at
>> org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
>> at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814)
>> at
>> org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:132)
>> at
>> org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:124)
>> at org.apache.spark.repl.SparkIMain.beQuietDuring(SparkIMain.scala:324)
>> at
>> org.apache.spark.repl.SparkILoopInit$class.initializeSpark(SparkILoopInit.scala:124)
>> at org.apache.spark.repl.SparkILoop.initializeSpark(SparkILoop.scala:64)
>> at
>> org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1$$anonfun$apply$mcZ$sp$5.apply$mcV$sp(SparkILoop.scala:974)
>> at
>> org.apache.spark.repl.SparkILoopInit$class.runThunks(SparkILoopInit.scala:159)
>> at org.apache.spark.repl.SparkILoop.runThunks(SparkILoop.scala:64)
>> at
>> org.apache.spark.repl.SparkILoopInit$class.postInitialization(SparkILoopInit.scala:108)
>> at
>> org.apache.spark.repl.SparkILoop.postInitialization(SparkILoop.scala:64)
>> at
>> org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:991)
>> at
>> org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
>> at
>> org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
>> at
>> at
>> $apache$spark$repl$SparkILoop$$process(SparkILoop.scala:945)
>> at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1059)
>> at org.apache.spark.repl.Main$.main(Main.scala:31)
>> at org.apache.spark.repl.Main.main(Main.scala)
>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> at
>> sun.reflect.NativeMethodAccessorImpl.invoke(
>> at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(
>> at java.lang.reflect.Method.invoke(
>> at
>> org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:672)
>> at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
>> at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
>> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
>> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>> Caused by: Permission
>> denied: user=biapp, access=WRITE, inode="/user":hdfs:supergroup:drwxr-xr-x
>> at
>> org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkFsPermission(
>> at
>> org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(
>> at
>> org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(
>> at
>> org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkPermission(
>> at
>> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(
>> at
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(
>> at
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(
>> at
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(
>> at
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(
>> at
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(
>> at
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(
>> at
>> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(
>> at
>> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.mkdirs(
>> at
>> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(
>> at
>> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(
>> at
>> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$
>> at org.apache.hadoop.ipc.RPC$
>> at org.apache.hadoop.ipc.Server$Handler$
>> at org.apache.hadoop.ipc.Server$Handler$
>> at Method)
>> at
>> at
>> at org.apache.hadoop.ipc.Server$
>> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>> at
>> sun.reflect.NativeConstructorAccessorImpl.newInstance(
>> at
>> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(
>> at java.lang.reflect.Constructor.newInstance(
>> at
>> org.apache.hadoop.ipc.RemoteException.instantiateException(
>> at
>> org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(
>> at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(
>> at org.apache.hadoop.hdfs.DFSClient.mkdirs(
>> at
>> org.apache.hadoop.hdfs.DistributedFileSystem$16.doCall(
>> at
>> org.apache.hadoop.hdfs.DistributedFileSystem$16.doCall(
>> at
>> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(
>> at
>> org.apache.hadoop.hdfs.DistributedFileSystem.mkdirsInternal(
>> at
>> org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(
>> at
>> org.apache.hadoop.hive.ql.exec.Utilities.createDirsWithPermission(
>> at
>> org.apache.hadoop.hive.ql.session.SessionState.createRootHDFSDir(
>> at
>> org.apache.hadoop.hive.ql.session.SessionState.createSessionDirs(
>> at
>> org.apache.hadoop.hive.ql.session.SessionState.start(
>> ... 56 more
>> Caused by:
>> org.apache.hadoop.ipc.RemoteException(
>> Permission denied: user=biapp, access=WRITE,
>> inode="/user":hdfs:supergroup:drwxr-xr-x
>> at
>> org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkFsPermission(
>> at
>> org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(
>> at
>> org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(
>> at
>> org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkPermission(
>> at
>> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(
>> at
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(
>> at
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(
>> at
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(
>> at
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(
>> at
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(
>> at
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(
>> at
>> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(
>> at
>> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.mkdirs(
>> at
>> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(
>> at
>> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(
>> at
>> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$
>> at org.apache.hadoop.ipc.RPC$
>> at org.apache.hadoop.ipc.Server$Handler$
>> at org.apache.hadoop.ipc.Server$Handler$
>> at Method)
>> at
>> at
>> at org.apache.hadoop.ipc.Server$
>> at
>> at
>> at
>> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(
>> at com.sun.proxy.$Proxy14.mkdirs(Unknown Source)
>> at
>> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.mkdirs(
>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> at
>> sun.reflect.NativeMethodAccessorImpl.invoke(
>> at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(
>> at java.lang.reflect.Method.invoke(
>> at
>> at
>> at com.sun.proxy.$Proxy15.mkdirs(Unknown Source)
>> at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(
>> ... 66 more
>> <console>:10: error: not found: value sqlContext
>>        import sqlContext.implicits._
>>               ^
>> <console>:10: error: not found: value sqlContext
>>        import sqlContext.sql
>>               ^
>> scala> sqlContext.sql("show databases").collect
>> <console>:14: error: not found: value sqlContext
>>               sqlContext.sql("show databases").collect
>>               ^
>> scala>
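
The `Permission denied` line in the trace above can be read with the usual POSIX-style rules that HDFS applies: the owner bits for the owner, the group bits for members of the group, and the "other" bits for everyone else. A rough Python sketch of that check follows. `can_write` is a hypothetical helper written for this note, and it deliberately ignores the HDFS superuser and ACLs.

```python
def can_write(user, user_groups, owner, group, mode):
    """Apply owner/group/other write bits from a mode string like 'drwxr-xr-x'."""
    perms = mode[-9:]  # drop the leading file-type character
    if user == owner:
        return perms[1] == "w"
    if group in user_groups:
        return perms[4] == "w"
    return perms[7] == "w"


# Values copied from the log: biapp has no groups, /user is hdfs:supergroup.
print(can_write("biapp", [], "hdfs", "supergroup", "drwxr-xr-x"))  # prints False
```

Since the denied inode is "/user" itself, this suggests that the first missing component of the directory Hive is trying to create sits directly under /user, where biapp only has read and execute access.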
>>> I don't know a lot about how pyspark works. Can you possibly try running
>>> spark-shell and do the same?
>>> sqlContext.sql("show databases").collect
>>> Deenar
>>>> Yes, I am. It was compiled with the following:
>>>> export SPARK_HADOOP_VERSION=2.5.0-cdh5.3.3
>>>> export SPARK_YARN=true
>>>> export SPARK_HIVE=true
>>>> export MAVEN_OPTS="-Xmx2g -XX:MaxPermSize=512M
>>>> -XX:ReservedCodeCacheSize=512m"
>>>> mvn -Pyarn -Phadoop-2.5 -Dhadoop.version=2.5.0-cdh5.3.3 -Phive
>>>> -Phive-thriftserver -DskipTests clean package
>>>>> Are you using Spark built with Hive?
>>>>> # Apache Hadoop 2.6.X with Hive 13 support
>>>>> mvn -Pyarn -Phadoop-2.6 -Dhadoop.version=2.6.0 -Phive -Phive-thriftserver 
>>>>> -DskipTests clean package
>>>>>> Hi Deenar,
>>>>>> As suggested, I have moved the hive-site.xml from HADOOP_CONF_DIR
>>>>>> ($SPARK_HOME/hadoop-conf) to YARN_CONF_DIR ($SPARK_HOME/conf/yarn-conf)
>>>>>> and used the command below to start pyspark, but the error is exactly
>>>>>> the same as before.
>>>>>> $ HADOOP_CONF_DIR=$SPARK_HOME/hadoop-conf \
>>>>>>     YARN_CONF_DIR=$SPARK_HOME/conf/yarn-conf HADOOP_USER_NAME=biapp \
>>>>>>     MASTER=yarn $SPARK_HOME/bin/pyspark --deploy-mode client
>>>>>> Python 2.6.6 (r266:84292, Jul 23 2015, 05:13:40)
>>>>>> [GCC 4.4.7 20120313 (Red Hat 4.4.7-16)] on linux2
>>>>>> Type "help", "copyright", "credits" or "license" for more information.
>>>>>> SLF4J: Class path contains multiple SLF4J bindings.
>>>>>> SLF4J: Found binding in
>>>>>> [jar:file:/usr/lib/spark-1.5.1-bin-without-hadoop/lib/spark-assembly-1.5.1-hadoop2.5.0-cdh5.3.3.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>>>>>> SLF4J: Found binding in
>>>>>> [jar:file:/usr/lib/zookeeper/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>>>>>> SLF4J: See for an
>>>>>> explanation.
>>>>>> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
>>>>>> 15/10/29 09:06:36 WARN MetricsSystem: Using default name DAGScheduler
>>>>>> for source because is not set.
>>>>>> 15/10/29 09:06:38 WARN NativeCodeLoader: Unable to load native-hadoop
>>>>>> library for your platform... using builtin-java classes where applicable
>>>>>> 15/10/29 09:07:03 WARN HiveConf: HiveConf of name
>>>>>> hive.metastore.local does not exist
>>>>>> Welcome to
>>>>>>       ____              __
>>>>>>      / __/__  ___ _____/ /__
>>>>>>     _\ \/ _ \/ _ `/ __/  '_/
>>>>>>    /__ / .__/\_,_/_/ /_/\_\   version 1.5.1
>>>>>>       /_/
>>>>>> Using Python version 2.6.6 (r266:84292, Jul 23 2015 05:13:40)
>>>>>> SparkContext available as sc, HiveContext available as sqlContext.
>>>>>> >>> sqlContext2 = HiveContext(sc)
>>>>>> >>> sqlContext2 = HiveContext(sc)
>>>>>> >>> sqlContext2.sql("show databases").first()
>>>>>> 15/10/29 09:07:34 WARN HiveConf: HiveConf of name
>>>>>> hive.metastore.local does not exist
>>>>>> 15/10/29 09:07:35 WARN ShellBasedUnixGroupsMapping: got exception
>>>>>> trying to get groups for user biapp: id: biapp: No such user
>>>>>> 15/10/29 09:07:35 WARN UserGroupInformation: No groups available for
>>>>>> user biapp
>>>>>> Traceback (most recent call last):
>>>>>>   File "<stdin>", line 1, in <module>
>>>>>>   File
>>>>>> "/usr/lib/spark-1.5.1-bin-without-hadoop/python/pyspark/sql/",
>>>>>> line 552, in sql
>>>>>>     return DataFrame(self._ssql_ctx.sql(sqlQuery), self)
>>>>>>   File
>>>>>> "/usr/lib/spark-1.5.1-bin-without-hadoop/python/pyspark/sql/",
>>>>>> line 660, in _ssql_ctx
>>>>>>     "build/sbt assembly", e)
>>>>>> Exception: ("You must build Spark with Hive. Export 'SPARK_HIVE=true'
>>>>>> and run build/sbt assembly", Py4JJavaError(u'An error occurred while
>>>>>> calling\n', JavaObject 
>>>>>> id=o20))
>>>>>> >>>
>>>>>>> *Hi Zoltan*
>>>>>>> Add hive-site.xml to your YARN_CONF_DIR. i.e.
>>>>>>> $SPARK_HOME/conf/yarn-conf
>>>>>>> Deenar
>>>>>>> *Think Reactive Ltd*
>>>>>>>> Hi,
>>>>>>>> We have a shared CDH 5.3.3 cluster and are trying to use Spark 1.5.1
>>>>>>>> on it in yarn client mode with Hive.
>>>>>>>> I have compiled Spark 1.5.1 with SPARK_HIVE=true, but it seems I am
>>>>>>>> not able to make SparkSQL pick up the hive-site.xml when running
>>>>>>>> pyspark.
>>>>>>>> hive-site.xml is located in $SPARK_HOME/hadoop-conf/hive-site.xml
>>>>>>>> and also in $SPARK_HOME/conf/hive-site.xml
>>>>>>>> When I start pyspark with the command below and then run some
>>>>>>>> simple SparkSQL, it fails; it seems it didn't pick up the settings in
>>>>>>>> hive-site.xml.
>>>>>>>> $ HADOOP_CONF_DIR=$SPARK_HOME/hadoop-conf \
>>>>>>>>     $SPARK_HOME/bin/pyspark --deploy-mode client
>>>>>>>> Python 2.6.6 (r266:84292, Jul 23 2015, 05:13:40)
>>>>>>>> [GCC 4.4.7 20120313 (Red Hat 4.4.7-16)] on linux2
>>>>>>>> Type "help", "copyright", "credits" or "license" for more
>>>>>>>> information.
>>>>>>>> SLF4J: Class path contains multiple SLF4J bindings.
>>>>>>>> SLF4J: Found binding in
>>>>>>>> [jar:file:/usr/lib/spark-1.5.1-bin-without-hadoop/lib/spark-assembly-1.5.1-hadoop2.5.0-cdh5.3.3.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>>>>>>>> SLF4J: Found binding in
>>>>>>>> [jar:file:/usr/lib/zookeeper/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>>>>>>>> SLF4J: See for
>>>>>>>> an explanation.
>>>>>>>> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
>>>>>>>> 15/10/28 10:22:33 WARN MetricsSystem: Using default name
>>>>>>>> DAGScheduler for source because is not set.
>>>>>>>> 15/10/28 10:22:35 WARN NativeCodeLoader: Unable to load
>>>>>>>> native-hadoop library for your platform... using builtin-java classes
>>>>>>>> where applicable
>>>>>>>> 15/10/28 10:22:59 WARN HiveConf: HiveConf of name
>>>>>>>> hive.metastore.local does not exist
>>>>>>>> Welcome to
>>>>>>>>       ____              __
>>>>>>>>      / __/__  ___ _____/ /__
>>>>>>>>     _\ \/ _ \/ _ `/ __/  '_/
>>>>>>>>    /__ / .__/\_,_/_/ /_/\_\   version 1.5.1
>>>>>>>>       /_/
>>>>>>>> Using Python version 2.6.6 (r266:84292, Jul 23 2015 05:13:40)
>>>>>>>> SparkContext available as sc, HiveContext available as sqlContext.
>>>>>>>> >>> sqlContext2 = HiveContext(sc)
>>>>>>>> >>> sqlContext2.sql("show databases").first()
>>>>>>>> 15/10/28 10:23:12 WARN HiveConf: HiveConf of name
>>>>>>>> hive.metastore.local does not exist
>>>>>>>> 15/10/28 10:23:13 WARN ShellBasedUnixGroupsMapping: got exception
>>>>>>>> trying to get groups for user biapp: id: biapp: No such user
>>>>>>>> 15/10/28 10:23:13 WARN UserGroupInformation: No groups available
>>>>>>>> for user biapp
>>>>>>>> Traceback (most recent call last):
>>>>>>>>   File "<stdin>", line 1, in <module>
>>>>>>>>   File
>>>>>>>> "/usr/lib/spark-1.5.1-bin-without-hadoop/python/pyspark/sql/",
>>>>>>>> line 552, in sql
>>>>>>>>     return DataFrame(self._ssql_ctx.sql(sqlQuery), self)
>>>>>>>>   File
>>>>>>>> "/usr/lib/spark-1.5.1-bin-without-hadoop/python/pyspark/sql/",
>>>>>>>> line 660, in _ssql_ctx
>>>>>>>>     "build/sbt assembly", e)
>>>>>>>> Exception: ("You must build Spark with Hive. Export
>>>>>>>> 'SPARK_HIVE=true' and run build/sbt assembly", Py4JJavaError(u'An error
>>>>>>>> occurred while calling\n',
>>>>>>>> JavaObject id=o20))
>>>>>>>> >>>
>>>>>>>> Note the warning above, "WARN HiveConf: HiveConf of name
>>>>>>>> hive.metastore.local does not exist", even though the hive-site.xml
>>>>>>>> does contain a hive.metastore.local property.
>>>>>>>> Any idea how to submit hive-site.xml in yarn client mode?
>>>>>>>> Thanks
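
To see what the file itself declares, independent of Spark, a small script can list the properties in a hive-site.xml. This is only a sketch: `load_hive_site` is a made-up name, and the sample XML and the thrift URI in it are invented for illustration. As far as I know, `hive.metastore.local` was dropped from Hive around 0.10. If that is right, a newer HiveConf warning about it would mean the property was read from the file but is not recognised, rather than that the file was skipped.

```python
import xml.etree.ElementTree as ET


def load_hive_site(xml_text):
    """Parse Hadoop-style <configuration> XML into a dict of name -> value."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


# Invented sample; replace with the contents of the real hive-site.xml.
SAMPLE = """
<configuration>
  <property>
    <name>hive.metastore.local</name>
    <value>false</value>
  </property>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://metastore.example.com:9083</value>
  </property>
</configuration>
"""

props = load_hive_site(SAMPLE)
print(props.get("hive.metastore.uris", "<not set>"))
```

Comparing this output across the copies in $SPARK_HOME/conf and $SPARK_HOME/hadoop-conf would show whether they agree on the metastore settings.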

