Hello all,

I'm new to Spark and am trying to use it through PySpark. I'm using the
prebuilt Spark 2.1.1 distribution (spark-2.1.1-bin-hadoop2.7) on Windows.
When I run 'bin\pyspark' from the command line, the shell fails during
initialization with the following output:

C:\spark\spark-2.1.1-bin-hadoop2.7> bin\pyspark
Python 3.6.0 |Anaconda 4.3.1 (64-bit)| (default, Dec 23 2016, 11:57:41)
[MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
Using Spark's default log4j profile:
org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use
setLogLevel(newLevel).
17/06/06 10:30:14 WARN NativeCodeLoader: Unable to load native-hadoop
library for your platform... using builtin-java classes where applicable
17/06/06 10:30:21 WARN ObjectStore: Version information not found in
metastore. hive.metastore.schema.verification is not enabled so recording
the schema version 1.2.0
17/06/06 10:30:21 WARN ObjectStore: Failed to get database default,
returning NoSuchObjectException
Traceback (most recent call last):
  File "C:\spark\spark-2.1.1-bin-hadoop2.7\python\pyspark\sql\utils.py",
line 63, in deco
    return f(*a, **kw)
  File
"C:\spark\spark-2.1.1-bin-hadoop2.7\python\lib\py4j-0.10.4-src.zip\py4j\protocol.py",
line 319, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling
o22.sessionState.
: java.lang.IllegalArgumentException: Error while instantiating
'org.apache.spark.sql.hive.HiveSessionState':
        at
org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$reflect(SparkSession.scala:981)
        at
org.apache.spark.sql.SparkSession.sessionState$lzycompute(SparkSession.scala:110)
        at
org.apache.spark.sql.SparkSession.sessionState(SparkSession.scala:109)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
        at
py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
        at py4j.Gateway.invoke(Gateway.java:280)
        at
py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
        at py4j.commands.CallCommand.execute(CallCommand.java:79)
        at py4j.GatewayConnection.run(GatewayConnection.java:214)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
Method)
        at
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at
org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$reflect(SparkSession.scala:978)
        ... 13 more
Caused by: java.lang.IllegalArgumentException: Error while instantiating
'org.apache.spark.sql.hive.HiveExternalCatalog':
        at
org.apache.spark.sql.internal.SharedState$.org$apache$spark$sql$internal$SharedState$$reflect(SharedState.scala:169)
        at
org.apache.spark.sql.internal.SharedState.<init>(SharedState.scala:86)
        at
org.apache.spark.sql.SparkSession$$anonfun$sharedState$1.apply(SparkSession.scala:101)
        at
org.apache.spark.sql.SparkSession$$anonfun$sharedState$1.apply(SparkSession.scala:101)
        at scala.Option.getOrElse(Option.scala:121)
        at
org.apache.spark.sql.SparkSession.sharedState$lzycompute(SparkSession.scala:101)
        at
org.apache.spark.sql.SparkSession.sharedState(SparkSession.scala:100)
        at
org.apache.spark.sql.internal.SessionState.<init>(SessionState.scala:157)
        at
org.apache.spark.sql.hive.HiveSessionState.<init>(HiveSessionState.scala:32)
        ... 18 more
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
Method)
        at
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at
org.apache.spark.sql.internal.SharedState$.org$apache$spark$sql$internal$SharedState$$reflect(SharedState.scala:166)
        ... 26 more
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
Method)
        at
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at
org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:264)
        at
org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:358)
        at
org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:262)
        at
org.apache.spark.sql.hive.HiveExternalCatalog.<init>(HiveExternalCatalog.scala:66)
        ... 31 more
Caused by: java.lang.RuntimeException: java.lang.RuntimeException: The root
scratch dir: /tmp/hive on HDFS should be writable. Current permissions are:
rw-rw-rw-
        at
org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)
        at
org.apache.spark.sql.hive.client.HiveClientImpl.<init>(HiveClientImpl.scala:188)
        ... 39 more
Caused by: java.lang.RuntimeException: The root scratch dir: /tmp/hive on
HDFS should be writable. Current permissions are: rw-rw-rw-
        at
org.apache.hadoop.hive.ql.session.SessionState.createRootHDFSDir(SessionState.java:612)
        at
org.apache.hadoop.hive.ql.session.SessionState.createSessionDirs(SessionState.java:554)
        at
org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:508)
        ... 40 more


During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\spark\spark-2.1.1-bin-hadoop2.7\bin\..\python\pyspark\shell.py",
line 43, in <module>
    spark = SparkSession.builder\
  File "C:\spark\spark-2.1.1-bin-hadoop2.7\python\pyspark\sql\session.py",
line 179, in getOrCreate
    session._jsparkSession.sessionState().conf().setConfString(key, value)
  File
"C:\spark\spark-2.1.1-bin-hadoop2.7\python\lib\py4j-0.10.4-src.zip\py4j\java_gateway.py",
line 1133, in __call__
  File "C:\spark\spark-2.1.1-bin-hadoop2.7\python\pyspark\sql\utils.py",
line 79, in deco
    raise IllegalArgumentException(s.split(': ', 1)[1], stackTrace)
pyspark.sql.utils.IllegalArgumentException: "Error while instantiating
'org.apache.spark.sql.hive.HiveSessionState':"
>>>

Any help with what might be going wrong here would be greatly appreciated.

Best,
-- 
Curtis Burkhalter
Postdoctoral Research Associate, National Audubon Society

https://sites.google.com/site/curtisburkhalter/
