Curtis, assuming you are running a somewhat recent Windows version, you would
not have write access to C:\tmp, as in your command example:

winutils.exe ls -F C:\tmp\hive

Try changing the path to a directory under your user profile instead.

Running Spark on Windows should work :)
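
In case it's useful, here is roughly what I mean (an untested sketch:
%USERPROFILE% expands to your user directory in cmd, and I believe Hive's
scratch dir can be redirected by passing hive.exec.scratchdir through as a
spark.hadoop.* conf, though I haven't verified that on 2.1.1):

mkdir %USERPROFILE%\tmp\hive
winutils.exe chmod 777 %USERPROFILE%\tmp\hive
winutils.exe ls -F %USERPROFILE%\tmp\hive
bin\pyspark --conf spark.hadoop.hive.exec.scratchdir=%USERPROFILE%\tmp\hive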

________________________________
From: Curtis Burkhalter <curtisburkhal...@gmail.com>
Sent: Wednesday, June 7, 2017 7:46:56 AM
To: Doc Dwarf
Cc: user@spark.apache.org
Subject: Re: problem initiating spark context with pyspark

Thanks Doc, I saw this on another board yesterday, so I've already tried it: I
first went to the directory where I've stored winutils.exe and then, as an
admin, ran the command that you suggested. I get this exception when checking
the permissions:

C:\winutils\bin>winutils.exe ls -F C:\tmp\hive
FindFileOwnerAndPermission error (1789): The trust relationship between this 
workstation and the primary domain failed.

I'm fairly new to the command line and to working out what the different
exceptions mean. Do you have any advice on what this error means and how I
might go about fixing it?

Thanks again


On Wed, Jun 7, 2017 at 9:51 AM, Doc Dwarf <doc.dwar...@gmail.com> wrote:
Hi Curtis,

I believe on Windows, the following command needs to be executed (you will
need winutils installed):

D:\winutils\bin\winutils.exe chmod 777 D:\tmp\hive
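
If D:\tmp\hive does not exist yet, you may need to create it first, and I
think the drive letter should match the drive you launch Spark from, since
the /tmp/hive path in the error resolves against the current drive on
Windows. A rough sketch (untested, assuming winutils lives in
D:\winutils\bin):

mkdir D:\tmp\hive
D:\winutils\bin\winutils.exe chmod 777 D:\tmp\hive
D:\winutils\bin\winutils.exe ls -F D:\tmp\hive

The ls listing should then show drwxrwxrwx instead of the rw-rw-rw- reported
in your trace.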



On 6 June 2017 at 09:45, Curtis Burkhalter <curtisburkhal...@gmail.com> wrote:
Hello all,

I'm new to Spark and I'm trying to interact with it using PySpark. I'm using
the prebuilt version of Spark v2.1.1, and when I go to the command line and
run 'bin\pyspark', I run into initialization problems and get the following
message:

C:\spark\spark-2.1.1-bin-hadoop2.7> bin\pyspark
Python 3.6.0 |Anaconda 4.3.1 (64-bit)| (default, Dec 23 2016, 11:57:41) [MSC 
v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use 
setLogLevel(newLevel).
17/06/06 10:30:14 WARN NativeCodeLoader: Unable to load native-hadoop library 
for your platform... using builtin-java classes where applicable
17/06/06 10:30:21 WARN ObjectStore: Version information not found in metastore. 
hive.metastore.schema.verification is not enabled so recording the schema 
version 1.2.0
17/06/06 10:30:21 WARN ObjectStore: Failed to get database default, returning 
NoSuchObjectException
Traceback (most recent call last):
  File "C:\spark\spark-2.1.1-bin-hadoop2.7\python\pyspark\sql\utils.py", line 
63, in deco
    return f(*a, **kw)
  File 
"C:\spark\spark-2.1.1-bin-hadoop2.7\python\lib\py4j-0.10.4-src.zip\py4j\protocol.py",
 line 319, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o22.sessionState.
: java.lang.IllegalArgumentException: Error while instantiating 
'org.apache.spark.sql.hive.HiveSessionState':
        at 
org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$reflect(SparkSession.scala:981)
        at 
org.apache.spark.sql.SparkSession.sessionState$lzycompute(SparkSession.scala:110)
        at 
org.apache.spark.sql.SparkSession.sessionState(SparkSession.scala:109)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
        at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
        at py4j.Gateway.invoke(Gateway.java:280)
        at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
        at py4j.commands.CallCommand.execute(CallCommand.java:79)
        at py4j.GatewayConnection.run(GatewayConnection.java:214)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at 
org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$reflect(SparkSession.scala:978)
        ... 13 more
Caused by: java.lang.IllegalArgumentException: Error while instantiating 
'org.apache.spark.sql.hive.HiveExternalCatalog':
        at 
org.apache.spark.sql.internal.SharedState$.org$apache$spark$sql$internal$SharedState$$reflect(SharedState.scala:169)
        at 
org.apache.spark.sql.internal.SharedState.<init>(SharedState.scala:86)
        at 
org.apache.spark.sql.SparkSession$$anonfun$sharedState$1.apply(SparkSession.scala:101)
        at 
org.apache.spark.sql.SparkSession$$anonfun$sharedState$1.apply(SparkSession.scala:101)
        at scala.Option.getOrElse(Option.scala:121)
        at 
org.apache.spark.sql.SparkSession.sharedState$lzycompute(SparkSession.scala:101)
        at org.apache.spark.sql.SparkSession.sharedState(SparkSession.scala:100)
        at 
org.apache.spark.sql.internal.SessionState.<init>(SessionState.scala:157)
        at 
org.apache.spark.sql.hive.HiveSessionState.<init>(HiveSessionState.scala:32)
        ... 18 more
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at 
org.apache.spark.sql.internal.SharedState$.org$apache$spark$sql$internal$SharedState$$reflect(SharedState.scala:166)
        ... 26 more
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at 
org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:264)
        at 
org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:358)
        at 
org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:262)
        at 
org.apache.spark.sql.hive.HiveExternalCatalog.<init>(HiveExternalCatalog.scala:66)
        ... 31 more
Caused by: java.lang.RuntimeException: java.lang.RuntimeException: The root 
scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: 
rw-rw-rw-
        at 
org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)
        at 
org.apache.spark.sql.hive.client.HiveClientImpl.<init>(HiveClientImpl.scala:188)
        ... 39 more
Caused by: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS 
should be writable. Current permissions are: rw-rw-rw-
        at 
org.apache.hadoop.hive.ql.session.SessionState.createRootHDFSDir(SessionState.java:612)
        at 
org.apache.hadoop.hive.ql.session.SessionState.createSessionDirs(SessionState.java:554)
        at 
org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:508)
        ... 40 more


During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\spark\spark-2.1.1-bin-hadoop2.7\bin\..\python\pyspark\shell.py", 
line 43, in <module>
    spark = SparkSession.builder\
  File "C:\spark\spark-2.1.1-bin-hadoop2.7\python\pyspark\sql\session.py", line 
179, in getOrCreate
    session._jsparkSession.sessionState().conf().setConfString(key, value)
  File 
"C:\spark\spark-2.1.1-bin-hadoop2.7\python\lib\py4j-0.10.4-src.zip\py4j\java_gateway.py",
 line 1133, in __call__
  File "C:\spark\spark-2.1.1-bin-hadoop2.7\python\pyspark\sql\utils.py", line 
79, in deco
    raise IllegalArgumentException(s.split(': ', 1)[1], stackTrace)
pyspark.sql.utils.IllegalArgumentException: "Error while instantiating 
'org.apache.spark.sql.hive.HiveSessionState':"
>>>

Any help with what might be going wrong here would be greatly appreciated.

Best
--
Curtis Burkhalter
Postdoctoral Research Associate, National Audubon Society

https://sites.google.com/site/curtisburkhalter/




--
Curtis Burkhalter
Postdoctoral Research Associate, National Audubon Society

https://sites.google.com/site/curtisburkhalter/
