Re: problem initiating spark context with pyspark

2017-06-11 Thread Gourav Sengupta
Generally I try to make the best of the amount of memory my system has for
computation. It might help to see how much memory Windows takes just to run
itself, and then compare that with Ubuntu or any other Linux, Unix, or
Solaris system.

But I am not quite sure of the use case, of course.
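
For a rough comparison (illustrative commands only, not a rigorous
benchmark), you can check how much memory the OS is holding before Spark
even starts.

On Windows (cmd):

systeminfo | findstr /C:"Total Physical Memory" /C:"Available Physical Memory"

On Ubuntu or any other Linux:

free -h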

Regards,
Gourav Sengupta

On Sat, Jun 10, 2017 at 11:29 PM, Felix Cheung <felixcheun...@hotmail.com>
wrote:

> [quoted text trimmed; see Felix Cheung's message below]

Re: problem initiating spark context with pyspark

2017-06-10 Thread Felix Cheung
Curtis, assuming you are running a somewhat recent Windows version, you would
not have access to C:\tmp in your command example:

winutils.exe ls -F C:\tmp\hive

Try changing the path to one under your user directory.

Running Spark on Windows should work :)
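
For example, from cmd (illustrative paths; adjust C:\winutils to wherever
your winutils.exe lives):

mkdir %USERPROFILE%\tmp\hive
C:\winutils\bin\winutils.exe chmod 777 %USERPROFILE%\tmp\hive
C:\winutils\bin\winutils.exe ls -F %USERPROFILE%\tmp\hive

And I believe you can point the Hive scratch directory there when launching,
via Spark's spark.hadoop.* passthrough for Hadoop/Hive properties:

bin\pyspark --conf spark.hadoop.hive.exec.scratchdir=%USERPROFILE%\tmp\hive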


From: Curtis Burkhalter <curtisburkhal...@gmail.com>
Sent: Wednesday, June 7, 2017 7:46:56 AM
To: Doc Dwarf
Cc: user@spark.apache.org
Subject: Re: problem initiating spark context with pyspark

[quoted text trimmed; see Curtis Burkhalter's message below]

Re: problem initiating spark context with pyspark

2017-06-10 Thread Marco Mistroni
Ha... it's a one-off. I run Spark on Ubuntu and Docker on Windows... I
don't think Spark and Windows are best friends.

On Jun 10, 2017 6:36 PM, "Gourav Sengupta" wrote:

> [quoted text trimmed; see Gourav Sengupta's message below]

Re: problem initiating spark context with pyspark

2017-06-10 Thread Gourav Sengupta
Seeing for the very first time someone trying Spark on Windows :)

On Thu, Jun 8, 2017 at 8:38 PM, Marco Mistroni wrote:

> [quoted text trimmed; see Marco Mistroni's message below]

Re: problem initiating spark context with pyspark

2017-06-08 Thread Marco Mistroni
Try this link:

http://letstalkspark.blogspot.co.uk/2016/02/getting-started-with-spark-on-window-64.html

It helped me when I had similar problems with Windows...

HTH

On Wed, Jun 7, 2017 at 3:46 PM, Curtis Burkhalter <curtisburkhal...@gmail.com>
wrote:

> [quoted text trimmed; see Curtis Burkhalter's message below]

Re: problem initiating spark context with pyspark

2017-06-07 Thread Curtis Burkhalter
Thanks Doc, I saw this on another board yesterday, so I've tried it by
first going to the directory where I've stored winutils.exe and then, as an
admin, running the command that you suggested, and I get this exception
when checking the permissions:

C:\winutils\bin>winutils.exe ls -F C:\tmp\hive
FindFileOwnerAndPermission error (1789): The trust relationship between
this workstation and the primary domain failed.

I'm fairly new to the command line and to working out what the different
exceptions mean. Do you have any advice on what this error means and how I
might go about fixing it?

Thanks again


On Wed, Jun 7, 2017 at 9:51 AM, Doc Dwarf wrote:

> [quoted text trimmed; see Doc Dwarf's message below]

Re: problem initiating spark context with pyspark

2017-06-07 Thread Doc Dwarf
Hi Curtis,

I believe on Windows the following command needs to be executed (winutils
must be installed):

D:\winutils\bin\winutils.exe chmod 777 D:\tmp\hive
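
For completeness, a minimal end-to-end sketch (paths are illustrative, and
winutils.exe should match the Hadoop build of your Spark download, here
Hadoop 2.7):

set HADOOP_HOME=D:\winutils
mkdir D:\tmp\hive
%HADOOP_HOME%\bin\winutils.exe chmod 777 D:\tmp\hive
%HADOOP_HOME%\bin\winutils.exe ls -F D:\tmp\hive

HADOOP_HOME should point at the directory containing bin\winutils.exe; the
chmod opens up the \tmp\hive scratch directory that Spark's Hive support
tries to write to on startup.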



On 6 June 2017 at 09:45, Curtis Burkhalter 
wrote:

> Hello all,
>
> I'm new to Spark and I'm trying to interact with it using PySpark. I'm
> using the prebuilt version of Spark 2.1.1, and when I go to the command
> line and use the command 'bin\pyspark' I have initialization problems and
> get the following message:
>
> C:\spark\spark-2.1.1-bin-hadoop2.7> bin\pyspark
> Python 3.6.0 |Anaconda 4.3.1 (64-bit)| (default, Dec 23 2016, 11:57:41)
> [MSC v.1900 64 bit (AMD64)] on win32
> Type "help", "copyright", "credits" or "license" for more information.
> Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
> Setting default log level to "WARN".
> To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use
> setLogLevel(newLevel).
> 17/06/06 10:30:14 WARN NativeCodeLoader: Unable to load native-hadoop
> library for your platform... using builtin-java classes where applicable
> 17/06/06 10:30:21 WARN ObjectStore: Version information not found in
> metastore. hive.metastore.schema.verification is not enabled so recording
> the schema version 1.2.0
> 17/06/06 10:30:21 WARN ObjectStore: Failed to get database default,
> returning NoSuchObjectException
> Traceback (most recent call last):
>   File "C:\spark\spark-2.1.1-bin-hadoop2.7\python\pyspark\sql\utils.py",
> line 63, in deco
>     return f(*a, **kw)
>   File "C:\spark\spark-2.1.1-bin-hadoop2.7\python\lib\py4j-0.10.4-src.zip\py4j\protocol.py",
> line 319, in get_return_value
> py4j.protocol.Py4JJavaError: An error occurred while calling o22.sessionState.
> : java.lang.IllegalArgumentException: Error while instantiating
> 'org.apache.spark.sql.hive.HiveSessionState':
>         at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$reflect(SparkSession.scala:981)
>         at org.apache.spark.sql.SparkSession.sessionState$lzycompute(SparkSession.scala:110)
>         at org.apache.spark.sql.SparkSession.sessionState(SparkSession.scala:109)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
>         at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
>         at py4j.Gateway.invoke(Gateway.java:280)
>         at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
>         at py4j.commands.CallCommand.execute(CallCommand.java:79)
>         at py4j.GatewayConnection.run(GatewayConnection.java:214)
>         at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.reflect.InvocationTargetException
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>         at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>         at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>         at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$reflect(SparkSession.scala:978)
>         ... 13 more
> Caused by: java.lang.IllegalArgumentException: Error while instantiating
> 'org.apache.spark.sql.hive.HiveExternalCatalog':
>         at org.apache.spark.sql.internal.SharedState$.org$apache$spark$sql$internal$SharedState$$reflect(SharedState.scala:169)
>         at org.apache.spark.sql.internal.SharedState.<init>(SharedState.scala:86)
>         at org.apache.spark.sql.SparkSession$$anonfun$sharedState$1.apply(SparkSession.scala:101)
>         at org.apache.spark.sql.SparkSession$$anonfun$sharedState$1.apply(SparkSession.scala:101)
>         at scala.Option.getOrElse(Option.scala:121)
>         at org.apache.spark.sql.SparkSession.sharedState$lzycompute(SparkSession.scala:101)
>         at org.apache.spark.sql.SparkSession.sharedState(SparkSession.scala:100)
>         at org.apache.spark.sql.internal.SessionState.<init>(SessionState.scala:157)
>         at org.apache.spark.sql.hive.HiveSessionState.<init>(HiveSessionState.scala:32)
>         ... 18 more
> Caused by: java.lang.reflect.InvocationTargetException
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>         at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>         at java.lang.reflect.Constructor.newInstance(Constructor.java:423)