Re: Creating remote tables using PySpark

2024-03-07 Thread Tom Barber
Okay, that was a caching issue. Now there is a shared mount point between the place where the PySpark code is executed and the Spark nodes it runs on. Hrmph, I was hoping that wouldn't be the case. Fair enough!

On Thu, Mar 7, 2024 at 11:23 PM Tom Barber wrote:
> Okay interesting, maybe my assumption

Re: Creating remote tables using PySpark

2024-03-07 Thread Tom Barber
Okay, interesting, maybe my assumption was incorrect, although I'm still confused. I tried to mount a central mount point that would be the same on my local machine and in the container. Same error, although I moved the path to /tmp/hive/data/hive/, but when I rerun the test code to save a table,

Creating remote tables using PySpark

2024-03-07 Thread Tom Barber
Wonder if anyone can sort my brain out here as to what's possible and what isn't. I have a container running Spark, with Hive and a ThriftServer. I want to run code against it remotely. If I take something simple like this:

from pyspark.sql import SparkSession
from pyspark.sql.types import
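
For readers following along, here is a minimal sketch of how a remote PySpark driver might be pointed at a setup like this. It is not the code from the original message, which is cut off in the archive. The host name `spark-host`, the ports 7077 and 9083, the app name, and the helper names `remote_hive_conf` and `build_session` are all assumptions for illustration; the warehouse path is the one mentioned in the replies, on the assumption that it is mounted at the same location on the driver machine and on the Spark nodes.

```python
def remote_hive_conf(master_url, metastore_uri, warehouse_dir):
    """Collect the settings a remote driver would hand to SparkSession.

    All three values are placeholders; substitute whatever the
    container actually exposes.
    """
    return {
        # Standalone master running inside the container (assumed URL)
        "spark.master": master_url,
        # Hive metastore endpoint, passed through as a Hadoop/Hive property
        "spark.hadoop.hive.metastore.uris": metastore_uri,
        # Where managed table data is written; assumed to be a path
        # visible to both the driver and the Spark nodes
        "spark.sql.warehouse.dir": warehouse_dir,
    }


def build_session(conf, app_name="remote-table-test"):
    """Create a Hive-enabled session from the collected settings."""
    # Imported lazily so the config helper works without pyspark installed
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name).enableHiveSupport()
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()


conf = remote_hive_conf(
    master_url="spark://spark-host:7077",       # hypothetical host and port
    metastore_uri="thrift://spark-host:9083",   # hypothetical host and port
    warehouse_dir="/tmp/hive/data/hive",        # path mentioned in the thread
)
```

Calling `build_session(conf)` and then something like `df.write.saveAsTable("some_table")` would be the remote equivalent of the test in the thread. Whether the table data lands where the Spark nodes can read it depends on the warehouse path resolving to the same storage on both sides.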