Apparently this is an OS dynamic library link error.  Make sure you have LD_LIBRARY_PATH (on Linux) or PATH (on Windows) set up properly so the right .so or .dll file can be found. On Windows, this particular UnsatisfiedLinkError usually means hadoop.dll is missing from PATH or does not match the Hadoop version Spark was built against...
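
For what it's worth, a minimal sketch of wiring that up from Python on Windows, assuming the Hadoop native binaries (hadoop.dll, winutils.exe) sit under a hypothetical c:\hadoop\bin. The environment must be set before the JVM starts, i.e. before the first SparkSession/SparkContext is created:

import os

# Hypothetical install location of the Hadoop native binaries; adjust to yours.
hadoop_home = 'c:\\hadoop'
os.environ['HADOOP_HOME'] = hadoop_home
# Prepend bin so the JVM picks up the matching hadoop.dll,
# not a stale copy elsewhere on PATH.
os.environ['PATH'] = os.path.join(hadoop_home, 'bin') + os.pathsep + os.environ['PATH']

If the hadoop.dll version does not match the Hadoop version in use, the same UnsatisfiedLinkError can appear even with PATH set correctly.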

On 12/2/20 5:31 PM, Mich Talebzadeh wrote:
Hi,

I have a simple piece of code that tries to create a Hive Derby database as follows:

from pyspark import SparkContext
from pyspark.sql import SQLContext
from pyspark.sql import HiveContext
from pyspark.sql import SparkSession
from pyspark.sql import Row
from pyspark.sql.types import StringType, ArrayType
from pyspark.sql.functions import udf, col, max as max, to_date, date_add, \
     add_months
from datetime import datetime, timedelta
import os
from os.path import join, abspath
from typing import Optional
import logging
import random
import string
import math

warehouseLocation = 'c:\\Users\\admin\\PycharmProjects\\pythonProject\\spark-warehouse'
local_scrtatchdir = 'c:\\Users\\admin\\PycharmProjects\\pythonProject\\hive-localscratchdir'
scrtatchdir = 'c:\\Users\\admin\\PycharmProjects\\pythonProject\\hive-scratchdir'
tmp_dir = 'd:\\temp\\hive'
metastore_db = 'jdbc:derby:C:\\Users\\admin\\PycharmProjects\\pythonProject\\metastore_db;create=true'
ConnectionDriverName = 'org.apache.derby.EmbeddedDriver'

spark = SparkSession \
     .builder \
     .appName("App1") \
     .config("hive.exec.local.scratchdir", local_scrtatchdir) \
     .config("hive.exec.scratchdir", scrtatchdir) \
     .config("spark.sql.warehouse.dir", warehouseLocation) \
     .config("hadoop.tmp.dir", tmp_dir) \
     .config("javax.jdo.option.ConnectionURL", metastore_db ) \
     .config("javax.jdo.option.ConnectionDriverName", ConnectionDriverName) \
     .enableHiveSupport() \
     .getOrCreate()
print(os.listdir(warehouseLocation))
print(os.listdir(local_scrtatchdir))
print(os.listdir(scrtatchdir))
print(os.listdir(tmp_dir))
sc = SparkContext.getOrCreate()
sqlContext = SQLContext(sc)
hiveContext = HiveContext(sc)  # renamed so the HiveContext class itself is not shadowed
spark.sql("CREATE DATABASE IF NOT EXISTS test")

Now this comes back with the following:


C:\Users\admin\PycharmProjects\pythonProject\venv\Scripts\python.exe C:/Users/admin/PycharmProjects/pythonProject/main.py

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
[]
[]
[]
['hive-localscratchdir', 'hive-scratchdir', 'hive-warehouse']
Traceback (most recent call last):
  File "C:/Users/admin/PycharmProjects/pythonProject/main.py", line 76, in <module>
    spark.sql("CREATE DATABASE IF NOT EXISTS test")
  File "D:\temp\spark\python\pyspark\sql\session.py", line 649, in sql
    return DataFrame(self._jsparkSession.sql(sqlQuery), self._wrapped)
  File "D:\temp\spark\python\lib\py4j-0.10.9-src.zip\py4j\java_gateway.py", line 1305, in __call__
  File "D:\temp\spark\python\pyspark\sql\utils.py", line 134, in deco
    raise_from(converted)
  File "<string>", line 3, in raise_from
pyspark.sql.utils.AnalysisException: java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$Windows.createDirectoryWithMode0(Ljava/lang/String;I)V

Process finished with exit code 1


Also, under %SPARK_HOME%\conf I have a hive-site.xml file. It is not obvious to me why it is throwing this error.
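
For reference, which settings the session actually resolved (and hence whether that hive-site.xml is being picked up at all) can be checked from the live session; a quick sketch against the code above:

print(spark.sparkContext.getConf().get("javax.jdo.option.ConnectionURL"))
print(spark.conf.get("spark.sql.warehouse.dir"))
print(spark.conf.get("spark.sql.catalogImplementation"))  # 'hive' when Hive support is enabled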

Thanks


LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



*Disclaimer:* Use it at your own risk. Any and all responsibility for any loss, damage or destruction of data or any other property which may arise from relying on this email's technical content is explicitly disclaimed. The author will in no case be liable for any monetary damages arising from such loss, damage or destruction.
