Hi,

Do you really need to use Hive here? You could instead use the Spark
integration with Ignite, which lets you run SQL through the DataFrame API (
https://apacheignite-fs.readme.io/docs/ignite-data-frame) or through an RDD (
https://apacheignite-fs.readme.io/docs/ignitecontext-igniterdd). This
approach should also be considerably faster.
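As a minimal sketch of the DataFrame route described in the first link: this assumes an Ignite node is reachable via the Spring config file given below (the path is a placeholder), the ignite-spark module is on the classpath, and a SQL table named "test" exists in Ignite. It is not the exact setup from this thread, just an illustration of the API.

```scala
// Sketch: reading an Ignite SQL table directly as a Spark DataFrame,
// bypassing Hive/IGFS entirely.
import org.apache.ignite.spark.IgniteDataFrameSettings._
import org.apache.spark.sql.SparkSession

object IgniteDataFrameExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("ignite-dataframe-example")
      .master("local[*]")
      .getOrCreate()

    // FORMAT_IGNITE tells Spark to use the Ignite data source;
    // OPTION_CONFIG_FILE points at the Ignite Spring configuration
    // (hypothetical path), OPTION_TABLE names the Ignite SQL table.
    val df = spark.read
      .format(FORMAT_IGNITE)
      .option(OPTION_CONFIG_FILE, "/path/to/ignite-config.xml")
      .option(OPTION_TABLE, "test")
      .load()

    // The DataFrame can then be queried with plain Spark SQL.
    df.createOrReplaceTempView("test")
    spark.sql("SELECT * FROM test").show()
  }
}
```

The same lines (minus the `object`/`main` wrapper) can be pasted into spark-shell started with the ignite-spark jars on the classpath.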

Evgenii

On Mon, Sep 10, 2018 at 23:08, Maximiliano Patricio Méndez <
mmen...@despegar.com> wrote:

> Hi,
>
> I'm getting a LinkageError in Spark when trying to read a Hive table whose
> external location is in IGFS:
> java.lang.LinkageError: loader constraint violation: when resolving field
> "LOG" the class loader (instance of
> org/apache/spark/sql/hive/client/IsolatedClientLoader$$anon$1) of the
> referring class, org/apache/hadoop/fs/FileSystem, and the class loader
> (instance of sun/misc/Launcher$AppClassLoader) for the field's resolved
> type, org/apache/commons/logging/Log, have different Class objects for that
> type
>   at
> org.apache.ignite.hadoop.fs.v1.IgniteHadoopFileSystem.initialize(IgniteHadoopFileSystem.java:255)
>
> From what I can see, the exception is thrown when Spark tries to read a
> table from Hive through IGFS, passing the "LOG" field of
> FileSystem down to the HadoopIgfsWrapper (and beyond...).
>
> The steps I followed to reach this error were:
>
>    - Create a file /tmp/test.parquet in HDFS
>    - Create an external table test.test in Hive with location =
>    igfs://igfs@<host>/tmp/test.parquet
>    - Start spark-shell with the command:
>      ./bin/spark-shell --jars
>
> $IGNITE_HOME/ignite-core-2.6.0.jar,$IGNITE_HOME/ignite-hadoop/ignite-hadoop-2.6.0.jar,$IGNITE_HOME/ignite-shmem-1.0.0.jar,$IGNITE_HOME/ignite-spark-2.6.0.jar
>    - Read the table through spark.sql:
>      spark.sql("SELECT * FROM test.test")
>
> Is there a way to avoid this issue? Has anyone used Ignite
> through Hive as an HDFS cache in a similar way?
>
