Hi,
I want to select from a Parquet-based table in Shark, but I receive the following error:
shark> select * from wl_parquet;
14/04/17 11:33:49 INFO shark.SharkCliDriver: Execution Mode: shark
14/04/17 11:33:49 INFO ql.Driver: <PERFLOG method=Driver.run>
14/04/17 11:33:49 INFO ql.Driver: <PERFLOG method=TimeToSubmit>
14/04/17 11:33:49 INFO ql.Driver: <PERFLOG method=compile>
14/04/17 11:33:49 INFO parse.ParseDriver: Parsing command: select * from wl_parquet
14/04/17 11:33:49 INFO parse.ParseDriver: Parse Completed
14/04/17 11:33:49 INFO parse.SharkSemanticAnalyzer: Get metadata for source tables
FAILED: Hive Internal Error: java.lang.RuntimeException(java.lang.ClassNotFoundException: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat)
14/04/17 11:33:50 ERROR shark.SharkDriver: FAILED: Hive Internal Error: java.lang.RuntimeException(java.lang.ClassNotFoundException: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat)
java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat
at org.apache.hadoop.hive.ql.metadata.Table.getInputFormatClass(Table.java:306)
at org.apache.hadoop.hive.ql.metadata.Table.<init>(Table.java:99)
at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:988)
at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:891)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1083)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1059)
at shark.parse.SharkSemanticAnalyzer.analyzeInternal(SharkSemanticAnalyzer.scala:137)
at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:279)
at shark.SharkDriver.compile(SharkDriver.scala:215)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:337)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:909)
at shark.SharkCliDriver.processCmd(SharkCliDriver.scala:338)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
at shark.SharkCliDriver$.main(SharkCliDriver.scala:235)
at shark.SharkCliDriver.main(SharkCliDriver.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:270)
at org.apache.hadoop.hive.ql.metadata.Table.getInputFormatClass(Table.java:302)
... 14 more
I can successfully select from that table with Hive and Impala, but the same
query fails in Shark. I am using CDH5, including the Spark parcel, and Shark 0.9.1.
Which jar contains this class, and how can I get rid of this exception?
The lib folder of Shark contains:
[root@hadoop-pg-9 shark-0.9.1]# ll lib
total 180
lrwxrwxrwx 1 root root 67 16. Apr 14:17 hive-serdes-1.0-SNAPSHOT.jar -> /opt/cloudera/parcels/CDH/lib/hive/lib/hive-serdes-1.0-SNAPSHOT.jar
-rwxrwxr-x 1 root root 23086 9. Apr 10:57 JavaEWAH-0.4.2.jar
lrwxrwxrwx 1 root root 53 14. Apr 21:46 parquet-avro.jar -> /opt/cloudera/parcels/CDH/lib/hadoop/parquet-avro.jar
lrwxrwxrwx 1 root root 58 14. Apr 21:46 parquet-cascading.jar -> /opt/cloudera/parcels/CDH/lib/hadoop/parquet-cascading.jar
lrwxrwxrwx 1 root root 55 14. Apr 21:46 parquet-column.jar -> /opt/cloudera/parcels/CDH/lib/hadoop/parquet-column.jar
lrwxrwxrwx 1 root root 55 14. Apr 21:46 parquet-common.jar -> /opt/cloudera/parcels/CDH/lib/hadoop/parquet-common.jar
lrwxrwxrwx 1 root root 57 14. Apr 21:46 parquet-encoding.jar -> /opt/cloudera/parcels/CDH/lib/hadoop/parquet-encoding.jar
lrwxrwxrwx 1 root root 55 14. Apr 21:46 parquet-format.jar -> /opt/cloudera/parcels/CDH/lib/hadoop/parquet-format.jar
lrwxrwxrwx 1 root root 58 14. Apr 21:46 parquet-generator.jar -> /opt/cloudera/parcels/CDH/lib/hadoop/parquet-generator.jar
lrwxrwxrwx 1 root root 62 14. Apr 21:46 parquet-hadoop-bundle.jar -> /opt/cloudera/parcels/CDH/lib/hadoop/parquet-hadoop-bundle.jar
lrwxrwxrwx 1 root root 55 14. Apr 21:46 parquet-hadoop.jar -> /opt/cloudera/parcels/CDH/lib/hadoop/parquet-hadoop.jar
-rw-r--r-- 1 root root 70103 27. Nov 21:24 parquet-hive-1.2.8.jar
lrwxrwxrwx 1 root root 56 14. Apr 21:46 parquet-scrooge.jar -> /opt/cloudera/parcels/CDH/lib/hadoop/parquet-scrooge.jar
lrwxrwxrwx 1 root root 55 14. Apr 21:46 parquet-thrift.jar -> /opt/cloudera/parcels/CDH/lib/hadoop/parquet-thrift.jar
-rw-rw-r-- 1 root root 76220 9. Apr 10:57 pyrolite.jar
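To check which of these jars (if any) actually ships the missing class, something like the following could be run against the lib folder. This is only a rough sketch: the function name jars_containing and the default directory "lib" are placeholders chosen for illustration, and it assumes each jar is an ordinary zip archive in which the class appears as a path ending in .class. An empty result would suggest the class is not in the lib folder at all.

```python
import sys
import zipfile
from pathlib import Path

# The class named in the ClassNotFoundException above.
CLASS_NAME = "org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat"


def jars_containing(class_name, lib_dir):
    """Return the *.jar files directly under lib_dir that contain class_name.

    Hypothetical helper for illustration; it is not part of Shark or Hive.
    """
    # Inside a jar, a class is stored as a path such as org/apache/.../Name.class
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for jar in sorted(Path(lib_dir).glob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if entry in zf.namelist():
                    hits.append(jar)
        except (zipfile.BadZipFile, OSError):
            # Skip broken symlinks and files that are not valid archives.
            continue
    return hits


if __name__ == "__main__":
    # Directory to scan; defaults to ./lib when no argument is given.
    lib_dir = sys.argv[1] if len(sys.argv) > 1 else "lib"
    for jar in jars_containing(CLASS_NAME, lib_dir):
        print(jar)
```

The same script can be pointed at other directories, for example the Hive lib folder of the CDH parcel, by passing the directory as the first argument.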
Thanks in advance,
Gerd