[jira] [Created] (SPARK-32380) sparksql cannot access hbase external table in hive

deyzhong (Jira) Tue, 21 Jul 2020 04:11:43 -0700

deyzhong created SPARK-32380:
--------------------------------

             Summary: sparksql cannot access hbase external table in hive
                 Key: SPARK-32380
                 URL: https://issues.apache.org/jira/browse/SPARK-32380
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 3.0.0
         Environment: ||component||version||
|hadoop|2.8.5|
|hive|2.3.7|
|spark|3.0.0|
|hbase|1.4.9|
            Reporter: deyzhong



java.io.IOException: Cannot create a record reader because of a previous error. 
Please look at the previous logs lines from the task's full log for more 
details.
 at 
org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.getSplits(TableInputFormatBase.java:270)
 at org.apache.spark.rdd.NewHadoopRDD.getPartitions(NewHadoopRDD.scala:131)
 at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:276)
 at scala.Option.getOrElse(Option.scala:189)
 at org.apache.spark.rdd.RDD.partitions(RDD.scala:272)
 at 
org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49)
 at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:276)
 at scala.Option.getOrElse(Option.scala:189)
 at org.apache.spark.rdd.RDD.partitions(RDD.scala:272)
 at 
org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49)
 at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:276)
 at scala.Option.getOrElse(Option.scala:189)
 at org.apache.spark.rdd.RDD.partitions(RDD.scala:272)
 at 
org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49)
 at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:276)
 at scala.Option.getOrElse(Option.scala:189)
 at org.apache.spark.rdd.RDD.partitions(RDD.scala:272)
 at 
org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49)
 at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:276)
 at scala.Option.getOrElse(Option.scala:189)
 at org.apache.spark.rdd.RDD.partitions(RDD.scala:272)
 at org.apache.spark.SparkContext.runJob(SparkContext.scala:2158)
 at org.apache.spark.rdd.RDD.$anonfun$collect$1(RDD.scala:1004)
 at 
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
 at 
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
 at org.apache.spark.rdd.RDD.withScope(RDD.scala:388)
 at org.apache.spark.rdd.RDD.collect(RDD.scala:1003)
 at org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:385)
 at 
org.apache.spark.sql.execution.SparkPlan.executeCollectPublic(SparkPlan.scala:412)
 at 
org.apache.spark.sql.execution.HiveResult$.hiveResultString(HiveResult.scala:58)
 at 
org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.$anonfun$run$1(SparkSQLDriver.scala:65)
 at 
org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:100)
 at 
org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:160)
 at 
org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:87)
 at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:763)
 at 
org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64)
 at 
org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.run(SparkSQLDriver.scala:65)
 at 
org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processCmd(SparkSQLCLIDriver.scala:377)
 at 
org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.$anonfun$processLine$1(SparkSQLCLIDriver.scala:496)
 at scala.collection.Iterator.foreach(Iterator.scala:941)
 at scala.collection.Iterator.foreach$(Iterator.scala:941)
 at scala.collection.AbstractIterator.foreach(Iterator.scala:1429)
 at scala.collection.IterableLike.foreach(IterableLike.scala:74)
 at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
 at scala.collection.AbstractIterable.foreach(Iterable.scala:56)
 at 
org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processLine(SparkSQLCLIDriver.scala:490)
 at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:336)
 at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:474)
 at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:490)
 at 
org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:206)
 at 
org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
 at 
org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:928)
 at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
 at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
 at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
 at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1007)
 at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1016)
 at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.IllegalStateException: The input format instance has not 
been properly initialized. Ensure you call initializeTable either in your 
constructor or initialize method
 at 
org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.getTable(TableInputFormatBase.java:652)
 at 
org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.getSplits(TableInputFormatBase.java:265)
 ... 62 more
java.io.IOException: Cannot create a record reader because of a previous error. 
Please look at the previous logs lines from the task's full log for more 
details.
 at 
org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.getSplits(TableInputFormatBase.java:270)
 at org.apache.spark.rdd.NewHadoopRDD.getPartitions(NewHadoopRDD.scala:131)
 at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:276)
 at scala.Option.getOrElse(Option.scala:189)
 at org.apache.spark.rdd.RDD.partitions(RDD.scala:272)
 at 
org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49)
 at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:276)
 at scala.Option.getOrElse(Option.scala:189)
 at org.apache.spark.rdd.RDD.partitions(RDD.scala:272)
 at 
org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49)
 at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:276)
 at scala.Option.getOrElse(Option.scala:189)
 at org.apache.spark.rdd.RDD.partitions(RDD.scala:272)
 at 
org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49)
 at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:276)
 at scala.Option.getOrElse(Option.scala:189)
 at org.apache.spark.rdd.RDD.partitions(RDD.scala:272)
 at 
org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49)
 at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:276)
 at scala.Option.getOrElse(Option.scala:189)
 at org.apache.spark.rdd.RDD.partitions(RDD.scala:272)
 at org.apache.spark.SparkContext.runJob(SparkContext.scala:2158)
 at org.apache.spark.rdd.RDD.$anonfun$collect$1(RDD.scala:1004)
 at 
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
 at 
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
 at org.apache.spark.rdd.RDD.withScope(RDD.scala:388)
 at org.apache.spark.rdd.RDD.collect(RDD.scala:1003)
 at org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:385)
 at 
org.apache.spark.sql.execution.SparkPlan.executeCollectPublic(SparkPlan.scala:412)
 at 
org.apache.spark.sql.execution.HiveResult$.hiveResultString(HiveResult.scala:58)
 at 
org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.$anonfun$run$1(SparkSQLDriver.scala:65)
 at 
org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:100)
 at 
org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:160)
 at 
org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:87)
 at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:763)
 at 
org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64)
 at 
org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.run(SparkSQLDriver.scala:65)
 at 
org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processCmd(SparkSQLCLIDriver.scala:377)
 at 
org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.$anonfun$processLine$1(SparkSQLCLIDriver.scala:496)
 at scala.collection.Iterator.foreach(Iterator.scala:941)
 at scala.collection.Iterator.foreach$(Iterator.scala:941)
 at scala.collection.AbstractIterator.foreach(Iterator.scala:1429)
 at scala.collection.IterableLike.foreach(IterableLike.scala:74)
 at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
 at scala.collection.AbstractIterable.foreach(Iterable.scala:56)
 at 
org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processLine(SparkSQLCLIDriver.scala:490)
 at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:336)
 at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:474)
 at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:490)
 at 
org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:206)
 at 
org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
 at 
org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:928)
 at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
 at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
 at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
 at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1007)
 at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1016)
 at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.IllegalStateException: The input format instance has not 
been properly initialized. Ensure you call initializeTable either in your 
constructor or initialize method
 at 
org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.getTable(TableInputFormatBase.java:652)
 at 
org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.getSplits(TableInputFormatBase.java:265)
 ... 62 more



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

[jira] [Created] (SPARK-32380) sparksql cannot access hbase external table in hive

Reply via email to