I set the related Spark config values below, but they don't work (they did work
in Spark 2.1.1):
spark.hive.mapred.supports.subdirectories=true
spark.hive.supports.subdirectories=true
spark.mapred.input.dir.recursive=true
And when I run the query, I also set the related Hive configs below, but they don't work either:
mapred.input.dir.recursive=true
hive.mapred.supports.subdirectories=true
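For reference, a minimal sketch of passing these settings when the session is built. The `spark.hadoop.` prefix is Spark's standard way to forward a key into the underlying Hadoop/Hive configuration; whether these keys take effect for Hive table scans still depends on the Spark version, and `build_session` is a hypothetical helper name:

```python
# Settings from the lists above, expressed as session configs.
# The spark.hadoop. prefix forwards the key to the Hadoop Configuration.
RECURSIVE_CONFS = {
    "spark.hadoop.mapred.input.dir.recursive": "true",
    "spark.hadoop.hive.mapred.supports.subdirectories": "true",
    "spark.hive.mapred.supports.subdirectories": "true",
}

def build_session(app_name="Sub-Directory Test"):
    # Imported here so the config dict above stays usable without Spark.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name).enableHiveSupport()
    for key, value in RECURSIVE_CONFS.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```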
I already know that loading a path like
'/user/test/warehouse/somedb.db/dt=20200312/*/' as a DataFrame in pyspark
works. But for complex business logic I need to use spark.sql().
Please advise.
Thanks !
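One hedged sketch of combining that wildcard load with spark.sql(): register the DataFrame as a temporary view and query the view instead of the Hive table. The function and view names are hypothetical, and the parquet file format is an assumption:

```python
def query_partition_subdirs(spark, path, view_name="t"):
    # Load all first-level subdirectories of the partition as one DataFrame
    # (the file format here is an assumption), then expose it to spark.sql()
    # through a temporary view so complex SQL can still be used.
    df = spark.read.parquet(path)
    df.createOrReplaceTempView(view_name)
    return spark.sql(f"select * from {view_name} limit 10")

# Usage (hypothetical path from the example above):
# query_partition_subdirs(spark,
#     "/user/test/warehouse/somedb.db/dt=20200312/*/").show()
```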
* Code
from pyspark.sql import SparkSession

if __name__ == "__main__":
    spark = SparkSession \
        .builder \
        .appName("Sub-Directory Test") \
        .enableHiveSupport() \
        .getOrCreate()

    spark.sql("select * from somedb.table where dt = '20200301' limit 10").show()
* Hive table directory path
/user/test/warehouse/somedb.db/dt=20200312/1/00_0
/user/test/warehouse/somedb.db/dt=20200312/1/00_1
.
.
/user/test/warehouse/somedb.db/dt=20200312/2/00_0
/user/test/warehouse/somedb.db/dt=20200312/3/00_0
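If upgrading to Spark 3.x is an option, the DataFrame reader's recursiveFileLookup option descends into nested layouts like the one above without any Hive-side config. A sketch (the parquet format and function name are assumptions):

```python
def read_recursive(spark, base_path):
    # recursiveFileLookup (Spark 3.0+) makes the reader walk all
    # subdirectories under base_path; partition discovery is disabled
    # when it is enabled. File format here is an assumption.
    return (spark.read
            .option("recursiveFileLookup", "true")
            .parquet(base_path))
```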
--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/