Piotr Milanowski created SPARK-16224:
----------------------------------------

             Summary: Hive context created by HiveContext can't access Hive 
databases when used in a script launched by spark-submit
                 Key: SPARK-16224
                 URL: https://issues.apache.org/jira/browse/SPARK-16224
             Project: Spark
          Issue Type: Bug
          Components: PySpark
    Affects Versions: 2.0.0
         Environment: branch-2.0
            Reporter: Piotr Milanowski


Hi,
This is a continuation of the resolved bug 
[SPARK-15345|https://issues.apache.org/jira/browse/SPARK-15345].

I can access databases when using the new SparkSession API, i.e.:

{code}
from pyspark.sql import SparkSession
from pyspark import SparkConf

if __name__ == "__main__":
    conf = SparkConf()
    hc = SparkSession.builder.config(conf=conf).enableHiveSupport().getOrCreate()
    print(hc.sql("show databases").collect())
{code}
This shows all databases in Hive.

However, using HiveContext, i.e.:
{code}
from pyspark.sql import HiveContext
from pyspark import SparkContext, SparkConf

if __name__ == "__main__":
    conf = SparkConf()
    sc = SparkContext(conf=conf)
    hive_context = HiveContext(sc)
    print(hive_context.sql("show databases").collect())

    # The result is:
    # [Row(result='default')]
{code}
prints only the default database.

I have {{hive-site.xml}} configured.

These snippets are for scripts launched with the {{spark-submit}} command. When 
the same code is run in the {{pyspark}} shell, both fragments work fine and 
display all the databases.
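For reference, a possible workaround (a minimal, unverified sketch, assuming the 
legacy {{HiveContext}} path ends up with the in-memory catalog when launched via 
{{spark-submit}}) is to select the Hive catalog explicitly through 
{{spark.sql.catalogImplementation}} before the {{SparkContext}} is created:

{code}
from pyspark import SparkConf, SparkContext
from pyspark.sql import HiveContext

if __name__ == "__main__":
    # Assumption: explicitly requesting the Hive catalog may restore access
    # to all Hive databases when the script is run through spark-submit.
    conf = SparkConf().set("spark.sql.catalogImplementation", "hive")
    sc = SparkContext(conf=conf)
    hive_context = HiveContext(sc)
    print(hive_context.sql("show databases").collect())
{code}

The same setting can also be passed on the command line via 
{{--conf spark.sql.catalogImplementation=hive}}; whether this actually changes 
the behaviour reported above has not been verified.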


