Hi,

If you are using a Spark JDBC connection then you can do the following
generic JDBC read and write from PySpark. Note that tableName can also be a
SQL query, as long as it is wrapped in parentheses with an alias, e.g.
(select col1, col2 from <table>) t, or passed via the query option instead
of dbtable.

import sys

## load from database
def loadTableFromJDBC(spark, url, tableName, user, password, driver, fetchsize):
    try:
        # tableName can be a table name or a parenthesised sub-query with an alias
        house_df = spark.read. \
            format("jdbc"). \
            option("url", url). \
            option("dbtable", tableName). \
            option("user", user). \
            option("password", password). \
            option("driver", driver). \
            option("fetchsize", fetchsize). \
            load()
        return house_df
    except Exception as e:
        print(f"""{e}, quitting""")
        sys.exit(1)

## write to database
def writeTableWithJDBC(dataFrame, url, tableName, user, password, driver, mode):
    try:
        dataFrame. \
            write. \
            format("jdbc"). \
            option("url", url). \
            option("dbtable", tableName). \
            option("user", user). \
            option("password", password). \
            option("driver", driver). \
            mode(mode). \
            save()
    except Exception as e:
        print(f"""{e}, quitting""")
        sys.exit(1)
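For example, you could call these as below. This is only a minimal sketch:
the JDBC URL, driver class, credentials and table names are placeholders
for illustration, and note the alias on the sub-query when it is passed as
dbtable.

# hypothetical example values, replace with your own
url = "jdbc:mysql://myhost:3306/mydb"
driver = "com.mysql.cj.jdbc.Driver"
sqlQuery = "(select col1, col2 from mytable) t"   # sub-query passed as dbtable needs an alias

house_df = loadTableFromJDBC(spark, url, sqlQuery, "myuser", "mypassword", driver, "1000")
writeTableWithJDBC(house_df, url, "target_table", "myuser", "mypassword", driver, "overwrite")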

I don't know about Azure Analysis Services, but the following will let you
connect to Google BigQuery in the same style; check the Azure documentation
for the equivalent.


## load from Google BigQuery (needs the spark-bigquery connector on the classpath)
def loadTableFromBQ(spark, dataset, tableName):
    try:
        # config holds the path to the GCP service account JSON key
        df = spark.read. \
            format("bigquery"). \
            option("credentialsFile", config['GCPVariables']['jsonKeyFile']). \
            option("dataset", dataset). \
            option("table", tableName). \
            load()
        return df
    except Exception as e:
        print(f"""{e}, quitting""")
        sys.exit(1)

## write to Google BigQuery
def writeTableToBQ(dataFrame, mode, dataset, tableName):
    try:
        dataFrame. \
            write. \
            format("bigquery"). \
            option("credentialsFile", config['GCPVariables']['jsonKeyFile']). \
            mode(mode). \
            option("dataset", dataset). \
            option("table", tableName). \
            save()
    except Exception as e:
        print(f"""{e}, quitting""")
        sys.exit(1)
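A quick usage sketch of the two BigQuery functions. The dataset and table
names here are made-up placeholders, and config['GCPVariables']['jsonKeyFile']
is assumed to point at your service account JSON key.

# hypothetical dataset/table names for illustration only
read_df = loadTableFromBQ(spark, "staging", "house_prices")
read_df = read_df.filter(read_df["price"] > 0)   # any transformation you like
writeTableToBQ(read_df, "append", "analytics", "house_prices_clean")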

HTH


   view my Linkedin profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>



*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.




On Fri, 30 Apr 2021 at 09:32, Farhan Misarwala <farhan.misarw...@gmail.com>
wrote:

> Hi All,
>
> I am trying to read data from 'Azure Analysis Services' using CData's AAS
> JDBC driver <https://cdn.cdata.com/help/OAF/jdbc/pg_JDBCconnectcode.htm>.
>
> I am doing spark.read().jdbc(jdbcURL, aasQuery, connectionProperties)
>
> This errors out as the driver is throwing an exception saying there's no
> 'dbtable' as a connection property. This means spark is trying to set
> 'dbtable' as a connection property on the driver, instead of creating a
> connection and then executing the query I specify in the options 'dbtable'
> or 'query'.
>
> Usually, in my non-spark/vanilla java code, I use java.sql.DriverManager
> to create a connection and then execute a query successfully using
> execute() from java.sql.Statement.
>
> Does the JDBC data source not follow such a flow? If not, then what could
> be the best solution in this case? Please advise. It would be great if I
> could query AAS through our beloved Apache Spark ;)
>
> TIA,
> Farhan.
>
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
