[ 
https://issues.apache.org/jira/browse/SPARK-7869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reynold Xin updated SPARK-7869:
-------------------------------
    Description: 
Most of our tables load into DataFrames from Postgres without problems. However, we 
have a number of tables that use the JSONB data type. Spark raises an error and 
refuses to load these tables. While asking Spark to support JSONB might be a 
tall order in the short term, it would be great if Spark could at least load 
the table while ignoring the columns it can't handle, or offer that as an option.
{code}
pdf = sql_context.load(source="jdbc", url=url, dbtable="table_of_json")

Py4JJavaError: An error occurred while calling o41.load.
: java.sql.SQLException: Unsupported type 1111
    at org.apache.spark.sql.jdbc.JDBCRDD$.getCatalystType(JDBCRDD.scala:78)
    at org.apache.spark.sql.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:112)
    at org.apache.spark.sql.jdbc.JDBCRelation.<init>(JDBCRelation.scala:133)
    at org.apache.spark.sql.jdbc.DefaultSource.createRelation(JDBCRelation.scala:121)
    at org.apache.spark.sql.sources.ResolvedDataSource$.apply(ddl.scala:219)
    at org.apache.spark.sql.SQLContext.load(SQLContext.scala:697)
    at org.apache.spark.sql.SQLContext.load(SQLContext.scala:685)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
    at py4j.Gateway.invoke(Gateway.java:259)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.GatewayConnection.run(GatewayConnection.java:207)
    at java.lang.Thread.run(Thread.java:745)
{code}
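
A possible interim workaround, sketched under assumptions rather than taken from the report: type code 1111 is java.sql.Types.OTHER, which is how the JSONB column reaches Spark here. If the JDBC source accepts a parenthesized subquery as dbtable, the JSONB columns can be cast to text (or left out of the column list) on the Postgres side before Spark resolves the schema. The helper name jsonb_safe_dbtable and the column names id and payload are hypothetical.

```python
def jsonb_safe_dbtable(table, columns, jsonb_columns=(), alias="t"):
    """Build a subquery string for the JDBC `dbtable` option.

    Columns named in `jsonb_columns` are cast to text so the driver reports
    them as strings. Columns left out of `columns` are simply not loaded.
    Identifiers are interpolated directly, so they must come from trusted
    code and never from user input.
    """
    jsonb = set(jsonb_columns)
    select_list = ", ".join(
        '"{0}"::text AS "{0}"'.format(c) if c in jsonb else '"{0}"'.format(c)
        for c in columns
    )
    return '(SELECT {0} FROM "{1}") AS {2}'.format(select_list, table, alias)


# Hypothetical column names; replace with the real columns of table_of_json.
dbtable = jsonb_safe_dbtable(
    "table_of_json", ["id", "payload"], jsonb_columns=["payload"]
)

# Assumed usage with the call from the report (needs a live Spark context):
# pdf = sql_context.load(source="jdbc", url=url, dbtable=dbtable)
```

With this approach the JSON arrives as a plain string column, so any parsing would still have to happen in Spark after the load.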


> Spark Data Frame Fails to Load Postgres Tables with JSONB DataType Columns
> --------------------------------------------------------------------------
>
>                 Key: SPARK-7869
>                 URL: https://issues.apache.org/jira/browse/SPARK-7869
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark, SQL
>    Affects Versions: 1.3.0, 1.3.1
>         Environment: Spark 1.3.1
>            Reporter: Brad Willard
>            Priority: Minor


