[jira] [Commented] (SPARK-10625) Spark SQL JDBC read/write is unable to handle JDBC Drivers that adds unserializable objects into connection properties
[ https://issues.apache.org/jira/browse/SPARK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15073036#comment-15073036 ] Peng Cheng commented on SPARK-10625:
I think the pull request has been merged. Can it be closed now?

> Spark SQL JDBC read/write is unable to handle JDBC Drivers that add
> unserializable objects into connection properties
> --
>
> Key: SPARK-10625
> URL: https://issues.apache.org/jira/browse/SPARK-10625
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 1.4.1, 1.5.0
> Environment: Ubuntu 14.04
> Reporter: Peng Cheng
> Labels: jdbc, spark, sparksql
>
> Some JDBC drivers (e.g. SAP HANA) try to optimize connection pooling by
> adding new objects into the connection properties, which Spark then reuses
> and ships to workers. When any of these new objects is not serializable,
> this triggers an org.apache.spark.SparkException: Task not serializable.
> The following test snippet demonstrates the problem using a modified H2
> driver:
>
> test("INSERT to JDBC Datasource with UnserializableH2Driver") {
>   object UnserializableH2Driver extends org.h2.Driver {
>     override def connect(url: String, info: Properties): Connection = {
>       val result = super.connect(url, info)
>       info.put("unserializableDriver", this)
>       result
>     }
>     override def getParentLogger: Logger = ???
>   }
>
>   import scala.collection.JavaConversions._
>   val oldDrivers = DriverManager.getDrivers.filter(_.acceptsURL("jdbc:h2:")).toSeq
>   oldDrivers.foreach(DriverManager.deregisterDriver)
>   DriverManager.registerDriver(UnserializableH2Driver)
>
>   sql("INSERT INTO TABLE PEOPLE1 SELECT * FROM PEOPLE")
>   assert(2 === sqlContext.read.jdbc(url1, "TEST.PEOPLE1", properties).count)
>   assert(2 === sqlContext.read.jdbc(url1, "TEST.PEOPLE1", properties).collect()(0).length)
>
>   DriverManager.deregisterDriver(UnserializableH2Driver)
>   oldDrivers.foreach(DriverManager.registerDriver)
> }

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
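The root cause described above (a driver stuffing a reference to itself into the java.util.Properties it was handed) can be reproduced without Spark. The sketch below is a hypothetical standalone illustration, not code from the actual patch; the class and method names are invented for this example. It shows both the serialization failure and a String-only deep copy of the kind the fix discussion revolves around (java.util.Properties.stringPropertyNames() already filters out entries whose key or value is not a String).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.util.Properties;

public class PropsSanitizer {

    // Copy only String-keyed, String-valued entries, dropping anything
    // a driver may have injected (stringPropertyNames() skips non-String entries).
    static Properties stringOnlyCopy(Properties in) {
        Properties out = new Properties();
        for (String name : in.stringPropertyNames()) {
            out.setProperty(name, in.getProperty(name));
        }
        return out;
    }

    // True iff the object survives Java serialization.
    static boolean isSerializable(Object o) {
        try (ObjectOutputStream oos = new ObjectOutputStream(new ByteArrayOutputStream())) {
            oos.writeObject(o);
            return true;
        } catch (IOException e) { // NotSerializableException is an IOException
            return false;
        }
    }

    public static void main(String[] args) {
        Properties info = new Properties();
        info.setProperty("user", "sa");
        // Simulate a driver injecting a non-serializable object into the properties.
        info.put("unserializableDriver", new Object());

        System.out.println(isSerializable(info));                 // false
        System.out.println(isSerializable(stringOnlyCopy(info))); // true
    }
}
```

A copy like this, taken before the properties are captured in a task closure, keeps a misbehaving driver's mutations on the driver side and ships only plain String settings to the workers.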
[jira] [Commented] (SPARK-10625) Spark SQL JDBC read/write is unable to handle JDBC Drivers that adds unserializable objects into connection properties
[ https://issues.apache.org/jira/browse/SPARK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15073046#comment-15073046 ] Sean Owen commented on SPARK-10625:
It has not been merged. https://github.com/apache/spark/pull/8785
[jira] [Commented] (SPARK-10625) Spark SQL JDBC read/write is unable to handle JDBC Drivers that adds unserializable objects into connection properties
[ https://issues.apache.org/jira/browse/SPARK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15059959#comment-15059959 ] Chandra Sekhar commented on SPARK-10625:
Hi, I built Spark with the pull request; it is now able to handle JDBC drivers that add unserializable objects into connection properties.
[jira] [Commented] (SPARK-10625) Spark SQL JDBC read/write is unable to handle JDBC Drivers that adds unserializable objects into connection properties
[ https://issues.apache.org/jira/browse/SPARK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15059390#comment-15059390 ] Peng Cheng commented on SPARK-10625:
This one: https://github.com/apache/spark/pull/8785 :) The others have already been withdrawn; forgive me for not knowing the rules for pull requests.
[jira] [Commented] (SPARK-10625) Spark SQL JDBC read/write is unable to handle JDBC Drivers that adds unserializable objects into connection properties
[ https://issues.apache.org/jira/browse/SPARK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15052617#comment-15052617 ] Chandra Sekhar commented on SPARK-10625:
Can I test this now? Which version do I have to download that contains this change?
[jira] [Commented] (SPARK-10625) Spark SQL JDBC read/write is unable to handle JDBC Drivers that adds unserializable objects into connection properties
[ https://issues.apache.org/jira/browse/SPARK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15030087#comment-15030087 ] Peng Cheng commented on SPARK-10625:
Yes, it is 2 months behind the master branch. I've just rebased on master and patched all unit test errors (a dependency used by the new JDBCRDD now validates that all connection properties can be cast into String, so the deep copy is now used almost everywhere to correct some particularly undisciplined JDBC driver implementations). Is it mergeable now? If you like, I can rebase on the 1.6 branch and issue a pull request shortly.
[jira] [Commented] (SPARK-10625) Spark SQL JDBC read/write is unable to handle JDBC Drivers that adds unserializable objects into connection properties
[ https://issues.apache.org/jira/browse/SPARK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15029810#comment-15029810 ] Sean Owen commented on SPARK-10625:
[~peng] your patch is not mergeable. Please see the PR review. I assume that's why it can't proceed.
[jira] [Commented] (SPARK-10625) Spark SQL JDBC read/write is unable to handle JDBC Drivers that adds unserializable objects into connection properties
[ https://issues.apache.org/jira/browse/SPARK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15030248#comment-15030248 ] Apache Spark commented on SPARK-10625:
User 'tribbloid' has created a pull request for this issue: https://github.com/apache/spark/pull/10020
[jira] [Commented] (SPARK-10625) Spark SQL JDBC read/write is unable to handle JDBC Drivers that adds unserializable objects into connection properties
[ https://issues.apache.org/jira/browse/SPARK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15030247#comment-15030247 ] Apache Spark commented on SPARK-10625:
User 'tribbloid' has created a pull request for this issue: https://github.com/apache/spark/pull/10021
[jira] [Commented] (SPARK-10625) Spark SQL JDBC read/write is unable to handle JDBC Drivers that adds unserializable objects into connection properties
[ https://issues.apache.org/jira/browse/SPARK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15030256#comment-15030256 ] Peng Cheng commented on SPARK-10625:
Rebased on the 1.5 branch.
[jira] [Commented] (SPARK-10625) Spark SQL JDBC read/write is unable to handle JDBC Drivers that adds unserializable objects into connection properties
[ https://issues.apache.org/jira/browse/SPARK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15030257#comment-15030257 ] Peng Cheng commented on SPARK-10625:
Rebased on the 1.6 branch.
[jira] [Commented] (SPARK-10625) Spark SQL JDBC read/write is unable to handle JDBC Drivers that adds unserializable objects into connection properties
[ https://issues.apache.org/jira/browse/SPARK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15030251#comment-15030251 ] Apache Spark commented on SPARK-10625:
User 'tribbloid' has created a pull request for this issue: https://github.com/apache/spark/pull/10022
[jira] [Commented] (SPARK-10625) Spark SQL JDBC read/write is unable to handle JDBC Drivers that adds unserializable objects into connection properties
[ https://issues.apache.org/jira/browse/SPARK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15030259#comment-15030259 ] Peng Cheng commented on SPARK-10625:
Squash-rebased on master.
[jira] [Commented] (SPARK-10625) Spark SQL JDBC read/write is unable to handle JDBC Drivers that adds unserializable objects into connection properties
[ https://issues.apache.org/jira/browse/SPARK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15029361#comment-15029361 ] Peng Cheng commented on SPARK-10625:
Hi committers, after the 1.5.2 release this is still not merged, and this bugfix is now blocking us from committing other features (namely JDBC dialects for several databases). Can you tell me what should be done next to get it into the next release? (Resolve conflicts? Add more tests? It should be easy.) Thanks a lot, pals. --Yours, Peng
[jira] [Commented] (SPARK-10625) Spark SQL JDBC read/write is unable to handle JDBC Drivers that adds unserializable objects into connection properties
[ https://issues.apache.org/jira/browse/SPARK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941775#comment-14941775 ] Peng Cheng commented on SPARK-10625:

The patch can be merged immediately; can someone verify and merge it?
[jira] [Commented] (SPARK-10625) Spark SQL JDBC read/write is unable to handle JDBC Drivers that adds unserializable objects into connection properties
[ https://issues.apache.org/jira/browse/SPARK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14900980#comment-14900980 ] Peng Cheng commented on SPARK-10625:

Exactly as you described: the driver can always be initialized from the username, password, and URL; the other objects are there for resource pooling or for saving object-creation costs. I'm not sure whether this applies to all drivers, but so far I haven't seen a counter-example.
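Peng's observation (that a connection can always be re-established from the URL plus string credentials) suggests shipping only the string entries to the workers and rebuilding a fresh `Properties` there. A stdlib-only sketch of that approach, under the assumption that tasks are shipped via plain Java serialization; the `ConnTask` class and `roundTrip` helper are illustrative, not taken from the actual pull request:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

public class ConnTask implements Serializable, Runnable {
    private final String url;
    private final Map<String, String> props; // strings only, always serializable

    public ConnTask(String url, Properties polluted) {
        this.url = url;
        this.props = new HashMap<>();
        // Capture only the string entries at task-creation time on the driver side.
        for (String k : polluted.stringPropertyNames()) {
            this.props.put(k, polluted.getProperty(k));
        }
    }

    @Override
    public void run() {
        // On the worker: rebuild a fresh Properties and open the connection from scratch.
        Properties fresh = new Properties();
        props.forEach(fresh::setProperty);
        // A real worker would now call DriverManager.getConnection(url, fresh).
    }

    // Round-trip through Java serialization, mimicking how Spark ships a task.
    static byte[] roundTrip(Object task) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(buf)) {
            oos.writeObject(task);
        }
        return buf.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        Properties polluted = new Properties();
        polluted.setProperty("user", "sa");
        polluted.put("unserializableDriver", new Object()); // the driver's pooling object

        // The task never holds the polluted Properties, so serialization succeeds.
        byte[] bytes = roundTrip(new ConnTask("jdbc:h2:mem:test", polluted));
        System.out.println(bytes.length > 0); // true
    }
}
```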
[jira] [Commented] (SPARK-10625) Spark SQL JDBC read/write is unable to handle JDBC Drivers that adds unserializable objects into connection properties
[ https://issues.apache.org/jira/browse/SPARK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791830#comment-14791830 ] Sean Owen commented on SPARK-10625:

Dumb question here, but if the driver needs these objects, and Spark needs to serialize them, how will that ever work? Does the driver not actually need them, and just use them if they're around?
[jira] [Commented] (SPARK-10625) Spark SQL JDBC read/write is unable to handle JDBC Drivers that adds unserializable objects into connection properties
[ https://issues.apache.org/jira/browse/SPARK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791252#comment-14791252 ] Peng Cheng commented on SPARK-10625:

A pull request has been sent that contains two extra unit tests and a simple fix: https://github.com/apache/spark/pull/8785 Can you help me validate it and merge it into 1.5.1?
[jira] [Commented] (SPARK-10625) Spark SQL JDBC read/write is unable to handle JDBC Drivers that adds unserializable objects into connection properties
[ https://issues.apache.org/jira/browse/SPARK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791420#comment-14791420 ] Apache Spark commented on SPARK-10625:

User 'tribbloid' has created a pull request for this issue: https://github.com/apache/spark/pull/8785
[jira] [Commented] (SPARK-10625) Spark SQL JDBC read/write is unable to handle JDBC Drivers that adds unserializable objects into connection properties
[ https://issues.apache.org/jira/browse/SPARK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14746450#comment-14746450 ] Peng Cheng commented on SPARK-10625:

Branch: https://github.com/Schedule1/spark/tree/SPARK-10625 I will submit a pull request after I fix all the tests.