[ https://issues.apache.org/jira/browse/SPARK-13156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Charles Drotar updated SPARK-13156:
-----------------------------------
    Comment: was deleted

(was: Thanks Sean. The driver inhibiting the concurrent connections was the issue. Apparently the Teradata driver does not support concurrent connections and instead suggests creating different sessions for each query. I don't think this is truly an issue so I will close out the JIRA.)

> JDBC using multiple partitions creates additional tasks but only executes on one
> --------------------------------------------------------------------------------
>
>                 Key: SPARK-13156
>                 URL: https://issues.apache.org/jira/browse/SPARK-13156
>             Project: Spark
>          Issue Type: Bug
>          Components: Input/Output
>    Affects Versions: 1.5.0
>         Environment: Hadoop 2.6.0-cdh5.4.0, Teradata, yarn-client
>            Reporter: Charles Drotar
>
> I can successfully kick off a query through JDBC to Teradata, and when it
> runs it creates a task on each executor for every partition. The problem is
> that all of the tasks except for one complete within a couple seconds and the
> final task handles the entire dataset.
>
> Example Code:
>
> private val properties = new java.util.Properties()
> properties.setProperty("driver","com.teradata.jdbc.TeraDriver")
> properties.setProperty("username","foo")
> properties.setProperty("password","bar")
> val url = "jdbc:teradata://oneview/, TMODE=TERA,TYPE=FASTEXPORT,SESSIONS=10"
> val numPartitions = 5
> val dbTableTemp = "( SELECT id MOD $numPartitions%d AS modulo, id FROM db.table) AS TEMP_TABLE"
> val partitionColumn = "modulo"
> val lowerBound = 0.toLong
> val upperBound = (numPartitions-1).toLong
> val df = sqlContext.read.jdbc(url,dbTableTemp,partitionColumn,lowerBound,upperBound,numPartitions,properties)
> df.write.parquet("/output/path/for/df/")
>
> When I look at the Spark UI I see the 5 tasks, but only 1 is actually
> querying.
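
To make the partitioning arguments in the example easier to reason about, here is a minimal sketch of how a range-partitioned JDBC read might turn `lowerBound`, `upperBound`, and `numPartitions` into one WHERE predicate per task. `PartitionSketch` and `whereClauses` are hypothetical names, not part of Spark's API. The integer-division stride is an assumption about the splitting logic, not a confirmed diagnosis of this issue.

```scala
object PartitionSketch {
  // Hypothetical helper for illustration only; not Spark's implementation.
  def whereClauses(column: String,
                   lower: Long,
                   upper: Long,
                   numPartitions: Int): Seq[String] = {
    // Assumption: the stride is computed with integer (Long) division,
    // so bounds of 0 and 4 with 5 partitions give 4 / 5 - 0 / 5 == 0.
    val stride: Long = upper / numPartitions - lower / numPartitions
    (0 until numPartitions).map { i =>
      // The first partition has no lower predicate, the last has no upper one,
      // so rows outside [lower, upper) still land somewhere.
      val lo = if (i != 0) Some(s"$column >= ${lower + i * stride}") else None
      val hi =
        if (i != numPartitions - 1) Some(s"$column < ${lower + (i + 1) * stride}")
        else None
      (lo.toSeq ++ hi.toSeq).mkString(" AND ")
    }
  }

  def main(args: Array[String]): Unit = {
    // Bounds as in the report: lowerBound = 0, upperBound = numPartitions - 1.
    whereClauses("modulo", 0L, 4L, 5).foreach(println)
    println("---")
    // Same column with upperBound = numPartitions.
    whereClauses("modulo", 0L, 5L, 5).foreach(println)
  }
}
```

Under that assumption, bounds of 0 and 4 give a stride of 0, so the first four predicates match no rows and the last one (`modulo >= 0`) matches every row. With an upper bound of 5 the stride becomes 1 and each of the five `modulo` values gets its own predicate.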