+1 on escaping column names.
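For reference, a minimal sketch of what quoting inside `columnList` might look like. This is an assumption-laden illustration, not the actual Spark implementation: `quoteIdentifier` is a hypothetical helper, and the double-quote character is the ANSI default (the right character is dialect-specific, e.g. backticks for MySQL):

```scala
// Hypothetical helper: wrap an identifier in ANSI double quotes,
// doubling any embedded quote characters per the SQL standard.
def quoteIdentifier(name: String): String =
  "\"" + name.replace("\"", "\"\"") + "\""

// Quoting variant of JDBCRDD.columnList: join the quoted column
// names with commas, falling back to "1" when there are no columns
// (so "SELECT 1 FROM ..." stays valid).
def columnList(columns: Seq[String]): String =
  if (columns.isEmpty) "1"
  else columns.map(quoteIdentifier).mkString(",")
```

With quoting in place, a column like `Earnings/Share` would survive the round trip instead of being lower-case folded by drivers such as PostgreSQL's: `columnList(Seq("Symbol", "Earnings/Share"))` yields `"Symbol","Earnings/Share"`.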
> On Apr 1, 2015, at 5:50 AM, fergjo00 <johngfergu...@gmail.com> wrote:
>
> Question:
> -----------
> Is there a way to have JDBC DataFrames use quoted/escaped column names?
> Right now, it looks like it "sees" the names correctly in the schema it
> creates, but does not escape them in the SQL it generates when they are
> not compliant:
>
> org.apache.spark.sql.jdbc.JDBCRDD
> ....
> private val columnList: String = {
>   val sb = new StringBuilder()
>   columns.foreach(x => sb.append(",").append(x))
>   if (sb.length == 0) "1" else sb.substring(1)
> }
>
> If you see value in this, I would take a shot at adding the quoting
> (escaping) of column names here. If you don't, some drivers, like
> postgresql's, will simply lower-case all unquoted names when parsing the
> query. As you can see in the TL;DR below, that means they won't match the
> schema I am given.
>
> Thanks.
>
> TL;DR:
> --------
> I am able to connect to a Postgres database in the shell (with the driver
> referenced):
>
> val jdbcDf =
>   sqlContext.jdbc("jdbc:postgresql://localhost/sparkdemo?user=dbuser",
>     "sp500")
>
> In fact, when I run:
>
> jdbcDf.registerTempTable("sp500")
> val avgEPSNamed = sqlContext.sql(
>   "SELECT AVG(`Earnings/Share`) as AvgCPI FROM sp500")
>
> and
>
> val avgEPSProg = jdbcDf.agg(avg(jdbcDf.col("Earnings/Share")))
>
> the values come back as expected. However, if I try:
>
> jdbcDf.show
>
> or if I try:
>
> val all = sqlContext.sql("SELECT * FROM sp500")
> all.show
>
> I get errors about column names not being found. In fact, the error
> mentions the column names all lower-cased. For now I will change my schema
> to be more restrictive. Right now it is, per a Stack Overflow poster, not
> ANSI compliant: it does things that are allowed by ""-quoting in pgsql,
> MySQL and SQL Server. BTW, our users are giving us tables like this
> because various tools they already use support non-compliant names. In
> fact, this is mild compared to what we've had to support.
> Currently the schema in question uses mixed-case, quoted names with
> special characters and spaces:
>
> CREATE TABLE sp500
> (
>   "Symbol" text,
>   "Name" text,
>   "Sector" text,
>   "Price" double precision,
>   "Dividend Yield" double precision,
>   "Price/Earnings" double precision,
>   "Earnings/Share" double precision,
>   "Book Value" double precision,
>   "52 week low" double precision,
>   "52 week high" double precision,
>   "Market Cap" double precision,
>   "EBITDA" double precision,
>   "Price/Sales" double precision,
>   "Price/Book" double precision,
>   "SEC Filings" text
> )
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Spark-1-3-0-DataFrame-and-Postgres-tp22338.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
---------------------------------------------------------------------