Hi Spark Users,

We do a lot of processing in Spark using data that lives in MS SQL Server. Today, I created a DataFrame against a table in SQL Server using the following:

    val dfSql = spark.read.jdbc(connectionString, table, props)

I noticed that every column in the resulting DataFrame shows as nullable=true, even though many of them are required (NOT NULL) in SQL Server. I went hunting in the code and found that when JDBCRDD resolves the schema of a table, it passes alwaysNullable=true to JdbcUtils, which forces every column to resolve as nullable:

https://github.com/apache/spark/blob/branch-2.3/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRDD.scala#L62

I don't see a way to change that behavior. Is this by design, or could it be a bug?

Thanks!
Subhash
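In case it helps anyone hitting the same thing, here is a rough workaround sketch I've been considering (untested against our tables; `connectionString`, `table`, `props`, and the column names are placeholders, not a verified recipe): read via JDBC as usual, then rebuild the DataFrame with a schema whose nullability you correct by hand.

```scala
import org.apache.spark.sql.types.StructType

// Read as usual; the inferred schema will have every column nullable=true.
val dfSql = spark.read.jdbc(connectionString, table, props)

// Columns you know are declared NOT NULL in SQL Server (hypothetical names).
val requiredCols = Set("id", "created_at")

// Copy the inferred schema, flipping nullable=false on the required columns.
val fixedSchema = StructType(dfSql.schema.map { f =>
  if (requiredCols.contains(f.name)) f.copy(nullable = false) else f
})

// createDataFrame with an explicit schema keeps the rows but replaces the
// all-nullable inferred schema with the corrected one.
val dfFixed = spark.createDataFrame(dfSql.rdd, fixedSchema)
```

One caveat: as far as I can tell this only changes the schema metadata; Spark does not re-validate the rows against it, so you'd have to be sure those columns genuinely contain no NULLs.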