[ https://issues.apache.org/jira/browse/SPARK-24583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16516332#comment-16516332 ]
Apache Spark commented on SPARK-24583:
--------------------------------------

User 'maryannxue' has created a pull request for this issue:
https://github.com/apache/spark/pull/21585

> Wrong schema type in InsertIntoDataSourceCommand
> ------------------------------------------------
>
>                 Key: SPARK-24583
>                 URL: https://issues.apache.org/jira/browse/SPARK-24583
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.3.0
>            Reporter: Maryann Xue
>            Priority: Major
>             Fix For: 2.4.0
>
>
> For a DataSource table whose schema contains a field with "nullable=false", inserting a NULL value into that field causes the input DataFrame to return an incorrect value or throw a NullPointerException. This happens because the schema nullability of the input relation is bluntly overridden with the destination schema by the code below in {{InsertIntoDataSourceCommand}}:
> {code:java}
> override def run(sparkSession: SparkSession): Seq[Row] = {
>   val relation = logicalRelation.relation.asInstanceOf[InsertableRelation]
>   val data = Dataset.ofRows(sparkSession, query)
>   // Apply the schema of the existing table to the new data.
>   val df = sparkSession.internalCreateDataFrame(data.queryExecution.toRdd,
>     logicalRelation.schema)
>   relation.insert(df, overwrite)
>   // Re-cache all cached plans (including this relation itself, if it's cached)
>   // that refer to this data source relation.
>   sparkSession.sharedState.cacheManager.recacheByPlan(sparkSession,
>     logicalRelation)
>   Seq.empty[Row]
> }
> {code}
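For context, here is a minimal sketch of the nullability-override mechanism described above. Since internalCreateDataFrame is private[sql], it uses the public createDataFrame API as a stand-in for the same re-stamping of a schema onto existing rows; the column name and values are illustrative assumptions, not from the report:

{code:scala}
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.types.{IntegerType, StructField, StructType}

object Spark24583Sketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("SPARK-24583 sketch")
      .master("local[1]")
      .getOrCreate()

    // The query's result: one row holding a NULL int. Its real schema is
    // nullable = true, which matches the data.
    val data = spark.createDataFrame(
      spark.sparkContext.parallelize(Seq(Row(null))),
      StructType(Seq(StructField("i", IntegerType, nullable = true))))

    // Re-stamping the rows with the destination table's schema, which
    // declares the column non-nullable (the public analogue of the
    // internalCreateDataFrame call quoted above), tells the planner the
    // column can never be null, which is now untrue.
    val overridden = spark.createDataFrame(
      data.rdd,
      StructType(Seq(StructField("i", IntegerType, nullable = false))))

    // Code paths that trust nullable = false may skip null checks, so
    // reading the null slot can surface a bogus value or a
    // NullPointerException, per the failure modes described in the issue.
    overridden.show()

    spark.stop()
  }
}
{code}

The linked pull request is the authoritative fix; a natural direction, given the description above, would be to stop re-stamping the input rows with the destination schema's nullability and let the insert proceed with the query's own schema, but see the PR for the actual change.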