Yang Jie created SPARK-48805:
--------------------------------

             Summary: Replace calls to bridged APIs based on SparkSession#sqlContext with SparkSession API
                 Key: SPARK-48805
                 URL: https://issues.apache.org/jira/browse/SPARK-48805
             Project: Spark
          Issue Type: Improvement
          Components: Examples, ML, SQL, Structured Streaming
    Affects Versions: 4.0.0
            Reporter: Yang Jie
In Spark's internal code, there are places where, despite already having a `SparkSession` instance, the bridged APIs based on `SparkSession#sqlContext` are still used. So we can make the following simplifications:

1. `SparkSession#sqlContext#read` -> `SparkSession#read`

```scala
/**
 * Returns a [[DataFrameReader]] that can be used to read non-streaming data in as a
 * `DataFrame`.
 * {{{
 *   sqlContext.read.parquet("/path/to/file.parquet")
 *   sqlContext.read.schema(schema).json("/path/to/file.json")
 * }}}
 *
 * @group genericdata
 * @since 1.4.0
 */
def read: DataFrameReader = sparkSession.read
```

2. `SparkSession#sqlContext#setConf` -> `SparkSession#conf#set`

```scala
/**
 * Set the given Spark SQL configuration property.
 *
 * @group config
 * @since 1.0.0
 */
def setConf(key: String, value: String): Unit = {
  sparkSession.conf.set(key, value)
}
```

3. `SparkSession#sqlContext#getConf` -> `SparkSession#conf#get`

```scala
/**
 * Return the value of Spark SQL configuration property for the given key.
 *
 * @group config
 * @since 1.0.0
 */
def getConf(key: String): String = {
  sparkSession.conf.get(key)
}
```

4. `SparkSession#sqlContext#createDataFrame` -> `SparkSession#createDataFrame`

```scala
/**
 * Creates a DataFrame from an RDD of Product (e.g. case classes, tuples).
 *
 * @group dataframes
 * @since 1.3.0
 */
def createDataFrame[A <: Product : TypeTag](rdd: RDD[A]): DataFrame = {
  sparkSession.createDataFrame(rdd)
}
```

5. `SparkSession#sqlContext#sessionState` -> `SparkSession#sessionState`

```scala
private[sql] def sessionState: SessionState = sparkSession.sessionState
```

6. `SparkSession#sqlContext#sharedState` -> `SparkSession#sharedState`

```scala
private[sql] def sharedState: SharedState = sparkSession.sharedState
```

7. `SparkSession#sqlContext#streams` -> `SparkSession#streams`

```scala
/**
 * Returns a `StreamingQueryManager` that allows managing all the
 * [[org.apache.spark.sql.streaming.StreamingQuery StreamingQueries]] active on `this` context.
 *
 * @since 2.0.0
 */
def streams: StreamingQueryManager = sparkSession.streams
```

8. `SparkSession#sqlContext#uncacheTable` -> `SparkSession#catalog#uncacheTable`

```scala
/**
 * Removes the specified table from the in-memory cache.
 *
 * @group cachemgmt
 * @since 1.3.0
 */
def uncacheTable(tableName: String): Unit = {
  sparkSession.catalog.uncacheTable(tableName)
}
```

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
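Taken together, the replacements above amount to call-site rewrites of the following shape (a sketch only, assuming an existing `SparkSession` named `spark` and an in-scope `RDD` of case-class instances named `rdd`; none of these names come from the issue text):

```scala
// Before: routing through the bridged SQLContext.
spark.sqlContext.read.parquet("/path/to/file.parquet")
spark.sqlContext.setConf("spark.sql.shuffle.partitions", "10")
spark.sqlContext.getConf("spark.sql.shuffle.partitions")
spark.sqlContext.createDataFrame(rdd)
spark.sqlContext.streams
spark.sqlContext.uncacheTable("t")

// After: calling the SparkSession API directly.
spark.read.parquet("/path/to/file.parquet")
spark.conf.set("spark.sql.shuffle.partitions", "10")
spark.conf.get("spark.sql.shuffle.partitions")
spark.createDataFrame(rdd)
spark.streams
spark.catalog.uncacheTable("t")
```

Since each bridged method is just a one-line delegation to the corresponding `SparkSession` member, the rewrite is behavior-preserving and removes one layer of indirection.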