I want to highlight one point in case I missed it in the original email: the four APIs will not be deleted. They will only be marked with deprecation annotations, and we encourage users to switch to the alternatives.
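To make the migration concrete, here is a minimal sketch of how a two-argument call could be turned into the single-string form. `CatalogNames`, `quotePart` and `qualified` are hypothetical helpers, not part of Spark, and the example names (`sales`, `orders`, `my_catalog`) are made up. The sketch assumes the single-string Catalog methods parse backtick-quoted, dot-separated identifiers. The Spark calls appear only in comments, so the block runs without Spark on the classpath.

```scala
object CatalogNames {
  // Wrap one name part in backticks, doubling any embedded backtick,
  // so that a part containing a dot is not split into two parts.
  def quotePart(part: String): String =
    "`" + part.replace("`", "``") + "`"

  // Join one or more name parts into a single dotted identifier string.
  def qualified(parts: String*): String = {
    require(parts.nonEmpty, "at least one name part is required")
    parts.map(quotePart).mkString(".")
  }
}

object CatalogNamesDemo extends App {
  // Two-part name: database and table.
  println(CatalogNames.qualified("sales", "orders"))
  // Three-part name: catalog, namespace and table.
  println(CatalogNames.qualified("my_catalog", "sales", "orders"))

  // With a SparkSession `spark` in scope (not created here), a call site
  // would change roughly as follows:
  //   spark.catalog.tableExists("sales", "orders")   // form proposed for deprecation
  //   spark.catalog.tableExists(CatalogNames.qualified("sales", "orders"))
}
```

Quoting every part is a conservative choice here. A plain "sales.orders" string should behave the same way when the names contain no special characters.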
-Rui

On Thu, Jul 7, 2022 at 2:23 PM Rui Wang <amaliu...@apache.org> wrote:

> Hi Community,
>
> Proposal:
> I want to discuss a proposal to deprecate the following Catalog APIs:
>   def listColumns(dbName: String, tableName: String): Dataset[Column]
>   def getTable(dbName: String, tableName: String): Table
>   def getFunction(dbName: String, functionName: String): Function
>   def tableExists(dbName: String, tableName: String): Boolean
>
> Context:
> We have been adding support for table identifiers with a catalog name
> (aka 3 layer namespace) to the Catalog API in
> https://issues.apache.org/jira/browse/SPARK-39235.
> The basic idea is, if an API accepts:
> 1. only tableName: String, we allow it to accept "a.b.c" and pass it to
>    the analyzer, which treats a as the catalog name, b as the namespace
>    name and c as the table name.
> 2. only dbName: String, we allow it to accept "a.b" and pass it to the
>    analyzer, which treats a as the catalog name and b as the namespace
>    name.
> Meanwhile we still maintain backwards compatibility for such APIs to
> make sure past behavior remains the same. E.g. if you only use tableName,
> it is still resolved by the session catalog.
>
> With this effort ongoing, the above 4 APIs are not fully compatible with
> the 3 layer namespace.
>
> Take tableExists(dbName: String, tableName: String) as an example: it
> takes two parameters but leaves no room for the extra catalog name.
> Also, if we want to reuse the two parameters, which one should be the one
> that takes more than one name part?
>
> How?
> So how do we improve the above 4 APIs? There are two options:
> a. Expand those four APIs to accept catalog names. For example,
>    tableExists(catalogName: String, dbName: String, tableName: String).
> b. Mark those APIs as `deprecated`.
>
> I am proposing to follow option b, which is API deprecation.
>
> Why?
> 1. Reduce unneeded APIs. The existing APIs can support the same behavior
>    given SPARK-39235. For example, tableExists(dbName, tableName) can be
>    replaced with tableExists("dbName.tableName").
> 2. Reduce incomplete APIs. The APIs proposed for deprecation do not
>    support the 3 layer namespace now, and it is hard to add that support
>    (where would the 3-part name go?).
> 3. Deprecation signals to users that they should migrate their API usage.
> 4. There is existing practice: we deprecated the CreateExternalTable API
>    when adding the CreateTable API:
>    https://github.com/apache/spark/blob/7dcb4bafd02dd43213d3cc4a936c170bda56ddc5/sql/core/src/main/scala/org/apache/spark/sql/catalog/Catalog.scala#L220
>
> What do you think?
>
> Thanks,
> Rui Wang