Github user rdblue commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21306#discussion_r238046730

    --- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/catalog/v2/PartitionTransforms.java ---
    @@ -0,0 +1,229 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements. See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License. You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalog.v2;
    +
    +/**
    + * A standard set of transformations that are passed to data sources during table creation.
    + *
    + * @see PartitionTransform
    + */
    +public class PartitionTransforms {
    +  private PartitionTransforms() {
    +  }
    +
    +  /**
    +   * Create a transform for a column with the given name.
    +   * <p>
    +   * This transform is used to pass named transforms that are not known to Spark.
    +   *
    +   * @param transform a name of the transform to apply to the column
    +   * @param colName a column name
    +   * @return an Apply transform for the column
    +   */
    +  public static PartitionTransform apply(String transform, String colName) {
    +    if ("identity".equals(transform)) {
    --- End diff --

I think we should get this done now. Partition transforms are a generalization of Hive partitioning (which uses some columns directly) and bucketing (which is one specific transform).
If we add transformation functions now, we will support both of those with a simple API instead of building in special cases for identity and bucket transforms. I also have a data source that allows users to configure partitioning using more transforms than just identity and bucketing, so I'd like to get this in so that DDL for those tables works.
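
To illustrate the generalization, here is a minimal sketch; it is not code from the pull request. The class TransformSketch, its nested Transform type, and the helpers hivePartitioning and bucketing are hypothetical names invented for this example, and "day" stands in for any source-specific transform that Spark does not know about. The sketch assumes a transform is just a name applied to a column, with an optional bucket count.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch only: these types are NOT the ones in the pull request.
final class TransformSketch {
  /** A named function applied to one column to derive a partition value. */
  static final class Transform {
    final String name;
    final String column;
    final int numBuckets; // only meaningful for "bucket"; 0 otherwise

    Transform(String name, String column, int numBuckets) {
      this.name = name;
      this.column = column;
      this.numBuckets = numBuckets;
    }

    @Override
    public String toString() {
      return numBuckets > 0
          ? name + "(" + numBuckets + ", " + column + ")"
          : name + "(" + column + ")";
    }
  }

  /** Uses the column value directly as the partition value. */
  static Transform identity(String column) {
    return new Transform("identity", column, 0);
  }

  /** Hashes the column value into one of n buckets. */
  static Transform bucket(int n, String column) {
    return new Transform("bucket", column, n);
  }

  /** A transform known only by name, passed through to the data source. */
  static Transform apply(String name, String column) {
    return new Transform(name, column, 0);
  }

  /** Hive-style partitioning: every listed column is an identity transform. */
  static List<Transform> hivePartitioning(String... columns) {
    List<Transform> transforms = new ArrayList<>();
    for (String column : columns) {
      transforms.add(identity(column));
    }
    return transforms;
  }

  /** Bucketing on a single column: one bucket transform. */
  static List<Transform> bucketing(int n, String column) {
    return Collections.singletonList(bucket(n, column));
  }

  public static void main(String[] args) {
    System.out.println(hivePartitioning("date", "region"));
    System.out.println(bucketing(16, "id"));
    System.out.println(apply("day", "ts"));
  }
}
```

Under these assumptions, both existing features reduce to lists of transforms, and a source-specific transform needs no special case in the API.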