[GitHub] [flink-table-store] JingsongLi commented on a diff in pull request #376: [FLINK-27843] Schema evolution for data file meta

2022-11-17 Thread GitBox


JingsongLi commented on code in PR #376:
URL: https://github.com/apache/flink-table-store/pull/376#discussion_r1025098163


##
flink-table-store-core/src/main/java/org/apache/flink/table/store/file/schema/SchemaFieldTypeExtractor.java:
##
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.store.file.schema;
+
+import org.apache.flink.table.types.logical.RowType;
+
+import java.io.Serializable;
+import java.util.List;
+
+/** Extractor of schema for different tables. */
+public interface SchemaFieldTypeExtractor extends Serializable {
+/**
+ * Extract key type from table schema.
+ *
+ * @param schema the table schema
+ * @return the key type
+ */
+RowType keyType(TableSchema schema);
+
+/**
+ * Extract key fields from table schema.
+ *
+ * @param schema the table schema
+ * @return the key fields
+ */
+List keyFields(TableSchema schema);

Review Comment:
   So can we add this prefix to `List` too? And document this in 
method comments.
   I think this is confuse when I see two methods: `keytype` and `keyfields`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-table-store] JingsongLi commented on a diff in pull request #376: [FLINK-27843] Schema evolution for data file meta

2022-11-16 Thread GitBox


JingsongLi commented on code in PR #376:
URL: https://github.com/apache/flink-table-store/pull/376#discussion_r1024745027


##
flink-table-store-core/src/main/java/org/apache/flink/table/store/file/schema/SchemaFieldTypeExtractor.java:
##
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.store.file.schema;
+
+import org.apache.flink.table.types.logical.RowType;
+
+import java.io.Serializable;
+import java.util.List;
+
+/** Extractor of schema for different tables. */
+public interface SchemaFieldTypeExtractor extends Serializable {
+/**
+ * Extract key type from table schema.
+ *
+ * @param schema the table schema
+ * @return the key type
+ */
+RowType keyType(TableSchema schema);
+
+/**
+ * Extract key fields from table schema.
+ *
+ * @param schema the table schema
+ * @return the key fields
+ */
+List keyFields(TableSchema schema);

Review Comment:
   `(RowType) new RowDataType(false, fields).logicalType` this can convert 
`fields` to `RowType`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-table-store] JingsongLi commented on a diff in pull request #376: [FLINK-27843] Schema evolution for data file meta

2022-11-16 Thread GitBox


JingsongLi commented on code in PR #376:
URL: https://github.com/apache/flink-table-store/pull/376#discussion_r1024714412


##
flink-table-store-core/src/main/java/org/apache/flink/table/store/file/schema/SchemaFieldTypeExtractor.java:
##
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.store.file.schema;
+
+import org.apache.flink.table.types.logical.RowType;
+
+import java.io.Serializable;
+import java.util.List;
+
+/** Extractor of schema for different tables. */
+public interface SchemaFieldTypeExtractor extends Serializable {

Review Comment:
   Rename this to `KeyTypeExtractor`?



##
flink-table-store-core/src/main/java/org/apache/flink/table/store/file/schema/SchemaFieldTypeExtractor.java:
##
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.store.file.schema;
+
+import org.apache.flink.table.types.logical.RowType;
+
+import java.io.Serializable;
+import java.util.List;
+
+/** Extractor of schema for different tables. */
+public interface SchemaFieldTypeExtractor extends Serializable {

Review Comment:
   Rename this to `KeyFieldsExtractor`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-table-store] JingsongLi commented on a diff in pull request #376: [FLINK-27843] Schema evolution for data file meta

2022-11-16 Thread GitBox


JingsongLi commented on code in PR #376:
URL: https://github.com/apache/flink-table-store/pull/376#discussion_r1024713874


##
flink-table-store-core/src/main/java/org/apache/flink/table/store/file/schema/SchemaFieldTypeExtractor.java:
##
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.store.file.schema;
+
+import org.apache.flink.table.types.logical.RowType;
+
+import java.io.Serializable;
+import java.util.List;
+
+/** Extractor of schema for different tables. */
+public interface SchemaFieldTypeExtractor extends Serializable {
+/**
+ * Extract key type from table schema.
+ *
+ * @param schema the table schema
+ * @return the key type
+ */
+RowType keyType(TableSchema schema);
+
+/**
+ * Extract key fields from table schema.
+ *
+ * @param schema the table schema
+ * @return the key fields
+ */
+List keyFields(TableSchema schema);

Review Comment:
   We can just have util to convert `List` to `RowType`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-table-store] JingsongLi commented on a diff in pull request #376: [FLINK-27843] Schema evolution for data file meta

2022-11-16 Thread GitBox


JingsongLi commented on code in PR #376:
URL: https://github.com/apache/flink-table-store/pull/376#discussion_r1024712989


##
flink-table-store-core/src/main/java/org/apache/flink/table/store/file/operation/KeyValueFileStoreScan.java:
##
@@ -83,6 +96,26 @@ protected boolean filterByStats(ManifestEntry entry) {
 return keyFilter == null
 || keyFilter.test(
 entry.file().rowCount(),
-entry.file().keyStats().fields(keyStatsConverter, 
entry.file().rowCount()));
+entry.file()
+.keyStats()
+.fields(
+
getFieldStatsArraySerializer(entry.file().schemaId()),
+entry.file().rowCount()));
+}
+
+private FieldStatsArraySerializer getFieldStatsArraySerializer(long id) {
+return schemaKeyStatsConverters.computeIfAbsent(
+id,
+key -> {
+final TableSchema tableSchema = getTableSchema();

Review Comment:
   `scanSchema`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-table-store] JingsongLi commented on a diff in pull request #376: [FLINK-27843] Schema evolution for data file meta

2022-11-16 Thread GitBox


JingsongLi commented on code in PR #376:
URL: https://github.com/apache/flink-table-store/pull/376#discussion_r1024712686


##
flink-table-store-core/src/main/java/org/apache/flink/table/store/file/operation/AppendOnlyFileStoreScan.java:
##
@@ -84,6 +93,23 @@ protected boolean filterByStats(ManifestEntry entry) {
 entry.file().rowCount(),
 entry.file()
 .valueStats()
-.fields(rowStatsConverter, 
entry.file().rowCount()));
+.fields(
+
getFieldStatsArraySerializer(entry.file().schemaId()),
+entry.file().rowCount()));
+}
+
+private FieldStatsArraySerializer getFieldStatsArraySerializer(long 
schemaId) {
+return schemaRowStatsConverters.computeIfAbsent(
+schemaId,
+id -> {
+TableSchema tableSchema = getTableSchema();

Review Comment:
   `scanSchema`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-table-store] JingsongLi commented on a diff in pull request #376: [FLINK-27843] Schema evolution for data file meta

2022-11-16 Thread GitBox


JingsongLi commented on code in PR #376:
URL: https://github.com/apache/flink-table-store/pull/376#discussion_r1024712478


##
flink-table-store-core/src/main/java/org/apache/flink/table/store/file/operation/AbstractFileStoreScan.java:
##
@@ -244,6 +257,14 @@ public List files() {
 };
 }
 
+protected TableSchema getTableSchema() {

Review Comment:
   `scanTableSchema`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-table-store] JingsongLi commented on a diff in pull request #376: [FLINK-27843] Schema evolution for data file meta

2022-11-16 Thread GitBox


JingsongLi commented on code in PR #376:
URL: https://github.com/apache/flink-table-store/pull/376#discussion_r1024709249


##
flink-table-store-core/src/main/java/org/apache/flink/table/store/file/schema/SchemaEvolutionUtil.java:
##
@@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.store.file.schema;
+
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+/** Utils for schema evolution. */
+public class SchemaEvolutionUtil {
+
+private static final int NULL_FIELD_INDEX = -1;
+
+/**
+ * Create index mapping from table fields to underlying data fields.
+ *
+ * @param tableFields the fields of table
+ * @param dataFields the fields of underlying data
+ * @return the index mapping
+ */
+public static int[] createIndexMapping(

Review Comment:
   `@Nullable`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org