[GitHub] [flink-table-store] ajian2002 commented on a diff in pull request #121: Introduce AggregatuibMergeFunction

2022-05-14 Thread GitBox


ajian2002 commented on code in PR #121:
URL: https://github.com/apache/flink-table-store/pull/121#discussion_r872990828


##
flink-table-store-core/src/main/java/org/apache/flink/table/store/file/mergetree/compact/AggregationMergeFunction.java:
##
@@ -0,0 +1,104 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.store.file.mergetree.compact;
+
+import org.apache.flink.table.data.GenericRowData;
+import org.apache.flink.table.data.RowData;
+
+import javax.annotation.Nullable;
+
+/**
+ * A {@link MergeFunction} where key is primary key (unique) and value is the 
partial record, update
+ * non-null fields on merge.
+ */
+public class AggregationMergeFunction implements MergeFunction {
+
+private static final long serialVersionUID = 1L;
+
+private final RowData.FieldGetter[] getters;
+
+private transient GenericRowData row;
+
+public AggregationMergeFunction(RowData.FieldGetter[] getters) {
+this.getters = getters;
+}
+
+@Override
+public void reset() {
+this.row = new GenericRowData(getters.length);
+}
+
+@Override
+public void add(RowData value) {
+for (int i = 0; i < getters.length; i++)
+{
+Object currentField = getters[i].getFieldOrNull(value);
+Object oldValue = row.getField(i);
+Object result = sum(oldValue, currentField);
+if (result != null)
+{
+row.setField(i, result);
+}
+}
+}
+
+private Object sum(Object oldValue, Object currentField) {
+if (currentField == null)
+{
+return null;
+}
+if (oldValue == null)
+{
+return currentField;
+}
+if (oldValue instanceof Integer && currentField instanceof Integer)
+{
+return Integer.sum((Integer) oldValue, (Integer) currentField);
+}
+else if (oldValue instanceof Long && currentField instanceof Long)
+{
+return Long.sum((Long) oldValue, (Long) currentField);
+}

Review Comment:
   should it be used
   org.apache.flink.api.java.aggregation.AggregationFunction to handle 
aggregation? This is a convenient way to deal with different field types, but 
it also brings some problems. It is difficult for me to deal with delete/update 
RowData#getRowKind. How to solve it?
   1. Modify the original interface 
org.apache.flink.api.java.aggregation.AggregationFunction to add delete and 
update functions
   2. Implement the Aggregator class by yourself, manually switch the class of 
different fields and implement it?
   Is there any other solution
   For the type processing of object/class, I guess flink/flink-table-store 
already has useful tool classes to reduce development workload (such as 
org.apache.flink.api.java.aggregation.AggregationFunction, etc.), you can 
provide guide?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-table-store] ajian2002 commented on a diff in pull request #121: Introduce AggregatuibMergeFunction

2022-05-14 Thread GitBox


ajian2002 commented on code in PR #121:
URL: https://github.com/apache/flink-table-store/pull/121#discussion_r872947138


##
flink-table-store-core/src/main/java/org/apache/flink/table/store/file/mergetree/compact/AggregationMergeFunction.java:
##
@@ -0,0 +1,104 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.store.file.mergetree.compact;
+
+import org.apache.flink.table.data.GenericRowData;
+import org.apache.flink.table.data.RowData;
+
+import javax.annotation.Nullable;
+
+/**
+ * A {@link MergeFunction} where key is primary key (unique) and value is the 
partial record, update
+ * non-null fields on merge.
+ */
+public class AggregationMergeFunction implements MergeFunction {
+
+private static final long serialVersionUID = 1L;
+
+private final RowData.FieldGetter[] getters;
+
+private transient GenericRowData row;
+
+public AggregationMergeFunction(RowData.FieldGetter[] getters) {
+this.getters = getters;
+}
+
+@Override
+public void reset() {
+this.row = new GenericRowData(getters.length);
+}
+
+@Override
+public void add(RowData value) {
+for (int i = 0; i < getters.length; i++)
+{
+Object currentField = getters[i].getFieldOrNull(value);
+Object oldValue = row.getField(i);
+Object result = sum(oldValue, currentField);
+if (result != null)
+{
+row.setField(i, result);
+}
+}
+}
+
+private Object sum(Object oldValue, Object currentField) {
+if (currentField == null)
+{
+return null;
+}
+if (oldValue == null)
+{
+return currentField;
+}
+if (oldValue instanceof Integer && currentField instanceof Integer)
+{
+return Integer.sum((Integer) oldValue, (Integer) currentField);
+}
+else if (oldValue instanceof Long && currentField instanceof Long)
+{
+return Long.sum((Long) oldValue, (Long) currentField);
+}
+else if (oldValue instanceof Double && currentField instanceof Double)
+{
+return Double.sum((Double) oldValue, (Double) currentField);
+}
+else if (oldValue instanceof Float && currentField instanceof Float)
+{
+return Float.sum((Float) oldValue, (Float) currentField);
+}
+else if (oldValue instanceof String && currentField instanceof String)
+{
+return "null";
+}
+return null;

Review Comment:
   Is there a more explicit exception than RuntimeException?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-table-store] ajian2002 commented on a diff in pull request #121: Introduce AggregatuibMergeFunction

2022-05-14 Thread GitBox


ajian2002 commented on code in PR #121:
URL: https://github.com/apache/flink-table-store/pull/121#discussion_r872947013


##
flink-table-store-core/src/main/java/org/apache/flink/table/store/file/mergetree/compact/AggregationMergeFunction.java:
##
@@ -0,0 +1,104 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.store.file.mergetree.compact;
+
+import org.apache.flink.table.data.GenericRowData;
+import org.apache.flink.table.data.RowData;
+
+import javax.annotation.Nullable;
+
+/**
+ * A {@link MergeFunction} where key is primary key (unique) and value is the 
partial record, update
+ * non-null fields on merge.
+ */
+public class AggregationMergeFunction implements MergeFunction {
+
+private static final long serialVersionUID = 1L;
+
+private final RowData.FieldGetter[] getters;
+
+private transient GenericRowData row;
+
+public AggregationMergeFunction(RowData.FieldGetter[] getters) {
+this.getters = getters;
+}
+
+@Override
+public void reset() {
+this.row = new GenericRowData(getters.length);
+}
+
+@Override
+public void add(RowData value) {

Review Comment:
   Should update_before and update_after be supported for Aggregation? I don't 
know how to handle this situation



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-table-store] ajian2002 commented on a diff in pull request #121: Introduce AggregatuibMergeFunction

2022-05-13 Thread GitBox


ajian2002 commented on code in PR #121:
URL: https://github.com/apache/flink-table-store/pull/121#discussion_r872505878


##
flink-table-store-connector/src/test/java/org/apache/flink/table/store/connector/AggregationITCase.java:
##
@@ -0,0 +1,81 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.store.connector;
+
+import org.apache.flink.types.Row;
+import org.junit.Test;
+
+import java.util.Collections;
+import java.util.List;
+import java.util.concurrent.ExecutionException;
+
+import static org.apache.flink.util.CollectionUtil.iteratorToList;
+import static org.assertj.core.api.Assertions.assertThat;
+
+/**
+ * ITCase for partial update.
+ */
+public class AggregationITCase extends FileStoreTableITCase {
+
+@Override
+protected List ddl() {
+return Collections.singletonList("CREATE TABLE IF NOT EXISTS T3 (" + 
"a STRING," + "b INT," + "c INT ," + "PRIMARY KEY (a) NOT ENFORCED )" + " WITH 
('merge-engine'='aggregation' );");
+}
+
+@Test
+public void testMergeInMemory() throws ExecutionException, 
InterruptedException {
+bEnv.executeSql("INSERT INTO T3 VALUES " + "('pk1',1, 2), " + 
"('pk1',1, 2)").await();
+List result = iteratorToList(bEnv.from("T3").execute().collect());
+assertThat(result).containsExactlyInAnyOrder(Row.of("pk1", 2, 4));
+}
+
+@Test
+public void testMergeRead() throws ExecutionException, 
InterruptedException {
+bEnv.executeSql("INSERT INTO T3 VALUES ('pk1',1, 2)").await();
+bEnv.executeSql("INSERT INTO T3 VALUES ('pk1',1, 4)").await();
+bEnv.executeSql("INSERT INTO T3 VALUES ('pk1',2, 0)").await();
+List result = iteratorToList(bEnv.from("T3").execute().collect());
+assertThat(result).containsExactlyInAnyOrder(Row.of("pk1", 4, 6));
+}
+
+
+@Test
+public void testMergeCompaction() throws ExecutionException, 
InterruptedException {
+// Wait compaction
+bEnv.executeSql("ALTER TABLE T3 SET ('commit.force-compact'='true')");
+
+// key pk1
+bEnv.executeSql("INSERT INTO T3 VALUES ('pk1', 3, 1)").await();
+bEnv.executeSql("INSERT INTO T3 VALUES ('pk1', 4, 5)").await();
+bEnv.executeSql("INSERT INTO T3 VALUES ('pk1', 4, 6)").await();
+
+// key pk2
+bEnv.executeSql("INSERT INTO T3 VALUES ('pk2', 6,7)").await();
+bEnv.executeSql("INSERT INTO T3 VALUES ('pk2', 9,0)").await();
+bEnv.executeSql("INSERT INTO T3 VALUES ('pk2', 4,4)").await();
+
+List result = iteratorToList(bEnv.from("T3").execute().collect());
+assertThat(result).containsExactlyInAnyOrder(Row.of("pk1",11,12), 
Row.of("pk2",19,11));
+}
+
+//@Test

Review Comment:
   This test example comes from 
`flink-table-store-connector/src/test/java/org/apache/flink/table/store/connector/PartialUpdateITCase.java`
   I don't know if this test makes sense for 
`flink-table-store-core/src/main/java/org/apache/flink/table/store/file/mergetree/compact/AggregationMergeFunction.java`,
   do I need to keep this test .



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org