lindong28 commented on code in PR #193:
URL: https://github.com/apache/flink-ml/pull/193#discussion_r1062011981


##########
flink-ml-benchmark/src/test/java/org/apache/flink/ml/benchmark/BenchmarkTest.java:
##########
@@ -54,6 +60,44 @@ public void testParseJsonFile() throws Exception {
         assertTrue(benchmarks.containsKey("KMeansModel-1"));
     }
 
+    @Test
+    public void testJsonFileLegality() throws IOException, 
ClassNotFoundException {
+        File resourcesDir =

Review Comment:
   Can you help update `testParseJsonFile()` to also use this approach instead 
of having to copy the file to `test-conf.json`?



##########
flink-ml-benchmark/src/main/resources/robustscaler-benchmark.json:
##########
@@ -0,0 +1,42 @@
+// Licensed to the Apache Software Foundation (ASF) under one or more
+// contributor license agreements.  See the NOTICE file distributed with
+// this work for additional information regarding copyright ownership.
+// The ASF licenses this file to You under the Apache License, Version 2.0
+// (the "License"); you may not use this file except in compliance with
+// the License.  You may obtain a copy of the License at
+//
+//    http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+{
+  "version": 1,
+  "robustscaler10000000": {
+    "inputData": {
+      "className": 
"org.apache.flink.ml.benchmark.datagenerator.common.DenseVectorGenerator",
+      "paramMap": {
+        "vectorDim": 100,
+        "colNames": [
+          [
+            "features"
+          ]
+        ],
+        "seed": 2,
+        "numValues": 10000000
+      }
+    },
+    "stage": {
+      "className": "org.apache.flink.ml.feature.robustscaler.RobustScaler",
+      "paramMap": {
+        "inputCol": "features",

Review Comment:
   Would it be simpler to remove this line and use the default value? Same for 
`outputCol`.
   
   Same for similar parameters used in other benchmark json files added in this 
PR.



##########
flink-ml-benchmark/src/test/java/org/apache/flink/ml/benchmark/BenchmarkTest.java:
##########
@@ -54,6 +60,44 @@ public void testParseJsonFile() throws Exception {
         assertTrue(benchmarks.containsKey("KMeansModel-1"));
     }
 
+    @Test
+    public void testJsonFileLegality() throws IOException, 
ClassNotFoundException {
+        File resourcesDir =
+                new File(
+                                this.getClass()
+                                        .getClassLoader()
+                                        .getResource("benchmark-demo.json")
+                                        .getPath())
+                        .getParentFile();
+        File[] jsonFiles =
+                resourcesDir.listFiles(
+                        (dir, name) ->
+                                name.endsWith(".json") && 
!name.equals("benchmark-demo.json"));

Review Comment:
   It might be a bit simpler to remove `name.equals("benchmark-demo.json")` 
since the overhead of parsing this file is negligible compared to all the other 
work we need to do in the CI.



##########
flink-ml-benchmark/src/test/java/org/apache/flink/ml/benchmark/BenchmarkTest.java:
##########
@@ -54,6 +60,44 @@ public void testParseJsonFile() throws Exception {
         assertTrue(benchmarks.containsKey("KMeansModel-1"));
     }
 
+    @Test
+    public void testJsonFileLegality() throws IOException, 
ClassNotFoundException {
+        File resourcesDir =
+                new File(
+                                this.getClass()
+                                        .getClassLoader()
+                                        .getResource("benchmark-demo.json")
+                                        .getPath())
+                        .getParentFile();
+        File[] jsonFiles =
+                resourcesDir.listFiles(
+                        (dir, name) ->
+                                name.endsWith(".json") && 
!name.equals("benchmark-demo.json"));
+
+        for (File file : jsonFiles) {
+            Map<String, Map<String, Map<String, ?>>> benchmarks =
+                    BenchmarkUtils.parseJsonFile(file.getAbsolutePath());
+            for (Map<String, Map<String, ?>> params : benchmarks.values()) {
+                assertTrue(
+                        Arrays.asList("stage", "inputData", "modelData")
+                                .containsAll(params.keySet()));
+
+                WithParams stage = 
ReadWriteUtils.instantiateWithParams(params.get("stage"));
+                assertTrue(stage instanceof Stage);

Review Comment:
   It is nice to add this test šŸ‘
   
   Would it be simpler to just copy/paste the following lines from 
`BenchmarkUtils#runBenchmark` instead using `instanceof`? This approach also 
reduces the `Raw use of parameterized class ...` warnings. 
   
   ```
   Stage stage = ReadWriteUtils.instantiateWithParams(params.get("stage"));
   InputDataGenerator inputDataGenerator =
           ReadWriteUtils.instantiateWithParams(params.get("inputData"));
   DataGenerator modelDataGenerator = null;
   if (params.containsKey("modelData")) {
       modelDataGenerator = 
ReadWriteUtils.instantiateWithParams(params.get("modelData"));
   }
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to