[GitHub] [flink-ml] lindong28 commented on a change in pull request #71: [FLINK-26443] Add benchmark framework

GitBox Fri, 01 Apr 2022 19:58:35 -0700


lindong28 commented on a change in pull request #71:
URL: https://github.com/apache/flink-ml/pull/71#discussion_r840998428




##########
File path: flink-ml-core/src/main/java/org/apache/flink/ml/util/ReadWriteUtils.java
##########
@@ -94,25 +95,29 @@
      */
     public static void saveMetadata(Stage<?> stage, String path, Map<String, ?> extraMetadata)
             throws IOException {
-        // Creates parent directories if not already created.
-        FileSystem fs = mkdirs(path);
-
         Map<String, Object> metadata = new HashMap<>(extraMetadata);
         metadata.put("className", stage.getClass().getName());
         metadata.put("timestamp", System.currentTimeMillis());
         metadata.put("paramMap", jsonEncode(stage.getParamMap()));
         // TODO: add version in the metadata.
         String metadataStr = OBJECT_MAPPER.writeValueAsString(metadata);
 
-        Path metadataPath = new Path(path, "metadata");
-        if (fs.exists(metadataPath)) {
-            throw new IOException("File " + metadataPath + " already exists.");
+        saveToFile(new Path(path, "metadata"), metadataStr);
+    }
+
+    /** Saves a given string to the specified file. */
+    public static void saveToFile(Path path, String content) throws IOException {
+        // Creates parent directories if not already created.
+        FileSystem fs = mkdirs(path.getParent().toString());
+
+        if (fs.exists(path)) {
+            throw new IOException("File " + path + " already exists.");

Review comment:
       Spark ML doesn't overwrite model data files by default. I think we can 
follow the same practice for model data files.
   
   Based on my past experience, I believe command-line tools typically allow 
overwriting the output files specified on the command line. But I don't have an 
example (or counter-example) at hand.
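
   One possible way to support both behaviors is to let the caller choose. 
The sketch below is only an illustration under stated assumptions: it uses 
`java.nio.file` instead of Flink's `FileSystem` so that it stays 
self-contained, and the class name `SaveToFileSketch` and the `overwrite` 
parameter are hypothetical and not part of this PR.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical sketch; not the ReadWriteUtils implementation from this PR. */
final class SaveToFileSketch {

    /**
     * Saves a string to the given file.
     *
     * @param overwrite hypothetical flag: false mimics the "fail if exists" default
     *     discussed for model data files, true mimics a command-line output file.
     */
    static void saveToFile(Path path, String content, boolean overwrite) throws IOException {
        // Creates parent directories if not already created.
        Path parent = path.getParent();
        if (parent != null) {
            Files.createDirectories(parent);
        }

        byte[] bytes = content.getBytes(StandardCharsets.UTF_8);
        if (overwrite) {
            // Replaces any existing content.
            Files.write(
                    path,
                    bytes,
                    StandardOpenOption.CREATE,
                    StandardOpenOption.TRUNCATE_EXISTING,
                    StandardOpenOption.WRITE);
        } else {
            try {
                // CREATE_NEW fails atomically if the file is already present.
                Files.write(path, bytes, StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE);
            } catch (FileAlreadyExistsException e) {
                throw new IOException("File " + path + " already exists.", e);
            }
        }
    }
}
```

   With this shape, model data and metadata would be saved with 
`overwrite = false`, while a benchmark result file named on the command line 
could pass `overwrite = true`.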




