lindong28 commented on a change in pull request #71: URL: https://github.com/apache/flink-ml/pull/71#discussion_r840998428
##########
File path: flink-ml-core/src/main/java/org/apache/flink/ml/util/ReadWriteUtils.java
##########
@@ -94,25 +95,29 @@
      */
     public static void saveMetadata(Stage<?> stage, String path, Map<String, ?> extraMetadata)
             throws IOException {
-        // Creates parent directories if not already created.
-        FileSystem fs = mkdirs(path);
-
         Map<String, Object> metadata = new HashMap<>(extraMetadata);
         metadata.put("className", stage.getClass().getName());
         metadata.put("timestamp", System.currentTimeMillis());
         metadata.put("paramMap", jsonEncode(stage.getParamMap()));
         // TODO: add version in the metadata.
         String metadataStr = OBJECT_MAPPER.writeValueAsString(metadata);

-        Path metadataPath = new Path(path, "metadata");
-        if (fs.exists(metadataPath)) {
-            throw new IOException("File " + metadataPath + " already exists.");
+        saveToFile(new Path(path, "metadata"), metadataStr);
+    }
+
+    /** Saves a given string to the specified file. */
+    public static void saveToFile(Path path, String content) throws IOException {
+        // Creates parent directories if not already created.
+        FileSystem fs = mkdirs(path.getParent().toString());
+
+        if (fs.exists(path)) {
+            throw new IOException("File " + path + " already exists.");

Review comment:
Spark ML doesn't overwrite model data files by default. I think we can follow the same practice for model data files here.

In my experience, output files specified to a command-line tool can typically be overwritten, but I don't currently have an example (or counter-example) to point to.
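To make the two behaviors discussed above concrete, here is a minimal sketch of a save helper that refuses to overwrite by default and overwrites only when explicitly asked. It is not the code in this pull request: it uses `java.nio.file` instead of Flink's `FileSystem` so that it runs standalone, and the class name `SaveToFileSketch` and the `overwrite` parameter are hypothetical, introduced only for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch; not part of ReadWriteUtils.
public class SaveToFileSketch {

    /**
     * Saves a string to the given file. If the file already exists, this fails
     * unless overwrite is true. The overwrite flag is an assumption of this sketch.
     */
    public static void saveToFile(Path path, String content, boolean overwrite)
            throws IOException {
        // Creates parent directories if not already created.
        Path parent = path.getParent();
        if (parent != null) {
            Files.createDirectories(parent);
        }

        byte[] bytes = content.getBytes(StandardCharsets.UTF_8);
        if (overwrite) {
            // Replaces any existing content.
            Files.write(
                    path,
                    bytes,
                    StandardOpenOption.CREATE,
                    StandardOpenOption.TRUNCATE_EXISTING,
                    StandardOpenOption.WRITE);
        } else {
            try {
                // CREATE_NEW fails if the file already exists.
                Files.write(path, bytes, StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE);
            } catch (FileAlreadyExistsException e) {
                throw new IOException("File " + path + " already exists.", e);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("save-sketch");
        Path target = dir.resolve("model").resolve("metadata");

        saveToFile(target, "first", false);
        try {
            saveToFile(target, "second", false);
        } catch (IOException e) {
            System.out.println("Refused to overwrite: " + e.getMessage());
        }

        saveToFile(target, "second", true);
        System.out.println(new String(Files.readAllBytes(target), StandardCharsets.UTF_8));
    }
}
```

With `overwrite` set to `false`, the helper matches the no-overwrite default described for model data files; a command-line tool that wants to allow overwriting its output could pass `true` instead.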