[GitHub] spark pull request #20706: [SPARK-23550][core] Cleanup `Utils`.
Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/20706

---
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
Github user vanzin commented on a diff in the pull request:
https://github.com/apache/spark/pull/20706#discussion_r171933520

Diff: core/src/main/scala/org/apache/spark/util/Utils.scala

    @@ -1876,17 +1822,6 @@ private[spark] object Utils extends Logging {
         obj.getClass.getSimpleName.replace("$", "")
       }

    -  /** Return an option that translates JNothing to None */
    -  def jsonOption(json: JValue): Option[JValue] = {

--- End diff --

It encourages writing manual serialization and deserialization code, which is error prone.
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/20706#discussion_r171931822

Diff: core/src/main/scala/org/apache/spark/util/Utils.scala

    @@ -1876,17 +1822,6 @@ private[spark] object Utils extends Logging {
         obj.getClass.getSimpleName.replace("$", "")
       }

    -  /** Return an option that translates JNothing to None */
    -  def jsonOption(json: JValue): Option[JValue] = {

--- End diff --

What is the reason we should not use json4s for new things?
Github user vanzin commented on a diff in the pull request:
https://github.com/apache/spark/pull/20706#discussion_r171927685

Diff: core/src/main/scala/org/apache/spark/util/Utils.scala

    @@ -1018,70 +1016,18 @@ private[spark] object Utils extends Logging {
         " " + (System.currentTimeMillis - startTimeMs) + " ms"
      }

    -  private def listFilesSafely(file: File): Seq[File] = {
    -    if (file.exists()) {
    -      val files = file.listFiles()
    -      if (files == null) {
    -        throw new IOException("Failed to list files for dir: " + file)
    -      }
    -      files
    -    } else {
    -      List()
    -    }
    -  }
    -
    -  /**
    -   * Lists files recursively.
    -   */
    -  def recursiveList(f: File): Array[File] = {

--- End diff --

I disagree that we should keep things in production code "just in case". This is also an example of a function that, if used without care, is dangerous. `Files.walk` is a much better way of achieving this.
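A `Files.walk`-based replacement for the removed helper could look roughly like the sketch below. This is plain Java (the JDK API under discussion), not Spark's actual code; the class and method names are illustrative. Note that the stream returned by `Files.walk` must be closed, which is exactly the kind of care the comment alludes to.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class RecursiveList {
    // List all entries under a directory, recursively, excluding the root itself.
    static List<File> recursiveList(File root) throws IOException {
        // try-with-resources: Files.walk holds open directory handles until closed.
        try (Stream<Path> walk = Files.walk(root.toPath())) {
            return walk.skip(1)                 // drop the root directory entry
                       .map(Path::toFile)
                       .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("walk-demo");
        Files.createFile(dir.resolve("a.txt"));
        Path sub = Files.createDirectory(dir.resolve("sub"));
        Files.createFile(sub.resolve("b.txt"));
        // Yields 3 entries: a.txt, sub, and sub/b.txt.
        System.out.println(recursiveList(dir.toFile()).size());
    }
}
```

Unlike a hand-rolled recursion over `File.listFiles()` (which returns `null` on I/O errors), `Files.walk` surfaces failures as exceptions.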
Github user vanzin commented on a diff in the pull request:
https://github.com/apache/spark/pull/20706#discussion_r171927283

Diff: core/src/main/scala/org/apache/spark/util/Utils.scala

    @@ -1876,17 +1822,6 @@ private[spark] object Utils extends Logging {
         obj.getClass.getSimpleName.replace("$", "")
       }

    -  /** Return an option that translates JNothing to None */
    -  def jsonOption(json: JValue): Option[JValue] = {

--- End diff --

We should not be using json4s for new things, so it's better to have these hidden away.
Github user jiangxb1987 commented on a diff in the pull request:
https://github.com/apache/spark/pull/20706#discussion_r171881215

Diff: core/src/main/scala/org/apache/spark/util/Utils.scala

    @@ -1876,17 +1822,6 @@ private[spark] object Utils extends Logging {
         obj.getClass.getSimpleName.replace("$", "")
       }

    -  /** Return an option that translates JNothing to None */
    -  def jsonOption(json: JValue): Option[JValue] = {

--- End diff --

This is only used in `JsonProtocol` for now, but it is also a useful JSON method that could apply to other JSON operations in the future.
Github user jiangxb1987 commented on a diff in the pull request:
https://github.com/apache/spark/pull/20706#discussion_r171879418

Diff: core/src/main/scala/org/apache/spark/util/Utils.scala (same @@ -1018,70 +1016,18 @@ hunk quoted above, ending at:)

    -  /**
    -   * Lists files recursively.
    -   */
    -  def recursiveList(f: File): Array[File] = {

--- End diff --

Yes, this is only used by test suites for now, but can you still keep this in `Utils`, so we don't have to move it back when we want to use it in the production code? (I assume this may happen someday.)
Github user vanzin commented on a diff in the pull request:
https://github.com/apache/spark/pull/20706#discussion_r171683828

Diff: core/src/main/scala/org/apache/spark/util/Utils.scala

    @@ -267,44 +264,20 @@ private[spark] object Utils extends Logging {
         }
       }

    -  /**
    -   * JDK equivalent of `chmod 700 file`.
    -   *
    -   * @param file the file whose permissions will be modified
    -   * @return true if the permissions were successfully changed, false otherwise.
    -   */
    -  def chmod700(file: File): Boolean = {
    -    file.setReadable(false, false) &&
    -    file.setReadable(true, true) &&
    -    file.setWritable(false, false) &&
    -    file.setWritable(true, true) &&
    -    file.setExecutable(false, false) &&
    -    file.setExecutable(true, true)
    -  }
    -
       /**
        * Create a directory inside the given parent directory. The directory is guaranteed to be
        * newly created, and is not marked for automatic deletion.
        */
       def createDirectory(root: String, namePrefix: String = "spark"): File = {
    -    var attempts = 0
    -    val maxAttempts = MAX_DIR_CREATION_ATTEMPTS
    -    var dir: File = null
    -    while (dir == null) {
    -      attempts += 1
    -      if (attempts > maxAttempts) {
    -        throw new IOException("Failed to create a temp directory (under " + root + ") after " +
    -          maxAttempts + " attempts!")
    -      }
    -      try {
    -        dir = new File(root, namePrefix + "-" + UUID.randomUUID.toString)
    -        if (dir.exists() || !dir.mkdirs()) {
    -          dir = null
    -        }
    -      } catch { case e: SecurityException => dir = null; }
    +    val prefix = namePrefix + "-"
    +    val dir = if (root != null) {
    +      val rootDir = new File(root)
    +      require(rootDir.isDirectory(), s"Root path $root must be an existing directory.")

--- End diff --

Not sure what that point really is, but if you feel so strongly about using the nio APIs, that's easy enough.
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/20706#discussion_r171683202

Diff: core/src/main/scala/org/apache/spark/util/Utils.scala (same @@ -267,44 +264,20 @@ hunk quoted above, ending at:)

    +      val rootDir = new File(root)
    +      require(rootDir.isDirectory(), s"Root path $root must be an existing directory.")

--- End diff --

Yeah, I missed that, sorry. However, my point still stands.
Github user vanzin commented on a diff in the pull request:
https://github.com/apache/spark/pull/20706#discussion_r171681024

Diff: core/src/main/scala/org/apache/spark/util/Utils.scala (same @@ -267,44 +264,20 @@ hunk quoted above, ending at:)

    +      val rootDir = new File(root)
    +      require(rootDir.isDirectory(), s"Root path $root must be an existing directory.")

--- End diff --

This is for a directory, not a file. `File.createTempFile` does not work.
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/20706#discussion_r171676609

Diff: core/src/main/scala/org/apache/spark/util/Utils.scala (same @@ -267,44 +264,20 @@ hunk quoted above, ending at:)

    +      val rootDir = new File(root)
    +      require(rootDir.isDirectory(), s"Root path $root must be an existing directory.")

--- End diff --

That is also an argument for not changing this method at all. Just use `File.createTempFile(...)` and be done with it.
Github user vanzin commented on a diff in the pull request:
https://github.com/apache/spark/pull/20706#discussion_r171674875

Diff: core/src/main/scala/org/apache/spark/util/Utils.scala (same @@ -267,44 +264,20 @@ hunk quoted above, ending at:)

    +      val rootDir = new File(root)
    +      require(rootDir.isDirectory(), s"Root path $root must be an existing directory.")

--- End diff --

Same thing in the end, no? This method needs to return `File` regardless of what it does internally.
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/20706#discussion_r171674429

Diff: core/src/main/scala/org/apache/spark/util/Utils.scala (same @@ -267,44 +264,20 @@ hunk quoted above, ending at:)

    +      val rootDir = new File(root)
    +      require(rootDir.isDirectory(), s"Root path $root must be an existing directory.")

--- End diff --

Why not use the nio APIs? E.g.:

```scala
val rootDir = Paths.get(root)
require(Files.isDirectory(rootDir), s"Root path $root must be an existing directory.")
Files.createTempDirectory(rootDir, prefix)
```
Github user vanzin commented on a diff in the pull request:
https://github.com/apache/spark/pull/20706#discussion_r171667286

Diff: core/src/main/scala/org/apache/spark/util/Utils.scala (same @@ -267,44 +264,20 @@ hunk quoted above, ending at:)

    +    val prefix = namePrefix + "-"

--- End diff --

This code was written when Spark still supported Java 1.6, and `java.nio.file` did not exist.
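For reference, a `java.nio.file` version of this helper could look like the sketch below (plain Java, illustrative names, not Spark's actual replacement; it assumes a POSIX file system). `Files.createTempDirectory` handles name collisions itself, replacing the manual retry loop, and the owner-only permissions that `chmod700` used to set can be passed as a file attribute so the directory is never visible with wider permissions.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.FileAttribute;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

public class TempDirs {
    // Create a uniquely named directory under an existing root, mode 700.
    static Path createDirectory(String root, String namePrefix) throws IOException {
        Path rootDir = Paths.get(root);
        if (!Files.isDirectory(rootDir)) {
            throw new IllegalArgumentException(
                "Root path " + root + " must be an existing directory.");
        }
        // "rwx------" is the POSIX spelling of chmod 700.
        Set<PosixFilePermission> perms = PosixFilePermissions.fromString("rwx------");
        FileAttribute<Set<PosixFilePermission>> attr =
            PosixFilePermissions.asFileAttribute(perms);
        return Files.createTempDirectory(rootDir, namePrefix + "-", attr);
    }
}
```

On a non-POSIX file system (e.g. Windows), `asFileAttribute` would be rejected at creation time, so production code would need a fallback.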
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/20706#discussion_r171666996

Diff: core/src/main/scala/org/apache/spark/util/Utils.scala (same @@ -267,44 +264,20 @@ hunk quoted above, ending at:)

    +    val prefix = namePrefix + "-"

--- End diff --

Was there a reason for rewriting this?
GitHub user vanzin opened a pull request:
https://github.com/apache/spark/pull/20706

[SPARK-23550][core] Cleanup `Utils`.

A few different things going on:
- Remove unused methods.
- Move JSON methods to the only class that uses them.
- Move test-only methods to TestUtils.
- Make getMaxResultSize() a config constant.
- Reuse functionality from existing libraries (JRE or JavaUtils) where possible.

There is a slight change in the behavior of "createDirectory()": it does not create a full directory tree anymore, just the directory being requested. That's better since the code would only set up proper permissions for the leaf directory, and this change requires its parent to already exist.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/vanzin/spark SPARK-23550

Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/20706.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #20706

commit 4b8b8e25f4c7d43616dd09988e1dabce77380cf8
Author: Marcelo Vanzin
Date: 2018-03-01T00:09:52Z

    [SPARK-23550][core] Cleanup `Utils`.

    (commit message identical to the PR description above)
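The behavioral change the PR description calls out — creating only the requested directory instead of the full tree — mirrors the difference between `Files.createDirectories` and `Files.createDirectory` in the JDK. A minimal sketch (illustrative helper names, not Spark's code):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class CreateDirBehavior {
    // Old behavior (File.mkdirs semantics): missing parents are created silently.
    static Path createTree(Path dir) throws IOException {
        return Files.createDirectories(dir);
    }

    // New behavior (File.mkdir semantics): only the leaf is created;
    // fails if the parent does not already exist.
    static Path createLeafOnly(Path dir) throws IOException {
        return Files.createDirectory(dir);
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("demo");

        // Succeeds: both "a" and "a/b" are created.
        createTree(root.resolve("a").resolve("b"));

        // Throws: the parent "missing" does not exist
        // (NoSuchFileException on the default provider).
        try {
            createLeafOnly(root.resolve("missing").resolve("leaf"));
        } catch (IOException e) {
            System.out.println("parent must exist first");
        }
    }
}
```

The leaf-only variant is what lets the caller reason about permissions: only one directory is created, so only one directory needs its mode set.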