GitHub user StephanEwen opened a pull request: https://github.com/apache/flink/pull/5246
Various minor improvements to FileSystem and related classes

## What is the purpose of the change

Many small improvements, like
  - harmonization of `FileSystem.mkdirs()` behavior (with test suite)
  - avoiding repeated re-parsing of URIs
  - avoiding repeated regex compilation
  - removal of unneeded and unused methods

## Verifying this change

The change is partially trivial rework and adds some additional unit tests.

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): (yes / **no**)
  - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / **no**)
  - The serializers: (yes / **no** / don't know)
  - The runtime per-record code paths (performance sensitive): (yes / **no** / don't know)
  - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / **no** / don't know)
  - The S3 file system connector: (yes / **no** / don't know)

## Documentation

  - Does this pull request introduce a new feature? (yes / **no**)
  - If yes, how is the feature documented?
(**not applicable** / docs / JavaDocs / not documented)

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/StephanEwen/incubator-flink various_fixes

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/5246.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #5246

----
commit db52d6b22058d106e0bce57a646d68a64adbe5f4
Author: Stephan Ewen <sewen@...>
Date:   2017-10-26T18:54:55Z

    [hotfix] [checkpoints] Remove never used method 'close()' on CheckpointStreamFactory

    The fact that the method was never called (and never implemented) strongly suggests
    that it should be removed, otherwise someone might eventually end up implementing
    it for a new state backend and wonder why it is never called.

commit 40524a6cd9ae4db92bd17cf25af6178487d2921d
Author: Stephan Ewen <sewen@...>
Date:   2017-10-27T17:23:51Z

    [hotfix] [core] Improve local fs exists() performance

    This avoids going through an exception in the case of non-existing files.

commit c147967dc1aed07495fb7a7fb834d11e38eeae1d
Author: Stephan Ewen <sewen@...>
Date:   2017-10-27T17:25:22Z

    [hotfix] [hdfs] Avoid re-parsing URIs for all Hadoop File System calls.

    Previously, this converted Flink paths (internally URIs) to strings and then let
    the Hadoop Paths parse, validate, and normalize the strings to URIs again.

    Now we simply pass the URIs directly.

commit 3771db1d6eef8079796229deb53a6a42f7def33e
Author: Stephan Ewen <sewen@...>
Date:   2017-12-06T13:51:22Z

    [hotfix] [checkpoints] Improve performance of ByteStreamStateHandle

    The input stream from ByteStreamStateHandle did not override the
    'read(byte[], int, int)' method, meaning that bulk byte reads resulted in many
    individual byte accesses.
    Additionally, this change avoids accessing the data array through an outer class,
    but instead adds a reference directly to the input stream class, avoiding one hop
    per access. That also allows a more restricted access level on the fields, which
    may additionally help the JIT compiler in some cases.

commit 355fa00b956b6717ab7ef9350cd59154a85b4091
Author: Stephan Ewen <sewen@...>
Date:   2017-12-06T14:10:22Z

    [hotfix] [core] Avoid redundant File path conversion in LocalFileSystem.getFileStatus(Path)

commit 4f35cf3774e0879c98d61e251283ae955cdc66c0
Author: Stephan Ewen <sewen@...>
Date:   2017-12-07T15:11:24Z

    [FLINK-8373] [core, hdfs] Ensure consistent semantics of FileSystem.mkdirs() across file system implementations.

commit 9cab78ac5c39334aacbb09879d3e1ebf99414185
Author: Stephan Ewen <sewen@...>
Date:   2017-12-13T14:07:52Z

    [hotfix] [core] Pre-compile regex pattern in Path class

commit e8cf409358e81690d4bc66e7497aaad7576a148b
Author: Stephan Ewen <sewen@...>
Date:   2018-01-05T13:11:58Z

    [hotfix] [tests] Remove unnecessary stack trace printing in StreamTaskTest

commit 08dfba993f9eaf64fa44c497a0925e9e494ecaa0
Author: Stephan Ewen <sewen@...>
Date:   2017-12-13T16:06:37Z

    [hotfix] [core] Add a factory method to create Path from local file

    This makes it easier for users and contributors to figure out how to create
    local file paths in a way that works across operating systems.

----

---
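
For readers unfamiliar with the ByteStreamStateHandle commit above, the general technique can be sketched as follows. This is not the code from the pull request: the class names `ByteArrayBulkReadSketch` and `ArrayInputStream` are hypothetical, and the sketch only assumes the standard `java.io.InputStream` contract. The default `InputStream.read(byte[], int, int)` loops over the single-byte `read()`, so overriding it with one `System.arraycopy` turns a bulk read into a single copy. The stream also keeps its own reference to the data array rather than reaching through an outer class.

```java
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical illustration only; not Flink's actual ByteStreamStateHandle code. */
public class ByteArrayBulkReadSketch {

    /** Input stream over a byte array that holds a direct reference to the data. */
    static final class ArrayInputStream extends InputStream {
        private final byte[] data; // direct reference, no hop through an outer class
        private int index;         // position of the next byte to return

        ArrayInputStream(byte[] data) {
            this.data = data;
        }

        @Override
        public int read() {
            // Single-byte read: return the byte as an unsigned value, or -1 at the end.
            return index < data.length ? (data[index++] & 0xFF) : -1;
        }

        @Override
        public int read(byte[] b, int off, int len) {
            // Bounds checks as required by the InputStream contract.
            if (off < 0 || len < 0 || len > b.length - off) {
                throw new IndexOutOfBoundsException();
            }
            if (len == 0) {
                return 0;
            }
            int remaining = data.length - index;
            if (remaining <= 0) {
                return -1; // end of stream
            }
            // Bulk read: one array copy instead of 'len' calls to read().
            int n = Math.min(len, remaining);
            System.arraycopy(data, index, b, off, n);
            index += n;
            return n;
        }
    }

    public static void main(String[] args) throws IOException {
        try (InputStream in = new ArrayInputStream(new byte[] {10, 20, 30, 40, 50})) {
            byte[] buffer = new byte[4];
            int bytesRead = in.read(buffer, 0, buffer.length);
            System.out.println("bytes read in one bulk call: " + bytesRead);
        }
    }
}
```

Callers that wrap the stream (for example in a `DataInputStream`) typically go through the bulk method, which is where an override of this kind would pay off.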