GitHub user chamikaramj opened a pull request:
https://github.com/apache/beam/pull/2770
[BEAM-539] Fixes several issues of FileSink
(1) Updates FileSink to fail for file name prefixes that contain only a
single component (for example, a bare GCS bucket).
Currently FileSink fails for gs://aaa but passes for gs://aaa/. This
change makes FileSink fail in both cases (and makes the behavior
consistent with Java).
(2) Updates the name of the temporary directory created by FileSink.
Currently, for a file name prefix 'gs://aaa/bbb', the temp path is of
the form 'gs://aaa/bbb-temp-...'.
This is error prone because a user pattern 'gs://aaa/bbb*' would also match
the temp files. This change makes the temp path format
'gs://aaa/beam-temp-bbb-...' instead.
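The matching problem in (2) can be illustrated with a small sketch. It uses
Python's fnmatch module as a stand-in for file system globbing, which is an
assumption; the 'EXAMPLE' suffix and the shard file names are hypothetical
and only fill in for the elided part of the temp path.

```python
import fnmatch

# Pattern a user might use to read back the output written under 'gs://aaa/bbb'.
user_pattern = "gs://aaa/bbb*"

# Hypothetical paths; 'EXAMPLE' stands in for the elided temp suffix.
old_style_temp = "gs://aaa/bbb-temp-EXAMPLE/part-0"
new_style_temp = "gs://aaa/beam-temp-bbb-EXAMPLE/part-0"
final_output = "gs://aaa/bbb-00000-of-00001"


def matches(path):
    """Return True if the path matches the user's glob pattern."""
    return fnmatch.fnmatchcase(path, user_pattern)


if __name__ == "__main__":
    for path in (old_style_temp, new_style_temp, final_output):
        print(path, matches(path))
```

With the old naming, the temp file shares the 'bbb' prefix and is picked up by
the pattern; with the new naming, only the final output matches.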
To achieve the above, this PR adds a 'split()' method to the FileSystem
interface. It is analogous to Python's 'os.path.split()' and is the inverse
of the existing FileSystem.join() method.
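A minimal sketch of how such a split could support both changes is shown
below. It is not the code in this PR: the names split_path() and
temp_dir_for(), the error messages, and the 'unique_suffix' argument are all
hypothetical, and the sketch only handles 'scheme://...' style paths.

```python
def split_path(path):
    """Hypothetical analogue of os.path.split() for 'scheme://...' paths.

    Returns (head, tail), where tail is the last path component and is
    empty when the path has only a single component (e.g. a bucket).
    """
    scheme, sep, rest = path.partition("://")
    if not sep:
        raise ValueError("expected a path of the form scheme://...: %r" % path)
    head, slash, tail = rest.rpartition("/")
    if not slash:
        # Only one component after the scheme, e.g. 'gs://aaa'.
        return scheme + sep + rest, ""
    return scheme + sep + head, tail


def temp_dir_for(file_path_prefix, unique_suffix):
    """Build a temp directory path of the form <head>/beam-temp-<tail>-<suffix>."""
    head, tail = split_path(file_path_prefix)
    if not tail:
        # Covers both 'gs://aaa' and 'gs://aaa/'.
        raise ValueError(
            "file name prefix must contain more than one component: %r"
            % file_path_prefix)
    return "%s/beam-temp-%s-%s" % (head, tail, unique_suffix)


if __name__ == "__main__":
    print(split_path("gs://aaa/bbb"))
    print(temp_dir_for("gs://aaa/bbb", "EXAMPLE"))
```

Because the tail is empty for both 'gs://aaa' and 'gs://aaa/', a single check
on the split result rejects both forms consistently.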
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/chamikaramj/beam gcs_root_location_file_sink
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/beam/pull/2770.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #2770
commit b66ae881a5adcdc5be8ee67a6d4ad842a2ea0147
Author: Chamikara Jayalath
Date: 2017-04-28T21:38:35Z
Fixes several issues of FileSink.
(1) Updates FileSink to fail for file name prefixes that contain only a
single component (for example, a bare GCS bucket).
Currently FileSink fails for gs://aaa but passes for gs://aaa/. This
change makes FileSink fail in both cases (and makes the behaviour
consistent with Java).
(2) Updates the name of the temporary directory created by FileSink.
Currently, for a file name prefix 'gs://aaa/bbb', the temp path is of
the form 'gs://aaa/bbb-temp-...'.
This is error prone because a user pattern 'gs://aaa/bbb*' would also match
the temp files. This change makes the temp path format
'gs://aaa/beam-temp-bbb-...' instead.