[GitHub] beam pull request #2770: [BEAM-539] Fixes several issues of FileSink

2017-05-02 Thread asfgit
GitHub user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2770


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] beam pull request #2770: [BEAM-539] Fixes several issues of FileSink

2017-04-28 Thread chamikaramj
GitHub user chamikaramj opened a pull request:

https://github.com/apache/beam/pull/2770

[BEAM-539] Fixes several issues of FileSink

(1) Updates FileSink to fail for file name prefixes that only contain a 
single component (for example GCS buckets).

For example, FileSink currently fails for gs://aaa but passes for
gs://aaa/. This change makes FileSink fail in both cases (and makes the
behavior consistent with Java).

(2) Updates the name of the temporary directory created by FileSink

Currently, for a filename prefix 'gs://aaa/bbb', the temp path has the
form 'gs://aaa/bbb-temp-...'.
This is error-prone, since a user pattern 'gs://aaa/bbb*' would match the
temp files. This change makes the temp path format
'gs://aaa/beam-temp-bbb-...' instead.
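
The collision described in (2) can be illustrated with a minimal sketch. It uses Python's standard fnmatch module as a stand-in for the file system's own glob matching, which may differ in detail. The SUFFIX value is a made-up placeholder for the part of the temp name that is elided above.

```python
from fnmatch import fnmatchcase

# Placeholder only: the real temp-name suffix is elided in the PR text.
SUFFIX = "XXXX"

user_pattern = "gs://aaa/bbb*"

# Old naming scheme: temp path shares the user's prefix.
old_temp = "gs://aaa/bbb-temp-" + SUFFIX
# New naming scheme: temp path starts with 'beam-temp-' instead.
new_temp = "gs://aaa/beam-temp-bbb-" + SUFFIX

# fnmatchcase approximates glob matching; it is an assumption that the
# real file system treats '*' the same way for these two paths.
old_collides = fnmatchcase(old_temp, user_pattern)
new_collides = fnmatchcase(new_temp, user_pattern)

print("old temp path matches user pattern:", old_collides)
print("new temp path matches user pattern:", new_collides)
```

With the old scheme the user's pattern picks up the temp path; with the new scheme it does not, because the last path component no longer starts with 'bbb'.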

To achieve the above, this PR adds a method 'split()' to the FileSystem
interface that is analogous to Python's 'os.path.split()' (and has the
opposite effect of the existing method FileSystem.join()).
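
As a rough illustration of how a split operation supports both fixes, here is a minimal sketch. The functions split_path and temp_dir_prefix are hypothetical names written for this example and are not the code in the PR. The sketch assumes a path is either a local path or has the form 'scheme://component/...'.

```python
import os.path


def split_path(path):
    """Hypothetical (head, tail) split; NOT the implementation from the PR."""
    scheme, sep, rest = path.partition("://")
    if not sep:
        # No scheme: fall back to the standard library for local paths.
        return os.path.split(path)
    head, slash, tail = rest.rpartition("/")
    if not slash:
        # Single component such as 'gs://aaa': there is no tail.
        return (path, "")
    return (scheme + sep + head, tail)


def temp_dir_prefix(file_path_prefix):
    """Hypothetical helper: builds the part of the temp path before the elided suffix."""
    head, tail = split_path(file_path_prefix)
    if not tail:
        # Covers both 'gs://aaa' and 'gs://aaa/', as in item (1).
        raise ValueError(
            "File path prefix must contain more than one component: %r"
            % file_path_prefix)
    return head + "/" + "beam-temp-" + tail


print(split_path("gs://aaa/bbb"))
print(temp_dir_prefix("gs://aaa/bbb"))
```

Because both 'gs://aaa' and 'gs://aaa/' split to an empty tail, a single check rejects them consistently, and the non-empty tail is what gets moved behind the 'beam-temp-' marker.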

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/chamikaramj/beam gcs_root_location_file_sink

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2770.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2770


commit b66ae881a5adcdc5be8ee67a6d4ad842a2ea0147
Author: Chamikara Jayalath 
Date:   2017-04-28T21:38:35Z

Fixes several issues of FileSink.

(1) Updates FileSink to fail for file name prefixes that only contain a 
single component (for example GCS buckets).

For example, FileSink currently fails for gs://aaa but passes for
gs://aaa/. This change makes FileSink fail in both cases (and makes the
behavior consistent with Java).

(2) Updates the name of the temporary directory created by FileSink

Currently, for a filename prefix 'gs://aaa/bbb', the temp path has the
form 'gs://aaa/bbb-temp-...'.
This is error-prone, since a user pattern 'gs://aaa/bbb*' would match the
temp files. This change makes the temp path format
'gs://aaa/beam-temp-bbb-...' instead.



