[ https://issues.apache.org/jira/browse/BEAM-577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15456740#comment-15456740 ]
ASF GitHub Bot commented on BEAM-577:
-------------------------------------

GitHub user sbilac opened a pull request:

    https://github.com/apache/incubator-beam/pull/912

    [BEAM-577] Add support for reading compressed files to filebasedsource.

    Be sure to do all of the following to help us incorporate your
    contribution quickly and easily:

     - [ ] Make sure the PR title is formatted like:
           `[BEAM-<Jira issue #>] Description of pull request`
     - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
           Travis-CI on your fork and ensure the whole test matrix passes).
     - [ ] Replace `<Jira issue #>` in the title with the actual Jira issue
           number, if there is one.
     - [ ] If this contribution is large, please file an Apache
           [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.txt).

    ---

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/sbilac/incubator-beam fbs

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-beam/pull/912.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #912

----
commit 25cf6748d8bb4643dffeff29cb1dcadb2771c97e
Author: Slaven Bilac <sla...@google.com>
Date:   2016-09-01T21:09:54Z

    Add support for reading compressed files.

----

> Update filebasedsource to support compressed files
> --------------------------------------------------
>
>                 Key: BEAM-577
>                 URL: https://issues.apache.org/jira/browse/BEAM-577
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-py
>            Reporter: Chamikara Jayalath
>            Assignee: Chamikara Jayalath
>
> FileBasedSource framework [1] should be updated to properly read compressed
> files.
> One possible way to do this might be to update FileBasedSource.open_file()
> [2] to return a CompressedFile [3].
> Similar to Java implementation, we may not be able to support dynamic work
> rebalancing for compressed files.
> [1]
> https://github.com/apache/incubator-beam/blob/python-sdk/sdks/python/apache_beam/io/filebasedsource.py
> [2]
> https://github.com/apache/incubator-beam/blob/python-sdk/sdks/python/apache_beam/io/filebasedsource.py#L125
> [3]
> https://github.com/apache/incubator-beam/blob/python-sdk/sdks/python/apache_beam/io/fileio.py#L300

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
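
The idea in the issue description, having the file-opening step hand back
a decompressing file object, can be sketched roughly as below. This is
only an illustration using the Python 3 standard library; it is not Beam's
FileBasedSource.open_file() or CompressedFile, and the names
open_maybe_compressed, read_lines and _OPENERS are hypothetical. Choosing
the codec from the file extension is also an assumption made for the
sketch.

```python
import bz2
import gzip
import os

# Hypothetical mapping from file extension to an opener. This is an
# assumption for illustration, not how Beam selects a compression type.
_OPENERS = {'.gz': gzip.open, '.bz2': bz2.open}


def open_maybe_compressed(path):
    """Return a binary file object that yields decompressed bytes.

    Files with an unrecognized extension are opened as plain files.
    """
    _, ext = os.path.splitext(path)
    opener = _OPENERS.get(ext, open)
    return opener(path, 'rb')


def read_lines(path):
    """Read every line of the file, sequentially from the beginning.

    The caller does not need to know whether the file was compressed.
    """
    with open_maybe_compressed(path) as f:
        return [line.rstrip(b'\n') for line in f]
```

Note that read_lines always starts at the beginning of the file. A gzip or
bzip2 stream generally cannot be decoded starting from an arbitrary byte
offset, which is one plausible reason the description expects dynamic work
rebalancing to be unavailable for compressed files.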