Chamikara Jayalath created BEAM-360:
---------------------------------------
Summary: Add a framework for creating Python-SDK sources for new file types
Key: BEAM-360
URL: https://issues.apache.org/jira/browse/BEAM-360
Project: Beam
Issue Type: New Feature
Components: sdk-py
Reporter: Chamikara Jayalath
Assignee: Chamikara Jayalath

We already have a framework for creating new sources for the Beam Python SDK:
https://github.com/apache/incubator-beam/blob/python-sdk/sdks/python/apache_beam/io/iobase.py#L326

It would be great if we could add a framework on top of this that encapsulates the logic common to file-based sources. This framework could include the following features, which are common to such sources:

(1) glob expansion
(2) support for new file systems
(3) dynamic work rebalancing based on byte offsets
(4) support for reading compressed files

The Java SDK has a similar framework, available at:
https://github.com/apache/incubator-beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileBasedSource.java
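
As a rough illustration of features (1), (3), and (4), here is a minimal standalone sketch that uses only the Python standard library. It is not the Beam API and not the proposed design: every function name below (expand_glob, split_by_offsets, read_lines, read_compressed_lines, read_pattern) is hypothetical, the ".gz" suffix check is an assumed convention, and the byte ranges are fixed up front rather than rebalanced dynamically. Feature (2), pluggable file systems, is left out.

```python
import glob
import gzip
import os


def expand_glob(pattern):
    """Feature (1): expand a file pattern into a sorted list of paths."""
    return sorted(glob.glob(pattern))


def split_by_offsets(path, bundle_size):
    """Feature (3), simplified: cut a file into (path, start, stop) byte
    ranges. A real framework would adjust these ranges while reading."""
    size = os.path.getsize(path)
    return [(path, start, min(start + bundle_size, size))
            for start in range(0, size, bundle_size)]


def read_lines(path, start, stop):
    """Yield the lines that begin inside [start, stop).

    A line straddling `stop` is read in full by this range; the next range
    skips it, so adjacent ranges never duplicate or drop a line.
    """
    with open(path, 'rb') as f:
        if start > 0:
            # Step back one byte and consume through the next newline. If
            # byte start-1 is a newline, this leaves us exactly at `start`.
            f.seek(start - 1)
            f.readline()
        while f.tell() < stop:
            line = f.readline()
            if not line:
                break
            yield line.rstrip(b'\n')


def read_compressed_lines(path):
    """Feature (4): a gzip stream cannot be entered at an arbitrary byte
    offset, so the whole file is read as a single unit."""
    with gzip.open(path, 'rb') as f:
        for line in f:
            yield line.rstrip(b'\n')


def read_pattern(pattern, bundle_size=64 * 1024):
    """Read every line of every file matching `pattern`."""
    for path in expand_glob(pattern):
        if path.endswith('.gz'):  # assumed convention for compressed input
            for line in read_compressed_lines(path):
                yield line
        else:
            for p, start, stop in split_by_offsets(path, bundle_size):
                for line in read_lines(p, start, stop):
                    yield line
```

In this sketch, a call such as read_pattern('/tmp/logs/*') would yield each line exactly once, regardless of where the byte ranges fall relative to line boundaries. That property is what makes offset-based splitting safe for line-oriented files.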