[GitHub] mccheah commented on a change in pull request #6: Support customizing the location where data is written in Spark

2018-11-27 Thread GitBox
mccheah commented on a change in pull request #6: Support customizing the location where data is written in Spark URL: https://github.com/apache/incubator-iceberg/pull/6#discussion_r236869168 ## File path: spark/src/main/java/com/netflix/iceberg/spark/source/IcebergSource.java

[GitHub] rdblue closed issue #13: Allow overriding provision of FileSystem instances to HadoopTableOperations

2018-11-27 Thread GitBox
rdblue closed issue #13: Allow overriding provision of FileSystem instances to HadoopTableOperations URL: https://github.com/apache/incubator-iceberg/issues/13

[GitHub] rdblue closed pull request #15: Allow FileSystem provision to be overridden.

2018-11-27 Thread GitBox
rdblue closed pull request #15: Allow FileSystem provision to be overridden. URL: https://github.com/apache/incubator-iceberg/pull/15 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a

[GitHub] rdblue commented on issue #15: Allow FileSystem provision to be overridden.

2018-11-27 Thread GitBox
rdblue commented on issue #15: Allow FileSystem provision to be overridden. URL: https://github.com/apache/incubator-iceberg/pull/15#issuecomment-442183487 I don't think that `FileIO` should handle other operations like `rename`. We could add `exists` to `InputFile` and `OutputFile`.
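
A minimal sketch of the shape this comment suggests, assuming a `FileIO` that only hands out file handles and deletes files, with `exists` living on `InputFile` and `OutputFile` rather than on `FileIO`, and no `rename` anywhere. The method names (`newInputFile`, `newOutputFile`, `deleteFile`, `location`) and the `InMemoryFileIO` class are hypothetical, chosen for illustration. They are not taken from the pull request.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical read handle: existence is checked on the handle itself.
interface InputFile {
    String location();
    boolean exists();
}

// Hypothetical write handle: same existence check as the read handle.
interface OutputFile {
    String location();
    boolean exists();
}

// Hypothetical FileIO: creates handles and deletes files, deliberately no rename.
interface FileIO {
    InputFile newInputFile(String path);
    OutputFile newOutputFile(String path);
    void deleteFile(String path);
}

// Toy in-memory implementation, used only to exercise the interfaces above.
class InMemoryFileIO implements FileIO {
    private final Map<String, byte[]> files = new HashMap<>();

    // Test helper that stores contents under a path (not part of FileIO).
    void put(String path, byte[] contents) {
        files.put(path, contents);
    }

    @Override
    public InputFile newInputFile(String path) {
        return new InputFile() {
            @Override public String location() { return path; }
            @Override public boolean exists() { return files.containsKey(path); }
        };
    }

    @Override
    public OutputFile newOutputFile(String path) {
        return new OutputFile() {
            @Override public String location() { return path; }
            @Override public boolean exists() { return files.containsKey(path); }
        };
    }

    @Override
    public void deleteFile(String path) {
        files.remove(path);
    }
}
```

Under these assumptions, a caller that needs to know whether a path is present asks the handle (`io.newInputFile(path).exists()`), so `FileIO` itself stays limited to creating handles and deleting files.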

Re: Issue list?

2018-11-27 Thread Ryan Blue
I think it makes sense to send issue comments to issues@. That will make it easy to remove them or set up filters to handle them. We can't really use the Arrow or Parquet solution because we're using GitHub issues instead of JIRA. On Tue, Nov 27, 2018 at 10:24 AM Uwe L. Korn wrote: > We have

[GitHub] mccheah commented on issue #15: Allow FileSystem provision to be overridden.

2018-11-27 Thread GitBox
mccheah commented on issue #15: Allow FileSystem provision to be overridden. URL: https://github.com/apache/incubator-iceberg/pull/15#issuecomment-442181361 Should the FileIO API support file rename as well though? The FileSystem here is used for things other than `fs.create`, `fs.delete`,

[GitHub] rdblue commented on issue #15: Allow FileSystem provision to be overridden.

2018-11-27 Thread GitBox
rdblue commented on issue #15: Allow FileSystem provision to be overridden. URL: https://github.com/apache/incubator-iceberg/pull/15#issuecomment-442174208 This looks good to me, but I wonder: should this be done along with the addition of a FileIO API? Maybe you should supply the FileIO

[GitHub] mccheah edited a comment on issue #16: Custom metadata in data files

2018-11-27 Thread GitBox
mccheah edited a comment on issue #16: Custom metadata in data files URL: https://github.com/apache/incubator-iceberg/issues/16#issuecomment-442171391 We're planning to use Iceberg as an ephemeral table storage format so that we can take advantage of Iceberg's Data Source V2

[GitHub] mccheah commented on issue #16: Custom metadata in data files

2018-11-27 Thread GitBox
mccheah commented on issue #16: Custom metadata in data files URL: https://github.com/apache/incubator-iceberg/issues/16#issuecomment-442171391 We're planning to use Iceberg as an ephemeral table storage format so that we can take advantage of Iceberg's Data Source V2 implementation to

Re: Issue list?

2018-11-27 Thread Uwe L. Korn
We have nearly the same setup as Hive for Arrow and Parquet and this is working really well. The main difference is that we don't have the git mails directly on the issues list but get them through JIRA where PR comments are posted as work logs. The work log workaround is only there because we

[GitHub] rdsr edited a comment on issue #16: Custom metadata in data files

2018-11-27 Thread GitBox
rdsr edited a comment on issue #16: Custom metadata in data files URL: https://github.com/apache/incubator-iceberg/issues/16#issuecomment-442159958 > instead of putting such info into a custom map, I think having these be recognized as first-class known entities will help iceberg data to

[GitHub] rdsr commented on issue #16: Custom metadata in data files

2018-11-27 Thread GitBox
rdsr commented on issue #16: Custom metadata in data files URL: https://github.com/apache/incubator-iceberg/issues/16#issuecomment-442159958 > instead of putting such info into a custom map, I think having these be recognized as first-class known entities will help iceberg data to be

[GitHub] hiteshs edited a comment on issue #16: Custom metadata in data files

2018-11-27 Thread GitBox
hiteshs edited a comment on issue #16: Custom metadata in data files URL: https://github.com/apache/incubator-iceberg/issues/16#issuecomment-442154938 If iceberg itself does not plan to parse or understand the map, could a json blob be considered instead? For CSV and compression

Issue list?

2018-11-27 Thread Owen O'Malley
All, As we move over to Apache infrastructure, we need to decide what works for the community. The dev list is getting a lot of traffic and is probably intimidating to newcomers. Currently the notices are: Pull Requests and issue creation/comment/close -> dev@ Git commit -> commits@ One