[GitHub] flink pull request: Implementation of distributed copying utility ...

2015-09-08 Thread detonator413
GitHub user detonator413 commented on the pull request: https://github.com/apache/flink/pull/1090#issuecomment-138547980 One profile check fails mysteriously and seems unrelated to the changes I introduced. The code should now comply with the guidelines.

[GitHub] flink pull request: Implementation of distributed copying utility ...

2015-09-07 Thread detonator413
GitHub user detonator413 commented on the pull request: https://github.com/apache/flink/pull/1090#issuecomment-138241466 Sure, will push some changes soon

[GitHub] flink pull request: Implementation of distributed copying utility ...

2015-09-04 Thread detonator413
GitHub user detonator413 commented on the pull request: https://github.com/apache/flink/pull/1090#issuecomment-137680835 Actually, Hadoop distcp also has an implementation of a dynamic input format, which for my taste is a bit overcomplicated. So not sure if this Flink tool will give

[GitHub] flink pull request: Implementation of distributed copying utility ...

2015-09-03 Thread detonator413
GitHub user detonator413 commented on the pull request: https://github.com/apache/flink/pull/1090#issuecomment-137480178 Hi Max, Look at the distcp utility (http://hadoop.apache.org/docs/r1.2.1/distcp.html). The p

[GitHub] flink pull request: Implementation of distributed copying utility ...

2015-09-03 Thread detonator413
GitHub user detonator413 opened a pull request: https://github.com/apache/flink/pull/1090 Implementation of distributed copying utility using Flink Uses a "dynamic" input format where faster nodes will get more stuff to be copied. The finest level of granularity

[GitHub] flink pull request: Implementation of distributed copying utility ...

2015-09-03 Thread detonator413
GitHub user detonator413 commented on the pull request: https://github.com/apache/flink/pull/1090#issuecomment-137516063 It could be faster because of the dynamic assignment of files to copy, as opposed to the default method of distcp, where sets of files are preassigned to mappers
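
The contrast drawn in this comment, files preassigned to mappers versus files handed out dynamically to whichever node is free, can be illustrated with a small simulation. This is only a sketch under stated assumptions: it is not the code from the pull request and uses neither the Flink nor the Hadoop API. The class name, the method names, and the per-file cost model (a fixed simulated time per file for each worker) are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration only; not taken from the pull request.
public class DynamicAssignmentSketch {

    /** Static split: files are dealt out round-robin up front, ignoring worker speed. */
    static List<List<String>> assignStatically(List<String> files, int workers) {
        List<List<String>> plan = new ArrayList<>();
        for (int w = 0; w < workers; w++) {
            plan.add(new ArrayList<>());
        }
        for (int i = 0; i < files.size(); i++) {
            plan.get(i % workers).add(files.get(i));
        }
        return plan;
    }

    /**
     * Dynamic split: the worker that becomes idle first pulls the next file.
     * costPerFile[w] is the assumed (simulated) time worker w needs per file.
     */
    static List<List<String>> assignDynamically(List<String> files, long[] costPerFile) {
        List<List<String>> plan = new ArrayList<>();
        // Each entry is {timeWhenFree, workerId}; ties go to the lower worker id.
        PriorityQueue<long[]> idle = new PriorityQueue<>(
            (a, b) -> a[0] != b[0] ? Long.compare(a[0], b[0]) : Long.compare(a[1], b[1]));
        for (int w = 0; w < costPerFile.length; w++) {
            plan.add(new ArrayList<>());
            idle.add(new long[] {0L, w});
        }
        Deque<String> pending = new ArrayDeque<>(files);
        while (!pending.isEmpty()) {
            long[] next = idle.poll();
            int w = (int) next[1];
            plan.get(w).add(pending.poll());
            // The worker is busy until it finishes this file.
            idle.add(new long[] {next[0] + costPerFile[w], w});
        }
        return plan;
    }

    public static void main(String[] args) {
        List<String> files = Arrays.asList("f1", "f2", "f3", "f4", "f5", "f6");
        long[] cost = {1, 3}; // worker 0 is assumed to be three times faster
        System.out.println("static:  " + assignStatically(files, 2));
        System.out.println("dynamic: " + assignDynamically(files, cost));
    }
}
```

With these assumed costs, the static plan gives each worker three files, while the dynamic plan gives the faster worker four and the slower one two, so the slow worker no longer holds up the whole job.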