Re: [PROPOSAL] External Join with KV Stores

2017-08-28 Thread JingsongLee
Yes, the runner can hold the entire side input in the right way. But that will be somewhat wasteful when the amount of data is large. Best, Jingsong Lee -- From: Lukasz Cwik Time: 2017 Aug 25 (Fri) 23:26 To: dev

Re: How to test a transform against an inaccessible ValueProvider?

2017-08-28 Thread Eugene Kirpichov
I sent a PR for review with what I think is an even better option: https://github.com/apache/beam/pull/3753 +Ben Chambers Example usage: p.apply("Read", AvroIO.read(GenericClass.class).from(*p.newProvider*(outputFile.getAbsolutePath()

Re: Proposal: file-based IOs should support readAllMatches()

2017-08-28 Thread Eugene Kirpichov
Thanks. I think I agree that file-based IOs (at least widely used ones) should, for convenience, still provide FooIO.read().from(filepattern), and for performance until SDF has full support in all runners, implement it via a BoundedSource. The second case with Create.of(filepattern) illustrates

Re: [DISCUSS] Capability Matrix revamp

2017-08-28 Thread Aljoscha Krettek
I like where this is going! Regarding benchmarking, I think we could do this if we had common benchmarking infrastructure and pipelines that regularly run on different Runners so that we have up-to-date data. I think we can also have a more technical section where we show stats on the level

Re: Proposal: file-based IOs should support readAllMatches()

2017-08-28 Thread Etienne Chauchot
Hi Eugene, +1 to this; it is nice to add this common behavior to all the file-based IOs. I find the design elegant. I have just one minor API comment: I would prefer p.apply(FooIO.read().from(filepattern)) to p.apply(Create.of(filepattern)). IMHO, it is more readable and analogous to the