Re: Will Beam add any overhead or lack certain API/functions available in Spark/Flink?

2019-05-04 Thread kant kodali
I believe this comes down more to abstractions vs. execution engines, and I am sure people can argue both sides. I think both are important; however, it is worth noting that the execution frameworks themselves have a lot of abstractions, though more generic ones can certainly be built on top. Are

Re: Will Beam add any overhead or lack certain API/functions available in Spark/Flink?

2019-05-04 Thread Pankaj Chand
Hi Matt, My project is for my PhD. So, I am interested in those 0.1% of use cases. --Pankaj On Sat, May 4, 2019, 10:48 AM Matt Casters wrote: > Anything can be coded in any form or language on any platform. > However, doing so takes time and effort. Maintaining the code takes time > as well

Re: python3-avro with CombineGlobally(CombineFn)

2019-05-04 Thread Chengxuan Wang
Yes. The only thing I can’t control is this line: https://github.com/apache/beam/pull/8130/files#diff-04fef9e0550df0b0c4e1cd0264406eb5L608 On Sat, May 4, 2019 at 04:46 Valentyn Tymofieiev wrote: > Hi Chengxuan, > > We will try to include this change in the next release. As I said, you > could

Is it safe to cache the value of a singleton view (with a global window) in a DoFn?

2019-05-04 Thread Steve Niemitz
I have a singleton view in a global window that is read from a DoFn. I'm curious whether it's "correct" to cache that value from the view, or if I need to read it every time. As a (simplified) example, if I were to generate the view as such: input.getPipeline

Re: Will Beam add any overhead or lack certain API/functions available in Spark/Flink?

2019-05-04 Thread Matt Casters
Anything can be coded in any form or language on any platform. However, doing so takes time and effort. Maintaining the code takes time as well as protecting the investments you made from changes in the ecosystem. This is obviously where APIs like Beam come into play quite heavily. New

Re: python3-avro with CombineGlobally(CombineFn)

2019-05-04 Thread Valentyn Tymofieiev
Hi Chengxuan, We will try to include this change in the next release. As I said, you could also set use_fastavro=true in your pipeline code without having to wait for the change that makes this flag set to true by default. Thanks, Valentyn *From:*Chengxuan Wang *Date:*Sat, May 4, 2019, 3:28 AM

Re: python3-avro with CombineGlobally(CombineFn)

2019-05-04 Thread Chengxuan Wang
Hi Valentyn, Thanks a lot. By following https://github.com/apache/beam/pull/8130 , I made changes in my Apache Beam package locally, and now my test passes. This line https://github.com/apache/beam/pull/8130/files#diff-04fef9e0550df0b0c4e1cd0264406eb5L608 is important. Is there a way to accelerate