I'm not sure. It depends on whether the Spark -> Beam Python integration will interfere with the magic built into AWS Glue.
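
On the Cython point in the quoted message below, one rough way to see whether a given module would load as a compiled C extension or as plain Python in a particular environment is to look at where the import system resolves it. This is only a sketch using the standard library. The helper name `is_compiled_extension` is made up for illustration, and a result of False for one module says nothing about the rest of a package's dependencies.

```python
import importlib.machinery
import importlib.util


def is_compiled_extension(module_name):
    """Hypothetical helper: report how a module would be loaded.

    Returns True if the module resolves to a compiled extension file
    (for example .so or .pyd), False if it resolves to anything else
    (pure Python source, built-in, frozen), and None if it cannot be found.
    """
    try:
        # Note: for dotted names, find_spec imports the parent packages.
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        return None
    if spec is None:
        return None
    origin = spec.origin or ""
    # EXTENSION_SUFFIXES lists the file suffixes this interpreter
    # recognizes for compiled extension modules.
    return origin.endswith(tuple(importlib.machinery.EXTENSION_SUFFIXES))


if __name__ == "__main__":
    # "apache_beam.coders.stream" is only an example name here; whether it
    # is Cython-built in a given install is an assumption to verify.
    for name in ("json", "apache_beam", "apache_beam.coders.stream"):
        print(name, is_compiled_extension(name))
```

Running this inside the target environment (for example a Glue job) would at least show whether the compiled variants are what actually get picked up there.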
On Wed, Jun 10, 2020 at 8:57 AM Noah Goodrich <m...@noahgoodrich.com> wrote:

> I was hoping to use the Spark runner since Glue is just Spark with some
> magic on top. And in our specific use case, we'd be looking at working with
> S3, Kinesis, and MySQL RDS.
>
> Sounds like this is a non-starter?
>
> On Wed, Jun 10, 2020 at 9:33 AM Luke Cwik <lc...@google.com> wrote:
>
>> Most runners are written in Java while others are cloud offerings which
>> wouldn't work for your use case which limits you to use the direct runner
>> (not meant for production/high performance applications). Beam Python SDK
>> uses cython for performance reasons but I don't believe it strictly
>> requires it as many unit tests run with and without cython enabled.
>> Integrations between Beam and third party libraries may require it though
>> so it likely depends on what you plan to do.
>>
>> On Wed, Jun 10, 2020 at 8:17 AM Noah Goodrich <m...@noahgoodrich.com>
>> wrote:
>>
>>> I am looking at using the Beam Python SDK in AWS Glue but it doesn't
>>> support non-native python libraries (anything that is c/c++ based).
>>>
>>> Is the Beam Python SDK / runners able to be used without any c/c++
>>> library dependencies?