I'm not sure. It depends on whether the Spark -> Beam Python integration
will interfere with the magic built into AWS Glue.
On Wed, Jun 10, 2020 at 8:57 AM Noah Goodrich wrote:
> I was hoping to use the Spark runner since Glue is just Spark with some
> magic on top. And in our specific use case, we'd be looking at working with
> S3, Kinesis, and MySQL RDS.
>
> Sounds like this is a non-starter?
>
> On Wed, Jun 10, 2020 at 9:33 AM Luke Cwik wrote:
>
>> Most runners are written in Java, and the others are cloud offerings that
>> wouldn't work for your use case. That limits you to the direct runner,
>> which is not meant for production or high-performance applications. The
>> Beam Python SDK uses Cython for performance reasons, but I don't believe it
>> strictly requires it, since many unit tests run both with and without
>> Cython enabled. Integrations between Beam and third-party libraries may
>> require it, though, so it likely depends on what you plan to do.
>>
>> On Wed, Jun 10, 2020 at 8:17 AM Noah Goodrich wrote:
>>
>>> I am looking at using the Beam Python SDK in AWS Glue, but Glue doesn't
>>> support libraries that are not pure Python (anything that is C/C++ based).
>>>
>>> Can the Beam Python SDK and its runners be used without any C/C++
>>> library dependencies?
>>>
>>
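
For anyone wanting to check the Cython question above in their own environment, here is a minimal sketch that reports whether a given module resolves to a compiled extension or to a plain Python file. The helper name is_compiled_extension is made up for illustration, and the two Beam module names in the usage section are assumptions about how the SDK is laid out, so verify them against the Beam version you actually install.

```python
import importlib.machinery
import importlib.util


def is_compiled_extension(module_name):
    """Classify a module by the file it would be loaded from.

    Returns True if the module resolves to a compiled extension file
    (.so / .pyd), False if it resolves to anything else (for example a
    .py file), and None if the module cannot be found at all.
    """
    try:
        # For dotted names this imports the parent package, which raises
        # ModuleNotFoundError (an ImportError) when the package is missing.
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        return None
    if spec is None:
        return None
    origin = spec.origin or ""
    # EXTENSION_SUFFIXES lists the platform's compiled-module suffixes.
    return origin.endswith(tuple(importlib.machinery.EXTENSION_SUFFIXES))


if __name__ == "__main__":
    # Assumed module names: a Cython-built stream module and its pure-Python
    # fallback. Adjust to whatever your installed Beam version ships.
    for name in ("apache_beam.coders.stream", "apache_beam.coders.slow_stream"):
        print(name, is_compiled_extension(name))
```

This only inspects where a module would be loaded from. It does not show whether the rest of the SDK's dependencies are pure Python, so each dependency would need the same kind of check.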