Hi Chamikara,

I've tried both with and without "--experiment=use_beam_bq_sink" and
it doesn't seem to affect the outcome.

Regarding "--experiment=beam_fn_api", we're using the portability
framework [1] so that we can ship our python runtime environment as an
image via "--worker_harness_container_image" instead of maintaining a
setup.py file.  It's worked very well for us so far (with the
exception of this latest issue).


[1] Am I using the term right?
https://beam.apache.org/documentation/runtime/environments/

On Mon, Jan 13, 2020 at 4:40 PM Chamikara Jayalath <[email protected]> wrote:
>
>
>
> On Mon, Jan 13, 2020 at 12:45 PM Bo Shi <[email protected]> wrote:
>>
>> Python 3.7.5
>> Beam 2.17
>>
>> I've used both WriteToBigquery and BigQueryBatchFileLoads successfully
>> using DirectRunner.  I've boiled the issue down to a small
>> reproducible case (attached).  The following works great:
>>
>> $ pipenv run python repro-direct.py \
>>   --project=CHANGEME \
>>   --runner=DirectRunner \
>>   --temp_location=gs://beam-to-bq--tmp/bshi/tmp \
>>   --staging_location=gs://beam-to-bq--tmp/bshi/stg \
>>   --experiment=beam_fn_api \   # <<<< NB! (I assume this is a no-op w/ Direct
>>   --save_main_function
>>
>> $ bq --project=CHANGEME query 'select * from bshi_test.direct'
>> Waiting on bqjob_r53488a6b9ed6cb20_0000016fa05fb570_1 ... (0s) Current
>> status: DONE
>> +----------+----------+
>> | some_str | some_int |
>> +----------+----------+
>> | hello    |        1 |
>> | world    |        2 |
>> +----------+----------+
>>
>> When changing to --runner=DataflowRunner, I get a cryptic Java error
>> with no other useful Python feedback (see class_cast_exception.txt)
>>
>> apache_beam.runners.dataflow.dataflow_runner.DataflowRuntimeException:
>> Dataflow pipeline failed. State: FAILED, Error:
>> java.util.concurrent.ExecutionException: java.lang.ClassCastException:
>> [B cannot be cast to org.apache.beam.sdk.values.KV
>>
>> On a hunch, I removed --experiment=beam_fn_api and DataflowRunner
>> works fine.  Am I doing something against the intent of the SDK?
>
>
> Can you try adding "--experiment=use_beam_bq_sink"  ?
> Also, out of curiosity, why are you setting "--experiment=beam_fn_api" ?
>
>
>>
>> Happy to share access to the GCP project if that would assist in debugging.

Reply via email to