It's not public because it was added for use in unit tests, and modifying
this value can have very unexpected results (e.g. making it smaller produces
more files, which can push the load onto a completely different code path,
the one used when there are too many files, leading to unexpected cost
increases in the pipeline).
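
To make the "too many files" concern concrete, here is a rough sketch of the
arithmetic only. It is not Beam code: the class name, the method names, and
the ASSUMED_MAX_FILES threshold are hypothetical and chosen for illustration,
so check BatchLoads.java for the values Beam actually uses.

```java
public class FileCountSketch {
    // Hypothetical file-count threshold, for illustration only.
    // This is an assumption, not a value read from Beam.
    static final long ASSUMED_MAX_FILES = 10_000L;

    // Number of files needed to hold totalBytes when each file
    // holds at most maxFileSize bytes (ceiling division).
    static long fileCount(long totalBytes, long maxFileSize) {
        if (totalBytes < 0 || maxFileSize <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        return (totalBytes + maxFileSize - 1) / maxFileSize;
    }

    // True when the file count would cross the assumed threshold.
    static boolean exceedsLimit(long totalBytes, long maxFileSize) {
        return fileCount(totalBytes, maxFileSize) > ASSUMED_MAX_FILES;
    }

    public static void main(String[] args) {
        long oneTiB = 1L << 40;
        long[] sizes = {1L << 30, 64L << 20}; // 1 GiB and 64 MiB per file
        for (long size : sizes) {
            System.out.printf(
                "maxFileSize=%d bytes -> %d files, exceeds assumed limit: %b%n",
                size, fileCount(oneTiB, size), exceedsLimit(oneTiB, size));
        }
    }
}
```

Under these assumed numbers, shrinking the file size from 1 GiB to 64 MiB for
1 TiB of data moves the file count from under the threshold to over it.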

Out of curiosity, what is your use case for needing to control this file
size?

On Thu, Sep 29, 2022 at 8:01 AM Ahmed Abualsaud <ahmedabuals...@google.com>
wrote:

> Hey Julien,
>
> I don't see a problem with exposing that method. That part of the code was
> committed ~6 years ago; my guess is that it was never requested to be public.
>
> One workaround is to hardcode another value for DEFAULT_MAX_FILE_SIZE [1].
> Would this work temporarily? @Chamikara Jayalath <chamik...@google.com>
> @Reuven Lax <re...@google.com> other thoughts?
>
> [1]
> https://github.com/apache/beam/blob/17453e71a81ba774ab451ad141fc8c21ea8770c9/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BatchLoads.java#L109
>
> Best,
> Ahmed
>
> On Wed, Sep 28, 2022 at 4:55 PM Julien Phalip <jpha...@gmail.com> wrote:
>
>> Hi,
>>
>> I'd like to control the size of files written to GCS when using
>> BigQueryIO's FILE_LOAD write method.
>>
>> However, it looks like the withMaxFileSize method (
>> https://github.com/apache/beam/blob/948af30a5b665fe74b7052b673e95ff5f5fc426a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java#L2597)
>> is not public.
>>
>> Is that intentional? Is there a workaround to control the file size?
>>
>> Thanks,
>>
>> Julien
>>
>
