Hello Sahith,

We recommend using BigtableIO over CloudBigtableIO. Both have similar
performance; the main difference is that CloudBigtableIO uses HBase Result
and Put objects, while BigtableIO uses protos for read results and
mutations.
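
For reference, a minimal write sketch with BigtableIO and proto mutations
could look like the following (the project/instance/table IDs and the
column family/qualifier are placeholders, not anything from your pipeline):

// Sketch: writing to Bigtable with BigtableIO and proto Mutations.
// IDs and the "cf"/"col" family/qualifier below are placeholders.
import com.google.bigtable.v2.Mutation;
import com.google.protobuf.ByteString;
import java.util.Collections;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.coders.IterableCoder;
import org.apache.beam.sdk.coders.KvCoder;
import org.apache.beam.sdk.extensions.protobuf.ByteStringCoder;
import org.apache.beam.sdk.extensions.protobuf.ProtoCoder;
import org.apache.beam.sdk.io.gcp.bigtable.BigtableIO;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.ParDo;
import org.apache.beam.sdk.values.KV;

public class BigtableIOWriteSketch {

  // Converts an input string into a (row key, mutations) pair using proto
  // SetCell mutations rather than HBase Put objects.
  static class ToBigtableMutation
      extends DoFn<String, KV<ByteString, Iterable<Mutation>>> {
    @ProcessElement
    public void processElement(ProcessContext c) {
      Mutation setCell =
          Mutation.newBuilder()
              .setSetCell(
                  Mutation.SetCell.newBuilder()
                      .setFamilyName("cf")                               // placeholder family
                      .setColumnQualifier(ByteString.copyFromUtf8("col"))
                      .setTimestampMicros(System.currentTimeMillis() * 1000)
                      .setValue(ByteString.copyFromUtf8(c.element())))
              .build();
      Iterable<Mutation> mutations = Collections.singletonList(setCell);
      c.output(KV.of(ByteString.copyFromUtf8("row-" + c.element()), mutations));
    }
  }

  public static void main(String[] args) {
    Pipeline p = Pipeline.create();

    p.apply(Create.of("a", "b", "c"))
        .apply(ParDo.of(new ToBigtableMutation()))
        // BigtableIO writes KV<row key, Iterable<Mutation>> pairs.
        .setCoder(
            KvCoder.of(
                ByteStringCoder.of(), IterableCoder.of(ProtoCoder.of(Mutation.class))))
        .apply(
            "WriteToBigtable",
            BigtableIO.write()
                .withProjectId("my-project")    // placeholder
                .withInstanceId("my-instance")  // placeholder
                .withTableId("my-table"));      // placeholder

    p.run().waitUntilFinish();
  }
}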

The two connectors should result in similar spending on the Bigtable side;
more write requests don't necessarily mean more cost or more nodes. What
version of CloudBigtableIO are you using, and are you using an autoscaling
CBT cluster?
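
If you do want to experiment with batching on the BigtableIO side, there is
a BigtableOptions configurator hook on the write transform. A rough sketch
is below; the BulkOptions builder and its setter names come from the
bigtable-client-core library, so treat the exact methods and the numbers as
assumptions to check against the version on your classpath:

// Hedged sketch: raising client-side bulk batching for BigtableIO.write().
// BulkOptions setter names and values are assumptions; verify against the
// bigtable-client-core version in your build.
import com.google.cloud.bigtable.config.BigtableOptions;
import com.google.cloud.bigtable.config.BulkOptions;
import org.apache.beam.sdk.io.gcp.bigtable.BigtableIO;

public class BigtableWriteBatchingSketch {
  static BigtableIO.Write batchedWrite() {
    return BigtableIO.write()
        .withProjectId("my-project")    // placeholder
        .withInstanceId("my-instance")  // placeholder
        .withTableId("my-table")        // placeholder
        // Batch more mutations into each bulk request before flushing.
        .withBigtableOptionsConfigurator(
            (BigtableOptions.Builder builder) ->
                builder.setBulkOptions(
                    BulkOptions.builder()
                        .setBulkMaxRowKeyCount(500)              // rows per bulk request
                        .setBulkMaxRequestSize(4L * 1024 * 1024) // bytes per bulk request
                        .build()));
  }
}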

-Diego

On Tue, Aug 16, 2022 at 11:55 AM Sahith Nallapareddy via dev <
dev@beam.apache.org> wrote:

> Hello,
>
> I see that there are two implementations of reading and writing from
> Bigtable, one in Beam and one that is referenced in the Google Cloud
> documentation. Is one preferred over the other? We often use the Beam
> BigtableIO to write to Bigtable, but I have found that sometimes the default
> configuration can lead to a lot of write requests (which, it seems, can also
> lead to more nodes and therefore more cost). I am about to try adjusting the
> bulk options to see whether that raises the batching of mutations, but is
> there anything else I should try, like switching the actual transform we
> use?
>
> Thanks,
>
> Sahith
>