Hi Steve,

The public dataset is accessible from anywhere. BigLake has a free tier:
the first 50,000 requests each month are free [1]. So it is not entirely
free, but it is essentially "freeish." I'm not sure about egress charges.
When using the dataset, users must specify a project that will be billed.
That said, I haven't incurred any charges on my own project so far. I know
spinning up a Spark cluster is not a big deal for you, but if you want to
give it a quick try, I also created a gist that uses pyiceberg [2].

[1] https://cloud.google.com/products/biglake/pricing
[2] https://gist.github.com/talatuyarer/02568a38a7630434556e7dc1f0a5ab40

On Wed, Jan 21, 2026 at 5:31 AM Steve Loughran <[email protected]> wrote:

>
> are these remotely accessible? and who pays?
>
> I'm just thinking of whether it's a datasource for regression testing.
>
> For s3a we use public (free) parquet datasets for some of the scale read
> testing...keeps setup time minimal and stops "needs a few hundred MB of
> data in s3" as a cost blocker to contributors (*).
>
> It'd be nice to have public iceberg datasets in the various stores for
> similar regression tests
>
> steve
>
> (*) we use NOAA data, luckily the s3 bucket hasn't been decommissioned by
> the US govt, though I did worry about that last year
>
> On Wed, 14 Jan 2026 at 21:27, Alex Stephen via dev <[email protected]>
> wrote:
>
>> Hi all,
>>
>> We just launched a public dataset (backed by a public Iceberg REST
>> Catalog) that can be accessed by any Iceberg-enabled query engine. The goal
>> is for Iceberg developers to begin diving into the ecosystem without
>> bootstrapping a full catalog and creating data.
>>
>> We'd love to hear any of your thoughts on how we can improve it.
>>
>> Announcement blog post
>> <https://opensource.googleblog.com/2026/01/explore-public-datasets-with-apache-iceberg-and-biglake.html>
>> Example PySpark script
>> <https://gist.github.com/rambleraptor/7fd2fd55a208da7e5c000430d54d8db4>
>>
>> Thanks!
>>
>> -- Alex Stephen
>>
>
