Re: Basic iceberg metrics viz tool

2021-03-02 Thread Filip
Ooops, sorry, forgot to post a link to the tool's screenshot sample https://i.imgur.com/FERzd8X.png On 2021/03/02 16:18:22, Filip wrote: > Hi devs, > > With a lot of help from plotly.js [1] and some very basic vanilla > javascript (sorry, I peaked in javascript back in the days of p

Basic iceberg metrics viz tool

2021-03-02 Thread Filip
f you think this would make a good addition to the repo. [1] https://github.com/plotly/plotly.js/blob/master/LICENSE -- Filip Bocse

Re: An edge-case on snapshot expiration, incremental reads and very slow consecutive writes

2021-01-13 Thread Filip
probability of intersection even more. Thank you, /Filip On Tue, Jan 12, 2021 at 11:34 PM John Clara wrote: > To get around this, my team expires snapshots based on the number of > snapshots rather than by time. For example, if the reader job is scheduled > to consume 2k snapshot incre

An edge-case on snapshot expiration, incremental reads and very slow consecutive writes

2021-01-11 Thread Filip
expire snapshots older than 10 days but we observe two consecutive write operations 11 days apart. -- Filip Bocse
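The scenario in this thread can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Iceberg's API: the `Snapshot` class and the helpers `expire_older_than` and `retain_last` are hypothetical names, time is measured in whole days, and the model assumes the most recent snapshot is always kept. It contrasts time-based expiration with the count-based retention suggested in the reply.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    snapshot_id: int
    committed_day: int  # day on which the write committed (toy time unit)


def expire_older_than(snapshots, now_day, max_age_days):
    """Time-based expiration (hypothetical helper).

    Keeps snapshots no older than max_age_days; the latest snapshot is
    always kept, which is an assumption of this toy model.
    """
    latest = max(snapshots, key=lambda s: s.committed_day)
    return [
        s for s in snapshots
        if s == latest or now_day - s.committed_day <= max_age_days
    ]


def retain_last(snapshots, count):
    """Count-based retention (hypothetical helper): keep the newest `count`."""
    ordered = sorted(snapshots, key=lambda s: s.committed_day)
    return ordered[-count:]


# Two consecutive writes, 11 days apart, with a 10-day expiration window.
history = [Snapshot(1, 0), Snapshot(2, 11)]

time_based = expire_older_than(history, now_day=11, max_age_days=10)
count_based = retain_last(history, count=2)

print([s.snapshot_id for s in time_based])   # [2]
print([s.snapshot_id for s in count_based])  # [1, 2]
```

In this model the time-based policy drops snapshot 1 as soon as snapshot 2 lands, so a reader that last consumed snapshot 1 no longer has its starting point for an incremental read. The count-based policy keeps both snapshots regardless of how far apart the writes are.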

Has the topic of CDC (change data capture) been considered for Iceberg? If not, should it?

2020-03-10 Thread Filip
ceberg for such use-cases? Oh and I also was thinking that CDC could be a valuable candidate to consider for the Update/Delete/Upsert spec/ implementation. -- Filip Bocse

Re: Iceberg tombstone?

2020-01-28 Thread Filip
uns against a particular snapshot id and commits against a possibly later one should no conflict surface wrt to the list of files matching the filtering predicates. /Filip On Tue, Jan 28, 2020 at 1:34 AM Miao Wang wrote: > Hi Ryan, > > > > Just found your comment in my junk mail box. >

Iceberg tombstone?

2020-01-14 Thread Filip
ing that we could leverage to translate tombstone options into row level filter/ predicates into https://github.com/apache/incubator-iceberg/blob/6048e5a794242cb83871e3838d0d40aa71e36a91/spark/src/main/java/org/apache/iceberg/spark/source/Reader.java#L438-L439 /Filip

Re: [DISCUSS] Write-audit-publish support

2019-07-22 Thread Filip
that has WAP enabled by the >>> table property write.wap.enabled=true will stage the new snapshot >>> instead of fully committing, with the WAP ID in the snapshot’s metadata. >>> >>> Is this something we should open a PR to add to Iceberg? It seems a >>> little strange to make it appear that a commit has succeeded, but not >>> actually change a table, which is why we didn’t submit it before now. >>> >>> Thanks, >>> >>> rb >>> -- >>> Ryan Blue >>> Software Engineer >>> Netflix >>> >> >> > > -- > Ryan Blue > Software Engineer > Netflix > -- Filip Bocse

Iceberg API incompatibility - written file paths (to human representation) breaks partition discovery based on generated path

2019-06-24 Thread Filip
translate to extracting date as 17907 (days since epoch till 2019-01-01) - assuming that `date` is a `day` function backed partition transform. Makes sense? Is it a known limitation or is it an issue? If it's an issue, do you know if it's already tracked? /Filip
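The days-since-epoch arithmetic can be checked with the standard library. This is a minimal sketch, assuming a `day` transform amounts to counting whole days from 1970-01-01; the helper names `days_since_epoch` and `date_from_days` are illustrative, not Iceberg functions.

```python
from datetime import date

EPOCH = date(1970, 1, 1)


def days_since_epoch(d: date) -> int:
    """Whole days between the Unix epoch and the given date."""
    return (d - EPOCH).days


def date_from_days(n: int) -> date:
    """Inverse mapping: the calendar date that lies n days after the epoch."""
    return date.fromordinal(EPOCH.toordinal() + n)


print(days_since_epoch(date(2019, 1, 1)))  # 17897
print(date_from_days(17907))               # 2019-01-11
```

By this calculation 2019-01-01 is day 17897, while the value 17907 quoted in the message corresponds to 2019-01-11.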

Use a Spark schema util method to figure out new delta nested fields

2019-06-14 Thread Filip
ma of the inbound file by comparing it to the Iceberg current schema and generate the effective schema commit? -- /Filip

(How) Does Iceberg support migration from/ embedding to a generic datalake architecture?

2019-06-07 Thread Filip
in a generic data-lake architecture in the first place, thinking solely from an adoption pov? If anyone else has been giving some thought to this and maybe either figured some stuff out or wants to share ideas on the topic please do. [1] https://github.com/apache/incubator-iceberg/pull/201 -- /Filip

Re: Need help trying to figure out if the issue on multiple partition specs on same field is a tracked issue or not

2019-06-03 Thread Filip
structure >> won’t contain year/month/day folders. If you are to have that directory >> structure, you need to have actual columns for year/month/day in your >> dataset and use identity partition function. >> >> Thanks, >> Anton >> >> >> >

Need help trying to figure out if the issue on multiple partition specs on same field is a tracked issue or not

2019-05-28 Thread filip
g.TestScansAndSchemaEvolution.testMultiPartitionPerFieldTransform(TestScansAndSchemaEvolution.java:177) I was wondering if this issue is tracked so maybe I could help out. Thanks, /Filip

Re: [VOTE] Add the python implementation

2019-03-06 Thread filip
e it > elsewhere. > > Please vote in the next 72 hours: > > [ ] +1: Commit the current Python PR implementation > [ ] +0: . . . > [ ] -1: Do not add the current implementation because . . . > > Thanks! > > rb > > -- > Ryan Blue > > > -- Filip Bocse

Re: Would we consider adding support for metrics collection/tracing instrumentation such as opencensus or opentracing?

2019-02-25 Thread filip
've used DropWizard before, which I thought was trying to be the SLF4J of > metrics. Is that still the case? I'd prefer to go with an established > project that is likely to have broad support. And one that has a reasonable > dependency set. > > On Mon, Feb 18, 2019 at 2:33 PM fil

Re: Would we consider adding support for metrics collection/tracing instrumentation such as opencensus or opentracing?

2019-02-18 Thread filip
know. Can you elaborate on what opencensus and opentracing are? > > On Mon, Feb 18, 2019 at 12:51 PM filip wrote: > >> >> /Filip >> > > > -- > Ryan Blue > Software Engineer > Netflix > -- Filip Bocse

Would we consider adding support for metrics collection/tracing instrumentation such as opencensus or opentracing?

2019-02-18 Thread filip
/Filip

Re: Is it by design that we can only add optional top-level fields?

2019-01-30 Thread filip
for making unsafe changes to make that easier for > administrators. > > On Wed, Jan 30, 2019 at 1:43 AM filip wrote: > >> Thank you for the details Ryan but I think I was quite vague on the >> initial question so please let me try rephrasing the question by adding >> more cont

Re: Is it by design that we can only add optional top-level fields?

2019-01-30 Thread filip
required(11, "string", Types.StringType.get()), required(12, "time", Types.TimeType.get()), required(13, "timestampz", Types.TimestampType.withoutZone()), required(14, "timestamp", Types.TimestampType.withZone()), require

Is it by design that we can only add optional top-level fields?

2019-01-29 Thread filip
Is it by design that the schema evolution API for adding top-level fields will always create an optional field as per SchemaUpdate code [1]? [1] https://github.com/Netflix/iceberg/blob/master/core/src/main/java/com/netflix/iceberg/SchemaUpdate.java#L102 -- Filip Bocse