Re: Observer Namenode and Committer Algorithm V1

2021-09-20 Thread Venkatakrishnan Sowrirajan
I have created a JIRA (https://issues.apache.org/jira/browse/SPARK-36810) to track this issue. Will look into this issue further in the coming days. Regards Venkata krishnan On Tue, Sep 7, 2021 at 5:57 AM Steve Loughran wrote: > FileContext came in Hadoop 2.x with a cleaner split of client

Re: Observer Namenode and Committer Algorithm V1

2021-09-07 Thread Steve Loughran
FileContext came in Hadoop 2.x with a cleaner split of client API and driver implementation, and stricter definition of some things considered broken in FileSystem (rename() corner cases, notion of a current directory, ...) But as it came out after the platform was broadly adopted & never

Re: Observer Namenode and Committer Algorithm V1

2021-09-06 Thread Adam Binford
Sharing some things I learned looking into the Delta Lake issue: - This was a read-after-write inconsistency _all on the driver_. Specifically, it currently uses the FileSystem API for reading table logs for greater compatibility, but the FileContext API for writes for atomic renames. This led to

Re: Observer Namenode and Committer Algorithm V1

2021-08-20 Thread Steve Loughran
ooh, this is fun, v2 isn't safe to use unless every task attempt generates files with exactly the same names and it is okay to intermingle the output of two task attempts. This is because task commit can fail partway through (or worse, that process can pause for a full GC), and a second attempt

Re: Observer Namenode and Committer Algorithm V1

2021-08-20 Thread Adam Binford
So it turns out Delta Lake isn't compatible out of the box due to its mixed use of the FileContext API for writes and the FileSystem API for reads on the driver. Bringing that up with those devs now but in the meantime the auto-msync-only-on-driver trick is already coming in handy, thanks! On

Re: Observer Namenode and Committer Algorithm V1

2021-08-18 Thread Adam Binford
Ahhh we don't do any RDD checkpointing but that makes sense too. Thanks for the tip on setting that on the driver only, I didn't know that was possible but it makes a lot of sense. I couldn't tell you the first thing about reflection but good to know it's actually something possible to implement

Re: Observer Namenode and Committer Algorithm V1

2021-08-17 Thread Erik Krogen
Hi Adam, Thanks for this great writeup of the issue. We (LinkedIn) also operate Observer NameNodes, and have observed the same issues, but have not yet gotten around to implementing a proper fix. To add a bit of context from our side, there is at least one other place besides the committer v1