I have created a JIRA (https://issues.apache.org/jira/browse/SPARK-36810)
to track this issue. Will look into this issue further in the coming days.
Regards
Venkata krishnan
On Tue, Sep 7, 2021 at 5:57 AM Steve Loughran
wrote:
> FileContext came in Hadoop 2.x with a cleaner split of client API
FileContext came in Hadoop 2.x with a cleaner split of client API and
driver implementation, and stricter definition of some things considered
broken in FileSystem (rename() corner cases, notion of a current directory,
...)
But as it came out after the platform was broadly adopted & never
backport
Sharing some things I learned looking into the Delta Lake issue:
- This was a read after write inconsistency _all on the driver_.
Specifically it currently uses the FileSystem API for reading table logs
for greater compatibility, but the FileContext API for writes for atomic
renames. This led to t
ooh, this is fun,
v2 isn't safe to use unless every task attempt generates files with exactly
the same names and it is okay to intermingle the output of two task
attempts.
This is because task commit can fail partway through (or worse, that
process can pause for a full GC), and a second attempt commi
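To make the v2 hazard above concrete, here is a toy sketch in plain Python. It is not the Hadoop committer code: the destination directory is modelled as a dict, and `commit_task_v2`, `run`, and the file names are hypothetical names made up for illustration. The only assumption carried over from the description is that a v2-style task commit moves files straight into the destination one at a time, so a failed attempt can leave some of its files behind.

```python
def commit_task_v2(dest, attempt_files, fail_after=None):
    """Toy v2-style task commit: move files straight into dest, one by one.

    fail_after simulates the attempt dying after that many files were moved.
    Returns True if the whole commit finished, False if it died partway.
    """
    for i, (name, data) in enumerate(attempt_files.items()):
        if fail_after is not None and i >= fail_after:
            return False  # attempt died partway through its commit
        dest[name] = data  # each move is visible in dest immediately
    return True


def run(names_attempt1, names_attempt2):
    """Attempt 1 dies after moving one file, then attempt 2 commits fully."""
    dest = {}
    attempt1 = {name: "attempt-1" for name in names_attempt1}
    attempt2 = {name: "attempt-2" for name in names_attempt2}
    commit_task_v2(dest, attempt1, fail_after=1)
    commit_task_v2(dest, attempt2)
    return dest


# Same file names in both attempts: attempt 2 overwrites the leftover file.
same_names = run(["part-0000", "part-0001"], ["part-0000", "part-0001"])

# Different file names per attempt: the leftover from attempt 1 survives
# next to the complete output of attempt 2.
different_names = run(
    ["part-0000-a1", "part-0001-a1"],
    ["part-0000-a2", "part-0001-a2"],
)

print(sorted(same_names))
print(sorted(different_names))
```

With identical names the destination ends up holding only the second attempt's files. With per-attempt names, one file from the failed attempt is left in the output, which is the intermingling the message warns about.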
So it turns out Delta Lake isn't compatible out of the box due to its
mixed use of the FileContext API for writes and the FileSystem API for
reads on the driver. Bringing that up with those devs now but in the
meantime the auto-msync-only-on-driver trick is already coming in handy,
thanks!
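For anyone following along, here is a toy model in plain Python of why mixing two client handles can give a stale read, and why an msync helps. It does not use the Hadoop API. The classes `Active`, `Observer`, and `Client`, and the path used, are all hypothetical. It assumes each client handle tracks its own last-seen transaction id, that the writer (standing in for FileContext) and the reader (standing in for FileSystem) do not share that state, and that the Observer catches up instantly when asked for a newer state, whereas a real Observer would wait for edits to arrive.

```python
class Active:
    """Toy Active NameNode: owns the namespace and the transaction counter."""

    def __init__(self):
        self.txid = 0
        self.files = set()

    def create(self, path):
        self.txid += 1
        self.files.add(path)
        return self.txid


class Observer:
    """Toy Observer NameNode: serves reads from a possibly stale copy."""

    def __init__(self, active):
        self.active = active
        self.txid = 0
        self.files = set()

    def exists(self, path, min_txid):
        # Only catch up when the caller proves it has seen a newer state.
        # (Assumption: catching up is instant in this toy model.)
        if self.txid < min_txid:
            self.txid = self.active.txid
            self.files = set(self.active.files)
        return path in self.files


class Client:
    """Toy client handle with its own last-seen transaction id."""

    def __init__(self, active, observer):
        self.active = active
        self.observer = observer
        self.last_seen_txid = 0

    def write(self, path):
        self.last_seen_txid = self.active.create(path)

    def msync(self):
        # Ask the Active for its latest state id before reading.
        self.last_seen_txid = self.active.txid

    def exists(self, path):
        return self.observer.exists(path, self.last_seen_txid)


active = Active()
observer = Observer(active)
writer = Client(active, observer)  # stands in for the FileContext handle
reader = Client(active, observer)  # stands in for the FileSystem handle

path = "/table/log/0001.json"  # hypothetical path
writer.write(path)

stale_read = reader.exists(path)  # reader never saw the write's txid
reader.msync()
fresh_read = reader.exists(path)  # reader now demands the newer state

print(stale_read, fresh_read)
```

In this model the first read misses the file because the reader's handle has no record of the write, and the read after `msync()` finds it. An automatic msync on the driver plays the role of the explicit `msync()` call here.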
On Wed
Ahhh we don't do any RDD checkpointing but that makes sense too. Thanks for
the tip on setting that on the driver only, I didn't know that was possible
but it makes a lot of sense.
I couldn't tell you the first thing about reflection but good to know it's
actually something possible to implement o
Hi Adam,
Thanks for this great writeup of the issue. We (LinkedIn) also operate
Observer NameNodes, and have observed the same issues, but have not yet
gotten around to implementing a proper fix.
To add a bit of context from our side, there is at least one other place
besides the committer v1 alg
Hi,
We ran into an interesting issue that I wanted to share, and I'd like to get
thoughts on whether anything should be done about it. We run our own Hadoop
cluster and recently deployed an Observer Namenode to take some burden off
of our Active Namenode. We mostly use Delta Lake as our format, and
everyth