As someone who operates mainly in AWS, I would very much welcome the
option to use an updated version of Hadoop with the PySpark sourced from PyPI,
while acknowledging the issues of backwards compatibility...
The most vexing issue is the inability to use s3a with STS, i.e.
org.apache.hadoop.fs.s3a.Te
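For context, a sketch of the kind of configuration this would unblock, assuming a hadoop-aws build recent enough (Hadoop 2.9+/3.x) to ship the s3a temporary-credentials support; the property names below are the standard s3a ones:

```
# spark-defaults.conf (sketch): use STS session credentials with s3a.
# Requires hadoop-aws with TemporaryAWSCredentialsProvider on the classpath.
spark.hadoop.fs.s3a.aws.credentials.provider  org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider
spark.hadoop.fs.s3a.access.key                <access-key>
spark.hadoop.fs.s3a.secret.key                <secret-key>
spark.hadoop.fs.s3a.session.token             <sts-session-token>
```

With the Hadoop version bundled in the PyPI PySpark wheels, this provider is not available, which is the pain point described above.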
I call attention to https://github.com/apache/spark/pull/28971, which
represents part 1 of the last large set of changes needed for
Scala 2.13 (aside from REPL updates): dealing with the fact that the
default collection types are immutable in 2.13.
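To illustrate the change (a minimal sketch, not code from the PR): in Scala 2.13 the `scala.Seq` alias points at `scala.collection.immutable.Seq`, whereas in 2.12 it pointed at the general `scala.collection.Seq`, so passing a mutable buffer where a `Seq` is expected compiles on 2.12 but not on 2.13. An explicit `.toSeq` is one idiom that cross-compiles on both:

```scala
import scala.collection.mutable

object Immutable213Demo {
  def main(args: Array[String]): Unit = {
    val buf = mutable.ArrayBuffer(1, 2, 3)

    // On 2.13, `val s: Seq[Int] = buf` no longer compiles, because
    // scala.Seq is now immutable.Seq. Converting explicitly works on
    // both 2.12 and 2.13, which is the cross-compilation concern here.
    val s: Seq[Int] = buf.toSeq

    println(s.sum)
  }
}
```

Sweeping the codebase for implicit mutable-to-`Seq` conversions like this is why the change lands as a series of large mechanical PRs.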
The goal, of course, is to cross-compile without se