steveloughran commented on pull request #30135:
URL: https://github.com/apache/spark/pull/30135#issuecomment-895209346
known 3.3.1 regressions. Not AFAIK
steveloughran commented on pull request #30135:
URL: https://github.com/apache/spark/pull/30135#issuecomment-872974884
> To my surprise the read is slower (with same resource and same config) in
Hadoop 3.3.1 than Hadoop 3.2.0 without the mentioned issue. It is possible I am
missing somethin
steveloughran commented on pull request #30135:
URL: https://github.com/apache/spark/pull/30135#issuecomment-868530293
the parquet EOF fix is also in hadoop-3.2.2, so you could try that. However,
testing with 3.3.1 is better because
1. we can do workarounds in spark before the releas
steveloughran commented on pull request #30135:
URL: https://github.com/apache/spark/pull/30135#issuecomment-868515080
@arghya18
1. we haven't seen HADOOP-17755 in any of our testing, unless it is
HADOOP-16109
2. still waiting on that JIRA for you to provide config details. Like I've
steveloughran commented on pull request #30135:
URL: https://github.com/apache/spark/pull/30135#issuecomment-859463610
> For the regression, don't know the full context behind the original change
but seems like a good thing to do, although a boolean flag returned might be
less disruptive I
steveloughran commented on pull request #30135:
URL: https://github.com/apache/spark/pull/30135#issuecomment-859048545
> reverted from RC3 so I reverted my earlier change of handling it in Spark
code. Let me check later.
maybe the build is picking up the old RC.
FWIW I'm surpr