[GitHub] [spark] dongjoon-hyun commented on pull request #30135: [SPARK-29250][BUILD] Upgrade to Hadoop 3.3.1

2021-08-05 Thread GitBox
dongjoon-hyun commented on pull request #30135: URL: https://github.com/apache/spark/pull/30135#issuecomment-894006737 +1 for backporting!

[GitHub] [spark] dongjoon-hyun commented on pull request #30135: [SPARK-29250][BUILD] Upgrade to Hadoop 3.3.1

2021-06-19 Thread GitBox
dongjoon-hyun commented on pull request #30135: URL: https://github.com/apache/spark/pull/30135#issuecomment-864466090 Hi, all. FYI, in Apache Hadoop 3.3.1, we reverted HADOOP-16878 as the last commit on `branch-3.3.1`. - https://github.com/apache/hadoop/commit/a3b9c37a397ad418

[GitHub] [spark] dongjoon-hyun commented on pull request #30135: [SPARK-29250][BUILD] Upgrade to Hadoop 3.3.1

2021-06-19 Thread GitBox
dongjoon-hyun commented on pull request #30135: URL: https://github.com/apache/spark/pull/30135#issuecomment-864469570 Since Apache Hadoop trunk has this behavior, I made a PR to be more robust against the underlying behavior difference. - https://github.com/apache/spark/pull/32983

[GitHub] [spark] dongjoon-hyun commented on pull request #30135: [SPARK-29250][BUILD] Upgrade to Hadoop 3.3.1

2021-06-24 Thread GitBox
dongjoon-hyun commented on pull request #30135: URL: https://github.com/apache/spark/pull/30135#issuecomment-868190212 We are preparing Apache Spark 3.2.0 with Hadoop 3.3.1, @arghya18 . - SPARK-35831 is fixed. - SPARK-35868 is also fixed. - SPARK-35878 has the PR from @steveloughran

[GitHub] [spark] dongjoon-hyun commented on pull request #30135: [SPARK-29250][BUILD] Upgrade to Hadoop 3.3.1

2021-06-25 Thread GitBox
dongjoon-hyun commented on pull request #30135: URL: https://github.com/apache/spark/pull/30135#issuecomment-868284598 Thank you for sharing, @arghya18 . HADOOP-17755 sounds like a read-side issue, and the Magic committer is a write-side feature. I don't think they are related. If you hit a magi

[GitHub] [spark] dongjoon-hyun commented on pull request #30135: [SPARK-29250][BUILD] Upgrade to Hadoop 3.3.1

2021-07-01 Thread GitBox
dongjoon-hyun commented on pull request #30135: URL: https://github.com/apache/spark/pull/30135#issuecomment-872729358 Thank you for sharing, @arghya18 . It's interesting. The read statistic increase is also observed in my environment, but TPCDS 1TB on S3 parquet performance was faster for

[GitHub] [spark] dongjoon-hyun commented on pull request #30135: [SPARK-29250][BUILD] Upgrade to Hadoop 3.3.1

2021-07-01 Thread GitBox
dongjoon-hyun commented on pull request #30135: URL: https://github.com/apache/spark/pull/30135#issuecomment-872736822 Oh, if you are using ORC, please try bringing in SPARK-35783. It's unrelated to Hadoop, but it helps you reduce the traffic. - https://github.com/apache/spark/pull/32923

[GitHub] [spark] dongjoon-hyun commented on pull request #30135: [SPARK-29250][BUILD] Upgrade to Hadoop 3.3.1

2021-05-24 Thread GitBox
dongjoon-hyun commented on pull request #30135: URL: https://github.com/apache/spark/pull/30135#issuecomment-847220566 Yay! Thank you, @sunchao .

[GitHub] [spark] dongjoon-hyun commented on pull request #30135: [SPARK-29250][BUILD] Upgrade to Hadoop 3.3.1

2021-05-27 Thread GitBox
dongjoon-hyun commented on pull request #30135: URL: https://github.com/apache/spark/pull/30135#issuecomment-849931542 Thank you for the updates, @sunchao !

[GitHub] [spark] dongjoon-hyun commented on pull request #30135: [SPARK-29250][BUILD] Upgrade to Hadoop 3.3.1

2021-06-03 Thread GitBox
dongjoon-hyun commented on pull request #30135: URL: https://github.com/apache/spark/pull/30135#issuecomment-854198489 Interesting. The same test fails in both Jenkins and GitHub Actions. Did something change in this new RC? - `ClientSuite.distribute jars archive` ``` org.apache.had

[GitHub] [spark] dongjoon-hyun commented on pull request #30135: [SPARK-29250][BUILD] Upgrade to Hadoop 3.3.1

2021-06-04 Thread GitBox
dongjoon-hyun commented on pull request #30135: URL: https://github.com/apache/spark/pull/30135#issuecomment-854198489 Interesting. The same test fails in both Jenkins and GitHub Actions. Did something change in this new RC? - `ClientSuite.distribute jars archive` ``` org.apache.had