Thank you for testing and sharing the results. The NaN failures look interesting to me. Let me take a look at them too. If this is an ORC bug, we should add similar NaN tests on both the Apache ORC and Apache Spark sides.
Thanks,
Dongjoon.

On Sun, Apr 3, 2022 at 12:39 PM William Hyun <williamhy...@gmail.com> wrote:

> Hello All,
>
> I tested the following in branch-1.7:
> - docker tests
> - Apache Spark IT
> - Apache Iceberg IT
>
> There are two failures in Iceberg IT. ORC-1121 looks relevant to these
> failures.
>
> I created a GH issue to track this, please take a look together if you have
> a chance.
>
> https://github.com/apache/orc/issues/1075
>
> Thank you,
> William
>
> On Sat, Apr 2, 2022 at 4:28 PM Dongjoon Hyun <dongjoon.h...@gmail.com>
> wrote:
>
> > Thank you, William.
> >
> > +1 for your decision to align with Apache Spark 3.3.0
> >
> > Dongjoon.
> >
> >
> > On Sat, Apr 2, 2022 at 2:19 PM William Hyun <williamhy...@gmail.com>
> > wrote:
> >
> > > Thank you, it looks good to me.
> > >
> > > As a release manager of Apache ORC 1.7.4, I would like to discuss the
> > > release of 1.7.4.
> > > We have the following patches already.
> > >
> > > ORC-236: Support `UNION` type in Java Convert tool (#1025)
> > > ORC-1117: Add `Dask` page at `Using in Python` section (#1045)
> > > ORC-1116: [C++] Fix csv-import tool when exporting long bytes (#1044)
> > > ORC-1118: Support Java 17 and ARM64 docker tests (#1047)
> > > ORC-1119: Remove timestamp from ORC API docs (#1049)
> > > ORC-1123: Add estimationMemory method for writer
> > > ORC-1120: Remove C++ library limitation about write version (#1054)
> > > ORC-1121: Fix column coversion check bug which causes column filters don't work (#1055)
> > > ORC-1127: [C++] add missing version of UNSTABLE-PRE-2.0 (#1064)
> > >
> > > For me, ORC-1121 is a notable bug fix and
> > > ORC-1123 looks important because it was requested by the Iceberg community.
> > >
> > > In addition, Apache Spark is going to start v3.3.0 RC soon.
> > > Shall we release Apache ORC 1.7.4 next week in order to be included in
> > > Spark 3.3.0?
> > >
> > > Regards,
> > > William
> > >
> > >
> > >
> > > On Fri, Apr 1, 2022 at 5:44 PM Dongjoon Hyun <dongjoon.h...@gmail.com>
> > > wrote:
> > >
> > > > It's time for our quarterly report to the ASF board. I wrote a draft.
> > > > Please let me know if you'd like to add or change anything.
> > > >
> > > > ----
> > > > ## Description:
> > > > The mission of ORC is the creation and maintenance of software related
> > > > to the smallest, fastest columnar storage for Hadoop workloads.
> > > >
> > > > ## Issues:
> > > > There are no issues requiring board attention.
> > > >
> > > > ## Membership Data:
> > > > Apache ORC was founded 2015-04-21 (7 years ago)
> > > > There are currently 45 committers and 13 PMC members in this project.
> > > > The Committer-to-PMC ratio is roughly 3:1.
> > > >
> > > > The PMC has been focusing on the community's growth to be
> > > > a healthier community by helping candidates.
> > > >
> > > > - No new PMC members. Last addition was William Hyun on 2021-09-30.
> > > > - Quanlong Huang was added as committer on 2022-03-04.
> > > > - One invited candidate has been working on ICLA/CCLA.
> > > >
> > > > ## Project Activity:
> > > > According to our release cadence, we released two maintenance releases
> > > > in this quarter and helped Apache Spark, Iceberg, and Arrow projects to
> > > > use it.
> > > >
> > > > - 1.6.13 (2022-01-20)
> > > > - 1.7.3 (2022-02-09)
> > > >
> > > > In addition, we collaborated with the Arrow community and added the
> > > > official 'USING IN PYTHON' pages for the users.
> > > >
> > > > https://orc.apache.org/docs/pyarrow.html
> > > > https://orc.apache.org/docs/dask.html
> > > >
> > > > William proposed to use the 'GitHub issue' feature to lower the hurdle to
> > > > contribute
> > > > and it's implemented via 'ORC-1094: Enable GitHub issues tab'.
> > > > So far, it helps us a lot as a more user-friendly channel by skipping
> > > > JIRA login.
> > > >
> > > > ## Community Health:
> > > >
> > > > Since we started to use 'GitHub issues', the mailing list activity has
> > > > decreased.
> > > > However, all the other activities are increased.
> > > >
> > >
> >
>