No, I don't have a timeline yet. My intention is to find out whether anyone is interested in working on any part or proposing new features. The items I listed earlier can be a starting point for discussion. It would be good to collect enough information here on the mailing list and then create a GitHub issue listing the items to sort things out.

Best,
Gang

On Thu, Dec 7, 2023 at 1:45 PM Dongjoon Hyun <dongjoon.h...@gmail.com> wrote:

> Thank you for the heads up. As a part of discussion, do you have any
> timeline or target ORC version for orc-format v2.0?
> Given that it's one of the non-trivial efforts, I'm wondering what we can
> achieve in 2024.
>
> Thanks,
> Dongjoon.
>
> On Wed, Dec 6, 2023 at 9:00 PM Gang Wu <ust...@gmail.com> wrote:
>
> > The Apache ORC community has created a separate orc-format
> > repo [1] to hold format specs. It can help us decouple the versions
> > of format and implementation.
> >
> > IMO, it is now a good time to discuss the next step to evolve the
> > ORC format. To give my two cents, following items are what we can do:
> > - Follow up with the ORC Format v2 proposal [2]
> > - Parquet feature parity [3]
> > - Lance feature parity [4]
> >
> > Considering the activity in the community, I'd like to hear different
> > opinions before taking any action. Any suggestions are welcome.
> >
> > [1] https://github.com/apache/orc-format
> > [2] https://orc.apache.org/specification/ORCv2
> > [3] https://github.com/apache/parquet-format
> > [4] https://lancedb.github.io/lance/format.html
> >
> > Best,
> > Gang