Hi Vukasin,

The order is not fixed, and we should follow the same order as the *Stream* fields in the *StripeFooter*.
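
As a rough illustration of that rule (a minimal sketch, not code from ORC or cudf): assuming the reader has already decoded the StripeFooter into a list of stream descriptors, the per-column stream order can be derived from that list rather than from the TypeKind. The `StreamInfo` type, the `data_stream_order` helper, and the sample lengths are hypothetical names and values made up for this example; only the stream kind names come from the ORC format.

```python
from collections import defaultdict
from typing import NamedTuple


class StreamInfo(NamedTuple):
    """Hypothetical, already-decoded view of one Stream entry in a StripeFooter."""
    column: int
    kind: str
    length: int


# Stream kinds that belong to the index section rather than the data section.
INDEX_KINDS = {"ROW_INDEX", "BLOOM_FILTER", "BLOOM_FILTER_UTF8"}


def data_stream_order(footer_streams):
    """Return, per column, the data stream kinds in StripeFooter order.

    The order is taken from the footer as written; nothing is assumed
    about a canonical order for the column's type.
    """
    order = defaultdict(list)
    for stream in footer_streams:
        if stream.kind not in INDEX_KINDS:
            order[stream.column].append(stream.kind)
    return dict(order)


# Example matching the case in the question: the footer lists LENGTH
# before DATA for a string column (column id 1 here, chosen arbitrarily).
footer_streams = [
    StreamInfo(column=1, kind="ROW_INDEX", length=20),
    StreamInfo(column=1, kind="LENGTH", length=50),
    StreamInfo(column=1, kind="DATA", length=400),
]

stream_order = data_stream_order(footer_streams)
print(stream_order)  # {1: ['LENGTH', 'DATA']}
```

With this input, a reader would treat the LENGTH stream as coming before the DATA stream for column 1, simply because that is how the footer lists them.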

Best,
Gang

On Thu, Mar 23, 2023 at 8:23 AM Vukasin Milovanovic <vmilovano...@nvidia.com.invalid> wrote:

> Hi,
>
> I'm working on an issue in the ORC reader in
> https://github.com/rapidsai/cudf. This reader uses the row index to
> parallelize the reads of row groups on the GPU.
> I've found that the issue stems from the unexpected order of row index
> streams. Namely, the order does not seem to match the order of
> corresponding data stream descriptors in the file footer.
> In this specific case, file footer contains the LENGTH stream of a string
> column before its DATA stream. However, the row index streams seem to be
> stored in the opposite order.
>
> So, my question is: what is the order of row index streams in an ORC file
> (within each column)? Is it fixed for the given TypeKind, or are they
> indeed ordered to correspond to the data stream order?
>
> Thank you,
> Vukasin
>