> Does anyone know if we can recover existing data affected by it?

In PR #1271, two data types have correctness bugs: decimal18 and timestampZone.
For decimal18, we actually write the correct decimal value but read it incorrectly. Say the type is decimal(10,3) and the value is 10.100: the ORC writer stores it in the file as 101*10^(-1), while before this patch we read it as 101*10^(-3). If we construct the BigDecimal with the stored exponent of -1 (that is, a BigDecimal scale of 1) and then adjust it to scale=3, then in theory we could still get the correct decimal, 10100*10^(-3).

For timestampZone, I'd say that we've stored the wrong value in the file. The error between the written timestamp and the correct timestamp should be small, less than a few seconds. At [1], truncating division and floor division differ for negative values: -5 / 2 = -2, while floorDiv(-5, 2) = -3. The quotient is therefore off by at most 1, and the nanosecond part of a timestamp is always less than one second. However, I have not found a way to recover the existing data.

1. https://github.com/apache/iceberg/pull/1271/files#diff-5aa4840155ec70fdf7f725e122cde7b7L218

On Tue, Aug 4, 2020 at 3:08 AM Ryan Blue <rb...@netflix.com.invalid> wrote:

> Yes, we should get #1269 into a patch release as well since it is a
> correctness bug.
>
> Does anyone know if we can recover existing data affected by it?
>
> On Mon, Aug 3, 2020 at 11:08 AM Anton Okolnychyi <aokolnyc...@apple.com>
> wrote:
>
>> I see a few open issues for ORC. Some of them seem critical (like issue
>> #1269). Do we want to fix those before the release? Or is ORC support still
>> experimental?
>>
>> - Anton
>>
>> On 1 Aug 2020, at 20:04, Jungtaek Lim <kabhwan.opensou...@gmail.com>
>> wrote:
>>
>> Sure! I just submitted #1285
>> <https://github.com/apache/iceberg/pull/1285> to exclude the refactor.
>> Once #1285 is merged I'll rebase the existing PR to do the refactor. Thanks
>> for the input!
>>
>> On Sun, Aug 2, 2020 at 4:41 AM Ryan Blue <rb...@netflix.com.invalid>
>> wrote:
>>
>>> Thanks, Jungtaek! I agree it would be great to fix that problem. I took
>>> a quick look at the PR and it is a little big to go into a patch release
>>> since it refactors quite a few places to consolidate the list copy. What do
>>> you think about making a PR that just fixes the problem with
>>> BaseCombinedScanTask and Kryo, then doing the remainder of the refactor in
>>> master?
>>>
>>> On Fri, Jul 31, 2020 at 5:29 PM Jungtaek Lim <
>>> kabhwan.opensou...@gmail.com> wrote:
>>>
>>>> If we still have some more days I think #1280
>>>> <https://github.com/apache/iceberg/pull/1280>: "fix serialization
>>>> issue in BaseCombinedScanTask with Kyro" is a good candidate to be
>>>> included. The bug affects both Spark and Flink (according to #1279
>>>> <https://github.com/apache/iceberg/pull/1279>).
>>>>
>>>> On Sat, Aug 1, 2020 at 8:04 AM Ryan Blue <b...@apache.org> wrote:
>>>>
>>>>> Hi everyone,
>>>>>
>>>>> We’ve accumulated a few bug fixes in the last couple of weeks and I
>>>>> think it might make sense to get some of them out in an 0.9.1 release
>>>>> since they make it harder to work with Iceberg. Here are the ones I
>>>>> know about:
>>>>>
>>>>> - #1282 <https://github.com/apache/iceberg/pull/1282>: rewriteNot
>>>>> fails for binary and unary predicates
>>>>> - #1278 <https://github.com/apache/iceberg/pull/1278>: Bad import
>>>>> from commons-compress causes query failures
>>>>> - #1251 <https://github.com/apache/iceberg/pull/1251>: Fixes more
>>>>> imports from non-Iceberg Guava
>>>>> - #1283 <https://github.com/apache/iceberg/pull/1283>: Query
>>>>> descriptions fail when IN predicates are pushed
>>>>> - #1228 <https://github.com/apache/iceberg/pull/1228>: Data
>>>>> imports fail when paths include whitespace
>>>>> - #1194 <https://github.com/apache/iceberg/pull/1194>: USING
>>>>> should set format when used in a CTAS command
>>>>> - #1203 <https://github.com/apache/iceberg/pull/1203>: Table cache
>>>>> should not expire
>>>>>
>>>>> If there are no objections, I’ll get started and create a release
>>>>> branch. And please reply if there are other issues you’ve seen that should
>>>>> also be included in a patch release.
>>>>>
>>>>> rb
>>>>> --
>>>>> Ryan Blue
>>>>>
>>>>
>>>
>>> --
>>> Ryan Blue
>>> Software Engineer
>>> Netflix
>>>
>>
>>
>
> --
> Ryan Blue
> Software Engineer
> Netflix
>
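
P.S. To make the arithmetic at the top of this message concrete, here is a minimal Java sketch. It is not the code from PR #1271: the class name OrcRecoverySketch and the helper rescale are hypothetical, and it assumes the unscaled value (101) and the scale that was actually stored are both available when reading. Note that an exponent of -1 corresponds to a BigDecimal scale of 1.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

public class OrcRecoverySketch {
    // Hypothetical helper: rebuild the decimal from the unscaled value and the
    // scale that was actually stored, then widen it to the column's scale.
    // setScale without a rounding mode throws ArithmeticException if digits
    // would be lost, so a narrowing adjustment fails loudly instead of rounding.
    static BigDecimal rescale(long unscaled, int storedScale, int columnScale) {
        return new BigDecimal(BigInteger.valueOf(unscaled), storedScale).setScale(columnScale);
    }

    public static void main(String[] args) {
        // decimal(10,3), value 10.100, stored as 101 * 10^(-1).
        BigDecimal misread = BigDecimal.valueOf(101, 3); // 101 * 10^(-3), the buggy read
        BigDecimal recovered = rescale(101, 1, 3);       // 10100 * 10^(-3)
        System.out.println(misread + " vs " + recovered);

        // Truncating division rounds toward zero, floorDiv rounds toward
        // negative infinity, so they differ by 1 for inexact negative quotients.
        System.out.println(-5 / 2);               // -2
        System.out.println(Math.floorDiv(-5, 2)); // -3
    }
}
```

This only covers the decimal18 read path and the source of the off-by-one in the timestamp split; it does not recover timestampZone values that have already been written.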