[ https://issues.apache.org/jira/browse/PHOENIX-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16690080#comment-16690080 ]
Vincent Poon edited comment on PHOENIX-5018 at 11/16/18 10:33 PM:
------------------------------------------------------------------

I think an easy first step would be to make sure all the index timestamps are correct.
It wouldn't be too difficult to unify the codebases if we use the MetadataRegionObserver approach, since it's just a scan with an attribute set. But given PHOENIX-5026, it's not clear if issuing mutations from a scan is the right approach. Maybe we can defer the refactoring to another JIRA.


was (Author: vincentpoon):
I think an easy first step would be to make sure all the index timestamps are correct.
It wouldn't be too difficult to unify the codebases if we use the MetadataRegionObserver approach, since it's just a scan with an attribute set. But given PHOENIX-5026, it's not clear if do mutations from a scan is the right approach. Maybe we can defer the refactoring to another JIRA.

> Index mutations created by IndexTool will have wrong timestamps
> ---------------------------------------------------------------
>
>                 Key: PHOENIX-5018
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-5018
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.14.0, 5.0.0
>            Reporter: Geoffrey Jacoby
>            Assignee: Geoffrey Jacoby
>            Priority: Major
>
> When doing a full rebuild (or initial async build) on an index using the
> IndexTool and PhoenixIndexImportDirectMapper, we generate the index mutations
> by creating an UPSERT SELECT query from the base table to the index, then
> taking the Mutations from it and inserting it directly into the index via an
> HBase HTable.
> The timestamps of the Mutations use the default HBase behavior, which is to
> take the current wall clock. However, the timestamp of an index KeyValue
> should use the timestamp of the initial KeyValue in the base table.
> Having base table and index timestamps out of sync can cause all sorts of
> weird side effects, such as if the base table has data with an expired TTL
> that isn't expired in the index yet.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
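
To illustrate the TTL side effect described in the issue, here is a minimal, self-contained Java sketch. It does not use the HBase or Phoenix APIs: the Cell class, the helper method names, and all numeric values are hypothetical stand-ins chosen for illustration, and the expiry rule is a simplified model of TTL (a cell is expired once its age exceeds the TTL).

```java
class IndexTimestampSketch {

    /** Hypothetical, simplified stand-in for a KeyValue: a row key plus a timestamp in ms. */
    static final class Cell {
        final String rowKey;
        final long timestampMs;

        Cell(String rowKey, long timestampMs) {
            this.rowKey = rowKey;
            this.timestampMs = timestampMs;
        }
    }

    /** Simplified TTL model: a cell is expired once its age exceeds the TTL. */
    static boolean isExpired(Cell cell, long ttlMs, long nowMs) {
        return nowMs - cell.timestampMs > ttlMs;
    }

    /** Behavior described in the issue: the index cell is stamped with the wall clock at rebuild time. */
    static Cell indexCellWithWallClock(Cell base, long rebuildTimeMs) {
        return new Cell("idx:" + base.rowKey, rebuildTimeMs);
    }

    /** Desired behavior: the index cell carries the timestamp of the base table cell. */
    static Cell indexCellWithBaseTimestamp(Cell base) {
        return new Cell("idx:" + base.rowKey, base.timestampMs);
    }

    public static void main(String[] args) {
        long ttlMs = 1_000L;          // assumed TTL for both base table and index
        long rebuildTimeMs = 900L;    // assumed time at which the index is rebuilt
        long nowMs = 1_500L;          // assumed time at which expiry is checked

        Cell base = new Cell("row1", 0L);  // base row written at t=0
        Cell wallClockIndex = indexCellWithWallClock(base, rebuildTimeMs);
        Cell baseTsIndex = indexCellWithBaseTimestamp(base);

        System.out.println("base expired: " + isExpired(base, ttlMs, nowMs));
        System.out.println("index (wall clock) expired: " + isExpired(wallClockIndex, ttlMs, nowMs));
        System.out.println("index (base timestamp) expired: " + isExpired(baseTsIndex, ttlMs, nowMs));
    }
}
```

With these assumed values, the base cell has aged past the TTL while the wall-clock-stamped index cell has not, which is the out-of-sync case the description mentions. Carrying the base timestamp over makes both expire together.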