[ https://issues.apache.org/jira/browse/PHOENIX-6273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Viraj Jasani updated PHOENIX-6273:
----------------------------------
    Fix Version/s:     (was: 4.16.1)
                       (was: 4.x)
                   4.16.0
                   5.1.0

> All the map tasks should operate on the same restored snapshot
> --------------------------------------------------------------
>
>                 Key: PHOENIX-6273
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-6273
>             Project: Phoenix
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 5.0.0, 4.14.3
>            Reporter: Saksham Gangwar
>            Assignee: Saksham Gangwar
>            Priority: Major
>             Fix For: 5.1.0, 4.16.0
>
>
> Recently we switched an MR application from scanning live tables to scanning
> snapshots (PHOENIX-3744). We ran into a severe performance issue, which
> turned out to be a correctness issue caused by the generation of overlapping
> scan splits. After some debugging we found that it had already been fixed by
> PHOENIX-4997.
> We also *need not restore the snapshot per map task*. Currently, we restore
> the snapshot once per map task into a temp directory. For large tables on big
> clusters, this creates a storm of NameNode (NN) RPCs. We can instead restore
> the snapshot once per job and let all the map tasks operate on the same
> restored snapshot. HBase already did this in HBASE-18806; we can do something
> similar.
>
> All other performance suggestions are tracked here:
> https://issues.apache.org/jira/browse/PHOENIX-6081



--
This message was sent by Atlassian Jira
(v8.3.4#803005)