[ https://issues.apache.org/jira/browse/PHOENIX-6153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Saksham Gangwar updated PHOENIX-6153:
-------------------------------------
    Summary: Phoenix Table Map Reduce After Snapshot Map Reduce fails with Snapshot Corrupt  (was: MapReduceParallelScanGrouper getRegionBoundaries should use connection specific properties vs common config)

> Phoenix Table Map Reduce After Snapshot Map Reduce fails with Snapshot Corrupt
> ------------------------------------------------------------------------------
>
>                 Key: PHOENIX-6153
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-6153
>             Project: Phoenix
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 4.14.3, 4.x
>            Reporter: Saksham Gangwar
>            Assignee: Saksham Gangwar
>            Priority: Major
>             Fix For: 4.16.0
>
>
> For the different MR job requests that reach [MapReduceParallelScanGrouper getRegionBoundaries|https://github.com/apache/phoenix/blob/f9e304754bad886344a856dd2565e3f24e345ed2/phoenix-core/src/main/java/org/apache/phoenix/iterate/MapReduceParallelScanGrouper.java#L65],
> we currently use a configuration that is shared among jobs to determine the snapshot name, which is wrong.
> Example job sequence: the first two jobs run over snapshots and the third job runs over a regular table.
> Printing the hash codes of the objects on entering:
> [https://github.com/apache/phoenix/blob/f9e304754bad886344a856dd2565e3f24e345ed2/phoenix-core/src/main/java/org/apache/phoenix/iterate/MapReduceParallelScanGrouper.java#L65]
>
> *Job 1:* (over a snapshot of *ABC_TABLE_1*; succeeds)
> context.getConnection(): 521093916
> ConnectionQueryServices: 1772519705
> *ReadOnlyProps props: 1520403731*
> props.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): *ABC_TABLE_1*
> *Configuration conf: 813285994*
> conf.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): *ABC_TABLE_1*
>
> *Job 2:* (over a snapshot of *ABC_TABLE_2*; succeeds)
> context.getConnection(): 1928017473
> ConnectionQueryServices: 961279422
> *ReadOnlyProps props: 1520602316*
> props.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): *ABC_TABLE_2*
> *Configuration conf: 813285994*
> conf.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): *ABC_TABLE_2*
>
> *Job 3:* (over the table *ABC_TABLE_3*; fails with CorruptedSnapshotException although it has nothing to do with a snapshot)
> context.getConnection(): 28889670
> ConnectionQueryServices: 424389847
> *ReadOnlyProps props: 1573377628*
> props.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): *null*
> *Configuration conf: 813285994*
> conf.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): *ABC_TABLE_2*
>
> The Configuration hash code (813285994) is the same in all three jobs, so Job 3 still sees the snapshot name that Job 2 set.
>
> Exception which we get:
> [2020:08:18 20:56:17.409] [MigrationRetryPoller-Executor-1] [ERROR] [c.s.hgrate.mapreduce.MapReduceImpl] - Error submitting M/R job for Job 3
> java.lang.RuntimeException: org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException: Couldn't read snapshot info from:hdfs://.../hbase/.hbase-snapshot/ABC_TABLE_2_1597687413477/.snapshotinfo
>     at org.apache.phoenix.iterate.MapReduceParallelScanGrouper.getRegionBoundaries(MapReduceParallelScanGrouper.java:81) ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>     at org.apache.phoenix.iterate.BaseResultIterators.getRegionBoundaries(BaseResultIterators.java:541) ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>     at org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:893) ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>     at org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:641) ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>     at org.apache.phoenix.iterate.BaseResultIterators.<init>(BaseResultIterators.java:511) ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>     at org.apache.phoenix.iterate.ParallelIterators.<init>(ParallelIterators.java:62) ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>     at org.apache.phoenix.execute.ScanPlan.newIterator(ScanPlan.java:278) ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>     at org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:367) ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>     at org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:218) ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>     at org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:213) ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>     at org.apache.phoenix.mapreduce.PhoenixInputFormat.setupParallelScansWithScanGrouper(PhoenixInputFormat.java:252) ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>     at org.apache.phoenix.mapreduce.PhoenixInputFormat.setupParallelScansFromQueryPlan(PhoenixInputFormat.java:235) ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>     at org.apache.phoenix.mapreduce.PhoenixInputFormat.generateSplits(PhoenixInputFormat.java:94) ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>     at org.apache.phoenix.mapreduce.PhoenixInputFormat.getSplits(PhoenixInputFormat.java:89) ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>     at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:301) ~[hadoop-mapreduce-client-core-2.7.7-sfdc-1.0.18.jar:2.7.7-sfdc-1.0.18]
>     at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:318) ~[hadoop-mapreduce-client-core-2.7.7-sfdc-1.0.18.jar:2.7.7-sfdc-1.0.18]
>     at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:196) ~[hadoop-mapreduce-client-core-2.7.7-sfdc-1.0.18.jar:2.7.7-sfdc-1.0.18]
>     at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290) ~[hadoop-mapreduce-client-core-2.7.7-sfdc-1.0.18.jar:2.7.7-sfdc-1.0.18]
>     at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287) ~[hadoop-mapreduce-client-core-2.7.7-sfdc-1.0.18.jar:2.7.7-sfdc-1.0.18]
>     at java.security.AccessController.doPrivileged(Native Method) ~[na:1.8.0_172]
>     at javax.security.auth.Subject.doAs(Subject.java:422) ~[na:1.8.0_172]
>
>
> Change Required:
> 1. Don't set the snapshot name in the shared configuration, which is used by every job:
> [https://github.com/apache/phoenix/blob/f9e304754bad886344a856dd2565e3f24e345ed2/phoenix-core/src/main/java/org/apache/phoenix/mapreduce/PhoenixInputFormat.java#L210]
>
> 2. Read it from the connection-specific properties here:
> [https://github.com/apache/phoenix/blob/f9e304754bad886344a856dd2565e3f24e345ed2/phoenix-core/src/main/java/org/apache/phoenix/iterate/MapReduceParallelScanGrouper.java#L65]
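
To make the failure mode in the report concrete, the sketch below reproduces the three-job sequence with plain Java maps. It is only an illustration under stated assumptions, not Phoenix code: `sharedConf` stands in for the shared Hadoop `Configuration`, the map returned by `setUpJob` stands in for the connection-specific `ReadOnlyProps`, and the class name, method names, and key string are hypothetical placeholders.

```java
import java.util.HashMap;
import java.util.Map;

public class SnapshotNameLookupSketch {
    // Placeholder string; stands in for PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY.
    static final String SNAPSHOT_NAME_KEY = "snapshot.name.key.placeholder";

    // Stand-in for the single Configuration instance that every job shares
    // (the same hash code appears for all three jobs in the report).
    static final Map<String, String> sharedConf = new HashMap<>();

    /**
     * Simulates job setup. Returns the job's own connection-specific properties.
     * For snapshot jobs it also writes the name into the shared map, which is
     * the behaviour the report asks to remove (change 1).
     */
    static Map<String, String> setUpJob(String snapshotName) {
        Map<String, String> connectionProps = new HashMap<>();
        if (snapshotName != null) {
            connectionProps.put(SNAPSHOT_NAME_KEY, snapshotName);
            sharedConf.put(SNAPSHOT_NAME_KEY, snapshotName); // leaks into later jobs
        }
        return connectionProps;
    }

    // Lookup through the shared map: may return a name left behind by an earlier job.
    static String snapshotFromSharedConf() {
        return sharedConf.get(SNAPSHOT_NAME_KEY);
    }

    // Lookup through the job's own properties (change 2): null for a regular table.
    static String snapshotFromConnection(Map<String, String> connectionProps) {
        return connectionProps.get(SNAPSHOT_NAME_KEY);
    }

    public static void main(String[] args) {
        setUpJob("ABC_TABLE_1");                        // Job 1: snapshot job
        setUpJob("ABC_TABLE_2");                        // Job 2: snapshot job
        Map<String, String> job3 = setUpJob(null);      // Job 3: regular table

        System.out.println("shared conf:      " + snapshotFromSharedConf());
        System.out.println("connection props: " + snapshotFromConnection(job3));
    }
}
```

In this sketch the shared lookup for Job 3 still yields ABC_TABLE_2, while the connection-specific lookup yields null, which matches the values printed in the report.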