[ 
https://issues.apache.org/jira/browse/PHOENIX-6153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saksham Gangwar updated PHOENIX-6153:
-------------------------------------
    Attachment: Screen Shot 2020-09-30 at 4.01.19 AM.png

> Table Map Reduce job after a Snapshot based job fails with 
> CorruptedSnapshotException
> -------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-6153
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-6153
>             Project: Phoenix
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 4.15.0, 4.14.3, master
>            Reporter: Saksham Gangwar
>            Assignee: Saksham Gangwar
>            Priority: Major
>             Fix For: 5.1.0, 4.16.0
>
>         Attachments: PHOENIX-6153.4.x.v1.patch, PHOENIX-6153.master.v1.patch, 
> PHOENIX-6153.master.v2.patch, PHOENIX-6153.master.v3.patch, 
> PHOENIX-6153.master.v4.patch, PHOENIX-6153.master.v5.patch, Screen Shot 
> 2020-09-30 at 4.00.58 AM.png, Screen Shot 2020-09-30 at 4.01.10 AM.png, 
> Screen Shot 2020-09-30 at 4.01.10 AM.png, Screen Shot 2020-09-30 at 4.01.19 
> AM.png, Screen Shot 2020-09-30 at 4.01.19 AM.png, Screen Shot 2020-09-30 at 
> 4.01.19 AM.png, Screen Shot 2020-09-30 at 4.01.34 AM.png, Screen Shot 
> 2020-09-30 at 4.01.52 AM.png, Screen Shot 2020-09-30 at 4.01.52 AM.png
>
>
> Different MR jobs that reach [MapReduceParallelScanGrouper 
> getRegionBoundaries|https://github.com/apache/phoenix/blob/f9e304754bad886344a856dd2565e3f24e345ed2/phoenix-core/src/main/java/org/apache/phoenix/iterate/MapReduceParallelScanGrouper.java#L65]
>  currently share a single configuration, and that shared configuration is 
> used to look up the snapshot name. 
> Example job sequence: the first two jobs run over snapshots and the third 
> job runs over a regular table.
> Printing the hash codes of the objects on entry to: 
> [https://github.com/apache/phoenix/blob/f9e304754bad886344a856dd2565e3f24e345ed2/phoenix-core/src/main/java/org/apache/phoenix/iterate/MapReduceParallelScanGrouper.java#L65]
> *Job 1:* (over snapshot of  *ABC_TABLE_1* and is successful)
> context.getConnection(): 521093916
>  ConnectionQueryServices: 1772519705
>  *Configuration conf: 813285994*
>      conf.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): *ABC_TABLE_1*
>  
> *Job 2:* (over snapshot of *ABC_TABLE_2* and is successful)
> context.getConnection(): 1928017473
>  ConnectionQueryServices: 961279422
>  *Configuration conf: 813285994*
>      conf.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): *ABC_TABLE_2*
>  
> *Job 3:* (over the table *ABC_TABLE_3*, but fails with 
> CorruptedSnapshotException although it has nothing to do with a snapshot)
> context.getConnection(): 28889670
>  ConnectionQueryServices: 424389847
>  *Configuration conf: 813285994*
>      conf.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): *ABC_TABLE_2*
>  
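
The three jobs above can be imitated without a cluster. The following is only a minimal sketch: a plain `Map` stands in for the shared Hadoop `Configuration`, and the class name, the method name, and the key string `demo.snapshot.name` are hypothetical placeholders, not the real value of `PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY`. It illustrates how a key that is only ever set, and never cleared, leaks from job 2 into job 3.

```java
import java.util.HashMap;
import java.util.Map;

public class SharedConfRepro {
    // Hypothetical stand-in for PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY.
    static final String SNAPSHOT_NAME_KEY = "demo.snapshot.name";

    /**
     * Imitates job setup against a configuration shared across jobs.
     * snapshotName is null for a regular (non-snapshot) table job.
     * Returns the snapshot name that later code would read for this job.
     */
    static String snapshotSeenBy(Map<String, String> sharedConf, String snapshotName) {
        if (snapshotName != null) {
            sharedConf.put(SNAPSHOT_NAME_KEY, snapshotName);
        }
        // Nothing clears the key for a regular table job,
        // so the value written by the previous job is still visible here.
        return sharedConf.get(SNAPSHOT_NAME_KEY);
    }

    public static void main(String[] args) {
        Map<String, String> sharedConf = new HashMap<>();
        System.out.println(snapshotSeenBy(sharedConf, "ABC_TABLE_1")); // job 1
        System.out.println(snapshotSeenBy(sharedConf, "ABC_TABLE_2")); // job 2
        System.out.println(snapshotSeenBy(sharedConf, null));          // job 3, regular table
    }
}
```

In this sketch the third call returns `ABC_TABLE_2` even though it was given no snapshot, which mirrors the value printed for Job 3 above.
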
> Exception which we get:
>  [2020:08:18 20:56:17.409] [MigrationRetryPoller-Executor-1] [ERROR] [c.s.hgrate.mapreduce.MapReduceImpl] - Error submitting M/R job for Job 3
>  java.lang.RuntimeException: org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException: Couldn't read snapshot info from:hdfs://.../hbase/.hbase-snapshot/ABC_TABLE_2_1597687413477/.snapshotinfo
>  at org.apache.phoenix.iterate.MapReduceParallelScanGrouper.getRegionBoundaries(MapReduceParallelScanGrouper.java:81) ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.phoenix.iterate.BaseResultIterators.getRegionBoundaries(BaseResultIterators.java:541) ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:893) ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:641) ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.phoenix.iterate.BaseResultIterators.<init>(BaseResultIterators.java:511) ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.phoenix.iterate.ParallelIterators.<init>(ParallelIterators.java:62) ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.phoenix.execute.ScanPlan.newIterator(ScanPlan.java:278) ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:367) ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:218) ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:213) ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.phoenix.mapreduce.PhoenixInputFormat.setupParallelScansWithScanGrouper(PhoenixInputFormat.java:252) ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.phoenix.mapreduce.PhoenixInputFormat.setupParallelScansFromQueryPlan(PhoenixInputFormat.java:235) ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.phoenix.mapreduce.PhoenixInputFormat.generateSplits(PhoenixInputFormat.java:94) ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.phoenix.mapreduce.PhoenixInputFormat.getSplits(PhoenixInputFormat.java:89) ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:301) ~[hadoop-mapreduce-client-core-2.7.7-sfdc-1.0.18.jar:2.7.7-sfdc-1.0.18]
>  at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:318) ~[hadoop-mapreduce-client-core-2.7.7-sfdc-1.0.18.jar:2.7.7-sfdc-1.0.18]
>  at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:196) ~[hadoop-mapreduce-client-core-2.7.7-sfdc-1.0.18.jar:2.7.7-sfdc-1.0.18]
>  at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290) ~[hadoop-mapreduce-client-core-2.7.7-sfdc-1.0.18.jar:2.7.7-sfdc-1.0.18]
>  at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287) ~[hadoop-mapreduce-client-core-2.7.7-sfdc-1.0.18.jar:2.7.7-sfdc-1.0.18]
>  at java.security.AccessController.doPrivileged(Native Method) ~[na:1.8.0_172]
>  at javax.security.auth.Subject.doAs(Subject.java:422) ~[na:1.8.0_172]
>   
>  
>  Change required:
> 1. Because the snapshot name is set in a shared configuration, we also need 
> a mechanism to remove it when a job is not snapshot-based:
> [https://github.com/apache/phoenix/blob/f9e304754bad886344a856dd2565e3f24e345ed2/phoenix-core/src/main/java/org/apache/phoenix/mapreduce/PhoenixInputFormat.java#L210]
>  
>  
>  
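
One possible shape for the required change, shown only as a sketch and not as the content of the attached patches: whenever a job is configured, either set the snapshot name or explicitly remove any name left behind by an earlier job. As in any self-contained illustration, a plain `Map` stands in for the shared Hadoop `Configuration`, and the class name, the method name, and the key string `demo.snapshot.name` are hypothetical. With a real `Configuration`, the removal step would presumably map to `Configuration.unset(String)`.

```java
import java.util.HashMap;
import java.util.Map;

public class SnapshotKeyReset {
    // Hypothetical stand-in for PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY.
    static final String SNAPSHOT_NAME_KEY = "demo.snapshot.name";

    /**
     * Configures a job against a configuration shared across jobs.
     * snapshotName is null for a regular (non-snapshot) table job.
     * Returns the snapshot name that later code would read for this job.
     */
    static String configureJob(Map<String, String> sharedConf, String snapshotName) {
        if (snapshotName != null) {
            sharedConf.put(SNAPSHOT_NAME_KEY, snapshotName);
        } else {
            // Regular table job: drop any name left over from an earlier snapshot job.
            sharedConf.remove(SNAPSHOT_NAME_KEY);
        }
        return sharedConf.get(SNAPSHOT_NAME_KEY);
    }

    public static void main(String[] args) {
        Map<String, String> sharedConf = new HashMap<>();
        System.out.println(configureJob(sharedConf, "ABC_TABLE_1")); // job 1, snapshot
        System.out.println(configureJob(sharedConf, "ABC_TABLE_2")); // job 2, snapshot
        System.out.println(configureJob(sharedConf, null));          // job 3, regular table
    }
}
```

With the removal branch in place, the third call returns `null`, so a regular table job would no longer be routed toward a snapshot that belongs to a previous job.
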



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
