[jira] [Updated] (PHOENIX-6273) Add support to handle MR Snapshot restore externally

2021-01-21 Thread Saksham Gangwar (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-6273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saksham Gangwar updated PHOENIX-6273:
-
Description: 
Recently we switched an MR application from scanning live tables to scanning 
snapshots (PHOENIX-3744). We ran into a severe performance issue, which turned 
out to be a correctness issue caused by overlapping scan-split generation. 
After some debugging we found that it had already been fixed via PHOENIX-4997. 

We also *need not restore the snapshot per map task*. Currently, we restore the 
snapshot once per map task into a temp directory. For large tables on big 
clusters, this creates a storm of NameNode (NN) RPCs. We can do this once per 
job and let all the map tasks operate on the same restored snapshot. HBase 
already did this via HBASE-18806; we can do something similar. Jira to correct 
this behavior: 
https://issues.apache.org/jira/browse/PHOENIX-6334

*The purpose of this Jira* is to resolve the issue immediately by letting the 
caller decide whether snapshot restore is handled externally or internally on 
the Phoenix side (the buggy approach). 
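
For illustration, a minimal sketch of what the caller-facing switch could look 
like. The property names, the EXTERNAL_SNAPSHOT_RESTORE_KEY constant, and the 
helper class below are assumptions of this sketch, not the actual API added by 
the patch:

{code:java}
// Hypothetical sketch only: the caller restores the snapshot itself and tells
// Phoenix to skip its internal (per-map-task) restore. All key names and the
// helper class are illustrative assumptions, not the committed Phoenix API.
import org.apache.hadoop.conf.Configuration;

public class ExternalRestoreSketch {
    // Illustrative property names; the real ones are defined by this Jira's patch.
    static final String SNAPSHOT_NAME_KEY = "phoenix.mapreduce.snapshot.name";
    static final String RESTORE_DIR_KEY = "phoenix.mapreduce.snapshot.restoredir";
    static final String EXTERNAL_SNAPSHOT_RESTORE_KEY =
            "phoenix.mapreduce.snapshot.restore.external";

    /** Caller has already restored snapshotName into restoredDir, once, outside Phoenix. */
    public static void configureJob(Configuration conf, String snapshotName,
                                    String restoredDir) {
        conf.set(SNAPSHOT_NAME_KEY, snapshotName);
        conf.set(RESTORE_DIR_KEY, restoredDir);
        // Tell Phoenix the restore is handled externally, so it must not
        // restore the snapshot again for every map task.
        conf.setBoolean(EXTERNAL_SNAPSHOT_RESTORE_KEY, true);
    }
}
{code}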

All other performance suggestions here: 
https://issues.apache.org/jira/browse/PHOENIX-6081

  was:
Recently we switched an MR application from scanning live tables to scanning 
snapshots (PHOENIX-3744). We ran into a severe performance issue, which turned 
out to be a correctness issue caused by overlapping scan-split generation. 
After some debugging we found that it had already been fixed via PHOENIX-4997. 

We also *need not restore the snapshot per map task*. Currently, we restore the 
snapshot once per map task into a temp directory. For large tables on big 
clusters, this creates a storm of NN RPCs. We can do this once per job and let 
all the map tasks operate on the same restored snapshot. HBase already did this 
via HBASE-18806; we can do something similar.

The purpose of this Jira is to resolve the issue immediately by letting the 
caller decide whether snapshot restore is handled externally or internally on 
the Phoenix side (the buggy approach). 

All other performance suggestions here: 
https://issues.apache.org/jira/browse/PHOENIX-6081


> Add support to handle MR Snapshot restore externally
> 
>
> Key: PHOENIX-6273
> URL: https://issues.apache.org/jira/browse/PHOENIX-6273
> Project: Phoenix
>  Issue Type: Bug
>  Components: core
>Affects Versions: 5.0.0, 4.14.3
>Reporter: Saksham Gangwar
>Assignee: Saksham Gangwar
>Priority: Major
> Fix For: 5.1.0, 4.16.0
>
>
> Recently we switched an MR application from scanning live tables to scanning 
> snapshots (PHOENIX-3744). We ran into a severe performance issue, which 
> turned out to be a correctness issue caused by overlapping scan-split 
> generation. After some debugging we found that it had already been fixed via 
> PHOENIX-4997. 
> We also *need not restore the snapshot per map task*. Currently, we restore 
> the snapshot once per map task into a temp directory. For large tables on big 
> clusters, this creates a storm of NameNode (NN) RPCs. We can do this once per 
> job and let all the map tasks operate on the same restored snapshot. HBase 
> already did this via HBASE-18806; we can do something similar. Jira to 
> correct this behavior: https://issues.apache.org/jira/browse/PHOENIX-6334
> *The purpose of this Jira* is to resolve the issue immediately by letting the 
> caller decide whether snapshot restore is handled externally or internally on 
> the Phoenix side (the buggy approach). 
> All other performance suggestions here: 
> https://issues.apache.org/jira/browse/PHOENIX-6081



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (PHOENIX-6334) All map tasks should operate on the same restored snapshot

2021-01-21 Thread Saksham Gangwar (Jira)
Saksham Gangwar created PHOENIX-6334:


 Summary: All map tasks should operate on the same restored snapshot
 Key: PHOENIX-6334
 URL: https://issues.apache.org/jira/browse/PHOENIX-6334
 Project: Phoenix
  Issue Type: Bug
  Components: core
Affects Versions: 4.14.3, 5.0.0
Reporter: Saksham Gangwar
 Fix For: 5.1.0, 4.16.0, 4.x


Recently we switched an MR application from scanning live tables to scanning 
snapshots (PHOENIX-3744). We ran into a severe performance issue, which turned 
out to be a correctness issue caused by overlapping scan-split generation. 
After some debugging we found that it had already been fixed via PHOENIX-4997. 

We also *need not restore the snapshot per map task*. The purpose of this Jira 
is to correct that behavior. Currently, we restore the snapshot once per map 
task into a temp directory. For large tables on big clusters, this creates a 
storm of NN RPCs. We can do this once per job and let all the map tasks operate 
on the same restored snapshot. HBase already did this via HBASE-18806; we can 
do something similar.
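
As a rough illustration of the HBASE-18806-style approach, here is a minimal 
sketch (not the eventual Phoenix patch) that restores the snapshot once in the 
job driver using HBase's RestoreSnapshotHelper.copySnapshotForScanner; the 
class, method, and directory-naming scheme are this sketch's own assumptions:

{code:java}
// Minimal sketch (not the eventual Phoenix patch): restore the snapshot once
// per job so that every map task scans the same restored files, instead of
// each task issuing its own burst of NN RPCs for a private restore.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HConstants;
import org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper;

public class JobLevelRestoreSketch {
    /** Restore once in the driver; the returned dir is shared by all map tasks. */
    public static Path restoreOnce(Configuration conf, String snapshotName,
                                   Path tmpRoot) throws Exception {
        Path rootDir = new Path(conf.get(HConstants.HBASE_DIR)); // hbase.rootdir
        FileSystem fs = rootDir.getFileSystem(conf);
        // A unique per-job directory, created a single time in the driver.
        Path restoreDir = new Path(tmpRoot,
                snapshotName + "_" + System.currentTimeMillis());
        RestoreSnapshotHelper.copySnapshotForScanner(
                conf, fs, rootDir, restoreDir, snapshotName);
        return restoreDir; // propagate via the job Configuration to the tasks
    }
}
{code}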

 

All other performance suggestions here: 
https://issues.apache.org/jira/browse/PHOENIX-6081

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (PHOENIX-6273) Add support to handle MR Snapshot restore externally

2021-01-21 Thread Saksham Gangwar (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-6273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saksham Gangwar updated PHOENIX-6273:
-
Description: 
Recently we switched an MR application from scanning live tables to scanning 
snapshots (PHOENIX-3744). We ran into a severe performance issue, which turned 
out to be a correctness issue caused by overlapping scan-split generation. 
After some debugging we found that it had already been fixed via PHOENIX-4997. 

We also *need not restore the snapshot per map task*. Currently, we restore the 
snapshot once per map task into a temp directory. For large tables on big 
clusters, this creates a storm of NN RPCs. We can do this once per job and let 
all the map tasks operate on the same restored snapshot. HBase already did this 
via HBASE-18806; we can do something similar.

The purpose of this Jira is to resolve the issue immediately by letting the 
caller decide whether snapshot restore is handled externally or internally on 
the Phoenix side (the buggy approach). 

All other performance suggestions here: 
https://issues.apache.org/jira/browse/PHOENIX-6081

  was:
Recently we switched an MR application from scanning live tables to scanning 
snapshots (PHOENIX-3744). We ran into a severe performance issue, which turned 
out to be a correctness issue caused by overlapping scan-split generation. 
After some debugging we found that it had already been fixed via PHOENIX-4997. 

We also *need not restore the snapshot per map task*. Currently, we restore the 
snapshot once per map task into a temp directory. For large tables on big 
clusters, this creates a storm of NN RPCs. We can do this once per job and let 
all the map tasks operate on the same restored snapshot. HBase already did this 
via HBASE-18806; we can do something similar.

 

All other performance suggestions here: 
https://issues.apache.org/jira/browse/PHOENIX-6081

Summary: Add support to handle MR Snapshot restore externally  (was: 
All the map tasks should operate on the same restored snapshot)

> Add support to handle MR Snapshot restore externally
> 
>
> Key: PHOENIX-6273
> URL: https://issues.apache.org/jira/browse/PHOENIX-6273
> Project: Phoenix
>  Issue Type: Bug
>  Components: core
>Affects Versions: 5.0.0, 4.14.3
>Reporter: Saksham Gangwar
>Assignee: Saksham Gangwar
>Priority: Major
> Fix For: 5.1.0, 4.16.0
>
>
> Recently we switched an MR application from scanning live tables to scanning 
> snapshots (PHOENIX-3744). We ran into a severe performance issue, which 
> turned out to be a correctness issue caused by overlapping scan-split 
> generation. After some debugging we found that it had already been fixed via 
> PHOENIX-4997. 
> We also *need not restore the snapshot per map task*. Currently, we restore 
> the snapshot once per map task into a temp directory. For large tables on big 
> clusters, this creates a storm of NN RPCs. We can do this once per job and 
> let all the map tasks operate on the same restored snapshot. HBase already 
> did this via HBASE-18806; we can do something similar.
> The purpose of this Jira is to resolve the issue immediately by letting the 
> caller decide whether snapshot restore is handled externally or internally on 
> the Phoenix side (the buggy approach). 
> All other performance suggestions here: 
> https://issues.apache.org/jira/browse/PHOENIX-6081



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (PHOENIX-6273) All the map tasks should operate on the same restored snapshot

2020-12-18 Thread Saksham Gangwar (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-6273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saksham Gangwar updated PHOENIX-6273:
-
Description: 
Recently we switched an MR application from scanning live tables to scanning 
snapshots (PHOENIX-3744). We ran into a severe performance issue, which turned 
out to be a correctness issue caused by overlapping scan-split generation. 
After some debugging we found that it had already been fixed via PHOENIX-4997. 

We also *need not restore the snapshot per map task*. Currently, we restore the 
snapshot once per map task into a temp directory. For large tables on big 
clusters, this creates a storm of NN RPCs. We can do this once per job and let 
all the map tasks operate on the same restored snapshot. HBase already did this 
via HBASE-18806; we can do something similar.

 

All other performance suggestions here: 
https://issues.apache.org/jira/browse/PHOENIX-6081

  was:
Recently we switched an MR application from scanning live tables to scanning 
snapshots (PHOENIX-3744). We ran into a severe performance issue, which turned 
out to be a correctness issue caused by overlapping scan-split generation. 
After some debugging we found that it had already been fixed via PHOENIX-4997. 

We also *need not restore the snapshot per map task*. Currently, we restore the 
snapshot once per map task into a temp directory. For large tables on big 
clusters, this creates a storm of NN RPCs. We can do this once per job and let 
all the map tasks operate on the same restored snapshot. HBase already did this 
via HBASE-18806; we can do something similar.


> All the map tasks should operate on the same restored snapshot
> --
>
> Key: PHOENIX-6273
> URL: https://issues.apache.org/jira/browse/PHOENIX-6273
> Project: Phoenix
>  Issue Type: Bug
>  Components: core
>Affects Versions: 5.0.0, 4.14.3
>Reporter: Saksham Gangwar
>Assignee: Saksham Gangwar
>Priority: Major
> Fix For: 4.x, 4.16.1
>
>
> Recently we switched an MR application from scanning live tables to scanning 
> snapshots (PHOENIX-3744). We ran into a severe performance issue, which 
> turned out to be a correctness issue caused by overlapping scan-split 
> generation. After some debugging we found that it had already been fixed via 
> PHOENIX-4997. 
> We also *need not restore the snapshot per map task*. Currently, we restore 
> the snapshot once per map task into a temp directory. For large tables on big 
> clusters, this creates a storm of NN RPCs. We can do this once per job and 
> let all the map tasks operate on the same restored snapshot. HBase already 
> did this via HBASE-18806; we can do something similar.
>  
> All other performance suggestions here: 
> https://issues.apache.org/jira/browse/PHOENIX-6081



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (PHOENIX-6273) All the map tasks should operate on the same restored snapshot

2020-12-18 Thread Saksham Gangwar (Jira)
Saksham Gangwar created PHOENIX-6273:


 Summary: All the map tasks should operate on the same restored 
snapshot
 Key: PHOENIX-6273
 URL: https://issues.apache.org/jira/browse/PHOENIX-6273
 Project: Phoenix
  Issue Type: Bug
  Components: core
Affects Versions: 4.14.3, 5.0.0
Reporter: Saksham Gangwar
Assignee: Saksham Gangwar
 Fix For: 4.x, 4.16.1


Recently we switched an MR application from scanning live tables to scanning 
snapshots (PHOENIX-3744). We ran into a severe performance issue, which turned 
out to be a correctness issue caused by overlapping scan-split generation. 
After some debugging we found that it had already been fixed via PHOENIX-4997. 

We also *need not restore the snapshot per map task*. Currently, we restore the 
snapshot once per map task into a temp directory. For large tables on big 
clusters, this creates a storm of NN RPCs. We can do this once per job and let 
all the map tasks operate on the same restored snapshot. HBase already did this 
via HBASE-18806; we can do something similar.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (PHOENIX-6153) Table Map Reduce job after a Snapshot based job fails with CorruptedSnapshotException

2020-09-30 Thread Saksham Gangwar (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-6153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saksham Gangwar updated PHOENIX-6153:
-
Attachment: Screen Shot 2020-09-30 at 9.30.06 AM.png

> Table Map Reduce job after a Snapshot based job fails with 
> CorruptedSnapshotException
> -
>
> Key: PHOENIX-6153
> URL: https://issues.apache.org/jira/browse/PHOENIX-6153
> Project: Phoenix
>  Issue Type: Bug
>  Components: core
>Affects Versions: 4.15.0, 4.14.3, master
>Reporter: Saksham Gangwar
>Assignee: Saksham Gangwar
>Priority: Major
> Fix For: 5.1.0, 4.16.0
>
> Attachments: PHOENIX-6153.4.x.v1.patch, PHOENIX-6153.master.v1.patch, 
> PHOENIX-6153.master.v2.patch, PHOENIX-6153.master.v3.patch, 
> PHOENIX-6153.master.v4.patch, PHOENIX-6153.master.v5.patch, Screen Shot 
> 2020-09-30 at 4.00.58 AM.png, Screen Shot 2020-09-30 at 4.01.10 AM.png, 
> Screen Shot 2020-09-30 at 4.01.10 AM.png, Screen Shot 2020-09-30 at 4.01.19 
> AM.png, Screen Shot 2020-09-30 at 4.01.19 AM.png, Screen Shot 2020-09-30 at 
> 4.01.19 AM.png, Screen Shot 2020-09-30 at 4.01.34 AM.png, Screen Shot 
> 2020-09-30 at 4.01.52 AM.png, Screen Shot 2020-09-30 at 4.01.52 AM.png, 
> Screen Shot 2020-09-30 at 9.30.06 AM.png
>
>
> For the different MR job requests that reach [MapReduceParallelScanGrouper 
> getRegionBoundaries|https://github.com/apache/phoenix/blob/f9e304754bad886344a856dd2565e3f24e345ed2/phoenix-core/src/main/java/org/apache/phoenix/iterate/MapReduceParallelScanGrouper.java#L65], 
> we currently use a configuration shared among jobs to figure out snapshot 
> names. 
> Example job sequence: the first two jobs run over snapshots and the third job 
> over a regular table.
> Printing the hashcodes of objects when entering: 
> [https://github.com/apache/phoenix/blob/f9e304754bad886344a856dd2565e3f24e345ed2/phoenix-core/src/main/java/org/apache/phoenix/iterate/MapReduceParallelScanGrouper.java#L65]
> *Job 1:* (over snapshot of  *ABC_TABLE_1* and is successful)
> context.getConnection(): 521093916
>  ConnectionQueryServices: 1772519705
>  *Configuration conf: 813285994*
>      conf.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY):*ABC_TABLE_1*
>  
> *Job 2:* (over snapshot of *ABC_TABLE_2* and is successful)
> context.getConnection(): 1928017473
>  ConnectionQueryServices: 961279422
>  *Configuration conf: 813285994*
>      conf.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): *ABC_TABLE_2*
>  
> *Job 3:* (over the table *ABC_TABLE_3* but fails with 
> CorruptedSnapshotException even though it has nothing to do with snapshots)
> context.getConnection(): 28889670
>  ConnectionQueryServices: 424389847
>  *Configuration: 813285994*
>      conf.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): *ABC_TABLE_2*
>  
> The exception we get:
>  [2020:08:18 20:56:17.409] [MigrationRetryPoller-Executor-1] [ERROR] 
> [c.s.hgrate.mapreduce.MapReduceImpl] - Error submitting M/R job for Job 3
>  java.lang.RuntimeException: 
> org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException: Couldn't read 
> snapshot info 
> from:hdfs://.../hbase/.hbase-snapshot/ABC_TABLE_2_1597687413477/.snapshotinfo
>  at 
> org.apache.phoenix.iterate.MapReduceParallelScanGrouper.getRegionBoundaries(MapReduceParallelScanGrouper.java:81)
>  
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at 
> org.apache.phoenix.iterate.BaseResultIterators.getRegionBoundaries(BaseResultIterators.java:541)
>  
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at 
> org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:893)
>  
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at 
> org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:641)
>  
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at 
> org.apache.phoenix.iterate.BaseResultIterators.<init>(BaseResultIterators.java:511)
>  
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at 
> org.apache.phoenix.iterate.ParallelIterators.<init>(ParallelIterators.java:62)
>  
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.phoenix.execute.ScanPlan.newIterator(ScanPlan.java:278) 
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:367) 
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:218) 
> 

[jira] [Updated] (PHOENIX-6153) Table Map Reduce job after a Snapshot based job fails with CorruptedSnapshotException

2020-09-30 Thread Saksham Gangwar (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-6153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saksham Gangwar updated PHOENIX-6153:
-
Attachment: Screen Shot 2020-09-30 at 4.01.52 AM.png

> Table Map Reduce job after a Snapshot based job fails with 
> CorruptedSnapshotException
> -
>
> Key: PHOENIX-6153
> URL: https://issues.apache.org/jira/browse/PHOENIX-6153
> Project: Phoenix
>  Issue Type: Bug
>  Components: core
>Affects Versions: 4.15.0, 4.14.3, master
>Reporter: Saksham Gangwar
>Assignee: Saksham Gangwar
>Priority: Major
> Fix For: 5.1.0, 4.16.0
>
> Attachments: PHOENIX-6153.4.x.v1.patch, PHOENIX-6153.master.v1.patch, 
> PHOENIX-6153.master.v2.patch, PHOENIX-6153.master.v3.patch, 
> PHOENIX-6153.master.v4.patch, PHOENIX-6153.master.v5.patch, Screen Shot 
> 2020-09-30 at 4.00.58 AM.png, Screen Shot 2020-09-30 at 4.01.10 AM.png, 
> Screen Shot 2020-09-30 at 4.01.10 AM.png, Screen Shot 2020-09-30 at 4.01.19 
> AM.png, Screen Shot 2020-09-30 at 4.01.19 AM.png, Screen Shot 2020-09-30 at 
> 4.01.19 AM.png, Screen Shot 2020-09-30 at 4.01.34 AM.png, Screen Shot 
> 2020-09-30 at 4.01.52 AM.png, Screen Shot 2020-09-30 at 4.01.52 AM.png
>
>
> For the different MR job requests that reach [MapReduceParallelScanGrouper 
> getRegionBoundaries|https://github.com/apache/phoenix/blob/f9e304754bad886344a856dd2565e3f24e345ed2/phoenix-core/src/main/java/org/apache/phoenix/iterate/MapReduceParallelScanGrouper.java#L65], 
> we currently use a configuration shared among jobs to figure out snapshot 
> names. 
> Example job sequence: the first two jobs run over snapshots and the third job 
> over a regular table.
> Printing the hashcodes of objects when entering: 
> [https://github.com/apache/phoenix/blob/f9e304754bad886344a856dd2565e3f24e345ed2/phoenix-core/src/main/java/org/apache/phoenix/iterate/MapReduceParallelScanGrouper.java#L65]
> *Job 1:* (over snapshot of  *ABC_TABLE_1* and is successful)
> context.getConnection(): 521093916
>  ConnectionQueryServices: 1772519705
>  *Configuration conf: 813285994*
>      conf.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY):*ABC_TABLE_1*
>  
> *Job 2:* (over snapshot of *ABC_TABLE_2* and is successful)
> context.getConnection(): 1928017473
>  ConnectionQueryServices: 961279422
>  *Configuration conf: 813285994*
>      conf.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): *ABC_TABLE_2*
>  
> *Job 3:* (over the table *ABC_TABLE_3* but fails with 
> CorruptedSnapshotException even though it has nothing to do with snapshots)
> context.getConnection(): 28889670
>  ConnectionQueryServices: 424389847
>  *Configuration: 813285994*
>      conf.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): *ABC_TABLE_2*
>  
> The exception we get:
>  [2020:08:18 20:56:17.409] [MigrationRetryPoller-Executor-1] [ERROR] 
> [c.s.hgrate.mapreduce.MapReduceImpl] - Error submitting M/R job for Job 3
>  java.lang.RuntimeException: 
> org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException: Couldn't read 
> snapshot info 
> from:hdfs://.../hbase/.hbase-snapshot/ABC_TABLE_2_1597687413477/.snapshotinfo
>  at 
> org.apache.phoenix.iterate.MapReduceParallelScanGrouper.getRegionBoundaries(MapReduceParallelScanGrouper.java:81)
>  
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at 
> org.apache.phoenix.iterate.BaseResultIterators.getRegionBoundaries(BaseResultIterators.java:541)
>  
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at 
> org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:893)
>  
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at 
> org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:641)
>  
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at 
> org.apache.phoenix.iterate.BaseResultIterators.<init>(BaseResultIterators.java:511)
>  
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at 
> org.apache.phoenix.iterate.ParallelIterators.<init>(ParallelIterators.java:62)
>  
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.phoenix.execute.ScanPlan.newIterator(ScanPlan.java:278) 
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:367) 
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:218) 
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SN

[jira] [Updated] (PHOENIX-6153) Table Map Reduce job after a Snapshot based job fails with CorruptedSnapshotException

2020-09-30 Thread Saksham Gangwar (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-6153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saksham Gangwar updated PHOENIX-6153:
-
Attachment: Screen Shot 2020-09-30 at 4.01.19 AM.png

> Table Map Reduce job after a Snapshot based job fails with 
> CorruptedSnapshotException
> -
>
> Key: PHOENIX-6153
> URL: https://issues.apache.org/jira/browse/PHOENIX-6153
> Project: Phoenix
>  Issue Type: Bug
>  Components: core
>Affects Versions: 4.15.0, 4.14.3, master
>Reporter: Saksham Gangwar
>Assignee: Saksham Gangwar
>Priority: Major
> Fix For: 5.1.0, 4.16.0
>
> Attachments: PHOENIX-6153.4.x.v1.patch, PHOENIX-6153.master.v1.patch, 
> PHOENIX-6153.master.v2.patch, PHOENIX-6153.master.v3.patch, 
> PHOENIX-6153.master.v4.patch, PHOENIX-6153.master.v5.patch, Screen Shot 
> 2020-09-30 at 4.00.58 AM.png, Screen Shot 2020-09-30 at 4.01.10 AM.png, 
> Screen Shot 2020-09-30 at 4.01.10 AM.png, Screen Shot 2020-09-30 at 4.01.19 
> AM.png, Screen Shot 2020-09-30 at 4.01.19 AM.png, Screen Shot 2020-09-30 at 
> 4.01.19 AM.png, Screen Shot 2020-09-30 at 4.01.34 AM.png, Screen Shot 
> 2020-09-30 at 4.01.52 AM.png, Screen Shot 2020-09-30 at 4.01.52 AM.png
>
>
> For the different MR job requests that reach [MapReduceParallelScanGrouper 
> getRegionBoundaries|https://github.com/apache/phoenix/blob/f9e304754bad886344a856dd2565e3f24e345ed2/phoenix-core/src/main/java/org/apache/phoenix/iterate/MapReduceParallelScanGrouper.java#L65], 
> we currently use a configuration shared among jobs to figure out snapshot 
> names. 
> Example job sequence: the first two jobs run over snapshots and the third job 
> over a regular table.
> Printing the hashcodes of objects when entering: 
> [https://github.com/apache/phoenix/blob/f9e304754bad886344a856dd2565e3f24e345ed2/phoenix-core/src/main/java/org/apache/phoenix/iterate/MapReduceParallelScanGrouper.java#L65]
> *Job 1:* (over snapshot of  *ABC_TABLE_1* and is successful)
> context.getConnection(): 521093916
>  ConnectionQueryServices: 1772519705
>  *Configuration conf: 813285994*
>      conf.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY):*ABC_TABLE_1*
>  
> *Job 2:* (over snapshot of *ABC_TABLE_2* and is successful)
> context.getConnection(): 1928017473
>  ConnectionQueryServices: 961279422
>  *Configuration conf: 813285994*
>      conf.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): *ABC_TABLE_2*
>  
> *Job 3:* (over the table *ABC_TABLE_3* but fails with 
> CorruptedSnapshotException even though it has nothing to do with snapshots)
> context.getConnection(): 28889670
>  ConnectionQueryServices: 424389847
>  *Configuration: 813285994*
>      conf.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): *ABC_TABLE_2*
>  
> The exception we get:
>  [2020:08:18 20:56:17.409] [MigrationRetryPoller-Executor-1] [ERROR] 
> [c.s.hgrate.mapreduce.MapReduceImpl] - Error submitting M/R job for Job 3
>  java.lang.RuntimeException: 
> org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException: Couldn't read 
> snapshot info 
> from:hdfs://.../hbase/.hbase-snapshot/ABC_TABLE_2_1597687413477/.snapshotinfo
>  at 
> org.apache.phoenix.iterate.MapReduceParallelScanGrouper.getRegionBoundaries(MapReduceParallelScanGrouper.java:81)
>  
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at 
> org.apache.phoenix.iterate.BaseResultIterators.getRegionBoundaries(BaseResultIterators.java:541)
>  
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at 
> org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:893)
>  
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at 
> org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:641)
>  
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at 
> org.apache.phoenix.iterate.BaseResultIterators.<init>(BaseResultIterators.java:511)
>  
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at 
> org.apache.phoenix.iterate.ParallelIterators.<init>(ParallelIterators.java:62)
>  
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.phoenix.execute.ScanPlan.newIterator(ScanPlan.java:278) 
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:367) 
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:218) 
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SN

[jira] [Updated] (PHOENIX-6153) Table Map Reduce job after a Snapshot based job fails with CorruptedSnapshotException

2020-09-30 Thread Saksham Gangwar (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-6153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saksham Gangwar updated PHOENIX-6153:
-
Attachment: Screen Shot 2020-09-30 at 4.01.19 AM.png

> Table Map Reduce job after a Snapshot based job fails with 
> CorruptedSnapshotException
> -
>
> Key: PHOENIX-6153
> URL: https://issues.apache.org/jira/browse/PHOENIX-6153
> Project: Phoenix
>  Issue Type: Bug
>  Components: core
>Affects Versions: 4.15.0, 4.14.3, master
>Reporter: Saksham Gangwar
>Assignee: Saksham Gangwar
>Priority: Major
> Fix For: 5.1.0, 4.16.0
>
> Attachments: PHOENIX-6153.4.x.v1.patch, PHOENIX-6153.master.v1.patch, 
> PHOENIX-6153.master.v2.patch, PHOENIX-6153.master.v3.patch, 
> PHOENIX-6153.master.v4.patch, PHOENIX-6153.master.v5.patch, Screen Shot 
> 2020-09-30 at 4.00.58 AM.png, Screen Shot 2020-09-30 at 4.01.10 AM.png, 
> Screen Shot 2020-09-30 at 4.01.10 AM.png, Screen Shot 2020-09-30 at 4.01.19 
> AM.png, Screen Shot 2020-09-30 at 4.01.19 AM.png, Screen Shot 2020-09-30 at 
> 4.01.19 AM.png, Screen Shot 2020-09-30 at 4.01.34 AM.png, Screen Shot 
> 2020-09-30 at 4.01.52 AM.png, Screen Shot 2020-09-30 at 4.01.52 AM.png
>
>
> For the different MR job requests that reach [MapReduceParallelScanGrouper 
> getRegionBoundaries|https://github.com/apache/phoenix/blob/f9e304754bad886344a856dd2565e3f24e345ed2/phoenix-core/src/main/java/org/apache/phoenix/iterate/MapReduceParallelScanGrouper.java#L65], 
> we currently use a configuration shared among jobs to figure out snapshot 
> names. 
> Example job sequence: the first two jobs run over snapshots and the third job 
> over a regular table.
> Printing the hashcodes of objects when entering: 
> [https://github.com/apache/phoenix/blob/f9e304754bad886344a856dd2565e3f24e345ed2/phoenix-core/src/main/java/org/apache/phoenix/iterate/MapReduceParallelScanGrouper.java#L65]
> *Job 1:* (over snapshot of  *ABC_TABLE_1* and is successful)
> context.getConnection(): 521093916
>  ConnectionQueryServices: 1772519705
>  *Configuration conf: 813285994*
>      conf.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY):*ABC_TABLE_1*
>  
> *Job 2:* (over snapshot of *ABC_TABLE_2* and is successful)
> context.getConnection(): 1928017473
>  ConnectionQueryServices: 961279422
>  *Configuration conf: 813285994*
>      conf.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): *ABC_TABLE_2*
>  
> *Job 3:* (over the table *ABC_TABLE_3* but fails with 
> CorruptedSnapshotException even though it has nothing to do with snapshots)
> context.getConnection(): 28889670
>  ConnectionQueryServices: 424389847
>  *Configuration: 813285994*
>      conf.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): *ABC_TABLE_2*
>  
> The exception we get:
>  [2020:08:18 20:56:17.409] [MigrationRetryPoller-Executor-1] [ERROR] 
> [c.s.hgrate.mapreduce.MapReduceImpl] - Error submitting M/R job for Job 3
>  java.lang.RuntimeException: 
> org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException: Couldn't read 
> snapshot info 
> from:hdfs://.../hbase/.hbase-snapshot/ABC_TABLE_2_1597687413477/.snapshotinfo
>  at 
> org.apache.phoenix.iterate.MapReduceParallelScanGrouper.getRegionBoundaries(MapReduceParallelScanGrouper.java:81)
>  
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at 
> org.apache.phoenix.iterate.BaseResultIterators.getRegionBoundaries(BaseResultIterators.java:541)
>  
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at 
> org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:893)
>  
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at 
> org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:641)
>  
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at 
> org.apache.phoenix.iterate.BaseResultIterators.<init>(BaseResultIterators.java:511)
>  
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at 
> org.apache.phoenix.iterate.ParallelIterators.<init>(ParallelIterators.java:62)
>  
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.phoenix.execute.ScanPlan.newIterator(ScanPlan.java:278) 
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:367) 
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:218) 
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SN

[jira] [Updated] (PHOENIX-6153) Table Map Reduce job after a Snapshot based job fails with CorruptedSnapshotException

2020-09-30 Thread Saksham Gangwar (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-6153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saksham Gangwar updated PHOENIX-6153:
-
Attachment: Screen Shot 2020-09-30 at 4.01.34 AM.png

> Table Map Reduce job after a Snapshot based job fails with 
> CorruptedSnapshotException
> -
>
> Key: PHOENIX-6153
> URL: https://issues.apache.org/jira/browse/PHOENIX-6153
> Project: Phoenix
>  Issue Type: Bug
>  Components: core
>Affects Versions: 4.15.0, 4.14.3, master
>Reporter: Saksham Gangwar
>Assignee: Saksham Gangwar
>Priority: Major
> Fix For: 5.1.0, 4.16.0
>
> Attachments: PHOENIX-6153.4.x.v1.patch, PHOENIX-6153.master.v1.patch, 
> PHOENIX-6153.master.v2.patch, PHOENIX-6153.master.v3.patch, 
> PHOENIX-6153.master.v4.patch, PHOENIX-6153.master.v5.patch, Screen Shot 
> 2020-09-30 at 4.00.58 AM.png, Screen Shot 2020-09-30 at 4.01.10 AM.png, 
> Screen Shot 2020-09-30 at 4.01.10 AM.png, Screen Shot 2020-09-30 at 4.01.19 
> AM.png, Screen Shot 2020-09-30 at 4.01.19 AM.png, Screen Shot 2020-09-30 at 
> 4.01.19 AM.png, Screen Shot 2020-09-30 at 4.01.34 AM.png, Screen Shot 
> 2020-09-30 at 4.01.52 AM.png, Screen Shot 2020-09-30 at 4.01.52 AM.png
>
>
> For the different MR job requests that reach [MapReduceParallelScanGrouper 
> getRegionBoundaries|https://github.com/apache/phoenix/blob/f9e304754bad886344a856dd2565e3f24e345ed2/phoenix-core/src/main/java/org/apache/phoenix/iterate/MapReduceParallelScanGrouper.java#L65], 
> we currently use a configuration shared among jobs to figure out snapshot 
> names. 
> Example job sequence: the first two jobs run over snapshots and the third job 
> over a regular table.
> Printing the hashcodes of objects when entering: 
> [https://github.com/apache/phoenix/blob/f9e304754bad886344a856dd2565e3f24e345ed2/phoenix-core/src/main/java/org/apache/phoenix/iterate/MapReduceParallelScanGrouper.java#L65]
> *Job 1:* (over snapshot of  *ABC_TABLE_1* and is successful)
> context.getConnection(): 521093916
>  ConnectionQueryServices: 1772519705
>  *Configuration conf: 813285994*
>      conf.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY):*ABC_TABLE_1*
>  
> *Job 2:* (over snapshot of *ABC_TABLE_2* and is successful)
> context.getConnection(): 1928017473
>  ConnectionQueryServices: 961279422
>  *Configuration conf: 813285994*
>      conf.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): *ABC_TABLE_2*
>  
> *Job 3:* (over the table *ABC_TABLE_3* but fails with 
> CorruptedSnapshotException even though it has nothing to do with snapshots)
> context.getConnection(): 28889670
>  ConnectionQueryServices: 424389847
>  *Configuration: 813285994*
>      conf.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): *ABC_TABLE_2*
>  
> The exception we get:
>  [2020:08:18 20:56:17.409] [MigrationRetryPoller-Executor-1] [ERROR] 
> [c.s.hgrate.mapreduce.MapReduceImpl] - Error submitting M/R job for Job 3
>  java.lang.RuntimeException: 
> org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException: Couldn't read 
> snapshot info 
> from:hdfs://.../hbase/.hbase-snapshot/ABC_TABLE_2_1597687413477/.snapshotinfo
>  at 
> org.apache.phoenix.iterate.MapReduceParallelScanGrouper.getRegionBoundaries(MapReduceParallelScanGrouper.java:81)
>  
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at 
> org.apache.phoenix.iterate.BaseResultIterators.getRegionBoundaries(BaseResultIterators.java:541)
>  
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at 
> org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:893)
>  
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at 
> org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:641)
>  
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at 
> org.apache.phoenix.iterate.BaseResultIterators.<init>(BaseResultIterators.java:511)
>  
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at 
> org.apache.phoenix.iterate.ParallelIterators.<init>(ParallelIterators.java:62)
>  
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.phoenix.execute.ScanPlan.newIterator(ScanPlan.java:278) 
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:367) 
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:218) 
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SN

[jira] [Updated] (PHOENIX-6153) Table Map Reduce job after a Snapshot based job fails with CorruptedSnapshotException

2020-09-30 Thread Saksham Gangwar (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-6153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saksham Gangwar updated PHOENIX-6153:
-
Attachment: Screen Shot 2020-09-30 at 4.01.10 AM.png

> Table Map Reduce job after a Snapshot based job fails with 
> CorruptedSnapshotException
> -
>
> Key: PHOENIX-6153
> URL: https://issues.apache.org/jira/browse/PHOENIX-6153
> Project: Phoenix
>  Issue Type: Bug
>  Components: core
>Affects Versions: 4.15.0, 4.14.3, master
>Reporter: Saksham Gangwar
>Assignee: Saksham Gangwar
>Priority: Major
> Fix For: 5.1.0, 4.16.0
>
> Attachments: PHOENIX-6153.4.x.v1.patch, PHOENIX-6153.master.v1.patch, 
> PHOENIX-6153.master.v2.patch, PHOENIX-6153.master.v3.patch, 
> PHOENIX-6153.master.v4.patch, PHOENIX-6153.master.v5.patch, Screen Shot 
> 2020-09-30 at 4.00.58 AM.png, Screen Shot 2020-09-30 at 4.01.10 AM.png, 
> Screen Shot 2020-09-30 at 4.01.10 AM.png, Screen Shot 2020-09-30 at 4.01.19 
> AM.png, Screen Shot 2020-09-30 at 4.01.52 AM.png
>
>
> For the different MR job requests that reach [MapReduceParallelScanGrouper 
> getRegionBoundaries|https://github.com/apache/phoenix/blob/f9e304754bad886344a856dd2565e3f24e345ed2/phoenix-core/src/main/java/org/apache/phoenix/iterate/MapReduceParallelScanGrouper.java#L65], 
> we currently use a configuration shared among jobs to figure out snapshot 
> names. 
> Example job sequence: the first two jobs run over snapshots and the third job 
> over a regular table.
> Printing the hashcodes of objects when entering: 
> [https://github.com/apache/phoenix/blob/f9e304754bad886344a856dd2565e3f24e345ed2/phoenix-core/src/main/java/org/apache/phoenix/iterate/MapReduceParallelScanGrouper.java#L65]
> *Job 1:* (over snapshot of  *ABC_TABLE_1* and is successful)
> context.getConnection(): 521093916
>  ConnectionQueryServices: 1772519705
>  *Configuration conf: 813285994*
>      conf.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY):*ABC_TABLE_1*
>  
> *Job 2:* (over snapshot of *ABC_TABLE_2* and is successful)
> context.getConnection(): 1928017473
>  ConnectionQueryServices: 961279422
>  *Configuration conf: 813285994*
>      conf.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): *ABC_TABLE_2*
>  
> *Job 3:* (over the table *ABC_TABLE_3* but fails with 
> CorruptedSnapshotException even though it has nothing to do with snapshots)
> context.getConnection(): 28889670
>  ConnectionQueryServices: 424389847
>  *Configuration: 813285994*
>      conf.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): *ABC_TABLE_2*
>  
> The exception we get:
>  [2020:08:18 20:56:17.409] [MigrationRetryPoller-Executor-1] [ERROR] 
> [c.s.hgrate.mapreduce.MapReduceImpl] - Error submitting M/R job for Job 3
>  java.lang.RuntimeException: 
> org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException: Couldn't read 
> snapshot info 
> from:hdfs://.../hbase/.hbase-snapshot/ABC_TABLE_2_1597687413477/.snapshotinfo
>  at 
> org.apache.phoenix.iterate.MapReduceParallelScanGrouper.getRegionBoundaries(MapReduceParallelScanGrouper.java:81)
>  
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at 
> org.apache.phoenix.iterate.BaseResultIterators.getRegionBoundaries(BaseResultIterators.java:541)
>  
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at 
> org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:893)
>  
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at 
> org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:641)
>  
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at 
> org.apache.phoenix.iterate.BaseResultIterators.<init>(BaseResultIterators.java:511)
>  
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at 
> org.apache.phoenix.iterate.ParallelIterators.<init>(ParallelIterators.java:62)
>  
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.phoenix.execute.ScanPlan.newIterator(ScanPlan.java:278) 
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:367) 
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:218) 
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:213) 
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9

[jira] [Updated] (PHOENIX-6153) Table Map Reduce job after a Snapshot based job fails with CorruptedSnapshotException

2020-09-30 Thread Saksham Gangwar (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-6153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saksham Gangwar updated PHOENIX-6153:
-
Attachment: Screen Shot 2020-09-30 at 4.00.58 AM.png
Screen Shot 2020-09-30 at 4.01.10 AM.png
Screen Shot 2020-09-30 at 4.01.19 AM.png
Screen Shot 2020-09-30 at 4.01.52 AM.png

> Table Map Reduce job after a Snapshot based job fails with 
> CorruptedSnapshotException
> -
>
> Key: PHOENIX-6153
> URL: https://issues.apache.org/jira/browse/PHOENIX-6153
> Project: Phoenix
>  Issue Type: Bug
>  Components: core
>Affects Versions: 4.15.0, 4.14.3, master
>Reporter: Saksham Gangwar
>Assignee: Saksham Gangwar
>Priority: Major
> Fix For: 5.1.0, 4.16.0
>
> Attachments: PHOENIX-6153.4.x.v1.patch, PHOENIX-6153.master.v1.patch, 
> PHOENIX-6153.master.v2.patch, PHOENIX-6153.master.v3.patch, 
> PHOENIX-6153.master.v4.patch, PHOENIX-6153.master.v5.patch, Screen Shot 
> 2020-09-30 at 4.00.58 AM.png, Screen Shot 2020-09-30 at 4.01.10 AM.png, 
> Screen Shot 2020-09-30 at 4.01.19 AM.png, Screen Shot 2020-09-30 at 4.01.52 
> AM.png
>
>
> For the different MR job requests that reach [MapReduceParallelScanGrouper 
> getRegionBoundaries|https://github.com/apache/phoenix/blob/f9e304754bad886344a856dd2565e3f24e345ed2/phoenix-core/src/main/java/org/apache/phoenix/iterate/MapReduceParallelScanGrouper.java#L65], 
> we currently use a configuration shared among jobs to figure out snapshot 
> names. 
> Example job sequence: the first two jobs run over snapshots and the third job 
> over a regular table.
> Printing the hashcodes of objects when entering: 
> [https://github.com/apache/phoenix/blob/f9e304754bad886344a856dd2565e3f24e345ed2/phoenix-core/src/main/java/org/apache/phoenix/iterate/MapReduceParallelScanGrouper.java#L65]
> *Job 1:* (over snapshot of  *ABC_TABLE_1* and is successful)
> context.getConnection(): 521093916
>  ConnectionQueryServices: 1772519705
>  *Configuration conf: 813285994*
>      conf.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY):*ABC_TABLE_1*
>  
> *Job 2:* (over snapshot of *ABC_TABLE_2* and is successful)
> context.getConnection(): 1928017473
>  ConnectionQueryServices: 961279422
>  *Configuration conf: 813285994*
>      conf.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): *ABC_TABLE_2*
>  
> *Job 3:* (over the table *ABC_TABLE_3* but fails with 
> CorruptedSnapshotException even though it has nothing to do with snapshots)
> context.getConnection(): 28889670
>  ConnectionQueryServices: 424389847
>  *Configuration: 813285994*
>      conf.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): *ABC_TABLE_2*
>  
> The exception we get:
>  [2020:08:18 20:56:17.409] [MigrationRetryPoller-Executor-1] [ERROR] 
> [c.s.hgrate.mapreduce.MapReduceImpl] - Error submitting M/R job for Job 3
>  java.lang.RuntimeException: 
> org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException: Couldn't read 
> snapshot info 
> from:hdfs://.../hbase/.hbase-snapshot/ABC_TABLE_2_1597687413477/.snapshotinfo
>  at 
> org.apache.phoenix.iterate.MapReduceParallelScanGrouper.getRegionBoundaries(MapReduceParallelScanGrouper.java:81)
>  
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at 
> org.apache.phoenix.iterate.BaseResultIterators.getRegionBoundaries(BaseResultIterators.java:541)
>  
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at 
> org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:893)
>  
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at 
> org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:641)
>  
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at 
> org.apache.phoenix.iterate.BaseResultIterators.<init>(BaseResultIterators.java:511)
>  
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at 
> org.apache.phoenix.iterate.ParallelIterators.<init>(ParallelIterators.java:62)
>  
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.phoenix.execute.ScanPlan.newIterator(ScanPlan.java:278) 
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:367) 
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:218) 
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSH

[jira] [Updated] (PHOENIX-6153) Table Map Reduce job after a Snapshot based job fails with CorruptedSnapshotException

2020-09-24 Thread Saksham Gangwar (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-6153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saksham Gangwar updated PHOENIX-6153:
-
Description: 
For the different MR job requests that reach [MapReduceParallelScanGrouper 
getRegionBoundaries|https://github.com/apache/phoenix/blob/f9e304754bad886344a856dd2565e3f24e345ed2/phoenix-core/src/main/java/org/apache/phoenix/iterate/MapReduceParallelScanGrouper.java#L65], 
we currently use a configuration shared among jobs to figure out snapshot 
names. 

Example job sequence: the first two jobs run over snapshots and the third job 
over a regular table.

Printing the hashcodes of objects when entering: 
[https://github.com/apache/phoenix/blob/f9e304754bad886344a856dd2565e3f24e345ed2/phoenix-core/src/main/java/org/apache/phoenix/iterate/MapReduceParallelScanGrouper.java#L65]

*Job 1:* (over snapshot of  *ABC_TABLE_1* and is successful)

context.getConnection(): 521093916
 ConnectionQueryServices: 1772519705
 *Configuration conf: 813285994*
     conf.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY):*ABC_TABLE_1*

 

*Job 2:* (over snapshot of *ABC_TABLE_2* and is successful)

context.getConnection(): 1928017473
 ConnectionQueryServices: 961279422
 *Configuration conf: 813285994*
     conf.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): *ABC_TABLE_2*

 

*Job 3:* (over the table *ABC_TABLE_3* but fails with 
CorruptedSnapshotException even though it has nothing to do with snapshots)

context.getConnection(): 28889670
 ConnectionQueryServices: 424389847
 *Configuration: 813285994*
     conf.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): *ABC_TABLE_2*
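
The failure mode is easy to reproduce with a plain Hadoop Configuration. A 
minimal, self-contained sketch (the key string below is illustrative, mirroring 
the PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY reads above; the unset call is 
only a caller-side workaround, not the fix in the attached patches):

{code:java}
// Minimal sketch of the bug: a Configuration object shared across job
// submissions keeps the stale snapshot name, so Job 3 (a plain table scan)
// is mistaken for a snapshot-based job. The key string is illustrative.
import org.apache.hadoop.conf.Configuration;

public class SharedConfLeakSketch {
    static final String SNAPSHOT_NAME_KEY = "phoenix.mapreduce.snapshot.name";

    public static void main(String[] args) {
        Configuration shared = new Configuration(false);
        shared.set(SNAPSHOT_NAME_KEY, "ABC_TABLE_2");       // set by snapshot Job 2
        // Job 3 reuses the same Configuration and sees the stale value:
        System.out.println(shared.get(SNAPSHOT_NAME_KEY)); // prints ABC_TABLE_2
        // Clearing the key between jobs works around it from the caller side;
        // the real fix is to resolve the snapshot name from job-scoped state.
        shared.unset(SNAPSHOT_NAME_KEY);
        System.out.println(shared.get(SNAPSHOT_NAME_KEY)); // prints null
    }
}
{code}

With the stale name in place, the snapshot-info lookup for Job 3 fails as shown 
below.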

 

The exception we get:
 [2020:08:18 20:56:17.409] [MigrationRetryPoller-Executor-1] [ERROR] 
[c.s.hgrate.mapreduce.MapReduceImpl] - Error submitting M/R job for Job 3
 java.lang.RuntimeException: 
org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException: Couldn't read 
snapshot info 
from:hdfs://.../hbase/.hbase-snapshot/ABC_TABLE_2_1597687413477/.snapshotinfo
 at 
org.apache.phoenix.iterate.MapReduceParallelScanGrouper.getRegionBoundaries(MapReduceParallelScanGrouper.java:81)
 
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
 at 
org.apache.phoenix.iterate.BaseResultIterators.getRegionBoundaries(BaseResultIterators.java:541)
 
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
 at 
org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:893)
 
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
 at 
org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:641)
 
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
 at 
org.apache.phoenix.iterate.BaseResultIterators.<init>(BaseResultIterators.java:511)
 
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
 at 
org.apache.phoenix.iterate.ParallelIterators.<init>(ParallelIterators.java:62) 
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
 at org.apache.phoenix.execute.ScanPlan.newIterator(ScanPlan.java:278) 
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
 at org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:367) 
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
 at org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:218) 
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
 at org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:213) 
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
 at 
org.apache.phoenix.mapreduce.PhoenixInputFormat.setupParallelScansWithScanGrouper(PhoenixInputFormat.java:252)
 
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
 at 
org.apache.phoenix.mapreduce.PhoenixInputFormat.setupParallelScansFromQueryPlan(PhoenixInputFormat.java:235)
 
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
 at 
org.apache.phoenix.mapreduce.PhoenixInputFormat.generateSplits(PhoenixInputFormat.java:94)
 
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
 at 
org.apache.phoenix.mapreduce.PhoenixInputFormat.getSplits(PhoenixInputFormat.java:89)
 
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
 at 
org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:301) 
~[hadoop-mapreduce-client-core-2.7.7-sfdc-1.0.18.jar:2.7.7-sfdc-1.0.18]
 at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:318) 
~[hadoop-mapreduce-client-core-2.7.7-sfdc-1.0.18.jar:2.7.7-sfdc-1.0

[jira] [Updated] (PHOENIX-6153) Table Map Reduce job after a Snapshot based job fails with CorruptedSnapshotException

2020-09-24 Thread Saksham Gangwar (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-6153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saksham Gangwar updated PHOENIX-6153:
-
Summary: Table Map Reduce job after a Snapshot based job fails with 
CorruptedSnapshotException  (was: Phoenix Table Map Reduce After Snapshot Map 
Reduce fails with Snapshot Corrupt)

> Table Map Reduce job after a Snapshot based job fails with 
> CorruptedSnapshotException
> -
>
> Key: PHOENIX-6153
> URL: https://issues.apache.org/jira/browse/PHOENIX-6153
> Project: Phoenix
>  Issue Type: Bug
>  Components: core
>Affects Versions: 4.14.3, 4.x
>Reporter: Saksham Gangwar
>Assignee: Saksham Gangwar
>Priority: Major
> Fix For: 4.16.0
>
>
> For the different MR job requests that reach [MapReduceParallelScanGrouper 
> getRegionBoundaries|https://github.com/apache/phoenix/blob/f9e304754bad886344a856dd2565e3f24e345ed2/phoenix-core/src/main/java/org/apache/phoenix/iterate/MapReduceParallelScanGrouper.java#L65], 
> we currently use a configuration shared among jobs to figure out snapshot 
> names, which is wrong. 
> Example job sequence: the first two jobs run over snapshots and the third job 
> over a regular table.
> Printing the hashcodes of objects when entering: 
> [https://github.com/apache/phoenix/blob/f9e304754bad886344a856dd2565e3f24e345ed2/phoenix-core/src/main/java/org/apache/phoenix/iterate/MapReduceParallelScanGrouper.java#L65]
> *Job 1:* (over snapshot of  *ABC_TABLE_1* and is successful)
> context.getConnection(): 521093916
>  ConnectionQueryServices: 1772519705
>  *ReadOnlyProps props: 1520403731*
>      props.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): *ABC_TABLE_1*
>  *Configuration conf: 813285994*
>      conf.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY):*ABC_TABLE_1*
>  
> *Job 2:* (over snapshot of *ABC_TABLE_2* and is successful)
> context.getConnection(): 1928017473
>  ConnectionQueryServices: 961279422
>  *ReadOnlyProps props: 1520602316*
>      props.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): *ABC_TABLE_2*
>  *Configuration conf: 813285994*
>      conf.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): *ABC_TABLE_2*
>  
> *Job 3:* (over the table *ABC_TABLE_3* but fails with 
> CorruptedSnapshotException even though it has nothing to do with snapshots)
> context.getConnection(): 28889670
>  ConnectionQueryServices: 424389847
>  *ReadOnlyProps props: 1573377628*
>      props.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): *null*
>  *Configuration: 813285994*
>      conf.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): *ABC_TABLE_2*
>  
> The exception we get:
>  [2020:08:18 20:56:17.409] [MigrationRetryPoller-Executor-1] [ERROR] 
> [c.s.hgrate.mapreduce.MapReduceImpl] - Error submitting M/R job for Job 3
>  java.lang.RuntimeException: 
> org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException: Couldn't read 
> snapshot info 
> from:hdfs://.../hbase/.hbase-snapshot/ABC_TABLE_2_1597687413477/.snapshotinfo
>  at 
> org.apache.phoenix.iterate.MapReduceParallelScanGrouper.getRegionBoundaries(MapReduceParallelScanGrouper.java:81)
>  
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at 
> org.apache.phoenix.iterate.BaseResultIterators.getRegionBoundaries(BaseResultIterators.java:541)
>  
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at 
> org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:893)
>  
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at 
> org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:641)
>  
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at 
> org.apache.phoenix.iterate.BaseResultIterators.<init>(BaseResultIterators.java:511)
>  
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at 
> org.apache.phoenix.iterate.ParallelIterators.<init>(ParallelIterators.java:62)
>  
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.phoenix.execute.ScanPlan.newIterator(ScanPlan.java:278) 
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:367) 
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:218) 
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:213) 
> ~[phoen

[jira] [Updated] (PHOENIX-6153) Phoenix Table Map Reduce After Snapshot Map Reduce fails with Snapshot Corrupt

2020-09-24 Thread Saksham Gangwar (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-6153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saksham Gangwar updated PHOENIX-6153:
-
Summary: Phoenix Table Map Reduce After Snapshot Map Reduce fails with 
Snapshot Corrupt  (was: MapReduceParallelScanGrouper getRegionBoundaries should 
use connection specific properties vs common config)

> Phoenix Table Map Reduce After Snapshot Map Reduce fails with Snapshot Corrupt
> --
>
> Key: PHOENIX-6153
> URL: https://issues.apache.org/jira/browse/PHOENIX-6153
> Project: Phoenix
>  Issue Type: Bug
>  Components: core
>Affects Versions: 4.14.3, 4.x
>Reporter: Saksham Gangwar
>Assignee: Saksham Gangwar
>Priority: Major
> Fix For: 4.16.0
>
>
> For different MR job requests that reach [MapReduceParallelScanGrouper 
> getRegionBoundaries|https://github.com/apache/phoenix/blob/f9e304754bad886344a856dd2565e3f24e345ed2/phoenix-core/src/main/java/org/apache/phoenix/iterate/MapReduceParallelScanGrouper.java#L65], 
> we currently use configuration shared among jobs to figure out the snapshot 
> name, which is wrong. 
> Example job sequence: the first two jobs work over snapshots and the third 
> job over a regular table.
> Printing hashcodes of objects when entering: 
> [https://github.com/apache/phoenix/blob/f9e304754bad886344a856dd2565e3f24e345ed2/phoenix-core/src/main/java/org/apache/phoenix/iterate/MapReduceParallelScanGrouper.java#L65]
> *Job 1:* (over snapshot of  *ABC_TABLE_1* and is successful)
> context.getConnection(): 521093916
>  ConnectionQueryServices: 1772519705
>  *ReadOnlyProps props: 1520403731*
>      props.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): *ABC_TABLE_1*
>  *Configuration conf: 813285994*
>      conf.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): *ABC_TABLE_1*
>  
> *Job 2:* (over snapshot of *ABC_TABLE_2* and is successful)
> context.getConnection(): 1928017473
>  ConnectionQueryServices: 961279422
>  *ReadOnlyProps props: 1520602316*
>      props.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): *ABC_TABLE_2*
>  *Configuration conf: 813285994*
>      conf.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): *ABC_TABLE_2*
>  
> *Job 3:* (over the table *ABC_TABLE_3* but fails with 
> CorruptedSnapshotException even though it has nothing to do with snapshots)
> context.getConnection(): 28889670
>  ConnectionQueryServices: 424389847
>  *ReadOnlyProps props: 1573377628*
>      props.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): *null*
>  *Configuration: 813285994*
>      conf.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): *ABC_TABLE_2*
>  
> Exception which we get:
>  [2020:08:18 20:56:17.409] [MigrationRetryPoller-Executor-1] [ERROR] 
> [c.s.hgrate.mapreduce.MapReduceImpl] - Error submitting M/R job for Job 3
>  java.lang.RuntimeException: 
> org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException: Couldn't read 
> snapshot info 
> from:hdfs://.../hbase/.hbase-snapshot/ABC_TABLE_2_1597687413477/.snapshotinfo
>  at 
> org.apache.phoenix.iterate.MapReduceParallelScanGrouper.getRegionBoundaries(MapReduceParallelScanGrouper.java:81)
>  
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at 
> org.apache.phoenix.iterate.BaseResultIterators.getRegionBoundaries(BaseResultIterators.java:541)
>  
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at 
> org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:893)
>  
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at 
> org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:641)
>  
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at 
> org.apache.phoenix.iterate.BaseResultIterators.<init>(BaseResultIterators.java:511)
>  
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at 
> org.apache.phoenix.iterate.ParallelIterators.<init>(ParallelIterators.java:62)
>  
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.phoenix.execute.ScanPlan.newIterator(ScanPlan.java:278) 
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:367) 
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:218) 
> ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:213) 
> ~[

[jira] [Updated] (PHOENIX-6153) MapReduceParallelScanGrouper getRegionBoundaries should use connection specific properties vs common config

2020-09-22 Thread Saksham Gangwar (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-6153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saksham Gangwar updated PHOENIX-6153:
-
Description: 
For different MR job requests that reach [MapReduceParallelScanGrouper 
getRegionBoundaries|https://github.com/apache/phoenix/blob/f9e304754bad886344a856dd2565e3f24e345ed2/phoenix-core/src/main/java/org/apache/phoenix/iterate/MapReduceParallelScanGrouper.java#L65], 
we currently use configuration shared among jobs to figure out the snapshot 
name, which is wrong.

Example job sequence: the first two jobs work over snapshots and the third job 
over a regular table.

Printing hashcodes of objects when entering: 
[https://github.com/apache/phoenix/blob/f9e304754bad886344a856dd2565e3f24e345ed2/phoenix-core/src/main/java/org/apache/phoenix/iterate/MapReduceParallelScanGrouper.java#L65]

*Job 1:* (over snapshot of  *ABC_TABLE_1* and is successful)

context.getConnection(): 521093916
 ConnectionQueryServices: 1772519705
 *ReadOnlyProps props: 1520403731*
     props.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): *ABC_TABLE_1*
 *Configuration conf: 813285994*
    conf.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): *ABC_TABLE_1*

 

*Job 2:* (over snapshot of *ABC_TABLE_2* and is successful)

context.getConnection(): 1928017473
 ConnectionQueryServices: 961279422
 *ReadOnlyProps props: 1520602316*
     props.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): *ABC_TABLE_2*
 *Configuration conf: 813285994*
     conf.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): *ABC_TABLE_2*

 

*Job 3:* (over the table *ABC_TABLE_3* but fails with 
CorruptedSnapshotException even though it has nothing to do with snapshots)

context.getConnection(): 28889670
 ConnectionQueryServices: 424389847
 *ReadOnlyProps props: 1573377628*
     props.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): *null*
 *Configuration: 813285994*
     conf.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): *ABC_TABLE_2*
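
A hedged sketch of the direction this suggests, as a fragment of 
MapReduceParallelScanGrouper (illustrative only; getRegionLocationsFromManifest 
is an assumed helper, not actual Phoenix code): getRegionBoundaries should read 
the snapshot name from the connection-scoped ReadOnlyProps, which Job 3 
correctly sees as null, rather than from the process-wide Configuration that 
still carries ABC_TABLE_2 from the previous job.

{code:java}
// Sketch: prefer connection-specific props over the shared Configuration.
public List<HRegionLocation> getRegionBoundaries(StatementContext context,
        byte[] tableName) throws SQLException {
    ReadOnlyProps props = context.getConnection().getQueryServices().getProps();
    String snapshotName = props.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY);
    if (snapshotName != null) {
        // Snapshot path: resolve region boundaries from the snapshot manifest.
        return getRegionLocationsFromManifest(snapshotName); // assumed helper
    }
    // Regular-table path: ask the cluster for the live region boundaries.
    return context.getConnection().getQueryServices()
            .getAllTableRegions(tableName);
}
{code}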

 

Exception which we get:
 [2020:08:18 20:56:17.409] [MigrationRetryPoller-Executor-1] [ERROR] 
[c.s.hgrate.mapreduce.MapReduceImpl] - Error submitting M/R job for Job 3
 java.lang.RuntimeException: 
org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException: Couldn't read 
snapshot info 
from:hdfs://.../hbase/.hbase-snapshot/ABC_TABLE_2_1597687413477/.snapshotinfo
 at 
org.apache.phoenix.iterate.MapReduceParallelScanGrouper.getRegionBoundaries(MapReduceParallelScanGrouper.java:81)
 
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
 at 
org.apache.phoenix.iterate.BaseResultIterators.getRegionBoundaries(BaseResultIterators.java:541)
 
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
 at 
org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:893)
 
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
 at 
org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:641)
 
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
 at 
org.apache.phoenix.iterate.BaseResultIterators.<init>(BaseResultIterators.java:511)
 
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
 at 
org.apache.phoenix.iterate.ParallelIterators.<init>(ParallelIterators.java:62) 
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
 at org.apache.phoenix.execute.ScanPlan.newIterator(ScanPlan.java:278) 
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
 at org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:367) 
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
 at org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:218) 
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
 at org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:213) 
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
 at 
org.apache.phoenix.mapreduce.PhoenixInputFormat.setupParallelScansWithScanGrouper(PhoenixInputFormat.java:252)
 
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
 at 
org.apache.phoenix.mapreduce.PhoenixInputFormat.setupParallelScansFromQueryPlan(PhoenixInputFormat.java:235)
 
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
 at 
org.apache.phoenix.mapreduce.PhoenixInputFormat.generateSplits(PhoenixInputFormat.java:94)
 
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
 at 
org.apache.phoenix.mapreduce.PhoenixInputFormat.getSplits(PhoenixInputFormat.java:89)
 
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-

[jira] [Updated] (PHOENIX-6153) MapReduceParallelScanGrouper getRegionBoundaries should use connection specific properties vs common config

2020-09-22 Thread Saksham Gangwar (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-6153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saksham Gangwar updated PHOENIX-6153:
-
Description: 
For different MR job requests that reach [MapReduceParallelScanGrouper 
getRegionBoundaries|https://github.com/apache/phoenix/blob/f9e304754bad886344a856dd2565e3f24e345ed2/phoenix-core/src/main/java/org/apache/phoenix/iterate/MapReduceParallelScanGrouper.java#L65], 
we currently use configuration shared among jobs to figure out the snapshot 
name, which is wrong.

Example job sequence: the first two jobs work over snapshots and the third job 
over a regular table.

Printing hashcodes of objects when entering: 
[https://github.com/apache/phoenix/blob/f9e304754bad886344a856dd2565e3f24e345ed2/phoenix-core/src/main/java/org/apache/phoenix/iterate/MapReduceParallelScanGrouper.java#L65]

*Job 1:* (over snapshot and is successful)

context.getConnection(): 521093916
 ConnectionQueryServices: 1772519705
 *ReadOnlyProps props: 1520403731*
     props.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): *ABC_TABLE_1*
 *Configuration conf: 813285994*
     conf.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): *ABC_TABLE_2*

 

*Job 2:* (over snapshot and is successful)

context.getConnection(): 1928017473
 ConnectionQueryServices: 961279422
 *ReadOnlyProps props: 1520602316*
     props.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): *ABC_TABLE_2*
 *Configuration conf: 813285994*
     conf.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): *ABC_TABLE_2*

 

*Job 3:* (over the table and fails with CorruptedSnapshotException even though 
it has nothing to do with snapshots)

context.getConnection(): 28889670
 ConnectionQueryServices: 424389847
 *ReadOnlyProps props: 1573377628*
     props.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): *null*
 *Configuration: 813285994*
     conf.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): *ABC_TABLE_2*

 

Exception which we get:
 [2020:08:18 20:56:17.409] [MigrationRetryPoller-Executor-1] [ERROR] 
[c.s.hgrate.mapreduce.MapReduceImpl] - Error submitting M/R job for Job 3
 java.lang.RuntimeException: 
org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException: Couldn't read 
snapshot info 
from:hdfs://.../hbase/.hbase-snapshot/ABC_TABLE_2_1597687413477/.snapshotinfo
 at 
org.apache.phoenix.iterate.MapReduceParallelScanGrouper.getRegionBoundaries(MapReduceParallelScanGrouper.java:81)
 
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
 at 
org.apache.phoenix.iterate.BaseResultIterators.getRegionBoundaries(BaseResultIterators.java:541)
 
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
 at 
org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:893)
 
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
 at 
org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:641)
 
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
 at 
org.apache.phoenix.iterate.BaseResultIterators.<init>(BaseResultIterators.java:511)
 
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
 at 
org.apache.phoenix.iterate.ParallelIterators.<init>(ParallelIterators.java:62) 
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
 at org.apache.phoenix.execute.ScanPlan.newIterator(ScanPlan.java:278) 
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
 at org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:367) 
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
 at org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:218) 
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
 at org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:213) 
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
 at 
org.apache.phoenix.mapreduce.PhoenixInputFormat.setupParallelScansWithScanGrouper(PhoenixInputFormat.java:252)
 
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
 at 
org.apache.phoenix.mapreduce.PhoenixInputFormat.setupParallelScansFromQueryPlan(PhoenixInputFormat.java:235)
 
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
 at 
org.apache.phoenix.mapreduce.PhoenixInputFormat.generateSplits(PhoenixInputFormat.java:94)
 
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
 at 
org.apache.phoenix.mapreduce.PhoenixInputFormat.getSplits(PhoenixInputFormat.java:89)
 
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
 at 
org.apache.h

[jira] [Created] (PHOENIX-6153) MapReduceParallelScanGrouper getRegionBoundaries should use connection specific properties vs common config

2020-09-22 Thread Saksham Gangwar (Jira)
Saksham Gangwar created PHOENIX-6153:


 Summary: MapReduceParallelScanGrouper getRegionBoundaries should 
use connection specific properties vs common config
 Key: PHOENIX-6153
 URL: https://issues.apache.org/jira/browse/PHOENIX-6153
 Project: Phoenix
  Issue Type: Bug
  Components: core
Affects Versions: 4.14.3, 4.x
Reporter: Saksham Gangwar
Assignee: Saksham Gangwar
 Fix For: 4.16.0


For different MR job requests that reach [MapReduceParallelScanGrouper 
getRegionBoundaries|https://github.com/apache/phoenix/blob/f9e304754bad886344a856dd2565e3f24e345ed2/phoenix-core/src/main/java/org/apache/phoenix/iterate/MapReduceParallelScanGrouper.java#L65], 
we currently use configuration shared among jobs to figure out the snapshot 
name, which is wrong.

Example job sequence: the first two jobs work over snapshots and the third job 
over a regular table.

Printing hashcodes of objects when entering: 
[https://github.com/apache/phoenix/blob/f9e304754bad886344a856dd2565e3f24e345ed2/phoenix-core/src/main/java/org/apache/phoenix/iterate/MapReduceParallelScanGrouper.java#L65]

*Job 1:* (over snapshot and is successful)

context.getConnection(): 521093916
ConnectionQueryServices: 1772519705
*ReadOnlyProps props: 1520403731*
    props.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): *ABC_TABLE_1*
*Configuration conf: 813285994*
    conf.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): *ABC_TABLE_2*

 

*Job 2:* (over snapshot and is successful)

context.getConnection(): 1928017473
ConnectionQueryServices: 961279422
*ReadOnlyProps props: 1520602316*
    props.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): *ABC_TABLE_2*
*Configuration conf: 813285994*
    conf.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): *ABC_TABLE_2*

 

*Job 3:* (over the table and fails with CorruptedSnapshotException even though 
it has nothing to do with snapshots)

context.getConnection(): 28889670
ConnectionQueryServices: 424389847
*ReadOnlyProps props: 1573377628*
    props.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): *null*
*Configuration: 813285994*
    conf.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): *ABC_TABLE_2*

 

Exception which we get:
[2020:08:18 20:56:17.409] [MigrationRetryPoller-Executor-1] [ERROR] 
[c.s.hgrate.mapreduce.MapReduceImpl] - Error submitting M/R job for Job 3
java.lang.RuntimeException: 
org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException: Couldn't read 
snapshot info 
from:hdfs://.../hbase/.hbase-snapshot/ABC_TABLE_2_1597687413477/.snapshotinfo
at 
org.apache.phoenix.iterate.MapReduceParallelScanGrouper.getRegionBoundaries(MapReduceParallelScanGrouper.java:81)
 
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
at 
org.apache.phoenix.iterate.BaseResultIterators.getRegionBoundaries(BaseResultIterators.java:541)
 
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
at 
org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:893)
 
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
at 
org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:641)
 
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
at 
org.apache.phoenix.iterate.BaseResultIterators.<init>(BaseResultIterators.java:511)
 
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
at 
org.apache.phoenix.iterate.ParallelIterators.<init>(ParallelIterators.java:62) 
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
at org.apache.phoenix.execute.ScanPlan.newIterator(ScanPlan.java:278) 
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
at 
org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:367) 
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
at 
org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:218) 
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
at 
org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:213) 
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
at 
org.apache.phoenix.mapreduce.PhoenixInputFormat.setupParallelScansWithScanGrouper(PhoenixInputFormat.java:252)
 
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
at 
org.apache.phoenix.mapreduce.PhoenixInputFormat.setupParallelScansFromQueryPlan(PhoenixInputFormat.java:235)
 
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
at 
org.apa

[jira] [Updated] (PHOENIX-6078) Remove Internal Phoenix Connections from parent LinkedQueue when closed

2020-08-19 Thread Saksham Gangwar (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-6078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saksham Gangwar updated PHOENIX-6078:
-
Attachment: (was: PHOENIX-6078.4.x-v1.patch)

> Remove Internal Phoenix Connections from parent LinkedQueue when closed
> ---
>
> Key: PHOENIX-6078
> URL: https://issues.apache.org/jira/browse/PHOENIX-6078
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.14.3, 4.x
>Reporter: Saksham Gangwar
>Assignee: Saksham Gangwar
>Priority: Major
> Fix For: 4.16.0
>
> Attachments: PHOENIX-6078.4.x.patch, PHOENIX-6078.4.x.v2.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> In https://issues.apache.org/jira/browse/PHOENIX-5872
> we started maintaining parent-child relationships between Phoenix connections 
> so that child connections can be closed. But after closing those child 
> connections, we should also remove them from the queue that is maintained for 
> this purpose.
> Here:
> [https://github.com/apache/phoenix/blob/affa9e889efcc2ad7dac009a0d294b09447d281e/phoenix-core/src/main/java/org/apache/phoenix/compile/MutatingParallelIteratorFactory.java#L114]
>  
> If they are not removed from the queue and the parent connection is reused, 
> we observe an OOM on the container side during the mapper run.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (PHOENIX-6078) Remove Internal Phoenix Connections from parent LinkedQueue when closed

2020-08-19 Thread Saksham Gangwar (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-6078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saksham Gangwar updated PHOENIX-6078:
-
Attachment: (was: PHOENIX-6078.4.x.patch)

> Remove Internal Phoenix Connections from parent LinkedQueue when closed
> ---
>
> Key: PHOENIX-6078
> URL: https://issues.apache.org/jira/browse/PHOENIX-6078
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.14.3, 4.x
>Reporter: Saksham Gangwar
>Assignee: Saksham Gangwar
>Priority: Major
> Fix For: 4.16.0
>
> Attachments: PHOENIX-6078.4.x.patch, PHOENIX-6078.4.x.v2.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> In https://issues.apache.org/jira/browse/PHOENIX-5872
> we started maintaining parent-child relationships between Phoenix connections 
> so that child connections can be closed. But after closing those child 
> connections, we should also remove them from the queue that is maintained for 
> this purpose.
> Here:
> [https://github.com/apache/phoenix/blob/affa9e889efcc2ad7dac009a0d294b09447d281e/phoenix-core/src/main/java/org/apache/phoenix/compile/MutatingParallelIteratorFactory.java#L114]
>  
> If they are not removed from the queue and the parent connection is reused, 
> we observe an OOM on the container side during the mapper run.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (PHOENIX-6078) Remove Internal Phoenix Connections from parent LinkedQueue when closed

2020-08-19 Thread Saksham Gangwar (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-6078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saksham Gangwar updated PHOENIX-6078:
-
Attachment: (was: PHOENIX-6078.4.x-v1.patch)

> Remove Internal Phoenix Connections from parent LinkedQueue when closed
> ---
>
> Key: PHOENIX-6078
> URL: https://issues.apache.org/jira/browse/PHOENIX-6078
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.14.3, 4.x
>Reporter: Saksham Gangwar
>Assignee: Saksham Gangwar
>Priority: Major
> Fix For: 4.16.0
>
> Attachments: PHOENIX-6078.4.x.patch, PHOENIX-6078.4.x.patch, 
> PHOENIX-6078.4.x.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> In https://issues.apache.org/jira/browse/PHOENIX-5872
> we started maintaining parent-child relationships between Phoenix connections 
> so that child connections can be closed. But after closing those child 
> connections, we should also remove them from the queue that is maintained for 
> this purpose.
> Here:
> [https://github.com/apache/phoenix/blob/affa9e889efcc2ad7dac009a0d294b09447d281e/phoenix-core/src/main/java/org/apache/phoenix/compile/MutatingParallelIteratorFactory.java#L114]
>  
> If they are not removed from the queue and the parent connection is reused, 
> we observe an OOM on the container side during the mapper run.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (PHOENIX-6078) Remove Internal Phoenix Connections from parent LinkedQueue when closed

2020-08-14 Thread Saksham Gangwar (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-6078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saksham Gangwar updated PHOENIX-6078:
-
Attachment: (was: PHOENIX-6078.4.x.patch)

> Remove Internal Phoenix Connections from parent LinkedQueue when closed
> ---
>
> Key: PHOENIX-6078
> URL: https://issues.apache.org/jira/browse/PHOENIX-6078
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.14.3, 4.x
>Reporter: Saksham Gangwar
>Assignee: Saksham Gangwar
>Priority: Major
> Fix For: 4.16.0
>
> Attachments: PHOENIX-6078.4.x.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In https://issues.apache.org/jira/browse/PHOENIX-5872
> we started maintaining parent-child relationships between Phoenix connections 
> so that child connections can be closed. But after closing those child 
> connections, we should also remove them from the queue that is maintained for 
> this purpose.
> Here:
> [https://github.com/apache/phoenix/blob/affa9e889efcc2ad7dac009a0d294b09447d281e/phoenix-core/src/main/java/org/apache/phoenix/compile/MutatingParallelIteratorFactory.java#L114]
>  
> If they are not removed from the queue and the parent connection is reused, 
> we observe an OOM on the container side during the mapper run.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (PHOENIX-6078) Remove Internal Phoenix Connections from parent LinkedQueue when closed

2020-08-14 Thread Saksham Gangwar (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-6078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saksham Gangwar updated PHOENIX-6078:
-
Attachment: PHOENIX-6078.4.x.patch

> Remove Internal Phoenix Connections from parent LinkedQueue when closed
> ---
>
> Key: PHOENIX-6078
> URL: https://issues.apache.org/jira/browse/PHOENIX-6078
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.14.3, 4.x
>Reporter: Saksham Gangwar
>Assignee: Saksham Gangwar
>Priority: Major
> Fix For: 4.16.0
>
> Attachments: PHOENIX-6078.4.x.patch
>
>
> In https://issues.apache.org/jira/browse/PHOENIX-5872
> we started maintaining parent-child relationships between Phoenix connections 
> so that child connections can be closed. But after closing those child 
> connections, we should also remove them from the queue that is maintained for 
> this purpose.
> Here:
> [https://github.com/apache/phoenix/blob/affa9e889efcc2ad7dac009a0d294b09447d281e/phoenix-core/src/main/java/org/apache/phoenix/compile/MutatingParallelIteratorFactory.java#L114]
>  
> If they are not removed from the queue and the parent connection is reused, 
> we observe an OOM on the container side during the mapper run.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (PHOENIX-6079) Handle state of Phoenix Internal Connections gracefully even during runtime exceptions

2020-08-14 Thread Saksham Gangwar (Jira)
Saksham Gangwar created PHOENIX-6079:


 Summary: Handle state of Phoenix Internal Connections gracefully 
even during runtime exceptions
 Key: PHOENIX-6079
 URL: https://issues.apache.org/jira/browse/PHOENIX-6079
 Project: Phoenix
  Issue Type: Bug
Affects Versions: 4.14.3, 4.x
Reporter: Saksham Gangwar
 Fix For: 4.16.0


We made a happy-path fix for handling internal connections in the following 
JIRAs:

https://issues.apache.org/jira/browse/PHOENIX-5872

https://issues.apache.org/jira/browse/PHOENIX-6078

But ideally we need to handle those internal connections gracefully, e.g. with 
a separate reaper thread solely managing the state of these connections, so as 
to avoid connection leaks due to any exceptions.
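
A very rough, hypothetical sketch of such a reaper (every class and method name 
below is made up for illustration; only PhoenixConnection and the standard JDK 
APIs are real):

{code:java}
import java.sql.SQLException;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

import org.apache.phoenix.jdbc.PhoenixConnection;

// Hypothetical: one scheduled thread owns the bookkeeping for internal (child)
// connections, so an exception on the query path can no longer leak them.
class InternalConnectionReaper {
    private final Set<PhoenixConnection> children = ConcurrentHashMap.newKeySet();
    private final ScheduledExecutorService exec =
            Executors.newSingleThreadScheduledExecutor();

    void start() {
        exec.scheduleAtFixedRate(this::sweep, 30, 30, TimeUnit.SECONDS);
    }

    void register(PhoenixConnection child) {
        children.add(child);
    }

    private void sweep() {
        for (PhoenixConnection c : children) {
            try {
                if (c.isClosed()) {
                    children.remove(c); // closed normally; just stop tracking
                } else if (isOrphaned(c)) {
                    c.close();          // parent is gone; close the leaked child
                    children.remove(c);
                }
            } catch (SQLException e) {
                children.remove(c);     // unusable connection; drop our reference
            }
        }
    }

    // Assumed predicate: true when the tracked parent connection has been
    // closed; real logic would consult the parent-child tracking.
    private boolean isOrphaned(PhoenixConnection c) {
        return false; // placeholder
    }
}
{code}

The point is that cleanup would no longer depend on every exception path 
remembering to call close().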

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (PHOENIX-6078) Remove Internal Phoenix Connections from parent LinkedQueue when closed

2020-08-14 Thread Saksham Gangwar (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-6078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saksham Gangwar reassigned PHOENIX-6078:


Assignee: Saksham Gangwar  (was: Daniel Wong)

> Remove Internal Phoenix Connections from parent LinkedQueue when closed
> ---
>
> Key: PHOENIX-6078
> URL: https://issues.apache.org/jira/browse/PHOENIX-6078
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.14.3, 4.x
>Reporter: Saksham Gangwar
>Assignee: Saksham Gangwar
>Priority: Major
> Fix For: 4.16.0
>
>
> In https://issues.apache.org/jira/browse/PHOENIX-5872
> we started maintaining parent-child relationships between Phoenix connections 
> so that child connections can be closed. But after closing those child 
> connections, we should also remove them from the queue that is maintained for 
> this purpose.
> Here:
> [https://github.com/apache/phoenix/blob/affa9e889efcc2ad7dac009a0d294b09447d281e/phoenix-core/src/main/java/org/apache/phoenix/compile/MutatingParallelIteratorFactory.java#L114]
>  
> If they are not removed from the queue and the parent connection is reused, 
> we observe an OOM on the container side during the mapper run.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (PHOENIX-6078) Remove Internal Phoenix Connections from parent LinkedQueue when closed

2020-08-14 Thread Saksham Gangwar (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-6078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saksham Gangwar updated PHOENIX-6078:
-
Description: 
In https://issues.apache.org/jira/browse/PHOENIX-5872

We started maintaining parent-child relationships between Phoenix connections 
so that child connections can be closed. But after closing those child 
connections, we should also remove them from the queue that is maintained for 
this purpose.

Here:

[https://github.com/apache/phoenix/blob/affa9e889efcc2ad7dac009a0d294b09447d281e/phoenix-core/src/main/java/org/apache/phoenix/compile/MutatingParallelIteratorFactory.java#L114]

 

If they are not removed from the queue and the parent connection is reused, we 
observe an OOM on the container side during the mapper run.
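
A minimal sketch of the fix direction described above (getParentConnection and 
getChildConnections are illustrative stand-ins, not the exact Phoenix 
internals):

{code:java}
// Sketch: on close, the child must also be dropped from the parent's queue,
// otherwise a reused parent keeps references to dead children and the mapper
// container eventually runs out of memory.
@Override
public void close() throws SQLException {
    try {
        super.close(); // normal close of this internal connection
    } finally {
        PhoenixConnection parent = getParentConnection(); // assumed accessor
        if (parent != null) {
            parent.getChildConnections().remove(this); // assumed queue accessor
        }
    }
}
{code}

The finally block matters: the child is dropped from the parent's queue even 
when the underlying close fails.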

 

  was:
3-part approach:

1. Don't count internal Phoenix connections toward the client limit.

2. Count internal Phoenix connections toward a newly defined limit.

3. Track parent-child relationships between connections to close those 
connections.


> Remove Internal Phoenix Connections from parent LinkedQueue when closed
> ---
>
> Key: PHOENIX-6078
> URL: https://issues.apache.org/jira/browse/PHOENIX-6078
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.14.3, 4.x
>Reporter: Saksham Gangwar
>Assignee: Daniel Wong
>Priority: Major
> Fix For: 4.16.0
>
>
> In https://issues.apache.org/jira/browse/PHOENIX-5872
> we started maintaining parent-child relationships between Phoenix connections 
> so that child connections can be closed. But after closing those child 
> connections, we should also remove them from the queue that is maintained for 
> this purpose.
> Here:
> [https://github.com/apache/phoenix/blob/affa9e889efcc2ad7dac009a0d294b09447d281e/phoenix-core/src/main/java/org/apache/phoenix/compile/MutatingParallelIteratorFactory.java#L114]
>  
> If they are not removed from the queue and the parent connection is reused, 
> we observe an OOM on the container side during the mapper run.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (PHOENIX-6078) Remove Internal Phoenix Connections from parent LinkedQueue when closed

2020-08-14 Thread Saksham Gangwar (Jira)
Saksham Gangwar created PHOENIX-6078:


 Summary: Remove Internal Phoenix Connections from parent 
LinkedQueue when closed
 Key: PHOENIX-6078
 URL: https://issues.apache.org/jira/browse/PHOENIX-6078
 Project: Phoenix
  Issue Type: Bug
Affects Versions: 4.14.3, 4.x
Reporter: Saksham Gangwar
Assignee: Daniel Wong
 Fix For: 4.16.0


3-part approach:

1. Don't count internal Phoenix connections toward the client limit.

2. Count internal Phoenix connections toward a newly defined limit.

3. Track parent-child relationships between connections to close those 
connections.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (PHOENIX-5407) Adding phoenix side log when throwing Incompatible jars detected between client and server Exception

2019-07-23 Thread Saksham Gangwar (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saksham Gangwar updated PHOENIX-5407:
-
External issue ID:   (was: PHOENIX-3377)
  Description: There are no logs on the Phoenix side when the client calls 
checkClientServerCompatibility; we directly throw the "Incompatible jars 
detected between client and server" exception without logging.  (was: There 
have been scenarios similar to: deleting a tenant-specific view, recreating the 
same tenant-specific view with new columns, and while querying, the query fails 
with an NPE over syscat due to corrupt data. The view column count changed, but 
the Phoenix syscat table did not properly update this info, causing queries 
against the view to always trigger a null pointer exception. So the addition of 
this unit test will help us further debug the exact corruption issue and give 
us confidence in this use case.

Exception Stacktrace:

org.apache.phoenix.exception.PhoenixIOException: 
org.apache.hadoop.hbase.DoNotRetryIOException: VIEW_NAME_ABC: at index 50

at org.apache.phoenix.util.ServerUtil.createIOException(ServerUtil.java:111)

at 
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:566)

at 
org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService.callMethod(MetaDataProtos.java:16267)

at org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:6143)

at 
org.apache.hadoop.hbase.regionserver.HRegionServer.execServiceOnRegion(HRegionServer.java:3552)

at 
org.apache.hadoop.hbase.regionserver.HRegionServer.execService(HRegionServer.java:3534)

at 
org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32496)

at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2213)

at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:104)

at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)

at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)

at java.lang.Thread.run(Thread.java:748)

Caused by: java.lang.NullPointerException: at index 50

at 
com.google.common.collect.ObjectArrays.checkElementNotNull(ObjectArrays.java:191)

at com.google.common.collect.ImmutableList.construct(ImmutableList.java:320)

at com.google.common.collect.ImmutableList.copyOf(ImmutableList.java:290)

at org.apache.phoenix.schema.PTableImpl.init(PTableImpl.java:548)

at org.apache.phoenix.schema.PTableImpl.<init>(PTableImpl.java:421)

at org.apache.phoenix.schema.PTableImpl.makePTable(PTableImpl.java:406)

at 
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:1015)

at 
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.buildTable(MetaDataEndpointImpl.java:578)

at 
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:3220)

at 
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:3167)

at 
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:532)

... 10 more

 

 

Related issue: https://issues.apache.org/jira/browse/PHOENIX-3377)
   Issue Type: Improvement  (was: Bug)

> Adding phoenix side log when throwing Incompatible jars detected between 
> client and server Exception
> 
>
> Key: PHOENIX-5407
> URL: https://issues.apache.org/jira/browse/PHOENIX-5407
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Saksham Gangwar
>Priority: Minor
>
> There are no logs on the Phoenix side when the client calls 
> checkClientServerCompatibility; we directly throw the "Incompatible jars 
> detected between client and server" exception without logging.
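
A hedged sketch of the requested logging (the compatibility check, LOGGER, and 
the version variables are assumptions for illustration, not the actual 
checkClientServerCompatibility code):

{code:java}
// Sketch: log the detected versions before throwing, so the reason for the
// incompatibility is visible in the Phoenix logs, not only on the client.
if (!isCompatible(clientVersion, serverVersion)) { // assumed check
    String msg = "Incompatible jars detected between client and server."
            + " clientVersion=" + clientVersion
            + ", serverVersion=" + serverVersion;
    LOGGER.error(msg); // the missing log line this Jira asks for
    throw new SQLException(msg);
}
{code}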



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (PHOENIX-5407) Adding phoenix side log when throwing Incompatible jars detected between client and server Exception

2019-07-23 Thread Saksham Gangwar (JIRA)
Saksham Gangwar created PHOENIX-5407:


 Summary: Adding phoenix side log when throwing Incompatible jars 
detected between client and server Exception
 Key: PHOENIX-5407
 URL: https://issues.apache.org/jira/browse/PHOENIX-5407
 Project: Phoenix
  Issue Type: Bug
Reporter: Saksham Gangwar


There have been scenarios similar to: deleting a tenant-specific view, 
recreating the same tenant-specific view with new columns, and while querying, 
the query fails with an NPE over syscat due to corrupt data. The view column 
count changed, but the Phoenix syscat table did not properly update this info, 
causing queries against the view to always trigger a null pointer exception. So 
the addition of this unit test will help us further debug the exact corruption 
issue and give us confidence in this use case.

Exception Stacktrace:

org.apache.phoenix.exception.PhoenixIOException: 
org.apache.hadoop.hbase.DoNotRetryIOException: VIEW_NAME_ABC: at index 50

at org.apache.phoenix.util.ServerUtil.createIOException(ServerUtil.java:111)

at 
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:566)

at 
org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService.callMethod(MetaDataProtos.java:16267)

at org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:6143)

at 
org.apache.hadoop.hbase.regionserver.HRegionServer.execServiceOnRegion(HRegionServer.java:3552)

at 
org.apache.hadoop.hbase.regionserver.HRegionServer.execService(HRegionServer.java:3534)

at 
org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32496)

at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2213)

at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:104)

at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)

at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)

at java.lang.Thread.run(Thread.java:748)

Caused by: java.lang.NullPointerException: at index 50

at 
com.google.common.collect.ObjectArrays.checkElementNotNull(ObjectArrays.java:191)

at com.google.common.collect.ImmutableList.construct(ImmutableList.java:320)

at com.google.common.collect.ImmutableList.copyOf(ImmutableList.java:290)

at org.apache.phoenix.schema.PTableImpl.init(PTableImpl.java:548)

at org.apache.phoenix.schema.PTableImpl.<init>(PTableImpl.java:421)

at org.apache.phoenix.schema.PTableImpl.makePTable(PTableImpl.java:406)

at 
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:1015)

at 
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.buildTable(MetaDataEndpointImpl.java:578)

at 
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:3220)

at 
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:3167)

at 
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:532)

... 10 more

 

 

Related issue: https://issues.apache.org/jira/browse/PHOENIX-3377



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (PHOENIX-5278) Add unit test to make sure drop/recreate of tenant view with added columns doesn't corrupt syscat

2019-05-10 Thread Saksham Gangwar (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saksham Gangwar updated PHOENIX-5278:
-
Description: 
There have been scenarios similar to: deleting a tenant-specific view, 
recreating the same tenant-specific view with new columns and while querying 
the query fails with NPE over syscat due to corrupt data. View column count is 
changed but Phoenix syscat table did not properly update this info which 
causing querying the view always trigger null pointer exception. So the 
addition of this unit test will help us further debug the exact issue of 
corruption and give us confidence over this use case.
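
A hedged sketch of the scenario such a unit test would cover (the connection 
URL and the table/view names are made up):

{code:java}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

// Sketch only: tenantUrl is an assumed tenant-specific JDBC URL, e.g.
// "jdbc:phoenix:localhost;TenantId=tenant1"; table/view names are made up.
static void dropAndRecreateTenantView(String tenantUrl) throws Exception {
    try (Connection conn = DriverManager.getConnection(tenantUrl);
            Statement stmt = conn.createStatement()) {
        stmt.execute("CREATE VIEW MY_VIEW AS SELECT * FROM BASE_TABLE");
        stmt.execute("DROP VIEW MY_VIEW");
        stmt.execute("CREATE VIEW MY_VIEW (EXTRA_COL VARCHAR) "
                + "AS SELECT * FROM BASE_TABLE");
        // Must not throw the NullPointerException ("at index 50") seen above.
        try (ResultSet rs = stmt.executeQuery("SELECT EXTRA_COL FROM MY_VIEW")) {
            while (rs.next()) {
                // consume rows; success means syscat metadata stayed consistent
            }
        }
    }
}
{code}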

Exception Stacktrace:

org.apache.phoenix.exception.PhoenixIOException: 
org.apache.hadoop.hbase.DoNotRetryIOException: VIEW_NAME_ABC: at index 50

at org.apache.phoenix.util.ServerUtil.createIOException(ServerUtil.java:111)

at 
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:566)

at 
org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService.callMethod(MetaDataProtos.java:16267)

at org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:6143)

at 
org.apache.hadoop.hbase.regionserver.HRegionServer.execServiceOnRegion(HRegionServer.java:3552)

at 
org.apache.hadoop.hbase.regionserver.HRegionServer.execService(HRegionServer.java:3534)

at 
org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32496)

at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2213)

at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:104)

at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)

at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)

at java.lang.Thread.run(Thread.java:748)

Caused by: java.lang.NullPointerException: at index 50

at 
com.google.common.collect.ObjectArrays.checkElementNotNull(ObjectArrays.java:191)

at com.google.common.collect.ImmutableList.construct(ImmutableList.java:320)

at com.google.common.collect.ImmutableList.copyOf(ImmutableList.java:290)

at org.apache.phoenix.schema.PTableImpl.init(PTableImpl.java:548)

at org.apache.phoenix.schema.PTableImpl.<init>(PTableImpl.java:421)

at org.apache.phoenix.schema.PTableImpl.makePTable(PTableImpl.java:406)

at 
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:1015)

at 
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.buildTable(MetaDataEndpointImpl.java:578)

at 
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:3220)

at 
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:3167)

at 
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:532)

... 10 more

 

 

Related issue: https://issues.apache.org/jira/browse/PHOENIX-3377

  was:
There have been scenarios similar to: deleting a tenant-specific view, 
recreating the same tenant-specific view with new columns, and while querying, 
the query fails with an NPE over syscat due to corrupt data. The view column 
count changed, but the Phoenix syscat table did not properly update this info, 
causing queries against the view to always trigger a null pointer exception. So 
the addition of this unit test will help us further debug the exact corruption 
issue and give us confidence in this use case.

Exception Stacktrace:

org.apache.phoenix.exception.PhoenixIOException: org.apache.hadoop.hbase.DoNotRetryIOException: VIEW_NAME_ABC: at index 50
  at org.apache.phoenix.util.ServerUtil.createIOException(ServerUtil.java:111)
  at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:566)
  at org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService.callMethod(MetaDataProtos.java:16267)
  at org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:6143)
  at org.apache.hadoop.hbase.regionserver.HRegionServer.execServiceOnRegion(HRegionServer.java:3552)
  at org.apache.hadoop.hbase.regionserver.HRegionServer.execService(HRegionServer.java:3534)
  at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32496)
  at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2213)
  at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:104)
  at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
  at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
  at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NullPointerException: at index 50
  at com.google.common.collect.ObjectArrays.checkElementNotNull(ObjectArrays.java:191)
  at com.google.common.collect.ImmutableList.construct(ImmutableList.java:320)
  at com.google.common.collect.ImmutableList.copyOf(ImmutableList.java:290)
  at org.apache.phoenix.schema.PTableImpl.init(PTableImpl.

[jira] [Updated] (PHOENIX-5278) Add unit test to make sure drop/recreate of tenant view with added columns doesn't corrupt syscat

2019-05-10 Thread Saksham Gangwar (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saksham Gangwar updated PHOENIX-5278:
-
Description: 
There have been scenarios similar to the following: a tenant-specific view is 
deleted, the same tenant-specific view is recreated with new columns, and 
queries against the recreated view then fail with an NPE over syscat due to 
corrupt data. The view column count changed, but the Phoenix syscat table did 
not update this information properly, so querying the view always triggers a 
NullPointerException. Adding this unit test will help us further debug the 
exact corruption issue and give us confidence in this use case.

Exception Stacktrace:

org.apache.phoenix.exception.PhoenixIOException: org.apache.hadoop.hbase.DoNotRetryIOException: VIEW_NAME_ABC: at index 50
  at org.apache.phoenix.util.ServerUtil.createIOException(ServerUtil.java:111)
  at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:566)
  at org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService.callMethod(MetaDataProtos.java:16267)
  at org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:6143)
  at org.apache.hadoop.hbase.regionserver.HRegionServer.execServiceOnRegion(HRegionServer.java:3552)
  at org.apache.hadoop.hbase.regionserver.HRegionServer.execService(HRegionServer.java:3534)
  at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32496)
  at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2213)
  at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:104)
  at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
  at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
  at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NullPointerException: at index 50
  at com.google.common.collect.ObjectArrays.checkElementNotNull(ObjectArrays.java:191)
  at com.google.common.collect.ImmutableList.construct(ImmutableList.java:320)
  at com.google.common.collect.ImmutableList.copyOf(ImmutableList.java:290)
  at org.apache.phoenix.schema.PTableImpl.init(PTableImpl.java:548)
  at org.apache.phoenix.schema.PTableImpl.<init>(PTableImpl.java:421)
  at org.apache.phoenix.schema.PTableImpl.makePTable(PTableImpl.java:406)
  at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:1015)
  at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.buildTable(MetaDataEndpointImpl.java:578)
  at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:3220)
  at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:3167)
  at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:532)
  ... 10 more

  was:
There have been scenarios similar to the following: a tenant-specific view is 
deleted, the same tenant-specific view is recreated with new columns, and 
queries against the recreated view then fail with an NPE over syscat due to 
corrupt data. Adding this unit test will help us further debug the exact 
corruption issue and give us confidence in this use case.

Exception Stacktrace:

org.apache.phoenix.exception.PhoenixIOException: org.apache.hadoop.hbase.DoNotRetryIOException: VIEW_NAME_ABC: at index 50
  at org.apache.phoenix.util.ServerUtil.createIOException(ServerUtil.java:111)
  at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:566)
  at org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService.callMethod(MetaDataProtos.java:16267)
  at org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:6143)
  at org.apache.hadoop.hbase.regionserver.HRegionServer.execServiceOnRegion(HRegionServer.java:3552)
  at org.apache.hadoop.hbase.regionserver.HRegionServer.execService(HRegionServer.java:3534)
  at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32496)
  at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2213)
  at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:104)
  at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
  at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
  at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NullPointerException: at index 50
  at com.google.common.collect.ObjectArrays.checkElementNotNull(ObjectArrays.java:191)
  at com.google.common.collect.ImmutableList.construct(ImmutableList.java:320)
  at com.google.common.collect.ImmutableList.copyOf(ImmutableList.java:290)
  at org.apache.phoenix.schema.PTableImpl.init(PTableImpl.java:548)
  at org.apache.phoenix.schema.PTableImpl.<init>(PTableImpl.java:421)
  at org.apache.phoenix.schema.PTableImpl.makePTable(PTableImpl.java:406)
  at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.j

[jira] [Updated] (PHOENIX-5278) Add unit test to make sure drop/recreate of tenant view with added columns doesn't corrupt syscat

2019-05-10 Thread Saksham Gangwar (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saksham Gangwar updated PHOENIX-5278:
-
Description: 
There have been scenarios similar to the following: a tenant-specific view is 
deleted, the same tenant-specific view is recreated with new columns, and 
queries against the recreated view then fail with an NPE over syscat due to 
corrupt data. Adding this unit test will help us further debug the exact 
corruption issue and give us confidence in this use case.

Exception Stacktrace:

org.apache.phoenix.exception.PhoenixIOException: org.apache.hadoop.hbase.DoNotRetryIOException: VIEW_NAME_ABC: at index 50
  at org.apache.phoenix.util.ServerUtil.createIOException(ServerUtil.java:111)
  at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:566)
  at org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService.callMethod(MetaDataProtos.java:16267)
  at org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:6143)
  at org.apache.hadoop.hbase.regionserver.HRegionServer.execServiceOnRegion(HRegionServer.java:3552)
  at org.apache.hadoop.hbase.regionserver.HRegionServer.execService(HRegionServer.java:3534)
  at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32496)
  at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2213)
  at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:104)
  at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
  at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
  at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NullPointerException: at index 50
  at com.google.common.collect.ObjectArrays.checkElementNotNull(ObjectArrays.java:191)
  at com.google.common.collect.ImmutableList.construct(ImmutableList.java:320)
  at com.google.common.collect.ImmutableList.copyOf(ImmutableList.java:290)
  at org.apache.phoenix.schema.PTableImpl.init(PTableImpl.java:548)
  at org.apache.phoenix.schema.PTableImpl.<init>(PTableImpl.java:421)
  at org.apache.phoenix.schema.PTableImpl.makePTable(PTableImpl.java:406)
  at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:1015)
  at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.buildTable(MetaDataEndpointImpl.java:578)
  at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:3220)
  at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:3167)
  at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:532)
  ... 10 more

  was:There have been scenarios similar to the following: a tenant-specific 
view is deleted, the same tenant-specific view is recreated with new columns, 
and queries against the recreated view then fail with an NPE over syscat due 
to corrupt data. Adding this unit test will help us further debug the exact 
corruption issue and give us confidence in this use case.


> Add unit test to make sure drop/recreate of tenant view with added columns 
> doesn't corrupt syscat
> -
>
> Key: PHOENIX-5278
> URL: https://issues.apache.org/jira/browse/PHOENIX-5278
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Saksham Gangwar
>Priority: Minor
>
> There have been scenarios similar to the following: a tenant-specific view 
> is deleted, the same tenant-specific view is recreated with new columns, and 
> queries against the recreated view then fail with an NPE over syscat due to 
> corrupt data. Adding this unit test will help us further debug the exact 
> corruption issue and give us confidence in this use case.
> Exception Stacktrace:
> org.apache.phoenix.exception.PhoenixIOException: org.apache.hadoop.hbase.DoNotRetryIOException: VIEW_NAME_ABC: at index 50
> at org.apache.phoenix.util.ServerUtil.createIOException(ServerUtil.java:111)
> at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:566)
> at org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService.callMethod(MetaDataProtos.java:16267)
> at org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:6143)
> at org.apache.hadoop.hbase.regionserver.HRegionServer.execServiceOnRegion(HRegionServer.java:3552)
> at org.apache.hadoop.hbase.regionserver.HRegionServer.execService(HRegionServer.java:3534)
> at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32496)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2213)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:104)
> at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
> at org.apache.hadoop.hbase.ipc.RpcExec

[jira] [Updated] (PHOENIX-5278) Add unit test to make sure drop/recreate of tenant view with added columns doesn't corrupt syscat

2019-05-10 Thread Saksham Gangwar (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saksham Gangwar updated PHOENIX-5278:
-
Description: There have been scenarios similar to the following: a 
tenant-specific view is deleted, the same tenant-specific view is recreated 
with new columns, and queries against the recreated view then fail with an NPE 
over syscat due to corrupt data. Adding this unit test will help us further 
debug the exact corruption issue and give us confidence in this use case.  
(was: There have been customer scenarios with the following use case: a 
tenant-specific view is deleted, the same tenant-specific view is recreated 
with new columns, and queries against the recreated view then fail with an NPE 
over syscat due to corrupt data. Adding this unit test will help us further 
debug the exact corruption issue and give us confidence in this use case.)

> Add unit test to make sure drop/recreate of tenant view with added columns 
> doesn't corrupt syscat
> -
>
> Key: PHOENIX-5278
> URL: https://issues.apache.org/jira/browse/PHOENIX-5278
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Saksham Gangwar
>Priority: Minor
>
> There have been scenarios similar to the following: a tenant-specific view 
> is deleted, the same tenant-specific view is recreated with new columns, and 
> queries against the recreated view then fail with an NPE over syscat due to 
> corrupt data. Adding this unit test will help us further debug the exact 
> corruption issue and give us confidence in this use case.





[jira] [Created] (PHOENIX-5278) Add unit test to make sure drop/recreate of tenant view with added columns doesn't corrupt syscat

2019-05-10 Thread Saksham Gangwar (JIRA)
Saksham Gangwar created PHOENIX-5278:


 Summary: Add unit test to make sure drop/recreate of tenant view 
with added columns doesn't corrupt syscat
 Key: PHOENIX-5278
 URL: https://issues.apache.org/jira/browse/PHOENIX-5278
 Project: Phoenix
  Issue Type: Bug
Reporter: Saksham Gangwar


There have been customer scenarios with the following use case: a 
tenant-specific view is deleted, the same tenant-specific view is recreated 
with new columns, and queries against the recreated view then fail with an NPE 
over syscat due to corrupt data. Adding this unit test will help us further 
debug the exact corruption issue and give us confidence in this use case.
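
As a rough illustration, a test along these lines could exercise the scenario 
(a sketch only: the JDBC URL and the table/view names are hypothetical, and a 
real test would build on Phoenix's existing integration-test harness rather 
than a main method):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.util.Properties;

    import org.apache.phoenix.util.PhoenixRuntime;

    public class TenantViewDropRecreateSketch {
        public static void main(String[] args) throws Exception {
            String url = "jdbc:phoenix:localhost"; // hypothetical cluster
            Properties tenantProps = new Properties();
            tenantProps.setProperty(PhoenixRuntime.TENANT_ID_ATTRIB, "tenant1");

            // Base multi-tenant table, created over a global connection.
            try (Connection global = DriverManager.getConnection(url)) {
                global.createStatement().execute(
                    "CREATE TABLE BASE_TABLE (TENANT_ID VARCHAR NOT NULL, "
                    + "ID VARCHAR NOT NULL, COL1 VARCHAR "
                    + "CONSTRAINT PK PRIMARY KEY (TENANT_ID, ID)) MULTI_TENANT=true");
            }

            try (Connection tenant = DriverManager.getConnection(url, tenantProps)) {
                // Create, then drop, the tenant-specific view.
                tenant.createStatement().execute(
                    "CREATE VIEW TENANT_VIEW AS SELECT * FROM BASE_TABLE");
                tenant.createStatement().execute("DROP VIEW TENANT_VIEW");

                // Recreate the same view with extra columns, then query it.
                // With the syscat corruption described above, this query
                // failed with NullPointerException: at index <n>.
                tenant.createStatement().execute(
                    "CREATE VIEW TENANT_VIEW (NEW_COL1 VARCHAR, NEW_COL2 VARCHAR) "
                    + "AS SELECT * FROM BASE_TABLE");
                try (ResultSet rs = tenant.createStatement()
                        .executeQuery("SELECT NEW_COL1 FROM TENANT_VIEW")) {
                    while (rs.next()) { /* expect no rows, and no NPE */ }
                }
            }
        }
    }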


