[jira] [Updated] (PHOENIX-5860) Throw exception which region is closing or splitting when delete data

2020-08-17 Thread Chao Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Wang updated PHOENIX-5860:
---
Affects Version/s: 4.x

> Throw exception which region is closing or splitting when delete data
> -
>
> Key: PHOENIX-5860
> URL: https://issues.apache.org/jira/browse/PHOENIX-5860
> Project: Phoenix
>  Issue Type: Bug
>  Components: core
>Affects Versions: 4.13.1, 4.x
>Reporter: Chao Wang
>Assignee: Chao Wang
>Priority: Blocker
> Attachments: PHOENIX-5860-4.x.patch, 
> PHOENIX-5860.4.13.x-HBASE.1.3.x.002.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently, deletes are executed server-side by the
> UngroupedAggregateRegionObserver class, which checks the
> isRegionClosingOrSplitting flag. When the flag is true, it throws new
> IOException("Temporarily unable to write from scan because region is closing
> or splitting").
> When a region comes online, the Phoenix coprocessor is initialized with
> isRegionClosingOrSplitting set to false. Before a region split, the flag is
> set to true. But if the split fails, the rollback does not reset the flag to
> false, so from then on every write operation on the region throws
> "Temporarily unable to write from scan because region is closing or
> splitting".
> We should therefore reset isRegionClosingOrSplitting to false in the
> preRollBackSplit hook of UngroupedAggregateRegionObserver.
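> A minimal sketch of that fix (assuming the HBase 1.x RegionObserver hook,
> which the affected 4.x / HBase 1.3 line provides, and the class's existing
> isRegionClosingOrSplitting field):
> {noformat}
> import java.io.IOException;
> import org.apache.hadoop.hbase.coprocessor.ObserverContext;
> import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
>
> // Inside UngroupedAggregateRegionObserver: when a failed split rolls back,
> // the parent region stays online, so clear the flag and allow writes again.
> @Override
> public void preRollBackSplit(ObserverContext<RegionCoprocessorEnvironment> ctx)
>         throws IOException {
>     isRegionClosingOrSplitting = false;
> }
> {noformat}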
> A simple test shows the problem: a data table split fails and is rolled back
> successfully, but deletes keep throwing the exception.
>  # Create a data table.
>  # Bulk-load data into the table.
>  # Modify the hbase-server code so that the region split throws an exception
> and then rolls back.
>  # Split the region from the HBase shell.
>  # Check the region server log: the split fails and the rollback succeeds.
>  # Delete data through Phoenix sqlline.py; the delete throws:
> Caused by: java.io.IOException: Temporarily unable to write from scan because region is closing or splitting
> at org.apache.phoenix.coprocessor.UngroupedAggregateRegionObserver.doPostScannerOpen(UngroupedAggregateRegionObserver.java:516)
> at org.apache.phoenix.coprocessor.BaseScannerRegionObserver$RegionScannerHolder.overrideDelegate(BaseScannerRegionObserver.java:245)
> at org.apache.phoenix.coprocessor.BaseScannerRegionObserver$RegionScannerHolder.nextRaw(BaseScannerRegionObserver.java:293)
> at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:2881)
> at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:3082)
> ... 5 more
> at org.apache.phoenix.util.ServerUtil.parseServerException(ServerUtil.java:108)
> at org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:548)
> at org.apache.phoenix.iterate.ConcatResultIterator.getIterators(ConcatResultIterator.java:50)
> at org.apache.phoenix.iterate.ConcatResultIterator.currentIterator(ConcatResultIterator.java:97)
> at org.apache.phoenix.iterate.ConcatResultIterator.next(ConcatResultIterator.java:117)
> at org.apache.phoenix.iterate.BaseGroupedAggregatingResultIterator.next(BaseGroupedAggregatingResultIterator.java:64)
> at org.apache.phoenix.iterate.UngroupedAggregatingResultIterator.next(UngroupedAggregatingResultIterator.java:39)
> at org.apache.phoenix.compile.DeleteCompiler$2.execute(DeleteCompiler.java:498)
> at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:303)
> at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:295)
> at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
> at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:293)
> at org.apache.phoenix.jdbc.PhoenixPreparedStatement.executeUpdate(PhoenixPreparedStatement.java:200)
> at com.huawei.mds.apps.ecidRepeatProcess.EcidProcessCommon$$anonfun$40$$anonfun$apply$19.apply(EcidProcessCommon.scala:2253)
> at com.huawei.mds.apps.ecidRepeatProcess.EcidProcessCommon$$anonfun$40$$anonfun$apply$19.apply(EcidProcessCommon.scala:2249)
> at scala.collection.Iterator$class.foreach(Iterator.scala:893)
> at org.apache.spark.InterruptibleIterator.foreach(InterruptibleIterator.scala:28)
> at com.huawei.mds.apps.ecidRepeatProcess.EcidProcessCommon$$anonfun$40.apply(EcidProcessCommon.scala:2249)
> at com.huawei.mds.apps.ecidRepeatProcess.EcidProcessCommon$$anonfun$40.apply(EcidProcessCommon.scala:2243)
> at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$23.apply(RDD.scala:798)
> at 

[jira] [Updated] (PHOENIX-5860) Throw exception which region is closing or splitting when delete data

2020-08-17 Thread Chao Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Wang updated PHOENIX-5860:
---
Attachment: PHOENIX-5860-4.x.patch


[jira] [Updated] (PHOENIX-5860) Throw exception which region is closing or splitting when delete data

2020-08-17 Thread Chao Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Wang updated PHOENIX-5860:
---
Attachment: (was: PHOENIX-5860-4.x.patch)


[jira] [Updated] (PHOENIX-5860) Throw exception which region is closing or splitting when delete data

2020-08-17 Thread Chao Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Wang updated PHOENIX-5860:
---
Attachment: PHOENIX-5860-4.x.patch


[jira] [Updated] (PHOENIX-5881) Port MaxLookbackAge logic to 5.x

2020-08-17 Thread Geoffrey Jacoby (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Geoffrey Jacoby updated PHOENIX-5881:
-
Attachment: PHOENIX-5881.v2.patch

> Port MaxLookbackAge logic to 5.x
> 
>
> Key: PHOENIX-5881
> URL: https://issues.apache.org/jira/browse/PHOENIX-5881
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Geoffrey Jacoby
>Assignee: Geoffrey Jacoby
>Priority: Blocker
> Fix For: 5.1.0
>
> Attachments: PHOENIX-5881.v1.patch, PHOENIX-5881.v2.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> PHOENIX-5645 wasn't included in the master (5.x) branch because an HBase 2.x 
> change prevented the logic from being useful in the case of deletes, since 
> HBase 2.x no longer allows us to show deleted cells on an SCN query before 
> the point of deletion. Unfortunately, PHOENIX-5645 wound up requiring a lot 
> of follow-up work in the IndexTool and IndexScrutinyTool to deal with its 
> implications, and because of that, the 4.x and 5.x codebases around indexes 
> have diverged a good bit. 
> This work item is to get them back in sync, even though the behavior in the 
> face of deletes will be somewhat different, and so most likely some tests 
> will have to be changed or Ignored. 
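> (For context, an "SCN query" is a Phoenix connection pinned to a past
> timestamp via the CurrentSCN property; a sketch, with illustrative table
> name and timestamp variable:)
> {noformat}
> import java.sql.Connection;
> import java.sql.DriverManager;
> import java.sql.ResultSet;
> import java.util.Properties;
>
> Properties props = new Properties();
> // Pin all reads to a point in time before the delete; on HBase 2.x the
> // deleted cells are no longer visible even at that earlier timestamp.
> props.setProperty("CurrentSCN", Long.toString(timestampBeforeDelete));
> try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost", props)) {
>     ResultSet rs = conn.createStatement().executeQuery("SELECT * FROM MY_TABLE");
> }
> {noformat}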





[jira] [Created] (PHOENIX-6081) Improvements to snapshot based MR input format

2020-08-17 Thread Bharath Vissapragada (Jira)
Bharath Vissapragada created PHOENIX-6081:
-

 Summary: Improvements to snapshot based MR input format
 Key: PHOENIX-6081
 URL: https://issues.apache.org/jira/browse/PHOENIX-6081
 Project: Phoenix
  Issue Type: Improvement
  Components: core
Affects Versions: 4.14.3, 4.15.1, master
Reporter: Bharath Vissapragada


Recently we switched an MR application from scanning live tables to scanning 
snapshots (PHOENIX-3744). We ran into a severe performance issue, which turned 
out to be a correctness issue caused by the generation of overlapping scan 
splits. After some debugging we found that this had already been fixed by 
PHOENIX-4997. Even with that fix, there are quite a few things we could improve 
about the snapshot-based input format. They are listed here; perhaps we can 
break them into subtasks as needed.

- Do not restore the snapshot per map task. Currently we restore the snapshot 
once per map task into a temp directory. For large tables on big clusters, this 
creates a storm of NameNode RPCs. We could instead restore once per job and let 
all the map tasks operate on the same restored snapshot. HBase already does 
this via HBASE-18806; we can do something similar, along the lines of the 
sketch below.
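A rough sketch of the job-setup step, modeled on what HBase's 
TableSnapshotInputFormat does after HBASE-18806 
(RestoreSnapshotHelper.copySnapshotForScanner is the real HBase API; the 
restore-dir config key below is hypothetical):
{noformat}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper;
import org.apache.hadoop.hbase.util.FSUtils;

// Restore the snapshot once at job-setup time into a shared directory; every
// map task then scans the same restored files instead of re-restoring the
// snapshot itself (which hammers the NameNode on large tables).
void restoreSnapshotOncePerJob(Configuration conf, String snapshotName)
        throws IOException {
    Path rootDir = FSUtils.getRootDir(conf);
    FileSystem fs = rootDir.getFileSystem(conf);
    // "phoenix.mr.snapshot.restore.dir" is a made-up key for illustration
    Path restoreDir = new Path(conf.get("phoenix.mr.snapshot.restore.dir"));
    RestoreSnapshotHelper.copySnapshotForScanner(conf, fs, rootDir, restoreDir,
        snapshotName);
}
{noformat}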

- Disable 
[cacheBlocks|https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html#setCacheBlocks-boolean-]
 on the scans generated by the input format. In our experiments the block cache 
consumed a lot of memory in MR jobs; a full table scan gets little benefit from 
it, so disabling it saves a lot of memory.
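The change itself is a one-liner wherever the input format builds its scans 
(Scan.setCacheBlocks is the standard HBase API; placement is illustrative):
{noformat}
import org.apache.hadoop.hbase.client.Scan;

// A full-table MR scan touches each block once, so caching the blocks only
// bloats task memory; for snapshot reads the scanner runs inside the map
// task itself, so the waste is per-task.
Scan scan = new Scan();
scan.setCacheBlocks(false);
{noformat}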

- Short-circuit live-table codepaths when snapshots are enabled. Currently some 
codepaths make live-table HBase RPCs to fetch metadata. For example:
{noformat}
private List<InputSplit> generateSplits(final QueryPlan qplan, Configuration config)
        throws IOException {
    // We must call this in order to initialize the scans and splits from the
    // query plan

    // Get the RegionSizeCalculator -- note this opens a live connection and
    // calls the Admin API even when the job only reads a snapshot
    try (org.apache.hadoop.hbase.client.Connection connection =
            HBaseFactoryProvider.getHConnectionFactory().createConnection(config)) {
        RegionLocator regionLocator =
                connection.getRegionLocator(TableName.valueOf(tableName));
        RegionSizeCalculator sizeCalculator =
                new RegionSizeCalculator(regionLocator, connection.getAdmin());
{noformat}
This defeats the purpose of using snapshots. Refactor the code so that the 
snapshot-based codepaths make minimal HBase RPCs and rely solely on the 
snapshot manifest. Even better, improve the locality of task scheduling based 
on the snapshot's HFile block locations.

- Disable indexes in the query plan when scanning over snapshots. If an 
index-based access path exists, getScans() can potentially return index-based 
splits, which is not what we want for snapshots.
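Until the input format enforces this, a user-level workaround is to force the 
data-table plan with Phoenix's NO_INDEX hint in the input query (table and 
columns below are made up):
{noformat}
// NO_INDEX is a real Phoenix hint: it forces the plan onto the data table,
// so getScans() yields data-table splits rather than index-based ones.
String inputQuery = "SELECT /*+ NO_INDEX */ ID, PAYLOAD FROM MY_TABLE";
{noformat}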





[jira] [Created] (PHOENIX-6080) Add a check to Index Rebuild jobs to check region closing before every inner batch

2020-08-17 Thread Swaroopa Kadam (Jira)
Swaroopa Kadam created PHOENIX-6080:
---

 Summary: Add a check to Index Rebuild jobs to check region closing 
before every inner batch
 Key: PHOENIX-6080
 URL: https://issues.apache.org/jira/browse/PHOENIX-6080
 Project: Phoenix
  Issue Type: Bug
Affects Versions: 4.15.0
Reporter: Swaroopa Kadam
Assignee: Swaroopa Kadam
 Fix For: 5.1.0, 4.16.0


Add a check to Index Rebuild jobs that tests whether the region is closing 
before every inner batch, so that a pending close can acquire the region write 
lock promptly instead of waiting for the whole rebuild.
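A rough sketch of the per-batch check (Region.isClosing() / Region.isClosed() 
are the real HBase accessors; the batch plumbing and commitBatch helper are 
illustrative):
{noformat}
import java.io.IOException;
import java.util.List;
import org.apache.hadoop.hbase.client.Mutation;
import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
import org.apache.hadoop.hbase.regionserver.Region;

// Between inner batches, bail out if the region has started closing so the
// close can acquire the region write lock instead of waiting out the rebuild.
void rebuildInBatches(RegionCoprocessorEnvironment env,
        List<List<Mutation>> batches) throws IOException {
    Region region = env.getRegion();
    for (List<Mutation> batch : batches) {
        if (region.isClosing() || region.isClosed()) {
            throw new IOException("Region "
                + region.getRegionInfo().getRegionNameAsString()
                + " is closing, aborting index rebuild");
        }
        commitBatch(batch); // hypothetical helper that writes one batch
    }
}
{noformat}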


