[jira] [Updated] (PHOENIX-5274) ConnectionQueryServiceImpl#ensureNamespaceCreated and ensureTableCreated should use HBase APIs that do not require ADMIN permissions for existence checks

2019-05-14 Thread Thomas D'Silva (JIRA)



Thomas D'Silva updated PHOENIX-5274:

Fix Version/s: (was: 4.14.2)

> ConnectionQueryServiceImpl#ensureNamespaceCreated and ensureTableCreated 
> should use HBase APIs that do not require ADMIN permissions for existence 
> checks
> -
>
> Key: PHOENIX-5274
> URL: https://issues.apache.org/jira/browse/PHOENIX-5274
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 5.0.0, 4.15.0, 4.14.2
>Reporter: Chinmay Kulkarni
>Assignee: Chinmay Kulkarni
>Priority: Major
> Fix For: 5.0.0, 4.15.0
>
>
> [HBASE-22377|https://issues.apache.org/jira/browse/HBASE-22377] will 
> introduce a new API that does not require ADMIN permissions to check the 
> existence of a namespace.
> Currently, CQSI#ensureNamespaceCreated calls 
> HBaseAdmin#getNamespaceDescriptor, which causes a server-side call to 
> AccessController#preGetNamespaceDescriptor. This tries to acquire ADMIN 
> permissions on the namespace. We should ideally use the new API provided by 
> HBASE-22377, which does not require the Phoenix client to have ADMIN 
> permissions on the namespace. We should acquire ADMIN permissions only when 
> the namespace does not already exist and must be created.
> Similarly, CQSI#ensureTableCreated should first check the existence of a 
> table before calling HBaseAdmin#getTableDescriptor, since that call requires 
> CREATE and ADMIN permissions.
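> If the new API lands as an existence/list call that only needs read access, 
> the intended flow might look like this minimal sketch (the helper shape and 
> the use of Admin#listNamespaces(), the call proposed by HBASE-22377, are 
> assumptions, not the final patch):
> {code:java}
> import java.io.IOException;
> import org.apache.hadoop.hbase.NamespaceDescriptor;
> import org.apache.hadoop.hbase.client.Admin;
> 
> // Hypothetical helper (not the final patch): do a non-ADMIN existence
> // check first; the ADMIN-gated createNamespace() runs only on the create path.
> final class NamespaceEnsurer {
>     static void ensureNamespaceCreated(Admin admin, String schemaName)
>             throws IOException {
>         boolean exists = false;
>         // Admin#listNamespaces() (proposed by HBASE-22377) lists namespace
>         // names without triggering the ADMIN check performed by
>         // AccessController#preGetNamespaceDescriptor.
>         for (String ns : admin.listNamespaces()) {
>             if (ns.equals(schemaName)) {
>                 exists = true;
>                 break;
>             }
>         }
>         if (!exists) {
>             // Creating the namespace legitimately needs ADMIN; that cost is
>             // paid only when the namespace is actually missing.
>             admin.createNamespace(NamespaceDescriptor.create(schemaName).build());
>         }
>     }
> }
> {code}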





[jira] [Resolved] (PHOENIX-5267) With namespaces enabled Phoenix client times out with high loads

2019-05-14 Thread Thomas D'Silva (JIRA)



Thomas D'Silva resolved PHOENIX-5267.
-
Resolution: Fixed

> With namespaces enabled Phoenix client times out with high loads
> 
>
> Key: PHOENIX-5267
> URL: https://issues.apache.org/jira/browse/PHOENIX-5267
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.14.1
>Reporter: Kiran Kumar Maturi
>Priority: Major
> Fix For: 4.14.2
>
>
> Steps to reproduce:
>  * Enable namespaces for Phoenix 4.14.1 and HBase 1.3
>  * Run a high load using the Pherf client with 48 threads
> After some time the client hangs and gives a timeout exception:
> {code:java}
> [pool-1-thread-1] WARN org.apache.phoenix.pherf.workload.WriteWorkload -
> java.util.concurrent.ExecutionException: 
> org.apache.phoenix.exception.PhoenixIOException: callTimeout=120, 
> callDuration=1238263: Call to  failed on local exception: 
> org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=857, 
> waitTime=120001, operationTimeout=12 expired. row '^@TEST^@TABLE' on 
> table 'SYSTEM:CATALOG' at 
> region=SYSTEM:CATALOG,1556024429507.0f80d6de0a002d1421b8fd384e956254., 
> hostname=, seqNum=2
> at java.util.concurrent.FutureTask.report(FutureTask.java:122)
> at java.util.concurrent.FutureTask.get(FutureTask.java:192)
> at 
> org.apache.phoenix.pherf.workload.WriteWorkload.waitForBatches(WriteWorkload.java:239)
> at 
> org.apache.phoenix.pherf.workload.WriteWorkload.exec(WriteWorkload.java:189)
> at 
> org.apache.phoenix.pherf.workload.WriteWorkload.access$100(WriteWorkload.java:56)
> at 
> org.apache.phoenix.pherf.workload.WriteWorkload$1.run(WriteWorkload.java:165)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.phoenix.exception.PhoenixIOException: 
> callTimeout=120, callDuration=1238263: Call to  failed on local 
> exception: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=857, 
> waitTime=120001, operationTimeout=12 expired. row '^@TEST^@TABLE' on 
> table 'SYSTEM:CATALOG' at 
> region=SYSTEM:CATALOG,1556024429507.0f80d6de0a002d1421b8fd384e956254., 
> hostname=, seqNum=2
> at 
> org.apache.phoenix.util.ServerUtil.parseServerException(ServerUtil.java:144)
> at 
> org.apache.phoenix.query.ConnectionQueryServicesImpl.metaDataCoprocessorExec(ConnectionQueryServicesImpl.java:1379)
> at 
> org.apache.phoenix.query.ConnectionQueryServicesImpl.metaDataCoprocessorExec(ConnectionQueryServicesImpl.java:1343)
> at 
> org.apache.phoenix.query.ConnectionQueryServicesImpl.getTable(ConnectionQueryServicesImpl.java:1560)
> at 
> org.apache.phoenix.schema.MetaDataClient.updateCache(MetaDataClient.java:644)
> at 
> org.apache.phoenix.schema.MetaDataClient.updateCache(MetaDataClient.java:538)
> at 
> org.apache.phoenix.schema.MetaDataClient.updateCache(MetaDataClient.java:530)
> at 
> org.apache.phoenix.schema.MetaDataClient.updateCache(MetaDataClient.java:526)
> at 
> org.apache.phoenix.execute.MutationState.validateAndGetServerTimestamp(MutationState.java:755)
> at 
> org.apache.phoenix.execute.MutationState.validateAll(MutationState.java:743)
> at org.apache.phoenix.execute.MutationState.send(MutationState.java:875)
> at org.apache.phoenix.execute.MutationState.send(MutationState.java:1360)
> at 
> org.apache.phoenix.execute.MutationState.commit(MutationState.java:1183)
> at 
> org.apache.phoenix.jdbc.PhoenixConnection$3.call(PhoenixConnection.java:670)
> at 
> org.apache.phoenix.jdbc.PhoenixConnection$3.call(PhoenixConnection.java:666)
> at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
> at 
> org.apache.phoenix.jdbc.PhoenixConnection.commit(PhoenixConnection.java:666)
> at 
> org.apache.phoenix.pherf.workload.WriteWorkload$2.call(WriteWorkload.java:297)
> at 
> org.apache.phoenix.pherf.workload.WriteWorkload$2.call(WriteWorkload.java:256)
> {code}
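> For reference, "enable namespaces" in step 1 refers to Phoenix's standard 
> namespace-mapping switches (phoenix.schema.isNamespaceMappingEnabled and 
> phoenix.schema.mapSystemTablesToNamespace, which must also be set in 
> hbase-site.xml on the servers). A client-side sketch, with an assumed 
> ZooKeeper quorum placeholder:
> {code:java}
> import java.sql.Connection;
> import java.sql.DriverManager;
> import java.util.Properties;
> 
> public class NamespaceEnabledClient {
>     public static void main(String[] args) throws Exception {
>         Properties props = new Properties();
>         // The two standard namespace-mapping switches; the same values must
>         // be configured server-side, or Phoenix rejects the connection with
>         // an inconsistent-namespace-mapping error.
>         props.setProperty("phoenix.schema.isNamespaceMappingEnabled", "true");
>         props.setProperty("phoenix.schema.mapSystemTablesToNamespace", "true");
>         // "zk-quorum" is a placeholder for the cluster's ZooKeeper hosts.
>         try (Connection conn =
>                 DriverManager.getConnection("jdbc:phoenix:zk-quorum", props)) {
>             System.out.println(conn.getMetaData().getDatabaseProductName());
>         }
>     }
> }
> {code}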





[jira] [Updated] (PHOENIX-5269) PhoenixAccessController should use AccessChecker instead of AccessControlClient for permission checks

2019-05-14 Thread Thomas D'Silva (JIRA)



Thomas D'Silva updated PHOENIX-5269:

Fix Version/s: 4.14.2

> PhoenixAccessController should use AccessChecker instead of 
> AccessControlClient for permission checks
> -
>
> Key: PHOENIX-5269
> URL: https://issues.apache.org/jira/browse/PHOENIX-5269
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.14.1, 4.14.2
>Reporter: Andrew Purtell
>Assignee: Kiran Kumar Maturi
>Priority: Critical
> Fix For: 4.14.2
>
> Attachments: PHOENIX-5269-4.14-HBase-1.4.patch, 
> PHOENIX-5269-4.14-HBase-1.4.v1.patch, PHOENIX-5269-4.14-HBase-1.4.v2.patch, 
> PHOENIX-5269.4.14-HBase-1.4.v3.patch, PHOENIX-5269.4.14-HBase-1.4.v4.patch
>
>
> PhoenixAccessController should use AccessChecker instead of 
> AccessControlClient for permission checks. 
> In HBase, every RegionServer's AccessController maintains a local cache of 
> permissions. At startup these caches are initialized from the ACL table. 
> Whenever the ACL table is changed (via grant or revoke), the AC on the ACL 
> table "broadcasts" the change via ZooKeeper, which updates the caches. This 
> is performed and managed by TableAuthManager but is exposed as API by 
> AccessChecker. AccessChecker is the result of a refactor that was committed 
> as far back as branch-1.4, I believe.
> Phoenix implements its own access controller but uses the client API 
> AccessControlClient instead. AccessControlClient neither caches permissions 
> nor uses the ZK-based cache update mechanism, because it is designed for 
> client-side use.
> The use of AccessControlClient instead of AccessChecker is not scalable. 
> Every permissions check triggers a remote RPC to the ACL table, which is 
> generally a single region hosted on a single RegionServer.
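> As a conceptual illustration of why this matters (the class below is a 
> sketch, not HBase's actual TableAuthManager/AccessChecker API): a locally 
> cached checker answers from memory and is refreshed by the ZK broadcast, 
> whereas AccessControlClient pays an RPC to the ACL table on every check.
> {code:java}
> import java.util.Map;
> import java.util.Set;
> import java.util.concurrent.ConcurrentHashMap;
> 
> // Illustrative only: the shape of a locally cached permission checker.
> final class CachedPermissionChecker {
>     // user -> set of "table:action" grants; replaced wholesale on updates.
>     private volatile Map<String, Set<String>> grants = new ConcurrentHashMap<>();
> 
>     // Hot path: a pure in-memory lookup, no RPC to the ACL table.
>     boolean isAllowed(String user, String table, String action) {
>         Set<String> userGrants = grants.get(user);
>         return userGrants != null && userGrants.contains(table + ":" + action);
>     }
> 
>     // Called when the ACL table's AccessController broadcasts a change via
>     // ZooKeeper; a fresh snapshot replaces the cache atomically.
>     void onAclChange(Map<String, Set<String>> freshSnapshot) {
>         grants = new ConcurrentHashMap<>(freshSnapshot);
>     }
> }
> {code}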





[jira] [Created] (PHOENIX-5281) 95%+ of request time spent in Phoenix driver

2019-05-14 Thread Laran Evans (JIRA)
Laran Evans created PHOENIX-5281:


 Summary: 95%+ of request time spent in Phoenix driver
 Key: PHOENIX-5281
 URL: https://issues.apache.org/jira/browse/PHOENIX-5281
 Project: Phoenix
  Issue Type: Bug
 Environment: We're running on Azure using HDInsight for HBase.
Reporter: Laran Evans
 Attachments: Screen Shot 2019-05-14 at 6.14.29 PM.png, Screen Shot 
2019-05-14 at 6.14.29 PM.png, Screen Shot 2019-05-14 at 6.38.39 PM.png

Our application consistently spends 95% of its time in what shows up in New 
Relic as "Reading database result". From what I've read, this usually means 
that the time is being spent in the database driver, which in our case is 
org.apache.phoenix:phoenix-core:4.7.0-HBase-1.1
I have profiled the application and all I see is this:

!Screen Shot 2019-05-14 at 6.14.29 PM.png!

The time is mostly spent in poll().

Here's the line of code from Phoenix:

[https://github.com/apache/phoenix/blob/4.7.0-HBase-1.1/phoenix-core/src/main/java/org/apache/phoenix/job/AbstractRoundRobinQueue.java#L136]

It isn't clear to me why this takes so long.

In the screenshot below, 96% of the response time is stuck in that one 
section. The two SELECT queries themselves appear to be very fast; all the 
time is spent reading the result set.

!Screen Shot 2019-05-14 at 6.38.39 PM.png!

What are some likely reasons why this might be happening? And how can we fix it?

Thanks.





[jira] [Updated] (PHOENIX-5281) 95%+ of request time spent in Phoenix driver

2019-05-14 Thread Laran Evans (JIRA)



Laran Evans updated PHOENIX-5281:
-
Affects Version/s: 4.7.0

> 95%+ of request time spent in Phoenix driver
> 
>
> Key: PHOENIX-5281
> URL: https://issues.apache.org/jira/browse/PHOENIX-5281
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.7.0
> Environment: We're running on Azure using HDInsight for HBase.
>Reporter: Laran Evans
>Priority: Major
> Attachments: Screen Shot 2019-05-14 at 6.14.29 PM.png, Screen Shot 
> 2019-05-14 at 6.14.29 PM.png, Screen Shot 2019-05-14 at 6.38.39 PM.png
>
>
> Our application consistently spends 95% of its time in what shows up in New 
> Relic as "Reading database result". From what I've read, this usually means 
> that the time is being spent in the database driver, which in our case is 
> org.apache.phoenix:phoenix-core:4.7.0-HBase-1.1
> I have profiled the application and all I see is this:
> !Screen Shot 2019-05-14 at 6.14.29 PM.png!
> The time is mostly spent in poll().
> Here's the line of code from Phoenix:
> [https://github.com/apache/phoenix/blob/4.7.0-HBase-1.1/phoenix-core/src/main/java/org/apache/phoenix/job/AbstractRoundRobinQueue.java#L136]
> It isn't clear to me why this takes so long.
> In the screenshot below, 96% of the response time is stuck in that one 
> section. The two SELECT queries themselves appear to be very fast; all the 
> time is spent reading the result set.
> !Screen Shot 2019-05-14 at 6.38.39 PM.png!
> What are some likely reasons why this might be happening? And how can we fix 
> it?
> Thanks.





[jira] [Updated] (PHOENIX-5280) Provide Improvements to Scan on Composite PK where Leading Edge not fully Specified but the edge next columns are in most leading keys

2019-05-14 Thread Daniel Wong (JIRA)



Daniel Wong updated PHOENIX-5280:
-
Description: 
Provide Improvements to Scan on Composite PK where Leading Edge not fully 
Specified but the edge next columns are in most leading keys

Recently a user hit an issue with a composite PK of 2 columns, say 
(organizationId varchar, departmentId varchar).  They want to query all their 
data with a condition where the department is fully qualified, e.g. 
SELECT * FROM TABLE WHERE departmentId='123'.  They also know that 95% of the 
leading-edge organization values contain the qualified trailing edge; however, 
departmentId='123' is less than 5% of the total data in the table.

Based on today's explain plan, this would run a Round Robin Full Scan with a 
filter on departmentId='123'.
One possible approach to avoid a full table scan is to build an index on 
department. Another approach is to construct a new skip-scan-like filter to 
control this scan.  Essentially we could use 1 lookup to find the 
organizationId, then an additional skip scan for the trailing key.  This could 
be triggered with a SQL syntax hint or, in the future, be data driven.

For a given region assume the data looks like this.
||organizationId||departmentId||
|org1|100|
|org4|100|
|org4|101|
|org4|123|
|org5|100|
|org5|123|

First query the initial row in the region.  We get 'org1','100'.  From this we 
can construct the next rows of ['org1','123' - 'org1','123\x0').  After 
processing that block (in our case 0 rows) we would run to the row at or 
greater than nextKey(current organizationId),'123'.  This would give us 
org4,101.  We would then run to the row of 'org4','123'.  Essentially 1 step to 
find the orgId and then a scan of all the departments for that value.  A small 
simulation of this walk follows below.
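
A self-contained simulation of this walk, with a TreeMap standing in for the 
sorted region and organizationId + '\x00' + departmentId as the key encoding 
(all names here are illustrative, not Phoenix internals):

{code:java}
import java.util.Map;
import java.util.TreeMap;

public class LeadingEdgeSkipScanDemo {
    public static void main(String[] args) {
        // The example region: keys are orgId + '\0' + deptId, kept sorted.
        TreeMap<String, String> region = new TreeMap<>();
        for (String[] r : new String[][] {
                {"org1", "100"}, {"org4", "100"}, {"org4", "101"},
                {"org4", "123"}, {"org5", "100"}, {"org5", "123"}}) {
            region.put(r[0] + '\0' + r[1], "row");
        }
        String dept = "123";
        Map.Entry<String, String> probe = region.firstEntry(); // ('org1','100')
        while (probe != null) {
            String org = probe.getKey().substring(0, probe.getKey().indexOf('\0'));
            // Emit the block ['org'+dept, 'org'+dept+'\x00'): every row for
            // this organization with the fully qualified departmentId.
            region.subMap(org + '\0' + dept, org + '\0' + dept + '\0')
                  .forEach((k, v) -> System.out.println(k.replace('\0', ',')));
            // One seek past every key of this org ('\1' sorts above the '\0'
            // separator) to land on the next organizationId's first row.
            probe = region.ceilingEntry(org + '\1');
        }
        // Prints org4,123 and org5,123: one probe per organization plus a
        // range scan of its dept=123 block, instead of a full region scan.
    }
}
{code}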

  was:
Provide Improvements to Scan on Composite PK where Leading Edge not fully 
Specified but the edge next columns are in most leading keys

Recently a user has had an issue where they have a composite pk with 2 columns 
say (organizationId varchar, departmentId varchar).  They want to query all 
their data with a condition where department is fully qualified department.  
Example SELECT * FROM TABLE WHERE  departmentId='123'.  They also know that 95% 
of the organization leading edge contains the qualified trailing edge.  However 
department = '123' is less than 5% of the total data in the table.

Based on the explain plan today for this we would run a Round Robin Full Scan 
with a filter on departmentId='123'.
While one possible approach to not run a full table scan is to build an index 
on department. Another approach could be to construct a new version of a 
skipscan like filter to control this scan.  Essentially we could use 1 lookup 
to find the organizaitonId then additoin skipscan for the trailing key.

For a given region assume the data looks like this.
||organizationId||departmentId||
|org1|100|
|org4|100|
|org4|101|
|org4|123|
|org5|100|
|org5|123|


First query the initial row in the region.  We get 'org1','100'.  From this we 
can construct the next rows of ['org1','123' - 'org1','123\x0').  After 
proessing that block (in our case 0 rows) we would run to the row at or greater 
than  nextKey(current orgnaziationId),'123'.  This would give us org4,101.  We 
would then run to the row of 'org4','123'.  Essentailly 1 step to find the 
orgId and then a scan of all the departments for that value.


> Provide Improvements to Scan on Composite PK where Leading Edge not fully 
> Specified but the edge next columns are in most leading keys
> --
>
> Key: PHOENIX-5280
> URL: https://issues.apache.org/jira/browse/PHOENIX-5280
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Daniel Wong
>Priority: Minor
>
> Provide Improvements to Scan on Composite PK where Leading Edge not fully 
> Specified but the edge next columns are in most leading keys
> Recently a user hit an issue with a composite PK of 2 
> columns, say (organizationId varchar, departmentId varchar).  They want to 
> query all their data with a condition where the department is fully 
> qualified, e.g. SELECT * FROM TABLE WHERE departmentId='123'.  They also 
> know that 95% of the leading-edge organization values contain the qualified 
> trailing edge; however, departmentId='123' is less than 5% of the total data 
> in the table.
> Based on today's explain plan, this would run a Round Robin Full Scan 
> with a filter on departmentId='123'.
>  One possible approach to avoid a full table scan is to build an 
> index on department. Another approach could be to construct a new version of 
> a skipscan like 

[jira] [Updated] (PHOENIX-5280) Provide Improvements to Scan on Composite PK where Leading Edge not fully Specified but the edge next columns are in most leading keys

2019-05-14 Thread Daniel Wong (JIRA)



Daniel Wong updated PHOENIX-5280:
-
Description: 
Provide Improvements to Scan on Composite PK where Leading Edge not fully 
Specified but the edge next columns are in most leading keys

Recently a user hit an issue with a composite PK of 2 columns, say 
(organizationId varchar, departmentId varchar).  They want to query all their 
data with a condition where the department is fully qualified, e.g. 
SELECT * FROM TABLE WHERE departmentId='123'.  They also know that 95% of the 
leading-edge organization values contain the qualified trailing edge; however, 
departmentId='123' is less than 5% of the total data in the table.

Based on today's explain plan, this would run a Round Robin Full Scan with a 
filter on departmentId='123'.
One possible approach to avoid a full table scan is to build an index on 
department. Another approach is to construct a new skip-scan-like filter to 
control this scan.  Essentially we could use 1 lookup to find the 
organizationId, then an additional skip scan for the trailing key.  This could 
be triggered with a SQL syntax hint or, in the future, be data driven.

For a given region assume the data looks like this.
||organizationId||departmentId||
|org1|100|
|org4|100|
|org4|101|
|org4|123|
|org5|100|
|org5|123|

First query the initial row in the region.  We get 'org1','100'.  From this we 
can construct the next rows of ['org1','123' - 'org1','123\x0').  After 
processing that block (in our case 0 rows) we would run to the row at or 
greater than nextKey(current organizationId),'123'.  This would give us 
org4,101.  We would then run to the row of 'org4','123'.  Essentially 1 step to 
find the orgId and then a scan of all the departments for that value.

  was:
Provide Improvements to Scan on Composite PK where Leading Edge not fully 
Specified but the edge next columns are in most leading keys

Recently a user has had an issue where they have a composite pk with 2 columns 
say (organizationId varchar, departmentId varchar).  They want to query all 
their data with a condition where department is fully qualified department.  
Example SELECT * FROM TABLE WHERE  departmentId='123'.  They also know that 95% 
of the organization leading edge contains the qualified trailing edge.  However 
department = '123' is less than 5% of the total data in the table.

Based on the explain plan today for this we would run a Round Robin Full Scan 
with a filter on departmentId='123'.
 While one possible approach to not run a full table scan is to build an index 
on department. Another approach could be to construct a new version of a 
skipscan like filter to control this scan.  Essentially we could use 1 lookup 
to find the organizationId then additional skipscan for the trailing key.  This 
could be triggered with a sql syntax hint or in the future data driven.

For a given region assume the data looks like this.
||organizationId||departmentId||
|org1|100|
|org4|100|
|org4|101|
|org4|123|
|org5|100|
|org5|123|

First query the initial row in the region.  We get 'org1','100'.  From this we 
can construct the next rows of ['org1','123' - 'org1','123\x0').  After 
proessing that block (in our case 0 rows) we would run to the row at or greater 
than  nextKey(current orgnaziationId),'123'.  This would give us org4,101.  We 
would then run to the row of 'org4','123'.  Essentailly 1 step to find the 
orgId and then a scan of all the departments for that value.


> Provide Improvements to Scan on Composite PK where Leading Edge not fully 
> Specified but the edge next columns are in most leading keys
> --
>
> Key: PHOENIX-5280
> URL: https://issues.apache.org/jira/browse/PHOENIX-5280
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Daniel Wong
>Priority: Minor
>
> Provide Improvements to Scan on Composite PK where Leading Edge not fully 
> Specified but the edge next columns are in most leading keys
> Recently a user hit an issue with a composite PK of 2 
> columns, say (organizationId varchar, departmentId varchar).  They want to 
> query all their data with a condition where the department is fully 
> qualified, e.g. SELECT * FROM TABLE WHERE departmentId='123'.  They also 
> know that 95% of the leading-edge organization values contain the qualified 
> trailing edge; however, departmentId='123' is less than 5% of the total data 
> in the table.
> Based on today's explain plan, this would run a Round Robin Full Scan 
> with a filter on departmentId='123'.
>  One possible approach to avoid a full table scan is to build an 
> index on 

[jira] [Created] (PHOENIX-5280) Provide Improvements to Scan on Composite PK where Leading Edge not fully Specified but the edge next columns are in most leading keys

2019-05-14 Thread Daniel Wong (JIRA)
Daniel Wong created PHOENIX-5280:


 Summary: Provide Improvements to Scan on Composite PK where 
Leading Edge not fully Specified but the edge next columns are in most leading 
keys
 Key: PHOENIX-5280
 URL: https://issues.apache.org/jira/browse/PHOENIX-5280
 Project: Phoenix
  Issue Type: Improvement
Reporter: Daniel Wong


Provide Improvements to Scan on Composite PK where Leading Edge not fully 
Specified but the edge next columns are in most leading keys

Recently a user hit an issue with a composite PK of 2 columns, say 
(organizationId varchar, departmentId varchar).  They want to query all their 
data with a condition where the department is fully qualified, e.g. 
SELECT * FROM TABLE WHERE departmentId='123'.  They also know that 95% of the 
leading-edge organization values contain the qualified trailing edge; however, 
departmentId='123' is less than 5% of the total data in the table.

Based on today's explain plan, this would run a Round Robin Full Scan with a 
filter on departmentId='123'.
One possible approach to avoid a full table scan is to build an index on 
department. Another approach could be to construct a new skip-scan-like filter 
to control this scan.  Essentially we could use 1 lookup to find the 
organizationId, then an additional skip scan for the trailing key.

For a given region assume the data looks like this.
||organizationId||departmentId||
|org1|100|
|org4|100|
|org4|101|
|org4|123|
|org5|100|
|org5|123|


First query the initial row in the region.  We get 'org1','100'.  From this we 
can construct the next rows of ['org1','123' - 'org1','123\x0').  After 
processing that block (in our case 0 rows) we would run to the row at or 
greater than nextKey(current organizationId),'123'.  This would give us 
org4,101.  We would then run to the row of 'org4','123'.  Essentially 1 step to 
find the orgId and then a scan of all the departments for that value.





[jira] [Updated] (PHOENIX-5156) Consistent Mutable Global Indexes for Non-Transactional Tables

2019-05-14 Thread Kadir OZDEMIR (JIRA)



Kadir OZDEMIR updated PHOENIX-5156:
---
Attachment: PHOENIX-5156.master.014.patch

> Consistent Mutable Global Indexes for Non-Transactional Tables
> --
>
> Key: PHOENIX-5156
> URL: https://issues.apache.org/jira/browse/PHOENIX-5156
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.13.0, 4.14.0, 5.0.0, 4.14.1
>Reporter: Kadir OZDEMIR
>Assignee: Kadir OZDEMIR
>Priority: Major
> Attachments: PHOENIX-5156.4.x-HBase-1.4.001.patch, 
> PHOENIX-5156.master.001.patch, PHOENIX-5156.master.002.patch, 
> PHOENIX-5156.master.003.patch, PHOENIX-5156.master.004.patch, 
> PHOENIX-5156.master.005.patch, PHOENIX-5156.master.006.patch, 
> PHOENIX-5156.master.007.patch, PHOENIX-5156.master.008.patch, 
> PHOENIX-5156.master.009.patch, PHOENIX-5156.master.010.patch, 
> PHOENIX-5156.master.011.patch, PHOENIX-5156.master.012.patch, 
> PHOENIX-5156.master.013.patch, PHOENIX-5156.master.014.patch
>
>  Time Spent: 21h 50m
>  Remaining Estimate: 0h
>
> Without transactional tables, mutable global indexes can easily get out of 
> sync with their data tables in Phoenix. Transactional tables require a 
> separate transaction manager, and come with some restrictions and 
> performance penalties. This issue is to provide consistent mutable global 
> indexes without the need for transactional tables.





[jira] [Updated] (PHOENIX-5258) Add support to parse header from the input CSV file as input columns for CsvBulkLoadTool

2019-05-14 Thread Prashant Vithani (JIRA)



Prashant Vithani updated PHOENIX-5258:
--
Attachment: PHOENIX-5258-4.x-HBase-1.4.001.patch

> Add support to parse header from the input CSV file as input columns for 
> CsvBulkLoadTool
> 
>
> Key: PHOENIX-5258
> URL: https://issues.apache.org/jira/browse/PHOENIX-5258
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Prashant Vithani
>Assignee: Prashant Vithani
>Priority: Minor
> Fix For: 4.15.0, 5.1.0
>
> Attachments: PHOENIX-5258-4.x-HBase-1.4.001.patch, 
> PHOENIX-5258-4.x-HBase-1.4.patch, PHOENIX-5258-master.001.patch, 
> PHOENIX-5258-master.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Currently, CsvBulkLoadTool does not support reading a header from the input 
> CSV and expects the content of the CSV to match the table schema. Header 
> support can be added to dynamically map the schema to the header.
> The proposed solution is to introduce another option for the tool, 
> `--parse-header`. If this option is passed, the input columns list is 
> constructed by reading the first line of the input CSV file (a sketch 
> follows after the list below).
>  * If there is only one file, read the header from the first line and 
> generate the `ColumnInfo` list.
>  * If there are multiple files, read the header from all the files, and throw 
> an error if the headers across files do not match.
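> A hedged sketch of that header-parsing step (the class, method, and error 
> handling are illustrative, not the attached patch): read the first line of 
> each input, require that all headers agree, and hand the column names to the 
> existing `ColumnInfo` construction.
> {code:java}
> import java.io.BufferedReader;
> import java.io.IOException;
> import java.nio.file.Files;
> import java.nio.file.Path;
> import java.util.Arrays;
> import java.util.List;
> 
> // Hypothetical --parse-header helper: returns the agreed header columns.
> final class CsvHeaderParser {
>     static List<String> parseHeaders(List<Path> inputs) throws IOException {
>         List<String> header = null;
>         for (Path input : inputs) {
>             try (BufferedReader reader = Files.newBufferedReader(input)) {
>                 String first = reader.readLine();
>                 if (first == null) {
>                     throw new IOException("Empty input file: " + input);
>                 }
>                 List<String> cols = Arrays.asList(first.split(","));
>                 if (header == null) {
>                     header = cols;                 // header from the first file
>                 } else if (!header.equals(cols)) { // all files must match
>                     throw new IllegalArgumentException(
>                         "Header mismatch between input files: " + input);
>                 }
>             }
>         }
>         return header; // caller maps these names to ColumnInfo entries
>     }
> }
> {code}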





[jira] [Updated] (PHOENIX-5258) Add support to parse header from the input CSV file as input columns for CsvBulkLoadTool

2019-05-14 Thread Prashant Vithani (JIRA)



Prashant Vithani updated PHOENIX-5258:
--
Attachment: PHOENIX-5258-master.001.patch

> Add support to parse header from the input CSV file as input columns for 
> CsvBulkLoadTool
> 
>
> Key: PHOENIX-5258
> URL: https://issues.apache.org/jira/browse/PHOENIX-5258
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Prashant Vithani
>Assignee: Prashant Vithani
>Priority: Minor
> Fix For: 4.15.0, 5.1.0
>
> Attachments: PHOENIX-5258-4.x-HBase-1.4.patch, 
> PHOENIX-5258-master.001.patch, PHOENIX-5258-master.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Currently, CsvBulkLoadTool does not support reading a header from the input 
> CSV and expects the content of the CSV to match the table schema. Header 
> support can be added to dynamically map the schema to the header.
> The proposed solution is to introduce another option for the tool, 
> `--parse-header`. If this option is passed, the input columns list is 
> constructed by reading the first line of the input CSV file.
>  * If there is only one file, read the header from the first line and 
> generate the `ColumnInfo` list.
>  * If there are multiple files, read the header from all the files, and throw 
> an error if the headers across files do not match.


