[jira] [Commented] (PHOENIX-4242) Fix Indexer post-compact hook logging of NPE and TableNotFound

2017-10-04 Thread Vincent Poon (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16192341#comment-16192341
 ] 

Vincent Poon commented on PHOENIX-4242:
---

[~jamestaylor]
In our postCompact hook, we call getTableNoCache, which calls 
MetaDataClient#updateCache.  However, in there we bail out early if it's a 
System table, instead of populating the result.
https://github.com/apache/phoenix/blob/master/phoenix-core/src/main/java/org/apache/phoenix/schema/MetaDataClient.java#L604

If I comment out that portion, then my test passes for the NPE part of this 
Jira.  Any idea if/why we need this?
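For context, the guard in question behaves roughly like the following simplified sketch. The class and method names are illustrative stand-ins, not Phoenix's actual MetaDataClient code; the point is only the early return that leaves the result unpopulated for SYSTEM tables, so a caller expecting a non-null table hits an NPE downstream:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified sketch of the early-return described above.
// MetaDataCacheSketch is an illustrative name, not Phoenix's real API.
class MetaDataCacheSketch {
    private final Map<String, String> cache = new HashMap<>();

    // Returns the cached table, refreshing it unless it is a SYSTEM table.
    String updateCache(String schemaName, String tableName) {
        String fullName = schemaName + "." + tableName;
        if ("SYSTEM".equals(schemaName)) {
            // Bail out early: the result is never populated for SYSTEM
            // tables, so a caller that expects a non-null table gets null.
            return cache.get(fullName); // may be null -> NPE downstream
        }
        cache.put(fullName, fullName); // simulate fetching fresh metadata
        return cache.get(fullName);
    }
}
```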

> Fix Indexer post-compact hook logging of NPE and TableNotFound
> --
>
> Key: PHOENIX-4242
> URL: https://issues.apache.org/jira/browse/PHOENIX-4242
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
>Reporter: Vincent Poon
>Assignee: Vincent Poon
>
> The post-compact hook in the Indexer seems to log extraneous log messages 
> indicating NPE or TableNotFound.  The TableNotFound exceptions seem to 
> indicate actual table names prefixed with MERGE or RESTORE, and sometimes 
> suffixed with a digit, so perhaps these are views or something similar.
> Examples:
> 2017-09-28 13:35:03,118 WARN  [ctions-1506410238599] index.Indexer - Unable 
> to permanently disable indexes being partially rebuild for SYSTEM.SEQUENCE
> java.lang.NullPointerException
> 2017-09-28 10:20:56,406 WARN  [ctions-1506410238415] index.Indexer - Unable 
> to permanently disable indexes being partially rebuild for 
> MERGE_PLATFORM_ENTITY.PLATFORM_IMMUTABLE_ENTITY_DATA2
> org.apache.phoenix.schema.TableNotFoundException: ERROR 1012 (42M03): Table 
> undefined. tableName=MERGE_PLATFORM_ENTITY.PLATFORM_IMMUTABLE_ENTITY_DATA2



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (PHOENIX-4232) Hide shadow cell and commit table access in TAL

2017-10-04 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16192200#comment-16192200
 ] 

James Taylor commented on PHOENIX-4232:
---

bq. Could you please explain the snapshot read scenario, your solution, and why 
Tephra works correctly in this case?
Tephra doesn't need its own RegionScanner because it doesn't have to worry 
about scanning a separate commit table. The state is essentially passed through 
the TransactionVisibilityFilter from the client (i.e. the time stamps to skip 
for invalid/inflight transactions). In fact, Tephra could even do the index 
maintenance from the client side (see PHOENIX-4278).

bq. Regarding reread the shadow cells, assuming I would like to run Omid 
standalone with server side filtering. I can get the region from 
RegionCoprocessorEnvironment and run a get at the server side. I assume that 
will work, not sure how efficient will it be. What do you think?
That's what I was thinking as it provides a good way to abstract all this 
interaction away. However, let's brainstorm a bit on what it would take to do 
this on the client-side. HBase is not very happy when you do RS->RS RPCs. If 
it's rare that shadow cells are absent, perhaps we could have the server throw 
an exception back to the client when this happens and then the client could 
perhaps retry after getting whatever state is missing from the commit table and 
passing it along. The downside is that it'd be more complicated.
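The throw-and-retry idea could be sketched as below. Everything here is hypothetical (ShadowCellMissingException and the map-based commit table are stand-ins, not Omid or Phoenix APIs); it only shows the control flow of the server failing fast and the client retrying after fetching the missing state from the commit table:

```java
import java.util.Map;

// Hypothetical sketch of the server-throws/client-retries idea above.
// The exception type and map-based commit table are illustrative only.
class ShadowCellRetrySketch {
    static class ShadowCellMissingException extends Exception {
        final long txId;
        ShadowCellMissingException(long txId) { this.txId = txId; }
    }

    // Simulated server-side read: fails fast if the commit state for the
    // transaction is not locally known (i.e. the shadow cell is absent).
    static String serverRead(Map<Long, Long> knownCommits, long txId)
            throws ShadowCellMissingException {
        Long commitTs = knownCommits.get(txId);
        if (commitTs == null) throw new ShadowCellMissingException(txId);
        return "row@" + commitTs;
    }

    // Client side: on failure, look up the commit table (the source of
    // truth), pass the missing state along, and retry once.
    static String clientRead(Map<Long, Long> serverState,
                             Map<Long, Long> commitTable, long txId) {
        try {
            return serverRead(serverState, txId);
        } catch (ShadowCellMissingException e) {
            Long commitTs = commitTable.get(e.txId);
            if (commitTs == null) return null; // transaction never committed
            serverState.put(e.txId, commitTs); // state piggybacked on retry
            try {
                return serverRead(serverState, txId);
            } catch (ShadowCellMissingException unreachable) {
                throw new IllegalStateException(unreachable);
            }
        }
    }
}
```

This only pays the extra round trip when a shadow cell is actually absent, which matches the "if it's rare" assumption above.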

> Hide shadow cell and commit table access in TAL
> ---
>
> Key: PHOENIX-4232
> URL: https://issues.apache.org/jira/browse/PHOENIX-4232
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: Ohad Shacham
>  Labels: omid
>
> Omid needs to project the shadow cell column qualifier and then based on the 
> value, filter the row. If the shadow cell is not found, it needs to perform a 
> lookup in the commit table (the source of truth) to get the information 
> instead. For the Phoenix integration, there are likely two TAL methods that 
> can be added to handle this:
> # Add method call to new TAL method in preScannerOpen call on coprocessor 
> that projects the shadow cell qualifiers and sets the time range. This is 
> equivalent to the TransactionProcessor.preScannerOpen that Tephra does. It's 
> possible this work could be done on the client side as well, but it's more 
> likely that the stuff that Phoenix does may override this (but we could get 
> it to work if need be).
> # Add TAL method that returns a RegionScanner to abstract out the filtering 
> of the row (potentially querying commit table). This RegionScanner would be 
> added as the first in the chain in the 
> NonAggregateRegionScannerFactory.getRegionScanner() API.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (PHOENIX-4278) Implement pure client side transactional index maintenance

2017-10-04 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16192187#comment-16192187
 ] 

Andrew Purtell commented on PHOENIX-4278:
-

+1 !!

> Implement pure client side transactional index maintenance
> --
>
> Key: PHOENIX-4278
> URL: https://issues.apache.org/jira/browse/PHOENIX-4278
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: James Taylor
>
> The index maintenance for transactions follows the same model as 
> non-transactional tables - a coprocessor based on data table updates that 
> looks up the previous row value to perform maintenance. This is necessary for 
> non-transactional tables to ensure the rows are locked so that a consistent view 
> may be obtained. However, for transactional tables, the time stamp oracle 
> ensures uniqueness of time stamps (via transaction IDs) and the filtering 
> handles a scan seeing the "true" last committed value for a row. Thus, 
> there's no hard dependency to perform this on the server side.
> Moving the index maintenance to the client side would prevent any RS->RS RPC 
> calls (which have proved to be troublesome for HBase). It would require 
> returning more data to the client (i.e. the prior row value), but this seems 
> like a reasonable tradeoff.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (PHOENIX-4278) Implement pure client side transactional index maintenance

2017-10-04 Thread James Taylor (JIRA)
James Taylor created PHOENIX-4278:
-

 Summary: Implement pure client side transactional index maintenance
 Key: PHOENIX-4278
 URL: https://issues.apache.org/jira/browse/PHOENIX-4278
 Project: Phoenix
  Issue Type: Improvement
Reporter: James Taylor


The index maintenance for transactions follows the same model as 
non-transactional tables - a coprocessor based on data table updates that 
looks up the previous row value to perform maintenance. This is necessary for 
non-transactional tables to ensure the rows are locked so that a consistent view 
may be obtained. However, for transactional tables, the time stamp oracle 
ensures uniqueness of time stamps (via transaction IDs) and the filtering 
handles a scan seeing the "true" last committed value for a row. Thus, there's 
no hard dependency to perform this on the server side.

Moving the index maintenance to the client side would prevent any RS->RS RPC 
calls (which have proved to be troublesome for HBase). It would require 
returning more data to the client (i.e. the prior row value), but this seems 
like a reasonable tradeoff.
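As a rough illustration of the tradeoff (not Phoenix's actual IndexMaintainer API; the string-based mutations below are stand-ins), client-side maintenance boils down to computing index mutations from the prior and new row values returned to the client:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of client-side index maintenance: given the prior
// row value (returned by the server) and the new value, the client emits
// a delete for the old index entry and a put for the new one. These
// types are stand-ins, not Phoenix's actual IndexMaintainer API.
class ClientIndexMaintenanceSketch {
    static List<String> indexUpdates(String rowKey, String priorValue, String newValue) {
        List<String> mutations = new ArrayList<>();
        if (priorValue != null) {
            // Remove the old index entry (index key is the indexed value).
            mutations.add("DELETE " + priorValue + "#" + rowKey);
        }
        if (newValue != null) {
            mutations.add("PUT " + newValue + "#" + rowKey);
        }
        return mutations;
    }
}
```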



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (PHOENIX-4267) Add mutable index chaos tests

2017-10-04 Thread Vincent Poon (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent Poon updated PHOENIX-4267:
--
Attachment: PHOENIX-4267.v2.master.patch

Rebased on top of PHOENIX-4269.

> Add mutable index chaos tests
> -
>
> Key: PHOENIX-4267
> URL: https://issues.apache.org/jira/browse/PHOENIX-4267
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 4.13.0
>Reporter: Vincent Poon
>Assignee: Vincent Poon
> Attachments: PHOENIX-4267.v1.master.patch, 
> PHOENIX-4267.v2.master.patch
>
>
> Tests that kill regionservers or close regions while batch writes to an 
> indexed table are happening.
> Index scrutiny is run at the end of each test to verify the index is in sync 
> afterwards.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (PHOENIX-4269) IndexScrutinyToolIT is flapping

2017-10-04 Thread churro morales (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16191830#comment-16191830
 ] 

churro morales commented on PHOENIX-4269:
-

+1 lgtm

> IndexScrutinyToolIT is flapping
> ---
>
> Key: PHOENIX-4269
> URL: https://issues.apache.org/jira/browse/PHOENIX-4269
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.13.0
>Reporter: James Taylor
>Assignee: Vincent Poon
> Attachments: PHOENIX-4269.master.patch
>
>
> In a local test run (not able to repro when run separately), I saw the 
> following failure:
> {code}
> [ERROR] Tests run: 20, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 
> 193.228 s <<< FAILURE! - in org.apache.phoenix.end2end.IndexScrutinyToolIT
> [ERROR] 
> testBothDataAndIndexAsSource[0](org.apache.phoenix.end2end.IndexScrutinyToolIT)
>   Time elapsed: 11.708 s  <<< FAILURE!
> java.lang.AssertionError: expected:<1> but was:<0>
>   at 
> org.apache.phoenix.end2end.IndexScrutinyToolIT.testBothDataAndIndexAsSource(IndexScrutinyToolIT.java:344)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (PHOENIX-4269) IndexScrutinyToolIT is flapping

2017-10-04 Thread Vincent Poon (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent Poon updated PHOENIX-4269:
--
Attachment: PHOENIX-4269.master.patch

[~churromorales] Please review, thanks for your help on this!

> IndexScrutinyToolIT is flapping
> ---
>
> Key: PHOENIX-4269
> URL: https://issues.apache.org/jira/browse/PHOENIX-4269
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.13.0
>Reporter: James Taylor
>Assignee: Vincent Poon
> Attachments: PHOENIX-4269.master.patch
>
>
> In a local test run (not able to repro when run separately), I saw the 
> following failure:
> {code}
> [ERROR] Tests run: 20, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 
> 193.228 s <<< FAILURE! - in org.apache.phoenix.end2end.IndexScrutinyToolIT
> [ERROR] 
> testBothDataAndIndexAsSource[0](org.apache.phoenix.end2end.IndexScrutinyToolIT)
>   Time elapsed: 11.708 s  <<< FAILURE!
> java.lang.AssertionError: expected:<1> but was:<0>
>   at 
> org.apache.phoenix.end2end.IndexScrutinyToolIT.testBothDataAndIndexAsSource(IndexScrutinyToolIT.java:344)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (PHOENIX-4269) IndexScrutinyToolIT is flapping

2017-10-04 Thread Vincent Poon (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent Poon reassigned PHOENIX-4269:
-

Assignee: Vincent Poon  (was: churro morales)

> IndexScrutinyToolIT is flapping
> ---
>
> Key: PHOENIX-4269
> URL: https://issues.apache.org/jira/browse/PHOENIX-4269
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: Vincent Poon
>
> In a local test run (not able to repro when run separately), I saw the 
> following failure:
> {code}
> [ERROR] Tests run: 20, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 
> 193.228 s <<< FAILURE! - in org.apache.phoenix.end2end.IndexScrutinyToolIT
> [ERROR] 
> testBothDataAndIndexAsSource[0](org.apache.phoenix.end2end.IndexScrutinyToolIT)
>   Time elapsed: 11.708 s  <<< FAILURE!
> java.lang.AssertionError: expected:<1> but was:<0>
>   at 
> org.apache.phoenix.end2end.IndexScrutinyToolIT.testBothDataAndIndexAsSource(IndexScrutinyToolIT.java:344)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (PHOENIX-4267) Add mutable index chaos tests

2017-10-04 Thread Vincent Poon (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent Poon updated PHOENIX-4267:
--
Attachment: (was: PHOENIX-4267.master.patch)

> Add mutable index chaos tests
> -
>
> Key: PHOENIX-4267
> URL: https://issues.apache.org/jira/browse/PHOENIX-4267
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 4.13.0
>Reporter: Vincent Poon
>Assignee: Vincent Poon
> Attachments: PHOENIX-4267.v1.master.patch
>
>
> Tests that kill regionservers or close regions while batch writes to an 
> indexed table are happening.
> Index scrutiny is run at the end of each test to verify the index is in sync 
> afterwards.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Issue Comment Deleted] (PHOENIX-4267) Add mutable index chaos tests

2017-10-04 Thread Vincent Poon (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent Poon updated PHOENIX-4267:
--
Comment: was deleted

(was: [~churromorales] Please review, thanks for your help on this!)

> Add mutable index chaos tests
> -
>
> Key: PHOENIX-4267
> URL: https://issues.apache.org/jira/browse/PHOENIX-4267
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 4.13.0
>Reporter: Vincent Poon
>Assignee: Vincent Poon
> Attachments: PHOENIX-4267.v1.master.patch
>
>
> Tests that kill regionservers or close regions while batch writes to an 
> indexed table are happening.
> Index scrutiny is run at the end of each test to verify the index is in sync 
> afterwards.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (PHOENIX-4267) Add mutable index chaos tests

2017-10-04 Thread Vincent Poon (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent Poon updated PHOENIX-4267:
--
Attachment: PHOENIX-4267.master.patch

[~churromorales] Please review, thanks for your help on this!

> Add mutable index chaos tests
> -
>
> Key: PHOENIX-4267
> URL: https://issues.apache.org/jira/browse/PHOENIX-4267
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 4.13.0
>Reporter: Vincent Poon
>Assignee: Vincent Poon
> Attachments: PHOENIX-4267.master.patch, PHOENIX-4267.v1.master.patch
>
>
> Tests that kill regionservers or close regions while batch writes to an 
> indexed table are happening.
> Index scrutiny is run at the end of each test to verify the index is in sync 
> afterwards.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (PHOENIX-4269) IndexScrutinyToolIT is flapping

2017-10-04 Thread churro morales (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16191782#comment-16191782
 ] 

churro morales commented on PHOENIX-4269:
-

Looks like [~vincentpoon] figured out that this was due to how timestamps were 
set in the test.  He will be putting up a patch shortly which fixes these 
issues.  Great work and thank you!

> IndexScrutinyToolIT is flapping
> ---
>
> Key: PHOENIX-4269
> URL: https://issues.apache.org/jira/browse/PHOENIX-4269
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: churro morales
>
> In a local test run (not able to repro when run separately), I saw the 
> following failure:
> {code}
> [ERROR] Tests run: 20, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 
> 193.228 s <<< FAILURE! - in org.apache.phoenix.end2end.IndexScrutinyToolIT
> [ERROR] 
> testBothDataAndIndexAsSource[0](org.apache.phoenix.end2end.IndexScrutinyToolIT)
>   Time elapsed: 11.708 s  <<< FAILURE!
> java.lang.AssertionError: expected:<1> but was:<0>
>   at 
> org.apache.phoenix.end2end.IndexScrutinyToolIT.testBothDataAndIndexAsSource(IndexScrutinyToolIT.java:344)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (PHOENIX-4276) Surface metrics on statistics collection

2017-10-04 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell reassigned PHOENIX-4276:
---

Assignee: Ashish Misra

> Surface metrics on statistics collection
> 
>
> Key: PHOENIX-4276
> URL: https://issues.apache.org/jira/browse/PHOENIX-4276
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Samarth Jain
>Assignee: Ashish Misra
>
> It would be good to get insight into how stats collection is doing over 
> time. An initial set of metrics that I can think of would be:
> Time taken to compute stats (reading cells and computing their size)
> Time taken to commit stats per physical table.
> Number of guide posts collected per physical table
> Number of guide posts collected per region.
> Number of regions on which stats collection happened per physical table
> Number of times stats was collected due to major compaction vs update stats 
> per physical table
> If possible, figure out if stats was collected because minor compaction was 
> promoted to major compaction and surface a metric for it.
> Because most of the collection work happens on server side, one option would 
> be to see how HBase's metrics are surfaced (my guess is JMX) and follow the 
> same pattern. Or we could possibly use the hbase-metrics-api module but that 
> is an HBase 1.4 thing. Another option would be to see PHOENIX-3807 for some 
> inspiration.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (PHOENIX-4276) Surface metrics on statistics collection

2017-10-04 Thread Samarth Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16191580#comment-16191580
 ] 

Samarth Jain commented on PHOENIX-4276:
---

FYI, [~Misraji]

> Surface metrics on statistics collection
> 
>
> Key: PHOENIX-4276
> URL: https://issues.apache.org/jira/browse/PHOENIX-4276
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Samarth Jain
>
> It would be good to get insight into how stats collection is doing over 
> time. An initial set of metrics that I can think of would be:
> Time taken to compute stats (reading cells and computing their size)
> Time taken to commit stats per physical table.
> Number of guide posts collected per physical table
> Number of guide posts collected per region.
> Number of regions on which stats collection happened per physical table
> Number of times stats was collected due to major compaction vs update stats 
> per physical table
> If possible, figure out if stats was collected because minor compaction was 
> promoted to major compaction and surface a metric for it.
> Because most of the collection work happens on server side, one option would 
> be to see how HBase's metrics are surfaced (my guess is JMX) and follow the 
> same pattern. Or we could possibly use the hbase-metrics-api module but that 
> is an HBase 1.4 thing. Another option would be to see PHOENIX-3807 for some 
> inspiration.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (PHOENIX-4276) Surface metrics on statistics collection

2017-10-04 Thread Samarth Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Samarth Jain updated PHOENIX-4276:
--
Issue Type: Improvement  (was: Bug)

> Surface metrics on statistics collection
> 
>
> Key: PHOENIX-4276
> URL: https://issues.apache.org/jira/browse/PHOENIX-4276
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Samarth Jain
>
> It would be good to get insight into how stats collection is doing over 
> time. An initial set of metrics that I can think of would be:
> Time taken to compute stats (reading cells and computing their size)
> Time taken to commit stats per physical table.
> Number of guide posts collected per physical table
> Number of guide posts collected per region.
> Number of regions on which stats collection happened per physical table
> Number of times stats was collected due to major compaction vs update stats 
> per physical table
> If possible, figure out if stats was collected because minor compaction was 
> promoted to major compaction and surface a metric for it.
> Because most of the collection work happens on server side, one option would 
> be to see how HBase's metrics are surfaced (my guess is JMX) and follow the 
> same pattern. Or we could possibly use the hbase-metrics-api module but that 
> is an HBase 1.4 thing. Another option would be to see PHOENIX-3807 for some 
> inspiration.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (PHOENIX-4276) Surface metrics on statistics collection

2017-10-04 Thread Samarth Jain (JIRA)
Samarth Jain created PHOENIX-4276:
-

 Summary: Surface metrics on statistics collection
 Key: PHOENIX-4276
 URL: https://issues.apache.org/jira/browse/PHOENIX-4276
 Project: Phoenix
  Issue Type: Bug
Reporter: Samarth Jain


It would be good to get insight into how stats collection is doing over time. 
An initial set of metrics that I can think of would be:
Time taken to compute stats (reading cells and computing their size)
Time taken to commit stats per physical table.
Number of guide posts collected per physical table
Number of guide posts collected per region.
Number of regions on which stats collection happened per physical table
Number of times stats was collected due to major compaction vs update stats per 
physical table
If possible, figure out if stats was collected because minor compaction was 
promoted to major compaction and surface a metric for it.

Because most of the collection work happens on server side, one option would be 
to see how HBase's metrics are surfaced (my guess is JMX) and follow the same 
pattern. Or we could possibly use the hbase-metrics-api module but that is an 
HBase 1.4 thing. Another option would be to see PHOENIX-3807 for some inspiration.
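As a sketch of the data being proposed (the names are illustrative; a real implementation would surface these counters through JMX or the hbase-metrics-api as noted above):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative per-physical-table counters for the metrics listed above.
// This only shows the shape of the data being tracked; the real work
// would hook into HBase's server-side metrics machinery.
class StatsCollectionMetricsSketch {
    static class TableStats {
        final AtomicLong statsComputeMillis = new AtomicLong();
        final AtomicLong statsCommitMillis = new AtomicLong();
        final AtomicLong guidePostsCollected = new AtomicLong();
        final AtomicLong regionsCollected = new AtomicLong();
        final AtomicLong majorCompactionTriggers = new AtomicLong();
        final AtomicLong updateStatsTriggers = new AtomicLong();
    }

    private final Map<String, TableStats> byTable = new HashMap<>();

    // One bundle of counters per physical table, created on first use.
    TableStats forTable(String physicalTableName) {
        return byTable.computeIfAbsent(physicalTableName, t -> new TableStats());
    }
}
```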



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (PHOENIX-4198) Remove the need for users to have access to the Phoenix SYSTEM tables to create tables

2017-10-04 Thread Ankit Singhal (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankit Singhal updated PHOENIX-4198:
---
Attachment: PHOENIX-4198_v3.patch

> Remove the need for users to have access to the Phoenix SYSTEM tables to 
> create tables
> --
>
> Key: PHOENIX-4198
> URL: https://issues.apache.org/jira/browse/PHOENIX-4198
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Ankit Singhal
>Assignee: Ankit Singhal
>  Labels: namespaces
> Fix For: 4.13.0
>
> Attachments: PHOENIX-4198.patch, PHOENIX-4198_v2.patch, 
> PHOENIX-4198_v3.patch
>
>
> Problem statement:-
> A user who doesn't have access to a table should also not be able to modify 
> Phoenix metadata. Currently, every user is required to have write permission 
> on the SYSTEM tables, which is a security concern: users can 
> create/alter/drop/corrupt the metadata of any other table without proper access 
> to the corresponding physical tables.
> [~devaraj] recommended a solution as below.
> 1. A coprocessor endpoint would be implemented and all write accesses to the 
> catalog table would have to necessarily go through that. The 'hbase' user 
> would own that table. Today, there is MetaDataEndpointImpl that's run on the 
> RS where the catalog is hosted, and that could be enhanced to serve the 
> purpose we need.
> 2. The regionserver hosting the catalog table would do the needful for all 
> catalog updates - creating the mutations as needed, that is.
> 3. The coprocessor endpoint could use Ranger to do necessary authorization 
> checks before updating the catalog table. So for example, if a user doesn't 
> have authorization to create a table in a certain namespace, or update the 
> schema, etc., it can reject such requests outright. Only after successful 
> validations, does it perform the operations (physical operations to do with 
> creating the table, and updating the catalog table with the necessary 
> mutations).
> 4. In essence, the code that implements dealing with DDLs, would be hosted in 
> the catalog table endpoint. The client code would be really thin, and it 
> would just invoke the endpoint with the necessary info. The additional thing 
> that needs to be done in the endpoint is the validation of authorization to 
> prevent unauthorized users from making changes to someone else's 
> tables/schemas/etc. For example, one should be able to create a view on a 
> table if he has read access on the base table. That mutation on the catalog 
> table would be permitted. For changing the schema (adding a new column for 
> example), the said user would need write permission on the table... etc etc.
> Thanks [~elserj] for the write-up.
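Steps 1-4 above can be sketched as follows. The permission model here is a hypothetical stand-in for a real Ranger check, and none of the names are actual Phoenix or HBase APIs; the point is that authorization happens inside the endpoint before any catalog mutation is applied:

```java
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of steps 1-4: all catalog writes funnel through an
// endpoint that authorizes the caller before mutating the catalog table.
// The map-based permission model stands in for a real Ranger check.
class CatalogEndpointSketch {
    private final Map<String, Set<String>> writePerms; // user -> tables
    private final Map<String, String> catalog;         // table -> schema

    CatalogEndpointSketch(Map<String, Set<String>> writePerms,
                          Map<String, String> catalog) {
        this.writePerms = writePerms;
        this.catalog = catalog;
    }

    // Endpoint entry point: validate authorization first, then apply the
    // catalog mutation server-side (as the table-owning 'hbase' user would).
    boolean alterTable(String user, String table, String newSchema) {
        Set<String> allowed = writePerms.get(user);
        if (allowed == null || !allowed.contains(table)) {
            return false; // reject outright; catalog stays untouched
        }
        catalog.put(table, newSchema);
        return true;
    }
}
```

The thin client described in step 4 would only invoke this endpoint with the necessary info; it never writes the catalog table directly.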



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (PHOENIX-4198) Remove the need for users to have access to the Phoenix SYSTEM tables to create tables

2017-10-04 Thread Ankit Singhal (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16191011#comment-16191011
 ] 

Ankit Singhal commented on PHOENIX-4198:


bq. Instead of using HTable to make an RPC to change the user context, is it 
possible to set the rpc context to null before running as the login user? 
Thanks [~tdsilva], that seems like a good idea. I've updated the patch to 
accommodate this. Please review.

> Remove the need for users to have access to the Phoenix SYSTEM tables to 
> create tables
> --
>
> Key: PHOENIX-4198
> URL: https://issues.apache.org/jira/browse/PHOENIX-4198
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Ankit Singhal
>Assignee: Ankit Singhal
>  Labels: namespaces
> Fix For: 4.13.0
>
> Attachments: PHOENIX-4198.patch, PHOENIX-4198_v2.patch
>
>
> Problem statement:-
> A user who doesn't have access to a table should also not be able to modify 
> Phoenix metadata. Currently, every user is required to have write permission 
> on the SYSTEM tables, which is a security concern: users can 
> create/alter/drop/corrupt the metadata of any other table without proper access 
> to the corresponding physical tables.
> [~devaraj] recommended a solution as below.
> 1. A coprocessor endpoint would be implemented and all write accesses to the 
> catalog table would have to necessarily go through that. The 'hbase' user 
> would own that table. Today, there is MetaDataEndpointImpl that's run on the 
> RS where the catalog is hosted, and that could be enhanced to serve the 
> purpose we need.
> 2. The regionserver hosting the catalog table would do the needful for all 
> catalog updates - creating the mutations as needed, that is.
> 3. The coprocessor endpoint could use Ranger to do necessary authorization 
> checks before updating the catalog table. So for example, if a user doesn't 
> have authorization to create a table in a certain namespace, or update the 
> schema, etc., it can reject such requests outright. Only after successful 
> validations, does it perform the operations (physical operations to do with 
> creating the table, and updating the catalog table with the necessary 
> mutations).
> 4. In essence, the code that implements dealing with DDLs, would be hosted in 
> the catalog table endpoint. The client code would be really thin, and it 
> would just invoke the endpoint with the necessary info. The additional thing 
> that needs to be done in the endpoint is the validation of authorization to 
> prevent unauthorized users from making changes to someone else's 
> tables/schemas/etc. For example, one should be able to create a view on a 
> table if he has read access on the base table. That mutation on the catalog 
> table would be permitted. For changing the schema (adding a new column for 
> example), the said user would need write permission on the table... etc etc.
> Thanks [~elserj] for the write-up.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[VOTE] Release of Apache Phoenix 4.12.0 RC0

2017-10-04 Thread James Taylor
Hello Everyone,

This is a call for a vote on Apache Phoenix 4.12.0 RC0. This is the next
minor release of Phoenix 4, compatible with Apache HBase 0.98, 1.1, 1.2, &
1.3. The release includes both a source-only release and a convenience
binary release for each supported HBase version.

This release has feature parity with supported HBase versions and includes
the following improvements:
- Improved scalability of global mutable secondary index
- 100+ bug fixes (the majority around secondary indexing)
- Index Scrutiny tool [1]
- Stabilization of unit tests
- Support for table sampling [2]
- Support for APPROX_COUNT_DISTINCT aggregate function [3]

The source tarball, including signatures, digests, etc can be found at:
https://dist.apache.org/repos/dist/dev/phoenix/apache-phoenix-v4.12.0-HBase-0.98-rc0/src/
https://dist.apache.org/repos/dist/dev/phoenix/apache-phoenix-v4.12.0-HBase-1.1-rc0/src/
https://dist.apache.org/repos/dist/dev/phoenix/apache-phoenix-v4.12.0-HBase-1.2-rc0/src/
https://dist.apache.org/repos/dist/dev/phoenix/apache-phoenix-v4.12.0-HBase-1.3-rc0/src/

The binary artifacts can be found at:
https://dist.apache.org/repos/dist/dev/phoenix/apache-phoenix-v4.12.0-HBase-0.98-rc0/bin/
https://dist.apache.org/repos/dist/dev/phoenix/apache-phoenix-v4.12.0-HBase-1.1-rc0/bin/
https://dist.apache.org/repos/dist/dev/phoenix/apache-phoenix-v4.12.0-HBase-1.2-rc0/bin/
https://dist.apache.org/repos/dist/dev/phoenix/apache-phoenix-v4.12.0-HBase-1.3-rc0/bin/

For a complete list of changes, see:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315120&version=12340844

Artifacts are signed with my "CODE SIGNING KEY": 308FBEE06088BE0F

KEYS file available here:
https://dist.apache.org/repos/dist/dev/phoenix/KEYS

The hash and tag to be voted upon:
https://git-wip-us.apache.org/repos/asf?p=phoenix.git;a=commit;h=13a7f97b49704642d67481c58a118a68c2e4c2e5
https://git-wip-us.apache.org/repos/asf?p=phoenix.git;a=tag;h=refs/tags/v4.12.0-HBase-0.98-rc0
https://git-wip-us.apache.org/repos/asf?p=phoenix.git;a=commit;h=e40bbfff1150e56e1ecb7cd22c49cee298496c2b
https://git-wip-us.apache.org/repos/asf?p=phoenix.git;a=tag;h=refs/tags/v4.12.0-HBase-1.1-rc0
https://git-wip-us.apache.org/repos/asf?p=phoenix.git;a=commit;h=d79dd50ff732f2673e1414d970cd4742e2c135de
https://git-wip-us.apache.org/repos/asf?p=phoenix.git;a=tag;h=refs/tags/v4.12.0-HBase-1.2-rc0
https://git-wip-us.apache.org/repos/asf?p=phoenix.git;a=commit;h=f0bc4cdb5bbf96b316c78cc816400b04f63e911b
https://git-wip-us.apache.org/repos/asf?p=phoenix.git;a=tag;h=refs/tags/v4.12.0-HBase-1.3-rc0

Vote will be open for at least 72 hours. Please vote:

[ ] +1 approve
[ ] +0 no opinion
[ ] -1 disapprove (and reason why)

Thanks,
The Apache Phoenix Team

[1] https://phoenix.apache.org/secondary_indexing.html#Index_Scrutiny_Tool
[2] https://phoenix.apache.org/tablesample.html
[3] https://phoenix.apache.org/language/functions.html#approx_count_distinct


[jira] [Commented] (PHOENIX-3919) Add hbase-hadoop2-compat as compile time dependency

2017-10-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16190909#comment-16190909
 ] 

Hadoop QA commented on PHOENIX-3919:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12871941/PHOENIX-3819.patch
  against master branch at commit f0bc4cdb5bbf96b316c78cc816400b04f63e911b.
  ATTACHMENT ID: 12871941

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+0 tests included{color}.  The patch appears to be a 
documentation, build,
or dev patch that doesn't require tests.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-PHOENIX-Build/1523//console

This message is automatically generated.

> Add hbase-hadoop2-compat as compile time dependency
> ---
>
> Key: PHOENIX-3919
> URL: https://issues.apache.org/jira/browse/PHOENIX-3919
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 4.11.0
>Reporter: Alex Araujo
>Assignee: Alex Araujo
>Priority: Minor
> Fix For: 4.13.0
>
> Attachments: PHOENIX-3819.patch
>
>
> HBASE-17448 added hbase-hadoop2-compat as a required dependency for clients, 
> but it is currently a test only dependency in some Phoenix modules.
> Make it an explicit compile time dependency in those modules.
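
A sketch of what the change might look like in an affected module's pom.xml; the version property name is an assumption, and the exact coordinates should match the parent POM's dependency management:

```xml
<!-- Hypothetical pom.xml fragment: promote hbase-hadoop2-compat from
     test scope to the default compile scope -->
<dependency>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase-hadoop2-compat</artifactId>
  <version>${hbase.version}</version>
  <!-- previously: <scope>test</scope> -->
</dependency>
```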



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (PHOENIX-3867) nth_value returns valid values for non-existing rows

2017-10-04 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16190899#comment-16190899
 ] 

James Taylor commented on PHOENIX-3867:
---

Ping [~singamteja]?

> nth_value returns valid values for non-existing rows 
> -
>
> Key: PHOENIX-3867
> URL: https://issues.apache.org/jira/browse/PHOENIX-3867
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.10.0
>Reporter: Loknath Priyatham Teja Singamsetty 
> Fix For: 4.12.0
>
>
> Assume a table with two rows as follows:
> id, page_id, date, value
> 2, 8, 1, 7
> 3, 8, 2, 9
> Fetching the 3rd most recent value for page_id 8 should not return any rows.
> However, rs.next() succeeds, rs.getInt(1) returns 0, and the assertion
> fails. Below is a test case demonstrating this.
> Issues:
> 
> a) From sqlline, the 3rd nth_value is returned as null
> b) When accessed programmatically, it comes back as 0
> Test Case:
> -
> public void nonExistingNthRowTestWithGroupBy() throws Exception {
>     Connection conn = DriverManager.getConnection(getUrl());
>     String nthValue = generateUniqueName();
>     String ddl = "CREATE TABLE IF NOT EXISTS " + nthValue + " "
>         + "(id INTEGER NOT NULL PRIMARY KEY, page_id UNSIGNED_LONG,"
>         + " dates INTEGER, val INTEGER)";
>     conn.createStatement().execute(ddl);
>     conn.createStatement().execute(
>         "UPSERT INTO " + nthValue + " (id, page_id, dates, val) VALUES (2, 8, 1, 7)");
>     conn.createStatement().execute(
>         "UPSERT INTO " + nthValue + " (id, page_id, dates, val) VALUES (3, 8, 2, 9)");
>     conn.commit();
>     ResultSet rs = conn.createStatement().executeQuery(
>         "SELECT NTH_VALUE(val, 3) WITHIN GROUP (ORDER BY dates DESC) FROM "
>             + nthValue + " GROUP BY page_id");
>     assertTrue(rs.next());
>     assertEquals(rs.getInt(1), 4);
>     assertFalse(rs.next());
> }
> Root Cause:
> ---
> The underlying issue seems to be with the way NTH_VALUE aggregation is done 
> by the aggregator. The client aggregator is first populated with the top 'n' 
> rows (if present), but iterator.next() in 
> BaseGroupedAggregatingResultIterator never evaluates whether the nth row is 
> actually present. Once iterator.next() succeeds, retrieving the value from 
> the result set through the row projector triggers the client aggregator's 
> evaluate() method as part of schema.toBytes(..), which defaults to 0 for an 
> empty row when the column is an int and is accessed programmatically.
>  
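
A side note on symptom (b): returning 0 is consistent with the JDBC contract, where ResultSet.getInt() maps SQL NULL to 0 unless the caller checks wasNull(). The class below is a hypothetical illustration of that mapping, not Phoenix code:

```java
// Illustrative sketch only (not Phoenix source): mimics the JDBC rule that
// ResultSet.getInt() maps SQL NULL to 0, which is why the missing nth row
// surfaces as 0 programmatically but shows as null in sqlline.
public class GetIntDefaultDemo {

    // Stand-in for ResultSet.getInt(): a NULL (absent) aggregate becomes 0.
    static int getInt(Integer columnValue) {
        return columnValue == null ? 0 : columnValue;
    }

    public static void main(String[] args) {
        Integer nthValue = null;              // no 3rd row => SQL NULL
        System.out.println(getInt(nthValue)); // prints 0, not an error
    }
}
```

A caller that needs to distinguish "0" from "no row" must call rs.wasNull() immediately after rs.getInt().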



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (PHOENIX-4136) Document APPROX_COUNT_DISTINCT function

2017-10-04 Thread James Taylor (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Taylor updated PHOENIX-4136:
--
Summary: Document APPROX_COUNT_DISTINCT function  (was: Document 
APPROXIMATE_COUNT_DISTINCT function)

> Document APPROX_COUNT_DISTINCT function
> ---
>
> Key: PHOENIX-4136
> URL: https://issues.apache.org/jira/browse/PHOENIX-4136
> Project: Phoenix
>  Issue Type: Task
>Reporter: James Taylor
>Assignee: Ethan Wang
> Fix For: 4.12.0
>
> Attachments: PHOENIX-4136-v1.patch
>
>
> Now that PHOENIX-418 has been committed, we need to document this new 
> function by including  APPROXIMATE_COUNT_DISTINCT in our list of functions 
> (which lives in phoenix.csv) so that it shows up here: 
> https://phoenix.apache.org/language/functions.html
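
For context, usage of the committed function looks like the following sketch; the table and column names here are made up:

```sql
-- Hypothetical table/column; APPROX_COUNT_DISTINCT is the committed name
SELECT APPROX_COUNT_DISTINCT(user_id) FROM events;
```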



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)