[jira] [Commented] (PHOENIX-4666) Add a subquery cache that persists beyond the life of a query

Hadoop QA (JIRA) Fri, 24 Aug 2018 19:28:25 -0700


    [ 
https://issues.apache.org/jira/browse/PHOENIX-4666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16592394#comment-16592394
 ]


Hadoop QA commented on PHOENIX-4666:
------------------------------------

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12937079/298.patch
  against master branch at commit 16c5570d38a3aa0527cbbdef21fd3859be4e729f.
  ATTACHMENT ID: 12937079

    {color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

    {color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

    {color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

    {color:red}-1 release audit{color}.  The applied patch generated 2 release 
audit warnings (more than the master's current 0 warnings).

    {color:red}-1 lineLengths{color}.  The patch introduces the following lines 
longer than 100:
+        TestUtil.addCoprocessor(conn, SchemaUtil.normalizeFullTableName(realName), InvalidateHashCache.class);
+        createTestTable(getUrl(), "CREATE TABLE IF NOT EXISTS states (state CHAR(2) NOT NULL, name VARCHAR NOT NULL CONSTRAINT my_pk PRIMARY KEY (state, name))");
+        createTestTable(getUrl(), "CREATE TABLE IF NOT EXISTS cities (state CHAR(2) NOT NULL, city VARCHAR NOT NULL, population BIGINT CONSTRAINT my_pk PRIMARY KEY (state, city))");
+        conn.prepareStatement("UPSERT INTO cities VALUES ('CA', 'San Francisco', 50000)").executeUpdate();
+        conn.prepareStatement("UPSERT INTO cities VALUES ('CA', 'Sacramento', 3000)").executeUpdate();
+        ResultSet rs = conn.prepareStatement("SELECT SUM(population) FROM states s JOIN cities c ON c.state = s.state").executeQuery();
+        rs = conn.prepareStatement("SELECT SUM(population) FROM states s JOIN cities c ON c.state = s.state").executeQuery();
+        rs = conn.prepareStatement("SELECT /*+ USE_PERSISTENT_CACHE */ SUM(population) FROM states s JOIN cities c ON c.state = s.state").executeQuery();
+        conn.prepareStatement("UPSERT INTO cities VALUES ('CA', 'Palo Alto', 2000)").executeUpdate();
+        rs = conn.prepareStatement("SELECT /*+ USE_PERSISTENT_CACHE */ SUM(population) FROM states s JOIN cities c ON c.state = s.state").executeQuery();

     {color:red}-1 core tests{color}.  The patch failed these unit tests:
     
./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.ConcurrentMutationsIT

Test results: 
https://builds.apache.org/job/PreCommit-PHOENIX-Build/2006//testReport/
Release audit warnings: 
https://builds.apache.org/job/PreCommit-PHOENIX-Build/2006//artifact/patchprocess/patchReleaseAuditWarnings.txt
Console output: 
https://builds.apache.org/job/PreCommit-PHOENIX-Build/2006//console

This message is automatically generated.

> Add a subquery cache that persists beyond the life of a query
> -------------------------------------------------------------
>
>                 Key: PHOENIX-4666
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4666
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: Marcell Ortutay
>            Assignee: Marcell Ortutay
>            Priority: Major
>         Attachments: 298.patch, 
> PHOENIX-4666-subquery-cache-4.x-HBase-1.4.patch, 
> PHOENIX-4666-subquery-cache-4.x-HBase-1.4.patch
>
>
> The user list thread for additional context is here: 
> [https://lists.apache.org/thread.html/e62a6f5d79bdf7cd238ea79aed8886816d21224d12b0f1fe9b6bb075@%3Cuser.phoenix.apache.org%3E]
> ----
> A Phoenix query may contain expensive subqueries, and moreover those 
> expensive subqueries may be used across multiple different queries. While 
> whole result caching is possible at the application level, it is not possible 
> to cache subresults in the application. This can cause bad performance for 
> queries in which the subquery is the most expensive part of the query, and 
> the application is powerless to do anything at the query level. It would be 
> good if Phoenix provided a way to cache subquery results, as it would provide 
> a significant performance gain.
> An illustrative example:
>     SELECT * FROM table1
>     JOIN (SELECT id_1 FROM large_table WHERE x = 10) expensive_result
>       ON table1.id_1 = expensive_result.id_2 AND table1.id_1 = {id}
> In this case, the subquery "expensive_result" is expensive to compute, but it 
> doesn't change between queries. The rest of the query does because of the 
> {id} parameter. This means the application can't cache it, but it would be 
> good if there were a way to cache expensive_result.
> Note that there is currently a coprocessor based "server cache", but the data 
> in this "cache" is not persisted across queries. It is deleted after a TTL 
> expires (30sec by default), or when the query completes.
> This issue is fairly high priority for us at 23andMe and we'd be happy to 
> provide a patch with some guidance from Phoenix maintainers. We are currently 
> putting together a design document for a solution, and we'll post it to this 
> Jira ticket for review in a few days.
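
The idea in the description above can be sketched in a few lines of Java. This is a minimal illustration only, not the design in the attached patch and not Phoenix code: the class name SubqueryCacheSketch, the method getOrCompute, and the choice to key entries by the subquery's SQL text are all assumptions made for the example. The 30-second TTL mirrors the default mentioned for the existing server cache. The point it shows is that two outer queries differing only in {id} can share one evaluation of the expensive subquery.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical sketch: a cache of subquery results that outlives a single query.
public class SubqueryCacheSketch {
    // Cached rows plus the time at which they stop being valid.
    private static final class Entry {
        final List<Long> rows;
        final long expiresAtMillis;

        Entry(List<Long> rows, long expiresAtMillis) {
            this.rows = rows;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();
    private final long ttlMillis;
    private int computeCount = 0;

    public SubqueryCacheSketch(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    // Returns the cached rows for the subquery text (the key is an assumption),
    // re-running the supplier only when no entry exists or the entry has expired.
    // The caller passes the current time so the behavior is deterministic.
    public synchronized List<Long> getOrCompute(
            String subquerySql, long nowMillis, Supplier<List<Long>> compute) {
        Entry entry = entries.get(subquerySql);
        if (entry == null || nowMillis >= entry.expiresAtMillis) {
            computeCount++;
            entry = new Entry(compute.get(), nowMillis + ttlMillis);
            entries.put(subquerySql, entry);
        }
        return entry.rows;
    }

    // Number of times the expensive supplier has actually been run.
    public synchronized int computeCount() {
        return computeCount;
    }

    public static void main(String[] args) {
        SubqueryCacheSketch cache = new SubqueryCacheSketch(30_000L);
        String subquery = "SELECT id_1 FROM large_table WHERE x = 10";
        // Stand-in for the expensive scan; the values are made up.
        Supplier<List<Long>> expensive = () -> Arrays.asList(1L, 2L, 3L);

        // Two outer queries with different {id} values reuse the same subquery.
        cache.getOrCompute(subquery, 0L, expensive);
        cache.getOrCompute(subquery, 1_000L, expensive);
        System.out.println(cache.computeCount()); // prints 1

        // After the TTL has elapsed the subquery is evaluated again.
        cache.getOrCompute(subquery, 31_000L, expensive);
        System.out.println(cache.computeCount()); // prints 2
    }
}
```

The sketch deliberately ignores invalidation when the underlying table changes, memory limits, and distribution across region servers, all of which a real implementation would have to address.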



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
