Hi to all,
I'm trying to find the best query for my use case but I found that one
version work and the other one does not (unless that I don't apply some
tuning to timeouts etc like explained in [1]).
The 2 queries extract the same data but, while the first query terminates
the second does not.
*PS*: without the USE_SORT_MERGE_JOIN both queries weren't working
SELECT /*+ USE_SORT_MERGE_JOIN */ COUNT(*)
FROM PEOPLE ds JOIN MYTABLE l ON ds.PERSON_ID = l.LOCALID
WHERE l.EID IS NULL AND l.DSID = 'PEOPLE' AND l.HAS_CANDIDATES = FALSE;
+---------------------------------------------------------------------------------------------------------------+-----------------+----------------+----------------+
| PLAN
| EST_BYTES_READ | EST_ROWS_READ |
EST_INFO_TS |
+---------------------------------------------------------------------------------------------------------------+-----------------+----------------+----------------+
| SORT-MERGE-JOIN (INNER) TABLES
| 14155777900 | 12077867 |
1513754378759 |
| CLIENT 42-CHUNK 6168903 ROWS 11324622221 BYTES PARALLEL 3-WAY FULL
SCAN OVER PEOPLE | 14155777900 | 12077867 |
1513754378759 |
| SERVER FILTER BY FIRST KEY ONLY
| 14155777900 | 12077867 |
1513754378759 |
| CLIENT MERGE SORT
| 14155777900 | 12077867 |
1513754378759 |
| AND (SKIP MERGE)
| 14155777900 | 12077867 |
1513754378759 |
| CLIENT 15-CHUNK 5908964 ROWS 2831155679 BYTES PARALLEL 15-WAY RANGE
SCAN OVER MYTABLE [0] - [2] | 14155777900 | 12077867 |
1513754378759 |
| SERVER FILTER BY (EID IS NULL AND DSID = 'PEOPLE' AND
HAS_CANDIDATES = false) | 14155777900 | 12077867
| 1513754378759 |
| SERVER SORTED BY [L.LOCALID]
| 14155777900 | 12077867
| 1513754378759 |
| CLIENT MERGE SORT
| 14155777900 | 12077867 |
1513754378759 |
| CLIENT AGGREGATE INTO SINGLE ROW
| 14155777900 | 12077867 |
1513754378759 |
+---------------------------------------------------------------------------------------------------------------+-----------------+----------------+----------------+
SELECT /*+ USE_SORT_MERGE_JOIN */ COUNT(*)
FROM (SELECT LOCALID FROM MYTABLE
WHERE EID IS NULL AND DSID = 'PEOPLE' AND HAS_CANDIDATES = FALSE) l JOIN
PEOPLE ds ON ds.PERSON_ID = l.LOCALID;
+--------------------------------------------------------------------------------------------------------------+-----------------+----------------+----------------+
| PLAN
| EST_BYTES_READ | EST_ROWS_READ |
EST_INFO_TS |
+--------------------------------------------------------------------------------------------------------------+-----------------+----------------+----------------+
| SORT-MERGE-JOIN (INNER) TABLES
| 14155777900 | 12077867 |
1513754378759 |
| CLIENT 15-CHUNK 5908964 ROWS 2831155679 BYTES PARALLEL 3-WAY RANGE
SCAN OVER MYTABLE [0] - [2] | 14155777900 | 12077867 |
1513754378759 |
| SERVER FILTER BY (EID IS NULL AND DSID = 'PEOPLE' AND
HAS_CANDIDATES = false) | 14155777900 | 12077867
| 1513754378759 |
| CLIENT MERGE SORT
| 14155777900 | 12077867 |
1513754378759 |
| AND (SKIP MERGE)
| 14155777900 | 12077867 |
1513754378759 |
| CLIENT 42-CHUNK 6168903 ROWS 11324622221 BYTES PARALLEL 42-WAY FULL
SCAN OVER PEOPLE | 14155777900 | 12077867 |
1513754378759 |
| SERVER FILTER BY FIRST KEY ONLY
| 14155777900 | 12077867 |
1513754378759 |
| SERVER SORTED BY [DS.PERSON_ID]
| 14155777900 | 12077867 |
1513754378759 |
| CLIENT MERGE SORT
| 14155777900 | 12077867 |
1513754378759 |
| CLIENT AGGREGATE INTO SINGLE ROW
| 14155777900 | 12077867 |
1513754378759 |
+--------------------------------------------------------------------------------------------------------------+-----------------+----------------+----------------+
10 rows selected (0.041 seconds)
What do you think? Whould I try to give more resources to HBase/Phoenix or
is the first query the best (and reliable) one?
Any insight about this is highly appreciated..
Best,
Flavio