Hi,

We have a cache called GAL3EC1. It has:

   1. A composite primary key consisting of customer_id and date
   2. An index on the date column
   3. 300 sparse columns

We are running a single EC2 4x8xlarge node.

The following query takes 8 minutes to finish:
SELECT COUNT(*) FROM (
    SELECT customer_id
    FROM GAL3EC1
    WHERE dt > '2018-05-12'
    GROUP BY customer_id
    HAVING SUM(ec1_bknt_total_product_views_app) > 2
       AND MAX(ec1_hnk_total_product_clicks_app) < 1
)

I have a few questions:

   1. The 'top' command shows 100% CPU utilization (i.e., only one of the 32
   CPUs is used). How can I get the query to use all 32 CPUs? I have tried
   setting Query Parallelism to 32, but it didn't help.
   2. Adding the index on the date column seems to have slowed down the query.
   The 8-minute time above was measured without the index; with the index, the
   query doesn't finish (I gave up after 30 minutes). A similar query on a
   smaller date range showed a 10x slowdown with the index. Why?
   3. Our loads from Spark are very slow as well, and also don't seem to use
   system resources properly. Could that be related?
   4. What are some good tools and techniques to troubleshoot these
   problems in Ignite?


All the relevant info is attached (configs, cache stats, node stats, etc.).

Cheers,
Eugene
Cache

+====================================================================================================================================================+
|         Name(@)         |    Mode     | Nodes |       Entries (Heap / Off-heap)       |   Hits    |    Misses    |    Reads     |      Writes      |
+====================================================================================================================================================+
| cache1(@c0)             | PARTITIONED | 1     | min: 0 (0 / 0)                        | min: 0    | min: 0       | min: 0       | min: 0           |
|                         |             |       | avg: 0.00 (0.00 / 0.00)               | avg: 0.00 | avg: 0.00    | avg: 0.00    | avg: 0.00        |
|                         |             |       | max: 0 (0 / 0)                        | max: 0    | max: 0       | max: 0       | max: 0           |
+-------------------------+-------------+-------+---------------------------------------+-----------+--------------+--------------+------------------+
| cache2(@c1)             | PARTITIONED | 1     | min: 0 (0 / 0)                        | min: 0    | min: 0       | min: 0       | min: 0           |
|                         |             |       | avg: 0.00 (0.00 / 0.00)               | avg: 0.00 | avg: 0.00    | avg: 0.00    | avg: 0.00        |
|                         |             |       | max: 0 (0 / 0)                        | max: 0    | max: 0       | max: 0       | max: 0           |
+-------------------------+-------------+-------+---------------------------------------+-----------+--------------+--------------+------------------+
| SQL_PUBLIC_GAL2RU(@c2)  | PARTITIONED | 1     | min: 8229539 (0 / 8229539)            | min: 0    | min: 0       | min: 0       | min: 8229539     |
|                         |             |       | avg: 8229539.00 (0.00 / 8229539.00)   | avg: 0.00 | avg: 0.00    | avg: 0.00    | avg: 8229539.00  |
|                         |             |       | max: 8229539 (0 / 8229539)            | max: 0    | max: 0       | max: 0       | max: 8229539     |
+-------------------------+-------------+-------+---------------------------------------+-----------+--------------+--------------+------------------+
| SQL_PUBLIC_GAL3EC1(@c3) | PARTITIONED | 1     | min: 63991599 (0 / 63991599)          | min: 0    | min: 9028    | min: 9028    | min: 83335247    |
|                         |             |       | avg: 63991599.00 (0.00 / 63991599.00) | avg: 0.00 | avg: 9028.00 | avg: 9028.00 | avg: 83335247.00 |
|                         |             |       | max: 63991599 (0 / 63991599)          | max: 0    | max: 9028    | max: 9028    | max: 83335247    |
+-------------------------+-------------+-------+---------------------------------------+-----------+--------------+--------------+------------------+
| SQL_PUBLIC_PERSON(@c4)  | PARTITIONED | 1     | min: 0 (0 / 0)                        | min: 0    | min: 0       | min: 0       | min: 0           |
|                         |             |       | avg: 0.00 (0.00 / 0.00)               | avg: 0.00 | avg: 0.00    | avg: 0.00    | avg: 0.00        |
|                         |             |       | max: 0 (0 / 0)                        | max: 0    | max: 0       | max: 0       | max: 0           |
+----------------------------------------------------------------------------------------------------------------------------------------------------+

Node

+-------------------------------------------------------------------------------------------+
| ID                          | 2dff6868-b253-48d1-aa7b-128082a26332                        |
| ID8                         | 2DFF6868                                                    |
| Node Type                   | Server                                                      |
| Order                       | 1                                                           |
| Address (0)                 | 172.17.0.1                                                  |
| Address (1)                 | 172.21.85.213                                               |
| Address (2)                 | 127.0.0.1                                                   |
| Address (3)                 | 0:0:0:0:0:0:0:1%lo                                          |
| OS info                     | Linux amd64 4.4.0-1065-aws                                  |
| OS user                     | root                                                        |
| Deployment mode             | SHARED                                                      |
| Language runtime            | Java Platform API Specification ver. 1.8                    |
| Ignite version              | 2.5.0                                                       |
| Ignite instance name        | Server                                                      |
| JRE information             | HotSpot 64-Bit Tiered Compilers                             |
| JVM start time              | 2018-08-21 19:56:19                                         |
| Node start time             | 2018-08-21 19:56:21                                         |
| Up time                     | 04:03:59.067                                                |
| CPUs                        | 32                                                          |
| Last metric update          | 2018-08-22 00:00:18                                         |
| Non-loopback IPs            | 172.17.0.1, 172.21.85.213, fe80:0:0:0:59:71ff:fe32:36e%ens3 |
| Enabled MACs                | 02423AD5E434, 02597132036E                                  |
| Maximum active jobs         | 1                                                           |
| Current active jobs         | 0                                                           |
| Average active jobs         | 0.00                                                        |
| Maximum waiting jobs        | 0                                                           |
| Current waiting jobs        | 0                                                           |
| Average waiting jobs        | 0.00                                                        |
| Maximum rejected jobs       | 0                                                           |
| Current rejected jobs       | 0                                                           |
| Average rejected jobs       | 0.00                                                        |
| Maximum cancelled jobs      | 0                                                           |
| Current cancelled jobs      | 0                                                           |
| Average cancelled jobs      | 0.00                                                        |
| Total rejected jobs         | 0                                                           |
| Total executed jobs         | 26                                                          |
| Total cancelled jobs        | 0                                                           |
| Maximum job wait time       | 0ms                                                         |
| Current job wait time       | 0ms                                                         |
| Average job wait time       | 0.00ms                                                      |
| Maximum job execute time    | 0ms                                                         |
| Current job execute time    | 0ms                                                         |
| Average job execute time    | 0.00ms                                                      |
| Total busy time             | 26234ms                                                     |
| Busy time %                 | 0.18%                                                       |
| Current CPU load %          | 3.17%                                                       |
| Average CPU load %          | 2.26%                                                       |
| Heap memory initialized     | 32gb                                                        |
| Heap memory used            | 56gb                                                        |
| Heap memory committed       | 89gb                                                        |
| Heap memory maximum         | 178gb                                                       |
| Non-heap memory initialized | 2mb                                                         |
| Non-heap memory used        | 86mb                                                        |
| Non-heap memory committed   | 88mb                                                        |
| Non-heap memory maximum     | 496mb                                                       |
| Current thread count        | 198                                                         |
| Maximum thread count        | 262                                                         |
| Total started thread count  | 1645                                                        |
| Current daemon thread count | 13                                                          |
+-------------------------------------------------------------------------------------------+

Data region metrics:
+===================================================================================================================+
|   Name    | Page size |        Pages        |    Memory    |      Rates       | Checkpoint buffer | Large entries |
+===================================================================================================================+
| default   | 4kb       | Total:  18672622    | Total:  71gb | Allocation: 0.00 | Pages: 0          | 0.00%         |
|           |           | Dirty:  0           | In RAM: 71gb | Eviction:   0.00 | Size:  0          |               |
|           |           | Memory: 18672622    |              | Replace:    0.00 |                   |               |
|           |           | Fill factor: 94.33% |              |                  |                   |               |
+-----------+-----------+---------------------+--------------+------------------+-------------------+---------------+
| sysMemPlc | 0         | Total:  0           | Total:  0    | Allocation: 0.00 | Pages: 0          | 0.00%         |
|           |           | Dirty:  0           | In RAM: 0    | Eviction:   0.00 | Size:  0          |               |
|           |           | Memory: 0           |              | Replace:    0.00 |                   |               |
|           |           | Fill factor: 0.00%  |              |                  |                   |               |
+-------------------------------------------------------------------------------------------------------------------+


Cache 'SQL_PUBLIC_GAL3EC1':
+============================================================================================================+
|                   Name                    |                             Value                              |
+============================================================================================================+
| Group                                     |                                                                |
| Dynamic Deployment ID                     | c4cd01e5561-c10f5d87-0ed0-4f34-be31-e282aa924940               |
| System                                    | off                                                            |
| Mode                                      | PARTITIONED                                                    |
| Atomicity Mode                            | ATOMIC                                                         |
| Statistic Enabled                         | on                                                             |
| Management Enabled                        | on                                                             |
| On-heap cache enabled                     | off                                                            |
| Partition Loss Policy                     | IGNORE                                                         |
| Query Parallelism                         | 32                                                             |
| Copy On Read                              | on                                                             |
| Listener Configurations                   |                                                                |
| Load Previous Value                       | off                                                            |
| Memory Policy Name                        |                                                                |
| Node Filter                               | o.a.i.configuration.CacheConfiguration$IgniteAllNodesPredicate |
| Read From Backup                          | on                                                             |
| Topology Validator                        |                                                                |
| Time To Live Eager Flag                   | true                                                           |
| Write Synchronization Mode                | FULL_SYNC                                                      |
| Invalidate                                | off                                                            |
| Affinity Function                         | o.a.i.cache.affinity.rendezvous.RendezvousAffinityFunction     |
| Affinity Backups                          | 0                                                              |
| Affinity Partitions                       | 1024                                                           |
| Affinity Exclude Neighbors                | false                                                          |
| Affinity Mapper                           | o.a.i.i.processors.cache.CacheDefaultBinaryAffinityKeyMapper   |
| Rebalance Mode                            | SYNC                                                           |
| Rebalance Batch Size                      | 524288                                                         |
| Rebalance Timeout                         | 10000                                                          |
| Rebalance Delay                           | 0                                                              |
| Time Between Rebalance Messages           | 0                                                              |
| Rebalance Batches Count                   | 2                                                              |
| Rebalance Cache Order                     | 0                                                              |
| Eviction Policy Enabled                   | off                                                            |
| Eviction Policy Factory                   | <n/a>                                                          |
| Eviction Policy Max Size                  | <n/a>                                                          |
| Eviction Filter                           | <n/a>                                                          |
| Near Cache Enabled                        | off                                                            |
| Near Start Size                           | 0                                                              |
| Near Eviction Policy Factory              | <n/a>                                                          |
| Near Eviction Policy Max Size             | <n/a>                                                          |
| Default Lock Timeout                      | 0                                                              |
| Metadata type count                       | 0                                                              |
| Cache Interceptor                         | <n/a>                                                          |
| Store Enabled                             | off                                                            |
| Store Class                               | <n/a>                                                          |
| Store Factory Class                       |                                                                |
| Store Keep Binary                         | false                                                          |
| Store Read Through                        | off                                                            |
| Store Write Through                       | off                                                            |
| Store Write Coalescing                    | on                                                             |
| Write-Behind Enabled                      | off                                                            |
| Write-Behind Flush Size                   | 10240                                                          |
| Write-Behind Frequency                    | 5000                                                           |
| Write-Behind Flush Threads Count          | 1                                                              |
| Write-Behind Batch Size                   | 512                                                            |
| Concurrent Asynchronous Operations Number | 500                                                            |
| Loader Factory Class Name                 | <n/a>                                                          |
| Writer Factory Class Name                 | <n/a>                                                          |
| Expiry Policy Factory Class Name          | javax.cache.configuration.FactoryBuilder$SingletonFactory      |
| Query Execution Time Threshold            | 3000                                                           |
| Query Escaped Names                       | on                                                             |
| Query Schema Name                         | PUBLIC                                                         |
| Query Indexed Types                       |                                                                |
| Maximum payload size for offheap indexes  | -1                                                             |
| Query Metrics History Size                | 10                                                             |
| Query SQL functions                       | <n/a>                                                          |
| Query Indexed Types                       | <n/a>                                                          |
+------------------------------------------------------------------------------------------------------------+
