[ 
https://issues.apache.org/jira/browse/ASTERIXDB-1433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenhai updated ASTERIXDB-1433:
------------------------------
    Description: This is a typical hardware platform for a dataset that adds up 
to the TB scale in total. AsterixDB performs extremely well on complex queries 
that include multiple join operators over a high-selectivity select operator. 
However, the runtime traces show that, even with the big memory configuration, 
the original tables are always reloaded from disk into memory, even if they 
were already processed by the most recent query. To address this, why not 
provide a strategy that keeps the intermediate data of the last completed query 
in memory and frees it when memory is insufficient for a new query? In some 
cases, the user repeatedly runs the query with different parameters on the same 
tables, for example an aggregation with varying parameters over a single big 
fact table.  (was: This is a typical hardware platform for a dataset that adds 
up to the TB scale in total. AsterixDB performs extremely well on complex 
queries that include multiple join operators. However, the runtime traces show 
that, even with the big memory configuration, the data is always reloaded from 
disk into memory. To address this, why not provide a strategy that keeps the 
intermediate data of the last completed query in memory and frees it when 
memory is insufficient for a new query? In some cases, the user repeatedly runs 
the query with different parameters on the same tables, for example an 
aggregation with varying parameters over a single big fact table.)

> Multiple cores with huge memory slow down in the big fact table aggregation.
> ----------------------------------------------------------------------------
>
>                 Key: ASTERIXDB-1433
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-1433
>             Project: Apache AsterixDB
>          Issue Type: Improvement
>          Components: Hyracks Core
>         Environment: 10 nodes X Linux ubuntu/6 cpu X 4 cores/per cpu, 128 GB 
> memory/per node.
>            Reporter: Wenhai
>
> This is a typical hardware platform for a dataset that adds up to the TB 
> scale in total. AsterixDB performs extremely well on complex queries that 
> include multiple join operators over a high-selectivity select operator. 
> However, the runtime traces show that, even with the big memory 
> configuration, the original tables are always reloaded from disk into memory, 
> even if they were already processed by the most recent query. To address 
> this, why not provide a strategy that keeps the intermediate data of the last 
> completed query in memory and frees it when memory is insufficient for a new 
> query? In some cases, the user repeatedly runs the query with different 
> parameters on the same tables, for example an aggregation with varying 
> parameters over a single big fact table.
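
To make the proposed strategy concrete, here is a minimal sketch of one way it
could behave: intermediate data is retained under a fixed memory budget and the
least recently used entries are freed when a new query needs the space. This is
not AsterixDB or Hyracks code. The class name ResultCache, its methods, the
byte-based budget, and the choice of least-recently-used eviction are all
assumptions made for illustration; the issue itself does not prescribe an
eviction policy.

```python
from collections import OrderedDict


class ResultCache:
    """Hypothetical cache that retains intermediate data under a byte budget."""

    def __init__(self, budget_bytes):
        self.budget_bytes = budget_bytes
        self.used_bytes = 0
        # Maps key -> (data, size_bytes); least recently used entries come first.
        self._entries = OrderedDict()

    def get(self, key):
        """Return the retained data for key, or None if it is not in memory."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return entry[0]

    def make_room(self, needed_bytes):
        """Free least recently used entries until needed_bytes are available.

        Returns the evicted keys. A new query could call this before it
        allocates its own working memory.
        """
        evicted = []
        while self._entries and self.budget_bytes - self.used_bytes < needed_bytes:
            key, (_, size) = self._entries.popitem(last=False)
            self.used_bytes -= size
            evicted.append(key)
        return evicted

    def put(self, key, data, size_bytes):
        """Retain data after a query completes; refuse entries over the budget."""
        if key in self._entries:
            # Replace an existing entry: release its bytes first.
            self.used_bytes -= self._entries.pop(key)[1]
        if size_bytes > self.budget_bytes:
            return False
        self.make_room(size_bytes)
        self._entries[key] = (data, size_bytes)
        self.used_bytes += size_bytes
        return True


if __name__ == "__main__":
    cache = ResultCache(budget_bytes=100)
    cache.put("fact_table_scan", ["row1", "row2"], size_bytes=60)
    # A later query on the same table can reuse the data instead of reloading it.
    print(cache.get("fact_table_scan"))
    # A new query that needs 80 bytes forces the retained data to be freed.
    print(cache.make_room(80))
```

With repeated aggregations over the same fact table, the second and later
queries would hit get() instead of going back to disk, while make_room() keeps
the retained data from starving a new query of memory.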



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
