[ 
https://issues.apache.org/jira/browse/DRILL-5453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16067816#comment-16067816
 ] 

Paul Rogers commented on DRILL-5453:
------------------------------------

Noticed that the query did finish on the test node, as reported in the bug. It 
can't have been run with the memory settings in the description, however, for 
the reasons stated above.

Increased per-node memory to 2 GB. Query ran, but slowly.

{code}
Results: 1 records, 2 batches, 3,776,392 ms
{code}

The 1 hour+ run time is still far shorter than the 16 hours observed by the 
ticket author. Since the original data set is only 40 MB (!) in size, an hour 
run time is still unacceptable. 40 MB of data should be sortable in memory in 
seconds. Hence, there are probably many opportunities for improvement here.
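As a rough sanity check on the "seconds, not hours" claim, here is a minimal 
back-of-the-envelope sketch (plain Python, not Drill code): it builds roughly 
40 MB of fixed-width rows and sorts them in memory on a single thread, which 
completes in on the order of a second on commodity hardware.

```python
import random
import time

# Hypothetical stand-in for the 40 MB data set: 1,000,000 rows of 40 random
# bytes each (~40 MB of key data). Not the actual 3500cols.tbl file.
random.seed(42)
rows = [random.randbytes(40) for _ in range(1_000_000)]

start = time.time()
rows.sort()  # single-threaded, fully in-memory sort
elapsed = time.time() - start

print(f"sorted {len(rows):,} rows in {elapsed:.2f} s")
assert all(rows[i] <= rows[i + 1] for i in range(len(rows) - 1))
```

Even allowing an order of magnitude for Drill's wider rows and comparison 
overhead, this supports the claim that a 40 MB data set should sort in seconds.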

Performance monitoring shows constant disk activity, but at a low rate (~30 
MB/s) compared to the capability of the Mac SSD (~500 MB/s). CPU is at 100% 
during the run, showing that Drill is CPU bound and cannot saturate the disk 
channel.

Another issue in this query is the many references to entries in the columns 
array. Each of these causes Drill to materialize a copy of the column for use 
as a sort key. This causes the already-large 3500-column row to grow even 
larger, putting much more pressure on memory. A far better solution is to 
modify the generated code to access column values directly. Yes, each access 
may be slower, but overall performance may be faster because of the much 
smaller amount of data that must be buffered, spilled and reread.
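The two strategies can be contrasted with a small sketch (illustrative Python 
only, not Drill's generated code; the row layout and key indexes are 
hypothetical): one materializes key copies onto each row, the other reads the 
column values in place at compare time.

```python
# Hypothetical wide rows: 1,000 rows of 50 string columns each.
rows = [[f"r{r}c{c}" for c in range(50)] for r in range(1000)]
key_indexes = [45, 33, 23]  # stand-ins for columns[450], columns[330], ...

# Strategy 1: materialize a copy of each sort key onto the row (what the
# comment describes). Every row grows by len(key_indexes) extra values,
# inflating the data that must be buffered and spilled.
widened = [row + [row[i] for i in key_indexes] for row in rows]
widened.sort(key=lambda row: row[-3:])

# Strategy 2: access the column values directly at compare time. Comparisons
# may cost a little more, but no per-row copies are made.
rows.sort(key=lambda row: tuple(row[i] for i in key_indexes))

# Both orderings agree; only the buffered row width differs.
assert [row[:50] for row in widened] == rows
```

The trade-off is exactly the one described above: strategy 2 pays at compare 
time but keeps rows at their original width, which matters far more once data 
must be spilled and reread.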

So, in addition to needing fixes to limit batch size, this ticket also points 
to what is likely a large number of performance improvement opportunities 
throughout the Drill execution engine.

> Managed External Sort : Sorting on a lot of columns is taking unreasonably 
> long time
> ------------------------------------------------------------------------------------
>
>                 Key: DRILL-5453
>                 URL: https://issues.apache.org/jira/browse/DRILL-5453
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Relational Operators
>    Affects Versions: 1.10.0
>            Reporter: Rahul Challapalli
>            Assignee: Paul Rogers
>         Attachments: drill5453.sys.drill
>
>
> The below query ran for ~16hrs before I cancelled it.
> {code}
> ALTER SESSION SET `exec.sort.disable_managed` = false;
> alter session set `planner.memory.max_query_memory_per_node` = 482344960;
> alter session set `planner.width.max_per_node` = 1;
> alter session set `planner.width.max_per_query` = 1;
> select count(*) from (select * from 
> dfs.`/drill/testdata/resource-manager/3500cols.tbl` order by 
> columns[450],columns[330],columns[230],columns[220],columns[110],columns[90],columns[80],columns[70],columns[40],columns[10],columns[20],columns[30],columns[40],columns[50],
>  
> columns[454],columns[413],columns[940],columns[834],columns[73],columns[140],columns[104],columns[2222],columns[30],columns[2420],columns[1520],
>  columns[1410], 
> columns[1110],columns[1290],columns[2380],columns[705],columns[45],columns[1054],columns[2430],columns[420],columns[404],columns[3350],
>  
> columns[3333],columns[153],columns[356],columns[84],columns[745],columns[1450],columns[103],columns[2065],columns[343],columns[3420],columns[530],
>  columns[3210] ) d where d.col433 = 'sjka skjf';
> alter session set `planner.memory.max_query_memory_per_node` = 2147483648;
> {code}
> The data set and the logs are too large to attach to a jira. But below is a 
> description of the data
> {code}
> No of records : 1,000,000
> No of columns : 3500
> Length of each column : < 50
> {code}
> The profile is attached and I will give my analysis on why I think its an 
> un-reasonable amount of time soon.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
