[ 
https://issues.apache.org/jira/browse/DRILL-2100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250260#comment-15250260
 ] 

ASF GitHub Bot commented on DRILL-2100:
---------------------------------------

Github user adeneche commented on a diff in the pull request:

    https://github.com/apache/drill/pull/454#discussion_r60447085
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/ExternalSortBatch.java ---
    @@ -550,7 +565,15 @@ public BatchGroup mergeAndSpill(LinkedList<BatchGroup> batchGroups) throws Schem
         c1.buildSchema(BatchSchema.SelectionVectorMode.NONE);
         c1.setRecordCount(count);
     
    -    String outputFile = Joiner.on("/").join(dirs.next(), fileName, spillCount++);
    +    String spillDir = dirs.next();
    +    Path currSpillPath = new Path(Joiner.on("/").join(spillDir, fileName));
    +    currSpillDirs.add(currSpillPath);
    +    String outputFile = Joiner.on("/").join(currSpillPath, spillCount++);
    +    try {
    +        fs.deleteOnExit(currSpillPath);
    +    } catch (IOException e) {
    +        throw new RuntimeException(e);
    --- End diff --
    
    I have some concerns about throwing an exception here:
    
    First, does it make sense to fail the query when `deleteOnExit()` fails?
    Shouldn't we just log a warning instead? After all, if all goes well, this
    folder will be deleted in the close method.
    
    Second, if this exception is thrown, we'll leak memory because we'll skip
    clearing batchGroupList and hyperBatch.
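
A minimal sketch of both points, not the actual patch. `SpillFs`, `registerForCleanup`, and `spillAndRelease` are hypothetical names made up for illustration; `SpillFs` only mimics the shape of a `deleteOnExit` call that returns a boolean and may throw `IOException`, and it takes a plain String rather than a `Path`. The list of `AutoCloseable` buffers stands in for batchGroupList and hyperBatch.

```java
import java.io.IOException;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

// Sketch only: SpillFs and SpillCleanupSketch are hypothetical stand-ins,
// not Drill or Hadoop classes.
class SpillCleanupSketch {
    private static final Logger logger = Logger.getLogger("SpillCleanupSketch");

    /** Hypothetical stand-in for the file system handle used by the sort. */
    interface SpillFs {
        boolean deleteOnExit(String path) throws IOException;
    }

    /**
     * Best-effort registration: a failure is logged as a warning and never
     * propagated, so the query does not fail because of it.
     * Returns true only if the path was registered.
     */
    static boolean registerForCleanup(SpillFs fs, String spillPath) {
        try {
            return fs.deleteOnExit(spillPath);
        } catch (IOException e) {
            logger.log(Level.WARNING,
                "Unable to mark spill directory for deletion on exit: " + spillPath, e);
            return false;
        }
    }

    /**
     * Runs the spill step and releases every buffer in a finally block, so
     * the buffers are cleared even when the spill step throws.
     */
    static void spillAndRelease(List<AutoCloseable> buffers, Runnable spillStep) {
        try {
            spillStep.run();
        } finally {
            for (AutoCloseable buffer : buffers) {
                try {
                    buffer.close();
                } catch (Exception e) {
                    logger.log(Level.WARNING, "Failed to release buffer", e);
                }
            }
            buffers.clear();
        }
    }
}
```

With this shape, a failed registration costs only a log line, since the close method is still expected to delete the folder, and any exception raised while spilling still propagates but can no longer skip the release of the buffers.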


> Drill not deleting spooling files
> ---------------------------------
>
>                 Key: DRILL-2100
>                 URL: https://issues.apache.org/jira/browse/DRILL-2100
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Relational Operators
>    Affects Versions: 0.8.0
>            Reporter: Abhishek Girish
>            Assignee: Vitalii Diravka
>             Fix For: Future
>
>
> Currently, forcing queries to use an external sort by switching off 
> hash join/agg causes spill-to-disk files to accumulate. 
> This causes issues with disk space availability when the spill is configured 
> to be on the local file system (/tmp/drill). It is also not optimal when 
> configured to use DFS (custom). 
> Drill must clean up all temporary files created after a query completes or 
> after a drillbit restart. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
