[ 
https://issues.apache.org/jira/browse/PIG-1115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12830383#action_12830383
 ] 

Gaurav Jain commented on PIG-1115:
----------------------------------

Proposed Solution:

-- Zebra will implement ZebraOutputCommitter

-- Zebra FrontEnd will create all the final directories and schema files 

                    $basicTable/.btschema
                    $basicTable/CG0/.schema
                    $basicTable/CG1/.schema


-- Zebra will create a temporary directory per BasicTable and write all data 
there during RecordWriter.write(), under

                     $basicTable/_temporary/CG0/part-0000
                     $basicTable/_temporary/CG1/part-0000

-- _temporary directory will always be created under $basicTable

-- In the BackEnd, Zebra creates RecordWriters, which in turn create a 
CGInserter. CGInserter works on a directory, which we call 'workOutputPath':
                                  $basicTable/_temporary/$CG/
             But it needs the .schema file, which lives in the final column 
group directory rather than under _temporary. So it goes 2 levels up to 
$basicTable and reads the schema file from
                                  $basicTable/$workOutputPath.getName()

-- In CGInserter.close(), the part file is moved out of _temporary:
                     $basicTable/_temporary/CG0/part-0000
                         ----------->    $basicTable/CG0/part-0000
-- In ZebraOutputCommitter.cleanupJob(), BasicTableOutputFormat.close() will be 
called.
-- In BasicTableOutputFormat.close():
                      remove($basicTable/_temporary/)
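
To make the directory flow above easier to follow, here is a rough sketch of it 
on a local filesystem. It is written in Python with pathlib purely for 
illustration; it is not Zebra or Hadoop code. All function names (setup_table, 
work_output_path, schema_path, write_part, close_inserter, cleanup_job, demo) 
and the placeholder schema contents are made up for this example, and the move 
in close_inserter assumes the arrow above means a rename.

```python
import shutil
import tempfile
from pathlib import Path

TEMP_DIR = "_temporary"  # always created directly under $basicTable


def setup_table(basic_table, column_groups):
    """Front end: create the final directories and schema files up front."""
    basic_table.mkdir(parents=True, exist_ok=True)
    (basic_table / ".btschema").write_text("placeholder table schema\n")
    for cg in column_groups:
        (basic_table / cg).mkdir(exist_ok=True)
        (basic_table / cg / ".schema").write_text("placeholder schema\n")


def work_output_path(basic_table, cg):
    """Back end: working directory $basicTable/_temporary/$CG/."""
    path = basic_table / TEMP_DIR / cg
    path.mkdir(parents=True, exist_ok=True)
    return path


def schema_path(work_path):
    """Go 2 levels up to $basicTable, then into the final CG directory."""
    basic_table = work_path.parent.parent
    return basic_table / work_path.name / ".schema"


def write_part(work_path, part, data):
    """Write task output under _temporary only."""
    (work_path / part).write_text(data)


def close_inserter(work_path, part):
    """Move a finished part file from _temporary to the final CG directory."""
    final = schema_path(work_path).parent / part
    shutil.move(str(work_path / part), str(final))
    return final


def cleanup_job(basic_table):
    """Remove $basicTable/_temporary, including leftovers of failed tasks."""
    shutil.rmtree(basic_table / TEMP_DIR, ignore_errors=True)


def demo():
    with tempfile.TemporaryDirectory() as root:
        table = Path(root) / "basicTable"
        setup_table(table, ["CG0", "CG1"])
        for cg in ("CG0", "CG1"):
            work = work_output_path(table, cg)
            assert schema_path(work).is_file()
            write_part(work, "part-0000", "rows\n")
            close_inserter(work, "part-0000")
        # Simulate a failed task that wrote data but never reached close().
        write_part(work_output_path(table, "CG0"), "part-0001", "partial\n")
        cleanup_job(table)
        return sorted(
            p.relative_to(table).as_posix()
            for p in table.rglob("*")
            if p.is_file()
        )
```

In this sketch, demo() returns only .btschema, the two .schema files, and the 
two committed part-0000 files; the part-0001 file left behind by the simulated 
failed task disappears together with _temporary.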






> [zebra] temp files are not cleaned.
> -----------------------------------
>
>                 Key: PIG-1115
>                 URL: https://issues.apache.org/jira/browse/PIG-1115
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.7.0
>            Reporter: Hong Tang
>
> Temp files created by zebra during table creation are not cleaned up when 
> there is any task failure, which wastes disk space.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
