[jira] Commented: (PIG-1115) [zebra] temp files are not cleaned.

2010-02-05 Thread Gaurav Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12830383#action_12830383
 ] 

Gaurav Jain commented on PIG-1115:
--

Proposed Solution:

-- Zebra will implement ZebraOutputCommitter.

-- The Zebra front end will create all the final directories and schema files:

$basicTable/.btschema
$basicTable/CG0/.schema
$basicTable/CG1/.schema


-- Zebra will create one temporary directory per BasicTable and write all data 
there during RecordWriter.write(), under

 $basicTable/_temporary/CG0/part-
 $basicTable/_temporary/CG1/part-

-- The _temporary directory will always be created under $basicTable.

-- In the back end, Zebra creates RecordWriters, which in turn create 
CGInserters. A CGInserter works on a directory, which we call 'workOutputPath':
  $basicTable/_temporary/$CG/
 But it needs the .schema file, which is located 2 levels up, so it 
reads the schema file from
  $basicTable/$workOutputPath.getName()

-- In CGInserter.close(), the part file is moved:
 $basicTable/_temporary/CG0/part-   --->
  $basicTable/CG0/part-

-- In ZebraOutputCommitter.cleanupJob(), BasicTableOutputFormat.close() will be 
called.

-- In BasicTableOutputFormat.close():
  remove($basicTable/_temporary/)
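
The flow proposed above (final directories and schema files created up front, data written under _temporary, part files moved on close, _temporary removed at job cleanup) can be sketched roughly as follows. This is only an illustration on the local filesystem using java.nio.file; it is not the Zebra or Hadoop code from the patch. The class TempCommitSketch, its method names, the schema file contents, and the part file name "part-00000" are all hypothetical placeholders.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical local-filesystem analogue of the proposal; not Zebra's actual classes.
class TempCommitSketch {
    static final String TEMP_DIR = "_temporary";

    /** Front end: create the final directories and schema files up front. */
    static void createTable(Path basicTable, String... columnGroups) throws IOException {
        Files.createDirectories(basicTable);
        write(basicTable.resolve(".btschema"), "table schema placeholder");
        for (String cg : columnGroups) {
            Path cgDir = Files.createDirectories(basicTable.resolve(cg));
            write(cgDir.resolve(".schema"), "schema placeholder for " + cg);
        }
    }

    /** Back end: the per-column-group work directory, $basicTable/_temporary/$CG. */
    static Path workOutputPath(Path basicTable, String cg) throws IOException {
        return Files.createDirectories(basicTable.resolve(TEMP_DIR).resolve(cg));
    }

    /** The table root is two levels above the work directory; the final
     *  column-group directory shares the work directory's name. */
    static Path finalDir(Path workOutputPath) {
        Path basicTable = workOutputPath.getParent().getParent();
        return basicTable.resolve(workOutputPath.getFileName());
    }

    /** On close: move a finished part file from the work directory to the final one. */
    static Path commitPart(Path workOutputPath, String partName) throws IOException {
        return Files.move(workOutputPath.resolve(partName),
                          finalDir(workOutputPath).resolve(partName));
    }

    /** Job cleanup: delete $basicTable/_temporary and everything under it. */
    static void cleanupJob(Path basicTable) throws IOException {
        Path temp = basicTable.resolve(TEMP_DIR);
        if (!Files.exists(temp)) return;
        List<Path> deepestFirst;
        try (Stream<Path> walk = Files.walk(temp)) {
            // Reverse order puts children before their parent directories.
            deepestFirst = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : deepestFirst) Files.delete(p);
    }

    private static void write(Path file, String text) throws IOException {
        Files.write(file, text.getBytes(StandardCharsets.UTF_8));
    }

    /** Runs the whole flow once in a fresh temp directory and reports what happened. */
    static String demo() {
        try {
            Path table = Files.createTempDirectory("zebra-sketch").resolve("basicTable");
            createTable(table, "CG0", "CG1");
            Path work = workOutputPath(table, "CG0");
            boolean schemaFound = Files.exists(finalDir(work).resolve(".schema"));
            write(work.resolve("part-00000"), "rows");   // hypothetical part file name
            Path committed = commitPart(work, "part-00000");
            cleanupJob(table);
            return "schemaFound=" + schemaFound
                + " committed=" + Files.exists(committed)
                + " tempRemoved=" + !Files.exists(table.resolve(TEMP_DIR));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this sketch a task that fails before calling commitPart leaves its partial output only under _temporary, so the single recursive delete in cleanupJob is enough to reclaim the space.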






> [zebra] temp files are not cleaned.
> ---
>
> Key: PIG-1115
> URL: https://issues.apache.org/jira/browse/PIG-1115
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.7.0
> Reporter: Hong Tang
>
> Temp files created by Zebra during table creation are not cleaned up when 
> there is any task failure, which wastes disk space.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1115) [zebra] temp files are not cleaned.

2010-02-16 Thread Hong Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834373#action_12834373
 ] 

Hong Tang commented on PIG-1115:


Why not request that the patch be backported to Hadoop 0.21? (By the way, do 
you mean Hadoop 0.21 or 0.20?)

> [zebra] temp files are not cleaned.
> ---
>
> Key: PIG-1115
> URL: https://issues.apache.org/jira/browse/PIG-1115
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.7.0
> Reporter: Hong Tang
> Assignee: Gaurav Jain
> Attachments: PIG-1115.patch
>
>
> Temp files created by Zebra during table creation are not cleaned up when 
> there is any task failure, which wastes disk space.




[jira] Commented: (PIG-1115) [zebra] temp files are not cleaned.

2010-02-16 Thread Gaurav Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834380#action_12834380
 ] 

Gaurav Jain commented on PIG-1115:
--


We discussed the backport with the M/R team (patch MAPREDUCE-947); the earliest 
it can be done is in the next release of Hadoop.

I meant Hadoop 0.20/0.21 (any release other than trunk).




[jira] Commented: (PIG-1115) [zebra] temp files are not cleaned.

2010-02-16 Thread Yan Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834409#action_12834409
 ] 

Yan Zhou commented on PIG-1115:
---

Patch reviewed. +1




[jira] Commented: (PIG-1115) [zebra] temp files are not cleaned.

2010-02-16 Thread Yan Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834445#action_12834445
 ] 

Yan Zhou commented on PIG-1115:
---

Hudson results on the load-store-redesign branch:

+1 overall.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 7 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.
