[jira] Commented: (HIVE-1651) ScriptOperator should not forward any output to downstream operators if an exception is happened

2010-09-21 Thread Ning Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12913301#action_12913301
 ] 

Ning Zhang commented on HIVE-1651:
--

Discussed with Joydeep offline. The side effects of a failed task should be 
cleaned up after the job finishes. _tmp* files are already taken care of in the 
current code base. The only side effect that still needs to be taken care of is 
the empty directories created by failed dynamic partition inserts. This issue 
is addressed in HIVE-1655. 
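
As a rough illustration of the cleanup described above, the following is a minimal sketch of pruning empty directories once a job has finished. It assumes a local file system via java.nio; the class and method names (EmptyDirCleaner, removeEmptyDirs) are hypothetical, and this is not the HIVE-1655 patch, which works against Hadoop's file system API.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; not taken from the Hive code base. */
final class EmptyDirCleaner {

    /** Deletes empty directories below root, bottom-up, and returns the ones removed. The root itself is kept. */
    static List<Path> removeEmptyDirs(Path root) {
        List<Path> removed = new ArrayList<>();
        try {
            prune(root, root, removed);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return removed;
    }

    // Returns true only if dir ended up empty and was deleted.
    private static boolean prune(Path dir, Path root, List<Path> removed) throws IOException {
        List<Path> children = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path child : stream) {
                children.add(child);
            }
        }
        boolean empty = true;
        for (Path child : children) {
            if (Files.isDirectory(child) && prune(child, root, removed)) {
                continue; // child directory was empty and has been deleted
            }
            empty = false; // a file, or a directory that still holds data
        }
        if (empty && !dir.equals(root)) {
            Files.delete(dir);
            removed.add(dir);
            return true;
        }
        return false;
    }
}
```

Calling EmptyDirCleaner.removeEmptyDirs(outputRoot) after the job would leave partitions that contain files untouched and drop only the directories that a failed insert left empty.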


> ScriptOperator should not forward any output to downstream operators if an 
> exception is happened
> 
>
> Key: HIVE-1651
> URL: https://issues.apache.org/jira/browse/HIVE-1651
> Project: Hadoop Hive
> Issue Type: Bug
> Reporter: Ning Zhang
> Assignee: Ning Zhang
> Attachments: HIVE-1651.patch
>
>
> ScriptOperator spawns 2 threads to read the script's stdout and stderr, and 
> forwards the stdout output to downstream operators. If the script hits an 
> exception (e.g., it gets killed), the ScriptOperator gets an exception and 
> throws it to upstream operators until the MapOperator catches it and calls 
> close(abort). Before ScriptOperator.close() is called, the script output 
> stream can still forward output to downstream operators. We should 
> terminate it immediately.
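
To make the time window in the description concrete, here is a minimal sketch of a forwarding loop that checks an abort flag before passing each row on. It is an assumption about how such a guard could look, not the contents of HIVE-1651.patch; GuardedForwarder, abort(), drain() and the in-memory downstream list are all hypothetical stand-ins.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical sketch; not the actual ScriptOperator code. */
final class GuardedForwarder {
    // Set by the operator thread as soon as it sees the script fail.
    private final AtomicBoolean aborted = new AtomicBoolean(false);
    // Stand-in for the downstream operators; only the draining thread writes to it.
    private final List<String> downstream = new ArrayList<>();

    void abort() {
        aborted.set(true);
    }

    List<String> forwarded() {
        return downstream;
    }

    /** Reads script output row by row and forwards rows only while no abort was signalled. */
    void drain(BufferedReader scriptStdout) {
        try {
            String row;
            while ((row = scriptStdout.readLine()) != null) {
                if (aborted.get()) {
                    return; // stop forwarding without waiting for close()
                }
                downstream.add(row);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this sketch drain() would run on the stdout reader thread while abort() is called from the thread that observes the failure; the AtomicBoolean makes the flag visible across threads, so no further rows are forwarded once it is set, even though close() has not yet run.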

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-1651) ScriptOperator should not forward any output to downstream operators if an exception is happened

2010-09-17 Thread Joydeep Sen Sarma (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910837#action_12910837
 ] 

Joydeep Sen Sarma commented on HIVE-1651:
-

yeah - but then the directory itself should be created as a tmp directory, and 
we should promote the directory to its final name only when closing 
successfully.
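
The suggestion above can be sketched roughly as follows. This is only an illustration on the local file system using java.nio; Hive itself goes through Hadoop's file system API, and the class name TmpDirPromotion and the "_tmp." prefix are assumptions made for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

/** Hypothetical sketch; not taken from the Hive code base. */
final class TmpDirPromotion {
    private final Path tmpDir;
    private final Path finalDir;

    /** Creates the partition directory under a temporary name only. */
    TmpDirPromotion(Path parent, String partitionName) {
        this.tmpDir = parent.resolve("_tmp." + partitionName);
        this.finalDir = parent.resolve(partitionName);
        try {
            Files.createDirectories(tmpDir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    Path tmpDir() {
        return tmpDir;
    }

    Path finalDir() {
        return finalDir;
    }

    /** On success, renames the tmp directory to its final name; on abort, deletes it. */
    void close(boolean abort) {
        try {
            if (abort) {
                deleteRecursively(tmpDir);
            } else {
                Files.move(tmpDir, finalDir);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static void deleteRecursively(Path root) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            // Reverse order visits children before their parent directory.
            paths.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
    }
}
```

With this shape, a failed task never leaves a directory under the final partition name, so nothing can later be mistaken for an empty partition.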




[jira] Commented: (HIVE-1651) ScriptOperator should not forward any output to downstream operators if an exception is happened

2010-09-17 Thread Ning Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910834#action_12910834
 ] 

Ning Zhang commented on HIVE-1651:
--

@joydeep, the output file will not be committed if an exception occurs and 
close(abort=true) is called. This bug happens in the short time window after 
the exception occurs and before close(abort) is called. Although the file gets 
deleted, the dynamic partition insert has already created a directory, which 
will later be treated as an empty partition. 




[jira] Commented: (HIVE-1651) ScriptOperator should not forward any output to downstream operators if an exception is happened

2010-09-17 Thread Joydeep Sen Sarma (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910786#action_12910786
 ] 

Joydeep Sen Sarma commented on HIVE-1651:
-

if a hadoop task fails - how is it that any side-effect files created by hive 
code running in that task are getting promoted to the final output?

i think the forwarding is a red herring. we should not commit output files from 
a failed task.




[jira] Commented: (HIVE-1651) ScriptOperator should not forward any output to downstream operators if an exception is happened

2010-09-17 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910747#action_12910747
 ] 

Namit Jain commented on HIVE-1651:
--

+1
