[ 
https://issues.apache.org/jira/browse/PIG-3095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13533168#comment-13533168
 ] 

Cheolsoo Park commented on PIG-3095:
------------------------------------

Hello Nick,

Thank you very much for the patch! I have a few comments:
- The [guava doc|http://code.google.com/p/guava-libraries/wiki/CachesExplained] 
says "If you have defined a CacheLoader that does not declare any checked 
exceptions then you can perform cache lookups using getUnchecked(K)".
{code}
private static final class Which extends CacheLoader<String, String> {
     public String load(String file) {
        ...
         } catch (Exception e) {}
         return null;
     }
}
{code}
Since {{Which.load()}} does not declare any checked exceptions, can't we use 
{{getUnchecked()}} without the {{try/catch}} block? In fact, I think that no 
exception will be ever caught by the following {{catch}} since {{Which.load()}} 
already catches any exception. Please correct me if I am wrong.
{code}
        try {
            argPath = whichCache.getUnchecked(arg);
        } catch(ExecutionException e) {
            argPath = null;
        }
{code}
With this change, "{{import java.util.concurrent.ExecutionException}}" is not 
needed.
- Can you please remove the following statement since it's unused?
{code}
import com.google.common.cache.Cache;
{code}
- Can you please fix indentation in {{CacheLoader}}? This is probably because 
you used "{{git diff -w}}".
{code}
private static final class Which extends CacheLoader<String, String> {
public String load(String file) {
...
}
}
{code}
                
> "which" is called many, many times for each Pig STREAM statement
> ----------------------------------------------------------------
>
>                 Key: PIG-3095
>                 URL: https://issues.apache.org/jira/browse/PIG-3095
>             Project: Pig
>          Issue Type: Bug
>          Components: grunt, impl
>    Affects Versions: 0.12
>            Reporter: Nick White
>            Assignee: Nick White
>              Labels: patch, performance
>             Fix For: 0.12
>
>         Attachments: PIG-3095.patch
>
>
> STREAM statements are checked by the LogicalPlanBuilder as it comes across 
> them - and these checks include running the system utility "which". However, 
> due to the backtracking parsing mechanism "which" is called repeatedly with 
> the same arguments (I noticed this while profiling a script with 4 STREAM 
> statements - "which" was run over 230 times!). The attached patch just caches 
> the return value of "which", reducing the overhead of running a system 
> process to a Map lookup.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to