[ https://issues.apache.org/jira/browse/PIG-3095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13533168#comment-13533168 ]
Cheolsoo Park commented on PIG-3095: ------------------------------------ Hello Nick, Thank you very much for the patch! I have a few comments: - The [guava doc|http://code.google.com/p/guava-libraries/wiki/CachesExplained] says "If you have defined a CacheLoader that does not declare any checked exceptions then you can perform cache lookups using getUnchecked(K)". {code} private static final class Which extends CacheLoader<String, String> { public String load(String file) { ... } catch (Exception e) {} return null; } } {code} Since {{Which.load()}} does not declare any checked exceptions, can't we use {{getUnchecked()}} without the {{try/catch}} block? In fact, I think that no exception will be ever caught by the following {{catch}} since {{Which.load()}} already catches any exception. Please correct me if I am wrong. {code} try { argPath = whichCache.getUnchecked(arg); } catch(ExecutionException e) { argPath = null; } {code} With this change, "{{import java.util.concurrent.ExecutionException}}" is not needed. - Can you please remove the following statement since it's unused? {code} import com.google.common.cache.Cache; {code} - Can you please fix indentation in {{CacheLoader}}? This is probably because you used "{{git diff -w}}". {code} private static final class Which extends CacheLoader<String, String> { public String load(String file) { ... } } {code} > "which" is called many, many times for each Pig STREAM statement > ---------------------------------------------------------------- > > Key: PIG-3095 > URL: https://issues.apache.org/jira/browse/PIG-3095 > Project: Pig > Issue Type: Bug > Components: grunt, impl > Affects Versions: 0.12 > Reporter: Nick White > Assignee: Nick White > Labels: patch, performance > Fix For: 0.12 > > Attachments: PIG-3095.patch > > > STREAM statements are checked by the LogicalPlanBuilder as it comes across > them - and these checks include running the system utility "which". However, > due to the backtracking parsing mechanism "which" is called repeatedly with > the same arguments (I noticed this while profiling a script with 4 STREAM > statements - "which" was run over 230 times!). The attached patch just caches > the return value of "which", reducing the overhead of running a system > process to a Map lookup. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira