[ https://issues.apache.org/jira/browse/PIG-3095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13533188#comment-13533188 ]
Cheolsoo Park commented on PIG-3095:
------------------------------------
I also noticed that your patch breaks {{TestStreaming}}. Here is one example
among others:
{code}
Testcase: testInputOutputSpecs took 0.022 sec
Caused an ERROR
Error during parsing. CacheLoader returned null for key script1259692341619822790pl.
org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1000: Error during parsing. CacheLoader returned null for key script1259692341619822790pl.
    at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1617)
    at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1556)
    at org.apache.pig.PigServer.registerQuery(PigServer.java:526)
    at org.apache.pig.PigServer.registerQuery(PigServer.java:539)
    at org.apache.pig.test.TestStreaming.testInputOutputSpecs(TestStreaming.java:640)
Caused by: Failed to parse: CacheLoader returned null for key script1259692341619822790pl.
    at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:193)
    at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1609)
Caused by: com.google.common.cache.CacheLoader$InvalidCacheLoadException: CacheLoader returned null for key script1259692341619822790pl.
    at com.google.common.cache.LocalCache$Segment.getAndRecordStats(LocalCache.java:2383)
    at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2351)
    at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2313)
    at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
    at com.google.common.cache.LocalCache.get(LocalCache.java:3967)
    at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3971)
    at com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4831)
    at org.apache.pig.parser.StreamingCommandUtils.checkAndShip(StreamingCommandUtils.java:155)
    at org.apache.pig.parser.StreamingCommandUtils.checkAutoShipSpecs(StreamingCommandUtils.java:135)
    at org.apache.pig.parser.LogicalPlanBuilder.buildCommand(LogicalPlanBuilder.java:1134)
    at org.apache.pig.parser.LogicalPlanBuilder.buildCommand(LogicalPlanBuilder.java:1087)
    at org.apache.pig.parser.LogicalPlanGenerator.cmd(LogicalPlanGenerator.java:2097)
    at org.apache.pig.parser.LogicalPlanGenerator.define_clause(LogicalPlanGenerator.java:1802)
    at org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1295)
    at org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:799)
    at org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:517)
    at org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:392)
    at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:184)
{code}
To reproduce, please run:
{code}
ant clean test -Dtestcase=TestStreaming
{code}
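A possible direction, offered as a sketch only: the innermost exception in the trace is Guava's {{InvalidCacheLoadException}}, which Guava throws whenever a {{CacheLoader}} returns null. If the loader in the patch returns null when "which" cannot find the command (an assumption on my part), then caching an {{Optional}} instead would let a miss be stored like any other value. The example below uses only the JDK, not Guava, to stay self-contained; {{WhichCache}}, {{resolve}} and the resolver function are hypothetical names for illustration and are not part of Pig or the attached patch.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical helper for illustration; not part of Pig or PIG-3095.patch. */
final class WhichCache {
    // Optional.empty() records "command not found", so misses are cached too.
    private final Map<String, Optional<String>> cache = new ConcurrentHashMap<>();
    // Stand-in for the real "which" call; allowed to return null on a miss.
    private final Function<String, String> resolver;
    // Counts how often the underlying resolver actually runs.
    private final AtomicInteger lookups = new AtomicInteger();

    WhichCache(Function<String, String> resolver) {
        this.resolver = resolver;
    }

    /** Returns the cached path for a command, or Optional.empty() if it was not found. */
    Optional<String> resolve(String command) {
        return cache.computeIfAbsent(command, c -> {
            lookups.incrementAndGet();
            // Wrap the possibly-null result so the cache never stores null.
            return Optional.ofNullable(resolver.apply(c));
        });
    }

    int lookupCount() {
        return lookups.get();
    }

    public static void main(String[] args) {
        // Fake resolver: only "perl" is considered to be on the PATH.
        WhichCache cache = new WhichCache(cmd -> cmd.equals("perl") ? "/usr/bin/perl" : null);
        System.out.println(cache.resolve("perl"));
        System.out.println(cache.resolve("script.pl"));
        System.out.println(cache.resolve("script.pl"));
        System.out.println("resolver calls: " + cache.lookupCount());
    }
}
```

With a Guava {{LoadingCache}} the same idea would mean declaring the value type as an {{Optional}} and having the loader return an absent value instead of null; whether that fits the patch depends on how {{checkAndShip}} uses the result.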
> "which" is called many, many times for each Pig STREAM statement
> ----------------------------------------------------------------
>
> Key: PIG-3095
> URL: https://issues.apache.org/jira/browse/PIG-3095
> Project: Pig
> Issue Type: Bug
> Components: grunt, impl
> Affects Versions: 0.12
> Reporter: Nick White
> Assignee: Nick White
> Labels: patch, performance
> Fix For: 0.12
>
> Attachments: PIG-3095.patch
>
>
> STREAM statements are checked by the LogicalPlanBuilder as it comes across
> them - and these checks include running the system utility "which". However,
> due to the backtracking parsing mechanism "which" is called repeatedly with
> the same arguments (I noticed this while profiling a script with 4 STREAM
> statements - "which" was run over 230 times!). The attached patch just caches
> the return value of "which", reducing the overhead of running a system
> process to a Map lookup.