[jira] [Updated] (PIG-2779) Refactoring the code for setting number of reducers

2012-07-27 Thread Jie Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Li updated PIG-2779:


Attachment: PIG-2779.4.patch

Attached PIG-2779.4.patch, which incorporates #1 discussed above.

For #2, we can probably do it later, together with the cleanup of requestedParallel?

> Refactoring the code for setting number of reducers
> ---
>
> Key: PIG-2779
> URL: https://issues.apache.org/jira/browse/PIG-2779
> Project: Pig
>  Issue Type: Bug
>Reporter: Jie Li
>Assignee: Jie Li
> Fix For: 0.11
>
> Attachments: PIG-2779.0.patch, PIG-2779.1.patch, PIG-2779.2.patch, 
> PIG-2779.3.patch, PIG-2779.4.patch, TestNumberOfReducers.java, 
> TestNumberOfReducers.java
>
>
> As PIG-2652 observed, currently the code for setting number of reducers is a 
> little messy. MapReduceOper.requestedParallelism seems to be misused in some 
> places, and now we support runtime estimation of #reducer which further 
> complicates the problem.
> For example, if we specify parallel 1 for the order-by, the estimated 
> #reducer will be used. If we specify parallel 2 while it estimates 4, 
> order-by will fail due to "Illegal partition for Null". If we specify 
> parallel 4 while it estimates 2, then some reducers will have nothing to do. 
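
The three cases in the description can be illustrated with a small sketch. It assumes one possible rule (an explicit PARALLEL wins, otherwise the runtime estimate is used) and is not taken from any of the attached patches; the class and method names are hypothetical. The second method shows why a partitioner built for 4 partitions cannot be paired with a job that runs only 2 reducers, which is consistent with the "Illegal partition" failure mentioned above.

```java
/** Hypothetical helper for illustration; not part of Pig. */
final class ReducerCountSketch {

    /**
     * Picks a single reducer count for a job.
     *
     * @param requested value of PARALLEL, or -1 when the script gave none
     * @param estimated runtime estimate, assumed to be at least 1
     */
    static int resolve(int requested, int estimated) {
        if (estimated < 1) {
            throw new IllegalArgumentException("estimate must be >= 1");
        }
        // Assumed rule: an explicit PARALLEL always wins over the estimate.
        return requested > 0 ? requested : estimated;
    }

    /** A partition index is only usable if it is below the reducer count. */
    static boolean isLegalPartition(int partition, int numReducers) {
        return partition >= 0 && partition < numReducers;
    }

    public static void main(String[] args) {
        System.out.println(resolve(1, 4));          // 1: explicit PARALLEL 1 is honored
        System.out.println(resolve(-1, 4));         // 4: no PARALLEL, use the estimate
        // Partitioner sized for 4 partitions, job running 2 reducers:
        System.out.println(isLegalPartition(3, 2)); // false
    }
}
```

Whatever rule is chosen, the point of the sketch is that the same resolved number should feed both the partitioner and the job's reducer count.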

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (PIG-2779) Refactoring the code for setting number of reducers

2012-07-27 Thread Jie Li (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424224#comment-13424224
 ] 

Jie Li commented on PIG-2779:
-

Agreed with #1. 

Re #2, according to http://pig.apache.org/docs/r0.10.0/perf.html#parallel, 
these operators support PARALLEL:

COGROUP, CROSS, DISTINCT, GROUP, JOIN (inner), JOIN (outer), and ORDER BY.

We need to make sure the PARALLEL associated with these operators remains the same 
across the logical/physical/MR phases. It seems to suffer from the same complexity 
faced by requestedParallel, such as query transformation, multi-query 
optimization, etc. So it seems non-trivial?






[jira] [Commented] (PIG-2839) mock.Storage overwrites output with the last relation written when storing UNION

2012-07-27 Thread Julien Le Dem (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424222#comment-13424222
 ] 

Julien Le Dem commented on PIG-2839:


see: https://issues.apache.org/jira/browse/PIG-2848

> mock.Storage overwrites output with the last relation written when storing 
> UNION
> 
>
> Key: PIG-2839
> URL: https://issues.apache.org/jira/browse/PIG-2839
> Project: Pig
>  Issue Type: Bug
>Reporter: Julien Le Dem
>Assignee: Julien Le Dem
> Fix For: 0.11
>
> Attachments: PIG-2839.patch, PIG-2839_a.patch
>
>






[jira] [Updated] (PIG-2848) TestBuiltInBagToTupleOrString fails now that mock.Storage enforces not overwriting output

2012-07-27 Thread Julien Le Dem (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julien Le Dem updated PIG-2848:
---

Attachment: PIG-2848.patch

> TestBuiltInBagToTupleOrString fails now that mock.Storage enforces not 
> overwriting output
> -
>
> Key: PIG-2848
> URL: https://issues.apache.org/jira/browse/PIG-2848
> Project: Pig
>  Issue Type: Bug
>Reporter: Julien Le Dem
>Assignee: Julien Le Dem
> Attachments: PIG-2848.patch
>
>






[jira] [Updated] (PIG-2848) TestBuiltInBagToTupleOrString fails now that mock.Storage enforces not overwriting output

2012-07-27 Thread Julien Le Dem (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julien Le Dem updated PIG-2848:
---

Patch Info: Patch Available







[jira] [Created] (PIG-2848) TestBuiltInBagToTupleOrString fails now that mock.Storage enforces not overwriting output

2012-07-27 Thread Julien Le Dem (JIRA)
Julien Le Dem created PIG-2848:
--

 Summary: TestBuiltInBagToTupleOrString fails now that mock.Storage 
enforces not overwriting output
 Key: PIG-2848
 URL: https://issues.apache.org/jira/browse/PIG-2848
 Project: Pig
  Issue Type: Bug
Reporter: Julien Le Dem
Assignee: Julien Le Dem








[jira] [Commented] (PIG-2839) mock.Storage overwrites output with the last relation written when storing UNION

2012-07-27 Thread Julien Le Dem (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424220#comment-13424220
 ] 

Julien Le Dem commented on PIG-2839:


mock.Storage now enforces that output is not overwritten, which is exactly
what this test is doing.
I'll provide a patch.
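
As a rough illustration of the kind of check described here, the sketch below refuses a second write to the same location. It is a hypothetical in-memory store written for this note, not the actual mock.Storage code; the class name, method names and error message are all assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory store; not the actual mock.Storage implementation. */
final class NoOverwriteStoreSketch {
    private final Map<String, List<String>> outputs = new HashMap<>();

    /** Stores rows under a location and refuses to replace earlier output. */
    void store(String location, List<String> rows) {
        if (outputs.containsKey(location)) {
            throw new IllegalStateException("Output already written: " + location);
        }
        outputs.put(location, new ArrayList<>(rows)); // defensive copy
    }

    /** Returns the rows stored under a location, or null if none were stored. */
    List<String> get(String location) {
        return outputs.get(location);
    }

    public static void main(String[] args) {
        NoOverwriteStoreSketch store = new NoOverwriteStoreSketch();
        store.store("output", Arrays.asList("(a)", "(b)"));
        try {
            // A test that stores twice into the same location now fails here.
            store.store("output", Arrays.asList("(c)"));
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

A test that reuses one output location for two STORE statements would trip this kind of check, so each store needs its own location.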







Build failed in Jenkins: Pig-trunk #1284

2012-07-27 Thread Apache Jenkins Server
See 

Changes:

[billgraham] Update CHANGES.txt

[billgraham] PIG-2841: Inconsistent URL in Docs (eric59 via billgraham)

[billgraham] PIG-2843: Typo in Documentation (eric59 via billgraham)

--
[...truncated 38294 lines...]
[junit] at 
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.exclusiveUnregisterMBean(DefaultMBeanServerInterceptor.java:415)
[junit] at 
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.unregisterMBean(DefaultMBeanServerInterceptor.java:403)
[junit] at 
com.sun.jmx.mbeanserver.JmxMBeanServer.unregisterMBean(JmxMBeanServer.java:506)
[junit] at 
org.apache.hadoop.metrics2.util.MBeans.unregister(MBeans.java:71)
[junit] at 
org.apache.hadoop.hdfs.server.datanode.FSDataset.shutdown(FSDataset.java:1934)
[junit] Shutting down DataNode 2
[junit] at 
org.apache.hadoop.hdfs.server.datanode.DataNode.shutdown(DataNode.java:788)
[junit] at 
org.apache.hadoop.hdfs.MiniDFSCluster.shutdownDataNodes(MiniDFSCluster.java:566)
[junit] at 
org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:550)
[junit] at 
org.apache.pig.test.MiniGenericCluster.shutdownMiniDfsClusters(MiniGenericCluster.java:87)
[junit] at 
org.apache.pig.test.MiniGenericCluster.shutdownMiniDfsAndMrClusters(MiniGenericCluster.java:77)
[junit] at 
org.apache.pig.test.MiniGenericCluster.shutDown(MiniGenericCluster.java:68)
[junit] at 
org.apache.pig.test.TestStore.oneTimeTearDown(TestStore.java:129)
[junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[junit] at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
[junit] at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
[junit] at java.lang.reflect.Method.invoke(Method.java:597)
[junit] at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
[junit] at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
[junit] at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
[junit] at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:37)
[junit] at org.junit.runners.ParentRunner.run(ParentRunner.java:220)
[junit] at 
junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39)
[junit] at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:420)
[junit] at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:911)
[junit] at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:768)
[junit] 12/07/27 22:41:22 WARN datanode.FSDatasetAsyncDiskService: 
AsyncDiskService has already shut down.
[junit] 12/07/27 22:41:22 INFO mortbay.log: Stopped 
SelectChannelConnector@localhost:0
[junit] 12/07/27 22:41:22 INFO ipc.Server: Stopping server on 49020
[junit] 12/07/27 22:41:22 INFO ipc.Server: Stopping IPC Server listener on 
49020
[junit] 12/07/27 22:41:22 INFO ipc.Server: IPC Server handler 0 on 49020: 
exiting
[junit] 12/07/27 22:41:22 INFO ipc.Server: IPC Server handler 1 on 49020: 
exiting
[junit] 12/07/27 22:41:22 INFO ipc.Server: IPC Server handler 2 on 49020: 
exiting
[junit] 12/07/27 22:41:22 INFO metrics.RpcInstrumentation: shut down
[junit] 12/07/27 22:41:22 INFO ipc.Server: Stopping IPC Server Responder
[junit] 12/07/27 22:41:22 WARN datanode.DataNode: 
DatanodeRegistration(127.0.0.1:40423, 
storageID=DS-1583873024-67.195.138.20-40423-1343428357031, infoPort=34616, 
ipcPort=49020):DataXceiveServer:java.nio.channels.AsynchronousCloseException
[junit] at 
java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:185)
[junit] at 
sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:159)
[junit] at 
sun.nio.ch.ServerSocketAdaptor.accept(ServerSocketAdaptor.java:84)
[junit] at 
org.apache.hadoop.hdfs.server.datanode.DataXceiverServer.run(DataXceiverServer.java:131)
[junit] at java.lang.Thread.run(Thread.java:662)
[junit] 
[junit] 12/07/27 22:41:22 INFO datanode.DataNode: Waiting for threadgroup 
to exit, active threads is 1
[junit] 12/07/27 22:41:22 INFO datanode.DataNode: Exiting DataXceiveServer
[junit] 12/07/27 22:41:22 INFO datanode.DataBlockScanner: Exiting 
DataBlockScanner thread.
[junit] 12/07/27 22:41:23 INFO datanode.DataNode: Waiting for threadgroup 
to exit, active threads is 0
[junit] 12/07/27 22:41:23 INFO datanode.DataNode: 
DatanodeRegistration(127.0.0.1:40423, 
storageID=DS-1583873024-67.195.138.20-40423-1343428357031, infoPort=34616, 
ipcPort=49020):Finishing DataNode in: 
FSDataset{dirpath='

Re: Total count of RandomSampleLoader is unpredictable

2012-07-27 Thread Jie Li
Not sure if it's the same issue, but in some cases I also see the Map
input records counter exceed the actual number of input records.

Jie

On Thu, Jul 26, 2012 at 6:04 PM, Prasanth J  wrote:
> Hello everyone
>
> I am using RandomSampleLoader to load 1000 tuples per mapper. I have 11 map 
> jobs in a small dataset and 109 map jobs in a large dataset.
>
> I am expecting 11000 tuples from the small dataset and 109000 tuples from the 
> large dataset. But the actual number of tuples that I get is always more than 
> what I expected. In the small dataset case I am getting 15000 tuples, whereas in 
> the large dataset case I am getting 145000 (sometimes 15) tuples.
>
> Is this a bug, or is it expected behavior? If reservoir sampling is used 
> by all mappers, then why is the total number of samples higher?
>
> Thanks
> -- Prasanth
>
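
For reference, the expectation in the question follows from how a fixed-size reservoir behaves: a reservoir of capacity k returns min(k, n) items for an input of n items, so 11 mappers with k = 1000 would give at most 11 * 1000 = 11000. The sketch below is a generic Algorithm R written for this note, not RandomSampleLoader's code, and the class name is hypothetical. It does not explain where the extra tuples come from; it only shows the baseline being assumed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Generic reservoir sampling (Algorithm R); not Pig's RandomSampleLoader. */
final class ReservoirSketch {

    /** Keeps a uniform random sample of at most k items from the input. */
    static <T> List<T> sample(Iterable<T> input, int k, Random rng) {
        List<T> reservoir = new ArrayList<>(k);
        long seen = 0;
        for (T item : input) {
            seen++;
            if (reservoir.size() < k) {
                reservoir.add(item); // fill phase: keep the first k items
            } else {
                // Pick a slot in [0, seen); the item is kept only if the
                // slot falls inside the reservoir (probability k / seen).
                long slot = (long) (rng.nextDouble() * seen);
                if (slot < k) {
                    reservoir.set((int) slot, item);
                }
            }
        }
        return reservoir;
    }

    public static void main(String[] args) {
        Random rng = new Random(42);
        int total = 0;
        for (int mapper = 0; mapper < 11; mapper++) {   // 11 simulated mappers
            List<Integer> split = new ArrayList<>();
            for (int i = 0; i < 5000; i++) {
                split.add(i);
            }
            total += sample(split, 1000, rng).size();
        }
        System.out.println(total); // 11000
    }
}
```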


[jira] [Commented] (PIG-2839) mock.Storage overwrites output with the last relation written when storing UNION

2012-07-27 Thread Koji Noguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424167#comment-13424167
 ] 

Koji Noguchi commented on PIG-2839:
---

I'm seeing two unit tests failing that use mock.Storage.  Can this be related?

org.apache.pig.test.TestBuiltInBagToTupleOrString.testPigScriptrForBagToStringUDF
{noformat} 
junit.framework.AssertionFailedError: expected:<(a==b==c)> but 
was:<({(a),(b),(c)})>
at 
org.apache.pig.test.TestBuiltInBagToTupleOrString.testPigScriptrForBagToStringUDF(TestBuiltInBagToTupleOrString.java:507)
{noformat} 


org.apache.pig.test.TestBuiltInBagToTupleOrString.testPigScriptMultipleElmementsPerTupleForBagToStringUDF
 
{noformat} 
junit.framework.AssertionFailedError: expected:<(a^b^c^d^e^f)> but 
was:<({(a,b),(c,d),(e,f)})>
at 
org.apache.pig.test.TestBuiltInBagToTupleOrString.testPigScriptMultipleElmementsPerTupleForBagToStringUDF(TestBuiltInBagToTupleOrString.java:528)
{noformat} 







[jira] [Updated] (PIG-2847) Error defining macro within pig script when using PigUnit

2012-07-27 Thread Matthew Hayes (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthew Hayes updated PIG-2847:
---

Description: 
I'm using PigUnit to test a pig script within which a macro is defined.  When I 
run it I get the error below.

   [testng] org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1000: 
Error during parsing. Can not create a Path from a null string
   [testng] at 
org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1595)
   [testng] at 
org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1534)
   [testng] at org.apache.pig.PigServer.registerQuery(PigServer.java:516)
   [testng] at 
org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:990)
   [testng] at 
org.apache.pig.pigunit.pig.GruntParser.processPig(GruntParser.java:61)
   [testng] at 
org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:412)
   [testng] at 
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:193)
   [testng] at 
org.apache.pig.pigunit.pig.PigServer.registerScript(PigServer.java:56)
   [testng] at 
org.apache.pig.pigunit.PigTest.registerScript(PigTest.java:160)
   [testng] at org.apache.pig.pigunit.PigTest.runScript(PigTest.java:170)
   [testng] at 
datafu.test.pig.macros.MacrosTests.macrosTest(MacrosTests.java:32)
   [testng] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   [testng] at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   [testng] at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   [testng] at java.lang.reflect.Method.invoke(Method.java:597)
   [testng] at 
org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:80)
   [testng] at org.testng.internal.Invoker.invokeMethod(Invoker.java:691)
   [testng] at 
org.testng.internal.Invoker.invokeTestMethod(Invoker.java:883)
   [testng] at 
org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1208)
   [testng] at 
org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:127)
   [testng] at 
org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:111)
   [testng] at org.testng.TestRunner.privateRun(TestRunner.java:754)
   [testng] at org.testng.TestRunner.run(TestRunner.java:614)
   [testng] at org.testng.SuiteRunner.runTest(SuiteRunner.java:335)
   [testng] at org.testng.SuiteRunner.runSequentially(SuiteRunner.java:330)
   [testng] at org.testng.SuiteRunner.privateRun(SuiteRunner.java:292)
   [testng] at org.testng.SuiteRunner.run(SuiteRunner.java:241)
   [testng] at 
org.testng.SuiteRunnerWorker.runSuite(SuiteRunnerWorker.java:52)
   [testng] at org.testng.SuiteRunnerWorker.run(SuiteRunnerWorker.java:86)
   [testng] at org.testng.TestNG.runSuitesSequentially(TestNG.java:1169)
   [testng] at org.testng.TestNG.runSuitesLocally(TestNG.java:1094)
   [testng] at org.testng.TestNG.run(TestNG.java:1006)
   [testng] at org.testng.TestNG.privateMain(TestNG.java:1316)
   [testng] at org.testng.TestNG.main(TestNG.java:1280)
   [testng] Caused by: java.lang.IllegalArgumentException: Can not create a 
Path from a null string
   [testng] at org.apache.hadoop.fs.Path.checkPathArg(Path.java:78)
   [testng] at org.apache.hadoop.fs.Path.(Path.java:90)
   [testng] at 
org.apache.pig.impl.io.FileLocalizer.fetchFilesInternal(FileLocalizer.java:766)
   [testng] at 
org.apache.pig.impl.io.FileLocalizer.fetchFile(FileLocalizer.java:733)
   [testng] at 
org.apache.pig.parser.QueryParserDriver.getMacroFile(QueryParserDriver.java:350)
   [testng] at 
org.apache.pig.parser.QueryParserDriver.makeMacroDef(QueryParserDriver.java:411)
   [testng] at 
org.apache.pig.parser.QueryParserDriver.expandMacro(QueryParserDriver.java:268)
   [testng] at 
org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:169)
   [testng] at 
org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1587)
   [testng] ... 33 more

The pig script below generates this error:

{code}
register $JAR_PATH

DEFINE row_count(data) returns count {
    grouped = GROUP $data ALL;
    $count = FOREACH grouped GENERATE COUNT_STAR($data);
};

data = LOAD 'input' AS (key:INT);
data2 = row_count(data);

STORE data2 INTO 'output';
{code}

However the pig script below, where I've expanded the macro manually, does not 
have the error and passes:

{code}
register $JAR_PATH

data = LOAD 'input' AS (key:INT);
grouped = GROUP data ALL;
data2 = FOREACH grouped GENERATE COUNT(data);

STORE data2 INTO 'output';
{code}



[jira] [Commented] (PIG-2847) Error defining macro within pig script when using PigUnit

2012-07-27 Thread Matthew Hayes (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424160#comment-13424160
 ] 

Matthew Hayes commented on PIG-2847:


I added some logging to {{makeMacroDef}} in {{QueryParserDriver.java}}.  In the 
line below {{fname}} gets set to {{null}}.  Then {{getMacroFile}} is called 
with this {{null}} value.

String fname = ((PigParserNode)t).getFileName();
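
To make the failure mode concrete, here is a minimal sketch of the null value reaching a path check, plus one possible guard. Everything in it is hypothetical: requirePath is a local stand-in that only mimics the message seen in the trace (it is not Hadoop's Path), and macroFile is not Pig's getMacroFile. Whether skipping the lookup is the right fix for this issue is not established here.

```java
import java.util.Optional;

/** Hypothetical illustration of the null file name problem; not Pig code. */
final class MacroFileGuardSketch {

    /** Local stand-in that mimics the failure in the trace; not Hadoop's Path. */
    static String requirePath(String s) {
        if (s == null) {
            throw new IllegalArgumentException("Can not create a Path from a null string");
        }
        return s;
    }

    /** Assumed guard: only resolve a file when a file name is present. */
    static Optional<String> macroFile(String fname) {
        // Optional.map is skipped for an empty Optional, so requirePath
        // never sees the null value.
        return Optional.ofNullable(fname).map(MacroFileGuardSketch::requirePath);
    }

    public static void main(String[] args) {
        System.out.println(macroFile(null).isPresent());   // false
        System.out.println(macroFile("macros.pig").get()); // macros.pig
    }
}
```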

> Error defining macro within pig script when using PigUnit
> -
>
> Key: PIG-2847
> URL: https://issues.apache.org/jira/browse/PIG-2847
> Project: Pig
>  Issue Type: Bug
>  Components: parser
>Affects Versions: 0.11
>Reporter: Matthew Hayes
>

[jira] [Updated] (PIG-2847) Error defining macro within pig script when using PigUnit

2012-07-27 Thread Matthew Hayes (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthew Hayes updated PIG-2847:
---

  Component/s: parser
Affects Version/s: 0.11






[jira] [Created] (PIG-2847) Error defining macro within pig script when using PigUnit

2012-07-27 Thread Matthew Hayes (JIRA)
Matthew Hayes created PIG-2847:
--

 Summary: Error defining macro within pig script when using PigUnit
 Key: PIG-2847
 URL: https://issues.apache.org/jira/browse/PIG-2847
 Project: Pig
  Issue Type: Bug
Reporter: Matthew Hayes


I'm using PigUnit to test a pig script within which a macro is defined.  When I 
run it I get the error below.

   [testng] org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1000: Error during parsing. Can not create a Path from a null string
   [testng] at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1595)
   [testng] at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1534)
   [testng] at org.apache.pig.PigServer.registerQuery(PigServer.java:516)
   [testng] at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:990)
   [testng] at org.apache.pig.pigunit.pig.GruntParser.processPig(GruntParser.java:61)
   [testng] at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:412)
   [testng] at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:193)
   [testng] at org.apache.pig.pigunit.pig.PigServer.registerScript(PigServer.java:56)
   [testng] at org.apache.pig.pigunit.PigTest.registerScript(PigTest.java:160)
   [testng] at org.apache.pig.pigunit.PigTest.runScript(PigTest.java:170)
   [testng] at datafu.test.pig.macros.MacrosTests.macrosTest(MacrosTests.java:32)
   [testng] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   [testng] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   [testng] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   [testng] at java.lang.reflect.Method.invoke(Method.java:597)
   [testng] at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:80)
   [testng] at org.testng.internal.Invoker.invokeMethod(Invoker.java:691)
   [testng] at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:883)
   [testng] at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1208)
   [testng] at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:127)
   [testng] at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:111)
   [testng] at org.testng.TestRunner.privateRun(TestRunner.java:754)
   [testng] at org.testng.TestRunner.run(TestRunner.java:614)
   [testng] at org.testng.SuiteRunner.runTest(SuiteRunner.java:335)
   [testng] at org.testng.SuiteRunner.runSequentially(SuiteRunner.java:330)
   [testng] at org.testng.SuiteRunner.privateRun(SuiteRunner.java:292)
   [testng] at org.testng.SuiteRunner.run(SuiteRunner.java:241)
   [testng] at org.testng.SuiteRunnerWorker.runSuite(SuiteRunnerWorker.java:52)
   [testng] at org.testng.SuiteRunnerWorker.run(SuiteRunnerWorker.java:86)
   [testng] at org.testng.TestNG.runSuitesSequentially(TestNG.java:1169)
   [testng] at org.testng.TestNG.runSuitesLocally(TestNG.java:1094)
   [testng] at org.testng.TestNG.run(TestNG.java:1006)
   [testng] at org.testng.TestNG.privateMain(TestNG.java:1316)
   [testng] at org.testng.TestNG.main(TestNG.java:1280)
   [testng] Caused by: java.lang.IllegalArgumentException: Can not create a Path from a null string
   [testng] at org.apache.hadoop.fs.Path.checkPathArg(Path.java:78)
   [testng] at org.apache.hadoop.fs.Path.<init>(Path.java:90)
   [testng] at org.apache.pig.impl.io.FileLocalizer.fetchFilesInternal(FileLocalizer.java:766)
   [testng] at org.apache.pig.impl.io.FileLocalizer.fetchFile(FileLocalizer.java:733)
   [testng] at org.apache.pig.parser.QueryParserDriver.getMacroFile(QueryParserDriver.java:350)
   [testng] at org.apache.pig.parser.QueryParserDriver.makeMacroDef(QueryParserDriver.java:411)
   [testng] at org.apache.pig.parser.QueryParserDriver.expandMacro(QueryParserDriver.java:268)
   [testng] at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:169)
   [testng] at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1587)
   [testng] ... 33 more





[jira] [Resolved] (PIG-2789) NoClassDefFoundError after upgrading to pig 0.10.0 from 0.9.0

2012-07-27 Thread Matthew Hayes (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthew Hayes resolved PIG-2789.


Resolution: Duplicate

Dup of PIG-2785

> NoClassDefFoundError after upgrading to pig 0.10.0 from 0.9.0
> -
>
> Key: PIG-2789
> URL: https://issues.apache.org/jira/browse/PIG-2789
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.10.0
>Reporter: Matthew Hayes
>
> It appears that the versions in the pom for pig 0.10.0 are inconsistent with 
> the versions specified in the ivy file used to build pig.  I am building a 
> separate project, and I am getting pig and its dependencies using ivy.  
> Looking in ivy.xml in the pig 0.10.0 release:
>conf="compile->default;checkstyle->master"/>
> ...
>  rev="${jackson.version}"
>   conf="compile->master"/>
>  rev="${jackson.version}"
>   conf="compile->master"/>
> Here avro.version=1.5.3 and jackson.version=1.7.3.
> However, in the pom.xml for pig 0.10.0:
>   <groupId>org.apache.hadoop</groupId>
>   <artifactId>avro</artifactId>
>   <version>1.3.2</version>
> And when I look up the pom for org.apache.hadoop's avro 1.3.2 in the central 
> repository, I see a version of jackson inconsistent with what pig was 
> compiled with:
> <dependency>
>   <groupId>org.codehaus.jackson</groupId>
>   <artifactId>jackson-mapper-asl</artifactId>
>   <version>1.4.2</version>
>   <scope>compile</scope>
> </dependency>
> It's 1.4.2, not 1.7.3. 
> Below is my ivy.xml.  It's the same as what I used for 0.9.0 but I changed 
> the pig version to 0.10.0.
> 
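The mismatch described in this report can be checked mechanically. The following is a minimal sketch, assuming the version numbers quoted above are simply typed in by hand; the function name `find_mismatches` and the dictionary layout are illustrative and not part of Pig, Ivy or Maven.

```python
# Versions quoted in the report above.
ivy_versions = {"avro": "1.5.3", "jackson": "1.7.3"}  # properties used to build Pig 0.10.0
pom_versions = {"avro": "1.3.2", "jackson": "1.4.2"}  # published pom; jackson comes in via avro 1.3.2


def find_mismatches(build, published):
    """Return {library: (build_version, published_version)} where the two differ."""
    return {
        name: (build[name], published[name])
        for name in sorted(build.keys() & published.keys())
        if build[name] != published[name]
    }


mismatches = find_mismatches(ivy_versions, pom_versions)
for name, (built, resolved) in mismatches.items():
    print(f"{name}: built against {built}, resolved as {resolved}")
```

A project that resolves Pig through the pom would pick up the right-hand versions, which is consistent with the NoClassDefFoundError in the issue title.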





[jira] [Commented] (PIG-2829) Use partial aggregation more aggressively

2012-07-27 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424140#comment-13424140
 ] 

Thejas M Nair commented on PIG-2829:


Thanks for the benchmark Jie. Clearly, partial-agg is working better than 
combiner. 
Can you also run some benchmarks with combiner turned off, so that we can 
verify the appropriate value for pig.exec.mapPartAgg.minReduction - 

||query || combiner off, partial-agg off || combiner off, partial-agg on ||
|g-by with reduction by 3 | | |
|g-by with reduction by 2| | |


> Use partial aggregation more aggressively
> 
>
> Key: PIG-2829
> URL: https://issues.apache.org/jira/browse/PIG-2829
> Project: Pig
>  Issue Type: Improvement
>Affects Versions: 0.10.0
>Reporter: Jie Li
> Attachments: 2829.1.patch, 2829.2.patch, 2829.separate.options.patch, 
> pigmix-10G.png, tpch-10G.png
>
>
> Partial aggregation (Hash Aggregation, aka in-map combiner) is a new feature 
> in Pig 0.10 that performs aggregation within the map function. The main 
> advantage over the combiner is that it avoids de/serializing and sorting the 
> data, and it can automatically disable itself if the data reduction rate is 
> low. Currently it's disabled by default.
> To leverage the power of PartialAgg more aggressively, several things need to 
> be revisited:
> 1. The threshold for auto-disabling. Currently each mapper looks at the first 
> 1k (hard-coded) records to see if there's enough data size reduction (defaults 
> to 10x, configurable). The check happens earlier if the hash table gets 
> full before the 1k records are processed (hash table size is controlled by 
> pig.cachedbag.memusage). We might want to relax these thresholds.
> 2. Dependency on the combiner. Currently PartialAgg won't work without a 
> combiner following it, so we need to provide separate options to enable each 
> independently. 
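The auto-disabling behaviour described in point 1 can be illustrated with a toy model. This is a sketch only and not Pig's implementation: the class name, the `max_entries` stand-in for the memory-bound hash table, and the use of a record-count ratio in place of data size are all assumptions made for the example.

```python
from collections import defaultdict


class PartialAggregator:
    """Toy in-map SUM aggregation that turns itself off when it is not paying off."""

    def __init__(self, check_interval=1000, min_reduction=10.0, max_entries=100_000):
        self.check_interval = check_interval  # records to observe before the check
        self.min_reduction = min_reduction    # required records-in / records-out ratio
        self.max_entries = max_entries        # hypothetical stand-in for the hash table limit
        self.table = defaultdict(int)
        self.seen = 0
        self.checked = False
        self.enabled = True

    def add(self, key, value):
        """Consume one record and return the records to emit downstream right now."""
        if not self.enabled:
            return [(key, value)]             # pass-through once disabled
        self.table[key] += value
        self.seen += 1
        table_full = len(self.table) >= self.max_entries
        if not self.checked and (self.seen >= self.check_interval or table_full):
            self.checked = True
            if self.seen / len(self.table) < self.min_reduction:
                self.enabled = False          # too little reduction: stop aggregating
                return self.flush()
        if table_full:
            return self.flush()               # make room and keep aggregating
        return []

    def flush(self):
        """Emit and clear everything aggregated so far."""
        out = list(self.table.items())
        self.table.clear()
        return out


agg = PartialAggregator(check_interval=4, min_reduction=2.0)
emitted = []
for key in ["a", "b", "c", "d", "e"]:  # all keys distinct, so there is no reduction
    emitted.extend(agg.add(key, 1))
print(agg.enabled, emitted)
```

Lowering `min_reduction` or raising `check_interval` in this model corresponds to the "relax these thresholds" idea in the description: aggregation stays on for workloads with only modest reduction.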





[jira] [Updated] (PIG-2829) Use partial aggregation more aggressively

2012-07-27 Thread Jie Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Li updated PIG-2829:


Attachment: 2829.2.patch

Updated the patch with unit test fixes and new unit tests verifying default 
configurations.

Below are the benchmark results on a 4-slave cluster with 100GB of TPC-H data. 
Query 1 and some synthetic queries are used. Each query uses 300 map tasks and 
79 reduce tasks, and each map task processes 2 million records:

|| query || trunk || patch || comment ||
| TPCH Q1 | 58 min | 34 min | Q1's group-by has four different keys and eight aggregations. |
| S-600x | 35 min | 30 min | The reduction rate of output/input records is 600. |
| S-4x | 31 min | 21 min | The reduction rate of output/input records is 4. |
| S-1x | 59 min | 44 min | The reduction rate of output/input records is 1. Every group-by key is different. |
| S-high memory | map task 5 min ~ 6 min | map task 2 min ~ 3 min | Reduction rate is 1 (no reduction). 16 aggregations in the same group. |

We can see that the performance of the new default settings in this patch is 
always better than that of the old default settings in trunk.
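To put numbers on the comparison above, the speedup per query can be computed from the table. This is a small sketch using only the four rows that report whole-job times; the S-high memory row is left out because it gives per-map-task ranges rather than a single job time.

```python
# Job running times in minutes, copied from the benchmark table: (trunk, patch).
runs = {
    "TPCH Q1": (58, 34),
    "S-600x": (35, 30),
    "S-4x": (31, 21),
    "S-1x": (59, 44),
}

# Speedup = old time / new time; values above 1.0 mean the patch is faster.
speedups = {query: trunk / patch for query, (trunk, patch) in runs.items()}

for query, factor in speedups.items():
    print(f"{query}: {factor:.2f}x faster with the patch")
```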

Also tested the latency of disabling MapAgg using the query S-1x (no 
reduction). There's almost no difference:
|| pig.exec.mapPartAgg.reduction.checkinterval ||  job running time ||
| 1000 | 43 min 54 sec |
| 10 | 43 min 46 sec |








[jira] [Resolved] (PIG-2841) Inconsistent URL in Docs

2012-07-27 Thread Bill Graham (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Graham resolved PIG-2841.
--

Resolution: Fixed

Committed, thanks Eric!

> Inconsistent URL in Docs
> 
>
> Key: PIG-2841
> URL: https://issues.apache.org/jira/browse/PIG-2841
> Project: Pig
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 0.10.0
>Reporter: Eric Spishak
>Assignee: Eric Spishak
> Attachments: PIG-2841-0.patch, PIG-2841-1.patch
>
>
> There are inconsistent links to "cont.html#Parameter-Sub" throughout the 
> documentation. For some "Parameter-Sub" is all lowercase, some have it with 
> the case shown here. 
> At least for Chrome, this results in a broken link, where the browser won't 
> scroll to the correct section in the page.
> The attached patch updates all to use the "Parameter-Sub" casing.
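A check like the one this patch performs by hand can be sketched as a case-insensitive search that reports every link whose casing differs from the chosen one. The function name and the sample HTML are made up for illustration; only the link text "cont.html#Parameter-Sub" comes from the issue.

```python
import re

CANONICAL = "cont.html#Parameter-Sub"  # casing the patch standardizes on


def find_miscased_links(text, canonical=CANONICAL):
    """Return every occurrence of the link whose casing differs from `canonical`."""
    pattern = re.compile(re.escape(canonical), re.IGNORECASE)
    return [m.group(0) for m in pattern.finditer(text) if m.group(0) != canonical]


sample = (
    '<a href="cont.html#Parameter-Sub">ok</a> '
    '<a href="cont.html#parameter-sub">does not scroll in Chrome</a>'
)
print(find_miscased_links(sample))
```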





[jira] [Commented] (PIG-2779) Refactoring the code for setting number of reducers

2012-07-27 Thread Bill Graham (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424117#comment-13424117
 ] 

Bill Graham commented on PIG-2779:
--

Re #1, for later analysis it would be useful to know the estimated parallelism 
only if estimation in fact kicked in because it was needed. If it didn't 
kick in, that's also useful to know. Hence I think we should move that call 
into the {{else}} block.

Agreed that {{requestedParallel}} is used all over and we shouldn't mess with 
that now. Let's instead set another field when {{PARALLEL}} is passed. 
{{pig.info.reducers.keyword.parallel}}? That said, do you still see value in 
setting {{pig.info.reducers.requested.parallel}}, since it could be used for so 
many different things?

> Refactoring the code for setting number of reducers
> ---
>
> Key: PIG-2779
> URL: https://issues.apache.org/jira/browse/PIG-2779
> Project: Pig
>  Issue Type: Bug
>Reporter: Jie Li
>Assignee: Jie Li
> Fix For: 0.11
>
> Attachments: PIG-2779.0.patch, PIG-2779.1.patch, PIG-2779.2.patch, 
> PIG-2779.3.patch, TestNumberOfReducers.java, TestNumberOfReducers.java
>
>
> As PIG-2652 observed, currently the code for setting the number of reducers is 
> a little messy. MapReduceOper.requestedParallelism seems to be misused in some 
> places, and now we support runtime estimation of #reducer which further 
> complicates the problem.
> For example, if we specify parallel 1 for the order-by, the estimated 
> #reducer will be used. If we specify parallel 2 while it estimates 4, 
> order-by will fail due to "Illegal partition for Null". If we specify 
> parallel 4 while it estimates 2, then some reducers will have nothing to do. 





[jira] [Assigned] (PIG-2841) Inconsistent URL in Docs

2012-07-27 Thread Bill Graham (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Graham reassigned PIG-2841:


Assignee: Eric Spishak






[jira] [Updated] (PIG-2841) Inconsistent URL in Docs

2012-07-27 Thread Eric Spishak (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Spishak updated PIG-2841:
--

Attachment: PIG-2841-1.patch

Updated the patch to correctly change files in the Pig repo.

> Inconsistent URL in Docs
> 
>
> Key: PIG-2841
> URL: https://issues.apache.org/jira/browse/PIG-2841
> Project: Pig
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 0.10.0
>Reporter: Eric Spishak
> Attachments: PIG-2841-0.patch, PIG-2841-1.patch
>
>
> There are inconsistent links to "cont.html#Parameter-Sub" throughout the 
> documentation. For some "Parameter-Sub" is all lowercase, some have it with 
> the case shown here. 
> At least for Chrome, this results in a broken link, where the browser won't 
> scroll to the correct section in the page.
> The attached patch updates all to use the "Parameter-Sub" casing.





[jira] [Resolved] (PIG-2843) Typo in Documentation

2012-07-27 Thread Bill Graham (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Graham resolved PIG-2843.
--

Resolution: Fixed

Committed, thanks Eric!

> Typo in Documentation
> -
>
> Key: PIG-2843
> URL: https://issues.apache.org/jira/browse/PIG-2843
> Project: Pig
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 0.10.0
>Reporter: Eric Spishak
>Assignee: Eric Spishak
> Attachments: PIG-2843-0.patch, PIG-2843-1.patch
>
>
> There's a small typo in start.html (missing a space). The attached patch 
> fixes the issue.
> This same typo is in start.pdf as well, but I'm unsure how to update that 
> file. If someone can point me to directions I'll gladly add that to the patch.





[jira] [Updated] (PIG-2843) Typo in Documentation

2012-07-27 Thread Bill Graham (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Graham updated PIG-2843:
-

Attachment: PIG-2843-1.patch

Thanks for the fix Eric!

This change actually needs to be made in the pig repo in 
{{src/docs/src/documentation/content/xdocs/start.xml}} (i.e., the user docs). 
These files get sourced by the site docs in {{pig/site}} at build time to 
generate the html file you submitted. See my revised patch.

Will commit shortly. 






[jira] [Assigned] (PIG-2843) Typo in Documentation

2012-07-27 Thread Bill Graham (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Graham reassigned PIG-2843:


Assignee: Eric Spishak

> Typo in Documentation
> -
>
> Key: PIG-2843
> URL: https://issues.apache.org/jira/browse/PIG-2843
> Project: Pig
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 0.10.0
>Reporter: Eric Spishak
>Assignee: Eric Spishak
> Attachments: PIG-2843-0.patch
>
>
> There's a small typo in start.html (missing a space). The attached patch 
> fixes the issue.
> This same typo is in start.pdf as well, but I'm unsure how to update that 
> file. If someone can point me to directions I'll gladly add that to the patch.





[jira] [Commented] (PIG-2779) Refactoring the code for setting number of reducers

2012-07-27 Thread Jie Li (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424105#comment-13424105
 ] 

Jie Li commented on PIG-2779:
-

Hi Bill, regarding the first comment, I agree we can avoid the estimation, and 
that should have been adjusted earlier. But would you need the estimate for 
later analysis in any way (e.g. comparing the estimated #reducer with the 
PARALLEL keyword)? If not, I'll fix it then.

Regarding the second comment, yes, requestedParallel has been used for multiple 
purposes for a long time and is adjusted across the logical/physical/MR phases, 
and is also set as the final number of reducers. Many unit tests also assume it 
is the final number of reducers. Could cleaning it up be another ticket? Or 
maybe an easier approach is to add another field that just records the PARALLEL 
keyword?






[jira] [Commented] (PIG-2779) Refactoring the code for setting number of reducers

2012-07-27 Thread Bill Graham (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424070#comment-13424070
 ] 

Bill Graham commented on PIG-2779:
--

Thanks Jie, I've been running the test suite on our CI server and so far things 
look good. We're close. A few more comments though:

- In {{JobControlCompiler}} we should only call {{estimateNumberOfReducers}} if 
we need to, since we don't need it if {{requestedParallelism}} or 
{{defaultParallel}} will govern and the call could be expensive. So we'd have 
this:
{noformat}
} else {
  mro.estimatedParallelism = estimateNumberOfReducers(conf, lds, nwJob);
  jobParallelism = mro.estimatedParallelism;
}
{noformat}

- The semantics of {{pig.info.reducers.requested.parallel}} are a bit misleading 
as implemented. I would expect it to be the value set via the {{PARALLEL}} 
statement, but that's not the case, since {{requestedParallel}} gets set to 
{{jobParallelism}} on line 796 of JCC. Would you please add to the tests in 
{{TestJobSubmission}} that each of the {{pig.info.reducers.*}} fields is set 
as expected (or not set) after each of the scenarios? I suspect there are cases 
where {{pig.info.reducers.requested.parallel}} is being set when {{PARALLEL}} 
isn't used.
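The first point above, estimating only when nothing else governs, can be sketched outside of Pig as follows. This is an illustration and not the JobControlCompiler code: the function name, the use of -1 for "not set", and the callable standing in for estimateNumberOfReducers are assumptions made for the example.

```python
def choose_parallelism(requested, default, estimate):
    """Pick the reducer count, calling `estimate()` only as a last resort.

    requested: value of the PARALLEL keyword, or -1 if absent (assumed convention)
    default:   script-wide default parallel, or -1 if absent (assumed convention)
    estimate:  zero-argument callable standing in for estimateNumberOfReducers(...)
    Returns (job_parallelism, estimated_parallelism or None).
    """
    if requested > 0:
        return requested, None
    if default > 0:
        return default, None
    estimated = estimate()  # potentially expensive, so it is only done here
    return estimated, estimated


calls = []


def fake_estimate():
    """Hypothetical estimator that records each invocation."""
    calls.append("estimate")
    return 4


print(choose_parallelism(2, -1, fake_estimate), len(calls))
print(choose_parallelism(-1, -1, fake_estimate), len(calls))
```

Returning the estimate separately (or None) also keeps it clear, for later analysis, whether estimation actually kicked in.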







[jira] [Commented] (PIG-2791) Pig does not work with Namenode Federation

2012-07-27 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424058#comment-13424058
 ] 

Daniel Dai commented on PIG-2791:
-

Committed PIG-2791-4-branch10.patch to the 0.10 branch. Will check into trunk 
once trunk tests pass.

> Pig does not work with Namenode Federation
> --
>
> Key: PIG-2791
> URL: https://issues.apache.org/jira/browse/PIG-2791
> Project: Pig
>  Issue Type: Bug
>  Components: grunt
>Affects Versions: 0.10.0
> Environment: Pig QE
>Reporter: patrick white
>Assignee: Rohini Palaniswamy
>Priority: Blocker
> Attachments: PIG-2791-0.patch, PIG-2791-1.patch, PIG-2791-2.patch, 
> PIG-2791-3-branch10.patch, PIG-2791-3-trunk.patch, PIG-2791-4-branch10.patch, 
> PIG-2791-4-trunk.patch, asf_test_notes.txt
>
>
> The Yahoo Pig QE team ran into a blocking issue when trying to test 
> Client-Side Mount Tables on a Federated cluster with two NNs; this blocks 
> Pig testing on Federation. 
> Federation relies strongly on the use of CSMT with viewFS. QE found that in 
> this configuration it is not possible to enter the grunt shell because Pig 
> makes a call to getDefaultReplication() on the fs, which is ambiguous over 
> viewFS and causes core to throw a 
> org.apache.hadoop.fs.viewfs.NotInMountpointException: "getDefaultReplication 
> on empty path is invalid".
> This in turn causes Pig to exit with an internal error as follows:
> 2012-07-06 22:20:25,657 [main] INFO  org.apache.pig.Main - Apache Pig version 0.10.1.0.1206081058 (r1348169) compiled Jun 08 2012, 17:58:42
> 2012-07-06 22:20:26,074 [main] WARN  org.apache.hadoop.conf.Configuration - mapred.used.genericoptionsparser is deprecated. Instead, use mapreduce.client.genericoptionsparser.used
> 2012-07-06 22:20:26,076 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: viewfs:///
> 2012-07-06 22:20:26,080 [main] WARN  org.apache.hadoop.conf.Configuration - fs.default.name is deprecated. Instead, use fs.defaultFS
> 2012-07-06 22:20:26,522 [main] ERROR org.apache.pig.Main - ERROR 2999: Unexpected internal error. getDefaultReplication on empty path is invalid
> 2012-07-06 22:20:26,522 [main] WARN  org.apache.pig.Main - There is no log file to write to.
> 2012-07-06 22:20:26,522 [main] ERROR org.apache.pig.Main - org.apache.hadoop.fs.viewfs.NotInMountpointException: getDefaultReplication on empty path is invalid
> at org.apache.hadoop.fs.viewfs.ViewFileSystem.getDefaultReplication(ViewFileSystem.java:482)
> at org.apache.pig.backend.hadoop.datastorage.HDataStorage.init(HDataStorage.java:77)
> at org.apache.pig.backend.hadoop.datastorage.HDataStorage.<init>(HDataStorage.java:58)
> at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:205)
> at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:118)
> at org.apache.pig.impl.PigContext.connect(PigContext.java:208)
> at org.apache.pig.PigServer.<init>(PigServer.java:246)
> at org.apache.pig.PigServer.<init>(PigServer.java:231)
> at org.apache.pig.tools.grunt.Grunt.<init>(Grunt.java:47)
> at org.apache.pig.Main.run(Main.java:487)
> at org.apache.pig.Main.main(Main.java:111)
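Why a path-less getDefaultReplication() is ambiguous over a mount table can be shown with a toy model. The class and method names below are hypothetical and do not reproduce the Hadoop API; the sketch only assumes that each mount point is backed by a namenode with its own default replication.

```python
class NotInMountpointError(Exception):
    """Raised when a question cannot be routed to a single mounted namespace."""


class ToyViewFs:
    def __init__(self, mounts):
        # mounts: {mount point: default replication of the namenode behind it}
        self.mounts = mounts

    def default_replication(self, path=None):
        if not path:
            # No path means no mount point, so there is no single right answer.
            raise NotInMountpointError("getDefaultReplication on empty path is invalid")
        # The longest matching mount point wins.
        for prefix in sorted(self.mounts, key=len, reverse=True):
            if path == prefix or path.startswith(prefix.rstrip("/") + "/"):
                return self.mounts[prefix]
        raise NotInMountpointError(f"{path} is not under any mount point")


fs = ToyViewFs({"/user": 3, "/tmp": 2})  # two namenodes with different defaults
print(fs.default_replication("/user/alice/data"))
try:
    fs.default_replication()
except NotInMountpointError as err:
    print("error:", err)
```

In this model the question only becomes answerable once a path selects one of the mounted namespaces.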





[jira] [Updated] (PIG-2791) Pig does not work with Namenode Federation

2012-07-27 Thread Rohini Palaniswamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohini Palaniswamy updated PIG-2791:


Attachment: PIG-2791-4-trunk.patch
PIG-2791-4-branch10.patch

Just made a minor change to the patch so that all the tests also pass if the 
ivy dependency is changed to 0.23.3. Changed Util.Hadoop2_0() to 
(Util.isHadoop23 || Util.isHadoop2_0()) for the fs copy command.

 






[jira] [Updated] (PIG-2846) Can we skip hcat related e2e when hcat is not installed?

2012-07-27 Thread Koji Noguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Noguchi updated PIG-2846:
--

Attachment: pig-2846-trunk-v1.txt

Adding a flag 'ifhcat_exists' for e2e testcases that only work when hcat exists.
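The idea behind the flag can be sketched as follows. This is not the e2e harness code, which is not reproduced here; the function names and the dictionary layout of a test case are assumptions, and only the flag name, the hcat path and the three failing test names come from this issue.

```python
import os

HCAT_BIN = "/usr/local/hcat/bin/hcat"  # path taken from the error message in this issue


def hcat_exists(path=HCAT_BIN):
    """True if an executable hcat launcher is present at `path`."""
    return os.path.isfile(path) and os.access(path, os.X_OK)


def select_tests(tests, hcat_available):
    """Split tests into (to_run, skipped); 'ifhcat_exists' tests are skipped without hcat."""
    to_run, skipped = [], []
    for test in tests:
        if test.get("ifhcat_exists") and not hcat_available:
            skipped.append(test["name"])
        else:
            to_run.append(test["name"])
    return to_run, skipped


tests = [
    {"name": "HCatDDL_1", "ifhcat_exists": True},
    {"name": "HCatDDL_2", "ifhcat_exists": True},
    {"name": "Jython_Command_1", "ifhcat_exists": True},
    {"name": "Some_Other_Test"},  # hypothetical test with no hcat dependency
]
to_run, skipped = select_tests(tests, hcat_exists())
print("run:", to_run, "skipped:", skipped)
```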

> Can we skip hcat related e2e when hcat is not installed?
> 
>
> Key: PIG-2846
> URL: https://issues.apache.org/jira/browse/PIG-2846
> Project: Pig
>  Issue Type: Test
>Reporter: Koji Noguchi
>Priority: Trivial
> Attachments: pig-2846-trunk-v1.txt
>
>
> Trying pig e2e for the first time, I see a couple of the tests 
> (HCatDDL_1, HCatDDL_2 and Jython_Command_1) failing with 
> bq. java.io.IOException: Cannot run program /usr/local/hcat/bin/hcat:
> bq. java.io.IOException: error=2, No such file or directory
> Is it ok to change the test_harness to skip these tests when hcat does not 
> exist?





[jira] [Created] (PIG-2846) Can we skip hcat related e2e when hcat is not installed?

2012-07-27 Thread Koji Noguchi (JIRA)
Koji Noguchi created PIG-2846:
-

 Summary: Can we skip hcat related e2e when hcat is not installed?
 Key: PIG-2846
 URL: https://issues.apache.org/jira/browse/PIG-2846
 Project: Pig
  Issue Type: Test
Reporter: Koji Noguchi
Priority: Trivial


Trying pig e2e for the first time, I see a couple of the tests 
(HCatDDL_1, HCatDDL_2 and Jython_Command_1) failing with 

bq. java.io.IOException: Cannot run program /usr/local/hcat/bin/hcat:
bq. java.io.IOException: error=2, No such file or directory

Is it ok to change the test_harness to skip these tests when hcat does not 
exist?






[jira] [Commented] (PIG-2279) Error when using PigUnit on a script that uses IMPORT another script with macros

2012-07-27 Thread Leonardo Brambilla (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13423895#comment-13423895
 ] 

Leonardo Brambilla commented on PIG-2279:
-

Hello Mark, do you have the patch to fix this issue? It would be great if you 
could post it here or send a link to it, for instance on GitHub.

Thanks in advance.

> Error when using PigUnit on a script that uses IMPORT another script with 
> macros
> 
>
> Key: PIG-2279
> URL: https://issues.apache.org/jira/browse/PIG-2279
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.9.0
>Reporter: Mark Roddy
> Attachments: duplication-pigunit-parser-exception-HACK.patch, 
> test-macro.zip
>
>
> Executing PigUnit against a script which uses the import command always fails 
> with error:
> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1000: Error during 
> parsing.  : Duplicated import file 'somemacro.pig'
> even though the script being tested does not perform an import of the same 
> script twice.  
> I've tried with a couple of different scripts/tests and it appears that 
> PigUnit fails on any test of a pig script where an import command is issued.
