[jira] Subscription: PIG patch available

2015-01-16 Thread jira
Issue Subscription
Filter: PIG patch available (25 issues)

Subscriber: pigdaily

Key Summary
PIG-4385 testDefaultBootup fails because it cannot find "pig.properties"
https://issues.apache.org/jira/browse/PIG-4385
PIG-4381 PIG grunt shell DEFINE commands fails when it spans multiple lines
https://issues.apache.org/jira/browse/PIG-4381
PIG-4366 Port local mode tests to Tez - part5
https://issues.apache.org/jira/browse/PIG-4366
PIG-4362 Make ship work with spark
https://issues.apache.org/jira/browse/PIG-4362
PIG-4359 Port local mode tests to Tez - part4
https://issues.apache.org/jira/browse/PIG-4359
PIG-4352 Port local mode tests to Tez - TestUnionOnSchema
https://issues.apache.org/jira/browse/PIG-4352
PIG-4340 PigStorage fails parsing empty map.
https://issues.apache.org/jira/browse/PIG-4340
PIG-4323 PackageConverter hanging in Spark
https://issues.apache.org/jira/browse/PIG-4323
PIG-4313 StackOverflowError in LIMIT operation on Spark
https://issues.apache.org/jira/browse/PIG-4313
PIG-4264 Port TestAvroStorage to tez local mode
https://issues.apache.org/jira/browse/PIG-4264
PIG-4251 Pig on Storm
https://issues.apache.org/jira/browse/PIG-4251
PIG-4193 Make collected group work with Spark
https://issues.apache.org/jira/browse/PIG-4193
PIG-4111 Make Pig compiles with avro-1.7.7
https://issues.apache.org/jira/browse/PIG-4111
PIG-4103 Fix TestRegisteredJarVisibility(after PIG-4083)
https://issues.apache.org/jira/browse/PIG-4103
PIG-4004 Upgrade the Pigmix queries from the (old) mapred API to mapreduce
https://issues.apache.org/jira/browse/PIG-4004
PIG-4002 Disable combiner when map-side aggregation is used
https://issues.apache.org/jira/browse/PIG-4002
PIG-3952 PigStorage accepts '-tagSplit' to return full split information
https://issues.apache.org/jira/browse/PIG-3952
PIG-3911 Define unique fields with @OutputSchema
https://issues.apache.org/jira/browse/PIG-3911
PIG-3877 Getting Geo Latitude/Longitude from Address Lines
https://issues.apache.org/jira/browse/PIG-3877
PIG-3873 Geo distance calculation using Haversine
https://issues.apache.org/jira/browse/PIG-3873
PIG-3866 Create ThreadLocal classloader per PigContext
https://issues.apache.org/jira/browse/PIG-3866
PIG-3668 COR built-in function when atleast one of the coefficient values is NaN
https://issues.apache.org/jira/browse/PIG-3668
PIG-3635 Fix e2e tests for Hadoop 2.X on Windows
https://issues.apache.org/jira/browse/PIG-3635
PIG-3587 add functionality for rolling over dates
https://issues.apache.org/jira/browse/PIG-3587
PIG-3441 Allow Pig to use default resources from Configuration objects
https://issues.apache.org/jira/browse/PIG-3441

You may edit this subscription at:
https://issues.apache.org/jira/secure/FilterSubscription!default.jspa?subId=16328&filterId=12322384


[jira] [Commented] (PIG-4386) How many files can be submitted to a pig job at once?

2015-01-16 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-4386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14281201#comment-14281201
 ] 

Daniel Dai commented on PIG-4386:
-

There is no hard limit on the number of files in the input directory. You need 
to check the JobTracker UI to find the real error message.

> How many files can be submitted to a pig job at once?
> -
>
> Key: PIG-4386
> URL: https://issues.apache.org/jira/browse/PIG-4386
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.13.1
> Environment: {code}
> $pig --version
> Apache Pig version 0.13.1-mapr-1410 (rexported) 
> compiled Nov 05 2014, 10:16:28
> {code}
>Reporter: Madhavi Nadig
>
> Pig fails mysteriously when I specify the root of a large directory tree as 
> the LOAD input in my script. The exception that it throws offers no insight 
> into what's happening. The same script works perfectly when there are fewer 
> files.
> It's a very simple script as you can see below:
> {code}
> SET pig.noSplitCombination true;
> raw_record = LOAD '/data/directory/tree/root' USING PigStorage(',');
> filtered = FILTER raw_record by $1 == 251068;
> filtered_data = FOREACH filtered GENERATE (chararray)$0, (chararray)$1, 
> (chararray)$2;
> STORE filtered_data INTO '/data/output/directory/' USING PigStorage();
> {code}
> Here's the error message I see :
> {code}
>ERROR 2244: Job scope-594 failed, hadoop does not return any error message
> org.apache.pig.backend.executionengine.ExecException: ERROR 2244: Job 
> scope-594 failed, hadoop does not return any error message
> at 
> org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:178)
> at 
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:232)
> at 
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:203)
> at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
> at org.apache.pig.Main.run(Main.java:608)
> at org.apache.pig.Main.main(Main.java:156)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> {code}
> How many files can PIG process at once?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (PIG-4384) TezLauncher thread should be deamon thread

2015-01-16 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-4384:

Component/s: tez

> TezLauncher thread should be deamon thread 
> ---
>
> Key: PIG-4384
> URL: https://issues.apache.org/jira/browse/PIG-4384
> Project: Pig
>  Issue Type: Bug
>  Components: tez
>Reporter: Jeff Zhang
>Assignee: Jeff Zhang
> Fix For: 0.15.0
>
> Attachments: PIG_4384_1.patch
>
>
> The following piece of code hangs because the TezLauncher thread is not a 
> daemon thread. 
> {code}
>   public static void main(String[] args) throws IOException,
>       InterruptedException {
>     FileSystem fs = FileSystem.get(new Configuration());
>     fs.delete(new Path("/tmp/output"), true);
>     PigServer pig = new PigServer(new TezExecType());
>     pig.registerScript("scripts/test.pig");
>     // main() returns here, but the JVM does not exit
>   }
> {code}





[jira] [Updated] (PIG-4384) TezLauncher thread should be deamon thread

2015-01-16 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-4384:

   Resolution: Fixed
Fix Version/s: 0.15.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Patch committed to trunk. Thanks Jeff!

> TezLauncher thread should be deamon thread 
> ---
>
> Key: PIG-4384
> URL: https://issues.apache.org/jira/browse/PIG-4384
> Project: Pig
>  Issue Type: Bug
>  Components: tez
>Reporter: Jeff Zhang
>Assignee: Jeff Zhang
> Fix For: 0.15.0
>
> Attachments: PIG_4384_1.patch
>
>
> The following piece of code hangs because the TezLauncher thread is not a 
> daemon thread. 
> {code}
>   public static void main(String[] args) throws IOException,
>       InterruptedException {
>     FileSystem fs = FileSystem.get(new Configuration());
>     fs.delete(new Path("/tmp/output"), true);
>     PigServer pig = new PigServer(new TezExecType());
>     pig.registerScript("scripts/test.pig");
>     // main() returns here, but the JVM does not exit
>   }
> {code}





[jira] [Updated] (PIG-4366) Port local mode tests to Tez - part5

2015-01-16 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-4366:

Attachment: PIG-4366-3.patch

Update the patch to address Rohini's review comments.

> Port local mode tests to Tez - part5
> 
>
> Key: PIG-4366
> URL: https://issues.apache.org/jira/browse/PIG-4366
> Project: Pig
>  Issue Type: Sub-task
>  Components: tez
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 0.15.0
>
> Attachments: PIG-4366-1.patch, PIG-4366-2.patch, PIG-4366-3.patch
>
>
> Covers the following tests:
> TestCubeOperator
> TestJoin
> TestMultiQuery
> TestNewPlanColumnPrune
> TestPigServer
> TestPigStats
> TestPruneColumn
> TestRank3
> TestScalarAliases





Re: Review Request 29645: Port local mode tests to Tez - part5

2015-01-16 Thread Daniel Dai

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29645/
---

(Updated Jan. 17, 2015, 2:21 a.m.)


Review request for pig and Rohini Palaniswamy.


Bugs: PIG-4366
https://issues.apache.org/jira/browse/PIG-4366


Repository: pig


Description
---

See PIG-4366


Diffs (updated)
-

  
trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POReservoirSample.java
 1652551 
  
trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/TezCompiler.java
 1652551 
  
trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/optimizer/MultiQueryOptimizerTez.java
 1652551 
  trunk/src/org/apache/pig/builtin/mock/Storage.java 1652551 
  trunk/src/org/apache/pig/tools/pigstats/tez/TezScriptState.java 1652551 
  trunk/test/excluded-tests-20 1652551 
  trunk/test/org/apache/pig/test/TestCubeOperator.java 1652551 
  trunk/test/org/apache/pig/test/TestJoin.java 1652551 
  trunk/test/org/apache/pig/test/TestJoinBase.java PRE-CREATION 
  trunk/test/org/apache/pig/test/TestJoinLocal.java PRE-CREATION 
  trunk/test/org/apache/pig/test/TestMultiQuery.java 1652551 
  trunk/test/org/apache/pig/test/TestNewPlanColumnPrune.java 1652551 
  trunk/test/org/apache/pig/test/TestPigServer.java 1652551 
  trunk/test/org/apache/pig/test/TestPigServerLocal.java PRE-CREATION 
  trunk/test/org/apache/pig/test/TestPigStats.java 1652551 
  trunk/test/org/apache/pig/test/TestPigStatsMR.java PRE-CREATION 
  trunk/test/org/apache/pig/test/TestPruneColumn.java 1652551 
  trunk/test/org/apache/pig/test/TestRank3.java 1652551 
  trunk/test/org/apache/pig/test/TestScalarAliases.java 1652551 
  trunk/test/org/apache/pig/test/TestScalarAliasesLocal.java PRE-CREATION 
  trunk/test/org/apache/pig/test/Util.java 1652551 
  trunk/test/org/apache/pig/tez/TestPigStatsTez.java PRE-CREATION 
  trunk/test/tez-local-tests 1652551 

Diff: https://reviews.apache.org/r/29645/diff/


Testing
---


Thanks,

Daniel Dai



Re: Review Request 29645: Port local mode tests to Tez - part5

2015-01-16 Thread Daniel Dai


> On Jan. 16, 2015, 5:54 p.m., Rohini Palaniswamy wrote:
> > trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/TezCompiler.java,
> >  line 300
> > 
> >
> > Why do we need this condition? That is, in what case is the store operator 
> > a predecessor of the tezOp?

This simply prevents connecting twice: if "from" and "tezOp" are already 
connected, we skip it.


> On Jan. 16, 2015, 5:54 p.m., Rohini Palaniswamy wrote:
> > trunk/src/org/apache/pig/builtin/mock/Storage.java, line 462
> > 
> >
> > why should we do this?  Having to create a copy in a StoreFunc does not 
> > look good.

We have to, because Storage saves a reference to the tuple in its output 
dataset, and that tuple keeps changing unless we make a copy.
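
The aliasing problem described in this reply can be reproduced with plain 
collections. The following is a minimal sketch, not the code from the patch: 
TupleCopySketch and collect are hypothetical names, and a List<Object> stands 
in for Pig's Tuple. It assumes the producer reuses one mutable record for every 
row, which is the situation the reply describes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins; these are not Pig's Tuple or mock Storage classes.
public class TupleCopySketch {

    // Simulates a producer that reuses one mutable record for every row and a
    // sink that stores either the shared reference or a defensive copy.
    static List<List<Object>> collect(boolean copy) {
        List<List<Object>> output = new ArrayList<>();
        List<Object> reused = new ArrayList<Object>(Arrays.asList(0, "row"));
        for (int i = 1; i <= 3; i++) {
            reused.set(0, i); // the producer overwrites the same object
            output.add(copy ? new ArrayList<Object>(reused) : reused);
        }
        return output;
    }

    public static void main(String[] args) {
        System.out.println("by reference: " + collect(false));
        System.out.println("with copy:    " + collect(true));
    }
}
```

Without the copy, all three stored entries point at the same object and show 
only the last value written; with the copy, each stored row keeps its own 
value.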


> On Jan. 16, 2015, 5:54 p.m., Rohini Palaniswamy wrote:
> > trunk/test/org/apache/pig/test/TestJoin.java, line 61
> > 
> >
> > TestJoinLocal seems to be missing in the patch.
> > 
> > I think we can keep TestJoin as is for cluster tests without having to 
> > create an abstract TestJoinBase and just create a TestJoinLocal which 
> > extends TestJoin and uses Local mode as all tests are same. Will help keep 
> > changes simple and have just two classes instead of three.

Not every test runs in both modes. Some run only in cluster mode (TestJoin) and 
some only in local mode (TestJoinLocal). Those that run in both modes I put 
into TestJoinBase.


> On Jan. 16, 2015, 5:54 p.m., Rohini Palaniswamy wrote:
> > trunk/test/org/apache/pig/test/TestPigStatsMR.java, line 46
> > 
> >
> > Make method more generic
> > 
> > assertNumberOfJobs(ExecJob job, int expectedNumJobs) {
> > 
> > 
> > assertEquals(expectedNumJobs, jobGraph.getJobList().size());
> > 
> > }

This check method is very specific to the test case testPigStatsAlias. 
expectedNumJobs has different values in Tez and MR. If we make checkPigStats 
more generic, we need to add another abstract method: in MR we would check 
assertNumberOfJobs(job, 2), and in Tez we would check assertNumberOfJobs(job, 1).
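
For illustration, the alternative discussed here (a generic check plus one 
abstract method per execution engine) could look roughly like the sketch 
below. All class names are hypothetical and a plain int stands in for the job 
graph size; only the expected values 2 (MR) and 1 (Tez) come from the reply 
above.

```java
// Hypothetical sketch; these are not Pig's test classes.
public class JobCountSketch {

    // Shared check lives in the base class; only the expected count varies.
    static abstract class StatsTestBase {
        abstract int expectedNumJobs();

        void assertNumberOfJobs(int actualNumJobs) {
            if (actualNumJobs != expectedNumJobs()) {
                throw new AssertionError("expected " + expectedNumJobs()
                        + " jobs but found " + actualNumJobs);
            }
        }
    }

    static class MrStatsTest extends StatsTestBase {
        @Override
        int expectedNumJobs() { return 2; } // value quoted in the reply for MR
    }

    static class TezStatsTest extends StatsTestBase {
        @Override
        int expectedNumJobs() { return 1; } // value quoted in the reply for Tez
    }

    public static void main(String[] args) {
        new MrStatsTest().assertNumberOfJobs(2);
        new TezStatsTest().assertNumberOfJobs(1);
        System.out.println("both checks passed");
    }
}
```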


> On Jan. 16, 2015, 5:54 p.m., Rohini Palaniswamy wrote:
> > trunk/test/tez-local-tests, line 72
> > 
> >
> > Can we add tests in the order they will be executed? Find it easy to 
> > comment the already run ones and re-run if I find a failure in the middle.

Sure, but I want to do it later, after most of the local mode test patches are 
checked in. The file always creates conflicts and needs manual editing from 
time to time.


- Daniel


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29645/#review67688
---


On Jan. 6, 2015, 11:49 p.m., Daniel Dai wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29645/
> ---
> 
> (Updated Jan. 6, 2015, 11:49 p.m.)
> 
> 
> Review request for pig and Rohini Palaniswamy.
> 
> 
> Bugs: PIG-4366
> https://issues.apache.org/jira/browse/PIG-4366
> 
> 
> Repository: pig
> 
> 
> Description
> ---
> 
> See PIG-4366
> 
> 
> Diffs
> -
> 
>   
> trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POReservoirSample.java
>  1649730 
>   
> trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/TezCompiler.java
>  1649730 
>   
> trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/optimizer/MultiQueryOptimizerTez.java
>  1649730 
>   trunk/src/org/apache/pig/builtin/mock/Storage.java 1649730 
>   trunk/src/org/apache/pig/tools/pigstats/tez/TezScriptState.java 1649730 
>   trunk/test/excluded-tests-20 1649730 
>   trunk/test/org/apache/pig/test/TestCubeOperator.java 1649730 
>   trunk/test/org/apache/pig/test/TestJoin.java 1649730 
>   trunk/test/org/apache/pig/test/TestJoinBase.java PRE-CREATION 
>   trunk/test/org/apache/pig/test/TestMultiQuery.java 1649730 
>   trunk/test/org/apache/pig/test/TestNewPlanColumnPrune.java 1649730 
>   trunk/test/org/apache/pig/test/TestPigServer.java 1649730 
>   trunk/test/org/apache/pig/test/TestPigServerLocal.java PRE-CREATION 
>   trunk/test/org/apache/pig/test/TestPigStats.java 1649730 
>   trunk/test/org/apache/pig/test/TestPigStatsMR.java PRE-CREATION 
>   trunk/test/org/apache/pig/test/TestPruneColumn.java 1649730 
>   trunk/test/org/apache/pig/test/TestRank3.java 1649730 
>   trunk/test/org/apache/pig/test/TestScalarAliases.java 1649730 
>   trunk/test/org/

[jira] [Updated] (PIG-4359) Port local mode tests to Tez - part4

2015-01-16 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-4359:

Attachment: PIG-4359-2.patch

Update the patch to address Rohini's review comments.

> Port local mode tests to Tez - part4
> 
>
> Key: PIG-4359
> URL: https://issues.apache.org/jira/browse/PIG-4359
> Project: Pig
>  Issue Type: Sub-task
>  Components: tez
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 0.15.0
>
> Attachments: PIG-4359-1.patch, PIG-4359-2.patch
>
>
> Covers:
> TestMultiQueryLocal
> TestPOPartialAggPlan
> TestStore





Re: Review Request 29644: Port local mode tests to Tez - part4

2015-01-16 Thread Daniel Dai

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29644/
---

(Updated Jan. 16, 2015, 11:58 p.m.)


Review request for pig and Rohini Palaniswamy.


Repository: pig


Description
---

See PIG-4359


Diffs (updated)
-

  
trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POStore.java
 1652499 
  trunk/test/excluded-tests-20 1652499 
  trunk/test/org/apache/pig/test/TestMultiQueryLocal.java 1652499 
  trunk/test/org/apache/pig/test/TestMultiQueryLocalMR.java PRE-CREATION 
  trunk/test/org/apache/pig/test/TestPOPartialAggPlan.java 1652499 
  trunk/test/org/apache/pig/test/TestPOPartialAggPlanMR.java PRE-CREATION 
  trunk/test/org/apache/pig/test/TestStore.java 1652499 
  trunk/test/org/apache/pig/test/TestStoreBase.java PRE-CREATION 
  trunk/test/org/apache/pig/test/TestStoreLocal.java PRE-CREATION 
  trunk/test/org/apache/pig/tez/TestMultiQueryLocalTez.java PRE-CREATION 
  trunk/test/org/apache/pig/tez/TestPOPartialAggPlanTez.java PRE-CREATION 
  trunk/test/tez-local-tests 1652499 

Diff: https://reviews.apache.org/r/29644/diff/


Testing
---


Thanks,

Daniel Dai



Re: Review Request 29644: Port local mode tests to Tez - part4

2015-01-16 Thread Daniel Dai


> On Jan. 13, 2015, 10:52 p.m., Rohini Palaniswamy wrote:
> > trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POStore.java,
> >  line 338
> > 
> >
> > Why do we need this?

I need PigStoreTez to be comparable with PigStore. In TezCompiler, we add 
PigStore to phyToTezOpMap and later change PigStore to PigStoreTez; after that 
we were not able to retrieve the Tez plan.
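
The lookup failure described here is the usual consequence of replacing a map 
key with an object that does not compare equal to the original. The sketch 
below is an assumption-laden illustration, not the change made in POStore: 
Store, StoreTez, and opToPlan are hypothetical stand-ins for the classes and 
the phyToTezOpMap mentioned above, and equality is assumed to depend only on an 
operator key.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins; these are not Pig's operator classes.
public class OperatorKeySketch {

    // Equality depends only on the key, not on the concrete class.
    static class Store {
        final String key;

        Store(String key) { this.key = key; }

        @Override
        public boolean equals(Object other) {
            return other instanceof Store && ((Store) other).key.equals(key);
        }

        @Override
        public int hashCode() { return key.hashCode(); }
    }

    // The engine-specific replacement keeps the original operator's key.
    static class StoreTez extends Store {
        StoreTez(Store original) { super(original.key); }
    }

    static String lookupAfterReplace() {
        Map<Store, String> opToPlan = new HashMap<>();
        Store store = new Store("scope-1");
        opToPlan.put(store, "tez-op-A");
        Store replaced = new StoreTez(store); // operator swapped after insertion
        return opToPlan.get(replaced);        // still found: same key, same hash
    }

    public static void main(String[] args) {
        System.out.println(lookupAfterReplace());
    }
}
```

If StoreTez did not compare equal to the Store it replaced, the get call would 
return null, which matches the symptom described in the reply.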


> On Jan. 13, 2015, 10:52 p.m., Rohini Palaniswamy wrote:
> > trunk/test/org/apache/pig/test/TestMultiQueryLocal.java, line 447
> > 
> >
> > If the location is different, why do we have to pass two different keys 
> > for two different output formats to get the test passing?

This test makes sure that different storers use different contexts, so I use a 
different key for each storer in the test. There is a difference which makes 
the test pass in MR: in Tez, OutputFormat and Storer use the same context, 
while in MR they do not. I don't think this is a big issue.


> On Jan. 13, 2015, 10:52 p.m., Rohini Palaniswamy wrote:
> > trunk/test/org/apache/pig/test/TestMultiQueryLocal.java, line 680
> > 
> >
> > Instead of splitting this test class into MR and Tez classes can we 
> > just add a method to Util class for the below line?
> > 
> > Launcher launcher = Util.getLauncher(execType);

I tried to, but how would we do that? With reflection?


> On Jan. 13, 2015, 10:52 p.m., Rohini Palaniswamy wrote:
> > trunk/test/org/apache/pig/test/TestStoreLocal.java, line 36
> > 
> >
> > The old TestStore code had it, but do we actually need this setting? 
> > Not done for other local test classes.

Yes, we don't really need it.


- Daniel


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29644/#review67958
---


On Jan. 6, 2015, 11:47 p.m., Daniel Dai wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29644/
> ---
> 
> (Updated Jan. 6, 2015, 11:47 p.m.)
> 
> 
> Review request for pig and Rohini Palaniswamy.
> 
> 
> Repository: pig
> 
> 
> Description
> ---
> 
> See PIG-4359
> 
> 
> Diffs
> -
> 
>   
> trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POStore.java
>  1645046 
>   trunk/test/excluded-tests-20 1645046 
>   trunk/test/org/apache/pig/test/TestMultiQueryLocal.java 1645046 
>   trunk/test/org/apache/pig/test/TestMultiQueryLocalMR.java PRE-CREATION 
>   trunk/test/org/apache/pig/test/TestPOPartialAggPlan.java 1645046 
>   trunk/test/org/apache/pig/test/TestPOPartialAggPlanMR.java PRE-CREATION 
>   trunk/test/org/apache/pig/test/TestStore.java 1645046 
>   trunk/test/org/apache/pig/test/TestStoreBase.java PRE-CREATION 
>   trunk/test/org/apache/pig/test/TestStoreLocal.java PRE-CREATION 
>   trunk/test/org/apache/pig/tez/TestMultiQueryLocalTez.java PRE-CREATION 
>   trunk/test/org/apache/pig/tez/TestPOPartialAggPlanTez.java PRE-CREATION 
>   trunk/test/org/apache/pig/tez/TezUtil.java PRE-CREATION 
>   trunk/test/tez-local-tests 1645046 
> 
> Diff: https://reviews.apache.org/r/29644/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Daniel Dai
> 
>



Is there a way to run one test of special unit test?

2015-01-16 Thread lulynn_2008
Hi All,

There are multiple tests in one Test* file. Is there a way to run just one 
specific test?

Thanks


Re: Review Request 29645: Port local mode tests to Tez - part5

2015-01-16 Thread Rohini Palaniswamy

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29645/#review67688
---



trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POReservoirSample.java


This can be moved inside the new if block



trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/TezCompiler.java


Why do we need this condition? That is, in what case is the store operator a 
predecessor of the tezOp?



trunk/src/org/apache/pig/builtin/mock/Storage.java


why should we do this?  Having to create a copy in a StoreFunc does not 
look good.



trunk/test/org/apache/pig/test/TestCubeOperator.java


Unintended change which reverts the recent rollup patch. Need to rollback.

testRollupHIIAfterCogroup - Naming is weird though. Need to go back and 
review the rollup patch.



trunk/test/org/apache/pig/test/TestCubeOperator.java


Can you add PIG-3993 to the message? Easy to search later. Need to add to 
some illustrate local mode tests in TestGrunt as well.



trunk/test/org/apache/pig/test/TestJoin.java


TestJoinLocal seems to be missing in the patch.

I think we can keep TestJoin as is for cluster tests without having to 
create an abstract TestJoinBase and just create a TestJoinLocal which extends 
TestJoin and uses Local mode as all tests are same. Will help keep changes 
simple and have just two classes instead of three.



trunk/test/org/apache/pig/test/TestPigStatsMR.java


Make method more generic

assertNumberOfJobs(ExecJob job, int expectedNumJobs) {


assertEquals(expectedNumJobs, jobGraph.getJobList().size());

}



trunk/test/org/apache/pig/test/TestPigStatsMR.java


private



trunk/test/org/apache/pig/test/TestPigStatsMR.java


private



trunk/test/tez-local-tests


Can we add tests in the order they will be executed? Find it easy to 
comment the already run ones and re-run if I find a failure in the middle.


- Rohini Palaniswamy


On Jan. 6, 2015, 11:49 p.m., Daniel Dai wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29645/
> ---
> 
> (Updated Jan. 6, 2015, 11:49 p.m.)
> 
> 
> Review request for pig and Rohini Palaniswamy.
> 
> 
> Bugs: PIG-4366
> https://issues.apache.org/jira/browse/PIG-4366
> 
> 
> Repository: pig
> 
> 
> Description
> ---
> 
> See PIG-4366
> 
> 
> Diffs
> -
> 
>   
> trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POReservoirSample.java
>  1649730 
>   
> trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/TezCompiler.java
>  1649730 
>   
> trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/optimizer/MultiQueryOptimizerTez.java
>  1649730 
>   trunk/src/org/apache/pig/builtin/mock/Storage.java 1649730 
>   trunk/src/org/apache/pig/tools/pigstats/tez/TezScriptState.java 1649730 
>   trunk/test/excluded-tests-20 1649730 
>   trunk/test/org/apache/pig/test/TestCubeOperator.java 1649730 
>   trunk/test/org/apache/pig/test/TestJoin.java 1649730 
>   trunk/test/org/apache/pig/test/TestJoinBase.java PRE-CREATION 
>   trunk/test/org/apache/pig/test/TestMultiQuery.java 1649730 
>   trunk/test/org/apache/pig/test/TestNewPlanColumnPrune.java 1649730 
>   trunk/test/org/apache/pig/test/TestPigServer.java 1649730 
>   trunk/test/org/apache/pig/test/TestPigServerLocal.java PRE-CREATION 
>   trunk/test/org/apache/pig/test/TestPigStats.java 1649730 
>   trunk/test/org/apache/pig/test/TestPigStatsMR.java PRE-CREATION 
>   trunk/test/org/apache/pig/test/TestPruneColumn.java 1649730 
>   trunk/test/org/apache/pig/test/TestRank3.java 1649730 
>   trunk/test/org/apache/pig/test/TestScalarAliases.java 1649730 
>   trunk/test/org/apache/pig/test/TestScalarAliasesLocal.java PRE-CREATION 
>   trunk/test/org/apache/pig/test/Util.java 1649730 
>   trunk/test/org/apache/pig/tez/TestPigStatsTez.java PRE-CREATION 
>   trunk/test/tez-local-tests 1649730 
> 
> Diff: https://reviews.apache.org/r/29645/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Daniel Dai
> 
>



[jira] [Created] (PIG-4386) How many files can be submitted to a pig job at once?

2015-01-16 Thread Madhavi Nadig (JIRA)
Madhavi Nadig created PIG-4386:
--

 Summary: How many files can be submitted to a pig job at once?
 Key: PIG-4386
 URL: https://issues.apache.org/jira/browse/PIG-4386
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.13.1
 Environment: {code}
$pig --version
Apache Pig version 0.13.1-mapr-1410 (rexported) 
compiled Nov 05 2014, 10:16:28
{code}

Reporter: Madhavi Nadig


Pig fails mysteriously when I specify the root of a large directory tree as the 
LOAD input in my script. The exception that it throws offers no insight into 
what's happening. The same script works perfectly when there are fewer files.

It's a very simple script as you can see below:

{code}
SET pig.noSplitCombination true;
raw_record = LOAD '/data/directory/tree/root' USING PigStorage(',');
filtered = FILTER raw_record by $1 == 251068;
filtered_data = FOREACH filtered GENERATE (chararray)$0, (chararray)$1, 
(chararray)$2;
STORE filtered_data INTO '/data/output/directory/' USING PigStorage();
{code}
Here's the error message I see :
{code}
   ERROR 2244: Job scope-594 failed, hadoop does not return any error message
org.apache.pig.backend.executionengine.ExecException: ERROR 2244: Job scope-594 
failed, hadoop does not return any error message
at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:178)
at 
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:232)
at 
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:203)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
at org.apache.pig.Main.run(Main.java:608)
at org.apache.pig.Main.main(Main.java:156)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
{code}
How many files can PIG process at once?






Re: Is there a way to run one test of special unit test?

2015-01-16 Thread Pradeep Gollakota
If you're using Maven AND using Surefire plugin 2.7.3+ AND using JUnit 4,
then you can do this by specifying -Dtest=TestClass#methodName

ref:
http://maven.apache.org/surefire/maven-surefire-plugin/examples/single-test.html

On Thu, Jan 15, 2015 at 8:02 PM, Cheolsoo Park  wrote:

> I don't think you can disable test cases on the fly in JUnit. You will need
> to add @Ignore annotation and recompile the test file. Correct me if I am
> wrong.
>
> On Thu, Jan 15, 2015 at 6:55 PM, lulynn_2008  wrote:
>
> > Hi All,
> >
> > There are multiple tests in one Test* file. Is there a way to run just
> > one specific test?
> >
> > Thanks
> >
>


[jira] [Updated] (PIG-4385) testDefaultBootup fails because it cannot find "pig.properties"

2015-01-16 Thread Martin Kudlej (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martin Kudlej updated PIG-4385:
---
Status: Patch Available  (was: Open)

> testDefaultBootup fails because it cannot find "pig.properties"
> ---
>
> Key: PIG-4385
> URL: https://issues.apache.org/jira/browse/PIG-4385
> Project: Pig
>  Issue Type: Test
>Affects Versions: 0.12.0, 0.13.0, 0.14.0
>Reporter: Martin Kudlej
> Attachments: 0001-PIG-4385.patch
>
>
> testDefaultBootup fails because Pig cannot find the created file 
> "pig.properties". I think the test of using the "pig.properties" file should 
> be separated from testDefaultBootup. The only clean way to do this is to use 
> Properties when starting PigServer.





[jira] [Updated] (PIG-4385) testDefaultBootup fails because it cannot find "pig.properties"

2015-01-16 Thread Martin Kudlej (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martin Kudlej updated PIG-4385:
---
Attachment: 0001-PIG-4385.patch

I would like to propose this change to testDefaultBootup test.

> testDefaultBootup fails because it cannot find "pig.properties"
> ---
>
> Key: PIG-4385
> URL: https://issues.apache.org/jira/browse/PIG-4385
> Project: Pig
>  Issue Type: Test
>Affects Versions: 0.12.0, 0.13.0, 0.14.0
>Reporter: Martin Kudlej
> Attachments: 0001-PIG-4385.patch
>
>
> testDefaultBootup fails because Pig cannot find the created file 
> "pig.properties". I think the test of using the "pig.properties" file should 
> be separated from testDefaultBootup. The only clean way to do this is to use 
> Properties when starting PigServer.





[jira] [Created] (PIG-4385) testDefaultBootup fails because it cannot find "pig.properties"

2015-01-16 Thread Martin Kudlej (JIRA)
Martin Kudlej created PIG-4385:
--

 Summary: testDefaultBootup fails because it cannot find 
"pig.properties"
 Key: PIG-4385
 URL: https://issues.apache.org/jira/browse/PIG-4385
 Project: Pig
  Issue Type: Test
Affects Versions: 0.12.0, 0.13.0, 0.14.0
Reporter: Martin Kudlej


testDefaultBootup fails because Pig cannot find the created file 
"pig.properties". I think the test of using the "pig.properties" file should be 
separated from testDefaultBootup. The only clean way to do this is to use 
Properties when starting PigServer.





[jira] [Updated] (PIG-4384) TezLauncher thread should be deamon thread

2015-01-16 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated PIG-4384:

Affects Version/s: (was: 0.15.0)

> TezLauncher thread should be deamon thread 
> ---
>
> Key: PIG-4384
> URL: https://issues.apache.org/jira/browse/PIG-4384
> Project: Pig
>  Issue Type: Bug
>Reporter: Jeff Zhang
>Assignee: Jeff Zhang
> Attachments: PIG_4384_1.patch
>
>
> The following piece of code hangs because the TezLauncher thread is not a 
> daemon thread. 
> {code}
>   public static void main(String[] args) throws IOException,
>       InterruptedException {
>     FileSystem fs = FileSystem.get(new Configuration());
>     fs.delete(new Path("/tmp/output"), true);
>     PigServer pig = new PigServer(new TezExecType());
>     pig.registerScript("scripts/test.pig");
>     // main() returns here, but the JVM does not exit
>   }
> {code}





[jira] [Updated] (PIG-4384) TezLauncher thread should be deamon thread

2015-01-16 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated PIG-4384:

Status: Patch Available  (was: Open)

> TezLauncher thread should be deamon thread 
> ---
>
> Key: PIG-4384
> URL: https://issues.apache.org/jira/browse/PIG-4384
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.15.0
>Reporter: Jeff Zhang
>Assignee: Jeff Zhang
> Attachments: PIG_4384_1.patch
>
>
> The following piece of code hangs because the TezLauncher thread is not a 
> daemon thread. 
> {code}
>   public static void main(String[] args) throws IOException,
>       InterruptedException {
>     FileSystem fs = FileSystem.get(new Configuration());
>     fs.delete(new Path("/tmp/output"), true);
>     PigServer pig = new PigServer(new TezExecType());
>     pig.registerScript("scripts/test.pig");
>     // main() returns here, but the JVM does not exit
>   }
> {code}





[jira] [Updated] (PIG-4384) TezLauncher thread should be deamon thread

2015-01-16 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated PIG-4384:

Attachment: PIG_4384_1.patch

> TezLauncher thread should be deamon thread 
> ---
>
> Key: PIG-4384
> URL: https://issues.apache.org/jira/browse/PIG-4384
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.15.0
>Reporter: Jeff Zhang
>Assignee: Jeff Zhang
> Attachments: PIG_4384_1.patch
>
>
> The following piece of code hangs because the TezLauncher thread is not a 
> daemon thread. 
> {code}
>   public static void main(String[] args) throws IOException,
>       InterruptedException {
>     FileSystem fs = FileSystem.get(new Configuration());
>     fs.delete(new Path("/tmp/output"), true);
>     PigServer pig = new PigServer(new TezExecType());
>     pig.registerScript("scripts/test.pig");
>     // main() returns here, but the JVM does not exit
>   }
> {code}





[jira] [Created] (PIG-4384) TezLauncher thread should be deamon thread

2015-01-16 Thread Jeff Zhang (JIRA)
Jeff Zhang created PIG-4384:
---

 Summary: TezLauncher thread should be deamon thread 
 Key: PIG-4384
 URL: https://issues.apache.org/jira/browse/PIG-4384
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.15.0
Reporter: Jeff Zhang
Assignee: Jeff Zhang


The following piece of code hangs because the TezLauncher thread is not a 
daemon thread.

{code}
  public static void main(String[] args) throws IOException,
      InterruptedException {
    FileSystem fs = FileSystem.get(new Configuration());
    fs.delete(new Path("/tmp/output"), true);
    PigServer pig = new PigServer(new TezExecType());
    pig.registerScript("scripts/test.pig");
    // main() returns here, but the JVM does not exit
  }
{code}
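
The underlying JVM rule is that the process exits only when no non-daemon 
threads remain alive. The following is a minimal, stdlib-only sketch of that 
rule; it is not the PIG_4384_1.patch change. DaemonThreadSketch and startWorker 
are hypothetical names, and a thread blocked on a latch stands in for the 
launcher thread.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical demo class; not part of Pig or Tez.
public class DaemonThreadSketch {

    // Starts a background worker that blocks until the latch is released.
    // The daemon flag must be set before start(); calling setDaemon on a
    // started thread throws IllegalThreadStateException.
    static Thread startWorker(boolean daemon, CountDownLatch stop) {
        Thread worker = new Thread(() -> {
            try {
                stop.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "launcher-sketch");
        worker.setDaemon(daemon);
        worker.start();
        return worker;
    }

    public static void main(String[] args) {
        CountDownLatch never = new CountDownLatch(1); // never released here
        Thread worker = startWorker(true, never);
        System.out.println("daemon=" + worker.isDaemon());
        // main returns here. Because the worker is a daemon thread, the JVM
        // exits even though the worker is still blocked. With
        // startWorker(false, never) the JVM would keep running, which is the
        // kind of hang described in this issue.
    }
}
```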


