[jira] Subscription: PIG patch available

2014-10-03 Thread jira
Issue Subscription
Filter: PIG patch available (0 issues)

Subscriber: pigdaily

You may edit this subscription at:
https://issues.apache.org/jira/secure/FilterSubscription!default.jspa?subId=16328&filterId=12322384


[jira] [Commented] (PIG-4221) I should not specify the schema for JsonLoader

2014-10-03 Thread Rohini Palaniswamy (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-4221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14158884#comment-14158884
 ] 

Rohini Palaniswamy commented on PIG-4221:
-

Have you tried com.twitter.elephantbird.pig.load.JsonLoader ?

> I should not specify the schema for JsonLoader 
> ---
>
> Key: PIG-4221
> URL: https://issues.apache.org/jira/browse/PIG-4221
> Project: Pig
>  Issue Type: Improvement
>Affects Versions: 0.12.0
>Reporter: Joao Salcedo
>
> I should be able to import JSON data without specifying the schema:
> raw = LOAD 'testjson' USING JsonLoader() as (json:map[]); 
> and access each field as:
> data = foreach raw generate (chararray)$0#'field1' as text, (long)$0#'field2' as id; ...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (PIG-4221) I should not specify the schema for JsonLoader

2014-10-03 Thread Joao Salcedo (JIRA)
Joao Salcedo created PIG-4221:
-

 Summary: I should not specify the schema for JsonLoader 
 Key: PIG-4221
 URL: https://issues.apache.org/jira/browse/PIG-4221
 Project: Pig
  Issue Type: Improvement
Affects Versions: 0.12.0
Reporter: Joao Salcedo


I should be able to import JSON data without specifying the schema:

raw = LOAD 'testjson' USING JsonLoader() as (json:map[]); 

and access each field as:

data = foreach raw generate (chararray)$0#'field1' as text, (long)$0#'field2' as id; ...





[jira] [Commented] (PIG-4084) Port TestPigRunner to Tez

2014-10-03 Thread Rohini Palaniswamy (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-4084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14158791#comment-14158791
 ] 

Rohini Palaniswamy commented on PIG-4084:
-

Shouldn't Tez also report all aliases processed instead of just the final one? 
pig.alias is the one that I heavily use for debugging. Having all aliases would 
be very helpful for debugging in Tez as well. Can you also put the patch on 
Review Board so that it is easy to get more context about each test?

> Port TestPigRunner to Tez
> -
>
> Key: PIG-4084
> URL: https://issues.apache.org/jira/browse/PIG-4084
> Project: Pig
>  Issue Type: Sub-task
>  Components: tez
>Reporter: Rohini Palaniswamy
>Assignee: Daniel Dai
> Fix For: 0.14.0
>
> Attachments: PIG-4084-1.patch, initial.patch
>
>






[jira] [Updated] (PIG-4173) Move to Spark 1.x

2014-10-03 Thread Richard Ding (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Ding updated PIG-4173:
--
Attachment: PIG-4174_5.patch

This patch fixes the cogroup issue for Spark 1.1.0. The Spark version is updated 
to 1.1.0.

> Move to Spark 1.x
> -
>
> Key: PIG-4173
> URL: https://issues.apache.org/jira/browse/PIG-4173
> Project: Pig
>  Issue Type: Sub-task
>  Components: spark
>Reporter: bc Wong
>Assignee: Richard Ding
> Attachments: PIG-4173.patch, PIG-4173_2.patch, PIG-4173_3.patch, 
> PIG-4174_4.patch, PIG-4174_5.patch, TEST-org.apache.pig.spark.TestSpark.txt
>
>
> The Spark branch is using Spark 0.9: 
> https://github.com/apache/pig/blob/spark/ivy.xml#L438. We should probably 
> switch to Spark 1.x asap, due to Spark interface changes since 1.0.





[jira] [Commented] (PIG-2692) Make the Pig unit faciliities more generalizable and update javadocs

2014-10-03 Thread Juan Gentile (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14158532#comment-14158532
 ] 

Juan Gentile commented on PIG-2692:
---

I kept extending PigTest... just in case someone needs this:

  public void overrideInput(String aliasInput, String[] input) throws IOException, ParseException {
    super.registerScript();
    StringBuilder sb = new StringBuilder();
    Schema.stringifySchema(sb, getPigServer().dumpSchema(aliasInput), DataType.TUPLE);

    final String destination = FileLocalizer.getTemporaryPath(getPigServer().getPigContext()).toString();
    PigTest.getCluster().copyFromLocalFile(input, destination, true);
    override(aliasInput, String.format("%s = LOAD '%s' USING PigStorage('%s') AS %s;",
        aliasInput, destination, "\\t", sb.toString()));
  }
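
For anyone reading along, the statement handed to override() can be previewed without a cluster. The sketch below only reproduces the String.format call from the method above in plain Java; the class name, alias, path and schema string are made-up values for illustration and are not something PigUnit itself produces.

```java
final class OverrideStatementSketch {

    // Mirrors the String.format call in overrideInput(); nothing here touches Pig.
    static String loadStatement(String alias, String path, String schema) {
        // "\\t" is a backslash followed by 't', so the statement contains PigStorage('\t').
        return String.format("%s = LOAD '%s' USING PigStorage('%s') AS %s;",
                alias, path, "\\t", schema);
    }

    public static void main(String[] args) {
        // Hypothetical alias, temporary path and schema, chosen only for the demo.
        System.out.println(loadStatement("data", "/tmp/temp-1/input", "(text:chararray,id:long)"));
    }
}
```

Printing the statement this way is a cheap check that the delimiter and the schema end up where the LOAD clause expects them.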



> Make the Pig unit faciliities more generalizable and update javadocs
> 
>
> Key: PIG-2692
> URL: https://issues.apache.org/jira/browse/PIG-2692
> Project: Pig
>  Issue Type: Improvement
>Reporter: Jeremy Hanna
>Priority: Minor
>
> This ticket has two goals for Pig unit:
> 1) Pig unit has a really nice method assertOutput(String inputAlias, String[] 
> inputValues, String outputAlias, String[] expectedOutputValues).  That method 
> lets you override an input alias variable with a hardcoded list of values. 
> That way, the script doesn't actually have to read that input variable from 
> hdfs or cassandra. Then, it runs the script and checks the specified output 
> alias variable against the expected set of values.  It's a really nice way to 
> test your entire pig script with a single method call, but only IF your 
> script has exactly 1 input and 1 output.  If you want to test more 
> complicated scripts, you have to jump through some hoops in order to override 
> more input variables. But, it would be fairly easy to change PigUnit so that 
> it can override any number of inputs and check any number of outputs and do 
> so easily.  That's basically the change that I put into the base testing 
> class I wrote. But, it would be better to push that into PigUnit itself, and 
> it's something that could easily be done in an afternoon.
> 2) Update javadocs for the pig unit test classes to make them more readable.





[jira] [Commented] (PIG-4220) MapReduce-based Rank failing with NPE due to missing Counters

2014-10-03 Thread Rohini Palaniswamy (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-4220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14158514#comment-14158514
 ] 

Rohini Palaniswamy commented on PIG-4220:
-

Sorry, that change was required only for backporting to 0.11. Your v2 patch is 
good and you can go ahead and check that in. 

> MapReduce-based Rank failing with NPE due to missing Counters
> -
>
> Key: PIG-4220
> URL: https://issues.apache.org/jira/browse/PIG-4220
> Project: Pig
>  Issue Type: Bug
>Reporter: Koji Noguchi
> Attachments: pig-4220-v01.txt, pig-4220-v02.txt, pig-4220-v03.txt
>
>
> A user reported that his Pig job with Rank was failing with:
> {noformat}
> Pig Stack Trace
> ---
> ERROR 2043: Unexpected error during execution.
> org.apache.pig.backend.executionengine.ExecException: ERROR 2043: Unexpected error during execution.
> at org.apache.pig.PigServer.launchPlan(PigServer.java:1296)
> at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1270)
> at org.apache.pig.PigServer.execute(PigServer.java:1260)
> at org.apache.pig.PigServer.executeBatch(PigServer.java:354)
> at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:138)
> at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:200)
> at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:171)
> at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
> at org.apache.pig.Main.run(Main.java:480)
> at org.apache.pig.Main.main(Main.java:157)
> Caused by: java.lang.RuntimeException: Error to read counters into Rank operation counterSize 277
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.saveCounters(JobControlCompiler.java:384)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.updateMROpPlan(JobControlCompiler.java:330)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:385)
> at org.apache.pig.PigServer.launchPlan(PigServer.java:1285)
> ... 9 more
> Caused by: java.lang.NullPointerException
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.saveCounters(JobControlCompiler.java:375)
> ... 12 more
> {noformat}
> (this is different from the PIG-3985 NPE)





[jira] [Updated] (PIG-4173) Move to Spark 1.x

2014-10-03 Thread Richard Ding (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Ding updated PIG-4173:
--
Attachment: PIG-4174_4.patch

This patch fixes the unit tests. 

The version of Spark used is 1.0.2. In Spark 1.1.0, CoGroupRDD changed, which 
breaks the cogroup runtime. I'm looking into this.

> Move to Spark 1.x
> -
>
> Key: PIG-4173
> URL: https://issues.apache.org/jira/browse/PIG-4173
> Project: Pig
>  Issue Type: Sub-task
>  Components: spark
>Reporter: bc Wong
>Assignee: Richard Ding
> Attachments: PIG-4173.patch, PIG-4173_2.patch, PIG-4173_3.patch, 
> PIG-4174_4.patch, TEST-org.apache.pig.spark.TestSpark.txt
>
>
> The Spark branch is using Spark 0.9: 
> https://github.com/apache/pig/blob/spark/ivy.xml#L438. We should probably 
> switch to Spark 1.x asap, due to Spark interface changes since 1.0.





[jira] [Commented] (PIG-4220) MapReduce-based Rank failing with NPE due to missing Counters

2014-10-03 Thread Koji Noguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-4220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14158481#comment-14158481
 ] 

Koji Noguchi commented on PIG-4220:
---

Hmm, my v3 patch fails with a compile error. I'll upload another one.



[jira] [Commented] (PIG-4220) MapReduce-based Rank failing with NPE due to missing Counters

2014-10-03 Thread Rohini Palaniswamy (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-4220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14158476#comment-14158476
 ] 

Rohini Palaniswamy commented on PIG-4220:
-

+1



[jira] [Commented] (PIG-2692) Make the Pig unit faciliities more generalizable and update javadocs

2014-10-03 Thread Juan Gentile (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14158475#comment-14158475
 ] 

Juan Gentile commented on PIG-2692:
---

+1 for being able to override multiple inputs. Regarding the ordering of 
results, I did my own assert:

  void assertUnsortedOutput(String[] expected, String alias) throws IOException, ParseException, AssertionError {
    List<String> expectedResults = new LinkedList<>(Arrays.asList(expected));
    Iterator<Tuple> resultsIterator = getAlias(alias);

    int size = 0;
    while (resultsIterator.hasNext()) {
      String result = resultsIterator.next().toString();
      assertTrue(expectedResults.contains(result));
      expectedResults.remove(result);
      size++;
    }

    Assert.assertEquals(expected.length, size);
  }
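
The same order-insensitive check can also be written as a comparison of occurrence counts, which handles duplicate rows without mutating a list. This is only a rough sketch in plain Java with no PigUnit dependency; the class and method names are invented for illustration, and the rows are assumed to be already converted to strings, as in the assert above.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class UnorderedCompareSketch {

    // Counts how often each line occurs, so duplicate rows are compared correctly.
    static Map<String, Integer> counts(List<String> lines) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : lines) {
            counts.merge(line, 1, Integer::sum);
        }
        return counts;
    }

    // True when both lists hold the same lines the same number of times, in any order.
    static boolean sameIgnoringOrder(List<String> expected, List<String> actual) {
        return counts(expected).equals(counts(actual));
    }

    public static void main(String[] args) {
        // Made-up rows, formatted the way a tuple's toString() might look.
        List<String> expected = Arrays.asList("(a,1)", "(b,2)", "(a,1)");
        List<String> actual = Arrays.asList("(b,2)", "(a,1)", "(a,1)");
        System.out.println(sameIgnoringOrder(expected, actual));
    }
}
```

In a test, the actual list would be filled from the alias iterator and the boolean result passed to assertTrue.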



[jira] [Updated] (PIG-4220) MapReduce-based Rank failing with NPE due to missing Counters

2014-10-03 Thread Koji Noguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Noguchi updated PIG-4220:
--
Attachment: pig-4220-v03.txt

bq. Need to call PigStatusReporter.setContext(context); before incrementCounter 
in PigReduceCounter

Thanks for catching that.



[jira] [Commented] (PIG-4220) MapReduce-based Rank failing with NPE due to missing Counters

2014-10-03 Thread Rohini Palaniswamy (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-4220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14158320#comment-14158320
 ] 

Rohini Palaniswamy commented on PIG-4220:
-

Need to call PigStatusReporter.setContext(context); before incrementCounter in 
PigReduceCounter
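
To illustrate why the ordering matters, here is a toy model in plain Java. It is not the real PigStatusReporter: the Context and Reporter classes below are hypothetical stand-ins, and the assumption that an increment made before a context is set is simply dropped is mine, made only to show how a counter could end up missing.

```java
import java.util.HashMap;
import java.util.Map;

final class ReporterContextSketch {

    // Hypothetical stand-in for a task context that owns the counters.
    static final class Context {
        final Map<String, Long> counters = new HashMap<>();
    }

    // Hypothetical stand-in for a static reporter; NOT the real PigStatusReporter.
    static final class Reporter {
        private static Context context;

        static void setContext(Context c) {
            context = c;
        }

        // Assumption for this sketch: without a context the increment is dropped.
        static boolean incrementCounter(String name, long delta) {
            if (context == null) {
                return false;
            }
            context.counters.merge(name, delta, Long::sum);
            return true;
        }
    }

    public static void main(String[] args) {
        // Incrementing before setContext(): nothing is recorded.
        System.out.println(Reporter.incrementCounter("rank", 1));

        // Setting the context first makes the increment visible in the counters.
        Context ctx = new Context();
        Reporter.setContext(ctx);
        System.out.println(Reporter.incrementCounter("rank", 1));
        System.out.println(ctx.counters);
    }
}
```

In this model, a later step that reads the counter finds no entry unless setContext ran first, and a missing entry is the kind of state that can surface as a null downstream.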

> MapReduce-based Rank failing with NPE due to missing Counters
> -
>
> Key: PIG-4220
> URL: https://issues.apache.org/jira/browse/PIG-4220
> Project: Pig
>  Issue Type: Bug
>Reporter: Koji Noguchi
> Attachments: pig-4220-v01.txt, pig-4220-v02.txt
>


[jira] [Commented] (PIG-4173) Move to Spark 1.x

2014-10-03 Thread Ángel Álvarez (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-4173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14157756#comment-14157756
 ] 

Ángel Álvarez commented on PIG-4173:


I'm getting this error whenever I try to load any file from HDFS:

2014-10-02 17:44:19,592 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 0: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 3, tldam4602.lda): java.lang.IllegalStateException: unread block data
    java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode(ObjectInputStream.java:2421)
    java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1382)
    java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
    java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
    java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
    java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
    org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:62)

It only fails when I run my two-line script (LOAD+DUMP) on the cluster. 

I think ... it might have something to do with my client libraries or their 
order ...

> Move to Spark 1.x
> -
>
> Key: PIG-4173
> URL: https://issues.apache.org/jira/browse/PIG-4173
> Project: Pig
>  Issue Type: Sub-task
>  Components: spark
>Reporter: bc Wong
>Assignee: Richard Ding
> Attachments: PIG-4173.patch, PIG-4173_2.patch, PIG-4173_3.patch, 
> TEST-org.apache.pig.spark.TestSpark.txt
>
>
> The Spark branch is using Spark 0.9: 
> https://github.com/apache/pig/blob/spark/ivy.xml#L438. We should probably 
> switch to Spark 1.x asap, due to Spark interface changes since 1.0.


