[jira] [Updated] (PIG-4438) Can not work when in limit after sort situation in spark mode

2015-03-16 Thread liyunzhang_intel (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liyunzhang_intel updated PIG-4438:
--
Attachment: PIG-4438_1.patch

PIG-4438_1.patch is the initial patch. I ran into some problems when running 
the script from the bug description and need more time to figure them out. 
The error is:
{code}
Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost): java.lang.ClassCastException: java.lang.Byte cannot be cast to java.util.Iterator
    at org.apache.pig.backend.hadoop.executionengine.spark.converter.PackageConverter$PackageFunction.apply(PackageConverter.java:85)
    at org.apache.pig.backend.hadoop.executionengine.spark.converter.PackageConverter$PackageFunction.apply(PackageConverter.java:48)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
    at scala.collection.convert.Wrappers$IteratorWrapper.next(Wrappers.scala:30)
    at org.apache.pig.backend.hadoop.executionengine.spark.converter.POOutputConsumerIterator.readNext(POOutputConsumerIterator.java:35)
    at org.apache.pig.backend.hadoop.executionengine.spark.converter.POOutputConsumerIterator.hasNext(POOutputConsumerIterator.java:64)
    at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
    at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
    at scala.collection.convert.Wrappers$IteratorWrapper.hasNext(Wrappers.scala:29)
    at org.apache.pig.backend.hadoop.executionengine.spark.converter.POOutputConsumerIterator.readNext(POOutputConsumerIterator.java:30)
    at org.apache.pig.backend.hadoop.executionengine.spark.converter.POOutputConsumerIterator.hasNext(POOutputConsumerIterator.java:64)
    at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
    at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
    at scala.collection.convert.Wrappers$IteratorWrapper.hasNext(Wrappers.scala:29)
    at org.apache.pig.backend.hadoop.executionengine.spark.converter.POOutputConsumerIterator.readNext(POOutputConsumerIterator.java:30)
    at org.apache.pig.backend.hadoop.executionengine.spark.converter.POOutputConsumerIterator.hasNext(POOutputConsumerIterator.java:64)
    at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$12.apply(PairRDDFunctions.scala:987)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$12.apply(PairRDDFunctions.scala:965)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
    at org.apache.spark.scheduler.Task.run(Task.scala:56)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
{code}

 Can not work when in limit after sort situation in spark mode
 ---

 Key: PIG-4438
 URL: https://issues.apache.org/jira/browse/PIG-4438
 Project: Pig
  Issue Type: Sub-task
  Components: spark
Reporter: liyunzhang_intel
Assignee: liyunzhang_intel
 Fix For: spark-branch

 Attachments: PIG-4438_1.patch


 When a Pig script runs ORDER followed by LIMIT in spark mode, the results 
 are wrong: the limit is not applied.
 cat testlimit.txt
 1 orange
 3 coconut
 5 grape
 6 pear
 2 apple
 4 mango
 testlimit.pig:
 a = load './testlimit.txt' as (x:int, y:chararray);
 b = order a by x;
 c = limit b 1;
 store c into './testlimit.out';
 the result:
 1 orange
 2 apple
 3 coconut
 4 mango
 5 grape
 6 pear
 the correct result should be:
 1 orange
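
 For reference, the expected ORDER-then-LIMIT semantics on the sample data can
 be sketched outside Pig. This is a minimal Python illustration, not the
 spark-mode implementation; the function name order_then_limit is hypothetical,
 and the rows are transcribed from testlimit.txt above.

```python
# Rows from testlimit.txt as (x, y) tuples, in file order.
rows = [
    (1, "orange"),
    (3, "coconut"),
    (5, "grape"),
    (6, "pear"),
    (2, "apple"),
    (4, "mango"),
]


def order_then_limit(records, n):
    """Mimic `b = order a by x; c = limit b n;` on in-memory tuples."""
    # ORDER: sort ascending by the first field (x).
    ordered = sorted(records, key=lambda r: r[0])
    # LIMIT: keep only the first n rows of the sorted relation.
    return ordered[:n]


result = order_then_limit(rows, 1)
print(result)
```

 With a limit of 1 this yields only the row with the smallest x, which matches
 the correct result listed above.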



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Need JIRA/UNIX Admin that knows confluence --- Warren, NJ----$55 per hr on C2C

2015-03-16 Thread HCL GLOBAL



Thanks, 

Mohan Gadde 
HCL Global Systems, Inc 
24543 Indoplex Circle, Suite# 220 
Farmington Hills, MI 48335 
mo...@hclglobal.com 
Phone # 248 473 0720 EXT 197 
Cell: 302 983 8001  






Re: Need JIRA/UNIX Admin that knows confluence --- Warren, NJ----$55 per hr on C2C

2015-03-16 Thread Andi Levin
Stop spamming the list





-- 
Thank You,

/andi

Andi Levin
Indeed.com | West Coast Engineering Talent
a...@indeed.com | (425) 954-7085
Indeed | How the World Works.™



[jira] Subscription: PIG patch available

2015-03-16 Thread jira
Issue Subscription
Filter: PIG patch available (23 issues)

Subscriber: pigdaily

Key         Summary
PIG-4458    Support UDFs in a FOREACH Before a Merge Join
            https://issues.apache.org/jira/browse/PIG-4458
PIG-4455    Should use DependencyOrderWalker instead of DepthFirstWalker in MRPrinter
            https://issues.apache.org/jira/browse/PIG-4455
PIG-4452    Embedded SQL using SQL instead of sql fails with string index out of range: -1 error
            https://issues.apache.org/jira/browse/PIG-4452
PIG-4422    Implement visitMergeJoin in SparkCompiler
            https://issues.apache.org/jira/browse/PIG-4422
PIG-4377    Skewed outer join produce wrong result in some cases
            https://issues.apache.org/jira/browse/PIG-4377
PIG-4341    Add CMX support to pig.tmpfilecompression.codec
            https://issues.apache.org/jira/browse/PIG-4341
PIG-4323    PackageConverter hanging in Spark
            https://issues.apache.org/jira/browse/PIG-4323
PIG-4313    StackOverflowError in LIMIT operation on Spark
            https://issues.apache.org/jira/browse/PIG-4313
PIG-4251    Pig on Storm
            https://issues.apache.org/jira/browse/PIG-4251
PIG-4193    Make collected group work with Spark
            https://issues.apache.org/jira/browse/PIG-4193
PIG-4111    Make Pig compiles with avro-1.7.7
            https://issues.apache.org/jira/browse/PIG-4111
PIG-4004    Upgrade the Pigmix queries from the (old) mapred API to mapreduce
            https://issues.apache.org/jira/browse/PIG-4004
PIG-4002    Disable combiner when map-side aggregation is used
            https://issues.apache.org/jira/browse/PIG-4002
PIG-3952    PigStorage accepts '-tagSplit' to return full split information
            https://issues.apache.org/jira/browse/PIG-3952
PIG-3911    Define unique fields with @OutputSchema
            https://issues.apache.org/jira/browse/PIG-3911
PIG-3877    Getting Geo Latitude/Longitude from Address Lines
            https://issues.apache.org/jira/browse/PIG-3877
PIG-3873    Geo distance calculation using Haversine
            https://issues.apache.org/jira/browse/PIG-3873
PIG-3866    Create ThreadLocal classloader per PigContext
            https://issues.apache.org/jira/browse/PIG-3866
PIG-3851    Upgrade jline to 2.11
            https://issues.apache.org/jira/browse/PIG-3851
PIG-3668    COR built-in function when atleast one of the coefficient values is NaN
            https://issues.apache.org/jira/browse/PIG-3668
PIG-3635    Fix e2e tests for Hadoop 2.X on Windows
            https://issues.apache.org/jira/browse/PIG-3635
PIG-3587    add functionality for rolling over dates
            https://issues.apache.org/jira/browse/PIG-3587
PIG-3294    Allow Pig use Hive UDFs
            https://issues.apache.org/jira/browse/PIG-3294

You may edit this subscription at:
https://issues.apache.org/jira/secure/FilterSubscription!default.jspa?subId=16328&filterId=12322384


[jira] [Commented] (PIG-4422) Implement visitMergeJoin in SparkCompiler

2015-03-16 Thread liyunzhang_intel (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-4422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14362873#comment-14362873
 ] 

liyunzhang_intel commented on PIG-4422:
---

[~praveenr019], merge join is a big feature. Could you submit a detailed 
design doc describing how to implement it in Spark, similar to 
https://wiki.apache.org/pig/PigMergeJoin (the design for the MR 
implementation)?

 Implement visitMergeJoin in SparkCompiler
 -

 Key: PIG-4422
 URL: https://issues.apache.org/jira/browse/PIG-4422
 Project: Pig
  Issue Type: Sub-task
  Components: spark
Reporter: liyunzhang_intel
Assignee: Praveen Rachabattuni
 Fix For: spark-branch


 In PIG-4374_6.patch, SparkCompiler#visitMergeJoin is marked TODO.


