[jira] Subscription: PIG patch available

2016-12-07 Thread jira
Issue Subscription Filter: PIG patch available (31 issues) Subscriber: pigdaily Key Summary PIG-5057 IndexOutOfBoundsException when pig reducer processOnePackageOutput https://issues.apache.org/jira/browse/PIG-5057 PIG-5043 Slowstart not applied in Tez with PARALLEL clau

[jira] Subscription: PIG patch available

2016-12-07 Thread jira
Issue Subscription Filter: PIG patch available (27 issues) Subscriber: pigdaily Key Summary PIG-4926 Modify the content of start.xml for spark mode https://issues-test.apache.org/jira/browse/PIG-4926 PIG-4922 Deadlock between SpillableMemoryManager and InternalSortedBag

[jira] [Commented] (PIG-5054) Initialize SchemaTupleBackend correctly in backend in spark mode if spark job has more than 1 stage

2016-12-07 Thread liyunzhang_intel (JIRA)
[ https://issues.apache.org/jira/browse/PIG-5054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15731296#comment-15731296 ] liyunzhang_intel commented on PIG-5054: --- [~szita]: you can directly configure spark c

[jira] [Comment Edited] (PIG-4952) Calculate the value of parallism for spark mode

2016-12-07 Thread liyunzhang_intel (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15731254#comment-15731254 ] liyunzhang_intel edited comment on PIG-4952 at 12/8/16 6:26 AM: --

[jira] [Updated] (PIG-4952) Calculate the value of parallism for spark mode

2016-12-07 Thread liyunzhang_intel (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liyunzhang_intel updated PIG-4952: -- Attachment: PIG-4952_2.patch > Calculate the value of parallism for spark mode > -

[jira] [Updated] (PIG-4952) Calculate the value of parallism for spark mode

2016-12-07 Thread liyunzhang_intel (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liyunzhang_intel updated PIG-4952: -- Attachment: (was: PIG-4952_2.patch) > Calculate the value of parallism for spark mode > --

[jira] [Commented] (PIG-4952) Calculate the value of parallism for spark mode

2016-12-07 Thread liyunzhang_intel (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15731260#comment-15731260 ] liyunzhang_intel commented on PIG-4952: --- [~nkollar]: {quote} How can we test that we a

[jira] [Updated] (PIG-4952) Calculate the value of parallism for spark mode

2016-12-07 Thread liyunzhang_intel (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liyunzhang_intel updated PIG-4952: -- Attachment: PIG-4952_2.patch > Calculate the value of parallism for spark mode > -

[jira] [Commented] (PIG-4952) Calculate the value of parallism for spark mode

2016-12-07 Thread liyunzhang_intel (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15731254#comment-15731254 ] liyunzhang_intel commented on PIG-4952: --- [~kexianda]: "spark.default.parallism" is

[jira] [Commented] (PIG-5071) MapReduce concurrency Could Be Better

2016-12-07 Thread Daniel Dai (JIRA)
[ https://issues.apache.org/jira/browse/PIG-5071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15731184#comment-15731184 ] Daniel Dai commented on PIG-5071: - Yes, that's the best solution if it works for you. > MapRed

[jira] [Commented] (PIG-5071) MapReduce concurrency Could Be Better

2016-12-07 Thread William Watson (JIRA)
[ https://issues.apache.org/jira/browse/PIG-5071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15731128#comment-15731128 ] William Watson commented on PIG-5071: - Okay so I should just switch our engine to Tez an

[jira] [Commented] (PIG-5071) MapReduce concurrency Could Be Better

2016-12-07 Thread Daniel Dai (JIRA)
[ https://issues.apache.org/jira/browse/PIG-5071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15730806#comment-15730806 ] Daniel Dai commented on PIG-5071: - Pig uses JobControl to submit multiple jobs at a time, e.g.,

[jira] [Commented] (PIG-5072) e2e Union_12 fails on typecast when oldpig=0.11

2016-12-07 Thread Daniel Dai (JIRA)
[ https://issues.apache.org/jira/browse/PIG-5072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15730778#comment-15730778 ] Daniel Dai commented on PIG-5072: - +1 > e2e Union_12 fails on typecast when oldpig=0.11 > -

[jira] [Commented] (PIG-5073) Skip e2e Limit_5 test for Tez

2016-12-07 Thread Daniel Dai (JIRA)
[ https://issues.apache.org/jira/browse/PIG-5073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15730775#comment-15730775 ] Daniel Dai commented on PIG-5073: - That's fine. +1. Did you see Limit_4 fail as well? > Ski

[jira] [Commented] (PIG-4952) Calculate the value of parallism for spark mode

2016-12-07 Thread Xianda Ke (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15730669#comment-15730669 ] Xianda Ke commented on PIG-4952: Hi [~kellyzly] & [~nkollar], how about this: {code} // if

[jira] [Updated] (PIG-5073) Skip e2e Limit_5 test for Tez

2016-12-07 Thread Koji Noguchi (JIRA)
[ https://issues.apache.org/jira/browse/PIG-5073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Noguchi updated PIG-5073: -- Attachment: pig-5073-v01.patch Attaching a patch that would simply skip LIMIT_5 e2e for Tez. > Skip e2e L

[jira] [Assigned] (PIG-5073) Skip e2e Limit_5 test for Tez

2016-12-07 Thread Koji Noguchi (JIRA)
[ https://issues.apache.org/jira/browse/PIG-5073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Noguchi reassigned PIG-5073: - Assignee: Koji Noguchi > Skip e2e Limit_5 test for Tez > - > >

[jira] [Created] (PIG-5073) Skip e2e Limit_5 test for Tez

2016-12-07 Thread Koji Noguchi (JIRA)
Koji Noguchi created PIG-5073: - Summary: Skip e2e Limit_5 test for Tez Key: PIG-5073 URL: https://issues.apache.org/jira/browse/PIG-5073 Project: Pig Issue Type: Test Reporter: Koji N

[jira] [Updated] (PIG-5072) e2e Union_12 fails on typecast when oldpig=0.11

2016-12-07 Thread Koji Noguchi (JIRA)
[ https://issues.apache.org/jira/browse/PIG-5072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Noguchi updated PIG-5072: -- Attachment: pig-5072-v01.patch There are other unit tests that test typecastor after union. Here, giving

[jira] [Assigned] (PIG-5072) e2e Union_12 fails on typecast when oldpig=0.11

2016-12-07 Thread Koji Noguchi (JIRA)
[ https://issues.apache.org/jira/browse/PIG-5072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Noguchi reassigned PIG-5072: - Assignee: Koji Noguchi > e2e Union_12 fails on typecast when oldpig=0.11 >

[jira] [Created] (PIG-5072) e2e Union_12 fails on typecast when oldpig=0.11

2016-12-07 Thread Koji Noguchi (JIRA)
Koji Noguchi created PIG-5072: - Summary: e2e Union_12 fails on typecast when oldpig=0.11 Key: PIG-5072 URL: https://issues.apache.org/jira/browse/PIG-5072 Project: Pig Issue Type: Test

Re: How to test the efficiency of multiple join

2016-12-07 Thread mingda li
So I need to test something like count (*)? To use count, I need the following query: Bad_OrderIn = JOIN inventory BY inv_item_sk, catalog_sales BY cs_item_sk; Bad_OrderRes = JOIN Bad_OrderIn BY (cs_item_sk, cs_order_number), catalog_returns BY (cr_item_sk, cr_order_number); b = foreach B

Re: How to test the efficiency of multiple join

2016-12-07 Thread Rohini Palaniswamy
Limit 4 would make processing of join stop after 4 records. It is not a good idea to add it if you are testing performance of join. On Tue, Dec 6, 2016 at 8:13 PM mingda li wrote: > Thanks for your quick reply. If so, I can use the limit operator to compare > > good and bad join plan. It takes t

[jira] [Created] (PIG-5071) MapReduce concurrency Could Be Better

2016-12-07 Thread William Watson (JIRA)
William Watson created PIG-5071: --- Summary: MapReduce concurrency Could Be Better Key: PIG-5071 URL: https://issues.apache.org/jira/browse/PIG-5071 Project: Pig Issue Type: Wish Repo

[jira] [Commented] (PIG-5054) Initialize SchemaTupleBackend correctly in backend in spark mode if spark job has more than 1 stage

2016-12-07 Thread Adam Szita (JIRA)
[ https://issues.apache.org/jira/browse/PIG-5054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15728996#comment-15728996 ] Adam Szita commented on PIG-5054: - [~kellyzly] I adjusted the number of executors with {cod

[jira] [Updated] (PIG-5054) Initialize SchemaTupleBackend correctly in backend in spark mode if spark job has more than 1 stage

2016-12-07 Thread Adam Szita (JIRA)
[ https://issues.apache.org/jira/browse/PIG-5054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Szita updated PIG-5054: Attachment: piglog2.txt > Initialize SchemaTupleBackend correctly in backend in spark mode if spark > job ha

[jira] [Commented] (PIG-4952) Calculate the value of parallism for spark mode

2016-12-07 Thread Nandor Kollar (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15728334#comment-15728334 ] Nandor Kollar commented on PIG-4952: [~kellyzly] you search for the max default parallel

[jira] [Commented] (PIG-4952) Calculate the value of parallism for spark mode

2016-12-07 Thread Nandor Kollar (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15728194#comment-15728194 ] Nandor Kollar commented on PIG-4952: This is simpler than I thought! Looks good, however

[jira] [Commented] (PIG-4952) Calculate the value of parallism for spark mode

2016-12-07 Thread liyunzhang_intel (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15728146#comment-15728146 ] liyunzhang_intel commented on PIG-4952: --- [~xuefuz]: Help to check in PIG-4952.patch a

[jira] [Commented] (PIG-4952) Calculate the value of parallism for spark mode

2016-12-07 Thread Xianda Ke (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15728138#comment-15728138 ] Xianda Ke commented on PIG-4952: max parallelism value from parent RDDs seems OK. LGTM +1

FW: File could only be replicated to 0 nodes, instead of 1

2016-12-07 Thread Zhang, Liyun
Hi: You can google “File could only be replicated to 0 nodes, instead of 1”; there are several possible reasons for it. In most cases, it is because of a lack of disk space or because all datanodes have died. Best Regards Kelly Zhang/Zhang,Liyun From: mingda li [mailto:limingda1...@gmail.com] Sent: Wednesday, Dece