date:20150324

[jira] [Updated] (PIG-4481) e2e tests ComputeSpec_1, ComputeSpec_2, StreamingPerformance_3 and StreamingPerformance_4 produce different result on Windows

2015-03-24 Thread Daniel Dai (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-4481:

Attachment: PIG-4481-3.patch

> e2e tests ComputeSpec_1, ComputeSpec_2, StreamingPerformance_3 and  
> StreamingPerformance_4 produce different result on Windows
> --
>
> Key: PIG-4481
> URL: https://issues.apache.org/jira/browse/PIG-4481
> Project: Pig
> Issue Type: Bug
> Components: e2e harness
> Reporter: Daniel Dai
> Assignee: Daniel Dai
> Labels: windows
> Fix For: 0.15.0
>
> Attachments: PIG-4481-1.patch, PIG-4481-2.patch, PIG-4481-3.patch
>
>
> ComputeSpec_1, ComputeSpec_2, StreamingPerformance_3 and 
> StreamingPerformance_4 produce the wrong result on Windows. Since Pig compares 
> the test result with that of an old version of Pig, which also produces the 
> wrong result, the tests still pass.
> The cause of the issue is parameter passing under Windows: some parameters 
> of the executable are not passed correctly on Windows. StreamingPerformance_3 
> and StreamingPerformance_4 require a simple quoting change and a command-line 
> change. However, I didn't find a proper way to fix ComputeSpec_1 and 
> ComputeSpec_2, so I am changing those tests slightly to work around the 
> problem (without changing the intention of the tests).
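
To illustrate the kind of parameter-passing problem described above, here is a minimal Java sketch of the quoting rules that programs built on the Microsoft C runtime apply when they split a command line into arguments. The names WindowsArgQuoter and quote are hypothetical, and this is not the change made in the PIG-4481 patches; it only shows why an argument containing spaces or double quotes may need explicit quoting on Windows. It does not handle cmd.exe metacharacters such as ^, & or |.

```java
/** Hypothetical helper for illustration; not part of Pig or its e2e harness. */
final class WindowsArgQuoter {
    private WindowsArgQuoter() {}

    /** Quotes one argument so a C-runtime-style parser reads it back unchanged. */
    static String quote(String arg) {
        if (!arg.isEmpty() && !needsQuoting(arg)) {
            return arg; // nothing special, pass through as-is
        }
        StringBuilder out = new StringBuilder("\"");
        int backslashes = 0; // length of the current run of backslashes
        for (int i = 0; i < arg.length(); i++) {
            char c = arg.charAt(i);
            if (c == '\\') {
                backslashes++;
                continue;
            }
            if (c == '"') {
                // Double the pending backslashes, then escape the quote itself.
                appendBackslashes(out, backslashes * 2 + 1);
            } else {
                // Backslashes not followed by a quote are taken literally.
                appendBackslashes(out, backslashes);
            }
            out.append(c);
            backslashes = 0;
        }
        // Double trailing backslashes so they do not escape the closing quote.
        appendBackslashes(out, backslashes * 2);
        return out.append('"').toString();
    }

    private static boolean needsQuoting(String arg) {
        for (int i = 0; i < arg.length(); i++) {
            char c = arg.charAt(i);
            if (c == ' ' || c == '\t' || c == '\n' || c == '"') {
                return true;
            }
        }
        return false;
    }

    private static void appendBackslashes(StringBuilder out, int count) {
        for (int i = 0; i < count; i++) {
            out.append('\\');
        }
    }
}
```

For example, WindowsArgQuoter.quote("a b") yields the three characters a b wrapped in double quotes, while an argument without whitespace or quotes is returned unchanged.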



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (PIG-4481) e2e tests ComputeSpec_1, ComputeSpec_2, StreamingPerformance_3 and StreamingPerformance_4 produce different result on Windows

2015-03-24 Thread Daniel Dai (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-4481:

Attachment: PIG-4481-2.patch





[jira] [Updated] (PIG-4481) e2e tests ComputeSpec_1, ComputeSpec_2, StreamingPerformance_3 and StreamingPerformance_4 produce different result on Windows

2015-03-24 Thread Daniel Dai (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-4481:

Description: 
ComputeSpec_1, ComputeSpec_2, StreamingPerformance_3 and StreamingPerformance_4 
produce the wrong result on Windows. Since Pig compares the test result with 
that of an old version of Pig, which also produces the wrong result, the tests 
still pass.

The cause of the issue is parameter passing under Windows: some parameters 
of the executable are not passed correctly on Windows. StreamingPerformance_3 
and StreamingPerformance_4 require a simple quoting change and a command-line 
change. However, I didn't find a proper way to fix ComputeSpec_1 and 
ComputeSpec_2, so I am changing those tests slightly to work around the 
problem (without changing the intention of the tests).

  was:
ComputeSpec_1, ComputeSpec_2 and StreamingPerformance_3 produce the wrong 
result on Windows. Since Pig compares the test result with that of an old 
version of Pig, which also produces the wrong result, the tests still pass.

The cause of the issue is parameter passing under Windows: some parameters 
of the executable are not passed correctly on Windows. StreamingPerformance_3 
requires a simple quoting change. However, I didn't find a proper way to fix 
ComputeSpec_1 and ComputeSpec_2, so I am changing those tests slightly to work 
around the problem (without changing the intention of the tests).

Summary: e2e tests ComputeSpec_1, ComputeSpec_2, StreamingPerformance_3 
and  StreamingPerformance_4 produce different result on Windows  (was: e2e 
tests ComputeSpec_1, ComputeSpec_2 and StreamingPerformance_3 produce different 
result on Windows)





[jira] [Created] (PIG-4482) Pig pushes matches operator to HCatLoader causing script to fail

2015-03-24 Thread Rohini Palaniswamy (JIRA)

Rohini Palaniswamy created PIG-4482:
---

Summary: Pig pushes matches operator to HCatLoader causing script to fail
Key: PIG-4482
URL: https://issues.apache.org/jira/browse/PIG-4482
Project: Pig
Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Rohini Palaniswamy


HCatLoader fails with the error below, as it cannot understand the matches 
operator. Even if we don't push the filter down, specifying a regular 
expression on a partition key is bad for performance, as it will scan the 
whole table. We need to see whether HCat can support basic wildcard regular 
expressions and translate them to a LIKE clause in the database query. 

{code}
java.io.IOException: MetaException(message:Error parsing partition filter; lexer error: null; exception NoViableAltException(11@[]))
at org.apache.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:95)
at org.apache.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:59)
at org.apache.hcatalog.pig.HCatLoader.setLocation(HCatLoader.java:121)
{code}
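
The wildcard-to-LIKE translation suggested in this report could look roughly like the Java sketch below. It is only an illustration under stated assumptions: the class WildcardToLike and its method are hypothetical (not part of Pig or HCatalog), only the regex constructs . and .* are translated, and the database is assumed to accept a backslash as the LIKE escape character (for example through an ESCAPE clause). Any other regex metacharacter makes the method return an empty result, meaning the filter should not be pushed down.

```java
import java.util.Optional;

/** Hypothetical helper for illustration; not part of Pig or HCatalog. */
final class WildcardToLike {
    // Regex metacharacters this sketch does not try to translate.
    private static final String UNSUPPORTED = "\\[](){}*+?^$|";

    private WildcardToLike() {}

    /**
     * Translates a basic wildcard regex into a SQL LIKE pattern.
     * Returns Optional.empty() when the regex uses unsupported constructs.
     */
    static Optional<String> toLikePattern(String regex) {
        StringBuilder like = new StringBuilder();
        for (int i = 0; i < regex.length(); i++) {
            char c = regex.charAt(i);
            if (c == '.') {
                boolean star = i + 1 < regex.length() && regex.charAt(i + 1) == '*';
                if (star) {
                    like.append('%'); // ".*" matches any sequence of characters
                    i++;              // consume the '*'
                } else {
                    like.append('_'); // "." matches exactly one character
                }
            } else if (UNSUPPORTED.indexOf(c) >= 0) {
                return Optional.empty();
            } else if (c == '%' || c == '_') {
                // Literal in a regex but a wildcard in LIKE, so escape it.
                like.append('\\').append(c);
            } else {
                like.append(c);
            }
        }
        return Optional.of(like.toString());
    }
}
```

Both Pig's matches operator and LIKE match against the whole value, so no anchoring is added. With this sketch, a filter such as dt matches '2015-03-.*' would translate to the LIKE pattern 2015-03-%, while a pattern like 'a[0-9]+' would be left for Pig to evaluate.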




[jira] [Updated] (PIG-4434) Improve auto-parallelism for tez

2015-03-24 Thread Daniel Dai (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-4434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-4434:

Attachment: PIG-4434-2.patch

Several updates to the original patch. This patch also depends on some Tez 
fixes. I will link the Tez JIRA once it has been created.

> Improve auto-parallelism for tez
> 
>
> Key: PIG-4434
> URL: https://issues.apache.org/jira/browse/PIG-4434
> Project: Pig
> Issue Type: Improvement
> Components: tez
> Reporter: Daniel Dai
> Assignee: Daniel Dai
> Fix For: 0.15.0
>
> Attachments: PIG-4434-1.patch, PIG-4434-2.patch
>
>
> Tez auto-parallelism currently has some limitations:
> 1. ShuffledVertexManager only decreases parallelism; it never increases it.
> 2. Pig currently exaggerates parallelism at the frontend, so ShuffledVertexManager 
> might get an initial parallelism far larger than the actual need, which would be costly.
> Instead, we can gradually adjust the initial vertex parallelism at 
> runtime once upstream vertices finish.
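
As a rough illustration of the runtime adjustment described above, the Java sketch below derives a vertex's parallelism from the actual output size of its finished upstream vertices, so the value can go up as well as down. The class name, the method and all three parameters are assumptions made for this example; it is not the estimator in the attached patches, nor any Tez API.

```java
/** Hypothetical sketch; not the estimator implemented in the PIG-4434 patches. */
final class RuntimeParallelismSketch {
    private RuntimeParallelismSketch() {}

    /**
     * Picks a task count from the bytes actually produced upstream.
     *
     * @param upstreamOutputBytes total output of the finished upstream vertices
     * @param bytesPerTask        desired amount of input per task
     * @param maxParallelism      upper bound on the number of tasks
     */
    static int estimateParallelism(long upstreamOutputBytes, long bytesPerTask, int maxParallelism) {
        if (bytesPerTask <= 0 || maxParallelism <= 0) {
            throw new IllegalArgumentException("bytesPerTask and maxParallelism must be positive");
        }
        if (upstreamOutputBytes <= 0) {
            return 1; // always run at least one task
        }
        // Ceiling division written to avoid overflow for very large inputs.
        long tasks = upstreamOutputBytes / bytesPerTask
                + (upstreamOutputBytes % bytesPerTask == 0 ? 0 : 1);
        return (int) Math.min(tasks, (long) maxParallelism);
    }
}
```

Because the estimate is computed only after upstream vertices finish, it does not need the exaggerated frontend guess mentioned in point 2.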




[jira] [Commented] (PIG-4417) Pig's register command should support automatic fetching of jars from repo.

2015-03-24 Thread Ratandeep Ratti (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14378313#comment-14378313
 ] 

Ratandeep Ratti commented on PIG-4417:
--

Some of the reasons I can think of for this feature:
* Dependencies (UDF jars) become much more declarative with this change. 
Instead of copying the jar to the gateway and registering it in the Pig script, 
all the user has to do is add the Ivy coordinates to the register command. This 
saves an annoying step. If the UDF has other dependencies, the annoyance is 
compounded.

* Platforms like Oozie can greatly benefit from this. Instead of shipping a 
large zip of Pig UDFs along with the Pig script, users could upload a minimal 
zip, and the Pig script could take care of downloading those dependencies from 
the internal/external repository. The most commonly used UDFs/jars will 
automatically be cached by Ivy, which will resolve them from the local cache, 
instead of, say, every user bundling the UDF jar in his/her zip.

* By having Ivy coordinates for UDF jars, we know exactly which version of a 
UDF jar is being used in a Pig script. 

> Pig's register command should support automatic fetching of jars from repo.
> ---
>
> Key: PIG-4417
> URL: https://issues.apache.org/jira/browse/PIG-4417
> Project: Pig
> Issue Type: Improvement
> Reporter: Akshay Rai
> Assignee: Akshay Rai
>
> Currently Pig's register command takes a local path to a dependency jar. 
> This clutters the local file system, as users may forget to remove the jar 
> later.
> It would be nice if Pig supported a Gradle-like notation to download the jar 
> from a repository.
> Ex: At the top of the Pig script a user could add
> register '::'; 
> It should be backward compatible and should still support a local file path 
> if so desired.
> RB: https://reviews.apache.org/r/31662/




[jira] [Commented] (PIG-4458) Support UDFs in a FOREACH Before a Merge Join

2015-03-24 Thread William Watson (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-4458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14378251#comment-14378251
 ] 

William Watson commented on PIG-4458:
-

No problem, thanks for merging it down.

> Support UDFs in a FOREACH Before a Merge Join
> -
>
> Key: PIG-4458
> URL: https://issues.apache.org/jira/browse/PIG-4458
> Project: Pig
> Issue Type: New Feature
> Reporter: William Watson
> Assignee: William Watson
> Fix For: 0.15.0
>
> Attachments: PIG-4458.04.remove-merge-join-udf-restriction.patch, 
> PIG-4458.05.remove-merge-join-udf-restriction.patch
>
>
> Right now, the MapSideMergeValidator outright rejects any foreach that has a 
> UDF in it:
> {code}
> private boolean isAcceptableForEachOp(Operator lo) throws LogicalToPhysicalTranslatorException {
>     if (lo instanceof LOForEach) {
>         OperatorPlan innerPlan = ((LOForEach) lo).getInnerPlan();
>         validateMapSideMerge(innerPlan.getSinks(), innerPlan);
>         return !containsUDFs((LOForEach) lo);
>     } else {
>         return false;
>     }
> }
> {code}
> There is a TODO for this later on in that same class (inside containsUDFs):
> {code}
> // TODO (dvryaboy): in the future we could relax this rule by tracing what fields
> // are being passed into the UDF, and only refusing if the UDF is working on the
> // join key. Transforms of other fields should be ok.
> {code}
> We should do the TODO and relax this requirement or just remove it altogether
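
The relaxed rule from the TODO above could be sketched as follows. This is a self-contained Java illustration, not Pig code: UdfCall, its fields and isAcceptableForEach are hypothetical stand-ins for Pig's logical plan classes, and fields are identified by plain names. Note that the attached patches are named remove-merge-join-udf-restriction, so the committed change may simply drop the check rather than trace fields like this.

```java
import java.util.Collections;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of the relaxed rule; not Pig's MapSideMergeValidator. */
final class MergeJoinUdfCheckSketch {
    private MergeJoinUdfCheckSketch() {}

    /** A UDF invocation inside a FOREACH, reduced to the fields it reads. */
    static final class UdfCall {
        final String name;
        final Set<String> inputFields;

        UdfCall(String name, Set<String> inputFields) {
            this.name = name;
            this.inputFields = inputFields;
        }
    }

    /**
     * Accepts the FOREACH unless some UDF reads a join key field.
     * UDFs that only transform other fields are allowed.
     */
    static boolean isAcceptableForEach(Set<String> joinKeyFields, List<UdfCall> udfCalls) {
        for (UdfCall call : udfCalls) {
            if (!Collections.disjoint(call.inputFields, joinKeyFields)) {
                return false; // the UDF works on the join key, so refuse
            }
        }
        return true;
    }
}
```

The idea is that a merge join relies on the sort order of the join key, so only UDFs that touch the key could invalidate it.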




[jira] [Commented] (PIG-4417) Pig's register command should support automatic fetching of jars from repo.

2015-03-24 Thread Daniel Dai (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14378244#comment-14378244
 ] 

Daniel Dai commented on PIG-4417:
-

[~akshayrai09], I am trying to understand why you need this. It sounds like you 
can simply download the jar and register it in Pig. Do other languages use a 
similar syntax?





[jira] [Commented] (PIG-4417) Pig's register command should support automatic fetching of jars from repo.

2015-03-24 Thread Ratandeep Ratti (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14378240#comment-14378240
 ] 

Ratandeep Ratti commented on PIG-4417:
--

Thanks [~alangates] for the quick feedback. [~akshayrai09] please update the 
ticket with the latest reviewed patch.





[jira] [Commented] (PIG-4417) Pig's register command should support automatic fetching of jars from repo.

2015-03-24 Thread Alan Gates (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14378198#comment-14378198
 ] 

Alan Gates commented on PIG-4417:
-

A couple of comments:
# Review board is great for reviewing the patch, but to be official it has to 
be attached here too.
# Why is the DownloadResolver all static?  Why not make it an object with a 
single method?  This is just a style gripe and not a blocker for checking in 
the code.





[jira] [Updated] (PIG-4457) Error is thrown by JobStats.getOutputSize() when storing to a MySql table

2015-03-24 Thread Rohini Palaniswamy (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-4457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohini Palaniswamy updated PIG-4457:

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks for the review Daniel.

> Error is thrown by JobStats.getOutputSize() when storing to a MySql table
> -
>
> Key: PIG-4457
> URL: https://issues.apache.org/jira/browse/PIG-4457
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.14.0
> Reporter: Kunal Kumar
> Assignee: Rohini Palaniswamy
> Fix For: 0.15.0
>
> Attachments: PIG-4457-1.patch
>
>
> Here is an example of the stack trace printed to the console output. 
> Actually, this is a warning message and does not make the job fail. The data 
> is getting stored to the MySQL table, but I have no idea why Pig is looking 
> to store output on HDFS. I am using Pig along with Tez.
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.FileBasedOutputSizeReader
> unable to find the output file
> java.io.FileNotFoundException: File 
> hdfs://pts0021.persistent.co.in:9000/user/shareinsights/filtered_stock_data 
> does not exist.
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:647)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:101)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:705)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:701)
> at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:701)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.FileBasedOutputSizeReader.getOutputSize(FileBasedOutputSizeReader.java:81)
> at 
> org.apache.pig.tools.pigstats.JobStats.getOutputSize(JobStats.java:351)
> at 
> org.apache.pig.tools.pigstats.tez.TezVertexStats.addOutputStatistics(TezVertexStats.java:270)
> at 
> org.apache.pig.tools.pigstats.tez.TezVertexStats.accumulateStats(TezVertexStats.java:188)
> at 
> org.apache.pig.tools.pigstats.tez.TezDAGStats.accumulateStats(TezDAGStats.java:209)
> at 
> org.apache.pig.tools.pigstats.tez.TezPigScriptStats.accumulateStats(TezPigScriptStats.java:180)
> at 
> org.apache.pig.backend.hadoop.executionengine.tez.TezJob.run(TezJob.java:194)
> at 
> org.apache.pig.backend.hadoop.executionengine.tez.TezLauncher$1.run(TezLauncher.java:167)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
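
The trace above comes from a file-based size reader being asked about a location that does not exist as a file. One defensive pattern, sketched below in Java, is to report the size as unknown instead of raising an exception when the path is missing. This is an assumption-laden illustration, not the fix in PIG-4457-1.patch: OutputSizeSketch, UNKNOWN and outputSizeOrUnknown are hypothetical names, and the sketch uses java.nio.file on a local path rather than the Hadoop FileSystem API so that it stays self-contained.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

/** Hypothetical sketch; not Pig's FileBasedOutputSizeReader. */
final class OutputSizeSketch {
    /** Returned when the output is not a readable file or directory. */
    static final long UNKNOWN = -1L;

    private OutputSizeSketch() {}

    /** Sums the sizes of all regular files under the output path, or returns UNKNOWN. */
    static long outputSizeOrUnknown(Path output) {
        if (!Files.exists(output)) {
            // For example, the store target is a database table, not a file.
            return UNKNOWN;
        }
        try (Stream<Path> paths = Files.walk(output)) {
            return paths.filter(Files::isRegularFile)
                        .mapToLong(p -> p.toFile().length())
                        .sum();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A caller would then treat UNKNOWN as "no size statistics available" and skip the warning with the full stack trace.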




[jira] [Updated] (PIG-4475) Keys in AvroMapWrapper are not proper Pig types

2015-03-24 Thread Daniel Dai (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-4475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-4475:

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

+1. Patch committed to trunk. Thanks Ratandeep, Anthony!

> Keys in AvroMapWrapper are not proper Pig types
> ---
>
> Key: PIG-4475
> URL: https://issues.apache.org/jira/browse/PIG-4475
> Project: Pig
> Issue Type: Bug
> Reporter: Ratandeep Ratti
> Assignee: Ratandeep Ratti
> Fix For: 0.15.0
>
> Attachments: PIG-4475.patch, PIG-4475_1.patch
>
>
> AvroMapWrapper could contain utf8 keys, which are not supported by Pig. Pig 
> expects keys to be of type String.
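
The key mismatch described above can be illustrated with a small Java sketch that copies a map while converting every key to String. It relies on the fact that Avro's Utf8 class implements CharSequence; the class MapKeySketch and its method are hypothetical, and this is not the code in the attached patches (which change AvroMapWrapper itself). Keys are assumed to be non-null, as in Avro maps.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch; not the AvroMapWrapper change from the PIG-4475 patches. */
final class MapKeySketch {
    private MapKeySketch() {}

    /** Returns a copy of the map whose keys are plain Strings, preserving iteration order. */
    static <V> Map<String, V> withStringKeys(Map<? extends CharSequence, V> source) {
        Map<String, V> result = new LinkedHashMap<>();
        for (Map.Entry<? extends CharSequence, V> entry : source.entrySet()) {
            // toString() turns a Utf8 (or any CharSequence) key into a String.
            result.put(entry.getKey().toString(), entry.getValue());
        }
        return result;
    }
}
```

After the conversion, a lookup with a String key such as map.get("name") finds the entry, which it would not do if the key were still stored as a non-String CharSequence.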




Feedback on PIG-4417

2015-03-24 Thread RD

Hi Folks,

   We'd appreciate it if we can get more feedback on this.

Ticket: https://issues.apache.org/jira/browse/PIG-4417
RB: https://reviews.apache.org/r/31662/

Best,
R

[jira] [Updated] (PIG-4458) Support UDFs in a FOREACH Before a Merge Join

2015-03-24 Thread Daniel Dai (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-4458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-4458:

   Resolution: Fixed
Fix Version/s: 0.15.0
 Assignee: William Watson
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Sorry, I missed this. The new patch looks good. Patch committed to trunk. 
Thanks William!





[jira] [Commented] (PIG-4457) Error is thrown by JobStats.getOutputSize() when storing to a MySql table

2015-03-24 Thread Daniel Dai (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-4457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14378112#comment-14378112
 ] 

Daniel Dai commented on PIG-4457:
-

+1





[jira] [Updated] (PIG-4475) Keys in AvroMapWrapper are not proper Pig types

2015-03-24 Thread Ratandeep Ratti (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-4475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ratandeep Ratti updated PIG-4475:
-
Attachment: PIG-4475_1.patch

Addressing Anthony's comments





[jira] [Commented] (PIG-4458) Support UDFs in a FOREACH Before a Merge Join

2015-03-24 Thread William Watson (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-4458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377780#comment-14377780
 ] 

William Watson commented on PIG-4458:
-

Anything else I should do to get this merged down? Thanks!

> Support UDFs in a FOREACH Before a Merge Join
> -
>
> Key: PIG-4458
> URL: https://issues.apache.org/jira/browse/PIG-4458
> Project: Pig
> Issue Type: New Feature
> Reporter: William Watson
> Attachments: PIG-4458.04.remove-merge-join-udf-restriction.patch, 
> PIG-4458.05.remove-merge-join-udf-restriction.patch
>
>
> Right now, the MapSideMergeValidator outright rejects any foreach that has a 
> UDF in it:
> {code}
> private boolean isAcceptableForEachOp(Operator lo) throws LogicalToPhysicalTranslatorException {
>     if (lo instanceof LOForEach) {
>         OperatorPlan innerPlan = ((LOForEach) lo).getInnerPlan();
>         validateMapSideMerge(innerPlan.getSinks(), innerPlan);
>         return !containsUDFs((LOForEach) lo);
>     } else {
>         return false;
>     }
> }
> {code}
> There is a TODO for this later on in that same class (inside containsUDFs):
> {code}
> // TODO (dvryaboy): in the future we could relax this rule by tracing what fields
> // are being passed into the UDF, and only refusing if the UDF is working on the
> // join key. Transforms of other fields should be ok.
> {code}
> We should do the TODO and relax this requirement or just remove it altogether




[jira] [Commented] (PIG-4343) Tez auto parallelism fails at query compile time

2015-03-24 Thread Rohini Palaniswamy (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-4343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377594#comment-14377594
 ] 

Rohini Palaniswamy commented on PIG-4343:
-

This could be a dupe of PIG-4474

> Tez auto parallelism fails at query compile time
> 
>
> Key: PIG-4343
> URL: https://issues.apache.org/jira/browse/PIG-4343
> Project: Pig
> Issue Type: Bug
> Components: tez
> Affects Versions: 0.14.0
> Reporter: Cheolsoo Park
>
> I was running some legacy MR jobs in Tez mode to do perf benchmarks. But when 
> {{pig.tez.auto.parallelism}} is enabled (the default), Pig fails with the 
> following error:
> {code}
> org.apache.pig.impl.plan.VisitorException: ERROR 0: java.io.IOException: Cannot estimate parallelism for scope-892, effective parallelism for predecessor scope-892 is -1
> at org.apache.pig.backend.hadoop.executionengine.tez.plan.optimizer.ParallelismSetter.visitTezOp(ParallelismSetter.java:189)
> at org.apache.pig.backend.hadoop.executionengine.tez.plan.TezOperator.visit(TezOperator.java:232)
> at org.apache.pig.backend.hadoop.executionengine.tez.plan.TezOperator.visit(TezOperator.java:49)
> at org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:70)
> at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:46)
> at org.apache.pig.backend.hadoop.executionengine.tez.TezLauncher.processLoadAndParallelism(TezLauncher.java:429)
> at org.apache.pig.backend.hadoop.executionengine.tez.TezLauncher.launchPig(TezLauncher.java:143)
> at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:301)
> at org.apache.pig.PigServer.launchPlan(PigServer.java:1390)
> at org.apache.pig.LipstickPigServer.launchPlan(LipstickPigServer.java:151)
> at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1375)
> at org.apache.pig.PigServer.execute(PigServer.java:1364)
> at org.apache.pig.PigServer.executeBatch(PigServer.java:415)
> at org.apache.pig.PigServer.executeBatch(PigServer.java:398)
> at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:171)
> at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:234)
> at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:205)
> at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
> at com.netflix.lipstick.Main.run(Main.java:496)
> at com.netflix.lipstick.Main.main(Main.java:171)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> Caused by: java.io.IOException: Cannot estimate parallelism for scope-892, effective parallelism for predecessor scope-892 is -1
> at org.apache.pig.backend.hadoop.executionengine.tez.plan.optimizer.TezOperDependencyParallelismEstimator.estimateParallelism(TezOperDependencyParallelismEstimator.java:116)
> at org.apache.pig.backend.hadoop.executionengine.tez.plan.optimizer.ParallelismSetter.visitTezOp(ParallelismSetter.java:134)
> ... 24 more
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (PIG-4457) Error is thrown by JobStats.getOutputSize() when storing to a MySql table

2015-03-24 Thread Rohini Palaniswamy (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-4457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohini Palaniswamy updated PIG-4457:

Attachment: PIG-4457-1.patch





[jira] [Updated] (PIG-4457) Error is thrown by JobStats.getOutputSize() when storing to a MySql table

2015-03-24 Thread Rohini Palaniswamy (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-4457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohini Palaniswamy updated PIG-4457:

Fix Version/s: 0.15.0
Affects Version/s: 0.14.0
   Status: Patch Available  (was: Reopened)

> Error is thrown by JobStats.getOutputSize() when storing to a MySql table
> -
>
> Key: PIG-4457
> URL: https://issues.apache.org/jira/browse/PIG-4457
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.14.0
>Reporter: Kunal Kumar
>Assignee: Rohini Palaniswamy
> Fix For: 0.15.0
>
> Attachments: PIG-4457-1.patch
>
>
> Here is an example of the stack trace printed to console output. Actually, this 
> is a warning message and does not make the job fail. The data is getting 
> stored to the MySQL table, but I have no idea why Pig is looking to store output 
> on HDFS. I am using Pig along with Tez.
> using output size reader: 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.FileBasedOutputSizeReader
> unable to find the output file
> java.io.FileNotFoundException: File 
> hdfs://pts0021.persistent.co.in:9000/user/shareinsights/filtered_stock_data 
> does not exist.
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:647)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:101)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:705)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:701)
> at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:701)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.FileBasedOutputSizeReader.getOutputSize(FileBasedOutputSizeReader.java:81)
> at 
> org.apache.pig.tools.pigstats.JobStats.getOutputSize(JobStats.java:351)
> at 
> org.apache.pig.tools.pigstats.tez.TezVertexStats.addOutputStatistics(TezVertexStats.java:270)
> at 
> org.apache.pig.tools.pigstats.tez.TezVertexStats.accumulateStats(TezVertexStats.java:188)
> at 
> org.apache.pig.tools.pigstats.tez.TezDAGStats.accumulateStats(TezDAGStats.java:209)
> at 
> org.apache.pig.tools.pigstats.tez.TezPigScriptStats.accumulateStats(TezPigScriptStats.java:180)
> at 
> org.apache.pig.backend.hadoop.executionengine.tez.TezJob.run(TezJob.java:194)
> at 
> org.apache.pig.backend.hadoop.executionengine.tez.TezLauncher$1.run(TezLauncher.java:167)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)




[jira] [Reopened] (PIG-4457) Error is thrown by JobStats.getOutputSize() when storing to a MySql table

2015-03-24 Thread Rohini Palaniswamy (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-4457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohini Palaniswamy reopened PIG-4457:
-
  Assignee: Rohini Palaniswamy

Reopening to have 
pig.stats.output.size.reader.unsupported=org.apache.hcatalog.pig.HCatStorer,org.apache.hive.hcatalog.pig.HCatStorer,org.apache.pig.piggybank.storage.DBStorage

added to pig-default.properties.
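
For illustration only, a minimal Java sketch of how a comma-separated property like the one above could be parsed and used to skip the output-size lookup for storers that do not write to a file system. This is an assumption about the general mechanism, not Pig's actual implementation: the class `UnsupportedStorerCheck` and its methods are hypothetical names; only the property key and the class names in the value come from this message.

```java
import java.util.HashSet;
import java.util.Properties;
import java.util.Set;

// Hypothetical helper, not part of Pig: shows one way a comma-separated
// "unsupported storers" property could be interpreted.
final class UnsupportedStorerCheck {
    // Property key taken from the message above.
    static final String KEY = "pig.stats.output.size.reader.unsupported";

    // Split the property value on commas, trimming whitespace and
    // ignoring empty entries.
    static Set<String> parseUnsupported(Properties props) {
        Set<String> result = new HashSet<String>();
        String raw = props.getProperty(KEY, "");
        for (String name : raw.split(",")) {
            String trimmed = name.trim();
            if (!trimmed.isEmpty()) {
                result.add(trimmed);
            }
        }
        return result;
    }

    // True if the given store function class name is listed as unsupported,
    // i.e. a file-based size reader should not be asked for its output size.
    static boolean isUnsupported(Properties props, String storeFuncClass) {
        return parseUnsupported(props).contains(storeFuncClass);
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty(KEY,
            "org.apache.hcatalog.pig.HCatStorer,"
            + "org.apache.hive.hcatalog.pig.HCatStorer,"
            + "org.apache.pig.piggybank.storage.DBStorage");
        System.out.println(
            isUnsupported(props, "org.apache.pig.piggybank.storage.DBStorage"));
    }
}
```

With DBStorage in the list, a check of this kind would let the statistics code skip the HDFS lookup that produces the FileNotFoundException quoted below.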

> Error is thrown by JobStats.getOutputSize() when storing to a MySql table
> -
>
> Key: PIG-4457
> URL: https://issues.apache.org/jira/browse/PIG-4457
> Project: Pig
>  Issue Type: Bug
>Reporter: Kunal Kumar
>Assignee: Rohini Palaniswamy
>
> Here is an example of the stack trace printed to console output. Actually, this 
> is a warning message and does not make the job fail. The data is getting 
> stored to the MySQL table, but I have no idea why Pig is looking to store output 
> on HDFS. I am using Pig along with Tez.
> using output size reader: 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.FileBasedOutputSizeReader
> unable to find the output file
> java.io.FileNotFoundException: File 
> hdfs://pts0021.persistent.co.in:9000/user/shareinsights/filtered_stock_data 
> does not exist.
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:647)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:101)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:705)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:701)
> at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:701)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.FileBasedOutputSizeReader.getOutputSize(FileBasedOutputSizeReader.java:81)
> at 
> org.apache.pig.tools.pigstats.JobStats.getOutputSize(JobStats.java:351)
> at 
> org.apache.pig.tools.pigstats.tez.TezVertexStats.addOutputStatistics(TezVertexStats.java:270)
> at 
> org.apache.pig.tools.pigstats.tez.TezVertexStats.accumulateStats(TezVertexStats.java:188)
> at 
> org.apache.pig.tools.pigstats.tez.TezDAGStats.accumulateStats(TezDAGStats.java:209)
> at 
> org.apache.pig.tools.pigstats.tez.TezPigScriptStats.accumulateStats(TezPigScriptStats.java:180)
> at 
> org.apache.pig.backend.hadoop.executionengine.tez.TezJob.run(TezJob.java:194)
> at 
> org.apache.pig.backend.hadoop.executionengine.tez.TezLauncher$1.run(TezLauncher.java:167)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)




[jira] [Commented] (PIG-4439) Getting exception java.lang.VerifyError: class org.apache.tez.dag.api.records.DAGProtos$DAGPlan overrides final method getUnknownFields.()Lcom/google/protobuf/UnknownFiel

2015-03-24 Thread Kunal Kumar (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-4439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377420#comment-14377420
 ] 

Kunal Kumar commented on PIG-4439:
--

Thanks, Daniel. I used the Tez jars that come with the Pig-0.14 distribution and it is 
working fine now. Earlier I was building the jars from the tez-0.5.2 source.

> Getting exception java.lang.VerifyError: class 
> org.apache.tez.dag.api.records.DAGProtos$DAGPlan overrides final method 
> getUnknownFields.()Lcom/google/protobuf/UnknownFieldSet, while trying to run 
> Tez-0.5.2 on pig-0.14
> -
>
> Key: PIG-4439
> URL: https://issues.apache.org/jira/browse/PIG-4439
> Project: Pig
>  Issue Type: Bug
>Reporter: Kunal Kumar
>
> Exception in thread "main" java.lang.VerifyError: class 
> org.apache.tez.dag.api.records.DAGProtos$DAGPlan overrides final method 
> getUnknownFields.()Lcom/google/protobuf/UnknownFieldSet;
>   at java.lang.ClassLoader.defineClass1(Native Method)
>   at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
>   at 
> java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>   at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
>   at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>   at java.lang.Class.getDeclaredMethods0(Native Method)
>   at java.lang.Class.privateGetDeclaredMethods(Class.java:2531)
>   at java.lang.Class.getMethod0(Class.java:2774)
>   at java.lang.Class.getMethod(Class.java:1663)
>   at sun.launcher.LauncherHelper.getMainMethod(LauncherHelper.java:494)
>   at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:486)




[jira] Subscription: PIG patch available

2015-03-24 Thread jira

Issue Subscription
Filter: PIG patch available (26 issues)

Subscriber: pigdaily

Key         Summary
PIG-4481    e2e tests ComputeSpec_1, ComputeSpec_2 and StreamingPerformance_3 produce different result on Windows
            https://issues.apache.org/jira/browse/PIG-4481
PIG-4476    Fix logging in AvroStorage* classes and SchemaTuple class
            https://issues.apache.org/jira/browse/PIG-4476
PIG-4475    Keys in AvroMapWrapper are not proper Pig types
            https://issues.apache.org/jira/browse/PIG-4475
PIG-4458    Support UDFs in a FOREACH Before a Merge Join
            https://issues.apache.org/jira/browse/PIG-4458
PIG-4455    Should use DependencyOrderWalker instead of DepthFirstWalker in MRPrinter
            https://issues.apache.org/jira/browse/PIG-4455
PIG-4452    Embedded SQL using "SQL" instead of "sql" fails with string index out of range: -1 error
            https://issues.apache.org/jira/browse/PIG-4452
PIG-4422    Implement visitMergeJoin in SparkCompiler
            https://issues.apache.org/jira/browse/PIG-4422
PIG-4377    Skewed outer join produce wrong result in some cases
            https://issues.apache.org/jira/browse/PIG-4377
PIG-4341    Add CMX support to pig.tmpfilecompression.codec
            https://issues.apache.org/jira/browse/PIG-4341
PIG-4323    PackageConverter hanging in Spark
            https://issues.apache.org/jira/browse/PIG-4323
PIG-4313    StackOverflowError in LIMIT operation on Spark
            https://issues.apache.org/jira/browse/PIG-4313
PIG-4251    Pig on Storm
            https://issues.apache.org/jira/browse/PIG-4251
PIG-4193    Make collected group work with Spark
            https://issues.apache.org/jira/browse/PIG-4193
PIG-4111    Make Pig compiles with avro-1.7.7
            https://issues.apache.org/jira/browse/PIG-4111
PIG-4004    Upgrade the Pigmix queries from the (old) mapred API to mapreduce
            https://issues.apache.org/jira/browse/PIG-4004
PIG-4002    Disable combiner when map-side aggregation is used
            https://issues.apache.org/jira/browse/PIG-4002
PIG-3952    PigStorage accepts '-tagSplit' to return full split information
            https://issues.apache.org/jira/browse/PIG-3952
PIG-3911    Define unique fields with @OutputSchema
            https://issues.apache.org/jira/browse/PIG-3911
PIG-3877    Getting Geo Latitude/Longitude from Address Lines
            https://issues.apache.org/jira/browse/PIG-3877
PIG-3873    Geo distance calculation using Haversine
            https://issues.apache.org/jira/browse/PIG-3873
PIG-3866    Create ThreadLocal classloader per PigContext
            https://issues.apache.org/jira/browse/PIG-3866
PIG-3851    Upgrade jline to 2.11
            https://issues.apache.org/jira/browse/PIG-3851
PIG-3668    COR built-in function when atleast one of the coefficient values is NaN
            https://issues.apache.org/jira/browse/PIG-3668
PIG-3635    Fix e2e tests for Hadoop 2.X on Windows
            https://issues.apache.org/jira/browse/PIG-3635
PIG-3587    add functionality for rolling over dates
            https://issues.apache.org/jira/browse/PIG-3587
PIG-3294    Allow Pig use Hive UDFs
            https://issues.apache.org/jira/browse/PIG-3294

You may edit this subscription at:
https://issues.apache.org/jira/secure/FilterSubscription!default.jspa?subId=16328&filterId=12322384
