[jira] [Commented] (PIG-3461) Rewrite PartitionFilterOptimizer to make it work for all the cases

2013-09-17 Thread Aniket Mokashi (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770466#comment-13770466
 ] 

Aniket Mokashi commented on PIG-3461:
-

RB: https://reviews.apache.org/r/14196/

> Rewrite PartitionFilterOptimizer to make it work for all the cases
> --
>
> Key: PIG-3461
> URL: https://issues.apache.org/jira/browse/PIG-3461
> Project: Pig
>  Issue Type: Bug
>  Components: impl
>Affects Versions: 0.11.1
>Reporter: Aniket Mokashi
>Assignee: Aniket Mokashi
> Fix For: 0.12
>
> Attachments: PIG-3461-2.patch, PIG-3461-4.patch
>
>
> Current algorithm for Partition Filter pushdown identification fails in 
> several corner cases. We need to rewrite its logic so that it works in all 
> cases and does the maximum possible filter pushdown.
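One way to picture "maximum possible filter pushdown" is to split a filter's AND-ed conditions into those that reference only partition columns and those that do not. The following is a minimal sketch of that idea, not the logic of the attached patches; the tuple-based expression format and the names `columns`, `conjuncts`, and `split_filter` are hypothetical.

```python
# Hypothetical expression trees:
#   ("and", left, right), ("or", left, right), or ("cmp", column, op, value)

def columns(expr):
    """Return the set of column names referenced by an expression."""
    if expr[0] == "cmp":
        return {expr[1]}
    return columns(expr[1]) | columns(expr[2])

def conjuncts(expr):
    """Flatten nested ANDs into a flat list of conditions."""
    if expr[0] == "and":
        return conjuncts(expr[1]) + conjuncts(expr[2])
    return [expr]

def split_filter(expr, partition_cols):
    """Split a filter into (pushable, residual) lists of conjuncts.

    A conjunct is pushable only if every column it references
    is a partition column.
    """
    pushable, residual = [], []
    for cond in conjuncts(expr):
        if columns(cond) <= partition_cols:
            pushable.append(cond)
        else:
            residual.append(cond)
    return pushable, residual

f = ("and",
     ("cmp", "dt", "==", "2013-09-17"),
     ("or", ("cmp", "region", "==", "us"), ("cmp", "age", ">", 21)))
pushable, residual = split_filter(f, {"dt", "region"})
```

Here the `dt` condition can go to the loader, while the OR stays in the filter because one of its branches touches a non-partition column.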

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (PIG-3461) Rewrite PartitionFilterOptimizer to make it work for all the cases

2013-09-17 Thread Aniket Mokashi (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aniket Mokashi updated PIG-3461:


Status: Patch Available  (was: Open)

> Rewrite PartitionFilterOptimizer to make it work for all the cases
> --
>
> Key: PIG-3461
> URL: https://issues.apache.org/jira/browse/PIG-3461
> Project: Pig
>  Issue Type: Bug
>  Components: impl
>Affects Versions: 0.11.1
>Reporter: Aniket Mokashi
>Assignee: Aniket Mokashi
> Fix For: 0.12
>
> Attachments: PIG-3461-2.patch, PIG-3461-4.patch
>
>
> Current algorithm for Partition Filter pushdown identification fails in 
> several corner cases. We need to rewrite its logic so that it works in all 
> cases and does the maximum possible filter pushdown.



[jira] [Updated] (PIG-3461) Rewrite PartitionFilterOptimizer to make it work for all the cases

2013-09-17 Thread Aniket Mokashi (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aniket Mokashi updated PIG-3461:


Attachment: PIG-3461-4.patch

> Rewrite PartitionFilterOptimizer to make it work for all the cases
> --
>
> Key: PIG-3461
> URL: https://issues.apache.org/jira/browse/PIG-3461
> Project: Pig
>  Issue Type: Bug
>  Components: impl
>Affects Versions: 0.11.1
>Reporter: Aniket Mokashi
>Assignee: Aniket Mokashi
> Fix For: 0.12
>
> Attachments: PIG-3461-2.patch, PIG-3461-4.patch
>
>
> Current algorithm for Partition Filter pushdown identification fails in 
> several corner cases. We need to rewrite its logic so that it works in all 
> cases and does the maximum possible filter pushdown.



[jira] [Created] (PIG-3465) Fix problems with LogicalExpressionSimplifier and turn on the optimizer by default

2013-09-17 Thread Aniket Mokashi (JIRA)
Aniket Mokashi created PIG-3465:
---

 Summary: Fix problems with LogicalExpressionSimplifier and turn on 
the optimizer by default 
 Key: PIG-3465
 URL: https://issues.apache.org/jira/browse/PIG-3465
 Project: Pig
  Issue Type: Bug
Reporter: Aniket Mokashi






[jira] [Updated] (PIG-3464) Mark ExecType and ExecutionEngine interfaces as evolving

2013-09-17 Thread Cheolsoo Park (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cheolsoo Park updated PIG-3464:
---

Attachment: PIG-3464-1.patch

The attached patch adds the evolving annotation to the new interfaces: 
ExecType and ExecutionEngine.

> Mark ExecType and ExecutionEngine interfaces as evolving
> 
>
> Key: PIG-3464
> URL: https://issues.apache.org/jira/browse/PIG-3464
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.12
>Reporter: Cheolsoo Park
>Assignee: Cheolsoo Park
> Fix For: 0.12
>
> Attachments: PIG-3464-1.patch
>
>




[jira] [Updated] (PIG-3464) Mark ExecType and ExecutionEngine interfaces as evolving

2013-09-17 Thread Cheolsoo Park (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cheolsoo Park updated PIG-3464:
---

Status: Patch Available  (was: Open)

> Mark ExecType and ExecutionEngine interfaces as evolving
> 
>
> Key: PIG-3464
> URL: https://issues.apache.org/jira/browse/PIG-3464
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.12
>Reporter: Cheolsoo Park
>Assignee: Cheolsoo Park
> Fix For: 0.12
>
> Attachments: PIG-3464-1.patch
>
>




[jira] [Created] (PIG-3464) Mark ExecType and ExecutionEngine interfaces as evolving

2013-09-17 Thread Cheolsoo Park (JIRA)
Cheolsoo Park created PIG-3464:
--

 Summary: Mark ExecType and ExecutionEngine interfaces as evolving
 Key: PIG-3464
 URL: https://issues.apache.org/jira/browse/PIG-3464
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.12
Reporter: Cheolsoo Park
Assignee: Cheolsoo Park
 Fix For: 0.12






[jira] [Commented] (PIG-3457) Provide backward compatibility for PigStatsUtil and JobStats

2013-09-17 Thread Cheolsoo Park (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770286#comment-13770286
 ] 

Cheolsoo Park commented on PIG-3457:


[~daijy], isn't your patch ready to commit? It looks good to me.

> Provide backward compatibility for PigStatsUtil and JobStats
> 
>
> Key: PIG-3457
> URL: https://issues.apache.org/jira/browse/PIG-3457
> Project: Pig
>  Issue Type: Bug
>  Components: impl
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 0.12
>
> Attachments: PIG-3457-1.patch
>
>
> PIG-3419 restructured PigStatsUtil, which breaks downstream projects such as 
> Oozie. Oozie uses PigStatsUtil.{HDFS_BYTES_WRITTEN, MAP_INPUT_RECORDS, 
> MAP_OUTPUT_RECORDS, REDUCE_INPUT_RECORDS, REDUCE_OUTPUT_RECORDS}. We need to 
> provide a backward-compatible way to access them.



[jira] Subscription: PIG patch available

2013-09-17 Thread jira
Issue Subscription
Filter: PIG patch available (15 issues)

Subscriber: pigdaily

Key         Summary
PIG-3455    Pig 0.11.1 OutOfMemory error
            https://issues.apache.org/jira/browse/PIG-3455
PIG-3451    EvalFunc ctor reflection to determine value of type param T is brittle
            https://issues.apache.org/jira/browse/PIG-3451
PIG-3449    Move JobCreationException to org.apache.pig.backend.hadoop.executionengine
            https://issues.apache.org/jira/browse/PIG-3449
PIG-3448    Tez backend layout
            https://issues.apache.org/jira/browse/PIG-3448
PIG-3441    Allow Pig to use default resources from Configuration objects
            https://issues.apache.org/jira/browse/PIG-3441
PIG-3434    Null subexpression in bincond nullifies outer tuple (or bag)
            https://issues.apache.org/jira/browse/PIG-3434
PIG-3388    No support for Regex for row filter in org.apache.pig.backend.hadoop.hbase.HBaseStorage
            https://issues.apache.org/jira/browse/PIG-3388
PIG-3367    Add assert keyword (operator) in pig
            https://issues.apache.org/jira/browse/PIG-3367
PIG-3325    Adding a tuple to a bag is slow
            https://issues.apache.org/jira/browse/PIG-3325
PIG-3292    Logical plan invalid state: duplicate uid in schema during self-join to get cross product
            https://issues.apache.org/jira/browse/PIG-3292
PIG-3257    Add unique identifier UDF
            https://issues.apache.org/jira/browse/PIG-3257
PIG-3199    Expose LogicalPlan via PigServer API
            https://issues.apache.org/jira/browse/PIG-3199
PIG-3088    Add a builtin udf which removes prefixes
            https://issues.apache.org/jira/browse/PIG-3088
PIG-3021    Split results missing records when there is null values in the column comparison
            https://issues.apache.org/jira/browse/PIG-3021
PIG-2417    Streaming UDFs - allow users to easily write UDFs in scripting languages with no JVM implementation.
            https://issues.apache.org/jira/browse/PIG-2417

You may edit this subscription at:
https://issues.apache.org/jira/secure/FilterSubscription!default.jspa?subId=13225&filterId=12322384


[jira] [Commented] (PIG-3455) Pig 0.11.1 OutOfMemory error

2013-09-17 Thread Bill Graham (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770256#comment-13770256
 ] 

Bill Graham commented on PIG-3455:
--

+1, much better.

> Pig 0.11.1 OutOfMemory error
> 
>
> Key: PIG-3455
> URL: https://issues.apache.org/jira/browse/PIG-3455
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.11.1
>Reporter: Shubham Chopra
>Priority: Critical
> Fix For: 0.12, 0.11.2
>
> Attachments: PIG-3455-1.patch
>
>
> When running Pig on a relatively large script (around 1.5k lines, 85 
> assignments), Pig fails with the following error even before any jobs are 
> fired:
> Pig Stack Trace
> ---
> ERROR 2998: Unhandled internal error. Java heap space
> java.lang.OutOfMemoryError: Java heap space
> at java.util.Arrays.copyOf(Arrays.java:2882)
> at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)
> at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:390)
> at java.lang.StringBuilder.append(StringBuilder.java:119)
> at org.apache.pig.newplan.logical.optimizer.LogicalPlanPrinter.depthFirstLP(LogicalPlanPrinter.java:83)
> at org.apache.pig.newplan.logical.optimizer.LogicalPlanPrinter.visit(LogicalPlanPrinter.java:69)
> at org.apache.pig.newplan.logical.relational.LogicalPlan.getSignature(LogicalPlan.java:122)
> at org.apache.pig.PigServer.execute(PigServer.java:1237)
> at org.apache.pig.PigServer.executeBatch(PigServer.java:333)
> at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:137)
> at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:198)
> at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:170)
> at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
> at org.apache.pig.Main.run(Main.java:604)
> at org.apache.pig.Main.main(Main.java:157)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
> The same script works fine with Pig-0.10.1.
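The trace shows the heap being exhausted while a StringBuilder accumulates the printed plan inside LogicalPlan.getSignature. As an illustration of why that can be avoided in principle, the sketch below compares hashing a fully materialized plan string with feeding the same text to a digest piece by piece. It is an assumption-laden sketch in Python, not Pig code, and it does not claim to describe what PIG-3455-1.patch does; the `(name, children)` node shape and all function names are hypothetical.

```python
import hashlib

def plan_lines(node, depth=0):
    """Yield one printed line per plan node, depth first.

    Hypothetical node shape: (name, [children]).
    """
    name, children = node
    yield "    " * depth + name + "\n"
    for child in children:
        yield from plan_lines(child, depth + 1)

def signature_concat(plan):
    """Build the whole printed plan in memory, then hash it."""
    text = "".join(plan_lines(plan))
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def signature_streaming(plan):
    """Feed each line to the digest as it is produced.

    Only one line is held at a time, so memory use does not grow
    with the size of the printed plan.
    """
    digest = hashlib.sha256()
    for line in plan_lines(plan):
        digest.update(line.encode("utf-8"))
    return digest.hexdigest()

plan = ("STORE", [("FOREACH", [("GROUP", [("LOAD", [])])])])
same = signature_concat(plan) == signature_streaming(plan)
```

Both functions produce the same digest, because updating a hash incrementally is equivalent to hashing the concatenated input.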



[jira] [Updated] (PIG-3455) Pig 0.11.1 OutOfMemory error

2013-09-17 Thread Rohini Palaniswamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohini Palaniswamy updated PIG-3455:


Attachment: PIG-3455-1.patch

> Pig 0.11.1 OutOfMemory error
> 
>
> Key: PIG-3455
> URL: https://issues.apache.org/jira/browse/PIG-3455
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.11.1
>Reporter: Shubham Chopra
>Priority: Critical
> Fix For: 0.12, 0.11.2
>
> Attachments: PIG-3455-1.patch
>
>
> When running Pig on a relatively large script (around 1.5k lines, 85 
> assignments), Pig fails with the following error even before any jobs are 
> fired:
> Pig Stack Trace
> ---
> ERROR 2998: Unhandled internal error. Java heap space
> java.lang.OutOfMemoryError: Java heap space
> at java.util.Arrays.copyOf(Arrays.java:2882)
> at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)
> at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:390)
> at java.lang.StringBuilder.append(StringBuilder.java:119)
> at org.apache.pig.newplan.logical.optimizer.LogicalPlanPrinter.depthFirstLP(LogicalPlanPrinter.java:83)
> at org.apache.pig.newplan.logical.optimizer.LogicalPlanPrinter.visit(LogicalPlanPrinter.java:69)
> at org.apache.pig.newplan.logical.relational.LogicalPlan.getSignature(LogicalPlan.java:122)
> at org.apache.pig.PigServer.execute(PigServer.java:1237)
> at org.apache.pig.PigServer.executeBatch(PigServer.java:333)
> at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:137)
> at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:198)
> at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:170)
> at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
> at org.apache.pig.Main.run(Main.java:604)
> at org.apache.pig.Main.main(Main.java:157)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
> The same script works fine with Pig-0.10.1.



[jira] [Updated] (PIG-3455) Pig 0.11.1 OutOfMemory error

2013-09-17 Thread Rohini Palaniswamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohini Palaniswamy updated PIG-3455:


Status: Patch Available  (was: Open)

> Pig 0.11.1 OutOfMemory error
> 
>
> Key: PIG-3455
> URL: https://issues.apache.org/jira/browse/PIG-3455
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.11.1
>Reporter: Shubham Chopra
>Priority: Critical
> Fix For: 0.12, 0.11.2
>
> Attachments: PIG-3455-1.patch
>
>
> When running Pig on a relatively large script (around 1.5k lines, 85 
> assignments), Pig fails with the following error even before any jobs are 
> fired:
> Pig Stack Trace
> ---
> ERROR 2998: Unhandled internal error. Java heap space
> java.lang.OutOfMemoryError: Java heap space
> at java.util.Arrays.copyOf(Arrays.java:2882)
> at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)
> at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:390)
> at java.lang.StringBuilder.append(StringBuilder.java:119)
> at org.apache.pig.newplan.logical.optimizer.LogicalPlanPrinter.depthFirstLP(LogicalPlanPrinter.java:83)
> at org.apache.pig.newplan.logical.optimizer.LogicalPlanPrinter.visit(LogicalPlanPrinter.java:69)
> at org.apache.pig.newplan.logical.relational.LogicalPlan.getSignature(LogicalPlan.java:122)
> at org.apache.pig.PigServer.execute(PigServer.java:1237)
> at org.apache.pig.PigServer.executeBatch(PigServer.java:333)
> at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:137)
> at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:198)
> at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:170)
> at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
> at org.apache.pig.Main.run(Main.java:604)
> at org.apache.pig.Main.main(Main.java:157)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
> The same script works fine with Pig-0.10.1.



[jira] [Updated] (PIG-3367) Add assert keyword (operator) in pig

2013-09-17 Thread Aniket Mokashi (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aniket Mokashi updated PIG-3367:


Status: Open  (was: Patch Available)

> Add assert keyword (operator) in pig
> 
>
> Key: PIG-3367
> URL: https://issues.apache.org/jira/browse/PIG-3367
> Project: Pig
>  Issue Type: New Feature
>  Components: parser
>Reporter: Aniket Mokashi
>Assignee: Aniket Mokashi
> Fix For: 0.12
>
> Attachments: PIG-3367-2.patch, PIG-3367.patch
>
>
> The assert operator can be used for data validation. With assert, you can 
> write a script like the following:
> {code}
> a = load 'something' as (a0:int, a1:int);
> assert a by a0 > 0, 'a cant be negative for reasons';
> {code}
> This script will fail if the assertion is violated.
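The semantics described above (pass records through, fail the script when a record violates the condition) can be sketched outside Pig as follows. This is an illustration in Python only; the function name `assert_by` and the use of `ValueError` are assumptions, and how Pig itself reports the failure is not stated in this issue.

```python
def assert_by(records, predicate, message):
    """Pass records through unchanged; fail on the first violation."""
    for record in records:
        if not predicate(record):
            raise ValueError(f"Assertion violated: {message} (record: {record})")
        yield record

# Mirrors "assert a by a0 > 0": a0 is the first field of each record.
rows = [(1, 10), (2, 20)]
checked = list(assert_by(rows, lambda r: r[0] > 0, "a0 must be positive"))
```

With a record such as `(0, 5)` in the input, iterating over `assert_by` raises instead of returning the remaining rows.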



[jira] [Updated] (PIG-3367) Add assert keyword (operator) in pig

2013-09-17 Thread Aniket Mokashi (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aniket Mokashi updated PIG-3367:


Attachment: PIG-3367-2.patch

Incorporated code review comments

> Add assert keyword (operator) in pig
> 
>
> Key: PIG-3367
> URL: https://issues.apache.org/jira/browse/PIG-3367
> Project: Pig
>  Issue Type: New Feature
>  Components: parser
>Reporter: Aniket Mokashi
>Assignee: Aniket Mokashi
> Fix For: 0.12
>
> Attachments: PIG-3367-2.patch, PIG-3367.patch
>
>
> The assert operator can be used for data validation. With assert, you can 
> write a script like the following:
> {code}
> a = load 'something' as (a0:int, a1:int);
> assert a by a0 > 0, 'a cant be negative for reasons';
> {code}
> This script will fail if the assertion is violated.



[jira] [Updated] (PIG-3367) Add assert keyword (operator) in pig

2013-09-17 Thread Aniket Mokashi (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aniket Mokashi updated PIG-3367:


Status: Patch Available  (was: Open)

> Add assert keyword (operator) in pig
> 
>
> Key: PIG-3367
> URL: https://issues.apache.org/jira/browse/PIG-3367
> Project: Pig
>  Issue Type: New Feature
>  Components: parser
>Reporter: Aniket Mokashi
>Assignee: Aniket Mokashi
> Fix For: 0.12
>
> Attachments: PIG-3367-2.patch, PIG-3367.patch
>
>
> The assert operator can be used for data validation. With assert, you can 
> write a script like the following:
> {code}
> a = load 'something' as (a0:int, a1:int);
> assert a by a0 > 0, 'a cant be negative for reasons';
> {code}
> This script will fail if the assertion is violated.



[jira] [Commented] (PIG-3287) MultiQueryOptimizer can prevent CombinerOptimizer from working

2013-09-17 Thread Christon DeWan (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770022#comment-13770022
 ] 

Christon DeWan commented on PIG-3287:
-

This is a very long workflow and needs both optimizations to work effectively. 
For now I've refactored my flow to avoid needing both at once.

> MultiQueryOptimizer can prevent CombinerOptimizer from working
> --
>
> Key: PIG-3287
> URL: https://issues.apache.org/jira/browse/PIG-3287
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.10.1
>Reporter: Christon DeWan
>
> The CombinerOptimizer does not operate on the script below. As a result, all 
> work is done in the reducer(s), killing performance. Removing one STORE or 
> refactoring the query to use a single FOREACH after the group allows the 
> CombinerOptimizer to work.
> {noformat}
> %declare DUMMY `bash -c '(for (( i=0; \$i < 10; i++ )); do echo \$i 5; done) 
> | hadoop fs -put - /tmp/test_data.tsv; true'`
> s = LOAD '/tmp/test_data.tsv' USING PigStorage(' ') AS (n:long, g:long);
> grouped = GROUP s BY g;
> counted = FOREACH grouped GENERATE flatten($0), COUNT_STAR($1);
> STORE counted INTO '/tmp/test_count';
> summed = FOREACH grouped GENERATE flatten($0), SUM($1.n);
> STORE summed INTO '/tmp/test_sum';
> FS -rmr /tmp/test_{data.tsv,count,sum}
> {noformat}
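For context, the combiner matters here because both COUNT_STAR and SUM can be computed from partial results produced on the map side. The sketch below shows that idea in Python with both aggregates carried in one partial pair per group. It is not Pig's CombinerOptimizer and says nothing about how the two optimizers interact; the function names are hypothetical.

```python
from collections import defaultdict

def combine(records):
    """Map-side partial aggregation: one (count, sum) pair per group key."""
    partial = defaultdict(lambda: [0, 0])
    for n, g in records:
        partial[g][0] += 1   # contributes to COUNT_STAR
        partial[g][1] += n   # contributes to SUM(n)
    return {g: tuple(v) for g, v in partial.items()}

def reduce_partials(partials):
    """Reduce-side merge of the partial pairs from every mapper."""
    total = defaultdict(lambda: [0, 0])
    for partial in partials:
        for g, (count, s) in partial.items():
            total[g][0] += count
            total[g][1] += s
    return {g: tuple(v) for g, v in total.items()}

# Same shape as the test data in the report: n = 0..9, g = 5,
# split here across two simulated mappers.
data = [(i, 5) for i in range(10)]
result = reduce_partials([combine(data[:4]), combine(data[4:])])
```

Without the combiner, every `(n, g)` record is shipped to the reducer; with it, each mapper sends a single pair per group.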



[jira] [Commented] (PIG-2417) Streaming UDFs - allow users to easily write UDFs in scripting languages with no JVM implementation.

2013-09-17 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770005#comment-13770005
 ] 

Daniel Dai commented on PIG-2417:
-

Thanks. I'd like to check this in before we branch.

> Streaming UDFs -  allow users to easily write UDFs in scripting languages 
> with no JVM implementation.
> -
>
> Key: PIG-2417
> URL: https://issues.apache.org/jira/browse/PIG-2417
> Project: Pig
>  Issue Type: Improvement
>Affects Versions: 0.12
>Reporter: Jeremy Karn
> Fix For: 0.12
>
> Attachments: PIG-2417-4.patch, PIG-2417-5.patch, PIG-2417-6.patch, 
> PIG-2417-7.patch, PIG-2417-8.patch, PIG-2417-e2e.patch, streaming2.patch, 
> streaming3.patch, streaming.patch
>
>
> The goal of Streaming UDFs is to allow users to easily write UDFs in 
> scripting languages with no JVM implementation or a limited JVM 
> implementation.  The initial proposal is outlined here: 
> https://cwiki.apache.org/confluence/display/PIG/StreamingUDFs.
> In order to implement this we need new syntax to distinguish a streaming UDF 
> from an embedded JVM UDF.  I'd propose something like the following (although 
> I'm not sure 'language' is the best term to be using):
> {code}define my_streaming_udfs language('python') 
> ship('my_streaming_udfs.py'){code}
> We'll also need a language-specific controller script that gets shipped to 
> the cluster which is responsible for reading the input stream, deserializing 
> the input data, passing it to the user written script, serializing that 
> script output, and writing that to the output stream.
> Finally, we'll need to add a StreamingUDF class that extends EvalFunc.  This 
> class will likely share some of the existing code in POStream and 
> ExecutableManager (where it makes sense to pull out shared code) to stream 
> data to/from the controller script.
> One alternative approach to creating the StreamingUDF EvalFunc is to use the 
> POStream operator directly.  This would involve inserting the POStream 
> operator instead of the POUserFunc operator whenever we encountered a 
> streaming UDF while building the physical plan.  This approach seemed 
> problematic because there would need to be a lot of changes in order to 
> support POStream in all of the places we want to be able to use UDFs (for 
> example, to operate on a single field inside of a foreach statement).
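The controller script described above (read the input stream, deserialize, call the user's function, serialize the result, write the output stream) might look roughly like the sketch below. The one-JSON-array-per-line wire format is purely an assumption made for illustration, since the description does not specify a serialization, and `run_controller` and `concat_upper` are hypothetical names.

```python
import io
import json

def run_controller(user_func, in_stream, out_stream):
    """Apply user_func to each serialized input and write each result."""
    for line in in_stream:
        line = line.rstrip("\n")
        if not line:
            continue                      # skip blank lines
        args = json.loads(line)           # deserialize the input fields
        result = user_func(*args)         # call the user-written function
        out_stream.write(json.dumps(result) + "\n")  # serialize and emit
    out_stream.flush()

def concat_upper(a, b):
    """Stand-in for a user-written streaming UDF."""
    return (a + b).upper()

# In-memory streams stand in for the process's stdin and stdout.
source = io.StringIO('["ab", "cd"]\n["x", "y"]\n')
sink = io.StringIO()
run_controller(concat_upper, source, sink)
```

In a real deployment the two streams would be `sys.stdin` and `sys.stdout`, with the JVM side writing inputs and reading results.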



[jira] [Updated] (PIG-3461) Rewrite PartitionFilterOptimizer to make it work for all the cases

2013-09-17 Thread Aniket Mokashi (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aniket Mokashi updated PIG-3461:


Assignee: Aniket Mokashi
  Status: Patch Available  (was: Open)

> Rewrite PartitionFilterOptimizer to make it work for all the cases
> --
>
> Key: PIG-3461
> URL: https://issues.apache.org/jira/browse/PIG-3461
> Project: Pig
>  Issue Type: Bug
>  Components: impl
>Affects Versions: 0.11.1
>Reporter: Aniket Mokashi
>Assignee: Aniket Mokashi
> Fix For: 0.12
>
> Attachments: PIG-3461-2.patch
>
>
> Current algorithm for Partition Filter pushdown identification fails in 
> several corner cases. We need to rewrite its logic so that it works in all 
> cases and does the maximum possible filter pushdown.



[jira] [Commented] (PIG-2417) Streaming UDFs - allow users to easily write UDFs in scripting languages with no JVM implementation.

2013-09-17 Thread Jeremy Karn (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769981#comment-13769981
 ] 

Jeremy Karn commented on PIG-2417:
--

I'll update the patch (probably tomorrow) to take advantage of PIG-3255.

I think the only outstanding comment in the review board is how the logging 
works with Hadoop2.  I'm hoping to get a chance to test that in the next couple 
of days.

> Streaming UDFs -  allow users to easily write UDFs in scripting languages 
> with no JVM implementation.
> -
>
> Key: PIG-2417
> URL: https://issues.apache.org/jira/browse/PIG-2417
> Project: Pig
>  Issue Type: Improvement
>Affects Versions: 0.12
>Reporter: Jeremy Karn
> Fix For: 0.12
>
> Attachments: PIG-2417-4.patch, PIG-2417-5.patch, PIG-2417-6.patch, 
> PIG-2417-7.patch, PIG-2417-8.patch, PIG-2417-e2e.patch, streaming2.patch, 
> streaming3.patch, streaming.patch
>
>
> The goal of Streaming UDFs is to allow users to easily write UDFs in 
> scripting languages with no JVM implementation or a limited JVM 
> implementation.  The initial proposal is outlined here: 
> https://cwiki.apache.org/confluence/display/PIG/StreamingUDFs.
> In order to implement this we need new syntax to distinguish a streaming UDF 
> from an embedded JVM UDF.  I'd propose something like the following (although 
> I'm not sure 'language' is the best term to be using):
> {code}define my_streaming_udfs language('python') 
> ship('my_streaming_udfs.py'){code}
> We'll also need a language-specific controller script that gets shipped to 
> the cluster which is responsible for reading the input stream, deserializing 
> the input data, passing it to the user written script, serializing that 
> script output, and writing that to the output stream.
> Finally, we'll need to add a StreamingUDF class that extends EvalFunc.  This 
> class will likely share some of the existing code in POStream and 
> ExecutableManager (where it makes sense to pull out shared code) to stream 
> data to/from the controller script.
> One alternative approach to creating the StreamingUDF EvalFunc is to use the 
> POStream operator directly.  This would involve inserting the POStream 
> operator instead of the POUserFunc operator whenever we encountered a 
> streaming UDF while building the physical plan.  This approach seemed 
> problematic because there would need to be a lot of changes in order to 
> support POStream in all of the places we want to be able to use UDFs (for 
> example, to operate on a single field inside of a foreach statement).



[jira] [Created] (PIG-3463) Pig should use hadoop local mode for small jobs

2013-09-17 Thread Aniket Mokashi (JIRA)
Aniket Mokashi created PIG-3463:
---

 Summary: Pig should use hadoop local mode for small jobs
 Key: PIG-3463
 URL: https://issues.apache.org/jira/browse/PIG-3463
 Project: Pig
  Issue Type: New Feature
  Components: impl
Affects Versions: 0.11.1
Reporter: Aniket Mokashi
 Fix For: 0.12


Pig should use Hadoop local mode for small jobs: those with few mappers, few 
reducers, and only a few MB of data.
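A decision rule of this kind could be sketched as below. The thresholds are invented for illustration, since the issue only says "few", and `JobEstimate` and `should_run_locally` are hypothetical names rather than Pig APIs.

```python
from dataclasses import dataclass

@dataclass
class JobEstimate:
    mappers: int
    reducers: int
    input_bytes: int

# Hypothetical thresholds; the issue does not give concrete numbers.
MAX_MAPPERS = 1
MAX_REDUCERS = 1
MAX_INPUT_BYTES = 100 * 1024 * 1024  # 100 MB

def should_run_locally(job):
    """Return True only when every estimate is under its threshold."""
    return (job.mappers <= MAX_MAPPERS
            and job.reducers <= MAX_REDUCERS
            and job.input_bytes <= MAX_INPUT_BYTES)

small = JobEstimate(mappers=1, reducers=1, input_bytes=5 * 1024 * 1024)
large = JobEstimate(mappers=40, reducers=4, input_bytes=10**12)
```

Requiring all three conditions keeps a job with tiny input but many reducers on the cluster.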



[jira] [Commented] (PIG-3367) Add assert keyword (operator) in pig

2013-09-17 Thread Julien Le Dem (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769985#comment-13769985
 ] 

Julien Le Dem commented on PIG-3367:


Looks good to me.
Is there a way you can factor out some of the content of buildAssertOp()? It 
looks like some of this would be common with other methods.

> Add assert keyword (operator) in pig
> 
>
> Key: PIG-3367
> URL: https://issues.apache.org/jira/browse/PIG-3367
> Project: Pig
>  Issue Type: New Feature
>  Components: parser
>Reporter: Aniket Mokashi
>Assignee: Aniket Mokashi
> Fix For: 0.12
>
> Attachments: PIG-3367.patch
>
>
> The assert operator can be used for data validation. With assert, you can 
> write a script like the following:
> {code}
> a = load 'something' as (a0:int, a1:int);
> assert a by a0 > 0, 'a cant be negative for reasons';
> {code}
> This script will fail if the assertion is violated.



[jira] [Updated] (PIG-3461) Rewrite PartitionFilterOptimizer to make it work for all the cases

2013-09-17 Thread Aniket Mokashi (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aniket Mokashi updated PIG-3461:


Attachment: PIG-3461-2.patch

> Rewrite PartitionFilterOptimizer to make it work for all the cases
> --
>
> Key: PIG-3461
> URL: https://issues.apache.org/jira/browse/PIG-3461
> Project: Pig
>  Issue Type: Bug
>  Components: impl
>Affects Versions: 0.11.1
>Reporter: Aniket Mokashi
> Fix For: 0.12
>
> Attachments: PIG-3461-2.patch
>
>
> Current algorithm for Partition Filter pushdown identification fails in 
> several corner cases. We need to rewrite its logic so that it works in all 
> cases and does the maximum possible filter pushdown.



[jira] [Commented] (PIG-2417) Streaming UDFs - allow users to easily write UDFs in scripting languages with no JVM implementation.

2013-09-17 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769970#comment-13769970
 ] 

Daniel Dai commented on PIG-2417:
-

[~jeremykarn]
Now that PIG-3255 is checked in, do you want to add this optimization? Also, 
can you respond to my comments on Review Board?

> Streaming UDFs -  allow users to easily write UDFs in scripting languages 
> with no JVM implementation.
> -
>
> Key: PIG-2417
> URL: https://issues.apache.org/jira/browse/PIG-2417
> Project: Pig
>  Issue Type: Improvement
>Affects Versions: 0.12
>Reporter: Jeremy Karn
> Fix For: 0.12
>
> Attachments: PIG-2417-4.patch, PIG-2417-5.patch, PIG-2417-6.patch, 
> PIG-2417-7.patch, PIG-2417-8.patch, PIG-2417-e2e.patch, streaming2.patch, 
> streaming3.patch, streaming.patch
>
>
> The goal of Streaming UDFs is to allow users to easily write UDFs in 
> scripting languages with no JVM implementation or a limited JVM 
> implementation.  The initial proposal is outlined here: 
> https://cwiki.apache.org/confluence/display/PIG/StreamingUDFs.
> In order to implement this we need new syntax to distinguish a streaming UDF 
> from an embedded JVM UDF.  I'd propose something like the following (although 
> I'm not sure 'language' is the best term to be using):
> {code}define my_streaming_udfs language('python') 
> ship('my_streaming_udfs.py'){code}
> We'll also need a language-specific controller script that gets shipped to 
> the cluster. It is responsible for reading the input stream, deserializing 
> the input data, passing it to the user-written script, serializing that 
> script's output, and writing it to the output stream.
> Finally, we'll need to add a StreamingUDF class that extends EvalFunc.  This 
> class will likely share some of the existing code in POStream and 
> ExecutableManager (where it makes sense to pull out shared code) to stream 
> data to/from the controller script.
> One alternative approach to creating the StreamingUDF EvalFunc is to use the 
> POStream operator directly.  This would involve inserting the POStream 
> operator instead of the POUserFunc operator whenever we encountered a 
> streaming UDF while building the physical plan.  This approach seemed 
> problematic because there would need to be a lot of changes in order to 
> support POStream in all of the places we want to be able to use UDFs (for 
> example, to operate on a single field inside of a foreach statement).
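
The controller responsibilities listed in the description (read the input stream, deserialize, call the user-written code, serialize, write to the output stream) can be sketched as a simple loop. This is only an illustration in Java, not the proposed controller, which would be written in the scripting language itself. The tab delimiter, the one-record-per-line framing, and the names `StreamingControllerSketch` and `run` are all assumptions made for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch of a controller loop; names and wire format are assumptions.
final class StreamingControllerSketch {

    // Reads one tab-delimited record per line, applies the user function,
    // and writes one serialized result per line.
    static void run(Reader input, Writer output, Function<List<String>, String> userFunc) {
        try {
            BufferedReader reader = new BufferedReader(input);
            String line;
            while ((line = reader.readLine()) != null) {
                // Deserialize: split the record into fields (-1 keeps trailing empty fields).
                List<String> fields = Arrays.asList(line.split("\t", -1));
                // Pass the deserialized fields to the user-written function.
                String result = userFunc.apply(fields);
                // Serialize and write the result to the output stream.
                output.write(result);
                output.write('\n');
            }
            output.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        StringWriter out = new StringWriter();
        run(new StringReader("a\tb\nc\td\n"), out, fields -> String.join("+", fields));
        System.out.print(out);
    }
}
```

Passing the user code in as a function keeps the framing and serialization in one place, which is roughly the split between controller and user script that the description proposes.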



[jira] [Updated] (PIG-3461) Rewrite PartitionFilterOptimizer to make it work for all the cases

2013-09-17 Thread Aniket Mokashi (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aniket Mokashi updated PIG-3461:


Status: Open  (was: Patch Available)

Some extra code (EvalFunc) was mistakenly added to the patch. I will resubmit a 
refactored patch soon. Canceling the patch in the meantime.

> Rewrite PartitionFilterOptimizer to make it work for all the cases
> --
>
> Key: PIG-3461
> URL: https://issues.apache.org/jira/browse/PIG-3461
> Project: Pig
>  Issue Type: Bug
>  Components: impl
>Affects Versions: 0.11.1
>Reporter: Aniket Mokashi
>Assignee: Aniket Mokashi
> Fix For: 0.12
>
> Attachments: PIG-3461-2.patch
>
>
> Current algorithm for Partition Filter pushdown identification fails in 
> several corner cases. We need to rewrite its logic so that it works in all 
> cases and does the maximum possible filter pushdown.



[jira] [Updated] (PIG-3255) Avoid extra byte array copies in streaming

2013-09-17 Thread Rohini Palaniswamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohini Palaniswamy updated PIG-3255:


Summary: Avoid extra byte array copies in streaming  (was: Avoid extra byte 
array copy in streaming deserialize)

Committed to trunk. Thanks, Alan, Daniel, and Jeremy.

> Avoid extra byte array copies in streaming
> --
>
> Key: PIG-3255
> URL: https://issues.apache.org/jira/browse/PIG-3255
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.11
>Reporter: Rohini Palaniswamy
>Assignee: Rohini Palaniswamy
> Fix For: 0.12
>
> Attachments: PIG-3255-1.patch, PIG-3255-2.patch, PIG-3255-3.patch, 
> PIG-3255-4.patch, PIG-3255-5.patch
>
>
> PigStreaming.java:
>     public Tuple deserialize(byte[] bytes) throws IOException {
>         Text val = new Text(bytes);
>         return StorageUtil.textToTuple(val, fieldDel);
>     }
> We should remove the new Text(bytes) copy and construct the tuple directly 
> from the bytes.
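
As a rough illustration of "construct the tuple directly from the bytes", the sketch below scans the byte array for the delimiter and builds each field from its slice, without first copying the whole buffer into an intermediate object. It is not the committed patch and does not use Pig's classes: a `List<String>` stands in for the tuple, and the class and method names are hypothetical. Each field is still materialized from its own slice; what the sketch avoids is the extra copy of the entire input.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; List<String> stands in for Pig's Tuple.
final class DirectDeserializeSketch {

    // Splits the raw bytes on a single-byte delimiter without an
    // intermediate copy of the whole buffer.
    static List<String> deserialize(byte[] bytes, byte fieldDel) {
        List<String> fields = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < bytes.length; i++) {
            if (bytes[i] == fieldDel) {
                // Build the field straight from the slice [start, i).
                fields.add(new String(bytes, start, i - start, StandardCharsets.UTF_8));
                start = i + 1;
            }
        }
        // Last field (also covers input that contains no delimiter).
        fields.add(new String(bytes, start, bytes.length - start, StandardCharsets.UTF_8));
        return fields;
    }

    public static void main(String[] args) {
        byte[] raw = "a\tbc\t".getBytes(StandardCharsets.UTF_8);
        System.out.println(deserialize(raw, (byte) '\t'));
    }
}
```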



[jira] [Updated] (PIG-3255) Avoid extra byte array copies in streaming

2013-09-17 Thread Rohini Palaniswamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohini Palaniswamy updated PIG-3255:


Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Avoid extra byte array copies in streaming
> --
>
> Key: PIG-3255
> URL: https://issues.apache.org/jira/browse/PIG-3255
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.11
>Reporter: Rohini Palaniswamy
>Assignee: Rohini Palaniswamy
> Fix For: 0.12
>
> Attachments: PIG-3255-1.patch, PIG-3255-2.patch, PIG-3255-3.patch, 
> PIG-3255-4.patch, PIG-3255-5.patch
>
>
> PigStreaming.java:
>     public Tuple deserialize(byte[] bytes) throws IOException {
>         Text val = new Text(bytes);
>         return StorageUtil.textToTuple(val, fieldDel);
>     }
> We should remove the new Text(bytes) copy and construct the tuple directly 
> from the bytes.



[jira] [Created] (PIG-3462) POForEach evaluates POProject one by one

2013-09-17 Thread Rohini Palaniswamy (JIRA)
Rohini Palaniswamy created PIG-3462:
---

 Summary: POForEach evaluates POProject one by one
 Key: PIG-3462
 URL: https://issues.apache.org/jira/browse/PIG-3462
 Project: Pig
  Issue Type: Improvement
Affects Versions: 0.11.1
Reporter: Rohini Palaniswamy


A = load '/tmp/data' using PigStorage() as (a1, a2, a3);
B = foreach A generate a1,a2,a3;

generates the following plan:

-B: New For Each(false,false,false)[bag] - scope-45
|   |
|   Project[bytearray][0] - scope-39
|   |
|   Project[bytearray][1] - scope-41
|   |
|   Project[bytearray][2] - scope-43
|
|---A: Load(/tmp/data:PigStorage()) - scope-38

It would be good to change the generated plan to combine all of these and fetch 
all projected columns at once instead of looping and projecting them one by one. 
POUserFunc, POCast, etc. in the foreach cannot be combined and will have to 
remain separate.
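
The difference between the two strategies can be sketched as follows. This is a toy model, not Pig's POForEach or POProject code: a `List` stands in for the input tuple, and `projectOne` and `projectAll` are hypothetical names. The first method models one operator evaluation per column; the second models a single operator that fetches every requested column in one pass.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of per-column versus combined projection.
final class CombinedProjectionSketch {

    // One-by-one: each call stands in for evaluating a separate projection operator.
    static Object projectOne(List<?> tuple, int column) {
        return tuple.get(column);
    }

    // Combined: one operator copies all requested columns in a single pass.
    static List<Object> projectAll(List<?> tuple, int[] columns) {
        List<Object> out = new ArrayList<>(columns.length);
        for (int column : columns) {
            out.add(tuple.get(column));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> row = Arrays.asList("x", "y", "z");
        System.out.println(projectAll(row, new int[] {0, 1, 2}));
    }
}
```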



[jira] [Commented] (PIG-3255) Avoid extra byte array copy in streaming deserialize

2013-09-17 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769786#comment-13769786
 ] 

Alan Gates commented on PIG-3255:
-

I gave my +1 above, so we're good from my viewpoint.

> Avoid extra byte array copy in streaming deserialize
> 
>
> Key: PIG-3255
> URL: https://issues.apache.org/jira/browse/PIG-3255
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.11
>Reporter: Rohini Palaniswamy
>Assignee: Rohini Palaniswamy
> Fix For: 0.12
>
> Attachments: PIG-3255-1.patch, PIG-3255-2.patch, PIG-3255-3.patch, 
> PIG-3255-4.patch, PIG-3255-5.patch
>
>
> PigStreaming.java:
>     public Tuple deserialize(byte[] bytes) throws IOException {
>         Text val = new Text(bytes);
>         return StorageUtil.textToTuple(val, fieldDel);
>     }
> We should remove the new Text(bytes) copy and construct the tuple directly 
> from the bytes.



[jira] [Commented] (PIG-3419) Pluggable Execution Engine

2013-09-17 Thread Cheolsoo Park (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769734#comment-13769734
 ] 

Cheolsoo Park commented on PIG-3419:


I will open a JIRA to add "unstable" annotations. I am also reviewing PIG-3457 
now.

> Pluggable Execution Engine 
> ---
>
> Key: PIG-3419
> URL: https://issues.apache.org/jira/browse/PIG-3419
> Project: Pig
>  Issue Type: New Feature
>Affects Versions: 0.12
>Reporter: Achal Soni
>Assignee: Achal Soni
>Priority: Minor
> Fix For: 0.12
>
> Attachments: execengine.patch, mapreduce_execengine.patch, 
> stats_scriptstate.patch, test_failures.txt, test_suite.patch, 
> updated-8-22-2013-exec-engine.patch, updated-8-23-2013-exec-engine.patch, 
> updated-8-27-2013-exec-engine.patch, updated-8-28-2013-exec-engine.patch, 
> updated-8-29-2013-exec-engine.patch
>
>
> In an effort to adapt Pig to work using Apache Tez 
> (https://issues.apache.org/jira/browse/TEZ), I made some changes to allow for 
> a cleaner ExecutionEngine abstraction than existed before. The changes are 
> not that major as Pig was already relatively abstracted out between the 
> frontend and backend. The changes in the attached commit are essentially the 
> barebones changes -- I tried to not change the structure of Pig's different 
> components too much. I think it will be interesting to see in the future how 
> we can refactor more areas of Pig to really honor this abstraction between 
> the frontend and backend. 
> Some of the changes were to reinstate an ExecutionEngine interface to tie 
> together the frontend and backend, to make the changes in Pig to delegate 
> to the EE when necessary, and to create an MRExecutionEngine that implements 
> this interface. Other work included changing ExecType to cycle through the 
> ExecutionEngines on the classpath and select the appropriate one (this is 
> done using Java ServiceLoader, exactly as MapReduce does for choosing the 
> framework to use between local and distributed mode). Also, I tried to make 
> ScriptState, JobStats, and PigStats as abstract as possible in their current 
> state. I think in the future some work will need to be done here to perhaps 
> re-evaluate the usage of ScriptState and the responsibilities of the 
> different statistics classes. I haven't touched the PPNL, but I think more 
> abstraction is needed here, perhaps in a separate patch. 
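
The ServiceLoader-based selection mentioned in the description can be sketched roughly as below. This is not the patch's code: the `EngineProvider` interface, its `accepts` and `name` methods, and the fallback argument are hypothetical stand-ins. Only `java.util.ServiceLoader` itself is the real JDK API; providers would be registered through a `META-INF/services` file named after the interface.

```java
import java.util.ServiceLoader;

// Hypothetical sketch of picking an engine from providers found on the classpath.
final class EngineSelectionSketch {

    // Stand-in for an execution-engine provider interface (an assumption).
    public interface EngineProvider {
        boolean accepts(String execType);
        String name();
    }

    // Cycles through the registered providers and returns the first that
    // accepts the requested exec type, or the fallback if none does.
    static String select(String execType, String fallback) {
        for (EngineProvider provider : ServiceLoader.load(EngineProvider.class)) {
            if (provider.accepts(execType)) {
                return provider.name();
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        // With no providers registered, the fallback is returned.
        System.out.println(select("mapreduce", "none"));
    }
}
```

The appeal of this design is that a new engine can be added by dropping a jar on the classpath, without changing the code that performs the selection.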



[jira] [Resolved] (PIG-3065) pig output format/committer should support recovery for hadoop 0.23

2013-09-17 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai resolved PIG-3065.
-

  Resolution: Fixed
Hadoop Flags: Reviewed

Patch committed to trunk.

> pig output format/committer should support recovery for hadoop 0.23
> ---
>
> Key: PIG-3065
> URL: https://issues.apache.org/jira/browse/PIG-3065
> Project: Pig
>  Issue Type: New Feature
>Reporter: Rohini Palaniswamy
>Assignee: Daniel Dai
>Priority: Minor
> Fix For: 0.12
>
> Attachments: PIG-3065-3.patch, PIG-3065.patch.txt, PIG-3065.patch.txt
>
>
> In Hadoop 0.23 the output committer can optionally support recovery to handle
> the application master getting restarted (failing some # of attempts). If it's 
> possible, the Pig outputformat/committer should support recovery.



[jira] [Assigned] (PIG-3065) pig output format/committer should support recovery for hadoop 0.23

2013-09-17 Thread Rohini Palaniswamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohini Palaniswamy reassigned PIG-3065:
---

Assignee: Daniel Dai

> pig output format/committer should support recovery for hadoop 0.23
> ---
>
> Key: PIG-3065
> URL: https://issues.apache.org/jira/browse/PIG-3065
> Project: Pig
>  Issue Type: New Feature
>Reporter: Rohini Palaniswamy
>Assignee: Daniel Dai
>Priority: Minor
> Fix For: 0.12
>
> Attachments: PIG-3065-3.patch, PIG-3065.patch.txt, PIG-3065.patch.txt
>
>
> In Hadoop 0.23 the output committer can optionally support recovery to handle
> the application master getting restarted (failing some # of attempts). If it's 
> possible, the Pig outputformat/committer should support recovery.



Rounding with 8 places

2013-09-17 Thread Arun Prakash
Hi,
I don't know how to round a number to 8 decimal places in Pig.

Eg:

Number => Rounded

0 => 0.00000000

3 => 3.00000000

3.1 => 3.10000000

-10.18 => -10.18000000

Any suggestions?

Regards,
Arun Prakash
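
One possible approach, sketched here in plain Java, is to do the rounding with `BigDecimal` and a fixed scale of 8, for example as the core of a small user-defined function. Only the rounding logic is shown; the UDF wrapper is omitted, the class and method names are hypothetical, and the choice of `RoundingMode.HALF_UP` is an assumption since the question does not say how ties should be handled.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical helper: formats a number with exactly 8 decimal places.
final class RoundToEightPlaces {

    // Parses the decimal text, rounds to scale 8 (HALF_UP is an assumption),
    // and returns plain (non-scientific) notation.
    static String round8(String number) {
        return new BigDecimal(number)
                .setScale(8, RoundingMode.HALF_UP)
                .toPlainString();
    }

    public static void main(String[] args) {
        System.out.println(round8("0"));      // 0.00000000
        System.out.println(round8("3.1"));    // 3.10000000
        System.out.println(round8("-10.18")); // -10.18000000
    }
}
```

Taking the input as text avoids binary floating-point artifacts, and returning a string keeps the trailing zeros, which a numeric type would not preserve.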





[jira] [Updated] (PIG-3388) No support for Regex for row filter in org.apache.pig.backend.hadoop.hbase.HBaseStorage

2013-09-17 Thread Lorand Bendig (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lorand Bendig updated PIG-3388:
---

Fix Version/s: 0.12

> No support for Regex for row filter in 
> org.apache.pig.backend.hadoop.hbase.HBaseStorage
> ---
>
> Key: PIG-3388
> URL: https://issues.apache.org/jira/browse/PIG-3388
> Project: Pig
>  Issue Type: Improvement
>  Components: impl
>Affects Versions: 0.11, 0.11.1
>Reporter: vikram s
>Assignee: Lorand Bendig
> Fix For: 0.12
>
> Attachments: PIG-3388.patch
>
>
> Currently, the scan operation with a row filter supports gt, lt, gte, etc. 
> However, there is no support for regular expressions.
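
To make the requested behavior concrete, the sketch below shows what a regular-expression row filter would select, compared with range-style filters such as gt or lt. It is a plain-Java model over a list of row keys, not HBaseStorage or HBase client code. The class and method names are hypothetical, and matching the whole key (rather than a substring) is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical model of a regex-based row filter over row keys.
final class RegexRowFilterSketch {

    // Keeps only the row keys that match the regular expression in full.
    static List<String> filterRows(List<String> rowKeys, String regex) {
        Pattern pattern = Pattern.compile(regex);
        List<String> kept = new ArrayList<>();
        for (String key : rowKeys) {
            if (pattern.matcher(key).matches()) {
                kept.add(key);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("user_1", "order_7", "user_22");
        System.out.println(filterRows(keys, "user_\\d+"));
    }
}
```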
