[jira] Subscription: PIG patch available

2016-08-30 Thread jira
Issue Subscription
Filter: PIG patch available (27 issues)

Subscriber: pigdaily

Key       Summary
PIG-4967  NPE in PigJobControl.run() when job status is null
https://issues.apache.org/jira/browse/PIG-4967
PIG-4926  Modify the content of start.xml for spark mode
https://issues.apache.org/jira/browse/PIG-4926
PIG-4922  Deadlock between SpillableMemoryManager and InternalSortedBag$SortedDataBagIterator
https://issues.apache.org/jira/browse/PIG-4922
PIG-4918  Pig on Tez cannot switch pig.temp.dir to another fs
https://issues.apache.org/jira/browse/PIG-4918
PIG-4897  Scope of param substitution for run/exec commands
https://issues.apache.org/jira/browse/PIG-4897
PIG-4854  Merge spark branch to trunk
https://issues.apache.org/jira/browse/PIG-4854
PIG-4849  pig on tez will cause tez-ui to crash,because the content from timeline server is too long.
https://issues.apache.org/jira/browse/PIG-4849
PIG-4788  the value BytesRead metric info always returns 0 even the length of input file is not 0 in spark engine
https://issues.apache.org/jira/browse/PIG-4788
PIG-4745  DataBag should protect content of passed list of tuples
https://issues.apache.org/jira/browse/PIG-4745
PIG-4684  Exception should be changed to warning when job diagnostics cannot be fetched
https://issues.apache.org/jira/browse/PIG-4684
PIG-4656  Improve String serialization and comparator performance in BinInterSedes
https://issues.apache.org/jira/browse/PIG-4656
PIG-4598  Allow user defined plan optimizer rules
https://issues.apache.org/jira/browse/PIG-4598
PIG-4551  Partition filter is not pushed down in case of SPLIT
https://issues.apache.org/jira/browse/PIG-4551
PIG-4539  New PigUnit
https://issues.apache.org/jira/browse/PIG-4539
PIG-4515  org.apache.pig.builtin.Distinct throws ClassCastException
https://issues.apache.org/jira/browse/PIG-4515
PIG-4323  PackageConverter hanging in Spark
https://issues.apache.org/jira/browse/PIG-4323
PIG-4313  StackOverflowError in LIMIT operation on Spark
https://issues.apache.org/jira/browse/PIG-4313
PIG-4251  Pig on Storm
https://issues.apache.org/jira/browse/PIG-4251
PIG-4002  Disable combiner when map-side aggregation is used
https://issues.apache.org/jira/browse/PIG-4002
PIG-3952  PigStorage accepts '-tagSplit' to return full split information
https://issues.apache.org/jira/browse/PIG-3952
PIG-3911  Define unique fields with @OutputSchema
https://issues.apache.org/jira/browse/PIG-3911
PIG-3877  Getting Geo Latitude/Longitude from Address Lines
https://issues.apache.org/jira/browse/PIG-3877
PIG-3873  Geo distance calculation using Haversine
https://issues.apache.org/jira/browse/PIG-3873
PIG-3864  ToDate(userstring, format, timezone) computes DateTime with strange handling of Daylight Saving Time with location based timezones
https://issues.apache.org/jira/browse/PIG-3864
PIG-3851  Upgrade jline to 2.11
https://issues.apache.org/jira/browse/PIG-3851
PIG-3668  COR built-in function when atleast one of the coefficient values is NaN
https://issues.apache.org/jira/browse/PIG-3668
PIG-3587  add functionality for rolling over dates
https://issues.apache.org/jira/browse/PIG-3587

You may edit this subscription at:
https://issues.apache.org/jira/secure/FilterSubscription!default.jspa?subId=16328&filterId=12322384


[jira] Subscription: PIG patch available

2016-08-30 Thread jira
Issue Subscription
Filter: PIG patch available (27 issues)

Subscriber: pigdaily

Key       Summary
PIG-4926  Modify the content of start.xml for spark mode
https://issues-test.apache.org/jira/browse/PIG-4926
PIG-4922  Deadlock between SpillableMemoryManager and InternalSortedBag$SortedDataBagIterator
https://issues-test.apache.org/jira/browse/PIG-4922
PIG-4918  Pig on Tez cannot switch pig.temp.dir to another fs
https://issues-test.apache.org/jira/browse/PIG-4918
PIG-4897  Scope of param substitution for run/exec commands
https://issues-test.apache.org/jira/browse/PIG-4897
PIG-4886  Add PigSplit#getLocationInfo to fix the NPE found in log in spark mode
https://issues-test.apache.org/jira/browse/PIG-4886
PIG-4854  Merge spark branch to trunk
https://issues-test.apache.org/jira/browse/PIG-4854
PIG-4849  pig on tez will cause tez-ui to crash,because the content from timeline server is too long.
https://issues-test.apache.org/jira/browse/PIG-4849
PIG-4788  the value BytesRead metric info always returns 0 even the length of input file is not 0 in spark engine
https://issues-test.apache.org/jira/browse/PIG-4788
PIG-4745  DataBag should protect content of passed list of tuples
https://issues-test.apache.org/jira/browse/PIG-4745
PIG-4684  Exception should be changed to warning when job diagnostics cannot be fetched
https://issues-test.apache.org/jira/browse/PIG-4684
PIG-4656  Improve String serialization and comparator performance in BinInterSedes
https://issues-test.apache.org/jira/browse/PIG-4656
PIG-4598  Allow user defined plan optimizer rules
https://issues-test.apache.org/jira/browse/PIG-4598
PIG-4551  Partition filter is not pushed down in case of SPLIT
https://issues-test.apache.org/jira/browse/PIG-4551
PIG-4539  New PigUnit
https://issues-test.apache.org/jira/browse/PIG-4539
PIG-4515  org.apache.pig.builtin.Distinct throws ClassCastException
https://issues-test.apache.org/jira/browse/PIG-4515
PIG-4323  PackageConverter hanging in Spark
https://issues-test.apache.org/jira/browse/PIG-4323
PIG-4313  StackOverflowError in LIMIT operation on Spark
https://issues-test.apache.org/jira/browse/PIG-4313
PIG-4251  Pig on Storm
https://issues-test.apache.org/jira/browse/PIG-4251
PIG-4002  Disable combiner when map-side aggregation is used
https://issues-test.apache.org/jira/browse/PIG-4002
PIG-3952  PigStorage accepts '-tagSplit' to return full split information
https://issues-test.apache.org/jira/browse/PIG-3952
PIG-3911  Define unique fields with @OutputSchema
https://issues-test.apache.org/jira/browse/PIG-3911
PIG-3877  Getting Geo Latitude/Longitude from Address Lines
https://issues-test.apache.org/jira/browse/PIG-3877
PIG-3873  Geo distance calculation using Haversine
https://issues-test.apache.org/jira/browse/PIG-3873
PIG-3864  ToDate(userstring, format, timezone) computes DateTime with strange handling of Daylight Saving Time with location based timezones
https://issues-test.apache.org/jira/browse/PIG-3864
PIG-3851  Upgrade jline to 2.11
https://issues-test.apache.org/jira/browse/PIG-3851
PIG-3668  COR built-in function when atleast one of the coefficient values is NaN
https://issues-test.apache.org/jira/browse/PIG-3668
PIG-3587  add functionality for rolling over dates
https://issues-test.apache.org/jira/browse/PIG-3587

You may edit this subscription at:
https://issues-test.apache.org/jira/secure/FilterSubscription!default.jspa?subId=16328&filterId=12322384


Jenkins build is back to normal : Pig-trunk #1938

2016-08-30 Thread Apache Jenkins Server
See 



Build failed in Jenkins: Pig-trunk-commit #2365

2016-08-30 Thread Apache Jenkins Server
See 

Changes:

[daijy] PIG-4972: StreamingIO_1 fail on perl 5.22

--
[...truncated 3008 lines...]
  [javadoc] Loading source files for package 
org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.regex...
  [javadoc] Loading source files for package 
org.apache.pig.backend.hadoop.executionengine.physicalLayer.plans...
  [javadoc] Loading source files for package 
org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators...
  [javadoc] Loading source files for package 
org.apache.pig.backend.hadoop.executionengine.physicalLayer.util...
  [javadoc] Loading source files for package 
org.apache.pig.backend.hadoop.executionengine.tez...
  [javadoc] Loading source files for package 
org.apache.pig.backend.hadoop.executionengine.tez.plan...
  [javadoc] Loading source files for package 
org.apache.pig.backend.hadoop.executionengine.tez.plan.operator...
  [javadoc] Loading source files for package 
org.apache.pig.backend.hadoop.executionengine.tez.plan.optimizer...
  [javadoc] Loading source files for package 
org.apache.pig.backend.hadoop.executionengine.tez.plan.udf...
  [javadoc] Loading source files for package 
org.apache.pig.backend.hadoop.executionengine.tez.runtime...
  [javadoc] Loading source files for package 
org.apache.pig.backend.hadoop.executionengine.tez.util...
  [javadoc] Loading source files for package 
org.apache.pig.backend.hadoop.executionengine.util...
  [javadoc] Loading source files for package 
org.apache.pig.backend.hadoop.hbase...
  [javadoc] Loading source files for package 
org.apache.pig.backend.hadoop.streaming...
  [javadoc] Loading source files for package org.apache.pig.builtin...
  [javadoc] Loading source files for package org.apache.pig.builtin.mock...
  [javadoc] Loading source files for package org.apache.pig.classification...
  [javadoc] Loading source files for package org.apache.pig.data...
  [javadoc] Loading source files for package org.apache.pig.data.utils...
  [javadoc] Loading source files for package org.apache.pig.impl...
  [javadoc] Loading source files for package org.apache.pig.impl.builtin...
  [javadoc] Loading source files for package org.apache.pig.impl.io...
  [javadoc] Loading source files for package org.apache.pig.impl.io.compress...
  [javadoc] Loading source files for package org.apache.pig.impl.logicalLayer...
  [javadoc] Loading source files for package 
org.apache.pig.impl.logicalLayer.schema...
  [javadoc] Loading source files for package 
org.apache.pig.impl.logicalLayer.validators...
  [javadoc] Loading source files for package org.apache.pig.impl.plan...
  [javadoc] Loading source files for package 
org.apache.pig.impl.plan.optimizer...
  [javadoc] Loading source files for package org.apache.pig.impl.streaming...
  [javadoc] Loading source files for package org.apache.pig.impl.util...
  [javadoc] Loading source files for package org.apache.pig.impl.util.avro...
  [javadoc] Loading source files for package org.apache.pig.impl.util.hive...
  [javadoc] Loading source files for package org.apache.pig.newplan...
  [javadoc] Loading source files for package org.apache.pig.newplan.logical...
  [javadoc] Loading source files for package 
org.apache.pig.newplan.logical.expression...
  [javadoc] Loading source files for package 
org.apache.pig.newplan.logical.optimizer...
  [javadoc] Loading source files for package 
org.apache.pig.newplan.logical.relational...
  [javadoc] Loading source files for package 
org.apache.pig.newplan.logical.rules...
  [javadoc] Loading source files for package 
org.apache.pig.newplan.logical.visitor...
  [javadoc] Loading source files for package org.apache.pig.newplan.optimizer...
  [javadoc] Loading source files for package org.apache.pig.parser...
  [javadoc] Loading source files for package org.apache.pig.pen...
  [javadoc] Loading source files for package org.apache.pig.pen.util...
  [javadoc] Loading source files for package org.apache.pig.scripting...
  [javadoc] Loading source files for package org.apache.pig.scripting.groovy...
  [javadoc] Loading source files for package org.apache.pig.scripting.jruby...
  [javadoc] Loading source files for package org.apache.pig.scripting.js...
  [javadoc] Loading source files for package org.apache.pig.scripting.jython...
  [javadoc] Loading source files for package 
org.apache.pig.scripting.streaming.python...
  [javadoc] Loading source files for package org.apache.pig.tools...
  [javadoc] Loading source files for package org.apache.pig.tools.cmdline...
  [javadoc] Loading source files for package org.apache.pig.tools.counters...
  [javadoc] Loading source files for package org.apache.pig.tools.grunt...
  [javadoc] Loading source files for package org.apache.pig.tools.parameters...
  [javadoc] Loading source files for package org.apache.pig.tools.pigstats...
  [javadoc] Loading source files for package 
org.apache.pig.tools.pigst

[jira] [Updated] (PIG-4972) StreamingIO_1 fail on perl 5.22

2016-08-30 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-4972:

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Patch committed to trunk. Thanks for the review, Thejas and Koji!

> StreamingIO_1 fail on perl 5.22
> ---
>
> Key: PIG-4972
> URL: https://issues.apache.org/jira/browse/PIG-4972
> Project: Pig
> Issue Type: Bug
> Components: e2e harness
> Reporter: Daniel Dai
> Assignee: Daniel Dai
> Fix For: 0.17.0
>
> Attachments: PIG-4972-1.patch
>
>
> Saw StreamingIO_1 fail on a particular perl version due to a warning in
> PigStreaming.pl. You can see the warning in any version of perl using
> "perl -w":
> {code}
> defined(%hash) is deprecated at streaming/PigStreaming.pl line 76.
>   (Maybe you should just omit the defined()?)
> {code}
> In some versions of perl the warning check is mandatory and the perl
> script simply fails.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PIG-4972) StreamingIO_1 fail on perl 5.22

2016-08-30 Thread Koji Noguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-4972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15450037#comment-15450037
 ] 

Koji Noguchi commented on PIG-4972:
---

+1 (Just in case, tested the patch with e2e and they ran fine.)

> StreamingIO_1 fail on perl 5.22
> ---
>
> Key: PIG-4972
> URL: https://issues.apache.org/jira/browse/PIG-4972
> Project: Pig
> Issue Type: Bug
> Components: e2e harness
> Reporter: Daniel Dai
> Assignee: Daniel Dai
> Fix For: 0.17.0
>
> Attachments: PIG-4972-1.patch
>
>
> Saw StreamingIO_1 fail on a particular perl version due to a warning in
> PigStreaming.pl. You can see the warning in any version of perl using
> "perl -w":
> {code}
> defined(%hash) is deprecated at streaming/PigStreaming.pl line 76.
>   (Maybe you should just omit the defined()?)
> {code}
> In some versions of perl the warning check is mandatory and the perl
> script simply fails.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PIG-4972) StreamingIO_1 fail on perl 5.22

2016-08-30 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-4972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15450030#comment-15450030
 ] 

Thejas M Nair commented on PIG-4972:


+1

> StreamingIO_1 fail on perl 5.22
> ---
>
> Key: PIG-4972
> URL: https://issues.apache.org/jira/browse/PIG-4972
> Project: Pig
> Issue Type: Bug
> Components: e2e harness
> Reporter: Daniel Dai
> Assignee: Daniel Dai
> Fix For: 0.17.0
>
> Attachments: PIG-4972-1.patch
>
>
> Saw StreamingIO_1 fail on a particular perl version due to a warning in
> PigStreaming.pl. You can see the warning in any version of perl using
> "perl -w":
> {code}
> defined(%hash) is deprecated at streaming/PigStreaming.pl line 76.
>   (Maybe you should just omit the defined()?)
> {code}
> In some versions of perl the warning check is mandatory and the perl
> script simply fails.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PIG-4920) Fail to use Javascript UDF in spark yarn client mode

2016-08-30 Thread Rohini Palaniswamy (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-4920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15450004#comment-15450004
 ] 

Rohini Palaniswamy commented on PIG-4920:
-

Liyun,
 This approach is not going to work, for the following reasons:
   - We should not have any if(mr/tez/spark) conditions in main code; we only
do that in test cases. When we move to maven (hopefully that will happen
sometime), spark code will be in its own module and SparkExecType will not be
available to the pig-core module.
   - PigContext is very heavy, and serializing it costs a lot in terms of
performance. PigContext is also not actually necessary in the backend
processing, so you should avoid serializing it in the first place, which
is what PIG-4866 does. The current patch serializes the udfcontext and
the client properties as part of PigContext, which are already part of the
object, doubling the size and making it worse.

You should be doing MapRedUtil.setupUDFContext(jobConf); as the first thing in
all threads used for execution, which is what MR and Tez do. I wish we could
get rid of this whole ThreadLocal business, as setting it up is very messy in
general, but it is required for local mode processing.
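
The ThreadLocal point above can be illustrated with a small, self-contained
Java sketch. It does not use any Pig classes: CONTEXT, setupContext and
lookupOnWorker are hypothetical names standing in for a ThreadLocal-backed
context and a setup call such as the one mentioned above. It only shows why
the setup has to run at the start of every execution thread: a value stored
through a ThreadLocal on one thread is not visible on another.

```java
import java.util.Properties;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ThreadLocalSetupSketch {
    // Hypothetical per-thread context; each thread starts with its own empty Properties.
    private static final ThreadLocal<Properties> CONTEXT =
            ThreadLocal.withInitial(Properties::new);

    // Hypothetical setup step: copies the job configuration into the
    // context of the thread that calls it (and only that thread).
    static void setupContext(Properties jobConf) {
        CONTEXT.get().putAll(jobConf);
    }

    // Reads a key from the calling thread's context; null if it was never set up.
    static String lookup(String key) {
        return CONTEXT.get().getProperty(key);
    }

    // Runs a lookup on a fresh worker thread, optionally doing the setup first.
    static String lookupOnWorker(Properties jobConf, String key, boolean setupFirst)
            throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Future<String> result = pool.submit(() -> {
                if (setupFirst) {
                    setupContext(jobConf);
                }
                return lookup(key);
            });
            return result.get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        Properties jobConf = new Properties();
        jobConf.setProperty("script.path", "udf.js");
        // Without setup the worker thread sees an empty context.
        System.out.println(lookupOnWorker(jobConf, "script.path", false));
        // With setup as the first step, the worker thread sees the value.
        System.out.println(lookupOnWorker(jobConf, "script.path", true));
    }
}
```

A lookup that finds nothing in the thread's context is the same kind of
failure as the "could not get script path from UDFContext" error quoted
below, assuming that error comes from a thread whose context was never
populated.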




> Fail to use Javascript UDF in spark yarn client mode
> 
>
> Key: PIG-4920
> URL: https://issues.apache.org/jira/browse/PIG-4920
> Project: Pig
> Issue Type: Sub-task
> Components: spark
> Reporter: liyunzhang_intel
> Fix For: spark-branch
>
> Attachments: PIG-4920.patch, PIG-4920_2.patch, PIG-4920_3.patch
>
>
> udf.pig 
> {code}
> register '/home/zly/prj/oss/merge.pig/pig/bin/udf.js' using javascript as 
> myfuncs;
> A = load './passwd' as (a0:chararray, a1:chararray);
> B = foreach A generate myfuncs.helloworld();
> store B into './udf.out';
> {code}
> udf.js
> {code}
> helloworld.outputSchema = "word:chararray";
> function helloworld() {
> return 'Hello, World';
> }
> 
> complex.outputSchema = "word:chararray";
> function complex(word){
> return {word:word};
> }
> {code}
> Running udf.pig in spark local mode (export SPARK_MASTER="local") succeeds.
> Running udf.pig in spark yarn client mode (export SPARK_MASTER="yarn-client")
> fails with an error message like the following:
> {noformat}
> Caused by: java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
> at 
> org.apache.pig.impl.PigContext.instantiateFuncFromSpec(PigContext.java:744)
> ... 84 more
> Caused by: java.lang.ExceptionInInitializerError
> at 
> org.apache.pig.scripting.js.JsScriptEngine.getInstance(JsScriptEngine.java:87)
> at org.apache.pig.scripting.js.JsFunction.<init>(JsFunction.java:173)
> ... 89 more
> Caused by: java.lang.IllegalStateException: could not get script path from 
> UDFContext
> at 
> org.apache.pig.scripting.js.JsScriptEngine$Holder.<clinit>(JsScriptEngine.java:69)
> ... 91 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (PIG-4973) Bigdecimal divison fails

2016-08-30 Thread Adam Szita (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated PIG-4973:

Attachment: PIG-4973.patch

> Bigdecimal divison fails
> 
>
> Key: PIG-4973
> URL: https://issues.apache.org/jira/browse/PIG-4973
> Project: Pig
> Issue Type: Bug
> Components: impl
> Reporter: Adam Szita
> Assignee: Adam Szita
> Attachments: PIG-4973.patch
>
>
> Division of BigDecimals doesn't work because we're not passing scale and
> rounding information to the divide() method. In cases like 10/3 we get an
> ArithmeticException:
> Pig script:
> grunt> A = LOAD 'decimaltest/f1' USING PigStorage(',') AS 
> (id,col1:bigdecimal,col2:bigdecimal);
> grunt> B = foreach A generate col1, col2, col1/col2;
> grunt> dump B
> Input file content:
> 1,10.0,3
> 2,51651351.13153143512,10.00
> 3,252525.252525,123.456
> Output with bigdecimal type in the schema:
> java.lang.Exception: org.apache.pig.backend.executionengine.ExecException: 
> ERROR 0: Exception while executing [Divide (Name: Divide[bigdecimal] - 
> scope-34 Operator Key: scope-34) children: [[POProject (Name: 
> Project[bigdecimal][0] - scope-32 Operator Key: scope-32) children: null at 
> []], [POProject (Name: Project[bigdecimal][1] - scope-33 Operator Key: 
> scope-33) children: null at []]] at []]: java.lang.ArithmeticException: 
> Non-terminating decimal expansion; no exact representable decimal result.
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
> Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 0: 
> Exception while executing [Divide (Name: Divide[bigdecimal] - scope-34 
> Operator Key: scope-34) children: [[POProject (Name: Project[bigdecimal][0] - 
> scope-32 Operator Key: scope-32) children: null at []], [POProject (Name: 
> Project[bigdecimal][1] - scope-33 Operator Key: scope-33) children: null at 
> []]] at []]: java.lang.ArithmeticException: Non-terminating decimal 
> expansion; no exact representable decimal result.
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:364)
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:404)
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNextTuple(POForEach.java:321)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:280)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:275)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:65)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.ArithmeticException: Non-terminating decimal expansion; 
> no exact representable decimal result.
>   at java.math.BigDecimal.divide(BigDecimal.java:1616)
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.Divide.divide(Divide.java:75)
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.Divide.genericGetNext(Divide.java:133)
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.Divide.getNextBigDecimal(Divide.java:166)
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:353)
>   ... 14 more
> Output with double in the schema:
> (10.0,3.0,3.3333333333333335)
> (5.165135113153143E7,10.0,5165135.113153143)
> (252525.252525,123.456,2045.467636445373)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Work started] (PIG-4973) Bigdecimal divison fails

2016-08-30 Thread Adam Szita (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on PIG-4973 started by Adam Szita.
---
> Bigdecimal divison fails
> 
>
> Key: PIG-4973
> URL: https://issues.apache.org/jira/browse/PIG-4973
> Project: Pig
> Issue Type: Bug
> Components: impl
> Reporter: Adam Szita
> Assignee: Adam Szita
>
> Division of BigDecimals doesn't work because we're not passing scale and
> rounding information to the divide() method. In cases like 10/3 we get an
> ArithmeticException:
> Pig script:
> grunt> A = LOAD 'decimaltest/f1' USING PigStorage(',') AS 
> (id,col1:bigdecimal,col2:bigdecimal);
> grunt> B = foreach A generate col1, col2, col1/col2;
> grunt> dump B
> Input file content:
> 1,10.0,3
> 2,51651351.13153143512,10.00
> 3,252525.252525,123.456
> Output with bigdecimal type in the schema:
> java.lang.Exception: org.apache.pig.backend.executionengine.ExecException: 
> ERROR 0: Exception while executing [Divide (Name: Divide[bigdecimal] - 
> scope-34 Operator Key: scope-34) children: [[POProject (Name: 
> Project[bigdecimal][0] - scope-32 Operator Key: scope-32) children: null at 
> []], [POProject (Name: Project[bigdecimal][1] - scope-33 Operator Key: 
> scope-33) children: null at []]] at []]: java.lang.ArithmeticException: 
> Non-terminating decimal expansion; no exact representable decimal result.
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
> Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 0: 
> Exception while executing [Divide (Name: Divide[bigdecimal] - scope-34 
> Operator Key: scope-34) children: [[POProject (Name: Project[bigdecimal][0] - 
> scope-32 Operator Key: scope-32) children: null at []], [POProject (Name: 
> Project[bigdecimal][1] - scope-33 Operator Key: scope-33) children: null at 
> []]] at []]: java.lang.ArithmeticException: Non-terminating decimal 
> expansion; no exact representable decimal result.
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:364)
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:404)
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNextTuple(POForEach.java:321)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:280)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:275)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:65)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.ArithmeticException: Non-terminating decimal expansion; 
> no exact representable decimal result.
>   at java.math.BigDecimal.divide(BigDecimal.java:1616)
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.Divide.divide(Divide.java:75)
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.Divide.genericGetNext(Divide.java:133)
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.Divide.getNextBigDecimal(Divide.java:166)
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:353)
>   ... 14 more
> Output with double in the schema:
> (10.0,3.0,3.3333333333333335)
> (5.165135113153143E7,10.0,5165135.113153143)
> (252525.252525,123.456,2045.467636445373)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (PIG-4973) Bigdecimal divison fails

2016-08-30 Thread Adam Szita (JIRA)
Adam Szita created PIG-4973:
---

 Summary: Bigdecimal divison fails
 Key: PIG-4973
 URL: https://issues.apache.org/jira/browse/PIG-4973
 Project: Pig
  Issue Type: Bug
  Components: impl
Reporter: Adam Szita
Assignee: Adam Szita


Division of BigDecimals doesn't work because we're not passing scale and
rounding information to the divide() method. In cases like 10/3 we get an
ArithmeticException:

Pig script:
grunt> A = LOAD 'decimaltest/f1' USING PigStorage(',') AS 
(id,col1:bigdecimal,col2:bigdecimal);
grunt> B = foreach A generate col1, col2, col1/col2;
grunt> dump B

Input file content:
1,10.0,3
2,51651351.13153143512,10.00
3,252525.252525,123.456

Output with bigdecimal type in the schema:

java.lang.Exception: org.apache.pig.backend.executionengine.ExecException: 
ERROR 0: Exception while executing [Divide (Name: Divide[bigdecimal] - scope-34 
Operator Key: scope-34) children: [[POProject (Name: Project[bigdecimal][0] - 
scope-32 Operator Key: scope-32) children: null at []], [POProject (Name: 
Project[bigdecimal][1] - scope-33 Operator Key: scope-33) children: null at 
[]]] at []]: java.lang.ArithmeticException: Non-terminating decimal expansion; 
no exact representable decimal result.
at 
org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
at 
org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 0: 
Exception while executing [Divide (Name: Divide[bigdecimal] - scope-34 Operator 
Key: scope-34) children: [[POProject (Name: Project[bigdecimal][0] - scope-32 
Operator Key: scope-32) children: null at []], [POProject (Name: 
Project[bigdecimal][1] - scope-33 Operator Key: scope-33) children: null at 
[]]] at []]: java.lang.ArithmeticException: Non-terminating decimal expansion; 
no exact representable decimal result.
at 
org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:364)
at 
org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:404)
at 
org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNextTuple(POForEach.java:321)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:280)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:275)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:65)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at 
org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ArithmeticException: Non-terminating decimal expansion; no 
exact representable decimal result.
at java.math.BigDecimal.divide(BigDecimal.java:1616)
at 
org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.Divide.divide(Divide.java:75)
at 
org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.Divide.genericGetNext(Divide.java:133)
at 
org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.Divide.getNextBigDecimal(Divide.java:166)
at 
org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:353)
... 14 more

Output with double in the schema:
(10.0,3.0,3.3333333333333335)
(5.165135113153143E7,10.0,5165135.113153143)
(252525.252525,123.456,2045.467636445373)
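
The failure described above can be reproduced outside Pig with plain
java.math.BigDecimal. The sketch below is only an illustration: the class and
method names are hypothetical, and MathContext.DECIMAL128 is an assumed choice
of precision and rounding, not necessarily what the attached patch uses.

```java
import java.math.BigDecimal;
import java.math.MathContext;

class BigDecimalDivideSketch {
    // Exact division: throws ArithmeticException when the quotient has a
    // non-terminating decimal expansion, as with 10 / 3.
    static BigDecimal exactDivide(BigDecimal a, BigDecimal b) {
        return a.divide(b);
    }

    // Bounded division: the MathContext supplies a precision and a rounding
    // mode, so a non-terminating quotient is rounded instead of rejected.
    // DECIMAL128 (34 digits, HALF_EVEN) is an illustrative assumption.
    static BigDecimal boundedDivide(BigDecimal a, BigDecimal b) {
        return a.divide(b, MathContext.DECIMAL128);
    }

    public static void main(String[] args) {
        BigDecimal ten = new BigDecimal("10.0");
        BigDecimal three = new BigDecimal("3");
        try {
            exactDivide(ten, three);
        } catch (ArithmeticException e) {
            System.out.println("exact division failed: " + e.getMessage());
        }
        System.out.println("bounded division: " + boundedDivide(ten, three));
    }
}
```

The second and third input rows divide exactly (by 10.00) or not (by 123.456)
depending on the operands, so any fix along these lines has to pick a
precision or scale for the general case rather than rely on the quotient
terminating.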




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


ApacheCon Seville CFP closes September 9th

2016-08-30 Thread Rich Bowen
It's traditional. We wait for the last minute to get our talk proposals
in for conferences.

Well, the last minute has arrived. The CFP for ApacheCon Seville closes
on September 9th, which is less than 2 weeks away. It's time to get your
talks in, so that we can make this the best ApacheCon yet.

It's also time to discuss with your developer and user community whether
there's a track of talks that you might want to propose, so that you
have more complete coverage of your project than a talk or two.

For Apache Big Data, the relevant URLs are:
Event details:
http://events.linuxfoundation.org/events/apache-big-data-europe
CFP:
http://events.linuxfoundation.org/events/apache-big-data-europe/program/cfp

For ApacheCon Europe, the relevant URLs are:
Event details: http://events.linuxfoundation.org/events/apachecon-europe
CFP: http://events.linuxfoundation.org/events/apachecon-europe/program/cfp

This year, we'll be reviewing papers "blind" - that is, looking at the
abstracts without knowing who the speaker is. This has been shown to
eliminate the "me and my buddies" nature of many tech conferences,
producing more diversity, and more new speakers. So make sure your
abstracts clearly explain what you'll be talking about.

For further updates about ApacheCon, follow us on Twitter, @ApacheCon,
or drop by our IRC channel, #apachecon on the Freenode IRC network.

-- 
Rich Bowen
WWW: http://apachecon.com/
Twitter: @ApacheCon


[jira] [Updated] (PIG-4967) NPE in PigJobControl.run() when job status is null

2016-08-30 Thread Xiang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiang Li updated PIG-4967:
--
Attachment: PIG-4967-3.patch

> NPE in PigJobControl.run() when job status is null
> --
>
> Key: PIG-4967
> URL: https://issues.apache.org/jira/browse/PIG-4967
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.15.0
>Reporter: Xiang Li
>Assignee: Xiang Li
>Priority: Critical
> Fix For: 0.16.0
>
> Attachments: PIG-4967-0.patch, PIG-4967-1.patch, PIG-4967-2.patch, 
> PIG-4967-3.patch
>
>
> {code}
> [JobControl] ERROR org.apache.pig.backend.hadoop23.PigJobControl  - Error while trying to run jobs.
> java.lang.NullPointerException
>   at org.apache.hadoop.mapreduce.Job.getJobName(Job.java:426)
>   at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.toString(ControlledJob.java:93)
>   at java.lang.String.valueOf(String.java:2982)
>   at java.lang.StringBuilder.append(StringBuilder.java:131)
>   at org.apache.pig.backend.hadoop23.PigJobControl.run(PigJobControl.java:182)
>   at java.lang.Thread.run(Thread.java:745)
>   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:276)
> {code}
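
Read bottom-up, the trace shows a string concatenation in PigJobControl.run() calling String.valueOf() on a ControlledJob, whose toString() reaches Job.getJobName() and fails there. The sketch below reproduces that failure pattern with hypothetical stand-in classes (FakeJob, describe) instead of the Hadoop types, and shows one defensive way to build the message. It is an illustration only and makes no claim about what the attached patches or MAPREDUCE-6762 actually change.

```java
// Hypothetical demo class; FakeJob only imitates the failure pattern.
final class NullSafeLoggingSketch {

    static final class FakeJob {
        private final String name;
        private final String status; // may still be null before submission

        FakeJob(String name, String status) {
            this.name = name;
            this.status = status;
        }

        // Dereferences the status, so it fails when the status is null.
        String getJobName() {
            return name + " (" + status.trim() + ")";
        }

        @Override
        public String toString() {
            return "job name: " + getJobName();
        }
    }

    // Builds a description for logging without letting a failing
    // toString() escape into the caller's error handling path.
    static String describe(Object job) {
        try {
            return String.valueOf(job);
        } catch (NullPointerException e) {
            return "<job description unavailable>";
        }
    }

    public static void main(String[] args) {
        FakeJob broken = new FakeJob("j1", null);
        try {
            // Implicit String.valueOf(broken) -> toString() -> NPE.
            System.out.println("Error while running " + broken);
        } catch (NullPointerException e) {
            System.out.println("concatenation itself threw: " + e);
        }
        System.out.println("Error while running " + describe(broken));
    }
}
```

Catching the exception is the bluntest possible guard; checking the state before calling toString() would be cleaner where the caller has access to it.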





[jira] [Commented] (PIG-4967) NPE in PigJobControl.run() when job status is null

2016-08-30 Thread Xiang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-4967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15448539#comment-15448539
 ] 

Xiang Li commented on PIG-4967:
---

Daniel, thanks for the comments! I uploaded patch 3 with the following sentence 
removed from the comment.
-// Because MAPREDUCE-6762 fixes PIG-4967 from Hadoop side.-

> NPE in PigJobControl.run() when job status is null
> --
>
> Key: PIG-4967
> URL: https://issues.apache.org/jira/browse/PIG-4967
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.15.0
>Reporter: Xiang Li
>Assignee: Xiang Li
>Priority: Critical
> Fix For: 0.16.0
>
> Attachments: PIG-4967-0.patch, PIG-4967-1.patch, PIG-4967-2.patch, 
> PIG-4967-3.patch
>
>
> {code}
> [JobControl] ERROR org.apache.pig.backend.hadoop23.PigJobControl  - Error while trying to run jobs.
> java.lang.NullPointerException
>   at org.apache.hadoop.mapreduce.Job.getJobName(Job.java:426)
>   at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.toString(ControlledJob.java:93)
>   at java.lang.String.valueOf(String.java:2982)
>   at java.lang.StringBuilder.append(StringBuilder.java:131)
>   at org.apache.pig.backend.hadoop23.PigJobControl.run(PigJobControl.java:182)
>   at java.lang.Thread.run(Thread.java:745)
>   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:276)
> {code}





[jira] [Updated] (PIG-4969) Optimize combine case for spark mode

2016-08-30 Thread liyunzhang_intel (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liyunzhang_intel updated PIG-4969:
--
Attachment: PIG-4969.patch

> Optimize combine case for spark mode
> 
>
> Key: PIG-4969
> URL: https://issues.apache.org/jira/browse/PIG-4969
> Project: Pig
>  Issue Type: Sub-task
>  Components: spark
>Reporter: liyunzhang_intel
>Assignee: liyunzhang_intel
> Attachments: PIG-4969.patch
>
>
> Our 1 TB PigMix benchmark results show that the combine case runs slower in 
> Spark mode than in MR:
> ||Script||MR||Spark||
> |L_1|8089|10064|
> L1.pig
> {code}
> register pigperf.jar;
> A = load '/user/pig/tests/data/pigmix/page_views'
>     using org.apache.pig.test.udf.storefunc.PigPerformanceLoader()
>     as (user, action, timespent, query_term, ip_addr, timestamp,
>         estimated_revenue, page_info, page_links);
> B = foreach A generate user, (int)action as action,
>     (map[])page_info as page_info,
>     flatten((bag{tuple(map[])})page_links) as page_links;
> C = foreach B generate user,
>     (action == 1 ? page_info#'a' : page_links#'b') as header;
> D = group C by user parallel 40;
> E = foreach D generate group, COUNT(C) as cnt;
> store E into 'L1out';
> {code}
> The resulting Spark plan:
> {code}
>  [exec] #--
>  [exec] # Spark Plan  
>  [exec] #--
>  [exec] 
>  [exec] Spark node scope-38
>  [exec] E: Store(hdfs://bdpe81:8020/user/root/output/pig/L1out:org.apache.pig.builtin.PigStorage) - scope-37
>  [exec] |
>  [exec] |---E: New For Each(false,false)[tuple] - scope-42
>  [exec] |   |
>  [exec] |   Project[bytearray][0] - scope-39
>  [exec] |   |
>  [exec] |   Project[bag][1] - scope-40
>  [exec] |   
>  [exec] |   POUserFunc(org.apache.pig.builtin.COUNT$Final)[long] - scope-41
>  [exec] |   |
>  [exec] |   |---Project[bag][1] - scope-57
>  [exec] |
>  [exec] |---Reduce By(false,false)[tuple] - scope-47
>  [exec] |   |
>  [exec] |   Project[bytearray][0] - scope-48
>  [exec] |   |
>  [exec] |   POUserFunc(org.apache.pig.builtin.COUNT$Intermediate)[tuple] - scope-49
>  [exec] |   |
>  [exec] |   |---Project[bag][1] - scope-50
>  [exec] |
>  [exec] |---D: Local Rearrange[tuple]{bytearray}(false) - scope-53
>  [exec] |   |
>  [exec] |   Project[bytearray][0] - scope-55
>  [exec] |
>  [exec] |---E: New For Each(false,false)[bag] - scope-43
>  [exec] |   |
>  [exec] |   Project[bytearray][0] - scope-44
>  [exec] |   |
>  [exec] |   POUserFunc(org.apache.pig.builtin.COUNT$Initial)[tuple] - scope-45
>  [exec] |   |
>  [exec] |   |---Project[bag][1] - scope-46
>  [exec] |
>  [exec] |---Pre Combiner Local Rearrange[tuple]{Unknown} - scope-56
>  [exec] |
>  [exec] |---C: New For Each(false,false)[bag] - scope-26
>  [exec] |   |
>  [exec] |   Project[bytearray][0] - scope-13
>  [exec] |   |
>  [exec] |   POBinCond[bytearray] - scope-22
>  [exec] |   |
>  [exec] |   |---Equal To[boolean] - scope-17
>  [exec] |   |   |
>  [exec] |   |   |---Project[int][1] - scope-15
>  [exec] |   |   |
>  [exec] |   |   |---Constant(1) - scope-16
>  [exec] |   |
>  [exec] |   |---POMapLookUp[bytearray] - scope-19
>  [exec] |   |   |
>  [exec] |   |   |---Project[map][2] - scope-18
>  [exec] |   |
>  [exec] |   |---POMapLookUp[bytearray] - scope-21
>  [exec] |   |
>  [exec] |   |---Project[map][3] - scope-20
>  [exec] |
>  [exec] |---B: New For Each(false,false,false,true)[bag] - scope-12
>  [exec] |   |
>  [exec] |   Project[bytearray][0] - scope-1
>  [exec] |   |
>  [exec] |   Cast[int] - scope-4
>  [exec]
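
The plan above evaluates COUNT in three stages (COUNT$Initial, COUNT$Intermediate, COUNT$Final), with partial results merged on the map side before the Reduce By. The sketch below simulates that staged evaluation on plain Java collections to make the data flow concrete. All names in it (AlgebraicCountSketch, initial, intermediate, finalStage, countByUser) are hypothetical, it does not model Pig's tuple and bag types or COUNT's null handling, and it says nothing about how the Spark backend actually schedules these operators.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical demo class; simulates staged (combinable) counting.
final class AlgebraicCountSketch {

    // Initial stage: applied to each input record before any shuffle.
    static long initial(String record) {
        return 1L;
    }

    // Intermediate stage: merges partial counts into one partial count.
    static long intermediate(List<Long> partials) {
        long sum = 0L;
        for (long p : partials) {
            sum += p;
        }
        return sum;
    }

    // Final stage: merges the partial counts that reach the reduce side.
    static long finalStage(List<Long> partials) {
        return intermediate(partials);
    }

    // Each inner list stands for the user keys seen by one input partition.
    static Map<String, Long> countByUser(List<List<String>> partitions) {
        Map<String, List<Long>> shuffled = new TreeMap<>();
        for (List<String> partition : partitions) {
            // Map side: one partial count per record, grouped by key.
            Map<String, List<Long>> local = new HashMap<>();
            for (String user : partition) {
                local.computeIfAbsent(user, k -> new ArrayList<>()).add(initial(user));
            }
            // Combine: only one value per key and partition is shuffled.
            for (Map.Entry<String, List<Long>> e : local.entrySet()) {
                shuffled.computeIfAbsent(e.getKey(), k -> new ArrayList<>())
                        .add(intermediate(e.getValue()));
            }
        }
        // Reduce side: merge the per-partition partial counts.
        Map<String, Long> result = new TreeMap<>();
        for (Map.Entry<String, List<Long>> e : shuffled.entrySet()) {
            result.put(e.getKey(), finalStage(e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> partitions = Arrays.asList(
                Arrays.asList("alice", "bob", "alice"),
                Arrays.asList("bob", "alice"));
        System.out.println(countByUser(partitions));
    }
}
```

The benefit of combining depends on how many records share a key within a partition; with mostly distinct keys the extra stage adds work without reducing the shuffle much.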

[jira] [Commented] (PIG-4967) NPE in PigJobControl.run() when job status is null

2016-08-30 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-4967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15448412#comment-15448412
 ] 

Daniel Dai commented on PIG-4967:
-

I am fine with the fix. The line "Because MAPREDUCE-6762 fixes PIG-4967 from 
Hadoop side" can be removed as it is a little wordy. Will commit shortly.

> NPE in PigJobControl.run() when job status is null
> --
>
> Key: PIG-4967
> URL: https://issues.apache.org/jira/browse/PIG-4967
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.15.0
>Reporter: Xiang Li
>Assignee: Xiang Li
>Priority: Critical
> Fix For: 0.16.0
>
> Attachments: PIG-4967-0.patch, PIG-4967-1.patch, PIG-4967-2.patch
>
>
> {code}
> [JobControl] ERROR org.apache.pig.backend.hadoop23.PigJobControl  - Error while trying to run jobs.
> java.lang.NullPointerException
>   at org.apache.hadoop.mapreduce.Job.getJobName(Job.java:426)
>   at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.toString(ControlledJob.java:93)
>   at java.lang.String.valueOf(String.java:2982)
>   at java.lang.StringBuilder.append(StringBuilder.java:131)
>   at org.apache.pig.backend.hadoop23.PigJobControl.run(PigJobControl.java:182)
>   at java.lang.Thread.run(Thread.java:745)
>   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:276)
> {code}


