[jira] Subscription: PIG patch available
Issue Subscription
Filter: PIG patch available (31 issues)

Subscriber: pigdaily

Key         Summary
PIG-5033    MultiQueryOptimizerTez creates bad plan with union, split and FRJoin
            https://issues.apache.org/jira/browse/PIG-5033
PIG-5025    Improve TestLoad.java: use own separated folder under /tmp
            https://issues.apache.org/jira/browse/PIG-5025
PIG-4976    streaming job with store clause stuck if the script fail
            https://issues.apache.org/jira/browse/PIG-4976
PIG-4926    Modify the content of start.xml for spark mode
            https://issues.apache.org/jira/browse/PIG-4926
PIG-4922    Deadlock between SpillableMemoryManager and InternalSortedBag$SortedDataBagIterator
            https://issues.apache.org/jira/browse/PIG-4922
PIG-4918    Pig on Tez cannot switch pig.temp.dir to another fs
            https://issues.apache.org/jira/browse/PIG-4918
PIG-4897    Scope of param substitution for run/exec commands
            https://issues.apache.org/jira/browse/PIG-4897
PIG-4854    Merge spark branch to trunk
            https://issues.apache.org/jira/browse/PIG-4854
PIG-4849    pig on tez will cause tez-ui to crash,because the content from timeline server is too long.
            https://issues.apache.org/jira/browse/PIG-4849
PIG-4815    Add xml format support for 'explain' in spark engine
            https://issues.apache.org/jira/browse/PIG-4815
PIG-4788    the value BytesRead metric info always returns 0 even the length of input file is not 0 in spark engine
            https://issues.apache.org/jira/browse/PIG-4788
PIG-4745    DataBag should protect content of passed list of tuples
            https://issues.apache.org/jira/browse/PIG-4745
PIG-4684    Exception should be changed to warning when job diagnostics cannot be fetched
            https://issues.apache.org/jira/browse/PIG-4684
PIG-4656    Improve String serialization and comparator performance in BinInterSedes
            https://issues.apache.org/jira/browse/PIG-4656
PIG-4598    Allow user defined plan optimizer rules
            https://issues.apache.org/jira/browse/PIG-4598
PIG-4551    Partition filter is not pushed down in case of SPLIT
            https://issues.apache.org/jira/browse/PIG-4551
PIG-4539    New PigUnit
            https://issues.apache.org/jira/browse/PIG-4539
PIG-4515    org.apache.pig.builtin.Distinct throws ClassCastException
            https://issues.apache.org/jira/browse/PIG-4515
PIG-4323    PackageConverter hanging in Spark
            https://issues.apache.org/jira/browse/PIG-4323
PIG-4313    StackOverflowError in LIMIT operation on Spark
            https://issues.apache.org/jira/browse/PIG-4313
PIG-4251    Pig on Storm
            https://issues.apache.org/jira/browse/PIG-4251
PIG-4002    Disable combiner when map-side aggregation is used
            https://issues.apache.org/jira/browse/PIG-4002
PIG-3952    PigStorage accepts '-tagSplit' to return full split information
            https://issues.apache.org/jira/browse/PIG-3952
PIG-3911    Define unique fields with @OutputSchema
            https://issues.apache.org/jira/browse/PIG-3911
PIG-3877    Getting Geo Latitude/Longitude from Address Lines
            https://issues.apache.org/jira/browse/PIG-3877
PIG-3873    Geo distance calculation using Haversine
            https://issues.apache.org/jira/browse/PIG-3873
PIG-3864    ToDate(userstring, format, timezone) computes DateTime with strange handling of Daylight Saving Time with location based timezones
            https://issues.apache.org/jira/browse/PIG-3864
PIG-3851    Upgrade jline to 2.11
            https://issues.apache.org/jira/browse/PIG-3851
PIG-3668    COR built-in function when atleast one of the coefficient values is NaN
            https://issues.apache.org/jira/browse/PIG-3668
PIG-3587    add functionality for rolling over dates
            https://issues.apache.org/jira/browse/PIG-3587
PIG-3087    Refactor TestLogicalPlanBuilder to be meaningful
            https://issues.apache.org/jira/browse/PIG-3087

You may edit this subscription at:
https://issues.apache.org/jira/secure/FilterSubscription!default.jspa?subId=16328&filterId=12322384
[jira] Subscription: PIG patch available
Issue Subscription
Filter: PIG patch available (27 issues)

Subscriber: pigdaily

Key         Summary
PIG-4926    Modify the content of start.xml for spark mode
            https://issues-test.apache.org/jira/browse/PIG-4926
PIG-4922    Deadlock between SpillableMemoryManager and InternalSortedBag$SortedDataBagIterator
            https://issues-test.apache.org/jira/browse/PIG-4922
PIG-4918    Pig on Tez cannot switch pig.temp.dir to another fs
            https://issues-test.apache.org/jira/browse/PIG-4918
PIG-4897    Scope of param substitution for run/exec commands
            https://issues-test.apache.org/jira/browse/PIG-4897
PIG-4886    Add PigSplit#getLocationInfo to fix the NPE found in log in spark mode
            https://issues-test.apache.org/jira/browse/PIG-4886
PIG-4854    Merge spark branch to trunk
            https://issues-test.apache.org/jira/browse/PIG-4854
PIG-4849    pig on tez will cause tez-ui to crash,because the content from timeline server is too long.
            https://issues-test.apache.org/jira/browse/PIG-4849
PIG-4788    the value BytesRead metric info always returns 0 even the length of input file is not 0 in spark engine
            https://issues-test.apache.org/jira/browse/PIG-4788
PIG-4745    DataBag should protect content of passed list of tuples
            https://issues-test.apache.org/jira/browse/PIG-4745
PIG-4684    Exception should be changed to warning when job diagnostics cannot be fetched
            https://issues-test.apache.org/jira/browse/PIG-4684
PIG-4656    Improve String serialization and comparator performance in BinInterSedes
            https://issues-test.apache.org/jira/browse/PIG-4656
PIG-4598    Allow user defined plan optimizer rules
            https://issues-test.apache.org/jira/browse/PIG-4598
PIG-4551    Partition filter is not pushed down in case of SPLIT
            https://issues-test.apache.org/jira/browse/PIG-4551
PIG-4539    New PigUnit
            https://issues-test.apache.org/jira/browse/PIG-4539
PIG-4515    org.apache.pig.builtin.Distinct throws ClassCastException
            https://issues-test.apache.org/jira/browse/PIG-4515
PIG-4323    PackageConverter hanging in Spark
            https://issues-test.apache.org/jira/browse/PIG-4323
PIG-4313    StackOverflowError in LIMIT operation on Spark
            https://issues-test.apache.org/jira/browse/PIG-4313
PIG-4251    Pig on Storm
            https://issues-test.apache.org/jira/browse/PIG-4251
PIG-4002    Disable combiner when map-side aggregation is used
            https://issues-test.apache.org/jira/browse/PIG-4002
PIG-3952    PigStorage accepts '-tagSplit' to return full split information
            https://issues-test.apache.org/jira/browse/PIG-3952
PIG-3911    Define unique fields with @OutputSchema
            https://issues-test.apache.org/jira/browse/PIG-3911
PIG-3877    Getting Geo Latitude/Longitude from Address Lines
            https://issues-test.apache.org/jira/browse/PIG-3877
PIG-3873    Geo distance calculation using Haversine
            https://issues-test.apache.org/jira/browse/PIG-3873
PIG-3864    ToDate(userstring, format, timezone) computes DateTime with strange handling of Daylight Saving Time with location based timezones
            https://issues-test.apache.org/jira/browse/PIG-3864
PIG-3851    Upgrade jline to 2.11
            https://issues-test.apache.org/jira/browse/PIG-3851
PIG-3668    COR built-in function when atleast one of the coefficient values is NaN
            https://issues-test.apache.org/jira/browse/PIG-3668
PIG-3587    add functionality for rolling over dates
            https://issues-test.apache.org/jira/browse/PIG-3587

You may edit this subscription at:
https://issues-test.apache.org/jira/secure/FilterSubscription!default.jspa?subId=16328&filterId=12322384
[jira] [Commented] (PIG-4976) streaming job with store clause stuck if the script fail
[ https://issues.apache.org/jira/browse/PIG-4976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15510109#comment-15510109 ]

Koji Noguchi commented on PIG-4976:
-----------------------------------

Nandor, granted I've never used this feature myself, but from
http://pig.apache.org/docs/r0.16.0/basic.html#define-udfs
I'm guessing {{define CMD `perl kk.pl` output('foo')}} means the streaming command (here, it would be perl) would write all its output to file 'foo'. Then PigStreaming would 'deserialize' that output into Tuple form and pass the tuples to the next call.

It is the responsibility of the streaming process to create the file. I don't want the framework creating an empty output file and risking false positives. (With the current code it should still fail, but why risk it?)

> streaming job with store clause stuck if the script fail
> ---------------------------------------------------------
>
>                 Key: PIG-4976
>                 URL: https://issues.apache.org/jira/browse/PIG-4976
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>             Fix For: 0.17.0
>
>         Attachments: PIG-4976-1.patch, PIG-4976-2.patch, PIG-4976-3.patch,
> PIG-4976-4.patch, PIG-4976-5-knoguchi.patch
>
>
> When investigating PIG-4972, I also notice Pig job stuck when the perl script
> have syntax error. This happens if we have output clause in stream
> specification (means use a file as staging). The bug exist in both Tez and
> MR, and it is not a regression.
> Here is an example:
> {code}
> define CMD `perl kk.pl` output('foo') ship('kk.pl');
> A = load 'studenttab10k' as (name, age, gpa);
> B = foreach A generate name;
> C = stream B through CMD;
> store C into 'ooo';
> {code}
> kk.pl is any perl script contain a syntax error.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (PIG-4976) streaming job with store clause stuck if the script fail
[ https://issues.apache.org/jira/browse/PIG-4976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15509610#comment-15509610 ]

Nandor Kollar commented on PIG-4976:
------------------------------------

"File not created is expected."

[~knoguchi], does this mean that the file should already exist? If I'd like to use '/tmp/foo' as an output (define CMD `perl kk.pl` output('/tmp/foo') ship('kk.pl')), do I have to create the file before I execute the Pig script? Otherwise it will fail, I guess.

> streaming job with store clause stuck if the script fail
> ---------------------------------------------------------
>
>                 Key: PIG-4976
>                 URL: https://issues.apache.org/jira/browse/PIG-4976
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>             Fix For: 0.17.0
>
>         Attachments: PIG-4976-1.patch, PIG-4976-2.patch, PIG-4976-3.patch,
> PIG-4976-4.patch, PIG-4976-5-knoguchi.patch
>
>
> When investigating PIG-4972, I also notice Pig job stuck when the perl script
> have syntax error. This happens if we have output clause in stream
> specification (means use a file as staging). The bug exist in both Tez and
> MR, and it is not a regression.
> Here is an example:
> {code}
> define CMD `perl kk.pl` output('foo') ship('kk.pl');
> A = load 'studenttab10k' as (name, age, gpa);
> B = foreach A generate name;
> C = stream B through CMD;
> store C into 'ooo';
> {code}
> kk.pl is any perl script contain a syntax error.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)