[jira] Subscription: PIG patch available
Issue Subscription
Filter: PIG patch available (41 issues)

Subscriber: pigdaily

Key         Summary
PIG-5228    Orc_2 is failing with spark exec type
            https://issues.apache.org/jira/browse/PIG-5228
PIG-5225    Several unit tests are not annotated with @Test
            https://issues.apache.org/jira/browse/PIG-5225
PIG-5218    Jyhton_Checkin_3 fails with spark exec type
            https://issues.apache.org/jira/browse/PIG-5218
PIG-5207    BugFix e2e tests fail on spark
            https://issues.apache.org/jira/browse/PIG-5207
PIG-5199    exclude jline in spark dependency
            https://issues.apache.org/jira/browse/PIG-5199
PIG-5194    HiveUDF fails with Spark exec type
            https://issues.apache.org/jira/browse/PIG-5194
PIG-5186    Support aggregate warnings with Spark engine
            https://issues.apache.org/jira/browse/PIG-5186
PIG-5185    Job name show "DefaultJobName" when running a Python script
            https://issues.apache.org/jira/browse/PIG-5185
PIG-5184    set command to view value of a variable
            https://issues.apache.org/jira/browse/PIG-5184
PIG-5160    SchemaTupleFrontend.java is not thread safe, cause PigServer thrown NPE in multithread env
            https://issues.apache.org/jira/browse/PIG-5160
PIG-5135    HDFS bytes read stats are always 0 in Spark mode
            https://issues.apache.org/jira/browse/PIG-5135
PIG-5115    Builtin AvroStorage generates incorrect avro schema when the same pig field name appears in the alias
            https://issues.apache.org/jira/browse/PIG-5115
PIG-5106    Optimize when mapreduce.input.fileinputformat.input.dir.recursive set to true
            https://issues.apache.org/jira/browse/PIG-5106
PIG-5081    Can not run pig on spark source code distribution
            https://issues.apache.org/jira/browse/PIG-5081
PIG-5080    Support store alias as spark table
            https://issues.apache.org/jira/browse/PIG-5080
PIG-5057    IndexOutOfBoundsException when pig reducer processOnePackageOutput
            https://issues.apache.org/jira/browse/PIG-5057
PIG-5029    Optimize sort case when data is skewed
            https://issues.apache.org/jira/browse/PIG-5029
PIG-4926    Modify the content of start.xml for spark mode
            https://issues.apache.org/jira/browse/PIG-4926
PIG-4913    Reduce jython function initiation during compilation
            https://issues.apache.org/jira/browse/PIG-4913
PIG-4849    pig on tez will cause tez-ui to crash,because the content from timeline server is too long.
            https://issues.apache.org/jira/browse/PIG-4849
PIG-4750    REPLACE_MULTI should compile Pattern once and reuse it
            https://issues.apache.org/jira/browse/PIG-4750
PIG-4748    DateTimeWritable forgets Chronology
            https://issues.apache.org/jira/browse/PIG-4748
PIG-4745    DataBag should protect content of passed list of tuples
            https://issues.apache.org/jira/browse/PIG-4745
PIG-4684    Exception should be changed to warning when job diagnostics cannot be fetched
            https://issues.apache.org/jira/browse/PIG-4684
PIG-4656    Improve String serialization and comparator performance in BinInterSedes
            https://issues.apache.org/jira/browse/PIG-4656
PIG-4598    Allow user defined plan optimizer rules
            https://issues.apache.org/jira/browse/PIG-4598
PIG-4551    Partition filter is not pushed down in case of SPLIT
            https://issues.apache.org/jira/browse/PIG-4551
PIG-4539    New PigUnit
            https://issues.apache.org/jira/browse/PIG-4539
PIG-4515    org.apache.pig.builtin.Distinct throws ClassCastException
            https://issues.apache.org/jira/browse/PIG-4515
PIG-4323    PackageConverter hanging in Spark
            https://issues.apache.org/jira/browse/PIG-4323
PIG-4313    StackOverflowError in LIMIT operation on Spark
            https://issues.apache.org/jira/browse/PIG-4313
PIG-4251    Pig on Storm
            https://issues.apache.org/jira/browse/PIG-4251
PIG-4002    Disable combiner when map-side aggregation is used
            https://issues.apache.org/jira/browse/PIG-4002
PIG-3952    PigStorage accepts '-tagSplit' to return full split information
            https://issues.apache.org/jira/browse/PIG-3952
PIG-3911    Define unique fields with @OutputSchema
            https://issues.apache.org/jira/browse/PIG-3911
PIG-3877    Getting Geo Latitude/Longitude from Address Lines
            https://issues.apache.org/jira/browse/PIG-3877
PIG-3873    Geo distance calculation using Haversine
            https://issues.apache.org/jira/browse/PIG-3873
PIG-3864    ToDate(userstring, format, timezone) computes DateTime with strange handling of Daylight Saving Time with location based timezones
            https://issues.apache.org/jira/browse/PIG-3864
PIG-3668    COR built-in function when atleast one of the coefficient values is NaN
            https://issues.apache.org/jira/browse/PIG-3668
PIG-3587    add functionality for rol
[jira] [Updated] (PIG-5215) Merge changes from review board to spark branch
[ https://issues.apache.org/jira/browse/PIG-5215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

liyunzhang_intel updated PIG-5215:
----------------------------------
    Issue Type: Bug  (was: Sub-task)
        Parent:     (was: PIG-4059)

> Merge changes from review board to spark branch
> -----------------------------------------------
>
>                 Key: PIG-5215
>                 URL: https://issues.apache.org/jira/browse/PIG-5215
>             Project: Pig
>          Issue Type: Bug
>          Components: spark
>            Reporter: liyunzhang_intel
>             Fix For: spark-branch
>
>         Attachments: PIG-5215.1.patch, PIG-5215.3.patch, PIG-5215.patch
>
>
> In the [review board|https://reviews.apache.org/r/57317/], there are comments
> from the community. After the review board is closed, merge these changes to
> the spark branch.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Commented] (PIG-5218) Jyhton_Checkin_3 fails with spark exec type
[ https://issues.apache.org/jira/browse/PIG-5218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16000247#comment-16000247 ]

liyunzhang_intel commented on PIG-5218:
---------------------------------------

[~rohini]: committed to the branch.
[~szita]: thanks for the contribution.

> Jyhton_Checkin_3 fails with spark exec type
> -------------------------------------------
>
>                 Key: PIG-5218
>                 URL: https://issues.apache.org/jira/browse/PIG-5218
>             Project: Pig
>          Issue Type: Sub-task
>          Components: spark
>            Reporter: Adam Szita
>            Assignee: Adam Szita
>             Fix For: spark-branch
>
>         Attachments: PIG-5218.0.patch, PIG-5218.1.patch
>
>
> Exception observed:
> {code}
> Caused by: java.lang.ClassCastException: org.apache.commons.logging.impl.SLF4JLocationAwareLog cannot be cast to org.apache.commons.logging.impl.Log4JLogger
>         at org.apache.hadoop.test.GenericTestUtils.setLogLevel(GenericTestUtils.java:107)
>         at org.apache.hadoop.fs.FileContextCreateMkdirBaseTest.(FileContextCreateMkdirBaseTest.java:60)
>         ... 29 more
> {code}
[jira] [Commented] (PIG-5228) Orc_2 is failing with spark exec type
[ https://issues.apache.org/jira/browse/PIG-5228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16000233#comment-16000233 ]

Rohini Palaniswamy commented on PIG-5228:
-----------------------------------------

bq. I think it may be dependent on HashMap implementation and thus JDK version as well.
If you are running on a different JDK version, it could be an issue, as the HashMap implementation changed between 1.6 and 1.7. For the same JDK version, insertion behavior is usually consistent.

bq. My wild guess is that either MR or Spark makes an extra filling of a map somewhere under the hood and that's where the difference comes from.
The ordering of entries in a map is not something we guarantee. But can you still try to find out why it is happening? I am surprised you are running into this with just simple load and store statements on the same JDK version. It could have something to do with serialization as well.

bq. so I've created a new a test case where we project by each key from the map.
If figuring out the cause of the change takes more time, I am fine with the current patch, which has a separate test for Spark.

> Orc_2 is failing with spark exec type
> -------------------------------------
>
>                 Key: PIG-5228
>                 URL: https://issues.apache.org/jira/browse/PIG-5228
>             Project: Pig
>          Issue Type: Sub-task
>          Components: spark
>            Reporter: Adam Szita
>            Assignee: Adam Szita
>             Fix For: spark-branch
>
>         Attachments: PIG-5228.0.patch
>
>
> This test is failing due to a mismatch between the actual and expected results. The
> difference is only related to the order of entries in Pig maps, such as:
> Actual:
> {code}
> [name#alice, age#18]...
> {code}
> Expected:
> {code}
> [age#18, name#alice]...
> {code}
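The ordering point raised in the PIG-5228 comment can be illustrated with a small, self-contained Java sketch. It is not taken from the patch and does not use any Pig API: the class name MapOrderSketch and the helpers parsePigMap and sameEntries are hypothetical, and the parser only handles flat maps with string values in the "[key#value, key#value]" form shown in the issue (no nested maps, no commas inside values). It shows why comparing the printed text is order-sensitive while comparing the parsed entries is not.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not part of Pig or of the PIG-5228 patch.
class MapOrderSketch {

    // Parses a flat map literal such as "[name#alice, age#18]" into a Map.
    // Assumption: keys and values contain no ',' and keys contain no '#'.
    static Map<String, String> parsePigMap(String text) {
        String body = text.trim();
        if (body.length() < 2 || body.charAt(0) != '[' || body.charAt(body.length() - 1) != ']') {
            throw new IllegalArgumentException("Not a map literal: " + text);
        }
        body = body.substring(1, body.length() - 1).trim();
        Map<String, String> result = new HashMap<>();
        if (body.isEmpty()) {
            return result;
        }
        for (String entry : body.split(",")) {
            String[] keyValue = entry.trim().split("#", 2);
            if (keyValue.length != 2) {
                throw new IllegalArgumentException("Malformed entry: " + entry);
            }
            result.put(keyValue[0], keyValue[1]);
        }
        return result;
    }

    // Map.equals compares entry sets, so the order in which the
    // entries were printed does not affect the result.
    static boolean sameEntries(String actual, String expected) {
        return parsePigMap(actual).equals(parsePigMap(expected));
    }

    public static void main(String[] args) {
        String actual = "[name#alice, age#18]";
        String expected = "[age#18, name#alice]";
        System.out.println(actual.equals(expected));        // prints "false"
        System.out.println(sameEntries(actual, expected));  // prints "true"
    }
}
```

A comparison of this kind, or projecting each key separately as the quoted comment describes, keeps a test independent of the iteration order of the underlying map.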
[jira] [Commented] (PIG-5218) Jyhton_Checkin_3 fails with spark exec type
[ https://issues.apache.org/jira/browse/PIG-5218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16000222#comment-16000222 ]

Rohini Palaniswamy commented on PIG-5218:
-----------------------------------------

+1. [~kellyzly], can you commit this into the spark branch?

> Jyhton_Checkin_3 fails with spark exec type
> -------------------------------------------
>
>                 Key: PIG-5218
>                 URL: https://issues.apache.org/jira/browse/PIG-5218
>             Project: Pig
>          Issue Type: Sub-task
>          Components: spark
>            Reporter: Adam Szita
>            Assignee: Adam Szita
>             Fix For: spark-branch
>
>         Attachments: PIG-5218.0.patch, PIG-5218.1.patch
>
>
> Exception observed:
> {code}
> Caused by: java.lang.ClassCastException: org.apache.commons.logging.impl.SLF4JLocationAwareLog cannot be cast to org.apache.commons.logging.impl.Log4JLogger
>         at org.apache.hadoop.test.GenericTestUtils.setLogLevel(GenericTestUtils.java:107)
>         at org.apache.hadoop.fs.FileContextCreateMkdirBaseTest.(FileContextCreateMkdirBaseTest.java:60)
>         ... 29 more
> {code}
[jira] [Resolved] (PIG-5197) Replace IndexedKey with PigNullableWritable in spark branch
[ https://issues.apache.org/jira/browse/PIG-5197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy resolved PIG-5197.
-------------------------------------
    Resolution: Won't Fix

I don't have a suggestion to work around that. Closing it as Won't Fix for now. If there are performance and/or comparator issues, we will revisit.

> Replace IndexedKey with PigNullableWritable in spark branch
> -----------------------------------------------------------
>
>                 Key: PIG-5197
>                 URL: https://issues.apache.org/jira/browse/PIG-5197
>             Project: Pig
>          Issue Type: Sub-task
>          Components: spark
>            Reporter: liyunzhang_intel
>             Fix For: spark-branch
>
>         Attachments: PIG-5197.patch
>
>
> IndexedKey and PigNullableWritable serve a similar function.
> The difference between the two is that IndexedKey contains index and key, while
> PigNullableWritable contains index, key, and value.
> Besides, the comparators for PigNullableWritable take care of a lot of conditions for
> the different data types, and IndexedKey can miss some of those.
> We can try to replace IndexedKey with PigNullableWritable.