[jira] [Commented] (PIG-5463) Pig on Tez TestDateTime.testLocalExecution failing on hadoop3/tez-0.10
[ https://issues.apache.org/jira/browse/PIG-5463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17888440#comment-17888440 ] Rohini Palaniswamy commented on PIG-5463: - +1 > Pig on Tez TestDateTime.testLocalExecution failing on hadoop3/tez-0.10 > -- > > Key: PIG-5463 > URL: https://issues.apache.org/jira/browse/PIG-5463 > Project: Pig > Issue Type: Test >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Minor > Fix For: 0.19.0 > > Attachments: pig-5463-v01.patch, pig-5463-v02.patch > > > Somehow TestDateTime testLocalExecution started failing on Pig on Tez with > hadoop3. > {noformat} > 2024-09-11 10:50:29,815 [IPC Server handler 30 on default port 34089] WARN > org.apache.hadoop.yarn.server.resourcemanager.DefaultAMSProcessor - Invalid > resource ask by application appattempt_1726051802536_0001_01 > org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid > resource request! Cannot allocate containers as requested resource is less > than 0! Requested resource type=[memory-mb], Requested resource= vCores:1> > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.throwInvalidResourceException(SchedulerUtils.java:525) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.checkResourceRequestAgainstAvailableResource(SchedulerUtils.java:415) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.validateResourceRequest(SchedulerUtils.java:349) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:304) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:312) > at > org.apache.hadoop.yarn.server.resourcemanager.RMServerUtils.normalizeAndValidateRequests(RMServerUtils.java:268) > at > org.apache.hadoop.yarn.server.resourcemanager.DefaultAMSProcessor.allocate(DefaultAMSProcessor.java:254) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.processor.DisabledPlacementProcessor.allocate(DisabledPlacementProcessor.java:75) > at > org.apache.hadoop.yarn.server.resourcemanager.AMSProcessingChain.allocate(AMSProcessingChain.java:93) > at > org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:434) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.allocate(ApplicationMasterProtocolPBServiceImpl.java:60) > at > org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:105) > at > org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:621) > at > org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:589) > at > org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:573) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1227) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1094) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1017) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:3048) > {noformat} > Weird part is, it passes when tested alone or tested twice (with copy&paste). 
[jira] [Commented] (PIG-5454) Make ParallelGC the default Garbage Collection
[ https://issues.apache.org/jira/browse/PIG-5454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17888416#comment-17888416 ] Rohini Palaniswamy commented on PIG-5454: - +1 > Make ParallelGC the default Garbage Collection > -- > > Key: PIG-5454 > URL: https://issues.apache.org/jira/browse/PIG-5454 > Project: Pig > Issue Type: Bug > Components: impl >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Major > Attachments: pig-5454-v01.patch, pig-5454-v02.patch, > pig-5454-v03.patch, pig-5454-v04.patch > > > From JDK9 and beyond, G1GC became the default GC. > I've seen our users hitting OOM after migrating to recent jdk and the issue > going away after reverting back to ParallelGC. > Maybe the GC behavior assumed by SelfSpillBag does not work with G1GC. -- This message was sent by Atlassian Jira (v8.20.10#820010)
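For context, reverting the task JVMs to ParallelGC (the pre-JDK9 default) only takes a single JVM flag; a hypothetical illustration of the workaround described above, appended to whatever java.opts a job already uses (the placeholder is not from the issue):
{noformat}
mapreduce.map.java.opts=<existing options> -XX:+UseParallelGC
mapreduce.reduce.java.opts=<existing options> -XX:+UseParallelGC
{noformat}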
[jira] [Commented] (PIG-5465) Owasp filter out false positives
[ https://issues.apache.org/jira/browse/PIG-5465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17888387#comment-17888387 ] Rohini Palaniswamy commented on PIG-5465: - +1 > Owasp filter out false positives > > > Key: PIG-5465 > URL: https://issues.apache.org/jira/browse/PIG-5465 > Project: Pig > Issue Type: Improvement >Reporter: Koji Noguchi >Priority: Minor > Attachments: pig-owasp.patch > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (PIG-5460) Allow Tez to be launched from mapreduce job
[ https://issues.apache.org/jira/browse/PIG-5460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17885424#comment-17885424 ] Rohini Palaniswamy commented on PIG-5460: - +1 > Allow Tez to be launched from mapreduce job > --- > > Key: PIG-5460 > URL: https://issues.apache.org/jira/browse/PIG-5460 > Project: Pig > Issue Type: Improvement >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Minor > Attachments: pig-5460-v01.patch, pig-5460-v02.patch > > > It's like Oozie but not using Oozie launcher. > I would like to be able to submit Pig on Tez job from the mapper task. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5419) Upgrade Joda time version
[ https://issues.apache.org/jira/browse/PIG-5419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5419: Fix Version/s: 0.18.0 (was: 0.18.1) Hadoop Flags: Reviewed Patch Info: Patch Available +1. Thanks Venkat > Upgrade Joda time version > - > > Key: PIG-5419 > URL: https://issues.apache.org/jira/browse/PIG-5419 > Project: Pig > Issue Type: Improvement >Reporter: Venkatasubrahmanian Narayanan >Assignee: Venkatasubrahmanian Narayanan >Priority: Minor > Fix For: 0.18.0 > > Attachments: PIG-5419-v2.patch, PIG-5419.patch > > > Pig depends on an older version of Joda time, which can result in conflicts > with other versions in some workflows. Upgrading it to the latest > version(2.10.13) will resolve Pig's side of such issues. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (PIG-5380) SortedDataBag hitting ConcurrentModificationException or producing incorrect output in a corner-case
[ https://issues.apache.org/jira/browse/PIG-5380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17884310#comment-17884310 ] Rohini Palaniswamy commented on PIG-5380: - I think moving the reading from memory before the spill files might have problems with the ordering. > SortedDataBag hitting ConcurrentModificationException or producing incorrect > output in a corner-case > - > > Key: PIG-5380 > URL: https://issues.apache.org/jira/browse/PIG-5380 > Project: Pig > Issue Type: Bug >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Major > Attachments: pig-5380-v01.patch, pig-5380-v02.patch, > pig-5380-v03.patch > > > User had a UDF that created large SortedDataBag. This UDF was failing with > {noformat} > java.util.ConcurrentModificationException > at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:901) > at java.util.ArrayList$Itr.next(ArrayList.java:851) > at > org.apache.pig.data.SortedDataBag$SortedDataBagIterator.readFromPriorityQ(SortedDataBag.java:346) > at > org.apache.pig.data.SortedDataBag$SortedDataBagIterator.next(SortedDataBag.java:322) > at > org.apache.pig.data.SortedDataBag$SortedDataBagIterator.hasNext(SortedDataBag.java:235) > {noformat} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (PIG-5454) Make ParallelGC the default Garbage Collection
[ https://issues.apache.org/jira/browse/PIG-5454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17884287#comment-17884287 ] Rohini Palaniswamy commented on PIG-5454: - Just one minor comment. Make it params instead of param. i.e public static final String PIG_GC_DEFAULT_PARAMS = "pig.gc.default.params"; > Make ParallelGC the default Garbage Collection > -- > > Key: PIG-5454 > URL: https://issues.apache.org/jira/browse/PIG-5454 > Project: Pig > Issue Type: Bug > Components: impl >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Major > Attachments: pig-5454-v01.patch, pig-5454-v02.patch, > pig-5454-v03.patch > > > From JDK9 and beyond, G1GC became the default GC. > I've seen our users hitting OOM after migrating to recent jdk and the issue > going away after reverting back to ParallelGC. > Maybe the GC behavior assumed by SelfSpillBag does not work with G1GC. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (PIG-5456) Upgrade Spark to 3.4.3
[ https://issues.apache.org/jira/browse/PIG-5456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17884286#comment-17884286 ] Rohini Palaniswamy commented on PIG-5456: - +1 > Upgrade Spark to 3.4.3 > -- > > Key: PIG-5456 > URL: https://issues.apache.org/jira/browse/PIG-5456 > Project: Pig > Issue Type: Improvement > Components: spark >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Major > Fix For: 0.19.0 > > Attachments: pig-5456-v01.patch, pig-5456-v02.patch > > > Major blocker for upgrading to Spark 3.4.3 was Spark started using log4j2. > Simple upgrade failing a lot of tests with > {noformat} > java.lang.VerifyError: class org.apache.log4j.bridge.LogEventAdapter > overrides final method getTimeStamp.()J {noformat} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (PIG-5459) Jython_Checkin_3 e2e failing with NoClassDefFoundError (hadoop3)
[ https://issues.apache.org/jira/browse/PIG-5459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17882811#comment-17882811 ] Rohini Palaniswamy commented on PIG-5459: - +1 > Jython_Checkin_3 e2e failing with NoClassDefFoundError (hadoop3) > > > Key: PIG-5459 > URL: https://issues.apache.org/jira/browse/PIG-5459 > Project: Pig > Issue Type: Bug >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Minor > Attachments: pig-5459-v01.patch > > > {noformat} > turing_jython.conf/Jython_Checkin_3.pig", line 4, in _module_ > from org.apache.hadoop.conf import * > java.lang.NoClassDefFoundError: Lorg/junit/rules/ExpectedException; > at java.lang.Class.getDeclaredFields0(Native Method) > at java.lang.Class.privateGetDeclaredFields(Class.java:2583) > at java.lang.Class.privateGetPublicFields(Class.java:2614) > at java.lang.Class.getFields(Class.java:1557) > at org.python.core.PyJavaType.init(PyJavaType.java:419) > at org.python.core.PyType.createType(PyType.java:1523) > at org.python.core.PyType.addFromClass(PyType.java:1462) > at org.python.core.PyType.fromClass(PyType.java:1551) > at > org.python.core.adapter.ClassicPyObjectAdapter$6.adapt(ClassicPyObjectAdapter.java:77) > at > org.python.core.adapter.ExtensiblePyObjectAdapter.adapt(ExtensiblePyObjectAdapter.java:44) > at > org.python.core.adapter.ClassicPyObjectAdapter.adapt(ClassicPyObjectAdapter.java:131) > at org.python.core.Py.java2py(Py.java:2017) > at org.python.core.PyJavaPackage.addClass(PyJavaPackage.java:86) > at > org.python.core.packagecache.PackageManager.basicDoDir(PackageManager.java:113) > at > org.python.core.packagecache.SysPackageManager.doDir(SysPackageManager.java:148) > at org.python.core.PyJavaPackage.fillDir(PyJavaPackage.java:120) > at org.python.core.imp.importAll(imp.java:1189) > at org.python.core.imp.importAll(imp.java:1177) > at > org.python.pycode._pyx0.f$0(/tmp/yarn-local/usercache/.../gtrain-1722336537-turing_jython.conf/Jython_Checkin_3.pig:8) > at > org.python.pycode._pyx0.call_function(/tmp/yarn-local/usercache...gtrain-1722336537-tu/ring_jython.conf/Jython_Checkin_3.pig) > at org.python.core.PyTableCode.call(PyTableCode.java:171) > at org.python.core.PyCode.call(PyCode.java:18) > at org.python.core.Py.runCode(Py.java:1614) > at org.python.util.PythonInterpreter.execfile(PythonInterpreter.java:296) > at > org.apache.pig.scripting.jython.JythonScriptEngine$Interpreter.execfile(JythonScriptEngine.java:217) > at > org.apache.pig.scripting.jython.JythonScriptEngine.load(JythonScriptEngine.java:440) > at > org.apache.pig.scripting.jython.JythonScriptEngine.main(JythonScriptEngine.java:424) > at org.apache.pig.scripting.ScriptEngine.run(ScriptEngine.java:310) > at org.apache.pig.Main.runEmbeddedScript(Main.java:1096) > at org.apache.pig.Main.run(Main.java:584) > at org.apache.pig.Main.main(Main.java:175) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at org.apache.hadoop.util.RunJar.run(RunJar.java:328) > at org.apache.hadoop.util.RunJar.main(RunJar.java:241) > Caused by: java.lang.ClassNotFoundException: org.junit.rules.ExpectedException > at java.net.URLClassLoader.findClass(URLClassLoader.java:382) > at java.lang.ClassLoader.loadClass(ClassLoader.java:418) > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352) > 
at java.lang.ClassLoader.loadClass(ClassLoader.java:351) > ... 37 more > java.lang.NoClassDefFoundError: java.lang.NoClassDefFoundError: > Lorg/junit/rules/ExpectedException; > {noformat} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (PIG-5451) Pig-on-Spark3 E2E Orc_Pushdown_5 failing
[ https://issues.apache.org/jira/browse/PIG-5451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17882810#comment-17882810 ] Rohini Palaniswamy commented on PIG-5451: - +1 > Pig-on-Spark3 E2E Orc_Pushdown_5 failing > - > > Key: PIG-5451 > URL: https://issues.apache.org/jira/browse/PIG-5451 > Project: Pig > Issue Type: Bug >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Minor > Attachments: pig-9-5451-v01.patch > > > Test failing with > "java.lang.IllegalAccessError: class org.threeten.extra.chrono.HybridDate > cannot access its superclass org.threeten.extra.chrono.AbstractDate" -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (PIG-5420) Update accumulo dependency to 1.10.1
[ https://issues.apache.org/jira/browse/PIG-5420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17882809#comment-17882809 ] Rohini Palaniswamy commented on PIG-5420: - +1 > Update accumulo dependency to 1.10.1 > > > Key: PIG-5420 > URL: https://issues.apache.org/jira/browse/PIG-5420 > Project: Pig > Issue Type: Improvement >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Trivial > Fix For: 0.18.1 > > Attachments: pig-5420-v01.patch, pig-9-5420-v02.patch > > > Following owasp/cve report. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (PIG-5460) Allow Tez to be launched from mapreduce job
[ https://issues.apache.org/jira/browse/PIG-5460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17882806#comment-17882806 ] Rohini Palaniswamy commented on PIG-5460: - Change should just be
{code:java}
String tokenFile = System.getenv("HADOOP_TOKEN_FILE_LOCATION");
if (tokenFile != null && globalConf.get(MRConfiguration.JOB_CREDENTIALS_BINARY) == null) {
    globalConf.set(MRConfiguration.JOB_CREDENTIALS_BINARY, tokenFile);
    globalConf.set("tez.credentials.path", tokenFile);
}
{code}
SecurityHelper.populateTokenCache will take care of reading from that. It would be even better if you can put the above into a configureCredentialFile(Configuration conf) method in SecurityHelper instead of TezDAGBuilder and just call it from there, so that all related code is in one place.
> Allow Tez to be launched from mapreduce job > --- > > Key: PIG-5460 > URL: https://issues.apache.org/jira/browse/PIG-5460 > Project: Pig > Issue Type: Improvement >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Minor > Attachments: pig-5460-v01.patch > > > It's like Oozie but not using Oozie launcher. > I would like to be able to submit Pig on Tez job from the mapper task. -- This message was sent by Atlassian Jira (v8.20.10#820010)
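For reference, a minimal sketch of the configureCredentialFile(Configuration conf) helper suggested in that comment, shown standalone; the package placement, class name, and import paths are assumptions, only the method body comes from the snippet above:
{code:java}
package org.apache.pig.backend.hadoop.executionengine.tez; // placement is an assumption

import org.apache.hadoop.conf.Configuration;
import org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRConfiguration;

/** Sketch only: one place for the token-file handling, callable from TezDagBuilder and others. */
public final class SecurityHelperSketch {

    /** Points the job (and Tez) at the delegation-token file handed to the launching task, if any. */
    public static void configureCredentialFile(Configuration conf) {
        // HADOOP_TOKEN_FILE_LOCATION is set for tasks launched with delegation tokens
        String tokenFile = System.getenv("HADOOP_TOKEN_FILE_LOCATION");
        if (tokenFile != null && conf.get(MRConfiguration.JOB_CREDENTIALS_BINARY) == null) {
            conf.set(MRConfiguration.JOB_CREDENTIALS_BINARY, tokenFile);
            conf.set("tez.credentials.path", tokenFile);
        }
    }
}
{code}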
[jira] [Commented] (PIG-5458) Update metrics-core.version
[ https://issues.apache.org/jira/browse/PIG-5458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17882807#comment-17882807 ] Rohini Palaniswamy commented on PIG-5458: - +1 > Update metrics-core.version > > > Key: PIG-5458 > URL: https://issues.apache.org/jira/browse/PIG-5458 > Project: Pig > Issue Type: Improvement >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Trivial > Attachments: pig-5458-v01.patch > > > Hadoop3 uses metrics-core.version of 3.2.4 from io.dropwizard.metrics > and > Hadoop2 uses metrics-core.version of 3.0.1 from com.codahale.metrics. > I believe one from com.yammer.metrics (2.1.2) can be dropped. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (PIG-5461) E2E environment variables ignored
[ https://issues.apache.org/jira/browse/PIG-5461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17882803#comment-17882803 ] Rohini Palaniswamy commented on PIG-5461: - +1 > E2E environment variables ignored > - > > Key: PIG-5461 > URL: https://issues.apache.org/jira/browse/PIG-5461 > Project: Pig > Issue Type: Test >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Trivial > Attachments: pig-5461-v01.patch > > > When running e2e against Hadoop3 and using hadoop2+oldpig for verification, I > was confused why environment variables like OLD_HADOOP_HOME were ignored. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (PIG-5462) Always update Owasp version to latest
[ https://issues.apache.org/jira/browse/PIG-5462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17882802#comment-17882802 ] Rohini Palaniswamy commented on PIG-5462: - +1 > Always update Owasp version to latest > -- > > Key: PIG-5462 > URL: https://issues.apache.org/jira/browse/PIG-5462 > Project: Pig > Issue Type: Test >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Trivial > Attachments: pig-5462-v01.patch, pig-5462-v02.patch > > > While looking at owasp report, a lot of them were completely off. > (Like hadoop-shims-0.10.3 being reported as vulnerable.) > Using latest org.owasp/dependency-check-ant > (https://mvnrepository.com/artifact/org.owasp/dependency-check-ant) > seems to help cut down the false positives. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (PIG-5457) Upgrade Zookeeper to 3.7.2 (from 3.5.7)
[ https://issues.apache.org/jira/browse/PIG-5457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17882801#comment-17882801 ] Rohini Palaniswamy commented on PIG-5457: - +1 > Upgrade Zookeeper to 3.7.2 (from 3.5.7) > --- > > Key: PIG-5457 > URL: https://issues.apache.org/jira/browse/PIG-5457 > Project: Pig > Issue Type: Improvement >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Trivial > Fix For: 0.19.0 > > Attachments: pig-5457-v01.patch, pig-5457-v02.patch > > > As mentioned in PIG-5456, zookeeper-3.5.7 dependency pulls in > log4j-1.2.17.jar that we want to avoid. Updating to 3.6.4, making it same as > the dependency from hadoop 3.3.6. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (PIG-5463) Pig on Tez TestDateTime.testLocalExecution failing on hadoop3/tez-0.10
[ https://issues.apache.org/jira/browse/PIG-5463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17882798#comment-17882798 ] Rohini Palaniswamy commented on PIG-5463: - Can you just rename TestLocalDateTime.java to TestDateTimeLocal.java so that both files appear next to each other ? > Pig on Tez TestDateTime.testLocalExecution failing on hadoop3/tez-0.10 > -- > > Key: PIG-5463 > URL: https://issues.apache.org/jira/browse/PIG-5463 > Project: Pig > Issue Type: Test >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Minor > Fix For: 0.19.0 > > Attachments: pig-5463-v01.patch > > > Somehow TestDateTime testLocalExecution started failing on Pig on Tez with > hadoop3. > {noformat} > 2024-09-11 10:50:29,815 [IPC Server handler 30 on default port 34089] WARN > org.apache.hadoop.yarn.server.resourcemanager.DefaultAMSProcessor - Invalid > resource ask by application appattempt_1726051802536_0001_01 > org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid > resource request! Cannot allocate containers as requested resource is less > than 0! Requested resource type=[memory-mb], Requested resource= vCores:1> > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.throwInvalidResourceException(SchedulerUtils.java:525) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.checkResourceRequestAgainstAvailableResource(SchedulerUtils.java:415) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.validateResourceRequest(SchedulerUtils.java:349) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:304) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:312) > at > org.apache.hadoop.yarn.server.resourcemanager.RMServerUtils.normalizeAndValidateRequests(RMServerUtils.java:268) > at > org.apache.hadoop.yarn.server.resourcemanager.DefaultAMSProcessor.allocate(DefaultAMSProcessor.java:254) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.processor.DisabledPlacementProcessor.allocate(DisabledPlacementProcessor.java:75) > at > org.apache.hadoop.yarn.server.resourcemanager.AMSProcessingChain.allocate(AMSProcessingChain.java:93) > at > org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:434) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.allocate(ApplicationMasterProtocolPBServiceImpl.java:60) > at > org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:105) > at > org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:621) > at > org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:589) > at > org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:573) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1227) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1094) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1017) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:3048) 
> {noformat} > Weird part is, it passes when tested alone or tested twice (with copy&paste). -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (PIG-5455) Upgrade Hadoop to 3.3.6 and Tez to 0.10.3
[ https://issues.apache.org/jira/browse/PIG-5455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17863929#comment-17863929 ] Rohini Palaniswamy commented on PIG-5455: - +1 > Upgrade Hadoop to 3.3.6 and Tez to 0.10.3 > - > > Key: PIG-5455 > URL: https://issues.apache.org/jira/browse/PIG-5455 > Project: Pig > Issue Type: Bug >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Major > Fix For: 0.19.0 > > Attachments: pig-5455-v01.patch > > > Latest Tez (0.10.3 and later) requires Hadoop 3.3 or later > and simple upgrade of Hadoop failing the tests with > "Implementing class java.lang.IncompatibleClassChangeError: Implementing > class" > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (PIG-5439) Support Spark 3 and drop SparkShim
[ https://issues.apache.org/jira/browse/PIG-5439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17844427#comment-17844427 ] Rohini Palaniswamy commented on PIG-5439: - +1 > Support Spark 3 and drop SparkShim > -- > > Key: PIG-5439 > URL: https://issues.apache.org/jira/browse/PIG-5439 > Project: Pig > Issue Type: Improvement > Components: spark >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Major > Fix For: 0.19.0 > > Attachments: pig-5439-v01.patch, pig-5439-v02.patch > > > Support Pig-on-Spark to run on spark3. > Initial version would only run up to Spark 3.2.4 and not on 3.3 or 3.4. > This is due to log4j mismatch. > After moving to log4j2 (PIG-5426), we can move Spark to 3.3 or higher. > So far, not all unit/e2e tests pass with the proposed patch but at least > compilation goes through. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (PIG-5450) Pig-on-Spark3 E2E ORC test failing with java.lang.VerifyError: Bad return type
[ https://issues.apache.org/jira/browse/PIG-5450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17837863#comment-17837863 ] Rohini Palaniswamy commented on PIG-5450: - +1 > Pig-on-Spark3 E2E ORC test failing with java.lang.VerifyError: Bad return type > -- > > Key: PIG-5450 > URL: https://issues.apache.org/jira/browse/PIG-5450 > Project: Pig > Issue Type: Bug > Components: spark >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Major > Attachments: pig-5450-v01.patch > > > {noformat} > Caused by: java.lang.VerifyError: Bad return type > Exception Details: > Location: > org/apache/orc/impl/TypeUtils.createColumn(Lorg/apache/orc/TypeDescription;Lorg/apache/orc/TypeDescription$RowBatchVersion;I)Lorg/apache/hadoop/hive/ql/exec/vector/ColumnVector; > @117: areturn > Reason: > Type 'org/apache/hadoop/hive/ql/exec/vector/DateColumnVector' (current frame, > stack[0]) is not assignable to > 'org/apache/hadoop/hive/ql/exec/vector/ColumnVector' (from method signature) > {noformat} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (PIG-5449) TestEmptyInputDir failing on pig-on-spark3
[ https://issues.apache.org/jira/browse/PIG-5449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17837862#comment-17837862 ] Rohini Palaniswamy commented on PIG-5449: - +1 > TestEmptyInputDir failing on pig-on-spark3 > -- > > Key: PIG-5449 > URL: https://issues.apache.org/jira/browse/PIG-5449 > Project: Pig > Issue Type: Bug > Components: spark >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Major > Attachments: pig-5449-v01.patch > > > TestEmptyInputDir failing on pig-on-spark3 with > {noformat:title=TestEmptyInputDir.testMergeJoinFailure} > junit.framework.AssertionFailedError > at > org.apache.pig.test.TestEmptyInputDir.testMergeJoin(TestEmptyInputDir.java:141) > {noformat} > {noformat:title=TestEmptyInputDir.testGroupByFailure} > junit.framework.AssertionFailedError > at > org.apache.pig.test.TestEmptyInputDir.testGroupBy(TestEmptyInputDir.java:80) > {noformat} > {noformat:title=TestEmptyInputDir.testBloomJoinOuterFailure} > junit.framework.AssertionFailedError > at > org.apache.pig.test.TestEmptyInputDir.testBloomJoinOuter(TestEmptyInputDir.java:297) > {noformat} > {noformat:title=TestEmptyInputDir.testFRJoinFailure} > junit.framework.AssertionFailedError > at > org.apache.pig.test.TestEmptyInputDir.testFRJoin(TestEmptyInputDir.java:171) > {noformat} > {noformat:title=TestEmptyInputDir.testBloomJoinFailure} > junit.framework.AssertionFailedError > at > org.apache.pig.test.TestEmptyInputDir.testBloomJoin(TestEmptyInputDir.java:267) > {noformat} > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (PIG-5448) All TestHBaseStorage tests failing on pig-on-spark3
[ https://issues.apache.org/jira/browse/PIG-5448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17837861#comment-17837861 ] Rohini Palaniswamy commented on PIG-5448: - +1 > All TestHBaseStorage tests failing on pig-on-spark3 > --- > > Key: PIG-5448 > URL: https://issues.apache.org/jira/browse/PIG-5448 > Project: Pig > Issue Type: Bug > Components: spark >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Minor > Attachments: pig-5448-v01.patch > > > For Pig on Spark3 (with PIG-5439), all of the TestHBaseStorage unit tests are > failing with > {noformat} > org.apache.pig.PigException: ERROR 1002: Unable to store alias b > at org.apache.pig.PigServer.storeEx(PigServer.java:1127) > at org.apache.pig.PigServer.store(PigServer.java:1086) > at > org.apache.pig.test.TestHBaseStorage.testStoreToHBase_1_with_delete(TestHBaseStorage.java:1251) > Caused by: org.apache.pig.impl.plan.VisitorException: ERROR 0: fail to get > the rdds of this spark operator: > at > org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.visitSparkOp(JobGraphBuilder.java:115) > at > org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkOperator.visit(SparkOperator.java:140) > at > org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkOperator.visit(SparkOperator.java:37) > at > org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:87) > at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:46) > at > org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.launchPig(SparkLauncher.java:241) > at > org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:290) > at org.apache.pig.PigServer.launchPlan(PigServer.java:1479) > at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1464) > at org.apache.pig.PigServer.storeEx(PigServer.java:1123) > Caused by: java.lang.RuntimeException: No task metrics available for jobId 0 > at > org.apache.pig.tools.pigstats.spark.SparkJobStats.collectStats(SparkJobStats.java:109) > at > org.apache.pig.tools.pigstats.spark.SparkPigStats.addJobStats(SparkPigStats.java:77) > at > org.apache.pig.tools.pigstats.spark.SparkStatsUtil.waitForJobAddStats(SparkStatsUtil.java:73) > at > org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.sparkOperToRDD(JobGraphBuilder.java:225) > at > org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.visitSparkOp(JobGraphBuilder.java:112) > {noformat} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (PIG-5438) Update SparkCounter.Accumulator to AccumulatorV2
[ https://issues.apache.org/jira/browse/PIG-5438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17837860#comment-17837860 ] Rohini Palaniswamy commented on PIG-5438: - +1 > Update SparkCounter.Accumulator to AccumulatorV2 > > > Key: PIG-5438 > URL: https://issues.apache.org/jira/browse/PIG-5438 > Project: Pig > Issue Type: Improvement > Components: spark >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Trivial > Fix For: 0.19.0 > > Attachments: pig-5438-v01.patch > > > Original Accumulator is deprecated in Spark2 and gone in Spark3. > AccumulatorV2 is usable on both Spark2 and Spark3. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (PIG-5446) Tez TestPigProgressReporting.testProgressReportingWithStatusMessage failing
[ https://issues.apache.org/jira/browse/PIG-5446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17826791#comment-17826791 ] Rohini Palaniswamy commented on PIG-5446: - +1 > Tez TestPigProgressReporting.testProgressReportingWithStatusMessage failing > --- > > Key: PIG-5446 > URL: https://issues.apache.org/jira/browse/PIG-5446 > Project: Pig > Issue Type: Bug > Components: tez >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Major > Attachments: pig-5446-v01.patch > > > {noformat} > Unable to open iterator for alias B. Backend error : Vertex failed, > vertexName=scope-4, vertexId=vertex_1707216362777_0001_1_00, > diagnostics=[Task failed, taskId=task_1707216362777_0001_1_00_00, > diagnostics=[TaskAttempt 0 failed, info=[Attempt failed because it appears to > make no progress for 1ms], TaskAttempt 1 failed, info=[Attempt failed > because it appears to make no progress for 1ms]], Vertex did not succeed > due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:0, Vertex > vertex_1707216362777_0001_1_00 [scope-4] killed/failed due > to:OWN_TASK_FAILURE] DAG did not succeed due to VERTEX_FAILURE. > failedVertices:1 killedVertices:0 > org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to > open iterator for alias B. Backend error : Vertex failed, vertexName=scope-4, > vertexId=vertex_1707216362777_0001_1_00, diagnostics=[Task failed, > taskId=task_1707216362777_0001_1_00_00, diagnostics=[TaskAttempt 0 > failed, info=[Attempt failed because it appears to make no progress for > 1ms], TaskAttempt 1 failed, info=[Attempt failed because it appears to > make no progress for 1ms]], Vertex did not succeed due to > OWN_TASK_FAILURE, failedTasks:1 killedTasks:0, Vertex > vertex_1707216362777_0001_1_00 [scope-4] killed/failed due > to:OWN_TASK_FAILURE] > DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0 > at org.apache.pig.PigServer.openIterator(PigServer.java:1014) > at > org.apache.pig.test.TestPigProgressReporting.testProgressReportingWithStatusMessage(TestPigProgressReporting.java:58) > Caused by: org.apache.tez.dag.api.TezException: Vertex failed, > vertexName=scope-4, vertexId=vertex_1707216362777_0001_1_00, > diagnostics=[Task failed, taskId=task_1707216362777_0001_1_00_00, > diagnostics=[TaskAttempt 0 failed, info=[Attempt failed because it appears to > make no progress for 1ms], TaskAttempt 1 failed, info=[Attempt failed > because it appears to make no progress for 1ms]], Vertex did not succeed > due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:0, Vertex > vertex_1707216362777_0001_1_00 [scope-4] killed/failed due > to:OWN_TASK_FAILURE] > DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0 > at > org.apache.pig.tools.pigstats.tez.TezPigScriptStats.accumulateStats(TezPigScriptStats.java:204) > at > org.apache.pig.backend.hadoop.executionengine.tez.TezJob.run(TezJob.java:243) > at > org.apache.pig.backend.hadoop.executionengine.tez.TezLauncher$1.run(TezLauncher.java:212) > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > 45.647 {noformat} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (PIG-5416) Spark unit tests failing randomly with "java.lang.RuntimeException: Unexpected job execution status RUNNING"
[ https://issues.apache.org/jira/browse/PIG-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17826790#comment-17826790 ] Rohini Palaniswamy commented on PIG-5416: - +1 > Spark unit tests failing randomly with "java.lang.RuntimeException: > Unexpected job execution status RUNNING" > > > Key: PIG-5416 > URL: https://issues.apache.org/jira/browse/PIG-5416 > Project: Pig > Issue Type: Bug > Components: spark >Reporter: Koji Noguchi >Priority: Minor > Attachments: pig-5416-v01.patch > > > Spark unit tests fail randomly with same errors. > Sample stack trace showing "Caused by: java.lang.RuntimeException: > Unexpected job execution status RUNNING". > {noformat:title=TestBuiltInBagToTupleOrString.testPigScriptForBagToTupleUDF} > Unable to store alias B > org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1002: Unable to > store alias B > at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1783) > at org.apache.pig.PigServer.registerQuery(PigServer.java:708) > at org.apache.pig.PigServer.registerQuery(PigServer.java:721) > at > org.apache.pig.test.TestBuiltInBagToTupleOrString.testPigScriptForBagToTupleUDF(TestBuiltInBagToTupleOrString.java:429) > Caused by: org.apache.pig.impl.plan.VisitorException: ERROR 0: fail to get > the rdds of this spark operator: > at > org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.visitSparkOp(JobGraphBuilder.java:115) > at > org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkOperator.visit(SparkOperator.java:140) > at > org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkOperator.visit(SparkOperator.java:37) > at > org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:87) > at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:46) > at > org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.launchPig(SparkLauncher.java:240) > at > org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:290) > at org.apache.pig.PigServer.launchPlan(PigServer.java:1479) > at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1464) > at org.apache.pig.PigServer.execute(PigServer.java:1453) > at org.apache.pig.PigServer.access$500(PigServer.java:119) > at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1778) > Caused by: java.lang.RuntimeException: Unexpected job execution status RUNNING > at > org.apache.pig.tools.pigstats.spark.SparkStatsUtil.isJobSuccess(SparkStatsUtil.java:138) > at > org.apache.pig.tools.pigstats.spark.SparkPigStats.addJobStats(SparkPigStats.java:75) > at > org.apache.pig.tools.pigstats.spark.SparkStatsUtil.waitForJobAddStats(SparkStatsUtil.java:59) > at > org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.sparkOperToRDD(JobGraphBuilder.java:225) > at > org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.visitSparkOp(JobGraphBuilder.java:112) > {noformat} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (PIG-5447) Pig-on-Spark TestSkewedJoin.testSkewedJoinOuter failing with NoSuchElementException
[ https://issues.apache.org/jira/browse/PIG-5447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17826789#comment-17826789 ] Rohini Palaniswamy commented on PIG-5447: - +1 > Pig-on-Spark TestSkewedJoin.testSkewedJoinOuter failing with > NoSuchElementException > --- > > Key: PIG-5447 > URL: https://issues.apache.org/jira/browse/PIG-5447 > Project: Pig > Issue Type: Bug >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Major > Attachments: pig-5447-v01.patch > > > TestSkewedJoin.testSkewedJoinOuter is consistently failing for right-outer > and full-outer joins. > "Caused by: java.util.NoSuchElementException: next on empty iterator" -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5437) Add lib and idea folder to .gitignore
[ https://issues.apache.org/jira/browse/PIG-5437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5437: Fix Version/s: 0.18.0 Hadoop Flags: Reviewed Resolution: Fixed Status: Resolved (was: Patch Available) +1. Committed to trunk and branch-0.18. Thanks for the contribution [~maswin] > Add lib and idea folder to .gitignore > - > > Key: PIG-5437 > URL: https://issues.apache.org/jira/browse/PIG-5437 > Project: Pig > Issue Type: Improvement >Reporter: Alagappan Maruthappan >Assignee: Alagappan Maruthappan >Priority: Minor > Fix For: 0.18.0 > > Attachments: PIG-5437-0.patch > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5420) Update accumulo dependency to 1.10.1
[ https://issues.apache.org/jira/browse/PIG-5420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5420: Fix Version/s: 0.18.1 > Update accumulo dependency to 1.10.1 > > > Key: PIG-5420 > URL: https://issues.apache.org/jira/browse/PIG-5420 > Project: Pig > Issue Type: Improvement >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Trivial > Fix For: 0.18.1 > > Attachments: pig-5420-v01.patch > > > Following owasp/cve report. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5419) Upgrade Joda time version
[ https://issues.apache.org/jira/browse/PIG-5419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5419: Fix Version/s: 0.18.1 (was: 0.18.0) Can you update to 2.12.5 ? > Upgrade Joda time version > - > > Key: PIG-5419 > URL: https://issues.apache.org/jira/browse/PIG-5419 > Project: Pig > Issue Type: Improvement >Reporter: Venkatasubrahmanian Narayanan >Assignee: Venkatasubrahmanian Narayanan >Priority: Minor > Fix For: 0.18.1 > > Attachments: PIG-5419.patch > > > Pig depends on an older version of Joda time, which can result in conflicts > with other versions in some workflows. Upgrading it to the latest > version(2.10.13) will resolve Pig's side of such issues. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (PIG-5440) Extra jars needed for hive3
[ https://issues.apache.org/jira/browse/PIG-5440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy resolved PIG-5440. - Fix Version/s: 0.18.0 Hadoop Flags: Reviewed Resolution: Fixed Committed to trunk and branch-0.18. Thanks [~knoguchi] > Extra jars needed for hive3 > --- > > Key: PIG-5440 > URL: https://issues.apache.org/jira/browse/PIG-5440 > Project: Pig > Issue Type: Improvement >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Minor > Fix For: 0.18.0 > > Attachments: pig-5440-v01.patch, pig-5440-v02.patch > > > When testing Hive3, e2e tests were failing with > {{Caused by: java.lang.NoClassDefFoundError: > org/apache/hadoop/hive/llap/security/LlapSigner$Signable}} etc. > Updating dependent classes. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5438) Update SparkCounter.Accumulator to AccumulatorV2
[ https://issues.apache.org/jira/browse/PIG-5438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5438: Fix Version/s: 0.19.0 > Update SparkCounter.Accumulator to AccumulatorV2 > > > Key: PIG-5438 > URL: https://issues.apache.org/jira/browse/PIG-5438 > Project: Pig > Issue Type: Improvement > Components: spark >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Trivial > Fix For: 0.19.0 > > Attachments: pig-5438-v01.patch > > > Original Accumulator is deprecated in Spark2 and gone in Spark3. > AccumulatorV2 is usable on both Spark2 and Spark3. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5439) Support Spark 3 and drop SparkShim
[ https://issues.apache.org/jira/browse/PIG-5439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5439: Fix Version/s: 0.19.0 > Support Spark 3 and drop SparkShim > -- > > Key: PIG-5439 > URL: https://issues.apache.org/jira/browse/PIG-5439 > Project: Pig > Issue Type: Improvement > Components: spark >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Major > Fix For: 0.19.0 > > Attachments: pig-5439-v01.patch > > > Support Pig-on-Spark to run on spark3. > Initial version would only run up to Spark 3.2.4 and not on 3.3 or 3.4. > This is due to log4j mismatch. > After moving to log4j2 (PIG-5426), we can move Spark to 3.3 or higher. > So far, not all unit/e2e tests pass with the proposed patch but at least > compilation goes through. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5414) Build failure on Linux ARM64 due to old Apache Avro
[ https://issues.apache.org/jira/browse/PIG-5414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5414: Fix Version/s: 0.18.1 > Build failure on Linux ARM64 due to old Apache Avro > --- > > Key: PIG-5414 > URL: https://issues.apache.org/jira/browse/PIG-5414 > Project: Pig > Issue Type: Bug > Components: build >Affects Versions: 0.18.0 >Reporter: Martin Tzvetanov Grigorov >Assignee: Martin Tzvetanov Grigorov >Priority: Major > Fix For: 0.18.1 > > Attachments: 35.patch, > TEST-org.apache.pig.builtin.TestAvroStorage.txt, > TEST-org.apache.pig.builtin.TestOrcStorage.txt, > TEST-org.apache.pig.builtin.TestOrcStoragePushdown.txt > > > Trying to build Apache Pig on Ubuntu 20.04.3 ARM64 fails because of old > version of Snappy and Avro libraries: > > {code:java} > Testsuite: org.apache.pig.builtin.TestAvroStorage > Tests run: 0, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 1.1 sec > - Standard Output --- > 2021-10-12 14:43:35,483 [main] INFO > org.apache.pig.impl.util.SpillableMemoryManager - Selected heap (PS Old Gen) > of size 1431830528 to monitor. collectionUsageThreshold = 1064828928, > usageThreshold = 1064828928 > 2021-10-12 14:43:35,489 [main] INFO org.apache.pig.ExecTypeProvider - > Trying ExecType : LOCAL > 2021-10-12 14:43:35,489 [main] INFO org.apache.pig.ExecTypeProvider - > Picked LOCAL as the ExecType > 2021-10-12 14:43:35,515 [main] WARN org.apache.hadoop.conf.Configuration - > DEPRECATED: hadoop-site.xml found in the classpath. Usage of hadoop-site.xml > is deprecated. Instead use core-site.xml, mapred-site.xml and hdfs-site.xml > to override properties of core-default.xml, mapred-default.xml and > hdfs-default.xml respectively > 2021-10-12 14:43:35,755 [main] INFO > org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is > deprecated. Instead, use mapreduce.jobtracker.address > 2021-10-12 14:43:35,899 [main] WARN org.apache.hadoop.util.NativeCodeLoader > - Unable to load native-hadoop library for your platform... using > builtin-java classes where applicable > 2021-10-12 14:43:35,916 [main] INFO > org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting > to hadoop file system at: file:/// > 2021-10-12 14:43:36,116 [main] INFO > org.apache.hadoop.conf.Configuration.deprecation - io.bytes.per.checksum is > deprecated. 
Instead, use dfs.bytes-per-checksum > 2021-10-12 14:43:36,137 [main] INFO org.apache.pig.PigServer - Pig Script > ID for the session: PIG-default-01426621-bc19-499f-981e-b13959fe0d84 > 2021-10-12 14:43:36,137 [main] WARN org.apache.pig.PigServer - ATS is > disabled since yarn.timeline-service.enabled set to false > 2021-10-12 14:43:36,150 [main] INFO org.apache.pig.builtin.TestAvroStorage > - creating > test/org/apache/pig/builtin/avro/data/avro/uncompressed/arraysAsOutputByPig.avro > 2021-10-12 14:43:36,502 [main] INFO org.apache.pig.builtin.TestAvroStorage > - Could not generate avro file: > test/org/apache/pig/builtin/avro/data/avro/uncompressed/arraysAsOutputByPig.avro > java.net.ConnectException: Call From martin/127.0.0.1 to localhost:40073 > failed on connection exception: java.net.ConnectException: Connection > refused; For more details see: > http://wiki.apache.org/hadoop/ConnectionRefused > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native > Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:423) > at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792) > at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:732) > at org.apache.hadoop.ipc.Client.call(Client.java:1479) > at org.apache.hadoop.ipc.Client.call(Client.java:1412) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229) > at com.sun.proxy.$Proxy13.getBlockLocations(Unknown Source) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:255) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > ... > {code} > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5418) Utils.parseSchema(String), parseConstant(String) leak memory
[ https://issues.apache.org/jira/browse/PIG-5418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5418: Fix Version/s: 0.18.1 > Utils.parseSchema(String), parseConstant(String) leak memory > > > Key: PIG-5418 > URL: https://issues.apache.org/jira/browse/PIG-5418 > Project: Pig > Issue Type: Improvement >Reporter: Jacob Tolar >Assignee: Jacob Tolar >Priority: Minor > Fix For: 0.18.1 > > Attachments: PIG-5418.patch > > > A minor issue: I noticed that Utils.parseSchema() and parseConstant() leak > memory. I noticed this while running a unit test for a UDF several thousand > times and checking the heap. > Links are to latest commit as of creating this ticket: > https://github.com/apache/pig/blob/59ec4a326079c9f937a052194405415b1e3a2b06/src/org/apache/pig/impl/util/Utils.java#L244-L256 > {{new PigContext()}} [creates a MapReduce > ExecutionEngine|https://github.com/apache/pig/blob/59ec4a326079c9f937a052194405415b1e3a2b06/src/org/apache/pig/impl/PigContext.java#L269]. > > This creates a > [MapReduceLauncher|https://github.com/apache/pig/blob/59ec4a326079c9f937a052194405415b1e3a2b06/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRExecutionEngine.java#L34]. > > This registers a [Hadoop shutdown > hook|https://github.com/apache/pig/blob/59ec4a326079c9f937a052194405415b1e3a2b06/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MapReduceLauncher.java#L104-L105] > which doesn't go away until the JVM dies. See: > https://hadoop.apache.org/docs/r2.8.2/hadoop-project-dist/hadoop-common/api/org/apache/hadoop/util/ShutdownHookManager.html > . > I will attach a proposed patch. From my reading of the code and running > tests, the existing schema parse APIs do not actually use anything from this > dummy PigContext, and with a minor tweak it can be passed in as NULL, > avoiding the creation of these extra resources. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5443) Add testcase for skew join for tez grace shuffle vertex manager
[ https://issues.apache.org/jira/browse/PIG-5443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5443: Description: Need to add test case for fix in https://issues.apache.org/jira/browse/PIG-5441. Can just modify one of the existing skewed join unit or e2e test cases by increasing mappers (split size) or adding PARALLEL 2 for right side data. Also check if one-one edges are affected by this part of the code. (was: Need to add test case for fix in https://issues.apache.org/jira/browse/PIG-5441. Can just modify one of the existing skewed join unit or e2e test cases by increasing mappers (split size) or adding PARALLEL 2 for right side data. ) > Add testcase for skew join for tez grace shuffle vertex manager > --- > > Key: PIG-5443 > URL: https://issues.apache.org/jira/browse/PIG-5443 > Project: Pig > Issue Type: Task >Reporter: Rohini Palaniswamy >Priority: Minor > > Need to add test case for fix in > https://issues.apache.org/jira/browse/PIG-5441. Can just modify one of the > existing skewed join unit or e2e test cases by increasing mappers (split > size) or adding PARALLEL 2 for right side data. Also check if one-one edges > are affected by this part of the code. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (PIG-5443) Add testcase for skew join for tez grace shuffle vertex manager
Rohini Palaniswamy created PIG-5443: --- Summary: Add testcase for skew join for tez grace shuffle vertex manager Key: PIG-5443 URL: https://issues.apache.org/jira/browse/PIG-5443 Project: Pig Issue Type: Task Reporter: Rohini Palaniswamy Need to add test case for fix in https://issues.apache.org/jira/browse/PIG-5441. Can just modify one of the existing skewed join unit or e2e test cases by increasing mappers (split size) or adding PARALLEL 2 for right side data. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (PIG-5442) Add only credentials from setStoreLocation to the Job Conf
[ https://issues.apache.org/jira/browse/PIG-5442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy resolved PIG-5442. - Fix Version/s: 0.18.0 Hadoop Flags: Reviewed Resolution: Fixed +1. Committed to branch-0.18. and trunk. Thanks for the contribution [~maswin] > Add only credentials from setStoreLocation to the Job Conf > -- > > Key: PIG-5442 > URL: https://issues.apache.org/jira/browse/PIG-5442 > Project: Pig > Issue Type: Bug >Reporter: Alagappan Maruthappan >Assignee: Alagappan Maruthappan >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5442-1.patch > > > While testing HCatStorer with Iceberg realized Pig calls setStoreLocation on > all Stores with the same Job object - > [https://github.com/apache/pig/blob/b050a33c66fc22d648370b5c6bda04e0e51d3aa3/src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java#L1081] > Setting populated by one store is affecting the other stores. In my case the > "mapred.output.committer.class" is set as HiveIcebergCommitter by PigStore > that is used by the Iceberg table and the other stores which inserts data to > a non-iceberg tables also use that setting and trying to use > HiveIcebergCommitter. > > On checking with [~rohini] , it is called to get the credentials from all > stores since addCredentials API was added later and not all stores have > implemented it and some still set configuration in setLocation method (i.e, > HCatStorer). > > Fixed it by passing a separate copy of Job object to each store's setLocation > method and adding only the credential object from the call. -- This message was sent by Atlassian Jira (v8.20.10#820010)
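As a rough illustration of the fix described above (not the committed PIG-5442 patch; the method and variable names are placeholders), the per-store call could look like this:
{code:java}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.pig.StoreFuncInterface;

// Sketch only: each store gets its own Job copy, so whatever setStoreLocation writes
// into the conf (e.g. mapred.output.committer.class) stays local to that store;
// only the credentials it collected are merged back into the shared job.
static void setStoreLocationAndCollectCredentials(StoreFuncInterface storeFunc,
        String location, Job sharedJob) throws IOException {
    Job storeJob = Job.getInstance(new Configuration(sharedJob.getConfiguration()));
    storeFunc.setStoreLocation(location, storeJob);
    sharedJob.getCredentials().addAll(storeJob.getCredentials());
}
{code}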
[jira] [Updated] (PIG-5442) Add only credentials from setStoreLocation to the Job Conf
[ https://issues.apache.org/jira/browse/PIG-5442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5442: Attachment: PIG-5442-1.patch > Add only credentials from setStoreLocation to the Job Conf > -- > > Key: PIG-5442 > URL: https://issues.apache.org/jira/browse/PIG-5442 > Project: Pig > Issue Type: Bug >Reporter: Alagappan Maruthappan >Assignee: Alagappan Maruthappan >Priority: Major > Attachments: PIG-5442-1.patch > > > While testing HCatStorer with Iceberg realized Pig calls setStoreLocation on > all Stores with the same Job object - > [https://github.com/apache/pig/blob/b050a33c66fc22d648370b5c6bda04e0e51d3aa3/src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java#L1081] > Setting populated by one store is affecting the other stores. In my case the > "mapred.output.committer.class" is set as HiveIcebergCommitter by PigStore > that is used by the Iceberg table and the other stores which inserts data to > a non-iceberg tables also use that setting and trying to use > HiveIcebergCommitter. > > On checking with [~rohini] , it is called to get the credentials from all > stores since addCredentials API was added later and not all stores have > implemented it and some still set configuration in setLocation method (i.e, > HCatStorer). > > Fixed it by passing a separate copy of Job object to each store's setLocation > method and adding only the credential object from the call. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5441) Pig skew join tez grace reducer fails to find shuffle data
[ https://issues.apache.org/jira/browse/PIG-5441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5441: Hadoop Flags: Reviewed Resolution: Fixed Status: Resolved (was: Patch Available) Patch committed to branch-0.18 and trunk. Thanks [~yigress] for the contribution. > Pig skew join tez grace reducer fails to find shuffle data > -- > > Key: PIG-5441 > URL: https://issues.apache.org/jira/browse/PIG-5441 > Project: Pig > Issue Type: Bug > Components: tez >Affects Versions: 0.17.0 >Reporter: Yi Zhang >Assignee: Yi Zhang >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5441.patch > > > User pig tez skew join encountered issue of not finding shuffle data from the > sampler aggregate vertex. The right side join has >1 reducers. > For workaround adjust tez.runtime.transfer.data-via-events.max-size to avoid > spill to disk for the sampler aggregation vertex. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
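The workaround named above would look roughly like this in a Pig script; the value is illustrative and, assuming the event-vs-spill behavior described, only needs to exceed the sampler vertex's output size so the data is delivered via events instead of being spilled to disk:
{noformat}
set tez.runtime.transfer.data-via-events.max-size 10240;
{noformat}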
[jira] [Commented] (PIG-5441) Pig skew join tez grace reducer fails to find shuffle data
[ https://issues.apache.org/jira/browse/PIG-5441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17725781#comment-17725781 ] Rohini Palaniswamy commented on PIG-5441: - +1. Can you just attach the patch to the jira? > Pig skew join tez grace reducer fails to find shuffle data > -- > > Key: PIG-5441 > URL: https://issues.apache.org/jira/browse/PIG-5441 > Project: Pig > Issue Type: Bug > Components: tez >Affects Versions: 0.17.0 >Reporter: Yi Zhang >Assignee: Yi Zhang >Priority: Major > Fix For: 0.18.0 > > > A user's Pig on Tez skew join failed to find the shuffle data from the > sampler aggregate vertex. The right side of the join has more than one reducer. > As a workaround, adjust tez.runtime.transfer.data-via-events.max-size to avoid > spilling to disk for the sampler aggregation vertex. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5441) Pig skew join tez grace reducer fails to find shuffle data
[ https://issues.apache.org/jira/browse/PIG-5441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5441: Fix Version/s: 0.18.0 Assignee: Yi Zhang Status: Patch Available (was: Open) > Pig skew join tez grace reducer fails to find shuffle data > -- > > Key: PIG-5441 > URL: https://issues.apache.org/jira/browse/PIG-5441 > Project: Pig > Issue Type: Bug > Components: tez >Affects Versions: 0.17.0 >Reporter: Yi Zhang >Assignee: Yi Zhang >Priority: Major > Fix For: 0.18.0 > > > A user's Pig on Tez skew join failed to find the shuffle data from the > sampler aggregate vertex. The right side of the join has more than one reducer. > As a workaround, adjust tez.runtime.transfer.data-via-events.max-size to avoid > spilling to disk for the sampler aggregation vertex. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (PIG-5440) Extra jars needed for hive3
[ https://issues.apache.org/jira/browse/PIG-5440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17725780#comment-17725780 ] Rohini Palaniswamy commented on PIG-5440: - +1 > Extra jars needed for hive3 > --- > > Key: PIG-5440 > URL: https://issues.apache.org/jira/browse/PIG-5440 > Project: Pig > Issue Type: Improvement >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Minor > Attachments: pig-5440-v01.patch, pig-5440-v02.patch > > > When testing Hive3, e2e tests were failing with > {{Caused by: java.lang.NoClassDefFoundError: > org/apache/hadoop/hive/llap/security/LlapSigner$Signable}} etc. > Updating dependent classes. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (PIG-5440) Extra jars needed for hive3
[ https://issues.apache.org/jira/browse/PIG-5440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17722276#comment-17722276 ] Rohini Palaniswamy commented on PIG-5440: - +1. Can you add space between "orc-shims","aircompressor" before commit ? > Extra jars needed for hive3 > --- > > Key: PIG-5440 > URL: https://issues.apache.org/jira/browse/PIG-5440 > Project: Pig > Issue Type: Improvement >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Minor > Attachments: pig-5440-v01.patch > > > When testing Hive3, e2e tests were failing with > {{Caused by: java.lang.NoClassDefFoundError: > org/apache/hadoop/hive/llap/security/LlapSigner$Signable}} etc. > Updating dependent classes. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (PIG-5432) OrcStorage fails to detect schema in some cases
[ https://issues.apache.org/jira/browse/PIG-5432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17706981#comment-17706981 ] Rohini Palaniswamy edited comment on PIG-5432 at 3/30/23 5:33 PM: -- +1. Committed to branch-0.18 and trunk. Thanks for the contribution [~jtolar] was (Author: rohini): +1. Committed to branch-0.18 and trunk. Thanks for contribution [~jtolar] > OrcStorage fails to detect schema in some cases > --- > > Key: PIG-5432 > URL: https://issues.apache.org/jira/browse/PIG-5432 > Project: Pig > Issue Type: Bug >Reporter: Jacob Tolar >Assignee: Jacob Tolar >Priority: Minor > Fix For: 0.18.0 > > Attachments: PIG-5432.v01.patch > > > OrcStorage needs to detect the schema of input data paths. If some data paths > have no ORC files (perhaps only a _SUCCESS marker is present), this will > fail. > For example: > {code} > A = LOAD '/path/to/20230101,/path/to/20230102' USING OrcStorage(); > {code} > If {{/path/to/20230101}} contains only a _SUCCESS marker and {{20230102}} > contains data, OrcStorage fails to detect the schema and Pig exits with a > confusing/unhelpful error, something like "Cannot find any ORC files from > . Probably multiple load/store statements in script." > The code tries to use a search algorithm to recursively search through all > input paths for the data (via Utils.depthFirstSearchForFile), but it is > implemented incorrectly and returns early in this scenario. > See: > https://github.com/apache/pig/blob/c0d75ba930f9aa5c6454d0264a96f82b45279202/src/org/apache/pig/builtin/OrcStorage.java#L389-L408 > https://github.com/apache/pig/blob/59ec4a326079c9f937a052194405415b1e3a2b06/src/org/apache/pig/impl/util/Utils.java#L629-L667 > I'll attach a patch. -- This message was sent by Atlassian Jira (v8.20.10#820010)
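To make the "returns early" point concrete, here is a hedged sketch of what a correct recursive search needs to do; the class and method are illustrative and deliberately simpler than Utils.depthFirstSearchForFile, the key property being that an input path with no matching file must not end the search while sibling paths remain unexplored.
{code}
import java.io.IOException;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Illustrative helper, not the Pig implementation: return the first data file
// found anywhere under the given paths, skipping markers such as _SUCCESS.
public class FirstDataFileSketch {
    public static Path findFirstDataFile(FileSystem fs, Path... roots) throws IOException {
        for (Path root : roots) {
            Path found = search(fs, root);
            if (found != null) {
                return found;          // stop only once a real data file is found
            }
            // otherwise keep going: an empty directory must not end the search
        }
        return null;
    }

    private static Path search(FileSystem fs, Path dir) throws IOException {
        for (FileStatus status : fs.listStatus(dir)) {
            String name = status.getPath().getName();
            if (status.isDirectory()) {
                Path found = search(fs, status.getPath());
                if (found != null) {
                    return found;
                }
            } else if (!name.startsWith("_") && !name.startsWith(".")) {
                return status.getPath();
            }
        }
        return null;
    }
}
{code}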
[jira] [Updated] (PIG-5432) OrcStorage fails to detect schema in some cases
[ https://issues.apache.org/jira/browse/PIG-5432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5432: Hadoop Flags: Reviewed Resolution: Fixed Status: Resolved (was: Patch Available) +1. Committed to branch-0.18 and trunk. Thanks for the contribution [~jtolar] > OrcStorage fails to detect schema in some cases > --- > > Key: PIG-5432 > URL: https://issues.apache.org/jira/browse/PIG-5432 > Project: Pig > Issue Type: Bug >Reporter: Jacob Tolar >Assignee: Jacob Tolar >Priority: Minor > Fix For: 0.18.0 > > Attachments: PIG-5432.v01.patch > > > OrcStorage needs to detect the schema of input data paths. If some data paths > have no ORC files (perhaps only a _SUCCESS marker is present), this will > fail. > For example: > {code} > A = LOAD '/path/to/20230101,/path/to/20230102' USING OrcStorage(); > {code} > If {{/path/to/20230101}} contains only a _SUCCESS marker and {{20230102}} > contains data, OrcStorage fails to detect the schema and Pig exits with a > confusing/unhelpful error, something like "Cannot find any ORC files from > . Probably multiple load/store statements in script." > The code tries to use a search algorithm to recursively search through all > input paths for the data (via Utils.depthFirstSearchForFile), but it is > implemented incorrectly and returns early in this scenario. > See: > https://github.com/apache/pig/blob/c0d75ba930f9aa5c6454d0264a96f82b45279202/src/org/apache/pig/builtin/OrcStorage.java#L389-L408 > https://github.com/apache/pig/blob/59ec4a326079c9f937a052194405415b1e3a2b06/src/org/apache/pig/impl/util/Utils.java#L629-L667 > I'll attach a patch. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (PIG-5432) OrcStorage fails to detect schema in some cases
[ https://issues.apache.org/jira/browse/PIG-5432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy reassigned PIG-5432: --- Fix Version/s: 0.18.0 Assignee: Jacob Tolar > OrcStorage fails to detect schema in some cases > --- > > Key: PIG-5432 > URL: https://issues.apache.org/jira/browse/PIG-5432 > Project: Pig > Issue Type: Bug >Reporter: Jacob Tolar >Assignee: Jacob Tolar >Priority: Minor > Fix For: 0.18.0 > > Attachments: PIG-5432.v01.patch > > > OrcStorage needs to detect the schema of input data paths. If some data paths > have no ORC files (perhaps only a _SUCCESS marker is present), this will > fail. > For example: > {code} > A = LOAD '/path/to/20230101,/path/to/20230102' USING OrcStorage(); > {code} > If {{/path/to/20230101}} contains only a _SUCCESS marker and {{20230102}} > contains data, OrcStorage fails to detect the schema and Pig exits with a > confusing/unhelpful error, something like "Cannot find any ORC files from > . Probably multiple load/store statements in script." > The code tries to use a search algorithm to recursively search through all > input paths for the data (via Utils.depthFirstSearchForFile), but it is > implemented incorrectly and returns early in this scenario. > See: > https://github.com/apache/pig/blob/c0d75ba930f9aa5c6454d0264a96f82b45279202/src/org/apache/pig/builtin/OrcStorage.java#L389-L408 > https://github.com/apache/pig/blob/59ec4a326079c9f937a052194405415b1e3a2b06/src/org/apache/pig/impl/util/Utils.java#L629-L667 > I'll attach a patch. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (PIG-5436) update owasp version
[ https://issues.apache.org/jira/browse/PIG-5436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy resolved PIG-5436. - Fix Version/s: 0.18.0 Hadoop Flags: Reviewed Resolution: Fixed Committed to trunk. Thanks Koji > update owasp version > > > Key: PIG-5436 > URL: https://issues.apache.org/jira/browse/PIG-5436 > Project: Pig > Issue Type: Test >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Trivial > Fix For: 0.18.0 > > Attachments: pig-5436-v01.patch > > > Owasp testing started to fail with > {quote}Caused by: org.h2.jdbc.JdbcBatchUpdateException: Value too long for > column "VERSIONENDEXCLUDING VARCHAR(50) SELECTIVITY 1" > {quote} > > Following https://github.com/jeremylong/DependencyCheck/issues/5225, updating > the owasp version. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5435) pig.exec.reducers.max does not take effect for skewed join
[ https://issues.apache.org/jira/browse/PIG-5435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5435: Resolution: Fixed Status: Resolved (was: Patch Available) Committed to trunk. Thanks [~vnarayanan7] and [~knoguchi]. > pig.exec.reducers.max does not take effect for skewed join > -- > > Key: PIG-5435 > URL: https://issues.apache.org/jira/browse/PIG-5435 > Project: Pig > Issue Type: Bug >Reporter: Rohini Palaniswamy >Assignee: Venkatasubrahmanian Narayanan >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5435-1.patch > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5434) Migrate from log4j to reload4j
[ https://issues.apache.org/jira/browse/PIG-5434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5434: Hadoop Flags: Reviewed Resolution: Fixed Status: Resolved (was: Patch Available) Committed to trunk. Thanks Koji > Migrate from log4j to reload4j > -- > > Key: PIG-5434 > URL: https://issues.apache.org/jira/browse/PIG-5434 > Project: Pig > Issue Type: Improvement >Reporter: Rohini Palaniswamy >Assignee: Rohini Palaniswamy >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5434-1.patch > > > Was trying to migrate to log4j2.x (PIG-5426) but was running into issues. As > 0.18 is delayed long enough, migrating to reload4j in this release similar to > HADOOP-18088. Will migrate to log4j2.x in the next release. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5417) Replace guava's Files.createTempDir()
[ https://issues.apache.org/jira/browse/PIG-5417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5417: Hadoop Flags: Reviewed Resolution: Fixed Status: Resolved (was: Patch Available) Committed to trunk. Thank you for the contribution [~xiaoheipangzi] > Replace guava's Files.createTempDir() > - > > Key: PIG-5417 > URL: https://issues.apache.org/jira/browse/PIG-5417 > Project: Pig > Issue Type: Bug >Reporter: lujie >Assignee: lujie >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5417-1.patch > > > see [https://www.cvedetails.com/cve/CVE-2020-8908/] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (PIG-5436) update owasp version
[ https://issues.apache.org/jira/browse/PIG-5436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17677558#comment-17677558 ] Rohini Palaniswamy commented on PIG-5436: - +1 > update owasp version > > > Key: PIG-5436 > URL: https://issues.apache.org/jira/browse/PIG-5436 > Project: Pig > Issue Type: Test >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Trivial > Attachments: pig-5436-v01.patch > > > Owasp testing started to fail with > {quote}Caused by: org.h2.jdbc.JdbcBatchUpdateException: Value too long for > column "VERSIONENDEXCLUDING VARCHAR(50) SELECTIVITY 1" > {quote} > > Following https://github.com/jeremylong/DependencyCheck/issues/5225, updating > the owasp version. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5435) pig.exec.reducers.max does not take effect for skewed join
[ https://issues.apache.org/jira/browse/PIG-5435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5435: Status: Patch Available (was: Open) This is a patch that I had reviewed internally. [~knoguchi], can you +1 here? > pig.exec.reducers.max does not take effect for skewed join > -- > > Key: PIG-5435 > URL: https://issues.apache.org/jira/browse/PIG-5435 > Project: Pig > Issue Type: Bug >Reporter: Rohini Palaniswamy >Assignee: Venkatasubrahmanian Narayanan >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5435-1.patch > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5435) pig.exec.reducers.max does not take effect for skewed join
[ https://issues.apache.org/jira/browse/PIG-5435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5435: Attachment: PIG-5435-1.patch > pig.exec.reducers.max does not take effect for skewed join > -- > > Key: PIG-5435 > URL: https://issues.apache.org/jira/browse/PIG-5435 > Project: Pig > Issue Type: Bug >Reporter: Rohini Palaniswamy >Assignee: Venkatasubrahmanian Narayanan >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5435-1.patch > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (PIG-5435) pig.exec.reducers.max does not take effect for skewed join
Rohini Palaniswamy created PIG-5435: --- Summary: pig.exec.reducers.max does not take effect for skewed join Key: PIG-5435 URL: https://issues.apache.org/jira/browse/PIG-5435 Project: Pig Issue Type: Bug Reporter: Rohini Palaniswamy Assignee: Venkatasubrahmanian Narayanan Fix For: 0.18.0 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5417) Replace guava's Files.createTempDir()
[ https://issues.apache.org/jira/browse/PIG-5417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5417: Status: Patch Available (was: Open) > Replace guava's Files.createTempDir() > - > > Key: PIG-5417 > URL: https://issues.apache.org/jira/browse/PIG-5417 > Project: Pig > Issue Type: Bug >Reporter: lujie >Assignee: lujie >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5417-1.patch > > > see [https://www.cvedetails.com/cve/CVE-2020-8908/] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (PIG-5417) Replace guava's Files.createTempDir()
[ https://issues.apache.org/jira/browse/PIG-5417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17677523#comment-17677523 ] Rohini Palaniswamy commented on PIG-5417: - Downloaded https://patch-diff.githubusercontent.com/raw/apache/pig/pull/36.patch and was going to commit it, but compilation failed as it did not catch IOException. Updated the patch with a try/catch block. [~knoguchi], can you +1 as there is a minor change from the original patch? Thought I would get this into the release as it addresses a CVE. > Replace guava's Files.createTempDir() > - > > Key: PIG-5417 > URL: https://issues.apache.org/jira/browse/PIG-5417 > Project: Pig > Issue Type: Bug >Reporter: lujie >Assignee: lujie >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5417-1.patch > > > see [https://www.cvedetails.com/cve/CVE-2020-8908/] -- This message was sent by Atlassian Jira (v8.20.10#820010)
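For context, a hedged sketch of the kind of replacement this patch makes, with the try/catch mentioned above; the class is illustrative and not the actual location in the Pig source.
{code}
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class TempDirSketch {
    // Before (flagged by CVE-2020-8908): com.google.common.io.Files.createTempDir()
    // After: the JDK's createTempDirectory, which throws a checked IOException.
    public static File createTempDir(String prefix) {
        try {
            return Files.createTempDirectory(prefix).toFile();
        } catch (IOException e) {
            throw new RuntimeException("Failed to create temp directory", e);
        }
    }
}
{code}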
[jira] [Updated] (PIG-5417) Replace guava's Files.createTempDir()
[ https://issues.apache.org/jira/browse/PIG-5417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5417: Attachment: PIG-5417-1.patch > Replace guava's Files.createTempDir() > - > > Key: PIG-5417 > URL: https://issues.apache.org/jira/browse/PIG-5417 > Project: Pig > Issue Type: Bug >Reporter: lujie >Assignee: lujie >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5417-1.patch > > > see [https://www.cvedetails.com/cve/CVE-2020-8908/] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (PIG-5417) Replace guava's Files.createTempDir()
[ https://issues.apache.org/jira/browse/PIG-5417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy reassigned PIG-5417: --- Assignee: lujie (was: lujie) > Replace guava's Files.createTempDir() > - > > Key: PIG-5417 > URL: https://issues.apache.org/jira/browse/PIG-5417 > Project: Pig > Issue Type: Bug >Reporter: lujie >Assignee: lujie >Priority: Major > Fix For: 0.18.0 > > > see [https://www.cvedetails.com/cve/CVE-2020-8908/] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (PIG-5417) Replace guava's Files.createTempDir()
[ https://issues.apache.org/jira/browse/PIG-5417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy reassigned PIG-5417: --- Fix Version/s: 0.18.0 Assignee: lujie > Replace guava's Files.createTempDir() > - > > Key: PIG-5417 > URL: https://issues.apache.org/jira/browse/PIG-5417 > Project: Pig > Issue Type: Bug >Reporter: lujie >Assignee: lujie >Priority: Major > Fix For: 0.18.0 > > > see [https://www.cvedetails.com/cve/CVE-2020-8908/] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5434) Migrate from log4j to reload4j
[ https://issues.apache.org/jira/browse/PIG-5434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5434: Status: Patch Available (was: Open) This patch also upgrades to the latest slf4j version > Migrate from log4j to reload4j > -- > > Key: PIG-5434 > URL: https://issues.apache.org/jira/browse/PIG-5434 > Project: Pig > Issue Type: Improvement >Reporter: Rohini Palaniswamy >Assignee: Rohini Palaniswamy >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5434-1.patch > > > Was trying to migrate to log4j2.x (PIG-5426) but was running into issues. As > 0.18 is delayed long enough, migrating to reload4j in this release similar to > HADOOP-18088. Will migrate to log4j2.x in the next release. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5434) Migrate from log4j to reload4j
[ https://issues.apache.org/jira/browse/PIG-5434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5434: Attachment: PIG-5434-1.patch > Migrate from log4j to reload4j > -- > > Key: PIG-5434 > URL: https://issues.apache.org/jira/browse/PIG-5434 > Project: Pig > Issue Type: Improvement >Reporter: Rohini Palaniswamy >Assignee: Rohini Palaniswamy >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5434-1.patch > > > Was trying to migrate to log4j2.x (PIG-5426) but was running into issues. As > 0.18 is delayed long enough, migrating to reload4j in this release similar to > HADOOP-18088. Will migrate to log4j2.x in the next release. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5426) Migrate to log4j2.x
[ https://issues.apache.org/jira/browse/PIG-5426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5426: Fix Version/s: 0.19.0 (was: 0.18.0) Running into issues. As 0.18 is delayed long enough, migrating to reload4j in that release as part of PIG-5434. Will migrate to log4j2.x in the next release. > Migrate to log4j2.x > --- > > Key: PIG-5426 > URL: https://issues.apache.org/jira/browse/PIG-5426 > Project: Pig > Issue Type: Improvement >Reporter: Rohini Palaniswamy >Assignee: Rohini Palaniswamy >Priority: Major > Fix For: 0.19.0 > > > Hadoop (HADOOP-18088) decided to migrate to reload4j to address log4j1.x > vulnerabilities. I did the work of migrating Oozie server and client to > log4j2.x while launched hadoop jobs will still use 1.x till hadoop migrates. > So think it should be easy to do that for pig client as well. If it does not > work as expected, will just go with the easy switch to reload4j. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (PIG-5434) Migrate from log4j to reload4j
Rohini Palaniswamy created PIG-5434: --- Summary: Migrate from log4j to reload4j Key: PIG-5434 URL: https://issues.apache.org/jira/browse/PIG-5434 Project: Pig Issue Type: Improvement Reporter: Rohini Palaniswamy Assignee: Rohini Palaniswamy Fix For: 0.18.0 Was trying to migrate to log4j2.x (PIG-5426) but was running into issues. As 0.18 is delayed long enough, migrating to reload4j in this release similar to HADOOP-18088. Will migrate to log4j2.x in the next release. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5431) Date datatype is different between Hive 1.x and Hive 3.x
[ https://issues.apache.org/jira/browse/PIG-5431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5431: Hadoop Flags: Reviewed Resolution: Fixed Status: Resolved (was: Patch Available) Committed to trunk. Thanks Koji for the review. > Date datatype is different between Hive 1.x and Hive 3.x > > > Key: PIG-5431 > URL: https://issues.apache.org/jira/browse/PIG-5431 > Project: Pig > Issue Type: Bug >Reporter: Rohini Palaniswamy >Assignee: Rohini Palaniswamy >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5431-1.patch > > > java.lang.ClassCastException: org.apache.hadoop.hive.common.type.Date cannot > be cast to java.sql.Date -- This message was sent by Atlassian Jira (v8.20.10#820010)
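A hedged sketch of handling both Date types on the read path; it assumes Hive 3's org.apache.hadoop.hive.common.type.Date exposes toEpochMilli() (verify against the Hive version on the classpath), and the helper itself is illustrative rather than the committed patch.
{code}
// Illustrative helper: normalize whichever Date type the Hive SerDe hands back.
public class HiveDateCompatSketch {
    public static java.sql.Date toSqlDate(Object hiveDate) {
        if (hiveDate == null) {
            return null;
        }
        if (hiveDate instanceof java.sql.Date) {                            // Hive 1.x
            return (java.sql.Date) hiveDate;
        }
        if (hiveDate instanceof org.apache.hadoop.hive.common.type.Date) {  // Hive 3.x
            // assumption: toEpochMilli() is available on Hive 3's Date type
            return new java.sql.Date(
                    ((org.apache.hadoop.hive.common.type.Date) hiveDate).toEpochMilli());
        }
        throw new IllegalArgumentException("Unexpected date type: " + hiveDate.getClass());
    }
}
{code}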
[jira] [Updated] (PIG-5433) Fix test failures with TestHBaseStorage and htrace dependency
[ https://issues.apache.org/jira/browse/PIG-5433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5433: Hadoop Flags: Reviewed Resolution: Fixed Status: Resolved (was: Patch Available) Committed to trunk. Thanks Koji. > Fix test failures with TestHBaseStorage and htrace dependency > - > > Key: PIG-5433 > URL: https://issues.apache.org/jira/browse/PIG-5433 > Project: Pig > Issue Type: Bug >Reporter: Rohini Palaniswamy >Assignee: Rohini Palaniswamy >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5433-1.patch > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5431) Date datatype is different between Hive 1.x and Hive 3.x
[ https://issues.apache.org/jira/browse/PIG-5431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5431: Status: Patch Available (was: Open) > Date datatype is different between Hive 1.x and Hive 3.x > > > Key: PIG-5431 > URL: https://issues.apache.org/jira/browse/PIG-5431 > Project: Pig > Issue Type: Bug >Reporter: Rohini Palaniswamy >Assignee: Rohini Palaniswamy >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5431-1.patch > > > java.lang.ClassCastException: org.apache.hadoop.hive.common.type.Date cannot > be cast to java.sql.Date -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5431) Date datatype is different between Hive 1.x and Hive 3.x
[ https://issues.apache.org/jira/browse/PIG-5431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5431: Attachment: PIG-5431-1.patch > Date datatype is different between Hive 1.x and Hive 3.x > > > Key: PIG-5431 > URL: https://issues.apache.org/jira/browse/PIG-5431 > Project: Pig > Issue Type: Bug >Reporter: Rohini Palaniswamy >Assignee: Rohini Palaniswamy >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5431-1.patch > > > java.lang.ClassCastException: org.apache.hadoop.hive.common.type.Date cannot > be cast to java.sql.Date -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5433) Fix test failures with TestHBaseStorage and htrace dependency
[ https://issues.apache.org/jira/browse/PIG-5433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5433: Attachment: PIG-5433-1.patch > Fix test failures with TestHBaseStorage and htrace dependency > - > > Key: PIG-5433 > URL: https://issues.apache.org/jira/browse/PIG-5433 > Project: Pig > Issue Type: Bug >Reporter: Rohini Palaniswamy >Assignee: Rohini Palaniswamy >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5433-1.patch > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5433) Fix test failures with TestHBaseStorage and htrace dependency
[ https://issues.apache.org/jira/browse/PIG-5433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5433: Status: Patch Available (was: Open) > Fix test failures with TestHBaseStorage and htrace dependency > - > > Key: PIG-5433 > URL: https://issues.apache.org/jira/browse/PIG-5433 > Project: Pig > Issue Type: Bug >Reporter: Rohini Palaniswamy >Assignee: Rohini Palaniswamy >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5433-1.patch > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (PIG-5433) Fix test failures with TestHBaseStorage and htrace dependency
[ https://issues.apache.org/jira/browse/PIG-5433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17677083#comment-17677083 ] Rohini Palaniswamy commented on PIG-5433: - Ran into below test failure {code} org/apache/htrace/core/Tracer$Builder java.lang.NoClassDefFoundError: org/apache/htrace/core/Tracer$Builder at org.apache.hadoop.fs.FsTracer.get(FsTracer.java:42) at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3256) at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:121) at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3310) at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3278) at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:475) at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:228) at org.apache.pig.backend.hadoop.datastorage.HDataStorage.init(HDataStorage.java:68) at org.apache.pig.backend.hadoop.datastorage.HDataStorage.(HDataStorage.java:58) at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:227) at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:111) at org.apache.pig.impl.PigContext.connect(PigContext.java:310) at org.apache.pig.PigServer.(PigServer.java:232) at org.apache.pig.PigServer.(PigServer.java:220) at org.apache.pig.PigServer.(PigServer.java:212) at org.apache.pig.PigServer.(PigServer.java:208) at org.apache.pig.builtin.TestOrcStorage.setup(TestOrcStorage.java:109) Caused by: java.lang.ClassNotFoundException: org.apache.htrace.core.Tracer$Builder at java.net.URLClassLoader.findClass(URLClassLoader.java:382) at java.lang.ClassLoader.loadClass(ClassLoader.java:418) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:355) at java.lang.ClassLoader.loadClass(ClassLoader.java:351) {code} and java.io.IOException: Waiting for startup of standalone server in TestHBaseStorage described in https://stackoverflow.com/questions/67364593/java-io-ioexception-waiting-for-startup-of-standalone-server-minizookeeperclu > Fix test failures with TestHBaseStorage and htrace dependency > - > > Key: PIG-5433 > URL: https://issues.apache.org/jira/browse/PIG-5433 > Project: Pig > Issue Type: Bug >Reporter: Rohini Palaniswamy >Assignee: Rohini Palaniswamy >Priority: Major > Fix For: 0.18.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (PIG-5433) Fix test failures with TestHBaseStorage and htrace dependency
Rohini Palaniswamy created PIG-5433: --- Summary: Fix test failures with TestHBaseStorage and htrace dependency Key: PIG-5433 URL: https://issues.apache.org/jira/browse/PIG-5433 Project: Pig Issue Type: Bug Reporter: Rohini Palaniswamy Assignee: Rohini Palaniswamy Fix For: 0.18.0 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (PIG-5431) Date datatype is different between Hive 1.x and Hive 3.x
Rohini Palaniswamy created PIG-5431: --- Summary: Date datatype is different between Hive 1.x and Hive 3.x Key: PIG-5431 URL: https://issues.apache.org/jira/browse/PIG-5431 Project: Pig Issue Type: Bug Reporter: Rohini Palaniswamy Assignee: Rohini Palaniswamy Fix For: 0.18.0 java.lang.ClassCastException: org.apache.hadoop.hive.common.type.Date cannot be cast to java.sql.Date -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (PIG-5321) Upgrade Spark 2 version to 2.2.0 for Pig on Spark
[ https://issues.apache.org/jira/browse/PIG-5321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy resolved PIG-5321. - Resolution: Duplicate This has already been fixed by [~knoguchi] as part of PIG-5397, with the Spark 2 version upgraded to 2.4.8. > Upgrade Spark 2 version to 2.2.0 for Pig on Spark > - > > Key: PIG-5321 > URL: https://issues.apache.org/jira/browse/PIG-5321 > Project: Pig > Issue Type: Improvement > Components: spark >Reporter: Ádám Szita >Priority: Major > > Right now we maintain support for 2 versions of Spark for PoS jobs: > spark1.version=1.6.1 > spark2.version=2.1.1 > I believe we should move forward with the latter. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (PIG-5430) TestTezGraceParallelism failing due to tez log change
[ https://issues.apache.org/jira/browse/PIG-5430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17618002#comment-17618002 ] Rohini Palaniswamy commented on PIG-5430: - +1 > TestTezGraceParallelism failing due to tez log change > - > > Key: PIG-5430 > URL: https://issues.apache.org/jira/browse/PIG-5430 > Project: Pig > Issue Type: Test >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Trivial > Fix For: 0.18.0 > > Attachments: pig-5430-v01.patch > > > After PIG-5428, TestTezGraceParallelism:testIncreaseParallelism, > testDecreaseParallelism started failing due to change in log messages by > recent Tez. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (PIG-5429) Update hbase version from 2.0.0 to 2.4.14
[ https://issues.apache.org/jira/browse/PIG-5429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17618001#comment-17618001 ] Rohini Palaniswamy commented on PIG-5429: - +1 > Update hbase version from 2.0.0 to 2.4.14 > - > > Key: PIG-5429 > URL: https://issues.apache.org/jira/browse/PIG-5429 > Project: Pig > Issue Type: Improvement >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Minor > Attachments: pig-5429-v01.patch > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (PIG-5428) Update hadoop2,3 and tez to recent versions
[ https://issues.apache.org/jira/browse/PIG-5428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17616638#comment-17616638 ] Rohini Palaniswamy commented on PIG-5428: - +1 > Update hadoop2,3 and tez to recent versions > --- > > Key: PIG-5428 > URL: https://issues.apache.org/jira/browse/PIG-5428 > Project: Pig > Issue Type: Improvement >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Major > Fix For: 0.18.0 > > Attachments: pig-5428-v01.patch > > > PIG-5253 hadoop3 patch is committed. > Now, updating hadoop2&3, tez and other dependent library versions. > Only testing with two different parameter combinations: > * -Dhbaseversion=2 -Dhadoopversion=2 -Dhiveversion=1 -Dsparkversion=2 > and > * -Dhbaseversion=2 -Dhadoopversion=3 -Dhiveversion=3 -Dsparkversion=2 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5406) TestJoinLocal imports org.python.google.common.collect.Lists instead of org.google.common.collect.Lists
[ https://issues.apache.org/jira/browse/PIG-5406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5406: Hadoop Flags: Reviewed Resolution: Fixed Status: Resolved (was: Patch Available) Committed to trunk. Thanks Daniel for the review. > TestJoinLocal imports org.python.google.common.collect.Lists instead of > org.google.common.collect.Lists > --- > > Key: PIG-5406 > URL: https://issues.apache.org/jira/browse/PIG-5406 > Project: Pig > Issue Type: Bug >Affects Versions: 0.15.0, 0.16.0, 0.17.0 >Reporter: James Z.M. Gao >Assignee: Rohini Palaniswamy >Priority: Minor > Fix For: 0.18.0 > > Attachments: PIG-5406-v1.patch > > > [PIG-4366|https://github.com/apache/pig/commit/81abb6bd0adb6e101898d67b3c2a9e35e11ce993] > make PIG-2861 coming back. -- This message was sent by Atlassian Jira (v8.20.10#820010)
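The change boils down to an import swap; a hedged illustration follows (the real test body is omitted, and note that Guava's canonical package is com.google.common.collect, which appears to be what the issue title intends).
{code}
// import org.python.google.common.collect.Lists;  // repackaged copy bundled with Jython
import com.google.common.collect.Lists;             // Guava's own package

import java.util.List;

public class ListsImportSketch {
    public static List<String> sample() {
        return Lists.newArrayList("a", "b");
    }
}
{code}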
[jira] [Created] (PIG-5426) Migrate to log4j2.x
Rohini Palaniswamy created PIG-5426: --- Summary: Migrate to log4j2.x Key: PIG-5426 URL: https://issues.apache.org/jira/browse/PIG-5426 Project: Pig Issue Type: Improvement Reporter: Rohini Palaniswamy Assignee: Rohini Palaniswamy Fix For: 0.18.0 Hadoop (HADOOP-18088) decided to migrate to reload4j to address log4j1.x vulnerabilities. I did the work of migrating Oozie server and client to log4j2.x while launched hadoop jobs will still use 1.x till hadoop migrates. So think it should be easy to do that for pig client as well. If it does not work as expected, will just go with the easy switch to reload4j. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (PIG-5388) Upgrade to Avro and Trevni 1.9.x
[ https://issues.apache.org/jira/browse/PIG-5388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy reassigned PIG-5388: --- Fix Version/s: 0.18.0 Assignee: Rohini Palaniswamy Summary: Upgrade to Avro and Trevni 1.9.x (was: Upgrade to Avro 1.9.x) > Upgrade to Avro and Trevni 1.9.x > > > Key: PIG-5388 > URL: https://issues.apache.org/jira/browse/PIG-5388 > Project: Pig > Issue Type: Improvement >Reporter: Nándor Kollár >Assignee: Rohini Palaniswamy >Priority: Minor > Fix For: 0.18.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5419) Upgrade Joda time version
[ https://issues.apache.org/jira/browse/PIG-5419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5419: Fix Version/s: 0.18.0 Can you update to 2.11.0 (https://www.joda.org/joda-time/changes-report.html#a2.11.0)? > Upgrade Joda time version > - > > Key: PIG-5419 > URL: https://issues.apache.org/jira/browse/PIG-5419 > Project: Pig > Issue Type: Improvement >Reporter: Venkatasubrahmanian Narayanan >Assignee: Venkatasubrahmanian Narayanan >Priority: Minor > Fix For: 0.18.0 > > Attachments: PIG-5419.patch > > > Pig depends on an older version of Joda time, which can result in conflicts > with other versions in some workflows. Upgrading it to the latest > version (2.10.13) will resolve Pig's side of such issues. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5406) TestJoinLocal imports org.python.google.common.collect.Lists instead of org.google.common.collect.Lists
[ https://issues.apache.org/jira/browse/PIG-5406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5406: Attachment: PIG-5406-v1.patch > TestJoinLocal imports org.python.google.common.collect.Lists instead of > org.google.common.collect.Lists > --- > > Key: PIG-5406 > URL: https://issues.apache.org/jira/browse/PIG-5406 > Project: Pig > Issue Type: Bug >Affects Versions: 0.15.0, 0.16.0, 0.17.0 >Reporter: James Z.M. Gao >Assignee: Rohini Palaniswamy >Priority: Minor > Fix For: 0.18.0 > > Attachments: PIG-5406-v1.patch > > > [PIG-4366|https://github.com/apache/pig/commit/81abb6bd0adb6e101898d67b3c2a9e35e11ce993] > make PIG-2861 coming back. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5406) TestJoinLocal imports org.python.google.common.collect.Lists instead of org.google.common.collect.Lists
[ https://issues.apache.org/jira/browse/PIG-5406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5406: Status: Patch Available (was: Open) > TestJoinLocal imports org.python.google.common.collect.Lists instead of > org.google.common.collect.Lists > --- > > Key: PIG-5406 > URL: https://issues.apache.org/jira/browse/PIG-5406 > Project: Pig > Issue Type: Bug >Affects Versions: 0.17.0, 0.16.0, 0.15.0 >Reporter: James Z.M. Gao >Assignee: Rohini Palaniswamy >Priority: Minor > Fix For: 0.18.0 > > Attachments: PIG-5406-v1.patch > > > [PIG-4366|https://github.com/apache/pig/commit/81abb6bd0adb6e101898d67b3c2a9e35e11ce993] > make PIG-2861 coming back. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (PIG-5406) TestJoinLocal imports org.python.google.common.collect.Lists instead of org.google.common.collect.Lists
[ https://issues.apache.org/jira/browse/PIG-5406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy reassigned PIG-5406: --- Fix Version/s: 0.18.0 Assignee: Rohini Palaniswamy Priority: Minor (was: Major) > TestJoinLocal imports org.python.google.common.collect.Lists instead of > org.google.common.collect.Lists > --- > > Key: PIG-5406 > URL: https://issues.apache.org/jira/browse/PIG-5406 > Project: Pig > Issue Type: Bug >Affects Versions: 0.15.0, 0.16.0, 0.17.0 >Reporter: James Z.M. Gao >Assignee: Rohini Palaniswamy >Priority: Minor > Fix For: 0.18.0 > > > [PIG-4366|https://github.com/apache/pig/commit/81abb6bd0adb6e101898d67b3c2a9e35e11ce993] > make PIG-2861 coming back. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (PIG-5423) Upgrade hadoop/tez dependency
[ https://issues.apache.org/jira/browse/PIG-5423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17579177#comment-17579177 ] Rohini Palaniswamy commented on PIG-5423: - [~knoguchi], you mentioned having to add tez_conf.set("tez.runtime.transfer.data-via-events.enabled", "false"); to fix some test failures. Can the patch be updated with that? > Upgrade hadoop/tez dependency > -- > > Key: PIG-5423 > URL: https://issues.apache.org/jira/browse/PIG-5423 > Project: Pig > Issue Type: Improvement >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Major > Attachments: pig-5423-v01.patch > > > We already have PIG-5253 for supporting hadoop3. Here, upgrading the hadoop2 > dependency to the most recent hadoop2 version, 2.10.1. > Also, upgrading Tez to 0.9.2. (0.10.1 showed some regressions and needs > further checking). -- This message was sent by Atlassian Jira (v8.20.10#820010)
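For reference, the test-configuration tweak being requested, sketched in isolation; the wrapper class is hypothetical and only the set() call comes from the comment above.
{code}
import org.apache.hadoop.conf.Configuration;

public class TezTestConfSketch {
    // Turn off event-based data transfer in the mini-cluster tests so outputs
    // always go through the regular shuffle path.
    public static Configuration disableDataViaEvents(Configuration tez_conf) {
        tez_conf.set("tez.runtime.transfer.data-via-events.enabled", "false");
        return tez_conf;
    }
}
{code}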
[jira] [Resolved] (PIG-5422) Upgrade guava/groovy dependency
[ https://issues.apache.org/jira/browse/PIG-5422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy resolved PIG-5422. - Fix Version/s: 0.18.0 Hadoop Flags: Reviewed Resolution: Fixed +1. Committed to trunk. Thanks Koji. > Upgrade guava/groovy dependency > --- > > Key: PIG-5422 > URL: https://issues.apache.org/jira/browse/PIG-5422 > Project: Pig > Issue Type: Improvement >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Trivial > Fix For: 0.18.0 > > Attachments: pig-5422-v01.patch, pig-5422-v02.patch > > > Following owasp/cve. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (PIG-5421) Upgrade commons dependencies
[ https://issues.apache.org/jira/browse/PIG-5421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy resolved PIG-5421. - Fix Version/s: 0.18.0 Hadoop Flags: Reviewed Resolution: Fixed Committed to trunk. Thanks Koji > Upgrade commons dependencies > - > > Key: PIG-5421 > URL: https://issues.apache.org/jira/browse/PIG-5421 > Project: Pig > Issue Type: Improvement >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Trivial > Fix For: 0.18.0 > > Attachments: pig-5421-v01.patch > > > Following owasp/cve report -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (PIG-5253) Pig Hadoop 3 support
[ https://issues.apache.org/jira/browse/PIG-5253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy resolved PIG-5253. - Hadoop Flags: Reviewed Resolution: Fixed > Pig Hadoop 3 support > > > Key: PIG-5253 > URL: https://issues.apache.org/jira/browse/PIG-5253 > Project: Pig > Issue Type: Improvement >Reporter: Nándor Kollár >Assignee: Ádám Szita >Priority: Blocker > Fix For: 0.18.0 > > Attachments: PIG-5253-v3.patch, PIG-5253.0.patch, PIG-5253.0.svn.patch > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Reopened] (PIG-5253) Pig Hadoop 3 support
[ https://issues.apache.org/jira/browse/PIG-5253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy reopened PIG-5253: - > Pig Hadoop 3 support > > > Key: PIG-5253 > URL: https://issues.apache.org/jira/browse/PIG-5253 > Project: Pig > Issue Type: Improvement >Reporter: Nándor Kollár >Assignee: Ádám Szita >Priority: Blocker > Fix For: 0.18.0 > > Attachments: PIG-5253-v3.patch, PIG-5253.0.patch, PIG-5253.0.svn.patch > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (PIG-5377) Move supportsParallelWriteToStoreLocation from StoreFunc to StoreFuncInterfce
[ https://issues.apache.org/jira/browse/PIG-5377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy resolved PIG-5377. - Hadoop Flags: Reviewed Resolution: Fixed > Move supportsParallelWriteToStoreLocation from StoreFunc to StoreFuncInterfce > - > > Key: PIG-5377 > URL: https://issues.apache.org/jira/browse/PIG-5377 > Project: Pig > Issue Type: Improvement > Components: internal-udfs, piggybank >Reporter: Kevin J. Price >Assignee: Kevin J. Price >Priority: Minor > Fix For: 0.18.0 > > Attachments: PIG-5377-2.patch, PIG-5377.patch > > > Now that we're running on JDK8 and can have default implementations in > interfaces, we can move supportsParallelWriteToStoreLocation() to the > StoreFuncInterface class and properly set it on the supported built-in > functions rather than having a static list. -- This message was sent by Atlassian Jira (v8.20.10#820010)
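A hedged sketch of the shape of this change; the real StoreFuncInterface declares many more methods (elided here), and the default value shown is an assumption about which answer is the common case.
{code}
// Sketch only, not the actual org.apache.pig.StoreFuncInterface.
public interface StoreFuncInterfaceSketch {

    // JDK8 default: most stores do not support parallel writes to the same location.
    default boolean supportsParallelWriteToStoreLocation() {
        return false;
    }
}

// A built-in store that does support it simply overrides the default,
// replacing the old hard-coded static list of supported store functions.
class ParallelFriendlyStoreSketch implements StoreFuncInterfaceSketch {
    @Override
    public boolean supportsParallelWriteToStoreLocation() {
        return true;
    }
}
{code}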
[jira] [Updated] (PIG-5377) Move supportsParallelWriteToStoreLocation from StoreFunc to StoreFuncInterfce
[ https://issues.apache.org/jira/browse/PIG-5377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5377: Patch Info: (was: Patch Available) > Move supportsParallelWriteToStoreLocation from StoreFunc to StoreFuncInterfce > - > > Key: PIG-5377 > URL: https://issues.apache.org/jira/browse/PIG-5377 > Project: Pig > Issue Type: Improvement > Components: internal-udfs, piggybank >Reporter: Kevin J. Price >Assignee: Kevin J. Price >Priority: Minor > Fix For: 0.18.0 > > Attachments: PIG-5377-2.patch, PIG-5377.patch > > > Now that we're running on JDK8 and can have default implementations in > interfaces, we can move supportsParallelWriteToStoreLocation() to the > StoreFuncInterface class and properly set it on the supported built-in > functions rather than having a static list. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Reopened] (PIG-5377) Move supportsParallelWriteToStoreLocation from StoreFunc to StoreFuncInterfce
[ https://issues.apache.org/jira/browse/PIG-5377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy reopened PIG-5377: - > Move supportsParallelWriteToStoreLocation from StoreFunc to StoreFuncInterfce > - > > Key: PIG-5377 > URL: https://issues.apache.org/jira/browse/PIG-5377 > Project: Pig > Issue Type: Improvement > Components: internal-udfs, piggybank >Reporter: Kevin J. Price >Assignee: Kevin J. Price >Priority: Minor > Fix For: 0.18.0 > > Attachments: PIG-5377-2.patch, PIG-5377.patch > > > Now that we're running on JDK8 and can have default implementations in > interfaces, we can move supportsParallelWriteToStoreLocation() to the > StoreFuncInterface class and properly set it on the supported built-in > functions rather than having a static list. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (PIG-5425) Pig 0.15 and later don't set context signature correctly
[ https://issues.apache.org/jira/browse/PIG-5425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy resolved PIG-5425. - Hadoop Flags: Reviewed Resolution: Fixed > Pig 0.15 and later don't set context signature correctly > > > Key: PIG-5425 > URL: https://issues.apache.org/jira/browse/PIG-5425 > Project: Pig > Issue Type: Improvement >Reporter: Jacob Tolar >Assignee: Jacob Tolar >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5425.0.patch > > > As an author of Pig UDFs, my expectation in EvalFunc ( > [https://github.com/apache/pig/blob/release-0.17.0/src/org/apache/pig/EvalFunc.java] > ) is that {{setUDFContextSignature}} would be called before > {{setInputSchema}}. This was previously the case up through Pig 0.14 > > In Pig 0.15 and later (according to the git tags, at least; I've only checked > 0.17), this is not true. > This commit introduces the problem behavior: > [https://github.com/apache/pig/commit/8af34f1971628d1eeb0cd1f07fe03397ca887b81] > The issue is in > src/org/apache/pig/newplan/logical/expression/ExpToPhyTranslationVisitor.java > line 513 ([git blame > link|https://github.com/apache/pig/blame/release-0.17.0/src/org/apache/pig/newplan/logical/expression/ExpToPhyTranslationVisitor.java#L513]) > introduced in that commit. > > There, {{f.setInputSchema()}} is called without previously calling > {{f.setUDFContextSignature(signature)}}. > Note that on line 509, {{((POUserFunc)p).setSignature(op.getSignature());}} > is called, but POUserFunc [re-instantiates the EvalFunc and does not actually > use the func argument passed in its > constructor|https://github.com/apache/pig/blame/release-0.17.0/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POUserFunc.java#L119-L128] > (quite confusing, but probably attributable to changes over time). > {{f}} is discarded, so it should be safe to simply call > {{f.setUdfContextSignature(signature)}} as a simple fix. > The code here is arguably unnecessarily complex and could probably be cleaned > up further, but I propose the simple fix above without a larger refactoring. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
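A hedged sketch of the call-order point made in the description; the wrapper is illustrative, while the underlying one-line reorder lives in ExpToPhyTranslationVisitor as discussed above.
{code}
import org.apache.pig.EvalFunc;
import org.apache.pig.impl.logicalLayer.schema.Schema;

// Illustrative wiring helper: the signature must be set before the input schema,
// so that setInputSchema() can safely key into UDFContext.
public class UdfWiringSketch {
    static void wire(EvalFunc<?> f, String signature, Schema inputSchema) {
        f.setUDFContextSignature(signature);
        f.setInputSchema(inputSchema);
    }
}
{code}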
[jira] [Reopened] (PIG-5425) Pig 0.15 and later don't set context signature correctly
[ https://issues.apache.org/jira/browse/PIG-5425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy reopened PIG-5425: - > Pig 0.15 and later don't set context signature correctly > > > Key: PIG-5425 > URL: https://issues.apache.org/jira/browse/PIG-5425 > Project: Pig > Issue Type: Improvement >Reporter: Jacob Tolar >Assignee: Jacob Tolar >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5425.0.patch > > > As an author of Pig UDFs, my expectation in EvalFunc ( > [https://github.com/apache/pig/blob/release-0.17.0/src/org/apache/pig/EvalFunc.java] > ) is that {{setUDFContextSignature}} would be called before > {{setInputSchema}}. This was previously the case up through Pig 0.14 > > In Pig 0.15 and later (according to the git tags, at least; I've only checked > 0.17), this is not true. > This commit introduces the problem behavior: > [https://github.com/apache/pig/commit/8af34f1971628d1eeb0cd1f07fe03397ca887b81] > The issue is in > src/org/apache/pig/newplan/logical/expression/ExpToPhyTranslationVisitor.java > line 513 ([git blame > link|https://github.com/apache/pig/blame/release-0.17.0/src/org/apache/pig/newplan/logical/expression/ExpToPhyTranslationVisitor.java#L513]) > introduced in that commit. > > There, {{f.setInputSchema()}} is called without previously calling > {{f.setUDFContextSignature(signature)}}. > Note that on line 509, {{((POUserFunc)p).setSignature(op.getSignature());}} > is called, but POUserFunc [re-instantiates the EvalFunc and does not actually > use the func argument passed in its > constructor|https://github.com/apache/pig/blame/release-0.17.0/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POUserFunc.java#L119-L128] > (quite confusing, but probably attributable to changes over time). > {{f}} is discarded, so it should be safe to simply call > {{f.setUdfContextSignature(signature)}} as a simple fix. > The code here is arguably unnecessarily complex and could probably be cleaned > up further, but I propose the simple fix above without a larger refactoring. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5425) Pig 0.15 and later don't set context signature correctly
[ https://issues.apache.org/jira/browse/PIG-5425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5425: Resolution: Fixed Status: Resolved (was: Patch Available) > Pig 0.15 and later don't set context signature correctly > > > Key: PIG-5425 > URL: https://issues.apache.org/jira/browse/PIG-5425 > Project: Pig > Issue Type: Improvement >Reporter: Jacob Tolar >Assignee: Jacob Tolar >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5425.0.patch > > > As an author of Pig UDFs, my expectation in EvalFunc ( > [https://github.com/apache/pig/blob/release-0.17.0/src/org/apache/pig/EvalFunc.java] > ) is that {{setUDFContextSignature}} would be called before > {{setInputSchema}}. This was previously the case up through Pig 0.14 > > In Pig 0.15 and later (according to the git tags, at least; I've only checked > 0.17), this is not true. > This commit introduces the problem behavior: > [https://github.com/apache/pig/commit/8af34f1971628d1eeb0cd1f07fe03397ca887b81] > The issue is in > src/org/apache/pig/newplan/logical/expression/ExpToPhyTranslationVisitor.java > line 513 ([git blame > link|https://github.com/apache/pig/blame/release-0.17.0/src/org/apache/pig/newplan/logical/expression/ExpToPhyTranslationVisitor.java#L513]) > introduced in that commit. > > There, {{f.setInputSchema()}} is called without previously calling > {{f.setUDFContextSignature(signature)}}. > Note that on line 509, {{((POUserFunc)p).setSignature(op.getSignature());}} > is called, but POUserFunc [re-instantiates the EvalFunc and does not actually > use the func argument passed in its > constructor|https://github.com/apache/pig/blame/release-0.17.0/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POUserFunc.java#L119-L128] > (quite confusing, but probably attributable to changes over time). > {{f}} is discarded, so it should be safe to simply call > {{f.setUdfContextSignature(signature)}} as a simple fix. > The code here is arguably unnecessarily complex and could probably be cleaned > up further, but I propose the simple fix above without a larger refactoring. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (PIG-5425) Pig 0.15 and later don't set context signature correctly
[ https://issues.apache.org/jira/browse/PIG-5425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy reassigned PIG-5425: --- Fix Version/s: 0.18.0 Assignee: Jacob Tolar +1. Committed to trunk. Thanks for the patch [~jtolar] > Pig 0.15 and later don't set context signature correctly > > > Key: PIG-5425 > URL: https://issues.apache.org/jira/browse/PIG-5425 > Project: Pig > Issue Type: Improvement >Reporter: Jacob Tolar >Assignee: Jacob Tolar >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5425.0.patch > > > As an author of Pig UDFs, my expectation in EvalFunc ( > [https://github.com/apache/pig/blob/release-0.17.0/src/org/apache/pig/EvalFunc.java] > ) is that {{setUDFContextSignature}} would be called before > {{setInputSchema}}. This was previously the case up through Pig 0.14 > > In Pig 0.15 and later (according to the git tags, at least; I've only checked > 0.17), this is not true. > This commit introduces the problem behavior: > [https://github.com/apache/pig/commit/8af34f1971628d1eeb0cd1f07fe03397ca887b81] > The issue is in > src/org/apache/pig/newplan/logical/expression/ExpToPhyTranslationVisitor.java > line 513 ([git blame > link|https://github.com/apache/pig/blame/release-0.17.0/src/org/apache/pig/newplan/logical/expression/ExpToPhyTranslationVisitor.java#L513]) > introduced in that commit. > > There, {{f.setInputSchema()}} is called without previously calling > {{f.setUDFContextSignature(signature)}}. > Note that on line 509, {{((POUserFunc)p).setSignature(op.getSignature());}} > is called, but POUserFunc [re-instantiates the EvalFunc and does not actually > use the func argument passed in its > constructor|https://github.com/apache/pig/blame/release-0.17.0/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POUserFunc.java#L119-L128] > (quite confusing, but probably attributable to changes over time). > {{f}} is discarded, so it should be safe to simply call > {{f.setUdfContextSignature(signature)}} as a simple fix. > The code here is arguably unnecessarily complex and could probably be cleaned > up further, but I propose the simple fix above without a larger refactoring. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (PIG-5420) Update accumulo dependency to 1.10.1
[ https://issues.apache.org/jira/browse/PIG-5420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17579158#comment-17579158 ] Rohini Palaniswamy commented on PIG-5420: - This patch needs updating, as accumulo.version has moved to ivy/libraries-h2.properties and ivy/libraries-h3.properties after PIG-5253. > Update accumulo dependency to 1.10.1 > > > Key: PIG-5420 > URL: https://issues.apache.org/jira/browse/PIG-5420 > Project: Pig > Issue Type: Improvement >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Trivial > Attachments: pig-5420-v01.patch > > > Following owasp/cve report. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (PIG-5253) Pig Hadoop 3 support
[ https://issues.apache.org/jira/browse/PIG-5253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy resolved PIG-5253. - Resolution: Fixed Attached [^PIG-5253.0.svn.patch] from https://reviews.apache.org/r/72326/ to jira. Fixed the wrong license file and committed [^PIG-5253-v3.patch] to trunk. Thanks [~nkollar] and [~szita] for this key patch. > Pig Hadoop 3 support > > > Key: PIG-5253 > URL: https://issues.apache.org/jira/browse/PIG-5253 > Project: Pig > Issue Type: Improvement >Reporter: Nándor Kollár >Assignee: Ádám Szita >Priority: Blocker > Fix For: 0.18.0 > > Attachments: PIG-5253-v3.patch, PIG-5253.0.patch, PIG-5253.0.svn.patch > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5253) Pig Hadoop 3 support
[ https://issues.apache.org/jira/browse/PIG-5253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5253: Attachment: PIG-5253.0.svn.patch PIG-5253-v3.patch > Pig Hadoop 3 support > > > Key: PIG-5253 > URL: https://issues.apache.org/jira/browse/PIG-5253 > Project: Pig > Issue Type: Improvement >Reporter: Nándor Kollár >Assignee: Ádám Szita >Priority: Blocker > Fix For: 0.18.0 > > Attachments: PIG-5253-v3.patch, PIG-5253.0.patch, PIG-5253.0.svn.patch > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5377) Move supportsParallelWriteToStoreLocation from StoreFunc to StoreFuncInterfce
[ https://issues.apache.org/jira/browse/PIG-5377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5377: Resolution: Fixed Status: Resolved (was: Patch Available) > Move supportsParallelWriteToStoreLocation from StoreFunc to StoreFuncInterfce > - > > Key: PIG-5377 > URL: https://issues.apache.org/jira/browse/PIG-5377 > Project: Pig > Issue Type: Improvement > Components: internal-udfs, piggybank >Reporter: Kevin J. Price >Assignee: Kevin J. Price >Priority: Minor > Fix For: 0.18.0 > > Attachments: PIG-5377-2.patch, PIG-5377.patch > > > Now that we're running on JDK8 and can have default implementations in > interfaces, we can move supportsParallelWriteToStoreLocation() to the > StoreFuncInterface class and properly set it on the supported built-in > functions rather than having a static list. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5377) Move supportsParallelWriteToStoreLocation from StoreFunc to StoreFuncInterfce
[ https://issues.apache.org/jira/browse/PIG-5377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5377: Fix Version/s: 0.18.0 Committed to trunk. Thank you for the contribution [~kpriceyahoo]. > Move supportsParallelWriteToStoreLocation from StoreFunc to StoreFuncInterfce > - > > Key: PIG-5377 > URL: https://issues.apache.org/jira/browse/PIG-5377 > Project: Pig > Issue Type: Improvement > Components: internal-udfs, piggybank >Reporter: Kevin J. Price >Assignee: Kevin J. Price >Priority: Minor > Fix For: 0.18.0 > > Attachments: PIG-5377-2.patch, PIG-5377.patch > > > Now that we're running on JDK8 and can have default implementations in > interfaces, we can move supportsParallelWriteToStoreLocation() to the > StoreFuncInterface class and properly set it on the supported built-in > functions rather than having a static list. -- This message was sent by Atlassian Jira (v8.20.10#820010)