[ https://issues.apache.org/jira/browse/SPARK-26777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16755949#comment-16755949 ]
Hyukjin Kwon commented on SPARK-26777: -------------------------------------- Please don't reopen. Take a look https://spark.apache.org/contributing.html > SQL worked in 2.3.2 and fails in 2.4.0 > -------------------------------------- > > Key: SPARK-26777 > URL: https://issues.apache.org/jira/browse/SPARK-26777 > Project: Spark > Issue Type: Bug > Components: Spark Core > Affects Versions: 2.4.0 > Reporter: Yuri Budilov > Priority: Major > > Following SQL worked in Spark 2.3.2 and now fails on 2.4.0 (AWS EMR Spark) > ------------ PySpark call below: > spark.sql("select partition_year_utc,partition_month_utc,partition_day_utc \ > from datalake_reporting.copy_of_leads_notification \ > where partition_year_utc = (select max(partition_year_utc) from > datalake_reporting.copy_of_leads_notification) \ > and partition_month_utc = \ > (select max(partition_month_utc) from > datalake_reporting.copy_of_leads_notification as m \ > where \ > m.partition_year_utc = (select max(partition_year_utc) from > datalake_reporting.copy_of_leads_notification)) \ > and partition_day_utc = (select max(d.partition_day_utc) from > datalake_reporting.copy_of_leads_notification as d \ > where d.partition_month_utc = \ > (select max(m1.partition_month_utc) from > datalake_reporting.copy_of_leads_notification as m1 \ > where m1.partition_year_utc = \ > (select max(y.partition_year_utc) from > datalake_reporting.copy_of_leads_notification as y) \ > ) \ > ) \ > order by 1 desc, 2 desc, 3 desc limit 1 ").show(1,False) > Error: (no need for data, this is syntax). > py4j.protocol.Py4JJavaError: An error occurred while calling o1326.showString. > : java.lang.UnsupportedOperationException: Cannot evaluate expression: > scalar-subquery#4495 [] > > Note: all 3 columns in query are Partitioned columns - see bottom of the > schema) > > Hive EMR AWS Schema is: > > CREATE EXTERNAL TABLE `copy_of_leads_notification`( > `message.environment.siteorigin` string, `dcpheader.dcploaddateutc` string, > `message.id` int, `source.properties._country` string, `message.created` > string, `dcpheader.generatedmessageid` string, `message.tags` bigint, > `source.properties._enqueuedtimeutc` string, `source.properties._leadtype` > string, `message.itemid` string, `message.prospect.postcode` string, > `message.prospect.email` string, `message.referenceid` string, > `message.item.year` string, `message.identifier` string, > `dcpheader.dcploadmonthutc` string, `message.processed` string, > `source.properties._tenant` string, `message.item.price` string, > `message.subscription.confirmresponse` boolean, `message.itemtype` string, > `message.prospect.lastname` string, `message.subscription.insurancequote` > boolean, `source.exchangename` string, > `message.prospect.identificationnumbers` bigint, > `message.environment.ipaddress` string, `dcpheader.dcploaddayutc` string, > `source.properties._itemtype` string, `source.properties._requesttype` > string, `message.item.make` string, `message.prospect.firstname` string, > `message.subscription.survey` boolean, `message.prospect.homephone` string, > `message.extendedproperties` bigint, `message.subscription.financequote` > boolean, `message.uniqueidentifier` string, `source.properties._id` string, > `dcpheader.sourcemessageguid` string, `message.requesttype` string, > `source.routingkey` string, `message.service` string, `message.item.model` > string, `message.environment.pagesource` string, `source.source` string, > `message.sellerid` string, `partition_date_utc` string, > `message.selleridentifier` string, `message.subscription.newsletter` boolean, > `dcpheader.dcploadyearutc` string, `message.leadtype` string, > `message.history` bigint, `message.callconnect.calloutcome` string, > `message.callconnect.datecreatedutc` string, > `message.callconnect.callrecordingurl` string, > `message.callconnect.transferoutcome` string, > `message.callconnect.hiderecording` boolean, > `message.callconnect.callstartutc` string, `message.callconnect.code` string, > `message.callconnect.callduration` string, `message.fraudnetinfo` string, > `message.callconnect.answernumber` string, `message.environment.sourcedevice` > string, `message.comments` string, `message.fraudinfo.servervariables` > bigint, `message.callconnect.servicenumber` string, > `message.callconnect.callid` string, `message.callconnect.voicemailurl` > string, `message.item.stocknumber` string, > `message.callconnect.answerduration` string, `message.callconnect.callendutc` > string, `message.item.series` string, `message.item.detailsurl` string, > `message.item.pricetype` string, `message.item.description` string, > `message.item.colour` string, `message.item.badge` string, > `message.item.odometer` string, `message.environment.requestheader` string, > `message.item.registrationnumber` string, `message.item.bodytype` string, > `message.item.fueltype` string, `message.item.redbookcode` string, > `message.item.spotid` string, `message.item.id` string, > `message.item.transmission` string, `message.item.vin` string, > `message.item.enginedescription` string, `message.prospect.mobilephone` > string, `message.prospect.membertrackingid` string, > `message.environment.username` string, `message.prospect.workphone` string, > `message.environment.servername` string, `message.environment.sessionid` > string, `message.tradein.type` string, `message.tradein.model` string, > `message.tradein.year` string, `message.tradein.make` string, > `message.tradein.kms` string, `message.fraudinfo.servertimestamp` string, > `message.prospect.suburb` string, `message.callconnect.username` string, > `message.callconnect.password` string, `message.status` string, > `message.tradein.colour` string, `message.prospect.address` string, > `message.prospect.state` string, `message.tradein.detailsurl` string, > `message.prospect.faxnumber` string, `message.prospect.companyname` string, > `message.prospect.title` string, `message.callconnect.calloutcometext` > string, `message.prospect.preferredcontacttime` string, > `message.fraudinfo.devicedata` string, > `message.prospect.preferredcontactmethod` string, > `message.assignment.assigned` string, `message.assignment.email` string, > `message.assignment.name` string, `message.country` string, > `message.financepackage.id` string, `message.financepackage.version` string) > PARTITIONED BY ( `partition_year_utc` string, `partition_month_utc` string, > `partition_day_utc` string, `job_run_guid` string) > ROW FORMAT SERDE > 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' STORED AS > INPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' > OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat' > LOCATION 's3://datalake/yurib_test/leads_notification' > -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org