insert_push_body_df = spark.sql('''select CASE WHEN t2.offer_id='' then NULL else t2.offer_id end as offer_id,\ CASE WHEN t2.content_set_id='' then NULL else t2.content_set_id end as content_set_id,\ CASE WHEN t2.post_id='' then NULL else t2.post_id end as post_id,t2.nuid,t2.apid,\ t2.push_id,t2.event_id as id1,t2.offset,t2.event_occurre d,min(t2.event_processed),\ t2.payload,t2.group_id,t2.trimmed,t2.event_type,from_unixtime(unix_timestamp(),'yyyy-MM-dd hh:mm:ss') as load_date,\ t2.app_id as app_id,CAST(REGEXP_REPLACE(SUBSTR(t2.event_occurred, 1, 10), '-', '') AS INT) AS occurred_part FROM \ (SELECT regexp_extract(t1.pl_unbase,'"offer_id":"([^"]*)"',1) as offer_id,\ regexp_extract(t1.pl_unbase,'"content_set_id":"([^"]*)"',1) as content_set_id,\ regexp_extract(t1.pl_unbase,'"post_id":"([^"]*)"',1) as post_id,\ regexp_replace(COALESCE(get_json_object(t1.pl_unbase, '$.audience.alias'),\ get_json_object(t1.pl_unbase,'$.audience.or.alias')),'\\[|\\"|\\]','') as nuid,\ regexp_replace(COALESCE(get_json_object(t1.pl_unbase, '$.audience.apid'),\ get_json_object(t1.pl_unbase, '$.audience.or.apid') ), '\\[|\\"|\\]','') as apid,\ t1.push_id as push_id,t1.event_id as event_id,t1.offset as offset,\ t1.occurred as event_occurred,t1.processed as event_processed,t1.pl_unbase as payload,\ t1.group_id as group_id,t1.trimmed as trimmed,\ t1.event_type as event_type,t1.app_id as app_id FROM (select cast(unbase64(body.payload) as string) as pl_unbase,\ body.push_id as push_id,id as event_id,offset as offset,occurred as occurred,\ processed as processed,body.group_id as group_id,body.trimmed as trimmed,\ type as event_type,app_id as app_id from dev_stg_urban_airship.push_body_stage) t1 ) t2 GROUP BY t2.offer_id,\ t2.content_set_id,t2.post_id,t2.nuid,t2.apid,t2.push_id,t2.e vent_id,t2.offset,t2.event_occurred,t2.payload,t2.group_id, t2.trimmed,t2.event_type,t2.app_id''')
In the above code when running as HQL APID value is getting populated ,but when using as sprak_sql its giving NULL. ITS line 12 regexp_replace(COALESCE(get_json_object(t1.pl_unbase, '$.audience.apid'),\ get_json_object(t1.pl_unbase, '$.audience.or.apid') ), '\\[|\\"|\\]','') as apid,\ HOW does get_json_object works in spark Regards Nirav