I don't think this would be hard to implement. The physical explode operator already supports outer semantics (for our HiveQL compatibility).
Perhaps comment on this JIRA?

https://issues.apache.org/jira/browse/SPARK-13721

It could probably just be another argument to explode().

Michael

On Mon, Jul 25, 2016 at 6:12 PM, Don Drake <dondr...@gmail.com> wrote:

> No response on the Users list, I thought I would repost here.
>
> See below.
>
> -Don
>
> ---------- Forwarded message ----------
> From: Don Drake <dondr...@gmail.com>
> Date: Sun, Jul 24, 2016 at 2:18 PM
> Subject: Outer Explode needed
> To: user <u...@spark.apache.org>
>
> I have a nested data structure (array of structures) that I'm using the
> DSL df.explode() API to flatten the data. However, when the array is
> empty, I'm not getting the rest of the row in my output as it is skipped.
>
> This is the intended behavior, and Hive supports a SQL "OUTER explode()"
> to generate the row when the explode would not yield any output.
>
> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+LateralView
>
> Can we get this same outer explode in the DSL? I have to jump through
> some outer join hoops to get the rows where the array is empty.
>
> Thanks.
>
> -Don
>
> --
> Donald Drake
> Drake Consulting
> http://www.drakeconsulting.com/
> https://twitter.com/dondrake <http://www.MailLaunder.com/>
> 800-733-2143
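
For anyone following the thread, here is a rough sketch of the difference between a plain explode and an outer explode, as described in the quoted message. It is plain Python over lists of dicts, not the Spark API: the function names `explode` and `explode_outer`, the `id` and `items` fields, and the sample rows are all made up for illustration, and filling the exploded column with None for an empty array is an assumption modeled on how Hive's LATERAL VIEW OUTER is documented to behave.

```python
from typing import Any, Dict, Iterator, List

Row = Dict[str, Any]


def explode(rows: List[Row], column: str) -> Iterator[Row]:
    """Plain explode: one output row per array element.

    A row whose array is empty (or None) produces no output at all.
    """
    for row in rows:
        for item in row[column] or []:
            yield {**row, column: item}


def explode_outer(rows: List[Row], column: str) -> Iterator[Row]:
    """Outer explode: like explode, but a row with an empty (or None)
    array is still emitted once, with None in the exploded column.
    """
    for row in rows:
        items = row[column] or []
        if not items:
            yield {**row, column: None}
        else:
            for item in items:
                yield {**row, column: item}


# Hypothetical sample data: an array of structures per row.
rows = [
    {"id": 1, "items": [{"sku": "a"}, {"sku": "b"}]},
    {"id": 2, "items": []},  # empty array
]

inner = list(explode(rows, "items"))        # row with id 2 is dropped
outer = list(explode_outer(rows, "items"))  # row with id 2 is kept
```

With the plain version, the row with id 2 disappears from the output; with the outer version it survives with a None in place of the exploded value, which is the behavior being asked for without the outer join workaround.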