[ https://issues.apache.org/jira/browse/SPARK-12352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Herman van Hovell closed SPARK-12352.
-------------------------------------
    Resolution: Duplicate

> Reuse the result of split in SQL
> --------------------------------
>
>                 Key: SPARK-12352
>                 URL: https://issues.apache.org/jira/browse/SPARK-12352
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 1.5.2
>            Reporter: Yadong Qi
>            Priority: Critical
>
> When split is used in SQL and several values are read by index from the
> same array, the same row is split again for every index. Java's split also
> performs poorly, so the repeated work is expensive. In the plans below, the
> optimized and physical plans evaluate split(value#20,,) once per output column.
> {code}
> spark-sql> explain extended select array[0] as a, array[1] as b, array[2] as c from (select split(value, ',') as array from src_split) t;
> == Parsed Logical Plan ==
> 'Project [unresolvedalias('array[0] AS a#16),unresolvedalias('array[1] AS b#17),unresolvedalias('array[2] AS c#18)]
>  'Subquery t
>   'Project [unresolvedalias('split('value,,) AS array#15)]
>    'UnresolvedRelation [src_split], None
> == Analyzed Logical Plan ==
> a: string, b: string, c: string
> Project [array#15[0] AS a#16,array#15[1] AS b#17,array#15[2] AS c#18]
>  Subquery t
>   Project [split(value#20,,) AS array#15]
>    MetastoreRelation default, src_split, None
> == Optimized Logical Plan ==
> Project [split(value#20,,)[0] AS a#16,split(value#20,,)[1] AS b#17,split(value#20,,)[2] AS c#18]
>  MetastoreRelation default, src_split, None
> == Physical Plan ==
> Project [split(value#20,,)[0] AS a#16,split(value#20,,)[1] AS b#17,split(value#20,,)[2] AS c#18]
>  HiveTableScan [value#20], (MetastoreRelation default, src_split, None)
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
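
The redundant work described in the issue can be illustrated outside Spark. The following is a minimal plain-Python sketch, not Spark code and not the fix tracked by the duplicate issue: `split`, `project_inlined`, and `project_reused` are hypothetical stand-ins that count how often a row is split when the split expression is inlined into each output column, compared with splitting once and indexing the shared array.

```python
# Plain-Python illustration (not Spark code): count how often each row is split.
split_calls = 0


def split(value, sep):
    """Hypothetical stand-in for the SQL split() expression; counts invocations."""
    global split_calls
    split_calls += 1
    return value.split(sep)


def project_inlined(row):
    # Mirrors the optimized plan in the issue: split(value, ',')[i] is
    # evaluated separately for each of the three output columns.
    return (split(row, ",")[0], split(row, ",")[1], split(row, ",")[2])


def project_reused(row):
    # The behavior the issue asks for: split once, then index the same array.
    arr = split(row, ",")
    return (arr[0], arr[1], arr[2])


rows = ["a,b,c", "d,e,f"]

split_calls = 0
inlined = [project_inlined(r) for r in rows]
inlined_calls = split_calls  # three splits per row

split_calls = 0
reused = [project_reused(r) for r in rows]
reused_calls = split_calls  # one split per row
```

Both projections return the same tuples; only the number of split invocations differs, which is the cost the reporter is pointing at.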