[ https://issues.apache.org/jira/browse/SPARK-2686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074135#comment-14074135 ]
Apache Spark commented on SPARK-2686:
-------------------------------------

User 'javadba' has created a pull request for this issue:
https://github.com/apache/spark/pull/1586

> Add Length support to Spark SQL and HQL and Strlen support to SQL
> -----------------------------------------------------------------
>
>                 Key: SPARK-2686
>                 URL: https://issues.apache.org/jira/browse/SPARK-2686
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 0.9.1, 0.9.2, 1.0.0, 1.1.0, 1.1.1
>         Environment: all
>            Reporter: Stephen Boesch
>            Priority: Minor
>              Labels: hql, length, sql
>             Fix For: 1.1.1
>
>   Original Estimate: 0h
>  Remaining Estimate: 0h
>
> Syntactic, parsing, and operational support have been added for the
> LEN(GTH) and STRLEN functions.
> Examples:
> SQL:
> import org.apache.spark.sql._
> case class TestData(key: Int, value: String)
> val sqlc = new SQLContext(sc)
> import sqlc._
> val testData: SchemaRDD = sqlc.sparkContext.parallelize(
>   (1 to 100).map(i => TestData(i, i.toString)))
> testData.registerAsTable("testData")
> sqlc.sql("select length(key) as key_len from testData order by key_len desc limit 5").collect
> res12: Array[org.apache.spark.sql.Row] = Array([3], [2], [2], [2], [2])
> HQL:
> val hc = new org.apache.spark.sql.hive.HiveContext(sc)
> import hc._
> hc.hql
> hql("select length(grp) from simplex").collect
> res14: Array[org.apache.spark.sql.Row] = Array([6], [6], [6], [6])
> As for the codebase changes: they have been purposefully made similar to
> the ones made for adding SUBSTR(ING) from July 17:
> SQLParser, Optimizer, Expression, stringOperations, and HiveQL were the main
> classes changed. The testing suites affected are ConstantFolding and
> ExpressionEvaluation.
> In addition, some ad-hoc testing was done, as shown in the examples.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
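
The SQL example in the issue applies length to the integer column key and reports [3], [2], [2], [2], [2]. The plain-Scala sketch below (no Spark required) reproduces that result, assuming that length on an Int column measures the length of its decimal string form. That assumption is consistent with the quoted output but is not stated in the issue. The object name LengthSketch and the value topKeyLens are hypothetical names used only for illustration.

```scala
// Plain-Scala sketch of what the issue's SQL example computes.
// Assumption: LENGTH on the Int column `key` is the length of its
// decimal string form (consistent with the output quoted in the issue).
case class TestData(key: Int, value: String)

object LengthSketch {
  // Same rows as the issue's example: keys 1 to 100.
  val testData: Seq[TestData] = (1 to 100).map(i => TestData(i, i.toString))

  // Rough equivalent of:
  //   select length(key) as key_len from testData
  //   order by key_len desc limit 5
  val topKeyLens: Seq[Int] =
    testData
      .map(_.key.toString.length)      // length(key) as key_len
      .sorted(Ordering[Int].reverse)   // order by key_len desc
      .take(5)                         // limit 5

  def main(args: Array[String]): Unit =
    println(topKeyLens.mkString("[", ", ", "]")) // prints [3, 2, 2, 2, 2]
}
```

Only the key 100 has three digits, so the top five lengths are one 3 followed by four 2s, matching res12 in the issue.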