Thank you again.

For

val r = df.filter(col("paid") > "").map(x => (x.getString(0),x.getString(1).....)

can you give an example of a column expression, please? Something like df.filter(col("paid") > "").col("firstcolumn").getString ?....

On Thursday, 24 March 2016, 0:45, Michael Armbrust <mich...@databricks.com> wrote:

You can only use as on a Column expression, not inside of a lambda function. The reason is that the lambda function is compiled into opaque bytecode that Spark SQL is not able to see. We just blindly execute it.

However, there are a couple of ways to name the columns that come out of a map. Either use a case class instead of a tuple, or use .toDF("name1", "name2"....) after the map.

From a performance perspective, it's even better though if you can avoid maps and stick to Column expressions. The reason is that for maps, we have to actually materialize an object to pass to your function. However, if you stick to Column expressions we can actually work directly on serialized data.

On Wed, Mar 23, 2016 at 5:27 PM, Ashok Kumar <ashok34...@yahoo.com> wrote:

Thank you sir.

sql("select `_1` as firstcolumn from items")

Is there any way one can keep the CSV column names using databricks when mapping?

val r = df.filter(col("paid") > "").map(x => (x.getString(0),x.getString(1).....)

Can I call, for example, x.getString(0).as.(firstcolumn) in the above when mapping, if possible, so the columns will have labels?

On Thursday, 24 March 2016, 0:18, Michael Armbrust <mich...@databricks.com> wrote:

You probably need to use `backticks` to escape `_1` since I don't think that it's a valid SQL identifier.

On Wed, Mar 23, 2016 at 5:10 PM, Ashok Kumar <ashok34...@yahoo.com.invalid> wrote:

Gurus,

If I register a temporary table as below

r.toDF
res58: org.apache.spark.sql.DataFrame = [_1: string, _2: string, _3: double, _4: double, _5: double]

r.toDF.registerTempTable("items")

sql("select * from items")
res60: org.apache.spark.sql.DataFrame = [_1: string, _2: string, _3: double, _4: double, _5: double]

Is there any way I can do a select on the first column only?

sql("select _1 from items") throws an error.

Thanking you
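
To illustrate the two approaches mentioned in Michael's reply (staying with Column expressions, or naming the output of a map with toDF), here is a minimal sketch for spark-shell. It assumes the Spark 1.6-era API used in this thread, where sqlContext is predefined. The sample rows and the column names "invoice", "paid" and "paid_amount" are hypothetical stand-ins, since the thread does not show the actual CSV header.

```scala
// Sketch only: run inside spark-shell (Spark 1.6-era API assumed).
// The data and the column names "invoice" and "paid" are hypothetical.
import org.apache.spark.sql.functions.col
import sqlContext.implicits._

// Stand-in for the DataFrame that would be loaded from the CSV file.
val df = Seq(("1001", "12.50"), ("1002", "")).toDF("invoice", "paid")

// Approach 1: Column expressions only. Filter, then select and rename
// with `as`, so Spark SQL can see the whole expression.
val r = df.filter(col("paid") > "")
  .select(
    col("invoice").as("firstcolumn"),
    col("paid").cast("double").as("paid_amount"))

// The named column can now be queried without backticks.
r.registerTempTable("items")
sqlContext.sql("select firstcolumn from items").show()

// Approach 2: if a map is unavoidable, name the resulting tuple
// columns afterwards with toDF.
val named = df.filter(col("paid") > "")
  .map(x => (x.getString(0), x.getString(1)))
  .toDF("firstcolumn", "paid")
```

The first form avoids the map entirely, which is the variant the reply recommends for performance. The second keeps the map from the original question and only adds the labels afterwards.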