Thank you again.

For

val r = df.filter(col("paid") > "").map(x => (x.getString(0), x.getString(1), ...))

can you please give an example of a Column expression? Something like

df.filter(col("paid") > "").col("firstcolumn").getString ...?

 

On Thursday, 24 March 2016, 0:45, Michael Armbrust <mich...@databricks.com> wrote:

You can only use .as(...) on a Column expression, not inside a lambda function. The reason is that the lambda function is compiled into opaque bytecode that Spark SQL is not able to see. We just blindly execute it.

However, there are a couple of ways to name the columns that come out of a map: either use a case class instead of a tuple, or call .toDF("name1", "name2"....) after the map.
From a performance perspective, it's even better if you can avoid maps and stick to Column expressions. The reason is that for maps, we have to actually materialize an object to pass to your function. If you stick to Column expressions, however, we can work directly on serialized data.
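
To make the three options above concrete, here is a minimal sketch. It assumes Spark 2.x or later with a SparkSession; on Spark 1.6, df.map returns an RDD and the implicits come from sqlContext.implicits._ instead. The column names "name" and "kind", the Item case class and the sample rows are hypothetical stand-ins for the CSV data in this thread.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

// Hypothetical record type: its field names become the column names.
case class Item(firstcolumn: String, secondcolumn: String)

object NamingColumns {
  // Option 1: map each Row to a case class instead of a tuple.
  def viaCaseClass(spark: SparkSession, df: DataFrame): DataFrame = {
    import spark.implicits._
    df.filter(col("paid") > "")
      .map(x => Item(x.getString(0), x.getString(1)))
      .toDF()
  }

  // Option 2: keep the tuple and name the columns after the map.
  def viaToDF(spark: SparkSession, df: DataFrame): DataFrame = {
    import spark.implicits._
    df.filter(col("paid") > "")
      .map(x => (x.getString(0), x.getString(1)))
      .toDF("firstcolumn", "secondcolumn")
  }

  // Option 3: no map at all, only Column expressions.
  // .as(...) is legal here because it is applied to a Column.
  def viaColumns(df: DataFrame): DataFrame =
    df.filter(col("paid") > "")
      .select(col("name").as("firstcolumn"), col("kind").as("secondcolumn"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("naming-columns")
      .getOrCreate()
    import spark.implicits._

    // Stand-in for the CSV-backed DataFrame; the second row has an empty "paid".
    val df = Seq(("a", "b", "10"), ("c", "d", "")).toDF("name", "kind", "paid")

    viaCaseClass(spark, df).printSchema()
    viaToDF(spark, df).printSchema()
    viaColumns(df).printSchema()

    spark.stop()
  }
}
```

With the third variant, any column selected without .as(...) simply keeps the name it already had in df, so names read from a CSV header carry through unchanged.
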
On Wed, Mar 23, 2016 at 5:27 PM, Ashok Kumar <ashok34...@yahoo.com> wrote:

Thank you, sir.

sql("select `_1` as firstcolumn from items")

Is there any way to keep the CSV column names, using databricks, when mapping?

val r = df.filter(col("paid") > "").map(x => (x.getString(0), x.getString(1), ...))

Can I call, for example, x.getString(0).as.(firstcolumn) in the above when mapping, if possible, so that the columns will have labels?


 

On Thursday, 24 March 2016, 0:18, Michael Armbrust <mich...@databricks.com> wrote:

You probably need to use `backticks` to escape `_1`, since I don't think that it's a valid SQL identifier.
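
As a small illustration of the backtick escaping, here is a sketch that assumes Spark 2.x or later, where createOrReplaceTempView replaces the registerTempTable call used in this thread. The sample rows and the object name are made up for the example.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object BacktickEscape {
  // Registers a table whose columns have the default tuple names
  // (_1, _2, _3) and selects the first one, escaped with backticks.
  def firstColumn(spark: SparkSession): DataFrame = {
    import spark.implicits._

    // toDF() without arguments keeps the tuple field names _1, _2, _3.
    val items = Seq(("a", "b", 1.0), ("c", "d", 2.0)).toDF()
    items.createOrReplaceTempView("items")

    spark.sql("select `_1` as firstcolumn from items")
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("backtick-escape")
      .getOrCreate()

    firstColumn(spark).printSchema()

    spark.stop()
  }
}
```
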
On Wed, Mar 23, 2016 at 5:10 PM, Ashok Kumar <ashok34...@yahoo.com.invalid> wrote:

Gurus,

If I register a temporary table as below

r.toDF
res58: org.apache.spark.sql.DataFrame = [_1: string, _2: string, _3: double, _4: double, _5: double]

r.toDF.registerTempTable("items")

sql("select * from items")
res60: org.apache.spark.sql.DataFrame = [_1: string, _2: string, _3: double, _4: double, _5: double]

is there any way I can do a select on the first column only?

sql("select _1 from items") throws an error.

Thanking you



   



  
