Re: Re: the issue about the + in column,can we support the string please?
Hi,

I checked the code, and it looks hard to change. In the current code, string + int is translated to double + double. If I change string + int to string + string, it will be incompatible with older versions.
Does anyone have a better idea for this issue?

1427357...@qq.com

From: Shmuel Blitz
Date: 2018-03-26 17:17
To: 1427357...@qq.com
CC: spark users; dev
Subject: Re: Re: the issue about the + in column,can we support the string please?

I agree. Just pointed out the option, in case you missed it.

Cheers,
Shmuel

On Mon, Mar 26, 2018 at 10:57 AM, 1427357...@qq.com <1427357...@qq.com> wrote:

Hi,

Using concat is one way to do it, but + is more intuitive and easier to understand.

1427357...@qq.com

From: Shmuel Blitz
Date: 2018-03-26 15:31
To: 1427357...@qq.com
CC: spark users; dev
Subject: Re: the issue about the + in column,can we support the string please?

Hi,

you can get the same with:

import org.apache.spark.sql.Row  // needed for Row(...) below
import org.apache.spark.sql.functions._
import sqlContext.implicits._
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

val schema = StructType(Array(
  StructField("name", StringType),
  StructField("age", IntegerType)))

val lst = List(Row("Shmuel", 13), Row("Blitz", 23))
val rdd = sc.parallelize(lst)

val df = sqlContext.createDataFrame(rdd, schema)

// concat appends the literal "abc" to each value of the name column
df.withColumn("newName", concat($"name", lit("abc"))).show()

On Mon, Mar 26, 2018 at 6:36 AM, 1427357...@qq.com <1427357...@qq.com> wrote:

Hi all,

I have a table like below:

+---+---------+-----------+
| id|     name|sharding_id|
+---+---------+-----------+
|  1|leader us|          1|
|  3|    mycat|          1|
+---+---------+-----------+

My schema is:

root
 |-- id: integer (nullable = false)
 |-- name: string (nullable = true)
 |-- sharding_id: integer (nullable = false)

I want to add a new column named newName. The new column is based on "name", with "abc" appended to it.
My code looks like:

stud_scoreDF.withColumn("newName", stud_scoreDF.col("name") + "abc").show()

When I run the code, I get this result:

+---+---------+-----------+-------+
| id|     name|sharding_id|newName|
+---+---------+-----------+-------+
|  1|leader us|          1|   null|
|  3|    mycat|          1|   null|
+---+---------+-----------+-------+

I checked the code; the key code is in arithmetic.scala, line 165.
It looks like:

override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = dataType match {
  case dt: DecimalType =>
    defineCodeGen(ctx, ev, (eval1, eval2) => s"$eval1.$$plus($eval2)")
  case ByteType | ShortType =>
    defineCodeGen(ctx, ev,
      (eval1, eval2) => s"(${ctx.javaType(dataType)})($eval1 $symbol $eval2)")
  case CalendarIntervalType =>
    defineCodeGen(ctx, ev, (eval1, eval2) => s"$eval1.add($eval2)")
  case _ =>
    defineCodeGen(ctx, ev, (eval1, eval2) => s"$eval1 $symbol $eval2")
}

My question is: can we add a case for StringType in this class to support string append?

1427357...@qq.com

--
Shmuel Blitz
Big Data Developer
Email: shmuel.bl...@similarweb.com
www.similarweb.com
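To make the compatibility concern concrete, here is a small plain-Scala model of the behaviour described in this thread. It does not use Spark at all: the names PlusCoercionSketch, castToDouble, plusAsDouble and plusAsConcat are hypothetical, and the model assumes (as the thread reports) that string operands of + are converted to double, with non-numeric text becoming NULL.

```scala
import scala.util.Try

object PlusCoercionSketch {
  // Models a nullable SQL double: None stands for SQL NULL.
  type SqlDouble = Option[Double]

  // Assumption: casting non-numeric text to double yields NULL, not an error.
  def castToDouble(s: String): SqlDouble = Try(s.trim.toDouble).toOption

  // Numeric addition is NULL as soon as either operand is NULL.
  def add(a: SqlDouble, b: SqlDouble): SqlDouble =
    for (x <- a; y <- b) yield x + y

  // Behaviour reported in the thread: both sides of + become double.
  def plusAsDouble(left: String, right: String): SqlDouble =
    add(castToDouble(left), castToDouble(right))

  // Hypothetical alternative: + on strings means concatenation.
  def plusAsConcat(left: String, right: String): String = left + right

  def main(args: Array[String]): Unit = {
    println(plusAsDouble("leader us", "abc")) // None, i.e. the null in newName
    println(plusAsDouble("1", "2"))           // Some(3.0)
    println(plusAsConcat("1", "2"))           // 12
  }
}
```

In this model the last two lines show the incompatibility: an existing query that relies on "1" + "2" evaluating to 3.0 would start returning "12" if + were redefined as concatenation for strings.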
Re: Re: the issue about the + in column,can we support the string please?
I agree. Just pointed out the option, in case you missed it.

Cheers,
Shmuel

--
Shmuel Blitz
Big Data Developer
Email: shmuel.bl...@similarweb.com
www.similarweb.com
Re: Re: the issue about the + in column,can we support the string please?
Hi,

Using concat is one way to do it, but + is more intuitive and easier to understand.

1427357...@qq.com
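For readers who want operator-style syntax without changing Spark itself, one possibility is a small user-side helper that wraps the built-in concat and lit functions. The sketch below is only an illustration: the object ConcatOperatorSketch, the implicit class StringConcatOps and the +++ operator are hypothetical names and are not part of Spark, and the example assumes a Spark 2.x SparkSession running in local mode.

```scala
import org.apache.spark.sql.{Column, SparkSession}
import org.apache.spark.sql.functions.{concat, lit}

object ConcatOperatorSketch {
  // Hypothetical helper, not part of Spark: adds a +++ operator to Column
  // that delegates to the built-in concat function.
  implicit class StringConcatOps(val left: Column) extends AnyVal {
    def +++(right: Any): Column = right match {
      case c: Column => concat(left, c)          // column +++ column
      case other     => concat(left, lit(other)) // column +++ literal
    }
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("concat-operator-sketch")
      .getOrCreate()
    import spark.implicits._

    // Same rows as the table in the original question.
    val df = Seq((1, "leader us", 1), (3, "mycat", 1))
      .toDF("id", "name", "sharding_id")

    // Appends "abc" to every name via concat, so no cast to double happens.
    df.withColumn("newName", $"name" +++ "abc").show()

    spark.stop()
  }
}
```

Because the operator lives in user code, the meaning of the built-in + stays unchanged, so existing queries keep their current results.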