>>> I tried to use df.withColumn but I am getting the exception below.

What is rowNumber here? Is it a UDF? You can use monotonicallyIncreasingId
to generate an id.
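
As a minimal sketch of that suggestion, assuming a Spark 1.5-era spark-shell
where `sc` and `sqlContext` are already defined (the sample rows and the
column name "id" are made up for illustration). The generated ids are unique
and increasing but not consecutive, so this is not a true row number. Later
Spark releases rename the function to monotonically_increasing_id.

```scala
import org.apache.spark.sql.functions.monotonicallyIncreasingId
import sqlContext.implicits._

// Hypothetical stand-in for the dataframe in the question: one string column "line".
val df = sc.parallelize(Seq("first", "second", "third")).toDF("line")

// Append a 64-bit id column. Values are unique and increase with row order,
// but gaps between partitions are expected.
val withId = df.withColumn("id", monotonicallyIncreasingId())

withId.show()
```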

>>> Also, is it possible to add a column from one dataframe to another?

You can't: the two dataframes may have different numbers of rows, so there
is no well-defined way to line up their values. Use a join to correlate the
two dataframes instead.
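
A rough sketch of the join approach, again assuming a Spark 1.5-era
spark-shell with `sc` and `sqlContext` available. The shared key column "id"
and the sample rows are assumptions for illustration; the dataframes in the
question have no such key, so one would have to be added first.

```scala
import sqlContext.implicits._

// Two hypothetical dataframes that share a key column "id".
val df  = sc.parallelize(Seq((1, "first"), (2, "second"))).toDF("id", "line")
val df2 = sc.parallelize(Seq((1, "uno"), (2, "dos"))).toDF("id", "line")

// Rename the right-hand column so the joined result has no duplicate names,
// then join on the shared key instead of calling withColumn across dataframes.
val joined = df.join(df2.withColumnRenamed("line", "line2"), "id")

joined.show()
```

Rows whose key appears in only one dataframe are dropped by this inner join.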

On Thu, Nov 26, 2015 at 6:39 AM, Vishnu Viswanath <
vishnu.viswanat...@gmail.com> wrote:

> Hi,
>
> I am trying to add a row number to a Spark dataframe.
> This is my dataframe:
>
> scala> df.printSchema
> root
> |-- line: string (nullable = true)
>
> I tried to use df.withColumn but I am getting the exception below.
>
> scala> df.withColumn("row",rowNumber)
> org.apache.spark.sql.AnalysisException: unresolved operator 'Project [line#2326,'row_number() AS row#2327];
> at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.failAnalysis(CheckAnalysis.scala:37)
> at org.apache.spark.sql.catalyst.analysis.Analyzer.failAnalysis(Analyzer.scala:44)
> at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:174)
> at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:49)
>
> Also, is it possible to add a column from one dataframe to another?
> Something like:
>
> scala> df.withColumn("line2",df2("line"))
>
> org.apache.spark.sql.AnalysisException: resolved attribute(s) line#2330 missing from line#2326 in operator !Project [line#2326,line#2330 AS line2#2331];
>
> Thanks and Regards,
> Vishnu Viswanath
> *www.vishnuviswanath.com <http://www.vishnuviswanath.com>*
>



-- 
Best Regards

Jeff Zhang
