RE: different behavior while using createDataFrame and read.df in SparkR

2016-02-08 Thread Sun, Rui
le has to be re-assigned to reference a column in the new DataFrame.

Re: different behavior while using createDataFrame and read.df in SparkR

2016-02-06 Thread Devesh Raj Singh
Thank you, Rui Sun, for the observation! It helped. I now have a new problem. When I create a small function that builds dummy variables for a categorical column: BDADummies<-function(dataframe,column){ cat.column<-vector(mode="character",length=nrow(dataframe)) cat.column<-collect(column)

RE: different behavior while using createDataFrame and read.df in SparkR

2016-02-05 Thread Sun, Rui
I guess this is related to https://issues.apache.org/jira/browse/SPARK-11976. When calling createDataFrame on iris, the “.” character in column names is replaced with “_”. It seems that when you create a DataFrame from the CSV file, the “.” characters in column names are still there.
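
To make the naming difference concrete, here is a minimal sketch in plain R (no Spark session needed). It only mimics the renaming described above and is not SparkR's actual implementation; normalize_names is a hypothetical helper introduced for illustration.

```r
# Base R only: a sketch of the "." -> "_" renaming described above.
# normalize_names is a hypothetical helper, not part of SparkR.
normalize_names <- function(df) {
  # fixed = TRUE treats "." as a literal dot rather than a regex wildcard
  names(df) <- gsub(".", "_", names(df), fixed = TRUE)
  df
}

# iris ships with R and has dotted column names such as Sepal.Length
iris_renamed <- normalize_names(iris)

print(names(iris))          # original names, with dots
print(names(iris_renamed))  # names after replacing dots with underscores
```

Applying the same kind of renaming to a data frame loaded from a CSV file before handing it to Spark might be one way to get consistent column names across both code paths, assuming the dotted names are indeed the cause.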