[jira] [Commented] (SPARK-17214) How to deal with dots (.) present in column names in SparkR

2016-08-27 Thread Felix Cheung (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-17214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15442870#comment-15442870
 ] 

Felix Cheung commented on SPARK-17214:
--

I think the underlying issue is that we should either handle column names with 
`.` correctly (preferred) or translate them uniformly, as we do in other cases 
(e.g. `as.DataFrame`).

As of now, a DataFrame from a csv source can have `.` in its column names, and it 
is inoperable until renamed:
{code}
> iris_sdf<-read.df("iris.csv","csv",header="true",inferSchema="true")
> iris_sdf
SparkDataFrame[Sepal.Length:double, Sepal.Width:double, Petal.Length:double, 
Petal.Width:double, Species:string]
{code}
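For reference, a renaming workaround along these lines should make such a DataFrame usable again. This is only a sketch against the Spark 2.0-era SparkR API and assumes an active Spark session; `withColumnRenamed`, `colnames<-`, and backtick-quoting inside `selectExpr` are the mechanisms assumed here:

{code}
# rename every dotted column up front
colnames(iris_sdf) <- gsub("\\.", "_", colnames(iris_sdf))

# or rename a single column
iris_sdf <- withColumnRenamed(iris_sdf, "Sepal.Length", "Sepal_Length")

# or, without renaming, backtick-quote the dotted name in a SQL expression
head(selectExpr(iris_sdf, "`Sepal.Length`"))
{code}

The backtick form works because Spark SQL treats an unquoted `.` as a struct-field accessor, which is why the resolver rejects the bare dotted name.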

> How to deal with dots (.) present in column names in SparkR
> ---
>
> Key: SPARK-17214
> URL: https://issues.apache.org/jira/browse/SPARK-17214
> Project: Spark
>  Issue Type: Bug
>Reporter: Mohit Bansal
>
> I am trying to load a local csv file into SparkR, which contains dots in 
> column names. After reading the file I tried to change the names and replaced 
> "." with "_". Still I am not able to do any operation on the created SDF. 
> Here is the reproducible code:
> ---
> #writing iris dataset to local
> write.csv(iris,"iris.csv",row.names=F)
> #reading it back using read.df
> iris_sdf<-read.df("iris.csv","csv",header="true",inferSchema="true")
> #changing column names
> names(iris_sdf)<-c("Sepal_Length","Sepal_Width","Petal_Length","Petal_Width","Species")
> #selecting required columns
> head(select(iris_sdf,iris_sdf$Sepal_Length,iris_sdf$Sepal_Width))
> -
> 16/08/24 13:51:24 ERROR RBackendHandler: dfToCols on 
> org.apache.spark.sql.api.r.SQLUtils failed
> Error in invokeJava(isStatic = TRUE, className, methodName, ...) : 
>   org.apache.spark.sql.AnalysisException: Unable to resolve Sepal.Length 
> given [Sepal.Length, Sepal.Width, Petal.Length, Petal.Width, Species];
> at 
> org.apache.spark.sql.catalyst.plans.logical.LogicalPlan$$anonfun$resolve$1$$anonfun$apply$5.apply(LogicalPlan.scala:134)
> at 
> org.apache.spark.sql.catalyst.plans.logical.LogicalPlan$$anonfun$resolve$1$$anonfun$apply$5.apply(LogicalPlan.scala:134)
> at scala.Option.getOrElse(Option.scala:121)
> at 
> org.apache.spark.sql.catalyst.plans.logical.LogicalPlan$$anonfun$resolve$1.apply(LogicalPlan.scala:133)
> at 
> org.apache.spark.sql.catalyst.plans.logical.LogicalPlan$$anonfun$resolve$1.apply(LogicalPlan.scala:129)
> at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
> at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
> at scala.collection.Iterator$class.foreach(Iterator.scala:893)
> at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
> at scala.collection.IterableLike$cl
> What should I do to get it work?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-17214) How to deal with dots (.) present in column names in SparkR

2016-08-27 Thread Felix Cheung (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-17214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15442865#comment-15442865
 ] 

Felix Cheung commented on SPARK-17214:
--

[~bansalism] what version of Spark + SparkR are you testing with?
I ran your example and it worked:

{code}
> #writing iris dataset to local
> write.csv(iris,"iris.csv",row.names=F)
> #reading it back using read.df
> iris_sdf<-read.df("iris.csv","csv",header="true",inferSchema="true")
> #changing column names
> names(iris_sdf)<-c("Sepal_Length","Sepal_Width","Petal_Length","Petal_Width","Species")
> iris_sdf
SparkDataFrame[Sepal_Length:double, Sepal_Width:double, Petal_Length:double, 
Petal_Width:double, Species:string]
> head(iris_sdf)
  Sepal_Length Sepal_Width Petal_Length Petal_Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa
> a <- select(iris_sdf,iris_sdf$Sepal_Length,iris_sdf$Sepal_Width)
> head(a)
  Sepal_Length Sepal_Width
1          5.1         3.5
2          4.9         3.0
3          4.7         3.2
4          4.6         3.1
5          5.0         3.6
6          5.4         3.9
>
{code}




[jira] [Commented] (SPARK-17214) How to deal with dots (.) present in column names in SparkR

2016-08-27 Thread Sean Owen (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-17214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15440966#comment-15440966
 ] 

Sean Owen commented on SPARK-17214:
---

I think the issue is that the 'underlying' dataframe hasn't changed names. I 
don't know if that's to be expected or not -- you're using regular R methods to 
manipulate this, right? CC [~felixcheung], who will actually know what he's 
talking about.




[jira] [Commented] (SPARK-17214) How to deal with dots (.) present in column names in SparkR

2016-08-24 Thread Mohit Bansal (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-17214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15435257#comment-15435257
 ] 

Mohit Bansal commented on SPARK-17214:
--

If I am not wrong, the following command is used to update the column names:
names(iris_sdf)<-c("Sepal_Length","Sepal_Width","Petal_Length","Petal_Width","Species")

Moreover, when I run schema(iris_sdf), it shows the column names with 
underscores (_).
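One way to check whether the rename actually reached the underlying Spark schema, rather than only the R-side wrapper (a sketch assuming an active Spark session and the SparkR `columns`/`printSchema` API):

{code}
columns(iris_sdf)      # the names Catalyst will resolve against
printSchema(iris_sdf)  # full schema, including types
{code}

If these still show the dotted names while schema() shows underscores, the two views have diverged, which would explain the resolution error.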




[jira] [Commented] (SPARK-17214) How to deal with dots (.) present in column names in SparkR

2016-08-24 Thread Sean Owen (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-17214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15435237#comment-15435237
 ] 

Sean Owen commented on SPARK-17214:
---

Your error, however, shows

{code}
>   org.apache.spark.sql.AnalysisException: Unable to resolve Sepal.Length 
> given [Sepal.Length, Sepal.Width, Petal.Length, Petal.Width, Species];
{code}

I don't think you have replaced them.




[jira] [Commented] (SPARK-17214) How to deal with dots (.) present in column names in SparkR

2016-08-24 Thread Mohit Bansal (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-17214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15435232#comment-15435232
 ] 

Mohit Bansal commented on SPARK-17214:
--

[~srowen]

Sorry, but I don't think so...
I understand that dots (.) are not accepted in SparkDataFrame column names, but 
I have replaced all the dots (.) with underscores (_).

SparkR is still not allowing me to access the columns by their underscore names.

Thanks in advance for your help




[jira] [Commented] (SPARK-17214) How to deal with dots (.) present in column names in SparkR

2016-08-24 Thread Sean Owen (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-17214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15435173#comment-15435173
 ] 

Sean Owen commented on SPARK-17214:
---

Duplicate of existing issues like https://issues.apache.org/jira/browse/SPARK-16874?




[jira] [Commented] (SPARK-17214) How to deal with dots (.) present in column names in SparkR

2016-08-24 Thread Mohit Bansal (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-17214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15435135#comment-15435135
 ] 

Mohit Bansal commented on SPARK-17214:
--

[~srowen]
Done 
