[jira] [Resolved] (SPARK-17214) How to deal with dots (.) present in column names in SparkR

2016-08-28 Thread Mohit Bansal (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-17214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohit Bansal resolved SPARK-17214.
--
Resolution: Later

> How to deal with dots (.) present in column names in SparkR
> ---
>
> Key: SPARK-17214
> URL: https://issues.apache.org/jira/browse/SPARK-17214
> Project: Spark
> Issue Type: Bug
> Reporter: Mohit Bansal
>
> I am trying to load a local CSV file whose column names contain dots into 
> SparkR. After reading the file, I renamed the columns, replacing "." with 
> "_". I still cannot perform any operation on the resulting SparkDataFrame 
> (SDF). Here is the reproducible code:
> ---
> #writing iris dataset to local
> write.csv(iris,"iris.csv",row.names=F)
> #reading it back using read.df
> iris_sdf<-read.df("iris.csv","csv",header="true",inferSchema="true")
> #changing column names
> names(iris_sdf)<-c("Sepal_Length","Sepal_Width","Petal_Length","Petal_Width","Species")
> #selecting required columns
> head(select(iris_sdf,iris_sdf$Sepal_Length,iris_sdf$Sepal_Width))
> -
> 16/08/24 13:51:24 ERROR RBackendHandler: dfToCols on org.apache.spark.sql.api.r.SQLUtils failed
> Error in invokeJava(isStatic = TRUE, className, methodName, ...) : 
>   org.apache.spark.sql.AnalysisException: Unable to resolve Sepal.Length given [Sepal.Length, Sepal.Width, Petal.Length, Petal.Width, Species];
> at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan$$anonfun$resolve$1$$anonfun$apply$5.apply(LogicalPlan.scala:134)
> at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan$$anonfun$resolve$1$$anonfun$apply$5.apply(LogicalPlan.scala:134)
> at scala.Option.getOrElse(Option.scala:121)
> at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan$$anonfun$resolve$1.apply(LogicalPlan.scala:133)
> at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan$$anonfun$resolve$1.apply(LogicalPlan.scala:129)
> at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
> at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
> at scala.collection.Iterator$class.foreach(Iterator.scala:893)
> at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
> at scala.collection.IterableLike$cl
> What should I do to get it to work?
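
One possible workaround, not confirmed anywhere in this thread, is to rename the columns in the local R data.frame before writing the CSV, so that the file Spark reads never has dots in its header. The sketch below uses only base R and needs no Spark session. The names clean_names, iris_clean, and csv_path, and the temporary file name, are illustrative and do not appear in the original report.

```r
# Base R only; no Spark session is needed for this step.
# Derive the underscore names from the dotted ones instead of typing them by hand.
# fixed = TRUE makes "." a literal dot rather than a regex wildcard.
clean_names <- gsub(".", "_", names(iris), fixed = TRUE)

# Rename a local copy so the built-in iris dataset is left untouched.
iris_clean <- iris
names(iris_clean) <- clean_names

# Write the CSV with the cleaned header (hypothetical file name).
csv_path <- file.path(tempdir(), "iris_clean.csv")
write.csv(iris_clean, csv_path, row.names = FALSE)

# Read the header back to confirm that no dots remain.
header <- names(read.csv(csv_path, nrows = 1, check.names = FALSE))
print(header)
```

The resulting file could then be passed to read.df as in the report above. Whether this avoids the AnalysisException on the reporter's Spark version is an assumption that has not been verified here.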



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-17214) How to deal with dots (.) present in column names in SparkR

2016-08-24 Thread Mohit Bansal (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-17214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15435257#comment-15435257
 ] 

Mohit Bansal commented on SPARK-17214:
--

If I am not mistaken, the following command updates the column names:
names(iris_sdf)<-c("Sepal_Length","Sepal_Width","Petal_Length","Petal_Width","Species")

Moreover, when I run schema(iris_sdf), it shows the column names with 
underscores (_).




[jira] [Commented] (SPARK-17214) How to deal with dots (.) present in column names in SparkR

2016-08-24 Thread Mohit Bansal (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-17214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15435232#comment-15435232
 ] 

Mohit Bansal commented on SPARK-17214:
--

[~srowen]

Sorry, but I don't think so.
I understand that SparkDataFrame does not accept dots (.) in column names, 
but I have replaced all the dots (.) with underscores (_).

Still, SparkR does not let me access the columns by their underscore names.

Thanks in advance for your help.




[jira] [Commented] (SPARK-17214) How to deal with dots (.) present in column names in SparkR

2016-08-24 Thread Mohit Bansal (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-17214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15435135#comment-15435135
 ] 

Mohit Bansal commented on SPARK-17214:
--

[~srowen]
Done 




[jira] [Updated] (SPARK-17214) How to deal with dots (.) present in column names in SparkR

2016-08-24 Thread Mohit Bansal (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-17214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohit Bansal updated SPARK-17214:
-
Description: 
I am trying to load a local CSV file whose column names contain dots into 
SparkR. After reading the file, I renamed the columns, replacing "." with 
"_". I still cannot perform any operation on the resulting SparkDataFrame 
(SDF). Here is the reproducible code:

---
#writing iris dataset to local
write.csv(iris,"iris.csv",row.names=F)

#reading it back using read.df
iris_sdf<-read.df("iris.csv","csv",header="true",inferSchema="true")

#changing column names
names(iris_sdf)<-c("Sepal_Length","Sepal_Width","Petal_Length","Petal_Width","Species")

#selecting required columns
head(select(iris_sdf,iris_sdf$Sepal_Length,iris_sdf$Sepal_Width))

-

16/08/24 13:51:24 ERROR RBackendHandler: dfToCols on org.apache.spark.sql.api.r.SQLUtils failed
Error in invokeJava(isStatic = TRUE, className, methodName, ...) : 
  org.apache.spark.sql.AnalysisException: Unable to resolve Sepal.Length given [Sepal.Length, Sepal.Width, Petal.Length, Petal.Width, Species];
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan$$anonfun$resolve$1$$anonfun$apply$5.apply(LogicalPlan.scala:134)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan$$anonfun$resolve$1$$anonfun$apply$5.apply(LogicalPlan.scala:134)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan$$anonfun$resolve$1.apply(LogicalPlan.scala:133)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan$$anonfun$resolve$1.apply(LogicalPlan.scala:129)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.Iterator$class.foreach(Iterator.scala:893)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
at scala.collection.IterableLike$cl

What should I do to get it to work?

  
was:http://stackoverflow.com/questions/39125255/how-to-deal-with-dots-in-column-names-in-sparkr

Summary: How to deal with dots (.) present in column names in SparkR  
(was: Even after replacing the column names having dots , still it is referring 
to previous column names in SparkR ref: 
http://stackoverflow.com/questions/39125255/how-to-deal-with-dots-in-column-names-in-sparkr)


[jira] [Created] (SPARK-17214) Even after replacing the column names having dots , still it is referring to previous column names in SparkR ref: http://stackoverflow.com/questions/39125255/how-to-deal

2016-08-24 Thread Mohit Bansal (JIRA)
Mohit Bansal created SPARK-17214:


 Summary: Even after replacing the column names having dots , still 
it is referring to previous column names in SparkR ref: 
http://stackoverflow.com/questions/39125255/how-to-deal-with-dots-in-column-names-in-sparkr
 Key: SPARK-17214
 URL: https://issues.apache.org/jira/browse/SPARK-17214
 Project: Spark
  Issue Type: Bug
Reporter: Mohit Bansal


http://stackoverflow.com/questions/39125255/how-to-deal-with-dots-in-column-names-in-sparkr


