[jira] [Commented] (SPARK-20263) create empty dataframes in sparkR
[ https://issues.apache.org/jira/browse/SPARK-20263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16082882#comment-16082882 ]

Ott Toomet commented on SPARK-20263:
------------------------------------

Grishma, sure, there are workarounds like that. Note that filtering the dummy row out should be done at the Spark level, although my example above may be a feasible way to achieve just that (with a given schema). I am just thinking that being forced to use a workaround for such a simple task seems unnecessary. My particular problem was rbind'ing dataframes sequentially, so an alternative would be to implement 'rbind(NULL, data)'.

Thanks,
Ott

> create empty dataframes in sparkR
> ---------------------------------
>
>              Key: SPARK-20263
>              URL: https://issues.apache.org/jira/browse/SPARK-20263
>          Project: Spark
>       Issue Type: Wish
>       Components: SparkR
> Affects Versions: 2.1.0
>         Reporter: Ott Toomet
>         Priority: Minor
>
> SparkR 2.1 does not support creating empty dataframes, nor conversion of
> empty R dataframes to Spark ones:
>
>     createDataFrame(data.frame(a = integer()))
>
> gives
>
>     Error in takeRDD(x, 1)[[1]] : subscript out of bounds

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
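The 'rbind(NULL, data)' behaviour suggested above could be emulated today with a small user-side wrapper. A minimal sketch; the name `rbind_safe` is hypothetical and not part of SparkR:

```r
library(SparkR)

# Hypothetical helper: behave like rbind(), but treat a NULL
# accumulator as "no rows yet" instead of raising an error.
rbind_safe <- function(x, y) {
  if (is.null(x)) return(y)
  rbind(x, y)
}
```

With such a wrapper, a sequential-append accumulator can simply start as `final <- NULL` before the loop, with no empty-dataframe constructor needed.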
[jira] [Commented] (SPARK-20263) create empty dataframes in sparkR
[ https://issues.apache.org/jira/browse/SPARK-20263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16081126#comment-16081126 ]

Grishma Jena commented on SPARK-20263:
--------------------------------------

[~otoomet] Have you tried creating a Spark dataframe with a dummy record and then filtering it out?
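The workaround described here might look like the following in SparkR. A sketch only, assuming a running Spark session and a single integer column:

```r
library(SparkR)
sparkR.session()

# Create a one-row DataFrame that carries the desired schema...
dummy <- createDataFrame(data.frame(a = 0L))

# ...then filter the dummy record back out, leaving zero rows
# while the schema is preserved.
empty <- filter(dummy, dummy$a != 0L)
```

`count(empty)` is then 0, while `schema(empty)` still reports column `a`, so the result can act as an empty accumulator for later rbind calls.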
[jira] [Commented] (SPARK-20263) create empty dataframes in sparkR
[ https://issues.apache.org/jira/browse/SPARK-20263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15974001#comment-15974001 ]

Felix Cheung commented on SPARK-20263:
--------------------------------------

I see - so you would need a way to create an empty dataframe, but with a known schema? I think it's a reasonable ask and would be good to have.
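SparkR can already describe a schema independently of any data via structType/structField, so the ask amounts to letting createDataFrame accept such a schema with zero rows. A sketch of the desired call; as this issue reports, the zero-row path fails in SparkR 2.1, so whether it runs depends on the Spark version:

```r
library(SparkR)
sparkR.session()

# Describe the schema without any data.
schema <- structType(structField("a", "integer"),
                     structField("b", "string"))

# Desired behaviour: an empty DataFrame with exactly that schema.
# In SparkR 2.1 this is the path that raises
# "Error in takeRDD(x, 1)[[1]] : subscript out of bounds".
empty <- createDataFrame(data.frame(a = integer(), b = character()), schema)
```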
[jira] [Commented] (SPARK-20263) create empty dataframes in sparkR
[ https://issues.apache.org/jira/browse/SPARK-20263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15973157#comment-15973157 ]

Ott Toomet commented on SPARK-20263:
------------------------------------

For instance, by rbind'ing in a loop. I have several pieces of code that work along the following lines:

    for (file in datafiles) {
      df <- read.parquet(file)
      result <- do_something_with(df)
      final <- rbind(final, result)
    }

Sure, one can easily add logic to check whether 'final' has been initialized, but it seems much simpler and more logical to just initialize an empty 'final' before the loop.
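The initialization check the comment would rather avoid can be sketched as follows; `datafiles` and `do_something_with` stand in for the user's own code, as in the loop above:

```r
library(SparkR)

final <- NULL  # no empty-DataFrame constructor, so start with NULL
for (file in datafiles) {
  df <- read.parquet(file)
  result <- do_something_with(df)
  # This extra branch exists only because an empty 'final'
  # with the right schema cannot be created up front.
  final <- if (is.null(final)) result else rbind(final, result)
}
```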
[jira] [Commented] (SPARK-20263) create empty dataframes in sparkR
[ https://issues.apache.org/jira/browse/SPARK-20263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15961724#comment-15961724 ]

Felix Cheung commented on SPARK-20263:
--------------------------------------

May I ask how you would use this empty dataframe?