[jira] [Commented] (SPARK-20263) create empty dataframes in sparkR

2017-07-11 Thread Ott Toomet (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-20263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16082882#comment-16082882
 ] 

Ott Toomet commented on SPARK-20263:


Grishma--sure, there are workarounds like that.  Note that filtering out the 
dummy row should be done at the Spark level, although my example above may be a 
feasible way to achieve just that (with a given schema).  I am just thinking 
that being forced to use a workaround for such a simple task seems 
unnecessary...

My particular problem was rbind'ing dataframes sequentially, so an alternative 
would be to implement 'rbind(NULL, data)'.

Thanks,
Ott
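
The proposed 'rbind(NULL, data)' semantics can be sketched as a tiny helper 
(hypothetical name, plain R for illustration): a NULL accumulator is treated 
as "nothing yet" rather than raising an error.

```r
# Hypothetical helper illustrating the proposed rbind(NULL, data) semantics:
# if the accumulator is NULL, return the new piece unchanged; otherwise
# row-bind the two as usual.
rbind_or_null <- function(acc, piece) {
  if (is.null(acc)) piece else rbind(acc, piece)
}
```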

> create empty dataframes in sparkR
> -
>
> Key: SPARK-20263
> URL: https://issues.apache.org/jira/browse/SPARK-20263
> Project: Spark
>  Issue Type: Wish
>  Components: SparkR
>Affects Versions: 2.1.0
>Reporter: Ott Toomet
>Priority: Minor
>
> SparkR 2.1 does not support creating empty dataframes, nor conversion of 
> empty R dataframes to spark ones:
> createDataFrame(data.frame(a=integer()))
> gives 
> Error in takeRDD(x, 1)[[1]] : subscript out of bounds



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-20263) create empty dataframes in sparkR

2017-07-10 Thread Grishma Jena (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-20263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16081126#comment-16081126
 ] 

Grishma Jena commented on SPARK-20263:
--

[~otoomet] Have you tried creating a Spark dataframe with a dummy record and 
then filtering it out?
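
The suggestion above can be sketched roughly as follows (a sketch only, 
assuming an active SparkR session; the column name and sentinel value are made 
up for illustration):

```r
library(SparkR)
sparkR.session()

# Create a one-row Spark DataFrame so the schema gets fixed...
dummy <- createDataFrame(data.frame(a = 0L))
# ...then filter the dummy row out at the Spark level; the result is an
# empty SparkDataFrame that still carries the schema (a: int).
empty <- filter(dummy, dummy$a != 0L)
count(empty)  # 0
```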




[jira] [Commented] (SPARK-20263) create empty dataframes in sparkR

2017-04-18 Thread Felix Cheung (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-20263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15974001#comment-15974001
 ] 

Felix Cheung commented on SPARK-20263:
--

I see - so you would need a way to create an empty dataframe, but with a known 
schema?
I think it's a reasonable ask and it would be good to have.
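
One way to express that is with SparkR's structType/structField, declaring the 
schema explicitly (a sketch, assuming a SparkR version where empty local 
data.frames are accepted; in 2.1 this still fails with the reported error):

```r
library(SparkR)
sparkR.session()

# Declare the schema up front, so the empty DataFrame still has typed columns.
schema <- structType(structField("a", "integer"))
empty <- createDataFrame(data.frame(a = integer()), schema = schema)
printSchema(empty)
```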




[jira] [Commented] (SPARK-20263) create empty dataframes in sparkR

2017-04-18 Thread Ott Toomet (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-20263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15973157#comment-15973157
 ] 

Ott Toomet commented on SPARK-20263:


for instance, by rbinding in a loop.  I have several pieces of code that work 
along these lines:

for (file in datafiles) {
    df <- read.parquet(file)
    result <- do_something_with(df)
    final <- rbind(final, result)
}

Sure, one can easily add logic to check whether 'final' has been initialized, 
but it seems much simpler and more logical to just initialize an empty 'final' 
before the loop.
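
The check-based workaround alluded to above can be sketched like this 
('datafiles' and 'do_something_with' are placeholders carried over from the 
example; assumes an active SparkR session):

```r
final <- NULL
for (file in datafiles) {
  df <- read.parquet(file)           # SparkR's parquet reader
  result <- do_something_with(df)    # placeholder transformation
  # Workaround: rbind() cannot start from NULL here, so branch on the
  # first iteration instead of initializing an empty DataFrame up front.
  final <- if (is.null(final)) result else rbind(final, result)
}
```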




[jira] [Commented] (SPARK-20263) create empty dataframes in sparkR

2017-04-08 Thread Felix Cheung (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-20263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15961724#comment-15961724
 ] 

Felix Cheung commented on SPARK-20263:
--

May I ask how you would use this empty dataframe?
