Re: [R] About populating a dataframe in a loop

2017-01-07 Thread Rui Barradas

Hello,

I believe you should follow Jeremiah's suggestion to first read all csv 
files into a list and then rbind them.

Something like the following.

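# List the csv files in the working directory (pattern is a regular expression)
file_list <- list.files(pattern = "\\.csv$")
# Read each file into its own data frame, collected in a list
df_list <- lapply(file_list, read.csv)
# Bind all the data frames in a single call
result <- do.call(rbind, df_list)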
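
A self-contained sketch of the same idea, for anyone who wants to try it without real data. The scratch directory, the file names and the columns id and x are made up for illustration; only list.files, read.csv and do.call(rbind, ...) come from the suggestion above.

```r
# Write three small csv files into a scratch directory (illustrative data only)
tmp_dir <- file.path(tempdir(), "csv_demo")
dir.create(tmp_dir, showWarnings = FALSE)
for (i in 1:3) {
  write.csv(data.frame(id = i, x = c(0.5, 1.5) * i),
            file.path(tmp_dir, sprintf("part%d.csv", i)),
            row.names = FALSE)
}

# Read every csv file into a list, then bind once at the end
file_list <- list.files(tmp_dir, pattern = "\\.csv$", full.names = TRUE)
df_list <- lapply(file_list, read.csv)
result <- do.call(rbind, df_list)

nrow(result)  # 3 files x 2 rows each
```

Note that rbind on data frames expects every file to have the same column names.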

Hope this helps,

Rui Barradas

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] About populating a dataframe in a loop

2017-01-06 Thread lily li
Thanks, Richard. But if the data cannot fill the constructed data frame,
will there be NA values?
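
For what it is worth, a small sketch of what happens in that case, assuming a preallocated matrix as in Richard's suggestion: rows that are never assigned simply keep the NA they were created with. The sizes and the name n_filled are invented for illustration.

```r
# Allocate room for 10 rows but fill only the first 7
tmp <- matrix(NA_integer_, nrow = 10, ncol = 5)
n_filled <- 7
for (i in seq_len(n_filled)) tmp[i, ] <- 1:5

# Count the rows that still contain NA (the 3 rows never assigned)
n_unfilled <- sum(!complete.cases(tmp))

# Keep only the rows that were actually filled
tmp <- tmp[seq_len(n_filled), , drop = FALSE]
dim(tmp)
```

So the leftover rows have to be dropped afterwards, either by tracking how many rows were filled (as here) or by filtering with complete.cases.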




Re: [R] About populating a dataframe in a loop

2017-01-06 Thread jeremiah rounds
As a rule, never rbind in a loop. It has O(n^2) run time, because each rbind
copies everything accumulated so far and so can itself be O(n) (where n is
the number of data.frames). Instead, either put them all into a list with
lapply or vector("list", length=) and then use data.table::rbindlist or
do.call(rbind, thelist), or use the equivalent from dplyr (bind_rows). All of
these will be much more efficient.
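
A minimal sketch of that pattern in base R, reusing the toy data frame from Rui's example. The names n, pieces and result are illustrative, and the two package calls are left commented out because they assume data.table or dplyr is installed.

```r
set.seed(1)
n <- 10

# Preallocate a list with one slot per piece
pieces <- vector("list", length = n)
for (i in seq_len(n)) {
  pieces[[i]] <- data.frame(x = rnorm(5), A = sample(LETTERS, 5, TRUE))
}

# Bind once at the end instead of once per iteration
result <- do.call(rbind, pieces)
nrow(result)  # 10 pieces x 5 rows each

# With the packages installed, these should be drop-in alternatives:
# result <- data.table::rbindlist(pieces)
# result <- dplyr::bind_rows(pieces)
```

Assigning into a list slot does not copy the data frames already stored, which is why the loop itself stays cheap.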





Re: [R] About populating a dataframe in a loop

2017-01-06 Thread Richard M. Heiberger
Incrementally increasing the size of an array is not efficient in R.
The recommended technique is to allocate as much space as you will
need, and then fill it.

> system.time({tmp <- 1:5 ; for (i in 1:1000) tmp <- rbind(tmp, 1:5)})
   user  system elapsed
  0.011   0.000   0.011
> dim(tmp)
[1] 1001    5
> system.time({tmp <- matrix(NA, 1001, 5); for (i in 1:1001) tmp[i,] <- 1:5})
   user  system elapsed
  0.001   0.000   0.001
> dim(tmp)
[1] 1001    5
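
The same comparison written out as a script, as a sketch rather than a benchmark: it checks that both approaches build the same matrix and leaves the timings to whoever runs it, since they depend on the machine. The names n, grown and filled are illustrative.

```r
n <- 1000

# Growing: every rbind() allocates a new matrix and copies the old rows
t_grow <- system.time({
  grown <- NULL
  for (i in seq_len(n)) grown <- rbind(grown, 1:5)
})

# Preallocating: allocate once, then assign each row in place
t_fill <- system.time({
  filled <- matrix(NA_integer_, nrow = n, ncol = 5)
  for (i in seq_len(n)) filled[i, ] <- 1:5
})

# Both approaches produce the same values
all(grown == filled)

# Compare elapsed times on your own machine
rbind(grow = t_grow, fill = t_fill)[, "elapsed"]
```

The gap between the two grows with n, because the growing version copies more rows on every iteration.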



Re: [R] About populating a dataframe in a loop

2017-01-06 Thread lily li
Hi Rui,

Thanks for your reply. Yes, when I rbind two data frames, it works.
However, with more than 50, it got stuck for hours. When I terminated the
process and opened the csv file separately, it had only one data frame.
What is the problem? Thanks.




Re: [R] About populating a dataframe in a loop

2017-01-06 Thread Rui Barradas

Hello,

Works for me:

set.seed(6574)

pre.mat = data.frame()
for(i in 1:10){
  # each iteration appends a 5-row data frame
  mat.temp = data.frame(x = rnorm(5), A = sample(LETTERS, 5, TRUE))
  pre.mat = rbind(pre.mat, mat.temp)
}

nrow(pre.mat)  # should be 50


Can you give us an example that doesn't work?

Rui Barradas



[R] About populating a dataframe in a loop

2017-01-06 Thread lily li
Hi R users,

I have a question about filling a dataframe in R using a for loop.

I created an empty dataframe first and then filled it, using the code:
pre.mat = data.frame()
for(i in 1:10){
  mat.temp = data.frame(some values filled in)
  pre.mat = rbind(pre.mat, mat.temp)
}
However, the resulting data frame does not have all the rows I expected.
What is the problem, and how can I solve it? Thanks.
