Run Rprof on the script that updates the dataframe. A dataframe
is a list, and every time you access something in the list it can be
expensive. Rprof will probably show that a lot of time is spent in
the function "[[", which accesses portions of the dataframe.
Vectors are much faster becaus
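A minimal sketch of the profiling step described above, assuming R was built with profiling support. The data frame here is a hypothetical stand-in for data1 (the real data are not shown), and the loop body is only a placeholder for whatever the script actually assigns.

```r
# Hypothetical stand-in for data1; sizes and values are made up.
set.seed(1)
n <- 30000
data1 <- data.frame(ID      = sample(200:250, n, replace = TRUE),
                    Year    = sample(1950:1970, n, replace = TRUE),
                    NinYear = 0)

prof_file <- tempfile(fileext = ".out")
Rprof(prof_file)                 # start sampling the call stack
for (i in seq_len(n)) {
  data1$NinYear[i] <- i          # element-wise update of a data frame column
}
Rprof(NULL)                      # stop profiling

# Time attributed to each function itself, largest first
head(summaryRprof(prof_file)$by.self)
```

The exact function names at the top of the summary depend on the R version and on how the data frame is indexed in the loop.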
I want to thank everyone for the help. I ended up having to use a loop to
assign values from the table to NinYear. However, as I have played with the
full datasets I have noticed that R is MUCH faster if I use vectors in the
loop rather than columns of a dataframe. In the specific case of 43,000
l
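The difference described above can be checked with a rough timing sketch like the following. The data frame, its size, and the assigned values are assumptions for illustration only; actual timings will vary with the machine and R version.

```r
n <- 30000
data1 <- data.frame(NinYear = numeric(n))   # hypothetical stand-in

# Loop that writes into a data frame column on every iteration
t_df <- system.time(
  for (i in seq_len(n)) data1$NinYear[i] <- i
)

# Same loop writing into a plain vector
NinYear <- numeric(n)
t_vec <- system.time(
  for (i in seq_len(n)) NinYear[i] <- i
)

# Elapsed seconds for each approach
c(data_frame = t_df[["elapsed"]], vector = t_vec[["elapsed"]])
```

Once the loop is done, the vector can be attached to the data frame in a single assignment, for example data1$NinYear <- NinYear.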
or perhaps...
data1$NinYear <- with(data1, ave(ID, Year, FUN = length))
> unique(data1)
    ID Year NinYear
1  209 1971       2
3  213 1951       2
5  213 1953      20
20 213 1954      11
31 213 1955       2
33 234 1953      20
38 234 1958       2
40 234 1965       3
43 249 1952       2
A
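Note that in the output above NinYear is 20 for both ID 213 and ID 234 in 1953, because ave(ID, Year, ...) groups by Year only. If the goal is one count per person per year, one option is to pass both ID and Year as grouping variables. The sketch below uses a small made-up data1, not the poster's data.

```r
# Hypothetical miniature of data1
data1 <- data.frame(ID   = c(209, 209, 213, 213, 213, 234, 234),
                    Year = c(1971, 1971, 1953, 1953, 1954, 1953, 1953))

# Grouping by Year only: rows from different IDs in the same year are pooled
per_year <- with(data1, ave(ID, Year, FUN = length))

# Grouping by ID and Year: one count per person per year
data1$NinYear <- with(data1, ave(ID, ID, Year, FUN = length))

data1
```

The first argument to ave only supplies the values handed to FUN; since FUN is length, any column of the right length would do.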
Is this what you want?
data1$NinYear <- with(data1, ave(ID, Year, FUN = length))
On Tue, Oct 14, 2008 at 12:22 PM, Tom La Bone <[EMAIL PROTECTED]> wrote:
>
> The table function, which I was unaware of, works great. However, I still
> don't see how to assign the values calculated with table to dat
The table function, which I was unaware of, works great. However, I still
don't see how to assign the values calculated with table to data1$NinYear
without using a loop.
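One possible way to do this without a loop is to index the table with a two-column character matrix, one row per line of the data frame. This is only a sketch on a small made-up data1; it assumes ID and Year convert to the same strings that table uses for its dimnames.

```r
# Hypothetical miniature of data1
data1 <- data.frame(ID   = c(209, 209, 213, 213, 213, 234, 234),
                    Year = c(1971, 1971, 1953, 1953, 1954, 1953, 1953))

counts <- with(data1, table(ID, Year))

# Each row of idx names one (ID, Year) cell of the table
idx <- cbind(as.character(data1$ID), as.character(data1$Year))

# Look up every row's count in one vectorised step
data1$NinYear <- as.vector(counts[idx])

data1
```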
Tom
Henrique Dallazuanna wrote:
>
> Try this:
>
> with(data1, table(ID, Year))
>
> On Tue, Oct 14, 2008 at 10:58 AM, To
> This seems to work but is horribly slow (some files I am working with have
> over 500,000 lines). Can anyone suggest a faster way of doing this, perhaps
> a way that does not use a for loop? Thanks.
If the table solutions don't work or take forever with your real data, have a
look at the wiki:
try the following:
out <- tapply(data1$ID, list(data1$ID, data1$Year), length)
out[is.na(out)] <- 0
out
I hope it helps.
Best,
Dimitris
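For comparison, a short sketch on a made-up data1 showing that the tapply approach above, after replacing NA with 0, gives the same counts as table(ID, Year). The data values are assumptions for illustration.

```r
# Hypothetical miniature of data1
data1 <- data.frame(ID   = c(209, 209, 213, 213, 213, 234, 234),
                    Year = c(1971, 1971, 1953, 1953, 1954, 1953, 1953))

# tapply leaves NA for (ID, Year) combinations that never occur
out <- tapply(data1$ID, list(data1$ID, data1$Year), length)
out[is.na(out)] <- 0

# table fills those combinations with 0 directly
tab <- with(data1, table(ID, Year))

all(out == tab)   # cell-by-cell comparison of the two results
```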
Tom La Bone wrote:
Assume that I have the dataframe "data1", which is listed at the end of this
message. I want to count the number of lines that each person
Try this:
with(data1, table(ID, Year))
On Tue, Oct 14, 2008 at 10:58 AM, Tom La Bone <[EMAIL PROTECTED]> wrote:
>
> Assume that I have the dataframe "data1", which is listed at the end of
> this
> message. I want to count the number of lines that each person has for each
> year. For example, the per
table(data1$ID, data1$Year)
See ?table and other functions referenced in ?table.
Tom La Bone wrote:
Assume that I have the dataframe "data1", which is listed at the end of this
message. I want to count the number of lines that each person has for each
year. For example, the person with ID=213 ha
Assume that I have the dataframe "data1", which is listed at the end of this
message. I want to count the number of lines that each person has for each
year. For example, the person with ID=213 has 15 entries (NinYear) for 1953.
The following bit of code calculates NinYear:
for (i in 1:length(data1$