[R] Merging by factor variables

2011-02-02 Thread H Roark
I'm wondering about the behavior of the merge function when factors are used as by variables. I know that combining two factors with c() can give odd results, as in c(factor(1:5), factor(6:10)), which prints: [1] 1 2 3 4 5 1 2 3 4 5 I presume this is because factors are actually stored

[R] Efficient way to determine if a data frame has missing observations

2011-02-02 Thread H Roark
I have a data set covering a large number of cities with values for characteristics such as land area, population, and employment. The problem I have is that some cities lack observations for some of the characteristics and I'd like a quick way to determine which cities have missing data. For

[R] read.table() versus scan()

2011-01-27 Thread H Roark
I need to import a large number of simple, space-delimited text files, each with a few columns of data. The one quirk is that some rows are missing data and some have junk text at the end of the line. A typical file might look like: a b c d 1 2 3 x 4 5 6 7 8 9 x 1 2 3 x c c 4 5 6 x 7 8 9 x

[R] How does the data.frame function generate column names?

2011-01-23 Thread H Roark
Hi all, I'm a new R user and am confused about how the data.frame function behaves when converting a vector to a data frame. I'm specifically interested in cases where the vector is expressed as a subset of another data frame. For example, say I want to create a data frame from