On 12/07/2008 12:31 PM, [EMAIL PROTECTED] wrote:
I am using a simple R statement to read in the file:

a <- read.csv("Sample.dat", header=TRUE)

There is a lot of data, but the first few lines look like:

DayOfYear,Quantity,Fraction,Category,SubCategory
1,82,0.0000390392720794458,(Unknown),(Unknown)
2,78,0.0000371349173438631,(Unknown),(Unknown)
. . .
71,2,0.0000009521773677913,WOMEN,Piratesses
72,4,0.0000019043547355827,WOMEN,Piratesses
73,3,0.0000014282660516870,WOMEN,Piratesses
74,14,0.0000066652415745395,WOMEN,Piratesses
75,2,0.0000009521773677913,WOMEN,Piratesses

If I read the data in as above, the command

a[1]

results in the output
[ reached getOption("max.print") -- omitted 16193 rows ]]

Shouldn't this be the first row?

No, the first row would be a[1,]. read.csv() returns a data frame, which can be indexed with two indices, to treat it like a matrix, or with one index, to treat it like a list of its columns. So a[1] selects the first column (as a one-column data frame), not the first row.
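To make the distinction concrete, here is a minimal sketch using a small made-up data frame (the values are illustrative and are not the poster's file):

```r
# Hypothetical three-row stand-in for the data read from Sample.dat
a <- data.frame(
  DayOfYear = 1:3,
  Quantity  = c(82, 78, 2),
  Category  = c("(Unknown)", "(Unknown)", "WOMEN"),
  stringsAsFactors = FALSE
)

first_row     <- a[1, ]   # two indices: row 1, all columns (a one-row data frame)
first_col_df  <- a[1]     # one index: column 1, still a data frame
first_col_vec <- a[[1]]   # double brackets: column 1 as a plain vector

print(first_row)
print(dim(first_col_df))
print(first_col_vec)
```

With a large file, a[1] prints an entire column, which is why the output runs into the max.print limit.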

Duncan Murdoch


a$Category[1]

results in the output

[1] (Unknown)
4464 Levels:   Tags ... WOMEN

But

a$Category[365]

gives me:

[1] 7 Plates   (Dessert),Western\n120,5,0.0000023804434194784,7 Plates   (Dessert)
4464 Levels:   Tags ... WOMEN

There is something fundamental about either vectors or the read.csv command
that I am missing here.
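A value that contains \n followed by several comma-separated fields is the pattern read.csv() tends to produce when a field holds an unmatched double quote, for example an inch mark as in 7" Plates. That is only an assumption here, since the raw lines of Sample.dat are not shown. A minimal sketch with made-up data, comparing the default quote handling with quote = "":

```r
# Made-up data: two Category values contain a literal double quote (inch mark)
csv_text <- paste(
  "DayOfYear,Quantity,Category,SubCategory",
  "119,3,7\" Plates (Dessert),Western",
  "120,5,7\" Plates (Dessert),Western",
  "121,2,WOMEN,Piratesses",
  sep = "\n"
)

# Default: '"' is treated as a quoting character, so the text between the
# two quote marks (including the line break) can end up in a single field
default_read <- read.csv(text = csv_text, header = TRUE)

# quote = "" turns quote processing off, so every line stays one record
literal_read <- read.csv(text = csv_text, header = TRUE, quote = "")

print(nrow(default_read))
print(nrow(literal_read))
print(literal_read$Category)
```

If the real file does use quotes to protect embedded commas, turning quoting off is the wrong fix; repairing the offending lines in the file would be the safer route.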

Thank you.

Kevin

---- jim holtman <[EMAIL PROTECTED]> wrote:
Please provide commented, minimal, self-contained, reproducible code,
or at least a before/after of what your data would look like.  Taking a
guess at what you are asking, here is one way of doing it:


x <- data.frame(cat=sample(LETTERS[1:3],20,TRUE),a=1:20, b=runif(20))
x
   cat  a          b
1    B  1 0.65472393
2    C  2 0.35319727
3    B  3 0.27026015
4    A  4 0.99268406
5    C  5 0.63349326
6    A  6 0.21320814
7    C  7 0.12937235
8    A  8 0.47811803
9    A  9 0.92407447
10   A 10 0.59876097
11   A 11 0.97617069
12   A 12 0.73179251
13   B 13 0.35672691
14   C 14 0.43147369
15   C 15 0.14821156
16   C 16 0.01307758
17   B 17 0.71556607
18   B 18 0.10318424
19   C 19 0.44628435
20   B 20 0.64010105
# create a list of the indices of the data grouped by 'cat'
split(seq(nrow(x)), x$cat)
$A
[1]  4  6  8  9 10 11 12

$B
[1]  1  3 13 17 18 20

$C
[1]  2  5  7 14 15 16 19

# or do you want the data
split(x, x$cat)
$A
   cat  a         b
4    A  4 0.9926841
6    A  6 0.2132081
8    A  8 0.4781180
9    A  9 0.9240745
10   A 10 0.5987610
11   A 11 0.9761707
12   A 12 0.7317925

$B
   cat  a         b
1    B  1 0.6547239
3    B  3 0.2702601
13   B 13 0.3567269
17   B 17 0.7155661
18   B 18 0.1031842
20   B 20 0.6401010

$C
   cat  a          b
2    C  2 0.35319727
5    C  5 0.63349326
7    C  7 0.12937235
14   C 14 0.43147369
15   C 15 0.14821156
16   C 16 0.01307758
19   C 19 0.44628435
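
As a follow-up to the split() calls above, here is a small sketch of how the resulting list might be used. The data are made up, and the categories are assigned deterministically (rather than with sample()) so that the result is reproducible:

```r
# Made-up data: 20 rows with categories A, B, C repeating in order
x <- data.frame(
  cat = rep(LETTERS[1:3], length.out = 20),
  a   = 1:20,
  stringsAsFactors = FALSE
)

groups <- split(x, x$cat)            # named list with one data frame per category

group_names    <- names(groups)      # the distinct category names
rows_per_group <- sapply(groups, nrow)
only_b         <- groups[["B"]]      # all rows whose category is "B"

print(group_names)
print(rows_per_group)
print(only_b)
```

The result is a list rather than a vector, because each element is itself a data frame; lapply() or sapply() can then apply a function to each group.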


On Sat, Jul 12, 2008 at 3:32 AM,  <[EMAIL PROTECTED]> wrote:
I have searched the archive and could not find what I need, so I will try to ask
the question here.

I read a table in (read.table)

a <- read.table(.....)

The table has column names like DayOfYear, Quantity, and Category.

The values in the Category column are strings (characters).

I want to get all of the rows grouped by Category. The number of unique
category names could be around 50. Say for argument's sake the number of
categories is exactly 50. Can I somehow get a vector of length 50 containing
the rows corresponding to each category (another vector)? I realize I can access
any row a[i]$Category (right?). But I want a vector containing the rows
corresponding to each distinct Category name.

Thank you.

Kevin

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
