Re: [R] subsetting a data set

2006-09-08 Thread Graham Smith
Petr,

Thanks, I shall store all this away for reference and have a look at the
posting guide.

I didn't expect it to be as complicated as it has turned out.

As you will see from my post to Sean, my pressing problem was solved by
replacing the "&" with an "|" in the sample code I gave.

Graham




On 08/09/06, Petr Pikal <[EMAIL PROTECTED]> wrote:
>
> Hi
>
> On 8 Sep 2006 at 10:33, Graham Smith wrote:
>
> Date sent:  Fri, 8 Sep 2006 10:33:49 +0100
> From:   "Graham Smith" <[EMAIL PROTECTED]>
> To: "Petr Pikal" <[EMAIL PROTECTED]>
> Copies to:  r-help@stat.math.ethz.ch
> Subject:Re: [R] subsetting a data set
>
> > Petr,
> >
> > Thanks again, but the data is GQ1, Max is a variable (column)
> >
> > So I have used
> >
> >  by(GQ1[,"Max"], list(GQ1$Status), summary)
> >
> > Which is very good,  and is better than the way I did it before by
> > summarising for each status level individually, but that still isn't
> > combing the data for Status == "Expert" and Status = "Ecol"
> >
> > So at the moment the status variable has 3 levels Expert, Ecol and
> > Stake,
>
> look at ?factor for how to deal with factors; if your variable is not a
> factor (see ?str), then turn it into one.
>
> x<-sample(letters[1:3], 20, replace=T) #character
> x.f<-as.factor(x) #turn to factor
> > x.f
> [1] b c b a c a c a a a a a b c c c b b c b
> Levels: a b c
> > levels(x.f)<-c("x","x","y") #rename levels
> > x.f
> [1] x y x x y x y x x x x x x y y y x x y x
> Levels: x y
> >
> >
> > I want to analsye that at two levels: Expert and Ecol combined into a
> > new level called "AllEcol" and the exsiting level "Stake"
>
> so in your case something like
>
> GQ1$statusComb<-factor(GQ1$Status, labels=c("AllEcol","AllEcol",
> "Stake"))
>
> should do it. Beware of label ordering: the labels follow the order of
> levels(GQ1$Status)!!!
>
> BTW, it would have been good if you had provided a usable example, as
> stated in the posting guide. Many times, while trying to put together
> an example, I solve the problem myself.
>
> HTH
> Petr
>
> >
> > It is this combining the levels that has got me stuck.
> >
> > Thanks again,
> >
> > Graham
> >
> > On 08/09/06, Petr Pikal <[EMAIL PROTECTED]> wrote:
> > >
> > > Sorry, I did not notice that in your case Max is not a function but
> > > your data. So probably
> > >
> > > by(Max[, your.columns], list(Max$status), summary)
> > >
> > > is maybe what you want.
> > > HTH
> > > Petr
> > >
> > >
> > > On 8 Sep 2006 at 10:31, Petr Pikal wrote:
> > >
> > > From:   "Petr Pikal" <[EMAIL PROTECTED]>
> > > To: "Graham Smith" <[EMAIL PROTECTED]>,
> > > r-help@stat.math.ethz.ch
> > > Date sent:  Fri, 08 Sep 2006 10:31:12 +0200
> > > Priority:   normal
> > > Subject:Re: [R] subsetting a data set
> > >
> > > > Hi
> > > >
> > > > I am not sure if your Max is the same as max so I am not sure what
> > > > you exactly want from your data. However you shall consult
> > > > ?tapply, ?by, ?aggregate and maybe also ?"[" together with chapter
> > > > 2 in intro manual in docs directory.
> > > >
> > > > aggregate(data[, some.columns], list(data$factor1, data$factor2),
> > > > max)
> > > >
> > > > will give you maximum for specified columns based on spliting the
> > > > data according to both factors
> > > >
> > > > Also connection summary with max is not common and I wonder what
> > > > is your output in this case. I believe that there are six same
> > > > numbers. However R is case sensitive and maybe Max does something
> > > > different from max. In my case it throws an error.
> > > >
> > > > HTH
> > > > Petr
> > > >
> > > > On 8 Sep 2006 at 8:06, Graham Smith wrote:
> > > >
> > > > Date sent:Fri, 8 Sep 2006 08:06:16 +0100
> > > > From: "Graham Smith" <[EMAIL PROTECTED]>
> > > > To:   r-help@stat.math.ethz.ch
> > > > Subject:  [R] subsetting a data set
> >

Re: [R] subsetting a data set

2006-09-08 Thread Graham Smith
Sean,

This seems to be getting there, except that I am going to need a
data.frame to hold "AllEcol" rather than a column, as GQ1 has 16
variables. So maybe this needs to be turned around into something like

AllEcol<- GQ1[(GQ1$Status == "Expert) | (GQ1$Status == "Ecol"),]

Except this doesn't work, as I obviously haven't got the syntax right.
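
[Note: a minimal sketch of the row subsetting attempted above, assuming the failure comes from the missing closing quote after Expert. The toy GQ1 below and its values are made up for illustration; only the names GQ1, Status, Max and the three level names come from the thread.]

```r
# Hypothetical stand-in for GQ1 (the real data has more variables)
GQ1 <- data.frame(
  Status = factor(c("Expert", "Ecol", "Stake", "Ecol", "Stake")),
  Max    = c(3, 5, 2, 4, 1),
  Min    = c(1, 2, 0, 1, 0)
)

# Keep every column, but only the rows where Status is Expert or Ecol
AllEcol <- GQ1[GQ1$Status == "Expert" | GQ1$Status == "Ecol", ]

# subset() selects the same rows with less typing
AllEcol2 <- subset(GQ1, Status %in% c("Expert", "Ecol"))

# The unused level "Stake" stays attached to the factor; drop it if unwanted
AllEcol <- droplevels(AllEcol)
levels(AllEcol$Status)
```

The comma before the closing bracket matters: it tells R that the logical condition selects rows and that all columns are kept.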

However, the direct answer to my question was contained in your answer.

If you go back to my original post I was principally asking why this (below)
didn't work, and how I could get around it.

summary (Max[Status=="Ecol"& Status=="Expert"])

By replacing the "&" with "|", as in your example, I can now combine both
levels and produce a summary.

summary (Max[Status=="Ecol" | Status=="Expert"])

The only comparable example I was able to find used the "&" symbol, which is
why I tried it.
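
[Note: a small sketch of why "&" returned nothing while "|" works. The data values are invented for illustration; %in% is shown as an equivalent alternative that the thread itself does not mention.]

```r
# Hypothetical stand-in for GQ1
GQ1 <- data.frame(
  Status = factor(c("Expert", "Ecol", "Stake", "Ecol", "Expert", "Stake")),
  Max    = c(10, 20, 30, 40, 50, 60)
)

# "&" asks for rows that are Ecol AND Expert at the same time: none exist
both <- GQ1$Max[GQ1$Status == "Ecol" & GQ1$Status == "Expert"]
length(both)

# "|" asks for rows that are Ecol OR Expert
either <- GQ1$Max[GQ1$Status == "Ecol" | GQ1$Status == "Expert"]

# %in% tests membership in a set of values and gives the same selection
either2 <- GQ1$Max[GQ1$Status %in% c("Ecol", "Expert")]

summary(either)
```

A single row can only ever hold one Status value, so a condition that requires two different values at once is FALSE for every row.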

Many thanks,

Graham

On 08/09/06, Sean O'Riordain <[EMAIL PROTECTED]> wrote:
>
> Hi Graham,
> Try creating a new column with the two levels that you want...
>
> something along the lines of (warning untested!!!)
>
> GQ1$newColumn[GQ1$Status == "Expert" | GQ1$Status == "Ecol"] <-
> "AllEcol"
> GQ1$newColumn[GQ1$Status == "Stake"] <- "Stake"
>
> and then do the
> by(GQ1[,"Max"], list(GQ1$newColumn), summary)
>
> when in doubt... break the problem into smaller chunks... :-)
>
> cheers,
> Sean
>
> On 08/09/06, Graham Smith <[EMAIL PROTECTED]> wrote:
> > Petr,
> >
> > Thanks again, but the data is GQ1, Max is a variable (column)
> >
> > So I have used
> >
> >  by(GQ1[,"Max"], list(GQ1$Status), summary)
> >
> > Which is very good,  and is better than the way I did it before by
> > summarising for each status level individually, but that still isn't
> combing
> > the data for Status == "Expert" and Status = "Ecol"
> >
> > So at the moment the status variable has 3 levels Expert, Ecol and
> Stake,
> >
> > I want to analsye that at two levels: Expert and Ecol combined into a
> new
> > level called "AllEcol" and the exsiting level "Stake"
> >
> > It is this combining the levels that has got me stuck.
> >
> > Thanks again,
> >
> > Graham
> >
> > On 08/09/06, Petr Pikal <[EMAIL PROTECTED]> wrote:
> > >
> > > Sorry, I did not notice that in your case Max is not a function but
> > > your data. So probably
> > >
> > > by(Max[, your.columns], list(Max$status), summary)
> > >
> > > is maybe what you want.
> > > HTH
> > > Petr
> > >
> > >
> > > On 8 Sep 2006 at 10:31, Petr Pikal wrote:
> > >
> > > From:   "Petr Pikal" <[EMAIL PROTECTED]>
> > > To: "Graham Smith" <[EMAIL PROTECTED]>,
> > > r-help@stat.math.ethz.ch
> > > Date sent:  Fri, 08 Sep 2006 10:31:12 +0200
> > > Priority:   normal
> > > Subject:Re: [R] subsetting a data set
> > >
> > > > Hi
> > > >
> > > > I am not sure if your Max is the same as max so I am not sure what
> you
> > > > exactly want from your data. However you shall consult ?tapply, ?by,
> > > > ?aggregate and maybe also ?"[" together with chapter 2 in intro
> manual
> > > > in docs directory.
> > > >
> > > > aggregate(data[, some.columns], list(data$factor1, data$factor2),
> max)
> > > >
> > > > will give you maximum for specified columns based on spliting the
> data
> > > > according to both factors
> > > >
> > > > Also connection summary with max is not common and I wonder what is
> > > > your output in this case. I believe that there are six same numbers.
> > > > However R is case sensitive and maybe Max does something different
> > > > from max. In my case it throws an error.
> > > >
> > > > HTH
> > > > Petr
> > > >
> > > > On 8 Sep 2006 at 8:06, Graham Smith wrote:
> > > >
> > > > Date sent:Fri, 8 Sep 2006 08:06:16 +0100
> > > > From: "Graham Smith" <[EMAIL PROTECTED]>
> > > > To:   r-help@stat.math.ethz.ch
> > > > Subject:  [R] subsetting a data set
> > > >
> > > > > I have a data set call

Re: [R] subsetting a data set

2006-09-08 Thread Petr Pikal
Hi

On 8 Sep 2006 at 10:33, Graham Smith wrote:

Date sent:  Fri, 8 Sep 2006 10:33:49 +0100
From:   "Graham Smith" <[EMAIL PROTECTED]>
To: "Petr Pikal" <[EMAIL PROTECTED]>
Copies to:  r-help@stat.math.ethz.ch
Subject:            Re: [R] subsetting a data set

> Petr,
> 
> Thanks again, but the data is GQ1, Max is a variable (column)
> 
> So I have used
> 
>  by(GQ1[,"Max"], list(GQ1$Status), summary)
> 
> Which is very good, and is better than the way I did it before by
> summarising for each status level individually, but that still isn't
> combining the data for Status == "Expert" and Status == "Ecol".
> 
> So at the moment the status variable has 3 levels Expert, Ecol and
> Stake,

look at ?factor for how to deal with factors; if your variable is not a 
factor (see ?str), then turn it into one.

x <- sample(letters[1:3], 20, replace = TRUE) # character
x.f <- as.factor(x) # turn into a factor
> x.f
 [1] b c b a c a c a a a a a b c c c b b c b
Levels: a b c
> levels(x.f)<-c("x","x","y") #rename levels
> x.f
 [1] x y x x y x y x x x x x x y y y x x y x
Levels: x y
>
> 
> I want to analyse that at two levels: Expert and Ecol combined into a
> new level called "AllEcol", and the existing level "Stake".

so in your case something like 

GQ1$statusComb<-factor(GQ1$Status, labels=c("AllEcol","AllEcol", 
"Stake"))

should do it. Beware of label ordering: the labels follow the order of
levels(GQ1$Status)!!!
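
[Note: a hedged sketch of merging factor levels by name rather than by position, which sidesteps the label-ordering pitfall. It uses the list form of levels<- instead of the factor(..., labels=) call suggested above; the toy data are invented.]

```r
# Hypothetical stand-in for GQ1
GQ1 <- data.frame(
  Status = factor(c("Expert", "Ecol", "Stake", "Ecol", "Stake")),
  Max    = c(3, 5, 2, 4, 1)
)

# By default the levels are sorted alphabetically
levels(GQ1$Status)

# Copy the factor, then map old levels to new ones by name
GQ1$statusComb <- GQ1$Status
levels(GQ1$statusComb) <- list(AllEcol = c("Ecol", "Expert"), Stake = "Stake")

# Cross-tabulate old against new levels to check the mapping
table(GQ1$Status, GQ1$statusComb)
```

Because each new level names the old levels it absorbs, the result does not depend on the order in which the original levels happen to be stored.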

BTW, it would have been good if you had provided a usable example, as 
stated in the posting guide. Many times, while trying to put together 
an example, I solve the problem myself.

HTH
Petr

> 
> It is this combining the levels that has got me stuck.
> 
> Thanks again,
> 
> Graham
> 
> On 08/09/06, Petr Pikal <[EMAIL PROTECTED]> wrote:
> >
> > Sorry, I did not notice that in your case Max is not a function but
> > your data. So probably
> >
> > by(Max[, your.columns], list(Max$status), summary)
> >
> > is maybe what you want.
> > HTH
> > Petr
> >
> >
> > On 8 Sep 2006 at 10:31, Petr Pikal wrote:
> >
> > From:   "Petr Pikal" <[EMAIL PROTECTED]>
> > To: "Graham Smith" <[EMAIL PROTECTED]>,
> > r-help@stat.math.ethz.ch
> > Date sent:  Fri, 08 Sep 2006 10:31:12 +0200
> > Priority:   normal
> > Subject:Re: [R] subsetting a data set
> >
> > > Hi
> > >
> > > I am not sure if your Max is the same as max so I am not sure what
> > > you exactly want from your data. However you shall consult
> > > ?tapply, ?by, ?aggregate and maybe also ?"[" together with chapter
> > > 2 in intro manual in docs directory.
> > >
> > > aggregate(data[, some.columns], list(data$factor1, data$factor2),
> > > max)
> > >
> > > will give you maximum for specified columns based on spliting the
> > > data according to both factors
> > >
> > > Also connection summary with max is not common and I wonder what
> > > is your output in this case. I believe that there are six same
> > > numbers. However R is case sensitive and maybe Max does something
> > > different from max. In my case it throws an error.
> > >
> > > HTH
> > > Petr
> > >
> > > On 8 Sep 2006 at 8:06, Graham Smith wrote:
> > >
> > > Date sent:Fri, 8 Sep 2006 08:06:16 +0100
> > > From: "Graham Smith" <[EMAIL PROTECTED]>
> > > To:   r-help@stat.math.ethz.ch
> > > Subject:  [R] subsetting a data set
> > >
> > > > I have a data set called GQ1, which has 20 variables one of
> > > > which is a factor called Status at thre levels "Expert", "Ecol"
> > > > and "Stake"
> > > >
> > > > I have managed to evaluate some of the data split by status
> > > > using commands like:
> > > >
> > > > summary (Max[Status=="Ecol"])
> > > >
> > > > BUT how do I produce  asummary for Ecol and Expert combined, the
> > > > only example I can find suggsts I could use
> > > >
> > > > summary (Max[Status=="Ecol"& Status=="Expert"]) but that doesn't
> > > > work.
> > > >
> > > > Additionally on the same vein, if I cannot work out how to
> > > > create a new data set that would contai

Re: [R] subsetting a data set

2006-09-08 Thread Graham Smith
Sean,

On 08/09/06, Sean O'Riordain <[EMAIL PROTECTED]> wrote:
>
> Hi Graham,
> Try creating a new column with the two levels that you want...
>
> something along the lines of (warning untested!!!)
>
> GQ1$newColumn[GQ1$Status == "Expert" | GQ1$Status == "Ecol"] <-
> "AllEcol"
> GQ1$newColumn[GQ1$Status == "Stake"] <- "Stake"
>
> and then do the
> by(GQ1[,"Max"], list(GQ1$newColumn), summary)
>
> when in doubt... break the problem into smaller chunks... :-)
>
> cheers,
> Sean
>
> On 08/09/06, Graham Smith <[EMAIL PROTECTED]> wrote:
> > Petr,
> >
> > Thanks again, but the data is GQ1, Max is a variable (column)
> >
> > So I have used
> >
> >  by(GQ1[,"Max"], list(GQ1$Status), summary)
> >
> > Which is very good,  and is better than the way I did it before by
> > summarising for each status level individually, but that still isn't
> combing
> > the data for Status == "Expert" and Status = "Ecol"
> >
> > So at the moment the status variable has 3 levels Expert, Ecol and
> Stake,
> >
> > I want to analsye that at two levels: Expert and Ecol combined into a
> new
> > level called "AllEcol" and the exsiting level "Stake"
> >
> > It is this combining the levels that has got me stuck.
> >
> > Thanks again,
> >
> > Graham
> >
> > On 08/09/06, Petr Pikal <[EMAIL PROTECTED]> wrote:
> > >
> > > Sorry, I did not notice that in your case Max is not a function but
> > > your data. So probably
> > >
> > > by(Max[, your.columns], list(Max$status), summary)
> > >
> > > is maybe what you want.
> > > HTH
> > > Petr
> > >
> > >
> > > On 8 Sep 2006 at 10:31, Petr Pikal wrote:
> > >
> > > From:   "Petr Pikal" <[EMAIL PROTECTED]>
> > > To: "Graham Smith" <[EMAIL PROTECTED]>,
> > > r-help@stat.math.ethz.ch
> > > Date sent:  Fri, 08 Sep 2006 10:31:12 +0200
> > > Priority:   normal
> > > Subject:Re: [R] subsetting a data set
> > >
> > > > Hi
> > > >
> > > > I am not sure if your Max is the same as max so I am not sure what
> you
> > > > exactly want from your data. However you shall consult ?tapply, ?by,
> > > > ?aggregate and maybe also ?"[" together with chapter 2 in intro
> manual
> > > > in docs directory.
> > > >
> > > > aggregate(data[, some.columns], list(data$factor1, data$factor2),
> max)
> > > >
> > > > will give you maximum for specified columns based on spliting the
> data
> > > > according to both factors
> > > >
> > > > Also connection summary with max is not common and I wonder what is
> > > > your output in this case. I believe that there are six same numbers.
> > > > However R is case sensitive and maybe Max does something different
> > > > from max. In my case it throws an error.
> > > >
> > > > HTH
> > > > Petr
> > > >
> > > > On 8 Sep 2006 at 8:06, Graham Smith wrote:
> > > >
> > > > Date sent:Fri, 8 Sep 2006 08:06:16 +0100
> > > > From: "Graham Smith" <[EMAIL PROTECTED]>
> > > > To:   r-help@stat.math.ethz.ch
> > > > Subject:  [R] subsetting a data set
> > > >
> > > > > I have a data set called GQ1, which has 20 variables one of which
> is
> > > > > a factor called Status at thre levels "Expert", "Ecol" and "Stake"
> > > > >
> > > > > I have managed to evaluate some of the data split by status using
> > > > > commands like:
> > > > >
> > > > > summary (Max[Status=="Ecol"])
> > > > >
> > > > > BUT how do I produce  asummary for Ecol and Expert combined, the
> > > > > only example I can find suggsts I could use
> > > > >
> > > > > summary (Max[Status=="Ecol"& Status=="Expert"]) but that doesn't
> > > > > work.
> > > > >
> > > > > Additionally on the same vein, if I cannot work out how to create
> a
> > > > > new data set that would contain all th

Re: [R] subsetting a data set

2006-09-08 Thread Sean O'Riordain
Hi Graham,
Try creating a new column with the two levels that you want...

something along the lines of (warning untested!!!)

GQ1$newColumn[GQ1$Status == "Expert" | GQ1$Status == "Ecol"] <- "AllEcol"
GQ1$newColumn[GQ1$Status == "Stake"] <- "Stake"

and then do the
by(GQ1[,"Max"], list(GQ1$newColumn), summary)

when in doubt... break the problem into smaller chunks... :-)
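
[Note: a minimal sketch of the "new column" idea above, using ifelse() as one possible way to build the column in a single step. The toy data are invented, and ifelse() is a substitution for illustration, not the code originally posted.]

```r
# Hypothetical stand-in for GQ1
GQ1 <- data.frame(
  Status = factor(c("Expert", "Ecol", "Stake", "Ecol", "Expert", "Stake")),
  Max    = c(10, 20, 30, 40, 50, 60)
)

# Label each row "AllEcol" if it is Expert or Ecol, otherwise "Stake"
GQ1$newColumn <- ifelse(GQ1$Status %in% c("Expert", "Ecol"), "AllEcol", "Stake")

# Summarise Max separately for the two combined groups
by(GQ1[, "Max"], list(GQ1$newColumn), summary)

# Group means, as a quick check on the grouping
groupMeans <- tapply(GQ1$Max, GQ1$newColumn, mean)
groupMeans
```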

cheers,
Sean

On 08/09/06, Graham Smith <[EMAIL PROTECTED]> wrote:
> Petr,
>
> Thanks again, but the data is GQ1, Max is a variable (column)
>
> So I have used
>
>  by(GQ1[,"Max"], list(GQ1$Status), summary)
>
> Which is very good, and is better than the way I did it before by
> summarising for each status level individually, but that still isn't combining
> the data for Status == "Expert" and Status == "Ecol".
>
> So at the moment the Status variable has 3 levels: Expert, Ecol and Stake.
>
> I want to analyse that at two levels: Expert and Ecol combined into a new
> level called "AllEcol", and the existing level "Stake".
>
> It is this combining the levels that has got me stuck.
>
> Thanks again,
>
> Graham
>
> On 08/09/06, Petr Pikal <[EMAIL PROTECTED]> wrote:
> >
> > Sorry, I did not notice that in your case Max is not a function but
> > your data. So probably
> >
> > by(Max[, your.columns], list(Max$status), summary)
> >
> > is maybe what you want.
> > HTH
> > Petr
> >
> >
> > On 8 Sep 2006 at 10:31, Petr Pikal wrote:
> >
> > From:           "Petr Pikal" <[EMAIL PROTECTED]>
> > To: "Graham Smith" <[EMAIL PROTECTED]>,
> > r-help@stat.math.ethz.ch
> > Date sent:  Fri, 08 Sep 2006 10:31:12 +0200
> > Priority:   normal
> > Subject:Re: [R] subsetting a data set
> >
> > > Hi
> > >
> > > I am not sure if your Max is the same as max so I am not sure what you
> > > exactly want from your data. However you shall consult ?tapply, ?by,
> > > ?aggregate and maybe also ?"[" together with chapter 2 in intro manual
> > > in docs directory.
> > >
> > > aggregate(data[, some.columns], list(data$factor1, data$factor2), max)
> > >
> > > will give you maximum for specified columns based on spliting the data
> > > according to both factors
> > >
> > > Also connection summary with max is not common and I wonder what is
> > > your output in this case. I believe that there are six same numbers.
> > > However R is case sensitive and maybe Max does something different
> > > from max. In my case it throws an error.
> > >
> > > HTH
> > > Petr
> > >
> > > On 8 Sep 2006 at 8:06, Graham Smith wrote:
> > >
> > > Date sent:Fri, 8 Sep 2006 08:06:16 +0100
> > > From: "Graham Smith" <[EMAIL PROTECTED]>
> > > To:   r-help@stat.math.ethz.ch
> > > Subject:  [R] subsetting a data set
> > >
> > > > I have a data set called GQ1, which has 20 variables one of which is
> > > > a factor called Status at thre levels "Expert", "Ecol" and "Stake"
> > > >
> > > > I have managed to evaluate some of the data split by status using
> > > > commands like:
> > > >
> > > > summary (Max[Status=="Ecol"])
> > > >
> > > > BUT how do I produce  asummary for Ecol and Expert combined, the
> > > > only example I can find suggsts I could use
> > > >
> > > > summary (Max[Status=="Ecol"& Status=="Expert"]) but that doesn't
> > > > work.
> > > >
> > > > Additionally on the same vein, if I cannot work out how to create a
> > > > new data set that would contain all the data for all the variables
> > > > but only for the data where Status = Ecol, or where status equalles
> > > > Ecol and Expert.
> > > >
> > > > I know this is yet again a very simple problem, but I really can't
> > > > find the solution in the help or the books I have.
> > > >
> > > > Many thanks,
> > > >
> > > > Graham
> > > >
> > > >  [[alternative HTML version deleted]]
> > > >
> > > > __
> > > > R-help@stat.math.ethz.ch mailing list

Re: [R] subsetting a data set

2006-09-08 Thread Graham Smith
Petr,

Thanks again, but the data is GQ1, Max is a variable (column)

So I have used

 by(GQ1[,"Max"], list(GQ1$Status), summary)

Which is very good, and is better than the way I did it before by
summarising for each status level individually, but that still isn't combining
the data for Status == "Expert" and Status == "Ecol".

So at the moment the Status variable has 3 levels: Expert, Ecol and Stake.

I want to analyse that at two levels: Expert and Ecol combined into a new
level called "AllEcol", and the existing level "Stake".

It is this combining the levels that has got me stuck.

Thanks again,

Graham

On 08/09/06, Petr Pikal <[EMAIL PROTECTED]> wrote:
>
> Sorry, I did not notice that in your case Max is not a function but
> your data. So probably
>
> by(Max[, your.columns], list(Max$status), summary)
>
> is maybe what you want.
> HTH
> Petr
>
>
> On 8 Sep 2006 at 10:31, Petr Pikal wrote:
>
> From:   "Petr Pikal" <[EMAIL PROTECTED]>
> To: "Graham Smith" <[EMAIL PROTECTED]>,
> r-help@stat.math.ethz.ch
> Date sent:  Fri, 08 Sep 2006 10:31:12 +0200
> Priority:   normal
> Subject:Re: [R] subsetting a data set
>
> > Hi
> >
> > I am not sure if your Max is the same as max so I am not sure what you
> > exactly want from your data. However you shall consult ?tapply, ?by,
> > ?aggregate and maybe also ?"[" together with chapter 2 in intro manual
> > in docs directory.
> >
> > aggregate(data[, some.columns], list(data$factor1, data$factor2), max)
> >
> > will give you maximum for specified columns based on spliting the data
> > according to both factors
> >
> > Also connection summary with max is not common and I wonder what is
> > your output in this case. I believe that there are six same numbers.
> > However R is case sensitive and maybe Max does something different
> > from max. In my case it throws an error.
> >
> > HTH
> > Petr
> >
> > On 8 Sep 2006 at 8:06, Graham Smith wrote:
> >
> > Date sent:Fri, 8 Sep 2006 08:06:16 +0100
> > From: "Graham Smith" <[EMAIL PROTECTED]>
> > To:   r-help@stat.math.ethz.ch
> > Subject:  [R] subsetting a data set
> >
> > > I have a data set called GQ1, which has 20 variables one of which is
> > > a factor called Status at thre levels "Expert", "Ecol" and "Stake"
> > >
> > > I have managed to evaluate some of the data split by status using
> > > commands like:
> > >
> > > summary (Max[Status=="Ecol"])
> > >
> > > BUT how do I produce  asummary for Ecol and Expert combined, the
> > > only example I can find suggsts I could use
> > >
> > > summary (Max[Status=="Ecol"& Status=="Expert"]) but that doesn't
> > > work.
> > >
> > > Additionally on the same vein, if I cannot work out how to create a
> > > new data set that would contain all the data for all the variables
> > > but only for the data where Status = Ecol, or where status equalles
> > > Ecol and Expert.
> > >
> > > I know this is yet again a very simple problem, but I really can't
> > > find the solution in the help or the books I have.
> > >
> > > Many thanks,
> > >
> > > Graham
> > >
> > >  [[alternative HTML version deleted]]
> > >
> > > __
> > > R-help@stat.math.ethz.ch mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> > > http://www.R-project.org/posting-guide.html and provide commented,
> > > minimal, self-contained, reproducible code.
> >
> > Petr Pikal
> > [EMAIL PROTECTED]
> >
>
> Petr Pikal
> [EMAIL PROTECTED]
>
>



Re: [R] subsetting a data set

2006-09-08 Thread Petr Pikal
Hi

if you use summary, aggregate will probably not work, and tapply has 
to be called differently

tapply(seq(along=Max[,1]), list(Max$Status), function(i, x) 
summary(x[i]), x=Max[,one.column])

or you can use by

by(Max[,1:5], list(Max$Status), summary)

or, if you do not like the output, something like this

lll <- lapply(as.list(Max[,your.columns]), function(x) 
sapply(split(x,Max$Status),summary))
do.call("rbind",lll)
or
do.call("data.frame",lll)
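
[Note: a runnable sketch of the by() and lapply/split approaches above on invented data. Here the data frame is called GQ1 and the summarised columns are Max and Min; these column choices are assumptions for illustration.]

```r
# Hypothetical stand-in for the data frame
GQ1 <- data.frame(
  Status = factor(c("Expert", "Ecol", "Stake", "Ecol", "Expert", "Stake")),
  Max    = c(10, 20, 30, 40, 50, 60),
  Min    = c(1, 2, 3, 4, 5, 6)
)

# by(): one summary() printout per Status level
by(GQ1[, c("Max", "Min")], list(GQ1$Status), summary)

# lapply/split: for each column, a matrix of summary statistics
# with one column per Status level
lll <- lapply(as.list(GQ1[, c("Max", "Min")]),
              function(x) sapply(split(x, GQ1$Status), summary))

# Stack the per-column matrices into one table
res <- do.call("rbind", lll)
res
```

The stacked result has one block of summary rows per variable, which is easier to export or compare than the printed output of by().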

HTH
Petr

On 8 Sep 2006 at 10:03, Graham Smith wrote:

Date sent:  Fri, 8 Sep 2006 10:03:51 +0100
From:   "Graham Smith" <[EMAIL PROTECTED]>
To: "Petr Pikal" <[EMAIL PROTECTED]>
Copies to:      r-help@stat.math.ethz.ch
Subject:Re: [R] subsetting a data set

> Petr,
> 
> Thanks, I shall have a look at these options.
> 
> Sorry about the confusion with the "Max"; in my example "Max" is the
> name of the variable that I am summarising. I chose a poor example to
> cut and paste from R, not thinking about the obvious confusion this
> would cause.
> 
> Thanks again
> 
> Graham
> 
> On 08/09/06, Petr Pikal <[EMAIL PROTECTED]> wrote:
> >
> > Hi
> >
> > I am not sure if your Max is the same as max so I am not sure what
> > you exactly want from your data. However you shall consult ?tapply,
> > ?by, ?aggregate and maybe also ?"[" together with chapter 2 in intro
> > manual in docs directory.
> >
> > aggregate(data[, some.columns], list(data$factor1, data$factor2),
> > max)
> >
> > will give you maximum for specified columns based on spliting the
> > data according to both factors
> >
> > Also connection summary with max is not common and I wonder what is
> > your output in this case. I believe that there are six same numbers.
> > However R is case sensitive and maybe Max does something different
> > from max. In my case it throws an error.
> >
> > HTH
> > Petr
> >
> > On 8 Sep 2006 at 8:06, Graham Smith wrote:
> >
> > Date sent:  Fri, 8 Sep 2006 08:06:16 +0100
> > From:   "Graham Smith" < [EMAIL PROTECTED]>
> > To: r-help@stat.math.ethz.ch
> > Subject:[R] subsetting a data set
> >
> > > I have a data set called GQ1, which has 20 variables one of which
> > > is a factor called Status at thre levels "Expert", "Ecol" and
> > > "Stake"
> > >
> > > I have managed to evaluate some of the data split by status using
> > > commands like:
> > >
> > > summary (Max[Status=="Ecol"])
> > >
> > > BUT how do I produce  asummary for Ecol and Expert combined, the
> > > only example I can find suggsts I could use
> > >
> > > summary (Max[Status=="Ecol"& Status=="Expert"]) but that doesn't
> > > work.
> > >
> > > Additionally on the same vein, if I cannot work out how to create
> > > a new data set that would contain all the data for all the
> > > variables but only for the data where Status = Ecol, or where
> > > status equalles Ecol and Expert.
> > >
> > > I know this is yet again a very simple problem, but I really can't
> > > find the solution in the help or the books I have.
> > >
> > > Many thanks,
> > >
> > > Graham
> > >
> >
> > Petr Pikal
> > [EMAIL PROTECTED]
> >
> >
> 

Petr Pikal
[EMAIL PROTECTED]



Re: [R] subsetting a data set

2006-09-08 Thread Petr Pikal
Sorry, I did not notice that in your case Max is not a function but 
your data. So

by(Max[, your.columns], list(Max$Status), summary)

is probably what you want.
HTH
Petr


On 8 Sep 2006 at 10:31, Petr Pikal wrote:

From:   "Petr Pikal" <[EMAIL PROTECTED]>
To: "Graham Smith" <[EMAIL PROTECTED]>, 
r-help@stat.math.ethz.ch
Date sent:  Fri, 08 Sep 2006 10:31:12 +0200
Priority:   normal
Subject:            Re: [R] subsetting a data set

> Hi
> 
> I am not sure whether your Max is the same as max, so I am not sure
> exactly what you want from your data. However, you should consult
> ?tapply, ?by, ?aggregate and maybe also ?"[", together with chapter 2
> of the intro manual in the docs directory.
> 
> aggregate(data[, some.columns], list(data$factor1, data$factor2), max)
> 
> will give you the maximum of the specified columns, splitting the data
> according to both factors.
> 
> Also, combining summary with max is unusual, and I wonder what your
> output is in this case. I believe it would be the same number six times.
> However, R is case sensitive, so maybe Max does something different
> from max. In my case it throws an error.
> 
> HTH
> Petr
> 
> On 8 Sep 2006 at 8:06, Graham Smith wrote:
> 
> Date sent:Fri, 8 Sep 2006 08:06:16 +0100
> From: "Graham Smith" <[EMAIL PROTECTED]>
> To:   r-help@stat.math.ethz.ch
> Subject:  [R] subsetting a data set
> 
> > I have a data set called GQ1, which has 20 variables one of which is
> > a factor called Status at thre levels "Expert", "Ecol" and "Stake"
> > 
> > I have managed to evaluate some of the data split by status using
> > commands like:
> > 
> > summary (Max[Status=="Ecol"])
> > 
> > BUT how do I produce  asummary for Ecol and Expert combined, the
> > only example I can find suggsts I could use
> > 
> > summary (Max[Status=="Ecol"& Status=="Expert"]) but that doesn't
> > work.
> > 
> > Additionally on the same vein, if I cannot work out how to create a
> > new data set that would contain all the data for all the variables
> > but only for the data where Status = Ecol, or where status equalles
> > Ecol and Expert.
> > 
> > I know this is yet again a very simple problem, but I really can't
> > find the solution in the help or the books I have.
> > 
> > Many thanks,
> > 
> > Graham
> > 
> 
> Petr Pikal
> [EMAIL PROTECTED]
> 

Petr Pikal
[EMAIL PROTECTED]



Re: [R] subsetting a data set

2006-09-08 Thread Graham Smith
Petr,

Thanks, I shall have a look at these options.

Sorry about the confusion with the "Max"; in my example "Max" is the name of
the variable that I am summarising. I chose a poor example to cut and paste
from R, not thinking about the obvious confusion this would cause.

Thanks again

Graham

On 08/09/06, Petr Pikal <[EMAIL PROTECTED]> wrote:
>
> Hi
>
> I am not sure whether your Max is the same as max, so I am not sure
> exactly what you want from your data. However, you should consult
> ?tapply, ?by, ?aggregate and maybe also ?"[", together with chapter 2
> of the intro manual in the docs directory.
>
> aggregate(data[, some.columns], list(data$factor1, data$factor2),
> max)
>
> will give you the maximum of the specified columns, splitting the
> data according to both factors.
>
> Also, combining summary with max is unusual, and I wonder what your
> output is in this case. I believe it would be the same number six times.
> However, R is case sensitive, so maybe Max does something different
> from max. In my case it throws an error.
>
> HTH
> Petr
>
> On 8 Sep 2006 at 8:06, Graham Smith wrote:
>
> Date sent:  Fri, 8 Sep 2006 08:06:16 +0100
> From:   "Graham Smith" < [EMAIL PROTECTED]>
> To: r-help@stat.math.ethz.ch
> Subject:[R] subsetting a data set
>
> > I have a data set called GQ1, which has 20 variables one of which is a
> > factor called Status at thre levels "Expert", "Ecol" and "Stake"
> >
> > I have managed to evaluate some of the data split by status using
> > commands like:
> >
> > summary (Max[Status=="Ecol"])
> >
> > BUT how do I produce  asummary for Ecol and Expert combined, the only
> > example I can find suggsts I could use
> >
> > summary (Max[Status=="Ecol"& Status=="Expert"]) but that doesn't work.
> >
> > Additionally on the same vein, if I cannot work out how to create a
> > new data set that would contain all the data for all the variables but
> > only for the data where Status = Ecol, or where status equalles Ecol
> > and Expert.
> >
> > I know this is yet again a very simple problem, but I really can't
> > find the solution in the help or the books I have.
> >
> > Many thanks,
> >
> > Graham
> >
>
> Petr Pikal
> [EMAIL PROTECTED]
>
>



Re: [R] subsetting a data set

2006-09-08 Thread Petr Pikal
Hi

I am not sure whether your Max is the same as max, so I am not sure 
exactly what you want from your data. However, you should consult 
?tapply, ?by, ?aggregate and maybe also ?"[", together with chapter 2 
of the intro manual in the docs directory.

aggregate(data[, some.columns], list(data$factor1, data$factor2), 
max)

will give you the maximum of the specified columns, splitting the 
data according to both factors.
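
[Note: a small runnable sketch of the aggregate() call above. The data frame dat, its columns v1 and v2, and the group labels f1 and f2 are all invented for illustration.]

```r
# Hypothetical data with two grouping factors and two numeric columns
dat <- data.frame(
  factor1 = c("a", "a", "b", "b", "a", "b"),
  factor2 = c("x", "y", "x", "y", "x", "y"),
  v1      = c(1, 5, 3, 7, 2, 4),
  v2      = c(10, 50, 30, 70, 20, 40)
)

# Maximum of v1 and v2 within each combination of factor1 and factor2;
# naming the list elements gives readable group columns in the result
res <- aggregate(dat[, c("v1", "v2")],
                 list(f1 = dat$factor1, f2 = dat$factor2),
                 max)
res
```

Each row of the result is one factor combination that actually occurs in the data, followed by the per-column maxima for that group.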

Also, combining summary with max is unusual, and I wonder what your 
output is in this case. I believe it would be the same number six times. 
However, R is case sensitive, so maybe Max does something different 
from max. In my case it throws an error.

HTH
Petr

On 8 Sep 2006 at 8:06, Graham Smith wrote:

Date sent:  Fri, 8 Sep 2006 08:06:16 +0100
From:   "Graham Smith" <[EMAIL PROTECTED]>
To: r-help@stat.math.ethz.ch
Subject:[R] subsetting a data set

> I have a data set called GQ1, which has 20 variables, one of which is a
> factor called Status with three levels: "Expert", "Ecol" and "Stake".
> 
> I have managed to evaluate some of the data split by status using
> commands like:
> 
> summary (Max[Status=="Ecol"])
> 
> BUT how do I produce a summary for Ecol and Expert combined? The only
> example I can find suggests I could use
> 
> summary (Max[Status=="Ecol"& Status=="Expert"]) but that doesn't work.
> 
> Additionally, in the same vein, I cannot work out how to create a
> new data set that would contain all the data for all the variables but
> only for the rows where Status is Ecol, or where Status is either Ecol
> or Expert.
> 
> I know this is yet again a very simple problem, but I really can't
> find the solution in the help or the books I have.
> 
> Many thanks,
> 
> Graham
> 

Petr Pikal
[EMAIL PROTECTED]
