Re: [R] Sample of a subsample

2017-09-25 Thread Bert Gunter
Yes.

Beating a pretty weary horse, a slightly cleaner version of my prior
offering, using with() instead of within(), is:

dat[with(dat, sample(var1[!var1%%2 & !sampleNo], 10, rep=FALSE)),
"sampleNo"] <- 2

with() and within() are convenient ways to avoid having to repeatedly name
the columns via $. Note also the use of logical subscripting of the data
frame, in which numeric 0 is coerced to FALSE and any nonzero value to TRUE
(which I should have done previously).
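
A minimal sketch of the two points above, assuming the 40-row dat from earlier in the thread. with() only evaluates an expression and returns its value, so any assignment to the data frame has to happen outside it, whereas within() returns a modified copy that must be assigned back. The names eligible, picked, dat2 and flag are illustrative, not from the thread.

```r
set.seed(1357)  # arbitrary seed, for reproducibility only
dat <- data.frame(var1 = 1:40, var2 = 40:1)
dat$sampleNo <- 0
dat[sample(nrow(dat), 10), "sampleNo"] <- 1

# Logical coercion: 0 becomes FALSE and any nonzero number becomes TRUE, so
# !var1 %% 2 is TRUE for even var1 and !sampleNo is TRUE where sampleNo == 0.
eligible <- with(dat, !var1 %% 2 & !sampleNo)
stopifnot(identical(eligible, with(dat, var1 %% 2 == 0 & sampleNo == 0)))

# with() returns a value and leaves dat untouched; assign outside of it.
picked <- sample(which(eligible), 10)
dat[picked, "sampleNo"] <- 2

# within() evaluates the assignment on a copy and returns that copy.
dat2 <- within(dat, flag <- sampleNo > 0)
```

Drawing from which(eligible) gives row positions directly, so the sketch does not depend on var1 happening to equal the row number.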

Cheers,
Bert



Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


Re: [R] Sample of a subsample

2017-09-25 Thread Eric Berger
Hi David,
I was about to post a reply when Bert responded. His answer is good
and his comment to use the name 'dat' rather than 'data' is instructive.
I am providing my suggestion as well because I think it may address
what was causing you some confusion (mainly to use "which", but also
the missing !)

idx2 <- sample(which((!data$var1%%2) & data$sampleNo==0), size=10,
               replace=FALSE)
data[idx2,]$sampleNo <- 2
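
To illustrate why which() helps here, a short sketch assuming the data frame from David's post (ok and rows are illustrative names). The logical condition marks the eligible cases, which() turns it into row numbers, and the draw is then made from those row numbers rather than from the columns of the data frame. Without the leading !, data$var1%%2 would be TRUE for odd values instead of even ones.

```r
set.seed(1)  # arbitrary seed, for reproducibility only
data <- data.frame(var1 = 1:40, var2 = 40:1)
data$sampleNo <- 0
data[sample(nrow(data), 10), "sampleNo"] <- 1

ok <- (!data$var1 %% 2) & data$sampleNo == 0  # TRUE for even var1 not yet sampled
rows <- which(ok)                             # row numbers of the eligible cases

# Caveat: sample(x, ...) treats a single positive number x as 1:x, so this
# indexes into rows instead, which stays correct if only one case is eligible.
idx2 <- rows[sample.int(length(rows), size = min(10, length(rows)))]
data[idx2, "sampleNo"] <- 2
```

With 40 cases there are always at least 10 eligible rows, so the min() guard only matters for smaller data sets.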

Eric




Re: [R] Sample of a subsample

2017-09-25 Thread Bert Gunter
For personal aesthetic reasons, I changed the name "data" to "dat".

Your code, with a slight modification:

set.seed (1357)  ## for reproducibility
dat <- data.frame(var1=seq(1:40), var2=seq(40,1))
dat$sampleNo <- 0
idx <- sample(seq(1,nrow(dat)), size=10, replace=F)
dat[idx,"sampleNo"] <-1

## yielding
> dat

   var1 var2 sampleNo
1     1   40        0
2     2   39        1
3     3   38        0
4     4   37        0
5     5   36        0
6     6   35        1
7     7   34        0
8     8   33        0
9     9   32        0
10   10   31        0
11   11   30        0
12   12   29        0
13   13   28        0
14   14   27        0
15   15   26        1
16   16   25        1
17   17   24        0
18   18   23        0
19   19   22        0
20   20   21        1
21   21   20        0
22   22   19        1
23   23   18        0
24   24   17        1
25   25   16        0
26   26   15        1
27   27   14        0
28   28   13        0
29   29   12        0
30   30   11        0
31   31   10        0
32   32    9        0
33   33    8        0
34   34    7        0
35   35    6        1
36   36    5        0
37   37    4        1
38   38    3        0
39   39    2        0
40   40    1        0

## This is basically a transcription of your specification into indexing logic

dat <- within(dat,
    sampleNo[sample(var1[(var1%%2 == 0) & sampleNo==0], 10, replace=FALSE)] <- 2)

##yielding
> dat

   var1 var2 sampleNo
1     1   40        0
2     2   39        1
3     3   38        0
4     4   37        2
5     5   36        0
6     6   35        1
7     7   34        0
8     8   33        2
9     9   32        0
10   10   31        2
11   11   30        0
12   12   29        0
13   13   28        0
14   14   27        2
15   15   26        1
16   16   25        1
17   17   24        0
18   18   23        2
19   19   22        0
20   20   21        1
21   21   20        0
22   22   19        1
23   23   18        0
24   24   17        1
25   25   16        0
26   26   15        1
27   27   14        0
28   28   13        2
29   29   12        0
30   30   11        2
31   31   10        0
32   32    9        2
33   33    8        0
34   34    7        2
35   35    6        1
36   36    5        2
37   37    4        1
38   38    3        0
39   39    2        0
40   40    1        0
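
One way to check a result like the one above is to tabulate sampleNo and confirm that every case in sample 2 has an even var1. The sketch below rebuilds the example so that it runs on its own; counts is an illustrative name. Which rows are drawn can differ between R versions, so it checks properties of the result rather than specific row numbers.

```r
set.seed(1357)  # same seed as above; the drawn rows may still vary by R version
dat <- data.frame(var1 = 1:40, var2 = 40:1)
dat$sampleNo <- 0
dat[sample(nrow(dat), 10), "sampleNo"] <- 1
dat <- within(dat,
    sampleNo[sample(var1[var1 %% 2 == 0 & sampleNo == 0], 10)] <- 2)

# Expect 20 cases left at 0, 10 in sample 1 and 10 in sample 2.
counts <- table(dat$sampleNo)
print(counts)

# Every case in sample 2 must have an even var1.
stopifnot(all(dat$var1[dat$sampleNo == 2] %% 2 == 0))
```

Note that the within() call uses the sampled var1 values as positions in sampleNo, which works here only because var1 coincides with the row number.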






[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Sample of a subsample

2017-09-25 Thread David Studer
Hello everybody!

I have the following problem: I'd like to select a sample from a subsample
in a dataset. Actually, I don't want to select it, but to create a new
variable sampleNo that indicates which sample (one or two) a case
belongs to.

Let's suppose I have a dataset containing 40 cases:

data <- data.frame(var1=seq(1:40), var2=seq(40,1))

The first sample (n=10) I drew like this:

data$sampleNo <- 0
idx <- sample(seq(1,nrow(data)), size=10, replace=F)
data[idx,]$sampleNo <- 1

Now (and here my problems start) I'd like to draw a second sample (n=10).
But this sample should be drawn only from the cases that don't belong to
the first sample. *Additionally, "var1" should be an even number.*

So sampleNo should be 0 for cases that were not drawn at all, 1 for cases
that belong to the first sample and 2 for cases belonging to the second
sample (= sampleNo equals 0 and var1 is even).

I was trying to solve it like this:

idx2<-data$var1%%2 & data$sampleNo==0
sample(data[idx2,], size=10, replace=F)

But how can I set sampleNo to 2?


Thank you very much for your help!

David
