Re: [R] "Safe" use of iterator (package iterators)

2016-12-14 Thread Rich Calaway via R-help
Hi, Harold--

The short answer is "Yes"--in your example, each call to nextElem returns a 
list whose i component is the next element of itx1 and whose j component is 
the next element of itx2.
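
To make the pairing concrete, here is a minimal sketch that uses deterministic inputs so that each result can be checked against its source position. The names a, b, pairs and paired_ok are illustrative only, and the sketch uses the sequential %do% operator to stay self-contained; the assumption is that argument pairing works the same way under %dopar%, since the arguments are drawn from the iterators before tasks are handed out.

```r
library(foreach)
library(iterators)

# Deterministic inputs make the pairing easy to inspect.
a <- list(1, 2, 3)
b <- list(10, 20, 30)

# Each task receives one element from each iterator, taken in lockstep.
pairs <- foreach(i = iter(a), j = iter(b)) %do% {
  c(i = i, j = j)
}

# Check that every result holds elements from the same position in a and b.
paired_ok <- all(vapply(seq_along(pairs), function(k) {
  pairs[[k]][["i"]] == a[[k]] && pairs[[k]][["j"]] == b[[k]]
}, logical(1)))
print(paired_ok)
```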

I posted a more detailed explanation in response to a query from David on the 
Microsoft TechNet forum: 
https://social.technet.microsoft.com/Forums/en-US/724b1dde-03e3-4fff-b061-363bc8ba1652/how-are-multiple-iterobjects-handled-by-foreach?forum=ropen


Cheers,
Rich

Rich Calaway
Release Manager
Microsoft R Product Team
24/1341
+1 (425) 4219919 X19919


Message: 2
Date: Fri, 9 Dec 2016 17:15:38 +
From: "Doran, Harold" <hdo...@air.org>
To: "r-help@r-project.org" <r-help@r-project.org>
Subject: [R] "Safe" use of iterator (package iterators)
Message-ID:
<b08b6af0cf8ca44f81b9983eebdcd68601358f7...@dc1vex10mb01.air.org>
Content-Type: text/plain; charset="iso-8859-1"

I believe I now see the light vis-à-vis iterators when combined with foreach() 
calls in R. I have been able to reduce the computational workload from hours 
to minutes. I want to verify that the way I am using them is "safe". By safe I 
mean: does the iterator traverse elements in the same order as the loop in my 
toy example below?

In the first "traditional" example, I have only one index variable for the loop, 
so I know that the same element of r1 and r2 is always being grabbed. That 
is, iteration 1 is guaranteed to use r1[[1]] + r2[[1]].

In the example that uses the iterators, is this also guaranteed even though I 
now have two iterator objects? That is, will the index for element i always be 
the same as the index for element j when using this across many different cores?

It seems to be true and in all my test cases so far I am seeing it to be true. 
But, that could be just luck, so I wonder if there is a condition under which 
that would NOT be true.

Thank you
Harold


library(foreach)
library(doParallel)
cl <- makeCluster(2)
registerDoParallel(cl)

### Create random data
r1 <- vector("list", 20)
for(i in 1:20){
  r1[[i]] <- rnorm(10)
}

### Create random data
r2 <- vector("list", 20)
for(i in 1:20){
  r2[[i]] <- rnorm(10)
}

### Use a for loop traditionally
result1 <- vector("list", 20)
for(i in 1:20){
  result1[[i]] <- r1[[i]] + r2[[i]]
}

### Use iterators
itx1 <- iter(r1)
itx2 <- iter(r2)

result2 <- foreach(i = itx1, j = itx2) %dopar% {
  i + j
}

all.equal(result1, result2)




Message: 4
Date: Fri, 9 Dec 2016 10:26:55 -0800
From: David Winsemius <dwinsem...@comcast.net>
To: "Doran, Harold" <hdo...@air.org>
Cc: "r-help@r-project.org" <r-help@r-project.org>
Subject: Re: [R] "Safe" use of iterator (package iterators)
Message-ID: <f5e6b19c-1c53-427b-8bcf-0739ddc63...@comcast.net>
Content-Type: text/plain; charset=iso-8859-1



Re: [R] "Safe" use of iterator (package iterators)

2016-12-09 Thread Doran, Harold
That is a helpful and important caveat. So perhaps I should amend my 
original question to ask something like: is it safe *when* length(r1) == 
length(r2)?
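
One way to enforce that precondition is to check the lengths before the loop starts. The following is a minimal sketch, not a recommendation from the package authors: the data, the Map() cross-check and the names result, expected and matches are made up for illustration, and %do% is used so the sketch runs without a cluster.

```r
library(foreach)
library(iterators)

# Illustrative deterministic data (hypothetical, not from the thread).
r1 <- lapply(1:5, function(k) k * 10 + 0:2)
r2 <- lapply(1:5, function(k) k + 0:2)

# Fail fast if the inputs cannot be paired one-to-one.
stopifnot(length(r1) == length(r2))

result <- foreach(i = iter(r1), j = iter(r2)) %do% {
  i + j
}

# Cross-check against plain element-wise addition of the two lists.
expected <- Map(`+`, r1, r2)
matches <- isTRUE(all.equal(result, expected))
print(matches)
```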

-----Original Message-----
From: David Winsemius [mailto:dwinsem...@comcast.net] 
Sent: Friday, December 09, 2016 1:27 PM
To: Doran, Harold <hdo...@air.org>
Cc: r-help@r-project.org
Subject: Re: [R] "Safe" use of iterator (package iterators)


> On Dec 9, 2016, at 9:15 AM, Doran, Harold <hdo...@air.org> wrote:
> 
> I believe I now see the light vis-à-vis iterators when combined with 
> foreach() calls in R. I have now been able to reduce computational workload 
> to minutes instead of hours. I want to verify that the way I am using them is 
> "safe". By safe I mean does the iterator traverse elements in the same way as 
> I have below in my toy example to illustrate what I mean.
> 
> In the first "traditional" example, I have only one index variable for the 
> loop and so I know that the same list in r1 and r2 are always being grabbed. 
> That is, in iteration 1 it is guaranteed to use r1[[1]] + r2[[1]].
> 
> In the example that uses the iterators, is this also guaranteed even though I 
> now have two iterator objects? That is, will the index for element i always 
> be the same as the index for element j when using this across many different 
> cores?
> 
> It seems to be true and in all my test cases so far I am seeing it to be 
> true. But, that could be just luck, so I wonder if there is a condition under 
> which that would NOT be true.
> 
> Thank you
> Harold
> 
> 
> library(foreach)
> library(doParallel)
> cl <- makeCluster(2)
> registerDoParallel(cl)
> 
> ### Create random data
> r1 <- vector("list", 20)
> for(i in 1:20){
>   r1[[i]] <- rnorm(10)
> }
> 
> ### Create random data
> r2 <- vector("list", 20)
> for(i in 1:20){
>   r2[[i]] <- rnorm(10)
> }
> 
> ### Use a for loop traditionally
> result1 <- vector("list", 20)
> for(i in 1:20){
>   result1[[i]] <- r1[[i]] + r2[[i]]
> }
>   
> ### Use iterators
> itx1 <- iter(r1)
> itx2 <- iter(r2)
> 
> result2 <- foreach(i = itx1, j = itx2) %dopar% {
>   i + j
>   }   
> 
> all.equal(result1, result2)

I wasn't sure how this would or should behave. I'm not an experienced user, 
merely a reader of help pages. Neither the help page nor the vignettes 
referenced on the help page answered my questions in this case. I expected that 
the call would behave analogously to mapply when given iterators of 
unequal length. (The shorter of the objects is recycled to reach the length of 
the longer object.) That expectation was not realized. It appears that the 
length of the first object determines the computation length, but that 
missing values will not be recycled for the shorter iterator. An error is not 
reported; rather, numeric(0) is returned. So in one sense the %dopar% 
version is "safer", at least to the extent of not failing with an error that 
would have occurred when using a for-loop.

This was my test case:

r1 <- vector("list", 10)
for(i in 1:20){
  r1[[i]] <- 20:29 + i*10
  # random numbers are not good for determining sequences of operations
}
r2 <- vector("list", 20)
for(i in 1:10){
  r2[[i]] <- 1:10 + i
}
itx1 <- iter(r1)
itx2 <- iter(r2)

result2 <- foreach(i = itx1, j = itx2) %dopar% {
  i + j
}

result2



-- 

David Winsemius
Alameda, CA, USA

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] "Safe" use of iterator (package iterators)

2016-12-09 Thread David Winsemius

> On Dec 9, 2016, at 9:15 AM, Doran, Harold  wrote:
> 
> I believe I now see the light vis-à-vis iterators when combined with 
> foreach() calls in R. I have now been able to reduce computational workload 
> to minutes instead of hours. I want to verify that the way I am using them is 
> "safe". By safe I mean does the iterator traverse elements in the same way as 
> I have below in my toy example to illustrate what I mean.
> 
> In the first "traditional" example, I have only one index variable for the 
> loop and so I know that the same list in r1 and r2 are always being grabbed. 
> That is, in iteration 1 it is guaranteed to use r1[[1]] + r2[[1]].
> 
> In the example that uses the iterators, is this also guaranteed even though I 
> now have two iterator objects? That is, will the index for element i always 
> be the same as the index for element j when using this across many different 
> cores?
> 
> It seems to be true and in all my test cases so far I am seeing it to be 
> true. But, that could be just luck, so I wonder if there is a condition under 
> which that would NOT be true.
> 
> Thank you
> Harold
> 
> 
> library(foreach)
> library(doParallel)
> cl <- makeCluster(2) 
> registerDoParallel(cl)
> 
> ### Create random data
> r1 <- vector("list", 20)
> for(i in 1:20){
>   r1[[i]] <- rnorm(10)
> }
> 
> ### Create random data
> r2 <- vector("list", 20)
> for(i in 1:20){
>   r2[[i]] <- rnorm(10)
> }
> 
> ### Use a for loop traditionally 
> result1 <- vector("list", 20)
> for(i in 1:20){
>   result1[[i]] <- r1[[i]] + r2[[i]]
> }
>   
> ### Use iterators
> itx1 <- iter(r1)
> itx2 <- iter(r2)
> 
> result2 <- foreach(i = itx1, j = itx2) %dopar% {
>   i + j
>   }   
> 
> all.equal(result1, result2)

I wasn't sure how this would or should behave. I'm not an experienced user, 
merely a reader of help pages. Neither the help page nor the vignettes 
referenced on the help page answered my questions in this case. I expected that 
the call would behave analogously to mapply when given iterators of 
unequal length. (The shorter of the objects is recycled to reach the length of 
the longer object.) That expectation was not realized. It appears that the 
length of the first object determines the computation length, but that 
missing values will not be recycled for the shorter iterator. An error is not 
reported; rather, numeric(0) is returned. So in one sense the %dopar% 
version is "safer", at least to the extent of not failing with an error that 
would have occurred when using a for-loop.

This was my test case:

r1 <- vector("list", 10)
for(i in 1:20){
  r1[[i]] <- 20:29 + i*10
  # random numbers are not good for determining sequences of operations
}
r2 <- vector("list", 20)
for(i in 1:10){
  r2[[i]] <- 1:10 + i
}
itx1 <- iter(r1)
itx2 <- iter(r2)

result2 <- foreach(i = itx1, j = itx2) %dopar% {
  i + j
}

result2
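
A note on reading this test case, offered as a hedged sketch rather than a definitive account: both lists appear to end up with 20 elements. r1 grows to 20 inside the loop, and r2 is allocated with 20 slots of which the last ten stay NULL. The numeric(0) results would then come from adding NULL to a numeric vector, not from a shorter iterator. The sketch below uses inputs whose lengths really do differ; the names long, short, recycled and truncated are illustrative, and %do% is used so that no cluster is needed.

```r
library(foreach)
library(iterators)

long  <- as.list(1:6)
short <- as.list(c(100, 200, 300))

# mapply() recycles the shorter argument to the length of the longer one.
recycled <- mapply(function(i, j) i + j, long, short)

# foreach() stops as soon as any iterator is exhausted, so only
# length(short) tasks are evaluated here.
truncated <- foreach(i = iter(long), j = iter(short)) %do% {
  i + j
}

n_recycled  <- length(recycled)
n_truncated <- length(truncated)

# Arithmetic with NULL yields a zero-length vector instead of an error,
# which is one way numeric(0) results can arise from NULL-padded lists.
null_sum <- c(1.5, 2.5) + NULL

print(c(n_recycled, n_truncated, length(null_sum)))
```

If recycling is what is wanted, the iter() function accepts a recycle argument; whether that suits a given workload is a separate design decision.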



-- 

David Winsemius
Alameda, CA, USA



[R] "Safe" use of iterator (package iterators)

2016-12-09 Thread Doran, Harold
I believe I now see the light vis-à-vis iterators when combined with foreach() 
calls in R. I have been able to reduce the computational workload from hours 
to minutes. I want to verify that the way I am using them is "safe". By safe I 
mean: does the iterator traverse elements in the same order as the loop in my 
toy example below?

In the first "traditional" example, I have only one index variable for the loop, 
so I know that the same element of r1 and r2 is always being grabbed. That 
is, iteration 1 is guaranteed to use r1[[1]] + r2[[1]].

In the example that uses the iterators, is this also guaranteed even though I 
now have two iterator objects? That is, will the index for element i always be 
the same as the index for element j when using this across many different cores?

It seems to be true and in all my test cases so far I am seeing it to be true. 
But, that could be just luck, so I wonder if there is a condition under which 
that would NOT be true.

Thank you
Harold


library(foreach)
library(doParallel)
cl <- makeCluster(2)
registerDoParallel(cl)

### Create random data
r1 <- vector("list", 20)
for(i in 1:20){
  r1[[i]] <- rnorm(10)
}

### Create random data
r2 <- vector("list", 20)
for(i in 1:20){
  r2[[i]] <- rnorm(10)
}

### Use a for loop traditionally
result1 <- vector("list", 20)
for(i in 1:20){
  result1[[i]] <- r1[[i]] + r2[[i]]
}

### Use iterators
itx1 <- iter(r1)
itx2 <- iter(r2)

result2 <- foreach(i = itx1, j = itx2) %dopar% {
  i + j
}

all.equal(result1, result2)
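
Random data makes it hard to tell whether the right elements were paired, so a variant with deterministic inputs may be easier to check. The following is a sketch under stated assumptions, not a proof of the guarantee: the data and the names n, sums and pairing_held are made up for illustration. It also states .inorder = TRUE explicitly (the documented default of foreach) and shuts the cluster down at the end.

```r
library(foreach)
library(iterators)
library(doParallel)

cl <- makeCluster(2)
registerDoParallel(cl)

# Deterministic data: element k of r1 is 100 * k and element k of r2 is k,
# so a correctly paired sum must equal 101 * k.
n <- 20
r1 <- lapply(seq_len(n), function(k) 100 * k)
r2 <- lapply(seq_len(n), function(k) k)

# .inorder = TRUE is the default; it is spelled out here only to make
# the ordering requirement explicit.
sums <- foreach(i = iter(r1), j = iter(r2), .inorder = TRUE) %dopar% {
  i + j
}

# Release the worker processes once the loop is done.
stopCluster(cl)

pairing_held <- identical(unlist(sums), 101 * seq_len(n))
print(pairing_held)
```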
