Ok, sorry, I should have included some comments.

The function is divided into three parts: 1. intro, 2. decision, 3. keep rows.
Part 3 is the function keep(), internal to to.keep(). Let's start with 1.

1. Set up some variables first.
1.a) The variables 'a1' to 'a4'.
If the input object 'x' is a matrix, this doesn't give much of a speed-up, but if 'x' is a data.frame, column extraction is time consuming.
So, do this once only, at the beginning.
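
As a rough illustration of 1.a), and assuming the four test columns are simply the first four columns of 'x' (the data below is made up, and the real to.keep() may select its columns differently), the one-time extraction could look like this:

```r
# Hypothetical input: a data.frame with 4 numeric columns (an assumption).
set.seed(1)
x <- data.frame(matrix(rnorm(40), ncol = 4))

# Extract each column once, before the loop, so that the loop only
# works on plain vectors. x[, j] works for a matrix and a data.frame.
a1 <- x[, 1]
a2 <- x[, 2]
a3 <- x[, 3]
a4 <- x[, 4]
```

Inside the loop, a1[i] is then plain vector indexing, instead of a data.frame extraction repeated on every iteration.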
1.b) The new environment.
This is because my first version would need to change values declared outside the internal function. This can be done with the superassignment operator, <<-, but this practice should be avoided; it's easy to mess things up. Note that all the variables changed inside the internal function are in this new environment, 'e'.
In particular note that 'result' is initialized with 1000 rows.
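
A minimal sketch of such an environment follows. The names ires, curr.rows, increment and result come from the description of keep(); the initial values, the input 'x' and the helper bump() are assumptions made only for this illustration.

```r
x <- matrix(rnorm(40), ncol = 4)    # hypothetical input

e <- new.env()
e$ires      <- 0L                   # last filled row of 'result'
e$increment <- 1000L                # rows added each time 'result' grows
e$curr.rows <- e$increment          # rows currently allocated
e$result    <- matrix(NA_real_, nrow = e$curr.rows, ncol = ncol(x))

# Environments are passed by reference, so a function can update 'e'
# in place, with no need for <<-. bump() is a hypothetical helper.
bump <- function(env) env$ires <- env$ires + 1L
bump(e)
e$ires
```

After the call to bump(), the change to ires is visible to the caller, which is the behaviour the internal function relies on.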
2. The loop.
This is where we decide if we want to keep that row. I have negated the original 'no' condition.
The 'no' condition:
    a1[i] < a1 & a2[i] < a2 & a3[i] > a3 & a4[i] < a4
Then the test would be:
    if(any(no)) dont_keep else keep.  # pseudo-code
Not in pseudo-code:
    if( all( !no ) ) keep(i, e)
The downside is that the original is more readable.
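
To check that the two forms of the test really agree, here is a small sketch on made-up vectors (the data is an assumption; only the 'no' condition is taken from above):

```r
# Hypothetical columns, standing in for the ones extracted in step 1.a)
set.seed(2)
a1 <- rnorm(10); a2 <- rnorm(10); a3 <- rnorm(10); a4 <- rnorm(10)

same <- logical(length(a1))
for (i in seq_along(a1)) {
  no <- a1[i] < a1 & a2[i] < a2 & a3[i] > a3 & a4[i] < a4
  # "no element is TRUE" is the same as "every element is FALSE"
  same[i] <- identical(!any(no), all(!no))
}
all(same)
```

So 'if(any(no)) dont_keep else keep' and 'if(all(!no)) keep' select the same rows.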

3. The internal function, keep().
Considering the small number of rows I have used for tests, e$result was initialized with 1e3 rows.
With 5e5 lines I would increase this number to 1e5.
First, the function updates the [row number] pointer into 'result' and checks if we are at a 'result' limit.
If yes, make it bigger by e$increment [ == 1e3 ] rows.
Then just assign row i from matrix/df 'x' to the appropriate row of e$result.
The reason why we need the environment is that, on function return, everything but the returned value is lost. We could save ires, curr.rows and result in a list and return that list, but this would complicate and slow things down: assign, update and reassign. Messy. Environments can help keep it "simple", in the sense of keeping together what is meant to be used together.
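
The steps described for keep() can be sketched roughly as below. This is not the original code: here 'x' is passed as an extra argument so that the example is self-contained (the real keep(i, e) is internal to to.keep() and can see 'x' directly), 'x' is assumed to be a numeric matrix, and the tiny increment is chosen only to force 'result' to grow.

```r
keep <- function(i, env, x) {
  env$ires <- env$ires + 1L               # advance the row pointer
  if (env$ires > env$curr.rows) {
    # 'result' is full: append 'increment' empty rows
    extra <- matrix(NA_real_, nrow = env$increment, ncol = ncol(env$result))
    env$result    <- rbind(env$result, extra)
    env$curr.rows <- env$curr.rows + env$increment
  }
  env$result[env$ires, ] <- x[i, ]        # store row i of 'x'
  invisible(NULL)
}

# Usage on made-up data
x <- matrix(as.numeric(1:20), ncol = 4)
e <- new.env()
e$ires <- 0L; e$increment <- 2L; e$curr.rows <- 2L
e$result <- matrix(NA_real_, nrow = e$curr.rows, ncol = ncol(x))

for (i in seq_len(nrow(x))) keep(i, e, x)

# Drop the unused rows left over from the last growth step
kept <- e$result[seq_len(e$ires), , drop = FALSE]
```

Growing in blocks of e$increment rows means rbind() is called only once per block, not once per kept row.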

And now I hope there is not an overdose of comments :)

Rui Barradas

On 21-07-2012 18:37, wwreith wrote:
Any chance I could ask for an idiots guide for function to.keep(x). I
understand how to use it but not what some of the lines are doing. Comments
would be extremely helpful.



--
View this message in context: 
http://r.789695.n4.nabble.com/Speeding-up-a-loop-tp4637201p4637316.html
Sent from the R help mailing list archive at Nabble.com.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
