Re: [R] Problem to generate training data set and test data set

2006-12-26 Thread Charles C. Berry
What you describe is called stratified sampling. It was discussed last month (and at other times) on this list: http://finzi.psych.upenn.edu/R/Rhelp02a/archive/90220.html Using RSiteSearch("stratified sampling") will produce many hits to relevant articles and packages. On Mon,
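For readers who want to see the idea before following the links, here is a minimal base-R sketch of a stratified training/test split. It is not taken from the linked thread. The data frame `full`, the helper `stratified_split`, the 70/30 ratio, and the second class label "inward" are all assumptions made up for illustration; the original post only shows rows labelled "outward".

```r
# Hypothetical stand-in for the poster's data; only 'outward' appears in the
# original post, so the 'inward' class and all values here are invented.
set.seed(1)
full <- data.frame(
  aa    = sample(c("ALA", "PRO", "PHE"), 60, replace = TRUE),
  omega = rnorm(60),
  y     = rep(c("outward", "inward"), times = c(40, 20))
)

# Stratified split: within each level of the stratum column, draw a fixed
# fraction of the rows for training and leave the rest for testing.
stratified_split <- function(data, strata, frac = 0.7) {
  idx_by_stratum <- split(seq_len(nrow(data)), data[[strata]])
  train_idx <- unlist(lapply(idx_by_stratum, function(idx) {
    # Index into idx via sample.int() so a one-row stratum is handled safely
    idx[sample.int(length(idx), size = round(frac * length(idx)))]
  }), use.names = FALSE)
  test_idx <- setdiff(seq_len(nrow(data)), train_idx)
  list(train = data[train_idx, , drop = FALSE],
       test  = data[test_idx, , drop = FALSE])
}

parts <- stratified_split(full, "y", frac = 0.7)
table(parts$train$y)  # class counts in the training set
table(parts$test$y)   # class counts in the test set
```

Because the draw is done separately inside each level of `y`, the training and test sets keep roughly the same class proportions as the full data, which a plain `sample()` over all rows does not guarantee.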

[R] Problem to generate training data set and test data set

2006-12-25 Thread Aimin Yan
I have a full data set like this:

   aa bas    aas bms   ams bcu        acu    omega       y
1 ALA   0 127.71   0 69.99   0 -0.2498560 79.91470 outward
2 PRO   0  68.55   0 55.44   0 -0.0949008 76.60380 outward
3 ALA   0  52.72   0 47.82   0 -0.0396550 52.19970 outward
4 PHE   0  22.62   0

Re: [R] Problem to generate training data set and test data set

2006-12-25 Thread Jim Lemon
Aimin Yan wrote:

I have a full data set like this:

   aa bas    aas bms   ams bcu        acu    omega       y
1 ALA   0 127.71   0 69.99   0 -0.2498560 79.91470 outward
2 PRO   0  68.55   0 55.44   0 -0.0949008 76.60380 outward
3 ALA   0  52.72   0 47.82   0 -0.0396550 52.19970 outward