Abou,

 

I am not trying to be negative. Assuming you are a professor of Statistics, 
your request seems odd, because what you are asking about is routine in much 
statistical work: you build a model using just part of your data and reserve 
the rest to check whether the model was fitted too closely to the data it was 
trained on.

 

A simple online search before asking questions here is appreciated. I did a 
quick search for something like “R split data into three parts” and saw several 
applicable answers.

 

There are people on this forum who actually get paid to do nontrivial tasks. 
They do not mind helping in spots, but feel sort of used if they are expected 
to write a serious amount of code and then perhaps asked to redo it with more 
bells and whistles added. A recent badly phrased request comes to mind where 
several of us provided an answer only to find out it was for a different 
scenario, …

 

So let me continue with a serious answer. May we assume you KNOW how to read 
the data into something like a data.frame? If so, and if you see no need or 
value in doing this the hard way, then your question could have been whether 
there is a built-in R function, or perhaps a package, already set up to solve 
it quickly. Again, a simple online search can do wonders. Here, for example, is 
a package called caret, and this page discusses splitting data multiple ways:

 

https://topepo.github.io/caret/data-splitting.html

 

There are other such pages suggesting how to do it using base R.
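
As a rough illustration of the base R route (my own sketch, not copied from any 
of those pages): the data.frame name dat, the column names and the simulated 
values below are all made up to stand in for the real file. The idea is to 
build group labels 1, 2, 3 of the right length, shuffle them, and let split() 
do the rest.

```r
set.seed(42)  # only so the example is reproducible

# Hypothetical stand-in for the real data: 204 rows, values in column "Data"
dat <- data.frame(ID = 1:204, Data = rnorm(204))

# Labels 1,2,3,1,2,3,... of the right length, then put in random order
labels <- sample(rep(1:3, length.out = nrow(dat)))

# split() returns a list with one vector of values per label
groups <- split(dat$Data, labels)

lengths(groups)  # group sizes
```

Since 204 divides evenly by 3, each group ends up with 68 values. With a length 
that does not divide evenly, the group sizes would differ by at most one.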

 

Here is an example from one such page showing how to make three unequal 
partitions:

 

library(splitTools)  # assumption: partition() here is the one from the splitTools package
inds <- partition(iris$Sepal.Length, p = c(train = 0.6, valid = 0.2, test = 0.2))

 

 

There is more to do below but in the above, you would use whatever names you 
want instead of train/valid/test and set all three to 0.33 and so on.
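
If you would rather not install anything, the same idea of named proportions 
can be imitated in base R. This is only a sketch of mine, not how that package 
does it internally (for one thing, it does no stratification), and the rounding 
rule used to turn proportions into group sizes is my own assumption.

```r
set.seed(123)  # only so the example is reproducible

x <- iris$Sepal.Length
p <- c(train = 0.6, valid = 0.2, test = 0.2)  # use any names and proportions
n <- length(x)

# Turn proportions into group sizes; the last group absorbs rounding error
sizes <- round(n * p)
sizes[length(sizes)] <- n - sum(sizes[-length(sizes)])

# One label per observation, in random order
labels <- sample(rep(names(p), times = sizes))

# List of row indices per group, then pull out the values
inds <- split(seq_len(n), labels)
train <- x[inds$train]
valid <- x[inds$valid]
test  <- x[inds$test]
```

For three equal groups, set p to c(A1 = 1/3, A2 = 1/3, A3 = 1/3) or whatever 
names you prefer.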

 

I repeat that what you want to do strikes some of us as fairly routine. Lots of 
people have written up how they have done it, and you can pick and choose, or 
redo it on your own. If what you have is a homework assignment, the appropriate 
thing is for you to learn to use some technique yourself and perhaps get minor 
help when it fails. But if you will be doing this regularly, using a package 
is highly valuable.

 

Good Luck.

 

 

 

 

 

From: AbouEl-Makarim Aboueissa <abouelmakarim1...@gmail.com> 
Sent: Thursday, September 2, 2021 9:51 PM
To: Avi Gross <avigr...@verizon.net>
Cc: R mailing list <r-help@r-project.org>
Subject: Re: [R] Splitting a data column randomly into 3 groups

 

Sorry, please forget about it. I believe that I am very serious when I posted 
my question.

 

with thanks

abou


______________________

AbouEl-Makarim Aboueissa, PhD

 

Professor, Statistics and Data Science

Graduate Coordinator

Department of Mathematics and Statistics

University of Southern Maine

 

 

 

On Thu, Sep 2, 2021 at 9:42 PM Avi Gross via R-help <r-help@r-project.org> 
wrote:

What is stopping you Abou?

Some of us here start wondering if we have better things to do than homework 
for others. Help is supposed to be after they try and encounter issues that we 
may help with.

So think about your problem. You supplied data in a file that is NOT in CSV 
format but is in Tab separated format.

You need to get it in to your program and store it in something. It looks like 
you have 204 items so 1/3 of those would be exactly 68.

So if your data is in an object like a vector or data.frame, you want to choose 
random numbers between 1 and 204, without repeats. How do you do that? You need 
1/3 of the number of items in the object, in your case 68.

Now extract the items with those indices into, say, A1. Extract all the rest 
into a temporary item.

Make another 68 random indices, with no overlap, and copy those items into A2 
and the ones that do not have those into A3 and you are sort of done, other 
than some cleanup or whatever.

There are many ways to do the above and I am sure packages too.

But since you have made no visible effort, I personally am not going to pick 
anything in particular.

Had you shown some text and code along the lines of the above and just wanted 
to know how to copy just the ones that were not selected, we could easily ...


-----Original Message-----
From: R-help <r-help-boun...@r-project.org> On Behalf Of AbouEl-Makarim Aboueissa
Sent: Thursday, September 2, 2021 9:30 PM
To: R mailing list <r-help@r-project.org>
Subject: [R] Splitting a data column randomly into 3 groups

Dear All:

How can I split a column of data *randomly* into three groups? Please see the 
attached data. I need to split column #2, titled "Data".

with many thanks
abou
______________________


*AbouEl-Makarim Aboueissa, PhD*

*Professor, Statistics and Data Science* *Graduate Coordinator*

*Department of Mathematics and Statistics* *University of Southern Maine*

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


