Re: [R] How to fix this problem

2023-09-25 Thread CALUM POLWART
Using readr to read the data might let you clean it on the way in...

# col_number() drops grouping marks such as the comma in "2,420"
readr::read_csv("filename.csv", col_types = readr::cols(.default = readr::col_number()))


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to fix this problem

2023-09-25 Thread Ebert,Timothy Aaron
An update please:
Collectively we have suggested removing commas from the "E..coli" column, 
checking for different forms of "NA", and looking outside the dataset for 
e-trash (spaces, text, or other content). For removing commas, I would use 
global replace to ensure that all commas were removed from all columns.


Did this solve the problem?

If not, can you share some early and late lines from the data, and show how R is
reading it? Something simple like head(KD6), tail(KD6), and str(KD6).
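
A minimal base R sketch of the global comma replacement suggested above. The data frame is a made-up stand-in for KD6 (the real data are not available), and the "n/a" entry is an assumed example of a nonstandard missing-value code.

```r
# Hypothetical stand-in for KD6 in which every column was read as character
KD6 <- data.frame(
  Flow    = c("38.8", "133.0", "86.2", "n/a"),
  E..coli = c("2,420", "2,420", "10", "5,390"),
  stringsAsFactors = FALSE
)

# Remove commas and surrounding spaces in every column, then convert.
# Entries that are still not numbers (here "n/a") become NA.
clean <- KD6
clean[] <- lapply(KD6, function(col) {
  stripped <- trimws(gsub(",", "", col, fixed = TRUE))
  suppressWarnings(as.numeric(stripped))
})

# The three checks requested above
str(clean)
head(clean)
tail(clean)
```

Assigning into clean[] keeps the result a data frame with the original column names. If commas were decimal marks rather than thousands separators, they would need to be replaced with "." instead of removed.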





Re: [R] How to fix this problem

2023-09-25 Thread Michael Dewey
It looks here as though the E coli column has commas in it so will be 
treated as character.


Michael



Re: [R] How to fix this problem

2023-09-25 Thread avi.e.gross
David,

This may just be the same as your earlier problem. When the type of a column is 
guessed by looking at the early entries, any non-numeric entry forces the 
entire column to be character.

Suggestion: fix your original Excel file or edit your CSV to remove the last
entries that look just like commas.
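
A small sketch of how one comma-formatted entry changes the type of a whole column. The CSV text is invented for illustration. Note that base read.csv() decides the type from the entire column, while readr guesses from an initial block of rows (its guess_max argument), so the details depend on which reader was used.

```r
# Hypothetical CSV content; "2,420" is quoted because it contains a comma
csv_text <- 'Flow,E..coli
38.8,"2,420"
86.2,10
0.0,186'

KD <- read.csv(text = csv_text)

# Flow is numeric; E..coli is not, because "2,420" cannot be parsed as a number
# (character in R >= 4.0, factor by default in older versions)
sapply(KD, class)
```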




Re: [R] How to fix this problem

2023-09-25 Thread Ivan Krylov
On Sun, 24 Sep 2023 18:05:43 +
"Parkhurst, David"  wrote:

> I have a matrix, KD6, and I'm trying to get a correlation matrix from
> it.  When I enter cor(KD6), I get the message "Error in cor(KD6) :
> 'x' must be numeric".
> Here are some early lines from KD6:
>    Flow E..coli      TN    SRP     TP   TSS
> 1  38.8   2,420 1.65300 0.0270 0.0630 66.80
> 2 133.0   2,420 1.39400 0.0670 0.1360  6.80

Use str(KD6) to find out the types of every column of the KD6 data
frame. There may be some non-numeric strings later in the document
preventing R from converting them to numeric automatically. A brute
force solution is as.numeric(), perhaps preceded by some string
manipulation, but it's better to find out why the strings were
considered non-numeric in the first place.
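
One possible way to find out which strings block the conversion, sketched in base R. The vector x is invented; "<1" is an assumed example of a below-detection-limit code, not something seen in the posted data.

```r
# Hypothetical contents of a column that R read as character
x <- c("2,420", "10", "186", "<1", " 35 ")

# Try the brute-force conversion and see which entries fail
as_num <- suppressWarnings(as.numeric(x))
bad <- x[is.na(as_num) & !is.na(x)]
bad   # the entries that as.numeric() could not parse

# String manipulation first: drop the commas, then convert again
as_num2 <- suppressWarnings(as.numeric(gsub(",", "", x, fixed = TRUE)))
still_bad <- x[is.na(as_num2) & !is.na(x)]
still_bad   # what remains needs a decision by the analyst
```

Looking at bad before cleaning shows why the column was treated as text; whatever is left in still_bad cannot be fixed by removing commas alone.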

-- 
Best regards,
Ivan



Re: [R] How to fix this problem

2023-09-25 Thread Ivan Calandra

Dear David,

Simply check str(KD6). My guess (because we don't have your dataset,
only a printout of it) is that KD6 is not a matrix but a data.frame. The
problem seems to come from the column "E..coli", which contains commas
instead of periods (so text and not numbers). There might be other issues,
of course.
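
A short sketch of the check described above, using an invented data frame in place of KD6. It shows that a data frame with one text column produces the same error message that was reported.

```r
# Hypothetical stand-in for KD6: a data frame, not a matrix
KD6 <- data.frame(
  Flow    = c(38.8, 133.0, 86.2, 4.8),
  E..coli = c("2,420", "2,420", "10", "5,390")
)

is.matrix(KD6)       # FALSE
is.data.frame(KD6)   # TRUE
str(KD6)             # shows the type of each column

# Names of the columns that are not numeric
non_numeric <- names(KD6)[!sapply(KD6, is.numeric)]
non_numeric

# cor() converts the data frame to a matrix; one text column makes the
# whole matrix character, hence the error
msg <- tryCatch(cor(KD6), error = function(e) conditionMessage(e))
msg
```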


HTH,
Ivan



[R] How to fix this problem

2023-09-25 Thread Parkhurst, David
I have a matrix, KD6, and I'm trying to get a correlation matrix from it.  When
I enter cor(KD6), I get the message "Error in cor(KD6) : 'x' must be numeric".
Here are some early lines from KD6:
   Flow E..coli      TN    SRP     TP   TSS
1  38.8   2,420 1.65300 0.0270 0.0630 66.80
2 133.0   2,420 1.39400 0.0670 0.1360  6.80
3  86.2      10 1.73400 0.0700 0.1720 97.30
4   4.8   5,390 0.40400 0.0060 0.0280  8.50
5   0.3   2,490 0.45800 0.0050 0.0430 19.75
6   0.0     186 0.51200 0.0040 0.0470 12.00
7  11.1   9,835 1.25500 0.0660 0.1450 12.20

Why are these not numeric?
There are some NAs later in the matrix, but I get this same error if I ask for 
cor(KD6[1:39,]) to leave out the lines with NAs.  Are they a problem anyway?
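
On the NA question, a small sketch with invented numbers: with an all-numeric matrix, missing values do not cause the "must be numeric" error. By default they propagate into the result as NA, and the use argument of cor() controls how they are handled.

```r
# Hypothetical numeric matrix with one missing Flow value
m <- cbind(
  Flow = c(38.8, 133.0, 86.2, 4.8, NA),
  TSS  = c(66.8, 6.8, 97.3, 8.5, 12.0)
)

# Default (use = "everything"): no error, but the Flow/TSS entry is NA
r_default <- cor(m)
r_default

# Drop rows that contain any NA before computing the correlations
r_complete <- cor(m, use = "complete.obs")
r_complete
```

use = "pairwise.complete.obs" is another option; it drops missing values separately for each pair of columns instead of dropping whole rows.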
