Re: [R] How to fix this problem
Using readr to read the data might let you clean it on the way in:

readr::read_csv("filename.csv", col_types = readr::cols(.default = readr::col_number()))

On Mon, 25 Sep 2023, 16:54 Ebert,Timothy Aaron wrote:
> An update please:
> Collectively we have suggested removing commas from the "E..coli" column,
> checking for different forms of "NA", and looking outside the dataset for
> e-trash (spaces, text, or other content). For removing commas, I would use
> global replace to ensure that all commas were removed from all columns.
>
> Did this solve the problem?
>
> If not, can you share some early lines and end lines from the data, and how
> R is reading your data? Something simple like head(KD6), tail(KD6), and
> str(KD6).

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
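A minimal sketch of the readr idea, assuming the readr package is installed and that the commas are thousands separators inside quoted CSV fields. The temporary file and its three columns are hypothetical stand-ins for the real data.

```r
library(readr)

# Hypothetical three-column file standing in for the real CSV. The quoted
# "2,420" mimics a count exported with a thousands separator.
tmp <- tempfile(fileext = ".csv")
writeLines(c('Flow,E..coli,TN',
             '38.8,"2,420",1.653',
             '86.2,10,1.734',
             '0.0,186,0.512'), tmp)

# col_number() drops grouping marks while parsing, so every column
# arrives as a double.
KD6 <- read_csv(tmp, col_types = cols(.default = col_number()))
str(KD6)
cor(KD6)
```

If the commas were decimal marks instead, this would silently give wrong values, so it is worth checking a few parsed entries against the source file.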
Re: [R] How to fix this problem
An update please:
Collectively we have suggested removing commas from the "E..coli" column, checking for different forms of "NA", and looking outside the dataset for e-trash (spaces, text, or other content). For removing commas, I would use global replace to ensure that all commas were removed from all columns.

Did this solve the problem?

If not, can you share some early lines and end lines from the data, and how R is reading your data? Something simple like head(KD6), tail(KD6), and str(KD6).

-----Original Message-----
From: R-help On Behalf Of Michael Dewey
Sent: Monday, September 25, 2023 11:09 AM
To: avi.e.gr...@gmail.com; 'Parkhurst, David'; r-help@r-project.org
Subject: Re: [R] How to fix this problem

It looks here as though the E coli column has commas in it so will be treated as character.

Michael
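The three suggestions above (global comma removal, normalising different spellings of "NA", and inspecting the result) could be sketched in base R as follows. The data frame `KD6_raw`, the helper `clean_numeric`, and the list of NA spellings are assumptions made for illustration, not the poster's actual data.

```r
# Hypothetical stand-in for KD6 as R may have read it: every column character.
KD6_raw <- data.frame(
  Flow    = c("38.8", "133.0", "86.2", " n/a"),
  E..coli = c("2,420", "2,420", "10", "NA"),
  TN      = c("1.653", "1.394", "1.734", ""),
  stringsAsFactors = FALSE
)

clean_numeric <- function(x) {
  # Drop every comma, then strip stray leading/trailing spaces.
  x <- trimws(gsub(",", "", x, fixed = TRUE))
  # Treat a few common spellings of "missing" as NA (assumed list).
  x[x %in% c("", "NA", "N/A", "n/a", "na")] <- NA
  as.numeric(x)
}

# Apply the cleaning to all columns, as a global replace would.
KD6 <- as.data.frame(lapply(KD6_raw, clean_numeric))

str(KD6)
head(KD6)
tail(KD6)
```

Removing commas is only right when they are thousands separators; if they turn out to be decimal marks, they would need replacing with a period instead.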
Re: [R] How to fix this problem
It looks here as though the E coli column has commas in it so will be treated as character.

Michael

On 25/09/2023 15:45, avi.e.gr...@gmail.com wrote:
> David,
>
> This may just be the same as your earlier problem. When the type of a column
> is guessed by looking at the early entries, any non-numeric entry forces the
> entire column to be character.
>
> Suggestion: fix your original EXCEL FILE or edit your CSV to remove the last
> entries that look just like commas.

--
Michael
Re: [R] How to fix this problem
David,

This may just be the same as your earlier problem. When the type of a column is guessed by looking at the early entries, any non-numeric entry forces the entire column to be character.

Suggestion: fix your original EXCEL FILE or edit your CSV to remove the last entries that look just like commas.

-----Original Message-----
From: R-help On Behalf Of Parkhurst, David
Sent: Sunday, September 24, 2023 2:06 PM
To: r-help@r-project.org
Subject: [R] How to fix this problem

I have a matrix, KD6, and I'm trying to get a correlation matrix from it. When I enter cor(KD6), I get the message "Error in cor(KD6) : 'x' must be numeric".
Here are some early lines from KD6:
   Flow E..coli      TN    SRP     TP   TSS
1  38.8   2,420 1.65300 0.0270 0.0630 66.80
2 133.0   2,420 1.39400 0.0670 0.1360  6.80
3  86.2      10 1.73400 0.0700 0.1720 97.30
4   4.8   5,390 0.40400 0.0060 0.0280  8.50
5   0.3   2,490 0.45800 0.0050 0.0430 19.75
6   0.0     186 0.51200 0.0040 0.0470 12.00
7  11.1   9,835 1.25500 0.0660 0.1450 12.20

Why are these not numeric?
There are some NAs later in the matrix, but I get this same error if I ask for cor(KD6[1:39,]) to leave out the lines with NAs. Are they a problem anyway?
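The effect described above, where a single non-numeric entry turns a whole column into character, can be reproduced with a small sketch. The inline CSV text is a hypothetical stand-in for the real file, and `stringsAsFactors = FALSE` is set explicitly so the result does not depend on the R version.

```r
# Hypothetical CSV content: one quoted entry contains a comma.
csv_text <- 'Flow,E..coli
38.8,"2,420"
86.2,10
0.0,186'

KD6 <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Flow is read as numeric; E..coli is read as character because
# "2,420" cannot be converted to a number as it stands.
sapply(KD6, class)
```

Only one of the three E..coli entries carries a comma here, yet the entire column comes back as character.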
Re: [R] How to fix this problem
On Sun, 24 Sep 2023 18:05:43 +
"Parkhurst, David" wrote:

> I have a matrix, KD6, and I'm trying to get a correlation matrix from
> it. When I enter cor(KD6), I get the message "Error in cor(KD6) :
> 'x' must be numeric".
> Here are some early lines from KD6:
>    Flow E..coli      TN    SRP     TP   TSS
> 1  38.8   2,420 1.65300 0.0270 0.0630 66.80
> 2 133.0   2,420 1.39400 0.0670 0.1360  6.80

Use str(KD6) to find out the types of every column of the KD6 data frame. There may be some non-numeric strings later in the document preventing R from converting them to numeric automatically.

A brute force solution is as.numeric(), perhaps preceded by some string manipulation, but it's better to find out why the strings were considered non-numeric in the first place.

--
Best regards,
Ivan
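One way to find out why strings were considered non-numeric is to attempt the conversion and list the entries that fail. This is a sketch only; the vector `x` holds hypothetical values of the kind that might appear in the E..coli column.

```r
# Hypothetical column contents, including a genuine missing-value marker
# ("NA") and two entries that are not valid numbers as written.
x <- c("2,420", "10", "186", "NA", "n.d.")

# as.numeric() returns NA for anything it cannot parse.
converted <- suppressWarnings(as.numeric(x))

# Entries that became NA without being the literal string "NA" are the
# ones blocking automatic conversion.
bad <- which(is.na(converted) & !is.na(x) & x != "NA")
x[bad]
```

Looking at `x[bad]` shows what string manipulation, if any, is appropriate before calling as.numeric() on the real column.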
Re: [R] How to fix this problem
Dear David,

simply check str(KD6).
My guess (because we don't have your dataset, only a print of it) is that KD6 is not a matrix but a data.frame. The problem seems to come from the column "E..coli" which contains commas instead of periods (so text and not number).

There might be other issues of course.

HTH,
Ivan

On 24/09/2023 20:05, Parkhurst, David wrote:
> I have a matrix, KD6, and I'm trying to get a correlation matrix from it.
> When I enter cor(KD6), I get the message "Error in cor(KD6) : 'x' must be
> numeric".
> Here are some early lines from KD6:
>    Flow E..coli      TN    SRP     TP   TSS
> 1  38.8   2,420 1.65300 0.0270 0.0630 66.80
> 2 133.0   2,420 1.39400 0.0670 0.1360  6.80
> 3  86.2      10 1.73400 0.0700 0.1720 97.30
> 4   4.8   5,390 0.40400 0.0060 0.0280  8.50
> 5   0.3   2,490 0.45800 0.0050 0.0430 19.75
> 6   0.0     186 0.51200 0.0040 0.0470 12.00
> 7  11.1   9,835 1.25500 0.0660 0.1450 12.20
>
> Why are these not numeric?
> There are some NAs later in the matrix, but I get this same error if I ask
> for cor(KD6[1:39,]) to leave out the lines with NAs. Are they a problem
> anyway?
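The guess above can be checked directly. The sketch below builds a hypothetical three-row data frame with a character E..coli column, reproduces the error, and then converts the column. It assumes the commas are thousands separators; if they were decimal marks, they would have to be replaced with "." rather than removed.

```r
# Hypothetical stand-in for KD6 with one character column.
KD6 <- data.frame(Flow    = c(38.8, 133.0, 86.2),
                  E..coli = c("2,420", "2,420", "10"),
                  TN      = c(1.653, 1.394, 1.734),
                  stringsAsFactors = FALSE)

is.matrix(KD6)      # a data frame is not a matrix
is.data.frame(KD6)
str(KD6)            # shows E..coli as chr

# cor() fails while any column is non-numeric; capture the message.
msg <- tryCatch(cor(KD6), error = function(e) conditionMessage(e))
msg

# Remove the commas and convert, then cor() works.
KD6$E..coli <- as.numeric(gsub(",", "", KD6$E..coli, fixed = TRUE))
r <- cor(KD6)
r
```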
[R] How to fix this problem
I have a matrix, KD6, and I'm trying to get a correlation matrix from it. When I enter cor(KD6), I get the message "Error in cor(KD6) : 'x' must be numeric".
Here are some early lines from KD6:
   Flow E..coli      TN    SRP     TP   TSS
1  38.8   2,420 1.65300 0.0270 0.0630 66.80
2 133.0   2,420 1.39400 0.0670 0.1360  6.80
3  86.2      10 1.73400 0.0700 0.1720 97.30
4   4.8   5,390 0.40400 0.0060 0.0280  8.50
5   0.3   2,490 0.45800 0.0050 0.0430 19.75
6   0.0     186 0.51200 0.0040 0.0470 12.00
7  11.1   9,835 1.25500 0.0660 0.1450 12.20

Why are these not numeric?
There are some NAs later in the matrix, but I get this same error if I ask for cor(KD6[1:39,]) to leave out the lines with NAs. Are they a problem anyway?
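On the question of whether the NAs are a problem: they do not cause the "'x' must be numeric" error, but once the columns are numeric, cor() with its default settings returns NA for any pair of columns that involves a missing value. A minimal sketch with hypothetical values, using the `use` argument of cor():

```r
# Hypothetical numeric data with a missing value in two of the columns.
KD6 <- data.frame(Flow = c(38.8, 133.0, 86.2, 4.8, NA),
                  TN   = c(1.653, 1.394, 1.734, 0.404, 0.458),
                  TSS  = c(66.8, 6.8, 97.3, NA, 19.75))

# Default (use = "everything"): missing values propagate into the result.
r_default <- cor(KD6)

# Use only the rows that have no missing value in any column.
r_complete <- cor(KD6, use = "complete.obs")

# Use, for each pair of columns, all rows where both values are present.
r_pairwise <- cor(KD6, use = "pairwise.complete.obs")

r_default
r_complete
r_pairwise
```

Which option is appropriate depends on the analysis; "pairwise.complete.obs" uses more of the data, but each correlation may then be based on a different set of rows.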