[R] read.table() Issue
Yesterday I changed the headers for a couple of columns in data text files and removed hyphens from within character strings, too. When I tried to re-read these data sources using read.table() I encountered an issue I've not before seen.

Both files were read almost instantly until yesterday's wording changes. Now both files seem to cause R to hang. Rather than having the prompt immediately returned nothing happens. In emacs the 'working' symbol appears but the read.table() function does not complete.

What might cause this?

Rich

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] read.table() Issue [UPDATE]
On Wed, 1 Aug 2012, Rich Shepard wrote:
> What might cause this?

I restored these two files from last Friday and they are read into R with no problems. So, I'll make one change at a time and see where things break. Will post results when I have them.

Rich
Re: [R] read.table() Issue
An unmatched quote can make read.table run very slowly when there are lots of lines in the file. E.g.,

> z <- rep("A B C", 10^6)
> z[2] <- "A \"B C" # unmatched quote on line 2
> tf <- tempfile()
> cat(file=tf, sep="\n", z)
> system.time(z2 <- read.table(tf, skip=2)) # skip bad line
   user  system elapsed
  0.860   0.028   0.887
> str(z2)
'data.frame':   999998 obs. of  3 variables:
 $ V1: Factor w/ 1 level "A": 1 1 1 1 1 1 1 1 1 1 ...
 $ V2: Factor w/ 1 level "B": 1 1 1 1 1 1 1 1 1 1 ...
 $ V3: Factor w/ 1 level "C": 1 1 1 1 1 1 1 1 1 1 ...
> system.time(z1 <- read.table(tf, skip=1))
[ no return for several minutes on a 64-bit Linux machine ]

On smaller files it quickly gives the error "line 1 did not have 4 elements", along with a warning "incomplete final line found by readTableHeader".

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Rich Shepard
> Sent: Wednesday, August 01, 2012 10:52 AM
> To: r-help@r-project.org
> Subject: [R] read.table() Issue
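One way to locate such a line before calling read.table() is to count the quote characters on each line and report the lines where the count is odd. The following is only a minimal sketch: the helper name find_unbalanced_quotes() is made up for illustration and is not part of R, and it assumes that quoted fields never legitimately span more than one line.

```r
# Hypothetical helper (not part of base R): return the numbers of the lines
# that contain an odd number of quote characters.
# Assumption: quoted fields are not meant to span several lines.
find_unbalanced_quotes <- function(path, quote = "\"") {
  lines <- readLines(path, warn = FALSE)
  # Count the literal occurrences of the quote character on each line
  n_quotes <- lengths(regmatches(lines, gregexpr(quote, lines, fixed = TRUE)))
  which(n_quotes %% 2 == 1)
}

# Small file with an unmatched double quote on line 2
tf <- tempfile()
writeLines(c("A B C", "A \"B C", "A B C"), tf)
bad <- find_unbalanced_quotes(tf)
print(bad)
```

Because read.table() treats both " and ' as quote characters by default, it may be worth running the check once for each of them, e.g. find_unbalanced_quotes(tf, quote = "'"), bearing in mind that apostrophes inside ordinary words will also be counted.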
Re: [R] read.table() Issue [RESOLVED]
On Wed, 1 Aug 2012, Rich Shepard wrote:
> What might cause this?

Must be computers acting like computers. Restored files from backup, made changes one at a time, and there are no problems reading them into R data frames.

My apologies for taking up space here.

Rich
Re: [R] read.table() Issue
On Wed, 1 Aug 2012, William Dunlap wrote:
> An unmatched quote can make read.table run very slowly when there are lots of lines in the file. E.g.,

Bill,

Yes. Turns out that there was no closing quote on a changed header. I found this by an error message on one data file; the other data file didn't generate an error for me to see.

Thanks very much,

Rich
Re: [R] read.table issue with #
use the 'comment.char' parameter of read.table

On Mar 1, 2012, at 17:51, Rui Barradas rui1...@sapo.pt wrote:
> R thinks that line 6 has only 2 elements because of the #.
Re: [R] read.table issue with #
The # is the default comment character in read.table(), but that can easily be changed:

> tc <- textConnection("
+ yes yes yes yes yes
+ yes yes yes yes yes
+ yes yes # yes yes
+ ")
> x <- read.table(tc, comment.char="")
> x
   V1  V2  V3  V4  V5
1 yes yes yes yes yes
2 yes yes yes yes yes
3 yes yes   # yes yes

There's insufficient context here to know if that was actually the original problem, but it is an alternative to what Rui proposed.

Sarah

On Thu, Mar 1, 2012 at 5:51 PM, Rui Barradas rui1...@sapo.pt wrote:
> Use 'readLines' instead, followed by 'strsplit'.

--
Sarah Goslee
http://www.functionaldiversity.org
Re: [R] read.table issue with #
Hello,

The problem is that I get the following error because anything after the # is ignored.

Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
  line 6 did not have 500 elements

R thinks that line 6 has only 2 elements because of the #. Use 'readLines' instead, followed by 'strsplit'. In the example below the separator is a space.

tc <- textConnection("
yes yes yes yes yes
yes yes yes yes yes
yes yes # yes yes
")
#x <- read.table(tc) # same error: line 3 did not have 5 elements
x <- readLines(tc)
close(tc)
strsplit(x, " ")

Hope this helps,

Rui Barradas

--
View this message in context: http://r.789695.n4.nabble.com/read-table-issue-with-tp4436554p4436737.html
Sent from the R help mailing list archive at Nabble.com.
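If a rectangular object is wanted after the readLines()/strsplit() step, the list of fields can be bound into a matrix and converted to a data frame. This is only a sketch of one way to continue from the approach above; it assumes every line has the same number of space-separated fields, and the object names (fields, m, df) are chosen here for illustration.

```r
# Same toy input as in the thread, without the leading blank line
tc <- textConnection("yes yes yes yes yes
yes yes yes yes yes
yes yes # yes yes")
x <- readLines(tc)
close(tc)

# Split each line on single spaces; the '#' is kept as an ordinary field
fields <- strsplit(x, " ", fixed = TRUE)

# Assumption: all lines have the same number of fields
stopifnot(length(unique(lengths(fields))) == 1)

# Bind the rows into a character matrix, then convert to a data frame
m <- do.call(rbind, fields)
df <- as.data.frame(m, stringsAsFactors = FALSE)
print(df)
```

All columns come back as character here; type conversion (for example with type.convert()) would be a separate step.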
[R] read.table issue
Dear R-Group,

I am getting the error message "incomplete final line found by readTableHeader" in the code below. It seems to me that the error message is caused by the quote in the text data. Is there any easy way to handle this? Or should I do a substitution?

> tempTxt <- "100589;Canara Robeco Expo-Income Plan;18.92;18.92;19.35;02-Apr-2007
+ "
> read.table(textConnection(tempTxt), sep=';')
      V1                             V2    V3    V4    V5          V6
1 100589 Canara Robeco Expo-Income Plan 18.92 18.92 19.35 02-Apr-2007

> tempTxt <- "103272;Canara Robeco Fortune '94;30.07;30.07;30.75;02-Apr-2007
+ "
> read.table(textConnection(tempTxt), sep=';')
Error in read.table(textConnection(tempTxt), sep = ";") :
  incomplete final line found by readTableHeader on 'tempTxt'

Thanks,
Santosh
Re: [R] read.table issue
The problem is that you have an unbalanced quote (') in your input. You need to specify quote = '' in read.table:

> tempTxt <- "103272;Canara Robeco Fortune '94;30.07;30.07;30.75;02-Apr-2007
+ "
> read.table(textConnection(tempTxt), sep=';', quote = '')
      V1                        V2    V3    V4    V5          V6
1 103272 Canara Robeco Fortune '94 30.07 30.07 30.75 02-Apr-2007

The unbalanced quote is the apostrophe in '94 in the string.

On Sat, Oct 9, 2010 at 10:05 PM, Santosh Srinivas santosh.srini...@gmail.com wrote:
> I am getting this error message incomplete final line found by readTableHeader in the code below.

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
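A variation on the same idea, offered only as a sketch: instead of switching quoting off completely with quote = '', the quote set can be limited to the double quote, so that apostrophes are read literally while double-quoted fields still work. The second record below is made-up data added purely to show a quoted field that contains the separator; it is not from the original post.

```r
# First record is from the thread; the second is invented for illustration
txt <- paste0(
  "103272;Canara Robeco Fortune '94;30.07;30.07;30.75;02-Apr-2007\n",
  "104000;\"Made-up Fund; Series A\";10.00;10.00;10.50;02-Apr-2007\n"
)

# Only the double quote is treated as a quoting character, so the
# apostrophe in '94 is ordinary text and no longer opens a quoted field
df <- read.table(text = txt, sep = ";", quote = "\"",
                 stringsAsFactors = FALSE)
print(df$V2)
```

Whether this or quote = '' is the better choice depends on the file: if no field is ever wrapped in double quotes, disabling quoting entirely is the simpler option.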