[R] help usin scan on large matrix (caveats to what has been discussed before)
Dear all,

I have a few points that I am unsure about when using scan. I know that it is covered in the intro to R, and that it has also been discussed here:
http://www.mail-archive.com/r-help@r-project.org/msg04869.html
Nevertheless, I cannot get it to work.

I have a potentially very large matrix that I need to read in (35 MB). I am about to run it on a server with 16 GB of memory, so I hope it will work. Ultimately I only need to run image() on it, producing a heatmap. read.table crashes on it, and is slow, so I would like to read it using scan.

The file where I store it has the following format:

  V1 V2 V3 V4 V5
  1 508 424 208 111 66
  2 59 101 95 113 81
  3 26 30 24 17 18
  4 4 0 8 3 9
  5 0 0 0 0 0
  6 0 0 0 0 0

where the first line holds the column names and the first column holds the row names. read.table works perfectly on this without any parameters (the file was written using write.table).

I use:

  rows <- length(R)
  cols <- max(unlist(lapply(R, function(x) length(unlist(gregexpr(" ", x, fixed=TRUE, useBytes=TRUE))))))
  c <- scan(file=f, what=list(c("", (rep(integer(0), cols)))), skip=1)
  m <- matrix(c, nrow=rows, ncol=cols, byrow=TRUE)

For some reason I end up with a character matrix, which I don't want. Is this the proper way to skip the first column (this is not documented anywhere - how does one skip the first column in scan???). Is my way of specifying integer(0) correct?

And finally - would any sparse matrix package be more appropriate, and can I use a sparse matrix with the image() function to produce typical heatmaps? I have seen that some sparse matrix packages produce different looking outputs, which would not be appropriate.

Thanks
Martin

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] help usin scan on large matrix (caveats to what has been discussed before)
On Aug 12, 2010, at 11:30 AM, Martin Tomko wrote:

> c <- scan(file=f, what=list(c("", (rep(integer(0), cols)))), skip=1)
> m <- matrix(c, nrow=rows, ncol=cols, byrow=TRUE)
>
> For some reason I end up with a character matrix, which I don't want. Is this the proper way to skip the first column (this is not documented anywhere - how does one skip the first column in scan???). Is my way of specifying integer(0) correct?

No. Well, integer(0) is just superfluous where 0L would do, since scan only looks at the types, not the contents. More importantly, what= wants a list of as many elements as there are columns, and you gave it

  > list(c("", (rep(integer(0), 5))))
  [[1]]
  [1] ""

I think what you actually meant was

  c(list(NULL), rep(list(0L), 5))

-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com
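To make the what= list from the reply above concrete, here is a minimal sketch. The temporary file, its contents (copied from the sample in the first post) and the names f, ncols and res are assumptions made for this illustration; they are not code from the thread.

```r
# Write a small file in the layout described in the first post:
# a header line, then a row name followed by integers on each line.
f <- tempfile(fileext = ".txt")   # hypothetical temporary file
writeLines(c("V1 V2 V3 V4 V5",
             "1 508 424 208 111 66",
             "2 59 101 95 113 81",
             "3 26 30 24 17 18"), f)

ncols <- 5L   # number of data columns, not counting the row-name column

# One list element per field on a line: NULL skips that field,
# 0L asks for an integer column.
what <- c(list(NULL), rep(list(0L), ncols))

# skip = 1 jumps over the header line.
res <- scan(file = f, what = what, skip = 1, quiet = TRUE)

length(res)          # one element per field, including the skipped one
sapply(res, typeof)  # the first element is NULL, the others are integer
res[[2]]             # values of the first data column
```

The returned list has one element more than the number of data columns, because the skipped row-name field is still represented by a NULL element.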
Re: [R] help usin scan on large matrix (caveats to what has been discussed before)
Hi Peter,

thank you for your reply. I still cannot get it to work. I have modified your code as follows:

  rows <- length(R)
  cols <- max(unlist(lapply(R, function(x) length(unlist(gregexpr(" ", x, fixed=TRUE, useBytes=TRUE))))))
  c <- scan(file=f, what=rep(c(list(NULL), rep(list(0L), cols-1), rows-1)), skip=1)
  m <- matrix(c, nrow=rows-1, ncol=cols+1, byrow=TRUE)

The list c seems OK, with all the values I would expect. Still, length(c) gives me a value of cols+1, which I find odd (I would expect cols). I then repeated it rows-1 times (to account for the header row). The values seem OK.

Anyway, I tried to construct the matrix, but when I print it, the values are odd:

  > m[1:10,1:10]
        [,1] [,2]       [,3]       [,4]       [,5]       [,6]       [,7]
   [1,] NULL Integer,15 Integer,15 Integer,15 Integer,15 Integer,15 Integer,15
   [2,] NULL Integer,15 Integer,15 Integer,15 Integer,15 Integer,15 Integer,15
   [3,] NULL Integer,15 Integer,15 Integer,15 Integer,15 Integer,15 Integer,15
   [4,] NULL Integer,15 Integer,15 Integer,15 Integer,15 Integer,15 Integer,15
   [5,] NULL Integer,15 Integer,15 Integer,15 Integer,15 Integer,15 Integer,15
   [6,] NULL Integer,15 Integer,15 Integer,15 Integer,15 Integer,15 Integer,15
   [7,] NULL Integer,15 Integer,15 Integer,15 Integer,15 Integer,15 Integer,15
   [8,] NULL Integer,15 Integer,15 Integer,15 Integer,15 Integer,15 Integer,15
   [9,] NULL Integer,15 Integer,15 Integer,15 Integer,15 Integer,15 Integer,15
  [10,] NULL Integer,15 Integer,15 Integer,15 Integer,15 Integer,15 Integer,15

Any idea where the values have gone?

Thanks
Martin

Hence, I filled it into the matrix of dimensions

On 8/12/2010 12:24 PM, peter dalgaard wrote:
> [...] what= wants a list of as many elements as there are columns [...]
> I think what you actually meant was
>
>   c(list(NULL), rep(list(0L), 5))

-- 
Martin Tomko
Postdoctoral Research Assistant
Geographic Information Systems Division
Department of Geography
University of Zurich - Irchel
Winterthurerstr. 190
CH-8057 Zurich, Switzerland
email: martin.to...@geo.uzh.ch
site: http://www.geo.uzh.ch/~mtomko
mob: +41-788 629 558
tel: +41-44-6355256
fax: +41-44-6356848
Re: [R] help usin scan on large matrix (caveats to what has been discussed before)
Hi,

I don't know if this can be useful to you, but I recently wrote a small function to read a large data file like yours in a number of steps, with the possibility of saving each intermediate block as .Rdata. It is based on read.table (not as efficient as lower-level scan(), but it might be good enough):

  file <- 'test.txt'
  ## write.table(matrix(rnorm(1e6*14), ncol=14), file=file, row.names = FALSE,
  ##             col.names = FALSE)

  ## number of lines in the file, taken from the output of "wc -l"
  n <- as.numeric(gsub("[^0123456789]", "", system(paste("wc -l ", file), intern=TRUE)))
  n

  ## split n rows into blocks of at most 'size' rows
  blocks <- function(n=18, size=5){
    res <- c(replicate(n%/%size, size))
    if(n%%size) res <- c(res, n%%size)
    if(!sum(res) == n) stop("ERROR!!!")
    res
  }
  ## blocks(1003, 500)

  readBlocks <- function(file, nbk=1e5, out="tmp", save.inter=TRUE,
                         classes = c("numeric", "numeric", rep("NULL", 6),
                                     "numeric", "numeric", rep("NULL", 4))){
    n <- as.numeric(gsub("[^0123456789]", "", system(paste("wc -l ", file), intern=TRUE)))
    ## columns whose class is "NULL" are skipped by read.table
    ncols <- length(grep("NULL", classes, invert=TRUE))
    results <- matrix(0, nrow=n, ncol=ncols)
    Nb <- blocks(n, nbk)
    skip <- c(0, cumsum(Nb))
    for(ii in seq_along(Nb)){
      d <- read.table(file, colClasses = classes, nrows=Nb[ii],
                      skip=skip[ii], comment.char = "")
      if(save.inter){
        save(d, file=paste(out, ".", ii, ".rda", sep=""))
      }
      print(ii)
      results[seq(1+skip[ii], skip[ii]+Nb[ii]), ] <- as.matrix(d)
      rm(d) ; gc()
    }
    save(results, file=paste(out, ".rda", sep=""))
    invisible(results)
  }
  ## test <- readBlocks(file)

HTH,

baptiste

On Aug 12, 2010, at 1:34 PM, Martin Tomko wrote:
> Hi Peter, thank you for your reply. I still cannot get it to work. [...]
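The block-wise idea in the message above can also be sketched with scan() on an open connection. This is not the code that was posted: it is a minimal illustration under the assumption that the file has the header and row-name layout described in the first post, and the file, the chunk size of 3 lines and the names con, template and chunks are all made up for the example.

```r
# Hypothetical test file: 10 rows, 4 integer columns, header and row names.
f <- tempfile(fileext = ".txt")
write.table(matrix(1:40, ncol = 4), file = f, quote = FALSE)

con <- file(f, open = "r")
invisible(readLines(con, n = 1))            # consume the header line

# One template for a single line: skip the row name, read 4 integers.
template <- c(list(NULL), rep(list(0L), 4))

chunks <- list()
repeat {
  # An open connection keeps its position, so each call continues
  # where the previous one stopped.
  part <- scan(con, what = template, nlines = 3, quiet = TRUE)
  if (length(part[[2]]) == 0) break         # nothing left to read
  chunks[[length(chunks) + 1]] <- do.call(cbind, part)
}
close(con)

m <- do.call(rbind, chunks)                 # stack the blocks row-wise
dim(m)
```

With this approach the number of rows does not need to be known in advance, at the cost of growing a list of blocks and binding them at the end.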
Re: [R] help usin scan on large matrix (caveats to what has been discussed before)
Hi baptiste,

thanks a lot. Could you please comment on that code? I cannot figure out what it does. Apart from the file name, what parameters does it need? It seems to me that you need to know the size of the table a priori. Is that right? Do you have to set the block size depending on that (so that you get full multiples of the block to form the resulting frame)?

Cheers
Martin

On 8/12/2010 2:45 PM, baptiste Auguié wrote:
> Hi, I don't know if this can be useful to you, but I recently wrote a small function to read a large data file like yours in a number of steps, with the possibility of saving each intermediate block as .Rdata. [...]
Re: [R] help usin scan on large matrix (caveats to what has been discussed before)
On Aug 12, 2010, at 1:34 PM, Martin Tomko wrote:

> Hi Peter, thank you for your reply. I still cannot get it to work. I have modified your code as follows:
>
> rows <- length(R)
> cols <- max(unlist(lapply(R, function(x) length(unlist(gregexpr(" ", x, fixed=TRUE, useBytes=TRUE))))))

Notice that the above is completely useless to the reader unless you tell us what R is (except for a statistical programming language ;-))

> c <- scan(file=f, what=rep(c(list(NULL), rep(list(0L), cols-1), rows-1)), skip=1)

What's the outer rep() and rows-1 doing in there???! Notice that the parentheses don't match up as I think you think they do, so there's really only one argument to rep(), making it a no-op. The rows-1 is going inside the c(), which might be causing the apparent extra column. And the number of rows should not affect 'what=' anyway. Now if you had done what I wrote...

> m <- matrix(c, nrow=rows-1, ncol=cols+1, byrow=TRUE)

If you make a matrix from a list, odd things will happen. You need an unlist(c). And more than likely NOT byrow=TRUE. However, I think do.call(cbind, c) should do the trick more easily.

> The list c seems OK, with all the values I would expect. Still, length(c) gives me a value of cols+1, which I find odd (I would expect cols). [...]

-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
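The difference between calling matrix() on a list and the two conversions suggested above can be seen on a small stand-in for the list that scan() returns. This is a sketch only; the list res is hand-built with made-up dimensions rather than read from the poster's file.

```r
# Stand-in for the result of scan(): a skipped first field (NULL)
# followed by one integer vector per column.
res <- list(NULL, c(508L, 59L, 26L), c(424L, 101L, 30L), c(208L, 95L, 24L))

# matrix() on a list stays a list: each cell holds a whole vector,
# which is why cells print as a type and a length instead of a number.
bad <- matrix(res, nrow = 1)
typeof(bad)

# cbind() ignores the NULL and binds the remaining vectors as columns.
m <- do.call(cbind, res)
dim(m)
typeof(m)

# The unlist() route: the values come out column after column,
# so the default byrow = FALSE is the right fill order.
m2 <- matrix(unlist(res), ncol = 3)
identical(unname(m), m2)
```

Because scan() returns one vector per column, filling with byrow = TRUE would scramble rows and columns, which matches the remark that byrow=TRUE is more than likely wrong here.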
Re: [R] help usin scan on large matrix (caveats to what has been discussed before)
Hi Peter,

apologies, too fast copying and pasting. So, here is the explanation:

  f <- "C:/test/mytab.txt"
  R <- readLines(con=f)

where mytab.txt is a table formatted as noted in my previous post (space delimited, with header and row names, containing integers).

Now, my understanding of scan was that I have to specify the FULL number of values in it (examples specify things like 200*2000 for a matrix etc.). That's why I thought that I need to do cols*rows as well. Avoiding the first line with headers is simple, avoiding the first column is not - hence my questions.

Sorry, the corrected version with matching parentheses is here - why the previous one executed is a wonder...

  c <- scan(file=f, what=rep(c(list(NULL), rep(list(0L), cols-1)), rows-1), skip=1)

Here, my reasoning was:

* c(list(NULL), rep(list(0L), cols-1)) specifies a template for any line (the first element, to be ignored, is NULL, since it is a string in the table specified, followed by a repetition of integers - I am still not sure how you derived 0L, what it means, and where to find the documentation for it);
* the previous needs to be repeated rows-1 times, hence what=rep(c(list(NULL), rep(list(0L), cols-1)), rows-1)

I do not understand the following:

> You need an unlist(c). And more than likely NOT byrow=TRUE. However, I think do.call(cbind, c) should do the trick more easily.

What will unlist(c) do, why should it not be byrow=TRUE, and how would you go about integrating do.call(cbind, c) with matrix?

Apologies for the naive questions, I am a newbie, in principle.

Cheers
Martin

On 8/12/2010 4:29 PM, peter dalgaard wrote:
> [...] And the number of rows should not affect 'what=' anyway. [...]
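Two of the questions above (what 0L means, and whether the what= template has to be repeated once per row) can be checked directly. The following is a small sketch with made-up inline data passed through the text= argument of scan(); the names template, three_rows and five_rows are assumptions for the illustration.

```r
# 0L is an integer constant; a plain 0 is a double. Only the type of
# each 'what' element matters, so 0L and integer(0) are equivalent here.
typeof(0L)
typeof(0)
typeof(integer(0))

# The template describes ONE line (skip the row name, read two integers)
# and is reused for every line, whatever the number of rows.
template <- c(list(NULL), rep(list(0L), 2))

three_rows <- scan(text = "a 1 2\nb 3 4\nc 5 6",
                   what = template, quiet = TRUE)
five_rows  <- scan(text = "a 1 2\nb 3 4\nc 5 6\nd 7 8\ne 9 10",
                   what = template, quiet = TRUE)

lengths(three_rows)   # length of each returned element
lengths(five_rows)
```

The same three-element template reads both inputs; only the length of the returned vectors changes with the number of rows.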
Re: [R] help usin scan on large matrix (caveats to what has been discussed before)
Martin Tomko wrote:

> I do not understand the following:
>
>> You need an unlist(c). And more than likely NOT byrow=TRUE. However, I think do.call(cbind, c) should do the trick more easily.
>
> What will unlist(c) do, why should it not be byrow=TRUE, and how would you go about integrating do.call(cbind, c) with matrix?
>
> Apologies for the naive questions, I am a newbie, in principle.

At this point I think you need to actually try my suggestions, and maybe read the documentation again. Explaining how you have misunderstood the documentation is not going to help...

-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Re: [R] help usin scan on large matrix (caveats to what has been discussed before)
I did. It did not work. Did you try your code? The matrix did not come out as integer numbers, as expected. MY approach resulted in a correct scan result, at least.

M.