> I've successfully imported my synteny data into R using the scan
> command. My results are shown below. My main problem is how to
> combine the column names with the data (splt): I tried cbind, but a
> warning message occurred. I have realized that the splt data only
> has 5 columns instead of 6. Please help me with this!
>
> I want my data to be numeric, with proper columns and column names,
> with CS replaced by 1 and CSO by 0, and with all the punctuation and
> characters removed from the data.
> 1) For the column names
>
> nms<-scan("C:/Users/user/Documents/cfa-1.txt",sep="\t",nlines=1,
> skip=10,what=character(0))
> Read 6 items
>
> nms
> [1] "CS(O) id (number of marker/anchor) "
> [2] " Location(s) on reference "
> [3] "CS(O) size"
> [4] "CS(O) density on reference chromosome"
> [5] "Location(s) on tested "
> [6] "Breakpoints CS(O) locations (denstiy of marker/anchor)"
>
> 2) My data
>
> x<-scan("C:/Users/user/Documents/cfa-1.txt",sep="\n",skip=12,
> what=character(0))
> Read 21 items
>
> splt<-strsplit(x,"\t")
>
> splt
> [[1]]
> [1] "CS 1 (73): " " cfa1: [ 3251712 - 24126920 ] "
> [3] " 20875208 " " 3 "
> [5] " hsa18: [ 132170848 - 50139168 ] " "] 24126920, 24153560 [(8 ) "
> [[2]]
> [1] "CS 2 (3): " " cfa1: [ 24153560 - 24265894 ] "
> [3] " 112334 " " 27 "
> [5] " hsa18: [ 50105060 - 49934572 ] " "] 24265894, 24823786 [(7 ) "
> [[3]]
> [1] "CSO 3.1 (6): "
> [2] " cfa1: [ 24823786 - 27113036 ] "
> [3] " 2289250 "
> [4] " 3 "
> [5] " hsa18: [ 48121156 - 46579500 ]- Decreasing order - ] 27113036,
> 27418228 [ (13)"
> ...

You are probably better off using read.table or read.delim to get your
data into R, since you most likely want it in the form of a data frame
rather than a list. Otherwise, try this.

# Convert to a matrix. This assumes every element of splt has exactly
# 6 fields; any element with only 5 must be padded or split first,
# otherwise the values are recycled and the rows no longer line up.
datamat <- matrix(unlist(splt), ncol=6, byrow=TRUE)

# This will remove punctuation, but it looks like you want to do
# something more with some of the columns; I'm just not sure what it is.
nopunct <- gsub("[[:punct:]]", "", datamat)

# Convert to a data frame, keeping the columns as character rather than
# factor so that as.numeric() converts the values and not the factor codes
df <- as.data.frame(nopunct, stringsAsFactors=FALSE)

# Make column 3 numeric (you will probably want to do something like
# this for each one)
df[,3] <- as.numeric(df[,3])

# Set column names
names(df) <- nms

Regards,
Richie.
Mathematical Sciences Unit
HSL

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
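The steps in the reply above can be sketched end to end on the three rows shown in the question. This is only an illustration under stated assumptions, not a definitive implementation: the short column names in `cols`, the helper `pad`, and the added columns `is_cs`, `ref_start` and `ref_end` are hypothetical and do not come from the original post. Rows with only 5 fields are padded with NA here; in the sample, the fifth field of such a row appears to hold the tested location and the breakpoints fused together, and separating them would need a rule that the post does not give.

```r
# Example rows copied from the output shown in the question (first three only).
splt <- list(
  c("CS 1 (73): ", " cfa1: [ 3251712 - 24126920 ] ", " 20875208 ", " 3 ",
    " hsa18: [ 132170848 - 50139168 ] ", "] 24126920, 24153560 [(8 ) "),
  c("CS 2 (3): ", " cfa1: [ 24153560 - 24265894 ] ", " 112334 ", " 27 ",
    " hsa18: [ 50105060 - 49934572 ] ", "] 24265894, 24823786 [(7 ) "),
  c("CSO 3.1 (6): ", " cfa1: [ 24823786 - 27113036 ] ", " 2289250 ", " 3 ",
    " hsa18: [ 48121156 - 46579500 ]- Decreasing order - ] 27113036, 27418228 [ (13)")
)

# Hypothetical short names standing in for the six headers held in nms.
cols <- c("id", "ref_loc", "size", "density", "tested_loc", "breakpoints")

# Pad every row to 6 fields so that matrix() does not recycle values.
pad <- function(v, n) c(v, rep(NA_character_, n - length(v)))
padded <- lapply(splt, pad, n = length(cols))
datamat <- matrix(unlist(padded), ncol = length(cols), byrow = TRUE)

# Keep the columns as character so later conversions act on the text itself.
df <- as.data.frame(datamat, stringsAsFactors = FALSE)
names(df) <- cols

# Recode the block type before any letters are stripped: CS -> 1, CSO -> 0.
df$is_cs <- ifelse(grepl("^[[:space:]]*CSO", df$id), 0L, 1L)

# Size and density: drop everything except digits and the decimal point.
df$size    <- as.numeric(gsub("[^0-9.]", "", df$size))
df$density <- as.numeric(gsub("[^0-9.]", "", df$density))

# Reference location "[ start - end ]": pull out the two coordinates.
loc_pattern <- ".*\\[ *([0-9]+) *- *([0-9]+) *\\].*"
df$ref_start <- as.numeric(sub(loc_pattern, "\\1", df$ref_loc))
df$ref_end   <- as.numeric(sub(loc_pattern, "\\2", df$ref_loc))

print(df[, c("is_cs", "size", "density", "ref_start", "ref_end")])
```

Recoding CS and CSO before removing letters matters because a blanket gsub on punctuation or characters would otherwise discard the information needed to tell the two apart.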