Included below is a simple awk script, genePredToBed, that converts genePred to BED format.
--Hiram

Rathi Thiagarajan wrote:
> Hi Hiram,
>
> Thank you very much for gene track. It's exactly what I wanted!
>
> I am currently trying to get this table into a BED format however
> noticed that the last two columns actually contain the exonStarts and
> exonEnds rather than blockSizes and blockStarts (which would be
> exonStarts). Is the blockSizes information readily available somewhere
> where I can get access to it?
>
> Thanks again for all your help.
>
> Cheers,
> Rathi
>
> On Sun, 04 Apr 2010 06:03:03 +1000, Hiram Clawson <[email protected]>
> wrote:
>
>> Good Afternoon Rathi:
>>
>> You can get a 'single coverage' gene track out of the mm9 refGene
>> table with the following mysql and kent source tree commands:
>>
>> $ mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A mm9 -Ne \
>>     "select name,chrom,strand,txStart,txEnd,cdsStart,cdsEnd,exonCount,exonStarts,exonEnds from refGene" \
>>     > mm9.refGene.gp
>>
>> $ genePredSingleCover mm9.refGene.gp stdout | sort > mm9.refGene.singleCover.gp
>>
>> See also:
>> http://genome.ucsc.edu/admin/cvs.html
>> http://genome.ucsc.edu/admin/jk-install.html
>>
>> And two scripts and a configuration file that can fetch and
>> build the source tree:
>>
>> http://genome-test.cse.ucsc.edu/~kent/src/unzipped/product/scripts/kentSrcUpdate.sh
>> http://genome-test.cse.ucsc.edu/~kent/src/unzipped/product/scripts/beta.cvsup.pl
>> http://genome-test.cse.ucsc.edu/~kent/src/unzipped/product/scripts/browserEnvironment.txt
>>
>> --Hiram

#!/usr/bin/awk -f
#
# Convert genePred file to a bed file (on stdout)
#
BEGIN {
    FS="\t";
    OFS="\t";
}
{
    # Input columns follow the select statement above:
    # name, chrom, strand, txStart, txEnd, cdsStart, cdsEnd,
    # exonCount, exonStarts, exonEnds
    name=$1
    chrom=$2
    strand=$3
    start=$4
    end=$5
    cdsStart=$6
    cdsEnd=$7
    blkCnt=$8

    # exonStarts and exonEnds are comma-separated lists
    delete starts
    split($9, starts, ",");
    delete ends
    split($10, ends, ",");

    # BED block sizes are exon lengths; BED block starts are
    # exon starts relative to the transcript start
    blkStarts=""
    blkSizes=""
    for (i = 1; i <= blkCnt; i++) {
        blkSizes = blkSizes (ends[i]-starts[i]) ",";
        blkStarts = blkStarts (starts[i]-start) ",";
    }

    print chrom, start, end, name, 1000, strand, cdsStart, cdsEnd, 0, blkCnt, blkSizes, blkStarts
}

_______________________________________________
Genome maillist - [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
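
To check the block arithmetic by hand, here is a minimal sketch that applies the same computation as the script above to a single record. The record is made up for illustration: the name NM_TEST and all of its coordinates are assumptions, not real refGene data. It uses the 10-column layout produced by the mysql query quoted above.

```shell
# Hypothetical two-exon transcript (invented values, not from refGene):
# name chrom strand txStart txEnd cdsStart cdsEnd exonCount exonStarts exonEnds
sample=$(printf 'NM_TEST\tchr1\t+\t1000\t2000\t1100\t1900\t2\t1000,1500,\t1200,2000,')

# Same arithmetic as the script:
#   block size  = exonEnd   - exonStart
#   block start = exonStart - txStart
bed=$(printf '%s\n' "$sample" | awk 'BEGIN { FS = "\t"; OFS = "\t" }
{
    split($9, starts, ",")
    split($10, ends, ",")
    blkSizes = ""
    blkStarts = ""
    for (i = 1; i <= $8; i++) {
        blkSizes = blkSizes (ends[i] - starts[i]) ","
        blkStarts = blkStarts (starts[i] - $4) ","
    }
    print $2, $4, $5, $1, 1000, $3, $6, $7, 0, $8, blkSizes, blkStarts
}')

# Prints one tab-separated BED line:
# chr1  1000  2000  NM_TEST  1000  +  1100  1900  0  2  200,500,  0,500,
printf '%s\n' "$bed"
```

The exons [1000,1200) and [1500,2000) become sizes 200 and 500 at offsets 0 and 500 from the transcript start. The last offset plus the last size (500 + 500) equals txEnd - txStart, which is the consistency BED expects of its block columns.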
