Hi, Jennifer,

Thanks a lot. Could you tell me the exact format of BED6 and BED12? I only have 4 columns "contig, start, end, name" for the TFBS regions, and I thought that means BED4.

Best,
Wei

On Mon, May 3, 2010 at 8:28 PM, Jennifer Jackson <[email protected]> wrote:
> Hello Wei,
>
> The commands look right. To do some detective work, there are two things
> you could look into, as a sanity check:
>
> 1) how much coverage are you getting between the genomes in the over.chain
> file? Many BLAT hits could be compressed into not-so-many chains. If you add
> up the coverage in the "to" database and "from" database, that might tell
> you something (if it is low for either one, then the low rate of liftOver
> would make sense).
>
> 2) how long are the query regions in your "from" SB_Ste12_bind4.bed file?
> Have you tried to use shorter genomic regions representing a gene's
> footprint (BED6) or just the regions for exons from a particular transcript
> (BED12)? If you take those same regions and run a BLAT against the "to"
> genome, are you able to capture hits that are not found with liftOver? Or do
> you find more that meet or exceed the original BLAT criteria? A few cases of
> each would be best.
>
> These kinds of tests can probably help you figure out where the problem is.
>
> I am also working on getting some feedback for the chain/net part of your
> process to see if there are some suggestions to offer.
>
>
> Thanks,
> Jennifer
>
> ---------------------------------
> Jennifer Jackson
> UCSC Genome Informatics Group
> http://genome.ucsc.edu/
>
> On 5/3/10 11:52 AM, Wei Zheng wrote:
>
>> Hi, Jennifer,
>>
>> I get BLAT matches, but few liftOver matches (19 out of 256 regions).
>> Below is the code I used.
>>
>> cd /mnt/shared/GenomeLift/bayanus
>> #mkdir lift net psl chainRaw over bedOver
>> for i in {1..16} M; do faSplit -lift=lift/chr$i.lft size
>> ../sacCer1/chr$i.fa -oneFile 3000 /scratch/split/chr$i; done
>> for i in {1..16} M; do blat contig.fasta /scratch/split/chr$i.fa
>> raw/Sb_Sc_chr${i}.psl -tileSize=11 -minScore=100 -minIdentity=80
>> -fastMap; done
>> cd raw; for i in {1..16} M; do liftUp -pslQ ../psl/chr$i.psl
>> ../lift/chr$i.lft warn Sb_Sc_chr${i}.psl; done
>> #faToTwoBit contig.fasta Sb_contig.2bit
>> cd ..;
>> for i in {1..16} M; do axtChain -linearGap=medium -psl psl/chr${i}.psl
>> Sb_contig.2bit ../sacCer1/chr${i}.2bit chainRaw/Sb2Sc_chr${i}.chain; done
>> chainMergeSort chainRaw/*.chain | chainSplit chain stdin
>> cd chain; for i in *.chain; do chainNet $i ../contig.sizes
>> ../../sacCer1/chrom.sizes ../net/${i}.net /dev/null; done
>> for i in *.chain; do netChainSubset ../net/$i.net $i
>> ../over/$i; done
>> cat ../over/*.chain >../bedOver/over.chain
>> cd ..;
>> liftOver -minMatch=0.1 -multiple SB_Ste12_bind4.bed bedOver/over.chain
>> SB2SC_Ste12_bind.bed SB2SC_nomatch.txt
>>
>>
>> On Mon, May 3, 2010 at 2:31 PM, Jennifer Jackson <[email protected]> wrote:
>> > Hello Wei,
>> >
>> > To clarify, do you not have BLAT matches between the two species, or do
>> > you have the matches, but it is liftOver that is not mapping data? If
>> > you don't know, try tweaking liftOver a bit first.
>> >
>> > liftOver parameters:
>> > - use BED as an input (if you were using positional format)
>> > - use -multiple, use -minMatch 0.1
>> > - maybe add in a -minSizeQ 300 (or so) to keep short fragments out (that
>> > -multiple will capture). Or leave out -minSizeQ, review, then add it
>> > back in using a threshold you set based on what type of output you
>> > desire.
>> >
>> > Try the liftOver changes and let us know if this does not solve the
>> > problem.
>> > If it doesn't, send back details of your processing (your exact
>> > parameters based on the processes from the document) and we can provide
>> > feedback for loosening up the match criteria.
>> >
>> > Thanks,
>> > Jennifer
>> >
>> > ---------------------------------
>> > Jennifer Jackson
>> > UCSC Genome Informatics Group
>> > http://genome.ucsc.edu/
>> >
>> > On 5/3/10 10:52 AM, Wei Zheng wrote:
>> >>
>> >> Hello,
>> >>
>> >> I was trying to generate over.chain from S. bayanus to S. cerevisiae
>> >> and perform liftOver to map TFBS of S. bayanus to S. cerevisiae
>> >> coordinates, using the instructions on
>> >> http://hgwdev.cse.ucsc.edu/~kent/src/unzipped/hg/doc/liftOver.txt .
>> >> However, I can only get less than 10% of my TFBS lifted; the others are
>> >> always non-matched.
>> >>
>> >> I wonder whether you could point out some key parameters in the blat,
>> >> axtChain, and liftOver steps for such closely related yeast species.
>> >> The assembly I used was sacCer1 (16 chromosomes downloaded from UCSC)
>> >> for S. cerevisiae and 1098 contigs (downloaded from SGD, reported in
>> >> Kellis 2003) for S. bayanus.
>> >>
>> >> Thank you very much!
>> >>
>> >> Wei
>> >> _______________________________________________
>> >> Genome maillist - [email protected]
>> >> https://lists.soe.ucsc.edu/mailman/listinfo/genome
>> >
>>
_______________________________________________
Genome maillist - [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
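
On the coverage check suggested in point 1 of Jennifer's reply, here is a minimal sketch of one way it could be done. It assumes the standard UCSC chain layout: a header line beginning with "chain", followed by data lines whose first field is the size of an ungapped aligned block. The function names chain_aligned_bases and total_size are illustrative only and are not part of the UCSC tools. Each ungapped block has the same length on both genomes, so a single total can be compared against the size of either assembly. Chains that overlap are counted more than once, so treat the result as an upper bound.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "<number of chains><TAB><ungapped aligned bases>"
# for a chain file such as bedOver/over.chain.
chain_aligned_bases() {
  awk '
    /^#/          { next }             # skip comment lines, if any
    $1 == "chain" { chains++; next }   # header line: count the chain
    NF >= 1       { bases += $1 }      # data line: first field is the block size
    END           { printf "%d\t%d\n", chains, bases }
  ' "$1"
}

# Hypothetical helper: total length of all sequences listed in a
# two-column "name size" file such as contig.sizes or chrom.sizes.
total_size() {
  awk '{ total += $2 } END { printf "%d\n", total }' "$1"
}

# Example usage with the file names from the thread (not run here):
#   chain_aligned_bases bedOver/over.chain
#   total_size contig.sizes              # "from" assembly (S. bayanus)
#   total_size ../sacCer1/chrom.sizes    # "to" assembly (S. cerevisiae)
```

Dividing the aligned-base total by each assembly size gives a rough coverage fraction for the "from" and "to" genomes; a low value on either side would be consistent with a low liftOver rate.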
