Hi Wei,

Here is the FAQ for BED files:
http://genome.ucsc.edu/FAQ/FAQformat.html#format1

If you convert to BED6, you can add strand. That may or may not be important for your research; for transcription factor binding sites, it probably is. Either way, liftOver can do the conversion with or without strand.

Thanks,
jen

On 5/3/10 6:03 PM, Wei Zheng wrote:
> Hi, Jennifer,
>
> Thanks a lot. Could you tell me the exact format of BED6 and BED12? I
> only have 4 columns "contig, start, end, name" for the TFBS regions, and
> I thought that means BED4.
>
> Best,
> Wei
>
> On Mon, May 3, 2010 at 8:28 PM, Jennifer Jackson <[email protected]> wrote:
>
> Hello Wei,
>
> The commands look right. To do some detective work, there are two
> things you could look into as a sanity check:
>
> 1) How much coverage are you getting between the genomes in the
> over.chain file? Many BLAT hits could be compressed into not-so-many
> chains. If you add up the coverage in the "to" database and "from"
> database, that might tell you something (if it is low for either
> one, then the low rate of liftOver would make sense).
>
> 2) How long are the query regions in your "from" SB_Ste12_bind4.bed
> file? Have you tried to use shorter genomic regions representing a
> gene's footprint (BED6) or just the regions for exons from a
> particular transcript (BED12)? If you take those same regions and
> run a BLAT against the "to" genome, are you able to capture hits
> that are not found with liftOver? Or do you find more that meet or
> exceed the original BLAT criteria? A few cases of each would be best.
>
> These kinds of tests can probably help you figure out where the
> problem is.
>
> I am also working on getting some feedback for the chain/net part of
> your process to see if there are some suggestions to offer.
>
> Thanks,
> Jennifer
>
> ---------------------------------
> Jennifer Jackson
> UCSC Genome Informatics Group
> http://genome.ucsc.edu/
>
> On 5/3/10 11:52 AM, Wei Zheng wrote:
>
> Hi, Jennifer,
>
> I get BLAT matches, but few liftOver matches (19 out of 256 regions).
> Below is the code I used.
>
> cd /mnt/shared/GenomeLift/bayanus
> #mkdir lift net psl chainRaw over bedOver
> for i in {1..16} M; do faSplit -lift=lift/chr$i.lft size ../sacCer1/chr$i.fa -oneFile 3000 /scratch/split/chr$i; done
> for i in {1..16} M; do blat contig.fasta /scratch/split/chr$i.fa raw/Sb_Sc_chr${i}.psl -tileSize=11 -minScore=100 -minIdentity=80 -fastMap; done
> cd raw; for i in {1..16} M; do liftUp -pslQ ../psl/chr$i.psl ../lift/chr$i.lft warn Sb_Sc_chr${i}.psl; done
> #faToTwoBit contig.fasta Sb_contig.2bit
> cd ..;
> for i in {1..16} M; do axtChain -linearGap=medium -psl psl/chr${i}.psl Sb_contig.2bit ../sacCer1/chr${i}.2bit chainRaw/Sb2Sc_chr${i}.chain; done
> chainMergeSort chainRaw/*.chain | chainSplit chain stdin
> cd chain; for i in *.chain; do chainNet $i ../contig.sizes ../../sacCer1/chrom.sizes ../net/${i}.net /dev/null; done
> for i in *.chain; do netChainSubset ../net/$i.net $i ../over/$i; done
> cat ../over/*.chain >../bedOver/over.chain
> cd ..;
> liftOver -minMatch=0.1 -multiple SB_Ste12_bind4.bed bedOver/over.chain SB2SC_Ste12_bind.bed SB2SC_nomatch.txt
>
> On Mon, May 3, 2010 at 2:31 PM, Jennifer Jackson <[email protected]> wrote:
> > Hello Wei,
> >
> > To clarify, do you not have BLAT matches between the two species, or do you
> > have the matches, but it is liftOver that is not mapping the data? If you don't
> > know, try tweaking liftOver a bit first.
> >
> > liftOver parameters:
> > - use BED as an input (if you were using positional format)
> > - use -multiple, use -minMatch=0.1
> > - maybe add in a -minSizeQ=300 (or so) to keep out short fragments (that
> > -multiple will capture). Or leave out -minSizeQ, review, then add it back in
> > using a threshold you set based on what type of output you desire.
> >
> > Try the liftOver changes and let us know if this does not solve the problem.
> > If it doesn't, send back details of your processing (your exact
> > parameters based on the processes from the document) and we can provide
> > feedback for loosening up the match criteria.
> >
> > Thanks,
> > Jennifer
> >
> > ---------------------------------
> > Jennifer Jackson
> > UCSC Genome Informatics Group
> > http://genome.ucsc.edu/
> >
> > On 5/3/10 10:52 AM, Wei Zheng wrote:
> >>
> >> Hello,
> >>
> >> I was trying to generate over.chain from S. bayanus to S. cerevisiae
> >> and perform liftOver to map TFBS of S. bayanus to S. cerevisiae
> >> coordinates, using the instructions at
> >> http://hgwdev.cse.ucsc.edu/~kent/src/unzipped/hg/doc/liftOver.txt.
> >> However, I can only get fewer than 10% of my TFBS lifted; the others are
> >> always non-matched.
> >>
> >> I wonder whether you could point out some key parameters in the blat,
> >> axtChain, and liftOver steps for such closely related yeast species.
> >> The assemblies I used were sacCer1 (16 chromosomes downloaded from UCSC)
> >> for S. cerevisiae and 1098 contigs (downloaded from SGD, reported in
> >> Kellis 2003) for S. bayanus.
> >>
> >> Thank you very much!
> >>
> >> Wei
> >> _______________________________________________
> >> Genome maillist - [email protected]
> >> https://lists.soe.ucsc.edu/mailman/listinfo/genome
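
Two of the suggestions in this thread (moving from BED4 to BED6, and adding up coverage in over.chain) can be sketched with awk. This is only a rough illustration, not part of the UCSC tools: the function names bed4_to_bed6 and chain_aligned_bases are made up here, the score of 0 and the default "+" strand are placeholders you would replace with real values for your binding sites, and the coverage figure simply sums the ungapped block sizes in the chain file, so any overlapping chains would be counted more than once.

```shell
#!/bin/sh
# Hypothetical helpers (not UCSC utilities); names and defaults are assumptions.

# Pad a 4-column BED file (chrom, start, end, name) out to BED6 by appending
# a placeholder score (0) and a strand. Usage: bed4_to_bed6 in.bed [strand]
# The strand defaults to "+"; supply the real orientation if you know it.
bed4_to_bed6() {
  awk -v strand="${2:-+}" '
    BEGIN { OFS = "\t" }
    NF >= 4 { print $1, $2, $3, $4, 0, strand }
  ' "$1"
}

# Sum the ungapped alignment block sizes in a chain file.
# Header lines start with "chain"; block lines are "size dt dq", and the
# last block of each chain is just "size". Blank lines separate chains.
# Usage: chain_aligned_bases over.chain
chain_aligned_bases() {
  awk '
    $1 != "chain" && NF >= 1 { total += $1 }
    END { print total + 0 }
  ' "$1"
}

# Example (file names taken from the thread, output name is illustrative):
#   bed4_to_bed6 SB_Ste12_bind4.bed + > SB_Ste12_bind6.bed
#   chain_aligned_bases bedOver/over.chain
```

Comparing the number printed by chain_aligned_bases with the total of the second column of each genome's sizes file (contig.sizes or chrom.sizes) gives a rough idea of how much of each genome the chains cover.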
