To whom it may concern,
I would like to automate, as a batch job, the re-alignment of all SNP flanking
sequences on chromosome 16 from
ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606/rs_fasta/
against the chr16 genomic sequence available from
http://hgdownload.cse.ucsc.edu/downloads.html.
I have a few questions:
1) Is there anywhere that I can get a customized set of SNP flanking sequences
(i.e. only the sequences for the positively selected SNPs, given a list of the
rs numbers for those SNPs, and likewise for the non-positively selected
SNPs)?
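For question 1, if no customized download exists, the filtering could
presumably be done locally on the rs_fasta file. Below is a minimal Python
sketch of that idea. It assumes each FASTA header line contains the identifier
as "rs" followed by digits (for example ">gnl|dbSNP|rs12345 ..."); the function
name filter_fasta_by_rs and the invert flag are made up for illustration.

```python
import re

# Assumption: the first "rs<digits>" token on a header line is the SNP ID.
RS_PATTERN = re.compile(r"\brs(\d+)\b")

def filter_fasta_by_rs(lines, wanted_rs, invert=False):
    """Yield the FASTA lines of records whose rs ID is in wanted_rs.

    With invert=True, yield the records whose rs ID is NOT in wanted_rs
    (e.g. the non-positively selected set).
    """
    wanted = {rs.strip().lower() for rs in wanted_rs if rs.strip()}
    keep = False  # lines before the first header are dropped
    for line in lines:
        if line.startswith(">"):
            match = RS_PATTERN.search(line)
            hit = match is not None and ("rs" + match.group(1)) in wanted
            keep = hit != invert
        if keep:
            yield line
```

Both arguments can be open file handles (the FASTA file and a text file with
one rs number per line), so the whole chromosome file does not need to be held
in memory.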
2) I did a test BLAT run using only a portion of the SNP query sequences, and
then converted the result to human-readable output using pslPretty. I was
wondering:
a) How can I identify the base that the SNP refers to in the pslPretty
output, as it is shown on the website (see example below)? Below it
seems to be identified by a "G" for the genomic sequence (?) and an "R" for
the reference sequence (?).
b) I am assuming that the output I received for this test run is ordered from
highest score to lowest for each query. Is there any way to set the
parameters so that only the highest-scoring result for each query is written
to the output file? Is this what happens on the UCSC website?
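For question 2b, one possible workaround would be to filter the PSL file
before running pslPretty. The Python sketch below keeps only the
highest-scoring line for each query. It assumes the standard 21-column,
tab-separated PSL layout and a simplified score (matches plus half of
repMatches, minus mismatches and the two insert counts); that score is an
assumption and may not be identical to what the website reports. The function
names are hypothetical.

```python
def psl_score(fields):
    """Simplified alignment score from PSL columns (an assumption)."""
    matches = int(fields[0])
    mismatches = int(fields[1])
    rep_matches = int(fields[2])
    q_num_insert = int(fields[4])
    t_num_insert = int(fields[6])
    return matches + rep_matches // 2 - mismatches - q_num_insert - t_num_insert

def best_hit_per_query(lines):
    """Return {query name: PSL line} holding the best-scoring line per query."""
    best = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Skip the PSL header, blank lines, and anything malformed.
        if len(fields) < 21 or not fields[0].isdigit():
            continue
        q_name = fields[9]  # column 10 is the query name
        score = psl_score(fields)
        # On ties the first line seen is kept.
        if q_name not in best or score > best[q_name][0]:
            best[q_name] = (score, line)
    return {q_name: line for q_name, (score, line) in best.items()}
```

Writing the values of the returned dictionary to a new file gives a PSL file
with one alignment per query.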
79237487 AAACAAACAGCTTGTTTGTGGTTCGTCCTGAAATCCTCCCTGCTCACAAAACAGCCAGCTACTTGGTTTTCTAAAAGACGTAATTTTGCAGGCAGACTTC 79237586
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
00000201 AAACAAACAGCTTGTTTGTGGTTCGTCCTGAAATCCTCCCTGCTCACAAAACAGCCAGCTACTTGGTTTTCTAAAAGACGTAATTTTGCAGGCAGACTTC 00000300
*79237587 G 79237587
00000301 R 00000301
*79237588 TAGAGCCATTCTGTGCAGAAGAAGGGAAGGGAGAAGCTGTTTGTTTTACCTGTAGTATGAAGATATTCTTTGCGCTGTTAGAACTGAGCTCATTAATTCT 79237687
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
00000302 TAGAGCCATTCTGTGCAGAAGAAGGGAAGGGAGAAGCTGTTTGTTTTACCTGTAGTATGAAGATATTCTTTGCGCTGTTAGAACTGAGCTCATTAATTCT 00000401
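For question 2a, the genomic position of the SNP base could perhaps be
computed from the PSL line itself rather than read off the pslPretty output.
The Python sketch below maps a query position to a genomic position using the
block lists of one alignment. It assumes a "+" strand alignment and 0-based
block starts, as in PSL. The block values in the usage lines are taken from
the alignment shown above (query bases 201-401 against 79237487-79237687) and
treated as a single ungapped block, which is an assumption. The function name
is hypothetical.

```python
def query_to_target(q_pos, block_sizes, q_starts, t_starts):
    """Map a 0-based query position to a 0-based target position.

    Returns None if the position falls outside every aligned block
    (for example inside a gap). "+" strand alignments only.
    """
    for size, q_start, t_start in zip(block_sizes, q_starts, t_starts):
        if q_start <= q_pos < q_start + size:
            return t_start + (q_pos - q_start)
    return None

# SNP base at query position 301 (1-based), i.e. 300 when 0-based.
snp_target = query_to_target(300, [201], [200], [79237486])
print(snp_target + 1)  # 79237587 in 1-based coordinates
```

The result agrees with the "G 79237587" line in the example, which suggests
the SNP base can be located by its position in the flanking sequence.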
3) Finally, I was wondering whether there are any documents or descriptions
online of commonly used parameter modifications for BLAT and pslPretty.
I apologize for the length of this email. I am an undergraduate
bioinformatics intern, so I have to ask for your patience in helping me.
Kyle Tretina
Junior
Wheaton College
_______________________________________________
Genome maillist - [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome