Hi,

Note: apologies in advance if this gets duplicated. It didn't post
after a day, and I figured it may have been blocked because of my PGP
signature attachment.

First, I just wanted to say thanks for the mailing list and to thank
everyone for their work on the source tree - it's a great resource
that I use almost daily! I've browsed the list for quite some time,
but have recently run across some strangeness in the behavior of
gfClient relative to blat. The strangeness is likely of my own
doing, but I figured I would email to check whether that is indeed
the case.

I'm working with gfClient/gfServer (v. 34x4) and blat (v. 34x4), both
compiled from CVS. The problem I'm running into involves alignments
that start in repeat regions (as opposed to alignments that extend
over repeats). Here are my gfServer start parameters:

/Users/bcf/bin/i386/gfserver start 127.0.0.1 8888 /Users/bcf/Data/test/SoftMask/*.softmask.2bit -mask

where *.softmask.2bit was created with faToTwoBit from a FASTA file of
soft-masked sequences (from RepeatMasker | `maskOutFa -soft`). These
targets also contain the query sequence I am demonstrating with. I am
running gfServer because I have a large number of queries, and I would
prefer to avoid reindexing the 2bit file with every call to blat.
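
For reference, the soft-masked (lowercase) runs in a FASTA record can
be listed with a few lines of Python before converting to 2bit. This
is only a sketch: the function name `soft_masked_intervals` and the
example sequence are made up for illustration and are not part of the
blat tools or of tmp.fa.

```python
import re

def soft_masked_intervals(seq):
    """Return 0-based, half-open (start, end) runs of lowercase bases."""
    return [(m.start(), m.end()) for m in re.finditer(r"[a-z]+", seq)]

# Made-up sequence for illustration only (not taken from tmp.fa).
example = "ACGTacgtnnACGT"
print(soft_masked_intervals(example))  # [(4, 10)]
```

Half-open intervals are used so that end minus start gives the number
of masked bases directly.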

My query with gfClient is:

/Users/bcf/bin/i386/gfclient -t=DNA -q=DNA -minScore=0 -minIdentity=0 -out=psl 127.0.0.1 8888 / ~/tmp/tmp.fa stdout

where tmp.fa holds a single soft-masked sequence in FASTA format. The
sequence has one soft-masked repeat region spanning positions 76-158
(0-indexed). The (truncated) gfClient output is:

match mis-  rep.  N's Q gap Q gap T gap T gap strand Q              Q    Q     Q   T              T    T     T   block blockSizes      qStarts         tStarts
      match match     count bases count bases        name           size start end name           size start end count
---------------------------------------------------------------------------------------------------------------------------------------------------------------
100   2     0     1   0     0     2     2     -      FX5ZTWB02D5UGZ 179  76    179 FX5ZTWB02D3DFI 250  30    135 3     18,4,81,        0,18,22,        30,49,54,
99    2     0     1   1     1     3     5     -      FX5ZTWB02D5UGZ 179  76    179 FX5ZTWB02DA9YF 222  102   209 4     18,4,68,12,     0,18,22,91,     102,121,126,197,
94    2     0     0   1     7     1     9     -      FX5ZTWB02D5UGZ 179  76    179 FX5ZTWB02DAKZ3 297  23    128 2     12,84,          0,19,           23,44,
100   2     0     1   0     0     4     12    -      FX5ZTWB02D5UGZ 179  76    179 FX5ZTWB02DBSW8 222  102   217 5     18,4,17,51,13,  0,18,22,39,90,  102,121,126,144,204,
55    1     0     0   0     0     0     0     -      FX5ZTWB02D5UGZ 179  76    132 FX5ZTWB02DBYD5 226  39    95  1     56,             47,             39,
67    1     0     0   0     0     0     0     -      FX5ZTWB02D5UGZ 179  76    144 FX5ZTWB02DJ4YU 231  96    164 1     68,             35,             96,
100   2     0     1   0     0     2     2     -      FX5ZTWB02D5UGZ 179  76    179 FX5ZTWB02DJB25 170  29    134 3     18,4,81,        0,18,22,        29,48,53,
100   2     0     1   0     0     2     2     -      FX5ZTWB02D5UGZ 179  76    179 FX5ZTWB02DVMEF 168  29    134 3     18,4,81,        0,18,22,        29,48,53,
79    0     0     0   1     3     3     19    -      FX5ZTWB02D5UGZ 179  76    158 FX5ZTWB02DWVVC 241  64    162 4     13,28,15,23,    21,34,65,80,    64,94,123,139,
94    2     0     0   1     7     1     9     -      FX5ZTWB02D5UGZ 179  76    179 FX5ZTWB02EGVMB 247  23    128 2     12,84,          0,19,           23,44,
100   2     0     1   0     0     2     2     -      FX5ZTWB02D5UGZ 179  76    179 FX5ZTWB02EHBES 338  39    144 3     18,4,81,        0,18,22,        39,58,63,
44    0     0     0   0     0     1     1     -      FX5ZTWB02D5UGZ 179  76    120 FX5ZTWB02EOC38 213  66    111 2     28,16,          59,87,          66,95,
100   2     0     1   0     0     2     2     -      FX5ZTWB02D5UGZ 179  76    179 FX5ZTWB02ETWES 202  29    134 3     18,4,81,        0,18,22,        29,48,53,
100   2     0     1   0     0     1     1     -      FX5ZTWB02D5UGZ 179  76    179 FX5ZTWB02EZOW2 208  19    123 2     18,85,          0,18,           19,38,

A blat run of the form:

blat /Users/bcf/Data/test/SoftMask/*.softmask.clean.2bit tmp.fa -mask=lower stdout

returns (full output):

match mis-  rep.  N's Q gap Q gap T gap T gap strand Q              Q    Q     Q   T              T    T     T   block blockSizes      qStarts         tStarts
      match match     count bases count bases        name           size start end name           size start end count
---------------------------------------------------------------------------------------------------------------------------------------------------------------
31    2     80    0   0     0     1     24    +      FX5ZTWB02D5UGZ 179  45    158 FX5ZTWB02EZZ23 182  35    172 2     31,82,          45,76,          35,90,
32    0     0     0   0     0     0     0     +      FX5ZTWB02D5UGZ 179  45    77  FX5ZTWB02EMMWO 294  35    67  1     32,             45,             35,
32    0     0     0   0     0     0     0     +      FX5ZTWB02D5UGZ 179  45    77  FX5ZTWB02EM5LP 153  35    67  1     32,             45,             35,
32    0     33    0   1     1     1     23    +      FX5ZTWB02D5UGZ 179  45    111 FX5ZTWB02ELBHJ 161  34    122 2     32,33,          45,78,          34,89,
32    0     0     0   0     0     0     0     +      FX5ZTWB02D5UGZ 179  45    77  FX5ZTWB02EKORL 159  36    68  1     32,             45,             36,
32    0     0     0   0     0     0     0     +      FX5ZTWB02D5UGZ 179  45    77  FX5ZTWB02EJB29 138  35    67  1     32,             45,             35,
66    2     0     0   1     8     2     26    +      FX5ZTWB02D5UGZ 179  0     76  FX5ZTWB02EJ0PM 301  0     94  3     11,26,31,       0,11,45,        0,12,63,
68    1     0     0   1     8     1     24    +      FX5ZTWB02D5UGZ 179  0     77  FX5ZTWB02EICDX 381  229   322 2     37,32,          0,45,           229,290,
66    2     0     0   1     8     2     25    +      FX5ZTWB02D5UGZ 179  0     76  FX5ZTWB02EH3VT 247  0     93  3     11,26,31,       0,11,45,        0,12,62,
62    1     0     0   2     13    2     35    +      FX5ZTWB02D5UGZ 179  0     76  FX5ZTWB02EGNY4 328  0     98  3     24,8,31,        0,29,45,        0,32,67,
43    0     0     0   1     8     1     24    +      FX5ZTWB02D5UGZ 179  26    77  FX5ZTWB02EG2T5 198  0     67  2     11,32,          26,45,          0,35,
43    0     0     0   1     8     1     24    +      FX5ZTWB02D5UGZ 179  26    77  FX5ZTWB02ECX8Z 224  0     67  2     11,32,          26,45,          0,35,
32    0     0     0   0     0     0     0     +      FX5ZTWB02D5UGZ 179  45    77  FX5ZTWB02EC10O 167  35    67  1     32,             45,             35,
43    0     0     0   1     8     1     24    +      FX5ZTWB02D5UGZ 179  26    77  FX5ZTWB02EBHWR 212  0     67  2     11,32,          26,45,          0,35,
32    0     0     0   0     0     0     0     +      FX5ZTWB02D5UGZ 179  45    77  FX5ZTWB02DYAKJ 141  35    67  1     32,             45,             35,
32    0     0     0   0     0     0     0     +      FX5ZTWB02D5UGZ 179  45    77  FX5ZTWB02DUG6S 181  35    67  1     32,             45,             35,
32    0     0     0   0     0     0     0     +      FX5ZTWB02D5UGZ 179  45    77  FX5ZTWB02DTP6B 182  35    67  1     32,             45,             35,
43    0     0     0   1     8     1     24    +      FX5ZTWB02D5UGZ 179  26    77  FX5ZTWB02DSULL 245  0     67  2     11,32,          26,45,          0,35,
44    0     37    0   1     8     2     55    +      FX5ZTWB02D5UGZ 179  26    115 FX5ZTWB02DPKMK 206  0     136 3     11,32,38,       26,45,77,       0,35,98,
32    0     0     0   0     0     0     0     +      FX5ZTWB02D5UGZ 179  45    77  FX5ZTWB02DPI46 179  35    67  1     32,             45,             35,
43    0     0     0   1     8     1     24    +      FX5ZTWB02D5UGZ 179  26    77  FX5ZTWB02DNZB3 290  0     67  2     11,32,          26,45,          0,35,
43    0     0     0   1     8     1     24    +      FX5ZTWB02D5UGZ 179  26    77  FX5ZTWB02DMW8R 211  0     67  2     11,32,          26,45,          0,35,
40    0     0     0   1     8     2     27    +      FX5ZTWB02D5UGZ 179  26    74  FX5ZTWB02DMQ4E 240  0     67  3     11,5,24,        26,45,50,       0,37,43,
32    0     0     0   0     0     0     0     +      FX5ZTWB02D5UGZ 179  45    77  FX5ZTWB02DKKJB 175  35    67  1     32,             45,             35,
32    0     0     0   0     0     0     0     +      FX5ZTWB02D5UGZ 179  45    77  FX5ZTWB02DI8TE 158  35    67  1     32,             45,             35,
43    0     0     0   1     8     1     24    +      FX5ZTWB02D5UGZ 179  26    77  FX5ZTWB02DGCU6 275  0     67  2     11,32,          26,45,          0,35,
43    0     0     0   1     8     1     24    +      FX5ZTWB02D5UGZ 179  26    77  FX5ZTWB02DB9V0 286  0     67  2     11,32,          26,45,          0,35,
43    2     78    0   2     9     2     47    +      FX5ZTWB02D5UGZ 179  26    158 FX5ZTWB02D92EW 204  0     170 3     11,32,80,       26,45,78,       0,35,90,
43    0     0     0   1     8     1     24    +      FX5ZTWB02D5UGZ 179  26    77  FX5ZTWB02D8YAV 238  0     67  2     11,32,          26,45,          0,35,
43    0     0     0   1     8     1     24    +      FX5ZTWB02D5UGZ 179  26    77  FX5ZTWB02D8RCP 216  0     67  2     11,32,          26,45,          0,35,
95    0     82    1   1     1     2     4     +      FX5ZTWB02D5UGZ 179  0     179 FX5ZTWB02D887O 221  0     182 3     11,149,18,      0,11,161,       0,12,164,
43    0     0     0   1     8     1     24    +      FX5ZTWB02D5UGZ 179  26    77  FX5ZTWB02D83KC 250  0     67  2     11,32,          26,45,          0,35,
96    0     82    1   0     0     0     0     +      FX5ZTWB02D5UGZ 179  0     179 FX5ZTWB02D5UGZ 179  0     179 1     179,            0,              0,
43    0     0     0   1     8     1     24    +      FX5ZTWB02D5UGZ 179  26    77  FX5ZTWB02D5NGS 270  0     67  2     11,32,          26,45,          0,35,
43    0     0     0   1     8     1     24    +      FX5ZTWB02D5UGZ 179  26    77  FX5ZTWB02D35EE 194  0     67  2     11,32,          26,45,          0,35,
43    0     0     0   1     8     1     24    +      FX5ZTWB02D5UGZ 179  26    77  FX5ZTWB02D1GE9 269  0     67  2     11,32,          26,45,          0,35,
43    0     0     0   1     8     1     24    +      FX5ZTWB02D5UGZ 179  26    77  FX5ZTWB02D1EQS 198  0     67  2     11,32,          26,45,          0,35,
43    0     0     0   1     8     1     24    +      FX5ZTWB02D5UGZ 179  26    77  FX5ZTWB02D0168 201  0     67  2     11,32,          26,45,          0,35,
30    0     0     0   0     0     0     0     +      FX5ZTWB02D5UGZ 179  47    77  FX5ZTWB02C9X15 154  39    69  1     30,             47,             39,
32    0     0     0   0     0     0     0     +      FX5ZTWB02D5UGZ 179  45    77  FX5ZTWB02C8WKA 194  35    67  1     32,             45,             35,
43    0     0     0   1     8     1     24    +      FX5ZTWB02D5UGZ 179  26    77  FX5ZTWB02C7D4Y 222  0     67  2     11,32,          26,45,          0,35,
43    0     0     0   1     8     1     24    +      FX5ZTWB02D5UGZ 179  26    77  FX5ZTWB02C6UVQ 263  0     67  2     11,32,          26,45,          0,35,
66    2     0     0   1     8     2     25    +      FX5ZTWB02D5UGZ 179  0     76  FX5ZTWB02C5OY2 249  0     93  3     11,26,31,       0,11,45,        0,12,62,
66    2     0     0   1     8     1     24    +      FX5ZTWB02D5UGZ 179  0     76  FX5ZTWB02C1LC0 179  0     92  2     37,31,          0,45,           0,61,


It looks like blat is handling the masking correctly in alignments -
no alignment starts in the repeat region (76-158) of the query or the
targets. Alignments that begin in unmasked sequence and run across
masked regions behave as expected (e.g., the self-to-self alignment,
Q = FX5ZTWB02D5UGZ to T = FX5ZTWB02D5UGZ, extends through the masked
region).
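
As a sanity check on how the masked bases are accounted for, the
self-to-self row can be tallied in Python. The sketch below assumes
(as I understand the PSL format) that match + mismatch + rep. match +
N's equals the sum of blockSizes, and that rep. match counts the
aligned bases that fall in repeats. The function name
`psl_base_counts` is made up for this example; the row is copied from
the blat output above.

```python
def psl_base_counts(line):
    """Return (column total, summed block sizes, rep. match) for a PSL row."""
    f = line.split()
    match, mismatch, rep_match, n_count = (int(x) for x in f[:4])
    # blockSizes is a comma-terminated list such as "11,149,18,"
    block_sizes = [int(x) for x in f[18].rstrip(",").split(",")]
    return match + mismatch + rep_match + n_count, sum(block_sizes), rep_match

# Self-to-self alignment row from the blat output.
SELF_ROW = ("96 0 82 1 0 0 0 0 + FX5ZTWB02D5UGZ 179 0 179 "
            "FX5ZTWB02D5UGZ 179 0 179 1 179, 0, 0,")

print(psl_base_counts(SELF_ROW))  # (179, 179, 82)
```

The rep. match value of 82 equals 158 - 76, the length of the masked
region if 76-158 is read as a half-open interval. Run on the gfClient
rows, the same function returns a rep. match of 0 for each, since
that column is 0 throughout that output.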

Conversely, in the truncated gfClient output, several of the
alignments have a `Q start` to `Q end` range that falls within 76-158,
which is unexpected given that gfServer was started with the `-mask`
flag on the soft-masked 2bit input file. After double-checking the
associated target sequences (and their reverse complements) for masked
bases, it appears that alignments are also being started in
repeat-masked regions of these targets.
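
The query-side check can be sketched in Python as follows. This is an
illustration only: the helper names are made up, the masked region is
assumed to be the half-open interval [76, 158), and the three rows are
copied from the outputs above (two from gfClient, one from blat).

```python
PSL_FIELDS = 21  # number of columns in a PSL data row

def parse_psl_row(line):
    """Extract the fields needed for the check from one PSL data row."""
    f = line.split()
    if len(f) != PSL_FIELDS:
        raise ValueError(f"expected {PSL_FIELDS} fields, got {len(f)}")
    return {"strand": f[8], "q_start": int(f[11]),
            "q_end": int(f[12]), "t_name": f[13]}

def starts_in_mask(row, mask_start, mask_end):
    """True if Q start lies in the half-open interval [mask_start, mask_end)."""
    return mask_start <= row["q_start"] < mask_end

# Assumption: 76-158 is read as half-open, i.e. bases 76..157 are masked.
MASK_START, MASK_END = 76, 158

rows = [
    # gfClient rows
    "55 1 0 0 0 0 0 0 - FX5ZTWB02D5UGZ 179 76 132 FX5ZTWB02DBYD5 226 39 95 1 56, 47, 39,",
    "67 1 0 0 0 0 0 0 - FX5ZTWB02D5UGZ 179 76 144 FX5ZTWB02DJ4YU 231 96 164 1 68, 35, 96,",
    # blat row
    "32 0 0 0 0 0 0 0 + FX5ZTWB02D5UGZ 179 45 77 FX5ZTWB02EMMWO 294 35 67 1 32, 45, 35,",
]

flagged = [r["t_name"] for r in map(parse_psl_row, rows)
           if starts_in_mask(r, MASK_START, MASK_END)]
print(flagged)  # ['FX5ZTWB02DBYD5', 'FX5ZTWB02DJ4YU']
```

Only the two gfClient hits are flagged; the blat hit starts at
position 45, upstream of the masked region.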

I noticed that the gfServer help says the mask option is for use with
nib files, but I assumed that, since 2bit files are also a valid input
option (and can be composed of multiple FASTA records, which I need),
the `-mask` option applied to them as well. So, the discrepancy
between the blat and gfClient output is what has me confused. Again,
I suspect that I've set something up incorrectly, that my
interpretation of the expected behavior is wrong, or that the help is
indeed correct and gfServer masking is nib-only, but I can't quite put
my finger on the problem.

Thanks for your time,
brant

************************************************
Brant C. Faircloth
Dept. of Ecology and Evolutionary Biology
621 Charles E. Young Drive South
University of California
Los Angeles, CA 90095 USA

rooms:   LSS 4304 and 4315
email:   [email protected]
lab:     +1.310.206.2270
office:  +1.310.206.3083
mobile:  +1.706.201.6110
************************************************

< * )
  (_ \\
  _ ||




