Dear Angie,

Thank you for the thorough explanation.  The default settings are much
too stringent for my purposes.  Is it possible for you to explain to
me how the two scores are calculated, TopScore and SynScore, so that a
user can make intelligent parameter choices?

Thanks,
Danielle


On Mon, Feb 8, 2010 at 4:25 PM, Angie Hinrichs <[email protected]> wrote:
> Hi Danielle,
>
>> It has been suggested in the archives that "netFilter -syn" can be
>> used to do this and elsewhere in the archives there's an indication
>> that it has been "tuned" to human-mouse.  What does that mean?  Will
>> it still work for other mammal pairs?
>
> Yes, that is the aim of "netFilter -syn".  I will describe how it
> works in as much detail as I can, and hopefully that will be enough
> information for you to decide whether it will suit your purposes.
>
> "netFilter -syn" uses these parameters described in the usage (run
> just "netFilter" with no arguments to see the full usage):
>
>   -syn        - do filtering based on synteny (tuned for human/mouse).
>   -minTopScore=N - Minimum score for top level alignments. default 300000
>   -minSynScore=N - Min syntenic block score (def=200,000).
>                      Default covers 27,000 bases including 9,000
>                      aligning--a very stringent requirement.
>   -minSynSize=N - Min syntenic block size (def=20,000). -
>   -minSynAli=N  - Min syntenic alignment size(def=10,000). -
>   -maxFar=N     - Max distance to allow synteny (def=200,000).
>
> The -syn filter has multiple stages:
>
> First, a net passes the -syn filter if its score is at least
> minSynScore, its size on the reference genome is at least minSynSize,
> and its alignment size (number of bases in gapless aligned blocks in
> chain) is at least minSynAli.
>
> If those criteria are not satisfied, but the net has been classified
> as "top" by the netSyntenic program, and its score is at least
> minTopScore, it passes.
>
> If the net has been classified as "nonSyn" by netSyntenic, it fails.
>
> If the net's distance from the position predicted by its parent net
> is greater than maxFar, it fails.
>
> Finally, if the net has not met any of the above criteria, it passes.
> The source code has this comment:
> /* For all others, assume syntenic.  This keeps tandem dupes, small inversion,
>  * and translocations. */
>
> The source tree has a description of the .net format that may provide
> insight into some of these parameters ("ali" is compared to minSynAli,
> "qFar" is compared to maxFar):  kent/src/hg/mouseStuff/netFormat.doc
>
> Source code for the chain* and net* programs is also in mouseStuff,
> and ultimately that is where to look for answers about the detailed
> workings of the programs.  (Or, as you have done, you can ask the
> genome list for a reading of the entrails of the code :)
>
> I don't know the process by which the parameters were tuned -- perhaps
> Jim Kent can comment.
>
> My hunch is that it will probably do a decent job with other pairs of
> mammalian genomes; I suggest running netFilter on a downloaded .net
> file, using netToBed to make a custom track, and then visually
> comparing the custom track to the net track in a handful of regions --
> do the retained and excluded parts of the net seem reasonable?  If
> not, is there a netFilter parameter that could be changed to get the
> sensitivity / specificity that you're looking for?  I guess that would
> be a tuning process, by subjectively evaluating the output of a very
> complex quantitative system.
>
>
>> Also, is it possible to specify the permissible gap size?  For
>> example, let's say there are two level 1 nets that are separated by a
>> 100bp or less and I want to join them, is that possible with the
>> program?
>
> netFilter can gloss over gaps within a net, but not between top-level
> nets.  But hopefully there wouldn't be cases of adjacent top-level
> nets that should have been joined -- that should have happened at the
> chaining step.  The command line arg that you want is -minGap=100:
>
>   -minGap=N  - restrict to those with gap size (tSize) >= minSize
>
> -- that should read "restrict gaps to those with size (tSize) >=
> minGap".
>
> Hope that helps,
>
> Angie
>
> ----- "Danielle Lemay" <[email protected]> wrote:
>
>> From: "Danielle Lemay" <[email protected]>
>> To: [email protected]
>> Sent: Saturday, February 6, 2010 3:21:28 PM GMT -08:00 US/Canada Pacific
>> Subject: [Genome] netFilter to produce syntenic regions
>>
>> Hi,
>>
>> This question is regarding the netFilter program.
>>
>> Given a pair of mammalian genomes (e.g. mouse and opossum), I'd like
>> to produce a list of all syntenic regions.  This is presumably a
>> combination of the level 1 net with the lower level nets that are
>> syntenic.
>>
>> It has been suggested in the archives that "netFilter -syn" can be
>> used to do this and elsewhere in the archives there's an indication
>> that it has been "tuned" to human-mouse.  What does that mean?  Will
>> it still work for other mammal pairs?
>>
>> Also, is it possible to specify the permissible gap size?  For
>> example, let's say there are two level 1 nets that are separated by a
>> 100bp or less and I want to join them, is that possible with the
>> program?
>>
>> Thanks,
>> Danielle
>>
>> ########################################
>> Danielle G. Lemay, PhD
>> Postdoctoral Scholar, Bruce German's Lab
>> Food Science and Technology Department
>> University of California at Davis
>> (530) 297 7688
>> _______________________________________________
>> Genome maillist  -  [email protected]
>> https://lists.soe.ucsc.edu/mailman/listinfo/genome
>



-- 
########################################
Danielle G. Lemay, PhD
Postdoctoral scholar, German Lab
Food Science and Technology Department
University of California at Davis
(530) 297 7688

_______________________________________________
Genome maillist  -  [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome

Reply via email to