This is an automated email from the git hooks/post-receive script. afif-guest pushed a commit to branch master in repository dazzdb.
commit a92d5776a1c686e194de0a71b5cc0590a0144aac Author: Afif Elghraoui <a...@ghraoui.name> Date: Sun Sep 20 02:50:40 2015 -0700 touch-up manpages --- debian/man/Catrack.1.md | 20 +++++++++++--------- debian/man/DAM2fasta.1.md | 18 +++++++++++------- debian/man/DB2quiva.1.md | 25 +++++++++++++------------ debian/man/DBdust.1.md | 26 ++++++++++++++------------ debian/man/DBrm.1.md | 16 +++++++++------- debian/man/DBshow.1.md | 36 +++++++++++++++++++----------------- debian/man/DBsplit.1.md | 24 +++++++++++++----------- debian/man/DBstats.1.md | 24 +++++++++++++----------- debian/man/fasta2DAM.1.md | 13 ++++++++----- debian/man/quiva2DB.1.md | 2 +- debian/man/simulator.1.md | 36 +++++++++++++++++++----------------- 11 files changed, 131 insertions(+), 109 deletions(-) diff --git a/debian/man/Catrack.1.md b/debian/man/Catrack.1.md index 41adc6a..2d5fdc3 100644 --- a/debian/man/Catrack.1.md +++ b/debian/man/Catrack.1.md @@ -1,20 +1,22 @@ -% (1) 1.0 +% CATRACK(1) 1.0 % % September 2015 # NAME -# SYNOPSIS +Catrack - merge dazzler block tracks -# DESCRIPTION +# SYNOPSIS -Catrack [-v] <path:db|dam> <track:name> +**Catrack** [**-v**] *path:db|dam* *track:name* -Find all block tracks of the form .<path>.#.<track>... and merge them into a single -track, .<path>.<track>..., for the given DB or DAM. The block track files must all -encode the same kind of track data (this is checked), and the files must exist for -block 1, 2, 3, ... up to the last block number. +# DESCRIPTION -# OPTIONS +Find all block tracks of the form .*path*.#.*track*... and merge them into a +single track, .*path*.*track*..., for the given DB or DAM. The block track +files must all encode the same kind of track data (this is checked), and the +files must exist for block 1, 2, 3, ... up to the last block number. 
# SEE ALSO + +**daligner**(1) diff --git a/debian/man/DAM2fasta.1.md b/debian/man/DAM2fasta.1.md index aaac72c..8ec9a87 100644 --- a/debian/man/DAM2fasta.1.md +++ b/debian/man/DAM2fasta.1.md @@ -1,21 +1,25 @@ -% (1) 1.0 +% DAM2FASTA(1) 1.0 % % September 2015 # NAME +DAM2fasta - get fasta files from Dazzler map DB + # SYNOPSIS -DAM2fasta [-vU] [-w<int(80)>] <path:dam> +**DAM2fasta** [**-vU**] [**-w***int(80)*] *path:dam* # DESCRIPTION The set of .fasta files for the given map DB or DAM are recreated from the DAM exactly as they were input. That is, this is a perfect inversion, including the -reconstitution of the proper .fasta headers and the concatenation of contigs with -the proper number of N's between them. By default the output sequences are in lower -case and 80 chars per line. The -U option specifies upper case should be used, and -the characters per line, or line width, can be set to any positive value with -the -w option. +reconstitution of the proper .fasta headers and the concatenation of contigs +with the proper number of N's between them. By default the output sequences are +in lower case and 80 chars per line. The **-U** option specifies upper case +should be used, and the characters per line, or line width, can be set to any +positive value with the **-w** option. # SEE ALSO + +**daligner**(1) diff --git a/debian/man/DB2quiva.1.md b/debian/man/DB2quiva.1.md index 4b1a03b..2323f9d 100644 --- a/debian/man/DB2quiva.1.md +++ b/debian/man/DB2quiva.1.md @@ -1,24 +1,25 @@ -% (1) 1.0 +% DB2QUIVA(1) 1.0 % % September 2015 # NAME +DB2quiva - get .quiva files from a Dazzler database + # SYNOPSIS -DB2quiva [-vU] <path:db> +**DB2quiva** [**-vU**] *path:db* # DESCRIPTION -The set of .quiva files within the given DB are recreated from the DB exactly as they -were input. That is, this is a perfect inversion, including the reconstitution of the -proper .quiva headers. 
Because of this property, one can, if desired, delete the -.quiva source files once they are in the DB as they can always be recreated from it. -By .fastq convention each QV vector is output as a line without new-lines, and by -default the Deletion Tag entry is in lower case letters. The -U option specifies -upper case letters should be used instead. - - -# OPTIONS +The set of .quiva files within the given DB are recreated from the DB exactly +as they were input. That is, this is a perfect inversion, including the +reconstitution of the proper .quiva headers. Because of this property, one can, +if desired, delete the .quiva source files once they are in the DB as they can +always be recreated from it. By .fastq convention each QV vector is output as a +line without new-lines, and by default the Deletion Tag entry is in lower case +letters. The **-U** option specifies upper case letters should be used instead. # SEE ALSO + +**daligner**(1) diff --git a/debian/man/DBdust.1.md b/debian/man/DBdust.1.md index 383707d..3f8d957 100644 --- a/debian/man/DBdust.1.md +++ b/debian/man/DBdust.1.md @@ -1,32 +1,34 @@ -% (1) 1.0 +% DBDUST(1) 1.0 % % September 2015 # NAME +DBdust - description + # SYNOPSIS -# DESCRIPTION +**DBdust** [**-b**] [**-w***int(64)*] [**-t***double(2.)*] [**-m***int(10)*] *path:db|dam* -DBdust [-b] [-w<int(64)>] [-t<double(2.)>] [-m<int(10)>] <path:db|dam> +# DESCRIPTION -Runs the symmetric DUST algorithm over the reads in the untrimmed DB <path>.db or -<path>.dam producing a track .<path>.dust[.anno,.data] that marks all intervals of low -complexity sequence, where the scan window is of size -w, the threshold for being a -low-complexity interval is -t, and only perfect intervals of size greater than -m are -recorded. 
If the -b option is set then the definition of low complexity takes into +Runs the symmetric DUST algorithm over the reads in the untrimmed DB *path*.db or +*path*.dam producing a track .*path*.dust[.anno,.data] that marks all intervals of low +complexity sequence, where the scan window is of size **-w**, the threshold for being a +low-complexity interval is **-t**, and only perfect intervals of size greater than **-m** are +recorded. If the **-b** option is set then the definition of low complexity takes into account the frequency of a given base. The command is incremental if given a DB to which new data has been added since it was last run on the DB, then it will extend the track to include the new reads. It is important to set this flag for genomes with a strong AT/GC bias, albeit the code is a tad slower. The dust track, if present, -is understood and used by DBshow, DBstats, and dalign. +is understood and used by **DBshow**(1), **DBstats**(1), and **daligner**(1). -DBdust can also be run over an untriimmed DB block in which case it outputs a track +**DBdust** can also be run over an untriimmed DB block in which case it outputs a track encoding where the trace file names contain the block number, e.g. .FOO.3.dust.anno and .FOO.3.dust.data, given FOO.3 on the command line. We call this a *block track*. This permits job parallelism in block-sized chunks, and the resulting sequence of block tracks can then be merged into a track for the entire untrimmed DB with Catrack. -# OPTIONS - # SEE ALSO + +**daligner**(1) diff --git a/debian/man/DBrm.1.md b/debian/man/DBrm.1.md index c923cd9..05d0688 100644 --- a/debian/man/DBrm.1.md +++ b/debian/man/DBrm.1.md @@ -1,19 +1,21 @@ -% (1) 1.0 +% DBRM(1) 1.0 % % September 2015 # NAME +DBrm - delete a Dazzler database + # SYNOPSIS -# DESCRIPTION +**DBrm** *path:db|dam* ... -DBrm <path:db|dam> ... +# DESCRIPTION -Delete all the files for the given data bases. 
Do not use **rm**(1) to remove a database, as +Delete all the files for the given databases. Do not use **rm**(1) to remove a database, as there are at least two and often several secondary files for each DB including track -files, and all of these are removed by DBrm. - -# OPTIONS +files, and all of these are removed by **DBrm**. # SEE ALSO + +**daligner**(1) diff --git a/debian/man/DBshow.1.md b/debian/man/DBshow.1.md index c163fa9..cb54c8c 100644 --- a/debian/man/DBshow.1.md +++ b/debian/man/DBshow.1.md @@ -1,18 +1,20 @@ -% (1) 1.0 +% DBSHOW(1) 1.0 % % September 2015 # NAME +DBshow - display reads stored in a Dazzler database + # SYNOPSIS -# DESCRIPTION +**DBshow** [**-unqUQ**] [**-w***int(80)*] [**-m***track*]+ + *path:db|dam* [ *reads:FILE* | *reads:range* ... ] -DBshow [-unqUQ] [-w<int(80)>] [-m<track>]+ - <path:db|dam> [ <reads:FILE> | <reads:range> ... ] +# DESCRIPTION -Displays the requested reads in the database <path>.db or <path>.dam. By default the -command applies to the trimmed database, but if -u is set then the entire DB is used. +Displays the requested reads in the database *path*.db or *path*.dam. By default the +command applies to the trimmed database, but if **-u** is set then the entire DB is used. If no read arguments are given then every read in the database or database block is displayed. Otherwise the input file or the list of supplied integer ranges give the ordinal positions in the actively loaded portion of the db. In the case of a file, it @@ -24,23 +26,23 @@ represents the index of the last read in the actively loaded db. For example, 1 3-5 $ displays reads 1, 3, 4, 5, and the last read in the active db. As another example, 1-$ displays every read in the active db (the default). -By default a .fasta file of the read sequences is displayed. If the -q option is +By default a .fasta file of the read sequences is displayed. 
If the **-q** option is set, then the QV streams are also displayed in a non-standard modification of the -fasta format. If the -n option is set then the DNA sequence is *not* displayed. -If the -Q option is set then a .quiva file is displayed and in this case the -n -and -m options mayt not be set (and the -q and -w options have no effect). +fasta format. If the **-n** option is set then the DNA sequence is *not* displayed. +If the **-Q** option is set then a .quiva file is displayed and in this case the **-n** +and **-m** options may not be set (and the **-q** and **-w** options have no effect). -If one or more masks are set with the -m option then the track intervals are also +If one or more masks are set with the **-m** option then the track intervals are also displayed in an additional header line and the bases within an interval are displayed in the case opposite that used for all the other bases. By default the output -sequences are in lower case and 80 chars per line. The -U option specifies upper +sequences are in lower case and 80 chars per line. The **-U** option specifies upper case should be used, and the characters per line, or line width, can be set to any -positive value with the -w option. +positive value with the **-w** option. -The .fasta or .quiva files that are output can be converted into a DB by fasta2DB -and quiva2DB (if the -q and -n options are not set and no -m options are set), +The .fasta or .quiva files that are output can be converted into a DB by **fasta2DB**(1) +and **quiva2DB**(1) (if the **-q** and **-n** options are not set and no **-m** options are set), giving one a simple way to make a DB of a subset of the reads for testing purposes. 
-# OPTIONS - # SEE ALSO + +**daligner**(1) diff --git a/debian/man/DBsplit.1.md b/debian/man/DBsplit.1.md index 2a60e26..769b6db 100644 --- a/debian/man/DBsplit.1.md +++ b/debian/man/DBsplit.1.md @@ -1,28 +1,30 @@ -% (1) 1.0 +% DBSPLIT(1) 1.0 % % September 2015 # NAME +DBsplit - divide a Dazzler database into a series of blocks + # SYNOPSIS -# DESCRIPTION +**DBsplit** [**-a**] [**-x***int*] [**-s***int(200)*] *path:db|dam* - DBsplit [-a] [-x<int>] [-s<int(200)>] <path:db|dam> +# DESCRIPTION -Divide the database <path>.db or <path>.dam conceptually into a series of blocks -referable to on the command line as <path>.1, <path>.2, ... If the -x option is set -then all reads less than the given length are ignored, and if the -a option is not +Divide the database *path*.db or *path*.dam conceptually into a series of blocks +referable to on the command line as *path*.1, *path*.2, ... If the **-x** option is set +then all reads less than the given length are ignored, and if the **-a** option is not set then secondary reads from a given well are also ignored. The remaining reads, constituting what we call the trimmed DB, are split amongst the blocks so that each -block is of size -s * 1Mbp except for the last which necessarily contains a smaller -residual. The default value for -s is 200Mbp because blocks of this size can be -compared by our "overlapper" dalign in roughly 16Gb of memory. The blocks are very +block is of size **-s** `* 1Mbp` except for the last which necessarily contains a smaller +residual. The default value for **-s** is 200Mbp because blocks of this size can be +compared by our "overlapper" **daligner**(1) in roughly 16Gb of memory. The blocks are very space efficient in that their sub-index of the master .idx is computed on the fly when loaded, and the .bps and .qvs files (if a .db) of base pairs and quality values, respectively, is shared with the master DB. 
Any relevant portions of tracks associated with the DB are also computed on the fly when loading a database block. -# OPTIONS - # SEE ALSO + +**daligner**(1) diff --git a/debian/man/DBstats.1.md b/debian/man/DBstats.1.md index c240a33..6f041ec 100644 --- a/debian/man/DBstats.1.md +++ b/debian/man/DBstats.1.md @@ -1,23 +1,25 @@ -% (1) 1.0 +% DBSTATS(1) 1.0 % % September 2015 # NAME +DBstats - show statistics for reads in a Dazzler database + # SYNOPSIS -# DESCRIPTION +**DBstats** [**-nu**] [**-b***int(1000)*] [**-m***track*]+ *path:db|dam* -DBstats [-nu] [-b<int(1000)] [-m<track>]+ <path:db|dam> +# DESCRIPTION -Show overview statistics for all the reads in the trimmed data base <path>.db or -<path>.dam, including a histogram of read lengths where the bucket size is set -with the -b option (default 1000). If the -u option is given then the untrimmed -database is summarized. If the -n option is given then the histogran of read lengths -is not displayed. Any track such as a "dust" track that gives a seried of -intervals along the read can be specified with the -m option in which case a summary +Show overview statistics for all the reads in the trimmed data base *path*.db or +*path*.dam, including a histogram of read lengths where the bucket size is set +with the **-b** option (default 1000). If the **-u** option is given then the untrimmed +database is summarized. If the **-n** option is given then the histogram of read lengths +is not displayed. Any track such as a "dust" track that gives a series of +intervals along the read can be specified with the **-m** option in which case a summary and a histogram of the interval lengths is displayed. 
-# OPTIONS - # SEE ALSO + +**daligner**(1) diff --git a/debian/man/fasta2DAM.1.md b/debian/man/fasta2DAM.1.md index 2b64f93..582235a 100644 --- a/debian/man/fasta2DAM.1.md +++ b/debian/man/fasta2DAM.1.md @@ -1,21 +1,24 @@ -% (1) 1.0 +% FASTA2DAM(1) 1.0 % % September 2015 # NAME +fasta2DAM - build a Dazzler map database from fasta files + # SYNOPSIS -fasta2DAM [-v] <path:dam> ( -f<file> | <input:fasta> ... ) +**fasta2DAM** [**-v**] *path:dam* ( **-f***file* | *input:fasta* ... ) # DESCRIPTION Builds a map DB or DAM from the list of .fasta files following the map database name -argument, or if the -f option is used, the list of .fasta files in <file>. Any .fasta +argument, or if the **-f** option is used, the list of .fasta files in *file*. Any .fasta entry that has a run of N's in it will be split into separate "contig" entries and the interval of the contig in the original entry recorded. The header for each .fasta entry is saved with the contigs created from it. -# OPTIONS - # SEE ALSO + +**DAM2fasta**(1) +**daligner**(1) diff --git a/debian/man/quiva2DB.1.md b/debian/man/quiva2DB.1.md index 4d46eef..aa0d368 100644 --- a/debian/man/quiva2DB.1.md +++ b/debian/man/quiva2DB.1.md @@ -4,7 +4,7 @@ # NAME -quiva2DB - +quiva2DB - add .quiva files to a Dazzler database # SYNOPSIS diff --git a/debian/man/simulator.1.md b/debian/man/simulator.1.md index aec8a14..66fe0a9 100644 --- a/debian/man/simulator.1.md +++ b/debian/man/simulator.1.md @@ -1,37 +1,39 @@ -% SIMULATOR(1) 1.0 +% DSIMULATOR(1) 1.0 % % September 2015 # NAME +dsimulator - generate synthetic reads for a random genome + # SYNOPSIS +**dsimulator** *genlen:double* [**-c***double(20.)*] [**-b***double(.5)*] + [**-r***int*] [**-m***int(10000)*] [**-s***int(2000)*] + [**-x***int(4000)*] [**-e***double(.15)*] + [**-M***file*] + # DESCRIPTION -simulator <genlen:double> [-c<double(20.)>] [-b<double(.5)] [-r<int>] - [-m<int(10000)>] [-s<int(2000)>] - [-x<int(4000)>] [-e<double(.15)>] - [-M<file>] - -In addition to the DB 
commands we include here, somewhat tangentially, a simple -simulator that generates synthetic reads for a random genome. simulator first -generates a fake genome of size genlen*1Mb long, that has an AT-bias of -b. It then -generates sample reads of mean length -m from a log-normal length distribution with -standard deviation -s, but ignores reads of length less than -x. It collects enough -reads to cover the genome -c times and introduces -e fraction errors into each read +**dsimulator** first generates a fake genome of size *genlen*`*1Mb` long, that has an AT-bias of **-b**. It then +generates sample reads of mean length **-m** from a log-normal length distribution with +standard deviation **-s**, but ignores reads of length less than **-x**. It collects enough +reads to cover the genome **-c** times and introduces **-e** fraction errors into each read where the ratio of insertions, deletions, and substitutions are set by defined -constants INS_RATE (default 73%) and DEL_RATE (default 20%) within generate.c. One +constants `INS_RATE` (default 73%) and `DEL_RATE` (default 20%) within generate.c. One can also control the rate at which reads are picked from the forward and reverse -strands by setting the defined constant FLIP_RATE (default 50/50). The -r option seeds +strands by setting the defined constant `FLIP_RATE` (default 50/50). The **-r** option seeds the random number generator for the generation of the genome so that one can reproducibly generate the same underlying genome to sample from. If this parameter is missing, then the job id of the invocation seeds the random number generator. The output is sent to the standard output (i.e. it is a UNIX pipe). The output is in -Pacbio .fasta format suitable as input to fasta2DB. Finally, the -M option requests +Pacbio .fasta format suitable as input to **fasta2DB**(1). 
Finally, the **-M** option requests that the coordinates from which each read has been sampled are written to the indicated file, one line per read, ASCII encoded. This "map" file essentially tells one where every read belongs in an assembly and is very useful for debugging and testing -purposes. If a read pair is say b,e then if b < e the read was sampled from [b,e] in -the forward direction, and if b > e from [e,b] in the reverse direction. +purposes. If a read pair is say b,e then if `b < e` the read was sampled from [b,e] in +the forward direction, and if `b > e` from [e,b] in the reverse direction. # SEE ALSO + +**daligner**(1) -- Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/debian-med/dazzdb.git
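The b,e rule in the dsimulator(1) text above can be sketched as follows. This is only an illustration, not code from dazzdb: the names `SampledInterval` and `interpret_pair` are hypothetical, and the function takes two already-parsed integers because the man page does not spell out the exact layout of a map-file line beyond "one line per read, ASCII encoded". The case `b == e` is not covered by the text, so the sketch rejects it rather than guess.

```python
from typing import NamedTuple


class SampledInterval(NamedTuple):
    """Hypothetical container for one decoded map entry."""
    start: int   # smaller coordinate of the sampled interval
    end: int     # larger coordinate of the sampled interval
    strand: str  # "+" for forward, "-" for reverse


def interpret_pair(b: int, e: int) -> SampledInterval:
    """Apply the rule from the man page text to one b,e pair.

    b < e: read sampled from [b,e] in the forward direction.
    b > e: read sampled from [e,b] in the reverse direction.
    """
    if b < e:
        return SampledInterval(b, e, "+")
    if b > e:
        return SampledInterval(e, b, "-")
    # Assumption: the man page says nothing about b == e, so refuse it.
    raise ValueError("b == e is not described by the man page text")


if __name__ == "__main__":
    print(interpret_pair(100, 250))
    print(interpret_pair(250, 100))
```

Both calls describe the same interval, [100,250], and differ only in the strand they report.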