Peter C. wrote:
Could I suggest you treat known input formats which are not supported
as output formats a little differently and instead of this:

unknown output format 'genpept'

Perhaps give,

format 'genpept' is not supported for output (only input)

This would help the user rule out having a typo etc.

A useful suggestion. We can apply that to feature formats too. I'll see what I can do.
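
To make the suggestion concrete, here is a minimal sketch of the two-step check in Python. The format tables and the function name `output_format_error` are hypothetical and for illustration only; they are not the real EMBOSS format lists or code.

```python
# Hypothetical format tables, for illustration only (not the real EMBOSS lists).
INPUT_FORMATS = {"genbank", "genpept", "embl", "fasta"}
OUTPUT_FORMATS = {"genbank", "embl", "fasta"}


def output_format_error(name):
    """Return an error message for a requested output format, or None if it is usable."""
    key = name.lower()
    if key in OUTPUT_FORMATS:
        return None
    if key in INPUT_FORMATS:
        # Known format, but read-only: tell the user it is not a typo.
        return f"format '{name}' is not supported for output (only input)"
    # Not known in either direction: most likely a typo.
    return f"unknown output format '{name}'"
```

With these made-up tables, 'genpept' gets the "only input" message, while a misspelling such as 'genpetp' still gets the "unknown output format" message.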

It may also be worth tidying up what we do with formats that are only valid for nucleotide or protein (though that is a little tricky, as we currently try to let some fail over to an equivalent format).

Does Biopython have a definition of the fields it expects to write out in a
GenPept or RefseqP format file? We would be able to allow GenBank as an
alias for, presumably, genpept.

Not explicitly, no. I was hoping to use EMBOSS for cross validation ;)

No problem. We'll go first then and try to define standard formats.

With hindsight this may have been a mistake, but we use "genbank"
format to mean either nucleotides or proteins. On parsing we just
look at the units of length in the LOCUS line (bp or aa). We also
try to cope with both the current NCBI files and some older variants
we have in our unit tests (different offsets in the LOCUS line).
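
As a rough illustration of that approach (not the actual Biopython parser), the sketch below splits the LOCUS line on whitespace instead of relying on fixed column offsets, then looks for a "bp" or "aa" token following the length. The function name `locus_molecule_kind` and the sample lines in the usage note are made up for this example.

```python
def locus_molecule_kind(locus_line):
    """Guess 'nucleotide' or 'protein' from the length units on a LOCUS line."""
    tokens = locus_line.split()
    if not tokens or tokens[0] != "LOCUS":
        raise ValueError("not a LOCUS line")
    # Walk adjacent token pairs after the keyword, looking for "<digits> bp"
    # or "<digits> aa". Splitting on whitespace sidesteps column offsets.
    for previous, token in zip(tokens[1:], tokens[2:]):
        if previous.isdigit() and token in ("bp", "aa"):
            return "nucleotide" if token == "bp" else "protein"
    raise ValueError("no 'bp' or 'aa' length units found on LOCUS line")
```

For instance, a made-up line such as "LOCUS       EXAMPLE1     5028 bp    DNA     linear   PLN 01-JAN-2000" would be classed as nucleotide, and one with "317 aa" as protein, whatever the spacing between fields.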

We try that too on input, but for output we have to be explicit so the user can pick just one of the choices.

regards,

Peter R.
_______________________________________________
EMBOSS mailing list
EMBOSS@lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/emboss
