Re: [COOT] Coot fails to read numeric chain IDs

2009-06-04 Thread Paul Emsley

Kevin Cowtan wrote:
Coot reads files with duplicate atom numbers just fine (at least
0.6pre-latest does).

However, Coot will ignore any lines beginning
ATOM 1?????
ATOM 2?????
etc.
because those are not valid atom records. That is why these atoms
are being ignored.
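
To illustrate why such lines are rejected, here is a minimal Python sketch of a fixed-column check. It is not the mmdb code; the function names are made up for this example, and the only thing taken from the PDB format is the column layout (record name in columns 1-6, atom serial in columns 7-11). A six-digit serial written so that it spills into column 6 changes the record name itself.

```python
# Sketch only: shows how a fixed-column reader sees a record whose
# serial number has overflowed its field. Not taken from mmdb.

def record_name(line: str) -> str:
    """Return the record-name field (columns 1-6) exactly as written."""
    return line[:6]

def is_coordinate_record(line: str) -> bool:
    """True only if columns 1-6 hold a recognised coordinate record name."""
    return record_name(line) in ("ATOM  ", "HETATM")

# Five-digit serial: fits in columns 7-11, record name is intact.
ok_line = "ATOM  " + f"{99999:5d}" + "  O   LEU A 736"

# Six-digit serial pushed one column to the left, eating into column 6.
bad_line = "ATOM " + f"{246960:6d}" + "  O   LEU 7 736"

print(repr(record_name(ok_line)), is_coordinate_record(ok_line))
print(repr(record_name(bad_line)), is_coordinate_record(bad_line))
```

In the second case the record name read from columns 1-6 is "ATOM 2", which a strict reader has no reason to treat as a coordinate record.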


Yes, we should say that this is handled by the mmdb library from Eugene 
Krissinel.


The original poster could have expedited the analysis somewhat by 
quoting the console log which should have reproduced the problematic 
lines and line numbers.


Paul.

(p.s. back now, less out of touch)


Re: [COOT] Coot fails to read numeric chain IDs

2009-06-03 Thread Garib Murshudov

On 3 Jun 2009, at 09:19, Kevin Cowtan wrote:


Hi!

Hybrid-36 (initial idea from Ralph Grosse Kunstleve, I think) works  
around this problem by allowing higher numbered atoms to be numbered  
in base 36 starting from A. CCP4 v6.1.1 and recent versions of  
Coot support this, as does phenix.refine.

http://cci.lbl.gov/hybrid_36/

However, last time I checked, it wasn't available in refmac. I think
refmac ignores atom numbers on read, and writes out its own atom
numbers sequentially, looping at 100000. As a result, you cannot
read a refmac output PDB file with more than 100000 atoms into Coot or
any other CCP4 application without somehow renumbering the atoms
first.


Garib: Have I got that right? Is that still the case in recent  
versions?


I am working on this when I have time. However, I do not think atom
numbers carry any useful information (except perhaps for CONECT
records).



Garib


Re: [COOT] Coot fails to read numeric chain IDs

2009-06-03 Thread Kevin Cowtan

I thought it did. But apparently it doesn't.

None of this is Coot code, it's CCP4 library code, so it's beyond our 
control.


Phil Evans wrote:
why does Coot ignore these atoms renumbered back to 1 beyond 1? 
Shouldn't it just ignore the irrelevant atom number?

Phil


Re: [COOT] Coot fails to read numeric chain IDs

2009-06-03 Thread Phil Evans
why does Coot ignore these atoms renumbered back to 1 beyond 1?  
Shouldn't it just ignore the irrelevant atom number?

Phil


Re: [COOT] Coot fails to read numeric chain IDs

2009-06-03 Thread Kevin Cowtan

Hi!

Hybrid-36 (initial idea from Ralph Grosse Kunstleve, I think) works 
around this problem by allowing higher numbered atoms to be numbered in 
base 36 starting from A. CCP4 v6.1.1 and recent versions of Coot 
support this, as does phenix.refine.

http://cci.lbl.gov/hybrid_36/
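
For readers who have not met hybrid-36 before, the following Python sketch shows the idea for non-negative values. It is written for this note from the published description of the scheme and is not the reference implementation from the URL above; the helper name _base36 is just for this example. Values that fit in the field are written in decimal; beyond that, the field continues in base 36, first with upper-case digits starting at "A", then with lower-case digits.

```python
# Sketch of hybrid-36 for non-negative values; not the reference code.

UPPER = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
LOWER = UPPER.lower()

def _base36(value: int, digits: str) -> str:
    """Encode a non-negative integer in base 36 using the given digit set."""
    if value == 0:
        return digits[0]
    out = []
    while value > 0:
        value, rem = divmod(value, 36)
        out.append(digits[rem])
    return "".join(reversed(out))

def hy36encode(width: int, value: int) -> str:
    """Encode value into a field of the given width (5 for atom serials)."""
    if value < 0:
        raise ValueError("negative values are not handled in this sketch")
    if value < 10 ** width:
        return f"{value:{width}d}"          # plain decimal, right-justified
    value -= 10 ** width
    block = 26 * 36 ** (width - 1)          # size of each base-36 range
    offset = 10 * 36 ** (width - 1)         # makes the first digit a letter
    if value < block:
        return _base36(value + offset, UPPER)
    value -= block
    if value < block:
        return _base36(value + offset, LOWER)
    raise ValueError("value out of range for this field width")

def hy36decode(width: int, field: str) -> int:
    """Inverse of hy36encode for fields produced by it."""
    text = field.strip()
    first = text[0]
    if first.isdigit():
        return int(text)
    offset = 10 * 36 ** (width - 1)
    result = int(text, 36) - offset + 10 ** width
    if first.islower():
        result += 26 * 36 ** (width - 1)
    return result

print(hy36encode(5, 99999), hy36encode(5, 100000), hy36encode(5, 246960))
```

With a width of 5, serial 100000 becomes A0000, so a file with 246960 atoms still fits in the usual five columns.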

However, last time I checked, it wasn't available in refmac. I think
refmac ignores atom numbers on read, and writes out its own atom numbers
sequentially, looping at 100000. As a result, you cannot read a refmac
output PDB file with more than 100000 atoms into Coot or any other CCP4
application without somehow renumbering the atoms first.


Garib: Have I got that right? Is that still the case in recent versions?

Kevin


Re: [COOT] Coot fails to read numeric chain IDs

2009-06-02 Thread Ethan Merritt
On Tuesday 02 June 2009 15:00:23 Edward Miller wrote:
> I see now that the last atom COOT reads in is atom number 99999.
>
>
> Perhaps, I fear, this is a problem with the PDB format - that the column
> width for atom numbers can not accommodate more than 99999 atoms.
> 
> Ed
> 

Yes. This is one of many limitations of the PDB format.
We've had this discussion before, and the rational option (abandon it 
in favor of something better) has been nixed by the Powers That Be.

Note that the current PDB standard
 http://www.wwpdb.org/documentation/format32/sect9.html
states
 "If a collection contains more than 99,999 total atoms, 
  then more than one entry must be made"

And, in fact, the PDB splits such large models into more than one
PDB file upon deposition if you haven't already done so beforehand.

Nevertheless, many [most?] programs deal happily with this nonsense by
ignoring the sequence field altogether.  What earthly good does it do
you to know what was the sequence number for atom NZ of Lys Z34 in
a particular version of your model, given that it was almost certainly
different after being processed by the next program used in the course
of refinement or model-fitting?

The situation is somewhat different for non-protein atoms, since the
sequence number is used by the CONECT records.

So I'll make a modest proposal to give all protein atoms sequence
number 1, and leave the other 99,998 available integers for ligands :-)

-- 
Ethan A Merritt
Biomolecular Structure Center
University of Washington, Seattle 98195-7742


Re: [COOT] Coot fails to read numeric chain IDs

2009-06-02 Thread Edward Miller
Additionally, since this was asked before, I am running COOT 0.5.2

Ed


Re: [COOT] Coot fails to read numeric chain IDs

2009-06-02 Thread Edward Miller
Hey Paul,

I believe I have figured out the problem - but not the solution.

On closer inspection, it wasn't just the numerically labelled chain IDs
that weren't read, a few of the alphabetical chains were also not read.

What happens is that REFMAC restarts atom numbering once 100000 atoms are
reached when outputting the refined coordinates.

COOT was reading in the first 99999 atoms, numbered 1-99999 by REFMAC, then
the second 99999 atoms, again numbered 1-99999 by REFMAC. At this point,
however, REFMAC for some reason made the next atom 100000 and then restarted
numbering at 1. This atom, numbered 100000, was not read by COOT, and nor
were any subsequent atoms.

The input coordinates that I used for REFMAC, which again consisted of 60
chains, were created by combining sets of 10mers, thus the atom numbering
restarted after each 10 chains. REFMAC had no problem refining these
coordinates and COOT had no problem reading these coordinates.

To further confirm that COOT is failing once it reads atom 100000, I wrote a
little perl script to properly renumber the atoms in the REFMAC refined
coordinates. In this file, there are correctly 246960 atoms.

Within my perl script, I formatted the output such that the atom numbers eat
into the spaces after "ATOM" like so:

ATOM 246960  O   LEU 7 736  64.294  96.393-111.576  1.00 34.97


I see now that the last atom COOT reads in is atom number 99999.


Perhaps, I fear, this is a problem with the PDB format - that the column
width for atom numbers can not accommodate more than 99999 atoms.
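
For comparison, a renumbering pass that keeps the serial strictly inside columns 7-11 could look roughly like the Python sketch below. This is not the perl script mentioned above, and the function names are invented for the example. By default it refuses serials that need more than five digits instead of letting them spill into the record name; the encode argument is a hook where a different five-character encoding (hybrid-36, for instance) could be supplied. It does not touch CONECT records, which would need updating separately if they are present.

```python
# Sketch: renumber ATOM/HETATM serials sequentially without ever
# writing outside columns 7-11. Names are illustrative only.
from typing import Callable, Iterable, List

COORD_RECORDS = ("ATOM  ", "HETATM")

def decimal_serial(serial: int) -> str:
    """Format a serial as five right-justified decimal characters."""
    if not 0 < serial <= 99999:
        raise ValueError(f"serial {serial} does not fit in columns 7-11")
    return f"{serial:5d}"

def renumber_atoms(
    lines: Iterable[str],
    encode: Callable[[int], str] = decimal_serial,
) -> List[str]:
    """Return the lines with coordinate records renumbered from 1."""
    out = []
    serial = 0
    for line in lines:
        if line[:6] in COORD_RECORDS:
            serial += 1
            # Replace only columns 7-11; everything else is left untouched.
            line = line[:6] + encode(serial) + line[11:]
        out.append(line)
    return out

example = [
    "ATOM      1  N   ALA A   1",
    "ATOM      1  CA  ALA A   1",   # duplicate serial, as after a restart
    "TER",
]
for renumbered in renumber_atoms(example):
    print(renumbered)
```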

Ed

On Tue, Jun 2, 2009 at 9:39 AM, Paul Emsley wrote:

> Edward Miller wrote:
>
>> Hey Folks,
>>
>> I'm working on refining a full capsid containing 60 chains.
>>
>> Using the chain ID column, I've numbered my chains A-Z, a-z, and 0-7. I
>> was successfully able to refine my capsid in refmac using this chain ID
>> naming scheme.
>>
>> However, coot fails to read in any of the chains labeled 0-7.
>>
>
> I don't know why this is failing - I will try to make a synthetic PDB file
> with 60 chains to see if I can reproduce this.
>
> Relatively recently Eugene has extended mmdb to  work with the hybrid_36
> format.  So for now you could go that route.
>
> Paul.
>


Re: [COOT] Coot fails to read numeric chain IDs

2009-06-02 Thread Paul Emsley

Edward Miller wrote:

Hey Folks,

I'm working on refining a full capsid containing 60 chains.

Using the chain ID column, I've numbered my chains A-Z, a-z, and 0-7. 
I was successfully able to refine my capsid in refmac using this chain 
ID naming scheme.


However, coot fails to read in any of the chains labeled 0-7.


I don't know why this is failing - I will try to make a synthetic PDB 
file with 60 chains to see if I can reproduce this.
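
A synthetic test file of this kind can be produced with a few lines of Python. The sketch below is only one way to do it and is not what was actually used; the function names, the single CA atom per chain, and the coordinate, occupancy and B-factor values are all arbitrary choices for the example. The chain IDs follow the scheme described above (A-Z, a-z, 0-7).

```python
# Sketch: write a minimal PDB file with 60 one-atom chains, using the
# chain IDs A-Z, a-z and 0-7. All values are arbitrary test data.
import string

CHAIN_IDS = string.ascii_uppercase + string.ascii_lowercase + "01234567"

def ca_record(serial: int, chain: str, resseq: int,
              x: float, y: float, z: float) -> str:
    """Format one CA atom of an alanine as a fixed-column ATOM record."""
    return (
        "ATOM  "                       # columns 1-6: record name
        f"{serial:5d}"                 # columns 7-11: atom serial
        " "
        " CA "                         # columns 13-16: atom name
        " "                            # column 17: alternate location
        "ALA"                          # columns 18-20: residue name
        " "
        f"{chain}"                     # column 22: chain ID
        f"{resseq:4d}"                 # columns 23-26: residue number
        "    "
        f"{x:8.3f}{y:8.3f}{z:8.3f}"    # columns 31-54: coordinates
        f"{1.0:6.2f}{20.0:6.2f}"       # occupancy and B-factor
        "          "
        " C"                           # columns 77-78: element
    )

def synthetic_pdb_lines() -> list:
    """One CA atom per chain, spaced 10 A apart along x, then END."""
    lines = [
        ca_record(i + 1, chain, 1, 10.0 * i, 0.0, 0.0)
        for i, chain in enumerate(CHAIN_IDS)
    ]
    lines.append("END")
    return lines

if __name__ == "__main__":
    print("\n".join(synthetic_pdb_lines()))
```

Redirecting the output to a file gives something small enough to inspect by eye while still exercising every chain ID.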


Relatively recently Eugene has extended mmdb to  work with the hybrid_36 
format.  So for now you could go that route.


Paul.


[COOT] Coot fails to read numeric chain IDs

2009-06-01 Thread Edward Miller
Hey Folks,

I'm working on refining a full capsid containing 60 chains.

Using the chain ID column, I've numbered my chains A-Z, a-z, and 0-7. I was
successfully able to refine my capsid in refmac using this chain ID naming
scheme.

However, coot fails to read in any of the chains labeled 0-7.

Attempts to exploit the "Use SEGIDs..." extension fail to work as coot
doesn't even display chains 0-7 to begin permitting a swap with SEGIDs.

Would anyone have any ideas how to get around this problem?


Cheers,

Ed