Dear Gerard,

One possible programmatic approach is to loop over all (model, map,
resolution) or (model, map, half_map1, half_map2, resolution) sets from
PDB/EMDB, calling for each:

phenix.validation_cryoem model.<pdb or cif> emdb_xxxx.map resolution=value > xxxx.log
or
phenix.validation_cryoem model.<pdb or cif> emdb_xxxx.map half_map1.map half_map2.map resolution=value > xxxx.log
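
The loop described above could be sketched in Python roughly as follows.
This is only a minimal sketch: the entry dictionary keys (id, model, map,
resolution, half_maps) and the logs directory are hypothetical names chosen
for illustration, and it assumes phenix.validation_cryoem is on your PATH.
Only the command line itself is taken from the examples above.

```python
import subprocess
from pathlib import Path


def build_command(model, full_map, resolution, half_maps=()):
    """Return the argument list for one phenix.validation_cryoem run."""
    cmd = ["phenix.validation_cryoem", str(model), str(full_map)]
    cmd += [str(h) for h in half_maps]       # optional half_map1, half_map2
    cmd.append(f"resolution={resolution}")
    return cmd


def run_all(entries, log_dir="logs"):
    """Run every entry, redirecting output to <log_dir>/<id>.log.

    `entries` is a list of dicts with hypothetical keys:
    id, model, map, resolution and (optionally) half_maps.
    """
    Path(log_dir).mkdir(exist_ok=True)
    for entry in entries:
        cmd = build_command(
            entry["model"],
            entry["map"],
            entry["resolution"],
            entry.get("half_maps", ()),
        )
        with open(Path(log_dir) / f"{entry['id']}.log", "w") as log:
            # check=False: one failing entry should not stop the whole loop;
            # the error text ends up in that entry's log file.
            subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT, check=False)
```

How the list of entries is assembled (downloading from PDB/EMDB, looking up
the reported resolution) is left out here on purpose.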

The log file will contain overall model statistics such as:

DEVIATIONS FROM IDEAL VALUES.
  BOND      :  0.031   1.220   4166
  ANGLE     :  3.271 102.317   5611
  CHIRALITY :  0.085   0.434    629
  PLANARITY :  0.018   0.270    730
  DIHEDRAL  : 28.082 179.891   1579
  MIN NONBONDED DISTANCE : 1.223

MOLPROBITY STATISTICS.
  ALL-ATOM CLASHSCORE : 20.08
  RAMACHANDRAN PLOT:
    OUTLIERS :  3.29 %
    ALLOWED  :  9.69 %
    FAVORED  : 87.02 %
  ROTAMER OUTLIERS : 24.26 %
  CBETA DEVIATIONS :  0.00 %
  PEPTIDE PLANE:
    CIS-PROLINE     : 0.00 %
    CIS-GENERAL     : 1.59 %
    TWISTED PROLINE : 0.00 %
    TWISTED GENERAL : 1.39 %

and model-to-map fit:

  CC_mask  : 0.8818
  CC_box   : 0.7657

It also checks the sanity of HELIX/SHEET records.

Parsing these logs and taking the top N entries with the worst values of
the metric you care about (clashscore, map-model CC, or anything else you
choose) will likely give you some good candidates for your tests. Of course,
this won't help you with local modeling issues such as a wrong register.
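
As a rough illustration of that parsing step, here is a minimal Python
sketch. The labels in the regular expressions are copied from the log
excerpt above; the function names, the metric names used as dictionary keys
and the choice of which metrics to extract are my assumptions for this
example, and the real log layout may differ between versions.

```python
import re

NUMBER = r"(-?\d+(?:\.\d+)?)"

# Labels taken from the log excerpt; spacing around ':' is allowed to vary.
PATTERNS = {
    "clashscore": re.compile(r"ALL-ATOM CLASHSCORE\s*:\s*" + NUMBER),
    # Anchored at line start so "ROTAMER OUTLIERS" is not picked up here.
    "rama_outliers": re.compile(r"^\s*OUTLIERS\s*:\s*" + NUMBER, re.MULTILINE),
    "rotamer_outliers": re.compile(r"ROTAMER OUTLIERS\s*:\s*" + NUMBER),
    "cc_mask": re.compile(r"CC_mask\s*:\s*" + NUMBER),
}


def parse_log(text):
    """Return {metric name: float} for every metric found in one log."""
    found = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        if match:
            found[name] = float(match.group(1))
    return found


def worst(entries, metric, n=10, higher_is_worse=True):
    """Return the ids of the n worst entries for one metric.

    `entries` maps an entry id to the dict returned by parse_log().
    Use higher_is_worse=False for metrics such as cc_mask.
    """
    scored = [(vals[metric], key) for key, vals in entries.items() if metric in vals]
    scored.sort(reverse=higher_is_worse)
    return [key for _, key in scored[:n]]
```

For example, worst(entries, "cc_mask", n=20, higher_is_worse=False) would
list the 20 entries with the lowest CC_mask.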

Instead of dealing with text log files, you can use the pickle=true keyword
in the above commands. A Python pickle file containing all the information
will then be created for each run. Once you have iterated over all entries,
you can load these pickle files and harvest the information you need.
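
A minimal sketch of the loading step is below. The file name pattern
(*.pkl) and the helper names are assumptions made for illustration, and the
internal layout of the pickled object is not described here, so the sketch
only loads each file and lists what it contains. Unpickling Phenix objects
will most likely require running the script with Phenix's own interpreter
(phenix.python) so that the relevant classes can be imported.

```python
import pickle
from pathlib import Path


def load_results(directory, pattern="*.pkl"):
    """Load every pickle in `directory`; return {file stem: object}.

    The "*.pkl" pattern is a guess; adjust it to the actual file names.
    """
    results = {}
    for path in sorted(Path(directory).glob(pattern)):
        with open(path, "rb") as handle:
            results[path.stem] = pickle.load(handle)
    return results


def describe(obj):
    """List dict keys or public attribute names, to explore an unknown object."""
    if isinstance(obj, dict):
        return sorted(str(key) for key in obj)
    return sorted(name for name in dir(obj) if not name.startswith("_"))
```

Calling describe() on one loaded object is a quick way to find where the
statistics of interest are stored before writing the harvesting code.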

Also, you may find some examples here (though probably not many individual
ones):
http://journals.iucr.org/d/issues/2018/09/00/kw5139/index.html

I can help more with details, if needed.

Good luck!
Pavel

On Tue, Dec 10, 2019 at 5:26 AM Gerard DVD Kleywegt <ger...@xray.bmc.uu.se>
wrote:

> Dear colleagues,
>
> We are developing a new validation method that takes EM maps and models
> into account. In order to understand the potential, the applicability and
> the limitations of the method we are looking for good test cases, which
> turn out to be surprisingly hard to find.
>
> What we are looking for are EM structures with serious modelling errors,
> e.g. register errors (model sequence out-of-register with the map),
> connectivity errors (between (non-)adjacent secondary structure elements),
> directionality errors (e.g., a helix built backwards), substantial
> stretches of mistraced residues, etc.
>
> The structures should be publicly available in PDB (model) and EMDB (map),
> they should be at 4.0Å resolution or better, and ideally there should be a
> higher resolution improved structure (EM or X-ray).
>
> If the modelling errors have been confirmed and discussed in the
> literature that is ideal, but we are also interested if this is not the
> case.
>
> Feel free to reply either in confidence to me (mailto:ger...@ebi.ac.uk),
> or to the entire list.
>
> Many thanks in advance for any examples you can think of!
>
> PS: apologies for cross-posting to two other mailing lists.
>
> --Gerard
>
> ---
> Gerard J. Kleywegt, EMBL-EBI, Hinxton, UK
> Head of Molecular and  Cellular Structure
> ger...@ebi.ac.uk pdbe.org emdb-empiar.org
> PA: Roisin Dunlop    pdbe_ad...@ebi.ac.uk
>
> ########################################################################
>
> To unsubscribe from the CCP4BB list, click the following link:
> https://www.jiscmail.ac.uk/cgi-bin/webadmin?SUBED1=CCP4BB&A=1
