Re: [ccp4bb] Introducing the UNTANGLE Challenge

2024-01-21 Thread Herbert J. Bernstein
Have you considered the impact of tunneling?  Your rope crossings are not
perfect barriers.

On Sat, Jan 20, 2024 at 6:09 PM James Holton  wrote:

> Update:
>
> I've gotten some feedback asking for clarity on what I mean by "tangled".
> I paste here a visual aid [figure, not preserved in this archive: two
> ropes anchored between floor and ceiling, hanging straight (left) versus
> crossed (right)]:
>
> The protein chains in an ensemble model are like these ropes. If these
> ropes are the same length as the distance from floor to ceiling, then
> straight up-and-down is the global minimum in energy (left). The anchor
> points are analogous to the rest of the protein structure, which is the
> same in both diagrams. Imagine for a moment, however, after anchoring the
> dangling rope ends to the floor you look up and see the ropes are actually
> crossed (right). You got the end points right, but no amount of pulling on
> the ropes (energy minimization) is going to get you from the tangled
> structure to the global minimum. The tangled ropes are also strained,
> because they are being forced to be a little longer than they want to be.
> This strain in protein models manifests as geometry outliers and the
> automatic weighting in your refinement program responds to bad geometry by
> relaxing the x-ray weight, which alleviates some of the strain, but
> increases your Rfree.
>
> The goal of this challenge is to eliminate these tangles, and do it
> efficiently. What we need is a topoisomerase! Something that can find the
> source of strain and let the ropes pass through each other at the
> appropriate place.  I've always wanted one of those for the wires behind my
> desk...
>
> More details on the origins of tangling in ensemble models can be found
> here:
> https://bl831.als.lbl.gov/~jamesh/challenge/twoconf/#tangle
>
> -James Holton
> MAD Scientist
>
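
A concrete way to quantify "tangled" for two chain traces is a discrete
Gauss linking number: near zero for unlinked loops, near +/-1 (or beyond)
for linked ones. A minimal sketch, assuming NumPy and treating each chain
as a closed polygonal loop; the helper below is illustrative and is not
part of the challenge materials or its scoring:

    import numpy as np

    def gauss_linking(curve1, curve2):
        # Midpoint-rule estimate of the Gauss linking integral for two
        # closed polygonal curves, given as (N, 3) vertex arrays.
        p = np.asarray(curve1, float)
        q = np.asarray(curve2, float)
        dp = np.roll(p, -1, axis=0) - p      # edge vectors, curve 1
        dq = np.roll(q, -1, axis=0) - q      # edge vectors, curve 2
        mp = p + 0.5 * dp                    # edge midpoints, curve 1
        mq = q + 0.5 * dq                    # edge midpoints, curve 2
        total = 0.0
        for t, m in zip(dp, mp):
            r = m - mq                       # midpoint separation vectors
            num = np.einsum('ij,ij->i', np.cross(t, dq), r)
            total += np.sum(num / np.linalg.norm(r, axis=1) ** 3)
        return total / (4.0 * np.pi)

    # Two interlocked rings (a Hopf link) give roughly +/-1:
    s = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
    ring1 = np.c_[np.cos(s), np.sin(s), np.zeros_like(s)]
    ring2 = np.c_[1.0 + np.cos(s), np.zeros_like(s), np.sin(s)]
    print(gauss_linking(ring1, ring2))

For the ropes described above, which are open chains anchored at both
ends, one would first close each chain through a path far from the other
before applying this test.
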
> On 1/18/2024 4:33 PM, James Holton wrote:
>
> Greetings Everybody,
>
> I present to you a Challenge.
>
> Structural biology would be far more powerful if we could get our models out
> of local minima, and together, I believe we can find a way to escape them.
>
> tldr: I dare any one of you to build a model that scores better than my
> "best.pdb" model below. That is probably impossible, so I also dare you to
> approach or even match "best.pdb" by doing something more clever than just
> copying it. Difficulty levels range from 0 to 11. The first one to match the
> best.pdb energy score and Rfree wins the challenge, and I'd like you to be
> on my paper. You have nine months.
>
> Details of the challenge, scoring system, test data, and available
> starting points can be found here:
> https://bl831.als.lbl.gov/~jamesh/challenge/twoconf/
>
> Why am I doing this?
> We all know that macromolecules adopt multiple conformations. That is how
> they function. And yet, ensemble refinement still has a hard time competing
> with conventional single-conformer-with-a-few-split-side-chain models when
> it comes to revealing correlated motions, or even just simultaneously
> satisfying density data and chemical restraints. That is, ensembles still
> suffer from the battle between R factors and geometry restraints. This is
> because the ensemble member chains cannot pass through each other, and get
> tangled. The tangling comes from the density, not the chemistry. Refinement
> in refmac, shelxl, phenix, simulated annealing, qFit, and even coot cannot
> untangle them.
>
> The good news is: knowledge of chemistry, combined with R factors, appears
> to be a powerful indicator of how near a model is to being untangled. What
> is really exciting is that the genuine, underlying ensemble cannot be
> tangled. The true ensemble _defines_ the density; it is not being fit to
> it. The more untangled a model gets the closer it comes to the true
> ensemble, with deviations from reasonable chemistry becoming easier and
> easier to detect. In the end, when all alternative hypotheses have been
> eliminated, the model must match the truth.
>
> Why can't we do this with real data? Because all ensemble models are
> tangled. Let's get to untangling them, shall we?
>
> To demonstrate, I have created a series of examples that are progressively
> more difficult to solve, but the ground truth model and density is the same
> in all cases. Build the right model, and it will not only explain the data
> to within experimental error, and have the best possible validation stats,
> but it will reveal the true, underlying cooperative motion of the protein
> as well.
>
> Unless, of course, you can prove me wrong?
>
> -James Holton
> MAD Scientist

Re: [ccp4bb] nearestcell

2023-12-27 Thread Herbert J. Bernstein
That is a live link to sauc, but I have not updated the database from the
PDB in a while.  Let me know if there is an
urgent need for an update; otherwise I am planning one for March.  I
normally build the database from COD and
the PDB, but if CCDC wishes, we can add the CSD to the mix.   For the
longer term, would anybody be
willing to provide an institutional home?

Regards,
Herbert
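
While such servers come and go, a crude local stand-in is easy to script:
compare reduced cells by Euclidean distance in the G6 representation. This
is a much blunter instrument than the NCDist-style metrics SAUC uses, and
it assumes both cells are already reduced; a minimal sketch with NumPy:

    import numpy as np

    def g6(cell):
        # (a, b, c, alpha, beta, gamma) in Angstroms/degrees -> G6 vector.
        a, b, c, al, be, ga = cell
        al, be, ga = np.radians([al, be, ga])
        return np.array([a * a, b * b, c * c,
                         2.0 * b * c * np.cos(al),
                         2.0 * a * c * np.cos(be),
                         2.0 * a * b * np.cos(ga)])

    def cell_distance(cell1, cell2):
        # Euclidean distance in G6; meaningful only for reduced cells.
        return float(np.linalg.norm(g6(cell1) - g6(cell2)))

    # Illustrative comparison of two similar reduced cells:
    print(cell_distance((82.82, 88.30, 82.82, 98.89, 110.52, 117.05),
                        (82.50, 88.00, 82.50, 99.00, 110.00, 117.00)))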

On Wed, Dec 27, 2023 at 1:47 PM Mike S  wrote:

> Hi Kay,
>
> http://flops.arcib.org:8084/sauc-1.1.1/
>
> similar to the nearest-cell server, which seems to change its link and is
> harder to find (or stale link)
> https://app.strubi.ox.ac.uk/nearest-cell/nearest-cell.cgi
>
> HTH,
> Mike
>
> On Wed, Dec 27, 2023 at 1:39 PM Kay Diederichs <
> kay.diederi...@uni-konstanz.de> wrote:
>
>> Dear all,
>>
>> I seem to remember a tool called "nearestcell", a command-line equivalent
>> of the Oxford Nearest-Cell web server which appears to be offline.
>> However I cannot locate that tool in CCP4 or elsewhere. Can anyone point
>> me to it or give alternatives, please? (I did try the PDB's advanced search
>> but it is not made for this purpose)
>>
>> Thanks, and a happy and successful 2024 to everybody!
>>
>> Kay


Re: [ccp4bb] open review?

2022-06-22 Thread Herbert J. Bernstein
Dear James,
  I think open reviews would be a major improvement in the grants review
process.  Most grant reviews are done carefully and honestly, but I have
seen some that were clearly written carelessly and dishonestly that would
not
have been submitted if the reviewers knew they would have to publicly
stand behind what they said.  Sunlight is an excellent disinfectant.
  Good suggestion.
  Regards,
Herbert

On Wed, Jun 22, 2022 at 9:09 PM James Holton  wrote:

> Greetings all,
>
> I'd like to ask a question that I expect might generate some spirited
> discussion.
>
> We have seen recently a groundswell of support for openness and
> transparency in peer review. Not only are pre-prints popular, but we are
> also seeing reviewer comments getting published along with the papers
> themselves. Sometimes even signed by the reviewers, who would have
> traditionally remained anonymous.
>
> My question is: why don't we also do this for grant proposals?
>
> I know this is not the norm. However, after thinking about it, why
> wouldn't we want the process of how funding is awarded in science to be
> at least as transparent as the process of publishing the results? Not
> that the current process isn't transparent, but it could be more so.
> What if applications, and their reviewer comments, were made public?
> Perhaps after an embargo period?  There could be great benefits here.
> New investigators especially, would have a much clearer picture of
> format, audience, context and convention. I expect unsuccessful
> applications might be even more valuable than successful ones. And yet,
> in reality, those old proposals and especially the comments almost never
> see the light of day. Monumental amounts of work go into them, on both
> sides, but they then get tucked away into the darkest corners of our hard
> drives.
>
> So, 2nd question is: would you do it? Would you upload your application
> into the public domain for all to see? What about the reviewer comments?
> If not, why not?  Afraid people will steal your ideas? Well, once
> something is public, it's pretty clear who got the idea first.
>
> 3rd question: what if the service were semi-private, and you could get
> comments on your proposal before submitting it to your funding agency?
> Would that be helpful? What if in exchange for that service you had to
> review 2-3 other applications?  Would that be worth it?
>
> Or, perhaps, I'm being far too naive about all this. For all I know
> there are some rules against doing this I'm not aware of.  Either way,
> I'm interested in what this community thinks. Please share your views!
> On- or off-list is fine.
>
> -James Holton
> MAD Scientist


[ccp4bb] HDRMX virtual meeting 27-28 April 2022

2022-04-18 Thread Herbert J. Bernstein
Dear Colleagues,

  The next High Data Rate Macromolecular Crystallography (HDRMX)
meeting will be held on 27 and 28 April 2022 from 9 am to 1 pm, New
York time, as a virtual zoom meeting.

The rate at which data are acquired for macromolecular crystallography at
synchrotrons is continually increasing due to brighter sources, more finely
focused X-ray beams, and faster detectors.

This increased rate of data acquisition creates new opportunities for real
time data analysis that were previously not possible; in addition, it
creates challenges regarding computing hardware and network infrastructure.
This meeting will provide a forum for beamline scientists, engineers, and
detector manufacturers from around the world to discuss progress in
addressing the opportunities and challenges arising from higher data
acquisition rates.

This meeting will also allow MX beamlines at light sources around the world
to share their current status and future updates as well as provide
detector manufacturers with an opportunity to gather requirements for the
next generation of MX detectors.

The scheduled speakers are:

27 April 2022:

  Jie Nan, Oskar Aurelius, MAXIV

  Max Burian, Diego Gaemperle, DECTRIS Ltd

  Filip Leonarski, PSI

  Graeme Winter, Diamond Light Source

  Felix Wittwer, LBL

  Aaron Brewster, LBL


28 April 2022:

  Daniel Eriksson, Australian Synchrotron, ANSTO

  Francisco Hernandez Vivanco, Australian Synchrotron, ANSTO

  Marina Nikolova, EMBL

  Thomas White, DESY

  Jon Schuermann, Cornell University/APS NE-CAT

  Alexei Soares, NSLS-II BNL


There will be a discussion period at the end of each day.

Among the topics to be discussed are changes needed to cope with
moderately higher detector data rates in the near term, and much higher
data rates -- in the thousands of frames per second -- possible in the
longer term.
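
To put "thousands of frames per second" in perspective, a back-of-the-envelope
sketch; the detector size, bit depth, and frame rate below are illustrative
assumptions, not figures from the meeting:

    # Rough uncompressed data-rate estimate; all inputs are assumptions.
    pixels = 16e6             # a hypothetical 16-megapixel detector
    bytes_per_pixel = 2       # 16-bit counters
    frames_per_second = 2000  # "thousands of frames per second"

    rate = pixels * bytes_per_pixel * frames_per_second
    print(f"{rate / 1e9:.0f} GB/s uncompressed")  # ~64 GB/s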


If you are interested in attending via Zoom, please register at:


  https://www.bnl.gov/hdrmx2022/


  -- Dale Kreitler, dkreit...@bnl.gov

 Herbert J. Bernstein, hbernst...@bnl.gov





Re: [ccp4bb] Open Access Repositories for Big Data?

2019-01-18 Thread Herbert J. Bernstein
The zenodo policies seem to be the most workable as a start.  I would suggest
contacting them for the cases that go over 50GB, but at worst splitting
into 50GB chunks.  -- Herbert
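
For the chunking route, a minimal sketch of splitting a large file under a
size cap; the naming scheme, buffer size, and the 50 GB figure are
illustrative, so check the repository's current limits before uploading:

    BUF = 64 * 1024 ** 2                      # copy in 64 MB buffers

    def split_file(path, cap=50 * 1024 ** 3):
        # Writes path.000, path.001, ... each at most cap bytes.
        # Reassemble later with, e.g.:  cat path.??? > path
        with open(path, 'rb') as src:
            index = 0
            block = src.read(min(BUF, cap))
            while block:
                written = 0
                with open(f"{path}.{index:03d}", 'wb') as dst:
                    while block:
                        dst.write(block)
                        written += len(block)
                        if written >= cap:
                            break
                        block = src.read(min(BUF, cap - written))
                index += 1
                block = src.read(min(BUF, cap))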

On Fri, Jan 18, 2019 at 10:49 AM Andreas Förster <
andreas.foers...@dectris.com> wrote:

> Hi Aaron,
>
> can you slice your data and then link to the bits?
>
> We're currently trying to find out what "unlimited Google Drive storage"
> means by uploading pi in chunks of 70 GB or so.
>
> All best.
>
>
> Andreas
>
>
>
> On Fri, Jan 18, 2019 at 4:31 PM Aaron Finke  wrote:
>
>> Dear CCP4ites,
>>
>> Is anyone aware of online repositories that will store huge sets of raw
>> data (>100 GB)? I’m aware of Zenodo and SBGrid, but Zenodo’s limit is
>> typically 50 GB and their absolute limit is 100 GB. SBGrid has yet to
>> respond to my emails.
>>
>> I could host them myself, but the involuntary dry heaving response I got
>> when I brought up the idea to our IT department implied they were less
>> enthused with the idea than I was. So a cloud service would be far more
>> preferable as a long term solution.
>>
>> Thanks,
>> Aaron
>>
>> --
>> Aaron Finke
>> Staff Scientist, MacCHESS
>> Cornell University
>> e-mail: af...@cornell.edu
>>
>>
>
>
> --
> 
> Andreas Förster, Ph.D.
> Application Scientist Crystallography, Scientific Sales
> Phone: +41 56 500 21 00 | Direct: +41 56 500 21 76 | Email:
> andreas.foers...@dectris.com
> DECTRIS Ltd. | Taefernweg 1 | 5405 Baden-Daettwil | Switzerland |
> www.dectris.com


Re: [ccp4bb] Long term storage for raw images/ crystallographic data sets

2018-11-30 Thread Herbert J. Bernstein
Dear Colleagues,

  May I suggest that those who are at Universities take a look at the
G-suite for Education

https://edu.google.com/products/gsuite-for-education/editions/?modal_active=none

which provides unlimited cloud storage for free to educational institutions.

  Regards,
Herbert

On Thu, Nov 29, 2018 at 3:54 PM Lieberman, Raquel L <
raquel.lieber...@chemistry.gatech.edu> wrote:

> Dear All,
>
> How do your labs handle long-term raw data backups? My lab is maxing out
> our 6TB RAID backup (with two off-site mirrors) so I am investigating our
> next long term solution. The vast majority of the data sets are published
> structures (i.e. processed data deposited in PDB) or redundant/unusable so
> immediate access is not anticipated, but the size of data sets is
> increasing quickly with time, so I am looking for a scalable-yet-affordable
> solution.
>
> Would be grateful for input into various options, e.g. bigger HD/RAIDs,
> cloud backup, tape, anything else.
>
> I will compile.
>
> Thank you,
>
> Raquel
> --
> Raquel L. Lieberman, Ph.D.
> Professor
> School of Chemistry and Biochemistry
> Georgia Institute of Technology


Re: [ccp4bb] Should we still keep copies of all raw data?

2018-07-14 Thread Herbert J. Bernstein
Dear James,

  Perhaps it is time for us to admit that this is too large, expensive and
complex a problem for us to resolve without help from one
or more of the commercial data managers, such as Google or Amazon.  I know
that dealing with ads is a nuisance, introducing
a loss of time for research, but going nuts trying to recover lost data
also costs time.  Perhaps we should show a willingness
to sell a little of our eyeball time seeing some ads in order to have
access to the most cost-effective data management
systems currently in existence.

  Regards,
Herbert

On Sat, Jul 14, 2018 at 2:23 PM, James Holton wrote:

> Why not just upload it to proteindiffraction.org ?  Or the SBGrid data
> bank (https://data.sbgrid.org/) ?  Or both for "redundancy" ?
>
>
> Yes, I did once do some calculations on what it would take to preserve
> data for tens of thousands of years, and the only proven storage medium for
> that timescale is clay tablets.  Assuming 1 mm^3 is all you need to store
> one bit, it comes to about $3000/GB.
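
A quick back-check of that figure, as a sketch; the cost per volume of
fired clay is back-solved here, an assumption rather than a number from
the post:

    bits_per_gb = 8 * 1024 ** 3        # ~8.6e9 bits in a gigabyte
    mm3_per_bit = 1.0                  # the storage density quoted above
    m3_per_gb = bits_per_gb * mm3_per_bit / 1e9
    print(m3_per_gb)                   # ~8.6 cubic metres of clay per GB
    # At an assumed ~$350 per cubic metre of finished tablet:
    print(m3_per_gb * 350)             # ~$3000/GB, matching the estimate
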
>
>
> Hard drives, however, are now down to $33/TB, which is comparable to a box
> of pipette tips, and takes up less space.  LTO-6 tapes are $3/TB.  So the
> cost of storage, I think, is no real burden; it's the cost of managing
> that storage.  If you buy a box of 12 TB bare drives, then you need to
> spend a lot of time and effort getting your data onto them, and then
> wondering if they will still work after a few years.  Modern drives are
> much more reliable than they used to be, but maybe you want two copies?  Or
> a parity disk?  What you pay for when you buy a NAS, particularly a
> high-end NAS like NetApp is the cost and quality of management.  Rolled
> into the price of the product is not just redundant bits and the wires to
> connect them, but a team of people who get paid to make sure your data are
> always safe and available.
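
One low-effort hedge against silent decay, whatever the medium, is a
checksum manifest written alongside the data; a minimal sketch, with an
illustrative archive path:

    import hashlib, os

    def manifest(root):
        # Yield (relative path, sha256) for every file under root.
        for dirpath, _, files in os.walk(root):
            for name in files:
                full = os.path.join(dirpath, name)
                h = hashlib.sha256()
                with open(full, 'rb') as f:
                    for block in iter(lambda: f.read(1 << 20), b''):
                        h.update(block)
                yield os.path.relpath(full, root), h.hexdigest()

    for path, digest in manifest('/archive/2018_datasets'):
        print(digest, path)

Re-running the same walk years later and diffing the output tells you
whether the drive (or DVD) still holds what you wrote.
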
>
>
> The question then always comes down to cost/benefit.  What is the
> consequence of data loss?  What is the probability of data loss?  And are
> you feeling lucky?
>
>
> A few years ago I got a panicked email from a user whom I will not name,
> but this user had just been "Rupp-ed".  As in Bernhard had found a deposit
> of theirs that look a lot like a fake structure, and asked about it.  This
> deposition had been made ten years earlier, the student who did it had left
> science, and could not be reached.  This left the PI holding the bag. Turns
> out the student had made a mistake and deposited Fcalc instead of Fobs. But
> how do you prove that?  This user was VERY happy to find out that I still
> had their images on DVD. I was able to restore them and re-process them in
> about an hour.
>
>
> Lucky?  Perhaps.  Not every beamline at every synchrotron backs up data,
> and not every DVD I've written can be read back.  About 3000 images are
> still unrecoverable from those days.  On the other hand, there are other
> beamlines who make a point of destroying any traces of user data as part of
> their data protection plan. Most, I think, are middle-of-the-road with a
> data retention policy like "we'll do what we can, but can't promise
> anything".  Even at the same synchrotron policies can vary from beamline to
> beamline.  So again: do you feel lucky?  Do you?
>
>
> -James Holton
>
> MAD Scientist
>
> On 7/13/2018 2:30 AM, Sergei Strelkov wrote:
>
> Dear All,
>
>
> I believe this question may be of some interest.
>
> In the past, we always stored all raw data ever collected by the lab.
>
> With the recent advances, such as
>
> (a) automated/on-the-fly processing offered by some (European)
> synchrotrons, and
>
> (b) an ongoing discussion on centralized raw data archiving,
>
> I wonder if it is time to revise the strict policy of keeping all data
>
> (before we invest in a new NAS system... )
>
>
> Best wishes,
>
> Sergei
>
>
> Prof. Sergei V. Strelkov
> Laboratory for Biocrystallography
> Department of Pharmaceutical Sciences, KU Leuven


[ccp4bb] Updated SAUC Cell Database

2017-12-29 Thread Herbert J. Bernstein
The SAUC Cell Database has been updated to 122596 PDB cells and 397452 COD
cells.  SAUC is accessible at

  http://iterate.sourceforge.net/sauc
or
  http://flops.arcib.org:8084/sauc/

The second link is to a faster machine, but with less network bandwidth.

If you would like to install your own version locally, the source is
available in the sauc.git repository of the iterate project at
http://sf.net/projects/iterate, at

 https://sourceforge.net/p/iterate/sauc/ci/master/tree/

or on github at

  https://github.com/yayahjb/sauc

  -- Herbert J. Bernstein
 yaya...@gmail.com


Re: [ccp4bb] similar unit cell

2017-06-18 Thread Herbert J. Bernstein
Try

http://iterate.sourceforge.net/sauc-1.0.0/

which finds 4Y42 as closest for both cells, but lots of others nearby.

On Sun, Jun 18, 2017 at 10:39 AM, James Holton wrote:

>
> By the way.  Does anyone out there have a unit cell search engine still
> running?  The two mentioned on this thread so far:
>
> ContaMiner:   https://strube.cbrc.kaust.edu.sa/contaminer/
>
> and SIMBAD:  http://ccp4serv7.rc-harwell.ac.uk:8080/testserv/
> seem to both be down.  And the other one I know about
> nearest-cell: https://app.strubi.ox.ac.uk/nearest-cell/nearest-cell.cgi
>
> is up, but doesn't seem to be working.  Can't find 4y42 using Dong Xiao's
> cell below.
>
> Perhaps the maintainers of these pages could chime in?
>
> -James Holton
> MAD Scientist
>
>
>
> On 6/17/2017 10:43 AM, James Holton wrote:
>
> The reduced (aka "Niggli") cell for your case is 82.8246 88.2997 82.8246
> 98.8892 110.5181 117.0456
>
> You can get this using "tracer" or "othercell" in the CCP4 suite.
>
> I keep a local table of reduced cells from the PDB that I update
> periodically, and the closest one to yours in that list is:
>
> 4y42   82.5   88.0   82.5  99 110 117   cyanate hydratase
> This was deposited in P1, but if you download the data and feed it to
> Pointless, it is quite confident that the true space group is actually C2.
> There are 10 chains in the ASU so there may be some pseudo symmetry going
> on.  The paper reporting 4y42 describes it as a " serendipitous
> crystallization".
>
> What happens if you try to refine your data against the model deposited as
> 4y42?
>
> Hope that helps,
>
> -James Holton
> MAD Scientist
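
The reduction step itself is easy to script. A minimal sketch, assuming
the cctbx toolbox is installed; treat it as a sketch rather than a
validated tool:

    from cctbx import crystal

    cs = crystal.symmetry(
        unit_cell=(136.12, 94.398, 89.476, 90, 125.479, 90),
        space_group_symbol='C2')
    # Reduces the primitive cell; should land near the values quoted
    # above: 82.8 88.3 82.8 98.9 110.5 117.0
    print(cs.niggli_cell().unit_cell())
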
>
> On 6/17/2017 12:07 AM, dongxiaofei wrote:
>
> Dear ALL,
>
> I got two kinds of crystals of different proteins, but there are many
> similarities. The shapes of the crystals are similar, and the cell
> parameters are also similar:
>  protein A: 136.12  94.398  89.476  90  125.479  90,  space group C 1 2 1, and
>  protein B: 136.14  94.369  89.115  90  125.495  90,  space group C 1 2 1.
>
> Protein A has an NMR structure, but Rfree stays above 50% after molecular
> replacement; protein B's Rfree is also above 50%.
>
> So I wonder if these crystals are the result of debris of the proteins,
> because the growth of the crystals took more than half a year. I am sure
> the two proteins are different and that the crystals come from different
> proteins.
>
> Any insights will be really appreciated.
>
> Thanks
>
> Dong Xiao


Re: [ccp4bb] Correct reading of HDF5 (meta)data

2017-03-24 Thread Herbert J. Bernstein
Dear Colleagues,

  The information Graeme is requesting would be very helpful in making
accurate NeXus/HDF5, minicbf, and full cbf beamline templates.  We would be
happy to host the beamline photographs on the HDRMX web site (
www.medsbio.org/hdrmx).  These photographs would be even more useful if you
added markings for all axes showing the direction of increasing translation
for each translation axis, and a curled arrow showing the direction of
increasing rotation for each rotation axis.

  Thank you.

Regards,
  Herbert

On Fri, Mar 24, 2017 at 4:18 AM, Graeme Winter wrote:

> Dear All,
>
> There has been much discussion of XDS efficiently reading HDF5 data - this
> is of course highly desirable though not sufficient for the correct
> processing of the data.
>
> One thing which I think could very much help the community would be to
> have data published from beamlines where Eiger detectors are in use,
> including the following:
>
>  - a *photograph* of the beamline showing the orientation of the detector
> and principal rotation axis
>  - a single rotation scan e.g. of thermolysin or some other
> easy-to-solve-by-SAD structure
>
> Between these there is sufficient information to ensure that the geometry
> of the experiment described in the headers (master file) is correct.
>
> While XDS does not use this, and many beamline systems generate an XDS.INP
> file, in the future this record of the experiment in the master file may be
> all that remains and so ensuring that this is correct seems like a very
> good idea.
>
> So - beamline people - how do you feel about the above? Clearly this will
> also help with software people making sure data from your beamline process
> correctly!
>
> Thanks & best wishes Graeme
>
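
On the software side, a quick way to audit what a master file claims about
its geometry is to walk the HDF5 tree and print every
NXtransformations-style axis. A minimal sketch, assuming h5py and the
NeXus convention of 'transformation_type' and 'vector' attributes; the
file name is illustrative:

    import h5py

    def dump_axes(master_path):
        # Print anything that carries NXtransformations axis attributes.
        def visit(name, obj):
            attrs = obj.attrs
            if 'transformation_type' in attrs and 'vector' in attrs:
                print(name,
                      attrs['transformation_type'],
                      'vector:', list(attrs['vector']),
                      'depends_on:', attrs.get('depends_on', '?'))
        with h5py.File(master_path, 'r') as f:
            f.visititems(visit)

    dump_axes('thermolysin_master.h5')

Comparing that printout against a photograph of the instrument is exactly
the cross-check Graeme proposes.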


[ccp4bb] SAUC cell search now includes COD cells as well as PDB

2016-11-27 Thread Herbert J. Bernstein
The unit cell search at

  http://iterate.sf.net/sauc/

now includes 368 thousand Crystallographic Open Database (COD) cells
through 26 November 2016, as well as 111 thousand PDB cells through 18
November 2016.

Please report problems to yaya...@gmail.com

  -- H. J. Bernstein


[ccp4bb] Comments needed CBF X-axis definition ambiguity

2014-05-08 Thread Herbert J. Bernstein

Dear Colleagues,

  Please pardon the shotgun distribution of this query.  It may only
directly concern a few beam-line scientists and software developers,
but comments from all interested parties are welcome.

  Many people have happily used the IUCr imgCIF dictionary definitions
in data collection and processing software for many years.  Just today,
however, we discovered that there is an ambiguity in the interpretation
of the CBF laboratory standard coordinate frame definition that comes
from two alternate readings of the definition of the X-axis.  Before
we put clarifying wording in the dictionary and resolve the ambiguity,
we would appreciate knowing which of the two interpretations is currently
in major use so that the resolution will be as non-disruptive as possible.

  The imgCIF dictionary says:

Axis 1 (*X*): The *X*-axis is aligned to the mechanical axis pointing from
 the sample or specimen along the principal axis of the goniometer or
 sample positioning system if the sample positioning system has an axis
 that intersects the origin and which form an angle of more than 22.5
 degrees with the beam axis.

Without any intention of saying which of the following interpretations
is the original intention of this definition by the ordering, here is
what people have gotten from this:

Interpretation 1:  If you treat the sample as the origin, the +X axis runs
from the sample along the pin _into_ the sample holder; or

Interpretation 2:  If you treat the sample as the origin, the -X axis runs
from the sample along the pin _into_ the sample holder;

There are important implications for processing software on the handedness
of the resulting scan rotations, so we would appreciate whatever guidance
any of you can provide as to how you have been reading this spec.
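
The practical consequence of the ambiguity is a sign flip: the same
physical scan is described by opposite rotation angles under the two
readings. A small illustration, assuming NumPy and right-handed rotations;
this is not imgCIF reference code:

    import numpy as np

    def rot_x(deg):
        # Right-handed rotation about +X by deg degrees.
        t = np.radians(deg)
        return np.array([[1.0, 0.0, 0.0],
                         [0.0, np.cos(t), -np.sin(t)],
                         [0.0, np.sin(t),  np.cos(t)]])

    v = np.array([0.0, 1.0, 0.0])
    # Flipping the X-axis direction (Interpretation 1 vs 2) negates the
    # angle: a +10 degree scan in one frame is -10 degrees in the other.
    print(rot_x(+10.0) @ v)
    print(rot_x(-10.0) @ v)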

Please send your comments to this list, or, if you prefer, to me personally
at yaya...@gmail.com

My apologies to the community for not having resolved this sooner, but
we only became aware today that some people had been reading the spec one way,
and others the other way.

With deepest apologies,
  Herbert J. Bernstein

--

Herbert J. Bernstein
Professor of Mathematics and Computer Science
Dowling College, Brookhaven Campus, A210/B205
1300 William Floyd Parkway, Shirley, NY, 11967

+1-631-244-1328
Lab: +1-631-244-1935
Cell: +1-631-428-1397
y...@dowling.edu




Re: [ccp4bb] mmCIF as working format?

2013-08-05 Thread Herbert J. Bernstein

Dear Colleagues,

This exchange is a wonderful illustration of the simple fact that
different scientists work differently, favoring different approaches and
different tools. For some, the latest and greatest formats and support
systems are what they need to be productive. For a surprisingly large
number of others, change to new methods is a pointless distraction from
doing good science. What we need to do as a community is not to tell one
another how they _must_ do their work, but to listen to one another, being
helpful where we can, and showing mutual respect where we cannot.

To this end, Frances and I have revived an old idea from 2006 of creating
a format that looks much like the old PDB format but is 132 columns wide,
with more characters allotted to fields that need them. We re-enabled the
WPDB server at http://biomol.dowling.edu/wpdb which can produce either a
132-column 'PDB' entry or an 80-column PDB entry based on the mmCIF files
on the wwPDB server. This allows people who work best with tools such as
grep and a simple fixed-field format to have most of the newer, larger PDB
entries in a wide version of the PDB format. If you don't need it, or
don't like it, you should not use it. If you have need for it, and need
some things changed, send us an email, and we'll see what we can do to
oblige.

Right now it is on an old, slow server. If there is significant use, I'll
move it to something bigger and faster.

Regards,
Herbert and Frances Bernstein
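
As an illustration of why the fixed-field habit persists, a minimal sketch
of slicing ATOM/HETATM records by column; the 0-based slices follow the
standard 80-column layout, and the file name is illustrative:

    def parse_atom(line):
        # Fixed-column fields of an 80-column ATOM/HETATM record.
        return {
            'name':  line[12:16].strip(),
            'resn':  line[17:20].strip(),
            'chain': line[21],
            'resi':  int(line[22:26]),
            'x': float(line[30:38]),
            'y': float(line[38:46]),
            'z': float(line[46:54]),
        }

    with open('model.pdb') as handle:
        atoms = [parse_atom(l) for l in handle
                 if l.startswith(('ATOM', 'HETATM'))]

A 132-column variant of the sort described above would simply widen the
fields that overflow; the slice-by-column approach carries over unchanged.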


On 8/5/13 4:05 PM, Boaz Shaanan wrote:



Boaz Shaanan, Ph.D.
Dept. of Life Sciences
Ben-Gurion University of the Negev
Beer-Sheva 84105
Israel

E-mail: bshaa...@bgu.ac.il
Phone: 972-8-647-2220 Skype: boaz.shaanan
Fax: 972-8-647-2992 or 972-8-646-1710

*From:* Nat Echols [nathaniel.ech...@gmail.com]
*Sent:* Monday, August 05, 2013 10:45 PM
*To:* בעז שאנן
*Cc:* CCP4BB@JISCMAIL.AC.UK
*Subject:* Re: [ccp4bb] mmCIF as working format?

On Mon, Aug 5, 2013 at 12:37 PM, Boaz Shaanan wrote:


There seems to be some kind of a gap between users and developers
as far the eagerness to abandon PDB in favour of mmCIF. I myself
fully agree with Jeffrey about the ease of manipulating PDB's
during work, particularly when encountering unusual circumstances
(and there are many of those, as we all know). And how about
non-crystallographers that are using PDB's for visualization and
understanding how their proteins work? I teach many such students
and it's fairly easy to explain to them where to look in the PDB
for particular pieces of information relevant to the structure. I
can't imagine how they'll cope with the cryptic mmCIF format.


>I think the only gap is between developers and *expert* users - most of
>the community simply wants tools and formats that work with a minimum of
>fiddling.

That assumes that you can offer such software, but can you? I doubt that
this goal is reachable (in fact our daily experience proves just that),
with all due respect to you developers.

>Again, if users are having to examine the raw PDB records visually to
>find information, this is a failure of the software.

It's not raw, it's easily readable text, very easy to interpret with very
little effort.

Anyway, this discussion is a waste of time. The decision has been taken,
mmCIF will prevail and we (expert and non-expert users) have to swallow
the pill.


Boaz

-Nat


Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-05 Thread Herbert J. Bernstein

Dear Colleagues,

  Clearly, no system will be able to perfectly preserve every pixel of
every dataset collected at a cost that can be afforded.  Resources are
finite and we must set priorities.  I would suggest that, in order
of declining priority, we try our best to retain:

  1.  raw data that might tend to refute published results
  2.  raw data that might tend to support published results
  3.  raw data that may be of significant use in currently
ongoing studies either in refutation or support
  4.  raw data that may be of significant use in future
studies

While no archiving system can be perfect, we should not let the
search for a perfect solution prevent us from working with
currently available good solutions, and even in this era of tight
budgets, there are good solutions.

  Regards,
Herbert

On 4/5/12 7:16 AM, John R Helliwell wrote:

Dear 'aales...@burnham.org',

Re the pixel detector: yes, this is an acknowledged raw-data archiving
challenge; possible technical solutions include summing to make
coarser images (i.e. in angular range), lossless compression (nicely
described on this CCP4bb by James Holton), or preserving a sufficient
sample of the data (but nb this debate is certainly not yet concluded).

Re "And all this hassle is for the only real purpose of preventing data fraud?"

Well.Why publish data?
Please let me offer some reasons:
• To enhance the reproducibility of a scientific experiment
• To verify or support the validity of deductions from an experiment
• To safeguard against error
• To allow other scholars to conduct further research based on
experiments already conducted
• To allow reanalysis at a later date, especially to extract 'new'
science as new techniques are developed
• To provide example materials for teaching and learning
• To provide long-term preservation of experimental results and future
access to them
• To permit systematic collection for comparative studies
• And, yes, To better safeguard against fraud than is apparently the
case at present

Also to (probably) comply with your funding agency's grant conditions:-
Increasingly, funding agencies are requesting or requiring data
management policies (including provision for retention and access) to
be taken into account when awarding grants. See e.g. the Research
Councils UK Common Principles on Data Policy
(http://www.rcuk.ac.uk/research/Pages/DataPolicy.aspx) and the Digital
Curation Centre overview of funding policies in the UK
(http://www.dcc.ac.uk/resources/policy-and-legal/overview-funders-data-policies).
See also http://forums.iucr.org/viewtopic.php?f=21&t=58 for discussion
on policies relevant to crystallography in other countries. Nb these
policies extend over derived, processed and raw data, ie without
really an adequate clarity of policy from one to the other stages of
the 'data pyramid' ((see
http://www.stm-assoc.org/integration-of-data-and-publications).


And just to mention IUCr Journals Notes for Authors for biological
macromolecular structures, where we have our own, i.e. macromolecular
crystallography's, version of the 'data pyramid':

(1) Derived data
• Atomic coordinates, anisotropic or isotropic displacement
parameters, space group information, secondary structure and
information about biological functionality must be deposited with the
Protein Data Bank before or in concert with article publication; the
article will link to the PDB deposition using the PDB reference code.
• Relevant experimental parameters, unit-cell dimensions are required
as an integral part of article submission and are published within the
article.

(2) Processed experimental data
• Structure factors must be deposited with the Protein Data Bank
before or in concert with article publication; the article will link
to the PDB deposition using the PDB reference code.

(3) Primary experimental data (here I give small and macromolecule
Notes for Authors details):-
For small-unit-cell crystal/molecular structures and macromolecular
structures IUCr journals have no current binding policy regarding
publication of diffraction images or similar raw data entities.
However, the journals welcome efforts made to preserve and provide
primary experimental data sets. Authors are encouraged to make
arrangements for the diffraction data images for their structure to be
archived and available on request.
For articles that present the results of powder diffraction profile
fitting or refinement (Rietveld) methods, the primary diffraction
data, i.e. the numerical intensity of each measured point on the
profile as a function of scattering angle, should be deposited.
Fibre data should contain appropriate information such as a photograph
of the data. As primary diffraction data cannot be satisfactorily
extracted from such figures, the basic digital diffraction data should
be deposited.


Finally to mention that many IUCr Commissions are interested in the
possibility of establishing community practices for the orderly
retention and referencing of raw data sets.

Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-03 Thread Herbert J. Bernstein

Dear Colleagues,

  One thing that would help in avoiding misappropriated priority of
research results would be to join the math and physics community in their
robust use of open-access preprints in arXiv.  Such public preprints
establish reliable timelines for research credit and help to ensure timely
access to new results by the entire community.  Fully peer-reviewed
publications in "real" journals are still desirable, but to make this
work, our journals would have to be willing to accept papers for which
such a preprint system has been used.  To understand the complexity of the
issue, see

http://nanoscale.blogspot.com/2008/01/arxiv-and-publishing.html

I believe the IUCr is willing to accept papers that are posted on a
preprint server (somebody correct me if I am wrong).

  It works for the math and physics community.  Perhaps it would work for
the crystallographic community.


On 4/3/12 1:28 PM, Mark J van Raaij wrote:

In fact, I would put it even more strongly: if we know a referee is being
dishonest, it is our duty to make sure he is removed from science,
blacklisted from the journal, etc.

Mark J van Raaij
Laboratorio M-4
Dpto de Estructura de Macromoleculas
Centro Nacional de Biotecnologia - CSIC
c/Darwin 3
E-28049 Madrid, Spain
tel. (+34) 91 585 4616
http://www.cnb.csic.es/~mjvanraaij



On 3 Apr 2012, at 19:13, Maria Sola i Vilarrubias wrote:

   

Mark,

I know some stories (which of course I'll not post here)  from the Crystallography field 
and from other fields where reviewers profit from the fact that suddenly they have new, 
interpreted data which fits very well with their own results. Stories like to block a 
manuscript or ask for more results for the reviewer to be able to submit its own paper 
(with "new" ideas) in time, or copy a structure from the figures, or ask for 
experiments that only the reviewer can do so he/she is included in the paper, or submit 
as fast as possible in another journal with an extremely short delay of acceptance (e.g. 
10 days,  without revision?, talking to the editorial board?) things like this. Well, it 
is not question of making a full list, here!. The whole problem comes from publishing 
first, from competition.

The hope with fraud with X-ray data is that it seems to be detectable, thanks 
to valuable people that develop methods to detect it. But it is very difficult 
to demonstrate that your work, ideas or results have been copied. How do you 
defend from this? And how after giving to them the valuable PDB?

Finally, how many crystallographers are in the world? 5000?  The concept of 
ethics can change from one place to another and, more than this, there is the 
fact that the reviewer is anonymous.

I try to respond to my reviewers the best I can and I really trust their criteria, sometimes a bit 
too much, indeed. I think they all have done a very nice job. But some of the stories from above 
happened to me or close to me and I feel really insecure with the idea of sending a manuscript, the 
X-ray data and the PDB, altogether, to a reviewer shielded by anonymity. It's too risky: with an 
easy molecular replacement someone can solve a difficult structure and publish it first. And then 
the only thing left to the "bad reviewer" is to change the author's list! (and for the 
"true" author what is left is to feel like an idiot).

In my humble opinion, we must be strict but not kill ourselves. Trust authors 
as we trust reviewers. Otherwise, the whole effort might be useless.

Maria

Dep. Structural Biology
IBMB-CSIC
Baldiri Reixach 10-12
08028 BARCELONA
Spain
Tel: (+34) 93 403 4950
Fax: (+34) 93 403 4979
e-mail: maria.s...@ibmb.csic.es

On 3 April 2012 16:58, Mark J van Raaij  wrote:
The remedy for the fact that some reviewers act unethically is not withholding 
coordinates and structure factors, but a more active role for the authors to 
denounce these possible violations and more effective investigations by the 
journals whose reviewers are suspected by the authors of committing these 
violations.
I have witnessed authors being hesitant to complain about possible violations 
and journals not always taking complaints seriously enough.

Mark J van Raaij
Laboratorio M-4
Dpto de Estructura de Macromoleculas
Centro Nacional de Biotecnologia - CSIC
c/Darwin 3
E-28049 Madrid, Spain
tel. (+34) 91 585 4616
http://www.cnb.csic.es/~mjvanraaij



On 3 Apr 2012, at 16:45, Bosch, Juergen wrote:

 

Hi Fred,

I'll go public on this one. This happened to me. I will not reveal who reviewed 
my paper and which paper it was only that your naive assumption might not 
always be correct. I have learned my lesson and exclude people with overlapping 
interests (even though they actually might be the best critical reviewers for 
your work). Unfortunately you don't really have control if the journal still 
decides to pick those excluded reviewers.
As a suggestion to people out there, make sure to not encrypt your comments as 
pdf and PW protect them - that's how I found

Re: [ccp4bb] Fwd: HR3699, Research Works Act

2012-02-16 Thread Herbert J. Bernstein
venue other than philanthropy comes from the 
intellectual property and added value of its journals, some of which 
represent the finest in physical chemistry relevant to our community. 
Dylla deserves kudos for his effort to find consensus, something that 
seems to have gone way out of fashion in recent years.


Charlie



On Feb 16, 2012, at 10:37 AM, Ian Tickle wrote:


Dear Herbert

Thanks for your detailed explanation.  I had missed the important
point that it's the requirement on the authors to assent to open
access after a year, which the proposed Bill seeks to abolish, that's
critical here.

I will go and sign the petition right now!

Best wishes

-- Ian

On 16 February 2012 15:24, Herbert J. Bernstein wrote:

The bill summary says:

Research Works Act - Prohibits a federal agency from adopting, maintaining,
continuing, or otherwise engaging in any policy, program, or other activity
that: (1) causes, permits, or authorizes network dissemination of any
private-sector research work without the prior consent of the publisher; or
*(2) requires that any actual or prospective author, or the author's
employer, assent to such network dissemination.*

Defines "private-sector research work" as an article intended to be
published in a scholarly or scientific publication, or any version of such
an article, that is not a work of the U.S. government, describing or
interpreting research funded in whole or in part by a federal agency and to
which a commercial or nonprofit publisher has made or has entered into an
arrangement to make a value-added contribution, including peer review or
editing, but does not include progress reports or raw data outputs routinely
required to be created for and submitted directly to a funding agency in the
course of research.

==

It is the second provision that really cuts the legs out from the NIH open
access policy. What the NIH policy does is to make open access publication a
condition imposed on the grant holders in publishing work that the NIH
funded. This has provided the necessary lever for NIH-funded authors to be
able to publish in well-respected journals and still to be able to require
that, after a year, their work be available without charge to the scientific
community. Without that lever we go back to the unlamented old system (at
least unlamented by almost everybody other than Elsevier) in which publishers
could impose an absolute copyright transfer that barred the authors from
ever posting copies of their work on the web. People affiliated with
libraries with the appropriate subscriptions to the appropriate archiving
services may not have noticed the difference, but for the significant
portions of both researchers and students who did not have such access, the
NIH open access policy was by itself a major game changer, making much more
literature rapidly accessible and, even more importantly, changing the
culture, making open access much more respectable.

The NIH policy does nothing more than put grant-sponsored research on almost
the same footing as research done directly by the government, which has never
been subject to copyright at all, on the theory that, if the tax-payers
already paid for the research, they should have open access to the fruits of
that research. This law would kill that policy. This would be a major step
backwards.

Please read:

http://blogs.scientificamerican.com/evo-eco-lab/2012/01/16/mistruths-insults-from-the-copyright-lobby-over-hr-3699/ 



http://www.taxpayeraccess.org/action/action_access/12-0106.shtml

http://www.care2.com/causes/open-access-under-threat-hr-3699.html

Please support the petition. This is a very bad bill. It is not about
protecting copyright, it is an effort to restrict the free flow of
scientific information in our community.

Regards,
Herbert

On 2/16/12 9:02 AM, Fischmann, Thierry wrote:


Herbert

I don't see how the act could affect the NIH open access policy. Could
you please shed some light on that?

What I read seems reasonable and I intend to ask my representatives to
support this text. But obviously I am missing something and would like to
learn from you first.

Regards
Thierry


-Original Message-
From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of
Herbert J. Bernstein
Sent: Thursday, February 16, 2012 8:16 AM
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] Fwd: HR3699, Research Works Act

Dear Ian,

   You are mistaken.  The proposed law has nothing to do with preventing
the encouragement of people to break copyright law.  It has everything to
do with trying to kill the very reasonable NIH open access policy that
properly balances the rights of publishers with the rights of authors and
the interests of the scientific community.  Most publishers fare quite
well under a policy that gives them a year of exclusive control over
papers, followed by open access.

   It is, unfortunately, a 

Re: [ccp4bb] Fwd: HR3699, Research Works Act

2012-02-16 Thread Herbert J. Bernstein

Dear Colleagues,

  Acta participates very nicely and fully in the NIH Open Access program.
After one year of the normal restricted access, any NIH-funded paper
automatically enters the NIH open access system.  The journals get their
revenue when the paper is most in demand, but the community is not
excessively delayed in free access.

  I most certainly do suggest that it is a good idea for people who are
not US taxpayers to also have access to the science the NIH funding
produces.  We will all live longer and happier lives by seeing as much
progress made as rapidly as possible world-wide in health-related
scientific research.  I would hate to think of the cure of a disease being
greatly delayed because some researcher in Europe or India or China could
not get access to research results.  We all benefit from seeing the best
possible use made of NIH-funded research.

  I agree that in this case, adding more legislation is a bad idea --
particularly adding this legislation.

  I agree that

"If the authors of a paper want their work to be available to the general
public there is Wikipedia. I strongly support an effort by all members of
ccp4bb to contribute a general public summary of their work on Wikipedia.
There are Open Source journals as well."

However, there is a practical reality for post-docs and junior faculty
that, at least in the US, most institutions will not consider Wikipedia
articles in tenure and promotion evaluations, so it really is a good idea
for them, in addition to publishing in Wikipedia, to write "real" journal
articles.  I also agree that using open source journals is a good idea in
the abstract, but I, for one, really don't want the IUCr journals to go
away, and the NIH Open Access policy allows me to both support the IUCr
and have my work become open access a year later.  I think it is a
wonderful compromise.  Please, don't let the perfect be the enemy of the
good.  If we don't prevent Elsevier from killing NIH Open Access with this
bill, then there is a risk that many fewer people will publish in the IUCr
publications.

You seem to be arguing strongly that we should both have Open Access and
have money for editing journals.  I agree.  The current NIH Open Access
policy does just that.  It is the pending bill that will face you with the
stark choice of either having Open Access or having edited journals.  You
come much closer to your goals if you sign the petition and help the NIH
Open Access policy to continue in force than if the bill passes and the
NIH Open Access policy dies.  If the Open Access policy dies, I for one
will face a difficult choice -- publish in the IUCr journals and pay them
an open access fee I may not be able to come up with, or publish in free,
pure open source journals but fail to support the IUCr.  Let us hope the
petition gets lots of signatures and this misguided bill dies.

Regards,
  Herbert


On 2/16/12 12:17 PM, Enrico Stura wrote:
I am strongly in favour of Open Access, but Open Access is not always
helped by lack of money for editing etc.

For example:
Acta Crystallographica is not Open Access.
In one manner or another publishing must be financed.
Libraries pay fees for the journals. The fees help the International Union
of Crystallography. The money is used for sponsoring meetings, and some
scientists that come from less rich institutions benefit from it.

Open Access to NIH-sponsored scientific work will be for all world tax
payers and tax dodgers as well. Or maybe you would suggest that NIH
sponsored work should be accessed only by US tax payers with a valid
social security number? The journal server will verify that tax for the
current year has been filed with the IRS server. A dangerous invasion of
privacy!

The more legislation we add the worse off we are.

If the authors of a paper want their work to be available to the general
public there is Wikipedia. I strongly support an effort by all members of
ccp4bb to contribute a general public summary of their work on Wikipedia.

There are Open Source journals as well.

I would urge everybody NOT to sign the petition. Elsevier will not last
for ever, and the less accessible the work that they publish, the worse
for them in terms of impact factor. In the old days, if your institution
did not have the journal, most likely you would not reference the work and
the journal was worth nothing. We are the ones that will decide the future
of Elsevier. Elsevier will need to strike a balance between excellent
publishing with reasonable fees or not getting referenced. A law that
enforces a copyright will not help them. They are wasting their money on
lobbying.

The argument that NIH scientists need to publish in High Impact Factor
Journals by Elsevier does not hold up:
1) We should consider the use of impact factor as a NEGATIVE contribution
to science.
2) Each article can now have its own impact factor on Google Scholar,
independent of the journal

Re: [ccp4bb] Fwd: HR3699, Research Works Act

2012-02-16 Thread Herbert J. Bernstein

The bill summary says:

Research Works Act - Prohibits a federal agency from adopting, 
maintaining, continuing, or otherwise engaging in any policy, program, 
or other activity that: (1) causes, permits, or authorizes network 
dissemination of any private-sector research work without the prior 
consent of the publisher; or *(2) requires that any actual or 
prospective author, or the author's employer, assent to such network 
dissemination. *


Defines "private-sector research work" as an article intended to be 
published in a scholarly or scientific publication, or any version of 
such an article, that is not a work of the U.S. government, describing 
or interpreting research funded in whole or in part by a federal agency 
and to which a commercial or nonprofit publisher has made or has entered 
into an arrangement to make a value-added contribution, including peer 
review or editing, but does not include progress reports or raw data 
outputs routinely required to be created for and submitted directly to a 
funding agency in the course of research.


==

It is the second provision that really cuts the legs out from the NIH 
open access policy. What the NIH policy does is to make open access 
publication a condition imposed on the grant holders in publishing work 
that the NIH funded. This has provided the necessary lever for 
NIH-funded authors to be able to publish in well-respected journals and 
still to be able to require that, after a year, their work be available 
without charge to the scientific community. Without that lever we go 
back to the unlamented old system (at least unlamented by almost 
everybody other than Elsevier) in which publishers could impose an 
absolute copyright transfer that barred the authors from ever posting 
copies of their work on the web. People affiliated with libraries with 
the appropriate subscriptions to the appropriate archiving services may 
not have noticed the difference, but for the significant portions of 
both researchers and students who did not have such access, the NIH open 
access policy was by itself a major game changer, making much more 
literature rapidly accessible and, even more importantly, changing the 
culture, making open access much more respectable.


The NIH policy does nothing more than put grant-sponsored research on 
almost the same footing as research done directly by the government 
which has never been subject to copyright at all, on the theory that, if 
the tax-payers already paid for the research, they should have open 
access to the fruits of that research. This law would kill that policy. 
This would be a major step backwards.


Please read:

http://blogs.scientificamerican.com/evo-eco-lab/2012/01/16/mistruths-insults-from-the-copyright-lobby-over-hr-3699/

http://www.taxpayeraccess.org/action/action_access/12-0106.shtml

http://www.care2.com/causes/open-access-under-threat-hr-3699.html

Please support the petition. This is a very bad bill. It is not about 
protecting copyright, it is an effort to restrict the free flow of 
scientific information in our community.


Regards,
Herbert

On 2/16/12 9:02 AM, Fischmann, Thierry wrote:

Herbert

I don't see how the act could affect the NIH open access policy. Could you 
please shed some light on that?

What I read seems reasonable and I intend to ask my representatives to support 
this text. But obviously I am missing something and like to learn from you 
first.

Regards
Thierry

-Original Message-
From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Herbert 
J. Bernstein
Sent: Thursday, February 16, 2012 8:16 AM
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] Fwd: HR3699, Research Works Act

Dear Ian,

You are mistaken.  The proposed law has nothing to do with preventing the
encouragement of people to break copyright law.  It has everything to do with
trying to kill the very reasonable NIH open access policy that properly
balances the rights of publishers with the rights of authors and the
interests of
the scientific community.  Most publishers fare quite well under a
policy that
gives them a year of exclusive control over papers, followed by open access.

It is, unfortunately, a standard ploy in current American politics to
make  a
law which does something likely to be very unpopular and very unreasonable
sound like it is a law doing something quite different.

Please reread it carefully.  I think you will join in opposing this
law.  Science
benefits from the NIH open access policy and the rights of all concerned
are respected.  It would be a mistake to allow the NIH open access policy to
be killed.

I hope you will sign the petition.

Regards,
  Herbert


On 2/16/12 6:29 AM, Ian Tickle wrote:
   

Reading the H.R.3699 bill as put forward
(http://thomas.loc.gov/cgi-bin/bdquery/z?d112:HR03699:@@@L&summ2=m&;)
it seems to be about prohibiting US federal agencies from having
policie

Re: [ccp4bb] Fwd: HR3699, Research Works Act

2012-02-16 Thread Herbert J. Bernstein

Dear Ian,

  You are mistaken.  The proposed law has nothing to do with preventing the
encouragement of people to break copyright law.  It has everything to do with
trying to kill the very reasonable NIH open access policy that properly
balances the rights of publishers with the rights of authors and the 
interests of
the scientific community.  Most publishers fare quite well under a 
policy that

gives them a year of exclusive control over papers, followed by open access.

  It is, unfortunately, a standard ploy in current American politics to 
make  a

law which does something likely to be very unpopular and very unreasonable
sound like it is a law doing something quite different.

  Please reread it carefully.  I think you will join in opposing this 
law.  Science

benefits from the NIH open access policy and the rights of all concerned
are respected.  It would be a mistake to allow the NIH open access policy to
be killed.

  I hope you will sign the petition.

  Regards,
Herbert


On 2/16/12 6:29 AM, Ian Tickle wrote:

Reading the H.R.3699 bill as put forward
(http://thomas.loc.gov/cgi-bin/bdquery/z?d112:HR03699:@@@L&summ2=m&;)
it seems to be about prohibiting US federal agencies from having
policies which permit, authorise or require authors' assent to break
the law of copyright in respect of published journal articles
describing work funded at least in part by a US federal agency.  I'm
assuming that "network dissemination without the publisher's consent"
is the same thing as breaking the law of copyright.

It seems to imply that it would still be legal for US federal agencies
to encourage others to break the law of copyright in respect of
journal articles describing work funded by say UK funding agences! -
or is there already a US law in place which prohibits that?  I'm only
surprised that encouraging others to break the law isn't already
illegal (even for Govt agencies): isn't that the law of incitement
(http://en.wikipedia.org/wiki/Incitement)?

This forum in fact already has such a policy in place for all journal
articles (i.e. not just those funded by US federal agencies but by all
funding agencies), i.e. we actively discourage postings which incite
others to break the law by asking for copies of copyrighted published
articles.  Perhaps the next petition should seek to overturn this
policy?

This petition seems to be targeting the wrong law: if what you want is
free flow of information then it's the copyright law that you need to
petition to overturn, or you get around it by publishing in someplace
that doesn't require transfer of copyright.

Cheers

-- Ian

On 16 February 2012 09:35, Tim Gruene  wrote:
   


Dear Raji,

maybe you could increase the number of supporters if you included a link
to (a description of) the content of HR3699 - I will certainly not sign
something only summarised by a few polemic sentences ;-)

Cheers,
Tim

On 02/15/2012 11:53 PM, Raji Edayathumangalam wrote:
 

If you agree, please sign the petition below. You need to register on
the link below before you can sign this petition. Registration and signing
the petition took about a minute or two.

Cheers,
Raji

-- Forwarded message --
From: Seth Darst
Date: Tue, Feb 14, 2012 at 12:40 PM
Subject: HR3699, Research Works Act
To:


Rep. Carolyn Maloney has not backed off in her attempt to put forward the
interests of Elsevier and other academic publishers.

If you oppose this measure, please sign this petition on the official 'we
the people' White House web site. It needs 23,000 signatures before
February 22nd and only 1100 so far. Please forward far and wide.


Oppose HR3699, the Research Works Act

HR 3699, the Research Works Act will be detrimental to the free flow of
scientific information that was created using Federal funds. It is an
attempt to put federally funded scientific information behind pay-walls,
and confer the ownership of the information to a private entity. This is an
affront to open government and open access to information created using
public funds.

This link gets you to the petition:
https://wwws.whitehouse.gov/petitions#!/petition/oppose-hr3699-research-works-act/vKMhCX9k





   

--
Dr Tim Gruene
Institut fuer anorganische Chemie
Tammannstr. 4
D-37077 Goettingen

GPG Key ID = A46BEE1A

 
   


Re: [ccp4bb] image compression

2011-11-08 Thread Herbert J. Bernstein

ADSC has been a leader in supporting compressed CBF's.
=
  Herbert J. Bernstein
Professor of Mathematics and Computer Science
   Dowling College, Kramer Science Center, KSC 121
Idle Hour Blvd, Oakdale, NY, 11769

 +1-631-244-3035
 y...@dowling.edu
=

On Tue, 8 Nov 2011, Phil Evans wrote:

It would be a good start to get all images written now with lossless 
compression, instead of the uncompressed images we still get from the 
ADSC detectors. Something that we've been promised for many years


Phil



Re: [ccp4bb] image compression

2011-11-08 Thread Herbert J. Bernstein

Um, but isn't Crystallography based on a series of
one-way computational processes:
 photons -> images
 images -> {structure factors, symmetry}
 {structure factors, symmetry, chemistry} -> solution
 {structure factors, symmetry, chemistry, solution}
  -> refined solution

At each stage we tolerate a certain amount of noise
in "going backwards".  Certainly it is desirable to
have the "original data" to be able to go forwards,
but until the arrival of pixel array detectors, we
were very far from having the true original data,
and even pixel array detectors don't capture every
single photon.

I am not recommending lossy compressed images as
a perfect replacement for lossless compressed images,
any more than I would recommend structure factors
are a replacement for images.  It would be nice
if we all had large budgets, huge storage capacity
and high network speeds and if somebody would repeal
the speed of light and other physical constraints, so that
engineering compromises were never necessary, but as
James has noted, accepting such engineering compromises
has been of great value to our colleagues who work
with the massive image streams of the entertainment
industry.  Without lossy compression, we would not
have the _higher_ image quality we now enjoy in the
less-than-perfectly-faithful HDTV world that has replaced
the highly faithful, but lower capacity, NTSC/PAL world.

Please, in this, let us not allow the perfect to be
the enemy of the good.  James is proposing something
good.

Regards,
  Herbert
=============
  Herbert J. Bernstein
Professor of Mathematics and Computer Science
   Dowling College, Kramer Science Center, KSC 121
Idle Hour Blvd, Oakdale, NY, 11769

 +1-631-244-3035
 y...@dowling.edu
=

On Tue, 8 Nov 2011, Harry Powell wrote:


Hi


I am not a fan
of one-way computational processes with unique data.

Thoughts anyone?

Cheerio,

Graeme



I agree.

Harry
--
Dr Harry Powell, MRC Laboratory of Molecular Biology, MRC Centre, Hills Road, 
Cambridge, CB2 0QH

http://www.iucr.org/resources/commissions/crystallographic-computing/schools/mieres2011



Re: [ccp4bb] image compression

2011-11-07 Thread Herbert J. Bernstein

Dear James,

  You are _not_ wasting your time.  Even if the lossy compression ends
up only being used to stage preliminary images forward on the net while
full images slowly work their way forward, having such a compression
that preserves the crystallography in the image will be an important
contribution to efficient workflows.  Personally I suspect that
such images will have more important uses, e.g. facilitating
real-time monitoring of experiments using detectors providing
full images at data rates that simply cannot be handled without
major compression.  We are already in that world.  The reason that
the Dectris images use Andy Hammersley's byte-offset compression,
rather than going uncompressed or using CCP4 compression is that
in January 2007 we were sitting right on the edge of a nasty 
CPU-performance/disk bandwidth tradeoff, and the byte-offset

compression won the competition.   In that round a lossless
compression was sufficient, but just barely.  In the future,
I am certain some amount of lossy compression will be
needed to sample the dataflow while the losslessly compressed
images work their way through a very back-logged queue to the disk.

  In the longer term, I can see people working with lossy compressed
images for analysis of massive volumes of images to select the
1% to 10% that will be useful in a final analysis, and may need
to be used in a lossless mode.  If you can reject 90% of the images
with a fraction of the effort needed to work with the resulting
10% of good images, you have made a good decision.

  And then there is the inevitable need to work with images on
portable devices with limited storage over cell and WIFI networks. ...

  I would not worry about upturned noses.  I would worry about
the engineering needed to manage experiments.  Lossy compression
can be an important part of that engineering.

  Regards,
Herbert


At 4:09 PM -0800 11/7/11, James Holton wrote:

So far, all I really have is a "proof of concept" compression algorithm here:
http://bl831.als.lbl.gov/~jamesh/lossy_compression/

Not exactly "portable" since you need ffmpeg and the x264 libraries
set up properly.  The latter seems to be constantly changing things
and breaking the former, so I'm not sure how "future proof" my
"algorithm" is.

Something that caught my eye recently was fractal compression,
particularly since FIASCO has been part of the NetPBM package for
about 10 years now.  Seems to give comparable compression vs quality
as x264 (to my eye), but I'm presently wondering if I'd be wasting my
time developing this further?  Will the crystallographic world simply
turn up its collective nose at lossy images?  Even if it means waiting
6 years for "Nielsen's Law" to make up the difference in network
bandwidth?

-James Holton
MAD Scientist

On Mon, Nov 7, 2011 at 10:01 AM, Herbert J. Bernstein
 wrote:

 This is a very good question.  I would suggest that both versions
 of the old data are useful.  If what is being done is simple validation
 and regeneration of what was done before, then the lossy compression
 should be fine in most instances.  However, when what is being
 done hinges on the really fine details -- looking for lost faint
 spots just peeking out from the background, looking at detailed
 peak profiles -- then the lossless compression version is the
 better choice.  The annotation for both sets should be the same.
 The difference is in storage and network bandwidth.

 Hopefully the fraud issue will never again rear its ugly head,
 but if it should, then having saved the losslessly compressed
 images might prove to have been a good idea.

 To facilitate experimentation with the idea, if there is agreement
 on the particular lossy compression to be used, I would be happy
 to add it as an option in CBFlib.  Right now all the compressions

 we have are lossless.


 Regards,
  Herbert


 =
  Herbert J. Bernstein, Professor of Computer Science
   Dowling College, Kramer Science Center, KSC 121
Idle Hour Blvd, Oakdale, NY, 11769

 +1-631-244-3035
 y...@dowling.edu
 =

 On Mon, 7 Nov 2011, James Holton wrote:


 At the risk of sounding like another "poll", I have a pragmatic question
 for the methods development community:

 Hypothetically, assume that there was a website where you could download
 the original diffraction images corresponding to any given PDB file,
 including "early" datasets that were from the same project, but because of
 smeary spots or whatever, couldn't be solved.  There might even be datasets
 with "unknown" PDB IDs because that particular project never did work out,
 or because the relevant protein sequence has been lost.  Remember, few of
 these datasets will be less than 5 years old if we try to allow enough tim

Re: [ccp4bb] image compression

2011-11-07 Thread Herbert J. Bernstein

This is a very good question.  I would suggest that both versions
of the old data are useful.  If what is being done is simple validation
and regeneration of what was done before, then the lossy compression
should be fine in most instances.  However, when what is being
done hinges on the really fine details -- looking for lost faint
spots just peeking out from the background, looking at detailed
peak profiles -- then the lossless compression version is the
better choice.  The annotation for both sets should be the same.
The difference is in storage and network bandwidth.

Hopefully the fraud issue will never again rear its ugly head,
but if it should, then having saved the losslessly compressed
images might prove to have been a good idea.

To facilitate experimentation with the idea, if there is agreement
on the particular lossy compression to be used, I would be happy
to add it as an option in CBFlib.  Right now all the compressions
we have are lossless.

Regards,
  Herbert


=
 Herbert J. Bernstein, Professor of Computer Science
   Dowling College, Kramer Science Center, KSC 121
Idle Hour Blvd, Oakdale, NY, 11769

 +1-631-244-3035
 y...@dowling.edu
=

On Mon, 7 Nov 2011, James Holton wrote:

At the risk of sounding like another "poll", I have a pragmatic question for 
the methods development community:


Hypothetically, assume that there was a website where you could download the 
original diffraction images corresponding to any given PDB file, including 
"early" datasets that were from the same project, but because of smeary spots 
or whatever, couldn't be solved.  There might even be datasets with "unknown" 
PDB IDs because that particular project never did work out, or because the 
relevant protein sequence has been lost.  Remember, few of these datasets 
will be less than 5 years old if we try to allow enough time for the original 
data collector to either solve it or graduate (and then cease to care).  Even 
for the "final" dataset, there will be a delay, since the half-life between 
data collection and coordinate deposition in the PDB is still ~20 months. 
Plenty of time to forget.  So, although the images were archived (probably 
named "test" and in a directory called "john") it may be that the only way to 
figure out which PDB ID is the "right answer" is by processing them and 
comparing to all deposited Fs.  Assume this was done.  But there will always 
be some datasets that don't match any PDB.  Are those interesting?  What 
about ones that can't be processed?  What about ones that can't even be 
indexed?  There may be a lot of those!  (hypothetically, of course).
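
A minimal sketch of that matching step, assuming each orphan dataset 
has already been processed to amplitudes on an indexing consistent with 
the deposited data (the hard part, glossed over here); the function 
name and data layout are hypothetical:

# Score one processed dataset against one deposited structure-factor
# set by correlating the amplitudes of their common reflections.
import numpy as np

def match_score(f_obs, f_dep):
    # f_obs, f_dep: dicts mapping (h, k, l) -> amplitude
    common = sorted(set(f_obs) & set(f_dep))
    a = np.array([f_obs[hkl] for hkl in common])
    b = np.array([f_dep[hkl] for hkl in common])
    return np.corrcoef(a, b)[0, 1]  # the best candidate scores highest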


Anyway, assume that someone did go through all the trouble to make these 
datasets "available" for download, just in case they are interesting, and 
annotated them as much as possible.  There will be about 20 datasets for any 
given PDB ID.


Now assume that for each of these datasets this hypothetical website has two 
links, one for the "raw data", which will average ~2 GB per wedge (after gzip 
compression, taking at least ~45 min to download), and a second link for a 
"lossy compressed" version, which is only ~100 MB/wedge (2 min download). 
When decompressed, the images will visually look pretty much like the 
originals, and generally give you very similar Rmerge, Rcryst, Rfree, 
I/sigma, anomalous differences, and all other statistics when processed with 
contemporary software.  Perhaps a bit worse.  Essentially, lossy compression 
is equivalent to adding noise to the images.


Which one would you try first?  Does lossy compression make it easier to hunt 
for "interesting" datasets?  Or is it just too repugnant to have "modified" 
the data in any way shape or form ... after the detector manufacturer's 
software has "corrected" it?  Would it suffice to simply supply a couple of 
"example" images for download instead?


-James Holton
MAD Scientist



Re: [ccp4bb] To archive or not to archive, that's the question!

2011-10-29 Thread Herbert J. Bernstein

Dear John,

  Most sound institutional data repositories use some form of
off-site backup.  However, not all of them do, and the
standards of reliabilty vary.  The advantages of an explicit
partnering system are both practical and psychological.  The
practical part is the major improvement in reliability --
even if we start at 6 nines, 12 nines is better.  The
psychological part is that members of the community can
feel reassured that reliability has indeed been improved to
levels at which they can focus on other, more scientific
issues, instead of the question of reliability.

  Regards,
Herbert

=
 Herbert J. Bernstein, Professor of Computer Science
   Dowling College, Kramer Science Center, KSC 121
Idle Hour Blvd, Oakdale, NY, 11769

 +1-631-244-3035
 y...@dowling.edu
=

On Sat, 29 Oct 2011, Jrh wrote:


Dear Herbert,
I imagine it likely that eg The Univ Manchester eScholar system will have in 
place duplicate storage for the reasons you outline below. However for it to be 
geographically distant is, to my reckoning, less likely, but still possible. I 
will add that further query to my first query to my eScholar user support re 
dataset sizes and doi registration.
Greetings,
John
Prof John R Helliwell DSc



On 29 Oct 2011, at 15:49, "Herbert J. Bernstein"  
wrote:


One important issue to address is how deal with the perceived
reliability issues of the federated model and how to start to
approach the higher reliability of the centralized model described by
Gerard K, but without incurring what seems to be at present
unacceptable costs.  One answer comes from the approach followed in
communications systems.  If the probability of data loss in each
communication subsystem is, say, 1/1000, then the probability of data
loss in two independent copies of the same lossy system is only
1/1,000,000.  We could apply that lesson to the
federated data image archive model by asking each institution
to partner with a second independent, and hopefully geographically
distant, institution, with an agreement for each to host copies
of the other's images.  If we restrict that duplication protocol, at least at
first, to those images strongly related to an actual publication/PDB
deposition, the incremental cost of greatly improved reliability
would be very low, with no disruption of the basic federated
approach being suggested.
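
To put numbers on that argument, a minimal Python sketch (assuming the
two repositories really do fail independently, which geographic
separation approximates):

# Combined probability that all copies are lost, given independent
# per-copy loss probabilities (the argument made above).
def combined_loss(p_loss, copies=2):
    return p_loss ** copies

print(combined_loss(1e-3))  # 1e-06: two 1/1000 systems -> 1/1,000,000
print(combined_loss(1e-6))  # 1e-12: "6 nines" becomes "12 nines"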

Please note that I am not suggesting that institutional repositories
will have 1/1000 data loss rates, but they will certainly have some
data loss rate, and this modest change in the proposal would help to
greatly lower the impact of that data loss rate and allow us to go
forward with greater confidence.

Regards,
 Herbert


At 7:53 AM +0100 10/29/11, Jrh wrote:

Dear Gerard K,
Many thanks indeed for this.
Like Gerard Bricogne you also indicate that the location option being the 
decentralised one is 'quite simple and very cheap in terms of centralised 
cost'. The SR Facilities worldwide I hope can surely follow the lead taken by 
Diamond Light Source and PaN, the European Consortium of SR and Neutron 
Facilities, and keep their data archives and also assist authors with the doi 
registration process for those datasets that result in publication. Linking to 
these dois from the PDB for example is as you confirm straightforward.

Gerard B's pressing of the above approach via the 'Pilot project' within the 
IUCr DDD WG various discussions, with a nicely detailed plan, brought home to 
me the merit of the above approach for the even greater challenge for raw data 
archiving for chemical crystallography, both in terms of number of datasets and 
also the SR Facilities role being much smaller. IUCr Journals also note the 
challenge of moving large quantities of data around, i.e. if the Journals were to 
try to host everything for chemical crystallography, thus becoming 
'the centre' for these datasets.

So:-  Universities are now establishing their own institutional repositories, 
driven largely by Open Access demands of funders. For these to host raw 
datasets that underpin publications is a reasonable role in my view and indeed 
they already have this category in the University of Manchester eScholar 
system, for example.  I am set to explore locally here whether they would 
accommodate all our Lab's raw Xray images datasets per annum that underpin our 
published crystal structures.

It would be helpful if readers of this CCP4bb could kindly also explore with 
their own universities if they have such an institutional repository and if raw 
data sets could be accommodated. Please do email me off list with this 
information if you prefer but within the CCP4bb is also good.

Such an approach involving institutional repositories would also work of course 
for the 25% of MX structures that are for no

Re: [ccp4bb] To archive or not to archive, that's the question!

2011-10-29 Thread Herbert J. Bernstein

 - by charging depositors (just like they are charged Open Access 
charges, which can often be reclaimed from the funding agencies) - 
would you be willing to pay, say, 5000 USD per dataset to secure 
"perpetual" storage?


 - by charging users (i.e., Gerard Bricogne :-) - just kidding!

 Of course, if the consensus is to go for decentralised storage and 
a DOI-like identifier system, there will be no need for a central 
archive, and the identifiers could be captured upon deposition in 
the PDB. (We could also check once a week if the files still exist 
where they are supposed to be.)


 ---

 (4) Location.

 If the consensus is to have decentralised storage, the solution is 
quite simple and very cheap in terms of "centralised" cost - wwPDB 
can capture DOI-like identifiers upon deposition and make them 
searchable.


 If central storage is needed, then there has to be an institution 
willing and able to take on this task. The current wwPDB partners 
are looking at future funding that is at best flat, with increasing 
numbers of depositions that also get bigger and more complex. There 
is *no way on earth* that wwPDB can accept raw data (be it X-ray, 
NMR or EM! this is not an exclusive X-ray issue) without *at least* 
double the current level of funding (and not just in the US for 
RCSB, but also in Japan for PDBj and in Europe for PDBe)! I am 
pretty confident that this is simply *not* going to happen.


 [Besides, in my own humble opinion, in order to remain relevant 
(and fundable!) in the biomedical world, the PDB will have to 
restyle itself as a biomedical resource instead of a 
crystallographic archive. We must take the structures to the 
biologists, and we must expand in breadth of coverage to include 
emerging hybrid methods that are relevant for structural cell (as 
opposed to molecular) biology. This mission will be much easier to 
fund on three continents than archiving TBs of raw data that have 
little or no tangible (i.e., fundable) impact on our quest to find 
a cure for various kinds of cancer (or hairloss) or to feed a 
growing population.]


 However, there may be a more realistic solution. The role model 
could be NMR, which has its own global resource for data storage in 
the BMRB. BMRB is a wwPDB partner - if you deposit an NMR model 
with us, we take your ensemble coordinates, metadata, restraints 
and chemical shifts - any other NMR data (including spectra and 
FIDs) can subsequently be deposited with BMRB. These data will get 
their own BMRB ID which can be linked to the PDB ID.

 A model like this has advantages - it could be housed in a single 
place, run by X-ray experts (just as BMRB is co-located with 
NMRFAM, the national NMR facility at Madison), and there would be 
only one place that would need to secure the funding (which would 
be substantially larger than the estimate of $1000 per year 
suggested by a previous poster from La La Land). This could for 
instance be a synchrotron (linked to INSTRUCT?), or perhaps one of 
the emerging nations could be enticed to take on this challenging 
task. I would expect that such a centre would be closely affiliated 
with the wwPDB organisation, or become a member just like BMRB. A 
similar model could also be employed for archiving raw EM image 
data.


 ---

 I've said enough for today. It's almost time for the booze-up that 
kicks off the PDB40 symposium here at CSHL! Heck, some of you who 
read this might be here as well!


 Btw - Colin Nave wrote:

 "(in increasing order of influence/power do we have the Pope, US 
president, the Bond Market and finally Gerard K?)"


 I'm a tad disappointed to be only in fourth place, Colin! What has 
the Pope ever done for crystallography?


 --Gerard

 **
   Gerard J. Kleywegt

  http://xray.bmc.uu.se/gerard   mailto:ger...@xray.bmc.uu.se
 **
   The opinions in this message are fictional.  Any similarity
   to actual opinions, living or dead, is purely coincidental.
 **
   Little known gastromathematical curiosity: let "z" be the
   radius and "a" the thickness of a pizza. Then the volume
    of that pizza is equal to pi*z*z*a !
 **



--
=
 Herbert J. Bernstein, Professor of Computer Science
 Dowling College, Brookhaven Campus, B111B
   1300 William Floyd Parkway, Shirley, NY, 11967

 +1-631-244-1328
   Lab: +1-631-244-1935
  Cell: +1-631-428-1397
 y...@dowling.edu
=


Re: [ccp4bb] To archive or not to archive, that's the question!

2011-10-28 Thread Herbert J. Bernstein

As the poster who mentioned the $1000 - $3000 per terabyte per year
figure, I should point out that the figure originated not from "La La
land" but from an NSF RDLM workshop in Princeton last summer.  Certainly
the actual costs may be higher or lower depending on 
economies/diseconomies of scale and required ancillary tasks to be

performed.  The base figure itself seems consistent with the GBP 1500
figure cited for EBI.

That aside, the list presented seems very useful to the discussion.
I would suggest adding to it the need to try to resolve the
complex intellectual property issues involved.  This might be
a good time to try to get a consensus of the scientific community
of what approach to IP law would best serve our interests going
forward.  The current situation seems a bit messy.

Regards,
  Herbert

=====
 Herbert J. Bernstein, Professor of Computer Science
   Dowling College, Kramer Science Center, KSC 121
Idle Hour Blvd, Oakdale, NY, 11769

 +1-631-244-3035
 y...@dowling.edu
=

On Fri, 28 Oct 2011, Gerard DVD Kleywegt wrote:


Gerard
I said in INCREASING order of influence/power i.e. you are in first place.


Ooo! *Now* it makes sense! :-)

--Gerard



The joke comes from
" I used to think if there was reincarnation, I wanted to come back as the 
President or the Pope or a .400 baseball hitter. But now I want to come 
back as the bond market. You can intimidate everyone.

--James Carville, Clinton campaign strategist"

Thanks for the comprehensive reply
Regards
  Colin

-Original Message-
From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of 
Gerard DVD Kleywegt

Sent: 28 October 2011 22:03
To: ccp4bb
Subject: [ccp4bb] To archive or not to archive, that's the question!

Hi all,

It appears that during my time here at Cold Spring Harbor, I have missed a
small debate on CCP4BB (in which my name has been used in vain to boot).

I have not yet had time to read all the contributions, but would like to 
make
a few points that hopefully contribute to the discussion and keep it with 
two
feet on Earth (as opposed to La La Land where the people live who think 
that

image archiving can be done on a shoestring budget... more about this in a
bit).

Note: all of this is on personal title, i.e. not official wwPDB gospel. Oh,
and sorry for the new subject line, but this way I can track the replies 
more

easily.

It seems to me that there are a number of issues that need to be separated:

(1) the case for/against storing raw data
(2) implementation and resources
(3) funding
(4) location

I will say a few things about each of these issues in turn:

---

(1) Arguments in favour and against the concept of storing raw image data, 
as
well as possible alternative solutions that could address some of the 
issues

at lower cost or complexity.

I realise that my views carry a weight=1.0 just like everybody else's, and
many of the arguments and counter-arguments have already been made, so I 
will

not add to these at this stage.

---

(2) Implementation details and required resources.

If the community should decide that archiving raw data would be 
scientifically
useful, then it has to decide how best to do it. This will determine the 
level

of resources required to do it. Questions include:

- what should be archived? (See Jim H's list from (a) to (z) or so.) An
initial plan would perhaps aim for the images associated with the data used 
in

the final refinement of deposited structures.

- how much data are we talking about per dataset/structure/year?

- should it be stored close to the source (i.e., responsibility and costs 
for

depositors or synchrotrons) or centrally (i.e., costs for some central
resource)? If it is going to be stored centrally, the cost will be
substantial. For example, at the EBI -the European Bioinformatics 
Institute-
we have 15 PB of storage. We pay about 1500 GBP (~2300 USD) per TB of 
storage
(not the kind you buy at Dixons or Radio Shack, obviously). For stored 
data,
we have a data-duplication factor of ~8, i.e. every file is stored 8 times 
(at

three data centres, plus back-ups, plus a data-duplication centre, plus
unreleased versus public versions of the archive). (Note - this is only for
the EBI/PDBe! RCSB and PDBj will have to acquire storage as well.) 
Moreover,

disks have to be housed in a building (not free!), with cooling, security
measures, security staff, maintenance staff, electricity (substantial 
cost!),
rental of a 1-10 Gb/s connection, etc. All hardware has a life-cycle of 
three

years (barring failures) and then needs to be replaced (at lower cost, but
still not free).

- if the data is going to be stored centrally, how will it get there? Using
ftp will probably not be feasible.

- if it is not stored centrally, how will long-term data availability be
e

Re: [ccp4bb] IUCr committees, depositing images

2011-10-26 Thread Herbert J. Bernstein

Dear Colleagues,

  Gerard strikes a very useful note in pleading for a "can-do"
approach.  Part of going from "can-do" to "actually-done"
is to make realistic estimates of the costs of "doing" and
then to adjust plans appropriately to do what can be afforded
now and to work towards doing as much of what remains undone
as has sufficient benefit to justify the costs.

  We appear to be in a fortunate situation in which some
portion of the raw data behind a significant portion of the
studies released in the PDB could probably be retained for some
significant period of time and be made available for further
analysis.  It would seem wise to explore these possibilities
and try to optimize the approaches used -- e.g. to consider
moves towards well documented formats, and retention of critical
metadata with such data to help in future analysis.

  Please do not let the perfect be the enemy of the good.

  Regards,
Herbert

=============
 Herbert J. Bernstein, Professor of Computer Science
   Dowling College, Kramer Science Center, KSC 121
Idle Hour Blvd, Oakdale, NY, 11769

 +1-631-244-3035
 y...@dowling.edu
=

On Wed, 26 Oct 2011, Gerard Bricogne wrote:


Dear John and colleagues,

There seems to be a set of centrifugal forces at play within this thread,
distracting us from a sensible path of concrete action by throwing
decoys in every conceivable direction, e.g.

* "Pilatus detectors spew out such a volume of data that we can't
possibly archive it all" - does that mean that because the 5th generation of
Dectris detectors will be able to write one billion images a second and
catch every scattered photon individually, we should not try and archive
more information than is given by the current merged structure factor data?
That seems a complete failure of reasoning to me: there must be a sensible
form of raw data archiving that would stand between those two extremes and
would retain much more information than the current merged data but would
step back from the enormous degree of oversampling of the raw diffraction
pattern that the Pilatus and its successors are capable of.

* "It is all going to cost an awful lot of money, therefore we need a
team of grant writers to raise its hand and volunteer to apply for resources
from one or more funding agencies" - there again there is an avoidance of
the feasible by invocation of the impossible. The IUCr Forum already has an
outline of a feasibility study that would cost only a small amount of
joined-up thinking and book-keeping around already stored information, so
let us not use the inaccessibility of federal or EC funding as a scarecrow
to justify not even trying what is proposed there. And the idea that someone
needs to decide to stake his/her career on this undertaking seems totally
overblown.

Several people have already pointed out that the sets of images that
would need to be archived would be a very small subset of the bulk of
datasets that are being held on the storage systems of synchrotron sources.
What needs to be done, as already described, is to be able to refer to those
few datasets that gave rise to the integrated data against which deposited
structures were refined (or, in some cases, solved by experimental phasing),
to give them special status in terms of making them visible and accessible
on-line at the same time as the pdb entry itself (rather than after the
statutory 2-5 years that would apply to all the rest, probably in a more
off-line form), and to maintain that accessibility "for ever", with a link
from the pdb entry and perhaps from the associated publication. It seems
unlikely that this would involve the mobilisation of such large resources as
to require either a human sacrifice (of the poor person whose life would be
staked on this gamble) or writing a grant application, with the indefinite
postponement of action and the loss of motivation this would imply.

Coming back to the more technical issue of bloated datasets, it is a
scientific problem that must be amenable to rational analysis to decide on a
sensible form of compression of overly-verbose sets of thin-sliced, perhaps
low-exposure images that would already retain a large fraction, if not all,
of the extra information on which we would wish future improved versions of
processing programs to cut their teeth, for a long time to come. This
approach would seem preferable to stoking up irrational fears of not being
able to cope with the most exaggerated predictions of the volumes of data to
archive, and thus doing nothing at all.

I very much hope that the "can do" spirit that marked the final
discussions of the DDDWG (Diffraction Data Deposition Working Group) in
Madrid will emerge on top of all the counter-arguments that consist in
moving

Re: [ccp4bb] IUCr committees, depositing images

2011-10-25 Thread Herbert J. Bernstein

To be fair to those concerned about cost, a more conservative estimate
from the NSF RDLM workshop last summer in Princeton is $1,000 to $3,000
per terabyte per year for long term storage allowing for overhead in
moderate-sized institutions such as the PDB.  Larger entities, such
as Google are able to do it for much lower annual costs in the range of
$100 to $300 per terabyte per year.  Indeed, if this becomes a serious
effort, one might wish to consider involving the large storage farm
businesses such as Google and Amazon.  They might be willing to help
support science partially in exchange for eyeballs going to their sites.

Regards,
  H. J. Bernstein

At 1:56 PM -0600 10/25/11, James Stroud wrote:

On Oct 24, 2011, at 3:56 PM, James Holton wrote:


The PDB only gets about 8000 depositions per year



Just to put this into dollars. If each dataset is about 1.7 GB in 
size, then that's about 14 TB of storage that needs to come online 
every year to store the raw data for every structure. A two-second 
search reveals that Newegg has a 3 TB Hitachi for $200. So that's 
about five drives, or $1000/year of storage, for the raw data behind 
PDB deposits.


James



--
=====
 Herbert J. Bernstein, Professor of Computer Science
 Dowling College, Brookhaven Campus, B111B
   1300 William Floyd Parkway, Shirley, NY, 11967

 +1-631-244-1328
   Lab: +1-631-244-1935
  Cell: +1-631-428-1397
 y...@dowling.edu
=


Re: [ccp4bb] Mac OSX 10.7 Lion

2011-09-10 Thread Herbert J. Bernstein

Dear Colleagues,

  Lion is a reality all developers have to live with.  While I
agree that it would be a bad idea to update one's primary
development environment to Lion, it does seem a good idea to have
at least one system with sufficient memory, disk and good enough
graphics and the new UI (user interface) to be able to wring out
problems, especially on the UI side.

  So the question is:  Is a new MacBook Pro or new MacBook Air
sufficient for such testing, or is something heftier with a
studio display necessary?

  Regards,
Herbert



At 10:26 AM +0100 9/10/11, harry powell wrote:

Hi

My two ha'porth.

If you are thinking of "upgrading" your sole Mac software 
development box to Lion I'd say "don't do it unless you like a lot 
of pain". Anything built on Snow Leopard should run okay on Lion (my 
Tiger builds seem okay on 10.4, 10.5, 10.6...), so unless you really 
have an over-riding need to move to 10.7, I'd hang fire.


If you're only running applications and not developing, you can 
always shout at the developers if things go wrong.


My plan is to install Lion on a spare bootable disk and see what 
happens - if all else fails, at least I can ignore the upgrade until 
Apple release 10.7.1, 10.7.2, etc and fix most of their screw-ups.


On 10 Sep 2011, at 08:03, Jacques-Philippe Colletier wrote:


Hi,

Overall, the transition from 10.6 is seamless, crystallography-wise.

Of course you need to have the new Xcode 4.1 installed, and you 
should also download new, 10.7-dedicated 64-bit gcc/gfortran/g77 
bundles from http://hpc.sourceforge.net/

And then CNS, Phenix, CCP4, etc. will just run perfectly.
I had to reinstall Coot -- but that's minor.
(Mac)Pymol also works fine, yet (for some reason) uses a lot of 
resources even when idle.
As for the Uppsala Software Factory programs, you'll have to recompile 
them using the above-mentioned bundles, or get already-compiled 
binaries from Mark Harris.
You'll also need to recompile Gromacs (and fftw3) if you're using 
it -- but it then works great.


I agree with W. Scott on the fact that the new OS is really greedy 
in terms of resources.
I surely won't upgrade my other, older Mac computers that run just 
fine on 10.6.

But if you have a new Mac, Lion is really neat.

Best
Jacques


On Sep 10, 2011, at 2:09 AM, William Scott wrote:


Hi Phil:

I've found few, if any, advantages.  I fear for the future.

I've had problems getting coot to run stereo due to the X11 
implementation in 10.7.  Apart from that, no major problems with 
crystallographic software.


Lion greedily uses memory, and any computer I have with less than 
4 gig of memory has become extremely sluggish as a consequence of 
the "upgrade."  Ideally, you need 8 gig.


Even with that, on my 2010 mini that I use for music playback, I 
regressed to 10.6.8, because of the audio interface. (It seems 
less robust, more prone to dropouts and now lacks integer mode 
output).


Sara has been screaming at me for the last two weeks (nothing 
unusual in and of itself) because Apple decided to get rid of "Save As".


Xcode and the compiler set is free again on 10.7.

I've put some suggestions here for how to get rid of the most 
annoying new features:

http://sage.ucsc.edu/~wgscott/xtal/wiki/index.php/Lion_upgrade_notes

All the best,

Bill






On Sep 9, 2011, at 1:28 AM, Phil Evans wrote:

Is there any opinion or experience about whether Lion is ready 
for crystallographic use? Should I "upgrade"?


Phil


William G. Scott

Contact info:
http://chemistry.ucsc.edu/~wgscott/


Harry
--
Dr Harry Powell,
MRC Laboratory of Molecular Biology,
Hills Road,
Cambridge,
CB2 0QH



--
=
 Herbert J. Bernstein, Professor of Computer Science
   Dowling College, Kramer Science Center, KSC 121
Idle Hour Blvd, Oakdale, NY, 11769

 +1-631-244-3035
 y...@dowling.edu
=


Re: [ccp4bb] Software to Produce Linear Map of Surface Accessible Residues

2010-11-02 Thread Herbert J. Bernstein

Try RasMol 2.7.5, e.g.

# load the structure and strip the waters
load ../data/pdb1w0k.ent
restrict not hoh
# generate a dotted surface map, then select and list the atoms
# lying within 1.8 Angstroms of it (the surface-accessible ones)
map generate LRsurf dots
map select atom within 1.8
show selected


=
 Herbert J. Bernstein, Professor of Computer Science
   Dowling College, Kramer Science Center, KSC 121
Idle Hour Blvd, Oakdale, NY, 11769

 +1-631-244-3035
 y...@dowling.edu
=

On Tue, 2 Nov 2010, Buz Barstow wrote:


Dear All,

I'm looking for a software program to produce, given a 3D atomic structure of a 
molecule, a linear map showing the surface accessibility of residues in a 
protein structure.

Would anyone know of a program that can produce this sort of map?

Thanks! and all the best,

--Buz



Re: [ccp4bb] Whither pymol?

2010-09-21 Thread Herbert J. Bernstein

Peace and joy,

Bill






William G. Scott
Professor
Department of Chemistry and Biochemistry
and The Center for the Molecular Biology of RNA
228 Sinsheimer Laboratories
University of California at Santa Cruz
Santa Cruz, California 95064
USA

phone:  +1-831-459-5367 (office)
      +1-831-459-5292 (lab)
fax:+1-831-4593139  (fax)



--
=
 Herbert J. Bernstein, Professor of Computer Science
   Dowling College, Kramer Science Center, KSC 121
Idle Hour Blvd, Oakdale, NY, 11769

 +1-631-244-3035
 y...@dowling.edu
=


[ccp4bb] Important proposed changes in CIF

2010-07-11 Thread Herbert J. Bernstein

Dear CIF users,

As some of you may be aware, a new CIF dictionary framework is under
development. This framework consists of an updated CIF syntax
(CIF2), a new set of dictionary attributes (DDLm), and a
machine-readable language for describing algorithmic relationships
between datanames (dREL).  The working group for developing this new
framework has come up with a final draft for the CIF2 syntax
component, which is available at

http://www.iucr.org/__data/assets/pdf_file/0017/41426/cif2_syntax_changes_jrh20100705.pdf

We are now seeking feedback from the community on this proposed new
syntax standard.  Please note that this CIF2 standard is designed to
coexist with the CIF1 standard (which it closely resembles), rather
than to replace it.

The discussions surrounding the CIF2 specification are archived at

http://www.iucr.org/__data/iucr/lists/ddlm-group/ .

Some highlights of the proposed CIF2 syntax (an illustrative fragment follows the list):

* A list datavalue is introduced: lists are enclosed by square
brackets, e.g. [1 2 3 4] or  [[1 'x'] 3 ['y' 5 ['pqr' 7] 8 ]].
List-valued data items are vital for economically expressing matrix
and vector relationships in dREL algorithms.

* A table datavalue is introduced, enclosed by curly braces, e.g.
{"colour":"red" "size":"really big"}.  Table datastructures allow
tabulated values (e.g. f' values) to be transparently accessed in dREL
algorithms.

* Both lists and tables are recursive, that is, lists and tables can
contain other lists and tables

* Multi-line strings may now be delimited using triple quotes (""") or
triple single quotes ('''), as well as the CIF1.1 semicolon
delimiter.

* Single-quote delimited strings and double-quote delimited strings
may not contain instances of the delimiter character.  This differs
from the CIF1.1 standard, which allowed instances of the delimiting
character if the next character was not whitespace.

* CIF2 files are in UTF8 encoding.  Note that ASCII is a proper subset of 
UTF8.
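
Putting those pieces together, a hypothetical CIF2 fragment (purely
illustrative, not taken from the draft) might look like:

data_example
_example.list    [1 2 3 4]
_example.nested  [[1 'x'] 3 ['y' 5]]
_example.table   {"colour":"red" "size":"really big"}
_example.text    """A multi-line value
delimited by triple quotes"""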


The DDLm working group would welcome any feedback you may have on this
specification, whether through open discussion on this list or by
contacting members of the working group (see the online discussion
archive for names of the participants).


=
 Herbert J. Bernstein, Professor of Computer Science
   Dowling College, Kramer Science Center, KSC 121
Idle Hour Blvd, Oakdale, NY, 11769

 +1-631-244-3035
 y...@dowling.edu
=


Re: [ccp4bb] lossy compression of diffraction images

2010-05-09 Thread Herbert J. Bernstein

Dear Colleagues,

  The main problem with a lossy compression that suppresses weak
spots is that those spots may be a tip-off to a misidentified
symmetry, so you may wish to keep some faithful copy of the
original diffraction image until you are very certain of having
the symmetry right.

  That being said, such a huge compression sounds very useful,
and I would be happy to add it as an option of CBFlib for people
to play with once the code is reasonably stable and available,
and if it is not tied up in patents or licenses that conflict
with the LGPL.

  Regards,
Herbert

=
 Herbert J. Bernstein, Professor of Computer Science
   Dowling College, Kramer Science Center, KSC 121
Idle Hour Blvd, Oakdale, NY, 11769

 +1-631-244-3035
 y...@dowling.edu
=

On Sun, 9 May 2010, James Holton wrote:


Frank von Delft wrote:
Just looked at the algorithm, how it stores the average "non-spot" through 
all the images.


What happens with dataset where the "non-spot" (e.g. background) changes 
systematically through the dataset, i.e. anisotropic datasets or thin 
crystals lying flat in a thin loop?  How much worse is compression for 
that?

Cheers
phx
Well, what will happen in that case (with the current "algorithm") is that 
once a background pixel deviates from the median level by more than 4 
"sigmas", it will start to get stored losslessly.  Essentially, they will be 
treated as "spots" and the overall compression ratio will start to approach 
that of bzip2.


A "workaround" for this is simply to store the data set in "chunks" where the 
background level is similar, but I suppose a more intelligent thing to do 
would be to simply "scale" each image to the median background image, and 
store the scale factors (a list of 100 numbers for a 100-image data set) 
along with the other ancillary data.  I haven't done that yet.  Didn't want 
to spend too much time on this in case I incited some kind of revolt.
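
For what it is worth, the scheme described above might look roughly 
like this in numpy (a sketch of the idea only, not the actual code):

# Keep the per-pixel median background once; per image, store only a
# scale factor plus the >4-sigma "spot" pixels, which are kept
# losslessly. Everything else is reconstructed from the background.
import numpy as np

def encode(img, median_bg, sigma):
    scale = np.median(img) / np.median(median_bg)  # per-image scale
    resid = img - scale * median_bg
    idx = np.flatnonzero(np.abs(resid) > 4.0 * sigma)
    return scale, idx, img.ravel()[idx]

def decode(scale, idx, values, median_bg):
    out = scale * median_bg
    np.put(out, idx, values)  # restore the losslessly stored pixels
    return out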


-James Holton
MAD Scientist





On 07/05/2010 06:07, James Holton wrote:

Ian Tickle wrote:

I found an old e-mail from James Holton where he suggested lossy
compression for diffraction images (as long as it didn't change the
F's significantly!) - I'm not sure whether anything came of that!


Well, yes, something did come of this.  But I don't think Gerard 
Bricogne is going to like it.


Details are here:
http://bl831.als.lbl.gov/~jamesh/lossy_compression/

Short version is that I found a way to compress a test lysozyme dataset by 
a factor of ~33 with no apparent ill effects on the data.  In fact, 
anomalous differences were completely unaffected, and Rfree dropped from 
0.287 for the original data to 0.275 when refined against Fs from the 
compressed images.  This is no doubt a fluke of the excess noise added by 
compression, but I think it highlights how the errors in crystallography 
are dominated by the inadequacies of the electron density models we use, 
and not the quality of our data.


The page above lists two data sets: "A" and "B", and I am interested to 
know if and how anyone can "tell" which one of these data sets was 
compressed.  The first image of each data set can be found here:

http://bl831.als.lbl.gov/~jamesh/lossy_compression/firstimage.tar.bz2

-James Holton
MAD Scientist




Re: [ccp4bb] Refining against images instead of only reflections

2010-01-20 Thread Herbert J. Bernstein
I agree with Bernhard -- both on the soundness of the idea and on 
the difficulty in finding the right home for it in NSF or NIH, but

I would suggest giving it a try. -- Herbert

=
 Herbert J. Bernstein, Professor of Computer Science
   Dowling College, Kramer Science Center, KSC 121
Idle Hour Blvd, Oakdale, NY, 11769

 +1-631-244-3035
 y...@dowling.edu
=

On Wed, 20 Jan 2010, Bernhard Rupp wrote:


I think these arguments for image conservation and image *use* are well
taken.
The best source of information on what is going on may be the imgCIF
people -- I'd start with Andy Howard and Herbert Bernstein.

I think that image data (after detector- and configuration-specific
corrections to the raw images that should be quite accurate) might be
a good start for such efforts.

I also think that this is a *most interesting area* combining X-ray physics
and
biomolecular refinement. This also kills the idea. Because the NSF will
reject
any proposal because it has the b-word (bio) in it, and NIH will reject it
because it has the p-word (physics) in it.

If someone still wants to try, let me know.

Best, BR

-
The man who follows the crowd will get
no further than the crowd.
The man who walks alone will find himself
in places where no one has been before.
-


-Original Message-
From: CCP4 bulletin board [mailto:ccp...@jiscmail.ac.uk] On Behalf Of Jacob
Keller
Sent: Wednesday, January 20, 2010 9:47 AM
To: CCP4BB@JISCMAIL.AC.UK
Subject: [ccp4bb] Refining against images instead of only reflections

Dear Crystallographers,

One can see from many posts on this listserve that in any given x-ray
diffraction experiment, there are more data than merely the diffraction
spots. Given that we now have vastly increased computational power and data
storage capability, does it make sense to think about changing the paradigm
for model refinements? Do we need to "reduce" data anymore? One could
imagine applying various functions to model the intensity observed at every
single pixel on the detector. This might be unnecessary in many cases, but
in some cases, in which there is a lot of diffuse scattering or other
phenomena, perhaps modelling all of the pixels would really be more true to
the underlying phenomena? Further, it might be that the gap in R values
between high- and low-resolution structures would be narrowed significantly,

because we would be able to model the data, i.e., reproduce the images from
the models, equally well for all cases. More information about the nature of

the underlying macromolecules might really be gleaned this way. Has this
been discussed yet?

Regards,

Jacob Keller

***
Jacob Pearson Keller
Northwestern University
Medical Scientist Training Program
Dallos Laboratory
F. Searle 1-240
2240 Campus Drive
Evanston IL 60208
lab: 847.491.2438
cel: 773.608.9185
email: j-kell...@northwestern.edu
***



Re: [ccp4bb] Warren DeLano

2009-11-05 Thread Herbert J. Bernstein

Dear Colleagues,

  Warren's passing is a great loss to the community, not just for what he 
did for all of us with PyMOL in the past but for the loss of all the great 
things he would have done for the community in the future.


  Perhaps we can encourage one or more of the societies to create an award 
in his name to be given to people who carry on his ideals.  Those ideals 
deserve to be encouraged and perpetuated.


  With deepest condolences to his family,

Herbert


=====
 Herbert J. Bernstein, Professor of Computer Science
   Dowling College, Kramer Science Center, KSC 121
Idle Hour Blvd, Oakdale, NY, 11769

 +1-631-244-3035
 y...@dowling.edu
=

On Thu, 5 Nov 2009, Axel Brunger wrote:


Dear CCP4 Community:

I write today with very sad news about Dr. Warren Lyford DeLano.

I was informed by his family today that Warren suddenly passed
away at home on Tuesday morning, November 3rd.

While at Yale, Warren made countless contributions to the computational tools
and methods developed in my laboratory (the X-PLOR and CNS programs),
including the direct rotation function, the first prediction of helical 
coiled coil
structures, the scripting and parsing tools that made CNS a universal 
computational

crystallography program.

He then joined Dr. Jim Wells' laboratory at UCSF and Genentech, where he 
pursued

a Ph.D. in biophysics, discovering some of the principles that govern
protein-protein interactions.

Warren then made a fundamental contribution to biological sciences by 
creating the

Open Source molecular graphics program PyMOL that is widely used throughout
the world. Nearly all publications that display macromolecular structures use 
PyMOL.


Warren was a strong advocate of freely available software and the Open Source
movement.

Warren's family is planning to announce a memorial service, but arrangements 
have

not yet been made. I will send more information as I receive it.

Please join me in extending our condolences to Warren's family.

Sincerely yours,
Axel Brunger

Axel T. Brunger
Investigator,  Howard Hughes Medical Institute
Professor of Molecular and Cellular Physiology
Stanford University

Web:http://atbweb.stanford.edu
Email:  brun...@stanford.edu
Phone:  +1 650-736-1031
Fax:+1 650-745-1463








Re: [ccp4bb] I compressed my images by ~ a factor of two, and they load and process in mosflm faster

2009-09-22 Thread Herbert J. Bernstein

Which compression was used?  The packed compression saves a lot of space,
but requires much more CPU involvement.  The byte offset compression saves
less space but takes less CPU time.  From the numbers, I would guess it
was the packed.
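
For reference, the byte-offset scheme stores the difference from the
previous pixel in a single byte when it fits, escaping to wider
integers only for large jumps -- roughly like this Python sketch
(simplified; the real implementation lives in CBFlib):

# Simplified byte-offset compression: 1-byte deltas, with 0x80 as an
# escape to a 2-byte delta, and 0x80 0x00 0x80 escaping to 4 bytes.
import struct

def byte_offset_compress(pixels):
    out, prev = bytearray(), 0
    for v in pixels:
        d = v - prev
        if -127 <= d <= 127:
            out += struct.pack('<b', d)                    # 1-byte delta
        elif -32767 <= d <= 32767:
            out += b'\x80' + struct.pack('<h', d)          # 2-byte delta
        else:
            out += b'\x80\x00\x80' + struct.pack('<i', d)  # 4-byte delta
        prev = v
    return bytes(out)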

=
 Herbert J. Bernstein, Professor of Computer Science
   Dowling College, Kramer Science Center, KSC 121
Idle Hour Blvd, Oakdale, NY, 11769

 +1-631-244-3035
 y...@dowling.edu
=

On Mon, 21 Sep 2009, Harry Powell wrote:


Hi

Not a typical run, but I just got these on my Macbook pro from a 320 image 
1.5Å myoglobin dataset, collected on a Q315 -


[macf3c-4:~/test/cbf] harry% cd cbf
[macf3c-4:~/test/cbf/cbf] harry% time mosflm < integrate > integrate.lp
445.355u 27.951s 8:38.57 91.2%  0+0k 1+192io 41pf+0w
[macf3c-4:~/test/cbf/cbf] harry% cd ../original
[macf3c-4:~/test/cbf/original] harry% time mosflm < integrate > integrate.lp
279.331u 18.691s 8:05.76 61.3%  0+0k 0+240io 16pf+0w

I am somewhat surprised at this. Since I wasn't running anything else, I'm 
also a little surprised that not only are the "user" times above so 
different, but so are the percentages of the elapsed clock time. Herb may be 
able to comment more knowledgeably.


I don't have my Snow Leopard box here so can't compare the "ditto'd" files 
just at the moment.


On 21 Sep 2009, at 13:26, Waterman, David (DLSLtd,RAL,DIA) wrote:

Yes, this is exactly what I meant. If the data are amenable (which was 
addressed in the previous discussion with reference to diffraction images) 
and there is a suitable lossless compression/expansion algorithm, then on 
most modern computers it is faster to read the compressed data from disk 
and expand it in RAM, rather than directly read the uncompressed image from 
a magnetic plate. Of course this depends on all sorts of factors such as 
the speed of the disk, the compression ratio, the CPU(s) clock speed, if 
the decompression can be done in parallel, how much calculation the 
decompression requires, and so on.
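
To put illustrative numbers on the tradeoff (assumptions for the sake of
arithmetic, not measurements): an 18 MB binned Q315 frame read from a disk
sustaining 60 MB/s takes about 0.3 s; at 2:1 compression it is 9 MB and
about 0.15 s to read, so the compressed route wins whenever expansion
costs less than roughly 0.15 s of CPU per frame, a budget a lightweight
scheme such as byte offset should meet comfortably.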


Bill's example is nice because the compression is transparent, so no extra 
work needs to be done by developers. However, this is one for Macs only. 
I'd like to know whether integration runs faster using CBF images with the 
decompression overhead of CBFlib compared with reading the same data in 
uncompressed form on "standard" hardware (whatever that means).


Cheers
David

-Original Message-
From: CCP4 bulletin board [mailto:ccp...@jiscmail.ac.uk] On Behalf Of 
Andrew Purkiss-Trew

Sent: 18 September 2009 21:52
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] I compressed my images by ~ a factor of two, and they 
load and process in mosflm faster


The current bottleneck with file systems is the speed of getting data on or 
off the magnetic surface. So filesystem compression helps, as less data 
needs to be physically written or read per image. The CPU time spent 
compressing the data is less than the time saved in writing less data to 
the surface.


I would be interested to see if the speed up is the same with a solid state 
drive, as there is near 'random access' here, unlike with a magnetic drive 
where the seek time is one of the bottlenecks. For example, mechanical hard 
drives are limited to about 130MB/s, whereas SSDs can already manage 
200MB/s (faster than a first generation SATA interface at 150MB/s can cope 
with and one of the drivers behind the 2nd (300MB/s) and 3rd generation 
(600MB/s) SATA interfaces). The large size of our image files should make 
them ideal for use with SSDs.



Quoting "James Holton" :


I think it important to point out that despite the subject line, Dr.
Scott's statement was:
"I think they process a bit faster too"
Strangely enough, this has not convinced me to re-format my RAID array
with a new file system nor re-write all my software to support yet
another new file format.  I guess I am just lazy that way.  Has anyone
measured the speed increase?  Have macs become I/O-bound again? In any
case, I think it is important to remember that there are good reasons
for leaving image file formats uncompressed.  Probably the most
important is the activation barrier to new authors writing new
programs that read them.  "fread()" is one thing, but finding the
third-party code for a particular compression algorithm, navigating a
CVS repository and linking to a library are quite another!  This is
actually quite a leap for those
of us who never had any formal training in computer science.
Personally, I still haven't figured out how to read pck images, as
it is much easier to write "jiffy" programs for uncompressed data.
For example, if all you want to do is extract a group of pixels (such
as a spot), then you have to decompress the whole image!  In co
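
By way of contrast, a hedged sketch of the kind of "jiffy" that
uncompressed data makes easy; the header size, image width and 2-byte
unsigned pixels are assumptions to adjust for the detector at hand:

   #include <stdio.h>
   #include <stdint.h>

   /* Pull an nx-by-ny block of pixels (e.g. one spot) out of a raw,
      uncompressed image by seeking to each row of the region. */
   int read_spot(const char *path, long header_bytes, long image_width,
                 long x0, long y0, long nx, long ny, uint16_t *roi)
   {
       FILE *fp = fopen(path, "rb");
       if (!fp) return -1;
       for (long row = 0; row < ny; row++) {
           long offset = header_bytes + 2L*((y0 + row)*image_width + x0);
           if (fseek(fp, offset, SEEK_SET) != 0 ||
               fread(roi + row*nx, 2, (size_t)nx, fp) != (size_t)nx) {
               fclose(fp);
               return -1;
           }
       }
       fclose(fp);
       return 0;
   }

With a compressed format the same task means decompressing everything up
to (or all of) the image before the first pixel of the spot is available.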

[ccp4bb] RasMol 2.7.5 Release

2009-07-24 Thread Herbert J. Bernstein

The testing of RasMol 2.7.5 has gone well so far, and we intend to make it
the default release.  Except for a few remaining warning messages, the
problems reported during testing have been resolved.

The current best releases are:
Sourcekit:
http://downloads.sf.net/openrasmol/rasmol-2.7.5-23Jul09.tar.gz

MS Windows Installer
http://downloads.sf.net/openrasmol/RasWin_2_7_5_Install_24Jul09.exe

Mac OS X 10.5 Intel i386 install kit
http://downloads.sf.net/openrasmol/RasMol_2_7_5_i386_OSX_24Jul09.tar.gz

Slackware Linux 12 i686 install kit
http://downloads.sf.net/openrasmol/RasMol_2_7_5_i686_Slackware_24Jul09.tar.gz

Mac OS X 10.3.9 PPC install kit
http://downloads.sf.net/openrasmol/RasMol_2_7_5_PPC_OSX_24Jul09.tar.gz

The unix-style install kits should be unpacked and the
rasmol_install.sh installation script run.  Do
rasmol_install.sh --help to see the options.
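
For example, for the Slackware kit (the unpack directory name is an
assumption; use whatever the tarball actually creates):

   tar -xzf RasMol_2_7_5_i686_Slackware_24Jul09.tar.gz
   cd RasMol_2_7_5_i686_Slackware_24Jul09
   ./rasmol_install.sh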

This release is based on RasMol 2.7.4.2, the final reference release for
the 2.7.4 series.

The changes made between the 2.7.5 release candidate of 17 July
2009 and the formal release on 23 July 2009 were:

* Correction to the support for core CIF data file loads that was
disabled in the move to CBFlib in place of the internal CIF support.
* Correction to the CCP4 map read logic in the case of symmetry lines.
Thanks to Marian Szebenyi for finding this bug.
* Clarification to the install instructions for 64-bit unix systems.
Thanks to Marian Szebenyi and Mark Diekhans for pointing out the lack of
clarity.

The major changes in RasMol 2.7.5 are:

* Support for SBEVSL movie commands.
* Support for Lee-Richards surface approximation by contouring
pseudo-Gaussian electron densities.
* Selection of atoms by proximity to map contours
* Coloring of maps by the colors of neighboring atoms
* Significant improvements to the GTK version by Teemu Ikonen

The SVN at http://sf.net/projects/openrasmol should be consulted for
details of the code changes.  All rasmol source repositories have been
updated.  If you are building from source, be sure to download a
current CBFlib 0.8.1 kit to address the problem in loading core
CIF data files.


=
 Herbert J. Bernstein, Professor of Computer Science
   Dowling College, Kramer Science Center, KSC 121
Idle Hour Blvd, Oakdale, NY, 11769

 +1-631-244-3035
 y...@dowling.edu
=


[ccp4bb] Testers needed for RasMol 2.7.5

2009-07-17 Thread Herbert J. Bernstein

Testers would be appreciated for the release candidate binary kits for
RasMol 2.7.5 on sourceforge:

http://downloads.sf.net/openrasmol/RasWin_2_7_5_Install_17Jul09.exe
http://downloads.sf.net/openrasmol/RasMol_2_7_5_i686_Slackware_17Jul09.tar.gz
http://downloads.sf.net/openrasmol/RasMol_2_7_5_i386_OSX_17Jul09.tar.gz

The source tarball is

http://downloads.sf.net/openrasmol/rasmol-2.7.5-17Jul09.tar.gz

The manual is available at

http://www.bernstein-plus-sons.com/software/RasMol_2.7.5/doc/rasmol.html

Further documentation and web pages are in progress.

The MS Windows installer RasWin_2_7_5_Install_17Jul09.exe is run by
double-clicking it after download.  The Slackware and Intel Mac OSX
installers are run by downloading, unpacking, entering the directory
and executing

rasmol_install.sh --compilefonts

The major changes in this release are support for approximation to
Lee-Richards surfaces by use of the commands
  map generate LRsurf mesh
  map generate LRsurf surface
coloring of maps by
  map color atom
and selection of atoms near a map contour by
  map select atom

as well as preliminary support for the SBEVSL record movie making
commands

  http://sbevsl.wiki.sourceforge.net/Movie+Making+Commands

Bug reports, comments, corrections and suggestions would be appreciated.

Please report problems to

  y...@bernstein-plus-sons.com

=
 Herbert J. Bernstein, Professor of Computer Science
   Dowling College, Kramer Science Center, KSC 121
Idle Hour Blvd, Oakdale, NY, 11769

 +1-631-244-3035
 y...@dowling.edu
=


Re: [ccp4bb] 3 positions available in the CCP4 core group

2009-06-03 Thread Herbert J. Bernstein

My apologies to the list.  That was intended as an off-list comment.

=
 Herbert J. Bernstein, Professor of Computer Science
   Dowling College, Kramer Science Center, KSC 121
Idle Hour Blvd, Oakdale, NY, 11769

 +1-631-244-3035
 y...@dowling.edu
=

On Wed, 3 Jun 2009, Herbert J. Bernstein wrote:

FBU259 LEAD COMPUTATIONAL SCIENTIST looks like a wonderful job.  Think what 
Larry could do with it.  It is a shame he is not portable.


=
Herbert J. Bernstein, Professor of Computer Science
  Dowling College, Kramer Science Center, KSC 121
   Idle Hour Blvd, Oakdale, NY, 11769

+1-631-244-3035
y...@dowling.edu
=

On Wed, 3 Jun 2009, Martyn Winn wrote:


3 COMPUTATIONAL SCIENTIST POSITIONS IN THE CCP4 PROJECT FOR
MACROMOLECULAR CRYSTALLOGRAPHY

COMPUTATIONAL SCIENCE & ENGINEERING DEPARTMENT
STFC RUTHERFORD-APPLETON LABORATORY, UK

CCP4 (Collaborative Computational Project 4) is a large project with
the aim of developing and maintaining state of the art analysis
software for macromolecular crystallography (http://www.ccp4.ac.uk).
The CCP4 software suite is distributed to over 500 academic and
commercial sites around the world. In autumn 2009, the core STFC group
of CCP4 is relocating from Daresbury Laboratory to the Rutherford
Appleton Laboratory, and will be housed in the newly built Research
Complex at Harwell (http://www.rc-harwell.ac.uk). This will provide a
stimulating environment adjacent to the Diamond Light Source and other
developments.

FBU259 LEAD COMPUTATIONAL SCIENTIST
3 YEAR FIXED TERM, SALARY RANGE £41,930 - £46,589

An experienced person is sought to provide scientific leadership to
the core group, develop external collaborations and help define the
future direction of CCP4. Responsibilities will include:
* Day-to-day scientific direction of the core team
* Independent software development projects relevant to CCP4
* Pursuing strong collaborations with Diamond Light Source staff
* Contributing actively to the CCP4 educational and publicity programmes

FBU260 SCIENTIFIC PROGRAMMER
3 YEAR FIXED TERM, SALARY RANGE £26,088 - £36,798

An experienced scientific programmer is required to contribute to the
development of the next generation CCP4 graphical user interface and
the underlying automation framework. The principal role will be to
develop the framework for automated structure solution. In addition,
the postholder will contribute to the associated GUI and database
developments. The postholder will work closely with other CCP4
developers and collaborating scientists to ensure usability of the
released software.

FBU258: SCIENTIFIC PROGRAMMER
3 YEAR FIXED TERM, SALARY RANGE £26,088 - £28,986

A scientific programmer is required to contribute to the core
activities of the CCP4 team. Duties will include development of the
code base, assisting with the public release of software, contributing
to educational programmes and user support. The postholder will
work closely with other members of the core team and collaborating
developers. The CCP4 team contribute to several major development
projects, and the postholder may be involved in these as appropriate.

An excellent index linked pension scheme and generous leave allowance
are also offered.

For further information and how to apply: please visit www.scitech.ac.uk
(the Careers section will take you to the recruitment site
https://erecruit.cclrc.ac.uk/), telephone 01235 446677 or e-mail
recruitment-...@rl.ac.uk quoting the relevant reference number. Informal
enquiries may be made to Martyn Winn (martyn.w...@stfc.ac.uk).

Closing date for applications is 26 Jun 2009

Interviews will be held between 16th and 24th July


--
***
* *
*   Dr. Martyn Winn   *
* *
*   STFC Daresbury Laboratory, Daresbury, Warrington, WA4 4AD, U.K.   *
*   Tel: +44 1925 603455        E-mail: martyn.w...@stfc.ac.uk        *
*   Fax: +44 1925 603825        Skype name: martyn.winn               *
* URL: http://www.ccp4.ac.uk/martyn/  *
***


Re: [ccp4bb] 3 positions available in the CCP4 core group

2009-06-03 Thread Herbert J. Bernstein
FBU259 LEAD COMPUTATIONAL SCIENTIST looks like a wonderful job.  Think 
what Larry could do with it.  It is a shame he is not portable.


=
 Herbert J. Bernstein, Professor of Computer Science
   Dowling College, Kramer Science Center, KSC 121
Idle Hour Blvd, Oakdale, NY, 11769

 +1-631-244-3035
 y...@dowling.edu
=



Re: [ccp4bb] images

2009-03-18 Thread Herbert J. Bernstein

Some of us have already been discussing that possibility.
  -- Herbert

=
 Herbert J. Bernstein, Professor of Computer Science
   Dowling College, Kramer Science Center, KSC 121
Idle Hour Blvd, Oakdale, NY, 11769

 +1-631-244-3035
 y...@dowling.edu
=

On Wed, 18 Mar 2009, Bernhard Rupp wrote:


All right: How about then putting in a NIH challenge grant (due April 27)
for image archiving? Who is in?

BR

-Original Message-
From: Gerard Bricogne [mailto:g...@globalphasing.com]
Sent: Wednesday, March 18, 2009 4:12 PM
To: Bernhard Rupp
Cc: 'Gerard Bricogne'; CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] images

Dear Bernhard,

Re-reading your previous message, I can see that I did indeed misread
it, and I apologise for that. Perhaps it was the expression "put to rest" in
relation to a topic where so much action is needed that made me charge in
the wrong direction.

Although this "thread" is now attracting additional suggestions about
what else it might be a good thing to archive, this should not result in a
dilution of the urgency of this particular item. As for the argument that
any new task can only be done if there is extra money, then isn't this the
ideal time to argue that we need a "PDB stimulus package"? After all, the
PDB is a bank ... .


With best wishes,

 Gerard.

--
On Wed, Mar 18, 2009 at 12:14:29PM -0700, Bernhard Rupp wrote:

Maybe I was misunderstood. There is no doubt in my opinion and that of
those that have put effort into image conservation issues years ago that
keeping and archiving the images is more than desirable, for precisely
the reasons mentioned.

Nail my nauseating spell checker for the nausea that may have caused you.

Cheers, BR

-Original Message-
From: CCP4 bulletin board [mailto:ccp...@jiscmail.ac.uk] On Behalf Of Gerard Bricogne
Sent: Wednesday, March 18, 2009 11:03 AM
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] images

Dear Bernhard,

 I suppose you meant "ad nauseam" ;-) .

 In any case, what is the use of discussions and recommendations that
are not followed by action, and only result in making their contributors
themselves nauseated to the point of wanting to "put this to rest"?

 As Ethan has nicely stated in his reply to Garib's double-check of
whether we do need images, this matter should NOT be put to rest: it
should be dealt with. As was argued at the end of the paper by Joosten,
Womack et al. (Acta Cryst. D65, 176-185), the main advantage of depositing
images would be that it would enable and stimulate the further development
and testing of image integration and data processing software, to the same
degree that the deposition of structure factors has stimulated progress and
testing for structure refinement software.

 Far from a boring issue only capable of giving headaches to Standards
Committee members, this is a vital issue: with each undeposited set of
images that contributed in one way or another to the determination or
refinement of a deposited structure, there disappears an opportunity to
test improvements in methods and software that would be likely to improve
that deposited entry (and most others) at a future stage. I think we need
to take a long view on this, and abandon the picture of the PDB as a
static archive of frozen results: instead, it should be seen as a
repository of what is required not only to validate/authenticate the
deposited models, but to feed the continued improvement of the methods
used - and hence, at the next iteration, the constant revision and
improvement of those very models. In what way can this topic be a source
of nausea?


 With best wishes,

  Gerard.

--
On Wed, Mar 18, 2009 at 10:16:42AM -0700, Bernhard Rupp wrote:

As Herb will attest, the need for keeping images and the various reasons
for it have been discussed ad nauseum and agreed upon in various imgCIF
meetings - I am sure Herb or Andy Howard can provide links to the
documents/recommendations, to put this to rest.

Best, BR

Past ACA Data Standards Committee serf

-Original Message-
From: CCP4 bulletin board [mailto:ccp...@jiscmail.ac.uk] On Behalf Of Kay Diederichs
Sent: Wednesday, March 18, 2009 10:02 AM
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] images


--

 ===
 * *
 * Gerard Bricogne g...@globalphasing.com  *
 * *
 * Global Phasing Ltd. *
 * Sheraton House, Castle Park Tel: +44-(0)1223-353033 *
 * Cambridge CB3 0AX, UK   Fax: +44-(0)1223-366889 *
 * *
 ===





Re: [ccp4bb] images

2009-03-18 Thread Herbert J. Bernstein

Actually the radiologists who manage CT and PET scans of brains do have
a solution, called DICOM, see http://medical.nema.org/.  If we work
together as a community we should be able to do as well as the
rocket scientists and the brain surgeons' radiologists, perhaps even
better. -- Herbert

=====
 Herbert J. Bernstein, Professor of Computer Science
   Dowling College, Kramer Science Center, KSC 121
Idle Hour Blvd, Oakdale, NY, 11769

 +1-631-244-3035
 y...@dowling.edu
=

On Wed, 18 Mar 2009, Jacob Keller wrote:

Apparently it DOES take a rocket scientist to solve this problem. Maybe the 
brain surgeons also have a solution?


JPK

***
Jacob Pearson Keller
Northwestern University
Medical Scientist Training Program
Dallos Laboratory
F. Searle 1-240
2240 Campus Drive
Evanston IL 60208
lab: 847.491.2438
cel: 773.608.9185
email: j-kell...@northwestern.edu
***

- Original Message - From: "Klaas Decanniere" 


To: 
Sent: Wednesday, March 18, 2009 5:36 AM
Subject: Re: [ccp4bb] images



Herbert J. Bernstein wrote:

  This is a good time to start a major crystallographic image
archiving effort.  Money may well be available now that will not be
available six months from now, and we have good, if not perfect,
solutions available for many, if not all, of the technical issues
involved.  Is it really wise to let this opportunity pass us by?

Other sciences have struggled with this and seem to have found an answer.
Have e.g. a look at http://heasarc.nasa.gov/docs/heasarc/fits.html

kind regards,

Klaas


The deposition of images would be possible providing some consistent
imagecif format was agreed.
This would of course be of great use to developers for certain
pathological cases, but not I suspect much value to the user
community - I download structure factors all the time for test
purposes but I probably would not bother to go through the data
processing, and unless there were extensive notes associated with
each set of images I suspect it would be hard to reproduce sensible
results.







Re: [ccp4bb] images

2009-03-16 Thread Herbert J. Bernstein

Dear Colleagues,

  The issue Harry is describing, of people writing multiple variations of 
"image formats" even though all of them are imgCIF is not really a 
problems with the images themselves.  Rather it is a lack of agreement on 
the metadata to go with the images.  This is similar to the problem of 
lack of consistency in REMARKS for early PDB data sets, which eventually 
required the adoption of standardized REMARKS and reprocessing of almost 
all data sets.  I don't think it would have been easier to reprocess those 
data sets if the original data sets had also had their coordinates and 
sequences recorded with wide variations in formats.


  The advantage of using imgCIF for an archive is not that it would force 
everybody to do their experiments using precisely the same format, but 
that, because it is capable of faithfully representing all the wide 
variations in current formats, it would allow what we now have to be 
captured and preserved and, when someone needed a dataset back, to be 
recast in a format appropriate to the use.


  Think of it as that little figure-8 plug and socket we are able to use 
to adapt our power cords for travel around the world.  There are other 
possible hub formats (NeXus, DICOM, etc.), but the sensible thing for an 
archive is to choose one of them for internal use, just as the PDB uses a 
variation on mmCIF for its internal use to allow it to easily deliver 
valid PDB, CIF and XML versions of sets of coordinates.  For an archive, 
the advantages of using imgCIF internally, no matter which of the more 
than 200 current formats were to be used at beam lines and in labs, is 
that it would not be necessary to discard any of the metadata people 
provided and it could be made to interoperate easily with the systems used 
internally by the PDB for coordinate data sets.


  For many of the formats in current use, there is no place to store some 
of the information people provide and translation to other formats can 
sometimes be much more difficult than one might expect unless additional 
metadata is provided.  Even such obvious things as image orientations are 
sometimes carried separately from the images themselves and can easily get 
lost.
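
  For concreteness, here is roughly what the imgCIF/CBF carrier looks
like, with illustrative values rather than a verbatim header from any
particular beamline: the metadata travels as ordinary CIF tags ahead of a
MIME-style binary section, so nothing supplied with the image need be
thrown away:

   ###CBF: VERSION 1.5
   data_image_1

   _diffrn.id DIFFRN_ID
   _diffrn_radiation_wavelength.wavelength 0.9795

   _array_data.data
   ;
   --CIF-BINARY-FORMAT-SECTION--
   Content-Type: application/octet-stream;
        conversions="x-CBF_BYTE_OFFSET"
   X-Binary-Size: 18878464
   X-Binary-Element-Type: "signed 32-bit integer"

   ...binary data...
   --CIF-BINARY-FORMAT-SECTION----
   ;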


  Don't let the perfect be the enemy of the good.  Archiving images in a 
common format, such as imgCIF, or, if you prefer, say, in the NeXus 
transliteration of imgCIF, would help to make some very useful information 
accessible for future use.  It may not be a perfect solution, but it is a 
good one.


  This is a good time to start a major crystallographic image archiving 
effort.  Money may well be available now that will not be available six 
months from now, and we have good, if not perfect, solutions available for 
many, if not all, of the technical issues involved.  Is it really wise to 
let this opportunity pass us by?


  Regards,
Herbert
=============
 Herbert J. Bernstein, Professor of Computer Science
   Dowling College, Kramer Science Center, KSC 121
Idle Hour Blvd, Oakdale, NY, 11769

 +1-631-244-3035
 y...@dowling.edu
=

On Mon, 16 Mar 2009, Harry Powell wrote:


Hi

I'm afraid the adoption of imgCIF (or CBF, its useful binary equivalent) 
doesn't help a lot - I know of three different manufacturers of detectors 
who, between them, write out four different image formats, all of which 
apparently conform to the agreed IUCr imgCIF standard. Each manufacturer has 
its own good and valid reasons for doing this. It's actually less work for me 
as a developer of integration software to write new code to incorporate a new 
format than to make sure I can read all the different imgCIFs properly.



On 16 Mar 2009, at 09:32, Eleanor Dodson wrote:

The deposition of images would be possible providing some consistent 
imagecif format was agreed.
This would of course be of great use to developers for certain pathological 
cases, but not I suspect much value to the user community - I download 
structure factors all the time for test purposes but I probably would not 
bother to go through the data processing, and unless there were extensive 
notes associated with each set of images I suspect it would be hard to 
reproduce sensible results.


The research council policy in the UK is that original data is meant to be 
archived for publicly funded projects. Maybe someone should test the 
reality of this by asking the PI for the data sets?

Eleanor


Garib Murshudov wrote:

Dear Gerard and all MX crystallographers

As I see there are two problems.
1) Minor problem: Sanity, semantic and other checks for currently 
available data. It should not be difficult to do. Things like I/sigma, 
some statistical analysis expected vs "observed" statistical behaviour 
should sort out many of these problems (Eleanor mentioned some and they 
can be used). I do n

Re: [ccp4bb] cl.exe

2009-03-12 Thread Herbert J. Bernstein

Have you considered gcc under MINGW?  It produced more stable
code that runs under more versions of windows than CL.

=
 Herbert J. Bernstein, Professor of Computer Science
   Dowling College, Kramer Science Center, KSC 121
Idle Hour Blvd, Oakdale, NY, 11769

 +1-631-244-3035
 y...@dowling.edu
=

On Wed, 11 Mar 2009, Bernhard Rupp wrote:


Dear All,



mysteriously the CL.exe from my ancient MS Visual Developer Studio
disappeared and I have no more backup CD. Could someone kindly zip me
that program? I think any CL.exe will work. I need to compile/link some
C-modules...



Thx!!!



Bernhard Rupp
001 (925) 209-7429
+43 (676) 571-0536
b...@qedlife.com
bernhardr...@sbcglobal.net
http://www.ruppweb.org/








Re: [ccp4bb] Non-sequential residue numbering?

2008-09-19 Thread Herbert J. Bernstein

--
=
 Herbert J. Bernstein, Professor of Computer Science
   Dowling College, Kramer Science Center, KSC 121
Idle Hour Blvd, Oakdale, NY, 11769

 +1-631-244-3035
 [EMAIL PROTECTED]
=


[ccp4bb] LGPL for more of CBFlib

2008-05-23 Thread Herbert J. Bernstein

imgCIF/CBF is a format for image data, such as synchrotron diffraction
images.  CBFlib is a software package supporting the imgCIF/CBF format.
For background information, see Hall and McMahon, International Tables for
Crystallography, Volume G, Definition and exchange of crystallographic data,
IUCr, Springer, 2008, Dordrecht, NL, esp. chapters 2.3, 3.7, 4.6 and 5.6.

The CBFlib package available from

  http://www.sourceforge.net/projects/cbflib

is an open source package covered by the GNU General Public Licence (GPL).
The CBFlib Applications Programming Interface (API) is also covered by the
GNU Lesser General Public License (LGPL), which is also known as the GNU
Library General Public License.

Effective immediately, all functions, methods, subroutines and procedures
in the CBFlib package will be considered to be part of the API and to
be covered by the LGPL as an alternative to the GPL that covers everything
in the CBFlib package.

This change results from the discussions at the 22 May 2008 workshop
at BNL to help make detector vendors and others with proprietary software
more comfortable in using the CBFlib package.

Thanks to Teemu Ikonen, since February 2008 CBFlib is a debian package
and you may link to the functions in the CBFlib package from a
proprietary program just as you may link to glibc or to the trigonometry
functions in the libm math library.

Use it in good health.

  -- Herbert J. Bernstein

--
=
 Herbert J. Bernstein, Professor of Computer Science
   Dowling College, Kramer Science Center, KSC 121
Idle Hour Blvd, Oakdale, NY, 11769

 +1-631-244-3035
 [EMAIL PROTECTED]
=


[ccp4bb] Raw Image Workshop BNL 22 May 08

2008-05-03 Thread Herbert J. Bernstein

The agenda for the one day Raw Image Workshop at BNL on
22 May 08 has been posted at

http://www.medsbio.org/meetings/BNL_May08_imgCIF_Workshop.html

We can take a few more participants.  If you wish to attend,
send an email to [EMAIL PROTECTED] no later
than this coming Friday, 9 May 2008.

The abstract deadline is Friday, 9 May 2008 and the
handout deadline is Wednesday, 14 May 2008.

--
=
 Herbert J. Bernstein, Professor of Computer Science
   Dowling College, Kramer Science Center, KSC 121
Idle Hour Blvd, Oakdale, NY, 11769

 +1-631-244-3035
 [EMAIL PROTECTED]
=


[ccp4bb] Registration deadline Workshop on Raw Image Format

2008-04-13 Thread Herbert J. Bernstein

Reminder:  The registration deadline for the Workshop on Raw Image
Formats in Structural Biology at BNL on 22 May 2008 is approaching.
Please send email to [EMAIL PROTECTED] no later than
15 April 2008 if you are interested in attending.

===

There will be a workshop on Raw Image Formats in Structural Biology
to be held at Brookhaven National Laboratory on 22 May 2008.

Over the past 2 years, imgCIF has seen increasing use, and the interactions
among raw image formats for x-ray crystallography, neutron crystallography
and microscopy have started to be addressed.  This one-day meeting
will be a follow-up to the 2006--2007 imgCIF workshops at the summer
2006 ACA meeting, at BNL in May 2007, and in summer 2007 in
conjunction with BSR 2007 in Manchester and at the Diamond Light
Source. We will have reviews of the current status of imgCIF,
exploration of ways to move between imgCIF and NeXus using XML and
HDF and ways to work with microscopy and tomography images.

The morning will be used for presentations and the afternoon for
discussions and plans for the future.  There is no registration fee,
but space is limited, so registration will be required.  Lunch will
be provided.  If you are interested in participating, please email
[EMAIL PROTECTED] no later than 15 April 2008.

The new imgCIF workshop series has been funded in part by NSF, NIH and
DOE.


--
=
 Herbert J. Bernstein, Professor of Computer Science
   Dowling College, Kramer Science Center, KSC 121
Idle Hour Blvd, Oakdale, NY, 11769

 +1-631-244-3035
 [EMAIL PROTECTED]
=


[ccp4bb] RasMol: 2.7.4.2 binaries updated

2008-04-12 Thread Herbert J. Bernstein

11 April 2008:

The binaries for the recently released RasMol 2.7.4.2 kit on
sourceforge have been updated to include new MS Windows, Linux, Mac
OS X 10.5 Universal Binary and Mac OS X 10.3.9 PPC binary kits. The
Mac and Linux binary kits include rasmol_install.sh and rasmol_run.sh
scripts. A recently discovered problem in the load command has been
fixed in these binaries.

If you downloaded a binary prior to 11 April 2008, you may wish to
download these updated binaries.

The widely used molecular graphics program RasMol has been upgraded
to include support for maps, for Bulgarian, Chinese, English,
Italian, Japanese, Russian and Spanish messages and menus, and
support for the wwPDB remediated PDB format. File releases are
available at http://www.sourceforge.net/projects/openrasmol.
Additional binaries will be added as more ports are completed.

RasMol was written by Roger Sayle in 1992, and has become an
important tool in both research and education in structural biology.
The program displays molecular images of PDB, mmCIF and many other
formats of macromolecular data. It uses simple menus and a simple and
effective command/scripting language.

The project CVS on sourceforge has been updated to this level.
Molecular graphics software developers are invited to contribute new
code. System managers and users are invited to contribute compiled
binaries.

Many thanks to Roger Sayle who wrote RasMol and managed the
development of the program for most of the 1990s and to all the
people who contributed to the development of RasMol and who continue
to contribute. There are too many to list here, but please checkout
the names listed in the manual and the changelog.

Among the tasks in our queue are extension of the scripting language
for compatibility with PyMOL and Jmol via the Structural Biology
Extensible Visualization Scripting Language (SBEVSL), addition of
more language variants (including a full Chinese translation of the
RasMol manual), and more install kits. Come join the fun.

Work on RasMol has been supported in part by the U.S. Department of
Energy, the U. S. National Science Foundation, and the U.S. NIH
National Institute of General Medical Sciences. Any opinions,
findings, and conclusions or recommendations expressed in this
material are those of the author(s) and do not necessarily reflect
the views of the funding agencies.

If you find RasMol useful, we hope that you will consider making a
donation to help support further development via the sourceforge
donation system.

Herbert J. Bernstein
yaya-hjb at users.sourceforge.net
openrasmol.sourceforge.net

--
=
 Herbert J. Bernstein, Professor of Computer Science
   Dowling College, Kramer Science Center, KSC 121
Idle Hour Blvd, Oakdale, NY, 11769

 +1-631-244-3035
 [EMAIL PROTECTED]
=


[ccp4bb] Non-US participants at Workshop on Raw Image Formats in Structural Biology

2008-03-21 Thread Herbert J. Bernstein

*** IMPORTANT INFORMATION FOR NON-US CITIZENS PLANNING TO ATTEND THE
22 MAY 2008 BNL WORKSHOP ON RAW IMAGE FORMATS IN STRUCTURAL BIOLOGY ***

In order to allow sufficient time for processing and approvals,
if you are a non-US citizen planning to attend the a workshop on
Raw Image Formats in Structural Biology to be held at Brookhaven
National Laboratory on 22 May 2008, please complete the on-line "Guest
Registration Form" located on the left-hand side of the BNL Home Page
http://www.bnl.gov/.  You will need to use Microsoft Internet Explorer
in order to complete the on-line form.

Please select the "Biology Department" for the "department to be
assigned to" and please complete the Passport and Visa section.
To ensure we receive approvals in time for the workshop,
please try to complete your forms by next Friday morning, March 28.

The building in which we will hold the workshop is building 463.
If you may be visiting the NSLS, please send email to
[EMAIL PROTECTED], inasmuch as we need to list each
building you will visit.

If you have any difficulty with the system, please send email to
[EMAIL PROTECTED]

==


There will be a workshop on Raw Image Formats in Structural Biology
to be held at Brookhaven National Laboratory on 22 May 2008.

Over the past 2 years, imgCIF has seen increasing use, and the interactions
among raw image formats for x-ray crystallography, neutron crystallography
and microscopy have started to be addressed.  This one-day meeting
will be a follow-up to the 2006--2007 imgCIF workshops at the summer
2006 ACA meeting, at BNL in May 2007, and in summer 2007 in
conjunction with BSR 2007 in Manchester and at the Diamond Light
Source. We will have reviews of the current status of imgCIF,
exploration of ways to move between imgCIF and NeXus using XML and
HDF and ways to work with microscopy and tomography images.

The morning will be used for presentations and the afternoon for
discussions and plans for the future.  There is no registration fee,
but space is limited, so registration will be required.  Lunch will
be provided.  If you are interested in participating, please email
[EMAIL PROTECTED] no later than 15 April 2008.

The new imgCIF workshop series has been funded in part by NSF, NIH and
DOE.

--
=========
 Herbert J. Bernstein, Professor of Computer Science
   Dowling College, Kramer Science Center, KSC 121
Idle Hour Blvd, Oakdale, NY, 11769

 +1-631-244-3035
 [EMAIL PROTECTED]
=


[ccp4bb] Workshop on Raw Image Formats in Structural Biology

2008-03-19 Thread Herbert J. Bernstein

There will be a workshop on Raw Image Formats in Structural Biology
to be held at Brookhaven National Laboratory on 22 May 2008.

Over the past 2 years, imgCIF has seen increasing use, and the interactions
among raw image formats for x-ray crystallography, neutron crystallography
and microscopy have started to be addressed.  This one-day meeting
will be a follow-up to the 2006--2007 imgCIF workshops at the summer
2006 ACA meeting, at BNL in May 2007, and in summer 2007 in
conjunction with BSR 2007 in Manchester and at the Diamond Light
Source. We will have reviews of the current status of imgCIF,
exploration of ways to move between imgCIF and NeXus using XML and
HDF and ways to work with microscopy and tomography images.

The morning will be used for presentations and the afternoon for
discussions and plans for the future.  There is no registration fee,
but space is limited, so registration will be required.  Lunch will
be provided.  If you are interested in participating, please email
[EMAIL PROTECTED] no later than 15 April 2008.

The new imgCIF workshop series has been funded in part by NSF, NIH and
DOE.

--
=
 Herbert J. Bernstein, Professor of Computer Science
   Dowling College, Kramer Science Center, KSC 121
Idle Hour Blvd, Oakdale, NY, 11769

 +1-631-244-3035
 [EMAIL PROTECTED]
=


Re: [ccp4bb] Missing fonts...

2008-03-05 Thread Herbert J. Bernstein

Dear Harry,

  The simplest solution may well be to do what we are starting to do
with RasMol to deal with the divergence in font sets on Linux systems.
We are gathering font kits, and then setting up install and run scripts that
put the fonts into convenient directories in the user account and then
do an xset +fp on each run.

  Regards,
Herbert



At 10:34 AM + 3/5/08, Harry Powell wrote:

Hi folks

I've had a couple of reports recently of people trying to run 
ipmosflm on Fedora Core 8 machines, and they get the following error 
-


** xdl_view error in routine xdl_open_view **
** Unable to load *adobe-courier-medium-r*--8* font **

I'm assuming that the install for FC8 is not including all the 
Courier fonts; since I don't have an FC8 machine handy, could anyone 
here offer advice about the best way to install these?


 Harry
--
Dr Harry Powell, MRC Laboratory of Molecular Biology, MRC Centre, 
Hills Road, Cambridge, CB2 2QH



--
=========
 Herbert J. Bernstein, Professor of Computer Science
   Dowling College, Kramer Science Center, KSC 121
Idle Hour Blvd, Oakdale, NY, 11769

 +1-631-244-3035
 [EMAIL PROTECTED]
=


Re: [ccp4bb] bond lengths, angles, ideality and refinements

2008-01-09 Thread Herbert J. Bernstein
"tables 1" is formally correct but awkward.  "table 1s" is confusing. I 
would suggest that we treat "table 1" like sheep and make the plural the 
same as the singular.  If you don't approve of revising the English 
language, then a valid way to avoid the need for a plural is to say "each 
table 1".


  --  Herbert

=============
 Herbert J. Bernstein, Professor of Computer Science
   Dowling College, Kramer Science Center, KSC 121
Idle Hour Blvd, Oakdale, NY, 11769

 +1-631-244-3035
 [EMAIL PROTECTED]
=

On Wed, 9 Jan 2008, Gerard DVD Kleywegt wrote:

...>
and start quoting RMS-Z-scores (from whatcheck or, soon, from refmac) in your 
tables 1 ("table 1s"? what *is* the plural of "table 1"?).

...


Re: [ccp4bb] converting structure factor files to mtz files

2007-10-31 Thread Herbert J. Bernstein

If you want to "roll your own"...


If you add the data_xxx line to make this a legal CIF, you should
be able to read it with ciftbx if you are working in a Fortran application
or CBFlib if you are working in a C application.  You will find
a variety of CIF tools pointed to from the IUCr web site and from the
RCSB web site.
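
As a hedged illustration of the "roll your own" route, in C rather than
Fortran, assuming the data_xxx line, loop_ header and tag lines have been
stripped so that only the nine-column rows remain on stdin:

   #include <stdio.h>

   /* Skip the three id columns; keep h, k, l, F, sigF and the status. */
   int main(void)
   {
       int cid, wid, sg, h, k, l;
       double f, sigf;
       char status[8];
       while (scanf("%d %d %d %d %d %d %lf %lf %7s",
                    &cid, &wid, &sg, &h, &k, &l, &f, &sigf, status) == 9)
           printf("%5d %5d %5d %12.2f %10.2f  %s\n",
                  h, k, l, f, sigf, status);
       return 0;
   }

A fixed Fortran edit descriptor would need fixed-width columns, which
mmCIF does not guarantee; that is exactly why a real CIF reader such as
ciftbx is the safer route.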



At 7:49 PM -0400 10/31/07, Zheng Zhou wrote:

Hi,

Could anyone give a quick hint for the Fortran format for the 
following structure factor mmCIF file? Or is there an easy program 
or better way to convert it? I think I need to skip the first 3 columns.


Thanks in advance.

Joe

loop_
_refln.crystal_id
_refln.wavelength_id
_refln.scale_group_code
_refln.index_h
_refln.index_k
_refln.index_l
_refln.F_meas_au
_refln.F_meas_sigma_au
_refln.status
1 1 1    2   0   0   617.50   5.41  o
1 1 1    4   0   0   773.50   6.92  o
1 1 1    6   0   0    62.30   3.19  o

I am trying to view the electron density of a published structure. I 
downloaded the file from pdb and used cif2mtz in ccp4. I think the 
following output mtz is wrong.


* Column Labels :

 H K L FREE FP SIGFP F(+) SIGF(+) F(-) SIGF(-) DP SIGDP I(+) SIGI(+) I(-) SIGI(-)


 * Column Types :

 H H H I F Q G L G L D Q K M K M

 * Associated datasets :

 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1

 * Cell Dimensions : (obsolete - refer to dataset cell dimensions above)

   88.0800   86.3600   80.7700   90.   95.7100   90.

 *  Resolution Range :

0.00045    0.29217     ( 47.298 -  1.850 A )

 * Sort Order :

  1 2 3 0 0

 * Space group = 'C 1 2 1' (number 5)



 OVERALL FILE STATISTICS for resolution range   0.000 -   0.292
 ===


 Col Sort    Min       Max      Num       %      Mean    Mean  Resolution   Type Column
 num order                    Missing complete           abs.   Low   High       label

   1 ASC     -47        47        0    100.00    -1.4    17.9  47.28  1.85   H  H
   2 NONE      0        46        0    100.00    17.2    17.2  47.28  1.85   H  K
   3 NONE      0        43        0    100.00    16.4    16.4  47.28  1.85   H  L
   4 NONE    0.0      19.0        0    100.00    9.52    9.52  47.28  1.85   I  FREE
   5 NONE    0.0    1566.0       50     99.90  162.85  162.85  47.28  1.85   F  FP
   6 NONE    0.0      82.1       50     99.90    9.49    9.49  47.28  1.85   Q  SIGFP
   7 BOTH      ?         ?    51373      0.00       ?       ? -999.00 0.00   G  F(+)
   8 BOTH      ?         ?    51373      0.00       ?       ? -999.00 0.00   L  SIGF(+)
   9 BOTH      ?         ?    51373      0.00       ?       ? -999.00 0.00   G  F(-)
  10 BOTH      ?         ?    51373      0.00       ?       ? -999.00 0.00   L  SIGF(-)
  11 BOTH      ?         ?    51373      0.00       ?       ? -999.00 0.00   D  DP
  12 BOTH      ?         ?    51373      0.00       ?       ? -999.00 0.00   Q  SIGDP
  13 BOTH      ?         ?    51373      0.00       ?       ? -999.00 0.00   K  I(+)
  14 BOTH      ?         ?    51373      0.00       ?       ? -999.00 0.00   M  SIGI(+)
  15 BOTH      ?         ?    51373      0.00       ?       ? -999.00 0.00   K  I(-)
  16 BOTH      ?         ?    51373      0.00       ?       ? -999.00 0.00   M  SIGI(-)



 No. of reflections used in FILE STATISTICS    51373



 LIST OF REFLECTIONS
 ===

  -47   1   1    0.00   0.00   0.00     ?   ?   ?   ?   ?   ?   ?   ?   ?   ?
  -47   1   2    0.00   0.00   0.00     ?   ?   ?   ?   ?   ?   ?   ?   ?   ?
  -47   1   3   17.00   0.00   0.00     ?   ?   ?   ?   ?   ?   ?   ?   ?   ?



--
=============
 Herbert J. Bernstein, Professor of Computer Science
   Dowling College, Kramer Science Center, KSC 121
Idle Hour Blvd, Oakdale, NY, 11769

 +1-631-244-3035
 [EMAIL PROTECTED]
=


Re: [ccp4bb] Why wwPDB and members are doing a poor job.

2007-09-19 Thread Herbert J. Bernstein
ions should be a clear indication of why we should NOT switch
 from PDB to mmCIF format for coordinate files. Instead, we should take
 this opportunity of wwPDB members abandoning the PDB format to take over
 management of the format ourselves. I was quite irate with them for
 going against our wishes on several features of the PDB format, like
 supporting the SegID. Instead, I think we should realize that "modern
 database" management goals are different from experimentalist goals, and
 that we should not rely on them to decide how our own data should be
 represented.

 I think that we should intentionally avoid mmCIF for coordinate files,
 and stick to the PDB format. The wwPDB has absolutely no policy for user
 involvement, and RCSB has clearly dropped the previously established
 PDB-format change policy. Their task was never to manage a public file
 format standard. This is an opportunity to turn the PDB file format into
 a public standard.

 I have started a PDB Format Wiki, running on my home computer, at
 http://pdb.homeunix.org. If it gains interest, I will see about moving
 it to a proper Internet host.

 Joe Krahn

 Miller, Mitchell D. wrote:

 Hi Boaz,
   We were informed by an RCSB annotator in April 2006 that the
 RCSB had suspended including REMARK 42 records in PDB files
 pending the review of the process by the wwPDB.

   In looking at the new annotation guidelines, it looks
 like the result of that review was to reject the REMARK 42
 record and the listing of additional validation items.
 See page 23 of the July 2007 "wwPDB Processing Procedures
 and Policies Document"
 http://www.wwpdb.org/documentation/wwPDB-A-20070712.pdf

 "REMARK 42 and use of other programs for validation Use of REMARK 
 42 is

 discontinued.

 If authors wish to indicate presubmission validation and other 
 programs used before
 deposition, the programs may be listed in a new remark, REMARK 40. 
 This remark will
 list the software name, authors and function of the program. 
 Results of the software will

 not be listed. Use of this remark is voluntary."

 It seems that the wwPDB only allows the inclusion of validation
 statistics output by the refinement program but not from additional
 validation programs. So for additional statistics to be included
 in the PDB header, they will either need to be implemented by the
 refinement package or the wwPDB annotators.

 Regards,
 Mitch


--
Paul Adams
Senior Scientist, Physical Biosciences Division, Lawrence Berkeley Lab
Adjunct Professor, Department of Bioengineering, U.C. Berkeley
Head, Berkeley Center for Structural Biology
Deputy Principal Investigator, Berkeley Structural Genomics Center

Building 64, Room 248
Tel: 510-486-4225, Fax: 510-486-5909
http://cci.lbl.gov/paul

Lawrence Berkeley Laboratory
1 Cyclotron Road
BLDG 64R0121
Berkeley, CA 94720, USA.
--



--
=
 Herbert J. Bernstein, Professor of Computer Science
   Dowling College, Kramer Science Center, KSC 121
Idle Hour Blvd, Oakdale, NY, 11769

 +1-631-244-3035
 [EMAIL PROTECTED]
=


Re: [ccp4bb] diffraction images images/jpeg2000

2007-08-21 Thread Herbert J. Bernstein
In order to support thumbnails, we will be adding JPEG
and JPEG 2000 support to CBFlib.  There would then be no
reason why one could not use JPEG 2000 for diffraction
images as well, but I am not certain anything would
be gained in practice for those images over what the
ccp4 J. P. Abrahams pack_c.c compression offers.  At the
DLS session of the most recent imgCIF workshop we
agreed to start a serious benchmarking project and will
include a careful comparison of timings of byte
offset versus ccp4 versus jpeg 2000 versus ... in
what we do.

  -- Herbert

=
 Herbert J. Bernstein, Professor of Computer Science
   Dowling College, Kramer Science Center, KSC 121
Idle Hour Blvd, Oakdale, NY, 11769

 +1-631-244-3035
 [EMAIL PROTECTED]
=

On Mon, 20 Aug 2007, Winter, G (Graeme) wrote:

> Hi,
>
> I looked at jpeg2000 as a compression for diffraction images for
> archiving purposes - it works well but is *SLOW*. It's designed with the
> idea in mind of compressing a single image, not the several hundred
> typical for our work. There is also no place to put the header.
>
> Bzip2 works pretty much as well and is standard, but again slow. This is
> what people mostly seem to use for putting diffraction images on the
> web, particularly the JCSG.
>
> The ccp4 "pack" format which has been around for a very long time works
> very well and is jolly quick, and is supported in a number of data
> processing packages natively (Mosflm, XDS). Likewise there is a new
> compression being used for the Pilatus detector which is quicker again.
> These two have the advantage of being designed for diffraction images
> and with speed in mind.
>
> So there are plenty of good compression schemes out there - and if you
> use CBF these can be supported natively in the image standard... So you
> don't even need to know or care...
>
> Just my 2c on this one.
>
> Cheers,
>
> Graeme
>
> -Original Message-
> From: CCP4 bulletin board [mailto:[EMAIL PROTECTED] On Behalf Of
> Maneesh Yadav
> Sent: 18 August 2007 00:02
> To: CCP4BB@JISCMAIL.AC.UK
> Subject: [ccp4bb] diffraction images images/jpeg2000
>
> FWIW, I don't agree with storing image data, I don't think they justify
> the cost of storage even remotely (some people debate the value of the
> structures themselves)...but if you want to do it anyway, maybe we
> should use a format like jpeg2000.
>
> Last time I checked, none of the major image processing suites used it,
> but it is a very impressive and mature format that (I think) would be
> suitable for diffraction images.  If anyone is up for experimenting, you
> can get a nice suite of tools from kakadu (just google kakadu +
> jpeg2000).
>
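
On the benchmarking project mentioned above, a hedged sketch of the shape
such a timing harness takes; the frame size is an assumption, and the
memcpy stands in for whichever compress/expand pair (byte offset, CCP4
pack, JPEG 2000) is actually under test:

   #include <stdio.h>
   #include <stdlib.h>
   #include <string.h>
   #include <time.h>

   #define NPIX (3072L * 3072L) /* one binned Q315-sized frame (assumption) */

   /* Stand-in for the codec under test: a real run would compress and
      then expand the frame here. */
   static void codec_roundtrip(int *dst, const int *src)
   {
       memcpy(dst, src, NPIX * sizeof *src);
   }

   int main(void)
   {
       int *img = calloc(NPIX, sizeof *img);
       int *buf = calloc(NPIX, sizeof *buf);
       if (!img || !buf) return 1;
       clock_t t0 = clock();
       for (int frame = 0; frame < 100; frame++)
           codec_roundtrip(buf, img);
       double s = (double)(clock() - t0) / CLOCKS_PER_SEC;
       printf("%.2f ms of CPU per frame\n", 1000.0 * s / 100.0);
       free(img);
       free(buf);
       return 0;
   }

Per-frame milliseconds only matter in aggregate: a few hundred
milliseconds per frame across a several-hundred-image dataset is minutes
of wall time.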


Re: [ccp4bb] PDB format survey?

2007-08-11 Thread Herbert J. Bernstein
Dear Warren,

  In general I agree with the idea of minimal
intervention and maximal upwards compatibility,
but that approach is producing more and more
dialect variants on the PDB format where different
people (including the BNL PDB and now the RCSB
PDB) try one bandaid after another.  We now have
a format in which software can get very confused
about:

  atom serial numbers -- are they unique -- one
distinct identifier for each atom -- or are they
reused as the PDB has now done for several years
in large NMR entries

  atom names -- do they follow the 1992 naming
conventions that to some extent carried the
element type, or do they try to come closer
to IUPAC naming and rely on other columns
for the element name

  charges -- are they there, as they are for
a few entries, or are they ignored

  bonds -- are the old salt-bridge and hydrogen
bond conventions followed or are they not
followed

  text -- is it all upper case or are the type-setting
conventions used

  segment ids -- are they used or aren't they

  etc., etc.

I believe that the compounded effect of good intentions
and scientifically sound tinkering with the PDB
format has increasingly produced bad results and
scientific errors.

It is time to stop, take a deep breath and specify
and design a workable replacement for the current
legacy format that, while it respects the needs
of existing software, avoids the trap in which
we now find ourselves of having data sets from
multiple sources that cannot be unambiguously
parsed without advice from the delphic oracle.

  Regards,
Herbert J. Bernstein




=====
 Herbert J. Bernstein, Professor of Computer Science
   Dowling College, Kramer Science Center, KSC 121
Idle Hour Blvd, Oakdale, NY, 11769

 +1-631-244-3035
 [EMAIL PROTECTED]
=

On Fri, 10 Aug 2007, Warren DeLano wrote:

>
> > Actually, everything proposed [will] break some software.
>
> Breaking some is far better than all, and there is a very important
> principle being followed  here:  The notion behind these minimalist
> improvements is that they will only break in cases where the PDB format
> itself breaks.
>
> For example, right-justifying two-letter chain IDs over columns 21-22
> means that only files that employ two-character chain IDs will break --
> otherwise you get a normal-looking one-letter chain ID in column 22.
> Likewise, the hybrid36 convention produces an otherwise normal PDB
> except when the atom count limit is exceeded.  And, with the CONECT
> valence convention, if there is no valence information defined for
> HETATM ligands, then the CONECT output will be single bonds as is the
> case with standard PDB files.  It is only when you have ligand valences
> defined that the output vill vary from standard PDB conventions.
>
> In other words, the file format variances only occur when they are
> needed, and that is why the proposed approaches are preferable over
> something like wide-PDB.
>
> Cheers,
> Warren
>
> > -Original Message-
> > From: CCP4 bulletin board [mailto:[EMAIL PROTECTED] On
> > Behalf Of Herbert J. Bernstein
> > Sent: Friday, August 10, 2007 8:53 AM
> > To: CCP4BB@JISCMAIL.AC.UK
> > Subject: Re: [ccp4bb] PDB format survey?
> >
> > Actually, everything proposed will break some software.
> > The real question is one of how much value the community
> > gains from how much of a change.  mmCIF was one proposal that
> > would "solve" the problem, but which met a lot of resistance.
> >  The change in atom serial numbers to strings is another
> > possibility.  If you want something in between that stretches
> > the line, but preserves the programming style, take a look at:
> >
> >   http://biomol.dowling.edu/WPDB/
> >
> > that extends the line and handles 999,999,999 atoms and 10
> > character chain names.
> >
> >   I apologize for the server that provides sample runs from
> > the page being down.  We had a couple of bad power failures,
> > and that machine is not back in service yet, but the spec is
> > available.
> >
> >   Regards,
> > Herbert J. Bernstein
> > =
> >  Herbert J. Bernstein, Professor of Computer Science
> >Dowling College, Kramer Science Center, KSC 121
> > Idle Hour Blvd, Oakdale, NY, 11769
> >
> >  +1-631-244-3035
> >  [EMAIL PROTECTED]
> > =
> >
> > On Fri, 10 Aug 2007, Warren DeLano wrote:
> >
> > > That's easy:  Backward compatibility, both in terms of old programs
>

Re: [ccp4bb] PDB format survey?

2007-08-10 Thread Herbert J. Bernstein
If you want to give it a try, I switched the
buttons to an alternate server.  It is slower
than the one we were using, but it will
give you the idea.  Try

http://biomol.dowling.edu/WPDB/

put in a PDB id code in the box near the top
of the page and click on "CLICK HERE for WPDB version ..."

The code changes for graphics programs are fairly
simple, since most things are in the same place
in very similar formats, just with more of a
range.

-- HJB

=====
 Herbert J. Bernstein, Professor of Computer Science
   Dowling College, Kramer Science Center, KSC 121
Idle Hour Blvd, Oakdale, NY, 11769

 +1-631-244-3035
 [EMAIL PROTECTED]
=

On Fri, 10 Aug 2007, Herbert J. Bernstein wrote:

> Actually, everything proposed will break some software.
> The real question is one of how much value the
> community gains from how much of a change.  mmCIF
> was one proposal that would "solve" the problem,
> but which met a lot of resistance.  The change
> in atom serial numbers to strings is another
> possibility.  If you want something in between
> that stretches the line, but preserves the
> programming style, take a look at:
>
>   http://biomol.dowling.edu/WPDB/
>
> that extends the line and handles 999,999,999 atoms
> and 10 character chain names.
>
>   I apologize for the server that provides sample
> runs from the page being down.  We had a couple
> of bad power failures, and that machine is
> not back in service yet, but the spec is
> available.
>
>   Regards,
>     Herbert J. Bernstein
> =
>  Herbert J. Bernstein, Professor of Computer Science
>Dowling College, Kramer Science Center, KSC 121
> Idle Hour Blvd, Oakdale, NY, 11769
>
>  +1-631-244-3035
>  [EMAIL PROTECTED]
> =
>
> On Fri, 10 Aug 2007, Warren DeLano wrote:
>
> > That's easy:  Backward compatibility, both in terms of old programs and
> > old data.
> >
> > The idea is to maintain as much interoperability as possible.
> >
> > > -----Original Message-----
> > > From: CCP4 bulletin board [mailto:[EMAIL PROTECTED] On
> > > Behalf Of Santarsiero, Bernard D.
> > > Sent: Friday, August 10, 2007 8:17 AM
> > > To: CCP4BB@JISCMAIL.AC.UK
> > > Subject: [ccp4bb] PDB format survey?
> > >
> > > Can I ask a dumb question? Just curious...
> > >
> > > Why are we now limited to 80 "columns"? In the old days, that
> > > was a limit with Fortran and punched cards. Can a "record"
> > > (whatever it's called now) be as long as we wish? Instead of
> > > compressing a lot on a PDB record line, can we lengthen it to
> > > 130 columns?
> > >
> > >
> > > Bernie Santarsiero
> > >
> > >
> > > On Fri, August 10, 2007 10:10 am, Warren DeLano wrote:
> > > > Correction:  Scratch what I wrote -- the PDB format does now
> > > > support a formal charge field in columns 79-80 (1+,2+,1- etc.).
> > > > Hooray!
> > > >
> > > > Thus, adoption of the CONECT valency convention is all it would
> > > > take for us to be able to convey chemically-defined structures
> > > > using the PDB format.
> > > >
> > > > I'll happily add two-letter chain IDs and hybrid36 to PyMOL but
> > > > would really, really like to see valences included as well --
> > > > widespread adoption of that simple convention would represent a
> > > > major practical advance for interoperability in structure-based
> > > > drug discovery.
> > > >
> > > > Cheers,
> > > > Warren
> > > >
> > > >
> > > >> -----Original Message-----
> > > >> From: CCP4 bulletin board [mailto:[EMAIL PROTECTED] On
> > > >> Behalf Of Warren DeLano
> > > >> Sent: Thursday, August 09, 2007 5:53 PM
> > > >> To: CCP4BB@JISCMAIL.AC.UK
> > > >> Subject: Re: [ccp4bb] PDB format survey?
> > > >>
> > > >> Joe,
> > > >>
> > > >> I feel that atom serial numbers are particularly important, since
> > > >> they, combined with CONECT records, provide the only semi-standard
> > > >> convention I know of for reliably encoding bond valence
> > > >> information into a PDB file.
> 
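
For readers who have not met hybrid36: serial numbers up to 99999 stay
plain decimal, so existing files and programs are untouched; past that
the field continues in base 36, first through an upper-case block
starting at A0000 and then a lower-case one. A minimal Python sketch of
the width-5 encoding follows -- an illustration of the scheme, not the
reference implementation:

DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def base36(value, width):
    # Fixed-width base-36 rendering with upper-case digits.
    out = []
    for _ in range(width):
        value, r = divmod(value, 36)
        out.append(DIGITS[r])
    return "".join(reversed(out))

def hy36encode(value, width=5):
    if 0 <= value < 10 ** width:      # 0..99999: ordinary decimal
        return "%*d" % (width, value)
    value -= 10 ** width
    block = 26 * 36 ** (width - 1)    # size of each letter-led block
    if value < block:                 # upper-case block: A0000 onward
        return base36(value + 10 * 36 ** (width - 1), width)
    value -= block
    if value < block:                 # lower-case block doubles the range
        return base36(value + 10 * 36 ** (width - 1), width).lower()
    raise ValueError("value out of range for hybrid-36")

assert hy36encode(99999) == "99999"
assert hy36encode(100000) == "A0000"  # first serial past the decimal range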

Re: [ccp4bb] PDB format survey?

2007-08-10 Thread Herbert J. Bernstein
Actually, everything proposed will break some software.
The real question is one of how much value the
community gains from how much of a change.  mmCIF
was one proposal that would "solve" the problem,
but which met a lot of resistance.  The change
in atom serial numbers to strings is another
possibility.  If you want something in between
that stretches the line, but preserves the
programming style, take a look at:

  http://biomol.dowling.edu/WPDB/

that extends the line and handles 999,999,999 atoms
and 10 character chain names.

  I apologize that the server providing sample
runs from the page is down.  We had a couple
of bad power failures, and that machine is
not back in service yet, but the spec is
available.

  Regards,
    Herbert J. Bernstein
=========================================================
 Herbert J. Bernstein, Professor of Computer Science
   Dowling College, Kramer Science Center, KSC 121
        Idle Hour Blvd, Oakdale, NY, 11769

 +1-631-244-3035
 [EMAIL PROTECTED]
=========================================================

On Fri, 10 Aug 2007, Warren DeLano wrote:

> That's easy:  Backward compatibility, both in terms of old programs and
> old data.
>
> The idea is to maintain as much interoperability as possible.
>
> > -----Original Message-----
> > From: CCP4 bulletin board [mailto:[EMAIL PROTECTED] On
> > Behalf Of Santarsiero, Bernard D.
> > Sent: Friday, August 10, 2007 8:17 AM
> > To: CCP4BB@JISCMAIL.AC.UK
> > Subject: [ccp4bb] PDB format survey?
> >
> > Can I ask a dumb question? Just curious...
> >
> > Why are we now limited to 80 "columns"? In the old days, that
> > was a limit with Fortran and punched cards. Can a "record"
> > (whatever it's called now) be as long as we wish? Instead of
> > compressing a lot on a PDB record line, can we lengthen it to
> > 130 columns?
> >
> >
> > Bernie Santarsiero
> >
> >
> > On Fri, August 10, 2007 10:10 am, Warren DeLano wrote:
> > > Correction:  Scratch what I wrote -- the PDB format does now
> > > support a formal charge field in columns 79-80 (1+,2+,1- etc.).
> > > Hooray!
> > >
> > > Thus, adoption of the CONECT valency convention is all it would
> > > take for us to be able to convey chemically-defined structures
> > > using the PDB format.
> > >
> > > I'll happily add two-letter chain IDs and hybrid36 to PyMOL but
> > > would really, really like to see valences included as well --
> > > widespread adoption of that simple convention would represent a
> > > major practical advance for interoperability in structure-based
> > > drug discovery.
> > >
> > > Cheers,
> > > Warren
> > >
> > >
> > >> -----Original Message-----
> > >> From: CCP4 bulletin board [mailto:[EMAIL PROTECTED] On
> > >> Behalf Of Warren DeLano
> > >> Sent: Thursday, August 09, 2007 5:53 PM
> > >> To: CCP4BB@JISCMAIL.AC.UK
> > >> Subject: Re: [ccp4bb] PDB format survey?
> > >>
> > >> Joe,
> > >>
> > >> I feel that atom serial numbers are particularly important, since
> > >> they, combined with CONECT records, provide the only semi-standard
> > >> convention I know of for reliably encoding bond valence
> > >> information into a PDB file.
> > >>
> > >> single bond = bond listed once
> > >> double bond = bond listed twice
> > >> triple bond = bond listed thrice
> > >> aromatic bond = bond listed four times.
> > >>
> > >> This is a convention long supported by tools like MacroModel and
> > >> PyMOL.
> > >> For example, here is formaldehyde, where the bond between
> > >> atoms 1 and 3 is listed twice:
> > >>
> > >> HETATM    1  C01 C=O     1       0.000  -0.020   0.000  0.00  0.00           C
> > >> HETATM    2  N01 C=O     1       1.268  -0.765   0.000  0.00  0.00           N
> > >> HETATM    3  O02 C=O     1       0.000   1.188   0.000  0.00  0.00           O
> > >> HETATM    4  H01 C=O     1       1.260  -1.775   0.000  0.00  0.00           H
> > >> HETATM    5  H02 C=O     1       2.146  -0.266   0.000  0.00  0.00           H
> > >> HETATM    6  H03 C=O     1      -0.946  -0.562   0.000  0.00  0.00           H
> > >> CONECT    1    2
> > >> CONECT    1    3
> > >> CONECT    1    3
> > >> CONECT    1    6
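
Reading the convention back out is just as direct: count how many times
each serial pair is listed among the CONECT records and take that
multiplicity as the bond order (four meaning aromatic), and pick up the
formal charge from columns 79-80 where the file provides one. A minimal
Python sketch, written for this thread rather than taken from PyMOL or
MacroModel:

from collections import Counter

def parse_bonds_and_charges(pdb_lines):
    listed = Counter()   # (from_serial, to_serial) -> times listed
    charges = {}
    for line in pdb_lines:
        if line.startswith("CONECT"):
            i = int(line[6:11])
            # Up to four bonded serials: columns 12-16, 17-21, 22-26, 27-31.
            for col in (11, 16, 21, 26):
                field = line[col:col + 5].strip()
                if field:
                    listed[(i, int(field))] += 1
        elif line.startswith(("ATOM", "HETATM")):
            serial = int(line[6:11])
            charge = line[78:80].strip()   # columns 79-80, e.g. "2+", "1-"
            if charge:
                sign = -1 if charge.endswith("-") else 1
                charges[serial] = sign * int(charge[:-1])
    # A bond may be listed from one atom or from both; take the larger
    # per-direction count: 1 single, 2 double, 3 triple, 4 aromatic.
    bonds = {}
    for (i, j), n in listed.items():
        key = (min(i, j), max(i, j))
        bonds[key] = max(bonds.get(key, 0), n)
    return bonds, charges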

[ccp4bb] imgCIF workshops at BSR 2007

2007-08-02 Thread Herbert J. Bernstein

Dear Colleagues,

   The agenda for the imgCIF workshops in conjunction with
BSR 2007 in Manchester and at Diamond has been posted at:

http://www.medsbio.org/meetings/BSR_2007_imgCIF_Workshop.html

   We hope you can join us.

   The BSR 2007 meeting organizers have asked me to urge anyone who
is planning to attend either the Tuesday, 14 August 2007 imgCIF
workshop in Manchester or the Friday, 17 August 2007 imgCIF
workshop at Diamond to please register by sending an email
message to:

 [EMAIL PROTECTED]   <<<<<<<<<<

giving the following information:

   your name
   if you will be joining us for lunch on Tuesday, 14 August 2007
   if you will be joining us for lunch on Friday, 17 August 2007

   Please send this message no later than 12:00 GMT on Friday,
3 August 2007.

   Thank you for your cooperation.

   Regards,
 Herbert J. Bernstein


--
=========================================================
 Herbert J. Bernstein, Professor of Computer Science
   Dowling College, Kramer Science Center, KSC 121
        Idle Hour Blvd, Oakdale, NY, 11769

 +1-631-244-3035
 [EMAIL PROTECTED]
=========================================================


[ccp4bb] RasWin binary supporting PDB format V3 available

2007-07-15 Thread Herbert J. Bernstein

RasWin binary supporting PDB format V3 available.

wwPDB staff have contributed the changes for RasMol
to support PDB format V3.  A Windows binary, RasWin.exe,
based on RasMol 2.7.3.1, was built by Petko Kamburov
at Dowling College, and is available
on http://sourceforge.net/projects/openrasmol along
with the wwPDB source tarball.  Users on other platforms
can build binaries from the tarball.  We expect to
add additional precompiled binaries to the sourceforge
page in early August just after the ACA meeting.
A full merge of these mods into the mainline of
RasMol development to ensure cross compatibility
with older formats is in progress, but users of
current remediated PDB entries are encouraged to
use this new version now.


--
=========================================================
 Herbert J. Bernstein, Professor of Computer Science
   Dowling College, Kramer Science Center, KSC 121
        Idle Hour Blvd, Oakdale, NY, 11769

 +1-631-244-3035
 [EMAIL PROTECTED]
=========================================================


[ccp4bb] imgCIF workshops, 14 and 17 Aug 2007

2007-07-11 Thread Herbert J. Bernstein

The Management of Synchrotron Image Data:
Changes to the imgCIF dictionary and software, interaction with NeXus

Sponsored by DOE under grant ER64212-1027708-0011962, NSF under grant
DBI-0610407 and NIH under grant 1R13RR023192-01A1

You are cordially invited to a CBF/imgCIF workshop in two lunch
sessions at BSR 2007 in Manchester and at Diamond.
The first session will be held in Manchester on Tuesday, 14 August from
12:45 to 13:45.  It will provide an introduction to imgCIF and
NeXus and a brief review of current progress.  The room will be
announced on the meeting web site and on the MEDSBIO.org web site.
This lunch meeting is open, but advance registration would be
appreciated.

The second session will be at Diamond on Friday, 17 August at 12:30 to
discuss recent changes in the imgCIF dictionary and software and the
interaction with NeXus.  It will be held in room 1.17 of Diamond House
as a working lunch.  This lunch is open both to BSR attendees and others,
though places are limited and advance registration is required.  Lunch
will be provided.

There has been a great deal of progress in the past year.  There is
a lot to report and a lot to discuss.  If you work with raw
experimental data in structural biology, these workshops may
be of interest to you.

Thanks to funding from DOE, NSF and NIH we have some funds to help
with travel to these workshops.  If you need assistance, please
contact us.

For further information and to register please contact Herbert J.
Bernstein and Alun Ashton at [EMAIL PROTECTED]


--
=========================================================
 Herbert J. Bernstein, Professor of Computer Science
   Dowling College, Kramer Science Center, KSC 121
        Idle Hour Blvd, Oakdale, NY, 11769

 +1-631-244-3035
 [EMAIL PROTECTED]
=========================================================


Re: [ccp4bb] The CCP4 license is ambiguous

2007-07-06 Thread Herbert J. Bernstein

Dear Colleagues,

May I suggest that, at this point, we all need a clarification
of the licensing for the libraries in CCP4 (as distinct from
the licensing for the programs).  The community as a whole would
benefit from an unambiguous release of the current libraries (as
opposed to the next-to-current libraries) under the LGPL (as
opposed to the GPL or the full-blown CCP4 license).  I may be
missing something, but I cannot see who would be hurt by such
an action.  I think we all can see the benefit.

  Regards,
Herbert



At 12:04 PM +1000 7/6/07, Tim Grune wrote:

To me it seems that clause 2.1.1 of the CCP4 academic license says that one
can distribute work derived from or using the CCP4 libraries provided that it
complies with clause 2.1.2.
The last sentence in clause 2.1.2 says it itself becomes void if the derived
work is distributed under the GPL or LGPL. Doesn't that mean that the CCP4
academic license does NOT impose any restrictions at all for work distributed
under the LGPL or GPL, because anything that might restrict it is void?

Tim


On Wednesday 04 July 2007 18:49, Kevin Cowtan wrote:

 I was speaking imprecisely. I will try again.

 You cannot create a derived work containing both CCP4 6.* licensed code
 and GPL'd code, and distribute the resulting program, since the GPL
 demands that the derived work be distributed without additional
 restrictions and the CCP4 6.* license imposes additional restrictions on
 redistribution - in particular (but not limited to) an indemnity clause.

 Ethan A Merritt wrote:
 > On Tuesday 03 July 2007 06:55, Kevin Cowtan wrote:
 >> I'm afraid there is no ambiguity. You can't use the CCP4 version 6.*
 >> libraries in GPL software.
 >
 > This sounds strange to me.
 > The question is usually raised in the other direction - whether GPL
 > libraries can be used by a non-GPL program [*].
 >
 > Here you are saying that a GPL program cannot use non-GPL libraries.
 > I believe this is false.  To take an obvious example, consider GPL
 > software running on Windows and calling into the system libraries.
 > Do you think that Cygwin has been in violation of the GPL all these
 > years?
 >
 > Or perhaps I misunderstand.  Are you saying that the current CCP4
 > license does not permit combination with non-CCP4 code?


--
Tim Grune
Australian Synchrotron
800 Blackburn Road
Clayton, VIC 3168
Australia




--
=========================================================
 Herbert J. Bernstein, Professor of Computer Science
   Dowling College, Kramer Science Center, KSC 121
        Idle Hour Blvd, Oakdale, NY, 11769

 +1-631-244-3035
 [EMAIL PROTECTED]
=========================================================


Re: [ccp4bb] The CCP4 license is ambiguous

2007-07-03 Thread Herbert J. Bernstein

While it is correct that one should not try to mix GPL code with
commercial code, the use of LGPL APIs to support commercial code
(as well as to support open source code under other licenses) is
in general viable, and not likely to get you tangled with lawyers.
  -- Herbert



At 1:28 PM -0700 7/3/07, Warren DeLano wrote:

 > Behalf Of Ethan Merritt


 What you cannot do is mix GPL and non-GPL code within a single
 program. This sounds clear until the lawyers start arguing
 about what is or is not a single program [*]. At this point
 opinions and arguments and legal precedents diverge.


[*] "Ay, there's the rub", which is why I am such a big fan of
non-GPL/non-viral open-source licenses, and especially so nowadays given
that the lines separating programs, processes, threads,
remote-procedure-calls, and even acts of redistribution are disappearing
in modern systems (e.g. AJAX/Web2.0).

Commercial reliance upon usage and deployment of mixed solutions
involving both GPL and non-GPL-compatible code is ill-advised unless you
have time and resources to spend on lawyers.  Doing so exposes oneself
to all sorts of legal ambiguities arising out of diverging opinions and
interpretations.  I'm not saying it can't be done legally, just that you
had better be prepared to defend your actions if you choose to take
such risks.


Academic efforts are less likely to be sued outright, but, in my view,
when sharing both open-source code and actual products, it is best to go
either all GPL-like/viral (e.g. GROMACS), all BSD-like/unrestricted
(e.g. PyMOL), or all original-code under your own license.

DISCLAIMER: these are just my opinions, IANAL.

Cheers,
Warren


 -----Original Message-----
 From: CCP4 bulletin board [mailto:[EMAIL PROTECTED] On
 Behalf Of Ethan Merritt
 Sent: Tuesday, July 03, 2007 12:47 PM
 To: CCP4BB@JISCMAIL.AC.UK
 Subject: Re: [ccp4bb] The CCP4 license is ambiguous

 On Tuesday 03 July 2007 12:09, Michel Fodje wrote:
 > On Tue, 2007-07-03 at 10:54 -0700, Ethan Merritt wrote:
 > >
 > > They do have the same rights.  They can use it, modify it, and
 > > redistribute it.  They may or may not be permitted to distribute
 > > 3rd party libraries with it, but that was true of the original
 > > distributor also.
 >

 > The specific rights that must be transferred with the software are:
 > 1 - The freedom to run the program, for any purpose (freedom 0)
 > 2 - The freedom to study how the program works, and adapt it to your
 >     needs (freedom 1). Access to the source code is a precondition
 >     for this.
 > 3 - The freedom to redistribute copies so you can help your neighbor
 >     (freedom 2).
 > 4 - The freedom to improve the program, and release your improvements
 >     to the public, so that the whole community benefits (freedom 3).
 >     Access to the source code is a precondition for this.

 Yes. That is a more complete statement of rights under the GPL.
 Please note, however, that "the source code" to which you are
 guaranteed access is the source code to the GPL-ed program
 itself, not to pieces of the operating environment it runs in.

 > If you distribute software that, in whole or in part, does not convey
 > all those freedoms, it is a violation of the GPL if you use GPL code
 > in it.

 This is an overstatement, or could be mis-read as an overstatement.
 You can distribute a mixture of GPL and non-GPL code together.
 Any random linux distribution is an example of this.  What
 you cannot do is mix GPL and non-GPL code within a single
 program. This sounds clear until the lawyers start arguing
 about what is or is not a single program [*]. At this point
 opinions and arguments and legal precedents diverge. The
 divergence in opinion is particularly notable with regard to
 libraries.

 >

Ethan

 [*] Please note that "single program" is my own imprecise
 term, not a specific legal wording that is under dispute.

 --
 Ethan A Merritt







--
=========================================================
 Herbert J. Bernstein, Professor of Computer Science
   Dowling College, Kramer Science Center, KSC 121
        Idle Hour Blvd, Oakdale, NY, 11769

 +1-631-244-3035
 [EMAIL PROTECTED]
=========================================================


[ccp4bb] Agenda for BNL imgCIF workshop

2007-05-13 Thread Herbert J. Bernstein

The agenda for the Synchrotron Image-Data Format Workshop
on Thursday, 24 May 2007 at Brookhaven National Laboratory has been
posted at:

http://www.medsbio.org/meetings/BNL_May07_imgCIF_Workshop.html

There will be a light breakfast at 8:30, and the talks will
start at 9 am.  Lunch will be provided.  Space is limited so ...

Important: Even if you are already registered for the CFN/NSLS
meeting or are a local BNL person, it is important to contact us if
you wish to attend so we can be sure to have enough chairs and food.
Please send email to Herbert J. Bernstein and Bob Sweet at
[EMAIL PROTECTED] no later than 17 May 2007.
--
=========================================================
 Herbert J. Bernstein, Professor of Computer Science
   Dowling College, Kramer Science Center, KSC 121
        Idle Hour Blvd, Oakdale, NY, 11769

 +1-631-244-3035
 [EMAIL PROTECTED]
=========================================================


[ccp4bb] imgCIF workshop (new series) at BSR 2007

2007-04-16 Thread Herbert J. Bernstein

Third imgCIF workshop (new series) at BSR 2007 in Manchester and at Diamond:

The Management of Synchrotron Image Data:
Changes to the imgCIF dictionary and software, interaction with NeXus

Sponsored by DOE under grant ER64212-1027708-0011962, NSF under grant
DBI-0610407.

You are cordially invited to a CBF/imgCIF workshop in two lunch
sessions at BSR 2007 in Manchester and at Diamond. There will be a
working lunch on 17 August as a breakout session to the BSR2007
meeting during a visit to Diamond Light Source. The meeting will be
held at 12:30 in room 1.17 of Diamond House. The lunch is open to
those not attending the main BSR2007 meeting, though places are
limited. The major topics discussed at the Diamond session will be
recent changes in the imgCIF dictionary and software and the
interaction with NeXus. For those who cannot make it to the Diamond
session or who want to get an introduction to the subject, there will
also be a working lunch during the BSR meeting in Manchester. The
time and room will be announced on the MEDSBIO.org web site and at
the meeting. For further information and to register please contact
Herbert J. Bernstein and Alun Ashton at [EMAIL PROTECTED]

--
=========================================================
 Herbert J. Bernstein, Professor of Computer Science
   Dowling College, Kramer Science Center, KSC 121
        Idle Hour Blvd, Oakdale, NY, 11769

 +1-631-244-3035
 [EMAIL PROTECTED]
=========================================================


[ccp4bb] imgCIF workshop at BNL on 24 May 2007

2007-03-13 Thread Herbert J. Bernstein

Second imgCIF workshop (new series) at BNL after NSLS/CFN meeting:

Synchrotron Image-Data Format Workshop

Herbert J. Bernstein, [EMAIL PROTECTED]
Robert M. Sweet, [EMAIL PROTECTED]
Sponsored by DOE under grant ER64212-1027708-0011962, NSF under grant
DBI-0610407. NIH support pending.

There will be a workshop on data formats for synchrotron image data
after the NSLS/CFN meeting on 24 May 2007 at BNL in the Biology Dept
Conference Room, Bldg 463, starting at 9 am. Topics to be discussed
include proposed extensions to imgCIF, the use of NeXus, progress on
software and the status of imgCIF at Diamond and at SLS. Space is
limited, so please contact Herbert J. Bernstein and Bob Sweet at
[EMAIL PROTECTED] to reserve a place.

 * Review of imgCIF and CBFlib
 * Proposed extensions to the imgCIF dictionary
 * Status of imgCIF adoption at SLS, Diamond, ...
 * Future directions
 * Discussion