Hello CDK and Blue-Obelisk People,

Egon mentioned that the reaction-balancer discussed a few weeks ago on
the CDK-user list might be useful to the Blue Obelisk Green Chain
Reaction initiative.

Attached is the code formed in that discussion
authors: Gilleain Torrance, wateriestfire, Jules Kerssemakers

The original discussion can be found in the CDK mail-archives here:
http://sourceforge.net/mailarchive/forum.php?thread_name=AANLkTik7z0WBgfj6_c4amHleL6VRTEOL2cZJQbZK-Kfn%40mail.gmail.com&forum_name=cdk-user

Kind regards, and hope it helps!
Jules Kerssemakers

On 8 August 2010 15:13, Egon Willighagen <[email protected]> wrote:
> I think there recently was a discussion on reaction balancing... did
> not have time to tune in on that yet... perhaps someone can reply to
> the Blue Obelisk mailing list?
>
> E.
>
>
> ---------- Forwarded message ----------
> From: Peter Murray-Rust <[email protected]>
> Date: Sun, Aug 8, 2010 at 2:07 PM
> Subject: [BlueObelisk-discuss] The Green Chain Reaction; can the Blue
> Obelisk help?
> To: BlueObelisk-Discuss <[email protected]>
>
>
> As you can see from http://wwmm.ch.cam.ac.uk/blogs/murrayrust/?p=2526
> we are starting an ambitious communal event in datadriven science. It
> will be presented live at the British Library for Science Online 2010
> (nature, Mendely, BL sponsors)  with a very high profile (e.g. getting
> widely reported, involving a world community) Jean-Claude and MatTodd
> will be taking part. I would love help from more BlueObelisk readers
> and anyone else who might be interested.
>
> The question we want to answer is “Are chemical reactions becoming
> greener?” by using text-mining and other information extraction from
> the Open Chemical literature.
>
> The Blue Obelisk and the chemical blogosphere can help.
> Examples of help are:
> * blog the event
> * project page/wiki
> * list of solvents (do we have this?)
> * list of reagents
> * software to balance reactions from products and reactants and
> reagents. How much of this does CDK do?
> * software to deduce reaction type from products and reactants and
> reagents. How much of this does CDK do?
> * sources or Open reactions. The most likely are personal papers,
> theses, etc. data mining behind publishers' firewalls is forbidden
> * requests to publishers to make their experiemental sections fully Open.
>
> and of course your ideas on what else we can do...
>   Rich, do you have Open reactions?
>   Can we scrape reactions from Wikipedia?
>   Can we get solvents , reagents from Wikipedia?
>  .. etc.
>
>
> P.
>
>
> --
> Peter Murray-Rust
> Reader in Molecular Informatics
> Unilever Centre, Dep. Of Chemistry
> University of Cambridge
> CB2 1EW, UK
> +44-1223-763069
>
> ------------------------------------------------------------------------------
> This SF.net email is sponsored by
>
> Make an app they can't live without
> Enter the BlackBerry Developer Challenge
> http://p.sf.net/sfu/RIM-dev2dev
> _______________________________________________
> Blueobelisk-discuss mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/blueobelisk-discuss
>
>
>
>
> --
> Post-doc @ Uppsala University
> Proteochemometrics / Bioclipse Group of Prof. Jarl Wikberg
> Homepage: http://egonw.github.com/
> Blog: http://chem-bla-ics.blogspot.com/
> PubList: http://www.citeulike.org/user/egonw/tag/papers
>
> ------------------------------------------------------------------------------
> This SF.net email is sponsored by
>
> Make an app they can't live without
> Enter the BlackBerry Developer Challenge
> http://p.sf.net/sfu/RIM-dev2dev
> _______________________________________________
> Cdk-user mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/cdk-user
>
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Set;

import org.openscience.cdk.Atom;
import org.openscience.cdk.Molecule;
import org.openscience.cdk.Reaction;
import org.openscience.cdk.exception.CDKException;
import org.openscience.cdk.interfaces.IBond;
import org.openscience.cdk.interfaces.IElement;
import org.openscience.cdk.interfaces.IMolecularFormula;
import org.openscience.cdk.interfaces.IMolecule;
import org.openscience.cdk.interfaces.IMoleculeSet;
import org.openscience.cdk.isomorphism.UniversalIsomorphismTester;
import org.openscience.cdk.tools.manipulator.AtomContainerManipulator;
import org.openscience.cdk.tools.manipulator.AtomContainerSetManipulator;
import org.openscience.cdk.tools.manipulator.MolecularFormulaManipulator;

/**
* The ReactionBalancer tries to stoichiometrically balance a reaction. It currently uses
* water for balancing oxygen, protons for balancing charges and H2 for balancing
* remaining hydrogens. If any other elements have to be balanced, permutations of stoichiometric
* coefficients up to five are tested.
*
* <p><b>Warning</b>: If the reaction cannot be balanced, the method balance returns false.
* Nevertheless the original reaction is modified. Use a copy of the reaction to
* avoid this behaviour.
*
* <code><pre>
* Reaction reaction;
* ReactionBalancer rb = new ReactionBalancer();
* boolean balanced;
* if(!rb.isBalanced(reaction)) balanced = rb.balance(reaction);
* </pre></code>
*
* @author Kai Hartmann
* @author Bart Geurten
* @cdk.created 2004-02-20
* @cdk.module reaction
*/
public class ReactionBalancer {

    private static Molecule water = new Molecule();
    private static Molecule hydrogen = new Molecule();
    private static Molecule proton = new Molecule();
    private Reaction reaction = null;
    private Map<String, Double> diff = new HashMap<String, Double>();


    /**
* Constructor for the ReactionBalancer object
*/
    public ReactionBalancer() {
        if (water.getAtomCount() == 0) {
            water.addAtom(new Atom("H"));
            water.addAtom(new Atom("H"));
            water.addAtom(new Atom("O"));
            water.addBond(0, 2, IBond.Order.SINGLE);
            water.addBond(1, 2, IBond.Order.SINGLE);
            water.setProperty("name", "H2O");
            hydrogen.addAtom(new Atom("H"));
            hydrogen.addAtom(new Atom("H"));
            hydrogen.addBond(0, 1, IBond.Order.SINGLE);
            hydrogen.setProperty("name", "H2");
            proton.addAtom(new Atom("H"));
            proton.getAtom(0).setFormalCharge(1);
            proton.setProperty("name", "H+");
        }
    }
    
    private Map<String, Double> getFormulaHash(IMolecule molecule) {
        IMolecularFormula formula =
            MolecularFormulaManipulator.getMolecularFormula(molecule);
        List<IElement> elements = MolecularFormulaManipulator.elements(formula);
        Map<String, Double> map = new HashMap<String, Double>();
        for (int i = 0; i < elements.size(); i++) {
            IElement element = elements.get(i);
            map.put(element.getSymbol(),
                 new Double(MolecularFormulaManipulator.getElementCount(
                         formula, element)));
        }
        return map;
    }


    /**
* Gets the cloned Reaction of the ReactionBalancer object
*
*...@return The Reaction
*/
    public Reaction getReaction() {
        return reaction;
    }


    /**
* Gets the Hashtable diff of the ReactionBalancer object. It contains the
* Atom symbols of the Atoms that are not balanced. Values greater zero mean
* there is an excess of this Element on the product side, values lower zero
* mean the opposite.
*
*...@return The Hashtable diff
*/
    public Map<String, Double> getDiffHashtable() {
        return diff;
    }


    /**
* Tests if the reaction of this object is balanced.
*
* @return true if reaction is balanced.
*/
    public boolean isBalanced(Reaction reaction) {
        if (reaction != this.reaction) {
            this.reaction = reaction;
            makeDiffHashtable();
        }
        if (AtomContainerSetManipulator.getTotalFormalCharge(reaction.getProducts())
                 - AtomContainerSetManipulator.getTotalFormalCharge(reaction.getReactants()) == 0
                 && diff.isEmpty()) {
            return true;
        }
        return false;
    }


    /**
* Try to balance this Reaction.
*
*...@return true if Reaction could be balanced.
*/
    public boolean balance(Reaction reaction) throws CDKException {

        if (reaction != this.reaction) {
            this.reaction = reaction;
            makeDiffHashtable();
        }
        
        double chargeDifference =
                AtomContainerSetManipulator.getTotalFormalCharge(reaction.getProducts())
                 - AtomContainerSetManipulator.getTotalFormalCharge(reaction.getReactants());

        // If reaction is already balanced, return true.
        if (diff.isEmpty() && chargeDifference == 0) {
            return true;
        }

        if (!containsOnlyOH(diff)) {
            if (!permutateStoichiometries(5)) {
                return false;
            }
        }

        addMoleculeToBalanceElement(water, "O");
        if (diff.isEmpty() && chargeDifference == 0) {
            return true;
        }
        
        balanceCharge(proton);

        if (diff.isEmpty()) {
            return true;
        }

        
        addMoleculeToBalanceElement(hydrogen, "H");
        if (diff.isEmpty()) {
            return true;
        }

        return false;
    }


    /**
* Construct the Hashtable diff for this Reaction
*
*/
    protected void makeDiffHashtable() {

        addMoleculeHash(reaction.getReactants(), -1, diff);
        addMoleculeHash(reaction.getProducts(), 1, diff);
        removeZeroEntries(diff);

        return;
    }


    /**
* Adds the FormulaHashtables of a SetOfMolecules to a Hashtable
*
*...@param som The SetOfMolecules to be added
*...@param side equals 1 for products, -1 for reactants
*...@param hash The feature to be added to the MoleculeHash attribute
*/
    protected void addMoleculeHash(
            IMoleculeSet som, int side, Map<String, Double> hash) {
        for (int i = 0; i < som.getAtomContainerCount(); i++) {
            Map<String, Double> molHash = getFormulaHash(som.getMolecule(i));
            for (String element : molHash.keySet()) {
                double elementCount = molHash.get(element);
                if (hash.containsKey(element)) {
                    double count = hash.get(element);
                    count += som.getMultiplier(i) * elementCount * side;
                    hash.put(element, new Double(count));
                } else {
                    hash.put(element,
                            new Double(
                                   som.getMultiplier(i) * elementCount * side));
                }
            }
        }
        return;
    }


    /**
* Tries to balance the formal charge using a charged Molecule.
*
*...@param mol The Molecule that should be used.
*/
    public void balanceCharge(Molecule mol) throws CDKException{
        int molCharge = AtomContainerManipulator.getTotalFormalCharge(mol);
        if (molCharge == 0) {
            return;
        }
        
        double chargeDifference =
                AtomContainerSetManipulator.getTotalFormalCharge(reaction.getProducts())
                 - AtomContainerSetManipulator.getTotalFormalCharge(reaction.getReactants());
        
        double molsToAdd = (chargeDifference) / ((double) molCharge);
        
        Map<String, Double> molHash = getFormulaHash(mol);

        if (chargeDifference == 0) {
            return;
        } else if (molsToAdd < 0) {
            balanceSetOfMolecules(reaction.getProducts(), -molsToAdd, mol);
            updateDiffHashtable(molsToAdd, molHash);
            return;
        } else if (molsToAdd > 0) {
            balanceSetOfMolecules(reaction.getReactants(), molsToAdd, mol);
            updateDiffHashtable(molsToAdd, molHash);
            return;
        }
        return;
    }


    /**
* Checks whether the Hashtable hash contains other Atoms than oxygen and
* hydrogen.
*
*...@param hash The hashtable to be tested
*...@return true if the diff Hashtable contains the elements oxygen and
* hydrogen only.
*/
    public boolean containsOnlyOH(Map<String, Double> hash) {
        for (String s : hash.keySet()) {
            if (!s.equals("H") && !s.equals("O")) {
                return false;
            }
        }
        return true;
    }


    /**
* Add a Molecule to the Reaction to balance a certain element.
*
*...@param mol The molecule to add
*...@param element The element that is to be balanced
*/
    protected void addMoleculeToBalanceElement(Molecule mol, String element)
              throws CDKException{

        if (!diff.containsKey(element)) {
            return;
        }

        // The Hashtable of the molecule used for balancing
        Map<String, Double> molHash = getFormulaHash(mol);

        // How many elements do we need (negative, if more educts than products)
        double elementCount = diff.get(element) / molHash.get(element);
        
        // get position of molecule in educts or products
        int edPos = getMoleculePosition(reaction.getReactants(), mol);
        int prodPos = getMoleculePosition(reaction.getProducts(), mol);
        
        // Add molecule to the correct side
        if (elementCount < 0) {
            // is true if more elements on educt side
            
            if (edPos != -1) {
                // is true if molecule is present on educt side
                
                double stoich = reaction.getReactants().getMultiplier(edPos);
                if (stoich < -elementCount) {
                    reaction.getReactants().setMultiplier(edPos, new Double(0));
                    elementCount += stoich;
                } else {
                    reaction.getReactants().setMultiplier(
                            edPos, stoich + elementCount);
                    elementCount = 0;
                }
                updateDiffHashtable(stoich, molHash);
            }
            if (elementCount < 0) {
                balanceSetOfMolecules(
                        reaction.getProducts(), -elementCount, mol, prodPos);
                updateDiffHashtable(elementCount, molHash);
            }

        } else if (elementCount > 0) {
            // is true if more elements on product side
            
            if (prodPos != -1) {
                // is true if molecule is present on product side
                
                double stoich = reaction.getProducts().getMultiplier(prodPos);
                if (stoich < elementCount) {
                    reaction.getProducts().setMultiplier(prodPos, new Double(0));
                    elementCount -= stoich;
                } else {
                    reaction.getProducts().setMultiplier(prodPos, stoich - elementCount);
                    elementCount = 0;
                }
                updateDiffHashtable(stoich, molHash);
            }
            if (elementCount > 0) {
                balanceSetOfMolecules(reaction.getReactants(), elementCount, mol, edPos);
                updateDiffHashtable(elementCount, molHash);
            }
        }
        
        return;
    }


    /**
* Add a number of Molecules to a SetOfMolecules
*
*...@param som The SetOfMolecules that the Molecules are to be added
*...@param elementCount The number of Molecules to be added
*...@param mol The Molecule to be added
*/
    public void balanceSetOfMolecules(IMoleculeSet som, double elementCount, Molecule mol) throws CDKException{
        int molPosition = getMoleculePosition(som, mol);
        balanceSetOfMolecules(som, elementCount, mol, molPosition);
    }

    /**
* Add a number of Molecules to a SetOfMolecules
*
*...@param som The SetOfMolecules that the Molecules are to be added
*...@param elementCount The number of Molecules to be added
*...@param mol The Molecule to be added
*...@param molPosition The position of Molecule mol in SetOfMolecules som
*/
    public void balanceSetOfMolecules(IMoleculeSet som, double elementCount, Molecule mol, int molPosition) {
        
        if (molPosition == -1) {
            som.addAtomContainer(mol, elementCount);
            return;
        } else {
            som.setMultiplier(molPosition, elementCount + som.getMultiplier(molPosition));
        }
    }

    
    
    /**
* Test whether a Molecule is already in this SetOfMolecules
*
*...@param som The SetOfMolecules to be tested in
*...@param mol The Molecule to be checked
*...@return The position in SetOfMolecules if found, -1 otherwise
*/
    public int getMoleculePosition(IMoleculeSet som, Molecule mol) throws CDKException {
        for (int i = 0; i < som.getAtomContainerCount(); i++) {
            if (som.getAtomContainer(i).getAtomCount() == mol.getAtomCount()) {
                if (mol.getBondCount() == 0 || som.getAtomContainer(i).getBondCount() == 0) {
                    if (mol.getAtom(0).getSymbol().equals(som.getAtomContainer(i).getAtom(0).getSymbol())) {
                        return i;
                    } else {
                        continue;
                    }
                }
                
                if (UniversalIsomorphismTester.isIsomorph(som.getAtomContainer(i), mol)) {
                    return i;
                }
            }
        }
        return -1;
    }


    /**
* Updates the diff Hashtable by another Hashtable
*
*...@param elementCount The number of occurences of the hash
*...@param hash The Hashtable to use for update
*/
    protected void updateDiffHashtable(
            double elementCount, Map<String, Double> hash) {
        for (String element : hash.keySet()) {
            double tempStoich = -elementCount * hash.get(element);
            if (diff.containsKey(element)) {
                tempStoich += diff.get(element);
            }
            diff.put(element, new Double(tempStoich));
        }
        removeZeroEntries(diff);
        return;
    }


    /**
* Remove entries from diff Hashtable whose values equal zero
*
* @param hash The Hashtable to be cleaned.
*/
    protected static void removeZeroEntries(Map<String, Double> hash) {
        for (Iterator<String> it = hash.keySet().iterator(); it.hasNext(); )
                if (hash.get(it.next())==0.0) {
                        it.remove();
                }
        return;
    }
    
    private class DoubleArrayPermutor
            extends Permutor implements Iterator<Double[]> {
        
        public Double[] doubleArray;

        public DoubleArrayPermutor(Double[] doubleArray) {
            super(doubleArray.length);
            this.doubleArray = doubleArray;
        }

        @Override
        public Double[] next() {
            Double[] permutation = new Double[doubleArray.length];
            int i = 0;
            for (int j : getNextPermutation()) {
                permutation[i] = doubleArray[j];
            }
            return permutation;
        }

        @Override
        public void remove() {
            // do nothing
        }
        
    }


    /**
* Permutate the stoichiometry
*
*...@param max The maximal value of stoichiometric coefficients
*...@return true if reaction could be balanced with respect to non-O and non-H Atoms
*/
    public boolean permutateStoichiometries(double max) {

        int noOfEducts = reaction.getReactantCount();
        int noOfProducts = reaction.getProductCount();
        int noOfMolecules = noOfEducts + noOfProducts;
        Double[] coefficients = new Double[noOfMolecules];

        for (int p = 0; p < noOfMolecules; ++p) {
            coefficients[p] = 1.0;
        }
        
        DoubleArrayPermutor permutor = new DoubleArrayPermutor(coefficients);

        int[] permutation = new int[noOfMolecules];
        while (coefficients[0] <= max) {
            while (permutor.hasNext()) {
                Double[] newCoefficients = permutor.next();

                for (int i = 0; i < permutation.length; ++i) {
                    newCoefficients[i] = coefficients[permutation[i]];
                }
                Double[] eductCoefficients = new Double[noOfEducts];
                Double[] productCoefficients = new Double[noOfProducts];
                for (int e = 0; e < noOfEducts; ++e) {
                    eductCoefficients[e] = newCoefficients[e];
                }
                for (int p = noOfEducts; p < noOfMolecules; ++p) {
                    productCoefficients[p - noOfEducts] = newCoefficients[p];
                }
                reaction.setProductCoefficients(productCoefficients);
                reaction.setReactantCoefficients(eductCoefficients);
                diff = new HashMap<String, Double>();
                makeDiffHashtable();
                if (containsOnlyOH(diff)) {
                    return true;
                }
            }
            Arrays.sort(coefficients);
            coefficients[0] += 1.0;
        }
        return false;
    }

}
import java.util.Random;

/**
* General permutation generator, that uses orderly generation by ranking and
* unranking. The basic idea is that all permutations of length N can be ordered
* (lexicographically) like:
* <pre>
* 0 [0, 1, 2]
* 1 [0, 2, 1]
* 2 [1, 0, 2]
* ...
* </pre>
* where the number to the left of each permutation is the <i>rank</i> - really
* just the index in this ordered list. The list is created on demand, by a
* process called <i>unranking</i> where the rank is converted to the
* permutation that appears at that point in the list.
*
* <p>The algorithms used are from the book "Combinatorial Generation :
* Algorithms, Generation, and Search" (or C.A.G.E.S.) by D.L. Kreher and D.R.
* Stinson</p>
*
* @author maclean
*
*/
public class Permutor {
    
    /**
* The current rank of the permutation to use
*/
    private int currentRank;
    
    /**
* The maximum rank possible, given the size
*/
    private int maxRank;
    
    /**
* The number of objects to permute
*/
    private int size;
    
    /**
* For accessing part of the permutation space
*/
    private Random random;
    
    /**
* Create a permutor that will generate permutations of numbers up to
* <code>size</code>.
*
* @param size the size of the permutations to generate
*/
    public Permutor(int size) {
        this.currentRank = 0;
        this.size = size;
        this.maxRank = this.calculateMaxRank();
        this.random = new Random();
    }
    
    public boolean hasNext() {
        return this.currentRank < this.maxRank;
    }
    
    /**
* Set the permutation to use, given its rank.
*
* @param rank the order of the permutation in the list
*/
    public void setRank(int rank) {
        this.currentRank = rank;
    }
    
    public int getRank() {
        return this.currentRank;
    }
    
    /**
* Set the currently used permutation.
*
* @param permutation the permutation to use, as an int array
*/
    public void setPermutation(int[] permutation) {
// this.currentRank = this.rankPermutationLexicographically(permutation);
        currentRank = rankPermutationLexicographically(permutation) - 1; // TMP
    }
    
    /**
* Randomly skip ahead in the list of permutations.
*
* @return a permutation in the range (current, N!)
*/
    public int[] getRandomNextPermutation() {
        int d = maxRank - currentRank;
        int r = this.random.nextInt(d);
        this.currentRank += Math.max(1, r);
        return this.getCurrentPermutation();
    }
    
    /**
* Get the next permutation in the list.
*
* @return the next permutation
*/
    public int[] getNextPermutation() {
        this.currentRank++;
        return this.getCurrentPermutation();
    }
    
    /**
* Get the permutation that is currently being used.
*
* @return the permutation as an int array
*/
    public int[] getCurrentPermutation() {
        return this.unrankPermutationLexicographically(currentRank, size);
    }
    
    /**
* Calculate the max possible rank for permutations of N numbers.
*
* @return the maximum number of permutations
*/
    public int calculateMaxRank() {
        return factorial(size) - 1;
    }
    
    // much much more efficient to pre-calculate this (or lazily calculate)
    // and store in an array, at the cost of memory.
    private int factorial(int i) {
        if (i > 0) {
            return i * factorial(i - 1);
        } else {
            return 1;
        }
    }
    
    /**
* Convert a permutation (in the form of an int array) into a 'rank' - which
* is just a single number that is the order of the permutation in a lexico-
* graphically ordered list.
*
* @param permutation the permutation to use
* @return the rank as a number
*/
    private int rankPermutationLexicographically(int[] permutation) {
        int rank = 0;
        int n = permutation.length;
        int[] counter = new int[n + 1];
        System.arraycopy(permutation, 0, counter, 1, n);
        for (int j = 1; j <= n; j++) {
            rank = rank + ((counter[j] - 1) * factorial(n - j));
            for (int i = j + 1; i < n; i++) {
                if (counter[i] > counter[j]) {
                    counter[i]--;
                }
            }
        }
        return rank;
    }
    
    /**
* Performs the opposite to the rank method, producing the permutation that
* has the order <code>rank</code> in the lexicographically ordered list.
*
* As an implementation note, the algorithm assumes that the permutation is
* in the form [1,...N] not the more usual [0,...N-1] for a list of size N.
* This is why there is the final step of 'shifting' the permutation. The
* shift also reduces the numbers by one to make them array indices.
*
* @param rank the order of the permutation to generate
* @param size the length/size of the permutation
* @return a permutation as an int array
*/
    private int[] unrankPermutationLexicographically(int rank, int size) {
        int[] permutation = new int[size + 1];
        permutation[size] = 1;
        for (int j = 1; j < size; j++) {
            int d = (rank % factorial(j + 1)) / factorial(j);
            rank = rank - d * factorial(j);
            permutation[size - j] = d + 1;
            for (int i = size - j + 1; i <= size; i++) {
                if (permutation[i] > d) {
                    permutation[i]++;
                }
            }
        }
        
        // convert an array of numbers like [1...n] to [0...n-1]
        int[] shiftedPermutation = new int[size];
        for (int i = 1; i < permutation.length; i++) {
            shiftedPermutation[i - 1] = permutation[i] - 1;
        }
        return shiftedPermutation;
    }

}
------------------------------------------------------------------------------
This SF.net email is sponsored by 

Make an app they can't live without
Enter the BlackBerry Developer Challenge
http://p.sf.net/sfu/RIM-dev2dev 
_______________________________________________
Cdk-user mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/cdk-user

Reply via email to