Hello CDK and Blue-Obelisk People,
Egon mentioned that the reaction-balancer discussed a few weeks ago on
the CDK-user list might be useful to the Blue Obelisk Green Chain
Reaction initiative.
Attached is the code formed in that discussion
authors: Gilleain Torrance, wateriestfire, Jules Kerssemakers
The original discussion can be found in the CDK mail-archives here:
http://sourceforge.net/mailarchive/forum.php?thread_name=AANLkTik7z0WBgfj6_c4amHleL6VRTEOL2cZJQbZK-Kfn%40mail.gmail.com&forum_name=cdk-user
Kind regards, and hope it helps!
Jules Kerssemakers
On 8 August 2010 15:13, Egon Willighagen <[email protected]> wrote:
> I think there recently was a discussion on reaction balancing... did
> not have time to tune in on that yet... perhaps someone can reply to
> the Blue Obelisk mailing list?
>
> E.
>
>
> ---------- Forwarded message ----------
> From: Peter Murray-Rust <[email protected]>
> Date: Sun, Aug 8, 2010 at 2:07 PM
> Subject: [BlueObelisk-discuss] The Green Chain Reaction; can the Blue
> Obelisk help?
> To: BlueObelisk-Discuss <[email protected]>
>
>
> As you can see from http://wwmm.ch.cam.ac.uk/blogs/murrayrust/?p=2526
> we are starting an ambitious communal event in datadriven science. It
> will be presented live at the British Library for Science Online 2010
> (nature, Mendely, BL sponsors) with a very high profile (e.g. getting
> widely reported, involving a world community) Jean-Claude and MatTodd
> will be taking part. I would love help from more BlueObelisk readers
> and anyone else who might be interested.
>
> The question we want to answer is “Are chemical reactions becoming
> greener?” by using text-mining and other information extraction from
> the Open Chemical literature.
>
> The Blue Obelisk and the chemical blogosphere can help.
> Examples of help are:
> * blog the event
> * project page/wiki
> * list of solvents (do we have this?)
> * list of reagents
> * software to balance reactions from products and reactants and
> reagents. How much of this does CDK do?
> * software to deduce reaction type from products and reactants and
> reagents. How much of this does CDK do?
> * sources or Open reactions. The most likely are personal papers,
> theses, etc. data mining behind publishers' firewalls is forbidden
> * requests to publishers to make their experiemental sections fully Open.
>
> and of course your ideas on what else we can do...
> Rich, do you have Open reactions?
> Can we scrape reactions from Wikipedia?
> Can we get solvents , reagents from Wikipedia?
> .. etc.
>
>
> P.
>
>
> --
> Peter Murray-Rust
> Reader in Molecular Informatics
> Unilever Centre, Dep. Of Chemistry
> University of Cambridge
> CB2 1EW, UK
> +44-1223-763069
>
> ------------------------------------------------------------------------------
> This SF.net email is sponsored by
>
> Make an app they can't live without
> Enter the BlackBerry Developer Challenge
> http://p.sf.net/sfu/RIM-dev2dev
> _______________________________________________
> Blueobelisk-discuss mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/blueobelisk-discuss
>
>
>
>
> --
> Post-doc @ Uppsala University
> Proteochemometrics / Bioclipse Group of Prof. Jarl Wikberg
> Homepage: http://egonw.github.com/
> Blog: http://chem-bla-ics.blogspot.com/
> PubList: http://www.citeulike.org/user/egonw/tag/papers
>
> ------------------------------------------------------------------------------
> This SF.net email is sponsored by
>
> Make an app they can't live without
> Enter the BlackBerry Developer Challenge
> http://p.sf.net/sfu/RIM-dev2dev
> _______________________________________________
> Cdk-user mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/cdk-user
>
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Set;
import org.openscience.cdk.Atom;
import org.openscience.cdk.Molecule;
import org.openscience.cdk.Reaction;
import org.openscience.cdk.exception.CDKException;
import org.openscience.cdk.interfaces.IBond;
import org.openscience.cdk.interfaces.IElement;
import org.openscience.cdk.interfaces.IMolecularFormula;
import org.openscience.cdk.interfaces.IMolecule;
import org.openscience.cdk.interfaces.IMoleculeSet;
import org.openscience.cdk.isomorphism.UniversalIsomorphismTester;
import org.openscience.cdk.tools.manipulator.AtomContainerManipulator;
import org.openscience.cdk.tools.manipulator.AtomContainerSetManipulator;
import org.openscience.cdk.tools.manipulator.MolecularFormulaManipulator;
/**
* The ReactionBalancer tries to stoichiometrically balance a reaction. It currently uses
* water for balancing oxygen, protons for balancing charges and H2 for balancing
* remaining hydrogens. If any other elements have to be balanced, permutations of stoichiometric
* coefficients up to five are tested.
*
* <p><b>Warning</b>: If the reaction cannot be balanced, the method balance returns false.
* Nevertheless the original reaction is modified. Use a copy of the reaction to
* avoid this behaviour.
*
* <code><pre>
* Reaction reaction;
* ReactionBalancer rb = new ReactionBalancer();
* boolean balanced;
* if(!rb.isBalanced(reaction)) balanced = rb.balance(reaction);
* </pre></code>
*
* @author Kai Hartmann
* @author Bart Geurten
* @cdk.created 2004-02-20
* @cdk.module reaction
*/
public class ReactionBalancer {
private static Molecule water = new Molecule();
private static Molecule hydrogen = new Molecule();
private static Molecule proton = new Molecule();
private Reaction reaction = null;
private Map<String, Double> diff = new HashMap<String, Double>();
/**
* Constructor for the ReactionBalancer object
*/
public ReactionBalancer() {
if (water.getAtomCount() == 0) {
water.addAtom(new Atom("H"));
water.addAtom(new Atom("H"));
water.addAtom(new Atom("O"));
water.addBond(0, 2, IBond.Order.SINGLE);
water.addBond(1, 2, IBond.Order.SINGLE);
water.setProperty("name", "H2O");
hydrogen.addAtom(new Atom("H"));
hydrogen.addAtom(new Atom("H"));
hydrogen.addBond(0, 1, IBond.Order.SINGLE);
hydrogen.setProperty("name", "H2");
proton.addAtom(new Atom("H"));
proton.getAtom(0).setFormalCharge(1);
proton.setProperty("name", "H+");
}
}
private Map<String, Double> getFormulaHash(IMolecule molecule) {
IMolecularFormula formula =
MolecularFormulaManipulator.getMolecularFormula(molecule);
List<IElement> elements = MolecularFormulaManipulator.elements(formula);
Map<String, Double> map = new HashMap<String, Double>();
for (int i = 0; i < elements.size(); i++) {
IElement element = elements.get(i);
map.put(element.getSymbol(),
new Double(MolecularFormulaManipulator.getElementCount(
formula, element)));
}
return map;
}
/**
* Gets the cloned Reaction of the ReactionBalancer object
*
*...@return The Reaction
*/
public Reaction getReaction() {
return reaction;
}
/**
* Gets the Hashtable diff of the ReactionBalancer object. It contains the
* Atom symbols of the Atoms that are not balanced. Values greater zero mean
* there is an excess of this Element on the product side, values lower zero
* mean the opposite.
*
*...@return The Hashtable diff
*/
public Map<String, Double> getDiffHashtable() {
return diff;
}
/**
* Tests if the reaction of this object is balanced.
*
* @return true if reaction is balanced.
*/
public boolean isBalanced(Reaction reaction) {
if (reaction != this.reaction) {
this.reaction = reaction;
makeDiffHashtable();
}
if (AtomContainerSetManipulator.getTotalFormalCharge(reaction.getProducts())
- AtomContainerSetManipulator.getTotalFormalCharge(reaction.getReactants()) == 0
&& diff.isEmpty()) {
return true;
}
return false;
}
/**
* Try to balance this Reaction.
*
*...@return true if Reaction could be balanced.
*/
public boolean balance(Reaction reaction) throws CDKException {
if (reaction != this.reaction) {
this.reaction = reaction;
makeDiffHashtable();
}
double chargeDifference =
AtomContainerSetManipulator.getTotalFormalCharge(reaction.getProducts())
- AtomContainerSetManipulator.getTotalFormalCharge(reaction.getReactants());
// If reaction is already balanced, return true.
if (diff.isEmpty() && chargeDifference == 0) {
return true;
}
if (!containsOnlyOH(diff)) {
if (!permutateStoichiometries(5)) {
return false;
}
}
addMoleculeToBalanceElement(water, "O");
if (diff.isEmpty() && chargeDifference == 0) {
return true;
}
balanceCharge(proton);
if (diff.isEmpty()) {
return true;
}
addMoleculeToBalanceElement(hydrogen, "H");
if (diff.isEmpty()) {
return true;
}
return false;
}
/**
* Construct the Hashtable diff for this Reaction
*
*/
protected void makeDiffHashtable() {
addMoleculeHash(reaction.getReactants(), -1, diff);
addMoleculeHash(reaction.getProducts(), 1, diff);
removeZeroEntries(diff);
return;
}
/**
* Adds the FormulaHashtables of a SetOfMolecules to a Hashtable
*
*...@param som The SetOfMolecules to be added
*...@param side equals 1 for products, -1 for reactants
*...@param hash The feature to be added to the MoleculeHash attribute
*/
protected void addMoleculeHash(
IMoleculeSet som, int side, Map<String, Double> hash) {
for (int i = 0; i < som.getAtomContainerCount(); i++) {
Map<String, Double> molHash = getFormulaHash(som.getMolecule(i));
for (String element : molHash.keySet()) {
double elementCount = molHash.get(element);
if (hash.containsKey(element)) {
double count = hash.get(element);
count += som.getMultiplier(i) * elementCount * side;
hash.put(element, new Double(count));
} else {
hash.put(element,
new Double(
som.getMultiplier(i) * elementCount * side));
}
}
}
return;
}
/**
* Tries to balance the formal charge using a charged Molecule.
*
*...@param mol The Molecule that should be used.
*/
public void balanceCharge(Molecule mol) throws CDKException{
int molCharge = AtomContainerManipulator.getTotalFormalCharge(mol);
if (molCharge == 0) {
return;
}
double chargeDifference =
AtomContainerSetManipulator.getTotalFormalCharge(reaction.getProducts())
- AtomContainerSetManipulator.getTotalFormalCharge(reaction.getReactants());
double molsToAdd = (chargeDifference) / ((double) molCharge);
Map<String, Double> molHash = getFormulaHash(mol);
if (chargeDifference == 0) {
return;
} else if (molsToAdd < 0) {
balanceSetOfMolecules(reaction.getProducts(), -molsToAdd, mol);
updateDiffHashtable(molsToAdd, molHash);
return;
} else if (molsToAdd > 0) {
balanceSetOfMolecules(reaction.getReactants(), molsToAdd, mol);
updateDiffHashtable(molsToAdd, molHash);
return;
}
return;
}
/**
* Checks whether the Hashtable hash contains other Atoms than oxygen and
* hydrogen.
*
*...@param hash The hashtable to be tested
*...@return true if the diff Hashtable contains the elements oxygen and
* hydrogen only.
*/
public boolean containsOnlyOH(Map<String, Double> hash) {
for (String s : hash.keySet()) {
if (!s.equals("H") && !s.equals("O")) {
return false;
}
}
return true;
}
/**
* Add a Molecule to the Reaction to balance a certain element.
*
*...@param mol The molecule to add
*...@param element The element that is to be balanced
*/
protected void addMoleculeToBalanceElement(Molecule mol, String element)
throws CDKException{
if (!diff.containsKey(element)) {
return;
}
// The Hashtable of the molecule used for balancing
Map<String, Double> molHash = getFormulaHash(mol);
// How many elements do we need (negative, if more educts than products)
double elementCount = diff.get(element) / molHash.get(element);
// get position of molecule in educts or products
int edPos = getMoleculePosition(reaction.getReactants(), mol);
int prodPos = getMoleculePosition(reaction.getProducts(), mol);
// Add molecule to the correct side
if (elementCount < 0) {
// is true if more elements on educt side
if (edPos != -1) {
// is true if molecule is present on educt side
double stoich = reaction.getReactants().getMultiplier(edPos);
if (stoich < -elementCount) {
reaction.getReactants().setMultiplier(edPos, new Double(0));
elementCount += stoich;
} else {
reaction.getReactants().setMultiplier(
edPos, stoich + elementCount);
elementCount = 0;
}
updateDiffHashtable(stoich, molHash);
}
if (elementCount < 0) {
balanceSetOfMolecules(
reaction.getProducts(), -elementCount, mol, prodPos);
updateDiffHashtable(elementCount, molHash);
}
} else if (elementCount > 0) {
// is true if more elements on product side
if (prodPos != -1) {
// is true if molecule is present on product side
double stoich = reaction.getProducts().getMultiplier(prodPos);
if (stoich < elementCount) {
reaction.getProducts().setMultiplier(prodPos, new Double(0));
elementCount -= stoich;
} else {
reaction.getProducts().setMultiplier(prodPos, stoich - elementCount);
elementCount = 0;
}
updateDiffHashtable(stoich, molHash);
}
if (elementCount > 0) {
balanceSetOfMolecules(reaction.getReactants(), elementCount, mol, edPos);
updateDiffHashtable(elementCount, molHash);
}
}
return;
}
/**
* Add a number of Molecules to a SetOfMolecules
*
*...@param som The SetOfMolecules that the Molecules are to be added
*...@param elementCount The number of Molecules to be added
*...@param mol The Molecule to be added
*/
public void balanceSetOfMolecules(IMoleculeSet som, double elementCount, Molecule mol) throws CDKException{
int molPosition = getMoleculePosition(som, mol);
balanceSetOfMolecules(som, elementCount, mol, molPosition);
}
/**
* Add a number of Molecules to a SetOfMolecules
*
*...@param som The SetOfMolecules that the Molecules are to be added
*...@param elementCount The number of Molecules to be added
*...@param mol The Molecule to be added
*...@param molPosition The position of Molecule mol in SetOfMolecules som
*/
public void balanceSetOfMolecules(IMoleculeSet som, double elementCount, Molecule mol, int molPosition) {
if (molPosition == -1) {
som.addAtomContainer(mol, elementCount);
return;
} else {
som.setMultiplier(molPosition, elementCount + som.getMultiplier(molPosition));
}
}
/**
* Test whether a Molecule is already in this SetOfMolecules
*
*...@param som The SetOfMolecules to be tested in
*...@param mol The Molecule to be checked
*...@return The position in SetOfMolecules if found, -1 otherwise
*/
public int getMoleculePosition(IMoleculeSet som, Molecule mol) throws CDKException {
for (int i = 0; i < som.getAtomContainerCount(); i++) {
if (som.getAtomContainer(i).getAtomCount() == mol.getAtomCount()) {
if (mol.getBondCount() == 0 || som.getAtomContainer(i).getBondCount() == 0) {
if (mol.getAtom(0).getSymbol().equals(som.getAtomContainer(i).getAtom(0).getSymbol())) {
return i;
} else {
continue;
}
}
if (UniversalIsomorphismTester.isIsomorph(som.getAtomContainer(i), mol)) {
return i;
}
}
}
return -1;
}
/**
* Updates the diff Hashtable by another Hashtable
*
*...@param elementCount The number of occurences of the hash
*...@param hash The Hashtable to use for update
*/
protected void updateDiffHashtable(
double elementCount, Map<String, Double> hash) {
for (String element : hash.keySet()) {
double tempStoich = -elementCount * hash.get(element);
if (diff.containsKey(element)) {
tempStoich += diff.get(element);
}
diff.put(element, new Double(tempStoich));
}
removeZeroEntries(diff);
return;
}
/**
* Remove entries from diff Hashtable whose values equal zero
*
* @param hash The Hashtable to be cleaned.
*/
protected static void removeZeroEntries(Map<String, Double> hash) {
for (Iterator<String> it = hash.keySet().iterator(); it.hasNext(); )
if (hash.get(it.next())==0.0) {
it.remove();
}
return;
}
private class DoubleArrayPermutor
extends Permutor implements Iterator<Double[]> {
public Double[] doubleArray;
public DoubleArrayPermutor(Double[] doubleArray) {
super(doubleArray.length);
this.doubleArray = doubleArray;
}
@Override
public Double[] next() {
Double[] permutation = new Double[doubleArray.length];
int i = 0;
for (int j : getNextPermutation()) {
permutation[i] = doubleArray[j];
}
return permutation;
}
@Override
public void remove() {
// do nothing
}
}
/**
* Permutate the stoichiometry
*
*...@param max The maximal value of stoichiometric coefficients
*...@return true if reaction could be balanced with respect to non-O and non-H Atoms
*/
public boolean permutateStoichiometries(double max) {
int noOfEducts = reaction.getReactantCount();
int noOfProducts = reaction.getProductCount();
int noOfMolecules = noOfEducts + noOfProducts;
Double[] coefficients = new Double[noOfMolecules];
for (int p = 0; p < noOfMolecules; ++p) {
coefficients[p] = 1.0;
}
DoubleArrayPermutor permutor = new DoubleArrayPermutor(coefficients);
int[] permutation = new int[noOfMolecules];
while (coefficients[0] <= max) {
while (permutor.hasNext()) {
Double[] newCoefficients = permutor.next();
for (int i = 0; i < permutation.length; ++i) {
newCoefficients[i] = coefficients[permutation[i]];
}
Double[] eductCoefficients = new Double[noOfEducts];
Double[] productCoefficients = new Double[noOfProducts];
for (int e = 0; e < noOfEducts; ++e) {
eductCoefficients[e] = newCoefficients[e];
}
for (int p = noOfEducts; p < noOfMolecules; ++p) {
productCoefficients[p - noOfEducts] = newCoefficients[p];
}
reaction.setProductCoefficients(productCoefficients);
reaction.setReactantCoefficients(eductCoefficients);
diff = new HashMap<String, Double>();
makeDiffHashtable();
if (containsOnlyOH(diff)) {
return true;
}
}
Arrays.sort(coefficients);
coefficients[0] += 1.0;
}
return false;
}
}
import java.util.Random;
/**
* General permutation generator, that uses orderly generation by ranking and
* unranking. The basic idea is that all permutations of length N can be ordered
* (lexicographically) like:
* <pre>
* 0 [0, 1, 2]
* 1 [0, 2, 1]
* 2 [1, 0, 2]
* ...
* </pre>
* where the number to the left of each permutation is the <i>rank</i> - really
* just the index in this ordered list. The list is created on demand, by a
* process called <i>unranking</i> where the rank is converted to the
* permutation that appears at that point in the list.
*
* <p>The algorithms used are from the book "Combinatorial Generation :
* Algorithms, Generation, and Search" (or C.A.G.E.S.) by D.L. Kreher and D.R.
* Stinson</p>
*
* @author maclean
*
*/
public class Permutor {
/**
* The current rank of the permutation to use
*/
private int currentRank;
/**
* The maximum rank possible, given the size
*/
private int maxRank;
/**
* The number of objects to permute
*/
private int size;
/**
* For accessing part of the permutation space
*/
private Random random;
/**
* Create a permutor that will generate permutations of numbers up to
* <code>size</code>.
*
* @param size the size of the permutations to generate
*/
public Permutor(int size) {
this.currentRank = 0;
this.size = size;
this.maxRank = this.calculateMaxRank();
this.random = new Random();
}
public boolean hasNext() {
return this.currentRank < this.maxRank;
}
/**
* Set the permutation to use, given its rank.
*
* @param rank the order of the permutation in the list
*/
public void setRank(int rank) {
this.currentRank = rank;
}
public int getRank() {
return this.currentRank;
}
/**
* Set the currently used permutation.
*
* @param permutation the permutation to use, as an int array
*/
public void setPermutation(int[] permutation) {
// this.currentRank = this.rankPermutationLexicographically(permutation);
currentRank = rankPermutationLexicographically(permutation) - 1; // TMP
}
/**
* Randomly skip ahead in the list of permutations.
*
* @return a permutation in the range (current, N!)
*/
public int[] getRandomNextPermutation() {
int d = maxRank - currentRank;
int r = this.random.nextInt(d);
this.currentRank += Math.max(1, r);
return this.getCurrentPermutation();
}
/**
* Get the next permutation in the list.
*
* @return the next permutation
*/
public int[] getNextPermutation() {
this.currentRank++;
return this.getCurrentPermutation();
}
/**
* Get the permutation that is currently being used.
*
* @return the permutation as an int array
*/
public int[] getCurrentPermutation() {
return this.unrankPermutationLexicographically(currentRank, size);
}
/**
* Calculate the max possible rank for permutations of N numbers.
*
* @return the maximum number of permutations
*/
public int calculateMaxRank() {
return factorial(size) - 1;
}
// much much more efficient to pre-calculate this (or lazily calculate)
// and store in an array, at the cost of memory.
private int factorial(int i) {
if (i > 0) {
return i * factorial(i - 1);
} else {
return 1;
}
}
/**
* Convert a permutation (in the form of an int array) into a 'rank' - which
* is just a single number that is the order of the permutation in a lexico-
* graphically ordered list.
*
* @param permutation the permutation to use
* @return the rank as a number
*/
private int rankPermutationLexicographically(int[] permutation) {
int rank = 0;
int n = permutation.length;
int[] counter = new int[n + 1];
System.arraycopy(permutation, 0, counter, 1, n);
for (int j = 1; j <= n; j++) {
rank = rank + ((counter[j] - 1) * factorial(n - j));
for (int i = j + 1; i < n; i++) {
if (counter[i] > counter[j]) {
counter[i]--;
}
}
}
return rank;
}
/**
* Performs the opposite to the rank method, producing the permutation that
* has the order <code>rank</code> in the lexicographically ordered list.
*
* As an implementation note, the algorithm assumes that the permutation is
* in the form [1,...N] not the more usual [0,...N-1] for a list of size N.
* This is why there is the final step of 'shifting' the permutation. The
* shift also reduces the numbers by one to make them array indices.
*
* @param rank the order of the permutation to generate
* @param size the length/size of the permutation
* @return a permutation as an int array
*/
private int[] unrankPermutationLexicographically(int rank, int size) {
int[] permutation = new int[size + 1];
permutation[size] = 1;
for (int j = 1; j < size; j++) {
int d = (rank % factorial(j + 1)) / factorial(j);
rank = rank - d * factorial(j);
permutation[size - j] = d + 1;
for (int i = size - j + 1; i <= size; i++) {
if (permutation[i] > d) {
permutation[i]++;
}
}
}
// convert an array of numbers like [1...n] to [0...n-1]
int[] shiftedPermutation = new int[size];
for (int i = 1; i < permutation.length; i++) {
shiftedPermutation[i - 1] = permutation[i] - 1;
}
return shiftedPermutation;
}
}
------------------------------------------------------------------------------
This SF.net email is sponsored by
Make an app they can't live without
Enter the BlackBerry Developer Challenge
http://p.sf.net/sfu/RIM-dev2dev
_______________________________________________
Cdk-user mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/cdk-user