Hi all,

I have spent some time looking into how we might be able to represent 
stochastic models in CellML. This is something that would be useful to 
ensure we have properly contemplated for CellML 1.2. I have pasted in 
the notes I wrote on this below.

Please let me know if you have any suggestions, comments, or criticisms 
of the below document. At some point, this will obviously need to be 
transformed into a more robust proposal, but for now, I just want to 
make sure we keep the option open to use the CellML 1.2 core to 
represent stochastic models.

Best regards,
Andrew

-----

Overall goal:
  Describe a framework which can be used in CellML 1.2 to represent a 
range of
  different systems which require stochastic differential equations to
  describe.

Constraints:
  Do not want to describe the procedure for solving the model in core 
CellML,
    only the underlying mathematical / statistical model.
  Want to express the model in such a way that the procedure is computable
    from the model.
  Want there to be only one interpretation of the model.
  Want the representation to be abstract enough that it is meaningful for a
    number of different fields, and not just chemical equations in a well
    stirred vessel.
  Want the representation to work naturally when mixed with systems of 
ordinary
    differential equations.

Use cases:
  Chemical reactions under the Chemical Master Equation model of Gillespie:
    We need to split these into separate species. This is a Poisson process,
    so there are simple ways to represent it.

    It is more efficient to represent models using Weiner processes when 
this
    there are large enough numbers of molecules to justify this but ODEs are
    not being used.

    However, Poisson and Weiner processes are both Levy Processes, that is,
    they have stationary independent increments, are zero at time zero, and
    are cadlag. This is not necessarily a good thing because some things we
    want to model might have memory of past events or a time dependence.

How we can represent this in CellML:
  For the continuous case, integral equations for the increment in terms of
  built in processes like Weiner and Poisson processes (I don't believe 
there
  is a clean way to represent increments in MathML).

  Implementations will need to identify these and work out the 
distribution of
  the time until the next event (good implementations might be able to 
perform
  symbolic algebra to work this out, but most implementations would probably
  just recognise common cases like Poisson distributions with arbitrary
  parameters, and deal with expressions involving a Weiner process by 
sampling
  from the increment distribution in each time step), which could be put 
into a
  slightly generalised Gibson-Bruck type of framework where we store the 
time of
  the next event.

  None of this helps for non stationary independent processes, however.

How could this be related to the standards:
  It has been proposed that CellML 1.2 have a core specification which 
describes
  the basic way of representing the mathematical structure in very general
  terms, and secondary specifications be used to narrow CellML down to 
specific
  subsets which can be implemented in their entirety by a actual software
  packages.

  Core CellML 1.2 should be general enough to represent concepts like
  probability distributions (MathML allows new operators to be defined, 
so we
  could create ones for our base types of stochastic process). The ODE IV
  secondary specification would not allow stochastic models, while we would
  have another alternative secondary specification which extended the ODE IV
  one to allow certain limited types of stochastic model (limited to 
types we
  know how to solve).

  In terms of interaction with the typing system, in a stochastic system we
  have both reals and real-valued random variables. Once we have one random
  variable in our system, this will propagate to everything else which
  is affected by it, so most of the model will technically be a random 
variable.
  However, we want to be able to easily mix stochastic models with 
existing ODE
  IV models to create hybrid models, so we don't really want the required
  datatype to change just to connect up a random variable to a non-random
  variable. For this reason, I don't think it is worthwhile to consider a
  random variable of a certain type a different datatype, and instead we 
would
  require tools to deduce this information if they require it.

_______________________________________________
cellml-discussion mailing list
cellml-discussion@cellml.org
http://www.cellml.org/mailman/listinfo/cellml-discussion

Reply via email to