Hello,
I am writing a computational package integrating Python and C++ using
Boost.Python. The interface and the initial data collection are in
Python, but the workhorse data structures and computation are in C++. I
am having difficulties with memory errors at two points: at the transfer
of data between numpy and C++, and upon the destruction of wrapped C++
objects. I suspect the issues are related, and are not the result of
limitations of Python, C++, or Boost.Python, but rather are the result
of my lack of familiarity with C++. I apologize for the long email, but
my lack of experience with C++ makes it hard to describe my problems
succinctly, especially because I have no idea where the problems are
occurring, nor do I know the right vocabulary. I've pasted below the C++
source for the relevant class, as well as the Boost.Python module
wrapper code. This is all for academic research, and under the legal
conditions of my grants it is necessarily open-source, but because I
intend to publish this package in a research journal, I'd appreciate it
if you didn't share the code beyond this mailing list; I'll create pypi
and sourceforge entries for the source once it's ready for release. Most
importantly, feel free to talk trash about and point out any errors in
my code: I just started with C++ three weeks ago, and criticism is very
helpful and much appreciated! :-)
My package is for the analysis of large biological data sets, and a new
C++ object must be created for each gene analyzed, which for the human
genome amounts to about 30,000 times for the analysis of a single data
set. A single gene object stores many variables, some of which are
scalar and could be built into the class as static variables, but most
variables are array data which greatly vary in size (and possibly
dimension) depending upon the biological structure of the individual
gene as well as the nature of the experiment being analyze, which cannot
be known in advance. As such, all the array variables are dynamically
allocated. I have used Boost.Python shared pointers and shared arrays
for all dynamically allocated variables. In an attempt at uniformity, in
this version of the code, all gene class variables are dynamically
allocated.
I have separated the creation of a gene object and the filling of
variable values into two steps. All access of C++ variables is done
through wrapped get/set methods, and all data is transferred via numpy
arrays. I have no problems upon object creation, but I get memory errors
both when filling the variables from numpy arrays, and transferring
calculated variable values back into numpy arrays. The issue seems to be
that I cannot force a copy of variable values between numpy arrays and
C++, but rather Boost.Python seems intent upon using references (in fact
the documentation for the extract<> template explicitly states that, and
I can't find any other way to get variables out of numpy arrays and into
C++). This matters because the C++ gene object will modify some of its
own variable values that were initially taken from Python, and the user
will similarly modify within Python variables used to fill the gene
object, and I need to make sure changes in one do not get automatically
reflected in the other. In both cases I get bus errors / segmentation
faults.
The second problem relates to object deletion. Once a single gene has
been analyzed, neither the C++ object nor the corresponding Python
object are needed; the important numerical results are simply stored in
a numpy array. Because of the huge number of genes, I have one of two
options when it comes to memory management, in order to avoid the
build-up of excess memory consumption by genes for which the analysis
has been completed: I can either explicitly destroy the C++ object, by
brute force or by reassigning a new object to its variable in Python, or
I can reinitialize its variables to store a new gene (I am inclined to
the second). But whenever I try to destroy a wrapped C++ object, using
either method, I get a segmentation fault. I've written a C++ destructor
for my gene class; I tried to wrap it for Python, but the Boost.Python
documentation doesn't mention how to register destructors to Python, so
I just wrapped it like any other bound method. (Note: that wrapper is
not present in the version I've posted below, because I assumed I got it
wrong anyway). I still get the faults using that method, as called by my
assigned name. Should I be using the Python object's .__del__ method or
the global Python del(), and if so, how do I specialize them to my gene
object's destructor? Even if I go with the option of just reinitializing
the C++ gene object variables, rather than explicit deletion, I still
get a segmentation fault when I leave the Python interpreter, presumably
because it's trying to destroy the wrapped C++ gene objects created in
that session.
Any and all help and advice is very much appreciated, and I thank you in
advance for your help. If any of you provide particular help, I will be
happy to include you in the acknowledgements should I get this darn
thing published; I'd offer co-authorship, but my boss would never allow
it. Have a good weekend.
-- Chris Cowing-Zitron
fp_boosters.hpp, the source file defining the Boost.Python module:
#include <fp_genes.hpp>
BOOST_PYTHON_MODULE(fp_boosters)
{
def("loggamma", &fingerpuppet::evaluations::loggamma_py,
return_value_policy<return_by_value>());
def("digamma", &fingerpuppet::evaluations::digamma_py,
return_value_policy<return_by_value>());
def("trigamma", &fingerpuppet::evaluations::trigamma_py,
return_value_policy<return_by_value>());
def("ll_single", &fingerpuppet::evaluations::ll_single_py);
def("ll_grad_single", &fingerpuppet::evaluations::ll_grad_single_py);
def("ll_hess_single", &fingerpuppet::evaluations::ll_hess_single_py);
def("ll_full", &fingerpuppet::evaluations::ll_full_py);
def("ll_grad_full", &fingerpuppet::evaluations::ll_grad_full_py);
def("ll_hess_full", &fingerpuppet::evaluations::ll_hess_full_py);
class_<fingerpuppet::genes::gene, boost::noncopyable>("gene")
.def("register_gene", &fingerpuppet::genes::gene::register_new_gene_py)
.def("reset_gene", &fingerpuppet::genes::gene::reset_gene)
.def("empty_gene", &fingerpuppet::genes::gene::empty_gene)
.def("eval_ll", &fingerpuppet::genes::gene::eval_ll_py)
.def("eval_cons", &fingerpuppet::genes::gene::eval_cons_py)
.def("eval_ll_grad", &fingerpuppet::genes::gene::eval_ll_grad_py)
.def("eval_cons_jac", &fingerpuppet::genes::gene::eval_cons_jac_py)
.def("eval_ll_hess", &fingerpuppet::genes::gene::eval_ll_hess_py)
.def("eval_cons_hess", &fingerpuppet::genes::gene::eval_cons_hess_py)
.def("eval_lagrangian", &fingerpuppet::genes::gene::eval_lagrangian_py)
.def("update_var", &fingerpuppet::genes::gene::update_var_py)
.def("return_var", &fingerpuppet::genes::gene::return_var_py)
;
numeric::array::set_module_and_type("numpy", "ndarray");
};
fp_genes.hpp, the header defining the gene object class:
#include <fp_evaluations.hpp>
namespace fingerpuppet {
namespace genes {
class gene;
class gene : public boost::enable_shared_from_this<gene>, public
boost::noncopyable
{
public:
/* Constructors, destructor, and construction utilities */
gene()
{
gene_count = 0;
reset_gene();
}
~gene()
{
empty_gene();
}
boost::shared_ptr<gene> return_this()
{
return shared_from_this();
}
bool register_new_gene(int *ns, int *statecolsin, int *sampcolsin, int
*exonrowsin, int *segrowsin, double *readsin)
{
reset_gene();
gene_count += 1;
nstate.reset(new int);
nsamp.reset(new int);
nexon.reset(new int);
nseg.reset(new int);
nwin.reset(new int);
nvar.reset(new int);
ncon.reset(new int);
nsingle.reset(new int);
hsingle.reset(new int);
jac_entries.reset(new int);
hess_entries.reset(new int);
current_ll.reset(new double);
current_obj_factor.reset(new double);
*nstate = ns[0];
*nsamp = ns[1];
*nexon = ns[2];
*nseg = ns[3];
*nwin = ns[4];
*nvar = ns[5];
*ncon = *nsamp * *nexon;
*nsingle = 9;
*hsingle = 21;
*jac_entries = 2 * *ncon;
*hess_entries = 0;
*current_ll = 0.0;
*current_obj_factor = 1.0;
statecols.reset(new int [*nsamp]);
sampcols.reset(new int [*nsamp]);
exonrows.reset(new int [*nwin]);
segrows.reset(new int [*nwin]);
sampsperstate.reset(new int [*nstate]);
segsperexon.reset(new int [*nseg]);
single_hess_rows.reset(new int [*hsingle]);
single_hess_cols.reset(new int [*hsingle]);
jac_rows.reset(new int [*jac_entries]);
jac_cols.reset(new int [*jac_entries]);
eval_counts.reset(new int [7]);
reads.reset(new double [*nsamp * *nwin]);
readmaxes.reset(new double [*nsamp * *nexon]);
start_X.reset(new double [*nvar]);
x_L.reset(new double [*nvar]);
x_U.reset(new double [*nvar]);
g_L.reset(new double [*ncon]);
g_U.reset(new double [*ncon]);
current_X.reset(new double [*nvar]);
current_cons.reset(new double [*ncon]);
current_ll_grad.reset(new double [*nvar]);
current_cons_jac.reset(new double [*jac_entries]);
current_cons_hess.reset(new double [*ncon]);
current_lambda.reset(new double [*ncon]);
for (int entry = 0; entry < *ncon; ++entry) {
readmaxes[entry] = 0.0;
}
for (int samp = 0; samp < *nsamp; ++samp) {
for (int win = 0; win < *nwin; ++win) {
if (reads[(samp * *nwin) + win] > readmaxes[(samp * *nexon) +
exonrows[win]]) {
readmaxes[(samp * *nexon) + exonrows[win]] = reads[(samp * *nwin) + win];
}
}
}
for (int entry = 0; entry < *nstate; ++entry) {
sampsperstate[entry] = 0;
}
for (int entry = 0; entry < *nsamp; ++entry) {
sampsperstate[statecols[entry]] += 1;
}
for (int entry = 0; entry < *nexon; ++entry) {
segsperexon[entry] = 0;
}
segsperexon[0] = 1;
for (int entry = 1; entry < *nwin; ++entry) {
if (segrows[entry] != segrows[entry - 1]) {
segsperexon[exonrows[entry]] += 1;
}
}
init_jac_struct();
init_hess_struct();
init_bounds();
init_start_X();
}
bool register_new_gene_py(numeric::array const &ns_py, numeric::array
const &statecolsin_py, numeric::array const &sampcolsin_py,
numeric::array const &exonrowsin_py, numeric::array const &segrowsin_py,
numeric::array const &readsin_py)
{
reset_gene();
gene_count += 1;
nstate.reset(new int);
nsamp.reset(new int);
nexon.reset(new int);
nseg.reset(new int);
nwin.reset(new int);
nvar.reset(new int);
ncon.reset(new int);
nsingle.reset(new int);
hsingle.reset(new int);
jac_entries.reset(new int);
hess_entries.reset(new int);
current_ll.reset(new double);
current_obj_factor.reset(new double);
*nstate = int(ns_py[0]);
*nsamp = int(ns_py[1]);
*nexon = int(ns_py[2]);
*nseg = int(ns_py[3]);
*nwin = int(ns_py[4]);
*nvar = int(ns_py[5]);
*ncon = *nsamp * *nexon;
*nsingle = 9;
*hsingle = 21;
*jac_entries = 2 * *ncon;
*hess_entries = 0;
*current_ll = 0.0;
*current_obj_factor = 1.0;
statecols.reset(new int [*nsamp]);
sampcols.reset(new int [*nsamp]);
exonrows.reset(new int [*nwin]);
segrows.reset(new int [*nwin]);
sampsperstate.reset(new int [*nstate]);
segsperexon.reset(new int [*nseg]);
single_hess_rows.reset(new int [*hsingle]);
single_hess_cols.reset(new int [*hsingle]);
jac_rows.reset(new int [*jac_entries]);
jac_cols.reset(new int [*jac_entries]);
eval_counts.reset(new int [7]);
reads.reset(new double [*nsamp * *nwin]);
readmaxes.reset(new double [*nsamp * *nexon]);
start_X.reset(new double [*nvar]);
x_L.reset(new double [*nvar]);
x_U.reset(new double [*nvar]);
g_L.reset(new double [*ncon]);
g_U.reset(new double [*ncon]);
current_X.reset(new double [*nvar]);
current_cons.reset(new double [*ncon]);
current_ll_grad.reset(new double [*nvar]);
current_cons_jac.reset(new double [*jac_entries]);
current_cons_hess.reset(new double [*ncon]);
current_lambda.reset(new double [*ncon]);
for (int entry = 0; entry < *nsamp; ++entry) {
statecols[entry] = extract<int>(statecolsin_py[entry]());
}
for (int entry = 0; entry < *nsamp; ++entry) {
sampcols[entry] = extract<int>(sampcolsin_py[entry]());
}
for (int entry = 0; entry < *nwin; ++entry) {
exonrows[entry] = extract<int>(exonrowsin_py[entry]());
}
for (int entry = 0; entry < *nwin; ++entry) {
segrows[entry] = extract<int>(segrowsin_py[entry]());
}
for (int entry = 0; entry < *nsamp * *nwin; ++entry) {
reads[entry] = extract<double>(readsin_py[entry]());
}
for (int entry = 0; entry < *ncon; ++entry) {
readmaxes[entry] = 0.0;
}
for (int samp = 0; samp < *nsamp; ++samp) {
for (int win = 0; win < *nwin; ++win) {
if (reads[(samp * *nwin) + win] > readmaxes[(samp * *nexon) +
exonrows[win]]) {
readmaxes[(samp * *nexon) + exonrows[win]] = reads[(samp * *nwin) + win];
}
}
}
for (int entry = 0; entry < *nstate; ++entry) {
sampsperstate[entry] = 0;
}
for (int entry = 0; entry < *nsamp; ++entry) {
sampsperstate[statecols[entry]] += 1;
}
for (int entry = 0; entry < *nexon; ++entry) {
segsperexon[entry] = 0;
}
segsperexon[0] = 1;
for (int entry = 1; entry < *nwin; ++entry) {
if (segrows[entry] != segrows[entry - 1]) {
segsperexon[exonrows[entry]] += 1;
}
}
init_jac_struct();
init_hess_struct();
init_bounds();
init_start_X();
}
bool reset_gene()
{
nstate.reset(new int);
nsamp.reset(new int);
nexon.reset(new int);
nseg.reset(new int);
nwin.reset(new int);
nvar.reset(new int);
ncon.reset(new int);
nsingle.reset(new int);
hsingle.reset(new int);
jac_entries.reset(new int);
hess_entries.reset(new int);
current_ll.reset(new double);
current_obj_factor.reset(new double);
statecols.reset(new int [1]);
sampcols.reset(new int [1]);
exonrows.reset(new int [1]);
segrows.reset(new int [1]);
sampsperstate.reset(new int [1]);
segsperexon.reset(new int [1]);
single_hess_rows.reset(new int [1]);
single_hess_cols.reset(new int [1]);
jac_rows.reset(new int [1]);
jac_cols.reset(new int [1]);
full_hess_rows.reset(new int [1]);
full_hess_cols.reset(new int [1]);
eval_counts.reset(new int [1]);
reads.reset(new double [1]);
readmaxes.reset(new double [1]);
start_X.reset(new double [1]);
x_L.reset(new double [1]);
x_U.reset(new double [1]);
g_L.reset(new double [1]);
g_U.reset(new double [1]);
current_X.reset(new double [1]);
current_cons.reset(new double [1]);
current_ll_grad.reset(new double [1]);
current_cons_jac.reset(new double [1]);
current_ll_hess.reset(new double [1]);
current_cons_hess.reset(new double [1]);
current_lambda.reset(new double [1]);
current_lagrangian.reset(new double [1]);
return true;
}
bool empty_gene()
{
nstate.reset();
nsamp.reset();
nexon.reset();
nseg.reset();
nwin.reset();
nvar.reset();
ncon.reset();
nsingle.reset();
hsingle.reset();
jac_entries.reset();
hess_entries.reset();
current_ll.reset();
current_obj_factor.reset();
statecols.reset();
sampcols.reset();
exonrows.reset();
segrows.reset();
sampsperstate.reset();
segsperexon.reset();
single_hess_rows.reset();
single_hess_cols.reset();
jac_rows.reset();
jac_cols.reset();
full_hess_rows.reset();
full_hess_cols.reset();
eval_counts.reset();
reads.reset();
readmaxes.reset();
start_X.reset();
x_L.reset();
x_U.reset();
g_L.reset();
g_U.reset();
current_X.reset();
current_cons.reset();
current_ll_grad.reset();
current_cons_jac.reset();
current_ll_hess.reset();
current_cons_hess.reset();
current_lambda.reset();
current_lagrangian.reset();
return true;
}
bool init_jac_struct()
{
for (int samp = 0; samp < *nsamp; ++samp) {
for (int exon = 0; exon < *nexon; ++exon) {
jac_rows[2 * ((samp * *nexon) + exon)] = (6 * *nstate) + samp;
jac_cols[2 * ((samp * *nexon) + exon)] = (samp * *nexon) + exon;
jac_rows[(2 * ((samp * *nexon) + exon)) + 1] = (6 * *nstate) + *nsamp +
(samp * *nexon) + exon;
jac_cols[(2 * ((samp * *nexon) + exon)) + 1] = (samp * *nexon) + exon;
}
}
return true;
}
bool init_hess_struct()
{
int shr[21] = {3,4,2,4,2,5,6,1,6,1,7,8,0,8,0,0,1,2,1,2,2};
int shc[21] = {3,3,3,4,4,5,5,5,6,6,7,7,7,8,8,0,0,0,1,1,2};
for (int entry = 0; entry < 21; ++entry) {
single_hess_rows[entry] = shr[entry];
single_hess_cols[entry] = shc[entry];
}
for (int state = 0; state < *nstate; ++state) {
*hess_entries += 3 + (2 * *nseg * sampsperstate[state]);
}
for (int state = 0; state < *nstate; ++state) {
*hess_entries += 3 + (2 * *nexon * sampsperstate[state]);
}
for (int state = 0; state < *nstate; ++state) {
*hess_entries += 3 + (2 * sampsperstate[state]);
}
for (int samp = 0; samp < *nsamp; ++samp) {
*hess_entries += 1 + *nexon + *nseg;
}
for (int exon = 0; exon < *nexon; ++exon) {
*hess_entries += *nsamp * (1 + segsperexon[exon]);
}
*hess_entries += *nsamp * *nseg;
full_hess_rows.reset(new int [*hess_entries]);
full_hess_cols.reset(new int [*hess_entries]);
current_ll_hess.reset(new double [*hess_entries]);
current_lagrangian.reset(new double [*hess_entries]);
int entry = 0;
for (int state = 0; state < *nstate; ++state) {
for (int var = 0; var < 2 + (*nseg * sampsperstate[state]); ++var) {
full_hess_cols[entry + var] = state;
}
full_hess_rows[entry] = state;
full_hess_rows[entry + 1] = *nstate + state;
entry += 2;
for (int samp = 0; samp < *nsamp; ++samp) {
for (int seg = 0; seg < *nseg; ++seg) {
if (statecols[samp] == state) {
full_hess_rows[entry] = (6 * *nstate) + (*nsamp * (*nexon + 1)) + (samp
* *nseg) + seg;
entry += 1;
}
}
}
}
for (int state = 0; state < *nstate; ++state) {
for (int var = 0; var < 1 + (*nseg * sampsperstate[state]); ++var) {
full_hess_cols[entry + var] = *nstate + state;
}
full_hess_rows[entry] = *nstate + state;
entry += 1;
for (int samp = 0; samp < *nsamp; ++samp) {
for (int seg = 0; seg < *nseg; ++seg) {
if (statecols[samp] == state) {
full_hess_rows[entry] = (6 * *nstate) + (*nsamp * (*nexon + 1)) + (samp
* *nseg) + seg;
entry += 1;
}
}
}
}
for (int state = 0; state < *nstate; ++state) {
for (int var = 0; var < 2 + (*nexon * sampsperstate[state]); ++var) {
full_hess_cols[entry + var] = (2 * *nstate) + state;
}
full_hess_rows[entry] = (2 * *nstate) + state;
full_hess_rows[entry + 1] = (3 * *nstate) + state;
entry += 2;
for (int samp = 0; samp < *nsamp; ++samp) {
for (int exon = 0; exon < *nexon; ++exon) {
if (statecols[samp] == state) {
full_hess_rows[entry] = (6 * *nstate) + *nsamp + (samp * *nexon) + exon;
entry += 1;
}
}
}
}
for (int state = 0; state < *nstate; ++state) {
for (int var = 0; var < 1 + (*nexon * sampsperstate[state]); ++var) {
full_hess_cols[entry + var] = (3 * *nstate) + state;
}
full_hess_rows[entry] = (3 * *nstate) + state;
entry += 1;
for (int samp = 0; samp < *nsamp; ++samp) {
for (int exon = 0; exon < *nexon; ++exon) {
if (statecols[samp] == state) {
full_hess_rows[entry] = (6 * *nstate) + *nsamp + (samp * *nexon) + exon;
entry += 1;
}
}
}
}
for (int state = 0; state < *nstate; ++state) {
for (int var = 0; var < 2 + sampsperstate[state]; ++var) {
full_hess_cols[entry + var] = (4 * *nstate) + state;
}
full_hess_rows[entry] = (4 * *nstate) + state;
full_hess_rows[entry + 1] = (5 * *nstate) + state;
entry += 2;
for (int samp = 0; samp < *nsamp; ++samp) {
if (statecols[samp] == state) {
full_hess_rows[entry] = (6 * *nstate) + samp;
entry += 1;
}
}
}
for (int state = 0; state < *nstate; ++state) {
for (int var = 0; var < 1 + sampsperstate[state]; ++var) {
full_hess_cols[entry + var] = (5 * *nstate) + state;
}
full_hess_rows[entry] = (5 * *nstate) + state;
entry += 1;
for (int samp = 0; samp < *nsamp; ++samp) {
if (statecols[samp] == state) {
full_hess_rows[entry] = (6 * *nstate) + samp;
entry += 1;
}
}
}
for (int samp = 0; samp < *nsamp; ++samp) {
for (int var = 0; var < 1 + *nexon + *nseg; ++var) {
full_hess_cols[entry + var] = (6 * *nstate) + samp;
}
full_hess_rows[entry] = (6 * *nstate) + samp;
entry += 1;
for (int exon = 0; exon < *nexon; ++exon) {
full_hess_rows[entry] = (6 * *nstate) + *nsamp + (samp * *nexon) + exon;
entry += 1;
}
for (int seg = 0; seg < *nseg; ++seg) {
full_hess_rows[entry] = (6 * *nstate) + (*nsamp * (*nexon + 1)) + (samp
* *nseg) + seg;
entry += 1;
}
}
for (int samp = 0; samp < *nsamp; ++samp) {
int startseg = 0;
for (int exon = 0; exon < *nexon; ++exon) {
for (int var = 0; var < 1 + segsperexon[exon]; ++var) {
full_hess_cols[entry + var] = (6 * *nstate) + *nsamp + (samp * *nexon) +
exon;
}
startseg += segsperexon[exon];
full_hess_rows[entry] = (6 * *nstate) + *nsamp + (samp * *nexon) + exon;
entry += 1;
for (int seg = startseg; seg < startseg + segsperexon[exon]; ++seg) {
full_hess_rows[entry] = (6 * *nstate) + (*nsamp * (*nexon + 1)) + (samp
* *nseg) + seg;
entry += 1;
}
}
}
for (int var = (6 * *nstate) + (*nsamp * (*nexon + 1)); var < *nvar;
++var) {
full_hess_cols[entry] = var;
full_hess_rows[entry] = var;
entry += 1;
}
return true;
}
bool init_bounds()
{
double curmax;
for (int entry = 0; entry < 4 * *nvar; ++entry) {
x_L[entry] = 0.01;
x_U[entry] = 100.0;
}
for (int entry = (6 * *nstate) + *nsamp; entry < *nvar; ++entry) {
x_L[entry] = 0.001;
x_U[entry] = 0.999;
}
for (int samp = 0; samp < *nsamp; ++samp) {
curmax = 0;
for (int exon = 0; exon < *nexon; ++exon) {
if (readmaxes[(samp * *nexon) + exon] > curmax) {
curmax = readmaxes[(samp * *nexon) + exon];
}
}
x_L[(6 * *nstate) + samp] = curmax / 0.999;
x_U[(6 * *nstate) + samp] = 1000 * curmax;
}
for (int state = 0; state < *nstate; ++state) {
curmax = 0;
for (int samp = 0; samp < *nsamp; ++samp) {
if (statecols[samp] == state) {
curmax += x_L[(6 * *nstate) + samp];
}
}
curmax /= sampsperstate[state];
x_L[(4 * *nstate) + state] = 0.0001 * curmax;
x_U[(4 * *nstate) + state] = 10000 * curmax;
x_L[(5 * *nstate) + state] = 0.0000001 * curmax;
x_U[(5 * *nstate) + state] = 10000 * curmax;
}
for (int cons = 0; cons < *ncon; ++cons) {
g_L[cons] = 0.001;
g_U[cons] = 2.0 * pow(10, 19);
}
return true;
}
bool init_start_X()
{
for (int entry = 0; entry < 4 * *nstate; ++entry) {
start_X[entry] = 1.0;
}
double readmax;
double statemeans[*nstate];
for (int state = 0; state < *nstate; ++state) {
statemeans[state] = 0;
}
for (int samp = 0; samp < *nsamp; ++samp) {
readmax = 0;
for (int win = 0; win < *nwin; ++win) {
if (reads[(samp * *nwin) + win] > readmax) {
readmax = reads[(samp * *nwin) + win];
}
}
start_X[(6 * *nstate) + samp] = 2.0 * readmax;
statemeans[statecols[samp]] += 2.0 * readmax;
}
for (int state = 0; state < *nstate; ++state) {
start_X[(4 * *nstate) + state] = statemeans[state] / sampsperstate[state];
start_X[(5 * *nstate) + state] = 0.1 * statemeans[state] /
sampsperstate[state];
}
for (int entry = (6 * *nstate) + *nsamp; entry < *nvar; ++entry) {
start_X[entry] = 0.99;
}
for (int entry = 0; entry < *nvar; ++entry) {
current_X[entry] = start_X[entry];
}
for (int entry = 0; entry < 7; ++entry) {
eval_counts[entry] = 0;
}
return true;
}
/* Likelihood and constraint calculations */
bool eval_ll(double *X_in, double &eval)
{
boost::shared_array<double> X(X_in);
evaluations::ll_full(*nsamp,*nwin,*nvar,statecols,
exonrows,segrows,*nstate,*nexon,*nseg,
reads,X,eval);
for (int entry = 0; entry < *nvar; ++entry) { current_X[entry] =
X[entry] ; }
*current_ll = eval;
eval_counts[0] += 1;
return true;
}
bool eval_cons(double *X, double *cons)
{
for (int samp = 0; samp < *nsamp; ++samp) {
for (int exon = 0; exon < *nexon; ++exon) {
cons[(samp * *nexon) + exon] = (X[(6 * *nstate) + samp] * X[(6 *
*nstate) + *nsamp + (samp * *nexon) + exon]) - readmaxes[(samp * *nexon)
+ exon];
}
}
for (int entry = 0; entry < *nvar; ++entry) { current_X[entry] =
X[entry] ; }
for (int entry = 0; entry < *ncon; ++entry) { current_cons[entry] =
cons[entry] ; }
eval_counts[1] += 1;
return true;
}
bool eval_ll_grad(double *X_in, double *grad_in)
{
boost::shared_array<double> X(X_in);
boost::shared_array<double> grad(grad_in);
evaluations::ll_grad_full(*nsamp,*nwin,*nvar,statecols,
exonrows,segrows,*nstate,*nexon,*nseg,
reads,X,grad);
for (int entry = 0; entry < *nvar; ++entry) { current_X[entry] =
X[entry] ; }
for (int entry = 0; entry < *nvar; ++entry) { current_ll_grad[entry] =
grad[entry] ; }
eval_counts[2] += 1;
return true;
}
bool eval_cons_jac(double *X, double *jac)
{
for (int samp = 0; samp < *nsamp; ++samp) {
for (int exon = 0; exon < *nexon; ++exon) {
jac[2 * ((samp * *nexon) + exon)] = X[(6 * *nstate) + *nsamp + (samp *
*nexon) + exon];
jac[(2 * ((samp * *nexon) + exon)) + 1] = X[(6 * *nstate) + samp];
}
}
for (int entry = 0; entry < *nvar; ++entry) { current_X[entry] =
X[entry] ; }
for (int entry = 0; entry < *jac_entries; ++entry) {
current_cons_jac[entry] = jac[entry] ; }
eval_counts[3] += 1;
return true;
}
bool eval_ll_hess(double *X_in, double *hess_in)
{
boost::shared_array<double> X(X_in);
boost::shared_array<double> hess(hess_in);
evaluations::ll_hess_full(*nsamp,*nwin,*nvar,*hess_entries,statecols,
exonrows,segrows,*nstate,*nexon,*nseg,
sampsperstate,segsperexon,reads,X,hess);
for (int entry = 0; entry < *nvar; ++entry) { current_X[entry] =
X[entry] ; }
for (int entry = 0; entry < *hess_entries; ++entry) {
current_ll_hess[entry] = hess[entry] ; }
eval_counts[4] += 1;
return true;
}
bool eval_cons_hess(double *X, double *hess)
{
for (int entry = 0; entry < *ncon; ++entry) {
hess[entry] = 1.0;
}
for (int entry = 0; entry < *nvar; ++entry) { current_X[entry] =
X[entry] ; }
for (int entry = 0; entry < *ncon; ++entry) { current_cons_hess[entry] =
hess[entry] ; }
eval_counts[5] += 1;
return true;
}
bool eval_lagrangian(double *X, double &obj_factor, double *lambda,
double *lagrangian)
{
int Nstart;
eval_ll_hess(X, lagrangian);
for (int entry = 0; entry < *hess_entries; ++entry) {
lagrangian[entry] *= obj_factor;
}
for (int samp = 0; samp < *nsamp; ++samp) {
Nstart = (9 * *nstate) + (2 * *nseg * *nsamp) + (2 * *nexon * *nsamp) +
(2 * *nsamp) + (samp * (1 + *nexon + *nseg));
for (int exon = 0; exon < *nexon; ++exon) {
lagrangian[Nstart + 1 + exon] += lambda[(samp * *nexon) + exon];
}
}
for (int entry = 0; entry < *nvar; ++entry) { current_X[entry] =
X[entry] ; }
*current_obj_factor = obj_factor;
for (int entry = 0; entry < *ncon; ++entry) { current_lambda[entry] =
lambda[entry] ; }
for (int entry = 0; entry < *hess_entries; ++entry) {
current_lagrangian[entry] = lagrangian[entry] ; }
eval_counts[6] += 1;
return true;
}
bool eval_ll_py(numeric::array const &X_py, numeric::array const &eval_py)
{
double *X = new double [*nvar];
double eval;
for (int entry = 0; entry < *nvar; ++entry) { X[entry] =
double(X_py[entry]) ; }
eval_ll(X, eval);
eval_py[0] = eval;
return true;
}
bool eval_cons_py(numeric::array const &X_py, numeric::array const &cons_py)
{
double *X = new double [*nvar];
double *cons = new double [*ncon];
for (int entry = 0; entry < *nvar; ++entry) { X[entry] =
double(X_py[entry]) ; }
eval_cons(X, cons);
for (int entry = 0; entry < *ncon; ++entry) { cons_py[entry] =
cons[entry] ; }
return true;
}
bool eval_ll_grad_py(numeric::array const &X_py, numeric::array const
&grad_py)
{
double *X = new double [*nvar];
double *grad = new double [*nvar];
for (int entry = 0; entry < *nvar; ++entry) { X[entry] =
double(X_py[entry]) ; }
eval_ll_grad(X, grad);
for (int entry = 0; entry < *nvar; ++entry) { grad_py[entry] =
grad[entry] ; }
return true;
}
bool eval_cons_jac_py(numeric::array const &X_py, numeric::array const
&jac_py)
{
double *X = new double [*nvar];
double *jac = new double [*jac_entries];
for (int entry = 0; entry < *nvar; ++entry) { X[entry] =
double(X_py[entry]) ; }
eval_cons_jac(X, jac);
for (int entry = 0; entry < *jac_entries; ++entry) { jac_py[entry] =
jac[entry] ; }
return true;
}
bool eval_ll_hess_py(numeric::array const &X_py, numeric::array const
&hess_py)
{
double *X = new double [*nvar];
double *hess = new double [*hess_entries];
for (int entry = 0; entry < *nvar; ++entry) { X[entry] =
double(X_py[entry]) ; }
eval_ll_hess(X, hess);
for (int entry = 0; entry < *hess_entries; ++entry) { hess_py[entry] =
hess[entry] ; }
return true;
}
bool eval_cons_hess_py(numeric::array const &X_py, numeric::array const
&hess_py)
{
double *X = new double [*nvar];
double *hess = new double [*ncon];
for (int entry = 0; entry < *nvar; ++entry) { X[entry] =
double(X_py[entry]) ; }
eval_cons_hess(X, hess);
for (int entry = 0; entry < *ncon; ++entry) { hess_py[entry] =
hess[entry] ; }
return true;
}
bool eval_lagrangian_py(numeric::array const &X_py, numeric::array const
&obj_factor_py, numeric::array const &lambda_py, numeric::array const
&lagrangian_py)
{
double *X = new double [*nvar];
double obj_factor;
double *lambda = new double [*ncon];
double *lagrangian = new double [*hess_entries];
for (int entry = 0; entry < *nvar; ++entry) { X[entry] =
double(X_py[entry]) ; }
obj_factor = double(obj_factor_py[0]);
for (int entry = 0; entry < *ncon; ++entry) { lambda[entry] =
double(lambda_py[entry]) ; }
eval_lagrangian(X, obj_factor, lambda, lagrangian);
for (int entry = 0; entry < *hess_entries; ++entry) {
lagrangian_py[entry] = lagrangian[entry] ; }
return true;
}
/* Variable access */
template <class T> bool update_var(std::string var_name, T var_value)
{
if (var_name == "gene_count") { gene_count = var_value ; }
if (var_name == "nstate") { *nstate = var_value ; }
if (var_name == "nsamp") { *nsamp = var_value ; }
if (var_name == "nexon") { *nexon = var_value ; }
if (var_name == "nseg") { *nseg = var_value ; }
if (var_name == "nwin") { *nwin = var_value ; }
if (var_name == "nvar") { *nvar = var_value ; }
if (var_name == "ncon") { *ncon = var_value ; }
if (var_name == "nsingle") { *nsingle = var_value ; }
if (var_name == "hsingle") { *hsingle = var_value ; }
if (var_name == "jac_entries") { *jac_entries = var_value ; }
if (var_name == "hess_entries") { *hess_entries = var_value ; }
if (var_name == "current_ll") { *current_ll = var_value ; }
if (var_name == "current_obj_factor") { *current_obj_factor = var_value ; }
}
template <class T> bool return_var(std::string var_name, T &var_value)
{
if (var_name == "gene_count") {var_value = gene_count ; }
if (var_name == "nstate") { var_value = *nstate ; }
if (var_name == "nsamp") { var_value = *nsamp ; }
if (var_name == "nexon") { var_value = *nexon ; }
if (var_name == "nseg") { var_value = *nseg ; }
if (var_name == "nwin") { var_value = *nwin ; }
if (var_name == "nvar") { var_value = *nvar ; }
if (var_name == "ncon") { var_value = *ncon ; }
if (var_name == "nsingle") { var_value = *nsingle ; }
if (var_name == "hsingle") { var_value = *hsingle ; }
if (var_name == "jac_entries") { var_value = *jac_entries ; }
if (var_name == "hess_entries") { var_value = *hess_entries ; }
if (var_name == "current_ll") { var_value = *current_ll ; }
if (var_name == "current_obj_factor") { var_value = *current_obj_factor ; }
}
template <class T> bool update_vec_var(std::string var_name, int
var_length, T var_value[])
{
if (var_name == "statecols") {for (int entry = 0; entry < var_length;
++entry) { statecols[entry] = var_value[entry] ; } }
if (var_name == "sampcols") {for (int entry = 0; entry < var_length;
++entry) { sampcols[entry] = var_value[entry] ; } }
if (var_name == "exonrows") {for (int entry = 0; entry < var_length;
++entry) { exonrows[entry] = var_value[entry] ; } }
if (var_name == "segrows") {for (int entry = 0; entry < var_length;
++entry) { segrows[entry] = var_value[entry] ; } }
if (var_name == "sampsperstate") {for (int entry = 0; entry <
var_length; ++entry) { sampsperstate[entry] = var_value[entry] ; } }
if (var_name == "segsperexon") {for (int entry = 0; entry < var_length;
++entry) { segsperexon[entry] = var_value[entry] ; } }
if (var_name == "single_hess_rows") {for (int entry = 0; entry <
var_length; ++entry) { single_hess_rows[entry] = var_value[entry] ; } }
if (var_name == "single_hess_cols") {for (int entry = 0; entry <
var_length; ++entry) { single_hess_cols[entry] = var_value[entry] ; } }
if (var_name == "jac_rows") {for (int entry = 0; entry < var_length;
++entry) { jac_rows[entry] = var_value[entry] ; } }
if (var_name == "jac_cols") {for (int entry = 0; entry < var_length;
++entry) { jac_cols[entry] = var_value[entry] ; } }
if (var_name == "full_hess_rows") {for (int entry = 0; entry <
var_length; ++entry) { full_hess_rows[entry] = var_value[entry] ; } }
if (var_name == "full_hess_cols") {for (int entry = 0; entry <
var_length; ++entry) { full_hess_cols[entry] = var_value[entry] ; } }
if (var_name == "reads") {for (int entry = 0; entry < var_length;
++entry) { reads[entry] = var_value[entry] ; } }
if (var_name == "readmaxes") {for (int entry = 0; entry < var_length;
++entry) { readmaxes[entry] = var_value[entry] ; } }
if (var_name == "eval_counts") {for (int entry = 0; entry < var_length;
++entry) { eval_counts[entry] = var_value[entry] ; } }
if (var_name == "start_X") {for (int entry = 0; entry < var_length;
++entry) { start_X[entry] = var_value[entry] ; } }
if (var_name == "x_L") {for (int entry = 0; entry < var_length; ++entry)
{ x_L[entry] = var_value[entry] ; } }
if (var_name == "x_U") {for (int entry = 0; entry < var_length; ++entry)
{ x_U[entry] = var_value[entry] ; } }
if (var_name == "g_L") {for (int entry = 0; entry < var_length; ++entry)
{ g_L[entry] = var_value[entry] ; } }
if (var_name == "g_U") {for (int entry = 0; entry < var_length; ++entry)
{ g_U[entry] = var_value[entry] ; } }
if (var_name == "current_X") {for (int entry = 0; entry < var_length;
++entry) { current_X[entry] = var_value[entry] ; } }
if (var_name == "current_cons") {for (int entry = 0; entry < var_length;
++entry) { current_cons[entry] = var_value[entry] ; } }
if (var_name == "current_ll_grad") {for (int entry = 0; entry <
var_length; ++entry) { current_ll_grad[entry] = var_value[entry] ; } }
if (var_name == "current_cons_jac") {for (int entry = 0; entry <
var_length; ++entry) { current_cons_jac[entry] = var_value[entry] ; } }
if (var_name == "current_ll_hess") {for (int entry = 0; entry <
var_length; ++entry) { current_ll_hess[entry] = var_value[entry] ; } }
if (var_name == "current_cons_hess") {for (int entry = 0; entry <
var_length; ++entry) { current_cons_hess[entry] = var_value[entry] ; } }
if (var_name == "current_lambda") {for (int entry = 0; entry <
var_length; ++entry) { current_lambda[entry] = var_value[entry] ; } }
if (var_name == "current_lagrangian") {for (int entry = 0; entry <
var_length; ++entry) { current_lagrangian[entry] = var_value[entry] ; } }
}
template <class T> bool return_vec_var(std::string var_name, int
var_length, T var_value[])
{
if (var_name == "statecols") {for (int entry = 0; entry < var_length;
++entry) { var_value[entry] = statecols[entry] ; } }
if (var_name == "sampcols") {for (int entry = 0; entry < var_length;
++entry) { var_value[entry] = sampcols[entry] ; } }
if (var_name == "exonrows") {for (int entry = 0; entry < var_length;
++entry) { var_value[entry] = exonrows[entry] ; } }
if (var_name == "segrows") {for (int entry = 0; entry < var_length;
++entry) { var_value[entry] = segrows[entry] ; } }
if (var_name == "sampsperstate") {for (int entry = 0; entry <
var_length; ++entry) { var_value[entry] = sampsperstate[entry] ; } }
if (var_name == "segsperexon") {for (int entry = 0; entry < var_length;
++entry) { var_value[entry] = segsperexon[entry] ; } }
if (var_name == "single_hess_rows") {for (int entry = 0; entry <
var_length; ++entry) { var_value[entry] = single_hess_rows[entry] ; } }
if (var_name == "single_hess_cols") {for (int entry = 0; entry <
var_length; ++entry) { var_value[entry] = single_hess_cols[entry] ; } }
if (var_name == "jac_rows") {for (int entry = 0; entry < var_length;
++entry) { var_value[entry] = jac_rows[entry] ; } }
if (var_name == "jac_cols") {for (int entry = 0; entry < var_length;
++entry) { var_value[entry] = jac_cols[entry] ; } }
if (var_name == "full_hess_rows") {for (int entry = 0; entry <
var_length; ++entry) { var_value[entry] = full_hess_rows[entry] ; } }
if (var_name == "full_hess_cols") {for (int entry = 0; entry <
var_length; ++entry) { var_value[entry] = full_hess_cols[entry] ; } }
if (var_name == "reads") {for (int entry = 0; entry < var_length;
++entry) { var_value[entry] = reads[entry] ; } }
if (var_name == "readmaxes") {for (int entry = 0; entry < var_length;
++entry) { var_value[entry] = readmaxes[entry] ; } }
if (var_name == "eval_counts") {for (int entry = 0; entry < var_length;
++entry) { var_value[entry] = eval_counts[entry] ; } }
if (var_name == "start_X") {for (int entry = 0; entry < var_length;
++entry) { var_value[entry] = start_X[entry] ; } }
if (var_name == "x_L") {for (int entry = 0; entry < var_length; ++entry)
{ var_value[entry] = x_L[entry] ; } }
if (var_name == "x_U") {for (int entry = 0; entry < var_length; ++entry)
{ var_value[entry] = x_U[entry] ; } }
if (var_name == "g_L") {for (int entry = 0; entry < var_length; ++entry)
{ var_value[entry] = g_L[entry] ; } }
if (var_name == "g_U") {for (int entry = 0; entry < var_length; ++entry)
{ var_value[entry] = g_U[entry] ; } }
if (var_name == "current_X") {for (int entry = 0; entry < var_length;
++entry) { var_value[entry] = current_X[entry] ; } }
if (var_name == "current_cons") {for (int entry = 0; entry < var_length;
++entry) { var_value[entry] = current_cons[entry] ; } }
if (var_name == "current_ll_grad") {for (int entry = 0; entry <
var_length; ++entry) { var_value[entry] = current_ll_grad[entry] ; } }
if (var_name == "current_cons_jac") {for (int entry = 0; entry <
var_length; ++entry) { var_value[entry] = current_cons_jac[entry] ; } }
if (var_name == "current_ll_hess") {for (int entry = 0; entry <
var_length; ++entry) { var_value[entry] = current_ll_hess[entry] ; } }
if (var_name == "current_cons_hess") {for (int entry = 0; entry <
var_length; ++entry) { var_value[entry] = current_cons_hess[entry] ; } }
if (var_name == "current_lambda") {for (int entry = 0; entry <
var_length; ++entry) { var_value[entry] = current_lambda[entry] ; } }
if (var_name == "current_lagrangian") {for (int entry = 0; entry <
var_length; ++entry) { var_value[entry] = current_lagrangian[entry] ; } }
}
bool update_var_py(std::string var_name_py, numeric::array const &const
&var_value_py, int var_length_py, std::string var_type_py)
{
if (var_type_py == "int") {
if (var_length_py == 1) {
int var_value = int(var_value_py[0]);
update_var(var_name_py, var_value);
} else {
int *var_value = new int [var_length_py];
for (int entry = 0; entry < var_length_py; ++entry) {
var_value[entry] = int(var_value_py[entry]);
}
update_vec_var(var_name_py, var_length_py, var_value);
}
} else {
if (var_length_py == 1) {
double var_value = double(var_value_py[0]);
update_var(var_name_py, var_value);
} else {
double *var_value = new double [var_length_py];
for (int entry = 0; entry < var_length_py; ++entry) {
var_value[entry] = double(var_value_py[entry]);
}
update_vec_var(var_name_py, var_length_py, var_value);
}
}
return true;
}
numeric::array const &return_var_py(std::string var_name_py, int
var_length_py, std::string var_type_py)
{
list var_value_list_py = list();
if (var_type_py == "int") {
if (var_length_py == 1) {
int var_value;
return_var(var_name_py, var_value);
var_value_py.append(int(var_value));
} else {
int *var_value = new int [var_length_py];
return_vec_var(var_name_py, var_length_py, var_value);
for (int entry = 0; entry < var_length_py; ++entry) {
var_value_py.append(int(var_value[entry]));
}
}
} else {
if (var_length_py == 1) {
double var_value;
return_var(var_name_py, var_value);
var_value_py.append(double(var_value));
} else {
double *var_value = new double [var_length_py];
return_vec_var(var_name_py, var_length_py, var_value);
for (int entry = 0; entry < var_length_py; ++entry) {
var_value_py.append(double(var_value[entry]));
}
}
}
numeric::array const &var_value_array_py =
numeric::array(var_value_list_py);
return var_value_array_py;
}
private:
int gene_count;
boost::shared_ptr<int> nstate;
boost::shared_ptr<int> nsamp;
boost::shared_ptr<int> nexon;
boost::shared_ptr<int> nseg;
boost::shared_ptr<int> nwin;
boost::shared_ptr<int> nvar;
boost::shared_ptr<int> ncon;
boost::shared_ptr<int> nsingle;
boost::shared_ptr<int> hsingle;
boost::shared_ptr<int> jac_entries;
boost::shared_ptr<int> hess_entries;
boost::shared_ptr<double> current_ll;
boost::shared_ptr<double> current_obj_factor;
boost::shared_array<int> statecols;
boost::shared_array<int> sampcols;
boost::shared_array<int> exonrows;
boost::shared_array<int> segrows;
boost::shared_array<int> sampsperstate;
boost::shared_array<int> segsperexon;
boost::shared_array<int> single_hess_rows;
boost::shared_array<int> single_hess_cols;
boost::shared_array<int> jac_rows;
boost::shared_array<int> jac_cols;
boost::shared_array<int> full_hess_rows;
boost::shared_array<int> full_hess_cols;
boost::shared_array<int> eval_counts;
boost::shared_array<double> reads;
boost::shared_array<double> readmaxes;
boost::shared_array<double> start_X;
boost::shared_array<double> x_L;
boost::shared_array<double> x_U;
boost::shared_array<double> g_L;
boost::shared_array<double> g_U;
boost::shared_array<double> current_X;
boost::shared_array<double> current_cons;
boost::shared_array<double> current_ll_grad;
boost::shared_array<double> current_cons_jac;
boost::shared_array<double> current_ll_hess;
boost::shared_array<double> current_cons_hess;
boost::shared_array<double> current_lambda;
boost::shared_array<double> current_lagrangian;
};
}
}
_______________________________________________
Cplusplus-sig mailing list
Cplusplus-sig@python.org
http://mail.python.org/mailman/listinfo/cplusplus-sig