This is exactly what BIRN has been working on through the Smart Atlas
project and now MBAT. The inverse query is also relevant: What genes are
expressed here? As Bill indicated, there are several spatially normalized
atlas projects (ABA, GenePaint) that can do that. We've been working on
spatial normalization of some of the GENSAT images, although we haven't
gotten very far. More importantly, BIRN has been working on exchange of
coordinate systems so that different atlases can talk to each other.
I think that's why Bill has been trying to get everyone together on this.
I've added Ilya Zaslavsky, our GIS expert, to this list.
Maryann Martone, Ph.D.
Professor-in-Residence
Dept of Neuroscience
University of California, San Diego
San Diego CA 92093-0446
858 822 0745 (T)
858 822 0828 (F)
On Sat, 3 Mar 2007, William Bug wrote:
Hi Kei,
You are right on target re: use of a coordinate-based, spatial query system
to resolve the relatively simple query: "In which brain regions is GENE X
expressed?"
This is the whole goal of several major neuroinformatics projects currently
underway which are designed to use either 2D or 3D digital brain atlases to
make such a query possible. Several of those efforts are associated with the
BIRN project. In fact several such projects working on inbred mouse strain
atlases have been striving to function synergistically within a single system
(the Mouse BIRN Atlasing Tool or MBAT) specifically to support such a query.
ABA is not currently available to query within MBAT, because it's not
registered to the primary atlas being used in MBAT right now. This work may
eventually get done, but it won't be ready for the demo.
The absolute pre-requisites for resolving such a query are:
1) you must have a set of canonical brain images (2D) or a true voxel
based canonical brain (3D) - "ATLASES" - that include expert-assisted brain
region segmentation.
2) these canonical pixel-based brain images (2D) or voxel based
images (3D) must be situated within a defined coordinate space.
3) the segmented brain regions must be deterministically placed
within the same coordinate space.
4) the images containing the gene expression patterns must be
segmented (manually, semi-automatically, or automatically) to provide defined
geometries for the expression patterns.
5) the images containing the gene expression patterns must be
registered to the canonical atlas data and coordinate space (whether 2D or
3D).
With these conditions met, you could then present a user with a nice 3D
visualization of the atlas (or even just the list of brain region IDs or
preferred labels) and/or a list of gene names/IDs and let them ask both of
the following questions:
a) In which brain regions is GENE X expressed?
b) Which genes does BRAIN REGION X contain defined expression values
beyond some baseline?
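A minimal sketch of how queries (a) and (b) fall out once the five prerequisites hold. Everything in it is a hypothetical illustration: the toy 2D label grid, the region and gene names, and the function names are assumptions, not part of MBAT or any atlas API.

```python
from collections import defaultdict

# Hypothetical toy atlas: every pixel in a shared coordinate space carries a
# region label (prerequisites 1-3). A real atlas would be a set of 2D plates
# or a 3D voxel volume.
ATLAS = [
    ["CORTEX", "CORTEX", "CORTEX", "CORTEX"],
    ["CORTEX", "STRIATUM", "STRIATUM", "CORTEX"],
    ["THALAMUS", "STRIATUM", "STRIATUM", "CORTEX"],
]

# Hypothetical segmented expression patterns, already registered to the atlas
# space (prerequisites 4-5), stored as sets of (row, col) pixels.
EXPRESSION = {
    "GENE_A": {(1, 1), (1, 2), (2, 1)},
    "GENE_B": {(0, 0), (2, 0)},
}


def regions_expressing(gene, min_pixels=1):
    """Query (a): regions where `gene` covers at least `min_pixels` pixels."""
    counts = defaultdict(int)
    for row, col in EXPRESSION[gene]:
        counts[ATLAS[row][col]] += 1
    return {region for region, n in counts.items() if n >= min_pixels}


def genes_in_region(region, min_pixels=1):
    """Query (b): genes whose expression in `region` reaches the baseline."""
    return {g for g in EXPRESSION if region in regions_expressing(g, min_pixels)}
```

Here `min_pixels` stands in for the "beyond some baseline" threshold; a real system would more likely threshold on expression intensity or on the fraction of the region covered.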
Right now, GENSAT is not registered to an atlas, so there is no coordinate
frame to support resolving such a query. They have manually curated many of
the gene-specific images with both brain regions and cell types, so you can
pose that query and get an answer based on the curation they have had the
resources to do so far, but there is no way to place it in a GIS context (2D
or 3D), since none of their info is YET linked to a canonical coordinate
space (several projects are working on this very issue).
ABA has aligned to a 2D mouse brain atlas (F&P C57Bl/6 adult brain atlas).
In doing so, the 2D brain region segmentations on each of the images in the
F&P mouse atlas can be super-imposed on the registered images from any of the
20,000+ brains. The problem is the current registration has a moderate error
associated with it, so that answering that query programmatically is
problematic and often not very informative. The following can be done:
- along the coronal sectioning axis, give me the plate numbers for
all the images in the atlas that contain a slice through the STRIATUM
- for ABA brain stained for GENE X, give me all the sections that
have been roughly aligned to that set of F&P atlas images.
From there, because the alignment is so coarse at this point, you could only use the
atlas plates and location of the STRIATUM to help guide a qualitative
assessment of whether there appears to be any staining in the STRIATUM.
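The two-step lookup described above (plates containing a region, then sections roughly aligned to those plates) can be sketched as a simple join. The plate numbers, section indices, and table layout below are made-up assumptions for illustration; they do not reflect the actual F&P plate numbering or how ABA stores its alignment.

```python
# Hypothetical table: atlas plate number -> regions segmented on that plate.
PLATE_REGIONS = {
    20: {"CORTEX"},
    21: {"CORTEX", "STRIATUM"},
    22: {"CORTEX", "STRIATUM"},
    23: {"CORTEX", "THALAMUS"},
}

# Hypothetical rough alignment: (gene, section index) -> nearest atlas plate.
SECTION_TO_PLATE = {
    ("GENE_X", 1): 20,
    ("GENE_X", 2): 21,
    ("GENE_X", 3): 22,
    ("GENE_X", 4): 23,
}


def plates_containing(region):
    """Step 1: plate numbers whose segmentation includes `region`."""
    return sorted(p for p, regions in PLATE_REGIONS.items() if region in regions)


def sections_for(gene, region):
    """Step 2: sections of the `gene` brain aligned to any of those plates."""
    plates = set(plates_containing(region))
    return sorted(
        section
        for (g, section), plate in SECTION_TO_PLATE.items()
        if g == gene and plate in plates
    )
```

The result only narrows down which sections to look at; with alignment this coarse, judging whether there is staining in the region still has to be done by eye.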
In fact, via this route, many contributors to GeneNetwork.org have actually
linked the probe sets in their microarray QTL database to staining patterns
in ABA. In other words, if through their system, you uncover via QTL a locus
or collection of SNPs associated with altered expression of a given gene -
say Dopamine Receptor, type D2 (DRD2) - you might find someone has added an
ABA or GENSAT annotation for DRD2 using the GeneNetwork.org GeneWiki.
1) Go to www.genenetwork.org
http://www.genenetwork.org/search3.html
2) Enter 'DRD2' in the 'ANY' box searching against the default
settings for other fields - & hit 'Search'
3) Click on the single result entry
4) In the record for DRD2, click on the GeneWiki button near the top
of the page
5) This will bring up a listing of all the annotations in GeneNetwork
for DRD2 including qualitative annotations that someone did for the ABA DRD2
brain.
If you want to see ALL of the genes for which ABA or GENSAT GeneWiki entries
exist, just go back to step '1', enter wiki=ABA or wiki=GENSAT respectively
in one of the 'ANY' boxes, and hit 'Search'. Then pick up at step '3' above.
Were we able to SCRAPE this, then you would have annotation for ABA that is
roughly equivalent to that which exists for GENSAT - ONLY - it probably
doesn't cover the ABA very thoroughly (using the generic 'wiki=aba' brings up
948 probe sets - or ~5% of ABA - pretty remarkable, actually, given it's a
manual effort), and these GeneWiki annotations are mostly in free-text right
now and are not done to a controlled vocabulary or classification scheme.
:-(
When the registration to the atlas improves to say the 50 - 100 micron range,
then the flood-gates will open, and all 20,000 brains in ABA each staining
for a particular gene will be able to automatically provide relatively solid
answers to these straight-forward questions related to where in the brain is
Gene X expressed - and which genes does Brain Region Y show marked expression
of. Even here, however, there will be continued room for nuance in defining
the ABA staining patterns - AND - there will eventually be a need to add
the time dimension to all these queries (e.g., "When is Gene X expressed in
Brain Region Y?").
Because the ABA has created multi-resolution versions of their brain images
(both the Nissl stains for cell bodies and the pseudo-colored ISH images for
a given gene), it is possible to use the very nice Google Maps API GUI Alan
created to select any one of the 20,000 ABA brains and simply Zoom & Pan on
the actual pixel image data. However, there is no straight-forward way to
use it to pose and answer SPATIAL queries.
What MIGHT be possible - based on the alignment they have done and the
information provided in that brain region ontology Excel file Alan has - is
to say, for the 'DRD2' brain, filter the sagittal image series to create a
subset including only those images aligned to an F&P atlas image which
contains a section through Brain Region X (say 'STRIATUM'). This way, if
through some SPARQL query you pulled up a relation between DRD2 and STRIATUM,
you'd be able to present a user with a very nice, low-tech interface to
quickly pan&zoom on the median section of that 'STRIATUM'-filtered series to
look at the staining pattern. You could add a navigation control to go
back-n-forth through the series for the DRD2 brain, so they could get a
pretty good sense in 3D where DRD2 expression is in the striatum. You might
also go to BAMS or CoCoMac (BAMS is better in this instance since it's rodent
focused - whereas CoCoMac is primate focused) to automatically determine what
regions connect to (is_afferent_to) and what regions are connected to
(is_efferent_to) the STRIATUM. You could then bring up another HTML frame
that gives you a view of the DRD2 subset series for those brain regions, too.
THAT WOULD ACTUALLY BE A VERY NICE INTERFACE - and is probably quite
tractable for the demo - if this sounds like a useful feature to provide.
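A rough sketch of the pieces such an interface would need: picking the median section of a region-filtered series, stepping back and forth through it, and looking up connected regions. The projection table is a hypothetical stand-in for whatever BAMS would return; none of these names come from a real BAMS, CoCoMac, or ABA API.

```python
# Hypothetical connectivity pairs (source, target): source projects to target.
PROJECTIONS = [
    ("CORTEX", "STRIATUM"),
    ("THALAMUS", "STRIATUM"),
    ("STRIATUM", "PALLIDUM"),
]


def median_section(sections):
    """Section to show first: the middle of the region-filtered series."""
    ordered = sorted(sections)
    if not ordered:
        raise ValueError("no sections aligned to this region")
    return ordered[len(ordered) // 2]  # upper median when the count is even


def step(sections, current, delta):
    """Navigation control: move `delta` sections, clamped to the series ends."""
    ordered = sorted(sections)
    index = ordered.index(current) + delta
    return ordered[max(0, min(index, len(ordered) - 1))]


def regions_projecting_to(region):
    """Regions with a projection into `region`."""
    return {src for src, dst in PROJECTIONS if dst == region}


def regions_targeted_by(region):
    """Regions that `region` projects to."""
    return {dst for src, dst in PROJECTIONS if src == region}
```

Each connected region returned by the last two functions could then be fed back through the same series filter to populate the second frame.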
Running atlas-based SPATIAL queries against GENSAT and ABA is very much a
sought after goal both for the curators of those repositories and for the
neuroscience community at large, but we are not there yet.
I'm not certain I understand what you are asking re: highly expressed genes
that correlate with high levels of ADDL or Abeta. I could see how you might
be able to use GENSAT (which has a 'staining intensity' annotation field) to
ask whether genes associated with high levels of specific ADDL species or
with plaque deposition are expressed at high levels in the GENSAT data set -
and if so - where are they expressed in the brain - and at what developmental
time. Given the sparse nature of the GENSAT data set, this would not be a
comprehensive answer to the question, but it could prove very interesting.
I'm certain June, Gwen, or Elisabeth could help us identify genes whose
expression correlates with high levels of ADDL species (most interesting
question given current AD research) or with other APP related macromolecules
or plaques. I'm not certain how you'd ask the same question of ABA, given
there are not systematic annotations on staining intensity or pattern -
though some of this has been done (see below).
Cheers,
Bill
On Mar 3, 2007, at 8:01 PM, kc28 wrote:
Alan et al.,
In addition to mapping to brain regions, what seems to be also missing is
some kind of brain coordinates. I thought one major advantage of using
Google Maps is the ability to issue GIS-like queries. With this type of
query, one could potentially find the expressed genes for a given brain
region and its neighbouring/adjacent regions.
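As a minimal sketch of what such a GIS-like neighbourhood query could look like, assuming a segmented image in which every pixel carries a region label: the grid, the region names, and the region-to-gene table below are all invented for illustration.

```python
# Hypothetical labelled section: one region label per pixel.
GRID = [
    ["CORTEX", "CORTEX", "CORTEX"],
    ["STRIATUM", "STRIATUM", "CORTEX"],
    ["THALAMUS", "STRIATUM", "PALLIDUM"],
    ["THALAMUS", "HYPOTHALAMUS", "PALLIDUM"],
]

# Hypothetical annotation: region -> genes reported as expressed there.
GENES_BY_REGION = {
    "CORTEX": {"GENE_A"},
    "STRIATUM": {"GENE_B"},
    "PALLIDUM": {"GENE_C"},
}


def adjacent_regions(grid, region):
    """Regions sharing at least one pixel edge (4-connectivity) with `region`."""
    rows, cols = len(grid), len(grid[0])
    neighbours = set()
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != region:
                continue
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] != region:
                    neighbours.add(grid[nr][nc])
    return neighbours


def genes_in_neighbourhood(grid, region):
    """Genes annotated for `region` or any region adjacent to it."""
    genes = set()
    for name in {region} | adjacent_regions(grid, region):
        genes |= GENES_BY_REGION.get(name, set())
    return genes
```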
While we are talking about gene expression, what seems to be also logical
to consider is whether some highly expressed genes correlate with high
abundance of pathological proteins (e.g., amyloid beta). Any take from
neuroscientists?
-Kei
Alan Ruttenberg wrote:
On Mar 2, 2007, at 1:56 PM, Kei Cheung wrote:
By reading the AD/PD use case, one of the questions has to do with what
genes are expressed in what regions of the brain (if such gene
expressions are localized to certain brain regions). I wonder whether what
Alan's currently working on can help address this type of question (even
though the kind of gene expression data is for the mouse -- perhaps we
can find homologous genes for human). Also, I'd encourage people to take
a look at Bill Bug's Wiki page:
What I can do is add an orthology mapping. Probably from orthogene.
I can also scrape the Allen site for the following query they provide:
Brain Region (see list below), Expression-level (low/high),
Expression-density (low/high), Expression-pattern (clustered/not clustered)
=> gene set
So this would be 16x2x2x2 = 128 different gene sets.
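For what it's worth, the 128 combinations can be enumerated up front before any scraping. This is only a sketch: the region names are placeholders for the 16 entries in the region list, and nothing here touches the Allen site.

```python
from itertools import product

# Placeholders for the 16 brain regions in the list; real names go here.
REGIONS = [f"region_{i:02d}" for i in range(1, 17)]
LEVELS = ["low", "high"]
DENSITIES = ["low", "high"]
PATTERNS = ["clustered", "not clustered"]


def query_combinations():
    """Every (region, level, density, pattern) tuple, one per gene set."""
    return list(product(REGIONS, LEVELS, DENSITIES, PATTERNS))
```

Calling `len(query_combinations())` gives 16 x 2 x 2 x 2 = 128, matching the count above.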
There is also their "Fine structure search" :
Fine structure annotation lists are genes that have high specificity
expression in particular brain regions or nuclei.
They provide these gene lists for a set of structures listed below (fine
structures).
This can lead us to a particular image, though I don't have a way yet to
identify which portion of the image corresponds to a particular region or
structure.
Bill Bug
Senior Research Analyst/Ontological Engineer
Laboratory for Bioimaging & Anatomical Informatics
www.neuroterrain.org
Department of Neurobiology & Anatomy
Drexel University College of Medicine
2900 Queen Lane
Philadelphia, PA 19129
215 991 8430 (ph)
610 457 0443 (mobile)
215 843 9367 (fax)
Please Note: I now have a new email - [EMAIL PROTECTED]