Hello All:
I have done some work with Peter Karp on collaborative KB development
that works quite well in practice. This work is described at:
http://www.ai.sri.com/pub_list/390
I will be happy to discuss it further as needed.
Keep Smiling!
Vinay.
Mark Musen wrote:
On Jul 16, 2006, at 9:36 PM, William Bug wrote:
Are you referring to the JDBC Protégé
(http://protege.cim3.net/cgi-bin/wiki.pl?JdbcDatabaseBackend), or are
there other ways of connecting Protégé to an RDBMS backend?
That's the JDBC backend, Bill.
It certainly is a herculean task to work out the O-R mapping in a way
that is flexible enough to accommodate all the graphs someone might
construct either in Protege-Frames or Protege-OWL, so if this is
already implemented and working, it behooves all of us who need to
support this sort of community ontology curation to re-use what's being
constructed by SMI and/or NCI.
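To make the O-R mapping concern concrete, here is a minimal sketch of one common way to store arbitrary frame graphs in an RDBMS: a single generic table of (frame, slot, value) rows rather than one table per class. This is an illustration only, not the schema used by the Protégé JDBC backend; the table name `assertion`, the slot name `subclass_of`, and all function names are assumptions made for the example, and SQLite stands in for a real database server.

```python
import sqlite3

def open_store():
    """Create an in-memory store with one generic table for every frame."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: each row asserts one value of one slot on one frame.
    conn.execute(
        "CREATE TABLE assertion ("
        "frame TEXT NOT NULL, slot TEXT NOT NULL, value TEXT NOT NULL)"
    )
    conn.execute("CREATE INDEX idx_frame_slot ON assertion (frame, slot)")
    return conn

def add_value(conn, frame, slot, value):
    """Record that `frame` has `value` for `slot`."""
    conn.execute(
        "INSERT INTO assertion (frame, slot, value) VALUES (?, ?, ?)",
        (frame, slot, value),
    )

def get_values(conn, frame, slot):
    """Return all values of `slot` on `frame`, in insertion order."""
    rows = conn.execute(
        "SELECT value FROM assertion WHERE frame = ? AND slot = ? ORDER BY rowid",
        (frame, slot),
    )
    return [row[0] for row in rows]

def subclasses(conn, cls):
    """Return the frames whose 'subclass_of' slot points directly at `cls`."""
    rows = conn.execute(
        "SELECT frame FROM assertion "
        "WHERE slot = 'subclass_of' AND value = ? ORDER BY frame",
        (cls,),
    )
    return [row[0] for row in rows]
```

Because new classes and slots become new rows rather than new tables or columns, the schema never changes as the ontology grows. The cost is that every graph traversal turns into repeated queries or self-joins against the one table.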
Yes, see http://protege.cim3.net/cgi-bin/wiki.pl?MultiUserTutorial
The only problem is that, as an efficient means to support this sort
of community curation - and sharing of ontologies from other sources
- a direct JDBC connection isn't going to work well. There'll be
firewall issues which I believe will add way too much to each
individual's overhead of bringing this capability online. When the
group is supported by a single IT staff and working within the same
LAN environment (including those who'd connect via VPN), this can be
a viable approach, but outside of that, it will probably be too much
trouble for all the folks who need access.
We've been putting an enormous amount of energy into enhancing the
performance of the thick Protégé client the past few months for
precisely this reason. In supporting NCI, we indeed have to deal with
the significant latencies imposed by firewalls and VPNs. The
enhancements we have made have led to remarkably improved database
performance and transaction processing. For example, some transaction
times have been improved by two orders of magnitude or more. We will
be migrating all these changes back into the main Protégé release over
the next several months.
This is why we've been talking with Daniel about expanding the web
version of Protégé developed in your group so as to "open" it and
release it from the JDBC port requirements using a combination of a
service-oriented architecture (web services) and the Java Portlet
framework. In our lab, we've implemented very simple WSDL web
service response/request pairs to implement a generic SQL interface
via web services to meet this need. It works extremely well, even
for fairly complicated queries and can even be used to return binary
objects (in our case histological images) via SOAP + attachments.
This all runs over relatively firewall-friendly ports, such as those
used by HTTP and the Tomcat Java Servlet framework.
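As a rough illustration of the idea of a generic SQL interface exposed as a web service, here is a minimal sketch. It is not the WSDL/SOAP implementation described above: it swaps in plain JSON over HTTP POST, uses SQLite in place of a real RDBMS, and does not cover binary attachments. The request shape (`sql`, `params`), the reply shape (`columns`, `rows`), and all function names are assumptions made for the example.

```python
import json
import sqlite3
from http.server import BaseHTTPRequestHandler, HTTPServer

def handle_query(conn, body):
    """Run one read-only query described by a JSON request body.

    Expects {"sql": "...", "params": [...]} and returns the JSON reply as bytes.
    """
    request = json.loads(body)
    sql = request["sql"]
    params = request.get("params", [])
    # Crude safeguard for the sketch: reject anything that is not a SELECT.
    if not sql.lstrip().lower().startswith("select"):
        reply = {"error": "only SELECT statements are accepted"}
        return json.dumps(reply).encode("utf-8")
    cursor = conn.execute(sql, params)
    columns = [description[0] for description in cursor.description]
    rows = [list(row) for row in cursor.fetchall()]
    return json.dumps({"columns": columns, "rows": rows}).encode("utf-8")

def make_handler(conn):
    """Build a request handler class bound to one database connection."""
    class QueryHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            length = int(self.headers.get("Content-Length", 0))
            payload = handle_query(conn, self.rfile.read(length))
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(payload)))
            self.end_headers()
            self.wfile.write(payload)
    return QueryHandler

def serve(conn, host="127.0.0.1", port=8080):
    """Serve queries over HTTP until interrupted (single-threaded)."""
    HTTPServer((host, port), make_handler(conn)).serve_forever()
```

Calling `serve(conn)` would expose the database on a single HTTP port, so clients need only outbound HTTP access rather than a direct database connection. A real deployment would also need authentication, transport encryption, and a much stricter query policy than the prefix check used here.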
We'd certainly like to know more about what you are doing within
BIRN. The "Web version" of Protégé, alas, was a student project that
I think will need a bit more work to be stable. We do intend to put
additional effort into enhancing "Web Protégé" in the next few months.
The community should note that Stanford recently submitted a proposal
to the National Library of Medicine for ongoing support of the Protégé
resource. One of our key objectives in the new phase of our work is
to engineer a true thin-client version of Protégé that adopts a
services-oriented architecture. We would welcome input from the entire
community as we move forward with these plans, assuming that we get
funded to work on them.
I assume the NCIT community curation you mention is a project being
developed/hosted/supported by the NCI Bioinformatics group as part of
the caBIG project?
Although the NCI Thesaurus is an important resource for caBIG, our
work with NCI predates caBIG and comes directly from the NCI Center
for Bioinformatics. We work directly with the folks at NCI developing
the caCORE resources.
By any chance is the work they are doing with the Protégé-RDBMS
shared ontology environment (CODS - Collaborative
Ontology Development Server (or Collaborative Ontology Development
Service project)) taking this approach to make the system less
reliant on running JDBC over the net and through firewalls? I saw on
one of the Protégé CODS server configuration pages ports 4020 - 4039
were used, which again, given these do have public assignments for
proprietary applications
(http://www.iana.org/assignments/port-numbers) can be difficult to
use, unless all contributors are being hosted by the same IT staff
and/or are on the same LAN (even if it's a VLAN).
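One way a contributor could find out whether their site's firewall blocks a port range like this is a simple TCP reachability probe. The sketch below is a generic check, not part of Protégé or CODS; the function names and the two-second timeout are assumptions made for the example.

```python
import socket

def port_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False

def blocked_ports(host, ports, timeout=2.0):
    """Return the subset of `ports` that could not be reached on `host`."""
    return [port for port in ports if not port_reachable(host, port, timeout)]
```

For the range mentioned above, a call such as `blocked_ports("cods.example.org", range(4020, 4040))` (the hostname is a placeholder) would list the ports that are unreachable from the caller's network. A failure does not say which firewall is responsible, or whether the server is simply not listening.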
Are there pages on the Protégé Wiki where more complete documentation
discusses some of these details for the NCI CODS project?
The CODS project is not supported by NCI, but rather by CIM3
Engineering. The goal of CODS is to make the multi-user version of
Protégé publicly available so that users can experiment with creating
and maintaining a shared ontology library
(see http://protege.cim3.net/cgi-bin/wiki.pl?CODS).
Many thanks again for the info, Mark.
My pleasure!
Cheers,
Bill
On Jul 16, 2006, at 12:53 AM, Mark Musen wrote:
On Jul 10, 2006, at 11:40 PM, William Bug wrote:
However, there doesn't appear to be a means within the OBO/NCBO
community for doing this sort of distributed ontology design right
now. Two of the tools in widespread use - Protégé and OBO-Edit -
are really not designed to support distributed and shared
development, such as you'd find in a typical distributed
architecture - whether it be a standard client-server RDBMS-based
approach, one using some "active pages" technology such as php,
Zope, Ruby on Rails, Java Servlet/Portlet frameworks, etc. - or a
more asynchronous approach using messaging and/or web services to
assemble the required components from the various authoritative
sources.
Bill,
I hate to sound like a salesperson, but Protégé in its multi-user
mode (using the relational database backend) would seem to be just
what you are looking for. Protégé (both the frames and the OWL
facility) allows distributed users to work simultaneously on an
ontology stored on a remote server. As the ontology is updated, all
the Protégé clients refresh automatically to display the changes.
NCI currently is experimenting with this architecture for the
development of the NCI Thesaurus in OWL, and they have developers
stationed all across the country. I'm told that Perot Systems,
using the frame-based representation, has nearly 100 Protégé users
working on the same ontology simultaneously.
Mark
P.S. While I'm plugging Protégé, don't forget that the Ninth Annual
Protégé Conference takes place at Stanford next week (see
http://protege.stanford.edu/conference/2006/).
Bill Bug
Senior Analyst/Ontological Engineer
Laboratory for Bioimaging & Anatomical Informatics
www.neuroterrain.org
Department of Neurobiology & Anatomy
Drexel University College of Medicine
2900 Queen Lane
Philadelphia, PA 19129
215 991 8430 (ph)
610 457 0443 (mobile)
215 843 9367 (fax)
Please Note: I now have a new email - [EMAIL PROTECTED]