Re: Distributed ontology development (was Ontology entity IDs)

Vinay Chaudhri Wed, 26 Jul 2006 09:07:26 -0700


Hello All:

I did some work with Peter Karp on collaborative KB development, which works
quite well in practice.  This work is described at:

http://www.ai.sri.com/pub_list/390

I will be happy to discuss it further as needed.

Keep Smiling!
Vinay.

Mark Musen wrote:

On Jul 16, 2006, at 9:36 PM, William Bug wrote:
Are you referring to the JDBC Protégé (http://protege.cim3.net/cgi-bin/wiki.pl?JdbcDatabaseBackend), or are there other ways of connecting Protégé to an RDBMS backend?
That's the JDBC backend, Bill.
It certainly is a herculean task to work out the O-R mapping in a way that is flexible enough to accommodate all the graphs someone might construct either in Protege-Frames or Protege-OWL, so if this is already implemented and working, it behooves all of us who need to support this sort of community ontology curation to re-use what's being constructed by SMI and/or NCI.
Yes, see http://protege.cim3.net/cgi-bin/wiki.pl?MultiUserTutorial
The only problem is creating an efficient means to support this sort of community curation, and sharing of ontologies from other sources: a direct JDBC connection isn't going to work well. There'll be firewall issues, which I believe will add way too much to each individual's overhead of bringing this capability online. When the group is supported by a single IT staff and working within the same LAN environment (including those who'd connect via VPN), this can be a viable approach, but outside of that, it will probably be too much trouble for all the folks who need access.
We've been putting an enormous amount of energy into enhancing the performance of the thick Protégé client over the past few months for precisely this reason. In supporting NCI, we indeed have to deal with the significant latencies imposed by firewalls and VPNs. The enhancements we have made have led to remarkably improved database performance and transaction processing. For example, some transaction times have been improved by two orders of magnitude or more. We will be migrating all these changes back into the main Protégé release over the next several months.
This is why we've been talking with Daniel about expanding the web version of Protégé developed in your group so as to "open" it and release it from the JDBC port requirements using a combination of a service-oriented architecture (web services) and the Java Portlet framework. In our lab, we've implemented very simple WSDL web service response/request pairs to implement a generic SQL interface via web services to meet this need. It works extremely well, even for fairly complicated queries, and can even be used to return binary objects (in our case histological images) via SOAP + attachments. This is all running over relatively firewall-friendly ports such as are used by HTTP and the Tomcat Java Servlet framework.
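[Editorial illustration] The idea of a generic SQL interface exposed over HTTP instead of a raw JDBC port can be sketched roughly as follows. This is only a minimal sketch under stated assumptions, not the implementation described above: it swaps the WSDL/SOAP request/response pairs for plain JSON over a WSGI endpoint, uses an in-memory SQLite database in place of the real RDBMS, and the names `make_app`, `call`, `demo`, and the `term` table are hypothetical.

```python
import json
import sqlite3
from io import BytesIO


def make_app(conn):
    """Return a WSGI app that runs SELECT statements sent in a JSON POST body."""
    def app(environ, start_response):
        length = int(environ.get("CONTENT_LENGTH") or 0)
        request = json.loads(environ["wsgi.input"].read(length) or b"{}")
        sql = request.get("sql", "")
        params = request.get("params", [])
        # Assumption for this sketch: only read-only SELECT queries are accepted.
        if not sql.lstrip().lower().startswith("select"):
            status = "400 Bad Request"
            payload = {"error": "only SELECT is allowed"}
        else:
            cur = conn.execute(sql, params)
            columns = [d[0] for d in cur.description]
            payload = {"columns": columns, "rows": cur.fetchall()}
            status = "200 OK"
        body = json.dumps(payload).encode("utf-8")
        start_response(status, [("Content-Type", "application/json"),
                                ("Content-Length", str(len(body)))])
        return [body]
    return app


def call(app, sql, params=()):
    """Invoke the WSGI app in-process, standing in for an HTTP client request."""
    raw = json.dumps({"sql": sql, "params": list(params)}).encode("utf-8")
    environ = {"REQUEST_METHOD": "POST",
               "CONTENT_LENGTH": str(len(raw)),
               "wsgi.input": BytesIO(raw)}
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(app(environ, start_response))
    return captured["status"], json.loads(body)


def demo():
    """Build a tiny hypothetical table and query it through the service."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE term (id TEXT, label TEXT)")
    conn.execute("INSERT INTO term VALUES ('T1', 'neuron')")
    return call(make_app(conn), "SELECT id, label FROM term WHERE id = ?", ["T1"])
```

In a deployment, the same `app` callable could be served by any WSGI server on an ordinary HTTP port, which is the point of the approach: clients would need only outbound HTTP access rather than a direct database connection through the firewall.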
We'd certainly like to know more about what you are doing within BIRN. The "Web version" of Protégé, alas, was a student project that I think will need a bit more work to be stable. We do intend to put additional effort into enhancing "Web Protégé" in the next few months.
The community should note that Stanford recently submitted a proposal to the National Library of Medicine for ongoing support of the Protégé resource. One of our key objectives in the new phase of our work is to engineer a true thin-client version of Protégé that adopts a services-oriented architecture. We would welcome input from the entire community as we move forward with these plans, assuming that we get funded to work on them.
I assume when you mention the NCIT community curation this is a project being developed/hosted/supported by the NCI Bioinformatics group as a part of the caBIG project?
Although the NCI Thesaurus is an important resource for caBIG, our work with NCI predates caBIG and comes directly from the NCI Center for Bioinformatics. We work directly with the folks at NCI developing the caCORE resources.
By any chance is the work they are doing with the Protégé-RDBMS shared ontology environment (CODS - Collaborative Ontology Development Server (or Collaborative Ontology Development Service project)) taking this approach to make the system less reliant on running JDBC over the net and through firewalls? I saw on one of the Protégé CODS server configuration pages that ports 4020 - 4039 were used, which again, given these do have public assignments for proprietary applications (http://www.iana.org/assignments/port-numbers), can be difficult to use, unless all contributors are being hosted by the same IT staff and/or are on the same LAN (even if it's a VLAN).
Are there pages on the Protégé Wiki where more complete documentation discusses some of these details for the NCI CODS project?
The CODS project is not supported by NCI, but rather by CIM3 Engineering. The goal of CODS is to make the multi-user version of Protégé publicly available so that users can experiment with creating and maintaining a shared ontology library (see http://protege.cim3.net/cgi-bin/wiki.pl?CODS).
Many thanks again for the info, Mark.
My pleasure!
Cheers,
Bill

On Jul 16, 2006, at 12:53 AM, Mark Musen wrote:
On Jul 10, 2006, at 11:40 PM, William Bug wrote:
However, there doesn't appear to be a means within the OBO/NCBO community for doing this sort of distributed ontology design right now. Two of the tools in widespread use, Protégé and OBO-Edit, are really not designed to support distributed and shared development, such as you'd find in a typical distributed architecture, whether it be a standard client-server RDBMS-based approach, one using some "active pages" technology such as PHP, Zope, Ruby on Rails, Java Servlet/Portlet frameworks, etc., or a more asynchronous approach using messaging and/or web services to assemble the required components from the various authoritative sources.
Bill,
I hate to sound like a salesperson, but Protégé in its multi-user mode (using the relational database backend) would seem to be just what you are looking for. Protégé (both the frames and the OWL facility) allows distributed users to work simultaneously on an ontology stored on a remote server. As the ontology is updated, all the Protégé clients refresh automatically to display the changes.
NCI currently is experimenting with this architecture for the development of the NCI Thesaurus in OWL, and they have developers stationed all across the country. I'm told that Perot Systems, using the frame-based representation, has nearly 100 Protégé users working on the same ontology simultaneously.
Mark
P.S. While I'm plugging Protégé, don't forget that the Ninth Annual Protégé Conference takes place at Stanford next week (see http://protege.stanford.edu/conference/2006/).
Bill Bug
Senior Analyst/Ontological Engineer

Laboratory for Bioimaging & Anatomical Informatics
www.neuroterrain.org
Department of Neurobiology & Anatomy
Drexel University College of Medicine
2900 Queen Lane
Philadelphia, PA    19129
215 991 8430 (ph)
610 457 0443 (mobile)
215 843 9367 (fax)
Please Note: I now have a new email - [EMAIL PROTECTED]
This email and any accompanying attachments are confidential. This information is intended solely for the use of the individual to whom it is addressed. Any review, disclosure, copying, distribution, or use of this email communication by others is strictly prohibited. If you are not the intended recipient please notify us immediately by returning this message to the sender and delete all copies. Thank you for your cooperation.
