Here are my notes from today's teleconf. I changed the syntax for action items to indicate the date on which they originated. Should help prevent excessive slippage.
Steve

======================================
Minutes from 13 Nov 2008 DAS teleconference

Teleconference Info:
See http://www.biodas.org/wiki/BioDAS:Community_Portal#Teleconference

Attendees:
  Free agent: Gregg Helt
  Affymetrix: Steve Chervitz
  EBI: Andy Jenkinson
  Sanger: Jonathan Warren
  LBNL (Suzi's Lab): Ed Lee, Leo(?), Nomi Harris

Note taker: Steve Chervitz

Action items are flagged with '[A-YYMMDD]' indicating the date they originated.
New items arising in the discussion are flagged with '[A-new]'.
All pending action items are summarized at the bottom of the minutes.

The teleconference schedule and links to past minutes are available from the
Community Portal section of the biodas.org site:
http://www.biodas.org/wiki/BioDAS:Community_Portal#Teleconference

DISCLAIMER:
The note taker aims for completeness and accuracy, but these goals are not
always achievable. So don't consider these notes as complete minutes from the
meeting, but rather abbreviated, summarized versions of what was discussed.
There may be errors of commission and omission. Participants are welcome to
post comments and/or corrections to these as they see fit on the discussion list.
========================================================

Agenda:
========
* Matters arising
* Review progress on action items from last week (based on minutes)
* Discuss possible modifications to the DAS1+2 sources doc
* Writeback issues

Matters arising
=====================

GBOL Discussion [see: http://gmod.org/wiki/Gbol ]

EL: Gbol is still in dev. Requires a chado backend, flybase db.
GH: UML diags for gbol?
EL: No. The simple obj layer is a direct mimic of the chado data model. Some menu stuff for convenience. For the biological layer haven't created any diags yet. Based on a subset of SO.
GH: For simple obj layer I should refer to the chado object diag?
EL: Each table is an obj, with some convenience things.
EL: A data model, lightweight, versatile. Chado is complex, not very user friendly.
Gbol layer is geared toward biologists. A gene object, not worrying about underlying structure. Plug and play architecture. Set up factory to take care of I/O. Easy to add new data sources. Read from chado and write to GFF3, e.g., or from two diff data sources. One test: Everyone has diff implementation of chado. Gbol will do on the fly translation, based on controlled vocabularies.
GH: Primary use case is Apollo.
EL: Planned for Jbrowse (Ajax-based Gbrowse in Ian's lab). All done on the server (I/O), not client using web services. Java based.
GH: Looking at data model, not hard to do a DAS1/2 translation. What is current das support?
EO: No good chado-specific das servers. Can use Gbrowse as an intermediary. Old gbrowse is being deprecated. Gbol will act as a JSON provider for Gbrowse, but no reason it could not act as a DAS server too.

Discussion of action items from 30 Oct 2008 teleconf:
=====================================================

[A-081030] All: Review Gregg's DAS UML modeling, post any comments to list.

AJ: Looked at it.
GH: Let me know if you see anything problematic. It's a pretty realistic representation.
AJ: Regarding methods: In das/1 should a method be part of a type or an entity in itself?
GH: Das/2 combines method and type into the type. There's an optional method in type. Types use ontological terms (not reps thereof). Das/2 type 'transcript' is not a 1:1 mapping to the SO term (e.g., method=Genscan, type=transcript). So you may have more types than SO terms.
AJ: Can do it another way. DAS/1 id for a type is the ontology ID. Important for translation issues.
GH: Haven't done much. Where do you see it changing?
AJ: An optional element with a required attrib. You might have an ontology to describe method. Might want to say something is result from a type of experiment, type of algorithm, type of sample. May not want to shoehorn them into the type. Das has moved away from complex query capabilities. Servers don't impl types.
People tend to make a separate das source for each data type. Moving away from queryability and towards simplicity.
GH: Complexity is then pushed into understanding different sources. Good to

[A-new] Gregg: Work on translation of method and type in Trellis Ivy proxy.

[A-081030] All: Review Gregg's DAS1->DAS2 proxy work (Trellis/Ivy/Vine), post any comments to list.
[A-081030] AJ: Continue checking out Gregg's DAS1->DAS2 proxy, esp. the XML.

GH: Any feedback?
JW: Had a look. Interested in locations.
GH: Translating das1 feats starts/stops into location. Also translating target starts/stop and group.
AJ: Seemed to work quite well. Problem comes when people abuse the spec a bit.
GH: If no start/stop, = locationless feature. Phase and score are additional complexity. If they are non-numbers it filters them out now. If numbers, uses das1 score element.
AJ: What non-numbers in score?
GH: Dash is allowed = no score available. Sometimes '*' or '.'
AJ: DAS spec sometimes uses '.' or '0', for strand

[A-081030] AJ: Post info about March '09 Hinxton DAS workshop to biodas.org/current_events

JW: Done. Got a lot of registrations already. Aiming for 30 for accommodations, 50 total (including campus folks). Will hit these numbers easily.
GH: Hoping to attend.
AJ/JW: May have trouble accommodating everyone who wants to talk.

[A-new] Gregg: register for '09 Hinxton DAS workshop soon!

[A-081030] GH: Send out action and agenda items well in advance of teleconf.

Done.

[A-081030] GH: Add auth and security on the agenda so interested folks can call in.
[A-081030] GH: Solicit feedback about security/auth from interested parties.

GH: Not added to agenda this week.

[A-081030] GH: Contribute to the DAS changes document re: DAS/2, sources & deprecating DSN.

GH: Still pending. Hopefully next week.

[A-081030] GH: Get new teleconf number from Suzi; post to list with agenda.

GH: We are going to use Suzi's number going forward.
SC: I put this on the biodas.org wiki.
Can also post the date of upcoming teleconfs.

[A-081030] JN: Post preliminary java web start IGB release on bioviz.org

GH: Not on this call today. Next time.

[A-081030] SC: Merge DAS2 subscribers to DAS list. Redirect DAS2 posts to DAS list.
[A-081030] SC: Consider making DAS list auto reject posts from non-subscribers.
[A-081030] SC: Add Andy J and Jonathan W as admins to the DAS mailing list.

SC: All pending, though I did update the section of the biodas.org wiki to indicate that the das2 list is being retired and all traffic should be sent to the das list.

[A-081030] SC: Change 2 -> 2.1 and say it is "evolving"; declare the HTML spec as "frozen"
[A-081030] SC: Send link to the 2.1 wiki spec to list.

SC: Done.
GH: Need to do the same for the 1.5 vs 1.6 spec.

[A-new] SC: Add AJ and JW as biodas.org sysops (can't edit side bar)
[A-new] AJ/JW: Put link to 1.6 evolving version of the 1.5 das spec on biodas.org sidebar.

[A-081030] SC: Fix Affy IGB launching links on SF page.

SC: Have not done. Noticed today that they appear to be fixed (probably by Ann's group -- thanks!)

[A-081030] SC: Update biodas.org community portal page with new teleconf number.

SC: Done

[A-081016] SL: Summarize authentication pros and cons. Review descriptions, make a decision.

EL: Was there a write up of this?
GH: People posted comments to the list: David Nix, Andy, Steven Blanchard. Suzi is supposed to summarize.

[A-081016] SL: Decide Ian or Suzi is PI on grant. Issue reciprocal letters of collab.

GH: Suzi's grant action item: (Feb 2009) Feedback from funding people is that they're interested in DAS part of it (distributed annotation). Suzi will have some feedback after conf call on 11/14.

Topic: Writeback
==================

GH: Given LBL folks are here. How does it work in Apollo, retrieve and edit curations?
EL: Rudimentary via das. Supports a number of data sources, load into Apollo data model, modify, translate. Data sources: Chado, chado-xml, gff3, genbank records, some others.
GH: Thinking about das/2 writeback: ID assignment and batch operation. How do you do that?
EL: Id assignment is a chado (db) issue. Configurable by user (in following format), vs database ID. In the db, at time of writeback writing to chado instance, gets next available ID (pk), meaningless to user, just db internal.
GH: DAS/2 writeback spec, if it's new curation, client assigns temp id, post of xml for that feature to server, server responds back with same xml but with temp id in 'old-uri' and new id in the 'uri' field.
EL: Similar idea. When working with db, will generate temp id, and modify it.
GH: Related to that: changing one feature can have side effects on other features. Change one exon's boundaries, changes phase of other exons downstream.
EL: Done via client side through Apollo. Didn't like having server do it, since it ties to a particular db, relying on stored procs ties you into a specific DBMS. Decided to do it on client-side. When time to write to db, client queries db to determine available id space.
GH: Queries db before it creates a feature? JDBC?
EL: Yes and yes. Type 3 drivers.
GH: Changing in light of Gbol?
EL: Planning major rewrite of Apollo. Gbol will be able to handle it. Apollo won't care about I/O. That's all through Gbol. For das/1->2 translation, should be efficient with our framework. Conversion between different data sources via the data model should be easy.
GH: Regarding batch operations: easy via JDBC? Integrity across several operations.
EL: Many DBMS don't work well across lots of transactions. Run out of log space. Forces you to do lots of mini-transactions, with transaction management. Can't do massive update of whole genome. We can do per-CDS/protein/gene type edits as atomic operations.
GH: Transactional integrity in DAS/2: a single http call is the atomic unit. Any changes specified there are to be an atomic operation.
EL: Will be an issue with large writeback.
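
Note-taker's illustration: the temp-id exchange GH describes above (client posts a new feature under a temporary id, server echoes it back with the temp id in 'old-uri' and the permanent id in 'uri') might look roughly like the sketch below. This is only a sketch under stated assumptions, not the DAS/2 writeback spec: features are plain dicts rather than XML, and the function names, the "temp:" and "feature/" id formats, and the id counter are all hypothetical. Only the 'uri' and 'old-uri' field names come from the discussion.

```python
from itertools import count


def assign_permanent_ids(features, next_id, prefix="feature/"):
    """Server side (hypothetical): give each posted feature a permanent URI.

    Each returned dict is a copy of the posted feature with the client's
    temporary URI moved to 'old-uri' and the new permanent URI in 'uri'.
    The input dicts are left unchanged.
    """
    response = []
    for feat in features:
        echoed = dict(feat)                         # copy, do not mutate the request
        echoed["old-uri"] = feat["uri"]             # temp id chosen by the client
        echoed["uri"] = f"{prefix}{next(next_id)}"  # next available server id
        response.append(echoed)
    return response


def remap_client_ids(local_features, response):
    """Client side (hypothetical): swap temp URIs for the permanent ones."""
    mapping = {f["old-uri"]: f["uri"] for f in response}
    for feat in local_features:
        # Features the server did not mention keep their current URI.
        feat["uri"] = mapping.get(feat["uri"], feat["uri"])
    return mapping


# Two new curations with client-assigned temporary ids.
curated = [
    {"uri": "temp:1", "type": "exon"},
    {"uri": "temp:2", "type": "exon"},
]
server_reply = assign_permanent_ids(curated, count(1001))
id_map = remap_client_ids(curated, server_reply)
```

Treating the whole posted list as one request matches GH's point that a single http call is the atomic unit; a real server would assign the ids inside one database transaction.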
GH: Our model is a single human curator editing one gene at a time. Not via a major automated pipeline script. Not sure what happens in http when sending large amounts of data back and forth.
EL: Problem with timeouts while client is waiting for response.
GH: Have considered an arrangement where client receives 'accepted' (HTTP 202) and then a redirect to another source to receive the writeback, or check status. Not in the spec now.
AJ: Has been mentioned before, "come back later" not just for writeback. Not doing anything about it yet. Not hard to add something like this, since most libraries support redirection. Just check the header.
GH: Only sending data for features that change, not everything (delta).
EL: ...
GH: Some of this will take trials. Getting to work with single user.
AJ: Keep it simple, add it as needed.
GH: Writeback spec discussion on the mailing list (Gustavo). Can be generalized. Very few things in there now. Think we can have the thing that gets posted be the feature XML (DAS/1 or DAS/2). Can strip out, simplify it. RESTful. Have a link for this on wiki. Not yet populated.

[A-new]: Gregg write up new writeback proposal on wiki.
[A-new]: Steve - wikify the das/2 writeback here first.

AJ: Focused around proteins. Just get it working with Dasty (which uses OpenID). Better for him to post them as DAS/1 style features.
GH: Like it because: more restful, and not just for features (applies to seqs, types, alignments, etc.)
AJ: Use diff http commands to do different things. Post, put, get
GH: Problem for post, put, delete: you might want to do all of those in one operation. In the general case. Something that Google data folks are writing over posts, but are effectively doing puts and deletes too.
AJ: Simplicity is the way to go.
GH: Reduces the number of elements.

Pending Action Items:
========================

[A-081016] SL: Decide Ian or Suzi is PI on grant. Issue reciprocal letters of collab.
[A-081016] SL: Summarize authentication pros and cons.
Review descriptions, make a decision.
[A-081030] All: Review Gregg's DAS UML modeling, post any comments to list.
[A-081030] GH: Solicit feedback about security/auth from interested parties. Add to agenda.
[A-081030] GH: Contribute to the DAS changes document re: DAS/2, sources & deprecating DSN.
[A-081030] JN: Post preliminary java web start IGB release on bioviz.org
[A-081030] SC: Merge DAS2 subscribers to DAS list. Redirect DAS2 posts to DAS list.
[A-081030] SC: Consider making DAS list auto reject posts from non-subscribers.
[A-081030] SC: Add Andy J and Jonathan W as admins to the DAS mailing list.
[A-081113] AJ/JW: Put link to 1.6 evolving version of the 1.5 das spec on biodas.org sidebar.
[A-081113] GH: Work on translation of method and type in Trellis Ivy proxy.
[A-081113] GH: register for '09 Hinxton DAS workshop soon!
[A-081113] GH: Write up writeback proposal ideas in the DAS/2.1 wiki.
[A-081113] SC: Add AJ and JW as biodas.org sysops (so they can edit side bar)
[A-081113] SC: Wikify the das/2.0 writeback HTML document in das/2.1 wiki.
[A-081113] All: Next teleconf in three weeks: 04-Dec-08
[A-081113] All: Anyone that has items they want discussed, send to Gregg.

=======================================
CVS Repository version:
$Id: das2-teleconference-2008-11-13.txt,v 1.3 2008/11/14 01:14:55 sac Exp $
