This seems like a *very* challenging and involved problem to me...

On Tuesday, October 1, 2013, Pei Chen wrote:

> Agreed.
> Yes, I think this is a slight augmentation and extension of the original
> vision of the clinical common type system, by having it work with other
> UIMA-based NLP systems. Having worked on item (3) for cTAKES, I actually
> think the tough part will be getting consensus and agreement on a system
> between all parties and less on the required code changes. Hence, just
> wanted to ping the community to gauge interest and see if this actually
> makes sense. [It would be nice to plug in different POS taggers, for
> example, without having to remap types.]
> If we have a willing volunteer (Richard :)?) to perform some of the prelim
> analysis in Q1 2014 with our existing type system, perhaps we can actually
> make this happen.
>
> 4a) I think the SHARP4 development group has essentially moved to the
> cTAKES ASF community, which is probably even better since it already has a
> meritocratic/governance mechanism to handle changes.
>
> On Tue, Oct 1, 2013 at 10:39 AM, Wu, Stephen T., Ph.D.
> <[email protected]> wrote:
>
> > Pei et al,
> > That was the vision for the SHARP "common type system", except it was
> > meant to include medical-related projects rather than general projects.
> >
> > Steve's process below is probably the most realistic way to do things, and
> > it's basically how we did the current cTAKES type system. Unfortunately,
> > the "someone" doing #1 was me, and I didn't realize that it would be quite
> > difficult. I guess I know more about how to do it now, but #1 and #2 were
> > surprisingly harder than I expected. I'm adding a #4:
> >
> > (1) Have someone inspect the various type systems closely and make a
> > proposal
> >   A. Know each of the type systems on their own. Essential to visualize
> > them appropriately, but it is still difficult to understand the
> > implications of type changes just by looking. (By the way, we never came
> > up with a really great automatic visualization tool, closest was a Protégé
> > plugin.
> > Excellent visualization would go a long way, especially if edits
> > were possible.)
> >   B. Categorize portions of type systems to compare and take them a step
> > at a time.
> >   C. Clearly limit which type systems you are going to consider for your
> > comparison and reconciliation.
> >   D. Pick a starting point. I found it nearly impossible to create from
> > scratch when you're staring at 4-5 other type systems. We started from
> > the old cTAKES type system but that did cause some bias!
> >   E. Develop real criteria (or at least opinions) for choosing between the
> > many options.
> >
> > (2) Agree on the proposal.
> >   A. Multiple projects should make a binding agreement to implement. This
> > means, most likely, that somebody needs to have assurance of funding.
> > In our case, we only made it binding for cTAKES, so it is only used by
> > cTAKES (as far as I know).
> >   B. With different projects' vested interests on the line, have some real
> > discussions of what your project is going to give up with the proposed
> > stuff.
> >
> > (3) Spend the time to re-write all the code to use the new type system.
> >   * As Steve said, this is time-consuming, especially if things get broken
> > and models need to be retrained, etc.
> >
> > (4) Ensure maintenance and modifiability across projects.
> >   A. The original SHARP common type system vision handed off the
> > maintenance to the Software Development Group, but that never really
> > happened. I hope the Apache community can serve as this to some degree,
> > but so far it has still depended on unreliable people like myself.
> >   B. A means of having everyone automatically draw from the same source
> > code would be preferable.
> >   C. If, in the future, you need to consider another UIMA project whose
> > type system should be reconciled... Well, that's happening right now. I
> > guess you can worry about it when you get there if you have a community
> > that's willing to deal with it.
> >
> > Those are just some thoughts. It's not impossible, but neither is it
> > simple.
> >
> > stephen
> >
> > On 9/30/13 8:17 PM, "Steven Bethard" <[email protected]> wrote:
> >
> > >We (ClearTK) talked with Richard (DKPro) about doing this for ClearTK
> > >and DKPro. Basically, both groups were all for it, but the main issue
> > >was time. Basically you need to:
> > >
> > >(1) Have someone inspect the various type systems closely and make a
> > >proposal
> > >(2) Agree on the proposal.
> > >(3) Spend the time to re-write all the code to use the new type system.
> > >
> > >Step (3) is especially time-consuming, but in fact, we never managed
> > >to get the free time for step (1).
> > >
> > >That all said, ClearTK would love to share a common type system with
> > >other projects.
> > >
> > >Steve
> > >
> > >On Mon, Sep 30, 2013 at 7:38 PM, Pei Chen <[email protected]> wrote:
> > >> Richard, I, and a few others had an interesting bar conversation...
> > >> In the spirit of interoperability, what if we had a baseline common type
> > >> system that could be reused across UIMA-compatible NLP systems?
> > >> Imagine for a moment that OpenNLP, Clea

--
Karthik Sarma
UCLA Medical Scientist Training Program Class of 20??
Member, UCLA Medical Imaging & Informatics Lab
Member, CA Delegation to the House of Delegates of the American Medical Association
[email protected]
gchat: [email protected]
linkedin: www.linkedin.com/in/ksarma
