I can't believe that I forgot to mention ...
There will also be a presentation (maybe two?) by a group that has adapted ctakes to work with two other languages. They have also integrated ctakes with other tools such as FreeLing and HeidelTime. So cool ...

Cheers,
Sean

________________________________
From: Finan, Sean
Sent: Monday, July 6, 2020 9:08 AM
To: dev@ctakes.apache.org; u...@ctakes.apache.org
Subject: ApacheCon 2020

Hi all,

The ctakes representation at ApacheCon 2020 is looking good!

ApacheCon 2020 runs September 29 through October 1. Submission runs through Sunday, July 12. Technically it is 8:00 a.m. Eastern time Monday, but please don't procrastinate. Registration is free.

I am excited to announce that we have three groups interested in giving presentations on their configuration and use of ctakes at a large scale! We also have a presentation on the installation of the ctakes REST service using the ctakes-rest module! Knowledge on these topics is always extremely valuable to our users, and I for one really want to see how sites use ctakes when given different resources, requirements and restrictions. Because of that, I am trying to put together (technology allowing) a roundtable discussion with those presenters. That should be of value to every user no matter what your situation.

We still need more presentations! To encourage you, here is a little information:

1. What you do is interesting! If you think that nobody out there cares about what you've done and how, then you probably aren't fully aware of how large and diverse our user base really is. People want to know about things like integration, customization, clinical specialty application, augmentation and favorite capability fascination.
2. Submission is very simple. This is not like a scientific conference that requires a complete paper describing your work. You only need to submit a blurb that loosely covers your topic and major talking point(s). Half a dozen sentences will suffice.
In fact, what I sent last week (far below) could pass muster for a submission. Go for something that will be on a brochure / schedule.
3. The audience is made up of people just like you. Developers, Bioinformaticians, IT Specialists, Students, Medical Researchers, AI Explorers and far more Hackers than Rock Stars.
4. Slick presentation skills are not necessary. Don't worry if you have never spoken to a room full of listeners. Don't worry if English isn't your first language. Don't worry if your slides are "sloppy". Your presentation will not be graded.
5. You don't need to prepare your whole talk before submitting. Idea now, details later.
6. Registration is FREE.

Right now the speaking time is anything up to 50 minutes. If you don't want to present a full 50 minutes then that is ok ... The rest can be filled with extra question/answer or somebody else may fill the remaining time with a presentation on a similar topic.

I am going to put together a lightning round. If you think that you can cover some material in five to fifteen minutes then this is for you! Lightning rounds can be fun as you can make an impact with two or three slides and barely enough speaking to run out of breath. This is really a free-for-all. You can pack the time with data, give a short demonstration, compare using ctakes to breaking a mustang, or even do some on-topic (ctakes, nlp, AI, bioinformatics) stand-up. Anything goes. This was an interesting (full) talk last year: https://aceu19.apachecon.com/session/confessions-middle-aged-coder-turned-gravel-grinder. If you want to be in the lightning round, just write me a couple of sentences on your strike and I will put together the full submission for ApacheCon. Does it get any easier?

I will present one or two things, but to maximize impact I would like to know what most interests / would help all of you. So, please write me a topic or two that would best apply to your work.

Some links ...
ApacheCon Home Page: https://www.apachecon.com/
ApacheCon Registration: https://hopin.to/events/apachecon-home
ApacheCon Submission: https://acna2020.jamhosted.net/cfp.html

Lastly, so that we don't crash a server, I would like to have a rough head count for attendance estimation. If you think that you will watch any presentation of ctakes then please send me (seanfi...@apache.org) an email with the subject "Attend" and "+1" in the body.

Cheers,
Sean

________________________________
From: Finan, Sean
Sent: Monday, June 29, 2020 11:02 AM
To: dev@ctakes.apache.org
Subject: ApacheCon 2020

Hi all,

General admission to ApacheCon 2020 is free: https://hopin.to/events/apachecon-home

I think that price of admission and travel costs have held back ctakes users from attending past conferences, and lack of a sizable audience has diminished the comparative value of ctakes presentations in the eyes of ApacheCon planners. Because of the "at home" nature of this year's conference, an app with smaller presence and less hip buzz has a better chance of grabbing some time on the schedule.

The predetermined tracks are still an ill fit when it comes to the nature of ctakes. https://apachecon.com/acah2020/cfp.html However, I think that we can still use this opportunity to deliver some powerful introduction and training videos, as well as user stories and clinical project application. Perhaps we can argue for an NLP track and do some coordination with projects like OpenNLP and UIMA.

There are a scant two weeks to come up with presentations, and less time to propose a track/topic. The call for presentations ends July 13th. That is a deadline that requires immediate attention by anybody who wants to show off their project or expertise.

Apache wants to have a single point of contact for each project, and I am volunteering to be that person for ctakes. I am volunteering, not laying claim, so if you think that you are a better fit for the position please let me know.
I have written some ideas for presentations below. If you want to take one (modify as you like) then please write me and post to the devlist. If you have ideas for another presentation topic, please let me and the devlist know - even if you aren't volunteering to do the presentation yourself perhaps somebody else will.

Again ... two weeks.

Thank you,
Sean

* The following talk ideas are by and large directed toward training. That does not mean that topics should stay within that scope.

=================================================================
Customizing cTAKES: First Principles

Built using Apache UIMA, cTAKES is modular and extensible. Why is it frequently treated as a black box? Is it lack of need, sparsity of resources, or simply fear of the unknown? This is a quick start tutorial on adding custom elements to cTAKES. We illustrate creating simple classes to input, process and output data. This involves a concise overview of Apache uimaFIT and the cTAKES type system, as well as building a UIMA pipeline using piper files.

=================================================================
Loading a shippable with cTAKES DockHand

Customizing a simple pipeline need not be left to cTAKES experts. Making a cTAKES installation need not be confined to source code checkouts or lengthy multi-stage binary downloads. We introduce cTAKES DockHand, a compact single-file installation tool that allows one to construct custom pipelines as well as local installations, REST services and Dockerfiles.

==================================================================
Secret Engines of cTAKES

The cTAKES default natural language processing pipeline is a standard in the clinical research community. What is past that standard? While the default clinical pipeline uses almost 20 engines, there are dozens more in various cTAKES modules. We present and discuss the top 10 annotation engines you never knew you had.
====================================================
Does cTAKES Know "The Best Words"?

Named Entity Recognition is at the core of all complete natural language processing tools. Out of the box cTAKES uses a dictionary containing part of the Unified Medical Language System (UMLS) that covers most common clinical terms. But it also comes with a custom dictionary creator. If you think that your clinical research is directed, then you should probably have a directed dictionary. UMLS subsets, non-English dictionaries and novel custom dictionaries have all been successfully used with cTAKES. This is an overview of cTAKES named entity recognition with the essential what, why and how of custom dictionaries as the centerpiece.

====================================================
Academic Software: Performance or Performance?

A conundrum faced by all academic software projects is how to make the best of a small amount of resources. Clinical natural language processing projects that use cTAKES are not exempt, and balancing accuracy of results against speed of processing often becomes central when it is time to put things into production (or just please the boss). More than a history of cTAKES and its evolutionary efforts in precision, speed and usability, this presentation contains examples of how to best utilize each aspect.

================================================================
Diet cTAKES

One reason cTAKES is a popular framework in clinical natural language processing tools is its use of Apache Maven for project management. Navigating cTAKES dependencies can be difficult, leading to a common practice of consuming the whole project. Much of what ends up in your system may lead to unnecessary bloat. Going piecemeal through the values and weights of cTAKES modules and resources, this presentation will assist any cTAKES user in trimming project bulk from gigabytes to megabytes.
================================================================
cTAKES Saved my Life

The title is inappropriate when it comes to healthcare in practice. However, I used Apache cTAKES for my clinical research project on ________, and its [versatile / comprehensive / speedy / ?] nature was important in completing things [on time / accurately / ?]. We share our real-world experiences with using cTAKES, discuss why we chose it, issues we faced and how we overcame unexpected problems.

================================================================
Large-scale cTAKES, an Installation Story

At our _____ facility, we needed to process _____ [patients / notes / term lists / ?] on a ______ system. We present a real-world application of cTAKES on a large scale, our needs for _____ input and ____ output. We compare and contrast cTAKES with other [clinical] NLP platforms that we tried and explain why we chose [it / another] in the end. We will also share the novel [techniques / code / integration] that we used for the success of our installation.

================================================================
My Engine is Faster than Yours

We have created a cTAKES annotation engine that performs the task of _____. This is [newer / faster / more comprehensive] than existing engines in [cTAKES / other]. We will present [numbers, usage, capabilities / i/o] of the engine and its [model / data]. We will also commit the code and documentation to Apache cTAKES.

================================================================
cTAKES on the Catwalk

We have created a Machine Learning model that can be used in cTAKES for ______. The model uses the third party ______ for [newer / faster / more comprehensive] results. We will present the essentials of model creation as well as [numbers, usage, capabilities / i/o] of our model. We will also advocate for the third party _____ and how we integrated it with cTAKES.
We will also commit the code [model] and documentation to Apache cTAKES.