This is an automated email from the ASF dual-hosted git repository.

seanfinan pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/ctakes.git


The following commit(s) were added to refs/heads/main by this push:
     new b94c667  Improved getting started for new developers.
     new 3918e0e  Merge remote-tracking branch 'origin/main' into main
b94c667 is described below

commit b94c667a7ca2b0e55678090dfc86b36383aa4467
Author: Sean Finan <[email protected]>
AuthorDate: Mon Jun 9 10:22:45 2025 -0400

    Improved getting started for new developers.
---
 README.md | 93 ++++++++++++++++++++++++++++++++++++++++++---------------------
 1 file changed, 63 insertions(+), 30 deletions(-)

diff --git a/README.md b/README.md
index 29619e7..23e76ae 100644
--- a/README.md
+++ b/README.md
@@ -2,7 +2,6 @@
 
 ## Introduction
 
-
 The Apache™ clinical Text Analysis and Knowledge Extraction System (cTAKES™) focuses on extracting knowledge
 from clinical text through Natural Language Processing (NLP) techniques.
 
@@ -31,58 +30,92 @@ We encourage people from all backgrounds to get involved! (link)
 <br>
 
 ## Supported Environments
-1. **Java 1.8** is required to run cTAKES versions 5.x and older. Versions 6+ require java 17.  Run this command to check your Java version:
-```
-$ java -version
-```
-2. **Maven 3** is required to build cTAKES. Run this to command to check your Maven version:
+1. **Java 17** is required to run cTAKES 6.0.0 and higher.  **Java 8 or Java 11** is required to run cTAKES 5.  Run this command to check your Java version:
 ```
-$ mvn -version
+java -version
 ```
-3. A license for the [Unified Medical Language System (UMLS)](https://www.nlm.nih.gov/research/umls/index.html)
+> [!NOTE]
+> If you are using an integrated development environment (IDE), please see its documentation on using Java.
+2. A license for the National Library of Medicine's [Unified Medical Language System (UMLS)](https://www.nlm.nih.gov/research/umls/index.html)
    is required to use the named entity recognition module (dictionary lookup) with the default dictionary.
-4. **Python 3** is required to use cTAKES [Python Bridge to Java (PBJ)](https://github.com/apache/ctakes/wiki/pbj_intro). 
-Run this to command to check your Python version:
+3. **Python 3** is required to use cTAKES [Python Bridge to Java (PBJ)](https://github.com/apache/ctakes/wiki/pbj_intro).
+   Run this command to check your Python version:
 ```
-$ python -V
+python -V
 ```
+> [!NOTE]
+> If you are using an integrated development environment (IDE), please see its documentation on using Python.
+<br/>
+### For developers:
 
+1. Apache **Maven 3** is required to build cTAKES. Run this command to check your Maven version:
+```
+mvn -version
+```
+> [!NOTE]
+> If you are using an integrated development environment (IDE), please see its documentation on using Apache Maven.
 
 <br/>
 
 
 ## Getting Started
 
-### New Users
-
-The easiest way for new users to get a jump start running cTAKES is to use the [Standard Pipeline Installation Facility](artifacts).
-The Standard Pipeline Installation Facility is a tool that can install cTAKES configured to run the most popular cTAKES pre-built pipelines. 
-You can then use the [Piper File Submitter](https://github.com/apache/ctakes/wiki/Piper+File+Submitter) GUI to submit jobs or submit them from the command line.
+### New Users (Non-Developers)
 
-For access to all cTAKES capabilities, download a [zip]() or [tar.z]() file containing a fully-built installation of the most recent cTAKES release.
-Then, after obtaining a UMLS license, use the [UMLS Package Fetcher](https://github.com/apache/ctakes/wiki/cTAKES+UMLS+Package+Fetcher) GUI to install a copy of the 
+For access to all cTAKES capabilities, download a pre-built copy of a cTAKES installation from the [release area](https://github.com/apache/ctakes/releases).  
+The names of pre-built installations follow the format `apache-ctakes-#.#.#-bin.zip`.
+After unzipping the release file and obtaining a UMLS license, use the [UMLS Package Fetcher](https://github.com/apache/ctakes/wiki/cTAKES+UMLS+Package+Fetcher) GUI to install a copy of the
 default dictionary for Named Entity Recognition (NER) using cTAKES Fast Dictionary Lookup.
+You can then use the [Piper File Submitter](https://github.com/apache/ctakes/wiki/Piper+File+Submitter) GUI to submit jobs, 
+or run any of the scripts in the `bin/` directory.
 
-### New Developers
 
-__Notice:__ cTAKES 7.0.0-SNAPSHOT requires jdk 17 to build and run.
+### New Developers
 
 All source code for cTAKES versions 5+ is available from the [cTAKES GitHub repository](https://github.com/apache/ctakes).
-1. Clone this repository
+1. Clone the cTAKES code repository using git.
 ```
-$ git clone https://github.com/apache/ctakes.git
+git clone https://github.com/apache/ctakes.git
 ```
-2. Open your local copy of the repository in an IDE of your choice.
-3. Run directly from the code (link).  
-   or
-4. Build a binary installation (link), and
-5. Run a binary installation (link). 
+> [!NOTE]
+> If you are using an integrated development environment (IDE), please see its documentation on using git.
+2. Compile the cTAKES code using Apache Maven.  In your cTAKES root directory, run this command:
+```
+mvn clean compile
+```
+> [!NOTE]
+> If you are using an integrated development environment (IDE), please see its documentation on using Apache Maven.
+3. [Download](https://sourceforge.net/projects/ctakesresources/files/sno_rx_16ab.zip) the default cTAKES dictionary zip file.
+4. Copy the contents of the zip file to the `resources/org/apache/ctakes/dictionary/lookup/fast` directory.
+> [!NOTE]
+> As an alternative to steps 3 and 4, you can use the [UMLS Package Fetcher](https://github.com/apache/ctakes/wiki/cTAKES+UMLS+Package+Fetcher) GUI.
+> Run the class `DictionaryDownloader.java` to launch that tool, or use the `getUmlsDictionary` script if using a full build of cTAKES.
+5. Run the cTAKES default pipeline using the Java class `PiperFileRunner.java`. To use the [Piper File Submitter](https://github.com/apache/ctakes/wiki/Piper+File+Submitter) GUI, run the `PiperRunnerGui.java` class.
+> [!NOTE]
+> To run the cTAKES Java classes, the full Java classpath must be configured. Setting up a classpath is beyond the scope of this document.  
+> An integrated development environment (IDE) should set up the classpath for you; please see its documentation.
+
+<br>
+> [!IMPORTANT]
+> You cannot run scripts in the `bin/` directory within a development environment.
+> Within a cTAKES development environment you can run Java classes and Maven profiles, but no scripts in the `bin` directory.
+
+> [!TIP]
+> You can build your own cTAKES installation from a development environment using Apache Maven. 
+> A cTAKES installation is required to run scripts in the `bin/` directory.
+6. Build using Apache Maven:
+```
+mvn clean compile package
+```
+> [!NOTE]
+> If you are using an integrated development environment (IDE), please see its documentation on using Apache Maven.
 
+After packaging, there should be tar and zip files for `apache-ctakes-...-bin` and `apache-ctakes-...-src` in your `ctakes-distribution/target/` directory.
+7. Unzip the `apache-ctakes-...-bin` into a directory *outside* your cTAKES development area.
 
-## More information
 
-Much more information can be found on the [cTAKES wiki](https://github.com/apache/ctakes/wiki).
+## More information
 
-You can also write to the cTAKES user and developer mailing lists: user at ctakes.apache.org and dev at apache.ctakes.org
+You can write to the cTAKES user and developer mailing lists: **user** at `ctakes.apache.org` and **dev** at `ctakes.apache.org`
 and find answers to previously asked questions by searching the [user](https://lists.apache.org/[email protected])
 and [developer](https://lists.apache.org/[email protected]) mail archives.
\ No newline at end of file
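
The Java requirement stated in the updated README (Java 17 for cTAKES 6.0.0 and higher, Java 8 or Java 11 for cTAKES 5) could be checked before building with a small shell sketch like the one below. It is not part of this commit. The function names `java_major` and `java_ok_for_ctakes` are hypothetical, and treating the listed Java versions as exact matches is an assumption.

```shell
#!/bin/sh
# Hypothetical helper (not part of cTAKES): print the Java major version
# for a version string such as "1.8.0_292", "11.0.21" or "17.0.2".
java_major() {
    v=$1
    # Legacy scheme: "1.8.x" means Java 8, so drop the leading "1."
    case "$v" in
        1.*) v=${v#1.} ;;
    esac
    # Keep only the text before the first '.', '_' or '-'
    echo "${v%%[._-]*}"
}

# Hypothetical helper: succeed if a Java major version ($2) matches what the
# README lists for a cTAKES major version ($1).
# Assumption: the listed versions are treated as exact matches.
java_ok_for_ctakes() {
    ctakes_major=$1
    java=$2
    if [ "$ctakes_major" -ge 6 ]; then
        [ "$java" -eq 17 ]
    else
        [ "$java" -eq 8 ] || [ "$java" -eq 11 ]
    fi
}
```

One possible use is to feed it the quoted version string that `java -version` prints on standard error, for example `java_major "$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')"`, and pass the result to `java_ok_for_ctakes` along with the cTAKES major version being built.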
