Re: Eclipse Annotation Editor

2006-12-14 Thread Thilo Goetz

Hi Joern,

is this your Text Analysis Environment on SourceForge 
(https://sourceforge.net/projects/tae/)?  Looks pretty cool!  This would 
be a nice addition to our Eclipse-based tooling.


--Thilo

Jörn Kottmann wrote:

Hello,

I have developed an Eclipse editor for xcas files. It can add, remove, and
change Annotations and FeatureStructures; this is done within the editor and
some views. The plugin also defines its own project and has a special
explorer for it, like the Package Explorer in JDT. It also supports the
execution of Annotators and CASConsumers against the xcas files stored
inside the project.

If you are interested in adding this, or parts of it, to the UIMA project, I
would like to contribute the code and the time to integrate it.

Let me know what you think,
Jörn


Re: Eclipse Annotation Editor

2006-12-14 Thread Jörn Kottmann
Yes, there are still some compatibility issues with UIMA. The current
release has many smaller issues, many of which are already fixed. I will
make the next release soon.

On Dec 14, 2006, at 10:27 AM, Thilo Goetz wrote:


Hi Joern,

is this your Text Analysis Environment on SourceForge
(https://sourceforge.net/projects/tae/)?  Looks pretty cool!  This would be
a nice addition to our Eclipse-based tooling.


--Thilo


Re: Eclipse Annotation Editor

2006-12-14 Thread Adam Lally

On 12/14/06, Thilo Goetz wrote:

Hi Joern,

is this your Text Analysis Environment on SourceForge
(https://sourceforge.net/projects/tae/)?  Looks pretty cool!  This would
be a nice addition to our Eclipse-based tooling.

--Thilo



I got this from SourceForge but was unable to run it.  The net.sf.tae
plugins show up with red X's in the plugin registry, even though I've
installed GEF and UIMA 1.3.2.  There's nothing in the error log.  What
might I be doing wrong?

Anyway, it sounds like this would be a useful addition.  Thanks, Joern,
for offering to contribute it.

What do the other committers think -- would this make a good first
project for our UIMA sandbox?

-Adam


Re: [jira] Created: (UIMA-116) Always deliver the base CAS to the process method

2006-12-14 Thread Thilo Goetz

Adam Lally wrote:

On 12/13/06, Thilo Goetz wrote:


I couldn't agree more (except for the default bag indexes).  It makes no
sense at all that global indexes must be accessed via a particular view.



I can't tell what exactly you're agreeing to.  Are you thinking that
anything indexed in a view would also be by definition indexed in the
global view?  Do we need different index definitions for the global
view (so we don't have a global index over annotations sorted by
begin, end but containing annotations from multiple Sofas)?


I was agreeing to your statement that non-sofa (i.e., non-annotation) 
indexes make sense in a global view.


I would think that annotations for different sofas would be in different 
indexes.  Not sure what we currently do though.  All those indexes might 
be accessible from the global view.



In that case, a view could be seen as just a set of indexes, with
possibly just two methods: getIndexes() (and variations) and
addToIndexes(FS).  The base CAS would be a view on everything.  A view
might be what we now call index repository.  In fact, if we just rename
the index repository to view, we're done ;-).  Just a little
implementation to make more than one index repository possible.



We haven't addressed Sofas yet.

The base CAS does not have a single subject of analysis, so methods
like getDocumentText() and its relatives are a problem.  These methods
should belong to a view.  (According to the spec, not all views
necessarily have a Sofa, but it is a common use case supported by the
particular kind of view called an Anchored View.)


Sure, those would be on the view as well.  Would we then have 
text-specific views, like TCasView?  I'm not proposing this, mind you, 
just asking.



So no CAS cas = inCas.getView()?



Certainly, I never liked that idea but are we back to essentially 
requiring:

CasView viewOfMySofa = inCas.getView() ?


+1 to that.



Since inCas.getDocumentText() would not work, and inCas cannot be used
to iterate over or index annotations belonging to a particular Sofa.


inCas.getDocumentText() would not work, +1 to that.  However, I was 
thinking that you would be able to access *all* indexes (and their 
contents of course) from the CAS, not just the sofa-neutral ones. 
Perhaps you wouldn't know how to interpret the sofa information, but the 
annotations would still be accessible.  I think this is consistent with 
your idea/proposal that the CAS is the container of all data.  Was this 
also what you were thinking?


--Thilo



Re: [jira] Created: (UIMA-128) ll_setStringValue not checking if feature range is subtype of String with Allowed Values, not doing Allowed Value check

2006-12-14 Thread Thilo Goetz

Oops.  Let me look into that.

--Thilo

Marshall Schor (JIRA) wrote:

ll_setStringValue not checking if feature range is subtype of String with 
Allowed Values, not doing Allowed Value check
---

 Key: UIMA-128
 URL: http://issues.apache.org/jira/browse/UIMA-128
 Project: UIMA
  Issue Type: Bug
  Components: Core Java Framework
Affects Versions: 2.1
Reporter: Marshall Schor
Priority: Minor


The JCas code generated for setting string values uses the ll_setStringValue 
method in the CASImpl.  This method does not check if the type of the feature 
being set is a *subtype* of String with allowed values, and doesn't throw the 
needed exception if the item being set is not in the set of allowed values.
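
For illustration, the kind of check the issue asks for might look like the sketch below. This is not the CASImpl code: StringSubtype and its check method are hypothetical, IllegalArgumentException stands in for whatever exception the framework would throw, and accepting null as "unset" is an assumption of the sketch.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch of an allowed-values check; not the UIMA implementation. */
public class AllowedValuesSketch {

    /** Stand-in for a subtype of String that restricts its values. */
    static final class StringSubtype {
        private final String name;
        private final Set<String> allowedValues;

        StringSubtype(String name, Set<String> allowedValues) {
            this.name = name;
            this.allowedValues = new HashSet<>(allowedValues);
        }

        /** Returns the value unchanged if it is allowed, otherwise throws. */
        String check(String value) {
            // Assumption: null means "not set" and is always accepted.
            if (value != null && !allowedValues.contains(value)) {
                throw new IllegalArgumentException(
                    "Value \"" + value + "\" is not allowed for type " + name);
            }
            return value;
        }
    }

    public static void main(String[] args) {
        StringSubtype color =
            new StringSubtype("Color", new HashSet<>(Arrays.asList("red", "green")));
        System.out.println(color.check("red"));
        try {
            color.check("blue");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

A low-level setter would call such a check before storing the value, so that generated JCas setters get the same validation as the high-level API.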



Re: Eclipse Annotation Editor

2006-12-14 Thread Michael Baessler

Adam Lally wrote:

What do the other committers think -- would this make a good first
project for our UIMA sandbox?

Yes, I think this is a good first component for the UIMA sandbox. But first
we have to clarify the details for the submission...

It seems that we need an Apache Software Grant for this code.

I found the following at www.apache.org/licenses:

Software Grants


When an individual or corporation decides to donate a body of existing
software or documentation to one of the Apache projects, they need to
execute a formal Software Grant
(http://www.apache.org/licenses/software-grant.txt) agreement with the
ASF. Typically, this is done after negotiating approval with the ASF
Incubator (http://incubator.apache.org/) or one of the PMCs, since the
ASF will not accept software unless there is a viable community
available to support a collaborative project.


-- Michael


JUnit test extension files

2006-12-14 Thread Michael Baessler

Hi,

I think we should move all files from the uimaj-test-util project that
are used in uimaj-core into the uimaj-core project. I would like to
remove the dependency of uimaj-core on uimaj-test-util.

These are:
ExceptionPrinter.java
FileCompare.java
JUnitExtension.java
TestPropertyReader.java

Why is this necessary? I would like to use the uimaj-test-util project
to provide some helper classes for annotator testing, so that users
who write analysis components for our sandbox can all use the same test
methods. This makes things easier to understand. To do this,
uimaj-test-util needs a dependency on uimaj-core, which only works if
uimaj-core no longer depends on uimaj-test-util.


What do you think?

-- Michael




[jira] Closed: (UIMA-61) CasCreationUtils.createCas(Collection) silently ignores TypeSystemDescription objects,

2006-12-14 Thread Adam Lally (JIRA)
 [ http://issues.apache.org/jira/browse/UIMA-61?page=all ]

Adam Lally closed UIMA-61.
--

Resolution: Fixed

Previously this threw an exception if an Aggregate AE descriptor contained a 
URISpecifier.  This broke some user code.
Fixed problem by adding support for URISpecifiers.

 CasCreationUtils.createCas(Collection) silently ignores TypeSystemDescription 
 objects,
 --

 Key: UIMA-61
 URL: http://issues.apache.org/jira/browse/UIMA-61
 Project: UIMA
  Issue Type: Bug
  Components: Core Java Framework
Reporter: Adam Lally
 Assigned To: Adam Lally
Priority: Minor
 Fix For: 2.1

 Attachments: UIMA-61-testCase.patch


 The CasCreationUtils.createCas(Collection,...) methods only accept certain 
 kinds of objects in the Collection: AnalysisEngineDescription, 
 CollectionReaderDescription, CasInitializerDescription, 
 CasConsumerDescription, or ProcessingResourceMetaData.  Any other kinds of 
 objects in the collection are silently ignored.
 A user tried to pass a TypeSystemDescription object, expecting that it would 
 be used to initialize the CAS type system.  This didn't work but didn't cause 
 an error, so the user had a hard time figuring out what was wrong with their 
 application.
 There's no reason why these methods couldn't accept TypeSystemDescription 
 objects (as well as FsIndexCollection and TypePriorities objects).  
 Furthermore they should throw an error if passed a type of object that is not 
 allowed.
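
 A rough sketch of the "fail loudly" behaviour requested above. This is not the CasCreationUtils code: the nested interfaces are empty stand-ins for the UIMA descriptor types named in the issue, and the validate method and the choice of IllegalArgumentException are assumptions.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

/** Hypothetical sketch; the real UIMA types and exception classes differ. */
public class CreateCasValidationSketch {

    // Empty stand-ins for some of the descriptor types mentioned in the issue.
    interface AnalysisEngineDescription {}
    interface ProcessingResourceMetaData {}
    interface TypeSystemDescription {}
    interface FsIndexCollection {}
    interface TypePriorities {}

    private static final List<Class<?>> ACCEPTED = Arrays.asList(
        AnalysisEngineDescription.class,
        ProcessingResourceMetaData.class,
        TypeSystemDescription.class,
        FsIndexCollection.class,
        TypePriorities.class);

    /** Throws if any element is not an instance of an accepted type. */
    static void validate(Collection<?> specs) {
        for (Object spec : specs) {
            boolean accepted = false;
            for (Class<?> type : ACCEPTED) {
                if (type.isInstance(spec)) {
                    accepted = true;
                    break;
                }
            }
            if (!accepted) {
                String kind = (spec == null) ? "null" : spec.getClass().getName();
                throw new IllegalArgumentException(
                    "Unsupported object in collection: " + kind);
            }
        }
    }

    public static void main(String[] args) {
        validate(Arrays.asList(new TypeSystemDescription() {}));
        try {
            validate(Arrays.asList("not a descriptor"));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

 With a check like this, the user in the report would have seen an error right away instead of a CAS with a missing type system.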





Progress on Migration Utility

2006-12-14 Thread Adam Lally

I've made some good progress on a utility that can be used to help
users migrate their code from IBM UIMA to Apache UIMA.

The class is org.apache.uima.tools.migration.IbmUimaToApacheUima in
uimaj-tools, and there are corresponding .bat/.sh scripts in
uimaj-distr.

It's basically just a glorified search-and-replace utility, but has
some special treatment of package names to make sure that it only
updates actual UIMA package names, not just everything with a
com.ibm.uima prefix (several of our users within IBM have their own
packages under that prefix).

Anyway I've tried it on some pretty big UIMA projects and it seems to
do very well.  Here are some things it won't handle:

* User code that's in a package with the same exact name as one of the
UIMA packages.  Hopefully this occurs rarely, but unfortunately
there's one common case - DocumentAnnotation - which I mentioned in a
previous email. In such a case the package statement would get
replaced, but the .java file will then be in the wrong place in the
source tree.

* Package names that are prefixed by com.ibm.uima AND whose next segment
starts with a capital letter.  I hope no one has a package named
com.ibm.uima.MyPackage.  This would be treated as a class name and
replaced with org.apache.uima.MyPackage wherever it occurs.

* Use of _undocumented_ classes in the com.ibm.uima.util package,
because these moved to a different package than the documented
classes.  Can be fixed in Eclipse by a simple Organize Imports
operation.

* xi:include in descriptors.  There's no easy way to automatically
replace this, unfortunately.  Users will have to manually replace them
with the appropriate use of import.
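
To illustrate the package-name handling and the capital-letter limitation described above, here is a minimal sketch. It is not the IbmUimaToApacheUima source: the class name, the regular expression and the three-entry package list are assumptions made for the example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch of package-aware search and replace. */
public class PackageRenameSketch {

    // Assumed subset of package names that count as real UIMA packages.
    private static final Set<String> UIMA_PACKAGES = new HashSet<>(Arrays.asList(
        "com.ibm.uima", "com.ibm.uima.cas", "com.ibm.uima.analysis_engine"));

    // Group 1: lower-case segments (the package suffix).
    // Group 2: an optional capitalised segment, assumed to be a class name.
    private static final Pattern NAME = Pattern.compile(
        "(?<![\\w.])com\\.ibm\\.uima(?![A-Za-z0-9_])"
        + "((?:\\.[a-z_][A-Za-z0-9_]*)*)"
        + "((?:\\.[A-Z][A-Za-z0-9_]*)?)");

    /** Renames only names whose package part is a known UIMA package. */
    static String migrate(String source) {
        Matcher m = NAME.matcher(source);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String pkg = "com.ibm.uima" + m.group(1);
            String replacement = UIMA_PACKAGES.contains(pkg)
                ? "org.apache.uima" + m.group(1) + m.group(2)
                : m.group(); // user package under the same prefix: leave as is
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(migrate("import com.ibm.uima.cas.CAS;"));
        System.out.println(migrate("import com.ibm.uima.myproject.Foo;"));
        System.out.println(migrate("package com.ibm.uima.MyPackage;"));
    }
}
```

In this sketch the first import is rewritten, the second is left alone because com.ibm.uima.myproject is not in the list, and the third shows the limitation from the second bullet: MyPackage is taken for a class name, so the line is rewritten even though it names a user package.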


More work on this will probably be needed as we make more decisions to
change things, for example if we entirely remove TCAS.

-Adam