Strange "about" file - can it be deleted?

2007-07-09 Thread Marshall Schor
uimaj-tools has, in src/main/resources under
org/apache/uima/tools/util/gui/, a file "about.txt" with the content:



Apache UIMA (Unstructured Information Management Architecture) SDK
Version ${version}-incubating
http://incubator.apache.org/uima

Copyright 2006, 2007 The Apache Software Foundation

This product includes software developed at
The Apache Software Foundation (http://www.apache.org/).


This file seems to be left over from something earlier.  The ${version} is 
never replaced.  Normally, "about" files are for Eclipse plugins, I 
think, where you add additional licensing terms that apply to that plugin.


Does anyone know the purpose of this file?  If there is none, does anyone 
object to removing it?


-Marshall


Re: Strange "about" file - can it be deleted?

2007-07-10 Thread Marshall Schor
OK - I was misled by the name of the file.  The Eclipse license refers 
to "about" files kept in plugins.

I thought it was one of these...

-Marshall

Thilo Goetz wrote:

This is the about text that is loaded for all utilities (DocumentAnalyzer,
CVD etc.) in org.apache.uima.tools.util.gui.AboutDialog.  Please do not
remove ;-)

--Thilo



July UIMA Board report due tomorrow - I put something in the wiki

2007-07-10 Thread Marshall Schor

Please review and fix / augment as needed :-)  Wiki is:
http://wiki.apache.org/incubator/July2007

-Marshall


multi-threading and TypeSystemImpl

2007-07-11 Thread Marshall Schor

I poked around in the code for TypeSystemImpl a bit and concluded:

1) Re: the "array of arbitrary FSs" mechanism, which is implemented to 
allow adding additional arrays of specific FS types after the type 
system has been committed.  The call which actually adds a type is 
ts.getArrayType(componentType).  The framework seems to avoid calling 
this, other than when creating a whole type system.  But this method is 
public, defined in the TypeSystem interface, and so a user could call 
it at any time, on any thread:


 /**
  * Obtain an array type with component type componentType.
  *
  * @param componentType
  *  The type of the elements of the resulting array type. This can be
  *  any type, even another array type.
  * @return The array type with the corresponding component type.
  */
 Type getArrayType(Type componentType);

This method does a lookup in componentToArrayTypeMap.  If it doesn't 
find the component type, it creates the array type and adds it to the 
map using a "put".  The componentToArrayTypeMap is a non-synchronized 
IntRedBlackTree.  So this is a potential failure if this happens in a 
multi-threaded environment, via a user calling ts.getArrayType(...).


We probably need to make access to this synchronized, or use some 
fancier method of locking.


2) It seems very surprising to me that the synchronization around 
get(int) would ever have a collision - it's too tiny a piece of code.  
The code for get(int) is, I think, just an array dereference contained in 
a synchronized method.  So - I think something else is causing the 
slowdown, but I don't know what that could be.  It also could be that 
the JProbe code is altering the behavior.


3) We should probably eliminate the call in ll_isArrayType that calls 
ll_isValidTypeCode - it's not needed.


4) I think we want to design the TypeSystemImpl so that when it is 
"locked" - it is thread-safe for running with multiple threads.  This 
seems to involve adding synchronization to other object accesses 
(example object: the "locked" boolean).  Because of (2) above, I'm 
guessing (hoping) this would not have an impact on performance.  Before 
it is locked, it can probably be assumed that only one thread will ever 
be accessing the type system (is this true / "provable" in the current 
design?  - if not - I suppose we could make the design thread-safe even 
before "committing").  

5) The instances of CASMetaData and FSClassRegistry are tied to 
instances of TypeSystemImpl, and so also have to be thread-safe.


Note that the FSClassRegistry "generators" already have shadow instances 
for each CAS in a CAS Pool that might be running on different threads.
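
For point 1 above, here is a minimal sketch of what "make access to this synchronized" could look like. It is not the TypeSystemImpl code: the class ArrayTypeRegistry, the use of a HashMap in place of the IntRedBlackTree, and the int type codes are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the componentType -> arrayType lookup; not UIMA code.
class ArrayTypeRegistry {
  // Plain, non-thread-safe map, playing the role of componentToArrayTypeMap.
  private final Map<Integer, Integer> componentToArrayType = new HashMap<Integer, Integer>();
  private int nextTypeCode = 1000;

  // synchronized makes "look up, and add if missing" a single atomic step,
  // so two threads can never both add an array type for the same component.
  synchronized int getArrayType(int componentTypeCode) {
    Integer existing = componentToArrayType.get(componentTypeCode);
    if (existing != null) {
      return existing;
    }
    int arrayTypeCode = nextTypeCode++;
    componentToArrayType.put(componentTypeCode, arrayTypeCode);
    return arrayTypeCode;
  }

  // Number of array types registered so far.
  synchronized int size() {
    return componentToArrayType.size();
  }
}
```

Locking the whole method is the simplest choice; whether the extra lock matters for performance is exactly the question raised in point 2.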

If we did all this, we still probably wouldn't have solved the problem of 
the slowdown for xmi serialization.  It would be great to have a test case 
for this.  Greg - can you put together something simple and sanitized that 
shows the problem, and then submit a Jira issue and attach the test as a patch?


-Marshall









Re: PEAR InstallationController API

2007-07-11 Thread Marshall Schor
There was a (in)famous incident where the implementation for rerunning 
the Semantic Index Builder did the same thing - it (formerly) erased all 
the files in the target directory.


What happened was that a senior-level manager (Dave Ferrucci) was just trying 
out this code.  He was busy and didn't read all the fine print, and when 
it asked him where to put the index, he said - well - hmm - just put it 
on my desktop.  Sure enough, the code erased his entire desktop...  He 
was not happy about this, to say the least.


So - if you want to do this, please be sure this kind of user scenario 
won't occur by accident. 


-Marshall

Michael Baessler wrote:

Hi,

when installing a PEAR file, currently the InstallationController API 
overwrites existing data in the target installation directory. The 
directory content isn't removed first.
So it can happen that old class files or descriptor files that are 
no longer valid are still in the installation directory, since they are 
not removed when the new content is installed. Only existing files with 
the same name are overwritten.


So my plan is to clean the target installation directory before the 
new PEAR file is installed into it. Whether the target install directory 
should be cleaned can be specified with an additional parameter when 
using the PEAR installer API.


A known issue with that approach is that, after installing a PEAR 
file to a directory, the same PEAR file cannot be installed again to 
the same directory within the same JVM, since the JVM has locked the jars 
and they cannot be deleted when the directory is cleaned for the next 
installation. To do that, the JVM must be restarted, or another 
installation directory must be used. The JVM locks the jar files when 
the installation verification is executed.


What do others think about this approach?

-- Michael









[Fwd: Re: Iterators: problem when using standard methods in combination with moveTo*]

2007-07-12 Thread Marshall Schor

Thilo - is this "fixable" - so it just works as users expect?

-Marshall

 Original Message 
Subject:Re: Iterators: problem when using standard methods in
combination with moveTo*
Date:   Thu, 12 Jul 2007 13:33:31 +0200
From:   Thilo Goetz <[EMAIL PROTECTED]>
Reply-To:   [EMAIL PROTECTED]
To: [EMAIL PROTECTED]



Hi Julien,

Julien Nioche wrote:

Thilo and Marshall,

Thanks for sharing the tip. Indeed it would be a good idea to add this
little example to the documentation.

A quick comment about the Iterator methods. I had a problem with the
following piece of code:

while (wordFormIterator.hasNext()) {
  WordForm wf = (WordForm) wordFormIterator.next();
  if (wf.getBegin() == token.getBegin() && wf.getEnd() == token.getEnd()) {
    liste.add(wf);
  } else {
    // move back
    wordFormIterator.moveToPrevious();
    return liste;
  }
}

The last element of the iterator was never accessible because
hasNext() returned false despite the fact that there WAS an element
left in there. moveToPrevious() had been previously called on this iterator.

Should not hasNext() return true even if the cursor has been moved
forward or backward within the iterator? Or is the use of the legacy
methods (hasNext(), next()) incompatible with the moveTo* methods?


hm, I thought this was in our documentation, but couldn't find it myself.
You should not mix the use of next()/hasNext() with the methods defined
in the FSIterator interface.  They do not work well together.  If you use
the FSIterator APIs, you should use them exclusively.  Sorry about that.
I'll add a comment to the javadocs.



Thanks

Julien

To be a bit more explicit, here's some code that will determine how
many tokens the longest sentence in the document contains.  It's a
silly example, but it illustrates the concept.  Maybe this should go
in the docs.  Note: I have not actually run this code, it may not
work immediately ;-)

CAS cas = ...;
Type sentenceType = cas.getTypeSystem().getType("yourSentenceTypeName");
Type tokenType = cas.getTypeSystem().getType("yourTokenTypeName");
FSIterator sentenceIt = cas.getAnnotationIndex(sentenceType).iterator();
AnnotationIndex tokenIndex = cas.getAnnotationIndex(tokenType);
FSIterator tokenIt;
int maxLen = 0;
int currentLen;
for (sentenceIt.moveToFirst(); sentenceIt.isValid(); sentenceIt.moveToNext()) {
  tokenIt = tokenIndex.subiterator((AnnotationFS) sentenceIt.get());
  currentLen = 0;
  for (tokenIt.moveToFirst(); tokenIt.isValid(); tokenIt.moveToNext()) {
    ++currentLen;
  }
  maxLen = ((maxLen < currentLen) ? currentLen : maxLen);
}
System.out.println("Longest sentence contains " + maxLen + " tokens.");

--Thilo

Marshall Schor wrote:
  

Did you consider using subIterators?  These are (briefly) described in
section 4.7.4 of the Apache UIMA Reference book, and may provide exactly
what you're trying to get at - an iterator over elements that are
"contained" in the span of other elements.

-Marshall

Julien Nioche wrote:


Hi,

Sorry if someone already asked the question.
Is there a direct way to obtain from a Cas all the annotations of a
given type located between two positions in the text? Something like
getContained(String type,int start,int end)?
I am trying to get all the Tokens contained within a specific
Sentence. I have used iterators for doing that and compared the offset
with those of the Sentence but it is a bit tedious. Have I missed
something obvious?

Thanks

Julien


  









Re: [Fwd: Re: Iterators: problem when using standard methods in combination with moveTo*]

2007-07-12 Thread Marshall Schor

It seems to me the only confusion arises when users use
next() to get an element and move to the next element, and then use
moveToPrevious, which as you say, may or may not work if the iterator
ended up "invalid" because the next() moved it past the last element.

So the only "improvement" I would think is wanted is to make the moveTo
operations work reliably in these kinds of cases.  Is that hard to do?

-Marshall

Thilo Goetz wrote:

What is the expected behavior?  Here's the impl:

  public boolean hasNext() {
    return isValid();
  }

  public Object next() {
    Object result = get();
    moveToNext();
    return result;
  }

Perfectly reasonable, but has some consequences
that may not be obvious.  For example, when you
do

FS fs1 = it.next();
FS fs2 = it.get();

then fs1 != fs2.  Is that intuitive?  I don't know.
Is it fixable?  Not easily, no.

We also have this more subtle behavior:

FS fs1 = it.next();
it.moveToPrevious();
FS fs2 = it.next();

The last line may throw a NoSuchElementException.  Why?
Because the first line may invalidate the iterator, and
then moveToPrevious() will not normally make the iterator
valid again (sometimes it will, depending on the iterator
implementation).

So in terms of what is reasonable, the iterators behave
as expected.  Still, because the interactions are so
subtle, it is not a good idea to mix the paradigms.  I
never do, even though I think I understand what's going
on.
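
To make the interaction concrete, here is a small self-contained model of a cursor-style iterator. It is only a sketch, not the UIMA FSIterator implementation: the class CursorIterator is hypothetical, and it assumes that moving an already invalid iterator has no effect (real implementations may differ, as noted above). Its hasNext() and next() follow the implementation quoted above.

```java
import java.util.NoSuchElementException;

// Simplified model of a cursor-style iterator; not the UIMA FSIterator code.
class CursorIterator {
  private final String[] items;
  private int pos = 0; // the iterator is valid while 0 <= pos < items.length

  CursorIterator(String... items) {
    this.items = items;
  }

  boolean isValid() {
    return pos >= 0 && pos < items.length;
  }

  String get() {
    if (!isValid()) {
      throw new NoSuchElementException();
    }
    return items[pos];
  }

  void moveToFirst() {
    pos = 0;
  }

  // Assumption of this model: moving an invalid iterator has no effect.
  void moveToNext() {
    if (isValid()) {
      pos++;
    }
  }

  void moveToPrevious() {
    if (isValid()) {
      pos--;
    }
  }

  // java.util.Iterator-style methods, written like the implementation quoted above.
  boolean hasNext() {
    return isValid();
  }

  String next() {
    String result = get();
    moveToNext();
    return result;
  }
}

class CursorIteratorDemo {
  public static void main(String[] args) {
    CursorIterator it = new CursorIterator("a", "b");
    it.next();           // returns "a"; the cursor is now on "b"
    it.next();           // returns "b"; the cursor moved past the end and is invalid
    it.moveToPrevious(); // no effect in this model: the iterator stays invalid
    System.out.println(it.hasNext()); // prints false, although "b" was just "put back"
    it.moveToFirst();    // resetting is the reliable way to make it valid again
    System.out.println(it.hasNext()); // prints true
  }
}
```

This is the situation in Julien's loop: the last call to next() invalidates the iterator, so the following moveToPrevious() does not bring the last element back.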

I'm pretty sure I've documented this before.  I don't know
where that text went.  Maybe I dreamed it.

--Thilo


Re: UIMA Javadocs in Eclipse

2007-07-12 Thread Marshall Schor

Thilo Goetz wrote:

Is it just me, or is the following from section 1.1 of the Tutorial and Guide 
out of context:

Note

In Eclipse 3.1, if you highlight a UIMA class or method defined in the UIMA SDK 
JavaDocs, you can conveniently have Eclipse open the corresponding JavaDoc for 
that class or method in a browser, by pressing Shift + F2.

I was unable to find any instructions that tell me how to set things up so this 
works.
Should we add something to section 3.2 of the Overview?  Did I miss anything?
  
Probably should add something.  There's a bit about this in the Javadocs 
chapter - Ch 1 of the References.

I also didn't find a description of how to set things up in Eclipse so that the 
UIMA source code is linked to my classpath, so that F3 works.  It would be 
convenient if we had a src.jar, like I see done in other projects.  

I agree.  see Jira http://issues.apache.org/jira/browse/UIMA-248

Does Maven have an easy
way to create that?
  

Probably just another assembly step

--Marshall


Re: multi-threading and TypeSystemImpl

2007-07-12 Thread Marshall Schor

Thilo Goetz wrote:

Adam Lally wrote:
  

On 7/12/07, Thilo Goetz <[EMAIL PROTECTED]> wrote:


What's unclear about this method?

  

You can get a Type object that represents a typed-array, but there is
no way to create an instance of such an array.  What good is it then
to get the Type object?

-Adam



:-) So the whole concept is useless.  Remind me why we
have parametric arrays?
  
Two uses currently:  One is XMI serialization - it makes use of this 
info for a much more compact serialized form. 
The second: JCasGen uses this info when generating cover functions to do 
compile-time checking of arguments, and returning the right class of 
result.  So, if you have an FSArray of Foo objects, defined as the 
type:feature MyType:FooArray, the setters and getters for elements of 
this type,  anInstanceOfMyType.setFooArray(index, value) has the method 
parameter type for "value" be of class Foo, rather than of class Top, 
and anInstanceOfMyType.getFooArray(index) returns an instance of Foo class.
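
As a hand-written illustration of the shape of such cover functions (this is not actual JCasGen output; the classes Top, Foo and MyType below are plain-Java stand-ins, and real JCas classes are generated and backed by the CAS), the typed accessors look roughly like this:

```java
// Stand-ins for the CAS types; hypothetical, for illustration only.
class Top {
}

class Foo extends Top {
  final String name;

  Foo(String name) {
    this.name = name;
  }
}

class MyType extends Top {
  // Untyped storage, the way a general array of feature structures holds its elements.
  private final Top[] fooArray;

  MyType(int size) {
    fooArray = new Top[size];
  }

  // Because the feature is declared as an array of Foo, the accessors can use
  // Foo instead of Top, so passing the wrong class fails at compile time.
  Foo getFooArray(int index) {
    return (Foo) fooArray[index];
  }

  void setFooArray(int index, Foo value) {
    fooArray[index] = value;
  }
}
```

With accessors typed only as Top, the caller would have to cast every result, and a wrong element type would show up only at run time.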


Perhaps there are other uses, but those come to mind right now. -Marshall

--Thilo



  




taking a crack at writing a xmi-serialization in multiple threads test case

2007-07-12 Thread Marshall Schor
I'm going to try and make a test case to see if I can duplicate Greg's 
slowdowns.  If anyone has beaten me to it, please let me know right away ;-)


-Marshall


Re: taking a crack at writing a xmi-serialization in multiple threads test case

2007-07-12 Thread Marshall Schor
I committed a new test case to 
org.apache.uima.cas.impl.XmiCasDeserializerTest.  It makes "n" copies of 
the largest test CAS, and then starts 1, 2, 4, 8, ... n threads to 
serialize them out using XMI serialization.  It prints out the normalized 
times for each of these runs. 
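
The general shape of such a test can be sketched as follows. This is not the committed test: the class ThreadScalingTimer is hypothetical, the Runnable is a CPU-bound placeholder standing in for XMI serialization of one CAS copy, and dividing elapsed time by the thread count is just one plausible way to "normalize".

```java
// Hypothetical timing harness; not the code in XmiCasDeserializerTest.
class ThreadScalingTimer {

  // Runs `work` once on each of nThreads threads and returns the elapsed
  // wall-clock time in milliseconds.
  static long timeRun(int nThreads, final Runnable work) {
    Thread[] threads = new Thread[nThreads];
    for (int i = 0; i < nThreads; i++) {
      threads[i] = new Thread(work);
    }
    long start = System.currentTimeMillis();
    for (Thread t : threads) {
      t.start();
    }
    for (Thread t : threads) {
      try {
        t.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException(e);
      }
    }
    return System.currentTimeMillis() - start;
  }

  public static void main(String[] args) {
    // Placeholder workload; a real test would serialize a CAS here.
    Runnable work = new Runnable() {
      public void run() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 200000; i++) {
          sb.append(i % 10);
        }
        if (sb.length() == 0) {
          throw new IllegalStateException("unexpected empty result");
        }
      }
    };
    for (int n = 1; n <= 16; n *= 2) {
      long ms = timeRun(n, work);
      // Normalized time: elapsed milliseconds per unit of work.
      System.out.println(n + " threads: " + ((double) ms / n) + " ms per thread");
    }
  }
}
```

If the normalized time rises as threads are added beyond the number of cores, that points to contention; if it stays flat or falls, locking is probably not the bottleneck.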

It seems to work, but doesn't show any degradation when more threads are 
run.  (Perhaps the test case is faulty - please take a look.)

This would confirm my intuition that synchronization locking for the 
"vector" get operation (which is just a fetch from an array) is unlikely 
to be causing any slowdown here. 

Greg - it would be good if you could run this test.  From Eclipse, if 
you have the SVN source checked out, right-click the class and select 
"Run As > JUnit Test".  If it runs out of memory, open the run 
configuration for this test and add something like -Xmx384m to the VM 
arguments.  If it works for you, please see if you can isolate what's 
different in your failing case versus this artificial test.


-Marshall







Re: LGPL Icons

2007-07-12 Thread Marshall Schor

Jörn Kottmann wrote:

Hello,

is it possible to use LGPL icons in the Cas Editor ?

I don't think so.  This is because downloaders of things from Apache 
expect to be free of "obligations" when they do things with what they 
downloaded.

See http://people.apache.org/~cliffs/3party.html.  This lists LGPL under 
Category X: Excluded Licenses.


-Marshall


Re: [Fwd: Re: Iterators: problem when using standard methods in combination with moveTo*]

2007-07-13 Thread Marshall Schor
Those arguments sound convincing to me.  I guess I would only prefer that 
the docs were less mysterious and, in addition to saying "not to mix the 
two styles", stated the reasons, as you have here :-)


-Marshall

Thilo Goetz wrote:

Marshall Schor wrote:
  

It seems to me the only object of confusion arises when users use
next() to get an element and move to the next element, and then use
moveToPrevious, which as you say, may or may not work if the iterator
ended up "invalid" because the next() moved it past the last element.

So the only "improvement" I would think is wanted is to make the moveTo
operations work reliably in these kinds of cases.  Is that hard to do?

-Marshall



What do you mean by "reliably"?  Suppose I'm at the end of an iteration,
and I call moveToNext() 3 more times.  How many times do I need to call
moveToPrevious() to make the iterator valid again?  Does each iterator
implementation need to track that kind of information?

The problem is that this sort of thing is not part of the iterator
contract, and we have quite a few iterator implementations, for various
indexes etc.  When an iterator !isValid(), all bets are off.  You can
call moveToFirst(), moveToLast(), or moveTo(FS) to reset it.

hasNext()/next() were simply not designed for that kind of environment,
which is why I originally implemented a different interface.  I think
that's why, for example, there's no previous() in that API.  Suppose
you were to call next(), then previous().  What would the expected
result be?  Would you expect to get the same element twice, or not?
I personally have no intuition.

So I'll stick with my earlier recommendation: don't mix the two styles.
If you just need to iterate from start to end, by all means use the
java.util.Iterator style.  Else, use the FSIterator API.

--Thilo


  




Re: svn commit: r555746 - /incubator/uima/uimaj/trunk/uimaj-core/src/test/java/org/apache/uima/cas/impl/XmiCasDeserializerTest.java

2007-07-13 Thread Marshall Schor

Michael Baessler wrote:
As far as I know, System.nanoTime() is a Java 5 feature and is not 
available in Java 1.4.

So do we still want to be Java 1.4 compatible?


I think we want the framework to be 1.4 compatible.  It's probably less 
of a requirement for the test cases :-), but I take your point - it's 
pretty hard to run things on 1.4 to check if the test cases require 1.5.

I agree we should fix this to use the 1.4 compatible timer.
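
A minimal sketch of a timer that avoids System.nanoTime(): the class name ElapsedTimer is made up for illustration, and the only JDK call used is System.currentTimeMillis(), which predates Java 5. The price is millisecond rather than nanosecond resolution.

```java
// Hypothetical helper; measures elapsed wall-clock time without System.nanoTime().
class ElapsedTimer {
  // Captured when the timer is created.
  private final long startMillis = System.currentTimeMillis();

  // Milliseconds elapsed since this timer was created.
  long elapsedMillis() {
    return System.currentTimeMillis() - startMillis;
  }
}
```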

-Marshall


-- Michael

[EMAIL PROTECTED] wrote:

Author: schor
Date: Thu Jul 12 13:30:48 2007
New Revision: 555746

URL: http://svn.apache.org/viewvc?view=rev&rev=555746
Log:
No Jira - added a test case to see if threading causes XMI
xmi serialization to slow down (it doesn't seem to, with
up to 32 threads).

Modified:

incubator/uima/uimaj/trunk/uimaj-core/src/test/java/org/apache/uima/cas/impl/XmiCasDeserializerTest.java 



Modified: 
incubator/uima/uimaj/trunk/uimaj-core/src/test/java/org/apache/uima/cas/impl/XmiCasDeserializerTest.java 

URL: 
http://svn.apache.org/viewvc/incubator/uima/uimaj/trunk/uimaj-core/src/test/java/org/apache/uima/cas/impl/XmiCasDeserializerTest.java?view=diff&rev=555746&r1=555745&r2=555746 

== 

--- 
incubator/uima/uimaj/trunk/uimaj-core/src/test/java/org/apache/uima/cas/impl/XmiCasDeserializerTest.java 
(original)
+++ 
incubator/uima/uimaj/trunk/uimaj-core/src/test/java/org/apache/uima/cas/impl/XmiCasDeserializerTest.java 
Thu Jul 12 13:30:48 2007

@@ -29,6 +29,7 @@
 import java.io.StringWriter;
 import java.util.Iterator;
 import java.util.List;
+import java.util.Properties;
 import java.util.Stack;
 
 import javax.xml.parsers.FactoryConfigurationError;

@@ -55,12 +56,15 @@
 import org.apache.uima.cas_data.impl.CasComparer;
 import org.apache.uima.internal.util.XmlAttribute;
 import org.apache.uima.internal.util.XmlElementNameAndContents;
+import org.apache.uima.resource.CasManager;
 import org.apache.uima.resource.metadata.FsIndexDescription;
 import org.apache.uima.resource.metadata.TypeDescription;
+import org.apache.uima.resource.metadata.TypePriorities;
 import org.apache.uima.resource.metadata.TypeSystemDescription;
 import org.apache.uima.resource.metadata.impl.TypePriorities_impl;
 import org.apache.uima.resource.metadata.impl.TypeSystemDescription_impl;
 import org.apache.uima.test.junit_extension.JUnitExtension;
+import org.apache.uima.util.CasCopier;
 import org.apache.uima.util.CasCreationUtils;
 import org.apache.uima.util.FileUtils;
 import org.apache.uima.util.XMLInputSource;
@@ -160,6 +164,88 @@
 xmlReader.setContentHandler(deserHandler3);
 xmlReader.parse(new InputSource(new StringReader(xml)));
   }
+
+  public void testMultiThreadedSerialize() throws Exception {
+    try {
+      File tsWithNoMultiRefs = JUnitExtension.getFile("ExampleCas/testTypeSystem.xml");
+      doTestMultiThreadedSerialize(tsWithNoMultiRefs);
+      File tsWithMultiRefs = JUnitExtension.getFile("ExampleCas/testTypeSystem_withMultiRefs.xml");
+      doTestMultiThreadedSerialize(tsWithMultiRefs);
+    } catch (Exception e) {
+      JUnitExtension.handleException(e);
+    }
+  }
+
+  private static class DoSerialize implements Runnable{
+    private CAS cas;
+
+    DoSerialize(CAS aCas) {
+      cas = aCas;
+    }
+
+    public void run() {
+      try {
+        serialize(cas, null);
+        //serialize(cas, null);
+        //serialize(cas, null);
+        //serialize(cas, null);
+      } catch (IOException e) {
+        e.printStackTrace();
+      } catch (SAXException e) {
+        e.printStackTrace();
+      }
+    }
+  }
+
+  private static int MAX_THREADS = 16;
+  // do as sequence 1, 2, 4, 8, 16 and measure elapsed time
+  private static int [] threadsToUse = new int[] {1, 2, 4, 8, 16/*, 32, 64*/};
+
+  private void doTestMultiThreadedSerialize(File typeSystemDescriptor) throws Exception {
+    // deserialize a complex CAS from XCAS
+    CAS cas = CasCreationUtils.createCas(typeSystem, new TypePriorities_impl(), indexes);
+
+    InputStream serCasStream = new FileInputStream(JUnitExtension.getFile("ExampleCas/cas.xml"));
+    XCASDeserializer deser = new XCASDeserializer(cas.getTypeSystem());
+    ContentHandler deserHandler = deser.getXCASHandler(cas);
+    SAXParserFactory fact = SAXParserFactory.newInstance();
+    SAXParser parser = fact.newSAXParser();
+    XMLReader xmlReader = parser.getXMLReader();
+    xmlReader.setContentHandler(deserHandler);
+    xmlReader.parse(new InputSource(serCasStream));
+    serCasStream.close();
+
+    // make n copies of the cas, so they all share
+    // the same type system
+
+    final CAS [] cases = new CAS[MAX_THREADS];
+
+    for (int i = 0; i < MAX_THREADS; i++) {
+      cases[i] = CasCreationUtils.createCas(cas.getTypeSystem(), new TypePriorities_impl()

Proposed changes re: multi-threading and TypeSystemImpl

2007-07-14 Thread Marshall Schor
In addition to http://issues.apache.org/jira/browse/UIMA-500, which 
reduces lock contention, there are some other changes needed.

TypeSystemImpl is not thread safe, but is used by many threads.

Because it has the property that it is "updated" by one thread, and then 
is read-only for all threads, it can take advantage of the JSR-133 update 
to Java, and avoid lots of synchs or "volatile" refs.

JSR-133 is in Java 1.5 and later, but this post indicates that the part of 
JSR-133 that changes the semantics of intermixing volatile and 
non-volatile refs to allow use of a "guard" flag (such as, in our case, 
the type system "committed" flag) has been put into 1.4 by Sun: 
http://www.cs.umd.edu/~pugh/java/memoryModel/archive/1888.html


By the way, a great reference for all this stuff is 
http://www.cs.umd.edu/~pugh/java/memoryModel/


In light of this, I would propose:

1) We make use of this to implement a one-time "flushing" of any cached 
values when the type system is committed, using a volatile guard value, 
and ensure all other threads that are going to have read-only access to 
the type system do a "get" on the guard value. 

2) Change the impl of getArrayType for array types *that are not already 
defined and committed* to throw an error, or at least log a message that 
this usage is not thread safe.

3) The "SymbolTable" class is a general class, but is only used in the 
TypeSystemImpl for holding the maps from Feature Structure and Type names 
to their int codes.  It tries to be thread safe, but isn't quite.  For 
instance, it synch locks "writes" to the hashmap, but doesn't synch lock 
"reads" of the hashmap, which means that, under the Java memory model, 
threads doing reads may not see the writes done in another thread.


Rather than fix this, I would rather remove the synchs from 
SymbolTable, switch the Vector to ArrayList, and make use of the fact that 
the TypeSystemImpl is used by a single thread until committed, and then is 
"read-only" (not updated).


Other opinions?
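
The "guard flag" idea in proposal 1 can be sketched like this. It is an illustration under assumptions, not the TypeSystemImpl code: the class GuardedTypeTable and its methods are hypothetical, and the visibility guarantee relies on the JSR-133 semantics of volatile discussed above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of publishing read-only data through a volatile guard.
class GuardedTypeTable {
  // Plain, unsynchronized data, written only by the setup thread before commit.
  private final Map<String, Integer> nameToCode = new HashMap<String, Integer>();

  // Volatile guard: the write in commit() happens-before any read that sees true.
  private volatile boolean committed = false;

  // Called by the single setup thread only.
  void addType(String name, int code) {
    if (committed) {
      throw new IllegalStateException("type system already committed");
    }
    nameToCode.put(name, code);
  }

  // Publishes all earlier writes to threads that subsequently read the guard.
  void commit() {
    committed = true;
  }

  // Called by any thread after commit. Reading the guard first is what makes
  // the map contents visible without further locking. Returns -1 if unknown.
  int getCode(String name) {
    if (!committed) {
      throw new IllegalStateException("type system not committed");
    }
    Integer code = nameToCode.get(name);
    return code == null ? -1 : code;
  }
}
```

The map itself needs no synchronization here only because nothing modifies it after commit(); any later update would bring the visibility problem back.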

-Marshall




Re: Question about Aggregate Analysis Engines with Remote

2007-07-16 Thread Marshall Schor

Michael Baessler wrote:

What do you mean by
   "we can add a remoteAE if and only if, the AE is already deployed
   on the remote machine"


Do you use the ComponentDescriptorEditor (CDE) plugin to do that?

In this case, the CDE has to retrieve the meta data of the deployed AE 
to get for example the type system data.
The meta data of the deployed AE must be read to check if there are no 
conflicts when adding the AE to the aggregate.

This step should prevent errors in the aggregate descriptor early.


However, the CDE is designed to operate, even if the remote is not
running.  In that case, it may give some warnings -
saying that it can't read the remote's type system, so that if the
remote is declaring some types that are not declared
elsewhere, then the JCasGen that might be run might be "incomplete".

These are all "warnings" - the CDE will still let you build the
aggregate, and save it.

-Marshall



-- Michael

Benjamin Sznajder wrote:

Hi,

I am trying to build an Aggregate AE with some AEs running remotely.
When we are building the descriptor for the Aggregate AE, we can add a
remoteAE if and only if, the AE is already deployed on the remote 
machine.

It would be useful if we could add these AEs even if they are not
currently deployed.

Benjamin.



  









Re: Deprecate old InstallationController APIs

2007-07-17 Thread Marshall Schor
+1  I don't know of any use of the 2nd way, but I may be just 
un-informed ;-)  Lev Kozakov may know something about this.


-Marshall

Michael Baessler wrote:
When looking at the InstallationController code I see some old methods 
that I think are never used.


The InstallationController can be configured in two different ways to 
install a PEAR file.


The first way is to specify the PEAR file location with a target 
installation directory, along with some other additional settings. I think 
this is the common way...


The second way is to specify only the PEAR file component ID with a 
target installation directory, along with some other additional settings. 
When using this way the user will be asked, either via the command line or 
with a separate GUI, to specify the path where the PEAR file with the 
specified component ID is located. I don't think that this API is 
used by any user.


With the changes to the InstallationController for the "target 
installation directory cleanup" I don't want to update the additional 
parameters for both ways described above. I don't think that the 
second way is in use, so I want to deprecate it (or even better remove 
it completely ??).


What do others think about this? Does anyone know a user that uses the 
second way?


-- Michael






Re: [jira] Updated: (UIMA-458) For creating new UIMA descriptors in Eclipse, make accelerator keys work better

2007-07-17 Thread Marshall Schor
Good points - I'll keep both versions from now on.   I was trying to 
clean up Jira a bit to see if we had any remaining issues that needed 
attention in this release.


There are some which I didn't know anything about - so I left them as 
"affects 2.1" (only).  These are mostly things Thilo and Adam might know 
the status of... please have a look :-)


-Marshall

Adam Lally wrote:

On 7/17/07, Thilo Goetz <[EMAIL PROTECTED]> wrote:

I wonder: is it a good idea to change the "affects version" label
for these issues?  I mean, they do affect 2.1, and since they haven't
been fixed it's reasonable to assume that they affect 2.2 also.
Is there an advantage to changing this?



I think you can actually select multiple versions for the "Affects
Version/s" label.  So I definitely wouldn't remove 2.1.

As to whether it's worth the effort to add 2.2, it might make sense to
do this to support using the JIRA search to search for issues
affecting a particular version.  I'm assuming JIRA doesn't do the
inference that unresolved issues affecting 2.1 also affect 2.2.

-Adam






Re: [jira] Commented: (UIMA-498) TAEConfiguratorPlugin throws NullPointer during activation

2007-07-17 Thread Marshall Schor
Good point.  I'm choosing to defer this until after 2.2 is out to 
avoid having to retest...


-Marshall

Adam Lally (JIRA) wrote:
[ https://issues.apache.org/jira/browse/UIMA-498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12513246 ] 


Adam Lally commented on UIMA-498:
-

Any 3.0-compatibility stuff can be dropped completely now.  We decided to drop 
support for 3.0 and have rewritten our plugin manifests in the 3.1+ style.

  

TAEConfiguratorPlugin throws NullPointer during activation
--

Key: UIMA-498
URL: https://issues.apache.org/jira/browse/UIMA-498
Project: UIMA
 Issue Type: Bug
 Components: Eclipse plugins
   Affects Versions: 2.2
   Reporter: Jörn Kottmann
Fix For: 2.2


TAEConfiguratorPlugin throws an NPE during activation if the 
org.eclipse.platform plugin is missing (but it's not required, i.e. it is 
not listed in the manifest). 
The plugin tries to retrieve the version of the org.eclipse.platform plugin.
I suggest removing the code, since the version is not used after retrieval. 
The code in question is located in the static initializer of the class. 
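
To illustrate the failure mode described above, here is a small self-contained sketch. It is not the TAEConfiguratorPlugin code: a Map stands in for the plugin registry, and every name in it is hypothetical. It contrasts an unguarded version lookup, which throws a NullPointerException when the plugin is absent, with a guarded one.

```java
import java.util.Map;

/** Sketch only: a Map stands in for the plugin registry; all names are hypothetical. */
final class PlatformVersionSketch {
  static final String PLATFORM_ID = "org.eclipse.platform";

  /** Stand-in for a plugin lookup; returns null when the plugin is not installed. */
  static String lookupVersion(Map<String, String> installedPlugins, String pluginId) {
    return installedPlugins.get(pluginId);
  }

  /** Unguarded: dereferences the lookup result, so a missing plugin causes an NPE. */
  static int unguardedMajorVersion(Map<String, String> installedPlugins) {
    String version = lookupVersion(installedPlugins, PLATFORM_ID);
    return Integer.parseInt(version.substring(0, version.indexOf('.')));
  }

  /** Guarded: returns -1 instead of failing when the plugin is missing. */
  static int guardedMajorVersion(Map<String, String> installedPlugins) {
    String version = lookupVersion(installedPlugins, PLATFORM_ID);
    if (version == null) {
      return -1;
    }
    int dot = version.indexOf('.');
    return Integer.parseInt(dot < 0 ? version : version.substring(0, dot));
  }
}
```

When such a lookup runs inside a static initializer, the exception prevents the class from initializing at all, which is why removing the unused lookup is the simpler fix.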



  




Re: [jira] Updated: (UIMA-458) For creating new UIMA descriptors in Eclipse, make accelerator keys work better

2007-07-17 Thread Marshall Schor

Possible 2.2 issues waiting some status change before we do a *vote*:

CVD issues: UIMA-420, UIMA-316, UIMA-307

Changing default for TypeSystemMgr.addFeature UIMA-395

Declaration of multiple externalResources with same name UIMA-346

If you know anything about these (such as - they've been fixed :-) 
please update Jira.


Then, take a look at the test plan and see if you think we're ready to 
vote


-Marshall

Marshall Schor wrote:
Good points - I'll keep both versions from now on.   I was trying to 
clean up Jira
a bit to see if we had any remaining issues that needed attention in 
this release.


There are some which I didn't know anything about - so I left as 
affects 2.1 (only).
These are mostly things Thilo and Adam might know the status of... 
please have a look :-)


-Marshall

Adam Lally wrote:

On 7/17/07, Thilo Goetz <[EMAIL PROTECTED]> wrote:

I wonder: is it a good idea to change the "affects version" label
for these issues?  I mean, they do affect 2.1, and since they haven't
been fixed it's reasonable to assume that they affect 2.2 also.
Is there an advantage to changing this?



I think you can actually select multiple versions for the "Affects
Version/s" label.  So I definitely wouldn't remove 2.1.

As to whether it's worth the effort to add 2.2, it might make sense to
do this to support using the JIRA search to search for issues
affecting a particular version.  I'm assuming JIRA doesn't do the
inference that unresolved issues affecting 2.1 also affect 2.2.

-Adam










Re: [jira] Updated: (UIMA-498) TAEConfiguratorPlugin throws NullPointer during activation

2007-07-17 Thread Marshall Schor

Hi Jörn,

I think I would rather do (a) nothing, so retesting is not needed, or if 
we're going to update files, do (b) the real fix (remove the dependency 
on platform) and then retest.

Do you think we can hold off until UIMA version 2.3 for this, or do you 
think it is important to get this right for 2.2?  I'm kind of thinking 
that most Eclipse users would have the "platform" included, but I may be 
unaware of some use cases...


-Marshall

Jörn Kottmann wrote:

Hi Marshall,

we should add the line for org.eclipse.platform dependency to the 
manifest for 2.2.
This will help the Eclipse platform enforce that the plugin is 
run as we tested it.


Jörn

On Jul 17, 2007, at 7:47 PM, Marshall Schor (JIRA) wrote:



 [ 
https://issues.apache.org/jira/browse/UIMA-498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel 
]


Marshall Schor updated UIMA-498:


Fix Version/s: (was: 2.2)
Affects Version/s: 2.1

Plan to fix after 2.2 is out.


TAEConfiguratorPlugin throws NullPointer during activation
--

Key: UIMA-498
URL: https://issues.apache.org/jira/browse/UIMA-498
Project: UIMA
 Issue Type: Bug
 Components: Eclipse plugins
   Affects Versions: 2.1, 2.2
   Reporter: Jörn Kottmann

TAEConfiguratorPlugin throws an NPE during activation if the 
org.eclipse.platform plugin is missing (but it's not required, i.e. it is 
not listed in the manifest).
The plugin tries to retrieve the version of the org.eclipse.platform 
plugin.
I suggest removing the code, since the version is not used after 
retrieval.
The code in question is located in the static initializer of the class.


--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.









Re: Question about Aggregate Analysis Engines with Remote

2007-07-18 Thread Marshall Schor

Thanks for pointing out that the messages are not clear - we'll improve
those.

I think this has been fixed in Jira issues:
http://issues.apache.org/jira/browse/UIMA-402 and
http://issues.apache.org/jira/browse/UIMA-462

If you can build the new release from sources, you can test this :-)

If this is still an issue, please supply a specific sequence of steps so
we can reproduce it.

Thanks. -Marshall

Benjamin Sznajder wrote:

Hi Marshall,

Sorry, but when we try to build an Aggregate AE including some remote
machines with the Eclipse plugin, the user runs into a serious and annoying
problem:
The Eclipse plugin gives an error message and lets the user choose
between 2 (unclear) options:
- revert to last valid: we do not know what that means or what happens to
the file.
- Edit Existing: we enter an annoying situation in which editing other tabs
becomes impossible...

I apologize: my last post was not clear; I was speaking about the Eclipse
plugin specifically.

Benjamin



   
From: Marshall Schor <[EMAIL PROTECTED]>
Date: 16/07/2007 14:59
To: uima-dev@incubator.apache.org
Please respond to: [EMAIL PROTECTED]
Subject: Re: Question about Aggregate Analysis Engines with Remote





Michael Baessler wrote:
  

What do you mean by
   "we can add a remoteAE if and only if, the AE is already deployed
on the remote machine"

do you use the ComponentDescriptorEditor (CDE) plugin to do that?

In this case, the CDE has to retrieve the meta data of the deployed AE
to get for example the type system data.
The meta data of the deployed AE must be read to check if there are no
conflicts when adding the AE to the aggregate.
This step should prevent errors in the aggregate descriptor early.



However, the CDE is designed to operate, even if the remote is not
running.  In that case, it may give some warnings -
saying that it can't read the remote's type system, so that if the
remote is declaring some types that are not declared
elsewhere, then the JCasGen that might be run might be "incomplete".

These are all "warnings" - the CDE will still let you build the
aggregate, and save it.

-Marshall
  

-- Michael

Benjamin Sznajder wrote:


Hi,

I am trying to build an Aggregate AE with some AEs running remotely.
When we are building the descriptor for the Aggregate AE, we can add a
remoteAE if and only if the AE is already deployed on the remote
machine.
It would be useful if we could add these AEs even if they are not
currently deployed.

Benjamin.




  










  





Re: [jira] Commented: (UIMA-510) JCasGen uses an older Java model for merging hand-coded code with generated code, which doesn't support Java beyond the 1.4 level.

2007-07-19 Thread Marshall Schor
I've realized that the issue of Java 1.5 or greater is only an issue for 
users who modify the JCas generated classes, and try and use Java 
constructs beyond 1.4.   This is because JCasGen only generates 1.4 
level code.  So I suggest we add a note about this, perhaps to the known 
issues section of the manual, for now.
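
To make the distinction concrete, here is a small sketch of the kind of hand-written addition that stays within Java 1.4 versus one that goes beyond it. The class below is hypothetical and is not a JCasGen-generated cover class; it only contrasts the two language levels.

```java
import java.util.Iterator;
import java.util.List;

/** Hypothetical helper standing in for hand-written code added to a cover class. */
final class CoverClassAdditionSketch {

  /** Java 1.4 level: raw collection type, explicit Iterator, explicit cast. */
  static int totalLength14(List words) {
    int total = 0;
    for (Iterator it = words.iterator(); it.hasNext();) {
      String word = (String) it.next();
      total += word.length();
    }
    return total;
  }

  /** Java 5 level: generics and the enhanced for loop (constructs beyond 1.4). */
  static int totalLength5(List<String> words) {
    int total = 0;
    for (String word : words) {
      total += word.length();
    }
    return total;
  }
}
```

Only additions of the second kind would be affected by the merge limitation; code written like the first method stays at the level JCasGen itself generates.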


Doing a "local update site" is an option - but this requires making the 
plugins into one or more Eclipse Features.  This is a to-do in Jira ( 
https://issues.apache.org/jira/browse/UIMA-74 ).  I agree this would be 
a great thing to do ;-)


-Marshall

Thilo Goetz wrote:

My opinion: not a problem in principle.  What might lead to user problems is
not that we prereq Eclipse 3.2, but rather that our current plugin install has
no way of checking for the Eclipse version.  Is it possible to have such a
thing as a local update site as part of our distribution?  Something where
we can check what level of Eclipse the user has and issue an error if it's
too low?  I think we've discussed this before, we could check for EMF at the
same time.  That would be very helpful, both to us and our users.

--Thilo

Marshall Schor (JIRA) wrote:
  
[ https://issues.apache.org/jira/browse/UIMA-510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12513727 ] 


Marshall Schor commented on UIMA-510:
-

Investigation shows that the EMF merge code was updated to handle Java 1.5 (and 
later?) as of EMF 2.2, which corresponds to Eclipse 3.2.  How serious a problem 
is it if we have
Eclipse 3.2 as the minimum level of Eclipse we run with?


JCasGen uses an older Java model for merging hand-coded code with generated code,  which doesn't support Java beyond the 1.4 level. 



Key: UIMA-510
URL: https://issues.apache.org/jira/browse/UIMA-510
Project: UIMA
 Issue Type: Bug
 Components: Tools
   Affects Versions: 2.1, 2.2
   Reporter: Marshall Schor

JCasGen has a "merge" functionality to merge user-written code in previous versions of 
the generated JCas cover classes, with regenerated versions of these cover classes, so as to 
preserve the user-written code (new code or modifications).  The functionality is provided by EMF.  
EMF started issuing warning messages that the Java modelling package it uses, "JDOM", was 
not updated for Java versions beyond Java 1.4.  Because of this, running JCasGen gives the 
following message in the Eclipse Error Log: Using the JDOM API when the source compatibility is not 
set to '1.4' or lower can cause unpredictable results.
The fix is to see if our use of EMF for this can be modified to use JDT's AST 
APIs instead.  For EMF itself, there is a property for the Facade Helper Class 
in the GenModel that can be switched to  
org.eclipse.emf.codegen.merge.java.facade.ast.ASTFacadeHelper - this might be 
where to start looking.
  




  




Re: File uima-docbook/TODO

2007-07-20 Thread Marshall Schor

I think these TODOs came over from the original Velocity project.
+1 to delete.  These are not TODOs for our use, I think.

-Marshall

Thilo Goetz wrote:

Do we need this file?  Some TODOs seem out of date,
for the others we could maybe open Jira issues?

--Thilo


  




Re: Files without license headers in docbook project

2007-07-20 Thread Marshall Schor

Thilo Goetz wrote:

As you may have noticed, I tagged a first release candidate
earlier today.  Subsequently, RAT found a number of issues.
These are the remaining ones.  As soon as these are resolved,
I'll build another release candidate.

--Thilo

Thilo Goetz wrote:
  

RAT complains about these:

 !? 
uimaj-2.2.0-incubating/uima-docbooks/src/docbook/uima/organization/name-conventions.txt
 !? 
uimaj-2.2.0-incubating/uima-docbooks/src/docbook/uima/organization/olink-names
 !? 
uimaj-2.2.0-incubating/uima-docbooks/src/docbook/uima/organization/profile_conditions
 !? 
uimaj-2.2.0-incubating/uima-docbooks/src/docbook/uima/organization/roles.txt

I deleted the last two - they are not used and not needed.  The first 
two I added the Apache license to.

 !? uimaj-2.2.0-incubating/uima-docbooks/src/olink/olink_db_html.xml
 !? uimaj-2.2.0-incubating/uima-docbooks/src/olink/olink_db_htmlsingle.xml
 !? uimaj-2.2.0-incubating/uima-docbooks/src/olink/olink_db_pdf.xml


Added Apache license to these files.

 !? 
uimaj-2.2.0-incubating/uima-docbooks/src/olink/overview_and_setup/htmlsingle-target.db
 !? 
uimaj-2.2.0-incubating/uima-docbooks/src/olink/overview_and_setup/pdf-target.db
 !? 
uimaj-2.2.0-incubating/uima-docbooks/src/olink/references/htmlsingle-target.db
 !? uimaj-2.2.0-incubating/uima-docbooks/src/olink/references/pdf-target.db
 !? 
uimaj-2.2.0-incubating/uima-docbooks/src/olink/tools/htmlsingle-target.db
 !? uimaj-2.2.0-incubating/uima-docbooks/src/olink/tools/pdf-target.db
 !? 
uimaj-2.2.0-incubating/uima-docbooks/src/olink/tutorials_and_users_guides/htmlsingle-target.db
 !? 
uimaj-2.2.0-incubating/uima-docbooks/src/olink/tutorials_and_users_guides/pdf-target.db


These files all are "generated" by Docbook, so can't have a license.



 !? 
uimaj-2.2.0-incubating/uima-docbooks/src/styles/titlepage/titlepage-html.xsl
 !? 
uimaj-2.2.0-incubating/uima-docbooks/src/styles/titlepage/titlepage-pdf.xsl

These files are both "generated" by Docbook processing, from the 
corresponding .xml.  The .xml versions have the license.


Cheers, -Marshall


Re: Files without license headers in docbook project

2007-07-20 Thread Marshall Schor

Oops -

They're already *not* in SVN. 

So they shouldn't be in our source distribution.  If they are, we can 
fix the assembly build step.


-Marshall

Marshall Schor wrote:

Thilo Goetz wrote:

Marshall Schor wrote:
 

Thilo Goetz wrote:


[...]
 

These files all are "generated" by Docbook, so can't have a license.


   

 !?
uimaj-2.2.0-incubating/uima-docbooks/src/styles/titlepage/titlepage-html.xsl 



 !?
uimaj-2.2.0-incubating/uima-docbooks/src/styles/titlepage/titlepage-pdf.xsl 





These files are both "generated" by Docbook processing, from the
corresponding .xml.  The .xml versions have the license.

Cheers, -Marshall



So if they're generated, why are they checked in?  And why
"generated" in quotes, anything remarkable about this
generation?  I think I asked the same question at the last
release, but don't remember the answer ;-)
  
Good point.  They don't need to be checked in :-).  Over time, the 
ant script has evolved to do a "make-like"
thing and rebuild them if the source date is more recent or they're 
not present.  They're very quick to re-build,
unlike the OLINK databases (which is the only reason those are checked 
in, as I recall).


I'll delete these 2 xslt files from SVN.

-Marshall






Re: Files without license headers in docbook project

2007-07-20 Thread Marshall Schor

Thilo Goetz wrote:

Marshall Schor wrote:
  

Thilo Goetz wrote:


[...]
  

These files all are "generated" by Docbook, so can't have a license.




 !?
uimaj-2.2.0-incubating/uima-docbooks/src/styles/titlepage/titlepage-html.xsl

 !?
uimaj-2.2.0-incubating/uima-docbooks/src/styles/titlepage/titlepage-pdf.xsl




These files are both "generated" by Docbook processing, from the
corresponding .xml.  The .xml versions have the license.

Cheers, -Marshall



So if they're generated, why are they checked in?  And why
"generated" in quotes, anything remarkable about this
generation?  I think I asked the same question at the last
release, but don't remember the answer ;-)
  
Good point.  They don't need to be checked in :-).  Over time, the 
ant script has evolved to do a "make-like"
thing and rebuild them if the source date is more recent or they're not 
present.  They're very quick to re-build,
unlike the OLINK databases (which is the only reason those are checked 
in, as I recall).
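
The "make-like" rule described above (rebuild when the target is missing or older than its source) can be sketched in a few lines of Java. This is an illustration of the rule only, not the actual ant script; the class and method names are made up for the example.

```java
import java.io.File;

/** Sketch of a make-like up-to-date check; names are illustrative only. */
final class UpToDateSketch {

  /** Pure form of the rule: rebuild if the target is absent or the source is newer. */
  static boolean needsRebuild(long sourceModified, boolean targetExists, long targetModified) {
    return !targetExists || sourceModified > targetModified;
  }

  /** File-based convenience wrapper around the same rule. */
  static boolean needsRebuild(File source, File target) {
    return needsRebuild(source.lastModified(), target.exists(), target.lastModified());
  }
}
```

For example, the generated titlepage .xsl files would be rebuilt whenever the corresponding .xml source has a later modification time, or when the .xsl files are not there at all.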


I'll delete these 2 xslt files from SVN.

-Marshall


Re: Files without license headers in docbook project

2007-07-20 Thread Marshall Schor
I fixed the assembly step for the src distr to exclude the .xsl files - 
they were getting created by the running of the docbook build script.

Tested - works.  See commit log for details of the 1 line change, which 
was to add src/styles/titlepage/*.xsl in the right spot.

-Marshall

Marshall Schor wrote:

Oops -

They're already *not* in SVN.
So they shouldn't be in our source distribution.  If they are, we can 
fix the assembly build step.


-Marshall

Marshall Schor wrote:

Thilo Goetz wrote:

Marshall Schor wrote:
 

Thilo Goetz wrote:


[...]
 

These files all are "generated" by Docbook, so can't have a license.


  

 !?
uimaj-2.2.0-incubating/uima-docbooks/src/styles/titlepage/titlepage-html.xsl 



 !?
uimaj-2.2.0-incubating/uima-docbooks/src/styles/titlepage/titlepage-pdf.xsl 





These files are both "generated" by Docbook processing, from the
corresponding .xml.  The .xml versions have the license.

Cheers, -Marshall



So if they're generated, why are they checked in?  And why
"generated" in quotes, anything remarkable about this
generation?  I think I asked the same question at the last
release, but don't remember the answer ;-)
  
Good point.  They don't need to be checked in :-).  Over time, the 
ant script has evolved to do a "make-like"
thing and rebuild them if the source date is more recent or they're 
not present.  They're very quick to re-build,
unlike the OLINK databases (which is the only reason those are 
checked in, as I recall).


I'll delete these 2 xslt files from SVN.

-Marshall










Re: uimaj-2.2.0-RC2

2007-07-23 Thread Marshall Schor

Michael Baessler wrote:

Hi,

I see Marshall has fixed the RAT issues reported by Thilo and the JIRA 
bug tracking system says that we have no open issues to fix for uimaj 
2.2.


So it seems that we are ready to build the (hopefully) last release 
candidate level.

+1


But what do we do with all the issues that hang around in the 
"resolved" state. I think the assignee of the defect should verify the 
resolved issue for the releases
2.1 and 2.2 before we build the next level. 

+1
I don't think that we need a new level to resolve these issues, or am I 
wrong here?
I agree.  The things in the resolved state should be changed to close 
state (unless the person doing the change has an issue, of course).
I went thru the resolved / not closed, and for many where I knew the 
situation, changed them to closed. 


I know Eddie will try and finish up some doc changes today, also.


And we still have some open test cases... I can do some more testing 
tomorrow, but I cannot do all of it. So please add your name to the tests 
that you can look at.
When I took a look, the only open tests were the "migration" tools, and 
testing the examples - some of which is done.  (I also tested the SOAP 
example - I'll add that.)


So I think we're good to go, once doc updates are in, and that includes 
perhaps redoing the readme stuff for what's changed to capture the 
additional Jira items relevant to the 2.2 release.


My plan was to build the uimaj-2.2.0-RC2 when all the tests are 
executed and all the defects are verified, so that we just need to 
do a simple regression test on this level. Does this sound good to you?

+1

-Marshall


Re: Annotation viewer testing

2007-07-24 Thread Marshall Schor
I'm no expert - but in the code fragment below - it looks like for 
Windows, the browser launcher is not a browser, but rather just one of 
the command shell windows?
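
One plausible reading, and it is an assumption because the rest of BrowserUtil is not shown in this thread, is that on Windows the shell serves as the launcher because its "start" command opens a URL in the default browser. A sketch of how a command line might be assembled under that assumption (the class and method names are hypothetical):

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch only: assumes the Windows shells are used via their "start" command. */
final class LaunchCommandSketch {

  /** Builds the argument list that would be handed to the operating system. */
  static List<String> buildCommand(String launcher, String url) {
    List<String> command = new ArrayList<String>();
    command.add(launcher);
    if (launcher.equals("cmd.exe") || launcher.equals("command.com")) {
      // Assumption: the shell runs "start <url>", which opens the default browser.
      command.add("/c");
      command.add("start");
    }
    command.add(url);
    return command;
  }
}
```

Note that URLs containing characters the shell treats specially (such as "&") would need extra quoting, which this sketch does not attempt.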

-Marshall


Michael Baessler wrote:

Hi,

when testing the annotation viewer some misconfiguration came up.

The annotation viewer has the possibility to use the browser to show 
the annotation results. By default on Linux the "mozilla" browser is 
used.
On "old" Linux systems, "mozilla" is correct to call, but I think on 
newer Linux systems, "mozilla" was replaced with "firefox". I'm not an 
expert on Linux systems and I'm not sure whether that is true for all of 
them. But if so, we should perhaps change the code that initializes the 
static browser settings in BrowserUtil.java:


/**
 * An initialization block that determines the operating system and
 * the browser launcher command.
 */
static {
  String osName = System.getProperty("os.name");
  if (osName.startsWith("Windows")) {
    if (osName.indexOf("9") > -1) {
      __osId = WINDOWS_9x;
      __browserLauncher = "command.com";
    } else {
      __osId = WINDOWS_NT;
      __browserLauncher = "cmd.exe";
    }
  } else if (osName.startsWith("Linux")) {
    __osId = LINUX;
    __browserLauncher = "mozilla";
  } else {
    __osId = OTHER;
    __browserLauncher = "netscape";
  }
}

Do we have an expert that knows what is the best setting here?

-- Michael






Re: Annotation viewer testing

2007-07-24 Thread Marshall Schor
Some history:  Debian (a Linux distributor) has a legal issue with 
Mozilla (and Firefox), around trademarks; google this or see 
http://lwn.net/Articles/118268/

The issues appear to involve firefox as well.  Debian's browser is named 
Iceweasel.


-Marshall

Michael Baessler wrote:

Hi,

when testing the annotation viewer some misconfiguration came up.

The annotation viewer has the possibility to use the browser to show 
the annotation results. By default on Linux the "mozilla" browser is 
used.
On "old" Linux systems, "mozilla" is correct to call, but I think on 
newer Linux systems, "mozilla" was replaced with "firefox". I'm not an 
expert on Linux systems and I'm not sure whether that is true for all of 
them. But if so, we should perhaps change the code that initializes the 
static browser settings in BrowserUtil.java:


/**
 * An initialization block that determines the operating system and
 * the browser launcher command.
 */
static {
  String osName = System.getProperty("os.name");
  if (osName.startsWith("Windows")) {
    if (osName.indexOf("9") > -1) {
      __osId = WINDOWS_9x;
      __browserLauncher = "command.com";
    } else {
      __osId = WINDOWS_NT;
      __browserLauncher = "cmd.exe";
    }
  } else if (osName.startsWith("Linux")) {
    __osId = LINUX;
    __browserLauncher = "mozilla";
  } else {
    __osId = OTHER;
    __browserLauncher = "netscape";
  }
}

Do we have an expert that knows what is the best setting here?

-- Michael






Re: Annotation viewer testing

2007-07-24 Thread Marshall Schor

Some useful leads:

http://www.linuxworld.com/news/2006/101106-portland-project.html describes
xdg-utils, and says Fedora, OpenSUSE, and Debian have already committed 
to installing the utilities.
One utility enables visiting a Web page in the user's chosen browser. 

On Debian, this capability is available via the command "x-www-browser", 
which launches the user's graphical browser of choice.
OSDL's xdg-utils extends the availability across distributions. 

See http://portland.freedesktop.org/wiki/ for details on xdg-utils.   It 
is licensed under the MIT license, and
the MIT/X11 license is listed as category "A" (allowed) on Cliff's 3rd 
party page.
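
Following these leads, one option is to try a list of launcher commands in order of preference and keep "mozilla" as the last resort. The sketch below is only an illustration: the class name, the candidate order, and the PATH probe are assumptions, and "xdg-open" (the xdg-utils command for opening a URL in the user's preferred application) is not named in the articles above.

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

/** Sketch only: picks the first available launcher from an assumed preference list. */
final class LinuxBrowserChooserSketch {
  /** Assumed preference order; "mozilla" is the current default and the fallback. */
  static final List<String> CANDIDATES =
      Arrays.asList("xdg-open", "x-www-browser", "firefox", "mozilla");

  /** Returns the first candidate the probe reports as available, else "mozilla". */
  static String choose(Predicate<String> isAvailable) {
    for (String candidate : CANDIDATES) {
      if (isAvailable.test(candidate)) {
        return candidate;
      }
    }
    return "mozilla";
  }

  /** One possible probe: is there an executable with this name in a PATH directory? */
  static boolean onPath(String command) {
    String path = System.getenv("PATH");
    if (path == null) {
      return false;
    }
    for (String dir : path.split(File.pathSeparator)) {
      if (!dir.isEmpty() && new File(dir, command).canExecute()) {
        return true;
      }
    }
    return false;
  }
}
```

A caller could write LinuxBrowserChooserSketch.choose(LinuxBrowserChooserSketch::onPath) in the Linux branch; passing the probe as a parameter keeps the selection logic testable without touching the file system.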


-Marshall

Michael Baessler wrote:

Hi,

when testing the annotation viewer some misconfiguration came up.

The annotation viewer has the possibility to use the browser to show 
the annotation results. By default on Linux the "mozilla" browser is 
used.
On "old" Linux systems, "mozilla" is correct to call, but I think on 
newer Linux systems, "mozilla" was replaced with "firefox". I'm not an 
expert on Linux systems and I'm not sure whether that is true for all of 
them. But if so, we should perhaps change the code that initializes the 
static browser settings in BrowserUtil.java:


/**
 * An initialization block that determines the operating system and
 * the browser launcher command.
 */
static {
  String osName = System.getProperty("os.name");
  if (osName.startsWith("Windows")) {
    if (osName.indexOf("9") > -1) {
      __osId = WINDOWS_9x;
      __browserLauncher = "command.com";
    } else {
      __osId = WINDOWS_NT;
      __browserLauncher = "cmd.exe";
    }
  } else if (osName.startsWith("Linux")) {
    __osId = LINUX;
    __browserLauncher = "mozilla";
  } else {
    __osId = OTHER;
    __browserLauncher = "netscape";
  }
}

Do we have an expert that knows what is the best setting here?

-- Michael






Re: Source in the binary release

2007-07-25 Thread Marshall Schor

Michael Baessler wrote:

Adam Lally wrote:

When checking through the Resolved issues assigned to me I noticed
that one of them was the addition of jars containing our source code,
as part of our binary release.  I must have missed that when it went
in.
The current uimaj-2.2.0-03 does not contain the issue UIMA-499. That 
issue will be in the next level.


I'm a little uneasy about this.  Don't some companies have issues with
their people downloading source code?  Does this create a barrier for
UIMA to be used?
Good point. I think if someone just downloads the binary release, he 
doesn't expect to also get the source code. So, thinking about this 
again, I would say that we shouldn't add the source jars to our binary 
distribution.
+1.  But what folks want/need is the javadocs in some way that can be 
accessed by users (not developers) of UIMA.


What about having a separate developer release package that contains 
the binary distribution package with the additional src jars?


What do other apache projects do here - is it common for binary
distributions to contain the source?  I think this perhaps deserves
some discussion.

Maybe one of our mentors can advise us here...
I think for developers of UIMA, they will download the source 
distribution, which of course has all the source ;-)

The issue seems to be how to "conveniently" attach the JavaDocs (which 
are distributed with the binary distribution, and also available via 
http from our website).

In Eclipse, the attachment of JavaDocs for Jar files in the class path 
is kept in the .classpath file. 
Our distribution includes these for the "examples" project - so if you 
download the binary UIMA distribution, run the post-unzip "fixup", and 
then "import" the Examples project into Eclipse, you get the javadoc 
attachments already done.

An issue might be: what about new projects?  An alternative is to copy 
the examples project - giving
it the new project's name, and then deleting all its contents.  You then 
have the classpath set up
properly. 

Another (better?) alternative is to use Eclipse's ability to collect a 
set of jars into a named "library".
Then you can specify that library as something to add to a classpath.  
When you define the named
library, you can set its javadoc location.  Named libraries are kept in 
the workspace, and can
easily be added to any project.  So you would need to do this just once 
per workspace.


Given this, I recommend we document this better (I updated the JavaDocs 
chapter in the reference), and *not* ship the sources in the bin 
distribution for the reasons cited above. 


-Marshall



Re: uimaj-2.2.0-RC2

2007-07-25 Thread Marshall Schor
This build has the sources included in the binary distribution.   I 
think we agree to remove this, right?


-Marshall

Michael Baessler wrote:
I have built the next release candidate level for uimaj-2.2.0 
(uimaj-2.2.0-RC2). I am currently uploading the level to people.a.o 
(finished in about 30 minutes).

The level will be available at /home/mbaessler/distributions.

I also uploaded the RAT reports with some comments... please have a look.

The level contains the following JIRA issues:
https://issues.apache.org/jira/browse/UIMA-500 : [#UIMA-500] Reduce 
excessive synch lock contention caused by calls to ll_isValidTypeCode 
that are not needed - ASF JIRA
https://issues.apache.org/jira/browse/UIMA-473 : [#UIMA-473] Update 
README and RELEASE_NOTES - ASF JIRA
https://issues.apache.org/jira/browse/UIMA-465 : [#UIMA-465] Need 
getViewIterator() method to work with a variable number of views - ASF 
JIRA
https://issues.apache.org/jira/browse/UIMA-507 : [#UIMA-507] Remove 
ref to gutenberg.org to avoid licensing entanglement possibility - ASF 
JIRA
https://issues.apache.org/jira/browse/UIMA-494 : [#UIMA-494] 
AnalysisEngineDescription_impl indirectly uses problematic method 
URL.equals() - ASF JIRA
https://issues.apache.org/jira/browse/UIMA-508 : [#UIMA-508] Docbook 
build tool - not updating the olink databases unless running the full 
4-book build - ASF JIRA
https://issues.apache.org/jira/browse/UIMA-492 : [#UIMA-492] uimaj-cpe 
test failures on some machines when run from maven - ASF JIRA
https://issues.apache.org/jira/browse/UIMA-316 : [#UIMA-316] CVD does 
not display auto-indexes correctly - ASF JIRA
https://issues.apache.org/jira/browse/UIMA-499 : [#UIMA-499] Add 
source jars to binary distribution - ASF JIRA
https://issues.apache.org/jira/browse/UIMA-496 : [#UIMA-496] PEAR API 
does not delete the PEAR ID subdirectory before the new content is 
installed - ASF JIRA
https://issues.apache.org/jira/browse/UIMA-307 : [#UIMA-307] Fix CVD 
screenshots - ASF JIRA


-- Michael

Marshall Schor wrote:

Michael Baessler wrote:

Hi,

I see Marshall has fixed the RAT issues reported by Thilo and the 
JIRA bug tracking system says that we have no open issues to fix for 
uimaj 2.2.


So it seems that we are ready to build the (hopefully) last release 
candidate level.

+1


But what do we do with all the issues that hang around in the 
"resolved" state. I think the assignee of the defect should verify 
the resolved issue for the releases
2.1 and 2.2 before we build the next level. 

+1
I don't think that we need a new level to resolve these issues, or 
am I wrong here?
I agree.  The things in the resolved state should be changed to close 
state (unless the person doing the change has an issue, of course).
I went thru the resolved / not closed, and for many where I knew the 
situation, changed them to closed.

I know Eddie will try and finish up some doc changes today, also.


And we still have some open test cases... I can do some more testing 
tomorrow, but I cannot do all of it. So please add your name to the tests 
that you can look at.
When I took a look, the only open tests were the "migration" tools, 
and testing the examples - some of which is done.  (I also tested the 
SOAP example - I'll add that.)


So I think we're good to go, once doc updates are in, and that 
includes perhaps redoing the readme stuff for what's changed to 
capture the additional Jira items relevant to the 2.2 release.


My plan was to build the uimaj-2.2.0-RC2 when all the tests are 
executed and all the defects are verified, so that we just need 
to do a simple regression test on this level. Does this sound good to you?

+1

-Marshall










Re: Source in the binary release

2007-07-27 Thread Marshall Schor

Thanks, Michael; nice job :-)

I'm going to update the documentation so that the same sections (where 
you took out the info about how to attach the source) describe how to 
attach the javadocs, which are included in the binary distribution.


-Marshall

Michael Baessler wrote:

Michael Baessler wrote:

Adam Lally wrote:

When checking through the Resolved issues assigned to me I noticed
that one of them was the addition of jars containing our source code,
as part of our binary release.  I must have missed that when it went
in.
The current uimaj-2.2.0-03 does not contain the issue UIMA-499. That 
issue will be in the next level.


I'm a little uneasy about this.  Don't some companies have issues with
their people downloading source code?  Does this create a barrier for
UIMA to be used?
Good point. I think if someone just downloads the binary release, he 
doesn't expect to also get the source code. So, thinking about this 
again, I would say that we shouldn't add the source jars to our binary 
distribution.

What about having a separate developer release package that contains 
the binary distribution package with the additional src jars?
I opened issue UIMA-514 to remove the source jars from the binary 
distribution package. Please review my changes.


-- Michael






Re: Source in the binary release

2007-07-27 Thread Marshall Schor

Adam Lally wrote:

On 7/26/07, Michael Baessler <[EMAIL PROTECTED]> wrote:

So should we add the source jars to the source release?



I don't think that is the normal Apache thing to do.  On
http://incubator.apache.org/guides/releasemanagement.html it defines
source release as "a simple export from the repository."  I think
putting derived artifacts in the source distribution should probably
be avoided at least for the most part.

We could consider having a third distribution which is like the binary
distribution but also has source in the jar files.  But OTOH there's a
nice simplicity to having just two distributions - binary and source -
to build, test, vote on, and post on our website for download.

Do I get this right that this is mostly about making the javadocs
available in Eclipse?  Even people who don't want the source code
might want that, so I think it would be nicer to find a different way
to enable that.

A thought: is it possible to have our Eclipse plugins define a "New
UIMA Project" wizard that automatically creates a new project with all
the right jar and javadoc references?

+1 good idea; any volunteers? -Marshall


Re: CASEditor cas processors

2007-07-27 Thread Marshall Schor

Jörn Kottmann wrote:


I played around with the CasEditor and tried to add a cas processor,
but unfortunately without success. I was able to add the cas
processor (annotator) to my project and I found the place in the
context menu to run it, but I always get ClassNotFound exceptions...
which makes sense, since I never specified the classpath for the component.
Since we are in an Eclipse environment, I guess we can only add
plugin dependencies to the classpath. So I have to package my
annotator component as a plugin to add it as a dependency of the
CasEditor... is that right?


That's correct, they must be loaded from the class loader which loaded
the uima runtime plugin.
Would a better design be to consider the user has a "project" he's 
working in, and use the classpath associated with that project?

That's what the CDE does.

-Marshall


Suggestion for updating release notes automatically

2007-07-29 Thread Marshall Schor

In our release notes, the last section is cut/pasted from the release
note generation done by Jira.  We could replace that with a
link to the Jira system to (re)generate on demand the release notes.
The link would be, for instance for 2.2:

http://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12312272&styleName=Html&projectId=12310570&Create=Create
http://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12312272&styleName=Text&projectId=12310570&Create=Create

for the html and text

If we just put those links into our release notes they would always be up
to date :-).
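
To make the pieces of those links explicit, here is a minimal Java sketch
that assembles the same URL from its parameters. The class and method names
are made up for illustration; the parameter values are simply the ones from
the 2.2 links above, and nothing here is part of UIMA or Jira.

```java
// Hypothetical helper, for illustration only.
class ReleaseNoteLink {

    // Builds a Jira "ReleaseNote" link in the same shape as the links above.
    // styleName is "Html" or "Text", as in the two example links.
    static String releaseNoteUrl(String projectId, String versionId, String styleName) {
        return "http://issues.apache.org/jira/secure/ReleaseNote.jspa"
                + "?version=" + versionId
                + "&styleName=" + styleName
                + "&projectId=" + projectId
                + "&Create=Create";
    }

    public static void main(String[] args) {
        // Project and version ids taken from the 2.2 links above.
        System.out.println(releaseNoteUrl("12310570", "12312272", "Html"));
        System.out.println(releaseNoteUrl("12310570", "12312272", "Text"));
    }
}
```

Only the version id would change from release to release, so the link in
the release notes could be updated by editing one number.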

Is this a good idea?

-Marshall







Are we ready to release 2.2?

2007-07-30 Thread Marshall Schor
Except for needing to rerun the release note generation, I'm +1 ! for 
doing the 2.2 release.

Thilo - if you agree, can you call for an official vote?

-Marshall


Re: Source in the binary release

2007-07-30 Thread Marshall Schor

Thilo Goetz wrote:

Adam Lally wrote:
  

-1 to this change.  What exactly is the concern here?
  

My main concern is what I originally said: "Don't some companies have
issues with their people downloading source code?"



Does that concern a large corporation that some of us work
for, or is this a known concern for other companies, too?
  
I asked Ken Coar, one of our mentors, and he said he recalled this being
a problem for some companies.  He mentioned "government" customers.
  

Source code?  We're an open source project, after all.
  

Well we do have a source release.  What is the concern that led to
bundling the source in the binary distribution?  That people shouldn't
have to do two separate downloads?



Our source distribution is a pain to work with.  The source zip has
everything conveniently bundled, all you need to do is associate it
with the UIMA jars in Eclipse, and you can debug through the source
code.  To do this with our source distribution, you need to manually
zip/associate the source code from the various projects.

  

We could have three separate releases - binary, source, and both.  Not
my preference, but preferable to not having a binary-only release at
all.

 -Adam





  




Re: Source in the binary release

2007-07-30 Thread Marshall Schor

Thilo Goetz wrote:

Eddie Epstein wrote:
  

On 7/30/07, Thilo Goetz <[EMAIL PROTECTED]> wrote:


Eddie Epstein wrote:
  

How hard is it to create the source jars from the UIMA source distribution?


Not hard *if* you have our build env. set up (i.e., maven etc).
  

I'm sort of confused. Above you said that users have already been
finding bugs, even without source jars in the binary distribution. I
understand that conceptually there are people that are motivated to
dig into debugging the UIMA framework who don't want to spend the
extra time [30 min?] to download and setup the UIMA source code, but
practically speaking, how often is this the case?

Why do so few open source projects include source jars in their binary
distribution?



It is simply a matter of convenience, nothing more, nothing less.  I
don't know why other projects don't do it.  Some don't have such an
awkward project setup as we do, so it's trivial to take the binary and
source distributions and put them together.
  
Alternatives:

1) We could zip up all our source and post it with the release.  Anyone
wanting to attach it could download it and do so.

2) We could have a version of our bin that had the source in the Jars
with the classes.  This is perhaps the most convenient - users need to
do nothing -- the source is just there.

3) We could make #2 the "default", notify users in BOLD BIG LETTERS that
the "binary" distribution is changing to include the source, and say that
if this is an issue for any of our users, to please let us know, and
we'll then add a version (like we have now) without the source.

-Marshall



Re: Source in the binary release

2007-07-31 Thread Marshall Schor

Thilo Goetz wrote:

Adam Lally wrote:
  

On 7/30/07, Thilo Goetz <[EMAIL PROTECTED]> wrote:


Adam Lally wrote:
  

-1 to this change.  What exactly is the concern here?
  

My main concern is what I originally said: "Don't some companies have
issues with their people downloading source code?"


Does that concern a large corporation that some of us work
for, or is this a known concern for other companies, too?

  

I only know the specifics for corporations that I happen to work for.
:)  Without knowing for sure that it *isn't* a problem for others, I
think it's a risky move to eliminate our binary-only release.

Addressing the convenience issue of dealing with our source release,
could we whip up a script that would add the right source files to the
right jar files?  It wouldn't even need to compile anything so
shouldn't have a dependency on anything but the "jar" command line
tool.



Sure, that would be fine.  It's an additional thing to maintain (as
opposed to the 0-maintenance maven magic ;-), but maybe that's not such
a big deal.

  
I also like this idea.  Ideally, it would work with minimal effort on the
user's part.  The minimum I could think of would be for the user to download
one additional thing and run one command.  It would be good if the user
didn't have to remember to specify some long path...

Maybe we can figure out how to have Eclipse help us here.  I wonder if
this could be packaged as a feature using the update site mechanism.  It
seems to me that many Eclipse technologies come packaged with the source as
separately downloadable things.  Of course, the downside would be that this
wouldn't support the non-Eclipse, alternate IDE user.

So - maybe a first step would be to have all the source in one zip,
available for download, together with a small readme that gives step by
step instructions on how to

a) create an Eclipse "library"
b) attach the source

(Step (a) makes it so you don't have to re-do this for every project.)

-Marshall



-Adam





  




I SVN Updated everything, but am getting test errors with mvn install, on AnnotationTreeTest

2007-07-31 Thread Marshall Schor

Running org.apache.uima.cas.test.AnnotationTreeTest


testTree(org.apache.uima.cas.test.AnnotationTreeTest)  Time elapsed: 0.16 sec  <<< FAILURE!
junit.framework.AssertionFailedError
 at junit.framework.AssertionFailedError.<init>(AssertionFailedError.java:11)
 at junit.framework.Assert.fail(Assert.java:47)
 at junit.framework.Assert.assertTrue(Assert.java:20)
 at junit.framework.Assert.assertTrue(Assert.java:27)
 at org.apache.uima.cas.test.AnnotationTreeTest.testTree(AnnotationTreeTest.java:85)



Does anyone else have this failure?  -Marshall


Re: Source in the binary release

2007-07-31 Thread Marshall Schor

Adam Lally wrote:

Actually I was thinking of something perhaps even easier for the user.
 What I meant was that the script would automatically add the source
files directly into the jar files in the UIMA binary distribution.  So
no action would be necessary at all in Eclipse.

(To locate the binary distribution the script needs the UIMA_HOME
environment variable or could fall back on assuming that the source
dist. was installed in the "src" subdirectory of the binary dist.)
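
For illustration only, the idea could be sketched in Java roughly as below.
This is not the committed scripts (those call the "jar" command line tool);
the class name, the method names and the ".java" filter are assumptions made
for the example. It uses the JDK's zip file system to add files to an
existing jar in place, which is similar in effect to "jar -u".

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.HashMap;
import java.util.stream.Stream;

// Hypothetical sketch, not part of the UIMA distribution.
class SourceJarUpdater {

    // Use UIMA_HOME if it is set; otherwise assume the source distribution
    // sits in the "src" subdirectory of the binary distribution.
    static Path resolveUimaHome(String uimaHomeEnv, Path sourceDistDir) {
        if (uimaHomeEnv != null && !uimaHomeEnv.isEmpty()) {
            return Paths.get(uimaHomeEnv);
        }
        return sourceDistDir.toAbsolutePath().getParent();
    }

    // Copies every .java file under sourceRoot into the existing jar, keeping
    // the relative paths. Entries already in the jar (classes, resources)
    // are left untouched. Returns the number of files added.
    static int addSources(Path jar, Path sourceRoot) throws IOException {
        URI uri = URI.create("jar:" + jar.toUri());
        int added = 0;
        try (FileSystem zip = FileSystems.newFileSystem(uri, new HashMap<String, String>());
             Stream<Path> files = Files.walk(sourceRoot)) {
            for (Path file : (Iterable<Path>) files::iterator) {
                if (!Files.isRegularFile(file) || !file.toString().endsWith(".java")) {
                    continue;
                }
                String relative = sourceRoot.relativize(file).toString().replace('\\', '/');
                Path target = zip.getPath("/" + relative);
                if (target.getParent() != null) {
                    Files.createDirectories(target.getParent());
                }
                Files.copy(file, target, StandardCopyOption.REPLACE_EXISTING);
                added++;
            }
        }
        return added;
    }
}
```

A caller would pass System.getenv("UIMA_HOME") as the first argument of
resolveUimaHome and then run addSources once per jar under the "lib"
directory of the resolved location.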




I took a crack at implementing these scripts and have committed them
to SVN.  I put them in uimaj-distr/src/main/readme_src because this
directory contains files that are copied to the root directory of the
source distribution.  Perhaps that directory should be renamed though.
 I updated the README file (also in readme_src) to explain what the
scripts do.

Let me know what you think.
  
So - the user needs to download the source distribution, unzip it 
somewhere, and then run this script.


Sounds good to me - thanks for doing this!

Some minor points:

Since this will overwrite his lib jars in the UIMA_HOME distr, it would
be good to give a message, followed by an "OK / Cancel" - saying where
UIMA_HOME is pointing to, etc.

Also - the resources need to be included in the jars (they have the
message bundles, etc.).


Maybe just modify the jar command from:

jar -uvf %UIMA_HOME%\lib\uima-tools.jar *  to
jar -uvf %UIMA_HOME%\lib\uima-tools.jar * ../resources/*

-Marshall

-Adam


  




Re: Source in the binary release

2007-07-31 Thread Marshall Schor

Adam Lally wrote:

On 7/31/07, Marshall Schor <[EMAIL PROTECTED]> wrote:
  

Also - the resources need to be included in the jars (they have the
message bundles, etc.).




The resources are already in the jars, so we don't need to add them in
this step.  Just the source files need to be added.
  
oops - I didn't look closely at the -u flag.  I thought you were 
re-building & replacing the jars.


-Marshall



Re: I SVN Updated everything, but am getting test errors with mvn install, on AnnotationTreeTest

2007-08-01 Thread Marshall Schor

Thilo Goetz wrote:

It's our old friend the type priorities again.  Even though the test
case does not use type priorities, types are sorted differently in
the type system depending on JVM version.  So the test case went
through with JDK 1.6_1, but not 1.5_7, for example.  This sucks.
Since I don't have time to go after the root cause of this, I have
modified the test case to remove this dependency.
  
As I recall, the type order changes when there is no ordering specified,
because it depends on the order returned from "hashing"; and the hash
function changed from JDK to JDK.
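
A small Java sketch of the general effect described here (this is not the
UIMA type system code itself): a hash-based collection makes no promise
about iteration order, so anything that depends on it can differ between
JVMs, while an explicit sort gives the same order everywhere. The class and
method names are made up, and the type names are just sample input.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;

// Hypothetical demo class, for illustration only.
class TypeOrderDemo {

    // The iteration order of a HashSet is unspecified; it depends on the
    // hash function and table layout, which may differ between JDK versions.
    static List<String> unspecifiedOrder(Collection<String> names) {
        return new ArrayList<String>(new HashSet<String>(names));
    }

    // Sorting explicitly yields the same order on every JVM.
    static List<String> stableOrder(Collection<String> names) {
        List<String> sorted = new ArrayList<String>(names);
        Collections.sort(sorted);
        return sorted;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(
                "uima.tcas.Annotation", "uima.cas.TOP", "uima.cas.Integer");
        System.out.println("hash order (may vary): " + unspecifiedOrder(names));
        System.out.println("sorted order (stable): " + stableOrder(names));
    }
}
```

A test that compares against the first list can pass on one JDK and fail on
another; a test that compares against the second list, or that ignores order
altogether, does not have that dependency.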


-Marshall

--Thilo

Thilo Goetz wrote:
  

Marshall Schor wrote:


Running org.apache.uima.cas.test.AnnotationTreeTest


testTree(org.apache.uima.cas.test.AnnotationTreeTest)  Time elapsed: 0.16 sec  <<< FAILURE!
junit.framework.AssertionFailedError
 at junit.framework.AssertionFailedError.<init>(AssertionFailedError.java:11)
 at junit.framework.Assert.fail(Assert.java:47)
 at junit.framework.Assert.assertTrue(Assert.java:20)
 at junit.framework.Assert.assertTrue(Assert.java:27)
 at org.apache.uima.cas.test.AnnotationTreeTest.testTree(AnnotationTreeTest.java:85)



Does anyone else have this failure?  -Marshall
  

I see it too, but only when running all tests with mvn install.  Running
the test individually works fine.  I'll investigate.

--Thilo




  




Re: Source in the binary release

2007-08-01 Thread Marshall Schor

Adam Lally wrote:

On 8/1/07, Michael Baessler <[EMAIL PROTECTED]> wrote:
  

I see that issue 499 is still in reopen state. I checked in my changes
using this issue. So I think we can close it, or is there anything else
we need to do?



OK with me to close it.
  
We should update the documentation (3 places?) which describes how to
attach javadocs, to now also mention running those scripts to attach
the source.


-Marshall

-Adam


  




Re: Source in the binary release

2007-08-02 Thread Marshall Schor

Thilo Goetz wrote:

Michael Baessler wrote:
  

Marshall Schor wrote:


We should update the documentation (3 places?) which describes how to
attach javadocs, to now also mention
running those scripts to attach the source.
  

Do you know where exactly the places are that we need to change? I would
like to finish this as soon as possible so that we get the release out
of the door.

-- Michael



I have added a short paragraph to the Eclipse setup section.  If more
documentation is needed elsewhere, feel free to add it.  I'd like to
build a release candidate tomorrow morning and start the vote on it,
if that's ok with everybody (which means I would like to close UIMA-499).
  
I updated the ref javadoc chapter also.  I think that's all that's 
needed.  +1 to closing 499 now.

-Marshall


Re: [VOTE] Release uimaj-2.2.0-RC4 as uimaj-2.2.0-incubating

2007-08-04 Thread Marshall Schor

Adam Lally wrote:

On 8/3/07, Thilo Goetz <[EMAIL PROTECTED]> wrote:
  

I think we're finally ready ;-)  I cut RC3 this morning,
found a minor issue with RAT and did RC4.  Next time I'll
run RAT before tagging the release...

The release artifacts are available on people.a.o at
/home/twgoetz/uima-distributions/2.2/RC4
If somebody (other than me) could check the signatures,
that would be great.




I found an issue - the RELEASE_NOTES still say version 2.1.0 at the
top. :(  I will fix in SVN.
  -Adam
  

Found another via scanning the project for 2.1:  The index.html in the
docbooks (which is the small html that points to the 4 "books") had 2.1
as a version number - changed that to 2.2 also.

Now fixed in SVN.

-Marshall



Re: Make that RC5 [was: uimaj-2.2.0-RC4]

2007-08-06 Thread Marshall Schor

Adam Lally wrote:

On 8/6/07, Thilo Goetz <[EMAIL PROTECTED]> wrote:
  

Thilo Goetz wrote:


I built a new release candidate with Adam's and Marshall's fixes
in.  It's available as usual on people.a.o at
/home/twgoetz/uima-distributions/2.2/RC5.  Let's try a new vote
later today or tomorrow.

  


Looks good.  My functional test scripts all pass on this build.

I see on the test plan Wiki page it still says (with bright orange
background): "Noticing potential failure of "merging" code if JVM
target > 1.4 - need fix in next release".  Did that get addressed?  If
so we may want to update the Wiki.

  
Yes - it was addressed via a known limitations remark.  It is low risk
because JCasGen generates 1.4-level code - it's only an issue if someone
adds Java 5 code to the generated files by hand.  I'll update the wiki.


-Marshall


Re: Make that RC5 [was: uimaj-2.2.0-RC4]

2007-08-06 Thread Marshall Schor

Marshall Schor wrote:

Adam Lally wrote:

On 8/6/07, Thilo Goetz <[EMAIL PROTECTED]> wrote:
 

Thilo Goetz wrote:
   

I built a new release candidate with Adam's and Marshall's fixes
in.  It's available as usual on people.a.o at
/home/twgoetz/uima-distributions/2.2/RC5.  Let's try a new vote
later today or tomorrow.

  


Looks good.  My functional test scripts all pass on this build.

I see on the test plan Wiki page it still says (with bright orange
background): "Noticing potential failure of "merging" code if JVM
target > 1.4 - need fix in next release".  Did that get addressed?  If
so we may want to update the Wiki.

  
Yes - it was addressed via a known limitations remark.  It is low risk
because JCasGen generates 1.4-level code - it's only an issue if someone
adds Java 5 code to the generated files by hand.  I'll update the wiki.


-Marshall


There are 2 issues to fix in JCasGen for the next release: this one, see
http://issues.apache.org/jira/browse/UIMA-510
and another one related to running Eclipse in "headless" mode, which
changed in Eclipse 3.3, see
http://issues.apache.org/jira/browse/UIMA-513


Re: Make that RC5 [was: uimaj-2.2.0-RC4]

2007-08-06 Thread Marshall Schor

Thilo Goetz wrote:

Thilo Goetz wrote:
  

I built a new release candidate with Adam's and Marshall's fixes
in.  It's available as usual on people.a.o at
/home/twgoetz/uima-distributions/2.2/RC5.  Let's try a new vote
later today or tomorrow.


+1 to doing a new vote :-)  -Marshall


Re: [VOTE] Release uimaj-2.2.0-RC5 as uimaj-2.2.0-incubating

2007-08-07 Thread Marshall Schor

Thilo Goetz wrote:

Ok, here we go.  This build passes our regression test
suite, as well as Adam's test scripts.  Michael and I
also ran some manual sanity checks.  RAT report looks
good, too.

The release artifacts are available on people.a.o at
/home/twgoetz/uima-distributions/2.2/RC5

So please cast your vote:

[ ] +1 Release RC5, it's ready
[ ] -1 Don't release yet, I found issues

  
+1.  I verified the asc signatures for the windows zips (bin and src).
Would it make sense for Michael and Thilo to "cross-sign" each other's keys?
That might make the gpg result messages more convincing.  Right now they say


gpg: Signature made 08/06/07 02:16:25 using DSA key ID C874155C
gpg: Good signature from "Thilo Goetz (CODE SIGNING KEY) <[EMAIL PROTECTED]>"
gpg: WARNING: This key is not certified with a trusted signature!
gpg:          There is no indication that the signature belongs to the owner.


Maybe that would reduce the warning?  If you do figure out how to do the
cross signing, please post something to the web site and we'll do it 
here, too.

:-)

-Marshall


Re: Put Maven artifacts up for vote too?

2007-08-07 Thread Marshall Schor

Adam Lally wrote:

As I recall we had a couple of user requests for us to publish our
jars as Maven artifacts in the Apache incubator repository.  And I
think I got our Maven metadata into shape during the 2.1.0 release
process after some comments from Dan Kulp on the [EMAIL PROTECTED] list.

So should we put these up for a vote as well?
  
OK with me.  I verified that in at least one Jar (and Adam should
confirm the build does this for all Jars) there are the following files
in the META-INF directory:

  DISCLAIMER
  LICENSE
  NOTICES

which seem necessary for individual Maven-distributed Jars.

Maven has the ability to download "sources" as well from the
repository, for Eclipse:


mvn -Declipse.downloadSources=true

This tells maven to download all associated sources of jar files in the 
pom.xml. Amazing, how much easier it is to set up a project with proper 
debugging sources etc.


Do we need to do something to "enable" this to work?

-Marshall



Re: Moving to Java 1.5?

2007-08-07 Thread Marshall Schor

Thilo Goetz wrote:

Hi all,

I'm wondering if it's time to move to Java 1.5 after this
release.  While there are still people out there who use
Java 1.4, their numbers are shrinking rapidly.  I'm guessing
that by the time we do our next release, the need for
Java 1.4 compatibility will have gone.

I can see three possible courses we could take:

1) Leave things as they are, keep all of the UIMA Java
distribution 1.4 compatible.  We should do this if we have
reason to believe that demand for 1.4 compatibility will
continue for some time to come.

2) Keep the core jars 1.4 compatible, but allow 1.5 constructs
in our tools.  I'm saying this because I was thinking of
doing some improvements to CVD where I might want to use 1.5.
However, I don't like this approach myself as it complicates
our build process.

3) Allow Java 1.5 constructs everywhere (and change our build
accordingly).  If the developers are all agreed to do this,
we should also check what the folks on uima-user think.
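
As a reminder of what "1.5 constructs" would mean in practice, here is a
minimal sketch of the same loop written in 1.4 style and in 1.5 style. The
class and method names are invented for the example and are not taken from
the UIMA code base.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical demo class, for illustration only.
class Java5Constructs {

    // Java 1.4 style: raw collection, explicit Iterator, cast on each element.
    static int totalLength14(List names) {
        int total = 0;
        for (Iterator it = names.iterator(); it.hasNext();) {
            String name = (String) it.next();
            total += name.length();
        }
        return total;
    }

    // Java 1.5 style: generics and the enhanced for loop, no casts needed.
    static int totalLength15(List<String> names) {
        int total = 0;
        for (String name : names) {
            total += name.length();
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> tools = Arrays.asList("CVD", "JCasGen");
        System.out.println(totalLength14(tools));
        System.out.println(totalLength15(tools));
    }
}
```

The second form does not compile with a 1.4 source level, which is why
option 2 would need separate compiler settings for the core jars and the
tools.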

Let me know what you think.
  
In general, I think users are very slow to move.  However, in this case
I think we can move to 1.5, due to the following:

1) Users needing 1.4 can use release 2.2.  If users are slow to adopt
1.5, they might also be slow to adopt UIMA 2.3 for similar reasons.
2) At least one big conservative computer company with a history of slow
migration also seems to have moved to 1.5 for new releases ;-)

I do note that Eclipse has not moved, though (you can run Eclipse on a
1.4 engine; see http://www.eclipse.org/downloads/moreinfo/jre.php).

Eclipse has some arguments for staying with 1.4 compatibility, but some
things in Eclipse are starting to require 1.5 (e.g., the bridge that allows
running AWT and Swing inside SWT on some platforms requires 1.5, and the
Eclipse J2EE IDE features require 1.5).


But I'm also strongly in favor of posting to the uima-users list to see 
if there are any objections.



-Marshall


Re: Put Maven artifacts up for vote too?

2007-08-08 Thread Marshall Schor

Adam Lally wrote:

On 8/7/07, Adam Lally <[EMAIL PROTECTED]> wrote:
  

Maven automatically generates "sources" jar files (check your local
Maven repository).  Unfortunately these don't have the DISCLAIMER,
LICENSE, NOTICES.  I will see if I can figure out how to get those to
be added.




I've fixed this in SVN (see UIMA-521).

The command to get maven to generate the source jars is:
mvn source:jar

This page describes how to create bundles to upload to a repository:
http://maven.apache.org/guides/mini/guide-central-repository-upload.html

  
Reading the above link - it specifies "requirements" for the pom file 
for things going into
the Maven repository, which our POM doesn't have.  2 I found missing 
were the
 and the  elements. 


-Marshall

-Adam


  




uimaj project pom questions

2007-08-08 Thread Marshall Schor
In the uimaj project's pom, the  element points to itself.  Why 
is this done, and

where is the documentation in Maven which describes what this means?

Also, there are two elements defining the javadoc plugin - is this 
intentional, or should/can

these be merged?

If we are heading toward putting things in the Apache / Maven 
repositories, should we make
our POMs inherit from Apache-wide poms?  I recall some discussion about 
this in incubator - creating a parent for incubating projects, for instance.


-Marshall


Re: uimaj project pom questions

2007-08-08 Thread Marshall Schor

Adam Lally wrote:

On 8/8/07, Marshall Schor <[EMAIL PROTECTED]> wrote:
  

In the uimaj project's pom, the  element points to itself.



I don't see any  element in the uimaj project's pom.
  

right - my oops - I was looking at the uimaj-distr pom, not the uimaj pom.
  

Also, there are two elements defining the javadoc plugin - is this
intentional, or should/can
these be merged?




They are different.  One is in  and just specifies
what version of the javadoc plugin to use.  The other is in
 and has the specifics on javadoc generation when mvn
site:site is run.  (This generates the javadocs for the website - I
put instructions for this on the Wiki a while back.)  There is yet
another javadoc generation command as part of the assembly process,
that builds the javadocs to be included in the distribution.

  

If we are heading toward putting things in the Apache / Maven
repositories, should we make
our POMs inherit from Apache-wide poms?  I recall some discussion about
this in incubator - creating a parent for incubating projects, for instance.




Maybe but I would prefer to deal with that later.  As I recall this is
not a requirement.
  

OK.  In case it is useful, the incubator pom is in the incubator svn:
http://svn.apache.org/repos/asf/incubator/public/trunk/pom/pom.xml and 
short discussion can be found here: 
http://mail-archives.apache.org/mod_mbox/incubator-general/200609.mbox/thread


It has refs to repositories for snapshots and for "releases". 


-Marshall


Re: SVN tags for our releases

2007-08-08 Thread Marshall Schor

Thilo Goetz wrote:

For uimaj-2.2.0 I created a directory, and collected all the
various release candidate tags under it.  For 2.1.0, we
didn't do that, and it's a bit disorganized.  Anybody mind
if I retroactively create a uimaj-2.1.0 dir and move all
2.1 tags under it?  It's not going to affect our SVN history
in any way, it's just a little easier to find things.

Any objections?

--Thilo


  

+1  Marshall


Re: Put Maven artifacts up for vote too?

2007-08-08 Thread Marshall Schor

Adam Lally wrote:

On 8/8/07, Thilo Goetz <[EMAIL PROTECTED]> wrote:
  

Sure, that would be good.  I'm currently building a new level, I'll
create the maven artifacts manually for this one.




I'm actually having some trouble at the moment... For one thing mvn
repository:bundle-create fails when I run it on the root uimaj
project, basically because this project doesn't have any artifact
associated with it.  Of course it would be nice if it would build the
artifacts for the sub-projects...

Next issue is that running it on uimaj-core complains about missing
fields in the POM, which I'm looking into.  This may be a good thing
that it's catching.

  

One more question: I assume that the mvn artifacts do not need to
be signed, at least I didn't see anything about that.  Is that your
understanding also?




I think that is correct.
  

Adam's done a great deal of work getting this Maven thing to go - Thanks!

I've found some places in Apache docs which suggest that the Maven
artifacts themselves do need to be signed.  One of these is the ref to
releasing "Struts", found here:

http://struts.apache.org/2.x/docs/creating-and-signing-a-distribution.html

see steps 6 and 7.

Next, see http://www.apache.org/dev/release-publishing.html
which, while not specific to Maven, seems to indicate the need to sign
individual Jars, etc.


Also, http://mina.apache.org/developer-guide.html in step 2 uses
gpg-sign-all script to sign all the jars.

Some of the jar signing I saw seemed to be of a different ilk than gpg 
signing.

Do you know how the Jar signing described in
http://java.sun.com/docs/books/tutorial/deployment/jar/signindex.html
relates to the gpg signing?

-Marshall


Re: Put Maven artifacts up for vote too?

2007-08-09 Thread Marshall Schor

Adam Lally wrote:

On 8/8/07, Marshall Schor <[EMAIL PROTECTED]> wrote:
  

I've found some places in Apache docs where it seems to be true that the
Maven artifacts themselves,
do need to be signed.  One of these is the ref to releasing "Struts",
found here:
http://struts.apache.org/2.x/docs/creating-and-signing-a-distribution.html

see step 6 and 7.




Yes, looks like they are typically signed.

I just committed a change to the uimaj pom that will cause the
artifacts to be signed when the "mvn deploy" task is run.  You need to
first install gpg and create a key as described in:
http://incubator.apache.org/uima/distribution.html.
  

Great!  You are a true Maven maven :-)  -Marshall




Where we are thinking of posting our incubating Maven artifacts, and the rationale

2007-08-12 Thread Marshall Schor


I read through the info referenced in the note below from Incubator General.



[1] http://www.nabble.com/-POLL--Incubator-Maven-Repository-tf3699415.html#a10344861
[2] http://www.nabble.com/Maven-2-repo-for-incubating-project-releases--tf2008291.html#a5517073


Both of these raised questions of where incubating projects should put
"maven" artifacts.

There is no conclusion that I see here.  Any opinions / rationale on
where we would post our maven artifacts, after the vote?

-Marshall

-------- Original Message --------
Subject:Re: policy on incubating artifacts
Date:   Mon, 30 Jul 2007 23:32:04 -0700
From:   Craig L Russell <[EMAIL PROTECTED]>
Reply-To:   [EMAIL PROTECTED]
To: [EMAIL PROTECTED], Incubator <[EMAIL PROTECTED]>
References: 
<[EMAIL PROTECTED]> 
<[EMAIL PROTECTED]> 
<[EMAIL PROTECTED]>




Hi Steve,

There is plenty of information on the incubator web site. I suggest reading
http://incubator.apache.org/guides/releasemanagement.html (not normative)
http://incubator.apache.org/incubation/Incubation_Policy.html#Releases (normative)


The references below are more discussions on the mechanics of how and  
where to store release artifacts. They are not disputing the process  
of release which is pretty well articulated in the references above.


You can post more detailed questions on [EMAIL PROTECTED]

Craig

On Jul 30, 2007, at 11:10 PM, Gilles Scokart wrote:

Mmm...  It seems that there is no real decision taken on the subject [1].

And it was not the first time that it was discussed. [2]

So, for the moment it is not clear to me what incubator projects can
do, and what they cannot do.


[1] http://www.nabble.com/-POLL--Incubator-Maven-Repository-tf3699415.html#a10344861
[2] http://www.nabble.com/Maven-2-repo-for-incubating-project-releases--tf2008291.html#a5517073


Gilles

2007/7/31, Brett Porter <[EMAIL PROTECTED]>:

I suggest you search the [EMAIL PROTECTED] archives.

On 30/07/2007, at 9:09 PM, Steve Loughran wrote:


What's the policy on publishing stuff in the incubator? Like Ivy
alpha/betas?





--
Gilles SCOKART


Craig Russell
Architect, Sun Java Enterprise System http://java.sun.com/products/jdo
408 276-5638 mailto:[EMAIL PROTECTED]
P.S. A good JDO? O, Gasp!




Re: Where we are thinking of posting our incubating Maven artifacts, and the rationale

2007-08-13 Thread Marshall Schor

Thilo Goetz wrote:

Marshall Schor wrote:
  

I read through the info referenced in the note below from Incubator
General.



[1] http://www.nabble.com/-POLL--Incubator-Maven-Repository-tf3699415.html#a10344861
[2] http://www.nabble.com/Maven-2-repo-for-incubating-project-releases--tf2008291.html#a5517073
  

Both of these raised questions of where incubating projects should put
"maven" artifacts.

There is no conclusion that I see here.  Any opinions / rationale on
where we would
post our maven artifacts, after the vote?

-Marshall


[...]

I don't remember a conclusion either, but I'm ok with putting them in
the incubator repository, as Adam suggested.  AIUI, the incubator has
not positively decided that it's ok for incubating projects to deploy
their artifacts to the central repo.  So the incubator repo it is, as
far as I'm concerned.  I don't feel like kicking off another discussion
on [EMAIL PROTECTED] right now :-)
  
OK - I didn't see that other email.  Looks like we'll have a lot of
company on

/www/people.apache.org/repo/m2-incubating-repository   :-)

-Marshall



Re: How to get a list of allowed values for a type?

2007-08-13 Thread Marshall Schor

Adam Lally wrote:

Hm, it doesn't look like that's possible.  Seems like a missing API
feature.  If you need it, please open a JIRA issue.




I think it's possible only through the low-level TypeSystem API:

LowLevelTypeSystem lts = aTypeSystem.getLowLevelTypeSystem();
return lts.ll_getStringSet(lts.ll_getCodeForType(aType));

Would be good to add to the "high-level" API too.

-Adam


  

A quick search turned up a static public method:
TypeSystemUtil.getAllowedValuesForType(Type aType, TypeSystem aTypeSystem)

This implements what Adam suggested above, but is in the public api 
already I think.

The Javadoc does have a TODO - to make this a method on Type.

-Marshall



Re: [VOTE] Release uimaj-2.2.0-RC8 as uimaj-2.2.0-incubating

2007-08-13 Thread Marshall Schor

Thilo Goetz wrote:

We've had a few days to do regression testing
on this level, and nothing new has come up.

The release artifacts are available on people.a.o at
/home/twgoetz/uima-distributions/2.2/RC8

So please cast your vote:

[ ] +1 Release RC8 as uimaj-2.2.0-incubating, it's ready
[ ] -1 Don't release yet, I found issues

--Thilo
  

I notice that we did not update the RELEASE_NOTES to include UIMA-522.

I hate to stop the release for this kind of thing, but we probably 
should fix it.


I'll regen the RELEASE_NOTES and put into SVN.
-Marshall



Re: [VOTE] Release uimaj-2.2.0-RC8 as uimaj-2.2.0-incubating

2007-08-13 Thread Marshall Schor

Marshall Schor wrote:

Thilo Goetz wrote:

We've had a few days to do regression testing
on this level, and nothing new has come up.

The release artifacts are available on people.a.o at
/home/twgoetz/uima-distributions/2.2/RC8

So please cast your vote:

[ ] +1 Release RC8 as uimaj-2.2.0-incubating, it's ready
[ ] -1 Don't release yet, I found issues

--Thilo
  

I notice that we did not update the RELEASE_NOTES to include UIMA-522.

I hate to stop the release for this kind of thing, but we probably 
should fix it.


I'll regen the RELEASE_NOTES and put into SVN.
-Marshall

Done. -Marshall


Re: [VOTE] Release uimaj-2.2.0-RC8 as uimaj-2.2.0-incubating

2007-08-13 Thread Marshall Schor

Thilo Goetz wrote:

I'd rather not go through another iteration.  Your
documentation update about the thread-safe resources
didn't go in either, but we have to stop sometime.
The fact that we're missing a couple of bug fixes
from the release notes is not going to make me lose
any sleep :-)
  

The thread-safe doc updates can certainly go into a subsequent release.

I remember some people didn't like the idea of changing the release notes
so that, instead of having this information hard-coded in the release notes
(and requiring us to remember to update it prior to doing builds), users
could generate it any time they wanted to.

I think the objection was that by altering Jira records, we could alter
what the generated release notes contained, and they would no longer
accurately reflect what was in the release.

I don't feel very strongly one way or another about this, but lean toward
redoing the release (sorry!) if we want to keep the release notes
"hard-coded" in our distribution.

But if everyone else feels this is just too trivial to re-do the
release, I'll go along :-)


-Marshall

Marshall Schor wrote:
  

Thilo Goetz wrote:


We've had a few days to do regression testing
on this level, and nothing new has come up.

The release artifacts are available on people.a.o at
/home/twgoetz/uima-distributions/2.2/RC8

So please cast your vote:

[ ] +1 Release RC8 as uimaj-2.2.0-incubating, it's ready
[ ] -1 Don't release yet, I found issues

--Thilo
  
  

I notice that we did not update the RELEASE_NOTES to include UIMA-522.

I hate to stop the release for this kind of thing, but we probably
should fix it.

I'll regen the RELEASE_NOTES and put into SVN.
-Marshall




  




Re: [VOTE] Release uimaj-2.2.0-RC8 as uimaj-2.2.0-incubating

2007-08-13 Thread Marshall Schor

+1 to put this one out, then, from me.  -Marshall

Adam Lally wrote:

On 8/13/07, Marshall Schor <[EMAIL PROTECTED]> wrote:
  

I don't feel very strongly one way or another about this, but lean
toward redoing
the release (sorry!) if we want to keep the release notes "hard-coded"
in our
distribution.

But if everyone else feels this is just too trivial to re-do the
release, I'll go along :-)




I lean the other way, towards not redoing the release.  It's close
enough and this has been dragging on for long enough already.

Who knows, maybe the IPMC will find something and force us to redo it anyway. :)

-Adam


  




What is the expected behavior of type system merge for these cases?

2007-08-13 Thread Marshall Schor

1) Type Foo, feature Bar - range type FSArray whose element type is Baz
Type Foo, feature Bar - range type FSArray whose element type is NotBaz

(Should throw an exception?)

2) Type Foo, feature Bar - range type FSArray, whose element type is Baz
Type Foo, feature Bar - range type FSArray whose element type is 
Subtype_of_Baz


(Should be element Type = Subtype_of_Baz?)

3) Type Foo, feature Bar - range type FSArray with 
multipleReferencesAllowed = false
Type Foo, feature Bar - range type FSArray with 
multipleReferencesAllowed = true


(Should throw an exception?)

-Marshall


Re: Unambiguous filtered iterator

2007-08-14 Thread Marshall Schor

I think this thread never got a response.  Can we discuss it now, or am I
mistaken?

-Marshall

Adam Lally wrote:

I got a user request for an unambiguous, filtered iterator that
filters first, then applies the unambiguousness constraint.  (An
unambiguous iterator is one where the next annotation returned is
guaranteed not to overlap with the previous annotation returned.)

It seems easy to do this: all we need to do is create a new instance
of the Subiterator class using new Subiterator(filteredIter), where
filteredIter is the filtered iterator.  Users can't call this directly
since Subiterator is package-private.

What would be a good public API for this?  Two ideas are:

static CAS.createUnambiguousIterator(Iterator iter)
static AnnotationIndex.createUnambiguousIterator(Iterator iter)


These are analogous to CAS.createFilteredIterator(Iterator iter).
However, for unambiguous iterators, the Iterator must be an iterator
over the Annotation index, not some user-defined index.  So perhaps
making this a static method of AnnotationIndex is clearer.

Thoughts?

-Adam
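
The "filter first, then apply the unambiguousness constraint" ordering
described above can be illustrated with a small stand-alone sketch. It does
not use the UIMA API: Span, filterThenUnambiguous and the sample data are
hypothetical names made up for illustration, and the input list is assumed
to already be in annotation-index order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

final class UnambiguousFilterSketch {

    /** Hypothetical stand-in for an annotation: a typed span of offsets. */
    static final class Span {
        final String type;
        final int begin;
        final int end;

        Span(String type, int begin, int end) {
            this.type = type;
            this.begin = begin;
            this.end = end;
        }

        @Override
        public String toString() {
            return type + "[" + begin + "," + end + ")";
        }
    }

    /**
     * Filters first, then keeps only spans that do not overlap the span
     * returned before them. Assumes the input is sorted by begin offset.
     */
    static List<Span> filterThenUnambiguous(List<Span> sorted, Predicate<Span> filter) {
        List<Span> result = new ArrayList<>();
        int lastEnd = Integer.MIN_VALUE;
        for (Span s : sorted) {
            if (!filter.test(s)) {
                continue;                 // step 1: drop filtered-out spans
            }
            if (s.begin >= lastEnd) {     // step 2: skip overlapping spans
                result.add(s);
                lastEnd = s.end;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Span> index = Arrays.asList(
            new Span("Sentence", 0, 10),
            new Span("Token", 0, 4),
            new Span("Token", 2, 6),
            new Span("Token", 6, 9));
        System.out.println(filterThenUnambiguous(index, s -> s.type.equals("Token")));
    }
}
```

In this sample the order matters: applying the unambiguous constraint first
would return only the Sentence span, and filtering for Token afterwards
would then leave nothing.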






Re: UIMA JMS

2007-08-15 Thread Marshall Schor

Michael Baessler wrote:

Eddie Epstein wrote:

To support some advanced users of UIMA, we have been working on an
alternative general scalability mechanism for UIMA analytics. Our
goals were to provide a standards-based, much more flexible and
powerful capability than that offered by the UIMA collection
processing manager, with less software complexity. To this end we have
developed an architecture based on asynchronous messaging technology
conforming to the JMS standard, and from that built a small
scalability extension for Apache UIMA, which we call UIMA JMS.

The extension uses JMS and allows incorporating alternative JMS
middleware implementations.  The primary end-user interface to UIMA
JMS is a new descriptor, the UIMA deployment descriptor. This
descriptor references standard UIMA component descriptors, and adds
the configuration information necessary to specify which annotators
are to be replicated, where they will be deployed, how many threads to
run concurrently, how error conditions are to be handled and several
other details.

Our initial implementation uses Apache's ActiveMQ for the JMS
messaging middleware.  We would like to explore donating this
extension to the UIMA project, if this is acceptable to the community,
and would appreciate any comments or feedback.


Sounds interesting, but do you have some more detailed information
about UIMA JMS? Maybe some user documentation or stuff like that?


Good idea.  We're working through how to make this available - we
have a PDF doc which is a start; we might be able to post it, perhaps as an
attachment to the UIMA wiki.  We're working on more documentation,
including more tutorial information.



Is this only an addition to the current Apache UIMA implementation
(only additional components) or do we have to modify the UIMA core
projects to run UIMA JMS?


It is only an addition - no modification needed to the UIMA core.


-- Michael







Re: What is the expected behavior of type system merge for these cases?

2007-08-15 Thread Marshall Schor

OK.  Am implementing tests and then the fixes :-)

2 more cases have come up:

1)  One descriptor specifies multipleReferencesAllowed as false, the other
"omits" this.  The spec says omitting is the same as false.  So this 
will be OK.


2) One descriptor specifies an element Range Type restriction, the other 
doesn't specify.
I'm making this throw an exception, per the logic for #2 below in 
Thilo's note.


-Marshall

Adam Lally wrote:

On 8/13/07, Marshall Schor <[EMAIL PROTECTED]> wrote:
  

1) Type Foo, feature Bar - range type FSArray whose element type is Baz
 Type Foo, feature Bar - range type FSArray whose element type is NotBaz

(Should throw an exception?)

2) Type Foo, feature Bar - range type FSArray, whose element type is Baz
 Type Foo, feature Bar - range type FSArray whose element type is
Subtype_of_Baz

(Should be element Type = Subtype_of_Baz?)

3) Type Foo, feature Bar - range type FSArray with
multipleReferencesAllowed = false
 Type Foo, feature Bar - range type FSArray with
multipleReferencesAllowed = true

(Should throw an exception?)




I would vote for exceptions in all three cases (agreeing with Thilo's
logic about #2).

-Adam


  




Re: feature path evaluation

2007-08-15 Thread Marshall Schor

Michael Baessler wrote:

Michael Baessler wrote:

Hi,

does the UIMA framework have a method to evaluate a feature path as a
String value?


-- Michael



No response... does this mean there is no such method, or that nobody has
read this?

I missed reading this message...

By feature path - do you mean instances of the type
org.apache.uima.cas.FeaturePath?  If so, I think there is no "toString"
method for this.  Since it's just a string of feature names, a toString
method would need to concatenate these with some separator character, I
guess.


-Marshall
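
The concatenation idea above might look like the following sketch. It is
stand-alone and does not touch org.apache.uima.cas.FeaturePath; the class
name, the method name and the "/" separator are assumptions chosen for
illustration only.

```java
import java.util.Arrays;
import java.util.List;

final class FeaturePathStringSketch {

    /** Joins the feature names of a path with the given separator. */
    static String toPathString(List<String> featureNames, String separator) {
        return String.join(separator, featureNames);
    }

    public static void main(String[] args) {
        // Hypothetical feature names; a real path would come from the CAS.
        List<String> names = Arrays.asList("address", "city", "name");
        System.out.println(toPathString(names, "/"));
    }
}
```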


Re: What is the expected behavior of type system merge for these cases?

2007-08-15 Thread Marshall Schor

Thilo Goetz wrote:

Hi Marshall,

Marshall Schor wrote:
  

OK.  Am implementing tests and then the fixes :-)



please don't commit anything to trunk until we're done
with this release.  Else we might have to branch now,
and I'd like to avoid that if we can.
  
oops - sorry - didn't see this note, until after I had committed.  For
these kinds of things, best to send an instant message :-)...

  

2 more cases have come up:

1)  One descriptor specifies multipleReferencesAllowed as false, the other
"omits" this.  The spec says omitting is the same as false.  So this
will be OK.

2) One descriptor specifies an element Range Type restriction, the other
doesn't specify.
I'm making this throw an exception, per the logic for #2 below in
Thilo's note.



Not specifying a component type restriction is the same as specifying 
uima.cas.TOP.
  
OK - I guess this will need to be added to the test - to allow no type 
restriction to "match" one which specifies explicitly uima.cas.TOP.  
I'll re-open the Jira issue.


-Marshall
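
The merge rules settled on in this thread can be summarized in a small
stand-alone sketch: differing element types throw (including the subtype
case), differing multipleReferencesAllowed values throw, an omitted
multipleReferencesAllowed counts as false, and an omitted element type
counts as uima.cas.TOP. This is only a model of the rules as discussed, not
the framework's merge code; ArrayFeature and merge are hypothetical names,
and IllegalStateException stands in for whatever exception the framework
actually raises.

```java
final class FeatureMergeSketch {

    /** Hypothetical model of one declaration of an FSArray-valued feature. */
    static final class ArrayFeature {
        final String elementType;                 // null = not specified
        final Boolean multipleReferencesAllowed;  // null = not specified

        ArrayFeature(String elementType, Boolean multipleReferencesAllowed) {
            this.elementType = elementType;
            this.multipleReferencesAllowed = multipleReferencesAllowed;
        }

        /** An omitted element type is treated as uima.cas.TOP. */
        String effectiveElementType() {
            return elementType == null ? "uima.cas.TOP" : elementType;
        }

        /** An omitted multipleReferencesAllowed is treated as false. */
        boolean effectiveMultipleReferencesAllowed() {
            return multipleReferencesAllowed != null && multipleReferencesAllowed;
        }
    }

    /** Merges two declarations of the same feature, or throws on conflict. */
    static ArrayFeature merge(ArrayFeature a, ArrayFeature b) {
        if (!a.effectiveElementType().equals(b.effectiveElementType())) {
            // Covers both unrelated types and the subtype case.
            throw new IllegalStateException("Incompatible element types: "
                + a.effectiveElementType() + " vs " + b.effectiveElementType());
        }
        if (a.effectiveMultipleReferencesAllowed() != b.effectiveMultipleReferencesAllowed()) {
            throw new IllegalStateException("Incompatible multipleReferencesAllowed");
        }
        return new ArrayFeature(a.effectiveElementType(),
            a.effectiveMultipleReferencesAllowed());
    }
}
```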


Re: [VOTE] Release uimaj-2.2.0-RC8 as uimaj-2.2.0-incubating

2007-08-15 Thread Marshall Schor
I think all the committers have voted +1...   Can we declare the vote 
closed and go to the incubator at this point?


-Marshall

Thilo Goetz wrote:

We've had a few days to do regression testing
on this level, and nothing new has come up.

The release artifacts are available on people.a.o at
/home/twgoetz/uima-distributions/2.2/RC8

So please cast your vote:

[ ] +1 Release RC8 as uimaj-2.2.0-incubating, it's ready
[ ] -1 Don't release yet, I found issues

--Thilo


  




Re: continuous integration

2007-08-15 Thread Marshall Schor
Saw another post (may be outdated) that said Continuum didn't handle 
"flat" Maven
structures, only the "nested" ones - and we're using the "flat" approach 
I think.


-Marshall (hoping to get to continuous integration at some point :-)

Marshall Schor wrote:

More info:

Several posts on maven-user on the topic "is continuum dead" suggest 
another alternative,


https://hudson.dev.java.net/

and another says here's a matrix of various tools:

http://docs.codehaus.org/display/DAMAGECONTROL/Continuous+Integration+Server+Feature+Matrix 




-Marshall (hoping that at some point, we'll get to continuous 
integration... )



Marshall Schor wrote:
If and when we decide that "continuous integration" is the way to go, 
Atlassian, the folks who did Jira and Confluence Wiki (which we're 
using) also have a continuous integration product, called Bamboo.


Several Apache projects are using it, here:  
http://opensource.bamboo.atlassian.com/


-Marshall










Re: UIMA JMS

2007-08-15 Thread Marshall Schor

The documentation for this has been posted to the Apache UIMA wiki.

You can navigate there by clicking on the wiki link of
http://incubator.apache.org/uima

and then on the Documentation link, and then on the
Documentation for Asynchronous Scaleout enablement of Apache UIMA 
<http://cwiki.apache.org/UIMA/uimaasdoc.html>

link (if you don't see it, please hit refresh in your browser).

-Marshall


Marshall Schor wrote:

Michael Baessler wrote:

Eddie Epstein wrote:

To support some advanced users of UIMA, we have been working on an
alternative general scalability mechanism for UIMA analytics. Our
goals were to provide a standards-based, much more flexible and
powerful capability than that offered by the UIMA collection
processing manager, with less software complexity. To this end we have
developed an architecture based on asynchronous messaging technology
conforming to the JMS standard, and from that built a small
scalability extension for Apache UIMA, which we call UIMA JMS.

The extension uses JMS and allows incorporating alternative JMS
middleware implementations.  The primary end-user interface to UIMA
JMS is a new descriptor, the UIMA deployment descriptor. This
descriptor references standard UIMA component descriptors, and adds
the configuration information necessary to specify which annotators
are to be replicated, where they will be deployed, how many threads to
run concurrently, how error conditions are to be handled and several
other details.

Our initial implementation uses Apache's ActiveMQ for the JMS
messaging middleware.  We would like to explore donating this
extension to the UIMA project, if this is acceptable to the community,
and would appreciate any comments or feedback.


Sounds interesting, but do you have some more detailed information
about UIMA JMS? Maybe some user documentation or stuff like that?


Good idea.  We're working through how to make this available - we
have a PDF doc which is a start; we might be able to post it, perhaps as
an attachment to the UIMA wiki.  We're working on more documentation,
including more tutorial information.



Is this only an addition to the current Apache UIMA implementation
(only additional components) or do we have to modify the UIMA core
projects to run UIMA JMS?


It is only an addition - no modification needed to the UIMA core.


-- Michael











Re: What is the expected behavior of type system merge for these cases?

2007-08-15 Thread Marshall Schor

Thilo Goetz wrote:

Marshall Schor wrote:
  

Thilo Goetz wrote:


Hi Marshall,

Marshall Schor wrote:
 
  

OK.  Am implementing tests and then the fixes :-)



please don't commit anything to trunk until we're done
with this release.  Else we might have to branch now,
and I'd like to avoid that if we can.
  
  

oops - sorry - didn't see this note, until after I had committed.  For
these kinds of things,
best to send an instant message :-)...




Oh well, we'll have to figure out how to proceed then.  I'll have
to catch up on some SVN docs to see about branching.  It seems
highly unlikely that our release candidate will just pass the
incubator PMC ;-)

--Thilo



  
I think there also were other (previous) SVN updates done prior to my 
last one - for instance, updating the Docs re: threadsafe issues for 
shared resource impls.


I did branching when I did my hot fix for 2.1, as I recall.  So it can't 
be too hard ;-)


-Marshall


Re: continuous integration

2007-08-15 Thread Marshall Schor

Jörn Kottmann wrote:

Hi Marshall,

a few weeks ago I tested cruisecontrol with the uima project
and it worked with our maven project structure.

Maybe you would like to take a look at it.
Great, thanks. 


I'm still trying to get a sense of the communities.  I found this:
http://www.chris-read.net/?p=13
comparing CruiseControl version 2.6 (now at 2.7), Bamboo 1.0 (it's now
at 1.2.2), and TeamCity 1.2 (now at 2.1).

See also http://xooctory.xoocode.org/ - it's another CI open source project
whose team works on other Apache projects.  It has a section on comparison
about halfway down the page.

A quick look at Apache projects - I could only find one that was using 
CI (Harmony).


Still don't have any definite opinions, myself... 


-Marshall


Jörn

On Aug 15, 2007, at 9:01 PM, Marshall Schor wrote:

Saw another post (may be outdated) that said Continuum didn't handle 
"flat" Maven
structures, only the "nested" ones - and we're using the "flat" 
approach I think.


-Marshall (hoping to get to continuous integration at some point :-)

Marshall Schor wrote:

More info:

Several posts on maven-user on the topic "is continuum dead" suggest 
another alternative,


https://hudson.dev.java.net/

and another says here's a matrix of various tools:

http://docs.codehaus.org/display/DAMAGECONTROL/Continuous+Integration+Server+Feature+Matrix 




-Marshall (hoping that at some point, we'll get to continuous 
integration... )



Marshall Schor wrote:
If and when we decide that "continuous integration" is the way to 
go, Atlassian, the folks who did Jira and Confluence Wiki (which 
we're using) also have a continuous integration product, called 
Bamboo.


Several Apache projects are using it, here:  
http://opensource.bamboo.atlassian.com/


-Marshall
















UIMA and Export regulation 5D002

2007-08-17 Thread Marshall Schor

UIMA is currently not classified as 5D002 software
(a classification for software, requiring "notification" due to issues
around crypto).
To keep this status, we have to

   a) avoid including any 5D002 software in any distribution we do, and
   b) avoid using interfaces for 5D002 components (that we do not
   include in our distributions) that are specially designed
   to access crypto functionality in these components

The page http://www.apache.org/licenses/exports

lists Apache distributed software that is classified as 5D002
(note: for APR, only APR-Util- "development" version).
ActiveMq and Derby are on the list.

In the proposed UIMA extension for asynchronous
scaleout, we use, but do not distribute, ActiveMq 4.1,
which, in turn, includes Derby.

In UIMA-CPP, we use APR, and
we distribute it.  I think we don't use APR-Util (Eddie, please
confirm), which is 5D002 software.

If we include in our distribution any component
that is classified as 5D002 then
UIMA becomes 5D002, as well.

Additionally, even if we don't distribute these components,
if our UIMA software uses interfaces for these components
that are specially designed to access crypto functionality in
these components, then UIMA becomes 5D002 and
we need to follow the procedures
outlined in http://www.apache.org/dev/crypto.html.

-Marshall






Re: [jira] Commented: (UIMA-531) Cas Editor: Delete button of the FSView does not work correctly

2007-08-17 Thread Marshall Schor

When I tried to add Mylyn to my Eclipse (3.3), it presented me with a menu
of features including "connectors" for Bugzilla and for Trac, but I
didn't see one for Jira.


If there is one, where is it (url to update site)?
-Marshall

Thilo Goetz wrote:

Jörn Kottmann wrote:
  

Sorry, Michael, I used Mylyn to attach the patch. Mylyn also attached
the "context" of the issue.
A Mylyn context contains only those files which are related to the issue.
This makes it possible for the Eclipse UI to hide all projects and files
which are not related to the issue. I was not aware that it would attach
the context as a .zip attachment.

But it seems it is not possible to attach files this way and grant the
license to the ASF.
I will attach the two patches again via the Jira web interface.



The AL check box is part of the Apache Jira customization.  I guess that
the Mylyn Jira connector can't handle these kinds of customizations (yet).

  

Mylyn is still a great tool, maybe you would like to take a look at it.
It integrates Jira into Eclipse.  There is a view for issues and there are
special editors to edit the issues. It is also possible to create new issues.



Not too shabby.  Now if subversion knew about Jira, we'd be on our way
to a professional work environment (where version control knows about
task management).  We currently have to manually hook up our commits
with Jira...

--Thilo

  

Jörn





  




Re: [jira] Commented: (UIMA-526) Cas Editor: Add a new Edit View for editing of FS

2007-08-17 Thread Marshall Schor

Found this via google:

Eclipse artifacts for 3.3 are in this repository:

http://repo1.maven.org/eclipse/

See:

http://osdir.com/ml/ide.eclipse.equinox.devel/2006-12/msg00012.html

-Marshall

Michael Baessler wrote:

Jörn Kottmann wrote:

I can, but the eclipse 3.3 plugins are not in the central m2
repository yet.  The CasEditor will then not build via maven.
I'm not sure how you got it to compile.  Do you know how to
get the newer versions of the eclipse plugins added to the
repository?


I use eclipse 3.3 to build and test it.

The maven stuff does not really work for me. I would like
to deploy the Cas Editor as RCP application and not as eclipse
plugin. To build a RCP application via a build script we
need an installation of eclipse.

Thanks,
Jörn


Hi Marshall or Adam,

do you know some more details about the Eclipse 3.3 artifacts?
When will they be checked in to the maven repository?
Where is the right place to ask for it?

Is it possible to build an RCP application also using maven, or do we
need to do something different (Ant within Maven) to get the code compiled?

-- Michael






factoring common POM stuff to the parent

2007-08-17 Thread Marshall Schor
Is there any reason why we don't factor some common POM elements to the 
parent?


Some candidates:





  (except for eclipse plugins which use the "period" separator 
before "incubating")


-Marshall


Success in building eclipse update site

2007-08-17 Thread Marshall Schor
I've succeeded in getting maven to build the eclipse update site.  Here 
are some particulars, in case anyone wants to suggest improvements 
before I commit things.


First, I created two features.  One is just the runtime plugin.  This is for
those projects which want to package something as an RCP (Rich Client
Platform) application.  The other feature is the Eclipse tooling for UIMA,
including these 4 things:  the Component Descriptor Editor, the Debug
plugin, JCasGen, and the PEAR packager.


The two features are set up so that the tools one automatically includes 
the runtime (if it's not already there).


The features are also set up to "depend" on the things they need.  This
causes, for instance, EMF to be specified as a dependency.  If you select
both the UIMA update site and a site having the EMF updates, the button
to add needed dependencies automatically is usable.


There will be 3 new projects for this:
  uimaj-eclipse-feature-runtime
  uimaj-eclipse-feature-tools
  uimaj-eclipse-update-site

The feature projects have 1 file (feature.xml) needed by the maven
build.  But to make Eclipse editing easier, I propose to check in the
".project" for these, which sets the nature to allow the feature editor
to run.

The update-site project has several files used by the maven build.  The 
main one is the site.xml file.  There are also some boilerplate files in 
a web directory. 

The update-site project also has a hand-crafted "build.xml" that does
the actual building of the update site from the maven-generated
artifacts.  This file unfortunately has one property that must be (at
the moment) manually updated for each release: the release name, using
the "period" for the punctuation (n.n.n.incubating, not
n.n.n-incubating), which is the style needed by eclipse plugins.
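
The manual step of rewriting the release name from the maven style to the
eclipse style is a simple string transformation. The following is a sketch
only, not something that exists in the build; EclipseVersionSketch and
toEclipseVersion are hypothetical names, and the sketch assumes that just
the first dash needs to become a period.

```java
final class EclipseVersionSketch {

    /**
     * Converts a maven-style version such as n.n.n-incubating into the
     * eclipse plugin style n.n.n.incubating by replacing the first dash
     * with a period. Versions without a dash are returned unchanged.
     */
    static String toEclipseVersion(String mavenVersion) {
        int dash = mavenVersion.indexOf('-');
        if (dash < 0) {
            return mavenVersion;
        }
        return mavenVersion.substring(0, dash) + "." + mavenVersion.substring(dash + 1);
    }

    public static void main(String[] args) {
        System.out.println(toEclipseVersion("2.2.0-incubating"));
    }
}
```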

You can run this build by itself, stand-alone, and it will read the 
maven-generated artifacts in the various parts of uimaj projects / 
target directories, and it will build the update site from this in its 
target directory.  It does this from scratch, every time you run it (it 
takes 2 seconds on my machine).


The uimaj-distr pom file is updated to add a 3rd ant step to call this, 
after it has run the builds for the .bin (it depends on that build's 
output).


The last manual step needed would be to copy the 
uimaj-eclipse-update-site/target contents to our download site.  This is 
coded in the features as:

http://incubator.apache.org/uima/downloads/eclipse-update-site

Let me know if anyone thinks there's a better place.

If I don't hear anything in a day or 2 I'll probably continue to hold 
off on checking this in until we clear the 2.2 release, to reduce any 
branching issues.


-Marshall


Re: [jira] Commented: (UIMA-531) Cas Editor: Delete button of the FSView does not work correctly

2007-08-17 Thread Marshall Schor
Any startup hints?  When I installed the plugin and this extra for Jira,
and said to add the Apache Jira, it seemed to do the right thing.  But
when I said "Validate Settings" it replied
that "Mylar requires JIRA version 3.3.3 or later".

Then it shows an "x" next to the repository icon... and nothing works.

Thanks for any suggestions...

-Marshall

Thilo Goetz wrote:

Marshall Schor wrote:
  

When I tried to add Mylyn to my Eclipse (3.3), it presented me with a menu
of features including "connectors" for Bugzilla and for Trac, but I
didn't see one for Jira.

If there is one, where is it (url to update site)?
-Marshall



http://download.eclipse.org/tools/mylyn/update/extras

--Thilo



  




Re: [jira] Commented: (UIMA-526) Cas Editor: Add a new Edit View for editing of FS

2007-08-21 Thread Marshall Schor

Michael Baessler wrote:

How should we go on with this issue?

Does anyone know if these packages will go into the maven repository 
in the future?
Is there a place to request these changes, so that the packages are 
added to the maven repository?

I think the place to ask about this is the maven-users mailing list.

-Marshall


If not we have to change the build, so that we use a local eclipse 
version to compile the CasEditor project.


-- Michael


Michael Baessler wrote:
It seems that org.eclipse.jface.viewers is missing in this 
repository. Only org.eclipse.jface.text is available.


-- Michael

Thilo Goetz wrote:

Maybe I don't know how to look, but although some 3.3
stuff is there, most things are missing.  This is where
we get our 3.2 and 3.1 stuff from...

Marshall Schor wrote:
 

Found this via google:

Eclipse artifacts for 3.3 are in this repository:

http://repo1.maven.org/eclipse/

See:

http://osdir.com/ml/ide.eclipse.equinox.devel/2006-12/msg00012.html

-Marshall








Re: Cas Editor: Build and Packaging [was: [jira] Commented: (UIMA-526) Cas Editor: Add a new Edit View for editing of FS]

2007-08-21 Thread Marshall Schor

Jörn Kottmann wrote:

How should we go on with this issue?

Does anyone know if these packages will go into the maven repository 
in the future?
Is there a place to request these changes, so that the packages are 
added to the maven repository?


If not we have to change the build, so that we use a local eclipse 
version to compile the CasEditor project.


There are two options how to build and deploy the Cas Editor:

1. It's built like the other plugins and deployed as an Eclipse plugin.

2. It's built as an RCP application (this requires an Eclipse install) and
packaged for each platform we want to support (win, linux and mac).
Maybe we could ask if anyone has a maven plugin for doing Eclipse RCP 
packaging on the maven-users list.


-Marshall


I would prefer the second option. The advantage is that it then feels
more like an application and not like an Eclipse plugin with a lot of UI
stuff which does not make sense for the Cas Editor.

Jörn





Re: [jira] Commented: (UIMA-526) Cas Editor: Add a new Edit View for editing of FS

2007-08-21 Thread Marshall Schor

I found this by googling these three terms:  maven eclipse rcp

http://docs.codehaus.org/display/MAVENUSER/Eclipse+Plugin

It includes instructions on
 - how to generate an internal repository of eclipse artifacts from an 
eclipse installation.

 - how to add the Eclipse RCP artifacts to the repository
 - how to create an Eclipse RCP Target in maven

-Marshall

Marshall Schor wrote:

Michael Baessler wrote:

How should we go on with this issue?

Does anyone know if these packages will go into the maven repository 
in the future?
Is there a place to request these changes, so that the packages are 
added to the maven repository?

I think the place to ask about this is the maven-users mailing list.

-Marshall


If not we have to change the build, so that we use a local eclipse 
version to compile the CasEditor project.


-- Michael


Michael Baessler wrote:
It seems that org.eclipse.jface.viewers is missing in this 
repository. Only org.eclipse.jface.text is available.


-- Michael

Thilo Goetz wrote:

Maybe I don't know how to look, but although some 3.3
stuff is there, most things are missing.  This is where
we get our 3.2 and 3.1 stuff from...

Marshall Schor wrote:
 

Found this via google:

Eclipse artifacts for 3.3 are in this repository:

http://repo1.maven.org/eclipse/

See:

http://osdir.com/ml/ide.eclipse.equinox.devel/2006-12/msg00012.html

-Marshall












Re: Cas Editor: Build and Packaging [was: [jira] Commented: (UIMA-526) Cas Editor: Add a new Edit View for editing of FS]

2007-08-21 Thread Marshall Schor

Marshall Schor wrote:

Jörn Kottmann wrote:

How should we go on with this issue?

Does anyone know if these packages will go into the maven repository 
in the future?
Is there a place to request these changes, so that the packages are 
added to the maven repository?


If not we have to change the build, so that we use a local eclipse 
version to compile the CasEditor project.


There are two options how to build and deploy the Cas Editor:

1. It's built like the other plugins and deployed as an Eclipse plugin.

2. It's built as an RCP application (this requires an Eclipse install)
and packaged for each platform we want to support (win, linux and mac).

I also found this: http://vyzivus.host.sk/maven2-build-plugin-howto.html
It describes building an RCP app using maven.  They have a bigger list 
of targets, though:


   * carbon.macosx.ppc
   * carbon.macosx.x86
   * gtk.linux.ia64
   * gtk.linux.ppc
   * gtk.linux.x86
   * gtk.linux.x86_64
   * gtk.solaris.sparc
   * motif.aix.ppc
   * motif.hpux.ia64_32
   * motif.hpux.PA_RISC
   * motif.linux.x86
   * motif.solaris.sparc
   * photon.qnx.x86
   * win32.win32.x86

-Marshall
Maybe we could ask if anyone has a maven plugin for doing Eclipse RCP 
packaging on the maven-users list.


-Marshall


I would prefer the second option. The advantage is that it then feels
more like an application and not like an Eclipse plugin with a lot of UI
stuff which does not make sense for the Cas Editor.

Jörn









Re: Java annotations for configuration parameters in AEs

2007-08-23 Thread Marshall Schor
I like this idea.  I think there are some details to work out.
Most of these are maybe just things I need to learn :-)  Here are a few:

1) Whatever approach we take for this - it would be nice to align with
existing Eclipse support for this kind of thing (I assume there is some,
but I haven't explored what it is).

2) We have to think about having now 2 places where the same thing is
specified.  If both places are used, do we need to check for
"consistency" at run startup time?

3) Can we have some kind of backwards-compatible approach that achieves
more of the goals of writing information about configuration parameters
in just one place?  Should we take some inspiration from how EMF does
things, or how Spring support is done in Eclipse (Spring uses injection,
and has some Eclipse plugins - see for instance
http://springide.org)

-Marshall

Jörn Kottmann wrote:
> Hello,
>
> I would like to suggest that we use java annotations
> for the classes which implement an AE.
>
> The configuration parameters can be defined via java annotations
> and then injected into the AE before initializing.
> The implementor then does not need to write code that retrieves
> configuration parameters. The framework can then also guarantee
> that all mandatory parameters are set (currently it's not possible to
> enforce that the implementation is in sync with the
> configurationParameters section in the descriptor).
>
> Here is a small sample:
>
> public class SampleAE extends Annotator_ImplBase {
>   @Parameter(mandatory = true)
>   public Boolean booleanParameter; // <- value is injected
> }
>
> What do you think ?
>
> Jörn
>
>
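A minimal sketch of how the injection proposed above could work, assuming a runtime-retained annotation and a plain Map standing in for the parameter settings from the descriptor. The names Parameter, SampleAnnotator and ParameterInjector are hypothetical and not part of the UIMA API; the sample class deliberately does not extend any UIMA base class so that the snippet stays self-contained.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.Map;

// Hypothetical annotation; retained at runtime so reflection can see it.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface Parameter {
    boolean mandatory() default false;
}

// Stand-in for an annotator class; not a real UIMA component.
class SampleAnnotator {
    @Parameter(mandatory = true)
    public Boolean booleanParameter; // value is injected

    @Parameter
    public String optionalParameter; // stays null when not configured
}

// Hypothetical injector: copies settings into annotated public fields,
// matching each setting to a field by the field's name.
class ParameterInjector {
    static void inject(Object target, Map<String, Object> settings) {
        for (Field field : target.getClass().getFields()) {
            Parameter p = field.getAnnotation(Parameter.class);
            if (p == null) {
                continue; // not a configuration parameter
            }
            Object value = settings.get(field.getName());
            if (value == null) {
                if (p.mandatory()) {
                    throw new IllegalStateException(
                        "Missing mandatory parameter: " + field.getName());
                }
                continue;
            }
            try {
                field.set(target, value);
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Object> settings = new HashMap<String, Object>();
        settings.put("booleanParameter", Boolean.TRUE);
        SampleAnnotator ae = new SampleAnnotator();
        inject(ae, settings);
        System.out.println(ae.booleanParameter);
    }
}
```

Because the check for mandatory parameters runs before the component is initialized, a missing setting fails early with a clear message. Whether the same annotations could also be used to generate or validate the configurationParameters section of the descriptor is left open here.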



Re: [jira] Commented: (UIMA-526) Cas Editor: Add a new Edit View for editing of FS

2007-08-23 Thread Marshall Schor
I got a reply on Maven-Users about this which said:

Subject:Re: Eclipse 3.3 plugin artifacts for Maven 2
Date:   Thu, 23 Aug 2007 15:32:18 +0200
From:   Carlos Sanchez <[EMAIL PROTECTED]>
Reply-To:   Maven Users List <[EMAIL PROTECTED]>
To: Maven Users List <[EMAIL PROTECTED]>
References: <[EMAIL PROTECTED]> <[EMAIL PROTECTED]>



I'm working on it, it's just that it takes time
we use the eclipse plugin with the eclipse:to-maven goal
http://maven.apache.org/plugins/maven-eclipse-plugin/

you probably need to build the last code from subversion



Michael Baessler wrote:
> How should we go on with this issue?
>
> Does anyone know if these packages will go into the maven repository
> in the future?
> Is there a place to request these changes, so that the packages are
> added to the maven repository?
>
> If not we have to change the build, so that we use a local eclipse
> version to compile the CasEditor project.
>
> -- Michael
>
>
> Michael Baessler wrote:
>> It seems that org.eclipse.jface.viewers is missing in this
>> repository. Only org.eclipse.jface.text is available.
>>
>> -- Michael
>>
>> Thilo Goetz wrote:
>>> Maybe I don't know how to look, but although some 3.3
>>> stuff is there, most things are missing.  This is where
>>> we get our 3.2 and 3.1 stuff from...
>>>
>>> Marshall Schor wrote:
>>>  
>>>> Found this via google:
>>>>
>>>> Eclipse artifacts for 3.3 are in this repository:
>>>>
>>>> http://repo1.maven.org/eclipse/
>>>>
>>>> See:
>>>>
>>>> http://osdir.com/ml/ide.eclipse.equinox.devel/2006-12/msg00012.html
>>>>
>>>> -Marshall
>
>
>



Re: UIMA getting started

2007-08-23 Thread Marshall Schor
Hi Michael -

Just took a look.  I like the idea of a quick-start.

It seems there are two themes in this article - maybe they would be
better if separated.  One is more a "why UIMA" story,
and the other is how to install and get started.

My sense is that people prefer shorter, targeted pages, and would find
this more consumable if split up.

Some more specific comments:

What Can UIMA Be Used For:
--------------------------
I think this topic is confusing: it describes applications that can be
built ** if you have the proper annotator components **.
When written this way, it leads users to expect that the analysis
capability comes with UIMA.
I think it is useful to say something like this:

UIMA is, by itself, an empty framework.  Its purpose is to enable a
world-wide, diverse community to develop inter-operable, often complex
analytic components, and allow them to be combined and run together,
with framework-supplied scale-out and remoting as needed.

Then it would be good to link to community sites giving credence to the
notion that UIMA is being widely adopted for this purpose.  (CMU
repository, repository in Germany, link to Gale project use of UIMA,
maybe do a google search for other good links).

Then, maybe some use cases.

INSTALL UIMA: 
-------------
Suggest you give a link to how to verify the download (what are all
those .asc files - a user may be asking herself).
Suggest you use some visual formatting to make it clear there are 4
steps:
  download/unzip, set env vars (2 vars), augment PATH, run script

Visual formatting can also be used in other sections, too :-)  e.g.,
Running the UIMA Analysis Example.

That's all for now :-)  -Marshall


Michael Baessler wrote:
> Michael Baessler wrote:
>> I have written an UIMA Getting Started - UIMA Examples article and
>> posted it on the Apache UIMA website.
>> The article gives a short introduction about what UIMA is, what it
>> can be used for, how it is installed and how the
>> UIMA analysis example can be executed. I hope that it helps first
>> time users to get an impression about what UIMA is
>> and what they can do with it.
>>
>> My plan is to create a series of UIMA Getting Started articles about
>> different topics.
>> One of the next articles can be "Writing my first analysis
>> component". The articles should not replace the UIMA documentation,
>> they should just give a short overview with the basic concepts to
>> reduce the warm up time for UIMA.
>>
>> Currently there is no link to the article available on the website,
>> since the UIMA analysis example that is used in the article will
>> only be released with the next Apache UIMA release.
>>
>> But you can find the article here:
>>
>> http://incubator.apache.org/uima/doc-uima-examples.html
>>
>> Your comments and feedback are welcome.
>>
>> -- Michael
>>
> Has anyone had the chance to look at the getting started document?
>
> I would like to publish it with the UIMA 2.2 release.
>
> -- Michael
>
>



Re: Java 5 [was: Re: svn commit: r569003 - in /incubator/uima/uimaj/trunk/uimaj-core/src: main/java/org/apache/uima/resource/metadata/impl/MetaDataObject_impl.java test/java/org/apache/uima/resource/m

2007-08-23 Thread Marshall Schor
Oops, now I know why "deepEquals()" wasn't familiar...  I didn't realize
it was a Java 5 thing...

This one's not critical - so I would lean toward putting it back to just
plain equals.

Opinions? -Marshall
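For reference, a small sketch of what the difference looks like on nested arrays, together with a hand-rolled recursive comparison that avoids the Java 5 call. The class and method names are made up for illustration and are not the code in MetaDataObject_impl; the helper only handles Object[] nesting, and primitive arrays fall through to plain equals().

```java
import java.util.Arrays;

class ArrayEqualityDemo {
    // Recursive comparison that does not rely on Arrays.deepEquals.
    // Only Object[] nesting is handled; anything else uses equals().
    static boolean deepEquals14(Object a, Object b) {
        if (a == b) {
            return true;
        }
        if (a == null || b == null) {
            return false;
        }
        if (a instanceof Object[] && b instanceof Object[]) {
            Object[] x = (Object[]) a;
            Object[] y = (Object[]) b;
            if (x.length != y.length) {
                return false;
            }
            for (int i = 0; i < x.length; i++) {
                if (!deepEquals14(x[i], y[i])) {
                    return false;
                }
            }
            return true;
        }
        return a.equals(b);
    }

    public static void main(String[] args) {
        Object[] left = { new String[] {"a", "b"}, "c" };
        Object[] right = { new String[] {"a", "b"}, "c" };
        // Inner arrays are compared by identity here: prints false
        System.out.println(Arrays.equals(left, right));
        // Inner arrays are compared element by element: prints true
        System.out.println(Arrays.deepEquals(left, right));
        // Same result without the Java 5 API: prints true
        System.out.println(deepEquals14(left, right));
    }
}
```

So going back to plain equals would lose the arrays-of-arrays case unless something like the helper above is used instead.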



Thilo Goetz wrote:
> [EMAIL PROTECTED] wrote:
>   
>> Author: schor
>> Date: Thu Aug 23 07:11:20 2007
>> New Revision: 569003
>>
>> URL: http://svn.apache.org/viewvc?rev=569003&view=rev
>> Log:
>> [UIMA-534] fix equals test for Maps, also make Array tests cover
>> more cases, including arrays of arrays
>>
>> Added:
>> 
>> incubator/uima/uimaj/trunk/uimaj-core/src/test/java/org/apache/uima/resource/metadata/impl/MetaDataObject_implTest.java
>> Modified:
>> 
>> incubator/uima/uimaj/trunk/uimaj-core/src/main/java/org/apache/uima/resource/metadata/impl/MetaDataObject_impl.java
>>
>> Modified: 
>> incubator/uima/uimaj/trunk/uimaj-core/src/main/java/org/apache/uima/resource/metadata/impl/MetaDataObject_impl.java
>> URL: 
>> http://svn.apache.org/viewvc/incubator/uima/uimaj/trunk/uimaj-core/src/main/java/org/apache/uima/resource/metadata/impl/MetaDataObject_impl.java?rev=569003&r1=569002&r2=569003&view=diff
>> ==
>> --- 
>> incubator/uima/uimaj/trunk/uimaj-core/src/main/java/org/apache/uima/resource/metadata/impl/MetaDataObject_impl.java
>>  (original)
>> +++ 
>> incubator/uima/uimaj/trunk/uimaj-core/src/main/java/org/apache/uima/resource/metadata/impl/MetaDataObject_impl.java
>>  Thu Aug 23 07:11:20 2007
>> 
> [...]
>
>   
>> +  if (val1 instanceof Object[])  return 
>> Arrays.deepEquals((Object[])val1, (Object[])val2);
>> 
>
>
> Marshall, thanks for forcing the Java 5 issue ;-)
>
> Seriously, I think we've only heard positive opinions
> about moving to Java 5.  Our users haven't complained
> either, to the contrary.  If anybody has strong feelings
> about being 1.4 compatible, speak up now or forever
> hold your peace :-)
>
> --Thilo
>
>
>   



Re: Subtypes of lists and arrays

2007-08-23 Thread Marshall Schor
Adam Lally wrote:
> On 8/23/07, Jörn Kottmann <[EMAIL PROTECTED]> wrote:
>   
>> Hello,
>>
>> is it possible to define subtypes of lists and array types in uima ?
>>
>> 
>
> You cannot create subtypes of arrays, but I believe you can create
> subtypes of lists.
>   
hmmm, I think not.  See setupTSDefault method in CASImpl class.  At the
bottom is the list of
"built-in" features - and it locks all except the Document Annotation, I
think.

If you need a different kind of list type, you can define one pretty
easily, from scratch, I think.

-Marshall


Re: UIMA JMS

2007-08-23 Thread Marshall Schor
See below for some additions, hopefully helpful :-)

Eddie Epstein wrote:
> Hi Michael,
>
> On 8/22/07, Michael Baessler <[EMAIL PROTECTED]> wrote:
>   
>> 1) Do you have any experiences with the memory footprint when using UIMA
>> AS? It seems to me that when deploying a larger system with multiple AEs
>> a lot of queues are used. Each AS aggregate use a queue for the
>> delegates. How is the performance with all these queues? Do you have any
>> measurements?
>> 
>
> This is a great question. When an aggregate is deployed with
> asynchronous delegates, each process call goes through a queue. For
> colocated delegates the message is just a reference to the in-memory
> CAS, so there is no serialization overhead. Moreover, a colocated
> broker in the same JVM is used for communication between colocated
> components, and ActiveMQ has optimized the producer/consumer paths to
> a colocated broker. But even with these optimizations the overhead is
> undesirable for some configurations.
>
> I will get some specific overhead times for calling colocated
> delegates in the next couple of days. For remote delegates, the
> overhead is basically determined by XmiCas serialization  steps and is
> a function of CAS content.
>
> Note that by default an aggregate is deployed as an AS primitive, that
> is, as a single threaded component, so there is no performance
> degradation for processing within the aggregate. 
An AS primitive can have a  element - in which case,
UIMA AS will
replicate it, and run each instance as a single threaded component.
> An aggregate is only
> deployed asynchronously if required, i.e. one of the delegates is a
> remote service, a colocated delegate is to be replicated, special
> error handling is desired for a delegate, or it is simply desired to
> run delegates in separate threads for concurrency.
>
>   
>> 2) Collection Process complete - The documentation says: "If a component
>> is replicated, only one of the instances will receive the
>> collectionProcessComplete call". I think replicated mean there is more
>> than one instance of the same component. So why does only one of the
>> components receive that call? Is that given by design that only one, and
>> we don't know which, of the components receives the information? I think it
>> is the same as with a CAS, right?
>> 
>
> That's right. If it is required to have all CASes go through the same
> instance of an analytic then it should not be replicated.
>
>   
Part of the reason for only having one instance receive the
collectionProcessComplete call, is that the model supports multiple
instances (deployed on separate machines, for example) listening to the
same queue.  These instances are independent, and can come and go
during the processing of the collection.   For instance, one might
crash, and another might be started
up after some time to take up the slack.  When we send a
collectionProcessComplete call, we do not know
how many instances at that moment are listening to the queue.
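To illustrate the point, here is a rough sketch using a java.util.concurrent queue as a stand-in for the JMS queue (this is not UIMA AS or ActiveMQ code, and the class and method names are invented). Several "instances" listen on the same queue; a single message placed on it is taken by exactly one of them, and the sender has no way to know which.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class CompetingConsumersSketch {
    // Starts the given number of listeners on one shared queue, sends a
    // single message, and returns how many listeners received it.
    static int deliverOnce(int instances) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<String>();
        AtomicInteger received = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(instances);
        for (int i = 0; i < instances; i++) {
            pool.submit(() -> {
                try {
                    // Each instance waits briefly for a message on the shared queue.
                    String msg = queue.poll(200, TimeUnit.MILLISECONDS);
                    if (msg != null) {
                        received.incrementAndGet();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        try {
            queue.put("collectionProcessComplete");
            pool.shutdown();
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return received.get();
    }

    public static void main(String[] args) {
        System.out.println(deliverOnce(3)); // prints 1
    }
}
```

Reaching every instance would need a broadcast mechanism plus knowledge of how many instances exist at that moment, which is exactly what the queue model does not provide.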
>> 3) When the system processes a document without any CasMultiplier the
>> process call for this document blocks until the result is created and
>> returned?  So in the system only one CAS is created and used.
>> 
>
> Not sure what you mean by "the system" here. An AS primitive will
> process only one CAS at a time; 
-- unless it has "multiple instances" specified, but even then, each
instance only processes
one CAS at a time --
> an AS aggregate can process more than
> one CAS at a time, based on the number of delegates and the size of
> the caspool specified at the top level of the service
>   
and the number of instances of its AS primitives.
>   
>> If the system has also CasMultiplier components the CasPool size for a
>> CasMultiplier component can limit the CASes that can be used/created at
>> the same time. But how does this work if the system collects the
>> documents itself? Is the call blocked until all the documents are
>> processed?
>> 
>
> If an AS aggregate has a CasMultiplier, additional CASes can be put
> into play concurrently, limited by the size of the CasMultiplier's
> caspool. The design relies on the proper choice of caspool sizes to
> enable the desired level of concurrent processing. The caspools also
> limit the number of requests that can build up in any input queue,
> avoiding queue overflows that are otherwise possible in asynchronous
> messaging systems.
>
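The way a pool size bounds the number of CASes in play can be sketched with a counting semaphore. This is only an analogy under stated assumptions: CasPoolSketch and maxInFlight are hypothetical names, the semaphore stands in for the caspool, and nothing here is the UIMA AS implementation.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class CasPoolSketch {
    // Returns the highest number of "CASes" that were in play at the same time.
    static int maxInFlight(int poolSize, int workerThreads, int documents) {
        Semaphore casPool = new Semaphore(poolSize); // one permit per CAS in the pool
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger max = new AtomicInteger();
        ExecutorService workers = Executors.newFixedThreadPool(workerThreads);
        for (int d = 0; d < documents; d++) {
            workers.submit(() -> {
                try {
                    casPool.acquire(); // blocks while the pool is empty
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                try {
                    max.accumulateAndGet(inFlight.incrementAndGet(), Math::max);
                    Thread.sleep(5); // pretend to analyze the document
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    inFlight.decrementAndGet();
                    casPool.release(); // return the CAS to the pool
                }
            });
        }
        workers.shutdown();
        try {
            workers.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return max.get();
    }

    public static void main(String[] args) {
        // 8 worker threads, but never more than 2 CASes in play: prints true
        System.out.println(maxInFlight(2, 8, 20) <= 2);
    }
}
```

Even with more worker threads than permits, the number of documents being processed at once never exceeds the pool size, which is also what keeps requests from piling up without limit.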
>   
>> 4) The error handling seems to be similar to that in the CPM, with some
>> additional new features (real retry). Is there some reuse of the old code?
>> 
>
> No code reuse, only reuse of error handling concepts.
>
>   
>> 5) UIMA AS does not have a StatusListener that can/must be implemented
>> to get some information about the system. How are the results reported
>> in the good case? I understand that in an error case, the error with
>> some additional background information is returned.
>> 
>
> The custom flow controller is the key to application cust

Re: Subtypes of lists and arrays

2007-08-23 Thread Marshall Schor
Adam Lally wrote:
> On 8/23/07, Marshall Schor <[EMAIL PROTECTED]> wrote:
>   
>> Adam Lally wrote:
>> 
>>> On 8/23/07, Jörn Kottmann <[EMAIL PROTECTED]> wrote:
>>>
>>>   
>>>> Hello,
>>>>
>>>> is it possible to define subtypes of lists and array types in uima ?
>>>>
>>>>
>>>> 
>>> You cannot create subtypes of arrays, but I believe you can create
>>> subtypes of lists.
>>>
>>>   
>> hmmm, I think not.  See setupTSDefault method in CASImpl class.  At the
>> bottom is the list of
>> "built-in" features - and it locks all except the Document Annotation, I
>> think.
>>
>> 
>
> The list types are set to be "feature final" but not "inheritance
> final".  So I think it is possible to declare subtypes of them.  You
> just aren't allowed to modify the list type definition itself and add
> more features to it.
>   
good point - I was speed-reading, again...  :-)
> -Adam
>
>
>   



Re: Success in building eclipse update site

2007-08-23 Thread Marshall Schor
Thilo Goetz wrote:
> Marshall Schor wrote:
>   
>> I've succeeded in getting maven to build the eclipse update site.  Here
>> are some particulars, in case anyone wants to suggest improvements
>> before I commit things.
>>
>> First, I did two features.  One is just the runtime plugin.  This is for
>> those projects which want to package something as an RCP (Rich Client
>> Platform).   The other feature is the Eclipse tooling for UIMA,
>> including these 4 things:  The Component Descriptor Editor, the Debug,
>> the JCasGen, and the Pear packager.
>>
>> The two features are set up so that the tools one automatically includes
>> the runtime (if it's not already there).
>>
>> The features are also set up to "depend" on the things they need.  This
>> causes, for instance, EMF to be specified as a dependency.  If you select
>> both the UIMA update site and a site having the EMF updates, the button
>> to add needed dependencies automatically is usable.
>>
>> There will be 3 new projects for this:
>>   uimaj-eclipse-feature-runtime
>>   uimaj-eclipse-feature-tools
>>   uimaj-eclipse-update-site
>>
>> The feature projects have 1 file (feature.xml) needed by the maven
>> build.  But to make
>> eclipse editing easier, I propose to check in the ".project" for these
>> which sets the nature
>> to allow for the feature editor to run.
>> 
> [...]
>
> Is there no other way to do this?  We already have so many projects.  One
> file per project really seems like a bit of a waste.  No big deal, but if
> there's a way this can all go into one project, I'd prefer that.
>   
I agree - it's silly, but that's what the Eclipse mechanisms seem to
make you do.   I guess it's because each has
its own "builder", or "nature" ...  Or maybe I just don't know how to do
it the other way :-)

-Marshall


[Fwd: Re: fix-permissions.sh and sudo]

2007-08-23 Thread Marshall Schor
From the repository mailing list, there's some discussion on how to
deploy things to the (maven, but in this case, snapshot) repositories. 
Especially around the "permissions" element - this might be useful to
compare against what we plan to do :-)

-Marshall

 Original Message 
Subject:Re: fix-permissions.sh and sudo
Date:   Thu, 23 Aug 2007 23:47:04 +0200
From:   Carlos Sanchez <[EMAIL PROTECTED]>
Reply-To:   [EMAIL PROTECTED]
To: David Blevins <[EMAIL PROTECTED]>
CC: [EMAIL PROTECTED], "Infrastructure Apache"
<[EMAIL PROTECTED]>
References: <[EMAIL PROTECTED]>
<[EMAIL PROTECTED]>
<[EMAIL PROTECTED]>
<[EMAIL PROTECTED]>
<[EMAIL PROTECTED]>



On 8/23/07, David Blevins <[EMAIL PROTECTED]> wrote:
>
> On Aug 23, 2007, at 2:53 AM, Carlos Sanchez wrote:
>
> > it's users misconfiguration of their deployment settings in maven
>
> What is the right deployment config for the Apache m2-snapshot-
> repository?

   
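   <server>
     <id>apache.snapshots</id>
     <username>carlos</username>
     ...
     <directoryPermissions>775</directoryPermissions>
     <filePermissions>664</filePermissions>
   </server>
   <server>
     <id>apache.releases</id>
     <username>carlos</username>
     ...
     <directoryPermissions>775</directoryPermissions>
     <filePermissions>664</filePermissions>
   </server>
   <server>
     <id>apache.website</id>
     <username>carlos</username>
     ...
     <directoryPermissions>775</directoryPermissions>
     <filePermissions>664</filePermissions>
   </server>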
   

>
> I also like Matt's suggestion that such config info could be in the
> repository itself.

I think there's a jira for that already ;)

>
> -David
>
>


-- 
I could give you my word as a Spaniard.
No good. I've known too many Spaniards.
 -- The Princess Bride





Re: [jira] Commented: (UIMA-565) Cas Editor: Improve startup process

2007-09-07 Thread Marshall Schor
Jörn Kottmann (JIRA) wrote:
> [ 
> https://issues.apache.org/jira/browse/UIMA-565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12525671
>  ] 
>
> Jörn Kottmann commented on UIMA-565:
> 
>
> I have seen one tutorial which has described it.
> It's about a few hundred lines of code, but I think it's tricky to get it
> correct if it's done in a clean room.
>
> We need everything after the checkInstanceLocation method, though I am
> already using an internal class for the choose workspace dialog. I really
> hope that the Eclipse guys will make it easier in the future to reuse
> the startup stuff.
>   
This would be a great topic to take to a wider Apache audience - e.g.,
start a discussion on the [EMAIL PROTECTED] re: building Eclipse RCP
applications, how to do the startup code with respect to licensing.

-Marshall



Re: UIMA Sandbox component documentation

2007-09-12 Thread Marshall Schor
Michael Baessler wrote:
> Unfortunately there is no such documentation.
Well, we do have general (but not sandbox specific) docs on how to do
this on our website, here:
http://incubator.apache.org/uima/svn.html#building.eclipse


-Marshall

