[jira] [Commented] (UIMA-2332) Profile and optimize Ruta inference performance

2014-01-16 Thread JIRA

[ 
https://issues.apache.org/jira/browse/UIMA-2332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13873229#comment-13873229
 ] 

Peter Klügl commented on UIMA-2332:
---

after the latest improvements:
normal inference: 15.3 times faster
dynamic anchoring: 17.7 times faster

> Profile and optimize Ruta inference performance
> ---
>
> Key: UIMA-2332
> URL: https://issues.apache.org/jira/browse/UIMA-2332
> Project: UIMA
> Issue Type: Improvement
> Components: ruta
> Affects Versions: 2.0.0TextMarker
> Reporter: Peter Klügl
> Assignee: Peter Klügl
> Priority: Minor
> Fix For: 2.2.0ruta
>
>
> Increase the speed of the ruta rule inference. A starting point is the 
> slowdown of UIMA-2330, see RutaTypeMatcher.getMatchingAnnotations()



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (UIMA-2332) Profile and optimize Ruta inference performance

2014-01-09 Thread JIRA

[ 
https://issues.apache.org/jira/browse/UIMA-2332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13866482#comment-13866482
 ] 

Peter Klügl commented on UIMA-2332:
---

After the latest improvements (the matching reference is no longer 
reevaluated, and RutaBasic now uses arrays):
normal inference: 12.5 times faster
dynamic anchoring: 15.5 times faster
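
Not reevaluating the matching reference is a form of caching. The sketch below only illustrates that general idea. `CachedReferenceSketch` and its `resolver` are hypothetical names that do not come from the Ruta code base, and the actual change in Ruta may look quite different.

```java
import java.util.function.Supplier;

/**
 * Hypothetical illustration: resolve an expensive reference once and reuse it.
 * This is not a class from Ruta.
 */
final class CachedReferenceSketch<T> {
    private final Supplier<T> resolver; // assumed to be the expensive evaluation
    private T cached;
    private boolean resolved;

    CachedReferenceSketch(Supplier<T> resolver) {
        this.resolver = resolver;
    }

    /** Returns the resolved value, calling the resolver only on first use. */
    T get() {
        if (!resolved) {
            cached = resolver.get();
            resolved = true;
        }
        return cached;
    }

    /** Forgets the cached value so the next get() resolves again. */
    void invalidate() {
        resolved = false;
        cached = null;
    }
}
```

A caller would invoke `get()` on every match attempt and pay the resolution cost only once. `invalidate()` is assumed to be called whenever the underlying reference can change.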

> Profile and optimize Ruta inference performance
> ---
>
> Key: UIMA-2332
> URL: https://issues.apache.org/jira/browse/UIMA-2332
> Project: UIMA
> Issue Type: Improvement
> Components: ruta
> Affects Versions: 2.0.0TextMarker
> Reporter: Peter Klügl
> Assignee: Peter Klügl
> Priority: Minor
> Fix For: 2.1.1ruta
>
>
> Increase the speed of the ruta rule inference. A starting point is the 
> slowdown of UIMA-2330, see RutaTypeMatcher.getMatchingAnnotations()





[jira] [Commented] (UIMA-2332) Profile and optimize Ruta inference performance

2014-01-08 Thread JIRA

[ 
https://issues.apache.org/jira/browse/UIMA-2332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13865437#comment-13865437
 ] 

Peter Klügl commented on UIMA-2332:
---

Some more information about the profiling:
- The test script consists essentially of the first three phases of the ANNIE 
NER together with the gazetteers. The third phase contains about 54 rules. The 
Ruta script contains about 50 rules overall.
- Ignoring the initialization, 82% of the time is spent on inference (including 
dictionaries) and 18% on initializing the RutaStream, that is, the seeding (2%) 
and the RutaBasics (16%).
- The main hotspot is TOP.getAddress() with 37%, of which 60% is caused by 
FSIteratorWrapper.get() and 25% by FeatureStructureImpl.getType().

A next step could be to reduce the use of convenient lists, sets, and maps, 
e.g., by using arrays in RutaBasic.
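
To illustrate the kind of change meant here, the following sketch replaces a `Map<Integer, Integer>`-style lookup with a plain array indexed by a small integer type code. `TypeCountsSketch` and the notion of a dense, non-negative `typeCode` are assumptions made for this example. This is not the actual RutaBasic implementation.

```java
import java.util.Arrays;

/**
 * Hypothetical per-token record that counts annotations by type.
 * An int[] indexed by type code avoids the boxing and hashing of a map.
 */
final class TypeCountsSketch {
    // beginCounts[typeCode] = number of annotations of that type starting here
    private int[] beginCounts;

    TypeCountsSketch(int expectedTypeCount) {
        beginCounts = new int[Math.max(1, expectedTypeCount)];
    }

    /** Registers one more annotation of the given type; grows the array if needed. */
    void addBegin(int typeCode) {
        if (typeCode >= beginCounts.length) {
            int newLength = Math.max(typeCode + 1, beginCounts.length * 2);
            beginCounts = Arrays.copyOf(beginCounts, newLength);
        }
        beginCounts[typeCode]++;
    }

    /** Returns how many annotations of the given type start here. */
    int count(int typeCode) {
        if (typeCode < 0 || typeCode >= beginCounts.length) {
            return 0;
        }
        return beginCounts[typeCode];
    }

    /** Constant-time check without any hashing. */
    boolean beginsWith(int typeCode) {
        return count(typeCode) > 0;
    }
}
```

The trade-off is memory: the array has one slot per possible type code, so this only pays off when type codes are dense and the lookup sits on a hot path.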



Re: [jira] [Commented] (UIMA-2332) Profile and optimize Ruta inference performance

2014-01-08 Thread Peter Klügl
On 07.01.2014 21:28, Marshall Schor wrote:
> On 1/7/2014 12:03 PM, Peter Klügl (JIRA) wrote:
>> [ 
>> https://issues.apache.org/jira/browse/UIMA-2332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13864394#comment-13864394
>>  ] 
>>
>> Peter Klügl commented on UIMA-2332:
>> ---
>>
>> after the latest improvements:
>> normal inference: 11.4 times faster
>> dynamic anchoring: 13.3 times faster
>>
>> There are still many possibilities to improve the performance, but I think 
>> that's enough for now. Maybe I will take another look at it tomorrow and 
>> then resolve the issue for the next release.
> with such good progress, if there is more low-hanging fruit, +1 for you to 
> "take
> a look tomorrow"!

The remaining fruit would require more time (which I currently do not
have). Some of it requires new concepts, the rest only a better
implementation of low-level functionality.

I will add some more information to the issue and then resolve it. I do
not want to optimize the inference for a test script that does not
really resemble realistic Ruta scripts (at least not those I create).
The test script is just a 1-to-1 translation of some ANNIE NER rules
and lacks much of what Ruta offers beyond JAPE. If I find the
time, I may profile the inference for some of our rule
applications. I asked Philip to rerun a script for the segmentation of
clinical discharge letters, and the performance has improved by a factor
of 5. That's not bad, but I think a closer look there will highlight
different spots in the inference for optimization.

Best,

Peter

> -Marshall



Re: [jira] [Commented] (UIMA-2332) Profile and optimize Ruta inference performance

2014-01-07 Thread Marshall Schor

On 1/7/2014 12:03 PM, Peter Klügl (JIRA) wrote:
> [ 
> https://issues.apache.org/jira/browse/UIMA-2332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13864394#comment-13864394
>  ] 
>
> Peter Klügl commented on UIMA-2332:
> ---
>
> after the latest improvements:
> normal inference: 11.4 times faster
> dynamic anchoring: 13.3 times faster
>
> There are still many possibilities to improve the performance, but I think 
> that's enough for now. Maybe I will take another look at it tomorrow and then 
> resolve the issue for the next release.
with such good progress, if there is more low-hanging fruit, +1 for you to "take
a look tomorrow"!

-Marshall



[jira] [Commented] (UIMA-2332) Profile and optimize Ruta inference performance

2014-01-07 Thread JIRA

[ 
https://issues.apache.org/jira/browse/UIMA-2332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13864394#comment-13864394
 ] 

Peter Klügl commented on UIMA-2332:
---

after the latest improvements:
normal inference: 11.4 times faster
dynamic anchoring: 13.3 times faster

There are still many possibilities to improve the performance, but I think 
that's enough for now. Maybe I will take another look at it tomorrow and then 
resolve the issue for the next release.



Re: [jira] [Commented] (UIMA-2332) Profile and optimize Ruta inference performance

2013-12-23 Thread Peter Klügl
On 23.12.2013 17:48, Marshall Schor wrote:
> This is the right way to develop :-)
>
> First you develop without too much attention to optimization, and then you
> measure, and work on those things that have high leverage!
>
> Congratulations!

Thanks, that's also what I always say :-)

We never really had problems with the speed, but maybe optimization
should not be ignored completely ;-)

Best,

Peter

> -Marshall
> On 12/23/2013 11:11 AM, Peter Klügl (JIRA) wrote:
>> [ 
>> https://issues.apache.org/jira/browse/UIMA-2332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13855716#comment-13855716
>>  ] 
>>
>> Peter Klügl commented on UIMA-2332:
>> ---
>>
>> First small improvement:
>> test script is now 5 times faster (!!!)...
>>
>>



[jira] [Commented] (UIMA-2332) Profile and optimize Ruta inference performance

2013-12-23 Thread JIRA

[ 
https://issues.apache.org/jira/browse/UIMA-2332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13855776#comment-13855776
 ] 

Peter Klügl commented on UIMA-2332:
---

After some improvements for compiled dictionaries: 8.6 times faster (normal 
inference) and 11.5 times faster (dynamic anchoring)



Re: [jira] [Commented] (UIMA-2332) Profile and optimize Ruta inference performance

2013-12-23 Thread Marshall Schor
:-)
On 12/23/2013 11:15 AM, Peter Klügl (JIRA) wrote:
> [ 
> https://issues.apache.org/jira/browse/UIMA-2332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13855720#comment-13855720
>  ] 
>
> Peter Klügl commented on UIMA-2332:
> ---
>
> with dynamic anchoring activated: 6.6 times faster
>



Re: [jira] [Commented] (UIMA-2332) Profile and optimize Ruta inference performance

2013-12-23 Thread Marshall Schor
This is the right way to develop :-)

First you develop without too much attention to optimization, and then you
measure, and work on those things that have high leverage!

Congratulations!

-Marshall
On 12/23/2013 11:11 AM, Peter Klügl (JIRA) wrote:
> [ 
> https://issues.apache.org/jira/browse/UIMA-2332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13855716#comment-13855716
>  ] 
>
> Peter Klügl commented on UIMA-2332:
> ---
>
> First small improvement:
> test script is now 5 times faster (!!!)...
>
>



[jira] [Commented] (UIMA-2332) Profile and optimize Ruta inference performance

2013-12-23 Thread JIRA

[ 
https://issues.apache.org/jira/browse/UIMA-2332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13855720#comment-13855720
 ] 

Peter Klügl commented on UIMA-2332:
---

with dynamic anchoring activated: 6.6 times faster



[jira] [Commented] (UIMA-2332) Profile and optimize Ruta inference performance

2013-12-23 Thread JIRA

[ 
https://issues.apache.org/jira/browse/UIMA-2332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13855716#comment-13855716
 ] 

Peter Klügl commented on UIMA-2332:
---

First small improvement:
test script is now 5 times faster (!!!)...




[jira] [Commented] (UIMA-2332) Profile and optimize Ruta inference performance

2013-12-23 Thread JIRA

[ 
https://issues.apache.org/jira/browse/UIMA-2332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13855704#comment-13855704
 ] 

Peter Klügl commented on UIMA-2332:
---

I created a small test script for some speed evaluations, similar to the NER 
JAPE rules in ANNIE. Looking at the results, I decided that Ruta seriously 
needs some speed improvements.
