GitHub user KellenSunderland opened a pull request:

    https://github.com/apache/incubator-joshua/pull/51

    Changes to improve performance of KenLM

    This change cleans up the kenlm_wrap.cc file to get rid of the multimap
    that was recently added. It replaces the multimap with a
    vector/unordered_set, which should allow for faster lookups (and is also
    less code).
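
    As a rough illustration of the vector/unordered_set idea (not the code
    in this pull request), the sketch below stores each distinct state once
    in a vector and keeps an unordered_set of indices into that vector for
    deduplication. The names State, StatePool, Intern, and Get are
    hypothetical, and the two-field State is only a stand-in for the real
    KenLM state type.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a language model state (assumption, not KenLM's type).
struct State {
  std::uint64_t left;
  std::uint64_t right;
  bool operator==(const State &o) const { return left == o.left && right == o.right; }
};

// Stores each distinct State once and hands out a stable integer index.
class StatePool {
 public:
  StatePool() : lookup_(16, HashByIndex{&states_}, EqualByIndex{&states_}) {}
  // The functors point at this object's vector, so copying is disabled.
  StatePool(const StatePool &) = delete;
  StatePool &operator=(const StatePool &) = delete;

  // Returns the index of an equal, already stored state, or stores a new one.
  std::size_t Intern(const State &s) {
    states_.push_back(s);  // tentatively append so the set can hash it by index
    auto inserted = lookup_.insert(states_.size() - 1);
    if (!inserted.second) states_.pop_back();  // duplicate: undo the append
    return *inserted.first;
  }

  const State &Get(std::size_t index) const {
    assert(index < states_.size());
    return states_[index];
  }

  std::size_t Size() const { return states_.size(); }

 private:
  // Hashes an index by the contents of the state it refers to.
  struct HashByIndex {
    const std::vector<State> *states;
    std::size_t operator()(std::size_t i) const {
      const State &s = (*states)[i];
      return std::hash<std::uint64_t>()(s.left) ^
             (std::hash<std::uint64_t>()(s.right) * 31);
    }
  };
  // Two indices are equal when the states they refer to are equal.
  struct EqualByIndex {
    const std::vector<State> *states;
    bool operator()(std::size_t a, std::size_t b) const {
      return (*states)[a] == (*states)[b];
    }
  };

  std::vector<State> states_;
  std::unordered_set<std::size_t, HashByIndex, EqualByIndex> lookup_;
};
```

    Interning the same state twice returns the same index, so callers can
    pass around a small integer instead of a pointer or a copy of the state.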
    
    Also included in this change is a modification to the probRule call that
    packs the state and probability returned from that call into 64 bits,
    which are then unpacked on the Java side. This eliminates the need to
    reference a Java object across the JNI boundary.
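
    A minimal sketch of this kind of packing follows. The layout is an
    assumption (probability bits in the high 32 bits, state index in the low
    32 bits); the description above does not specify it. The function names
    PackProbAndState, UnpackProb, and UnpackState are hypothetical. The
    unpacking is shown in C++ only to demonstrate the round trip; on the
    Java side the equivalent would presumably use Float.intBitsToFloat on
    the high half and an int cast for the low half.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Assumed layout (not confirmed by the PR text):
//   bits 63..32: IEEE-754 bit pattern of the float probability
//   bits 31..0 : unsigned state index
inline std::uint64_t PackProbAndState(float prob, std::uint32_t state_index) {
  static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be 32 bits");
  std::uint32_t prob_bits;
  std::memcpy(&prob_bits, &prob, sizeof(prob_bits));  // copy the bits without aliasing issues
  return (static_cast<std::uint64_t>(prob_bits) << 32) | state_index;
}

// Recovers the probability from the high 32 bits.
inline float UnpackProb(std::uint64_t packed) {
  std::uint32_t prob_bits = static_cast<std::uint32_t>(packed >> 32);
  float prob;
  std::memcpy(&prob, &prob_bits, sizeof(prob));
  return prob;
}

// Recovers the state index from the low 32 bits.
inline std::uint32_t UnpackState(std::uint64_t packed) {
  return static_cast<std::uint32_t>(packed & 0xFFFFFFFFu);
}
```

    Because the result is a single 64-bit primitive, it can cross the JNI
    boundary as a jlong return value with no object allocation or field
    access.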
    
    Finally, the scope of Chart objects in KenLM is changed to be per sentence,
    per language model, which should guarantee that there are no crashes due
    to collisions.
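
    The scoping can be pictured with the hypothetical sketch below, in which
    each language model object owns one pool per in-flight sentence. The
    names SentencePool, LanguageModel, CreatePool, and DestroyPool are
    illustrative assumptions, not the wrapper's actual interface, and the
    sketch omits the synchronization a multi-threaded decoder would need.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <unordered_map>
#include <vector>

// Hypothetical per-sentence scratch space standing in for a Chart object.
struct SentencePool {
  std::vector<int> states;
};

// Hypothetical wrapper: each language model owns its own pools,
// keyed by sentence, so state is never shared across sentences or models.
class LanguageModel {
 public:
  // Creates a fresh pool for the given sentence and returns a handle to it.
  SentencePool *CreatePool(std::size_t sentence_id) {
    std::unique_ptr<SentencePool> &slot = pools_[sentence_id];
    slot.reset(new SentencePool());
    return slot.get();
  }

  // Releases the pool once the sentence has been decoded.
  void DestroyPool(std::size_t sentence_id) { pools_.erase(sentence_id); }

  std::size_t ActivePools() const { return pools_.size(); }

 private:
  // Not thread-safe as written; a real decoder would guard this map.
  std::unordered_map<std::size_t, std::unique_ptr<SentencePool>> pools_;
};
```

    With this arrangement, state indices produced while decoding one
    sentence can never refer to entries written by another sentence or
    another model.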
    
    Many thanks to @kpu for providing the ideas behind these optimizations.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/KellenSunderland/incubator-joshua master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-joshua/pull/51.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #51
    
----
commit 5e9547526ad4bc15f48e665608897def552cb9ab
Author: Kenneth Heafield <git...@kheafield.com>
Date:   2016-09-13T08:58:26Z

    Probably won't compile but gets the idea across

commit 929760a35dda5f88792c44d6eef41f3e58cf7250
Author: Kellen Sunderland <kell...@amazon.com>
Date:   2016-09-13T09:23:42Z

    Merge branch 'master' of https://github.com/KellenSunderland/incubator-joshua

commit 4e07bb66d28e55357ee6b19b3c60a76a31d8dd75
Author: Kellen Sunderland <kell...@amazon.com>
Date:   2016-09-13T10:39:41Z

    Adapted Java side of JNI interface to get state and prob from packed long

commit 0252942dafc1679f2c5d6b8d6da7cd6884ca40c3
Author: Kellen Sunderland <kell...@amazon.com>
Date:   2016-09-13T11:58:05Z

    Manage pool of states on a per LM, per sentence basis

----


