I'm trying to explore part-of-speech tagging with SOLR. First, am I
right in assuming that the OpenNLP integration is the right direction in
which to proceed?

With respect to getting OpenNLP to work with SOLR
(http://wiki.apache.org/solr/OpenNLP), I tried following the
instructions, only to be faced with an error complaining that
OpenNLPTokenizerFactory cannot be found. Upon researching the error,
I came across the issue
https://issues.apache.org/jira/browse/LUCENE-2899, which indicates
that the integration is not yet complete and that the OpenNLP
functionality is only available via a patch (I'm running SOLR 4.1
locally).

I tried patching my SOLR 4.1 source, as well as a freshly downloaded
SOLR trunk, to no avail. I guess I just need some tips on how and what
to patch. I tried applying the patch from the base directory as well
as from the lucene directory. If there's something I need to hack in
the patch, do let me know.

Thanks

vinayb@blackbox ~/Downloads/solr-4.1.0/lucene $ pwd
/home/vinayb/Downloads/solr-4.1.0/lucene
vinayb@blackbox ~/Downloads/solr-4.1.0/lucene $ ls
analysis   BUILD.txt    codecs            demo      highlighter       JRE_VERSION_MIGRATION.txt  LUCENE-2899.patch  misc              queries      sandbox  suggest                  tools
backwards  build.xml    common-build.xml  facet     ivy-settings.xml  licenses                   memory             module-build.xml  queryparser  site     SYSTEM_REQUIREMENTS.txt
benchmark  CHANGES.txt  core              grouping  join              LICENSE.txt                MIGRATE.txt        NOTICE.txt        README.txt   spatial  test-framework
vinayb@blackbox ~/Downloads/solr-4.1.0/lucene $ patch -p0 -i LUCENE-2899.patch --dry-run
can't find file to patch at input line 5
Perhaps you used the wrong -p or --strip option?
The text leading up to this was:
--------------------------
|diff --git dev-tools/eclipse/dot.classpath dev-tools/eclipse/dot.classpath
|index 1d2abc1..575b4f0 100644
|--- dev-tools/eclipse/dot.classpath
|+++ dev-tools/eclipse/dot.classpath
--------------------------
File to patch:
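
One thing the dry run does show: the diff header names
dev-tools/eclipse/dot.classpath with no leading a/ or b/ component, so
with -p0 patch looks for dev-tools/ directly under the current
directory, and the lucene listing above has no dev-tools entry. Below
is a minimal sketch for checking whether a directory looks like the
root the patch expects before running patch at all. The function
names are hypothetical helpers, not part of patch(1), and it assumes
the first "+++" line in the file is a diff header using the same path
style as the rest of the patch.

```shell
#!/bin/sh
# Hypothetical helpers for checking where a -p0 patch should be applied.

# first_patch_target PATCHFILE
# Print the first path named on a "+++ " header line of the patch,
# dropping anything after the path (for example a timestamp).
first_patch_target() {
    sed -n 's/^+++ \([^[:space:]]*\).*/\1/p' "$1" | head -n 1
}

# patch_root_ok PATCHFILE DIR
# Succeed only if that first path exists as a regular file under DIR,
# which is what "patch -p0" run from DIR would need.
patch_root_ok() {
    target=$(first_patch_target "$1") || return 1
    [ -n "$target" ] && [ -f "$2/$target" ]
}

# Example use, run from the directory being tested:
#   patch_root_ok LUCENE-2899.patch . \
#       && patch -p0 --dry-run -i LUCENE-2899.patch
```

If the check fails in every candidate directory, the tree probably
does not contain the files the patch was made against, rather than
the -p level being wrong.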
