Re: Integrate solr with openNLP

2014-09-10 Thread Aman Tandon
Hi,

What is the progress on integrating NLP with Solr? If you have achieved
this integration successfully, please share your approach with us.

With Regards
Aman Tandon

Re: Integrate solr with openNLP

2014-09-10 Thread Vivekanand Ittigi
Actually, we dropped the direct NLP-with-Solr integration and went with two
different ideas instead:

* We're using NLP separately, not inside Solr.
* We're using UIMA with Solr. It's more advanced.

If you have a specific question, feel free to ask me. I'll answer if I know.

-Vivek


Re: Integrate solr with openNLP

2014-06-09 Thread Vivekanand Ittigi
Hi Aman,

Yeah, we are thinking the same: using UIMA is better. Thanks to
everyone. You guys really showed us the way (UIMA).

We'll work on it.

Thanks,
Vivek



Re: Integrate solr with openNLP

2014-06-06 Thread Aman Tandon
Hi Vivek,

Since everybody on the mailing list suggested UIMA, you should go for it.
OpenNLP issues are not being tracked properly, so your development could get
stuck if a problem comes up in the near future. It's better to start
investigating UIMA.


With Regards
Aman Tandon


On Fri, Jun 6, 2014 at 11:00 AM, Vivekanand Ittigi vi...@biginfolabs.com
wrote:

 Can anyone please reply?

 Thanks,
 Vivek

 -- Forwarded message --
 From: Vivekanand Ittigi vi...@biginfolabs.com
 Date: Wed, Jun 4, 2014 at 4:38 PM
 Subject: Re: Integrate solr with openNLP
 To: Tommaso Teofili tommaso.teof...@gmail.com
 Cc: solr-user@lucene.apache.org solr-user@lucene.apache.org, Ahmet
 Arslan iori...@yahoo.com



Re: Integrate solr with openNLP

2014-06-04 Thread Tommaso Teofili
Hi all,

Ahmet was suggesting the UIMA integration because OpenNLP already has an
integration with Apache UIMA, so you would just have to use that [1].
That's one of the main reasons the UIMA integration was done: it's a
framework that you can easily hook into in order to plug in your NLP algorithm.

If you want to use only OpenNLP, it's up to you whether to write your own
UpdateRequestProcessor plugin [2] that adds metadata extracted by OpenNLP
to your documents, or to write a dedicated analyzer / tokenizer
/ token filter.
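
A minimal sketch of the update-processor idea, assuming nothing beyond the JDK. The classes below are hypothetical stand-ins, not Solr's API: Processor mimics the chaining of Solr's UpdateRequestProcessor (do your work in processAdd, then pass the document on), a plain Map stands in for the Solr document, and a fixed list of names stands in for an OpenNLP name finder model.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Stand-in for Solr's UpdateRequestProcessor: each processor handles a
// document and then hands it to the next processor in the chain.
abstract class Processor {
    private final Processor next;

    Processor(Processor next) {
        this.next = next;
    }

    void processAdd(Map<String, List<String>> doc) {
        if (next != null) {
            next.processAdd(doc);
        }
    }
}

// Hypothetical enrichment step: copies every known name found in the
// "content" field into a separate "person" field.
class PersonEnrichmentProcessor extends Processor {
    // Assumption: a fixed name list replaces a real NER model here.
    private final List<String> knownNames;

    PersonEnrichmentProcessor(List<String> knownNames, Processor next) {
        super(next);
        this.knownNames = knownNames;
    }

    @Override
    void processAdd(Map<String, List<String>> doc) {
        List<String> found = new ArrayList<>();
        for (String text : doc.getOrDefault("content", List.of())) {
            for (String name : knownNames) {
                if (text.contains(name) && !found.contains(name)) {
                    found.add(name);
                }
            }
        }
        if (!found.isEmpty()) {
            doc.put("person", found);
        }
        super.processAdd(doc); // pass the enriched document down the chain
    }
}

class EnrichmentChainDemo {
    // Builds a one-field document and runs it through the enrichment step.
    static Map<String, List<String>> enrich(String content) {
        Map<String, List<String>> doc = new LinkedHashMap<>();
        doc.put("content", List.of(content));
        new PersonEnrichmentProcessor(List.of("Ada Lovelace", "Alan Turing"), null)
                .processAdd(doc);
        return doc;
    }

    public static void main(String[] args) {
        System.out.println(enrich("Ada Lovelace corresponded with Charles Babbage."));
    }
}
```

In a real plugin the same logic would sit in a subclass of Solr's UpdateRequestProcessor, created by an UpdateRequestProcessorFactory and registered in an update chain in solrconfig.xml; the field names content and person are only examples.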

For the OpenNLP integration (LUCENE-2899), the patch is not up to date with
the latest APIs in trunk. However, you should be able to apply it to (if I
recall correctly) version 4.4 or so, and adapting it to the latest
API shouldn't be too hard either.

Regards,
Tommaso

[1] :
http://opennlp.apache.org/documentation/1.5.3/manual/opennlp.html#org.apche.opennlp.uima
[2] : http://wiki.apache.org/solr/UpdateRequestProcessor



2014-06-03 15:34 GMT+02:00 Ahmet Arslan iori...@yahoo.com.invalid:

 Can you extract names, locations, etc. using OpenNLP in a plain Java
 program?

 If yes, here are two separate options:

 1) Use http://searchhub.org/2012/02/14/indexing-with-solrj/ as an example,
 integrate your NER code into it, and write your own indexing code. You
 have full control here. No Solr plugins are involved.

 2) Use 'Implementing a conditional copyField' given here:
 http://wiki.apache.org/solr/UpdateRequestProcessor
 as an example and integrate your NER code into it.


 Please note that these are separate ways to enrich your incoming
 documents; choose either (1) or (2).



 On Tuesday, June 3, 2014 3:30 PM, Vivekanand Ittigi vi...@biginfolabs.com
 wrote:
 Okay, but I didn't understand what you said. Can you please elaborate?

 Thanks,
 Vivek





 On Tue, Jun 3, 2014 at 5:36 PM, Ahmet Arslan iori...@yahoo.com wrote:

  Hi Vivekanand,
 
  I have never used UIMA+Solr before.
 
  Personally, I think it takes more time to learn how to configure and use
  the UIMA stuff.
 
 
  If you are familiar with Java, write a class that extends
  UpdateRequestProcessor(Factory). Use OpenNLP for NER and add the new
  fields (organisation, city, person name, etc.) to your document. This
  phase is usually called 'enrichment'.
 
  Does that make sense?
 
 
 
  On Tuesday, June 3, 2014 2:57 PM, Vivekanand Ittigi 
 vi...@biginfolabs.com
  wrote:
  Hi Ahmet,
 
  I followed the page you pointed to:
  https://cwiki.apache.org/confluence/display/solr/UIMA+Integration
  But how can I achieve my goal? I mean extracting only the name of the
  organization or person from the content field.
 
  I guess I'm almost there, but something is missing. Please guide me.
 
  Thanks,
  Vivek
 
 
 
 
 
  On Tue, Jun 3, 2014 at 2:50 PM, Vivekanand Ittigi vi...@biginfolabs.com
 
  wrote:
 
   I can't share the entire goal, but one of the tasks is like this: we
   have big documents (websites, PDFs, etc.) indexed into Solr.
   Let's say the field name="content" stores the contents of the document.
   All I want to do is pick the names of persons and places from it using
   OpenNLP or some other means.
  
   Those names should be reflected in Solr itself.
  
   Thanks,
   Vivek
  
  

Re: Integrate solr with openNLP

2014-06-04 Thread Vivekanand Ittigi
Hi Tommaso,

Yes, you are right. Version 4.4 works, and I'm able to compile now. I'm
trying to apply named entity recognition (person names) to tokens, but I'm
not seeing any change. My schema.xml looks like this:

<field name="text" type="text_opennlp_pos_ner" indexed="true" stored="true"
       multiValued="true"/>

<fieldType name="text_opennlp_pos_ner" class="solr.TextField"
           positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.OpenNLPTokenizerFactory"
               tokenizerModel="opennlp/en-token.bin"/>
    <filter class="solr.OpenNLPFilterFactory"
            nerTaggerModels="opennlp/en-ner-person.bin"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

Please guide me.

Thanks,
Vivek



Re: Integrate solr with openNLP

2014-06-03 Thread Ahmet Arslan
Hi,

Please tell us what you are trying to do, in a new thread: your high-level goal. 
There may be other ways/tools besides OpenNLP, such as Apache Stanbol 
( https://stanbol.apache.org ).



On Tuesday, June 3, 2014 8:31 AM, Vivekanand Ittigi vi...@biginfolabs.com 
wrote:



We'll surely look into UIMA integration. 

But before moving on: is this ( https://wiki.apache.org/solr/OpenNLP ) the only 
link available for the integration? Isn't there any other article or link that 
might help us fix this problem?

Thanks,
Vivek




On Tue, Jun 3, 2014 at 2:50 AM, Ahmet Arslan iori...@yahoo.com wrote:

Hi,

I believe I answered it. Let me re-try, 

There is no committed code for OpenNLP. There is an open ticket with patches. 
They may not work with current trunk.

Confluence is the official documentation. The wiki is maintained by the 
community, which means it can describe uncommitted features, like this one: 
https://wiki.apache.org/solr/OpenNLP

What I am suggesting is, have a look at 
https://cwiki.apache.org/confluence/display/solr/UIMA+Integration


And search for how to use OpenNLP inside UIMA. Maybe what LUCENE-2899 offers is 
already doable with solr-uima. I am adding Tommaso (sorry for this, but we need 
an authoritative answer here) to clarify this.


Also consider indexing with SolrJ and doing the OpenNLP enrichment outside Solr. 
Use OpenNLP from plain Java, enrich your documents, and index them with SolrJ. 
You don't have to do everything inside Solr as Solr plugins.
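
The outside-Solr approach can be sketched as follows, using only the JDK. Everything here is illustrative: the class and method names are made up, the toy capitalizedPairs extractor merely stands in for a real OpenNLP name finder, and the resulting Map is what you would copy into a SolrJ SolrInputDocument before sending it to Solr.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

class ClientSideEnrichment {
    // Builds the field map for one document: the original text plus the
    // extracted person names. The extractor is pluggable, so a real NER
    // implementation can replace the toy one below.
    static Map<String, Object> buildDocument(String id, String content,
            Function<String, List<String>> personExtractor) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("id", id);
        doc.put("content", content);
        List<String> persons = personExtractor.apply(content);
        if (!persons.isEmpty()) {
            doc.put("person", persons);
        }
        return doc;
    }

    // Hypothetical toy extractor: treats two consecutive capitalized words
    // as a person name. It is NOT a substitute for a trained model.
    static List<String> capitalizedPairs(String text) {
        List<String> names = new ArrayList<>();
        String[] words = text.split("\\s+");
        for (int i = 0; i + 1 < words.length; i++) {
            if (isCapitalized(words[i]) && isCapitalized(words[i + 1])) {
                names.add(words[i] + " " + words[i + 1]);
                i++; // skip the second word of the pair
            }
        }
        return names;
    }

    private static boolean isCapitalized(String word) {
        return !word.isEmpty() && Character.isUpperCase(word.charAt(0));
    }

    public static void main(String[] args) {
        System.out.println(buildDocument("1", "we met Grace Hopper yesterday",
                ClientSideEnrichment::capitalizedPairs));
    }
}
```

Keeping the extractor behind a Function means the indexing code does not change when the toy heuristic is swapped for OpenNLP, and the enrichment can be tested without a running Solr instance.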

Hope this helps,

Ahmet



On Monday, June 2, 2014 11:15 PM, Vivekanand Ittigi vi...@biginfolabs.com 
wrote:
Thanks, I will check the JIRA issue, but you didn't answer my first
question. Is there no way to integrate Solr with OpenNLP? Or is there
any committed code I can use to go ahead?

Thanks,
Vivek





On Mon, Jun 2, 2014 at 10:30 PM, Ahmet Arslan iori...@yahoo.com wrote:

 Hi,

 Here is the jira issue : https://issues.apache.org/jira/browse/LUCENE-2899


 Anyone can create an account.

 I haven't used UIMA myself and I have little knowledge about it. But I
 believe it is possible to use OpenNLP inside UIMA.
 You need to dig into the UIMA documentation.

 A Solr UIMA integration already exists; that's why I asked whether your
 requirement is possible with UIMA or not. I don't know the answer myself.

 Ahmet



 On Monday, June 2, 2014 7:42 PM, Vivekanand Ittigi vi...@biginfolabs.com
 wrote:
 Hi Arslan,

 If not the uncommitted code, then which code should be used to integrate?

 If I have to comment on my problems, which JIRA issue should I use, and
 how do I post there?

 And why are you suggesting the UIMA integration? My requirement is
 integrating with OpenNLP. Do you mean we can do through UIMA all the
 activities we do with OpenNLP, like name finding, location finding, etc.?

 Thanks,
 Vivek





 On Mon, Jun 2, 2014 at 8:40 PM, Ahmet Arslan iori...@yahoo.com.invalid
 wrote:

  Hi,
 
  Uncommitted code could have these kind of problems. It is not guaranteed
  to work with latest trunk.
 
  You could commend the problem you face on the jira ticket.
 
  By the way, may be you are after something doable with already committed
  UIMA stuff?
 
  https://cwiki.apache.org/confluence/display/solr/UIMA+Integration
 
  Ahmet
 
 
 
  On Monday, June 2, 2014 5:07 PM, Vivekanand Ittigi 
 vi...@biginfolabs.com
  wrote:
  I followed this link to integrate https://wiki.apache.org/solr/OpenNLP
 to
  integrate
 
  Installation
 
  For English language testing: Until LUCENE-2899 is committed:
 
      1.pull the latest trunk or 4.0 branch
 
      2.apply the latest LUCENE-2899 patch
      3.do 'ant compile'
      cd solr/contrib/opennlp/src/test-files/training
      .
      .
      .
  i followed first two steps but got the following error while executing
 3rd
  point
 
  common.compile-core:
      [javac] Compiling 10 source files to
 
 
 /home/biginfolabs/solrtest/solr-lucene-trunk3/lucene/build/analysis/opennlp/classes/java
 
      [javac] warning: [path] bad path element
 
 
 /home/biginfolabs/solrtest/solr-lucene-trunk3/lucene/analysis/opennlp/lib/jwnl-1.3.3.jar:
  no such file or directory
 
      [javac]
 
 
 /home/biginfolabs/solrtest/solr-lucene-trunk3/lucene/analysis/opennlp/src/java/org/apache/lucene/analysis/opennlp/FilterPayloadsFilter.java:43:
  error: cannot find symbol
 
      [javac]     super(Version.LUCENE_44, input);
 
      [javac]                  ^
      [javac]   symbol:   variable LUCENE_44
      [javac]   location: class Version
      [javac]
 
 
 /home/biginfolabs/solrtest/solr-lucene-trunk3/lucene/analysis/opennlp/src/java/org/apache/lucene/analysis/opennlp/OpenNLPTokenizer.java:56:
  error: no suitable constructor found for Tokenizer(Reader)
      [javac]     super(input);
      [javac]     ^
      [javac]     constructor Tokenizer.Tokenizer(AttributeFactory) is not
  applicable
      [javac]       (actual argument Reader cannot be converted to
  AttributeFactory by method invocation conversion)
      [javac]     constructor Tokenizer.Tokenizer() is not applicable
      [javac]       

Re: Integrate solr with openNLP

2014-06-03 Thread Vivekanand Ittigi
I can't share the entire goal, but one of the tasks is this: we have a
big document (a website, a PDF, etc.) indexed in Solr.
Let's say the field name=content stores the contents of the document. All
I want to do is pick the names of persons and places from it, using OpenNLP
or some other means.

Those names should be reflected in Solr itself.

Thanks,
Vivek


On Tue, Jun 3, 2014 at 1:33 PM, Ahmet Arslan iori...@yahoo.com wrote:

 Hi,

 Please tell us what you are trying to do, in a new thread: your high-level
 goal. There may be other ways or tools besides OpenNLP, such as
 https://stanbol.apache.org



 On Tuesday, June 3, 2014 8:31 AM, Vivekanand Ittigi vi...@biginfolabs.com
 wrote:



 We'll surely look into UIMA integration.

 But before moving on, is this ( https://wiki.apache.org/solr/OpenNLP ) the
 only link we've got for the integration? Isn't there any other article or
 link that may help us fix this problem?

 Thanks,
 Vivek




 On Tue, Jun 3, 2014 at 2:50 AM, Ahmet Arslan iori...@yahoo.com wrote:

 Hi,
 
 I believe I answered it. Let me re-try,
 
 There is no committed code for OpenNLP. There is an open ticket with
 patches. They may not work with current trunk.
 
 Confluence is the official documentation. The wiki is maintained by the
 community, which means it can describe uncommitted features, like this one:
 https://wiki.apache.org/solr/OpenNLP
 
 What I am suggesting is, have a look at
 https://cwiki.apache.org/confluence/display/solr/UIMA+Integration
 
 
 And search for how to use OpenNLP inside UIMA. Maybe LUCENE-2899 is already
 doable with solr-uima. I am adding Tommaso (sorry for this, but we need an
 authoritative answer here) to clarify this.
 
 
 Also consider indexing with SolrJ and doing the OpenNLP enrichment outside
 Solr. Use OpenNLP with plain Java, enrich your documents, and index them
 with SolrJ. You don't have to do everything inside Solr as Solr plugins.
 
 Hope this helps,
 
 Ahmet
 
 
 

Re: Integrate solr with openNLP

2014-06-03 Thread Vivekanand Ittigi
Hi Ahmet,

I followed what you suggested:
https://cwiki.apache.org/confluence/display/solr/UIMA+Integration
But how can I achieve my goal? I mean extracting only the name of the
organization or person from the content field.

I guess I'm almost there, but something is missing. Please guide me.

Thanks,
Vivek



Re: Integrate solr with openNLP

2014-06-03 Thread Ahmet Arslan
Hi Vivekanand,

I have never used UIMA+Solr before.

Personally, I think it would take more time to learn how to configure and use 
the UIMA components.


If you are familiar with Java, write a class that extends 
UpdateRequestProcessor (together with its UpdateRequestProcessorFactory). Use 
OpenNLP for NER and add the new fields (organisation, city, person name, etc.) 
to your document. This phase is usually called 'enrichment'. 
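
The processor idea can be sketched roughly as follows. This is not Solr code: `Processor`, `PersonEnricher`, and `IndexStub` are hypothetical, JDK-only stand-ins that only mirror the shape of the pattern, where each processor handles the incoming document and then hands it to the next one in the chain. In Solr itself the corresponding hook is `UpdateRequestProcessor.processAdd(AddUpdateCommand)`, and the name list below would be replaced by a real NER model.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class EnrichmentChainSketch {

    /** Simplified stand-in for an update processor: handle the document, then delegate. */
    static abstract class Processor {
        private final Processor next;

        Processor(Processor next) {
            this.next = next;
        }

        void processAdd(Map<String, Object> doc) {
            if (next != null) {
                next.processAdd(doc);
            }
        }
    }

    /** Enrichment step: adds a "person" field, then passes the document down the chain. */
    static class PersonEnricher extends Processor {
        private final List<String> knownPersons; // toy replacement for a real NER model

        PersonEnricher(List<String> knownPersons, Processor next) {
            super(next);
            this.knownPersons = knownPersons;
        }

        @Override
        void processAdd(Map<String, Object> doc) {
            Object content = doc.get("content");
            if (content != null) {
                List<String> persons = new ArrayList<>();
                for (String name : knownPersons) {
                    if (content.toString().contains(name)) {
                        persons.add(name);
                    }
                }
                if (!persons.isEmpty()) {
                    doc.put("person", persons);
                }
            }
            super.processAdd(doc); // always continue the chain, enriched or not
        }
    }

    /** Last link: stand-in for the step that actually writes to the index. */
    static class IndexStub extends Processor {
        final List<Map<String, Object>> indexed = new ArrayList<>();

        IndexStub() {
            super(null);
        }

        @Override
        void processAdd(Map<String, Object> doc) {
            indexed.add(doc);
        }
    }

    public static void main(String[] args) {
        IndexStub index = new IndexStub();
        Processor chain = new PersonEnricher(Arrays.asList("Grace Hopper"), index);

        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("id", "42");
        doc.put("content", "Grace Hopper worked on COBOL.");
        chain.processAdd(doc);

        System.out.println(index.indexed);
    }
}
```

Because the enrichment step sits in front of the indexing step, every document that reaches the index has already passed through it, regardless of which client sent it.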

Does that make sense?




Re: Integrate solr with openNLP

2014-06-03 Thread Vivekanand Ittigi
Okay, but I didn't understand what you said. Can you please elaborate?

Thanks,
Vivek



Re: Integrate solr with openNLP

2014-06-03 Thread Ahmet Arslan
Can you extract names, locations, etc. using OpenNLP in a plain Java 
program?

If yes, here are two separate options: 

1) Use http://searchhub.org/2012/02/14/indexing-with-solrj/ as an example, 
integrate your NER code into it, and write your own indexing code. You have 
full control here. No Solr plugins are involved.

2) Use 'Implementing a conditional copyField', given here: 
http://wiki.apache.org/solr/UpdateRequestProcessor
as an example and integrate your NER code into it. 


Please note that these are separate ways to enrich your incoming documents; 
choose either (1) or (2).
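
For option (2), the core of a conditional copy can be sketched like this. It is only an illustration of the logic, not the wiki's code: `copyIf`, the `looksLikePersonName` check, and the field names are hypothetical, and the regular expression is a crude placeholder for a real NER decision.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Predicate;

public class ConditionalCopySketch {

    /** Copies doc[source] into doc[dest] only when the value exists and passes the condition. */
    static boolean copyIf(Map<String, Object> doc, String source, String dest,
                          Predicate<Object> condition) {
        Object value = doc.get(source);
        if (value == null || !condition.test(value)) {
            return false; // nothing copied; the document is left unchanged
        }
        doc.put(dest, value);
        return true;
    }

    /** Placeholder check: two capitalized words. A real setup would ask the NER model instead. */
    static boolean looksLikePersonName(Object value) {
        return value.toString().matches("[A-Z][a-z]+ [A-Z][a-z]+");
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("id", "7");
        doc.put("author", "Alan Turing");

        boolean copied = copyIf(doc, "author", "person", ConditionalCopySketch::looksLikePersonName);
        System.out.println(copied + " " + doc);
    }
}
```

Inside an update processor, this check would run once per incoming document before the document is passed on.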




Re: Integrate solr with openNLP

2014-06-02 Thread Ahmet Arslan
Hi,

Uncommitted code can have this kind of problem. It is not guaranteed to 
work with the latest trunk.

You could comment on the JIRA ticket about the problem you are facing.

By the way, maybe what you are after is doable with the already committed UIMA 
integration?

https://cwiki.apache.org/confluence/display/solr/UIMA+Integration

Ahmet



On Monday, June 2, 2014 5:07 PM, Vivekanand Ittigi vi...@biginfolabs.com 
wrote:
I followed this link to integrate: https://wiki.apache.org/solr/OpenNLP

Installation

For English language testing: Until LUCENE-2899 is committed:

    1. pull the latest trunk or 4.0 branch
    2. apply the latest LUCENE-2899 patch
    3. do 'ant compile'
    cd solr/contrib/opennlp/src/test-files/training
    .
    .
    .
I followed the first two steps but got the following error while executing
the third step:

common.compile-core:
    [javac] Compiling 10 source files to
/home/biginfolabs/solrtest/solr-lucene-trunk3/lucene/build/analysis/opennlp/classes/java

    [javac] warning: [path] bad path element
/home/biginfolabs/solrtest/solr-lucene-trunk3/lucene/analysis/opennlp/lib/jwnl-1.3.3.jar:
no such file or directory

    [javac]
/home/biginfolabs/solrtest/solr-lucene-trunk3/lucene/analysis/opennlp/src/java/org/apache/lucene/analysis/opennlp/FilterPayloadsFilter.java:43:
error: cannot find symbol

    [javac]     super(Version.LUCENE_44, input);

    [javac]                  ^
    [javac]   symbol:   variable LUCENE_44
    [javac]   location: class Version
    [javac]
/home/biginfolabs/solrtest/solr-lucene-trunk3/lucene/analysis/opennlp/src/java/org/apache/lucene/analysis/opennlp/OpenNLPTokenizer.java:56:
error: no suitable constructor found for Tokenizer(Reader)
    [javac]     super(input);
    [javac]     ^
    [javac]     constructor Tokenizer.Tokenizer(AttributeFactory) is not
applicable
    [javac]       (actual argument Reader cannot be converted to
AttributeFactory by method invocation conversion)
    [javac]     constructor Tokenizer.Tokenizer() is not applicable
    [javac]       (actual and formal argument lists differ in length)
    [javac] 2 errors
    [javac] 1 warning

I'm really stuck on how to get past this step. I wasted my entire day trying
to fix this but couldn't make any progress. Could someone please help me?

Thanks,
Vivek



Re: Integrate solr with openNLP

2014-06-02 Thread Vivekanand Ittigi
Hi Arslan,

If not the uncommitted code, then which code should be used for the integration?

If I have to comment on my problems, which JIRA issue should I use, and how do
I post there?

And why are you suggesting UIMA integration? My requirement is integration
with OpenNLP. Do you mean we can do everything through UIMA that we do with
OpenNLP, such as name finding, location finding, etc.?

Thanks,
Vivek


On Mon, Jun 2, 2014 at 8:40 PM, Ahmet Arslan iori...@yahoo.com.invalid
wrote:

 Hi,

 Uncommitted code could have these kind of problems. It is not guaranteed
 to work with latest trunk.

 You could commend the problem you face on the jira ticket.

 By the way, may be you are after something doable with already committed
 UIMA stuff?

 https://cwiki.apache.org/confluence/display/solr/UIMA+Integration

 Ahmet



 On Monday, June 2, 2014 5:07 PM, Vivekanand Ittigi vi...@biginfolabs.com
 wrote:
 I followed this link to integrate https://wiki.apache.org/solr/OpenNLP to
 integrate

 Installation

 For English language testing: Until LUCENE-2899 is committed:

 1.pull the latest trunk or 4.0 branch

 2.apply the latest LUCENE-2899 patch
 3.do 'ant compile'
 cd solr/contrib/opennlp/src/test-files/training
 .
 .
 .
 i followed first two steps but got the following error while executing 3rd
 point

 common.compile-core:
     [javac] Compiling 10 source files to /home/biginfolabs/solrtest/solr-lucene-trunk3/lucene/build/analysis/opennlp/classes/java
     [javac] warning: [path] bad path element /home/biginfolabs/solrtest/solr-lucene-trunk3/lucene/analysis/opennlp/lib/jwnl-1.3.3.jar: no such file or directory
     [javac] /home/biginfolabs/solrtest/solr-lucene-trunk3/lucene/analysis/opennlp/src/java/org/apache/lucene/analysis/opennlp/FilterPayloadsFilter.java:43: error: cannot find symbol
     [javac]     super(Version.LUCENE_44, input);
     [javac]                  ^
     [javac]   symbol:   variable LUCENE_44
     [javac]   location: class Version
     [javac] /home/biginfolabs/solrtest/solr-lucene-trunk3/lucene/analysis/opennlp/src/java/org/apache/lucene/analysis/opennlp/OpenNLPTokenizer.java:56: error: no suitable constructor found for Tokenizer(Reader)
     [javac]     super(input);
     [javac]     ^
     [javac]     constructor Tokenizer.Tokenizer(AttributeFactory) is not applicable
     [javac]       (actual argument Reader cannot be converted to AttributeFactory by method invocation conversion)
     [javac]     constructor Tokenizer.Tokenizer() is not applicable
     [javac]       (actual and formal argument lists differ in length)
     [javac] 2 errors
     [javac] 1 warning

 I'm really stuck on how to get past this step. I spent my entire day trying
 to fix this but couldn't make any progress. Could someone please help me?

 Thanks,
 Vivek
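
Note on the two javac errors above: both are consistent with the patch having been written against an older Lucene API than the one in the trunk checkout. The log itself shows that the checked-out Tokenizer only offers Tokenizer() and Tokenizer(AttributeFactory), so a subclass that calls super(input) with a Reader cannot compile. The following is only an illustrative sketch of that mismatch using stand-in classes. TrunkStyleTokenizer, WordCountingTokenizer and TokenizerDriftDemo are hypothetical names, not Lucene classes, and the sketch assumes the reader is handed over after construction through a setReader-style method instead of through the constructor.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical stand-in for a base class whose constructors take no Reader,
// mirroring what the javac output reports for the checked-out Tokenizer.
abstract class TrunkStyleTokenizer {
    protected Reader input;

    protected TrunkStyleTokenizer() {
        // No Reader parameter: a subclass calling super(reader) would not compile.
    }

    // The input is supplied after construction instead.
    final void setReader(Reader reader) {
        this.input = reader;
    }
}

// Hypothetical subclass adapted to that shape: super() replaces super(input).
final class WordCountingTokenizer extends TrunkStyleTokenizer {
    WordCountingTokenizer() {
        super();
    }

    // Counts whitespace-separated tokens in the reader set via setReader.
    int countTokens() {
        if (input == null) {
            throw new IllegalStateException("setReader must be called first");
        }
        int count = 0;
        boolean inToken = false;
        try {
            int c;
            while ((c = input.read()) != -1) {
                if (Character.isWhitespace(c)) {
                    inToken = false;
                } else if (!inToken) {
                    inToken = true;
                    count++;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }
}

final class TokenizerDriftDemo {
    public static void main(String[] args) {
        WordCountingTokenizer tokenizer = new WordCountingTokenizer();
        tokenizer.setReader(new StringReader("Integrate solr with openNLP"));
        System.out.println(tokenizer.countTokens()); // prints 4
    }
}
```

Porting the patch this way is speculative. Elsewhere in this thread the build is reported to succeed against the 4.4 version, which is likely the simpler route.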




Re: Integrate solr with openNLP

2014-06-02 Thread Ahmet Arslan
Hi,

Here is the JIRA issue: https://issues.apache.org/jira/browse/LUCENE-2899

Anyone can create an account.

I haven't used UIMA myself and I have little knowledge of it, but I believe
it is possible to use OpenNLP inside UIMA.
You will need to dig into the UIMA documentation.

Solr UIMA integration already exists; that's why I asked whether your
requirement can be met with UIMA. I don't know the answer myself.

Ahmet



On Monday, June 2, 2014 7:42 PM, Vivekanand Ittigi vi...@biginfolabs.com 
wrote:
Hi Arslan,

If not the uncommitted code, then which code should be used for the integration?

If I have to comment on my problem, which JIRA issue is it, and how do I post there?

And why are you suggesting UIMA integration? My requirement is to integrate
with OpenNLP. Do you mean we can do everything through UIMA that we do
with OpenNLP, such as name and location finding?

Thanks,
Vivek










Re: Integrate solr with openNLP

2014-06-02 Thread Vivekanand Ittigi
Thanks, I will check the JIRA issue, but you didn't answer my first
question. Is there no way to integrate Solr with OpenNLP? Or is there
any committed code that I can use to go ahead?

Thanks,
Vivek






Re: Integrate solr with openNLP

2014-06-02 Thread Ahmet Arslan
Hi,

I believe I answered it. Let me try again:

There is no committed code for OpenNLP. There is an open ticket with patches.
They may not work with the current trunk.

Confluence is the official documentation. The wiki is maintained by the
community, which means it can describe uncommitted features, like this one:
https://wiki.apache.org/solr/OpenNLP

What I am suggesting is that you have a look at
https://cwiki.apache.org/confluence/display/solr/UIMA+Integration


Then look into how to use OpenNLP inside UIMA. Maybe what LUCENE-2899 offers
is already doable with solr-uima. I am adding Tommaso (sorry for this, but we
need an authoritative answer here) to clarify this.


Also consider indexing with SolrJ and doing the OpenNLP enrichment outside Solr.
Use OpenNLP from plain Java, enrich your documents, and index them with SolrJ.
You don't have to do everything inside Solr as Solr plugins.

Hope this helps,

Ahmet
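
Ahmet's last suggestion above, enriching documents outside Solr and then indexing them, can be sketched roughly as follows. This is only an outline under stated assumptions: PersonFinder, Indexer, DictionaryPersonFinder, EnrichAndIndex and the field names are hypothetical stand-ins so that the example runs with the JDK alone. In a real setup the finder would wrap an OpenNLP name finder loaded from a model such as en-ner-person.bin, and the indexer would wrap a SolrJ client.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the NER step (with OpenNLP this would wrap a name finder model).
interface PersonFinder {
    List<String> findPersons(String text);
}

// Hypothetical stand-in for the indexing client (with SolrJ this would wrap a Solr client object).
interface Indexer {
    void add(Map<String, Object> document);
}

// Toy finder used only to make the sketch runnable: matches tokens against a fixed name list.
final class DictionaryPersonFinder implements PersonFinder {
    private final Set<String> knownNames;

    DictionaryPersonFinder(Set<String> knownNames) {
        this.knownNames = knownNames;
    }

    @Override
    public List<String> findPersons(String text) {
        List<String> hits = new ArrayList<>();
        for (String token : text.split("\\W+")) {
            if (knownNames.contains(token) && !hits.contains(token)) {
                hits.add(token);
            }
        }
        return hits;
    }
}

final class EnrichAndIndex {
    // Builds the enriched document: original text plus a multi-valued field of extracted names.
    // The field names "id", "text" and "person" are assumptions for this sketch.
    static Map<String, Object> enrich(String id, String text, PersonFinder finder) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("id", id);
        doc.put("text", text);
        doc.put("person", finder.findPersons(text));
        return doc;
    }

    public static void main(String[] args) {
        PersonFinder finder =
                new DictionaryPersonFinder(new HashSet<>(Arrays.asList("Vivek", "Ahmet")));
        List<Map<String, Object>> sent = new ArrayList<>();
        Indexer indexer = sent::add; // collects documents instead of talking to Solr
        indexer.add(enrich("1", "Ahmet replied to Vivek.", finder));
        System.out.println(sent);
    }
}
```

The point of this split is that the enrichment step has no dependency on Solr internals, so it is not affected by uncommitted patches or by API changes in trunk.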






Re: Integrate solr with openNLP

2014-06-02 Thread Vivekanand Ittigi
We'll surely look into UIMA integration.

But before moving on, is this (https://wiki.apache.org/solr/OpenNLP) the only
link we have for the integration? Isn't there any other article or link that
may help us fix this problem?

Thanks,
Vivek


On Tue, Jun 3, 2014 at 2:50 AM, Ahmet Arslan iori...@yahoo.com wrote:

 Hi,

 I believe I answered it. Let me try again:

 There is no committed code for OpenNLP. There is an open ticket with
 patches. They may not work with the current trunk.

 Confluence is the official documentation. The wiki is maintained by the
 community, which means it can describe uncommitted features, like this one:
 https://wiki.apache.org/solr/OpenNLP

 What I am suggesting is that you have a look at
 https://cwiki.apache.org/confluence/display/solr/UIMA+Integration


 Then look into how to use OpenNLP inside UIMA. Maybe what LUCENE-2899 offers
 is already doable with solr-uima. I am adding Tommaso (sorry for this, but we
 need an authoritative answer here) to clarify this.


 Also consider indexing with SolrJ and doing the OpenNLP enrichment outside
 Solr. Use OpenNLP from plain Java, enrich your documents, and index them
 with SolrJ. You don't have to do everything inside Solr as Solr plugins.

 Hope this helps,

 Ahmet


 On Monday, June 2, 2014 11:15 PM, Vivekanand Ittigi vi...@biginfolabs.com
 wrote:
 Thanks, I will check the JIRA issue, but you didn't answer my first
 question. Is there no way to integrate Solr with OpenNLP? Or is there
 any committed code that I can use to go ahead?

 Thanks,
 Vivek





 On Mon, Jun 2, 2014 at 10:30 PM, Ahmet Arslan iori...@yahoo.com wrote:

  Hi,
 
  Here is the JIRA issue: https://issues.apache.org/jira/browse/LUCENE-2899
 
  Anyone can create an account.
 
  I haven't used UIMA myself and I have little knowledge of it, but I
  believe it is possible to use OpenNLP inside UIMA.
  You will need to dig into the UIMA documentation.
 
  Solr UIMA integration already exists; that's why I asked whether your
  requirement can be met with UIMA. I don't know the answer myself.
 
  Ahmet
 
 
 
  On Monday, June 2, 2014 7:42 PM, Vivekanand Ittigi 
 vi...@biginfolabs.com
  wrote:
  Hi Arslan,
 
  If not the uncommitted code, then which code should be used for the integration?
 
  If I have to comment on my problem, which JIRA issue is it, and how do I post there?
 
  And why are you suggesting UIMA integration? My requirement is to integrate
  with OpenNLP. Do you mean we can do everything through UIMA that we do
  with OpenNLP, such as name and location finding?
 
  Thanks,
  Vivek
 
 
 
 
 
  On Mon, Jun 2, 2014 at 8:40 PM, Ahmet Arslan iori...@yahoo.com.invalid
  wrote:
 
   Hi,
  
   Uncommitted code can have this kind of problem. It is not guaranteed
   to work with the latest trunk.
  
   You could comment on the JIRA ticket about the problem you are facing.
  
   By the way, maybe what you are after is doable with the already committed
   UIMA integration?
  
   https://cwiki.apache.org/confluence/display/solr/UIMA+Integration
  
   Ahmet
  
  
  
   On Monday, June 2, 2014 5:07 PM, Vivekanand Ittigi 
  vi...@biginfolabs.com
   wrote:
   I followed the instructions at https://wiki.apache.org/solr/OpenNLP to
   integrate:
  
   Installation
  
   For English language testing, until LUCENE-2899 is committed:
  
   1. Pull the latest trunk or 4.0 branch
   2. Apply the latest LUCENE-2899 patch
   3. Do 'ant compile'
   cd solr/contrib/opennlp/src/test-files/training
   .
   .
   .
  
   I followed the first two steps but got the following error while executing
   the third step:
  
   common.compile-core:
       [javac] Compiling 10 source files to /home/biginfolabs/solrtest/solr-lucene-trunk3/lucene/build/analysis/opennlp/classes/java
       [javac] warning: [path] bad path element /home/biginfolabs/solrtest/solr-lucene-trunk3/lucene/analysis/opennlp/lib/jwnl-1.3.3.jar: no such file or directory
       [javac] /home/biginfolabs/solrtest/solr-lucene-trunk3/lucene/analysis/opennlp/src/java/org/apache/lucene/analysis/opennlp/FilterPayloadsFilter.java:43: error: cannot find symbol
       [javac]     super(Version.LUCENE_44, input);
       [javac]                  ^
       [javac]   symbol:   variable LUCENE_44
       [javac]   location: class Version
       [javac] /home/biginfolabs/solrtest/solr-lucene-trunk3/lucene/analysis/opennlp/src/java/org/apache/lucene/analysis/opennlp/OpenNLPTokenizer.java:56: error: no suitable constructor found for Tokenizer(Reader)
       [javac]     super(input);
       [javac]     ^
       [javac]     constructor Tokenizer.Tokenizer(AttributeFactory) is not applicable
       [javac]       (actual argument Reader cannot be converted to AttributeFactory by method invocation conversion)
       [javac]     constructor Tokenizer.Tokenizer() is not applicable
       [javac]       (actual and formal argument lists differ in length)
       [javac] 2 errors
       [javac] 1 warning
  
   Im really stuck how to