Hi Bart,

I haven't done any UIMA work (I used other stuff for my NLP phase), so not
sure I can help much further. But in general, you are venturing into pure
research territory here.

Even for dates, what do you actually mean? Just fixed expressions? Relative
dates (e.g. last Tuesday)? What about times (7pm)?
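
To make the distinction concrete, here is a rough Python sketch (not UIMA or Solr code) that tags three kinds of temporal mentions with regular expressions. The pattern set and the function name find_temporal_mentions are made up for illustration; real text would need many more cases, and relative dates would still have to be resolved against a reference date.

```python
import re

# Hypothetical patterns for illustration only; real text needs far more cases.
MONTH = r"(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*"
WEEKDAY = r"(?:mon|tues|wednes|thurs|fri|satur|sun)day"

PATTERNS = {
    # "8 Feb 2013" or "Feb 8, 2013"
    "fixed_date": re.compile(
        rf"\b(?:\d{{1,2}} {MONTH} \d{{4}}|{MONTH} \d{{1,2}}, \d{{4}})\b"
    ),
    # "last tuesday", "tomorrow": meaningless without a reference date
    "relative_date": re.compile(
        rf"\b(?:last|next|this) {WEEKDAY}\b|\b(?:yesterday|today|tomorrow)\b",
        re.IGNORECASE,
    ),
    # "7pm", "7:30 am"
    "time": re.compile(r"\b\d{1,2}(?::\d{2})?\s?(?:am|pm)\b", re.IGNORECASE),
}


def find_temporal_mentions(text):
    """Return (kind, matched_text) pairs in order of appearance."""
    hits = []
    for kind, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((match.start(), kind, match.group(0)))
    return [(kind, matched) for _, kind, matched in sorted(hits)]


if __name__ == "__main__":
    print(find_temporal_mentions("We met last tuesday at 7pm, not on 8 Feb 2013."))
```

Each kind needs different handling downstream, which is why the specification matters before you pick a tool.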

Same with cities. If you want it offline, you need gazetteer and
disambiguation modules. A worldwide gazetteer of cities is huge and has a
lot of duplicate names (Paris, Ontario is apparently a short drive from
London, Ontario, eh?). Something like
http://www.maxmind.com/en/worldcities might do for the gazetteer. And
disambiguation usually requires a training corpus that is similar to what
your text will look like.
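
As a small illustration of the duplicate-name problem, here is a Python sketch with a four-row toy gazetteer. The data layout and the function names candidates and disambiguate are assumptions for this example, and the context-matching heuristic is a deliberately naive stand-in for the corpus-trained disambiguation mentioned above.

```python
from collections import defaultdict

# Toy gazetteer as (name, region, country); a real worldwide list is vastly larger.
ENTRIES = [
    ("Paris", "Île-de-France", "France"),
    ("Paris", "Ontario", "Canada"),
    ("London", "England", "United Kingdom"),
    ("London", "Ontario", "Canada"),
]

# Index by lowercased name so that one name maps to all of its candidates.
GAZETTEER = defaultdict(list)
for name, region, country in ENTRIES:
    GAZETTEER[name.lower()].append((name, region, country))


def candidates(name):
    """Return every gazetteer entry that shares this name."""
    return list(GAZETTEER.get(name.lower(), []))


def disambiguate(name, context):
    """Naive heuristic: keep candidates whose region or country appears in
    the surrounding text; if none does, return all candidates unchanged."""
    found = candidates(name)
    ctx = context.lower()
    matching = [
        entry for entry in found
        if any(part.lower() in ctx for part in entry[1:])
    ]
    return matching or found


if __name__ == "__main__":
    print(candidates("Paris"))
    print(disambiguate("Paris", "a short drive from London, Ontario"))
```

Without a clue in the context the lookup simply stays ambiguous, which is the gap a trained disambiguation model is meant to fill.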

Online services like OpenCalais are backed by gigantic databases and some
serious corpus-trained machine learning disambiguation algorithms.

So, no plug-and-play solution here. If you really need to get this done, I
would recommend narrowing down the specification of exactly what you will
settle for and looking for software that can do it. Once you have that,
integration with Solr is your next - and smaller - concern.

Regards,
   Alex.

Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all at
once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD book)


On Fri, Feb 8, 2013 at 10:41 AM, jazz <jazzsa...@me.com> wrote:

> Thanks Alex,
>
> I checked the documentation but it seems there is only a webservice
> (OpenCalais) available to extract dates and places.
>
> http://uima.apache.org/sandbox.html
>
> Do you know if there is a Solr-compatible UIMA add-on which detects dates
> and places (cities) without a webservice? If not, how do you write one?
>
> Regards, Bart
>
> On 8 Feb 2013, at 15:29, Alexandre Rafalovitch wrote:
>
> > Yes, it is possible. You are looking at UIMA or OpenNLP integration, most
> > probably in an Update Request Processor pipeline.
> >
> > Have a look here as a start: https://wiki.apache.org/solr/SolrUIMA
> >
> > You will have to put some serious work into this, it is not all tied
> > together and packaged. Mostly because the Natural Language Processing
> > (the field you are getting into) is kind of messy all of its own.
> >
> > Good luck,
> >    Alex.
> >
> >
> > On Fri, Feb 8, 2013 at 9:24 AM, jazz <jazzsa...@me.com> wrote:
> >
> >> Hi,
> >>
> >> I want to know if Solr can analyze text and recognize dates and places.
> >> If yes, is it then possible to create new dynamic fields with these dates
> >> and places (e.g. city)?
> >>
> >> Thanks, Bart
> >>
>
>
