Hi Bart,

I haven't done any UIMA work (I used other stuff for my NLP phase), so I'm not sure I can help much further. But in general, you are venturing into pure research territory here.
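To make that concrete for dates, here is a rough sketch in plain Python (my own illustration, not taken from UIMA, OpenNLP, or Solr). The function names and patterns are made up for the example. It only handles ISO-style dates and "last <weekday>", and the second one already needs a reference date to mean anything.

```python
import re
from datetime import date, timedelta

# Fixed numeric dates such as 2013-02-08 can be spotted with a simple pattern.
ISO_DATE = re.compile(r"\b(\d{4})-(\d{2})-(\d{2})\b")

# Relative expressions only resolve against a known reference date.
WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]
LAST_WEEKDAY = re.compile(r"\blast (" + "|".join(WEEKDAYS) + r")\b",
                          re.IGNORECASE)


def find_fixed_dates(text):
    """Return every valid ISO-style date found in the text."""
    found = []
    for year, month, day in ISO_DATE.findall(text):
        try:
            found.append(date(int(year), int(month), int(day)))
        except ValueError:
            pass  # matched the pattern but is not a real calendar date
    return found


def find_relative_dates(text, today):
    """Resolve 'last <weekday>' to the most recent such day before today."""
    found = []
    for name in LAST_WEEKDAY.findall(text):
        target = WEEKDAYS.index(name.lower())
        # A result of 0 would mean "today", so step back a full week instead.
        days_back = (today.weekday() - target) % 7 or 7
        found.append(today - timedelta(days=days_back))
    return found
```

Anything outside those two patterns ("7pm", "Feb 8th", "next week") falls straight through, which is why the specification matters so much.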
Even for dates, what do you actually mean? Just fixed expressions? Relative dates (e.g. "last Tuesday")? What about times ("7pm")? Same with cities.

If you want it offline, you need gazetteer and disambiguation modules. A worldwide gazetteer of cities is huge and has a lot of duplicate names (Paris, Ontario is apparently a short drive from London, Ontario, eh?). Something like http://www.maxmind.com/en/worldcities? And disambiguation usually requires a training corpus that is similar to what your text will look like. Online services like OpenCalais are backed by gigantic databases and some serious corpus-trained machine learning disambiguation algorithms.

So, no plug-and-play solution here. If you really need to get this done, I would recommend narrowing down the specification of exactly what you will settle for and looking for software that can do it. Once you have that, integration with Solr is your next - and smaller - concern.

Regards,
   Alex.

Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all at once. Lately, it doesn't seem to be working. (Anonymous - via GTD book)

On Fri, Feb 8, 2013 at 10:41 AM, jazz <jazzsa...@me.com> wrote:

> Thanks Alex,
>
> I checked the documentation but it seems there is only a webservice
> (OpenCalais) available to extract dates and places.
>
> http://uima.apache.org/sandbox.html
>
> Do you know if there is a Solr-compatible UIMA add-on which detects dates
> and places (cities) without a webservice? If not, how do you write one?
>
> Regards, Bart
>
> On 8 Feb 2013, at 15:29, Alexandre Rafalovitch wrote:
>
> > Yes, it is possible. You are looking at UIMA or OpenNLP integration, most
> > probably in an Update Request Processor pipeline.
> >
> > Have a look here as a start: https://wiki.apache.org/solr/SolrUIMA
> >
> > You will have to put some serious work into this, it is not all tied
> > together and packaged. Mostly because Natural Language Processing (the
> > field you are getting into) is kind of messy all on its own.
> >
> > Good luck,
> >    Alex.
> >
> > On Fri, Feb 8, 2013 at 9:24 AM, jazz <jazzsa...@me.com> wrote:
> >
> >> Hi,
> >>
> >> I want to know if Solr can analyze text and recognize dates and places.
> >> If yes, is it then possible to create new dynamic fields with these
> >> dates and places (e.g. city)?
> >>
> >> Thanks, Bart