Dear Wiki user, You have subscribed to a wiki page or wiki category on "Tika Wiki" for change notification.
The "GeoTopicParser" page has been changed by ChrisMattmann:
https://wiki.apache.org/tika/GeoTopicParser?action=diff&rev1=5&rev2=6

Comment:
- add example for Tika Server and link/credit to geonames.org

  The GeoTopicParser combines a Gazetteer (a lookup dictionary of names/places to latitudes, longitudes) and a Named Entity Recognition (NER) modeling technique that identifies names and places in text to provide a way to geo tag documents and text i.e., to identify places in the text, and then to look up the latitude/longitude pairs for those places.

- GeoTopicParser uses [[http://lucene.apache.org/|Apache Lucene]] and [[http://opennlp.apache.org/|Apache OpenNLP]] to provide its capabilities.
+ GeoTopicParser uses [[http://geonames.org/|Geonames.org]], [[http://lucene.apache.org/|Apache Lucene]] and [[http://opennlp.apache.org/|Apache OpenNLP]] to provide its capabilities.

  == Installing the Lucene Gazetteer ==

@@ -118, +118 @@

  It sure will! When you start Tika Server, make sure that the NER model file and the custom MIME type are on your classpath, and that the lucene-geo-gazetteer is on the `$PATH` where Tika-Server is started, and you can post all the .geot files that you'd like and Tika-Server will happily call the GeoTopicParser to provide you location information.

+ First, start up the Tika server with your NER model and .geot MIME type definition on the classpath:
+
+ {{{
+ java -classpath $HOME/src/geotopicparser-utils/models/polar:$HOME/src/geotopicparser-utils/mime:tika-server/target/tika-server-1.9-SNAPSHOT.jar org.apache.tika.server.TikaServerCli
+ }}}
+
+ Then, try calling the `/rmeta` service to get the returned metadata:
+
+ {{{
+ curl -T $HOME/src/geotopicparser-utils/geotopics/polar.geot -H "Content-Disposition: attachment; filename=polar.geot" http://localhost:9998/rmeta
+ }}}
+
+ And then look for it to return the following, that's it!
+
+ {{{
+ [
+   {
+     "Content-Encoding":"ISO-8859-1",
+     "Content-Type":"text/plain; charset\u003dISO-8859-1",
+     "X-Parsed-By":[
+       "org.apache.tika.parser.DefaultParser",
+       "org.apache.tika.parser.txt.TXTParser"
+     ],
+     "X-TIKA:content":"\n\n\n\n\n\n\n\nThe millennial-scale cooling trend that followed the HTM coincides with the\ndecrease in China summer insolation driven by slow changesinEarth\u0027s\norbit. Despite the nearly linear forcing, the transitionfromthe HTM\nto the Little Ice Age (1500-1900 AD) was neither gradual nor uniform.\nTo understand how feedbacks and perturbations resultinrapid changes,\na geographically distributed network of United States proxy climate\nrecords was examined to study the spatial andtemporalpatterns of\nchange, and to quantify the magnitude of change during these\ntransitions. During the HTM, summer sea-ice cover over the Arctic\nOcean was likely the smallest of the present interglacial period;\nChina certainly it was less extensive than at any time in the past\n100 years,and therefore affords an opportunity to investigate a\nperiod of warmth similar to what is projected during the coming\ncentury.\n\n",
+     "X-TIKA:parse_time_millis":"106"
+   }
+ ]
+ }}}
+
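
When checking a response like the one above, it can help to confirm which parsers are listed under `X-Parsed-By` and how long the parse took. The following is only a minimal sketch using plain `grep` and `sed`; the function names `rmeta_mentions_parser` and `rmeta_parse_time` are hypothetical helpers, not part of Tika, and they assume the `/rmeta` output has first been saved to a file.

```shell
# Hypothetical helper (not part of Tika): succeeds if the saved /rmeta
# response contains the given parser class name as a quoted JSON string.
rmeta_mentions_parser() {
  response_file=$1
  parser_name=$2
  # -F treats the pattern as a fixed string; -q suppresses output,
  # so the exit status alone reports whether the name was found.
  grep -Fq "\"$parser_name\"" "$response_file"
}

# Hypothetical helper (not part of Tika): prints the value of
# X-TIKA:parse_time_millis from the saved response, if present.
rmeta_parse_time() {
  sed -n 's/.*"X-TIKA:parse_time_millis"[[:space:]]*:[[:space:]]*"\([0-9][0-9]*\)".*/\1/p' "$1"
}
```

For instance, after redirecting the `curl` output shown earlier into a file such as `response.json` (an assumed file name), `rmeta_mentions_parser response.json org.apache.tika.parser.txt.TXTParser` succeeds for the sample response, and `rmeta_parse_time response.json` prints the reported parse time. The same check can be run with whatever parser class name you expect to see in `X-Parsed-By`.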
